Project organization and CLI


TR NS

Jun 16, 2014, 12:16:23 PM
to julia...@googlegroups.com
I am trying to organize my current project into a few separate files/modules to keep things tidy, including a separate cli module for invoking the program. But I am having some trouble bringing it all together. The main part of my project's layout is:

    bin/
      corpus
    code/
      cli.jl
      ngrams.jl
      ...

Where `bin/corpus` contains:

    #!/usr/bin/env julia
    include("../code/cli.jl")
    Corpus.Cli.run(ARGS)

And `code/cli.jl` starts out with:

    module Corpus
    module Cli

    include("./ngrams.jl")

    function run(args)
      ...

But when I run `bin/corpus`, the program just hangs, and it appears to be doing so on the `include`.

So how does one do this properly?

I have read through the documentation on Modules, but it is not very clear to me. In fact, to be honest, it seems overly complicated, with `include`, `import`, `require`, `using`, etc. (Makes me long for the simplicity of Lua's and Javascript/NPM's `require`.)






println("Corpus")

include("../code/cli.jl")

Corpus.Cli.run(ARGS)

TR NS

Jun 16, 2014, 12:53:40 PM
to julia...@googlegroups.com
Made a modicum of progress. I am not sure why, but it stopped hanging, and now I get a warning, "replacing module Corpus", and then an error that it can't find Ngrams.

My `ngrams.jl` file starts out:

    module Corpus
      module Ngrams

And now I am pretty sure I totally don't understand how different files are supposed to be included together.



Leah Hanson

Jun 16, 2014, 12:56:29 PM
to julia...@googlegroups.com
`include` is like copy-pasting the code from the included file into the spot where you called include.

You shouldn't have `module Corpus` in ngrams.jl.

-- Leah
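
The copy-paste behaviour described above can be tried in isolation. The following is a minimal sketch, assuming a recent Julia 1.x; the file contents and the `bigrams` helper are hypothetical and are not taken from the actual project. Note that `ngrams.jl` declares only `module Ngrams`, and the enclosing `module Corpus` appears once, in the file that does the including.

```julia
# Write two small files into a temporary directory (hypothetical contents).
dir = mktempdir()

write(joinpath(dir, "ngrams.jl"), """
module Ngrams
# Return all adjacent pairs of words.
bigrams(words) = collect(zip(words[1:end-1], words[2:end]))
end
""")

write(joinpath(dir, "corpus.jl"), """
module Corpus
include("ngrams.jl")   # evaluated here, so Ngrams becomes Corpus.Ngrams
end
""")

# Loading corpus.jl defines Corpus, which in turn pulls in ngrams.jl.
include(joinpath(dir, "corpus.jl"))

println(Corpus.Ngrams.bigrams(["a", "b", "c"]))
```

Because the included text is evaluated inside `module Corpus`, the result is the nested module `Corpus.Ngrams`, with no second `module Corpus` declaration needed.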

TR NS

Jun 16, 2014, 1:31:01 PM
to julia...@googlegroups.com
On Monday, June 16, 2014 12:56:29 PM UTC-4, Leah Hanson wrote:
`include` is like copy-pasting the code from the included file into the spot where you called include.

You shouldn't have `module Corpus` in ngrams.jl.

Thanks. That helps me understand include().

Unfortunately it doesn't seem to bring things together though. Per your advice, I added a new corpus.jl file:

    module Corpus
      include("cli.jl")
      include("ngrams.jl")
    end

And I removed `module Corpus` from both cli.jl and ngrams.jl. Then in `bin/corpus` I changed the include to `include("../code/corpus.jl")`. The end result is the error: `Ngrams not defined`. Apparently the cli.jl code can't see the ngrams.jl code?





Leah Hanson

Jun 16, 2014, 1:58:54 PM
to julia...@googlegroups.com
Could you post a gist with all the files? (https://gist.github.com/)
It would be easier to understand what's going on if I could see the whole thing.

Have you tried switching the order of the includes? `cli.jl` won't be able to see `ngrams.jl` if all of cli is included & run first, before ngrams is included & run.

-- Leah
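
To illustrate the ordering point, here is a sketch of what the includes would expand to once `ngrams.jl` comes first, written inline so it can be run as one file. It assumes a recent Julia 1.x, and the function bodies are hypothetical. The `using ..Ngrams` line is also an assumption about how `Cli` would refer to its sibling: the leading `..` asks Julia to look in the enclosing module, and it only works if `Ngrams` has already been defined at that point.

```julia
module Corpus

# Equivalent of include("ngrams.jl"): must come before Cli uses it.
module Ngrams
report(words) = "ngrams: $(length(words)) tokens"
end

# Equivalent of include("cli.jl").
module Cli
using ..Ngrams          # `..` refers to the enclosing module, Corpus
run(args) = Ngrams.report(args)
end

end # module Corpus

println(Corpus.Cli.run(["a", "b"]))
```

Swapping the two inner modules makes the `using ..Ngrams` line fail, since `Corpus.Ngrams` does not exist yet when `Cli` is evaluated.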

TR NS

Jun 16, 2014, 2:56:37 PM
to julia...@googlegroups.com
You can see the project here:

    https://github.com/openbohemians/corpus

But I've now changed the code to get it to work. I just had to throw the `Corpus` module out the window and include `ngrams.jl` directly into `cli.jl`. That works, but it doesn't get me anywhere with designing more complex compositions in the future -- I was hoping to have a few submodules in the end, `Corpus.Ngrams`, `Corpus.Words` and `Corpus.Letters`.

I started looking at some other projects to see how others did things. I am seeing the same basic pattern: a single main capitalized file that has the *only* module statement, and every other file is a set of types and functions dumped into it (via include). I have yet to find a project using submodules.

Stefan Karpinski

Jun 16, 2014, 3:07:36 PM
to Julia Users
There's significantly less need for fine-grained modules in Julia. Is the lack of submodules causing some kind of problem or just discomfort at their absence?

TR NS

Jun 16, 2014, 4:13:29 PM
to julia...@googlegroups.com
On Monday, June 16, 2014 3:07:36 PM UTC-4, Stefan Karpinski wrote:
There's significantly less need for fine-grained modules in Julia. Is the lack of submodules causing some kind of problem or just discomfort at their absence?

Is there? I have always appreciated code that breaks things up into reasonably manageable chunks. Having a single module with dozens upon dozens of functions tends to overtax the function namespace. For example, in my case I want the separate submodules so I can do `Ngrams.report(...)` and `Words.report(...)` rather than having them in the same space with function names `ngramsreport()` and `wordsreport()`, plus having to ensure none of their supporting function names clash.

Presently I was able to work around my previous troubles by dumping the Cli module and just having a `cli()` method within the top-level Corpus module, under which I import the submodules. That works well. The only potential problem I foresee is sharing code between submodules. Since Cli previously was unable to see Ngrams, I imagine all my submodules are going to have the same issue if I create a Utils module to be shared. If so, that means I will have to include utils.jl into each submodule. But will that create code redundancy -- the same exact code being included multiple times in different places?

I am too new to Julia, so it's too early for me to say whether it's a real disadvantage or simply the discomfort of unfamiliarity. But I will note that I have a renewed appreciation of Ruby's "open class" approach.


Stefan Karpinski

Jun 16, 2014, 4:44:11 PM
to Julia Users
Generic functions are the reason this issue is less pressing in Julia. Instead of Ngrams.report and Words.report or ngramsreport and wordsreport, you can have report(x::Ngrams, ...) and report(x::Words, ...) – Ngrams, Words and report can all live in the same namespace without any issues and the two report methods are just different ways to report things.
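
A small sketch of that idea, written in Julia 1.x syntax (the thread predates the `struct` keyword). The field layouts and the report strings are assumptions made up for illustration; only the names `Ngrams`, `Words` and `report` come from the discussion.

```julia
# Hypothetical container types; the fields are an assumption.
struct Ngrams
    counts::Dict{Tuple{String,String},Int}
end

struct Words
    counts::Dict{String,Int}
end

# One generic function with one method per argument type.
report(x::Ngrams) = "$(length(x.counts)) distinct bigrams"
report(x::Words)  = "$(length(x.counts)) distinct words"

println(report(Words(Dict("the" => 2, "cat" => 1))))
println(report(Ngrams(Dict(("the", "cat") => 1))))
```

Both methods share the single name `report`, and Julia picks between them from the type of the argument, so no prefix or separate namespace is needed.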

Stefan Karpinski

Jun 16, 2014, 4:49:47 PM
to Julia Users
You should definitely not include the same code many times – in that case, what you need is a module that all the users use.
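
One way this could look is sketched below, assuming a recent Julia 1.x. The `Utils.clean` helper and the two `report` bodies are hypothetical. The shared code is defined once as its own submodule, and each sibling refers to it with a relative `using ..Utils` instead of including `utils.jl` again.

```julia
module Corpus

# Shared helpers, defined once and before the modules that use them.
module Utils
clean(word) = lowercase(strip(word))
end

module Words
using ..Utils
# Distinct words after normalization.
report(tokens) = unique(Utils.clean.(tokens))
end

module Ngrams
using ..Utils
# Adjacent word pairs after normalization.
function report(tokens)
    cleaned = Utils.clean.(tokens)
    return collect(zip(cleaned[1:end-1], cleaned[2:end]))
end
end

end # module Corpus

tokens = ["The", "cat ", " the"]
println(Corpus.Words.report(tokens))
println(Corpus.Ngrams.report(tokens))
```

This keeps `Words.report` and `Ngrams.report` in separate namespaces while both call the same single copy of `Utils.clean`.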

TR NS

Jun 17, 2014, 10:11:42 PM
to julia...@googlegroups.com


On Monday, June 16, 2014 4:44:11 PM UTC-4, Stefan Karpinski wrote:
Generic functions are the reason this issue is less pressing in Julia. Instead of Ngrams.report and Words.report or ngramsreport and wordsreport, you can have report(x::Ngrams, ...) and report(x::Words, ...) – Ngrams, Words and report can all live in the same namespace without any issues and the two report methods are just different ways to report things.

My first reaction was "Oh yeah, cool!" But on later consideration I don't think this works. I take your point in general -- method dispatch is really an awesome feature of Julia that may indeed lessen the need for compartmentalization. But in my case Ngrams and Words aren't types; they have no state. They are simply related sets of functions. Conceivably I could create two separate packages altogether, one for n-gram counts and the other for individual word counts. But I don't want them to be separate packages; obviously they have some things in common. I just want to keep them nicely separated within the same package.



yfra...@gmail.com

Jun 17, 2014, 10:44:28 PM
to julia...@googlegroups.com
Hope I understand your question :)

I think this may be related to the pwd.

I created a "bin" directory and touched `corpus`,

and then in `corpus` I wrote

```
println("the pwd is:")
println(pwd())

println("the code/clj.lj is:")
println(joinpath(pwd(),"../"))

```

Then I ran the file with `julia bin/corpus` and got

```
the pwd is:
/Users/y/tmp
the code/clj.lj is:
/Users/y/tmp/../
```


`include` uses `abspath`, and `abspath(a::String) = normpath(isabspath(a) ? a : joinpath(pwd(),a))`, so I guess it is related to the pwd.

Maybe putting the file in the root would work.
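
If the working directory is indeed the issue, one option is to build the path from the location of the script itself rather than from `pwd()`. The sketch below assumes a recent Julia 1.x, where the `@__DIR__` macro is available (it did not exist when this thread was written). The `include` and `run` calls are left commented out because the files do not exist in this standalone example.

```julia
# @__DIR__ expands to the directory of the file that contains it
# (or the current working directory when run from the REPL).
cli_path = normpath(joinpath(@__DIR__, "..", "code", "cli.jl"))

println("would include: ", cli_path)

# In bin/corpus one would then write:
# include(cli_path)
# Corpus.Cli.run(ARGS)
```

With the path anchored to the script, `bin/corpus` resolves `code/cli.jl` the same way regardless of the directory it is launched from.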


On Tuesday, June 17, 2014 at 12:16:23 AM UTC+8, TR NS wrote:

Stefan Karpinski

Jun 18, 2014, 3:43:17 PM
to Julia Users
It's not uncommon in Julia to have stateless "empty" types just for dispatch. The real question here is whether the "report" verb really means the same thing or not for n-grams and words. If they're different variations on the same meaning, then the two methods should belong to the same generic function. If they don't mean the same thing, then they shouldn't and they either need different names or different namespaces. I really think that the tendency to have many methods with the same name and unclear relationships in single dispatch o.o. languages is problematic – especially in dynamic languages where there doesn't have to be an inheritance relationship. Generic functions force you to be decisive about whether two methods are related or not.
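
A sketch of the stateless-type pattern, in Julia 1.x syntax. The type names `NgramMode` and `WordMode` and the counting logic are hypothetical; the point is only that a type with no fields can select which method runs.

```julia
# Empty types: they carry no data and exist only to drive dispatch.
struct NgramMode end
struct WordMode end

# Two methods of one generic function, selected by the first argument.
report(::WordMode, tokens)  = length(unique(tokens))
report(::NgramMode, tokens) =
    length(unique(collect(zip(tokens[1:end-1], tokens[2:end]))))

tokens = ["a", "b", "c"]
println(report(WordMode(), tokens))
println(report(NgramMode(), tokens))
```

Whether this is preferable to separate submodules depends, as noted above, on whether the two `report` operations really share one meaning.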