Sorry to start a new thread for this one, not sure of the proper
etiquette...
My name is Oren Leiman. I am an undergraduate in the Computer
Science department at the University of California at Santa Cruz, and
I am interested in participating in GSoC this spring/summer.
I have experience with functional programming at the level of the
SICP, which I read and worked through with great zeal. Although I have
limited experience with C++, I am proficient in both ANSI C and Java
(and I am a fast learner), so I would not anticipate any serious
trouble in that transition.
I have not yet built OpenCog on my home machine; however, I have read
the paper linked to in the project idea, and this sounds right up my
alley.
My primary interests in CS are AI/machine learning (with which I have
little experience), functional programming (fascinated by the lambda
calculus), and the analysis of algorithms. This project appeals to me
specifically because it sounds like an opportunity to use my limited
knowledge of LISP to some practical purpose (Combo, which I would be
extending, is a lightweight LISP dialect, no?).
I would like to submit a patch to demonstrate my abilities, but I am
unsure of where to begin (I have zero experience in open source
development). Any guiding suggestions would be most welcome. Also, any
other papers on the algebra of fold that I could read?
Just register there and install Bazaar on your system. Linux is more comfortable for me because you simply type the "bzr" command in the console and get all of the code, but it is your choice, of course. Here are Linux examples. On Windows you could use Bazaar Explorer or the command prompt too, but I haven't used them yet. (Sorry for that.) After registering you can push your modified sources to the opencog branches. You will get an individual branch with a unique name such as lp:~oren-leiman/opencog/opencog
You can get the opencog sources with: bzr branch lp:opencog
2) Bugtracker: https://launchpad.net/opencog/+bugs Choose your favorite bug there, fix it, and submit the code to your branch with: bzr push lp:~oren-leiman/opencog/opencog (you need to put your own branch name there)
On Wednesday, March 28, 2012 9:30:47 AM UTC+4, Oren wrote:
> I have not yet built OpenCog on my home machine; however, I have read
> the paper linked to in the project idea, and this sounds right up my
> alley.
> My primary interests in CS are AI/machine learning (with which I have
> little experience), functional programming (fascinated by the lambda
> calculus), and the analysis of algorithms. This project appeals to me
> specifically because it sounds like an opportunity to use my limited
> knowledge of LISP to some practical purpose (Combo, which I would be
> extending, is a lightweight LISP dialect, no?).
Pretty close, yes, but not as full-fledged, of course. Only boolean and continuous types and operators are supported, plus some home-brewed loops and sequencing operators for acting in a virtual world.
> I would like to submit a patch to demonstrate my abilities, but I am
> unsure of where to begin (I have zero experience in open source
> development).
I can't think of anything useful that would take just a couple of hours of work for a newcomer. I'll try to think about it...
What you could do in the meantime is compile, install and try MOSES on some toy problems; see moses --help for more information.
> Any guiding suggestions would be most welcome. Also, any other papers
> on the algebra of fold that I could read?
> -- > You received this message because you are subscribed to the Google Groups "opencog" group. > To post to this group, send email to opencog@googlegroups.com. > To unsubscribe from this group, send email to opencog+unsubscribe@googlegroups.com. > For more options, visit this group at http://groups.google.com/group/opencog?hl=en.
So, I've been fiddling with MOSES all day, and I
think I've arrived at some meaningful questions about
the proposed project.
It seems that, as stated on the ideas page, items 1
and 2 (adding lists and a fold builtin to combo)
would be not absurdly difficult. I have been rooting
through the existing Combo code base and, although the
closest thing I've seen to a Combo interpreter written
in C++ is Mr. Norvig's SCHEME interpreter written in
Java, it seems pretty intuitive. Obviously, this would
be hard work, but it doesn't seem beyond my grasp.
With regard to this first part, I am wondering how
robust the list/fold implementations should be. For
example, should there be support for nested lists?
Would there need to be 'public' methods for operations
like car, cdr, and cons (or equivalent)? That is, I'm
struggling to conceptualize whether the Combo programs
that MOSES pops out will ever include these things or
if they would basically just take place under the hood.
Is there a general specification that somebody has
in mind?
Item 3 is probably the most interesting-sounding to
me at this time. Would I have the opportunity to piece
together the reduction rules myself (this sounds like
the interesting part), or would this be done for me,
leaving only the actual implementation?
As far as item 4 is concerned, I would certainly be
interested in attempting it, but the representation-
building phase seems to me by far the most mysterious
part of MOSES. How does the initial exemplar, the sort of
protozoan of the genetic algorithm, get created? Is this
process randomized, like the knob placement?
On Sat, Mar 31, 2012 at 9:54 AM, Oren <mumblemumble...@gmail.com> wrote:
> Hi All,
> So, I've been fiddling with MOSES all day, and I
> think I've arrived at some meaningful questions about
> the proposed project.
> It seems that, as stated on the ideas page, items 1
> and 2 (adding lists and a fold builtin to combo)
> would be not absurdly difficult. I have been rooting
> through the existing Combo code base and, although the
> closest thing I've seen to a Combo interpreter written
> in C++ is Mr. Norvig's SCHEME interpreter written in
> Java, it seems pretty intuitive. Obviously, this would
> be hard work, but it doesn't seem beyond my grasp.
> With regard to this first part, I am wondering how
> robust the list/fold implementations should be. For
> example, should there be support for nested lists?
> Would there need to be 'public' methods for operations
> like car, cdr, and cons (or equivalent)? That is, I'm
> struggling to conceptualize whether the Combo programs
> that MOSES pops out will ever include these things or
> if they would basically just take place under the hood.
> Is there a general specification that somebody has
> in mind?
It hasn't been decided. We're aiming for a minimal set of operators; I suppose the constructors and access functions will simply be public, as you anticipate.
> Item 3 is probably the most interesting-sounding to
> me at this time. Would I have the opportunity to piece
> together the reduction rules myself (this sounds like
> the interesting part), or would this be done for me,
> leaving only the actual implementation?
Yeah, you've got to design them too.
> As far as item 4 is concerned, I would certainly be
> interested in attempting it, but the representation-
> building phase seems to me by far the most mysterious
> part of MOSES. How does the initial exemplar, the sort
> of protozoa of the genetic algorithm get created? Is this
> process randomized, like the knob placement?
If it's too big it can be randomized. Have you watched the lecture I gave on MOSES? It includes an overview of representation building.
> Alright that's all for now. Sorry for the length.
> Thanks,
> Oren
> > I have not yet built OpenCog on my home machine; however, I have read
> > the paper linked to in the project idea, and this sounds right up my
> > alley.
> > My primary interests in CS are AI/machine learning (with which I have
> > little experience), functional programming (fascinated by the lambda
> > calculus), and the analysis of algorithms. This project appeals to me
> > specifically because it sounds like an opportunity to use my limited
> > knowledge of LISP to some practical purpose (Combo, which I would be
> > extending, is a lightweight LISP dialect, no?).
> pretty close yes, but not as fully fledged of course. Only boolean,
> continuous types and operators are supported + some home brewed loops
> and sequencing operators for acting in a virtual world.
Well, as pointed out elsewhere, combo has a dynamic type system, whereas scheme and lisp are un-typed, so this does make things different. (For example, caml is the best-known example of a typed functional programming language.)
> > I would like to submit a patch to demonstrate my abilities, but I am
> > unsure of where to begin (I have zero experience in open source
> > development).
> I can't think of anything useful that would take just a couple of
> hours of work to a newcomer. I'll try to think about it...
Well, I can think of something, but it might take a lot more than a few hours. Not sure. As I complained in a different email, when combo reads in a table of values, it infers the types (i.e. just-plain guesses) by looking at values in the first row. So if the first row contains 0's and 1's it will guess that every column is a boolean. If later rows contain other numbers, it will crash, as it was expecting booleans...
This might be hard because it will take you a while to figure out how to test this; then you'd have to find the right code to read (and that's not trivial), and then you have to fix it (and there will be many places where a first-row type of assumption is made). It might be very tangled. Not sure. Might take you a day or a week.
> > algebra of fold that I could read?
> I don't know except trying Google.
Well, if you've read SICP you should know of SRFI-1, which provides the de facto/best implementation of fold and its friends for scheme.
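For readers who have not looked at SRFI-1 yet, here is a minimal Python sketch of what its fold and fold-right compute. It only mirrors the SRFI-1 argument order, where the combining function is called as kons(elem, acc); the Python names fold, fold_right and cons are illustrative and say nothing about how a Combo builtin would look.

```python
def fold(kons, knil, lis):
    """Left fold in SRFI-1 order: kons is called as kons(elem, acc)."""
    acc = knil
    for elem in lis:
        acc = kons(elem, acc)
    return acc


def fold_right(kons, knil, lis):
    """Right fold: combines elements starting from the end of the list."""
    acc = knil
    for elem in reversed(lis):
        acc = kons(elem, acc)
    return acc


def cons(head, tail):
    """Illustrative stand-in for Scheme's cons, using Python lists."""
    return [head] + tail


if __name__ == "__main__":
    print(fold(cons, [], [1, 2, 3]))        # reverses the list
    print(fold_right(cons, [], [1, 2, 3]))  # copies the list
    print(fold(lambda x, acc: x + acc, 0, [1, 2, 3]))
```

Folding cons over a list with fold reverses it, while fold_right rebuilds it in the original order, which is the same behaviour SRFI-1 documents for (fold cons '() lis) and (fold-right cons '() lis).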
> Well, I can think of something, but it might take a lot more than a few
> hours. Not sure. As I complained in a different email, when combo reads
> in a table of values, it infers the types (i.e. just-plain guesses) by
> looking at values in the first row. So if the first row contains 0's and
> 1's it will guess that every column is a boolean. If later rows contain
> other numbers, it will crash, as it was expecting booleans...
Thanks! Whether or not I can figure it out, the only thing I have to
lose is a little bit of time, which isn't really lost as long as I
learn something...
I had actually encountered this bug firsthand, and I was wondering
whether there was some input-specification that I was missing...
> This might be hard because -- it will take you a while to figure out how to
> test this; then you'd have to find the right code to read (and that's not
> trivial) and then you have to fix it (and there will be many places where a
> first-row type of assumption is made) It might be very tangled. Not sure.
> Might take you a day or a week.
I'm not sure I know what you mean by testing this...I have definitely
seen it happen and, since you mentioned it, am experimenting with
*making* it happen, but I might be missing something.
I just produced the bug, and I see that MOSES crashes on an
assertion exception generated in table.cc (located in
opencog/comboreduct/combo). I see a method there called
'infer_data_type_tree' which seems to handle input-table I/O, so I
think I'm on the right track. Since Nil seems to have authored this
module, perhaps he can provide some insight into the problem...
> Well, if you've read SICP you should know of SRFI-1 and that provides the
> de facto/best implementation of fold and its friends, for scheme.
Nice one. I hadn't thought to look in SRFI-1. This was very helpful.
Takes some of the pressure off to see how it's done in Scheme
(familiar ground).
Also, I watched Nil's lecture on MOSES yesterday afternoon, and
it cleared up some questions I had on the representation building
process, so thanks for that. I'm glad to hear that such a large
portion of the design work on this project is left to the student.
> > This might be hard because -- it will take you a while to figure out
> > how to test this; then you'd have to find the right code to read (and
> > that's not trivial) and then you have to fix it (and there will be
> > many places where a first-row type of assumption is made) It might be
> > very tangled. Not sure. Might take you a day or a week.
> I'm not sure I know what you mean by testing this...I have definitely
> seen it happen and, since you mentioned it, am experimenting with
> *making* it happen, but I might be missing something.
> I just produced the bug, and I see that MOSES crashes on an
> assertion exception generated in table.cc (located in
> opencog/comboreduct/combo). I see a method there called
> 'infer_data_type_tree' which seems to handle input-table I/O, so I
> think I'm on the right track. Since Nil seems to have authored this
> module, perhaps he can provide some insight into the problem...
I thought I'd already explained the problem: it is not the crash per se, but that, by looking at only the first row, the wrong types are inferred. The right types won't be clear until you look at all rows. So you'll have to look at all the rows, guess the types, and only then save the table contents into the internal data structures, this time with the correct types.
Yeah, sorry for the confusion on my end.
Thanks for clarifying, and all that is well understood. I just hadn't
looked closely enough at the code. In any event, I'm pretty sure I
see a path to an easy enough fix. I'll see what I can do.
On Wed, Apr 4, 2012 at 8:01 AM, Nil Geisweiller <ngeis...@googlemail.com> wrote:
> On Wed, Apr 4, 2012 at 7:04 AM, Oren <mumblemumble...@gmail.com> wrote:
>> Yeah, sorry for the confusion on my end.
>> Thanks for clarifying, and all that is well understood. I just hadn't
>> looked closely enough at the code. In any event, I'm pretty sure I
>> see a path to an easy enough fix. I'll see what I can do.
> you might want to use get_intersection in file
> opencog/comboreduct/combo/type_tree.h
More specifically, you could say that 1 or 0 has type union(contin boolean), and other stuff that can be cast to float has type contin. At each row you perform their intersection, and at the end anything that is still union(contin boolean) becomes boolean. That's the way I would do it, but of course if you come up with simpler code, go ahead with it.
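To make the idea concrete, here is a rough Python sketch of the intersection scheme described above. It does not use the real get_intersection from type_tree.h or any Combo type_tree code; plain Python sets stand in for the type unions, and the names candidate_types and infer_column_types are made up for illustration.

```python
BOOLEAN = "boolean"
CONTIN = "contin"


def candidate_types(token):
    """Return the set of types a single cell could have.

    "0" and "1" are ambiguous, i.e. union(contin boolean); anything else
    must parse as a float and is then contin.
    """
    if token in ("0", "1"):
        return {BOOLEAN, CONTIN}
    float(token)  # raises ValueError if the cell is not numeric
    return {CONTIN}


def infer_column_types(rows):
    """Intersect the candidate types of every row, column by column."""
    columns = None
    for row in rows:
        cell_types = [candidate_types(tok) for tok in row]
        if columns is None:
            columns = cell_types
        else:
            if len(cell_types) != len(columns):
                raise ValueError("rows have different lengths")
            columns = [seen & new for seen, new in zip(columns, cell_types)]
    if columns is None:
        return []
    # Whatever is still ambiguous after the last row becomes boolean.
    return [BOOLEAN if BOOLEAN in types else CONTIN for types in columns]


if __name__ == "__main__":
    table = [["0", "1", "3.5"],
             ["1", "0", "2"],
             ["1", "7", "0"]]
    print(infer_column_types(table))
```

Because contin is a member of every candidate set, the intersection can never become empty, so a column that starts with 0's and 1's simply narrows to contin when a later row contains some other number.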
Couldn't you just specify in the input file that some columns are boolean rather than contin? E.g. either by giving the list of types as the first row of the file, or you could just have 'T' and 'F' instead of 1 and 0 for boolean...
On Wed, Apr 4, 2012 at 1:11 PM, Nil Geisweiller <ngeis...@googlemail.com> wrote:
> On Wed, Apr 4, 2012 at 8:01 AM, Nil Geisweiller <ngeis...@googlemail.com> wrote:
>> On Wed, Apr 4, 2012 at 7:04 AM, Oren <mumblemumble...@gmail.com> wrote:
>>> Yeah, sorry for the confusion on my end.
>>> Thanks for clarifying, and all that is well understood. I just hadn't
>>> looked closely enough at the code. In any event, I'm pretty sure I
>>> see a path to an easy enough fix. I'll see what I can do.
>> you might want to use get_intersection in file
>> opencog/comboreduct/combo/type_tree.h
> More specifically, you could say that 1 or 0 has type union(contin
> boolean), and other stuff that can be cast to float has type
> contin. At each row you perform their intersection, and at the end
> anything that is still union(contin boolean) becomes boolean. That's
> the way I would do it, but of course if you come up with simpler code,
> go ahead with it.
> Nil
>> Nil
>>> Oren
What I had started to do was to just hold an array of
current type inferences at the infer_data_type_tree
level and then call infer_row_type_tree on each row
in the input table. I was going to overload infer_row_type_tree
to take a last line flag and the array of type_nodes as
arguments so that, instead of creating a new type tree
every time it is called, infer_row_type_tree would just
update the array until the last line is reached, and only
then create the final type tree with the correct types.
My goal here is to make as few changes as possible
to the existing code. I don't think this approach will
cause any serious efficiency hit, so I'm going to finish
implementing and see if it works. On the other hand, I
might be missing something, so any feedback is always
appreciated.
On Wed, Apr 4, 2012 at 2:39 PM, Jared Wigmore <jared.wigm...@gmail.com> wrote:
> Couldn't you just specify in the input file that some columns are
> boolean rather than contin? e.g. either by saying the list of types as
> the first row of the file. or e.g. you could just have 'T' and 'F'
> instead of 1 and 0 for boolean...
Yeah, I thought of using T/F instead of 1/0 at some point, but since I haven't encountered the contin vs bool ambiguity yet I didn't bother to change it. And I think the user expects moses to handle 0 and 1 as representing booleans. But we could certainly support T/F, True/False, etc. as well, and possibly add an option to deactivate the 0/1 coding when it gets really annoying, but again, so far I have never had such an annoyance.
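As a rough illustration of what accepting explicit boolean tokens could look like, here is a small Python sketch. Everything in it is hypothetical: parse_cell and its zero_one_as_bool flag are invented names, not an existing moses option, and the sketch decides cell by cell, so a real reader would still have to settle on one type per column.

```python
TRUE_TOKENS = {"t", "true"}
FALSE_TOKENS = {"f", "false"}


def parse_cell(token, zero_one_as_bool=True):
    """Parse one table cell into a bool or a float.

    T/F and True/False (case-insensitive) are always boolean. "0" and "1"
    are boolean only while zero_one_as_bool is left on; switching it off
    corresponds to deactivating the 0/1 coding.
    """
    text = token.strip()
    lowered = text.lower()
    if lowered in TRUE_TOKENS:
        return True
    if lowered in FALSE_TOKENS:
        return False
    if zero_one_as_bool and text in ("0", "1"):
        return text == "1"
    return float(text)  # raises ValueError for anything non-numeric


if __name__ == "__main__":
    print(parse_cell("T"), parse_cell("false"), parse_cell("1"))
    print(parse_cell("1", zero_one_as_bool=False), parse_cell("3.5"))
```

With the flag switched off, 0 and 1 are read as ordinary numbers and only the explicit tokens produce booleans, which removes the ambiguity at the cost of requiring T/F in boolean columns.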
On Thu, Apr 5, 2012 at 1:33 AM, Oren <mumblemumble...@gmail.com> wrote:
> Nil and Jared,
> What I had started to do was to just hold an array of
> current type inferences at the infer_data_type_tree
> level and then call infer_row_type_tree on each row
> in the input table. I was going to overload infer_row_type_tree
> to take a last line flag and the array of type_nodes as
> arguments so that, instead of creating a new type tree
> every time it is called, infer_row_type_tree would just
> update the array until the last line is reached, and only
> then create the final type tree with the correct types.
Yeah, that sounds cheaper.
> My goal here is to make as few changes as possible
> to the existing code. I don't think this approach will
> cause any serious efficiency hit,
If the hit is really unavoidable (though I don't think it is), it could eventually be parallelized. With the new C++11 standard and GNU's OMP, parallelization is really painless: http://wiki.opencog.org/w/Multithreading
> so I'm going to finish
> implementing and see if it works. On the other hand, I
> might be missing something, so any feedback is always
> appreciated.
I finished the code for the type-inference fix
yesterday evening. 'table.cc' compiles without
issue, but, during the linking phase, I get an error
originating from a file called 'libcomboreduct.so'.
The error claims an undefined reference to the
function that I overloaded (infer_row_type_tree).
I modified the function prototype in 'table.h'
(opencog/comboreduct/combo), but I may have
done so incorrectly or incompletely. Also,
libcomboreduct.so appears to be an executable
binary, so I don't really know how to interpret the
error.
I am assuming that this problem arises from my
lack of experience with C++, so I am optimistic
that I will be able to test my patch soon enough.
However, I think it best that I pause now and get
my application kinda dialed in before the deadline
tomorrow, so I may not get the patch finished by
Friday. C'est la vie. After I finish my application,
I will resume work on the patch.
As for the possibility of using multithreading to
accomplish this task, that sounds interesting, and
I will be sure to read up on it. Again, not super
experienced in C++, so it's all uncharted territory
for me.
On 5 April 2012 20:25, Oren <mumblemumble...@gmail.com> wrote:
> Hi,
> I finished the code for the type-inference fix
> yesterday evening. 'table.cc' compiles without
> issue, but, during the linking phase, I get an error
> originating from a file called 'libcomboreduct.so'.
> The error claims an undefined reference to the
> function that I overloaded (infer_row_type_tree).
Do you really need to overload?
If you specified the code in a header file, then you probably forgot to say "inline". Otherwise, be sure to put it in a C file, and make sure the C file is listed in the CMakefile so that it gets built.
> I modified the function prototype in 'table.h'
> (opencog/comboreduct/combo), but I may have
> done so incorrectly or incompletely. Also,
> libcomboreduct.so appears to be an executable
> binary, so I don't really know how to interpret the
> error.
A lib*.so is a "shared object", aka "shared library", aka "DLL" (dynamically linked library). They typically need to be fully linked when being built, just like an executable.
You're right, I probably don't need to overload. I'm not sure
whether any other modules use the method in question,
so I wanted to overload at first, just to be safe. Can change
this later.
> If you specified the code in a header file, then you probably forgot to say
> "inline". Otherwise, be sure to put it in a C file, and make sure the C
> file is listed in the CMakefile so that it gets built.
I only added a function prototype to the header file, and I
put my definition in the same source file as the original.
Is it better form to create a whole new C file? In any case,
I know my code is getting built because it's in a file already
listed in the CMakefile.
> a lib*.so is a "shared object" aka "shared library" aka "dll dynamically
> linked library". They typically need to be fully linked when being built,
> just like an executable.
I'll read up on this. I'm entering week two of my quarter at
university, and things are starting to heat up. I spent a
couple solid days on this last week, but now I can only
look at it sporadically. I'll see what I can get done in the
next few days.
> > If you specified the code in a header file, then you probably forgot
> > to say "inline". Otherwise, be sure to put it in a C file, and make
> > sure the C file is listed in the CMakefile so that it gets built.
> I only added a function prototype to the header file, and I
> put my definition in the same source file as the original.
> Is it better form to create a whole new C file? In any case,
> I know my code is getting built because it's in a file already
> listed in the CMakefile.
> I'm entering week two of my quarter at
> university, and things are starting to heat up. I spent a
> couple solid days on this last week, but now I can only
> look at it sporadically. I'll see what I can get done in the
> next few days.
That's fine. I can't guess why your code won't link. If you post a patch here, we can look at it.