In order to integrate OpenPetBrain into OpenCog (using the main opencog
development branch; not a specific or private one), I need to merge the
features in lp:~gama/opencog/classserver into the staging branch.
Actually, I guess all features in this classserver branch are also in my
private lp:~welterls/opencog/libatomspace-OpenPB
<https://code.launchpad.net/%7Ewelterls/opencog/libatomspace-OpenPB> and
lp:~welterls/opencog/libatomspace
<https://code.launchpad.net/%7Ewelterls/opencog/libatomspace> branches,
but I need to double check that yet.
Anyway, I know there were some issues at the time Gama was about to
merge these changes to staging branch some time ago (a couple of months
ago, I guess), related to conflicts with support for Guile or something
like that. I would like to know which are/were these issues and if they
remain or not so that I can resume that task.
So, could you please explain me these issues and tell me which is the
current status?
Any approaches you have already thought of for this at that time (or
recently) would be appreciated too.
thanks,
Welter
> Anyway, I know there were some issues at the time Gama was about to merge
> these changes to staging branch some time ago (a couple of months ago, I
> guess), related to conflicts with support for Guile or something like that.
> I would like to know which are/were these issues and if they remain or not
> so that I can resume that task.
> So, could you please explain me these issues and tell me which is the
> current status?
The guile issues have all been resolved. The core problem
was that there were large, divergent changes made to the
same .cc files in two different branches. Since these
changes weren't coordinated, merging them automatically
proved impossible.
> Any approaches you have already thought of for this at that time (or
> recently) would be appreciated too.
I have not studied the classserver, and don't know how
it works. I'm guessing that is has something to do with
dynamically creating new types.
I would like to tell you about how I use opencog types:
if the classerver broke these, I would be unhappy.
-- Some of the C++ code references types at compile time.
I don't know how the class server expects to resolve this.
It sure would be nice if the solution was *fast*. So, e.g.
current code performs an int compare on atom->getType()
and if this was turned into a table lookup and string compare,
that would be ... not really optimal. How does the class server
change this?
-- Many of the types I use are only referenced in scheme
code. That is, if those types were absent at compile time,
it wouldn't be noticed, but the scheme code would break
at runtime. I'd like some guarantee of it not breaking :-)
I guess I'd have to write scheme wrappers for the
class server, and then create all needed atom types,
before using them?
-- The SQL persistence code saves and restores atom
types. It has been designed so that the type numbering
can change between save and restore, and so the
type id numbers can be safely changed. However,
during restore, it does expect that all needed types
already exist. It would be fairly straightforward to modify
the code to create the needed types, before restoring
from the DB. But it is extra coding work.
As you can see, the last two points require me to do
more work, I'm guessing about a week or so. And
having to do work that I'd otherwise not have to do
makes me cranky :-/ But I can do it as long as its
not urgent.
Could there be a transition plan: e.g. pre-create all
of the existing atom types? That way, existing code could
work indefinitely; later, when I have more time and feel
more relaxed, I could code up the needed scheme and
persist shims.
--linas
> I have not studied the classserver, and don't know how
> it works. I'm guessing that is has something to do with
> dynamically creating new types.
One more thing I forgot to mention: when a new type is
created, I'd like a callback to find out about it. There's
some scheme wrapper code that is currently generated
directly from the types.list file; this would have to be
updated somehow.
--linas
Types are used extensively as ints in the PLN code and compared to the
defined values (e.g. comparisons like getType() == INHERITANCE_LINK).
I guess the simplest way to allow transition is to make these #defines
call the appropriate ClassServer command that returns the correct ID.
Like Linas mentions however, it should remain fast!
J
Hi Welter,
"make test" runs the unit tests. :)
J
I too have complained about this failing. Linas, please fix your unit
test. It should pass when db connections are not available. I've
fixed this test for you twice already.
Trent
Unless I missed something, what you did was to always
pass the test, even when it failed ... which makes testing
pointless.
Basically, if you have ODBC installed and configured, the
test should pass. If it fails, it means that you have ODBC
installed, but mis-configured. If it is not installed, then
the test won't even be run -- this is the correct behaviour.
Now its possible that some people will have the
ODBC development environment installed for other
reasons, but do not plan on using persistence within
OpenCog. (Who among us is doing ODBC development, besides me? If you
know it well enough that you need
it for your other projects, why can't you configure it
correctly for opencog?)
For this (unusual!?) case, the CMakefiles should be
modified to include a flag to indicate that this
particular OpenCog subsystem should be disabled.
However, its wrong to ask for the subsystem to be
enabled, and then ask that the tests be not run, or
to intentionally disable them.
--linas
When you first brought up the class server, you pointed
at two of your branches. I looked at those, but could not
find any class-server code that differed from what is in
the current staging branch. Where is this code? Was
I looking in the wrong place?
> and
> check if it will break any of your code (as you explained bellow) or cause
> any
> relevant performance issues.
> It would be great, however, if I had a set of unit tests (or any other
> specific test code)
> I can run to check these things. Could you point that out?
Well, as I mentioned before, a variety of subsystems
will simply not compile, if the types are removed, and
so this will be a strong indicator.
I do not have any unit test cases for the scheme code.
This is in part because the CXXTest infrastructure can
only test C++ code, and not anything else. Hmm.
I suppose I could hack around this; but there are also
other tricky parts -- the tests presume the existence of
various data stores, which users would have to download
and install.
--linas
After reviewing the implementation of extensible classserver carefully,
I don't think it would break anything. Anyway, I'll answer each of your
questions/requirements bellow so you can see how extensible classserver
may work in each situation.
I'm also attaching the opencog/atomspace/README file, where you can find
the "Adding atom types" section, which explains very well how to use it
to add new specific atom types.
After you read my answers, let me know if you still see any potential
problems with the extensible class server...
Linas Vepstas wrote:
> I have not studied the classserver, and don't know how
> it works. I'm guessing that is has something to do with
> dynamically creating new types.
>
Yes and no. Indeed it creates the atom types dynamically. However, I
think its main/original purpose is to provide a way so new modules
and/or applications can add specific atom types at the *startup* (using
a "type script" and some cmake macros that makes automatic the process
of declaring the new C++ atom types as static variables under opencog
namespace; plus some simple initialization code we must include at the
startup of each specific module/application).
So, in practice, the atom types are still static, but the initialization
is somehow dynamic (i.e., you must call a method that initializes a
specific set of atom types at the startup -- static initialization may
be still possible but it's quite dangerous because it's compiler-dependent).
Using this approach, we can have a minimal set of basic atom types
defined in opencog/atomspace and all other specific ones defined by
specific modules or applications (that's how OpenPetBrain works
currently, not polluting the basic set of atom types pre-defined in
OpenCog).
Optionally, if opencog code (or any other code) really needs to know all
atom types before anything starts (say, a scheme console that reads all
types from a file), we can still keep all atom types defined in
opencog/atomspace/atom_types.script (new name for the current
type.script), which is the current approach.
> I would like to tell you about how I use opencog types:
> if the classerver broke these, I would be unhappy.
>
> -- Some of the C++ code references types at compile time.
> I don't know how the class server expects to resolve this.
> It sure would be nice if the solution was *fast*. So, e.g.
> current code performs an int compare on atom->getType()
> and if this was turned into a table lookup and string compare,
> that would be ... not really optimal. How does the class server
> change this?
>
It does not change this.
First, the C++ type of an atom type remains the same (i.e., unsigned
short, as Type is defined in types.h). So, comparison between atom types
remains as fast as before.
Each atom type is now a static variable, not a constant defined with
preprocessor #define command. This is the right way to define these
constants and may save us from headache as we add/integrate more code to
OpenCog. For example, we had many problems with 3rd-party code (or even
code written by our team) that include any of these constants for some
reason (any NODE, LINK and ATOM patterns, for example, will be replaced
by its corresponding number where they occur...).
> -- Many of the types I use are only referenced in scheme
> code. That is, if those types were absent at compile time,
> it wouldn't be noticed, but the scheme code would break
> at runtime. I'd like some guarantee of it not breaking :-)
>
> I guess I'd have to write scheme wrappers for the
> class server, and then create all needed atom types,
> before using them?
>
Right. I see the problem. If the scheme code really needs to know all
atom types (even those from specific modules or applications), it would
break at runtime. Unless we define all atom types in
opencog/atomspace/atom_types.script, as I said before.
Anyway, if schema code refers to the number (not the name) of each atom
type it uses, we still have a problem.
I think scheme wrappers for accessing class server may be a solution.
However, the idea is that the creation of specific atom types must be
done by the specific module/application that needs it. So, in order to
make schema code knows about all possible atom types, we may need to
synchronize the initialization of the modules somehow so that schema
code knows when the startup is done and it may get all created atom
types from ClassServer.
Let me know if this is the case before I start thinking of a solution
for that...
> -- The SQL persistence code saves and restores atom
> types. It has been designed so that the type numbering
> can change between save and restore, and so the
> type id numbers can be safely changed. However,
> during restore, it does expect that all needed types
> already exist. It would be fairly straightforward to modify
> the code to create the needed types, before restoring
> from the DB. But it is extra coding work.
>
Well, it seems just a matter of wait for the basic initialization
(startup) of a cogserver before restoring any atom from a DB.
This is what we do in OpenPetBrain code, which uses SavingLoading class
to save/restore AtomSpace from disk. Although I don't know the SQL
persistence code yet, it seems the same approach could be applied to it...
I agree this is extra work, but it does not seem too much work at all.
It's almost impossible to evolve a software without having some extra work.
Anyway, I can take care of it myself, if nobody else can help me on this
now.
That's why I'm asking for the relevant unit tests (or any specific or
manual test) that may help me on checking if the extensible classserver
will break anything. It would be too much easier if I could validate my
changes with such tests.
> As you can see, the last two points require me to do
> more work, I'm guessing about a week or so. And
> having to do work that I'd otherwise not have to do
> makes me cranky :-/ But I can do it as long as its
> not urgent.
>
Right. Some eventual support on my questions should be enough, I guess.
> Could there be a transition plan: e.g. pre-create all
> of the existing atom types? That way, existing code could
> work indefinitely; later, when I have more time and feel
> more relaxed, I could code up the needed scheme and
> persist shims.
>
Yes, I guess this may work for starters. Anyway, some code adaptations
will be needed, but I don't think they will be big or complex at all.
Specifically for OpenPetBrain, right now we don't need scheme code
dealing its specific atom types, right? However, AFAIK, the plan is to
use scheme in OpenPetBrain in the near future. When we're about to do
that, we can re-evaluate the code and decide on the best approach:
pre-create all atom types in opencog/atomspace or make design changes to
the code -- including scheme -- so it become flexible enough to handle
specific atom types (even for those created at runtime, if it's proved
to be needed later).
Welter
Welter
> The main code is actually the cmake macro OPENCOG_ADD_ATOM_TYPES, which
> process the script file that defines the atom types and creates C++ code
> automatically.
Oh. Sounds like a trivial change.
> For C++ code, maybe not, since the atom type names remain the same.
> The only difference is that they are now static variables, not constants.
Hm, well, this is a bit of a performance hit, since static
variables need to be looked up in the module TOC.
But, OK, that's the price of run-time flexibility.
>> other tricky parts -- the tests presume the existence of
>> various data stores, which users would have to download
>> and install.
>>
> Hmm, I see. You don't mean postgres databases used in unit tests, but
> other (maybe external) data stores, right?
Yes, to both. The WSD code won't do anything if
you don't have a table of word sense similarities installed.
----
Anyway, I just added one small, simple "sniff" test for
the basic scheme functions.
--linas
Linas Vepstas wrote:
>> Hmm, I see. You don't mean postgres databases used in unit tests, but
>> other (maybe external) data stores, right?
>>
>
> Yes, to both. The WSD code won't do anything if
> you don't have a table of word sense similarities installed.
>
Well, I'll finish the merge and publish the resulting branch somewhere
at launchpad (it will be useful for others to test OpenPetBrain anyway).
Then we decide how to make these tests before committing the changes to
the staging branch.
> ----
> Anyway, I just added one small, simple "sniff" test for
> the basic scheme functions.
>
OK, but I'm afraid to merge my local branch with your recent changes,
since I've noticed from buildbot that the latest revision has now 3
failing unit tests:
http://buildbot.opencog.org/staging/builders/ubuntu-intrepid-x86/builds/99/steps/test/logs/stdio
Welter
Since they are not changing, couldn't they be made static const variables?
Surely compilers are smart enough now to treat these essentially as
defines/constants (but with a symbol name, making debugging much
easier).
J
Sorry! Obviously I wasn't paying attention ;-)
One thing I'm concerned about, however, is that it seems like you're
suggesting that all modules *need* to be initialized at start up of
the server?
Currently we can load and unload modules dynamically while the server
is running and it's been very useful (not least because I can
recompile a module and then reload it. This avoids having to go
through the process of getting the AtomSpace in the same state again
while debugging). I'd really dislike losing this functionality.
J
No, they can't, because this would lead to "uninitialized const"
compiler errors,
since they are initialized by calling the ClassServer::addType(Type
parent, const string& name) method in non-static blocks, as follows:
/* File automatically generated. Do not edit */
opencog::ATOM = opencog::ClassServer::addType(opencog::ATOM, "Atom");
opencog::NODE = opencog::ClassServer::addType(opencog::ATOM, "Node");
opencog::LINK = opencog::ClassServer::addType(opencog::ATOM, "Link");
opencog::CONCEPT_NODE = opencog::ClassServer::addType(opencog::NODE,
"ConceptNode");
opencog::NUMBER_NODE = opencog::ClassServer::addType(opencog::NODE,
"NumberNode")
...
Using static blocks to initialize them would be tricky and dangerous
since the order of execution of static blocks is undefined (or
compiler/platform dependent) among other issues. See
http://www.tilander.org/aurora/2007/10/static-initialization-in-c.html
Welter
Welter
Cool, glad to hear we can still use the modules in this way, and
thanks for the info.
J
yep. Since the values are not known at compile time,
the compiler can't optimize them. Now, *if* the values
were known at compile time, then it wouldn't be
"debuggable" the way you'd hoped: the compiler
would throw away the symbol.
This is entirely analogous to inline functions: when
a function has been inlined, you will not see it in a
stack trace any more, since it no longer exists as
a symbol (at least, not in the compilation unit where
it was inlined).
--linas
Yes, OK, thanks for the long reply, I think its all good; this
sounds reasonable and doable.
> Using this approach, we can have a minimal set of basic atom types defined
> in opencog/atomspace and all other specific ones defined by specific modules
> or applications
Yes, OK. I'd like to ask that the NLP-related nodes and
links stay there for the short term. As I get time, I'll deal
with redoing my other parts to "do the right thing".
[... scheme discusssion ...]
I notice you kept writing "schema" .. you realize scheme
and opencog schemas are two unrelated things, right?
> Let me know if this is the case before I start thinking of a solution for
> that...
I'll deal with it. The C++ code to declare a new node type
called "BlahBlahNode" in scheme is nearly trivial:
SchemeEval eval;
eval.eval("(define (BlahBlahLink . x)
(apply cog-new-link (append (list 'BlaBlahLink) x)))");
that's it. This code can be put anywhere where its
convenient. Registering a callback with the class server,
to find out when a new type has been created, would be
great. I assume there will be such callbacks ...
> I agree this is extra work, but it does not seem too much work at all.
Yeah, its pretty easy, and its not a lot of work. Its just
easier if I can save it for later, for some "rainy day".
--linas
Linas Vepstas wrote:
> Yes, OK. I'd like to ask that the NLP-related nodes and
> links stay there for the short term. As I get time, I'll deal
> with redoing my other parts to "do the right thing".
>
OK
> [... scheme discusssion ...]
>
> I notice you kept writing "schema" .. you realize scheme
> and opencog schemas are two unrelated things, right?
>
I know "scheme" (the script language) and "schema" are unrelated things,
although I'm not sure what you mean by "opencog schemas". I've been
written "schema" (meaning combo schema) for so long time while
developing PetBrain project. That's why I do this typo every time. Sorry...
>> Let me know if this is the case before I start thinking of a solution for
>> that...
>>
>
> I'll deal with it. The C++ code to declare a new node type
> called "BlahBlahNode" in scheme is nearly trivial:
>
> SchemeEval eval;
> eval.eval("(define (BlahBlahLink . x)
> (apply cog-new-link (append (list 'BlaBlahLink) x)))");
>
> that's it. This code can be put anywhere where its
> convenient. Registering a callback with the class server,
> to find out when a new type has been created, would be
> great. I assume there will be such callbacks ...
>
OK, the registered callbacks can be implemented easily as soon as we
need it.
>
>> I agree this is extra work, but it does not seem too much work at all.
>>
>
> Yeah, its pretty easy, and its not a lot of work. Its just
> easier if I can save it for later, for some "rainy day".
>
All right.
thanks,
Welter