Is the open source AtomSpace?


tom.mayna...@gmail.com

unread,
Oct 28, 2016, 6:29:18 PM10/28/16
to opencog
I am new to OpenCog and I can't find any reasonable open-source AtomSpace / knowledge base. Do OpenCog users publish their AtomSpaces as open-source projects? If all the important AtomSpaces are closed source, would it be possible to start an open-source AtomSpace that could grow into the most complete knowledge base, be used for research projects, and reach or surpass the level of the Cyc knowledge base?

What is the point of using OpenCog if no AtomSpace is available?

T.M.R.

tom.mayna...@gmail.com

unread,
Oct 28, 2016, 6:33:29 PM10/28/16
to opencog
Specifically, I am trying to formalize legal knowledge, and I find it necessary to use real-world concepts; for example, formalizing VAT tax law requires ontologies of goods, of delivery terms, and of company types. It would be easier to contribute to an already existing knowledge base than to build one from scratch.

Ben Goertzel

unread,
Oct 28, 2016, 10:15:42 PM10/28/16
to opencog, Andre Luiz de Senna, Cassio Pennachin
Hi Tom,

Hmm, it's true that the best Atomspaces I'm aware of are for
proprietary projects (a Hanson Robotics one with some
robot-character-personality stuff; and one for a consulting project
that has a customer's product catalogue and related info loaded into
it...)

There is a bio-Atomspace with a bunch of bio-ontologies in it, but
that's kinda specialized...

Creating a good Atomspace for public use is a good project that
should be done, I agree...

Senna has made the best start on this, I think, by writing code to
import Simple English Wikipedia into the Atomspace and saving a bunch
of these Atoms in Postgres...

We could perhaps put a big Postgres DB on AWS somewhere, containing
the Atoms resulting from parsing Simple English Wikipedia. That would be
a start.
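To make the idea concrete, here is an illustrative sketch of a relational layout for persisting Atoms, using an in-memory sqlite3 database as a stand-in for the Postgres backend Ben describes. The table layout and the Atoms stored are invented for illustration; the real AtomSpace SQL backend has its own schema.

```python
# Toy sketch only: a simplified relational layout for Atoms, with
# sqlite3 standing in for Postgres. Not the actual AtomSpace schema.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE atoms (
        uuid INTEGER PRIMARY KEY,
        type TEXT NOT NULL,   -- e.g. ConceptNode, InheritanceLink
        name TEXT,            -- node name; NULL for links
        outgoing TEXT         -- comma-separated uuids for links
    )
""")

# Store a tiny fragment: (InheritanceLink (ConceptNode "cat")
#                                         (ConceptNode "animal"))
cur.execute("INSERT INTO atoms VALUES (1, 'ConceptNode', 'cat', NULL)")
cur.execute("INSERT INTO atoms VALUES (2, 'ConceptNode', 'animal', NULL)")
cur.execute("INSERT INTO atoms VALUES (3, 'InheritanceLink', NULL, '1,2')")

cur.execute("SELECT COUNT(*) FROM atoms")
print(cur.fetchone()[0])  # 3
```

At Simple-English-Wikipedia scale the same idea applies, just with Postgres, indexes on type/name, and bulk loading rather than row-at-a-time inserts.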

Be aware however that parsing Simple English Wikipedia currently
results in a lot of Atoms, i.e. way more than you're gonna fit in RAM
on one machine unless you have a supercomputer...

ben
> --
> You received this message because you are subscribed to the Google Groups
> "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to opencog+u...@googlegroups.com.
> To post to this group, send email to ope...@googlegroups.com.
> Visit this group at https://groups.google.com/group/opencog.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/opencog/b5f6352c-38e0-4a1b-a257-7c850937e022%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.



--
Ben Goertzel, PhD
http://goertzel.org

“I tell my students, when you go to these meetings, see what direction
everyone is headed, so you can go in the opposite direction. Don’t
polish the brass on the bandwagon.” – V. S. Ramachandran

Linas Vepstas

unread,
Oct 31, 2016, 4:57:40 PM10/31/16
to opencog, Andre Luiz de Senna, Cassio Pennachin
On Fri, Oct 28, 2016 at 9:15 PM, Ben Goertzel <b...@goertzel.org> wrote:

Be aware however that parsing Simple English Wikipedia currently
results in a lot of Atoms, i.e. way more than you're gonna fit in RAM
on one machine unless you have a supercomputer...

Really?  That's wrong. I am guessing that is because every single sentence was stored, and that is a mistake: instead, we should extract plausible meanings for words, and keep only those, and discard the individual words.

I discovered, maybe 5 or more years ago, that the LG disjuncts correlate very well with word meanings. Thus, I would recommend the following "beginner" database: parse SEW, assign to each word its "meaning", i.e. its disjunct, and then keep only the network of these word "meanings". Discard all the individual word instances.

I bet the result of that would fit in 8GB, and would be an adequate rough cut as input to PLN.  I doubt that it would be any worse, in terms of data quality, than, say, WordNet or FrameNet.  It would probably be higher quality, is my guess, based on previously fumbling with this stuff.
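The aggregation Linas describes can be illustrated with a toy sketch (this is not the actual Link Grammar pipeline; the "parses" and disjunct strings below are made up, and a real run would take disjuncts from the LG parser):

```python
# Toy illustration: instead of storing every parsed sentence, keep only
# an aggregated count of (word, disjunct) pairs -- the word "meanings" --
# and discard the individual word instances.
from collections import Counter

# Pretend parser output for three sentences: lists of (word, disjunct).
parses = [
    [("the", "D+"), ("cat", "D- S+"), ("ran", "S-")],
    [("the", "D+"), ("dog", "D- S+"), ("ran", "S-")],
    [("a", "D+"), ("cat", "D- S+"), ("slept", "S-")],
]

# Aggregate counts, then throw the sentences away.
meanings = Counter(pair for sentence in parses for pair in sentence)

print(meanings[("cat", "D- S+")])  # 2 observations of this word-meaning
print(len(meanings))               # 6 distinct word-meanings
```

The storage win is that the aggregate grows with the number of distinct word-meanings, not with the number of sentences, which is why the result could plausibly fit in a few GB.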

--linas


Matt Chapman

unread,
Oct 31, 2016, 6:09:14 PM10/31/16
to opencog, Andre Luiz de Senna, Cassio Pennachin

On Fri, Oct 28, 2016 at 7:15 PM, Ben Goertzel <b...@goertzel.org> wrote:
Senna has made the best start for this, I think, via writing code to
import Simple English Wikipedia into the Atomspace, and saving a bunch
of these Atoms in postgres...

Where is this code? I didn't find it browsing through the GitHub repos for a few minutes...

All the Best,

Matt

Linas Vepstas

unread,
Oct 31, 2016, 9:05:14 PM10/31/16
to opencog, Andre Luiz de Senna, Cassio Pennachin
Hi Matt,

I haven't forgotten about your bot; it's just that I have to find a moment.

I don't know what Senna is using, but there are a bunch of wikipedia-article-parsing and management scripts here: https://github.com/opencog/relex/tree/master/src/perl  I used those scripts to strip out HTML markup and feed the articles one by one through the parser, dumping the results into opencog, which then ran additional scripts to pick through the data.
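The first step of that pipeline, stripping markup to get plain article text, can be sketched as follows (a rough stand-in using the Python stdlib; the actual perl scripts in opencog/relex do considerably more cleanup, and the sample HTML here is invented):

```python
# Rough sketch: extract plain text from a wiki article's HTML before
# feeding it to the parser. Uses the stdlib HTMLParser only.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Collect non-empty text nodes, dropping surrounding whitespace.
        text = data.strip()
        if text:
            self.chunks.append(text)

article_html = "<h1>Cats</h1><p>The cat is a small <b>mammal</b>.</p>"
extractor = TextExtractor()
extractor.feed(article_html)
plain_text = " ".join(extractor.chunks)
print(plain_text)  # Cats The cat is a small mammal .
```

From there, each article's text would be fed one at a time through the parser, as Linas describes.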

--linas

Leung Man Hin

unread,
Nov 1, 2016, 11:39:52 PM11/1/16
to André Senna, opencog, Andre Luiz de Senna, Cassio Pennachin
On Wed, Nov 2, 2016 at 1:49 AM, André Senna <se...@igenesis.com.br> wrote:
Man Hin, do you think SuReal will work with this simplification?


I am not sure... I'll find out what is minimally required for SuReal to generate sentences.

 
--
Andre Senna
IGENESIS - http://www.igenesis.com.br
(31) 9120 3292
