Hi Vignav,

On Thu, Jun 25, 2020 at 12:57 AM <rvi...@gmail.com> wrote:

> Hi Linas,
>
> Thanks for your feedback. Quick question - are there any files or links in particular, as part of opencog/generate, that would be most helpful as a reference for the general code structure when building the language production engine in Java?

I don't know how to answer that question, other than to say "all of it", which is not what you want to hear...

The README explains in detail what the code does: it walks over a collection of "jigsaw puzzle pieces" and attempts to join them together. It does this as a breadth-first search, that is, by adding one piece at a time, always trying to attach it to one of the existing unconnected connectors. This sounds easy, but is remarkably hard -- I had to create something I call an "odometer" to track the state of what has been tried and what has not been tried. It took me a while to get it to work right ... there were some tricky bits ... but now it works.

So, basically, I'm saying: gee, well, here is this complicated code ... do you really want to port that to Java?

There is also some interesting interplay with something called "the pattern engine" inside of the atomspace. Let me give a very short explanation. Say you have a function f(x) and you want to plug a value "b" into it, to get "f(b)". What you are "really doing" is connecting "f(x) <==> b", where f(x) is one "jigsaw puzzle piece connector" (say, the hole) and "b" is the jigsaw tab that plugs in. Painfully obvious, right?

So, the pattern engine is currently a database search engine that has a bunch of "holes", aka "variables x,y,z...", and it searches for all "patterns" which can match those "holes" (hence the name "pattern matcher"). It vaguely resembles a "perl regex" and an "SQL query" and a "prolog inference step" all mashed up into one, and generalized for graph search.

I have a plan to generalize the pattern engine "real soon now", turning the "variables x,y,z" and the "things that match them" into generalized "jigsaw puzzle piece connectors". This is a natural generalization, because the pattern engine already applies type-checking (so, for example, you can only plug an int into an int variable, a float into a float variable, an instance of "class foo" into something that takes "class foo" as an argument ... think of function signatures as you already know them in Java). So the pattern engine already knows how to plug "instances of things" into "fill-in-the-blanks" search queries, with type-checking (with a full type-theoretical type hierarchy and type constructors). I want to generalize that into arbitrary user-defined "connectors" that can "connect together", instead of "instances of objects" and "things that accept instances of objects".

I want to do the above in parallel with the graph generator work, so that the two systems are inter-compatible. I think it's neat.

Instead of thinking "jigsaw puzzle pieces" you can think of biology analogs: bits of proteins that fit into other proteins. Immunoglobulin parts that can mate with, or stick to, other parts. RNA that sticks to DNA ... So, in a certain sense, I feel like I am re-inventing biochemistry: different "types" have different "affinities" for forming "chemical bonds".

-- Linas

p.s. I'm cc'ing the opencog mailing list as a way of saying "here's another way of explaining that sheaf thing I keep talking about".

p.p.s. To avoid confusion between link-grammar links and atomspace links, I plan to rename the link-grammar links to "bonds". All the old language-learning code, and the generator code, still calls them "links", but this is confusing. Of course, I cannot rename "link grammar" to "bond grammar", so that stays unchanged. But calling them "bonds", in analogy to "chemical bonds", gets across a better idea of the concept of attachment.

I'm not the first: Eugene A Nida, "The Molecular Level of Lexical Semantics", https://www.academia.edu/36534355/The_Molecular_Level_of_Lexical_Semantics_by_EA_Nida
International Journal of Lexicography, Volume 10, Issue 4, December 1997, Pages 265–274, https://doi.org/10.1093/ijl/10.4.265

> Thanks,
> Vignav
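For readers thinking about this in Java, the breadth-first joining of "jigsaw puzzle pieces" described in the message above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the opencog/generate implementation: the class and method names (`JigsawSketch`, `Piece`, `State`, `mates`, `generate`) and the tiny dictionary are hypothetical, connectors are written link-grammar style as a label plus a `+` or `-` direction, a plain queue of partial assemblies stands in for the "odometer", and word order, planarity and links between already-placed pieces are ignored.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: none of these names come from opencog/generate.
final class JigsawSketch {

    // A puzzle piece: a word plus its connectors, written like "D-" or "S+".
    static final class Piece {
        final String word;
        final List<String> connectors;
        Piece(String word, List<String> connectors) {
            this.word = word;
            this.connectors = connectors;
        }
    }

    // A partial assembly: the words placed so far and the connectors still open.
    static final class State {
        final List<String> words;
        final List<String> open;
        State(List<String> words, List<String> open) {
            this.words = words;
            this.open = open;
        }
    }

    // Two connectors mate when their labels agree and their directions are opposite.
    static boolean mates(String a, String b) {
        String labelA = a.substring(0, a.length() - 1);
        String labelB = b.substring(0, b.length() - 1);
        char dirA = a.charAt(a.length() - 1);
        char dirB = b.charAt(b.length() - 1);
        boolean opposite = (dirA == '+' && dirB == '-') || (dirA == '-' && dirB == '+');
        return opposite && labelA.equals(labelB);
    }

    // Breadth-first search: extend each partial assembly by one piece at a time,
    // always attaching the new piece to the first still-open connector.
    static List<List<String>> generate(List<Piece> dictionary, Piece seed, int maxPieces) {
        List<List<String>> complete = new ArrayList<>();
        Deque<State> queue = new ArrayDeque<>();
        queue.add(new State(List.of(seed.word), seed.connectors));
        while (!queue.isEmpty()) {
            State s = queue.poll();
            if (s.open.isEmpty()) {          // nothing left unconnected: done
                complete.add(s.words);
                continue;
            }
            if (s.words.size() >= maxPieces) {
                continue;                    // size limit reached: abandon this branch
            }
            String target = s.open.get(0);
            List<String> rest = s.open.subList(1, s.open.size());
            for (Piece p : dictionary) {
                for (int i = 0; i < p.connectors.size(); i++) {
                    if (!mates(target, p.connectors.get(i))) {
                        continue;
                    }
                    List<String> words = new ArrayList<>(s.words);
                    words.add(p.word);
                    // The new piece's remaining connectors become open connectors.
                    List<String> open = new ArrayList<>(rest);
                    for (int j = 0; j < p.connectors.size(); j++) {
                        if (j != i) {
                            open.add(p.connectors.get(j));
                        }
                    }
                    queue.add(new State(words, open));
                }
            }
        }
        return complete;
    }

    public static void main(String[] args) {
        List<Piece> dictionary = List.of(
            new Piece("the", List.of("D+")),
            new Piece("cat", List.of("D-", "S+")),
            new Piece("dog", List.of("D-", "S+")),
            new Piece("sleeps", List.of("S-")),
            new Piece("barks", List.of("S-")));
        // Start from "the" and allow at most four pieces per assembly.
        for (List<String> words : generate(dictionary, dictionary.get(0), 4)) {
            System.out.println(String.join(" ", words));
        }
    }
}
```

With this toy dictionary the search yields four closed assemblies ("the cat sleeps", "the cat barks", "the dog sleeps", "the dog barks"). The `maxPieces` cut-off is an assumption added here to keep the search finite; a real dictionary with recursive connectors would need a more careful bookkeeping of what has and has not been tried, which is the role the "odometer" plays in the message above.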
On Sunday, June 7, 2020 at 1:48:34 AM UTC-7, Anton Kolonin @ Gmail wrote:
Hi Linas, thank you for the guidance.
>Again, I would like to remind you that there already is an existing project for language generation here: https://github.com/opencog/generate that is link-grammar compatible, and it already generates simple sentences from simple dictionaries.
Cool, worth looking into that as a reference!
>I would rather see work proceed on that, rather than all-new green-field development. There are major compatibility problems with coding in java. It's not a good language choice for these kinds of projects. -- speaking from experience.
In this case, we are looking for a "pure Java" solution, so that it can run under the Aigents framework on smartphones and JVM-enabled coffee machines ;-)
We also need the grammar to be coupled with the ontology, which is already present in the in-memory Java graph DB.
Thanks,
-Anton
On 07/06/2020 11:04, Linas Vepstas wrote:
Hi Vignav,
Mailing lists work better if you actually subscribe to them. As it is, you will not receive any replies unless people explicitly CC you.
On Sat, Jun 6, 2020 at 10:45 PM Vignav Ramesh <rvi...@gmail.com> wrote:
Hi Linas & Amir,
This is Vignav. I am working with Anton on the Java-based NL language production for Aigents.
I am taking a look at the LinkGrammar.java file in bindings/java/org/linkgrammar, but there does not seem to be much clear documentation on how to use the LinkGrammar class and its methods after downloading the file and importing it. Since it has native methods, I am assuming it uses JNI and that there is a process for integrating the C and Java code. Is there any documentation on this that I can follow to sort this all out and get the Java version working?
If you look at the java-jni directory, you will find the java jni bindings. You should spend some quality time reading the contents of java-jni/jni-client.h and java-jni/jni-client.c so that you can clearly understand what the jni bindings actually are. After all, this is open source, and part of what makes it open is that you can actually examine and explore it, instead of depending on a corporation to feed you one spoonful at a time.
The jni bindings have corresponding java files. If you run `find bindings/java/org` you will see:
org
org/linkgrammar
org/linkgrammar/Link.java
org/linkgrammar/LGConfig.java
org/linkgrammar/LGRemoteClient.java
org/linkgrammar/ParseResult.java
org/linkgrammar/LinkGrammar.java
org/linkgrammar/JSONUtils.java
org/linkgrammar/LGService.java
org/linkgrammar/Linkage.java
org/linkgrammar/JSONReader.java
Hopefully it is painfully obvious what each file does, given its name: a configuration file, two json processing files, a file for working with parses, a file for working with linkages, a server file, a client file, and an API file.
All of the documentation is in javadoc format. All you have to do is to run your favorite javadoc tool on it and you will have full and complete documentation for everything. Please keep in mind that many different kinds of documentation systems are compatible with javadoc, and so just about any system will produce documentation for you.
Again, I would like to remind you that there already is an existing project for language generation here: https://github.com/opencog/generate that is link-grammar compatible, and it already generates simple sentences from simple dictionaries. I would rather see work proceed on that, rather than all-new green-field development. There are major compatibility problems with coding in java. It's not a good language choice for these kinds of projects. -- speaking from experience.
-- Linas
Thanks,
Vignav
On Sat, Jun 6, 2020 at 9:56 AM Linas Vepstas <linasv...@gmail.com> wrote:
Hi Anton,
On Fri, Jun 5, 2020 at 11:24 PM Anton Kolonin @ Gmail <akol...@gmail.com> wrote:
Hi Linas and Amir!
We are going to try using the LG formalism for language production in a Java
project:
https://github.com/aigents/aigents-java/issues/22
So, NL Generation is the goal of https://github.com/opencog/generate -- it already generates small sentences from simple vocabularies just fine. I have not attempted anything complex, maybe it will work and maybe it won't. It's alpha version 0.1.0 so many of the things I can think of that I want to have are absent.
It's not Java.
I think it would be awesome if you could work on that project, but I imagine that you would not want to, that you prefer green-field solutions written by your own people over which you have total control.
Note that we are not going to use LG to "parse" NL texts; we are going to
use it to "generate" NL texts (the opposite task, but the same formalism
and the same dictionaries are to be used).
Can one of you answer some questions?
1. What is the current location of the best-of-breed LG dictionaries
(for English and Russian in particular)?
Comes with LG
2. What is the location of most reliable code branch to read these
dictionaries?
Comes with LG
3. Are there any known Java projects or pieces that can be reused
under an OSS license?
Comes with LG
4. Can we use this tutorial
(https://www.abisource.com/projects/link-grammar/api/index.html) to make
a Java implementation of the Link Grammar parser?
Yes, that is the official LG documentation.
5. We rewrote the C++ code from the tutorial above in Java - but any
recommendation on what the Java substitute for #include
"link-includes.h" is? We know it has to do with the Java bindings in the
opencog repo but we are not totally sure how to use that.
LG already comes with java bindings that work in both local and remote mode, and also comes with two different java servers. One java server generates json and the other generates atomese.
Depending on what API you want, use either bindings/java/org/linkgrammar/LinkGrammar.java or bindings/java/org/linkgrammar/LGRemoteClient.java. The README file explains how to use these.
-- linas
cassette tapes - analog TV - film cameras - you
You received this message because you are subscribed to the Google Groups "link-grammar" group.
To unsubscribe from this group and stop receiving emails from it, send an email to link-g...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/link-grammar/CAHrUA37h_f_9dLK2toQxyqXjEmDBcfst%2Bi-WqvpFUnCpD9d93w%40mail.gmail.com.
-- -Anton Kolonin skype: akolonin cell: +79139250058 akol...@aigents.com https://aigents.com https://www.youtube.com/aigents https://www.facebook.com/aigents https://wt.social/wt/aigents https://medium.com/@aigents https://steemit.com/@aigents https://reddit.com/r/aigents https://twitter.com/aigents https://golos.in/@aigents https://vk.com/aigents https://aigents.com/en/slack.html https://www.messenger.com/t/aigents https://web.telegram.org/#/im?p=@AigentsBot
To view this discussion on the web visit https://groups.google.com/d/msgid/link-grammar/570da168-fce7-4630-8347-1518171e07dbo%40googlegroups.com.