Has anyone created an English to Narsese translator using an LLM?


stephen clark

Oct 1, 2023, 9:57:26 AM
to open-nars
Hi,

I have seen a couple of demos where Dr. Patrick Hammer has created a few
components of the process to translate English to Narsese.

Has anyone built out a dataset to fine-tune an LLM?

If not, any suggestions for places where I can acquire data would be
appreciated.


Thanks,

Stephen

Patrick Hammer

Oct 1, 2023, 10:40:49 AM
to open...@googlegroups.com
Hi Stephen!

You can just call me Patrick! :)

In recent months we have built a quite sophisticated natural language channel that makes use of GPT-4 or, alternatively, GPT-3.5.
Please feel free to try it:

It is based on prompting. We do not have a dataset for fine-tuning at the current stage, but I think that is a great idea for improving LLMs on this task and for making open-source models competitive on it.

Best regards,
Patrick




Maxim Tarasov

Oct 2, 2023, 11:11:58 AM
to open-nars
Hi Stephen,

One way I can think of to begin compiling such a dataset is to use existing test cases in OpenNARS or ONA. For example, here's nal1.0.nal:
```
'Revision ------

'Bird is a type of swimmer.
<bird --> swimmer>.

'Bird is probably not a type of swimmer.
<bird --> swimmer>. %0.10;0.60%

1

'Bird is very likely to be a type of swimmer.
''outputMustContain('<bird --> swimmer>. %0.87;0.91%')
```

Most test cases pair a Narsese statement with an English comment line directly above it. It would be trivial to write a script that scrapes Narsese-English pairs from all the test cases and compiles a small-ish dataset this way. One could then post-process the resulting set to expand it with synthetic data, for example by doing a simple search and replace to turn "Bird is a type of swimmer" into "Bird is a type of flyer", "Fish is a type of swimmer", etc.
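As a rough illustration of the scraping and search-and-replace idea, here is a minimal Python sketch. It assumes the layout seen in the nal1.0.nal excerpt: a line starting with one `'` is an English comment, a line starting with `''` is a test directive, a bare integer is a cycle count, and the Narsese sits on the line right after its comment. The names `extract_pairs` and `swap_term` are made up for this example and are not part of OpenNARS or ONA; other test files may need extra rules.

```python
import re

# Excerpt of nal1.0.nal as quoted in this thread, used as sample input.
SAMPLE = """'Revision ------

'Bird is a type of swimmer.
<bird --> swimmer>.

'Bird is probably not a type of swimmer.
<bird --> swimmer>. %0.10;0.60%

1

'Bird is very likely to be a type of swimmer.
''outputMustContain('<bird --> swimmer>. %0.87;0.91%')
"""


def extract_pairs(nal_text):
    """Return (english, narsese) pairs found in the text of a .nal file."""
    pairs = []
    comment = None  # most recent English comment that has no Narsese yet
    for raw in nal_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("''"):
            # Two apostrophes: a test directive such as outputMustContain.
            comment = None
        elif line.startswith("'"):
            text = line[1:].strip()
            # Assumption: headers like "Revision ------" end in a dash.
            comment = None if text.endswith("-") else text
        elif line.isdigit():
            # A bare integer (cycle count in these tests), not a statement.
            comment = None
        elif comment is not None:
            pairs.append((comment, line))
            comment = None
    return pairs


def swap_term(pair, old, new):
    """Replace one term in both halves of a pair, keeping capitalization."""
    pattern = re.compile(r"\b" + re.escape(old) + r"\b", re.IGNORECASE)

    def repl(match):
        return new.capitalize() if match.group(0)[0].isupper() else new

    english, narsese = pair
    return pattern.sub(repl, english), pattern.sub(repl, narsese)


pairs = extract_pairs(SAMPLE)
synthetic = [swap_term(pairs[0], "bird", "fish"),
             swap_term(pairs[0], "swimmer", "flyer")]
```

The last comment in the excerpt is followed only by an expected-output directive, so this sketch drops it. Whether expected outputs should also count as training pairs is a separate design decision.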

This would be just a start, of course. There's also the question of whether you want to keep the truth values in the set or are only interested in the statements. In the latter case you'd probably want to do something special to differentiate between "Bird is a type of swimmer" and "Bird is probably not a type of swimmer", which in the example above both translate to <bird --> swimmer> but with different truth values.
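One possible way to handle the truth-value question is to split each Narsese line into a statement and an optional (frequency, confidence) tuple, so a dataset can keep or drop the truth value per record. The sketch below is only an assumption about how that might look: `split_truth` is a hypothetical helper, and the regular expression covers only the `%f;c%` form shown in the example above.

```python
import re

# Matches a trailing truth value of the form %frequency;confidence%.
TRUTH = re.compile(r"\s*%(\d*\.?\d+);(\d*\.?\d+)%\s*$")


def split_truth(narsese):
    """Split a Narsese line into (statement, truth), truth being None if absent."""
    match = TRUTH.search(narsese)
    if match is None:
        return narsese.strip(), None
    statement = narsese[:match.start()].strip()
    return statement, (float(match.group(1)), float(match.group(2)))


plain = split_truth("<bird --> swimmer>.")
negated = split_truth("<bird --> swimmer>. %0.10;0.60%")
```

Here `plain` and `negated` share the same statement and differ only in the second field, which makes the ambiguity described above explicit and leaves room to map truth values to coarser labels later if desired.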