[ANN] Shen-C

Affichage de 116 messages sur 16
[ANN] Shen-C Tatsuya Tsuda 17/04/17 22:16
I have been developing a C port of Shen and finally passed all 129 tests of Shen 19.3.1.

-Shen-C

-Test results

Some details of the implementation:
-Implemented in C99 as an Interpreter
-Tested on macOS and Clang (GCC also should work)
-Using Boehm GC
-Implemented TCO for self recursion by transforming it into a Clojure like loop-recur expression 
-Compiled with -O3 compiler option but the implementation is not optimized yet 
(the standard test takes about 646 secs on my 2011 MacBook Air)

I have been following the recent discussion about the performance optimizations, so the next step is to do it :)

Tatsuya Tsuda
Re: [ANN] Shen-C Mark Tarver 18/04/17 00:05
My heartiest congratulations from all of us.

Mark :)
Re: [ANN] Shen-C Tatsuya Tsuda 18/04/17 05:10
Thank you so much and for all of your great work!

Tatsuya Tsuda

2017年4月18日火曜日 16時05分32秒 UTC+9 Mark Tarver:
Re: [ANN] Shen-C Bruno Deferrari 18/04/17 05:31
Great work, congratulations!

I added a section on the wiki about performance optimizations, but right now it is very incomplete, what it covers is mostly stuff that you can do in the soon to be released ShenOS 20, which I don't think I will be useful for you at this point where your biggest performance gains will come from making your interpreter evaluator faster.


Have you already decided what to do first in terms of performance optimizations?



--
You received this message because you are subscribed to the Google Groups "Shen" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qilang+un...@googlegroups.com.
To post to this group, send email to qil...@googlegroups.com.
Visit this group at https://groups.google.com/group/qilang.
For more options, visit https://groups.google.com/d/optout.



--
BD
Re: [ANN] Shen-C Tatsuya Tsuda 18/04/17 06:50
Thank you!

Peephole optimization as you suggested before should be a good candidate.

Other than that I'd like to do some profiling to understand the bottleneck.

Tatsuya Tsuda

2017年4月18日火曜日 21時31分46秒 UTC+9 Bruno Deferrari:
Re: [ANN] Shen-C Bruno Deferrari 18/04/17 07:03
On Tue, Apr 18, 2017 at 10:49 AM, Tatsuya Tsuda <ota...@gmail.com> wrote:
Thank you!

Peephole optimization as you suggested before should be a good candidate.

Other than that I'd like to do some profiling to understand the bottleneck.


Yes, if you can profile it, that is going to be your best guide.

I gave a quick read to the code, and something I noticed is that symbol's are not interned, which means you always have to compare them with strcmp. If you add symbol interning so that each symbol is unique you will be able to use pointer comparison which is much faster (and also save memory), and being that the interpreter has to compare symbols a lot I would expect a big payoff. Doing this should not require any big changes in the structure of your interpreter.

The next thing is that there are a lot of dynamic lookups happening that could be avoided. Once your symbols are interned, you can make it so that the symbol object contains a pointer to the function defined with that name (if any), that way, upon function application your interpreter just has to extract the value from the immediate symbol object instead of having to do a dynamic lookup. This also, is a change that will not require big changes to the way the interpreter is structured.

Most of what was discussed with Robert for the F# implementation probably applies to your interpreter too (I think you said you already read that, but mentioning it again just in case).



--
BD
Re: [ANN] Shen-C Tatsuya Tsuda 18/04/17 08:11
Thank you for the suggestions.
As you know, currently symbols are just a string and adding a new struct Symbol which contains a name and a pointer to a function or variable value seems to be a good solution.
Yes, I have read the discussion of the F# implementation and would like to take some ideas from it.

Tatsuya Tsuda

2017年4月18日火曜日 23時03分57秒 UTC+9 Bruno Deferrari:
Re: [ANN] Shen-C Antti Ylikoski 19/04/17 08:57

BTW approximately when is the ShenOS 20 going to be published?

I have been attempting to create some nontrivial software with Shen;
everything else is good, but it looks to me like, the OS Shen v.19 has
some problems with the type checking functionality, ie. with proving
complex expressions containing pairs (ie such as (list (number *
string))) etc. in particular when a complex pairs containing
expression is in a function, on the left hand side of the "->".

(Or, I have succeeded in making some marvelously complicated errors!)

I have communicated on this with the important gurus.

yours, AJY
Re: [ANN] Shen-C Bruno Deferrari 19/04/17 12:47
On Wed, Apr 19, 2017 at 11:34 AM, Antti Ylikoski <antti.y...@gmail.com> wrote:

BTW approximately when is the ShenOS 20 going to be published?


Soon, probably this weekend. There isn't really much left to do, it is just that I don't have much time on weekdays to dedicate to it.

I would have made a release before but Robert found some issues on Windows when compiling with newer versions of SBCL.

The Shen 19.2 binaries were compiled with 1.1.1, but since 1.1.2 SBCL switched to multibyte IO encodings on the Windows console, which broke the read-byte based IO Shen uses. Between a dirty solution in the SBCL port and a cleaner solution that adds some extra support in the kernel I opted for the later (it is done already and it works, and will probably help in the future if anyone wants to add unicode support).
 
I have been attempting to create some nontrivial software with Shen;
everything else is good, but it looks to me like, the OS Shen v.19 has
some problems with the type checking functionality, ie. with proving
complex expressions containing pairs (ie such as (list (number *
string))) etc. in particular when a complex pairs containing
expression is in a function, on the left hand side of the "->".


If it is indeed a bug (can't tell without seeing the program and error), can you create an entry on the issue tracker on Github?




--
BD
Re: [ANN] Shen-C Antti Ylikoski 20/04/17 07:48
I made an issue entry in the Github.  It is important for me, since the general N-level ANN backpropagation program depends on this point, as well as the resolution algorithm that I'm in the process of making.

My nick in the Github is bluejay77.

yours, AJY

PS.  I could use two member lists instead of pairs: [ 1.33 "S" ] pro (@p 1.33 "S") but I won't -- it were a weak escape, and against Dr Tarver's paradigm.
Re: [ANN] Shen-C Tatsuya Tsuda 24/04/17 04:23
Did some C-level CPU profiling and performance optimizations for Shen-C.

On my 2011 MacBook Air, the initial results were like
-Start up time: 26 secs
-Test suite: 646 secs

and now
-Start up time: 1.8 secs
-Test suite: 190 secs

Here are some info of what I have done and noticed.

First did some cpu profiling using gperftools and Apple Instruments.
Although without Shen-level profiling it might be difficult to understand the system function bottleneck, the profiling result showed that the Big3 (lookup_environment, eval_kl_object, strcmp) functions were taking large amount of CPU time and needed serious performance optimizations.
Other than that, there were many frequently used small functions that were not inlined and could be optimized.

Next the details of the optimizations and the reduction of time for the test suite.
1. Fixed a memory allocation issue that prevented -foptimize-sibling-calls option to be used => 34 secs
2. Inlined small functions => 125 secs
3.  Overwrite system functions
  -hash => 20 secs
  -boolean? => 3 secs
  -symbol? => 22 secs
  -integer? => 40 secs
  -shen.numbyte? => 5 secs
  -shen.byte->digit => Not noticeable difference
  -not => Not noticeable difference
  -variable? => 94 secs
  -shen.pvar? => 5 secs
4. Reuse boolean objects (previously boolean objects were created every time when used) => 16 secs
5. Simplified branching in eval_kl_object function => 3 secs
6. Intern symbols that reduces use of strcmp, including reduction of global environment lookups => 89 secs

Thanks to Bruno, next will be to update to Shen 20.0 and see how it improves.


2017年4月19日水曜日 0時11分26秒 UTC+9 Tatsuya Tsuda:
Re: [ANN] Shen-C Tatsuya Tsuda 09/05/17 20:05
Still as an interpreter, on my 2.13 GHz Core 2 Duo, 4 GB RAM, SSD, MacBook Air, now the test suite for Shen-C finishes in 43-47 secs.
I have done various performance updates (mainly overwrites) after updating Shen to 20.0 and would like to share some data.

Before the updates, the test suite finished in 133 secs.
Below are the details and the reduction of time for the test suite.

1. Removed user function checking in symbol function application => 6 secs
2. Disabled -f-optimize-sibling-calls compiler option that caused a weird issue => -8 secs
3. Overwrite system functions
  -value/or => No noticeable difference
  -<-address/or => 1 sec
  -read-file-as-charlist => 2 secs
  -map => 2 secs
  -reverse => No noticeable difference
  -element? => 3 secs
  -shen.valvector => 3 secs
  -shen.lazyderef => 21 secs
  -shen.compose => 3 secs
  -shen.deref => 4 secs
  -shen.occurs? => 7 secs
  -assoc => 9 secs
  -shen.bindv => 4 secs
  -shen.unbindv => 2 secs
  -bind => 3 secs
  -shen.lzy=! => 3 secs
  -shen.ebr => 6 secs
4. Flatten multiple cons calls and convert to (mcons a b c ...) => 3 secs
5. Fix partial application issue => -2 secs
6. Optimize function for getting the size of a list => 1 sec
7. Convert multiple hd and tl calls to (nth-hd 3 a) and (nth-tl 3 b) => No noticeable difference
8. Overwrite system functions
  -shen.hdtl => No noticeable difference
  -shen.mk-pvar => 1 sec
  -shen.copy-vector => 7 secs
  -shen.resize-vector => No noticeable difference
  -shen.resizeprocessvector => No noticeable difference
  -shen.newpv => 5 secs 


Tatsuya Tsuda

2017年4月24日月曜日 20時23分46秒 UTC+9 Tatsuya Tsuda:
Re: [ANN] Shen-C Bruno Deferrari 09/05/17 20:18
Congratulations, those numbers are quite good. I just tested it on my computer and the test suite ran in 31.8 seconds, which is almost the same amount of time it takes for shen-scheme to run the test suite when using chibi-scheme as the underlying platform.

chibi-scheme may not be a particularly fast implementation of Scheme, but it is not terribly slow either, and the VM interprets bytecode instead of interpreting a syntax tree which puts it in a big advantage compared to the current implementation of Shen-C.


Re: [ANN] Shen-C Tatsuya Tsuda 09/05/17 23:46
Wow thanks for the info!

2017年5月10日水曜日 12時18分32秒 UTC+9 Bruno Deferrari:
Re: [ANN] Shen-C Mark Tarver 12/05/17 15:00
I've put this up under the OS downloads.

Congratulations again.

Mark


On Tuesday, April 18, 2017 at 6:16:19 AM UTC+1, Tatsuya Tsuda wrote:
Re: [ANN] Shen-C Tatsuya Tsuda 12/05/17 16:42
Thank you very much!

Tatsuya

2017年5月13日土曜日 7時00分17秒 UTC+9 Mark Tarver: