[ANN] Shen-C

570 views
Skip to first unread message

Tatsuya Tsuda

unread,
Apr 18, 2017, 1:16:19 AM4/18/17
to Shen
I have been developing a C port of Shen and finally passed all 129 tests of Shen 19.3.1.

-Shen-C

-Test results

Some details of the implementation:
-Implemented in C99 as an Interpreter
-Tested on macOS and Clang (GCC also should work)
-Using Boehm GC
-Implemented TCO for self recursion by transforming it into a Clojure like loop-recur expression 
-Compiled with -O3 compiler option but the implementation is not optimized yet 
(the standard test takes about 646 secs on my 2011 MacBook Air)

I have been following the recent discussion about the performance optimizations, so the next step is to do it :)

Tatsuya Tsuda

Mark Tarver

unread,
Apr 18, 2017, 3:05:32 AM4/18/17
to Shen
My heartiest congratulations from all of us.

Mark :)

Tatsuya Tsuda

unread,
Apr 18, 2017, 8:10:04 AM4/18/17
to Shen
Thank you so much and for all of your great work!

Tatsuya Tsuda

2017年4月18日火曜日 16時05分32秒 UTC+9 Mark Tarver:

Bruno Deferrari

unread,
Apr 18, 2017, 8:31:46 AM4/18/17
to qil...@googlegroups.com
Great work, congratulations!

I added a section on the wiki about performance optimizations, but right now it is very incomplete, what it covers is mostly stuff that you can do in the soon to be released ShenOS 20, which I don't think I will be useful for you at this point where your biggest performance gains will come from making your interpreter evaluator faster.


Have you already decided what to do first in terms of performance optimizations?



--
You received this message because you are subscribed to the Google Groups "Shen" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qilang+unsubscribe@googlegroups.com.
To post to this group, send email to qil...@googlegroups.com.
Visit this group at https://groups.google.com/group/qilang.
For more options, visit https://groups.google.com/d/optout.



--
BD

Tatsuya Tsuda

unread,
Apr 18, 2017, 9:50:58 AM4/18/17
to Shen
Thank you!

Peephole optimization as you suggested before should be a good candidate.

Other than that I'd like to do some profiling to understand the bottleneck.

Tatsuya Tsuda

2017年4月18日火曜日 21時31分46秒 UTC+9 Bruno Deferrari:
Great work, congratulations!

I added a section on the wiki about performance optimizations, but right now it is very incomplete, what it covers is mostly stuff that you can do in the soon to be released ShenOS 20, which I don't think I will be useful for you at this point where your biggest performance gains will come from making your interpreter evaluator faster.


Have you already decided what to do first in terms of performance optimizations?


On Tue, Apr 18, 2017 at 2:02 AM, Tatsuya Tsuda <ota...@gmail.com> wrote:
I have been developing a C port of Shen and finally passed all 129 tests of Shen 19.3.1.

-Shen-C

-Test results

Some details of the implementation:
-Implemented in C99 as an Interpreter
-Tested on macOS and Clang (GCC also should work)
-Using Boehm GC
-Implemented TCO for self recursion by transforming it into a Clojure like loop-recur expression 
-Compiled with -O3 compiler option but the implementation is not optimized yet 
(the standard test takes about 646 secs on my 2011 MacBook Air)

I have been following the recent discussion about the performance optimizations, so the next step is to do it :)

Tatsuya Tsuda

--
You received this message because you are subscribed to the Google Groups "Shen" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qilang+un...@googlegroups.com.

To post to this group, send email to qil...@googlegroups.com.
Visit this group at https://groups.google.com/group/qilang.
For more options, visit https://groups.google.com/d/optout.



--
BD

Bruno Deferrari

unread,
Apr 18, 2017, 10:03:57 AM4/18/17
to qil...@googlegroups.com
On Tue, Apr 18, 2017 at 10:49 AM, Tatsuya Tsuda <ota...@gmail.com> wrote:
Thank you!

Peephole optimization as you suggested before should be a good candidate.

Other than that I'd like to do some profiling to understand the bottleneck.


Yes, if you can profile it, that is going to be your best guide.

I gave a quick read to the code, and something I noticed is that symbol's are not interned, which means you always have to compare them with strcmp. If you add symbol interning so that each symbol is unique you will be able to use pointer comparison which is much faster (and also save memory), and being that the interpreter has to compare symbols a lot I would expect a big payoff. Doing this should not require any big changes in the structure of your interpreter.

The next thing is that there are a lot of dynamic lookups happening that could be avoided. Once your symbols are interned, you can make it so that the symbol object contains a pointer to the function defined with that name (if any), that way, upon function application your interpreter just has to extract the value from the immediate symbol object instead of having to do a dynamic lookup. This also, is a change that will not require big changes to the way the interpreter is structured.

Most of what was discussed with Robert for the F# implementation probably applies to your interpreter too (I think you said you already read that, but mentioning it again just in case).

 
To unsubscribe from this group and stop receiving emails from it, send an email to qilang+unsubscribe@googlegroups.com.

To post to this group, send email to qil...@googlegroups.com.
Visit this group at https://groups.google.com/group/qilang.
For more options, visit https://groups.google.com/d/optout.



--
BD

Tatsuya Tsuda

unread,
Apr 18, 2017, 11:11:26 AM4/18/17
to Shen
Thank you for the suggestions.
As you know, currently symbols are just a string and adding a new struct Symbol which contains a name and a pointer to a function or variable value seems to be a good solution.
Yes, I have read the discussion of the F# implementation and would like to take some ideas from it.

Tatsuya Tsuda

2017年4月18日火曜日 23時03分57秒 UTC+9 Bruno Deferrari:

Antti Ylikoski

unread,
Apr 19, 2017, 11:57:37 AM4/19/17
to Shen

BTW approximately when is the ShenOS 20 going to be published?

I have been attempting to create some nontrivial software with Shen;
everything else is good, but it looks to me like, the OS Shen v.19 has
some problems with the type checking functionality, ie. with proving
complex expressions containing pairs (ie such as (list (number *
string))) etc. in particular when a complex pairs containing
expression is in a function, on the left hand side of the "->".

(Or, I have succeeded in making some marvelously complicated errors!)

I have communicated on this with the important gurus.

yours, AJY



tiistai 18. huhtikuuta 2017 15.31.46 UTC+3 Bruno Deferrari kirjoitti:
Great work, congratulations!

I added a section on the wiki about performance optimizations, but right now it is very incomplete, what it covers is mostly stuff that you can do in the soon to be released ShenOS 20, which I don't think I will be useful for you at this point where your biggest performance gains will come from making your interpreter evaluator faster.


Have you already decided what to do first in terms of performance optimizations?


On Tue, Apr 18, 2017 at 2:02 AM, Tatsuya Tsuda <ota...@gmail.com> wrote:
I have been developing a C port of Shen and finally passed all 129 tests of Shen 19.3.1.

-Shen-C

-Test results

Some details of the implementation:
-Implemented in C99 as an Interpreter
-Tested on macOS and Clang (GCC also should work)
-Using Boehm GC
-Implemented TCO for self recursion by transforming it into a Clojure like loop-recur expression 
-Compiled with -O3 compiler option but the implementation is not optimized yet 
(the standard test takes about 646 secs on my 2011 MacBook Air)

I have been following the recent discussion about the performance optimizations, so the next step is to do it :)

Tatsuya Tsuda

--
You received this message because you are subscribed to the Google Groups "Shen" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qilang+un...@googlegroups.com.

To post to this group, send email to qil...@googlegroups.com.
Visit this group at https://groups.google.com/group/qilang.
For more options, visit https://groups.google.com/d/optout.



--
BD

Bruno Deferrari

unread,
Apr 19, 2017, 3:47:41 PM4/19/17
to qil...@googlegroups.com
On Wed, Apr 19, 2017 at 11:34 AM, Antti Ylikoski <antti.y...@gmail.com> wrote:

BTW approximately when is the ShenOS 20 going to be published?


Soon, probably this weekend. There isn't really much left to do, it is just that I don't have much time on weekdays to dedicate to it.

I would have made a release before but Robert found some issues on Windows when compiling with newer versions of SBCL.

The Shen 19.2 binaries were compiled with 1.1.1, but since 1.1.2 SBCL switched to multibyte IO encodings on the Windows console, which broke the read-byte based IO Shen uses. Between a dirty solution in the SBCL port and a cleaner solution that adds some extra support in the kernel I opted for the later (it is done already and it works, and will probably help in the future if anyone wants to add unicode support).
 
I have been attempting to create some nontrivial software with Shen;
everything else is good, but it looks to me like, the OS Shen v.19 has
some problems with the type checking functionality, ie. with proving
complex expressions containing pairs (ie such as (list (number *
string))) etc. in particular when a complex pairs containing
expression is in a function, on the left hand side of the "->".


If it is indeed a bug (can't tell without seeing the program and error), can you create an entry on the issue tracker on Github?

 
To unsubscribe from this group and stop receiving emails from it, send an email to qilang+unsubscribe@googlegroups.com.

To post to this group, send email to qil...@googlegroups.com.
Visit this group at https://groups.google.com/group/qilang.
For more options, visit https://groups.google.com/d/optout.



--
BD

Antti Ylikoski

unread,
Apr 20, 2017, 10:48:46 AM4/20/17
to Shen
I made an issue entry in the Github.  It is important for me, since the general N-level ANN backpropagation program depends on this point, as well as the resolution algorithm that I'm in the process of making.

My nick in the Github is bluejay77.

yours, AJY

PS.  I could use two member lists instead of pairs: [ 1.33 "S" ] pro (@p 1.33 "S") but I won't -- it were a weak escape, and against Dr Tarver's paradigm.

Tatsuya Tsuda

unread,
Apr 24, 2017, 7:23:46 AM4/24/17
to Shen
Did some C-level CPU profiling and performance optimizations for Shen-C.

On my 2011 MacBook Air, the initial results were like
-Start up time: 26 secs
-Test suite: 646 secs

and now
-Start up time: 1.8 secs
-Test suite: 190 secs

Here are some info of what I have done and noticed.

First did some cpu profiling using gperftools and Apple Instruments.
Although without Shen-level profiling it might be difficult to understand the system function bottleneck, the profiling result showed that the Big3 (lookup_environment, eval_kl_object, strcmp) functions were taking large amount of CPU time and needed serious performance optimizations.
Other than that, there were many frequently used small functions that were not inlined and could be optimized.

Next the details of the optimizations and the reduction of time for the test suite.
1. Fixed a memory allocation issue that prevented -foptimize-sibling-calls option to be used => 34 secs
2. Inlined small functions => 125 secs
3.  Overwrite system functions
  -hash => 20 secs
  -boolean? => 3 secs
  -symbol? => 22 secs
  -integer? => 40 secs
  -shen.numbyte? => 5 secs
  -shen.byte->digit => Not noticeable difference
  -not => Not noticeable difference
  -variable? => 94 secs
  -shen.pvar? => 5 secs
4. Reuse boolean objects (previously boolean objects were created every time when used) => 16 secs
5. Simplified branching in eval_kl_object function => 3 secs
6. Intern symbols that reduces use of strcmp, including reduction of global environment lookups => 89 secs

Thanks to Bruno, next will be to update to Shen 20.0 and see how it improves.


2017年4月19日水曜日 0時11分26秒 UTC+9 Tatsuya Tsuda:

Tatsuya Tsuda

unread,
May 9, 2017, 11:05:36 PM5/9/17
to Shen
Still as an interpreter, on my 2.13 GHz Core 2 Duo, 4 GB RAM, SSD, MacBook Air, now the test suite for Shen-C finishes in 43-47 secs.
I have done various performance updates (mainly overwrites) after updating Shen to 20.0 and would like to share some data.

Before the updates, the test suite finished in 133 secs.
Below are the details and the reduction of time for the test suite.

1. Removed user function checking in symbol function application => 6 secs
2. Disabled -f-optimize-sibling-calls compiler option that caused a weird issue => -8 secs
3. Overwrite system functions
  -value/or => No noticeable difference
  -<-address/or => 1 sec
  -read-file-as-charlist => 2 secs
  -map => 2 secs
  -reverse => No noticeable difference
  -element? => 3 secs
  -shen.valvector => 3 secs
  -shen.lazyderef => 21 secs
  -shen.compose => 3 secs
  -shen.deref => 4 secs
  -shen.occurs? => 7 secs
  -assoc => 9 secs
  -shen.bindv => 4 secs
  -shen.unbindv => 2 secs
  -bind => 3 secs
  -shen.lzy=! => 3 secs
  -shen.ebr => 6 secs
4. Flatten multiple cons calls and convert to (mcons a b c ...) => 3 secs
5. Fix partial application issue => -2 secs
6. Optimize function for getting the size of a list => 1 sec
7. Convert multiple hd and tl calls to (nth-hd 3 a) and (nth-tl 3 b) => No noticeable difference
8. Overwrite system functions
  -shen.hdtl => No noticeable difference
  -shen.mk-pvar => 1 sec
  -shen.copy-vector => 7 secs
  -shen.resize-vector => No noticeable difference
  -shen.resizeprocessvector => No noticeable difference
  -shen.newpv => 5 secs 


Tatsuya Tsuda

2017年4月24日月曜日 20時23分46秒 UTC+9 Tatsuya Tsuda:

Bruno Deferrari

unread,
May 9, 2017, 11:18:32 PM5/9/17
to qil...@googlegroups.com
Congratulations, those numbers are quite good. I just tested it on my computer and the test suite ran in 31.8 seconds, which is almost the same amount of time it takes for shen-scheme to run the test suite when using chibi-scheme as the underlying platform.

chibi-scheme may not be a particularly fast implementation of Scheme, but it is not terribly slow either, and the VM interprets bytecode instead of interpreting a syntax tree which puts it in a big advantage compared to the current implementation of Shen-C.



To unsubscribe from this group and stop receiving emails from it, send an email to qilang+unsubscribe@googlegroups.com.

To post to this group, send email to qil...@googlegroups.com.
Visit this group at https://groups.google.com/group/qilang.
For more options, visit https://groups.google.com/d/optout.



--
BD

Tatsuya Tsuda

unread,
May 10, 2017, 2:46:18 AM5/10/17
to Shen
Wow thanks for the info!

2017年5月10日水曜日 12時18分32秒 UTC+9 Bruno Deferrari:

Mark Tarver

unread,
May 12, 2017, 6:00:17 PM5/12/17
to Shen
I've put this up under the OS downloads.

Congratulations again.

Mark


On Tuesday, April 18, 2017 at 6:16:19 AM UTC+1, Tatsuya Tsuda wrote:

Tatsuya Tsuda

unread,
May 12, 2017, 7:42:09 PM5/12/17
to Shen
Thank you very much!

Tatsuya

2017年5月13日土曜日 7時00分17秒 UTC+9 Mark Tarver:

Frederick Isaac

unread,
May 16, 2019, 8:06:29 PM5/16/19
to Shen
Hi

I tried this on an older gcc 4.1 and a few problems popped up.

I also have it compiled on a Silicon Graphics Irix with Mipspro 7.4.4 c99 compiler - which is a real blast from the past

A few major things jump out

1 - inttypes.h and stdint.h seem to get multiple definitions for UINT_MAX etc. - removing inttypes.h seems to fix it.
2 - the sigsetjmp is used to set a jmp_buf type and on some platforms it (I think maybe most) should use sigjmp_buf (this was nasty to find debug but obvious once I found it) otherwise it causes chaos.
3 - the whole inline in header files is a mess for porting - not because of your code, just older compilers really hate it :)
4 - on the mipspro it needs to be a static inline, whereas on the gcc 4.1 (which had the bad c99/gnu99 inline semantics)  it needs to use static inline and the extern inline. I ended up defining macros for INLINE and EXTERN for different platforms but that means I have to replace all the inlines/externs with each release :)
5 - couple of places you have c == -1 and c == EOF but EOF it an in and c is a char so it produces correct code if c == (int) EOF
6 - older compilers do not like the printf("{0x016" PRIxPTR "}") style so I have to use 0x016p\" PRIxPTR\" - not sure if that is correct or not never used PRIxPTR
7 - maybe a few other things I have forgotten

Other than that it boots up and seems to respond to most things. Need to run the tests though and see how they go through.

I can't get the code to you as of yet because I am still trying to get a working Boehm GC for the iris platform (had to stub it out so it currently never releases memory - it's ok though I have 8gb of ram so I can last for a while)

One I get a working boehmite gc I will create a branch and let you see it.

Other than those few things - sweet job writing this one :)

Cheers

Tatsuya Tsuda

unread,
May 17, 2019, 1:23:39 AM5/17/19
to Shen
Thanks for the detailed report! Making C code portable is a tired work :(
I haven't much time now for digging deep, but after I take a look at your code I would like to consider fixing my code.
BTW, thanks for the kind words :)

Tatsuya

2019年5月17日金曜日 9時06分29秒 UTC+9 Frederick Isaac:

Frederick Isaac

unread,
Oct 22, 2019, 5:27:58 PM10/22/19
to qil...@googlegroups.com

sorry for the long delay - work/life etc.

created a branch which does compile on sgi and linux - fixed some of the extern/inline issues and also the jmp_buf -> sigjmp_buf

can see it here 


its in the linux-sgi branch

thoughts ?

--
You received this message because you are subscribed to the Google Groups "Shen" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qilang+un...@googlegroups.com.
To post to this group, send email to qil...@googlegroups.com.
Visit this group at https://groups.google.com/group/qilang.

Tatsuya Tsuda

unread,
Oct 25, 2019, 9:45:06 PM10/25/19
to Shen
Thanks for that!
I read your code and since it is quite large, I'm not sure I can fix this right away without compatibility issues with other platforms.
I would like to consider this as a future work.

Tatsuya

2019年10月23日水曜日 6時27分58秒 UTC+9 Frederick Isaac:
To unsubscribe from this group and stop receiving emails from it, send an email to qil...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages