Looking for article "A Scheme for native threads" by Kent Dybvig and some questions

Sidharth Kshatriya

unread,

Oct 2, 2017, 4:35:44 AM10/2/17

to chez-scheme

Hi All,

I looking for further examples and descriptions of the Chez Scheme threading system (I've already read the Threading system section of the user guide).

In that regard I understand that the paper/article "A Scheme for native threads" might be a good place to look but I'm not able to find it on the internet? If its useful and freely shareable, can someone post a copy?

I also have a few questions (sorry for this long list, please answer as many or as few as you have time!):

Is the final code generation in Chez scheme 9.4 still loosely based on the paper "Destination-Driven Code Generation" by Kent Dybvig et. all? In other words, does the nanopass framework use this technique in its final passes?
Is the garbage collection in Chez scheme 9.4 still based on the paper "Don’t Stop the BIBOP: Flexible and Efficient Storage Management for Dynamically Typed Languages" ?

If so, I was not able to understand something: the description is of a generational garbage collection in the paper which does copying from segment for generation n (for a particular metatype) to generation n+1. What about the segment for the oldest/max generation ? Does copying collection take place on that also or do we have mark and sweep on that generation?

Is the garbage collection with multiple threads truly concurrent or is there some sort of global lock?
What is a good way to see the internals of the Chez scheme compiler in action given that its a mix of c runtime code and scheme? Is there are setup in which I can "step through" the code in a scheme+gdb debugger combo?
What does it mean for Chez scheme to be an incremental compiler? What can we do practically speaking with an incremental compiler?
Given that we can call the scheme compiler from the code while a scheme program is running is the compiler fast enough to serve as JIT compiler?
Lastly, is Chez Scheme suited to be a good intermediate language? It seems very fast so I'm wondering if it would make sense for language front ends to simply generate chez scheme code that is then compiled to native code. The advantage i see is that we get garbage collection for free on a very expressive IR!
Does the keyword "pariah" to mark branches that are unlikely to be taken helpful in performance? Are there ways to speed up huge cond statements that are likely to arise in interpreters that are implemented in Chez?

Thanks!

Sidharth

Sidharth Kshatriya

unread,

Oct 6, 2017, 12:32:51 AM10/6/17

to chez-scheme

Seems like I scared you folks with this long laundry list of questions!!

I'm really excited about Chez Scheme -- Any reply, howsoever brief will be helpful :-) !

Thanks!

Sidharth

--
You received this message because you are subscribed to a topic in the Google Groups "chez-scheme" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/chez-scheme/UE-R9a3RuXc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to chez-scheme+unsubscribe@googlegroups.com.
To post to this group, send email to chez-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chez-scheme/c55ac57f-c095-43dc-9227-8c5c4a745193%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andrei Formiga

unread,

Oct 6, 2017, 2:35:25 PM10/6/17

to chez-scheme

Regarding chez as an "intermediate" language, we know that Racket will use Chez as a kind of "backend" in the future. See https://groups.google.com/forum/#!msg/racket-dev/2BV3ElyfF8Y/4RSd3XbECAAJ

On Monday, October 2, 2017 at 5:35:44 AM UTC-3, Sidharth Kshatriya wrote:

Lastly, is Chez Scheme suited to be a good intermediate language? It seems very fast so I'm wondering if it would make sense for language front ends to simply generate chez scheme code that is then compiled to native code. The advantage i see is that we get garbage collection for free on a very expressive IR!

Thanks!

Sidharth

Sidharth Kshatriya

unread,

Oct 6, 2017, 5:17:48 PM10/6/17

to Andrei Formiga, chez-scheme

Yes, I did read about that -- the proposal is that racket 7 will have a Chez scheme core. However, Racket and Chez are quite similar. I was thinking whether it would be a good idea to have scripted language like Python/PHP/etc. be translated via a transpiler (written in the nanopass framework) to scheme source and then compiled by chez (rather than write a hugely complex JIT compiler like, say, PyPy).

Is this even a good idea? Or is it a non-starter for various reasons?

--

You received this message because you are subscribed to a topic in the Google Groups "chez-scheme" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/chez-scheme/UE-R9a3RuXc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to chez-scheme+unsubscribe@googlegroups.com.
To post to this group, send email to chez-...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msgid/chez-scheme/55b0c832-f225-4e3b-b351-aec6009757a8%40googlegroups.com.

Andy Keep

unread,

Oct 7, 2017, 3:25:16 PM10/7/17

to chez-scheme, Sidharth Kshatriya

Hey Sidharth,

Answers to questions inline below...

On October 2, 2017 at 4:35:46 AM, Sidharth Kshatriya (sid.ks...@gmail.com) wrote:

Hi All,

I looking for further examples and descriptions of the Chez Scheme threading system (I've already read the Threading system section of the user guide).

The threading system is implemented using pthreads (and is only available in the threaded version of Chez). Not all all of pthreads is available (for instance we do not have a thread_kill exposed in Chez). There are also thread-local parameters for state that needs to be held on to on a per-thread basis.

In that regard I understand that the paper/article "A Scheme for native threads" might be a good place to look but I'm not able to find it on the internet? If its useful and freely shareable, can someone post a copy?

I sent Kent a request for this, so we’ll see what he says.

I also have a few questions (sorry for this long list, please answer as many or as few as you have time!):
Is the final code generation in Chez scheme 9.4 still loosely based on the paper "Destination-Driven Code Generation" by Kent Dybvig et. all? In other words, does the nanopass framework use this technique in its final passes?

No, when we rewrote the compiler we moved to using a graph coloring register allocator and along with this there were some significant changes to the compiler backend. Rather than try to go through all of the pieces of this, have a look at section 3.3.3 of my dissertation: http://www.andykeep.com/pubs/dissertation.pdf It is just a couple of double-spaced pages that sums up the basics of how this changed.

Is the garbage collection in Chez scheme 9.4 still based on the paper "Don’t Stop the BIBOP: Flexible and Efficient Storage Management for Dynamically Typed Languages" ?

Yes, we’ve still not stopped the BiBOP :) The current system has new, impure, symbol, port, weak pair, ephemeron, pure, continuation, code, pure-typed-object, impure-record, and data spaces. Different types end up in different places based on some combination of type and description. Records for instance, might end up in pure, impure, or impure-record space based on the mutability and type of fields in the record.

If so, I was not able to understand something: the description is of a generational garbage collection in the paper which does copying from segment for generation n (for a particular metatype) to generation n+1. What about the segment for the oldest/max generation ? Does copying collection take place on that also or do we have mark and sweep on that generation?

Copying takes place at all levels. The maximum generation is collected back into the maximum generation. During collection segments being copied from are marked as “old space” and segments being copied to are marked as “new space”, so even during maximum generation collection there is no danger of collecting an item more than once.

Is the garbage collection with multiple threads truly concurrent or is there some sort of global lock?

No, there is not concurrency when the collector is running. It is a stop and copy collector. While some systems collect when more allocation space is needed, collection is asynchronous with allocation in Chez. Essentially what happens is after a certain amount of allocation happens, a request for collection is made. As each thread in the system gets to an event check, it sleeps waiting for the other threads to arrive there. Once all threads hit their event check, the garbage collector runs, and after completing the threads are released again. How much allocation happens before the collection event is raised is configurable through collect-trip-bytes.

What is a good way to see the internals of the Chez scheme compiler in action given that its a mix of c runtime code and scheme? Is there are setup in which I can "step through" the code in a scheme+gdb debugger combo?

You can trace using gdb, however Chez Scheme does not produce debugging information in the compiler output and does not follow the calling conventions of the machine ABI, so the support for this is fairly barebones. The one bridge we do provide is s_breakhere. From inside gdb you can mark a break point on s_breakhere, and then from inside the reply if you call #%$breakhere with a scheme object (such as a function) the pointer to the argument of s_breakhere will be that scheme object. Another handy tool is S_prin1. You can call this function from within gdb and pass it a scheme object pointer to print out the pointer.

Another potential option is to use the trace facilities. You can use trace-define, trace-lambda, etc. to trace. It is also possible to trace using the trace function, though R6RS definitions cannot be redefined (which is how the trace function works), so it is of limited use there, since any compiled reference to an R6RS definition will not be traced.

You can also trace the behavior of the garbage collector by setting the collect-notify parameter or even run code around every collection by setting the collect-request-handler.

Finally, you can inspect the current continuation by call inspect on the result of call-cc. This is the same functionality the debugger provides, but you can call it from within a function you are interested in walking through.

What does it mean for Chez scheme to be an incremental compiler? What can we do practically speaking with an incremental compiler?

I’m not really sure what you mean by incremental compiler here. All this really means in the case of Chez Scheme is that if you have a program composed of a number of libraries, only libraries with updated source code (and libraries that depend upon them) will be recompiled. This is just like using make with a C compiler where make only rebuilds files that have changed. This functionality is internal to Chez Scheme though, and you can even use maybe-compile-* to only recompile a library, program, or file, if it or one of its dependencies has changed.

Given that we can call the scheme compiler from the code while a scheme program is running is the compiler fast enough to serve as JIT compiler?

Chez Scheme does compile at the REPL, but in general I would not call it a JIT compiler. It just compiles everything the same way regardless of if you are compiling it ahead of time or while you are running in an interactive session. Other than a few corner cases the compiler is fast which is why you don’t generally notice a slow down at the REPL. There are also certain cases where we cheat at the REPL and call the interpreter. This happens for very simple expressions after a quick check. This is mostly just to avoid the overhead of compilation when you are running something like: (+ 4 5). You can disable this with the compile-interpret-simple parameter.

Lastly, is Chez Scheme suited to be a good intermediate language? It seems very fast so I'm wondering if it would make sense for language front ends to simply generate chez scheme code that is then compiled to native code. The advantage i see is that we get garbage collection for free on a very expressive IR!

This is sort of a philosophical question. Is C a good target for compilation? Certainly many people use C as a compilation target, because it is lower level than the language they are writing a compiler for, but it frees them from having to write a register allocator and provides function and expression abstraction which are generally an easier to work with target than assembly or machine code. That said, we don’t target C from Chez Scheme, because we don’t want to be restricted by its calling conventions and features like call/cc are not necessarily an easy fit in C, though these things can be mitigated or overcome if you really want to and there are other Scheme implementation that do. Generally, I would say if you feel like the language and runtime features provided by your target jive well with the language that you are trying to target there, it is a good fit, and I can certainly imagine targets that would work well there.

All of that said, I do think Scheme, in general, is a great language for experimenting with languages. If your language is an S-expression language you can get pretty far just using the macro system. If you need more than that, you can write a parser that produces S-expression output that the compiler can consume. I’ve even worked on systems where I’ve written a compiler in the macro system using the nanopass framework that produces source. (I’ve been playing with a matching macro that does this, mostly just for fun.)

Does the keyword "pariah" to mark branches that are unlikely to be taken helpful in performance? Are there ways to speed up huge cond statements that are likely to arise in interpreters that are implemented in Chez?

The pariah keyword is used to mark “exceptional” code paths. This information is used when we layout code blocks to put the expected code contiguous in the output, so that likely to visit code is all in the icache. For small functions it probably doesn’t make a big difference, but it can make a difference for larger functions. The pariah information is also used to determine variable restoration around non-tail calls. The idea being that variables used in pariah blocks aren’t frequently needed so they generally don’t need to be restored from the frame eagerly.

Separately, Chez Scheme now has profile-driven optimization for cond that allows you to produce profile data during a testing run of a program and feed that back into the compiler to reorder the clauses of the cond automatically so the the more common path is earlier in the cond. This can only happen if the clauses are independent, which can be indicated by the programmer using exclusive-cond.

Thanks!

No problem, good luck with your explorations!

-andy:)

Sidharth
--
You received this message because you are subscribed to the Google Groups "chez-scheme" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chez-scheme...@googlegroups.com.

To post to this group, send email to chez-...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msgid/chez-scheme/c55ac57f-c095-43dc-9227-8c5c4a745193%40googlegroups.com.

Sidharth Kshatriya

unread,

Oct 8, 2017, 2:11:12 AM10/8/17

to Andy Keep, chez-scheme

Thanks for your detailed response!

You can trace using gdb, however Chez Scheme does not produce debugging information in the compiler output and does not follow the calling conventions of the machine ABI, so the support for this is fairly barebones.

How difficult a project will it be to get Chez to generate ELF debugging information so that a chez compilation can be debugged in gdb? I would guess we would need to follow machine ABI also everywhere otherwise gdb would get confused. How difficult or even possible is such a project? If you were to do it, having worked extensively on the compiler how many man-months would it take (off the top of your head)?

Also, will such a project be useful?

Thanks,

Sidharth

Sidharth

To unsubscribe from this group and stop receiving emails from it, send an email to chez-scheme+unsubscribe@googlegroups.com.
To post to this group, send email to chez-scheme@googlegroups.com.

Andy Keep

unread,

Oct 8, 2017, 3:06:01 PM10/8/17

to Sidharth Kshatriya, chez-scheme

On October 8, 2017 at 2:11:11 AM, Sidharth Kshatriya (sid.ks...@gmail.com) wrote:

Thanks for your detailed response!

You can trace using gdb, however Chez Scheme does not produce debugging information in the compiler output and does not follow the calling conventions of the machine ABI, so the support for this is fairly barebones.

How difficult a project will it be to get Chez to generate ELF debugging information so that a chez compilation can be debugged in gdb? I would guess we would need to follow machine ABI also everywhere otherwise gdb would get confused. How difficult or even possible is such a project? If you were to do it, having worked extensively on the compiler how many man-months would it take (off the top of your head)?

So, it isn’t really ELF and the OS specified ABI that you want, it is actually DWARF support. DWARF is the system that allows gdb to understand what is going on in the program it is debugging and relate it to the original source. I don’t know much about DWARF or how gdb interacts with it, so I don’t really know how much effort this would take. There are some challenges here too though. I don’t know how gdb relates DWARF information to what is happening in the virtual memory of the machine being run, however, it is worth noting that the garbage collector may copy (and collect) code objects associated with procedures loaded from the scheme object file. It might be interesting to see if python, ruby, or node js provides this kind of support, because they may have needed to solve some of the same kind of problems. I’ve not used gdb support with any of these tools, do you have a feel for if this exists?

I’m sure it is possible to use the ABI specified calling conventions, but I’m not sure it is possible to do this without impacting performance (I’d have to give it a bit more thought), and if the only advantage is gdb support, it probably isn’t worth it. Part of the challenge is that we maintain a segmented stack, which supports call/cc, threading, and deep recursion stacks. This is possible, but tricky using the standard ABI calling convention (see Rust’s experience with this). Chez Scheme also maintains the architectural stack pointer register pointing at the C stack, which it uses for instances where it needs to call into C, either when interacting with the run time or for the FFI.

Also, will such a project be useful?

Hard to answer that question in a non-biased way. To me, it is probably not that useful. I don’t use gdb all that much (especially since I’m generally on a mac where gdb has mostly been replaced by lldb). I never use gdb (or lldb) when I am just developing scheme code, but it can be useful when making changes in how the C run time and Chez Scheme compiler produced code interact, or in cases where the FFI is especially complicated and something goes wrong.

-andy:)

Андрей Алпеев

unread,

Aug 29, 2018, 12:13:19 AM8/29/18

to chez-scheme

Is there any way to get article "A scheme for native threads"? I've only managed to find slides (and that was with the help of the internet time machine =).

Kind regrads

-Andrei

понедельник, 2 октября 2017 г., 11:35:44 UTC+3 пользователь Sidharth Kshatriya написал:

Reply all

Reply to author

Forward