Structured Concurrency in Racket

jab

Oct 7, 2019, 1:46:38 PM10/7/19
to Racket Users
Coming across https://trio.discourse.group/t/structured-concurrency-and-delimited-continuations/ just provoked me to search for discussion of structured concurrency in Racket. I didn’t immediately find much.* I hope that doesn’t mean that the interesting work being discussed over in https://trio.discourse.group/c/structured-concurrency etc. has gone largely unnoticed by the Racket community. Trio is having a profound impact on the future of concurrency, not just in Python but in many other languages. There’s even a Wikipedia article now: https://en.wikipedia.org/wiki/Structured_concurrency

(For anyone new to the term, https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/ might be the best starting point. One persuasive example shows Nathaniel live coding a correct implementation of RFC 6555 “Happy Eyeballs” in 40 lines of Python: https://github.com/python-trio/trio-talks/tree/master/njsmith-async-concurrency-for-mere-mortals (for comparison, Twisted’s implementation took hundreds of lines and still had a logic bug after years of work). There is also some related reading in https://github.com/python-trio/trio/wiki/Reading-list.)

I hope this post provokes discussion of structured concurrency in Racket. It’d be fascinating to read more of your thoughts!

Thanks,
Josh

* For example, Googling “structured concurrency racket” turned up mostly just a brief mention of Racket’s custodians in the bottom-most comment on this post: http://250bpm.com/blog:71

Matt Jadud

Oct 7, 2019, 2:28:26 PM10/7/19
to jab, Racket Users
Hi Josh,

Racket has a number of powerful concurrency libraries/extensions that handle both concurrent execution of code in a single process and parallel execution across multiple processes/hosts. There is the "futures" library, which might be the most similar to Trio.


However, there are libraries that provide a larger lift, and are grounded in language traditions that tie back to algebras for reasoning about parallel systems. The channel library, and the code that builds on it ("places"), are examples.
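For readers coming from Trio, the flavor of that thread-and-channel style can be seen in a tiny sketch (plain `#lang racket`, nothing beyond stock forms):

```racket
#lang racket
;; A worker thread hands its result back over a synchronous channel;
;; channel-get blocks until the worker has put a value.
(define ch (make-channel))
(thread (lambda () (channel-put ch (* 6 7))))
(define answer (channel-get ch))
answer ; => 42
```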



The history of formal reasoning about, and implementation of, mechanisms for concurrency and parallelism goes back to at least the... 70's?


"Structured concurrency" does not, from the link(s) provided, seem to be anything that stands out from this long tradition. (Or, if you prefer, computer scientists have been reasoning about, building systems around, and integrating into languages notions of parallelism and concurrency for roughly 40+ years.) 

So, personally, I would say that Racket has multiple libraries (which, because of the nature of Racket, are in some cases extensions to the base language) that implement structured concurrency. 

Unless I'm completely misunderstanding your post.

Cheers,
Matt


--
You received this message because you are subscribed to the Google Groups "Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to racket-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/racket-users/bb36e50a-a77b-4c5b-b144-71ce647069b7%40googlegroups.com.

jab

Oct 7, 2019, 2:59:06 PM10/7/19
to Racket Users
Yeah, I’d say give a closer read to https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/

Excerpting a footnote:

> For those who can't possibly pay attention to the text without first knowing whether I'm aware of their favorite paper, my current list of topics to include in my review are: the "parallel composition" operator in Cooperating/Communicating Sequential Processes and Occam, the fork/join model, Erlang supervisors, Martin Sústrik's article on Structured concurrency and work on libdill, and crossbeam::scope / rayon::scope in Rust. [Edit: I've also been pointed to the highly relevant golang.org/x/sync/errgroup and github.com/oklog/run in Golang.]

Arie Schlesinger

Oct 7, 2019, 3:01:50 PM10/7/19
to jab, Racket Users
Can somebody specify how to use racket in jupyter notebook ?
Thanks


George Neuner

Oct 7, 2019, 4:08:08 PM10/7/19
to jab, racket users
"Structured concurrency" as defined by those articles really applies
only to programs that operate in distinct phases - which is a pretty
good description of many science programs - but such programs represent
an (important, but) small subset of all parallel programs.

Racket's APIs are designed to support general parallelism, not just
phased parallelism.

I suppose an argument could be made for syntax explicitly aimed at
phased parallelism ... but using macros, such syntax could be built on
top of the existing parallel (future, thread, place) APIs - so there is
nothing stopping you, or anyone else, from creating a "structured
concurrency" package that everyone can use.
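As one sketch of what such a package might build on (hedged: `call-with-task-scope` is a name invented for this example, not an existing Racket API), custodians already enforce the core structured-concurrency guarantee that no child outlives its scope:

```racket
#lang racket
;; Sketch: run a body under a fresh custodian; every thread it spawns is
;; shut down when the body returns, so no task can escape the scope.
;; `call-with-task-scope` is a hypothetical helper, not a stock Racket API.
(define (call-with-task-scope body)
  (define cust (make-custodian))
  (dynamic-wind
   void
   (lambda () (parameterize ([current-custodian cust]) (body)))
   (lambda () (custodian-shutdown-all cust))))

;; Usage: the child never reaches its (set! ...) - it is killed on scope exit.
(define leaked? #f)
(call-with-task-scope
 (lambda () (thread (lambda () (sleep 1) (set! leaked? #t)))))
(sleep 0.1)
leaked? ; => #f
```

The macro version George mentions would just wrap `call-with-task-scope` around a body expression.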


One problem I foresee is defining a reasonable syntax and semantics.  
Most of the languages that promote "structured" concurrency have C-like
syntax.  Moreover they vary considerably in semantics - in some of them
the spawning thread stops immediately and can't continue until its
children exit, but others allow the spawning thread to continue until it
executes an explicit join / wait for its children.

The Lisp/Scheme based idioms of S-expr Racket are very different from
those of C-like languages, and whether or not some "Racket2"
materializes having a more C-like syntax, many people like Racket just
the way it is.  I can envision heated debates about what should be the
proper "structuring" semantics and what syntax best represents them.

YMMV,
George

Luke Whittlesey

Oct 8, 2019, 10:39:15 AM10/8/19
to George Neuner, jab, racket users
I think the ceu language has a nice model of what I would consider
"structured". http://ceu-lang.org/ It has automatic cancellation and
finalization. Racket can easily support this model. Await statements
are captured through delimited continuations and processes are managed
in a tree. If a parent process exits the children are canceled and
finalized.

George Neuner

Oct 8, 2019, 10:56:15 PM10/8/19
to racket...@googlegroups.com
On Tue, 8 Oct 2019 10:36:42 -0400, Luke Whittlesey
<luke.wh...@gmail.com> wrote:

>I think the ceu language has a nice model of what I would consider
>"structured". http://ceu-lang.org/ It has automatic cancellation and
>finalization. Racket can easily support this model. Await statements
>are captured through delimited continuations and processes are managed
>in a tree. If a parent process exits the children are canceled and
>finalized.

Interesting. It appears - to me, anyway - that Ceu took some
inspiration from Occam, but is implemented with coroutines instead of
threads or light-weight processes. Ceu programs are limited to a
single core/CPU ... so there is no real parallelism possible other
than by a multiple process approach.

Racket's threads have similar single core/CPU limitations, but futures
and places do not - they can run simultaneously on a multicore CPU.
Whatever "structure" model might be adopted for Racket would need to
account for the semantic differences.

George

Zelphir Kaltstahl

Oct 9, 2019, 2:34:35 AM10/9/19
to racket...@googlegroups.com
I don't think places are a good example for good support of parallelism.

It is difficult to get a flexible multiprocessing implementation done
without hard-coding the lambdas that run in each place, because we
cannot send serializable lambdas (which also are not in core, but only
exist in the web server package) over channels. That means one needs to
define one's lambdas ahead of time, before even starting a place and
sending it the data to process, and so we cannot have something like a
process pool.

The other problem is that using places means using multiple Racket VMs,
if I remember correctly, which uses some RAM. Places are not exactly
lightweight, at least.

Racket threads run on a single core, I think.

I know there is a tutorial about using futures somewhere, where it
depends on the number type one uses whether the code can be
automatically run in parallel or not. So there is some issue there too;
at least it did not look to me like one could use futures everywhere
and have neat parallelism.


When I tried to write a process-pool kind of thing, which I called a
work distributor (I had a lot of help from this mailing list!), the
above-mentioned problems with lambdas caused me to abandon the project
until I can send lambdas over channels.

Correct any of the things I wrote above if they are not true, but I
think Racket definitely needs a better multiprocessing story. I would
love to see something like Guile Fibers. Andy Wingo even mentioned in
his video that some of the Racket greats advised him to look at
Concurrent ML, and that that is where he got some of his ideas when
implementing Guile Fibers as a library. Shouldn't Racket then be able to
have a similar library? I don't understand how Fibers really works, but
that is a thought I have had many times since I heard about the Fibers library.

Regards,

Zelphir

Paulo Matos

Oct 9, 2019, 2:53:08 AM10/9/19
to racket...@googlegroups.com


On 07/10/2019 21:01, Arie Schlesinger wrote:
> Can somebody specify how to use racket in jupyter notebook ?
> Thanks
>

There have been earlier threads about that you might want to look at.
https://groups.google.com/d/msg/racket-users/qw7u9pNFbuQ/eot1Acw7DAAJ

I don't know the answer to your question, but you might have better luck
getting an answer if you post it as a separate email instead of a
sub-message to an unrelated thread.

Good luck,
Paulo


David Storrs

Oct 9, 2019, 5:27:08 AM10/9/19
to Zelphir Kaltstahl, Racket Users
Note that it's possible to send S-expressions through a channel and then eval them on the far end. This would let you do something like this:

(hash 'func 'my-predefined-lambda 'args '(arg1 arg2))

Which calls a predefined function, or:

(hash 'install '(lambda (username) (displayln (~a "Hello, " username))) 'name 'greet)

Which defines a new function and installs it for later use under the name "greet".

It's not elegant and it has all the usual problems with eval'ing code, but it's possible.
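A runnable sketch of that pattern (an ordinary Racket channel stands in for a place channel here; both carry plain S-expression data):

```racket
#lang racket
;; Receive an S-expression over a channel and eval it in a fresh namespace.
(define ns (make-base-namespace))
(define ch (make-channel))
(thread (lambda () (channel-put ch '((lambda (x y) (+ x y)) 1 2))))
(define result (eval (channel-get ch) ns))
result ; => 3
```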


George Neuner

Oct 9, 2019, 7:36:36 AM10/9/19
to Zelphir Kaltstahl, racket users

On 10/9/2019 2:34 AM, Zelphir Kaltstahl wrote:
> I don't think places are a good example for good support of parallelism.

Hoare's "Communicating Sequential Processes" is a seminal work in Computer Science.  We can argue about whether places are - or not - a good implementation of CSP,  but you can't very well knock the concept.



> It is difficult to get a flexible multi processing implementation done,
> without hard-coding the lambdas, that run in each place, because we
> cannot send serializable lambdas (which also are not core, but only
> exist in the web server package) over channels. That means, that one
> needs to define ones lambdas ahead of time before even starting a place
> and sending it the data to process. That means, that we cannot have
> something like a process pool.

1) Serial lambdas and closures are in web-server-lib, not the framework.  What is the problem with including a library?

2) serial-lambda, define-closure, etc.  only create functions that CAN BE serialized.  You still do have to (de)serialize them.

3) You can send serialized code over place channels - it is just a string.


The difficulty in sending code is making sure all the support context is available.  To be sure, this can be a pain.  But consider that for a local place you can send the name of a code file to dynamic-require ... and for a remote place you can send the file itself.
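For example, a sketch of the `dynamic-require` route, where the "code to run" travels as plain data naming a module and an export (a stock collection is used here so the example is self-contained):

```racket
#lang racket
;; The message is ordinary data: a module path plus an exported name.
;; Such a pair is place-message-allowed, unlike a closure.
(define msg '(racket/math pi))
;; The receiving side (e.g. inside a place) loads the named export itself.
(define loaded (dynamic-require (car msg) (cadr msg)))
loaded ; => 3.141592653589793
```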

And if channels present a problem, take a look at Paulo Matos's Loci package which implements (local, same machine) distributed places over sockets.    https://pkgs.racket-lang.org/package/loci



> The other problem is, that using places means using multiple Racket VMs,
> if I remember correctly, which uses some RAM. It is not like places are
> super lightweight at least.

This is a more substantial argument.  Each place is a separate copy of the VM.  I've wished publicly to have more control over the resources used by Racket - hard limits on heap and so forth, as with the server JVM.  It's just a small problem on Unix/Linux where you can ulimit ... but it's a major PITA on Windows where you can't.

But, you can choose whether to use *dynamic* places which are OS threads in the same process, or *distributed* places which are separate processes, or a mixture of the two.



> Racket threads run on a single core, I think.

Actually the whole VM is single core limited, but multiple threads can execute within it.  That's why places and futures are important - for multi-core support.

The reason for the thread limitation is GC.  The mutator has to stop - at least momentarily - when the collection begins and again (possibly multiple times) to check that the collection is complete.  User-space threads can be halted easily ... OS threads are much more difficult to deal with - there generally is little control over their scheduling and little or no visibility into when it would be safe to stop them.

Obviously GC can be done with OS threads sharing memory - but it is an order of magnitude harder than with user-space threads.  Maybe with the switch to Chez, it is time to revisit this.



> I know there is a tutorial about using futures somewhere, where it
> depends on the number type one uses, whether the code can be
> automatically run in parallel or not, so there is also some issue there,
> or at least it did not look to me like one could use futures everywhere
> and have neat parallelism.

would-be-future  returns what is essentially testing code that logs any future-unsafe operations when executed.  You can use it to determine if the code actually would run in parallel as a future.

Generally, you can access vectors/arrays and do math.  Almost anything else is problematic.
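For instance (both forms come with `racket/future`; note that `would-be-future` reports its findings via logging, on the futures log topic if I read the docs right, rather than via a return value):

```racket
#lang racket
(require racket/future)

;; Pure flonum arithmetic is future-safe, so this can actually parallelize.
(define f
  (future (lambda () (for/fold ([s 0.0]) ([_ (in-range 1000)]) (+ s 1.0)))))
(touch f) ; => 1000.0

;; Same API, but the thunk runs sequentially when touched, and any
;; operation that would have blocked a real future gets logged.
(define wf (would-be-future (lambda () (+ 1.0 2.0))))
(touch wf) ; => 3.0
```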


> Correct any of the things I wrote above, if they are not true, but I
> think Racket definitely needs a better multi processing story.

You certainly can argue that sending code for execution on a remote place is not as easy as it could be.  And unloading code is not as easy as it should be.  Killing / spawning a new place works, but it is a heavy hammer that works against having a compute pool. 


> I would love to see something like Guile Fibers. Andy Wingo even mentioned in
> his video, that some of the Racket greats advised him to look at
> Concurrent ML and that that is where he got some ideas from, when
> implementing Guile Fibers as a library. Shouldn't Racket then be able to
> have a similar library? I don't understand how Fibers really works, but
> that is a thought I had many times, since I heard about the Fibers library.

A fiber is just a coroutine ... implemented with continuations and a nicer API.

> Regards,
> Zelphir

George

jab

Oct 9, 2019, 1:43:15 PM10/9/19
to Racket Users
So far from this thread, it seems the idea of Structured Concurrency hasn’t yet made it into the Racket world. I’ll be interested to see if it gets adopted in Racket in the future (or at least better understood) as its adoption grows elsewhere.

In the meantime, in case it helps illustrate the idea to anyone else still interested, check out the talk I linked to previously showing an implementation of Happy Eyeballs using structured concurrency. Or just read it directly from the Trio source once you understand what a nursery and a cancel scope are:
https://github.com/python-trio/trio/blob/master/trio/_highlevel_open_tcp_stream.py

(RFC 6555 Happy Eyeballs is like a Hello World of the problem space that structured concurrency is addressing, so this is an illustrative example.)

If there is some blessed Racket implementation of Happy Eyeballs, it could be useful to compare – for correctness, completeness, obviousness, and elegance.

Zelphir Kaltstahl

Oct 9, 2019, 1:51:26 PM10/9/19
to David Storrs, Racket Users

True. However, here comes the big "but": what about capturing the environment of expressions? For example, I might have identifiers in my S-expressions bound to potentially a lot of data, which must also be sent through the channel. It would be painful (if not impossible at the time of writing the code) to have to copy the whole representation (of a creating expression) for that data into an expression in order to send it.

Zelphir Kaltstahl

Oct 9, 2019, 2:09:20 PM10/9/19
to George Neuner, racket users

Hi George!

I was wrongly under the impression that serializable lambdas are supposed to work out of the box when sending them over channels, without needing any further work ("are serialized automatically" instead of "can be serialized"). This is what I would have expected, as it seems to be the case in other programming languages, where one can simply send lambdas to other actors, which can, but don't have to, run on other cores or other machines. A year ago or so there was a Racket event in Berlin (Racket Summerfest), where people also tried to use serializable lambdas to send them over channels, but did not succeed. I do not know what the problems there were, as I did not attend that specific workshop. However, I think some of the more knowledgeable people of the Racket community might remember it, and maybe there is some code from that event. If you know how to serialize those serializable lambdas, it is possible that you could solve the problems they faced.

I have a question regarding what you wrote: Is there a generic way to serialize such lambdas, no matter what they look like?

I think that would be kind of necessary to abstract away the painful parts of coming up with a way of serializing the lambdas. It also seems necessary for eventually creating a library which provides a process pool, as such a library should be easy to use and not force the user to think about difficult things when the user's intention seems so simple: "just do that on another core".

I have not read CSP, and I admit that it is considered to be _the_ standard for multiprocessing. I am also not trying to argue the concept away : ) I am just saying that in Racket it is quite difficult (at least for me, although I tried for many hours when working on my project) to get multiprocessing done. I believe in Python one faces at least a superficially similar situation: threads stay on the same core, while processes can go to other cores, which parallels starting a new Racket VM, because Python also starts a new process. However, somehow it is easily possible to hand a reference or a lambda to a process pool in Python. I don't know the implementation details, of course.

Best regards,

Zelphir

jab

Oct 9, 2019, 2:28:16 PM10/9/19
to Racket Users
For example, here’s a more functional implementation of Happy Eyeballs in Clojure, using the author’s “missionary” library (a functional effect and streaming system):

https://cljdoc.org/d/missionary/missionary/b.11/doc/readme/guides/happy-eyeballs

Sam Tobin-Hochstadt

Oct 9, 2019, 4:15:31 PM10/9/19
to jab, Racket Users
The Racket community, and even more so the design of Racket
concurrency APIs, is very strongly influenced by the academic side of
Racket. As far as I can tell, structured concurrency is fairly close
to what is traditionally called the fork/join model. Concurrency in
Racket is usually structured in a somewhat different way, around
first-class events and channels. First-class events were originally
created in Concurrent ML, and the basic idea is that you can package
up things that you might think of as concurrency operations (such as
`select` in Go) and turn them into _values_ which you can then further
synchronize on.

Here's how I would write the Happy Eyeballs program in Racket:

#lang racket/base

(require racket/tcp racket/match racket/set)

(define DELAY 300)

(define threads (mutable-set))

(define (connect hosts)
  (let loop ([hosts hosts] [chans null])
    (cond [(null? hosts)
           #f]
          [else
           (define ch (make-channel))
           (define t (thread
                      (lambda ()
                        (with-handlers ([exn:fail? (λ _ (channel-put ch #f))])
                          (define-values (in out) (tcp-connect (car hosts) 80))
                          (channel-put ch (cons in out))))))
           (set-add! threads t)
           (match (apply sync
                         (alarm-evt (+ (current-milliseconds) DELAY))
                         ch chans)
             [(cons in out) ;; success
              (for ([t threads]) (kill-thread t))
              (cons in out)]
             [#f ;; error
              ;; thread is dead, don't need to kill it
              ;; don't need to remember this channel
              (loop (cdr hosts) chans)]
             [_ ;; timeout
              ;; ask the next iteration to sync on this channel too
              (loop (cdr hosts) (cons ch chans))])])))

I think that this does the right thing, although it's not actually
using IP addresses and the IP-level operations, but hostnames and TCP
sockets.

One other thing I would note. This is simple, but even simpler would
be to package up `tcp-connect` as an event, the way `tcp-accept-evt`
works. Then you could write something like:

(apply sync
       (for/list ([(h i) (in-indexed hosts)])
         (replace-evt (alarm-evt (+ (current-milliseconds) (* DELAY i)))
                      (lambda _ (tcp-connect-evt h 80)))))

Sam

Philip McGrath

Oct 9, 2019, 5:58:25 PM10/9/19
to Zelphir Kaltstahl, George Neuner, racket users
On Wed, Oct 9, 2019 at 2:09 PM Zelphir Kaltstahl <zelphirk...@gmail.com> wrote:
> I was wrongly under the impression, that serializable-lambda are supposed to work out of the box, when sending them over channels, without needing to do any further work ("are serialized automatically" instead of "can be serialized"). … If you know how to serialize those serializable-lambdas, it is possible, that you could solve the problems they faced.

> … Is there a generic way to serialize such lambdas, no matter what they look like?

You can serialize the procedures produced by `serial-lambda` using the `racket/serialize` library, the same way you serialize other kinds of Racket values. Here is an extended example:
#lang racket

(require web-server/lang/serial-lambda
         racket/serialize
         rackunit)

(define (make-serializable-adder n)
  (serial-lambda (x)
    (+ n x)))

(define serializable-add5
  (make-serializable-adder 5))

;; The value of a serial-lambda expression is a procedure.
(check-eqv? (serializable-add5 2)
            7)

;; The procedure can't be sent over a place-channel directly ...
(check-false
 (place-message-allowed? serializable-add5))
;; ... but it is serializable:
(check-true
 (serializable? serializable-add5))

(define serialized
  (serialize serializable-add5))
;; The serialized form can be sent over a place-channel.
(check-true
 (place-message-allowed? serialized))

(define-values [in out]
  (place-channel))

(place-channel-put in serialized)

;; When we deserialize the received value, potentially in a new place ...
(define received
  (place-channel-get out))
(define deserialized
  (deserialize received))

;; ... it works like the original, including closing over the lexical environment.
(check-eqv? (deserialized 11)
            16)

One thing to note is that the procedures returned by `serial-lambda` do close over their lexical environments, which is desirable for the reasons you mention in your previous message. That means that values which are captured as part of the closure must also be serializable, or `serialize` will raise an exception. (The same is true for lists: they are serializable as long as their contents are also serializable.)

Concretely, in the example above, `n` is part of the closure and therefore must be serialized—which works out great, because `n` must be a number, and numbers are serializable. On the other hand, the procedure `+` isn't serializable, but that's ok, because `+` isn't part of the closure: it's a reference to a module-level identifier.

> … it seems to be the case in other programming languages, where one can simply send lambdas to other actors, which can, but don't have to, run on other cores or other machines. … It also seems to be necessary for eventually creating a library, which provides a process pool, as such library possibly should be easy to use and not force the user to think about difficult thing, when the intention the user has seems so simple "just do that on another core".

This depends on what you mean by "actors." You can simply send closures to other Racket-level threads (in the sense of `thread`), which share state and run concurrently (but not in parallel) within the same OS-level thread. While not framed specifically in the vocabulary of the actor model, Racket threads seem to fit the model pretty well: they are what I think of first when I hear "actors" in a Racket context.

When you want machine-level parallelism (i.e. multiple OS threads), that shared mutable state becomes a problem, because hardware provides only very low-level mechanisms for coordinating parallel threads so that they have a consistent view of the state (i.e. no variables are half-assigned). This is true regardless of the language you work in: languages differ in what tools, if any, they give you to manage the problem.

Racket provides two mechanisms for parallelism, places and futures. Futures share state like Racket-level threads, but block when attempting an operation that can't safely be done in parallel: in other words, they provide "best-effort" parallelism. On traditional Racket, there are lots of operations that can't safely be done in parallel, so futures have mostly been useful so far for numeric computation. There's hope that will change with Racket-on-Chez, but I think we're still at the stage of needing more people to try it and see how it goes in practice.

Places, on the other hand, provide "shared nothing" parallelism: each exists in its own world, without shared mutable state, and they communicate by explicitly sending messages over place-channels (which are different than normal channels, e.g. by being asynchronous, but have much the same API and integrate with Racket's general system for threads and events). (Actually, places do support a very restricted form of shared mutable state through things like `make-shared-bytes`.) There are also distributed places and Paulo Matos's loci, which extend the idea of places to multiple processes or machines.
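For concreteness, a minimal dynamic-place round trip (the `place` body runs in its own Racket VM instance on its own OS thread; only data satisfying `place-message-allowed?` crosses the channel):

```racket
#lang racket
(require racket/place)

;; The body of `place` runs in a separate Racket VM instance;
;; `ch` is the child's end of the place channel.
(define p
  (place ch
    (place-channel-put ch (* 2 (place-channel-get ch)))))

(place-channel-put p 21)              ; the place descriptor is itself a channel
(define answer (place-channel-get p))
(place-wait p)
answer ; => 42
```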

The point is that "just do that on another core" doesn't work in general, because "that" might rely on mutable state or other aspects of the context where it is run. Of course there are lots of cases that are trivially parallelizable, but, even in those cases, starting parallel threads and communicating between them involves overhead. I don't know of any language or library that completely relieves the programmer of thinking about what can be parallelized well and how much parallelism to use, and then writing those decisions down somehow in the programming language. In my experience, Racket makes it easier to reason about parallel programming than languages that expose you to the wild west of the hardware, and I don't find writing down parallel programs in Racket especially onerous.

-Philip

George Neuner

Oct 10, 2019, 2:02:13 AM10/10/19
to racket...@googlegroups.com
On Wed, 9 Oct 2019 16:15:14 -0400, Sam Tobin-Hochstadt
<sa...@cs.indiana.edu> wrote:

>The Racket community, and even more so the design of Racket
>concurrency APIs, is very strongly influenced by the academic side of
>Racket. As far as I can tell, structured concurrency is fairly close
>to what is traditionally called the fork/join model.

To a 1st approximation. There are different implementations of
fork/join.

In some the forking thread stops immediately to wait for the spawned
child(ren) to exit. In others, the forking thread can continue until
it executes a deliberate join.

But then there are systems where join implicitly targets all children.
Then there are systems where join implicitly targets the last child
forked and to wait for all children you have to keep executing join
until it fails. Still others where (like with Unix processes) join
implicitly targets a single child, but you don't know which one will
return and to wait for all you have to keep executing joins. And
still others where join can be told to wait for a particular set of
children (ignoring others).


Threads in Racket are signaling, and their completion (or death) can
be detected using events or directly using their handles. So Racket
can easily emulate any of the behaviors above. Putting a nicer syntax
on it would just be some macrology.
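Concretely, in stock Racket and before any macrology: `thread-wait` gives join-one and join-all, while `sync` over `thread-dead-evt`s gives join-any:

```racket
#lang racket
;; Emulating two fork/join flavors with stock Racket threads and events.

;; join-all: the forking thread waits for every child.
(define (join-all ts) (for-each thread-wait ts))

;; join-any: block until some child exits, and return that child.
(define (join-any ts)
  (apply sync
         (for/list ([t ts])
           (wrap-evt (thread-dead-evt t) (lambda (_) t)))))

;; Usage: spawn three short-lived children, then join them.
(define ts (for/list ([i 3]) (thread (lambda () (sleep (* 0.01 i))))))
(define first-done (join-any ts))
(join-all ts)
(andmap thread-dead? ts) ; => #t
```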

George

Zelphir Kaltstahl

Oct 10, 2019, 2:42:38 AM10/10/19
to Philip McGrath, George Neuner, racket users

Hi!

Hmmm, that code example looks simple enough. If that works for arbitrary serial-lambdas with only serializable parts, I could continue my process pool project. The only issue would then be that any user of it would have to know in advance that they cannot define their lambdas as usual, but have to use serial-lambda, potentially throughout their entire code (as there might be references to things which contain references to things, which ...). The abstraction is leaky in this way, but it would make a working library.

If I understand correctly, `n` does not need to be serializable because it is defined in the module and will be defined in the module "on the other side" as well?

When I said that in other languages it is possible to simply define a lambda and send it to another actor, I meant of course with immutable data, or at least with data which is not actually mutated. I assumed that already, but yes, of course you are right about that. I was thinking of Erlang, Elixir and maybe Pony.

When hearing "actors", threads could be thought of as a means of implementing them, but I think it might not be useful to do so when thinking about performance. Architecturally yes, maybe. That is why I would think of places as a means of implementing an actor-model kind of thing.

When you say that loci extends the idea of places to multiple machines, what do you mean? I thought places can already run on multiple machines.

I might try to use your example code to finally finish the process pool implementation I started. If everything works like that, then I will probably have to retract any statements about parallelism being too hard in Racket : ) It would be nice, however, not to have to use a different construct to define serializable lambdas, and to be able to take any program and simply send its existing lambdas to a process pool to make use of multiple cores, instead of having to refactor many things into serializable things.

Regards,

Zelphir

Philip McGrath

unread,
Oct 10, 2019, 9:29:02 AM10/10/19
to Zelphir Kaltstahl, George Neuner, racket users
On Thu, Oct 10, 2019 at 2:42 AM Zelphir Kaltstahl <zelphirk...@gmail.com> wrote:
… If that works for arbitrary serializable-lambda with only serializable parts, I could continue my process pool project.

Yes, this would work for any value created by `serial-lambda`.
 
The only issue would then be that any user of it would have to know in advance that they cannot define their lambdas as usual, but have to use `serial-lambda`, potentially throughout their entire code (as there might be references to things which contain references to things, which ...). The abstraction is leaky in this way, but it would make for a working library.

If I understand correctly, `n` does not need to be serializable because it is defined in the module and will be defined in the module "on the other side" as well?

I think it can help in understanding to know a bit about how `serial-lambda` is implemented. In a context like:
(define (make-serializable-adder n)
  (serial-lambda (x)
    (+ n x)))
 `serial-lambda` uses some advanced macrology to generate a module-level struct declaration like:
(serializable-struct representation (n)
  #:property prop:procedure
  (λ (this x)
    (+ (representation-n this) x)))
and the `serial-lambda` expression itself expands to code constructing an instance of the struct, like `(representation n)`.

The thing to note here is that the fields of the struct hold the contents of the closure, i.e. local variables from the lexical environment of the `serial-lambda` expression. These values need to be serializable so that they can be sent across the place-channel. The arguments to the `serial-lambda` procedure—in this case, `x`—don't need to be serializable, because they aren't part of the struct but are supplied when the procedure is called. Similarly, module-level variables (including imports via `require`) can be referenced directly in the generated `prop:procedure` value, so they also don't need to be serializable: this is why `+` in the example is ok, but the principle also covers much more complicated cases in general.

So, users of your process pool only need to use `serial-lambda` for procedures that are captured in the closure you want to send to the other place, not for all of the functions they use to actually do the computation. From my experience programming in `#lang web-server`, where these rules apply to the implicit closure created around a web interaction, it doesn't turn out to be an issue most of the time.
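A minimal sketch of the round trip, assuming `serial-lambda` from `web-server/lang/serial-lambda` and the standard `racket/serialize` machinery:

```racket
#lang racket
(require racket/serialize
         web-server/lang/serial-lambda)

;; The free variable n ends up in the generated struct's fields, so it
;; must be serializable; the argument x is supplied at call time and
;; need not be.
(define (make-serializable-adder n)
  (serial-lambda (x) (+ n x)))

(define add5 (make-serializable-adder 5))
;; Serialize the closure, revive it, and call it as usual.
(define revived (deserialize (serialize add5)))
(revived 10) ; => 15
```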

When hearing "actors", threads could be thought of as a means of implementing them, but I think it might not be useful to do so when thinking about performance. Architecturally yes, maybe. That is why I would think of places as a means of implementing an actor-model kind of thing.

It depends on how you want to use your system. If you want a few, relatively long-lived actors, places might work well. If you want many, potentially short-lived actors, you will want to use threads, because creating places is expensive and there is no benefit beyond `(processor-count)` places. You could also run threads across a pool of places, though you would then need to think about how to distribute the threads optimally among the places.

When you say that loci extends the idea of places to multiple machines, what do you mean? I thought places can already run on multiple machines.

Yes, distributed places can run across multiple machines.

It would be nice, however, not to have to use a different construct to define serializable lambdas, and to be able to take any program and simply send its existing lambdas to a process pool to make use of multiple cores, instead of having to refactor many things into serializable things.

I do see what you mean, but I think of this as protecting me from bugs. When a programmer writes `serial-lambda`, they are saying "I've thought about it, and it makes sense to serialize this and later call the deserialized procedure in some other context." If you could serialize any first-class closure you find, without cooperation from the creator of the closure, it might not logically make sense to call it in some other context (perhaps because it relies on mutable state), in which case you would get nonsense results and potentially break the other module's invariant.

 -Philip

Nathaniel Smith

unread,
Oct 12, 2019, 4:28:03 AM10/12/19
to Racket Users
👋 Hi all, Josh pointed me to this thread. I'm the author of that blog post he linked to.

Sam Tobin-Hochstadt wrote:
> The Racket community, and even more so the design of Racket
> concurrency APIs, is very strongly influenced by the academic side of
> Racket. As far as I can tell, structured concurrency is fairly close
> to what is traditionally called the fork/join model. Concurrency in
> Racket is usually structured in a somewhat different way, around
> first-class events and channels. First-class events were originally
> created in Concurrent ML, and the basic idea is that you can package
> up things that you might think of as concurrency operations (such as
> `select` in Go) and turn them into _values_ which you can then further
> synchronize on.
>
> Here's how I would write the Happy Eyeballs program in Racket:

I believe this has a bug – if the 'connect' call is killed, then the threads it spawned will "leak":

(define connect-thread (thread (lambda () (connect ...))))
(kill-thread connect-thread)
;; Threads may still be running in the background here

Of course you could also accidentally leak threads via a regular bug, where you just forget to wait for the child threads on some exit path. I don't see any bugs like that in this particular code, but in general, hey, stuff happens, no-one's perfect, so we need to be able to cope with thread leaks.

And Racket has much better tools to handle this kind of situation than most systems, via "custodians". Unfortunately I'm not very familiar with Racket myself; I just know about these from Flatt & Findler's excellent paper on kill-safe abstractions :-). So I might have some details wrong here, but I think the caller could prevent this issue by doing something like:

;; Encapsulate the 'connect' call and all threads it spawns inside a custodian
(define connect-custodian (make-custodian))
(define connect-thread
  (parameterize ([current-custodian connect-custodian])
    (thread (lambda () (connect ...)))))
;; Kill that custodian, instead of just the thread
(custodian-shutdown-all connect-custodian)
;; Now all the child threads have been killed too

But, to do this, the caller has to know that 'connect' spawns threads. The threads were supposed to be an internal implementation detail, but now they've leaked out and become part of our function's publicly visible semantics. We could fix this by making a custodian inside 'connect', but: why does our language allow us to express this bug in the first place?
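That "custodian inside the callee" fix could be packaged as a reusable wrapper. Here's a sketch (the helper name `call-with-custodian` is made up for illustration, and again I might have Racket details wrong):

```racket
#lang racket
;; Hypothetical helper: run a thunk under a fresh custodian and shut
;; the custodian down when the thunk finishes, so any threads it
;; spawned cannot outlive the call. The calling thread itself was
;; created under the parent custodian, so it is unaffected.
(define (call-with-custodian thunk)
  (define cust (make-custodian))
  (dynamic-wind
    void
    (lambda ()
      (parameterize ([current-custodian cust])
        (thunk)))
    (lambda () (custodian-shutdown-all cust))))

;; The looping worker below would leak, but it is killed before
;; control returns to the caller.
(call-with-custodian
 (lambda ()
   (thread (lambda () (let loop () (sleep 1) (loop))))
   'done)) ; => 'done
```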

The core idea of the "structured programming" movement in the 60s/70s was that functions should be opaque abstractions that encapsulate control flow. Therefore, operators that can break the function abstraction boundary like cross-function 'goto' should be eliminated from our vocabulary, and replaced by operators like 'if' and 'loop' that respect functional abstraction. The core idea of "structured concurrency" is that, well, concurrent control flow is a type of control flow. Therefore, we should also get rid of operators like 'thread' that can cause control flow to accidentally leak outside the boundary of a function, and replace them with new concurrency operators that respect functional abstraction. That way when we call 'connect' we can tell that it doesn't leak threads by just looking at the signature, without having to peek inside and read the body. And this bug becomes inexpressible.

Concurrent ML's 'select' is a "structured" operator in this sense, and Concurrent ML is super cool as far as it goes. But unfortunately, you can't use Concurrent ML events as your sole form of concurrent abstraction, because any given event can only have at most one side-effect (roughly speaking). So you can't, like, have your entire program be one big event; you need something like 'thread' too. And the blog post proposes a specific structured replacement for the 'thread' operator, that's sort of fork/join-ish. But the overall program isn't about fork/join; it's about respecting functional abstraction.
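For reference, here is what that first-class 'select' looks like with Racket's built-in CML-style events; the composed event is an ordinary value that can be stored, passed around, and synchronized on:

```racket
#lang racket
;; A "select" over two channels is just a value built with choice-evt;
;; wrap-evt tags each result with the channel it came from.
(define a (make-channel))
(define b (make-channel))

(define first-of-either
  (choice-evt (wrap-evt a (lambda (v) (list 'a v)))
              (wrap-evt b (lambda (v) (list 'b v)))))

(thread (lambda () (channel-put b 42)))
(sync first-of-either) ; => '(b 42)
```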

-n