managing competing access to filesystem within places


Matthew Butterick

Jun 13, 2019, 12:28:40 PM6/13/19
to Racket Users
I've recently added to Pollen a parallel-rendering facility using places, in emulation of how Racket docs are rendered in parallel.

But it doesn't quite work — I get intermittent, inconsistent errors from the filesystem (for instance, certain files that should exist do not, or certain files that shouldn't exist, do).

My intuition about places is weak. But it seems to me that Scribble renders are largely independent (= happening in separate folders on the filesystem), whereas within a Pollen project the rendered sources often rely on shared files (for instance, a "template.html"). So there is more direct competition for use of certain files, which perhaps is Bad News.

Is there a rule of thumb for how to manage these conflicts? For instance, is it better to just avoid having places try to read or write from the same files? Is there some filesystem-blocking technique that can be used to make sure reads & writes don't collide?

Matthew Flatt

Jun 13, 2019, 1:04:12 PM6/13/19
to Matthew Butterick, Racket Users
I recommend a lock-serving place in Pollen to manage concurrency for
parallel rendering. Don't let multiple places try to read and/or write
to the same file, and don't try to use filesystem locks.

You can use filesystem locks to manage concurrency, but it's not
convenient. Part of the problem is that there's no blocking operation
to take a lock (due to the underlying OS abstractions). You can also
use atomic filesystem operations to achieve the effect of locks or
transactions, but it's really difficult to get that right.

For the compilation part of `raco setup`, parallel builds rely on a
combination of atomic filesystem actions plus the main place acting as
a lock server. That is, a place can only compile a file if it can get
permission for the file from the main place, and it reports back to the
main place when the file has been compiled. If you run two `raco
setup`s or `raco pkg install`s at the same time, things can go wrong,
because the lock server coordinates only within one multi-place process.

For the document-building part of `raco setup`, you're right that it
relies mostly on independent document rendering. Pollen rendering is
probably more like the compilation part of `raco setup`. Document
rendering in `raco setup` uses a shared cross-reference database, but
concurrency at that level is managed by database transactions (as
implemented somehow in SQLite).
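For curious readers, here is a minimal sketch of what such a lock-serving main place might look like. The message shapes and the `run-lock-server` name are hypothetical, not `raco setup`'s actual protocol; workers would send `(list 'lock path reply-ch)` with a fresh place channel to reply on, and `(list 'unlock path)` when done:

```racket
#lang racket

;; Hypothetical lock-server loop run in the main place: each path is
;; granted to one worker at a time; later requesters queue up.
(define (run-lock-server server-ch)
  (let loop ([locked (hash)])   ; path -> queue of waiting reply channels
    (match (place-channel-get server-ch)
      [(list 'lock path reply-ch)
       (cond
         [(hash-has-key? locked path)
          ;; already held: queue the waiter; it blocks until we reply
          (loop (hash-update locked path
                             (λ (waiters) (append waiters (list reply-ch)))))]
         [else
          (place-channel-put reply-ch 'granted)   ; free: grant immediately
          (loop (hash-set locked path '()))])]
      [(list 'unlock path)
       (match (hash-ref locked path '())
         ['() (loop (hash-remove locked path))]   ; nobody waiting
         [(cons next rest)
          (place-channel-put next 'granted)       ; hand lock to next waiter
          (loop (hash-set locked path rest))])])))
```

A worker would block on `(place-channel-get reply-ch)` after sending its lock request, so a queued waiter simply stalls until the server gets around to granting it.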
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/racket-users/2A49317F-F4C8-4EF2-B144-E8846571
> DD7F%40mbtype.com.
> For more options, visit https://groups.google.com/d/optout.

Matthew Butterick

Jun 13, 2019, 9:16:59 PM6/13/19
to Matthew Flatt, Racket Users

On Jun 13, 2019, at 10:04 AM, Matthew Flatt <mfl...@cs.utah.edu> wrote:

I recommend a lock-serving place in Pollen to manage concurrency for
parallel rendering. Don't let multiple places try to read and/or write
to the same file, and don't try to use filesystem locks.


Thanks for the suggestion. I see what you mean. Locking the source file being rendered did reduce the filesystem errors ... 

... but it didn't eliminate them. The wrinkle is that in the typical Pollen project, many (or even all) of these source files may transitively rely on writing the same template file, e.g., "template.html.p". I could lock that file too, but then other renders that rely on it would be blocked, which would tend to relinearize the job.

As an alternative to file locking, I tried having each rendering place attempt the render three times with a short delay in between, on the idea that a temporary filesystem error is likely to resolve by the third try, while a permanent failure (e.g., a problem with the source file itself) will not, and can be raised as a "real" error.

Maybe there is a secretly pernicious aspect to this muddle-through-it technique, though it does avoid the relinearization problem.

Matthew Flatt

Jun 13, 2019, 9:43:58 PM6/13/19
to Matthew Butterick, Racket Users
At Thu, 13 Jun 2019 18:16:50 -0700, Matthew Butterick wrote:
>
> > On Jun 13, 2019, at 10:04 AM, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> >
> > I recommend a lock-serving place in Pollen to manage concurrency for
> > parallel rendering. Don't let multiple places try to read and/or write
> > to the same file, and don't try to use filesystem locks.
>
>
> Thanks for the suggestion. I see what you mean. Locking the source file being
> rendered did reduce the filesystem errors ...
>
> ... but it didn't eliminate them. The wrinkle is that in the typical
> Pollen project, many (or even all) of these source files may
> transitively rely on writing the same template file, e.g.,
> "template.html.p". I could lock that file too, but then other renders
> that rely on it would be blocked, which would tend to relinearize the
> job.

The compilation part of `raco setup` recovers some parallelism by
detecting "prefetch" messages that are generated by the `require`
macro. The prefetch messages report modules that will be demanded soon
via the module name resolver, so compilation can start on them in
parallel. It doesn't recover the potential parallelism completely, but
it helps. Generating and/or catching prefetch messages might be a good
way to go in your case, too.

> As an alternative to file locking, I tried having each rendering
> place attempt the render three times with a short delay in between.
> On the idea that a temporary filesystem error is likely to resolve by
> the third try, and a permanent failure (e.g., a problem with the
> source file itself) will not, and can be raised as a "real" error.
>
> Maybe there is a secretly pernicious aspect to this muddle-through-it
> technique, though it does avoid the relinearization problem.

Oh, don't do that. Unreliable workarounds to concurrency problems
really do come back to bite you later.

George Neuner

Jun 13, 2019, 10:05:50 PM6/13/19
to Matthew Butterick, racket users

On 6/13/2019 9:16 PM, Matthew Butterick wrote:
On Jun 13, 2019, at 10:04 AM, Matthew Flatt <mfl...@cs.utah.edu> wrote:

I recommend a lock-serving place in Pollen to manage concurrency for
parallel rendering. Don't let multiple places try to read and/or write
to the same file, and don't try to use filesystem locks.

Thanks for the suggestion. I see what you mean. Locking the source file being rendered did reduce the filesystem errors ... 

... but it didn't eliminate them. The wrinkle is that in the typical Pollen project, many (or even all) of these source files may transitively rely on writing the same template file, e.g., "template.html.p". I could lock that file too, but then other renders that rely on it would be blocked, which would tend to relinearize the job.

I don't know Pollen or your program architecture, so apologies if this is a dumb suggestion.

I assume there is no required ordering to the "blocks" that you are trying to render in parallel - i.e., it doesn't matter in what order the output gets written to the template so long as each output block is written contiguously (not interspersed with another).

If so, it would seem that each worker place could write into its own temporary output file and, when finished, pass the file back to the dispatcher (task master), letting it assemble the final output. My understanding is that places can pass file descriptors within the same system, so although the dispatcher would have to copy finished work into the template, the rendered content itself would not need to be passed (piped, messaged). And since there would be just a single writer (the dispatcher), you wouldn't need to lock the output template file(s).
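A thread-level sketch of that single-writer scheme (the rendering procedure is hypothetical, and with places the temp-file paths would travel over place channels rather than thread channels):

```racket
#lang racket

;; Each worker renders into its own temporary file and passes back only
;; the path; the dispatcher is the sole writer of the final output.
(define (start-worker render-block)   ; render-block : output-port -> void
  (define ch (make-channel))
  (thread (λ ()
            (define tmp (make-temporary-file))
            (call-with-output-file tmp #:exists 'truncate render-block)
            (channel-put ch tmp)))    ; pass back the path, not the content
  ch)

(define (assemble out-path worker-chs)
  ;; single writer: the shared output file needs no lock
  (call-with-output-file out-path #:exists 'replace
    (λ (out)
      (for ([ch (in-list worker-chs)])
        (define tmp (channel-get ch))
        (call-with-input-file tmp (λ (in) (copy-port in out)))
        (delete-file tmp)))))
```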

YMMV,
George

Matthew Butterick

Jun 14, 2019, 6:26:26 PM6/14/19
to Matthew Flatt, Racket Users


> On Jun 13, 2019, at 6:43 PM, Matthew Flatt <mfl...@cs.utah.edu> wrote:
>
>
> Oh, don't do that. Unreliable workarounds to concurrency problems
> really do come back to bite you later.

To ask the second-dumbest possible question: what error-recovery policies are reliable under concurrency? (Put another way, what guarantees exist about errors under concurrency? E.g., can we trust that all errors will trigger an exception?)

And the dumbest possible: if there are no such policies, then how can you ever be certain the code works?

Matthew Flatt

Jun 14, 2019, 7:34:29 PM6/14/19
to Matthew Butterick, Racket Users
That's a difficult and non-dumb question, and what follows is probably
too general an answer, but I'll try.

Concurrency with shared state is a particularly rich source of
unspecified behavior. The only way to deal with that is to make sure
your program never reaches an unspecified situation. If concurrently
reading and writing to a shared mutable variable is not specified, for
example, then one way to avoid that is by using a lock (whose behavior
with concurrency is specified) and only touching the variable while
holding the lock. And if your program does reach an unspecified
situation? Normally you do not get an error; some difficulty in
detecting the error is probably why it's unspecified in the first
place.

But I think you're more interested in the filesystem level. The range
of possible behaviors is usually more specified at the filesystem
level, but it's not always easy to reason about. For example,
`rename-file-or-directory` is an atomic operation on Unix filesystems,
which means that if two processes concurrently try to rename different
files to the same path, then one of them will win; the resulting path
will refer to one file or the other with its contents intact, and the
other file's content will effectively disappear. That's enough
specification to be useful for some tasks, and other tasks just
shouldn't enable those kinds of races.
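The usual way to exploit that atomicity is a write-then-rename pattern. A sketch (the helper is hypothetical, and it assumes `path` has a directory part so the temp file lands on the same filesystem, which the atomicity depends on):

```racket
#lang racket

;; Write the full content to a temp file in the target directory, then
;; atomically publish it under the real name.
(define (atomic-write-file path content)
  (define tmp (make-temporary-file "atomic~a" #f (path-only path)))
  (call-with-output-file tmp #:exists 'truncate
    (λ (out) (write-string content out)))
  ;; Atomic on Unix: readers see the old file or the new one, never a
  ;; partial write; concurrent renamers race, but one whole file wins.
  (rename-file-or-directory tmp path #t))
```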

And then there's Windows, where `rename-file-or-directory` is not
straightforwardly atomic. Meanwhile, errors such as "permission denied"
can be reported for things that you would not consider to be permission
errors, like moving a file to a name that was formerly occupied, but
that was deleted while some process still has it open, so the name is
not yet released. So, as a practical matter, the set of behaviors to
avoid is larger.

The errors that are possible from filesystem operations are usually
pretty open-ended, especially at the Racket level, where we don't try
to enumerate them. Meanwhile, you usually get some integrity guarantees
from the filesystem, though we don't try to enumerate those
exhaustively in the Racket documentation, either.

Sometimes, staying out of the error-triggering space isn't possible,
and the possible errors are well enough defined by all filesystems, so
retrying is a workable solution. If you really have to be in that
space, then retry an arbitrary number of times, perhaps with an
exponential backoff (i.e., double the delay between retries).
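Such a retry loop might look like this sketch (a hypothetical helper, not a standard Racket function):

```racket
#lang racket

;; Re-run `thunk` on filesystem errors, doubling the pause each time,
;; and re-raise as a "real" error once the tries run out.
(define (with-retries thunk #:tries [tries 5] #:pause [pause 0.05])
  (let loop ([n tries] [pause pause])
    (with-handlers ([exn:fail:filesystem?
                     (λ (e)
                       (when (<= n 1) (raise e))   ; out of retries: re-raise
                       (sleep pause)
                       (loop (sub1 n) (* 2 pause)))])  ; exponential backoff
      (thunk))))
```

Usage would be e.g. `(with-retries (λ () (rename-file-or-directory a b #t)))`.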

Matthew Butterick

Jun 14, 2019, 11:15:38 PM6/14/19
to Matthew Flatt, Racket Users


On Jun 14, 2019, at 4:34 PM, Matthew Flatt <mfl...@cs.utah.edu> wrote:

Sometimes, staying out of the error-triggering space isn't possible,
and the possible errors are well enough defined by all filesystems, so
retrying is a workable solution. If you really have to be in that
space, then retry an arbitrary number of times, perhaps with an
exponential backoff (i.e., double the delay between retries).

OK, thanks for the encouragement. After more experimentation I got the lock-server idea to work (meaning, my parallel renders now complete without errors, which was not true before).

For the benefit of future generations, two complications tripped me up; the first I understand, the second I don't, but maybe someday:

1) At first I had each rendering place issue its lock requests file by file. This didn't work because each place needed to lock a set of files, and the individual lock / unlock requests could end up interleaved in the lock server, with strange results (some locked, some unlocked). So I bundled each set of files into a "lock transaction" passed in a single place message, so the whole set would get locked or unlocked as a group.

2) I set up each rendering place to repeatedly ask for a lock until it was approved. But even after approving a lock request from a certain rendering place, the lock server would end up with repeat lock requests from that place. To fix this, I needed to create a third response for the lock server: "dude, you already have a lock".
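A hypothetical reconstruction of the server-side reply logic with both fixes (message shapes and state representation are guesses at the idea, not Pollen's actual code):

```racket
#lang racket

(define held (make-hash))   ; worker id -> set of paths that worker holds

(define (handle-lock-request who paths)   ; paths : a set of path strings
  (cond
    ;; repeat request from the current holder: the third response
    [(hash-has-key? held who) 'already-held]
    ;; the requested set overlaps another worker's set: worker retries later
    [(for/or ([(other locked) (in-hash held)])
       (not (set-empty? (set-intersect locked paths))))
     'denied]
    ;; otherwise grant the whole set as one transaction
    [else (hash-set! held who paths) 'granted]))

(define (handle-unlock-request who)
  (hash-remove! held who))   ; releases the whole set at once
```

For example, `(handle-lock-request 'a (set "x" "y"))` yields `'granted`, a subsequent `(handle-lock-request 'b (set "y"))` yields `'denied`, and a repeat `(handle-lock-request 'a (set "z"))` yields `'already-held`.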

Matthew Butterick

Jun 15, 2019, 11:19:25 AM6/15/19
to Matthew Flatt, Racket Users

On Jun 14, 2019, at 8:15 PM, Matthew Butterick <m...@mbtype.com> wrote:

But even after approving a lock request from a certain rendering place, the lock server would end up with repeat lock requests from there.


BTW here's a minimal example showing the puzzling message-garbling behavior between places.

Here, instead of a lock server, I just have a loopback server. What I expect is that each worker sends a request number, and the loopback immediately echoes this number back as a response, so the numbers will always match. 

But that's not what happens. At first, the response numbers match the requests. But then they start coming back mismatched. The error rate increases as the `sleep` value decreases, for instance:

worker request 1:46 got response 1:46 
worker request 0:90 got response 0:90 
worker request 0:91 got response 2:31 ### fail
worker request 2:31 got response 0:91 ### fail
worker request 7:12 got response 7:12 
worker request 1:47 got response 1:47 
worker request 4:19 got response 0:92 ### fail
worker request 0:92 got response 4:19 ### fail
worker request 5:16 got response 5:16 
worker request 0:93 got response 0:93 

The garbling happens in a predictable way: every request is answered, but a closely occurring set of requests and responses will be mismatched among themselves. By adjusting the `sleep` value, it's easy to create a flow rate of messages where almost all of them are garbled.



#lang racket
(provide main)

(define (main)
  (define loopback-p
    (place ch
           (let loop ()
             (place-channel-put ch (place-channel-get ch))
             (loop))))
     
  (define worker-ps
    (for/list ([i (in-range (processor-count))])
      (place ch
             (match-define (list loopback-p wpidx) (place-channel-get ch))
             (let make-request ([count 1])
               (match-define (list wpidxout countout)
                 (place-channel-put/get loopback-p (list wpidx count)))
               (displayln (format "worker request ~a:~a got response ~a:~a ~a"
                                  wpidx count wpidxout countout
                                  (if (and (eq? wpidx wpidxout) (eq? count countout)) "" "### fail")))
               (sleep (* 0.01 (add1 wpidx))) ; multiplier makes workers go at different speeds
               (make-request (add1 count))))))
  
  (for ([(wp wpidx) (in-indexed worker-ps)])
    (place-channel-put wp (list loopback-p wpidx)))
  (map place-wait worker-ps))

Robby Findler

Jun 15, 2019, 11:31:39 AM6/15/19
to Matthew Butterick, Matthew Flatt, Racket Users
Place channels are asynchronous, not synchronous (i.e., you don't have
a two-way rendezvous). So the call to put returns immediately in
`loopback-p` before the other end receives the result. And now the
next iteration of the loop's get is in parallel with the put from the
previous iteration.

A standard way to work with this is to send a channel over in the
first communication and then the communication that's specific to that
initial request happens on that channel. The precise way you set this
up depends on what invariants of the communication you want. Like, you
can create a new thread and just do all the communication there, or
you can keep a list of in-progress communications (in some struct that
holds the channel you're communicating on plus details of the current
state of that "conversation") and then have a big sync that collects
all of the relevant ways in which some communication might make
progress and then uses handle-evt with handlers that loop back after
updating the details of the specific communication that has moved
forward. (This second kind of structure (with the big `sync` and
`handle-evt`) really showcases the beauty of CML, which is where
Racket's evt system comes from (intellectually). It's really amazing
how looking at these kinds of communication in that way can make it
easy to get lots of concurrency and never have race conditions or
other such badness.)
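A thread-level sketch of that big-sync structure (the conversation shapes are hypothetical): one handle-evt per in-progress conversation, each handler advancing just that conversation's state before looping back into a single sync over all of them.

```racket
#lang racket

(define (serve conversations)   ; list of (cons channel state)
  (unless (null? conversations)
    (apply sync
           (for/list ([conv (in-list conversations)])
             (match-define (cons ch state) conv)
             (handle-evt
              ch   ; ready when this client sends its next message
              (λ (msg)
                ;; sequential access to shared state: only one handler
                ;; runs at a time, so no locks are needed around updates
                (define conv* (cons ch (cons msg state)))
                (serve (cons conv* (remq conv conversations)))))))))
```

The server would be started with something like `(thread (λ () (serve initial-conversations)))`.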

hth,
Robby

Matthew Butterick

Jun 15, 2019, 12:58:15 PM6/15/19
to Robby Findler, Racket Users


On Jun 15, 2019, at 8:31 AM, Robby Findler <ro...@cs.northwestern.edu> wrote:

Place channels are asynchronous, not synchronous (i.e., you don't have
a two-way rendezvous). So the call to put returns immediately in
`loopback-p` before the other end receives the result. And now the
next iteration of the loop's get is in parallel with the put from the
previous iteration.


Are place channels "Buffered Asynchronous Channels"? [1] Or just a similar idea? The `place` docs describe place channels as "endpoints for a two-way buffered communication" and "asynchronous" [2], but there is no link to BAC, which makes me think they are different. But reading your description of the behavior of place channels with `sync` and `handle-evt`, now I'm not sure.



Robby Findler

Jun 15, 2019, 1:17:11 PM6/15/19
to Matthew Butterick, Racket Users
I guess one way to look at it is that place-channels have guarantees
that are like [1] when putting and guarantees that are like regular
channels when getting. Or, put another way, you can't sync on sending
with place-channels, only on receiving. So it is a similar idea, yes.
I'm not sure exactly which constraints went into this particular
design point, but I think that synchronous receive and async send is a
natural point in the design space in some vague general sense.

Robby

Matthew Butterick

Jun 21, 2019, 7:34:06 PM6/21/19
to Robby Findler, Racket Users
On Jun 15, 2019, at 8:31 AM, Robby Findler <ro...@cs.northwestern.edu> wrote:

A standard way to work with this is to send a channel over in the
first communication and then the communication that's specific to that
initial request happens on that channel.

This is probably the crudest possible implementation of your suggestion, but this version avoids mismatched requests & responses. The `loopback-p` makes a table of the worker channels at the outset. After that, as it gets requests, it dispatches the response directly to the worker that issued the request.

(I'm not asking you to review this code, I'm just posting this for curious people of the future.)


#lang racket
(require racket/place)
(provide main)

(define (main)
  (define loopback-p
    (place ch
           (let ([chs (make-hasheq)])
             (let loop ()
               (match (place-channel-get ch)
                 [(list wpidx (? place-channel? wp))
                  (hash-set! chs wpidx wp)
                  (place-channel-put wp 'ack)]
                 [(list wpidx count)
                  (place-channel-put (hash-ref chs wpidx) (list wpidx count))])
               (loop)))))
     
  (define worker-ps
    (for/list ([i (in-range (processor-count))])
              (place ch
                     (match-define (list loopback-p wpidx wp) (place-channel-get ch))
                     (place-channel-put loopback-p (list wpidx wp))
                     (place-channel-get ch) ; wait for 'ack
                     (let make-request ([count 1])
                       (place-channel-put loopback-p (list wpidx count))
                       (match-define (list wpidxout countout)
                         (place-channel-get ch))
                       (displayln (format "worker request ~a:~a got response ~a:~a ~a"
                                          wpidx count wpidxout countout
                                          (if (and (eq? wpidx wpidxout) (eq? count countout)) "" "### fail")))
                       (sleep (* 0.01 (add1 wpidx))) ; multiplier makes workers go at different speeds
                       (make-request (add1 count))))))
  
  (for ([(wp wpidx) (in-indexed worker-ps)])
       (place-channel-put wp (list loopback-p wpidx wp)))
  (map place-wait worker-ps))

Matthew Butterick

Jun 22, 2019, 12:02:32 AM6/22/19
to Robby Findler, Racket Users


On 06 15 19, at 8:31 AM, Robby Findler <ro...@cs.northwestern.edu> wrote:

A standard way to work with this is to send a channel over in the
first communication and then the communication that's specific to that
initial request happens on that channel. The precise way you set this
up depends on what invariants of the communication you want. Like, you
can create a new thread and just do all the communication there


And here's a threaded version, which I agree is neater, because by closing over each worker channel, the threads make other channel-tracking housekeeping unnecessary.

#lang racket
(require racket/place)
(provide main)

(define (main)
  (define loopback-p
    (place lch
           (let loop ()
             (define wp (place-channel-get lch))
             (thread (λ () (let loop ()
                             (place-channel-put wp (place-channel-get wp))
                             (loop))))
             (loop))))
     
  (define worker-ps
    (for/list ([i (in-range (processor-count))])
      (place wch
             (let make-request ([count 1])
               (define countout (place-channel-put/get wch count))
               (displayln (format "worker request ~a:~a got response ~a:~a ~a"
                                  0 count 0 countout
                                  (if (eq? count countout) "" "### fail")))
               (make-request (add1 count))))))
  
  (for ([worker-p (in-list worker-ps)])
    (place-channel-put loopback-p worker-p))
  
  (map place-wait worker-ps))

Robby Findler

Jun 22, 2019, 9:14:18 AM6/22/19
to Matthew Butterick, Racket Users
Thanks!

One thing you can do, when the place-specific communication takes
multiple steps, is to, like you did in the first example, put the
channels into a hash, but then on each iteration of the loop, pull all
of them out and put them into a giant select that takes the relevant
step forward for that specific piece of the communication (you use
handle-evt to wrap the successful communication with the state update
and then the call to `loop` back for some other step of
(potentially) some other communication).

The advantage to this kind of setup is that you have sequential access
to some shared state that is kept entirely on the server side. So you
completely avoid the conventional pitfalls of mutable state combined
with concurrency.

For example, if your main loop was the one that was parcelling out,
say, files to compile, you could keep a list of the pending files to
compile on the server and when each worker finished with a file or
encountered a new file, it would let the server know and the server
would tell it what to do, eg "oh, wait, someone else is compiling that
file" or "go ahead, I've noted that you're the one that's compiling
that file and I'll tell others who also need it to wait for you to
finish". In the latter case, the worker that was waiting would also
get an evt that would become ready when the file was finished compiling
(which would also be mediated through the server).
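A thread-level sketch of that exchange (names hypothetical; since evts can't be sent over place channels, a real places version would route the "finished" notification through the server instead):

```racket
#lang racket

(define in-progress (make-hash))   ; file -> semaphore posted on completion

;; Ask to compile: either you own the file ('yours), or you get back an
;; evt that becomes ready once the current owner finishes.
(define (request-compile file)
  (cond
    [(hash-ref in-progress file #f) => semaphore-peek-evt]
    [else (hash-set! in-progress file (make-semaphore 0))
          'yours]))

;; The owner reports completion; the peek evt wakes every waiter
;; without consuming the semaphore count.
(define (finished-compile file)
  (semaphore-post (hash-ref in-progress file)))
```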

Robby

Matthew Butterick

Jun 22, 2019, 11:45:36 PM6/22/19
to Robby Findler, Racket Users


On 06 22 19, at 6:14 AM, Robby Findler <ro...@cs.northwestern.edu> wrote:

One thing you can do, when the place-specific communication takes
multiple steps is to, like you did in the first example, put the
channels into a hash, but then on each iteration of the loop, pull all
of them out and put them into a giant select that takes the relevant
step forward for that specific piece of the communication

Not sure what you mean by "giant select" — ought that be "giant sync"? But as for "sequential access to some shared state": in this toy example, maybe more along the lines of your suggestion, the server makes a random sequence of workers and triggers the workers in turn, so none of them are ever working at the same time. (So they could, in principle, all be writing to a single file.)

I can imagine a more nuanced version involving each worker checking out sets of files, and the server permitting progress by more than one worker at a time, so long as the file sets didn't overlap.


#lang racket
(require racket/place racket/dict)
(provide main)

(define (main)
  (define server-p
    (place sch
           (define wp-dict (place-channel-get sch))
           (let loop ()
             (define wp-dict-shuffled (shuffle wp-dict))
             (displayln (format "order: ~a" (dict-keys wp-dict-shuffled)))
             (for ([(widx wp) (in-dict wp-dict-shuffled)])
               (displayln (format "server: start ~a" widx))
               (place-channel-put wp widx)
               (apply sync (dict-values wp-dict)))
             (loop))))
     
  (define worker-ps
    (for/list ([i (in-range (processor-count))])
      (place wch
             (let loop ()
               (define widx (place-channel-get wch))
               (sleep 0.2)
               (displayln (format "worker ~a: done" widx))
               (place-channel-put wch 'done)
               (loop)))))

  (place-channel-put server-p (for/list ([(wp widx) (in-indexed worker-ps)])
                                (cons widx wp)))
  
  (for-each place-wait worker-ps))

Robby Findler

Jun 23, 2019, 4:12:21 PM6/23/19
to Matthew Butterick, Racket Users
On Sat, Jun 22, 2019 at 9:45 PM Matthew Butterick <m...@mbtype.com> wrote:


On 06 22 19, at 6:14 AM, Robby Findler <ro...@cs.northwestern.edu> wrote:

One thing you can do, when the place-specific communication takes
multiple steps, is to, like you did in the first example, put the
channels into a hash, but then on each iteration of the loop, pull all
of them out and put them into a giant select that takes the relevant
step forward for that specific piece of the communication

Not sure what you mean by "giant select" — ought that be "giant sync"?

Right, thanks. 


But as for "sequential access to some shared state" — In this toy example, maybe more along the lines of your suggestion, the server makes a random sequence of workers and triggers the workers in turn, so none of them are ever working at the same time. (So they could, in principle, all be writing to a single file.) 

I can imagine a more nuanced version involving each worker checking out sets of files, and the server permitting progress by more than one worker at a time, so long as the file sets didn't overlap.



I'm attaching an example of what I was trying to get at for your consumption. Hope you find it useful.

Robby
 
places-example.rkt