looking for advice on a web background process manager


Wayne Harris

Dec 30, 2021, 3:33:23 PM
to racket users
I'm considering writing a manager for background processes --- such as sending a batch of e-mail or some other process that takes a while to finish --- for a web system.

I see the challenge here as just writing something that will look like a
very basic UNIX shell --- so I'll call it ``web-api-shell'' from now on.
(``Web'' because it will be used by a web system through some HTTP API.)

This thing has to be flawless.  I'm looking for design principles and advice.

I don't know which language I will use, but I'd like to use Racket at
least as a prototype.  I am looking at section 15.4 at


and I'm not sure it gives me all the control I need.  I have a
lower-level view of the job --- fork(), execve(), waitpid(), SIGCHLD.  But I
suppose Racket handles this much more elegantly than I would in C.

Your advice will be much appreciated.

(*) Where will it run

It will run on GNU systems running the Linux kernel.

(*) My own thoughts

The interface to shell will be through HTTP requests, so this shell will
likely be a web server of some sort.  But before I get involved in the
web at all, I need the shell working flawlessly.

So I need a laboratory first.  I could write a program that reads some
named pipe on disk to get commands such as ``run this or that'' while I
work.  (Later I can write a web server replacing this named-pipe
interface.)

Just like a UNIX shell, this web-api-shell must know about every process
it runs.  I suppose the work is essentially fork(), execve() followed by
waitpid().
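
As a rough sketch of that fork(), execve(), waitpid() cycle, here it is in Python rather than Racket or C, only because Python's os module exposes the same system calls almost one-to-one. The helper name run_and_wait is made up for illustration, and a real manager would also have to track several children at once (for example by reacting to SIGCHLD) instead of blocking on one.

```python
import os


def run_and_wait(argv):
    """Fork, exec argv in the child, and return (pid, exit code).

    Hypothetical helper for illustration only. A negative exit code
    means the child was terminated by that signal number.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image. execvp only returns
        # (by raising OSError) if the program could not be started.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # conventional "command not found" status

    # Parent: block until this specific child terminates.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return pid, os.WEXITSTATUS(status)
    return pid, -os.WTERMSIG(status)
```

The parent learns the child's PID from fork() itself, which is what lets the shell keep a table of everything it started.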

One concern I have is the following.  Is it possible for a process to
simply ``get out of'' the shell?  What do I mean by that?  A process
that does fork() and puts itself into background would leave the
web-api-shell's control, wouldn't it?

I think I must avoid that.  Perhaps I can't let just any process run.
Perhaps the web-api-shell must only offer a few processes carefully
written by myself --- so that I know they won't put themselves in
background.  (For instance, I can't let them change PIDs, otherwise I
won't have any idea who they are and that's a mess.  I'd love to somehow
restrict system calls such as fork().)
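
One partial answer to the "getting out" problem is the process group. The sketch below (Python again, with hypothetical helper names spawn_in_group and kill_group) starts each job in its own session, which also makes it the leader of a new process group. Anything the job forks stays in that group by default, so the whole group can be signalled at once, even after the leader has exited. This is only a sketch under that assumption: a job that calls setsid() or setpgid() itself still leaves the group, and containing that would need something stronger, such as Linux cgroups.

```python
import os
import signal
import subprocess


def spawn_in_group(argv, **popen_kwargs):
    """Start argv as leader of a new session and process group.

    Hypothetical helper. start_new_session=True makes the child call
    setsid() before exec, so its process group ID equals its PID.
    """
    return subprocess.Popen(argv, start_new_session=True, **popen_kwargs)


def kill_group(proc, sig=signal.SIGTERM):
    """Send sig to every process still in the job's process group."""
    try:
        os.killpg(proc.pid, sig)  # group ID == leader's PID (see above)
    except ProcessLookupError:
        pass  # no process is left in the group
```

For example, `kill_group(spawn_in_group(["sh", "-c", "sleep 30 & wait"]))` terminates both the shell and the `sleep` it put in the background.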

(*) Serialization

I also think this web-api-shell must not be invoked in parallel.  So I
guess I must use some queue of requests with no race condition and
pull each request as it comes.  Any pointers on how to do this basic
thing with my zero experience?
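
A minimal sketch of that serialization idea, in Python and with made-up names (start_worker, the requests queue, the results list): any number of producers may put requests on a thread-safe queue, but a single worker thread pulls them off and runs them one at a time, so no two jobs are ever started in parallel. A named pipe or an HTTP handler would simply be another producer calling put().

```python
import queue
import subprocess
import threading


def start_worker(requests, results):
    """Run queued commands strictly one after another.

    Hypothetical helper: `requests` is a queue.Queue of argv lists,
    `results` a list that receives (argv, exit_code) pairs.
    Putting None on the queue asks the worker to stop.
    """
    def loop():
        while True:
            argv = requests.get()  # blocks until a request arrives
            if argv is None:
                break
            done = subprocess.run(argv)  # waits for the job to finish
            results.append((argv, done.returncode))

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker
```

Because only the worker thread ever starts a job, requests are handled in arrival order without any explicit locking in the caller.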

(*) What is my level of training?

In the past I've studied many parts of

  Advanced Programming in the UNIX Environment
  W. Richard Stevens

I will definitely have to read it again to get to work on this project.
Can you mention any UNIX concepts that are of great relevance for this
project?  I don't think I ever got my mind wrapped around things like
sessions, session leaders and so on.  Are these concepts relevant to
this application?

Thank you very much.

Philip McGrath

Jan 5, 2022, 4:35:12 PM
to Wayne Harris, racket users
Hi,

As you suspected, in Racket, the approach will be significantly different than in C—hopefully, safer and more elegant!

The most basic concepts you will need to learn about are Racket's "green" threads (not OS/POSIX threads) and "synchronizable events," which are based on Concurrent ML. A good place to start would be the tutorial introduction "More: Systems Programming with Racket": https://docs.racket-lang.org/more/index.html

You will surely also want to read the Racket Guide chapter on "Concurrency and Synchronization" (https://docs.racket-lang.org/guide/concurrency.html) and the associated Racket Reference sections. I have found Matthew and Robbie's paper “Kill-Safe Synchronization Abstractions” (https://www.cs.utah.edu/plt/publications/pldi04-ff.pdf) an accessible introduction to "synchronizable events" and the Concurrent ML implementation. Andy Wingo, the Guile maintainer, has some blog posts about how Concurrent ML's approach to concurrency works under the hood: https://wingolog.org/archives/2017/06/29/a-new-concurrent-ml For maximum detail, I can also recommend John Reppy's book Concurrent Programming in ML from Cambridge UP (most recently revised in 2007, IIUC).

Once you understand Racket's approach to synchronization in general, I think the way OS-level facilities (especially processes: https://docs.racket-lang.org/reference/subprocess.html) interact with them will make more sense. In particular, when we say that something is "blocking" in Racket, we mean that it blocks a Racket-level green thread, while allowing other Racket threads implemented by the same OS thread to run concurrently. (It is possible to block the whole process using unsafe functionality from the FFI, but that would generally be a bug.) This means that even the simple `system*/exit-code` function, wrapped in a call to `thread`, can do very well for running a background process, especially in combination with `current-subprocess-custodian-mode` and `subprocess-group-enabled`.

It's hard to give much more specific advice without knowing more about your requirements. I guess I'd say that overall, because Racket provides such powerful facilities for concurrent programming, I find that I can often just write the code I want, without needing indirection through a general-purpose background task manager. For example, a web application can very reasonably send email just by using the `net/sendmail` or `net/smtp` libraries in a function wrapped in a Racket-level thread, perhaps even (depending on the behavior you want) directly in the per-request thread created for you by the Racket web server.

For a different approach, while I don't know much about the implementations, you might be interested in the GNU Daemon Shepherd and Mcron, an init system and cron implementation written in Guile: see https://www.gnu.org/software/shepherd/manual/html_node/ and https://www.gnu.org/software/mcron/manual/html_node/index.html.

-Philip

Stefan Schwarzer

Jan 5, 2022, 7:31:38 PM
to racket...@googlegroups.com
On 2021-12-30 21:33, 'Wayne Harris' via Racket Users wrote:
> I'm considering writing a manager for background processes --- such as sending a batch of e-mail or some other process that takes a while to finish --- for a web system.
>
> I see the challenge here as just writing something that will look like a
> very basic UNIX shell --- so I'll call it ``web-api-shell'' from now on.
> (``Web'' because it will be used by a web system through some HTTP API.)
>
> This thing has to be flawless. I'm looking for design principles and advice.
> [...]

Have you looked at existing task queues and message brokers?
I guess they can already give you a part of the robustness
you're looking for. It would be a pity to reinvent the wheel,
especially since it will probably be quite difficult and
time-consuming to implement this robustness yourself.

But it might as well be that I misunderstand your requirements.
:-)

Stefan

David Storrs

Jan 6, 2022, 11:59:39 AM
to Racket Users
Speaking of existing task managers:  https://docs.racket-lang.org/majordomo2/index.html   </shameless type="self-promotion">


