Concurrent Release Builds/Compile File Lock

Brendan Ball

Jan 27, 2022, 3:47:44 AM
to elixir-lang-core
Hi

We've been getting weird errors when building releases:
```
Compiling 42 files (.ex)
** (File.Error) could not remove files and directories recursively from "/home/brendan/dev/app/_build/prod/lib/shared_app/priv": file already exists
    (elixir 1.13.2) lib/file.ex:1292: File.rm_rf!/1
    (mix 1.13.2) lib/mix/utils.ex:444: Mix.Utils.symlink_or_copy/3
    (mix 1.13.2) lib/mix/project.ex:738: anonymous fn/5 in Mix.Project.build_structure/2
    (elixir 1.13.2) lib/enum.ex:2396: Enum."-reduce/3-lists^foldl/2-0-"/3
    (mix 1.13.2) lib/mix/project.ex:737: Mix.Project.build_structure/2
    (mix 1.13.2) lib/mix/tasks/compile.all.ex:34: Mix.Tasks.Compile.All.run/1
    (mix 1.13.2) lib/mix/task.ex:397: anonymous fn/3 in Mix.Task.run_task/3
    (mix 1.13.2) lib/mix/project.ex:396: Mix.Project.in_project/4
```

We build different releases concurrently after running a single `mix compile`.
It turns out that the different release builds were not running in isolated environments but were sharing a filesystem. The error is confusing, especially since most of our devs didn't know that the release builds ran in a shared environment and so had no idea why the build was breaking.

This seems like something that might happen fairly often in CI environments, so I really think it's in everyone's interest to improve this.
Most good build systems/compilers implement some kind of file lock for operations that can't be done concurrently. I think implementing this would be a minimal acceptable solution. Ideally the lock would make a second process wait until the operation can be performed, but even a clear error saying that these operations can't run in parallel would be better than what we have now.

Thoughts?

Kind regards
Brendan Ball
 

José Valim

Feb 16, 2022, 5:09:07 AM
to elixir-lang-core
My biggest concern with a file lock is: what happens if the other process terminates unexpectedly and does not remove the file lock? How can we ensure this won't happen?

Brendan Ball

Feb 16, 2022, 5:49:14 AM
to elixir-lang-core
I am looking at the Rust Cargo implementation since I know that works: https://github.com/rust-lang/cargo/blob/master/src/cargo/util/flock.rs
I believe these OS APIs take care of removing file locks on process exit.
Locks can be implemented on a best-effort basis for platforms/environments that support them. E.g. it seems that Cargo doesn't implement locks on NFS filesystems.
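
To make the point about process exit concrete, here is a minimal sketch in Python (not Elixir, since Erlang/OTP does not expose these calls yet) of how an OS-level advisory lock behaves on a POSIX system. It assumes `flock(2)` semantics on a local filesystem. The lock file name `build.lock` and the helpers `try_lock` and `demo` are made up for illustration; this is not how Cargo or Mix does it.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Child program: takes the lock, reports it, then "hangs" like a stuck build.
CHILD = """
import fcntl, sys, time
f = open(sys.argv[1], "a")
fcntl.flock(f, fcntl.LOCK_EX)      # blocking exclusive lock
print("locked", flush=True)
time.sleep(60)
"""


def try_lock(path):
    """Try to take an exclusive advisory lock without waiting.

    Returns the open file object (which holds the lock) or None if
    another process currently holds it.
    """
    f = open(path, "a")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f
    except BlockingIOError:
        f.close()
        return None


def demo():
    """Return (held_elsewhere, reacquired) for a lock whose owner is killed."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "build.lock")  # hypothetical lock file name
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD, path],
            stdout=subprocess.PIPE,
            text=True,
        )
        child.stdout.readline()  # wait until the child holds the lock
        held_elsewhere = try_lock(path) is None

        # Simulate an unexpected termination: no cleanup code runs in the child.
        child.kill()
        child.wait()
        child.stdout.close()

        f = try_lock(path)
        reacquired = f is not None
        if f is not None:
            f.close()  # closing the file releases our lock
        return held_elsewhere, reacquired


if __name__ == "__main__":
    print(demo())
```

The lock file itself can be left on disk after a crash. What matters is the kernel-held lock on the open file, which goes away when the owning process dies, so a stale file does not block the next build. Dropping `LOCK_NB` gives the "wait until the operation can be performed" behaviour instead of an immediate failure.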

José Valim

Feb 16, 2022, 9:42:09 AM
to elixir-lang-core
That's perfect. I believe the first step is to get those operations into Erlang/OTP, likely under the file module. Would you like to send a PR for that? I can probably provide some guidance around it. :)

Brendan Ball

Feb 17, 2022, 1:41:00 AM
to elixir-lang-core
Sure. This would be my first time contributing to Erlang/OTP, so I would definitely appreciate some guidance.

José Valim

Feb 17, 2022, 1:44:48 AM
to elixir-lang-core
They just published a new development guide page with OTP 25: https://github.com/erlang/otp/blob/master/HOWTO/DEVELOPMENT.md

If you are on IRC, feel free to ping me on #elixir-lang on Libera. Otherwise, reach me in the Erlang Ecosystem Foundation Slack (you can create an account here for free, then go to settings to get an invite) or email me privately for further discussion!
