Early reactions to Rust


Jonathan S. Shapiro

Jan 3, 2026, 4:05:03 PM
to cap-talk
I've done enough coding in Rust at this point to start to have a sense of it. Most of my searches at this point are "find something in the package ecosystem" sorts of things. Some early comments and thoughts - and I acknowledge at the outset that some of these are "it's a work in progress, put up or shut up" sorts of things. Some of them probably reflect misunderstandings.

I emphasize the "early" part of that - this is definitely a case of "thoughts arising from the first dawn of use."

ChatGPT This isn't strictly a Rust thing, but I've found ChatGPT surprisingly capable of answering in-depth questions that a PL designer might ask. I've used ChatGPT enough that I can usually spot bullshit responses, which helps. When asked "Explain the inference rules for lifetime variables in Rust", it did a pretty credible job. Caveat: some of that may be because I learned a bunch about region types from Cyclone.

Brevity Rust does not suffer from an economy of text. I don't want to get bogged down in surface syntax issues here, but I personally find it ugly to a degree that is distracting. For whatever reason, the angle bracket syntax for type variables bugs me. To each their own.

Rust Analyzer This is a hugely useful tool. Without it, Rust would be much harder to use. It may be the best tool I've seen at explaining what's going on in the presence of type inference. Somebody put some very serious thought into this, and I'm both appreciative and impressed. Editors were so much less capable 20 years ago that this didn't even occur to us while we were working on BitC. Really nice to see.

Borrow Checker You hear a lot of people talking about the borrow checker and how it is hard to understand. I've hit a couple of cases where an item was retired by a copy I didn't recognize, but the diagnostics from the compiler are quite good at explaining what happened - I'm sure this wasn't always the case. For people coming from other languages, this violates intuitions about scope and lifetime, but the compiler diagnostics make it learnable.

I've hit two issues, one mostly just surprising and the other annoying. The surprising one is the tendency to introduce borrows of borrows of borrows that aren't necessary. Hypothesis: if "&(**foo)" works, the analyzer probably shouldn't have suggested deeper borrowing. I haven't looked at the generated code yet, but I'm left wondering if cascading borrows turn into cascading dereferences at the instruction level. Maybe better: I'm wondering if the dereferencing '*' should be understood as a borrow dereference, a pointer dereference, or both simultaneously.

The annoying one has to do with indexing patterns. As test projects, I've re-built the regdef utility and a cargo extension to manage multi-workspace builds (cargo-mk). cargo-mk wants to do dependency ordering, which leaves me with two indexing structures: a HashMap for name lookup and a vector for order-of-build. There's a cycle check in there, and it would be very natural in other languages to have a boolean field saying "we've already checked this one, and it is cycle free". But this requires mutability, and Rust won't let you do that straightforwardly when the data structure is indexed in more than one way. I'm aware of several workarounds, and I used one. Still, it's surprising how many simple graph algorithms are defeated by the "one mutable reference XOR multiple read-only references" rule.

A commonly cited workaround is to maintain a vector of mutable structures. Usually unspoken is that this requires boxing lest the vector grow, and that this tends to make you choose between lifetimes, storage reclamation, and modularity boundaries.
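For concreteness, here is a minimal sketch of one of those workarounds (all names are illustrative, not my actual cargo-mk code): keep a single owning Vec, and let the second index hold usize positions rather than references, so the only mutable access goes through the owner.

```rust
use std::collections::HashMap;

struct Pkg {
    name: String,
    deps: Vec<usize>, // indices into `nodes`, not references
    cycle_free: bool,
}

struct Graph {
    nodes: Vec<Pkg>,                 // single owner of every node
    by_name: HashMap<String, usize>, // the second "index" stores positions
}

impl Graph {
    fn add(&mut self, name: &str) -> usize {
        let i = self.nodes.len();
        self.nodes.push(Pkg { name: name.to_string(), deps: vec![], cycle_free: false });
        self.by_name.insert(name.to_string(), i);
        i
    }

    fn mark_cycle_free(&mut self, name: &str) {
        if let Some(&i) = self.by_name.get(name) {
            self.nodes[i].cycle_free = true; // one &mut, always through the owner
        }
    }
}

fn main() {
    let mut g = Graph { nodes: vec![], by_name: HashMap::new() };
    g.add("regdef");
    g.mark_cycle_free("regdef");
    assert!(g.nodes[0].cycle_free);
}
```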

Overall, my suspicion is that there are only a few patterns that get impacted substantively by borrowing. The one everyone knows about is cyclic structures; the other is multi-collection with item mutability. The nice thing about this is that the two can be explained along with workarounds.

The borrow checker, taken overall, is not as painful as people like to make it. I intentionally used two torture test programs as my learning tool, and I wasn't unduly impeded. That said, I'd have been impeded a lot more if my exposure to programming languages wasn't as broad or I didn't understand typestate graphs. It's nice to see Rob Strom and Shaula Yemini's work showing up so prominently in a mainstream language.

Lifetime Variables This is one of the weak spots, but it's mainly a matter of missing documentation. I think these are actually region type variables (c.f. Cyclone), but even now I'm not sure. When the compiler demands introduction of lifetime variables, it doesn't say why, and the inference rules are not clear.

I find that if I assume they are actually region type variables and decorate accordingly I can get by, but it would be really nice (a) to have that intuition confirmed or corrected, and (b) to better understand what the inference rules are. If I build a list of Box<& 'a Foo>, where Foo has lifetime 'a, what lifetime is assigned to the boxes? I'm sure the answer is clear, but it seems to be on the list of things that The Rust Book doesn't (yet) explain.

Region/lifetime inference is hard to explain in error messages because it uses ordered constraints rather than equality constraints.

I do find myself wondering if first-class regions might find a role in Rust at some point.

The 'static Lifespan

The 'static lifespan, which the Coyotos kernel and any fast tokenizer are going to rely on a lot, is subtly dangerous. I don't have an intrinsic problem with this - it's doing what I told it to do - but (contrary to documentation) it isn't safe. It's surprisingly easy to generate crashes that the compiler does not diagnose.

If V is a vector, and Vslice is a slice of that vector, then any operation that re-sizes V invalidates Vslice leaving dangling pointers. For ordinary lifespans this is diagnosed by the compiler as a lifetime violation, because the slice lifetimes must exit before the vector lifetime, but for 'static there appears to be a reductio bug (I need to test this).

Since 'static is (effectively) the "forever" lifespan, it's reasonable to imagine that the reference lifespan induction "grounds out" at 'static, and the compiler therefore allows a 'static slice to be constructed from a 'static vector. The problem with this is that it isn't correct when the 'static vector is mutable and can be resized.

There's an idiom that can ensure a vector is not resizable, and this isn't going to be a problem in Coyotos because the relevant vectors are never resized. I was merely surprised that this lifetime issue wasn't diagnosed.
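One such idiom, as a sketch (not necessarily the only one): convert the Vec into a boxed slice once it is fully built, after which there is no resize operation left to invalidate slices.

```rust
fn freeze(v: Vec<u8>) -> Box<[u8]> {
    // After this point the buffer has a fixed length; it can still be
    // mutated element-wise, but it cannot grow or reallocate.
    v.into_boxed_slice()
}
```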

Exports Puzzle Me

If we have a workspace containing crates A and B-lib, where A depends on B-lib, we may not want to export B-lib public items to consumers outside the workspace. Same issue for a package having multiple sub-crates. It may also be true that we have fields that should be visible within the package but not visible to the public.

I think this is something I don't understand yet, but it provisionally feels to me as if visibility is half-baked. I obviously need to read up.
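For reference, the visibility granularity that does exist (as I currently understand it) is per module within a crate; a sketch:

```rust
mod inner {
    pub struct Widget {
        pub name: String,   // visible wherever Widget is visible
        pub(crate) id: u32, // visible anywhere in this crate, hidden from consumers
        secret: u64,        // visible only inside `inner` and its children
    }

    pub(crate) fn crate_only_helper() -> u32 {
        42
    }
}

// Note: there appears to be no form that says "visible to crate A but not to
// the world", which is exactly the workspace case above.
```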

It would be really helpful to have a tool I could point at a crate, package, or workspace that would tell me what symbols are exported. Does such a tool exist?

Cargo

I find Cargo mildly fascinating. In some places - notably build traceability - it is downright obsessively developed. In others it seems to be more of a work in progress. It's adequate for building applications, but doesn't address multi-workspace builds or building for multiple architectures. It's going to be entertaining to see how one handles testing for so-called "Canadian cross" scenarios. I suspect there are things to be learned from Yocto here.

Cargo is regrettably inconsistent in some cases. If builds are "from scratch" or "based on dependencies", that should be the rule for all builds. Document builds notably violate this. Similarly for "clean". I can see the use case for "clean --but-not-docs", but that shouldn't be the default. This is how we ended up with the "make clobber" convention in makefiles.

The Coyotos tree is looking like it will end up having many workspaces. I eventually decided to build a cargo plugin to deal with this, though I may yet adopt the build.rs mechanism if there's a good reason to do it. For now, I'm going to stick with what I have in the interest of progress.

The main issue I see with build.rs is that it causes common tools to get re-built in every consuming build. If capidl is bound as a library, then it is compiled into the build.rs binary in every consuming workspace. It is totally unclear to me why that should be necessary even given the desire for artifact traceability. And it has the (perhaps unintended) consequence that pre-build artifacts are successfully obscured from the human developer - one must dig into build.rs to find out what they are.

I need to understand the rationale of build.rs to understand why people thought this was a good choice. Maybe I'm just being a curmudgeon because building tools as libraries (and therefore building them over and over) seems unfortunate. On the other hand, disk is a lot cheaper than it once was.



Jonathan

Rob Meijer

Jan 3, 2026, 4:35:10 PM
to cap-...@googlegroups.com
I think the lifetime management in Rust is very good when building a monolithic Rust project. In that setting, coming from C++, I never felt it was worse than C++ in any way, while it was better in some respects. In that context it felt like Rust was C++ done right.

But then I needed to do something I had done in C++ hundreds of times: write language bindings for my Rust code for other languages, and I just couldn't wrap my head around cross-language-boundary ownership anymore. This went so far that I basically had to decide to throw away 8 months of work and start over in C++, a language I thought I wouldn't touch again after 3 years of being quite contented with Rust.

And not just native. Even when squeezing language bindings through WASM, good (or bad) old C++-generated language bindings still fit mentally, while in Rust I feel I'm building a three-tier system made of rocks at the bottom, steel at the top, and duct tape and tie wraps in the middle.

I'm not going to say Rust is bad - it's probably the best systems language out there at the moment - but the ownership model combined with language bindings? Maybe I'm missing some crucial connection, but for me that is one crucial area where I feel Rust may not yet be mature enough.



Matt Rice

Jan 3, 2026, 4:37:02 PM
to cap-...@googlegroups.com
On Sat, Jan 3, 2026 at 9:05 PM Jonathan S. Shapiro
<jonathan....@gmail.com> wrote:
>
> Exports Puzzle Me
>
> If we have a workspace containing crates A and B-lib, where A depends on B-lib, we may not want to export B-lib public items to consumers outside the workspace. Same issue for a package having multiple sub-crates. It may also be true that we have fields that should be visible within the package but not visible to the public.
>
> I think this is something I don't understand yet, but it provisionally feels to me as if visibility is half-baked. I obviously need to read up.
>
> It would be really helpful to have a tool I could point at a crate, package, or workspace that would tell me what symbols are exported. Does such a tool exist?
>

A lot to respond to, for now I'll just respond to this part.
There really is nothing like "friend" classes in Rust's visibility mechanism: items are either public or private (scoped, at most, to the crate), with no way to grant access to one specific external crate. About the most complicated visibility gets is when dealing with sealed traits; there is a good article here.

https://predr.ag/blog/definitive-guide-to-sealed-traits-in-rust/

The only thing I would add to that article is that you can also combine the powers of sealed traits with Cargo's "features", so you can say a trait is sealed unless a feature is enabled, and so on. Features in Rust must be additive, and it's a common pitfall to break things by making features non-additive.

But these can sometimes be (ab)used to indicate that a certain API is only public because it needs to be called from a specific crate. I tend to recommend avoiding it if possible, keeping code in the same crate so that private API can stay private. But sometimes it cannot really be helped.
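As a rough sketch of the shape of that pattern (simplified, and assuming a feature named "_unstable_api" is declared in Cargo.toml; not literally the grmtools code):

```rust
// The supertrait lives in a module that is private by default, so outside
// crates cannot name it and therefore cannot implement `Unstable`...
#[cfg(not(feature = "_unstable_api"))]
mod sealed {
    pub trait Sealed {}
}

// ...but when the hidden feature is enabled, the module is exported, and the
// one crate that needs to (e.g. generated code) can implement the trait.
#[cfg(feature = "_unstable_api")]
pub mod sealed {
    pub trait Sealed {}
}

pub trait Unstable: sealed::Sealed {
    fn unstable_entry_point(&self);
}
```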

Here is an example where I've used that: the yacc implementation generates a bunch of code which calls some unstable API. We name the feature "_unstable_api", with a leading underscore so the feature can be hidden from `rustdoc`.

https://github.com/softdevteam/grmtools/blob/master/lrpar/src/lib/mod.rs#L239-L304

As far as tools which tell you which symbols are exported, there is, as far as I know, just `cargo doc` and `cargo doc --document-private-items`; these also take `--features` or `--all-features`.

Anyhow, this combination of sealed traits and hidden features is the only real "principled" mechanism I've found for carving out private API between crates. Hope that helps.

Kevin Reid

Jan 3, 2026, 6:46:13 PM
to cap-...@googlegroups.com
On Sat, Jan 3, 2026 at 1:05 PM Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
Borrow Checker You hear a lot of people talking about the borrow checker and how it is hard to understand. I've hit a couple of cases where an item was retired by a copy I didn't recognize, but the diagnostics from the compiler are quite good at explaining what happened - I'm sure this wasn't always the case.

Indeed, the rustc developers realized this would be a problem, and put a lot of effort into building helpfully structured error messages.

I've hit two issues, one mostly just surprising and the other annoying. The surprising one is the tendency to introduce borrows of borrows of borrows that aren't necessary. Hypothesis: if "&(**foo)" works, the analyzer probably shouldn't have suggested deeper borrowing.

Can you say more about where you are seeing this suggestion? In most cases (not all!), either one `&` or one `*` is sufficient. Perhaps you are confusing rust-analyzer’s “inlay hints”, that are telling you what borrowing is automatically happening, with suggestions to add it explicitly?

cargo-mk wants to do dependency ordering, which leaves me with two indexing structures: a HashMap for name lookup and a vector for order-of-build. There's a cycle check in there, and it would be very natural in other languages to have a boolean field saying "we've already checked this one, and it is cycle free". But this requires mutability, and rust won't let you do that straightforwardly when the data structure is indexed in more than one way. I'm aware of several workarounds and I used one.

I would say that the most appropriate solution *for a marking problem* is to use interior mutability — either Cell<bool> (for single-threaded-only programs) or AtomicBool.
 
Lifetime Variables This is one of the weak spots, but it's mainly a matter of missing documentation. I think these are actually region type variables (c.f. Cyclone), but even now I'm not sure.

I’m not up on programming-language-theory enough to tell you the right technical terms, but the way I think about it is that a lifetime (not a lifetime variable!) specifies an end-point in time, at which some set of references becomes invalid (when they must not be used). Note I said “end-point” — this is because the start-point is defined by the reference value coming to exist. (Therefore, it is possible to introduce new references under already-existing lifetimes, as long as they will continue to be valid for long enough.)
 
When the compiler demands introduction of lifetime variables, it doesn't say why, and the inference rules are not clear.

The inference rules for contexts where an explicit variable might or might not be required are the lifetime elision rules: <https://doc.rust-lang.org/reference/lifetime-elision.html#lifetime-elision-in-functions>
Note that while these rules involve types, they are driven purely by the grammar of types (that is, “does this generic type have a lifetime parameter?”), not by more elaborate type inference.
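For example, the elision rules in action, driven purely by the shape of the signature:

```rust
// As written:
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// What the compiler elaborates it to, from the signature alone:
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}
```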
 
I find that if I assume they are actually region type variables and decorate accordingly I can get by, but it would be really nice (a) to have that intuition confirmed or corrected, and (b) to better understand what the inference rules are. If I build a list of Box<& 'a Foo>, where Foo has lifetime 'a, what lifetime is assigned to the boxes?

“What lifetime is assigned to the boxes?” is an ill-formed question. The type Box<i32> contains no lifetimes, and the type Box<&'a Foo> contains one lifetime. A common confusion is to think that lifetimes describe how long values exist — this is not true. Lifetimes describe how long borrows exist, or how long references are valid for. Box<i32> is not a reference and does not contain any references.
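A small illustration of that distinction:

```rust
struct Foo(u32);

// Box<i32>: an owned value; its type mentions no lifetime at all.
fn own() -> Box<i32> {
    Box::new(42)
}

// Box<&'a Foo>: still an owned box, but it contains a borrow, so its type
// carries exactly one lifetime - the lifetime of that borrow. The box may
// not outlive 'a, but 'a is not "the lifetime of the box".
fn wrap<'a>(f: &'a Foo) -> Box<&'a Foo> {
    Box::new(f)
}
```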
 
The 'static Lifespan

The 'static lifespan, which the Coyotos kernel and any fast tokenizer are going to rely on a lot, is subtly dangerous. I don't have an intrinsic problem with this - it's doing what I told it to do - but (contrary to documentation) it isn't safe. It's surprisingly easy to generate crashes that the compiler does not diagnose.

If V is a vector, and Vslice is a slice of that vector, then any operation that re-sizes V invalidates Vslice leaving dangling pointers. For ordinary lifespans this is diagnosed by the compiler as a lifetime violation, because the slice lifetimes must exit before the vector lifetime, but for 'static there appears to be a reductio bug (I need to test this).

Since 'static is (effectively) the "forever" lifespan, it's reasonable to imagine that the reference lifespan induction "grounds out" at 'static, and the compiler therefore allows a 'static slice to be constructed from a 'static vector. The problem with this is that it isn't correct when the 'static vector is mutable and can be resized.

I don't have enough information to tell you what exactly is wrong, but from this description, you have written unsafe code that contains undefined behavior. It is not possible (without exploiting very tricky type-system holes) to use 'static to create dangling pointers. If you have an &'static reference to a vector, then mutating that vector by other means than that specific reference is UB. If you share the code that manages the mutable vector, I can tell you where it is incorrect — you can also use the Miri tool (either locally or on the Rust Playground) to test it. Miri will straightforwardly detect all violations of borrow exclusivity/immutability.

Exports Puzzle Me

If we have a workspace containing crates A and B-lib, where A depends on B-lib, we may not want to export B-lib public items to consumers outside the workspace. Same issue for a package having multiple sub-crates. It may also be true that we have fields that should be visible within the package but not visible to the public.

I think this is something I don't understand yet, but it provisionally feels to me as if visibility is half-baked. I obviously need to read up.

Rust does not, currently, support any notion of fine-grained visibility beyond the crate, only visibility within modules. Are you sure you need separate library crates and not modules?
 
It would be really helpful to have a tool I could point at a crate, package, or workspace that would tell me what symbols are exported. Does such a tool exist?

It is conventional to use rustdoc (cargo doc --open) for this purpose, and consider whatever you can see in the documentation to be what is public for API design and stability purposes.
 
The main issue I see with build.rs is that it causes common tools to get re-built in every consuming build. If capidl is bound as a library, then it is compiled into the build.rs binary in every consuming workspace. It is totally unclear to me why that should be necessary even given the desire for artifact traceability.

There is work towards Cargo being able to share artifacts between multiple workspaces. (This will not completely prevent all rebuilds, because different workspaces can have different compilation options and package features, but it will help a lot.)

Matt Rice

Jan 3, 2026, 9:45:40 PM
to cap-...@googlegroups.com
On Sat, Jan 3, 2026 at 11:46 PM Kevin Reid <kpr...@switchb.org> wrote:
>
> On Sat, Jan 3, 2026 at 1:05 PM Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
>>
>> Borrow Checker You hear a lot of people talking about the borrow checker and how it is hard to understand. I've hit a couple of cases where an item was retired by a copy I didn't recognize, but the diagnostics from the compiler are quite good at explaining what happened - I'm sure this wasn't always the case.
>
>
> Indeed, the rustc developers realized this would be a problem, and put a lot of effort into building helpfully structured error messages.
>
>> I've hit two issues, one mostly just surprising and the other annoying. The surprising one is the tendency to introduce borrows of borrows of borrows that aren't necessary. Hypothesis: if "&(**foo)" works, the analyzer probably shouldn't have suggested deeper borrowing.
>
>
> Can you say more about where you are seeing this suggestion? In most cases (not all!), either one `&` or one `*` is sufficient. Perhaps you are confusing rust-analyzer’s “inlay hints”, that are telling you what borrowing is automatically happening, with suggestions to add it explicitly?
>
>> cargo-mk wants to do dependency ordering, which leaves me with two indexing structures: a HashMap for name lookup and a vector for order-of-build. There's a cycle check in there, and it would be very natural in other languages to have a boolean field saying "we've already checked this one, and it is cycle free". But this requires mutability, and rust won't let you do that straightforwardly when the data structure is indexed in more than one way. I'm aware of several workarounds and I used one.
>
>
> I would say that the most appropriate solution *for a marking problem* is to use interior mutability — either Cell<bool> (for single-threaded-only programs) or AtomicBool.
>
>>
>> Lifetime Variables This is one of the weak spots, but it's mainly a matter of missing documentation. I think these are actually region type variables (c.f. Cyclone), but even now I'm not sure.
>
>
> I’m not up on programming-language-theory enough to tell you the right technical terms, but the way I think about it is that a lifetime (not a lifetime variable!) specifies an end-point in time, at which some set of references becomes invalid (when they must not be used). Note I said “end-point” — this is because the start-point is defined by the reference value coming to exist. (Therefore, it is possible to introduce new references under already-existing lifetimes, as long as they will continue to be valid for long enough.)
>
>>
>> When the compiler demands introduction of lifetime variables, it doesn't say why, and the inference rules are not clear.
>
>
> The inference rules for contexts where an explicit variable might or might not be required are the lifetime elision rules: <https://doc.rust-lang.org/reference/lifetime-elision.html#lifetime-elision-in-functions>
> Note that while these rules involve types, they are driven purely by the grammar of types (that is, “does this generic type have a lifetime parameter?”), not by more elaborate type inference.
>

FWIW, one of the things with lifetimes I've used but had difficulty explaining is the subtlety surrounding implicit lifetime bounds and their interaction with higher-ranked trait bounds:
https://sabrinajewson.org/blog/the-better-alternative-to-lifetime-gats#hrtb-implicit-bounds
There the borrow checker can infer some bounds, but those bounds cannot be written explicitly in the code in a way that actually passes the borrow checker. I guess I wish at some level there was a completely explicit syntax for everything that can be implied. But I think that is a fairly obscure corner case.


Jonathan S. Shapiro

Jan 4, 2026, 1:26:37 AM
to cap-...@googlegroups.com
Kevin: Thanks for jumping in.

All: Sorry for the extended response.

On Sat, Jan 3, 2026 at 3:46 PM Kevin Reid <kpr...@switchb.org> wrote:
Indeed, the rustc developers realized this would be a problem, and put a lot of effort into building helpfully structured error messages.

I did enough work on issuing diagnostics for unification trails in the BitC compiler to appreciate their diligence! 

I've hit two issues, one mostly just surprising and the other annoying. The surprising one is the tendency to introduce borrows of borrows of borrows that aren't necessary. Hypothesis: if "&(**foo)" works, the analyzer probably shouldn't have suggested deeper borrowing.

Can you say more about where you are seeing this suggestion? In most cases (not all!), either one `&` or one `*` is sufficient. Perhaps you are confusing rust-analyzer’s “inlay hints”, that are telling you what borrowing is automatically happening, with suggestions to add it explicitly?

I'll have to extract the next example as an illustration, but it goes something like this:

  • You have something of type &T, and you go to pass it as a parameter of type &T and then store or pass that parameter. Let's say the parameter name is "param"
  • For reasons I do not understand, the borrow checker (at least I don't think it's the type checker) suggests adding an extra & at the "store or pass" step. So you have param of type &T being assigned to a field of type &T, and you get a diagnostic saying you should add a borrow, such that field = param turns into field = &param, with the type of the right-hand side becoming &&T.
Don't hold me to that precisely. Like I said, I'll extract a small example. It's one of those things that will be obvious once we look at a concrete example.


cargo-mk wants to do dependency ordering, which leaves me with two indexing structures: a HashMap for name lookup and a vector for order-of-build. There's a cycle check in there, and it would be very natural in other languages to have a boolean field saying "we've already checked this one, and it is cycle free". But this requires mutability, and rust won't let you do that straightforwardly when the data structure is indexed in more than one way. I'm aware of several workarounds and I used one.

I would say that the most appropriate solution *for a marking problem* is to use interior mutability — either Cell<bool> (for single-threaded-only programs) or AtomicBool.

But if I understand the borrow rules, that doesn't actually work. In the case at hand, we have:

let namemap: HashMap<String, &mut MyStruct> = HashMap::new();
let vec_of_map: Vec<&MyStruct> = vec![];

We want those references to refer to the same object, with the result that setting namemap["some/path"].safe = true is visible when accessing the same MyStruct within vec_of_map. But (a) the borrow rules don't allow this aliasing, and (b) we kind of want the rest of the structure to be immutable. I cheated by adding a side table:

let safemap: HashMap<String, bool> ...
 
This gets used in the circularity check to short-circuit the recursion if you reach something that has already been determined to be cycle free. But it's a moderately expensive solution.

 
Lifetime Variables This is one of the weak spots, but it's mainly a matter of missing documentation. I think these are actually region type variables (c.f. Cyclone), but even now I'm not sure.

I’m not up on programming-language-theory enough to tell you the right technical terms, but the way I think about it is that a lifetime (not a lifetime variable!) specifies an end-point in time, at which some set of references becomes invalid (when they must not be used). Note I said “end-point” — this is because the start-point is defined by the reference value coming to exist. (Therefore, it is possible to introduce new references under already-existing lifetimes, as long as they will continue to be valid for long enough.)

For clarity: when we are asked to add a lifetime annotation 'a, the 'a is not a lifetime. It is a lifetime type variable, used mainly to unify lifetimes within a type declaration. Yes, lifetimes are distinct from lifetime type variables. If you substitute "region" for "lifetime" here, I think there is no change to meaning, except that it puts you into the established type system literature about such things, with the small quirk that the borrow checker's "take ownership" operation means that the end-of-scope for a variable may not be the same as the end-of-scope for its containing lexical block. From a region type system perspective that's not a conceptual big deal. As a matter of formalization, it's complicated mainly because it [appears to] require formalizing a limited form of typestate at the type level. Feasible in a dependent type system, but tricky to ensure that it converges. 
 
When the compiler demands introduction of lifetime variables, it doesn't say why, and the inference rules are not clear.

The inference rules for contexts where an explicit variable might or might not be required are the lifetime elision rules: <https://doc.rust-lang.org/reference/lifetime-elision.html#lifetime-elision-in-functions>
Note that while these rules involve types, they are driven purely by the grammar of types (that is, “does this generic type have a lifetime parameter?”), not by more elaborate type inference.

We may be talking past each other here. If you have a construction involving many elements, some annotated with lifetime variable 'a and others annotated with lifetime variable 'b, the rule is that all 'a annotations denote the same lifetime, all 'b annotations denote the same lifetime, and the relationship between 'a and 'b (if any can be established) is determined by inference constraints. For example:

& 'b Box::new(item: & 'a T)  [implicitly: where 'b < 'a because the box is constructed from the item]
 
means that the lifetime of the constructed box (the 'b) is less than the lifetime of the [interior] boxed type. Formally, the constraint here is 'b < 'a. Though in the absence of explicit T::Drop it would technically be acceptable for the constraint to be "'b <= 'a". That isn't the Rust specification; I'm saying it's okay as a matter of correctness as long as there are no finalizers for T.

The reason I'm harping on this a little is that if you have something with 'static lifespan way up at the beginning of the call stack, it can unify its way down the call stack through the lifespan type variables, and you can find yourself very deep in the call stack constructing an instance with lifespan 'a = 'b = ... = 'z = 'static by virtue of chained lifetime variable inference. Which can lead to very interesting, and very useful, and occasionally very confusing lifetimes getting bound to allocations that appear deep in the call stack.
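A tiny example of the kind of chaining I mean (sketch):

```rust
fn pick<'a>(xs: &'a [u32]) -> &'a u32 {
    &xs[0]
}

static TABLE: [u32; 3] = [1, 2, 3];

fn main() {
    // Because TABLE lives for the whole program, 'a is inferred as 'static
    // here, and the result is usable anywhere - even though pick() itself
    // never mentions 'static.
    let r: &'static u32 = pick(&TABLE);
    println!("{r}");
}
```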

 
I find that if I assume they are actually region type variables and decorate accordingly I can get by, but it would be really nice (a) to have that intuition confirmed or corrected, and (b) to better understand what the inference rules are. If I build a list of Box<& 'a Foo>, where Foo has lifetime 'a, what lifetime is assigned to the boxes?

“What lifetime is assigned to the boxes?” is an ill-formed question. The type Box<i32> contains no lifetimes, and the type Box<&'a Foo> contains one lifetime. A common confusion is to think that lifetimes describe how long values exist — this is not true. Lifetimes describe how long borrows exist, or how long references are valid for. Box<i32> is not a reference and does not contain any references.

From a type theory perspective, that is not correct. Every construction implicitly has a lifetime variable for the constructed thing. Generally speaking, if the parameters to a construction are a, b, ... z with associated lifespan variables 'a, 'b, ... 'z, then the result of a construction is a fresh type variable 'constructed that satisfies 'constructed < 'a, 'constructed < 'b, ... 'constructed < 'z. Heap constructions require a fresh type variable, which is a little tricky to write down in a way that will make sense here. The intuition is that Boxing a value implicitly pushes it to the heap, which requires either a Clone or a Copy operation under the covers. So the construction of a Box has a type something like "Box::new<T>(& fresh 'a T) -> &'b Box<&'a T> where 'a < 'b". Note here that 'a is a fresh type variable, meaning that it can only unify with "free" type variables, and that the result type of copy or clone is always a free type variable. From a type perspective, this is why boxing from the stack requires a copy or a clone.

There's a right way to write this down, and this isn't quite it. I can refresh myself on the type theory if anybody is interested.
 
 
The 'static Lifespan

The 'static lifespan, which the Coyotos kernel and any fast tokenizer are going to rely on a lot, is subtly dangerous. I don't have an intrinsic problem with this - it's doing what I told it to do - but (contrary to documentation) it isn't safe. It's surprisingly easy to generate crashes that the compiler does not diagnose.

If V is a vector, and Vslice is a slice of that vector, then any operation that re-sizes V invalidates Vslice leaving dangling pointers. For ordinary lifespans this is diagnosed by the compiler as a lifetime violation, because the slice lifetimes must exit before the vector lifetime, but for 'static there appears to be a reductio bug (I need to test this).

Since 'static is (effectively) the "forever" lifespan, it's reasonable to imagine that the reference lifespan induction "grounds out" at 'static, and the compiler therefore allows a 'static slice to be constructed from a 'static vector. The problem with this is that it isn't correct when the 'static vector is mutable and can be resized.

I don't have enough information to tell you what exactly is wrong, but from this description, you have written unsafe code that contains undefined behavior. It is not possible (without exploiting very tricky type-system holes) to use 'static to create dangling pointers.

Well, I'm not sure how tricky the type system holes are, but I agree with you. My point was that these type system holes are undiagnosed by the compiler, and that is a bug.
 
If you have an &'static reference to a vector, then mutating that vector by other means than that specific reference is UB.

Yes. But I'm saying it shouldn't be. I'm saying that it's a type error, and it should be diagnosed. If taking a slice of a mutable vector and then appending to the underlying vector is undefined behavior, then any Rust claim to memory safety is just flatly and unequivocally a lie. To be clear, I think this is a lifespan handling bug in the compiler rather than a fundamental flaw in the language. I'm just saying there's a bug. Given:

let buf & mut [u8] = vec![]
let buf_slice = buf[1:] // error

I claim that construction of the slice is a [static] type error because the vector is mutable and can therefore be appended with the result of invalidating the underlying storage locations and leaving the slice value unsafe. There is no corresponding problem if the vector is immutable, even in the presence of interior mutability. There is no good justification to resorting to undefined behavior claims about this when the problem can be caught in the type checker, statically, 100% of the time.

Now that I stare at it, it's the vector mutability that's the problem, not the use of a 'static lifespan.

Exports Puzzle Me

If we have a workspace containing crates A and B-lib, where A depends on B-lib, we may not want to export B-lib public items to consumers outside the workspace. Same issue for a package having multiple sub-crates. It may also be true that we have fields that should be visible within the package but not visible to the public.

I think this is something I don't understand yet, but it provisionally feels to me as if visibility is half-baked. I obviously need to read up.

Rust does not, currently, support any notion of fine-grained visibility beyond the crate, only visibility within modules. Are you sure you need separate library crates and not modules?

It's a fair question. Let me dig in to understand that better.
 
 
It would be really helpful to have a tool I could point at a crate, package, or workspace that would tell me what symbols are exported. Does such a tool exist?

It is conventional to use rustdoc (cargo doc --open) for this purpose, and consider whatever you can see in the documentation to be what is public for API design and stability purposes.

I'm not sure that has any relationship to what is public from a language definition perspective. There's no guarantee that a publicly exported function has any documentation at all. What I'm after is "here is the set of symbols that are recognized as resolvable at the linker level of abstraction."
 
 
The main issue I see with build.rs is that it causes common tools to get re-built in every consuming build. If capidl is bound as a library, then it is compiled into the build.rs binary in every consuming workspace. It is totally unclear to me why that should be necessary even given the desire for artifact traceability.

There is work towards Cargo being able to share artifacts between multiple workspaces. (This will not completely prevent all rebuilds, because different workspaces can have different compilation options and package features, but it will help a lot.)

That seems constructive, and yes, certain options dictate distinct builds. I'm a little suspicious that feature unification across target and host builds (workspaces and build-workspaces, if you will) may turn out to be entertaining.

Matt Rice

Jan 4, 2026, 2:35:13 AM
to cap-...@googlegroups.com
I'm a little confused by the combination of the example and the description of the problem. In that code example, with the empty vec, getting a slice starting from the element at index 1 causes a runtime panic. Rust doesn't track any kind of dependent type asserting at compile time that a given length of the vector is valid.

I think rather, following the description you gave, that the confusion might be that Rust is a little lazy about turning the existence of multiple references into errors, waiting until those references are actually dereferenced. The same is true of the out-of-bounds panic that happens at runtime above. That is to say, `&mut [u8]` has *exclusive* ownership, and something like `let buf_slice = &buf[0..];` would give shared ownership.

Rust really won't complain until you use these in some way which violates the exclusive ownership, but once you do, you will get a static compile-time error. For instance, the following will complain "cannot borrow `v` as mutable because it is also borrowed as immutable":

```
let mut v: Vec<u8> = vec![];
let buf_slice = &v[1..];
v.clear();
eprintln!("{:?}", buf_slice);
```


William ML Leslie

Jan 4, 2026, 5:51:01 AM
to cap-...@googlegroups.com
On Sun, 4 Jan 2026 at 07:05, Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
The main issue I see with build.rs is that it causes common tools to get re-built in every consuming build. If capidl is bound as a library, then it is compiled into the build.rs binary in every consuming workspace. It is totally unclear to me why that should be necessary even given the desire for artifact traceability. And it has the (perhaps unintended) consequence that pre-build artifacts are successfully obscured from the human developer - one must dig into build.rs to find out what they are.

The build dependencies need to be specified in the build-dependencies section of the Cargo.toml, the same way that regular dependencies are.

If you find it is being re-run when it doesn't need to be, see the documentation here: https://doc.rust-lang.org/cargo/reference/build-scripts.html#change-detection
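For instance, a build.rs that declares its inputs, so Cargo can skip re-running it when nothing relevant changed (the paths here are illustrative):

```rust
// build.rs
fn main() {
    // Without directives like these, Cargo conservatively re-runs the script
    // whenever any file in the package changes.
    println!("cargo:rerun-if-changed=idl/");
    println!("cargo:rerun-if-changed=build.rs");
}
```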

If you are seeing expensive builds, it might be that Cargo is using a single-threaded linker. It might be worth checking that it's picking up the linker from LLVM rather than GCC.
 
--
William ML Leslie

Kevin Reid

Jan 4, 2026, 3:12:54 PM
to cap-...@googlegroups.com
On Sat, Jan 3, 2026 at 10:26 PM Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
On Sat, Jan 3, 2026 at 3:46 PM Kevin Reid <kpr...@switchb.org> wrote:
cargo-mk wants to do dependency ordering, which leaves me with two indexing structures: a HashMap for name lookup and a vector for order-of-build. There's a cycle check in there, and it would be very natural in other languages to have a boolean field saying "we've already checked this one, and it is cycle free". But this requires mutability, and rust won't let you do that straightforwardly when the data structure is indexed in more than one way. I'm aware of several workarounds and I used one.

I would say that the most appropriate solution *for a marking problem* is to use interior mutability — either Cell<bool> (for single-threaded-only programs) or AtomicBool.

But if I understand the borrow rules, that doesn't actually work. In the case at hand, we have:

let namemap: HashMap<String, &mut MyStruct> = HashMap::new();
let vec_of_map: Vec<&MyStruct> = vec![];

We want those references to refer to the same object, with the result that setting namemap["some/path"].safe = true is visible when accessing the same MyStruct within vec_of_map. But (a) the borrow rules don't allow this aliasing, and (b) we kind of want the rest of the structure to be immutable.

The point of using Cell or atomics is that you then don't require &mut to change the value of the cell. (&mut is a bit of a misnomer; it is often more accurate to describe it as an exclusive reference than a mutable reference.) But perhaps I don't understand the situation.

Also, my general advice to Rust beginners is that you should refrain from building primary data structures out of references. The simple rule of thumb is that references should be used for temporary purposes only. As a refinement of this, in applications like compilers which have phases/passes where some results are computed in each pass, it's okay to take long-lived & references to pass 1 while in pass 2. But it’s not clear to me whether that’s what you are doing. If it is, then you can use & references from pass 2 to point to cells stored in the data structures built by pass 1.
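A minimal sketch of that shape, using Cell for the mark (illustrative names, single-threaded):

```rust
use std::cell::Cell;
use std::collections::HashMap;

struct Pkg {
    name: String,
    cycle_free: Cell<bool>, // interior mutability: settable via a & reference
}

fn main() {
    // Pass 1 owns the nodes.
    let pkgs = vec![Pkg { name: "regdef".into(), cycle_free: Cell::new(false) }];

    // Pass 2 borrows them; both indexes hold shared references.
    let by_name: HashMap<&str, &Pkg> =
        pkgs.iter().map(|p| (p.name.as_str(), p)).collect();
    let build_order: Vec<&Pkg> = pkgs.iter().collect();

    // Marking needs no &mut, so the aliasing is fine.
    by_name["regdef"].cycle_free.set(true);
    assert!(build_order[0].cycle_free.get());
}
```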

I’m not up on programming-language-theory enough to tell you the right technical terms, but the way I think about it is that a lifetime (not a lifetime variable!) specifies an end-point in time, at which some set of references becomes invalid (when they must not be used). Note I said “end-point” — this is because the start-point is defined by the reference value coming to exist. (Therefore, it is possible to introduce new references under already-existing lifetimes, as long as they will continue to be valid for long enough.)

For clarity: when we are asked to add a lifetime annotation 'a, the 'a is not a lifetime. It is a lifetime type variable, used mainly to unify lifetimes within a type declaration.

Yes, that’s right. But lifetime variables are used to discuss lifetimes, so knowing what lifetimes are is the first step to understanding lifetime variables.
 
The inference rules for contexts where an explicit variable might or might not be required are the lifetime elision rules: <https://doc.rust-lang.org/reference/lifetime-elision.html#lifetime-elision-in-functions>
Note that while these rules involve types, they are driven purely by the grammar of types (that is, “does this generic type have a lifetime parameter?”), not by more elaborate type inference.

We may be talking past each other here. If you have a construction involving many elements, some annotated with lifetime variable 'a and others annotated with lifetime variable 'b, the rule is that all 'a annotations denote the same lifetime, all 'b annotations denote the same lifetime, and the relationship between 'a and 'b (if any can be established) is determined by inference constraints. For example:

& 'b Box::new(item: & 'a T)  [implicitly: where 'b < 'a because the box is constructed from the item]
 
means that the lifetime of the constructed box (the 'b) is less than the lifetime of the [interior] boxed type. Formally, the constraint here is 'b < 'a. Though in the absence of explicit T::Drop it would technically be acceptable for the constraint to be "'b <= 'a". That isn't the Rust specification; I'm saying it's okay as a matter of correctness as long as there are no finalizers for T.

'b does not refer to a lifetime “of the box”. It is a lifetime of a particular borrow of the box. This is a key distinction, because whenever the box's contents are mutated, this invalidates all prior borrows of the box (ends their lifetime).
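A small example of what ends those borrows (a sketch; this relies on non-lexical lifetimes):

```rust
fn main() {
    let mut b = Box::new(1);

    let r = &*b;     // a borrow of the box's contents; this is what a
                     // 'b-style lifetime describes, not the box itself
    println!("{r}"); // last use of r, so the borrow may end here

    *b = 2;          // fine: no borrow of the contents is still live

    let r2 = &*b;
    // *b = 3;       // uncommenting this would conflict with r2's borrow,
                     // and the next line would then fail to compile
    println!("{r2}");
}
```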
 
The reason I'm harping on this a little is that if you have something with 'static lifespan way up at the beginning of the call stack, it can unify its way down the call stack through the lifespan type variables, and you can find yourself very deep in the call stack constructing an instance with lifespan 'a = 'b = ... = 'z = 'static by virtue of chained lifetime variable inference. Which can lead to very interesting, and very useful, and occasionally very confusing lifetimes getting bound to allocations that appear deep in the call stack.

Yes, getting lifetimes set equal to 'static can be very confusing. There are probably ways to think about this problem that make it less confusing, but it would help to have some concrete code to discuss as an example.
  
“What lifetime is assigned to the boxes?” is an ill-formed question. The type Box<i32> contains no lifetimes, and the type Box<&'a Foo> contains one lifetime. A common confusion is to think that lifetimes describe how long values exist — this is not true. Lifetimes describe how long borrows exist, or how long references are valid for. Box<i32> is not a reference and does not contain any references.

From a type theory perspective, that is not correct. Every construction implicitly has a lifetime variable for the constructed thing.

IIUC, the problem here is that Rust’s “lifetimes” are not the same thing as type theory’s “lifetimes”. As I said above, Rust lifetimes are of borrowings, not objects. It is certainly an unfortunate choice of terms, but you need to keep this distinction in mind or you will continue to be confused by Rust.
 
Well, I'm not sure how tricky the type system holes are, but I agree with you. My point was that these type system holes are undiagnosed by the compiler, and that is a bug.

Compiler bugs are of course possible, but my claim is that it is more likely that you have written incorrect unsafe code, than that you have hit a compiler bug. (Of course, that is easily disproven if your program contains no unsafe code.) Stand-alone compilable sample code to discuss would be very helpful to this discussion.
 
If you have an &'static reference to a vector, then mutating that vector by other means than that specific reference is UB.

Yes. But I'm saying it shouldn't be. I'm saying that it's a type error, and it should be diagnosed. If taking a slice of a mutable vector and then appending to the underlying vector is undefined behavior, then any Rust claim to memory safety is just flatly and unequivocally a lie. To be clear, I think this is a lifespan handling bug in the compiler rather than a fundamental flaw in the language. I'm just saying there's a bug. Given:

let buf & mut [u8] = vec![]
let buf_slice = buf[1:] // error

I claim that construction of the slice is a [static] type error because the vector is mutable and can therefore be appended with the result of invalidating the underlying storage locations and leaving the slice value unsafe.

In this kind of code (supposing a version that compiles), constructing buf_slice causes buf to be inaccessible for the duration of the immutable/shared borrow. If that does not happen, then either there is a compiler bug or incorrect unsafe code involved.

It is conventional to use rustdoc (cargo doc --open) for this purpose, and consider whatever you can see in the documentation to be what is public for API design and stability purposes.

I'm not sure that has any relationship to what is public from a language definition perspective. There's no guarantee that a publicly exported function has any documentation at all.

Rustdoc shows all public symbols (except for ones explicitly marked as hidden) regardless of whether any documentation text has been written for them. If you haven’t looked at the output of rustdoc for your own project, I encourage you to do so.
 
What I'm after is "here is the set of symbols that are recognized as resolvable at the linker level of abstraction."

I’m not sure this is a meaningful set, but I’m not well-informed on how Rust is linked in practice.
 
There is work towards Cargo being able to share artifacts between multiple workspaces. (This will not completely prevent all rebuilds, because different workspaces can have different compilation options and package features, but it will help a lot.)

That seems constructive, and yes, certain options dictate distinct builds. I'm a little suspicious that feature unification across target and host builds (workspaces and build-workspaces, if you will) may turn out to be entertaining.

To refine my previous message: you can already configure Cargo to use the same build directory for multiple workspaces. It just doesn’t work as well as one might like, yet, so it’s not offered as a default behavior. But, whether currently or in a future better version, doing this will not cause any feature unification that doesn’t already happen — you’ll just get sharing of artifacts when the features are the same.

Matt Rice

unread,
Jan 4, 2026, 5:36:43 PM (5 days ago) Jan 4
to cap-...@googlegroups.com
On Sun, Jan 4, 2026 at 8:12 PM Kevin Reid <kpr...@switchb.org> wrote:
>
> On Sat, Jan 3, 2026 at 10:26 PM Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
>>
>> From a type theory perspective, that is not correct. Every construction implicitly has a lifetime variable for the constructed thing.
>
>
> IIUC, the problem here is that Rust’s “lifetimes” are not the same thing as type theory’s “lifetimes”. As I said above, Rust lifetimes are of borrowings, not objects. It is certainly an unfortunate choice of terms, but you need to keep this distinction in mind or you will continue to be confused by Rust.

I guess I have a little difficulty with Kevin's explanation that they
are borrowings not objects when we look at a particularly weird case
like,
https://github.com/ratmice/fnmutant/blob/master/src/lib.rs

It isn't clear what is being borrowed here; the type variables are
just generic types which can be owned/borrowed/whatever.
The usage of the type variable `for <'a> Fn(...)` is saying that the
function is valid for all lifetimes 'a. Thus the parameters of the
function, `In`, `Out`, etc., can have lifetimes *shorter* than the
closure itself.
Thus it is setting up a scenario between the lifetime of the object
and the lifetime of the closure such that it cannot be captured.

I at least have trouble reconciling this with the "lifetimes are of
borrowings" philosophy, but this is a pretty mystifying corner case.

Kevin Reid

unread,
Jan 4, 2026, 10:06:24 PM (5 days ago) Jan 4
to cap-...@googlegroups.com
On Sun, Jan 4, 2026 at 2:36 PM Matt Rice <rat...@gmail.com> wrote:
I guess I have a little difficulty with Kevin's explanation that they are borrowings not objects when we look at a particularly weird case like, https://github.com/ratmice/fnmutant/blob/master/src/lib.rs

It isn't clear what is being borrowed here, the type variables are just generic types which can be owned/borrowed/whatever. the usage of the type variable `for <'a> Fn(...)` is saying that the function is valid for all lifetimes 'a. 

In that code, the for<'a> introduces a completely unused lifetime variable. You can delete it and the program will function identically. Nothing is being borrowed and the lifetime variable says nothing about the functions.

Matt Rice

unread,
Jan 4, 2026, 10:14:57 PM (5 days ago) Jan 4
to cap-...@googlegroups.com
I don't think that is the case (however, it has been years since I've
looked at this code). What you say *might* be true in the sense
that removing the bounds will still compile, but if I recall, the
difference shows up when you swap to a Fn which captures the `In`
parameter, such as one that keeps a local mutable vector; IIRC there
are implied bounds at play.

I.e., I think if you change the code to capture the argument in the Fn
and then remove the bounds, it would compile,
whereas it currently fails if you try to capture the argument.

Matt Rice

unread,
Jan 4, 2026, 10:22:43 PM (5 days ago) Jan 4
to cap-...@googlegroups.com
I should also add it was never clear to me that this was actually an
intentional part of the compiler/ownership model.

Matt Rice

unread,
Jan 4, 2026, 10:39:40 PM (5 days ago) Jan 4
to cap-...@googlegroups.com
On Mon, Jan 5, 2026 at 3:06 AM Kevin Reid <kpr...@switchb.org> wrote:
>
Sorry for all the replies, I guess what I'm really trying to say here
regardless of that code, is that implied bounds seem to act in
mysterious ways on *types* rather than on borrows.

Kevin Reid

unread,
Jan 5, 2026, 12:37:34 AM (4 days ago) Jan 5
to cap-...@googlegroups.com
On Sun, Jan 4, 2026 at 7:39 PM Matt Rice <rat...@gmail.com> wrote:
I guess what I'm really trying to say here, regardless of that code, is that implied bounds seem to act in mysterious ways on *types* rather than on borrows.

Ah. Hm. I cannot provide any authoritative answers about implied bounds (and I believe there are even some inconsistencies in how they work), but speaking of bounds in general: a lifetime bound on a type is about references that values of the type might contain, not about the existence of any given value. (That’s a part of the distinction I have been talking about, that Rust lifetimes describe borrows, not values.)

The most common and dramatic example of this is that T: 'static does not mean that values of type T must continue to exist until the program exits; rather, it means that a value of type T can be kept around until the program exits, if its owner so chooses.

Or, in another perspective, it means “if values of type T contain any references, they must be &'static references” (but that’s a simplification, because it’s possible for a type to be meaningfully constrained by a lifetime parameter without any actual references per se existing within it).
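A small sketch of that point (keep_for_later is a made-up function):

// T: 'static means values of T may be kept arbitrarily long,
// not that any particular value must be.
fn keep_for_later<T: 'static>(value: T) -> Box<T> {
    Box::new(value)
}

fn main() {
    // String owns its data and contains no borrows, so String: 'static holds,
    // even though this particular String is dropped before the program exits.
    let boxed = keep_for_later(String::from("owned"));
    drop(boxed);

    // By contrast, a &str borrowed from a local String would not satisfy
    // T: 'static:
    // let local = String::from("temp");
    // let _ = keep_for_later(&local); // error: `local` does not live long enough
}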


Matt Rice

unread,
Jan 5, 2026, 1:39:54 AM (4 days ago) Jan 5
to cap-...@googlegroups.com
On Mon, Jan 5, 2026 at 5:37 AM Kevin Reid <kpr...@switchb.org> wrote:
>
> On Sun, Jan 4, 2026 at 7:39 PM Matt Rice <rat...@gmail.com> wrote:
>>
>> I guess what I'm really trying to say here, regardless of that code, is that implied bounds seem to act in mysterious ways on *types* rather than on borrows.
>
>
> Ah. Hm. I cannot provide any authoritative answers about implied bounds (and I believe there are even some inconsistencies in how they work),

Yeah, there are definitely inconsistencies in how they work. In that
GATs link I sent previously about implied bounds, there's an example
where replacing the implied bounds with explicit user-specified bounds
leads to a compilation error.

> but speaking of bounds in general: a lifetime bound on a type is about references that values of the type might contain, not about the existence of any given value. (That’s a part of the distinction I have been talking about, that Rust lifetimes describe borrows, not values.)
>
> The most common and dramatic example of this is that T: 'static does not mean that values of type T must continue to exist until the program exits; rather, it means that a value of type T can be kept around until the program exits, if its owner so chooses.
>
> Or, in another perspective, it means “if values of type T contain any references, they must be &'static references” (but that’s a simplification, because it’s possible for a type to be meaningfully constrained by a lifetime parameter without any actual references per se existing within it).

Yeah, I think your simplification text hits pretty much at the heart of
the different perspectives here.
At least for me, since I tend to equate borrows and references because
the Borrow trait returns a reference.
Anyhow, I think it is a nice simplification, since once you've gone
outside of that you are probably finding yourself in the weeds.

Jonathan S. Shapiro

unread,
Jan 7, 2026, 3:37:52 PM (2 days ago) Jan 7
to cap-...@googlegroups.com
Kevin:

Meant to circle back sooner, and wanted to say how helpful some of your details have been. I clearly need to dig deeper into understanding how lifetimes and borrowing interact. Might even dig back into Niko's dissertation. :-)

On Sun, Jan 4, 2026 at 12:12 PM Kevin Reid <kpr...@switchb.org> wrote:
On Sat, Jan 3, 2026 at 10:26 PM Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
But if I understand the borrow rules, that doesn't actually work. In the case at hand, we have:

let namemap: HashMap<String, &mut MyStruct> = HashMap::new();
let vec_of_map: Vec<&MyStruct> = vec![];

We want those references to refer to the same object, with the result that setting HashMap["some/path"].safe = true is visible when accessing the same MyStruct within vec_of_map. But (a) the borrow rules don't allow this aliasing, and (b) we kind of want the rest of the structure to be immutable.

The point of using Cell or atomics is that you then don't require &mut to change the value of the cell. (&mut is a bit of a misnomer; it is often more accurate to describe it as an exclusive reference than a mutable reference.) But perhaps I don't understand the situation.

There are some ways in which the documentation of the Cell family could be clearer. :-) But let's back up for a sec. You wrote below this:

Also, my general advice to Rust beginners is that you should refrain from building primary data structures out of references.

The only places where I'm building primary data structures out of references are places where it is a reference to something with static lifetime and the underlying memory block has been leaked. Either stuff that is global or stuff that was heap allocated and leaked intentionally. I'm not aware of an alternative to using the references for those, but I'd be happy to learn.

Turns out I simplified my example in counterproductive ways. The actual code is a tokenizer that uses permanent &'static [u8] buffers for the content of each file, and creates &'static FileInfo structures to associate a file name or input source with the buffer. So the actual data structure pair is closer to:

let namemap: HashMap<String, &'static mut FileInfo> = HashMap::new();
let vec_of_map: Vec<&'static FileInfo> = vec![];

and it is important here that the same FileInfo reference be able to be stored in both collections. The problem here is that one of them needs to be mutable so that one of its fields can be updated later. Once updated, that update needs to be visible when looked up from *either* container.

I'm not sure where the Cell wrapper would go here? Do you have in mind to make it

Cell<&'static FileInfo>?

I'm not sure how that helps, because the reference isn't the thing we're wanting to modify. Or do you have in mind

&'static Cell<&'static FileInfo>

There are moments when I think that Rust is unique in exploiting carpal tunnel syndrome to defend memory safety. :-)
 
The simple rule of thumb is that references should be used for temporary purposes only. As a refinement of this, in applications like compilers which have phases/passes where some results are computed in each pass, it's okay to take long-lived & references to pass 1 while in pass 2. But it’s not clear to me whether that’s what you are doing.

Well, I'm not doing it yet, in any case, but I'll be getting there. The compiler case may be an unfortunate example, because things like ASTs have very long lives - at least as long as the front end, and sometimes the mid end. They're a great example of something that wants to be allocated out of a pool where the pool establishes the lifespan for all of them. But perhaps not in rust, because that idiom doesn't interact well with the "one mutable reference" idea.

Not complaining about that, just making note of it.
 
For clarity: when we are asked to add a lifetime annotation 'a, the 'a is not a lifetime. It is a lifetime type variable, used mainly to unify lifetimes within a type declaration.

Yes, that’s right. But lifetime variables are used to discuss lifetimes, so knowing what lifetimes are is the first step to understanding lifetime variables.

Circling back to this for just a moment, I'm now clear that regions and lifetimes are different. In a region-based system, the regions are [mostly] lexically scoped, and the region type variables provide a way to pass around the names of those scopes. By "passing" those type variables you can propagate 'region-way-up-the-stack down to some tiny utility function way down the stack and cause allocations in that utility function to unify with, and allocate within, the lexical scope of 'region-way-up-the-stack, getting deallocated when that much higher lexical scope exits. Or more precisely: when the extent of that region ends.

That has its purposes, but passing explicit lifetimes can get horrifically cumbersome. Sometimes unification makes it unnecessary to pass them explicitly, but not always. And a key thing here is that the regions, once bound, never change.

Rust's move operation does something very different, because the transfer of ownership when something gets moved into a new parent data structure is effectively (at least at first glance) a re-assignment of a lifetime variable. Unification might be enough to erase all of that, but I don't understand that part yet. I obviously need to dig in harder.
 
 
The inference rules for contexts where an explicit variable might or might be required are the lifetime elision rules: <https://doc.rust-lang.org/reference/lifetime-elision.html#lifetime-elision-in-functions>
Note that while these rules involve types, they are driven purely by the grammar of types (that is, “does this generic type have a lifetime parameter?”), not by more elaborate type inference.

That seems like a fair capture of where explicit type variables need to get introduced into the program text, but behind the scenes the type checker is introducing them left and right and then unifying most of those variables out of existence. As evidence, error messages very frequently show you places where those type variables appear in some place you didn't explicitly put them. When that happens, it's because the type inference process put them there and the unification process then unified most of them out of existence.

The section of the reference you pointed to is actually describing how the compiler introduces lifetime variables where the programmer elided them, and then most of them go away through unification or [perhaps?] bi-unification or something along those lines.
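For instance, here is a small sketch of the most common elision case; the second signature below is what the compiler introduces behind the scenes for the first:

// Elided form, as written by the programmer:
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// What the elision rules expand it to:
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    println!("{}", first_word("hello world"));
    println!("{}", first_word_explicit("hello world"));
}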

 
'b does not refer to a lifetime “of the box”. It is a lifetime of a particular borrow of the box. This is a key distinction, because whenever the box's contents are mutated, this invalidates all prior borrows of the box (ends their lifetime).

This statement was very helpful. It suggests that what Rust is calling "lifetimes" is actually something that has historically been called "extent". Extents describe the lifetimes of bindings. When prior borrows are invalidated, I now think what's actually happening is that their bindings are revoked and the extent of those bindings therefore ends. After that, those names can no longer be used to reference the data they used to reference.

This is distinct from "liveness", which describes the lifetime of the underlying value. Reclamation can only occur safely when liveness ends, and liveness cannot end while bindings with valid extents continue to exist.

Sorry. I'm typing this out to get my own thoughts organized. I'll get there.
 
IIUC, the problem here is that Rust’s “lifetimes” are not the same thing as type theory’s “lifetimes”. As I said above, Rust lifetimes are of borrowings, not objects. It is certainly an unfortunate choice of terms, but you need to keep this distinction in mind or you will continue to be confused by Rust.

Type theory didn't have lifetimes until Rust introduced them. :-)  Before that, when people talked about "lifetimes" they were generally referring imprecisely to "extent". For Cyclone, the term would be "regions", which isn't quite the same as either liveness or extent. The nomenclature around different details in this area is confusing, partly because the underlying subject matter is rife with subtle distinctions that are important to get right if you want your GC to work. :-)

Compiler bugs are of course possible, but my claim is that it is more likely that you have written incorrect unsafe code, than that you have hit a compiler bug. (Of course, that is easily disproven if your program contains no unsafe code.) Stand-alone compilable sample code to discuss would be very helpful to this discussion.

I'm planning to push the early steps to github in the next week or so. Then everybody can feel entirely justified talking about how poorly I understand Rust, compilation, and life generally. :-)

Coming from a hardware point of view, the fact that rust has undefined behavior seems like a huge red flag. In the unsafe language subset, undefined behavior is sometimes unavoidable because hardware has it and then you're stuck with it. But in the safe part of the language, undefined behavior amounts to saying that the language doesn't have defined semantics.

To refine my previous message: you can already configure Cargo to use the same build directory for multiple workspaces...

That's exactly what I'm trying to avoid. I've got a host workspace happily building binaries for cross-tools that need to be executed to generate source code used in kernel and library builds.

And yes, I understand that's not "the way of cargo" and not how build.rs wants to work. For regdef it's not a big deal - that only gets run in one consuming workspace, and it wouldn't be a big deal to link it into build.rs except that doing so buries visibility into the dependencies. But capidl gets run in a larger number of places.

My view certainly may change, but at the moment I think those are both examples of things where the "go generate" approach feels like a better answer than the build.rs approach. But maybe it's just more familiar.

Jonathan S. Shapiro

unread,
Jan 7, 2026, 3:50:06 PM (2 days ago) Jan 7
to cap-...@googlegroups.com
On Sun, Jan 4, 2026 at 9:37 PM Kevin Reid <kpr...@switchb.org> wrote:
The most common and dramatic example of this is that T: 'static does not mean that values of type T must continue to exist until the program exits; rather, it means that a value of type T can be kept around until the program exits, if its owner so chooses.

Hmm. I see what you are saying, but if I understand things correctly the only ways you can end up with 'static are something being global, something getting re-owned into a 'static container (a reductio case), something in the heap that has been leaked, or something that gets boxed by Box::new() that has a lifetime variable in the right position that got unified with one of the previous cases.

Maybe some examples involving Cells that I don't understand yet?

The "ground" cases seem to be "I'm global" or "I'm a result of Box::leak()", both of which are explicitly outside the domain of things that get reclaimed during execution.

So yeah, the rule is that the reference cannot outlive its target, and nothing in 'static dies until exit(), and so you can keep such things around. But is there any mechanism within the safe language subset for explicitly releasing those things before exit? I wouldn't think so, because that would be unsafe.


Jonathan

Jonathan S. Shapiro

unread,
Jan 7, 2026, 4:03:54 PM (2 days ago) Jan 7
to cap-...@googlegroups.com
On Sun, Jan 4, 2026 at 10:39 PM Matt Rice <rat...@gmail.com> wrote:
On Mon, Jan 5, 2026 at 5:37 AM Kevin Reid <kpr...@switchb.org> wrote:
> Or, in another perspective, it means “if values of type T contain any references, they must be &'static references” (but that’s a simplification, because it’s possible for a type to be meaningfully constrained by a lifetime parameter without any actual references per se existing within it).

Anyhow I think it is a nice simplification, since once you've gone
outside of that you are probably finding yourself in the weeds.

At the very least there is a difference in lifetime requirements once 'static enters the picture. For everything else, the rule is that if A [might] hold a reference to B then 'a < 'b (where these are the associated lifetime variables for the reference to A and the contained reference to B, respectively). But in the case where 'b == 'static, the lifetime constraint is actually 'a <= 'b.

In many cases, we could accept a <= constraint for the first case as well. The '=' only needs to be removed if there is something that imposes an ordering on reclamation, like a finalizer. I believe "drop" semantics imposes that requirement as well. Conceptually, things in 'static cannot have drop semantics even if they carry the trait. It's surprisingly hard to run drop code after you exit...

Jonathan

Matt Rice

unread,
Jan 7, 2026, 4:24:53 PM (2 days ago) Jan 7
to cap-...@googlegroups.com
On Wed, Jan 7, 2026 at 8:50 PM Jonathan S. Shapiro
<jonathan....@gmail.com> wrote:
>
> On Sun, Jan 4, 2026 at 9:37 PM Kevin Reid <kpr...@switchb.org> wrote:
>>
>> The most common and dramatic example of this is that T: 'static does not mean that values of type T must continue to exist until the program exits; rather, it means that a value of type T can be kept around until the program exits, if its owner so chooses.
>
>
> Hmm. I see what you are saying, but if I understand things correctly the only ways you can end up with 'static are something being global, something getting re-owned into a 'static container (a reductio case), something in the heap that has been leaked, or something that gets boxed by box::new() that has a lifetime variable in the right position that got unified with one of the previous cases.
>

This may be a case of "or something that gets boxed by ..." but there
are also cases like when using `Box<dyn SomeTrait>`,
where it defaults to 'static unless you use `dyn SomeTrait + 'some_lifetime`

https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=dc3e357c0ab6c289e14512b179062787

It isn't paged into my head how you would get the compiler to show
that this is inferring a 'static, but I've run into it in the past
and could circle back to find an example that shows it if necessary.
Anyhow, my point is that sometimes 'static happens without actually
being present in the code. I guess it's "unified with one of the
previous cases", which in this case is `dyn`. I can't remember more
corner cases like that at the moment though.
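A sketch of the default bound (using Display just for concreteness):

use std::fmt::Display;

// `Box<dyn Display>` is shorthand for `Box<dyn Display + 'static>`.
// Spelling out the lifetime lets the trait object borrow non-static data.
fn boxed_with_lifetime<'a>(s: &'a String) -> Box<dyn Display + 'a> {
    Box::new(s)
}

// fn boxed_default(s: &String) -> Box<dyn Display> {
//     Box::new(s) // error: borrowed data escapes, because the elided bound
//                 // on the trait object is 'static
// }

fn main() {
    let s = String::from("hello");
    println!("{}", boxed_with_lifetime(&s));
}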

Kevin Reid

unread,
Jan 7, 2026, 5:14:39 PM (2 days ago) Jan 7
to cap-...@googlegroups.com
On Wed, Jan 7, 2026 at 12:37 PM Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
Turns out I simplified my example in counterproductive ways. The actual code is a tokenizer that uses permanent &'static [u8] buffers for the content of each file, and creates &'static FileInfo structures to associate a file name or input source with the buffer. So the actual data structure pair is closer to:

let namemap: HashMap<String, &'static mut FileInfo> = HashMap::new();
let vec_of_map: Vec<&'static FileInfo> = vec![];

and it is important here that the same FileInfo reference be able to be stored in both collections. The problem here is that one of them needs to be mutable so that one of its fields can be updated later. Once updated, that update needs to be visible when looked up from *either* container.

The data structure you have declared and described here cannot be constructed in Rust. A &mut reference is an exclusive reference. The borrow checker will not allow you to construct simultaneously usable & and &mut references, and if you use unsafe code to do it anyway, the program exhibits undefined behavior. Everything that follows from that cannot be attributed to any “type system hole”; the hole is that you’ve applied a sledgehammer to the wall and the building is falling down.

The correct way to have shared, mutable parts of your data is to put the shared, mutable parts in an interior-mutability type such as RefCell, Mutex, Cell, AtomicBool, etc, and then use & shared references or Rc to refer to the whole data structures as needed.

I'm not sure where the Cell wrapper would go here?

The Cell goes on the boolean field.

struct FileInfo {
    // ...
    marked: Cell<bool>,
}

Then you can update the flag, and only the flag, through any shared reference.
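A minimal sketch of that usage (the name field is made up, just for illustration):

use std::cell::Cell;

struct FileInfo {
    name: String,        // hypothetical field for illustration
    marked: Cell<bool>,
}

fn main() {
    let info = FileInfo {
        name: String::from("input.rs"),
        marked: Cell::new(false),
    };

    let a: &FileInfo = &info;
    let b: &FileInfo = &info;   // multiple shared references are fine
    a.marked.set(true);         // mutate the flag through a shared reference
    assert!(b.marked.get());    // the update is visible through the other one
    println!("{} marked", a.name);
}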
 
'b does not refer to a lifetime “of the box”. It is a lifetime of a particular borrow of the box. This is a key distinction, because whenever the box's contents are mutated, this invalidates all prior borrows of the box (ends their lifetime).

This statement was very helpful. It suggests that what Rust is calling "lifetimes" is actually something that has historically been called "extent". Extents describe the lifetimes of bindings. When prior borrows are invalidated, I now think what's actually happening is that their bindings are revoked and the extent of those bindings therefore ends. After that, those names can no longer be used to reference the data they used to reference.

This may be an accurate analogy — again, I don’t have enough PL theory to comment — but note that in Rust a “binding” is a variable or perhaps its introduction — `let (x, y) = z` has two bindings — and both references and the things they borrow do not necessarily exist in bindings, but are simply values. What you describe sounds more like the theory of a language like Lisp or Java where values may be implicitly shared among many bindings.
 
This is distinct from "liveness", which describes the lifetime of the underlying value. Reclamation can only occur safely when liveness ends, and liveness cannot end while bindings with valid extents continue to exist.

Yes, continuing the above, I think this sentence is using an ontology in which bindings are implicitly sharing. In Rust, each route of access to a value is either ownership or borrowing, and the distinction is explicit and appears in the types of the values of bindings, so we talk more about those types and values and less about the bindings.

Coming from a hardware point of view, the fact that rust has undefined behavior seems like a huge red flag. In the unsafe language subset, undefined behavior is sometimes unavoidable because hardware has it and then you're stuck with it. But in the safe part of the language, undefined behavior amounts to saying that the language doesn't have defined semantics.

Safe Rust cannot be responsible for undefined behavior. Unsafe Rust can be.

Note that I’m not saying “Safe Rust cannot cause undefined behavior” because safe code can call a safe function that contains unsafe code, and therefore be part of the causality of the undefined behavior. But if such a call results in undefined behavior, we consider this a bug in the unsafe code; we say that the particular safe function containing unsafe code is unsound.

If all unsafe code in a Rust program is sound, then that program can never exhibit undefined behavior, regardless of what the safe code in it does.
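A minimal sketch of the distinction (first_byte is a made-up function):

// A safe function whose body uses unsafe code. It is sound if no caller,
// using only safe code, can trigger undefined behavior through it.
fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        None
    } else {
        // SAFETY: we just checked that index 0 is in bounds.
        Some(unsafe { *bytes.get_unchecked(0) })
    }
}

// An unsound variant would skip the emptiness check; purely safe callers
// could then cause UB, and the blame would lie with this function, not them.

fn main() {
    assert_eq!(first_byte(&[7, 8, 9]), Some(7));
    assert_eq!(first_byte(&[]), None);
}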

On Wed, Jan 7, 2026 at 12:50 PM Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
On Sun, Jan 4, 2026 at 9:37 PM Kevin Reid <kpr...@switchb.org> wrote:
The most common and dramatic example of this is that T: 'static does not mean that values of type T must continue to exist until the program exits; rather, it means that a value of type T can be kept around until the program exits, if its owner so chooses.

Hmm. I see what you are saying, but if I understand things correctly the only ways you can end up with 'static are something being global, something getting re-owned into a 'static container (a reductio case), something in the heap that has been leaked, or something that gets boxed by Box::new() that has a lifetime variable in the right position that got unified with one of the previous cases.

I’m not sure exactly what you mean when you say “end up with 'static”. To restate the parts I definitely agree with, in order to soundly come into possession of an &'static reference to some value, something such as the following must happen:
  • The value was constructed at compile time.
  • The value was leaked.
  • The value was moved, irrevocably, into a container that is already under one of these cases.
However, the existence of &'static T is not the same thing as saying T: 'static. For example, Vec<&'static str>: 'static is true, regardless of the liveness of any particular Vec<&'static str> value, and I suspect what you mean by “unified with one of the previous cases” must be more like the Vec. A lifetime unification can never extend the liveness of a value, in any way, let alone extend it out to 'static; it can only either do nothing or cause a compilation error because the value doesn’t already live long enough.
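For the "leaked" case, a minimal sketch:

// Box::leak turns an owned allocation into a &'static mut that is never
// reclaimed; the result can be stored anywhere a 'static reference is needed.
fn leak_static() -> &'static str {
    let owned = String::from("lives until exit");
    Box::leak(owned.into_boxed_str())
}

fn main() {
    let s: &'static str = leak_static();
    println!("{}", s);
}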

On Wed, Jan 7, 2026 at 1:03 PM Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
At very least there is a difference in lifetime requirements once 'static enters the picture.  For everything else, the rule is that if A [might] hold a reference to B then 'a < 'b (where these are the associated lifetime variables for the reference to A and the contained reference to B, respectively). But in the case where 'b == 'static, the lifetime constraint is actually 'a <= 'b.

In many cases, we could accept a <= constraint for the first case as well. The '=' only needs to be removed if there is something that imposes an ordering on reclamation, like a finalizer. I believe "drop" semantics imposes that requirement as well. Conceptually, things in 'static cannot have drop semantics even if they carry the trait. It's surprisingly hard to run drop code after you exit...

Actually, I believe all lifetime constraints are <= constraints (at least, the only kind you can write explicitly are). This is not a problem for finalization, because drop ordering is defined completely independently of lifetimes. In general, lifetime analysis never affects the behavior of the program, only whether compilation succeeds or fails. This is an intentional design constraint; it allows borrow checking to be clever, and for future versions of Rust to increase how many programs the borrow checker accepts by redesigning the borrow checking algorithm, without any of that influencing the correctness of the program (such as it might if, say, the order of two drop side-effects were swapped).

Matt Rice

unread,
Jan 7, 2026, 6:06:13 PM (2 days ago) Jan 7
to cap-...@googlegroups.com
On Wed, Jan 7, 2026 at 10:14 PM Kevin Reid <kpr...@switchb.org> wrote:
>
> On Wed, Jan 7, 2026 at 12:37 PM Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
>>
>> Coming from a hardware point of view, the fact that rust has undefined behavior seems like a huge red flag. In the unsafe language subset, undefined behavior is sometimes unavoidable because hardware has it and then you're stuck with it. But in the safe part of the language, undefined behavior amounts to saying that the language doesn't have defined semantics.
>
>
> Safe Rust cannot be responsible for undefined behavior. Unsafe Rust can be.
>
> Note that I’m not saying “Safe Rust cannot cause undefined behavior” because safe code can call a safe function that contains unsafe code, and therefore be part of the causality of the undefined behavior. But if such a call results in undefined behavior, we consider this a bug in the unsafe code; we say that the particular safe function containing unsafe code is unsound.
>
> If all unsafe code in a Rust program is sound, then that program can never exhibit undefined behavior, regardless of what the safe code in it does.
>

It is perhaps worth pointing out that soundness in rust and the safe +
unsafe dichotomy requires a different logical framework
than the typical syntactic soundness. Derek Dreyer's group has done a
lot of work formulating 'semantic soundness' proofs
capable of characterizing rust using safe + unsafe code. Some of that
work (particularly Ralf Jung's Miri interpreter) has made it back into
the rust project.

https://dl.acm.org/doi/10.1145/3676954
https://people.mpi-sws.org/~dreyer/papers/safe-sysprog-rust/paper.pdf

I know Amal Ahmed has also done a lot of work related to formalizing
interactions between multiple-language models, and
I think she has also done a paper on applying that approach to rust:
https://www.ccs.neu.edu/home/amal/papers/rustdistilled.pdf
Just to say that formalizing this part of rust is a little more
complicated than it is for predominantly safe languages.

Jonathan S. Shapiro

unread,
Jan 8, 2026, 12:02:56 AM (yesterday) Jan 8
to cap-...@googlegroups.com
On Wed, Jan 7, 2026 at 2:14 PM Kevin Reid <kpr...@switchb.org> wrote:
On Wed, Jan 7, 2026 at 12:37 PM Jonathan S. Shapiro <jonathan....@gmail.com> wrote:

The data structure you have declared and described here cannot be constructed in Rust. A &mut reference is an exclusive reference.

Yes. Like I said at one point, I worked around the problem by using a side structure, which left the rest immutable.
 
The borrow checker will not allow you to construct simultaneously usable & and &mut references, and if you use unsafe code to do it anyway, the program exhibits undefined behavior. Everything that follows from that cannot be attributed to any “type system hole”; the hole is that you’ve applied a sledgehammer to the wall and the building is falling down.

No unsafe code here anywhere, but understood.
 
The correct way to have shared, mutable parts of your data is to put the shared, mutable parts in an interior-mutability type such as RefCell, Mutex, Cell, AtomicBool, etc, and then use & shared references or Rc to refer to the whole data structures as needed.

Good pointer, and thanks.
 

I'm not sure where the Cell wrapper would go here?

The Cell goes on the boolean field.

struct FileInfo {
    ....
    marked: Cell<bool>,
}

Also helpful.
 
This may be an accurate analogy — again, I don’t have enough PL theory to comment — but note that in Rust a “binding” is a variable or perhaps its introduction — `let (x, y) = z` has two bindings — and both references and the things they borrow do not necessarily exist in bindings, but are simply values. What you describe sounds more like the theory of a language like Lisp or Java where values may be implicitly shared among many bindings.

Agreed, though from a PL perspective, a binding to a compiler-introduced temporary is still a binding, and incorporation (at construction) into a cell of an immutable container has many properties of bindings.

The origin of the "extent" and "live" terms originates in some very old languages, but the definitions of "live" and "extent" actually hold in Rust as well, with some revision to account for move. Values in these older languages are never implicitly shared when viewed at the level of the PL semantics. The extent of a binding never has anything to do with sharing. The liveness of a value depends on the number of outstanding references (therefore sharing), but this is true for Rust as well given the possibility of multiple simultaneous read-only references.
 
Safe Rust cannot be responsible for undefined behavior. Unsafe Rust can be.

That hasn't always matched my experience so far, but I did say at the beginning that I chose my first programs with some amount of malice. Regardless, deficiencies in the compiler (if any) shouldn't be confused with deficiencies in the language specification. Could have been my error, but if I find that I've reconstructed them I'll pass examples along.

Note that I’m not saying “Safe Rust cannot cause undefined behavior” because safe code can call a safe function that contains unsafe code, and therefore be part of the causality of the undefined behavior. But if such a call results in undefined behavior, we consider this a bug in the unsafe code; we say that the particular safe function containing unsafe code is unsound.

Unsound seems an unfortunate coinage, since it has a very firmly established and Rust-relevant meaning in the type theory world that has nothing to do with this. A reasonable alternative might be "ill/well behaved" or something along those lines.

From a PL perspective, I'd argue that unsafety is both transitive and sticky, and that (absent external proof) any code that depends on unsafe code is, from an end-to-end perspective, unsafe by definition. There's a reasonable escape hatch of the form that sometimes the language mechanisms aren't flexible enough to express safe solutions that can be verified by some other means. Having been verified, such code is not unsafe from a semantic perspective regardless of what the source code said.
 
Actually, I believe all lifetime constraints are <= constraints (at least, the only kind you can write explicitly are). This is not a problem for finalization, because drop ordering is defined completely independently of lifetimes.

Ah. I need to go look at that, but I think I see how it might be done, and if cleanup order is separated from lifetime then all of the lifetime relationships can safely be <=. It does, however, mean that the casual statement that the lifetime of the reference must be shorter than that of the thing referred to is not strictly correct. A better formulation would be "the lifetime must be no longer than that of the thing referred to." Not a big deal, except that phrasing was used often enough with me that it led to a substantially incorrect understanding of how things had to be working.

In general, lifetime analysis never affects the behavior of the program, only whether compilation succeeds or fails. This is an intentional design constraint; it allows borrow checking to be clever, and for future versions of Rust to increase how many programs the borrow checker accepts by redesigning the borrow checking algorithm, without any of that influencing the correctness of the program (such as it might if, say, the order of two drop side-effects were swapped).

Well thought out and good call!


Jonathan

Kevin Reid

unread,
Jan 8, 2026, 12:34:24 AM (yesterday) Jan 8
to cap-...@googlegroups.com
On Wed, Jan 7, 2026 at 9:02 PM Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
On Wed, Jan 7, 2026 at 2:14 PM Kevin Reid <kpr...@switchb.org> wrote:
Safe Rust cannot be responsible for undefined behavior. Unsafe Rust can be.

That hasn't always matched my experience so far, but I did say at the beginning that I chose my first programs with some amount of malice. Regardless, deficiencies in the compiler (if any) shouldn't be confused with deficiencies in the language specification. Could have been my error, but if I find that I've reconstructed them I'll pass examples along.

I would be interested to see those examples. And if they aren’t one of the known issues, so will the compiler developers.
 
From a PL perspective, I'd argue that unsafety is both transitive and sticky, and that (absent external proof) any code that depends on unsafe code is, from an end-to-end perspective, unsafe by definition.

Standard practice is that each safe function that executes unsafe code contains a justification for why the function is sound, that is, why its use of unsafe code cannot be responsible for UB no matter what inputs it is passed. These are currently informal, but there is some work on the possibility of adding annotations that constitute machine-checkable proofs (I believe this accepted change proposal is most relevant, but I may be confusing it with another one).

It does, however, mean that the casual statement that the lifetime of the reference must be shorter than the thing referred to is not strictly correct. A better formulation would be "lifetime must be no longer than the thing referred." Not a big deal, except that phrasing was used often enough with me that it led to a substantially incorrect understanding of how things had to be working. 

It doesn’t help that the relationship is often called “outlives”.

Another thing to note is that there’s a significant difference between types which have destructors and types which don’t. Values with destructors get borrowed, in order to execute the destructor, one last time when they are dropped, and values without destructors do not, which can sometimes cause a surprising amount of difference in what programs are accepted. For example, the following code is valid:

struct Foo<'a> {
    r: Option<&'a mut String>,
}

fn main() {
    let mut f = Foo { r: None };
    let mut s = String::new();
    f.r = Some(&mut s);
}


However, adding an impl Drop for Foo<'_> will break it, because f and s are getting dropped in reverse declaration order, so the f value is actually invalid due to containing a dangling reference, and this is accepted only as long as nothing actually reads or borrows f after s is dropped. (Having this kind of flexibility at all may seem pointless and dangerous, but it’s an important part of keeping the language usable. Though it isn’t ideal that this behavior isn’t opt-in on the part of Foo.)

Because of this final, implicit borrow, other borrows of Foo must in fact be shorter than the liveness of Foo and shorter than 'a, not just shorter or equal.
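A sketch of the breaking variant (same Foo as above, with a destructor added; this version does not compile):

struct Foo<'a> {
    r: Option<&'a mut String>,
}

// With a destructor, `f` is implicitly borrowed one last time when it is
// dropped, and that drop happens after `s` is dropped (reverse declaration
// order), so the program above no longer compiles.
impl Drop for Foo<'_> {
    fn drop(&mut self) {}
}

fn main() {
    let mut f = Foo { r: None };
    let mut s = String::new();
    f.r = Some(&mut s); // error[E0597]: `s` does not live long enough
}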

Valerio Bellizzomi

unread,
Jan 8, 2026, 2:09:22 AM (yesterday) Jan 8
to cap-talk
As I said earlier privately to Shap, Cloud Hypervisor is a project written in Rust; the project was started by Intel and is now open source on GitHub, so you could look there for *examples* ...