base::ByteCount

Avi Drissman

Aug 28, 2025, 9:57:28 AM
to cxx, Peter Kasting
I have a question about base::ByteCount. We don't seem to have a base/ owners list, and this touches on C++, so this list is somewhat appropriate.

There's an ongoing switchover to base::ByteCount, which is a strongly-typed int64_t wrapper. Peter brought up concerns that a signed type was chosen rather than an unsigned one, and while I personally lean the other way (we need to handle negative byte amounts, we generally prefer signed types) his point about this decision being underdocumented is legitimate.

I have multiple refactoring CLs that I'd like to finish up and land, but I don't want to do so if there's murkiness about whether we're all on board with this as-is.

Would we be able to come to consensus about this? Thanks.

Avi 

K. Moon

Aug 28, 2025, 10:20:01 AM
to Avi Drissman, cxx, Peter Kasting
Do we actually need both a signed and unsigned version of this? You mentioned needing negative byte counts, but maybe that's situational?

Although I'm personally OK with just having a signed version if the negative byte counts are a hard requirement anyway.

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CACWgwAZ7UKKCkfX7Pqcqj7jsN%2BrzVkRhid-eZp7jFefjfy2ViA%40mail.gmail.com.

Avi Drissman

Aug 28, 2025, 10:28:13 AM
to K. Moon, François Doray, cxx, Peter Kasting
+François Doray who started base::ByteCount and who I missed when writing this thread.

Alexei Svitkine

Aug 28, 2025, 10:35:27 AM
to Avi Drissman, K. Moon, François Doray, cxx, Peter Kasting
My 2c is that it seems useful to be able to have negative ByteCounts, without resorting to different types or falling back to passing a bare number when a param can accept both negative and positive values.

So IMHO, signed SGTM and we could just document the rationale in the header.

I guess the downside is code may need to validate that values passed are non-negative, but imho code should likely be doing validation anyway for values too large as well, so it's not much difference.


Marc Treib

Aug 28, 2025, 10:38:04 AM
to Alexei Svitkine, Avi Drissman, K. Moon, François Doray, cxx, Peter Kasting
Do you have examples for situations where negative ByteCounts are needed? From a distance, having something called ByteCount be negative seems quite surprising to me.

Peter Kasting

Aug 28, 2025, 3:05:23 PM
to Marc Treib, Alexei Svitkine, Avi Drissman, K. Moon, François Doray, cxx
On Thu, Aug 28, 2025 at 7:38 AM Marc Treib <tr...@chromium.org> wrote:
Do you have examples for situations where negative ByteCounts are needed? From a distance, having something called ByteCount be negative seems quite surprising to me.

This was my thought too. The idea of subtracting a positive byte count (e.g. "how much of the input stream is left after we skip past the header") is valuable; but I don't know what a negative byte count means (if the input stream in said example is still in the header, then IMO the answer is not a negative byte count, but either "zero" or "error, that subtraction wasn't safe/sane right now" -- that is, either the ClampedNumeric or CheckedNumeric behaviors).

That said, I'm less worried about the semantic sanity of negative byte counts than I am about the practical potential for problems, given that our previous type for byte counts in most places was unsigned (size_t). Signedness conversions are scary, especially with sizes of objects in memory; the potential to paper over security holes by silencing warnings with static_casts (or actually introduce such holes) seems high.

PK
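
To make the two behaviors mentioned above concrete, here is a minimal sketch in plain C++. It deliberately avoids Chromium's base::ClampedNumeric and base::CheckedNumeric, and the function names are hypothetical; it only shows the two possible answers to "how much of the stream is left after the header".

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helpers, not part of //base. Both answer: "how many bytes are
// left after skipping `header` bytes of a `total`-byte stream?"

// Clamped flavour: a header longer than the stream leaves zero bytes.
constexpr uint64_t RemainingClamped(uint64_t total, uint64_t header) {
  return header > total ? 0 : total - header;
}

// Checked flavour: a header longer than the stream is reported as an error
// (std::nullopt) instead of a made-up size.
constexpr std::optional<uint64_t> RemainingChecked(uint64_t total,
                                                   uint64_t header) {
  if (header > total) {
    return std::nullopt;
  }
  return total - header;
}
```

With a 10-byte stream and a 30-byte header, the clamped version reports 0 bytes remaining, while the checked version returns no value and leaves the decision to the caller.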

Joe Mason

Aug 28, 2025, 3:07:59 PM
to Marc Treib, Alexei Svitkine, Avi Drissman, K. Moon, François Doray, cxx, Peter Kasting
Free space with a "reserved" watermark. If free space falls below the watermark, the API can report "negative space free" since it's now in the red zone and it has become critical to free resources.

I think an unsigned ByteCount would be useful too. The standard practice in C++ is to use a "u" or "unsigned" prefix for unsigned, and no prefix for signed (eg. int32_t vs uint32_t, IdType32 vs IdTypeU32 in base/types/id_type.h), so I'd suggest leaving ByteCount as signed and adding UByteCount for unsigned (if there's enough demand).
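
As a rough illustration of the watermark case, the sketch below uses raw int64_t values and a hypothetical function name; it is not an existing API, just the arithmetic that makes a negative result meaningful.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: free space measured relative to a reserved watermark.
// A negative result means free space has dipped into the reserved zone.
// Assumes both inputs are non-negative, so the subtraction cannot overflow.
constexpr int64_t FreeAboveWatermark(int64_t free_bytes,
                                     int64_t reserved_bytes) {
  return free_bytes - reserved_bytes;
}
```

For example, 100 bytes free against a 300-byte reserve yields -200, i.e. 200 bytes into the red zone.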

K. Moon

Aug 28, 2025, 3:15:33 PM
to Joe Mason, Marc Treib, Alexei Svitkine, Avi Drissman, François Doray, cxx, Peter Kasting
I do worry that a negative byte "count" is maybe conflating concepts we don't want to conflate. Differences aren't really the same thing as a size, even if they're measured in the same units (bytes). We see this sort of dichotomy with time types, for example.

Alexei Svitkine

Aug 28, 2025, 4:21:37 PM
to François Doray, K. Moon, Joe Mason, Marc Treib, Avi Drissman, cxx, Peter Kasting
My preference is signed per my earlier reply, but imho the biggest pro for unsigned is:

unsigned
+ When taking this type as a param, can assume it's not negative.

On Thu, Aug 28, 2025 at 4:10 PM François Doray <fdo...@google.com> wrote:
The use of a signed integer for base::ByteCount was inspired by the use of a signed integer for base::TimeDelta.

Comparing the 2 choices:

signed
+ Subtracting [large value] from [small value] works.
+ Can represent a possibly negative byte delta [example]
- Cannot represent values larger than 8EiB (my assumption is that this is rarely needed)

unsigned
- Subtracting [large value] from [small value] is an error.
- Cannot represent a possibly negative byte delta [example]
+ Can represent values larger than 8EiB

Joe Mason

Aug 28, 2025, 5:06:24 PM
to François Doray, K. Moon, Marc Treib, Alexei Svitkine, Avi Drissman, cxx, Peter Kasting
Another +ve of unsigned is that the overflow behaviour of unsigned integers is well-defined (it wraps around), while signed integer overflow is undefined behaviour.

Unsigned integers are preferred in security-sensitive code (like the size of buffers) because of this.
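
A small sketch of the wrap-around point, using hypothetical helper names: unsigned subtraction is always defined, but the wrapped result is usually not the size anyone wanted, so the caller still needs an explicit test.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned subtraction never invokes undefined behaviour: the result is
// reduced modulo 2^64, so 5 - 10 wraps to a very large positive value.
constexpr uint64_t WrappingSub(uint64_t a, uint64_t b) {
  return a - b;
}

// The wrapped value is well-defined but rarely meaningful as a size, so
// callers can test for the wrap before subtracting.
constexpr bool SubWouldWrap(uint64_t a, uint64_t b) {
  return b > a;
}
```

Here `WrappingSub(5, 10)` evaluates to `UINT64_MAX - 4`, which is exactly the kind of value that becomes dangerous if it is later used as a buffer size.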

I don't think ByteCount is intended for buffer sizes, though. The main purpose is for measurements that are usually given in KiB, MiB, etc, to avoid confusion over which unit this particular variable holds. As a security reviewer, I'd look oddly at a ByteCount used to exactly measure a buffer size, since I'd expect that ByteCounts are used for conversions and might well be rounded. Maybe the name could be improved? (I don't want to get into too much byteshedding, though.)

Speaking of TimeDelta, maybe make ByteCount unsigned and name the signed equivalent ByteDelta?

Peter Kasting

Aug 28, 2025, 5:15:24 PM
to François Doray, K. Moon, Joe Mason, Marc Treib, Alexei Svitkine, Avi Drissman, cxx
On Thu, Aug 28, 2025 at 1:10 PM François Doray <fdo...@google.com> wrote:
The use of a signed integer for base::ByteCount was inspired by the use of a signed integer for base::TimeDelta.

As Kahmy said, though, this conflates differences and sizes, which are measured in the same units here but aren't the same concept. `base::TimeDelta` says in the name that it's a difference, so a negative value makes semantic sense. `ByteCount`, OTOH, is a size, and a negative size isn't meaningful.

As Joe alludes to, if you want to be able to mean both, you should probably have two separate types, ByteSize and ByteSizeDelta. The former is unsigned, the latter is signed. Subtracting one ByteSize from another gives a ByteSizeDelta, and so forth. This is precisely how the Time[Ticks] vs. TimeDelta types work. This also lets functions be clearer about which concept they mean, and makes it more obvious when sanity-checking for negatives is important.

PK
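
The type relationships described above can be sketched as follows. The struct definitions are hypothetical stand-ins with no range checking, and the operators assume all sizes fit in an int64_t; the point is only which type each operation yields.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for the proposed types; no range checking here.
struct ByteSizeDelta {
  int64_t bytes = 0;
};
struct ByteSize {
  uint64_t bytes = 0;
};

// size - size -> delta (may be negative).
// Assumes both operands are at most INT64_MAX.
constexpr ByteSizeDelta operator-(ByteSize a, ByteSize b) {
  return {static_cast<int64_t>(a.bytes) - static_cast<int64_t>(b.bytes)};
}

// size + delta -> size, mirroring Time + TimeDelta -> Time.
// Assumes the mathematical result is non-negative; the unsigned addition
// then produces it even when the delta is negative.
constexpr ByteSize operator+(ByteSize a, ByteSizeDelta d) {
  return {a.bytes + static_cast<uint64_t>(d.bytes)};
}

// The result types encode which concept each expression represents.
static_assert(
    std::is_same_v<decltype(ByteSize{} - ByteSize{}), ByteSizeDelta>);
static_assert(
    std::is_same_v<decltype(ByteSize{} + ByteSizeDelta{}), ByteSize>);
```

A function that takes a `ByteSize` parameter then documents, in its signature, that negative values are not expected.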

Leszek Swirski

Aug 29, 2025, 4:18:17 AM
to Peter Kasting, François Doray, K. Moon, Joe Mason, Marc Treib, Alexei Svitkine, Avi Drissman, cxx
Having signed types for diff and unsigned types for size has its own caveats, like how there would be cases of undefined subtraction between two valid sizes (like how ptrdiff_t can be UB for large enough pointer subtraction); unless ByteCount was constrained to the intersection of int64_t and uint64_t range, in which case it doesn't super matter which underlying representation it has: we either have dodgy code when it exits its valid range or it needs to anyway be runtime checked for exceeding its valid range.

- Leszek


François Doray

Aug 29, 2025, 2:10:00 PM
to K. Moon, Joe Mason, Marc Treib, Alexei Svitkine, Avi Drissman, cxx, Peter Kasting
The use of a signed integer for base::ByteCount was inspired by the use of a signed integer for base::TimeDelta.

Comparing the 2 choices:

signed
+ Subtracting [large value] from [small value] works.
+ Can represent a possibly negative byte delta [example]
- Cannot represent values larger than 8EiB (my assumption is that this is rarely needed)

unsigned
- Subtracting [large value] from [small value] is an error.
- Cannot represent a possibly negative byte delta [example]
+ Can represent values larger than 8EiB


François Doray

Aug 29, 2025, 2:10:00 PM
to Peter Kasting, Leszek Swirski, K. Moon, Joe Mason, Marc Treib, Alexei Svitkine, Avi Drissman, cxx
base::ByteCount uses CheckedNumeric to ensure that no overflow goes unnoticed.

I decided not to have separate types for "absolute quantity" and "delta" when I implemented the API, because we cannot deduce the right type to use for the result of an operation without extra context:

base::ByteCount total_ram;
base::ByteCount used_ram;
base::ByteCount available_ram = total_ram - used_ram; // we want the result of this subtraction to be a ByteSize

base::ByteCount previous_size;
base::ByteCount new_size;
base::ByteCount size_delta = new_size - previous_size; // we want the result of this subtraction to be a ByteSizeDelta

base::ByteCount bytes_transferred_at_beginning_of_interval;
base::ByteCount bytes_transferred_at_end_of_interval;
base::ByteCount bytes_transferred_during_interval; // depending on the larger context, we could argue that this is a ByteSize (absolute amount of bytes transferred during the interval) or a ByteSizeDelta (variation of the total amount of bytes transferred)

base::ByteCount with a signed integer ensures that a subtraction of normal quantities of bytes is always well-defined, and it's up to the calling code to decide how to handle negative values. asvitkine@ is right that a downside of this implementation choice is that code receiving a ByteCount cannot assume that it's a positive value.
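
As a rough illustration of what "checked" means for a signed subtraction (this is not the base::CheckedNumeric implementation, and `CheckedSub` is a hypothetical name), the sketch below subtracts only after confirming that the result is representable, so signed overflow never actually happens.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: returns a - b, or std::nullopt if the mathematical
// result does not fit in int64_t. The range tests are written so that they
// cannot overflow themselves.
constexpr std::optional<int64_t> CheckedSub(int64_t a, int64_t b) {
  if (b > 0 && a < INT64_MIN + b) {
    return std::nullopt;  // a - b would fall below INT64_MIN.
  }
  if (b < 0 && a > INT64_MAX + b) {
    return std::nullopt;  // a - b would exceed INT64_MAX.
  }
  return a - b;
}
```

Subtracting a larger everyday quantity from a smaller one, such as `CheckedSub(5, 10)`, simply yields -5; only results outside the int64_t range are rejected.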

On Fri, Aug 29, 2025 at 10:40 AM Peter Kasting <zer...@gmail.com> wrote:
On Fri, Aug 29, 2025 at 1:18 AM Leszek Swirski <les...@chromium.org> wrote:
Having signed types for diff and unsigned types for size has its own caveats, like how there would be cases of undefined subtraction between two valid sizes (like how ptrdiff_t can be UB for large enough pointer subtraction); unless ByteCount was constrained to the intersection of int64_t and uint64_t range, in which case it doesn't super matter which underlying representation it has: we either have dodgy code when it exits its valid range or it needs to anyway be runtime checked for exceeding its valid range.

Using CheckedNumeric under the hood would eliminate this problem, as it would automatically do the runtime check that the result is valid.

It would also constrain the set of problem cases to O(1): the implementation of these types themselves. This is preferable to potentially having problems at the usage sites.

PK

Peter Kasting

Aug 29, 2025, 2:10:07 PM
to Leszek Swirski, François Doray, K. Moon, Joe Mason, Marc Treib, Alexei Svitkine, Avi Drissman, cxx
On Fri, Aug 29, 2025 at 1:18 AM Leszek Swirski <les...@chromium.org> wrote:
Having signed types for diff and unsigned types for size has its own caveats, like how there would be cases of undefined subtraction between two valid sizes (like how ptrdiff_t can be UB for large enough pointer subtraction); unless ByteCount was constrained to the intersection of int64_t and uint64_t range, in which case it doesn't super matter which underlying representation it has: we either have dodgy code when it exits its valid range or it needs to anyway be runtime checked for exceeding its valid range.

Peter Kasting

Aug 29, 2025, 7:57:46 PM
to François Doray, Leszek Swirski, K. Moon, Joe Mason, Marc Treib, Alexei Svitkine, Avi Drissman, cxx
On Fri, Aug 29, 2025 at 9:39 AM François Doray <fdo...@google.com> wrote:
I decided not to have separate types for "absolute quantity" and "delta" when I implemented the API, because we cannot deduce the right type to use for the result of an operation without extra context:

base::ByteCount total_ram;
base::ByteCount used_ram;
base::ByteCount available_ram = total_ram - used_ram; // we want the result of this subtraction to be a ByteSize

base::ByteCount previous_size;
base::ByteCount new_size;
base::ByteCount size_delta = new_size - previous_size; // we want the result of this subtraction to be a ByteSizeDelta

base::ByteCount bytes_transferred_at_beginning_of_interval;
base::ByteCount bytes_transferred_at_end_of_interval;
base::ByteCount bytes_transferred_during_interval; // depending on the larger context, we could argue that this is a ByteSize (absolute amount of bytes transferred during the interval) or a ByteSizeDelta (variation of the total amount of bytes transferred)

Agree that these sorts of distinctions are important, and I appreciate your illustrative choices. I don't know that using a signed type unconditionally is the ideal answer, though.

One route (akin to some things base/time and ui/gfx/geometry do) would be to provide `ByteSize base::ByteSizeDelta::AsByteSize() const;`, which is effectively a `checked_cast`. So you'd have:

base::ByteSize total_ram;
base::ByteSize used_ram;
base::ByteSize available_ram = (total_ram - used_ram).AsByteSize();  // CHECKs if used_ram > total_ram

This way callers can make their intent clear. While it looks cumbersome at first, it's safer than leaving "are negative values possible/meaningful here" to call sites, and I suspect it wouldn't actually be used as frequently as people think.

(On its face this may appear to mirror size_t and ptrdiff_t, but there are important differences: subtracting two size_ts gives a size_t and may easily underflow, while subtracting two ByteSizes gives a ByteSizeDelta and will CHECK if the result is outside the representable range; and conversion between size_t and ptrdiff_t is two way and likely implicit (depending on compiler flags), while conversion between ByteSize and ByteSizeDelta is one-way and explicit. These distinctions reduce the footgun possibilities significantly.)

PK
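
A minimal sketch of the explicit, one-way conversion proposed above. The class shapes and accessor names are assumptions made for illustration, `std::abort()` stands in for CHECK, and the subtraction assumes both sizes are at most INT64_MAX.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical sketch of the proposed API; not the //base implementation.
class ByteSize {
 public:
  constexpr explicit ByteSize(uint64_t bytes) : bytes_(bytes) {}
  constexpr uint64_t InBytes() const { return bytes_; }

 private:
  uint64_t bytes_;
};

class ByteSizeDelta {
 public:
  constexpr explicit ByteSizeDelta(int64_t bytes) : bytes_(bytes) {}
  constexpr int64_t InBytes() const { return bytes_; }

  // Explicit, one-way conversion back to a size. A negative delta is a
  // programming error here; std::abort() stands in for CHECK.
  ByteSize AsByteSize() const {
    if (bytes_ < 0) {
      std::abort();
    }
    return ByteSize(static_cast<uint64_t>(bytes_));
  }

 private:
  int64_t bytes_;
};

// size - size -> delta. Assumes both operands are at most INT64_MAX; a real
// implementation would range-check here as well.
inline ByteSizeDelta operator-(ByteSize a, ByteSize b) {
  return ByteSizeDelta(static_cast<int64_t>(a.InBytes()) -
                       static_cast<int64_t>(b.InBytes()));
}
```

Call sites that expect a size write `(total_ram - used_ram).AsByteSize()`, while call sites that want the signed difference keep the `ByteSizeDelta` and never hit the check.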

K. Moon

Aug 29, 2025, 8:35:14 PM
to Peter Kasting, François Doray, Leszek Swirski, Joe Mason, Marc Treib, Alexei Svitkine, Avi Drissman, cxx
I agree that you want to explicitly convert from a byte difference to a byte size. The available RAM example has me wondering how this is going to handle potentially negative available RAM. Using the right types eliminates that class of problems.

The conversions are only context-dependent because they're missing the explicit intent they should have.

Avi Drissman

Sep 3, 2025, 8:06:21 PM
to K. Moon, Peter Kasting, François Doray, Leszek Swirski, Joe Mason, Marc Treib, Alexei Svitkine, cxx
Are we making progress on a decision? Should I land my pending ByteCount CLs while we discuss this further or should I continue to hold off?

Avi

K. Moon

Sep 3, 2025, 8:09:35 PM
to Avi Drissman, Peter Kasting, François Doray, Leszek Swirski, Joe Mason, Marc Treib, Alexei Svitkine, cxx
I think using ByteCount in an unsigned way would be unobjectionable to everyone. I think using it for differences doesn't have consensus.

My vote would be for two types here, a count type and a difference type, or something along those lines.

Peter Kasting

Sep 3, 2025, 8:16:24 PM
to K. Moon, Avi Drissman, François Doray, Leszek Swirski, Joe Mason, Marc Treib, Alexei Svitkine, cxx
By "using in an unsigned way" do you mean making the underlying type unsigned and using everywhere, making it unsigned and using only for non-differences, leaving it signed and using only for non-differences and inserting casts when necessary, ...?

I ask because my initial suggestion was basically "make unsigned and use everywhere" and it definitely didn't seem like that was unobjectionable to everyone. 

PK

K. Moon

Sep 3, 2025, 8:21:46 PM
to Peter Kasting, Avi Drissman, François Doray, Leszek Swirski, Joe Mason, Marc Treib, Alexei Svitkine, cxx
My suggestion boils down to, "If this wouldn't change if the underlying type became unsigned, you can keep going. Otherwise, wait."

I agree that there isn't consensus on what these types should mean or how they should be implemented, but treating a ByteCount as a count seems fine.

Peter Kasting

Sep 3, 2025, 9:06:49 PM
to K. Moon, Avi Drissman, François Doray, Leszek Swirski, Joe Mason, Marc Treib, Alexei Svitkine, cxx
Sure, I think usage that would literally require no changes (adding/removing casts) if we were to change the underlying type is uncontroversial. Knowing whether that's the case might be less obvious :)

PK

Alexei Svitkine

Sep 4, 2025, 10:26:04 AM
to Peter Kasting, K. Moon, Avi Drissman, François Doray, Leszek Swirski, Joe Mason, Marc Treib, cxx
Re: 

base::ByteSize total_ram;
base::ByteSize used_ram;
base::ByteSize available_ram = (total_ram - used_ram).AsByteSize();  // CHECKs if used_ram > total_ram

That SGTM too. My vote for a signed type was in the scenario where we just have one type. But the ByteSizeDelta and above helper approach SGTM.

Jeremy Roman

Sep 4, 2025, 2:24:54 PM
to Alexei Svitkine, Peter Kasting, K. Moon, Avi Drissman, François Doray, Leszek Swirski, Joe Mason, Marc Treib, cxx
If I might bikeshed, ByteSize (or ByteCount) vs ByteOffset would be an option (off_t is analogous in name even if ptrdiff_t isn't).


Marc Treib

Sep 5, 2025, 4:24:02 AM
to Jeremy Roman, Alexei Svitkine, Peter Kasting, K. Moon, Avi Drissman, François Doray, Leszek Swirski, Joe Mason, cxx
+1 to ByteSize + ByteSizeDelta (or ByteCount / ByteOffset - I'm not gonna participate in the bikeshed ;P)

Peter Kasting

Sep 5, 2025, 10:08:29 AM
to Jeremy Roman, Alexei Svitkine, K. Moon, Avi Drissman, François Doray, Leszek Swirski, Joe Mason, Marc Treib, cxx
Yes, ByteOffset seems mildly better than ByteSizeDelta IMO.

PK

K. Moon

Sep 5, 2025, 10:19:18 AM
to Peter Kasting, Jeremy Roman, Alexei Svitkine, Avi Drissman, François Doray, Leszek Swirski, Joe Mason, Marc Treib, cxx
I kinda like consistency with the similar time types, but I think the important thing is that it sounds like there's consensus around having 2 types.

Presumably we can settle the bike shed color in the code review? 🙂

Avi Drissman

Sep 11, 2025, 2:26:14 PM
to K. Moon, Peter Kasting, Jeremy Roman, Alexei Svitkine, François Doray, Leszek Swirski, Joe Mason, Marc Treib, cxx
@François Doray you were the originator of the type; do you have an opinion here?

Right now, ByteCount seems to be in limbo, and either we should agree on a way forward so that work can continue, or we should unwind things.

Avi

François Doray

Sep 12, 2025, 12:40:12 PM
to Avi Drissman, K. Moon, Peter Kasting, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Joe Mason, Marc Treib, cxx
Hi everyone!

Thanks for sharing your thoughts. I'm now convinced that our best end state is an unsigned ByteSize and a signed ByteSizeDelta.
  • Code consuming ByteSize can safely assume the value is non-negative and doesn't need to handle negative cases. This is the typical case.
  • As pkasting@ demonstrated, we can have type conversion for cases where a difference conceptually yields a ByteSize, instead of a ByteSizeDelta:
    • base::ByteSize available_ram = (total_ram - used_ram).AsByteSize();

However, my primary concern is the migration path. One reason I initially went with a signed ByteCount is that our existing codebase uses signed integers for "byte sizes" [examples: 1, 2, 3]. My assessment was that negative byte counts were more common than positive counts exceeding the max int64_t, so using int64_t would make the migration easier (no need to change logic as part of the migration). This assessment seems correct, as the migration to ByteCount has progressed without triggering CheckedNumeric failures [code].

Given this, I would be more comfortable reaching our end state through the following intermediate steps:

Step 1:
Introduce ByteSize and ByteSizeDelta, but keep ByteCount.
  • ByteCount (signed, with a comment recommending ByteSize/ByteSizeDelta instead)
  • ByteSize (unsigned)
  • ByteSizeDelta (signed)
  • ByteSize/Count - ByteSize/Count → ByteSizeDelta
  • ByteSize + ByteCount → ByteCount
  • ByteSizeDelta::AsByteSize() to allow conversion

Step 2:
  • Disallow construction of ByteCount from unsigned variables or integer literals at compile-time. This forces all simple cases to migrate to ByteSize. We can still create ByteCount from a signed integer variable.
  • Use runtime checks (DCHECKs, then DumpWithoutCrashing) to find instances of ByteCount holding a negative value, so we can make necessary adjustments.

Step 3:
  • Once we have sufficient confidence that all remaining ByteCount instances represent positive values, we migrate all ByteCount usage to ByteSize and remove ByteCount.
  • We probably want ByteSize::FromSignedBytes/FromSignedKiB/FromSignedMiB/etc. static methods to highlight cases where a ByteSize is created from a signed integer. A comment on these methods would indicate that the calling code must ensure that the value is positive, and this is enforced with runtime checks.

What do you think of this phased approach?

François

    Daniel Cheng

    Sep 12, 2025, 1:19:53 PM
    to François Doray, Avi Drissman, K. Moon, Peter Kasting, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Joe Mason, Marc Treib, cxx
    The staged rollout seems like a lot of work, but probably the safest approach. For step 3, one suggestion would be to have the "from signed" conversion helpers return an optional and have the caller decide if they want to handle it gracefully or not, but I don't feel strongly about that.
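
The optional-returning variant of the "from signed" helper could look roughly like the sketch below. `TryFromSignedBytes` and the class layout are hypothetical names chosen for illustration, not an agreed API.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical sketch; the names below are illustrative, not //base API.
class ByteSize {
 public:
  // Returns std::nullopt for negative input instead of crashing, so the
  // caller decides whether a negative value is fatal or recoverable.
  static std::optional<ByteSize> TryFromSignedBytes(int64_t bytes) {
    if (bytes < 0) {
      return std::nullopt;
    }
    return ByteSize(static_cast<uint64_t>(bytes));
  }

  constexpr uint64_t InBytes() const { return bytes_; }

 private:
  constexpr explicit ByteSize(uint64_t bytes) : bytes_(bytes) {}

  uint64_t bytes_;
};
```

A caller that treats a negative value as a bug can still crash explicitly on the empty optional, while a caller reading an untrusted or legacy value (for example a -1 error code) can branch on it.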

    Naming-wise, I'd prefer the name ByteOffset just to make it a bit shorter, and it has precedent in `off_t` (which is admittedly not the best name) :)
    80 columns is not a lot.

    Alternatively we could do `ByteDelta` for consistency with `TimeDelta` but `ByteDelta` doesn't seem like the best name.

    Daniel

    K. Moon

    Sep 12, 2025, 2:04:29 PM
    to Daniel Cheng, François Doray, Avi Drissman, Peter Kasting, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Joe Mason, Marc Treib, cxx
    Glad to see this is moving in a productive direction!

    I think a gradual migration can make sense if there's a lot of existing usage that needs to be fixed up. (Note that this excludes existing usage that is already correct as unsigned.) If there isn't, then it's probably not worth it. I don't have data on which would be better in this case.

    As far as construction goes, I'd try to emulate CheckedNumeric. It should be impossible to ever construct a "signed" ByteSize anyway, so I don't think we need a bunch of helpers to enforce this, necessarily, but no strong opinions if some would still be useful.

    I think ByteDelta would only make sense if the fundamental type was Byte, which doesn't seem great. So +1 to sticking with ByteSize/ByteSizeDelta, or something similar.

    Similarity in naming is great for APIs because of Principle of Least Astonishment: You see a thing, and another similarly-named thing behaves similarly. Which is why I like the Foo/FooDelta naming.

    Avi Drissman

    Sep 30, 2025, 11:32:30 AM
    to K. Moon, Daniel Cheng, François Doray, Peter Kasting, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Joe Mason, Marc Treib, cxx
    Right now we have base::ByteCount, which is apparently not the direction that we want to go in, which is used in several key base/ utility files, and for which I have a bunch of pending CLs.

    We seem to be in agreement about having ByteSize/ByteOffset types but there's been no movement on making that happen.

    What do we do here? Is there movement? I don't see it as tenable to just let our code base sit with a core base type we have decided we don't want.

    Avi

    Joe Mason

    Oct 1, 2025, 1:38:15 PM
    to Avi Drissman, K. Moon, Daniel Cheng, François Doray, Peter Kasting, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    Thanks for the reminder. I just discussed this with François - I'll land the ByteSize & ByteSizeDelta types and mark ByteCount as deprecated with a comment. Not sure if I'll get to the next steps of removing existing uses of ByteCount, though. I'll ping the thread once ByteSize & ByteSizeDelta are available to see who should pick up the rest of the work.

    Avi Drissman

    Oct 7, 2025, 1:54:34 PM
    to Joe Mason, K. Moon, Daniel Cheng, François Doray, Peter Kasting, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    Even though I posted a note on the ByteCount bug that we're going in a different direction, CLs are continuing to be landed adding more usage of ByteCount, CLs that have approvals that post-date the decision made here.

    I'm tempted to revert, but at least, can we please stop adding new usage of ByteCount if we're not going in that direction?

    Avi

    Joe Mason

    Oct 8, 2025, 10:49:22 AM
    to Avi Drissman, K. Moon, Daniel Cheng, François Doray, Peter Kasting, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    Wow, boundary conditions are hard.

    I've written a ByteSize (wraps an underlying uint64_t) and a ByteSizeDelta (wraps an underlying int64_t), but I'm having trouble getting the semantics of subtraction right. Like ByteCount these use checked arithmetic so crash on out of bounds results.

    It seems obvious that ByteSize - ByteSize should return a ByteSizeDelta, to allow for negative results. Also ByteSize + ByteSizeDelta should return ByteSizeDelta, since the delta may be negative. But that means a ByteSize that's higher than INT64_MAX will fail in surprising ways.

    For example with ByteSize b(INT64_MAX + 10):

    b - 5 is out of bounds, and will crash.
    b + ByteSizeDelta(5) is out of bounds, and will crash.

    Those are baked into the semantic of returning ByteSizeDelta. Is that too surprising?

    b - 15 SHOULD be in bounds, because it fits in ByteSizeDelta.
    b - (INT64_MAX + 20) = -10 should also be in bounds.

    But it's surprisingly hard to implement subtraction in a way that satisfies both. CheckedNumeric<uint64_t>(b) minus (other type) returns a uint64_t, which doesn't allow any negative results. (That was surprising to me.) CheckedNumeric<int64_t>(b) minus (other type) returns an int64_t, but fails whenever b is out of range for int64_t before the subtraction. I think I could figure out a clever way to implement this semantic, but I'm not sure I should bother if the semantic isn't what we want anyway.

    Should ByteSize - ByteSize just return a ByteSize, with some other function to return it as a ByteSizeDelta? (This doesn't seem like what most users would want.)
    Should ByteSize actually hold a uint32_t, so its range completely overlaps ByteSizeDelta? That's the easiest semantic to reason about, but seems limiting.
    Should ByteSizeDelta hold a signed 128 bit int? That feels like overkill...
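
For what it's worth, one possible way to get both properties with plain 64-bit integers is to compare first, subtract in unsigned space, and then check the magnitude. This is only a sketch under that assumption; `SubtractToDelta` and `kBig` are hypothetical names, and it does not use CheckedNumeric.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: returns a - b as a signed value, or std::nullopt if
// the mathematical result does not fit in int64_t. Works for the full
// uint64_t range of both operands.
constexpr std::optional<int64_t> SubtractToDelta(uint64_t a, uint64_t b) {
  constexpr uint64_t kMaxPositive = static_cast<uint64_t>(INT64_MAX);
  if (a >= b) {
    const uint64_t diff = a - b;  // Non-negative result; cannot wrap.
    if (diff > kMaxPositive) {
      return std::nullopt;
    }
    return static_cast<int64_t>(diff);
  }
  const uint64_t magnitude = b - a;  // Magnitude of a negative result.
  if (magnitude > kMaxPositive + 1) {
    return std::nullopt;
  }
  if (magnitude == kMaxPositive + 1) {
    return INT64_MIN;  // Negating 2^63 as int64_t would overflow.
  }
  return -static_cast<int64_t>(magnitude);
}

// The example value from the message: INT64_MAX + 10.
constexpr uint64_t kBig = static_cast<uint64_t>(INT64_MAX) + 10;
```

With `kBig` as the left operand, subtracting 5 is rejected, subtracting 15 gives `INT64_MAX - 5`, and subtracting `kBig + 10` gives -10.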


    Joe Mason

    Oct 8, 2025, 11:00:18 AM
    to Avi Drissman, K. Moon, Daniel Cheng, François Doray, Peter Kasting, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    Since ByteSize isn't ready yet (see my other post), I think it's ok to convert uses of int64_t to ByteCount: we would still want to switch them from int64_t to either ByteSize or ByteSizeDelta later, and it's no harder (in fact, probably easier) to switch from ByteCount. So it won't add extra conversion work, and in the meantime using ByteCount adds safety & clarity.

    There's also a situation where ByteCount is useful as an intermediate step: a lot of uses of int64_t are just to use -1 as an error code. Those should really be changed to optional<ByteSize>, but that's a lot harder to do mechanically. It would be nice to do this as two steps: changing from int64_t to ByteCount (gaining safety & clarity around units), then change ByteCount to optional<ByteSize> (gaining safety & clarity around error handling). Using ByteSizeDelta as the intermediate step would be possible, but ByteCount's a useful marker that this type is still in migration.

    Peter Kasting

    Oct 8, 2025, 12:14:15 PM
    to Joe Mason, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    ByteSize + ByteSizeDelta should be a ByteSize, not a ByteSizeDelta. Follow the example of the Time and TimeDelta operators.

    PK

    Peter Kasting

    Oct 8, 2025, 12:20:58 PM
    to Joe Mason, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    (This is also similar to how point + vector2d is point, not vector2d. And yes, the delta may be negative; that's fine.)

    PK

    Joe Mason

    Oct 8, 2025, 2:24:03 PM
    to Peter Kasting, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    François pointed out offline something obvious that I missed here:

    > Should ByteSize actually hold a uint32_t, so its range completely overlaps ByteSizeDelta? That's the easiest semantic to reason about, but seems limiting.

    uint32_t isn't the only way to get the same positive range as int64_t. I can constrain ByteSize to only use 63 bits of a uint64_t to keep most of the range, and still mean that every possible ByteSize can convert to a ByteSizeDelta. (Equivalently, store an int64_t internally and add checks to enforce that it's never negative.) That should take care of every corner case I mentioned.

    On Wed, Oct 8, 2025 at 12:20 PM Peter Kasting <zer...@gmail.com> wrote:
    (This is also similar to how point + vector2d is point, not vector2d. And yes, the delta may be negative; that's fine.)

    PK

    On Wed, Oct 8, 2025, 9:13 AM Peter Kasting <zer...@gmail.com> wrote:
    ByteSize + ByteSizeDelta should be a ByteSize, not a ByteSizeDelta. Follow the example of the Time and TimeDelta operators.

    PK

    I had an argument against this, but while typing it up I talked myself out of it.

    The difference there is that point and Time can both be negative, so Time(1) + TimeDelta(-2) can return Time(-1). It would be surprisingly asymmetrical if ByteSize(1) - ByteSize(2) returns ByteSizeDelta(-1), but ByteSize(1) + ByteSizeDelta(-2) crashes.

    On the other hand it's also surprisingly asymmetrical that ByteSize(1) - ByteSize(2) returns ByteSizeDelta(-1), but "ByteSize b(1); b -= ByteSize(2)" crashes. And the asymmetry between "ByteSize - ByteSize" and "ByteSize + ByteSizeDelta" is encoded in the return type so it's not really that surprising.

    And on the gripping hand, the reason Time + TimeDelta returns a Time is that Time has a fixed origin point (the epoch) while TimeDelta is relative. Adding a delta to Time doesn't change the origin point so it returns a Time. You could also view ByteSize as having a fixed origin point (0) so the same logic applies.

    So:

    ByteSize + ByteSize -> ByteSize (crashes if the result > INT64_MAX)
    ByteSize - ByteSize -> ByteSizeDelta (crashes if the result < INT64_MIN)
    ByteSize +- ByteSizeDelta -> ByteSize (crashes if the result < 0 or > INT64_MAX)
    ByteSizeDelta +- ByteSizeDelta -> ByteSizeDelta (crashes if the result < INT64_MIN or > INT64_MAX)
    ByteSizeDelta +- ByteSize -> undefined, to parallel TimeDelta

    To do "byte_size + delta" and allow a negative result, use "byte_size.AsByteSizeDelta() + delta".
    To do "delta + byte_size", either change it to "byte_size + delta" (returns ByteSize) or do "delta + byte_size.AsByteSizeDelta()" (returns ByteSizeDelta).
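
A minimal sketch of the operator table above, assuming ByteSize stores an int64_t that is checked to be non-negative (the "63 bits" option). The classes here are simplified stand-ins, not the real base:: implementation, and `Check`, `CheckedAdd`, and `CheckedSub` are hypothetical helpers that abort where the table says "crashes". It follows the table as written, so `ByteSizeDelta + ByteSize` is left undefined.

```cpp
#include <cstdint>
#include <cstdlib>
#include <limits>

// Hypothetical stand-in for a crash-on-failure check.
inline void Check(bool ok) {
  if (!ok) std::abort();
}

// int64_t addition that aborts instead of overflowing.
inline int64_t CheckedAdd(int64_t a, int64_t b) {
  constexpr int64_t kMax = std::numeric_limits<int64_t>::max();
  constexpr int64_t kMin = std::numeric_limits<int64_t>::min();
  Check(!(b > 0 && a > kMax - b) && !(b < 0 && a < kMin - b));
  return a + b;
}

// int64_t subtraction that aborts instead of overflowing.
inline int64_t CheckedSub(int64_t a, int64_t b) {
  constexpr int64_t kMax = std::numeric_limits<int64_t>::max();
  constexpr int64_t kMin = std::numeric_limits<int64_t>::min();
  Check(!(b < 0 && a > kMax + b) && !(b > 0 && a < kMin + b));
  return a - b;
}

// A signed difference between two sizes.
class ByteSizeDelta {
 public:
  explicit ByteSizeDelta(int64_t bytes) : bytes_(bytes) {}
  int64_t InBytes() const { return bytes_; }

  friend ByteSizeDelta operator+(ByteSizeDelta a, ByteSizeDelta b) {
    return ByteSizeDelta(CheckedAdd(a.bytes_, b.bytes_));
  }
  friend ByteSizeDelta operator-(ByteSizeDelta a, ByteSizeDelta b) {
    return ByteSizeDelta(CheckedSub(a.bytes_, b.bytes_));
  }

 private:
  int64_t bytes_;
};

// A non-negative size in the range [0, INT64_MAX].
class ByteSize {
 public:
  explicit ByteSize(int64_t bytes) : bytes_(bytes) { Check(bytes >= 0); }
  int64_t InBytes() const { return bytes_; }

  // Always valid, because every ByteSize fits in an int64_t.
  ByteSizeDelta AsByteSizeDelta() const { return ByteSizeDelta(bytes_); }

  // ByteSize + ByteSize -> ByteSize (aborts above INT64_MAX).
  friend ByteSize operator+(ByteSize a, ByteSize b) {
    return ByteSize(CheckedAdd(a.bytes_, b.bytes_));
  }
  // ByteSize - ByteSize -> ByteSizeDelta. With non-negative operands the
  // result always fits in an int64_t; the check is kept for uniformity.
  friend ByteSizeDelta operator-(ByteSize a, ByteSize b) {
    return ByteSizeDelta(CheckedSub(a.bytes_, b.bytes_));
  }
  // ByteSize +- ByteSizeDelta -> ByteSize (aborts if the result is negative
  // or above INT64_MAX; the constructor enforces the lower bound).
  friend ByteSize operator+(ByteSize a, ByteSizeDelta d) {
    return ByteSize(CheckedAdd(a.bytes_, d.InBytes()));
  }
  friend ByteSize operator-(ByteSize a, ByteSizeDelta d) {
    return ByteSize(CheckedSub(a.bytes_, d.InBytes()));
  }

 private:
  int64_t bytes_;
};
```

With these definitions, `ByteSize(1) - ByteSize(2)` yields a delta of -1 bytes, `ByteSize(1) + ByteSizeDelta(-2)` aborts, and `ByteSize(1).AsByteSizeDelta() + ByteSizeDelta(-2)` yields a delta of -1 bytes.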

    Peter Kasting

    unread,
    Oct 8, 2025, 3:20:27 PMOct 8
    to Joe Mason, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    ByteSizeDelta + ByteSize should be well-defined and equivalent to the other order. ByteSizeDelta - ByteSize, however, should not compile. Addition is commutative; subtraction is not. And this is coherent with how subtraction is equivalent to adding a negative: negating a ByteSize is nonsensical.

    IIRC this is how Time types work also.

    I don't instantly have a strong opinion about limiting ByteSize to INT64_MAX. My gut feel is that that is a mistake and you will forget to manually check edge cases, and it would be better to use the full range of the type and rely entirely on the checked numeric code. But I am not certain without thinking deeply.

    PK

    Joe Mason

    unread,
    Oct 8, 2025, 3:25:32 PMOct 8
    to Peter Kasting, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    On Wed, Oct 8, 2025 at 3:20 PM Peter Kasting <zer...@gmail.com> wrote:
    ByteSizeDelta + ByteSize should be well-defined and equivalent to the other order. ByteSizeDelta - ByteSize, however, should not compile. Addition is commutative; subtraction is not. And this is coherent with how subtraction is equivalent to adding a negative: negating a ByteSize is nonsensical.

    IIRC this is how Time types work also.

    Ah, you're right - TimeDelta + Time is defined in a templated function outside the class, so I didn't notice it.
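
A rough illustration of that shape, assuming bare structs in place of the real classes and omitting all range checks: the reversed-order addition is a non-member function that forwards to `size + delta`, and no `delta - size` overload exists. `CanSubtract` is a hypothetical detection trait used only to show at compile time which expressions are well-formed.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Bare stand-ins for the real classes; no overflow or range checks here.
struct ByteSizeDelta { int64_t bytes; };
struct ByteSize { int64_t bytes; };

// size + delta and size - delta both produce a size.
constexpr ByteSize operator+(ByteSize s, ByteSizeDelta d) {
  return ByteSize{s.bytes + d.bytes};
}
constexpr ByteSize operator-(ByteSize s, ByteSizeDelta d) {
  return ByteSize{s.bytes - d.bytes};
}

// Addition is commutative, so the reversed order forwards to the one above.
constexpr ByteSize operator+(ByteSizeDelta d, ByteSize s) {
  return s + d;
}

// Deliberately no operator-(ByteSizeDelta, ByteSize): negating a size is
// meaningless, so that expression should fail to compile.

// Hypothetical trait: true when `L - R` is a well-formed expression.
template <typename L, typename R, typename = void>
struct CanSubtract : std::false_type {};
template <typename L, typename R>
struct CanSubtract<
    L, R, std::void_t<decltype(std::declval<L>() - std::declval<R>())>>
    : std::true_type {};

static_assert(CanSubtract<ByteSize, ByteSizeDelta>::value,
              "size - delta is defined");
static_assert(!CanSubtract<ByteSizeDelta, ByteSize>::value,
              "delta - size is not defined");
```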

    Gabriel Charette

    unread,
    Oct 8, 2025, 10:19:14 PMOct 8
    to Joe Mason, Peter Kasting, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    Chiming in as the original reviewer of this class.

    The reason I was okay with not having a delta-class is that it's not as clear cut as with Time.

    For instance, TotalPhysicalMemory - UsedPhysicalMemory = AvailablePhysicalMemory. All of these are ByteCounts which can be represented independently, not deltas, yet the math makes sense too.

    Because it was confusing to represent some things, we opted to alleviate the confusion by not having a delta type altogether.

    If we did add a delta type, I concur that it should look exactly like TimeBase and TimeDelta and TimeBase is int64_t underneath (matching the style guide which says to use signed types for everything but bitfields). Thus I don't believe we should have unsigned storage and Francois had concluded the same in the original implementation (hitting edge cases when considering unsigned).

    I'd also like to caution against some stop energy I'm feeling in this thread. Francois identified and addressed a problem caused by the absence of a ByteCount type. Some bugs have been caught and fixed since and semantics clarified at many callsites. We shouldn't block using it some more while we're discussing a delta-type (thiabaud@ in fact sits next to joenotcharles@ and they're in sync about the need to rebase on whoever lands first).

    ByteCount alone is a clear improvement on status quo, let's cherish that and not discourage such improvements in the future.

    Given there are cases where a delta-type doesn't make sense, I feel we're over-investing eng time here to tweak something which is by itself a sufficiently superior representation of what were integral types for years before.
    ByteCount is just a fancy type for "a number of bytes" and that can be absolute or relative, I don't think it's incorrect for it to also represent a delta.

    - Gab

    Avi Drissman

    unread,
    Oct 9, 2025, 10:31:14 AMOct 9
    to Gabriel Charette, Joe Mason, Peter Kasting, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    To be clear, I'm OK with ByteCount as-is as a signed type. I'm also OK with the plans to make an unsigned version. Either way is fine with me. The reason I brought this up on the list was that, in one of my refactor CLs, I received an expression of surprise from a senior engineer that ByteCount was signed. My objection to pouring work into the refactor is that we didn't seem to have consensus.

    I don't mean to bring "stop energy" here, but as someone who is doing these rewrites in spare time in-between other CLs, I don't want to spend time grinding through a change if it's only going to need to be changed again. Many of these CLs were requiring long discussions with experts about correct behavior, and I don't want to have to have a long discussion in which I have to defend why a signed byte count type is OK, only to then turn around and have to write a new CL in which I have to defend why an unsigned byte count type is OK.

    Avi

    Peter Kasting

    unread,
    Oct 9, 2025, 11:52:37 AMOct 9
    to Avi Drissman, Gabriel Charette, Joe Mason, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    I'm a bit confused by Gab's reply talking about confusion and stop energy when as far as I can tell we reached consensus on what changes to implement (having a delta type and making the count type unsigned) and Joe was in the process of implementing them, and now it sounds like the underlying message of Gab's mail is "I don't like that plan, I like status quo". If that doesn't inject a lot of confusion and stop energy for anyone else it certainly does for me. But perhaps I misconstrued the intent.

    If there's a need for further discussion or implementation assistance, please ping me directly. I am happy to VC or contribute a bit of engineering time, since I care quite a lot about this one.

    Otherwise I will assume Joe is proceeding with the plan above and there are no further reasons to stop.

    PK

    Greg Thompson

    unread,
    Oct 10, 2025, 6:36:38 AMOct 10
    to Gabriel Charette, Joe Mason, Peter Kasting, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    Not to stir the pot, but I disagree on this point:

    "For instance, TotalPhysicalMemory - UsedPhysicalMemory = AvailablePhysicalMemory. All of these are ByteCounts which can be represented independently, not deltas, yet the math makes sense too."

    The computed value is a delta between two other values by definition. There is nothing about that expression that says that the second operand must be less than the first, so it's perfectly valid for the result to be negative. If, in this particular context, the expectation is that the result should always be positive and that it should be interpreted as a positive count rather than a delta, then that should be reflected in the code. For example:

      avail = base::saturated_cast<ByteCount>(total - used);

    or maybe even checked_cast<> if you want to crash in the unexpected case.

    It seems that I'm in the "counts and deltas are different things and they shouldn't be confused" camp. If we do consider them to be two distinct things, then I like my counts to be unsigned, thankyouverymuch. :-)
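
To make the two options concrete, here is a small sketch on raw integers. `SaturatedToCount` and `CheckedToCount` are hypothetical stand-ins written for this illustration; they are not base::saturated_cast or base::checked_cast, and the checked variant returns an empty optional where a real checked cast would crash.

```cpp
#include <cstdint>
#include <optional>

// Clamps a negative delta to zero; otherwise keeps the value.
constexpr uint64_t SaturatedToCount(int64_t delta) {
  return delta < 0 ? 0 : static_cast<uint64_t>(delta);
}

// Rejects a negative delta instead of silently clamping it.
constexpr std::optional<uint64_t> CheckedToCount(int64_t delta) {
  if (delta < 0) {
    return std::nullopt;
  }
  return static_cast<uint64_t>(delta);
}
```

For example, with total = 100 and used = 120, `SaturatedToCount(total - used)` gives 0 available bytes, while `CheckedToCount(total - used)` reports the unexpected negative result to the caller.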

    Gabriel Charette

    unread,
    Oct 10, 2025, 10:33:30 AMOct 10
    to Greg Thompson, Gabriel Charette, Joe Mason, Peter Kasting, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx

    @PK: Sorry I came here without reading the entire history, precisely because I believe we're spending too much time bikeshedding this. I heard about this thread and fallout work in a lunch discussion, there's frustration on the ground that the original change was a positive delta but now we're essentially saying "by improving things from zero, you're signing up for the ensuing bikeshed" (and also that others should refrain from using the improved status while we're discussing the potential improved++ status). That's what I believe is stop energy. It's not encouraging folks to improve the status quo in the future.

    I'm grateful that Joe picked up what seemed like a good directional improvement, but he's now running into further edge cases and I'm skeptical that's the best use of our shared time. Hence I'm indeed questioning whether we should stop here in order to encourage future such zero->better improvements elsewhere (we're paving a cultural memory and I'm worried we're encoding "don't touch anything, you'll end up with a hot potato").

    > no further reasons to stop
    Eng time is finite and I believe we've already spent more than the expected return value. Byte counts have been integral for years. This has caused accounting bugs. ByteCount fixes that. I don't see what ByteSizeDelta addresses (other than the ability for ByteCount/ByteSize to be unsigned but that goes against the style guide and TimeBase and I'm strongly opposed to that).

    Le ven. 10 oct. 2025, 06 h 36, Greg Thompson <g...@chromium.org> a écrit :
    Not to stir the pot, but I disagree on this point:

    "For instance, TotalPhysicalMemory - UsedPhysicalMemory = AvailablePhysicalMemory. All of these are ByteCounts which can be represented independently, not deltas, yet the math makes sense too."

    The computed value is a delta between two other values by definition. There is nothing about that expression that says that the second operand must be less than the first, so it's perfectly valid for the result to be negative. If, in this particular context, the expectation is that the result should always be positive and that it should be interpreted as a positive count rather than a delta, then that should be reflected in the code. For example:

      avail = base::saturated_cast<ByteCount>(total - used);

    or maybe even checked_cast<> if you want to crash in the unexpected case.

    It seems that I'm in the "counts and deltas are different things and they shouldn't be confused" camp. If we do consider them to be two distinct things, then I like my counts to be unsigned, thankyouverymuch. :-)

    My point here is that all 3 of these values can be obtained from the OS as independent values (and thus ByteCount's) but it's also true that you can get from any of them to the 3rd one through delta maths.

    If we see ByteCount as just "an integral number of bytes", it makes sense to allow it to be both an absolute count and a delta.
    TimeTicks is different, `ticks_a - ticks_b` is never semantically another TimeTicks value.

    Re. signed vs unsigned: the style guide is firm and TimeBase obliges so I don't see why we should differ here.

    Greg Thompson

    unread,
    Oct 10, 2025, 10:57:30 AMOct 10
    to Gabriel Charette, Joe Mason, Peter Kasting, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    If we're talking about style guides, then don't forget this gem: "Use size_t for object and allocation sizes, object counts, array and pointer offsets, vector indices, and so on. This prevents casts when dealing with STL APIs, and if followed consistently across the codebase, minimizes casts elsewhere."

    Gabriel Charette

    unread,
    Oct 10, 2025, 12:27:43 PMOct 10
    to Greg Thompson, Gabriel Charette, Joe Mason, Peter Kasting, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    On Fri, Oct 10, 2025 at 10:57 AM Greg Thompson <g...@chromium.org> wrote:
    If we're talking about style guides, then don't forget this gem: "Use size_t for object and allocation sizes, object counts, array and pointer offsets, vector indices, and so on. This prevents casts when dealing with STL APIs, and if followed consistently across the codebase, minimizes casts elsewhere."

    Yikes, looks like we have a divergence between Chromium and Google styles here, I hadn't realized that. I wonder if it's intentional and why it's in a "platform-specific" section...

    I still believe it's a definitive improvement to have a type for "integer which is a number of bytes" (signed) and am not convinced about the value of going beyond that (given the edge cases surfaced in this thread). Thanks everyone for exploring this, but let's be wary of sunk costs and a feeling that we must commit to something else.

    Peter Kasting

    unread,
    Oct 10, 2025, 7:50:34 PMOct 10
    to Gabriel Charette, Greg Thompson, Joe Mason, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    The chromium style guidance to use size_t is very much intentional, and long-standing. It also does not actually diverge from the upstream style guide; the upstream guide supports use of unsigned types when doing so avoids conversions, which is important in this case since basically all stdlib and many os functions take/return unsigned types (e.g. size_t, uint64_t, DWORD). I've discussed the style guidance a lot with Google style arbiters and I don't think we want to change direction.

    This is part of why I was so surprised that ByteCount uses a signed underlying type. I think this is problematic and potentially introduces more problems than the ByteCount type solves. I think fixing this is imperative, enough so that (as I already mentioned) I am willing to volunteer the eng time to fix it.

    I don't think a delta type is critical, but the thread consensus was to add one. I also don't think it's necessary to tell people to stop using ByteCount in the meantime, but Avi apparently did. In my view, we can Just Fix This without halting the world or inconveniencing any engineers.

    If you don't agree, let's just chat about it over VC. It's probably faster? I am very much with you on the desire to minimize unnecessary overhead. 

    PK

    Joe Mason

    unread,
    Oct 14, 2025, 6:19:11 PMOct 14
    to Peter Kasting, Gabriel Charette, Greg Thompson, Avi Drissman, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    The ByteSize / ByteSizeDelta impl is now in review.

    Some comments inline:

    On Fri, Oct 10, 2025 at 7:50 PM Peter Kasting <pkas...@chromium.org> wrote:
    The chromium style guidance to use size_t is very much intentional, and long-standing. It also does not actually diverge from the upstream style guide; the upstream guide supports use of unsigned types when doing so avoids conversions, which is important in this case since basically all stdlib and many os functions take/return unsigned types (e.g. size_t, uint64_t, DWORD). I've discussed the style guidance a lot with Google style arbiters and I don't think we want to change direction.

    This is part of why I was so surprised that ByteCount uses a signed underlying type. I think this is problematic and potentially introduces more problems than the ByteCount type solves. I think fixing this is imperative, enough so that (as I already mentioned) I am willing to volunteer the eng time to fix it.

    I don't think a delta type is critical, but the thread consensus was to add one. I also don't think it's necessary to tell people to stop using ByteCount in the meantime, but Avi apparently did. In my view, we can Just Fix This without halting the world or inconveniencing any engineers.

    I think the main problem with converting raw ints to ByteCount is loss of context - we don't know (without looking at history) whether the code originally took a signed or unsigned int. Since ByteCount is signed, converting an existing signed int to ByteCount is a pure upgrade, but converting an unsigned int loses that context.

    Hopefully ByteSize lands soon so this becomes moot, but in the meantime I'd allow signed -> ByteCount conversions but ask unsigned -> ByteCount conversions to hold off.

    As an aside, Thiabaud found that a lot of uses of signed ints are just to use -1 as an error code. A lot of those places should become optional<ByteSize>, but converting a numeric to an optional as well as int to ByteSize is a lot more steps. I'm thinking of adding a helper function for this conversion.

    Avi Drissman

    unread,
    Oct 14, 2025, 7:48:52 PMOct 14
    to Joe Mason, Peter Kasting, Gabriel Charette, Greg Thompson, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    A lot of the research that I was putting into my conversions was figuring out how to move "int + -1 for unknown/error" into optional<ByteCount>. While we gain some by moving to (now) ByteSize, we gain so much more by retrofitting with optional<ByteSize> that I think that effort to detangle it will be worth doing.

    Avi

    Gabriel Charette

    unread,
    Oct 15, 2025, 12:20:03 PMOct 15
    to Avi Drissman, Joe Mason, Peter Kasting, Gabriel Charette, Greg Thompson, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    Thanks everyone, I'm happy to hear that the unknowns are resolved and that we're landing in a better state overall in a reasonable amount of time, with more error checking and better semantics.

    Interesting point about signed vs unsigned PK: I realize that more thought has been previously poured into this than I was aware of and I have no reason for us to differ (my initial point was that we shouldn't deviate from what I believed was the style consensus). I'm curious why TimeBase::us_ isn't unsigned as well then, but that's not for this discussion.

    Also, yay for finding more existing bugs and preventing future ones by preventing ByteSize(-1) and forcing optional<ByteSize>.

    PS: The upcoming ByteSizeDelta::AsByteSize() addresses my concern that some deltas are also okay to interpret as absolute byte sizes.

    David Benjamin

    unread,
    Oct 15, 2025, 12:30:09 PMOct 15
    to Gabriel Charette, Avi Drissman, Joe Mason, Peter Kasting, Greg Thompson, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    > I'm curious why TimeBase::us_ isn't unsigned as well then, but that's not for this discussion.

    This is a digression, but unlike sizes, negative time values are still a thing. Timestamps at the year 1599 were valid times, in a way that an array of -1 bytes is not. I imagine we don't need to compute over negative times as often with the Windows 1600 epoch, compared to the POSIX 1970 epoch, but it seems better to be able to represent them than not.

    (Granted, too negative and we very quickly run to the 1582 Julian vs Gregorian calendar cut-off. I imagine we'll give those times historically inaccurate labels when exploding into year/month/day representations, but ah well. Mumble mumble proleptic Gregorian calendar...)

    K. Moon

    unread,
    Oct 15, 2025, 12:45:15 PMOct 15
    to David Benjamin, Gabriel Charette, Avi Drissman, Joe Mason, Peter Kasting, Greg Thompson, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    My phone is struggling to keep up with the size of this thread at this point, but agreed that calendar calculations are tricky, which is generally why you'd use a dedicated calendar library for that. base::Time has a simpler purpose, fortunately, but negative (pre-epoch) times are still reasonably in range.

    I think the more relevant argument probably is just that size_t is unsigned, though, and linked to the size of addressable memory, while signed vs. unsigned is not really an important consideration for most time calculations. (Fractional time parts are well within 32-bit integer ranges, while far past/future times generally aren't interesting in day-to-day use.)

    David Benjamin

    unread,
    Oct 15, 2025, 1:11:31 PMOct 15
    to K. Moon, Gabriel Charette, Avi Drissman, Joe Mason, Peter Kasting, Greg Thompson, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    Yeah, the true reason, at the end of the day, to use size_t for memory sizes is just that everything we try to interop with in C++ uses size_t, stemming from the C++ and, ultimately, C standard libraries.

    It doesn't work to have an object in C++ whose size exceeds PTRDIFF_MAX because of how pointer arithmetic is defined. Every platform makes (signed) ptrdiff_t and size_t the same size, so size_t(-1) is as invalid an addressable byte count, within a single allocation, as -1. Either way, we have values in the type that are invalid counts and offsets.

    Some other languages (e.g. Go) use signed values for sizes and it works fine too. But the important thing is that it's consistent so we don't have to worry about converting, and the language picks it for us.
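
A tiny sketch of that point, assuming a platform where ptrdiff_t and size_t have the same width (the static_assert documents that assumption, since the language standard does not guarantee it). `FitsInOneAllocation` is a hypothetical name used only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption for this sketch: both types have the same width.
static_assert(sizeof(std::size_t) == sizeof(std::ptrdiff_t),
              "sketch assumes size_t and ptrdiff_t have the same width");

// True when `n` could be the byte size of a single object, i.e. it does not
// exceed PTRDIFF_MAX.
constexpr bool FitsInOneAllocation(std::size_t n) {
  return n <= static_cast<std::size_t>(PTRDIFF_MAX);
}

// The upper half of size_t's range is representable but never a valid
// single-allocation size, just as -1 would not be in a signed type.
static_assert(!FitsInOneAllocation(static_cast<std::size_t>(-1)),
              "size_t(-1) exceeds PTRDIFF_MAX");
```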

    James Hawkins

    unread,
    Oct 15, 2025, 2:06:28 PMOct 15
    to K. Moon, David Benjamin, Gabriel Charette, Avi Drissman, Joe Mason, Peter Kasting, Greg Thompson, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx
    I'm experiencing this lag as well. It takes about a minute for scroll-ability on this thread; such a strange bug which I will file.

    Joe Mason

    unread,
    Oct 15, 2025, 6:38:46 PMOct 15
    to Avi Drissman, Peter Kasting, Gabriel Charette, Greg Thompson, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx, Thiabaud Engelbrecht
    The ByteSize / ByteSizeDelta patch is https://crrev.com/c/7042842, BTW.

    On Tue, Oct 14, 2025 at 7:48 PM Avi Drissman <a...@google.com> wrote:
    A lot of the research that I was putting into my conversions was figuring out how to move "int + -1 for unknown/error" into optional<ByteCount>. While we gain some by moving to (now) ByteSize, we gain so much more by retrofitting with optional<ByteSize> that I think that effort to detangle it will be worth doing.

    My current thought is for a helper:

      // Returns the size as a delta, or `default_value` if `opt_bytes` is empty.
      ByteSizeDelta GetByteSizeDeltaOr(std::optional<ByteSize> opt_bytes, ByteSizeDelta default_value) {
        if (opt_bytes.has_value()) {
          return opt_bytes.value().AsByteSizeDelta();
        }
        return default_value;
      }

    Then to progressively convert to ByteSize, something like this:

      int64_t value_in_kb = ValueAvailable() ? GetValueInKb() : -1;
      ... much later ...
      UseValueInKB(value_in_kb);

    Can convert to:

      std::optional<ByteSize> value = ValueAvailable() ? std::make_optional(KiB(GetValueInKb())) : std::nullopt;
      ... much later ...
      int64_t value_in_kb = GetByteSizeDeltaOr(value, KiBS(-1)).InKilobytes();
      UseValueInKB(value_in_kb);

    That's still a bit hard to hold, though, because it would be easy to accidentally use -1 bytes instead of -1 kb as the error value.

    (I have no plans to actually implement that helper - once we figure out a good interface it's simple enough for someone doing the conversion to add it.)
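
One possible shape for such an interface, purely as a sketch: take the fallback in the legacy unit rather than as a byte quantity, so "-1 bytes" cannot be passed by mistake. `InKiBOr` is a hypothetical name, and the `ByteSize` struct below is a bare stand-in rather than the real class.

```cpp
#include <cstdint>
#include <optional>

// Bare stand-in for a non-negative byte size.
struct ByteSize { uint64_t bytes; };

// Hypothetical helper for call sites that still expect "KiB or -1".
// The fallback is already in the legacy unit, so no byte-valued sentinel
// is ever constructed.
inline int64_t InKiBOr(std::optional<ByteSize> size, int64_t fallback_kib) {
  if (!size.has_value()) {
    return fallback_kib;
  }
  // A uint64_t divided by 1024 always fits in an int64_t.
  return static_cast<int64_t>(size->bytes / 1024);
}
```

A legacy call site would then read `UseValueInKB(InKiBOr(value, -1));`.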

    Joe Mason

    unread,
    Oct 21, 2025, 12:55:38 PMOct 21
    to Avi Drissman, Peter Kasting, Gabriel Charette, Greg Thompson, K. Moon, Daniel Cheng, François Doray, Jeremy Roman, Alexei Svitkine, Leszek Swirski, Marc Treib, cxx, Thiabaud Engelbrecht
    Got distracted and forgot to mention - ByteSize / ByteSizeDelta has landed. You may continue conversions of raw ints.

    The new classes use base::KiBU etc to create ByteSize and base::KiBS etc to create ByteSizeDelta. There's a TODO to convert existing KiB calls to either KiBU or KiBS, and then rename KiBU to KiB once there are no more.

    I don't have more time to work on this now so someone else will have to pick up the deprecation of ByteCount.