base::File int size

34 views
Skip to first unread message

Vladislav Kuzkokov

unread,
Oct 4, 2017, 2:09:43 PM10/4/17
to Chromium-dev
Is there any good reason base::File::Read* and Write* methods use int for size?
I'm mostly interested in this in the context of printing which uses uint32_t.

Lei Zhang

unread,
Oct 4, 2017, 2:19:54 PM10/4/17
to Vladislav Kuzkokov, Chromium-dev
Probably because they need -1 as a sentinel value to represent failure.
> --
> --
> Chromium Developers mailing list: chromi...@chromium.org
> View archives, change email options, or unsubscribe:
> http://groups.google.com/a/chromium.org/group/chromium-dev
> ---
> You received this message because you are subscribed to the Google Groups
> "Chromium-dev" group.
> To view this discussion on the web visit
> https://groups.google.com/a/chromium.org/d/msgid/chromium-dev/34e81997-c1c4-483f-8444-8017cda3f3a0%40chromium.org.

Brett Wilson

unread,
Oct 4, 2017, 2:22:00 PM10/4/17
to Lei Zhang, Vladislav Kuzkokov, Chromium-dev
According to the style guide, you shouldn't use uint32_t for this. I could see an argument for going to a size_t but then we'd need some kind of magical constant for the error case like std::string::npos.

Brett

Vladislav Kuzkokov

unread,
Oct 4, 2017, 2:23:06 PM10/4/17
to Chromium-dev, vkuz...@chromium.org
That sounds exactly like what ssize_t was made for.

Alexei Svitkine

unread,
Oct 4, 2017, 2:26:02 PM10/4/17
to Brett Wilson, Lei Zhang, Vladislav Kuzkokov, Chromium-dev
Shouldn't it be 64-bit given a file can be above 4GB in size?

Alexei Svitkine

unread,
Oct 4, 2017, 2:27:15 PM10/4/17
to Vladislav Kuzkokov, Chromium-dev
(Ignore my message, I missed that this is about parameters to Read()/Write() and not the size of the file.)

Vladislav Kuzkokov

unread,
Oct 4, 2017, 2:32:46 PM10/4/17
to Chromium-dev, vkuz...@chromium.org
That's fine for output.
Why would you need to pass -1 as an input? (Unless you rely on Read*/Write* behavior where, it turns out, they silently shortcut to return -1 for any negative value)


On Wednesday, October 4, 2017 at 8:19:54 PM UTC+2, Lei Zhang wrote:

Matthew Menke

unread,
Oct 4, 2017, 2:36:17 PM10/4/17
to Chromium-dev, vkuz...@chromium.org
If they return the number of bytes they wrote, then that necessitates passing in the same type as a size as they return (Or having them return a type that contains the type passed in as a subcomponent).

Chris Palmer

unread,
Oct 4, 2017, 2:51:13 PM10/4/17
to Chromium-dev, vkuz...@chromium.org
FWIW, POSIX `read` doesn't follow that common sense rule: `ssize_t read(int fd, void *buf, size_t count);`. (But it's better than `snprintf`, which truly makes no sense, and requires you to do the equivalent of `checked_cast`, which nobody ever does.)

In any case, you can still use -1 as a sentinel: That's what `std::string::npos` is: `static_cast<size_t>(-1)`. http://www.cplusplus.com/reference/string/string/npos/

Given my druthers, I'd have POSIX, libc, and `base::File` all use uint64_t for sizes and offsets. Not ssize_t, size_t, or even off_t (whose width, and even signedness, you have to look up). Reasons: uint64_t is explicit about signedness and width; it's more than large enough to handle all practical sizes; it doesn't require any weird stuff like http://users.suse.com/~aj/linux_lfs.html; it's unambiguous across IPC, RPC, and hardware architectures; and semantics for overflow are defined because it's unsigned. (Although you should be using `base::checked_cast` and `base::CheckedNumeric` anyway.)

Part of the value that `base::File` could/should bring is that rationalization. It'd be nice to dis-int-fect `base::File`. :)

Peter Kasting

unread,
Oct 4, 2017, 3:25:14 PM10/4/17
to Chris Palmer, Chromium-dev, vkuz...@chromium.org
On Wed, Oct 4, 2017 at 11:51 AM, 'Chris Palmer' via Chromium-dev <chromi...@chromium.org> wrote:
Given my druthers, I'd have POSIX, libc, and `base::File` all use uint64_t for sizes and offsets. Not ssize_t, size_t, or even off_t (whose width, and even signedness, you have to look up). Reasons: uint64_t is explicit about signedness and width; it's more than large enough to handle all practical sizes; it doesn't require any weird stuff like http://users.suse.com/~aj/linux_lfs.html; it's unambiguous across IPC, RPC, and hardware architectures; and semantics for overflow are defined because it's unsigned. (Although you should be using `base::checked_cast` and `base::CheckedNumeric` anyway.)

size_t and ssize_t aren't appropriate unless the API intentionally restricts use to "files which can fit in memory", since size_t is all about objects that can fit in memory.  off_t isn't standard C/C++ (it's a POSIX-ism) and has confusion regarding its size (the whole existence of off64_t makes me kind of facepalm).  int has the problems mentioned earlier.

This basically leaves int64_t and uint64_t.  The style guide says to prefer the former unless we need defined overflow.  I don't think we need defined overflow here; as you note, if there is a concern about overflow, better to use the checking facilities.

So my suggestion would be to provide an alias (e.g. FileOffset) to int64_t.

Separately, I suggest modifying APIs to avoid using -1 as a signal value where possible.  Now that we have things like base::Optional, there is usually a good way to write the API to avoid such magic sentinels.

PK

Vladislav Kuzkokov

unread,
Oct 4, 2017, 3:30:19 PM10/4/17
to Chromium-dev, pal...@google.com, vkuz...@chromium.org
Chromium style guide already recommends size_t wherever it seems reasonable.

Peter Kasting

unread,
Oct 4, 2017, 4:39:22 PM10/4/17
to vkuz...@chromium.org, Chromium-dev, Chris Palmer
On Wed, Oct 4, 2017 at 12:30 PM, Vladislav Kuzkokov <vkuz...@chromium.org> wrote:
Chromium style guide already recommends size_t wherever it seems reasonable.

Yes, and unless the API is constrained as I mentioned, it's not reasonable here.  Files are not guaranteed to be small enough to fit in memory, so size_t is asking for overflow (much like int).

PK

Vladislav Kuzkokov

unread,
Oct 4, 2017, 4:57:12 PM10/4/17
to Chromium-dev, vkuz...@chromium.org, pal...@google.com
Sizes passed to Read* and Write* are sizes of a memory buffer. And we should try to avoid 2Gb contiguous buffers even on 64 bit systems.
For file sizes and offsets base::File uses int64_t (always signed), so it's not functionally broken.

Daniel Cheng

unread,
Oct 4, 2017, 5:36:21 PM10/4/17
to vkuz...@chromium.org, Chromium-dev, pal...@google.com
I'm with Chris on this one: it should be uint64_t for sizes and absolute offsets, and int64_t for relative offsets. Otherwise it forces everything that passes these values over IPC to do sanity checks: instead, we'd strongly prefer the types themselves express the constraints.

Daniel

--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.

dan...@chromium.org

unread,
Oct 4, 2017, 5:39:45 PM10/4/17
to Daniel Cheng, Vladislav Kuzkokov, Chromium-dev, Chris Palmer
On Wed, Oct 4, 2017 at 5:35 PM, Daniel Cheng <dch...@chromium.org> wrote:
I'm with Chris on this one: it should be uint64_t for sizes and absolute offsets, and int64_t for relative offsets. Otherwise it forces everything that passes these values over IPC to do sanity checks: instead, we'd strongly prefer the types themselves express the constraints.

Doesn't that follow for every thing we pass over ipc tho, in which case we should replace all size_t with uint64_t? afaik we just write uint64_t in the IPC defns for things that are size_ts in the c++ code, and that's worked out.
 

Daniel

On Wed, Oct 4, 2017, 15:57 Vladislav Kuzkokov <vkuz...@chromium.org> wrote:
Sizes passed to Read* and Write* are sizes of a memory buffer. And we should try to avoid 2Gb contiguous buffers even on 64 bit systems.
For file sizes and offsets base::File uses int64_t (always signed), so it's not functionally broken.


On Wednesday, October 4, 2017 at 10:39:22 PM UTC+2, Peter Kasting wrote:
On Wed, Oct 4, 2017 at 12:30 PM, Vladislav Kuzkokov <vkuz...@chromium.org> wrote:
Chromium style guide already recommends size_t wherever it seems reasonable.

Yes, and unless the API is constrained as I mentioned, it's not reasonable here.  Files are not guaranteed to be small enough to fit in memory, so size_t is asking for overflow (much like int).

PK

--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/chromium-dev/c8f6a142-e0f8-4d92-a1c0-408ea7153fe9%40chromium.org.

--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.

Peter Kasting

unread,
Oct 4, 2017, 5:47:07 PM10/4/17
to Vladislav Kuzkokov, Chromium-dev, Chris Palmer
On Wed, Oct 4, 2017 at 1:57 PM, Vladislav Kuzkokov <vkuz...@chromium.org> wrote:
Sizes passed to Read* and Write* are sizes of a memory buffer.

In that case, size_t is the most appropriate built-in type.

However, if these values are serialized as-is over IPC, we probably want an explicitly sized type.  I believe I added functionality a while back to serialize size_t properly over IPC (you have to beware of potential size differences at the ends of the pipe), but the direction I've been given since then is "all types going over IPC shall always have an explicit size".

Re: Dana's question, we didn't use to pass so many things over IPC, so it wasn't as big an issue, but the problem with using uint64_t for the IPC definition but size_t for the send/receive ends is that you can have problems where a 64-bit value on the send side overflows on a receiver compiled for 32 bit size_t.  It's usually easier to replumb the pipeline to have 64-bit safety even when compiled for 32 bits than to properly handle these sorts of errors.

PK

Daniel Cheng

unread,
Oct 4, 2017, 5:48:06 PM10/4/17
to dan...@chromium.org, Chris Palmer, Chromium-dev, Vladislav Kuzkokov
Not everything can be replaced though, nor should it. I think my opinion on this would be different if we could actually enable warnings for implicit narrowing conversions--but as things currently stand, it depends on manual review to make sure the appropriate overflow checks are done. =(

Daniel

Chris Palmer

unread,
Oct 4, 2017, 5:51:28 PM10/4/17
to Dana Jansens, Daniel Cheng, Vladislav Kuzkokov, Chromium-dev
On Wed, Oct 4, 2017 at 2:37 PM, <dan...@chromium.org> wrote:

I'm with Chris on this one: it should be uint64_t for sizes and absolute offsets, and int64_t for relative offsets. Otherwise it forces everything that passes these values over IPC to do sanity checks: instead, we'd strongly prefer the types themselves express the constraints.

Doesn't that follow for every thing we pass over ipc tho, in which case we should replace all size_t with uint64_t? afaik we just write uint64_t in the IPC defns for things that are size_ts in the c++ code, and that's worked out.


Obviously, size_t → uint64_t works because it's either a no-op or an up-cast. When we port to 128-bit machines we'll need to revisit all that code. ;)

Vladislav Kuzkokov

unread,
Oct 16, 2017, 2:13:37 PM10/16/17
to Chromium-dev, dan...@chromium.org, dch...@chromium.org, vkuz...@chromium.org
So, we just leave it as is then.
Reply all
Reply to author
Forward
0 new messages