| Commit-Queue | +1 |
struct BrotliDecoderStateDeleter {Aryan KrishnanThis can live inside brotli_decoder.cpp in an anonymous namespace. No need to expose it in the header if it is not used anywhere else.
Done
if (estimated_decode_size > kMaxDecodeBytes) {Aryan KrishnanThis means some data stream out there is larger than 64 MB uncompressed, but this code will refuse to decompress it. This does not feel right.
I'm fine with doing Brotli decompression in chunks in the line 45 loop below, but I'm concerned that the overall function cannot handle larger outputs.
Aryan KrishnanThere are 3 main choices here:
1. Fail on large sizes
Pros: A lower risk of having a PDF that just zip bombs your ram
Cons: A lot of PDFs fail to decode which can be confusing for users.2. Return a half decoded stream
Pros: We don't return a failure state and at least give a bit of the stream.
Cons: Unpredictable Behaviour and the PDF may not render.3. Removing the check
Pros: Supports more PDFs without failing
Cons: Floods the RAM and may cause a DOS.For these large buffers, even just at the max size of 64MB, a full decode takes ~1.7 seconds on my system (Might differ system to system but just giving an example). Allowing exponential growth to support such streams will cause a 10GiB sample for instance to take 270s to decode.
AFAIK FlateDecode also imposes a limit of 1GiB to protect against this case.
I would say at this point option 1 seems like the best choice considering that while some PDFs fail, if the limit is set right the vast majority should render fine.
Lei ZhangResolving for now, reopen if we should discuss a bit more or if there is a choice 4 I missed somewhere.
Aryan KrishnanIf the Flate code uses a 1 GB limit, then why not use that here as well? Why set the limit so low?
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
This CL Adds brotli_decoder.cpp/.h implementing BrotliDecoder::Decode()Same comment as other CLs. Not sure why Regular words in the middle of a sentence gets capitalized.
} // namespacePut a newline above, and balance it out with a newline after line 16.
void BrotliDecoderStateDeleter::operator()(Just merge this with line 20.
if (estimated_decode_size > kMaxDecodeBytes) {Aryan KrishnanThis means some data stream out there is larger than 64 MB uncompressed, but this code will refuse to decompress it. This does not feel right.
I'm fine with doing Brotli decompression in chunks in the line 45 loop below, but I'm concerned that the overall function cannot handle larger outputs.
Aryan KrishnanThere are 3 main choices here:
1. Fail on large sizes
Pros: A lower risk of having a PDF that just zip bombs your ram
Cons: A lot of PDFs fail to decode which can be confusing for users.2. Return a half decoded stream
Pros: We don't return a failure state and at least give a bit of the stream.
Cons: Unpredictable Behaviour and the PDF may not render.3. Removing the check
Pros: Supports more PDFs without failing
Cons: Floods the RAM and may cause a DOS.For these large buffers, even just at the max size of 64MB, a full decode takes ~1.7 seconds on my system (Might differ system to system but just giving an example). Allowing exponential growth to support such streams will cause a 10GiB sample for instance to take 270s to decode.
AFAIK FlateDecode also imposes a limit of 1GiB to protect against this case.
I would say at this point option 1 seems like the best choice considering that while some PDFs fail, if the limit is set right the vast majority should render fine.
Lei ZhangResolving for now, reopen if we should discuss a bit more or if there is a choice 4 I missed somewhere.
Aryan KrishnanIf the Flate code uses a 1 GB limit, then why not use that here as well? Why set the limit so low?
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
This CL Adds brotli_decoder.cpp/.h implementing BrotliDecoder::Decode()Same comment as other CLs. Not sure why Regular words in the middle of a sentence gets capitalized.
Done
Put a newline above, and balance it out with a newline after line 16.
Done
Just merge this with line 20.
Done
if (estimated_decode_size > kMaxDecodeBytes) {Aryan KrishnanThis means some data stream out there is larger than 64 MB uncompressed, but this code will refuse to decompress it. This does not feel right.
I'm fine with doing Brotli decompression in chunks in the line 45 loop below, but I'm concerned that the overall function cannot handle larger outputs.
Aryan KrishnanThere are 3 main choices here:
1. Fail on large sizes
Pros: A lower risk of having a PDF that just zip bombs your ram
Cons: A lot of PDFs fail to decode which can be confusing for users.2. Return a half decoded stream
Pros: We don't return a failure state and at least give a bit of the stream.
Cons: Unpredictable Behaviour and the PDF may not render.3. Removing the check
Pros: Supports more PDFs without failing
Cons: Floods the RAM and may cause a DOS.For these large buffers, even just at the max size of 64MB, a full decode takes ~1.7 seconds on my system (Might differ system to system but just giving an example). Allowing exponential growth to support such streams will cause a 10GiB sample for instance to take 270s to decode.
AFAIK FlateDecode also imposes a limit of 1GiB to protect against this case.
I would say at this point option 1 seems like the best choice considering that while some PDFs fail, if the limit is set right the vast majority should render fine.
Lei ZhangResolving for now, reopen if we should discuss a bit more or if there is a choice 4 I missed somewhere.
Aryan KrishnanIf the Flate code uses a 1 GB limit, then why not use that here as well? Why set the limit so low?
Lei ZhangDone
Not yet.
Oops - maybe it got overwritten by some other patchset I uploaded without pulling all gerrit commits. Should be done now.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
size_t available_out = decoded_buffer.size() - total_out;if we make auto next_out = span(decoded_buffer).subspan(total_out), we can avoid the decoded_buffer.size() - total_out calculation, instead relying on next_out.size(). That way, the two can't get out of sync with each other in the future.
std::min(static_cast<uint32_t>(decoded_buffer.size()) * 2,
static_cast<uint32_t>(kMaxDecodeBytes));nit: can avoid std::min here as we checked N/2 above.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Commit-Queue | +1 |
if we make auto next_out = span(decoded_buffer).subspan(total_out), we can avoid the decoded_buffer.size() - total_out calculation, instead relying on next_out.size(). That way, the two can't get out of sync with each other in the future.
Yeah checks out! Thanks!
std::min(static_cast<uint32_t>(decoded_buffer.size()) * 2,
static_cast<uint32_t>(kMaxDecodeBytes));nit: can avoid std::min here as we checked N/2 above.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
std::min(static_cast<uint32_t>(decoded_buffer.size()) * 2,
static_cast<uint32_t>(kMaxDecodeBytes));Aryan Krishnannit: can avoid std::min here as we checked N/2 above.
Done (Although differently to the edit).
After this, can just combine with the resize() call. Also no need to static_cast().
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Commit-Queue | +1 |
std::min(static_cast<uint32_t>(decoded_buffer.size()) * 2,
static_cast<uint32_t>(kMaxDecodeBytes));Aryan Krishnannit: can avoid std::min here as we checked N/2 above.
Lei ZhangDone (Although differently to the edit).
After this, can just combine with the resize() call. Also no need to static_cast().
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |