This CL removes the character width check to fix bugs in tests such as FPDFTextEmbedderTest.Bug921, FPDFTextEmbedderTest.WhitespaceCharCount and FPDFTextEmbedderTest.Bug444176962. All of these tests had similar issues of very thin characters not being loaded correctly due to a check that filters them out. This CL works to update the expectations to their correct values after removing the check to demonstrate how the removal of the check addresses the issues with some characters being accidentally removed.Aryan Krishnanwrap
Done
if (str.GetLength() >= 2 && str[str.GetLength() - 1] == ' ' &&Lei ZhangI think this can be simplified to `if (bPrevSpace && str.GetLength() > 1) {`
Aryan KrishnanOr wrap the entire for-loop above inside a `str.GetLength() > 1` check. No need to do any of this if `str` is 0 or 1 char long.
Did both, considering the loop can also delete characters, It would still require the check for string length being > 1 for the if statement.
if (str.GetLength() >= 2 && str[str.GetLength() - 1] == ' ' &&Aryan KrishnanThe CL description doesn't mention what's going on here.
Done
temp_text_buf_.Delete(str.GetLength() - 1, 1);Aryan KrishnanCalculate this index once and save it to a local variable.
Done.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
bool bPrevSpace = false;Move this one line down to reduce its scope.
if (bPrevSpace && str.GetLength() > 1 && str[last_idx] == ' ') {Can check `last_idx` instead.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Move this one line down to reduce its scope.
Done
if (bPrevSpace && str.GetLength() > 1 && str[last_idx] == ' ') {Aryan KrishnanCan check `last_idx` instead.
Yeah was considering that, just thought it would be less readable than the GetLength() call. Will edit.
Done.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Remove character width check from cpdf_textpage.cpp
This CL removes the character width check to fix bugs in tests such as
FPDFTextEmbedderTest.Bug921, FPDFTextEmbedderTest.WhitespaceCharCount
and FPDFTextEmbedderTest.Bug444176962. All of these tests had similar
issues of very thin characters not being loaded correctly due to a
check that filters them out. This CL works to update the expectations
to their correct values after removing the check to demonstrate how the
removal of the check addresses the issues with some characters being
accidentally removed.
This CL also adds a section to CPDF_TextPage::CloseTempLine() to avoid
additional whitespace characters incorrectly being counted.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
@the...@chromium.org This crashed the auto-roller it is either 1 of 2 things in my opinion.
1. Test expectations for chromium/pdf need to still be updated
2. There is something wrong with this patch.
I am going to prepare for both approaches by updating the patch to use new testing expectations and also prepare a revert of this if something truly is broken.
Am making a CL for revert, let's merge it if it is case 2.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
@the...@chromium.org This crashed the auto-roller it is either 1 of 2 things in my opinion.
1. Test expectations for chromium/pdf need to still be updated
2. There is something wrong with this patch.I am going to prepare for both approaches by updating the patch to use new testing expectations and also prepare a revert of this if something truly is broken.
Am making a CL for revert, let's merge it if it is case 2.
Unresolving.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Aryan Krishnan@the...@chromium.org This crashed the auto-roller it is either 1 of 2 things in my opinion.
1. Test expectations for chromium/pdf need to still be updated
2. There is something wrong with this patch.I am going to prepare for both approaches by updating the patch to use new testing expectations and also prepare a revert of this if something truly is broken.
Am making a CL for revert, let's merge it if it is case 2.
Unresolving.
Have you looked at the failing test in Chromium to see if it's a regression or an improvement?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Aryan Krishnan@the...@chromium.org This crashed the auto-roller it is either 1 of 2 things in my opinion.
1. Test expectations for chromium/pdf need to still be updated
2. There is something wrong with this patch.I am going to prepare for both approaches by updating the patch to use new testing expectations and also prepare a revert of this if something truly is broken.
Am making a CL for revert, let's merge it if it is case 2.
Lei ZhangUnresolving.
Have you looked at the failing test in Chromium to see if it's a regression or an improvement?
It should be an improvement, preparing the other in case I am wrong.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Aryan Krishnan@the...@chromium.org This crashed the auto-roller it is either 1 of 2 things in my opinion.
1. Test expectations for chromium/pdf need to still be updated
2. There is something wrong with this patch.I am going to prepare for both approaches by updating the patch to use new testing expectations and also prepare a revert of this if something truly is broken.
Am making a CL for revert, let's merge it if it is case 2.
Lei ZhangUnresolving.
Aryan KrishnanHave you looked at the failing test in Chromium to see if it's a regression or an improvement?
It should be an improvement, preparing the other in case I am wrong.
Resolving: It is clear that it is an improvement.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (str.GetLength() > 1) {For the regression, perhaps this is the culprit.
This should run despite the length of the string passed in.
If a line with just " " is passed in, in the context of this regression it should return nothing.
However, that breaks whitespace.pdf. We currently have 2, sort of conflicting cases.
Something like "if there is a next line after this in the stream", then run for str.GetLength() > 1 as well.
So we could do option a. which is pushing this fix.
or option b. push a revert to fix this ASAP and then fix this later on.
I'd say option b, where we push a revert and then look into this later.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (str.GetLength() > 1) {For the regression, perhaps this is the culprit.
This should run despite the length of the string passed in.
If a line with just " " is passed in, in the context of this regression it should return nothing.However, that breaks whitespace.pdf. We currently have 2, sort of conflicting cases.
Something like "if there is a next line after this in the stream", then run for str.GetLength() > 1 as well.
So we could do option a. which is pushing this fix.
or option b. push a revert to fix this ASAP and then fix this later on.I'd say option b, where we push a revert and then look into this later.
I'm fine with reverting.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (str.GetLength() > 1) {Lei ZhangFor the regression, perhaps this is the culprit.
This should run despite the length of the string passed in.
If a line with just " " is passed in, in the context of this regression it should return nothing.However, that breaks whitespace.pdf. We currently have 2, sort of conflicting cases.
Something like "if there is a next line after this in the stream", then run for str.GetLength() > 1 as well.
So we could do option a. which is pushing this fix.
or option b. push a revert to fix this ASAP and then fix this later on.I'd say option b, where we push a revert and then look into this later.
I'm fine with reverting.
Update, got a fix-ish thing working. It is not exactly ideal, but keeps most of the test cases working: notable exception being whitespace.pdf. I'd say we still explore it post-revert but still will upload it for reference.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |