PDF417 detection changes, ready for merge, java

58 views
Skip to first unread message

Guenther Grau

unread,
Apr 25, 2013, 5:46:41 PM4/25/13
to zx...@googlegroups.com
Hi,

I've cleaned up my changes for the PDF417 detection and think that they are ready to be merged now.

I've attached a zip file which contains new and changed classes. I've also attached a patch file, but that doesn't include the new test images. I've created a branch in my zxing git clone which contains my changes https://github.com/graug/zxing/tree/mergeBranch.

The new version completely replaces the existing PDF417 detection code. It reuses the idea of finding the most likely codeword by finding the smallest squared error. It does not, however use oversampling or any image transformations. In my internal testing I found it to be about 40 times faster than the current implementation on my laptop.
My implementation also deals with more corrupt barcodes which are missing the or have a corrupt start or end pattern and can also handle missing columns, as long as enough bytes are available for the error correction. The implementation also supports Compact PDF417 which is missing the stop pattern and optionally the right row indicator column.
I've removed the special code for the pure_barcode option, because the existing implementation wasn't working correctly and I don't think it's needed given the very fast and low resource usage of the new implementation.

The code detects all current test images correctly (the current code fails to detect a few). I've also added a few more test images.

It also adds support for Macro PDF417, which allows information to be spread across several barcodes.

I think we should change the test images directory pdf417 to pdf417-1 to be more consistent with the other test images. I haven't included those changes in my patch. This should be a simple directory rename and updating the path in PDF417BlackBox1TestCase.

As the cutoff date has been pushed back a little, maybe the changes can even be included in the release? Let me know what you think.

Best regards,

  Guenther

P.S.: I still have a few ideas on how to improve the detection even further, but this will only make a difference for really bad images.
P.P.S.: Here is a short description of my changes

MultiFormatReader:
added @SuppressWarnings("unchecked")

DecoderResult:
remove final to allow subclasses

GridSampler:
The current code breaks if three points need to be nudged. It only nudges the leftmost and rightmost one, so it fails if there are 3. Should probably be fixed in a better way.

PDF417Common:
New Class that contains common constants.

PDF417DecoderResult:
specific result for PDF417. It needs more attributes then other barcodes

PDF417Reader
New implementation which supports reading multiple barcodes in one image

PDF417ResultMetadata
Class that contains the PDF417 specific barcode metadata

package com.google.zxing.pdf417.decoder
The new decoder implementation

ErrorCorrection:
return number of corrected errors to measure detection quality

ModulusGF
use common constant

Detector
Improved barcode detection code

LineSampler
removed

PDF417DetectorResult
class holds detection results for multiple barcodes on one image

pdf417-2:
1 new test image

pdf417-3
6 new test images

PDF417BlackBox[1-3]TestCase
updated detection counts

DetectorTest
new test class for the detector

DecodeWorker
added null check for points (incomplete barcodes don't have all points set)






patch.zip
patch.txt

Sean Owen

unread,
Apr 26, 2013, 5:07:33 AM4/26/13
to zx...@googlegroups.com
This is broadly looking good; here are my questions or concerns.

- GridSampler -- I'm not sure about removing the 'nudging', is it harmful? it helps some edge cases as I recall
- DecoderResult subclass -- this is one of several cases where the general classes get a special-case implementation for PDF417. Can this just be handled with the normal class plus new metadata?
- The change inside the .pdf417 is much less of an issue since it's contained. These can be package-private declarations as a result, almost all of them. 
- BarcodeMatrix -- is this duplicative of BitMatrix?

tiny things:

- I wouldn't put 'final' on method params. There's a good theory about why to do that, and I like it, but generally I think it adds clutter more than helps. At least that's the de facto standard here.

Guenther Grau

unread,
Apr 26, 2013, 5:40:15 AM4/26/13
to zx...@googlegroups.com
Thanx a lot for the feedback!

The GridSampler change is not required. You can keep the existing version. If I find the time I'll try to reproduce the problem I ran into but until then there's no need to apply my change for the new code to work. I probably should have submitted a separate patch for that.

DecoderResult - I'm happy if the additional data is added to the general decoder result. I remembered from an earlier discussion that you didn't want additional properties added to the general decoder result. Hence I created a subclass, which also minimizes the impact on other code.

BarcodeMatrix - No, this is not a duplicate of BitMatrix. BitMatrix stores bit values. BarcodeMatrix stores detected codeword values. It supports a majority vote system if several, potential different values are decoded for the same barcode row/column position.

One of the primary objectives of the patch was to minimize the changes in the general code.

I don't mind if you remove the final for parameters. I'll keep that in mind for future patches.

How are we moving forward with this? Are you applying the suggested changes and commit the result or do you need me to produce a new version of the patch?

Btw., can you diff your sources against a git-branch? That would make submitting future patches easier.
Creating a patch file using eclipse doesn't include the binary file changes.
Creating a zip file with all changed files is more work and needs additional information about deleted files.

Best regards,

  Guenther

Sean Owen

unread,
Apr 26, 2013, 8:30:34 AM4/26/13
to zx...@googlegroups.com


On Friday, April 26, 2013 10:40:15 AM UTC+1, Guenther Grau wrote:
Thanx a lot for the feedback!

The GridSampler change is not required. You can keep the existing version. If I find the time I'll try to reproduce the problem I ran into but until then there's no need to apply my change for the new code to work. I probably should have submitted a separate patch for that.

DecoderResult - I'm happy if the additional data is added to the general decoder result. I remembered from an earlier discussion that you didn't want additional properties added to the general decoder result. Hence I created a subclass, which also minimizes the impact on other code.

Here I mean that it should likely be more optional metadata. Let me look at that instead.


How are we moving forward with this? Are you applying the suggested changes and commit the result or do you need me to produce a new version of the patch?

Let me try to commit a version of the existing patch since it seems close enough.
 

Btw., can you diff your sources against a git-branch? That would make submitting future patches easier.
Creating a patch file using eclipse doesn't include the binary file changes.
Creating a zip file with all changed files is more work and needs additional information about deleted files.

It might work, yes. The git mirror is kept fairly up to date so I imagine it will work without much trouble.

Sean Owen

unread,
Apr 26, 2013, 8:34:19 AM4/26/13
to zx...@googlegroups.com
Guenther in this code in BarcodeValue:

    for (Entry<Integer,Integer> entry : values.entrySet()) {
      if (entry.getValue() > maxConfidence) {
        maxConfidence = entry.getValue();
        result = entry.getKey();
        ambigous = false;
      } else if (entry.getValue() > maxConfidence) {
        ambigous = true;
      }
    }

the two if conditions are the same. Is the second supposed to be "<"?

Guenther Grau

unread,
Apr 26, 2013, 8:47:52 AM4/26/13
to zx...@googlegroups.com
Sean,

damn last minute changes :-( No, the second should be an ==.

Best regards,

  Guenther

Sean Owen

unread,
Apr 26, 2013, 8:48:37 AM4/26/13
to zx...@googlegroups.com
And am I right that this test method...

  private static BitArray getExpected(int size) {
    BitArray expected = new BitArray(size);
    for (int index : BIT_SET_INDEX) {
      expected.set(size - index);
    }
    return expected;
  }

... needs to set "size - 1 - index", since index is 0-based? it fails for me otherwise, once the equality assertion is fixed. It was comparing == equality.

Sean Owen

unread,
Apr 26, 2013, 9:05:30 AM4/26/13
to zx...@googlegroups.com
OK, when I run the unit tests originally, I see that suite #2 actually fails by 2 images. After 'fixing' that first item I mentioned, suite #1 fails by 2 images as well. This is with Java 6/7 on Mac OS X. Does it pass for you?

If it turns out that you want/need to do more work on this, I could run ahead and edit this patch locally, then post back to you a new patch for you to work off of. It's a little bit manual but probably works OK?
Or you could work on your end and post a new patch for me to start from again.

Guenther Grau

unread,
Apr 26, 2013, 9:05:32 AM4/26/13
to zx...@googlegroups.com
when changing it to ==, one of the tests fails :-( I guess it's better to remove the second else if completely for now. I'll look into this again.

Guenther Grau

unread,
Apr 26, 2013, 9:08:24 AM4/26/13
to zx...@googlegroups.com
duh, how embarrasing :-( yes, this should be "size - 1 - index".

Maybe you have done it already, but I guess changing the Assert import to org.junit.Assert and use
Assert.assertArrayEquals(getExpected(size).getBitArray(), result.getBitArray());

is better.

Best regards,

  Guenther

Sean Owen

unread,
Apr 26, 2013, 9:12:11 AM4/26/13
to zx...@googlegroups.com
Yeah or "extend Assert" is fine too. Yes I had changed that to compare "toString()", same thing.

Guenther Grau

unread,
Apr 26, 2013, 9:31:30 AM4/26/13
to zx...@googlegroups.com
I'm running on Win7, Java 7. Removing the second else if makes all test cases of suit #1 work.

One image (#24) fails in suite #2, but I think this is due to an incorrect expected result. Locally I was comparing the result against a .txt file, not the .bin.

You can either remove the second else if and commit or send me the changes you have done so far so I can make adjustments on top of them. I don't want you to start from scratch again. Whatever you prefer.

Best regards,

  Guenther

Sean Owen

unread,
Apr 26, 2013, 9:36:01 AM4/26/13
to zx...@googlegroups.com
Yeah the patch munged the .txt files used in tests to add a newline, I fixed that. Maybe that's the cause of others.
I'm OK to commit my version of the changes, and will change the test to expect fewer successes. It is still a net win I think. Then you can investigate from that point.

Sean Owen

unread,
Apr 26, 2013, 11:32:37 AM4/26/13
to zx...@googlegroups.com
I committed my version of the patch. There were a number of small style changes, and one of substance -- rearranging the PDF417 metadata. Most is the same though of course.
From there you may have a clearer view into what's going on with the test suite #2.

Steven Parkes

unread,
Apr 26, 2013, 11:46:19 AM4/26/13
to zx...@googlegroups.com
Is anyone that's interested in C++ PDF417 volunteering to port these changes to C++? Guenther? Christoph? Hartmut? Someone else?

On Apr 26, 2013, at 8:32 AM, Sean Owen <sro...@gmail.com> wrote:

> I committed my version of the patch. There were a number of small style changes, and one of substance -- rearranging the PDF417 metadata. Most is the same though of course.
> From there you may have a clearer view into what's going on with the test suite #2.
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups "zxing" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to zxing+un...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Guenther Grau

unread,
Apr 26, 2013, 12:01:41 PM4/26/13
to zx...@googlegroups.com
Thanx. Will have a look at the test suite #2

Guenther Grau

unread,
Apr 26, 2013, 12:10:08 PM4/26/13
to zx...@googlegroups.com
Hi Steven,

my C++ is a bit rusty. If someone else would volunteer that would be great. I'm happy to add more documentation/comments to the Java code base and answer questions.

I have a few more ideas for improving barcode detection. Those changes will hopefully only impact smaller portions of the code and shouldn't keep someone from starting to work on a C++ port.

Best regards,

  Guenther
Reply all
Reply to author
Forward
0 new messages