Update of last PDF417 commit


hfneubauer

Oct 12, 2012, 11:36:40 AM
to zx...@googlegroups.com
Hi all,
here is an update of the PDF417 code in C++. I have also attached some files that explain the differences between the new C++ code and the Java code in the trunk. Regards, hfneubauer
zxing-update-pdf417-cpp-hfn-02.zip
zxing-diffs-between-cpp-and-java.zip

hfneubauer

Oct 16, 2012, 5:34:41 AM
to zx...@googlegroups.com
Another update, in which the Detector has been slimmed down. As already proposed, findVertices now checks only every 8th row; correctCodeWordVertices, patchVerticePosThinBars and patchVerticePos have been removed; Detector::sampleGrid has also been removed and replaced by specialGrid, which has been simplified. Regards, hfneubauer

On Monday, October 15, 2012, 21:20:55 UTC+2, hfneubauer-home wrote:
There is still much optimization potential, particularly in the Detector.

For example, the findVertices method still checks every line for the presence of the guard patterns. The current ZXing Java source sometimes does not check all lines, depending on the "tryHarder" hint. But now this is no longer necessary, even under "tryHarder" conditions: it is enough to check every 8th or 16th line, because we need only two points on the left side and two on the right. These are sufficient to define the vertical lines; we then have to find their intersections with the horizontal lines at the top and the bottom.
Furthermore, the "correctCodeWordVertices" and "patchVerticePos..." methods are no longer needed.
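
The row-skipping idea can be sketched as follows. This is a minimal illustration in C++17 (it uses std::optional), not the Detector code from the zip files: `RowProbe` and `findEdgePoints` are hypothetical names, and the probe stands in for whatever guard-pattern search findVertices performs on a single row.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <vector>

struct Point { int x; int y; };

// Hypothetical row probe: returns the x position of a guard pattern
// found in row y, or std::nullopt if the row contains none.
using RowProbe = std::function<std::optional<int>(int y)>;

// Probe only every `step`-th row and keep the first and the last hit.
// Two points per side are enough to define that side's vertical line.
std::vector<Point> findEdgePoints(int height, int step, const RowProbe& probe) {
    assert(step > 0);
    std::optional<Point> first;
    std::optional<Point> last;
    for (int y = 0; y < height; y += step) {
        if (auto x = probe(y)) {
            Point p{*x, y};
            if (!first) first = p;
            last = p;
        }
    }
    std::vector<Point> result;
    // A single hit cannot define a line, so report nothing in that case.
    if (first && last && first->y != last->y) {
        result.push_back(*first);
        result.push_back(*last);
    }
    return result;
}
```

With a step of 8 and a code that is at least 16 rows of pixels high, at least two probed rows fall inside the guard pattern, which is why the step must stay small relative to the expected code height.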

It is also possible that the perspective transformation of points in the methods SampleGrid and SpecialGrid is superfluous, because the "LinesMatrix" created by createLinesMatrix from the Delta2Binarizer is already perspective-transformed. This would mean invoking SpecialGrid instead of SampleGrid in Detector::detect, skipping some lines at the beginning of SpecialGrid, and replacing the arguments of "image->get(...)" with the non-transformed coordinates (say, (x,y)), along with some other simplifications.

Maybe someone will also find a simplification/optimization of the "computeRowCount" method.

zxing-update-pdf417-cpp-hfn-03.zip

Brett Nieland

Oct 17, 2012, 6:36:05 AM
to zx...@googlegroups.com
Sir,

Thank you for your continuing contributions on PDF417. I hope that at some point I, or someone else, will have time to back-port it into the Java version!

Best,

Brett

hfneubauer-home

Oct 17, 2012, 3:14:46 PM
to zx...@googlegroups.com
Hi Brett,

I would be very glad if that happened. In that case, would my name (hfneubauer, or Hartmut Neubauer) be added to the list of authors and/or contributors?
Best, hfn

Sean Owen

Oct 18, 2012, 3:09:37 AM
to zx...@googlegroups.com
Yes. I looked at the smart diff, and documenting the changes clearly took a good bit of effort, but I am still finding it hard to figure out which are the key changes and which are necessary to port back. I don't think I can do it myself.

hfneubauer

Jan 8, 2013, 6:41:33 AM
to zx...@googlegroups.com
Hi Christoph,

can you please provide the complete URL path to the test data?

Regards, hfneubauer

On Monday, January 7, 2013, 22:40:27 UTC+1, Christoph Schulz wrote:
Hi,

in the absence of a clean patch file against a certain revision, I merged all of the zip files you provided (zxing-update-hfn-pdf417-20121203.zip along with zxing-update-pdf417-cpp-hfn-01.zip to zxing-update-pdf417-cpp-hfn-03.zip). See: https://github.com/schulzch/zxing/commit/2c300e2f964f93ad6368c11c774e014b56412ffa and following (note: you might be interested in the files for CMake there :))

While merging I fixed a bunch of compilation errors, so I may have introduced a bad fix. I'll look into it in the next couple of days. I just want to confirm something: have you ever tried the zxing test data with your code (e.g. ZXing-2.1-testdata/pdf417/01.png)?

Regards,
Christoph

On Wednesday, October 17, 2012, 21:14:46 UTC+2, hfneubauer-home wrote:

Christoph Schulz

Jan 8, 2013, 7:31:38 AM
to zx...@googlegroups.com
Hi,

Link to zip: http://code.google.com/p/zxing/downloads/detail?name=ZXing-2.1-testdata.zip
Path: /core/test/data/blackbox/pdf417/*.png and friends

Regards,
Christoph

hfneubauer-home

Jan 8, 2013, 1:52:09 PM
to zx...@googlegroups.com
Hi Christoph,

I tried to read the *.png images directly from the screen, using Windows CE devices equipped with a 3MP autofocus camera. Most of the "pdf417" images could be recognized, and also a large part of the "pdf417-2" images.

Which compiler do you use? I must say that, during my last modifications, I used the eMbedded C++ compiler, while for earlier modifications I also compiled using Microsoft Visual Studio 2008.

Thank you for your work. Two other notes:
(1) Please note that, until now, the PDF417 recognition works only together with the GreyscaleLuminanceSource, because of the Delta2Binarizer. To activate it for other LuminanceSources, it would be necessary:
  - either to add a virtual method "getStraightLine" to these LuminanceSources,
  - or to define "getStraightLine" in the base class LuminanceSource, replacing
    "greyData_[(top_ + y) * dataWidth_ + left_ + x]"
    by
    "get(x,y)",
    to define "virtual int get(x,y)" in "GreyscaleLuminanceSource" as "return greyData_[(top_+y)*dataWidth_+left_+x];",
    and to define "virtual int get(x,y)" in the other LuminanceSources in the same way.
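
The second option could look roughly like the sketch below. It is only an illustration under stated assumptions: the classes are simplified stand-ins for the ZXing ones, and the parameter list of getStraightLine is hypothetical, since its real signature is not shown in this thread. Only the index expression in GreyscaleLuminanceSource::get is taken from the text above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified stand-in for the ZXing base class (the real one has more members).
class LuminanceSource {
public:
    virtual ~LuminanceSource() = default;

    // Luminance at pixel (x, y); every subclass supplies its own lookup.
    virtual int get(int x, int y) const = 0;

    // Hypothetical signature: sample `count` values along the segment
    // (x0, y0) -> (x1, y1). Written once in the base class on top of get(),
    // so it works for every LuminanceSource.
    std::vector<int> getStraightLine(int x0, int y0, int x1, int y1, int count) const {
        assert(count >= 1);
        std::vector<int> line;
        line.reserve(static_cast<std::size_t>(count));
        for (int i = 0; i < count; ++i) {
            const double t = (count > 1) ? static_cast<double>(i) / (count - 1) : 0.0;
            const int x = static_cast<int>(std::lround(x0 + t * (x1 - x0)));
            const int y = static_cast<int>(std::lround(y0 + t * (y1 - y0)));
            line.push_back(get(x, y));
        }
        return line;
    }
};

class GreyscaleLuminanceSource : public LuminanceSource {
public:
    GreyscaleLuminanceSource(std::vector<unsigned char> greyData, int dataWidth, int left, int top)
        : greyData_(std::move(greyData)), dataWidth_(dataWidth), left_(left), top_(top) {}

    // Same index expression as quoted above, now behind the virtual accessor.
    int get(int x, int y) const override {
        return greyData_[static_cast<std::size_t>((top_ + y) * dataWidth_ + left_ + x)];
    }

private:
    std::vector<unsigned char> greyData_;
    int dataWidth_;
    int left_;
    int top_;
};
```

The trade-off is one virtual call per sampled pixel instead of a direct array access, which may matter on slow devices; the first option (a per-source getStraightLine) avoids that at the cost of duplicated code.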


(2) In the PDF417 BitMatrixParser (BitMatrixParserPD.cpp), I have added some validity checks, but two of them could perhaps be removed thanks to the improved error correction. The first is at lines 168 through 171:

      //* 2012-06-27 HFN: cw_block should be the modulus of the row number by 3;
      //* otherwise, handle the codeword as erasure:
      if ((cw_block >= 0) && (cw_block != rowNumber % 3))
        cw = -1;

The second is at lines 256 through 259:

  /* 2012-06-22 hfn: verify whether outer columns are still okay: */
  if(!VerifyOuterColumns(rowNumber)) {
    throw FormatException("BitMatrixParser::processRow(PDF417): outer columns corrupted!");
  }

If this check were removed, the methods "VerifyOuterColumns" and "IsEqual" could be deleted, too.
I have not yet tried to remove these two pieces of code; I added them at an earlier stage of development, when the error and erasure
correction was not yet up to date.

Best regards, hfneubauer


On Tuesday, January 8, 2013, 13:31:38 UTC+1, Christoph Schulz wrote:
Hi,

Christoph Schulz

Jan 9, 2013, 10:08:27 AM
to zx...@googlegroups.com
Hi,

Good. On my side PDF417 gets detected, but not decoded (high erasure count).


On Tuesday, January 8, 2013, 19:52:09 UTC+1, hfneubauer-home wrote:
Which compiler do you use? I must say that, during my last modifications, I used the eMbedded C++ compiler, while for earlier modifications I also compiled using Microsoft Visual Studio 2008.
Visual Studio 2010 32bit (Windows), G++ (Linux), Clang 4.1 (OS X).
 

Thank you for your work. Two other notes:
(1) Please note that, until now, the PDF417 recognition, because of the Delta2Binarizer, only works together with the GreyscaleLuminanceSource. To activate it for other LuminanceSources, it would be necessary:
Done.
 
(2) In the PDF417 BitMatrixParser (BitMatrixParserPD.cpp), I have added some validity checks, but may be,
two of them may be removed due to improved error correction.
Can you please re-pack a (very complete) snapshot of your sources? Fiddling with code snippets from different revisions is error-prone. If you don't have any objections, I would create a clean patch/open an issue at the end of the month to get your changes merged back into trunk.

Regards,
Christoph

hfneubauer-home

Jan 10, 2013, 6:05:12 PM
to zx...@googlegroups.com
Hi Christoph,

regarding your request: the last time I posted a package, I intentionally included only the PDF417 classes and the affected files. I did not post the whole package because I realized that many sources had changed in the meantime. For example, the error correction classes for QR Code and DataMatrix were earlier called "GF256" and "GF256Poly"; later they became "GeneralGF" and "GeneralGFPoly" (the reason may be that this works for Aztec, too). In my company's project, I did not replace GF256 by GeneralGF and so on. On the other hand, I compared the posted sources with the latest code in the ZXing trunk and adapted them as far as I could; for example, I took the latest PDF417 error correction classes from ZXing and translated them from Java to C++.

Creating a patch/opening an issue is a very good idea.

Regards, Hartmut

Christoph Schulz

Jan 11, 2013, 11:12:06 AM
to zx...@googlegroups.com
Hi,

It's working now (more or less): numbers work, text fails, and PDF417s with low resolution (perfectly valid) generally seem to be an issue, too; nothing unfixable though :)

Regards,
Christoph
P.S.: see https://github.com/schulzch/zxing/tree/master/cpp/core/src/zxing

hfneubauer-home

Jan 12, 2013, 7:29:42 AM
to zx...@googlegroups.com
Hi Christoph,

seems interesting, though I am surprised that only numbers work and text fails.
If you're interested, I can also add some classes that make it easier to follow the detection status. They are able to convert a BitMatrix into a BMP file. In the sources they can be found at the "#if HFNDIAG_" positions. Of course this is only useful for debug/diagnosis purposes.
Regards, hfn

Christoph Schulz

Jan 12, 2013, 12:42:37 PM
to zx...@googlegroups.com
Hi,

Seems like I can save myself some work by asking you, so: yes :)

I was thinking of something similar: currently, building zxing(.exe) is quite a pain on Windows and Mac OS X due to ImageMagick. I was thinking of replacing it with something lightweight that can be shipped with ZXing, such as http://nothings.org/stb_image.c (loading jpg, png, etc.) or http://lodev.org/lodepng/ (loading/saving png). While at it, I would have added something similar to BitMatrix. Not sure if Sean likes the idea of embedding a small image library (at least in debug builds of libzxing) or adding an interface to intercept detection results.

Regards,
Christoph

Steven Parkes

Jan 12, 2013, 2:10:15 PM
to Christoph Schulz, zx...@googlegroups.com
Sean isn't particularly active in C++ things, licensing notwithstanding.

I have no objections to alternatives to imagemagick as long as there are no license issues and the functionality is reasonably comparable.

hfneubauer

Jan 14, 2013, 6:50:02 AM
to zx...@googlegroups.com
Hi,
you can use the function "BmtxSave" in zxingwce.cpp: specify the filename, the bit matrix and two zoom multipliers (horizontal and vertical). I hope that is all you need (just add these files; I hope that no other include is necessary ...).
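
For readers without the attachment, the idea of dumping a bit matrix with two zoom factors can be sketched as below. This is not BmtxSave, whose signature and BMP writer are not shown in the thread; it is a stand-in that writes the much simpler plain PBM ("P1") text format, and `Bits`, `writePbm` and `toPbmString` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for zxing::BitMatrix: rows of bits, true = black module.
using Bits = std::vector<std::vector<bool>>;

// Write `bits` as a plain PBM ("P1") image, repeating every module zoomX
// times horizontally and zoomY times vertically. In PBM, 1 means black.
void writePbm(std::ostream& out, const Bits& bits, int zoomX, int zoomY) {
    assert(zoomX >= 1 && zoomY >= 1);
    const std::size_t height = bits.size();
    const std::size_t width = height ? bits[0].size() : 0;
    out << "P1\n" << width * zoomX << ' ' << height * zoomY << '\n';
    for (const auto& row : bits) {
        for (int zy = 0; zy < zoomY; ++zy) {
            int column = 0;
            for (bool bit : row) {
                for (int zx = 0; zx < zoomX; ++zx) {
                    out << (bit ? '1' : '0');
                    // Plain PBM lines should not exceed 70 characters.
                    if (++column == 70) { out << '\n'; column = 0; }
                }
            }
            if (column != 0) out << '\n';
        }
    }
}

// Convenience wrapper that returns the image as a string.
std::string toPbmString(const Bits& bits, int zoomX, int zoomY) {
    std::ostringstream out;
    writePbm(out, bits, zoomX, zoomY);
    return out.str();
}
```

Passing a std::ofstream to writePbm gives a file that common image viewers and converters can open, which is enough for inspecting what the detector hands to the decoder.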
Good luck, hfn.


On Saturday, January 12, 2013, 18:42:37 UTC+1, Christoph Schulz wrote:
Hi,

hfndiag.zip

Christoph Schulz

Jan 15, 2013, 7:37:17 AM
to zx...@googlegroups.com
Hi,

thanks for your helper method - especially the zoom factor is useful!

Many test cases won't get read without --try-harder. Happens. The "real" problem comes with small barcodes, when the detector's output is funny (see attached image for an ugly real-life sample): large "black" areas get filled with "white" for some reason. Any ideas what might have gone wrong?

Regards,
Christoph
pdf417-detector-output.png

Christoph Schulz

Jan 23, 2013, 9:18:25 AM
to zx...@googlegroups.com
Hi Hartmut,

after quite a week, I can tell why you stated earlier "it was really hard (to code)" :). I'm going to iterate over all things I noted (nothing personal, just code review):
  • Detector::computeRowCount(): the sum involving aDiffVals and borderCount... I can't tell why this seems valid or what it is, since multiple concepts are cleverly mixed. Do these sums/methods/assumptions have a name?
  • Detector::computeRowCount(): the assumption that perspective has to be transformed only once for a quadrilateral (top line) isn't valid. It breaks when the image is slightly rotated and skewed, so that the step height between each line changes. Bi-linear interpolation along the sides would fix this though.
  • Detector::computeRowCount(): the assumption that moduleHeights are equal is wrong for my use-case (barcodes are tiny and printed using (bad) ink printers || paper jitter while scanning => the definition of "dot" changes). I'm trying to fix this by computing moduleHeight on the fly while sampling the special grid.
  • Detector::specialGrid()+getNextPattern(): sometimes it introduces error instead of fixing it. Not sure if it's a problem with using fixed-point integer math or with bar width averaging.
  • Detector::getNextPattern(): Sainte-Laguë has up to 8! (8 factorial) alternatives when all parties have the same number of votes, so why compute just one alternative using this "highest party loses one bit" trick?
  • Delta2Binarizer::createLinesMatrix(): switching between line traversal modes feels wrong and overly complicated here. Why not just use bi-linear interpolation along the sides?
  • Delta2Binarizer: using member variables as method parameters is one of Martin Fowler's code smells. It makes the code hard to understand and debug.
  • Delta2Binarizer: it's punching holes into larger black/white areas when there is noise on them. Applying a moving average to the definition of black and white (after all, illumination doesn't jitter, right...?) should fix this.
I started changing quite a bit of your code, so comments/clarifications on my notes would be welcome (see pdf417 branch for work in progress - might be broken ;)).
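
The bi-linear interpolation along the sides suggested in the notes above can be sketched as follows. This is only an illustration of the technique, not code from either repository; `PointF`, `blend` and `bilinear` are hypothetical names, and the four corners are assumed to come from the detector.

```cpp
#include <cassert>

struct PointF { double x; double y; };

// Linear blend between a and b: t = 0 gives a, t = 1 gives b.
PointF blend(PointF a, PointF b, double t) {
    return { a.x + t * (b.x - a.x), a.y + t * (b.y - a.y) };
}

// Bilinear interpolation inside a quadrilateral: first walk down the left
// and right sides by the fraction v, then walk across between those two
// side points by the fraction u. (u, v) = (0, 0) is the top-left corner.
PointF bilinear(PointF topLeft, PointF topRight,
                PointF bottomLeft, PointF bottomRight,
                double u, double v) {
    assert(u >= 0.0 && u <= 1.0 && v >= 0.0 && v <= 1.0);
    const PointF left = blend(topLeft, bottomLeft, v);
    const PointF right = blend(topRight, bottomRight, v);
    return blend(left, right, u);
}
```

For row r of R rows and column c of C columns, the sample position would be bilinear(..., (c + 0.5) / C, (r + 0.5) / R). Note that this follows rotation and skew of the sides but is not a true perspective transform, so rows that shrink toward the far edge are still sampled at equal fractions.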

Regards,
Christoph

hfneubauer-home

Feb 27, 2013, 1:44:59 PM
to zx...@googlegroups.com
Hi Christoph,
after some weeks I try to answer some of your questions.
  • Detector::computeRowCount: The concept is that in PDF417 you very quickly know the width of the code in modules (number of columns), but it is harder to calculate the height (number of rows). For me, it is very important to know the height because it is the basis for the Delta2Binarizer-based lines matrix (and as you have seen, Delta2Binarizer is too slow for the whole image, so the number of lines has to be minimized first). - The idea is to look for significant differences between adjacent pixel rows; if there is more than one consecutive significant difference, you have to accumulate them and calculate the average position of this difference (I took the squares of the differences as weights). Afterwards, I compute the next differences - the differences between these pixel-row difference positions. Then these differences are sorted, and you can see some kind of average difference, which can serve as an estimate of the module height. As you have stated, this height may vary slightly in case of perspective distortion, and maybe something would have to be done with the transformation - but in general, this algorithm tells me the height of the code in terms of modules. (And, as I have stated, this is important, and I don't want to return to the "computeYDimension" method, which simply takes the module width as the module height, which is not true.)
  • Delta2Binarizer::createLinesMatrix: okay, in the beginning I did not see the importance of the perspective issue, and that in this case the distances between the lines should not be equal, and similarly in the horizontal direction. As you have seen, I have done a correction with "exp" and "log"; but I have not yet checked your latest commit, perhaps you have worked on this issue? (My latest commit of yesterday would fix the horizontal issue under certain circumstances at a later stage, and it works.)
  • Delta2Binarizer: as I have stated earlier, its purpose is limited: (1) to find some of the positions of the guard patterns and (2) to create the lines matrix with the symbols/codewords. Because of this limited purpose, the behaviour in larger black/white areas is not so important.
  • Sainte-Laguë: it is not "the highest party loses one bit", but "the latest seat is given to the next party". But maybe this is not very effective, so it is not so important to discuss this issue.
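
The row-count estimate described in the first point can be sketched as below. This is one reading of the prose description, not the actual Detector::computeRowCount code: all function names are hypothetical, `rowDiff[y]` is assumed to be some measure of how much pixel row y differs from the next one, the threshold for "significant" is an assumption, and the median is used as the "kind of average" of the sorted gaps.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Collapse each run of consecutive significant row differences into one
// boundary position, weighting every row by the square of its difference.
std::vector<double> boundaryPositions(const std::vector<double>& rowDiff, double threshold) {
    std::vector<double> positions;
    double weightSum = 0.0;
    double weightedPos = 0.0;
    for (std::size_t y = 0; y <= rowDiff.size(); ++y) {
        const bool significant = y < rowDiff.size() && rowDiff[y] >= threshold;
        if (significant) {
            const double w = rowDiff[y] * rowDiff[y];
            weightSum += w;
            weightedPos += w * static_cast<double>(y);
        } else if (weightSum > 0.0) {
            positions.push_back(weightedPos / weightSum);  // end of a run
            weightSum = 0.0;
            weightedPos = 0.0;
        }
    }
    return positions;
}

// Sort the gaps between neighbouring boundaries and take the median
// as the module height estimate; returns 0 if there are too few boundaries.
double estimateModuleHeight(const std::vector<double>& positions) {
    if (positions.size() < 2) return 0.0;
    std::vector<double> gaps;
    for (std::size_t i = 1; i < positions.size(); ++i) {
        gaps.push_back(positions[i] - positions[i - 1]);
    }
    std::sort(gaps.begin(), gaps.end());
    return gaps[gaps.size() / 2];
}

// Number of code rows for a code that is codeHeightPx pixels high.
long estimateRowCount(int codeHeightPx, double moduleHeight) {
    assert(moduleHeight > 0.0);
    return std::lround(codeHeightPx / moduleHeight);
}
```

Using the median rather than the mean keeps the estimate stable when a boundary is missed, since a missed boundary produces one gap of roughly double height rather than shifting all gaps.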

Regards, Hartmut

Guenther Grau

Feb 28, 2013, 2:41:29 AM
to zx...@googlegroups.com
Hi Hartmut,

I see that you have been/are working on PDF417 detection and decoding for c++. I'm currently working on the Java side and would like to have a look at your code. I don't seem to be able to locate any c++ PDF417 file svn repository on http://zxing.googlecode.com/svn or https://github.com/zxing/zxing. Can you give me a pointer where to find the code that you've been working on?

Thanx and best regards,

  Guenther

hfneubauer-home

Feb 28, 2013, 5:55:56 PM
to zx...@googlegroups.com
Hi Guenther,
I did not add my commits to any public repository; I posted the sources as zip files or directly. I must add that most of these C++ files were compiled for Windows CE (MS eMbedded Visual C++ compiler), and also with MS Visual Studio 2005. Christoph Schulz adapted these sources and compiled them with several other compilers, but he changed many things, too.

You will find my sources
(1) at the head of this thread "Update of last PDF417 commit" as "zxing-update-pdf417-cpp-hfn-02.zip",

(2) an update in the following thread: https://groups.google.com/forum/?fromgroups=#!topic/zxing/Y1OPo44e93Y , my last answer contains the zip file "zxing-update-hfn-pdf417-20121203.zip",

(3) this update which contains only one C++ file and a "diff": https://groups.google.com/forum/?fromgroups=#!topic/zxing/a7InBFMuJCA .

You'll find the discussion with Mr Schulz here: https://groups.google.com/forum/?fromgroups=#!topic/zxing/PxPnNuufkSI . It contains his repository: https://github.com/schulzch/zxing/tree/pdf417 (please look at "./cpp/core/zxing/cpp/....").

Thank you for your interest. Best regards, hfn.