QR Code Detection algorithm?


Brian

Jul 28, 2009, 6:25:44 AM
to zxing
I'm curious about the algorithm used by the QR code detector. To get
from a raw image to input for the Reed-Solomon decoder, I suppose
these tasks must be performed:

* image rotation and correct perspective distortion
* find the finder and alignment patterns
* make threshold decisions on grey scale data to get binary data
* sample raw image pixels into module bits (I mean, each QR Code
module must be spanned by many raw image pixels)

My question is, what order are these steps performed in? What other
things does the detection algorithm do?

I have to admit that my question is largely academic -- I'm interested
in the design and improvement of 2-D detection algorithms generally.
Thanks for your answer!

-Brian

srowen

Jul 28, 2009, 6:40:15 AM
to zxing
Yes, though the order is really:

* make threshold decisions on grey scale data to get binary data
* sample raw image pixels into module bits (I mean, each QR Code
module must be spanned by many raw image pixels)
* find the finder and alignment patterns
* image rotation and correct perspective distortion

That's about it. There are a few more details like a little bit of
deconvolution for blur, and sorting out which finder pattern is which.

Finding the finder and alignment patterns is not hard, and the rest
from there is simple. I think the library's implementation of these
parts is about as fast as I can imagine it getting. This feature of
QR Codes makes them really easy to decode.
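For reference, the core of that search is a run-length test: scan each row counting alternating black/white pixel runs and check them against the finder pattern's 1:1:3:1:1 module ratio. A minimal sketch in Java (illustrative names and tolerances, not ZXing's actual API):

```java
public class FinderRatio {
    // stateCount holds five consecutive run lengths along one row:
    // black, white, black, white, black. A finder pattern crossed
    // through its center produces runs in roughly a 1:1:3:1:1 ratio.
    static boolean looksLikeFinderPattern(int[] stateCount) {
        int total = 0;
        for (int count : stateCount) {
            if (count == 0) {
                return false;          // a missing run can't match
            }
            total += count;
        }
        if (total < 7) {
            return false;              // the pattern is 7 modules wide
        }
        int moduleSize = total / 7;
        int maxVariance = moduleSize / 2;  // allow ~50% slop per run
        return Math.abs(moduleSize - stateCount[0]) < maxVariance
            && Math.abs(moduleSize - stateCount[1]) < maxVariance
            && Math.abs(3 * moduleSize - stateCount[2]) < 3 * maxVariance
            && Math.abs(moduleSize - stateCount[3]) < maxVariance
            && Math.abs(moduleSize - stateCount[4]) < maxVariance;
    }
}
```

Because the ratio is preserved along any line through the pattern's center, this test tolerates rotation, which is why it can run before any rotation correction.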

The challenging, or interesting, parts come before that: dealing with
blur and noise, and correctly thresholding to monochrome. You can
imagine the issues here. If you're interested in this from an academic
perspective, these are the challenges.
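As a toy illustration of the thresholding step, here is a global-mean binarizer in Java. The library's real binarizer is more sophisticated (histogram and region based, as I understand it); this sketch only shows the basic idea of turning grey scale data into binary data:

```java
public class Threshold {
    // Simplified global threshold: pixels darker than the mean grey
    // level are treated as black. Real images need local/adaptive
    // thresholds to survive uneven lighting; this is illustrative only.
    static boolean[] binarize(int[] gray) {
        long sum = 0;
        for (int g : gray) {
            sum += g;
        }
        int mean = (int) (sum / gray.length);
        boolean[] black = new boolean[gray.length];
        for (int i = 0; i < gray.length; i++) {
            black[i] = gray[i] < mean;   // true = candidate black pixel
        }
        return black;
    }
}
```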

Brian

Jul 28, 2009, 7:01:34 AM
to zxing
It sounds tricky to sample the QR Code modules *before* rotation. The
modules are rather small compared to the allowable rotation, so an
x-y scan could easily jump rows.

I'm particularly interested in the deconvolution -- at what stage do
you perform that? How do you pick the blur pattern that you use for
the deconvolution?

Thanks again for your answers -- I've been reading the code, but I'm
not familiar with Java or C++.

srowen

Jul 29, 2009, 4:30:27 AM
to zxing
I misread your step. Yes, sampling the modules happens after
correcting for rotation and perspective.
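To make that sampling step concrete, here is a simplified Java sketch that maps each module center into image coordinates and reads the binary pixel there. ZXing uses a full perspective transform for this; the affine version below (rotation, scale, shear) is the special case and is enough to show the idea. All names are illustrative:

```java
public class GridSampler {
    // Map the center of module (i, j) into image coordinates and read
    // the pixel there. (originX, originY) is the image position of the
    // grid's corner; (colX, colY) is the image step per module along a
    // row; (rowX, rowY) is the step per module down a column.
    // image[y][x] is true for black pixels.
    static boolean[][] sample(boolean[][] image, int dimension,
                              double originX, double originY,
                              double colX, double colY,
                              double rowX, double rowY) {
        boolean[][] modules = new boolean[dimension][dimension];
        for (int j = 0; j < dimension; j++) {
            for (int i = 0; i < dimension; i++) {
                // +0.5 targets the module center, not its corner
                double x = originX + (i + 0.5) * colX + (j + 0.5) * rowX;
                double y = originY + (i + 0.5) * colY + (j + 0.5) * rowY;
                modules[j][i] = image[(int) y][(int) x];
            }
        }
        return modules;
    }
}
```

Sampling one pixel per module center like this is why the process can ignore most of the image: a version-1 code needs only 21x21 = 441 reads regardless of image resolution.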

I said we do deconvolution... but actually I think that is just for
1D. In any event it is a very light blind deconvolution IIRC (a
-0.5 2 -0.5 kernel?) to sharpen edges slightly. We have never gotten
satisfactory results from any more significant deconvolution. Without
knowing the distance and camera properties a priori (and we don't), we
don't see how a blind deconvolution can help. Of course, if you knew
the right parameters...
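Applied along one row, that kernel would look something like the following sketch (not the library's actual code; edge pixels are left unchanged for simplicity):

```java
public class Sharpen {
    // Light 1D sharpening: convolve each row with [-0.5, 2, -0.5].
    // The kernel sums to 1, so flat regions are unchanged while
    // transitions are exaggerated, steepening edges slightly.
    static int[] sharpenRow(int[] gray) {
        int[] out = gray.clone();   // endpoints copied through as-is
        for (int i = 1; i < gray.length - 1; i++) {
            double v = -0.5 * gray[i - 1] + 2 * gray[i]
                     - 0.5 * gray[i + 1];
            // clamp back into the valid 8-bit grey range
            out[i] = Math.max(0, Math.min(255, (int) Math.round(v)));
        }
        return out;
    }
}
```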

Brian

Jul 29, 2009, 5:42:53 AM
to zxing
OK -- it makes sense that sampling the modules happens later.

Deconvolution must be computationally demanding in 2D. But if you
were going to do it, couldn't you estimate the kernel by looking at
the finder patterns? Since the finder patterns are known, it seems
possible (in principle) to estimate the kernel by "dividing" the image
by the known pattern.
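A toy 1D version of that "division" idea: if the observed row is the known pattern convolved with an unknown 2-tap kernel, each sample gives one linear equation, and two independent samples determine the kernel exactly. Purely illustrative (a real estimator would use least squares over all samples, or regularized frequency-domain division), with hypothetical names throughout:

```java
public class KernelEstimate {
    // The observed row is modeled as o[i] = k0*p[i] + k1*p[i-1],
    // where p is the known (e.g. finder) pattern and [k0, k1] is the
    // unknown blur kernel. Samples at indices i and j give a 2x2
    // linear system, solved here by Cramer's rule. Indices must be
    // chosen so the system is non-degenerate (det != 0).
    static double[] estimate2Tap(double[] known, double[] observed,
                                 int i, int j) {
        double a1 = known[i], b1 = known[i - 1];
        double a2 = known[j], b2 = known[j - 1];
        double det = a1 * b2 - a2 * b1;
        double k0 = (observed[i] * b2 - observed[j] * b1) / det;
        double k1 = (a1 * observed[j] - a2 * observed[i]) / det;
        return new double[]{k0, k1};
    }
}
```

With noise-free synthetic data this recovers the kernel exactly; the hard part in practice is that real samples are noisy and quantized, which is exactly where the division becomes ill-conditioned.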

Just for fun, here is a demo of a "deconvolution" project I worked on.
I got to make lots of wonderful assumptions: perfect black and white,
perfect sampling of the modules, etc. The 2D convolution pattern was
only 2 by 2. The goal was actually to show joint "deconvolution" and
decoding (using LDPC codes rather than Reed Solomon codes).

http://www.lit.ice.uec.ac.jp/kurkoski/2d-okaasan.pdf

srowen

Jul 29, 2009, 8:24:32 AM
to zxing
Yes, I think we believe you're just never going to get enough accuracy
and speed to make a decent user experience on a device without good
optics, so we've punted on this as not a high priority. The current
process doesn't even look at most of the image... a 2D deconvolution
would by itself make it several times slower. The new native code API
makes that less of a problem in theory, but it still seems like a
problem far better solved by the camera itself.

This step happens before finding the finder patterns, so you have a
chicken-and-egg problem there... without it, it's hard to find the
finder patterns in the first place. Yes, in principle you could
estimate a point spread function by photographing a thin black line
or something. Automatically figuring out the kernel seems way hard.

I tried many variations on this, applying several Gaussian kernels,
large and small, and some based on photographs from the camera, and
could never get a result that was meaningfully better. I could have
been missing something.