Workflow for Stereo capture

Jonathan Hung

Jul 9, 2012, 2:54:44 PM

to Decapod, Martin Krämer

Hi Martin,

I would like to start discussions about how Stereo capture would work for an end user.

Perhaps the best way to start the conversation would be for me to give my understanding of how it would work, and for you to help correct and add where needed.

1. Hardware setup and calibration:

- the book to be digitized is placed open on a surface

- a pair of cameras or a single stereo camera is mounted

- cameras are positioned so that the book / pattern are completely viewable by both cameras

2. Calibrate using a checkerboard pattern:

- checkerboard pattern is placed flat on surface in front of properly positioned cameras.

- X number of samples are taken to determine positioning of cameras. Between capture of each sample, the pattern is moved a little.

- The software would tell the user that calibration is complete.

3. Capture

- user can start capturing using the software

- for each page spread, two images of the spread are produced

4. Dewarp

- Input: a stereo pair of images

- Output: a dewarped rendition of the book surface

Caveats / Limitations:

- cameras should not be moved once calibrated

Questions:

- can the cameras be mounted with different "toe-in" angles?

- can the cameras be mounted with different "pitch" angles?

- if the cameras are not positioned properly, will the calibrator know to abort?

- for calibration, is it a fixed number of samples or will it vary from session to session?

- what additional data does the dewarp script require in addition to the two input images? I assume some sort of calibration data (a text or XML file)?

That's all for now.

Thanks Martin!

- Jon.

Martin Krämer

Jul 10, 2012, 4:34:44 PM

to Jonathan Hung, Decapod

Hi Jon,

I'll annotate your mail w/ MK. Hope this clarifies the procedure a bit!

Cheers
Martin

On Mon, Jul 9, 2012 at 8:54 PM, Jonathan Hung <jh...@ocadu.ca> wrote:
> Hi Martin,
>
> I would like to start discussions about how Stereo capture would work for an
> end user.

MK: Ok, sure

>
> Perhaps the best way to start the conversation would be for me to give my
> understanding of how it would work, and for you to help correct and add
> where needed.
>
> 1. Hardware setup and calibration:
> - the book to be digitized is placed open on a surface
> - a pair of cameras or a single stereo camera are mounted
> - cameras are positioned so that the book / pattern are completely viewable
> by both cameras

MK: Generally, this step is only needed to verify that the book(s) are
completely visible from both cameras.

> 2. Calibrate using a checkerboard pattern:

MK: Regarding the calibration we will have to see whether we can work
with a stripped down version. But let's discuss the full regular
procedure for now.

> - checkerboard pattern is placed flat on surface in front of properly
> positioned cameras.
> - X number of samples are taken to determine positioning of cameras. Between
> capture of each sample, the pattern is moved a little.

MK: It probably won't work if the pattern is only moved a little
across a flat surface each time.
The user should vary the position of the checkerboard pattern
so that some rotation and skew are introduced as well.

> - The software would tell the user that calibration is complete.

MK: Yes, this will be possible. We will directly try to detect corners
of the checkerboard pattern after taking each image and be able to
stop after having a certain number of good image pairs.
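
A minimal sketch of the stopping rule described above, not the Decapod implementation. The corner detector is passed in as a hypothetical `find_corners` callable that returns `None` when detection fails, and the default of 15 good pairs is an assumption taken from the figure mentioned later in this thread.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical types: an "image" is whatever the capture layer returns, and a
# detector returns a list of (x, y) corner points, or None if detection fails.
Image = object
Corners = Optional[List[Tuple[float, float]]]


def collect_calibration_pairs(
    pairs: Iterable[Tuple[Image, Image]],
    find_corners: Callable[[Image], Corners],
    required_good_pairs: int = 15,  # assumed default, not a fixed requirement
) -> Tuple[list, int]:
    """Consume stereo pairs until enough usable ones have been collected.

    A pair counts as good only if the pattern is detected in BOTH views.
    Returns (good_pairs, total_pairs_taken).
    """
    if required_good_pairs < 1:
        raise ValueError("required_good_pairs must be at least 1")
    good = []
    taken = 0
    for left, right in pairs:
        taken += 1
        left_corners = find_corners(left)
        right_corners = find_corners(right)
        if left_corners is not None and right_corners is not None:
            good.append((left_corners, right_corners))
        if len(good) >= required_good_pairs:
            break  # the UI can now tell the user that capture is complete
    return good, taken


if __name__ == "__main__":
    # Simulated detector: "bad" stands in for an image where detection fails.
    def fake_find_corners(image):
        return None if image == "bad" else [(0.0, 0.0)]

    shots = [("ok", "ok"), ("ok", "bad"), ("ok", "ok"), ("ok", "ok")]
    good, taken = collect_calibration_pairs(shots, fake_find_corners, 2)
    print(len(good), "good pairs after", taken, "captures")
```

Because failed detections are skipped rather than counted, the number of captures the user makes can exceed the number of good pairs required.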

> 3. Capture
> - user can start capturing using the software
> - for each page spread, two images of the spread are produced
>
> 4. Dewarp
> - Input: a stereo pair of images
> - Output: a dewarped rendition of the book surface

MK: Technically we get one image per page, i.e. two images from the
dewarping procedure. We could of course also stitch them together, but
one image per page is probably better.

>
> Caveats / Limitations:
> - cameras should not be moved once calibrated

MK: Specifically their relative position and orientation to each other
must not be changed. If properly mounted on a tripod with a stereo bar
it may be possible to move the tripod around (depending on whether the
stereo bar is robust enough to hold the cameras in exactly the same
position).
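
A toy illustration of this point, assuming each camera pose is written as a rotation `R` and translation `t` that map camera coordinates to world coordinates. All function names and numbers are made up for the example. Moving the whole rig applies the same rigid motion to both cameras and leaves their relative pose unchanged, while bumping a single camera does not.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rot_y(degrees):
    """Rotation about the y axis."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(degrees):
    """Rotation about the z axis."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def relative_pose(left, right):
    """Pose of the right camera expressed in the left camera's frame."""
    (r_left, t_left), (r_right, t_right) = left, right
    r_left_t = transpose(r_left)
    baseline = [t_right[i] - t_left[i] for i in range(3)]
    return matmul(r_left_t, r_right), matvec(r_left_t, baseline)


def move(pose, g_rot, g_shift):
    """Apply one rigid motion (rotation, then shift) to a camera pose."""
    r, t = pose
    return matmul(g_rot, r), [x + s for x, s in zip(matvec(g_rot, t), g_shift)]


def same_pose(p, q, tol=1e-9):
    """Compare two (R, t) poses entry by entry."""
    flat_p = [x for row in p[0] for x in row] + list(p[1])
    flat_q = [x for row in q[0] for x in row] + list(q[1])
    return all(abs(x - y) <= tol for x, y in zip(flat_p, flat_q))


if __name__ == "__main__":
    left = (rot_y(5.0), [0.0, 0.0, 0.0])
    right = (rot_y(-5.0), [0.2, 0.0, 0.0])
    before = relative_pose(left, right)

    # Whole rig moved: the same motion is applied to both cameras.
    g_rot, g_shift = rot_z(30.0), [1.0, 2.0, 0.5]
    after = relative_pose(move(left, g_rot, g_shift), move(right, g_rot, g_shift))
    print("rig moved as a unit, relative pose unchanged:", same_pose(before, after))

    # Only the right camera is nudged by one degree.
    bumped = relative_pose(left, move(right, rot_y(1.0), [0.0, 0.0, 0.0]))
    print("one camera bumped, relative pose unchanged:", same_pose(before, bumped))
```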

MK: Possible other caveats may be that we may require a uniform
background color for the capturing procedure. It may also be possible
that we require a stick-like object of uniform color to allow for page
separation.
We'll have to conduct more experiments there to determine what the
final setup will look like.

>
> Questions:
> - can the cameras be mounted with different "toe-in" angles?
> - can the cameras be mounted with different "pitch" angles?

MK: I cannot give a final answer to this. If we generally stay with a
full calibration it shouldn't be an issue. But as we are trying to
reduce the effort it may also be restricted to parallel setups in the
end.

> - if the cameras are not positioned properly, will the calibrator know to
> abort?

MK: Right now it's hard to tell what the calibration will look like in
that much detail. But during calibration we will likely only be able
to determine when the checkerboard pattern is not fully visible in
both cameras.
Other problems with calibration can only be detected after taking all
the pictures and actually performing the calibration.

> - for calibration, is it a fixed number of samples or will it vary from
> session to session?

MK: In practice it should work with a fixed number of image pairs. But
as corner detection may fail in some cases, the number of images the
user has to take may vary.

> - what additional data does the dewarp script require in addition to the two
> input images? I assume some sort of calibration data (a text or XML file)?

MK: Currently it requires both input images and a directory containing
several XML files for calibration. We can pack these together into a
single file in a future version though.
It also requires the user to specify the height of the output image at
the moment. It shouldn't be a problem to provide a default setting in a
future version as well.

> That's all for now.
>
> Thanks Martin!

MK: No problem.
>
> - Jon.

Jonathan Hung

Jul 12, 2012, 3:53:30 PM

to dec...@googlegroups.com, Martin Krämer

Hi Martin,

My comments inline..

On Tue, Jul 10, 2012 at 4:34 PM, Martin Krämer <kra...@iupr.com> wrote:

> > Caveats / Limitations:
> > - cameras should not be moved once calibrated
>
> MK: Specifically their relative position and orientation to each other
> must not be changed. If properly mounted on a tripod with a stereo bar
> it may be possible to move the tripod around (depending on whether the
> stereo bar is robust enough to hold the cameras in exactly the same
> position).

Okay good to know. I think for now we will require users to calibrate with each new session. This may not be a big deal if the process is painless.

> MK: Possible other caveats may be that we may require a uniform
> background color for the capturing procedure. It may also be possible
> that we require a stick-like object of uniform color to allow for page
> separation.
> We'll have to conduct more experiments there to determine how the
> final setup will look like.

I think a solid background is acceptable. Let me know what you discover.

> >
> > Questions:
> > - can the cameras be mounted with different "toe-in" angles?
> > - can the cameras be mounted with different "pitch" angles?
>
> MK: I cannot give a final answer to this. If we generally stay with a
> full calibration it shouldn't be an issue. But as we are trying to
> reduce the effort it may also be restricted to parallel setups in the
> end.

I think we should go with whatever approach will give the best results with least user effort. If the full calibration is longer, but has a more flexible hardware setup, then we may want to consider that. Often users will take a longer route to accomplishing their task because they perceive it as easier, not necessarily because it's quicker.

> > - for calibration, is it a fixed number of samples or will it vary from
> > session to session?
>
> MK: It should practically work with a fixed number of image pairs. But
> as corner detection may fail in some cases the number of images the
> user has to take may vary.

Do you know roughly how many images are required to do a full calibration? How many for a "quick" calibration?

Sounds like corner detection is a major requirement for calibration. Does this imply a solid background to calibrate on to improve corner detection?

> > - what additional data does the dewarp script require in addition to the two
> > input images? I assume some sort of calibration data (a text or XML file)?
>
> MK: Currently it requires both input images and a directory containing
> several XML files for calibration. We can pack these together into a
> single file in a future version though.
> It also requires the user to specify height of output image at the
> moment. Shouldn't be a problem to provide a default setting in a
> future version as well.

Do you have an example of the command line options for the script? This would give us an idea of what we will need to code for the UI.

Does each pair of images have unique calibration / XML data, or does each pair of images get the same data?

- Jon.

Martin Krämer

Jul 12, 2012, 5:25:45 PM

to Jonathan Hung, dec...@googlegroups.com

Hi Jon,

see inline.

On Thu, Jul 12, 2012 at 9:53 PM, Jonathan Hung <jh...@ocadu.ca> wrote:
> Hi Martin,
>
> My comments inline..
>
>
> On Tue, Jul 10, 2012 at 4:34 PM, Martin Krämer <kra...@iupr.com> wrote:
>>
>>
>> > Caveats / Limitations:
>> > - cameras should not be moved once calibrated
>> MK: Specifically their relative position and orientation to each other
>> must not be changed. If properly mounted on a tripod with a stereo bar
>> it may be possible to move the tripod around (depending on whether the
>> stereo bar is robust enough to hold the cameras in exactly the same
>> position).
>>
>
> Okay good to know. I think for now we will require users to calibrate with
> each new session. This may not be a big deal if the process is painless.
>

MK: Guess that is okay. It shouldn't be a terrible effort and we can
allow saving the data in a later version.

>>
>> MK: Possible other caveats may be that we may require a uniform
>> background color for the capturing procedure. It may also be possible
>> that we require a stick-like object of uniform color to allow for page
>> separation.
>> We'll have to conduct more experiments there to determine how the
>> final setup will look like.
>
>
> I think a solid background is acceptable. Let me know what you discover.
>

MK: Ok, sure.

>>
>>
>> >
>> > Questions:
>> > - can the cameras be mounted with different "toe-in" angles?
>> > - can the cameras be mounted with different "pitch" angles?
>> MK: I cannot give a final answer to this. If we generally stay with a
>> full calibration it shouldn't be an issue. But as we are trying to
>> reduce the effort it may also be restricted to parallel setups in the
>> end.
>
>
> I think we should go with whatever approach will give the best results with
> least user effort. If the full calibration is longer, but has a more
> flexible hardware setup, then we may want to consider that. Often users will
> take a longer route to accomplishing their task because they perceive it was
> easier, not necessarily because it's quicker.
>

MK: Yes, I tend to think that reduced calibration should be more
user-friendly, but we can't make a final decision yet.

>> > - for calibration, is it a fixed number of samples or will it vary from
>> > session to session?
>> MK: It should practically work with a fixed number of image pairs. But
>> as corner detection may fail in some cases the number of images the
>> user has to take may vary.
>
>
> Do you know roughly how many images are required to do a full calibration?
> How much for a "quick" calibration?
>

MK: 15 should be fine for a full calibration. More won't hurt. I can't
tell yet what the quick calibration will look like in the end.
We're currently experimenting with a single image of two chessboard
patterns at a 90° angle to each other, but don't have
results yet.

> Sounds like corner detection is a major requirement for calibration. Does
> this imply a solid background to calibrate on to improve corner detection?
>

MK: No, it just needs to detect the chessboard corners, which only
requires good lighting and a reasonable print quality of the pattern on a
non-reflective surface.
The solid background is only important for dewarping the book.

>>
>> > - what additional data does the dewarp script require in addition to the
>> > two
>> > input images? I assume some sort of calibration data (a text or XML
>> > file)?
>> MK: Currently it requires both input images and a directory containing
>> several XML files for calibration. We can pack these together into a
>> single file in a future version though.
>> It also requires the user to specify height of output image at the
>> moment. Shouldn't be a problem to provide a default setting in a
>> future version as well.
>
>
> Do you have an example of the command line options for the script? This
> would give us an idea of what we will need to code for the UI.
>

MK: Currently it's like:
./script <calibration data path> <left image path> <right image path>
<output height> <output path>
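
For the UI side, a minimal sketch of how that argument order could be assembled from Python. `./script` is the placeholder name from the mail above, the function name is hypothetical, and the file names are invented for the example. Nothing here is the project's actual wrapper.

```python
import shlex
from typing import List


def build_dewarp_command(
    script: str,
    calibration_dir: str,
    left_image: str,
    right_image: str,
    output_height: int,
    output_path: str,
) -> List[str]:
    """Return the argument list in the order given in the mail:
    <calibration data path> <left image> <right image> <output height> <output path>
    """
    if output_height <= 0:
        raise ValueError("output_height must be a positive number of pixels")
    return [
        script,
        calibration_dir,
        left_image,
        right_image,
        str(output_height),
        output_path,
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_dewarp_command(
        "./script", "calibration", "spread01_left.jpg", "spread01_right.jpg",
        2000, "out",
    )
    print(shlex.join(cmd))
```

Keeping the command as a list rather than one string means it can be handed to `subprocess.run(cmd, check=True)` without shell quoting problems for paths that contain spaces.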

> Does each pair of images has unique calibration / XML data, or does each
> pair of image get the same data?
>

MK: Each image pair uses the same calibration data. It only has to be
determined once in the beginning.
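
A small sketch of what that means for a capture session: one calibration directory is shared by every dewarp job. It assumes, purely for illustration, that left and right images are stored under names that sort in capture order. The `DewarpJob` type and the file names are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class DewarpJob(NamedTuple):
    calibration_dir: str  # the same for every job in a session
    left_image: str
    right_image: str


def plan_dewarp_jobs(
    calibration_dir: str,
    left_images: Iterable[str],
    right_images: Iterable[str],
) -> List[DewarpJob]:
    """Pair left and right captures by sorted order and attach the single
    calibration directory produced at the start of the session."""
    lefts = sorted(left_images)
    rights = sorted(right_images)
    if len(lefts) != len(rights):
        raise ValueError("every left image needs a matching right image")
    return [DewarpJob(calibration_dir, l, r) for l, r in zip(lefts, rights)]


if __name__ == "__main__":
    jobs = plan_dewarp_jobs(
        "calibration",
        ["left_002.jpg", "left_001.jpg"],
        ["right_001.jpg", "right_002.jpg"],
    )
    for job in jobs:
        print(job)
```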

>
> - Jon.

Martin

Jonathan Hung

Aug 7, 2012, 11:48:12 AM

to dec...@googlegroups.com

Hi Martin,

Been thinking through the camera calibration some more. I have some more questions.

1. Is it possible to do calibration with a single capture? If so, what are the restrictions and conditions the user needs to follow? What is the ideal case that results in the least number of captures?

2. If multiple images are absolutely needed for calibration, can the types of calibration images be categorized and are the numbers fixed?

i.e. Calibration requires:

- X number of "well positioned" shots.

- Y number of rotated pattern shots.

- Z number of skewed pattern shots.

3. Is the purpose of changing the position of the checkerboard to cover the view area of both cameras over a series of images?

4. What happens if one of the cameras is mounted upside down? Will calibration know to rotate it or will this be a failure case? (This may happen when mounting two cameras on a bar in portrait orientation where the tops of both cameras are facing each other).

5. Can changing the size / configuration of the calibration target speed up or improve end-user experience? i.e. Will a bigger pattern with more grids give a quicker calibration? What if the user placed multiple targets in front of the camera instead of one? What if the pattern were a "T" pattern thus giving additional information about camera orientation?

Thanks!

- Jon.

--

---

JONATHAN HUNG

INCLUSIVE DESIGNER, IDRC

T: 416 977 6000 x3951

F: 416 977 9844

E: jh...@ocadu.ca

OCAD UNIVERSITY

Inclusive Design Research Centre

205 Richmond Street W, Toronto, ON, M5V 1V3

www.ocadu.ca

www.idrc.ocad.ca

Martin Krämer

Aug 8, 2012, 5:06:02 PM

to dec...@googlegroups.com

Hi Jon,

see below.

On Tue, Aug 7, 2012 at 5:48 PM, Jonathan Hung <jh...@ocadu.ca> wrote:
> Hi Martin,
>

> Been thinking through the camera calibration some more. I have some more
> questions.
>
> 1. Is it possible to do calibration with a single capture? If so, what are
> the restrictions and conditions the user needs to follow? What is the ideal
> case that results in the least number of captures?

It depends in the end. Standard calibration as it is implemented now
requires around 15-25 pictures for good results according to my
initial experience.
Theoretically it is possible to calibrate on a single image, but then
it requires two planes at a 90° angle to each other, compared to
the x shots of one plane for normal calibration. We have some
prototype code to do that.
Depending on how much time is left after dewarping implementation I'll
try to implement a module, which calibrates based on a single shot of
a Rubik's cube.
But I can't promise to deliver such a thing right now as my main focus
is on the dewarping part.

>
> 2. If multiple images are absolutely needed for calibration, can the types
> of calibration images be categorized and are the numbers fixed?
>
> i.e. Calibration requires:
> - X number of "well positioned" shots.
> - Y number of rotated pattern shots.
> - X number of skewed pattern shots.

We could try to formulate it like this, and there are actually
publications that analyze the underlying maths in detail.
But in practice it's probably simpler to just take 20 images, each from a
slightly different perspective.
I haven't encountered a case where this didn't suffice.

>
> 3. Is the purpose of changing the position of the checkerboard to cover the
> view area of both cameras over a series of images?

No, the checkerboard always has to be completely visible in both
views. It's rather to introduce enough variance to allow derivation of
a fitting camera model.
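
One way to see why the variation matters, assuming a planar-target calibration in the style of Zhang's method (an assumption; the thread does not name the method the implementation uses). Each view of the flat checkerboard yields a homography $H$ between pattern and image, and its first two columns constrain the intrinsic matrix $K$:

```latex
\begin{aligned}
H &= \begin{bmatrix} h_1 & h_2 & h_3 \end{bmatrix}
   \simeq K \begin{bmatrix} r_1 & r_2 & t \end{bmatrix},
   \qquad B = K^{-\top} K^{-1}, \\
h_1^{\top} B \, h_2 &= 0, \\
h_1^{\top} B \, h_1 &= h_2^{\top} B \, h_2 .
\end{aligned}
```

Here $r_1, r_2$ are the first two columns of the pattern's rotation and $t$ is its translation. $B$ is symmetric and defined only up to scale, so it has 5 unknowns, while each view contributes 2 equations; at least 3 views are therefore needed for a general $K$. The equations involve only $h_1 \simeq K r_1$ and $h_2 \simeq K r_2$, so a view in which the pattern is merely shifted, or stays parallel to an earlier pose, repeats constraints that are already known. In practice many more views are taken to average out noise and to estimate lens distortion and the relative pose of the two cameras.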

>
> 4. What happens if one of the cameras is mounted upside down? Will
> calibration know to rotate it or will this be a failure case? (This may
> happen when mounting two cameras on a bar in portrait orientation where the
> tops of both cameras are facing each other).
>

It shouldn't matter in the end with regard to calibration, but I'm
not 100% sure that there are no problems in the rest of the pipeline.
In theory it's no problem, but there may be some "hard-coded"
parts that will cease to work (generally I'd say it should already
work, I'm just not sure).
I'd have to try it at least once with my current implementation to see.

> 5. Can changing the size / configuration of the calibration target speed up
> or improve end-user experience? i.e. Will a bigger pattern with more grids
> give a quicker calibration? What if the user placed multiple targets in
> front of the camera instead of one? What if the pattern were a "T" pattern
> thus giving additional information about camera orientation?

Theoretically multiple targets could help, but in practice it will
make corner detection too complicated. So in my opinion the feasible
options at the moment are either the single-shot Rubik's cube
calibration or a full calibration. Anything else doesn't seem feasible
with the current project plan.

>
> Thanks!
>
> - Jon.

Welcome

- Martin
