Latest Hugin model, transform specification?


paul womack

Sep 30, 2014, 7:36:34 AM
to hugi...@googlegroups.com
I am using Hugin 2014.0.0.51ff237f209e.

Is there a document describing what the optimisation parameters

Yaw/Pitch/Roll/TrX/TrY/TrZ/Plane yaw/Plane pitch

are, and what the units are?

I am (once again) trying to get essentially arbitrary photographs
(whose only shared properties are overlapping subjects) to line up.

I'm trying to understand why I'm failing, and whether
success is even possible.

BugBear

T. Modes

Sep 30, 2014, 3:53:39 PM
to hugi...@googlegroups.com


On Tuesday, 30 September 2014 13:36:34 UTC+2, bugbear wrote:
> I am using Hugin 2014.0.0.51ff237f209e.
>
> Is there a document describing what the optimisation parameters
>
> Yaw/Pitch/Roll/TrX/TrY/TrZ/Plane yaw/Plane pitch
>
> are, and what the units are?

A description can be found here:
http://wiki.panotools.org/Stitching_a_photo-mosaic
This file is also included in the help that is delivered with Hugin.

Yaw/Pitch/Roll/Plane yaw/Plane pitch are angles in degrees.
TrX/TrY/TrZ are Cartesian coordinates (unitless; the reference is the panosphere with radius r=1).
 

Roger Broadie

Oct 1, 2014, 10:48:46 AM
to hugi...@googlegroups.com
On Tue, 30 Sep 2014 at 12:36:22 +0100 Paul Womack wrote

----------------
----------------

Paul, is this essentially a follow-up to your earlier question about
photographing old documents in record offices in sections and then
stitching them? If so, it certainly can be done, with the main problems
coming from the nature of the subject and the environment, including
low-contrast faint originals, poor lighting and the need to unfold the
originals or unroll them in sections.

I've been doing it for some years for the same purpose as you. Up to
now I have used the old technique of photographing the original as far
as possible at a constant height and straight downwards (assuming the
original is on a horizontal table) and then treating the camera as being
at a great distance by defining a very small fov for the individual
images. I then optimise the lens parameters individually for the
different images (except one fov must be frozen as an anchor), which
compensates to some extent for the inevitable departures in the camera
positioning from the ideal. Of course, this method is a kludge and
certainly does not meet your requirement that the photos should be
‘arbitrary’, which I take to mean the camera is not constrained in
position, direction or focal length. I think that may well be possible
in general for flat originals by using Hugin’s translation capability,
but personally I think it is prudent to give the optimiser a little more
help when taking the photographs.

I entirely agree that what is needed is a model that allows you to
understand what you are doing and what changes might help if it does not
work. I use the 2013 version, which I think is the latest available for
Windows, and that has the parameters

Yaw, Pitch, Roll, X, Y and Z

Your 2014 version appears to add two further parameters, ‘Plane Yaw’ and
‘Plane Pitch’. As I know nothing about those, I’d like to start with
the 2013 set.

It is explained in the Help files in the section ‘Stitching a
photo-mosaic’, which is also to be found here:
http://wiki.panotools.org/index.php?title=Stitching_a_photo-mosaic&oldid=13518.
I see Thomas Modes has already mentioned it, but it is rather brief (it
calls itself a stub) and does raise some questions. I have had to
puzzle over it a bit, so it may be worth expanding on the way I
interpret it; if others think what I have to say is wrong, perhaps they
will tell us. If this piece is longer than is normal on this group, it
is because there seems so little information about planar photomosaics,
as opposed to linear photomosaics, i.e. single rows of images, that a
longer account may benefit others, or at any rate engender comment and a
better understanding for me as well as others.

As explained in the help file, X, Y and Z are the coordinates of the
individual camera positions in a coordinate system with its origin at
the centre of the panosphere, the imaginary sphere on the surface of
which a normal central-viewpoint panorama is assembled during
optimisation. On the face of it, the panosphere is not needed for a
planar photomosaic, but I guess it is there first to latch onto the
existing code for assembling and outputting the completed panorama and
secondly because it is actually needed when the translation is used for its
original purpose of allowing a nadir to be stitched in (not something I
have any experience of).

So how about the scale? I think the diagram in the help files must
catch the situation in the middle of optimisation - after all, in what
is said to be a linear photomosaic the individual images do not
completely overlap on the image plane. Outboard of the image plane and
parallel to it must be a plane (call it the subject plane) that
represents the original document in the scale of the drawing. Then, in
the optimisation the two must be brought into coincidence. One can look
on that as inflating the panosphere, thereby pushing the image plane out
until it merges with the subject plane, or, alternatively, since the
radius of the panosphere is fixed at 1, as shrinking the scale of the
combination of the subject plane and the individual camera positions
until the subject plane is brought into coincidence with the image
plane. Formally, I think one can say that the unit distance of the
scale in which the coordinate distances for X, Y and Z are expressed is
the radius of the panosphere.
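
A minimal sketch of that scale rule, assuming the anchor camera sits at the panosphere centre so that its real distance to the document corresponds to the radius of 1. The function name and the millimetre figures are hypothetical illustrations, not anything Hugin provides.

```python
def to_panosphere_units(offset_mm, anchor_distance_mm):
    """Convert a real-world camera offset to panosphere units.

    Assumption: the anchor camera is at the panosphere centre, so the
    anchor-to-document distance corresponds to the radius, 1.
    """
    if anchor_distance_mm <= 0:
        raise ValueError("anchor distance must be positive")
    return offset_mm / anchor_distance_mm


# Hypothetical shoot: camera 500 mm above the document, second frame
# taken 200 mm to the right of the anchor frame.
trx = to_panosphere_units(200.0, 500.0)
print(trx)  # 0.4
```

On this reading, a frame shifted sideways by 40% of the shooting height would be expected to come out of the optimiser with an X value of about 0.4 in magnitude.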

Of course a coordinate system also needs axis directions. As far as I
can see, in the 2013 version, the Z axis is perpendicular to the image
plane, intersecting it where the panosphere touches it, and that is the
direction with respect to which yaw and pitch are measured (in degrees,
as Thomas has confirmed). One can expect yaw to be measured by
deflection along the X axis and pitch by deflection along the Y axis.

However, there are problems with the directions of these axes. The text
and the diagram seem not to agree about the direction of Z, the text
stating that the panosphere touches the image plane at (0,0,1) and the
diagram implying it does so at (0,0,-1). In fact I think it must be the
latter. All my examples seem to me to show that Z points, in the sense
of becoming more positive, downwards as seen in the diagram, i.e. in the
opposite direction to the arrow-head in the diagram. That corresponds
to the direction from the subject towards the camera. If that is right,
it would be helpful to those struggling with the explanation to make it
clear. Further, although X increases from left to right, as one might
expect, Y increases downwards, which is not what I, at least, would
expect. I did wonder if the directions were chosen with the right-hand
screw rule in mind, but in that case, with that configuration of X and
Y, surely Z should go into the paper. The direction of Z is arbitrary,
provided it is applied consistently, but it is important, because it
helps interpret the values thrown up by the optimiser when things go wrong.

One of the unexpected features of the model when applied to planar
mosaics is that since there is no idealised camera taking a
single-viewpoint panorama to be placed at the centre of the panosphere
there is nothing to anchor the centre of the panosphere: in the diagram
its position seems completely arbitrary, provided it does not actually
fall in the subject plane. But it has to be specified in some way, to
act as the origin for the camera positions. The simplest course is to
put one of the camera positions at (0,0,0) and anchor it (i.e. make it
non-optimisable). That is equivalent to making the centre of the
panosphere coincide with one of the actual camera positions. The
complete panorama is then the one seen from that position and the field
of view, for a planar mosaic of reasonable size, will stretch almost to
180°. I did wonder if it would improve matters to move the panosphere
centre further away from the subject, to get a smaller overall field of
view. Since the distance from this position to the image is 1 by
definition, that must be achieved by shrinking the system consisting of
the subject plane and the individual camera positions, which is done by
putting the anchor camera position at say (0,0,-0.5). Correspondingly,
putting the anchor at (0,0,1) will bring the panosphere centre closer to
the subject and increase the total field of view. In fact, I can't
detect any significant difference in the final quality between these
different positions. Taking Z=0 for the anchor seems as good a course as
any.
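
The effect of the anchor's Z value on the overall field of view can be checked with a little trigonometry. The sketch below assumes the sign convention argued for above (subject plane at Z = -1, Z increasing towards the camera), so the anchor camera is (1 + Z) panosphere units from the plane. The helper names and the 500 mm / 2000 mm figures are hypothetical.

```python
import math


def centre_distance(camera_distance, z_anchor):
    """Real-world distance from the panosphere centre to the subject plane.

    Assumption: the plane sits at Z = -1, so the anchor camera is
    (1 + z_anchor) panosphere units away from it.
    """
    if z_anchor <= -1.0:
        raise ValueError("anchor camera would lie on or behind the plane")
    return camera_distance / (1.0 + z_anchor)


def overall_fov_deg(subject_width, distance):
    """Angle subtended by a flat subject viewed head-on from its centre line."""
    return math.degrees(2.0 * math.atan(subject_width / (2.0 * distance)))


CAMERA_HEIGHT_MM = 500.0    # hypothetical shooting height
SUBJECT_WIDTH_MM = 2000.0   # hypothetical document width

for z in (-0.5, 0.0, 1.0):
    d = centre_distance(CAMERA_HEIGHT_MM, z)
    print(z, d, round(overall_fov_deg(SUBJECT_WIDTH_MM, d), 1))
```

This only shows the geometry: a negative anchor Z narrows the total field of view and a positive one widens it. It says nothing about stitch quality, which did not change noticeably in the trials described above.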

The procedure I have found to work generally is to take a set of photos
that cover the subject in a roughly regular array, with the camera
pointing generally downwards and held roughly at the same height and in
the same position. That can be supplemented with photos taken at an
angle, e.g. to avoid the camera casting a shadow on the subject, or
zoomed to capture areas of interest better. If necessary the basic set
can be used to establish the panorama and the supplementary images added
later.

In optimising I first set the output format at rectilinear (because
that’s easy to forget). Then I return all the parameters in the
optimiser to 0 and render everything non-optimisable except the X and Y
values for all positions except one, which thereby anchors the
panosphere centre. Keep the lens (assuming there has been no zooming)
non-optimisable for the moment at either the EXIF value or your own
calibration. Generate the control points and try a first optimisation.
What you want here is for the images to coalesce into a rough block:
any that are missing or are badly out of place probably show that there
are missing or erroneous control points. Investigate, add any control
points needed and remove any that are obviously bogus. Re-optimise and
with any luck you will get a sensible-looking block of images. Then you
can successively and cumulatively add the other non-anchor Z values,
then the roll and finally the yaw and pitch to the optimisation. By now
the optimisation should be pretty good and you can try an optimisation
of the lens parameters.
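
The staged procedure can be written down as a cumulative list of variables to switch on at each step. This is only a bookkeeping sketch: the labels such as "TrX1" (parameter name plus image index) are illustrative, and the function is hypothetical rather than part of Hugin.

```python
def staged_variables(n_images, anchor=0):
    """Return the cumulative set of optimised variables for each stage.

    Stage order follows the procedure above: X/Y first, then Z,
    then roll, then yaw and pitch. The anchor image is never optimised.
    """
    stages = [("TrX", "TrY"), ("TrZ",), ("r",), ("y", "p")]
    enabled = []
    result = []
    for params in stages:
        for i in range(n_images):
            if i == anchor:
                continue  # the anchor image stays fixed throughout
            enabled.extend(f"{p}{i}" for p in params)
        result.append(list(enabled))
    return result


for step, names in enumerate(staged_variables(3), start=1):
    print(step, " ".join(names))
```

Lens parameters are deliberately left out; as described, they are tried only after the position and orientation stages have settled.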

There is a particular problem if you have more than one lens. The
optimiser does seem a bit unstable if you try to optimise two different
lenses simultaneously, because if you repeatedly press the Optimise
button one of the fovs is prone to change each time. In fact what is
happening is that as the fov goes in one direction the corresponding Z
values go in the other so that the final size of the constituent image
stays the same. Perhaps with more than one lens you need to optimise
each on its own.
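
The trade-off between fov and Z can be illustrated with a pinhole model. The sketch assumes a camera pointing straight at the subject plane, with the plane at Z = -1 as interpreted earlier, so the camera-to-plane distance is 1 + Z. Both functions are hypothetical helpers, not Hugin calls.

```python
import math


def footprint_width(trz, hfov_deg):
    """Width of one image's footprint on the subject plane (panosphere units).

    Assumption: camera looks straight at the plane from distance 1 + trz.
    """
    distance = 1.0 + trz
    return 2.0 * distance * math.tan(math.radians(hfov_deg) / 2.0)


def trz_for_same_footprint(width, hfov_deg):
    """Z value that keeps the footprint at the given width for another fov."""
    return width / (2.0 * math.tan(math.radians(hfov_deg) / 2.0)) - 1.0


reference = footprint_width(0.0, 50.0)
for hfov in (40.0, 50.0, 60.0):
    z = trz_for_same_footprint(reference, hfov)
    print(hfov, round(z, 3), round(footprint_width(z, hfov), 6))
```

Every row gives the same footprint, which is why the optimiser can slide the fov one way and Z the other without changing the size of the remapped image.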

Z values can be particularly troublesome, especially if included in the
initial optimisation. If all the images are taken from roughly the same
height and the centre of the panosphere is made to coincide with one
camera position (i.e. it has Z=0) all the other Zs should be close to 0.
If any are close to, or even worse more negative than, -1, the
optimiser has failed. Sometimes simply resetting them to 0 and
reoptimising is all that is needed, though there may well be
control-point problems that need sorting out.
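
A quick sanity check along those lines, assuming the anchor image has Z=0 and that the optimised Z values have been copied into a list by hand. The function name and the margin of 0.1 are hypothetical choices.

```python
def suspicious_trz(trz_values, margin=0.1):
    """Return indices of images whose Z is near -1 or more negative.

    margin is an arbitrary tolerance: anything at or below -1 + margin
    is flagged for resetting to 0 and re-optimising.
    """
    return [i for i, z in enumerate(trz_values) if z <= -1.0 + margin]


# Made-up values for five images, purely to show the check.
print(suspicious_trz([0.0, 0.03, -0.97, -1.4, 0.1]))  # [2, 3]
```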

And what about the direction of the axis with respect to which yaw and
pitch are measured? If yaw and pitch are anchored at 0 for one camera
position, it will be better if that position is one for which the camera
was pointed directly down, although it is always possible to move the
panorama in the Preview window until it looks right. But it is best to
define horizontal and vertical lines if any are available. Often a map
has a surrounding border which is ideal for the purpose and Hugin seems
to respond particularly well to these controls. Of course all the yaw,
pitch and roll parameters need to be optimisable for this approach to
work properly.

As to the added parameters Plane Yaw and Plane Pitch, I cannot at all
say what they are meant to do. They look very like the VP Pan and VP
Tilt of PTGui Pro, and there seems remarkably little information about
how they work. But if Hugin has introduced them they must presumably
have a substantial purpose. Possibly they are intended to define the Z
axis as pointing perpendicularly to the plane of the nadir while the
axis of the panorama as a whole, assuming it to be a central panorama,
points differently. For a planar photomosaic that consideration does not
apply and it may be that the plane yaw and pitch parameters can be left
at 0. On the other hand, PTGui Pro does a good job on planar
photomosaics (at least as good as Hugin in the interior of the stitch,
though possibly with less control on the shape of the whole) and it does
not seem possible to suppress its practice of allowing both Yaw and VP
Pan to vary, as well as Pitch and VP Tilt. So it would be worth seeing
if allowing Yaw and Plane Yaw and Pitch and Plane Pitch all to vary
under optimisation improves the result in Hugin.

Good luck

Roger Broadie

bugbear

Oct 2, 2014, 4:48:19 AM
to hugi...@googlegroups.com
paul womack wrote:

> I'm trying to understand why I'm failing, and whether
> success is even possible.

I have discovered a little about what's possible.

I decided to make a test using "perfect" data.

So I took a large (4327x3104) screen shot of a google map,
and used Gimp to:

scale
rotate (followed by fit canvas to layers)
perspective warp

the map. Both images were saved as PNGs with no transparency.
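
The same kind of "perfect" test data can be produced numerically: a scale, a rotation and a perspective warp of a flat image compose into a single 3x3 homography, so exact control-point pairs can be computed directly. This is a sketch under assumptions: the warp values are arbitrary, the helper names are hypothetical, and the canvas shift from "fit canvas to layers" is ignored.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def scale(s):
    return [[s, 0.0, 0.0], [0.0, s, 0.0], [0.0, 0.0, 1.0]]


def rotate(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def perspective(px, py):
    return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [px, py, 1.0]]


def apply_h(h, x, y):
    """Map pixel (x, y) through homography h, dividing by the w term."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return xh / w, yh / w


# Scale first, then rotate, then perspective (arbitrary example values).
H = matmul3(perspective(1e-5, 2e-5), matmul3(rotate(10.0), scale(0.8)))

WIDTH, HEIGHT = 4327, 3104  # size of the screenshot mentioned above
grid = [(x, y) for x in (0, WIDTH // 2, WIDTH) for y in (0, HEIGHT // 2, HEIGHT)]
pairs = [((x, y), apply_h(H, x, y)) for x, y in grid]
for src, dst in pairs:
    print(src, tuple(round(v, 2) for v in dst))
```

Each pair is a source pixel and its exact position in the warped image, i.e. an error-free control point for checking what the optimiser can recover.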

I imported both images, and set separate lenses.

After setting lots of CPs (manually; even fine adjust wouldn't work for me),
I left all settings on the first image alone, and allowed the optimiser to adjust
Yaw/Pitch/Roll/TrX/TrY/TrZ/Plane yaw/Plane pitch and HFOV.

The optimisation was near enough perfect.

average control point distance: 0.082138
standard deviation: 0.043298
maximum: 0.168715

The question is now - why can't I get this on "real" shots?

BugBear

T. Modes

Oct 3, 2014, 4:08:33 AM
to hugi...@googlegroups.com


On Thursday, 2 October 2014 10:48:19 UTC+2, bugbear wrote:
> I imported both images, and set separate lenses.
>
> After setting lots of CPs (manually, and even fine adjust wouldn't work for me)
> I left all settings on the first image alone, and allowed the optimiser to adjust
> Yaw/Pitch/Roll/TrX/TrY/TrZ/Plane yaw/Plane pitch and HFOV.

Some comments from me:
* When stitching a map, or whenever everything lies on the same plane, you don't need to optimize plane yaw/plane pitch. These variables should only be used when several remapping planes are needed in the pictures (e.g. for patching a nadir image). For stitching a map you have only one defined remapping plane.
* Optimizing TrZ *and* HFOV for all images with separate lenses is not helpful. Changing TrZ and changing HFOV both have a very similar effect on the remapped image (both scale the image), so the optimizer can achieve the same result with two different sets of values. This makes the optimisation more fragile than necessary.
Optimize only TrZ or only HFOV. Alternatively, don't use separate lenses for each image (HFOV is then linked across all images, which also helps the optimizer).

Thomas

Rogier Wolff

Oct 3, 2014, 5:36:18 AM
to hugi...@googlegroups.com
On Fri, Oct 03, 2014 at 01:08:33AM -0700, T. Modes wrote:
> * Optimizing TrZ *and* HFOV for all images with separate lenses is not
> helpful. Changing TrZ and HFOV have both a very similar effect on the
> remapped image (both scale the image). So the optimizer can achieve the
> same result with 2 different values. This makes the optimisation more
> fragile than needed.

"more fragile" is sort of an understatement. When the size of an image
ends up like size = C * TrZ * HFOV, optimizing for both should
converge on any solution where TrZ * HFOV comes out to the same value.

Then the optimization is in a difficult situation. Each time it has a
tentative solution it will figure out if the solution becomes better
or worse when increasing or decreasing each variable. Well, the
solution will have ALMOST no change when decreasing HFOV as long as
you increase TrZ proportionally. But going the other way also has a
similar effect. In practice the computer will always find a minimal
effect, and end up with one of the variables huge and the other very
small. Small computational rounding issues will determine which way
things will go. But you'll usually end up with a "silly" solution
that exaggerates any computational rounding errors. So you'll end up with
something like: "Oh, you took this picture from the moon, with a very,
very good telescope!", but then tilting the telescope just a fraction
of a degree will make the placement of the picture totally wrong.
That sort of stuff.

Even more practical, the transformations are probably not identical,
but only very close. So slight mis-positioning of a control point
(even at fractions of a pixel) will cause the algorithm to go haywire
into some silly direction.
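
The degeneracy can be seen in a toy version of the simplified model above, size = C * TrZ * HFOV, fitted to a single target size. The numbers are arbitrary toy values (not realistic Hugin parameters) and the function names are hypothetical. The point is only that every pair with the same product fits perfectly, and that the Hessian of the squared error is singular there, so the data give no preferred point along that valley.

```python
def loss(trz, hfov, c=1.0, target=6.0):
    """Squared error of the toy size model: size = c * trz * hfov."""
    return (c * trz * hfov - target) ** 2


def hessian_det(trz, hfov, c=1.0, target=6.0):
    """Determinant of the analytic 2x2 Hessian of loss()."""
    d_aa = 2.0 * (c * hfov) ** 2                      # d2L / d(trz)2
    d_bb = 2.0 * (c * trz) ** 2                       # d2L / d(hfov)2
    d_ab = 2.0 * c * (2.0 * c * trz * hfov - target)  # mixed derivative
    return d_aa * d_bb - d_ab ** 2


# Three very different pairs with the same product (values chosen to be
# exact in binary floating point).
for trz, hfov in [(2.0, 3.0), (0.125, 48.0), (1.0 / 1024.0, 6144.0)]:
    print(trz, hfov, loss(trz, hfov), hessian_det(trz, hfov))
```

A zero determinant means one direction in (TrZ, HFOV) space costs nothing, which is the "moon and telescope" direction described above; fixing either variable removes it.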

Roger.

--
+-- Rogier Wolff -- www.harddisk-recovery.nl -- 0800 220 20 20 --
- Datarecovery Services Nederland B.V. Delft. KVK: 30160549 -
| Files gone, documents lost, all data gone?!
| Stay calm and contact Harddisk-recovery.nl!

bugbear

Oct 3, 2014, 7:14:57 AM
to hugi...@googlegroups.com
Since my images have different pixel counts, sharing a lens causes some issues.
But you're quite right, I have set TrZ to 0 on both images, and the optimisation
still finds a nigh-perfect solution.

But if I lock-down plane pitch, the optimiser cannot find a perfect solution.

While I'm here, I'm still getting the fault in the "Fast Preview" display where the "Overview" panel
of the dialog is not fully drawn, so that background shows through.

This occurs both when "docked" and "free".

BugBear

