I'm not sure how many people are aware, but there has been a lot of development lately in the enblend/enfuse repository. Most importantly, Christoph Spiel's staging branch has been fully merged in, which introduces OpenMP support on platforms that provide it and resolves several long-standing leaks and bugs.
I've packaged up some snapshots for Windows users covering the possible configuration options (OpenMP support, GPU support, 64-bit support) and posted them at http://ryan.sleevi.com/files/enblend-enfuse-4.0/
In a change from previous builds, I've linked them against the DLL-based versions of the MS C/C++ Runtime (for various reasons), and I want to make sure that process is working smoothly for people. Most importantly, if you're planning to use OpenMP, you'll need to make sure you have the redistributable installed. Many people already have the necessary files, as they have been shipping for some time, but for those who have issues, the two links are:
This isn't an "official" release of 4.0 by any means; it is an attempt to get dev binaries out there and a chance to try out a more automated build system that should make it easier to package enblend/enfuse and Hugin in the future, so feedback would be appreciated.
Test machine: XP SP2 x86, E6550 @ 3.35GHz, 4GB RAM, nVidia 9800GT 512MB,
fairly new drivers
(I just saw that Google censored my CPU in the last post :D )
The old version, 3.2, has been crashing for me a lot lately on bigger
projects; I'll try to test it against this one when I find some time.
Also, when running with --gpu I get the following warnings:
-------------------
: info: using graphics card: NVIDIA Corporation
: info: renderer: GeForce 9800 GT/PCI/SSE2
: info: GL info log
0(1) : warning C7531: global type sampler2DRect requires "#extension
GL_ARB_texture_rectangle : enable" before use
-------------------
...
enblend: info: state (130, 580) weight = nan
enblend: info: state (117, 580) weight = nan
enblend: info: state (104, 580) weight = nan
enblend: info: new estimate = (-2147483648, -2147483648)
enblend: warning: new mean field estimate outside cost image
-------------------
> Test machine: Xp sp2 x86, e6...@3.35GHz 4gb ram, nVidia 9800GT 512mb
The next level would be to run this kind of test in the installer and select the appropriate version to install on the system.
If a dual core is already so much faster than GPU stitching, I'd tend to install the OpenMP version on any x86 multi-core. Not sure about the old multi-threading Pentium. Leave the GPU-accelerated version only for people with old boxes and powerful GPUs.
Fine! I tried the x86 OpenMP, SSE2 version (assuming that this is right for a Core 2 Duo). I copied just enblend.exe into my hugin/bin directory (latest available build). The error I get does not look like an MP problem or the like; it's about an "unrecognized wrap-around mode". I couldn't copy the output, because the window disappears right after clicking OK (and you can't select text before that), but I suspected this would happen and made a screenshot: http://www.joachimschneider.info/enblend-error/enblend-error.gif The first of the numbers mentioned is the width.
Any idea?
Btw, there are droplets supplied only for enfuse but not enblend. Is it because those for enblend didn't need any changes and I can continue to use them?
> Probably because enblend was called with "-w -f3000x1500".
Yes, there are some changes in how enblend / enfuse is called in 4.0.
We will need to adapt this in Hugin, ideally with an enblend --version call to find out if we're using a "legacy" enblend-enfuse, and with the Makefile adapted to the enblend-enfuse "generation" found on the machine.
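A minimal sketch of that version check, in Python rather than in Hugin's own code. It assumes the first line of the output looks like the version strings quoted in this thread ("enblend 3.2-cvs", "enblend 4.0-b93c2aed500d"). The function names and the "anything before 4.0 is legacy" rule are illustrative assumptions, not Hugin's actual logic.

```python
import re
import subprocess


def parse_enblend_version(version_output):
    """Return (major, minor) parsed from the first output line, or None."""
    text = version_output.strip()
    if not text:
        return None
    first_line = text.splitlines()[0]
    match = re.match(r"enblend\s+(\d+)\.(\d+)", first_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_legacy_enblend(version_output):
    """Assumed rule: versions before 4.0 use the old option syntax."""
    version = parse_enblend_version(version_output)
    return version is not None and version < (4, 0)


def query_enblend(executable="enblend"):
    """Run `enblend --version` and return everything it printed."""
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True
    )
    # Combine both streams, since the sketch does not assume which one is used.
    return result.stdout + result.stderr
```

A caller could then choose the option syntax with something like is_legacy_enblend(query_enblend()), and fall back to asking the user when the version cannot be parsed.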
Sorry, I was too fast posting; I realized the cause was the changed option syntax for enblend. I have to review my enblend options in Hugin. But this probably means we need new droplets for enblend, too. regards Joachim
> Fine! I tried the x86 OpenMP, SSE2 version (assuming that this is right
> for a core2duo).
> I copied just enblend.exe into my hugin/bin directory (latest available
> build).
> The error I get looks not like it's a problem of MP or the like, its
> about an "unrecognized wrap-around mode". It won't let me copy the
> output as the window disappears just after klicking OK (and before you
> can't select text) but I felt this would happen and made a screenshot:
> http://www.joachimschneider.info/enblend-error/enblend-error.gif
> The first of the numbers mentioned is the width.
> Any idea?
> Btw, there are droplets supplied only for enfuse but not enblend. Is it
> because those for enblend didn't need any changes and I can continue to
> use them?
> > Probably because enblend was called with "-w -f3000x1500".
> yes, there are some changes in how enblend / enfuse is called in 4.0
> we will need to adapt this in Hugin - ideally with an enblend --version
> to find out if we're using a "legacy" enblend-enfuse; and with the
> Makefile adapted to the enblend-enfuse "generation" found on the machine.
I think it could have been done in a more backward-compatible way in
enblend.
Anyway, using the OpenMP, SSE2 version (on a 20000x10000 pano) from
the command line, it quickly bails with just:
enblend: out of memory
enblend: bad allocation
I guess this is because it was compiled without image cache :/
Yay for 32-bit OSes!
And by the testing I've done, the next fastest is v3.2 :(
Is there a reason it wasn't compiled with image cache and OpenMP?
J. Schneider wrote:
> Sorry, I was too fast posting; I recognized the cause was changed
> options syntax for enblend. I have to review my enblend options in
> hugin. But this probably means we need new droplets for enblend, too.
> -----Original Message-----
> From: hugin-ptx@googlegroups.com [mailto:hugin-ptx@googlegroups.com] On Behalf Of Zoran Zorkic
> Sent: Monday, September 07, 2009 4:25 PM
> To: hugin and other free panoramic software
> Subject: [hugin-ptx] Re: Enblend/Enfuse 4.0 and Hugin
> <SNIP>
> Is there a reason it wasn't compiled with image cache and OpenMP?
If I recall correctly, the cached file representation used by enblend has not been made re-entrant/multi-threaded aware yet, so it causes some issues.
I'm not sure if any work has been done towards supporting it, but I suspect it's not exactly an 'easy' task either, tracking 'hot' and 'cold' image segments across multiple threads efficiently.
On Sep 7, 10:37 pm, "Ryan Sleevi" <ryan+hu...@sleevi.com> wrote:
> > Is there a reason it wasn't compiled with image cache and OpenMP?
> If I recall correctly, the cached file representation used by enblend has
> not been made re-entrant/multi-threaded aware yet, so it causes some issues.
> I'm not sure if any work has been done towards supporting it, but I suspect
> it's not exactly an 'easy' task either, tracking 'hot' and 'cold' image
> segments across multiple threads efficiently.
So not anytime soon? :(
Any workarounds? BTW, shouldn't the OS do the swapping for Enblend? Or
does it, and the problem is that I have a 32-bit one?
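A rough back-of-the-envelope check on the memory question, as a sketch only. OS swapping extends physical memory, but it does not enlarge the address space of a 32-bit process, which on 32-bit Windows is 2 GiB of user space by default. The channel count (RGBA) and sample depths below are assumptions, and the figures cover just one uncompressed copy of the 20000x10000 output; whatever else enblend allocates while blending is not modelled here.

```python
GIB = 1024 ** 3
# Default user-mode address space of a 32-bit process on 32-bit Windows.
USER_ADDRESS_SPACE_32BIT = 2 * GIB


def raw_image_bytes(width, height, channels=4, bytes_per_sample=1):
    """Size of one uncompressed image buffer in bytes (assumed RGBA layout)."""
    return width * height * channels * bytes_per_sample


if __name__ == "__main__":
    for bytes_per_sample in (1, 2, 4):
        size = raw_image_bytes(20000, 10000, bytes_per_sample=bytes_per_sample)
        share = size / USER_ADDRESS_SPACE_32BIT
        print(
            f"{bytes_per_sample * 8}-bit samples: {size / GIB:.2f} GiB, "
            f"{share:.0%} of the 32-bit user address space"
        )
```

Under these assumptions a single 8-bit copy fits easily, a 16-bit copy already takes most of the address space, and a 32-bit floating-point copy cannot fit at all, however much swap is available.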
Just trying a self-compiled v4.0, stitching a pretty big project now.
Inspecting the enblend run with top, I see that enblend does sometimes use more than 100% CPU, but never more than about 200%; usually it is around 100%, with an occasional peak using somewhat more. As I have a quad-core and I see that nona uses it fully, I'm a little disappointed with enblend. Was this a misconfiguration?
enblend verbose version info tells me:
enblend 4.0-b93c2aed500d
Extra feature: dmalloc support: no
Extra feature: image cache: no
Extra feature: GPU acceleration: yes
Extra feature: OpenMP: yes - version 2005-5
- support for nested parallelism
- support for dynamic adjustment of the number of threads
- using 4 processors and up to 3 threads
Is this the up-to-date version or did I compile something wrong?
Tried using the GPU, SSE2 and regular releases; they all die with:
enblend: out of memory
enblend: bad allocation
at some point. Is that expected or normal on win32?
They did accept -m parameter with no complaint.
OpenMP and GPU die after blending 5 photos, regular after 22 (of 44).
Is there a way to check why?
On Sep 8, 1:33 am, Zoran Zorkic <zo...@gmx.net> wrote:
> Tried using GPU, SSE2 and the regular release, they all die with :
> enblend: out of memory
> enblend: bad allocation
> at some point. Is that expected or normal on win32?
Could you please give us the output of
enblend --version --verbose
as I suspect your version runs w/o image-cache &&
your images are large && your memory is scarce
(wrt the images' sizes).
> They did accept -m parameter with no complaint.
If you actually use a variant w/o image cache, you get
a warning for every option that is related to the image cache,
like
enblend: warning: option "-m" has no effect in this version of enblend,
enblend: warning: because it was compiled without image cache
However, processing continues; it is just a warning.
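To check for the image cache programmatically, the verbose version output can be parsed. The sketch below assumes the "Extra feature: NAME: yes/no" line format shown in the 4.0 output quoted in this thread; the function names are made up for illustration, and the older 3.2 output (which lists features without yes/no) is not handled.

```python
import re

# Matches lines such as "Extra feature: image cache: no".
FEATURE_LINE = re.compile(
    r"^\s*Extra feature:\s*(?P<name>[^:]+):\s*(?P<state>yes|no)\b"
)


def parse_features(verbose_output):
    """Map each feature name to True (yes) or False (no)."""
    features = {}
    for line in verbose_output.splitlines():
        match = FEATURE_LINE.match(line)
        if match:
            features[match.group("name").strip()] = match.group("state") == "yes"
    return features


def has_image_cache(verbose_output):
    """True only if the output explicitly reports 'image cache: yes'."""
    return parse_features(verbose_output).get("image cache", False)
```

Feeding it the text printed by enblend --version --verbose tells you up front whether cache-related options such as -m will have any effect.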
<benjamin.schnied...@gmail.com> wrote:
> Inspecting the enblend run with top I see that indeed enblend sometimes
> uses more than 100% cpu - but never more than like 200%, usually it is
> around 100% and from time to time there is a peak using somewhat more.
First of all CPU load is a silly
measure. You don't look at your car's
(clunker's?) gas consumption and say: "5 MPG,
that must have been an awfully fast ride!" For
a user only wall-clock time matters.
> As I have a quadcore and I see it is perfectly used up by nona I'm a
> little disappointed about enblend - was this a malconfiguration?
The current implementation does _not_ scale
well beyond two processors. The reasons for
this are yet unknown. You are welcome to run
Enblend with your favorite profiler, identify
the bottleneck(s), and send me a bunch of
patches that rectify the congestion.
> enblend verbose version info tells me:
> enblend 4.0-b93c2aed500d
> Extra feature: dmalloc support: no
> Extra feature: image cache: no
> Extra feature: GPU acceleration: yes
> Extra feature: OpenMP: yes - version 2005-5
> - support for nested parallelism
> - support for dynamic adjustment of the number of threads
> - using 4 processors and up to 3 threads
> Is this the up-to date version or did I compile something wrong?
Your revision is reasonably recent.
Thanks for including the full version
information!
IMO, you did nothing wrong. My results are
similar. Let's see what I've got. The Enblend
that came with my Debian distribution is
$ enblend --version
enblend 3.2-cvs
Extra feature: image cache
Extra feature: GPU acceleration
Using it to build a medium-sized pano (62MPixel,
see details in the tiffinfo(1) output below) gives
Extra feature: dmalloc support: no
Extra feature: image cache: no
Extra feature: GPU acceleration: yes
Extra feature: OpenMP: yes - version 2005-5
- support for nested parallelism
- support for dynamic adjustment of the
number of threads
- using 4 processors and up to 4 threads
and (for the same panorama as above) shows the
following performance for different numbers of
threads
For 2 threads, we gain some 10% in performance
for 24% more CPU load; for 3 threads we gain
15% for 40% more CPU load. 4 threads obviously
suffer from stalls, contention, etc.
> First of all CPU load is a silly
> measure.[...] For
> a user only wall-clock time matters.
Sure. I have not measured times yet, but the new version is definitely faster than the old one. Of course, 3 busy-waiting CPUs and only one doing something useful will stress a 4-core CPU the same as 4 useful threads: the same in 'top', but a different wall-clock time.
I know Amdahl's law (good old parallel programming lectures), but I thought enblend's work was somewhat simpler to split up than it seems.
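For what it's worth, Amdahl's law can be inverted to estimate the parallel fraction from the figures quoted earlier in the thread (roughly a 10% gain with 2 threads and 15% with 3). This is only a sketch: it assumes a "10% gain in performance" means a speedup factor of 1.10, which is one possible reading, and it ignores contention and other overheads that Amdahl's law does not model.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Amdahl's law: S = 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / threads)


def estimate_parallel_fraction(speedup, threads):
    """Solve Amdahl's law for p: p = (1 - 1/S) * n / (n - 1)."""
    if threads < 2:
        raise ValueError("need at least two threads to estimate p")
    return (1.0 - 1.0 / speedup) * threads / (threads - 1)


if __name__ == "__main__":
    # Assumed reading of the thread's figures: speedups of 1.10 and 1.15.
    for threads, speedup in ((2, 1.10), (3, 1.15)):
        p = estimate_parallel_fraction(speedup, threads)
        limit = 1.0 / (1.0 - p)
        print(
            f"{threads} threads, speedup {speedup:.2f}: "
            f"p = {p:.3f}, upper bound on speedup = {limit:.2f}"
        )
```

Under that reading, both measurements point to a parallel fraction of only about a fifth of the run time, which would be consistent with the observation that adding more cores helps little.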
> The current implementation does _not_ scale
> well beyond two processors. The reasons for
> this are yet unknown. You are welcome to run
> Enblend with your favorite profiler, identify
> the bottleneck(s), and send me a bunch of
> patches that rectify the congestion.
Can you recommend a (free) multi-thread profiler for Linux? (I have been looking for one for some time now...) If so, I'll have a look at it; enough of exploiting hugin/panotools now, time to give something back ;)
> IMO, you did nothing wrong. My results are
> similar.
Fine :) I will check later how much faster enblend has become. I do miss the output showing which image is being processed (or has finished being processed) a little...
<benjamin.schnied...@gmail.com> wrote:
> Can you propose a (free) multi-thread profiler for linux? (Looking out
> for one for some time now...) If so, I'll have a look at it, enough of
> exploiting hugin/panotools now, time to give something back ;)
because, sadly, I found out that my g++-4.3.2
does not produce reliable results when profiling
OpenMP code. As I consider optimizing without
profiling an entirely meaningless business, I
had to look for a profiler, too.
You might find that TAU-2.18.2p2 is not very C++
friendly, though. I have submitted some patches
to the TAU developers, which are already in
their CVS, but are not available to the general
public. Give me a buzz on private e-mail, if
you need more info.
BTW, TAU is tremendously more fun when operated
on top of a PerfCtr-enhanced kernel. :-(=)
> > > Probably because enblend was called with "-w -f3000x1500".
> > yes, there are some changes in how enblend / enfuse is called in 4.0
> > we will need to adapt this in Hugin - ideally with an enblend --version
> > to find out if we're using a "legacy" enblend-enfuse; and with the
> > Makefile adapted to the enblend-enfuse "generation" found on the machine.
> I think it could have been done more backward compatible way in enblend.
Yes, of course it should be backward compatible; actually I cannot
see why there is a new command. Why not make Enblend work
properly with the current command in both cases? I cannot see any
reason for having 2 commands.
As far as I know, the -w command is the only command built into any
stitcher: PTGui, PTMac or Hugin.
It has to be backward compatible or you lose 90% of the users.
Zoran Zorkic wrote:
> Probably because enblend was called with "-w -f3000x1500".
Probably. The problem is that enblend doesn't correctly parse the command line. It expects a MODE setting after -w or --wrap (which are synonymous). This is somewhat contrary to the documentation, which says: "Specifying --wrap without MODE selects horizontal wrapping".
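To illustrate the documented behaviour (an option whose MODE argument is optional), here is a small sketch using Python's argparse. It is not enblend's parser, which is written in C++. The mode names, the "none" default, and the program name are assumptions made for the example; only "-w"/"--wrap", "-f" and the horizontal fallback come from this thread.

```python
import argparse

# Assumed set of mode names, for illustration only.
WRAP_MODES = ("none", "horizontal", "vertical", "both")


def build_parser():
    parser = argparse.ArgumentParser(prog="blend-sketch")
    parser.add_argument(
        "-w", "--wrap",
        nargs="?",             # MODE may be omitted
        choices=WRAP_MODES,
        const="horizontal",    # value used when -w is given without MODE
        default="none",        # value used when -w is absent (assumption)
        metavar="MODE",
    )
    parser.add_argument("-f", dest="size", metavar="WIDTHxHEIGHT")
    return parser


def parse(argv):
    """Parse an argument list and return the resulting namespace."""
    return build_parser().parse_args(argv)
```

With this setup the legacy call parse(["-w", "-f3000x1500"]) yields horizontal wrapping, because the token after -w is recognised as another option and not consumed as MODE, while parse(["--wrap=vertical"]) selects a mode explicitly.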
I suspect this will be corrected in the final version.