I just tested the latest SVN on my workstation with the following GPU
(see the diagnostic commands and output below) and it fails. Is this my GPU,
or do we have a bug in the code? This is on 64-bit Ubuntu, and I've even
tried with 9.10 (alpha 2) because it has a slightly newer GLEW.
$ lspci | grep VGA
00:05.0 VGA compatible controller: nVidia Corporation C51PV [GeForce
6150] (rev a2)
$ glxinfo | grep OpenGL
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: GeForce 6150/PCI/SSE2
OpenGL version string: 2.1.2 NVIDIA 185.18.14
OpenGL shading language version string: 1.20 NVIDIA via Cg compiler
OpenGL extensions:
$ glewinfo | grep GL_ARB
GL_ARB_color_buffer_float: OK
GL_ARB_depth_buffer_float: MISSING
GL_ARB_depth_texture: OK
GL_ARB_draw_buffers: OK
GL_ARB_draw_instanced: OK [MISSING]
GL_ARB_fragment_program: OK
GL_ARB_fragment_program_shadow: OK
GL_ARB_fragment_shader: OK
GL_ARB_framebuffer_object: MISSING [OK]
GL_ARB_framebuffer_sRGB: MISSING
GL_ARB_geometry_shader4: OK [MISSING]
GL_ARB_half_float_pixel: OK
GL_ARB_half_float_vertex: OK
GL_ARB_imaging: OK
GL_ARB_instanced_arrays: MISSING
GL_ARB_map_buffer_range: OK
GL_ARB_matrix_palette: MISSING
GL_ARB_multisample: OK
GL_ARB_multitexture: OK
GL_ARB_occlusion_query: OK
GL_ARB_pixel_buffer_object: OK
GL_ARB_point_parameters: OK
GL_ARB_point_sprite: OK
GL_ARB_shader_objects: OK
GL_ARB_shading_language_100: OK
GL_ARB_shadow: OK
GL_ARB_shadow_ambient: MISSING
GL_ARB_texture_border_clamp: OK
GL_ARB_texture_buffer_object: OK [MISSING]
GL_ARB_texture_compression: OK
GL_ARB_texture_compression_rgtc: MISSING
GL_ARB_texture_cube_map: OK
GL_ARB_texture_env_add: OK
GL_ARB_texture_env_combine: OK
GL_ARB_texture_env_crossbar: MISSING
GL_ARB_texture_env_dot3: OK
GL_ARB_texture_float: OK
GL_ARB_texture_mirrored_repeat: OK
GL_ARB_texture_non_power_of_two: OK
GL_ARB_texture_rectangle: OK
GL_ARB_texture_rg: MISSING
GL_ARB_transpose_matrix: OK
GL_ARB_vertex_array_object: OK
GL_ARB_vertex_blend: MISSING
GL_ARB_vertex_buffer_object: OK
GL_ARB_vertex_program: OK
GL_ARB_vertex_shader: OK
GL_ARB_window_pos: OK
And this is what happens when I run nona with -g:
$ nona -g -o testgpu.tif _MG_8768-_MG_8873.pto
nona: using graphics card: NVIDIA Corporation GeForce 6150/PCI/SSE2
destStart=[1900, 2303]
destEnd=[2804, 5700]
destSize=[(904, 3397)]
srcSize=[(3480, 2314)]
srcBuffer=0x7f4fd4934010
srcAlphaBuffer=0x7f4fd35cd010
destBuffer=0x7f4fd406a010
destAlphaBuffer=0x7f4fd3d7c010
destGLInternalFormat=GL_RGBA8
destGLFormat=GL_RGB
destGLType=GL_UNSIGNED_BYTE
srcGLInternalFormat=GL_RGBA8
srcGLFormat=GL_RGB
srcGLType=GL_UNSIGNED_BYTE
srcAlphaGLType=GL_UNSIGNED_BYTE
destAlphaGLType=GL_UNSIGNED_BYTE
warparound=0
needsAtanWorkaround=0
maxTextureSize=4096
Source chunks:
[(0, 0) to (3480, 2314) = (3480x2314)]
Dest chunks:
[(0, 0) to (904, 1699) = (904x1699)]
[(0, 1699) to (904, 3397) = (904x1698)]
Total GPU memory used: 132001184
Interpolator chunks:
[(0, 0) to (4, 4) = (4x4)]
[(4, 0) to (8, 4) = (4x4)]
[(8, 0) to (12, 4) = (4x4)]
[(12, 0) to (16, 4) = (4x4)]
[(16, 0) to (20, 4) = (4x4)]
[(20, 0) to (24, 4) = (4x4)]
[(24, 0) to (28, 4) = (4x4)]
[(28, 0) to (32, 4) = (4x4)]
[(0, 4) to (4, 8) = (4x4)]
[(4, 4) to (8, 8) = (4x4)]
[(8, 4) to (12, 8) = (4x4)]
[(12, 4) to (16, 8) = (4x4)]
[(16, 4) to (20, 8) = (4x4)]
[(20, 4) to (24, 8) = (4x4)]
[(24, 4) to (28, 8) = (4x4)]
[(28, 4) to (32, 8) = (4x4)]
[(0, 8) to (4, 12) = (4x4)]
[(4, 8) to (8, 12) = (4x4)]
[(8, 8) to (12, 12) = (4x4)]
[(12, 8) to (16, 12) = (4x4)]
[(16, 8) to (20, 12) = (4x4)]
[(20, 8) to (24, 12) = (4x4)]
[(24, 8) to (28, 12) = (4x4)]
[(28, 8) to (32, 12) = (4x4)]
[(0, 12) to (4, 16) = (4x4)]
[(4, 12) to (8, 16) = (4x4)]
[(8, 12) to (12, 16) = (4x4)]
[(12, 12) to (16, 16) = (4x4)]
[(16, 12) to (20, 16) = (4x4)]
[(20, 12) to (24, 16) = (4x4)]
[(24, 12) to (28, 16) = (4x4)]
[(28, 12) to (32, 16) = (4x4)]
[(0, 16) to (4, 20) = (4x4)]
[(4, 16) to (8, 20) = (4x4)]
[(8, 16) to (12, 20) = (4x4)]
[(12, 16) to (16, 20) = (4x4)]
[(16, 16) to (20, 20) = (4x4)]
[(20, 16) to (24, 20) = (4x4)]
[(24, 16) to (28, 20) = (4x4)]
[(28, 16) to (32, 20) = (4x4)]
[(0, 20) to (4, 24) = (4x4)]
[(4, 20) to (8, 24) = (4x4)]
[(8, 20) to (12, 24) = (4x4)]
[(12, 20) to (16, 24) = (4x4)]
[(16, 20) to (20, 24) = (4x4)]
[(20, 20) to (24, 24) = (4x4)]
[(24, 20) to (28, 24) = (4x4)]
[(28, 20) to (32, 24) = (4x4)]
[(0, 24) to (4, 28) = (4x4)]
[(4, 24) to (8, 28) = (4x4)]
[(8, 24) to (12, 28) = (4x4)]
[(12, 24) to (16, 28) = (4x4)]
[(16, 24) to (20, 28) = (4x4)]
[(20, 24) to (24, 28) = (4x4)]
[(24, 24) to (28, 28) = (4x4)]
[(28, 24) to (32, 28) = (4x4)]
[(0, 28) to (4, 32) = (4x4)]
[(4, 28) to (8, 32) = (4x4)]
[(8, 28) to (12, 32) = (4x4)]
[(12, 28) to (16, 32) = (4x4)]
[(16, 28) to (20, 32) = (4x4)]
[(20, 28) to (24, 32) = (4x4)]
[(24, 28) to (28, 32) = (4x4)]
[(28, 28) to (32, 32) = (4x4)]
#version 110
#extension GL_ARB_texture_rectangle : enable
uniform sampler2DRect SrcTexture;
float sinh(const in float x) { return (exp(x) - exp(-x)) / 2.0; }
float cosh(const in float x) { return (exp(x) + exp(-x)) / 2.0; }
float atan2_xge0(const in float y, const in float x) {
return atan(y, x);
}
float atan2_safe(const in float y, const in float x) {
return atan(y, x);
}
float atan_safe(const in float yx) {
return atan(yx);
}
void main(void)
{
float discardA = 1.0;
float discardB = 0.0;
vec2 src = gl_TexCoord[0].st;
src -= vec2(9012.0000000000000000, 4506.0000000000000000);
// rotate_erect(27036.000000000000000, 7570.1936456957173505)
{
src.s += 7570.1936456957173505;
float w = (abs(src.s) > 27036.000000000000000) ? 1.0 : 0.0;
float n = (src.s < 0.0) ? 0.5 : -0.5;
src.s += w * -54072.000000000000000 * ceil(src.s /
54072.000000000000000 + n);
}
// sphere_tp_erect(8605.8260828649654286)
{
float phi = src.s / 8605.8260828649654286;
float theta = -src.t / 8605.8260828649654286 +
1.5707963267948965580;
if (theta < 0.0) {
theta = -theta;
phi += 3.1415926535897931160;
}
if (theta > 3.1415926535897931160) {
theta = 3.1415926535897931160 - (theta -
3.1415926535897931160);
phi += 3.1415926535897931160;
}
float s = sin(theta);
vec2 v = vec2(s * sin(phi), cos(theta));
float r = length(v);
theta = 8605.8260828649654286 * atan2_safe(r, s * cos(phi));
src = v * (theta / r);
}
// persp_sphere(8605.8260828649654286)
{
mat3 m = mat3(-0.021348969898465505746,
-0.99861950017682066250, -0.047992867708350636646,
0.99977208476946100024, -0.021324357795218604888,
-0.0010248318628368971190,
0.0000000000000000000, -0.048003808507433361197,
0.99884715265589141264);
float r = length(src);
float theta = r / 8605.8260828649654286;
float s = 0.0;
if (r != 0.0) s = sin(theta) / r;
vec3 v = vec3(s * src.s, s * src.t, cos(theta));
vec3 u = v * m;
r = length(u.st);
theta = 0.0;
if (r != 0.0) theta = 8605.8260828649654286 * atan2_safe(r,
u.p) / r;
src = theta * u.st;
}
// rect_sphere_tp(8605.8260828649654286)
{
float r = length(src);
float theta = r / 8605.8260828649654286;
float rho = 0.0;
if (theta >= 1.5707963267948965580) rho = 1.6e16;
else if (theta == 0.0) rho = 1.0;
else rho = tan(theta) / theta;
src *= rho;
}
// resize(0.99565793672523394964, 0.99565793672523394964)
src *= vec2(0.99565793672523394964, 0.99565793672523394964);
// radial(1.0043634782470169942, 0.0000000000000000000, -0.0043634782470170496715, 0.0000000000000000000, 1157.0000000000000000, 8.7592802374617306782)
{
float r = length(src) / 1157.0000000000000000;
float scale = 1000.0;
if (r < 8.7592802374617306782) {
scale = ((0.0000000000000000000 * r +
-0.0043634782470170496715) * r + 0.0000000000000000000) * r +
1.0043634782470169942;
}
src *= scale;
}
src += vec2(1736.5000000000000000, 1156.5000000000000000);
src = src * discardA + vec2(-1000.0, -1000.0) * discardB;
gl_FragColor = vec4(src.s, 0.0, 0.0, src.t);
}
#version 110
#extension GL_ARB_texture_rectangle : enable
uniform sampler2DRect CoordTexture;
uniform sampler2DRect SrcTexture;
uniform sampler2DRect AccumTexture;
uniform vec2 SrcUL;
uniform vec2 SrcLR;
uniform vec2 KernelUL;
uniform vec2 KernelWH;
float w(const in float i, const in float f) {
float c = (i < 16.000000000000000000) ? 1.0 : -1.0;
float x = c * (15.000000000000000000 - i + f);
vec2 xpi = vec2(x, x / 16.000000000000000000) * 3.1415926535897931160;
vec2 xsin = sin(xpi);
vec2 result = vec2(1.0, 1.0);
if (xpi.x != 0.0) result.x = xsin.x / xpi.x;
if (xpi.y != 0.0) result.y = xsin.y / xpi.y;
return result.x * result.y;
}
void main(void)
{
vec2 src = texture2DRect(CoordTexture, gl_TexCoord[0].st).sq;
vec4 accum = texture2DRect(AccumTexture, gl_TexCoord[0].st);
src -= SrcUL;
vec2 t = floor(src) + -14.500000000000000000;
vec2 f = fract(src);
vec2 k = vec2(0.0, 0.0);
for (float ky = 0.0; ky < 4.0000000000000000000; ky += 1.0) {
k.t = ky + KernelUL.t;
float wy = w(k.t, f.t);
for (float kx = 0.0; kx < 4.0000000000000000000; kx += 1.0) {
k.s = kx + KernelUL.s;
float wx = w(k.s, f.s);
vec2 ix = t + k;
vec4 sp = texture2DRect(SrcTexture, ix);
float weight = wx * wy * sp.a;
accum += sp * weight;
}
}
gl_FragColor = accum;
}
#version 110
#extension GL_ARB_texture_rectangle : enable
uniform sampler2DRect NormTexture;
uniform sampler2DRect CoordTexture;
uniform sampler2DRect InvLutTexture;
uniform sampler2DRect DestLutTexture;
void main(void)
{
// Normalization
vec4 n = texture2DRect(NormTexture, gl_TexCoord[0].st);
vec4 p = vec4(0.0, 0.0, 0.0, 0.0);
if (n.a >= 0.2) p = n / n.a;
// Photometric
// invLutSize = 256.00000000000000000
// pixelMax = 255.00000000000000000
// destLutSize = 1024.0000000000000000
// destExposure = 3.4434159776346751091e-05
// srcExposure = 3.4435263503927971915e-05
// whiteBalanceRed = 1.0000000000000000000
// whiteBalanceBlue = 1.0000000000000000000
p.rgb = p.rgb * 255.00000000000000000;
vec2 invR = texture2DRect(InvLutTexture, vec2(p.r, 0.0)).sq;
vec2 invG = texture2DRect(InvLutTexture, vec2(p.g, 0.0)).sq;
vec2 invB = texture2DRect(InvLutTexture, vec2(p.b, 0.0)).sq;
vec3 invX = vec3(invR.x, invG.x, invB.x);
vec3 invY = vec3(invR.y, invG.y, invB.y);
vec3 invA = fract(p.rgb);
p.rgb = mix(invX, invY, invA);
// VigCorrMode=VIGCORR_RADIAL
float vig = 1.0;
{
vec2 vigCorrCenter = vec2(1736.5000000000000000,
1156.5000000000000000);
float radiusScale=0.00047914298856256606332;
float radialVigCorrCoeff[4] = float[4](1.0000000000000000000,
0.0000000000000000000, 0.0000000000000000000, 0.0000000000000000000);
vec2 src = texture2DRect(CoordTexture, gl_TexCoord[0].st).sq;
vec2 d = src - vigCorrCenter;
d *= radiusScale;
vig = radialVigCorrCoeff[0];
float r2 = dot(d, d);
float r = r2;
vig += radialVigCorrCoeff[1] * r;
r *= r2;
vig += radialVigCorrCoeff[2] * r;
r *= r2;
vig += radialVigCorrCoeff[3] * r;
}
vec3 exposure_whitebalance = vec3(0.99996794775271302669,
0.99996794775271302669, 0.99996794775271302669);
p.rgb = (p.rgb * exposure_whitebalance) / vig;
p.rgb = p.rgb * 1023.0000000000000000;
vec2 destR = texture2DRect(DestLutTexture, vec2(p.r, 0.0)).sq;
vec2 destG = texture2DRect(DestLutTexture, vec2(p.g, 0.0)).sq;
vec2 destB = texture2DRect(DestLutTexture, vec2(p.b, 0.0)).sq;
vec3 destX = vec3(destR.x, destG.x, destB.x);
vec3 destY = vec3(destR.y, destG.y, destB.y);
vec3 destA = fract(p.rgb);
p.rgb = mix(destX, destY, destA);
gl_FragColor = p;
}
nona: normalization/photometric shader program could not be compiled.
nona: GL info log:
0(35) : error C7551: OpenGL first class arrays require #version 120
0(35) : error C7553: OpenGL array assignments require #version 120
any hints / help?
Yuv
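The info log points at line 35 of the third (normalization/photometric) shader, which is the `float radialVigCorrCoeff[4] = float[4](...)` initializer. Array constructors of that form were introduced in GLSL 1.20, while the shader declares `#version 110`. As a rough illustration, the Python sketch below flags that mismatch in a shader source string before it reaches the driver. The names `declared_version` and `needs_glsl_120` and the simplified regular expression are assumptions made for this example; nothing here is taken from nona.

```python
import re

# Matches an array constructor such as "float[4](" or "vec2[2](".
# Deliberately simplified: it only knows a handful of built-in type names.
ARRAY_CTOR = re.compile(
    r"\b(?:float|int|bool|[bi]?vec[234]|mat[234])\s*\[\s*\d*\s*\]\s*\("
)
VERSION = re.compile(r"^\s*#version\s+(\d+)", re.MULTILINE)


def declared_version(source):
    """Return the number from the #version directive (110 if there is none)."""
    match = VERSION.search(source)
    # A shader without a #version directive is treated as GLSL 1.10.
    return int(match.group(1)) if match else 110


def needs_glsl_120(source):
    """True if the source uses an array constructor but declares less than 120."""
    return declared_version(source) < 120 and ARRAY_CTOR.search(source) is not None
```

Of the three shaders posted above, only the third one would be flagged by this check. Ordinary indexing such as `gl_TexCoord[0].st` is not matched, because the pattern requires a type name directly before the brackets.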
thanks for testing, Zoran.
> nvidia 9800gt 512mb.
> Gpu drivers: 178.24.
We'll need a lot of reports like this. I was not sure whether it was a
problem with my specific video card or a general one.
I fixed this (SVN 4169). We need to provide a nona GPU binary for testing
that revision (and, I guess, a few later ones) on as broad a set of GPUs
as possible.
My GPU now compiles this part of the code but fails later with:
$ nona -g -o test.tif _MG_8768-_MG_8873.pto
nona: using graphics card: NVIDIA Corporation GeForce 6150/PCI/SSE2
<removed much of the verbosity>
gpu shader program compile time = 0.026
nona: GL error: Framebuffer incomplete, incomplete attachment in:
/home/yuv/src/hugin/src/hugin_base/vigra_ext/ImageTransformsGPU.cpp:700
we need more tests from more video cards to determine if it is a bug in
the code or if it is the specifics of my video card or even of my system
/ driver.
Yuv
Hi,
I'm getting exactly the same error. Graphics card is Asus nVidia 6200
LE and drivers are 190.16
Lukáš
Actually, this is exactly the same error I get after I thought I had fixed
the bug in Rev. 4169.
With the same hardware and the same revision, though, I get a different
error on Windows, most likely because the driver there does not
support v1.20 of the shading language:
nona: normalization/photometric shader program could not be compiled.
nona: GL info log:
(1) : error C0201: unsupported version 120
(3) : warning C7508: extension GL_ARB_texture_rectangle not supported
(3) : warning C7506: OpenGL does not define the global type sampler2DRect
(10) : warning C7506: OpenGL does not define the global function texture2DRect
(35) : error C0000: syntax error, unexpected '[', expecting '(' at token "["
(35) : error C0501: type name expected at token "["
(35) : error C1068: too much data in type constructor
(35) : error C1033: cast not allowed
(35) : error C1056: invalid initialization
Yuv
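Whether a driver accepts `#version 120` can be checked up front from the shading language version string, the same value that `glxinfo` prints as "OpenGL shading language version string". The sketch below converts such a string into the number used by the `#version` directive. The function names are made up for this example, and the parsing assumes the string starts with `major.minor`, as in the `glxinfo` output earlier in this thread.

```python
import re


def glsl_version_number(version_string):
    """Convert e.g. '1.20 NVIDIA via Cg compiler' to 120."""
    match = re.match(r"\s*(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version_string)
    major = int(match.group(1))
    # Pad a one-digit minor ("1.2") to two digits so it reads as 1.20.
    minor = int(match.group(2).ljust(2, "0")[:2])
    return major * 100 + minor


def supports(version_string, required):
    """True if the reported GLSL version is at least `required` (e.g. 120)."""
    return glsl_version_number(version_string) >= required
```

With the string reported on the Ubuntu system above, `supports("1.20 NVIDIA via Cg compiler", 120)` is true; a driver that reports a lower version would fail this check before any shader is compiled.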
Zoran Zorkic wrote:
> Got a friend to test on an ATI card.
thanks for this.
<cut a long chunk of nona output>
> nona: GL info log:
> Fragment shader was successfully compiled to run on hardware.
>
> nona: GL info log:
> Fragment shader(s) linked, no vertex shader(s) defined.
I'm not sure what to make of this. Is there an image at the end of the process?
And I guess this is still with the first nona-gpu binary by Guido?
I've just built nona-gpu Rev. 4169 on Windows (yes, my tool chain on
Windows is up and running again, kind of).
It's available for download at <http://www.photopla.net/hugin/nona_4169.7z>
Looking forward to further test reports.
Yuv
if it was that easy...
there are so many variations out there of nVidia and ATI cards; and of
drivers; and there are plenty of factors that could influence success or
failure of GPU stitching with nona.
so we need to test broadly (look at my test - same hardware, two
different systems, two different errors) and we need to collect and
document results so that when we make this broadly available we can give
as precise as possible guidance as to what hardware/driver/system
combination will work.
Yuv
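One possible way to keep such results comparable is to record every report with the same handful of fields. The sketch below is only a suggestion: the `GpuTestReport` class, its field names, and `summary_table` are invented for this example and do not correspond to any existing Hugin tool.

```python
from dataclasses import dataclass


@dataclass
class GpuTestReport:
    system: str    # operating system, e.g. "Ubuntu 64-bit"
    card: str      # renderer as printed by nona or glxinfo
    driver: str    # driver version
    revision: int  # SVN revision of the nona build
    outcome: str   # short result, e.g. "ok" or the first error line


def summary_table(reports):
    """Render the reports as a plain-text table with aligned columns."""
    header = ("System", "Card", "Driver", "SVN", "Outcome")
    rows = [header] + [
        (r.system, r.card, r.driver, str(r.revision), r.outcome) for r in reports
    ]
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


if __name__ == "__main__":
    # The one data point fully described so far in this thread.
    print(summary_table([
        GpuTestReport("Ubuntu 64-bit", "GeForce 6150", "185.18.14", 4169,
                      "framebuffer incomplete"),
    ]))
```

A plain-text table like this pastes cleanly into a mail or a wiki page, which is why the sketch avoids any richer output format.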
I'll see what other video cards I can scrape up and see how they handle.
Is there some particular set of test data to use? Just threw some simple
stuff at it, nothing complex, but it all turned out fine.
Test System: Vista x64 (SP2)
Video Card: GeForce 8800 GTS (256 mb)
Video BIOS: 60.80.0D.00.01
Video Driver: 186.18
RAM: 8GB
Proc: C2D 6600 @ 2.40 GHz
SVN: 4169
Summary: No problems - Image was adjusted as expected
Log:
nona: using graphics card: NVIDIA Corporation GeForce 8800 GTS/PCI/SSE2
destStart=[17, 50]
destEnd=[3001, 2600]
destSize=[(2984, 2550)]
srcSize=[(4296, 2856)]
srcBuffer=0000000005DB0040
srcAlphaBuffer=0000000009DF0040
destBuffer=00000000080D0040
destAlphaBuffer=00000000096A0040
destGLInternalFormat=GL_RGBA8
destGLFormat=GL_RGB
destGLType=GL_UNSIGNED_BYTE
srcGLInternalFormat=GL_RGBA8
srcGLFormat=GL_RGB
srcGLType=GL_UNSIGNED_BYTE
srcAlphaGLType=GL_UNSIGNED_BYTE
destAlphaGLType=GL_UNSIGNED_BYTE
warparound=0
needsAtanWorkaround=0
maxTextureSize=8192
Source chunks:
[(0, 0) to (4296, 2856) = (4296x2856)]
Dest chunks:
[(0, 0) to (1492, 1275) = (1492x1275)]
[(1492, 0) to (2984, 1275) = (1492x1275)]
[(0, 1275) to (1492, 2550) = (1492x1275)]
[(1492, 1275) to (2984, 2550) = (1492x1275)]
Total GPU memory used: 181856208
Interpolator chunks:
[(0, 0) to (4, 4) = (4x4)]
#version 110
#extension GL_ARB_texture_rectangle : enable
uniform sampler2DRect SrcTexture;
float sinh(const in float x) { return (exp(x) - exp(-x)) / 2.0; }
float cosh(const in float x) { return (exp(x) + exp(-x)) / 2.0; }
float atan2_xge0(const in float y, const in float x) {
return atan(y, x);
}
float atan2_safe(const in float y, const in float x) {
return atan(y, x);
}
float atan_safe(const in float yx) {
return atan(yx);
}
void main(void)
{
float discardA = 1.0;
float discardB = 0.0;
vec2 src = gl_TexCoord[0].st;
src -= vec2(1500.0000000000000000, 1300.0000000000000000);
// rotate_erect(18000.000000000000000, -13.341048242108400000)
{
src.s += -13.341048242108400000;
float w = (abs(src.s) > 18000.000000000000000) ? 1.0 : 0.0;
float n = (src.s < 0.0) ? 0.5 : -0.5;
src.s += w * -36000.000000000000000 * ceil(src.s /
36000.000000000000000 + n);
}
// sphere_tp_erect(5729.5779513082325000)
{
float phi = src.s / 5729.5779513082325000;
float theta = -src.t / 5729.5779513082325000 +
1.5707963267948966000;
if (theta < 0.0) {
theta = -theta;
phi += 3.1415926535897931000;
}
if (theta > 3.1415926535897931000) {
theta = 3.1415926535897931000 - (theta - 3.1415926535897931000);
phi += 3.1415926535897931000;
}
float s = sin(theta);
vec2 v = vec2(s * sin(phi), cos(theta));
float r = length(v);
theta = 5729.5779513082325000 * atan2_safe(r, s * cos(phi));
src = v * (theta / r);
}
// persp_sphere(5729.5779513082325000)
{
mat3 m = mat3(0.86726875127917369000, 0.49776314112882103000,
-0.0087617571429537341000,
-0.49784024852824299000, 0.86713442538204355000,
-0.015263527203445419000,
0.00000000000000000000, 0.017599535531440003000,
0.99984511618003991000);
float r = length(src);
float theta = r / 5729.5779513082325000;
float s = 0.0;
if (r != 0.0) s = sin(theta) / r;
vec3 v = vec3(s * src.s, s * src.t, cos(theta));
vec3 u = v * m;
r = length(u.st);
theta = 0.0;
if (r != 0.0) theta = 5729.5779513082325000 * atan2_safe(r, u.p) /
r;
src = theta * u.st;
}
// rect_sphere_tp(5729.5779513082325000)
{
float r = length(src);
float theta = r / 5729.5779513082325000;
float rho = 0.0;
if (theta >= 1.5707963267948966000) rho = 1.6e16;
else if (theta == 0.0) rho = 1.0;
else rho = tan(theta) / theta;
src *= rho;
}
// resize(1.6728386297652815000, 1.6728386297652815000)
src *= vec2(1.6728386297652815000, 1.6728386297652815000);
src += vec2(2144.5000000000000000, 1427.5000000000000000);
src = src * discardA + vec2(-1000.0, -1000.0) * discardB;
gl_FragColor = vec4(src.s, 0.0, 0.0, src.t);
}
#version 110
#extension GL_ARB_texture_rectangle : enable
uniform sampler2DRect CoordTexture;
uniform sampler2DRect SrcTexture;
uniform sampler2DRect AccumTexture;
uniform vec2 SrcUL;
uniform vec2 SrcLR;
uniform vec2 KernelUL;
uniform vec2 KernelWH;
float w(const in float i, const in float f) {
float A = -0.75000000000000000000;
float c = abs(i - 1.0);
float m = (i > 1.0) ? -1.0 : 1.0;
float p = c + m * f;
if (i == 1.0 || i == 2.0) {
return (( A + 2.0 )*p - ( A + 3.0 ))*p*p + 1.0;
} else {
return (( A * p - 5.0 * A ) * p + 8.0 * A ) * p - 4.0 * A;
}
}
void main(void)
{
vec2 src = texture2DRect(CoordTexture, gl_TexCoord[0].st).sq;
vec4 accum = texture2DRect(AccumTexture, gl_TexCoord[0].st);
src -= SrcUL;
vec2 t = floor(src) + -0.50000000000000000000;
vec2 f = fract(src);
vec2 k = vec2(0.0, 0.0);
for (float ky = 0.0; ky < 4.0000000000000000000; ky += 1.0) {
k.t = ky + KernelUL.t;
float wy = w(k.t, f.t);
for (float kx = 0.0; kx < 4.0000000000000000000; kx += 1.0) {
k.s = kx + KernelUL.s;
float wx = w(k.s, f.s);
vec2 ix = t + k;
vec4 sp = texture2DRect(SrcTexture, ix);
float weight = wx * wy * sp.a;
accum += sp * weight;
}
}
gl_FragColor = accum;
}
#version 120
#extension GL_ARB_texture_rectangle : enable
uniform sampler2DRect NormTexture;
uniform sampler2DRect CoordTexture;
uniform sampler2DRect InvLutTexture;
uniform sampler2DRect DestLutTexture;
void main(void)
{
// Normalization
vec4 n = texture2DRect(NormTexture, gl_TexCoord[0].st);
vec4 p = vec4(0.0, 0.0, 0.0, 0.0);
if (n.a >= 0.2) p = n / n.a;
// Photometric
// invLutSize = 256.00000000000000000
// pixelMax = 255.00000000000000000
// destLutSize = 1024.0000000000000000
// destExposure = 0.95663269200564105000
// srcExposure = 0.95663269262595030000
// whiteBalanceRed = 1.0000000000000000000
// whiteBalanceBlue = 1.0000000000000000000
p.rgb = p.rgb * 255.00000000000000000;
vec2 invR = texture2DRect(InvLutTexture, vec2(p.r, 0.0)).sq;
vec2 invG = texture2DRect(InvLutTexture, vec2(p.g, 0.0)).sq;
vec2 invB = texture2DRect(InvLutTexture, vec2(p.b, 0.0)).sq;
vec3 invX = vec3(invR.x, invG.x, invB.x);
vec3 invY = vec3(invR.y, invG.y, invB.y);
vec3 invA = fract(p.rgb);
p.rgb = mix(invX, invY, invA);
// VigCorrMode=VIGCORR_RADIAL
float vig = 1.0;
{
vec2 vigCorrCenter = vec2(2144.5000000000000000,
1427.5000000000000000);
float radiusScale=0.00038806915548755783000;
float radialVigCorrCoeff[4] = float[4](1.0000000000000000000,
0.00000000000000000000, 0.00000000000000000000, 0.00000000000000000000);
vec2 src = texture2DRect(CoordTexture, gl_TexCoord[0].st).sq;
vec2 d = src - vigCorrCenter;
d *= radiusScale;
vig = radialVigCorrCoeff[0];
float r2 = dot(d, d);
float r = r2;
vig += radialVigCorrCoeff[1] * r;
r *= r2;
vig += radialVigCorrCoeff[2] * r;
r *= r2;
vig += radialVigCorrCoeff[3] * r;
}
vec3 exposure_whitebalance = vec3(0.99999999935157013000,
0.99999999935157013000, 0.99999999935157013000);
p.rgb = (p.rgb * exposure_whitebalance) / vig;
p.rgb = p.rgb * 1023.0000000000000000;
vec2 destR = texture2DRect(DestLutTexture, vec2(p.r, 0.0)).sq;
vec2 destG = texture2DRect(DestLutTexture, vec2(p.g, 0.0)).sq;
vec2 destB = texture2DRect(DestLutTexture, vec2(p.b, 0.0)).sq;
vec3 destX = vec3(destR.x, destG.x, destB.x);
vec3 destY = vec3(destR.y, destG.y, destB.y);
vec3 destA = fract(p.rgb);
p.rgb = mix(destX, destY, destA);
gl_FragColor = p;
}
gpu shader program compile time = 0.046
gpu shader texture/framebuffer setup time = 0
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] coord image render
time = 0.016
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] source chunk=[(0, 0)
to (4296, 2856) = (4296x2856)] src upload = 0.062
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] source chunk=[(0, 0)
to (4296, 2856) = (4296x2856)] src alpha upload = 0.016
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] source chunk=[(0, 0)
to (4296, 2856) = (4296x2856)] src+alpha render = 0.047
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] source chunk=[(0, 0)
to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] setup = 0.015
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] source chunk=[(0, 0)
to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] render = 0
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] normalization setup =
0
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] normalization render =
0.016
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] dest rgb disassembly
setup = 0
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] dest rgb disassembly
render = 0
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] rgb readback = 0.016
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] dest alpha disassembly
setup = 0
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] dest alpha disassembly
render = 0
gpu dest chunk=[(0, 0) to (1492, 1275) = (1492x1275)] alpha readback = 0
gpu dest chunk=[(1492, 0) to (2984, 1275) = (1492x1275)] coord image render
time = 0
gpu dest chunk=[(1492, 0) to (2984, 1275) = (1492x1275)] source chunk=[(0,
0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] setup = 0
gpu dest chunk=[(1492, 0) to (2984, 1275) = (1492x1275)] source chunk=[(0,
0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] render = 0.015
gpu dest chunk=[(1492, 0) to (2984, 1275) = (1492x1275)] normalization setup
= 0
gpu dest chunk=[(1492, 0) to (2984, 1275) = (1492x1275)] normalization
render = 0
gpu dest chunk=[(1492, 0) to (2984, 1275) = (1492x1275)] dest rgb
disassembly setup = 0
gpu dest chunk=[(1492, 0) to (2984, 1275) = (1492x1275)] dest rgb
disassembly render = 0
gpu dest chunk=[(1492, 0) to (2984, 1275) = (1492x1275)] rgb readback =
0.016
gpu dest chunk=[(1492, 0) to (2984, 1275) = (1492x1275)] dest alpha
disassembly setup = 0
gpu dest chunk=[(1492, 0) to (2984, 1275) = (1492x1275)] dest alpha
disassembly render = 0
gpu dest chunk=[(1492, 0) to (2984, 1275) = (1492x1275)] alpha readback = 0
gpu dest chunk=[(0, 1275) to (1492, 2550) = (1492x1275)] coord image render
time = 0
gpu dest chunk=[(0, 1275) to (1492, 2550) = (1492x1275)] source chunk=[(0,
0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] setup = 0
gpu dest chunk=[(0, 1275) to (1492, 2550) = (1492x1275)] source chunk=[(0,
0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] render = 0.015
gpu dest chunk=[(0, 1275) to (1492, 2550) = (1492x1275)] normalization setup
= 0
gpu dest chunk=[(0, 1275) to (1492, 2550) = (1492x1275)] normalization
render = 0
gpu dest chunk=[(0, 1275) to (1492, 2550) = (1492x1275)] dest rgb
disassembly setup = 0
gpu dest chunk=[(0, 1275) to (1492, 2550) = (1492x1275)] dest rgb
disassembly render = 0
gpu dest chunk=[(0, 1275) to (1492, 2550) = (1492x1275)] rgb readback =
0.016
gpu dest chunk=[(0, 1275) to (1492, 2550) = (1492x1275)] dest alpha
disassembly setup = 0
gpu dest chunk=[(0, 1275) to (1492, 2550) = (1492x1275)] dest alpha
disassembly render = 0
gpu dest chunk=[(0, 1275) to (1492, 2550) = (1492x1275)] alpha readback = 0
gpu dest chunk=[(1492, 1275) to (2984, 2550) = (1492x1275)] coord image
render time = 0
gpu dest chunk=[(1492, 1275) to (2984, 2550) = (1492x1275)] source
chunk=[(0, 0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to
(4, 4) = (4x4)] setup = 0.016
gpu dest chunk=[(1492, 1275) to (2984, 2550) = (1492x1275)] source
chunk=[(0, 0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to
(4, 4) = (4x4)] render = 0
gpu dest chunk=[(1492, 1275) to (2984, 2550) = (1492x1275)] normalization
setup = 0
gpu dest chunk=[(1492, 1275) to (2984, 2550) = (1492x1275)] normalization
render = 0.015
gpu dest chunk=[(1492, 1275) to (2984, 2550) = (1492x1275)] dest rgb
disassembly setup = 0
gpu dest chunk=[(1492, 1275) to (2984, 2550) = (1492x1275)] dest rgb
disassembly render = 0
gpu dest chunk=[(1492, 1275) to (2984, 2550) = (1492x1275)] rgb readback = 0
gpu dest chunk=[(1492, 1275) to (2984, 2550) = (1492x1275)] dest alpha
disassembly setup = 0.016
gpu dest chunk=[(1492, 1275) to (2984, 2550) = (1492x1275)] dest alpha
disassembly render = 0
gpu dest chunk=[(1492, 1275) to (2984, 2550) = (1492x1275)] alpha readback =
0
gpu destruct time = 0.047
gpu total time = 0.39
THANKS FOR THE GOOD NEWS, Ryan!
keep them coming.
Yuv
Zoran Zorkic wrote:
>> I'm not sure what to make of this. Any image at the end of the process?
>> and I guess this is still with the first nona-gpu binary by Guido?
>
> Yup.
yup = there is an image at the end of the process? Is it as expected?
> No problem. Glad I can help out.
> 4169 gives me this on my system:
I see no error message. was there a resulting image? does it look as
expected?
> I'll try to get ATI results later today.
thanks for your effort, Zoran.
Yuv
Thank you, dear Guido and Ryan.
any update of <http://wiki.panotools.org/Hugin_SDK_(MSVC_2008)> needed?
Yuv
I messed around a bit last night and found that if I change the format
for the coordinate texture from luminance_alpha32f to say rgba32f, I
could get Nona to run a bit further. Perhaps that format is not
supported by the driver I am using?
I know very little about gpu programming, let alone gpgpu approaches
used here.
- Gerry
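If the coordinate texture format really is what some drivers reject, one option would be to try a list of candidate formats in order and keep the first one the driver can render to. The sketch below shows only that selection logic. The function `pick_coord_format` and the probe callback are hypothetical; nothing in this thread says nona does this. In a real build the probe would attach a texture of the given format to a framebuffer object and check it for completeness, whereas here it is stubbed out so the example runs without a GL context.

```python
def pick_coord_format(candidates, is_renderable):
    """Return the first candidate the probe accepts, or None if all fail.

    candidates    -- format labels in order of preference
    is_renderable -- callback taking a label and returning True or False
                     (hypothetical; a real one would query the driver)
    """
    for fmt in candidates:
        if is_renderable(fmt):
            return fmt
    return None


if __name__ == "__main__":
    # Pretend driver that only accepts the four-channel float format.
    accepted = {"GL_RGBA32F_ARB"}
    choice = pick_coord_format(
        ["GL_LUMINANCE_ALPHA32F_ARB", "GL_RGBA32F_ARB"],
        lambda fmt: fmt in accepted,
    )
    print(choice)
```

Putting the two-channel format first keeps the smaller texture on drivers that accept it and falls back to the four-channel one only where needed.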
Guido Kohlmeyer wrote:
>> any update of <http://wiki.panotools.org/Hugin_SDK_(MSVC_2008)> needed?
>>
> Yes of course, I have to add the description how to generate the library
> from source package with MSVC 2008 EE. An updated SDK package is
> mandatory too.
thank you.
there are a few other things I'd suggest doing for the next edition of
the SDK:
1. autopano-sift-C:
- replace autopano-sift-C with Tom's newest build (which fixed so many
memory leaks)
- *but* keep the old "generatekeys" in the same folder (Tom has now
deprecated it, but it is still useful to many different workflows)
2. update to the latest Exiftool (this is a continuum - regressions are
very rare, and the continuous evolution of camera models and related
EXIF data must be tracked)
3. update exiv2 to 0.18.2
4. libpano:
- move pano13 up one folder to make it more compatible with Tom's new
CMake build for pano13. The CMake build for libpano is very new and not
yet complete, but it is already usable and makes life *much* easier,
especially on Windows systems. This will likely require a small change
in Hugin's CMakeLists.txt.
- use the latest SVN (with the fix for the locale mangling and with
Tom's latest CMake build)
5. UnxUtils: leave them there for now, but
http://gnuwin32.sourceforge.net/ is better maintained and officially
supports Vista too. At some point the replacement should be tested, and
when the tests are positive GnuWin32 should replace UnxUtils. There also
seems to be a 64-bit relative, https://sourceforge.net/projects/gnuwin64/,
but I have not looked into it.
6. Add enblend-enfuse. It is required to make the INSTALL target (and
later on the installer).
7. for each folder, it would be good to document if it is downloaded or
self-built, and the exact revision used for the build. I had to
investigate / guess some of these (e.g. autopano-sift-C does not print
the revision number in the help text).
8. a readme.txt file at the top level of the SDK would be helpful - a
simple URL to the wiki page is enough for a start.
9. nice to have: add LAPACK <http://www.netlib.org/lapack/> libraries.
> I wanted to wait until some developers approve the
> functionality of nona based on the static library. Due to the fact that
> my graphic card is too old I cannot do such tests. Based on first
> impressions in this thread it seams that the lib works or other way
> around there are situations where it works thus it is no general broken
> build of nona.
Everything is fine. Andrew has added GPU-stitching in a non-obtrusive
way. The current trunk does not break existing functionality (exception:
I have no reports yet if the new code breaks build or functionality on OSX).
> I will update the SDK in few days or hours ...
take your time (focus on robustness and quality), and thank you for the
effort.
Yuv
Same here. Today I found out why I had a different error on the same
hardware: my Windows NVIDIA driver was still 81.98 from January 2006. I
just updated to 190.38 and now I get the same error message as in Ubuntu:
gpu shader program compile time = 0.844
nona: GL error: Framebuffer incomplete, incomplete attachment in:
..\..\..\hugin\src\hugin_base\vigra_ext\ImageTransformsGPU.cpp:700
Ryan is so far the only one who has reported success, with his
self-built version. I wonder whether one of the pre-compiled binaries
(from Guido or from me) yields the same result; that would rule out
build errors.
His video card is a GeForce 8800 GTS (256 MB). Can anybody else with that
same video card run nona -g successfully? And what about other video cards?
Yuv
Yuv,
So I did some testing with your exe
(http://www.photopla.net/hugin/nona_4169.7z). Several projects that ran
without error with my build generated the following error with yours:
nona: GL error in ..\..\..\hugin\src\hugin_base\vigra_ext\ImageTransformsGPU.cpp:743: out of memory
However, when I re-created the simple project I had used to test my own
build (just a simple transform, the one I posted previously), it ran
without errors.
Since the outcome appears to vary with the project and its settings, is
there any desire to create a sample setup (or preferably a few) that
tests different levels of complexity?
For what it's worth, here's the results using your exe:
nona: using graphics card: NVIDIA Corporation GeForce 8800 GTS/PCI/SSE2
destStart=[0, 0]
destEnd=[3000, 2800]
destSize=[(3000, 2800)]
srcSize=[(4296, 2856)]
srcBuffer=062D0020
srcAlphaBuffer=0A610020
destBuffer=085F0020
destAlphaBuffer=09E00020
destGLInternalFormat=GL_RGBA8
destGLFormat=GL_RGB
destGLType=GL_UNSIGNED_BYTE
srcGLInternalFormat=GL_RGBA8
srcGLFormat=GL_RGB
srcGLType=GL_UNSIGNED_BYTE
srcAlphaGLType=GL_UNSIGNED_BYTE
destAlphaGLType=GL_UNSIGNED_BYTE
warparound=0
needsAtanWorkaround=0
maxTextureSize=8192
Source chunks:
[(0, 0) to (4296, 2856) = (4296x2856)]
Dest chunks:
[(0, 0) to (1500, 1400) = (1500x1400)]
[(1500, 0) to (3000, 1400) = (1500x1400)]
[(0, 1400) to (1500, 2800) = (1500x1400)]
[(1500, 1400) to (3000, 2800) = (1500x1400)]
Total GPU memory used: 190555008
Interpolator chunks:
[(0, 0) to (4, 4) = (4x4)]
#version 110
#extension GL_ARB_texture_rectangle : enable
uniform sampler2DRect SrcTexture;
float sinh(const in float x) { return (exp(x) - exp(-x)) / 2.0; }
float cosh(const in float x) { return (exp(x) + exp(-x)) / 2.0; }
float atan2_xge0(const in float y, const in float x) {
return atan(y, x);
}
float atan2_safe(const in float y, const in float x) {
return atan(y, x);
}
float atan_safe(const in float yx) {
return atan(yx);
}
void main(void)
{
float discardA = 1.0;
float discardB = 0.0;
vec2 src = gl_TexCoord[0].st;
src -= vec2(1500.0000000000000000, 1400.0000000000000000);
// rotate_erect(18000.000000000000000, -0.00000000000000000000)
{
//src.s += -0.00000000000000000000;
float w = (abs(src.s) > 18000.000000000000000) ? 1.0 : 0.0;
float n = (src.s < 0.0) ? 0.5 : -0.5;
src.s += w * -36000.000000000000000 * ceil(src.s /
36000.000000000000000 + n);
}
// sphere_tp_erect(5729.5779513082325000)
{
float phi = src.s / 5729.5779513082325000;
float theta = -src.t / 5729.5779513082325000 +
1.5707963267948966000;
if (theta < 0.0) {
theta = -theta;
phi += 3.1415926535897931000;
}
if (theta > 3.1415926535897931000) {
theta = 3.1415926535897931000 - (theta - 3.1415926535897931000);
phi += 3.1415926535897931000;
}
float s = sin(theta);
vec2 v = vec2(s * sin(phi), cos(theta));
float r = length(v);
theta = 5729.5779513082325000 * atan2_safe(r, s * cos(phi));
src = v * (theta / r);
}
// persp_sphere(5729.5779513082325000)
{
mat3 m = mat3(0.80742836807921270000, -0.58996561799899783000,
0.00000000000000000000,
0.58996561799899783000, 0.80742836807921270000,
0.00000000000000000000,
0.00000000000000000000, 0.00000000000000000000,
1.0000000000000000000);
float r = length(src);
float theta = r / 5729.5779513082325000;
float s = 0.0;
if (r != 0.0) s = sin(theta) / r;
vec3 v = vec3(s * src.s, s * src.t, cos(theta));
vec3 u = v * m;
r = length(u.st);
theta = 0.0;
if (r != 0.0) theta = 5729.5779513082325000 * atan2_safe(r, u.p) /
r;
src = theta * u.st;
}
// rect_sphere_tp(5729.5779513082325000)
{
float r = length(src);
float theta = r / 5729.5779513082325000;
float rho = 0.0;
if (theta >= 1.5707963267948966000) rho = 1.6e16;
else if (theta == 0.0) rho = 1.0;
else rho = tan(theta) / theta;
src *= rho;
}
// resize(1.6728386297652815000, 1.6728386297652815000)
src *= vec2(1.6728386297652815000, 1.6728386297652815000);
src += vec2(2144.5000000000000000, 1427.5000000000000000);
src = src * discardA + vec2(-1000.0, -1000.0) * discardB;
gl_FragColor = accum;
}
// destExposure = 0.95663269200564105000
// srcExposure = 0.95663269262595030000
// whiteBalanceRed = 1.0000000000000000000
// whiteBalanceBlue = 1.0000000000000000000
p.rgb = p.rgb * 255.00000000000000000;
vec2 invR = texture2DRect(InvLutTexture, vec2(p.r, 0.0)).sq;
vec2 invG = texture2DRect(InvLutTexture, vec2(p.g, 0.0)).sq;
vec2 invB = texture2DRect(InvLutTexture, vec2(p.b, 0.0)).sq;
vec3 invX = vec3(invR.x, invG.x, invB.x);
vec3 invY = vec3(invR.y, invG.y, invB.y);
vec3 invA = fract(p.rgb);
p.rgb = mix(invX, invY, invA);
// VigCorrMode=VIGCORR_RADIAL
float vig = 1.0;
{
vec2 vigCorrCenter = vec2(2144.5000000000000000,
1427.5000000000000000);
float radiusScale=0.00038806915548755783000;
float radialVigCorrCoeff[4] = float[4](1.0000000000000000000,
0.00000000000000000000, 0.00000000000000000000, 0.00000000000000000000);
vec2 src = texture2DRect(CoordTexture, gl_TexCoord[0].st).sq;
vec2 d = src - vigCorrCenter;
d *= radiusScale;
vig = radialVigCorrCoeff[0];
float r2 = dot(d, d);
float r = r2;
vig += radialVigCorrCoeff[1] * r;
r *= r2;
vig += radialVigCorrCoeff[2] * r;
r *= r2;
vig += radialVigCorrCoeff[3] * r;
}
vec3 exposure_whitebalance = vec3(0.99999999935157013000,
0.99999999935157013000, 0.99999999935157013000);
p.rgb = (p.rgb * exposure_whitebalance) / vig;
p.rgb = p.rgb * 1023.0000000000000000;
vec2 destR = texture2DRect(DestLutTexture, vec2(p.r, 0.0)).sq;
vec2 destG = texture2DRect(DestLutTexture, vec2(p.g, 0.0)).sq;
vec2 destB = texture2DRect(DestLutTexture, vec2(p.b, 0.0)).sq;
vec3 destX = vec3(destR.x, destG.x, destB.x);
vec3 destY = vec3(destR.y, destG.y, destB.y);
vec3 destA = fract(p.rgb);
p.rgb = mix(destX, destY, destA);
gl_FragColor = p;
}
gpu shader program compile time = 0.032
gpu shader texture/framebuffer setup time = 0.015
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] coord image render
time = 0
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] source chunk=[(0, 0)
to (4296, 2856) = (4296x2856)] src upload = 0.063
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] source chunk=[(0, 0)
to (4296, 2856) = (4296x2856)] src alpha upload = 0.015
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] source chunk=[(0, 0)
to (4296, 2856) = (4296x2856)] src+alpha render = 0.047
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] source chunk=[(0, 0)
to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] setup = 0.016
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] source chunk=[(0, 0)
to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] render = 0
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] normalization setup =
0
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] normalization render =
0.015
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] dest rgb disassembly
setup = 0
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] dest rgb disassembly
render = 0
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] rgb readback = 0.016
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] dest alpha disassembly
setup = 0
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] dest alpha disassembly
render = 0
gpu dest chunk=[(0, 0) to (1500, 1400) = (1500x1400)] alpha readback = 0
gpu dest chunk=[(1500, 0) to (3000, 1400) = (1500x1400)] coord image render
time = 0
gpu dest chunk=[(1500, 0) to (3000, 1400) = (1500x1400)] source chunk=[(0,
0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] setup = 0
gpu dest chunk=[(1500, 0) to (3000, 1400) = (1500x1400)] source chunk=[(0,
0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] render = 0.015
gpu dest chunk=[(1500, 0) to (3000, 1400) = (1500x1400)] normalization setup
= 0
gpu dest chunk=[(1500, 0) to (3000, 1400) = (1500x1400)] normalization
render = 0
gpu dest chunk=[(1500, 0) to (3000, 1400) = (1500x1400)] dest rgb
disassembly setup = 0
gpu dest chunk=[(1500, 0) to (3000, 1400) = (1500x1400)] dest rgb
disassembly render = 0
gpu dest chunk=[(1500, 0) to (3000, 1400) = (1500x1400)] rgb readback =
0.016
gpu dest chunk=[(1500, 0) to (3000, 1400) = (1500x1400)] dest alpha
disassembly setup = 0
gpu dest chunk=[(1500, 0) to (3000, 1400) = (1500x1400)] dest alpha
disassembly render = 0
gpu dest chunk=[(1500, 0) to (3000, 1400) = (1500x1400)] alpha readback = 0
gpu dest chunk=[(0, 1400) to (1500, 2800) = (1500x1400)] coord image render
time = 0
gpu dest chunk=[(0, 1400) to (1500, 2800) = (1500x1400)] source chunk=[(0,
0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] setup = 0
gpu dest chunk=[(0, 1400) to (1500, 2800) = (1500x1400)] source chunk=[(0,
0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to (4, 4) =
(4x4)] render = 0.016
gpu dest chunk=[(0, 1400) to (1500, 2800) = (1500x1400)] normalization setup
= 0
gpu dest chunk=[(0, 1400) to (1500, 2800) = (1500x1400)] normalization
render = 0
gpu dest chunk=[(0, 1400) to (1500, 2800) = (1500x1400)] dest rgb
disassembly setup = 0
gpu dest chunk=[(0, 1400) to (1500, 2800) = (1500x1400)] dest rgb
disassembly render = 0
gpu dest chunk=[(0, 1400) to (1500, 2800) = (1500x1400)] rgb readback =
0.015
gpu dest chunk=[(0, 1400) to (1500, 2800) = (1500x1400)] dest alpha
disassembly setup = 0
gpu dest chunk=[(0, 1400) to (1500, 2800) = (1500x1400)] dest alpha
disassembly render = 0
gpu dest chunk=[(0, 1400) to (1500, 2800) = (1500x1400)] alpha readback = 0
gpu dest chunk=[(1500, 1400) to (3000, 2800) = (1500x1400)] coord image
render time = 0.016
gpu dest chunk=[(1500, 1400) to (3000, 2800) = (1500x1400)] source
chunk=[(0, 0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to
(4, 4) = (4x4)] setup = 0
gpu dest chunk=[(1500, 1400) to (3000, 2800) = (1500x1400)] source
chunk=[(0, 0) to (4296, 2856) = (4296x2856)] interpolation chunk=[(0, 0) to
(4, 4) = (4x4)] render = 0
gpu dest chunk=[(1500, 1400) to (3000, 2800) = (1500x1400)] normalization
setup = 0
gpu dest chunk=[(1500, 1400) to (3000, 2800) = (1500x1400)] normalization
render = 0
gpu dest chunk=[(1500, 1400) to (3000, 2800) = (1500x1400)] dest rgb
disassembly setup = 0
gpu dest chunk=[(1500, 1400) to (3000, 2800) = (1500x1400)] dest rgb
disassembly render = 0
gpu dest chunk=[(1500, 1400) to (3000, 2800) = (1500x1400)] rgb readback =
0.015
gpu dest chunk=[(1500, 1400) to (3000, 2800) = (1500x1400)] dest alpha
disassembly setup = 0
gpu dest chunk=[(1500, 1400) to (3000, 2800) = (1500x1400)] dest alpha
disassembly render = 0
gpu dest chunk=[(1500, 1400) to (3000, 2800) = (1500x1400)] alpha readback =
0.016
gpu destruct time = 0.016
gpu total time = 0.344
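The "Dest chunks" section of the log above shows the 3000x2800 destination split into four 1500x1400 tiles, processed row by row. A minimal sketch of that kind of split follows. How nona actually decides the number of chunks is not shown in the log, so the function name `split_rect` and the grid counts `nx` and `ny` are assumptions made for illustration only.

```python
def split_rect(width, height, nx, ny):
    """Split a width x height rectangle into nx * ny tiles, row by row.

    Returns (x0, y0, x1, y1) tuples with exclusive end coordinates,
    matching the "(x0, y0) to (x1, y1)" notation used in the log.
    """
    tiles = []
    for j in range(ny):
        y0, y1 = j * height // ny, (j + 1) * height // ny
        for i in range(nx):
            x0, x1 = i * width // nx, (i + 1) * width // nx
            tiles.append((x0, y0, x1, y1))
    return tiles


# Hypothetical 2x2 grid chosen to match the four chunks in the log.
chunks = split_rect(3000, 2800, 2, 2)
for x0, y0, x1, y1 in chunks:
    print(f"[({x0}, {y0}) to ({x1}, {y1}) = ({x1 - x0}x{y1 - y0})]")
```

With a 2x2 grid this prints the same four chunk lines, in the same order, as the log.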
Ryan Sleevi wrote:
> So I did some testing with your exe
> (http://www.photopla.net/hugin/nona_4169.7z ). Several projects I was able
> to run without error with mine generated the following error:
>
> nona: GL error in
> ..\..\..\hugin\src\hugin_base\vigra_ext\ImageTransformsGPU.cpp:743: out of
> memory
Hmm... is yours 64-bit or 32-bit?
> Since there appears to be some degree of variance depending on the
> project and the settings, is there any desire to create a sample setup (or
> preferably few) that test different levels of complexity?
Yes, your suggestion makes sense. We need a set of typical test projects
(we should have them anyway and run them before we ship *any* version of
Hugin).
This is a task that any user (i.e. non-developer) can contribute to. We
need one volunteer to collect the test cases and organize them. And
everybody can contribute test cases.
I'm away for one week starting tomorrow morning. You don't need me for
this. There are a number of people around here who have access codes (I
actually don't even remember where I put the access code to
hugin.panotools.org).
I'd be happy to see the community put together the test cases.
Yuv
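As a starting point for the test collection discussed above, here is a rough sketch of a runner that remaps every project file in a directory twice, once on the CPU and once with `nona -g`, and records the exit codes. The directory layout and the names `nona_command` and `run_cases` are hypothetical, and it assumes `nona` is on the PATH and that each test case is a single .pto file whose images sit next to it.

```python
import subprocess
from pathlib import Path


def nona_command(project, out_prefix, use_gpu):
    """Build the nona command line for one project file."""
    cmd = ["nona"]
    if use_gpu:
        cmd.append("-g")  # GPU remapping, as in "nona -g" from this thread
    cmd += ["-o", str(out_prefix), str(project)]
    return cmd


def run_cases(project_dir, out_dir):
    """Run every .pto file in project_dir with and without -g.

    Returns {(project name, use_gpu): exit code}.
    """
    results = {}
    for project in sorted(Path(project_dir).glob("*.pto")):
        for use_gpu in (False, True):
            tag = "gpu" if use_gpu else "cpu"
            prefix = Path(out_dir) / f"{project.stem}_{tag}_"
            proc = subprocess.run(
                nona_command(project, prefix, use_gpu),
                capture_output=True,
                text=True,
            )
            results[(project.name, use_gpu)] = proc.returncode
    return results
```

Something like `run_cases("testcases", "out")` would then give one pass/fail entry per project and mode. Comparing the CPU and GPU output images would still need a separate step.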
64-bit through and through. I'm using a DLL version of GLUT rather than a
static library, simply because it was convenient at the time. However, the
error suggested to me that it was a GPU allocation error and not a system
allocation error. The memory usage of nona only peaked at ~100 MB when
generating that error.
Zoran Zorkic wrote:
You can find one here (built by me):
http://hugin.panotools.org/testing/hugin/nona.zip
Another one can be found here (I suppose Yuval has built it)
http://www.photopla.net/hugin/nona_4169.7z
Guido
Guido Kohlmeyer wrote:
> You can find one here (built by me):
> http://hugin.panotools.org/testing/hugin/nona.zip
>
> Another one can be found here (I suppose Yuval has built it)
> http://www.photopla.net/hugin/nona_4169.7z
I guess Zoran was asking for Ryan to put one up. Ryan is the only one so
far to report success, with his self-built 64-bit version. If he builds a
32-bit version and Zoran confirms that it works, then we would know that
the issue is not in the nona-gpu code but in the SDK on
hugin.panotools.org (I've used the SDK to build mine).
Yuv
Zoran Zorkic wrote:
> I'm up for it.
thanks for volunteering to put together a collection of test cases.
> But what do you propose as test cases?
see what this user community proposes. I would suggest that you set up a
framework and ask people to contribute to it. Then you can store the
contributed projects somewhere...
> What complexity?
increasing. Continuing from above: ... sort them out in terms of
complexity, put them up for download on hugin.panotools.org - each in
its own archive, linked from a Wiki page. On the wiki you could also
collect test results, i.e. put the tests in the column header and let
people add to the wiki one line for each test, with a description of
their hardware and the version of Hugin used as line header.
> How about:
> 2x5mp photos
> 4x5mp photos
that's a good start.
> Do we need a full spherical pano test?
Hugin is used for full sphericals too and I'm sure somebody will
contribute a project.
Yuv
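The results layout proposed above (tests as column headers, one line per hardware and Hugin version) could be sketched as a small plain-text table before it goes onto the wiki. The function name `render_results` and the row shown are placeholders, not real test results. The column names are the test sizes suggested in this thread.

```python
def render_results(tests, rows):
    """Render a plain-text matrix: one column per test, one line per setup."""
    header = ["setup"] + list(tests)
    lines = [header]
    for setup, outcome in rows.items():
        # "?" marks a test that has not been reported for this setup.
        lines.append([setup] + [outcome.get(t, "?") for t in tests])
    widths = [max(len(line[i]) for line in lines) for i in range(len(header))]
    return "\n".join(
        " | ".join(cell.ljust(w) for cell, w in zip(line, widths)).rstrip()
        for line in lines
    )


tests = ["2x5mp", "4x5mp", "full spherical"]
rows = {
    # Placeholder row: replace with a real hardware/version description.
    "<hardware A>, <Hugin version>": {"2x5mp": "pass", "4x5mp": "fail"},
}
print(render_results(tests, rows))
```

Each contributor would add one entry to `rows`. Untested combinations show up as "?" so gaps in coverage stay visible.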
thanks for the work and for the log, Harry.
> nona: normalization/photometric shader program could not be compiled.
> nona: GL info log:
> ERROR: 0:35: 'array of float' : constructor not supported for type
> ERROR: 0:35: 'array of float' : no matching overloaded function found
> ERROR: 0:35: 'radialVigCorrCoeff' : redefinition
I'm no expert either, so maybe my hypothesis is completely wrong. I
would like to exclude a driver issue and the best way to find out is if
you can set up an Ubuntu partition on your IntelMac and try to run the
exact same nona project there.
I had a different error at the same stage. Failed on Windows. Passed on
Ubuntu. Updated Windows driver and it passed there too. Now on my system
it fails at a later point, on both Windows and Ubuntu.
Yuv
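One possible reading of the compile error quoted above, not confirmed in this thread: the `float[4](...)` array constructor used for `radialVigCorrCoeff` was only added to GLSL in version 1.20, so a compiler treating the shader as `#version 110` may reject it regardless of driver quality. Since nona generates its shaders as text, a generator could emit per-element assignments instead. The sketch below uses a hypothetical helper, `glsl_float_array`, and is not taken from the Hugin source. The emitted lines have to sit inside a function body such as `main`.

```python
def glsl_float_array(name, values):
    """Emit a GLSL float array declaration without an array constructor.

    Writes 'float name[N];' followed by one assignment per element, which
    avoids the 'float[N](...)' syntax introduced in GLSL 1.20.
    """
    lines = [f"float {name}[{len(values)}];"]
    for i, v in enumerate(values):
        # float(...) guarantees a float literal such as 1.0 rather than 1.
        lines.append(f"{name}[{i}] = {float(v)!r};")
    return "\n".join(lines)


print(glsl_float_array("radialVigCorrCoeff", [1.0, 0.0, 0.0, 0.0]))
```

Whether this is the right fix for the IntelMac failure would still have to be checked on that machine.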
thanks. bug report at
<https://sourceforge.net/tracker/?func=detail&aid=2844187&group_id=77506&atid=550441>
Yuv
I did a few tests on my new machine:
Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
GeForce 9500 GT.
8 GB of RAM.
Ubuntu 64 bits,
hugin svn 4233.
enblend / enfuse 3.2-staging-rev350
I processed a 12000x6000 pixel, 360x180 equirectangular
panorama (32 images, 10 at 36/0, 10 at 36/45, 10 at 36/-45, zenith &
nadir).
The images were already processed, so I just ran the enfuse command
twice, measuring the time and trying to do nothing else meanwhile.
I got 309s without the gpu option used and 282s with it.
Is this linked to my cheap gpu?
There were some NaN (not a number) warnings during the processing with nona.
Is it linked to the size of the panorama?
What gains could be achieved or expected with a better gpu?
esby / Y. Tennevin
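To put the two timings reported above in perspective, here is the arithmetic as a short script. Only the 309 s and 282 s figures come from the message. The variable names are illustrative.

```python
cpu_seconds = 309.0  # run without the GPU option, from the message above
gpu_seconds = 282.0  # run with the GPU option

saved = cpu_seconds - gpu_seconds
speedup = cpu_seconds / gpu_seconds
reduction = saved / cpu_seconds

print(f"saved {saved:.0f} s, speedup {speedup:.2f}x, {reduction:.1%} less wall time")
```

This prints `saved 27 s, speedup 1.10x, 8.7% less wall time`. The figures cover the whole measured run, so they say little about how much faster the GPU remapping step is on its own.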