Laszip wrapper

80 views
Skip to first unread message

J Luis

unread,
Dec 16, 2015, 12:23:51 PM12/16/15
to julia-geo
Hi,

I made a wrapper  to the LIDAR Laszip libary with which one can load (and some basic writing) point clouds stored using the LAS format and mostly in its compressed form the LAZ format.

https://github.com/joa-quim/Laszip.jl

Joaquim

Martijn Visser

unread,
Dec 16, 2015, 1:50:22 PM12/16/15
to julia-geo
Hi Joaquim,

Thanks for sharing!

I probably should have spread the word a bit more, but I've created a basic wrapper for the BSD-licensed LibLAS. With its dependency on LASzip it can be used on LAS and LAZ equally.
Would this also have covered your needs or do you have bigger plans?


-Martijn

Chris Foster

unread,
Dec 16, 2015, 2:52:33 PM12/16/15
to julia-geo
Thanks guys, these look both look useful. LibLAS is a relatively
heavy dependency when all you want to do is read las/z, so it's nice
to have both as an option.

I could be wrong, but it seems that every las reader library has ended
up with a lot of utility code around it (liblas, lastools, pdal) and
there's nothing which focuses purely on doing the file IO part nicely,
and letting people use their own tools for the rest. I suspect the
problem comes in at least two parts:
* C++ users want their own application-specific point buffer format to
communicate the results of IO operations, and to use in their
application in arbitrary program-specific ways.
* Reading a particular variant of uncompressed las into a custom
datastructure is pretty easy and very fast, so people are tempted to
roll their own reader.

Coming up with a nice in-memory point buffer that satisfies everyone
from both a generality and efficiency point of view is just a hard
problem and libraries end up as walled gardens built around their
specific blessed point buffer data structure. This isn't the library
author's fault, it's inherent limitations of C++ (ABI specific struct
padding being one of the worst things if you're doing an
array-of-structs layout).

I'm hopeful we can avoid this in julia at some stage by defining a
nice buffer API which everyone can agree on. Maybe the right thing is
just to use DataFrames once the performance problems are solved (see
http://www.johnmyleswhite.com/notebook/2015/11/28/why-julias-dataframes-are-still-slow/).

Oops, I didn't mean to write that much... thanks guys for making these
libraries in any case :)
~Chris
> --
> You received this message because you are subscribed to the Google Groups
> "julia-geo" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to julia-geo+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

J Luis

unread,
Dec 16, 2015, 6:18:59 PM12/16/15
to julia-geo
Hi, I must tell you guys that I'm not a heavy (maybe not even a light one) LIDAR user. I this wrapper because I wanted to investigate more a fact that I found where xyz grids stored as laz are more than 4 times smaller than netCDF grids using the best deflation level. But I'm even more confused now because if I store just the Z vector (by cheating a bit) I don't recover the small grids as I do by simply converting a grid in xyz with txt2las.

If any of want to try it, it can be done with this simple example (which uses GMT as well)

   using Laszip, GMT
   G = gmt("gmtread -Tg V:/002_4_13-mis_orto.grd");
   z=reshape(G.z,G.n_rows*G.n_columns,1);
   dat2las("V:/bathy.laz", z, [G.range' G.registration G.inc']);

and recover the grid with

   Z,h=las2dat("V:/bathy.laz");

I also wanted to provoke Simon to test his GLVisualiser and apparently the result was a success

https://github.com/JuliaGL/GLVisualize.jl/issues/44#issuecomment-164005658

Chris Foster

unread,
Dec 17, 2015, 9:37:29 AM12/17/15
to julia-geo
On Thu, Dec 17, 2015 at 9:18 AM, J Luis <jmf...@gmail.com> wrote:
> Hi, I must tell you guys that I'm not a heavy (maybe not even a light one)
> LIDAR user. I this wrapper because I wanted to investigate more a fact that
> I found where xyz grids stored as laz are more than 4 times smaller than
> netCDF grids using the best deflation level. But I'm even more confused now
> because if I store just the Z vector (by cheating a bit) I don't recover the
> small grids as I do by simply converting a grid in xyz with txt2las.

Hmm, I'm afraid I don't understand your exact problem here. What are
the "small grids"?

> If any of want to try it, it can be done with this simple example (which
> uses GMT as well)
>
> using Laszip, GMT
> G = gmt("gmtread -Tg V:/002_4_13-mis_orto.grd");
> z=reshape(G.z,G.n_rows*G.n_columns,1);
> dat2las("V:/bathy.laz", z, [G.range' G.registration G.inc']);

I thought I'd try it out so I grabbed the bat_atl_30c_j.grd example.
I got GMT built, and managed to pull out data for the
http://w3.ualg.pt/%7Ejluis/mirone/misc/bat_atl_30c_j.grd.zip example,
but I got some odd ccall errors (ERROR: could not load symbol
"laszip_create":), which I suspect means my laszip doesn't have the C
interface compiled into it. After that I admit I gave up.

> and recover the grid with
>
> Z,h=las2dat("V:/bathy.laz");
>
> I also wanted to provoke Simon to test his GLVisualiser and apparently the
> result was a success
>
> https://github.com/JuliaGL/GLVisualize.jl/issues/44#issuecomment-164005658

Nice! It's good to see glvisualize coming along.

If you're mainly interested in 3D geospatial data and want to flexibly
display rather large point clouds, you might also be interested in my
project displaz, which has an evolving julia interface. (The
interface isn't in a proper julia package yet, but you can add
https://github.com/c42f/displaz/tree/master/bindings/julia/Displaz.jl
to your julia LOAD_PATH. You'd want the Displaz.jl version from the
associated compiled version of displaz since it has changed a bit
recently. If you're on windows you can download a binary of the
latest displaz release.).

I spent a while trying to make displaz show the Atlantic bathymetry in
a nice way. I ended up with this, which was reasonably nice:

#-------------------------------------------------------------------------------
using Proj4, GMT, Colors, Displaz

function load_grd(filename)
G = gmt("gmtread -Tg $filename")
points = zeros(3, size(G.z)...)
for j=1:size(G.z,2), i=1:size(G.z,1)
points[1,i,j] = G.x[j]
points[2,i,j] = G.y[i]
points[3,i,j] = G.z[i,j]
end
G, reshape(points, 3, length(z))
end

G, points = load_grd("/home/chris/downloads/bat_atl_30c_j.grd")

#points[3,:] *= 10 # Enhance height

# This is a big chunk of the Earth's surface. Transform to ECEF for interest
wgs84 = Projection("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84
+towgs84=0,0,0")
geocent = Projection("+proj=geocent +datum=WGS84 +units=m +no_defs")
points = transform(wgs84, geocent, points')' # ugh, need to fix these
transposes....

z = vec(G.z)
zmin, zmax = extrema(z)
ncols = 200
cm = map(RGB{Float32}, Colors.colormap("RdBu", ncols))
cols = cm[floor(Int, (ncols-1)*(z-zmin)/(zmax-zmin)) + 1]

plot3d(points, color=reinterpret(Float32, cols, (3, size(points2,2))))
#-------------------------------------------------------------------------------

Cheers,
~Chris

J Luis

unread,
Dec 18, 2015, 12:49:56 PM12/18/15
to julia-geo


Hmm, I'm afraid I don't understand your exact problem here.  What are
the "small grids"?

By small I meant the file size. Some examples

A grid stored as a xyz (like the 'points' in your load_grd() function) and converted with txt2las (lastools)
002_4_13-mis_orto.laz    109 KB

Same grid converted to netCDF (deflation 9):  bathy.grd  406 KB

Now convert this grid to laz:     bathy.laz  158 KB > If any of want to try it, it can be done with this simple example (which
This last one only stores the Z column (applies a trick to divide the z pts in three chunks an put each in the x,y,z channels) so it has only 1/3 of the points but it's still bigger than the original file

The same exercise on the bigger bat_atl_30c_j.grd that you used, gives:

bat_atl_30c_j.grd    26.0 MB
bat_atl_30c_j.laz    19.6 MB


I thought I'd try it out so I grabbed the bat_atl_30c_j.grd example.
I got GMT built, and managed to pull out data for the
http://w3.ualg.pt/%7Ejluis/mirone/misc/bat_atl_30c_j.grd.zip example,
but I got some odd ccall errors (ERROR: could not load symbol
"laszip_create":), which I suspect means my laszip doesn't have the C
interface compiled into it.  After that I admit I gave up.

Hmm, maybe it's that. It worked fine with me.

 
If you're mainly interested in 3D geospatial data and want to flexibly
display rather large point clouds, you might also be interested in my
project displaz, which has an evolving julia interface.

Well, my real main interest with Julia is to be able to convert Mirone to Julia. The GMT, Laszip (and the probably abandoned IUP) are bits of that goal. But the main piece is not yet available, which is a way to build GUIs (I just not convinced that GTK (via GTK.jl) is the way for this).

I tried your displaz viewer. Cool, very fast. It would be nice if it had a coloring model based on point's height. Much like you seam to be doing in your example script.
I say "seam" because last line errors for me with

    julia> plot3d(points, color=reinterpret(Float32, cols, (3, size(points,2))))
    ERROR: could not spawn `displaz -script -server default -dataname 'Points [13630201 vertices]' -shader generic_points.glsl -rmtemp 'C:\tmp\julF59A.tmp.ply'`: no such file or   directory (ENOENT)
    in _jl_spawn at process.jl:262
 

Chris Foster

unread,
Dec 18, 2015, 7:33:19 PM12/18/15
to julia-geo
On Sat, Dec 19, 2015 at 3:49 AM, J Luis <jmf...@gmail.com> wrote:
> The same exercise on the bigger bat_atl_30c_j.grd that you used, gives:
>
> bat_atl_30c_j.grd 26.0 MB
> bat_atl_30c_j.laz 19.6 MB

Right, that makes sense. The laszip codec is pretty nice :)

>> I thought I'd try it out so I grabbed the bat_atl_30c_j.grd example.
>> I got GMT built, and managed to pull out data for the
>> http://w3.ualg.pt/%7Ejluis/mirone/misc/bat_atl_30c_j.grd.zip example,
>> but I got some odd ccall errors (ERROR: could not load symbol
>> "laszip_create":), which I suspect means my laszip doesn't have the C
>> interface compiled into it. After that I admit I gave up.
>
>
> Hmm, maybe it's that. It worked fine with me.

Looking at the cmake source it seems that `laszip_dll.cpp` is only
included in the laszip build on windows, so the C interface just isn't
available on linux. It seems like a bit of an odd decision, though
easily worked around.

> Well, my real main interest with Julia is to be able to convert Mirone to
> Julia. The GMT, Laszip (and the probably abandoned IUP) are bits of that
> goal. But the main piece is not yet available, which is a way to build GUIs
> (I just not convinced that GTK (via GTK.jl) is the way for this).

Right, that makes sense. displaz aims at being a configurable and
scriptable plotting tool for large point clouds and other geospatial
data, but I don't think it will grow beyond that. The general idea is
that data processing should be done in a proper programming language
(not a GUI), so people should be able to write data analysis code in
their favourite language (C++, matlab, python, julia, whatever) and
use displaz to show the results. At the moment it's pretty good at
displaying large lidar point clouds (up to a few hundred million
points), and it has some rudimentary support for meshes and lines.

> I tried your displaz viewer. Cool, very fast. It would be nice if it had a
> coloring model based on point's height.

This is possible via editing the glsl shader (in 0.3.2 see the
View->Shader Editor menu. Edit as you like and press shift+enter to
compile it.) The project is gradually evolving in the direction of
being entirely scriptable via the socket interface, rather than adding
a lot of visible UI via the C++ code. There will need to be more UI
pieces, but I want them to generally support the goal of
scriptability.

> Much like you seam to be doing in
> your example script.
> I say "seam" because last line errors for me with
>
> julia> plot3d(points, color=reinterpret(Float32, cols, (3,
> size(points,2))))
> ERROR: could not spawn `displaz -script -server default -dataname
> 'Points [13630201 vertices]' -shader generic_points.glsl -rmtemp
> 'C:\tmp\julF59A.tmp.ply'`: no such file or directory (ENOENT)
> in _jl_spawn at process.jl:262

Ah yes, I think you've got two problems here:
1) You don't have displaz.exe in the path, so it can't be found by
julia. I assume you installed the latest released win32 executable
from https://github.com/c42f/displaz/releases/download/v0.3.2/displaz-0.3.2-win64.exe
?
2) You're using a too recent version of Displaz.jl. Some command line
arguments have diverged since 0.3.2, so I really need to make a new
release 0.4. For the moment, you'd need to go with the 0.3.2 version
to get compatibility with the released binary. Here it is:
https://github.com/c42f/displaz/blob/v0.3.2/bindings/julia/Displaz.jl.
Unfortunately it's a bit older, so you'll need `plot` rather than
`plot3d`, and some of the arrays may need to be transposed.

~Chris

J Luis

unread,
Dec 19, 2015, 6:19:35 PM12/19/15
to julia-geo
Yes, I don't have displaz in the path but I could find the ply file in the tmp dir and open it.
However, I think that is the week point of this. I mean the, the fact that a (large - 571 MB) temporary file had to be created. Why is that? Because displaz is C++ and Julia can't ccall it (yet)?

Chris Foster

unread,
Dec 19, 2015, 7:58:36 PM12/19/15
to julia-geo
On Sun, Dec 20, 2015 at 9:19 AM, J Luis <jmf...@gmail.com> wrote:
> Yes, I don't have displaz in the path but I could find the ply file in the
> tmp dir and open it.
> However, I think that is the week point of this. I mean the, the fact that a
> (large - 571 MB) temporary file had to be created. Why is that? Because
> displaz is C++ and Julia can't ccall it (yet)?

The displaz language binding interfaces intentionally use a
multiprocess design with message passing rather than running the UI in
the julia process. This has several advantages, but it does mean
using some form of interprocess communication to pass the data.
There's basically three options I'm aware of:
1) Put it in a temporary file, and communicate the location to displaz
(via socket, for example)
2) Pass the data directly down a socket to the displaz process
3) Use a shared memory array

Option (1) is by far the simplest to implement - remember that I'm
trying to support use from python and matlab as well - so I've gone
with that for now. The performance is actually pretty good too, since
files are aggressively cached in RAM by the operating system (at least
on linux).

Even if I implemented option (3) and passed the memory directly I
suspect it would help less than you might imagine: displaz reorders
the data for good rendering performance which implies making a copy.

I agree that using temporary files is not an elegant option, but so
far I've found it to be surprisingly practical :-)

~Chris

J Luis

unread,
Dec 21, 2015, 9:40:19 AM12/21/15
to julia-geo
I confess that my dislike for the temporaries comes from a little paranoia about SSDs longevity. I have several colleagues who broke their SSDs and those are known to depend on the number of writings. Having a 570 MB tmp file generated to display a 30 MB file, scares me a bit.
Reply all
Reply to author
Forward
0 new messages