combining point cloud subtraction with lasduplicate huge files script


Petra Steffen

Sep 1, 2016, 9:35:54 AM
to LAStools - efficient tools for LiDAR processing
Hello Community,

I'm trying to combine

a) Point cloud subtraction --> https://groups.google.com/d/msg/lastools/03HSFPTEhXg/-BO_bgG473YJ

with

b) huge las file delete duplicates --> https://groups.google.com/d/msg/lastools/vIG8V0dDso8/96MXuf8jXM8J


My problem is that I received about 25 km² of LiDAR data which is split into "first pulse", "last pulse ground", and "last pulse not ground" files.
On the one hand this is nice, as I don't need to do the classification myself and I can easily compute a DTM. But in order to compute a spike-free DSM and to classify vegetation and buildings, I want to combine the files back into one file (or many tiles) containing all types of points.

The problem I see is that some points are saved twice: for example, when there was only one return, that point appears in the first pulse file as well as in the last pulse file (there are other cases as well).
When I combine the files, such a point is duplicated. So I want to remove the extra point, but I want to be specific about which file I remove it from.
In principle I can do this by processing the files as described in "point cloud subtraction".
Unfortunately I have too many points (more than 160 million), so whenever I run lasduplicate with -unique_xyz I get "abnormal program termination".
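
For reference, the call that fails looks roughly like this (the file names are just placeholders for my three deliveries; the input order is meant to decide which copy of a duplicate survives):

lasduplicate -i first_pulse.laz last_pulse_ground.laz last_pulse_not_ground.laz -merged -unique_xyz -o combined.laz -v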

So I browsed around and found the script "huge las file delete duplicates". It is very helpful, but I have one important question:

When using lasduplicate, only the first appearance of a point survives. So, following "point cloud subtraction", I can decide by the order of the inputs from which file/classification (first pulse - 1, last pulse ground - 2, last pulse not ground - 3) the duplicate points are removed.
When I follow the workflow from "huge las file delete duplicates", I first merge all points, then tile them, and only then remove the duplicate points (roughly as sketched below). But once I feed the merged file to lasduplicate, I can no longer determine by the order of input from which classification (source file) the duplicates are removed. Am I right with this assumption?
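
The merge/tile/deduplicate steps I am following look roughly like this (tile size, buffer, and directory names are just examples I made up):

lasmerge -i first_pulse.laz last_pulse_ground.laz last_pulse_not_ground.laz -o merged.laz
lastile -i merged.laz -tile_size 1000 -buffer 25 -odir tiles -olaz
lasduplicate -i tiles\*.laz -unique_xyz -odir tiles_no_duplicates -olaz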

Questions:

Is there a way to sort the data by classification?

Or do I then have to pay special attention to the order in which I give the inputs to lasmerge?

How do I know by what criterion my data is sorted anyway?

Or is there maybe another way to put the files back together properly and keep the classification from the data I received?

Or am I worrying needlessly about these duplicate points? (I think they will have some impact on the computation of the DTM and DSM, and on the classification and the normalized DSM.)


Do you have any advice or an answer for any of my questions?

I appreciate it.

Petra

P.S.: I'm a student and have been working with LAStools and LiDAR data for about two weeks now... not a professional yet. Maybe you can take that into account in your answer...   :)




Martin Isenburg

Sep 6, 2016, 9:30:45 AM
to LAStools - efficient command line tools for LIDAR processing
Hi Petra,

interesting question about how to eliminate the "right" points with lasduplicate:


I am trying to re-create this situation using fusa.laz.

(1) first pulse (I prefer the term 'echo' or 'return', as 'pulse' usually refers to the short burst of laser light that is emitted by the LiDAR system)

las2las -i ..\data\fusa.laz -keep_first -o fusa_first.laz

(2) last return ground points

las2las -i ..\data\fusa.laz -keep_last -keep_class 2 -o fusa_last_ground.laz

(3) last return non-ground points

las2las -i ..\data\fusa.laz -keep_last -drop_class 2 -o fusa_last_non_ground.laz

Now if your data is correct then there should not be any duplicates when you combine the point clouds in files (2) and (3). For me this holds true. Maybe check this on a few tiles:

lasduplicate -i fusa_last_ground.laz fusa_last_non_ground.laz -merged -unique_xyz -nil -v
reading 263370 points of type 1 from 'merged' and writing to '(null)'.
found 0 duplicate points in 'merged'. took 0.839 sec.

Hence it is safe to merge them ahead of time if you want to develop the workflow step by step:

lasmerge -i  fusa_last_ground.laz fusa_last_non_ground.laz -o fusa_last.laz
or
lasmerge -i fusa_last_non_ground.laz  fusa_last_ground.laz -o fusa_last.laz

Now it is as you say: we keep the first of the points that arrives when '-unique_xyz' is chosen.

lasduplicate -i fusa_last.laz fusa_first.laz -merged -unique_xyz -o fusa_recon.laz -v

reading 526783 points of type 1 from 'merged' and writing to 'fusa_recon.laz'.
found 249493 duplicate points in 'merged'. took 2.351 sec.

Furthermore, we have added a special option called '-single_returns' that allows you to reconstruct the correct return numbering (note that in this mode only an XY uniqueness check is done), as you can see in the README.

[..]
  The special option '-single_returns' was added particularly to
  reconstruct the single versus multiple return information for
  the (unfortunate) case that the LiDAR points were delivered in
  two separate files with some points appearing in both. These
  LiDAR points may be split into all first return and all last
  returns or into all first returns and all ground returns. See
  the example below how to deal with this case correctly.
[..]
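
For your three files a call along these lines could be used (just a sketch based on the fusa examples above, not the README example verbatim):

lasduplicate -i fusa_last.laz fusa_first.laz -merged -single_returns -o fusa_recon.laz -v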

However, the pit-free and spike-free algorithms will automatically delete all duplicate points (albeit a bit more slowly). Another way of prepping your data set for more efficient pit-free and spike-free operation is to thin the data to a reasonable amount. For example, if you are planning to create a 50 cm spike-free DSM / pit-free CHM raster, you could first keep only the highest point that falls into each 25 cm by 25 cm cell. This also deletes all XY duplicates, and it won't matter whether a point is ground or non-ground, because we want to construct the highest CHM / DSM no matter what classification the point has:

lasthin -i fusa_last.laz fusa_first.laz -highest -step 0.25 -o temp.laz
las2dem -i temp.laz -step 0.5 -spike_free 1.5 -o pit_free_dsm.bil




Regards,

Martin @rapidlasso

Petra Steffen

Sep 26, 2016, 10:40:49 AM
to LAStools - efficient tools for LiDAR processing
Hi Martin,

I tried everything you proposed and the result is very satisfying. Thank you!
I have another two questions:

I had never used lasthin before because I thought it is best to have as many points as possible. Yet the DSM was much better with lasthin (as you proposed) than without. So I will use a similar procedure for the nDSM calculation (roughly as sketched below).
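
What I have in mind for the nDSM, assuming the ground classification survives the merge (the step sizes are the ones from your example, the directory names follow my sketch above):

lasheight -i tiles_no_duplicates\*.laz -replace_z -odir tiles_normalized -olaz
lasthin -i tiles_normalized\*.laz -merged -highest -step 0.25 -o temp_norm.laz
las2dem -i temp_norm.laz -step 0.5 -spike_free 1.5 -o ndsm.bil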

Q1: Is it advisable to use lasthin for a DTM as well? Or does it NOT improve the output, as the terrain is usually much less spiky than, for example, a tree canopy?
Q2: I want to do some spatial queries in ArcGIS/QGIS afterwards, like calculating the mean (urban) canopy height of different areas. You used .bil as the output format in the example you gave. Is this a recommendation? Do you know which output format is best suited for this task?

Thank you,
I appreciate it
Petra