Fix return numbering after removing of noise

317 views
Skip to first unread message

Mikael Sejersen

unread,
Jul 31, 2018, 9:47:58 AM7/31/18
to LAStools - efficient tools for LiDAR processing
I had a LAS file with a lot of noise, that has been cleaned. 
In some parts of the file the noise was unfortunately located below terrain (due to MTA interpenetration), meaning that the noise was marked as last return. So when removing the noise the last returns was also removed.

I guess I am not the first with this problem, but when trying to search the internet I do not get anything. In theory, I guess it should be fairly simple. If the return counter indicates n returns, but there are only n-1 corresponding returns in total, The return counter should be deducted one. Any chance LAStools could be used for such stunt? 

A LAZ sample can be found here: http://gofile.me/3HabV/sGW4hxLkf


lasinfo report for Last_return_issue.laz
reporting all LAS header entries:
  file signature:             'LASF'
  file source ID:             0
  global_encoding:            1
  project ID GUID data 1-4:   00000000-0000-0000-0000-000000000000
  version major.minor:        1.2
  system identifier:          ''
  generating software:        'TerraScan'
  file creation day/year:     210/2018
  header size:                227
  offset to point data:       229
  number var. length records: 0
  point data format:          1
  point data record length:   28
  number of point records:    163290312
  number of points by return: 142134003 20652857 497184 6266 2
  scale factor x y z:         0.01 0.01 0.01
  offset x y z:               0 0 0
  min x y z:                  350321.55 6215830.03 2.86
  max x y z:                  376115.37 6224756.00 74.26
the header is followed by 2 user-defined bytes
LASzip compression (version 3.1r0 c2 50000): POINT10 2 GPSTIME11 2
reporting minimum and maximum for all LAS point record entries ...
  X            35032155   37611537
  Y           621583003  622475600
  Z                 286       7426
  intensity          53       5236
  return_number       1          5
  number_of_returns   1          5
  edge_of_flight_line 0          1
  scan_direction_flag 0          0
  classification      0          0
  scan_angle_rank   -41         45
  user_data           0          0
  point_source_ID     0          0
  gps_time 313116.035001 313714.404038
number of first returns:        142134003
number of intermediate returns: 2184710
number of last returns:         146835144
number of single returns:       127863545
overview over number of returns of given pulse: 127863545 32581305 2766823 78215 424 0 0
histogram of classification of points:
       163290312  never classified (0)
    

Martin Isenburg

unread,
Aug 2, 2018, 7:47:28 AM8/2/18
to LAStools - efficient command line tools for LIDAR processing
Hello,

here is an example workflow. In order to make the large file more tractable for our experiments we first cut out a small 5 second slice. The attached lasinfo report (*) gives us an idea which GPS times slices are there.

lasinfo -i Last_return_issue.laz ^
            -histo gps_time 5 ^
            -odix _into -otxt

We only keep the 5 second slice between GPS time 313565 to 313570

las2las -i Last_return_issue.laz ^
            -keep_gps_time 313565 313570 ^
            -odix _sub -olaz

:: We have a look at the return colored slice (see attached picture).

lasview -i Last_return_issue_sub.laz -color_by_return

:: Before running lasreturn we must bring the points into GPS order

lassort -i Last_return_issue_sub.laz ^
            -gps_time ^
            -odix _sorted -olaz

:: Now we can analyse the file for missing returns

lasreturn -i Last_return_issue_sub_sorted.laz ^
               -check_return_numbering
checked returns of 56744 multi and 1281323 single return pulses. took 1.015 secs
missing: 266490 duplicate: 0 too large: 0 zero: 0
missing
=======
214811 returns with n = 2 and r = 1 are missing
871 returns with n = 2 and r = 2 are missing
28143 returns with n = 3 and r = 1 are missing
62 returns with n = 3 and r = 2 are missing
21313 returns with n = 3 and r = 3 are missing
645 returns with n = 4 and r = 1 are missing
13 returns with n = 4 and r = 2 are missing
2 returns with n = 4 and r = 3 are missing
628 returns with n = 4 and r = 4 are missing
1 returns with n = 5 and r = 1 are missing
1 returns with n = 5 and r = 5 are missing

There are quite a number of returns missing. This is 5 seconds of a flightline so usually all returns should be present. However, we can determine that at least 266490 returns are missing. There might be more because when *all* returns of a LiDAR shot were removed then lasreturn has no means to notice that the entire shot is missing. It can only determine whether there are missing returns when we have at least one return from a multi-return shot.

The ones marked in red are missing first returns. The ones marked in green are missing intermediate returns.  The ones marked in blue are missing last returns. We can also use lasreturn to show us *where* the returns are missing by giving those that remain (i.e. the returns of incomplete return sets) a special classification.

:: Create a temporary file to show where missing returns are

lasreturn -i Last_return_issue_sub_sorted.laz ^
               -check_return_numbering ^
               -classify_as 8 ^
               -odix _marked -olaz

The attached image clearly shows the corridor where returns have been removed.

:: Modify the "number of returns" field of each point to store 
:: the highest "return number" for each shot in the file

lasreturn -i Last_return_issue_sub_sorted.laz ^
               -repair_number_of_returns ^
               -odix _fudged -olaz

The attached image shows the result. Note that this renumbering *only* rewrote the "number of returns" field. This does not do any renumbering for the case that we have missing first returns as the last image shows. But that I hesitate to implement. The removal of first returns should in general not be obscured as knowledge about where missing first returns are should be known and considered in a number of post processing steps. I usually advocate for leaving all returns in the file and to mark them as noise instead. Another run of lasreturn on the resulting file shows that in the "fudged file" only first returns (and a few intermediate returns) are noticed to be missing. 

lasreturn -i Last_return_issue_sub_sorted.laz ^
               -check_return_numbering
missing: 243675 duplicate: 0 too large: 0 zero: 0
missing
=======
236089 returns with n = 2 and r = 1 are missing
7492 returns with n = 3 and r = 1 are missing
70 returns with n = 3 and r = 2 are missing
19 returns with n = 4 and r = 1 are missing
5 returns with n = 4 and r = 2 are missing

To process the entire flightline we simply chop the entire file into 10, 20, or 50 second slices and repeat the above on each slice (on many cores). Does this kind of renumbering (aka setting the correct highest "number of returns" field based on the returns actually in the file) do what you want to achieve? 

Regards.

Martin

Last_return_issue_into.txt
lasreturn_fudge_01_before.jpg
lasreturn_fudge_02_in_red_shots_with_missing_returns.jpg
lasreturn_fudge_03_after.jpg
lasreturn_fudge_04_after_first_returns_only.jpg

Mikael Sejersen

unread,
Aug 3, 2018, 5:39:59 AM8/3/18
to last...@googlegroups.com

Hi Martin,

 

Thank you very much, I look forward to looking deeper into this. Your very well formulated and detailed decryption makes sense. I agree with you about the principal of leaving all points in the las files.

Is there an upper limit for the “slice” size? Is the slicing only to speed up processing, or does LAStool have a maximum file size it can work with? Problem as I see it is that there is a risk of losing points or get wrong numbering in the cuts. Especially if the points in the file are not sorted by GPS time.    

 

Once again sorry for misunderstanding the concept of LAStools.    

Best regards,
Mikael

Martin Isenburg

unread,
Aug 5, 2018, 3:34:56 AM8/5/18
to LAStools - efficient command line tools for LIDAR processing
Hello,

the time slice was chosen to be just 5 seconds to have a small example that is easily visualized. Since 5 seconds was just above 1 million points we can easily go to 50 seconds for an efficient multi-core pipeline. We would then use lassplit to generate a folder of 50 second slices that are then processed in parallel and merged back together at the end. Here an example for one flight line:

:: turn the flightline into slices worth 50 seconds of data

mkdir slices
lassplit -i flightline0001.laz ^
            -by_gps_time_interval 50 ^
            -odir slices -olaz

:: before running lasreturn we must bring the points of each slice
:: into GPS order. we run this in parallel on 4 cores.

mkdir slices_sorted
lassort -i slices\*.laz ^
            -gpstime ^
            -odir slices_sorted -olaz ^
            -cores 4

:: modify the "number of returns" field of each point to store 
:: the highest "return number" for each shot in the file. we run
:: this in parallel on 4 cores.

mkdir  slices_fudged
lasreturn -i slices_sorted\*.laz ^
               -repair_number_of_returns ^
               -odir slices_fudged -olaz ^
               -cores 4

:: merge the result back into a flightline

lasmerge -i slices_fudged\*.laz ^
               -o flightline0001_fudged.laz 

To turn this into a batch script that can be run on each flightline have a look at similar "huge file" scripts in the example_batch_script folder of the LAStools distribution and modify one of those accordingly.

There is no risk of loosing any points as each return should appear in exactly one slice together with all the returns from its laser shot.

Regards,

Martin

Mikael Sejersen

unread,
Aug 22, 2018, 9:56:22 AM8/22/18
to last...@googlegroups.com

Hi Martin,

 

In your answers below your give two examples. The first on 2. august 2018, where you detailed go thru the procedure with a 5 sec. slice of the file Last_return_issue.las. That part I can reproduce.

In you second example from 5. august 2018 you describe how to execute the procedure on the entire file. That example I cannot reproduce.

 

In the second last step where the “lasreturn” command is used, the output files are empty. The files are in the subdirectory “slices_fudged” but they are all of the length of 0 kb.

There are no error message in the command prompt window indicating something should be wrong.  

 

The command outputting the empty files looks like this in your example:

 

lasreturn -i slices_sorted\*.laz ^

               -repair_number_of_returns ^

               -odir slices_fudged -olaz ^

               -cores 4

 

I have tried to slice the file in 5 sec. instead of the suggested 50 sec. to get closer to the first example, but that does not change anything.

As far as I can see the difference between the first and second example is the procedure used to slice the file.

 

The first example uses “las2las -keep_gps_time” where as the second example uses “lassplit -by_gps_time_interval 50”. Could that be an issue?

 

 

Best regards
Mikael    

 

 

Fra: last...@googlegroups.com [mailto:last...@googlegroups.com] På vegne af Martin Isenburg
Sendt: 5. august 2018 09:34
Til: LAStools - efficient command line tools for LIDAR processing
Emne: Re: [LAStools] Fix return numbering after removing of noise

 

Hello,

 

the time slice was chosen to be just 5 seconds to have a small example that is easily visualized. Since 5 seconds was just above 1 million points we can easily go to 50 seconds for an efficient multi-core pipeline. We would then use lassplit to generate a folder of 50 second slices that are then processed in parallel and merged back together at the end. Here an example for one flight line:

Mikael Sejersen

unread,
Aug 22, 2018, 9:56:37 AM8/22/18
to last...@googlegroups.com

Hi Martin,

 

After looking at the two different output files form “las2las -keep_gps_time” and “lassplit -by_gps_time_interval 50” using las2txt tool, it looks as the gps time in output from “lassplit -by_gps_time_interval 50” is missing. In the text file all gps time stamps are 0.0000. I guess that could course the problem?

 

Best regards
Mikael    

 

Martin Isenburg

unread,
Aug 22, 2018, 10:21:05 AM8/22/18
to LAStools - efficient command line tools for LIDAR processing
Hello Mikael,

indeed. You've hit the "license restriction" of the closed source part of LAStools. When your inputs are bigger than a certain number of points (for lassplit 500,000,000 points) then lassplit will set the GPS time stamps to zero.


It'll inform you about that in the black control console. Hence for unlicensed users I suggest to stick to las2las and make sure the time slices are short enough for the resulting point counts per slice to be below 5,000,000 to allow for lasreturn to work without restrictions as well.


http://lastools.org/download/lasreturn_README.txt

Regards.

Martin

--
Reply all
Reply to author
Forward
0 new messages