Hi Andrew,
Thanks for the massive file. We have now checked it.
Your input file has 824 million points, so you were absolutely right to tile first.
The header shows some more odd things:
point count: 824,375,291
scale x y z: 0.001 0.001 0.001
offset x y z: 333,000 4,317,000 -7,000
min x y z: 330,447.519 4,310,316.287 -27,748.748
max x y z: 337,426.195 4,319,794.896 128.068
range x y z: 6,978.676 9,478.609 27,876.816
Assuming you are measuring on our wonderful planet Earth, a height range of 27,876 meters is not very likely.
Also, a resolution of 1 mm is probably much too high for most datasets, but we keep it for now.
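The range values in the header are simply max minus min per axis, so the Z problem can be checked with a few lines. This is only a sketch in Python: the numbers are copied from the header above, and the 9000 m plausibility limit is an assumption chosen for illustration, not a LAStools setting.

```python
# Header values copied from the lasinfo output above.
header_min = (330447.519, 4310316.287, -27748.748)
header_max = (337426.195, 4319794.896, 128.068)

# Assumption for illustration only: no terrestrial survey spans
# more than about 9000 m of relief.
MAX_PLAUSIBLE_Z_RANGE_M = 9000.0


def axis_ranges(lo, hi):
    """Return max - min per axis, rounded to the 1 mm header resolution."""
    return tuple(round(h - l, 3) for l, h in zip(lo, hi))


ranges = axis_ranges(header_min, header_max)
z_range_is_plausible = ranges[2] <= MAX_PLAUSIBLE_Z_RANGE_M
print("range x y z:", ranges)
print("z range plausible:", z_range_is_plausible)
```

The computed ranges agree with the "range x y z" line of the header, and the Z range fails the plausibility check.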
We run
lastile64 -i GuilfordWoods25.las -buffer 10 -odir tiles -olaz -tile_size 150
"-reversible" does not make sense if we just want to merge afterwards, so we skip it.
A buffer size of 30 is unnecessarily large. About 5-10% of the tile size should be fine.
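To make the buffer rule of thumb concrete, here is a small Python sketch. It assumes the percentage is measured against the tile edge length; the function names are made up for this example.

```python
def buffer_percent(buffer_m, tile_size_m):
    """Buffer width as a percentage of the tile edge length."""
    return 100.0 * buffer_m / tile_size_m


def buffered_extent(tile_size_m, buffer_m):
    """Edge length of a tile including the buffer on both sides."""
    return tile_size_m + 2 * buffer_m


for buffer_m in (10, 30):
    pct = buffer_percent(buffer_m, 150)
    extent = buffered_extent(150, buffer_m)
    print(f"buffer {buffer_m} m: {pct:.1f}% of tile size, buffered tile {extent} m wide")
```

With a tile size of 150, a buffer of 10 is about 6.7% and a buffer of 30 is 20%.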
Having a look at a single tile we can confirm that we have strange input data:
330750_4316400.laz
The height range in this tile is from -380 meters up to 74 meters.
As a first try we simply remove the implausible heights:
las2las64 -i tiles\*.laz -odir fixed -olaz -keep_z -1 35 -cores 15
And check one result file:
lasinfo64 -i fixed\330750_4316400.laz
LAStools lasinfo (by in...@rapidlasso.de) version 250426
reading 'fixed\330750_4316400.laz' with 27378359 points
lasinfo (250426) report for 'fixed\330750_4316400.laz'
reporting all LAS header entries:
file signature: 'LASF'
file source ID: 0
global_encoding: 0
project ID GUID data 1-4: 00000000-0000-0000-0000-000000000000
version major.minor: 1.2
system identifier: 'LAStools (c) by rapidlasso GmbH'
generating software: 'las2las64 (version 250426)'
file creation day/year: 117/2025
header size: 227
offset to point data: 1082
number var. length records: 3
point data format: 3
point data record length: 34
number of point records: 27378359
number of points by return: 0 0 0 0 0
scale factor x y z: 0.001 0.001 0.001
offset x y z: 333000 4317000 -7000
min x y z: 330740.000 4316390.000 -0.895
max x y z: 330909.999 4316559.999 34.999
variable length header record 1 of 3:
reserved 43707
user ID 'LASF_Projection'
record ID 34735
length after header 64
description 'GeoTIFF GeoKeyDirectoryTag'
GeoKeyDirectoryTag version 1.1.0 number of keys 7
key 1024 tiff_tag_location 0 count 1 value_offset 1 - GTModelTypeGeoKey: ModelTypeProjected
key 1025 tiff_tag_location 0 count 1 value_offset 1 - GTRasterTypeGeoKey: RasterPixelIsArea
key 1026 tiff_tag_location 34737 count 22 value_offset 0 - GTCitationGeoKey: WGS 84 / UTM zone 18N
key 2049 tiff_tag_location 34737 count 7 value_offset 22 - GeogCitationGeoKey: WGS 84
key 2054 tiff_tag_location 0 count 1 value_offset 9102 - GeogAngularUnitsGeoKey: Angular_Degree
key 3072 tiff_tag_location 0 count 1 value_offset 32618 - ProjectedCSTypeGeoKey: WGS 84 / UTM 18N
key 3076 tiff_tag_location 0 count 1 value_offset 9001 - ProjLinearUnitsGeoKey: Linear_Meter
variable length header record 2 of 3:
reserved 43707
user ID 'LASF_Projection'
record ID 34737
length after header 30
description 'GeoTIFF GeoAsciiParamsTag'
GeoAsciiParamsTag (number of characters 30)
WGS 84 / UTM zone 18N|WGS 84|
variable length header record 3 of 3:
reserved 43707
user ID 'liblas'
record ID 2112
length after header 599
description 'OGR variant of OpenGIS WKT SRS'
LASzip compression (version 3.4r4 c2 50000): POINT10 2 GPSTIME11 2 RGB12 2
LAStiling (idx 2247, lvl 6, sub 0, bbox 329100 4310250 338700 4319850, buffer) (size 150 x 150, buffer 10)
reporting minimum and maximum for all LAS point record entries ...
X -2260000 -2090001
Y -610000 -440001
Z 6999105 7034999
intensity 0 0
return_number 0 0
number_of_returns 0 0
edge_of_flight_line 0 0
scan_direction_flag 0 0
classification 0 0
scan_angle_rank 0 0
user_data 0 0
point_source_ID 0 0
gps_time 0.000000 0.000000
Color R 4864 65280
G 4864 65280
B 4864 65280
number of first returns: 27378359
number of intermediate returns: 0
number of last returns: 27378359
number of single returns: 27378359
WARNING: there are 27378359 points with return number 0
WARNING: there are 27378359 points with a number of returns of given pulse of 0
histogram of classification of points:
27378359 never classified (0)
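The raw X, Y, Z integers in the report relate to the header min/max through the LAS rule coordinate = raw * scale + offset. A minimal Python sketch with the values copied from the report above:

```python
# Values copied from the lasinfo report above.
scale = (0.001, 0.001, 0.001)
offset = (333000.0, 4317000.0, -7000.0)
raw_min = (-2260000, -610000, 6999105)
raw_max = (-2090001, -440001, 7034999)


def to_coordinates(raw, scale, offset):
    """Apply coordinate = raw * scale + offset per axis, rounded to 1 mm."""
    return tuple(round(r * s + o, 3) for r, s, o in zip(raw, scale, offset))


coord_min = to_coordinates(raw_min, scale, offset)
coord_max = to_coordinates(raw_max, scale, offset)
print("min x y z:", coord_min)
print("max x y z:", coord_max)
```

The results reproduce the "min x y z" and "max x y z" lines of the header, and the X and Y extents of 170 m correspond to the 150 m tile plus the 10 m buffer on each side.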
We cannot detect duplicates by point_source_ID or gps_time because those fields are empty, but the visualization shows several layers, which may result from merging more or less identical datasets.
So we use lasduplicate64 to try to reduce the point count and speed up further processing.
lasduplicate64 -i fixed\330750_4316400.laz -o tmp1.laz
95263 dups (<1%)
Checking for dups by distance:
lasduplicate64 -i fixed\330750_4316400.laz -o tmp2.laz -nearby 0.005 -lowest
number of xy-duplicates 37681089
found 614654 duplicates in 'fixed\330750_4316400.laz'. took 212.782 sec.
So the duplicate check is OK: there are not as many duplicates as we expected from the visual check.
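For reference, the share of duplicates in this tile can be worked out from the counts reported above. A short Python sketch (the counts are from the output; the variable names are only for this example):

```python
# Counts copied from the lasinfo and lasduplicate64 output above.
points_in_tile = 27378359
exact_duplicates = 95263      # default run (identical coordinates)
nearby_duplicates = 614654    # run with -nearby 0.005 -lowest


def percent(part, total):
    """Share of part in total, in percent."""
    return 100.0 * part / total


exact_pct = percent(exact_duplicates, points_in_tile)
nearby_pct = percent(nearby_duplicates, points_in_tile)
print(f"exact duplicates:  {exact_pct:.2f}%")
print(f"nearby duplicates: {nearby_pct:.2f}%")
```

Both shares stay in the low single digits, so removing duplicates alone will not shrink the tile much.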
Next we run the same ground classification you did.
"-all_returns" is not required because the file contains no return information:
lasground_new -i fixed\*.laz -spike_down 0.25 -compute_height -replace_z -odir ground -olaz -cores 15
The result is mediocre: not many ground points are detected.
The log says:
horizontal units are meter and vertical units are meter. nature mode.
reading 5445 points. step is 25 m, sub is 5, bulge is 2 m, spike is 1+0.25 m, and offset is 0.05 m ...
We can try a smaller step size:
lasground_new64 -i fixed\330750_4316400.laz -archeology -extra_coarse -o tmp4.laz
reading 27378359 points. step is 1 m, sub is 3, bulge is 1 m, spike is 0.35+0.35 m, and offset is 0.02 m ...
and get many more ground points, but also points on roofs, because the roofs are longer than our 1 m step.
It is worth having a closer look at your data.
We define a cut along the road and limit the visualization to it.
The road is not flat at all. To verify this further we cut out a piece of the road area where we expect the road to be reasonably flat.
las2las64 -i fixed\330750_4316400.laz -keep_xy 330853 4316543 330856 4316546 -o tmp5.laz
The detail view of your road surface plot shows points with a Z range of more than 5 meters.
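The same flatness test can be sketched in Python for points that are already loaded as (x, y, z) tuples. This is only an illustration: the helper name is hypothetical, the sample points are synthetic and not taken from your file, and treating the box as half-open is an assumption rather than a statement about how -keep_xy handles its upper bounds.

```python
def z_spread_in_box(points, min_x, min_y, max_x, max_y):
    """Return max(z) - min(z) of the points inside the XY box, or None if empty.

    Assumption: the box is half-open, min <= value < max.
    """
    zs = [z for x, y, z in points
          if min_x <= x < max_x and min_y <= y < max_y]
    if not zs:
        return None
    return max(zs) - min(zs)


# Synthetic sample points (x, y, z), made up for this example.
sample_points = [
    (330853.5, 4316543.5, 10.0),
    (330854.0, 4316544.0, 12.5),
    (330855.9, 4316545.9, 15.5),
    (330900.0, 4316500.0, 30.0),  # outside the box, ignored
]

spread = z_spread_in_box(sample_points, 330853, 4316543, 330856, 4316546)
print("Z spread in box:", spread)
```

On a 3 m by 3 m patch of a reasonably flat road, one would expect this spread to be a small fraction of a meter.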

So we can conclude:
The data are crap.
Maybe you can use lasthin64 and lasnoise64 to remove many of the points, but of course precision will then decrease.
The best option would be to get a better data source.
Cheers,
Jochen @rapidlasso