Don't make USGS spec require storing OGC WKT as EVLRs


Martin Isenburg

Feb 1, 2017, 4:34:10 AM
to LAStools - efficient command line tools for LIDAR processing, The LAS room - a friendly place to discuss specifications of the LAS format
Hello,

from several users I have heard that the new USGS specification is going to require that the OGC WKT describing the CRS be stored exclusively in the EVLR section at the very end of a LAS/LAZ file. I am not sure this really is true, but I have already seen several examples of it. All of them had been prepared by the LP360 software, as in the lasinfo output example included below.

I had argued in the past that this is a bad idea for multiple reasons:


Given the choice, you should add the CRS to the beginning of a file using a VLR, because you get punished with these disadvantages whenever you use an EVLR:

(1) extra disk seek to beginning of file needed at end of writing files
(2) extra disk seek to end of file needed to read meta information
(3) stream-processing or piping of LAS/LAZ file not possible
(4) ftp-truncated files are guaranteed to lose their CRS information
(5) <something bad that i have not thought of yet>

reporting all LAS header entries:
  file signature:             'LASF'
  file source ID:             0
  global_encoding:            17
  project ID GUID data 1-4:   00000000-0000-0000-0000-000000000000
  version major.minor:        1.4
  system identifier:          'OTHER'
  generating software:        'LP360 from QCoherent Software  '
  file creation day/year:     315/2016
  header size:                375
  offset to point data:       512
  number var. length records: 0
  point data format:          7
  point data record length:   36
  number of point records:    0
  number of points by return: 0 0 0 0 0
  scale factor x y z:         0.001 0.001 0.001
  offset x y z:               0 5000000 0
  min x y z:                  470250.160 5331846.440 -0.039
  max x y z:                  470346.860 5332117.060 4.680
  start of waveform data packet record: 0
  start of first extended variable length record: 5408
  number of extended_variable length records: 1
  extended number of point records: 136
  extended number of points by return: 111 21 4 0 0 0 0 0 0 0 0 0 0 0 0
extended variable length header record 1 of 1:
  reserved             0
  user ID              'LASF_Projection'
  record ID            2112
  length after header  947
  description          'OGC Well Known Text'
    OGC COORDINATE SYSTEM WKT:
    COMPD_CS["NAD83(2011) / UTM zone 10N / NAVD88 height - Geoid12A (metres)",PROJCS["NAD83(2011) / UTM zone 10N",GEOGCS["NAD83(2011)",DATUM["NAD83_National_Spatial_Reference_System_2011",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","1116"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","6318"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-123],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","6339"]],VERT_CS["NAVD88 height - Geoid12A (metres)",VERT_DATUM["North American Vertical Datum 1988",2005,AUTHORITY["EPSG","5103"]],UNIT["metre",1.0,AUTHORITY["EPSG","9001"]],AXIS["Gravity-related height",UP],AUTHORITY["EPSG","5703"]]]
the header is followed by 137 user-defined bytes
reporting minimum and maximum for all LAS point record entries ...
  X           470250160  470346860
  Y           331846440  332117060
  Z                 -39       4680
  intensity         737      40911
  return_number       1          3
  number_of_returns   1          3
  edge_of_flight_line 0          1
  scan_direction_flag 0          1
  classification      1         18
  scan_angle_rank     9         18
  user_data           1          2
  point_source_ID    10         20
  gps_time 143320531.425046 143321224.005746
  Color R 2313 41377
        G 2313 36494
        B 2313 33924
  extended_return_number          1      3
  extended_number_of_returns      1      3
  extended_classification         1     18
  extended_scan_angle          1433   3035
  extended_scanner_channel        1      2
WARNING: there is coordinate resolution fluff (x10) in XY
number of first returns:        111
number of intermediate returns: 4
number of last returns:         111
number of single returns:       90
overview over extended number of returns of given pulse: 90 34 12 0 0 0 0 0 0 0 0 0 0 0 0
histogram of classification of points:
              98  unclassified (1)
               7  water (9)
              31  Reserved for ASPRS Definition (18)
 +-> flagged as withheld:  14
 +-> flagged as extended overlap: 51
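
The numbers in this report are enough to see where the EVLR sits and why a truncated transfer loses it. What follows is a minimal Python sketch, not part of the original post. It uses only values printed above plus the fixed header sizes from the LAS 1.4 specification (60 bytes for an EVLR header, 54 for a VLR header). The function names and the 5000-byte truncation are made up for illustration.

```python
# Values copied from the lasinfo report above.
HEADER_SIZE = 375
OFFSET_TO_POINT_DATA = 512
POINT_RECORD_LENGTH = 36
NUMBER_OF_POINTS = 136
WKT_PAYLOAD_LENGTH = 947       # "length after header"

# Fixed by the LAS 1.4 specification.
EVLR_HEADER_SIZE = 60
VLR_HEADER_SIZE = 54

def start_of_first_evlr(offset_to_points, n_points, record_length):
    """In uncompressed LAS the EVLR block begins right after the last point record."""
    return offset_to_points + n_points * record_length

def crs_survives_truncation(bytes_received, crs_start, crs_size):
    """The CRS record is readable only if the file was not cut before its last byte."""
    return bytes_received >= crs_start + crs_size

evlr_start = start_of_first_evlr(OFFSET_TO_POINT_DATA, NUMBER_OF_POINTS, POINT_RECORD_LENGTH)
evlr_size = EVLR_HEADER_SIZE + WKT_PAYLOAD_LENGTH
file_size = evlr_start + evlr_size   # assumes nothing follows the single EVLR

print(evlr_start)   # 5408, matching "start of first extended variable length record"
print(file_size)    # 6415

# Hypothetical transfer cut off after 5000 bytes.
print(crs_survives_truncation(5000, evlr_start, evlr_size))                          # False
# The same WKT stored as a VLR directly after the header would still be intact.
print(crs_survives_truncation(5000, HEADER_SIZE, VLR_HEADER_SIZE + WKT_PAYLOAD_LENGTH))  # True
```
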
---------- Forwarded message ----------
From: Martin Isenburg <martin....@gmail.com>
Date: Mon, Oct 24, 2016 at 6:35 AM
Subject: Don't use EVLRs unless you really really have to
To: LAStools - efficient command line tools for LIDAR processing <last...@googlegroups.com>, The LAS room - a friendly place to discuss specifications of the LAS format <las...@googlegroups.com>


Hello,

the new LAS 1.4 specification introduces a new concept: EVLRs. They are very useful. They are the only way to *add* VLRs with huge payloads (more than 65535 bytes). And they are also great for *modifying* an existing LAS/LAZ file on disk in place without a full re-write, for example, when all one wants to do is add CRS information.
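
The in-place modification can be sketched in a few lines of Python. This is an illustration only, not how LAStools or any other tool implements it. The byte offsets and the 60-byte EVLR header layout follow the LAS 1.4 specification; `append_evlr`, the fake file, and the placeholder payload are assumptions made up for this example, and the sketch assumes any existing EVLRs already sit at the very end of the file.

```python
import io
import struct

# Byte layout from the LAS 1.4 specification.
HEADER_SIZE = 375                    # public header block
EVLR_FIELDS_OFFSET = 235             # u64 start of first EVLR, then u32 number of EVLRs
EVLR_HEADER = struct.Struct("<H16sHQ32s")   # reserved, user ID, record ID, length, description

def append_evlr(f, user_id, record_id, payload, description=b""):
    """Append one EVLR to a seekable LAS 1.4 file and patch the header in place.

    Returns the byte offset at which the new EVLR starts.
    """
    f.seek(EVLR_FIELDS_OFFSET)
    first_evlr, n_evlrs = struct.unpack("<QI", f.read(12))
    start = f.seek(0, io.SEEK_END)               # jump to the end of the file
    f.write(EVLR_HEADER.pack(0, user_id, record_id, len(payload), description))
    f.write(payload)
    if n_evlrs == 0:
        first_evlr = start
    f.seek(EVLR_FIELDS_OFFSET)                   # jump back to patch the header
    f.write(struct.pack("<QI", first_evlr, n_evlrs + 1))
    return start

# Fake file for illustration only: signature, zeroed header, ten zeroed 36-byte points.
fake_las = io.BytesIO(b"LASF" + bytes(HEADER_SIZE - 4) + bytes(10 * 36))
placeholder_wkt = b"PLACEHOLDER, NOT REAL WKT\x00"
start = append_evlr(fake_las, b"LASF_Projection", 2112, placeholder_wkt, b"OGC Well Known Text")

patched = struct.unpack_from("<QI", fake_las.getvalue(), EVLR_FIELDS_OFFSET)
print(start, patched)   # 735 (735, 1)
```

The point records are never touched, which is the advantage. The two `seek` calls are the price, and they only work on a real file, not on a pipe.
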

However, I would like to make the case against using them where avoidable, and for restricting their use to the two scenarios outlined above. If at all possible use VLRs and "promote" all small EVLRs to VLRs whenever rewriting a file (assuming this would drop their number to zero). Please don't make it mandatory to place the tiny CRS records into an EVLR when they may just as well (and better) live in a VLR. Is it true that there are some specifications (USGS?) that ask folks to do so ...?

Why do I say this? I was one of those who suggested those EVLRs in the first place. Why don't I like them anymore? Well, I only like them in the two scenarios outlined above where they *add* value to the LAS / LAZ format. In general they complicate things. Especially for piping.

The concept of EVLRs is inherently incompatible with "piped processing" because the offset to the start of the EVLR block needs to be written into the header at the very beginning of the file. This is only possible if I know exactly how many points I will write and if each point record occupies the same number of bytes. Hence any point source that cannot pre-determine the number of written points, any filter operation, or LAZ compression cannot operate in a piped mode if an EVLR block is present because the "start_of_first_extended_variable_length_record" field cannot be written correctly.
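
A rough Python sketch of the two halves of this problem, added for illustration and not taken from LAStools: a filter does not know how many points it will emit until its input ends, and a pipe cannot be rewound to patch the header afterwards. Both function names and the three sample points are hypothetical.

```python
import io
import os

def can_patch_header_later(stream):
    """A writer can fix header fields after the last point only on a seekable stream."""
    return stream.seekable()

def keep_first_returns(points):
    """Stand-in filter: the number of surviving points is unknown until the input ends."""
    for return_number, xyz in points:
        if return_number == 1:
            yield return_number, xyz

# Hypothetical sample points as (return_number, (x, y, z)).
sample = [(1, (0.0, 0.0, 1.0)), (2, (0.0, 0.0, 0.5)), (1, (1.0, 0.0, 2.0))]
kept = sum(1 for _ in keep_first_returns(sample))
print(kept)   # 2, known only after the whole input was consumed

# A regular (here: in-memory) file can be rewound to patch the header.
memory_ok = can_patch_header_later(io.BytesIO())
print(memory_ok)   # True

# A pipe, which is what stdout-to-stdin chaining uses, cannot.
read_fd, write_fd = os.pipe()
with os.fdopen(read_fd, "rb") as reader, os.fdopen(write_fd, "wb") as writer:
    pipe_ok = can_patch_header_later(writer)
print(pipe_ok)     # False
```
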

Here are some examples of simple piping (with LAStools) that will not be possible when EVLRs are present:

====== piped lasinfo report ======

las2las -i input14withEVLR.las ^
            -keep_first ^
            -stdout |
lasinfo -stdin -o input14withEVLR_filtered.txt

=======  piped DTM creation =======

lasground -i input14withEVLR.las ^
                 -city -extra_fine  ^
                 -stdout |
las2dem -stdin ^
               -keep_class 2 ^
               -step 0.5 -use_tile_bb ^
               -o input14withEVLR_dtm.bil

====   piped pit-free CHM creation ====

lasground -i input14withEVLR.las ^
                 -wilderness  ^
                 -replace_z  ^
                 -stdout |
las2dem -stdin -pipeon ^
               -step 0.5 -kill 1.5 ^
               -o chm_000.bil |
las2dem -stdin -pipeon ^
               -drop_z_below 2.0 ^
               -step 0.5 -kill 1.5 ^
               -o chm_020.bil |
las2dem -stdin -pipeon ^
               -drop_z_below 5.0 ^
               -step 0.5 -kill 1.5 ^
               -o chm_050.bil |
las2dem -stdin -pipeon ^
               -drop_z_below 10.0 ^
               -step 0.5 -kill 1.5 ^
               -o chm_100.bil |
las2dem -stdin -pipeon ^
               -drop_z_below 15.0 ^
               -step 0.5 -kill 1.5 ^
               -o chm_150.bil |
las2dem -stdin -pipeon ^
               -drop_z_below 20.0 ^
               -step 0.5 -kill 1.5 ^
               -o chm_200.bil 

Your thoughts?

Martin @rapidlasso

Evon Silvia

Feb 1, 2017, 7:46:20 PM
to last...@googlegroups.com, The LAS room - a friendly place to discuss specifications of the LAS format
I agree that having CRS info in a VLR instead of an EVLR is certainly preferable, but I'm unaware of an official preference or requirement from USGS. My understanding of the USGS CRS requirement is that they allow it to be appended as an EVLR. I've never heard of embedding it as an EVLR being a requirement in any of our conversations with USGS.

Appending as an EVLR is allowed because it has the distinct advantage of alleviating the need to rewrite the entire file just to add CRS information while preparing for delivery to USGS, resulting in faster deliveries. I see this as an example of USGS being flexible on a non-issue since USGS likely rewrites the LAS files prior to distribution anyway, at which point they certainly have the option to "promote" all small (<65536 bytes) EVLRs to VLRs. I believe that LAStools will do this promotion automatically, as does my software, and I'm sure others out there do so, too.
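
To make the "promotion" concrete, here is a small Python sketch of the header conversion only. It is an illustration under stated assumptions, not the code used by LAStools or any other product. The two struct layouts follow the LAS 1.4 specification (60-byte EVLR header with a 64-bit length, 54-byte VLR header with a 16-bit length), and the field values are the ones from the lasinfo report earlier in this thread. `promote_evlr_header` is a made-up name. A real promotion also has to move the payload in front of the point data and update the header counts and the offset to point data, which is why it only makes sense during a rewrite.

```python
import struct

VLR_HEADER = struct.Struct("<H16sHH32s")    # 54 bytes, u16 "length after header"
EVLR_HEADER = struct.Struct("<H16sHQ32s")   # 60 bytes, u64 "length after header"
MAX_VLR_PAYLOAD = 65535

def promote_evlr_header(evlr_header_bytes):
    """Return the equivalent 54-byte VLR header, or None if the payload is too large."""
    reserved, user_id, record_id, length, description = EVLR_HEADER.unpack(evlr_header_bytes)
    if length > MAX_VLR_PAYLOAD:
        return None
    return VLR_HEADER.pack(reserved, user_id, record_id, length, description)

# The CRS record from the lasinfo report: small enough to promote.
evlr = EVLR_HEADER.pack(0, b"LASF_Projection", 2112, 947, b"OGC Well Known Text")
vlr = promote_evlr_header(evlr)
print(len(evlr), len(vlr))          # 60 54

# A hypothetical record one byte over the limit has to stay an EVLR.
huge = EVLR_HEADER.pack(0, b"example", 1, 65536, b"")
print(promote_evlr_header(huge))    # None
```
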

I could be wrong, but that's my current understanding of the matter. Karl or Jason are welcome to correct me.

Evon
--
Evon Silvia PLS
Solutions Developer
517 SW 2nd Street, Suite 400, Corvallis, OR 97333
P: (541) 452-8502



Stoker, Jason

Feb 1, 2017, 8:38:19 PM
to <lastools@googlegroups.com>, The LAS room - a friendly place to discuss specifications of the LAS format
Hi Martin, thanks for the post. I believe you are correct that the latest LBS v1.3 draft, as currently written, does require WKT to be held in an EVLR. We can certainly remove it and defer a decision to v2.0 if we want to study the requirement further. I have asked Karl to release the base spec for comment before we lock it in and publish so we can weigh the pros and cons of these types of concerns. The main additions for the next version have more to do with the content for breaklines than the point clouds, and Karl and his team have been working with the hydro community to ensure we are meeting their needs. But there are a few tweaks such as the one you noted that may be of interest to others here.

We are well aware that while this is a USGS base specification for 3DEP, there are many others who have adopted it to ensure their data is of satisfactory quality and consistency. There are also many, many users out there who couldn't care less how we do things at USGS, as it doesn't apply to them at all. And while many software vendors are working to ensure that they can produce data that meets our specs (whether they agree it is right or not), we don't want to burden all the various software vendors to bend over backwards and create separate modes just for us. I personally think it is more important for us to say what is in a file than exactly how it gets written. The hows (and even a lot of the whats) should be fleshed out via the LAS WG and OGC, to which we now contribute. What we really want is fully compliant LAS data, with all fields correctly populated. USGS is not in the business of making lidar formats, but of ensuring that what is given to us is correct and consistent.

And we do have a goal of enabling data in the cloud for piped processing and streaming in the future, so I don't want to lose sight of your issues there. The future answer may even be that we need to have more than one point cloud format for transfer, storage/archive and/or streaming/exploitation, with a way to make that translation on the fly (something like what PDAL does), instead of "one format to rule them all". We have partners and instruments that now produce data in BPF and HDF5 formats, for example, and we're currently forcing them to convert to LAS, which may result in lost information or empty fields.

One of the things we struggle with is balancing improvements in the technologies and formats with the fact that we also have a responsibility to store and archive all these data too- so our grandchildren can still pull any of them up in their holodecks in 50 years and complain about how silly it was we could only store 2 points per square meter back in the olden days ;-). 

So I applaud your proactive announcements of concerns here, but do not worry- we're still in draft mode and are still soliciting input to make this the best spec for us and people who use it. Once the draft is ready for formal input, I trust Karl and his team will make it available for comments.


Jason M. Stoker, Ph.D
US Geological Survey
National Geospatial Program
Office: 970-226-9227
Cell: 605-496-3513
