Improved wording FWF part of LAS 1.3/1.4 specification


Martin Isenburg

Jan 22, 2014, 5:02:46 AM
to The LAS room - a friendly place to discuss specifications of the LAS format, PulseWaves - no pulse left behind
Hello,

this discussion concerns revision 13 of the LAS 1.4
specification but equally applies to the corresponding parts of the
latest LAS 1.3 specification. The relevant pages are pages 14/15 and
26/27:

http://www.asprs.org/a/society/committees/standards/LAS_1_4_r13.pdf

First RIEGL (late 2011) and then Optech (now) were misreading the
FWF part of the LAS specification and have been implementing export
software producing files that were incompatible with those produced by
Leica. In both cases the engineers complained to me - as we were
fixing up the exporters - that the language of the LAS specification
is too vague and subject to interpretation. In fact even what I assume
to be the "ground truth" - namely the Leica exports - is quite
different from what the specification describes.

Why do I consider Leica's output as the "ground-truth"?
(1) I read up on the history of FWF in LAS 1.3. That part of the
specification was proposed (and I assume mostly written) by Paul Galla
of Leica.
(2) It seems that the Leica LAS 1.3 FWF is the only output that is
currently being used in a third vendor's commercial software
(Terrasolid) for actual processing (extracting additional returns).

There are two issues to clarify before we can start improving:

(1) Where is the number of samples per waveform encoded?

(a) in the "Number of Samples" field of the Wave Packet Descriptor
VLRs (of which there are only 255)? This is what Leica stores in this
field, and this was what I read into the wording of the specification.
I quote page 27:

"Number of Samples: The number of samples associated with this
waveform packet type. This value always represents the fully
decompressed waveform packet."

But Leica outputs 256 samples (no matter what) whereas Optech outputs
a great variety of different lengths.

(b) implicitly in the "size" field of the "Wave Packet" in each point
record? This is what Optech does. It is implicit because the stored
number has to be divided by two to get the actual number of (16 bit)
samples. But this "implicit" computation is a little dangerous because
if the actual samples were stored in some compressed manner then the
size field would no longer be able to fulfill this dual role.
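
To make the implicit computation in (b) concrete, here is a minimal
Python sketch. It assumes uncompressed data with whole-byte sample
widths; the function name and its parameters are illustrative only and
are not part of any LAS library.

```python
def implied_sample_count(packet_size_bytes, bits_per_sample=16):
    """Derive the sample count from the per-point "size" field.

    Only meaningful for uncompressed waveforms: once the samples are
    compressed, the byte size no longer determines the sample count.
    """
    bytes_per_sample, remainder = divmod(bits_per_sample, 8)
    if bytes_per_sample == 0 or remainder != 0:
        raise ValueError("this sketch only handles whole-byte sample widths")
    count, leftover = divmod(packet_size_bytes, bytes_per_sample)
    if leftover != 0:
        raise ValueError("packet size is not a whole number of samples")
    return count


print(implied_sample_count(256))  # 128 samples of 16 bits each
```

The division only works because nothing is compressed, which is exactly
the dual role of the size field that is questioned above.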

So does Optech now have to (1) store a huge number of Wave Packet
Descriptors each with a different "Number of Samples" field as you can
see in the example histogram of the "size" field below or (2) pad the
waveforms with "zero" values to lower the number of VLRs required?

lasinfo.exe -i optech.laz -histo wavepacket_size 1 -nh -nv -nmm
bin 96 has 58591
bin 104 has 113479
bin 112 has 206085
bin 120 has 607946
bin 128 has 2653437
bin 136 has 2249816
bin 144 has 1193661
bin 152 has 1026636
bin 160 has 899445
bin 168 has 809480
bin 176 has 751317
bin 184 has 711539
bin 192 has 668146
bin 200 has 606943
bin 208 has 540692
bin 216 has 494023
bin 224 has 456205
bin 232 has 409471
bin 240 has 365311
bin 248 has 312900
bin 256 has 258146
bin 264 has 210543
bin 272 has 174621
bin 280 has 144973
bin 288 has 115251
bin 296 has 91945
bin 304 has 73190
bin 312 has 55318
bin 320 has 41798
bin 328 has 31443
bin 336 has 22544
bin 344 has 16012
bin 352 has 11124
bin 360 has 7527
bin 368 has 5220
bin 376 has 3332
bin 384 has 2041
bin 392 has 1342
bin 400 has 973
bin 408 has 562
bin 416 has 410
bin 424 has 184
bin 432 has 191
bin 440 has 119
bin 448 has 39
bin 456 has 35
bin 464 has 25
bin 472 has 15
bin 480 has 15
bin 488 has 10
bin 496 has 5

I seem to remember that RIEGL was also doing it this way but
their variety was much lower (160, 240, 320, 400, 480). And they are
now exporting PulseWaves which is one way to side-step the issue.
Roland? Guenther? Can you confirm?

(2) What is the scaling and direction of the vector stored in the X(t),
Y(t), Z(t) fields of the "Wave Packet" in each point record?

The wording of the specification is really confusing here, but this is
what Leica and RIEGL (and soon also Optech) are storing in the X(t),
Y(t), Z(t) fields.

X(t), Y(t), Z(t) contain a vector that describes the direction of the
laser beam (away from the optical origin of the laser) and whose
length is the distance in meters covered per picosecond of round-trip
(!) travel time of the laser light. Hence, the length of this vector
should be somewhere around 0.299792458 / 1000 / 2 = 0.00014989623. The
"location" of each waveform sample for "s" ranging from 0 to
num_samples-1 can then be computed as

dist = location - s*temporal;
X_sample = X_return + dist*X(t);
Y_sample = Y_return + dist*Y(t);
Z_sample = Z_return + dist*Z(t);

where "location" is the "Return Point location" field from the Wave
Packet of each point record that stores - I quote the specification
(page 14) - an "offset in picoseconds from the first digitized value
to the location within the waveform packet that the associated return
pulse was detected." and "temporal" is the "Temporal Sample Spacing"
that stores - I quote the specification (page 27) - the "temporal
sample spacing in picoseconds. Example values might be 500, 1000, 2000
and so on, representing digitizer frequencies of 2 GHz, 1 GHz and 500
MHz respectively."
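
The per-sample computation above can be sketched in Python as follows.
This is only an illustration of the formula as written in this post,
including its sign convention. The function names and all numeric
values in the example (return coordinates, vector, location, spacing)
are made up for illustration.

```python
import math


def sample_positions(return_xyz, vector_xyz, location_ps, temporal_ps, num_samples):
    """Return one (x, y, z) tuple per waveform sample.

    Implements dist = location - s * temporal and
    sample = return + dist * vector, for s in 0..num_samples-1.
    """
    positions = []
    for s in range(num_samples):
        dist = location_ps - s * temporal_ps
        positions.append(
            tuple(p + dist * v for p, v in zip(return_xyz, vector_xyz))
        )
    return positions


def vector_length(vector_xyz):
    """Euclidean length of the X(t), Y(t), Z(t) vector."""
    return math.sqrt(sum(v * v for v in vector_xyz))


# Hypothetical example: a vector along the z axis with the expected length.
expected_length = 0.299792458 / 1000 / 2
vector = (0.0, 0.0, expected_length)
points = sample_positions((100.0, 200.0, 50.0), vector, 4000.0, 1000.0, 8)

print(vector_length(vector))
print(points[4])  # the sample at the "location" offset, where dist is zero
```

Checking that the vector length is close to 0.00015 is a cheap way to
spot an exporter that stores the vector with a different scaling.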

Hopefully one day we get all three big hardware vendors to output
mutually compatible LAS 1.3 FWF files ... but issue (1) really needs to
be resolved first. For issue (2) we only need a better wording in the
spec. Seems my nagging has resolved that now. (-: Is anyone aware of
any other software (besides RiPROCESS, LMS, and ALSXX) that generates
LAS 1.3 FWF?

Comments?

Martin @rapidlasso

--
http://rapidlasso.com - fast tools to fix LiDAR specs

Evon Silvia

Jan 22, 2014, 12:59:38 PM
to las...@googlegroups.com, PulseWaves - no pulse left behind
Martin et al,

Although I am a relative newcomer, I have read the specs several times and implemented my own waveform read/write code that we use internally. I have no vested interest in any hardware vendor. I'll grant that the specification is a little confusing on one particular point, but generally I find it unambiguous.

(1): I don't understand this alleged ambiguity at all. The specification explicitly states that the Number of Samples per record is stored as an unsigned long integer in the point's corresponding Waveform Packet Descriptor, of which you can have 256 different records. A value of 120 means that each waveform data packet has 120 samples. There's no need for "implicit" anything, and obscuring that value as suggested in (1)(b) begs the question of what that field in the Waveform Data Packet Descriptor would be for.

The specification (pg 14) explicitly states that the Waveform Packet Size field in the point record is only there to allow for a future case where the uncompressed waveform data's size (determined from the Waveform Packet Descriptor) doesn't equal its compressed size (determined in the individual point record). Martin's findings suggest to me that Optech must be using some form of compression.

(2): Martin, I would suggest that the greater question is rather how people define the "anchor point", as that will control the definition of the parametric variables. The ambiguity, in my opinion, comes from the question of which X(t) you're talking about, and a misplaced parenthetical statement. Quoting from page 14 of the 1.3 spec (emphasis added by me):
X(t), Y(t), Z(t): These parameters define a parametric line equation for extrapolating points along the associated waveform. The position along the wave is given by:
X = X_0 + X(t)
Y = Y_0 + Y(t)
Z = Z_0 + Z(t)
where X, Y, and Z are the spatial position of the derived point, X_0, Y_0, Z_0 are the position of the "anchor" point (the X, Y, Z locations from this point's data record) and t is the time, in picoseconds, relative to the anchor point...
Ignoring the parenthetical statement, I get the impression that the "anchor" point is where the first detectable return occurred (i.e., the "front" of a waveform curve) for a particular pulse.

For a consumer, to determine another point along the waveform packet, you must first back-calculate coordinates for the "anchor" point of the current pulse given the following equations:
X_0 = X - X(t_P)
Y_0 = Y - Y(t_P)
Z_0 = Z - Z(t_P)
where X is the X value of this point, X() is the parametric X(t) value of that point, and t_P is the Return Point Waveform Location of this point. After that, your coordinates (X_s, Y_s, and Z_s) for other positions along the waveform curve (at sample S) are determined relative to that calculated anchor point using the following unambiguous equations:
X_s = X_0 + X(t) * t_s
Y_s = Y_0 + Y(t) * t_s
Z_s = Z_0 + Z(t) * t_s
where t_s is the sample's time offset determined by multiplying the Waveform Packet Descriptor's Temporal Sample Spacing and the integer number of this particular sample.
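
As a rough illustration of the two steps described above (back-calculating the anchor point, then stepping along the waveform), here is a minimal Python sketch. It follows the interpretation given in this post only; the function names and the example numbers are hypothetical.

```python
def anchor_point(point_xyz, vector_xyz, t_p):
    """Back-calculate (X_0, Y_0, Z_0) from a point record.

    point_xyz  : X, Y, Z of the point record
    vector_xyz : X(t), Y(t), Z(t) of the point record
    t_p        : Return Point Waveform Location in picoseconds
    """
    return tuple(p - v * t_p for p, v in zip(point_xyz, vector_xyz))


def sample_point(anchor_xyz, vector_xyz, sample_index, temporal_ps):
    """Coordinates of one waveform sample relative to the anchor point."""
    t_s = sample_index * temporal_ps  # time offset of this sample
    return tuple(a + v * t_s for a, v in zip(anchor_xyz, vector_xyz))


# Hypothetical point record and descriptor values.
point = (10.0, 20.0, 30.0)
vector = (0.0, 0.0, 0.0001)
anchor = anchor_point(point, vector, t_p=3000.0)
print(sample_point(anchor, vector, sample_index=3, temporal_ps=1000.0))
```

With these values the sample whose time offset equals t_P lands back on the point record itself (up to floating-point rounding), which is a quick sanity check for any implementation of this interpretation.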

Of course, this interpretation means that you cannot store waveform data for a pulse that has no digitized return, but that's a flaw in the specification, not in the interpretation. This also implies that every return for a single pulse would point to (i.e., "share") the same waveform data packet, which is not something I have seen in most data sets.

Anyway, by this my interpretation, the ambiguity of the specification is easily cured with the following revisions:
X(t), Y(t), Z(t): These parameters define a parametric line equation for extrapolating points along the associated waveform. The position along the wave is given by:
X = X_0 + X(t) * t_s
Y = Y_0 + Y(t) * t_s
Z = Z_0 + Z(t) * t_s
where X, Y, and Z (the X, Y, Z locations from this point's data record) are the spatial position of the derived point, X_0, Y_0, Z_0 are the position of the "anchor" point and t_s is the time offset of a waveform sample, in picoseconds, relative to the anchor point ...
When recording the waveform data in the first place, Riegl, Leica, and Optech should already know the time (t_s) and coordinates (X_0, Y_0, Z_0) of the anchor point, so recording the parametric equations for a point record becomes simple algebra.

That's my $0.02. Take it or leave it.

Evon

--
Evon Silvia  Geomatics Specialist
WSI Corvallis, OR WSI Portland, OR WSI Oakland, CA
517 SW 2nd St., Suite 400, Corvallis, OR 97333




Martin Isenburg

Jan 22, 2014, 5:46:49 PM
to The LAS room - a friendly place to discuss specifications of the LAS format
Evon.

I have seen three completely different implementations in the flagship
software releases of the three main vendors. I believe all three
companies have very smart engineers working on the software. Yet the
LAS 1.3 FWF output they are producing is not compatible as the spec
was interpreted differently. I think that is sufficient evidence that
the spec needs to be written with more clarity.

I agree with you that item (1) is pretty clear. But Optech and RIEGL
have - nevertheless - implemented this differently. I guess they will
have to fix their exporters (if they haven't already). But item (2) is
- and here I agree with the engineers at Optech and RIEGL - very vague
and leaves plenty of room for interpretation. At this point I believe
the most pragmatic option is to follow the way that Leica interprets
the spec as they have been producing the vast majority of LAS 1.3 FWF
content and then rewrite the LAS 1.3 FWF spec to more clearly describe
what Leica is exporting.

Who uses LAS 1.3 FWF in a third party software? Anyone?

Martin

Alex

Jan 23, 2014, 2:26:31 AM
to las...@googlegroups.com, PulseWaves - no pulse left behind
As Martin noticed, instead of constructing a lot of Waveform Packet Descriptors for waveforms of different length, Optech LMS writes a single Waveform Packet Descriptor for all waveforms. It sets Bits per sample (16) and Waveform compression type (no compression). LMS uses Waveform packet size to represent the length of the waveform for each return. Because there is no compression, this is the exact length of the waveform (in bytes, not in samples). Thus, the field Number of samples of the Waveform Packet Descriptor is unnecessary in this case, and LMS writes the total number of samples to this field. When one runs lasinfo, it says:

variable length header record 2 of 2:
  reserved             0
  user ID              'LASF_Spec'
  record ID            100
  length after header  26
  description          ''
  index 1 bits/sample 16 compression 0 samples 2710064 temporal 1000 gain 1, offset 0




Martin Isenburg

Jan 23, 2014, 2:46:44 AM
to The LAS room - a friendly place to discuss specifications of the LAS format, PulseWaves - no pulse left behind
Hello Alex,

I think there is little choice but to convince Optech to have their
LMS LAS FWF exporter fixed to conform to the specification. Writing the
total number of samples into that field broke my software and will -
no doubt - break others too that expect a specification-conforming LAS
file. I quickly compiled a "special investigation version" of LAStools
called las2txt_WOF and lasview_WOF (WOF = With Optech Fix) to produce
the illustrations I sent earlier ... but this software cannot read
"real" LAS FWF files anymore ...

Here is a real-world example. It is a little-known fact, but laszip.exe
can also compress the waveform data when run with the option
'waveforms' such as

laszip.exe -i leica_las13fwf.las -olaz -waveforms

However running this

laszip.exe -i optech_las13fwf.las -olaz -waveforms

currently crashes the compressor because the input does not follow the
specification. Of course it should not crash but exit with an ERROR
message. The next version will do so, proclaiming "corrupt LAS FWF
file". The LASzip compressor has no chance to compress the waveforms
correctly because suddenly the information about how many samples are
in the waveform is not there anymore. The original reason that there
is both a "number of samples" entry as well as a "size" entry was - in
fact - to allow compression to be added to the official LAS
specification one day (as the specification explicitly states).
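
The kind of consistency check a reader could run before touching the
waveforms might look like the following Python sketch. It is not how
LASzip does it; the function name, its parameters and the exact error
text are assumptions for illustration. The only rule it encodes is the
one discussed in this thread: for uncompressed data the per-point
"size" must match the descriptor's "number of samples".

```python
def check_wave_packet(descriptor_samples, bits_per_sample,
                      compression_type, packet_size_bytes):
    """Return None if the packet looks consistent, else an error message."""
    if compression_type != 0:
        # Compressed packets may legitimately have a different byte size,
        # so this sketch cannot check them by size alone.
        return None
    expected_bytes = descriptor_samples * bits_per_sample // 8
    if expected_bytes != packet_size_bytes:
        return ("corrupt LAS FWF file: descriptor implies %d bytes "
                "but wave packet size is %d" % (expected_bytes, packet_size_bytes))
    return None


# Descriptor values as in the lasinfo report quoted earlier in this thread
# (samples 2710064, 16 bits, compression 0) with a hypothetical 256 byte packet.
print(check_wave_packet(2710064, 16, 0, 256))
```

A reader that runs such a check per point can exit with a clear message
instead of reading far past the end of a waveform.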

By not following the spec Optech LMS LAS 1.3 FWF output will not only
cause unpredictable behaviour in software that expects the LAS input
to conform to the specification ... it will also prohibit future
developments - such as compression - from being applied to Optech LMS
LAS 1.3 FWF data. As the same FWF mechanism will live on in LAS 1.4 FWF
and as full waveform becomes more and more dominant with REDD+ biomass
campaigns and full waveform algorithms maturing ... I think it is best
to make the LAS FWF output of Optech LMS conform to the specification
as soon as possible ...

Does anyone else have an opinion here?

Regards,

Martin @rapidlasso

Alex Yeryomin

Jan 23, 2014, 2:53:05 AM
to las...@googlegroups.com, PulseWaves - no pulse left behind
Martin, I agree. I will inform developers about this issue and possible negative consequences.

Thank you for the hint on laszip and waveforms!

Alex

Martin Isenburg

Jan 23, 2014, 3:29:51 AM
to PulseWaves - no pulse left behind, The LAS room - a friendly place to discuss specifications of the LAS format
Great to hear that, Alex.

I agree that not being able to specify the number of samples "per
shot" is annoying and one of the major design flaws of LAS 1.3 FWF
(don't get me started (-;). This (and other design issues) is the
reason we started the PulseWaves effort. As far as I can reconstruct
it, this flaw was introduced like this: Leica was the first to want
FWF exports and pushed for this functionality. Their hardware seemed
to always produce 256 samples and apparently they did not
anticipate that anyone may want to vary the number of samples some
day. No one seemed to scrutinize Leica's suggested addendums further.
I know Paul Galla tried really hard to get feedback on the draft but
got none. Last year I read up on that.

This was all discussed here:
http://groups.yahoo.com/neo/groups/lasformat/info

And then continued here:
http://lidar.cr.usgs.gov/

But now all of these "official records" of the history of the LAS
format seem gone. It is a pity. Should such discussions not be
preserved? I would assume that anyone maintaining an open standard
should. But who am I to speak ... all of the forums that I have
created rely on the goodwill of a large corporation. But they did once
say "do no evil" ... albeit quite some time ago. (-:

Martin Isenburg

Jan 27, 2014, 7:53:22 AM
to PulseWaves - no pulse left behind, The LAS room - a friendly place to discuss specifications of the LAS format
Hello,

I double checked and can confirm that RIEGL's LAS 1.3 FWF exports
already follow what we have outlined in this thread and are fully
compatible with Leica's LAS 1.3 FWF exports. They simply allocate a
whole bunch of "empty" wavepacket descriptors as VLRs and then
populate them before closing the written file with the waveform
descriptors that were actually used during export. That is one easy
way to deal with the fixed sample block size flaw in LAS 1.3 FWF ...
assuming the sample block sizes do not vary too much.

But it seems that Optech will have to do some padding to avoid using
hundreds of VLRs for the many different waveform sample block sizes
they seem to be exporting with the current incompatible LAS 1.3 FWF
exporter. Attached is a sample lasinfo report for a RIEGL LAS 1.3 FWF
file. That's more or less how it should look ...
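
One way such padding could work is sketched below in Python. This is
not what any vendor does; the block sizes, the zero pad value, the
1-based descriptor numbering and the function name are all assumptions
chosen for illustration. Each waveform is padded up to the nearest
allowed block size, so only one descriptor per block size is needed.

```python
import bisect


def pad_waveform(samples, block_sizes, pad_value=0):
    """Pad a waveform to the smallest allowed block size that fits it.

    Returns (padded_samples, descriptor_index), where descriptor_index
    is the 1-based position of the chosen size in the sorted size list
    (a hypothetical numbering for this sketch).
    """
    sizes = sorted(block_sizes)
    i = bisect.bisect_left(sizes, len(samples))
    if i == len(sizes):
        raise ValueError("waveform is longer than the largest block size")
    target = sizes[i]
    padded = list(samples) + [pad_value] * (target - len(samples))
    return padded, i + 1


# Hypothetical block sizes, in samples.
padded, index = pad_waveform([7] * 100, block_sizes=(64, 128, 192, 256))
print(len(padded), index)  # 128 2
```

Fewer block sizes mean fewer VLRs but more padding per waveform, so the
list of sizes is a trade-off that each exporter would have to pick.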

Regards,

Martin @rapidlasso
130415_150510.txt