I would point out that the XDR libraries are part of the underpinnings
of the Network File System. This is part of the magic that allows us
to remotely mount file systems on machines of different architecture,
64 bit, big endian, little endian, and all of that transparently and
without problems.
Make no mistake: XDR is stable. If people think that they are having
problems with XDR please email me directly with details so that we
can try to reproduce the problem and fix it.
Indeed, I have never seen a documented problem with our implementation
of XDR in SU to this date.
Why use XDR? Basically, installing SU with XDR allows you to transport
SU format files between different platforms. You don't really need
to use XDR if you are not exchanging data in the SU format between
systems of different endianness or of 32 versus 64 bit word size.
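To make the portability point concrete, here is a minimal Python sketch (not SU code; the sample value 1500.0 is purely illustrative). It shows that the same 4-byte IEEE float has a different byte layout on big and little endian hosts, which is the mismatch that a fixed big endian on-disk layout such as XDR's avoids.

```python
import struct

sample = 1500.0  # illustrative sample value, not taken from any real data

big = struct.pack(">f", sample)     # big endian IEEE float, the layout XDR uses
little = struct.pack("<f", sample)  # little endian IEEE float, native on most PCs

# The two layouts are byte-for-byte reversals of each other.
assert big == little[::-1]

# Reading big endian bytes as if they were little endian gives a wrong value.
correct = struct.unpack(">f", big)[0]
misread = struct.unpack("<f", big)[0]
print(big.hex(), little.hex(), correct, misread)
```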
Please note: the SU data format is an *internal* data format for use
with SU programs. It is not a data exchange format. You can use
segywrite to write SEGY format files for data exchange purposes.
-John
John Stockwell | jo...@dix.Mines.EDU
Center for Wave Phenomena (The Home of Seismic Un*x)
Colorado School of Mines
Golden, CO 80401 | http://www.cwp.mines.edu/cwpcodes
voice: (303) 273-3049
Our book:
Norman Bleistein, Jack K. Cohen, John W. Stockwell Jr., [2001],
Mathematics of multidimensional seismic imaging, migration, and inversion,
(Interdisciplinary Applied Mathematics, V. 13.), Springer-Verlag, New York.
_______________________________________________
seisunix mailing list
seis...@mailman.mines.edu
https://mailman.mines.edu/mailman/listinfo/seisunix
Unsubscribe: seisunix-u...@mailman.mines.edu
> I really believe this should be a community discussion so I've copied the mailing list.
>
> Responses are inline below:
>
> --- On Sat, 7/3/10, pm <p...@cgiss.boisestate.edu> wrote:
>
>> From: pm <p...@cgiss.boisestate.edu>
>> Subject: Re: [Seisunix] why XDR?
>> To: "Reginald Beardsley" <pula...@yahoo.com>
>> Cc: "John Stockwell" <jo...@dix.mines.edu>
>> Date: Saturday, July 3, 2010, 10:44 AM
>> Reg,
>>
>> WHAT IT IS
>> Originally, I was completely confused by the move to
>> SUXDR. I now think I may
>> understand the motivation. With SUXDR, files are
>> stored with big endian 4
>> byte floats. The 2 and 4 byte integers are also big
>> endian in the 240 byte
>> trace header. Most Linux PC's are little endian as
>> their native format (all
>> of this is IEEE not IBM).
>>
>> ISSUE WITH NFS
>> With NFS, I might imagine a scenario where a native big
>> endian machine and a
>> native little endian machine were to both mount the same
>> disk from an NFS
>> file server. On that mount might reside SU files that
>> both types of machines
>> wish to read. If the little endian client uses SUXDR,
>> then it could read the
>> same files that the big endian client could read
>> (presumably with or without
>> XDR since it is native, no external needed). How
>> likely is that scenario? I
>> can't see myself in that situation, since I would only use
>> NFS on a LAN, and
>> we only have little endian machines on our network.
>> But maybe this theory is
>> not correct, since I have not used a big endian machine for
>> 20 years.
>
> It was commonplace 20 years ago to have a variety of machines NFS mounting the same filesystems. That was the motivation for SUXDR. It's much less common now. Except for the PowerPC and SPARC, I'm not aware of any extant Big Endian systems other than the mainframe Z series. So most modern Big Endian machines are servers. Big Endian workstations are very rare now and typically old.
>
It's still commonplace in oil companies to have multiple platforms. The
endian issue is an issue of data made years ago. In our data, we are
haunted by the formats and hardware that we used in the past.
Also, remember that I am not thinking primarily of users in the
US and Europe. Who knows what hardware is still being used by people
who don't have the funds to purchase the most modern equipment? I
still get questions about platforms that we threw out years ago.
These are people with one-person seismic companies, academics with
less funding, government researchers, as well as people in developing
nations. I have to think of all of the users of SU, not just those
in the US and Europe.
>>
>> STILL CONFUSED ON 32 vs 64
>> I don't see the 32 vs 64 bit issue at all. This is
>> because the file format is
>> 4 byte floats regardless of the arch of the host.
>> Remember, SU file format
>> is derivative of an exchange format, SEGY, so let's not be
>> even thinking of
>> larger byte floats in an SU file.
>
> 64 bit only affects things like file sizes and badly written code. It's the 16 bit vs 32 bit crisis of 25 years ago reinvented by another generation :-(
>
The issue of 32 bit versus 64 bit arises in SU because the SU data
format has some fields that are short integer fields. The short integer
is a different length in some versions of 64 bit than in others.
Certainly, taking data made on a 32 bit platform to a 64 bit platform
may be an issue.
>>
>> UNIVERSALLY ADAPTIVE CODE
>> The problem with a code that doesn't care about endian is
>> as follows. The
>> header definition would need amendment to include a 1 byte
>> element that
>> defines the file endian type. When I wrote
>> a byte swap program that toggles
>> a SU file between little and big (SUXDR) endian formats, I
>> had to have the
>> number of samples be a user supplied value. With just
>> that information, one
>> can do the byte swap. Without it, there is no way a
>> code can determine
>> number of samples, since the two alternative
>> interpretations (big or little
>> endian) will lead to valid integers. Of course, maybe
>> one could look at the
>> total file bytes to determine if a fraction of a trace
>> residual was left over
>> (suggesting the wrong interpretation). However it
>> would not be bullet proof
>> since there would be cases where the total bytes and the
>> sample length made
>> sense (ie. an integer number of traces).
>
> While adding a field to the header will work, it's not necessary. There are an assortment of subterfuges one can use to determine the byte sex and floating point format. Never underestimate the low cunning of an old programmer. Especially one who has read several million lines of other people's code. You learn a lot of handy tricks along the way.
>
I put "swapbytes" in there for straight binary data, and "suswapbytes"
for SU data format files. The program "suoldtonew" is there to go from
pre-XDR to XDR format data. This does the trick. I don't know why you
are making it harder than it has to be.
We have put in a byte order detection in the more recent versions
of segyread and segywrite, so, in theory, one need not set the endian=
flag anymore. However, there are enough variations in what people call
"segy" that it is necessary to have a flag to select these items.
Some data require that both the data and the headers be swapped;
on other data it is one or the other only.
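The ambiguity raised in the quoted message (both byte orders yield a valid sample count, and a file-size check is not bullet proof) can be sketched in Python as follows. This is not code from SU. It assumes a 240 byte trace header, 4 byte samples, a fixed trace length, and that ns sits at byte offset 114 of the header (SEG-Y trace header bytes 115-116). The function names are hypothetical.

```python
import struct

HDR_BYTES = 240   # SU trace header length
NS_OFFSET = 114   # assumed offset of ns (SEG-Y trace header bytes 115-116)

def ns_candidates(first_header):
    """Return ns read as big endian and as little endian."""
    raw = first_header[NS_OFFSET:NS_OFFSET + 2]
    return struct.unpack(">H", raw)[0], struct.unpack("<H", raw)[0]

def consistent_orders(first_header, file_size):
    """List the byte orders for which file_size is a whole number of traces."""
    ns_big, ns_little = ns_candidates(first_header)
    result = []
    for name, ns in (("big", ns_big), ("little", ns_little)):
        trace_bytes = HDR_BYTES + 4 * ns
        if ns > 0 and file_size % trace_bytes == 0:
            result.append(name)
    return result

# The bytes 01 02 read as 258 (big) or 513 (little); a file whose size is a
# multiple of both trace lengths cannot be resolved by the size test alone.
hdr = bytearray(HDR_BYTES)
struct.pack_into(">H", hdr, NS_OFFSET, 258)
print(consistent_orders(bytes(hdr), 1272 * 2292))
```

When the list has exactly one entry the size test has settled the question; when it has two, some other evidence is needed.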
>>
>> MY SOLUTION
>> Since I write code outside of SU, but like to be able to
>> apply both SU and my
>> own codes to the same file endian format, I don't use
>> SUXDR. The reason is
>> that I don't want any extra overhead on IO, since I always
>> use a little
>> endian machine. But now that I have a byte swap
>> program, I can convert a
>> SUXDR file someone gives me (I know, it isn't an exchange
>> format, but people
>> do these things). Bottom line, I am not passionate
>> about this, other than to
>> point out that SUXDR is optional, so if you don't like it,
>> don't compile that
>> way.
>
> I'm not suggesting doing byte swaps unless the data are in the wrong format for the execution host. So the only time there would be any overhead is if the file was written on a machine w/ a different byte sex. The code would read either format w/o complaint and always write native format. This could be extended to reading either IEEE or IBM floats transparently.
>
The choice of XDR or no XDR does not affect at all any programs
written in the SU style, because the only place where XDR is implemented
is in fgettr and fputtr. That choice is totally hidden from the
user.
If you install SU either with or without XDR you won't be able to tell
the difference in performance. Also, if you plan to work in your own
personal SU data universe, then it doesn't make any difference whether
you set SUXDR or not.
However, if you plan to move SU format data to other platforms with
a different architecture, then you may have an issue.
>>
>> I would suggest that Makefile.config return to setting the
>> default to no xdr.
>> It seems to default to SUXDR with the shipped config file.
>
> I would like to see multiple config files so that a new user can select one close to their case. John believes that forcing a user to sort through the current arrangement is educational. I disagree and maintain my own set of config files that I drop into a new distribution along w/ a build script to automate the whole process. Overall John has done an excellent job w/ the build system. I don't like it, but it's better than the other packages and difficult to improve on. I know. I've tried a couple of times and failed.
Primarily, I do not want to have to maintain a bunch of different
configuration files, when a user needs merely to comment out or uncomment
a specific line in one Makefile.config. If people want to send
me a bunch of example Makefile.config's then I would be happy
to include them in the Portability directory.
As to shipping with XDR set, that is the environment that we use here
in the office on a bunch of Linux and Mac systems. If people don't
want to use XDR, then they don't have to use it. That is why there
is a flag there.
>
>>
>> With regard to a steering committee, that would depend on
>> things like who
>> might contribute to the funding of SU. I have no view
>> on that topic, but
>> will say that SU is a great collection of applications, and
>> has done very
>> well to this point!
>
> There is NO funding. This is just a community management issue. If we are to preserve an active and growing community we have to work together as a group. Otherwise, the stronger members will simply wander off on their own.
>
> The essence of democracy is voluntary submission to the will of a larger group. Unlike many, I don't need the CWP/SU community. I can readily build a processing system from scratch if I want to. In addition there are several other candidates, FreeUSP, CPSeis, SEPlib, & Madagascar to name a few.
>
> I continue to use and support CWP/SU because I find it to be better managed than the alternatives. John has done an excellent job over the years. This has led me to reject entreaties to actively participate in other groups. I'd rather invest my time in CWP/SU.
>
The SU package is a byproduct of the Center for Wave Phenomena's
Consortium Project on Imaging Complex Structures.
>>
>> Sincerely,
>> Paul
>>
>> --
>> Dr. Paul Michaels, PE
>> Professor, Engineering Geophysics
>> http://cgiss.boisestate.edu/~pm
>
> Hopefully this clarifies my position. I'm not trying to bash anyone and especially not John. But I have a well deserved reputation for not being very compromising on what constitutes good software engineering practice.
>
> Have Fun!
> Reg
> While adding a field to the header will work, it's not necessary. There are an assortment of subterfuges one can use to determine the byte sex and floating point format. Never underestimate the low cunning of an old programmer. Especially one who has read several million lines of other people's code. You learn a lot of handy tricks along the way.
I'm intrigued to know what the subterfuge is (you're not allowed a full data read & byte count). Anybody willing to share?
PS I'm glad to see the discussion get a little more tempered. John does do a fantastic job of keeping things running
smoothly IMHO.
Best wishes all,
James.
If you think about it, it isn't hard to think of tests. The SEGY header
consists of a lot of ints and shorts, so if the numbers are crazy,
usually large and crazy when you switch the mantissa and the exponent,
that could give you a clue about the endianness.
It gets more challenging when you try to figure out what the data format
is. Again, though if you were to make some test cases with suplane
and write them as segy files, and then read those files with conv=0, you
could compare the 32 bit IBM tape format to your native IEEE format.
Likely there is a similar criterion that you could gin up.
-John
This whole conversation has made me feel as if you had been talking to someone who pretended to be Reginald. Not sure why, but I am not willing to do anything that breaks code, no matter how exotic the environment. Code portability is a religious issue with me.
I also have no interest in keeping anything secret. What I want to do is teach some other people how to do some of this. I just don't seem to be able to find anyone who is either interested in learning or interested in the result.
Byte sex:
There are various fields which have narrow ranges of values. By computing statistics for these fields one can make a reliable judgment exactly as John has suggested. Simple trick exemplified by the header character detection logic in fgettr.c.
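A minimal Python sketch of the narrow-range idea follows. It is not the logic in fgettr.c. The field offsets follow the SEG-Y trace header layout (trid at bytes 29-30, dt at bytes 117-118), while the plausible ranges and the function names are assumptions chosen for illustration.

```python
import struct

HDR_BYTES = 240
# (offset, plausible low, plausible high) for 2-byte header fields.
# Offsets follow the SEG-Y trace header layout; the ranges are guesses.
NARROW_FIELDS = [
    (28, 1, 255),      # trid: trace identification code
    (116, 1, 20000),   # dt: sample interval in microseconds
]

def score(header, order):
    """Count narrow-range fields that look plausible under a byte order."""
    hits = 0
    for offset, low, high in NARROW_FIELDS:
        value = struct.unpack(order + "H", header[offset:offset + 2])[0]
        if low <= value <= high:
            hits += 1
    return hits

def guess_byte_order(headers):
    """Return 'big', 'little', or 'unknown' from a sample of trace headers."""
    big = sum(score(h, ">") for h in headers)
    little = sum(score(h, "<") for h in headers)
    if big > little:
        return "big"
    if little > big:
        return "little"
    return "unknown"

# Example: one header written big endian with trid=1 and dt=4000.
hdr = bytearray(HDR_BYTES)
struct.pack_into(">H", hdr, 28, 1)
struct.pack_into(">H", hdr, 116, 4000)
print(guess_byte_order([bytes(hdr)]))
```

Scoring over many headers rather than one is what makes the judgment statistical; a single header with unusual values can still mislead.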
Floating point format:
If you compute a statistical distribution of the floating point values, they are weirdly non-Gaussian if you get it wrong. This is the result of the different bases for the exponents.
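The sketch below does not implement the statistical test itself. It only shows, in Python, the two decodings such a test would compare, assuming big endian 4 byte words: the IBM format uses a base 16, excess-64 exponent with a 24 bit fraction, while IEEE uses a base 2 exponent. The function names are hypothetical.

```python
import struct

def ibm32_to_float(raw):
    """Decode 4 big endian bytes as an IBM single precision float."""
    (word,) = struct.unpack(">I", raw)
    sign = -1.0 if word >> 31 else 1.0
    exponent = (word >> 24) & 0x7F          # excess-64 exponent, base 16
    fraction = (word & 0x00FFFFFF) / float(1 << 24)
    return sign * fraction * 16.0 ** (exponent - 64)

def ieee32_to_float(raw):
    """Decode 4 big endian bytes as an IEEE 754 single precision float."""
    return struct.unpack(">f", raw)[0]

raw = bytes.fromhex("41100000")   # the value 1.0 in IBM format
print(ibm32_to_float(raw), ieee32_to_float(raw))   # prints "1.0 9.0"
```

Because the exponent bases differ, decoding a whole trace with the wrong function stretches or compresses the spread of magnitudes, which is why the resulting distribution looks wrong.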
I learned this trick and the character set trick from David Jarzabeck when I took over supporting a 750,000 line code base he and two other people had written. The only comment one author included in his code was his name! It was not actually compilable when I reported for work, so I had my work cut out for me, but I learned a lot.
The only reason I've been agitating about this is requests to seisunix for help referencing messages like the following:
"fgettr.c: on trace #1, number of samples in header (0) differs from number for first trace (24135)"
I'd like to make this problem go away and I know how. I also think CWP/SU would develop better if there was some degree of coordinated activity, so that one person didn't break someone else's code.
Sadly no one else seems to be interested in organized activity. So I'm dropping the effort. Let the random walk continue.
Have Fun!
Reg
I am interested in organized activity, but I am not knowledgeable enough to be a leader.
David Forel

--- On Wed, 7/7/10, Reginald Beardsley <pula...@yahoo.com> wrote:

David,
What's needed is a group of interested people who know what they're doing and can articulate their reasoning. In a work environment whenever I have encountered change requests I thought might be disruptive I've always made a point of polling people I thought might have a reason to oppose the change. Not because there was a formal requirement for this, but because it just made good sense. I don't think that any one person knows enough to decide in isolation once the user community exceeds their regular daily personal contacts.
It's really less a question of leadership than it is refereeing. In the industrial world it's typically called change management. Though someone I knew remarked that he'd never seen them change management in any of the many meetings he attended. It's just about fixing the things that are broken without breaking anything else.
There are a large number of things in CWP/SU such as closing a file descriptor that was never opened that will cause a core dump on some systems. The fix is trivial to implement, but there's a lot of code to go through. And if most people are using systems that don't crash, such things tend never to get fixed without organized effort. So the code doesn't work for the unlucky.
CWP/SU is improved or not by the people who use it. Though John has gotten some money for supporting it, I'm sure it is a pittance relative to the effort required. And it works better than the several alternatives. Whatever problems CWP/SU has, at least it will compile easily which is generally not the case for most seismic oriented packages.
Have Fun!
Reg

--- On Wed, 7/7/10, David Forel <david...@yahoo.com> wrote:

Well, one change I would like to see is to move SU to understanding SEG-Y rev-1. I find being stuck at rev-0 is a big usage stumbling block. My 2 cents.
David
David's cell = 303-956-2138

--- On Wed, 7/7/10, Reginald Beardsley <pula...@yahoo.com> wrote:

David,
What problems do you encounter? Shouldn't be hard to fix. If you turn on the SU_LINE_HEADER flag SU format is almost exactly SEG-Y rev 1 except for the byte sex part. You can handle rev 1 w/ segyread/write if you set the flags properly. Only deviation I'm aware of is the revision number in the binary header is not being set. I'll be happy to fix any issues that might remain. Much easier than making cpseis compile!
Reg
Reg,
I’d be willing to help in such a concerted effort if we clearly define what we would like to achieve and how to accomplish it.
Best regards,
Werner
I have been sitting on the side lines watching this debate and I feel uneasy
about the speed at which this bandwagon is gathering momentum. SU is CWP's baby and we
should respect their (especially John's) views. I strongly support the open
software movement but it has to be through cooperation and not co-option or,
worse still, imposition. What is potentially being proposed is a new way for
the future development of SU as a community project with John/CWP taking a
more development coordination role. This will only succeed with John's
blessing.
All you free software enthusiasts should first read The Cathedral and the
Bazaar by Eric S. Raymond if you have not already done so.
Richard Hobbs
Cheers
Thanks to a number of SU users, in
particular Matthias Imhoff and Stew Levin, we have a header field remapping
option in segyread that will allow you to remap items from the optional
fields into unused fields. As far as the added textual headers are concerned,
in the regular SU (not LINEHEADER) installation, these are read to
an external file. I have changed the trace ID definitions and the
field descriptions to match the SEGY Rev 1 standard. Of course, where the SEG
assigned new items to the optional fields, these are not supported, as those
bytes are used for SU fields.
The SEGY Tape format committee's decision to allow
an arbitrary number of textual header stanzas was a bad one, as this
essentially made the SEGY REV 1 header variable length, which IMHO is
bad, because if you cannot predict the length of a header in a file,
you have a hard time knowing how much to strip off to get to your data.
At least it is only the reel header that is variable length, and not
the trace headers.
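A minimal Python sketch of why the variable length matters, assuming the rev 1 layout in which a 3200 byte textual header and a 400 byte binary header are followed by extended textual headers of 3200 bytes each, with their count stored as a big endian 2 byte integer at file bytes 3505-3506. The function name is hypothetical, and this is not how segyread does it.

```python
import struct

TEXT_HDR = 3200           # textual file header, bytes
BIN_HDR = 400             # binary file header, bytes
EXT_COUNT_OFFSET = 3504   # assumed: file bytes 3505-3506 hold the count

def first_trace_offset(file_start):
    """Return the byte offset of the first trace, or None if it is not fixed.

    file_start must hold at least the first 3600 bytes of the SEG-Y file.
    """
    raw = file_start[EXT_COUNT_OFFSET:EXT_COUNT_OFFSET + 2]
    (n_ext,) = struct.unpack(">h", raw)
    if n_ext < 0:
        # A negative count means a variable number of extended textual
        # headers; the reader has to scan 3200-byte blocks for the
        # end-of-text stanza instead of computing the offset.
        return None
    return TEXT_HDR + BIN_HDR + n_ext * TEXT_HDR

# Example: a file declaring two extended textual headers.
start = bytearray(TEXT_HDR + BIN_HDR)
struct.pack_into(">h", start, EXT_COUNT_OFFSET, 2)
print(first_trace_offset(bytes(start)))
```

With a count of zero the offset is the familiar 3600 bytes; the None case is the one where the amount to strip cannot be predicted from the binary header alone.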
-John
> David,
>
> What problems do you encounter? Shouldn't be hard to fix. If you turn on the SU_LINE_HEADER flag SU format is almost exactly SEG-Y rev 1 except for the byte sex part. You can handle rev 1 w/ segyread/write if you set the flags properly. Only deviation I'm aware of is the revision number in the binary header is not being set. I'll be happy to fix any issues that might remain. Much easier than making cpseis compile!
>
> Reg
The code that Wences worked on is a PVM code. I can make that available
if people on the group are interested.
The code is a translation of Kennet's original Fortran
code. This was done by a student named Gabriel Alvarez.
thanks for the fixes,
John