Suppose a response line from a sysfiletree call looks like
yyyy-mm-dd hh:mm:ss 0 -D--- C:\a.b.c.d.e f.g.h
how do people generally chop this up? It seems to me that
parse var stem.n ymd hms siz att fnm .
works well, returning each value with no leading or trailing blanks - but of
course the fnm value is wrong if there were spaces in the filename. Trying
parse var stem.n ymd hms siz att fnm
works except the last var, fnm, ends up with a leading space. So do people
generally do
parse var stem.n ymd hms siz att" "fnm
instead? This doesn't seem to me to adhere to the Principle of Least
Astonishment.
--
Jeremy C B Nicoll - my opinions are my own.
Neat duct tape! ;-) Saving that to my KB.
--
Michael Lueck
Lueck Data Systems
http://www.lueckdatasystems.com/
parse value space(stem.n) with ymd hms siz att fnm
HTH...
>As the problem - the leading blank on the file name - is a result of the
>double blank preceding the file name, why not use
>
>parse value space(stem.n) with ymd hms siz att fnm
Because file "a b c" is different from file "a  b c". Screen
scraping (with some dross left out):
23:36 Fri 2008-07-04 F:\>dir
Directory of F:\
2008-07-04 23:36 12 a b c
2008-07-04 23:36 11 a b c
2 File(s) 23 bytes
--
Arthur T. - ar23hur "at" intergate "dot" com
Looking for a z/OS (IBM mainframe) systems programmer position
>Jeremy Nicoll - news posts wrote:
>> works except the last var, fnm, ends up with a leading space. So do people
>> generally do
>>
>> parse var stem.n ymd hms siz att" "fnm
>
>Neat duct tape! ;-) Saving that to my KB.
Take it back out of your KB, it doesn't work.
When parsing, REXX will first find literals and make
substrings before and after. Then it parses those substrings.
So, the first " " splits the string and *everything* following
will be assigned to fnm (with most of the other variables being
assigned null).
You're pretty much required to use two statements if you want
to get rid of leading blanks from the last variable in a parse. I
agree with the OP that it violates the law of least astonishment,
but it's how it's documented.
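To see this in action, here is a minimal sketch. The sample line is made up for illustration (it is not real SysFileTree output), and the template uses a single-blank literal as quoted above:

```rexx
/* Sketch only: "line" is a made-up sample, not real SysFileTree output */
line = '2008-07-04 23:36:00 0 -D---  C:\temp\a b'

/* The literal " " is located first: it matches the FIRST blank, */
/* so only "2008-07-04" is left to share among ymd hms siz att.  */
parse var line ymd hms siz att " " fnm

say 'ymd: "'ymd'"'    /* ymd: "2008-07-04" */
say 'hms: "'hms'"'    /* hms: "" */
say 'siz: "'siz'"'    /* siz: "" */
say 'att: "'att'"'    /* att: "" */
say 'fnm: "'fnm'"'    /* fnm: "23:36:00 0 -D---  C:\temp\a b" */
```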
I normally go for:
Parse value subword(stem.n,1,4) subword(stem.n,5) ,
with date time size attrs filename
Thus the only blanks that I have to worry about are leading blanks in
the filename. I've never come across a file so named; I'm not sure it
would be legal on most systems, and it would surely be asking for trouble.
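A small sketch of the difference between this and the space() suggestion, using a made-up sample line (not real SysFileTree output) whose file name contains two consecutive blanks:

```rexx
/* Sketch only: "line" is a made-up sample with two blanks inside the name */
line = '2008-07-04 23:36:00 12 A----  C:\tmp\a  b c'

/* subword() keeps the blanks between the words it returns */
parse value subword(line,1,4) subword(line,5) ,
      with date time size attrs filename
say '"'filename'"'      /* "C:\tmp\a  b c" */

/* space() squeezes every run of blanks down to a single blank */
parse value space(line) with date time size attrs filename
say '"'filename'"'      /* "C:\tmp\a b c" */
```

Blanks at the very start of the name are still lost with subword(), because subword() never returns leading blanks.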
--
Steve Swift
http://www.swiftys.org.uk/swifty.html
http://www.ringers.org.uk
>> yyyy-mm-dd hh:mm:ss 0 -D--- C:\a.b.c.d.e f.g.h
>>
>> how do people generally chop this up? It seems to me that
>>
>> parse var stem.n ymd hms siz att fnm .
>
>
>I normally go for:
>
>Parse value subword(stem.n,1,4) subword(stem.n,5)
> with date time size attrs filename
>
>Thus the only blanks that I have to worry about are leading blanks in
>the filename. I've never come across a file so named; I'm not sure it
>would be legal on most systems, and it would surely be asking for trouble.
Asking for trouble, certainly. Illegal, I'm afraid not:
1:22 Sat 2008-07-05 F:\>dir *b*
Directory of F:\
2008-07-04 23:36 12 a b c
2008-07-04 23:36 11 a b c
2008-07-05 01:20 0 abc
2008-07-05 01:20 0 a bc
2008-07-05 01:20 0 abc
2008-07-05 01:22 0 abc
System is Win2K. I expect it's equally legal (and equally a
bad idea) on any Win system which supports long filenames.
> As the problem - the leading blank on the file name - is a result of the
> double blank preceding the file name, why not use
>
> parse value space(stem.n) with ymd hms siz att fnm
Because it destroys multiple embedded blanks in filenames. It's bad enough
(I think) that Windows has spaces in filenames at all (I'm more used to RISC
OS which lets people type in spaces but converts them to ascii 160 ie a hard
space internally, so filenames are still single 'words').
I dislike seeing filenames in proportional fonts, and have on occasion used
multiple spaces to try to align sets of filenames better. So eg I have some
notes files in one directory which when listed with 'dir' look a bit like:
disk-C b scratch file tidyup - done 20080624 .txt
disk-C c cleanup - done 20080624 .txt
disk-C d error-check - done 20080624 .txt
disk-C e defrag - done 20080624 .txt
disk-C f full backup to SEA1 - done 20080407 .txt
disk-C f full backup to SEA2 - done 20080407 .txt
disk-SEA1 a error-check - done 20080406 .txt
disk-SEA1 b defrag - done 20080406 .txt
disk-SEA2 a error-check - done 20080406 .txt
disk-SEA2 b defrag - done 20080406 .txt
but when I see them in Windows Explorer they look more like:
disk-C b scratch file tidyup - done 20080624 .txt
disk-C c cleanup - done 20080624 .txt
disk-C d error-check - done 20080624 .txt
disk-C e defrag - done 20080624 .txt
disk-C f full backup to SEA1 - done 20080407 .txt
disk-C f full backup to SEA2 - done 20080407 .txt
disk-SEA1 a error-check - done 20080406 .txt
disk-SEA1 b defrag - done 20080406 .txt
disk-SEA2 a error-check - done 20080406 .txt
disk-SEA2 b defrag - done 20080406 .txt
That is, they're not perfectly aligned, but it's better than it would be
without judicious extra spaces in the filenames.
> > yyyy-mm-dd hh:mm:ss 0 -D--- C:\a.b.c.d.e f.g.h
> >
> > how do people generally chop this up? It seems to me that
> >
> > parse var stem.n ymd hms siz att fnm .
>
>
> I normally go for:
>
> Parse value subword(stem.n,1,4) subword(stem.n,5)
> with date time size attrs filename
Thanks for this suggestion.
> Thus the only blanks that I have to worry about are leading blanks in
> the filename. I've never come across a file so named; I'm not sure it
> would be legal on most systems, and it would surely be asking for trouble.
Hell! I seriously mistyped that filename - nearly all the dots were meant
to be backslashes ie I meant:
C:\a\b\c\d\e f\g.h
This comes from using a machine with dots as the 'directory' separator a lot
of the time, where filenames look like:
ADFS::4.$.This.That and.The Other.Prog 2/Rexx
and for anyone who's interested I've written a bit more about that system
below.
This is in ex-Acorn's RISC OS. In those names, what looks like a space eg
between the "That" and "and" in "That and", or "Prog" and "2" in the
leafname, is held in the file system as a hard-space (ascii 160) so the
whole name parses as a single token. Typically - with a Regina rexx port on
that system - I break those into successive levels by translating all the
dots to spaces then using words() and subword() to select chosen parts, then
if needed translate spaces back to dots.
When someone is naming/renaming a file the RO gui translates the ordinary
space character(s) that they type into the 160s.
Under RO, files don't have type-specific file extensions. Instead there's a
three digit hex value stored as metadata by the filesystem. Internally RO
just uses that to categorise files. Plain "data" files have filetype
x'000', text is x'fff', jpegs are 'c85', and so on. There's a default set
of system variables eg
File$Type_C85 has value "JPEG"
A simple command:
Set File$Type_C85 Piccie
would result in all jpeg files being listed in directory displays etc as eg
My Picture Piccie
His photo Piccie
though there are also variants of the directory display commands which will
show filetypes in hex. File system commands will take filetype parameters
either in text form or as eg "&FFF" form.
The same method is used for "file associations". When you double click a
file the system looks at a system variable, eg for jpegs:
Alias@RunType_C85
to find the command to be used to 'run' a jpeg. Merely listing all the
system variables shows you the current set of all the associations, and many
other things too.
> In Message-ID:<egAbk.18$rb6...@newsfe02.lga>,
> Michael Lueck <mlu...@lueckdatasystems.com> wrote:
>
> >Jeremy Nicoll - news posts wrote:
> >> parse var stem.n ymd hms siz att" "fnm
> >
> >Neat duct tape! ;-) Saving that to my KB.
>
> Take it back out of your KB, it doesn't work.
So it doesn't! Well, tie me to a tree and call me Brenda!
(as a friend used to say).
Clearly it's time I reread the chapter on parse.
> You're pretty much required to use two statements if you want
> to get rid of leading blanks from the last variable in a parse. I
> agree with the OP that it violates the law of least astonishment,
> but it's how it's documented.
Actually, what really surprises me is that the values on each line aren't
just single-space separated, leaving reformatting of the lines to the caller
if they wanted to do so.
Also, I wonder how the designers decided how wide the size column should be?
Any fixed limit is going to be too small eventually. It doesn't get wider
if you set a big 'numeric digits' value, either.
If you'd really like to avoid RexxUtil due to several silly add-ons:
You could create (a) file(s), and use that name as a placeholder for
each name. If "1234567890.txt" starts at position 48, both " x.txt"
and " y.txt" should also start at pos 48.
---
Parse value subword(stem.n,1,4) substr(stem.n,pos(subword(stem.n,4), ,
stem.n)+length(subword(stem.n,4,1))+2) with date time size attrs filename
Since the output of SYSFILETREE seems to be fixed column,
would this be a better solution:
parse var stem.n ymd hms siz att . 38 fnm
Or, bite the bullet and do it in two lines. I would guess it
would still take less processing than multiple SUBWORDs:
parse var stem.n ymd hms siz att fnm
fnm = strip(fnm,"L")
> Since the output of SYSFILETREE seems to be fixed column,
> would this be a better solution:
>
> parse var stem.n ymd hms siz att . 38 fnm
You need to adjust the fixed column depending on whether you use the default
date/time format or "T" or "L".
One thing that does seem fixed is the width of the 'size' column. Goodness
only knows what will happen when files get so big that the column isn't wide
enough.
Here is the SysFileTree output of a directory with pretty large files.
NOTE the filesize of big.bin.
----------
2008-07-06 15:39:39 3197704724 A---- D:\VirtualMachines\Virtual PC\big.bin
2008-02-14 16:07:25 1220875776 A---- D:\VirtualMachines\Virtual PC\Ubuntu 7.10 LAMP.vhd
2008-01-07 15:13:12 4088370176 A---- D:\VirtualMachines\Virtual PC\Windows 2000 P1.vhd
2007-12-24 23:04:45 27136 A---- D:\VirtualMachines\Virtual PC\Windows 2000 P2.doc
2008-07-04 15:37:25 198252032 A---- D:\VirtualMachines\Virtual PC\Windows 2000 P3.vhd
2008-07-05 20:31:27 3207351296 A---- D:\VirtualMachines\Virtual PC\Windows 2000 P4.vhd
2008-01-24 20:38:56 802379264 A---- D:\VirtualMachines\Virtual PC\Windows 2000 P5.vhd
2008-04-14 12:20:20 1156884480 A---- D:\VirtualMachines\Virtual PC\Windows 2000 P6.vhd
Here is the corresponding directory listing:
Directory of D:\VirtualMachines\Virtual PC
07/06/2008 03:39 PM <DIR> .
07/06/2008 03:39 PM <DIR> ..
07/06/2008 03:39 PM 123,456,789,012 big.bin
02/14/2008 04:07 PM 1,220,875,776 Ubuntu 7.10 LAMP.vhd
01/07/2008 03:13 PM 4,088,370,176 Windows 2000 P1.vhd
12/24/2007 11:04 PM 27,136 Windows 2000 P2.doc
07/04/2008 03:37 PM 4,493,219,328 Windows 2000 P3.vhd
07/05/2008 08:31 PM 3,207,351,296 Windows 2000 P4.vhd
01/24/2008 08:38 PM 5,097,346,560 Windows 2000 P5.vhd
04/14/2008 12:20 PM 5,451,851,776 Windows 2000 P6.vhd
8 File(s) 147,015,831,060 bytes
Clearly, SysFileTree is unable to handle files of 4GB or more. As
you can see, the file "big.bin" is not handled correctly, and neither
are the P3, P5 and P6 images: each reported size is the real size less
a multiple of 4,294,967,296 (2**32). Therefore, I expect fixed columns,
especially the size field, to change SOON and become variable, or at
the very least larger, in order to fix this bug. Terabytes are after
all nothing special anymore. Using subword should hopefully keep the
code working once the bug is fixed.
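For what it's worth, the wrong sizes in the listing above are consistent with the size being held in 32 bits: each one equals the DIR size modulo 2**32. A quick check of that arithmetic follows (this is only an observation from the numbers, not a statement about how SysFileTree is implemented):

```rexx
/* Sketch: compare the DIR sizes with the SysFileTree sizes modulo 2**32 */
numeric digits 20              /* the default of 9 digits is too few here */
wrap = 2**32                   /* 4294967296 */
say 123456789012 // wrap       /* 3197704724, as reported for big.bin */
say 4493219328 // wrap         /* 198252032, as reported for P3.vhd   */
say 4088370176 // wrap         /* 4088370176, under 2**32 so unchanged */
```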
We were thinking along the same lines. :)
> One thing that does seem fixed is the width of the 'size' column. Goodness
> only knows what will happen when files get so big that the column isn't wide
> enough.
Indeed it does shift eventually. I have encountered this at one time.
parse var stem.n ymd hms siz att . 1 (att) +(length(att)+2) fnm
The trick we discussed here earlier to do it in a single instruction is
to reparse the input up to the first word of the filename. In this
case, we know ATTRS will contain characters that can't occur in any of
the previous words (either a letter or two consecutive hyphens), so we
can streamline the reparsing and skip straight to it:
Parse Var stem.n date time size attrs filename . ,
0 (attrs) (filename) +0 filename
How the performance compares, as always, will depend on the platform.
¬R
That does not work correctly if the filename begins with one or more blanks
(which is perfectly valid on a Windows system), viz:
---------------------
/* parstest.rex */
/* "file" has 5 blanks at the start */
file='     abc.xyz'
string='2008-07-02 12:10:00 1000 ADHRS  '||file
/* following does not give the right answer */
parse var string date time size attrs filename . ,
0 (attrs) (filename) +0 filename
say 'file: "'filename'"'
/* following works correctly */
parse var string date time size attrs 0 (attrs) +7 filename
say 'file: "'filename'"'
---------------------
E:\>parstest
file: "abc.xyz"
file: "     abc.xyz"
-- from CyberSimian in the UK
--except for keeping extra stuff in ATTRS; you want a dot before the 0.
Actually, though, are there file systems that can have a blank at the
start of a *fully-qualified* file name? That's what SysFileTree returns.
¬R
Windows XP Professional:
C:\temp\blanks>dir
Volume in drive C is C: Drive
Volume Serial Number is 8017-F8A8
Directory of C:\temp\blanks
09/07/2008 08:34 <DIR> .
09/07/2008 08:34 <DIR> ..
09/07/2008 08:33 0 a file
09/07/2008 08:34 0 noblanks.file
2 File(s) 0 bytes
2 Dir(s) 39,151,828,992 bytes free
As I said earlier, having leading blanks in a filename is simply begging
for problems. I'd be the last person to fail such a request. :-)
The correct solution to this problem is to change SysFileTree to return only
one blank between the attributes and the filename; then a simple parse
provides the right answer, with leading blanks in the name preserved:
parse var string date time size attrs filename
However, compatibility considerations suggest that SysFileTree would not be
changed in that way (although it could be done via a new option that the
invoker specified). For the same reason, SysFileTree will not return MORE or
LESS attributes than it does at the moment (again unless controlled by a new
option), so the use of the explicit "7" in the parse is safe.
Can you explain the purpose of putting a dot before the "0"? I cannot see
that it is needed (but I am always happy to learn something new).
Examine the contents of ATTRS and you'll see. Without the dot, you're
setting ATTRS to everything that follows one blank after the third word.
¬R Plus meditandum, minus misculandum.
(Marty Shapiro, deftly translated by Sean Fitzpatrick)
Fine, but there's no leading blank in the *fully-qualified* file name
"C:\temp\blanks\ a file".
¬R Around here, the fun is always filled with blanks.
http://users.bestweb.net/~notr/arkville.html --Theresa Willis
I have now done that, and I see what you mean! Thanks for pointing out this
error.
> /* following works correctly */
> parse var string date time size attrs 0 (attrs) +7 filename
Unfortunately this seems "hard coded" to the platform on which the SysFileTree output was generated. For ooRexx on Linux, I needed:
parse var string date time size attrs . 0 (attrs) +13 filename
to get the filename correctly.
It was necessary to add a . after the attrs word to keep that entry clean.
It would be nice to automate the correct +? finding in some way.
How about the solution shown below. To work correctly, it requires that the
attributes (Windows or Linux) are always followed by TWO blanks, and the
parsing trigger is a string with two blanks (if you are viewing this post
using a proportionally-spaced font, the trigger might look like a single
blank, but it really is two).
-- from CyberSimian in the UK
-------------------------------------------------------
/* parstest.rex */
/* "file" has 5 blanks at the start */
file='     abc.xyz'
string='2008-07-02 12:10:00 1000 ADHRS  '||file
say 'SysFileTree data in Windows format using +7 as parsing trigger'
parse var string date time size attrs . 0 (attrs) +7 filename
say 'date: "'date'"'
say 'time: "'time'"'
say 'size: "'size'"'
say 'attrs: "'attrs'"'
say 'file: "'filename'"'
say 'SysFileTree data in Windows format using "  " as parsing trigger'
parse var string date time size attrs . 0 (attrs) '  ' filename
say 'date: "'date'"'
say 'time: "'time'"'
say 'size: "'size'"'
say 'attrs: "'attrs'"'
say 'file: "'filename'"'
say 'SysFileTree data in Linux format using "  " as parsing trigger'
string='2008-07-02 12:10:00 1000 drwxrwxrwx  '||file
parse var string date time size attrs . 0 (attrs) '  ' filename
say 'date: "'date'"'
say 'time: "'time'"'
say 'size: "'size'"'
say 'attrs: "'attrs'"'
say 'file: "'filename'"'
----------------------------------------------------------------
That works on Linux for me, so I assume it will also work on Windows.
Thanks! :-)
>The correct solution to this problem is to change SysFileTree to return
>only one blank between the attributes and the filename; then a simple
>parse provides the right answer, with leading blanks in the name
>preserved:
>parse var string date time size attrs filename
No; you still need to remove an extraneous leading blank.
--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>
Unsolicited bulk E-mail subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail. Reply to
domain Patriot dot net user shmuel+news to contact me. Do not
reply to spam...@library.lspace.org
The example below shows how simple things would be if the source for
SysFileTree were changed so that it returned only one blank between the attrs
and the filename. A simple parse returns the filename with significant leading
blanks preserved, and with no extraneous blanks.
-- from CyberSimian in the UK
------------------------------------
/* parstest.rex */
/* "file" has 5 blanks at the start */
file='     abc.xyz'
say 'SysFileTree modified to return one blank between attrs and filename'
string='2008-07-02 12:10:00 1000 drwxrwxrwx '||file
parse var string date time size attrs filename
say 'date: "'date'"'
say 'time: "'time'"'
say 'size: "'size'"'
say 'attrs: "'attrs'"'
say 'file: "'filename'"'
------------------------------------
>The example below shows how simple things would be if the source for
>SysFileTree were changed so that it returned only one blank between the
>attrs and the filename.
Would that break existing code?
Yes it would. That is why I suggested in a previous post that a new option on
SysFileTree would be necessary -- let's call it the "easy parse" option:
(1) If the invoking REXX program does not specify the "easy parse" option,
SysFileTree returns the data as it does today, i.e. TWO blanks between the
attributes and the first character of the filename.
(2) If the invoking REXX program specifies the "easy parse" option,
SysFileTree returns ONE blank between the attributes and the first character
of the filename.
I wonder if there are any systems where the attributes can collapse to
nothing/blanks? Probably not, as SysFileTree puts in a "-" for any that
are not set, but it was the first pitfall that occurred to me. That's
one benefit of fixed columns - it doesn't matter if one of them becomes
blank (as long as you are parsing them out using absolute positions).
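A minimal sketch of that point about absolute positions. The column layout here is invented for the example and is NOT the real SysFileTree layout (date in columns 1-10, time 12-19, size 21-30, attrs 33-37, name from column 40), and the attributes field is deliberately all blanks:

```rexx
/* Sketch only: invented fixed layout, not the real SysFileTree one */
name = ' a file'                        /* name with one leading blank */
line = left('2008-07-04',10) || ' ' || left('23:36:00',8) || ' ' || ,
       right(12,10) || '  ' || left('',5) || '  ' || name

/* Each field is picked out by start column and width, so a blank  */
/* attrs field cannot shift the fields that follow it.             */
parse var line 1 ymd +10 12 hms +8 21 siz +10 33 att +5 40 fnm
siz = strip(siz)                        /* size is right-aligned */

say 'size:  "'siz'"'                    /* size:  "12"      */
say 'attrs: "'att'"'                    /* attrs: "     "   */
say 'file:  "'fnm'"'                    /* file:  " a file" */
```

With a word-based template on the same line, the first word of the file name would end up in the attrs variable instead.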