I'm taking WordPerfect attachments, stripping the text out of them,
dumping them in a directory, and now I am trying to figure out how to
replace cr/lf with <p>cr/lf so that the resulting files can be Blosxom
(weblog) entries.
I have a tcsh script to do everything except for the replacement:
#!/bin/tcsh -v
# convert pissy attachments to text
cd /library/webserver/documents/communicator/
foreach f (*)
    strings "$f" > /library/webserver/documents/blosxom/"$f".txt
end
rm *.doc
and it struck me that awk would be the simplest way to do this,
perhaps.
I know I've got to do something with gsub:
{ gsub(/USA/, "United States"); print }
but at that point I'm stuck. I'll look for an online book, but if
anyone wants to throw me a bone that'd be cool :)
How about:
{ gsub(/USA/, "United States"); print $0 "<p>" }
which ought to do it.
HTH
--
Peter S Tillier
"Who needs perl when you can write dc and sokoban in sed?"
% I'm taking wordperfect attachments, stripping the text out of them,
% dumping them in a directory and now I am trying to figure out how to
% replace cr/lf with <p>cr/lf so that the resulting files can be Blosxom
% (weblog) entries.
You might be better off starting with, say, wp2latex rather than
using strings to get at the text.
I'm assuming from your use of tcsh that you're on a Unix system. Normally,
awk delimits records using lf under Unix, so cr will be the last character
on each record which has a cr/lf. You could deal with this a few ways. For
instance, you could examine the last character using substr():
{
    # substr($0, length($0)) is the last character of the record
    if (substr($0, length($0)) == "\r")
        print substr($0, 1, length($0) - 1) "<p>\r"
    else
        print
}
or you could make cr be the field separator, and test to see if the last
field on the line is empty, and if so stick <p> into the preceding field:
BEGIN { FS = OFS = "\r" }
NF > 1 && $NF == "" { $(NF-1) = $(NF-1) "<p>" }
{ print }
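A third possibility, offered only as a sketch: anchor a regular expression on the trailing cr and let sub() do the insertion. The function name add_p_tags is made up for illustration, and the sketch assumes an awk that accepts "\r" inside both regex and string literals, as POSIX awk does.

```shell
#!/bin/sh
# add_p_tags is a hypothetical helper name, not part of any existing tool.
# It reads text on stdin and, on every line that ends in cr, inserts <p>
# just before that cr. Lines without a trailing cr pass through unchanged.
add_p_tags() {
    awk '{ sub(/\r$/, "<p>\r"); print }'
}
```

It could be run as `add_p_tags < entry.txt > entry.out`, where both file names are placeholders.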
--
Patrick TJ McPhee
East York Canada
pt...@interlog.com
pt...@interlog.com (Patrick TJ McPhee) wrote in message news:<F5VZ9.2829$Sq1.1...@news.ca.inter.net>...
> { gsub(/USA/, "United States"); print }
Well, as for the awk question, you may find your answer in choosing
the output separators (ORS, OFS).
As for the geopolitical context, the assertion is of course false :D)
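To make the output-separator idea concrete, here is a minimal sketch under a few assumptions: the input uses cr/lf line endings, every line should get a <p>, and the name crlf_to_p is invented for this example. It strips the trailing cr from each record and lets ORS put <p> plus cr/lf back on output.

```shell
#!/bin/sh
# crlf_to_p is a hypothetical helper name used only for this sketch.
# print appends ORS after every record, so setting ORS to "<p>\r\n"
# ends each output line with <p> followed by cr/lf.
# The sub() call drops the cr that awk leaves at the end of each record
# when it splits cr/lf input on lf alone.
crlf_to_p() {
    awk 'BEGIN { ORS = "<p>\r\n" } { sub(/\r$/, ""); print }'
}
```

Note that, unlike a test for a trailing cr, this version also tags lines that arrive without one, which may or may not be what you want.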