Please excuse my ignorance.
I have a huge batch of files, each of them carrying two sets of data
separated by a mark '=========='. Now I want to split each of them
into two separate files. The content above the mark is stored in one file
while that below the mark is stored in another. I want to write a shell
script to do this. Could you offer me some good suggestions?
Thanks.
Best Regards.
Randolph.
--
RANDOLPH LEUNG, University of Hong Kong
h891...@hkuxb.hku.hk
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
You didn't say which shell, but here's a Bourne-compatible version:
---------------------------cut---------------------------
#!/bin/sh
for file in "$*"; do
if [ ! -f $file ]; then
echo "$0: $file: Not a regular file"
exit 1
fi
stop=`grep -n '^==*$' "$file" | sed -e 's/:.*//' -e 1q`
stop=`expr $stop - 1`
sed -n "1,${stop}p" $file > $file.1
strt=`expr $stop + 2`
sed -n "$strt"',$p' $file > $file.2
done
---------------------------cut---------------------------
This takes a list of file names as arguments and checks that each
one is a regular file before splitting it. If any one of them is
not, the script exits at that point and you'll need to start over.
Otherwise it splits each file and names the two parts filename.1
and filename.2 respectively.
Hope this helps...
- Kevin
Uh, change "$*" to "$@" ... Sorry, I did it on the fly...
- Kevin
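The difference only shows up once there is more than one argument. A minimal sketch, assuming a Bourne-compatible shell (the variable names n_star and n_at and the sample arguments are made up for illustration):

```shell
#!/bin/sh
# Pretend the script was called with two arguments, one containing a space.
set -- "first file" second

# "$*" joins all arguments into ONE word, so the loop body runs once.
n_star=0
for f in "$*"; do n_star=`expr $n_star + 1`; done

# "$@" keeps each argument as its own word, so the loop runs once per file.
n_at=0
for f in "$@"; do n_at=`expr $n_at + 1`; done

echo "\"\$*\": $n_star iteration(s)   \"\$@\": $n_at iteration(s)"
```

With "$*" the script would try to open a single file whose name is all the arguments joined by spaces, which is why the -f test fails.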
| stop=`expr $stop - 1`
| sed -n "1,${stop}p" $file > $file.1
| strt=`expr $stop + 2`
| sed -n "$strt"',$p' $file > $file.2
First, leave the original assignment of $stop unchanged, and do the above
with one call to sed and no further calls to expr:
sed "1,$stop !w $file.2
$stop,$ d" $file > $file.1
or, for those of you who prefer multiple -e's to embedded newlines:
sed -e "1,$stop !w $file.2" -e "$stop,$ d" $file > $file.1
In either format, be careful to leave a space before the `d': inside
double quotes the shell would otherwise expand "$d" as a variable.
Second, if you have csplit, use it instead. This is its job.
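A rough sketch of the csplit route, assuming a POSIX csplit and a separator line made up only of '=' characters (the sample file csplit_in and the output names are made up; run it in a scratch directory):

```shell
#!/bin/sh
# Build a small sample file to split.
printf 'a1\na2\n==========\nb1\nb2\n' > csplit_in

# Split just before the first line consisting only of '=' characters.
# -s suppresses the byte counts, -f sets the prefix of the pieces.
# This leaves csplit_in.00 (above the mark) and csplit_in.01 (mark and below).
csplit -s -f csplit_in. csplit_in '/^==*$/'

# The mark is now the first line of the second piece; drop it.
mv csplit_in.00 csplit_in.1
sed 1d csplit_in.01 > csplit_in.2
rm csplit_in.01
```

csplit keeps the matching line at the top of the second piece, hence the extra "sed 1d".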
sed '1,/====/d' file > file.2
sed '/====/,$d' file > file.1
does the job already.
Barny :-{)
--
Rossboth Bernhard, Email:Bernhard...@aut.alcatel.at, Tel:+32-2-718-7051
I haven't seen the original problem, but if Bernhard...@aut.alcatel.at
has a correct solution in article <1994Oct13....@aaf.alcatel.at>:
> sed '1,/====/d' file > file.2
> sed '/====/,$d' file > file.1
then
sed '/====/,$ !w file.1
1,/====/d' file > file.2
is equivalent, but saves processing "file" (and invoking "sed") twice, and
thus can (if desired) be used as a filter. Notes: (a) the order of the
"sed" commands is significant; (b) the second "/====/" sort-of-ought to be
replaceable by just "//", but this doesn't work [at least on my computer].
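To illustrate the filter use and note (a), here is a small sketch that feeds made-up sample data through a pipe (the file names filt.* and rev.* are placeholders, not from the original articles):

```shell
#!/bin/sh
# Used as a filter: the top part goes to filt.1 via w, the rest to stdout.
printf 'a1\na2\n====\nb1\n' | sed '/====/,$ !w filt.1
1,/====/d' > filt.2

# Same two commands in the opposite order: d ends the cycle for the top
# lines before w is ever reached, so rev.1 no longer holds the top part.
printf 'a1\na2\n====\nb1\n' | sed '1,/====/d
/====/,$ !w rev.1' > rev.2

cmp -s filt.1 rev.1 || echo "order matters: filt.1 and rev.1 differ"
```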
--
Andy Walker, Maths Dept., Nott'm Univ., UK.
a...@maths.nott.ac.uk
| sed '1,/====/d' file > file.2
| sed '/====/,$d' file > file.1
|
| does the job already.
True; I was assuming that there was some concern about getting the right
occurrence of /^======*$/. (/====/ might not be enough, though, to make
sure you're breaking the text at the right point; another line might contain
four equal signs but also some other text.)
However, Bernhard's suggestion to use a regexp without the trouble of
determining a line number should be combined with mine to fork sed only
once:
sed '1,/^====*$/ !w file.2
/^====*$/,$ d' file > file.1
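A quick way to see what the anchoring buys you, using a made-up sample in which an earlier line merely contains four equal signs (anch_in, loose.* and tight.* are placeholder names; run in a scratch directory):

```shell
#!/bin/sh
# Line 2 contains "====" plus other text; line 3 is the real mark.
printf 'a1\nx ==== y\n==========\nb1\n' > anch_in

# Unanchored pattern: the split happens one line too early, at line 2.
sed '1,/====/ !w loose.2
/====/,$ d' anch_in > loose.1

# Anchored pattern: only a line consisting entirely of '=' is the mark.
sed '1,/^====*$/ !w tight.2
/^====*$/,$ d' anch_in > tight.1

echo "loose.1:"; cat loose.1
echo "tight.1:"; cat tight.1
```

In the unanchored run the real mark line also ends up at the top of loose.2.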
>Please excuse my ignorance.
>I have a huge batch of files, each of them carrying two sets of data
>separated by a mark '=========='. Now I want to split each of them
>into two separate files. The content above the mark is stored in one file
>while that below the mark is stored in another. I want to write a shell
>script to do this. Could you offer me some good suggestions?
#!/usr/bin/perl
#
# Split a file into several files based on a ====== separator
#
@files = ("firstfile","secondfile");
for (@files) {
open(FILE,">$_") || die;
while(<>) {
last if (/^=======/);
print FILE;
}
}
close(FILE);
--
Brian Blackmore.
awk 'NR == 1 { output = "output_file1" }
/^==========/ { output = "output_file2"; next }
{ print > output }' input_file
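Since the original question was about a whole batch of files, the same idea can be wrapped in a loop. This is only a sketch: the sample files batch_a and batch_b and the ".1"/".2" naming are assumptions, and the output name is built from awk's FILENAME rather than hard-coded.

```shell
#!/bin/sh
# Two made-up input files; run in a scratch directory.
printf 'a1\n==========\nb1\n' > batch_a
printf 'c1\nc2\n==========\nd1\n' > batch_b

for f in batch_a batch_b; do
    # part starts at 1 and switches to 2 at the mark, which is skipped.
    awk 'BEGIN { part = 1 }
         /^==========/ { part = 2; next }
         { print > (FILENAME "." part) }' "$f"
done
```

The file names are listed explicitly instead of using a glob such as batch_*, which would pick up the generated .1 and .2 files on a second run.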
--------------------------------------------------------------------------------
ke...@cfc.com <-- (ASCII only please) | Kevin Darcy, UNIX Systems Admin (CFC)
ke...@tech.mis.cfc.com <-- (mute | Technical Services
Voice: (810) 759-7140 NeXTmail | Chrysler Corporation
Fax: (810) 758-8173 welcome) | Center Line, Michigan, MIS Complex
--------------------------------------------------------------------------------
Brian> LeungChiKin Randolph (h891...@hkuxb.hku.hk) wrote:
>> Dear Experts,
>> Please excuse my ignorance.
>> I have a huge batch of files, each of them carrying two sets of data
>> separated by a mark '=========='. Now I want to split each of them
>> into two separate files. The content above the mark is stored in one file
>> while that below the mark is stored in another. I want to write a shell
>> script to do this. Could you offer me some good suggestions?
Brian> #!/usr/bin/perl
Brian> #
Brian> # Split a file into several files based on a ====== separator
Brian> #
Brian> @files = ("firstfile","secondfile");
Brian> for (@files) {
Brian> open(FILE,">$_") || die;
Brian> while(<>) {
Brian> last if (/^=======/);
Brian> print FILE;
Brian> }
Brian> }
Brian> close(FILE);
Well, if you're gonna show Perl... :-)
perl -pe 'open(STDOUT,">file".++$n) if /^=====/' somefile1 somefile2 ...
This'll put the ===== line as the first line of the new file. If
you wanna get rid of the line entirely, use:
perl -pe 'open(STDOUT,">file".++$n),$_ = "" if /^=====/' ...
Also, all lines before the first ==== go to standard out. Just redirect
it to "file0" if you wish.
print "Just another Perl hacker," # look ma, no perl5 in this posting. :-)
--
Name: Randal L. Schwartz / Stonehenge Consulting Services (503)777-0095
Keywords: Perl training, UNIX[tm] consulting, video production, skiing, flying
Email: <mer...@stonehenge.com> Snail: (Call) PGP-Key: (finger mer...@ora.com)
Phrase: "Welcome to Portland, Oregon ... home of the California Raisins!"
Tony> cat file | sed -e '/^==========/,$d' > part.1
Wow. This week's "Useless Use of Cat Award" is being handed out
on the first day of the week. I can rest now. :-)
Hint: whenever cat has one argument, or no arguments, it's not
*concatenating* anything, and can probably be removed.
In your case:
sed -e '/^==========/,$d' >part.1 <file
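If you want to convince yourself the two forms are equivalent, here is a small check on made-up data (uuoc_in and with_cat are placeholder names):

```shell
#!/bin/sh
printf 'a1\na2\n==========\nb1\n' > uuoc_in

# With the extra process: cat copies the file into a pipe for sed to read.
cat uuoc_in | sed -e '/^==========/,$d' > with_cat

# Without it: the shell opens the file directly as sed's standard input.
sed -e '/^==========/,$d' < uuoc_in > part.1

cmp with_cat part.1 && echo "identical output, one process fewer"
```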
Just another UNIX hacker (since 1977, yes, 19*77*),