Help a newbie with splitting files .....

LeungChiKin Randolph

Oct 12, 1994, 8:44:03 AM
Dear Experts,

Please excuse my ignorance.

I have a huge batch of files, each of them carrying two sets of data
separated by a mark '=========='. Now I want to split each of them
into two separate files. The content above the mark is stored in one file
while that below the mark is stored in another. I want to write a shell
script to do this. Could you offer me some good suggestions?

Thanks.

Best Regards.

Randolph.
--

o o o o o o . . . ======================== ==========================
o _____ ||RANDOLPH LEUNG | || h891...@hkuxb.hku.hk |
.][__n_n_|DD( ==^^____ || University of | || |
>(________|__|_[________]_||__________ Hong Kong |_||_______________________|_
_/oo OOOOO oo` oo oo 'o^o o^o` 'o^o o^o`
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


Kevin Woods

Oct 12, 1994, 1:38:24 PM
In article <CxK8p...@hkuxb.hku.hk>, h891...@hkuxb.hku.hk (LeungChiKin Randolph) writes:
: Dear Experts,

:
: Please excuse my ignorance.
:
: I have a huge batch of files, each of them carrying two sets of data
: separated by a mark '=========='. Now I want to split each of them
: into two separate files. The content above the mark is stored in one file
: while that below the mark is stored in another. I want to write a shell
: script to do this. Could you offer me some good suggestions?
:
: Thanks.
:
: Best Regards.
:
: Randolph.
: --

You didn't say which shell, but here's a Bourne-compatible script:

---------------------------cut---------------------------
#!/bin/sh

for file in "$*"; do
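  if [ ! -f "$file" ]; then
    echo "$0: $file: Not a regular file"
    exit 1
  fi

  # Line number of the first line made up only of '=' characters.
  # ('^==*$' needs at least one '=', so blank lines do not match;
  # sed -n with '1' prints only the first match, and only once.)
  stop=`grep -n '^==*$' "$file" | sed -n '1s/:.*//p'`

  stop=`expr $stop - 1`
  sed -n "1,${stop}p" $file > $file.1
  strt=`expr $stop + 2`
  sed -n "$strt"',$p' $file > $file.2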
done
---------------------------cut---------------------------

This takes a list of file names and first checks that each one is
a regular file. If any of them is not, the script exits and you'll
need to start over. Otherwise it splits each file and names the two
parts filename.1 and filename.2 respectively.

Hope this helps...

- Kevin

Kevin Woods

Oct 12, 1994, 1:55:12 PM
In article <37h6ug$c...@hpscit.sc.hp.com>, kev...@nafohq.hp.com (Kevin Woods) writes:
:
: for file in "$*"; do

Uh, change "$*" to "$@" ... Sorry, I did it on the fly...

- Kevin
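
[Editor's sketch, not part of Kevin's post.] The reason for the correction is that "$*" expands to a single word holding all the arguments, so the loop body runs once, while "$@" expands to one word per argument. The helper names count_star and count_at and the file names below are made up for the demonstration:

```sh
#!/bin/sh
# count_star and count_at are hypothetical helpers that report how many
# times a for loop runs over "$*" and over "$@".
count_star() { n=0; for f in "$*"; do n=$((n + 1)); done; echo "$n"; }
count_at()   { n=0; for f in "$@"; do n=$((n + 1)); done; echo "$n"; }

star=$(count_star a.txt b.txt "c d.txt")   # arguments joined into one word
at=$(count_at a.txt b.txt "c d.txt")       # one word per argument
echo "loop over \"\$*\" ran $star time(s); loop over \"\$@\" ran $at time(s)"
```

With "$@" a name containing a space, such as "c d.txt", also stays in one piece.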

David W. Tamkin

Oct 12, 1994, 11:03:23 PM
kev...@nafohq.hp.com (Kevin Woods) suggested to LeungChiKin Randolph in
<37h6ug$c...@hpscit.sc.hp.com>:

| stop=`expr $stop - 1`
| sed -n "1,${stop}p" $file > $file.1
| strt=`expr $stop + 2`
| sed -n "$strt"',$p' $file > $file.2

First, leave the original assignment of $stop unchanged, and do the above
with one call to sed and no further calls to expr:

sed "1,$stop !w $file.2
$stop,$ d" $file > $file.1

or, for those of you who prefer multiple -e's to embedded newlines:

sed -e "1,$stop !w $file.2" -e "$stop,$ d" $file > $file.1

In either format, be careful to leave a space before the `d'.

Second, if you have csplit, use it instead. This is its job.
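
[Editor's sketch, not part of David's post.] For readers who have not used csplit, a minimal example might look like the following. It assumes a csplit that accepts the POSIX options -s and -f and that mktemp is available; the scratch directory, the name sample.txt, and the prefix "part." are invented for the demonstration:

```sh
#!/bin/sh
# Build a small sample file in a scratch directory (names are made up).
dir=$(mktemp -d) || exit 1
printf 'a1\na2\n==========\nb1\nb2\n' > "$dir/sample.txt"

# -s suppresses the size report, -f sets the prefix of the output names.
# part.00 gets the lines before the mark; part.01 gets the mark and the rest.
csplit -s -f "$dir/part." "$dir/sample.txt" '/^==*$/'

mv "$dir/part.00" "$dir/sample.txt.1"
sed 1d "$dir/part.01" > "$dir/sample.txt.2"   # drop the mark line itself
rm "$dir/part.01"
```

Because csplit puts the matching line at the top of the second piece, the extra sed 1d is only needed when the mark should be discarded.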

Bernhard Rossboth

Oct 13, 1994, 4:41:41 AM
dat...@MCS.COM (David W. Tamkin) writes:
:
: First, leave the original assignment of $stop unchanged, and do the above

: with one call to sed and no further calls to expr:
:
There is no need at all to keep the complex calculation of $stop using
grep and then extracting the line number with sed. Sed can address lines
using a regex:

sed '1,/====/d' file > file.2
sed '/====/,$d' file > file.1

does the job already.

Barny :-{)
--
Rossboth Bernhard, Email:Bernhard...@aut.alcatel.at, Tel:+32-2-718-7051

a...@maths.nott.ac.uk

Oct 14, 1994, 3:19:00 PM
Subject: Re: Help a newbie with splitting files .....

I haven't seen the original problem, but if Bernhard...@aut.alcatel.at
has a correct solution in article <1994Oct13....@aaf.alcatel.at>:

> sed '1,/====/d' file > file.2
> sed '/====/,$d' file > file.1

then
sed '/====/,$ !w file.1
1,/====/d' file > file.2

is equivalent, but saves processing "file" (and invoking "sed") twice, and
thus can (if desired) be used as a filter. Notes: (a) the order of the
"sed" commands is significant; (b) the second "/====/" sort-of-ought to be
replaceable by just "//", but this doesn't work [at least on my computer].

--
Andy Walker, Maths Dept., Nott'm Univ., UK.
a...@maths.nott.ac.uk

David W. Tamkin

Oct 13, 1994, 11:37:44 PM
Bernhard...@aut.alcatel.at wrote in
<1994Oct13....@aaf.alcatel.at>:

| sed '1,/====/d' file > file.2
| sed '/====/,$d' file > file.1
|
| does the job already.

True; I was assuming that there was some concern about getting the right
occurrence of /^======*$/. (/====/ might not be enough, though, to make
sure you're breaking the text at the right point; another line might contain
four equal signs but also some other text.)
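
[Editor's sketch, not part of David's post.] The concern about an unanchored /====/ can be seen on a small made-up file whose first line happens to contain four equal signs among other text. The scratch directory and sample.txt are assumptions for the demonstration, and mktemp is assumed to be available:

```sh
#!/bin/sh
dir=$(mktemp -d) || exit 1
printf 'x ==== y\ndata\n==========\nmore\n' > "$dir/sample.txt"

# Count the lines each pattern would treat as the mark.
loose=$(grep -c '====' "$dir/sample.txt")       # unanchored pattern
strict=$(grep -c '^====*$' "$dir/sample.txt")   # whole line must be '=' signs

# What each pattern would leave in the "above the mark" file.
loose_top=$(sed '/====/,$d' "$dir/sample.txt")
strict_top=$(sed '/^====*$/,$d' "$dir/sample.txt")

echo "loose matches: $loose, strict matches: $strict"
```

Here the unanchored pattern cuts at the first line, so loose_top comes out empty, while the anchored one keeps both lines above the real mark.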

However, Bernhard's suggestion to use a regexp without the trouble of
determining a line number should be combined with mine to fork sed only
once:

sed '1,/^====*$/ !w file.2
/^====*$/,$ d' file > file.1
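
[Editor's sketch, not part of David's post.] Since the original question was about a whole batch of files, the single-sed approach could be wrapped in a loop along these lines. The function name split_all is hypothetical, the .1 and .2 suffixes follow Kevin's script, and file names containing newlines are assumed not to occur:

```sh
#!/bin/sh
# split_all is a hypothetical helper: for each FILE given, write the lines
# above the mark to FILE.1 and the lines below it to FILE.2.
split_all() {
  for file in "$@"; do
    if [ ! -f "$file" ]; then
      echo "$0: $file: not a regular file, skipped" >&2
      continue
    fi
    # First command: lines outside 1..mark are written to $file.2 by w.
    # Second command: the mark and everything after it is deleted, so only
    # the lines above the mark reach standard output.
    sed -e "1,/^====*\$/!w $file.2" -e '/^====*$/,$ d' "$file" > "$file.1"
  done
}
```

A call such as split_all data1 data2 would then leave data1.1, data1.2, data2.1 and data2.2 next to the originals.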

Brian Blackmore

Oct 16, 1994, 9:51:44 AM
LeungChiKin Randolph (h891...@hkuxb.hku.hk) wrote:
>Dear Experts,

>Please excuse my ignorance.

>I have a huge batch of files, each of them carrying two sets of data
>separated by a mark '=========='. Now I want to split each of them
>into two separate files. The content above the mark is stored in one file
>while that below the mark is stored in another. I want to write a shell
>script to do this. Could you offer me some good suggestions?

#!/usr/bin/perl
#
# Split a file into several files based on a ====== separator
#
@files = ("firstfile","secondfile");
for (@files) {
    open(FILE,">$_") || die;
    while(<>) {
        last if (/^=======/);
        print FILE;
    }
}
close(FILE);

--
Brian Blackmore.

Kevin Darcy

Oct 18, 1994, 7:09:54 PM
In article <CxK8p...@hkuxb.hku.hk>,

LeungChiKin Randolph <h891...@hkuxb.hku.hk> wrote:
>Dear Experts,
>
>Please excuse my ignorance.
>
>I have a huge batch of files, each of them carrying two sets of data
>separated by a mark '=========='. Now I want to split each of them
>into two separate files. The content above the mark is stored in one file
>while that below the mark is stored in another. I want to write a shell
>script to do this. Could you offer me some good suggestions?

awk 'NR == 1 { output = "output_file1" }
/^==========/ { output = "output_file2"; next }
{ print > output }' input_file

--------------------------------------------------------------------------------
ke...@cfc.com <-- (ASCII only please) | Kevin Darcy, UNIX Systems Admin (CFC)
ke...@tech.mis.cfc.com <-- (mute | Technical Services
Voice: (810) 759-7140 NeXTmail | Chrysler Corporation
Fax: (810) 758-8173 welcome) | Center Line, Michigan, MIS Complex
--------------------------------------------------------------------------------
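
[Editor's sketch, not part of Kevin Darcy's post.] The same awk idea might be stretched to a batch of files in one invocation by deriving the output names from FILENAME. The wrapper name split_with_awk and the .1/.2 suffixes are assumptions; close() is called so that a large batch does not exhaust open file descriptors:

```sh
#!/bin/sh
# split_with_awk is a hypothetical wrapper: each FILE given is split at its
# first line made up only of '=' characters into FILE.1 and FILE.2.
split_with_awk() {
  awk '
    # New input file: finish the previous output and start on FILE.1.
    FNR == 1         { if (out != "") close(out); out = FILENAME ".1"; seen = 0 }
    # First mark in this file: switch to FILE.2 and drop the mark line.
    !seen && /^==*$/ { close(out); out = FILENAME ".2"; seen = 1; next }
                     { print > out }
  ' "$@"
}
```

Invoked as split_with_awk data1 data2, this reads each input once. A part that would be empty is simply not created.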

Randal L. Schwartz

Oct 22, 1994, 7:21:29 PM
>>>>> "Brian" == Brian Blackmore <b...@gryphon.demon.co.uk> writes:

Brian> LeungChiKin Randolph (h891...@hkuxb.hku.hk) wrote:
>> Dear Experts,

>> Please excuse my ignorance.

>> I have a huge batch of files, each of them carrying two sets of data
>> separated by a mark '=========='. Now I want to split each of them
>> into two separate files. The content above the mark is stored in one file
>> while that below the mark is stored in another. I want to write a shell
>> script to do this. Could you offer me some good suggestions?

Brian> #!/usr/bin/perl
Brian> #
Brian> # Split a file into several files based on a ====== separator
Brian> #
Brian> @files = ("firstfile","secondfile");
Brian> for (@files) {
Brian> open(FILE,">$_") || die;
Brian> while(<>) {
Brian> last if (/^=======/);
Brian> print FILE;
Brian> }
Brian> }
Brian> close(FILE);

Well, if you're gonna show Perl... :-)

perl -pe 'open(STDOUT,">file".++$n) if /^=====/' somefile1 somefile2 ...

This'll put the ===== line as the first line of the new file. If
you wanna get rid of the line entirely, use:

perl -pe 'open(STDOUT,">file".++$n),$_ = "" if /^=====/' ...

Also, all lines before the first ==== go to standard out. Just redirect
it to "file0" if you wish.

print "Just another Perl hacker," # look ma, no perl5 in this posting. :-)
--
Name: Randal L. Schwartz / Stonehenge Consulting Services (503)777-0095
Keywords: Perl training, UNIX[tm] consulting, video production, skiing, flying
Email: <mer...@stonehenge.com> Snail: (Call) PGP-Key: (finger mer...@ora.com)
Phrase: "Welcome to Portland, Oregon ... home of the California Raisins!"

Randal L. Schwartz

Oct 23, 1994, 11:53:05 PM
>>>>> "Tony" == Tony Nugent <T.Nu...@sct.gu.edu.au> writes:

Tony> cat file | sed -e '/^==========/,$d' > part.1

Wow. This week's "Useless Use of Cat Award" is being handed out
on the first day of the week. I can rest now. :-)

Hint: whenever cat has one argument, or no arguments, it's not
*concatenating* anything, and can probably be removed.

In your case:

sed -e '/^==========/,$d' >part.1 <file

Just another UNIX hacker (since 1977, yes, 19*77*),
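
[Editor's sketch, not part of Randal's post.] The hint about cat can be checked on a small made-up file: feeding sed through a pipe from cat and redirecting the file into sed should produce the same text, with one process fewer in the second form. The scratch directory and sample.txt are assumptions, as is the availability of mktemp:

```sh
#!/bin/sh
dir=$(mktemp -d) || exit 1
printf 'a1\na2\n==========\nb1\n' > "$dir/sample.txt"

# Same sed command, fed two different ways.
with_cat=$(cat "$dir/sample.txt" | sed -e '/^==========/,$d')
without_cat=$(sed -e '/^==========/,$d' < "$dir/sample.txt")

[ "$with_cat" = "$without_cat" ] && echo "same output"
```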
