
deleting the last line of a text file


Donna Jean Kilpatrick

Apr 24, 2001, 11:07:03 AM
Can anyone tell me the easiest way to delete the last line of a text
file?
I'm altering a C program that breaks up a very large text file into
smaller files of a specified size.
The first and last line of data in each of the smaller files are useless
since the values normally get chopped in half. I wanted to just discard
them but I can't seem to find a way to locate the end of file and delete
the last line. Any suggestions?
Thanks
Andy


#include <iostream.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <malloc.h>
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

int main(int argc, char *argv[])
{
char splitfn[280];
long part = 0, chunk = 100 * 1024;
FILE *fp, *op;
char *fb;
char *ofn = "StdIn";
int char_in_line= 70;
char * lineptr= NULL;
char * lineptr2= NULL;
long small_chunk = 80;

fprintf(stderr, "hello there: splits [<filename>|-] [chunk(Kb)]\n");
if (argc < 2 || argc > 3) {
fprintf(stderr, "Usage: splits [<filename>|-] [chunk(Kb)]\n");
return 2;
}

if (argc > 2) {
long lch = atol(argv[2]);

if (lch < 1) {
fprintf(stderr, "Invalid chunk size.\n");
return 2;
}
chunk = lch * 1024L;
}

fb = malloc((unsigned) chunk);
if (fb == NULL) {
fprintf(stderr, "Unable to allocate %ld byte I/O buffer.\n",
chunk);
return 1;
}
if (strcmp(argv[1], "-") != 0) {
ofn = argv[1];
fp = fopen(argv[1], "r+");
if (fp == NULL) {
fprintf(stderr, "Cannot open file %s\n", argv[1]);
return 2;
}
} else {

fp = stdin;

}

while (1) {
long fl;

lineptr = malloc((unsigned) small_chunk);

// Delete the first line from the file fp

fgets(lineptr, char_in_line, fp);

free(lineptr);
lineptr = NULL;

fl = fread(fb, 1, (int) chunk, fp);
if (fl > 0) {
// This sets the filename of the output file
sprintf(splitfn, "%s.%03ld", ofn, ++part);
// This opens the output file for writing
op = fopen(splitfn, "w");
if (op == NULL) {
fprintf(stderr, "Cannot create output file %s\n",
splitfn);
return 2;
}


fprintf(op,"A%ld = [ \n",part -1); /* part is a long, so %ld rather than %i */
// Write from fb to op one chunk of data
fwrite(fb, 1, (int) fl, op);

// fseek(op, 10, SEEK_END);

//lineptr2 = malloc((unsigned) small_chunk);

// Delete the last line from the file op

//fgets(lineptr2, char_in_line, op);

//free(lineptr2);
//lineptr2 = NULL;

// fseek(op, 10, SEEK_END);

fprintf(op,"\n ];");

fclose(op);
} else {
break;
}
}

if (fp != stdin) {
fclose(fp);
}
return 0;
}


Steve Fosdick

Apr 24, 2001, 12:38:03 PM
In article <3AE59697...@nrcan.gc.ca>,

Donna Jean Kilpatrick <dkil...@nrcan.gc.ca> wrote:

> Can anyone tell me the easiest way to delete the last line of a text
> file?

Copy the file to a new file, reading and writing one line at a time and
writing one line behind what you are reading.

Here is an example which copies from stdin to stdout and has
a length limit - you can wrap this with other code as necessary
and introduce dynamic memory allocation for long lines if you
need to.

#include <stdio.h>

#define LINE_SIZE 255

int main(int argc, char *argv[])
{

char buf1[LINE_SIZE + 1];
char buf2[LINE_SIZE + 1];
char *current_ptr;
char *other_ptr;
char *temp_ptr;

if (fgets(buf1, LINE_SIZE, stdin) == NULL)
{
fputs("input file is empty\n", stderr);
return 1;
}
other_ptr = buf1;
current_ptr = buf2;
while (fgets(current_ptr, LINE_SIZE, stdin) != NULL)
{
temp_ptr = current_ptr; /* swap buffers */
current_ptr = other_ptr;
other_ptr = temp_ptr;
fputs(current_ptr, stdout);
}
return 0;
}

Mal Kay

Apr 24, 2001, 12:32:15 PM
Donna Jean Kilpatrick wrote:
>
> Can anyone tell me the easiest way to delete the last line of a text
> file?
> I'm altering a C program that breaks up a very large text file into
> smaller files of a specified size.
> The first and last line of data in each of the smaller files are useless
> since the values normally get chopped in half. I wanted to just discard
> them but I can't seem to find a way to locate the end of file and delete
> the last line. Any suggestions?

When you say values get chopped in half I take it that you mean text
lines are broken with part in one file and the rest in the next.

Within the confines of standard C there is no means of deleting parts
of a file. You could however copy the file without the parts to be
deleted.

But as I understand it your need would be satisfied by breaking the
original file only at the ends of text lines.

Try using fgets() to read the original a line at a time, determine the
line length using say strlen() and when the total exceeds your maximum
chunk size switch to a new output file before writing the current line.
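
A minimal sketch of the line-at-a-time split described above. The function
name split_by_lines, the SPLIT_LINE_SIZE buffer size and the "<base>.NNN"
naming (borrowed from the original program) are assumptions made for
illustration. It counts characters as seen through the stream, so the
CR-LF caveat below still applies.

```c
#include <stdio.h>
#include <string.h>

#define SPLIT_LINE_SIZE 1024   /* assumed line buffer size for this sketch */

/*
 * Copy `in` into files "<base>.001", "<base>.002", ... so that each file
 * holds whole lines only and stays at or under `chunk` characters.
 * A single line longer than `chunk` still goes into a file of its own.
 * Returns the number of files written, or -1 on error.
 */
long split_by_lines(FILE *in, const char *base, long chunk)
{
    char line[SPLIT_LINE_SIZE];
    char name[FILENAME_MAX];
    FILE *out = NULL;
    long part = 0, used = 0;

    if (strlen(base) + 24 > sizeof name)
        return -1;                      /* no room for the ".NNN" suffix */

    while (fgets(line, sizeof line, in) != NULL) {
        long len = (long) strlen(line);

        /* Switch files before writing a line that would pass the limit. */
        if (out == NULL || (used > 0 && used + len > chunk)) {
            if (out != NULL && fclose(out) != 0)
                return -1;
            sprintf(name, "%s.%03ld", base, ++part);
            out = fopen(name, "w");
            if (out == NULL)
                return -1;
            used = 0;
        }
        if (fputs(line, out) == EOF) {
            fclose(out);
            return -1;
        }
        used += len;
    }
    if (out != NULL && fclose(out) != 0)
        return -1;
    return part;
}
```

Lines longer than the buffer arrive from fgets() in pieces, so in this
sketch such a line could still be divided between two output files.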

On a WIN32 system the internal size with files opened in text mode
won't quite match the file size seen by the system, as end-of-lines are
stored internally as a single character but as a CR-LF pair in the file.
If you know the system to be one that performs this way you can carry
out a correction by adding one for each line read and written.

Complications can arise if any lines are longer than the buffer used
with fgets() as it will then only read a part line. This can be detected
by checking for '\n' immediately prior to the terminating '\0' (or rather
the absence of '\n'). The next fgets() will read the rest of the line
assuming that the rest is not still too long.
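
The partial-line check can be sketched as below. The helper name
read_line_piece and its return convention are hypothetical, chosen only
to show the test for '\n' just before the terminating '\0'.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read one piece of a line into buf with fgets().
 * Returns -1 at end of file or on error, 1 if buf ends a line
 * (a '\n' sits just before the '\0'), and 0 if the line was longer
 * than the buffer, so more of it remains to be read.
 * A final line with no trailing newline also reports 0.
 */
int read_line_piece(char *buf, int size, FILE *fp)
{
    size_t len;

    if (fgets(buf, size, fp) == NULL)
        return -1;
    len = strlen(buf);
    return len > 0 && buf[len - 1] == '\n';
}
```

A caller would keep calling it while it returns 0 and treat the pieces
as one logical line.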

By the way
#include <iostream.h> is not part of C but C++ - you don't need it here.
#include <malloc.h> is historic; the declarations for malloc are included
with <stdlib.h>
And even with WIN32
#include <fcntl.h> should not be needed with standard C library functions
#include <io.h> should not be needed with standard C library functions

Malcolm Kay

David Rubin

Apr 24, 2001, 8:58:45 AM
Donna Jean Kilpatrick wrote:
>
> Can anyone tell me the easiest way to delete the last line of a text
> file?
> I'm altering a C program that breaks up a very large text file into
> smaller files of a specified size.
> The first and last line of data in each of the smaller files are useless
> since the values normally get chopped in half. I wanted to just discard
> them but I can't seem to find a way to locate the end of file and delete
> the last line. Any suggestions?


If your file is big and your solution is general (e.g., delete the last n lines)
you might want to read the file twice: count the lines first in nlines, and then
print nlines - n lines on the second iteration. Otherwise, you have to buffer n
lines.
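
A rough sketch of the two-pass idea, assuming the input is a regular
file that can be rewound (so not a pipe). The name
copy_without_last_lines is made up for this example.

```c
#include <stdio.h>

/*
 * Copy `in` to `out`, leaving off the last n lines.
 * `in` must be rewindable. Returns 0 on success, -1 on an I/O error.
 */
int copy_without_last_lines(FILE *in, FILE *out, long n)
{
    long nlines = 0, keep, seen = 0;
    int c, last = '\n';

    /* Pass 1: count lines; an unterminated final line counts as one. */
    while ((c = getc(in)) != EOF) {
        if (c == '\n')
            nlines++;
        last = c;
    }
    if (ferror(in))
        return -1;
    if (last != '\n')
        nlines++;

    keep = nlines > n ? nlines - n : 0;

    /* Pass 2: copy characters until `keep` newlines have gone by. */
    rewind(in);
    while (seen < keep && (c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            return -1;
        if (c == '\n')
            seen++;
    }
    return ferror(in) ? -1 : 0;
}
```

With n = 1 this drops only the last line; the cost is reading the input
twice instead of buffering lines.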

david

--
If 91 were prime, it would be a counterexample to your conjecture.
-- Bruce Wheeler

Kees van der Bent

Apr 24, 2001, 3:01:20 PM
Donna Jean Kilpatrick wrote:

> Can anyone tell me the easiest way to delete the last line of a text
> file?

Use fseek() to search backwards from the end until you're past the
second line delimiter (assuming the last line you want to delete is
properly delimited, otherwise look for the first).

Then get the file position using ftell().

After a rewind(), finally copy <file position> characters from the
original file to the new file using fgetc() and fputc() in a simple
for loop.

(Maybe you might want to use tmpnam() to create the name of
the temporary result file and use rename() to overwrite the original.)

This solution may not be the most efficient, but I think it clearly shows
what you're doing and shouldn't need many lines of code.
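
One way the steps above might look, as a sketch only. It assumes the
file is opened as a binary stream ("rb"), that lines end in '\n', and
that the implementation supports seeking to SEEK_END on that stream;
none of this is guaranteed everywhere. The names last_line_offset and
copy_prefix are illustrative.

```c
#include <stdio.h>

/*
 * Return the offset at which the last line of `fp` begins, or -1L on
 * error. A '\n' at the very end of the file is taken as the terminator
 * of the last line, not as an empty line after it.
 */
long last_line_offset(FILE *fp)
{
    long pos;
    int c;

    if (fseek(fp, 0L, SEEK_END) != 0 || (pos = ftell(fp)) < 0)
        return -1L;
    if (pos == 0)
        return 0L;                  /* empty file */

    pos--;                          /* step over the final character */
    while (pos > 0) {
        /* Look at the character just before offset `pos`. */
        if (fseek(fp, pos - 1, SEEK_SET) != 0 || (c = getc(fp)) == EOF)
            return -1L;
        if (c == '\n')
            return pos;             /* last line starts after this newline */
        pos--;
    }
    return 0L;                      /* the file holds a single line */
}

/* Copy the first `count` characters of `in` to `out`, one at a time. */
int copy_prefix(FILE *in, FILE *out, long count)
{
    long i;
    int c;

    rewind(in);
    for (i = 0; i < count; i++) {
        if ((c = getc(in)) == EOF || putc(c, out) == EOF)
            return -1;
    }
    return 0;
}
```

Calling copy_prefix(in, out, last_line_offset(in)) writes everything
except the last line to the new file.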

All the best,
Kees

Dan Pop

Apr 24, 2001, 3:37:39 PM

>By the way
>#include <iostream.h> is not part of C but C++ - you don't need it here.
>#include <malloc.h> is historic; the declarations for malloc are

It's not historic, it merely serves a different purpose than declaring
malloc and friends (despite its misleading name) on Unix systems.

This is what it looks like on the system I'm typing this post on:

#ifndef _MALLOC_H_
#define _MALLOC_H_
/*
Constants defining mallopt operations
*/
#define M_MXFAST 1 /* set size of blocks to be fast */
#define M_NLBLKS 2 /* set number of block in a holding block */
#define M_GRAIN 3 /* set number of sizes mapped to one, for
small blocks */
#define M_KEEP 4 /* retain contents of block after a free until
another allocation */
/*
structure filled by
*/
struct mallinfo {
int arena; /* total space in arena */
int ordblks; /* number of ordinary blocks */
int smblks; /* number of small blocks */
int hblks; /* number of holding blocks */
int hblkhd; /* space in holding block headers */
int usmblks; /* space in small blocks in use */
int fsmblks; /* space in free small blocks */
int uordblks; /* space in ordinary blocks in use */
int fordblks; /* space in free ordinary blocks */
int keepcost; /* cost of enabling keep option */
};

#ifdef _NO_PROTO
extern int mallopt();
extern struct mallinfo mallinfo();
#else /*_NO_PROTO */
#ifdef __cplusplus
extern "C" {
#endif
extern int mallopt(int, int);
extern struct mallinfo mallinfo(void);
#ifdef __cplusplus
}
#endif
#endif /*_NO_PROTO */

#endif /* _MALLOC_H_ */

Dan
--
Dan Pop
CERN, IT Division
Email: Dan...@cern.ch
Mail: CERN - IT, Bat. 31 1-014, CH-1211 Geneve 23, Switzerland

Dann Corbit

Apr 24, 2001, 3:19:15 PM
"Kees van der Bent" <kvd...@mail.com> wrote in message
news:3AE5CD80...@mail.com...


Some related FAQ's:

19.13: How can a file be shortened in-place without completely clearing
or rewriting it?

A: BSD systems provide ftruncate(), several others supply chsize(),
and a few may provide a (possibly undocumented) fcntl option
F_FREESP. Under MS-DOS, you can sometimes use write(fd, "", 0).
However, there is no portable solution, nor a way to delete
blocks at the beginning. See also question 19.14.

19.14: How can I insert or delete a line (or record) in the middle of a
file?

A: Short of rewriting the file, you probably can't. The usual
solution is simply to rewrite the file. (Instead of deleting
records, you might consider simply marking them as deleted, to
avoid rewriting.) Another possibility, of course, is to use a
database instead of a flat file. See also questions 12.30 and
19.13.
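
As an illustration of FAQ 19.13, here is a sketch that shortens a file
in place with ftruncate(). It is POSIX-only, not standard C, and assumes
'\n' line endings; the function name drop_last_line is hypothetical.

```c
#define _POSIX_C_SOURCE 200112L     /* for fileno() and ftruncate() */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Remove the last line of the named file in place.
 * Returns 0 on success, -1 on error.
 */
int drop_last_line(const char *path)
{
    FILE *fp = fopen(path, "rb+");
    off_t offset = 0, last_start = 0;
    int c, prev = '\n', status;

    if (fp == NULL)
        return -1;

    /* Walk forward, remembering where the most recent line began. */
    while ((c = getc(fp)) != EOF) {
        if (prev == '\n')
            last_start = offset;
        offset++;
        prev = c;
    }
    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }

    /* Cut the file off at the start of its last line. */
    status = ftruncate(fileno(fp), last_start);
    if (fclose(fp) != 0)
        status = -1;
    return status;
}
```

On systems without ftruncate() the copy-and-replace approach is still
the fallback.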

<OT>
Efficient:
Store the text file in a database, with a unique clustered index on 'line
number' and delete the row with the highest line number.

If there are frequent operations on the file, it is certainly a good way to do
it.
</OT>
--
C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html
"The C-FAQ Book" ISBN 0-201-84519-9
C.A.P. FAQ: ftp://cap.connx.com/pub/Chess%20Analysis%20Project%20FAQ.htm


Dan Pop

Apr 24, 2001, 3:56:24 PM

This solution is fundamentally flawed. To *reliably* detect the line
delimiters, you have to open the file in text mode. Once you do that,
you can no longer use fseek with arbitrary offsets and the file
position returned by ftell has no special meaning.

For your edification:

4.9.9.2 The fseek function

Synopsis

#include <stdio.h>
int fseek(FILE *stream, long int offset, int whence);

Description

The fseek function sets the file position indicator for the stream
pointed to by stream .

For a binary stream, the new position, measured in characters from
the beginning of the file, is obtained by adding offset to the
position specified by whence. The specified point is the beginning
of the file for SEEK_SET, the current value of the file position
indicator for SEEK_CUR, or end-of-file for SEEK_END. A binary
stream need not meaningfully support fseek calls with a whence value
of SEEK_END.

For a text stream, either offset shall be zero, or offset shall be
a value returned by an earlier call to the ftell function on the same
stream and whence shall be SEEK_SET.

A successful call to the fseek function clears the end-of-file
indicator for the stream and undoes any effects of the ungetc function
on the same stream. After an fseek call, the next operation on an
update stream may be either input or output.

Returns

The fseek function returns nonzero only for a request that cannot
be satisfied.

...

4.9.9.4 The ftell function

Synopsis

#include <stdio.h>
long int ftell(FILE *stream);

Description

The ftell function obtains the current value of the file position
indicator for the stream pointed to by stream . For a binary stream,
the value is the number of characters from the beginning of the file.
For a text stream, its file position indicator contains unspecified
information, usable by the fseek function for returning the file
position indicator for the stream to its position at the time of the
ftell call; the difference between two such return values is not
necessarily a meaningful measure of the number of characters written
or read.

Returns

If successful, the ftell function returns the current value of the
file position indicator for the stream. On failure, the ftell
function returns -1L and stores an implementation-defined positive
value in errno.

Robert B. Clark

Apr 24, 2001, 5:40:17 PM
On Tue, 24 Apr 2001 11:07:03 -0400, Donna Jean Kilpatrick
<dkil...@nrcan.gc.ca> wrote:

>Can anyone tell me the easiest way to delete the last line of a text
>file?
>I'm altering a C program that breaks up a very large text file into
>smaller files of a specified size.
>The first and last line of data in each of the smaller files are useless
>since the values normally get chopped in half. I wanted to just discard
>them but I can't seem to find a way to locate the end of file and delete
>the last line. Any suggestions?

<snip code>

You might be better off to rewrite the code.

A better approach would be to tally the size of each line read from the
source file as you write it out. If the accumulated total is greater than
your size limit, close the current output file and create a new one before
writing the line.

Of course, this assumes that it is permissible to have split files that may
be smaller than your maximum segment size. Note that they can't be greater
unless a line in the source file is larger than the specified segment size.

If this approach is unsuitable, read and write the source file on a
character basis. Instead of blindly writing the characters immediately as
they are read, store them in a buffer. If you reach a newline in the
source stream, and if the length of the line buffer does not exceed your
accumulated segment size, write the line out. Otherwise, discard the line
and close the current segment output file.

Both approaches are basically the same, but with different granularity.

If you absolutely must be able to clip the last line from an extant text
file, you'll have to first read the file and count the newlines. On a
subsequent pass, read the source file and write each line up to line n-1 to
a temp file. Close the files, then replace the source file with the
temporary one.

--
Robert B. Clark
Visit ClarkWehyr Enterprises On-Line at http://home.earthlink.net/~rclark31/ClarkWehyr.html

Kees van der Bent

Apr 24, 2001, 7:13:58 PM
Dan Pop wrote:

> >This solution may not be the most efficient, but I think it clearly shows
> >what you're doing and shouldn't need many lines of code.
>
> This solution is fundamentally flawed. To *reliably* detect the line
> delimiters, you have to open the file in text mode.

With 'reliably', do you mean 'portable', covering the fact that line
delimiters are platform dependent?

Mike Copeland

Apr 24, 2001, 8:13:38 PM
> > Can anyone tell me the easiest way to delete the last line of a text
> > file?
> > I'm altering a C program that breaks up a very large text file into
> > smaller files of a specified size.
> > The first and last line of data in each of the smaller files are useless
> > since the values normally get chopped in half. I wanted to just discard
> > them but I can't seem to find a way to locate the end of file and delete
> > the last line. Any suggestions?
>
>
> If your file is big and your solution is general (e.g., delete the last n lines)
> you might want to read the file twice: count the lines first in nlines, and then
> print nlines - n lines on the second iteration. Otherwise, you have to buffer n
> lines.

If it's not a general solution for "delete n lines" (and deleting the
last line is the only requirement), I'd read each line, saving it for
sending to the output until I read another line. That way, when I reach
EOF, I just don't write out the (last) saved line. One pass through the
data... 8<}}

Mal Kay

Apr 27, 2001, 7:32:49 AM

My recollection is that prior to standard C, the header file
<stdlib.h> did not exist (or at least was uncommon) and the
declarations for malloc, calloc and realloc were in <malloc.h>.
It appears that the use of <malloc.h> may have been redefined
(after all it is not part of the standard) in association with
fine control of malloc and friends. I have not come across
this before.

On my system <malloc.h> reads:

#if __GNUC__
#warning "this file includes <malloc.h> which is deprecated, use <stdlib.h> instead"
#endif

#include <stdlib.h>

Joe Wright

Apr 30, 2001, 7:32:52 PM
Without doubting anything you wrote or quoted here for the moment, I
have used ftell() on text files to 'tell' me where the next line
starts. I subsequently use that value to fseek() to the desired line.

First, the 'first' line starts at 0L by definition. Now reading the
file with fgetc() or fgets() until '\n', I call ftell() to mark the
position in the file following the '\n' which is the beginning of the
'next' line. I continue to EOF 'remembering' the ftell() values for
each line. I can now fseek() to any particular line in the file. In
fact I can sort the 'line pointers' against the lines in the file and
print or re-write the file in lexical order. Is this allowed by the
Standard or am I just lucky?
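
The technique described here might be sketched as follows. The names
index_lines and goto_line are invented for the example, and it asks
ftell() for the first line's position as well rather than assuming 0L.

```c
#include <stdio.h>

/*
 * Record in starts[] the ftell() value at the beginning of each line of
 * `fp`, up to `max` lines. Returns the number of lines indexed, or -1
 * on error.
 */
long index_lines(FILE *fp, long starts[], long max)
{
    long n = 0, pos;
    int c;

    rewind(fp);
    for (;;) {
        pos = ftell(fp);            /* position before reading the line */
        if (pos < 0)
            return -1;
        if ((c = getc(fp)) == EOF || n == max)
            break;                  /* no further line, or index is full */
        starts[n++] = pos;
        while (c != '\n' && c != EOF)
            c = getc(fp);           /* skip to the end of this line */
    }
    return ferror(fp) ? -1 : n;
}

/* Seek to line `i` (0-based) using a value ftell() returned earlier. */
int goto_line(FILE *fp, const long starts[], long i)
{
    return fseek(fp, starts[i], SEEK_SET);
}
```

Only values handed back by ftell() are passed to fseek() with SEEK_SET,
which is the one form of non-zero offset the quoted text of the Standard
permits on a text stream.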
--
Joe Wright mailto:joeww...@earthlink.net
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

David Thompson

May 7, 2001, 12:41:39 AM
Joe Wright <joeww...@earthlink.net> wrote :
...

> Without doubting anything you wrote or quoted here for the moment, I
> have used ftell() on text files to 'tell' me where the next line
> starts. I subsequently use that value to fseek() to the desired line.
> ... I can sort the 'line pointers' against the lines in the file and

> print or re-write the file in lexical order. Is this allowed by the
> Standard or am I just lucky?

Yes, it is allowed. You can always validly fseek() to a value previously
obtained from ftell() (if the underlying file hasn't changed).
What you can't _portably_ do for text files, but can for
binary files, is arithmetically compute a seek position:
for a binary file, fseek(500) is exactly 200 bytes past
fseek(300) if it works at all, but on a text file there is
no guarantee what value, if any, does that. (On Unixy
systems, text files do work the same as binary.)

The one caveat is that on systems that now support files
with sizes larger than can be represented in 'long'
(often only 32 bits) fseek()/ftell() are no longer sufficient.
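
One standard alternative for remembering a position without going
through 'long' is fgetpos()/fsetpos(), which store it in an opaque
fpos_t. This is offered as an assumption about what would help here,
not something proposed in the thread; peek_line is a made-up helper.

```c
#include <stdio.h>

/*
 * Read the next line into buf without consuming it: the stream is put
 * back where it was before the call. Returns 0 on success, -1 on end
 * of file or error.
 */
int peek_line(FILE *fp, char *buf, int size)
{
    fpos_t mark;

    if (fgetpos(fp, &mark) != 0)        /* remember the current position */
        return -1;
    if (fgets(buf, size, fp) == NULL)
        return -1;
    return fsetpos(fp, &mark) == 0 ? 0 : -1;   /* and return to it */
}
```

An fpos_t can only be stored and handed back to fsetpos(); it cannot be
used for arithmetic, which matches the restriction on text streams
described above.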

--
- David.Thompson 1 now at worldnet.att.net
