[perl-devel-nytprof commit] r368 - trunk

2 views
Skip to first unread message

codesite...@google.com

unread,
Jul 29, 2008, 8:25:03 PM7/29/08
to develnyt...@googlegroups.com
Author: tim.bunce
Date: Tue Jul 29 09:45:08 2008
New Revision: 368

Modified:
trunk/NYTProf.xs

Log:
Use fwrite in preference to fputc loop, thanks to Nicholas Clark.


Modified: trunk/NYTProf.xs
==============================================================================
--- trunk/NYTProf.xs (original)
+++ trunk/NYTProf.xs Tue Jul 29 09:45:08 2008
@@ -334,8 +334,7 @@
output_int(fid_info->fid_flags);
output_int(fid_info->file_size);
output_int(fid_info->file_mtime);
- while (file_name_len--)
- fputc(*file_name++, out);
+ fwrite(file_name, file_name_len, 1, out);
fputc('\n', out);
}

@@ -520,11 +519,7 @@
*/
void
output_nv(NV nv) {
- int i = sizeof(NV);
- unsigned char *p = (unsigned char *)&nv;
- while (i-- > 0) {
- fputc(*p++, out);
- }
+ fwrite((unsigned char *)&nv, sizeof(NV), 1, out);
}


Nicholas Clark

unread,
Jul 30, 2008, 4:53:55 AM7/30/08
to develnyt...@googlegroups.com
On Tue, Jul 29, 2008 at 05:25:03PM -0700, codesite...@google.com wrote:
>
> Author: tim.bunce
> Date: Tue Jul 29 09:45:08 2008
> New Revision: 368
>
> Modified:
> trunk/NYTProf.xs
>
> Log:
> Use fwrite in preference to fputc loop, thanks to Nicholas Clark.

I missed it's corresponding read loop though, didn't I? :-)

Still no error checking. Not sure of the best way to deal with that.

The way I read headers in PerlIO::gzip and ex::lib::zip was to fread() in
enough bytes (and error check that) then copy them out to the places they
were needed. No faffing with fgetc()/fputc() loops.

Nicholas Clark

Index: NYTProf.xs
===================================================================
--- NYTProf.xs (revision 369)
+++ NYTProf.xs (working copy)
@@ -1699,11 +1699,7 @@
NV
read_nv() {
NV nv;


- int i = sizeof(NV);
- unsigned char *p = (unsigned char *)&nv;
- while (i-- > 0) {

- *p++ = (unsigned char)fgetc(in);
- }
+ fread((unsigned char *)&nv, sizeof(NV), 1, in);
return nv;
}

Tim Bunce

unread,
Jul 30, 2008, 1:15:39 PM7/30/08
to develnyt...@googlegroups.com
On Wed, Jul 30, 2008 at 09:53:55AM +0100, Nicholas Clark wrote:
>
> On Tue, Jul 29, 2008 at 05:25:03PM -0700, codesite...@google.com wrote:
> >
> > Author: tim.bunce
> > Date: Tue Jul 29 09:45:08 2008
> > New Revision: 368
> >
> > Modified:
> > trunk/NYTProf.xs
> >
> > Log:
> > Use fwrite in preference to fputc loop, thanks to Nicholas Clark.
>
> I missed it's corresponding read loop though, didn't I? :-)

Applied, thanks.

> Still no error checking. Not sure of the best way to deal with that.

I added:
/* no error checking on the assumption that a later token read will
* detect the error/eof condition
*/
:)

> The way I read headers in PerlIO::gzip and ex::lib::zip was to fread() in
> enough bytes (and error check that) then copy them out to the places they
> were needed. No faffing with fgetc()/fputc() loops.

Hey, fancy adding zip support to NYTProf? :)

Meanwhile, want a commit bit?

Tim.

Nicholas Clark

unread,
Jul 31, 2008, 7:29:44 AM7/31/08
to develnyt...@googlegroups.com
On Wed, Jul 30, 2008 at 06:15:39PM +0100, Tim Bunce wrote:

> Hey, fancy adding zip support to NYTProf? :)

Not yet. I *think* that it might be as simple as

#include <zlib.h>

and then using gzopen() rather than fopen(), and using its gz*() functions
in place of f*()

> Meanwhile, want a commit bit?

Why not? I can add it to my collection. :-)

Nicholas Clark

Nicholas Clark

unread,
Aug 6, 2008, 4:39:46 PM8/6/08
to develnyt...@googlegroups.com
On Thu, Jul 31, 2008 at 12:29:44PM +0100, Nicholas Clark wrote:
>
> On Wed, Jul 30, 2008 at 06:15:39PM +0100, Tim Bunce wrote:
>
> > Hey, fancy adding zip support to NYTProf? :)
>
> Not yet. I *think* that it might be as simple as
>
> #include <zlib.h>
> and then using gzopen() rather than fopen(), and using its gz*() functions
> in place of f*()

Right. I tried. Because right now I'm generating over 4Gb of profile data
trying to profile the harness in the Perl 5 core running tests in parallel.

I got to the appended in about an hour. It fails the tests, in a major way,
because there's no way to do this bit of cheating with a zlib compressed
file handle:

/* any data that was unflushed in the parent when it forked
* is now duplicated unflushed in this child process.
* We need to be a little devious to prevent it getting flushed.
*/
close(fileno(out)); /* close the underlying fd first */
NYT_close(out); /* happily, this can't flush now */

and the underling implementation of zlib's compressed file handles uses stdio,
hence it can end up with unflushed data in the child.

Well, I think that there's no way of cheating. I can't see any way to
invalidate the FILE * hiding inside the gzFile, but I'd love to be proved
wrong.

If I'm not wrong, then the way to go seems to be to avoid fprintf, fscanf,
fputc, fgetc, and fgets, and instead recast all the input and output code
solely in terms of fread() and fwrite(). At which point they can be
conditionally replaced with routines that read or write blocks of data into
deflate() and inflate(), and we have to do all the manual handling ourselves.
However, it does mean that we can keep the current textual header, before
dropping into a deflated stream, if we so choose.

Nicholas Clark

Index: Makefile.PL
===================================================================
--- Makefile.PL (revision 390)
+++ Makefile.PL (working copy)
@@ -81,6 +81,7 @@

my @LIBS = ();
push @LIBS, "-lrt" if $clock_gettime;
+push @LIBS, "-lz";

# See lib/ExtUtils/MakeMaker.pm for details of how to influence
# the contents of the Makefile that is written.
Index: NYTProf.xs
===================================================================
--- NYTProf.xs (revision 390)
+++ NYTProf.xs (working copy)
@@ -19,6 +19,8 @@
#define PERL_NO_GET_CONTEXT /* we want efficiency */
#endif

+#define HAS_ZLIB 1
+
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
@@ -28,6 +30,22 @@
# include "ppport.h"
#endif

+#ifdef HAS_ZLIB
+# include "zlib.h"
+# define NYT_open(n, m) gzopen((n), (m))
+# define NYT_close gzclose
+# define NYT_read(p, l, f) gzread((f), (p), (l))
+# define NYT_write(p, l, f) gzwrite((f), (p), (l))
+# define NYT_putc(i, f) gzputc((f), (i))
+# define NYT_getc gzgetc
+# define NYT_printf gzprintf
+# define NYT_tell gztell
+# define NYT_gets(p, l, f) gzgets((f), (p), (l))
+typedef gzFile NYT_File;
+#else
+typedef FILE *NYT_File;
+#endif
+
#if !defined(OutCopFILE)
# define OutCopFILE CopFILE
#endif
@@ -110,8 +128,8 @@
/* END Hash table definitions */

/* defaults */
-static FILE* out;
-static FILE* in;
+static NYT_File out;
+static NYT_File in;

/* options and overrides */
static char PROF_output_file[MAXPATHLEN+1] = "nytprof.out";
@@ -214,11 +232,11 @@
#define getppid() 0
#endif
#define OUTPUT_PID() STMT_START { \
- assert(out != NULL); fputc(NYTP_TAG_PID_START, out); output_int(getpid()); output_int(getppid()); \
+ assert(out != NULL); NYT_putc(NYTP_TAG_PID_START, out); output_int(getpid()); output_int(getppid()); \
} STMT_END

#define END_OUTPUT_PID(pid) STMT_START { \
- assert(out != NULL); fputc(NYTP_TAG_PID_END, out); output_int(pid); fflush(out); \
+ assert(out != NULL); NYT_putc(NYTP_TAG_PID_END, out); output_int(pid); fflush(out); \
} STMT_END

/***********************************
@@ -236,25 +254,25 @@

assert(out != NULL);
/* File header with "magic" string, with file major and minor version */
- fprintf(out, "NYTProf %d %d\n", 2, 0);
+ NYT_printf(out, "NYTProf %d %d\n", 2, 0);
/* Human readable comments and attributes follow
* comments start with '#', end with '\n', and are discarded
* attributes start with ':', a word, '=', then the value, then '\n'
*/
- fprintf(out, "# Perl profile database. Generated by Devel::NYTProf on %s",
+ NYT_printf(out, "# Perl profile database. Generated by Devel::NYTProf on %s",
ctime(&basetime)); /* uses \n from ctime to terminate line */

/* XXX add options, $0, etc, but beware of embedded newlines */
/* XXX would be good to adopt a proper charset & escaping for these */
/* $^T */
- fprintf(out, ":%s=%lu\n", "basetime", (unsigned long)PL_basetime);
- fprintf(out, ":%s=%s\n", "xs_version", XS_VERSION);
- fprintf(out, ":%s=%d.%d.%d\n", "perl_version", PERL_REVISION, PERL_VERSION, PERL_SUBVERSION);
- fprintf(out, ":%s=%u\n", "ticks_per_sec", ticks_per_sec);
- fprintf(out, ":%s=%lu\n", "nv_size", sizeof(NV));
+ NYT_printf(out, ":%s=%lu\n", "basetime", (unsigned long)PL_basetime);
+ NYT_printf(out, ":%s=%s\n", "xs_version", XS_VERSION);
+ NYT_printf(out, ":%s=%d.%d.%d\n", "perl_version", PERL_REVISION, PERL_VERSION, PERL_SUBVERSION);
+ NYT_printf(out, ":%s=%u\n", "ticks_per_sec", ticks_per_sec);
+ NYT_printf(out, ":%s=%lu\n", "nv_size", sizeof(NV));
/* $0 - application name */
mg_get(sv = get_sv("0",GV_ADDWARN));
- fprintf(out, ":%s=%s\n", "application", SvPV_nolen(sv));
+ NYT_printf(out, ":%s=%s\n", "application", SvPV_nolen(sv));

OUTPUT_PID();

@@ -273,9 +291,9 @@
tag = NYTP_TAG_STRING_UTF8;
len = -len;
}
- fputc(tag, out);
+ NYT_putc(tag, out);
output_int(len);
- fwrite(str, 1, len, out);
+ NYT_write(str, len, out);
}


@@ -283,10 +301,10 @@
read_str(pTHX_ SV *sv) {
STRLEN len;
char *buf;
- char tag = fgetc(in);
+ char tag = NYT_getc(in);
if (NYTP_TAG_STRING != tag && NYTP_TAG_STRING_UTF8 != tag)
croak("File format error at offset %ld, expected string tag but found %d ('%c')",
- (long)ftell(in)-1, tag, tag);
+ (long)NYT_tell(in)-1, tag, tag);

len = read_int();
if (sv) {
@@ -298,9 +316,9 @@
}

buf = SvPV_nolen(sv);
- if (fread(buf, 1, len, in) != len)
+ if (NYT_read(buf, len, in) != len)
croak("String truncated in file at offset %ld: %s",
- (long)ftell(in)-1, (feof(in)) ? "end of file" : strerror(ferror(in)));
+ (long)NYT_tell(in)-1, (feof(in)) ? "end of file" : strerror(ferror(in)));
SvCUR_set(sv, len);
*SvEND(sv) = '\0';

@@ -415,7 +433,7 @@
file_name = fid_info->key_abs;
file_name_len = strlen(file_name);
}
- fputc(NYTP_TAG_NEW_FID, out);
+ NYT_putc(NYTP_TAG_NEW_FID, out);
output_int(fid_info->id);
output_int(fid_info->eval_fid);
output_int(fid_info->eval_line_num);
@@ -592,29 +610,29 @@

/* general case. handles all integers */
if (i < 0x80) { /* < 8 bits */
- fputc( (char)i, out);
+ NYT_putc( (char)i, out);
}
else if (i < 0x4000) { /* < 15 bits */
- fputc( (char)((i >> 8) | 0x80), out);
- fputc( (char)i, out);
+ NYT_putc( (char)((i >> 8) | 0x80), out);
+ NYT_putc( (char)i, out);
}
else if (i < 0x200000) { /* < 22 bits */
- fputc( (char)((i >> 16) | 0xC0), out);
- fputc( (char)(i >> 8), out);
- fputc( (char)i, out);
+ NYT_putc( (char)((i >> 16) | 0xC0), out);
+ NYT_putc( (char)(i >> 8), out);
+ NYT_putc( (char)i, out);
}
else if (i < 0x10000000) { /* 32 bits */
- fputc( (char)((i >> 24) | 0xE0), out);
- fputc( (char)(i >> 16), out);
- fputc( (char)(i >> 8), out);
- fputc( (char)i, out);
+ NYT_putc( (char)((i >> 24) | 0xE0), out);
+ NYT_putc( (char)(i >> 16), out);
+ NYT_putc( (char)(i >> 8), out);
+ NYT_putc( (char)i, out);
}
else { /* need all the bytes. */
- fputc( 0xFF, out);
- fputc( (char)(i >> 24), out);
- fputc( (char)(i >> 16), out);
- fputc( (char)(i >> 8), out);
- fputc( (char)i, out);
+ NYT_putc( 0xFF, out);
+ NYT_putc( (char)(i >> 24), out);
+ NYT_putc( (char)(i >> 16), out);
+ NYT_putc( (char)(i >> 8), out);
+ NYT_putc( (char)i, out);
}
}

@@ -626,7 +644,7 @@
void
output_nv(NV nv)
{
- fwrite((unsigned char *)&nv, sizeof(NV), 1, out);
+ NYT_write((unsigned char *)&nv, sizeof(NV), out);
}


@@ -928,7 +946,7 @@
if (last_executed_fid) {
reinit_if_forked(aTHX);

- fputc( (profile_blocks) ? NYTP_TAG_TIME_BLOCK : NYTP_TAG_TIME_LINE, out);
+ NYT_putc( (profile_blocks) ? NYTP_TAG_TIME_BLOCK : NYTP_TAG_TIME_LINE, out);
output_int(elapsed);
output_int(last_executed_fid);
output_int(last_executed_line);
@@ -1025,7 +1043,7 @@
* increment the count (because the time is not for a new statement but simply
* a continuation of a previously counted statement).
*/
- fputc(NYTP_TAG_DISCOUNT, out);
+ NYT_putc(NYTP_TAG_DISCOUNT, out);

if (trace_level >= 4) {
warn("left %u:%u via %s back to %s at %u:%u (b%u s%u) - discounting next statement%s\n",
@@ -1101,7 +1119,7 @@
/* caller is expected to have purged/closed old out if appropriate */
}

- out = fopen(filename, "wb");
+ out = NYT_open(filename, "wb");
if (!out) {
disable_profile(aTHX);
croak("Failed to open output '%s': %s", filename, strerror(errno));
@@ -1127,12 +1145,16 @@
if (sub_callers_hv)
hv_clear(sub_callers_hv);

+#ifdef HAS_ZLIB
+ /* This isn't stdio. Our best bet is to just drop it. */
+#else
/* any data that was unflushed in the parent when it forked
* is now duplicated unflushed in this child process.
* We need to be a little devious to prevent it getting flushed.
*/
close(fileno(out)); /* close the underlying fd first */
- fclose(out); /* happily, this can't flush now */
+ NYT_close(out); /* happily, this can't flush now */
+#endif

open_output_file(aTHX_ PROF_output_file);

@@ -1352,7 +1374,7 @@
}

if (trace_level >= 3)
- fprintf(stderr, "fid %d:%d called %s %s (oh %gt, sub %gs)\n", fid, line,
+ NYT_printf(stderr, "fid %d:%d called %s %s (oh %gt, sub %gs)\n", fid, line,
SvPV_nolen(subname_sv), (is_xs) ? "xs" : "sub",
sub_call_start.current_overhead_ticks,
sub_call_start.current_subr_secs);
@@ -1510,14 +1532,26 @@
disable_profile(aTHX);

if (out) {
+ int result;
write_sub_line_ranges(aTHX_ 0);
write_sub_callers(aTHX);
/* mark end of profile data for last_pid pid
* (which is the pid that relates to the out filehandle)
*/
END_OUTPUT_PID(last_pid);
- if (-1 == fclose(out))
+ result = NYT_close(out);
+#ifdef HAS_ZLIB
+ if (result) {
+ if (result != Z_ERRNO) {
+ warn("Error closing profile data file: %s", gzerror(out, &result));
+ } else {
+ warn("Error closing profile data file: %s", strerror(errno));
+ }
+ }
+#else
+ if (-1 == result)
warn("Error closing profile data file: %s", strerror(errno));
+#endif
out = NULL;
}
}
@@ -1719,7 +1753,7 @@
warn("Sub %s fid %u lines %lu..%lu\n",
sub_name, fid, (unsigned long)first_line, (unsigned long)last_line);

- fputc(NYTP_TAG_SUB_LINE_RANGE, out);
+ NYT_putc(NYTP_TAG_SUB_LINE_RANGE, out);
output_int(fid);
output_int(first_line);
output_int(last_line);
@@ -1760,7 +1794,7 @@
SvNV(AvARRAY(av)[1]), SvNV(AvARRAY(av)[2]),
SvNV(AvARRAY(av)[3]), SvNV(AvARRAY(av)[4]) );

- fputc(NYTP_TAG_SUB_CALLERS, out);
+ NYT_putc(NYTP_TAG_SUB_CALLERS, out);
output_int(fid);
output_int(line);
output_int(count);
@@ -1784,39 +1818,39 @@
unsigned char d;
unsigned int newint;

- d = fgetc(in);
+ d = NYT_getc(in);
if (d < 0x80) { /* 7 bits */
newint = d;
}
else if (d < 0xC0) { /* 14 bits */
newint = d & 0x7F;
newint <<= 8;
- newint |= (unsigned char)fgetc(in);
+ newint |= (unsigned char)NYT_getc(in);
}
else if (d < 0xE0) { /* 21 bits */
newint = d & 0x1F;
newint <<= 8;
- newint |= (unsigned char)fgetc(in);
+ newint |= (unsigned char)NYT_getc(in);
newint <<= 8;
- newint |= (unsigned char)fgetc(in);
+ newint |= (unsigned char)NYT_getc(in);
}
else if (d < 0xFF) { /* 28 bits */
newint = d & 0xF;
newint <<= 8;
- newint |= (unsigned char)fgetc(in);
+ newint |= (unsigned char)NYT_getc(in);
newint <<= 8;
- newint |= (unsigned char)fgetc(in);
+ newint |= (unsigned char)NYT_getc(in);
newint <<= 8;
- newint |= (unsigned char)fgetc(in);
+ newint |= (unsigned char)NYT_getc(in);
}
else if (d == 0xFF) { /* 32 bits */
- newint = (unsigned char)fgetc(in);
+ newint = (unsigned char)NYT_getc(in);
newint <<= 8;
- newint |= (unsigned char)fgetc(in);
+ newint |= (unsigned char)NYT_getc(in);
newint <<= 8;
- newint |= (unsigned char)fgetc(in);
+ newint |= (unsigned char)NYT_getc(in);
newint <<= 8;
- newint |= (unsigned char)fgetc(in);
+ newint |= (unsigned char)NYT_getc(in);
}
return newint;
}
@@ -1832,7 +1866,7 @@


/* no error checking on the assumption that a later token read will
* detect the error/eof condition
*/

- fread((unsigned char *)&nv, 1, sizeof(NV), in);
+ NYT_read((unsigned char *)&nv, sizeof(NV), in);
return nv;
}

@@ -1902,21 +1936,26 @@
HV* sub_subinfo_hv = newHV();
HV* sub_callers_hv = newHV();
SV *tmp_str_sv = newSVpvn("",0);
+ char buffer[sizeof("NYTProf \n") + 2 * 20];

av_extend(fid_fileinfo_av, 64); /* grow it up front. */
av_extend(fid_line_time_av, 64);

- if (2 != fscanf(in, "NYTProf %d %d\n", &file_major, &file_minor)) {
+ if (!NYT_gets(buffer, sizeof(buffer), in)) {
+ croak("Profile format error while reading header");
+ }
+
+ if (2 != sscanf(buffer, "NYTProf %d %d\n", &file_major, &file_minor)) {
croak("Profile format error while parsing header");
}
if (file_major != 2)
croak("Profile format version %d.%d not supported by %s %s",
file_major, file_minor, __FILE__, XS_VERSION);

- while (EOF != (c = fgetc(in))) {
+ while (EOF != (c = NYT_getc(in))) {
input_chunk_seqn++;
if (trace_level >= 6)
- warn("Chunk %lu token is %d ('%c') at %ld\n", input_chunk_seqn, c, c, (long)ftell(in)-1);
+ warn("Chunk %lu token is %d ('%c') at %ld\n", input_chunk_seqn, c, c, (long)NYT_tell(in)-1);

switch (c) {
case NYTP_TAG_DISCOUNT:
@@ -2166,7 +2205,7 @@
char text[MAXPATHLEN*2];
char *value, *end;
SV *value_sv;
- if (NULL == fgets(text, sizeof(text), in))
+ if (NULL == NYT_gets(text, sizeof(text), in))
/* probably EOF */
croak("Profile format error reading attribute");
if ((NULL == (value = strchr(text, '=')))
@@ -2196,7 +2235,7 @@
case NYTP_TAG_COMMENT:
{
char text[MAXPATHLEN*2];
- if (NULL == fgets(text, sizeof(text), in))
+ if (NULL == NYT_gets(text, sizeof(text), in))
/* probably EOF */
croak("Profile format error reading comment");
if (trace_level >= 2)
@@ -2206,7 +2245,7 @@

default:
croak("File format error: token %d ('%c'), chunk %lu, pos %lu",
- c, c, input_chunk_seqn, ftell(in));
+ c, c, input_chunk_seqn, NYT_tell(in));
}
}

@@ -2330,11 +2369,11 @@
CODE:
if (trace_level)
warn("reading profile data from file %s\n", file);
- in = fopen(file, "rb");
+ in = NYT_open(file, "rb");
if (in == NULL) {
croak("Failed to open input '%s': %s", file, strerror(errno));
}
RETVAL = load_profile_data_from_stream();
- fclose(in);
+ NYT_close(in);
OUTPUT:
RETVAL

Tim Bunce

unread,
Aug 7, 2008, 4:28:11 AM8/7/08
to develnyt...@googlegroups.com
On Wed, Aug 06, 2008 at 09:39:46PM +0100, Nicholas Clark wrote:
>
> On Thu, Jul 31, 2008 at 12:29:44PM +0100, Nicholas Clark wrote:
> >
> > On Wed, Jul 30, 2008 at 06:15:39PM +0100, Tim Bunce wrote:
> >
> > > Hey, fancy adding zip support to NYTProf? :)
> >
> > Not yet. I *think* that it might be as simple as
> >
> > #include <zlib.h>
> > and then using gzopen() rather than fopen(), and using its gz*() functions
> > in place of f*()
>
> Right. I tried. Because right now I'm generating over 4Gb of profile data
> trying to profile the harness in the Perl 5 core running tests in parallel.

Ouch!

(We also need a way to turn off the statement profiler, just leaving the
sub profiler running. That would be a big help. Then we'd need a way to
enable the statement profiler for individual subs or packages.)

> I got to the appended in about an hour. It fails the tests, in a major way,
> because there's no way to do this bit of cheating with a zlib compressed
> file handle:
>
> /* any data that was unflushed in the parent when it forked
> * is now duplicated unflushed in this child process.
> * We need to be a little devious to prevent it getting flushed.
> */
> close(fileno(out)); /* close the underlying fd first */
> NYT_close(out); /* happily, this can't flush now */
>
> and the underling implementation of zlib's compressed file handles uses stdio,
> hence it can end up with unflushed data in the child.
>
> Well, I think that there's no way of cheating. I can't see any way to
> invalidate the FILE * hiding inside the gzFile, but I'd love to be
> proved wrong.

We could perhaps use this trick... just before opening the file using
gzopen(), open a file ourselves, note the fd, then close it. Then when
gzopen() it'll _probably_ open a file with the same fd.

(An extra level of safety could be added by opening two files, getting
two fds and closing them. Then, after the gzopen(), we could open
another file and see if the new fd is the same as the second one
we opened previously.)

Tim.

Nicholas Clark

unread,
Aug 8, 2008, 12:28:58 PM8/8/08
to develnyt...@googlegroups.com
On Thu, Aug 07, 2008 at 09:28:11AM +0100, Tim Bunce wrote:

> We could perhaps use this trick... just before opening the file using
> gzopen(), open a file ourselves, note the fd, then close it. Then when
> gzopen() it'll _probably_ open a file with the same fd.
>
> (An extra level of safety could be added by opening two files, getting
> two fds and closing them. Then, after the gzopen(), we could open
> another file and see if the new fd is the same as the second one
> we opened previously.)

That feels very fragile to me.

Nicholas Clark

Nicholas Clark

unread,
Aug 10, 2008, 1:08:51 PM8/10/08
to develnyt...@googlegroups.com
On Wed, Aug 06, 2008 at 09:39:46PM +0100, Nicholas Clark wrote:

> If I'm not wrong, then the way to go seems to be to avoid fprintf, fscanf,
> fputc, fgetc, and fgets, and instead recast all the input and output code
> solely in terms of fread() and fwrite(). At which point they can be
> conditionally replaced with routines that read or write blocks of data into
> deflate() and inflate(), and we have to do all the manual handling ourselves.
> However, it does mean that we can keep the current textual header, before
> dropping into a deflated stream, if we so choose.

Which doesn't seem to be that hard, so I've started on it.

To me, currently, it seems that the way to go might well be to keep the current
plain text header, then switch to a compressed stream straight after it. This
could probably be done by having a tag that says "zlib compressed stream
starts here" and a pointer to swappable tables of read/write/... functions
for compressed and uncompressed.

Nicholas Clark

Tim Bunce

unread,
Aug 10, 2008, 5:28:49 PM8/10/08
to develnyt...@googlegroups.com, develnyt...@googlegroups.com
Sounds good to me! Thanks.

It'll be interesting to see how it affects performance. I think
there's a good chance performance will improve for most people as
we'll be reducing the io by ~90%. May just need to set the default
compression level to achieve a good balance.

Any portability issues?

Tim.

Sent from my iPhone

Adam Kaplan

unread,
Aug 11, 2008, 4:52:53 PM8/11/08
to develnyt...@googlegroups.com
On Sun, Aug 10, 2008 at 17:28, Tim Bunce <tim....@gmail.com> wrote:
>
> Sounds good to me! Thanks.
>
> It'll be interesting to see how it affects performance. I think
> there's a good chance performance will improve for most people as
> we'll be reducing the io by ~90%. May just need to set the default
> compression level to achieve a good balance.
>
> Any portability issues?
>
> Tim.

Wow, that would be incredible.

--
Adam J. Kaplan
Software Engineer, Code Critic and Cocoa Drinker.
The New York Times Company

Nicholas Clark

unread,
Oct 1, 2008, 5:14:56 AM10/1/08
to develnyt...@googlegroups.com
On Mon, Aug 11, 2008 at 04:52:53PM -0400, Adam Kaplan wrote:
>
> On Sun, Aug 10, 2008 at 17:28, Tim Bunce <tim....@gmail.com> wrote:
> >
> > Sounds good to me! Thanks.
> >
> > It'll be interesting to see how it affects performance. I think
> > there's a good chance performance will improve for most people as
> > we'll be reducing the io by ~90%. May just need to set the default
> > compression level to achieve a good balance.
> >
> > Any portability issues?
> >
> > Tim.
>
> Wow, that would be incredible.

Given it's done, will it get released soon?

Nicholas Clark

Tim Bunce

unread,
Oct 1, 2008, 8:26:57 AM10/1/08
to develnyt...@googlegroups.com

I hope so. I'd really like to get xsub caller info into the report for
the next release. I'm aiming for early next week.

Tim.

Tim Bunce

unread,
Oct 1, 2008, 11:37:39 AM10/1/08
to develnyt...@googlegroups.com

Or sooner... I've just uploaded Devel-NYTProf-2.04_RC1.tar.gz

I figured there's wasn't much point in waiting and there was a growing
risk I wouldn't have much time to do the xsub caller info changes this
week.

If the cpantesters don't find any holes in it within a day or two
I'll upload a real release.

Thanks again Nicholas for your contributions! The zip support extends
the usability of NYTProf greatly.

Tim.

Adam Kaplan

unread,
Oct 1, 2008, 4:16:44 PM10/1/08
to develnyt...@googlegroups.com

I agree, the zip ability is very helpful when dealing with these
formally large file on an NFS like I had been using.

It also got me thinking of other things that could benefit from using
zlib compression. One app in particular, which runs on a popular
mobile device that I can't disclose just yet, currently has a data
download (speed) problem. But in it's next release, I was able to
reduced data download by 98% using various methods including zip.

Thanks :)

--
Adam J. Kaplan
Duct Tape Specialist & Software Engineer

Reply all
Reply to author
Forward
0 new messages