
[PHP-DEV] transparent output compression patch


Jade Nicoletti

Nov 12, 2000, 4:53:28 PM
Hi

Here's a patch that adds transparent output compression in the output layer.
The patch is just a start. I've mainly posted it to get some feedback...

So far, it isn't thread safe and only the 'gzip' coding really works.

Some questions that I have:
- Is output.c the right place to put this stuff in?
- There shouldn't be an ob_compress_finish(). The gzip/deflate trailer
should be sent automatically. Where should I place the call to
that function?
- The patch won't work if you dynamically load the zlib extension.
How can I fix that?
HAVE_ZLIB isn't set... should I remove the '#if's and check
at runtime if zlib is available? I think this won't work if the
compression is done in output.c...

Usage:
------ example.php
<?
ob_compress(1);
echo 'this is just test';
ob_compress_finish();
------

Bugs/TODOs:
1. Grep for them (TODO|FIXME).
2. The deflate coding doesn't work. I don't see why, right now... hint?


-Jade.

PS: Thank you for your feedback.

--
===============================================================================
Jade Nicoletti Nicoletti Net Services Tel. 01 240 4774
Management P.O. Box 2519 Fax 01 240 4775
System Administration 8021 Zürich
============================================[ More info: http://nns.ch/ ]==

comp.patch

Zeev Suraski

Nov 12, 2000, 5:41:46 PM
Thanks for the patch. I'll try to take a look at it this week.

At 23:54 12/11/2000, Jade Nicoletti wrote:
>Hi
>
>Here's a patch that adds transparent output compression in the output layer.
>The patch is just a start. I've mainly posted it to get some feedback...
>
>So far, it isn't thread safe and only the 'gzip' coding really works.
>
>Some questions that I have:
> - Is output.c the right place to put this stuff in?

Probably not. The output buffering mechanism supports output handlers,
even user-defined output handlers.

> - There shouldn't be an ob_compress_finish(). The gzip/deflate trailer
> should be sent automatically. Where should I place the call to
> that function?

If you implement the gzip encoding as an output handler, this will be taken
care of transparently for you.
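A minimal sketch of why a handler removes the need for an explicit finish call, written in Python for illustration rather than against the PHP sources. The names `OutputBuffer` and `make_gzip_handler` are hypothetical; the only assumption is that the buffering layer tells the handler when it is being called for the last time, so the handler can emit the gzip trailer itself.

```python
import zlib


class OutputBuffer:
    """Toy output layer: every write goes through a handler callback."""

    def __init__(self, handler):
        self.handler = handler
        self.sent = []  # stands in for the bytes that would reach the client

    def write(self, text):
        self.sent.append(self.handler(text.encode("utf-8"), False))

    def end(self):
        # The buffering layer, not the script, signals the final call.
        self.sent.append(self.handler(b"", True))
        return b"".join(self.sent)


def make_gzip_handler(level=6):
    # wbits=31 asks zlib for gzip framing (gzip header plus CRC-32 trailer).
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)

    def handler(chunk, is_final):
        data = compressor.compress(chunk)
        if is_final:
            data += compressor.flush()  # this is where the trailer is written
        return data

    return handler


out = OutputBuffer(make_gzip_handler())
out.write("this is just ")
out.write("a test")
body = out.end()
```

The script only ever calls `write`; the trailer is produced when the layer closes the buffer, which is the behaviour the reply describes.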

> - The patch won't work if you dynamically load the zlib extension.
> How can I fix that?
> HAVE_ZLIB isn't set... should I remove the '#if's and check
> at runtime if zlib is available? I think this won't work if the
> compression is done in output.c...

If (when :) you implement the gzip encoding as an output handler, I'd say
this code belongs to the gzip module. Then, whenever gzip support is
available, gzip encoding would also be available, regardless of whether the
gzip module is static or dynamic.

Thanks again for the patch!

Zeev

--
Zeev Suraski <ze...@zend.com>
CTO, Zend Technologies Ltd. http://www.zend.com/


--
PHP Development Mailing List <http://www.php.net/>
To unsubscribe, e-mail: php-dev-u...@lists.php.net
For additional commands, e-mail: php-de...@lists.php.net
To contact the list administrators, e-mail: php-lis...@lists.php.net

Manuel Lemos

Nov 12, 2000, 5:53:43 PM
Hello Jade,

On 12-Nov-00 19:12:00, you wrote:

>Bugs/TODOs:
> 1. Grep for them (TODO|FIXME).
> 2. The deflate coding doesn't work. I don't see why, right now... hint?

In theory, checking the Accept-Encoding: request header would be
enough to figure out whether compression is acceptable to the requesting
browser (or proxy server). The truth is that it isn't that simple. There are
plenty of circumstances where it may not work and the end user will only see
a blank page.
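For reference, the header check itself can be sketched as follows (Python, for illustration only). The function names are hypothetical, and the parsing is deliberately simplified: it honours `q` weights and the `*` wildcard but nothing else. It covers only the "in theory" part; it says nothing about whether the compressed body survives the trip.

```python
def parse_accept_encoding(header):
    """Return {coding: q} from an Accept-Encoding header value."""
    codings = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable weight as "not acceptable"
        codings[name.strip().lower()] = q
    return codings


def choose_coding(header, supported=("gzip", "deflate")):
    """Pick the supported coding with the highest q, or None."""
    codings = parse_accept_encoding(header or "")
    best, best_q = None, 0.0
    for name in supported:
        q = codings.get(name, codings.get("*", 0.0))
        if q > best_q:
            best, best_q = name, q
    return best
```

A return value of `None` means the response should go out uncompressed.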

If you look into the Remote Communications mod_gzip source code, they explain
some of the circumstances where it does not work. Oddly, they do not seem to
handle those situations. That suggests that they may want people to
complain so they can sell their commercial solution.

Anyway, there seem to be other situations that they admittedly did not seem
to know about, like proxy servers that choke on character 0 (NUL) in a
Content-Type: text/html stream. Many free-access ISPs use proxy servers, and
they now represent a significant market share in many countries. I don't
know whether some of them might be using broken proxy servers, which would
make all this that much harder to handle.


Regards,
Manuel Lemos

Web Programming Components using PHP Classes.
Look at: http://phpclasses.UpperDesign.com/?user=mle...@acm.org
--
E-mail: mle...@acm.org
URL: http://www.mlemos.e-na.net/
PGP key: http://www.mlemos.e-na.net/ManuelLemos.pgp
--

Jade Nicoletti

Nov 13, 2000, 7:50:44 AM
The deflate coding didn't work because I sent a zlib header/trailer. Without
them everything seems to work fine (so far :).
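The difference between the three framings can be seen with zlib directly. This is an illustrative Python sketch, not the patch's C code; the `wbits` values are zlib's documented way of selecting raw deflate (negative), the zlib wrapper (positive), or gzip (plus 16). Note that RFC 2616 describes the "deflate" content-coding as the zlib format, while the observation above is that clients worked only without the wrapper, so which framing a given client accepts is something to test rather than assume.

```python
import zlib

payload = b"this is just a test " * 20


def compress(data, wbits):
    compressor = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return compressor.compress(data) + compressor.flush()


raw_deflate = compress(payload, -15)  # RFC 1951 stream, no header or trailer
zlib_wrapped = compress(payload, 15)  # RFC 1950: 2-byte header, Adler-32 trailer
gzip_wrapped = compress(payload, 31)  # RFC 1952: gzip header, CRC-32 and length trailer

for label, blob in (("raw", raw_deflate), ("zlib", zlib_wrapped), ("gzip", gzip_wrapped)):
    print(label, len(blob), blob[:2].hex())
```

Stripping the two header bytes and the four trailer bytes from the zlib-wrapped stream leaves the raw deflate stream, which matches the fix described in the message.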

It may be that problems will arise in certain circumstances (as described in
Remote Communications' mod_gzip.c). Nevertheless it is a good thing to have
this in PHP. Of course the compression isn't enabled by default. A PHP
developer has to weigh the use of compression and really only apply it if
its use is appropriate.

-Jade.


Jade Nicoletti

Nov 13, 2000, 10:11:20 AM
Here is an updated patch. Just in case... :)

The deflate coding and the autodetection work now. I've defined some constants
and I've put in some warning messages.

-Jade.

On Mon, Nov 13, 2000 at 12:32:57PM +0200, Zeev Suraski wrote:
> I'm working on converting your work to an output handler in zlib.c - I'll
> send you the diff when I'm done.
>
> Zeev

comp.patch

Zeev Suraski

Nov 13, 2000, 10:49:26 AM
Damn :) We'll have to merge code now. I'll first get your original code
to work, and then you could merge in your changes.

Zeev

At 17:12 13-11-00, Jade Nicoletti wrote:
>Here is an updated patch. Just in case... :)
>
>The deflate coding and the autodetection work now. I've defined some
>constants
>and I've put in some warning messages.
>
>-Jade.
>
>On Mon, Nov 13, 2000 at 12:32:57PM +0200, Zeev Suraski wrote:
> > I'm working on converting your work to an output handler in zlib.c - I'll
> > send you the diff when I'm done.
> >
> > Zeev
>


--


Zeev Suraski <ze...@zend.com>
CTO, Zend Technologies Ltd. http://www.zend.com/

Thies C. Arntzen

Nov 13, 2000, 11:44:31 AM
On Mon, Nov 13, 2000 at 05:52:37PM +0200, Zeev Suraski wrote:
> Damn :) We'll have to merge code now. I'll first get your original code
> to work, and then you could merge in your changes.
>
> Zeev

do we really think this belongs into a scripting language? i
think that transparent output compression should be part of
the web-server (like an apache 2.0 filter) or a proxy in
front of the server.

tc

Zeev Suraski

Nov 13, 2000, 11:50:47 AM
At 18:45 13-11-00, Thies C. Arntzen wrote:
>On Mon, Nov 13, 2000 at 05:52:37PM +0200, Zeev Suraski wrote:
> > Damn :) We'll have to merge code now. I'll first get your original code
> > to work, and then you could merge in your changes.
> >
> > Zeev
>
> do we really think this belongs into a scripting language? i
> think that transparent output compression should be part of
> the web-server (like an apache 2.0 filter) or a proxy in
> front of the server.

I don't see any reason for this not to be in the scripting language. It's
useful, there's no current way of doing this (we're still years away from
wide Apache 2.0 acceptance), why not support it?

There are things in PHP that are completely not useful for Web development
and nobody cares about them - I wouldn't start rejecting useful
functionality because it "doesn't belong in the scripting language"...

Zeev


Thies C. Arntzen

Nov 13, 2000, 12:41:13 PM
On Mon, Nov 13, 2000 at 06:53:59PM +0200, Zeev Suraski wrote:
> At 18:45 13-11-00, Thies C. Arntzen wrote:
> >On Mon, Nov 13, 2000 at 05:52:37PM +0200, Zeev Suraski wrote:
> > > Damn :) We'll have to merge code now. I'll first get your original code
> > > to work, and then you could merge in your changes.
> > >
> > > Zeev
> >
> > do we really think this belongs into a scripting language? i
> > think that transparent output compression should be part of
> > the web-server (like an apache 2.0 filter) or a proxy in
> > front of the server.
>
> I don't see any reason for this not to be in the scripting language. It's
> useful, there's no current way of doing this (we're still years away from
> wide Apache 2.0 acceptance), why not support it?

we rejected a similar patch (for 3.0) for the very same
reason around a year ago if i recall right. but that would be
no reason to reject it now - i agree.

>
> There are things in PHP that are completely not useful for Web development
> and nobody cares about them - I wouldn't start rejecting useful
> functionality because it "doesn't belong in the scripting language"...

i played around with the output-compression tools that are
around and i'm not too impressed. but who knows maybe they
had problems with my applications simply _because_ they
weren't written by us and integrated into php;-)

if we have a clean enough interface for this and it doesn't
end up as a "hack" i agree -> let's get it in.

tc

Zeev Suraski

Nov 13, 2000, 12:55:39 PM
I doubt we'd get sued... Is anyone using mod_gzip going to get sued?

Zeev

At 19:48 13-11-00, Jim Jagielski wrote:
>I forwarded a short email maybe a few days ago from one of the
>Apache lists regarding transparent compression and some sort
>of bogus patent. I can resend if needed, but I think it's
>"applicable" here in some way...
>--
>===========================================================================
> Jim Jagielski [|] j...@jaguNET.com [|] http://www.jaguNET.com/
> "Are you suggesting coconuts migrate??"

Andi Gutmans

Nov 13, 2000, 12:43:50 PM
At 06:42 PM 11/13/00 +0100, Thies C. Arntzen wrote:
>On Mon, Nov 13, 2000 at 06:53:59PM +0200, Zeev Suraski wrote:
> > At 18:45 13-11-00, Thies C. Arntzen wrote:
> > >On Mon, Nov 13, 2000 at 05:52:37PM +0200, Zeev Suraski wrote:
> > > > Damn :) We'll have to merge code now. I'll first get your original code
> > > > to work, and then you could merge in your changes.
> > > >
> > > > Zeev
> > >
> > > do we really think this belongs into a scripting language? i
> > > think that transparent output compression should be part of
> > > the web-server (like an apache 2.0 filter) or a proxy in
> > > front of the server.
> >
> > I don't see any reason for this not to be in the scripting language. It's
> > useful, there's no current way of doing this (we're still years away from
> > wide Apache 2.0 acceptance), why not support it?
>
> we rejected a similar patch (for 3.0) for the very same
> reason around a year ago if i recall right. but that would be
> no reason to reject it now - i agree.

The reason was that in 3.0 it was terrible. It was a real PHP core hack but
with the output buffering it can be done very cleanly and nicely.

> >
> > There are things in PHP that are completely not useful for Web development
> > and nobody cares about them - I wouldn't start rejecting useful
> > functionality because it "doesn't belong in the scripting language"...
>

> i played around with the output-compression tools that are
> around and i'm not too impressed. but who knows maybe they
> had problems with my applications simply _because_ they
> weren't written by us and integrated into php;-)
>
> if we have a clean enough interface for this and it doesn't
> end up as a "hack" i agree -> let's get it in.

I guess I addressed this before reading the second part of your mail :)

Andi
---
Andi Gutmans <an...@zend.com>

Jim Jagielski

Nov 13, 2000, 12:47:57 PM
I forwarded a short email maybe a few days ago from one of the
Apache lists regarding transparent compression and some sort
of bogus patent. I can resend if needed, but I think it's
"applicable" here in some way...
--
===========================================================================
Jim Jagielski [|] j...@jaguNET.com [|] http://www.jaguNET.com/
"Are you suggesting coconuts migrate??"

--

Jade Nicoletti

Nov 13, 2000, 1:10:57 PM
OK, I'll merge in my changes later if you want :)

A question: how do you make output compression work with the session uri
adaption?

It seems that output goes through these layers:

1. output handlers (if present)
2. php_ub_body_write_no_header with session_adapt_uri
3. sapi output function (php_header_write)

If you implement the output compression as an output handler, the compressed
output will be scanned for uris :(

Do you see the problem or did I miss something? :)

If you want it really clean, the uri adaption should be an output handler
too, shouldn't it?
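The ordering problem can be illustrated with a toy handler chain (Python, for illustration). `adapt_urls` is a hypothetical stand-in for the session URI rewriting, reduced to one regular expression, and the session id is made up; the point is only that a text rewrite has to run before the compressor, because afterwards there is no markup left to match.

```python
import re
import zlib


def adapt_urls(body):
    # Hypothetical, heavily simplified session rewrite: tag every href.
    return re.sub(rb'href="([^"?]+)"', rb'href="\1?sid=abc123"', body)


def gzip_encode(body):
    compressor = zlib.compressobj(6, zlib.DEFLATED, 31)  # 31 selects gzip framing
    return compressor.compress(body) + compressor.flush()


def run_chain(body, handlers):
    # Each handler receives the previous handler's output.
    for handler in handlers:
        body = handler(body)
    return body


page = b'<a href="next.php">next</a>'

rewritten_then_compressed = run_chain(page, [adapt_urls, gzip_encode])
compressed_then_rewritten = run_chain(page, [gzip_encode, adapt_urls])
```

In the second ordering the rewrite step scans compressed bytes and finds no links, so the client receives the page without session ids. Making both steps handlers in one chain lets the order be chosen explicitly.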

-Jade.

Zeev Suraski

Nov 13, 2000, 1:17:06 PM
At 20:11 13-11-00, Jade Nicoletti wrote:
>OK, I'll merge in my changes later if you want :)

I think I merged most of them. I'll be committing soon :)

>A question: how do you make output compression work with the session uri
>adaption?

The way session URL adaption works is wrong IMHO. If it were applied as an
output handler, it would have worked transparently... It's no surprise it
wasn't implemented as an output handler though, as there weren't output
handlers in the language at the time URL adaption was introduced :)

We'll have to do some work on that in the future.

I think things will be much clearer when I commit the gzip/deflate encoding
support.

Zeev

Jade Nicoletti

Nov 13, 2000, 1:21:34 PM
On Mon, Nov 13, 2000 at 05:45:33PM +0100, Thies C. Arntzen wrote:
> On Mon, Nov 13, 2000 at 05:52:37PM +0200, Zeev Suraski wrote:
> > Damn :) We'll have to merge code now. I'll first get your original code
> > to work, and then you'd could merge in your changes.
> >
> > Zeev
>
> do we really think this belongs into a scripting language? i
> think that transparent output compression should be part of
> the web-server (like an apache 2.0 filter) or a proxy in
> front of the server.
>
> tc

I won't compress everything. It will depend on various things. Letting a
script decide what has to be compressed (and with which coding, compression
level) is a lot more flexible than some static configurations of an independent
filter.
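A sketch of what such a per-response decision might look like (Python, for illustration). The policy table, the size threshold and the function name are all assumptions made up for this example; nothing here reflects what the patch itself does.

```python
import zlib

# Hypothetical policy: content type -> (coding, compression level).
POLICY = {
    "text/html": ("gzip", 6),
    "text/plain": ("gzip", 9),
    "image/jpeg": (None, 0),  # already compressed, leave alone
}
MIN_SIZE = 256  # assumed threshold below which compression is skipped

# zlib wbits per coding; raw deflate follows the finding earlier in the thread.
WBITS = {"gzip": 31, "deflate": -15}


def encode_response(body, content_type, accepted):
    """Return (body, coding); coding is None when the body is sent as-is."""
    coding, level = POLICY.get(content_type, (None, 0))
    if coding is None or coding not in accepted or len(body) < MIN_SIZE:
        return body, None
    compressor = zlib.compressobj(level, zlib.DEFLATED, WBITS[coding])
    return compressor.compress(body) + compressor.flush(), coding
```

A static server-side filter would need configuration for each of these cases, whereas a script can change the table per page or per request.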

-Jade.


Zeev Suraski

Nov 13, 2000, 1:58:26 PM
Alright, I imported your transparent gzip compression support.

All of the code is centralized in the zlib extension, and makes no use of
any specialized patches in the core of the language.

To use it transparently, simply call:

ob_start("ob_gzhandler");

It's as simple as that.

You can also compress all of your PHP scripts transparently by setting the
new output_handler directive to 'ob_gzhandler'.

I'm not sure if I imported all of the diffs from the new patch you sent me
- so if you see some missing functionality, either add it or let me know...

Zeev

Jim Jagielski

Nov 13, 2000, 2:02:54 PM
I would assume they would come after the ASF 1st :)

mod_gzip had other issues ;)

Zeev Suraski

Nov 13, 2000, 2:03:49 PM
What kind of issues? :)


At 21:03 13-11-00, Jim Jagielski wrote:
>I would assume they would come after the ASF 1st :)
>
>mod_gzip had other issues ;)
>Zeev Suraski wrote:
> >

Peter Korsgaard

Nov 13, 2000, 2:33:45 PM
On Mon, 13 Nov 2000, Jade Nicoletti wrote:

> I won't compress everything. It will depend on various things. Letting a
> script decide what has to be compressed (and with which coding, compression
> level) is a lot more flexible than some static configurations of an independent
> filter.

yes, that's my opinion too. And it is quite easy to implement gzip
compression in pure php anyway (I've implemented it as an option to the
Fasttemplate class)

But of course it could be implemented as part of the language - we have all
these extensions anyway, so one more isn't going to change much imho.

--
Bye, Peter Korsgaard

Jade Nicoletti

Nov 13, 2000, 2:15:55 PM
On Mon, Nov 13, 2000 at 09:01:38PM +0200, Zeev Suraski wrote:
> Alright, I imported your transparent gzip compression support.
>
> All of the code is centralized in the zlib extension, and makes no use of
> any specialized patches in the core of the language.
>
> To use it transparently, simply call:
>
> ob_start("ob_gzhandler");
>
> It's as simple as that.
>
> You can also compress all of your PHP scripts transparently by setting the
> new output_handler directive to 'ob_gzhandler'.
>
> I'm not sure if I imported all of the diffs from the new patch you sent me
> - so if you see some missing functionality, either add it or let me know...
>
> Zeev

You didn't add all of the diffs. I'm requesting a CVS account right now.
As soon as I get it, I will commit the remaining changes.

-Jade.

Jim Jagielski

Nov 13, 2000, 2:50:58 PM
Mostly licensing.

Zeev Suraski wrote:
>
> What kind of issues? :)
>
>
> At 21:03 13-11-00, Jim Jagielski wrote:
> >I would assume they would come after the ASF 1st :)
> >
> >mod_gzip had other issues ;)
> >Zeev Suraski wrote:
> > >
> > > I doubt we'd get sued... Is anyone using mod_gzip going to get sued?
> > >

--
===========================================================================
Jim Jagielski [|] j...@jaguNET.com [|] http://www.jaguNET.com/
"Are you suggesting coconuts migrate??"

--

Stig Venaas

Nov 13, 2000, 3:09:08 PM
Would anyone be interested in transparent encryption? Not sure how
hard it is, but could be done using OpenSSL.

Just a stray thought,

Stig

Zeev Suraski

Nov 13, 2000, 3:23:47 PM
As in SSL on the fly? That could be interesting. It's a bit weird, but I
don't know enough about it to give an educated opinion.


At 22:10 13/11/2000, Stig Venaas wrote:
>Would anyone be interested in transparent encryption? Not sure how
>hard it is, but could be done using OpenSSL.
>
>Just a stray thought,
>
>Stig
>


--


Zeev Suraski <ze...@zend.com>
CTO, Zend Technologies Ltd. http://www.zend.com/

Manuel Lemos

Nov 18, 2000, 11:17:09 AM
Hello Jade,

On 13-Nov-00 09:50:44, you wrote:

>It may be that problems will arise in certain circumstances (as described in
>Remote Communications' mod_gzip.c). Nevertheless it is a good thing to have
>this in PHP. Of course the compression isn't enabled by default. A PHP
>developer has to weigh the use of compression and really only apply it if
>its use is appropriate.

Sure, but my point is that there are many circumstances in which looking at
the Accept-Encoding header is not sufficient, because the compressed content
may not reach the end user's browser intact even when the browser supports
compression, not to speak of browsers that lie about supporting compression.

That is why Remote Communications has a commercial module that takes care
of those cases while the free version of mod_gzip simply ignores those
cases.

I propose that we start building a knowledge base of browsers that in
practice do not support compression and proxy servers that corrupt
compressed data.

I tried talking with Remote Communications but they don't want to cooperate
because they seem to be protecting their business of selling the commercial
version of the module for compression.

It sounds like their free version of mod_gzip is bait for
people to buy their commercial stuff. It's a bit of a silly attitude,
because sooner or later there will be public knowledge of what really
supports compression and what doesn't.

So why pay for their stuff when we can build a free and complete mod_gzip
module or, better, stuff that support into PHP? It's just a matter of
cooperating and sharing knowledge among us, don't you think?


Regards,
Manuel Lemos


--

Jade Nicoletti

Nov 23, 2000, 12:56:43 PM
On Sat, Nov 18, 2000 at 02:08:59PM -0300, Manuel Lemos wrote:
> I propose that we start building a knowledge base of browsers that in
> practice do not support compression and proxy servers that corrupt
> compressed data.
Why don't you just present a concept of this knowledge base?

> So why pay for their stuff when we can build a free and complete mod_gzip
> module or better, stuff that support in PHP? It's just a matter that we
> cooperate and share knowledge between us, don't you think?

T.

-Jade.

Manuel Lemos

Nov 26, 2000, 4:35:09 PM
Hello Jade,

On 23-Nov-00 14:56:43, you wrote:

>On Sat, Nov 18, 2000 at 02:08:59PM -0300, Manuel Lemos wrote:
>> I propose that we start building a knowledge base of browsers that in
>> practice do not support compression and proxy servers that corrupt
>> compressed data.
>Why don't you just present a concept of this knowledge base?

It's nothing more than the list of user agents (browsers and proxy servers)
that may mislead a server into believing that they support compression
when in reality they don't.

The usual procedure is to check that the user agent supports compression by
verifying the presence of the request header Accept-Encoding: gzip etc.

Then you need to check whether the user agent is one of those in the knowledge
base of browsers that actually do not support compression even though they send
the Accept-Encoding header.

If the browser is not in the knowledge base, then verify if the proxy
server that forwards the request is not one of those that mangle compressed
text/html input because it contains NUL bytes in the middle or for some
other unknown reason.
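The three checks described above could be sketched like this (Python, for illustration). The two lists hold placeholder entries only, since the thread names no specific products, and detecting proxies through the `Via` request header is an assumption of this sketch: a proxy that does not announce itself cannot be caught this way.

```python
# Placeholder entries only; a real knowledge base would be filled from reports.
BROKEN_BROWSERS = ("ExampleBrowser/1.0",)
BROKEN_PROXIES = ("example-proxy",)


def may_compress(headers):
    """Apply the three checks in order; headers is a dict with lower-case keys."""
    # 1. The client has to advertise gzip at all (q-values ignored for brevity).
    accept = headers.get("accept-encoding", "").lower()
    if "gzip" not in accept:
        return False

    # 2. Browsers known to send the header without coping with the result.
    agent = headers.get("user-agent", "")
    if any(marker in agent for marker in BROKEN_BROWSERS):
        return False

    # 3. Proxies known to mangle compressed text/html bodies.
    via = headers.get("via", "").lower()
    if any(marker in via for marker in BROKEN_PROXIES):
        return False

    return True
```

Keeping the lists as plain data separate from the checks means new reports can be added without touching the detection logic.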

We don't know all the reasons, browsers or proxy servers that rule out the use
of compression, but the sooner we start cooperating and sharing information,
the sooner we will be able to use compression on our sites as a viable speed
boost, applied as widely as possible without preventing part of the audience
from accessing the content normally, without compression, if necessary.


Regards,
Manuel Lemos

Jade Nicoletti

Nov 27, 2000, 3:55:56 AM
How do you want to collect the information? Just send it to the php-dev
mailing list and then hardcode it into php?

-Jade.

--

Manuel Lemos

Dec 3, 2000, 11:26:41 AM
Hello Jade,

On 27-Nov-00 05:55:56, you wrote:

>How do you want to collect the information? Just send it to the php-dev
>mailing list and then hardcode it into php?

Somebody has to include it in the compression support detection code. I
think it is better to include only the list of user agents that we have
proven able to handle compressed content.

Regards,
Manuel Lemos
