Responsibility for encoding concerns in web applications

Ney André de Mello Zunino

Nov 26, 2012, 12:01:55 PM
Hello.

I had always assumed that the handling of encoding issues in the context
of web applications should be a responsibility of the HTTP server. On
the other hand, I have seen such issues dealt with by the application
itself, mostly in the context of Java EE (via filter classes). A
colleague and I started arguing about which approach is better and I
decided to check what others had to say.

So, how do you see the responsibility for encoding issues when serving
contents from web applications? "Who" should be in charge of it? What
are some pros and cons of each approach?

Thank you for your input.

P.S.: I admit I was rather unsure about which newsgroup to post to; I
was only going to post to c.i.w.authoring.misc, but since it looks
abandoned, I chose to add c.i.w.authoring.html as well. Feel free to set
follow-ups appropriately.

Regards,

--
Ney André de Mello Zunino

James Moe

Nov 26, 2012, 1:24:12 PM
On 11/26/2012 10:01 AM, Ney André de Mello Zunino wrote:
>
> I had always assumed that the handling of encoding issues ...
>
What are the "encoding issues" that concern you?

--
James Moe
jmm-list at sohnen-moe dot com

Ney André de Mello Zunino

Nov 26, 2012, 1:34:55 PM
On 26/11/2012 16:24, James Moe wrote:
> On 11/26/2012 10:01 AM, Ney André de Mello Zunino wrote:
>>
>> I had always assumed that the handling of encoding issues ...
>>
> What are the "encoding issues" that concern you?
>

Nothing really special; what I was referring to was simply the act of
providing the appropriate encoding information for the "Content-type"
response header, depending on which resource is being requested.
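
For concreteness, the "encoding information" meant here is the charset parameter of the Content-Type header. The following is a minimal Python sketch of how a client might read that parameter; the header value is made up for illustration and is not taken from any real server.

```python
from email.message import Message

# Hypothetical header value, as a server or an application might send it.
header_value = "text/html; charset=ISO-8859-1"

msg = Message()
msg["Content-Type"] = header_value

media_type = msg.get_content_type()   # media type without its parameters
charset = msg.get_content_charset()   # charset parameter, lowercased; None if absent

print(media_type, charset)
```

The header says nothing about who produced it. Whether the server or the application filled in the charset is invisible to the client, which is why the question is about responsibility on the sending side.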

Regards,

--

Thomas 'PointedEars' Lahn

Nov 26, 2012, 1:41:04 PM
Ney André de Mello Zunino wrote:

> I had always assumed that the handling of encoding issues in the context
> of web applications should be a responsibility of the HTTP server. On
> the other hand, I have seen such issues dealt with by the application
> itself, mostly in the context of Java EE (via filter classes). A
> colleague and I started arguing about which approach is better and I
> decided to check what others had to say.
>
> So, how do you see the responsibility for encoding issues when serving
> contents from web applications?

I see the responsibility for encoding issues with the developer.

> "Who" should be in charge of it?

In charge of what, exactly?

> What are some pros and cons of each approach?

Your question is too broad. Be more specific.

> P.S.: I admit I was rather unsure about which newsgroup to post to; I
> was only going to post to c.i.w.authoring.misc, but since it looks
> abandoned, I chose to add c.i.w.authoring.html as well. Feel free to set
> follow-ups appropriately.

This question has nothing to do with writing HTML and is therefore off-topic
here.

The purpose of Usenet is not to provide you with an answer to your question
or a solution to your problem as quickly as possible; it is to discuss your
question or problem in detail, to determine the best answers or solutions,
from which you might derive your answer or solution, in time.

Therefore, you should post based on where the question or problem is on-
topic, not where the people seemingly are. For if you post where the people
are *instead of* where the question is on-topic, the situation in the other
newsgroup, where the question is on-topic, cannot change for the better.

<http://www.catb.org/~esr/faqs/smart-questions.html>


F'up2 comp.infosystems.www.authoring.misc

PointedEars
--
var bugRiddenCrashPronePieceOfJunk = (
navigator.userAgent.indexOf('MSIE 5') != -1
&& navigator.userAgent.indexOf('Mac') != -1
) // Plone, register_function.js:16

Jukka K. Korpela

Nov 26, 2012, 1:58:26 PM
2012-11-26 19:01, Ney André de Mello Zunino wrote:

> I had always assumed that the handling of encoding issues in the context
> of web applications should be a responsibility of the HTTP server.

An HTTP server is software, or hardware, or a combination of them. Not a
human being, or otherwise a sentient being, which could have any
responsibility. Even less than a dog. Any reference to responsibilities
of servers should be taken as colloquial speech referring to the people
responsible for server behavior in some sense.

> On the other hand, I have seen such issues dealt with by the application
> itself, mostly in the context of Java EE (via filter classes).

Which "application itself"? Any application that generates a web page as
a response to a GET request to an HTTP server should be considered as
"responsible" for indicating the encoding, in the sense described above.

> So, how do you see the responsibility for encoding issues when serving
> contents from web applications?

Define "web application".

Generally, servers are configured to send encoding information in HTTP
headers, or to send no such information in them, when responding to a
request that is mapped to a static file on the server. In other kinds of
requests, served with some software running on the server, it's clearly
that software that should generate adequate Content-Type headers.
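
As an illustration of software generating its own Content-Type header, here is a minimal sketch of a Python WSGI application. The names `app` and `ENCODING` are hypothetical, as is the choice of UTF-8. The point is only that the code which encodes the body is also the code which knows which charset to declare.

```python
from wsgiref.util import setup_testing_defaults

ENCODING = "utf-8"  # assumption: this application emits UTF-8


def app(environ, start_response):
    """Hypothetical WSGI application that declares its own charset."""
    # "\u00e9" is the letter e with an acute accent.
    body = "<p>Ney Andr\u00e9 de Mello Zunino</p>".encode(ENCODING)
    headers = [
        ("Content-Type", f"text/html; charset={ENCODING}"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


# Call the application directly, without a real server in front of it.
captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


environ = {}
setup_testing_defaults(environ)
chunks = app(environ, start_response)
print(captured["headers"]["Content-Type"])
```

Because the label and the `encode()` call share one constant, they cannot drift apart, which is harder to guarantee when the label lives in server configuration.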

> P.S.: I admit I was rather unsure about which newsgroup to post to; I
> was only going to post to c.i.w.authoring.misc, but since it looks
> abandoned, I chose to add c.i.w.authoring.html as well. Feel free to set
> follow-ups appropriately.

As a rule, if you don't know which group is the best one, don't post the
message before you have found it out. In most cases, any alternative is
better than crossposting.

F'ups set to c.i.w.a.html, since it has traditionally been a group for
discussing matters like this, too.

--
Yucca, http://www.cs.tut.fi/~jkorpela/

Barry Margolin

Nov 26, 2012, 2:01:50 PM
In article <k90coc$gf$1...@speranza.aioe.org>,
Ney André de Mello Zunino <zun...@softplan.com.br> wrote:

> On 26/11/2012 16:24, James Moe wrote:
> > On 11/26/2012 10:01 AM, Ney André de Mello Zunino wrote:
> >>
> >> I had always assumed that the handling of encoding issues ...
> >>
> > What are the "encoding issues" that concern you?
> >
>
> Nothing really special; what I was referring to was simply the act of
> providing the appropriate encoding information for the "Content-type"
> response header, depending on which resource is being requested.

If the resource is an application, how is the server supposed to know
what encoding it's using?
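
A short Python sketch of why the bytes alone do not settle the matter: the same name, encoded two ways, is misread whenever the receiver assumes the other charset. The variable names are illustrative only.

```python
name = "Andr\u00e9"  # "Andre" with an acute accent on the final e

utf8_bytes = name.encode("utf-8")       # the accented letter becomes two bytes
latin1_bytes = name.encode("latin-1")   # the accented letter becomes one byte

# UTF-8 bytes read as ISO-8859-1: every byte is valid, so nothing fails,
# but the single accented letter turns into two unrelated characters.
misread_as_latin1 = utf8_bytes.decode("latin-1")

# ISO-8859-1 bytes read as UTF-8: the lone 0xE9 byte is not valid UTF-8,
# so it is replaced by U+FFFD, the replacement character.
misread_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

print(misread_as_latin1)
print(misread_as_utf8)
```

A server that merely relays the application's output sees only the bytes, so it has no reliable way to pick the right label on the application's behalf.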

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

Denis McMahon

Nov 26, 2012, 6:52:51 PM
On Mon, 26 Nov 2012 15:01:55 -0200, Ney André de Mello Zunino wrote:

> I had always assumed that the handling of encoding issues in the context
> of web applications should be a responsibility of the HTTP server.

Nope.

It's the responsibility of the person or people developing and operating
the website to make sure it happens correctly.

The web server doesn't parse content, it sends it. It can be configured
to serve specified file extensions with a specified content type, but it
doesn't know if the file contents match the file extension. Occasionally
I come across misconfigured web servers that serve images with the
incorrect image type, typically png as jpeg or jpg as gif, presumably
caused by copying and pasting bits of config by people who didn't really
understand what they were doing (i.e. human failure, not software failure),
although that hasn't happened recently - it may have been a couple of
years since I last observed it - so I guess software and config updates
may be resolving that particular issue.
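
The extension-based lookup described above can be sketched in a few lines of Python. The table `EXTENSION_TYPES` and the function `content_type_for` are hypothetical stand-ins for a server's MIME configuration, not any particular server's actual mechanism.

```python
import os.path

# Hypothetical extension table, standing in for a server's MIME configuration.
EXTENSION_TYPES = {
    ".html": "text/html",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".gif": "image/gif",
}

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def content_type_for(path):
    """Pick a Content-Type from the file name alone, never from the bytes."""
    _, ext = os.path.splitext(path)
    return EXTENSION_TYPES.get(ext.lower(), "application/octet-stream")


# A PNG image saved under a misleading name.
file_name = "logo.jpg"
file_bytes = PNG_SIGNATURE + b"...rest of the image..."

declared = content_type_for(file_name)
really_png = file_bytes.startswith(PNG_SIGNATURE)
print(declared, really_png)
```

The lookup reports a JPEG type for data that is in fact PNG, because the file's contents never enter into the decision.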

When the server gets the content from another process, e.g. by running a php
script, then the php script could deliver (x)html, a compressed archive
file of some sort, some form of proprietary or open formatted document,
images, sound files, video etc. In such cases, it is ridiculous to expect
the web server to know what content type to serve for a php file, and the php
script should supply the relevant information.
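
To make that concrete, here is a minimal sketch, in Python rather than PHP, of a single handler that can return two unrelated content types. The function `render` and its `fmt` argument are hypothetical; only the handler itself knows which type it produced.

```python
import io
import zipfile


def render(fmt):
    """Hypothetical handler: return (content_type, body) for the requested format."""
    if fmt == "zip":
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w") as archive:
            archive.writestr("report.txt", "hello")
        return "application/zip", buffer.getvalue()
    # Default: an HTML page, labelled with the charset actually used to encode it.
    return "text/html; charset=utf-8", "<p>hello</p>".encode("utf-8")


html_type, html_body = render("html")
zip_type, zip_body = render("zip")
print(html_type, zip_type)
```

Both responses could be served from the same URL path, so a rule keyed on the file extension could not label both correctly.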

So it varies: the server can make a best guess based on the file type,
which is usually fine for static content, but that will not always be
correct for dynamically generated content that is just being piped
through the server from some other application. Such external sources
really need to be able to take care of such notifications, because a
given url file extension may not always relate to a single unique content
type.
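
The filter approach mentioned at the start of the thread sits between the two positions. As a rough Python analogue of a Java EE filter, the WSGI middleware below adds a default charset only when the application left one out. All names here (`with_default_charset`, `forgetful_app`, `record`) are hypothetical, and the UTF-8 default is an assumption.

```python
def with_default_charset(app, charset="utf-8"):
    """Wrap a WSGI app; add a charset to text/* responses that lack one."""
    def middleware(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            fixed = []
            for name, value in headers:
                if (name.lower() == "content-type"
                        and value.lower().startswith("text/")
                        and "charset=" not in value.lower()):
                    value = f"{value}; charset={charset}"
                fixed.append((name, value))
            return start_response(status, fixed, exc_info)
        return app(environ, patched_start_response)
    return middleware


# Hypothetical application that forgets to declare a charset.
def forgetful_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<p>hello</p>"]


seen = []


def record(status, headers, exc_info=None):
    seen.extend(headers)


body = with_default_charset(forgetful_app)({}, record)
print(seen)
```

Such a wrapper only labels the response; it does not convert any bytes. If the application actually wrote ISO-8859-1, the added label would be wrong, so the default is only as good as the developers' knowledge of what the application emits.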

Rgds

Denis McMahon

James Moe

Dec 4, 2012, 11:57:32 AM
The encoding information provided is that which is used to encode the
MIME section. That probably does not answer your question, though.
As a broad rule, MIME text types are usually encoded with
quoted-printable. All other types (image, audio, application, ...) are
encoded base64.
The "responsibility" of deciding which encoding to use is on the
developer of the application that creates MIME messages. The software
that makes the decision embodies the policies created by the developers.

Barry Margolin

Dec 4, 2012, 12:58:08 PM
In article <V66dncjdebXitiPN...@giganews.com>,
I suspect he's really asking about the charset information.

Thomas 'PointedEars' Lahn

Dec 4, 2012, 12:58:30 PM
James Moe wrote:

> On 11/26/2012 11:34 AM, Ney André de Mello Zunino wrote:
>> On 26/11/2012 16:24, James Moe wrote:
>>>> I had always assumed that the handling of encoding issues ...
>>> What are the "encoding issues" that concern you?
>>
>> Nothing really special; what I was referring to was simply the act of
>> providing the appropriate encoding information for the "Content-type"
>> response header, depending on which resource is being requested.
>>
> The encoding information provided is that which is used to encode the
> MIME section.

No, in an *HTTP* "Content-Type" header field it is what the message body is
encoded with. We are talking about *Web* applications here.

> That probably does not answer your question, though.

At least it does not answer it correctly.

> As a broad rule, MIME text types are usually encoded with
> quoted-printable.

No, Quoted-Printable is one possible Transfer-Encoding in MIME; it is not
required, and it is virtually *never* used on the Web (because HTTP is an
8-bit-safe protocol).

> All other types (image, audio, application, ...) are
> encoded base64.

No, in an 8-bit-safe protocol such as HTTP, there is no need to use a
7-bit-safe Transfer-Encoding such as QP or base64. It is more likely that an
HTTP message body would be gzipped than that it would be base64-encoded –
the latter increases the payload by about 33%.
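
The 33% figure follows from base64 mapping every 3 input bytes to 4 output characters. A small Python sketch with a made-up, repetitive payload shows the arithmetic, with gzip alongside for comparison:

```python
import base64
import gzip

payload = b"<p>Hello, world!</p>\n" * 150   # made-up, repetitive HTML

encoded = base64.b64encode(payload)
compressed = gzip.compress(payload)

# Fractional size increase caused by base64.
overhead = len(encoded) / len(payload) - 1
print(len(payload), len(encoded), len(compressed), f"{overhead:.1%}")
```

The payload length here is a multiple of 3, so the base64 output is exactly 4/3 of the input with no padding. How much gzip saves depends entirely on the data, and this payload is deliberately repetitive.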

> The "responsibility" of deciding which encoding to use is on the
> developer of the application that creates MIME messages. The software
> that makes the decision embodies the policies created by the developers.

There is at least one correct bit in that answer.

--
PointedEars

Twitter: @PointedEars2
Please do not Cc: me. / Bitte keine Kopien per E-Mail.

tlvp

Dec 5, 2012, 8:48:44 PM
On Tue, 04 Dec 2012 18:58:30 +0100, Thomas 'PointedEars' Lahn wrote:

> There is at least one correct bit in that answer.

And is the bit you're thinking of a "0" or a "1", pray tell :-) ?