decoding filter on (utf-8) by default


Kevin Dangoor

Mar 22, 2006, 2:17:51 PM
to turbo...@googlegroups.com
As of [993] (which will be part of TG 0.9a2), the decoding filter is
on by default. This should clear up problems that some people have
seen with non-ASCII values being sent into their programs.

The encoding that it's set to use is utf-8. Do people have browsers
out there that would be sending data encoded in something other than
utf-8?

This value is still changeable in the config file. It just seemed that
the current default of not decoding the input is not a desirable
default.

Kevin

--
Kevin Dangoor
Author of the Zesty News RSS newsreader

email: k...@blazingthings.com
company: http://www.BlazingThings.com
blog: http://www.BlueSkyOnMars.com

Jorge Godoy

Mar 22, 2006, 2:30:00 PM
to turbo...@googlegroups.com
"Kevin Dangoor" <dan...@gmail.com> writes:

> The encoding that it's set to use is utf-8. Do people have browsers
> out there that would be sending data encoded in something other than
> utf-8?

IIRC, the encoding used to send data should be the same one that is used to
receive data. So, if someone configures their code to use, e.g., iso-8859-1,
then the browser will be expecting iso-8859-1 data and will be sending
iso-8859-1 as well. The same might happen for more esoteric encodings such as
the ones used by Windows (cp*).
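
To illustrate the mismatch described above, here is a minimal sketch in plain Python 3 (not TurboGears or CherryPy code). The helper name decode_form_value and the sample value are made up for the example; it only shows what happens when bytes submitted from an iso-8859-1 page are decoded with the wrong assumption.

```python
# Hypothetical illustration: a form value as a browser would submit it
# from a page that was served as iso-8859-1.
submitted = "São Paulo".encode("iso-8859-1")


def decode_form_value(raw: bytes, encoding: str) -> str:
    """Decode raw form bytes the way a decoding step on the server might."""
    return raw.decode(encoding)


# When the server uses the same encoding as the page, the value round-trips.
matched = decode_form_value(submitted, "iso-8859-1")

# A server that assumes utf-8 cannot decode the same bytes.
try:
    decode_form_value(submitted, "utf-8")
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True
```

This is why tying the decoding default to the encoding the pages are served in looks safer than hard-coding utf-8.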

> This value is still changeable in the config file. It just seemed that
> the current default of not decoding the input is not a desirable
> default.

Agreed. I'd go a step further and configure it to use the same value as
kid.encoding = "utf-8" in app.cfg. This ties the configuration and makes it
consistent between TG and Kid.

--
Jorge Godoy <jgo...@gmail.com>

Kevin Dangoor

Mar 22, 2006, 2:50:23 PM
to turbo...@googlegroups.com
On 3/22/06, Jorge Godoy <jgo...@gmail.com> wrote:
> > This value is still changeable in the config file. It just seemed that
> > the current default of not decoding the input is not a desirable
> > default.
>
> Agreed. I'd go a step further and configure it to use the same value as
> kid.encoding = "utf-8" in app.cfg. This ties the configuration and makes it
> consistent between TG and Kid.

I think you're right about the browsers working that way. I've changed
it to use kid.encoding.

Kevin

Max Ischenko

Mar 23, 2006, 7:56:16 AM
to TurboGears
> I think you're right about the browsers working that way.
> I've changed it to use kid.encoding.

I think we can go one _more_ step further and set a default content-type,
namely:
if decoding_filter is on and tg.content_type is not set, set the default
content-type to "text/html; charset=" + kid.encoding.
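
A rough sketch of that rule in plain Python 3, assuming the settings are available as a simple dict. The function name default_content_type and the exact key "decoding_filter.on" are assumptions for illustration; tg.content_type and kid.encoding are the names used in this thread. This is not the code that went into TurboGears.

```python
from typing import Mapping, Optional


def default_content_type(config: Mapping[str, object]) -> Optional[str]:
    """Pick a content-type following the rule proposed above.

    `config` is a hypothetical flat dict of settings; the key
    "decoding_filter.on" is an assumed name for the on/off switch.
    """
    explicit = config.get("tg.content_type")
    if explicit:
        # An explicitly configured content-type always wins.
        return str(explicit)
    if config.get("decoding_filter.on", False):
        # Fall back to the template encoding, defaulting to utf-8.
        encoding = config.get("kid.encoding", "utf-8")
        return "text/html; charset=%s" % encoding
    # Decoding is off and nothing was configured: leave it unset.
    return None
```

With this rule, the charset announced to the browser and the charset used to decode its submissions come from the same setting.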

Kevin Dangoor

Mar 23, 2006, 8:00:39 AM
to turbo...@googlegroups.com

That's a good idea. Would you mind opening a ticket to do that so that
we're sure it gets done?

Kevin

fumanchu

Mar 23, 2006, 12:36:44 PM
to TurboGears
> IIRC, the encoding used to send data should
> be the same that is used to receive data.

...which would work great if the server always called the client first.
;) There's quite a bit of good research baked into CP's decodingfilter,
to fall back to required/reasonable guesses when the charset is not
specified in the Content-Type header.
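
As a rough illustration of the general idea (use the declared charset when there is one, otherwise try a list of guesses), here is a small Python 3 sketch. It is not CherryPy's decodingfilter logic; the function names and the fallback order ("utf-8", then "iso-8859-1") are assumptions chosen for the example.

```python
from email.message import Message
from typing import Optional, Sequence, Tuple


def charset_from_content_type(header: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type header, if any."""
    if not header:
        return None
    msg = Message()
    msg["Content-Type"] = header
    return msg.get_content_charset()


def decode_with_fallback(
    raw: bytes,
    content_type: Optional[str],
    fallbacks: Sequence[str] = ("utf-8", "iso-8859-1"),
) -> Tuple[str, str]:
    """Decode `raw`, preferring the declared charset, then the fallbacks.

    Returns the decoded text and the encoding that worked.
    """
    declared = charset_from_content_type(content_type)
    candidates = ([declared] if declared else []) + list(fallbacks)
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown encoding: try the next candidate.
            continue
    raise ValueError("could not decode request data")
```

Note that iso-8859-1 accepts any byte sequence, so putting it last makes it a catch-all guess rather than a proof that the client really used it.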


Robert Brewer
System Architect
Amor Ministries
fuma...@amor.org

Max Ischenko

Mar 24, 2006, 10:35:59 AM
to TurboGears
Actually, I wanted to have this a long time ago, see #335. I'll try to
close it RSN (hopefully my commit privs still work).
