"OLED & =?ISO-8859-1?Q?br=E4nsleceller?="
It's found in the Subject: header in a usenet message (i.e. not in a
mail) which had these headers:
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
So, ok - it isn't quoted-printable, or is it? I've always thought that
the Content-Type header only referred to the body of the text and that
the subject line still had to be in quoted-printable, which is why the
Swedish characters in the post body aren't garbled like above. How do I
parse it back to what it's supposed to be:
"OLED & bränsleceller"
Thanks for any help.
--
Sandman[.net]
>Maybe it's me who is misunderstanding this, but isn't this string in
>quoted printable:
>
> "OLED & =?ISO-8859-1?Q?br=E4nsleceller?="
>
>It's found in the Subject: header in a usenet message (i.e. not in a
>mail) which had these headers:
>
> Content-Type: text/plain; charset=ISO-8859-1
> Content-Transfer-Encoding: 8bit
>
>So, ok - it isn't quoted printable, or?
Not quoted-printable; see RFC 2047 - MIME (Multipurpose Internet Mail
Extensions) Part Three: Message Header Extensions for Non-ASCII Text
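For illustration, RFC 2047 calls such a token an "encoded-word" and lays it out as =?charset?encoding?encoded-text?=. A minimal sketch that splits the one from your Subject line into its three fields (the variable names are only illustrative):

```php
<?php
// Sketch only: split a single RFC 2047 encoded-word into its three fields.
$word = '=?ISO-8859-1?Q?br=E4nsleceller?=';

if (preg_match('/^=\?([^?]+)\?([BbQq])\?([^?]*)\?=$/', $word, $m)) {
    echo "charset:  {$m[1]}\n"; // ISO-8859-1
    echo "encoding: {$m[2]}\n"; // Q (similar to quoted-printable); B would mean base64
    echo "text:     {$m[3]}\n"; // br=E4nsleceller
}
```

The "OLED & " part in front is plain ASCII and is not part of the encoded-word, so only the token itself needs decoding.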
--
Andy Hassall :: an...@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool
> On Fri, 14 Jul 2006 15:48:47 +0200, Sandman <m...@sandman.net> wrote:
>
> >Maybe it's me who is misunderstanding this, but isn't this string in
> >quoted printable:
> >
> > "OLED & =?ISO-8859-1?Q?br=E4nsleceller?="
> >
> >It's found in the Subject: header in a usenet message (i.e. not in a
> >mail) which had these headers:
> >
> > Content-Type: text/plain; charset=ISO-8859-1
> > Content-Transfer-Encoding: 8bit
> >
> >So, ok - it isn't quoted printable, or?
>
> Not quoted-printable; see RFC 2047 - MIME (Multipurpose Internet Mail
> Extensions) Part Three: Message Header Extensions for Non-ASCII Text
Aha - gotcha. Any idea how to decode it to an 8-bit string?
--
Sandman[.net]
I think the term is "encoded-word" (RFC 2047). It's sort of confusing, since the Q encoding used inside it closely resembles quoted-printable.
> So, ok - it isn't quoted-printable, or is it? I've always thought that
> the Content-Type header only referred to the body of the text and that
> the subject line still had to be in quoted-printable, which is why the
> Swedish characters in the post body aren't garbled like above. How do I
> parse it back to what it's supposed to be:
>
> "OLED & bränsleceller"
With a regular expression, of course.
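function quoted_word_callback($m) {
    switch (strtoupper($m[2])) {
        // Q encoding: like quoted-printable, except "_" stands for a space
        case 'Q': return quoted_printable_decode(str_replace('_', ' ', $m[3]));
        // B encoding: plain base64
        case 'B': return base64_decode($m[3]);
    }
    return $m[0]; // unknown encoding: leave the encoded-word untouched
}
// $s holds the raw header value; the output stays in the charset named in $m[1]
echo preg_replace_callback('/=\?([^?]+)\?([BQ])\?([^?]*)\?=/i',
    'quoted_word_callback', $s);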
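If the iconv extension is available (an assumption about your PHP build), the built-in iconv_mime_decode() can do the same job and convert the result to a target charset in one step. A minimal sketch, with $subject as an illustrative variable name:

```php
<?php
// Sketch only: decode an RFC 2047 header value with PHP's iconv extension.
$subject = 'OLED & =?ISO-8859-1?Q?br=E4nsleceller?=';

// The third argument is the charset the decoded string should be returned in.
// ICONV_MIME_DECODE_CONTINUE_ON_ERROR keeps going past malformed encoded-words.
$decoded = iconv_mime_decode($subject, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, 'UTF-8');

echo $decoded, "\n";
```

Pass 'ISO-8859-1' instead of 'UTF-8' as the last argument if you want the 8-bit Latin-1 string rather than UTF-8.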
It didn't work...
#!/usr/bin/php
<?php
function quoted_word_callback($m) {
    switch($m[2]) {
        case 'Q': case 'q': return quoted_printable_decode($m[3]);
        case 'B': case 'b': return base64_decode($m[3]);
    }
}
$s = "OLED & =?ISO-8859-1?Q?br=E4nsleceller?=";
echo preg_replace_callback('/=\?(.*)\?([BQ])\?(.*)\?=/U',
    'quoted_word_callback', $s);
?>
Result:
OLED & br”nsleceller
Instead of:
OLED & bränsleceller
--
Sandman[.net]
The code works fine for me. Just make sure that wherever you're printing
the result can handle ISO-8859-1 characters (it seems you're running it as
a shell script, so your console may be configured to show UTF-8).
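To illustrate the mismatch, here is a small sketch (assuming the iconv extension is available; the variable names are illustrative). The decoded "ä" is the single byte 0xE4 in ISO-8859-1, while a UTF-8 console expects the two bytes 0xC3 0xA4:

```php
<?php
// Sketch only: show why Latin-1 output looks wrong on a UTF-8 console.
$latin1 = quoted_printable_decode('br=E4nsleceller'); // raw ISO-8859-1 bytes
$utf8   = iconv('ISO-8859-1', 'UTF-8', $latin1);      // same text re-encoded as UTF-8

echo bin2hex($latin1[2]), "\n";          // e4   (one byte for "ä")
echo bin2hex(substr($utf8, 2, 2)), "\n"; // c3a4 (two bytes for "ä")
echo $utf8, "\n";                        // readable on a UTF-8 console
```

Converting before printing, or switching the console to ISO-8859-1, should both make the character display correctly.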
--
-+ http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
++ My site about web programming: http://bits.demogracia.com
+- My humor site with UVA rays: http://www.demogracia.com
--
> *** Sandman wrote (Sat, 15 Jul 2006 08:01:58 +0200):
> > Result:
> >
> > OLED & br”nsleceller
> >
> > Instead of:
> >
> > OLED & bränsleceller
>
> The code works fine for me. Just make sure that wherever you're printing
> the result can handle ISO-8859-1 characters (it seems you're running it as
> a shell script, so your console may be configured to show UTF-8).
Yes! That was it - sorry about that!
--
Sandman[.net]