read charset of meta tag

MZ

Aug 28, 2008, 6:12:13 AM
Hello!

I cannot find anything about this thing.

I need to read what kind of charset is being set on the

<META http-equiv="Content-Type" CONTENT="text/html; charset=iso-8859-2">

I need to do this in JavaScript.
How do I read the charset in JavaScript? I want a JavaScript function that
returns iso-8859-2 in this case.

Please help me, because google fails:(
Thank you in advance for help
M.

Laser Lips

Aug 28, 2008, 6:41:32 AM

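<script type="text/javascript">
// Collect the content attribute of the meta elements on the page.
var META = document.getElementsByTagName("META");
var content = "";
for (var x = 0; x < META.length; x++)
{
    var att = META[x].attributes;

    for (var y = 0; y < att.length; y++)
    {
        // Compare the attribute name case-insensitively.
        if (att[y].name.toLowerCase() == "content")
        {
            // If several meta elements have a content attribute, the last one wins.
            content = att[y].value;
        }
    }
}
alert(content);
</script>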

Martin Honnen

Aug 28, 2008, 6:42:05 AM
MZ wrote:

> I need to read what kind of charset is being set on the
>
> <META http-equiv="Content-Type" CONTENT="text/html; charset=iso-8859-2">

You can find all 'meta' elements with getElementsByTagName e.g.
var metas = document.getElementsByTagName('meta');
Each such element has the properties listed in
http://www.w3.org/TR/DOM-Level-2-HTML/html.html#ID-37041454
so you can loop through the meta elements and look for one where
metas[i].httpEquiv.toLowerCase() === 'content-type', then you can access
metas[i].content and for instance use a regular expression to look for
the charset parameter.
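
A minimal sketch of that loop, assuming the helper names parseCharset and findMetaCharset (both hypothetical, not part of any API) and a regular expression that is only one possible way to pick out the charset parameter:

```javascript
// Hypothetical helper: extract the charset parameter from a content value
// such as "text/html; charset=iso-8859-2". Returns null if none is found.
function parseCharset(contentValue) {
  // Case-insensitive; tolerates whitespace around "=" and optional quotes.
  var match = /charset\s*=\s*["']?([^\s;"']+)/i.exec(contentValue || "");
  return match ? match[1] : null;
}

// Hypothetical helper: doc is a document (or any object that offers
// getElementsByTagName returning elements with httpEquiv and content).
function findMetaCharset(doc) {
  var metas = doc.getElementsByTagName("meta");
  for (var i = 0; i < metas.length; i++) {
    var httpEquiv = metas[i].httpEquiv || "";
    if (httpEquiv.toLowerCase() === "content-type") {
      var charset = parseCharset(metas[i].content);
      if (charset) {
        return charset;
      }
    }
  }
  return null; // no Content-Type meta element with a charset parameter
}
```

In a browser this would be called as findMetaCharset(document). It reports only what the markup declares, which is not necessarily the encoding the browser ends up using.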

On the other hand most browsers by now expose a property like
document.characterSet (Mozilla) or document.charset (IE) which should
give you the charset the browser has taken from the meta or from real
HTTP headers.
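
As a rough sketch, the two properties can be tried in turn. The function name getDocumentCharset is hypothetical, and the fallback order is just an assumption:

```javascript
// Hypothetical helper: doc is a document-like object.
// characterSet is the Mozilla property, charset the IE property.
function getDocumentCharset(doc) {
  return doc.characterSet || doc.charset || null;
}
```

In a browser this would be called as getDocumentCharset(document). It returns null where neither property is exposed.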


--

Martin Honnen
http://JavaScript.FAQTs.com/

MZ

Aug 28, 2008, 7:11:28 AM

User "Laser Lips" <louds...@gmail.com> wrote in message
news:c873c2f8-3aac-46e8...@p25g2000hsf.googlegroups.com...

Thank you very much for the help.

I have modified this code a little:

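var META = document.getElementsByTagName("META");
var jaka_strona_kodowa = "";  // the charset that was found ("which code page")
for (var x = 0; x < META.length; x++)
{
    var att = META[x].attributes;

    for (var y = 0; y < att.length; y++)
    {
        if (att[y].name.toLowerCase() == "content")
        {
            // The prefix to look for ("searched string").
            var szukany_ciag = "text/html; charset=";
            var myString = att[y].value.toLowerCase();
            if (myString.indexOf(szukany_ciag) == 0)
            {
                // Take everything after the prefix from the original value ("result").
                var wynik = att[y].value;
                jaka_strona_kodowa = wynik.substring(szukany_ciag.length);
            }
        }
    }
}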

Then jaka_strona_kodowa gives me utf-8.
There can be many meta tags that have a "content" attribute.

Thank you again
M.

optimistx

Aug 28, 2008, 7:17:19 AM
Martin Honnen wrote:
...

> On the other hand most browsers by now expose a property like
> document.characterSet (Mozilla) or document.charset (IE) which should
> give you the charset the browser has taken from the meta or from real
> HTTP headers.

Is it true that these meta tags often have different values from what
the browser actually uses? The page author might not have a correct idea
of the character set the server is sending, and may believe that the
server obeys whatever is put into the meta tags. In this mess the browsers
try to work out the character set by examining the byte stream with
heuristic rules, often getting somewhat correct results.
I used to think that meta tags were commands to the server: if I put
ISO-8859-10 in the tag, then the server transforms the page to that
character set!

E.g. JavaScript files on the server: I suspect most authors here do not
have 100% reliable and true information about which character set their
files use. At least I do not :)


Richard Cornford

Aug 28, 2008, 8:37:15 AM
optimistx wrote:
> Martin Honnen wrote:
> ...
>> On the other hand most browsers by now expose a property
>> like document.characterSet (Mozilla) or document.charset
>> (IE) which should give you the charset the browser has
>> taken from the meta or from real HTTP headers.
>
> Is it true that these metatags have often different values
> compared to what the browser actually uses?

That is certainly possible, as the browser will follow the HTTP headers
and take any character set declarations in the headers in preference to
anything else (as is required by HTTP). And it is certainly common for
the attributes of META elements to be at odds with HTTP headers (and
even at odds with document mark-up; how often do you see XHTML mark-up
contain a META element that attempts to assert that document to be
text/html?).

> And the page author might have not a correct idea of the
> character set, which the server is sending, thus believing
> that putting something to metatags the server obeys him/her.

Yes, misconceptions about web technologies are rife.

> The browsers in this mess try to conclude the character set
> by examining the byte stream with heuristic rules,

That has been observed (particularly with IE (at least up until 6)).

> getting often somewhat correct results.

And some spectacularly wrong results, hence the observation of the
phenomenon.

> I thought earlier that metatags are commands to the server:

For some servers they have been; the server would process the document
prior to sending and base the headers used on the META elements. But
that is not very common with servers.

> if I put ISO-8859-10 to the tag, then the server transforms
> the page to that characterset!
>
> Eg javascript files on the server: I suspect most authors here
> have not 100 % reliable and true info which characterset they
> have.

Observing that some people don't know what they are doing is no reason
to assume 'most here' do not.

> At least I have not
> :)

Get yourself a web debugging proxy (Fiddler or Charles) or some other
tool that can show you the HTTP headers.
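
For pages on the same origin, a script can also ask for the headers itself. This is a different technique from a proxy and only a sketch: the names charsetFromHeader and fetchHeaderCharset are hypothetical, the XhrCtor parameter exists only so that the function can be tried outside a browser, and a server may not answer a HEAD request exactly as it answers a GET.

```javascript
// Hypothetical helper: pull the charset parameter out of a Content-Type
// header value such as "text/html; charset=ISO-8859-2".
function charsetFromHeader(headerValue) {
  var match = /charset\s*=\s*"?([^\s;"]+)/i.exec(headerValue || "");
  return match ? match[1] : null;
}

// Hypothetical helper: send a HEAD request for url, then call
// callback(charset) with the header's charset, or null if there is none.
// XhrCtor is an assumption of this sketch; it defaults to XMLHttpRequest.
function fetchHeaderCharset(url, callback, XhrCtor) {
  var xhr = new (XhrCtor || XMLHttpRequest)();
  xhr.open("HEAD", url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4) {
      callback(charsetFromHeader(xhr.getResponseHeader("Content-Type")));
    }
  };
  xhr.send(null);
}
```

In a browser, fetchHeaderCharset(location.href, alert) would show the charset from the Content-Type header, or null if the header carries none.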

Richard.
