From: michelm
Date: Mon, 23 Jun 2008 02:05:26 -0700 (PDT)
Local: Mon, Jun 23 2008 5:05 am
Subject: US-ASCII vs UTF-8
Sitemaps doesn't seem to recognize pages encoded in UTF-8 and reports
them as US-ASCII instead, even when the right META charset is declared
in the pages.

I read somewhere that extended characters have to be present in the
pages for them to be recognized as UTF-8...

Does anybody know something about that? Does it have an effect on
the way pages are indexed/crawled?

Thanks.


From: webado
Date: Mon, 23 Jun 2008 04:03:51 -0700 (PDT)
Local: Mon, Jun 23 2008 7:03 am
Subject: Re: US-ASCII vs UTF-8
OK, see here for some discussions and clarifications:
http://www.google.com/search?sourceid=navclient&ie=UTF-8&rlz=1T4GGIH_...

In a nutshell, the characters in US-ASCII form a subset of UTF-8, and
any US-ASCII text is byte-for-byte identical when saved as UTF-8.
US-ASCII is also a subset of ISO-8859-1. But ISO-8859-1 is not a
subset of UTF-8: its characters above 127 (accented letters, for
example) are encoded with different bytes in UTF-8.

If your document contains only characters found in US-ASCII, then even
if you set the charset encoding to UTF-8 it will be reported as
US-ASCII in Webmaster Tools.
Similarly, if your charset encoding is set to ISO-8859-1 but all the
characters used are part of US-ASCII (e.g. no accented characters are
present), that too will be reported as US-ASCII.
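
To make the subset relationship concrete, here is a minimal Python
sketch. The helper smallest_charset is a hypothetical name invented for
this illustration; it only mimics the kind of report described above and
is not how Webmaster Tools actually detects encodings.

```python
def smallest_charset(text: str) -> str:
    """Return the narrowest of three encodings that can represent text.

    Hypothetical helper for illustration only.
    """
    for name in ("us-ascii", "iso-8859-1", "utf-8"):
        try:
            text.encode(name)
            return name
        except UnicodeEncodeError:
            continue
    return "unknown"


plain = "Hello, sitemap"        # only US-ASCII characters
accented = "Fran\u00e7ais"      # contains the c-cedilla, U+00E7

# ASCII-only text produces identical bytes in all three encodings,
# so nothing in the bytes distinguishes a UTF-8 page from an ASCII one.
same_bytes = (
    plain.encode("us-ascii")
    == plain.encode("iso-8859-1")
    == plain.encode("utf-8")
)

# The accented character is one byte in ISO-8859-1 but two in UTF-8.
latin1_bytes = accented.encode("iso-8859-1")
utf8_bytes = accented.encode("utf-8")

print(same_bytes, latin1_bytes, utf8_bytes)
print(smallest_charset(plain), smallest_charset(accented))
```

The point of the sketch is that a page with no characters above 127
carries no evidence of which of the three encodings it was saved in.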



From: michelm
Date: Mon, 23 Jun 2008 08:12:36 -0700 (PDT)
Local: Mon, Jun 23 2008 11:12 am
Subject: Re: US-ASCII vs UTF-8
Merci, thanks, for the explanation.
What I don't understand is that the W3C Markup Validator doesn't
validate my site if I manually choose the US-ASCII encoding,
but it's OK if I select UTF-8 or iso-8859-1.
I know that all pages contain at least one extended character (the
French ç) and are saved in UTF-8.
But I changed from iso-8859-1 to UTF-8 only 2 or 3 months ago, so
maybe Google doesn't refresh the encoding statistics very often and I
will see the correct encoding later...
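
The validator behaviour described here is consistent with how the three
encodings treat bytes above 127. The following Python sketch is only an
illustration under that assumption; decodes_as is a hypothetical helper,
and nothing here reflects the W3C validator's actual implementation.

```python
def decodes_as(data: bytes, charset: str) -> bool:
    """Report whether data is a valid byte sequence in charset."""
    try:
        data.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


# A page fragment saved as UTF-8 that contains the c-cedilla (U+00E7).
page_bytes = "Le fran\u00e7ais".encode("utf-8")

ok_ascii = decodes_as(page_bytes, "us-ascii")      # bytes above 127 are invalid
ok_utf8 = decodes_as(page_bytes, "utf-8")          # the encoding actually used
ok_latin1 = decodes_as(page_bytes, "iso-8859-1")   # every byte value is valid

# ISO-8859-1 accepts the bytes but misreads the two-byte character
# as two separate characters.
misread = page_bytes.decode("iso-8859-1")

print(ok_ascii, ok_utf8, ok_latin1, misread)
```

So a page containing ç cannot be read as US-ASCII at all, while reading
it as iso-8859-1 succeeds technically but shows the wrong characters.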
