I am concerned about being penalized (or having page rank slip) due to
possible duplicate content when it comes to upper case and lower case
versions of a URL.
What I mean is, I went into Webmaster Tools and opened:
Diagnostics -> Content Analysis -> Duplicate meta descriptions
and noticed that Google had one URL listed twice, such as:
In many cases things like URLs are not case-sensitive.
So
thatsite.com / HereIsAnItem
and
thatsite.com / hereisanITEM
are the same page. (In some cases, on some servers, it is possible to
have them as different pages, but URLs are best thought of as
case-insensitive.)
Solutions:
If you are using an Apache server, I believe you can use a tiny bit of
code in the .htaccess file.
This will take any URL request and redirect it to a lowercase-only
version.
That way any request ends up on the lowercase page...
:D
> I am concerned about being penalized (or having page rank slip) due to
> possible duplicate content when it comes to upper case and lower case
> versions of a URL.
> what I mean is I have gone into Web master tools and gone to:
> Diagnostics -> Content Analysis -> Duplicate meta descriptions
> And I noticed that google had one URL listed twice such as:
Microsoft IIS is usually case-insensitive, which is why those URLs all
return a 200 (success).
Apache is usually set up as case-sensitive, so those URLs are
considered different. Google is case-sensitive.
If the server is case-insensitive, then you have duplication, because
those URLs all work and Google assumes them to be different URLs
serving the same content.
The best way to build a site is to always use lowercase in all
URLs. As Autocrat said, if you can set up the server to 301 redirect
URLs to their lowercase version, you will at least eliminate
duplication due to that.
I'm not sure whether it can be done through server settings on an IIS
server. You might need a script added to each page that compares the
requested URI to its lowercase form and, if they are not equal,
performs a 301 redirect. That is probably difficult to do when using
canned software like Miva.
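For what it's worth, the per-page check described above could look
roughly like the sketch below. It is written in Python purely for
illustration (an IIS or Miva site would need the equivalent in its own
scripting language), and the function name lowercase_redirect_target is
made up for this example. It lowercases only the path and leaves the
query string alone, which is an assumption on my part, since
lowercasing query values can break some applications.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if the path is already lowercase.

    Hypothetical helper: only the path is lowercased; the query string
    is left untouched (an assumption, not something from this thread).
    """
    parts = urlsplit(url)
    lowered_path = parts.path.lower()
    if lowered_path == parts.path:
        return None  # nothing to fix, serve the page normally
    # Rebuild the URL with the lowercase path and everything else as parsed.
    return urlunsplit(parts._replace(path=lowered_path))


# A mixed-case request yields a redirect target; a lowercase one does not.
target = lowercase_redirect_target(
    "http://www.example.com/Hindu-Statues/Item.html?ID=5"
)
```

When the function returns a URL, the page would answer with a 301 and
a Location header set to that URL; when it returns None, the page is
served as usual.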
But luckily your server is Apache, so the .htaccess solution should
work. It is probably set up to be case-insensitive, though, so the
first thing to do is turn that off. Then add the redirect directives
straight away, or mixed-case requests will get 404s.
I found these .htaccess directives, but I have not tested them:
~~~~~~~~~~~~~~~~~~~~
# Skip this entire section if no uppercase letters in requested URL
RewriteRule ![A-Z] - [S=28]
# Else rewrite one of each uppercase letter to lowercase
RewriteRule ^([^A]*)A(.*)$ /$1a$2
RewriteRule ^([^B]*)B(.*)$ /$1b$2
RewriteRule ^([^C]*)C(.*)$ /$1c$2
...
...
RewriteRule ^([^Z]*)Z(.*)$ /$1z$2
# If more uppercase letters remain, re-invoke .htaccess and start over
RewriteRule [A-Z] - [N]
# Else do a 301 redirect to the all-lowercase URL
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
~~~~~~~~~~~~~~~~~
The lines shown as ... need to contain directives for the missing
letters.
It's bound to be a long process, so it's best to hurry up and clean up
the site navigation to avoid this lengthy transformation and to
eliminate all the redirects that navigation triggers because of the
case adjustment.
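To make it easier to see what the rule chain above is meant to do, here
is a rough Python trace of the same logic. It is only a sketch of the
intended behaviour, not of how mod_rewrite works internally. The
function names are made up, the path is passed without a leading slash,
and the leading "/" that the substitutions add is left out to keep the
trace simple.

```python
import re
import string


def one_pass(path: str) -> str:
    """Mimic one pass of the 26 letter rules.

    Each rule lowercases only the FIRST occurrence of its letter,
    like: RewriteRule ^([^A]*)A(.*)$ ...a...
    """
    for upper in string.ascii_uppercase:
        pattern = rf"^([^{upper}]*){upper}(.*)$"
        replacement = rf"\g<1>{upper.lower()}\g<2>"
        path = re.sub(pattern, replacement, path, count=1)
    return path


def lowercase_redirect(path: str):
    """Return (301, location) for a mixed-case path, or None if nothing to do."""
    if not re.search(r"[A-Z]", path):
        return None  # plays the role of the [S=28] skip
    while re.search(r"[A-Z]", path):
        path = one_pass(path)  # plays the role of the [N] restart
    return 301, "http://www.example.com/" + path


# "AAB" needs two passes because each pass fixes one "A" only.
first_pass = one_pass("AAB")
result = lowercase_redirect("Hindu-Statues/AAB.html")
```

The point of the trace is that a path with repeated uppercase letters
goes around the loop more than once, which is why the rule set ends
with the [N] restart before the final 301.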
> On Jul 18, 7:35 am, Jay Is The Boss wrote:
> > Hi everyone,
> > I am concerned about being penalized (or having page rank slip) due to
> > possible duplicate content when it comes to upper case and lower case
> > versions of a URL.
> > what I mean is I have gone into Web master tools and gone to:
> > Diagnostics -> Content Analysis -> Duplicate meta descriptions
> > And I noticed that google had one URL listed twice such as:
> > In the first one, Hindu-Statues is capitalized, while in the second,
> > it is lower case.
> > My site is a shopping cart (Miva Merchant), and even though the url
> > SHOULD be capitalized, the all lower case will resolve to the same
> > page.
> > In fact, you could use any mixture of lower or upper case letters and
> > it will still resolve to the same page.
> > Is this something to worry about?
> > In case a complete URL would help, here is one:
I am on an Apache server, so I will see about doing the 301 redirect
(will probably need my hosting company's help).
Webado, you said:
> It's bound to be a long process so it's best to hurry up and clean up
> the site navigation to avoid this lengthy transformation, and to
> eliminate all the redirections found during navigation due to the case
> adjustment
By that, I assume you mean I should first change my item (and category)
names so they are all lowercase, and then do the 301 redirect in the
.htaccess file, right?
Or at least, you mean go ahead and change any uppercase letters to
lowercase, right?
Also, you said:
> Google is case sensitive...
Does that mean that if I change my URLs to lowercase, the existing
results in Google search WON'T resolve until the next crawl? (Or do
you just mean that Google thinks they are two different URLs?)
> Any links on your site (including in a sitemap) should go to the right
> place, first time... no redirect etc.
> The redirects are there to pick up 'old' problems.
> You should not have any links/navigation that require redirection.
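One way to hunt down internal links that would trip the redirect is to
scan each page's HTML for hrefs whose path is not all lowercase. The
sketch below does that with Python's standard library only. The class
and function names are made up for illustration, and it only looks at
<a href="..."> links, which is an assumption (a real site may also want
to check its sitemap, images, and so on).

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def mixed_case_links(html: str):
    """Return the hrefs whose path contains at least one uppercase letter."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        path = urlsplit(href).path
        if path != path.lower():
            flagged.append(href)
    return flagged


page = (
    '<a href="/Hindu-Statues/item.html">old link</a>'
    '<a href="/hindu-statues/item.html">fixed link</a>'
)
needs_fixing = mixed_case_links(page)
```

Anything the function reports is a link that should be edited to the
lowercase form so that visitors and crawlers never hit the redirect.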
I have noticed that too, and unfortunately Google is case-sensitive.
But I don't understand why Google should be case-sensitive about URLs
when it returns the same results for the search strings "abc" and
"ABC".
This case sensitivity has kept my site's rank low. I am using an IIS
server, and it returns the same content for /page123.asp and
/Page123.asp.
It does not make sense to me that Google sees them as different pages.
Don't you see the same page when you type GOOGLE.COM and google.com?
In my opinion this is a bug in Google, not an advantage.
In the URL, the path, file name and query string are by definition
case-sensitive. In this case, Google is just following the standards
as defined in the "RFCs" 1738 and 1808. The host name, as you noticed,
is not case-sensitive, neither is the choice of protocol (http:// is
the same as HTTP://).
IIS does treat things a bit differently, so it's important that you
make sure that all of your internal links point to the same version of
the URL, otherwise search engines (including Google) may recognize two
(or more) distinct URLs. Once we crawl them, we'll notice that they're
identical, but until then we may treat them separately.
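As a rough illustration of that distinction, the sketch below compares
two URLs while treating the scheme and host as case-insensitive and the
path and query string as case-sensitive. The function name same_url is
made up, and lowercasing the whole network-location part is a
simplification (it would also lowercase any user:password prefix). It
says nothing about how a crawler actually compares URLs.

```python
from urllib.parse import urlsplit


def same_url(a: str, b: str) -> bool:
    """Compare URLs: scheme and host ignore case, path and query do not."""
    pa, pb = urlsplit(a), urlsplit(b)
    key_a = (pa.scheme.lower(), pa.netloc.lower(), pa.path, pa.query)
    key_b = (pb.scheme.lower(), pb.netloc.lower(), pb.path, pb.query)
    return key_a == key_b


# Host and scheme case do not matter ...
host_case = same_url("HTTP://GOOGLE.COM/page123.asp", "http://google.com/page123.asp")
# ... but path case does.
path_case = same_url("http://example.com/Page123.asp", "http://example.com/page123.asp")
```

Here host_case comes out True and path_case comes out False, which is
the situation described above: the two /page123.asp variants count as
distinct URLs even if the server happens to return the same content
for both.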
I reviewed these documents a few years ago and checked them again,
but I still cannot find anywhere that they mention the path being
case-sensitive.
They do talk about the base being case-insensitive, as you mentioned,
but I didn't find anything about the path.
I know it comes down from Unix-based operating systems, but on the net
it doesn't make sense to have case sensitivity in the path (my
opinion).