[Django] #33998: Use alternates and i18n generate duplicated URLs in the sitemap


Django

Sep 8, 2022, 4:00:45 PM
to django-...@googlegroups.com
#33998: Use alternates and i18n generate duplicated URLs in the sitemap
-------------------------------------+-------------------------------------
Reporter: brenosss | Owner: nobody
Type: Bug | Status: new
Component: contrib.sitemaps | Version: 4.0
Severity: Normal | Keywords: sitemap, i18n, alternates
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
If the `i18n` attribute is set to True, the Sitemap class generates a
separate URL entry for each language in LANGUAGES. Since I'm also using
`alternates`, I expected to get only one URL entry per item, for the default
language, with the translated versions listed as its alternate links.

The current function:

{{{
def _items(self):
    if self.i18n:
        # Create (item, lang_code) tuples for all items and languages.
        # This is necessary to paginate with all languages already
        # considered.
        items = [
            (item, lang_code)
            for lang_code in self._languages()
            for item in self.items()
        ]
        return items
    return self.items()
}}}

This is an example of the result:

{{{
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/en/contact</loc>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/contact" />
    <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/contact" />
    <xhtml:link rel="alternate" hreflang="el" href="https://example.com/el/contact" />
  </url>
  <url>
    <loc>https://example.com/es/contact</loc>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/contact" />
    <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/contact" />
    <xhtml:link rel="alternate" hreflang="el" href="https://example.com/el/contact" />
  </url>
  <url>
    <loc>https://example.com/el/contact</loc>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/contact" />
    <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/contact" />
    <xhtml:link rel="alternate" hreflang="el" href="https://example.com/el/contact" />
  </url>
</urlset>
}}}
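
The repeated entries follow directly from the nested comprehension in `_items()`. As a rough, Django-free sketch (the helper name `expand_items` is made up for illustration and is not part of Django), each item is paired with every language code, so one item and three languages give three `url` entries:

```python
def expand_items(items, languages):
    """Mimic the i18n branch of _items(): one (item, lang_code) pair
    for every combination of language and item."""
    return [(item, lang_code) for lang_code in languages for item in items]


# One page, three languages, as in the example sitemap above.
pairs = expand_items(["contact"], ["en", "es", "el"])
for item, lang_code in pairs:
    print(item, lang_code)
```

With M items and N languages this produces M * N pairs, and each pair becomes one `url` entry.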

I propose checking whether `alternates` is True and, if so, generating the
items only for the default language:

{{{
def _items(self):
    if self.i18n:
        if self.alternates:
            # Generate each URL for the default language only; the
            # translations are added as alternate links.
            lang_code = self.default_lang or settings.LANGUAGE_CODE
            items = self.items()
            items = [
                (item, lang_code)
                for item in items
            ]
            return items
        # Create (item, lang_code) tuples for all items and languages.
        # This is necessary to paginate with all languages already
        # considered.
        items = [
            (item, lang_code)
            for lang_code in self._languages()
            for item in self.items()
        ]
        return items
    return self.items()
}}}

I would then expect a result more like this:
{{{
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/en/contact</loc>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/contact" />
    <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/contact" />
    <xhtml:link rel="alternate" hreflang="el" href="https://example.com/el/contact" />
  </url>
</urlset>
}}}
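
To compare the two behaviours outside Django, here is a minimal sketch of the branching proposed above. The function `items_with_languages` and its parameters are hypothetical stand-ins for the `Sitemap` attributes; this is not Django code:

```python
def items_with_languages(items, languages, alternates, default_lang):
    """Hypothetical stand-in for the proposed _items() with i18n enabled."""
    if alternates:
        # Proposed branch: one pair per item, default language only.
        return [(item, default_lang) for item in items]
    # Existing branch: one pair per item per language.
    return [(item, lang_code) for lang_code in languages for item in items]


languages = ["en", "es", "el"]
proposed = items_with_languages(
    ["contact"], languages, alternates=True, default_lang="en"
)
current = items_with_languages(
    ["contact"], languages, alternates=False, default_lang="en"
)
print(len(proposed), len(current))
```

Under these assumptions the proposed branch yields a single pair for the page, while the existing branch yields one per language.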

Does this make sense? Can I start working on it?

--
Ticket URL: <https://code.djangoproject.com/ticket/33998>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Sep 8, 2022, 4:01:26 PM
to django-...@googlegroups.com
Description changed by brenosss:

Old description:

New description:

If the i18n variable is set to True, the Sitemap class generates a
different URL for each item in LANGUAGES, but I'm using the alternates so

The current function:

--
Ticket URL: <https://code.djangoproject.com/ticket/33998#comment:1>

Django

Sep 9, 2022, 8:21:46 AM
to django-...@googlegroups.com
Changes (by Mariusz Felisiak):

* cc: Florian Demmer (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/33998#comment:2>

Django

Sep 9, 2022, 12:21:12 PM
to django-...@googlegroups.com
Comment (by Florian Demmer):

Thank you for your report, but it is my understanding that the result you
expect would not be correct. Each translation of a page needs a separate
`url` entry.

Here is an example showing the explicit listing of all language "pages" by
themselves with their alternates including itself:
https://developers.google.com/search/docs/advanced/crawling/localized-versions#sitemap

Do you have any references that support your expectation?
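
To make that requirement concrete, here is a minimal sketch that checks a sitemap against it, on the reading above that every alternate must also appear as its own `url` entry. The function name `alternates_without_own_entry` and the sample document are illustrative assumptions, not part of Django:

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def alternates_without_own_entry(sitemap_xml):
    """Return alternate hrefs that never appear as a <loc> of their own."""
    root = ET.fromstring(sitemap_xml)
    locs = {loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)}
    hrefs = {link.get("href") for link in root.findall("sm:url/xhtml:link", NS)}
    return sorted(hrefs - locs)


# The single-entry sitemap expected in the report, reduced to the relevant tags.
PROPOSED = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/en/contact</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/contact" />
    <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/contact" />
    <xhtml:link rel="alternate" hreflang="el" href="https://example.com/el/contact" />
  </url>
</urlset>"""

missing = alternates_without_own_entry(PROPOSED)
print(missing)
```

For the single-entry sitemap, the `es` and `el` URLs are reported as lacking their own entries; the sitemap Django currently generates would report none.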

--
Ticket URL: <https://code.djangoproject.com/ticket/33998#comment:3>

Django

Sep 9, 2022, 1:16:50 PM
to django-...@googlegroups.com
#33998: Use alternates and i18n generate duplicated URLs in the sitemap
-------------------------------------+-------------------------------------
Reporter: brenosss | Owner: nobody
Type: Bug | Status: closed
Component: contrib.sitemaps | Version: 4.0
Severity: Normal | Resolution: needsinfo
Keywords: sitemap, i18n, alternates | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: new => closed
* resolution: => needsinfo


Comment:

Florian, thanks for checking.

--
Ticket URL: <https://code.djangoproject.com/ticket/33998#comment:4>
