Generating slug for words with accents
From: Michal <mic...@plovarna.cz>
Date: Sat, 26 Aug 2006 10:05:50 +0200
Local: Sat, Aug 26 2006 4:05 am
Subject: [patch] Generating slug for words with accents

Hello,
I had a problem submitting a ticket in Trac (details below) with my
patch, so I decided to post it here.

-----------------------------------------

Short summary: [patch] Generating slug for words with accents

Full description: In my language (Czech) there are a lot of characters
with accents. When I type titles in admin forms, the slug field's
autogenerated values are incorrect (for example: title="sršeň",
autogenerated slug="sre"; correct is "srsen"). So I wrote a little patch
to the urlify.js code, which first converts all accented characters to
their ASCII equivalents. For now, my code handles only Czech accents.
I will be glad if some of you add your own national characters.
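
A minimal sketch of why the slug comes out as "sre", assuming the slug generator simply strips every character outside the ASCII word class before lowercasing. The name `naiveSlug` and the exact regular expression are illustrative assumptions, not the code from urlify.js:

```javascript
// Hypothetical stand-in for the slug generator; the regex is an assumption.
function naiveSlug(title) {
    // In JavaScript (without the "u" flag), \w matches only [A-Za-z0-9_],
    // so accented letters such as "š" and "ň" are deleted rather than
    // mapped to "s" and "n".
    return title.replace(/[^-\w\s]/g, '').toLowerCase();
}

console.log(naiveSlug('sršeň'));  // "sre"
```

Mapping the accented letters to ASCII before this step, as the patch does, is what keeps the "s" and "n" in the result.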

Priority: normal
Component: Admin interface
Severity: normal
Version: SVN
Keywords: slug urlify

-----------------------------------------

Trac error

Trac detected an internal error:
Traceback (most recent call last):
  File "/usr/lib/python2.3/site-packages/trac/web/main.py", line 299, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.3/site-packages/trac/web/main.py", line 189, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/lib/python2.3/site-packages/trac/ticket/web_ui.py", line 104, in process_request
    self._do_create(req, db)
  File "/usr/lib/python2.3/site-packages/trac/ticket/web_ui.py", line 163, in _do_create
    self._validate_ticket(req, ticket)
  File "/usr/lib/python2.3/site-packages/trac/ticket/web_ui.py", line 47, in _validate_ticket
    for field, message in manipulator.validate_ticket(req, ticket):
  File "build/bdist.linux-i686/egg/tracspamfilter/adapters.py", line 40, in validate_ticket
  File "build/bdist.linux-i686/egg/tracspamfilter/api.py", line 74, in test
herror: (1, 'Unknown host')

(I was trying to submit the ticket from a FreeBSD 5.4 system with Firefox 1.5.0.1.)

Regards
Michal

[ urlify_patch.diff < 1K ]
Index: django/contrib/admin/media/js/urlify.js
===================================================================
--- django/contrib/admin/media/js/urlify.js     (revision 3657)
+++ django/contrib/admin/media/js/urlify.js     (working copy)
@@ -1,4 +1,20 @@
+function replAccents(s)
+{
+    // from and to strings must have same number of characters
+    var from = 'áčďéěíňóřšťúůýžÁČĎÉĚÍŇÓŘŠŤÚŮÝŽ';
+    var to   = 'acdeeinorstuuyzACDEEINORSTUUYZ';
+    for (var i = 0; i != s.length; i++) {
+        var x = from.indexOf(s.charAt(i));
+        if (x != -1) {
+            var r = new RegExp(from.charAt(x), 'g');
+            s = s.replace(r, to.charAt(x));
+        }
+    }
+    return s;
+}
+
 function URLify(s, num_chars) {
+    s = replAccents(s);
     // changes, e.g., "Petty theft" to "petty_theft"
     // remove all these words from the string before urlifying
     removelist = ["a", "an", "as", "at", "before", "but", "by", "for", "from",


 
From: Maciej Bliziński <maciej.blizin...@gmail.com>
Date: Sat, 26 Aug 2006 13:44:45 +0200
Local: Sat, Aug 26 2006 7:44 am
Subject: Re: [patch] Generating slug for words with accents

On Sat, 2006-08-26 at 10:05 +0200, Michal wrote:
> Full description: In my language (Czech) there are a lot of characters
> with accents. When I type titles in admin forms, the slug field's
> autogenerated values are incorrect (for example: title="sršeň",

Is it a hornet?

> autogenerated slug="sre"; correct is "srsen").

That's right. I've been experiencing the same thing.

> I will be glad if some of you add your own national characters.

I'm attaching a modified patch with Polish characters added.

--
Maciej Bliziński
http://automatthias.wordpress.com

[ urlify_patch_cz_and_pl.diff < 1K ]

 
From: Michal <mic...@plovarna.cz>
Date: Sat, 26 Aug 2006 16:48:15 +0200
Local: Sat, Aug 26 2006 10:48 am
Subject: Re: [patch] Generating slug for words with accents

Maciej Bliziński wrote:
> On Sat, 2006-08-26 at 10:05 +0200, Michal wrote:
>> Full description: In my language (Czech) there are a lot of characters
>> with accents. When I type titles in admin forms, the slug field's
>> autogenerated values are incorrect (for example: title="sršeň",

> Is it a hornet?

Yes it is, my Slavic brother :)

>> autogenerated slug="sre"; correct is "srsen").

> That's right. I've been experiencing the same thing.

>> I will be glad if some of you add your own national characters.

> I'm attaching a modified patch with Polish characters added.

Thank you. I also added a few Slovak characters (Czechs and Slovaks are
brothers too, and they have similar alphabets).

[ urlify_patch_cz_pl_sk.diff 1K ]
Index: django/contrib/admin/media/js/urlify.js
===================================================================
--- django/contrib/admin/media/js/urlify.js     (revision 3657)
+++ django/contrib/admin/media/js/urlify.js     (working copy)
@@ -1,4 +1,20 @@
+function replAccents(s)
+{
+    // from and to strings must have same number of characters
+    var from = 'áčďéěíňóřšťúůýžąćęłńóśżźäľĺôŕÁČĎÉĚÍŇÓŘŠŤÚŮÝŽĄĆĘŁŃÓŚŻŹÄĽĹÔŔ';
+    var to   = 'acdeeinorstuuyzacelnoszzallorACDEEINORSTUUYZACELNOSZZALLOR';
+    for (var i = 0; i != s.length; i++) {
+        var x = from.indexOf(s.charAt(i));
+        if (x != -1) {
+            var r = new RegExp(from.charAt(x), 'g');
+            s = s.replace(r, to.charAt(x));
+        }
+    }
+    return s;
+}
+
 function URLify(s, num_chars) {
+    s = replAccents(s);
     // changes, e.g., "Petty theft" to "petty_theft"
     // remove all these words from the string before urlifying
     removelist = ["a", "an", "as", "at", "before", "but", "by", "for", "from",


 
From: Maciej Bliziński <maciej.blizin...@gmail.com>
Date: Sun, 27 Aug 2006 11:20:03 +0200
Local: Sun, Aug 27 2006 5:20 am
Subject: Re: [patch] Generating slug for words with accents

On Sat, 2006-08-26 at 16:48 +0200, Michal wrote:
> I also added a few Slovak characters (Czechs and Slovaks are
> brothers too, and they have similar alphabets).

I looked at the Latin Unicode article in Wikipedia:
http://en.wikipedia.org/wiki/Latin_Unicode

There are accented characters I have never seen before... The Vietnamese
alphabet, for instance, has glyphs which are Latin characters with
unusual accents, for example ã, or even with two accents: ặ

For most of the characters, it's pretty easy to remove the accents.
However, some characters are mysterious: should Ƨ be translated to S?
I don't know. So I just deleted them from the accent removal list.
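
One way to see which characters are "easy" and which are "mysterious" is Unicode canonical decomposition. The sketch below assumes a JavaScript engine that provides `String.prototype.normalize` (ES2015, so not something the 2006 admin code could rely on), and `stripMarks` is a hypothetical helper name. Letters that decompose into a base letter plus combining marks lose their accents; letters with no decomposition pass through unchanged and would still need a hand-written table.

```javascript
function stripMarks(s) {
    // NFD splits e.g. "š" into "s" + U+030C (combining caron); the regex
    // then removes the combining marks in the range U+0300..U+036F.
    return s.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

console.log(stripMarks('sršeň'));  // "srsen"
console.log(stripMarks('ặ'));      // "a" (two marks: breve and dot below)
console.log(stripMarks('łßÆ'));    // unchanged: these have no decomposition
```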

I'm including a patch with "from" and "to" constants extended with all
the characters I found on Wikipedia that seemed to be of any use. This
should cover all the Slavic countries except those which use the
Cyrillic alphabet.

One thing... some characters need to be translated into _two_ ASCII
characters, for example Æ to AE. This would require a different data
structure. In its present form, I just entered E. The same goes for ß,
which I replaced with a single S.
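
The "different data structure" could be as simple as a lookup table from one character to a replacement string of any length. This is only a sketch: `TRANSLIT` and `transliterate` are hypothetical names, the table is deliberately tiny, and "AE" and "ss" are just one possible convention.

```javascript
// Hypothetical table: each key is one character, each value may be
// one or several ASCII characters.
var TRANSLIT = {
    'š': 's', 'ň': 'n', 'ł': 'l',
    'Æ': 'AE', 'æ': 'ae', 'ß': 'ss'
};

function transliterate(s) {
    var out = '';
    for (var i = 0; i < s.length; i++) {
        var ch = s.charAt(i);
        // Own-property check so that inherited keys are never matched.
        out += Object.prototype.hasOwnProperty.call(TRANSLIT, ch) ? TRANSLIT[ch] : ch;
    }
    return out;
}

console.log(transliterate('Æther'));   // "AEther"
console.log(transliterate('Straße'));  // "Strasse"
```

Building a new output string, rather than replacing inside the input while looping over it, keeps the loop index valid even when a replacement is longer than the character it replaces.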

Regards,
Maciej

--
Maciej Bliziński
http://automatthias.wordpress.com

[ urlify-i18n-patch.diff 3K ]

 
From: Michal <mic...@plovarna.cz>
Date: Sun, 27 Aug 2006 11:37:31 +0200
Local: Sun, Aug 27 2006 5:37 am
Subject: Re: [patch] Generating slug for words with accents

Nice work Maciej :)

When I wrote my first post, I typed: "I will be glad if some of
you add your own national characters."
Each nationality has its own specific characters and rules for them, so
I think that somebody from each of these countries should check your
version of the patch.

> I'm including a patch with "from" and "to" constants extended with all
> the characters I found on Wikipedia that seemed to be of any use. This
> should cover all the Slavic countries except those which use the
> Cyrillic alphabet.

> One thing... some characters need to be translated into _two_ ASCII
> characters, for example Æ to AE. This would require a different data
> structure. In its present form, I just entered E. The same goes for ß,
> which I replaced with a single S.

Maybe we could try to write one new function, which would translate one
Unicode character to the appropriate two ASCII characters? (Translating
accented characters would then be done in two steps: 1. replAccents,
2. the new function.)


 
From: Nicolas Steinmetz <nsteinm...@gmail.com>
Date: Thu, 16 Nov 2006 09:49:05 +0100
Local: Thurs, Nov 16 2006 3:49 am
Subject: Re: [patch] Generating slug for words with accents
Maciej Bliziński wrote:

> I'm including a patch with "from" and "to" constants extended with all
> the characters I found on Wikipedia that seemed to be of any use. This
> should cover all the Slavic countries except those which use the
> Cyrillic alphabet.

Was this patch committed to the SVN version of Django? In 0.95 I was
facing this issue with French accents.

Nicolas


 
From: "Aidas Bendoraitis" <aidas.bendorai...@gmail.com>
Date: Thu, 16 Nov 2006 10:53:04 +0100
Local: Thurs, Nov 16 2006 4:53 am
Subject: Re: [patch] Generating slug for words with accents
German ß should be translated to ss
ä to ae
ö to oe
ü to ue

Regards,
Aidas Bendoraitis [aka Archatas]

On 11/16/06, Nicolas Steinmetz <nsteinm...@gmail.com> wrote:


 
From: "Karsten W. Rohrbach" <kars...@rohrbach.de>
Date: Thu, 16 Nov 2006 01:54:04 -0800
Local: Thurs, Nov 16 2006 4:54 am
Subject: Re: Generating slug for words with accents
Would this make sense to integrate on the server side (instead of JS),
say next to django.utils.text.get_valid_filename()?

 
From: "John Lenton" <jlen...@gmail.com>
Date: Thu, 16 Nov 2006 09:42:35 -0300
Local: Thurs, Nov 16 2006 7:42 am
Subject: Re: [patch] Generating slug for words with accents
On 11/16/06, Aidas Bendoraitis <aidas.bendorai...@gmail.com> wrote:

> German ß should be translated to ss
> ä to ae
> ö to oe
> ü to ue

but «ü» in Spanish should be just «u» (as in pingüino -> pinguino).

--
John Lenton (jlen...@gmail.com) -- Random fortune:
The trouble with a lot of self-made men is that they worship their creator.


 
From: "zenx" <antonio.m...@gmail.com>
Date: Thu, 16 Nov 2006 23:38:55 -0000
Local: Thurs, Nov 16 2006 6:38 pm
Subject: Re: Generating slug for words with accents
Spanish info:
á é í ó ú    should be         a e i o u
ü              should be         u
ñ              should be         n

I think that's everything in Spanish ;)


 
From: "Kamil Wdowicz" <kwdow...@zenstudio.pl>
Date: Fri, 17 Nov 2006 07:25:50 +0100
Local: Fri, Nov 17 2006 1:25 am
Subject: Re: Generating slug for words with accents
Polish:
ą = a
ć = c
ź or ż = z
ę = e
ó = o
ł = l
ś = s
ń = n

2006/11/17, zenx <antonio.m...@gmail.com>:


 
From: "Aidas Bendoraitis" <aidas.bendorai...@gmail.com>
Date: Fri, 17 Nov 2006 09:19:44 +0100
Local: Fri, Nov 17 2006 3:19 am
Subject: Re: Generating slug for words with accents
Similarly Lithuanian would be:
ą = a
č = c
ę = e
ė = e
į = i
š = s
ų = u
ū = u
ž = z

I am just wondering whether the slugify function should depend on the
chosen language or not. It seems that there are not many differences
among stripped accented letters in different languages, so maybe it
should be left the same. Whatever we decide, ß should still be
translated to ss, not S. What is the opinion of the others?
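
If the function did take the language into account, one option is a shared default table plus small per-language overrides. The sketch below only illustrates that idea: `DEFAULTS`, `OVERRIDES`, `lookup`, `slugChars`, and the language codes are hypothetical, and the tables list just a few of the characters mentioned in this thread.

```javascript
// Language-independent fallbacks (hypothetical, intentionally incomplete).
var DEFAULTS = { 'ä': 'a', 'ö': 'o', 'ü': 'u', 'ß': 'ss', 'ñ': 'n' };

// Per-language exceptions, consulted before the defaults.
var OVERRIDES = {
    de: { 'ä': 'ae', 'ö': 'oe', 'ü': 'ue' }
};

function lookup(table, ch) {
    return table && Object.prototype.hasOwnProperty.call(table, ch) ? table[ch] : null;
}

function slugChars(s, lang) {
    var out = '';
    for (var i = 0; i < s.length; i++) {
        var ch = s.charAt(i);
        var hit = lookup(OVERRIDES[lang], ch);
        if (hit === null) hit = lookup(DEFAULTS, ch);
        out += hit === null ? ch : hit;
    }
    return out;
}

console.log(slugChars('pingüino', 'es'));  // "pinguino"
console.log(slugChars('über', 'de'));      // "ueber"
```

With this layout a language that has no special rules needs no entry at all, and ß maps to "ss" regardless of the language chosen.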

And also, if we are already adding localizations to the slugify
function, shouldn't Greek, Russian, and other non-Latin alphabets also
be transliterated to the Latin character set?

Regards,
Aidas Bendoraitis [aka Archatas]

On 11/17/06, Kamil Wdowicz <kwdow...@zenstudio.pl> wrote:


 
From: "orestis" <ores...@gmail.com>
Date: Fri, 17 Nov 2006 01:53:54 -0800
Local: Fri, Nov 17 2006 4:53 am
Subject: Re: Generating slug for words with accents
Can you discuss this on the relevant ticket:

http://code.djangoproject.com/ticket/2282

Thanks,
Orestis


 