Script for finding words of any size that do NOT contain vowels with acute diacritic marks?
nwaits  
Newsgroups: comp.lang.python
From: nwaits <nowa...@gmail.com>
Date: Wed, 17 Oct 2012 07:31:42 -0700 (PDT)
Local: Wed, Oct 17 2012 10:31 am
Subject: Script for finding words of any size that do NOT contain vowels with acute diacritic marks?
I'm very impressed with Python's wordlist script for plain text. Is there a script for finding words that do NOT have certain diacritic marks, like acute or grave accents (UTF-8), over the vowels?
Thank you.

 
Dave Angel  
Newsgroups: comp.lang.python
From: Dave Angel <d...@davea.name>
Date: Wed, 17 Oct 2012 11:00:11 -0400
Local: Wed, Oct 17 2012 11:00 am
Subject: Re: Script for finding words of any size that do NOT contain vowels with acute diacritic marks?
On 10/17/2012 10:31 AM, nwaits wrote:

> I'm very impressed with python's wordlist script for plain text.  Is there a script for finding words that do NOT have certain diacritic marks, like acute or grave accents (utf-8), over the vowels?  
> Thank you.

If you can construct a list of "illegal" characters, then you can simply
check each character of the word against the list, and if it succeeds
for all of the characters, it's a winner.
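
A minimal sketch of this character-by-character check, assuming the "illegal" characters are the vowels with acute or grave accents. The names `ILLEGAL` and `is_clean` and the sample word list are hypothetical, and both the set and each word are normalized to NFC so that precomposed characters compare equal:

```python
import unicodedata

# Hypothetical "illegal" characters: vowels carrying an acute or grave accent.
ILLEGAL = set(unicodedata.normalize("NFC", "áéíóúàèìòùÁÉÍÓÚÀÈÌÒÙ"))

def is_clean(word):
    """Return True if no character of the word is in ILLEGAL."""
    # Normalize so that 'e' + combining accent becomes one precomposed character.
    word = unicodedata.normalize("NFC", word)
    return not any(ch in ILLEGAL for ch in word)

words = ["éléphant", "elephant", "voilà", "python"]
clean_words = [w for w in words if is_clean(w)]
print(clean_words)
```

Using a set rather than a list keeps each membership test at roughly constant cost.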

If that's not fast enough, you can build a translation table from the
list of illegal characters, and use translate on each word.  Then it
becomes a question of checking if the translated word is all zeroes.  
More setup time, but much faster looping for each word.
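
One way the translation-table idea might look in Python 3 is sketched below. It is a variation rather than exactly what is described above: instead of checking for "all zeroes", the table deletes the illegal characters and the word counts as clean if it comes back unchanged. The names `ILLEGAL`, `DELETE_ILLEGAL` and `is_clean_translate` are hypothetical.

```python
import unicodedata

# Hypothetical "illegal" characters, normalized to precomposed (NFC) form.
ILLEGAL = unicodedata.normalize("NFC", "áéíóúàèìòùÁÉÍÓÚÀÈÌÒÙ")

# Built once: maps every illegal character to None, i.e. deletes it.
DELETE_ILLEGAL = str.maketrans("", "", ILLEGAL)

def is_clean_translate(word):
    """Return True if translating the word deletes nothing."""
    word = unicodedata.normalize("NFC", word)
    return word.translate(DELETE_ILLEGAL) == word

print([w for w in ["éléphant", "elephant", "voilà"] if is_clean_translate(w)])
```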

--

DaveA


 
wxjmfa...@gmail.com  
Newsgroups: comp.lang.python
From: wxjmfa...@gmail.com
Date: Wed, 17 Oct 2012 08:32:52 -0700 (PDT)
Local: Wed, Oct 17 2012 11:32 am
Subject: Re: Script for finding words of any size that do NOT contain vowels with acute diacritic marks?
On Wednesday, 17 October 2012 17:00:46 UTC+2, Dave Angel wrote:

Lazy way, with Python 3.2:

>>> import unicodedata
>>> def HasDiacritics(w):
...     w_decomposed = unicodedata.normalize('NFKD', w)
...     return 'no' if len(w) == len(w_decomposed) else 'yes'
...
>>> HasDiacritics('éléphant')
'yes'
>>> HasDiacritics('elephant')
'no'
>>> HasDiacritics('\N{LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON}')
'yes'
>>> HasDiacritics('U')
'no'

Should be OK for the Combining Diacritical Marks Unicode range
(common diacritics).

jmf
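
The length comparison above flags any character that NFKD expands, which includes every diacritic (not only acute and grave) and also compatibility characters such as the ligature U+FB01, whose NFKD form is the two letters "fi". A narrower sketch, closer to the original question, decomposes with NFD and looks for a combining acute or grave accent attached to a vowel. The names `has_accented_vowel`, `ACUTE`, `GRAVE` and `VOWELS` are hypothetical, and the vowel set is an assumption (plain Latin a, e, i, o, u):

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT
GRAVE = "\u0300"  # COMBINING GRAVE ACCENT
VOWELS = set("aeiouAEIOU")  # assumed vowel set

def has_accented_vowel(word, marks=(ACUTE, GRAVE)):
    """Return True if a vowel in the word carries one of the given marks."""
    base = None  # most recent non-combining character seen
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            if ch in marks and base in VOWELS:
                return True
        else:
            base = ch
    return False

words = ["éléphant", "elephant", "über", "voilà"]
print([w for w in words if not has_accented_vowel(w)])
```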


 
Ian Kelly  
Newsgroups: comp.lang.python
From: Ian Kelly <ian.g.ke...@gmail.com>
Date: Wed, 17 Oct 2012 11:07:11 -0600
Local: Wed, Oct 17 2012 1:07 pm
Subject: Re: Script for finding words of any size that do NOT contain vowels with acute diacritic marks?

On Wed, Oct 17, 2012 at 9:32 AM,  <wxjmfa...@gmail.com> wrote:
> >>> import unicodedata
> >>> def HasDiacritics(w):
> ...     w_decomposed = unicodedata.normalize('NFKD', w)
> ...     return 'no' if len(w) == len(w_decomposed) else 'yes'
> ...
> >>> HasDiacritics('éléphant')
> 'yes'
> >>> HasDiacritics('elephant')
> 'no'
> >>> HasDiacritics('\N{LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON}')
> 'yes'
> >>> HasDiacritics('U')
> 'no'

Is there something wrong with True and False that you had to replace
them with strings?

"return len(w) != len(w_decomposed)" is all you need.
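
Applied to the quoted function, that suggestion would look roughly like the sketch below. The snake_case name `has_diacritics` and the sample word list are hypothetical; the length test itself is unchanged, so it still assumes the input uses precomposed characters.

```python
import unicodedata

def has_diacritics(w):
    """Return True if NFKD decomposition makes the word longer."""
    w_decomposed = unicodedata.normalize('NFKD', w)
    return len(w) != len(w_decomposed)

# A boolean result can be used directly in a filter.
words = ['éléphant', 'elephant']
without_diacritics = [w for w in words if not has_diacritics(w)]
print(without_diacritics)
```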


 
David Robinow  
Newsgroups: comp.lang.python
From: David Robinow <drobi...@gmail.com>
Date: Wed, 17 Oct 2012 13:16:43 -0400
Local: Wed, Oct 17 2012 1:16 pm
Subject: Re: Script for finding words of any size that do NOT contain vowels with acute diacritic marks?
On Wed, Oct 17, 2012 at 1:07 PM, Ian Kelly <ian.g.ke...@gmail.com> wrote:
> "return len(w) != len(w_decomposed)" is all you need.

 Thanks for helping, but I already knew that.

 
wxjmfa...@gmail.com  
Newsgroups: comp.lang.python
From: wxjmfa...@gmail.com
Date: Wed, 17 Oct 2012 11:17:16 -0700 (PDT)
Local: Wed, Oct 17 2012 2:17 pm
Subject: Re: Script for finding words of any size that do NOT contain vowels with acute diacritic marks?
On Wednesday, 17 October 2012 19:07:43 UTC+2, Ian wrote:

Not at all, I knew this. In this case I decided to program it like
this.

Do you get it?  Yes/No  or True/False

jmf


 
Chris Angelico  
Newsgroups: comp.lang.python
From: Chris Angelico <ros...@gmail.com>
Date: Thu, 18 Oct 2012 05:22:29 +1100
Local: Wed, Oct 17 2012 2:22 pm
Subject: Re: Script for finding words of any size that do NOT contain vowels with acute diacritic marks?

On Thu, Oct 18, 2012 at 5:17 AM,  <wxjmfa...@gmail.com> wrote:
> Not at all, I knew this. In this I decided to program like
> this.

> Do you get it?  Yes/No  or True/False

Yes but why? When you're returning a boolean concept, why not return a
boolean value? You don't even use values with one that
compares-as-true and the other that compares-as-false (for instance,
you could write the function so that it returns just the
diacritic-containing characters, meaning it'll return "" if there
aren't any). To what benefit?
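
The alternative mentioned in parentheses could be sketched as follows: return the characters that carry a diacritic, so the result is an empty (falsy) string when there are none. The name `diacritic_chars` is hypothetical, and the input is normalized to NFC first so that each accented letter is a single character where a precomposed form exists.

```python
import unicodedata

def diacritic_chars(w):
    """Return the characters of the word that carry a combining mark."""
    w = unicodedata.normalize('NFC', w)
    return ''.join(
        ch for ch in w
        # A character qualifies if its NFD form contains a combining mark.
        if any(unicodedata.combining(c) for c in unicodedata.normalize('NFD', ch))
    )

for word in ['éléphant', 'elephant']:
    found = diacritic_chars(word)
    if found:
        print(word, 'contains', found)
    else:
        print(word, 'has no diacritics')
```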

Puzzled.

ChrisA


 
Ian Kelly  
Newsgroups: comp.lang.python
From: Ian Kelly <ian.g.ke...@gmail.com>
Date: Wed, 17 Oct 2012 12:27:12 -0600
Local: Wed, Oct 17 2012 2:27 pm
Subject: Re: Script for finding words of any size that do NOT contain vowels with acute diacritic marks?

On Wed, Oct 17, 2012 at 12:17 PM,  <wxjmfa...@gmail.com> wrote:
> Not at all, I knew this. In this I decided to program like
> this.

> Do you get it?  Yes/No  or True/False

It's just bad style, because both 'yes' and 'no' evaluate true.

if HasDiacritics('éléphant'):
    print('Correct!')

if HasDiacritics('elephant'):
    print('Error!')

Prints:

Correct!
Error!

You could replace the test with "if HasDiacritics('elephant') ==
'yes':", but why force the caller to write that out when the former
test is more natural and less prone to error (e.g. typoing 'yes')?


 
wxjmfa...@gmail.com  
Newsgroups: comp.lang.python
From: wxjmfa...@gmail.com
Date: Wed, 17 Oct 2012 11:33:30 -0700 (PDT)
Local: Wed, Oct 17 2012 2:33 pm
Subject: Re: Script for finding words of any size that do NOT contain vowels with acute diacritic marks?
On Wednesday, 17 October 2012 20:28:21 UTC+2, Ian wrote:

I *know* all this. In my previous message, the goal was to emphasize the
usage of *unicodedata.normalize()*.

jmf


 
Steven D'Aprano  
Newsgroups: comp.lang.python
From: Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info>
Date: 17 Oct 2012 23:18:03 GMT
Local: Wed, Oct 17 2012 7:18 pm
Subject: Re: Script for finding words of any size that do NOT contain vowels with acute diacritic marks?

On Wed, 17 Oct 2012 13:16:43 -0400, David Robinow wrote:
> On Wed, Oct 17, 2012 at 1:07 PM, Ian Kelly <ian.g.ke...@gmail.com>
> wrote:
>> "return len(w) != len(w_decomposed)" is all you need.

>  Thanks for helping, but I already knew that.

David, Ian was directly responding to wxjmfa...@gmail.com, whose
suggestion included an entirely unnecessary conversion from a bool flag
to the strings 'yes' and 'no'. That can be seen in the part of Ian's post
that you deleted.

Regardless of whether *you personally* already knew that jmf's function
was unidiomatic and a poor design, you weren't directly the target of the
comment. I'm glad you already knew what Ian said, but you're not the only
person reading this thread.

--
Steven


 