I have a list, call it 'l':
l = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']
Notice that some of the items in the list start and end with an '*'. I wish to construct a new list, call it 'n', containing all the members of l that start and end with '*', with the '*'s removed.
So in the case above n would be ['nbh', 'jkjsdfjasd'].
The following works:
import re

r = re.compile(r'\*(.+)\*')

def f(s):
    m = r.match(s)
    if m:
        return m.group(1)
    else:
        return ''

n = [f(x) for x in l if r.match(x)]
But it is inefficient, because it is matching the regex twice for each item, and it is a bit ugly.
I could use:
n = []
for x in l:
    m = r.match(x)
    if m:
        n.append(m.group(1))
It is more efficient, but much uglier.
Does anyone have a better solution?
Thanks,
-EdK
Ed Keith
e_...@yahoo.com
Blog: edkeith.blogspot.com
Regexes seem like the proverbial sledgehammer to crack a nut here.
Note that '*' if it is present, is always 1 character, so we can
write:
n = [x[1:-1] for x in l if x.startswith("*") and x.endswith("*")]
--
André Engels, andre...@gmail.com
>>> lst = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']
>>> [item[1:-1] for item in lst if (item.startswith("*") and item.endswith("*"))]
['nbh', 'jkjsdfjasd']
>>>
hth,
vbr
> Notice that some of the items in the list start and end with
> an '*'. I wish to construct a new list, call it 'n' which is
> all the members of l that start and end with '*', with the
> '*'s removed.
>
> So in the case above n would be ['nbh', 'jkjsdfjasd']
[s[1:-1] for s in l if (s[0] == s[-1] == '*')]
--
Grant Edwards
grante at visi.com
Yow! Used staples are good with SOY SAUCE!
You can skip the function by writing that as
n = [r.match(s).group(1) for s in l if r.match(s)]
but it doesn't solve your match-twice problem.
I'd skip regexps completely and do something like
n = [s[1:-1] for s in l
     if s.startswith('*')
     and s.endswith('*')
     ]
And this is coming from a guy that tends to overuse regexps :)
-tkc
That last bit doesn't work right, does it, since an == expression
evaluates to True or False, not the true or false value itself?
--
Neil Cerutti
It's efficient and easy to understand; maybe you have to readjust your
taste.
> Does anyone have a better solution?
In this case an approach based on string slicing is probably best. When the
regular expression gets more complex you can use a nested generator
expression:
>>> items = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']
>>> match = re.compile(r"\*(.+)\*").match
>>> [m.group(1) for m in (match(s) for s in items) if m is not None]
['nbh', 'jkjsdfjasd']
Peter
It works for me. Doesn't it work for you?
From the fine manual (section 5.9. Comparisons):
Comparisons can be chained arbitrarily, e.g., x < y <= z is
equivalent to x < y and y <= z, except that y is evaluated
only once (but in both cases z is not evaluated at all when x
< y is found to be false).
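As a small illustration of the passage quoted above (the variable names here are made up for the example), the chained form behaves like the two comparisons joined by "and", not like comparing a True/False result against '*':

```python
s = '*nbh*'

# Chained comparison, as used in the list comprehension
chained = s[0] == s[-1] == '*'

# What the chain is equivalent to, per the manual
expanded = (s[0] == s[-1]) and (s[-1] == '*')

# The misreading: compare the boolean result of the first == with '*'
misread = (s[0] == s[-1]) == '*'

print(chained, expanded, misread)
```

Only the explicitly parenthesised version compares a boolean with the string.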
--
Grant Edwards
grante at visi.com
Yow! Hand me a pair of leather pants and a CASIO keyboard -- I'm living for today!
I am going to use string slicing; re is the wrong tool for the job. But this is what I was looking for when I posted: simple, elegant and efficient.
Thanks all,
s[0] and s[-1] raise an IndexError if l contains an empty string.
Better something like:
>>> [s[1:-1] for s in l if (s[:1] == s[-1:] == '*')]
Or just the slightly more verbose startswith/endswith version.
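A quick sketch of the difference, assuming a list that happens to contain an empty string (the sample data and the names by_index and by_slice are only for illustration):

```python
l = ['asc', '*nbh*', '', 'rewr']  # hypothetical input with an empty string

try:
    by_index = [s[1:-1] for s in l if s[0] == s[-1] == '*']
except IndexError:
    by_index = None  # ''[0] raises before the comparison can run

# Slices never raise: ''[:1] and ''[-1:] are both just ''
by_slice = [s[1:-1] for s in l if s[:1] == s[-1:] == '*']

print(by_index, by_slice)
```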
--
Matt Nordhoff
> I have a problem and I am trying to find a solution to it that is both
> efficient and elegant.
>
> I have a list call it 'l':
>
> l = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']
>
> Notice that some of the items in the list start and end with an '*'. I
> wish to construct a new list, call it 'n' which is all the members of l
> that start and end with '*', with the '*'s removed.
>
> So in the case above n would be ['nbh', 'jkjsdfjasd']
>
> the following works:
>
> r = re.compile('\*(.+)\*')
[snip]
Others have suggested using a list comp. Just to be different, here's a
version using filter and map.
l = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']
l = map(
    lambda s: s[1:-1] if s.startswith('*') and s.endswith('*') else '', l)
l = filter(None, l)
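A note for anyone running this on Python 3, where map and filter return lazy iterators rather than lists: wrapping the result in list() gives the same outcome. This is only a sketch of the same idea, with the result bound to n instead of rebinding l:

```python
l = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']

# Non-matching items become '', which filter(None, ...) then drops
stripped = map(
    lambda s: s[1:-1] if s.startswith('*') and s.endswith('*') else '', l)

# list() forces the lazy iterators on Python 3
n = list(filter(None, stripped))
print(n)
```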
--
Steven
def f(s):
    m = r.match(s)
    if m:
        return m.group(1)

l = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']
n = [y for y in (f(x) for x in l) if y]
I agree, it's easy to understand, but it's also ugly because of the
level of indentation (which is too deep for such a simple problem).
>> Does anyone have a better solution?
(sorry to ramble around)
A few months ago, I suggested an improvement on the python-ideas list to
add a post-filter to list comprehensions, something along these lines:
a = [f(x) as F for x in l if c(F)]
where the evaluation of f(x) will be the value of F so F can be used in
the if-expression as a post-filter (complementing list-comps' pre-filter).
Many doubted its usefulness since they say it's easy to wrap in another
list-comp:
a = [y for y in (f(x) for x in l) if c(y)]
or with a map and filter
a = filter(None, map(f, l))
Up till now, I don't really like the alternatives.
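For readers on Python 3.8 or later: assignment expressions (PEP 572) allow something close to this post-filter idea, though with different syntax from the proposal above. A minimal sketch applied to the original problem:

```python
import re

r = re.compile(r'\*(.+)\*')
l = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']

# m is bound once per item in the filter and reused in the result expression,
# so the regex is matched only once per item
n = [m.group(1) for x in l if (m := r.match(x))]
print(n)
```

The filter clause runs before the result expression for each item, which is why m is already bound when m.group(1) is evaluated.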
> the following works:
>
> r = re.compile('\*(.+)\*')
>
> def f(s):
>     m = r.match(s)
>     if m:
>         return m.group(1)
>     else:
>         return ''
>
> n = [f(x) for x in l if r.match(x)]
>
>
>
> But it is inefficient, because it is matching the regex twice for each
> item, and it is a bit ugly.
> Does anyone have a better solution?
Use a language with *real* list comprehensions?
Flamebait aside, you can use another level of comprehension, i.e.:
n = [m.group(1) for m in (r.match(x) for x in l) if m]
I did not know that. Thanks, Grant.
--
Neil Cerutti
What kind of guarantee do you have that the asterisk will only exist on
the first and last character, if at all?
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/
Looking back over the years, after I learned Python I realized that I
never really had enjoyed programming before.
> In article <mailman.1744.1260564...@python.org>, Ed
> Keith <e_...@yahoo.com> wrote:
>>
>>I have a list call it 'l':
>>
>>l = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']
>>
>>Notice that some of the items in the list start and end with an '*'. I
>>wish to construct a new list, call it 'n' which is all the members of l
>>that start and end with '*', with the '*'s removed.
>
> What kind of guarantee do you have that the asterisk will only exist on
> the first and last character, if at all?
Does it matter?
In any case, surely the simplest solution is to eschew regular
expressions and do it the easy way.
result = [s[1:-1] for s in l if s.startswith('*') and s.endswith('*')]
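Whether it matters depends on the data. As a rough comparison on some made-up edge cases (the samples list is an assumption, not from the original post), the regex from the first message and the slicing version do not always agree:

```python
import re

r = re.compile(r'\*(.+)\*')  # the pattern from the original post
samples = ['*a*b*', '*nbh*xyz', '**', '*']  # hypothetical edge cases

# r.match anchors only at the start, so trailing text after the last '*' is ignored
by_regex = [m.group(1) for m in map(r.match, samples) if m]

# Slicing insists on '*' at both ends, but also accepts '**' and a lone '*'
by_slice = [s[1:-1] for s in samples if s.startswith('*') and s.endswith('*')]

print(by_regex)
print(by_slice)
```

Interior asterisks are kept by both versions. The differences show up for strings with text after the closing '*' and for strings made up of one or two asterisks only.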
For a more general solution, I'd use a pair of helper functions:
def bracketed_by(s, prefix, suffix=None):
    if suffix is None:
        suffix = prefix
    return s.startswith(prefix) and s.endswith(suffix)

def strip_brackets(s, prefix, suffix=None):
    if suffix is None:
        suffix = prefix
    return s[len(prefix):-len(suffix)]
Note that I haven't tested these two helper functions. The second in
particular may not work correctly in some corner cases (e.g. passing the
empty string as suffix).
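One possible way to handle those corner cases, offered as a sketch rather than a tested replacement: compute the end index explicitly instead of using a negative slice, and add a length check so that prefix and suffix cannot overlap. The length check is an addition that is not in the helpers above.

```python
def bracketed_by(s, prefix, suffix=None):
    if suffix is None:
        suffix = prefix
    # Added assumption: require room for both markers, so a lone '*'
    # does not count as bracketed by '*' on both sides.
    return (len(s) >= len(prefix) + len(suffix)
            and s.startswith(prefix) and s.endswith(suffix))

def strip_brackets(s, prefix, suffix=None):
    if suffix is None:
        suffix = prefix
    # An explicit end index avoids s[n:-0], which is always empty.
    return s[len(prefix):len(s) - len(suffix)]

l = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']
result = [strip_brackets(s, '*') for s in l if bracketed_by(s, '*')]
print(result)
```

With an empty suffix, strip_brackets('<tag', '<', '') now returns 'tag' instead of the empty string.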
--
Steven