"6-7,mx8-10,8-10,foo"
I.e., a name is an integer or an alphabetic prefix possibly followed
by an integer. The display must compress each set of successive names
having the same (possibly null) prefix and sequential integers. The
original order of names must be preserved.
I (hesitantly) post the following code that works, but makes me cringe
anytime I see it, for its clumsy unreadableness. Can someone with
a fresh mind see a clear and concise way to make my clear and concise
name display?
names = ['6','7','mx8','mx09','mx10','8','9','10','foo']
groups = []
import re
def collapse(x,y):
    if x and x[-1][1] and y[1] and x[-1][0] == y[0] and int(x[-1][2]) == (int(y[2])-1):
        x[-1][2] = y[2]
        return x
    else:
        x.append(y)
        return x
groups = []
for n in names:
    r = re.compile('\d*$').search(n)
    groups.append([n[0:r.start()], n[r.start():r.end()], n[r.start():r.end()]])
r = reduce(collapse, groups, [])
s=[]
for i in r:
    if i[1] == i[2]:
        n=i[1]
    else:
        n=i[1] + '-' + i[2]
    s.append(i[0] + n)
print ','.join(s)
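For readers on Python 3 (where reduce is no longer a builtin and print is a function), the specification above can be sketched roughly as follows. This is only an illustration under the stated rules, not code from the thread; the name compress and the list-of-runs layout are assumptions made for the example.

```python
import re

def compress(names):
    """Collapse successive names with the same prefix and consecutive integers."""
    runs = []  # each run is [prefix, first_digits, last_digits]
    for name in names:
        m = re.search(r'\d*$', name)           # trailing digits, possibly empty
        prefix, digits = name[:m.start()], m.group()
        last = runs[-1] if runs else None
        if (last and digits and last[2]
                and last[0] == prefix
                and int(digits) == int(last[2]) + 1):
            last[2] = digits                   # extend the current run
        else:
            runs.append([prefix, digits, digits])
    # a run whose first and last digits are equal is printed as a single name
    return ','.join(p + (a if a == b else a + '-' + b) for p, a, b in runs)

print(compress(['6', '7', 'mx8', 'mx09', 'mx10', '8', '9', '10', 'foo']))
```

The digit strings are kept as text so that a lone name such as 'mx09' is displayed unchanged; only the comparison uses int().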
import re
names = ['6','7','5','mx8','mx09','mx10','8','9','10','foo','5','this']
def collapse( names ):
    """Collapse ranges of names"""
    result = []
    current = []
    currentPrefix = ''
    for item in names:
        a,b = split( item )
        if a == currentPrefix and current and b == current[-1]+1:
            current.append( b )
        else:
            if current:
                # need to put previous result in result-set
                result.append( (currentPrefix,current))
            # now process the new item...
            if b is None:
                # no number, so just add record to list
                result.append( (a,[]))
                current = []
            else:
                current = [b]
            currentPrefix = a
    if current:
        result.append( (currentPrefix,current))
    return result
matcher = re.compile( '^(?P<prefix>\D*)(?P<number>\d*)$' )
def split( s ):
    """Get prefix and integer value (or None) for a string"""
    a,b = matcher.match( s ).groups()
    if b:
        b = int( b, 10 )
    else:
        # no digits at all ("b or None" would also turn a genuine 0 into None)
        b = None
    return a, b
print collapse( names )
Enjoy,
Mike
George Young wrote:
>[python 2.3a1]
>I have a list of "names" like:
> ['6','7','mx8','mx09','mx10','8','9','10','foo']
>which needs for concise and clear display to become a string like:
>
> "6-7,mx8-10,8-10,foo"
>
>I.e., a name is an integer or an alphabetic prefix possibly followed
>by an integer. The display must compress each set of successive names
>having the same (possibly null) prefix and sequential integers. The
>original order of names must be preserved.
>
>I (hesitantly) post the following code that works, but makes me cringe
>anytime I see it, for its clumsy unreadableness. Can someone with
>a fresh mind see a clear and concise way to make my clear and concise
>name display?
>
>
...
_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/
It helps to separate the problem into parsing, grouping, and output formatting:
# input parsing
import re
numb = re.compile('(\D*)(\d*)')
names = ['6','7','mx8','mx09','mx10','8','9','10','foo']
pairs = [numb.match(n).groups() for n in names]
# grouping
result = []
currobj, currnum, cnt = None, "", 0
for obj, num in pairs:
    # extend the run only if the prefix matches and the number is the next one
    if obj == currobj and currnum and num and int(num) == int(currnum) + cnt:
        cnt += 1
    else:
        result.append((currobj, currnum, cnt))
        currobj, currnum, cnt = obj, num, 1
result.append((currobj, currnum, cnt))
# output formatting (result[0] is the dummy initial entry, so skip it)
for obj, num, cnt in result[1:]:
    if cnt == 1: print obj + num
    else: print obj + num + '-' + str(int(num)+cnt-1)
Raymond Hettinger
> [python 2.3a1]
> I have a list of "names" like:
> ['6','7','mx8','mx09','mx10','8','9','10','foo']
> which needs for concise and clear display to become a string like:
>
> "6-7,mx8-10,8-10,foo"
>
> I.e., a name is an integer or an alphabetic prefix possibly followed
> by an integer. The display must compress each set of successive names
> having the same (possibly null) prefix and sequential integers. The
> original order of names must be preserved.
>
> I (hesitantly) post the following code that works, but makes me cringe
> anytime I see it, for its clumsy unreadableness. Can someone with
> a fresh mind see a clear and concise way to make my clear and concise
> name display?
Not really:
import re
def compressRanges(l):
    if not l: return ""
    prefixNumStrTuples = [re.match(r'(\D*)(\d*)', s).groups() for s in l]
    lastPrefix, lastNumS = prefixNumStrTuples.pop(0)
    startNumS = lastNumS
    startNum = lastNum = lastNumS and int(lastNumS)
    # sentinel: None never equals a real prefix, so the last group is always flushed
    prefixNumStrTuples.append((None,""))
    res = []
    for prefix, numS in prefixNumStrTuples:
        num = numS and int(numS)
        # a name without a number never extends a range
        if prefix != lastPrefix or not numS or lastNum != num - 1:
            if startNum != lastNum:
                res.append(lastPrefix + startNumS + '-' + lastNumS)
            else:
                res.append(lastPrefix + lastNumS)
            startNum, startNumS = num, numS
        lastNum, lastNumS = num, numS
        lastPrefix = prefix
    return ','.join(res)
'as
import re
numb = re.compile('(\D*)(\d*)')
def group ((list, (last, start, end)), (next, num)):
    if last == next:
        if int(num) == int(end)+1: return (list, (last, start, num))
        if num == start: return (list, (next, num, num))
    return (list+[(last, start, end)], (next, num, num))
def fmt ((text, start, end)):
    if start == end: return text + start
    else: return text + start + "-" + end
def compare ((t1, n1), (t2, n2)):
    try: return cmp(t1, t2) or cmp(int(n1), int(n2))
    except: return 1
def tidy (list):
    if list:
        pairs = [numb.match(n).groups() for n in list]
        pairs.sort(compare)
        (text, num) = pairs[0]
        (groups, last) = reduce (group, pairs, ([], (text, num, num)))
        return map (fmt, groups+[last])
    else: return list
print tidy(['6','7','mx8','mx09','mx10','8','9','10','foo'])
andrew
my @names = qw(6 7 mx8 mx09 mx10 8 9 10 foo);
my @result;
my ($prefix, $integer) = (shift @names) =~ m/(\D*)(\d*)/;
my $last = $integer;
while (my ($new_prefix, $new_integer) = (shift @names) =~ m/(\D*)(\d*)/) {
    if ($new_prefix eq $prefix and $new_integer == $last + 1) {
        $last += 1;
    } else {
        push @result, display ($prefix, $integer, $last);
        $prefix = $new_prefix;
        $integer = $last = $new_integer;
    }
    last unless @names;
}
push @result, display ($prefix, $integer, $last);
print join (',', @result) . "\n";
or can be solved in a concise way (perhaps someone can do better) :
my @o;
@result = grep {if ($_->[0] eq $o[0] and $_->[1] == $o[2]+1) {$o[2]++;0}
else {($_,@o)=(display(@o),@$_);1}} map {m/(\D*)(\d*)/;[$1,$2, $2]} @names;
push @result, display(@o);shift @result;
"George Young" <g...@ll.mit.edu> wrote in message
news:pan.2003.05.29....@ll.mit.edu...
LOL!! Good one. :-)
> sub display {
>     my ($prefix, $first, $last) = @_;
>     return "$prefix$first" if $last eq $first;
>     return "$prefix$first-$last";
> }
[snip rest of line noise]
-bigoted-ly y'rs,
Peter
>I have a list of "names" like:
> ['6','7','mx8','mx09','mx10','8','9','10','foo']
>which needs for concise and clear display to become a string like:
>
> "6-7,mx8-10,8-10,foo"
I also see from your subject line that
[2,3,4,7] --> "2-4,7"
I am not sure all the code given before took this into account.
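For the plain-integer case from the subject line, a minimal Python 3 sketch could look like the following. It is not taken from any post in the thread; the name int_ranges is hypothetical, and it relies on the observation that within a run of consecutive integers the difference between value and position stays constant.

```python
from itertools import groupby

def int_ranges(numbers):
    """Collapse consecutive integers, e.g. [2, 3, 4, 7] gives '2-4,7'."""
    parts = []
    # enumerate() pairs each value with its index; value - index is the
    # same for every member of a consecutive run, so it serves as the key.
    for _, run in groupby(enumerate(numbers), key=lambda pair: pair[1] - pair[0]):
        values = [value for _, value in run]
        first, last = values[0], values[-1]
        parts.append(str(first) if first == last else '%d-%d' % (first, last))
    return ','.join(parts)

print(int_ranges([2, 3, 4, 7]))
```

Because groupby only merges adjacent items, the original order is preserved and nothing is sorted.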
I have been thinking about this problem since yesterday, because I have
coded similar stuff at least a dozen times in my life, but I had not
yet found a good way to do it.
I tried Python OO, and I think it turned out pretty clean.
###############################
## gyoung.py
import re
class GroupHelper:
    def __init__(self, s='', **d):
        m = re.match(r'(\D*)(\d+)$', s)
        if m:
            g = m.groups()
            self.kind = 'pair'
            self.prefix = g[0]
            self.start = self.end = int(g[1])
        else:
            self.kind = 'string'
            self.prefix = s
        self.__dict__.update(d)
    def __add__(self, other):
        if (self.kind != 'pair') or (other.kind != 'pair'):
            raise TypeError
        if self.prefix != other.prefix:
            raise TypeError
        if other.start != (self.end + 1):
            raise TypeError
        return GroupHelper(
            kind='pair',
            prefix=self.prefix,
            start=self.start,
            end=other.end)
    def __str__(self):
        if self.kind == 'string':
            return self.prefix
        if self.start == self.end:
            return '%s%i' % (self.prefix, self.start)
        return '%s%i-%i' % (self.prefix, self.start, self.end)
def gyoung(list0):
    if not list0: return ''
    list0 = [GroupHelper(s) for s in list0]
    result = [list0[0]]
    for gh in list0[1:]:
        try:
            result[-1] += gh
        except TypeError:
            result.append(gh)
    return ','.join( [str(y) for y in result] )
print gyoung(
    ['6','7','8','12','mx8','mx09','mx10','8','9','10','foo'])
###############################
The main loop is particularly clean, because the class does all the
work.
In the other solutions given, I was worried about what always happened
to me the other dozen times I coded something like this: if there is a
small change to the specification, my old algorithm cannot be simply
adapted.
For example, if you wanted 'foo' to be treated like 'foo00', or if in
the original list, you already had some items in the form
'mx8-12, mx13, mx14-17'
and you wanted the result to be 'mx8-17'
I think with this Python OO approach, you can make such changes
simply. Local changes to the parsing, 'adding', formatting, or the
main loop don't seem to necessitate a global change to the code.
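As a rough illustration of the second variation (input items that are already ranges), here is a Python 3 sketch written as a plain function rather than as a change to the class above. The name merge_ranges and the tuple layout are assumptions made for this example only; the idea is that just the parsing step needs to learn about an optional "-end" part.

```python
import re

# prefix, first number, and an optional "-last" part
_ITEM = re.compile(r'(\D*)(\d+)(?:-(\d+))?$')

def merge_ranges(items):
    merged = []  # (prefix, start, end) for numbered items, (text, None, None) otherwise
    for item in items:
        m = _ITEM.match(item)
        if m is None:                      # no trailing number: keep as-is
            merged.append((item, None, None))
            continue
        prefix, start = m.group(1), int(m.group(2))
        end = int(m.group(3)) if m.group(3) else start
        prev = merged[-1] if merged else None
        if (prev and prev[1] is not None
                and prev[0] == prefix and prev[2] + 1 == start):
            merged[-1] = (prefix, prev[1], end)   # extend the previous range
        else:
            merged.append((prefix, start, end))
    out = []
    for prefix, start, end in merged:
        if start is None:
            out.append(prefix)
        elif start == end:
            out.append('%s%d' % (prefix, start))
        else:
            out.append('%s%d-%d' % (prefix, start, end))
    return ','.join(out)

print(merge_ranges(['mx8-12', 'mx13', 'mx14-17']))
```

Only the regular expression and the computation of end differ from the single-number case; the merging and formatting steps stay the same.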
The Perl solution was hilarious! I assume it was done tongue-in-cheek.
Manuel
from __future__ import nested_scopes
import re
def collapse( input_sequence ):
    search = re.compile( '\d+$' ).search
    range_prefix = None
    range_start = 0
    range_last = 0
    item_last = None
    result_list = []
    def append_last( ):
        if range_prefix is not None:
            if range_start == range_last:
                result_list.append( item_last )
            else:
                result_list.append( range_prefix + str( range_start ) + '-' + str( range_last ) )
    for item in input_sequence:
        found = search( item )
        if found is None:
            append_last( )
            result_list.append( item )
            range_prefix = None
            continue
        start = found.start( )
        prefix = item[:start]
        value = int( item[start:] )
        if prefix == range_prefix and value == range_last + 1:
            range_last = value
        else:
            append_last( )
            range_prefix = prefix
            range_start = value
            range_last = value
        item_last = item
    append_last( )
    return ','.join( result_list )
---
CB
from __future__ import generators
import re
def inputparse(source):
    splitter = re.compile('(\D*)(\d*)').match
    for name in source:
        yield splitter(name).groups()
def grouping(pairsource):
    source = iter(pairsource)
    lastname, lastnumber = source.next()
    startnumber = lastnumber
    for name, number in source:
        # names without a number never extend a range (and int('') would fail)
        if (name == lastname and number and lastnumber
                and int(number) == int(lastnumber)+1):
            lastnumber = number
        else:
            yield lastname, startnumber, lastnumber
            lastname, startnumber, lastnumber = name, number, number
    yield lastname, startnumber, lastnumber
def format(lastname, startnumber, lastnumber):
    # compare by value rather than by object identity
    if startnumber == lastnumber:
        return lastname + startnumber
    else:
        return lastname + startnumber + '-' + lastnumber
if __name__ == '__main__':
    names = ['6','7','mx8','mx09','mx10','8','9','10','foo']
    print ', '.join([format(name, start, stop) for name, start, stop
                     in grouping(inputparse(names))])
-Scott David Daniels
Scott....@Acm.Org
<snip>
> In the other solutions given, I was worried about what always happened
> to me the other dozen times I coded something like this: if there is a
> small change to the specification, my old algorithm cannot be simply
> adapted.
<snip>
> or if in
> the original list, you already had some items in the form
> 'mx8-12, mx13, mx14-17'
> and you wanted the result to be 'mx8-17'
<snip>
I agree with the sentiment completely. Having said that, here is how I
amused myself for the last hour:
import re
names=['6','7','mx8','mx09','mx10','8','9','10','foo']
r=re.compile(r'(\D*)(\d*)-?(\d*)')
for ii in xrange(len(names)-1, 0,-1):
    (base1, low1, high1),(base2, low2, high2)=r.match(names[ii-1]).groups(), r.match(names[ii]).groups()
    if base1==base2 and int(low1)+1==int(low2):
        names[ii-1]='%s%d-%d' %(base1, int(low1), int(max(low2.rjust(len(high2)), high2)))
        names.pop(ii)
print names
Alex
>[python 2.3a1]
>I have a list of "names" like:
> ['6','7','mx8','mx09','mx10','8','9','10','foo']
>which needs for concise and clear display to become a string like:
>
> "6-7,mx8-10,8-10,foo"
Don't you need to be able to reconstitute the original sequence from the
abbreviated one? I.e., how would you know to go backwards to 'mx09' vs 'mx9' ?
[...]
Regards,
Bengt Richter
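To make the round-trip concern concrete, here is a minimal Python 3 sketch that expands the compressed display back into names. It is not from the thread; the name expand is hypothetical, and it assumes every range is written as prefix, first number, a hyphen, and last number. Under that assumption the zero padding of 'mx09' cannot be recovered.

```python
import re

_RANGE = re.compile(r'(\D*)(\d+)-(\d+)$')

def expand(display):
    """Expand '6-7,mx8-10' into individual names (padding is not restored)."""
    names = []
    for part in display.split(','):
        m = _RANGE.match(part)
        if m is None:                  # a single name, keep it unchanged
            names.append(part)
            continue
        prefix, first, last = m.group(1), int(m.group(2)), int(m.group(3))
        names.extend('%s%d' % (prefix, n) for n in range(first, last + 1))
    return names

print(expand('6-7,mx8-10,8-10,foo'))
```

The result contains 'mx9' where the original list had 'mx09', which is exactly the ambiguity raised above.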
>I agree with the sentiment completely. Having said that, here is how I
>amused myself for the last hour:
>
>import re
>names=['6','7','mx8','mx09','mx10','8','9','10','foo']
>r=re.compile(r'(\D*)(\d*)-?(\d*)')
>
>for ii in xrange(len(names)-1, 0,-1):
>    (base1, low1, high1),(base2, low2, high2)=r.match(names[ii-1]).groups(), r.match(names[ii]).groups()
>    if base1==base2 and int(low1)+1==int(low2):
>        names[ii-1]='%s%d-%d' %(base1, int(low1), int(max(low2.rjust(len(high2)), high2)))
>        names.pop(ii)
>
>print names
Kick ass!
I recommend that everyone in the thread play with this one, to fully
appreciate its majesty.
['6','7','12','mx8','mx09','mx10','mx11-12','8','9','10','foo']
-> ['6-7', '12', 'mx8-12', '8-10', 'foo']
Fantastic!
Two bugs, easily fixed:
It doesn't work for ['foo','foo1'].
Little mix-up with 'low' and 'high' in the 'if' condition,
so ['mx3-7','mx8'] doesn't work.
I am not sure if 'int(max(low2.rjust(len(high2)), high2))' always
works. I would replace it with
'int([high2, low2][len(high2)==0])'.
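For anyone puzzled by that replacement expression: it indexes a two-element list with a boolean, which Python treats as 0 or 1. A small Python 3 sketch (the helper names are hypothetical, used only for illustration) shows it next to what I believe is an equivalent spelling with "or":

```python
def upper_bound(low, high):
    # len(high) == 0 is False (index 0, picks high) or True (index 1, picks low)
    return int([high, low][len(high) == 0])

def upper_bound_or(low, high):
    # an empty string is falsy, so "high or low" falls back to low
    return int(high or low)

for low, high in [('8', ''), ('11', '12'), ('9', '10')]:
    print(low, high, upper_bound(low, high), upper_bound_or(low, high))
```

Both pick the explicit upper bound when the item was already a range and fall back to the single number otherwise.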
I think this fixes it:
########################
## gyoung2.py
import re
names = ['6','7','12','mx3-7','mx8','mx09','mx10','mx11-12','8','9','10','foo','foo1','foo2']
for ii in xrange(len(names)-1, 0,-1):
    (base1, low1, high1),(base2, low2, high2)=(re.match(r'(\D*)(\d*)-?(\d*)', names[ii-1]).groups(), re.match(r'(\D*)(\d*)-?(\d*)', names[ii]).groups())
    if base1 == base2 and low1 and low2 and int([high1, low1][len(high1)==0])+1 == int(low2):
        names[ii-1] = '%s%d-%d' % (base1, int(low1), int([high2, low2][len(high2)==0]))
        names.pop(ii)
print names
########################
My brain is not working to make it any shorter.
Pythonic, it is not!
Manuel
> Kick ass!
Thanks.
> Two bugs, easily fixed
Thanks for the fixes.
> Pythonic, it is not!
No, but it was fun!
Alex