slicing functionality for strings / Python suitability for bioinformatics

jbper...@yahoo.com

Sep 19, 2005, 3:25:16 PM

>>> rs='AUGCUAGACGUGGAGUAG'
>>> rs[12:15]='GAG'
Traceback (most recent call last):
File "<pyshell#119>", line 1, in ?
rs[12:15]='GAG'
TypeError: object doesn't support slice assignment

You can't assign to a section of a sliced string in
Python 2.3 and there doesn't seem to be mention of this
as a Python 2.4 feature (don't have time to actually try
2.4 yet).

Q1. Does extended slicing make use of the Sequence protocol?
Q2. Don't strings also support the Sequence protocol?
Q3. Why then can't you make extended slicing assignment work
when dealing with strings?
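
One way to see the distinction behind Q1 to Q3, sketched for a current Python 3 interpreter (which the thread predates): strings satisfy the read-only Sequence interface, so plain and extended slicing work for reads, but they are not a MutableSequence, so item and slice assignment are unavailable. The abstract base classes in collections.abc are used here only as an illustration, not as a description of how the interpreter decides.

```python
from collections.abc import MutableSequence, Sequence

rs = 'AUGCUAGACGUGGAGUAG'

# Reading with plain and extended slices works on strings.
codon = rs[12:15]
every_third = rs[::3]

# Strings are sequences, but not mutable ones; lists and bytearrays are.
str_is_sequence = isinstance(rs, Sequence)
str_is_mutable = isinstance(rs, MutableSequence)
list_is_mutable = isinstance(list(rs), MutableSequence)
bytearray_is_mutable = isinstance(bytearray(rs, 'ascii'), MutableSequence)
```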

This sort of operation (slicing/splicing of sequences represented
as strings) would seem to be a very fundamental operation when doing
rna/dna/protein sequencing algorithms, and it would greatly enhance
Python's appeal to those doing bioinformatics work if the slicing
and extended slicing operators worked to their logical limit.

Doing a cursory search doesn't seem to reveal any current PEPs
dealing with extending the functionality of slicing/extended
slicing operators.

Syntax and feature-wise, is there a reason why Python can't kick
Perl's butt as the dominant language for bioinformatics and
eventually become the lingua franca of this fast-growing and
funding-rich field?

Terry Reedy

Sep 19, 2005, 7:30:35 PM
to pytho...@python.org

"Reinhold Birkenfeld" <reinhold-birk...@wolke7.net> wrote in
message news:3p8in6F...@individual.net...

> jbper...@yahoo.com wrote:
>>>>> rs='AUGCUAGACGUGGAGUAG'
>>>>> rs[12:15]='GAG'
>> Traceback (most recent call last):
>> File "<pyshell#119>", line 1, in ?
>> rs[12:15]='GAG'
>> TypeError: object doesn't support slice assignment
>>
>> You can't assign to a section of a sliced string in
>> Python 2.3 and there doesn't seem to be mention of this
>> as a Python 2.4 feature (don't have time to actually try
>> 2.4 yet).
>
> Strings are immutable in Python, which is why assignment to
> slices won't work.
>
> But why not use lists?
>
> rs = list('AUGC...')
> rs[12:15] = list('GAG')

Or arrays of characters: see the array module.

Terry J. Reedy

Reinhold Birkenfeld

Sep 19, 2005, 3:43:34 PM

jbper...@yahoo.com wrote:
>>>> rs='AUGCUAGACGUGGAGUAG'
>>>> rs[12:15]='GAG'
> Traceback (most recent call last):
> File "<pyshell#119>", line 1, in ?
> rs[12:15]='GAG'
> TypeError: object doesn't support slice assignment
>
> You can't assign to a section of a sliced string in
> Python 2.3 and there doesn't seem to be mention of this
> as a Python 2.4 feature (don't have time to actually try
> 2.4 yet).

Strings are immutable in Python, which is why assignment to
slices won't work.

But why not use lists?

rs = list('AUGC...')
rs[12:15] = list('GAG')
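
A fuller version of the list approach, as a sketch for Python 3: convert the string to a list, assign to the slice, and join the result back into a string. When only a single replacement is needed, building a new string from slices avoids the conversion altogether. The replacement 'CCC' is an arbitrary value chosen for illustration.

```python
rs = 'AUGCUAGACGUGGAGUAG'

# Mutable copy: slice assignment works on lists.
bases = list(rs)
bases[12:15] = list('CCC')
edited = ''.join(bases)

# Alternative: build a new string from the unchanged pieces.
spliced = rs[:12] + 'CCC' + rs[15:]
```

Both forms leave the original string rs untouched, which follows from strings being immutable.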

Reinhold

jbper...@yahoo.com

Sep 19, 2005, 10:28:33 PM

Right, I forgot about that...

jbper...@yahoo.com

Sep 19, 2005, 10:40:12 PM

Having to do an array.array('c',...):

>>> x=array.array('c','ATCTGACGTC')
>>> x[1:9:2]=array.array('c','AAAA')
>>> x.tostring()
'AACAGACATC'

is a bit klunkier than one would want, but I guess
the efficient performance is the silver lining here.
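
For readers on Python 3: the 'c' typecode no longer exists there and array.tostring() was removed in 3.9, so the session above only runs on Python 2. A rough equivalent, assuming the sequence is plain ASCII, uses the built-in bytearray, which supports the same extended slice assignment:

```python
x = bytearray(b'ATCTGACGTC')

# Extended slice assignment: indices 1, 3, 5 and 7 are replaced.
# The right-hand side must supply exactly as many items as the slice selects.
x[1:9:2] = b'AAAA'

result = x.decode('ascii')
print(result)  # AACAGACATC
```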

Sebastian Bassi

Sep 19, 2005, 10:57:24 PM
to pytho...@python.org
On 19 Sep 2005 12:25:16 -0700, jbper...@yahoo.com <jbper...@yahoo.com> wrote:
> >>> rs='AUGCUAGACGUGGAGUAG'
> >>> rs[12:15]='GAG'
> Traceback (most recent call last):
> File "<pyshell#119>", line 1, in ?
> rs[12:15]='GAG'
> TypeError: object doesn't support slice assignment

You should try Biopython (www.biopython.org). There is a sequence
method you could try.

--
The web without pop-ups or spyware: use Firefox instead of Internet
Explorer (http://www.spreadfirefox.com/?q=affiliates&id=24672&t=1)

jbper...@yahoo.com

Sep 19, 2005, 11:01:06 PM

Great suggestion... I was naively trying to turn the string into a list
and slice that, which I reckon would be significantly slower.

Steven D'Aprano

Sep 20, 2005, 6:55:04 PM

There are a number of ways to streamline that. The simplest is to merely
create an alias to array.array:

from array import array as str

Then you can say x = str('c', 'ATCTGACGTC').

A little more sophisticated would be to use currying:

def str(value):
    return array.array('c', value)

x = str('ATCTGACGTC')

although to be frank I'm not sure that something as simple as this
deserves to be dignified with the name currying.


Lastly, you could create a wrapper class that implements everything you
want. For a serious application, this is probably what you want to do
anyway:

import array

class DNA_Sequence:
    alphabet = 'ACGT'

    def __init__(self, value):
        for c in value:
            if c not in self.__class__.alphabet:
                raise ValueError('Illegal character "%s".' % c)
        self.value = array.array('c', value)

    def __repr__(self):
        return self.value.tostring()

and so on. Obviously you will need more work than this, and it may be
possible to subclass array directly.
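
To give an idea of what the extra work might look like, here is a sketch of such a wrapper for Python 3. It is not the class from the post: the name DNASequence, the bytearray storage, and the choice to validate on assignment are all assumptions made for illustration.

```python
class DNASequence:
    """Mutable DNA sequence restricted to the letters A, C, G and T."""

    alphabet = 'ACGT'

    def __init__(self, value):
        self._check(value)
        self._data = bytearray(value, 'ascii')

    @classmethod
    def _check(cls, value):
        for c in value:
            if c not in cls.alphabet:
                raise ValueError('Illegal character "%s".' % c)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        # Works for both integer indices and slices; always returns a str.
        piece = self._data[index]
        if isinstance(index, slice):
            return piece.decode('ascii')
        return chr(piece)

    def __setitem__(self, index, value):
        # Validate before mutating so a bad value leaves the sequence intact.
        self._check(value)
        if isinstance(index, slice):
            self._data[index] = value.encode('ascii')
        else:
            self._data[index] = ord(value)

    def __str__(self):
        return self._data.decode('ascii')

    def __repr__(self):
        return 'DNASequence(%r)' % str(self)


seq = DNASequence('ATCTGACGTC')
seq[1:9:2] = 'AAAA'
```

Keeping the storage private behind __getitem__ and __setitem__ means the alphabet check cannot be bypassed by slice assignment, which is the main thing a bare array or list would not give you.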


--
Steven.

Tom Anderson

Sep 21, 2005, 6:37:38 AM

On Wed, 21 Sep 2005, Steven D'Aprano wrote:

> On Mon, 19 Sep 2005 19:40:12 -0700, jbperez808 wrote:
>
>> Having to do an array.array('c',...):
>>
>> >>> x=array.array('c','ATCTGACGTC')
>> >>> x[1:9:2]=array.array('c','AAAA')
>> >>> x.tostring()
>> 'AACAGACATC'
>>
>> is a bit klunkier than one would want, but I guess the efficient
>> performance is the silver lining here.
>
> There are a number of ways to streamline that. The simplest is to merely
> create an alias to array.array:
>
> from array import array as str
>
> Then you can say x = str('c', 'ATCTGACGTC').
>
> A little more sophisticated would be to use currying:
>
> def str(value):
>     return array.array('c', value)
>
> x = str('ATCTGACGTC')

There's a special hell for people who override builtins.
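
To make the objection concrete, here is a small sketch (Python 3, with the shadowing confined to a hypothetical function so it does not leak into the rest of a program) of what goes wrong once the name str refers to array.array: ordinary calls such as str(42) stop working.

```python
def shadowed_str_demo():
    # Inside this function, the name `str` now means array.array.
    from array import array as str

    try:
        str(42)  # the builtin would have returned '42'
    except TypeError:
        return 'str(42) no longer works'
    return 'str(42) still works'


outcome = shadowed_str_demo()
```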

> although to be frank I'm not sure that something as simple as this
> deserves to be dignified with the name currying.

It's definitely not currying - it doesn't create a new function. Currying
would be:

def arraytype(kind):
    def mkarray(value):
        return array.array(kind, value)
    return mkarray

chars = arraytype('c')
seq = chars("tacatcgtcgacgtcgatcagtaccc")
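
The standard library covers the simpler case of fixing a leading argument with functools.partial, which is partial application rather than currying in the strict sense. A sketch for Python 3, where the 'c' typecode is gone, so the 'B' (unsigned byte) typecode and bytes literals stand in for it; the name byte_array is made up for this example:

```python
from array import array
from functools import partial

# partial() returns a new callable with the typecode already filled in.
byte_array = partial(array, 'B')

seq = byte_array(b'tacatcgtcgacgtcgatcagtaccc')
seq[0:3] = byte_array(b'GGG')
text = seq.tobytes().decode('ascii')
```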

> Lastly, you could create a wrapper class that implements everything you
> want. For a serious application, this is probably what you want to do
> anyway:

Definitely - there are lots of things to know about DNA molecules or parts
of them that aren't captured by the sequence.

tom

--
If it ain't Alberta, it ain't beef.

Fredrik Lundh

Sep 21, 2005, 7:08:45 AM
to pytho...@python.org
Tom Anderson wrote:

> There's a special hell for people who override builtins.

which is, most likely, chock full of highly experienced python programmers.

</F>

Steven D'Aprano

Sep 21, 2005, 10:28:34 AM

On Wed, 21 Sep 2005 11:37:38 +0100, Tom Anderson wrote:

> There's a special hell for people who override builtins.

[slaps head]

Of course there is, and I will burn in it for ever...

--
Steven.

Tom Anderson

Sep 21, 2005, 2:13:24 PM

You reckon? I've never felt the need to do it myself, and instinctively,
it seems like a bad idea. Perhaps i've been missing something, though -
could you give me some examples of when overriding a builtin is a good
thing to do?

tom

--
Fitter, Happier, More Productive.
