turn list of letters into an array of integers

seektime

Oct 24, 2012, 1:23:11 AM
Here's some example code. The input is a list which is a "matrix" of letters:
a b a
b b a

and I'd like to turn this into a Python array:

1 2 1
2 2 1

so 1 replaces a, and 2 replaces b. Here's the code I have so far:

>>> L=['a b a\n','b b a\n']
>>> s=' '.join(L)
>>> seq1=('a','b')
>>> seq2=('1','2')
>>> d = dict(zip(seq1,seq2))
>>> # Define method to replace letters according to dictionary (got this from http://gomputor.wordpress.com/2008/09/27/search-replace-multiple-words-or-characters-with-python/).
... def replace_all(text, dic):
...     for i, j in dic.iteritems():
...         text = text.replace(i, j)
...     return text
...

>>> seq = replace_all(s,d)
>>> print seq
1 2 1
2 2 1

>>> seq
'1 2 1\n 2 2 1\n'

My question is how can I turn "seq" into a python array?

Thanks
Michael

Demian Brecht

Oct 24, 2012, 1:45:07 AM
to seektime, pytho...@python.org
On 2012-10-23, at 10:23 PM, seektime <michael....@gmail.com> wrote:
> My question is how can I turn "seq" into a python array?


Something like this perhaps?:

>>> alpha = ('a', 'b')
>>> numeric = ('1', '2')
>>> L = ['a b a\n', 'b b a\n']
>>> s = ' '.join(L)
>>> d = dict(zip(alpha, numeric))
>>> list_ = [d[c] for c in s.strip('\n').split()]
>>> list_
['1', '2', '1', '2', '2', '1']

Demian Brecht
@demianbrecht
http://demianbrecht.github.com




David Hutto

Oct 24, 2012, 1:50:38 AM
to seektime, pytho...@python.org
On Wed, Oct 24, 2012 at 1:23 AM, seektime <michael....@gmail.com> wrote:
> Here's some example code. The input is a list which is a "matrix" of letters:
> a b a
> b b a
>
> and I'd like to turn this into a Python array:
>
> 1 2 1
> 2 2 1
>
> so 1 replaces a, and 2 replaces b. Here's the code I have so far:
>
> >>> L=['a b a\n','b b a\n']
> >>> s=' '.join(L)
> >>> seq1=('a','b')
> >>> seq2=('1','2')
> >>> d = dict(zip(seq1,seq2))
> >>> # Define method to replace letters according to dictionary (got this from http://gomputor.wordpress.com/2008/09/27/search-replace-multiple-words-or-characters-with-python/).
> ... def replace_all(text, dic):
> ...     for i, j in dic.iteritems():
> ...         text = text.replace(i, j)
> ...     return text
> ...
>
> >>> seq = replace_all(s,d)
> >>> print seq
> 1 2 1
> 2 2 1
>
> >>> seq
> '1 2 1\n 2 2 1\n'
>
I'd suggest, if this is what you're referring to:

x = seq.split('\n ')
array_list = [ ]
next_3_d_array = []
range_of_seq = len(seq)
for num in range(0,range_of_seq):
    if num % 3 != 0:
        next_3_d_array.append(num)
    if num % 3 == 0:
        array_list.append(next_3_d_array)
        next_3_d_array = [ ]

--
Best Regards,
David Hutto
CEO: http://www.hitwebdevelopment.com

Demian Brecht

Oct 24, 2012, 1:51:26 AM
to seektime, pytho...@python.org

On 2012-10-23, at 10:45 PM, Demian Brecht <demian...@gmail.com> wrote:

> >>> list_ = [d[c] for c in s.strip('\n').split()]
> >>> list_
> ['1', '2', '1', '2', '2', '1']


Of course, if you want these to be ints, then you can either change the format of your int list, or map(int, list_) if you don't have control over it.

Demian Brecht

Oct 24, 2012, 1:54:30 AM
to seektime, pytho...@python.org

> Of course, if you want these to be ints, then you can either change the format of your int list, or map(int, list_) if you don't have control over it.


Ugh, I'm tired. Shouldn't map it, the conversion should be done in the list comprehension to avoid a needless second list iteration.

K, I'm going to sleep now. :P
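
A minimal sketch of the single-pass conversion described above, assuming the same `L` as in the original post. Here the mapping `d` holds integers rather than digit strings, and the name `flat` is only illustrative:

```python
L = ['a b a\n', 'b b a\n']
d = {'a': 1, 'b': 2}  # assumption: map straight to ints, not to '1'/'2'

# split() with no arguments splits on any run of whitespace,
# including '\n', so no separate strip is needed.
flat = [d[c] for c in ' '.join(L).split()]
print(flat)
```

If the mapping has to stay string-valued, `int(d[c])` inside the comprehension does the conversion in the same pass.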

Chris Rebert

Oct 24, 2012, 2:07:26 AM
to Michael J. Krause, pytho...@python.org
On Tue, Oct 23, 2012 at 10:23 PM, seektime <michael....@gmail.com> wrote:
> Here's some example code. The input is a list which is a "matrix" of letters:
> a b a
> b b a
>
> and I'd like to turn this into a Python array:

You mean a Python list. The datatype Python calls an `array` is very
different and relatively uncommonly used.
Although, confusingly, Python's lists are implemented using C arrays
rather than linked lists.
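
For comparison, a short sketch of the standard-library `array` type mentioned above. It stores values of one fixed C type, whereas a list can hold arbitrary objects. The variable names are illustrative:

```python
from array import array

as_list = [1, 2, 1]             # an ordinary list: any Python objects
as_array = array('i', as_list)  # 'i' = signed C int; one element type only

as_list.append('b')             # fine, lists are heterogeneous
try:
    as_array.append('b')        # rejected: not an integer
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```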

> 1 2 1
> 2 2 1
>
> so 1 replaces a, and 2 replaces b. Here's the code I have so far:
>
> >>> L=['a b a\n','b b a\n']
<snip>
> >>> seq
> '1 2 1\n 2 2 1\n'
>
> My question is how can I turn "seq" into a python array?

I'd say you're asking the wrong question. The better question is "Why
wasn't the result a list in the first place?". Many transformations
are cumbersome to express over just strings, which is why the first
job of most programs is to parse their input into a more convenient
structure that is suited to their main task(s).

This (along with some other improvements) leads to a better, somewhat
different program/algorithm:

letter2number = {'a': 1, 'b': 2}
with open("path/to/file.txt", "r") as f:
    result = [[letter2number[letter] for letter in
               line.strip().split()] for line in f]

If it's safe to assume that the correspondence between the letters and
numbers isn't completely arbitrary, some further improvements are also
possible.
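
One possible reading of that last remark, as a hedged sketch: if 'a' maps to 1, 'b' to 2 and so on through the alphabet (an assumption, since the original post only gives two letters), the dictionary can be replaced by arithmetic on character codes. `letter_to_number` is a hypothetical helper name, and the list `lines` stands in for the file:

```python
def letter_to_number(letter):
    # Assumes single lowercase ASCII letters: 'a' -> 1, 'b' -> 2, ...
    return ord(letter) - ord('a') + 1

lines = ['a b a\n', 'b b a\n']  # stands in for the lines of the file
result = [[letter_to_number(letter) for letter in line.split()]
          for line in lines]
print(result)
```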

Some relevant docs:
http://docs.python.org/library/stdtypes.html#string-methods
http://docs.python.org/tutorial/datastructures.html#list-comprehensions

Cheers,
Chris

P.S.: I'm guessing you obtained `L` from file.readlines() or similar;
it is worth noting for future reference that the readlines() method is
considered somewhat deprecated.

Peter Otten

Oct 24, 2012, 3:47:57 AM
to pytho...@python.org
Chris Rebert wrote:

> line.strip().split()

No need to strip() if you are going to split on whitespace:

>>> line = " a b c \n"
>>> line.split() == line.strip().split()
True

Lest the new idiom takes on while you are bravely fighting the immortable
readlines() ;)

Peter Otten

Oct 24, 2012, 5:04:38 AM
to pytho...@python.org
Peter Otten wrote:

Brave new words:

> immortable

should be "immortal"

88888 Dihedral

Oct 24, 2012, 8:03:52 AM
to Michael J. Krause, pytho...@python.org
On Wednesday, October 24, 2012 at 2:07:29 PM UTC+8, Chris Rebert wrote:
> On Tue, Oct 23, 2012 at 10:23 PM, seektime <michael....@gmail.com> wrote:
> > Here's some example code. The input is a list which is a "matrix" of letters:
> > a b a
> > b b a
> >
> > and I'd like to turn this into a Python array:
>
> You mean a Python list. The datatype Python calls an `array` is very
> different and relatively uncommonly used.
> Although, confusingly, Python's lists are implemented using C arrays
> rather than linked lists.

The list in python is a list of valid python objects.
For the number crunching part, please use arrays in numarray and scipy.

Robert Kern

Oct 24, 2012, 8:22:13 AM
to pytho...@python.org
On 10/24/12 1:03 PM, 88888 Dihedral wrote:

> The list in python is a list of valid python objects.
> For the number crunching part, please use arrays in numarray and scipy.

Your bot's database is laughably out of date.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Terry Reedy

Oct 24, 2012, 11:56:19 AM
to pytho...@python.org
On 10/24/2012 1:23 AM, seektime wrote:
> Here's some example code. The input is a list which is a "matrix" of letters:
> a b a
> b b a
>
> and I'd like to turn this into a Python array:
>
> 1 2 1
> 2 2 1
>
> so 1 replaces a, and 2 replaces b.

If you are going to replace single characters (letters) with single
characters (digits), use maketrans and translate.

>>> 'a b c'.translate(str.maketrans('abc', '123'))
'1 2 3'
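
A sketch of how the translated string could then become rows of integers, assuming Python 3 (where `str.maketrans` exists) and an input string shaped like the one in the original post. `translated` and `rows` are illustrative names:

```python
s = 'a b a\n b b a\n'
translated = s.translate(str.maketrans('ab', '12'))

# One inner list per line; int() converts each digit string.
rows = [[int(tok) for tok in line.split()]
        for line in translated.splitlines()]
print(rows)
```

On Python 2 the corresponding helper is `string.maketrans`.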

--
Terry Jan Reedy

MRAB

Oct 24, 2012, 1:05:43 PM
to pytho...@python.org
On 2012-10-24 07:07, Chris Rebert wrote:
> On Tue, Oct 23, 2012 at 10:23 PM, seektime <michael....@gmail.com> wrote:
>> Here's some example code. The input is a list which is a "matrix" of letters:
>> a b a
>> b b a
>>
>> and I'd like to turn this into a Python array:
>
> You mean a Python list. The datatype Python calls an `array` is very
> different and relatively uncommonly used.
> Although, confusingly, Python's lists are implemented using C arrays
> rather than linked lists.
>
>> 1 2 1
>> 2 2 1
>>
>> so 1 replaces a, and 2 replaces b. Here's the code I have so far:
>>
>> >>> L=['a b a\n','b b a\n']
> <snip>
>> >>> seq
>> '1 2 1\n 2 2 1\n'
>>
>> My question is how can I turn "seq" into a python array?
>
> I'd say you're asking the wrong question. The better question is "Why
> wasn't the result a list in the first place?". Many transformations
> are cumbersome to express over just strings, which is why the first
> job of most programs is to parse their input into a more convenient
> structure that is suited to their main task(s).
>
> This (along with some other improvements) leads to a better, somewhat
> different program/algorithm:
>
> letter2number = {'a': 1, 'b': 2}
> with open("path/to/file.txt", "r") as f:
>     result = [[letter2number[letter] for letter in line.strip().split()] for line in f]
>
If you're using .split() then you don't need to use .strip() as well:

result = [[letter2number[letter] for letter in line.split()] for line in f]

wxjm...@gmail.com

Oct 24, 2012, 1:27:27 PM
Not so sure what you mean by an "array of integers".

>>> def z(s):
...     a = s.splitlines()
...     b = [e.split() for e in a]
...     for row in range(len(b)):
...         for col in range(len(b[row])):
...             b[row][col] = ord(b[row][col]) - ord('a')
...     return b
...
>>> z('a b a\n b b a')
[[0, 1, 0], [1, 1, 0]]
>>>
>>> # or
>>> table = {'a': 111, 'b': 222}
>>>
>>> def z2(s, table):
...     a = s.splitlines()
...     b = [e.split() for e in a]
...     for row in range(len(b)):
...         for col in range(len(b[row])):
...             b[row][col] = table[b[row][col]]
...     return b
...
>>> z2('a b a\n b b a', table)
[[111, 222, 111], [222, 222, 111]]
>>>
>>> # note
>>> z('a\n b b b b b\n a a')
[[0], [1, 1, 1, 1, 1], [0, 0]]

jmf

Demian Brecht

Oct 24, 2012, 1:36:49 PM
to wxjm...@gmail.com, pytho...@python.org

On 2012-10-24, at 10:27 AM, wxjm...@gmail.com wrote:
> Not so sure what you mean by an "array of integers".


I wasn't entirely sure about that either. I assumed given the subject that it was just a 1-D array and could then be accessed by arr[(y * width) + x].
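
To illustrate that flat layout, a small sketch assuming row-major order and a known `width`. The helper `at` is hypothetical:

```python
arr = [1, 2, 1, 2, 2, 1]  # the 2x3 matrix stored row after row
width = 3

def at(x, y):
    # Element in column x of row y, both counted from 0.
    return arr[(y * width) + x]

print(at(0, 1))  # first column of the second row
```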

seektime

Oct 25, 2012, 12:27:28 AM
to Michael J. Krause, pytho...@python.org
Thanks to everyone; lots of great comments and actionable suggestions.

My intention is to use the numpy/scipy packages to solve the task at hand. I agree that there's no point in loading a file into a format which only needs to be converted right after loading. But I'm new to Python, and the f.readline(s) command, according to the 2.7.3 tutorial and manual, is pretty much all there is for file i/o. If, as you indicated, f.readlines() is deprecated, then what should I use instead? I'm using ver. 2.6 on Linux (it's a bit dated, I know).

Chris Rebert

Oct 25, 2012, 12:52:07 AM
to seektime, pytho...@python.org
On Wed, Oct 24, 2012 at 9:27 PM, seektime <michael....@gmail.com> wrote:
> On Tuesday, October 23, 2012 11:07:29 PM UTC-7, Chris Rebert wrote:
<snip>
>> P.S.: I'm guessing you obtained `L` from file.readlines() or similar;
>> it is worth noting for future reference that the readlines() method is
>> considered somewhat deprecated.
>
> Thanks to everyone; lots of great comments and actionable suggestions.
>
> My intention is to use the numpy/scipy packages to solve the task at hand. I agree that there's no point in loading a file into a format which only needs to be converted right after loading. But I'm new to Python, and the f.readline(s) command, according to the 2.7.3 tutorial and manual, is pretty much all there is for file i/o. If, as you indicated, f.readlines() is deprecated, then what should I use instead? I'm using ver. 2.6 on Linux (it's a bit dated, I know).

Just iterate over the file directly using a for-loop (e.g. `for line
in some_file:`). Each iteration yields one line of the file. I used a
very minor variation of this approach in my code (a list comprehension
is just syntax sugar for a for-loop).
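
Spelled out as an explicit loop, that could look roughly like the sketch below. The temporary file and its contents are assumptions made only so the example runs on its own; with a real input file, only the `with open(...)` part is needed:

```python
import os
import tempfile

# Write a small sample file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('a b a\nb b a\n')

letter2number = {'a': 1, 'b': 2}
result = []
with open(path) as f:
    for line in f:  # one line per iteration, no readlines() needed
        result.append([letter2number[letter] for letter in line.split()])

os.remove(path)
print(result)
```

Since numpy was mentioned: a nested list like `result` can be passed to `numpy.array` to get a 2-D array.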

Cheers,
Chris

Peter Otten

Oct 25, 2012, 1:47:48 AM
to pytho...@python.org
Dennis Lee Bieber wrote:

> On Wed, 24 Oct 2012 11:04:38 +0200, Peter Otten <__pet...@web.de>
> declaimed the following in gmane.comp.python.general:
> Readlines() isn't immortal... It's a lich
> http://en.wikipedia.org/wiki/Lich

Wasn't there a Monty Python sketch where a man carrying a parrot in a cage
comes into a shop full of stuffed animals and complains: No, I don't admire
the taxidermist for making that parrot look like it were alive -- that beast
bit me!

Steven D'Aprano

Oct 25, 2012, 3:49:42 AM
On Thu, 25 Oct 2012 07:47:48 +0200, Peter Otten wrote:

> Wasn't there a Monty Python sketch where a man carrying a parrot in a
> cage comes into a shop full of stuffed animals and complains: No, I
> don't admire the taxidermist for making that parrot look like it were
> alive -- that beast bit me!

I don't think so. Are you thinking of the famous Monty Python "Dead
Parrot Sketch"? Here's one of many versions:

http://www.youtube.com/watch?v=4vuW6tQ0218



--
Steven

Peter Otten

Oct 25, 2012, 4:25:31 AM
to pytho...@python.org
My rendition was meant to be a travesty of that one, an "undead" parrot as a
follow-up to Dennis' "lich" post. I'm sorry I forgot the smiley ;)

- He didn't move, that was just the wind stirring his plumage.
- No, that parrot is alive and kicking, fresh as a daisy, full of beans...
- Aren't the glass eyes beautiful?
- Glass eyes -- he just blinked!

And so on. I'm off to work on my laden swallow branch of Python. It's going
to be a real heavy-weight...

--
Always look on the dark side of death

Mark Lawrence

Oct 25, 2012, 4:55:37 AM
to pytho...@python.org
On 25/10/2012 09:25, Peter Otten wrote:
> My rendition was meant to be a travesty of that one, an "undead" parrot as a
> follow-up to Dennis' "lich" post. I'm sorry I forgot the smiley ;)
>
> - He didn't move, that was just the wind stirring his plumage.
> - No, that parrot is alive and kicking, fresh as a daisy, full of beans...
> - Aren't the glass eyes beautiful?
> - Glass eyes -- he just blinked!
>
> And so on. I'm off to work on my laden swallow branch of Python. It's going
> to be a real heavy-weight...
>

I just hope you get the technicalities correct. "That parrot wouldn't
move if you put 4 million volts through it". What rubbish. It should
either have been 4 million amps through it or 4 million volts across it.
I'm +1 for the former, although possibly biased by history.

--
Cheers.

Mark Lawrence.

Prasad, Ramit

Oct 25, 2012, 5:27:16 PM
to pytho...@python.org
David Hutto wrote:
> On Wed, Oct 24, 2012 at 1:23 AM, seektime <michael....@gmail.com> wrote:
> > Here's some example code. The input is a list which is a "matrix" of letters:
> > a b a
> > b b a
> >
> > and I'd like to turn this into a Python array:
> >
> > 1 2 1
> > 2 2 1
> >
> > so 1 replaces a, and 2 replaces b. Here's the code I have so far:
> >
> > >>> L=['a b a\n','b b a\n']
> > >>> s=' '.join(L)
> > >>> seq1=('a','b')
> > >>> seq2=('1','2')
> > >>> d = dict(zip(seq1,seq2))
> > >>> # Define method to replace letters according to dictionary (got this from http://gomputor.wordpress.com/2008/09/27/search-replace-multiple-words-or-characters-with-python/).
> > ... def replace_all(text, dic):
> > ...     for i, j in dic.iteritems():
> > ...         text = text.replace(i, j)
> > ...     return text
> > ...
> >
> > >>> seq = replace_all(s,d)
> > >>> print seq
> > 1 2 1
> > 2 2 1
> >
> > >>> seq
> > '1 2 1\n 2 2 1\n'
> >
> I'd suggest, if this is what you're referring to:
>
> x = seq.split('\n ')
> array_list = [ ]
> next_3_d_array = []
> range_of_seq = len(seq)
> for num in range(0,range_of_seq):
>     if num % 3 != 0:
>         next_3_d_array.append(num)
>     if num % 3 == 0:
>         array_list.append(next_3_d_array)
>         next_3_d_array = [ ]
>

Wow, that looks complicated. Why hardcode to 3 instead of wherever
the newline is?

>>> [ int(x.strip()) for subseq in seq.split('\n') for x in subseq.split() ]
[1, 2, 1, 2, 2, 1]
>>> # OR
>>> lst = []
>>> for subseq in seq.split('\n'):
...     for x in subseq.split():
...         lst.append( int(x.strip()))
...
>>>
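
Both versions above produce one flat list. If the row structure matters, a variant along these lines keeps one inner list per line (a sketch, assuming the same `seq` string as in the original post; `rows` is an illustrative name):

```python
seq = '1 2 1\n 2 2 1\n'

# splitlines() drops the trailing newline, so no empty last row appears.
rows = [[int(x) for x in subseq.split()] for subseq in seq.splitlines()]
print(rows)
```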

Ramit Prasad

