What's correct Python syntax?

Igor Korot

Jan 14, 2014, 3:46:56 AM
to pytho...@python.org
Hi, ALL,
I'm trying to process a file which has the following lines:

192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30

(this is the text file out of tcpdump)

Now I can easily split the line twice: once by ':' symbol to separate
address and the protocol information and the second time by ',' to get
information about the protocol.
However, I don't need all the protocol info. All I'm interested in is
the last field, which is length.

Is there a way to write something like this:

for data in f:
    (address,traffic) = string.split(data, ':')
    length = string.split(traffic, ',')[3]

I'm interested in only one element, so why should I care about everything else?
This can be easily done in Perl, but I'm stuck with Python now. ;-)

Thank you.

Rustom Mody

Jan 14, 2014, 3:54:56 AM
>>> data="192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30"
>>> (add,traff) = data.split(':')
>>> add
'192.168.1.6 > 192.168.1.7'
>>> traff
' ICMP echo request, id 100, seq 200, length 30'
>>> lenn = traff.split(',')[3]
>>> lenn
' length 30'
>>>
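
The session above could be wrapped in a small helper. This is only a sketch: the name `packet_length` is made up for illustration, and it assumes every line has the same four comma-separated fields as the sample.

```python
def packet_length(line):
    """Return the trailing length field of a tcpdump-style line as an int."""
    # Split at the first ':' only, so the result always has two pieces.
    _address, traffic = line.split(':', 1)
    # The fourth comma-separated field looks like ' length 30'.
    field = traffic.split(',')[3]
    # split() with no argument also drops the surrounding whitespace.
    _name, value = field.split()
    return int(value)


sample = "192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30"
print(packet_length(sample))  # prints 30
```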

Chris Angelico

Jan 14, 2014, 3:58:17 AM
to pytho...@python.org
On Tue, Jan 14, 2014 at 7:46 PM, Igor Korot <ikor...@gmail.com> wrote:
> 192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30
>
> However, I don't need all the protocol info. All I'm interested in is
> the last field, which is length.

You can split on any string. If you're confident that this is the only
instance of the word "length", you can split on that:

for data in f:
    # This will throw an exception if there's no " length "
    # or if there are two of them. This means you're safe;
    # if anything unexpected happens, you'll know.
    _, length = data.split(" length ")
    # process length

Alternatively, you can split on the space and take just the very last word:

for data in f:
    length = data.split(" ")[-1]
    # process length

Either way, the length will be a string. If you need it as an integer,
just do this:

length = int(length)

From there, you can do whatever analysis you need.
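
If stopping on the first unexpected line is not what you want, one possible variation is to catch the exception and skip that line. The sketch below assumes that skipping is acceptable for your data; the generator name `lengths` and the sample lines are made up for illustration.

```python
def lengths(lines):
    """Yield the integer length from each line, skipping malformed ones."""
    for data in lines:
        try:
            # Unpacking raises ValueError unless " length " occurs exactly
            # once; int() raises the same error for non-numeric text.
            _, length = data.split(" length ")
            yield int(length)
        except ValueError:
            continue


sample_lines = [
    "192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30\n",
    "a line without the field\n",
]
print(list(lengths(sample_lines)))  # prints [30]
```

Note that int() ignores the trailing newline, so the lines do not need to be stripped first.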

Hope that helps!

ChrisA

Igor Korot

Jan 14, 2014, 4:25:00 AM
to pytho...@python.org
Hi, Rustom,
What if I want field 2 and field 3? ("seq 200" and "length 30")

Thank you.

>>>>
> --
> https://mail.python.org/mailman/listinfo/python-list

Rustom Mody

Jan 14, 2014, 4:37:36 AM
On Tuesday, January 14, 2014 2:55:00 PM UTC+5:30, Igor Korot wrote:
>
> What if I want field 2 and field 3? ("seq 200" and "length 30")

Well, you did say:

> I'm interested in only one element, so why should I care about everything else?

So it's not clear what you want!

Do you want a one-liner? You could use a regular expression.
[You will very soon find that the world divides between the regular and the
irregular folks!]

Or do you want some other Perl-ism? You need to say what...

Or maybe you just want to use scapy instead of tcpdump?

Mark Lawrence

Jan 14, 2014, 4:37:48 AM
to pytho...@python.org
On 14/01/2014 09:25, Igor Korot wrote:
> Hi, Rustom,
>
> On Tue, Jan 14, 2014 at 12:54 AM, Rustom Mody <rusto...@gmail.com> wrote:
> What if I want field 2 and field 3? ("seq 200" and "length 30")
>
Please do a little work before asking such a trivial question; it's
hardly difficult from the interactive interpreter, particularly when you
already have an example to start with.

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

Jussi Piitulainen

Jan 14, 2014, 4:43:48 AM
Chris Angelico writes:

> Alternatively, you can split on the space and take just the very
> last word:
>
> for data in f:
>     length = data.split(" ")[-1]
>     # process length

Also, data.rsplit(' ', 1) will split data in two at the last space.

help(str.rsplit)
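
A short sketch of that call on the sample line (the variable names are only illustrative):

```python
data = "192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30"

# maxsplit=1 means at most one split, counted from the right-hand end.
head, length = data.rsplit(' ', 1)
print(length)  # prints 30
```

Unlike data.split(" ")[-1], this builds a two-item list regardless of how many spaces the line contains.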

Igor Korot

Jan 14, 2014, 5:02:24 AM
to pytho...@python.org
Hi, Rustom,

On Tue, Jan 14, 2014 at 1:37 AM, Rustom Mody <rusto...@gmail.com> wrote:
> On Tuesday, January 14, 2014 2:55:00 PM UTC+5:30, Igor Korot wrote:
>>
>> What if I want field 2 and field 3? ("seq 200" and "length 30")
>
> Well, you did say:
>
>> I'm interested in only one element, so why should I care about everything else?
>
> So it's not clear what you want!

Sorry, I thought it would be easier to ask this way. Guess not.

I am actually looking for a way to get a result from split which is
sliced the way I want, like in my example above.
I mean I can probably make more variables by creating a tuple, but why?
What is the purpose if I want only a couple of elements out of the split?
Doing it the Perl way does not help:

C:\Documents and Settings\Igor.FORDANWORK\Desktop\winpdb>python
Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> test = "I,like,my,chocolate"
>>> print test.split(',')[2,3]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not tuple

I can do it this way:

>>> testlist = test.split(',')
>>> print testlist[2]
my

but it will needlessly create a list which I then access by index.

Why? All I need is a couple of values out of an n-dimensional list (array).

>
> Do you want a one-liner? You could use a regular expression.
> [You will very soon find that the world divides between the regular and the
> irregular folks!]
>
> Or you want some other perl-ism? You need to say what...

Well, is there a Python way to do what I want?
I mention Perl only because I'm familiar with the language and this is
easy to do in it.

Thank you.

>
> Or maybe you just want to use scapy instead of tcpdump?

Igor Korot

Jan 14, 2014, 5:03:07 AM
to pytho...@python.org
Sorry, that was sent to Mark directly.
Resending to the list.


---------- Forwarded message ----------
From: Igor Korot <ikor...@gmail.com>
Date: Tue, Jan 14, 2014 at 1:50 AM
Subject: Re: What's correct Python syntax?
To: Mark Lawrence <bream...@yahoo.co.uk>


Hi, Mark,

On Tue, Jan 14, 2014 at 1:37 AM, Mark Lawrence <bream...@yahoo.co.uk> wrote:
> On 14/01/2014 09:25, Igor Korot wrote:
>>
>> Hi, Rustom,
>> What if I want field 2 and field 3? ("seq 200" and "length 30")
>>
>> Thank you.
>>
>
>
> Please do a little work before asking such a trivial question, it's hardly
> difficult from the interactive interpreter, particularly when you already
> have an example to start with.

C:\Documents and Settings\Igor.FORDANWORK\Desktop\winpdb>python
Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> test = "I,like,my,chocolate"
>>> print test.split(',')[2,3]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not tuple

Like I said, I'm more used to Perl, but need to work with Python for a moment.

Thank you.

>
> --
> My fellow Pythonistas, ask not what our language can do for you, ask what
> you can do for our language.
>
> Mark Lawrence
>

Rustom Mody

Jan 14, 2014, 5:16:16 AM
On Tuesday, January 14, 2014 3:32:24 PM UTC+5:30, Igor Korot wrote:
> Hi, Rustom,

> On Tue, Jan 14, 2014 at 1:37 AM, Rustom Mody wrote:
> > On Tuesday, January 14, 2014 2:55:00 PM UTC+5:30, Igor Korot wrote:
> >> What if I want field 2 and field 3? ("seq 200" and "length 30")
> > Well, you did say:
> >> I'm interested in only one element, so why should I care about everything else?
> > So it's not clear what you want!

> Sorry, I thought it would be easier to ask this way. Guess not.

> I am actually looking for a way to get a result from split which is
> sliced the way I want, like in my example above.
> I mean I can probably make more variables by creating a tuple, but why?
> What is the purpose if I want only a couple of elements out of the split?
> Doing it the Perl way does not help:

> C:\Documents and Settings\Igor.FORDANWORK\Desktop\winpdb>python
> Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit
> (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> test = "I,like,my,chocolate"
> >>> print test.split(',')[2,3]

You want this?

>>> test = "I,like,my,chocolate"
>>> test.split(',')
['I', 'like', 'my', 'chocolate']
>>> test.split(',')[2:4]
['my', 'chocolate']


> Well is there a Python way to do what I want?


Well, I for one still don't get what you want!!

Here's a Python one-liner using regexps:
>>> import re
>>> r=r'(.*) +> +(.*):.*length (\d*)'
>>> re.findall(r,data)
[('192.168.1.6', '192.168.1.7', '30')]

Note: I am NOT suggesting you use regexps. Just that they will do what you want if you are so inclined
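
For anyone who does go the regexp route, here is one possible variation using a compiled pattern and re.search. It is a sketch only: the pattern is not the one from the session above, and it assumes the addresses contain no spaces.

```python
import re

# Illustrative pattern: source address, destination address, then the
# digits after "length". \S+ matches a run of non-space characters.
PATTERN = re.compile(r'(\S+) > (\S+):.*length (\d+)')

data = "192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30"
match = PATTERN.search(data)
if match:
    src, dst, length = match.groups()
    print("{0} -> {1}, length {2}".format(src, dst, int(length)))
```

search() returns None when the line does not match, which is why the result is checked before unpacking.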

Igor Korot

Jan 14, 2014, 5:35:27 AM
to pytho...@python.org
Hi, Rustom,
Yup, that's it.
Now 2 and 4 - it's a starting point and ending point, right?

Thank you.
>
>
>> Well is there a Python way to do what I want?
>
>
> Well, I for one still don't get what you want!!
>
> Here's a Python one-liner using regexps:
> >>> r=r'(.*) +> +(.*):.*length (\d*)'
> >>> re.findall(r,data)
> [('192.168.1.6', '192.168.1.7', '30')]
>
> Note: I am NOT suggesting you use regexps. Just that they will do what you want if you are so inclined

Rustom Mody

Jan 14, 2014, 5:51:15 AM
On Tuesday, January 14, 2014 4:05:27 PM UTC+5:30, Igor Korot wrote:
> Hi, Rustom,
>
>
>
> On Tue, Jan 14, 2014 at 2:16 AM, Rustom Mody wrote:
> > You want this?
> >
> > >>> test = "I,like,my,chocolate"
> > >>> test.split(',')
> > ['I', 'like', 'my', 'chocolate']
> > >>> test.split(',')[2:4]
> > ['my', 'chocolate']
>
>
> Yup, that's it.
> Now 2 and 4 - it's a starting point and ending point, right?

In Python, ranges are usually lo-inclusive and hi-exclusive.
Slices are one case of this.

See explanations:
http://docs.python.org/2/tutorial/introduction.html#strings
and
http://stackoverflow.com/questions/509211/pythons-slice-notation

Neat theoretical explanation
http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html
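
A few slices on the list from earlier in the thread may make the lo-inclusive, hi-exclusive rule concrete (a minimal sketch; the name `fields` is only illustrative):

```python
fields = "I,like,my,chocolate".split(',')

print(fields[2:4])  # indexes 2 and 3: ['my', 'chocolate']
print(fields[:2])   # everything before index 2: ['I', 'like']
print(fields[-2:])  # the last two items: ['my', 'chocolate']

# Because the end is exclusive, a slice lo:hi inside the list has hi - lo items.
print(len(fields[1:3]))  # prints 2
```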

Alister

Jan 14, 2014, 5:59:44 AM
Am I missing something obvious here?
just split on ','

field [0] will contain a mix of data but who cares? you don't want it
anyway (you can always process it again afterwards).

>>> a='192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30'
>>> data=a.split(',')
>>> data
['192.168.1.6 > 192.168.1.7: ICMP echo request', ' id 100', ' seq 200', ' length 30']
>>> data[3]
' length 30'



--
It's not against any religion to want to dispose of a pigeon.
-- Tom Lehrer, "Poisoning Pigeons in the Park"

Peter Otten

Jan 14, 2014, 6:33:26 AM
to pytho...@python.org
Igor Korot wrote:

> I am actually looking for a way to get a result from split which is
> sliced the way I want, like in my example above.
> I mean I can probably make more variables by creating a tuple, but why?
> What is the purpose if I want only a couple of elements out of the split?
> Doing it the Perl way does not help:
>
> C:\Documents and Settings\Igor.FORDANWORK\Desktop\winpdb>python
> Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit
> (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> test = "I,like,my,chocolate"
> >>> print test.split(',')[2,3]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: list indices must be integers, not tuple
>
> I can do it this way:
>
> >>> testlist = test.split(',')
> >>> print testlist[2]
> my
>
> but it will needlessly create a list which I then access by index.
>
> Why? All I need is a couple of values out of an n-dimensional list (array).

Python has no dedicated syntax for picking arbitrary items from a list.
If you are only concerned about printing use format():

>>> items = ["alpha", "beta", "gamma", "delta"]
>>> print "{1} {3} {0}".format(*items)
beta delta alpha

If you want to work with the values use operator.itemgetter():

>>> from operator import itemgetter
>>> itemgetter(1, 0, -1)(items)
('beta', 'alpha', 'delta')
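
Applied to the original tcpdump line, that might look like the sketch below. The name `pick` is only illustrative, and the code assumes the fields are separated by a comma followed by a space.

```python
from operator import itemgetter

line = "192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30"

# Build the getter once; with several indexes it returns a tuple.
pick = itemgetter(2, 3)
seq, length = pick(line.split(', '))
print(seq)     # prints seq 200
print(length)  # prints length 30
```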



Ned Batchelder

Jan 14, 2014, 7:19:42 AM
to pytho...@python.org
On 1/14/14 6:33 AM, Peter Otten wrote:
> Python has no dedicated syntax for picking arbitrary items from a list
> If you are only concerned about printing use format():
>
> >>> items = ["alpha", "beta", "gamma", "delta"]
> >>> print "{1} {3} {0}".format(*items)
> beta delta alpha

.format also supports item access directly:

>>> items = ["alpha", "beta", "gamma", "delta"]
>>> print "{0[1]} {0[3]} {0[0]}".format(items)
beta delta alpha

It's clunkier in this example, but if you have more than one value being
formatted, this (and the "{0.foo}" syntax) can make digging into nested
data more convenient.
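
As a rough illustration of the nested case, the sketch below combines attribute and item access in one replacement field. `Packet` is a made-up record type that exists only to provide an attribute to read.

```python
from collections import namedtuple

# Packet is hypothetical; any object with attributes would do.
Packet = namedtuple("Packet", ["src", "fields"])
pkt = Packet("192.168.1.6", ["id 100", "seq 200", "length 30"])

# {0.src} reads an attribute; {0.fields[2]} then indexes into the list.
print("{0.src}: {0.fields[2]}".format(pkt))  # prints 192.168.1.6: length 30
```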

--
Ned Batchelder, http://nedbatchelder.com

Roy Smith

Jan 14, 2014, 8:34:31 AM
In article <mailman.5437.1389689...@python.org>,
Igor Korot <ikor...@gmail.com> wrote:

> Hi, ALL,
> I'm trying to process a file which has following lines:
>
> 192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30
>
> (this is the text file out of tcpdump)
>
> Now I can easily split the line twice: once by ':' symbol to separate
> address and the protocol information and the second time by ',' to get
> information about the protocol.
> However, I don't need all the protocol info. All I'm interested in is
> the last field, which is length.

One possibility would be to forget about all the punctuation and just
use "length " (note the trailing space) as the split delimiter:

>>> line = '192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30'
>>> line.split('length ')[1]
'30'

This will only work if you're sure that "length " can never appear
anywhere else in the line. Another, perhaps more idiomatic, way would
be:

>>> _, length = line.split('length ')
>>> print length
30

What's happening here is split() is returning a list of two items, which
you then unpack into two variables, "_" and "length". It's common to
unpack unwanted fields into "_", as a hint (to the reader) that it's
unused.

Roy Smith

Jan 14, 2014, 8:47:35 AM
In article <mailman.5443.1389693...@python.org>,
Igor Korot <ikor...@gmail.com> wrote:

> I can do it this way:
>
> >>> testlist = test.split(',')
> >>> print testlist[2]
> my
>
> but it will needlessly create a list which I then access by index.

Stop worrying about needlessly creating lists. Write the code in a way
that works and is easy to understand. If it turns out that it's not
running fast enough, then you can go back and optimize.

BTW, for those of you into code golf:

>>> line = '192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30'

>>> dict((k,int(v)) for k,v in (s.split() for s in line.split(', ')[1:]))
{'length': 30, 'id': 100, 'seq': 200}
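
The same idea spelled out as a plain loop, for readers who find the one-liner hard to follow. This is a sketch that assumes every field after the first is a name followed by an integer.

```python
line = "192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30"

info = {}
# Skip the first chunk (addresses and protocol); the rest are "name value".
for chunk in line.split(', ')[1:]:
    name, value = chunk.split()
    info[name] = int(value)

print(info['length'])  # prints 30
```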

Terry Reedy

Jan 14, 2014, 5:05:22 PM
to pytho...@python.org
On 1/14/2014 3:46 AM, Igor Korot wrote:
> Hi, ALL,
> I'm trying to process a file which has following lines:
>
> 192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30

> However, I don't need all the protocol info. All I'm interested in is
> the last field, which is length.

To directly extract only the needed info:

>>> s="192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30"
>>> s[s.rindex(' ')+1:]
'30'

--
Terry Jan Reedy

emile

Jan 14, 2014, 5:13:20 PM
to pytho...@python.org
Any particular reason to prefer that over:

>>> s.split()[-1]
'30'

Is it about the length of s and speed, that is, rindex versus building
a list and tossing it?
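
One way to find out is to measure. The sketch below uses the standard timeit module; the function names are made up, no timings are quoted here because they depend on the machine and the Python version, and a line this short may show little difference.

```python
import timeit

s = "192.168.1.6 > 192.168.1.7: ICMP echo request, id 100, seq 200, length 30"

def via_rindex():
    # Find the last space and slice; no intermediate list is built.
    return s[s.rindex(' ') + 1:]

def via_split():
    # Build the full list of words, then keep only the last one.
    return s.split()[-1]

# Both approaches must agree before their timings mean anything.
assert via_rindex() == via_split() == '30'

for func in (via_rindex, via_split):
    seconds = timeit.timeit(func, number=100000)
    print("{0}: {1:.4f} s".format(func.__name__, seconds))
```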

Emile




Larry Hudson

Jan 15, 2014, 1:00:33 AM
On 01/14/2014 02:03 AM, Igor Korot wrote:
[snip]

> C:\Documents and Settings\Igor.FORDANWORK\Desktop\winpdb>python
> Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit
> (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> test = "I,like,my,chocolate"
> >>> print test.split(',')[2,3]
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> TypeError: list indices must be integers, not tuple
>

Try again... but use a colon not a comma, [2:3]

Two additional comments:
(1) test.split(',')[2:3] will give you ['my'] only. The slicing syntax starts with the first
index and goes up to but NOT INCLUDING the second. In this case it selects the same single item
as the index [2], except that the slice returns it inside a one-item list. You want either [2:4]
or [2:], or even [2:500]. Any value >= the length of the list (or whatever sequence) is
acceptable as the ending index in a slice. It's probably not a good idea to use a value like
this, but it does work. Reading a single item with an out-of-bounds index raises an IndexError,
but an out-of-bounds value is fine as the _ending_ index in a slice.

(2) A comma-separated list of data items IS a tuple, even without the usual enclosing
parentheses. That is your error here -- [2,3] is the same as [(2,3)], which is a tuple.
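
Both points can be checked with a short script (a sketch; the exact wording of the error messages differs between Python versions, so it is printed rather than quoted):

```python
fields = "I,like,my,chocolate".split(',')

# A comma inside the brackets builds a tuple, which a list rejects.
try:
    fields[2, 3]
except TypeError as exc:
    print("TypeError: {0}".format(exc))

# An end index past the end of the list is clamped in a slice...
print(fields[2:500])  # prints ['my', 'chocolate']

# ...but the same number used as a plain index raises IndexError.
try:
    fields[500]
except IndexError as exc:
    print("IndexError: {0}".format(exc))
```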

-=- Larry -=-
