
extract substring by regex from a text file

Alessio

Apr 15, 2010, 3:15:03 AM
Hi,

I'm facing the problem in the subject:
- I have a text file that I need to parse to produce a specific
string (JSON-like) by extracting some information (substrings) from it;
- I have written regular expressions that can locate these substrings in
my text file.

Now I don't know how to continue. What is the best way to locate some
strings in a file and output them (with the print command or to another
file)?

Thx in advance

Neil Cerutti

Apr 15, 2010, 9:25:52 AM

grep

Or: show your work.

--
Neil Cerutti

Alessio

Apr 17, 2010, 4:19:41 AM
On Apr 15, 3:25 pm, Neil Cerutti <ne...@norwich.edu> wrote:

> On 2010-04-15, Alessio <alessio...@gmail.com> wrote:
>
> > Hi,
>
> > I'm facing the problem in the subject:
> > - I have a text file that I need to parse to produce a specific
Thank you, I forgot to say that I have already solved it.
I used readlines() to read my text file, then in a for loop I
extract the substrings I need line by line with regular expressions
(re.findall()).
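
For readers following along, the approach described above might look roughly like the sketch below. The actual regular expressions and file were never posted, so the pattern (key=number pairs) and the function name extract_with_readlines are hypothetical stand-ins.

```python
import re

# Hypothetical pattern: the real expressions were not shown in the thread.
# It matches "name=123" style pairs and captures the name and the digits.
PATTERN = re.compile(r"(\w+)=(\d+)")


def extract_with_readlines(path):
    """Read all lines into a list, then collect regex matches line by line."""
    with open(path) as instream:
        lines = instream.readlines()  # the whole file is held in memory here

    matches = []
    for line in lines:
        # With two groups in the pattern, findall() returns a list of tuples.
        matches.extend(PATTERN.findall(line))
    return matches
```

Calling extract_with_readlines("data.txt") on a file of such pairs would return a list of (name, digits) tuples; "data.txt" is only a placeholder name.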

ciao

Stefan Behnel

Apr 17, 2010, 4:58:47 AM
to pytho...@python.org
Alessio, 17.04.2010 10:19:

> I used readlines() to read my text file, then in a for loop I
> extract the substrings I need line by line with regular expressions
> (re.findall())

Note that it's usually more efficient to just run the for-loop over the
file object, rather than using readlines() first. The latter will read all
lines into a big list in memory before doing any further processing,
whereas the plain for-loop will read line by line and let the loop body act
on each line immediately.
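
A minimal sketch of that line-by-line style, assuming a hypothetical key=number pattern since the original expressions were not posted. The generator name iter_matches is also made up for illustration; it accepts any iterable of lines, including an open file object.

```python
import re

# Hypothetical pattern standing in for the poster's real expressions.
PATTERN = re.compile(r"(\w+)=(\d+)")


def iter_matches(instream):
    """Yield regex matches while iterating lazily over a file-like object."""
    for line in instream:  # only one line is pulled from the stream at a time
        for match in PATTERN.findall(line):
            yield match
```

With a real file this could be driven as `with open(path) as instream:` followed by a loop over `iter_matches(instream)`, so each match is handled as soon as its line has been read.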

Stefan

Peter Otten

Apr 17, 2010, 5:05:49 AM
Alessio wrote:

> I used readlines() to read my text file, then in a for loop I
> extract the substrings I need line by line with regular expressions

Just in case you didn't know:

    for line in instream:
        ...

looks better, uses less memory, and may be a tad faster than

    for line in instream.readlines():
        ...

Peter

Alessio

Apr 17, 2010, 12:14:10 PM
On Apr 17, 11:05 am, Peter Otten <__pete...@web.de> wrote:
>
> Just in case you didn't know:
>
>     for line in instream:
>         ...
>
> looks better, uses less memory, and may be a tad faster than
>
>     for line in instream.readlines():
>         ...
>
> Peter

Thanks for your suggestions, they are welcome... I'm just starting out
with Python.
I have just changed my script to parse the file without readlines().
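
The revised script was not posted. As a rough illustration only, a version that loops over the file object and writes JSON-like output to another file could look like the sketch below. The pattern, the one-object-per-line output layout, and the name write_matches_as_json are all assumptions, not the poster's code.

```python
import json
import re

# Hypothetical pattern: "name=123" pairs, since the real one is unknown.
PATTERN = re.compile(r"(\w+)=(\d+)")


def write_matches_as_json(src_path, dst_path):
    """Scan src_path line by line and write one JSON object per match."""
    count = 0
    with open(src_path) as instream, open(dst_path, "w") as outstream:
        for line in instream:  # no readlines(): lines are read on demand
            for key, value in PATTERN.findall(line):
                # Assumed output layout: {"name": number}, one per line.
                outstream.write(json.dumps({key: int(value)}) + "\n")
                count += 1
    return count  # number of records written
```

Replacing the write call with print(json.dumps({key: int(value)})) would send the same records to standard output instead of a second file.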
