write text.concordance output to a file


Andrés Chandía

Nov 2, 2010, 8:15:45 AM
to nltk-users
Hi everybody,
I'm new to Python and I'm trying to write a little program that finds
concordances and writes them to a file. So far I've got most of it
working, but I'm failing at writing the output to a file. I would
appreciate any help with this; I'm pasting the code below for you to
check.
Thanks.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import nltk, os, sys
print "\n\n CONCORDANCES SEARCH ENGINE \n"
print "These are the files in your current directory \n"
print os.listdir('.')
print "\n\n Insert the text filename you want to work with\n"
thefile=raw_input()
f=open(thefile, 'rU')
text=f.read()
textD=text.split()
textDS=nltk.Text(textD)
print "\n Introduce the word you want to find \n"
theword=raw_input()
print "\n Concordances with \"" + theword + "\"\n"
results=textDS.concordance(theword)
print "\n"
print results

outfile = "concordance_with_"+theword+".txt"
final = open(outfile, "w")

#### HERE IS WHERE IT DOES NOT WORK
final.write(results)
####

final.close()
print "Successfully saved \n"


Here is the error it gives me:

final.write(results)
TypeError: argument 1 must be string or read-only character buffer,
not None


I understand that I have to convert the output to a string, but I
don't know how. I've tried str(results) and `results`, and a few other
things.
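
A minimal sketch of why the conversion attempts cannot help, using a hypothetical stand-in function rather than NLTK itself: a function that only prints, and has no return statement, hands back None, so there is no text left to convert. (Written in Python 3 syntax; the code in this thread is Python 2.)

```python
def show_matches(word):
    """Hypothetical stand-in for a method that only prints its results."""
    print("... some context around", word, "...")
    # No return statement: Python returns None implicitly.


results = show_matches("whale")

print(results is None)      # True
print(repr(str(results)))   # 'None'
```

Converting with str() only produces the four-character string 'None'; the text that was printed to the console is never stored in the variable.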

Richard Careaga

Nov 2, 2010, 6:34:27 PM
to nltk-...@googlegroups.com
The Text class is intended to be used interactively, and the output is
hardwired to go to the console. It's nothing that you're doing wrong.

The package authors are aware that users would like to be able to write
to files and are considering changing this in a future version. In the
meantime, however, one has to use the underlying classes and methods,
which is not simple without a very good understanding of both Python and
the source code for the package.

Andrés Chandía

Nov 3, 2010, 6:12:33 AM
to nltk-users
Thanks for your answer.
Do you mean that I should write the program without using NLTK,
defining my own classes and all that instead?

Richard Careaga

Nov 3, 2010, 10:30:28 AM
to nltk-...@googlegroups.com
No, not quite that hard. Everything you need is in nltk; the hard part is identifying and putting together those pieces to do what the Text module does without requiring that all output be directed to the console.


Andrés Chandía

Nov 3, 2010, 10:40:17 AM
to nltk-users
Sorry, I'm a little lost......

Uldis Bojars

Nov 3, 2010, 11:00:56 AM
to nltk-...@googlegroups.com
Documentation for the Text class [1] says: "is intended to support
initial exploration of texts (via the interactive console). ... If you
wish to write a program which makes use of these analyses, then you
should bypass the Text class, and use the appropriate analysis
function or class directly instead."

[1] http://nltk.googlecode.com/svn/trunk/doc/api/nltk.text.Text-class.html

You may want to look at the source code of the Text.concordance()
function, see how it works and use a modified version of the code,
where print statements are replaced by output to file.
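
As a rough illustration of that idea, here is a small keyword-in-context function written from scratch that returns its lines instead of printing them. It is only a sketch and is not NLTK's implementation: the name concordance_lines, its parameters, and the sample tokens are all made up for this example, matching is case-insensitive by assumption, and it uses Python 3 syntax.

```python
import os
import tempfile


def concordance_lines(tokens, word, width=75, max_lines=25):
    """Return keyword-in-context lines as a list of strings.

    Illustrative only: this is not NLTK's implementation.
    """
    # Characters of context on each side; at least 1 so slicing stays sane.
    half = max(1, (width - len(word) - 2) // 2)
    target = word.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() != target:
            continue
        # `half` tokens always cover at least `half` characters of context.
        left = " ".join(tokens[max(0, i - half):i])[-half:]
        right = " ".join(tokens[i + 1:i + 1 + half])[:half]
        lines.append(f"{left:>{half}} {token} {right}")
        if len(lines) >= max_lines:
            break
    return lines


tokens = "the cat sat on the mat and the cat slept".split()
lines = concordance_lines(tokens, "cat", width=31)

# Write to a temporary directory here; any path would do.
out_path = os.path.join(tempfile.gettempdir(), "concordance_with_cat.txt")
with open(out_path, "w", encoding="utf-8") as out:
    out.write("\n".join(lines) + "\n")
```

Because the function returns a list, the same result can be written to a file, printed, or processed further without touching sys.stdout.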

If your program did not require interaction with the user (e.g., if
you supplied the required info on the command line instead), then
there's another solution: just redirect the output of your program to
a file:

python concordance.py parameters > output_file
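
For the redirect to work, the script has to take its inputs as arguments rather than through raw_input(). A possible sketch with the standard argparse module follows (Python 3 syntax); the argument names filename and word, and the sample values, are assumptions made for this example.

```python
import argparse


def parse_args(argv=None):
    """Read the input file name and search word from the command line."""
    parser = argparse.ArgumentParser(
        description="Print concordance lines for a word.")
    parser.add_argument("filename", help="text file to search")
    parser.add_argument("word", help="word to look up")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)


# An explicit list is passed here so the example runs without a shell;
# a real script would simply call parse_args().
args = parse_args(["moby.txt", "whale"])
print(args.filename, args.word)
```

With the prompts gone, everything the script prints is concordance output, so the shell redirect captures nothing else.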

Uldis


billym99

Nov 4, 2010, 2:00:18 PM
to nltk-users
I am new to nltk and came to this same frustrating problem very early
on. Thanks to all for the various suggestions. I will be studying that
"logging" module as best I can. It was frustrating because the output
of some of the "Text" routines is very useful and above all I wanted
to get that output into Python variables to work with it further. I am
especially interested in text generation techniques along the lines of
exploring the ideas of William S. Burroughs and his now infamous
"cut-up" techniques (also, Brion Gysin). So I wanted to be able to get
the output of text.generate() on the fly at any time to play with it
further.

I came up with a stop-gap solution similar to those here and am
working with that for now. However I then examined the generate()
function in the Text module and realized I could eventually write my
own version which would do exactly what I wanted it to (after further
study).

My stopgap solution is similar to others given:
1. switch stdout to a file,
2. use print to direct output to file,
3. read the created file and format it as desired.

The syntax I found was like this:

#################### switch to log file
saveout = sys.stdout
fsock = open(logfn, 'w')
sys.stdout = fsock
###################

(do some stuff, like:)
print text1.generate(1000)
print text1.collocations()

################### restore stdout and close log file
sys.stdout = saveout
fsock.close()
###################

One can then immediately read the file and manipulate it into some
variable using ordinary string methods.

The next step is to make two functions, one which switches to the file
and one which switches back -- e.g. "turn_logging_on()" and
"restore_std_out()".
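
In Python 3 the switch-and-restore pair can be expressed with contextlib.redirect_stdout from the standard library, which restores sys.stdout even if the wrapped call raises. The sketch below captures the printed text straight into a variable, with no intermediate file; capture_output and noisy_report are hypothetical names, and noisy_report merely stands in for a print-only method such as text1.collocations().

```python
import io
from contextlib import redirect_stdout


def capture_output(func, *args, **kwargs):
    """Run func and return everything it printed as a string."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def noisy_report(n):
    """Hypothetical stand-in for a print-only method."""
    for i in range(n):
        print("line", i)


captured = capture_output(noisy_report, 3)
lines = captured.splitlines()
print(lines)
```

The captured string can then be split, filtered, or written to a file with ordinary string and file methods.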


emorman

May 4, 2016, 4:42:41 PM
to nltk-users


This appears to work. It uses standard output (sys.stdout).


#!/usr/bin/env python
# -*- coding: utf-8 -*-
import nltk, os, sys
print "\n\n CONCORDANCES SEARCH ENGINE \n"
print "These are the files in your current directory \n"
print os.listdir('.')
print "\n\n Insert the text filename you want to work with\n"
thefile=raw_input()
f=open(thefile, 'rU')
text=f.read().decode('utf8','ignore')
textD=text.split()
textDS=nltk.Text(textD)
print "\n Introduce the word you want to find \n"
theword=raw_input()
print "\n Concordances with \"" + theword + "\"\n"
results=textDS.concordance(theword)
print "\n"


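# Temporarily point sys.stdout at the output file so that the text
# printed by concordance() lands there, then restore it.
saveout = sys.stdout
SavedConcordance = open('outfile.txt', 'w')
sys.stdout = SavedConcordance
textDS.concordance(theword, width=100, lines=20)
sys.stdout = saveout
SavedConcordance.close()

print "Successfully saved to outfile.txt\n"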

Leveraged from Dive Into Python.  Here is the link: