I am pretty sure my code isn't close to what I want. I need to be able
to skip html-like commands from <defined> to <undefined> and to key on
another word in addition to </CsInstruments> to end the routine.
I was also looking at SE 2.2 beta but didn't see any easy way to use it
for this or, for that matter, for search and replace, where I could just
add it as a menu item and not worry about it.
Thanks in advance for any help.
<Eric_...@msn.com> wrote in message
news:1156714534.3...@i3g2000cwc.googlegroups.com...
I will look into that a little bit since that is so html-like... maybe
some of the examples can lead me in the right direction on a lot of it.
If you're dealing with html or html-like files, do check out
beautifulsoup. I had reason to use it the other day and man is it ever
useful!
Meantime, there are a few minor points about the code you posted:
1) open() defaults to 'r', you can leave it out when you call open() to
read a file.
2) 'file' is a builtin type (it's the type of file objects returned by
open()) so you shouldn't use it as a variable name.
3) file objects don't have a read_until() method. You could say
something like:
f = open(filename)
lines = []
for line in f:
    lines.append(line)
    if '</CsInstruments>' in line:
        break
4) filename[-3:] will give you the last 3 chars in filename. I'm
guessing that you want all but the last 3 chars, that's filename[:-3],
but see the os.path.splitext() function, and indeed the other
functions in os.path too:
http://docs.python.org/lib/module-os.path.html
5) the regular expression objects returned by re.compile() will always
evaluate True, so you want to call their search() method on the data to
search:
if not pattern1.search(line):
But, 6) using re for a pattern as simple as "</" is way overkill. Just
use 'in' or the find() method of strings:
if "</" not in line:
or:
pos = line.find("</")
if pos == -1:
    print >>orcfilename, line
else:
    print >>orcfilename, line[:pos]
7) the "print >> file" usage requires a file (or file-like object,
anything with a write() method I think) not a string. You need to use
it like this:
orcfile = open(orcfilename, 'w')
#...
print >> orcfile, line
8) If you have a list of lines anyway, you can use the writelines()
method of files to write them in one go:
open(orcfilename, 'w').writelines(lines)
of course stripping out your unwanted data from that last line using
find() as shown above.
I hope this helps.
Check out the docs on file objects:
http://docs.python.org/lib/bltin-file-objects.html, but like I said,
if you're dealing with html or html-like files, be sure to check out
beautifulsoup. Also, there's the elementtree package for parsing XML
that could help here too.
~Simon
Frederic
Sorry about that. This is a link to a description of the format:
http://kevindumpscore.com/docs/csound-manual/commandunifile.html
It is possible to have more than one instr defined in a .csd file, so I
would need to look for that string also if I want to separate the
instruments out.
(This code doesn't even compile...!)
>def simplecsdtoorc(filename):
> file = open(filename,"r")
file is not a good name - hides the builtin type of the same name.
Same for dict, list...
> alllines = file.read_until("</CsInstruments>")
read_until???
> pattern1 = re.compile("</")
> orcfilename = filename[-3:] + "orc"
perhaps you want filename[:-3]+"orc"?
> for line in alllines:
> if not pattern1
if not pattern1.search(line):
> print >>orcfilename, line
Open the output file before the loop, and use its write() method here
>I am pretty sure my code isn't close to what I want. I need to be able
>to skip html like commands from <defined> to <undefined> and to key on
>another word in addition to </CsInstruments> to end the routine
Good job for Beautiful Soup: http://www.crummy.com/software/BeautifulSoup/
Gabriel Genellina
Softlab SRL
I looked at the format specification. It contains an example:
-----------------------------------------------
<CsoundSynthesizer>;
; test.csd - a Csound structured data file
<CsOptions>
-W -d -o tone.wav
</CsOptions>
<CsVersion> ;optional section
Before 4.10 ;these two statements check for
After 4.08 ; Csound version 4.09
</CsVersion>
<CsInstruments>
; originally tone.orc
sr = 44100
kr = 4410
ksmps = 10
nchnls = 1
instr 1
a1 oscil p4, p5, 1 ; simple oscillator
out a1
endin
</CsInstruments>
<CsScore>
; originally tone.sco
f1 0 8192 10 1
i1 0 1 20000 1000 ;play one second of one kHz tone
e
</CsScore>
</CsoundSynthesizer>
-------------------------------------
If I understand correctly, you want to write the instruments block to a file (from <CsInstruments> to </CsInstruments>), right? Or
each block to its own file in case there are several? Do you want your code to generate the file names? Can you confirm this or
explain it differently?
Regards
Frederic
I need to take only what is between the blocks. I also need to make
sure I take only one instrument, defined in this example with the code
instr 1. I also need the code
<CsInstruments>
> ; originally tone.orc
> sr = 44100
> kr = 4410
> ksmps = 10
> nchnls = 1
regardless of what instrument I take. The function would have to
accept the instrument number as an argument.
Using BeautifulSoup and the interactive interpreter, I figured out the
following script in about 15 minutes:
# s is a string containing the example file from above.
from BeautifulSoup import BeautifulStoneSoup
soup = BeautifulStoneSoup(s)
csin = soup.contents[0].contents[5]
lines = csin.string.splitlines()
print csin.string
It prints:
; originally tone.orc
sr = 44100
kr = 4410
ksmps = 10
nchnls = 1
instr 1
a1 oscil p4, p5, 1 ; simple oscillator
out a1
endin
and of course you could say "lines = csin.string.splitlines()" to get a
list of the lines. That doesn't take you all the way, but it's
something.
Hope that helps,
Peace,
~Simon
Here's a function that screens out all instrument blocks and puts them into a dictionary keyed on the instrument number:
--------------------------------------------
def get_instruments (file_name):
    INSIDE = 1
    OUTSIDE = 0
    f = open (file_name)   # mode defaults to 'r'
    state = OUTSIDE
    instruments = {}
    instrument_segment = ''
    for line in f:
        if state == OUTSIDE:
            # Wait for the opening tag of an instruments block
            if line.startswith ('<CsInstruments'):
                state = INSIDE
                instrument_segment += line
        else:
            instrument_segment += line
            if line.lstrip ().startswith ('instr'):
                instrument_number = line.split () [1]
            elif line.startswith ('</CsInstruments'):
                # Block complete: file it under the last instrument number seen
                instruments [instrument_number] = instrument_segment
                instrument_segment = ''
                state = OUTSIDE
    f.close ()
    return instruments
------------------------------------------------
You have received good advice on using parsers: "Beautiful Soup" or "pyparsing". These are powerful tools capable of doing complicated
extractions. Yours is not a complicated extraction. Simon tried it with Beautiful Soup. That seems simple enough, though he finds
the data by index, leaving open where he gets the index from. There's surely a way to get the data by name.
Unlike the parser, the function will miss tags that take liberties with upper- and lower-case letters, as the specification
probably allows. A regular expression might have to be used if they do.
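As a rough illustration of the regular-expression route, here is a sketch using only the standard re module (Python 3 syntax). The names INSTRUMENTS_BLOCK and find_instrument_blocks are made up for the example, and it assumes the tags carry no attributes, as in the example file.

```python
import re

# Assumption: tags look exactly like <CsInstruments> and </CsInstruments>,
# with no attributes, but may use any mix of upper- and lower-case letters.
INSTRUMENTS_BLOCK = re.compile(
    r"<CsInstruments>(.*?)</CsInstruments>",
    re.IGNORECASE | re.DOTALL,   # DOTALL lets '.' match newlines too
)


def find_instrument_blocks(text):
    """Return the text between each <CsInstruments> ... </CsInstruments> pair."""
    return [m.group(1) for m in INSTRUMENTS_BLOCK.finditer(text)]
```

The non-greedy (.*?) keeps two separate blocks from being merged into one match.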
From your description I haven't been able to infer what the final format of your data is supposed to be. So I cannot tell you
how to go on from here. You'll find out. If not, just keep asking.
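One possible reading of the request is "give me the orchestra header plus a single instrument, chosen by number". The sketch below (Python 3 syntax) follows that reading and is only a guess at the intended result. The name extract_instrument is hypothetical, and it assumes each instrument starts with a line of the form "instr N" and ends with a line whose code part is "endin"; instrument lists such as "instr 1, 2" are not handled.

```python
def extract_instrument(csd_text, number):
    """Return the orchestra header lines plus the one 'instr <number>' block."""
    wanted = str(number)
    header, body = [], []
    in_orchestra = False   # True between <CsInstruments> and </CsInstruments>
    current = None         # number of the instrument being read, or None
    for line in csd_text.splitlines(True):
        stripped = line.strip()
        lowered = stripped.lower()
        if lowered.startswith("<csinstruments"):
            in_orchestra = True
        elif lowered.startswith("</csinstruments"):
            in_orchestra = False
        elif not in_orchestra:
            continue                       # options, score, and so on
        elif current is None:
            words = stripped.split()
            if len(words) >= 2 and words[0] == "instr":
                current = words[1]         # an instrument starts here
                if current == wanted:
                    body.append(line)
            else:
                header.append(line)        # sr, kr, ksmps, nchnls, comments
        else:
            if current == wanted:
                body.append(line)
            # Ignore a trailing ';' comment when looking for endin.
            if stripped.split(";")[0].strip() == "endin":
                current = None
    if not body:
        raise KeyError("no instr %s in orchestra" % wanted)
    return "".join(header) + "".join(body)
```

Lines outside any instrument (sr, kr, ksmps, nchnls and comments) are always kept, so the result should stand as an .orc file on its own regardless of which instrument is picked.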
The SE solution, which you said you couldn't work out, would be the following. It makes the same dictionary the function makes, and it is
case-insensitive:
------------------------------------------------
>>> Instrument_Segment_Filter = SE.SE ('<EAT> "~(?i)<CsInstruments>(.|\n)*?</CsInstruments>~==\n\n" ')
>>> instrument_segments= Instrument_Segment_Filter ('file_name', '')
>>> print instrument_segments
(... see all instrument segments ...)
>>> Instrument_Number = SE.SE ('<EAT> ~instr.*~==\n')
>>> instruments ={}
>>> for segment in instrument_segments.split ('\n\n'):
        if segment:
            instr_line = Instrument_Number (segment)
            instrument_number = instr_line.split ()[1]
            instruments [instrument_number] = segment
--------------------------------------------------
(If you're on Windows and the CRs bother you, take them out with an additional definition when you make your
Instrument_Segment_Filter: (13)= or "\r=")
Regards
Frederic
----- Original Message -----
From: <Eric_...@msn.com>
Newsgroups: comp.lang.python
To: <pytho...@python.org>
Sent: Wednesday, August 30, 2006 1:51 AM
Subject: Re: newbe question about removing items from one file to another file
>
> Anthra Norell wrote:
> > Dexter,
> >
> > I looked at the format specification. It contains an example:
Thanks for the help, I can't wait to try it out.. (has to wait for the
weekend.. three days off finally.)
I seem to be having problems getting the code to work. It seems to crash
my whole project; I don't know if I am missing an import or what
(I had to go back to an older version on my hd). I have uploaded what
I have onto SourceForge:
https://sourceforge.net/project/showfiles.php?group_id=156455&package_id=201306&release_id=444362
http://www.dexrow.com
thanks for the help
Sorry, I responded to the wrong post... I was having trouble figuring
out the Beautiful Soup download.
I cut and pasted this. It seems to be crashing my program. I am not
sure that I have all the right imports; it seems to be fine when I go to
an older version of the file. I uploaded it onto SourceForge:
https://sourceforge.net/project/showfiles.php?group_id=156455&package_id=201306&release_id=444362
http://www.dexrow.com
>
> Anthra Norell wrote:
> > Dexter,
> >
> > Here's a function that screens out all instrument blocks and puts them into a dictionary keyed on the instrument number:
> >
> > --------------------------------------------
> >
> > def get_instruments (file_name):
etc.
Eric (Eric or Dexter?)
This thread seems to have split. So let me reiterate: please copy the output when you cut, paste and run. If you have an
import problem it must be on the other side of your interface with SE, because I don't import anything and SE imports what it needs.
Frederic