babynames exercise

Yu Li

Nov 2, 2011, 10:32:10 AM

to python-g...@googlegroups.com

Hi,

I am learning Python through Google's Python course and am working through all the exercises. For the babynames exercise, part A works well. However, when I try to run part B, which generates the .summary files, I always get an error saying: invalid mode <'rU'> or filename: 'baby*.html '.

I tried the program from the solution and the same thing happened.

The operating system is Windows 7 and I am running the script from the MS command prompt. I use Notepad++ for editing. Does anyone know what is going on here? Many thanks.

Robert Mandič

Nov 8, 2011, 2:58:19 PM

to python-g...@googlegroups.com

Hello,

Sorry for the late reply. I had to get my hands on a Windows machine with Python because I wasn't certain about my answer, but now I am :D

The Windows command prompt doesn't expand the wildcard character "*", so the script receives the literal string baby*.html and tries to open a file with exactly that name.

If you have a Linux/BSD/Solaris OS available, you can try it there and you'll see it works, because the shell expands the wildcard before Python starts.
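To make the difference concrete, here is a minimal sketch. It assumes a scratch directory containing two empty stand-in files named like the exercise's inputs; `demo_wildcard` is a made-up helper name, not part of the exercise. The literal string names no file, while `glob.glob` performs the expansion inside Python, whatever the shell does.

```python
import glob
import os
import shutil
import tempfile

def demo_wildcard():
    """Return (literal_exists, matched_names) for the pattern 'baby*.html'."""
    tmpdir = tempfile.mkdtemp()
    try:
        # Two empty stand-in files (hypothetical, named like the exercise's inputs).
        for name in ('baby1990.html', 'baby2002.html'):
            open(os.path.join(tmpdir, name), 'w').close()

        pattern = os.path.join(tmpdir, 'baby*.html')
        # An unexpanded pattern is just a string; no file has this exact name.
        literal_exists = os.path.exists(pattern)
        # glob.glob does the expansion that a Unix shell would have done.
        matched = sorted(os.path.basename(p) for p in glob.glob(pattern))
        return literal_exists, matched
    finally:
        shutil.rmtree(tmpdir)

print(demo_wildcard())
```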

--
Lp, Robert

Yu Li

Nov 8, 2011, 3:13:58 PM

to python-g...@googlegroups.com

Thanks. I posted to another forum and someone suggested using glob.glob. It works. Thanks for your reply.

Robert Mandič

Nov 8, 2011, 3:27:16 PM

to python-g...@googlegroups.com

I see.

In case someone browses this forum in the future:

import glob
for arg in args:
  for filename in glob.glob(arg):
    filedata = extract_names(filename)
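A slightly more defensive variant of the same idea, as a sketch only: `expand_args` is a hypothetical helper name, and keeping unmatched arguments as they were given is my own assumption about the desired behavior. With plain `glob.glob`, a mistyped filename produces an empty list and is skipped silently; keeping the argument lets the later `open()` report the missing file.

```python
import glob

def expand_args(args):
    """Expand wildcard patterns in args; keep non-matching args unchanged."""
    filenames = []
    for arg in args:
        # Sort so the processing order does not depend on the filesystem.
        matches = sorted(glob.glob(arg))
        # No match: keep the argument as given so open() can fail loudly later.
        filenames.extend(matches if matches else [arg])
    return filenames
```

In `main()` this would replace the nested loop with a single `for filename in expand_args(args):`.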

--
Lp, Robert

J T Gillich

May 23, 2014, 11:47:33 PM

to python-g...@googlegroups.com

Hi, I followed your suggestion and I can now run the command on Windows, but it only creates a summary file for the last file in the directory. Any idea why?

Robert Mandić

May 24, 2014, 3:24:45 AM

to python-g...@googlegroups.com

Actually, it's creating a summary file for every file, but it overwrites it on every iteration... that's why you only see the summary for the last file.
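One way this symptom can arise, as a sketch with made-up names (`write_summaries`, `shared_name` and `all.summary` are illustrative, not from the exercise): opening a file in 'w' mode truncates it, so if every iteration writes to the same output name, only the last text survives. Deriving the output name from each input filename gives one summary per input.

```python
import os
import shutil
import tempfile

def write_summaries(summaries, out_dir, shared_name=None):
    """Write each (filename, text) pair; return the sorted names in out_dir.

    If shared_name is given, every iteration reopens that one file in 'w'
    mode, which truncates it, so only the last text is kept.
    """
    for filename, text in summaries:
        out_name = shared_name or filename + '.summary'
        with open(os.path.join(out_dir, out_name), 'w') as outf:
            outf.write(text + '\n')
    return sorted(os.listdir(out_dir))

def demo():
    """Run both variants in fresh scratch directories and collect the results."""
    data = [('baby1990.html', '1990'), ('baby2002.html', '2002')]
    results = []
    for shared in ('all.summary', None):
        tmpdir = tempfile.mkdtemp()
        try:
            results.append(write_summaries(data, tmpdir, shared_name=shared))
        finally:
            shutil.rmtree(tmpdir)
    return results

print(demo())
```

The same "last file only" result also appears when the write step sits outside the loop, so the indentation of that block is worth checking first.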

--
Lp, Robert

J T

May 24, 2014, 9:28:49 AM

to python-g...@googlegroups.com

Thanks for the reply. I am not sure how to write it so that it handles every file. I am very new to programming.

Robert Mandić

May 25, 2014, 2:32:40 AM

to python-g...@googlegroups.com

Can you provide us with your code?

J T

May 25, 2014, 9:25:52 AM

to python-g...@googlegroups.com

Here is my code.

#!/usr/bin/python
# Licensed under the Apache License, Version 2.0
# http://www.apache.org/licenses/LICENSE-2.0

# Google's Python Class
# http://code.google.com/edu/languages/google-python-class/

import sys
import re

def extract_names(filename):
  #+++your code here+++
  # LAB(begin solution)
  # The list [year, name_and_rank, name_and_rank, ...] we'll eventually return.
  names = []

  # Open and read the file.
  f = open(filename, 'rU')
  text = f.read()
  # Could process the file line-by-line, but regex on the whole text
  # at once is even easier.

  # Get the year.
  year_match = re.search(r'Popularity\sin\s(\d\d\d\d)', text)
  if not year_match:
    # We didn't find a year, so we'll exit with an error message.
    sys.stderr.write('Couldn\'t find the year!\n')
    sys.exit(1)
  year = year_match.group(1)
  names.append(year)

  # Extract all the data tuples with a findall()
  # each tuple is: (rank, boy-name, girl-name)
  tuples = re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', text)
  # print tuples

  # Store data into a dict using each name as a key and that
  # name's rank number as the value.
  # (if the name is already in there, don't add it, since
  # this new rank will be bigger than the previous rank).
  names_to_rank = {}
  for rank_tuple in tuples:
    (rank, boyname, girlname) = rank_tuple  # unpack the tuple into 3 vars
    if boyname not in names_to_rank:
      names_to_rank[boyname] = rank
    if girlname not in names_to_rank:
      names_to_rank[girlname] = rank

  sorted_names = sorted(names_to_rank.keys())
  for name in sorted_names:
    names.append(name + " " + names_to_rank[name])

  return names
  # LAB(replace solution)
  # return
  # LAB(end solution)

def main():
  # This command-line parsing code is provided.
  # Make a list of command line arguments, omitting the [0] element
  # which is the script itself.
  args = sys.argv[1:]

  if not args:
    print 'usage: [--summaryfile] file [file ...]'
    sys.exit(1)

  # Notice the summary flag and remove it from args if it is present.
  summary = False
  if args[0] == '--summaryfile':
    summary = True
    del args[0]

  # +++your code here+++
  # For each filename, get the names, then either print the text output
  # or write it to a summary file
  # LAB(begin solution)
  #for filename in args:
    #names = extract_names(filename)
    #text = '\n'.join(names)
  import glob
  for arg in args:
    for filename in glob.glob(arg):
      #print 'filename is', filename
      filedata = extract_names(filename)
      text = '\n'.join(filedata) + '\n'
      #print 'this is the text', text
      # Make text out of the whole list
      # This block must stay inside the inner loop so that every file
      # gets its own summary (or its own printed output).
      if summary:
        outf = open(filename + '.summary', 'w')
        outf.write(text + '\n')
        outf.close()
      else:
        print text
  # LAB(end solution)

if __name__ == '__main__':
  main()
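For anyone who wants to check the parsing step without the HTML files, here is a small sketch that applies the same two regular expressions to an inline string. The `SAMPLE` text is a made-up fragment shaped like the exercise's tables, and `extract_names_from_text` is a hypothetical name; raising `ValueError` instead of calling `sys.exit` is my own change to make the function easier to test.

```python
import re

# Made-up fragment in the shape the regexes above expect (not real file contents).
SAMPLE = """<h3 align="center">Popularity in 1990</h3>
<tr align="right"><td>1</td><td>Michael</td><td>Jessica</td>
<tr align="right"><td>2</td><td>Christopher</td><td>Ashley</td>
"""

def extract_names_from_text(text):
    """Return [year, 'Name rank', ...] with the names in alphabetical order."""
    year_match = re.search(r'Popularity\sin\s(\d\d\d\d)', text)
    if not year_match:
        raise ValueError("Couldn't find the year!")

    names_to_rank = {}
    rows = re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', text)
    for rank, boyname, girlname in rows:
        # setdefault keeps the first rank seen for a name.
        names_to_rank.setdefault(boyname, rank)
        names_to_rank.setdefault(girlname, rank)

    names = [year_match.group(1)]
    for name in sorted(names_to_rank):
        names.append(name + ' ' + names_to_rank[name])
    return names

print('\n'.join(extract_names_from_text(SAMPLE)))
```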

Robert Mandić

May 25, 2014, 11:07:25 AM

to python-g...@googlegroups.com

This is working as intended:

Running:

$ python tst.py --summaryfile baby1990.html baby2002.html

Created:

$ ls -1 *.summary
baby1990.html.summary
baby2002.html.summary
