Message from discussion shuffle the lines of a large file
Simon Brunning  
Newsgroups: comp.lang.python
From: Simon Brunning <simon.brunn...@gmail.com>
Date: Tue, 8 Mar 2005 14:28:09 +0000
Local: Tues, Mar 8 2005 9:28 am
Subject: Re: shuffle the lines of a large file
On Tue, 8 Mar 2005 14:13:01 +0000, Simon Brunning <simon.brunn...@gmail.com> wrote:
> On 7 Mar 2005 06:38:49 -0800, g...@ll.mit.edu <g...@ll.mit.edu> wrote:
> > As far as I can tell, what you ultimately want is to be able to extract
> > a random ("representative?") subset of sentences.

> If this is what's wanted, then perhaps some variation on this cookbook
> recipe might do the trick:

> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/59865

I couldn't resist. ;-)

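import random

def randomLines(filename, lines=1):
    # One slot per requested line; each slot is filled independently.
    selected_lines = [None] * lines

    with open(filename) as f:
        for line_index, line in enumerate(f):
            for selected_line_index in range(lines):
                # line_index is 0-based, so the line at index n replaces
                # the current pick with probability 1/(n + 1).
                if random.uniform(0, line_index + 1) < 1:
                    selected_lines[selected_line_index] = line

    return selected_lines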

This has the advantage that every line has the same chance of being
picked regardless of its length. Each slot is filled independently,
though, so there is a chance that it'll pick the same line more than
once.
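
If the repeats matter, one possible variation is reservoir sampling
(often called Algorithm R), which keeps a pool of k lines rather than
making k independent picks. This is only a sketch and not part of the
cookbook recipe; the names sample_lines, k and rng are made up for
illustration.

```python
import random

def sample_lines(filename, k=1, rng=random):
    """Return up to k lines from distinct positions, chosen uniformly in one pass."""
    reservoir = []
    with open(filename) as f:
        for line_index, line in enumerate(f):
            if line_index < k:
                # The first k lines fill the reservoir directly.
                reservoir.append(line)
            else:
                # Keep this line with probability k / (line_index + 1),
                # evicting a uniformly chosen current member.
                j = rng.randrange(line_index + 1)
                if j < k:
                    reservoir[j] = line
    return reservoir
```

If the file has fewer than k lines, all of them come back. The result
is in reservoir order rather than random order, so call random.shuffle
on it if the order matters.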

--
Cheers,
Simon B,
si...@brunningonline.net,
http://www.brunningonline.net/simon/blog/

