from gevent import monkey
monkey.patch_all()  # patch blocking I/O so requests calls inside greenlets run concurrently

import os
import gevent
import requests
from bs4 import BeautifulSoup
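
# Scrapes the PGA Tour stats site: for every stat linked from the category pages
# below, downloads the per-season HTML into all_stats_html/<stat name>/.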
categoryLabels = ['ROTT_INQ', 'RAPP_INQ', 'RARG_INQ', 'RPUT_INQ', 'RSCR_INQ', 'RSTR_INQ', 'RMNY_INQ', 'RPTS_INQ']
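
# NOTE: these format strings are not defined elsewhere in this file; the values below
# are an assumption based on the old (pre-redesign) pgatour.com stats URLs,
# e.g. /stats/categories.ROTT_INQ.html and /stats/stat.<id>.y2019.html.
categoryUrlFormat = "https://www.pgatour.com/stats/categories.%s.html"
statUrlFormat = "https://www.pgatour.com/stats/stat.%s.%s.html"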
def saveHTML(url, filename):
    print("Saving", url, "to", filename)
    r = requests.get(url)
    with open(filename, 'wt', encoding='utf-8') as f:
        f.write(r.text)
# startYear: most recent season to download, in the site's "yYYYY" form (e.g. "y2019")
# numYears: how many seasons to keep, counting back from startYear
def generateURL(startYear, numYears):
    statIds = []
    # Collect every stat id linked from each category page.
    for category in categoryLabels:
        categoryUrl = categoryUrlFormat % category
        page = requests.get(categoryUrl)
        html = BeautifulSoup(page.text.replace('\n', ''), 'html.parser')
        for table in html.find_all("div", class_="table-content"):
            for link in table.find_all("a"):
                # the stat id is the second dot-separated piece of the href (e.g. "stat.<id>.html")
                statIds.append(link['href'].split('.')[1])
    for statId in statIds:
        url = statUrlFormat % (statId, startYear)
        page = requests.get(url)
        html = BeautifulSoup(page.text.replace('\n', ''), 'html.parser')
        stat = html.find("div", class_="main-content-off-the-tee-details").find('h1').text
        # replace '/' so a stat name containing one isn't treated as a nested path
        directory = "all_stats_html/%s" % stat.replace('/', ' ')
        if not os.path.exists(directory):
            os.makedirs(directory)
        years = []
        for option in html.find("select", class_="statistics-details-select").find_all("option"):
            year = option['value']
            # keep up to numYears distinct seasons; y2020 is explicitly excluded
            if year not in years and len(years) < numYears and year != "y2020":
                years.append(year)
        urlFilenamePairs = []
        for year in years:
            url = statUrlFormat % (statId, year)
            filename = "%s/%s.html" % (directory, year)
            if not os.path.isfile(filename):  # skip seasons already downloaded
                urlFilenamePairs.append((url, filename))
        # download the remaining seasons for this stat concurrently
        jobs = [gevent.spawn(saveHTML, url, filename) for url, filename in urlFilenamePairs]
        gevent.joinall(jobs)
# Main
if __name__ == "__main__":
    generateURL("y2019", 5)