Scraping a widget


jk0...@gmail.com

Jun 18, 2018, 1:04:57 PM
to Selenium Users
Hello,

I'm trying to scrape data from a widget. It worked for the first page, but there was a lot more data below that it wasn't scraping. So I added code to scroll to the end of the page so that all the data could be scraped. Now, however, when it has finished scrolling to the end of the page, it just waits and never prints. Any idea how to get it to stop waiting and print? Eventually I'd also like to bring the data into Excel, if anyone knows how to do that. Thanks.


import time

from selenium import webdriver

url = 'http://www.tradingview.com/screener'
driver = webdriver.Firefox()
driver.get(url)

SCROLL_PAUSE_TIME = 0.5

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

element = driver.find_element_by_id('js-screener-container')
print(element.text)
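
On the Excel part of the question, here is a minimal sketch, assuming the scraped values have already been collected into plain lists of strings (for example by reading `.text` from each element). It writes them out as CSV, which Excel can open directly. The helper name `write_rows_csv`, the column names, and the sample values are all made up for illustration.

```python
import csv
import io


def write_rows_csv(out, header, columns):
    """Write parallel column lists as CSV rows to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(header)
    # zip stops at the shortest column, so a short column cannot raise IndexError
    for row in zip(*columns):
        writer.writerow(row)


# Made-up sample values standing in for the scraped text
tickers = ["AAA", "BBB"]
closes = ["10.5", "20.1"]

buffer = io.StringIO()
write_rows_csv(buffer, ["ticker", "close"], [tickers, closes])
csv_text = buffer.getvalue()
```

To write a real file instead of an in-memory buffer, pass a file opened with `open("screener.csv", "w", newline="")`; the file name here is only an example.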

Munagala Srikanth

Jun 19, 2018, 2:00:06 AM
to seleniu...@googlegroups.com
Maybe that element is not holding the data. Find where the data is actually stored, and try locating it by class instead of by the container.


jk0...@gmail.com

Jun 19, 2018, 9:45:07 PM
to Selenium Users
So here is the latest code. It prints results without the scroll-down part of the code. When I add the scroll-down code, it scrolls to the bottom of the page but keeps trying to scroll down indefinitely instead of ending. Can someone show me code to go to the bottom but then end the loop so that it will print?


from selenium import webdriver

url = 'http://www.tradingview.com/screener'
driver = webdriver.Firefox()
driver.get(url)

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")



# will give a list of all tickers
tickers = driver.find_elements_by_css_selector('a.tv-screener__symbol')

# will give a list of all close values
close_values = driver.find_elements_by_xpath("//td[@class = 'tv-data-table__cell tv-screener-table__cell tv-screener-table__cell--numeric']/span")

# will give a list of all percentage changes
percentage_changes = driver.find_elements_by_xpath('//tbody/tr/td[3]')

# will give a list of all value changes
value_changes = driver.find_elements_by_xpath('//tbody/tr/td[4]')

# will give a list of all ranks
ranks = driver.find_elements_by_xpath('//tbody/tr/td[5]/span')

# will give a list of all volumes
volumes = driver.find_elements_by_xpath('//tbody/tr/td[6]')

# will give a list of all market caps
market_caps = driver.find_elements_by_xpath('//tbody/tr/td[7]')

# will give a list of all PEs
pes = driver.find_elements_by_xpath('//tbody/tr/td[8]')

# will give a list of all EPSs
epss = driver.find_elements_by_xpath('//tbody/tr/td[9]')

# will give a list of all EMPs
emps = driver.find_elements_by_xpath('//tbody/tr/td[10]')

# will give a list of all sectors
sectors = driver.find_elements_by_xpath('//tbody/tr/td[11]')

for index in range(len(tickers)):
   print("Row " + tickers[index].text + " " + close_values[index].text + " " + percentage_changes[index].text + " " + value_changes[index].text + " " + ranks[index].text + " " + volumes[index].text + " " + market_caps[index].text + " " + pes[index].text + " " + epss[index].text + " " + emps[index].text + " " + sectors[index].text + " ")
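
The `while True:` loop as posted has no exit condition, so execution never reaches the lines after it. Below is a minimal sketch of a bounded version. It assumes the page height stops growing once all rows have loaded, which may not hold if the widget scrolls inside its own container rather than the document body. `scroll_to_bottom` and `max_rounds` are hypothetical names; `driver` is the existing Selenium driver object.

```python
import time


def scroll_to_bottom(driver, pause=0.5, max_rounds=50):
    """Scroll until the page height stops growing or max_rounds is reached.

    Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        # Scroll down to the current bottom of the page
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to load more rows
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return rounds  # height did not change, so nothing more was loaded
        last_height = new_height
    # Safety cap reached: stop even if the page is still growing
    return max_rounds
```

Calling `scroll_to_bottom(driver)` before the `find_elements_...` lines would let the script move on to printing. The `max_rounds` cap is a design choice to guarantee the loop ends even on a page that keeps loading content.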