How to Get the Current URL of an Item in a Loop Using Selenium

Miracle Akinsola Ayodele

Nov 30, 2017, 2:40:02 AM
to Selenium Users
I am trying to scrape a website with Selenium, but I keep getting:

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
  (Session info: headless chrome=62.0.3202.94)
  (Driver info: chromedriver=2.33.506092 (733a02544d189eeb751fe0d7ddca79a0ee28cce4),platform=Linux 4.4.0-101-generic x86_64)

What could be the issue? Here is the code:

def get_financial_info(self):
    # instantiate a chrome options object so you can set the size and headless preference
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--window-size=1920x1080")
    driver = webdriver.Chrome(chrome_options=chrome_options, executable_path='/home/miracle/chromedriver')

    driver.get("https://www.financialjuice.com")

    try:
        WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='trendWrap']")))
    except TimeoutException:
        driver.quit()

    category_url = [a.get_attribute("href") for a in
                    driver.find_elements_by_xpath("//ul[@class='nav navbar-nav']/li[@class='text-uppercase']/a[@href]")]

    for record in category_url:
        driver.get(record)
        item = {}
        cat = driver.find_elements_by_xpath("//h2[@class='text-uppercase corpName']")
        title_element = driver.find_elements_by_xpath("//p[@class='headline-title']")
        source_element = driver.find_elements_by_xpath("//p[@class='time']/span[@class='resource-name']/a")
        url_element = [a.get_attribute('href') for a in driver.find_elements_by_xpath("//p[@class='headline-title']/a")]

        categories = []

        for category in cat:
            categories.append(category.text)

        for title, source, urls in zip(title_element, source_element, url_element):
            item['category'] = str(categories)[1:-1].strip('"u')
            item['title'] = title.text
            item['source'] = source.text
            item['date'] = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
            driver.get(urls)
            item['url'] = driver.current_url
            print item

Scott Babcock

Nov 30, 2017, 8:19:10 PM
to Selenium Users
You're navigating away from the page from which you extracted the link elements. Instead of working with the references directly, extract the [href] attribute from the references and use these instead.
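
A minimal sketch of that advice, not a tested fix for this particular site. It reuses the XPath expressions from the question and assumes Selenium's generic find_elements(by, value) call; the helper names snapshot_headlines and resolve_urls are hypothetical. The question's code already reads the hrefs up front, so what goes stale is most likely the title and source elements, whose .text is read after driver.get() has loaded a different page. The sketch copies everything into plain strings before any navigation happens.

```python
from datetime import datetime

# Locator strategy string; this is the same value as selenium's By.XPATH.
XPATH = "xpath"


def snapshot_headlines(driver):
    """Copy text and href into plain strings while the list page is still loaded.

    Hypothetical helper: call it right after driver.get(category_page).
    """
    titles = driver.find_elements(XPATH, "//p[@class='headline-title']")
    sources = driver.find_elements(
        XPATH, "//p[@class='time']/span[@class='resource-name']/a")
    links = driver.find_elements(XPATH, "//p[@class='headline-title']/a")
    # Plain dicts of strings survive navigation; WebElement references do not.
    return [
        {"title": t.text, "source": s.text, "href": a.get_attribute("href")}
        for t, s, a in zip(titles, sources, links)
    ]


def resolve_urls(driver, rows):
    """Visit each saved href and record the URL the browser ends up on.

    Hypothetical helper: only the strings in `rows` are used here, so no
    element from the previous page is touched after navigating away.
    """
    items = []
    for row in rows:
        driver.get(row["href"])
        items.append({
            "title": row["title"],
            "source": row["source"],
            "date": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            "url": driver.current_url,
        })
    return items
```

Inside the question's loop this would be used as rows = snapshot_headlines(driver) immediately after driver.get(record), followed by resolve_urls(driver, rows). Building a fresh dict per headline also avoids reusing the single item dict across iterations.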