Selecting nth span


Norman Chan

Dec 10, 2017, 8:22:41 PM
to beautifulsoup
Hi,

I'm trying to get the contents of the second span in each child div of every div with class "details_section".

Here is one section of the page:
 
<div class="details_section">
   <div>
      <span class="label">Rated:</span>
      <span>
      Rated R for strong violence, disturbing images, and language throughout
      </span>
   </div>
   <div>
      <span class="label">Starring:</span>
      <span>
      <a href="/person/charlie-murphy">Charlie Murphy</a>,                                                                             <a href="/person/jack-oconnell">Jack O'Connell</a>,                                                                             <a href="/person/paul-anderson">Paul Anderson</a>,                                                                             <a href="/person/paul-popplewell">Paul Popplewell</a>,                                                                             <a href="/person/sam-hazeldine">Sam Hazeldine</a>,                                                                             <a href="/person/sam-reid">Sam Reid</a>,                                                                             <a href="/person/sean-harris">Sean Harris</a>                                                                    </span>
   </div>
   <div class="genres">
      <span class="label">Genre(s):</span>
      <span>
      <span>Action</span>,                                                                                    <span>Drama</span>,                                                                                    <span>Thriller</span>,                                                                                    <span>War</span>                                                                            </span>
   </div>
   <div class="userscore_text">
      <span class="label">User Score:</span>
      <span>7.4</span>
   </div>
   <div class="runtime">
      <span class="label">Runtime:</span>
      <span>99 min</span>
   </div>
   <div class="summary">
      <span>’71 takes place over a single night in the life of a young British soldier (Jack O’Connell) accidentally abandoned by his unit following a riot on the streets of Belfast in 1971. Unable to tell...</span>
   </div>
</div>



This webpage has many details_section divs, and I would like to get the second span in each of their child divs (that is, each span that doesn't have class='label').

This is what I have so far:

import requests
import string
import simplejson as json
import datetime
from bs4 import BeautifulSoup as bs

headers = {'User-Agent':'Mozilla/5.0'}
page = requests.get(url, headers=headers)
soup = bs(page.text, 'lxml')


def findDetails():
    details = soup.find_all('div', attrs={'class':'details_section'})
    for detail in details:
        print(detail.div)

findDetails()


Any help would be greatly appreciated.

Thanks,
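
One possible approach, sketched under assumptions rather than tested against the live page: walk the direct child divs of each details_section and keep the direct child spans that lack the "label" class. The `html` string below is a shortened stand-in for the markup in the post, `find_details` is a hypothetical helper name, and the built-in "html.parser" is used only so the sketch has no lxml dependency.

```python
from bs4 import BeautifulSoup

# Shortened sample of the markup from the post (assumption: the real page
# follows the same div > span layout).
html = """
<div class="details_section">
   <div>
      <span class="label">Rated:</span>
      <span>
      Rated R for strong violence
      </span>
   </div>
   <div class="genres">
      <span class="label">Genre(s):</span>
      <span>
      <span>Action</span>, <span>Drama</span>
      </span>
   </div>
   <div class="runtime">
      <span class="label">Runtime:</span>
      <span>99 min</span>
   </div>
   <div class="summary">
      <span>Summary text</span>
   </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def find_details(soup):
    """Collect the text of every non-label span that is a direct child of a row div."""
    values = []
    for section in soup.find_all("div", class_="details_section"):
        # recursive=False restricts the search to direct children, so the
        # nested genre spans are not returned a second time on their own.
        for row in section.find_all("div", recursive=False):
            for span in row.find_all("span", recursive=False):
                # "class" is a multi-valued attribute, so it comes back as a list.
                if "label" in span.get("class", []):
                    continue
                # Collapse the runs of whitespace found in the page source.
                values.append(" ".join(span.get_text().split()))
    return values


details = find_details(soup)
for value in details:
    print(value)
```

Filtering on the class rather than on position means the summary row, whose only span has no label, is included as well. If strictly the second span is wanted, indexing the direct child spans of each row (and skipping rows with fewer than two) would be the alternative. On Beautiful Soup 4.7 or later, a CSS selector such as `soup.select("div.details_section > div > span:not(.label)")` should select the same elements.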