Until recently, I hadn't programmed in over 4 years, and this is also my first time using Python. After extensive Google searches for regex in Python, I came across BeautifulSoup. The documentation is extremely long, but a lot of it didn't seem to help me, or I couldn't get it to work when I copied, pasted, and modified the code. So I'm here hoping to get some help.

I'm trying to parse a webpage to get the FIRST mp3 link ONLY. The code I will be parsing will look like this:

<a href="#" playlist="http://dn-naverdic.ktics.co.kr/naverdic/f759cdac78d6e201e5dfd928acc70e2a/4ffec2f7/naverdic/endic/sound/clear/us/007/007582.mp3" class="play3 N=a:wrd.listencom,r:3,i:85c05904f36749e6aa9f6fd3f461f63c">

I've tried the find all function with 'a' as the parameter and tried getting it to find 'playlist', but I could not get it to work. Could I get some help on how to extract the URL for the mp3, please? Thank you
--
You received this message because you are subscribed to the Google Groups "beautifulsoup" group.
To view this discussion on the web visit https://groups.google.com/d/msg/beautifulsoup/-/1TcB-ZYbTcIJ.
To post to this group, send email to beauti...@googlegroups.com.
To unsubscribe from this group, send email to beautifulsou...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/beautifulsoup?hl=en.
Try this:

    soup = BeautifulSoup(your_html)
    print(soup.find('a', 'play3')['playlist'])

This returns the string inside playlist="" for the first anchor tag with the CSS class "play3". If the links you are looking for always have the CSS class play3, then you are golden.
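In case the class name turns out not to be stable, here is a minimal sketch of a variant that looks for the playlist attribute itself. It assumes the bs4 package (Beautiful Soup 4) and Python's built-in "html.parser"; the helper name first_mp3_url, the closing </a> tags, the "listen" link text, and the second example.com link are made up for illustration and are not from the original page.

```python
from bs4 import BeautifulSoup

# Sample markup based on the tag in the question. The link text, the closing
# tags, and the second <a> are assumptions added so the snippet is complete.
html = '''
<a href="#" playlist="http://dn-naverdic.ktics.co.kr/naverdic/f759cdac78d6e201e5dfd928acc70e2a/4ffec2f7/naverdic/endic/sound/clear/us/007/007582.mp3" class="play3 N=a:wrd.listencom,r:3,i:85c05904f36749e6aa9f6fd3f461f63c">listen</a>
<a href="#" playlist="http://example.com/second.mp3" class="play3">listen</a>
'''


def first_mp3_url(markup):
    """Return the playlist value of the first <a> that has one, else None."""
    soup = BeautifulSoup(markup, "html.parser")
    # playlist=True matches any <a> tag that carries a playlist attribute,
    # whatever its value; find() stops at the first match.
    link = soup.find("a", playlist=True)
    if link is None:
        return None
    return link["playlist"]


print(first_mp3_url(html))
```

Because find() returns None when nothing matches, checking for None before indexing avoids a TypeError on pages that have no such link.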
On Thu, Jul 12, 2012 at 8:10 AM, Chris Lewis <chriskw...@gmail.com> wrote:
Until recently, I hadn't programmed in over 4 years, and this is also my first time using Python. After extensive Google searches for regex in Python, I came across BeautifulSoup. The documentation is extremely long, but a lot of it didn't seem to help me, or I couldn't get it to work when I copied, pasted, and modified the code. So I'm here hoping to get some help.

I'm trying to parse a webpage to get the FIRST mp3 link ONLY. The code I will be parsing will look like this:

<a href="#" playlist="http://dn-naverdic.ktics.co.kr/naverdic/f759cdac78d6e201e5dfd928acc70e2a/4ffec2f7/naverdic/endic/sound/clear/us/007/007582.mp3" class="play3 N=a:wrd.listencom,r:3,i:85c05904f36749e6aa9f6fd3f461f63c">

I've tried the find all function with 'a' as the parameter and tried getting it to find 'playlist', but I could not get it to work. Could I get some help on how to extract the URL for the mp3, please? Thank you
Great, I'll give that a try. Thank you so much for your help.