tekst = '''<li ><div class="views-field-field-webrubrik-value"><h3><a
href="/307046">Claus Hjort spiller med mærkede kort</a></h3> </div><div
class="views-field-field-skribent-uid"><div class="byline">Af: <span
class="authors">Dennis Kristensen</span></div> </div> <div
class="views-field-field-webteaser-value"> <div class="webteaser">Claus
Hjort Frederiksens argumenter for at afvise trepartsforhandlinger har ikke
hold i virkeligheden. Hans ærinde er nok snarere at forberede det
ideologiske grundlag for en Løkke Rasmussens genkomst som
statsminister</div> </div><span class="views-field-view-node"> <span
class="actions"><a href="/307046">Læs mere</a> | <a
href="/307046/#comments">Kommentarer (4)</a></span> </span></li>'''
I am interested in finding the article link, <a href="/307046"> (marked with
yellow in the original post). My search term is "Rasmussen" (highlighted in red).
from bs4 import BeautifulSoup
import re

soup = BeautifulSoup(tekst)
contexts = soup.find_all(text=re.compile("Rasmussen"))
for a in contexts:
    print("context: %s" % a)
    # find_parents('a') only returns <a> tags that enclose the string
    for artikel_link in a.find_parents('a'):
        print("Artikel link %s" % artikel_link)
        link = artikel_link.get('href')
        print("Link %s" % link)
Maybe the link is not really a parent, but a previous sibling? But however
much I tinker, I can't seem to extract the link, e.g.
for i in contexts:
    i.find_previous_siblings('a')
returns an empty list.
As the prettify() output below shows, the text string is not a direct child
of the link. But it is positioned within the same <li> element, nested
inside a sibling branch. So neither the sibling search nor the parent search
can easily find this link, correct?
print(soup.prettify())
<html>
 <body>
  <li>
   <div class="views-field-field-webrubrik-value">
    <h3>
     <a href="/307046">
      Claus Hjort spiller med mærkede kort
     </a>
    </h3>
   </div>
   <div class="views-field-field-skribent-uid">
    <div class="byline">
     Af:
     <span class="authors">
      Dennis Kristensen
     </span>
    </div>
   </div>
   <div class="views-field-field-webteaser-value">
    <div class="webteaser">
     Claus Hjort Frederiksens argumenter for at afvise
     trepartsforhandlinger har ikke hold i virkeligheden. Hans ærinde er nok
     snarere at forberede det ideologiske grundlag for en Løkke Rasmussens
     genkomst som statsminister
    </div>
   </div>
   <span class="views-field-view-node">
    <span class="actions">
     <a href="/307046">
      Læs mere
     </a>
     |
     <a href="/307046/#comments">
      Kommentarer (4)
     </a>
    </span>
   </span>
  </li>
 </body>
</html>
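For what it's worth, here is a minimal sketch of what the two searches actually get to look at. It assumes Python 3, the standard-library "html.parser" backend and a shortened copy of the markup above (same nesting, less text); none of that comes from the original post. Starting from the matched string, it lists the ancestors and the previous siblings, which is all that find_parents() and find_previous_siblings() ever inspect.

```python
import re
from bs4 import BeautifulSoup

# Shortened copy of the <li> from the post (assumption: same nesting, less text).
html = (
    '<li><div class="views-field-field-webrubrik-value"><h3>'
    '<a href="/307046">Claus Hjort spiller med mærkede kort</a></h3></div>'
    '<div class="views-field-field-webteaser-value"><div class="webteaser">'
    'grundlag for en Løkke Rasmussens genkomst som statsminister</div></div>'
    '<span class="actions"><a href="/307046">Læs mere</a></span></li>'
)
soup = BeautifulSoup(html, "html.parser")
hit = soup.find(string=re.compile("Rasmussen"))

# Ancestors of the matched string, innermost first.
ancestor_names = [parent.name for parent in hit.parents]
# Nodes that precede the string inside its own parent <div>.
previous_siblings = list(hit.previous_siblings)
# <a> tags that enclose the string.
enclosing_links = hit.find_parents("a")

print(ancestor_names)
print(previous_siblings)
print(enclosing_links)
```

None of the ancestors is an <a>, and the string has no previous siblings at all, so both searches come back empty. The link lives in a different branch below the shared <li>.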
Thanks in advance
On Sun, Jul 29, 2012 at 4:32 PM, Andreas Christoffersen <
achristoffer...@gmail.com> wrote:
> Thanks for getting me up to speed, Leonard. Everything now works as
> expected! I love BS4. Whatever the other reasons are, I find it much
> easier than lxml (for my needs anyway). It also has really good documentation.
> Thanks again.
I was able to get at the href you are targeting with this:
href = [parent for parent in contexts[0].parents][2].a['href']
In other words, it's the first <a> in the third parent of the first context
that you pulled with re.
Since the site is using the Drupal module Views
<http://drupal.org/project/views/> to generate the HTML, you might be able
to get away with hard-coding the list indices.
Otherwise, I am sure you could check the contents of each parent until you hit it.
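To spell the one-liner out, a small sketch in Python 3 follows. The shortened markup and the "html.parser" backend are assumptions made for the example, not part of the original post. It shows that index 2 of the parents list is the <li>, and that .a is shorthand for .find("a"), i.e. the first <a> anywhere below that tag.

```python
import re
from bs4 import BeautifulSoup

# Shortened copy of the posted markup (assumption: same nesting, less text).
html = (
    '<li><div class="views-field-field-webrubrik-value"><h3>'
    '<a href="/307046">Claus Hjort spiller med mærkede kort</a></h3></div>'
    '<div class="views-field-field-webteaser-value"><div class="webteaser">'
    'grundlag for en Løkke Rasmussens genkomst som statsminister</div></div></li>'
)
soup = BeautifulSoup(html, "html.parser")
hit = soup.find(string=re.compile("Rasmussen"))

parents = list(hit.parents)
li = parents[2]   # 0: div.webteaser, 1: its wrapper div, 2: the <li>
href = li.a["href"]                    # attribute-style shortcut
href_explicit = li.find("a")["href"]   # the same lookup written out

print(li.name, href)
```

The index 2 only holds for this particular nesting, which is why hard-coding it is tied to the markup that Views generates.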
Link
On Tue, Jul 31, 2012 at 6:09 AM, Andreas Christoffersen <
achristoffer...@gmail.com> wrote:
> New question, i am afraid:
> --
> You received this message because you are subscribed to the Google Groups
> "beautifulsoup" group.
> To post to this group, send email to beautifulsoup@googlegroups.com.
> To unsubscribe from this group, send email to
> beautifulsoup+unsubscribe@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/beautifulsoup?hl=en.
I guess I am slowly learning that nothing is very easy when web scraping: my
sample code works in all cases but this one, so far.
Not all the sites (Danish newspapers) are Drupal sites, so I need something
as generic as possible.
A solution seems to be to first try the current approach and, if no link is
found, then try parents[0], and if nothing is found there, parents[1], etc.
(That's actually what I thought find_parents('a') did.)
It seems a small recursive function is suited for this task. That's what you
had in mind when you wrote:
> Otherwise I am sure you could check the contents of each until you hit it
Correct?
Thanks a bunch!!!
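The fallback described above can be sketched as a plain loop rather than recursion. This is only an illustration under assumptions: Python 3, the "html.parser" backend, a shortened copy of the markup, and a helper name, first_link_going_up, that is made up for the example. The point it tries to show is that find_parents('a') only filters the chain of ancestors, whereas the loop asks each ancestor in turn whether it contains a link.

```python
import re
from bs4 import BeautifulSoup

def first_link_going_up(node):
    """Return the <a> enclosing `node`, else the first <a> found inside the
    nearest ancestor that contains one, else None. (Hypothetical helper.)"""
    enclosing = node.find_parent("a", href=True)
    if enclosing is not None:
        return enclosing
    for parent in node.parents:          # parents[0], parents[1], ...
        link = parent.find("a", href=True)
        if link is not None:
            return link
    return None

# Shortened copy of the posted markup (assumption: same nesting, less text).
html = (
    '<li><div><h3><a href="/307046">Claus Hjort spiller med mærkede kort</a>'
    '</h3></div><div><div class="webteaser">en Løkke Rasmussens genkomst</div>'
    '</div><span class="actions"><a href="/307046/#comments">Kommentarer (4)</a>'
    '</span></li>'
)
soup = BeautifulSoup(html, "html.parser")

teaser_hit = soup.find(string=re.compile("Rasmussen"))
comment_hit = soup.find(string=re.compile("Kommentarer"))

teaser_link = first_link_going_up(teaser_hit)
comment_link = first_link_going_up(comment_hit)
print(teaser_link["href"], comment_link["href"])
```

Note that "first link in the nearest ancestor that has one" is a heuristic. On pages where one container holds several articles it may pick a link belonging to a neighbouring item.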
On Tue, Jul 31, 2012 at 3:49 PM, Link Swanson <l...@mustbuilddigital.com>wrote:
> I was able to get at the href you are targeting with this:
> href = [parent for parent in contexts[0].parents][2].a['href']
> In other words it's the first <a> in the third parent of the first context
> that you pulled with re
> Since the site is using the Drupal module Views
> <http://drupal.org/project/views/> to generate the HTML, you might be able
> to get away with hard-coding the list indices.
> Otherwise I am sure you could check the contents of each until you hit it.
> Link
I have some problems grokking this. I guess it's more of a Python problem (I
am a beginner) than a BS4 problem. Nonetheless I am posting it here. I hope
that is okay.
Below is my attempt at a recursive function, as per the posts above. What I
don't get is described in the comments next to the debugging print below.
Incidentally, I think "find nearest link" is not that uncommon a task?
Thanks in advance.
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import re

tekst = '''<li ><div class="views-field-field-webrubrik-value"><h3><a
href="/307046">Claus Hjort spiller med mærkede kort</a></h3> </div><div
class="views-field-field-skribent-uid"><div class="byline">Af: <span
class="authors">Dennis Kristensen</span></div> </div> <div
class="views-field-field-webteaser-value"> <div class="webteaser">Claus
Hjort Frederiksens argumenter for at afvise trepartsforhandlinger har ikke
hold i virkeligheden. Hans ærinde er nok snarere at forberede det
ideologiske grundlag for en Løkke Rasmussens genkomst som
statsminister</div> </div><span class="views-field-view-node"> <span
class="actions"><a href="/307046">Læs mere</a> | <a
href="/307046/#comments">Kommentarer (4)</a></span> </span></li>'''
to_find = "Rasmussen"
soup = BeautifulSoup(tekst)
contexts = soup.find_all(text=re.compile(to_find))

def find_nearest(element, url, direction="both"):
    """Find the nearest link, relative to a text string.

    When complete it will search up and down (parent, child),
    and will then return the link the fewest steps away from the
    original element. Assumes we have already found an element."""
    # Is the nearest link readily available?
    # If so, this is what we want.
    if element.find_parents('a'):
        for artikel_link in element.find_parents('a'):
            print("artikel_link found: %s" % artikel_link)
            link = artikel_link.get('href')
            if ("http" or "www") not in link:
                link = url + link
            return link
    # If the link is not readily available, we will go up
    if not element.find_parents('a'):
        element = element.parent
        # Print for debugging
        print(element)  # on the 2nd run (i.e. <li>) this finds <a href="/307046">
        # So shouldn't it be caught as readily available above?
        print(u"Found: %s" % element.name)
        # the recursive call
        find_nearest(element, url)

if contexts:
    for a in contexts:
        find_nearest(element=a, url="http://information.dk")
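On the question in the comments: find_parents('a') only ever looks upwards. When element is the <li>, the <a> is one of its descendants, not one of its ancestors, so that check still comes back empty. Two smaller things stand out as well: the result of the recursive call is never returned, and ("http" or "www") evaluates to just "http". Below is one possible reworking, offered as a sketch rather than the definitive fix. It assumes Python 3, the "html.parser" backend and a shortened copy of the markup, and it swaps the string concatenation for urljoin from the standard library.

```python
import re
from urllib.parse import urljoin
from bs4 import BeautifulSoup, Tag

def find_nearest(element, url):
    """Return the absolute URL of the closest link at or above `element`,
    or None when no link is found."""
    if element is None:
        return None  # ran out of parents without finding a link
    # Case 1: the element sits inside a link (the link is an ancestor).
    link = element.find_parent("a", href=True)
    # Case 2: the element is a tag that contains a link (a descendant).
    # Strings are skipped here because on them .find() is str.find().
    if link is None and isinstance(element, Tag):
        link = element.find("a", href=True)
    if link is not None:
        return urljoin(url, link["href"])
    # Nothing here: go one level up and pass the result back to the caller.
    return find_nearest(element.parent, url)

# Shortened copy of the posted markup (assumption: same nesting, less text).
html = (
    '<li><div><h3><a href="/307046">Claus Hjort spiller med mærkede kort</a>'
    '</h3></div><div><div class="webteaser">en Løkke Rasmussens genkomst</div>'
    '</div></li>'
)
soup = BeautifulSoup(html, "html.parser")
hit = soup.find(string=re.compile("Rasmussen"))

result = find_nearest(hit, "http://information.dk")
print(result)
```

This returns the first link in document order inside the nearest ancestor that has one. It does not measure "fewest steps away" in both directions, as the original docstring plans to.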
On Tue, Jul 31, 2012 at 7:46 PM, Link Swanson <l...@mustbuilddigital.com>wrote:
> Correct, in some way you can loop through the parents and build checks
> using if statements to find what you need.
> Also correct that it is hard to handle diverse cases.
> Good luck!
> On Tue, Jul 31, 2012 at 11:32 AM, Andreas Christoffersen <
> achristoffer...@gmail.com> wrote:
>> Thanks Link,
> --
> Link Swanson
> Must Build Digital