Error when using beautifulsoup4

Akpofure Enughwure

Mar 28, 2017, 7:18:37 AM
to beautifulsoup
Good day everyone,

I am new to Python and am currently learning how to scrape data from websites.

I used the following code:

import requests

from bs4 import BeautifulSoup

r = requests.get("url")

The content was loaded

Then I tried this

soup=BeautifulSoup(r.content)


and I got these errors:

Warning (from warnings module):
  File "C:\Python34\lib\site-packages\bs4\__init__.py", line 181
    markup_type=markup_type))
UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 1 of the file <string>. To get rid of this warning, change code that looks like this:

 BeautifulSoup([your markup])

to this:

 BeautifulSoup([your markup], "html.parser")

>>> soup = BeautifulSoup ([r.content], "html.parser")
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    soup = BeautifulSoup ([r.content], "html.parser")
  File "C:\Python34\lib\site-packages\bs4\__init__.py", line 225, in __init__
    markup, from_encoding, exclude_encodings=exclude_encodings)):
  File "C:\Python34\lib\site-packages\bs4\builder\_htmlparser.py", line 157, in prepare_markup
    exclude_encodings=exclude_encodings)
  File "C:\Python34\lib\site-packages\bs4\dammit.py", line 366, in __init__
    for encoding in self.detector.encodings:
  File "C:\Python34\lib\site-packages\bs4\dammit.py", line 257, in encodings
    self.markup, self.is_html)
  File "C:\Python34\lib\site-packages\bs4\dammit.py", line 315, in find_declared_encoding
    declared_encoding_match = xml_encoding_re.search(markup, endpos=xml_endpos)
TypeError: expected string or buffer


What should I do? I also noticed that IDLE is slow when processing this data.


Regards
 

tehtea

Apr 19, 2017, 9:59:52 PM
to beautifulsoup
Hi there,

If you input type(r.content) into IDLE, you can see that it is actually a bytes object, rather than a string or a buffer as expected by BeautifulSoup.
My suggestion is that you can change 

soup=BeautifulSoup(r.content)

to 

soup=BeautifulSoup(r.text, "html.parser")

so that BeautifulSoup will be working with a string object instead.
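
One more thing worth checking: the traceback in the first post shows the call as BeautifulSoup([r.content], "html.parser"), so the markup being passed is a Python list wrapped around the bytes. The square brackets in the warning text ("[your markup]") are only a placeholder. As far as I know, BeautifulSoup accepts either bytes or str as markup, so dropping the brackets should be enough. Here is a minimal sketch of that; the html_bytes value is a made-up stand-in for r.content so that it runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for r.content (requests returns the body as bytes).
html_bytes = b"<html><head><title>Example</title></head><body><p>Hello</p></body></html>"

# Pass the markup itself (bytes or str), not a list wrapped around it,
# and name the parser explicitly as the warning message suggests.
soup = BeautifulSoup(html_bytes, "html.parser")

title_text = soup.title.string        # text inside <title>
paragraph_text = soup.p.get_text()    # text inside the first <p>
print(title_text)
print(paragraph_text)
```

With a real response you would write BeautifulSoup(r.content, "html.parser") or BeautifulSoup(r.text, "html.parser"), again without the brackets.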

Bishwas Bhandari

Oct 13, 2019, 1:29:48 PM
to beautifulsoup
If you get this kind of warning while working with PyDictionary or other modules, you can fix it by making some changes in your utils.py file. In short, this is about how to get rid of the BeautifulSoup user warning.
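
For code you control, the warning quoted in the first post already names the fix: pass the parser name as the second argument. Below is a rough sketch for checking that on your own system. The helper name count_parser_warnings and the sample markup are made up for illustration, and it assumes the warning text still begins with "No parser was explicitly specified" in your bs4 version.

```python
import warnings

from bs4 import BeautifulSoup

markup = "<p>Hello</p>"  # hypothetical sample markup


def count_parser_warnings(*parser_args):
    """Build a soup and count the 'No parser was explicitly specified' warnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        BeautifulSoup(markup, *parser_args)
    return sum(
        "No parser was explicitly specified" in str(w.message) for w in caught
    )


without_parser = count_parser_warnings()             # no parser named
with_parser = count_parser_warnings("html.parser")   # parser named explicitly
print(without_parser, with_parser)
```

If the second count is zero, naming the parser is enough and no change to library files is needed on your side.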
