[Report] Bug in diagnose method from diagnose.py


Bubble fish

Jul 14, 2022, 5:42:32 AM
to beautifulsoup
Hi, I found a small bug in the diagnose() function in diagnose.py.
Below I have listed the code that triggers it, the crash output, the cause, and a suggested fix.

The code to reproduce it is:
from bs4.diagnose import diagnose
diagnose(".")

The crash output is:
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    diagnose(".")
  File "/home/server1/.cache/pypoetry/virtualenvs/pyfuzzgen-A1dD_9Bu-py3.8/lib/python3.8/site-packages/bs4/diagnose.py", line 70, in diagnose
    with open(data) as fp:
IsADirectoryError: [Errno 21] Is a directory: '.'

The cause of the crash is:
diagnose() treats any existing path as a filename and tries to open it. The code below only checks whether the path exists, not whether it is a regular file. Passing a directory such as "." therefore reaches open(), which raises IsADirectoryError. That exception is an OSError, so the except ValueError clause does not catch it.
try:
  if os.path.exists(data):
    print(('"%s" looks like a filename. Reading data from the file.' % data))
    with open(data) as fp:
      data = fp.read()
except ValueError:
  # This can happen on some platforms when the 'filename' is
  # too long. Assume it's data and not a filename.
  pass
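
To illustrate the cause, here is a minimal sketch that is separate from the bs4 code. It only uses the standard library; describe_path is a hypothetical helper name I made up for this example. It shows that os.path.exists() is true for a directory while os.path.isfile() is not.

```python
import os
import tempfile


def describe_path(path):
    """Return (exists, isfile) for the given path. Hypothetical helper."""
    return os.path.exists(path), os.path.isfile(path)


with tempfile.TemporaryDirectory() as tmpdir:
    # A directory exists, but it is not a regular file.
    dir_result = describe_path(tmpdir)

    # A regular file satisfies both checks.
    file_path = os.path.join(tmpdir, "sample.html")
    with open(file_path, "w") as fp:
        fp.write("<p>hello</p>")
    file_result = describe_path(file_path)

print(dir_result)   # (True, False)
print(file_result)  # (True, True)
```

So a check based on os.path.exists() lets directories through to open(), which is what happens with diagnose(".").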

The suggested fix is:
try:
-  if os.path.exists(data):
+  if os.path.isfile(data):
    print(('"%s" looks like a filename. Reading data from the file.' % data))
    with open(data) as fp:
      data = fp.read()
except ValueError:
  # This can happen on some platforms when the 'filename' is
  # too long. Assume it's data and not a filename.
  pass
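
As a rough check of the suggested behavior, here is a standalone sketch that mirrors the patched logic outside of bs4. The function name load_markup is hypothetical and is not part of the bs4 API; it just returns the file contents when the argument names a regular file, and the argument itself otherwise.

```python
import os
import tempfile


def load_markup(data):
    """Hypothetical stand-in for the patched check in diagnose().

    If `data` names a regular file, return its contents.
    Otherwise assume `data` is markup and return it unchanged.
    """
    try:
        if os.path.isfile(data):
            with open(data) as fp:
                return fp.read()
    except ValueError:
        # Same reasoning as the original comment: on some platforms a
        # too-long 'filename' can raise here, so treat it as data.
        pass
    return data


# A directory is no longer opened; it is passed through as data.
directory_result = load_markup(".")

# Plain markup is passed through unchanged.
markup_result = load_markup("<p>not a path</p>")

# A real file is still read from disk.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "doc.html")
    with open(path, "w") as fp:
        fp.write("<b>from file</b>")
    file_result = load_markup(path)
```

With this check, "." is treated as a one-character document instead of raising IsADirectoryError.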
