You've installed the Python 3 version of Beautiful Soup in a place
where it's being picked up by Python 2.
Leonard
"ImportError: No module named HTMLParser" is an error you'll get when
running Python 2 code under Python 3. It's the opposite of the
"ImportError: No module named html.parser" you were getting before. In
Python 3, the HTMLParser module was renamed to html.parser.
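To illustrate the rename, here is a minimal sketch of code that runs under either interpreter by trying the Python 3 name first. It is only an illustration: the `parser_module` variable is made up for this example, and Beautiful Soup itself does not work this way (the tarball ships Python 2 code that is converted at install time).

```python
import sys

# Try the Python 3 module name first, then fall back to the Python 2 name.
try:
    from html.parser import HTMLParser  # Python 3 location
    parser_module = "html.parser"
except ImportError:
    from HTMLParser import HTMLParser  # Python 2 location
    parser_module = "HTMLParser"

# Report which name worked under the interpreter that is actually running.
print("Python %d.%d imports HTMLParser from %s"
      % (sys.version_info[0], sys.version_info[1], parser_module))
```

If the module name printed here is not the one you expected, the script is running under a different interpreter than you thought.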
The code pulled down and the tarball are identical. However, the
tarball contains Python 2 code. To convert it to Python 3 code, you
need to make sure that when setup.py is run, it's run with the Python
3 interpreter.
easy_install will download the Beautiful Soup tarball and run
setup.py. But if you have both Python 2 and Python 3 installed on your
system, it's a crapshoot whether the command "easy_install" will use
the Python 2 interpreter or the Python 3 interpreter.
Since you are not the first person with this sort of problem on
Windows, I think it must be easy to get into a situation where Python
2 code is installed into the Python 3 library directory, or vice
versa.
For instance, imagine that you tell easy_install to install
beautifulsoup4 into lib\python3.2\site-packages\ but easy_install uses
the Python 2 interpreter. You will download the Beautiful Soup code
(which is written for Python 2) and install it in
lib\python3.2\site-packages\ without converting it to Python 3. This
will give the ImportError you mentioned when you try to use it.
I can imagine other sources of inconsistency, such as the
"easy_install" command running Python 2 while the "python" command
runs Python 3. I don't know exactly what the problem is, but it's
probably something like this.
Here's an idea for avoiding the problem. Hopefully on Windows you have
"easy_install-2.7" and "easy_install-3.2" scripts in the same
directory as "easy_install". (Or whatever versions you're using.)
Hopefully you also have "python-2.7" and "python-3.2" interpreters in
addition to "python". If you run into more problems, I recommend
redoing the installation process, making sure that you always specify
which version of the interpreter you're running.
So instead of "python setup.py" you would run "python-3.2 setup.py".
Instead of "easy_install beautifulsoup4" you would run
"easy_install-3.2 beautifulsoup4".
With this technique you should be able to install easy_install and
beautifulsoup4 under Python 2, then install completely separate
easy_install and beautifulsoup4 under Python 3.
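One way to confirm which interpreter a given command really starts is a tiny script like the sketch below. The function name `describe_interpreter` and the idea of saving it as a file such as which_python.py are my own assumptions for illustration; it only uses the standard `sys` module.

```python
import sys


def describe_interpreter():
    """Return the path and version of the interpreter running this script."""
    version = tuple(sys.version_info[:3])
    return "%s (Python %d.%d.%d)" % ((sys.executable,) + version)


# Run this file with each interpreter command you have and compare the output.
print(describe_interpreter())
```

Running it once with each command you intend to use shows whether those commands point at the interpreter you expect.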
If you still have problems, I'd like to see:
* The commands you used to set everything up, with output if possible
* The first line of the "easy_install" script, which names the
interpreter it runs under (although I don't know if this works on Windows)
* Any other easy_install-* scripts you have, e.g. easy_install-2.7.
* The output of python --version, python2.7 --version, and python3 --version.
* The full path of every copy of BS4's bs4/builder/_htmlparser.py on
your system, and the first 10 lines of each copy. (So that I can see
whether it imports html.parser or HTMLParser.)
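The last check in the list above can be automated. The following is a rough sketch, not a tested diagnostic tool: the function names `classify_source` and `report` are hypothetical, and it assumes the file imports the parser with an ordinary "import" or "from ... import" statement at the start of a line.

```python
import os
import re

# An import of html.parser means Python 3 code; HTMLParser means Python 2 code.
PY3_IMPORT = re.compile(r"^\s*(from|import)\s+html\.parser\b", re.MULTILINE)
PY2_IMPORT = re.compile(r"^\s*(from|import)\s+HTMLParser\b", re.MULTILINE)


def classify_source(text):
    """Return 'python3', 'python2', or 'unknown' for a piece of source code."""
    if PY3_IMPORT.search(text):
        return "python3"
    if PY2_IMPORT.search(text):
        return "python2"
    return "unknown"


def report(root):
    """Print every bs4/builder/_htmlparser.py under root with its flavor."""
    for dirpath, _dirs, files in os.walk(root):
        in_builder = dirpath.replace("\\", "/").endswith("bs4/builder")
        if in_builder and "_htmlparser.py" in files:
            path = os.path.join(dirpath, "_htmlparser.py")
            with open(path, "rb") as handle:
                text = handle.read().decode("utf-8", "replace")
            print("%s: %s" % (path, classify_source(text)))


# Example (left commented out because it walks a whole directory tree):
# report(r"c:\Python32")
```

A copy reported as "python2" inside a Python 3 library directory would point to code that was installed without being converted.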
Leonard
Can you show me these errors? I think this is the point where the
problem happened.
"No module named HTMLParser" is also caused by Python 2 code that
wasn't translated to Python 3. Since you don't have Python 2 installed
and you reported errors during the translation process, that step
seems like the logical place to look.
It would also be helpful to see the complete output of your "python
setup.py install".
Leonard
The beautifulsoup4 setup.py starts out like this:
    try:
        from distutils.command.build_py import build_py_2to3 as build_py
    except ImportError:
        # 2.x
        from distutils.command.build_py import build_py
build_py_2to3 is supposed to handle the conversion. setup.py assumes
that if build_py_2to3 is not present, then you're on Python 2 and the
code should not be converted.
That assumption fails in your case, and I think it's because you're
using Distribute instead of distutils. Distribute generally acts like
distutils, but it does not provide build_py_2to3. Instead, you need to
specify "use_2to3=True" as an argument to setup().
So, if I insert this line into setup.py, it should work with either
Distribute or distutils:
=== modified file 'setup.py'
--- setup.py 2012-03-02 13:16:53 +0000
+++ setup.py 2012-03-07 16:50:26 +0000
@@ -13,6 +13,7 @@
     url="http://www.crummy.com/software/BeautifulSoup/bs4/",
     download_url = "http://www.crummy.com/software/BeautifulSoup/bs4/download/",
     long_description="""Beautiful Soup sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree.""",
+    use_2to3 = True,
     license="MIT",
     packages=['bs4', 'bs4.builder', 'bs4.tests'],
     cmdclass = {'build_py':build_py},
Would you try this (running the beautifulsoup4 setup.py manually) and
see if that works?
I will change setup.py to work with Distribute, but I don't understand
your problem and I'm at the limit of what I can do to fix it. Sorry.
Leonard
As long as I use "easy_install" then it's installed correctly.
And I'm a Python newbie so... I'll forget about it too. :)
Thanks for trying. I did learn some things so it's not a total loss for me.
Skippy
1. I installed Python 3.2.2 from http://python.org/
2. I downloaded Distribute 0.6.24 from
http://pypi.python.org/pypi/distribute and installed it with
'python.exe setup.py install'
3. I downloaded Beautiful Soup 4 beta 10 from
http://pypi.python.org/pypi/beautifulsoup4 and built it with
'c:\Python32\python.exe setup.py build'. I did *not* install the built
code, because I just wanted to see if the code was converted to Python
3.
I did see the 2to3 conversion running, and when I checked the code
located in build/, I saw that it had been converted to Python 3. From
the build/lib directory, I was able to start a Python shell, run "from
bs4 import BeautifulSoup", and parse a simple document.
4. I then removed Distribute (by deleting it from
c:\Python32\Lib\site-packages).
5. I then destroyed the Beautiful Soup build/ directory and rebuilt
it, having nothing installed but Python's built-in distutils package
("c:\Python32\python.exe setup.py build"). Again, the 2to3 conversion
ran and the code was properly converted to Python 3.
6. At this point I was pretty close to having a pristine Python
installation, since I removed Distribute and I never installed
Beautiful Soup. But just to be sure, I completely removed the Python
3.2.2 installation and destroyed the c:\Python32 directory.
------------------
Then I checked whether easy_install (as provided by Distribute)
installed the package properly.
7. I reinstalled Python 3.2.2
8. Then I installed Distribute again (with "python.exe setup.py install")
9. Then I ran "c:\Python32\Scripts\easy_install.exe beautifulsoup4"
10. Beautiful Soup was installed into c:\Python32\Lib\site-packages. I
spot-checked builder/_htmlparser.py to make sure the code was
converted to Python 3. I was able to "from bs4 import BeautifulSoup"
and parse a test file.
------------------
So, I think the "use_2to3" thing is a red herring, a feature that
Distribute supports *in addition to* the distutils-style import. I
never got into a situation where I couldn't "from
distutils.command.build_py import build_py_2to3", and I haven't been
able to duplicate the problem on a system that has only ever had
Python 3 installed on it.
At the same time, this experience made me think it's very difficult to
forget what version of Python you're using on Windows--anything you
run has to include the directory name, which includes the version
number.
Leonard