I use HTML5lib for my HTML to PDF converter project "pisa"
<http://www.htmltopdf.org/>, which is written in Python. For it I added
a CSS parser that originally comes from the TechGame Python Framework
<http://www.techgame.net/projects/Framework> and modified it to work
better with HTML-specific definitions. So how about integrating such a
CSS parser into the HTML5lib project? The most recent code is available
here: <http://pypi.python.org/pypi/pisa>
This post was also added as a feature request to the issue tracker of
the project.
dirk.holtw...@gmail.com wrote:
> I use HTML5lib for my HTML to PDF converter project "pisa"
> <http://www.htmltopdf.org/> that is written in Python. Therefore I
> added a CSS parser that originally comes from the TechGame Python
> Framework <http://www.techgame.net/projects/Framework> and modified
> it for better use with HTML specific definitions. So how about
> integrating such a CSS parser in the HTML5lib project? Get the most
> recent code here <http://pypi.python.org/pypi/pisa>
Does the CSS parsing code have a test suite?
Is it licensed under the MIT license? If not, is relicensing a possibility?
Sam Ruby wrote:
> dirk.holtw...@gmail.com wrote:
>> I use HTML5lib for my HTML to PDF converter project "pisa"
>> <http://www.htmltopdf.org/> that is written in Python. Therefore I
>> added a CSS parser that originally comes from the TechGame Python
>> Framework <http://www.techgame.net/projects/Framework> and modified
>> it for better use with HTML specific definitions. So how about
>> integrating such a CSS parser in the HTML5lib project? Get the most
>> recent code here <http://pypi.python.org/pypi/pisa>
> Does the CSS parsing code have a test suite?
> Is it licensed under the MIT license? If not, is relicensing a possibility?
Those were 2/3 of the questions I was going to ask :) The third is what's the advantage of distributing these things together? Is there some tight coupling needed between the CSS parser and the HTML parser that is hard to achieve without hardwiring them to each other?
-- "Eternity's a terrible thought. I mean, where's it all going to end?" -- Tom Stoppard, Rosencrantz and Guildenstern are Dead
For parsing, this project may also be a good choice:
<http://cthedot.de/cssutils/>. But as far as I know, it lacks a
selector-to-DOM mapping.
I think bundling HTML5lib with a CSS parser that can map selectors
onto the HTML DOM tree would be an advantage. I think HTML and CSS
belong together :) But whether it should be bundled with HTML5lib is a
good question. My hope was that this way it could become an
easy-to-use standard for parsing CSS in Python, which is missing at
the moment. I hope this answers James's question ;)
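
To make "mapping selectors to the DOM tree" concrete, here is a minimal
sketch. It is only an illustration under stated assumptions: it uses the
standard library's xml.dom.minidom instead of HTML5lib, it does not use
the pisa/TechGame parser, and the helper names (matches, select,
apply_rules) are made up for this example. It handles only the simple
selectors "tag", ".class" and "#id" and ignores specificity.

```python
from xml.dom import minidom


def matches(element, selector):
    """Check one element against a simple selector: 'tag', '.class' or '#id'."""
    if selector.startswith("#"):
        return element.getAttribute("id") == selector[1:]
    if selector.startswith("."):
        return selector[1:] in element.getAttribute("class").split()
    return element.tagName == selector


def select(document, selector):
    """Return all elements, in document order, that match the selector."""
    return [el for el in document.getElementsByTagName("*") if matches(el, selector)]


def apply_rules(document, rules):
    """Map each matching element to the merged declarations of the rules that hit it.

    Later rules override earlier ones; specificity is ignored in this sketch.
    """
    styles = {}
    for selector, declarations in rules:
        for el in select(document, selector):
            styles.setdefault(el, {}).update(declarations)
    return styles


# Hypothetical input: a tiny well-formed document and already-parsed CSS rules.
doc = minidom.parseString(
    '<html><body><h1 id="title">Hi</h1>'
    '<p class="note big">text</p><p>more</p></body></html>'
)
rules = [
    ("p", {"color": "black"}),
    (".note", {"color": "red"}),
    ("#title", {"font-size": "2em"}),
]
styles = apply_rules(doc, rules)
for el, style in styles.items():
    print(el.tagName, style)
```

A real integration would take the DOM from the HTML parser and the rules
from the CSS parser; the point here is only that the selector matching
needs access to both sides.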
On 1/28/08, dirk.holtw...@gmail.com <dirk.holtw...@gmail.com> wrote:
> I think the bundling of HTML5lib and a CSS parser that allows mapping
> selectors to the HTML DOM tree would be an advantage. I think HTML and
> CSS stick together :) But if it should be bundled to HTML5lib is a
> good question. My hope was that this way it could become an easy to
> use standard for parsing CSS in Python, which is missing for the
> moment. Hope this answers the question of James ;)
I'm not sure I see the advantage of having them in the same repository but I'm not against it per se. However, having tests and it being licensed under the MIT license is important for html5lib.
This is the license of the CSS parser; it seems to be BSD style?
"""
Copyright (c) 2002-2004, TechGame Networks, LLC.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

* Redistributions of source code must retain the above copyright
  notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
  notice, this list of conditions and the following disclaimer in the
  documentation and/or other materials provided with the distribution.
* Neither the name of TechGame Networks, LLC nor the names of its
  contributors may be used to endorse or promote products derived from
  this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""
But I will contact the TechGame people to find out what they think
about it.
> > I'm not sure I see the advantage of having them in the same repository
> > but I'm not against it per se. However, having tests and it being
> > licensed under the MIT license is important for html5lib.
I am the author of the TechGame Framework, and specifically, the CSS
modules under discussion. We would love to see our code be useful to
you, as it has been for us. If you have further questions for me,
please CC me directly, as I'm a little overrun with mailing lists to
read....
> Is it licensed under the MIT license? If not, is relicensing a possibility?
Yes, it is licensed under a BSD style license, whose genesis is the
MIT style license. You may download the code from our working
Mercurial/hg repository. (The Subversion repo is old, sorry.)
This is our old svn/cvs style collection of code converted directly to
an hg repo. In contrast to our new code projects like TG/blathernet,
TG/helix, and TG/kvObserving, it is a collection of disparate packages
intended to speed our development of desktop applications. We would
love to split this out into more modular hg repositories, but it
simply hasn't been high enough on The List to get done. It comes just
before making release announcements...
[shane@TGHMacBookPro] test/w3c% ls testCSS*
testCSS.py testCSSErrors.py testCSSWithMinidom.py
testCSSCascade.py testCSSMedia.py
[shane@TGHMacBookPro] test/w3c% py testCSS.py
.............................................................
----------------------------------------------------------------------
Ran 61 tests in 0.028s
OK
[shane@TGHMacBookPro] test/w3c% py testCSSCascade.py
...
----------------------------------------------------------------------
Ran 3 tests in 0.007s
OK
[shane@TGHMacBookPro] test/w3c% py testCSSMedia.py
........................
----------------------------------------------------------------------
Ran 24 tests in 0.428s
OK
[shane@TGHMacBookPro] test/w3c% py testCSSWithMinidom.py
.
----------------------------------------------------------------------
Ran 1 test in 0.009s
OK
[shane@TGHMacBookPro] test/w3c% py testCSSErrors.py
......
----------------------------------------------------------------------
Ran 6 tests in 0.006s
>> Is it licensed under the MIT license? If not, is relicensing a
>> possibility?
> Yes, it is licensed under a BSD style license, whose genesis is the
> MIT style license.
The third clause of your license makes it different from the MIT license (the MIT license is wholly equivalent to the two-clause BSD license, with what is redundant with the Berne Convention taken out). As long as we restrict ourselves to MIT licensed code (which I expect we will) that third clause means it won't go in.
> The third clause of your license makes it different from the MIT
> license (the MIT license is wholly equivalent to the two-clause BSD
> license, with what is redundant with the Berne Convention taken
> out). As long as we restrict ourselves to MIT licensed code (which I
> expect we will) that third clause means it won't go in.
That's great news! I hope to see an integration into HTML5lib soon.
I would like to provide a demo of integrating the CSS parser with
HTML5lib, but I don't have the time for this at the moment. Its use in
pisa <http://www.htmltopdf.org> could serve as a kind of sample.
Maybe someone else would like to provide an illustrative example of
how to use the CSS parser?