tagsoup-friends
1–30 of 75
Fan Su
12/6/19
Yahoo tagchowder, fork of tagsoup-1.2.1 with enhancement
https://github.com/yahoo/tagchowder
Fan Su
12/6/19
yahoo tagchowder, a fork of tagsoup-1.2.1, with more enhancement
Hi, I am from Yahoo (now Verizon Media). Our team is actively maintaining tagchowder project, which
fans...@gmail.com
10/25/18
!DOCTYPE with double quote only as system identifier
hi all, I have an HTML with following format <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0
Kalpa 1977
4/28/16
Tagsoup with namespace prefix
hi all, The html contains namespace with prefix, when I enable in the parser, parser.setFeature(
Christian Roth
12/17/15
<a href="abc.de"><ins>inserted</ins></a> becomes <a href="abc.de" /><ins>…</ins>
Hi, the following – valid and well-formed – code <a href="abc.de"><ins>inserted
anselmo oliveira
11/30/15
Test analyst
Hello everyone, I'm new here, and without wanting to take up too much of your time, I'd like to get some tips for
reyz, John Cowan (3)
11/8/15
<p> inside <a> parsed outside <a> - how customize TagSoup parsing ?
Thanks for your answer. I finally found a solution inside my code by extending HTMLSchema: private
pghjvan...@gmail.com
11/13/14
bug - meta tag in body handled incorrectly
Usage of the <meta> tag in a document body has become legal (see http://www.w3.org/TR/html5/
binha...@gmail.com
11/4/14
Bug at code with nested <ul>
When using tagsoup 1.2.1 upon html code like: <ul> ... <ul> .. is handled as <ul> .
Devaraja Swami
9/8/14
Parsing error with <a> tag inside <h1> tag
In the following HTML document "x.html", the <a> is inside the <h1> tag which
John Cowan
6/18/14
Re: Can I make tags be upper case?
Kathryn Mazaitis scripsit [in a message I accidentally deleted]: > I'm using TagSoup to fix
Ihe Onwuka
5/9/14
Abort caused by leading space in file name.
Input script java -jar $HOME/bin/tagsoup-1.2.1.jar --nons --files *.htm Exception in thread "
sebastien...@gmail.com
5/2/14
Issues when defining a personalized schema class.
Hello all, I am willing to adapt TagSoup to correct some input files which are not html files. This
Ihe Onwuka, John Cowan (2)
3/28/14
bug - string content of attribute being tokenized by tagsoup
Ihe Onwuka scripsit: > <meta property="og:description" content=" > > or
LP, zed (2)
3/4/14
Nested Tables with Floating TD tag
I also have a similar problem, there seem to be different ways of fixing the HTML. Did you find a way
prabakaran selvan, John Cowan (2)
3/3/14
High CPU issue
prabakaran selvan scripsit: > Hey, I am stuck with high CPU usage whenever i use Tagsoup cleaner.
Byte Array, …, Fuad Efendi (7)
9/25/13
org.w3c.dom.DOMException: NOT_FOUND_ERR: An attempt is made to reference a node in a context where it does not exist.
Hello, RE: Caused by: org.w3c.dom.DOMException: NOT_FOUND_ERR: An attempt is made to reference a node
markus, John Cowan (5)
7/25/13
Some HTML5 elements missing in html.tssl schema file
I've seen i was missing h1..h6 being allowed in elements like anchors now. Sometimes we see HTML
zed, John Cowan (4)
2/8/13
Encoding/Decoding of special characters
zed scripsit: > To clarify, the input string is ', and the parser normally decodes this
Daemmon, …, John Cowan (9)
1/19/13
bug parsing <a><div ></a> nesting
Juan Carlos Garcia Segovia scripsit: > Would you accept patches to make TagSoup follow the HTML
Subhabrata Biswas, John Cowan (5)
12/28/12
Need help with the SAX API
Done. I am home and dry :-) Thanks a lot, John. On Friday, 28 December 2012 00:45:15 UTC+5:30, John
Joe Humphreys, John Cowan (2)
12/20/12
NullPointerException when sharing HTMLSchema
Joe Humphreys scripsit: > Hi. I have some server code in which different threads create Parsers
Fred Toth, David McMeans (3)
11/28/12
tagsoup leaving file open on windows
Fred, I'm using tagSoup 1.2.1. There appears to be a missing close in the parse() method. public
iWantToKeepAnon, Steven Devijver (2)
9/26/12
Re: [tagsoup-friends] Problem with tagsoup: java.io.IOException: Pushback buffer overflow
On Wednesday, September 26, 2012 4:17:14 AM UTC+2, iWantToKeepAnon wrote: FWIW, I ran this page
markus, John Cowan (4)
9/21/12
Attributes on body tag
Dig deeper of course! On Friday, September 21, 2012 10:35:33 AM UTC+2, markus wrote: Thanks John! I
John Cowan, zed (3)
9/19/12
Re: [tagsoup-friends] Not restarting font tags?
zed scripsit: > Is there a new tagsoup version coming out soon with this fix? If not, I can >
John Cowan, zed (2)
9/18/12
Re: [tagsoup-friends] Tagsoup handling newlines inside <pre> tags
Thanks, that helped! On Monday, September 17, 2012 5:44:46 PM UTC-7, John Cowan wrote: zed scripsit:
ghazo
8/31/12
adding new element to DOMResult
hi All, I'm new to tagsoup, I read an html file successfully and save it into DomResult object
jomaras, …, Brian Harcourt (5)
8/5/12
Allowing block elements in inline elements
Any chance you could share exactly how you enabled blocks within inlines in tagSoup? Struggling the
Robert Sanders
7/25/12
Invalid Tag Names
Hi, I've been tasked with converting some legacy HTML into an XML format for import into a CMS.