
HTML Tidy Vs. Coldfusion


elpestybandito

Nov 7, 2005, 10:15:57 AM
Hi,

I've been playing around and have a lovely working version of
http://users.rcn.com/creitzel/tidy.html#cplusplus.

It just winds me up that ColdFusion (and Java in this case) lack features
like this that'd be so very useful. I already use JTidy to process any
user-supplied HTML so I can clean it up and make it all valid. It'd be one of
those "killer features" to have a web application that checks a page for
accessibility errors when you update it and then uses the output from Tidy to
give the user a summary of the errors.
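
A minimal sketch of the "summary of the errors" step, in plain Java with no JTidy dependency. It assumes the diagnostics have already been captured as text (for example through JTidy's error writer) and that each message has the shape `line N column M - Severity: text`. The class name `TidyReportSummary` and the method `countBySeverity` are hypothetical names made up for illustration, and the sample messages are invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts Tidy-style diagnostics by severity.
class TidyReportSummary {
    // Assumed message shape: line 4 column 5 - Warning: <img> lacks "alt" attribute
    private static final Pattern MESSAGE =
        Pattern.compile("^line (\\d+) column (\\d+) - (\\w+): (.*)$");

    // Returns severity -> number of messages, in order of first appearance.
    static Map<String, Integer> countBySeverity(String report) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : report.split("\\R")) {
            Matcher m = MESSAGE.matcher(line.trim());
            if (m.matches()) {
                counts.merge(m.group(3), 1, Integer::sum);
            }
            // Lines that do not match (totals, blank lines) are ignored.
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented sample report, not real Tidy output.
        String sample = String.join("\n",
            "line 1 column 1 - Warning: missing <!DOCTYPE> declaration",
            "line 4 column 5 - Warning: <img> lacks \"alt\" attribute",
            "line 9 column 2 - Error: <foo> is not recognized!");
        System.out.println(countBySeverity(sample));
    }
}
```

The same regular expression also captures the line and column numbers, so a fuller version could list each problem next to its position in the page rather than only counting them.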

If anyone is interested in such a feature and wants to try getting something
working I'm here to lend what help I can.

elpestybandito

Nov 23, 2005, 5:47:20 AM
Thought I'd attach the code I used with JTidy to hopefully spark some interest.

<cfparam name="attributes.html" default="" type="string"> <!--- (R) HTML to clean --->
<cfparam name="attributes.output" default="tidy" type="string"> <!--- (R) Output variable --->
<cfparam name="attributes.bodyOnly" default="true" type="boolean"> <!--- Removes <html><body> tags --->
<cfparam name="attributes.xhtml" default="true" type="boolean"> <!--- Output valid XHTML --->
<cfparam name="attributes.docType" default="omit" type="string">
<!--- [omit|auto|strict|loose] Or custom e.g. "-//ACME//DTD HTML 3.4748*//EN" <- must include double quotes. --->
<cfparam name="attributes.word" default="true" type="boolean"> <!--- Attempts to clean Word HTML --->
<cfparam name="attributes.clean" default="false" type="boolean"> <!--- Removes presentational markup --->
<cfparam name="attributes.strongem" default="true" type="boolean"> <!--- Replaces <b> - <strong> <i> - <em> --->

<cfscript>
attributes.html = Trim(attributes.html);

// Wrap the HTML in an input stream for JTidy.
// Note: getBytes() with no argument uses the platform default charset.
readBufferObj = CreateObject("java", "java.lang.String");
readBufferObj2 = readBufferObj.init(attributes.html);
readBuffer = readBufferObj2.getBytes();
sourceObj = CreateObject("java", "java.io.ByteArrayInputStream");
source = sourceObj.init(readBuffer);

// Output stream that collects the cleaned markup
resultObj = CreateObject("java", "java.io.ByteArrayOutputStream");
result = resultObj.init();

jTidy = CreateObject("java", "org.w3c.tidy.Tidy");

// Tidy settings
jTidy.setDocType(attributes.docType);
jTidy.setIndentAttributes(true);
jTidy.setIndentContent(true);
jTidy.setLogicalEmphasis(attributes.strongem);
jTidy.setMakeClean(attributes.clean);
jTidy.setQuiet(false);
jTidy.setQuoteAmpersand(true);
jTidy.setSmartIndent(false);
jTidy.setTidyMark(false);
jTidy.setWord2000(attributes.word);
jTidy.setWraplen(1024);
jTidy.setXHTML(attributes.xhtml);
jTidy.setPrintBodyOnly(attributes.bodyOnly);

jTidy.parse(source, result); // Parse the HTML

outStr = result.toString(); // Convert ByteStream into a normal string (platform default charset)
// Hand the cleaned markup back to the calling template
SetVariable("caller.#attributes.output#", outStr);
</cfscript>
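
One thing to watch in the stream plumbing above: `getBytes()` and `toString()` with no arguments both use the platform default charset, so non-ASCII characters may not survive on every server. Below is a minimal Java sketch of the same String to bytes to stream to String round trip with UTF-8 named explicitly. `StreamRoundTrip` and `passThrough` are hypothetical names, and `passThrough` only copies bytes as a stand-in for the `jTidy.parse(source, result)` call. JTidy's own encoding settings differ between versions and are not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the byte-stream round trip with an explicit charset.
class StreamRoundTrip {
    // Stand-in for jTidy.parse(source, result): copies the bytes through unchanged.
    static void passThrough(ByteArrayInputStream in, ByteArrayOutputStream out) {
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            out.write(buf, 0, n);
        }
    }

    static String roundTrip(String html) {
        // Encode with a named charset instead of the platform default.
        byte[] bytes = html.getBytes(StandardCharsets.UTF_8);
        ByteArrayInputStream source = new ByteArrayInputStream(bytes);
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        passThrough(source, result);
        // Decode with the same charset used for encoding.
        return new String(result.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String html = "<p>caf\u00e9 \u20ac5</p>";
        System.out.println(html.equals(roundTrip(html)));
    }
}
```

In the ColdFusion version the equivalent change would be passing a charset name to `getBytes` and `toString`, together with whatever encoding option the installed JTidy version offers.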
