I've been playing around and have a lovely working version of
http://users.rcn.com/creitzel/tidy.html#cplusplus.
It just winds me up that ColdFusion (and Java in this case) lack features
like this that would be so useful. I already use JTidy to process any
user-supplied HTML so I can clean it up and make it valid. It would be one of
those "killer features" to have a web application that checks a page for
accessibility errors when you update it and then uses the output from Tidy to
give the user a summary of the errors.
If anyone is interested in such a feature and wants to try getting something
working, I'm here to lend what help I can.
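For anyone who wants to experiment, a rough sketch of the error-summary idea
might look like the following. It assumes JTidy is on the ColdFusion classpath
and that your JTidy build exposes setErrout(), getParseErrors() and
getParseWarnings(). The sample markup and the variable names are made up for
illustration. Whether the messages cover accessibility problems depends on the
Tidy build, so treat this as a starting point only.

```cfm
<cfscript>
// Hypothetical input; in practice this would be the user-supplied markup
html = "<p>Unclosed paragraph<img src='logo.gif'>";

// Input stream for Tidy to read (platform default charset)
htmlStr = CreateObject("java", "java.lang.String").init(html);
source = CreateObject("java", "java.io.ByteArrayInputStream").init(htmlStr.getBytes());
// Output stream that receives the cleaned markup
result = CreateObject("java", "java.io.ByteArrayOutputStream").init();

// Tidy writes its diagnostics to a PrintWriter; back it with a
// StringWriter so the messages can be read as a string afterwards
messages = CreateObject("java", "java.io.StringWriter").init();
errout = CreateObject("java", "java.io.PrintWriter").init(messages);

jTidy = CreateObject("java", "org.w3c.tidy.Tidy");
jTidy.setErrout(errout);
jTidy.parse(source, result);
errout.flush(); // Make sure every message has reached the StringWriter

// Counts plus the raw message text Tidy produced
summary = "Errors: " & jTidy.getParseErrors() & ", warnings: " & jTidy.getParseWarnings();
report = messages.toString();
</cfscript>
<cfoutput>
<p>#summary#</p>
<pre>#HTMLEditFormat(report)#</pre>
</cfoutput>
```

The report text is escaped with HTMLEditFormat() because Tidy's messages quote
tag names from the input.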
<!--- (R) HTML to clean --->
<cfparam name="attributes.html" default="" type="string">
<!--- (R) Output variable --->
<cfparam name="attributes.output" default="tidy" type="string">
<!--- Removes <html><body> tags --->
<cfparam name="attributes.bodyOnly" default="true" type="boolean">
<!--- Output valid XHTML --->
<cfparam name="attributes.xhtml" default="true" type="boolean">
<!--- [omit|auto|strict|loose] Or custom e.g. "-//ACME//DTD HTML 3.4748*//EN"
<- must include double quotes. --->
<cfparam name="attributes.docType" default="omit" type="string">
<!--- Attempts to clean Word HTML --->
<cfparam name="attributes.word" default="true" type="boolean">
<!--- Removes presentational markup --->
<cfparam name="attributes.clean" default="false" type="boolean">
<!--- Replaces <b> with <strong> and <i> with <em> --->
<cfparam name="attributes.strongem" default="true" type="boolean">
<cfscript>
// Wrap the trimmed HTML in an input stream for Tidy to read
attributes.html = Trim(attributes.html);
readBufferObj = CreateObject("java", "java.lang.String");
readBufferObj2 = readBufferObj.init(attributes.html);
readBuffer = readBufferObj2.getBytes(); // Uses the platform default charset
sourceObj = CreateObject("java", "java.io.ByteArrayInputStream");
source = sourceObj.init(readBuffer);
// Output stream that receives the cleaned markup
resultObj = CreateObject("java", "java.io.ByteArrayOutputStream");
result = resultObj.init();
jTidy = CreateObject("java", "org.w3c.tidy.Tidy");
// Tidy settings
jTidy.setDocType(attributes.docType);
jTidy.setIndentAttributes(true);
jTidy.setIndentContent(true);
jTidy.setLogicalEmphasis(attributes.strongem);
jTidy.setMakeClean(attributes.clean);
jTidy.setQuiet(false);
jTidy.setQuoteAmpersand(true);
jTidy.setSmartIndent(false);
jTidy.setTidyMark(false);
jTidy.setWord2000(attributes.word);
jTidy.setWraplen(1024);
jTidy.setXHTML(attributes.xhtml);
jTidy.setPrintBodyOnly(attributes.bodyOnly);
jTidy.parse(source, result); // Parse the HTML
outStr = result.toString(); // Convert ByteStream into a normal string
SetVariable("caller.#attributes.output#", outStr);
</cfscript>
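A call to the tag might look something like this. It assumes the code above is
saved as tidy.cfm, either next to the calling template or in the custom tags
path; that file name and the sample variables are only for illustration.

```cfm
<!--- Hypothetical input with unclosed and presentational tags --->
<cfset dirty = "<p><b>Hello</b> <i>world">
<!--- The cleaned markup is written to caller.cleaned --->
<cf_tidy html="#dirty#" output="cleaned">
<cfoutput>#HTMLEditFormat(cleaned)#</cfoutput>
```

The output attribute names the variable that the tag sets in the calling page,
so leaving it out puts the result in a variable called tidy.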