Well, I had been hoping that someone would come up with a neater way of using jTidy, and
Mark Woods did (check out
his CMS if you have a moment). He dropped me a line today outlining how he built a custom tag that calls a class. Based on his tag, I modified my function to dispense with the temporary files and use a ByteArrayInputStream and a ByteArrayOutputStream instead.
All I then needed to do was find an example of using
ByteArrayInputStream and
ByteArrayOutputStream, and the code was complete.
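For anyone who has not used these two classes before, here is a minimal Java sketch of the in-memory pattern the function relies on: turn the string into bytes, wrap them in a ByteArrayInputStream, let something write to a ByteArrayOutputStream, and turn the result back into a string. The class name, the method names, and the `process` step are made up for illustration; `process` is only a stand-in for the jTidy call and simply copies its input. The explicit UTF-8 charset is also an assumption, since the function below uses the platform default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class StreamRoundTrip {
    // Hypothetical stand-in for the parser call: copies the input to the output unchanged.
    static void process(ByteArrayInputStream in, ByteArrayOutputStream out) {
        byte[] buf = new byte[1024];
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            out.write(buf, 0, n);
        }
    }

    // String -> bytes -> input stream -> (process) -> output stream -> String
    static String roundTrip(String markup) {
        byte[] bytes = markup.getBytes(StandardCharsets.UTF_8);
        ByteArrayInputStream in = new ByteArrayInputStream(bytes);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        process(in, out);
        // The output stream is not a String; it has to be converted explicitly.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Nothing touches the file system here, which is the whole point of dropping the temporary files.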
UPDATE
There was a slight problem with BlueDragon and the ByteArrayOutputStream. Andrew Wu from
NewAtlanta found the cause:
…outx is a ByteArrayOutputStream which BlueDragon doesn’t automatically treat as a String.
So simply adding
outstr = outx.toString(); before stripping the HTML header from the output resolved the problem. Thanks, Andrew! The code has been updated accordingly.
<cffunction name="makexHTMLValid" displayname="Tidy parser" hint="Takes a string as an argument and returns parsed and valid xHTML" output="true">
<cfargument name="strToParse" required="true" type="string" default="" />
<cfscript>
/**
* This function reads in a string, checks and corrects any invalid HTML.
* By Greg Stewart
*
* @param strToParse The string to parse.
* @return returnPart
* @author Greg Stewart (gregs(at)tcias.co.uk)
* @version 1, August 22, 2004
* @version 1.1, September 09, 2004
* with the help of Mark Woods this UDF no longer requires temp files and only accepts
* the string to parse
*/
var returnPart = ""; // return variable
parseData = trim(arguments.strToParse);
// jTidy part
// BD free version
pathToTidy = "/usr/local/NewAtlanta/BlueDragon_Server_61/lib/ext/Tidy.jar";
// Create an instance of java.net.URL for passing to the URLClassLoader
URLObject = createObject('java','java.net.URL');
// Initialize the object with the jar file
URLObject.init("file:" & pathToTidy);
// Create an Array and add our URLObject to it
arr[1] = urlobject;
// Create the URLClassLoader and pass it the array containing our path
loader = createObject('java','java.net.URLClassLoader');
loader.init(arr);
// Use our new class loader to load the Tidy class
jTidy = loader.loadClass("org.w3c.tidy.Tidy").newInstance();
// CFMX/J2EE
// jTidy = createObject("java","org.w3c.tidy.Tidy");
jTidy.setQuiet(false);
jTidy.setIndentContent(true);
jTidy.setSmartIndent(true);
jTidy.setIndentAttributes(true);
jTidy.setWraplen(1024);
jTidy.setXHTML(true);
// create the in and out streams for jTidy
readBuffer = CreateObject("java","java.lang.String").init(parseData).getBytes();
inP = createobject("java","java.io.ByteArrayInputStream").init(readBuffer);
//ByteArrayOutputStream
outx = createObject("java", "java.io.ByteArrayOutputStream").init();
// do the parsing
jTidy.parse(inP,outx);
// close the stream
// outx.close();
outstr = outx.toString();
// ok now strip all the header/body stuff
startPos = REFind("<body>", outstr)+6;
endPos = REFind("</body>", outstr);
returnPart = Mid(outstr, startPos, endPos-startPos);
</cfscript>
<cfreturn returnPart />
</cffunction>
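The last step of the function, cutting everything outside the body element away, can be sketched in plain Java as below. This is an illustration rather than part of the UDF: the class and method names are hypothetical, and it assumes the parser emits a bare `<body>` tag with no attributes, just as the function does. Unlike the function, the sketch falls back to returning the input untouched when either tag is missing.

```java
class BodyExtractor {
    // Returns the markup between <body> and </body>, or the input itself if either tag is absent.
    static String bodyContent(String html) {
        String open = "<body>";
        String close = "</body>";
        int start = html.indexOf(open);
        int end = html.indexOf(close);
        if (start < 0 || end < 0 || end < start) {
            return html; // fallback added for the sketch; the UDF has no such guard
        }
        return html.substring(start + open.length(), end);
    }
}
```

The `+ open.length()` plays the same role as the `+6` in the function: it moves the start position past the opening tag.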