Greg's Blog

helping me remember what I figure out

jTidy With No Temp Files


Well, I had been hoping that someone would come up with a neater way of using jTidy, and Mark Woods did (check out his CMS if you have a moment). He dropped me a line today outlining how he built a custom tag that calls a class. Based on his tag, I modified my function to dispense with the temporary files and use a ByteArrayInputStream and a ByteArrayOutputStream instead. All I then needed to do was find an example of using ByteArrayInputStream and ByteArrayOutputStream, and the code was complete.
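For anyone who wants to see the stream idea on its own, here is a minimal Java sketch of it. The class name InMemoryStreams and the process method are made up for illustration; process only copies bytes and stands in for a stream-based parser such as jTidy, so treat this as a sketch of the technique and not the code Mark sent me.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class InMemoryStreams {

    // Hypothetical stand-in for a stream-based parser such as jTidy:
    // it simply copies every byte from the input stream to the output stream.
    static void process(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
    }

    // Runs a string through the stream-based processor without touching disk.
    static String runInMemory(String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        ByteArrayInputStream in = new ByteArrayInputStream(bytes);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        process(in, out);
        // The output stream is not a String, so convert it explicitly.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(runInMemory("<p>hello</p>"));
    }
}
```

The point is that the parser only ever sees an InputStream and an OutputStream, so it does not care whether they are backed by files or by byte arrays in memory.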
UPDATE
There was a slight problem with BlueDragon and the ByteArrayOutputStream. Andrew Wu from NewAtlanta found the problem: apparently "…outx is a ByteArrayOutputStream which BlueDragon doesn’t automatically treat as a String". So simply adding outstr = outx.toString(); before stripping the output of its HTML header resolved the problem. Thanks Andrew! The code has been updated accordingly.

<cffunction name="makexHTMLValid" displayname="Tidy parser" hint="Takes a string as an argument and returns parsed and valid xHTML" output="true">
    <cfargument name="strToParse" required="true" type="string" default="" />
    <cfscript>
    /**
     * This function reads in a string, checks and corrects any invalid HTML.
     * By Greg Stewart
     *
     * @param strToParse The string to parse
     * @return returnPart
     * @author Greg Stewart (gregs(at)tcias.co.uk)
     * @version 1, August 22, 2004
     * @version 1.1, September 09, 2004
     * with the help of Mark Woods this UDF no longer requires temp files and only accepts
     * the string to parse
     */
    var returnPart = ""; // return variable
    parseData = trim(arguments.strToParse);

    // jTidy part
    // BD free version
    pathToTidy = "/usr/local/NewAtlanta/BlueDragon_Server_61/lib/ext/Tidy.jar";
    // Create an instance of java.net.URL for passing to the URLClassLoader
    URLObject = createObject('java','java.net.URL');
    // Initialize the object with the jar file
    URLObject.init("file:" & pathToTidy);
    // Create an Array and add our URLObject to it
    arr[1] = urlobject;
    // Create the URLClassLoader and pass it the array containing our path
    loader = createObject('java','java.net.URLClassLoader');
    loader.init(arr);
    // Use our new class loader to load the Tidy class
    jTidy = loader.loadClass("org.w3c.tidy.Tidy").newInstance();

    // CFMX/J2EE
    // jTidy = createObject("java","org.w3c.tidy.Tidy");

    jTidy.setQuiet(false);
    jTidy.setIndentContent(true);
    jTidy.setSmartIndent(true);
    jTidy.setIndentAttributes(true);
    jTidy.setWraplen(1024);
    jTidy.setXHTML(true);

    // create the in and out streams for jTidy
    readBuffer = CreateObject("java","java.lang.String").init(parseData).getBytes();
    inP = createobject("java","java.io.ByteArrayInputStream").init(readBuffer);
    // ByteArrayOutputStream
    outx = createObject("java", "java.io.ByteArrayOutputStream").init();

    // do the parsing
    jTidy.parse(inP,outx);

    // close the stream
    // outx.close();
    // convert the stream to a string (BlueDragon does not do this automatically)
    outstr = outx.toString();

    // ok now strip all the header/body stuff
    startPos = REFind("<body>", outstr)+6;
    endPos = REFind("</body>", outstr);
    returnPart = Mid(outstr, startPos, endPos-startPos);
    </cfscript>
    <cfreturn returnPart />
</cffunction>
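
The last step of the function, cutting out everything outside the body tags, assumes both tags are always found. As a rough sketch of the same step with a guard for missing tags, here it is in plain Java. The class name BodyExtractor and the fall-back of returning the input unchanged are my own assumptions, not something jTidy or the function above does.

```java
public class BodyExtractor {

    // Returns the text between <body> and </body>.
    // Assumption: if either tag is missing, the input is returned unchanged.
    // Like the function above, this only matches a bare <body> tag without attributes.
    static String extractBody(String html) {
        String open = "<body>";
        int start = html.indexOf(open);
        int end = html.indexOf("</body>");
        if (start < 0 || end < start) {
            return html;
        }
        return html.substring(start + open.length(), end);
    }

    public static void main(String[] args) {
        System.out.println(extractBody("<html><body><p>hello</p></body></html>"));
    }
}
```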