Jericho HTML Parser - Jericho HTML Parser is a simple but powerful Java library for analysing and manipulating parts of an HTML document, including some common server-side tags, while reproducing verbatim any unrecognised or invalid HTML. It also provides high-level HTML form manipulation functions.
commons-jelly-tags-html - These Jelly tags can scrub common errors in HTML syntax.
Daisy html cleaner -
cocoon-html -
aptconvert -
jtidy -
Neko HTML -
nekohtmlXni -
nekohtmlSamples -