Nov 27

You can try HTML Tidy.
It may not work, depending on the quality of the HTML, but it is worth a try.

written by objects

Apr 19

The DOM Load and Save API provides a means of serializing XML data. The following example shows how to serialize an XML DOM document and produce ‘pretty’ indented output.

// First load your xml into a DOM
// (loading is covered in another answer)
// then check if DOM Load and Save is supported
DOMImplementationLS DOMiLS = null;
if ((doc.getFeature("Core", "3.0") != null)
	&& (doc.getFeature("LS", "3.0") != null)) {
	// It is supported so grab the available implementation
	DOMiLS = (DOMImplementationLS) (doc.getImplementation())
		.getFeature("LS", "3.0");
} else {
	throw new RuntimeException("DOM Load and Save unsupported");
}

// Next create your LS output destination
LSOutput lso = DOMiLS.createLSOutput();

// create a stream to write the resulting xml to
// we'll use a file in this example
OutputStream out = new FileOutputStream(outFile);

// attach the stream to the LS output, otherwise the serializer has nowhere to write
lso.setByteStream(out);

// create a LS serializer
// and tell it to make the output 'pretty'
LSSerializer lss = DOMiLS.createLSSerializer();
lss.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);

// finally serialize the xml to your output stream
// write() returns true if the document was serialized successfully
boolean result = lss.write(doc, lso);
out.close();
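
For reference, here is a minimal self-contained sketch of the same steps that writes to a String instead of a file. The class name PrettyPrintSketch and the helper prettyPrint are made up for illustration, and it assumes the JDK's bundled parser supports DOM Load and Save and the "format-pretty-print" parameter. The exact indentation and the encoding named in the XML declaration depend on the implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

public class PrettyPrintSketch {

    // Hypothetical helper: parses an XML string and returns it
    // re-serialized with pretty printing requested.
    public static String prettyPrint(String xml) throws Exception {
        // Load the xml into a DOM
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Check that DOM Load and Save is available
        Object feature = doc.getImplementation().getFeature("LS", "3.0");
        if (!(feature instanceof DOMImplementationLS)) {
            throw new IllegalStateException("DOM Load and Save unsupported");
        }
        DOMImplementationLS ls = (DOMImplementationLS) feature;

        // Ask for pretty output only if the serializer accepts the parameter
        LSSerializer serializer = ls.createLSSerializer();
        if (serializer.getDomConfig().canSetParameter("format-pretty-print", Boolean.TRUE)) {
            serializer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
        }

        // Point the LS output at an in-memory writer and serialize
        StringWriter writer = new StringWriter();
        LSOutput output = ls.createLSOutput();
        output.setCharacterStream(writer);
        serializer.write(doc, output);
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(prettyPrint("<a><b>text</b><c/></a>"));
    }
}
```

Checking canSetParameter first means an implementation without pretty printing still serializes the document, just without the indentation.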


written by objects

Apr 02

Inserting a new root node into an existing DOM Document involves creating a new DOM Document with the required new root node, and then copying the existing DOM Document into the new root.

The following code snippet shows an outline of the code involved.

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = dbf.newDocumentBuilder();
Document existingdoc = builder.parse(file);

// Create an empty document

Document doc = builder.newDocument();

// Add the new root node

Element root = doc.createElement("Objects");
doc.appendChild(root);

// Add a copy of the nodes from existing document
// importNode only creates the copy, it still has to be attached to the new root

Node copy = doc.importNode(existingdoc.getDocumentElement(), true);
root.appendChild(copy);
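
The outline above can be packaged as a small runnable sketch. The class name NewRootSketch, the helpers wrapInNewRoot and parse, and the sample input are all hypothetical and only there to make the example self-contained. The existing document is left unchanged because importNode works on a copy.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class NewRootSketch {

    // Hypothetical helper: returns a new Document whose root element is
    // rootName and whose only child is a deep copy of the existing root.
    public static Document wrapInNewRoot(Document existing, String rootName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // Create an empty document and attach the new root node
        Document doc = builder.newDocument();
        Element root = doc.createElement(rootName);
        doc.appendChild(root);

        // Deep-copy the old root into the new document, then attach the copy
        Node copy = doc.importNode(existing.getDocumentElement(), true);
        root.appendChild(copy);
        return doc;
    }

    // Hypothetical helper: parses an XML string into a Document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document existing = parse("<item id=\"1\"/>");
        Document wrapped = wrapInNewRoot(existing, "Objects");
        // prints "Objects"
        System.out.println(wrapped.getDocumentElement().getNodeName());
        // prints "item"
        System.out.println(wrapped.getDocumentElement().getFirstChild().getNodeName());
    }
}
```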

written by objects