CDATA magically escaped
scala> val xml = <xml><test><![CDATA[a < b]]></test></xml>
xml: scala.xml.Elem = <xml><test>a &lt; b</test></xml>    <-- WTF?

The same happens when loading from a String:

scala> val xml = XML.loadString("<xml><test><![CDATA[a < b]]></test></xml>")
xml: scala.xml.Elem = <xml><test>a &lt; b</test></xml>

This is not what you want. The contents of a CDATA section are meant to be left alone. Instead, the CDATA wrapper is eaten and its contents are magically escaped. This causes lots of grief if the contents of the CDATA are JavaScript, for example.
One workaround is to use the built-in ConstructingParser to load XML.
scala> val xml2 = ConstructingParser.fromSource(Source.fromString("<xml><test><![CDATA[a < b]]></test></xml>"), preserveWS = true).document.docElem
xml2: scala.xml.Node = <xml><test><![CDATA[a < b]]></test></xml>

Looks good.
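Outside the REPL, the workaround needs a couple of imports. Here is a minimal sketch, assuming the scala-xml module is on the classpath; `loadPreserving` is a hypothetical helper name, not part of the library.

```scala
import scala.io.Source
import scala.xml.Node
import scala.xml.parsing.ConstructingParser

// Hypothetical helper: parse a string with ConstructingParser so that
// CDATA sections and comments survive the round trip.
def loadPreserving(input: String): Node =
  ConstructingParser
    .fromSource(Source.fromString(input), preserveWS = true)
    .document()
    .docElem

val kept = loadPreserving("<xml><test><![CDATA[a < b]]></test></xml>")
println(kept)
```

Note that `preserveWS = true` also keeps whitespace-only text nodes, which matters if you later pattern match on child nodes.</xml>")
assert(kept.label == "xml")
</test>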
You can also wrap the content in <xml:unparsed>, which writes it out verbatim instead of escaping it. Check out this Scala XML FAQ for more.
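A small sketch of the <xml:unparsed> route, assuming the scala-xml module is on the classpath and XML literals are available in your compiler. `scala.xml.Unparsed` is the node class behind the literal form; the element name `test` is just reused from the examples above.

```scala
import scala.xml.Unparsed

// An Unparsed node is written out verbatim, so "<" is not turned into "&lt;".
val viaNode = <test>{Unparsed("a < b")}</test>

// The literal form: the compiler reads the body as raw text.
val viaLiteral = <test><xml:unparsed>a < b</xml:unparsed></test>

println(viaNode) // <test>a < b</test>
println(viaLiteral)
```

Because nothing is escaped, it is up to you to make sure the result is still well-formed XML if another parser has to read it.")
assert(viaLiteral.toString == "<test>a < b</test>")
</test>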
XML comments eaten
When loading XML from a string, XML comments disappear. Example:

scala> val looksGood = <xml><test><!-- comment --></test></xml>
looksGood: scala.xml.Elem = <xml><test><!-- comment --></test></xml>

scala> val wtf = XML.loadString("<xml><test><!-- comment --></test></xml>")
wtf: scala.xml.Elem = <xml><test></test></xml>

Again, ConstructingParser can fix this:
scala> val correct = ConstructingParser.fromSource(Source.fromString("<xml><test><!-- comment --></test></xml>"), preserveWS = true).document.docElem
correct: scala.xml.Node = <xml><test><!-- comment --></test></xml>

There are some alternatives if you run into these issues:
- As described above, use scala.xml.parsing.ConstructingParser to load XML
- Use the Lift web framework's PCDataMarkupParser (extends Scala's built-in MarkupParser with various improvements)
- Daniel Spiewak's Anti-XML project looks promising
- Use any of the million Java XML parsers that are out there (but give up the convenient scala.xml syntax)
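As an illustration of the last option, here is a rough sketch that uses the JDK's built-in DOM parser from Scala. The choice of parser is my assumption (any Java XML parser would do), and the input string simply combines the two examples from above.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.{Node => DomNode}
import org.xml.sax.InputSource

val factory = DocumentBuilderFactory.newInstance()
factory.setCoalescing(false)       // keep CDATA sections as their own nodes
factory.setIgnoringComments(false) // keep comments in the tree

val input = "<xml><test><![CDATA[a < b]]><!-- comment --></test></xml>"
val doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(input)))

// Children of <test>: first the CDATA section, then the comment.
val children = doc.getDocumentElement.getFirstChild.getChildNodes
println(children.item(0).getNodeType == DomNode.CDATA_SECTION_NODE) // true
println(children.item(1).getNodeType == DomNode.COMMENT_NODE)       // true
```

The price is the one mentioned in the list: you get org.w3c.dom nodes back, so XML literals and the scala.xml API no longer apply to the result.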