Preparing HTML data for use in a Tridion Rich Text Format area
I recently had to create some Tridion components from code via the core service. The incoming data was in the form of HTML, and not XML in the XHTML namespace, which is what is required for a Tridion RTF area. I'd also had to do some preparatory clean-up of the data, and by the time I wanted to fix up the namespaces, I already had the input data in an XLinq XElement.
These days, if I'm processing XML in .NET, I'm quite likely to use XLinq. It's taken me a while to get comfortable with some of its idioms. The technique I ended up using is similar to the classic approach we typically adopt in XSLT, starting with an identity transform and making a couple of minor tweaks to the data as it goes through.
So, mostly by way of a "note to self", here's how it looks in XLinq. All you need to do is pass in your XElement containing your XHTML, and it will rip through all the elements and put them in the XHTML namespace, leaving all the attributes and other nodes untouched.
public XNode PutHtmlElementsInXhtmlNamespace(XNode input){
XNamespace xhtmlNs = "http://www.w3.org/1999/xhtml"; var element = input as XElement; if (element != null) { XName name = xhtmlNs + element.Name.LocalName; return new XElement(name,element.Attributes(), element.Nodes().Select(n => PutHtmlElementsInXhtmlNamespace(n))); } return input; }
In this way you can easily create data that's suitable for use in an RTF. Piecing the rest of a Content element together with XElement is pretty easy too, or of course, you can use the venerable Fields class for the rest.