Dominic Cronin's weblog
Using powershell to do useful things with XML lists from Tridion
For a while now I've been trying to persuade anyone that would listen that Windows Powershell is great for interacting with the Tridion object model (TOM). What I mean by this is that you can easily use the powershell to instantiate and use COM objects, and more specifically, TOM objects. In this post, I'm going to take it a bit further, and show how you can use the powershell's XML processing facilities to easily process the lists that are available from the TOM as XML Documents. The example I'm going to use is a script to forcibly finish all the workflow process instances in a Tridion CMS. (This is quite useful if you are doing workflow development, as you can't upload a new version of a workflow while there are process instances that still need to be finished.)
Although useful, the example itself isn't so important. I'm simply using it to demonstrate how to process lists. Tridion offers several API calls that will return a list, and in general, the XML looks very similar. I'm going to aim to finish all my process instances as a "one-liner", although I'm immediately going to cheat by setting up the necessary utility objects as shell variables:
> $tdse = new-object -com TDS.TDSE
> $wfe = $tdse.GetWFE()
As you can see, I'm using the new-object cmdlet to get a TDSE object, specifying that it is a COM object (by default new-object assumes you want a .NET object). Then I'm using $tdse to get the WFE object which offers methods that list workflow items. With these two variables in place, I can attempt my one-liner. Here goes:
> ([xml]$wfe.GetListProcessInstances()).ListWFProcessInstances.Item | % {$tdse.GetObject($_.ID,2)} | % {$_.FinishProcess()}
Well, suffice it to say that this works, and once you've run it (assuming you are an admin), you won't have any process instances, but perhaps we need to break it down a bit....
If you start off with just $wfe.GetListProcessInstances(), the powershell will invoke the method for you, and return the XML as a string, which is what GetListProcessInstances returns. Just like this:
> $wfe.GetListProcessInstances()
<?xml version="1.0"?>
<tcm:ListWFProcessInstances xmlns:tcm="http://www.tridion.com/ContentManager/5.0" xmlns:xlink="http://www.w3.org/1999/xlink"><tcm:Item ID="tcm:24-269-131076" PublicationTitle="300 Global Content (NL)" TCMItem="tcm:24-363" Title="Test 1" TCMItemType="16" ProcessDefinitionTitle="Application Content Approval" ApprovalStatus="Unapproved" ActivityDefinitionType="1" WorkItem="tcm:24-537-131200" CreationDate="2010-12-30T19:35:33" State="Started" Icon="T16L1P0" Allow="41955328" Deny="16777216"/><tcm:Item ID="tcm:24-270-131076" PublicationTitle="300 Global Content (NL)" TCMItem="tcm:24-570" Title="Test 2" TCMItemType="16" ProcessDefinitionTitle="Application Content Approval" ApprovalStatus="Unapproved" ActivityDefinitionType="1" WorkItem="tcm:24-538-131200" CreationDate="2010-12-30T19:36:04" State="Started" Icon="T16L1P0" Allow="41955328" Deny="16777216"/></tcm:ListWFProcessInstances>
OK - that's great - if you dig into it, you'll see that there is a containing element called ListWFProcessInstances, and that what it contains are some Item elements. All of this is in the tcm namespace, and each Item has various attributes. Unfortunately, the XML in this form is ugly and not particularly useful. Fortunately, the powershell has some built-in features that help quite a lot with this. The first is that if you use the [xml] cast operator, the string is transformed into a System.Xml.XmlDocument. To test this, just assign the result of the cast to a variable and use the get-member cmdlet to display its type and methods:
> $xml = [xml]$wfe.GetListProcessInstances()
> $xml | gm
(Of course, you don't type "get-member". "gm" is sufficient - most standard powershell cmdlets have consistent and memorable short aliases.)
I won't show the output here, as it fills the screen, but at the top, the type is listed, and then you see the API of System.Xml.XmlDocument. (Actually you don't need a variable here, but it's nice to have a go and use some of the API methods.)
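If you don't have a Tridion system to hand, the cast can be tried on any XML string. The following is a minimal sketch, assuming a cut-down copy of the list shown earlier; the variable name $sample and the trimmed set of attributes are illustrative only.

```powershell
# A trimmed-down stand-in for the string that GetListProcessInstances() returns
$sample = @'
<?xml version="1.0"?>
<tcm:ListWFProcessInstances xmlns:tcm="http://www.tridion.com/ContentManager/5.0">
  <tcm:Item ID="tcm:24-269-131076" Title="Test 1" State="Started"/>
  <tcm:Item ID="tcm:24-270-131076" Title="Test 2" State="Started"/>
</tcm:ListWFProcessInstances>
'@

# The cast parses the string into a System.Xml.XmlDocument
$xml = [xml]$sample
$xml.GetType().FullName

# The ordinary XmlDocument API is available on the result
$xml.DocumentElement.LocalName
```

If the string is not well-formed XML, the cast throws an error rather than returning a document, which is a quick sanity check in its own right.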
All this would be pretty useful even if it stopped there, but it gets better. Because the powershell is intended as a scripting environment, the creators have wrapped an extra layer of goodness around XmlDocument. The assumption is that you probably want to extract some values without having to write XPaths, instantiate Node objects and all that other nonsense, so they let you access Elements and Attributes by magicking up collections of properties. Using the example above, I can simply type the names of the Elements and Attributes I want in a "dot-chain". For example:
> ([xml]$wfe.GetListProcessInstances()).ListWFProcessInstances.Item[0].ID
tcm:24-269-131076
Here you can also see that I'm referencing the first Item element in the collection and getting its ID attribute. The tcm ID is returned. All this is great for exploring the data interactively, but be warned, there is a fly in the ointment. Behind the scenes, the powershell uses its own variable called Item to represent the members of the collections it creates. This means that whereas you ought to be able to type
([xml]$wfe.GetListProcessInstances()).ListWFProcessInstances
and get some meaningful output, instead, you'll get an error saying:
format-default : The member "Item" is already present.
    + CategoryInfo          : NotSpecified: (:) [format-default], ExtendedTypeSystemException
    + FullyQualifiedErrorId : AlreadyPresentPSMemberInfoInternalCollectionAdd,Microsoft.PowerShell.Commands.FormatDefaultCommand
This is because Tridion's list XML uses "Item" for the element name, and it conflicts with powershell's own use of the name. It's an ugly bug in powershell, but fortunately it doesn't affect us much. Instead of saying "ListWFProcessInstances", just keep on typing and say "ListWFProcessInstances.Item" and you are back in the land of sanity.
Apart from this small annoyance, the powershell offers superb discoverability, so for example, it will give you tab completion so that you don't even have to know the name of ListWFProcessInstances. If at any point you are in doubt as to what to type next, just stop where you are and pipe the result into get-member - all will be revealed.
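To make the dot-chain and the get-member trick concrete, here is a small sketch that runs without Tridion. It assumes a trimmed sample list in a variable called $sample (an illustrative name), and it only ever displays the Item elements, never the containing element, so it stays clear of the "Item" formatting error.

```powershell
# A trimmed sample list; a real one comes from $wfe.GetListProcessInstances()
$sample = @'
<tcm:ListWFProcessInstances xmlns:tcm="http://www.tridion.com/ContentManager/5.0">
  <tcm:Item ID="tcm:24-269-131076" Title="Test 1" State="Started"/>
  <tcm:Item ID="tcm:24-270-131076" Title="Test 2" State="Started"/>
</tcm:ListWFProcessInstances>
'@
$items = ([xml]$sample).ListWFProcessInstances.Item

# Dot-chain down to a single attribute value
$firstId = $items[0].ID
$firstId

# Ask get-member which attributes an Item exposes as properties
$names = $items[0] | gm -MemberType Property | % { $_.Name }
$names

# Pick out a few attributes and show them as a table
$items | select ID, Title, State
```

The select at the end is handy when a list has a dozen attributes per Item and you only care about two or three of them.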
OK - back to the main plot. If you're with me this far, you have probably realised that
([xml]$wfe.GetListProcessInstances()).ListWFProcessInstances.Item
will get you a collection of objects representing the Item elements in the XML. As you probably know, an important feature of powershell is that you can pipeline collections of objects, and that there is syntax built in for processing them. The % character is used as shorthand for foreach, and within the foreach block (delimited by braces), the symbol $_ represents the current item in the iteration. For example, we could write:
> ([xml]$wfe.GetListProcessInstances()).ListWFProcessInstances.Item | % {$_.ID}
and get the output:
tcm:24-269-131076
tcm:24-270-131076
I'm sure you can see where this is going. We need to transform the collection of XML attribute values (the IDs of the process instances) into a collection of TOM objects, so with a small alteration in the body of the foreach block, we have
% {$tdse.GetObject($_.ID,2)}
and then we can pipe the resulting collection of TOM objects into a foreach block which invokes the FinishProcess() method:
% {$_.FinishProcess()}
Of course, if you like really terse one-liners, you could amalgamate the last two pipeline elements so that instead of:
> ([xml]$wfe.GetListProcessInstances()).ListWFProcessInstances.Item | % {$tdse.GetObject($_.ID,2)} | % {$_.FinishProcess()}
we get:
> ([xml]$wfe.GetListProcessInstances()).ListWFProcessInstances.Item | % {$tdse.GetObject($_.ID,2).FinishProcess()}
but in practice, you develop these one-liners by exploration, and if you want something really terse, you are more likely to write a more long-hand version, put it in your $profile, and give it an alias.
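A long-hand version for a $profile might look something like the sketch below. The function name Complete-ProcessInstances and the alias fpi are made up for illustration, and passing the TDSE and WFE objects in as parameters is a design choice of the sketch (it lets you try the function against stand-in objects), not something the one-liner requires.

```powershell
# Hypothetical long-hand version of the one-liner; the function name and alias are invented
function Complete-ProcessInstances {
    param (
        $tdse = $(new-object -com TDS.TDSE),  # only created if you don't pass one in
        $wfe  = $($tdse.GetWFE())
    )
    $list = [xml]$wfe.GetListProcessInstances()
    # @() copes with a list of one Item as well as many
    foreach ($item in @($list.ListWFProcessInstances.Item)) {
        # Skip anything that doesn't carry an ID, for example when the list is empty
        if ($item.ID) {
            $tdse.GetObject($item.ID, 2).FinishProcess()
        }
    }
}

# A short alias to save typing
set-alias fpi Complete-ProcessInstances
```

With that in the profile, typing fpi on a machine with the TOM installed should do the same job as the one-liner.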
As I said at the top - this is just an example. All the TOM functions that return XML lists can be treated in a similar manner. Generally all that changes is the name of the root element of the XML document, and as I have pointed out, this is easily discoverable.
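If you would rather not care what the root element is called at all, one alternative is to go through the document element instead of the dot-chain. This is a sketch of a different technique from the one used above; the helper name Get-ListItem is invented, and the ListItems root in the sample is only there to show that the name doesn't matter.

```powershell
# Hypothetical helper: returns the child elements of whatever root the list has
function Get-ListItem([string] $listXml) {
    $doc = [xml]$listXml
    # DocumentElement is the root, whatever its name; keep only its element children
    $doc.DocumentElement.ChildNodes | ? { $_.NodeType -eq 'Element' }
}

# A sample list with a different root element name
$other = '<tcm:ListItems xmlns:tcm="http://www.tridion.com/ContentManager/5.0"><tcm:Item ID="tcm:24-363" Title="Test 1"/><tcm:Item ID="tcm:24-570" Title="Test 2"/></tcm:ListItems>'
$titles = Get-ListItem $other | % { $_.Title }
$titles
```

The attributes are still available as properties on each element, so the rest of the pipeline looks exactly as before.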
I hope this approach proves useful to you. If you have any examples of good applications, please let me know in the comments.
A Happy New Year to you all.
Dominic
Tweeting from powershell
This evening, instead of hanging out at the Microsoft Dev Days Geek Night, I drove home, put the kids to bed, and sat down to figure out how to update my Twitter status from the Windows Powershell. This was inspired by the bash one-liner using curl that I learned about from Peteris Krumins’ blog (recommended). Well, it turns out not to be a one-liner in powershell, but FWIW - here's how!
function tweet([string] $status)
{
    # http://blogs.msdn.com/shitals/archive/2008/12/27/9254245.aspx
    [System.Net.ServicePointManager]::Expect100Continue = $false
    try
    {
        $wc = new-object System.Net.WebClient
        $wc.BaseAddress = "http://twitter.com"
        $wc.Credentials = new-object System.Net.NetworkCredential
        $wc.Credentials.UserName = "Your account name"
        $wc.Credentials.Password = "password"
        $stream = $wc.OpenWrite("statuses/update.xml")
        $writer = new-object System.IO.StreamWriter -ArgumentList $stream
        # Escape the status so that characters such as & and = survive the form post
        $writer.Write("status=" + [System.Uri]::EscapeDataString($status))
    }
    finally
    {
        # Guard against nulls in case one of the objects was never created
        if ($writer) { $writer.Dispose() }
        if ($stream) { $stream.Dispose() }
        if ($wc) { $wc.Dispose() }
    }
}
XML Schema validation from Powershell - and how to keep your Tridion content delivery system neat and tidy
I don't know exactly when it was that Tridion started shipping the XML Schema files for the content delivery configuration files. For what it's worth, I only really became aware of it within the last few months. In that short time, schema validation has saved my ass at least twice when configuring a Tridion Content Delivery system. What's not to like? Never mind "What's not to like?" - I'll go further. Now that the guys over at Tridion have gone to the trouble of including these files as release assets - it is positively rude of you not to validate your config files.
Being a well-mannered kind of guy, I figured that I'd like to validate my configuration files not just once, but repeatedly. All the time, in fact. Whenever I make a change. The trouble is that the typical server where you find these things isn't loaded down with tools like XML Spy. The last time I validated a config file, it involved copying the offending article over to a file share, and then emailing it to myself on another machine. Not good. Not easy. Not very repeatable.
But enter our new hero, Windows Server 2008 - these days the deployment platform of choice if you want to run Tridion Content Delivery on a Windows box. And fully loaded for bear. At least the kind of bears you can hunt using powershell. Now that I can just reach out with powershell and grab useful bits of the .NET framework, I don't have any excuse any more, or anywhere to hide, so this afternoon, I set to work hacking up something to validate my configuration files. Well - of course, it could be any XML file. Maybe other people will find it useful too.
So to start with - I thought - just do the simplest thing. I needed to associate the XML files with their relevant schemas. I could have simply done that in the script, but that would break as soon as people moved things around, so I decided to put the schemas in a directory on the server and use XMLSchema-instance attributes to identify which schema belongs with each file.
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="schema.xsd"
OK - so I'd have to edit each of the half-dozen or so configuration files, but that's a one-off job, so not much trouble. The .NET framework's XmlReader can detect this attribute and use it to locate the correct schema. (Although if it isn't correctly specified, you won't see any validation errors even if the file is incorrect. I hope to fix that in a later version of my script.)
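The one-off edit could itself be scripted. The following is a rough sketch rather than a tested tool: the function name Add-SchemaLocation is invented, it assumes you pass full paths (the .NET classes resolve relative paths against the process directory, not the powershell location), and because Save rewrites the whole file it may alter the layout, so take a backup first.

```powershell
# Hypothetical helper: stamps the XMLSchema-instance attributes onto the root element
function Add-SchemaLocation([string] $xmlFile, [string] $schemaFile) {
    $xsi = "http://www.w3.org/2001/XMLSchema-instance"
    $doc = new-object System.Xml.XmlDocument
    $doc.Load($xmlFile)
    $root = $doc.DocumentElement

    # Declare the xsi prefix on the root element
    $root.SetAttribute("xmlns:xsi", $xsi)

    # Add xsi:noNamespaceSchemaLocation, replacing any earlier value
    $attr = $doc.CreateAttribute("xsi", "noNamespaceSchemaLocation", $xsi)
    $attr.Value = $schemaFile
    [void]$root.SetAttributeNode($attr)

    $doc.Save($xmlFile)
}
```

You would call it once per file, giving the full path of the configuration file and the location of the schema that belongs with it.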
I created a function in powershell, like this:
# So far this silently fails to catch any problems if the schema locations aren't set up properly
# needs more work I suppose. Until then it can still deliver value if set up correctly
function ValidateXmlFile {
    param ([string]$xmlFile = $(read-host "Please specify the path to the Xml file"))
    "==============================================================="
    "Validating $xmlFile using the schema locations specified in it"
    "==============================================================="
    $settings = new-object System.Xml.XmlReaderSettings
    $settings.ValidationType = [System.Xml.ValidationType]::Schema
    $settings.ValidationFlags = $settings.ValidationFlags `
            -bor [System.Xml.Schema.XmlSchemaValidationFlags]::ProcessSchemaLocation
    $handler = [System.Xml.Schema.ValidationEventHandler] {
        # entering new block so copy $_ (named $e rather than reusing the automatic $args)
        $e = $_
        switch ($e.Severity) {
            Error {
                # Exception is an XmlSchemaException
                Write-Host "ERROR: line $($e.Exception.LineNumber)" -nonewline
                Write-Host " position $($e.Exception.LinePosition)"
                Write-Host $e.Message
                break
            }
            Warning {
                # So far, everything that has caused the handler to fire, has caused an Error...
                Write-Host "Warning:: Check that the schema location references are joined up properly."
                break
            }
        }
    }
    $settings.add_ValidationEventHandler($handler)
    $reader = [System.Xml.XmlReader]::Create($xmlFile, $settings)
    # Make sure the file handle is released even if reading fails part-way
    try { while($reader.Read()){} }
    finally { $reader.Close() }
}
With this function in place, all I have to do is have a list of lines like the following:
ValidateXmlFile "C:\Program Files\Tridion\config\cd_instances_conf.xml"
ValidateXmlFile "C:\Program Files\Tridion\config\live\cd_broker_conf.xml"
If I've made typos or whatever, I'll pretty soon find them, and this can easily save hours. My favourite mistake is typing the attributes in lower case. Typically in these config files, attributes begin with a capital letter. Once you've made a mistake like that, trust me, no amount of staring at the code will make it obvious. You can stare straight at it and not see it.
So there you have it - as always, comments and improvements are welcome, particularly if anyone knows how to get the warnings to show up!
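One possible answer on the warnings, offered as a sketch under assumptions rather than a confirmed fix: the .NET XmlSchemaValidationFlags enumeration has a ReportValidationWarnings member, and as far as I know the reader only passes warnings (such as a schema it could not find) to the handler when that flag is set. The function name Test-XmlFile, the temporary file and the missing.xsd reference below are all invented for the demonstration.

```powershell
# Hypothetical variant of the validator that also asks for warnings
function Test-XmlFile([string] $xmlFile) {
    $settings = new-object System.Xml.XmlReaderSettings
    $settings.ValidationType = [System.Xml.ValidationType]::Schema
    $settings.ValidationFlags = $settings.ValidationFlags `
        -bor [System.Xml.Schema.XmlSchemaValidationFlags]::ProcessSchemaLocation `
        -bor [System.Xml.Schema.XmlSchemaValidationFlags]::ReportValidationWarnings

    # Collect the messages instead of writing them to the host
    $messages = new-object System.Collections.ArrayList
    $handler = [System.Xml.Schema.ValidationEventHandler] {
        $e = $_
        [void]$messages.Add("$($e.Severity): $($e.Message)")
    }
    $settings.add_ValidationEventHandler($handler)

    $reader = [System.Xml.XmlReader]::Create($xmlFile, $settings)
    try { while ($reader.Read()) {} }
    finally { $reader.Close() }
    $messages
}

# Demo: a file that points at a schema which does not exist
$demoFile = Join-Path ([System.IO.Path]::GetTempPath()) "cd_demo_conf.xml"
@'
<Configuration xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:noNamespaceSchemaLocation="missing.xsd"/>
'@ | Set-Content $demoFile
$result = @(Test-XmlFile $demoFile)
$result
```

If the assumption holds, the demo should list at least one line starting with "Warning", which is exactly the case that slips through silently without the extra flag.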