Hi Jeremy,
There are multiple ways of approaching this. I suspect you could write a single GREL expressions to try to capture all the cases but it would get quite long and instead I'd probably split this out to multiple steps and in this case work with an additional column to get my final text in, and then once that's ready replace/remove the original column with the new one.
So as a suggestion:
First deal with case 2 by:
1. From your starting column us "Add column based on this column" with an expression that will only work if the original cell contains XML. I'm not sure what the best expression to check this is and it might depend on your XML, but possibly a general expression that would work is:
if(value==value.parseXml().select("*").join("|"),value,null)
but if you know that a particular tag is always there you could do something like:
if(not(value.contains("</Section>")),value,null)
You should now have a new column that is populated with any raw strings
2. In the new column apply a facet by blank (under Facet -> Customized Facets) and select 'true' for to filter to only the remaining blank cells
3. In the new column use Edit Cells -> Transform and write GREL to extract the information from any <Section> elements but amending what you have below to pull the data from the original column like:
cells["XML Column name"].value.parseXml().select("Section")[0].xmlText()
4. This should lead to more of the cells in your new column being populated - refresh the facet to make sure these are now filtered by the 'Facet by blank' facet - so you only have blank cells again in the column
5. In the new column use Edit Cells -> Transform and write GREL to extract the information from any <Run> text attributes but amending what you have below to pull the data from the original column like:
cells["XML Column name"].value.parseXml().select("Run")[0].xmlAttr("Text")
6. Once these cells are populated and you have the facet by blank = true applied, you should be left only with the blank cells again ... which should mean that you are now left with only your scenario number 1 "if there is no string in the cell, do not transform the cell content" - but you can check this by examining the original column values that are left and make sure this is true and if necessary identify any more work that needs doing - which you can pursue in an appropriate way following the same pattern
Once the new column is fully populated as you want you can remove the Facet by blank and then either copy the data back to the original column, or just remove the original column and rename the new one (or however you want to manage it)
If there are situations where there is both text in side the Section element and also text in the Run element 'Text' attribute, the above method would only copy it from the Section element (because once you've done step 3 then there would be a populated cell in your column and so it would be filtered from future work). So it is probably worth checking up front if this ever happens and if so deciding how to deal with it. You can do this by writing a Custom Text Facet on the original XML column with the GREL like:
with(value.parseXml(),x,and(x.select("Section")[0].xmlText().length()>0,x.select("Run")[0].xmlAttr("Text").length()>0))
which checks for text appearing in both places - so if there is a 'true' result in that facet then you have situations where both are populated and need to decide how you handle this situation (get both texts? or prioritise one over the other?)
Sorry - I realise there's alot here - I hope it makes sense and please ask if not
Owen