How to read an attribute from a non-prefixed, namespaced tag

J.

Feb 5, 2014, 5:42:44 AM
to scale...@googlegroups.com

I'm trying to read MediaWiki XML format and it starts like this:

<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.8/ http://www.mediawiki.org/xml/export-0.8.xsd" version="0.8" xml:lang="en">


Under that root tag there are a bunch of page tags, some of which contain a redirect tag such as:

<page>
   <title>Albigensian</title>
   <redirect title="Catharism" />
   <revision>
      ...
   </revision>
</page>


I'm using ScalesXML to do the parsing:

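import java.io.FileReader

import scales.utils._
import ScalesUtils._
import scales.xml._
import ScalesXml._

// Logging is not defined in this snippet; it comes from elsewhere in my project.
object WikiMediaImport extends App with Logging {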
   val xml = pullXml(new FileReader(args(0)))

   val ns = Namespace("http://www.mediawiki.org/xml/export-0.8/")
   val p = ns // .prefixed("mediawiki") <-- that doesn't help either
   val mediawikiTag = p("mediawiki")
   val pageTag = p("page")
   val titleTag = p("title")
   val revisionTag = p("revision")
   val textTag = p("text")
   val timestampTag = p("timestamp")
   val redirectTag = p("redirect")
 //val redirectWhereAttr: Attribute = Attribute(redirectTag, "title")

   val pagePath = List(mediawikiTag, pageTag)
   val iterator = iterate(pagePath, xml)

   for {
     page <- iterator
   } {
     val title = text(page \* titleTag)
     val timestamp = text(page \* revisionTag \* timestampTag)
     val content = text(page \* revisionTag \* textTag)

     println(s"$title $timestamp ${content.length}")
   }
}


However, I also want to read the title attribute of the redirect tag (mediawiki -> page -> redirect[title]), and I'm not sure how to do this despite reading the help page.

If I use a prefixed Namespace, nothing is found, because in the file the namespace isn't actually prefixed. If I use NoNamespaceQName, nothing is found either (presumably because the XML file does declare a namespace).

And if I use a default (unprefixed) Namespace, Scales doesn't allow me to define an attribute, because attribute names can only be built from prefixed namespaces.

At least that's how I understand it.

Regards,
John


P.S. StackOverflow: http://stackoverflow.com/questions/21559845/how-to-read-attribute-in-scala-scales-xml-from-non-prefixed-with-namespace-tag

Chris Twiner

Feb 7, 2014, 4:52:07 AM
to scale...@googlegroups.com

Hiya,

I've answered on SO, but I just wanted to thank you for reminding me to deprecate the less-than-correct attribute predicates.
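
The namespace rule behind the question can be checked independently of Scales. Under the XML Namespaces rules, a default xmlns declaration applies to elements but not to unprefixed attributes, so redirect is in the MediaWiki namespace while its title attribute is in no namespace at all. The following is only a rough sketch using the JDK's built-in DOM parser, not the Scales API; the cut-down sample document and the names RedirectAttributeCheck and inspectRedirect are made up for illustration.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

object RedirectAttributeCheck {
  // Namespace URI taken from the export header quoted in the question.
  val MediaWikiNs = "http://www.mediawiki.org/xml/export-0.8/"

  // Hypothetical cut-down document, modelled on the snippet in the question.
  val sample: String =
    s"""<mediawiki xmlns="$MediaWikiNs" version="0.8">
       |  <page>
       |    <title>Albigensian</title>
       |    <redirect title="Catharism" />
       |  </page>
       |</mediawiki>""".stripMargin

  /** Returns (element namespace, attribute namespace if any, attribute value)
    * for the first redirect element in the given document. */
  def inspectRedirect(xml: String): (String, Option[String], String) = {
    val factory = DocumentBuilderFactory.newInstance()
    factory.setNamespaceAware(true) // namespace handling is off by default
    val doc = factory.newDocumentBuilder()
      .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))

    // The element has to be looked up WITH the default namespace...
    val redirect = doc.getElementsByTagNameNS(MediaWikiNs, "redirect")
      .item(0).asInstanceOf[Element]
    // ...while the unprefixed attribute is looked up with NO namespace (null).
    val attr = redirect.getAttributeNodeNS(null, "title")

    (redirect.getNamespaceURI, Option(attr.getNamespaceURI), attr.getValue)
  }

  def main(args: Array[String]): Unit = {
    val (elemNs, attrNs, value) = inspectRedirect(sample)
    println(s"element ns: $elemNs, attribute ns: $attrNs, title: $value")
  }
}
```

Running main on the sample reports the MediaWiki namespace for the element, None for the attribute's namespace, and Catharism as the value. In other words, the element name needs the namespace and the attribute name needs none, whichever library does the matching.
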
Cheers
Chris
