Anyone with experience of Feeds XPath Parser?

James

Oct 16, 2012, 1:48:25 PM
to nw...@googlegroups.com
I'm trying to get started with using the module for creating nodes from an imported XML file, but have a few problems with getting going.

I think it might be an issue with the XPath queries I'm using.

Is there anyone here who's had experience with this module or using XPath to query an XML file?

James Bisset

Oct 17, 2012, 5:51:25 AM
to nw...@googlegroups.com
I've used Feeds (D7) to import a csv of articles from another CMS, and that took a few attempts before I got the hang of the workflow. Once I got round my confusion over attaching the import to a content type and then actually found the 'Import' page, it was fairly painless.

Never used XPath or an XML file with it though.



Jim

--
James Bisset
m e d i a c h r o m e
http://www.mediachrome.com

Philip Norton

Oct 17, 2012, 6:08:46 AM
to nw...@googlegroups.com
I've not used the xml parser in Feeds before, but I've been using and extending the csv parser and that works nicely. I think that if you go beyond RSS or sitemap XML standards then you'll probably need to create your own parser.

I just had a quick dig about in the code. If you extend FeedsSyndicationParser with your own class you can then override the parse function to do your own thing. The parse() method calls a function called common_syndication_parser_parse(), which is just a wrapper around simplexml_load_string() and returns a FeedsParserResult object. The common_syndication_parser_parse() function just extracts the different formats into a coherent data structure using different methods for different XML formats. The FeedsOPMLParser does use xpath to parse OPML files using the opml_parser_parse() function, so that might be worth a look as well?

You can override Feed classes using the hook_feeds_plugins()... er.. hook. Which I imagine you are already doing?

Phil

Jonty Bale

Oct 17, 2012, 4:21:47 PM
to nw...@googlegroups.com
>> On 16 Oct 2012, at 18:48, James wrote:
>>
>> > I'm trying to get started with using the module for creating nodes from
>> > an imported XML file, but have a few problems with getting going.
>> >
>> > I think it might be an issue with the XPath queries I'm using.
>> >
>> > Is there anyone here who's had experience with this module or using
>> > XPath to query an XML file?

If you are happy including Zend Framework as part of the project I
would recommend having a look at Zend_Dom_Query.

http://framework.zend.com/manual/1.12/en/zend.dom.query.html

It's an abstraction layer on top of XPath which allows you to use a
CSS selector-style query syntax.

XPath is hugely powerful, but can be a bit of a nightmare to get into.

Jonty.

James

Oct 20, 2012, 4:34:30 PM
to nw...@googlegroups.com
Thanks for the replies, everyone.

Crikey - your reply is a bit beyond my level of understanding at the moment, Philip! Unfortunately, I'm not familiar with some of the development ideas and terms you're using, so it's not something I'm likely to be able to implement.

Regarding the Zend thing, Jonty - is this something a non-developer can implement? I've heard of Zend before, but that's about it, I'm afraid.



James

Jonty Bale

Oct 22, 2012, 6:58:26 PM
to nw...@googlegroups.com
On 20 October 2012 21:34, James <jmsp...@gmail.com> wrote:
> Thanks for the replies, everyone.
>
> Crikey - your reply is a bit beyond my level of understanding at the moment,
> Philip! Unfortunately, I'm not familiar with some of the development ideas
> and terms you're using, so it's not something I'm likely to be able to
> implement.
>
> Regarding the Zend thing, Jonty - is this something a non-developer can
> implement? I've heard of Zend before, but that's about it, I'm afraid.

Not sure if there's a Drupal module out there which uses it - it's a
developer thing really, but pretty trivial to implement.

Would make things a lot easier - sorry if that's no help though.

Jonty.

James

Oct 22, 2012, 8:24:23 PM
to nw...@googlegroups.com
Hi Jonty,

Did a bit more digging and found an alternative to the XPath Parser - Feeds Querypath Parser - which is supposed to work with CSS selector-style queries. Unfortunately, I'm still just as baffled and am waiting on replies from the module developer to understand a bit more about how to format queries properly.

simon p

Oct 23, 2012, 8:04:47 AM
to nw...@googlegroups.com
I've used XPath queries with Feeds before, and I'm actually using them now on a build - what do you need to know?

I'll be at Drupalcamp NW if you can wait.

Simon



James

Oct 23, 2012, 5:11:20 PM
to nw...@googlegroups.com
Hi Simon,

Thanks for replying - I've been tearing my hair out these past few days trying to figure out XPath :/ I won't be making it to Drupalcamp NW, unfortunately, but if you have a few mins to help me figure something out here, that would be brilliant.

The output from an XML feed to a node isn't what I'm expecting. Whilst I get the output that I want when working with the Feeds CSV parser, it seems a bit more difficult with XPath. Rather than detail what's gone wrong and what I've been doing, could I ask for some examples of how you might work with the XML structure I'm trying to work with?


Say I wanted to extract data in the tags <name>, <desc>, <mLink> and <price>...

  • What query would you use as the context?
  • If I set it to import that data into the fields title, body, field_product_deeplink:url and field_price, respectively, what should I use as the queries?

I'm also a bit unsure about what "Select the queries you would like to return raw XML or HTML" means, as well as what the debug options are - but I'm guessing these aren't nearly as important as knowing how to structure the queries for the above bits of XML data.

Any pointers you can offer will be very, very greatly appreciated!



James

JHellings

Oct 24, 2012, 6:29:44 AM
to nw...@googlegroups.com
James,

Your XPath expressions may not be at fault.
  1. The example XML you linked to is broken. You can check that an XML document is well-formed here:
     http://www.w3schools.com/dom/dom_validate.asp
  2. Once you have a well-formed XML document to work with, you can test your XPath expressions here:
     http://www.xmlme.com/XpathTool.aspx
  3. Once you have a well-formed XML document and a working XPath expression, try the import again. If it still fails, there may be something wrong with the way you're mapping fields (I tripped up on that when importing feeds in D6) or something more fundamental.
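If you'd rather run the first two checks locally than paste the feed into a website, something like this rough Python sketch would do the same job. It uses Python's standard library rather than anything in Feeds, and the two sample documents and the helper names are made up for illustration. Note that ElementTree only supports a subset of XPath and wants ".//prod" where a full XPath tool would accept "//prod".

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


def count_matches(xml_text, query):
    """Return how many elements the query matches in a well-formed document."""
    return len(ET.fromstring(xml_text).findall(query))


# Hypothetical documents: the second one is missing its closing </prod> tag.
good = "<products><prod><name>Widget</name></prod></products>"
bad = "<products><prod><name>Widget</name></products>"

print(check_well_formed(good))      # step 1 on a sound document
print(check_well_formed(bad))       # step 1 on a broken document
print(count_matches(good, ".//prod"))  # step 2: does the query match anything?
```

A query that matches zero elements in a well-formed file is usually the query's fault; a parse error points at the feed itself.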
I hope that helps,

Justin

James

Oct 24, 2012, 4:48:33 PM
to nw...@googlegroups.com
Justin,

That might just be enough to keep me sane for the moment!

The XML file is supplied from somewhere I would normally expect to be reliable. I'll raise a ticket and see what they say. I hope it is just a case of there being a problem with that particular feed that can be rectified.



Thanks,
James

simon p

Oct 25, 2012, 6:37:40 AM
to nw...@googlegroups.com
The file seems ok to me.

The XPath queries in Feeds would be as shown in the attached screenshots, assuming you set the context to "//prod". You need to create a content type too (I assume you already did this).
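For anyone reading without the screenshots, here is a minimal Python sketch of how a context query and per-field queries fit together. It is not Feeds itself: the sample XML, the `products` root element and the relative queries in `FIELD_QUERIES` are assumptions for illustration, built only from the tag and field names mentioned earlier in the thread, so the real feed and the real queries may differ. The idea is that the context picks out one element per item, and each field query is then evaluated relative to that element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; only the tag names come from the thread.
SAMPLE_XML = """
<products>
  <prod>
    <name>Widget</name>
    <desc>A small widget.</desc>
    <mLink>http://example.com/widget</mLink>
    <price>9.99</price>
  </prod>
  <prod>
    <name>Gadget</name>
    <desc>A large gadget.</desc>
    <mLink>http://example.com/gadget</mLink>
    <price>19.50</price>
  </prod>
</products>
"""

# Assumed mapping of target field -> query relative to the context element.
FIELD_QUERIES = {
    "title": "name",
    "body": "desc",
    "field_product_deeplink:url": "mLink",
    "field_price": "price",
}


def extract_items(xml_text, context=".//prod", queries=FIELD_QUERIES):
    """Return one dict per element matched by the context query."""
    # ElementTree needs ".//prod" where full XPath would accept "//prod".
    root = ET.fromstring(xml_text)
    items = []
    for element in root.findall(context):
        # Each field query is evaluated relative to the context element.
        items.append({field: element.findtext(q) for field, q in queries.items()})
    return items


items = extract_items(SAMPLE_XML)
for item in items:
    print(item)
```

Each dict printed corresponds to one node that an import would create, with one entry per mapped field.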

I have attached a feature that is all configured for you too if you know how to use features :)

Simon
Screen Shot 2012-10-25 at 11.32.37.png
Screen Shot 2012-10-25 at 11.32.45.png
Screen Shot 2012-10-25 at 11.32.54.png
Screen Shot 2012-10-25 at 11.33.00.png
Screen Shot 2012-10-25 at 11.33.04.png
Screen Shot 2012-10-25 at 11.33.11.png
Screen Shot 2012-10-25 at 11.33.16.png
Screen Shot 2012-10-25 at 11.33.21.png
merchant_product.tar

simon p

Oct 25, 2012, 6:58:47 AM
to nw...@googlegroups.com
Almost forgot: go to yourdomain/import to actually run the import - you might need to supply the URL http://brain-training-games.co.uk/feeds/testfeed.xml

Si

James

Oct 25, 2012, 10:43:48 AM
to nw...@googlegroups.com
That's absolutely fantastic! Thank you, so, so much for your help! Couldn't have asked for anything better  - I owe you several beers, if I ever get along to a DUG meeting :D

I'll give it a shot when I get home, but I can't imagine there being any difficulties, given that it works fine for you in the screenshots.



Thanks,

James

James

Oct 25, 2012, 4:26:03 PM
to nw...@googlegroups.com
Well, I've tried it and all appears to work nicely :)

It's slightly frustrating, because the queries I tried using initially, which were producing bad results, aren't a million miles off the ones you posted, Simon.

The examples are great - I feel a lot more confident about using XPath in future, now that I've seen how the queries should be set out.

Thanks again for the help.

Simon Poulston

Oct 26, 2012, 5:52:43 AM
to nw...@googlegroups.com
No problem. It's worth remembering that for simple XML/RSS feeds you can normally just use the standard XML reader in Feeds and not bother with XPath, but it's always good to know.

Simon

sonya

Nov 29, 2012, 4:38:16 AM
to nw...@googlegroups.com
I am familiar with a few XPath techniques, like predicates and proximity position, which let you narrow down a query on your XML file. This XPath tutorial might be helpful:
http://www.liquid-technologies.com/xpath-tutorial.aspx
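
As a quick illustration of those two techniques, here is a small Python sketch using the standard library's ElementTree, which implements only a subset of XPath but does cover simple predicates and positions. The sample data is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data for illustration.
root = ET.fromstring(
    "<products>"
    "<prod><name>Widget</name><price>9.99</price></prod>"
    "<prod><name>Gadget</name></prod>"
    "<prod><name>Gizmo</name><price>4.00</price></prod>"
    "</products>"
)

# Predicate on a child element: only <prod> elements that have a <price>.
priced = [p.findtext("name") for p in root.findall("prod[price]")]

# Predicate on child text: the <prod> whose <name> is exactly "Gadget".
gadget = root.findall("prod[name='Gadget']")

# Position: XPath positions are 1-based, so prod[1] is the first <prod>.
first = root.find("prod[1]").findtext("name")
last = root.find("prod[last()]").findtext("name")

print(priced, len(gadget), first, last)
```

Predicates go in square brackets after the step they filter, and a bare number in the brackets is shorthand for a position test.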