2012/5/8 Uche Ogbuji <
uc...@ogbuji.net>:
> Sorry, somehow I missed this. There is:
>
> amara.lib.xpath.util.abspath
>
> I think I had planned to update it in some way, but I think it works for the
> most part.
>
Yes, it's working for me. I wrote a discover funcion to help users to
extract xpath expressions that finds some text in web pages, so our
system can learn to find info in similar pages.
def discover_path(text, node):
''' Discover best xpaths to extract text.
I.e. it can find 'Hello, World' in these cases:
<h1>Hello, World</h1>
<p>Hello, <span>World</span></p>
'''
paths = []
if text in U(node):
if hasattr(node, 'xml_children'):
for child in node.xml_children:
path = discover_path(text, child)
if path:
paths.extend(path)
if not paths: # no more specific path in childs
paths.append(util.abspath(node))
return paths
Saludos,
-- luismiguel (@lmorillas)