I'm having a very hard time figuring out how to scrape an unnamed (top-level) array that contains JSON objects. The implementation I'm using is built around Django Dynamic Scraper, which relies on Scrapy and JSONPath.
My question is: how do I write an encapsulating JSONPath that selects the JSON objects in the array themselves, rather than the elements inside each object?
All 4 JSONPaths below give me all the elements of all the objects, but I can't think of one that selects the objects themselves.
The background is that Django Dynamic Scraper relies on a concept called the base element, which in XPath terms is the element shared by all the elements one is trying to scrape.
I'm trying to figure out what the JSON equivalent is, but I'm not getting anywhere.
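To make the distinction concrete, here is a minimal sketch in plain Python of the result I'm after versus what I get. The response body and its field names (`id`, `title`) are made up for illustration, and no JSONPath library is involved; it only shows the shape of the data.

```python
import json

# Hypothetical response body: an unnamed (top-level) array of JSON objects.
# The field names are made up for illustration.
body = '[{"id": 1, "title": "first"}, {"id": 2, "title": "second"}]'
data = json.loads(body)

# What I want as base elements: one entry per object in the array.
wanted = list(data)

# What my attempts give me instead: the fields of all objects, flattened.
flattened = [value for obj in data for value in obj.values()]

print(wanted)
print(flattened)
```

So the base element I'm looking for should yield two results here (the two objects), not four (their individual values).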
For 1):
I'm getting a TypeError
TypeError: object of type 'int' has no len()
File "/usr/local/lib/python2.7/dist-packages/dynamic_scraper/spiders/django_spider.py", line 421, in parse
if(len(base_objects) == 0):
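As far as I can tell, this is simply what Python raises when `len()` is called on an integer, so `base_objects` must have been a single number rather than a list at that point. A minimal reproduction, where the value `1` is only a stand-in and not the actual value from my run:

```python
# Stand-in for what base_objects appears to have been in case 1):
# a bare int instead of a list of objects. The value 1 is made up.
base_objects = 1

try:
    len(base_objects)
    message = None
except TypeError as exc:
    # Same TypeError message as in the traceback above.
    message = str(exc)

print(message)
```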
For 2):
I'm getting an Exception
Exception: Parse error at 1:0 near token . (.)
File "/usr/local/lib/python2.7/dist-packages/jsonpath_rw/parser.py", line 69, in p_error
raise Exception('Parse error at %s:%s near token %s (%s)' % (t.lineno, t.col, t.value, t.type))
For 3):
I'm getting an AttributeError
AttributeError: 'NoneType' object has no attribute 'lineno'
File "/usr/local/lib/python2.7/dist-packages/jsonpath_rw/parser.py", line 69, in p_error
raise Exception('Parse error at %s:%s near token %s (%s)' % (t.lineno, t.col, t.value, t.type))
For 4):
I'm getting an AttributeError
AttributeError: 'NoneType' object has no attribute 'lineno'
File "/usr/local/lib/python2.7/dist-packages/jsonpath_rw/parser.py", line 69, in p_error
raise Exception('Parse error at %s:%s near token %s (%s)' % (t.lineno, t.col, t.value, t.type))
Thanks,
Enrico