I'm really sorry if I'm asking too much; this is my third post in a row. I'm doing a student project for my software engineering course and I've been experimenting with parsing multiple sites. Each site is different and has its own set of issues. My current issue is with parsing an XML page of courses from my university. It has no closing tags, and I'm not sure if that's the source of all my errors.
1. My first error occurs when I call xpath(".//parastyle:course"). I get:
C:/Ruby22-x64/lib/ruby/gems/2.2.0/gems/nokogiri-1.7.0-x64-mingw32/lib/nokogiri/xml/searchable.rb:165:in `evaluate': Undefined namespace prefix: .//parastyle:course (Nokogiri::XML::XPath::SyntaxError)
    from C:/Ruby22-x64/lib/ruby/gems/2.2.0/gems/nokogiri-1.7.0-x64-mingw32/lib/nokogiri/xml/searchable.rb:165:in `block in xpath'
    from C:/Ruby22-x64/lib/ruby/gems/2.2.0/gems/nokogiri-1.7.0-x64-mingw32/lib/nokogiri/xml/searchable.rb:156:in `map'
    from C:/Ruby22-x64/lib/ruby/gems/2.2.0/gems/nokogiri-1.7.0-x64-mingw32/lib/nokogiri/xml/searchable.rb:156:in `xpath'
    from C:/Users/JP/Desktop/Project/lib/scraper.rb:25:in `<main>'
[Finished in 2.2s with exit code 1]
My code
doc = Nokogiri::XML(open(courses_url)) do |config|
  config.huge
end

doc.xpath(".//parastyle:course").each do |node|
  puts node
end
2. My second problem is that the output of doc has a huge block of closing tags that Nokogiri inserts at the end. I see there's a no_empty_tags option, but that only applies to nodes. I'm not sure how to use it when I can't xpath each node.
doc = Nokogiri::XML(open(courses_url)) do |config|
  config.huge
end

courses = []
doc.xpath("//*[name()='ParaStyle:Course']").each do |node|
  courses.push(node)
end
puts courses