This is my own personal blog. Each article is an XML document, and the code powering it is hand-cranked in XQuery and XSLT. It is fairly simple and has evolved only as I have needed additional functionality. I plan to open source the code once it is a bit more mature; in the meantime, if you would like a copy, drop me a line.
Redstone is an excellent and simple XML-RPC library. Recently, when trying to create a simple XML-RPC client using Redstone in some example code for a book I am writing, I needed to be able to authenticate with a third-party XML-RPC server.
The server that I was attempting to communicate with (eXist) requires at least HTTP Basic Authentication for operations that require a user to have been granted various permissions. Redstone provides no explicit functionality for authentication, nor any documentation on how to achieve it. However, as it is Open Source, I dug through the code and discovered that it uses a standard java.net.HttpURLConnection. Whilst it is quite possible to do HTTP Basic Authentication with an HttpURLConnection by setting the correct HTTP header, I was making use of Redstone's XML-RPC Proxy facility to help keep my client code simple, and unfortunately that exposes no more than an instance of the interface you proxy. Another quick examination of the Redstone code showed that it uses java.lang.reflect.Proxy to create a dynamic proxy of the provided interface.
We can use Proxy.getInvocationHandler on our XML-RPC proxy to get the invocation handler, which happens to be an instance of redstone.xmlrpc.XmlRpcProxy, which offers us the method setRequestProperty(name, value). Any request properties set in this way are sent as headers on the HTTP request made by the XML-RPC proxy.
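Putting that together, a minimal sketch of the approach (the helper names and credentials are my own; the setRequestProperty call is made reflectively here purely so that the snippet compiles without the Redstone jar on the classpath — with Redstone available you would simply cast the handler to redstone.xmlrpc.XmlRpcProxy):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class XmlRpcBasicAuth {

    // Build the value for an HTTP Basic Authentication header:
    // "Basic " followed by base64(user ":" password)
    static String basicAuth(final String user, final String password) {
        final String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Attach the Authorization header to a Redstone XML-RPC dynamic proxy.
    // Proxy.getInvocationHandler unwraps the dynamic proxy; the handler
    // behind it is a redstone.xmlrpc.XmlRpcProxy, whose public
    // setRequestProperty(name, value) method we invoke reflectively.
    static void authenticate(final Object xmlRpcProxy, final String user,
            final String password) throws Exception {
        final InvocationHandler handler = Proxy.getInvocationHandler(xmlRpcProxy);
        handler.getClass()
                .getMethod("setRequestProperty", String.class, String.class)
                .invoke(handler, "Authorization", basicAuth(user, password));
    }

    public static void main(final String[] args) {
        // hypothetical credentials, for illustration only
        System.out.println(basicAuth("admin", "secret"));
    }
}
```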
For the RESTXQ specification that I am working on as part of my EXQuery efforts, I need to write up a "formal" specification for RESTXQ. The EXQuery RESTXQ code base lives on GitHub, and the specification has been authored in the exquery-restxq-specification module.
The RESTXQ specification is authored in HTML using Robin Berjon's excellent ReSpec tool. As specifications are arguably meant to be read by people, it would be nice if we could present the work in progress from the source repository to users as a web page.
Fortunately, GitHub provides a nice facility for web pages called GitHub Pages. However, the pages are taken from a branch of your GitHub repository called gh-pages. The advantage of this is that your 'pages' can contain different content from your main source codebase (i.e. your master branch). If your page content lives in your master branch, though, you need a facility for keeping the copy in your gh-pages branch up to date with your commits to master.
I simply wanted to keep a single folder called exquery-restxq-specification from master in sync with my gh-pages branch. When creating my gh-pages branch, following the instructions above, rather than delete everything in the gh-pages branch, I deleted everything except the exquery-restxq-specification folder, and then committed and pushed.
To keep the folder in sync across both branches, we can add a post-commit hook to our local repository, so that when we commit changes to that folder on master, the changes are propagated to the gh-pages branch.
Now, simply changing something in the exquery-restxq-specification folder in the master branch and committing will cause Git to also sync the changes to the gh-pages branch.
As a further exercise, it might be interesting to take the commit message for gh-pages from the last commit message of master...
Whilst at this moment I am meant to be preparing my sessions for the XML Summer School this year, I was reviewing Priscilla Walmsley's slides from last year and saw the following example, given as a 'Search and Browse' use-case for XQuery:
Sadly, however, eXist-db, which is the XQuery platform I like to use, does not implement the W3C Full-Text extensions yet. Instead it has its own full-text extensions based on Lucene, so in eXist-db the equivalent would be:
If I stopped there, however, it would be quite a short blog post. It also appears from the implementation test results that the W3C XPath and XQuery Full-Text specification is not widely implemented. So how about implementing this in pure XQuery? I took up the challenge, and my solution is below.
I would be interested to see attempts at a more elegant implementation, or suggestions for improvements.
Now, this seemed very strange to me, as I could paste that URL into any web browser and be returned an HTML web page! So I broke out one of my old favourite tools, Wireshark, to examine the differences between the HTTP request made by the EXPath HTTP Client (which is really the Apache HTTP Components client underneath) and the one made by cURL. I decided to use cURL as it is very simple, and therefore I knew it would not insert unnecessary headers into the request; of course, I made sure it worked first!
So what is going on here? Why does one request for the same URL succeed and the other fail? If we examine the requests, the only differences are that the HTTPClient request includes a 'Connection: keep-alive' header whereas the cURL request does not, and that the User-Agent header identifies each client.
So what is 'Connection: keep-alive'? The HTTP 1.1 specification describes persistent connections in section 8, starting on page 43. Basically, a persistent connection allows multiple HTTP requests and responses to be sent through the same TCP connection, for efficiency. The specification states in section 8.1.1:
So whilst persistent connections 'SHOULD' be implemented rather than 'MUST' be implemented, the default behaviour is that of persistent connections, which seems a bit, erm... strange! So whether the client sends 'Connection: keep-alive' or not, the default is in effect 'Connection: keep-alive' for HTTP 1.1; therefore cURL and HTTPClient are semantically making exactly the same request.
If both cURL and HTTPClient are making the same request, why do they get different responses from the server? Well, we can check whether persistent connections from the HTTPClient are the problem by forcing the HTTPClient to set a 'Connection: close' header, as detailed here:
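As an aside, the same experiment can be run with nothing but the JDK. A sketch using a plain java.net.HttpURLConnection (the URL is a placeholder, not the actual service from the article; note that the JDK restricts most values of the Connection header but explicitly permits "close"):

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class ForceConnectionClose {

    // Prepare a request that opts out of HTTP 1.1 persistent connections
    // and report the Connection header that would be sent.
    static String connectionHeader() throws Exception {
        // openConnection() performs no network I/O, so we can set and
        // inspect headers without actually sending the request.
        final HttpURLConnection conn = (HttpURLConnection)
                new URL("http://example.com/service").openConnection();
        // "Connection: close" asks the server to tear down the TCP
        // connection after this response.
        conn.setRequestProperty("Connection", "close");
        return conn.getRequestProperty("Connection");
    }

    public static void main(final String[] args) throws Exception {
        System.out.println(connectionHeader());
    }
}
```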
Unfortunately, we yet again get an HTTP 404 response, which is actually correct if we assume that the implementations and the server adhere to the specification. So the only remaining difference is the User-Agent header.
The only remaining difference is the User-Agent string, but why would such a useful information website block requests from applications written in Java using a very common library? I don't know! So perhaps we should choose a very common User-Agent string, for example one from a major web browser, and try the request again:
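Again with only the JDK, a sketch of overriding the default "Java/1.x" User-Agent with a browser-like one (the URL and the particular User-Agent string are illustrative assumptions, not taken from the article):

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class BrowserUserAgent {

    // An example mainstream-browser User-Agent string
    static final String FIREFOX_UA =
            "Mozilla/5.0 (Windows NT 6.1; rv:12.0) Gecko/20100101 Firefox/12.0";

    // Prepare a request that presents itself as a web browser and report
    // the User-Agent header that would be sent.
    static String userAgent() throws Exception {
        // openConnection() performs no network I/O; headers can be set
        // and inspected before the request is sent.
        final HttpURLConnection conn = (HttpURLConnection)
                new URL("http://example.com/service").openConnection();
        conn.setRequestProperty("User-Agent", FIREFOX_UA);
        return conn.getRequestProperty("User-Agent");
    }

    public static void main(final String[] args) throws Exception {
        System.out.println(userAgent());
    }
}
```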
This weekend I returned to Devon and attended the NASA Space Apps Challenge at the Met Office. This is only the second hackathon I have attended outside of the eXist-db sessions I have done in the past, and it was great fun.
The goal of the 'Predict the Sky' project was to create applications that would allow a user to know which objects are in the sky over their location at night, and the chances of being able to see those objects given the weather.
Each challenge group had its own space in the Met Office building in Exeter, which was good because it was quiet, but bad because it restricted the easy cross-pollination of ideas and offers of help between the projects.
Personally, I think we were very lucky with the structure of our volunteer team: we had two designers, two mobile app developers (one iOS and one Android), two back-end programmers, and a couple of web developers. This wide range of skills allowed us to address multiple targets at once.
I myself worked on the API for providing data to the mobile apps and website. The goal of the API was to act as a proxy, whereby a single call to our API would call a number of other third-party APIs and scrape various websites, combining the data into a simple form useful for our clients.
For mashing up the data from the APIs and the Web in real time, based on the requests coming to us, I decided to use XQuery 3.0 running on the eXist-db 2.0 NoSQL database. As the APIs I was calling produce XML, and extension functions from the EXPath project allow us to retrieve HTML pages and tidy them into XML, XQuery is a natural choice: its data model and high-level nature enabled me to munge the data in just a few lines of code, then store and query it in eXist-db with just a couple more. eXist-db also has a nice feature whereby it provides a set of serializers for its XQuery processor, which enabled me to process the XML and then choose at API invocation time whether to serialise the results as XML, JSON or JSON-P with just a single function call; this is great when different clients require different transport formats.
For my first attempt, I took data from the UHAPI (Unofficial Heavens API) and the Met Office DataPoint API. I combined these two sources based on the time of a satellite (e.g. the International Space Station or the Hubble Telescope) passing overhead, and determined the weather at that time.
The first approach proved too limited, as the UHAPI only provides data for the current day, whereas the Met Office is capable of providing a five-day forecast in three-hourly increments. The front-end developers wanted to be able to display the soonest Clear Sky event and then a chronological list of all upcoming events. Based on this, I switched from the UHAPI to scraping the HTML tables from the Heavens Above website. The implementation was trivial, at less than 100 lines of code; the only pain really came from having to convert the arbitrary date formats used in the HTML for display into valid xs:date and xs:dateTime values for later calculation.
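The actual implementation was XQuery, but purely to illustrate the kind of conversion involved, here is a sketch in Java (the input format is a hypothetical example of a scraped display format, not the exact format used by Heavens Above):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class ScrapedDates {

    // A hypothetical display format of the kind found in scraped HTML tables
    static final DateTimeFormatter DISPLAY =
            DateTimeFormatter.ofPattern("dd MMM yyyy HH:mm:ss", Locale.ENGLISH);

    // Convert a scraped display date into the lexical form of xs:dateTime
    static String toXsDateTime(final String scraped) {
        final LocalDateTime dateTime = LocalDateTime.parse(scraped, DISPLAY);
        // ISO_LOCAL_DATE_TIME produces e.g. 2012-05-12T21:30:15,
        // which is valid as an xs:dateTime value
        return dateTime.format(DateTimeFormatter.ISO_LOCAL_DATE_TIME);
    }

    public static void main(final String[] args) {
        System.out.println(toXsDateTime("12 May 2012 21:30:15"));
    }
}
```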
The challenge started at 11am on Saturday, and by the finish at 12pm on Sunday the team was able to present a working API that is live on the web, complete design mock-ups of the mobile UI, and both iOS and Android mobile app skeletons which talk to the API and show real results.
In addition, the team was also able to identify data sources for Meteor Showers and Iridium Flares, and completed the implementation of a number of coordinate-mapping algorithms to help us establish the longitude and latitude of such events, although we ran out of time to implement these in the API code-base.