To test how easy it is for a user to set up a fresh OpenWayback installation based only on the documentation as it stands today, I gave two of my colleagues, armed with just the GitHub URL, the task of getting it up and running. As you can see from the report below, there is still a lot to be done on documentation. I fully agree with Andy, who pointed out in an earlier message that we should focus on the configurations people actually use. We also found that the bug introduced quite a while ago, where uncompressed (W)ARCs cannot be read, is still there.
Best, John Erik
Here is the report on the experiences they had during installation:
--------------------
My colleague and I were given the task of installing OpenWayback to establish whether a user without any significant knowledge of the package would be able to install it and get it up and running. We were pointed to the GitHub homepage at
https://github.com/iipc/openwayback, and had a go at it.
First we downloaded the source, and it was fairly easy to set up a NetBeans project and build the software package. With a freshly built .WAR file we were ready to set up the server.
Following the step-by-step instructions, we were able to install the package on Jetty and on Tomcat 6 and 7. I had some problems getting it up and running correctly, and the issue turned out to be that I had not replaced "8080:wayback" with "8080" in wayback.xml in the section
<bean name="8080:wayback" class="org.archive.wayback.webapp.AccessPoint">
For someone not used to configuring Spring it was not very intuitive to figure this one out, and I see no reason why the generic wayback.xml should contain the “:wayback” part in the bean name.
After correcting the problem, we loaded some .ARC files into the directories we had set up for data, and saw from catalina.out that the OpenWayback engine picked up the data files and indexed them. When indexing was complete, we were able to search the content and got the initial page of hits. Trying to display the actual content of a page, however, resulted in “java.lang.IndexOutOfBoundsException”. After some tracing through the code we suspected it had something to do with not being able to read the .ARC data files correctly, so we tried to load .WARC files instead. We got similar problems with these, and switched to .WARC.GZ. This did the trick, and we were now able to both search and display the content of the indexed pages.
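Since only the gzip-compressed files worked for us, it can help to check up front which files in a data directory are uncompressed. The following is only a minimal sketch in Python, not part of OpenWayback: the function names and the list of file suffixes are our own assumptions, and it simply looks for the two-byte gzip magic number at the start of each file.

```python
from pathlib import Path

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"

# Assumed set of archive file suffixes; adjust to your own naming.
ARCHIVE_SUFFIXES = (".arc", ".warc", ".arc.gz", ".warc.gz")


def is_gzip_compressed(path):
    """Return True if the file begins with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def find_uncompressed_archives(data_dir):
    """Return the names of archive files in data_dir that are not gzip-compressed.

    Hypothetical helper: it only inspects the first two bytes and does not
    validate that the file is a well-formed ARC or WARC.
    """
    names = []
    for p in sorted(Path(data_dir).iterdir()):
        if not p.is_file():
            continue
        if not p.name.lower().endswith(ARCHIVE_SUFFIXES):
            continue
        if not is_gzip_compressed(p):
            names.append(p.name)
    return names
```

Calling find_uncompressed_archives() on the data directory lists the files that would probably trigger the exception described above. Note that it checks the file content rather than the file name, so a file named .warc.gz that is not actually compressed would also be reported.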
We have only tested with a limited dataset, so we can't give a definitive answer on whether the installation is working as expected, but as far as we can tell, it is OK.
Roger Mathisen
National Library of Norway