SRS Issues + Jhove support


Gordon Paynter

Jan 11, 2009, 9:40:40 PM
to warc-tools, gildas...@bnf.fr
Hi all:

I have been looking at the WARC-Tools project again and have just
built WARC support into Jhove and used it to validate a few files. I
had a few little issues along the way, mostly around where various Java
files reside on Ubuntu (and one non-WARC-related compile problem with
Jhove) but it generally went really well. Nice work Younes!


Those of you who are not affiliated with the IIPC may be confused by
all the Type-SRS issues in the tracker. In summary, Erik, Bjarne and I
have been asked to verify that all these requirements have been met by the
project. We're currently down to 25 open SRS issues in the tracker
(out of the original 90).

Here's a list of what is left: http://code.google.com/p/warc-tools/issues/list?can=2&q=Type-SRS

It feels like we're getting to the point where we have the problem
cases left. I have four issues open and assigned to me: three (SRS
82-84) call for the project to "release" various source code and
binary packages, which is not currently done, and I am not sure what
Hanzo's plans are in this area. Mark & Younes: It would be good to
have various tarballs and even jar files available for download from
the website, though you might want to wait for the standard before you
invest this effort. My fourth is a Java requirement that I am not
really qualified to assess
(http://code.google.com/p/warc-tools/issues/detail?id=71&can=3&q=Type-SRS),
so if there are any Java devs on the list who can offer advice on
whether this is done, I'll be glad to take it.

Bjarne & Erik -- I have noted some activity lately on closing more of
the issues; it would be good to get this over with. If we can get it
down to fewer than (say) 10 I'll be happy to sign it off.

Gordon

Bjarne Andersen

Jan 12, 2009, 6:08:39 AM
to warc-...@googlegroups.com, gildas...@bnf.fr
I have verified the Apache module as working. It will take me quite some time to do the same tests for Lighttpd. The build of that module goes fine as well and the documentation for Lighttpd is also there - so I would say that those SRSs are fulfilled as well. I have closed all issues on Apache and Lighttpd.

I have re-assigned one outstanding Jhove SRS to you, Gordon, since you have already done some work in this area.

That means that all my issues are now Done (Verified).

I would be happy to sign off from my side now - meaning that once we go below 10 in total we should close it, especially since I assume the money for Phase 2 has already been paid and Phase 3 is running.

best
Bjarne

raffaele messuti

Jan 12, 2009, 11:35:18 AM
to warc-...@googlegroups.com

On Jan 12, 2009, at 3:40 AM, Gordon Paynter wrote:
> Those of you who are not affiliated with the IIPC may be confused by
> all the Type-SRS issues in the tracker. In summary, Erik, Bjarne and I
> have been asked to verify that all these requirements have been met by the
> project. We're currently down to 25 open SRS issues in the tracker
> (out of the original 90).

Are there any further details about this SRS?
http://code.google.com/p/warc-tools/issues/detail?id=42&can=1&q=cdx

Why not CDX anymore? I think CDX is easier to read and parse
than the current warcdump output.

greets


--
raff...@atomotic.com


Bjarne Andersen

Jan 12, 2009, 1:33:08 PM
to warc-...@googlegroups.com
I'm quite sure it's because CDX is not a standard. There exist several different CDX formats used by different tools (Heritrix, Wayback Machine and others).

best
Bjarne Andersen

John A. Kunze

Jan 12, 2009, 1:41:28 PM
to warc-tools, gildas...@bnf.fr
I did a bunch of work verifying the SRS issues assigned to Erik.
I'll talk to him about the remaining issues to see how to polish
them off between the two of us.

-John

Gordon Paynter

Jan 14, 2009, 3:40:42 PM
to warc-...@googlegroups.com
Hi:

The SRS document is available from the WARC tools homepage as a PDF; here is a direct link:
http://warc-tools.googlecode.com/files/warc_tools_srs.pdf

The text around requirement SRS 36 (section 4) explains the no-CDX decision (basically, it says what Bjarne already said...)

Gordon

