Announcing Sheet2RDF: a new system developed for the lifting of spreadsheet content to RDF.
Sheet2RDF is an extension of CODA (http://art.uniroma2.it/coda/), a general architecture for the triplification of unstructured information, based in turn on Apache UIMA (Unstructured Information Management Architecture, https://uima.apache.org/).
Sheet2RDF extends CODA with dedicated capabilities for spreadsheet content management, a specific UIMA type system for representing spreadsheets, and a convention-over-configuration approach which, under given conditions, makes it possible to project the content of a spreadsheet file onto an RDF vocabulary without any configuration at all.
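To give a rough idea of what "convention over configuration" can mean when lifting a spreadsheet, here is a minimal Python sketch. It is purely illustrative and does not reproduce Sheet2RDF's actual conventions or code: the header convention (first column holds the subject, other headers are prefixed property names), the prefix table, and the function names `expand` and `lift` are all assumptions made for this example.

```python
# Hypothetical prefix table; not taken from Sheet2RDF.
PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "ex": "http://example.org/",
}


def expand(qname):
    """Expand a prefixed name such as 'skos:prefLabel' into a full IRI."""
    prefix, local = qname.split(":", 1)
    return PREFIXES[prefix] + local


def lift(rows):
    """Turn rows into (subject, predicate, literal) triples.

    Assumed convention: the first row is the header, the first column
    holds the subject as a prefixed name, and every other header cell
    is a prefixed property name. Empty cells produce no triple.
    """
    header, *body = rows
    triples = []
    for row in body:
        subject = expand(row[0])
        for prop, value in zip(header[1:], row[1:]):
            if value:  # skip empty cells
                triples.append((subject, expand(prop), value))
    return triples


# A tiny in-memory "sheet" standing in for a real spreadsheet file.
SHEET = [
    ["subject", "skos:prefLabel", "skos:definition"],
    ["ex:cat", "cat", "A small domesticated feline"],
    ["ex:dog", "dog", ""],
]

for s, p, o in lift(SHEET):
    print(f'<{s}> <{p}> "{o}" .')
```

Because the headers themselves name the target properties, no separate mapping file is needed in this sketch; that is the kind of situation in which a zero-configuration projection becomes possible.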
When things “get more complicated” (and/or when the Excel file is not so self-explanatory), specific projections can be expressed by means of the PEARL language offered by CODA, a dedicated transformation language for projecting UIMA annotations onto RDF content.
Sheet2RDF comes both as a CLI (command line interface) utility, and as an extension for Semantic Turkey.
The system is also being ported to VocBench (http://vocbench.uniroma2.it/), the SKOS collaborative editing tool based on the Semantic Turkey engine. Though not yet available natively in VocBench, both the CLI and the Semantic Turkey extension can be used to acquire data for VocBench from spreadsheets (e.g. to initialize a thesaurus from scratch or to create new content to be imported into an existing one).
Currently, “spreadsheets” means Microsoft Excel files (preferably in the newer xlsx format, introduced with Office 2007), though we will add support for more formats in the near future.
Sheet2RDF is available here: http://art.uniroma2.it/sheet2rdf/.
The ART team