6.0.0 is out

Nicholas Car

Jul 20, 2021, 10:00:00 AM
to rdfli...@googlegroups.com
Hi all,

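Yes, 6.0.0 is out:

- https://pypi.org/project/rdflib/6.0.0/
- https://github.com/RDFLib/rdflib/releases/tag/6.0.0

Please publicise this release: it includes a lot of changes since 5.0.0, released in April last year.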

Thank you very much to all of you who contributed, in particular my co-maintainers, Ashley & Natanael and Edmond, Iwan, Tom, Remi, Harold and all the PR and Issue creators. Thanks also to the institutions that provided time for their staff to contribute.

If you see issues, please let the co-maintainers know straight away: we are keen to get a 6.0.1 release out shortly (within weeks to a month) to speed up the RDFLib release cycle.

Cheers,

Nick

--
kind regards
Dr Nicholas Car
Data Systems Architect
 
SURROUND Australia Pty Ltd and
SURROUND NZ Limited
 
Address: Level 9, Nishi Building,
2 Phillip Law Street
New Acton Canberra 2601
Mobile: +61 477 560 177
Email: nichol...@surroundaustralia.com
Website: https://www.surroundaustralia.com
 
Enhancing Intelligence Within Organisations
delivering evidence that connects decisions to outcomes

Dr Nicholas Car
Adjunct Senior Lecturer
 
Research School of Computer Science
 
The Australian National University,
Canberra ACT Australia
+61 477 560 177
nichol...@anu.edu.au
https://cs.anu.edu.au/people/nicholas-car
https://orcid.org/0000-0002-8742-7730
 

Florent Georges

Jul 20, 2021, 10:23:39 AM
to rdfli...@googlegroups.com
Congratulations, and thank you all for the hard work! 

--
Florent Georges
H2O Consulting
http://h2o.consulting/

--
http://github.com/RDFLib
---
You received this message because you are subscribed to the Google Groups "rdflib-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rdflib-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rdflib-dev/CAP7nqh19yjpwB8EoHVqs5QzKug_rSq1X%2BfFHfnFtOJBdZ1RwYg%40mail.gmail.com.

Wes Turner

Jul 20, 2021, 2:14:24 PM
to rdfli...@googlegroups.com
Congrats and thanks!

From the release notes on the Release: 

```
6.0.0 is a major stable release that drops support for Python 2 and Python 3 < 3.7. Type hinting is now present in much
of the toolkit as a result.

It includes the formerly independent JSON-LD parser/serializer, improvements to Namespaces that allow for IDE namespace
prompting, simplified use of g.serialize() (turtle default, no need to decode()) and many other updates to
documentation, store backends and so on.

Performance of the in-memory store has also improved since Python 3.6 dictionary improvements.

There are numerous supplementary improvements to the toolkit too, such as:

- inclusion of Docker files for easier CI/CD
- black config files for standardised code formatting
- improved testing with mock SPARQL stores, rather than a reliance on DBPedia etc
```

Have there been ANN posts to e.g. Hacker News and /r/semanticweb?

Natanael Arndt

Jul 20, 2021, 3:58:48 PM
to rdfli...@googlegroups.com, Wes Turner
I've retweeted the tweet by jarven. I don't use Reddit or Hacker News, but I think the semantic web mailing list would also be a good idea.

If you'd like to post something in those channels, please do so.

Natanael
>>> - https://pypi.org/project/rdflib/6.0.0/
>>> - https://github.com/RDFLib/rdflib/releases/tag/6.0.0

--
This message was sent from my Android device with K-9 Mail.

Miel Vander Sande

Jul 27, 2021, 3:56:13 AM
to rdfli...@googlegroups.com
Hi all,

A little late to the party, but what a great effort this is! Congrats on the release and thank you; this library is super essential to my work and it makes RDF usable in ways other libraries can't.

Sidenote: I have a streaming direct json-to-rdf mapping implementation (port of https://github.com/AtomGraph/JSON2RDF) that I'd like to contribute, possibly in combination with a refactoring of https://rdflib.readthedocs.io/en/stable/apidocs/rdflib.tools.html#rdflib.tools.csv2rdf.CSV2RDF. Would that be of interest?

Best,

Miel

On Tue, Jul 20, 2021 at 9:58 PM, Natanael Arndt <arn...@gmail.com> wrote:

Nicholas Car

Jul 28, 2021, 12:09:56 AM
to rdfli...@googlegroups.com
Hi Miel,

Yes, all offers of contribution are of interest! The CSV 2 RDF stuff is very old and many tools related to it, such as pyTARQL (https://github.com/RDFLib/pyTARQL), are missing. Are you planning on presenting JSON2RDF as a new plugin to RDFLib? That may be an option; however, remember that another option is simply to host your tool's repository within RDFLib's family of repositories (i.e. within https://github.com/RDFLib). The choice will depend on how stable the tool is and how you see its future development going.

But perhaps you have other things in mind? Whatever the case, we'd love to hear your plans.

Cheers,

Nick

Miel Vander Sande

Jul 28, 2021, 4:49:23 AM
to rdfli...@googlegroups.com
Hi Nick,

TBH, it's pretty much a function that converts a Dict or a JSON file in a streaming fashion: https://github.com/viaacode/construction-site/blob/main/construction_site/parse_functions.py. I think it's a stand-alone thing; I don't plan anything extra on that specifically, with maybe the exception of a cmd interface (hence the proposed refactoring of csv2rdf)
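
As a rough illustration of the kind of function described here (this is not the code in the linked repository), a nested dict can be walked recursively and emitted one N-Triples line at a time. The base IRI `http://example.org/`, the name `dict_to_ntriples` and the blank-node labelling are assumptions made for the sketch; keys are assumed to be IRI-safe and every scalar is written as a plain string literal.

```python
import itertools
import json

BASE = "http://example.org/"  # assumed vocabulary base, not from the thread


def dict_to_ntriples(data, base=BASE, _ids=None, _subject=None):
    """Yield N-Triples lines for a (nested) dict, one triple at a time."""
    ids = _ids if _ids is not None else itertools.count()
    subject = _subject or f"_:b{next(ids)}"
    for key, value in data.items():
        predicate = f"<{base}{key}>"
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                # Nested object: mint a new blank node and recurse into it
                child = f"_:b{next(ids)}"
                yield f"{subject} {predicate} {child} ."
                yield from dict_to_ntriples(item, base, ids, child)
            elif item is not None:
                # json.dumps produces a double-quoted, escaped string literal
                yield f"{subject} {predicate} {json.dumps(str(item))} ."
```

Because it is a generator, the caller can write each line to a file or pipe as it is produced instead of building a graph in memory first.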

I do plan to develop more components that assist scalable ETL and data-to-RDF-like tasks. This includes a plugin for Apache Airflow ("provider"), which would be good as an RDFLib family repository.

Best,

Miel

On Wed, Jul 28, 2021 at 6:09 AM, Nicholas Car <nichol...@surroundaustralia.com> wrote:

Wes Turner

Jul 31, 2021, 6:45:21 AM
to rdfli...@googlegroups.com
On Wed, Jul 28, 2021 at 4:49 AM Miel Vander Sande <miel.van...@meemoo.be> wrote:
Hi Nick,

TBH, it's pretty much a function that converts a Dict or a JSON file in a streaming fashion: https://github.com/viaacode/construction-site/blob/main/construction_site/parse_functions.py. I think it's a stand-alone thing; I don't plan anything extra on that specifically, with maybe the exception of a cmd interface (hence the proposed refactoring of csv2rdf)
 
Profiling / [comparative] benchmarks with e.g. Scalene [1][2] and/or perfplot [3] (%timeit) [4][5] could be worthwhile.


ijson [6] looks like it has some interesting features; iterative, asyncio, push. How does the performance compare?


I do plan to develop more components that assist scalable ETL, data-to-rdf like tasks. This includes a plugin for Apache Airflow ("provider"), which would be good as a RDFLib family repository.

- The datasette and dogsheep projects have a bunch of *-to-sqlite utils and an interface that a number of projects on PyPI have implemented:

      - parse datetimes in CSVs
        - xsd:dateTime (and schema.org/Date, schema.org/dateCreated and schema.org/dateModified) specify that times are given in ISO 8601 formats
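
A minimal sketch of the datetime point above, using only the standard library: a CSV cell in some local format is parsed and re-emitted as an ISO 8601 literal typed as `xsd:dateTime`. The input format `%d/%m/%Y %H:%M`, the column name `created` and the helper name `to_xsd_datetime` are assumptions for illustration.

```python
import csv
import io
from datetime import datetime

XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def to_xsd_datetime(text, fmt="%d/%m/%Y %H:%M"):
    """Parse a CSV cell with an assumed format and return a typed literal."""
    parsed = datetime.strptime(text, fmt)
    # isoformat() gives the ISO 8601 lexical form that xsd:dateTime expects
    return f'"{parsed.isoformat()}"^^<{XSD_DATETIME}>'


sample = io.StringIO("id,created\n1,20/07/2021 10:00\n")
literals = [to_xsd_datetime(row["created"]) for row in csv.DictReader(sample)]
```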

What are the solutions for generating RDFS schema from CSVs and SQL tables?

  - doesn't do anything with datatypes FWICS


  > PyRDB2RDF provides RDFLib with an interface to relational databases as RDF stores. The underlying data is accessed via SQLAlchemy. It is mapped to RDF according to the specifications of RDB2RDF. The corresponding RDF graph is represented as an RDFLib graph.
  >
  > Translating from relational data to RDF via direct mapping is currently supported. Translating in the other direction and mapping with R2RML are planned but not yet implemented.

  
  - Does this handle datetimes?

- Generate JSON Schema from JSON and SHACL from JSON Schema:
  - https://pypi.org/project/genson/ has been recently updated
    > JSON-LD Schema defines a simple 'semantics' JSON-Schema vocabulary (effectively a JSON-Schema meta-schema) that reuses the official JSON Schema for JSON-LD to provide definitions for @context and @type properties. These annotations can be used to provide JSON-LD context for a JSON-Schema document. Provided this JSON-LD context, constraints over named 'properties' in a JSON Schema document can be understood as constraints over CURIES of JSON-LD documents following the context rules defined in the JSON-LD specification.

## CSVW: CSV on the Web

- Homepage: https://w3c.github.io/csvw/
- Standard: https://www.w3.org/TR/tabular-data-model/
- Standard: https://www.w3.org/TR/tabular-metadata/
- Standard: https://www.w3.org/TR/csv2json/
- Standard: https://www.w3.org/TR/csv2rdf/
- Namespace: http://www.w3.org/ns/csvw#
- xmlns: `@prefix csvw: <http://www.w3.org/ns/csvw#> .`
- @context: https://www.w3.org/ns/csvw.jsonld

CSVW (*CSV on the Web*) is a set of relatively new standards
for representing CSV rows and columns
as RDF (and JSON / JSON-LD)
along with *metadata*.

* URIs for datatypes (XSD)
* URIs for columns (RDF)
* Document Metadata
* CSV -> JSON (-> JSON-LD -> RDF)
* CSV -> RDF
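
The points above map onto a CSVW metadata document roughly as follows. This sketch builds one as a plain Python dict; the keys (`@context`, `url`, `tableSchema`, `columns`, `propertyUrl`, `datatype`) come from the CSVW metadata vocabulary, while the column names and the schema.org property URLs are made up for illustration.

```python
import json

# Minimal CSVW metadata for a two-column CSV; column names and the
# schema.org property URLs are illustrative assumptions.
metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "filename.csv",
    "tableSchema": {
        "columns": [
            {"name": "name", "titles": "name",
             "propertyUrl": "https://schema.org/name",
             "datatype": "string"},
            {"name": "created", "titles": "created",
             "propertyUrl": "https://schema.org/dateCreated",
             "datatype": "dateTime"},
        ]
    },
}

print(json.dumps(metadata, indent=2))
```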
 
Could there be a file naming convention for specifying the extra CSVW header to apply_to or transform zero or more CSV files with?

filename.csv
filename.csv.csvw
filename.csv.csvwheader.jsonld.json
filename.csv.csvw.jsonld.json




Wes Turner

Jul 31, 2021, 8:03:41 AM
to rdfli...@googlegroups.com
On Sat, Jul 31, 2021 at 6:45 AM Wes Turner <wes.t...@gmail.com> wrote:


On Wed, Jul 28, 2021 at 4:49 AM Miel Vander Sande <miel.van...@meemoo.be> wrote:
Hi Nick,

TBH, it's pretty much a function that converts a Dict or a JSON file in a streaming fashion: https://github.com/viaacode/construction-site/blob/main/construction_site/parse_functions.py. I think it's a stand-alone thing; I don't plan anything extra on that specifically, with maybe the exception of a cmd interface (hence the proposed refactoring of csv2rdf)
 
Profiling / [comparative] benchmarks with e.g. Scalene [1][2] and/or perfplot [3] (%timeit) [4][5] could be worthwhile.



Other methods for CSV + transforms => RDF?
- #rdflib csv2rdf
 
- COW
- desc: Integrated CSV to RDF converter, using CSVW and nanopublications

- #CSVW  https://github.com/cldf/csvw/blob/master/README.md#see-also
 
Could https://github.com/cldf/csvw/ be modified to support (1) json-ld-streaming; and (2) alternate csv parsers?

- @kidehen sponger / rdfm_yq_parse_csv()?

- #tarql
- ProgrammingLanguage: Java

- #csv2rdf GH topic: https://github.com/topics/csv2rdf

#csv2rdf
 
Adding columnar & dataset-level metadata *with URIs* is the value add here, IMHO #LR 

"7 metadata header rows (column label, property URI path, DataType, unit, accuracy, precision, significant figures)"

Example Table A with 7 metadata header rows:

The csv2rdf tool would need to optionally read this additional metadata from either additional header rows or an external 'header' file.
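
One way the "additional header rows" variant could be read, sketched with the standard `csv` module: the first seven rows are treated as metadata in the order quoted above and transposed into one dict per column. The field names, the function name `read_with_metadata` and the sample values are assumptions, not part of any existing csv2rdf tool.

```python
import csv
import io

# Order of the seven header rows, as listed in the quote above
HEADER_FIELDS = ["label", "property", "datatype", "unit",
                 "accuracy", "precision", "significant_figures"]


def read_with_metadata(fileobj):
    """Split a CSV into per-column metadata dicts and the data rows."""
    rows = list(csv.reader(fileobj))
    n = len(HEADER_FIELDS)
    header_rows, data_rows = rows[:n], rows[n:]
    # Transpose: one dict per column, keyed by metadata field name
    columns = [dict(zip(HEADER_FIELDS, col)) for col in zip(*header_rows)]
    return columns, data_rows


# Illustrative sample: two columns, seven header rows, one data row
sample = io.StringIO(
    "t,length\n"
    "https://schema.org/duration,https://schema.org/depth\n"
    "float64,float64\n"
    "unit:SecondTime,unit:Meter\n"
    "0.1,0.01\n"
    "0.01,0.001\n"
    "3,3\n"
    "1.25,0.300\n"
)
columns, data = read_with_metadata(sample)
```

Reading the same seven fields from an external header file would only change where `header_rows` comes from.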
FWIU, there is not yet a vocabulary for physical units like meters**2 in the JSON-LD Recommended Context:

QUDT is one such vocabulary:
```turtle
qudt-quantity:Time
    rdf:type qudt:SpaceAndTimeQuantityKind ;
    rdfs:label "Time"^^xsd:string ;
    qudt:description "Time is a basic component of the measuring system used to sequence events, to compare the durations of events and the intervals between them, and to quantify the motions of objects."^^xsd:string ;
    qudt:symbol "T"^^xsd:string ;
    skos:exactMatch <http://dbpedia.org/resource/Time> .

# ...
unit:SecondTime
      rdf:type qudt:SIBaseUnit , qudt:TimeUnit ;
      rdfs:label "Second"^^xsd:string ;
      qudt:abbreviation "s"^^xsd:string ;
      qudt:code "1615"^^xsd:string ;
      qudt:conversionMultiplier
              "1"^^xsd:double ;
      qudt:conversionOffset
              "0.0"^^xsd:double ;
      qudt:symbol "s"^^xsd:string ;
      skos:exactMatch <http://dbpedia.org/resource/Second> .
# ...
```

... We must be able to say that the numbers in a column have a physical unit with URI; to specify columnar metadata so that downstream tools don't need to try to sniff and cast between datatypes and lossily drop units from strings in column names:

- (_datatype_ _physical_unit_):
- (float64, "unit:SecondTime",)
- (float64, unit["SecondTime"],)

 
* URIs for columns (RDF)
* Document Metadata
* CSV -> JSON (-> JSON-LD -> RDF)
* CSV -> RDF
 
Could there be a file naming convention for specifying the extra CSVW header to apply_to or transform zero or more CSV files with?

filename.csv
filename.csv.csvw
filename.csv.csvwheader.jsonld.json
filename.csv.csvw.jsonld.json
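```python
from pathlib import Path


# read_csv() and read_csvw() are placeholders for a plain CSV reader and a
# CSVW-aware reader; they are not existing rdflib functions.
def read_table(uri, *args, **kwargs):
    # Use the CSVW reader when a sidecar metadata file sits next to the CSV
    if Path(uri + '.csvw.jsonld.json').exists():
        return read_csvw(uri, *args, **kwargs)
    return read_csv(uri, *args, **kwargs)


# e.g. read_table('filename.csv')
```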

Wes Turner

Jul 31, 2021, 8:14:39 AM
to rdfli...@googlegroups.com
Remaining rdflib-jsonld work:

- Content Negotiation
  So that e.g. @context: https://schema.org/ correctly resolves the Link: header

- Only access external resources if RDFLIB_CONFIG or rdflibconfig['allow_access_external_resources']
  instead of by default, which is what 6.0 is currently doing:

  "URLInputSource can be abused to retrieve arbitrary documents if used naïvely"

  - Should RDFlib cache @contexts with requests-level caching with requests-cache or CacheControl or something else?

  - Should RDFlib cache at least contexts in the JSON LD Recommended Context [so that @context: https://schema.org/ works out of the box]?
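
A sketch of how the opt-in and caching ideas above could fit together, independent of rdflib's actual internals. The `CONFIG` key mirrors the setting name suggested above, but `CONFIG`, `BUNDLED_CONTEXTS` and `make_context_loader` are hypothetical names, and the bundled schema.org entry is a stand-in rather than the real context document. The network call is passed in as `fetch` so the policy can be exercised without network access.

```python
from functools import lru_cache

# Hypothetical setting and bundled contexts; rdflib does not define these names.
CONFIG = {"allow_access_external_resources": False}
BUNDLED_CONTEXTS = {"https://schema.org/": {"@vocab": "https://schema.org/"}}


def make_context_loader(fetch, config=CONFIG, bundled=BUNDLED_CONTEXTS):
    """Return a cached loader that only calls fetch() when allowed."""

    @lru_cache(maxsize=128)
    def load(url):
        if url in bundled:
            return bundled[url]  # served locally, no network access
        if not config["allow_access_external_resources"]:
            raise PermissionError(f"external context blocked: {url}")
        return fetch(url)  # result is cached for later calls with this URL

    return load
```

With this shape, `@context: https://schema.org/` would resolve out of the box from the bundled copy, and any other remote context would require an explicit opt-in.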

lix joy

Aug 5, 2021, 6:35:00 AM
to rdfli...@googlegroups.com
Hi, does RDFLib currently support RDF-star and SPARQL-star? Thanks.

On Sat, Jul 31, 2021 at 8:14 PM, Wes Turner <wes.t...@gmail.com> wrote:

Miel Vander Sande

Aug 13, 2021, 3:15:11 AM
to rdfli...@googlegroups.com
Hi Wes,

On Sat, Jul 31, 2021 at 12:45 PM, Wes Turner <wes.t...@gmail.com> wrote:


On Wed, Jul 28, 2021 at 4:49 AM Miel Vander Sande <miel.van...@meemoo.be> wrote:
Hi Nick,

TBH, it's pretty much a function that converts a Dict or a JSON file in a streaming fashion: https://github.com/viaacode/construction-site/blob/main/construction_site/parse_functions.py. I think it's a stand-alone thing; I don't plan anything extra on that specifically, with maybe the exception of a cmd interface (hence the proposed refactoring of csv2rdf)
 
Profiling / [comparative] benchmarks with e.g. Scalene [1][2] and/or perfplot [3] (%timeit) [4][5] could be worthwhile.


ijson [6] looks like it has some interesting features; iterative, asyncio, push. How does the performance compare?


Compared performance to what? 
 

I do plan to develop more components that assist scalable ETL, data-to-rdf like tasks. This includes a plugin for Apache Airflow ("provider"), which would be good as a RDFLib family repository.

- The datasette and dogsheep projects have a bunch of *-to-sqlite utils and an interface that a number of projects on PyPI have implemented:

      - parse datetimes in CSVs
        - xsd:datetime (and schema.org/Date and schema.org/dateCreated and schema.org/dateModified) specifies that time will be specified in ISO8601 formats


nice pointers, thanks!
To be clear: I'll start with refactoring what's there, but I'm not sure how many improvements I can make to the CSV side of things. I don't really have an immediate use case for CSV and there are already so many tools that do that, including the ones you mention.
 
  > PyRDB2RDF provides RDFLib with an interface to relational databases as RDF stores. The underlying data is accessed via SQLAlchemy. It is mapped to RDF according to the specifications of RDB2RDF. The corresponding RDF graph is represented as an RDFLib graph.
  >
  > Translating from relational data to RDF via direct mapping is currently supported. Translating in the other direction and mapping with R2RML are planned but not yet implemented.

Keeping a csv2rdf implementation around is probably not that useful if you have tools like this (but I haven't seen much for JSON).
 

I've co-developed the RML language and the KG construction community group is working on a possible standardisation. I've seen the code of this particular implementation though, and I don't see it lasting very long. A good, maintainable RML processor in Python would of course be very useful.
 
  
  - Does this handle datetimes?

- Generate JSONschema from JSON and SHACL from JSON-Schema:
  - https://pypi.org/project/genson/ has been recently updated
    > JSON-LD Schema defines a simple 'semantics' JSON-Schema vocabulary (effectively a JSON-Schema meta-schema) that reuses the official JSON Schema for JSON-LD to provide definitions for @context and @type properties. These annotations can be used to provide JSON-LD context for a JSON-Schema document. Provided this JSON-LD context, constraints over named 'properties' in a JSON Schema document can be understood as constraints over CURIES of JSON-LD documents following the context rules defined in the JSON-LD specification.

That's interesting work for two-layered validation of JSON-LD, which I can use in our metadata pipelines!
No, it just streams N-Triples. That could be achieved by piping the output to some streaming JSON-LD implementation.

Cheers,

Miel
 