Michael Bar-sinai

May 16, 2019, 6:28:46 PM
to Dataverse Users Community
Hello all,

In the coming community call (5/21) we'll try to kick off a discussion about adding ActivityPub integration to Dataverse. Here's some background, as I assume most people are not familiar with it.

ActivityPub is a protocol recommended by W3C. From their site:

The ActivityPub protocol is a decentralized social networking protocol... It provides a client to server API for creating, updating and deleting content, as well as a federated server to server API for delivering notifications and content.

Why would we want that? And should this replace current harvesting that uses OAI-PMH?

Dissemination is, of course, an important part of "doing science". Now, consider exposing the activity inside a Dataverse instance as if it were a social network. Newly published dataset versions and dataverses could be presented as "posts" or "tweets". Users would be able to follow a dataverse using an ActivityPub-compliant client app (of which there are many, both for mobile and desktop). Users could also follow a subject, and have newly published dataset versions appear on their social feed. The same could work with keywords. So far, this was not feasible, since all popular social networks were based on proprietary APIs. ActivityPub's popularity (mostly thanks to Mastodon) has changed that.
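
To make the "posts" idea a bit more concrete, here is a minimal sketch in Python of how a newly published dataset version might be expressed as an ActivityStreams 2.0 `Create` activity, the vocabulary ActivityPub builds on. The function name, the example URLs, the id scheme, and the choice of a `Note` object are all assumptions made for illustration; nothing here is an existing Dataverse API. The `Hashtag` tag type is a convention used by Mastodon rather than part of the core vocabulary.

```python
import json

# Standard ActivityStreams 2.0 context and the special "public" audience.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def dataset_published_activity(actor_url, dataset_url, title, version, keywords):
    """Build a Create activity announcing a published dataset version.

    Hypothetical helper: the id scheme and field choices are illustrative.
    """
    note = {
        "id": f"{dataset_url}#v{version}",
        "type": "Note",
        "attributedTo": actor_url,
        "name": title,
        "content": f"New dataset version {version}: {title}",
        "url": dataset_url,
        # Keywords become hashtags so that clients can follow them.
        "tag": [{"type": "Hashtag", "name": f"#{k}"} for k in keywords],
        "to": [AS_PUBLIC],
    }
    return {
        "@context": AS_CONTEXT,
        "id": f"{dataset_url}#v{version}/activity",
        "type": "Create",
        "actor": actor_url,
        "to": [AS_PUBLIC],
        "object": note,
    }


if __name__ == "__main__":
    activity = dataset_published_activity(
        "https://dataverse.example/dataverse/root",  # hypothetical actor
        "https://dataverse.example/dataset/42",      # hypothetical dataset
        "Bird counts 2019",
        "1.0",
        ["ecology"],
    )
    print(json.dumps(activity, indent=2))
```

A real server would also have to serve an actor document and deliver the activity to followers' inboxes; this sketch covers only the payload.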

Taking this thought a bit further, if other scholarly systems (such as journals, micro-publication outlets, pre-print archives, and funding agencies publishing grants) expose an ActivityPub interface, people would be able to curate their own "academic feed" that suits their interests, in the same manner we curate our social feeds today. This would facilitate dissemination and discovery of research opportunities. But that's left to future work, as they say.

In the community call, the plan is to discuss this idea, and see if it's worth exploring it further.

See you on the 21st.

-- Michael

P.S. I think in Dataverse, ActivityPub could live side-by-side with OAI-PMH, but could also replace it in the long run.

Crosas, Mercè

May 17, 2019, 8:44:53 AM
to dataverse...@googlegroups.com
Thanks, Michael. This is a very interesting idea worth exploring. I do not think that it should replace OAI-PMH though.

Thanks for organizing the call.

Mercè 

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To post to this group, send email to dataverse...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/88d365af-e7b3-494b-aacf-29ac8388bc99%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Mercè Crosas, Ph.D.
Harvard University's Research Data Officer, Office of Vice Provost for Research
Chief Data Science and Technology Officer, Institute for Quantitative Social Science

Philip Durbin

May 22, 2019, 12:40:29 PM
to dataverse...@googlegroups.com
That was a really great call. I think I took most of the notes below and I apologize for any errors. Thanks for presenting the idea, Michael!


2019-05-21 Dataverse Community Call

Agenda

* #dataverse2019 and recent releases
* ActivityPub ( https://www.w3.org/TR/activitypub/ )
* Community Questions

Attendees

* Merce Crosas (IQSS)
* Danny Brooke (IQSS)
* Gustavo Durand (IQSS)
* Mike Heppler (IQSS)
* Phil Durbin (IQSS)
* Tania Schlatter (IQSS)
* Slava Tykhonov (DataverseEU)
* Sherry Lake (UVA)
* Paul Boon (DANS)
* Michael Bar Sinai (IQSS)
* Paul Dante (UBC)
* Pierre-Antoine Rault (Loria - OLKi project)
* Julian Gautier (IQSS)
* Jim Myers (QDR)

Notes

* #dataverse2019 (https://projects.iq.harvard.edu/dcm2019)
* Dataverse 4.14 is out. OpenAIRE compliance, ability to move datasets (thank you, Paul Boon!): https://dataverse.org/blog/dataverse-414
* (Slava) Hackathon will be on Monday and Tuesday?
   * (Danny) The official hackathon is on Wednesday. Stefan will be hacking on pyDataverse on Tuesday.
* (Paul) Is PostgreSQL 9.6 required? It seems to be.
   * (Phil) There was a good conversation recently in this "Flyway, new SQL script process, PostgreSQL version" thread: https://groups.google.com/d/msg/dataverse-dev/CTRpKg0xP2o/SKWHedmvBQAJ
* ActivityPub ( https://www.w3.org/TR/activitypub/ )
   * A very initial idea about Dataverse and ActivityPub ( https://cryptpad.fr/pad/#/2/pad/edit/PP6iDKQqXMy8uH016MnH0hIE/ )
   * ActivityPub discussion on the mailing list ( https://groups.google.com/d/msg/dataverse-community/hekvbHfD-3w/nN5is0nDAQAJ )
   * (Michael) A protocol that allows federated social networks. Like Twitter but open source. ActivityPub is getting a lot of traction. The most popular or familiar tool is Mastodon. ActivityPub is a proper W3C standard. We can piggyback on it. It could be a better mechanism for disseminating things from Dataverse. Let's not replace something that's working like OAI-PMH. What if we expose Dataverse as a Twitter clone? People could follow tags or keywords from a different Dataverse instance. You publish a dataset and then information about it gets distributed across the federated ActivityPub network. Disseminating academic data across a standard protocol. Academics could curate their own social feed.
   * (Merce) It's not only that OAI-PMH is working, it's a very solid stable protocol that's used by other systems. Even within a Dataverse installation, could there be sharing and subscribing?
   * (Michael) Yes.
   * (Phil) Can you talk at all about the use of JSON-LD? Does it help with flexibility?
      * (Michael) ActivityPub uses ActivityStreams. LD is for Linked Data. We'd need to work it out.
      * (rigelk) More in-depth example to see how LD is used: https://blog.joinmastodon.org/2018/06/how-to-implement-a-basic-activitypub-server/
   * (Slava) Some time ago I had a chat with Herbert van de Sompel, the creator of OAI-PMH, and he has an idea for another protocol called Signposting. https://www.slideshare.net/hvdsomp/signposting-overview-version-november-2017 and http://signposting.org
      * (rigelk) it doesn’t cover the same range of functionality as what ActivityPub could bring (i.e.: inboxes and notifications per user/instance)
   * (Michael) As an academic I could get notifications in my normal social feed. I'm not saying ActivityPub is the best protocol per se but it has a strong following with millions of users. The idea is to piggyback off a popular language.
   * (rigelk) OAI-PMH may be limited (no links to files for instance). An AP dialect could bring more information.
   * (Merce) What is ActivityPub used for?
      * (Michael and Phil) Peertube, Funkwhale, another that’s like Instagram (Pixelfed)
   * Implementation report: https://activitypub.rocks/implementation-report/
   * The next step could be a small project that shows the benefit of this for the community
      * (Merce) Maybe a diagram that shows benefits to end users and use cases.
      * (rigelk) Showing that users from Mastodon, Pleroma, and other multiple-type ActivityPub platforms could subscribe to Dataverses and receive notifications on new datasets is probably the most obvious benefit to show.
      * Possibly for the Community Meeting (poster)
* Community Questions
   * (Slava) Building a service, Oliver is helping. We need to get off Glassfish 4.1, maybe to Payara.
      * (Danny) Gustavo is coordinating this.
      * (Merce) We can learn from what Bob and Ellen are doing.
   * UI Testing - Students from the University of Zurich are working on #5846 - Slava, any interest in coordinating? https://github.com/IQSS/dataverse/issues/5846
      * (Slava) we’re setting up the infrastructure right now, so the time is not right.
   * (Jim) Is ActivityPub more of an additional ‘Share’ option than something to compare with OAI-PMH? ‘Share’ is the Dataverse button that lets you tweet, etc.
      * (Michael) I don’t see it as a Share option, more of a push option (keywords, tags, authors, etc.)
      * (rigelk) you can share data as LD/ActivityStreams and just have it static, much like OAI-PMH. But the whole advantage of using ActivityPub is to have a pub/sub server with it.
      * Use case – answer: What would be useful to be able to follow?
      * (rigelk) different examples exist in the existing implementations of AP. PeerTube servers can follow each other and get notified on new content on the servers they follow. Users can also have an account that can be subscribed to individually (like on Mastodon/Pleroma/others).
      * (rigelk) you can get an idea of what kind of functionality can be shared across instances by looking at the AS vocabulary used by its software: https://docs.joinpeertube.org/lang/en/devdocs/federation.html or https://docs.joinmastodon.org/development/activitypub/
   * (Slava) Thank you to Jim Myers for creating the "File Previewers": http://guides.dataverse.org/en/4.14/installation/external-tools.html
   * (Merce) I'd like to understand ORE. It's for an aggregation object, right? What are the main requirements for ORE? Can it have a tree or not?
      * (Jim) Sort of. The spec says a tree should be split into multiple ORE maps, but we’ve used dc:hasPart to create a tree in one map.
      * Will have a poster at Dataverse 2019…
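
The follow-and-notify behaviour discussed in the notes above (per-user inboxes, following a dataverse or a keyword, pub/sub delivery) can be sketched as a toy in-memory model in Python. This is only an illustration of the fan-out idea: the `FeedHub` class, the topic naming scheme, and the follower ids are all hypothetical, and a real ActivityPub server would deliver activities over signed HTTP POSTs to remote inboxes instead.

```python
from collections import defaultdict


class FeedHub:
    """Toy stand-in for follow + inbox delivery (no HTTP, no signatures)."""

    def __init__(self):
        self.followers = defaultdict(set)  # topic -> set of follower ids
        self.inboxes = defaultdict(list)   # follower id -> received activities

    def follow(self, follower, topic):
        """Record that `follower` wants activities published under `topic`."""
        self.followers[topic].add(follower)

    def publish(self, activity, topics):
        """Deliver `activity` once to every follower of any of `topics`."""
        recipients = set()
        for topic in topics:
            recipients |= self.followers[topic]
        for recipient in sorted(recipients):
            self.inboxes[recipient].append(activity)
        return sorted(recipients)


if __name__ == "__main__":
    hub = FeedHub()
    hub.follow("alice", "dataverse:root")
    hub.follow("alice", "keyword:ecology")
    hub.follow("bob", "keyword:ecology")
    delivered = hub.publish(
        {"type": "Create", "summary": "Dataset published"},
        ["dataverse:root", "keyword:ecology"],
    )
    print(delivered)
    print(len(hub.inboxes["alice"]))
```

Collecting recipients into a set before delivery means a user who follows both the dataverse and one of its keywords receives the activity only once.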



Michael Bar-sinai

May 24, 2019, 9:50:18 AM
to Dataverse Users Community
Hello all,

The presentation about ActivityPub and Dataverse is now ready for your comments at:


To make this idea a bit more tangible, this presentation contains some mock and real screenshots. It is intended as a starting point for discussion, and is open for comments - email me if you want edit permissions.

Thanks @pdurbin for helping prepare it.

-- Michael

Michael Bar-Sinai

May 24, 2019, 10:03:47 AM
to dataverse...@googlegroups.com
Issue #5883 follows.


Philipp at UiT

May 26, 2019, 1:58:17 AM
to Dataverse Users Community
OAI-PMH is essential for us and should not be replaced by ActivityPub - unless of course all main stakeholders (e.g. DataCite) would replace OAI-PMH with ActivityPub.

Best, Philipp



Michael Bar-Sinai

May 26, 2019, 2:49:38 AM
to dataverse...@googlegroups.com
Of course - removing OAI-PMH was never the intention here. ActivityPub could connect Dataverse with an additional ecosystem (adding, not replacing). The only replacement that was contemplated was using AP for inter-Dataverse harvesting, which is - as far as I understand - an internal change.

Regards,
-- Michael
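
For contrast with the push model, OAI-PMH harvesting is pull-based: the harvester periodically requests records changed since its last visit. A minimal Python sketch of building such a request follows. The `ListRecords` verb and the `metadataPrefix`, `from`, and `set` arguments come from the OAI-PMH 2.0 specification; the function name and the base URL are assumptions for illustration.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc",
                         from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # only records changed on/after this date
    if set_spec:
        params["set"] = set_spec    # restrict the harvest to one OAI set
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint; a harvester would poll this on a schedule.
    print(oai_list_records_url("https://dataverse.example/oai",
                               from_date="2019-05-01"))
```

With ActivityPub the direction reverses: the publishing server delivers each new activity to its followers, so no polling schedule is needed.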
