2020-05-13 Notes and Feedback Form for today's WL3C

Zainan Zhou (a.k.a Victor)

May 13, 2020, 5:32:18 PM
to WikiLoop Coalition
Hi WikiLoop Coalition (bcc'ing every individual participant of today's call),

Thank you for attending today's WikiLoop Coalition Conference Call (WL3C). Here are today's unrefined notes.

Useful Link
- Join our Google Group to subscribe to future calendar invites and emails. Feel free to tell friends who may find this group relevant.

2020-05-13 Adding Citations

Speakers

  • Semantic Scholar: Sebastian Kohlmeier

  • Citation Hunt: Guilherme Gonçalves <ggonc...@google.com>

  • Citation-related work in Wikidata.org / WMDE: Lydia Pintscher


Self-Introduction: 

  • Aaron Halfaker: WMF Principal Research Scientist, AI team for detecting vandalism, quality, topic modeling. Studies quality control dynamics in wikis like Wikipedia.

  • Elan: Google, WikiLoop Team, WikiLoop Game

  • James Hare: working on Wikibase for COVID-19 research. Documenting works on Wikidata / citations on Wikipedia.

  • Maria: Google Open Source Team / WikiLoop communication lead.  Interested in citations.  Makes open information reliable for users. 

  • Thad: Big data, different sort of citation, cite / referenceable datapoint, most interested in machine-readable cite formats

  • Vinay: Google Brain, helping create a tool to assist WP editors in creating stub articles.

  • Lydia: Product Management for WD at WMDE.

  • Guilherme Gonçalves: SRE at Google, Citation Hunt author. Thinks of citations as a good avenue for micro-contributions.

  • Sebastian: Senior PM at the Allen Institute for AI

  • SJ Klein: federated data, Wikipedia + Knowledge Futures group 


Citation Graph by Allen Institute for AI, Sebastian Kohlmeier, Link to Slides

  • Allen Institute for AI: contribute to humanity through AI...

    • Semantic Scholar: science and technology academic article search engine.

    • COVID-19 research repository "CORD-19": parsed and extracted a large corpus of related academic articles

  • CitationGraph

    • 186M+ Papers, 1B+ Citations, expand coverage with Microsoft Academic Graph and publishers

    • Citation Classification: background citation, result citation, method citation.

    • Citation Alert

  • CitationGraph + Wikipedia Collaboration

    • Citation Template Integration (S2CID) / Semantic Scholar Citation ID

    • Initial work with S2 Author IDs in Wikidata

    • Integration with Citation Bot

  • LoopRequest: want to be able to recommend citations for Wikidata, using similar text.


Citation Hunt by Guilherme Gonçalves


Wikidata and Citation (Lydia) Link to Slides

  • Wikidata grows and needs automation!

  • Automation related to citation

    • Hoped for cleaner data, but not always so.

  • We can do statistics on this when we have enough instances of a value (country demographics).  But things like author names may only appear once or twice: pretty poor.

  • Prototype: generated a bunch of refs, got feedback from the community

  • Next: Wikidata game instances for this; early result: +300 refs this way [how does this compare to 1Lib1Ref in terms of participation, time-per-cite?]

  • Want more input on how to make this more effective!

  • Ideas

    • 1) schema.org: look at Type, to infer meaningful/high quality cites

    • 2) models that could be trained: 
      A: how suspicious is a cite?  A1: How important is it to find a source for this, if it will be kept?  A2: where a source exists, how important is a double-check that the source makes the claim it's associated with?
      B: how trustworthy is a claim [cited or not]?  Context includes the age of the account that posted it, the amount of context for the cite, the # of other claims for that entity, the amount of visibility/traffic the entity-page and the property get, the trustworthiness of any source, as per A above

    • 3) connect w/ 1Lib1Ref and CitationHunt 

    • 4) look across the web for pages w/ text that makes a similar claim (entity E w/ value V for property P: look for {E,V,P} in proximity anywhere in a) SemSchol, b) CommonCrawl, c) Google)
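A minimal sketch of idea 4, assuming plain page text is already in hand: it checks whether an entity label, a property word, and a value all occur within a small token window of each other. The function names, the 20-token window, and the exact-match strategy are illustrative assumptions, not something presented on the call; a real pipeline would likely match Wikidata labels and aliases and would have to fetch text from Semantic Scholar, CommonCrawl, or a search engine, none of which is shown here.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def find_positions(tokens, phrase):
    """Return the start indices where a (possibly multi-word) phrase occurs."""
    words = tokenize(phrase)
    n = len(words)
    if n == 0:
        return []
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]


def claim_in_proximity(text, entity, prop, value, window=20):
    """True if entity, property, and value all start within `window` tokens.

    `window` is a hypothetical tuning knob, not a value from the notes.
    """
    tokens = tokenize(text)
    e_pos = find_positions(tokens, entity)
    p_pos = find_positions(tokens, prop)
    v_pos = find_positions(tokens, value)
    for e in e_pos:
        for p in p_pos:
            for v in v_pos:
                if max(e, p, v) - min(e, p, v) <= window:
                    return True
    return False


if __name__ == "__main__":
    page = "Douglas Adams was born in Cambridge in 1952 and later wrote novels."
    print(claim_in_proximity(page, "Douglas Adams", "born", "Cambridge"))
```

Pages that pass this check are only candidates for a reference; someone (or a stronger model, as in idea 2) would still need to confirm that the page actually supports the claim.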



LoopRequest and LoopOffer

  • AI2 offers citation recommendation candidates, Citation Hunt to suggest (long term), Guilherme + Sebastian

  • Citation Hunt to support Wikidata, Guilherme + Elan + Lydia

  • "Special" LoopOffer: Thad is there to help (and for free!) with respect to schema.org

  • Wikidata to make use of 1Lib1Ref, Guilherme + Lydia

  • Late request: intern to help w/ WP-on-IPFS, SJ + Santhosh

  • Thad: can we use "Google Crowdsource" for abbreviations, other properties?
    → can highlight schema.org props that are often abbreviated (not canonical strings)
    → could help Lydia




Interested Guests to invite:  Thad, Daniel




Victor,
Google Search