James Hare
Thad Guidry
Sebastian Kohlmeier
Zainan Zhou
The following topics, LoopRequests, and LoopOffers were discussed:
Vandalism patrol + Wikiloop Battlefield
Useful KG standards for wikilooping
Vandalism patrol
Wikiloop Battlefield - staying neutral while providing an interface.
Maintaining community interest w/ a simple + universally useful tool.
Q: why not provide 'good faith' label options? A: keep it simple; avoid value judgements
Community members care a lot about careful assessment. Collecting facts, review.
Where we can contribute: speed up that evidence-gathering.
Goals: lower the barrier to contribute
Can get many more labels from regular readers/users than from wiki editors - working on capturing this. (comparison: you get billions of spam emails, many fewer vandal edits)
Loop requests from WLB: contributions + interface feedback (link to slide w/ invitations)
Loop offer: University of Dallas - Prof Andrews + students may be interested in collaboration + helping w/ such work
Other models?
Predictive models -- Is anyone training a model to predict when a future edit (the next edit on a page, the next edit by an author) will trigger ORES? Related: models of the potential challenges to a current edit, based on metadata other than (author age) and (edit diff): recent changes to the article, recent changes for the entire wiki, other topic modeling, …
Do trends matter to spam? Could we feed G + W trends into tools to help focus attention where there will be the most trouble? Cf. early-onset protection (perhaps very gradually titrated, less-restrictive protection options?)
Similar projects that interface w/ or use ORES : ?
Q: does ORES bad-actor data track ‘repeat actors’? Possibly not.
Related: huggle doesn’t track good edits.
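One way to ground the predictive-model discussion: any such model would start from the per-revision scores ORES already exposes. A minimal sketch of building a score request against the ORES v3 REST API (the wiki, revision ID, and model name below are illustrative placeholders; the path shape follows the public ORES docs):

```python
# Minimal sketch: scoring one revision with the ORES "damaging" model.
# Wiki context, revid, and model are placeholders for illustration.
ORES_BASE = "https://ores.wikimedia.org/v3/scores"

def ores_score_url(wiki: str, revid: int, model: str = "damaging") -> str:
    """Build the ORES v3 URL for scoring a single revision with one model."""
    return f"{ORES_BASE}/{wiki}/{revid}/{model}"

# To actually fetch a score (network call, not run here):
# import json, urllib.request
# with urllib.request.urlopen(ores_score_url("enwiki", 123456)) as resp:
#     print(json.load(resp))

print(ores_score_url("enwiki", 123456))
```

A predictive model as discussed above would consume these scores as training labels rather than querying them live per edit.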
Loop request: Tracking across projects
Behavioral fingerprints, textual fingerprints? Tracking individual problems and sock-puppets across projects.
~ Track across namespaces and wikis
~ Track across different tools and platforms (compare spam remediation: talk to blacklist maintainers, WP plugin designers), wikis, blogs
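To make "textual fingerprints" concrete, here is an illustrative sketch (not any existing tool): represent a user's edit summaries as character-trigram sets and compare them with Jaccard similarity, a crude signal for sock-puppet candidates across projects. The example strings are made up.

```python
# Illustrative "textual fingerprint": character trigrams + Jaccard overlap.
def trigrams(text: str) -> set:
    """Lowercased character trigrams of a string."""
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Two edit summaries from a hypothetical sock-puppet pair:
s1 = trigrams("rv vandalism, restoring sourced text")
s2 = trigrams("rv vandalism restoring the sourced text")
print(round(jaccard(s1, s2), 2))
```

A real cross-wiki tracker would combine many such signals (timing, topics, behavior), but the shape of the comparison is the same.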
Other counter-spam approaches? [many problems considered solved in academia don’t scale well, so simpler minimal approaches are implemented. tguidry linkedin]
MS Sharepoint spam?
YT misinfo spam: the problem is framed quite differently. Comments: be nice. Videos: don’t cross certain bright lines (clickbait, illegal content)
Useful KG standards (and categories)
What are we using / what do we need?
Beyond schema.org: useful protocols, ontologies, specifications
TG: works often on linked data protocols, and related ontologies
How can we improve linked data implementations within Wikidata?
Examples:
~ data shapes + constraints (ShEx), Mutexes (from FB)
Request: Export catalog of mutexes + descriptions -- BigMama + others
canonical functions to enforce constraints (??+)
Request: Is there a current project doing this?
Look into → OKFN catalog, Linked Open Vocabularies (terms), [SJ/TG]
Wanted: incompatible type-matrix, meta-catalogs [support unmerge-requests]
Framework for communal agreement on incompat + other type-annotation
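A toy sketch of what the wanted incompatible-type matrix ("mutexes") could look like in code: pairs of types that should never co-occur on one item, flagging candidates for unmerge requests. The type names below are illustrative, not a real catalog.

```python
# Hypothetical mutex catalog: type pairs that should never co-occur on
# one item (e.g. an item that is both a human and a taxon likely needs
# an unmerge). These entries are illustrative, not an agreed standard.
MUTEX_PAIRS = {
    frozenset({"human", "taxon"}),
    frozenset({"human", "disambiguation page"}),
}

def mutex_violations(item_types: set) -> list:
    """Return every mutex pair that an item's declared types violate."""
    return [sorted(pair) for pair in MUTEX_PAIRS if pair <= item_types]

print(mutex_violations({"human", "taxon", "politician"}))
```

The "framework for communal agreement" would then be about who may add, dispute, or annotate entries in a shared catalog like MUTEX_PAIRS.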
~ canonical ways to refer to a function (code + a VM to run it?) :
Future: WikiLambda, other code-wikis
Request: Existing collections of [functions] that would make good Z-objects
~ dataset [ontologies]: update-tempo, patch-mechanism, detailed source tree?
Request: Good current standards for (maintainable, shared datasets)
What standards does Google Dataset Search care about; where to find feeds of changes + submissions; how to connect bots for reading + writing
What other data seas* have helpful approaches to tracking inflows?
~ universal PIDs: concordances of IDs and authority files. (VIAF, Lens ID, …)
compare the fragility of URL shorteners and the Internet Archive work to preserve them. PIDs are basically URL shorteners...
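A toy sketch of a PID concordance: one entity keyed to its identifiers in several schemes (Wikidata, VIAF, Lens, ...). "Q42" is a real Wikidata QID; the other value is a labeled placeholder, not a real record.

```python
# Toy PID concordance: entity -> {scheme: identifier}.
# "Q42" is real Wikidata; the VIAF value is a placeholder.
concordance = {
    "Douglas Adams": {
        "wikidata": "Q42",
        "viaf": "EXAMPLE-VIAF-ID",  # placeholder, not a real VIAF cluster
    },
}

def lookup(name: str, scheme: str):
    """Resolve an entity name to its identifier in a given PID scheme."""
    return concordance.get(name, {}).get(scheme)

print(lookup("Douglas Adams", "wikidata"))
```

At scale this is exactly the concordance/authority-file problem: keeping such a table complete, consistent, and resolvable as schemes come and go.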
other KG standards - which [W3C, IEEE] ones to attend to? (Wikidata is often an effective substitute for more brittle + specialized standards, how to engage those processes to combine forces?)
https://datacommons.org/ -- [Q for us: connected w/ the Open Data Commons?]
To what extent does this offer good examples of [dataset ontologies + PID-sets]?
Some metadata fields for Datasets are within here: https://schema.org/Dataset
Idea: Typeset a process/bot to use/process data from specific sources.
Idea: Define a ShEx? shape from a subset of Dataset fields that provides the needed affordances [ draft some rules ] → Start by drafting rules as comments?
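One possible opening draft of such a shape, using a subset of the fields listed at https://schema.org/Dataset. The required/optional cardinalities here are a sketch to start the conversation, not an agreed rule:

```shex
PREFIX schema: <http://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Draft shape: minimal schema.org/Dataset fields a maintainable,
# shared dataset might carry. Cardinalities are a first sketch.
<#DatasetShape> {
  schema:name xsd:string ;
  schema:description xsd:string ;
  schema:license IRI ;
  schema:dateModified xsd:date ? ;   # update tempo
  schema:distribution IRI + ;        # where the data lives
  schema:isBasedOn IRI *             # source tree / provenance
}
```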
Orig Proposal: http://blog.schema.org/2012/07/describing-datasets-with-schemaorg.html
Related: https://arxiv.org/abs/1903.03096
[aside: Leigh Dodds @ODI on data trusts + related norms]