Hey folks, I created a new mockup of a home for revcoding work. Such an interface could give our volunteer handcoders a window into the system's labeled-data needs and provide easy access to the revision handcoder for adding new data.

I'd like to throw in my two cents on the matter. After Halfak's explanation yesterday, I find his approach quite sound. I think there is great benefit in having a gadget with several campaigns, each divided into tasks that expire if people sit on them. This approach suits our crowd-sourcing culture at Wikimedia projects better; after all, crowd-sourcing itself is a divide-and-conquer strategy to begin with. -- とある白い猫 chi? 09:04, 21 February 2015 (UTC)

In the context of vandalism detection, top predictors could be used to improve the lists of badwords used by abuse filters, Salebot, and similar tools that do not use machine learning. In the specific case of the Salebot list, we could even use the learned weights to fine-tune the weights used by the bot.

This approach differs from the one currently in use in Revision-Scoring, where we just count the number (and the proportion, etc.) of badwords added in a revision. It is as if all words in the list had the same weight, which doesn't look quite right. Helder 13:10, 6 January 2015 (UTC)
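A minimal sketch of the difference, assuming a hypothetical BADWORD_WEIGHTS mapping learned from a linear model (the names and values are illustrative, not the actual Revision-Scoring code):

    import re

    # Hypothetical weights learned by a linear classifier; values are illustrative.
    BADWORD_WEIGHTS = {"idiota": 2.3, "burro": 1.1, "lixo": 0.7}

    def tokens(text):
        """Lowercased word tokens from the added text."""
        return re.findall(r"\w+", text.lower())

    def unweighted_badword_count(added_text):
        """Current-style feature: every listed badword counts the same."""
        return sum(1 for t in tokens(added_text) if t in BADWORD_WEIGHTS)

    def weighted_badword_score(added_text):
        """Proposed-style feature: each badword contributes its learned weight."""
        return sum(BADWORD_WEIGHTS.get(t, 0.0) for t in tokens(added_text))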

Gold ranks among the most valuable ores. In the 17th century, the Philosopher's Stone was a legendary substance believed to be capable of turning inexpensive metals into gold, and it was represented by an alchemical glyph. Our logo is inspired by this glyph, since the service we intend to provide will convert otherwise worthless ore (raw data) into gold ore. Mind that it is still ore for others to process, as this service's main goal is to enable other, more powerful tools.

Hey folks. Again, I'm late with this report. You can blame the mw:MediaWiki Developer Summit 2015. I talked to a lot of people there about the potential of this project and learned a bit about operational concerns related to service proliferation. Anyway, last week:

Wikipedians have proposed other reforms, too. The Wikimedia Foundation is funding research into more robust bots that could score the quality of site revisions and refer bad edits to volunteers for review. Another proposed bot would crawl the site and parse suspicious passages into questions, which editors could quickly research and either reject or approve.

So, there's been some discussion recently of ORES's performance and how it's not nearly as fast as we would like when requesting that a bunch of revisions get scored. I'd like to take the opportunity to document a few things that I know are slow. I'll sign each section I create so that we can have a conversation about each point.

Looking for misspellings is one of our most substantial bottlenecks. Right now, we're using nltk's "wordnet" in order to look up words in English and Portuguese. This is slow. On some pages, scanning for misspellings can take up to 4 seconds on my i5. That's way too much -- especially because we end up scanning at least two revisions for misspellings. So, I've been doing some digging and I think that 'pyenchant' might be able to help us out here. It uses your Unix-installed dictionaries to do lookups and it is much faster. Here's a performance comparison looking for misspellings in enwiki:4083720:
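As a rough sketch of the two lookup strategies (not the benchmark itself; this assumes the nltk wordnet corpus and the en_US enchant dictionary are installed, and the token list is illustrative):

    import time

    import enchant
    from nltk.corpus import wordnet

    en_dict = enchant.Dict("en_US")

    def is_word_wordnet(token):
        # Current approach: a token counts as a word if wordnet has a synset for it.
        return len(wordnet.synsets(token)) > 0

    def is_word_enchant(token):
        # Proposed approach: ask the system dictionary via pyenchant.
        return en_dict.check(token)

    tokens = ["the", "quick", "brwon", "fox"] * 1000

    for name, check in [("wordnet", is_word_wordnet), ("pyenchant", is_word_enchant)]:
        start = time.time()
        misspelled = sum(1 for t in tokens if not check(t))
        print(name, misspelled, round(time.time() - start, 3), "seconds")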

One way that we can improve this is by batching all of the requests in advance, before we provide the data to the feature extractor. So, let's say we receive a request to score 50 revisions: we would make one batch request to the API for the content of those 50 revisions. Then we would make another batch request to retrieve the content of all parent revisions. I think we can also batch the requests that check for a first edit to a page (specifying multiple page_ids to prop=revisions with rvlimit=1). We can batch the requests to list=users and list=usercontribs too. We'd then have to use the extractor's dependency injection to fill in these bits for each revision after the fact. For example:
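As a rough sketch of such a batch request (assuming the standard MediaWiki action API; this is not the extractor's actual code):

    import requests

    API_URL = "https://en.wikipedia.org/w/api.php"

    def fetch_revisions(rev_ids):
        """Fetch content for many revisions in one API call instead of one each."""
        response = requests.get(API_URL, params={
            "action": "query",
            "prop": "revisions",
            "revids": "|".join(str(r) for r in rev_ids),
            "rvprop": "ids|content|user|timestamp",
            "format": "json",
        })
        pages = response.json().get("query", {}).get("pages", {})
        revisions = {}
        for page in pages.values():
            for rev in page.get("revisions", []):
                revisions[rev["revid"]] = rev
        return revisions

    # One request for the 50 target revisions, then one more for all of their parents.
    revs = fetch_revisions([4083720, 4083721])
    parents = fetch_revisions([r["parentid"] for r in revs.values() if r.get("parentid")])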

It makes me a bit sad to do this, since we don't know whether revision.doc and parent_revision.doc are actually necessary in the code. We might want to provide some functionality at the ScorerModel level to allow us to check this. E.g.
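A purely hypothetical sketch of such a check (the names required_datasources and dependencies are illustrative, not revscoring's actual API):

    def required_datasources(scorer_model):
        """Walk a model's feature dependency tree and report which root
        datasources (e.g. revision.doc, parent_revision.doc) it actually needs."""
        needed = set()
        stack = list(scorer_model.features)
        while stack:
            dependent = stack.pop()
            children = list(getattr(dependent, "dependencies", []))
            if children:
                stack.extend(children)
            else:
                needed.add(str(dependent))  # a leaf, i.e. a root datasource
        return needed

    # Only fetch parent revision content when the model really depends on it.
    # if "parent_revision.doc" in required_datasources(model): ...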

Since we know that the majority of our requests are going to be for recent data, we could try to beat our users to the punch by generating scores and caching them before they are requested. Assuming caching is in place, we'd just need to listen to something like RCStream and simply submit requests to ORES for changes as they happen. If we're fast enough, we'll beat the bots/tools. If we're too slow, we might end up needing to generate a score twice. It would be nice to be able to delay a request if a score is already being generated so that we only do it once. --EpochFail (talk) 16:29, 26 April 2015 (UTC)
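A hedged sketch of that precaching loop (the RCStream namespace, ORES URL, and model name are assumptions, not the deployed configuration):

    import requests
    from socketIO_client import SocketIO, BaseNamespace

    ORES_URL = "https://ores.wmflabs.org/scores/enwiki/"

    class RCNamespace(BaseNamespace):
        def on_connect(self):
            self.emit("subscribe", "en.wikipedia.org")

        def on_change(self, change):
            rev_id = change.get("revision", {}).get("new")
            if rev_id is None:
                return  # not an edit (e.g. a log event)
            # Requesting the score is enough to warm the cache for later consumers.
            requests.get(ORES_URL, params={"models": "reverted", "revids": rev_id})

    socket = SocketIO("stream.wikimedia.org", 80)
    socket.define(RCNamespace, "/rc")
    socket.wait()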

@Ladsgroup: what is the condition for keeping/removing a string in the list? I believe it is so subjective, and that there are so many criteria, that I don't really know what to do with lists like these. I even kept the list I generated as is, due to the lack of an objective criterion for removing items from it.

I believe we need some kind of labelling effort to add (multiple) tags to each string in the lists, so that we can have different categories. I also don't know what the common categories would be, but there are many reasons why an edit adding a given string might be considered damaging:

I've been working with Yuvipanda to work out some performance and scalability improvements for ORES. I've captured our discussions about upcoming work in a series of diagrams that describe the work we plan to do.

My plan is to work from left to right, implementing improvements incrementally and testing against the server's performance. I've actually already been doing that, as we have been implementing improvements along the way. Right now, the basic flow has seen substantial improvements in misspelling look-up speed and in request batching against the API.

Hello User:Halfak (WMF), pinging you after your Wikimania talk. Super interesting stuff! I asked about direct database access to a full set of article quality scores. I'm maintaining the WikiMiniAtlas, for which I need a metric to prioritize articles shown on the map at a given zoom level. I'd prefer to show high-quality articles as prominently as possible (my current ranking is just based on article size). --Dschwen (talk) 15:47, 19 July 2015 (UTC)
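One way to build such a ranking would be to batch revision IDs against the ORES scoring API; the endpoint, model name, and response shape below are assumptions based on the current public service, not a substitute for direct database access:

    import requests

    ORES_URL = "https://ores.wikimedia.org/v3/scores/enwiki"
    QUALITY_ORDER = ["Stub", "Start", "C", "B", "GA", "FA"]  # worst to best

    def rank_by_quality(rev_ids):
        """Return revision IDs sorted from highest to lowest predicted quality."""
        params = {"models": "articlequality", "revids": "|".join(map(str, rev_ids))}
        scores = requests.get(ORES_URL, params=params).json()["enwiki"]["scores"]
        ranked = []
        for rev_id, models in scores.items():
            prediction = models["articlequality"]["score"]["prediction"]
            ranked.append((QUALITY_ORDER.index(prediction), rev_id))
        return [rev_id for _, rev_id in sorted(ranked, reverse=True)]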

Apologies if this discussion is already under way somewhere else... I wanted to ask about the training data for the revscoring:reverted models, and whether you plan to unpack the various motivations behind the "undo" action. In short, I think it's imperative that we present a multiple-choice field during undo, to allow the editor to categorize their reason for reverting. This will allow us to provide much higher quality predictions in the future.

To make a sloppy analogy, what we're currently doing is like training a neural network on the yes-no question "is this a letter of the alphabet?". What we want to do is have it learn which specific letter is which.

I'd like to see documentation on exactly how the training data is harvested, because I'm concerned that our revert model is actually capturing something ephemeral about on-wiki culture, which has shifted over time. Clearly you've considered this problem; the introduction to ORES/reverted suggests as much. For historical data, we would probably need to correlate reverts with debate about the revert: was this a contentious revert? Can we guess whether it was done in good faith? Was the outcome to roll back the revert? Was there an edit war? Also, was the revert destructive? Was the original author offended? Engaged? Retained? This seems like a really hard problem, which is why I'd suggest we focus on recent data only, to capture current norms around acceptable article style, and also introduce a self-reporting mechanism where the editor can categorize the reason for their revert.

Hi. Wikitrust has been, in my opinion, one of the most advanced tools for user & revision scoring. I'm surprised not to see an analysis of it here; it's not even in the list of tools. What is the reason for that? Regards, Kelson (talk) 12:33, 3 September 2015 (UTC)

We just had a good work session on our midterm report for the IEG. It became blatantly obvious that our methods for gathering metrics on requests, cache hits/misses, and the activity of our precached service left a lot to be desired. So I put in a couple of marathon sessions this week and got our new metrics collection system (using graphite.wmflabs.org) up and running.
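A minimal sketch of that kind of instrumentation, assuming a statsd daemon in front of graphite (the host, port, and metric names are illustrative, not our actual configuration):

    import statsd

    metrics = statsd.StatsClient("localhost", 8125, prefix="ores")

    def record_score_request(wiki, duration_ms, cache_hit):
        metrics.incr("requests.{0}".format(wiki))                 # request counter
        metrics.incr("cache.hit" if cache_hit else "cache.miss")  # hit/miss ratio
        metrics.timing("scoring_time.{0}".format(wiki), duration_ms)

    record_score_request("enwiki", duration_ms=732, cache_hit=False)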

Per the revscoring sync meeting, here are my thoughts on the matter. First off, we have established that bots are almost never reverted on Wikidata. Secondly, I think everyone can agree that on Wikidata bots FAR outweigh humans in terms of number of edits. As a consequence, we are dealing with an over-fitting problem where probably all human edits will be treated as bad, because the algorithm will give too much weight to features that basically distinguish bots from humans. This is more of an intuitive assessment than actual analysis; I could very well be wrong. I came to this assessment because on Wikidata the vast majority of good edits will come from bots, and bots will always dominate a random sample set. We can perform more selective sampling, but honestly I do not see the benefit of it.

Based on the two assessments above, I propose a different type of modelling for Wikidata than what we use on Wikipedias. First off, we need to segregate bot edits from other edits. This does have the potential to introduce bias, but I will explain how that can be avoided. So we would have two models for Wikidata: one for bots and another for non-bots. Two independent classifiers would be trained, and different classifier types could even be used; for instance, bot edits could use Naive Bayes while human edits are processed with an SVM. This is essentially a top-level decision tree whose first two branches delegate to the classifiers.
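A rough sketch of how such a split could look, assuming scikit-learn and a per-edit bot flag (feature extraction and model choices are illustrative, not the revscoring implementation):

    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.svm import SVC

    class BotSplitClassifier:
        """Top-level branch on the bot flag, delegating to two independent models."""

        def __init__(self):
            self.bot_model = GaussianNB()   # e.g. Naive Bayes for bot edits
            self.human_model = SVC()        # e.g. SVM for human edits

        def fit(self, features, labels, is_bot):
            features, labels, is_bot = map(np.asarray, (features, labels, is_bot))
            self.bot_model.fit(features[is_bot], labels[is_bot])
            self.human_model.fit(features[~is_bot], labels[~is_bot])
            return self

        def predict(self, features, is_bot):
            features, is_bot = np.asarray(features), np.asarray(is_bot)
            predictions = np.empty(len(features), dtype=int)
            if is_bot.any():
                predictions[is_bot] = self.bot_model.predict(features[is_bot])
            if (~is_bot).any():
                predictions[~is_bot] = self.human_model.predict(features[~is_bot])
            return predictions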
