31 down ratings in sequence on 2011-08-06 between 18:234 and 22:05
310 down ratings in sequence on 2011-08-08 between 10:39 and 22:25
509 down ratings in sequence on 2011-08-08 between 01:39 and 18:30
In between no neutral or positive voting for VAM.
Ratings before and after those 3 periods have been both: life changing
and unfulfilling. Given that the first section in the docs is "contact me
if you have trouble", it's hard to believe that those are honest ratings.
community: What do you suggest? I don't like the negative rating because
voters are not required to leave a note about why they downrate a plugin,
which would be very helpful. I've written about this earlier.
Now that manipulation clearly takes place I'd like to emphasize that
irc and mailing lists are probably more important than ever for getting
valuable advice based on experience rather than arbitrary down votes.
I'm not sure what can be done against it because down voters can adapt
easily. Some protection could be had by disallowing down votes - e.g.
there is a reason that Google and Facebook only accept "I like it / plus
one" rather than "I dislike it / minus one".
Thus we could remove all negative votes and continue - people could then
only upvote their own plugins, which would hurt other projects less.
I personally don't care: VAM surely has many happy and active users who
help me debug issues fast and I'm thankful for their support.
github and other arbitrary queries about any of the plugins I maintain
including snipmate etc. are usually processed within a couple of days
and the first reply is very likely to happen within 24 hours.
So don't get me wrong: I'm not complaining here. I'm telling those who
don't know it yet what I'm observing.
Sorry for this noise - but I think people should know about it.
If you want me to take action tell me to do so.
Marc Weber
<$0.02>
Since it only takes one rotten apple out there, to render a negative
voting system maliciously destructive of volunteer effort, yesterday
begins to look like a good time to turn this off. Any contact mechanism
which provides explicit detailed defect/deficiency feedback to the
author seems productive, in contrast. Public scrutiny makes a mailing
list (or a bugtracker/wishlist_logger) better suited for keeping
negative feedback real, I think.
A corrupted voting scheme does not seem to be as helpful as no voting
scheme.
</$0.02>
Erik
PS: The targeting of VAM is perhaps a mark of its popularity.
--
"Why make things difficult, when it is possible to make them cryptic
and totally illogical, with just a little bit more effort?"
- A. P. J.
> It happened to a plugin of tpope and to one of mine:
>
> 31 down ratings in sequence on 2011-08-06 between 18:234 and 22:05
> 310 down ratings in sequence on 2011-08-08 between 10:39 and 22:25
> 509 down ratings in sequence on 2011-08-08 between 01:39 and 18:30
>
> In between no neutral or positive voting for VAM.
This sounds like the same issue that affected one of Dr. Chip's plugins:
Thread on vim-dev, Subject: manpageview rating dive
As I explained there¹:
"""
...some search engine(s) grabbed the down-vote URL when crawling
www.vim.org. In this case, googling:
site:www.vim.org inurl:unfulfilling
(where 'unfulfilling' is the 'rating' value for a down-vote) comes up
with exactly one result for me:
ManPageView - [...]
[...]
Seems like the ratings should only use $_POST (PHP var), but they appear
to be using $_GET, too.
"""
Bram subsequently fixed the issue², but that didn't happen until after
the timeframe of your downvotes.
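
The $_GET versus $_POST point above can be illustrated with a minimal sketch, written in Python rather than the site's PHP. It is not vim.org's code: the function name, the argument layout and the rating names other than 'unfulfilling' are assumptions made for illustration, and the values -1, 1 and 4 are simply the ones that show up in the vote data posted elsewhere in this thread.

```python
# Hypothetical sketch: only a POST request may change the stored ratings.
# handle_rating() and its arguments are illustrative, not vim.org's code.

# 'unfulfilling' is the down-vote value named in the thread; the other two
# names are assumed.
ALLOWED_RATINGS = {"unfulfilling": -1, "helpful": 1, "life_changing": 4}


def handle_rating(method, params, votes):
    """Record a vote only for POST requests; ignore everything else.

    method -- HTTP method of the request, e.g. "GET" or "POST"
    params -- dict with "script_id" and "rating" keys
    votes  -- list that collects (script_id, value) tuples
    """
    if method != "POST":
        # Following a plain link issues a GET, so nothing is stored.
        return False
    value = ALLOWED_RATINGS.get(params.get("rating"))
    if value is None:
        return False  # unknown or missing rating name
    votes.append((int(params["script_id"]), value))
    return True


votes = []
crawled = handle_rating("GET", {"script_id": "2905", "rating": "unfulfilling"}, votes)
submitted = handle_rating("POST", {"script_id": "2905", "rating": "unfulfilling"}, votes)
```

With this shape, a search result that links to a URL containing rating=unfulfilling cannot record a vote just by being clicked.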
Perhaps downvotes from some period of time could be discarded (by
whoever has access to the raw data and can see an overall pattern).
¹: https://groups.google.com/d/msg/vim_dev/-TVtxlNoi98/QqoPvye3bOIJ
²: https://groups.google.com/d/msg/vim_dev/-TVtxlNoi98/So-Bu1d_Ij8J
> [...]
>
> community: What do you suggest?
That you ignore the ratings, as most people seem to do (I never notice
them, personally). Probably they're not very useful in general, so
maybe removing them altogether is the right solution. As Erik writes:
A corrupted voting scheme does not seem to be as helpful as no voting
scheme.
Although, I'm also likely not the right person to ask, since I'm much
more likely to install things I find on github or directly on Dr. Chip's
site than anything from vim.org. Most of the scripts I see on the
scripts site seem to be abandonware, whereas the ones on github that
come up in general google results usually have clear indications of
recent activity.
--
Best,
Ben
Regards,
Chip Campbell
I do think that a negative rating is useful. Especially if a script
doesn't work well or there is another script that is much better.
What we don't want is a single user repeatedly giving negative ratings.
Since only one rating can be given per IP address, this should not
happen. But the "-1" ratings in sequence are coming from various IP
addresses. How can this happen? I can only think of the URL appearing
in a place where many people would click on it. Or a botnet that has
been set up to do this, which would be really weird. After the main
stream of -1's there are a few more the following day. That would
suggest it's a link that users click on.
I have manually deleted the sequence of -1 ratings for script 2905.
Not a thing that we should need to do often.
--
Very funny, Scotty. Now beam down my clothes.
/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///
It's OK to downvote for a reason such as "contains executable".
But it's bad to have people give a negative rating without anyone knowing why!
The website should allow some minimal feedback.
> I have manually deleted the sequence of -1 ratings for script 2905.
> Not a thing that we should need to do often.
You're right: In open source malicious downvoting does not happen often
or should not. I've checked it: I can no longer access HTTP access logs.
Bram: do you still know when you changed the voting system using POST?
Then we can confirm what happened to Dr Charles Campbell's scripts.
Marc Weber
> Excerpts from Bram Moolenaar's message of Thu Nov 24 21:42:27 +0100 2011:
>> I do think that a negative rating is useful.
> It is - but not in the current shape.
> If you give a negative rating it's OK - but you should
> - allow the author to remedy the issue
> - force users downvoting scripts to leave a message
If you want inaccurate ratings, sure, do that.
That's what you could do if you're assuming that the ratings are for the
benefit of the plugin authors. I'd rather the ratings be useful for
users.
Anything that makes leaving a rating take more effort will artificially
inflate the ratings, which is bad for people trying to choose between
two or more plugins that do the same thing. If someone who actually
cares enough to go back and submit a rating isn't allowed to do so
without leaving a comment, they'll likely not leave a rating. But, I
don't see why not caring enough to help a plugin author fix the problem
would invalidate the user's opinion.
> It's OK to downvote for a reason such as "contains executable".
> But it's bad to have people give a negative rating without anyone knowing why!
> The website should allow some minimal feedback.
Allow? Yes, great idea. Force? No.
>> I have manually deleted the sequence of -1 ratings for script 2905.
>> Not a thing that we should need to do often.
> You're right: In open source malicious downvoting does not happen
> often or should not. I've checked it: I can no longer access HTTP
> access logs.
>
> Bram: do you still know when you changed the voting system using POST?
He wrote a message stating it was fixed on Sept. 2.
> Then we can confirm what happened to Dr Charles Campbell's scripts.
Either it was what I explained (Google crawled a link that caused bad
ratings when clicked) or it was malicious. Since it hasn't happened
since the change to POST (to either Manpageview or your script), seems
like evidence of the former.
--
Best,
Ben
It's like telling babies: "You're not useful to me - so I won't even
try to teach you to stand up and walk."
You don't see that it's the children who serve your needs when you're old
(unless you commit suicide). From this example it's easy to understand that
"little boys/girls" need guidance to be helpful to the community.
That's my understanding of open source.
> without leaving a comment, they'll likely not leave a rating.
Then they don't understand feedback loops. Then they should not vote.
Because it's *you* who benefits as a user if you read "downvoting because
script-id 326 gets the job done much better".
> > It's OK to downvote for a reason such as "contains executable".
> > But it's bad to have people give a negative rating without anyone knowing why!
> > The website should allow some minimal feedback.
> Allow? Yes, great idea. Force? No.
I agree it's debatable. Why do I prefer comments? Because I can validate and
verify them. I can't judge votes without comments - so I surely am in favour
of forcing comments.
> He wrote a message stating it was fixed on Sept. 2.
I've found the following scripts which got downvoted after Sept. 2.
I searched for scripts having 10 or more down votes in sequence since Sept. 2.
SCRIPT_ID / downvote count / time range
3695 /  40 (2011-10-23 09:09:23 - 2011-10-23 09:23:40)
2140 / 117 (2011-10-29 01:34:49 - 2011-10-29 02:07:53)
1435 / 100 (2011-10-23 08:28:59 - 2011-10-23 10:05:42)
 670 /  17 (2011-10-23 09:05:14 - 2011-10-23 09:09:59)
 294 / 191 (2011-10-23 07:43:33 - 2011-10-23 11:43:03)
 122 / 134 (2011-10-23 08:22:55 - 2011-10-23 09:07:22)
SCRIPT_ID / NAME (AUTHOR) => voting result
3695: commentary.vim : Comment stuff out; takes a motion as a target (Tim Pope) => 28/103
2140: xoria256.vim : Soft pastel gamma on dark background, same appearence in {,g}vim (Dmitriy Zotikov) => 249/245
1435: HiMtchBrkt : withdrawn (Charles Campbell) => -36/131
670: VisIncr : Produce increasing/decreasing columns of numbers, dates, or daynames (Charles Campbell) => 785/648
294: Align : Help folks to align text, eqns, declarations, tables, etc (Charles Campbell) => 1452/712
122: Astronaut (Charles Campbell) => -57/169
4 times Charles Campbell
1 time Tim Pope
1 time Dmitriy Zotikov
comment: I cleaned up commentary on Aug 20 21:07:23 2011, so it happened again.
As an example I attached the relevant data for 3695, see below. Even for
www.vim.org I can't believe that users vote on the same plugin every 4 secs.
If Bram fixed the issue on Sept. 2 then it's very likely that someone wrote a
script, or some other magic is going on that I can't imagine - maybe search
engines do follow forms as well? If so, why didn't it happen more often?
http://googlewebmastercentral.blogspot.com/2008/04/crawling-through-html-forms.html
.. says it may happen and might have happened in the past.
If this was the case, a robots.txt entry for Google would be a fix.
But then - why should Googlebot run that many queries if there are only 3 options?!
Doesn't make sense to me. Passing invalid data to web apps will very often
yield "internal server errors".
Additionally the following plugins still have 20 down votes in sequence within 6 hours:
515|2008-04-04 00:22:44|2008-04-04 00:26:00
2002|2008-04-04 00:40:53|2008-04-04 00:45:16
SCRIPT: USER
| python_fold | wiersma |
| python_ifold | hellhound |
Marc Weber
Sample data for script id 3695:
TWO RELEASES
+---------------------+
| creation_date |
+---------------------+
| 2011-08-20 16:57:40 |
| 2011-08-28 03:20:10 |
+---------------------+
is not related to votings IMHO:
ID | VOTING | DATE
3695 1 2011-09-23 19:06:32
3695 4 2011-10-21 03:48:14
3695 -1 2011-10-23 09:09:23
3695 -1 2011-10-23 09:09:24
3695 -1 2011-10-23 09:09:26
3695 -1 2011-10-23 09:09:30
3695 -1 2011-10-23 09:11:30
3695 -1 2011-10-23 09:11:52
3695 -1 2011-10-23 09:12:34
3695 -1 2011-10-23 09:12:42
3695 -1 2011-10-23 09:12:51
3695 -1 2011-10-23 09:12:59
3695 -1 2011-10-23 09:13:46
3695 -1 2011-10-23 09:14:13
3695 -1 2011-10-23 09:14:22
3695 -1 2011-10-23 09:14:29
3695 -1 2011-10-23 09:14:43
3695 -1 2011-10-23 09:14:49
3695 -1 2011-10-23 09:15:46
3695 -1 2011-10-23 09:16:10
3695 -1 2011-10-23 09:16:20
3695 -1 2011-10-23 09:17:38
3695 -1 2011-10-23 09:18:42
3695 -1 2011-10-23 09:19:00
3695 -1 2011-10-23 09:19:18
3695 -1 2011-10-23 09:19:32
3695 -1 2011-10-23 09:19:48
3695 -1 2011-10-23 09:21:01
3695 -1 2011-10-23 09:21:31
3695 -1 2011-10-23 09:21:38
3695 -1 2011-10-23 09:21:49
3695 -1 2011-10-23 09:21:55
3695 -1 2011-10-23 09:22:03
3695 -1 2011-10-23 09:22:12
3695 -1 2011-10-23 09:22:20
3695 -1 2011-10-23 09:22:32
3695 -1 2011-10-23 09:22:38
3695 -1 2011-10-23 09:22:43
3695 -1 2011-10-23 09:22:50
3695 -1 2011-10-23 09:23:27
3695 -1 2011-10-23 09:23:33
3695 -1 2011-10-23 09:23:40
3695 4 2011-10-27 18:08:25
3695 1 2011-11-21 13:48:30
3695 4 2011-11-22 14:45:56
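
The search for "10 or more down votes in sequence" can be sketched roughly as follows. This is only an illustration of the idea, not the query that was actually run against the vim.org database: the function name and the tuple layout are assumptions, and the sample rows are a few lines copied from the data for script 3695 above.

```python
from datetime import datetime


def downvote_runs(votes, min_len=10):
    """Return (script_id, count, first_time, last_time) for each run of
    at least min_len consecutive -1 votes on the same script.

    votes -- iterable of (script_id, value, "YYYY-MM-DD HH:MM:SS") tuples,
             already sorted by script and then by time.
    """
    runs = []
    current = []  # (script_id, datetime) pairs of the run collected so far

    def flush():
        # Keep the collected run if it is long enough, then start over.
        if len(current) >= min_len:
            runs.append((current[0][0], len(current), current[0][1], current[-1][1]))
        current.clear()

    for script_id, value, stamp in votes:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        # Any other vote, or a switch to another script, ends the run.
        if value != -1 or (current and current[0][0] != script_id):
            flush()
        if value == -1:
            current.append((script_id, when))
    flush()
    return runs


# A few rows from the sample data for script 3695.
sample = [
    (3695, 1, "2011-09-23 19:06:32"),
    (3695, 4, "2011-10-21 03:48:14"),
    (3695, -1, "2011-10-23 09:09:23"),
    (3695, -1, "2011-10-23 09:09:24"),
    (3695, -1, "2011-10-23 09:09:26"),
    (3695, 4, "2011-10-27 18:08:25"),
]
for script_id, count, first, last in downvote_runs(sample, min_len=3):
    print(script_id, count, first, last)
```

A check like this only finds uninterrupted sequences; it says nothing about whether a run is abuse or a genuinely bad script.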
> Excerpts from Benjamin R. Haskell's message of Fri Nov 25 04:52:14 +0100 2011:
>> That's what you could do if you're assuming that the ratings are for
>> the benefit of the plugin authors. I'd rather the ratings be useful
>> for users.
> Let me explain it to you: It's plugin authors writing scripts.
> Thus it's plugin authors serving the needs of users.
> If you set up a simple, efficient feedback loop (the way github does)
> you'll get a nice system moving forward, improving on its own.
If you want a feedback loop the way github provides, use github. That's
what a lot of script authors do (including you, from what I recall).
Vim.org has a ratings system.
> It's like telling babies: "You're not useful to me - so I won't
> even try to teach you to stand up and walk."
> You don't see that it's the children who serve your needs when you're old
> (unless you commit suicide). From this example it's easy to understand that
> "little boys/girls" need guidance to be helpful to the community.
>
> That's my understanding about open source.
Following your analogy, plugins are authors' babies, not users' babies.
Users have no connection or obligation to plugin authors. If someone
else's baby can't walk, it's not my job to teach it, it's theirs.
Providing feedback is nice, yes. And the right thing to do. And the
reason open source succeeds when it does. Back to the analogy: society
as a whole is better off when people are willing to help other people's
babies.
>> without leaving a comment, they'll likely not leave a rating.
> Then they don't understand feedback loops. Then they should not vote.
> Because it's *you* who benefits as a user if you read "downvoting because
> script-id 326 gets the job done much better".
Vim.org's ratings are more like amazon.com's star ratings. Comments
aren't required with star ratings, and the reviews that accompany the
ratings are much more useful than the ratings alone. So you're right
about that: rating + feedback is better. But, both serve a purpose.
>>> It's OK to downvote for a reason such as "contains executable". But
>>> it's bad to have people give a negative rating without anyone knowing why!
>>> The website should allow some minimal feedback.
>> Allow? Yes, great idea. Force? No.
> I agree it's debatable. Why do I prefer comments? Because I can
> validate and verify them. I can't judge votes without comments - so I
> surely am in favour of forcing comments.
Yes, I also agree it's debatable, and that comments are preferable.
But, as I mentioned in another thread recently, I'm probably the wrong
person to ask about how to improve vim.org's script-hosting, since I
rarely install anything directly from it.
Bringing this back to a bigger picture: I think it would take a
tremendous amount of effort to improve the scripts site to the level of
something like github. And, personally, I think that effort isn't worth
it, since the option already exists to simply use github. Even the
effort to maintain the scripts site's current level of usefulness is
increasing (cf. recent increase in spam scripts, and the downvote issue
from this thread).
I'm somewhat curious as to whether people share my opinion of vim.org's
script hosting. Whenever it was first deployed, I'm sure it filled a
need. But, today, there are any number of free, easy-to-use places to
host scripts (github, bitbucket, Google Code).
Does the scripts site have much reason to continue?
I guess one clear benefit is that the scripts site is a nice way to have
a "stable" release, vs. a "development" release on github. So, maybe
for that reason alone it makes sense to continue to devote effort to
maintaining/improving the site.
Yes. With these new data post-2011/09/02 it seems clear the problem
wasn't simply the downvote link being crawled. (How do you have access
to that data?)
> If so why didn't it happen more often?
Before changing to POST, the effect would only be from:
1. The bad link gets spidered (so, 1 downvote)
2. Many people search for that plugin, then click the bad link (then,
many downvotes)
I'm not sure, though, why it would still happen after the change to
POST.
> http://googlewebmastercentral.blogspot.com/2008/04/crawling-through-html-forms.html
> .. says it may happen and might have happened in the past.
> A fix would be robots.txt for google if this was the case.
> But then - why should google bot run that many queries if there are 3 options only ?!
> Doesn't make sense to me. Passing invalid data to web apps will yield "internal
> server errors" very often.
One potential issue is that Googlebot started crawling more forms with
method="POST" recently.
http://googlewebmastercentral.blogspot.com/2011/11/get-post-and-safely-surfacing-more-of.html
But, there doesn't seem to be a nice way to prevent it with the current
setup, because the voting page is the same as the script information
page. If instead of using the same page, the form with name="rating"
used action="/scripts/rating.php", then rating.php redirected the user
back to the /scripts/script.php?script_id=NNNN page, /scripts/rating.php
could be added to robots.txt.
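
To illustrate the proposed split, here is a small sketch using Python's standard urllib.robotparser. The robots.txt content is hypothetical, as is /scripts/rating.php itself, which is only the layout suggested above and not something the site currently has.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the proposed layout; /scripts/rating.php is
# the suggested form target, not an existing vim.org page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /scripts/rating.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

rating_url = "http://www.vim.org/scripts/rating.php"
script_url = "http://www.vim.org/scripts/script.php?script_id=3695"

# The rating endpoint is off limits, the script page stays crawlable.
rating_allowed = parser.can_fetch("Googlebot", rating_url)
script_allowed = parser.can_fetch("Googlebot", script_url)
print(rating_allowed, script_allowed)
```

This only helps against crawlers that honor robots.txt; it does nothing against a client that ignores the file.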
> Additionally the following plugins still have 20 down votes in sequence within 6 hours:
>
> [trimming data for scripts 515, 2002, and 3695]
Do you have a way to correlate the voting data with access logs that
contain the User-Agent? It'd be interesting to see if it lined up.
--
Best,
Ben
I'd like to do the opposite: gather VimL projects from github and
display summaries on www.vim.org. The first step is to keep a separate
site and see how it evolves. If it works - then discussion about
integrating it into www.vim.org can start. Much too early for now.
If you're interested tell me and I'll send you a link to our "vision".
Duplicating github? An insane amount of work. No chance.
But we should think about reusing some of their ideas, such as displaying
the README or the doc/*.txt files from plugins. Then authors don't
have to duplicate install instructions.
> I guess one clear benefit is that the scripts site is a nice way to have
> a "stable" release, vs. a "development" release on github. So, maybe
> for that reason alone it makes sense to continue to devote effort to
> maintaining/improving the site.
It's debatable which branch should serve which purpose. My plugins should
have a stable trunk - if they do not, it's a bug and should be fixed
instantly. Experimental ideas are put into branches. But some people
feel differently about it.
> Yes. With these new data post-2011/09/02 it seems clear the problem
> wasn't simply the downvote link being crawled. (How do you have access
> to that data?)
I asked Bram once - because I wanted to improve the website.
He granted me access.
I even rewrote much of the code, finally noticing that PHP is not going
to serve me for the ideas I sketched above. I also spent more
time on mercurial than on coding because I'm a git user and I'm missing
trivial things like remote locations and such. I always feel unsafe
using it. Maybe it's also because I don't know it very well.
Bram's statement was simple: sourceforge's hosting is going to serve it
well in the future. Using a custom solution may not. He's right about that.
So unless I can guarantee funding for at least 10 years or so ..
I should shut up and make more money so that I can do so.
> Do you have a way to correlate the voting data with access logs that
> contain the User-Agent? It'd be interesting to see if it lined up.
No. I tried. Those older logs are all gone. It would be very interesting
to read the HTTP_REFERER values.
Marc Weber
[...]
> > He wrote a message stating it was fixed on Sept. 2.
> I've found the following scripts which got downvoted after Sept. 2.
> I searched for scripts having 10 or more down votes in sequence
> since Sept. 2.
>
> SCRIPT_ID / downvote count / time range
> 3695 / 40 (2011-10-23 from 09:09 till 09:23)
> 2140 / 117 (2011-10-29 01:34:49 - 2011-10-29 02:07:53)
> 1435 / 100 (2011-10-23 08:28:59 - 2011-10-23 10:05:42)
> 670 / 17 (2011-10-23 09:05:14 - 2011-10-23 09:09:59)
> 294 / 191 (2011-10-23 07:43:33 - 2011-10-23 11:43:03)
> 122 / 134 (2011-10-23 08:22:55 - 2011-10-23 09:07:22 )
>
> SCRIPT_ID / NAME (AUTHOR) => voting result
[...]
> If Bram fixed the issue on Sept. 2 then it's very likely that someone wrote a
> script or some other magic is going on I can't imagine - maybe search engines
> do follow forms as well? If so why didn't it happen more often?
The IP addresses are all different (one can only submit a vote from an
IP address once). That's the weird thing.
> Sample data for script id 3695:
>
> TWO RELEASES
> +---------------------+
> | creation_date |
> +---------------------+
> | 2011-08-20 16:57:40 |
> | 2011-08-28 03:20:10 |
> +---------------------+
>
> is not related to votings IMHO:
>
> ID | VOTING | DATE
> 3695 1 2011-09-23 19:06:32
> 3695 4 2011-10-21 03:48:14
> 3695 -1 2011-10-23 09:09:23
> 3695 -1 2011-10-23 09:09:24
> 3695 -1 2011-10-23 09:09:26
> 3695 -1 2011-10-23 09:09:30
> 3695 -1 2011-10-23 09:11:30
[etc.]
The mystery is that these happen in sequence from different IP
addresses. That doesn't point to a crawler (it would hit the URL only
once per day at most). It might point to some botnet script.
It's going to be very difficult to automatically distinguish these
down-ratings from what happens to a really bad script that gets posted
and deserves these down-ratings.
Since it doesn't happen frequently, we can write a simple PHP function
to enter a date range and script number and remove the negative votes in
that date range. Simply removing the negative votes from the table has
the problem that the total count for the script still has to be updated,
which is a hassle to do manually.
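
A rough sketch of such a cleanup function follows, in Python with an in-memory SQLite database rather than the PHP and the real vim.org database mentioned above. The table and column names (votes, scripts, rating_total, rating_count) are assumptions made for illustration; the real schema is not shown in this thread.

```python
import sqlite3


def remove_downvotes(db, script_id, start, end):
    """Delete the -1 votes for script_id with start <= created <= end and
    recompute the script's stored totals. Returns the number removed.

    Timestamps are 'YYYY-MM-DD HH:MM:SS' strings, which compare correctly
    as plain text.
    """
    with db:  # one transaction: both statements commit or neither does
        cur = db.execute(
            "DELETE FROM votes WHERE script_id = ? AND rating = -1 "
            "AND created BETWEEN ? AND ?",
            (script_id, start, end),
        )
        removed = cur.rowcount
        # Recompute the totals from the remaining votes instead of
        # adjusting them by hand.
        db.execute(
            "UPDATE scripts SET "
            "rating_total = (SELECT COALESCE(SUM(rating), 0) FROM votes WHERE script_id = ?), "
            "rating_count = (SELECT COUNT(*) FROM votes WHERE script_id = ?) "
            "WHERE script_id = ?",
            (script_id, script_id, script_id),
        )
    return removed


# Hypothetical schema and a few rows taken from the data for script 3695.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE votes (script_id INTEGER, rating INTEGER, created TEXT)")
db.execute(
    "CREATE TABLE scripts (script_id INTEGER PRIMARY KEY, "
    "rating_total INTEGER, rating_count INTEGER)"
)
db.executemany(
    "INSERT INTO votes VALUES (?, ?, ?)",
    [
        (3695, 4, "2011-10-21 03:48:14"),
        (3695, -1, "2011-10-23 09:09:23"),
        (3695, -1, "2011-10-23 09:09:24"),
        (3695, 4, "2011-10-27 18:08:25"),
    ],
)
db.execute("INSERT INTO scripts VALUES (3695, 6, 4)")

removed = remove_downvotes(db, 3695, "2011-10-23 00:00:00", "2011-10-23 23:59:59")
totals = db.execute(
    "SELECT rating_total, rating_count FROM scripts WHERE script_id = 3695"
).fetchone()
```

Recomputing the totals from the remaining rows in the same transaction avoids the manual bookkeeping described above.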
--
hundred-and-one symptoms of being an internet addict:
191. You rate eating establishments not by the quality of the food,
but by the availability of electrical outlets for your PowerBook.
Well - close to impossible because it's we who must decide what should
count as "valid" and what not - you can't distinguish bots from humans without
spending hours analyzing the behaviour of HTTP clients - user agent
strings can be changed trivially.
> Since it doesn't happen frequently, we can write a simple PHP function
> to enter a date range and script number and remove the negative votes in
> that date range. Simply removing the negative votes from the table has
> the problem that the total count for the script still has to be updated,
> which is a hassle to do manually.
And the sum of total counts. Of course writing a PHP interface for it
would be possible. But how to catch those issues?
You can trivially write a bot which does this kind of voting:
$r = rand(0, 99);              // uniform integer in 0..99
if ($r < 8) vote_neutral();    // 8% of the time
elseif ($r < 12) vote_up();    // 4% of the time
else vote_down();              // the remaining 88%
and it'll be very hard to recognize this kind of abuse.
If there is a bot - bots have time. They can vote slowly submitting
5 votes a day.
Alternatives? Force login? ...
Probably not an option - unless we think fewer but higher-quality votes
are better.
I'd replace that voting system with usage counts if possible. E.g. allow
users to submit the plugins they have installed. This would also reflect
the current state of popularity rather than historical data.
If we asked authors to add a simple comment line such as
" vim_script_voting_id: MY_PLUGIN_NAME
then ~/.vim/plugin/*.vim could be grepped and the result submitted - possibly
requiring logins.
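
The grepping step could look roughly like this. It is a sketch of the idea only: the marker format follows the comment line proposed above, nothing in Vim or on vim.org reads such a marker today, and the function names are made up for illustration. Submitting the result to a server is left out.

```python
import re
from pathlib import Path

# Matches the proposed marker line, e.g.
#   " vim_script_voting_id: MY_PLUGIN_NAME
# This marker is a proposal from this thread, not an existing convention.
MARKER = re.compile(r'^"\s*vim_script_voting_id:\s*(\S+)', re.MULTILINE)


def voting_ids(text):
    """Return all voting ids declared in the text of one Vim script."""
    return MARKER.findall(text)


def installed_plugin_ids(plugin_dir):
    """Return the sorted set of voting ids found in plugin_dir/*.vim."""
    ids = set()
    for path in Path(plugin_dir).expanduser().glob("*.vim"):
        text = path.read_text(encoding="utf-8", errors="replace")
        ids.update(voting_ids(text))
    return sorted(ids)


if __name__ == "__main__":
    print(installed_plugin_ids("~/.vim/plugin"))
```

Plugins installed outside ~/.vim/plugin would need their own directories scanned as well.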
If abuse happens - users could start to "follow" the usage counts of other users.
Then abuse is close to impossible. E.g. if you like the plugins of authors A, B, C
and user X, then only follow their plugin usage counts.
I don't know how well this would work in practice - it's only an idea.
In an older thread we already discussed the option to decay votes
slowly - a new version may no longer suffer from the issues of older versions.
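
One possible reading of "decay slowly" is an exponential decay, sketched below. The half-life of 180 days and the function name are assumptions; the older thread did not settle on a formula as far as this message says.

```python
from datetime import datetime, timedelta


def decayed_score(votes, now, half_life_days=180.0):
    """Sum the votes, halving each vote's weight every half_life_days.

    votes -- iterable of (value, datetime) pairs
    The half-life is an assumed parameter, not something the site defines.
    """
    score = 0.0
    for value, when in votes:
        age_days = (now - when).total_seconds() / 86400.0
        score += value * 0.5 ** (age_days / half_life_days)
    return score


now = datetime(2011, 11, 26)
example = [
    (4, now),                        # fresh vote, full weight
    (-1, now - timedelta(days=180)),  # one half-life old, weight 0.5
]
score = decayed_score(example, now)
```

A burst of old down votes would then fade over time without anyone having to delete it.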
Marc Weber
Maybe a captcha system could be integrated?
If throwing out old votes seems too drastic, maybe the votes
could be displayed according to their age, with totals computed
over the past 6 months, 1 year, or the entire history.
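
A minimal sketch of such per-age totals, assuming the votes are available as (value, datetime) pairs. The function name and the approximation of 6 months as 182 days are assumptions of this sketch.

```python
from datetime import datetime, timedelta


def totals_by_age(votes, now):
    """Return vote totals for the past 6 months, the past year and all time.

    votes -- iterable of (value, datetime) pairs
    Six months is approximated as 182 days and a year as 365 days.
    """
    windows = {"6 months": timedelta(days=182), "1 year": timedelta(days=365)}
    totals = {name: 0 for name in windows}
    totals["all"] = 0
    for value, when in votes:
        totals["all"] += value
        for name, span in windows.items():
            if now - when <= span:
                totals[name] += value
    return totals


now = datetime(2011, 11, 26)
example = [
    (4, datetime(2011, 10, 27, 18, 8, 25)),  # within the past 6 months
    (-1, datetime(2011, 3, 1)),              # within the past year only
    (1, datetime(2009, 1, 1)),               # older than a year
]
totals = totals_by_age(example, now)
```

Showing the three totals side by side would let readers see whether a low score comes from recent votes or from old ones.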
Michael Henry
Maybe it would be enough to display all votes - then people can judge
whether abuse has taken place.
Yes - captchas would be a solution as well.
Marc Weber
> If throwing out old votes seems too drastic, maybe the votes
> could be displayed according to their age, with totals computed
> over the past 6 months, 1 year, or the entire history.
I agree that it doesn't happen that often at all.
About removing old ratings or weighting them less: I'd assign ratings to
script versions or the like. I don't have time right now to work on anything. :(
Marc Weber