Issue 141 in perlwikipedia: Slow response times


perlwi...@googlecode.com

Oct 6, 2010, 8:46:23 PM
to perlwiki...@googlegroups.com
Status: New
Owner: mike.lifeguard
Labels: Type-Defect Priority-Medium

New issue 141 by mike.lifeguard: Slow response times
http://code.google.com/p/perlwikipedia/issues/detail?id=141

On Wed Oct 06 17:49:21 2010, vonankh wrote:
> Not sure what the problem is, but the exact same request takes only
> 1/3 of the time with "Perlwikipedia" as it does with "MediaWiki::Bot".
> I have tried many different things, but cannot find the
> problem...

> This is especially evident in the:

> my @genosets = $bot->get_pages_in_category($cats);

> calls.

perlwi...@googlecode.com

Oct 6, 2010, 8:50:25 PM
to perlwiki...@googlegroups.com
Updates:
Owner: ---
Cc: mike.lifeguard

Comment #1 on issue 141 by mike.lifeguard: Slow response times
http://code.google.com/p/perlwikipedia/issues/detail?id=141

Sorry, are you saying the request is *faster* when using
the "Perlwikipedia" alias? I don't see how that's possible, since it loads
the exact same code, by a different name.

Please post your test code.

perlwi...@googlecode.com

Oct 10, 2010, 4:53:17 PM
to perlwiki...@googlegroups.com

Comment #2 on issue 141 by vonankh: Slow response times
http://code.google.com/p/perlwikipedia/issues/detail?id=141

That's correct! It makes no sense whatsoever. But are you sure the settings
of the code are the same? Could it be that the site I'm trying to get the
pages from is choking/throttling "agents" from "MediaWiki::Bot" but not
from "Perlwikipedia" as that toolset is getting outdated?

perlwi...@googlecode.com

Oct 10, 2010, 5:07:24 PM
to perlwiki...@googlegroups.com

Comment #3 on issue 141 by vonankh: Slow response times
http://code.google.com/p/perlwikipedia/issues/detail?id=141

Addendum: I'm using these:
1) Perlwikipedia-1.5.2.tar.gz
2) MediaWiki-Bot-3.2.4.tar.gz

perlwi...@googlecode.com

Oct 10, 2010, 5:20:28 PM
to perlwiki...@googlegroups.com

Comment #4 on issue 141 by niftymike: Slow response times
http://code.google.com/p/perlwikipedia/issues/detail?id=141

So you *aren't* using the Perlwikipedia alias from MediaWiki::Bot. You're
using a (very) old distribution. Please upload the tarball, and I'll try to
find where the inefficiency lies in the newer code.

perlwi...@googlecode.com

Oct 10, 2010, 5:54:48 PM
to perlwiki...@googlegroups.com

Comment #5 on issue 141 by mike.lifeguard: Slow response times

perlwi...@googlecode.com

Oct 26, 2010, 3:32:51 AM
to perlwiki...@googlegroups.com

Comment #6 on issue 141 by james.lick: Slow response times
http://code.google.com/p/perlwikipedia/issues/detail?id=141

I believe I may be having the same problem. I recently had to upgrade from
MediaWiki-Bot 2.3.0 to 3.2.4 to solve a login issue after the snpedia.com
wiki did a server upgrade. After the upgrade, the get_pages_in_category
calls have become extremely slow, about 25 times slower in the worst case.
It seems to get exponentially worse as the number of pages returned
increases.

Here is my minimum sample code which demonstrates the problem:

use MediaWiki::Bot;
my $bot = MediaWiki::Bot->new();
$bot->set_wiki('www.snpedia.com','/');
my @rsnums = $bot->get_pages_in_category("Category:Is_a_snp",{ max => 0 });

This returns about 13,500 pages currently.

In 2.3.0, this code took about 54 seconds to complete. In 3.2.4 it takes
about 23 minutes to complete.
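
For anyone who wants to reproduce timings like these, here is a minimal sketch of a wall-clock timer. The helper name timed() is hypothetical and not part of MediaWiki::Bot, and the workload below is only a stand-in so the snippet runs without network access.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: runs a code reference once and returns
# (elapsed wall-clock seconds, followed by whatever the code returned).
sub timed {
    my ($code) = @_;
    my $start  = time;
    my @result = $code->();
    return (time - $start, @result);
}

# Stand-in workload that returns 1000 items. To measure the real call,
# replace the body with: $bot->get_pages_in_category(...)
my ($elapsed, @pages) = timed(sub { return (1 .. 1000) });
printf "%d pages in %.3f seconds\n", scalar @pages, $elapsed;
```

Running the same script against each installed version gives numbers that can be compared directly.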

I went back and, starting with 2.3.0, upgraded one version at a time. 2.3.0,
2.3.1, and 3.0.0 all take about 54 seconds. Starting with 3.1.5, it jumps up
to around 23 minutes.

I would like to suggest the priority be increased from medium to high, as a
25x reduction in performance is a pretty serious problem.

perlwi...@googlecode.com

Oct 26, 2010, 4:14:07 AM
to perlwiki...@googlegroups.com

Comment #7 on issue 141 by james.lick: Slow response times
http://code.google.com/p/perlwikipedia/issues/detail?id=141

I found the "problem".

Up to 3.0.0, cmlimit => 500 was used for the call to MediaWiki::API's list
function. In 3.1.5 this was removed.

cmlimit specifies the number of items to request in each query. If not
specified, it uses 10 per request. In other words, 13,500 pages would
require 1,350 requests with the default setting or 27 requests with the
cmlimit set to 500. I suggest either restoring the old code or making this an
option.
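
The request counts above can be checked with a few lines of Perl. This is only an arithmetic sketch: requests_needed() is a hypothetical helper, and the 5000 case is included as an assumed higher per-request limit for comparison.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Number of API round trips needed to fetch $total items when each
# request returns at most $per_request items.
sub requests_needed {
    my ($total, $per_request) = @_;
    die "per_request must be positive\n" if $per_request <= 0;
    return ceil($total / $per_request);
}

# 13,500 is the approximate category size reported in this thread.
printf "%4d items per request: %4d requests\n", $_, requests_needed(13_500, $_)
    for (10, 500, 5000);
```

With 10 items per request this gives 1350 requests, and with 500 it gives 27, which matches the figures above.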

In the meantime, for those who want to restore the old behavior, in the
MediaWiki-Bot source, edit lib/MediaWiki/Bot.pm then find the
get_pages_in_category function and add the line "cmlimit => 500," to $hash,
e.g.:

    my $hash = {
        action  => 'query',
        list    => 'categorymembers',
        cmtitle => $category,
        cmlimit => 500,
    };

After making this change to 3.2.4, my sample code took 57 seconds, on par
with the previous behavior.
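
If the limit were made an option instead of hard-coded, the hash could be built roughly as follows. This is a sketch only: category_query() and its cmlimit option are hypothetical names for illustration, not part of MediaWiki::Bot, and the default of 500 simply mirrors the old behavior.

```perl
use strict;
use warnings;

# Hypothetical helper (not in MediaWiki::Bot): builds the query hash for
# list=categorymembers, using the caller's cmlimit or 500 if none is given.
sub category_query {
    my ($category, $opts) = @_;
    $opts ||= {};
    my $limit = defined $opts->{cmlimit} ? $opts->{cmlimit} : 500;
    return {
        action  => 'query',
        list    => 'categorymembers',
        cmtitle => $category,
        cmlimit => $limit,
    };
}

my $hash = category_query('Category:Is_a_snp');
print join(', ', map { "$_ => $hash->{$_}" } sort keys %$hash), "\n";
```

A caller who wants a different page size would pass it explicitly, e.g. category_query($category, { cmlimit => 100 }).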

perlwi...@googlecode.com

Oct 26, 2010, 10:58:27 AM
to perlwiki...@googlegroups.com
Updates:
Status: Started

Comment #8 on issue 141 by mike.lifeguard: Slow response times
http://code.google.com/p/perlwikipedia/issues/detail?id=141

Could you please test by using automatic login and configuration? If your
account is a bot, highlimits should be set, so you should get 5000 results
per query.

But, yes, this should specifically set cmlimit for normal users.

Actually, 'max' can be used. Per the MediaWiki API documentation on list queries:
All list queries return a limited number of results. This limit is 10 by
default, and can be set as high as 500 for regular users, or 5000 for users
with the apihighlimits right (typically bots and sysops). Some modules
impose stricter limits under certain conditions. If you're not sure which
limit applies to you and just want as many results as possible, set the
limit to max. In that case, a <limits> element will be returned, specifying
the limits used.
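
The limit rules quoted above can be summarized in a small sketch. The function effective_limit() is hypothetical, it assumes that a numeric request above the cap is clamped down to the cap, and it ignores the stricter per-module limits mentioned in the quote.

```perl
use strict;
use warnings;

# Hypothetical model of the quoted rules: 10 by default, capped at 500 for
# regular users and 5000 with apihighlimits; 'max' resolves to the cap.
# Assumption: a numeric request above the cap is clamped to the cap.
sub effective_limit {
    my ($requested, $has_highlimits) = @_;
    my $cap = $has_highlimits ? 5000 : 500;
    return 10   unless defined $requested;
    return $cap if $requested eq 'max';
    return $requested > $cap ? $cap : $requested;
}

for my $case ([undef, 0], ['max', 0], ['max', 1], [5000, 0]) {
    my ($requested, $high) = @$case;
    printf "requested=%s highlimits=%d -> %d per query\n",
        defined $requested ? $requested : '(none)', $high,
        effective_limit($requested, $high);
}
```

Under these assumptions, setting the limit to 'max' gives the best page size for both bot and non-bot accounts without hard-coding either number.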
