would someone tell me what is wrong with this code? would someone show
me how to do a multi-value input in perl (e.g. 10k rows with single
insert statment)?
thanks,
-ricky
my $select_handle =
$dbh->prepare('SELECT agent_id, agent_fullname FROM agents WHERE
agent_shortname = ? AND RDEOUTPUT_rde_id = ?');
my $insert_handle =
$dbh->prepare('INSERT INTO actual VALUES
(DEFAULT,?,?,?,?,?,?,?,?,?)');
while (<$GTRUTH>) {
chomp;
my @gtruth = split;
# get FK: AGENTS_agent_id
$select_handle->execute($gtruth[3],$rde_id);
my ($AGENTS_agent_id, $agent_fullname) = $select_handle->fetchrow;
$insert_handle->execute(
$AGENTS_agent_id, # FK from agents
$rde_id, # FK from rdeoutput
$gtruth_line[0], # ground truth time
0, # target is DEAD
$gtruth_line[7], # move status
$gtruth_line[8] # action
);
}
> my $select_handle =
> $dbh->prepare('SELECT agent_id, agent_fullname FROM agents WHERE
> agent_shortname = ? AND RDEOUTPUT_rde_id = ?');
Does your DB have an index on the agent_shortname and RDEOUTPUT_rde_id columns?
--
Tad McClellan SGML consulting
ta...@augustmail.com Perl programming
Fort Worth, Texas
yes, the table has an index for both columns.
yes, the table has an index for both columns.
How is the CPU usage divided between mysql and perl during that time?
Where/how are you committing?
>
> would someone tell me what is wrong with this code?
There isn't necessarily anything wrong with your code. Some things
are just slow.
> would someone show
> me how to do a multi-value input in perl (e.g. 10k rows with single
> insert statment)?
Do you know that mysql syntax for doing that? What part of that are you
having trouble with? This is probably one place where I would not use
placeholders, but rather use $dbh->quote($data). If you want to do it in
one command, you could use (from right to left) a map to escape and quote
the data, a join to put commas in, a map to wrap those in paranthesis, and
a join to put commas between them. And with a
while (my @chunk = splice @data,0,10_000) {
around all of it to break it into chunks of 10k.
> my $select_handle =
> $dbh->prepare('SELECT agent_id, agent_fullname FROM agents WHERE
> agent_shortname = ? AND RDEOUTPUT_rde_id = ?');
> my $insert_handle =
> $dbh->prepare('INSERT INTO actual VALUES
> (DEFAULT,?,?,?,?,?,?,?,?,?)');
>
> while (<$GTRUTH>) {
> chomp;
> my @gtruth = split;
>
> # get FK: AGENTS_agent_id
> $select_handle->execute($gtruth[3],$rde_id);
> my ($AGENTS_agent_id, $agent_fullname) =
> $select_handle->fetchrow;
>
> $insert_handle->execute(
> $AGENTS_agent_id, # FK from agents
> $rde_id, # FK from
> rdeoutput $gtruth_line[0], #
> ground truth time 0,
> # target is DEAD $gtruth_line[7],
> # move status $gtruth_line[8] #
> action );
How long does it take if you comment out the $insert_handle->execute and
only do the $select_handle part? How about just printing the formatted
data into a text file (rather than inserting it) then using mysql LOAD DATA
command?
> }
Xho
--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
almost no CPU usage and small memory footprint (6MB). the database is
on a computer in another city, but we have T1 lines.
> There isn't necessarily anything wrong with your code. Some things
> are just slow.
i'm starting to figure this out.
> How long does it take if you comment out the $insert_handle->execute and
> only do the $select_handle part? How about just printing the formatted
the select handle roughly halves the run time.
> data into a text file (rather than inserting it) then using mysql LOAD DATA
> command?
this is finally what i resorted to. i can parse the files to CSV text
documents in < 1 min. using LOAD DATA i can fill the table in 40
seconds. i would really like to know why my code is ~400 time slower
than LOAD DATA.
>> How is the CPU usage divided between mysql and perl during that time?
>> Where/how are you committing?
>
> almost no CPU usage and small memory footprint (6MB). the database is
> on a computer in another city, but we have T1 lines.
There's your bottleneck. I'd say 500k rows in 4 hours (34.7/secs) is
acceptable under the circumstances. Even on a T1 it will be vastly slower
than if running on a local network or the same machine.
If you can run the script on the same machine as the db I expect you will
cut the 4hrs to nearer 4 minutes.
--
Brian Wakem
Email: http://homepage.ntlworld.com/b.wakem/myemail.png
You are probably limited by latency, not by throughput. A wide area T1
line may have high throughput, but I would guess the latency is still quite
poor.
Can you just ship the file over to the remote computer and run your
script over there?
> > There isn't necessarily anything wrong with your code. Some things
> > are just slow.
>
> i'm starting to figure this out.
>
> > How long does it take if you comment out the $insert_handle->execute
> > and only do the $select_handle part? How about just printing the
> > formatted
>
> the select handle roughly halves the run time.
So even if you used multi-valued insert statements, it could only halve
the time. Can you just pull the entire selected table into a look-up
hash rather than making a round-trip each time?
> > data into a text file (rather than inserting it) then using mysql LOAD
> > DATA command?
>
> this is finally what i resorted to. i can parse the files to CSV text
> documents in < 1 min. using LOAD DATA i can fill the table in 40
> seconds. i would really like to know why my code is ~400 time slower
> than LOAD DATA.
It's just the way things are. When you do it row by row, you send a small
message to the server and can't move on to the next one until you receive
the mysql-level acknowledgement/results for the last one. With LOAD DATA
you can send big messages and send many at once without waiting for
acknowledgements.
-ricky