How can I measure what slows my code down?


Georgios Petasis

Jul 27, 2018, 8:42:42 AM
Hi all,

I am using tdbc to connect to a mysql server. It contains a table
with 2 million rows (not that many).

I have wrapped the tdbc commands inside an OO class.

If I do:

tdbc::mysql::connection create $db ...
foreach entry [$db allrows -as lists -- "SELECT * FROM `entry` WHERE e1 IS NOT NULL"] {
}

it takes a few seconds (and ~2.4 GB of RAM) to complete.

But if I use:

$db foreach -as dicts entry "SELECT * FROM `entry` WHERE e1 IS NOT NULL" {
}

it takes ages (hours). Ok, I have used two levels of uplevel, as in:

oo::class create A {
    method foreach {args} {
        my variable db
        my connect
        catch { uplevel 1 [list $db foreach {*}$args] } result options
        catch { my disconnect }
        return -options $options $result
    };# foreach
}

oo::class create B {
    superclass A

    method foreach_entry {script {filter {}}} {
        catch { uplevel 1 [list [self object] foreach -as dicts entry \
                "SELECT * FROM `entry` $filter" $script] } result options
        return -options $options $result
    };# foreach_entry

    method generate_forms_unambiguous_dict {} {
        set dict [dict create]
        my foreach_entry {
            dict lappend dict [dict get $entry form] [dict get $entry e1]
        } {WHERE `e2` IS NULL}
        return $dict
    };# generate_forms_unambiguous_dict
}

Is the use of uplevel the problem? Is there an easy way to find out
where the problem is?

George

Gerald Lester

Jul 27, 2018, 5:38:27 PM
Read and apply the time man/help page.
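
For instance, a minimal sketch using [time] (the query and $db are the ones from the post above; each variant is run once and the reported microseconds compared):

puts [time {
    set rows [$db allrows -as lists -- "SELECT * FROM `entry` WHERE e1 IS NOT NULL"]
} 1]
puts [time {
    $db foreach -as dicts entry "SELECT * FROM `entry` WHERE e1 IS NOT NULL" {}
} 1]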

You may also want to look at the execution cost of the query plan in mysql
-- it has been known to behave very non-linearly.
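
For example, a sketch assuming the same tdbc connection (the exact EXPLAIN output columns depend on the MySQL version):

$db foreach -as dicts row {EXPLAIN SELECT * FROM `entry` WHERE e1 IS NOT NULL} {
    puts $row    ;# check the type, key and rows columns: is e1 indexed at all?
}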

--
+----------------------------------------------------------------------+
| Gerald W. Lester, President, KNG Consulting LLC |
| Email: Gerald...@kng-consulting.net |
+----------------------------------------------------------------------+

heinrichmartin

Jul 30, 2018, 3:21:03 AM
On Friday, July 27, 2018 at 2:42:42 PM UTC+2, Georgios Petasis wrote:
> If I do:
>
> tdbc::mysql::connection create $db ...
> foreach entry [$db allrows -as lists -- "SELECT * FROM `entry` WHERE e1 IS NOT NULL"] {
> }
>
> it takes a few seconds (and ~2.4 GB of RAM) to complete.
>
> But if I use:
>
> $db foreach -as dicts entry "SELECT * FROM `entry` WHERE e1 IS NOT NULL" {
> }
>
> it takes ages (hours).

=timing=
The slowdown sounds huge, so there is no need to be smart about timing to quantify it: put puts stderr [clock microseconds] inside the loop (or twice, at the start and end of your code, to decide whether the delay is in your code or in the db package).
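
Something along these lines, as a sketch built from the code in the original post:

set t0 [clock microseconds]
$db foreach -as dicts entry "SELECT * FROM `entry` WHERE e1 IS NOT NULL" {
    puts stderr [clock microseconds]    ;# one timestamp per row; large gaps show where the time goes
}
puts stderr "total: [expr {[clock microseconds] - $t0}] microseconds"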

=possible reasons=
Constructing separate dicts per row in the package?
Your code? Is the above your actual code? Did you use a version with uplevel inside the loop? Logging dicts?
I once had a shimmering problem when writing all state dicts of a state machine to a debug log file. It turned out that the culprit was not reading and writing of sub-dicts of a huge dict, but getting the string representation of that possibly modified huge dict. (Obvious with some knowledge of Tcl's internals.)
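
A contrived sketch of that effect (the variable and file names are made up, not from the original code):

set log [open debug.log w]
set state [dict create]
for {set i 0} {$i < 50000} {incr i} {
    dict set state key$i [dict create value $i]
}
for {set i 0} {$i < 100} {incr i} {
    # every modification drops the dict's cached string representation ...
    dict set state key$i value [expr {-$i}]
    # ... and logging the whole dict rebuilds that string, costing O(dict size) per line
    puts $log $state
}
close $log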

=btw, return sanity=
You might want to add [dict incr options -level] before passing it on.
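
Applied to the foreach wrapper from the original post, that would look roughly like this (just a sketch):

method foreach {args} {
    my variable db
    my connect
    catch { uplevel 1 [list $db foreach {*}$args] } result options
    catch { my disconnect }
    dict incr options -level    ;# re-raise the caught result one level further up, past this wrapper
    return -options $options $result
};# foreach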