tombert <tomber...@live.at> wrote:
> Hi all,
>
> I was bit shocked that base64 encoding is so much faster than my
> tries of simple xor of two bytes.
Why would you be shocked? The base64 command is implemented in C; your
xor procs below are in Tcl. C will in general be faster than Tcl,
especially for byte-at-a-time work like the procs below.
> I needed a simple scrambling algorithm making text not readable for
> humans. Any ideas of a solution that run faster than the TCL native
> base64 encoding?
How unreadable do you want it? Just "not recognizable as human text", or
"a determined adversary cannot work out the original input", or
somewhere in between?
Someone else in this thread mentioned string map. With an appropriate
mapping list, it might be in the same speed range as the base64 command.
> string length [set t [string repeat "abc" 1048576]]
> ...
> proc strxor1 {str} {
>     set result {}
>     foreach char [split $str ""] {
>         append result [binary format c [expr {[scan $char %c] ^ 42}]]
>     }
>     return $result
> }
>
> proc strxor2 {str} {
>     set result {}
>     binary scan $str c* chars
>     foreach char $chars {
>         append result [binary format c [expr {$char ^ 42}] c]
>     }
>     return $result
> }
In both of those above you end up with a list of 3,145,728 elements, one
per byte of the input ("abc" repeated 1,048,576 times), and each Tcl_Obj
backing an element has to be allocated the first time it gets created.
Then, in Tcl, you iterate over that list performing a single xor on the
one byte stored in each obj. So there are two pointer indirections (one
to go from the list to the Tcl_Obj, one to go from the Tcl_Obj to the
single byte) plus the xor. The append is likely reasonably quick compared
to the rest of the loop, but it also incurs two pointer indirections
(get to Obj, get to string) plus Tcl overhead, just to append a single
byte. Then, when the foreach exits, all of those Tcl_Obj's have to be
deallocated (this will be faster than the alloc, as Tcl will just put
them on an internal free list rather than calling the C free() function
on each).
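To make the indirection cost concrete, here is a rough C model of the two access patterns. The BoxedByte struct and all function names are invented for illustration; this is not Tcl's real Tcl_Obj layout, only a sketch of "list of pointers to individually allocated objects" versus "one flat buffer".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a boxed value: the byte lives behind its own
 * pointer, the way a value lives behind a Tcl_Obj. Not Tcl's real layout. */
typedef struct BoxedByte {
    unsigned char *bytes;   /* second indirection: object -> payload */
} BoxedByte;

/* Allocate one box holding a single byte; returns NULL on failure. */
BoxedByte *box_new(unsigned char value)
{
    BoxedByte *box = malloc(sizeof *box);
    if (box == NULL) {
        return NULL;
    }
    box->bytes = malloc(1);
    if (box->bytes == NULL) {
        free(box);
        return NULL;
    }
    box->bytes[0] = value;
    return box;
}

/* Release a box and its payload; accepts NULL. */
void box_free(BoxedByte *box)
{
    if (box != NULL) {
        free(box->bytes);
        free(box);
    }
}

/* Boxed pattern: list[i] -> box -> byte, two pointer hops per element. */
void xor_boxed(BoxedByte **list, size_t n, unsigned char key,
               unsigned char *out)
{
    assert(list != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = (unsigned char)(list[i]->bytes[0] ^ key);
    }
}

/* Flat pattern: one indexed load per element, no per-element allocation. */
void xor_flat(const unsigned char *src, size_t n, unsigned char key,
              unsigned char *out)
{
    assert(src != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = (unsigned char)(src[i] ^ key);
    }
}
```

Both functions produce the same bytes; the difference is only in how much allocation and pointer chasing happens per element, which is the overhead described above.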
Neither one of these will match a C variant that can do single pointer
lookups and direct overwrite (presuming preallocation of the arrays):
    for (i=0; i<array_length; i++) {
        r[i] = a[i] ^ b[i];
    }
That loop above in C would likely compile to an assembly loop with a
core of only a few instructions: the xor itself, the load and store
around it, an increment for the index, a compare against array_length
(unless the compiler recognizes it can decrement towards zero with the
same result), and a branch to the top of the loop. Tcl will never beat
a core loop that is only a few native CPU instructions.
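As a minimal sketch of that loop in self-contained form, here is a single-key variant that matches the "^ 42" in the quoted procs. The name xor_scramble is made up for this example, and the buffer is assumed to be writable and already allocated by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* XOR every byte of buf with a single-byte key, overwriting buf in place.
 * Applying it twice with the same key restores the original bytes. */
void xor_scramble(unsigned char *buf, size_t len, unsigned char key)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key;
    }
}
```

Because xor with a fixed key is its own inverse, the same function serves for both scrambling and unscrambling. The glue needed to call it from Tcl is left out here.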
As another poster mentioned, if you really do want to bulk xor a large
block of data, you'll be much better off performance-wise creating a
small Tcl C extension to do the xor operation on the data.