ROT13

mss

Dec 28, 2009, 10:48:46 AM

#!/bin/awk -f
# ROT13 in AWK
# a slight modification of the example found at:
# http://www.miranda.org/~jkominek/rot13/awk/

BEGIN {
    from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
    to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"

    for (i = 1; i <= length(from); i++) {
        letter[substr(from, i, 1)] = substr(to, i, 1)
    }
}

{
    for (i = 1; i <= length($0); i++) {
        char = substr($0, i, 1)
        if (match(char, "[a-zA-Z]|[0-9]") != 0) {
            printf("%c", letter[char])
        } else {
            printf("%c", char)
        }
    }
    printf("\n")
}

--
later on,
Mike

http://topcat.hypermart.net/

Janis Papanagnou

Dec 28, 2009, 12:40:52 PM

mss wrote:
> #!/bin/awk -f
> # ROT13 in AWK
> # a slight modification of the example found at:
> # http://www.miranda.org/~jkominek/rot13/awk/
>
> BEGIN {
>
> from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
> to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"
>
> for (i = 1; i <= length(from); i++) {
> letter[substr(from, i, 1)] = substr(to, i, 1)
> }
> }
>
> {
> for (i = 1; i <= length($0); i++) {
> char = substr($0, i, 1)
> if (match(char, "[a-zA-Z]|[0-9]") != 0) {

You can write this as...

if (match(char, /[a-zA-Z0-9]/)) {

or as...

if (match(char, /[[:alnum:]]/)) {

But why not just...

if (char in letter) {

> printf("%c", letter[char])
> } else {
> printf("%c", char)
> }
> }
> printf("\n")
> }

If you're using GNU awk you may make use of FS=""...

BEGIN { FS = ""
    n = split("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890",a)
    split("NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321",b)
    for (i=1; i<=n; i++) t[a[i]] = ""b[i]
}
{
    for (i=1; i<=NF; i++)
        printf "%c",(($i in t)?t[$i]:""$i)
    print ""
}

Or use index() into a string of allowed characters and add some offset
(e.g. 13) modulo length of the string to get the respective new index.
(This is depending on the character class subsets you use, and special
handling for your number range is necessary.)

Janis
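
A minimal sketch of the index()-plus-offset idea, assuming plain ASCII input and any POSIX awk. The wrapper name rot13_index, the helper rot() and the variables upper and lower are made up for illustration. Digits are passed through unchanged here, because the reversed digit mapping used in the script above is not a rotation and would need its own handling.

```sh
# rot13_index: hypothetical shell wrapper around a POSIX awk program.
rot13_index() {
    awk '
    BEGIN {
        upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        lower = "abcdefghijklmnopqrstuvwxyz"
    }
    # Rotate c by 13 places inside alphabet; returns "" if c is not in it.
    function rot(c, alphabet,    pos) {
        pos = index(alphabet, c)            # 1-based position, 0 if absent
        if (pos == 0) return ""
        return substr(alphabet, (pos - 1 + 13) % 26 + 1, 1)
    }
    {
        out = ""
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            r = rot(c, upper)
            if (r == "") r = rot(c, lower)
            out = out (r == "" ? c : r)     # keep non-letters as they are
        }
        print out
    }'
}
```

Usage: printf 'Hello, World 42\n' | rot13_index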

mss

Dec 28, 2009, 1:22:47 PM

Janis Papanagnou wrote:

> if (match(char, /[a-zA-Z0-9]/)) {

Sure enough. [a-zA-Z] | [0-9] is redundant in my 1st post.

> if (match(char, /[[:alnum:]]/)) {

Is this portable? And also, something I don't yet 'grok'...
Why must double brackets be used in [[:those:]] classes?

> if (char in letter) {

I like this best myself too.

> (This is depending on the character class subsets you use, and special
> handling for your number range is necessary.)

Further to that end, another very nifty implementation here:

http://rosettacode.org/wiki/Rot-13#AWK

Janis Papanagnou

Dec 28, 2009, 10:04:19 PM

mss wrote:
> Janis Papanagnou wrote:
>
>> if (match(char, /[a-zA-Z0-9]/)) {
>
> Sure enough. [a-zA-Z] | [0-9] is redundant in my 1st post.

The other change WRT your post was to use /.../ instead of "...".
Not much different in case of the given character set, but using the
former you've less trouble generally; I prefer them where possible.

>
>> if (match(char, /[[:alnum:]]/)) {
>
> Is this portable?

It's POSIX standard, I'm sure. And quite surely not portable WRT very
old awk's. I also think it isn't mentioned in the book of A, K, and W.

WRT the rot13 task in general, and character classes specifically, you
have to consider locales. E.g. in German, how would the umlauts äöü
and the ß be rot13'ed, and you have to consider that those additional
characters would be in the [:alnum:] character class as well.

> And also, something I don't yet 'grok'...
> Why must double brackets be used in [[:those:]] classes?

To be able to differentiate syntactically between characters and classes
of characters. The outer brackets define the character set, and the
inner [:alnum:] defines the predefined set of alphanumeric characters.
You can, for example, add an underscore to the alnum set by either of
those expressions

[[:alnum:]_] [_[:alnum:]]

>
>> if (char in letter) {
>
> I like this best myself too.

Yes, it's the most elegant.

Janis

mss

Dec 29, 2009, 8:38:25 AM

Okay, let's see. Here's the version I'll be using for now.
This iteration encapsulates the functionality in its
own function. Thanks for your help & input, Janis.

function rot13(str, from, to, q, letter, char, buf) {

    # rot13 for awk
    # more info at: http://en.wikipedia.org/wiki/ROT13

    # a slight modification of the example found at:
    # http://www.miranda.org/~jkominek/rot13/awk/

    # authors: Janis Papanagnou and Michael Sanders

    from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
    to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"

    for (q = 1; q <= length(from); q++) {
        letter[substr(from, q, 1)] = substr(to, q, 1)
    }

    for (q = 1; q <= length(str); q++) {
        char = substr(str, q, 1)
        if (char in letter) {
            buf = buf sprintf("%c", letter[char])
        } else {
            buf = buf sprintf("%c", char)
        }
    }

    return buf
}

Janis Papanagnou

Dec 29, 2009, 1:16:49 PM

mss wrote:
> Okay, lets see. Here's the version I'll be using for now.
> This iteration, encapsulates the functionality in its
> own function. Thanks for your help & input Janis.

You're welcome. And thanks for your attribution :-)
Though, I don't feel like being a co-author; just gave some feedback.

>
> function rot13(str, from, to, q, letter, char, buf) {
>
> # rot13 for awk
> # more info at: http://en.wikipedia.org/wiki/ROT13
> # a slight modification of the example found at:
> # http://www.miranda.org/~jkominek/rot13/awk/
> # authors: Janis Papanagnou and Michael Sanders

# author: Michael Sanders (with comments from Janis Papanagnou)

Grant

Dec 29, 2009, 3:09:18 PM

On Tue, 29 Dec 2009 13:38:25 +0000 (UTC), mss <m...@dev.null> wrote:

>Okay, lets see. Here's the version I'll be using for now.
>This iteration, encapsulates the functionality in its
>own function. Thanks for your help & input Janis.
>
>function rot13(str, from, to, q, letter, char, buf) {
>
># rot13 for awk
># more info at: http://en.wikipedia.org/wiki/ROT13
># a slight modification of the example found at:
># http://www.miranda.org/~jkominek/rot13/awk/
># authors: Janis Papanagnou and Michael Sanders
>
>from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
>to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"
>
> for (q = 1; q <= length(from); q++) {
> letter[substr(from, q, 1)] = substr(to, q, 1)
> }

If you call this function more than once in a session, the above should
be in a BEGIN block. You may also need a 'buf = ""' here to clear the old one?

> for (q = 1; q <= length(str); q++) {
> char = substr(str, q, 1)
> if (char in letter) {
> buf = buf sprintf("%c", letter[char])
> } else {
> buf = buf sprintf("%c", char)
> }
> }
>
> return buf
>
>}
>
>--

sig delim --> s/-- /-- / :)

Grant.
--
http://bugsplatter.id.au

mss

Dec 29, 2009, 5:21:18 PM

Grant wrote:

>>from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
>>to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"
>>
>> for (q = 1; q <= length(from); q++) {
>> letter[substr(from, q, 1)] = substr(to, q, 1)
>> }
>
> If you call this function more than once in a session, the above should
> be in a BEGIN block

Yes, however for the sake of brevity...

> you may need a 'buf = ""' here to clear old one?

Repeated use here has no side effects...

My understanding is that 'buf' would only last the life of the function,
since it's locally scoped:

A function definition:

function myfunction (param, local)
{
    print param     # the value passed to the function
    print local     # a locally scoped variable declared in the formal
                    # parameter list (hides global variables with the
                    # same name)
    print global    # a global variable
}

> sig delim --> s/-- /-- / :)

On my end it contains the space 'before leaving the box'... Mutt problem?
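
To illustrate the scoping point above, here is a small sketch, assuming any POSIX awk. The names local_demo and tag are invented for the example. Each call starts with an empty buf because buf is an extra formal parameter, so nothing carries over between calls and the global namespace is left alone.

```sh
# local_demo: hypothetical wrapper showing that extra parameters act as locals.
local_demo() {
    awk '
    # buf is declared as an extra parameter, so it is local to each call.
    function tag(str,    buf) {
        buf = buf "[" str "]"
        return buf
    }
    BEGIN {
        print tag("a")
        print tag("b")      # would print [a][b] if buf persisted
        print ((buf == "") ? "global buf untouched" : "global buf changed")
    }'
}
```

Running local_demo prints [a], then [b], then "global buf untouched".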

mss

Dec 29, 2009, 5:31:44 PM

Janis Papanagnou wrote:

> You're welcome. And thanks for your attribution :-)
> Though, I don't feel like being a co-author; just gave some feedback.

Corrected.

Anton Treuenfels

Dec 29, 2009, 8:53:52 PM

"mss" <m...@dev.null> wrote in message news:hhd0oh$9vr$1...@news.albasani.net...

> for (q = 1; q <= length(str); q++) {
> char = substr(str, q, 1)
> if (char in letter) {
> buf = buf sprintf("%c", letter[char])
> } else {
> buf = buf sprintf("%c", char)
> }
> }

Isn't this bit here doing too much work, since the contents of 'char' and
'letter[]' are already one character strings? Here's one alternative:

for ( q = length(str); q; --q ) {
    char = substr( str, q, 1 )
    if ( char in letter )
        buf = letter[ char ] buf
    else
        buf = char buf
}

Um, I also like to count down to avoid repeated evaluation of 'length(str)',
which always gives the same result.

- Anton Treuenfels

mss

Dec 29, 2009, 9:49:17 PM

Anton Treuenfels wrote:

...

>
> Isn't this bit here doing too much work, since the contents of 'char' and
> 'letter[]' are already one character strings? Here's one alternative:

Yes, it's not yet very efficient at all to be honest, but neither am
I at awk yet...

> for ( q = length(str); q; --q ) {
> char = substr( str, q, 1 )
> if ( char in letter )
> buf = letter[ char ] buf
> else
> buf = char buf
> }

I'll study this more...

> Um, I also like to count down to avoid repeated evaluation of 'length(str)',
> which always gives the same result.

Agreed, & in fact already thought of this too, the function now uses variables
rather than evaluating the length(s) with every iteration (I know 'ouch'), so...

x = length(from)
y = length(str)

Thanks Anton, I'm learning.

mss

Dec 29, 2009, 10:09:57 PM

mss wrote:

> Agreed, & in fact already thought of this too, the function now uses variables
> rather evaluating the length(s) with every iteration (I know 'ouch'), so...
>
> x = length(from)
> y = length(str)
>
> Thanks Anton, I'm learning.

Here's where I'm at currently:

function rot13(str, from, to, x, y, z, letter, char, buf) {

    # rot13 for awk
    # more info at: http://en.wikipedia.org/wiki/ROT13

    # a slight modification of the example found at:
    # http://www.miranda.org/~jkominek/rot13/awk/

    from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
    to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"

    x = length(from)
    y = length(str)

    for (z = 1; z <= x; z++) {
        letter[substr(from, z, 1)] = substr(to, z, 1)
    }

    for (z = 1; z <= y; z++) {
        char = substr(str, z, 1)
        if (char in letter) {
            buf = buf letter[char]
        } else {
            buf = buf char
        }
    }

    return buf
}

W. James

Dec 30, 2009, 1:30:36 AM

mss wrote:

BEGIN {
  FS = OFS = ""
  from = rng("A","Z") rng("a","z") "0987654321"
  to = rng("N","Z") rng("A","M")
  to = to tolower( to ) "1234567890"
  for (i=1; i<=length(to); i++)
    map[ substr( from, i, 1 ) ] = substr( to, i, 1 )
}

{
  for (i=1; i<=NF; i++) $i = rot13( $i )
  print
}

function rng( lo, hi, all )
{ for (i=1; i<128; i++)
  { c = sprintf( "%c", i )
    if ( lo <= c && c <= hi ) all = all c
  }
  return all
}

function rot13( c )
{ return (c in map) ? map[c] : c
}

--

mss

Dec 30, 2009, 7:50:48 AM

W. James wrote:

> BEGIN {
> FS = OFS = ""
> from = rng("A","Z") rng("a","z") "0987654321"
> to = rng("N","Z") rng("A","M")
> to = to tolower( to ) "1234567890"
> for (i=1; i<=length(to); i++)
> map[ substr( from, i, 1 ) ] = substr( to, i, 1 )
> }
>
> {
> for (i=1; i<=NF; i++) $i = rot13( $i )
> print
> }
>
> function rng( lo, hi, all )
> { for (i=1; i<128; i++)
> { c = sprintf( "%c", i )
> if ( lo <= c && c <= hi ) all = all c
> }
> return all
> }
>
> function rot13( c )
> { return (c in map) ? map[c] : c
> }

Interesting take, thanks for sharing. Take a look at the other posts
in this thread to see how it's progressing, for more ideas...

My goals are:

- A single reusable generic function

- Isolated variables (we don't want to clobber the rest of a script)

- A rigor that *reduces complexity*

pk

Dec 30, 2009, 1:35:09 PM

mss wrote:

> Here's where I'm at currently:
>
> function rot13(str, from, to, x, y, z, letter, char, buf) {
>
> # rot13 for awk
> # more info at: http://en.wikipedia.org/wiki/ROT13
> # a slight modification of the example found at:
> # http://www.miranda.org/~jkominek/rot13/awk/
>
> from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
> to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"
> x = length(from)
> y = length(str)
>
> for (z = 1; z <= x; z++) {
> letter[substr(from, z, 1)] = substr(to, z, 1)
> }

You rebuild the letter[] array every time the function is called, which is
probably not too efficient, especially if you call it lots of times. Since
it doesn't change, you could just build it once and put it in a global
variable.

>
> for (z = 1; z <= y; z++) {
> char = substr(str, z, 1)
> if (char in letter) {
> buf = buf letter[char]
> } else {
> buf = buf char
> }
> }

It's just cosmetic syntax, but how about

for (z = 1; z <= y; z++) {
    buf = buf (((char = substr(str, z, 1)) in letter) ? letter[char] : char)
}

or even (to save a variable!)

while (y) {
    buf = (((char = substr(str, y--, 1)) in letter)?letter[char]:char) buf
}

But I agree that in this case there is not a real difference in efficiency
so it might be worth keeping it more readable if that's better for you.
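
The build-it-once suggestion could be sketched as follows, assuming any POSIX awk. The names rot13_cached, rot13_init, rot13_table and rot13_ready are hypothetical, and the lazy guard is just one way to do it; filling the table in a BEGIN block works equally well.

```sh
# rot13_cached: hypothetical wrapper; the lookup table is global and built once.
rot13_cached() {
    awk '
    # Fill the global rot13_table once; rot13_ready guards against rebuilding.
    function rot13_init(    from, to, i, n) {
        from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
        to   = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"
        n = length(from)
        for (i = 1; i <= n; i++)
            rot13_table[substr(from, i, 1)] = substr(to, i, 1)
        rot13_ready = 1
    }
    # Only the per-call scratch variables are locals.
    function rot13(str,    i, n, c, buf) {
        if (!rot13_ready) rot13_init()
        n = length(str)
        for (i = 1; i <= n; i++) {
            c = substr(str, i, 1)
            buf = buf ((c in rot13_table) ? rot13_table[c] : c)
        }
        return buf
    }
    { print rot13($0) }'
}
```

The trade-off is two global names (prefixed here to reduce the chance of clashes) in exchange for not refilling a 62-entry array on every call.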

mss

Dec 30, 2009, 7:36:45 PM

pk wrote:

>
> You rebuild the letter[] array every time the function is called, which is
> probably not too efficient, especially if you call it lots of times. Since
> it doesn't change, you could just build it once and put it in a global
> variable.

Hey pk.

Yes, you're right. (It's only included in the function to illustrate;
otherwise I'll use it in BEGIN{}.)

> for (z = 1; z <= y; z++) {
> buf = buf (((char = substr(str, z, 1)) in letter) ? letter[char] : char)
> }
>
> or even (to save a variable!)
>
> while (y) {
> buf = (((char = substr(str, y--, 1)) in letter)?letter[char]:char) buf
> }
>
> But I agree that in this case there is not a real difference in efficiency
> so it might be worth keeping it more readable if that's better for you.

The 2nd fragment is nifty! You and Anton seem to have a knack for
'unwinding' a loop.

Question... Have only been coding in C (Pelles) about two years now,
and of that, it's nearly all Windows API related. Can you provide a blurb
or two on how '?'..':' works in AWK? I typically use If, or Case
statements...

Anyhow, thanks for the ideas & stay tuned... will work some of these ideas
into the mix, I like the thinking.

Janis Papanagnou

Dec 30, 2009, 8:11:18 PM

mss wrote:
> [...]

> Question... Have only been coding in C (Pelles) about two years now,
> and of that, its nearly all Windows API related. Can you provide a blurb
> or two on how '?'..':' works in AWK? I typically use If, or Case
> statements...

It's a conditional expression; if you've coded in C you might have seen
the identical construct there.

A conditional statement...

if (c) x = 1 ; else x = 2 ;

and an equivalent conditional expression...

x = c ? 1 : 2

Conditional expressions are long-existing constructs; even Algol had them.
They had long been considered deprecated as being less performant than the
equivalent conditional statement. Meanwhile, as compilers have optimizers,
it typically shouldn't make a difference in performance any more.

Janis
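
One awk-specific detail to keep in mind: string concatenation binds more tightly than ?:, so a conditional expression that is being appended to a string needs its own parentheses. A small sketch, assuming any POSIX awk (ternary_demo, good and bad are names invented for the example):

```sh
# ternary_demo: hypothetical wrapper contrasting the two parses.
ternary_demo() {
    awk '
    BEGIN {
        c = 0
        # Parenthesised: the conditional picks the suffix, then it is appended.
        good = "x" (c ? "1" : "2")
        # Unparenthesised: parsed as ("x" c) ? "1" : "2"; the string "x0"
        # is non-empty and therefore true, so the prefix is lost.
        bad = "x" c ? "1" : "2"
        print good
        print bad
    }'
}
```

Running ternary_demo prints x2 and then 1.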

> [...]

mss

Dec 31, 2009, 12:55:08 PM

Janis Papanagnou wrote:

> if (c) x = 1 ; else x = 2 ;

...

> x = c ? 1 : 2

Thanks Janis!

mss

Dec 31, 2009, 1:00:29 PM

Okay, here's my latest with help from the (g)awk community.
Still readable, yet pretty tight:

function rot13(str, from, to, x, y, z, letter, char, buf) {

    # rot13 for gawk

    # more info at: http://en.wikipedia.org/wiki/ROT13

    # a modification of the example found at:
    # http://www.miranda.org/~jkominek/rot13/awk/

    from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
    to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"

    x = length(from)
    y = length(str)

    for (z = 1; z <= x; z++) {
        letter[substr(from, z, 1)] = substr(to, z, 1)
    }

    for (z = 1; z <= y; z++) {
        char = substr(str, z, 1)
        buf = (char in letter) ? buf letter[char] : buf char
    }

    return buf
}

Brian Donnell

Dec 31, 2009, 4:38:07 PM

On Dec 28, 7:48 am, mss <m...@dev.null> wrote:
> #!/bin/awk -f
> # ROT13 in AWK
> # a slight modification of the example found at:

> #http://www.miranda.org/~jkominek/rot13/awk/

>
> BEGIN {
>
> from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
> to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"
>
> for (i = 1; i <= length(from); i++) {
> letter[substr(from, i, 1)] = substr(to, i, 1)
> }
>
> }
>
> {
> for (i = 1; i <= length($0); i++) {
> char = substr($0, i, 1)
> if (match(char, "[a-zA-Z]|[0-9]") != 0) {
> printf("%c", letter[char])
> } else {
> printf("%c", char)
> }
> }
> printf("\n")
>
> }
>
> --
> later on,
> Mike
>
> http://topcat.hypermart.net/

And see the thread started by AaronM, "increment letters", 10/18/2008

mss

Jan 1, 2010, 4:09:30 AM

Brian Donnell wrote:

> And see the thread started by AaronM, "increment letters", 10/18/2008

Will do, appreciate the heads up.

Brian Donnell

Jan 2, 2010, 3:02:28 PM

Hi, Mike

In my trials, I found that indexing $z in the "from" string and
returning the char at that index in the "to" string was slightly
faster than building and using an associative array to relate the two
strings.

Also, as a New Year's pleasantry, I send this:--

BEGIN { FS = ""
    s = "aNbOcPdQeRfSgThUiVjWkXlYmZnAoBpCqDrEsFtGuHvIwJxKyLzM"
}

{   l = ""
    for (i = 1; i <= NF; i++) {
        n = index(s, $i)
        l = l (n ? (n%2 ? tolower(substr(s, n+1, 1)) : \
                    toupper(substr(s, n-1, 1))) : $i)
    }
    print l
}

Best wishes to all Awkers for the New Year, Brian
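
The two-string lookup described in the first paragraph of this post might look roughly like the sketch below. It assumes any POSIX awk and scans with substr() rather than relying on FS = "", which is not portable. The name rot13_lookup and the variables out, len and n are invented for the example; the timing claim itself is not tested here.

```sh
# rot13_lookup: hypothetical wrapper; index() into from, substr() out of to.
rot13_lookup() {
    awk '
    BEGIN {
        from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
        to   = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"
    }
    {
        out = ""
        len = length($0)
        for (i = 1; i <= len; i++) {
            c = substr($0, i, 1)
            n = index(from, c)              # 0 when c is not in the table
            out = out (n ? substr(to, n, 1) : c)
        }
        print out
    }'
}
```

Usage: printf 'Awk 13\n' | rot13_lookup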

mss

Jan 3, 2010, 11:59:40 AM

Brian Donnell wrote:

> Hi, Mike
>
> In my trials, I found that indexing $z in the "from" string and
> returning the char at that index in the "to" string was slightly
> faster than building and using an associative array to relate the two
> strings.
>
> Also, as a New Year's pleasantry, I send this:--
>
> BEGIN { FS = ""
> s = "aNbOcPdQeRfSgThUiVjWkXlYmZnAoBpCqDrEsFtGuHvIwJxKyLzM"
> }
>
> { l = ""
> for (i = 1; i <= NF; i++) {
> n = index(s, $i)
> l = l (n ? (n%2 ? tolower(substr(s, n+1, 1)) : \
> toupper(substr(s, n-1, 1))) : $i)
> }
> print l
> }
>
> Best wishes to all Awkers for the New Year, Brian

How nifty is that? Brian this is great. That's cool the way you've
inter-woven the string in such a manner. I can see that shaving a
few cycles off the time too...

This is a keeper for my snippet collection. Thank you kind sir.

Loki Harfagr

Jan 3, 2010, 12:15:41 PM

Sun, 03 Jan 2010 16:59:40 +0000, mss did cat :

As a followup, though I'd have posted earlier if I hadn't been
hit by a gastro/celebration virus those last days ;-(
here below are my 2 cents and a half ;-)

- allow me to insist that your said "rot13" is strictly speaking
a "rot13 and a rev symmetry on nums".
- as the chosen "cipher" is bijective there's no need to list
the full set twice (as Brian also noticed ;-)
- as you insist on using a unique function I put my example that way
but in case of a use on an input with many records you'd really
better think about a _pre_function generating "global span" arrays.
- the parsing of the input 'str' can be made as you like it with
a 'substr' but I here chose to use the 'split' way ;-)
----------
#!/bin/awk -f
# ROT13 and crossrev5 in awk
# a slight modification on a slight modification of
# http://www.miranda.org/~jkominek/rot13/awk/
###
function rot13xrev5(str, ALF, res, h, i, j, lin, lDNA, rDNA) {
    ALF="ABCDEFGHIJKLMnopqrstuvwxyz09876NOPQRSTUVWXYZabcdefghijklm12345"
    h=(split(ALF,alf,""))/2
    for(j=h+(i=1); i<=h; j=h+ ++i){
        lDNA[alf[i]]=alf[j]
        rDNA[alf[j]]=alf[i]
    }
    j=split(str,lin,res="")
    for(i=1; i<=j; i++)
        res=res (index(ALF,lin[i])?(lDNA[lin[i]] rDNA[lin[i]]):lin[i])
    return res
}
{ print rot13xrev5($0) }
----------
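
The mapping used throughout the thread is in fact its own inverse (letters rotate by 13, digits pair up as 0/1, 9/2, 8/3, 7/4, 6/5), which is what lets each pair be listed only once. A quick way to check that property, sketched with the thread's original from/to strings and a hypothetical helper name rot13_roundtrip, assuming any POSIX awk:

```sh
# rot13_roundtrip: hypothetical helper; prints "same" when encoding a line
# twice gives back the input, "differs" otherwise.
rot13_roundtrip() {
    awk '
    function rot13(str,    from, to, i, n, k, c, buf) {
        from = "NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm0987654321"
        to   = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"
        n = length(str)
        for (i = 1; i <= n; i++) {
            c = substr(str, i, 1)
            k = index(from, c)
            buf = buf (k ? substr(to, k, 1) : c)
        }
        return buf
    }
    { print ((rot13(rot13($0)) == $0) ? "same" : "differs") }'
}
```

Usage: printf 'Hello, World 0123456789\n' | rot13_roundtrip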

mss

Jan 3, 2010, 12:32:21 PM

Loki Harfagr wrote:

> better think about a _pre_function generating "global span" arrays.

Yes! *It's only in the function for the example*. Otherwise, with
repeated use, the lookup table (that long string) should be in BEGIN{}.

And thanks, I'll study your example Loki!

Brian Donnell

Jan 3, 2010, 12:44:10 PM

Hi, Mike--

The single-string code I posted lightheartedly seems to run slower
than a script using two strings mapped to each other as in your code.
Haven't tried Loki's; I was influenced by his responses to the AaronM
10/18/2008 posting mentioned earlier.

Loki, I hope you're recovered from the holidays. Brian

Loki Harfagr

Jan 3, 2010, 2:29:13 PM

Sun, 03 Jan 2010 09:44:10 -0800, Brian Donnell did cat :

Hah! well, of course it goes the same path (hash and index) hence runs
slower than 'direct' (string-pointer + displacement), I post below
the probably "best" effort version, on a one million lines sample
input it runs in 'one TimeUnit' while mss code runs in 151p100 TU
and my sample first coded with all the arrays was 555p100 :D)

> I was influenced by his responses to the AaronM 10/18/2008
> posting mentioned earlier.

Er?-) You do mean the "easy to obfuscate" idea, don't you ?-)

>
> Loki, I hope you're recovered from the holidays. Brian

Thanks Brian, (I hope too ,D) time will tell and if anyone comes with a
highly faster version than below I'll know I'm not yet
back to sourcery ,-)

-------------------

#!/bin/awk -f
# ROT13 and crossrev5 in awk
# a slight modification on a slight modification of
# http://www.miranda.org/~jkominek/rot13/awk/
###

function _pre_rot13xrev5(h, i, j) {
    ALF="ABCDEFGHIJKLMnopqrstuvwxyz09876NOPQRSTUVWXYZabcdefghijklm12345"
    h=(split(ALF,alf,""))/2
    for(j=h+(i=1); i<=h; j=h+ ++i){
        DNA[i]=alf[j]
        DNA[j]=alf[i]
    }
}
function rot13xrev5(str,ALF,res, h, i, c) {
    i=1+length(str)
    while(--i){
        c=substr(str,i,1)
        h=index(ALF,c)
        res=(h?DNA[h]:c) res
    }
    return res
}
BEGIN{ _pre_rot13xrev5() }
{ print rot13xrev5($0,ALF) }
-------------------

Janis Papanagnou

Jan 3, 2010, 3:45:49 PM

Loki Harfagr wrote:
[...]

>
> Hah! well, of course it goes the same path (hash and index) hence runs
> slower than 'direct' (string-pointer + displacement), I post below
> the probably "best" effort version, on a one million lines sample
> input it runs in 'one TimeUnit' while mss code runs in 151p100 TU
> and my sample first coed with all the arrays was 555p100 :D)
>
>> I was influenced by his responses to the AaronM 10/18/2008
>> posting mentioned earlier.
>
> Er?-) You do mean the "easy to obfuscate" idea, don't you ?-)
>
>> Loki, I hope you're recovered from the holidays. Brian
>
> Thanks Brian, (I hope too ,D) time will tell and if anyone comes with a
> highly faster version that below I'll know I'm not yet
> back to sourcery ,-)

If we're seriously starting to inspect performance we should at that
point note that one should always use the right tool for the task.
On Unix'es, e.g., you'd apply the tr(1) command, which not only runs
ten times faster than this awk program below but is also much easier
in the code, actually just half a line of code (40 characters, or so).
;-}

Good recovery, Loki.

Janis
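
The post does not spell out the tr command. One plausible form that reproduces the same letter and digit mapping as the thread's from/to strings is sketched below; the wrapper name rot13_tr is made up, and no timing is claimed for it here.

```sh
# rot13_tr: hypothetical wrapper; set1 maps position by position onto set2.
rot13_tr() {
    tr 'A-Za-z0987654321' 'N-ZA-Mn-za-m1234567890'
}
```

Usage: printf 'Hello 2010\n' | rot13_tr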

Loki Harfagr

Jan 4, 2010, 1:21:51 PM

Sun, 03 Jan 2010 21:45:49 +0100, Janis Papanagnou did cat :

> Loki Harfagr wrote:
> [...]
>>
>> Hah! well, of course it goes the same path (hash and index) hence runs
>> slower than 'direct' (string-pointer + displacement), I post below the
>> probably "best" effort version, on a one million lines sample input it
>> runs in 'one TimeUnit' while mss code runs in 151p100 TU and my sample
>> first coed with all the arrays was 555p100 :D)
>>
>>> I was influenced by his responses to the AaronM 10/18/2008 posting
>>> mentioned earlier.
>>
>> Er?-) You do mean the "easy to obfuscate" idea, don't you ?-)
>>
>>> Loki, I hope you're recovered from the holidays. Brian
>>
>> Thanks Brian, (I hope too ,D) time will tell and if anyone comes with a
>> highly faster version that below I'll know I'm not yet back to sourcery
>> ,-)
>
> If we're seriously starting to inspect performance we should at that
> point note that one should always use the right tool for the task. On
> Unix'es, e.g., you'd apply the tr(1) command, which not only runs ten
> times faster than this awk program below but is also much easier in the
> code, actually just half a line of code (40 characters, or so). ;-}

Oh well, yes indeed :-) but the OP seemed to search paths of
exploration in awk and I just added the "speed optimized" version
as Brian mentioned the perf hiatus between using array hashes and
using 'semi-direct' targeting :-)
I reckon I once tried to build a 'rice' and 'arith' toolbench in gawk just
because I thought it'd help the trainees not to have to struggle with
asm or C (or wotever), well the tools somewhat worked but the perfs were
really beyond the human life span when testing on actual big files :D)

> Good recovery, Loki.

Thank you Janis, after my first day at work it's already much better,
now back in my dent ;D)

>
> Janis
>
>
>> -------------------
>> #!/bin/awk -f
>> # ROT13 and crossrev5 in awk

...