
obfuscated AWK code challenge


RARE Kpop Manifesto

Dec 26, 2021, 9:53:53 PM
without actually running this code, can you figure out which prime number has been encoded within the input of ASCII letters

[ Mcb ]

plus the trick that allowed me to achieve such a compression ratio.

* The code works equally well in gawk, mawk-1, mawk2, and nawk.
It is ENTIRELY self-encapsulated.

Enjoy !

= === === === === === === === === === === === === ==

echo; cmd=' echo Mcb | mawk2 '\''function _________(__,_,___) { (___="bc <<< \"obase="(_*=_^=_+=++_)"; "(__)" ;\"")|getline _; close(___); return _ } BEGIN {_____="%c";for(_-=(___-=(_=(_*=_*=_*=_+=++_)))^!_;-""<=_;_--) {__[sprintf(_____,___-+-_)]=_}; for(_ in __) { __[_] } } {____=+""; for(_=_____=length($(____));_;_--) { ___=__[substr($(____),_,!!_)]___ }; ______=___%(_____=((!-"")(-""))^(_____*=(_^=_=-"")^-!-""));_______=int(___/_____);___="%*.f"; ________=sprintf((___"_")___,--_______,!_,--______+-+_______,!___); gsub(/[^_]/,_^!_,________); sub(/[_]/,+"",________); sub(/..$/,"",_____);_____+=_="";_+=_^=_; printf(" log-base-"(_)" :: %."(_^_^_)"g%c%c %c digits in decimal :: %d%c%c 0x %s%c%c",log(_)^(_/-_)*(log((".")(________))+log(_____)*(___=length(________))), _____, _____,_^_*_____-_^-!!_*_____,___,_____,_____, _________(________), _____,_____); system("openssl prime -checks "(_+=_*=(_^=_)*_)" "(________)) }'\'' '; gprintf '\n%s\n\n' "${cmd}" ; echo; eval "${cmd}"

Janis Papanagnou

Dec 27, 2021, 5:01:32 AM
Subject: Re: obfuscated AWK code challenge

It's arguable whether it's appropriate to call code an "AWK code
challenge" that obviously relies on external commands like 'od',
'openssl', and various shell commands _within_ the awk code.

Janis

Ed Morton

Dec 27, 2021, 7:52:51 AM
On 12/26/2021 8:53 PM, RARE Kpop Manifesto wrote:
> without actually running this code, can you figure out which prime number has been encoded within the input of ASCII letters

Serious question - why would we want to do that? Is this homework?

>
> [ Mcb ]
>
> plus the trick that allowed me to achieve such a compression ratio.
>
> * The code works equally well in gawk, mawk-1, mawk2, and nawk.
> It is ENTIRELY self-encapsulated.

No, it's not, it's spawning subshells to call external tools from awk
and using shell to populate a variable then calling eval to execute it.
You could call this a shell code challenge I suppose but there's still
the question of "why?".

Ed.

Kpop 2GM

Jan 15, 2022, 7:16:19 AM
The openssl part only proves, in real time, that whatever was generated is indeed a 3312-bit prime number. The "bc" part just prints that prime out in hex, that's all; if you want to, you can ignore that part. I could add a bigint2hex function here, but it would bloat the code.

when i posted this one, 3312-bit was the largest i found, with this output :

log-base-2 :: 3312.114313696145

# digits in decimal :: 998

0x 1151C1900FBB915A46082603C8F0F9A89505D3D11D440819AB64EC6F02A03DE9D9ADB5BE503EF7BF92835B5E480BA38B69DF05C51BC341797A7F830A27E5BD987D6F9FF7ADE7617228200D1457DD81512E421655F9AA1252496124EFA42A709113D454C7C605C13CAE151822938F9CF88F182868A5F6A8EA5A007181B734F37EE8287BEEBC65D79C45ED096CDA1212298ABDFC6740B545B2FBA76661A855AE963C14E370031656B010E8EBBEE709727B15B86DD0C2C85D30ECE5AE9485933B64A3F6D41913C83D6E0CD267B315A5BE712927B2940C52498CFE2CDC490FE243643D8D43BF359B2F220EED2CC77B1A00998A6968E13016DA892EC80B3E07A453CBAB356870BB4C87BE80C425B1B1D74C3BC4A0D000B10735C0C29B2885AD0D9929C7CB1C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C7111CF137D09F26D544CC35DBAF70FC67C02B7B5390BE713DA98BB8DB709D78DA1F0F9BFD19D7BA9DC48C0EE771761D5E72AA8E43E79211925708B0770CF93ECDF1C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C7

11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 is prime

that's not a bit-string - that's a base-10 decimal number that happens to only have 1's and 0's. This code actually runs pretty fast - no recursion needed, no loops in the main body (other than setting up a lookup table at beginning)

It's Mcb because, knowing the very specific structure of only 1s and one single zero, all I have to encode is the total length and the position of that zero.

779 is the position, 998 is the length
779998
By ASCII ordinal numbers, 77, 99, 98 map to M, c, and b, which is an even more efficient encoding than trying to encode that number in hex, which is BE6DE. Print that hex out as bytes x0B xE6 xDE and it still needs 3 bytes, but won't be particularly human-readable.
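
For anyone who wants to follow the trick without untangling the underscores, here is a readable sketch of just the decoding step in plain awk. It is not the original program: the function name decode_mcb, the variable names, and the assumption that the last three decimal digits always hold the length are illustrative only, and the position is counted from the left, as the sprintf widths in the posted code suggest. The bc and openssl steps are left out.

```shell
# Hypothetical helper, not the original code: rebuild the decimal string
# from the letters by concatenating their ASCII codes (M c b -> 77 99 98).
decode_mcb() {
    printf '%s\n' "$1" | awk '
    BEGIN {
        # lookup table: printable ASCII character -> code
        for (i = 32; i < 127; i++) ord[sprintf("%c", i)] = i
    }
    {
        digits = ""
        for (i = 1; i <= length($0); i++)
            digits = digits ord[substr($0, i, 1)]   # "779998" for Mcb

        len = digits % 1000          # assumed: last three digits = length
        pos = int(digits / 1000)     # the rest = position of the zero

        out = ""
        for (i = 1; i <= len; i++)
            out = out (i == pos ? "0" : "1")
        print out
    }'
}

# Show only the shape of the result instead of all the digits.
decode_mcb Mcb | awk '{ print length($0), index($0, "0") }'   # prints: 998 779
```

The output is the 998-digit string of 1s with a single 0 at position 779; whether it is prime still has to be checked separately.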

that's before. since the original posting, using the same technique, i could now encode a 13,789-bit prime using just 4 bytes (see below). What's the point ? i'm curious if any algorithm, anywhere, can achieve a better compression ratio than that. RLE is still much larger, as is LZW or LZMA.

u can say mine is a bit cheating, since it's not a generic algorithm - i see it as : decompression algorithms are just a form of lossless reconstruction, so why limit the possibilities, as long as the output is what's intended ? there are algorithms optimized for lossless compression of audio, like FLAC, so perhaps there should also be compression algorithms optimized for the key exchanges in cryptography .

Make the overhead low enough, and we could even move to a paradigm where every other message, or even every message, is a different one-time key.

function __________(__,_,___) {
(___="bc <<< \"obase="(_*=_^=\
_+=++_)"; "(__)" ;\"")|getline _;
close(___); return _ };

function _________(_,__,___,____,
_____,______,_______,________) {
_____="%c";
for(_-=(___-=(_=(_*=_*=_*=\
________=_+=++_)))^!_;\
-""<=_;_--) {__[\
sprintf(_____,___-+-_)]=\
(((______=sprintf(substr(_____,\
_^!___,!-"")(+"")(_______)"d",_)\
)!~/.../)?______\
: substr(______,(!-"")+(!-"")))
};for(_ in __) {
__[_] }
____=+(_______="")
_______=_=-(\
_^=_^=_+=_^=_="")
for(_=_____=length(____=sprintf(\
"%c%c%c%c",\
_______+(4-6)^6-4*6+4^!6,\
_______+46+6,\
_______+(4+6)^(-4+6),_______+46));_;_--) {
___=__[substr(____,_,!!_)]___;
}
if ((______=___%(_____=\
((!-"")(-""))^(_____*=\
(_^=_=-"")^-!-"")))<(_______=\
int(___/_____))) {___=______;
______=_______;_______=___}
___="%*.f";________=sprintf(\
(___"_")___,--_______,!_,\
--______+-+_______,!___)
gsub(/[^_]/,_^!_,________)
sub(/[_]/,+"",________)
sub(/..$/,"",_____)
_____=(_+=_^=_="")^_*_+_;
printf(" log-base-"(_)" :: %."(\
_^_^_)"g\n\n %c digits in "\
"decimal :: %d\n\n 0x %s\n\n",\
(log((".")(substr(________,_/_,\
_____*_*_____)))+log(_____)*(___=\
length(________)))/log(_),_____/-_+\
_^_*_____,___, __________(________))
system("openssl prime "\
" "(________)); return ________ };

log-base-2 :: 13789.47552497089

# digits in decimal :: 4152

0x 2C7E5AA2F0BE2B80FF9069A51FD1F58B439FCA60DEDD9D7F4F12FDCCAECEA620CC211B4A3DDB9E1CC7C6A478ACDD0DA895D8FA594B723A6D8F8D0A9998B2640A2089A54D82480D1218C764285FD9F96D8222623E64E7FD9C5B33359DC5B3C87591CE9BC6CE493E9CD0FA633BEC77432074E6E4FBB04D0BD33C9E4BE92076D7DFC75D6A78BCCE4D0A80949167BA292C7756CD0A7311118394E74D99728A2175149F57E0B3B081530AC1DFB6D05EFB02E82A83EBE03A4B73350EC780788438340462CB782ED9A5F3208DB16392CED5E976BCF1E385D88A494214F20186945A74151DFB107DFC33C3371E314C26906C5D5696F47E75F186D61454EE594A0254DAA1877CAB24B23547D3FEB04E89DAEAD6577EF7251F893493FBD28E3D1045579BD9C9F8133001600F2D9A7B290350CB9A736D8D4F07C7183A6A6A8676204B692F7341FC8EE68D3DC20E00B015FBB946C7999A42DCFC26B6D78ECDCAD06457BE0BD51604C823DB5948AFB7F038ED981DBCDCCBF401DEC2C0F803A69DE914BE1B99FBA80EA6CB41EB2F1D9DCE7DC1D14753F354B775B5E780C7B4114F241237691BF8549EC6D54F4A718CB81C35DE1E6DA861C56B57BA87998218EA10497378C60E2467CC5E20E54E583EF1396EE0BBEA29CD70B57515D8FCC44D681098EE61FE24D4D9CFAC54E2A63ACA9DBBB470291AB988346D7A015CE7C0DC0464FB5272512149266B1BF1E292EB788B74EC716DA132FDD0FA4D1D66823E12748ECC06536A0956BAF2C6C1656A5A6C1E612E30263DECBD1FBBD34383DD34E6A7FF6FA0C19A264CF24E8374F5E55CF9B755BB00969F940D08384E1FB6A81C838E763953A8F8F0734570261AB051903A123BBA8FE4AF37FB36C4E61497FD5ECB84C1A170777591AB46513006859C777F91FDAA1464B7158258E83F27081ABA1F5B66F8377DA9EB96D4124C322FF5C0721CB097D317380EDC375F526B62B5D736F59FCC749B1C145584CE60ED15FE804FC4D533CCFA6DB77B0352B177A2E587D6D35932197969B20DB2327BE2DEF73EC46BDC17358B10ACFB4B01D460930BC27B3EE11EE69692E4554C396231E8103B479335A68155949A9D2021C2D30AD16AB295339FA5D9BA7BBAA713F0AC26D8A5292AFC7F94583F3F0B193CCF6AC450178218EBA7D2396531B1F48A7ADF6915B0BEE0A2AF6C875A13145AABECADC63FCCCDEB097496C1115387690B2DDD4C51B6E22F8E060841AF02A4AFEB62A25B67C872E232F6C70EAD3222E57BB12823C34225A12503549FB9F3E5D8967AD738B4D28FF72EC8FBAD18CF4AB82BD50F35B7A9A9D41CB27506363404D6092670C018F41D951A7F342CF5363459DAEC4E38B4797FACE0DF7356F67CE9694469AE27C68DC13C12EC6ACDB8F003C775B19650B9A
ACAB3C97CE2F8EC78EA7F05F7B3211EEA6A8006316AE93EF4811ED6688F8BA34A7119B16438FD14EC1A21F22868738EA7D8F95E895438F1E7FA199E904A2869C39AEA681B10E7E00555539730E9CD1E4D842C31B1156ECCEE49B3E83F1542D6CC3CBD4E487D48832E7F4C93261F2792192146A429763E9B641A548492A5E0E76CAE75DB5D1B0F919FC79E15FA793027F1F9FC9B8DA1C2FEB40B56AB442135606BE914B620541BB552F2D0A558E1133581E3749C4EBE585D68F31E70AE62965579F671EAD5F5006D25069BE5C46C4B5C5A8B2165E31C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C71C7

11111111111111111111111111111111111111111111101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 is prime

Kpop 2GM

Jan 17, 2022, 5:28:28 AM

@Janis Papanagnou : if u want a pure awk code challenge, tell me what output comes out of this :

mawk _^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_

Janis Papanagnou

Jan 17, 2022, 5:54:28 AM
On 17.01.2022 11:28, Kpop 2GM wrote:
>
> @Janis Papanagnou : if u want a pure awk code challenge, tell me what output comes out of this :
>
> mawk _^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_
>

From a quick glance it seems only half of what is produced by either of

awk 'Doh! Uh - oh, oh no; no! (not again)'

awk 'Doh! Uh - oh, oh no; no! (not again) - Unclear?! then say: "bye"'


Janis

pk

Jan 17, 2022, 6:03:53 AM
Absolutely nothing if you don't supply an input, and even then I doubt it
produces any output at all.

Janis Papanagnou

Jan 17, 2022, 6:14:48 AM
Careful! - You have a variable named _ and it has a value in numeric
context of 0, then you have these cascades of power 0^0 which results
in 1, and some arithmetic in between (- and - -), that effectively
seems to result in 1, meaning an awk condition 'true', and that the
input is therefore just reproduced in the output. (Just an educated
guess, no analysis done.)

Janis

Kpop 2GM

Jan 20, 2022, 4:12:36 PM
it's actually really straightforward - that code simply prints everything except the first line.

here's another pure awk one for u : what does this perform :

mawk -F= 'BEGIN {_+=(_^=__=_+=++_+_)-_/_} $!!_=$!_=$(__+(_<=+$!_)*NF)'

Kpop 2GM

Jan 20, 2022, 4:16:34 PM
On Monday, January 17, 2022 at 6:14:48 AM UTC-5, Janis Papanagnou wrote:
> On 17.01.2022 12:03, pk wrote:
> > On Mon, 17 Jan 2022 02:28:27 -0800 (PST), Kpop 2GM
> >
> >>
> >> @Janis Papanagnou : if u want a pure awk code challenge, tell me what
> >> output comes out of this :
> >>
> >> mawk
> >> _^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_
> >
> > Absolutely nothing if you don't supply an input, and even then I doubt it
> > produces any output at all.
> Careful! - You have a variable named _ and it has a value in numeric
> context of 0, then you have these cascades of power 0^0 which results
> in 1, and some arithmetic in between (- and - -), that effectively
> seems to result in 1, meaning an awk condition 'true', and that the
> input is therefore just reproduced in the output. (Just an educated
> guess, no analysis done.)
>
> Janis

Those cascading 0^0, the one single negation, and the one single pre-decrement were absolutely intentional. That wasn't a typo.

but i also found the absolutely shortest syntax possible to utilize gawk's hex decoder and print out decimals, assuming the input is just rows of 0x….. :

gawk -n '($!_=+$_)~_'
or
gawk -nM '($!_=+$_)~_' (if you need higher than double precision)

Kpop 2GM

Jan 20, 2022, 4:21:11 PM

> but i also found the absolutely shortest syntax possible to utilize gawk's hex decoder and print out decimals, assuming the input is just rows of 0x….. :
>
> gawk -n '($!_=+$_)~_'
> or
> gawk -nM '($!_=+$_)~_' (if you need higher than double precision)

need to amend my statement - the same code also works for octals-to-decimals.

echo 025333523235356512534543125646531264523653261 | gawk -nM '($!_=+$_)~_'

1822980154315230596830091282486207141553

basically anything in the same standardized format accepted by strtonum( ) would work, without having to call that function, or even use the print statement, and without even having to type in any numbers at all in the code.
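
Spelled out, the one-liner can be read roughly as follows. This is a sketch of the idiom only, with a hypothetical wrapper name numify, and it uses a decimal input so that it behaves the same in any awk; the hex and octal decoding itself comes from gawk's -n flag, not from the expression.

```shell
# Hypothetical wrapper: the long-hand form of '($!_=+$_)~_'.
#   _    is never assigned, so it is "" (0 in numeric context)
#   $_   is therefore $0, and $!_ is $1
#   ~_   matches against the empty regex, which always succeeds,
#        so the pattern is true and the default action (print) runs
numify() {
    awk '($1 = +$0) ~ ""'
}

echo '42abc' | numify     # prints: 42

# With gawk, -n (--non-decimal-data) is what lets +$0 accept 0x... input:
#   echo '0x1F' | gawk -n '($1 = +$0) ~ ""'
```

The assignment to $1 rebuilds $0, which is why no explicit print or strtonum( ) call is needed.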

Kpop 2GM

Jan 20, 2022, 4:33:22 PM
the posix flag -P is similar to the -n (nondecimal) flag in the sense that it can interpret hex and octals, but only realistically up to 2^53, since the -P flag doesn't pair well with the bignum flag -M - it wouldn't print out an error message, but the -P flag gets nullified by the -M flag

the easiest way i found to detect whether an invocation of any variant of awk is in gawk -P mode is

("<"<"\x3c")

\x3C is the hex code for byte "<", so this boolean criterion fails for everyone else, since a value cannot be less than itself, except in gawk -P, which ignores the hex notation and compares "<" (\074) against "x" (\170)

Janis Papanagnou

Jan 20, 2022, 6:09:06 PM
On 20.01.2022 22:12, Kpop 2GM wrote:
> On Monday, January 17, 2022 at 5:54:28 AM UTC-5, Janis Papanagnou wrote:
>> On 17.01.2022 11:28, Kpop 2GM wrote:
>>>
>>> @Janis Papanagnou : if u want a pure awk code challenge, tell me what output comes out of this :
>>>
>>> mawk _^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_
>>>
>> From a quick glance it seems only half of what is produced by either of
>>
>> awk 'Doh! Uh - oh, oh no; no! (not again)'
>>
>> awk 'Doh! Uh - oh, oh no; no! (not again) - Unclear?! then say: "bye"'
>>
>>
>> Janis
>
> it's actually really straight forward - that code simply prints everything except first line.

Not in my book, not in my system environment. And certainly not as
designed and intended.

Does it really do that in your environment?
Have you tried other awks than mawk?

And it's much less straightforward than your previously posted code.

Your code had just one type of obscurity: an uninitialized, untypical
variable _ that gets a default value, used in an expression that has
just been copied many, many times. And maybe the decrement operator can
be considered tricky because it requires an lvalue, not a value, but
that's standard in C-based programming languages.

In my code you find various concepts; lots of implicit forth and back
conversions between strings and integers, grouping, arithmetic and
negations, string constants, a range operator, and last but not least
even a ternary operator. All grouped like a sentence.

Spoiler; it should effectively evaluate to awk '1;1' thus, as noted,
duplicating every input line. (I wrote "[your code produces] half of
what is produced by [my code]", and I meant that literally since your
code produces the same as awk '1'.)

Janis

Janis Papanagnou

Jan 20, 2022, 6:23:20 PM
On 20.01.2022 22:16, Kpop 2GM wrote:
> On Monday, January 17, 2022 at 6:14:48 AM UTC-5, Janis Papanagnou wrote:
>> On 17.01.2022 12:03, pk wrote:
>>> On Mon, 17 Jan 2022 02:28:27 -0800 (PST), Kpop 2GM
>>>
>>>>
>>>> @Janis Papanagnou : if u want a pure awk code challenge, tell me what
>>>> output comes out of this :
>>>>
>>>> mawk
>>>> _^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_
>>>
>>> Absolutely nothing if you don't supply an input, and even then I doubt it
>>> produces any output at all.
>> Careful! - You have a variable named _ and it has a value in numeric
>> context of 0, then you have these cascades of power 0^0 which results
>> in 1, and some arithmetic in between (- and - -), that effectively
>> seems to result in 1, meaning an awk condition 'true', and that the
>> input is therefore just reproduced in the output. (Just an educated
>> guess, no analysis done.)
>>
>> Janis
>
> those cascading 0^0, and one single negation and one single pre-decrement
> were absolutely intentional.that wasn't a typo.

I didn't say or mean it was a typo. That's just "some arithmetic",
as also said in my other recent post. You used arithmetic, ^, -, --,
and a variable _ , that's all WRT complexity, the duplication doesn't
really contribute.

BTW, in my other post I forgot to mention one more trick in your code;
one should be aware that the exponentiation has right-associativity
(right grouping expression) and the evaluation of the subexpression
toggles between 0 and 1, so if you reduce the expression you have to
do that pair-wise to decompose that correctly.

_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_

is equivalent to either of

_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_
_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_
_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_
...
_^_^_^_
_^_
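
A small check of that toggling, as a sketch: the wrapper name pow_chain is made up for illustration, and the last line uses ordinary numbers to show the right-to-left grouping on its own.

```shell
# Hypothetical wrapper: exponentiation groups from the right, and with an
# unset variable (numeric value 0) the chain alternates between 1 and 0.
pow_chain() {
    awk 'BEGIN {
        print _^_          # 0^0         -> 1
        print _^_^_        # 0^(0^0)     -> 0
        print _^_^_^_      # 0^(0^(0^0)) -> 1
        print 2^3^2        # 2^(3^2) = 512, not (2^3)^2 = 64
    }'
}

pow_chain
```

An even number of operands gives 1 and an odd number gives 0, which is why removing the operands two at a time leaves the value unchanged.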


>
> but i also found the absolutely shortest syntax possible to utilize
> gawk's hex decoder and print out decimals, assuming the input is just rows of 0x….. :
>
> gawk -n '($!_=+$_)~_'
> or
> gawk -nM '($!_=+$_)~_' (if you need higher than double precision)
>

With my version of GNU awk this code just reproduces the input.

Want to provide test samples?

Janis

Kpop 2GM

Jan 21, 2022, 1:03:12 AM
try this one :

echo '0xCAFEBEEFFEED' | gawk -n '($!_=+$_)~_'
223195473903341

Kpop 2GM

Jan 21, 2022, 1:09:01 AM
nawk is slightly more verbose :

echo '0xCAFEBEEFFEED' | nawk '($!+_=+$+_)~_'
223195473903341

echo '0xCAFEBEEFFEED' | mawk '($!_=+$_)^_' OFMT=%.f
223195473903341

echo $'0x0\n0xFEEDCAFEBEF' | mawk '($!_=+$_)<"_"' OFMT=%.f
0
17518579149807

echo '0xCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEED' | gawk -nM '($!+_=+$+_)~_'

696760147094848118127845202403676761678558245541567696744233608423914272675373998167144529865231202819614166286871620170316927958511530298151998159289949617901

Kpop 2GM

Jan 21, 2022, 1:12:50 AM
echo '0xCAFEBEEFFEED' |gawk -p- -P '($!+_=+$+_)~_'
223195473903341
# gawk profile, created Fri Jan 21 01:12:16 2022

# Rule(s)

1 ($! +_ = +$+_) ~ _ { # 1
1 print
}

Kpop 2GM

Jan 21, 2022, 2:11:42 AM
i've been building this huge library for my own use, and to ensure it's as portable as possible, i test the library against mawk 1.3.4, mawk1.9.9.6, gawk in unicode mode, gawk in byte mode, and nawk (or whatever proper name of that awk is that comes with macos at /usr/bin/awk)

from either pre-compiled binaries at homebrew, or just a straight up "make" using mawk-2's source code.

as a result, the library practically has nothing gawk specific at all, but the library also auto-detects which variant I invoked it with based on built-in behavior of the awk itself that cannot simply be tricked by hardcoding in a constant, or setting a variable somewhere, shell or inside awk :

e.g. this criteria (+"0x1" * 0x1)

every other awk and every other invocation flag of gawk would produce a zero, EXCEPT gawk -n / gawk -n -M

or this one : (+"")^atan2(+"",-log(+!+""))

every other awk and every invocation flag of gawk would produce a zero, EXCEPT nawk

i made this next criteria to quickly identify a few different variants/gawk flags, although not all of them are unique :

'BEGIN {
print \
-log((log(-"")*log(-""))^-log(-""))\
/(-"0xABCD")^-!-"" }'

mawk inf
mawk2 nan
nawk inf
gawk -P -e +inf
gawk -c -e +nan
gawk -e +nan
gawk -M -e -nan
gawk -n -e +inf

just gawk alone, you can get it to print out either positive Infinity, negative NaN, or positive NaN, depending on which flags.

Kpop 2GM

Jan 21, 2022, 2:25:12 AM
to check for mawk-2, it's

("\333\222"~"[^\333\222]")

it's a bug that only shows up in mawk-2. run it in gawk unicode mode or gawk byte mode, it's still a false.

or simply checking using hex decoding, one can split out 4 different invocation flags of gawk :

% gawk -n -e 'BEGIN { print +0xDEAD,+"0xCAFE" }'
57005 51966

% gawk -t -e 'BEGIN { print +0xDEAD,+"0xCAFE" }' (same for just -e)
57005 0

% gawk -P -e 'BEGIN { print +0xDEAD,+"0xCAFE" }'
0 51966

% gawk -c -e 'BEGIN { print +0xDEAD,+"0xCAFE" }'
0 0

Kpop 2GM

Jan 21, 2022, 6:04:32 AM
i found you a great test case : it's a pair of hex that's horizontal mirrors of each other, and both are prime. As you can see here, the same syntax works across mawk-1, nawk, and gawk -n :

mawk '($!+_=$+_=($+_)(":")(+$+_))~_' CONVFMT=%.f <<< $'0x11111BBBBFFF\n0xFFFBBBB11111' | ecp

0x11111BBBBFFF:18765177405439
0xFFFBBBB11111:281456650817809

nawk '($!+_=$+_=($+_)(":")(+$+_))~_' CONVFMT=%.f <<< $'0x11111BBBBFFF\n0xFFFBBBB11111' | ecp

0x11111BBBBFFF:18765177405439
0xFFFBBBB11111:281456650817809

gawk -n '($!+_=$+_=($+_)(":")(+$+_))~_' CONVFMT=%.f <<< $'0x11111BBBBFFF\n0xFFFBBBB11111' | ecp

0x11111BBBBFFF:18765177405439
0xFFFBBBB11111:281456650817809


Kpop 2GM

Jan 21, 2022, 6:28:05 AM
apparently even right-aligning of decimals doesn't require a printf( ) statement :

mawk '($(_^_+!_)=+$+_)^_' CONVFMT=%20.f <<< $'0x11111BBBBFFF\n0xFFFBBBB11111' | gtr ' ' '.'

0x11111BBBBFFF.......18765177405439
0xFFFBBBB11111......281456650817809

(it looks screwy here since default google font isn't fixed width)

or with built-in vertical separation :

mawk '($(_^_+!_)=+$+_)<"~"' CONVFMT=%.f OFS="\f" <<< $'0x11111BBBBFFF\n0xFFFBBBB11111'

0x11111BBBBFFF
18765177405439
0xFFFBBBB11111
281456650817809

Janis Papanagnou

Jan 22, 2022, 4:57:36 AM
Not for me...

$ echo '0xCAFEBEEFFEED' | gawk -n '($!_=+$_)~_'
0xCAFEBEEFFEED


BTW, I skip/skipped your many posts from the last two days; I find it
inconvenient to get fragmentary thoughts spread over many postings.
(Maybe I read them later or maybe not.)

But, for a change, I like your idea of an awk code challenge and will
open a new thread with another one.

Janis

Kpop 2GM

Jan 22, 2022, 1:25:09 PM
this is what my gawk looks like with minor variations each time :

% echo '0xCAFEBEEFFEED' | gawk -n '$1=$0=$0'
0xCAFEBEEFFEED
(as expected)

% echo '0xCAFEBEEFFEED' | gawk -n '$1=$0'
0xCAFEBEEFFEED

% echo '0xCAFEBEEFFEED' | gawk -n '$1=$0=$0'
0xCAFEBEEFFEED

% echo '0xCAFEBEEFFEED' | gawk -n '$1=$0=+$0'
223195473903341

% echo '0xCAFEBEEFFEED' | gawk -P '$1=$0=+$0'
223195473903341

% echo '0xCAFEBEEFFEED' | gawk -P '$1=+$0'
223195473903341

% echo '0xCAFEBEEFFEED' | gawk -P '$1=+$1'
223195473903341

It's quite baffling to me why your gawk acts like that, seeing that gnu.org lists gawk 3.1 as the first version able to interpret hex, which is quite some time ago. I got the exact same syntax working in gawk 5.1.1, mawk 1.3.4, and macOS nawk to go from hex to decimal, so if you're still stuck I don't know what else to suggest as a workaround other than doing it the old-fashioned, verbose way with strtonum( ), e.g.

echo '0xCAFEBEEFFEED' | gawk -e '$!_=$_=strtonum($_)'
223195473903341

<<< '0xCAFEBEEFFEED' gawk -e '($!_=strtonum($_))^_'
223195473903341
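
If neither the flags nor strtonum( ) are available, a digit-by-digit conversion works in any awk. The following is only a sketch under stated assumptions: the names hex2dec and hexval are made up here, invalid digits simply yield -1, and the result is exact only up to 2^53 because it is held in a double.

```shell
# Hypothetical portable fallback: hex string -> decimal, no gawk extensions.
hex2dec() {
    awk '
    function hexval(s,    i, n, d) {
        s = tolower(s)
        sub(/^0x/, "", s)                      # optional 0x prefix
        n = 0
        for (i = 1; i <= length(s); i++) {
            d = index("0123456789abcdef", substr(s, i, 1)) - 1
            if (d < 0) return -1               # not a hex digit
            n = n * 16 + d
        }
        return n
    }
    # %.0f avoids exponent notation for large values in older mawk
    { printf "%.0f\n", hexval($1) }'
}

echo '0xCAFEBEEFFEED' | hex2dec    # prints: 223195473903341
```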

Kpop 2GM

Jan 22, 2022, 1:32:55 PM

> BTW, I skip/skipped your many posts from the last two days; I find it
> inconvenient to get fragmentary thoughts spread over many postings.
> (Maybe I read them later or maybe not.)

it's not that i enjoy fragmentation (maybe it's just a manifestation of my ADHD). this, being a good ole' newsgroup, means I couldn't go edit existing posts. the only other option being i copy-over full text of existing to a new post, plus the amendment(s), then deleting the old post (and repeating that cycle numerous times). I'll do it if that's your preference.

Kpop 2GM

Jan 22, 2022, 3:38:36 PM
@Janis : I wasn't even intentionally obfuscating code for others; I write code directly in that style. This function, for example, performs arbitrary-length big-integer multiplication:

function _x_(_,__,___,____,_____,______,_______,
________,_________,__________,___________) {
if ((_=="")||(__=="")) {
if (__=="") {
return _ };_=__}
_____="^[-]";________=substr("-",!-"",\
sub(_____,"",_)!=sub(_____,"",__))
sub(/[-]/,"+",_____)
sub(_____,"",_)-sub(_____,"",__)
_______="^["(+"")"]+";
sub(_______,"",_)-sub(_______,"",__)
if (_~(_______="^"(!-"")"?$")) {
return (________)(_?__:_)
} else if (__~_______) {
return (________)(__?_:__) }
_______=""; gsub(/./,+"",_____)
_=(_____)_; gsub(/./,".",_____)
sub("("(_____)")+$","_&",_)
sub("[^_]*[_]","",_)
_________=___*=___=length(_____)
___-=match(___,"$")
if(((_____=(__________=length(_))+ \
(___________=length(__)))<_________)\
|| (_________==_____\
&& (_*__)<(_________^___))) {
return ________?-_*__:_*__;
};_________-=--___;___=\
__________;____=___________;
split(genZeros(_____),______,//);
_____-=!!_+!!_;_____-=_________;
___________-=_________;
for(___^=!___;___<__________;___+=_________) {
_______=+substr(_,___,_________++);
for(____=___________;-_________<____;\
____-=_________) {______[\
_____-___-____]+=_______*(((\
!___<____)||FLG_AWK_MAWK_2)\
? substr(__,____,_________)\
: substr(__,___^!___,\
____+_________-___^!___))
};--_________};_______=\
___^=_____=+(___=____=_____="")
_______=length(______)+(_^=_="")
_^=_="";_+=_+=_-+-++_;
while(___<_______) {____=(\
(_____+=______[___++])%_)____;
_____=int(_____/_) }
sub("^"(!_)"*",________,____); return ____ }
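
For readers who want the idea without the underscores, here is a plain schoolbook version in portable awk. It is a sketch, not a transcription of the function above: the posted code appears to work on multi-digit chunks and handles signs, while this one goes digit by digit and assumes non-negative decimal strings. The names bigmul and mul are made up for illustration.

```shell
# Hypothetical sketch: schoolbook multiplication of two non-negative
# decimal strings of arbitrary length.
bigmul() {
    awk -v a="$1" -v b="$2" '
    function mul(x, y,    lx, ly, n, i, j, k, d, carry, out) {
        lx = length(x); ly = length(y); n = lx + ly
        for (k = 1; k <= n; k++) d[k] = 0

        # digit i of x times digit j of y lands in column i + j
        for (i = lx; i >= 1; i--)
            for (j = ly; j >= 1; j--)
                d[i + j] += substr(x, i, 1) * substr(y, j, 1)

        # propagate carries from the units column leftwards
        carry = 0
        for (k = n; k >= 1; k--) {
            d[k] += carry
            carry = int(d[k] / 10)
            d[k] %= 10
        }

        out = ""
        for (k = 1; k <= n; k++) out = out d[k]
        sub(/^0+/, "", out)
        return (out == "" ? "0" : out)
    }
    BEGIN { print mul(a, b) }'
}

bigmul 99999999999999999999 99999999999999999999
# prints: 9999999999999999999800000000000000000001
```

Each column sum stays far below 2^53 for any realistic input length, so the intermediate arithmetic is exact even though awk numbers are doubles.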

then i use this next one to convert arbitrary sized integers to hex :

function int2hex(_______,______________, _____________,____________,___________,
__________,_________,________,______,
_____,____,___,__,_) {
___________=((_____=((__+=++__)\
)^__)^((__^(__*__)-++__)))*_____;
______________=(__^--__+--__)^(++__)^++__;
___________/=(______=(++__)^(__*=__))
__="";
sub(/^[+-]?[0]*/,"",_______)
sub(/[.][[:digit:]]*$/,"",_______)
if (_______=="") {
return "0x0" }
if (length(_______)<((__+=++__)^__^__)) {
if (_______~/^[0-9]$/) {
return ("0x")(+_______) }
#$if (_______<((___=_____*_____)+___))
__=sprintf("%X%.8X",int(\
_______/______),_______%______)
sub(/^0*/,"0x",__); return __;
}
split("",_);_____=__^=__^=__/__+__;
__=(__=".")__;
gsub("",__,__)
sub("("(__)")+$","_&",_______)
gsub(".","[^_]",__)
___=__=gsub(__,"&_",_______)
____=+"";
while(_______) {
________=(____=____*\
______________+_______)%_____;
_[__--]=int(____/_____)
____=________;
_______=substr(_______,\
index(_______,"_")+!+"") }
_______=sprintf("%.6X",____%_____)
_____+=_____+=_____;
__=____="";__________=-(__^__)
while(___) {
if(_[___]==(____=+"")) {
delete _[___--] }
if(!___) {
break }
for(__=___;-""<=__;__--) {
________=(____=____*\
______________+_[__])%_____;
_[__]=int(____/_____)
____=________ }
if (__________<-"") {
__________=+____
} else {_______=( !FLG_AWK_MAWK_1 \
? sprintf("%.13X",____*_____+__________)\
: sprintf("%.5X%.8X",int((__________+=\
____*_____)/______),__________%______))_______;
__________=-!!______;
} }
_______=sprintf("%X%08X",int((__________=\
__________<-""?____+_[___]:__________+\
_____*(____+_[___]))/______),\
__________%______)_______;
sub(/^0*/,"0x",_______); return _______ };
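
A readable counterpart, again only as a sketch: it repeatedly long-divides the decimal string by 16 and collects the remainders, which is simpler and slower than the chunked approach the function above seems to take. The name dec2hex is made up, and the input is assumed to be a non-negative integer string.

```shell
# Hypothetical sketch: arbitrary-length decimal string -> hex string,
# by repeated long division by 16.
dec2hex() {
    awk -v s="$1" '
    function tohex(s,    hex, q, i, r, cur) {
        sub(/^0+/, "", s)
        if (s == "") return "0x0"
        hex = ""
        while (s != "") {
            q = ""; r = 0
            for (i = 1; i <= length(s); i++) {
                cur = r * 10 + substr(s, i, 1)   # at most 159
                q = q int(cur / 16)              # one quotient digit
                r = cur % 16
            }
            hex = substr("0123456789ABCDEF", r + 1, 1) hex
            sub(/^0+/, "", q)                    # drop leading zeros
            s = q
        }
        return "0x" hex
    }
    BEGIN { print tohex(s) }'
}

dec2hex 223195473903341    # prints: 0xCAFEBEEFFEED
```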