Exactly, and that is something you should also improve in your next version.
Just as an example, I hashed the following two strings and printed the
internal state right after the data injection had finished. The
intermediate hash is the value taken before the finalisation function is
applied. As you can see, the intermediate hashes clearly reflect the tiny
change in the input data (only the last two characters are swapped). The
final hashes, however, are fine due to the extensive byte swapping inside
the finalisation function.
intermediate hash = c29232fdece7627817 57 cd7848e0a9cc
intermediate hash = c29231fdece7627817 58 cd7848e0a9cc
Clearly an adversary would follow the evolution of the sboxes and check
the internal state before the finalisation function runs. His task
would then be to craft a 256-byte sequence that resets the sboxes
to their initial state, and then to append whatever bytes he likes in
order to generate two identical hashes for different documents. Well,
that is not done in a minute, but a motivated hacker might write some
kind of program to automate this task.
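To illustrate the reset idea, here is a minimal Python sketch. It does
not model rc4_DavidEather_hash itself: it assumes a toy absorb step in
the style of the RC4 key schedule (j = j + S[i] + input byte, swap S[i]
and S[j], advance i). The names absorb and reset_bytes are made up for
this example. The sketch only restores the permutation; the counters i
and j and any further state a real design carries are not handled.

```python
# Toy model only: an RC4-KSA-style absorb step, NOT the hash under discussion.

def absorb(S, i, j, data):
    """Mix each input byte into the permutation S in place; return new (i, j)."""
    for b in data:
        j = (j + S[i] + b) % 256
        S[i], S[j] = S[j], S[i]
        i = (i + 1) % 256
    return i, j

def reset_bytes(S, i, j):
    """Return 256 input bytes that bring S back to the identity permutation.

    Works like a selection sort: at each step the input byte is chosen so
    that j lands on the position currently holding the value i, and the
    swap then puts that value in its home position.
    """
    S = list(S)                       # work on a copy, leave the caller's S alone
    pos = [0] * 256                   # pos[v] = index at which value v sits
    for idx, v in enumerate(S):
        pos[v] = idx
    out = []
    for _ in range(256):
        target = pos[i]               # where the value i currently is
        out.append((target - j - S[i]) % 256)
        j = target
        vi, vj = S[i], S[j]
        S[i], S[j] = vj, vi           # same swap the absorb step performs
        pos[vi], pos[vj] = j, i
        i = (i + 1) % 256
    return bytes(out)

# Demo: scramble the toy state with a message, then undo the scrambling.
S = list(range(256))
i, j = absorb(S, 0, 0, b"01000000000000000000000003000000000010")
pad = reset_bytes(S, i, j)
absorb(S, i, j, pad)
print(S == list(range(256)))          # prints True
```

In this toy model the attacker controls j completely through the input
byte, which is what makes the reset cheap. A design in which one input
byte influences more of the state at once would not fall to this simple
approach.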
echo -en "01000000000000000000000003000000000010" | ./rc4_DavidEather_hash
05:31:64:97:cb:00:36:6d:a5:de:0e:53:12:cc:18:49:
89:ca:8f:4f:93:d8:1e:65:9f:1d:35:80:0d:e8:5f:ae:
fe:13:a1:f4:48:9e:23:27:28:29:2a:2b:2c:2d:2e:2f:
30:01:32:33:34:1a:06:37:38:39:3a:3b:3c:3d:3e:3f:
40:41:42:43:44:45:46:47:24:0f:4a:4b:4c:4d:4e:21:
50:51:52:0b:54:55:56:57:58:59:5a:5b:5c:5d:5e:16:
60:61:62:63:02:17:66:67:68:69:6a:6b:6c:07:6e:6f:
70:71:72:73:74:75:76:77:78:79:7a:7b:7c:7d:7e:7f:
1b:81:82:83:84:85:86:87:88:10:8a:8b:8c:8d:8e:0c:
90:91:92:14:94:95:96:03:98:99:9a:9b:9c:9d:25:0a:
a0:22:a2:a3:a4:08:a6:a7:a8:a9:aa:ab:ac:ad:1f:af:
b0:b1:b2:b3:b4:b5:b6:b7:b8:b9:ba:bb:bc:bd:be:bf:
c0:c1:c2:c3:c4:c5:c6:c7:c8:c9:11:04:1c:cd:ce:cf:
d0:d1:d2:d3:d4:d5:d6:d7:15:d9:da:db:dc:dd:09:df:
e0:e1:e2:e3:e4:e5:e6:e7:19:e9:ea:eb:ec:ed:ee:ef:
f0:f1:f2:f3:26:f5:f6:f7:f8:f9:fa:fb:fc:fd:20:ff:
intermediate hash = c29232fdece762781757cd7848e0a9cc
final hash = 1717ff3f85fbc4d98f1ecbffac7b6376
echo -en "01000000000000000000000003000000000001" | ./rc4_DavidEather_hash
05:31:64:97:cb:00:36:6d:a5:de:0e:53:12:cc:18:49:
89:ca:8f:4f:93:d8:1e:65:9f:1d:35:80:0d:e8:5f:ae:
fe:13:a1:f4:48:9d:23:27:28:29:2a:2b:2c:2d:2e:2f:
30:01:32:33:34:1a:06:37:38:39:3a:3b:3c:3d:3e:3f:
40:41:42:43:44:45:46:47:24:0f:4a:4b:4c:4d:4e:21:
50:51:52:0b:54:55:56:57:58:59:5a:5b:5c:5d:5e:16:
60:61:62:63:02:17:66:67:68:69:6a:6b:6c:07:6e:6f:
70:71:72:73:74:75:76:77:78:79:7a:7b:7c:7d:7e:7f:
1b:81:82:83:84:85:86:87:88:10:8a:8b:8c:8d:8e:0c:
90:91:92:14:94:95:96:03:98:99:9a:9b:9c:25:9e:0a:
a0:22:a2:a3:a4:08:a6:a7:a8:a9:aa:ab:ac:ad:1f:af:
b0:b1:b2:b3:b4:b5:b6:b7:b8:b9:ba:bb:bc:bd:be:bf:
c0:c1:c2:c3:c4:c5:c6:c7:c8:c9:11:04:1c:cd:ce:cf:
d0:d1:d2:d3:d4:d5:d6:d7:15:d9:da:db:dc:dd:09:df:
e0:e1:e2:e3:e4:e5:e6:e7:19:e9:ea:eb:ec:ed:ee:ef:
f0:f1:f2:f3:26:f5:f6:f7:f8:f9:fa:fb:fc:fd:20:ff:
intermediate hash = c29231fdece762781758cd7848e0a9cc
final hash = 2595bb5e98327337b1ee50ca6640b5a5
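The two state dumps above can be compared mechanically rather than by
eye. The following Python helper is only a convenience sketch (the
function names are my own, not part of either program); it parses the
colon-separated hex format printed above and lists the positions at
which two dumps disagree. The example feeds it the third row of each
dump.

```python
def parse_dump(text):
    """Turn a colon-separated hex dump into a list of byte values."""
    return [int(tok, 16) for tok in text.replace("\n", "").split(":") if tok.strip()]

def differing_positions(a, b):
    """Indices at which two equal-length state dumps disagree."""
    if len(a) != len(b):
        raise ValueError("dumps must have the same length")
    return [k for k, (x, y) in enumerate(zip(a, b)) if x != y]

# Third row of each rc4_DavidEather_hash dump (state bytes 0x20..0x2f).
row_a = parse_dump("fe:13:a1:f4:48:9e:23:27:28:29:2a:2b:2c:2d:2e:2f:")
row_b = parse_dump("fe:13:a1:f4:48:9d:23:27:28:29:2a:2b:2c:2d:2e:2f:")
print(differing_positions(row_a, row_b))   # prints [5]
```

Pasting the full 16 rows of each dump into parse_dump gives the complete
list of differing sbox positions.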
My own design, p8, acts differently, as shown below. The intermediate
hashes look widely different due to the fact that for every output byte
of the hash, the internal state variables and the internal state of the
stream cipher get updated. Still, the two internal states are nearly
identical. But there is a second 256-byte internal state array (key[])
which gets updated with every input byte as well (even though in the
current version it does not have enough influence).
intermediate hash = 5ce27139ddf9a3ff18feed8192eadddf
intermediate hash = 3c624ee80e10758974d7e343d6c6314d
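To put a number on "widely different", one can count the differing bits
between the two intermediate hashes of each design. The Python sketch
below does that for the values quoted in this post; bit_distance is just
a name chosen for the example. For unrelated 128-bit values one would
expect roughly 64 differing bits.

```python
def bit_distance(hex_a, hex_b):
    """Number of bit positions in which two equal-length hex strings differ."""
    a, b = bytes.fromhex(hex_a), bytes.fromhex(hex_b)
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Intermediate hashes as printed by the two programs.
rc4_pair = ("c29232fdece762781757cd7848e0a9cc",
            "c29231fdece762781758cd7848e0a9cc")
p8_pair = ("5ce27139ddf9a3ff18feed8192eadddf",
           "3c624ee80e10758974d7e343d6c6314d")

print(bit_distance(*rc4_pair), "of 128 bits differ (RC4-based hash)")
print(bit_distance(*p8_pair), "of 128 bits differ (p8)")
```

For the RC4-based pair only 6 bits differ, confined to two bytes, while
the p8 pair differs in far more bit positions.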
Even though my first versions of SBox8 and p8 are not ideal (the PIA
was broken by rich here in public), I suppose it is worth your while to
take a look at the design in order to get some inspiration for your
next version.
The important point, in my opinion, is that every input byte should
change a wide range of internal state variables and update the stream
cipher's internal state as much as possible without slowing down the
hashing process too much.
Still, p8 suffers to some extent from the same problem: a specially
crafted 256-byte sequence could reset the internal state. But as the
stream cipher involved is more complex than RC4, and there is the
internal key[] array (internal state not listed here) which needs to be
reset as well, it might raise the bar for an adversary. There is
always room for design improvement, though ...
In general I'm convinced that it's possible to design a
cryptographically secure hash based on an RC4-like stream cipher.
echo -en "01000000000000000000000003000000000010" | ./p8
c6:16:36:fb:12:65:e7:08:ce:d1:2b:bc:a9:f8:f4:11:
ad:7c:10:cd:09:60:0c:33:95:a2:89:1e:6f:43:52:ae:
8b:c7:56:5b:5d:9f:7d:f6:c3:e8:61:da:53:cc:45:be:
37:b0:29:58:1b:94:0d:86:ff:78:f1:6a:e3:5c:d5:4e:
20:40:b9:32:ab:24:9d:00:8f:d6:81:fa:73:ec:e4:de:
57:d0:49:c2:3b:b4:2d:a6:1f:98:9e:8a:03:90:f5:6e:
a7:74:d9:b5:cb:44:bd:79:af:28:a1:1a:93:ed:85:fe:
77:f0:69:e2:6b:d4:4d:87:3f:b8:31:aa:23:9c:15:8e:
07:80:f9:72:eb:64:dd:99:cf:48:c1:3a:b3:2c:a5:4a:
97:f2:c8:02:7b:25:6d:e6:5f:d8:51:ca:3c:ba:35:2e:
27:a0:19:92:0b:84:fd:76:ef:68:e1:5a:d3:4c:c5:3e:
b7:30:66:22:9b:14:8d:06:7f:ac:71:ea:63:dc:55:4f:
47:c0:39:b2:41:a4:1d:96:0f:88:01:7a:f3:6c:e5:5e:
d7:50:c9:42:bb:34:17:26:04:18:91:0a:83:fc:75:ee:
67:e0:59:d2:4b:c4:3d:b6:2f:a8:21:9a:13:8c:05:7e:
f7:70:e9:62:db:54:82:46:bf:38:b1:2a:a3:1c:df:0e:
intermediate hash = 5ce27139ddf9a3ff18feed8192eadddf
final hash = 12b5a0d9b02f4d8a11b7b859c522bcb9
echo -en "01000000000000000000000003000000000001" | ./p8
c6:16:36:fb:12:65:e7:08:ce:d1:2b:bc:a9:f8:f4:11:
ad:7c:10:cd:09:60:0c:33:95:a2:89:1e:6f:43:52:ae:
8b:c7:56:5b:fc:0a:7d:f6:c3:e8:61:da:53:cc:45:be:
37:b0:29:58:1b:94:0d:86:ff:78:f1:6a:e3:5c:d5:4e:
20:40:b9:32:ab:24:9d:00:8f:d6:81:fa:73:ec:e4:de:
57:d0:49:c2:3b:b4:2d:a6:1f:98:9e:8a:03:90:f5:6e:
5d:74:d9:b5:cb:44:bd:79:af:28:a1:1a:93:ed:85:fe:
77:f0:69:e2:6b:d4:4d:87:3f:b8:31:aa:23:9c:15:8e:
07:80:f9:72:eb:64:dd:99:cf:48:c1:3a:b3:2c:a5:4a:
97:f2:c8:02:7b:25:6d:e6:5f:d8:51:ca:3c:ba:35:2e:
27:a0:19:92:0b:84:fd:76:ef:68:e1:5a:d3:4c:c5:3e:
b7:30:66:22:9b:14:8d:06:7f:ac:71:ea:63:dc:55:4f:
47:c0:39:b2:41:a4:1d:96:0f:88:01:7a:f3:6c:e5:5e:
d7:50:c9:42:bb:34:17:26:9f:18:91:04:83:a7:75:ee:
67:e0:59:d2:4b:c4:3d:b6:2f:a8:21:9a:13:8c:05:7e:
f7:70:e9:62:db:54:82:46:bf:38:b1:2a:a3:1c:df:0e:
intermediate hash = 3c624ee80e10758974d7e343d6c6314d
final hash = f99a21808c1d081aa57f1180753d2f75
--
cHNiMUBACG0HAAAAAAAAAAAAAABIZVbDdKVM0w1kM9vxQHw+bkLxsY/Z0czY0uv8/Ks6WULxJVua
zjvpoYvtEwDVhP7RGTCBVlzZ+VBWPHg5rqmKWvtzsuVmMSDxAIS6Db6YhtzT+RStzoG9ForBcG8k
G97Q3Jml/aBun8Kyf+XOBHpl5gNW4YqhiM0=