CRC64 collision on 48bit input string.

cinsk

unread,

Apr 5, 2010, 10:54:13 PM4/5/10

to

Hi,

Sorry if this is out of concern of this newsgroup (I don't know where
I should post).

If the given string is 48-bit data, and take the CRC64, does it
collide?

The problem is that, one of the source codes that I need to verify,
uses CRC64 value as a hash value, where the input data is 48bit mac
address.

I need to make sure if the CRC64 value collide with other or not when
the input data is arbitrary 48-bit string.

Unfortunately, I'm very poor at math and CRC64, so I don't know if it
is possible or not.

Regards,

cinsk

unread,

Apr 5, 2010, 11:04:00 PM4/5/10

to

I didn't mention that we uses ISO 3309 standard.
(generator poly is x^64 + x^4 + x^3 + x + 1)

David Schwartz

unread,

Apr 6, 2010, 12:05:44 AM4/6/10

to

On Apr 5, 7:54 pm, cinsk <cin...@gmail.com> wrote:

> The problem is that, one of the source codes that I need to verify,
> uses CRC64 value as a hash value, where the input data is 48bit mac
> address.
>
> I need to make sure if the CRC64 value collide with other or not when
> the input data is arbitrary 48-bit string.
>
> Unfortunately, I'm very poor at math and CRC64, so I don't know if it
> is possible or not.

If "no collisions" is a requirement, CRC64 is a terrible choice. Why
not just send the 48-bit value itself? That will save you 16 bits,
save you some CPU, and is, of course, guaranteed not to collide.

DS

cinsk

unread,

Apr 6, 2010, 12:34:28 AM4/6/10

to

Yes, I found that CRC64 (ISO) is considered weak for hashing, but the
software is already implemented, so I couldn't do anything for now.

Our software has two parts; part-A for generating unique ID (I know
that it is a bad idea, but we uses CRC64 from the 48bit string as an
unique ID), part-B for processing some job based on that unique ID and
write a log file.

The problem is, the log file from part-B has mysterious entries that
shows two entities has a same ID. The testing team asks me that
whether it is possible to generate duplicated ID (CRC64 from 48bit
arbitrary data) theoretically.

If that is possible, I need to report the tester team that we need to
change the whole process to generate unique ID. It could be very
expensive for our current situation, sadly..

If not, I need to check whether the original 48 bit data source is
unique or not. Unfortunately, we do not have enough history data to
prove this.

That's why I asked the original question; Is it possible that two
CRC-64 ISO values from unique 48-bit data to collide each other?

Thanks.

Thus,

Chris Friesen

unread,

Apr 6, 2010, 1:31:22 AM4/6/10

to cinsk

On 04/05/2010 10:34 PM, cinsk wrote:

> That's why I asked the original question; Is it possible that two
> CRC-64 ISO values from unique 48-bit data to collide each other?

I haven't done the math, but I wouldn't expect collisions given that the
input is shorter than the hash itself.

There have been cases of duplicate hardware addresses, and duplicate
software addresses (in virtual machines, for instance, or software
overriding the hardware address at runtime) is even more likely.
Assuming the MAC addresses are unique is probably a mistake.

Chris

William Ahern

unread,

Apr 6, 2010, 4:11:51 AM4/6/10

to

It's a bad idea to assume MACs are unique. I've personally witnessed a case
where the OEM shipped several bulk orders of network cards all with the same
MAC address. The issue took awhile to surface because, IIRC, the effected
batches were never run in the relevant configuration _together_; only
individually tested or run in production against earlier batches.

Given the profit margins on network cards, I wouldn't be surprised if this
is very common today.

I vageuly recollect that the OEM apologized not about the cloned MAC
addresses, but thay they had shipped them all to the same purchaser.

Nicolas George

unread,

Apr 6, 2010, 4:23:06 AM4/6/10

to

cinsk wrote in message
<71662e6b-fdb6-411e...@11g2000yqr.googlegroups.com>:

> If the given string is 48-bit data, and take the CRC64, does it
> collide?

CRC is, mostly, the remainder of an euclidean division. Therefore, I expect
that if you take CRCn and feed it all 2^m possible m-bits strings, with m >=
n, you will find every 2^n possible outputs exactly 2^(m-n) times. And if
m < n, there should be no collisions.

But I agree with the other people in the thread: using a hash function
bigger than its input, and then relying on the fact that there will be no
collision is a terrible design. There are standards way to derive a 64-bits
unique identifier from a MAC address, for example the one used for IPv6
autoconfiguration.

Ersek, Laszlo

unread,

Apr 6, 2010, 7:01:27 AM4/6/10

to

On Mon, 5 Apr 2010, cinsk wrote:

> Our software has two parts; part-A for generating unique ID (I know that
> it is a bad idea, but we uses CRC64 from the 48bit string as an unique
> ID), part-B for processing some job based on that unique ID and write a
> log file.
>
> The problem is, the log file from part-B has mysterious entries that
> shows two entities has a same ID. The testing team asks me that whether
> it is possible to generate duplicated ID (CRC64 from 48bit arbitrary
> data) theoretically.
>
> If that is possible, I need to report the tester team that we need to
> change the whole process to generate unique ID.

If you'll have to switch to a new uuid-generator:

OSSP uuid:
http://www.ossp.org/pkg/lib/uuid/
http://packages.debian.org/lenny/libossp-uuid-dev

The uuid library in e2fsprogs:
http://e2fsprogs.sourceforge.net/
http://packages.debian.org/lenny/uuid-dev

See the debian package links for concise descriptions.

lacos

Chris Friesen

unread,

Apr 6, 2010, 11:40:04 AM4/6/10

to

On 04/06/2010 02:23 AM, Nicolas George wrote:
> There are standards way to derive a 64-bits
> unique identifier from a MAC address, for example the one used for IPv6
> autoconfiguration.

That only ensures that the mapping from MAC to identifier is unique. It
doesn't ensure that the end identifier is unique (since there are
documented cases of duplicate MAC addresses).

Chris

David Schwartz

unread,

Apr 6, 2010, 7:57:31 PM4/6/10

to

Exactly. IPv6 autoconfiguration specifically does not guarantee that
the address is globally unique. Only that's as unique as the MAC. (If
MACs collide on a LAN, you're hosed anyway. So using IPv6
autoconfiguration on a LAN creates no new problems.)

DS

ian.ca...@thepathcentral.com

unread,

Jul 6, 2017, 6:03:13 AM7/6/17

to

CRC is a bijective hash. 64-bit CRC will guarantee a unique result for x=0 to x=2^64-1. So yes, for a 48-bit input, there will be no collisions.

Paul

unread,

Jul 7, 2017, 4:28:37 AM7/7/17

to

This article contains a warning. And that tells
me that more research is required.

https://en.wikipedia.org/wiki/Cyclic_redundancy_check

CRC-64-ECMA --- 0x42F0E1EBA9EA3693

CRC-64-ISO considered weak 0x000000000000001B
for hashing

Paul

Spiros Bousbouras

unread,

Jul 17, 2017, 11:17:14 AM7/17/17

to

Out of curiosity , did you notice that the post you replied to is from 2010 ?

Anyway , the question sounds like an algorithmic one so comp.programming
is probably "the right" newsgroup ; and it could use some love too.