Algorithm to find nearest color

Guilherme Utrabo

unread,

Jun 26, 2012, 2:11:50 PM6/26/12

to

In my application I have a list of RGB colors. The user then chooses one color and I have to find the nearest color. I know there is more than one method to do this, using hue and saturation, etc. That's not what I need help with. My problem is to find an optimal algorithm to find the nearest color. Today my approach is very straight foward, but lacks performance:

Get the RGB for the user color. Then I sort my list by the sum of the absolute difference of the RGBs.
I do that directly on the database using SQL, but I can do it in the code if I find a better way.
I'll post the SQL code just for reference:
SELECT TOP 1 * FROM COLORS C ORDER BY ABS(C.Red - UserRed) + ABS(C.Green - UserGreen) + ABS(C.Blue - UserBlue) DESC

Do you guys suggest me an algorithm to do that search?
Thanks for your help.

Regards,
Guilherme

Guilherme Utrabo

unread,

Jun 26, 2012, 2:48:29 PM6/26/12

to

Edit.: Actually the ORDER BY clause is ASC.

Daniel Pitts

unread,

Jun 26, 2012, 4:05:17 PM6/26/12

to

This reminds me of the nearest neighbor problem, and it happens to be in
3d. There are probably some algorithms that solve this relatively
efficiently. Good luck googling for Nearest Neighbor.

Nobody

unread,

Jun 26, 2012, 4:22:32 PM6/26/12

to

If you need to do many lookups against a fixed set of colours, the fast
solution is a lookup table, but it requires 2^24 ~= 16 million entries for
a 24-bit colour.

If you can't afford that much memory, then use a coarser lookup table
which only uses the top N bits of each component and which maps any
colour to a list of "candidate" colours.

If you don't perform many lookups or if the list changes frequently, then
it's a tradeoff between optimising the lookup time and optimising the time
to modify the list, which depends upon the relative frequency of the
operations.

Nicolas Bonneel

unread,

Jun 26, 2012, 6:05:18 PM6/26/12

to

if you use the Lab colorspace, it is perceptually euclidean, which means
you can use all space partitionning techniques for querying the nearest
neighbor (kd-tree etc). You could use the ANN library for such nearest
neighbor query.
If you want to stick to a hue/saturation, your space is not euclidean
(basically a cyldinder or cone or similar since you're dealing with
angles), and the nearest neighbor query can be much more complicated.
Alternatively, you can still hack a kd-tree in the RGB space, with less
good results but that could still be ok.

Cheers

--
Nicolas Bonneel
http://people.seas.harvard.edu/~nbonneel/

BGB

unread,

Jun 27, 2012, 6:07:15 PM6/27/12

to

nevermind that nearly anything involving actual code (C or C++ or Java
or whatever) will probably be considerably faster than doing it using
SQL queries.

typically, when I do stuff based on colors, I tend to make the
simplifying assumption that color-distance==euclidean-distance, so, it
is finding the color with the smallest value for
(R1-R0)^2+(G1-G0)^2+(B1-B0)^2.

as for nearest neighbor:
a lot will depend on the number of colors in question (to match against);
for a small number of colors, a linear search is likely to be fairly
effective;
tree-based strategies make more sense for larger numbers of colors, but
may themselves be fairly costly if the number of colors is fairly small
(say, due to having larger constant factors).

an example of a tree-building strategy would be:
if the pool is less than N items, build a leaf holding these items;
find the estimated center-point (say, the average of the colors);
find a good dividing-plane normal-axis (several strategies exist, 1);
split the collection in half, based on which side of the plane the item
is on;
build a node holding the plane and the results of repeating this process
for each sub-collection.

1: a few strategies:
add the squared distances from the origin, for each axis, and pick the
axis with the greatest value;
like the above, but simply treat this value as a vector (normalize and
use as axis);
produce a 3x3 matrix, [[XX,XY,XZ], [YX,YY,YZ], [ZX,ZY,ZZ]], and pick the
row with the greatest length;
...

finding the nearest point with such a tree is basically done by walking
the tree (going down the side which the point is on), where one can note
that the closest point on the opposite side of a plane will be the
distance from the plane, meaning that when walking back up the tree, the
opposite side of the tree can be ignored if the distance to the plane is
>= that of the currently nearest point.

note: this is probably overkill for color-matching tasks.

or such...

Kaba

unread,

Jun 27, 2012, 6:56:10 PM6/27/12

to

Hi,

Each color is a triple of coordinates in some color space (color is
three-dimensional per the three kinds of receptors in the eye), and can
be thought of as a geometric point in space (the rgb most probably
refers to sRGB). Measuring distances between colors is based on human
vision, and so needs to be done carefully. The LAB color space was
created to warp the XYZ color space such that the color difference can
be measured by the euclidean distance. There has been much research into
better models than this, but the simplicity of LAB is appealing, and
perhaps good enough for the non-color-scientist.

Therefore, the first task is to convert from the sRGB color space to the
XYZ color space, and from there to the LAB color space. Perhaps like
this (sRGB <-> XYZ)

http://kaba.hilvi.org/pastel/pastel/gfx/color_srgb.cpp.htm

and this (XYZ <-> LAB)

http://kaba.hilvi.org/pastel/pastel/gfx/color_lab.cpp.htm

After you have mapped your colors into a set of LAB points, the task
that is left is to search for nearest neighbors. For a single nearest
neighbor search the brute-force is the most efficient algorithm. It is
more probable, though, that you need to carry out multiple searches on
the same point-set, and do these fast.

Now comes in a kd-tree. This is a recursive binary partitioning of space
by axis-aligned boxes; take a box, split it into two by an axis-aligned
plane, split them into two by other axis-aligned planes etc. You will
get a tree structure. The color points will be stored in the leaf nodes
of this tree. You can use this data structure to perform efficiently
many geometric queries, such as the nearest neighbors searching. Perhaps
something like this:

http://kaba.hilvi.org/pastel/pastel/geometry/pointkdtree.htm

An important detail is to choose the position of the split-plane wisely.
What it means to choose wisely depends on the geometric task you are
using the kd-tree for. In this case the task is nearest neighbors
searching, and for this wisely means to use the sliding midpoint rule.
Choose the split-axis as the one where the node is widest. Then along
this axis first choose the middle position. If all of the points in the
node lie on the same side of the plane, slide the plane to touch the
nearest of those points. More discussion of splitting rules here:

http://kaba.hilvi.org/pastel/pastel/geometry/splitrule_pointkdtree.htm

The implementation of the sliding midpoint rule is here:

http://kaba.hilvi.org/pastel/pastel/geometry/slidingmidpoint_splitrule_pointkdtree.hpp.htm

If you choose to implement this kd-tree approach, we can discuss later
how to implement the search algorithm.

--
http://kaba.hilvi.org

Guilherme Utrabo

unread,

Jun 28, 2012, 1:09:37 PM6/28/12

to

Thank you guys for all the help.
Following your instructions, I implemented a KdTree search and managed to optimize greatly the performance.

Regards,
Guilherme

Ian Zimmerman

unread,

Jul 25, 2012, 12:31:46 AM7/25/12

to

The following message is a courtesy copy of an article
that has been posted to comp.graphics.algorithms as well.

BGB> nevermind that nearly anything involving actual code (C or C++ or
BGB> Java or whatever) will probably be considerably faster than doing
BGB> it using SQL queries.

BGB> typically, when I do stuff based on colors, I tend to make the
BGB> simplifying assumption that color-distance==euclidean-distance, so,
BGB> it is finding the color with the smallest value for
BGB> (R1-R0)^2+(G1-G0)^2+(B1-B0)^2.

BGB> as for nearest neighbor: a lot will depend on the number of colors
BGB> in question (to match against); for a small number of colors, a
BGB> linear search is likely to be fairly effective; tree-based
BGB> strategies make more sense for larger numbers of colors, but may
BGB> themselves be fairly costly if the number of colors is fairly small
BGB> (say, due to having larger constant factors).

gcolorsel [1] does repeated Quicksort (n ln n each iteration) to show
neighboring colors while user navigates the Gtk color selection widget.
It turns out to be quick enough for the list of colors distributed with
X Window (rgb.txt).

[1] https://github.com/nobrowser/gcolorsel

--
Ian Zimmerman
gpg public key: 1024D/C6FF61AD
fingerprint: 66DC D68F 5C1B 4D71 2EE5 BD03 8A00 786C C6FF 61AD
http://www.gravatar.com/avatar/c66875cda51109f76c6312f4d4743d1e.png
Rule 420: All persons more than eight miles high to leave the court.