
# Color histogram

In computer graphics and photography, a color histogram is a representation of the distribution of colors in an image, derived by counting the number of pixels in each of a given set of color ranges in a typically two-dimensional (2D) or three-dimensional (3D) color space.

A histogram is a standard statistical description of a distribution in terms of occurrence frequencies of different event classes; for color, the event classes are regions in color space.

An image histogram of scalar pixel values is more commonly used in image processing than is a color histogram.

### Overview

Color histograms are flexible constructs that can be built from images in various color spaces, whether RGB, rg chromaticity, or any other color space of any dimension. A histogram of an image is produced by first discretizing the colors in the image into a number of bins and then counting the number of image pixels in each bin. For example, a red–blue chromaticity histogram can be formed by first normalizing color pixel values by dividing RGB values by R+G+B, then quantizing the normalized R and B coordinates into N bins each. With N = 4, this might yield a 2D histogram that looks like this table:

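| | red 0-63 | red 64-127 | red 128-191 | red 192-255 |
|---|---|---|---|---|
| blue 0-63 | 43 | 78 | 18 | 0 |
| blue 64-127 | 45 | 67 | 33 | 2 |
| blue 128-191 | 127 | 58 | 25 | 8 |
| blue 192-255 | 140 | 47 | 47 | 13 |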
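The normalize-then-quantize procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `rb_chromaticity_histogram`, the sample pixels, and the choice to skip pure black pixels (whose chromaticity is undefined) are all assumptions made for the example. The bins here span the normalized range [0, 1] instead of the 0-255 labels used in the table.

```python
def rb_chromaticity_histogram(pixels, n_bins=4):
    """Count pixels in an n_bins x n_bins grid over normalized (r, b).

    `pixels` is an iterable of (R, G, B) integer tuples.
    The result is indexed as hist[blue_bin][red_bin].
    """
    hist = [[0] * n_bins for _ in range(n_bins)]
    for R, G, B in pixels:
        total = R + G + B
        if total == 0:
            continue  # assumption: skip black, its chromaticity is undefined
        r = R / total  # normalized red, in [0, 1]
        b = B / total  # normalized blue, in [0, 1]
        # Map [0, 1] onto bin indices; a value of exactly 1.0 goes in the last bin.
        red_bin = min(int(r * n_bins), n_bins - 1)
        blue_bin = min(int(b * n_bins), n_bins - 1)
        hist[blue_bin][red_bin] += 1
    return hist


# Hypothetical sample data: two reddish pixels, one bluish, one gray, one black.
pixels = [(255, 0, 0), (200, 30, 20), (10, 10, 200), (90, 90, 90), (0, 0, 0)]
hist = rb_chromaticity_histogram(pixels)
for row in hist:
    print(row)
```

Because the counts are accumulated per bin, the same function extends to a 3D histogram by adding a third index, at the cost of N times as many bins.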

Similarly, a histogram can be made three-dimensional, though it is harder to display.

The histogram provides a compact summary of the distribution of data in an image. The color histogram of an image is relatively invariant to translation and to rotation about the viewing axis, and varies only slowly with the angle of view. Because the histogram signatures of two images can be compared to match the color content of one image with the other, the color histogram is particularly well suited to the problem of recognizing an object of unknown position and rotation within a scene. Importantly, converting an RGB image into the illumination-invariant rg chromaticity space allows the histogram to operate well in varying light levels.
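Both invariance properties can be checked on toy data. The sketch below is illustrative only: the helper names (`color_histogram`, `rotate90`, `rg_chromaticity`), the 4-level quantization, and the sample pixel values are assumptions. It shows that rotating an image by 90 degrees leaves its histogram unchanged, and that halving the light level of a pixel leaves its rg chromaticity unchanged.

```python
from collections import Counter


def color_histogram(image, levels=4):
    """Histogram of an image given as rows of (R, G, B) tuples.

    Each 8-bit channel is quantized into `levels` equal-width bins.
    """
    step = 256 // levels
    return Counter(
        (R // step, G // step, B // step) for row in image for (R, G, B) in row
    )


def rotate90(image):
    """Rotate the image clockwise by 90 degrees."""
    return [list(row) for row in zip(*image[::-1])]


def rg_chromaticity(pixel):
    """Return (r, g) with r = R/(R+G+B) and g = G/(R+G+B)."""
    R, G, B = pixel
    total = R + G + B
    return (round(R / total, 6), round(G / total, 6))


# Hypothetical 2x3 image.
image = [
    [(200, 10, 10), (200, 10, 10), (10, 10, 200)],
    [(250, 250, 250), (10, 200, 10), (10, 10, 200)],
]
same_histogram = color_histogram(image) == color_histogram(rotate90(image))
print(same_histogram)

bright = (120, 60, 20)
dim = (60, 30, 10)  # the same surface color at half the light level
print(rg_chromaticity(bright), rg_chromaticity(dim))
```

The rotation check holds for any permutation of pixel positions, which is also why a histogram alone cannot tell two differently arranged images apart.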

The main drawback of histograms for classification is that the representation depends only on the color of the object being studied, ignoring its shape and texture. Color histograms can be identical for two images with different object content that happen to share color information: there is no way to distinguish a red and white cup from a red and white plate. Conversely, without spatial or shape information, similar objects of different colors cannot be recognized as similar based solely on color histogram comparisons. Put another way, histogram-based algorithms have no concept of a generic 'cup', and a model of a red and white cup is of no use when given an otherwise identical blue and white cup. Another problem is that color histograms are highly sensitive to noise such as changes in lighting intensity and quantization errors. The high dimensionality (number of bins) of color histograms is a further issue: some color histogram feature spaces occupy more than one hundred dimensions [8].

Proposed solutions include color histogram intersection, color constant indexing, cumulative color histograms, quadratic distance, and color correlograms [8]. See the external link to Stanford for an in-depth look at the equations.
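As an illustration of the first technique in that list, histogram intersection is commonly described as summing the bin-wise minimum of an image histogram and a model histogram, then dividing by the model's pixel count. The sketch below follows that description. The function name is hypothetical, and the two input histograms simply reuse the first two rows of the table in the Overview as sample data.

```python
def histogram_intersection(image_hist, model_hist):
    """Return the fraction of the model's pixels matched by the image.

    Both arguments are flat sequences of bin counts with the same binning.
    A score of 1.0 means every model bin is fully covered by the image.
    """
    if len(image_hist) != len(model_hist):
        raise ValueError("histograms must use the same number of bins")
    overlap = sum(min(i, m) for i, m in zip(image_hist, model_hist))
    return overlap / sum(model_hist)


# Sample data only: two rows borrowed from the Overview table.
model = [43, 78, 18, 0]
image = [45, 67, 33, 2]
score = histogram_intersection(image, model)
print(round(score, 4))
```

Note that the measure is not symmetric when the two histograms contain different numbers of pixels, since only the model's total appears in the denominator.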

Although histograms have drawbacks for indexing and classification, using color in a real-time system has several relative advantages. One is that color information is faster to compute than other "invariants". It has been shown in some cases that color can be an efficient means of identifying objects of known location and appearance (refer to the external link for the findings of the study) [8].

Further research into the relationship between color histogram data and the physical properties of the objects in an image has shown that histograms can represent not only object color and illumination but also relate to surface roughness and image geometry, and can provide improved estimates of illumination and object color.

### Applications of color histograms

In photography, color histograms in either 2D or 3D spaces are frequently used in digital cameras for estimating the scene illumination, as part of the camera's automatic white balance algorithm. See image histogram for information about image histograms. In remote sensing, color histograms are typical features used for classifying different ground regions from aerial or satellite photographs. In the case of multi-spectral images, the histograms may be four-dimensional or more. In computer vision, color histograms can be used in object recognition and in image retrieval systems and databases. For an example, see the State Hermitage Museum QBIC system, listed in the external links, which retrieves a large number of images based on the color layout you are looking for.

Color histograms are commonly used as an appearance-based signature to classify images in content-based image retrieval (CBIR) systems. Adding further information to the global color histogram signature, such as spatial information, or dividing an image into regions and storing local histograms for each of these areas, makes the signature for each image increasingly robust. Local color histograms are robust to partial occlusion and can be more efficient than global histograms for image retrieval in some cases. For example, illumination-insensitive object extraction can be achieved by applying a weighted color histogram based on color ratios to local histograms. Another technique for increasing the robustness of color histograms is to incorporate directional edge information to retain spatial information.
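The idea of dividing an image into regions and storing one histogram per region can be sketched as follows. This is a simplified illustration, assuming a fixed rectangular grid and 4-level channel quantization. The function name `local_histograms` and the test image are hypothetical, and the weighted and edge-based variants mentioned above are not covered.

```python
from collections import Counter


def local_histograms(image, grid_rows=2, grid_cols=2, levels=4):
    """Return {(cell_row, cell_col): Counter} for a grid laid over the image.

    `image` is a list of rows of (R, G, B) tuples. Each pixel is assigned
    to the grid cell that contains it and counted in that cell's histogram.
    """
    height, width = len(image), len(image[0])
    step = 256 // levels
    cells = {}
    for y, row in enumerate(image):
        for x, (R, G, B) in enumerate(row):
            cell = (y * grid_rows // height, x * grid_cols // width)
            color_bin = (R // step, G // step, B // step)
            cells.setdefault(cell, Counter())[color_bin] += 1
    return cells


red, blue = (255, 0, 0), (0, 0, 255)
# Hypothetical 4x4 image: left half red, right half blue.
image = [[red, red, blue, blue] for _ in range(4)]
mirrored = [row[::-1] for row in image]

cells = local_histograms(image)
mirrored_cells = local_histograms(mirrored)

# The global histograms match, but the per-cell histograms do not.
global_hist = sum(cells.values(), Counter())
mirrored_global = sum(mirrored_cells.values(), Counter())
print(global_hist == mirrored_global, cells == mirrored_cells)
```

The mirrored image shows what the extra spatial information buys: a global histogram cannot tell the two layouts apart, while the grid of local histograms can.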

In one large-scale image database application, over 15,000 images could be queried in under two seconds by refining color histograms using a technique called color coherence vectors.
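A color coherence vector refines a histogram by splitting each bin's count into "coherent" pixels, which belong to a large connected region of that color, and "incoherent" pixels, which do not. The sketch below illustrates the idea under several simplifying assumptions that are not taken from the text above: it uses 4-connectivity, skips any blurring or other preprocessing, and uses an arbitrary size threshold `tau`. The function name and sample image are hypothetical.

```python
from collections import defaultdict


def color_coherence_vector(image, levels=2, tau=3):
    """Return {color_bin: [coherent_count, incoherent_count]}.

    A pixel is coherent if its connected same-color region has at least
    `tau` pixels. Regions are found with an iterative flood fill.
    """
    height, width = len(image), len(image[0])
    step = 256 // levels
    quantized = [
        [(R // step, G // step, B // step) for (R, G, B) in row] for row in image
    ]
    seen = [[False] * width for _ in range(height)]
    ccv = defaultdict(lambda: [0, 0])
    for y in range(height):
        for x in range(width):
            if seen[y][x]:
                continue
            color = quantized[y][x]
            seen[y][x] = True
            stack, size = [(y, x)], 0
            while stack:
                cy, cx = stack.pop()
                size += 1
                # Visit the four direct neighbors (simplifying assumption).
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and quantized[ny][nx] == color):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            ccv[color][0 if size >= tau else 1] += size
    return dict(ccv)


R, W = (255, 0, 0), (255, 255, 255)
# Hypothetical 3x3 image: one large red block, scattered white and red pixels.
image = [
    [R, R, W],
    [R, R, W],
    [W, W, R],
]
ccv = color_coherence_vector(image)
print(ccv)
```

Summing the two counts for each bin recovers the ordinary color histogram, so the coherence vector only adds information to it.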

## References

8. http://vision.lbl.gov/People/han/tip02.pdf