Digital watermarking is the process of embedding information into a digital signal. The signal may be audio, pictures or video, for example. If the signal is copied, then the information is also carried in the copy.
In visible watermarking, the information is visible in the picture or video. Typically, the information is text or a logo which identifies the owner of the media. The image on the right has a visible watermark. When a television broadcaster adds its logo to the corner of transmitted video, this is also a visible watermark.
In invisible watermarking, information is added as digital data to audio, picture or video, but it cannot be perceived as such. An important application of invisible watermarking is in copyright protection systems, which are intended to prevent or deter unauthorized copying of digital media. Steganography is an application of digital watermarking in which two parties communicate a secret message embedded in the digital signal. Annotation of digital photographs with descriptive information is another application of invisible watermarking. While some file formats for digital media can contain additional information called metadata, digital watermarking is distinct in that the data is carried in the signal itself.
The use of the word watermarking is derived from the much older notion of placing a visible watermark on paper.
Instance of a Digital Watermarking Scheme
A general watermarking scheme is defined as a tuple made up of the embedding function E, the detecting function D, the retrieval function R and the message M, together with three parameter sets. The embedding parameters define the parameter set used for watermark embedding, and the detection parameters and retrieval parameters do the same for detection and retrieval. Hence, each watermarking scheme may have different instances according to the values that these parameters adopt. An instance of the watermarking scheme is the scheme taken for one particular value of the parameter vectors.
Watermarking Life-Cycle Phases
In general, the usage of digital watermarking can be simplified as follows. An unmarked (usually original) signal is the source signal, into which the watermark is embedded by an embedding function E. The result is the marked signal. This step is assumed to take place in a secure environment. The next step could be, for example, the distribution of the marked signal over the Internet, or its storage to provide authenticity or integrity checks. These processes are regarded as the insecure part, where attacks on the marked signal can occur. After distribution, the marked signal is treated as a separate, possibly attacked signal, because potential attacks could have destroyed the watermark. A detecting function D tries to detect the watermark, or a retrieval function R tries to retrieve the embedded message M. Detection or retrieval can be done in a secure or insecure environment, depending on the application of the watermarking algorithm.
The complete scenario is defined as the life cycle of a watermark, because it begins with embedding and ends with detection or retrieval. This is shown in the following figure with the expected secure and insecure parts.
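The life cycle can be illustrated with a minimal Python sketch. It assumes a simple additive, key-dependent pseudo-random pattern and a correlation detector, which is only one of many possible techniques and is not prescribed by the description above. The function names, the key, the embedding strength, the noise level and the threshold are all hypothetical values chosen for illustration.

```python
import random

def make_pattern(key, n):
    """Key-dependent pseudo-random +/-1 pattern (stands in for the watermark)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(cover, key, strength=4.0):
    """Embedding function E: add the scaled pattern to the cover signal."""
    pattern = make_pattern(key, len(cover))
    return [s + strength * p for s, p in zip(cover, pattern)]

def attack(marked, noise_std=2.0, seed=7):
    """Insecure part: model an attack as additive Gaussian noise (an assumption)."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, noise_std) for x in marked]

def detect(received, key, threshold=2.0):
    """Detecting function D: correlate with the pattern and apply a threshold."""
    pattern = make_pattern(key, len(received))
    corr = sum(x * p for x, p in zip(received, pattern)) / len(received)
    return corr > threshold

# Secure environment: create a synthetic cover signal and embed the watermark.
source = random.Random(1)
cover = [source.gauss(0.0, 10.0) for _ in range(5000)]
marked = embed(cover, key=1234)

# Insecure environment: the marked signal is distributed and attacked.
received = attack(marked)

# Detection on the received signal and, for comparison, on the unmarked cover.
print("watermark detected in received signal:", detect(received, key=1234))
print("watermark detected in unmarked cover:", detect(cover, key=1234))
```

In this sketch the correlation of a marked signal with the correct pattern lies close to the embedding strength, while an unmarked signal or a wrong key gives a value near zero, so the threshold is placed between the two.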
The information to be embedded is called a digital watermark, although in some contexts the phrase digital watermark means the difference between the watermarked signal and the cover signal. The signal where the watermark is to be embedded is called the host signal. A watermarking system is usually divided into three distinct steps: embedding, attack and detection. In embedding, an algorithm accepts the host and the data to be embedded and produces a watermarked signal.
The watermarked signal is then transmitted or stored, usually transmitted to another person. If this person makes a modification, this is called an attack. While the modification may not be malicious, the term attack arises from the copyright protection application, where pirates attempt to remove the digital watermark through modification. There are many possible modifications, for example, lossy compression of the data, cropping an image or video, or intentionally adding noise.
Detection (often called extraction) is an algorithm which is applied to the attacked signal to attempt to extract the watermark from it. If the signal was unmodified during transmission, then the watermark is still present and it can be extracted. In robust watermarking applications, the extraction algorithm should be able to correctly produce the watermark, even if the modifications were strong. In fragile watermarking, the extraction algorithm should fail if any change is made to the signal.
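The behaviour expected of a fragile watermark can be sketched in Python. The example assumes 8-bit samples and one possible construction: a SHA-256 digest of the seven most significant bits of every sample is written into the least significant bits. This construction, and the names `embed_fragile` and `verify_fragile`, are illustrative assumptions rather than a method given in the text.

```python
import hashlib

def _digest_bits(samples, n):
    """Hash the upper 7 bits of all samples and return n bits of the digest."""
    upper = bytes(s & 0xFE for s in samples)  # the LSBs are ignored here
    digest = hashlib.sha256(upper).digest()
    bits = [(byte >> i) & 1 for byte in digest for i in range(8)]
    # Repeat the 256 digest bits cyclically if there are more samples.
    return [bits[i % len(bits)] for i in range(n)]

def embed_fragile(samples):
    """Embedding: overwrite each LSB with one bit of the digest."""
    bits = _digest_bits(samples, len(samples))
    return [(s & 0xFE) | b for s, b in zip(samples, bits)]

def verify_fragile(samples):
    """Detection: succeeds only if every LSB still matches the digest."""
    bits = _digest_bits(samples, len(samples))
    return all((s & 1) == b for s, b in zip(samples, bits))

# Hypothetical 8-bit cover signal with 64 samples.
cover = [(i * 37) % 256 for i in range(64)]
marked = embed_fragile(cover)

# A modification of a single sample stands in for an attack.
tampered = list(marked)
tampered[10] ^= 0x10

print("unmodified signal verifies:", verify_fragile(marked))
print("tampered signal verifies:", verify_fragile(tampered))
```

Embedding changes each sample by at most one quantisation step, while a change to any sample is expected to make verification fail. A robust scheme would be designed for the opposite goal, so that the watermark survives such modifications.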
In general, the fundamental watermarking parameters are classified into seven watermarking properties: capacity, complexity, invertibility, transparency, robustness, security and verification (in alphabetical order).
Capacity is in general divided into embedding capacity and retrieval capacity.
The embedding capacity of a watermarking scheme is defined as the amount of information that is (or appears to be) embedded into the cover object to obtain the marked object. A simple capacity measure would be related to the size of the embedded message, i.e. the size of M. In addition, capacity is often given relative to the size of the cover object, i.e. as the ratio of the message size to the cover size.
Note that such a measure only takes into account the information embedded, but not the information that is retrieved. Note also that this measure does not consider the possibility of repeat coding, in which the mark is replicated as many times as needed prior to its insertion. All these issues are related to the retrieval capacity, which is defined in terms of the retrieval function.
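A short Python sketch may make the measures concrete. It assumes that message size is counted in bits and cover size in samples, so that relative capacity is expressed in bits per sample. These units and the helper names are illustrative choices, not part of the definitions above.

```python
def embedding_capacity(embedded_bits):
    """Simple capacity measure: the number of embedded bits."""
    return len(embedded_bits)

def relative_capacity(embedded_bits, cover):
    """Embedded bits divided by the cover size (bits per sample here)."""
    return len(embedded_bits) / len(cover)

def repeat_code(message_bits, copies):
    """Repeat coding: replicate the whole message before insertion."""
    return [bit for _ in range(copies) for bit in message_bits]

message = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical 8-bit message
cover = [0] * 64                    # hypothetical cover of 64 samples

print(embedding_capacity(message))        # 8
print(relative_capacity(message, cover))  # 0.125

# With four repetitions, 32 bits are inserted, but they still carry
# only the original 8 bits of information.
repeated = repeat_code(message, copies=4)
print(embedding_capacity(repeated))        # 32
print(relative_capacity(repeated, cover))  # 0.5
```

The second pair of values shows why the embedding measure alone can overstate capacity: the repeated mark occupies half of the cover, although the retrievable message is no longer than before.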
Retrieval capacity is defined with respect to the retrieved message. First of all, zero-bit watermarking schemes do not transmit any message, since the watermark is only detected and no message M is retrieved. In such a case, the retrieval capacity of these schemes is zero.
For non-zero-bit watermarking schemes, the retrieval capacity is considered after data extraction, and a retrieval capacity function is defined for this purpose. When the message has been embedded repeatedly, the retrieval capacity should consider all the repetitions as follows