There are two main reasons why gaps occur during playback: compression scheme artifacts, and delayed output.
This issue is partly technical and partly a matter of standards. The popular MP3 standard, for example, defines no way to record the amount of delay or padding so that it can be removed later. The encoder delay may also vary from encoder to encoder, which makes automatic removal difficult. Even if two tracks are decompressed and merged into a single track, a gap will usually remain between them. More recent compressed audio formats (such as Ogg Vorbis) were designed to address this problem and can therefore produce gapless audio if played back correctly.
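When the delay and padding are known, removing them is a matter of slicing the decoded samples. The following is a minimal sketch under that assumption. The function name and the sample values are hypothetical, and how a given format stores these two numbers is not covered here.

```python
def trim_gapless(samples, encoder_delay, padding):
    """Drop `encoder_delay` leading samples and `padding` trailing samples."""
    if encoder_delay < 0 or padding < 0:
        raise ValueError("delay and padding must be non-negative")
    end = len(samples) - padding
    if end < encoder_delay:
        raise ValueError("delay plus padding exceeds the decoded length")
    return samples[encoder_delay:end]


# Hypothetical decoded track: 2 samples of delay, 5 of audio, 3 of padding.
decoded = [0, 0, 11, 12, 13, 14, 15, 0, 0, 0]
original = trim_gapless(decoded, encoder_delay=2, padding=3)
print(original)
```

Without the two numbers, a player has no reliable way to tell the added silence from silence that belongs to the recording.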
A different design problem arises when software, firmware, or hardware is not ready to move seamlessly to the next track by the time the current track ends. In this scenario, the listener is left waiting in silence while the player locates the next file, reads it, decodes the first blocks if necessary, and then starts filling the buffer for playback. The gap can be as much as half a second or even more, which is very noticeable in "continuous" music such as certain classical or dance genres.
Many older audio players on personal computers do not implement the required buffering to play gapless audio. Some of these rely on third-party gapless audio plug-ins to buffer output. Some newer players and newer versions of old players now support gapless playback directly.
That alone may not eliminate introduced gaps. It may also be necessary to ensure that the audio hardware itself is not stopped and restarted between tracks, since doing so can add a click. It may also help to process the next track while the current one is playing, so that the data is available as a continuous stream.
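One way to picture this is a single stream of decoded blocks that spans track boundaries, with the next track opened before the current one has finished. The sketch below assumes a hypothetical `open_track(path)` callable that returns an iterator of decoded blocks. A real player would feed the resulting stream to one output device that stays open throughout.

```python
def gapless_blocks(open_track, paths):
    """Yield decoded blocks from all tracks as one unbroken stream.

    `open_track` is a hypothetical callable: path -> iterator of blocks.
    The following track is opened before the current one is drained.
    """
    paths = iter(paths)
    first = next(paths, None)
    upcoming = open_track(first) if first is not None else None
    while upcoming is not None:
        current = upcoming
        # Prepare the next track now, while the current one still has data.
        following = next(paths, None)
        upcoming = open_track(following) if following is not None else None
        yield from current


# Stand-in decoder used only for illustration.
opened = []

def fake_open(path):
    opened.append(path)
    return iter([f"{path}-block0", f"{path}-block1"])

blocks = list(gapless_blocks(fake_open, ["a", "b"]))
print(blocks)
```

The consumer never sees a track boundary, so it has no reason to close and reopen the output.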
For those seeking precision, this may be the ideal solution because the software performs no guesswork: the playback timing would be identical to the source.
It can also be difficult to properly implement silence removal. If the silence threshold is too low and the track contains decoder artifacts, the software may not recognize some silences. Conversely, if the threshold is too high, the software may remove entire sections of quiet music at the beginning or end of a track.
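The threshold trade-off can be seen in a small sketch. It assumes a simple amplitude test on integer samples; the function name and the sample values are hypothetical, and real players may use more elaborate detection.

```python
def trim_silence(samples, threshold):
    """Remove leading and trailing samples whose magnitude is below `threshold`."""
    start = 0
    while start < len(samples) and abs(samples[start]) < threshold:
        start += 1
    end = len(samples)
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]


# Hypothetical track: decoder noise, a quiet intro (8, 8), loud music, noise.
track = [1, -2, 2, 8, 8, 500, -400, 300, -1, 2]

too_low = trim_silence(track, threshold=2)    # noise of magnitude 2 is kept
too_high = trim_silence(track, threshold=10)  # the quiet intro is cut as well
balanced = trim_silence(track, threshold=5)   # only the noise is removed
```

No single threshold suits every recording, which is why this approach remains guesswork.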
DSP plugins can also be used to cross-fade between tracks. This eliminates gaps that some listeners find distracting, but it also greatly alters the audio data and is not always desirable. In particular, when tracks are meant to be played together and the transition between them occurs at high volume, cross-fading results in a large volume drop.
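A linear cross-fade, one of several possible fade curves, can be sketched as follows. The function name and values are hypothetical. Note that at the midpoint each track is scaled to half amplitude, and because the two overlapping regions are different material they generally do not sum back to the original level.

```python
def crossfade(tail, head):
    """Linearly fade `tail` out while fading `head` in over the same span."""
    if len(tail) != len(head):
        raise ValueError("overlap regions must have equal length")
    n = len(tail)
    mixed = []
    for i in range(n):
        fade_in = (i + 1) / (n + 1)  # gain applied to the incoming track
        fade_out = 1.0 - fade_in     # gain applied to the outgoing track
        mixed.append(tail[i] * fade_out + head[i] * fade_in)
    return mixed


# Outgoing track at full level, incoming track silent during the overlap.
faded = crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
print(faded)
```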
Both of these alternative solutions are typically used with compression methods that do not support the metadata needed for gapless playback. Like the optimal solution, they still require buffering and an output stream that is not closed between tracks; however, they require more computation, which makes them less efficient. In portable digital audio players, this can mean reduced playing time on batteries.
Some listeners find the side effects of these alternatives more objectionable than the gap they are meant to remove. Another problem is that these solutions do nothing to prevent the output stream from being closed and reopened at track boundaries; some measures can be taken to simulate a gapless output stream, but they are not always successful and side effects may occur.
Another alternative is to ignore track boundaries and encode a whole collection of tracks as a single compressed file, relying on cue sheets (or something similar) for navigation. While this method gives gapless playback when the tracks in the collection are played consecutively, it can be unwieldy because the resulting compressed file may be large. Furthermore, unless the playback software or hardware can recognize the cue sheets, navigating between tracks may be difficult.
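Navigation within such a file amounts to seeking to a sample offset. As an illustration, the sketch below converts a cue sheet index time in MM:SS:FF form (75 frames per second) into a sample offset, assuming CD audio at 44,100 Hz. The function name is hypothetical.

```python
SAMPLE_RATE = 44100     # CD audio sample rate in Hz (assumed)
FRAMES_PER_SECOND = 75  # cue sheet times count frames of 1/75 second
SAMPLES_PER_FRAME = SAMPLE_RATE // FRAMES_PER_SECOND  # 588


def cue_time_to_samples(timestamp):
    """Convert an 'MM:SS:FF' cue sheet time into a sample offset."""
    minutes, seconds, frames = (int(part) for part in timestamp.split(":"))
    if not (0 <= seconds < 60 and 0 <= frames < FRAMES_PER_SECOND):
        raise ValueError(f"invalid cue time: {timestamp}")
    total_frames = (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames
    return total_frames * SAMPLES_PER_FRAME


offset = cue_time_to_samples("03:25:40")
print(offset)
```

A player that understands the cue sheet can jump to this offset inside the single file instead of opening a new one.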
Finally, with some implementations it is possible to add gapless metadata to existing files. If the encoder is known, the encoder delay can be guessed. Assuming the files were created by compressing CD audio, the original playback length will be an integer multiple of 588 samples, so the total playback time can be guessed as well. Adding such information to audio files will work with implementations that recognize the metadata.
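The 588-sample constraint narrows the possibilities but does not always settle them. The sketch below lists the original lengths consistent with a decoded length, a guessed encoder delay, and an assumed upper bound on padding. All numbers, names, and the padding bound are hypothetical.

```python
CD_FRAME = 588  # samples per CD frame (44,100 Hz / 75 frames per second)


def candidate_lengths(decoded_length, encoder_delay, max_padding):
    """List original lengths (multiples of 588) that fit the decoded length.

    Assumes decoded_length = encoder_delay + original + padding,
    with 0 <= padding <= max_padding.
    """
    available = decoded_length - encoder_delay
    if available < 0:
        raise ValueError("delay exceeds the decoded length")
    lowest = max(0, available - max_padding)
    first = -(-lowest // CD_FRAME) * CD_FRAME  # round up to a multiple of 588
    return list(range(first, available + 1, CD_FRAME))


# Hypothetical file: 588,672 decoded samples, guessed delay of 576 samples.
guesses = candidate_lengths(588672, encoder_delay=576, max_padding=1151)
print(guesses)
```

Here two lengths fit, so the result is still a guess that may need a further rule or manual confirmation.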
Because lossless data compression reproduces the source exactly and therefore introduces no padding, all lossless audio file formats are inherently gapless.
These lossy audio file formats have provisions for gapless encoding:
Some other formats do not officially support gapless encoding, but some implementations of encoders or decoders may handle gapless metadata.
Alternative or partial solutions: