Video Coding Format - Wikipedia
A video coding format[1][2] (or sometimes video compression format) is a content representation
format for storage or transmission of digital video content (such as in a data file or bitstream). It typically
uses a standardized video compression algorithm, most commonly based on discrete cosine transform
(DCT) coding and motion compensation. Examples of video coding formats include H.262 (MPEG-2 Part 2),
MPEG-4 Part 2, H.264 (MPEG-4 Part 10), HEVC (H.265), Theora, RealVideo RV40, VP9, and AV1. A
specific software or hardware implementation capable of compression or decompression to/from a specific
video coding format is called a video codec; an example of a video codec is Xvid, which is one of several
different codecs that implement encoding and decoding of video in the MPEG-4 Part 2 video coding format
in software.
Some video coding formats are documented by a detailed technical specification document known as a
video coding specification. Some such specifications are written and approved by standardization
organizations as technical standards, and are thus known as video coding standards. The term
'standard' is sometimes used for de facto standards as well as formal standards.
Video content encoded using a particular video coding format is normally bundled with an audio stream
(encoded using an audio coding format) inside a multimedia container format such as AVI, MP4, FLV,
RealMedia, or Matroska. As such, the user normally doesn't have an H.264 file, but instead has a .mp4 video
file, which is an MP4 container containing H.264-encoded video, normally alongside AAC-encoded audio.
Multimedia container formats can contain any one of a number of different video coding formats; for
example the MP4 container format can contain video in either the MPEG-2 Part 2 or the H.264 video coding
format, among others. Another example is the initial specification for the file type WebM, which specified
the container format (Matroska), but also exactly which video (VP8) and audio (Vorbis) compression formats
are used inside the Matroska container, even though the Matroska container format itself is capable of
containing other video coding formats (VP9 video and Opus audio support was later added to the WebM
specification).
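The relationship between a container and the coding formats it may carry can be sketched as a small lookup. This is an illustrative model only: the names `WEBM_ALLOWED` and `is_valid_webm` are hypothetical, and the format sets contain only what the paragraph above states.

```python
# Hypothetical model of which coding formats a WebM file may carry,
# for the initial specification and after the later additions
# mentioned in the text.
WEBM_ALLOWED = {
    "initial": {"video": {"VP8"}, "audio": {"Vorbis"}},
    "later": {"video": {"VP8", "VP9"}, "audio": {"Vorbis", "Opus"}},
}


def is_valid_webm(video_format, audio_format, revision="later"):
    """Return True if both coding formats are allowed in that WebM revision."""
    allowed = WEBM_ALLOWED[revision]
    return video_format in allowed["video"] and audio_format in allowed["audio"]
```

Under this model, `is_valid_webm("VP9", "Opus", "initial")` is false while `is_valid_webm("VP9", "Opus")` is true, which mirrors how the container format (Matroska) stays the same while the permitted coding formats change.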
Contents
Distinction between "format" and "codec"
History
Motion-compensated DCT
Video coding standards
List of video coding standards
Lossless, lossy, and uncompressed video coding formats
Intra-frame video coding formats
Profiles and levels
See also
References and notes
Distinction between "format" and "codec"
This distinction is not consistently reflected terminologically in the literature. The H.264 specification calls
H.261, H.262, H.263, and H.264 video coding standards and does not contain the word codec.[3] The
Alliance for Open Media clearly distinguishes between the AV1 video coding format and the accompanying
codec they are developing, but calls the video coding format itself a video codec specification.[4] The VP9
specification calls the video coding format VP9 itself a codec.[5]
As an example of conflation, Chromium's[6] and Mozilla's[7] pages listing their video format support both
call video coding formats such as H.264 codecs. As another example, in Cisco's announcement of a free-as-in-beer
video codec, the press release refers to the H.264 video coding format as a "codec" ("choice of a
common video codec"), but calls Cisco's implementation of an H.264 encoder/decoder a "codec" shortly
thereafter ("open-source our H.264 codec").[8]
A video coding format does not dictate all algorithms used by a codec implementing the format. For
example, a large part of how video compression typically works is by finding similarities between video
frames (block-matching), and then achieving compression by copying previously-coded similar subimages
(e.g., macroblocks) and adding small differences when necessary. Finding optimal combinations of such
predictors and differences is an NP-hard problem,[9] meaning that it is practically impossible to find an
optimal solution. While the video coding format must support such compression across frames in the
bitstream format, by not needlessly mandating specific algorithms for finding such block-matches and other
encoding steps, the codecs implementing the video coding specification have some freedom to optimize and
innovate in their choice of algorithms. For example, section 0.5 of the H.264 specification says that
encoding algorithms are not part of the specification.[3] Free choice of algorithm also allows different
space–time complexity trade-offs for the same video coding format, so a live feed can use a fast but
space-inefficient algorithm, while a one-time DVD encoding for later mass production can trade long
encoding time for space-efficient encoding.
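As a rough illustration of the block-matching step described above, the following sketch performs an exhaustive search using the sum of absolute differences (SAD) as the cost. The function names, the SAD cost, and the search radius are assumptions chosen for illustration. No video coding format mandates this particular search, and real encoders typically use faster heuristics.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def full_search(ref, cur, top, left, size, radius):
    """Find the block in `ref` that best matches the block of `cur` at (top, left).

    Every displacement within +/- radius is tried (exhaustive search).
    Returns (dy, dx, cost) for the lowest-cost candidate.
    """
    target = get_block(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would fall outside the frame
            cost = sad(get_block(ref, y, x, size), target)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best


# Toy 8x8 reference frame with distinct sample values.
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
# Current frame: the same content moved 2 rows down and 1 column right.
cur = [[ref[y - 2][x - 1] if y >= 2 and x >= 1 else 0 for x in range(8)]
       for y in range(8)]

motion = full_search(ref, cur, top=4, left=4, size=2, radius=2)
```

Here `motion` is `(-2, -1, 0)`: the block is found two rows up and one column left in the reference frame, with zero residual. An encoder with a tighter time budget could shrink `radius` or stop at the first acceptable cost, trading compression efficiency for speed, and the bitstream would remain valid in either case.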
History
The concept of analog video compression dates back to 1929, when R.D. Kell in Britain proposed the
concept of transmitting only the portions of the scene that changed from frame-to-frame. The concept of
digital video compression dates back to 1952, when Bell Labs researchers B.M. Oliver and C.W. Harrison
proposed the use of differential pulse-code modulation (DPCM) in video coding. The concept of inter-frame
motion compensation dates back to 1959, when NHK researchers Y. Taki, M. Hatori and S. Tanaka proposed
predictive inter-frame video coding in the temporal dimension.[10] In 1967, University of London
researchers A.H. Robinson and C. Cherry proposed run-length encoding (RLE), a lossless compression
scheme, to reduce the transmission bandwidth of analog television signals.[11]
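The general idea of run-length encoding can be sketched in a few lines. This is a generic textbook version operating on a list of sample values, not a reconstruction of the scheme Robinson and Cherry proposed; the function names are illustrative.

```python
def rle_encode(samples):
    """Collapse runs of equal values into [value, run_length] pairs."""
    runs = []
    for value in samples:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into the original samples."""
    samples = []
    for value, count in runs:
        samples.extend([value] * count)
    return samples


scanline = [0, 0, 0, 0, 255, 255, 0, 0, 0]
encoded = rle_encode(scanline)
```

For this scanline `encoded` is `[[0, 4], [255, 2], [0, 3]]`, and decoding returns the input exactly, which is what makes the scheme lossless.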
The earliest digital video coding algorithms were either for uncompressed video or used lossless
compression, both methods inefficient and impractical for digital video coding.[12][13] Digital video was
introduced in the 1970s,[12] initially using uncompressed pulse-code modulation (PCM) requiring high
bitrates around 45–200 Mbit/s for standard-definition (SD) video,[12][13] which was up to 2,000 times
greater than the telecommunication bandwidth (up to 100 kbit/s) available until the 1990s.[13] Similarly,
uncompressed high-definition (HD) 1080p video requires bitrates exceeding 1 Gbit/s, significantly greater
than the bandwidth available in the 2000s.[14]
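The figures above can be checked with simple arithmetic. The sketch below assumes 24 bits per pixel and 30 frames per second for 1080p video; those two parameters are assumptions for illustration and are not stated in the text.

```python
def raw_bitrate(width, height, bits_per_pixel, frames_per_second):
    """Bits per second needed for uncompressed video."""
    return width * height * bits_per_pixel * frames_per_second


# Assumed parameters: 24 bits per pixel, 30 frames per second.
hd_1080p = raw_bitrate(1920, 1080, 24, 30)

# Upper SD PCM bitrate (200 Mbit/s) against 100 kbit/s of bandwidth.
sd_to_bandwidth_ratio = 200_000_000 // 100_000
```

With these assumptions `hd_1080p` is 1,492,992,000 bit/s, or roughly 1.5 Gbit/s, and `sd_to_bandwidth_ratio` is 2,000, consistent with the "exceeding 1 Gbit/s" and "up to 2,000 times" figures.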
Motion-compensated DCT
Practical video compression was made possible by the development of motion-compensated DCT (MC DCT)
coding,[13][12] also called block motion compensation (BMC)[10] or DCT motion compensation. This is a
hybrid coding algorithm,[10] which combines two key data compression techniques: discrete cosine
transform (DCT) coding[13][12] in the spatial dimension, and predictive motion compensation in the
temporal dimension.[10]
DCT coding is a lossy block compression transform coding technique that was first proposed by Nasir
Ahmed, who initially intended it for image compression, while he was working at Kansas State University in
1972. It was then developed into a practical image compression algorithm by Ahmed with T. Natarajan and
K. R. Rao at the University of Texas in 1973, and was published in 1974.[15][16][17]
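A minimal sketch of block DCT coding follows. It uses the textbook orthonormal DCT-II on an 8x8 block and a single uniform quantization step, both of which are assumptions for illustration. It is not the formulation from the cited papers, nor the transform of any particular video coding format.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """Two-dimensional DCT: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse two-dimensional DCT: C^T * coeffs * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def quantize(coeffs, step):
    """Uniform quantization; this rounding is where information is lost."""
    return [[round(value / step) for value in row] for row in coeffs]


# A smooth 8x8 block (a gentle ramp), typical of natural image content.
block = [[10 * x + y for x in range(8)] for y in range(8)]
coeffs = dct2(block)
levels = quantize(coeffs, step=16)
nonzero = sum(1 for row in levels for value in row if value != 0)
```

The transform itself is invertible: `idct2(coeffs)` recovers `block` up to floating-point error. The compression comes from quantization, after which only a small number of the 64 coefficients in `levels` are non-zero for smooth content like this ramp.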
The other key development was motion-compensated hybrid coding.[10] In 1974, Ali Habibi at the
University of Southern California introduced hybrid coding,[18][19][20] which combines predictive coding
with transform coding.[10][21] He examined several transform coding techniques, including the DCT,
Hadamard transform, Fourier transform, slant transform, and Karhunen-Loeve transform.[18] However, his
algorithm was initially limited to intra-frame coding in the spatial dimension. In 1975, John A. Roese and
Guner S. Robinson extended Habibi's hybrid coding algorithm to the temporal dimension, using transform
coding in the spatial dimension and predictive coding in the temporal dimension, developing inter-frame
motion-compensated hybrid coding.[10][22] For the spatial transform coding, they experimented with
different transforms, including the DCT and the fast Fourier transform (FFT), developing inter-frame
hybrid coders for them, and found that the DCT is the most efficient due to its reduced complexity, capable
of compressing image data down to 0.25 bits per pixel for a videotelephone scene with image quality
comparable to a typical intra-frame coder requiring 2 bits per pixel.[23][22]
The DCT was applied to video encoding by Wen-Hsiung Chen,[24] who developed a fast DCT algorithm with
C.H. Smith and S.C. Fralick in 1977,[25][26] and founded Compression Labs to commercialize DCT
technology.[24] In 1979, Anil K. Jain and Jaswant R. Jain further developed motion-compensated DCT video
compression.[27][10] This led to Chen developing a practical video compression algorithm, called motion-
compensated DCT or adaptive scene coding, in 1981.[10] Motion-compensated DCT later became the
standard coding technique for video compression from the late 1980s onwards.[12][28]
Video coding standards
The first digital video coding standard was H.120, developed by the CCITT (now ITU-T) in 1984.[29] H.120
was not usable in practice, as its performance was too poor.[29] H.120 used motion-compensated DPCM
coding,[10] a lossless compression algorithm that was inefficient for video coding.[12] During the late 1980s,
a number of companies began experimenting with discrete cosine transform (DCT) coding, a much more
efficient form of compression for video coding. The CCITT received 14 proposals for DCT-based video
compression formats, in contrast to a single proposal based on vector quantization (VQ) compression. The
H.261 standard was developed based on motion-compensated DCT compression.[12][28] H.261 was the first
practical video coding standard,[29] and was developed with patents licensed from a number of companies,
including Hitachi, PictureTel, NTT, BT, and Toshiba, among others.[30] Since H.261, motion-compensated
DCT compression has been adopted by all the major video coding standards (including the H.26x and
MPEG formats) that followed.[12][28]
MPEG-1, developed by the Moving Picture Experts Group (MPEG), followed in 1991, and it was designed to
compress VHS-quality video.[29] It was succeeded in 1994 by MPEG-2/H.262,[29] which was developed with
patents licensed from a number of companies, primarily Sony, Thomson and Mitsubishi Electric.[31] MPEG-
2 became the standard video format for DVD and SD digital television.[29] Its motion-compensated DCT
algorithm was able to achieve a compression ratio of up to 100:1, enabling the development of digital media
technologies such as video-on-demand (VOD)[13] and high-definition television (HDTV).[32] In 1999, it was
followed by MPEG-4/H.263, which was a major leap forward for video compression technology.[29] It was
developed with patents licensed from a number of companies, primarily Mitsubishi, Hitachi and
Panasonic.[33]
The most widely used video coding format as of 2019 is H.264/MPEG-4 AVC.[34] It was developed in 2003,
with patents licensed from a number of organizations, primarily Panasonic, Godo Kaisha IP Bridge and LG
Electronics.[35] In contrast to the standard DCT used by its predecessors, AVC uses the integer DCT.[24][36]
H.264 is one of the video encoding standards for Blu-ray Discs; all Blu-ray Disc players must be able to
decode H.264. It is also widely used by streaming internet sources, such as videos from YouTube, Netflix,
Vimeo, and the iTunes Store, web software such as the Adobe Flash Player and Microsoft Silverlight, and
also various HDTV broadcasts over terrestrial (Advanced Television Systems Committee standards, ISDB-T,
DVB-T or DVB-T2), cable (DVB-C), and satellite (DVB-S2).
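To illustrate what an integer DCT means in practice, the sketch below applies the widely documented 4x4 forward core transform matrix associated with H.264, using only integer arithmetic. The scaling that the standard folds into quantization is omitted, and the exact rational inverse shown here exists only to demonstrate that the transform loses no information. It is not the integer inverse transform defined by the standard.

```python
from fractions import Fraction

# 4x4 forward core transform matrix (an integer approximation of the DCT).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]
# The rows of CF are orthogonal; these are their squared norms,
# so CF * CF^T = diag(4, 10, 4, 10).
ROW_NORMS = [4, 10, 4, 10]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_core(block):
    """Integer-only forward transform: CF * block * CF^T."""
    return matmul(matmul(CF, block), transpose(CF))


def inverse_exact(coeffs):
    """Exact rational inverse, for illustration only (not the standard's)."""
    scaled = [[Fraction(coeffs[i][j], ROW_NORMS[i] * ROW_NORMS[j])
               for j in range(4)] for i in range(4)]
    return matmul(matmul(transpose(CF), scaled), CF)


block = [
    [5, 11, 8, 10],
    [9, 8, 4, 12],
    [1, 10, 11, 4],
    [19, 6, 15, 7],
]
coeffs = forward_core(block)
```

Every entry of `coeffs` is an integer, and `coeffs[0][0]` equals the sum of the block's samples (140). Because no floating-point arithmetic is involved, different implementations can produce bit-identical results.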
A main problem for many video coding formats has been patents, which make a format expensive to use or
potentially expose users to a patent lawsuit due to submarine patents. The motivation behind many recently
designed video coding formats such as Theora, VP8 and VP9 has been to create a (libre) video coding standard covered
only by royalty-free patents.[37] Patent status has also been a major point of contention for the choice of
which video formats the mainstream web browsers will support inside the HTML5 video tag.
The current-generation video coding format is HEVC (H.265), introduced in 2013. While AVC uses the
integer DCT with 4x4 and 8x8 block sizes, HEVC uses integer DCT and DST transforms with varied block
sizes between 4x4 and 32x32.[38] HEVC is heavily patented, with the majority of patents belonging to
Samsung Electronics, GE, NTT and JVC Kenwood.[39] It is currently being challenged by the aiming-to-be-
freely-licensed AV1 format. As of 2019, AVC is by far the most commonly used format for the recording,
compression and distribution of video content, used by 91% of video developers, followed by HEVC which is
used by 43% of developers.[34]
List of video coding standards

| Basic algorithm | Video coding standard | Year | Publisher(s) | Committee(s) | Licensor(s) | Market share (2019)[34] | Popular implementations |
|---|---|---|---|---|---|---|---|
| DCT | H.261 | 1988 | CCITT | VCEG | Hitachi, PictureTel, NTT, BT, Toshiba, etc.[30] | N/A | Videoconferencing, videotelephony |
| DCT | Motion JPEG (MJPEG) | 1992 | JPEG | JPEG | N/A | N/A | QuickTime |
| DCT | MPEG-1 Part 2 | 1993 | ISO, IEC | MPEG | Fujitsu, IBM, Matsushita, etc.[40] | N/A | Video-CD, Internet video |
| DCT | H.262 / MPEG-2 Part 2 (MPEG-2 Video) | 1995 | ISO, IEC, ITU-T | MPEG, VCEG | Sony, Thomson, Mitsubishi, etc.[31] | 29% | DVD Video, Blu-ray, DVB, ATSC, SVCD, SDTV |
| DCT | H.263 | 1996 | ITU-T | VCEG | Mitsubishi, Hitachi, Panasonic, etc.[33] | Unknown | Videoconferencing, videotelephony, H.320, Integrated Services Digital Network (ISDN),[41][42] mobile video (3GP), MPEG-4 Visual |
| DCT | MPEG-4 Part 2 (MPEG-4 Visual) | 1999 | ISO, IEC | MPEG | Mitsubishi, Hitachi, Panasonic, etc.[33] | Unknown | Internet video, DivX, Xvid |
| DWT | Motion JPEG 2000 (MJ2) | 2001 | JPEG[43] | JPEG[44] | N/A | Unknown | Digital cinema[45] |
| DCT | Advanced Video Coding (H.264 / MPEG-4 AVC) | 2003 | ISO, IEC, ITU-T | MPEG, VCEG | Panasonic, Godo Kaisha IP Bridge, LG, etc.[35] | 91% | Blu-ray, HD DVD, HDTV (DVB, ATSC), video streaming (YouTube, Netflix, Vimeo), iTunes Store, iPod Video, Apple TV, videoconferencing, Flash Player, Silverlight, VOD |
| DCT | VC-1 | 2006 | SMPTE | SMPTE | Microsoft, Panasonic, LG, Samsung, etc.[46] | Unknown | Blu-ray, Internet video |
| DCT | High Efficiency Video Coding (H.265 / MPEG-H HEVC) | 2013 | ISO, IEC, ITU-T | MPEG, VCEG | Samsung, GE, NTT, JVC Kenwood, etc.[39][47] | 43% | UHD Blu-ray, DVB, ATSC 3.0, UHD streaming, High Efficiency Image Format, macOS High Sierra, iOS 11 |
| DCT | AV1 | 2018 | AOMedia | AOMedia | N/A | 7% | HTML5 video |
| DCT | Versatile Video Coding (VVC / H.266) | 2020 | JVET | JVET | Unknown | N/A | N/A |
Lossless, lossy, and uncompressed video coding formats
Uncompressed video formats, such as Clean HDMI, are a form of lossless video used in some circumstances,
such as when sending video to a display over an HDMI connection. Some high-end cameras can also capture
video directly in this format.
Intra-frame video coding formats
Because interframe compression copies data from one frame to another, if the original frame is simply cut
out (or lost in transmission), the following frames cannot be reconstructed properly. Making 'cuts' in
intraframe-compressed video while video editing is almost as easy as editing uncompressed video: one finds
the beginning and ending of each frame, and simply copies bit-for-bit each frame that one wants to keep,
and discards the frames one doesn't want. Another difference between intraframe and interframe
compression is that, with intraframe systems, each frame uses a similar amount of data. In most interframe
systems, certain frames (such as "I frames" in MPEG-2) aren't allowed to copy data from other frames, so
they require much more data than other frames nearby.[49]
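The effect of cutting a frame out of an interframe-compressed sequence can be sketched with a deliberately simplified dependency model. The assumptions here are illustrative and not taken from any specific format: an "I" frame needs no other frame, and a "P" frame needs only the frame immediately before it.

```python
def decodable_frames(frame_types, removed):
    """Return the indices of frames that can still be decoded.

    Simplified model: 'I' frames stand alone; each 'P' frame depends on
    the immediately preceding frame having been decoded.
    """
    decodable = []
    previous_ok = False
    for index, kind in enumerate(frame_types):
        if index in removed:
            previous_ok = False
            continue
        if kind == "I" or previous_ok:
            decodable.append(index)
            previous_ok = True
        else:
            previous_ok = False
    return decodable


inter_coded = "IPPPIPPP"
intra_coded = "IIIIIIII"

after_cut_inter = decodable_frames(inter_coded, removed={0})
after_cut_intra = decodable_frames(intra_coded, removed={0})
```

Removing frame 0 from the interframe sequence leaves `[4, 5, 6, 7]`, because frames 1 to 3 have lost their reference. Removing it from the intra-only sequence leaves every remaining frame decodable, `[1, 2, 3, 4, 5, 6, 7]`.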
It is possible to build a computer-based video editor that spots problems caused when I frames are edited
out while other frames need them. This has allowed newer formats like HDV to be used for editing.
However, this process demands a lot more computing power than editing intraframe-compressed video with
the same picture quality. This form of compression is also not very effective for audio formats.
Profiles and levels
A profile restricts which encoding techniques are allowed. For example, the H.264 format includes the
profiles baseline, main and high (and others). While P-slices (which can be predicted based on preceding
slices) are supported in all profiles, B-slices (which can be predicted based on both preceding and following
slices) are supported in the main and high profiles but not in baseline.[50]
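The idea of a profile as a restriction on allowed techniques can be sketched as a set-membership check. The table below encodes only the P-slice and B-slice rules stated above, plus the assumption that I-slices are allowed in every profile. It ignores all other coding tools and levels, and the names are hypothetical.

```python
# Slice types permitted per profile. The P/B rules come from the text;
# allowing I-slices everywhere is an assumption of this sketch.
ALLOWED_SLICE_TYPES = {
    "baseline": {"I", "P"},
    "main": {"I", "P", "B"},
    "high": {"I", "P", "B"},
}


def conforms(profile, slice_types):
    """Return True if every slice type used is permitted by the profile."""
    return set(slice_types) <= ALLOWED_SLICE_TYPES[profile]
```

For example, `conforms("baseline", "IPPP")` is true, while `conforms("baseline", "IBBP")` is false, so a decoder that supports only the baseline profile could reject the second stream.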
See also
Comparison of container formats
Data compression § Video
List of video compression formats
Video file format