The 137th meeting of MPEG was held online from 2022-01-17 until 2022-01-21.
MPEG Systems Wins Two More Technology & Engineering Emmy® Awards
MPEG Systems is pleased to report that MPEG is being recognized this year by the National Academy of Television Arts and Sciences (NATAS) with two Technology & Engineering Emmy® Awards, for (i) “standardization of font technology for custom downloadable fonts and typography for Web and TV devices” and (ii) “standardization of HTTP encapsulated protocols”, respectively.
The first of these Emmys is related to MPEG’s Open Font Format (ISO/IEC 14496-22). Fonts are the critical components of any written communication. Text carries a meaning, but it is the font that makes text readable, i.e., fonts give the written word its voice. The standardization of the Open Font Format technology by MPEG over the last 20 years has significantly influenced the capabilities of all classes of consumer electronic devices, bringing advanced font technology for digital TV, streaming media environments, and the Web. It also inspired many open-source projects that enabled mass adoption of high-quality font rendering and advanced text support, making high quality font support easy and cost-effective for OEMs, service providers and content authors to deploy new features and applications supporting all of the world’s languages and writing systems.
The second of these Emmys is related to MPEG Dynamic Adaptive Streaming over HTTP (i.e., MPEG DASH, ISO/IEC 23009). The MPEG DASH standard is the only commercially deployed international standard technology for media streaming over HTTP and it is widely used in many products. MPEG developed the first edition of the DASH standard in 2012 in collaboration with 3GPP and since then has produced four more editions amending the core specification by adding new features and extended functionality. Furthermore, MPEG has developed six other standards as additional “parts” of ISO/IEC 23009 enabling the effective use of the MPEG DASH standards with reference software and conformance testing tools, guidelines, and enhancements for additional deployment scenarios. MPEG DASH has dramatically changed the streaming industry by providing a standard that is widely adopted by various consortia such as 3GPP, ATSC, DVB, and HbbTV, and across different sectors. The success of this standard is due to its technical excellence, large participation of the industry in its development, addressing the market needs, and working with all sectors of industry all under ISO/IEC JTC 1/SC 29 MPEG Systems’ standard development practices and leadership.
“Thank you to the National Academy of Television Arts & Sciences (NATAS) for recognizing the outstanding contributions of media coding experts in ISO/IEC JTC 1. Their work has significantly expanded the benefits of MPEG standardization.” — Philip C. Wennblom, Chair of ISO/IEC JTC 1
“It’s a tremendous recognition for the MPEG Systems work in SC 29 to receive two Technology & Engineering Emmy® awards in 2022. The MPEG Open Font Format and DASH standards have been very widely adopted in industry and have become fundamental to the interoperability of today’s television and web-based multimedia services.” — Gary J. Sullivan, Chair of ISO/IEC JTC 1/SC 29
“MPEG Systems has been persistently working to respond to industry needs in collaboration with the key industry players and other standards development organizations. The two Technology & Engineering Emmy® awards this year for the MPEG Open Font Format and DASH standards show the great success and industry recognition of such efforts.” — Youngkwon Lim, Convenor of MPEG Systems
These are MPEG’s fifth and sixth Technology & Engineering Emmy® Awards (after MPEG-1 and MPEG-2 together with JPEG in 1996, Advanced Video Coding (AVC) in 2008, MPEG-2 Transport Stream in 2013, and ISO Base Media File Format in 2021) and MPEG’s seventh and eighth overall Emmy® Awards (including the Primetime Engineering Emmy® Awards for Advanced Video Coding (AVC) High Profile in 2008 and High Efficiency Video Coding (HEVC) in 2017).
MPEG had previously issued a Call for Proposals (CfP) for 6DoF immersive audio technology in April 2021 for convincing Virtual Reality (VR) and Augmented Reality (AR) experiences. At the 137th MPEG meeting, submissions to this CfP were reviewed and technology was selected. Fourteen submissions were evaluated in three subjective VR/AR listening tests, conducted at 12 test sites around the world, as a basis for the selection.
The technology selected in the CfP permits the user to have a VR or AR experience in which the user can freely navigate and interact with the virtual environment using 6 degrees of freedom (6DoF), thus enabling unconstrained spatial movement and user rotation. The rendered audio signals can be in multiple formats using audio objects, channels, and Higher Order Ambisonics (HOA). The VR/AR environment (scene) is encoded in the bitstream, and the MPEG-I Immersive Audio technology renders a binaural audio signal to the user’s headphones based on the transmitted and decoded scene. It implements a rich set of audio effects, such as directionality, localization, extent, occlusion, diffraction and Doppler shift of sound sources, and sophisticated modelling of the acoustic environment. A wide range of user interactions with the VR/AR environment is supported.
It is expected that the standard will progress to the first formal stage of its approval process with a Committee Draft (CD) in January 2023, Draft International Standard (DIS) in April 2023, and International Standard (IS) in October 2023.
At the 137th MPEG meeting, MPEG Requirements (WG 02) issued a Call for Proposals (CfP) for technologies enabling encoder and packager synchronization and related distributed media asset storage.
The encoder and packager synchronization framework will define preferred ways of generating content from distributed sources based on existing MPEG standards such as Common Media Application Format (CMAF), MPEG Dynamic Adaptive Streaming over HTTP (DASH), and potentially other standards and specifications. This will enable redundant live and Video on Demand (VoD) content generation setups that produce synchronized content and are robust to failures and loss of input in one or more of the components of the setup. Additionally, solutions for storing media assets at scale are solicited. The asset storage solution will enable live-to-VoD and VoD-to-live use cases and take advantage of the synchronized encoder/packager framework.
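One common way to let redundant encoders produce interchangeable output is to derive segment boundaries from a shared time origin rather than from each encoder's start time. The following Python sketch illustrates that idea only; it is not taken from the CfP or from any MPEG specification, and the function name `segment_for`, the shared `epoch`, and the fixed `segment_duration` are assumptions made for illustration.

```python
from fractions import Fraction


def segment_for(timestamp, epoch, segment_duration):
    """Map a media timestamp to (segment_number, segment_start).

    All arguments are in seconds. Fractions are used so that frame
    rates such as 30000/1001 do not accumulate rounding errors.
    The names and the epoch-locked scheme are illustrative assumptions.
    """
    offset = Fraction(timestamp) - Fraction(epoch)
    if offset < 0:
        raise ValueError("timestamp precedes the shared epoch")
    number = offset // Fraction(segment_duration)  # floor division yields an int
    start = Fraction(epoch) + number * Fraction(segment_duration)
    return number, start


# Two independent encoders that agree on epoch and duration compute the
# same segment for the same input timestamp, regardless of when each
# encoder was started or restarted.
encoder_a = segment_for(Fraction(31, 2), epoch=0, segment_duration=2)
encoder_b = segment_for(Fraction(31, 2), epoch=0, segment_duration=2)
```

Because the mapping depends only on the timestamp and the agreed parameters, a packager can switch between the outputs of either encoder at any segment boundary in this simplified model.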
This CfP requests proposals from companies and other organizations. Registration is required by 10 April 2022, and submissions are due by 17 April 2022. Evaluation of the submissions in response to the CfP will be performed at the 138th MPEG meeting in April 2022.
Companies and organizations that have developed encoder synchronization and media storage technologies are kindly invited to bring such information in response to this CfP by contacting Dr. Youngkwon Lim, MPEG Systems Convenor at firstname.lastname@example.org, or Dr. Igor Curcio, MPEG Requirements Convenor at email@example.com.
At the 137th MPEG meeting, MPEG Systems (WG 03) completed the MPEG-I Scene Description standard, a key technology in enabling immersive 3D user experiences, by promoting it to the final approval stage as a Final Draft International Standard (FDIS). The specification describes the composition of a 3D scene by referencing and positioning different media assets in the scene. The information provided in the scene description is then used by an application to render the 3D scene accordingly. To address the needs of immersive applications, the specification has developed MPEG extensions for Khronos glTF 2.0, a scene description solution widely used by the industry. glTF 2.0 provides a solid and efficient baseline for exchangeable and interoperable scene descriptions and can enable realistic rendering of immersive content such as by using Physically Based Rendering (PBR). However, glTF 2.0 has primarily been designed for static scenes and assets, which does not fully address the requirements and needs of dynamic and rich 3D scenes in immersive environments.
Based on this analysis, MPEG has developed extensions to Khronos’s glTF 2.0 to integrate real-time media, i.e., support for dynamic visual objects, audio, timed updates of the scene, and media access related functions. The standard also defines an architecture that decouples media rendering from media access by specifying the relevant APIs to access media referenced by the scene description. This work has continuously been coordinated with Khronos and 3GPP. Further details can be found at http://mpeg-sd.org/.
In the last few years, MPEG has developed a set of standardized Resource Description Framework (RDF) ontologies and XML schemas for the codification of intellectual property (IP) rights information related to music and media. The ISO/IEC 21000-19 Media Value Chain Ontology (MVCO) facilitates rights tracking for fair, timely, and transparent marketplace transactions by capturing user roles and their permissible actions on a particular IP entity. The ISO/IEC 21000-21 Media Contract Ontology (MCO) facilitates the conversion of narrative contracts to digital ones related to the management of IP rights, payments, and notifications. With respect to the latter, XML schemas have been developed as the ISO/IEC 21000-20 Contract Expression Language (CEL).
At the 137th MPEG meeting, MPEG Systems (WG 03) completed the development of ISO/IEC 21000-23 Smart Contracts for Media by promoting the standard to the Final Draft International Standard (FDIS) stage, which is the final approval milestone in the development of a standard. The standard specifies the means (e.g., application programming interfaces) for converting the above-mentioned RDF ontologies and XML schemas to smart contracts that can be executed on existing Distributed Ledger Technology (DLT) environments. This important standard will greatly assist the music and media industry and its stakeholders in achieving effective interoperability for the exchange of verified contractual data between different DLT environments. In this way, it will increase trust among the stakeholders for sharing high-value data (e.g., music rights) in the ecosystem. Another important feature of this standard is that it offers the possibility to bind, through persistent links, the clauses of a smart contract to the corresponding clauses of a human-readable contract. In this way, each party signing an ISO/IEC 21000-23 conforming smart contract will be able to know exactly what its clauses express.
At the 137th MPEG meeting, MPEG Systems (WG 03) reached the first formal milestone in the approval process for an amendment of its recently Emmy Award-winning standard ISO/IEC 14496-12 ISO Base Media File Format (ISOBMFF) comprising improved brand documentation and other improvements. The Committee Draft Amendment (CDAM) of ISO/IEC 14496-12:2021 Amendment 1 includes the enhancement of random access of the media stored in ISOBMFF by using an external elementary stream providing inter-frame prediction references for the decoding of an elementary stream that is stored as the main stream, supporting so-called extended dependent random access points (EDRAPs). Additionally, the amendment provides a way to list the sets of (one or multiple) media components representing one version of the media presentation that may be selected by a user for simultaneous decoding and presentation. The amendment also clarifies the use of major brands and compatible brands in the file type box to provide guidelines for defining a new brand identifier for ISOBMFF. This amendment is expected to reach its final approval milestone as a Final Draft Amendment (FDAM) in early 2023.
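To make the role of major and compatible brands concrete, the following Python sketch reads them from a file type box (`ftyp`), which consists of a 32-bit size, the four-character type, a major brand, a minor version, and a list of four-character compatible brands. It is a minimal sketch: the function name `parse_ftyp` is hypothetical, the sample bytes are constructed for illustration, and the 64-bit and "extends to end of file" size variants are deliberately not handled.

```python
import struct


def parse_ftyp(data: bytes) -> dict:
    """Parse an ISOBMFF file type box found at the start of `data`.

    Simplified sketch: only the plain 32-bit box size is supported.
    """
    if len(data) < 16:
        raise ValueError("too short to contain a file type box")
    size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp":
        raise ValueError("first box is not a file type box")
    if size < 16 or size > len(data):
        raise ValueError("unsupported or inconsistent box size")
    major_brand, minor_version = struct.unpack(">4sI", data[8:16])
    payload = data[16:size]
    # Compatible brands are consecutive four-character codes.
    usable = len(payload) - len(payload) % 4
    compatible = [payload[i:i + 4].decode("ascii") for i in range(0, usable, 4)]
    return {
        "major_brand": major_brand.decode("ascii"),
        "minor_version": minor_version,
        "compatible_brands": compatible,
    }


# Hand-built sample box: 8-byte header, 8 bytes of brand/version, two brands.
sample = struct.pack(">I4s4sI", 24, b"ftyp", b"isom", 512) + b"isom" + b"mp42"
info = parse_ftyp(sample)
```

A reader would typically check whether any brand it supports appears among the compatible brands before attempting to process the file.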
At the 137th MPEG meeting, MPEG Video Coding (WG 04) completed the development of the Conformance and Reference software standard for Low Complexity Enhancement Video Coding (LCEVC) (ISO/IEC 23094-3) with its promotion to the Final Draft International Standard (FDIS) stage for final approval and publication.
This standardization will assist implementers of LCEVC in checking the proper functioning of their implementations during the development of their products. The tests will also help the users and potential users of LCEVC products by providing a way to verify claims of the conformance of such products. Thus, such tests will assist the community in achieving interoperability of encoder and decoder products and will encourage the adoption and use of LCEVC.
LCEVC adds an enhancement data stream that can appreciably improve the resolution, bit depth and visual quality of reconstructed video, achieving effective compression efficiency at limited encoding and decoding complexity by building on top of existing and future video coding formats. It is designed to be compatible with existing video workflows (e.g., with CDNs, metadata management, and DRM/CA) and streaming/media formats (e.g., DASH and CMAF) to facilitate the rapid deployment of enhanced video services. LCEVC can be used to deliver higher video quality in limited bandwidth scenarios, especially when the available bit rate is relatively low for high-resolution video encoding and delivery or when decoding complexity is a challenge.
MPEG Video Coding issues Committee Draft of Conformance and Reference Software for MPEG Immersive Video
At the 137th MPEG meeting, MPEG Video Coding (WG 04) promoted its MPEG Immersive Video (MIV) Conformance and Reference software standard (ISO/IEC 23090-23) to the Committee Draft (CD) stage, the first formal milestone of its approval process. The document specifies how to conduct conformance tests and provides reference encoder and decoder software for ISO/IEC 23090-12 MPEG Immersive Video. This draft includes 18 verified and validated conformance bitstreams and encoding and decoding reference software based on version 12.0 of the Test model for MPEG Immersive Video (TMIV). The test model, objective metrics, and some other tools are publicly available at https://gitlab.com/mpeg-i-visual.
MIV was developed to support compression of immersive video content, in which a real or virtual 3D scene is captured by multiple real or virtual cameras. The standard enables the storage and distribution of immersive video content over existing and future networks, for playback with 6 degrees of freedom (6DoF) of view position and orientation. MIV is a flexible standard for multiview video with depth (MVD) that leverages the strong hardware support for commonly used video formats to code volumetric video. Views may use equirectangular, perspective, or orthographic projection. By pruning and packing views, MIV can achieve bit rates around 15 to 30 Mb/s using High Efficiency Video Coding (HEVC) and a pixel rate equivalent to HEVC Level 5.2. Besides the MIV Main profile for MVD, there is the MIV Geometry Absent profile, which is suitable for use with cloud-based and decoder-side depth estimation, and the MIV Extended profile, which enables coding of multi-plane images (MPI). The MIV standard is designed as a set of extensions and profile restrictions on the Visual Volumetric Video-based Coding and Video-based Point Cloud Coding (ISO/IEC 23090-5) standard, and its conformance bitstreams cover all of the specified profiles.
In addition to conformance testing, work on the verification testing of MIV is ongoing, and the carriage of MIV is specified through the Carriage of V3C Data standard (ISO/IEC 23090-10). MPEG Requirements (WG 02) will publish final use cases and requirements for the MIV 2nd edition, which will be an evolution of the MIV standard, and MPEG Liaison and Communication (AG 03) will publish a white paper on MIV (as further discussed below).
JVET produces Second Editions of VVC & VSEI and finalizes VVC Reference Software
At the 137th MPEG meeting, MPEG Joint Video Coding Team(s) with ITU-T SG 16 (WG 05; JVET) completed the development of the second editions of Versatile Video Coding (VVC, ISO/IEC 23090-3 | ITU-T H.266) and Versatile supplemental enhancement information messages for coded video bitstreams (VSEI, ISO/IEC 23002-7 | ITU-T H.274) by promoting them to Final Draft International Standard (FDIS) status for final approval and publication. The new VVC version defines profiles and levels supporting larger bit depths (up to 16 bits), including some low-level coding tool modifications to obtain improved compression efficiency with high bit-depth video at high bit rates. VSEI version 2 adds SEI messages giving additional support for scalability, multi-view, display adaptation, improved stream access, and other use cases. Furthermore, a Committee Draft Amendment (CDAM) for the next amendment of VVC was issued to begin the formal approval process. This amendment enables linking VVC with the Green Metadata (ISO/IEC 23001-11) and Video Decoding Interface (ISO/IEC 23090-13) standards and adds a new unconstrained level for exceptionally high capability applications such as certain uses in professional, scientific, and medical application scenarios. Finally, the reference software package for VVC (ISO/IEC 23090-16) has also been completed with its achievement of FDIS status. Reference software is extremely helpful for developers of VVC devices, helping them in testing their implementations for conformance to the video coding specification.
At the 137th MPEG meeting, MPEG Joint Video Coding Team(s) with ITU-T SG 16 (WG 05; JVET) finalized the 10th edition of Advanced Video Coding (ISO/IEC 14496-10 | ITU-T H.264) by issuing it as a Final Draft International Standard (FDIS) for final approval and publication. Beyond various text improvements, this specifies a new SEI message for describing the shutter interval applied during video capture. This can be variable in video cameras, and conveying this information can be valuable for analysis and post-processing of the decoded video.
At the 137th MPEG meeting, MPEG Joint Video Coding Team(s) with ITU-T SG 16 (WG 05; JVET) started the approval process of a new second amendment of High Efficiency Video Coding (HEVC, ISO/IEC 23008-2 | ITU-T H.265) by issuing a Committee Draft Amendment (CDAM) that defines new levels and tiers providing support for very high bit rates and video resolutions up to 16K, as well as defining an unconstrained level. This will enable the usage of HEVC in new application domains, including professional, scientific, and medical video sectors.
MPEG Genomic Coding evaluated Responses on New Advanced Genomics Features and Technologies
The extensive usage of high-throughput DNA sequencing technologies enables new approaches to healthcare known as “precision medicine” as well as many other emerging applications such as monitoring the evolution of outbreaks and pathogen surveillance in agriculture and the food industry. DNA sequencing technologies produce extremely large amounts of heterogeneous data including raw sequence reads, analysis results, annotations, and associated metadata which are stored in different repositories worldwide and the use of the data needs to be enabled by standardized and interoperable formats. Structuring and high-performance compression of such genomic data is required to reduce the storage size, increase the transmission speed, and improve the browsing and searching performance of these large data sets as required by a wide variety of applications and use cases. The current MPEG-G standard series (ISO/IEC 23092) addresses the representation, indexing, compression, and transport of genome sequencing data with support for annotation data and searching capability. The ISO/IEC 23092 (MPEG-G) standard series provides a file and transport format, compression technology, metadata specifications, protection support, and standard APIs for the access of genomic data and annotation data in a native compressed format.
At the 134th MPEG meeting, MPEG Genomic Coding (WG 08) had previously issued a Call for Proposals (CfP) to collect submissions of new technologies improving the current compression, transport and indexing capabilities of the ISO/IEC 23092 standard series.
Now, at the 137th MPEG meeting, MPEG Genomic Coding (WG 08) evaluated submitted responses to the CfP, addressing the representation and usage of graph genome references and supporting interfaces with existing standards for the interchange of clinical data. The initial evaluation results show that the incorporation of native representations of graph genome references into the MPEG-G standard is desirable and will provide new, advanced, and efficient representation capabilities for extended support of use cases of genomic sequencing data. Concerning the support of interfaces with existing standards for the interchange of clinical data (HL7 and FHIR), the response received shows that such an extension is desirable for the efficient integration of MPEG-G data into clinical workflows and shows a standardization path to achieve such integration.
At the 137th MPEG meeting, MPEG Liaison and Communication (AG 03) approved the following three MPEG white papers.
Artificial neural networks have been adopted for a broad range of tasks in almost every technical field, such as medical applications, transportation, network optimization, big data analysis, surveillance, speech, audio, image and video classification, image and video compression, and many more. Their recent success is based on the feasibility of processing much larger and more complex neural networks (deep neural networks, DNNs) than in the past, and the availability of large-scale training data sets. An additional factor for the exponential growth in the use of such neural networks is the appearance of new use cases, such as federated learning with continuous communication between many devices. Accordingly, this requires the highest capability for compression of neural networks to minimize the overall communication traffic and to reduce the size of the networks when used for inference. Thus, a standard for neural network coding (NNC) has been defined in ISO/IEC 15938-17 “Compression of Neural Networks for Multimedia Description and Analysis”.
This white paper provides an overview of ISO/IEC 23094-2 Low Complexity Enhancement Video Coding (LCEVC). The coding format is designed for use in conjunction with existing video coding formats, leveraging encoder-driven upsampling and specific tools for the encoding of “residuals”, i.e., the difference between the original video and a predicted rendition. LCEVC can improve the compression efficiency and reduce the overall computational complexity for coding a given resolution and bit depth by using a small number of specialized enhancement tools. This white paper provides an outline of the architecture, coding tools, and an overview of the compression efficiency of LCEVC.
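The notion of a "residual" as the difference between the original and a predicted rendition can be illustrated with a deliberately simplified, one-dimensional Python sketch. It is not the LCEVC algorithm: the 2:1 subsampling, the nearest-neighbour upsampling, and all function names are assumptions chosen for clarity, and the further processing that a real codec applies to residuals is omitted, so this toy version is lossless.

```python
def downsample(signal):
    """Keep every second sample (stand-in for a lower-resolution base layer)."""
    return signal[::2]


def upsample(base):
    """Nearest-neighbour upsampling: repeat each base sample twice."""
    return [sample for sample in base for _ in range(2)]


def encode(original):
    """Return (base, residuals), where residuals = original - prediction."""
    base = downsample(original)
    predicted = upsample(base)
    residuals = [o - p for o, p in zip(original, predicted)]
    return base, residuals


def decode(base, residuals):
    """Rebuild the full-resolution signal from base layer plus residuals."""
    predicted = upsample(base)
    return [p + r for p, r in zip(predicted, residuals)]


original = [10, 12, 20, 22, 30, 28, 16, 14]
base, residuals = encode(original)
reconstructed = decode(base, residuals)
```

In this example the residuals are small values clustered around zero, which is the property that makes an enhancement stream of this kind cheap to code relative to the full-resolution signal.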
The MPEG Immersive Video (MIV) standard produced by MPEG was completed with its Final Draft International Standard (FDIS) issued in October 2021 for final approval and publication. The goal of the MIV standard is to provide efficient coding of immersive, six degree of freedom (6DoF) volumetric visual scenes. An immersive 6DoF representation, unlike a three degree of freedom (3DoF) representation, provides a larger viewing space, where viewers have both translational and rotational freedom of movement at their disposal. 6DoF videos also enable the perception of motion parallax, where the relative positions of scene geometry change with the pose of the viewer. The absence of motion parallax in 3DoF videos is inconsistent with the workings of the human visual system and often leads to visual discomfort, which the 6DoF capabilities of MIV avoid. This white paper provides a brief overview of the important new MIV standard.
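The difference between rotation-only (3DoF) and full 6DoF viewing can be shown with a small geometric sketch in Python. It uses an idealized two-dimensional pinhole camera and is unrelated to how MIV itself codes or renders content; the function `project` and its parameters are assumptions made for illustration.

```python
import math


def project(point_x, point_z, cam_x=0.0, yaw=0.0, focal=1.0):
    """Image coordinate of a point for a pinhole camera at (cam_x, 0).

    The camera looks along +z when yaw is 0; yaw is in radians.
    """
    angle = math.atan2(point_x - cam_x, point_z) - yaw
    return focal * math.tan(angle)


near = (0.0, 1.0)    # point 1 unit in front of the viewer
far = (0.0, 10.0)    # point 10 units away, in the same direction

# Rotation only (3DoF): both points move identically in the image,
# so their relative positions do not change.
near_rotated = project(*near, yaw=0.05)
far_rotated = project(*far, yaw=0.05)

# Translation (available with 6DoF): the near point shifts much more
# than the far point, which is motion parallax.
near_shifted = project(*near, cam_x=0.1)
far_shifted = project(*far, cam_x=0.1)
```

With a sideways move of 0.1 units, the near point shifts by about 0.1 in the image while the far point shifts by about 0.01, whereas under pure rotation the two points stay aligned.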