154th meeting of MPEG

At its 154th meeting, MPEG finished the development of four standardization projects. This included promoting one standard to Final Draft International Standard (FDIS), one standard to Amendment (AMD), and three standards to Final Draft Amendment (FDAM), driving innovation in JVET, Genomic information, Systems and Audio coding, respectively:

  • AVC 12th edition (FDIS)
  • Genomic information representation Reference Software (AMD)
  • Encryption of ISOBMFF enhanced with support of AES-256 (FDAM)
  • MPEG-I Scene description 2nd edition Amendment 1: Support of MPEG-I immersive audio, scene understanding and other extensions (FDAM)
  • AAC Immersive Interchange Format (IIF) and Media Authenticity (FDAM)

At the latest MPEG meeting, Working Group 6 hosted a series of successful demonstrations showcasing the capabilities of MPEG-I immersive audio, including fully interactive six degrees of freedom (6DoF) experiences on contemporary mobile phones and head-mounted displays (HMDs).

Four leading companies presented five innovative demonstrations, underscoring the readiness of MPEG-I immersive audio technology for real-world deployment.

The demonstrations included a virtual museum, a virtual restaurant, and an interactive virtual space with realistic acoustics that attendees could explore. A volumetric video dancing-and-singing experience further highlighted how immersive audio can be integrated with advanced visual rendering for a cohesive audio-visual experience.

The demos emphasized the added value of fully interactive 6DoF immersive audio across both mobile phones and a range of HMDs, demonstrating that these experiences can run on existing consumer-grade devices and real-world networks.

In a 6DoF communication experience, attendees experienced how IVAS-coded audio, paired with the MPEG-I immersive audio renderer, can deliver rich, lifelike sound environments that support low-latency conversation. In the virtual museum and restaurant experiences, participants could hear their own voice within virtual rooms and explore customizable accessibility features designed to benefit people with hearing impairments. The demonstrations also showed that multiple users can share the same VR scene simultaneously, communicating by speaking while interacting with shared virtual audio objects.

These demonstrations underscore MPEG’s commitment to pushing the boundaries of immersive media and represent a major step toward bringing next-generation audio experiences to consumers and professionals.

Encryption of ISOBMFF enhanced with support of AES-256 has reached the final stage of standard development, FDAM

Regardless of the content protection system applied, content packaged in the ISO base media file format (ISOBMFF) is encrypted with the technologies specified in ISO/IEC 23001-7, Common encryption in ISO base media file format files. The AES-128 (Advanced Encryption Standard) symmetric block cipher has been the encryption technology most widely adopted by the industry, and a more secure option, AES-256, is now available as an additional choice.

At the 154th MPEG meeting, MPEG Systems (WG 3) advanced ISO/IEC 23001-7:2023 Amendment 1, Support for AES-256, to the final stage of standard development, Final Draft Amendment (FDAM). The specification allows the use of AES-256 with both protection mechanisms widely used by the industry, AES counter mode (CTR) and Cipher Block Chaining (CBC). To indicate the use of AES-256, a new version of the track encryption box is introduced. A number of new brands that indicate the use of AES-256 in combination with the tools previously defined for AES-128 are under development and will be standardized in an additional amendment soon.
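To make the difference between the two key sizes concrete, here is a minimal sketch of the bookkeeping a CTR-mode packager performs. It only checks key lengths and derives per-block counter values; it does not perform AES itself. The function names are hypothetical, and the counter layout (an 8-byte IV zero-padded to 16 bytes and incremented as a single big-endian 128-bit value) is an assumption for illustration, not the normative wording of ISO/IEC 23001-7.

```python
AES_BLOCK_BYTES = 16  # the AES block size is 128 bits for every key length
KEY_BYTES = {"AES-128": 16, "AES-256": 32}


def check_key(cipher: str, key: bytes) -> None:
    """Raise ValueError if the key length does not match the cipher."""
    expected = KEY_BYTES[cipher]
    if len(key) != expected:
        raise ValueError(f"{cipher} needs a {expected}-byte key, got {len(key)}")


def ctr_counter_block(iv: bytes, block_index: int) -> bytes:
    """Return the 16-byte counter block for the given block index.

    Assumption: an 8-byte IV is zero-padded on the right to 16 bytes and the
    result is treated as one big-endian 128-bit counter.
    """
    if len(iv) == 8:
        iv = iv + bytes(8)
    if len(iv) != AES_BLOCK_BYTES:
        raise ValueError("IV must be 8 or 16 bytes")
    counter = (int.from_bytes(iv, "big") + block_index) % (1 << 128)
    return counter.to_bytes(AES_BLOCK_BYTES, "big")


check_key("AES-256", bytes(32))           # a 32-byte key is accepted
second_block = ctr_counter_block(bytes(8), 1)
```

Moving from AES-128 to AES-256 changes only the key length in this sketch; the block size, and therefore the counter arithmetic, stays the same.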

Further improvements to the MPEG-I Scene description standard

At the 154th MPEG meeting, MPEG Systems (WG 3) finalized the development of the ISO/IEC 23090-14 2nd edition Amendment 1 Support of MPEG-I immersive audio, scene understanding and other extensions by advancing the standard to the final stage of standard development, Final Draft Amendment (FDAM).

The major enhancements provided by the amendment include support for specifying a spatial description around a scene for augmented reality applications, support for describing physical objects attached to a scene and interaction with them, and support for using punctual light sources in a scene. The standard also adds integration of MPEG-I immersive audio, ISO/IEC 23090-4. For this integration, the standard assumes the presence of an MPEG-I immersive audio renderer that receives the MPEG-I immersive audio bitstream, a set of MPEG-H audio streams, and scene metadata such as the listener’s pose. Integration of additional volumetric media codecs is under development and is expected to be finalized soon.

Compressing Feature data for AI-based split inference applications – MPEG publishes Committee Draft of Feature coding for machines (FCM)

Development of ISO/IEC 23888-4 FCM has progressed since the evaluation of the Feature Coding for Video Coding for Machines (FCVCM) CfP in WG 2 MPEG Requirements in October 2023 to the point where a CD of ISO/IEC 23888-4 was issued at the 23rd ISO/IEC JTC 1/SC 29/WG 4 MPEG Video coding meeting.

FCM standardizes technology for compression of intermediate feature data encountered within computer vision networks. Compression of intermediate feature data allows a computer vision network to be executed in a distributed manner. In particular, FCM enables ‘split inferencing’ applications where an edge device may run the backbone of a computer vision network while the central cloud server only needs to run the head of the network instead of running the entire network.
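The split-inference idea can be illustrated with a toy pipeline. This is only a sketch under stated assumptions: the `backbone` and `head` functions are hypothetical stand-ins for real network stages, and 8-bit quantisation followed by zlib is a placeholder for the feature codec, not the coding technology specified by FCM.

```python
import struct
import zlib


def backbone(pixels):
    """Toy 'backbone': average neighbouring pixel pairs into features."""
    return [(pixels[i] + pixels[i + 1]) / 2.0 for i in range(0, len(pixels) - 1, 2)]


def head(features):
    """Toy 'head': classify by the mean feature value."""
    return "bright" if sum(features) / len(features) > 0.5 else "dark"


def encode_features(features, levels=255):
    """Quantise features in [0, 1] to 8 bits and compress the bytes."""
    quantised = bytes(round(min(max(f, 0.0), 1.0) * levels) for f in features)
    return struct.pack(">I", len(quantised)) + zlib.compress(quantised)


def decode_features(payload, levels=255):
    """Invert encode_features up to the quantisation error."""
    (count,) = struct.unpack(">I", payload[:4])
    quantised = zlib.decompress(payload[4:])
    if len(quantised) != count:
        raise ValueError("feature count mismatch")
    return [q / levels for q in quantised]


# Edge device: run the backbone and transmit compressed features.
pixels = [0.9, 0.8, 0.7, 0.95, 0.85, 0.9, 0.6, 0.75]
payload = encode_features(backbone(pixels))

# Server: decode the features and run only the head.
label = head(decode_features(payload))
```

Only `payload` crosses the network in this sketch, so the server never sees the input pixels and never runs the backbone.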

FCM reuses existing coding standards for its core coding technology, with options both for block-based coding solutions that are available in deployed chipsets and for software-based coding of reduced-dimension feature data derived from the feature data obtained at the split point between network backbone and head. This flexibility allows FCM to be deployed using existing silicon, either by reusing hardware-implemented codecs or as a software-only implementation.

Compressing Tensor Data in AI-based Media Coding – MPEG publishes Committee Draft of Amendment of Neural Network Coding (NNC)

After publishing two editions of the standard for coding of neural networks for multimedia content description and analysis (NNC, ISO/IEC 15938-17, edition 2 published in 2024), which address coding of neural network parameters and their incremental updates, MPEG is working on an amendment of this standard.

NNC has proven to be an effective coding framework beyond the originally considered use cases, covering a variety of neural network topologies and layer types, including many computer vision and large language models. The amendment under development provides increased flexibility beyond coding of neural network parameters, as it extends NNC to other types of tensorial data in media coding, such as feature maps in split inference and federated learning, or parameters of 3D and 4D scene representations such as Gaussian splats.

In particular, the amendment adds support for lossless compression (including support for mixed lossy and lossless compression in the same bitstream), source data type indication for models with limited bit depth, more fine-grained configuration of the quantisation applied to elements of the data, and configurable extensions for initialising, ordering and packing data. In addition, it adds support for signing and verification of data units, which allows verifying the authenticity of a set of neural network parameters, as well as parts of a neural network or its updates.
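As a rough illustration of two of these features, the sketch below stores one tensor losslessly and another with uniform quantisation, then attaches an authentication tag to each data unit. All names are hypothetical. The float32 packing, the 8-bit quantiser, and the use of HMAC-SHA256 (a symmetric MAC rather than a digital signature) are assumptions chosen for a self-contained example; they do not reflect the NNC bitstream syntax or its actual signing scheme.

```python
import hashlib
import hmac
import struct

TAG_BYTES = 32  # length of an HMAC-SHA256 tag


def encode_lossless(values):
    """Store values as big-endian float32, bit-exact for float32 inputs."""
    return struct.pack(f">{len(values)}f", *values)


def encode_lossy(values, step):
    """Uniformly quantise to signed 8-bit integers with the given step size."""
    q = [max(-128, min(127, round(v / step))) for v in values]
    return struct.pack(f">{len(q)}b", *q)


def decode_lossy(payload, step):
    """Reconstruct approximate values from the quantised payload."""
    return [q * step for q in struct.unpack(f">{len(payload)}b", payload)]


def sign_unit(payload, key):
    """Append an HMAC-SHA256 tag (stand-in for a real signature scheme)."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


def verify_unit(unit, key):
    """Return True if the tag at the end of the unit matches its payload."""
    payload, tag = unit[:-TAG_BYTES], unit[-TAG_BYTES:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


key = b"demo-key"                     # hypothetical shared key
bias = [0.5, -0.25]                   # kept lossless
weights = [0.11, -0.42, 0.30]         # quantised with step 0.01

unit_a = sign_unit(encode_lossless(bias), key)
unit_b = sign_unit(encode_lossy(weights, step=0.01), key)
tampered = bytes([unit_b[0] ^ 1]) + unit_b[1:]
```

Both units verify with the right key, while flipping a single bit in `tampered` makes `verify_unit` return `False`.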

Next edition of AVC finalized

At its 40th meeting, JVET (SC 29 WG 5) finalized work on the next edition/version of the Advanced Video Coding standard (AVC, ISO/IEC 14496-10 12th ed. | ITU-T Rec. H.264 V16) by promoting it to the ISO/IEC FDIS stage and submitting the twin text for ITU-T consent. The main purpose of the new edition is to add support for SEI messages recently defined in the 4th edition of the VSEI standard (ISO/IEC 23002-7 | ITU-T Rec. H.274). More specifically, the new edition of AVC adds the interface to support eleven new SEI messages, namely SEI processing order, processing order nesting, encoder optimization information, source picture timing information, modality information, digitally signed content initialization, digitally signed content selection, digitally signed content verification, generative face video, generative face video enhancement, and AI usage restrictions request. It also includes a new mechanism that allows future extensions of the newly introduced SEI messages, and it adds support for interfacing extended versions of the film grain characteristics and neural network post-filter activation SEI messages by referencing the VSEI standard. The changes in this edition also include corrections to various minor defects in the previous edition.

MPEG Audio Finalizes AAC Immersive Interchange Format (IIF) and Media Authenticity Work

At its 154th meeting, MPEG Audio (WG 6) advanced the AAC Immersive Interchange Format (IIF) and Media Authenticity amendments to Final Draft Amendment (FDAM) status, marking the conclusion of the technical work on these specifications.

A central element of this milestone is the ISO/IEC 14496-3 (MPEG-4 Audio) amendment, which finalizes the AAC Immersive Interchange Format (IIF). AAC IIF enables reliable immersive audio transmission via transport links at low latency, supports the carriage of different types of metadata associated with the audio data, and allows flexible handling of various use cases in consumer and professional systems. The same amendment also enables media authenticity in connection with this new format and the AAC family of codecs.

The broader work also covers amendments to ISO/IEC 23003-3 (MPEG-D Unified Speech and Audio Coding) and ISO/IEC 23008-3 (MPEG-H 3D Audio), extending MPEG Audio standards with functionality that enables verification of the authenticity of audio streams in a very bitrate-efficient manner and helps protect content against tampering and unauthorized alteration.

A key strength of the MPEG approach is interoperability across the broader digital media ecosystem. The media authenticity technology in MPEG Audio is designed to support robust linkage of authenticity-related metadata to other content layers, including video and systems, and to reference external systems for broader provenance and verification workflows across distributed platforms.

The completion of MPEG Audio’s technical work on these amendments reflects MPEG’s continued commitment to advancing trust, reliability, interoperability, and efficient immersive audio workflows in next-generation digital media.

MPEG advances Geometry-based Point Cloud Coding for Solid Objects to Committee Draft

MPEG 3D Graphics and Haptics Coding has promoted Geometry-based Point Cloud Coding for Solid Objects, ISO/IEC 23090-41, to the Committee Draft stage. This new specification extends the MPEG point cloud coding portfolio by addressing the efficient compression of dense point clouds representing solid 3D objects within a purely geometry-based coding framework.

In this context, “solid” refers to 3D objects represented by dense point clouds, typically describing object surfaces with sufficient spatial detail to support accurate reconstruction, visualization, manipulation and downstream processing. MPEG already provides efficient solutions for dense point clouds through V-PCC, which relies on the video coding ecosystem by projecting 3D information onto video representations. By contrast, the G-PCC family follows a geometry-based approach, directly coding the 3D structure and associated attributes of point clouds.
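To illustrate what directly coding the 3D structure can mean, the sketch below computes the occupancy code of a single octree node, the kind of structure G-PCC is known for in its geometry coding. It is not taken from ISO/IEC 23090-41: the function name and the child-index bit ordering are assumptions, and a real codec would recurse into occupied children and entropy-code the result.

```python
def occupancy_code(points, origin, size):
    """Return the 8-bit occupancy code of one cubic octree node.

    points: iterable of integer (x, y, z) coordinates inside the node
    origin: (x, y, z) of the node's minimum corner
    size:   edge length of the node (a power of two, at least 2)
    """
    half = size // 2
    code = 0
    for x, y, z in points:
        # Assumed ordering: x selects bit 2, y bit 1, z bit 0 of the child index.
        child = (
            ((x - origin[0]) >= half) << 2
            | ((y - origin[1]) >= half) << 1
            | ((z - origin[2]) >= half)
        )
        code |= 1 << child
    return code


points = [(0, 0, 0), (1, 0, 0), (7, 7, 7), (4, 0, 0)]
root_code = occupancy_code(points, origin=(0, 0, 0), size=8)
```

Here children 0, 4 and 7 are occupied, so `root_code` is `0b10010001`. Dense surfaces tend to produce strongly correlated occupancy patterns between neighbouring nodes, which is the kind of redundancy a geometry-based coder can exploit.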

ISO/IEC 23090-41 complements the original G-PCC standard, ISO/IEC 23090-9, which was primarily designed for sparse point clouds. It completes the G-PCC family by providing tools better adapted to dense solid-object representations, as increasingly produced by high-resolution 3D scanning, photogrammetry and industrial digitization workflows.

This work is expected to support applications such as digital twins, industrial inspection, cultural heritage digitization, immersive media, e-commerce, simulation and extended reality services. The progression to Committee Draft confirms the technical maturity of the specification and enables National Bodies to review the text and provide comments as the project advances toward the next stages of standardization.

MPEG takes a second major step toward standardized Gaussian Splat Coding

MPEG 3D Graphics and Haptics Coding has promoted the first amendment to Geometry-based Point Cloud Compression, ISO/IEC 23090-9, to the Committee Draft Amendment stage. This marks a second major step toward standardized Gaussian Splat Coding, following the recent promotion of the first amendment to V-PCC, ISO/IEC 23090-5, which addresses Gaussian splat coding through a video-based compression framework.

While the V-PCC amendment builds on the video coding ecosystem by projecting 3D information onto video representations, the G-PCC amendment follows a complementary geometry-based approach. It brings Gaussian Splat Coding into the G-PCC framework by directly coding the 3D structure and associated rendering attributes of Gaussian splat representations.

Gaussian splatting is gaining strong momentum as a powerful representation for immersive and real-time 3D experiences. By representing scenes through spatially distributed primitives carrying geometry and rendering-related attributes, Gaussian splats can support high visual quality, efficient rendering and flexible 3D scene reconstruction. Their growing use in virtual production, extended reality, digital twins, cultural heritage, e-commerce and immersive communication creates a clear need for standardized compression technologies that enable efficient storage, transmission and interoperability.
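A back-of-the-envelope calculation shows why compression matters here. The sketch assumes the attribute layout commonly used in 3D Gaussian splatting implementations (position, scale, rotation quaternion, opacity, and 48 spherical-harmonics colour coefficients, all stored as 32-bit floats). This layout is an assumption for illustration and is not taken from the MPEG amendment text; the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GaussianSplat:
    position: List[float]   # 3 values: x, y, z
    scale: List[float]      # 3 values: per-axis extent
    rotation: List[float]   # 4 values: unit quaternion
    opacity: float          # 1 value
    # Assumed: degree-3 spherical harmonics, 16 coefficients x 3 colour channels.
    sh_coeffs: List[float] = field(default_factory=lambda: [0.0] * 48)

    def float_count(self) -> int:
        """Number of scalar values stored for this splat."""
        return (len(self.position) + len(self.scale) + len(self.rotation)
                + 1 + len(self.sh_coeffs))


def raw_size_bytes(num_splats: int, floats_per_splat: int, bytes_per_float: int = 4) -> int:
    """Uncompressed size of a scene stored as 32-bit floats."""
    return num_splats * floats_per_splat * bytes_per_float


splat = GaussianSplat([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0], 0.8)
scene_bytes = raw_size_bytes(1_000_000, splat.float_count())
```

Under these assumptions each splat carries 59 floats, so a scene with one million splats occupies 236,000,000 bytes (about 236 MB) before any compression.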

With this amendment, MPEG extends the G-PCC family beyond conventional point cloud coding and adapts its geometry-based approach to the specific characteristics of Gaussian splat data. The progression to Committee Draft Amendment confirms the technical maturity of this work and opens the way for broader review by National Bodies. It represents an important milestone toward a common and interoperable compression framework for Gaussian splat content, helping future immersive media systems exchange, store and deliver advanced 3D scenes efficiently.

MPEG announces completion of ISO/IEC 23092-4:2020 AMD 1 reference software for part 6 (Genomic Annotations)

At the 154th MPEG meeting, MPEG Genomic Coding (WG 8) announced the completion of ISO/IEC 23092-4:2020 AMD 1 (Reference software for part 6). This update to the reference software provides support for the compressed representations of genomic annotations (ISO/IEC 23092-6) linked to the compressed representation of raw sequencing data (ISO/IEC 23092-2) and metadata (ISO/IEC 23092-3).

ISO/IEC 23092-6 complements the existing MPEG genomics standards by covering not only primary (raw sequencing data) and secondary (aligned sequencing data) genomic data, but also tertiary genomic data, including variant calls, gene expressions, mapping statistics, contact matrices (e.g., Hi-C), genomic track information, and functional annotations. These are collectively referred to as annotation data in the ISO/IEC 23092 series of standards and are supported with efficient compression, indexing, and search capabilities. The reference software is intended to support implementors by providing basic, working, executable software that implements the standard. It is available for download from MPEG.