At its 153rd meeting, MPEG finished the development of four standardization projects. This included promoting two standards to International Standard (IS), one to Final Draft International Standard (FDIS), and one to Amendment (AMD), driving innovation in Systems, JVET, and 3D Graphics coding:
- Carriage of green metadata (IS)
- Conformance and reference software for scene description (AMD)
- The third edition of VVC conformance (IS)
- Conformance and Reference Software for Low Complexity, Low Latency LiDAR Coding (FDIS)
At the 153rd MPEG meeting, MPEG Audio Coding (WG 6) received two proposals in response to its Call for Proposals for Audio Coding for Machines (ACoM), issued at the 151st MPEG meeting. MPEG-ACoM aims to define a bitstream and data format for compressing audio, multi-dimensional streams, or features extracted from such signals that is efficient in terms of bitrate/size and can be used for machine analysis tasks or hybrid machine and human consumption. In addition, such a data format can be used to transport recorded audio data from sensor networks to machine listening units.
The call focused on lossless audio coding, enabling the use of the same compression scheme across many different applications. The two proposals were evaluated on 1822 audio samples drawn from the use cases “predictive maintenance”, “in-line testing”, “traffic monitoring and control”, “flexible medical data”, “user generated content”, “live stream content analysis”, and “artistic creation”. One of the proposals achieved a small but statistically significant improvement in compression rate for most of the use cases. Both proposals were based on the draft joint biomedical and general waveform coding (BWC) standard ISO/IEC 23003-8 and ITU-T T.261, a joint project of WG 6 (MPEG Audio Coding) and ITU-T Q6/21 (VCEG). The ACoM call for proposals did not consider algorithmic complexity; the algorithms in BWC, on the other hand, have already been optimized for lower complexity. To avoid duplication of standards and to optimize computational complexity, WG 6 and VCEG experts have started to investigate how the ACoM proposals can be merged into the ongoing BWC standard.
First step in restructuring the green metadata standards for future growth completed
About a decade ago, MPEG published the first edition of the standard for metadata enabling energy-efficient video processing, ISO/IEC 23001-11:2015 Energy-efficient media consumption (known as Green metadata). The standard includes both metadata that is agnostic to media codecs and metadata specific to certain media codecs, as well as methods for carrying them. Over the past decade the specification has grown significantly; for better organization and future scalability, the technologies for carrying green metadata have been split into a separate standard, ISO/IEC 23001-19 Carriage of green metadata. At the 153rd MPEG meeting, MPEG Systems (WG 3) advanced the specification to the publication stage of standard development as an International Standard (IS).
The new specification defines the storage of Green metadata in ISO Base Media File Format (ISOBMFF) files (ISO/IEC 14496-12) and its delivery with MPEG-DASH (ISO/IEC 23009-1). It also provides a series of detailed examples to help readers better understand the technologies. As interest and importance in energy-efficient media consumption continue to rise, MPEG Systems is actively working on restructuring related standards to accommodate further developments. Several other standards are expected to be restructured and enhanced in the near future.
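The ISOBMFF container that carries this metadata is built from a uniform sequence of "boxes", each starting with a 32-bit size and a 4-byte type code (ISO/IEC 14496-12). As an illustrative sketch of that generic container layer only (not the specific boxes defined by ISO/IEC 23001-19), a minimal top-level box walker might look like this:

```python
import struct

def iter_boxes(data, offset=0, end=None):
    """Yield (box_type, payload) for each top-level ISOBMFF box in `data`.

    Each box begins with a 32-bit big-endian size followed by a 4-byte
    type code (ISO/IEC 14496-12). A size of 1 means a 64-bit "largesize"
    follows the type field; a size of 0 means the box extends to the end
    of the enclosing container.
    """
    end = len(data) if end is None else end
    while offset < end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # 64-bit largesize follows the type field
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box runs to the end of the container
            size = end - offset
        yield box_type.decode("ascii"), data[offset + header : offset + size]
        offset += size
```

A file carrying green metadata would expose it through boxes defined in ISO/IEC 23001-19; the walker above handles only the generic framing those boxes share with every other ISOBMFF structure.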
Additional reference software for MPEG-I scene description published
Open-source implementations of a standard let developers easily test and integrate the technology into their applications and encourage its adoption, particularly in the 3D rendering area. For this reason, MPEG has made significant efforts to deliver reference software for the MPEG-I scene description standard. At the 153rd MPEG meeting, MPEG Systems (WG 3) advanced ISO/IEC 23090-24 AMD 1 Conformance and reference software for scene description on haptics, augmented reality, avatars, and interactivity to the publication stage of standard development as an Amendment (AMD).
This standard provides implementations of technologies developed in ISO/IEC 23090-14 Scene description over the past several years, including the anchoring, interactivity, sampler, lighting, V3C, and avatar extensions, among others. Additionally, MPEG has already initiated the development of a further amendment to ISO/IEC 23090-24 covering implementations of technologies recently standardized in ISO/IEC 23090-14.
Video coding for machines reaches a new milestone
At the 153rd MPEG meeting, MPEG Video Coding (WG 4) promoted ISO/IEC 23888-2 Video coding for machines to Draft International Standard (DIS), reflecting continued progress in efficient coding of video for machine consumption.
This emerging standard defines video compression technologies optimized for consumption of video content by machines. It specifies a pipeline of processing and coding stages that supports efficient compression of video without degrading machine task performance. After decoding a coded video bitstream, the pipeline applies transformation and synthesis operations spatially, temporally, across colour channels, and in amplitude before the video is passed to a machine task network.
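The decoder-side stages described above can be thought of as a chain of restoration operations applied to each decoded frame. The following is a simplified, hypothetical sketch of such a stage composition (the stage names and the toy single-channel frame type are illustrative assumptions, not the normative VCM pipeline):

```python
# Toy single-channel frame: a list of rows of pixel values.
# Stages are plain functions from frame to frame, chained in order.

def compose(stages):
    """Chain post-decoding restoration stages into one pipeline."""
    def pipeline(frame):
        for stage in stages:
            frame = stage(frame)
        return frame
    return pipeline

def spatial_upscale_2x(frame):
    """Nearest-neighbour upscaling: undo a 2x spatial down-sampling
    that the encoder side may have applied before compression."""
    out = []
    for row in frame:
        wide = [v for v in row for _ in (0, 1)]  # repeat each pixel
        out.append(wide)
        out.append(list(wide))                   # repeat each row
    return out

def amplitude_rescale(frame, gain=2.0):
    """Invert an amplitude reduction applied before encoding."""
    return [[v * gain for v in row] for row in frame]
```

In this sketch a pipeline such as `compose([spatial_upscale_2x, amplitude_rescale])` restores the decoded frame before it is handed to the machine task network; the real standard defines which operations are signalled and in what order.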
The standard is planned to be completed, i.e., to reach the status of Final Draft International Standard (FDIS), in late 2026.
Draft Joint Call for Proposals on video compression with capability beyond VVC issued
At its 40th meeting in October 2025, the Joint Video Experts Team (JVET, SC 29 WG 5, established together with ITU-T Study Group 21) successfully evaluated the responses received to a Call for Evidence (CfE). At the 41st meeting, follow-up work was conducted towards an open Call for Proposals (CfP, to be issued on behalf of JVET’s SC 29 and ITU-T SG21 parent bodies) on technology for a next-generation video compression standard, with the first draft issued as a public document available at JVET-AO2026. The CfP requests submissions of video compression technology with compression performance or additional functionality beyond that of the Versatile Video Coding standard (VVC, ISO/IEC 23090-3 and ITU-T H.266), where the tradeoff in terms of encoder and decoder implementation cost is also an important criterion. Beyond targeting improved compression in general, another test case with an emphasis on runtime-constrained encoding will also be evaluated. Information about technology supporting functionality that may not be sufficiently supported by existing video compression standards is also requested. Both further improvements based on conventional technology (i.e., traditional signal processing) and technology based on neural networks and artificial intelligence are considered to be of interest in this context.
To evaluate the proposed compression technologies, formal subjective tests will be conducted using video sequences in seven categories. To ensure robustness to the full variety of expected uses, the planned test set for proposals will consist primarily of new materials not used in previous video coding standardization projects (including cropped 8K, user-generated, and gaming content to support new application domains). Proposals will be investigated under conditions enforcing both low-delay and random-access constraints of applications.
A final draft CfP is expected to be issued at the next meeting in April 2026, and the final Call by July 2026. Submissions are planned to be due by the end of October 2026, with evaluation of the responses planned for the meeting in January 2027. As a tentative timeline after the CfP, finalization of a first version of a next-generation video compression standard could be expected at the end of 2029.
New edition of VVC conformance testing
At its 40th meeting, JVET (SC 29 WG 5) submitted the third edition of ISO/IEC 23090-15 for publication as an International Standard. This edition adds bitstreams for testing multi-layer profiles and updates some bitstreams already defined in previous editions. An equivalent twin text was already approved in ITU-T and published as ITU-T H.266.1 version 3.
MPEG-I immersive audio verification test finalized
At the 153rd MPEG meeting, a verification test of the MPEG-I immersive audio coding standard (ISO/IEC 23090-4) was completed.
The MPEG-I immersive audio standard is a comprehensive specification for compact and realistic representation and rendering of audio for Virtual, Augmented and Mixed Reality (VR/AR/MR) applications, including mixed virtual acoustic scenes and physical spaces. It provides high-quality real-time interactive rendering of virtual audio content with six degrees of freedom (6DoF), i.e. the user can not only turn their head in all directions (pitch/yaw/roll) but also move around freely in 3D space (x/y/z). This permits a very high sense of user immersion in the VR/AR scene. The reference software (ISO/IEC 23090-34) implements all aspects of the text specification (ISO/IEC 23090-4) and runs in real-time.
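At the heart of 6DoF rendering is expressing each sound source in the listener's local frame, combining the translational (x/y/z) and rotational (pitch/yaw/roll) degrees of freedom. The following is a deliberately simplified sketch that handles only the yaw component (a real renderer such as the one specified in ISO/IEC 23090-4 applies the full orientation and much more):

```python
import math

def listener_relative(source_xyz, listener_xyz, yaw_deg):
    """Express a source position in the listener's frame (yaw only).

    Simplified illustration: translate by the listener position, then
    rotate by the inverse head orientation. A full 6DoF renderer also
    applies pitch and roll and performs acoustic modelling on top.
    """
    dx = source_xyz[0] - listener_xyz[0]
    dy = source_xyz[1] - listener_xyz[1]
    dz = source_xyz[2] - listener_xyz[2]
    yaw = math.radians(-yaw_deg)  # inverse rotation into the head frame
    rx = dx * math.cos(yaw) - dy * math.sin(yaw)
    ry = dx * math.sin(yaw) + dy * math.cos(yaw)
    return (rx, ry, dz)
```

For example, a source one metre ahead of a listener who has turned 90 degrees ends up to the listener's side in the local frame, which is what drives the interactive spatialization described above.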
The verification test used the reference software. The test set consisted of 10 test scenes with different acoustic properties. Listeners with two levels of expertise participated: listeners with expertise in audio quality assessment, and listeners with additional expertise in 6DoF audio rendering. In total, 56 listeners from six organizations completed the test. The overall outcome for the two expert groups was the same, with smaller variances and slightly lower anchor ratings from the 6DoF experts. The two anchors were rated “fair” and “poor” (47/100 and 16/100, respectively).
Overall, listeners rated the MPEG-I immersive audio condition as “excellent” (median 84/100), confirming that the initial development goal of achieving high-quality interactive audio rendering has been accomplished.
MPEG advances video-based Gaussian Splat Coding
At the 153rd MPEG meeting, MPEG 3D Graphics and Haptics Coding (WG 7) promoted an Amendment to ISO/IEC 23090-5 Video-based Point Cloud Compression (V-PCC) for Gaussian Splat Coding to the Committee Draft Amendment (CDAM) stage.
Gaussian Splats have recently emerged as an efficient representation for realistic 3D scenes, enabling high-quality rendering from a set of 3D Gaussian primitives with associated attributes. While Gaussian Splats share similarities with point clouds, their attribute structure and rendering-oriented usage introduce new requirements for compression. This amendment addresses these requirements by defining a V-PCC profile specialized for Gaussian Splats, enabling their compression using the existing video-based point cloud coding framework without introducing new core coding tools.
The amendment builds on the RAW patch mechanism of V-PCC to efficiently map Gaussian Splat attributes into video representations suitable for compression by standard video codecs. By leveraging the mature V-PCC architecture and its established toolset, this work provides a fast and interoperable path for video-based compression of Gaussian Splats, addressing immediate market needs while remaining fully aligned with the existing MPEG point cloud ecosystem.
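The core idea of the RAW patch mapping is that per-splat attribute values are written verbatim into a 2D raster so that a standard video codec can carry them. As an illustrative assumption (not the normative V-PCC patch syntax), the packing step can be sketched as a simple raster-order fill with padding:

```python
def pack_attributes(splats, width):
    """Raster-scan per-splat attribute values into a 2D grid.

    `splats` is a list of scalar attribute values (e.g. one opacity per
    Gaussian). RAW patches carry values verbatim inside a video frame;
    this sketch shows only the raster-order packing, not the normative
    V-PCC patch signalling.
    """
    rows = []
    for i in range(0, len(splats), width):
        row = splats[i : i + width]
        row += [0] * (width - len(row))  # pad the final partial row
        rows.append(row)
    return rows

def unpack_attributes(grid, count):
    """Inverse of pack_attributes: flatten and drop the padding."""
    flat = [v for row in grid for v in row]
    return flat[:count]
```

Because the packing is lossless and order-preserving, the decoder can recover every attribute exactly, which is what lets the existing video-based toolchain serve as the compression engine without new core coding tools.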
MPEG finalizes conformance and reference software for Low Complexity, Low Latency LiDAR Coding (L3C2)
At the 153rd MPEG meeting, MPEG 3D Graphics and Haptics Coding (WG 7) promoted ISO/IEC 23090-35, Conformance and Reference Software for Low Complexity, Low Latency LiDAR Coding (L3C2), to the Final Draft International Standard (FDIS) stage, completing the development of this important new standard by the working group.
L3C2 addresses the specific characteristics of point clouds generated by spinning LiDAR systems, where the acquisition order of points is known a priori. By exploiting this structured acquisition process, L3C2 enables efficient compression with significantly reduced computational complexity and latency compared to more generic point cloud coding approaches. These properties make L3C2 particularly well suited for real-time applications and deployment scenarios with strict latency or processing constraints.
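Knowing the acquisition order a priori means each LiDAR return is well predicted by the previous one, so the codec can transmit small residuals instead of raw values. As a minimal sketch of that predictive principle (the actual L3C2 predictors and entropy coding are considerably more elaborate), a first-order delta coder looks like this:

```python
def delta_encode(ranges):
    """Residuals of each sample against the previous one.

    Because a spinning LiDAR emits returns in a fixed azimuth order,
    the previous return is a strong predictor of the next; sending
    residuals is the essence of low-latency predictive coding.
    """
    out, prev = [], 0
    for r in ranges:
        out.append(r - prev)
        prev = r
    return out

def delta_decode(residuals):
    """Invert delta_encode by accumulating the residuals."""
    out, prev = [], 0
    for d in residuals:
        prev += d
        out.append(prev)
    return out
```

Each sample can be encoded and emitted as soon as it arrives, with no buffering of future points, which illustrates why this style of coding achieves the low latency the standard targets.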
The ISO/IEC 23090-35 specification provides the normative conformance framework and reference software for L3C2, ensuring consistent and interoperable implementations across platforms. The availability of standardized conformance tests and reference software completes the L3C2 ecosystem, facilitating adoption, validation, and deployment of the standard in practical LiDAR-based systems.
MPEG Announces Successful Completion of MPEG-G Genomics Hackathon to Apply AI for Innovative Uses of Microbiome Data
MPEG is proud to announce the completion of a hackathon on the processing of genomics data encoded using the MPEG-G series of standards (ISO/IEC 23092). The goal of the hackathon was to raise awareness of the MPEG-G compressed data formats and to collect user feedback by challenging participants to analyze microbiome data using AI approaches to generate innovative insights.
The hackathon was based on a longitudinal microbiome study led by Stanford Medicine and the data collected as part of that study. It ran for four months, with the first participants signing up on June 20, 2025, and comprised two challenges with cash prizes for the top submissions in each. The top three finalists in Challenge 1 were awarded prizes, and Challenge 2 was further split into five tracks, with the top performer in each track receiving a prize. The term microbiome refers to the community of microorganisms (bacteria, viruses, fungi, archaea) living in a particular environment – in this case, the microbiome associated with humans. In humans, key microbiome sites include the gut, skin, oral cavity, nasal passages, and urogenital tract, and data from various sites was provided to the hackathon participants.
Challenge 1 was a Microbiome Classification Challenge, where participants were tasked with classifying microbiome samples by body site and individual – using sequencing data in compressed MPEG-G format. Further details about Challenge 1, plus the winning submissions, can be found on the Challenge 1 web site: https://zindi.africa/competitions/mpeg-g-microbiome-classification-challenge
Challenge 2, building on the first challenge and with the potential for new clinical insights, explored interactions between the immune system (cytokine profiles) and the microbiome over time. This challenge encouraged participants to build AI pipelines for parsing, decoding, and inferring or predicting the impact of the microbiome on inflammatory response and on health status, such as insulin resistance, of study participants. Hackathon participants worked to uncover meaningful host–microbe interaction patterns that go beyond classical statistical models. Further details about Challenge 2, plus the winning submissions of each of the five tracks, can be found on the Challenge 2 web site: https://zindi.africa/competitions/mpeg-g-decoding-the-dialogue
MPEG thanks all of the participants in the hackathon and congratulates the winners of both Challenges. Across both Challenges and all tracks, the winners stood out for their technical excellence, thoughtful problem-solving, and strong alignment with the goals of efficient, standardized MPEG-G genomic data representation.
The hackathon was sponsored by Philips with substantial contributions from Leibniz University Hannover, CIMA University of Navarra, Stanford Medicine, Fudan University’s Intelligent Medicine Institute and Viome.
