Why xHE-AAC is being embraced at Meta
- We’re sharing how Meta delivers high-quality audio at scale with the xHE-AAC audio codec.
- xHE-AAC has already been deployed on Facebook and Instagram to provide enhanced audio for features like Reels and Stories.
At Meta, we serve every media use case imaginable for billions of people around the world, from short-form, user-generated content, such as Reels, to premium video on demand (VOD) and live broadcasts. Given this, we need a next-generation audio codec that supports a wide range of operating points with excellent compression efficiency and modern, system-level audio features.
To address these needs now and into the future, Meta has embraced xHE-AAC as the vehicle for delivering high-quality audio at scale.
The advantages of xHE-AAC
xHE-AAC is the latest member of the MPEG AAC audio codec family. The Fraunhofer Institute for Integrated Circuits IIS played a substantial role in the development of xHE-AAC and the MPEG-D DRC standard.
Today, xHE-AAC is already providing a superior audio experience on Facebook and Instagram, including on Reels and Stories, and has a number of valuable features.
Loudness management
With hundreds of millions of uploads per day across Facebook and Instagram, we receive audio tracks with loudness levels ranging from silence to full scale, and everything in between.
When people play these videos sequentially, they’ll perceive some audio as being too loud or too quiet. This creates listener fatigue from having to constantly adjust the volume.
xHE-AAC’s integrated loudness management system solves for loudness inconsistency while meticulously preserving creator intent. It brings the average loudness of all sessions to the same target level and manages the dynamic range of each session to fit the playback environment.
Instead of burning in a specific target level and dynamic range compression (DRC) profile during encoding, xHE-AAC allows us to leave the original audio characteristics untouched and delegate loudness management processing to the client via loudness metadata, for the optimal audio experience based on context.
As a result of xHE-AAC’s loudness management, people can spend more time immersed in their favorite content and less time fiddling with the volume control.
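To make the idea of client-side loudness normalization concrete, here is a minimal sketch of how a player could turn loudness metadata into a playback gain. It is not the xHE-AAC decoder's actual processing: the `LoudnessMetadata` class, its field names, and the simple peak cap are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoudnessMetadata:
    """Hypothetical container for loudness values carried with a stream."""
    program_loudness_lufs: float  # average loudness of the whole track
    sample_peak_dbfs: float       # highest sample peak in the track


def normalization_gain_db(meta: LoudnessMetadata, target_lufs: float) -> float:
    """Gain in dB that moves the track's average loudness to the target.

    Positive gain is capped at the available headroom so the peak stays at
    or below 0 dBFS. This cap is a simplification: a real decoder would
    typically apply DRC or a limiter instead.
    """
    gain = target_lufs - meta.program_loudness_lufs
    headroom = -meta.sample_peak_dbfs
    return min(gain, headroom)


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)
```

Because the gain is derived at playback time, the same encoded track can be played at different target levels without re-encoding; only the `target_lufs` argument changes.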
Adaptive bit rate audio
Most people who use our apps consume media on mobile devices and expect the best audio quality without interruption. This presents a challenge for streaming media because connection quality varies on mobile and can result in a very uneven user experience.
To optimize quality under dynamic bandwidth constraints, we produce multiple video and audio qualities to match varying network conditions at playback time. Although we produce multiple audio lanes, we have historically only employed adaptive bit rate (ABR) algorithms to switch video qualities during playback because it’s difficult to enable adaptive bit rate audio without compromising quality during lane transitions.
To enable seamless audio ABR, xHE-AAC introduces the concept of immediate playout frames (IPFs), which contain all the data necessary to start playing a new audio lane without relying on data from other frames. By placing an IPF at the beginning of each Dynamic Adaptive Streaming over HTTP (DASH) segment and aligning the segment durations of each lane, we can seamlessly switch between audio lanes during playback to provide the highest-quality audio at any available bandwidth while avoiding playback stalls.
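The switching logic described above can be sketched as follows. This is a simplified illustration, not Meta's ABR algorithm: the `AudioLane` class, the per-segment bandwidth budget, and the "highest bitrate that fits" rule are all assumptions. It relies on the property that each segment starts with an IPF, so a switch is only ever made at a segment boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioLane:
    """Hypothetical description of one audio quality in an adaptation set."""
    lane_id: str
    bitrate_bps: int
    segment_starts_s: tuple  # start time of each DASH segment, in seconds


def lanes_are_aligned(lanes) -> bool:
    """True when every lane has identical segment boundaries."""
    first = lanes[0].segment_starts_s
    return all(lane.segment_starts_s == first for lane in lanes)


def pick_lane(lanes, audio_budget_bps: float) -> AudioLane:
    """Highest-bitrate lane that fits the budget, else the lowest lane."""
    ordered = sorted(lanes, key=lambda lane: lane.bitrate_bps)
    fitting = [lane for lane in ordered if lane.bitrate_bps <= audio_budget_bps]
    return fitting[-1] if fitting else ordered[0]


def plan_playback(lanes, budget_per_segment_bps):
    """Choose one lane per segment; switches happen only at boundaries."""
    if not lanes_are_aligned(lanes):
        raise ValueError("lanes must share segment boundaries to switch safely")
    return [pick_lane(lanes, budget).lane_id for budget in budget_per_segment_bps]
```

Falling back to the lowest lane when nothing fits reflects the goal of avoiding stalls: playing lower-quality audio is preferable to pausing playback.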
After launching audio ABR on Facebook for Android, we were able to improve user experience by reducing the number of sessions where playback stalls.
How we deployed xHE-AAC
We generate xHE-AAC bitstreams using an encoder SDK provided by the Fraunhofer Institute for Integrated Circuits IIS, and then prepare the resulting audio files for DASH streaming with shaka-packager. The xHE-AAC encoder’s two-pass encoding mode measures the input loudness envelope and average program loudness on the first pass and performs the actual audio data compression on the second pass. As an added benefit, two-pass encoding allows us to use loudness range control (LRAC) DRC, which mitigates pumping artifacts otherwise introduced by single-pass DRC algorithms.
To prepare an xHE-AAC audio adaptation set for ABR delivery, IPFs are inserted at constant time intervals, audio configuration parameters such as sample rate and channel configuration are kept constant, and unique stream identifiers are selected for each lane in the audio adaptation set.
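The three constraints listed above lend themselves to a simple pre-flight check. The sketch below is one hypothetical way to express them; `LaneConfig` and its fields are illustrative names, not part of the Fraunhofer SDK or shaka-packager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaneConfig:
    """Hypothetical per-lane settings for an audio adaptation set."""
    stream_id: int
    sample_rate_hz: int
    channel_count: int
    ipf_interval_s: float  # constant spacing between IPFs in this lane


def validate_adaptation_set(lanes):
    """Return a list of problems; an empty list means the checks passed."""
    if not lanes:
        return ["adaptation set is empty"]
    problems = []
    if len({lane.ipf_interval_s for lane in lanes}) > 1:
        problems.append("IPF interval differs between lanes")
    if len({lane.sample_rate_hz for lane in lanes}) > 1:
        problems.append("sample rate differs between lanes")
    if len({lane.channel_count for lane in lanes}) > 1:
        problems.append("channel configuration differs between lanes")
    if len({lane.stream_id for lane in lanes}) != len(lanes):
        problems.append("stream identifiers are not unique")
    return problems
```

Returning every problem found, rather than stopping at the first one, makes the check more useful in an encoding pipeline where several settings may be wrong at once.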
At playback time, we custom-fit the audio to the listening environment by configuring a target loudness level and DRC effect type based on context. Thanks to the embedded loudness metadata, we can adapt a single xHE-AAC bitstream to a variety of audio consumption use cases, from headphones to device speakers and various levels of background noise. Finally, whether the client is starved for data or bandwidth is plentiful, audio ABR will automatically switch audio qualities to ensure that the best possible audio quality is played without interrupting the playback session.
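As a rough illustration of context-based configuration, the sketch below maps a listening context to a target loudness level and a DRC effect type. Everything here is assumed for the example: the effect names are loosely modeled on the effect categories defined for MPEG-D DRC, and the target levels and the mapping itself are placeholder choices, not the values Meta uses.

```python
from dataclasses import dataclass
from enum import Enum


class DrcEffect(Enum):
    """Illustrative DRC effect types (names are assumptions)."""
    NONE = "none"
    NOISY_ENVIRONMENT = "noisy_environment"
    LIMITED_PLAYBACK_RANGE = "limited_playback_range"


@dataclass(frozen=True)
class PlaybackConfig:
    target_loudness_lufs: float
    drc_effect: DrcEffect


def playback_config(output: str, noisy: bool) -> PlaybackConfig:
    """Pick decoder settings from the output device and ambient noise.

    The numeric targets are placeholders chosen for this sketch.
    """
    if output == "speaker":
        # Small speakers cannot reproduce a wide dynamic range.
        return PlaybackConfig(-16.0, DrcEffect.LIMITED_PLAYBACK_RANGE)
    if noisy:
        # Quiet passages would be masked by background noise.
        return PlaybackConfig(-16.0, DrcEffect.NOISY_ENVIRONMENT)
    # Headphones in a quiet place: keep the full dynamic range.
    return PlaybackConfig(-24.0, DrcEffect.NONE)
```

The same bitstream serves all three cases; only the configuration handed to the decoder differs.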
Where can you experience xHE-AAC today?
You can experience xHE-AAC audio on Facebook for iOS and Android, as well as on targeted surfaces on Instagram, such as Reels and Stories. We encourage you to install the latest version of the Facebook and Instagram apps on iOS 13+ and Android 9+ to ensure that you can experience it.
Acknowledgements
This work is the collective result of the entire Video Infrastructure and Instagram Media Platform teams at Meta in collaboration with the Fraunhofer Institute for Integrated Circuits IIS. The author would like to extend special thanks to Abhishek Gera, Tim Harris, Arun Kotiedath, Edward Li, Meng Li, Srinivas Lingutla, Denise Noyes, Mohanish Penta, David Ronca, Haixia Shi, Mike Starr, Cosmin Stejerean, Jithin Parayil Thomas, Simha Venkataramaiah, Juehui Zhang, Runshen Zhu, and the engineering team at the Fraunhofer Institute for Integrated Circuits IIS.