©  Pier Paolo Zimmermann

Whose voice is your voice?

Control, identity and metahuman sounds in AI audio compression codecs

By
  • Alberto Ricca
18 September 2025

Abstract

MP3 changed the aural world as the first digital technology to compress audio, interpreting it through perceptual coding that models the standard human, hearing standard sounds, in order to sell and share them. Today, in the era of streaming and virtual meetings, bandwidth has become an even scarcer commodity. Artificial intelligence comes to the rescue, but current codecs, such as Lyra by Google, or Opus, used by WhatsApp and Discord, present a sound reconstructed on the basis of their reference corpora and fine-tuned for speech – »good enough«, as defined by Sterne. A model is trained to efficiently distinguish noise from voiced information and to rebuild a synthetic simulacrum of that information, neutering its context. When this happens, whose voice is your voice? Whose ears are your ears? And if, following Marius Schneider, sounds create the world, what is the world those models remember?

The research stems from a technical analysis of these codecs, their biases and their proven limits and virtues, and focuses on the consequences of the widespread acceptance of apparently transparent audio transmission: a social, efficiency-driven homogenization of the soundscape; the elimination of more-than-human sounds, leading to a deeper dissonance in the authenticity of the acoustic ecology; and the creative possibilities hidden in abusing the emergence of metahuman debris.

Special thanks to Luigi Monteanni, Sissj Bassani and Pier Paolo Zimmermann.

This work is dedicated to the memory of Jonathan Sterne.