My project, "Adding AAC via S/PDIF and Bluetooth", is teaching me a lot about multimedia development while letting me apply what I learned in college. I'll try to explain some aspects of my work and share updates through this blog.

AAC

AAC (Advanced Audio Coding) is a codec developed for lossy digital audio compression. AAC has been standardized by ISO and IEC as part of the MPEG-2 and MPEG-4 specifications. AAC-encoded content is often stored in MPEG-4 .mp4 and .m4a (audio-only) containers. Containers can be thought of as storage formats for multiple data streams such as video, audio and subtitles. A container such as .mkv (developed by one of the mentors at VLC!) has to have its data demuxed and then decoded. AAC uses a modified discrete cosine transform (MDCT) to encode the audio signal.

What is Passthrough?

When a signal is allowed to move from the source to the destination without its form being altered, this is known as passthrough. When using HDMI (High-Definition Multimedia Interface, carrying audio in S/PDIF frames) or Bluetooth for transmission, the input signal is often altered. Without passthrough, you may get lower-fidelity sound. A similar effect can be seen with 3D playback: HDMI versions that support it may allow 3D playback with surround sound, while versions that do not may block 3D playback. For the initial stage of the project I am working on passthrough for Android using the AudioTrack API.

What is currently happening?

In our case, VLC media player doesn't currently support AAC passthrough.

Consider an AAC-encoded 2-channel (stereo) stream sampled at 44.1 kHz. Whenever such a stream is played from a container, say .aac or .m4a, the demuxer is called first. It separates out the audio stream and looks for a subtitle stream for music lyrics.

These samples are then packetized by a packetizer. Some containers (like AVI) cannot distinguish between raw samples and those with headers. Therefore, they forward these samples to a packetizer. The packetizer's role here is to detect the presence of these headers and properly handle or strip them, preparing the stream for the next stage (e.g., decoding or remuxing).
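To make the header detection step concrete, here is a minimal sketch of how a packetizer could recognise a header in front of an AAC frame. It assumes the headers in question are ADTS headers (the usual framing for raw .aac streams), which is my assumption and not something the AVI case above guarantees. The names `AdtsHeader` and `parse_adts_header` are made up for this post and are not VLC's packetizer code.

```python
from typing import NamedTuple, Optional

# Sampling rates addressed by the 4-bit sampling_frequency_index field.
ADTS_SAMPLE_RATES = (96000, 88200, 64000, 48000, 44100, 32000, 24000,
                     22050, 16000, 12000, 11025, 8000, 7350)


class AdtsHeader(NamedTuple):
    sample_rate: int      # in Hz
    channel_config: int   # 2 means stereo
    frame_length: int     # header + payload, in bytes
    header_size: int      # 7 bytes, or 9 when a CRC is present


def parse_adts_header(data: bytes) -> Optional[AdtsHeader]:
    """Return the parsed header, or None if data does not start with one."""
    if len(data) < 7:
        return None
    # 12-bit syncword 0xFFF, and the 2-bit layer field must be 0.
    if data[0] != 0xFF or (data[1] & 0xF6) != 0xF0:
        return None
    protection_absent = data[1] & 0x01
    sf_index = (data[2] >> 2) & 0x0F
    if sf_index >= len(ADTS_SAMPLE_RATES):
        return None  # reserved index, treat as "not a header"
    channel_config = ((data[2] & 0x01) << 2) | (data[3] >> 6)
    frame_length = ((data[3] & 0x03) << 11) | (data[4] << 3) | (data[5] >> 5)
    header_size = 7 if protection_absent else 9
    return AdtsHeader(ADTS_SAMPLE_RATES[sf_index], channel_config,
                      frame_length, header_size)


def strip_adts_header(frame: bytes) -> bytes:
    """Return the raw AAC payload, or the input unchanged if no header."""
    header = parse_adts_header(frame)
    if header is None:
        return frame
    return frame[header.header_size:header.frame_length]
```

If `parse_adts_header` returns `None`, the sketch treats the data as raw samples and leaves it untouched. A real packetizer would also resynchronise on the next syncword after a bad frame, which is left out here.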

These packets are then passed to the decoder. Decoding is handled by FFmpeg's libavcodec, which in our case is version Lavc59.37.100. The decoded audio is then pre-buffered for playback.

There are currently 3 different methods for audio output on Android:
1. OpenSL ES - not used by default
2. AudioTrack - legacy, supports passthrough
3. AAudio - newer, faster and in development

Initial AAC playback log

We are working on the AudioTrack API for this project.

What am I working on?

I am working on skipping the decode step, so that the original AAC stream is sent to the amplifier instead of decoded PCM output.

Here are a few terms to help you understand better:

Sample: A sample is a single measurement of the audio signal's amplitude, and the sampling rate determines the number of samples taken per second. For a sampling rate of 44.1 kHz we take 44,100 samples every second. According to the Nyquist theorem, to accurately reproduce a signal, the sampling rate must be at least twice the highest frequency present in the signal. For human hearing, the maximum frequency is about 20 kHz, so 44.1 kHz is more than sufficient for audio applications.
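The arithmetic behind this can be checked with a few lines of Python. This is just an illustrative sketch. The function names are made up for this post, and the 16 bits per sample used in the bitrate example is an assumption (a common PCM format), not something fixed by the sampling rate.

```python
def nyquist_frequency(sample_rate_hz: int) -> float:
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


def can_represent(signal_hz: float, sample_rate_hz: int) -> bool:
    """True if the sampling rate is at least twice the signal frequency."""
    return sample_rate_hz >= 2 * signal_hz


def pcm_bitrate(sample_rate_hz: int, channels: int, bits_per_sample: int) -> int:
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * channels * bits_per_sample


print(nyquist_frequency(44100))        # 22050.0
print(can_represent(20000, 44100))     # True
print(pcm_bitrate(44100, 2, 16))       # 1411200 (assuming 16-bit stereo PCM)
```

So a 44.1 kHz stream can represent frequencies up to 22.05 kHz, which covers the roughly 20 kHz limit of human hearing.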

Channel: An audio channel is a single stream of audio information. Each channel can carry a separate sound signal, and when multiple channels are combined, they create a spatial audio experience for the listener.

The most common configurations are mono audio (single channel), stereo 2.0, 5.1 and 7.1.

The first digit gives the number of full-bandwidth channels (for 5.1: left, center, right, left surround, right surround) and the digit after the decimal point gives the number of low-frequency channels (subwoofer) for bass.
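As a small illustration of this naming scheme, the sketch below splits a layout name into its two counts. The helper names are hypothetical, and it only handles the two-part "N.M" names mentioned above.

```python
from typing import Tuple


def parse_channel_layout(layout: str) -> Tuple[int, int]:
    """Split a name like "5.1" into (full-bandwidth channels, LFE channels)."""
    full, _, lfe = layout.partition(".")
    return int(full), int(lfe or 0)


def total_channels(layout: str) -> int:
    """Total number of audio channels in the layout."""
    full, lfe = parse_channel_layout(layout)
    return full + lfe


for name in ("2.0", "5.1", "7.1"):
    print(name, parse_channel_layout(name), total_channels(name))
```

For example, "5.1" gives 5 full-bandwidth channels plus 1 low-frequency channel, so 6 channels in total.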