Sony's ATRAC with Type-R DSP

Translated from a Japanese Sony brochure that mentions the Type-R DSP for ATRAC

Adaptive High Band Control Technology Improves the Signal Processing Accuracy

With the MDS-JA33ES/JA22ES we have developed a new Type-R DSP whose core has twice the signal processing power of its predecessor. It also adopts a newly developed intelligent bit re-allocation algorithm that takes full advantage of this increased processing power. The algorithm re-analyzes the musical data and searches for subtle, redundant bit allocations that had until now been difficult to find. It then re-allocates these redundant bits preferentially to psychoacoustically important bands, refining the bit allocation and improving the reproduction of the source signal. Preserving the accuracy of each processing result is essential in digital signal processing. In developing the Type-R DSP for ATRAC, every processing step was checked so that highly accurate processing could be achieved. As a result, the reproduced sound has become euphonious and quite close to the source sound.


(June 2003)

Sony Japan's 2003 A-V Catalog (p. 15) included a brochure-level (i.e. non-technical) description of general ATRAC compression along with this illustration of the Bit Reallocation technique employed in ATRAC Type-R:

In a re-analysis of the audio data, bits are re-allocated to frequency bands critical to audibility.

The illustration clarifies the 2-dimensional nature of the encoding bit allocation problem. ATRAC and other perceptual coders preferentially allocate bits to those frequency components in each analysis frame* deemed most audible to humans. The idea is to encode each segment of the spectrum with an accuracy that is proportional to its audibility. Once the available bits have been apportioned to each of ATRAC's 52 spectral bands, ATRAC Type-R apparently takes a second look at the actual quantization error that would occur, were each component encoded with its apportioned bits. At this point a decision can reasonably be made to reallocate bits from one band to another if doing so would decrease the [perceptually weighted] overall quantization error of the frame.

The point is that a closed-form solution that considers both optimization criteria simultaneously (perceptual audibility and actual quantization error) may be difficult or impossible to implement. ATRAC Type-R instead employs a two-pass approach: the first pass allocates bits based upon the human auditory system's response to a frame, and the second minimizes the quantization error of the actual spectral components the frame must encode.
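To make the two-pass idea concrete, here is a minimal sketch in Python. It is not Sony's algorithm, which is unpublished; it only illustrates the guess described above. Everything in it is an assumption made for illustration: the function names, the four-band example (ATRAC uses 52), the audibility weights, and the toy noise model in which each extra bit cuts a band's quantization noise power by a factor of four (roughly 6 dB).

```python
def weighted_error(energy, weight, bits):
    """Perceptually weighted quantization error of one frame (toy model).

    Assumption: each extra bit lowers a band's noise power by a factor of 4.
    """
    return sum(w * e * 4.0 ** (-b) for e, w, b in zip(energy, weight, bits))


def first_pass(weight, total_bits):
    """Pass 1: apportion bits in proportion to each band's audibility weight."""
    total_w = sum(weight)
    bits = [int(total_bits * w / total_w) for w in weight]
    # Hand out bits lost to rounding down, most audible bands first.
    leftover = total_bits - sum(bits)
    order = sorted(range(len(weight)), key=lambda i: -weight[i])
    for i in order[:leftover]:
        bits[i] += 1
    return bits


def second_pass(energy, weight, bits):
    """Pass 2: move single bits between bands while the weighted error drops."""
    bits = list(bits)
    while True:
        current = weighted_error(energy, weight, bits)
        best = None
        for src in range(len(bits)):
            if bits[src] == 0:
                continue  # nothing to take from this band
            for dst in range(len(bits)):
                if dst == src:
                    continue
                trial = list(bits)
                trial[src] -= 1
                trial[dst] += 1
                err = weighted_error(energy, weight, trial)
                if err < current and (best is None or err < best[0]):
                    best = (err, trial)
        if best is None:
            return bits  # no single-bit move improves the frame any further
        bits = best[1]


if __name__ == "__main__":
    # Hypothetical frame: per-band signal energy and audibility weight.
    energy = [1.0, 100.0, 0.01, 10.0]
    weight = [1.0, 1.0, 0.5, 0.25]
    initial = first_pass(weight, total_bits=16)
    refined = second_pass(energy, weight, initial)
    print("pass 1:", initial, weighted_error(energy, weight, initial))
    print("pass 2:", refined, weighted_error(energy, weight, refined))
```

In this example the first pass looks only at audibility, so it starves the fourth band, which has a low weight but high energy. The second pass sees the large error that results and takes bits from bands that can spare them. The total bit budget for the frame stays the same throughout.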

*The analysis frame (a.k.a. "window") size is 512 samples (11.6 ms) for SP mode audio and 1024 samples (23 ms) for the LP modes.
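The frame durations quoted in the footnote follow from the sample counts, assuming MiniDisc's 44.1 kHz sampling rate (the footnote does not state the rate). A short check, with an illustrative helper name:

```python
SAMPLE_RATE_HZ = 44_100  # assumed: the MiniDisc sampling rate


def frame_duration_ms(samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration in milliseconds of an analysis frame of the given length."""
    return 1000.0 * samples / sample_rate_hz


for mode, samples in (("SP", 512), ("LP", 1024)):
    print(f"{mode}: {samples} samples = {frame_duration_ms(samples):.1f} ms")
```

This gives about 11.6 ms for the 512-sample frame and 23.2 ms for the 1024-sample frame, in line with the rounded figures in the footnote.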


This is an educated guess as to the workings of ATRAC Type-R. I would be happy to discuss its accuracy with anyone familiar with the topic.
