Kenji Nakao, Seiji Murata, Kenji Torazawa, Seikou Yamanaka, Tomohiro Gotoh
Hypermedia Research Center, Information Products Research Center
Sanyo Electric Co., Ltd., Ohmori-180, Anpachi-Cho, Gifu, 503-01, Japan

March 96, from the digest of technical papers of the 1996 IEEE ICCE: 0-7803-3029

Abstract: We have developed a high-performance, compact transcriber that can record or transcribe up to 48 hours of voice on a MiniDisc, in addition to playing back hi-fi audio.

1. Introduction

This device is designed to overcome the limitations of cassette tapes used for dictation and other extended-duration voice recordings. Cassettes used for these purposes suffer from slow seek/search and a maximum capacity of 4 hours of speech.

MD is a good potential replacement; however, in its standard form it has a maximum capacity of only 148 minutes of audio.

This new type of transcriber was therefore developed to offer all the portability and quick-access benefits of MD while also recording up to 48 hours of voice on a single disc.

Figure 1. Block Diagram of Transcriber

2. System Design

The basic requirement was to create a portable, MD-based system that could record at least 4 hours per disc. Figure 1 shows a block diagram of the new design.

The RF AMP + ADIP (Address In Pre-groove) Demodulator + LPC (Laser Power Control) block was already available from previous development work; in this project, the ENC/DEC + SPP (Shock Proof Processor) and Interface + Memory Controller units were developed.

ENC/DEC + SPP: This is an LSI circuit that encodes and decodes the EFM signal and controls the Shock Proof Memory RAM. This is accomplished directly in hardware, which reduces the software load. The SPP data format is suitable for the ATRAC decoder and the Voice Codec.

Interface + Memory Controller: This unit is an FPGA (field-programmable gate array) that controls the data flow for both audio and voice, as well as the external static RAM, so as to extend the interval over which recording can proceed without intervention by the microcomputer. The Interface also detects and manages data errors.

3. System Features

Voice coding is accomplished with Modified Sub Band Coding (MSBC), which provides a Signal to Noise Ratio (SNR) 6 dB greater than Adaptive Differential Pulse Code Modulation (ADPCM). This allows the transcriber to record high-quality voice for 9 hours at a data transfer rate of 32 kbps, or for 48 hours at 6 kbps, all on a single MD. The data rate is determined by the combination of quantization bit count and sampling frequency, both of which can be selected arbitrarily. [Note that if they had used straight telephone mu-law encoding, the capacity would have been only 5 hours of speech.]
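The two quoted operating points can be checked against each other with simple arithmetic. The sketch below is not taken from the paper: it assumes the disc offers a fixed voice payload, so that recording time is inversely proportional to data rate. Under that assumption it derives the rates implied for the 18-hour and 36-hour modes (which the text does not state) and the capacity at the standard 64 kbps telephone mu-law rate. The function names are illustrative.

```python
# Sanity check of the quoted rate/time figures.
# Assumption (not stated in the paper): every mode fills the same fixed
# payload, so hours * kbps is constant across modes.

def payload_bits(hours, kbps):
    """Total number of voice bits recorded in `hours` at `kbps` kbit/s."""
    return hours * 3600 * kbps * 1000

def hours_at_rate(bits, kbps):
    """Recording time in hours that `bits` of payload allows at `kbps`."""
    return bits / (kbps * 1000) / 3600

def implied_kbps(bits, hours):
    """Data rate in kbit/s that fills `bits` of payload in `hours`."""
    return bits / (hours * 3600) / 1000

# The two operating points given in the text.
budget = payload_bits(9, 32)
same_budget = payload_bits(48, 6)

print("payload at 9 h / 32 kbps :", budget / 8 / 1e6, "MB")
print("payload at 48 h / 6 kbps :", same_budget / 8 / 1e6, "MB")

# Rates implied for the other selectable times (derived, not quoted).
for hours in (18, 36):
    print(hours, "h implies", implied_kbps(budget, hours), "kbps")

# Standard telephone mu-law: 8 kHz sampling x 8 bits = 64 kbps.
print("mu-law capacity:", hours_at_rate(budget, 64), "h")
```

Both quoted modes work out to the same payload of roughly 130 MB, which supports the fixed-payload assumption. At 64 kbps the same payload lasts 4.5 hours, in line with the bracketed note's figure of about 5 hours.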

On playback, the transcriber automatically selects the ATRAC decoder or the Voice Codec based on the data contents of the disc. It can also play back voice-coded audio at 1.6x normal speed without changing the pitch of the voice.

Rec/Play Time (per disc)   9, 18, 36, 48 hours (selectable)
Voice Coding Format        Modified Sub Band Coding (MSBC)
Audio Coding Format        ATRAC
Usable Medium              Audio MD, MD-Data
Voice Play Speed           Normal
                           Simple Fast (x1.3, x1.6, x2.0)
                           Simple Slow (x0.8, x0.9)
                           Voice Speed Converting
Editing Functions          Erase, Combine, Divide, Move
Other Functions            Voice Operated Recording, Back Repeat, Title Editing
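To give a feel for what the listed play speeds mean in practice, the following sketch computes how long it would take to listen to a recording at each speed factor. It assumes that a factor of xN simply divides listening time by N; the paper does not describe how the speed modes are implemented, and the names used here are illustrative.

```python
# Listening time at each play speed listed in the feature table.
# Assumption: a speed factor of xN divides the listening time by N.

PLAY_SPEEDS = {
    "Simple Slow": [0.8, 0.9],
    "Normal": [1.0],
    "Simple Fast": [1.3, 1.6, 2.0],
}

def listening_hours(recorded_hours, speed):
    """Hours needed to play back `recorded_hours` of voice at `speed`."""
    return recorded_hours / speed

recorded = 9  # one disc recorded in the 9-hour mode
for mode, factors in PLAY_SPEEDS.items():
    for factor in factors:
        hours = listening_hours(recorded, factor)
        print(f"{mode:12s} x{factor}: {hours:.2f} h")
```

For example, a full 9-hour disc would take 4.5 hours to review at x2.0.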

4. Summary

This high-performance transcriber employs a newly developed LSI and gate array, and is the first system to permit recordings of up to 48 hours on MiniDisc. The report's authors feel this transcriber is an epoch-making achievement that can change the world of conventional recording and word processing.

