AZID Readme v1.7.1 build 698 (21.07.2001). By Midas <midas@egon.gyaloglo.hu>
-------------------------------------------------------------------------------


Introduction
============

This is the documentation for the AC3 decoder application, azid. It is written
by Midas <midas@egon.gyaloglo.hu>.



Usage and legal conditions:
---------------------------
  This is a test implementation of standard A/52 from ATSC (Digital Audio
  Compression Standard), and it may contain algorithms covered by pending
  patents. This application may solely be used for proving that bitstreams
  are compliant to this standard for test and demonstration purposes only.
  Any other use may be prohibited by law in your country. The author has
  no liability regarding this application whatsoever. This application
  may be distributed freely unless prohibited by law.



Overview
--------

AC3 is a digital compression algorithm which may compress up to 5 full
bandwidth channels and one low-frequency effects channel (with limited
bandwidth of 120Hz) into a bitstream. The size of this compressed bitstream
is typically reduced by a factor of 13 compared with the raw data rate.

The specification for the AC3 decoder is the ATSC standard A/52, available
at http://www.dolby.com or http://www.atsc.org.

AC3 encoded audio is divided into frames. One frame gives 1536 samples of
audio or 32ms of audio at 48kHz sampling rate. A frame is divided into several
sub-sections:

      - synchronization
      - bit stream information (BSI)
      - 6 audio blocks
      - auxiliary data (and CRC)

The BSI section contains information regarding the bitstream and the current
frame. It contains information like samplerate, number of encoded channels,
downmix-levels, dynamic compression types, program contents, etc.

The audio block contains the actual encoded audio. One block gives 256 samples
(approx. 5.3ms at 48kHz). One audio block is atomic; the audio decoding
operation is repeated for each of these six blocks. The specific details of
this operation can be found in the A/52 spec.
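
The frame and block durations quoted above follow directly from the sample
counts. A minimal sketch of the arithmetic (the function names are
illustrative and not part of azid):

```python
SAMPLES_PER_BLOCK = 256   # one audio block
BLOCKS_PER_FRAME = 6      # audio blocks per AC3 frame


def frame_duration_ms(sample_rate_hz):
    """Duration of one AC3 frame (6 blocks of 256 samples) in milliseconds."""
    samples_per_frame = SAMPLES_PER_BLOCK * BLOCKS_PER_FRAME  # 1536 samples
    return 1000.0 * samples_per_frame / sample_rate_hz


def block_duration_ms(sample_rate_hz):
    """Duration of a single audio block in milliseconds."""
    return 1000.0 * SAMPLES_PER_BLOCK / sample_rate_hz


print(frame_duration_ms(48000))            # 32.0
print(round(block_duration_ms(48000), 2))  # 5.33
```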



Decoder operation
-----------------

The decoder decodes an audio block into elementary channels of audio. These
elementary channels represent the same channels that were fed into the
ac3 coder at the studio, such as center, left, right, etc. If the number of
actual output channels is fewer than the encoded number of audio channels,
the decoder must downmix these channels into the correct number of channels.

In this documentation these channels are called "input channels" and the
channels are named: left (l), right (r), center (c), surround left (sl or s),
surround right (sr) and low-frequency effect channel (lfe).

The downmix operation reduces the number of input channels to the requested
number of speakers. This operation is controlled by the option -d. It selects
how many front and rear speakers the decoder should decode to. If for example
-d2/0 (2 front speakers, no rear speakers) is selected, everything is mixed
into these two speakers/channels. 

Then the audio is fed to the output selector. This controls which of these
channels to route to the actual speaker output. The first speaker output is
named output speaker 0, the next output speaker 1, etc. The decoder supports
up to max. 6 output speakers.

The -o option controls which channel(s) to output. If -d2/0 is selected, the
left and the right channel may be output with the -ol,r option. In this case,
all channels other than l and r do not contain any audio because -d2/0
generates audio only in the l and r channels. Other sequences of channel
output may also be chosen, or the same channel may be output several times. For
example: -or,l and -oc,c are both legal options.
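
The output selector can be pictured as a simple lookup from the -o sequence
into the downmixed channels. The sketch below is only a model of the behaviour
described here, not azid's actual code. The function name and the use of a
single sample value per channel are assumptions made for illustration.

```python
ALIASES = {"s": "sl"}  # "s" is another name for surround left
VALID_CHANNELS = {"l", "c", "r", "sl", "sr", "lfe", "0"}


def select_outputs(sequence, channels):
    """Route downmixed channels to output speakers per an -o style sequence.

    sequence: comma-separated channel names, e.g. "l,r".
    channels: dict mapping channel name to one sample value.
    Channels that the downmix did not produce, and the special zero-data
    channel "0", come out as silence (0.0).
    """
    names = [ALIASES.get(name, name) for name in sequence.split(",")]
    if not 1 <= len(names) <= 6:
        raise ValueError("between 1 and 6 output speakers are supported")
    for name in names:
        if name not in VALID_CHANNELS:
            raise ValueError("unknown channel: " + name)
    return [0.0 if name == "0" else channels.get(name, 0.0) for name in names]


# With -d2/0 only l and r carry audio.
downmixed = {"l": 0.5, "r": -0.25}
print(select_outputs("r,l", downmixed))  # [-0.25, 0.5]
print(select_outputs("c,c", downmixed))  # [0.0, 0.0]
```

Output speaker 0 is the first entry of the returned list, output speaker 1
the second, and so on.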

The -l and -L options control how the LFE channel is downmixed. The -l option
selects the downmix level of the LFE channel into the LFE (output) channel,
while -L controls the amount of LFE audio mixed into the left and right
channels.

TIP: If -d3/2 decoding is chosen, no downmix is performed and it's possible
to listen to individual channels selectable with the -o option. E.g. use 
-ol,r to listen to the left and right channels, or use -osl,sr to listen to
the surround channels.

A special command option named --ch may be used to individually change the
attributes for each channel. This option may be used both on input channels
(named l,c,r,etc.) or on output speakers (numbers 0-5). The attributes
may contain gain (see -g for syntax) and/or a dynamic compression value (see
-c).




COMMAND OPTIONS
===============

This section describes each setting of the decoder and how it affects the
decoding process.

The command-line syntax of azid is:

    azid [options] input.ac3 [output.wav]

The output.wav file is optional. If omitted, azid sends the audio to your
soundcard. If the -N option is given, no output is produced (neither to a
file nor to the soundcard).

The number of entries in the -o option controls the number of channels that
azid will produce. The default option -ol,r will produce a stereo wav or play
stereo sound. If, however, -oc is used, the wav-file will be mono and the
playback will be mono. More than 2 channel playback depends on your
soundcard (driver).



-b BOOL, --bsi-log=BOOL 
-----------------------

Default: true

The AC3 bitstream contains a BSI (Bit Stream Information) section. This
section contains information about the bitstream, like sampling rate, number
of channels, and other descriptive fields.

This command option selects whether a print-out of this section should be
shown. A typical BSI print-out looks like this:


      +------ BSI -----
      |  Bitrate: 448 kbit (48 kHz)
      |  Mode: Complete Main (CM)
      |  Audio mode: 2/2  L,R,SL,SR
      |  Surround mix level: -3.0dB
      |  Dialogue level: -27dB
      |  Language: English
      |  Mixlevel: 105dB SPL
      |  Roomtype: Small root, flat monitor
      |  Stream: Copyright protected, Original stream
      +----------------



-c COMPR, --dcompr=COMPR
------------------------

Default: none

This option sets the overall dynamic compression in the decoder. This value
is applied to every output speaker.

The bitstream contains information on how much to amplify or attenuate the
sound to decrease the overall dynamic variations (loudness) in the program
contents. Different options exist to choose the wanted dynamic reduction:


    o none	No dynamic compression. The program contents is unchanged.

    o normal	Normal dynamic compression. Normal in-store decoders use
		this as a hardcoded default.

    o light	Light dynamic compression. This is 50% (-6db) of the
		reduction/gain that normal dynamic compression would give.

    o heavy	Heavy dynamic compression. Intended for poor listening
		environment with much background noise.

    o inverse	Dynamic expansion. This is the inverse value of the light
		dynamic compression, i.e. it makes strong sounds stronger
		and weaker sounds weaker.



-C LEVEL, --clevel=LEVEL
------------------------

Default: BSI

This command option controls the center downmix level into the LR channels.
Normally, the BSI section contains a field which tells the decoder how to
downmix the center channel into the LR channels. 

With this option, the user may override the BSI center downmix level and 
specify a custom value. Note that this option is only active when the 
output decode mode (-d) is 2/x.

Allowable values are gain values (either in db's or a positive numerical
value) or BSI. When BSI is selected, the center downmix level gets its value
from the BSI section.



--ch#=ATTRIB0[,ATTRIB1[,...]]
-----------------------------

This option sets one or more attributes for the given channel. There are two
major types of channels available:

    o The input channels. These are the channels coming directly from the 
      decoder prior to downmixing. Each of these channels represents the
      same as the channels put into the ac3 coder at the studio. 
      Allowed channel names are: l,c,r,sl,sr,s or lfe.

    o The output channel or speaker. It refers to an output channel after
      downmixing and output selecting (-o). It refers directly to the
      index of the -o option. E.g. '-ol,r' implies that output channel
      0 is the left channel, and output channel 1 is the right channel.
      If '-oc,c --ch0=12db' is used, both outputs will contain the
      center channel, but only the first channel will have 12db gain.
      Allowed output channel names are: 0,1,2,3,4 and 5.

The attributes may be:

    o Channel gain. This specifies how much the signal on the given
      channel should be amplified/attenuated. Legal values are positive
      numbers or a logarithmic value written with the postfix 'db'.
      Examples: --chl=12 --chc=0 --ch0=+3db --ch1=-3db
     
    o Channel dynamic compression. This specifies the dynamic compression
      to use for that channel. Allowable values are: none,light,normal,
      heavy and inverse. Examples: --chc=normal

Several attributes may be separated by commas. Like this:

    --chc=normal,3db  or  --chlfe=light,0.5  or  --ch0=none,-3db
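
To illustrate how such an option splits into its parts, here is a minimal
parsing sketch. It is not azid's parser. The function name is hypothetical,
and the gain is deliberately kept as an unparsed string.

```python
INPUT_CHANNELS = {"l", "c", "r", "sl", "sr", "s", "lfe"}
OUTPUT_CHANNELS = {"0", "1", "2", "3", "4", "5"}
COMPRESSION_MODES = {"none", "light", "normal", "heavy", "inverse"}


def parse_ch_option(option):
    """Split e.g. "--chc=normal,3db" into a channel name and its attributes."""
    if not option.startswith("--ch") or "=" not in option:
        raise ValueError("expected --ch<channel>=<attributes>")
    channel, _, value = option[len("--ch"):].partition("=")
    if channel not in INPUT_CHANNELS | OUTPUT_CHANNELS:
        raise ValueError("unknown channel: " + channel)
    attributes = {}
    for item in value.split(","):
        # A known compression name sets the compression; anything else is
        # kept as a gain string such as "3db" or "0.5".
        key = "compression" if item in COMPRESSION_MODES else "gain"
        attributes[key] = item
    return channel, attributes


print(parse_ch_option("--chc=normal,3db"))
print(parse_ch_option("--chlfe=light,0.5"))
print(parse_ch_option("--ch0=none,-3db"))
```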



-d FRONT/REAR, --decode=FRONT/REAR
----------------------------------

Default: 2/0

This option selects how many front and rear speakers the decoder should
downmix for. The argument is given as front speakers/rear speakers.

Note that this option only sets the downmix type, not the actual output.
The -o option controls which channels to output. This option does not
control the LFE channel (see -l and -L).

Possible values are: 1/0, 2/0, 3/0, 1/1, 2/1, 3/1, 1/2, 2/2, 3/2
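
As a small illustration, the FRONT/REAR argument can be validated against the
list above and split into two integers. This is a sketch with a hypothetical
function name, not the decoder's own code.

```python
ALLOWED_MODES = ("1/0", "2/0", "3/0", "1/1", "2/1", "3/1", "1/2", "2/2", "3/2")


def parse_decode_mode(arg):
    """Return (front, rear) speaker counts for a -d argument such as "3/2"."""
    if arg not in ALLOWED_MODES:
        raise ValueError("unsupported decode mode: " + arg)
    front, rear = arg.split("/")
    return int(front), int(rear)


print(parse_decode_mode("2/0"))  # (2, 0)
print(parse_decode_mode("3/2"))  # (3, 2)
```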



-e ERROR_ACTION, --erraction=ERROR_ACTION
-----------------------------------------

Default: zero

This option controls the decoder action in case of bitstream errors. Possible
values are:

    o quit. This causes azid to quit the entire application if it
      encounters an error in the bitstream.

    o zero. The decoder will skip the current frame of ac3-data and pad
      the output with silence and continue with the next frame of
      data.
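
The two error actions can be sketched as a decode loop. Everything named here
(decode_stream, decode_frame, BitstreamError) is hypothetical and only models
the behaviour described above for a single channel.

```python
SAMPLES_PER_FRAME = 1536  # 6 audio blocks of 256 samples


class BitstreamError(Exception):
    """Raised by the (hypothetical) frame decoder on a corrupt frame."""


def decode_stream(frames, decode_frame, error_action="zero"):
    """Decode all frames, handling bitstream errors as -e would.

    "quit" lets the error propagate and ends decoding.
    "zero" replaces the bad frame with one frame of silence and continues.
    """
    output = []
    for frame in frames:
        try:
            output.extend(decode_frame(frame))
        except BitstreamError:
            if error_action == "quit":
                raise
            output.extend([0.0] * SAMPLES_PER_FRAME)
    return output
```

With "zero" the output keeps its length and timing, because every frame still
contributes 1536 samples whether or not it could be decoded.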



-f BOOL, --read-filter=BOOL   *NEW*
---------------------------

Default: off

This option controls rear-channel filtering in 2/0 output mode. The filter is
a 2nd order Butterworth filter with its -3 dB point at 7 kHz. There are two
major applications for this feature: 

    o To provide a proper Pro Logic downmix of the rear channels.

    o To reduce phasing problems in the downmix (washy sound) caused by
      the rear channel downmix into the L and R channels.

Usually the rear channels are phase-shifted 90 degrees with respect to the
front channels, either prior to or inside the ac3 encoder. This is done to
avoid phasing problems when downmixing the program contents to two channels.
Some sources do not provide this shift, and thus this feature was added.

The filter provides an increasing phase shift according to frequency. It is
90 deg at 7kHz.
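
The -3 dB and 90 degree figures can be checked against the textbook response
of a 2nd order Butterworth filter. The sketch below assumes an ideal analog
low-pass prototype. Whether azid's filter is a low-pass, and how it is
realised digitally, is not stated here, so treat this as a plausibility check
only.

```python
import cmath
import math


def butterworth2_lowpass(freq_hz, cutoff_hz):
    """Response of an ideal analog 2nd order Butterworth low-pass.

    H(s) = 1 / (s^2 + sqrt(2)*s + 1), with s = j * f / f_cutoff.
    """
    s = 1j * freq_hz / cutoff_hz
    return 1.0 / (s * s + math.sqrt(2.0) * s + 1.0)


h = butterworth2_lowpass(7000.0, 7000.0)
gain_db = 20.0 * math.log10(abs(h))
phase_deg = math.degrees(cmath.phase(h))
print(round(gain_db, 2), round(phase_deg, 1))  # -3.01 -90.0
```

Below the cutoff the phase shift is smaller and above it larger, which matches
the description of a shift that increases with frequency.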

NOTE: This option is only effective when 2/0 output mode is selected (-d 2/0).



-F FILE_TYPE, --filetype=FILE_TYPE   *NEW*
----------------------------------

Default: wav

Selects the file type to generate. Possible values are:

    o wav. Generates "normal" 16-bit stereo wav.

    o wav24. Generate 24-bit floating-point wav.

    o wav32. Generate 32-bit floating-point wav.

    o pcm. Generate 16-bit pcm-file.




-g GAIN, --gain=GAIN
--------------------

Default: 1.0 (or 0db)

This option controls the main (output speaker) gain. The value can be given
in db's (by specifying "db" after the argument) or a positive numerical
value. Examples: -g-3db, -g5.3, -g6db
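
A minimal sketch of how the two gain notations relate. It assumes the usual
amplitude convention (20 * log10), which agrees with the "50% (-6db)" figure
used for light compression. The function name is illustrative and this is not
azid's own parser.

```python
def parse_gain(text):
    """Convert a gain argument such as "-3db", "6db" or "5.3" to a linear factor."""
    text = text.strip().lower()
    if text.endswith("db"):
        # Logarithmic form: amplitude ratio = 10^(dB / 20).
        return 10.0 ** (float(text[:-2]) / 20.0)
    value = float(text)
    if value < 0:
        raise ValueError("plain gain values must not be negative")
    return value


print(parse_gain("0db"))             # 1.0
print(round(parse_gain("-6db"), 3))  # 0.501
print(parse_gain("5.3"))             # 5.3
```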




-l LFE_LEVEL, --lfe=LFE_LEVEL
-----------------------------

Default: 0.0

This controls the downmix-level of the LFE channel into the LFE output
speaker. I.e. if this option is set to a non-zero value, the LFE channel
output may be listened to with the -olfe option.



-L LRLFE_LEVEL, --lrlfe=LRLFE_LEVEL
-----------------------------------

Default: 1.0 (or 0db)

This controls the downmix-level of the LFE channel into the LR channels.



-m MONO_MODE, --mono=MONO_MODE
------------------------------

Default: stereo

This option controls what type of 1+1 decoding should be used. A special 
channel configuration exists, where the stream contains two mono audio
channels (called 1+1). Selectable options:

    o ch1. Route channel 1 into center.

    o ch2. Route channel 2 into center.

    o mono. Route channel 1 + channel 2 into center.

    o stereo. Route channel 1 into left and channel 2 into right.
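
The four routings can be written down as a small function. This is a sketch of
the routing only: the function name is hypothetical, and any scaling azid may
apply when summing the two channels in "mono" mode is not modelled.

```python
def route_dual_mono(ch1, ch2, mode="stereo"):
    """Map the two channels of a 1+1 stream onto input channels.

    ch1, ch2: one sample value from each mono channel.
    Returns a dict of channel name -> sample value.
    """
    if mode == "ch1":
        return {"c": ch1}
    if mode == "ch2":
        return {"c": ch2}
    if mode == "mono":
        return {"c": ch1 + ch2}  # plain sum; real scaling is unspecified
    if mode == "stereo":
        return {"l": ch1, "r": ch2}
    raise ValueError("unknown mono mode: " + mode)


print(route_dual_mono(0.25, 0.5, "stereo"))  # {'l': 0.25, 'r': 0.5}
print(route_dual_mono(0.25, 0.5, "mono"))    # {'c': 0.75}
```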



-M BOOL, --matrix-log=BOOL
--------------------------

Default: off

This option makes the decoder print the downmix matrix with its individual
coefficients. A typical print would look something like this:


      +------ DOWNNMIX MATRIX -----
      |          IN0     IN1     IN2     IN3     IN4     IN5
      |  L  : +0.2426 +0.1716 +0.0000 -0.1716 -0.1716 +0.2426
      |  C  : +0.0000 +0.0000 +0.0000 +0.0000 +0.0000
      |  R  : +0.0000 +0.1716 +0.2426 +0.1716 +0.1716 +0.2426
      |  SL :                 +0.0000 +0.0000 +0.0000
      |  SR :                 +0.0000 +0.0000 +0.0000
      |  LFE:         +0.0000 +0.0000 +0.0000 +0.0000 +0.0000
      +----------------------------

The channels on the top (INx) are the input channels. Which channel each of
these inputs represent can be read from the audio mode section in the
BSI printout:

      |  Audio mode: 2/2  L,R,SL,SR

Here L is IN0, R is IN1, etc. Note that the input channel gain does not
affect the downmix matrix coefficients, while -C and -S do.
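
Conceptually, each row of the matrix gives the weights with which the input
channels are summed into one downmixed channel. The sketch below shows that
operation for a 2/2 stream. The coefficients are illustrative values chosen
for the example. They are not taken from the print-out above or from azid.

```python
def apply_downmix(matrix, inputs):
    """Multiply a downmix matrix with one sample from each input channel.

    matrix: dict mapping output channel name to a list of coefficients,
            one per input channel (IN0, IN1, ...).
    inputs: list of sample values in the same IN0, IN1, ... order.
    """
    result = {}
    for name, row in matrix.items():
        if len(row) != len(inputs):
            raise ValueError("row " + name + " does not match the input count")
        result[name] = sum(coeff * x for coeff, x in zip(row, inputs))
    return result


# Illustrative 2/2 -> 2/0 downmix: IN0=L, IN1=R, IN2=SL, IN3=SR.
matrix = {
    "L": [1.0, 0.0, 0.707, 0.0],
    "R": [0.0, 1.0, 0.0, 0.707],
}
print(apply_downmix(matrix, [1.0, 0.5, 0.2, 0.0]))
```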



-n BOOL, --norm=BOOL
--------------------

Default: false

This selects if the decoder should use dialog normalization reduction. The
normal dialogue level in a program is defined as a reference of loudness, 0db.
The BSI info variable "dialogue level" tells how far this dialogue level
is below 0db full-scale (FS), i.e. how much headroom there is above the
dialogue level before clipping.

One of Dolby's intentions with this variable is to ensure that all dialogue
levels are played back with the same volume, regardless of the program's
amount of headroom. It is good to have when the movie you're watching is
interrupted by a commercial break, where the headroom varies enormously.
(It prevents blowing your ears off when the break comes.)

This feature is implemented by attenuating everything such that all programs
have 31 db headroom, regardless of their original headroom. For a typical
program with 27db headroom (dialogue level -27db), this will cause a -4db
gain.
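
The arithmetic behind that figure, as a minimal sketch (the function name is
illustrative):

```python
TARGET_HEADROOM_DB = 31.0  # headroom every program is normalized to


def dialnorm_gain_db(dialogue_level_db):
    """Gain in db applied by dialog normalization.

    dialogue_level_db: the BSI dialogue level, e.g. -27.0 for a program
    whose dialogue sits 27 db below full scale.
    """
    headroom_db = -dialogue_level_db
    return headroom_db - TARGET_HEADROOM_DB


print(dialnorm_gain_db(-27.0))  # -4.0
print(dialnorm_gain_db(-31.0))  # 0.0
```

A program that already has 31 db headroom is left unchanged. Programs with
less headroom are attenuated by the difference.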



-N, --no-output
---------------

This option causes the decoder not to produce any output, neither to a
wav-file nor to the soundcard. This is ideal for running through the file to
check its validity. It requires no arguments.



-o SEQUENCE, --output=SEQUENCE
------------------------------

Default: l,r

This option controls the channels and the sequence of the output channels.
The selectable channels are all input channels (l,c,r,s,sl,sr and lfe)
and a special zero-data channel (0). Up to 6 channels may be listed with this
command.

The -d option controls what kind of decoding target to use. This -o option
controls which of these channels to output (and their sequence). Let's say
for example that you have a 4 channel soundcard. You would like to have left
and right on one of the outputs and surround left and surround right on the
other. To do this you must specify -ol,r,sl,sr.



-p PRESET, --preset=PRESET
--------------------------

Default: 2ch

Azid has some pre-defined settings. The default is 2/0 which all other
settings are derived from. The default command-prompt is: 

    -ezero -b1 -z1 -M0 -mstereo -ssurround -d2/0 -ol,r -L1 -l0
    -Cbsi -Sbsi -cnone -n0 -g1

(which is the same as using -p2ch and not using the -p option at all). The
pre-defined options are:

    o 2ch. This is the configuration for 2/0 channel decoding. This option
      is really redundant, since this is the default preset.

    o 4ch. This setting will produce a 4 channel output, 2/2. The command
      prompt equivalent is: -d2/2 -ol,r,sl,sr

    o 6ch. This setting will produce a 6 channel output, 3/2+lfe. The
      command prompt equivalent is: -d3/2 -L0 -l1 -ol,r,sl,sr,c,lfe
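
The relation between the presets and the default options can be modelled as
overrides applied on top of the default list. The sketch below uses only the
option strings given in this section. The function name and the data layout
are illustrative and say nothing about how azid stores its presets.

```python
DEFAULTS = [
    "-ezero", "-b1", "-z1", "-M0", "-mstereo", "-ssurround", "-d2/0",
    "-ol,r", "-L1", "-l0", "-Cbsi", "-Sbsi", "-cnone", "-n0", "-g1",
]

PRESET_OVERRIDES = {
    "2ch": [],
    "4ch": ["-d2/2", "-ol,r,sl,sr"],
    "6ch": ["-d3/2", "-L0", "-l1", "-ol,r,sl,sr,c,lfe"],
}


def expand_preset(name):
    """Return the full option list that a preset stands for."""
    # Key each option by its two-character switch (e.g. "-d") so that a
    # preset override replaces the matching default in place.
    options = {opt[:2]: opt for opt in DEFAULTS}
    for opt in PRESET_OVERRIDES[name]:
        options[opt[:2]] = opt
    return list(options.values())


print(" ".join(expand_preset("6ch")))
```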



-q, --no-logging
----------------

This option will disable the output logging. No BSI info, no settings, and no
bitstream errors will be shown. This option overrides the -b, -z and -M options.
It requires no argument.



-Q, --no-progress
-----------------

This option will disable the decoding progress indicator. It does not 
require any arguments.



-s STEREO_MODE, --stereo=STEREO_MODE
------------------------------------

Default: surround

When 2/0 decoding is selected, this option controls what kind of stereo
downmix should be applied.



-S LEVEL, --slevel=LEVEL
------------------------

Default: BSI

This command option controls the surround downmix level into the LR channels.
Normally, the BSI section contains a field which tells the decoder how to
downmix the surround channels into the LR channels. 

With this option, the user may override the BSI surround downmix level and 
specify a custom value. Note that this option is only active when the 
output decode mode (-d) is 2/x and the input stream is either x/1 or x/2.

Allowable values are gain values (either in db's or a positive numerical
value) or BSI. When BSI is selected, the surround downmix level gets its value
from the BSI section.



-z BOOL, --set-log=BOOL
-----------------------

Default: on

This option selects if the current settings should be printed in an easily
readable output, like this:

      +------ SETTINGS -----
      |  Input channel configuration:
      |    Left     :  None    compression, +0dB gain
      |    Center   :  None    compression, +0dB gain
      |    Right    :  None    compression, +0dB gain
      |    Sur Left :  None    compression, +0dB gain
      |    Sur Right:  None    compression, +0dB gain
      |    LFE      :  None    compression, +0dB gain
      |  Output configuration: 2/0
      |    Ch0 [Left     ]:  None    compression, +0dB gain
      |    Ch1 [Right    ]:  None    compression, +0dB gain
      |  Output Dual mono mode: Stereo
      |  Output Stereo mode: Dolby surround compatible
      |  LFE levels: To LR +0dB, To LFE -INF
      |  Center   mix level: +40.0dB
      |  Surround mix level: BSI
      |  Dialog normalization: No
      +---------------------

