Dynamic Range of 16-bit Audio

DigiSUN · Post by **DigiSUN** » Sat Dec 11, 2004 7:48 am

Hello,
I've recently been having a problem with the dynamic range calculation of digital audio. My fellow Pulsarians, maybe you can help me out.

Everyone says that 16 bit audio gives 96dB dynamic range. The formula is as follows:
20*log10(2^16) = 96dB (approx.)

But, 16 bit values cover the whole amplitude axis from 0dB on top, through the zero axis, down to 0dB again on bottom. Don't they?
(Take a look at the axis on the left on a wave graph, in Sound Forge, for example).

Shouldn't a dynamic range represent the range from 0dB down to -infinity (middle of the axis)? If so, then from -inf to 0dB we have only HALF of the axis: 15 bits. Therefore, 16bit audio should have only 20*log10(2^15) = 90dB...
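To put numbers on the two candidate figures, here is a minimal sketch. It assumes the usual base-10 logarithm and the "full scale versus one step" ratio; `dynamic_range_db` is just an illustrative helper name, not anything from an audio library.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Ratio of 2**bits quantization steps to a single step, in dB."""
    return 20 * math.log10(2 ** bits)

# Compare the "half axis" (15 bit) and "full axis" (16 bit) figures.
for bits in (15, 16):
    print(f"{bits} bits: {dynamic_range_db(bits):.2f} dB")

# Each extra bit doubles the number of steps, which adds about 6.02 dB.
print(f"per bit: {dynamic_range_db(1):.2f} dB")
```

So the 90dB and 96dB figures differ by exactly one bit's worth, roughly 6dB.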

Then why is a 16bit dynamic range treated as 96dB?

I hope I explained myself well enough...

Immanuel · Post by **Immanuel** » Sat Dec 11, 2004 10:23 am

EDIT: THIS POST IS OF COURSE COMPLETE NONSENSE. I JUST LET IT BE FOR OTHERS TO LEARN HOW NOT TO THINK OF THE SUBJECT.

/ME DIGS A CAVE TO GO HIDE IN

I see it as 8bit +wave and 8bit -wave ... if you get what I mean?

[ This Message was edited by: Immanuel on 2004-12-15 13:46 ]

deejaysly · Post by **deejaysly** » Sat Dec 11, 2004 12:58 pm

I might be wrong but thinking about it I would have thought DigiSUN is correct...

...so in terms of 0dB to -infinity you would only have 15 bits of dynamic range.

In this case the wave values are treated as "signed integers" with a value of zero being the -infinity point???

Blah! I could be way off the mark!!

deejaysly · Post by **deejaysly** » Sat Dec 11, 2004 1:00 pm

oops, but then again I understand what Immanuel is saying, which also makes sense.

I give up...............

Post by **garyb** » Sat Dec 11, 2004 1:32 pm

so what? that's plenty in either case. lps(vinyl) aren't near that and everyone LOVES how they sound...

that's what limiters are for.....

[ This Message was edited by: garyb on 2004-12-11 13:32 ]

symbiote · Post by **symbiote** » Sat Dec 11, 2004 4:16 pm

Actually Immanuel, that would be a 15-bit +wave and 15-bit -wave. An 8-bit +wave and -wave would take 9 bits to encode/represent. With a 16-bit signal, the values between 0 and +0dB oscillate between 0 and 32767 (15 bits), not 0 and 255 (8 bits).

Otherwise, DigiSUN, I think what you propose might make some sense if both sides (over and under 0/-infinity) were a mirror of each other and had 0 DC offset, but a signal really doesn't have to be centered around 0 (of course, it helps to have it centered if you don't want your speakers to explode), nor have identical "top" and "bottom" parts, hence you have to take the whole 16 bits when calculating the whole dynamic range of the encoded signal.

at0m · Post by **at0m** » Sat Dec 11, 2004 4:59 pm

Exactly. It's still a 16bit=2^16=65536 range for each sample, not half of that.

Then about the audible dynamic range that it can produce... I suppose every bit removed halves the range, which costs about 6dB. With the value at max, you have 0dB. So going down 16 times at 6dB per bit, you end up at -96dB. Yes, these 96dB cover the + and - range, so it's 90dB per halfwave. But we do listen to both halves.

Much more interesting than this dynamic range discussion, DigiSUN, I found your view on Nyquist theory.

For the others here, when I was in Israel we had a discussion when DigiSUN asked about what would be produced at, let's say, 1/4 of the sample rate (half of Nyquist), for example at 11kHz for a 44.1kHz samplerate.

I'll take 1 waveform cycle as an example. 1 waveform would consist of 4 samples. For a triangular waveform, these samples could be 0,-1,0,+1. But why would it not be -0.5, -0.5, +0.5, +0.5? This randomness can lead to great instability of our waveform, in extremis making our triangle a suboctave square oscillator.
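A minimal sketch of that phase dependence, assuming an ideal unit triangle wave sampled exactly four times per cycle. The `triangle` and `four_samples` helpers are made up for illustration, and the offsets are simply the ones that reproduce the two sample sets above.

```python
def triangle(phase: float) -> float:
    """Unit triangle wave. Phase is in cycles: 0 at phase 0, +1 at 0.25, -1 at 0.75."""
    p = phase % 1.0
    if p < 0.25:
        return 4 * p          # rising from 0 to +1
    if p < 0.75:
        return 2 - 4 * p      # falling from +1 to -1
    return 4 * p - 4          # rising from -1 back to 0

def four_samples(offset: float) -> list:
    """Sample one cycle at 4 points, starting `offset` cycles into the wave."""
    return [triangle(offset + n / 4) for n in range(4)]

print(four_samples(0.5))    # sampling lands on zero crossings and peaks
print(four_samples(0.625))  # same wave, sampling shifted by 1/8 of a cycle
```

Same waveform, same frequency, and only the sampling offset changes which four numbers you get.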

The benefit of higher samplerates for synthesis became much more obvious when I realised that!

To return to the topic, of 16bit numbers: If there's 65536 possible values for a 16bit number, then where is the 0? A middle point requires an uneven number.

This may look silly, but it can be a pain in the butt if you have to use that for some more scientific calculations eh.

astroman · Post by **astroman** » Sat Dec 11, 2004 5:37 pm

On 2004-12-11 16:59, at0m|c wrote:
...about what would be produced at, let's say, 1/4 of the sample rate (half of Nyquist), for example at 11kHz for a 44.1kHz samplerate.

I'll take 1 waveform cycle as an example. 1 waveform would consist of 4 samples. For a triangular waveform, these samples could be 0,-1,0,+1. But why would it not be -0.5, -0.5, +0.5, +0.5? This randomness can lead to great instability of our waveform, in extremis making our triangle a suboctave square oscillator...

right - it's left to the listener ('s fantasy)
On 'Kelly Watch the Stars' by the French duo Air there is a sound perceived as a Theremin (pure sine wave afaik). According to an interview with the band it's made with a Casio SK1 toy sampler - at 8kHz sample rate

high dynamic ranges are nothing but a production advantage imho - you just have a wider range of possibilities to sort things out during a mix.

For the final product there's no relevance at all.
Contemporary music is way too much mixed 'into the face' and vinyls, as GaryB already pointed out, cannot have > 70dB dynamic range.
And even in classical concerts one cannot make use of a 120 dB physical capability due to a generally undisciplined audience - let alone if you live in 'regular' locations of a city...

cheers, Tom

DigiSUN · Post by **DigiSUN** » Sun Dec 12, 2004 11:35 am

Thank you all for your replies

At0m|c, to visualize the Nyquist problem, allow me to pull it further toward the extreme:
According to Nyquist, the sampling frequency should be (at least) twice the wave frequency. Suppose we sample a 22.05kHz sine wave with a 44.1kHz sampling rate: if our sampling point falls at the zero axis, we get to sample only the zero-crossings!
If, however, our sampling point falls on the stationary (min/max) points, we get a triangle wave.
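A rough sketch of that extreme case, assuming an ideal sine at exactly half the sampling rate, so the phase advances by pi per sample. `sample_at_half_rate` is a hypothetical helper, and the three phases are arbitrary picks.

```python
import math

def sample_at_half_rate(phase_rad: float, n_samples: int = 6) -> list:
    """Samples of a unit sine at exactly fs/2: phase advances by pi each sample."""
    return [math.sin(math.pi * n + phase_rad) for n in range(n_samples)]

# Three sampling phases: on the zero crossings, on the peaks, and in between.
for label, phase in (("zero crossings", 0.0),
                     ("peaks", math.pi / 2),
                     ("in between", math.pi / 6)):
    rounded = [round(v, 3) for v in sample_at_half_rate(phase)]
    print(f"{label:>14}: {rounded}")
```

In every case the samples just alternate between +sin(phase) and -sin(phase), so from the samples alone you cannot tell a full-scale sine caught off-peak from a quieter sine caught on its peaks.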

As for the Dynamic Range, it still seems weird to me: if the silence falls on the -inf (at optimal conditions), hence - middle axis, then only half the axis should be taken when calculating Signal-to-noise ratio (or should I say... signal-to-silence)... Isn't that what a dynamic range is all about?

Maybe the declaration of the middle axis as the silence point isn't 100% accurate, because if I have a maximized square wave, the rising and falling go all the way from bottom to top, hence: 16bit dynarange...
After all, the zero-crossing is -inf dB only at the best equilibrium (when there's no DC offset)...
Hmmm... I wish I had Hubird's smiley inventory

[ This Message was edited by: DigiSUN on 2004-12-12 11:38 ]

symbiote · Post by **symbiote** » Sun Dec 12, 2004 5:04 pm

I'm not sure exactly why you would only want to use half the signal to measure SNR or dynamic range or whatnot. Your discussion/perspective applies only to symmetrical signals centered around the "center" axis. In that case, then yeah, I guess you could use only half of the wave to qualify the "dynamic range" of "16-bit signals" (in actuality, you would be using only 15 bits, hence your 90dB figure.)

In practice, the "point of silence" really doesn't have to be at "-inf" or right in the middle. If you feed a straight DC offset into a speaker, it won't make any noise, the cone will remain at a stable position. Silence isn't categorized/qualified by a position on the axis, but by a presence or absence of movement on the medium/mechanism (aka, signal.)

That said, in practice, if you feed a straight DC offset to a speaker, it will most likely get damaged, explode, and/or catch fire, which might be a bit noisy.

Still in practice, your half-wave dynamic range measure would only be useful for balanced signals, and not the whole of the signals that can be represented in 16 bit.

About Nyquist, you are right that you would have a problem sampling a pure 22050hz tone with a 44100hz sampling rate, since it's the "limit" value. The theorem says that all signals *below* half-the-sampling-frequency can be reconstructed almost perfectly. With a signal of exactly half-the-sampling-rate, it depends on where you will sample the signal. With all other frequencies below 22050hz, you won't have that kind of syncing effect and you will get much better reconstruction.

If the signal exceeds half the sample rate, it will get mirrored back into the sampled spectrum. For example, a 22051hz frequency will appear at around 22049hz (not that you can tell the difference with your current auditive apparatus), while a say 30000hz signal will get aliased at around 14khz.
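The mirroring can be sketched in a few lines, assuming an ideal sampler with no anti-aliasing filter in front of it. `alias_frequency` is an illustrative helper, not a library function.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Frequency at which a tone of f_hz shows up after sampling at fs_hz."""
    f = f_hz % fs_hz                 # the sampled spectrum repeats every fs
    if f > fs_hz / 2:
        return fs_hz - f             # upper half folds back below fs/2
    return f

for tone in (1000, 22051, 30000):
    print(f"{tone} Hz -> {alias_frequency(tone, 44100)} Hz")
```

With a 44100hz rate this gives 22049hz for the 22051hz tone and 14100hz for the 30000hz one, while anything already below 22050hz comes out unchanged.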

So yeah, 44.1khz is a bit of a small value, I guess for the time it was considered "good enough", but you'll definitely still hear some artifacts from it. I'd still rate it at fairly below "annoyance" level, just slight coloring, which can be altered a bit given the converters and signal path you use at reconstruction and things.

All that human-ears-max-out-at-20khz stuff is, as far as I know, based on studies made in the 20s involving "pure signals", like say I feed you a straight 30khz signal by itself, you won't hear it. Great. But what if that 30khz signal is a higher harmonic of a much lower signal that you are actively hearing and listening to? I suspect then that it might play a much bigger role than a 30khz signal by itself, and that's partly why vinyl sounds so good even though the dynamic range is much smaller than CD and so on.

So yeah, for the average untrained ear, CD sampling rate is alright; with a bit of training, you know, it's like everything else, you can push it a bit further.

Mmm rambling, need coffee

symbiote · Post by **symbiote** » Sun Dec 12, 2004 5:37 pm

Try it like this:

What's the biggest dynamic change that can happen on a 16-bit signal? 2^16 steps.

What's the smallest non-zero dynamic change that can happen on a 16-bit signal? 1 step.

So what's the dynamic range? 1/2^16.

In dB, 20 * log10(1/2^16) = -96.329598612473982468396446311838dB. (approx)

Same idea applies to your Signal to Noise ratio calculation. Why would you only want to take half the signal? The lowest noise level could easily oscillate between 0 and 1 (or 0 and -1.) With your half-range calculation, you would invariably assume that the noise would, at the minimum, oscillate between -1 and 1, since you assume a signal is always centered around 0 and oscillate at the same "distance" below and above the "center axis".

The mathematics/equations are made to cover the whole range of possibilities in the signal's behavior, not just the ideal case of signals that are perfectly balanced around a center axis.

DigiSUN · Post by **DigiSUN** » Mon Dec 13, 2004 6:04 am

Symbiote... you certainly gave me quite a lot of material for thought... THANKS!

And thanks again to all of you...

(This post doesn't intend to lock the topic. More replies are always welcome...)

DigiSUN · Post by **DigiSUN** » Fri May 12, 2006 3:27 am

Dunno exactly why I decided to bring this topic back from the dead...
Just wanted to clean up the mess about the Nyquist theorem: (My discussion with at0m)
If we sample a 22kHz sine wave at 44kHz, then we can, theoretically, restore the sine wave exactly as-is. One can draw only a single sine through the two/three sampling points (except the single extreme case of sampling only the two zero-crossings).
If getting a triangle wave in restoration would be an option - then we should have considered that a triangle wave contains HIGHER frequency harmonics than its fundamental - and therefore the Nyquist frequency should have been initially set higher than 44kHz.
So it may sound vain to say but... Nyquist was right!

[ This Message was edited by: DigiSUN on 2006-05-12 04:28 ]

Shroomz~> · Post by **Shroomz~>** » Mon May 15, 2006 2:53 am

Nyquist is always right & I've also heard it said that it's your friend not foe

Atom said:
To return to the topic, of 16bit numbers: If there's 65536 possible values for a 16bit number, then where is the 0? A middle point requires an uneven number.

Atom, is not 0-65536 = 65537 values ??

at0m · Post by **at0m** » Mon May 15, 2006 3:44 am

A multiple of 2 is always even.

[edit: elaborated a bit]
Let's try 2^2. With this, you can form:
00
01
10
11
...which offers 4 possibilities, as one would expect, since 2^2=4.

00 isn't necessarily 0, it's the first number.

For audio waves, which are bipolar, 00 and 01 could make the negative part of the wave, and 10 and 11 could represent the positive part. Or 00 can be negative, 01 could be center, and 10 and 11 could be positive. That's just a matter of protocol, but there's not much in between.

Since 2^16 is a fairly large number, the consequence of that imbalance isn't audible.

But it presents us with a bunch of issues when trying to do it mathematically correctly, for example -(-2^15) is not the same as 2^15, unless you don't use a 0 or centerpoint - in binary at least.
This is how it could be done, at low level: any number starting with a 0 is negative, and any number starting with a 1 is positive. But where's the 0 ?

For us, humans in the decimal system, we just add the 0, but the digital world doesn't know 2^16+1, it doesn't fit in, unless you'd start working with 2^17, but that would complicate things way too much.
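To make the asymmetry concrete, here's a small sketch. It assumes two's complement, which is what 16-bit PCM in WAV files uses, rather than the "starts with 0 is negative" protocol sketched above. `wrap_int16` is a made-up helper that mimics what 16-bit hardware does on overflow.

```python
INT16_MIN = -2 ** 15      # -32768
INT16_MAX = 2 ** 15 - 1   #  32767

def wrap_int16(x: int) -> int:
    """Wrap any integer into the 16-bit two's-complement range."""
    return (x + 2 ** 15) % 2 ** 16 - 2 ** 15

# 32768 negative values + zero + 32767 positive values = 65536 codes.
negatives = -INT16_MIN
positives = INT16_MAX
print(negatives, "+ 1 +", positives, "=", negatives + 1 + positives)

# Negating the most negative value does not fit and wraps back onto itself.
print(wrap_int16(-INT16_MIN))
```

So the 0 is there, it just sits on the "positive" side of the bit pattern, which leaves one more negative code than positive ones.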

_________________
more has been done with less

[ This Message was edited by: at0m on 2006-05-15 05:23 ]

MD69 · Post by **MD69** » Mon May 15, 2006 4:52 am

Hi,

This discussion looks more and more esoteric !
Wonder what Humpty Dumpty would have said about it

and to add a bit:
What do we count? The trees or the intervals between the trees?

cheers

symbiote · Post by **symbiote** » Mon May 15, 2006 6:46 am

edit: mm never post before morning coffee

at0m you are right, but the "middle" thing is specific to audio signals, and with a wide enough wordlength and/or fast enough sampling rate, you can minimize this to the point where it's not audible and doesn't matter too much.

[ This Message was edited by: symbiote on 2006-05-15 07:56 ]

symbiote · Post by **symbiote** » Mon May 15, 2006 7:03 am

as a side note, it also depends on the electronics/circuitry you feed those values into. i.e. you could use 1 bit for sign, which would bundle 2 values together (i.e. 0 and -0) and you'd get a nice uneven number of steps.
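A quick sketch of that sign-bit idea, assuming a plain sign-magnitude layout (1 sign bit plus 15 magnitude bits). `sign_magnitude_values` is purely illustrative.

```python
def sign_magnitude_values(bits: int) -> set:
    """Distinct values of a sign-magnitude word: 1 sign bit, bits-1 magnitude bits."""
    magnitudes = range(2 ** (bits - 1))
    # +0 and -0 are two different bit patterns but compare equal,
    # so the set keeps a single zero.
    return {sign * m for sign in (1, -1) for m in magnitudes}

values = sign_magnitude_values(16)
print(len(values), min(values), max(values))
```

That leaves 65535 distinct levels running symmetrically from -32767 to +32767, with a true middle at 0, at the cost of one wasted code.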