I had to make some notes for an upcoming assessment at college and thought it would be a good idea to turn them into a useful post. This information can be collected separately from other places (see references), but this short compendium might be useful for the uninitiated or for anyone else who might be interested.
Frequency is the rate at which an acoustic wave repeats one full cycle of positive and negative amplitude, measured in hertz (Hz, cycles per second). If the number of cycles within a fixed time period increases, the frequency increases and the perceived pitch of the sound is higher. Similarly, when the frequency decreases, the perceived pitch of the sound is lower.
The average human hearing range is between 20Hz and 20kHz.
What is digital audio
Digital audio is a representation of sound that has been recorded in, or converted into, digital (binary) form, in which the signal is represented by numerical samples.
The binary system is used in most electronics and computer systems. It represents data using only two values, 0 and 1, as opposed to the everyday decimal system, where values are represented using the digits 0 to 9.
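To make the decimal versus binary idea concrete, here is a minimal sketch using Python's built-in conversions. The helper names `to_binary` and `from_binary` are made up for this example.

```python
def to_binary(value: int, width: int = 8) -> str:
    """Return a non-negative integer as a zero-padded string of 0s and 1s."""
    return format(value, f"0{width}b")


def from_binary(bits: str) -> int:
    """Interpret a string of 0s and 1s as a base-2 number."""
    return int(bits, 2)


print(to_binary(5))         # 00000101
print(from_binary("1010"))  # 10
```

The same quantity is simply written with a different set of digits: decimal 5 is 101 in binary (4 + 0 + 1).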
Analogue to digital sampling
The digital recording process takes periodic samples of a changing analogue audio signal and translates these samples into a representative binary form that can be stored or manipulated before the sound is reproduced.
The sample rate is the number of samples taken from the audio signal in one second. A sample rate of 48kHz means 48000 samples per second, or one sample every 1/48000 of a second.
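The relationship between sample rate and time per sample is simple arithmetic. A rough sketch, with function names chosen only for illustration:

```python
def sample_period(sample_rate_hz: float) -> float:
    """Time between two consecutive samples, in seconds."""
    return 1.0 / sample_rate_hz


def samples_in(duration_s: float, sample_rate_hz: float) -> int:
    """Number of samples taken during duration_s seconds."""
    return round(duration_s * sample_rate_hz)


print(sample_period(48_000))    # roughly 20.8 microseconds
print(samples_in(1.0, 48_000))  # 48000
```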
Sample rate clock
The circuitry responsible for the timing of each sample taken during the sampling process.
Sample and hold
During the digital sampling of an analogue signal, a sample is taken at precise intervals (defined by the sample rate). Each time a sample is taken, the analogue signal is “held”, almost like a snapshot of it. This is because digital circuitry needs a fixed measuring point that it can convert into a binary value. After the conversion, the process releases the hold and moves on to the next sample.
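The snapshot idea can be sketched in a few lines. This is only a software illustration of the concept, not a model of a real sample and hold circuit. The test signal (a unit sine wave) and the function names are assumptions made for the example.

```python
import math


def sample_sine(freq_hz: float, sample_rate_hz: float, num_samples: int) -> list[float]:
    """Take snapshots of a unit sine wave at equal intervals of 1/sample_rate_hz."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(num_samples)
    ]


def held_value(samples: list[float], sample_rate_hz: float, t: float) -> float:
    """Value being 'held' at time t >= 0: the most recent snapshot at or before t."""
    index = min(math.floor(t * sample_rate_hz), len(samples) - 1)
    return samples[index]


# A 1Hz sine sampled 8 times per second: between 0.25s and 0.375s
# the held value stays at the snapshot taken at 0.25s (the peak of the wave).
snapshots = sample_sine(freq_hz=1.0, sample_rate_hz=8.0, num_samples=8)
print(held_value(snapshots, 8.0, 0.30))
```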
Jitter is a phenomenon that appears during the sampling process and is caused by deviations in the sample rate clock timing. Sampling must happen at equal intervals of time. When the timing is not accurate, the samples are not taken where they are supposed to be, which becomes a problem when the analogue signal is reconstructed. Jitter is audible as distortion of the sound.
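To get a feel for why timing errors matter, the sketch below samples a sine wave twice: once at the ideal instants and once with a small random timing offset on every sample. Modelling jitter as uniform random noise is an assumption made purely for illustration; real clock jitter can behave differently. The function names are hypothetical.

```python
import math
import random


def sample_with_jitter(freq_hz, sample_rate_hz, num_samples, jitter_s, seed=0):
    """Return (ideal, jittered) samples of a unit sine wave.

    Each jittered sample is taken up to jitter_s seconds too early or too late.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    ideal, jittered = [], []
    for n in range(num_samples):
        t = n / sample_rate_hz
        offset = rng.uniform(-jitter_s, jitter_s)
        ideal.append(math.sin(2 * math.pi * freq_hz * t))
        jittered.append(math.sin(2 * math.pi * freq_hz * (t + offset)))
    return ideal, jittered


def worst_error(ideal, jittered):
    """Largest amplitude difference caused by the timing errors."""
    return max(abs(a - b) for a, b in zip(ideal, jittered))


ideal, jittered = sample_with_jitter(10_000, 48_000, 100, jitter_s=1e-6)
print(worst_error(ideal, jittered))
```

The higher the signal frequency, the faster the waveform changes, so the same timing error produces a larger amplitude error.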
The Nyquist theorem states that, in order to adequately translate an analogue signal into binary form, the sample rate has to be at least twice as high as the highest frequency (of the analogue signal) to be recorded. E.g. an audio signal with a bandwidth of 20kHz (the upper limit of human hearing) needs a minimum sampling rate of 40kHz.
The Nyquist frequency is the highest frequency that can be captured without aliasing at a given sample rate, i.e. one half of the sample rate.
The Nyquist rate is twice the highest frequency to be sampled.
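The two Nyquist terms are easy to mix up, so here is a small sketch of both (the function names are just for this example). One starts from the sample rate, the other from the signal.

```python
def nyquist_frequency(sample_rate_hz: float) -> float:
    """Highest frequency that a given sample rate can capture without aliasing."""
    return sample_rate_hz / 2


def nyquist_rate(highest_freq_hz: float) -> float:
    """Minimum sample rate needed for a signal with the given highest frequency."""
    return 2 * highest_freq_hz


print(nyquist_frequency(44_100))  # 22050.0
print(nyquist_rate(20_000))       # 40000
```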
Aliasing, alias frequencies
If we use a lower sample rate than the Nyquist theorem requires, unwanted, erroneous frequencies are introduced into our signal. These are called alias frequencies and can be heard as distortion. The phenomenon is called “aliasing”.
The portion of the signal that is higher than the Nyquist frequency reappears as lower frequencies, which are audible as distortion.
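Where a too-high frequency reappears can be worked out by “folding” it back around the Nyquist frequency. A minimal sketch for a single pure tone, with a hypothetical function name:

```python
def alias_frequency(freq_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a sampled pure tone appears after sampling.

    Tones at or below half the sample rate come back unchanged;
    anything higher is folded back into the range 0..sample_rate_hz/2.
    """
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(alias_frequency(10_000, 40_000))  # 10000 (below Nyquist, unchanged)
print(alias_frequency(30_000, 40_000))  # 10000 (folded back, an alias)
print(alias_frequency(25_000, 44_100))  # 19100
```

After sampling, the 30kHz tone is indistinguishable from a genuine 10kHz tone, which is why aliasing cannot be repaired afterwards and has to be prevented before the converter.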
In order to reduce the undesirable effects of aliasing, we have to introduce an anti-aliasing filter into our signal path. In essence, an anti-aliasing filter is a simple (simple in electronics terms) low pass filter.
A low pass filter is an electrical circuit that filters out (blocks) frequencies above a set threshold. The ideal filter would be a “brick wall” filter with total attenuation above the required cut-off frequency; however, it is technically impossible to construct such a circuit.
For that reason, a slightly higher sample rate has to be chosen than is strictly required. This is because there is an attenuation slope and a slight delay before the filter is fully effective (basically because of the use of capacitors).
As an example, a sample rate of 44.1kHz is used in order to accurately translate a bandwidth of up to 20kHz.
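To see why a real filter is a slope rather than a brick wall, the sketch below uses the textbook response of a first-order RC low pass filter. This is an assumption made for illustration: it is the simplest possible filter, and practical anti-aliasing filters are far steeper. It also computes the spare room between the audio band and the Nyquist frequency. The function names are hypothetical.

```python
import math


def rc_lowpass_gain_db(freq_hz: float, cutoff_hz: float) -> float:
    """Gain in dB of an ideal first-order RC low pass filter at freq_hz."""
    ratio = freq_hz / cutoff_hz
    return -10 * math.log10(1 + ratio ** 2)


def guard_band_hz(sample_rate_hz: float, audio_bandwidth_hz: float) -> float:
    """Room between the top of the audio band and the Nyquist frequency."""
    return sample_rate_hz / 2 - audio_bandwidth_hz


for f in (1_000, 20_000, 40_000, 200_000):
    print(f, round(rc_lowpass_gain_db(f, cutoff_hz=20_000), 2))

print(guard_band_hz(44_100, 20_000))  # 2050.0
```

Even at ten times the cut-off frequency this simple filter only attenuates by about 20dB, so the attenuation builds up gradually. The 2.05kHz gap at 44.1kHz is the room the filter has to go from passing the signal to blocking it.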
Another way of reducing the effects of aliasing is oversampling, in which the number of samples taken is multiplied by a specific factor.
Quantisation represents the amplitude of a continuous analogue wave with discrete values. The process translates the analogue voltage levels (at the sampling points) into binary digits for digital processing.
An analogue audio signal has an infinite number of possible amplitude values. In digital audio, however, we have to express the amplitude as a precise number. The bit depth limits the number of values we can record in each sample; 16, 24 and 32 bit are the most common bit depths. We can think of it as the maximum resolution of our digital sampler: the higher the resolution, the more accurately the analogue values can be determined. The number of available values for an n-bit integer is 2^n, e.g. a 16bit depth gives us 2^16 or 65536 different values.
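The number of available values grows very quickly with bit depth. A quick check, using a made-up helper name:

```python
def quantisation_levels(bit_depth: int) -> int:
    """Number of distinct values an integer of bit_depth bits can represent."""
    return 2 ** bit_depth


for bits in (16, 24, 32):
    print(bits, quantisation_levels(bits))
# 16 65536
# 24 16777216
# 32 4294967296
```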
The ratio between the signal and the quantisation error is known as the signal to error ratio. During the sampling process we convert “held” continuous values into the digital, binary domain. Depending on the bit depth, an element of error is introduced into our signal: because only a limited number of values is available in binary, a small proportion of each analogue value is rounded up or down to the nearest level (the last bit becomes 0 or 1).
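The rounding error can be shown with a small sketch. It assumes the analogue value has been scaled to the range -1.0 to 1.0 and is stored as a signed integer. That is a common convention, but it is my assumption here, not something a particular converter is guaranteed to do. The function names are hypothetical.

```python
def quantise(value: float, bit_depth: int) -> tuple[int, float]:
    """Round a value in -1.0..1.0 to the nearest level of a signed integer.

    Returns the integer code and the value that code represents.
    """
    scale = 2 ** (bit_depth - 1)
    code = round(value * scale)
    code = max(-scale, min(scale - 1, code))  # stay inside the integer range
    return code, code / scale


def quantisation_error(value: float, bit_depth: int) -> float:
    """Difference between the stored value and the original value."""
    _, approx = quantise(value, bit_depth)
    return approx - value


print(quantise(0.3, 8))             # (38, 0.296875)
print(quantisation_error(0.3, 8))   # about -0.0031
print(quantisation_error(0.3, 16))  # about -0.000012
```

As long as the value is not clipped, the error is never larger than half of one step, and the step gets smaller with every extra bit.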
The theoretical dynamic range can be calculated from the signal to error ratio as approximately 6n+1.8 (dB), where “n” is the bit depth (more precisely, 6.02n+1.76 dB).
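Plugging the common bit depths into the formula gives a feel for the numbers. The sketch uses the more precise textbook coefficients 6.02 and 1.76 (the rounded 6n+1.8 version gives nearly the same result); the function name is made up.

```python
def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range of an ideal converter, in dB."""
    return 6.02 * bit_depth + 1.76


for bits in (16, 24):
    print(bits, round(dynamic_range_db(bits), 2))
# 16 98.08
# 24 146.24
```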
Dither and distortion
As a result of the quantisation error, an element of harmonic distortion is introduced into our signal. This is because each value is rounded up or down at the least significant bit (LSB) of the equivalent binary value, which means there is roughly a 50% chance that this bit is encoded with the wrong value. If we look at the shape of the signal, it appears “squared off” instead of curved.
The effect of this harmonic distortion can be suppressed by introducing noise, or dither, into the signal. This random element of noise provides a mathematical probability curve that allows the A/D circuit to better determine whether the least significant bit should be 0 or 1. The noise or dither can also be shaped (noise shaping), in other words calculated in order to provide the best LSB levels from a statistical perspective.
The introduction of noise is also useful when we “truncate”, or reduce, the bit depth from a higher to a lower value, which introduces new quantisation error.
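The “probability curve” idea is easier to see with numbers. In the sketch below a steady level of 0.3 LSB is quantised many times. Without dither it is rounded to 0 every time and the information is lost. With a small amount of random noise added before rounding, it comes out as 1 about 30% of the time, so the average of the output follows the true level. Uniform noise of plus or minus half an LSB is used to keep the example simple; that choice, the fixed random seed and the function name are all assumptions for illustration, and real dither is often shaped differently.

```python
import random


def average_quantised(value_lsb: float, num_samples: int, dither: bool, seed: int = 0) -> float:
    """Quantise the same level repeatedly and return the average output.

    value_lsb is expressed in units of one least significant bit.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    total = 0
    for _ in range(num_samples):
        noise = rng.uniform(-0.5, 0.5) if dither else 0.0
        total += round(value_lsb + noise)  # the quantiser: round to a whole LSB
    return total / num_samples


print(average_quantised(0.3, 20_000, dither=False))  # 0.0
print(average_quantised(0.3, 20_000, dither=True))   # close to 0.3
```

The price is a little added noise on every individual sample, which is generally much less objectionable than distortion that follows the signal.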
Fun fact: during the digital conversion of a tape recording, the noise is added “naturally” by the internal noise of the tape machine.
References
David Miles Huber, Robert E. Runstein, Modern Recording Techniques, 6th Edition, Focal Press, Elsevier, 2005
BBC, Binary Data and Representation, 2020, <https://www.bbc.co.uk/bitesize/topics/zd2xsbk>
Art Pini, The Basics of Anti-Aliasing Low-Pass Filters (and Why They Need to be Matched to the ADC), Digi-Key’s North American Editors, 2020, <https://www.digikey.co.uk/en/articles/the-basics-of-anti-aliasing-low-pass-filters>
Bob Katz, Jitter, Digital Domain, <https://www.digido.com/portfolio-item/jitter/>
Yuri Korzunov, What is jitter in audio?, 2017, <https://headfonics.com/what-is-jitter-in-audio/>
Capturing Images, <http://microscopy.berkeley.edu/courses/dib/sections/02Images/sampling.html>
Nyquist Theorem, chapters from various books, <https://www.sciencedirect.com/topics/engineering/nyquist-theorem>
Nyquist frequency, <https://www.gatan.com/nyquist-frequency>
Jon Hodgson, The “Sound” of Performance Data Part 2 – Aliasing, 2018
Griffin Brown, Digital Audio Basics: Sample Rate and Bit Depth, iZotope, 2019
Ian Stewart, What Is Dithering in Audio?, iZotope, 2020