A reader has asked about our use of sound signatures in reviews of headphones, TVs, soundbars, speakers, smartphones and other sound devices. The answer is that a sound signature is a consistent way to describe a device’s ability to reproduce sound. By comparison, a reviewer’s subjective comments from listening to their favourite soundtracks, although important, can vary wildly.
Why sound signature? Audio quality depends on so many things
– the bit rate, sample rate, file format, and speaker construction. It also
depends on the ability of the encoder to get the important bits right and the reviewer’s
Signature 101 and why we use it to rate sound devices
Frequency response – the human hearing range
Humans generally hear frequencies from 20Hz (bass) to 20kHz
(treble). As you get older, the top end usually falls off progressively – it is
not uncommon for the elderly to hear only up to a limit as low as 3kHz. That is
why hearing aids generally don’t boost volume, but boost lost frequency ranges.
But sound is a combination of hearing (tones and harmonics within
the eardrum) and feeling (subjective impressions that depend on musical
associations, your mood at the moment, and a bunch of oh-so-human indefinables).
Oh, and throw in spatial variables like sound stage/separation (left/right, up/down,
forward/behind), echo (the reflection off walls), speed of sound and timbre and
you wonder how we can hear at all via two small ear canals.
Well, we have the world’s most powerful non-AI computer to
post-process, fill in the gaps and make the most of whatever we hear.
That frequency response covers:
Deep Bass: 16/20-40Hz – which you can often feel
more in your body than you hear in your ears
Midbass: 40-100Hz – if this is intact, you will
be getting just about all the musically important bass
Upper Bass: 100 to 200Hz – most small sound
devices, like portable Bluetooth speakers, start here
Mid: 200Hz-4kHz – this is where the action is; it
covers the human voice and is the area where our ears are most sensitive … even
as we age.
Upper Treble: 4-10kHz – this defines the character
of the sound. Its absence makes the sound dull.
Dog whistle – top octave: 10-20kHz – you can’t
generally hear this, but you know if it is missing. Its presence improves the
sense of direction in the sound and provides a feeling of “air”, a reality as
though the music were really there, rather than merely reproduced.
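As a rough illustration of the bands listed above, here is a minimal Python sketch that maps a frequency to its band name. The band edges come straight from the list; the function name `band_for`, the lower-case labels and the choice to treat each upper edge as belonging to the next band are assumptions made for this example only.

```python
# Band edges in Hz, taken from the list above.
# Deep bass is given as 16/20-40Hz, so 16Hz is used as the lowest edge here.
BANDS = [
    (16, 40, "deep bass"),
    (40, 100, "midbass"),
    (100, 200, "upper bass"),
    (200, 4_000, "mid"),
    (4_000, 10_000, "upper treble"),
    (10_000, 20_000, "top octave"),
]


def band_for(freq_hz: float) -> str:
    """Return the band name for a frequency, or 'out of range'."""
    for low, high, name in BANDS:
        if low <= freq_hz < high:
            return name
    # Assumption: the very top edge (20kHz) still counts as the top octave.
    if freq_hz == 20_000:
        return "top octave"
    return "out of range"


if __name__ == "__main__":
    for f in (30, 80, 150, 440, 6_000, 15_000, 25_000):
        print(f, "Hz ->", band_for(f))
```

Concert pitch A (440Hz), for example, lands in the mid band, which is one reason voices and most instruments survive even on small speakers.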
Six sound signatures describe the natural state of the sound device.
Of course, you can have a combination of two or more, and
many have equaliser and sound profile apps that can change the signature
entirely, often resulting in ‘frankensound’.
Balanced: (bass boosted, mid recessed, treble
boosted) also called V-shaped and the default on many devices – despised by
Bass: (bass boosted, mid/treble recessed) – for
bass music but can sound boomy or muddy compared to warm and sweet
Warm and Sweet (bass/mid boosted, treble
recessed) – the nirvana for most music and movies
Bright Vocal (bass recessed, mid/treble boosted)
– thought to be for vocal and string instruments, but actually makes them harsh
Analytical: (bass/mid recessed; treble boosted)
– crisp but can be overly harsh and not pleasant for most music
We like to add a seventh – flat or neutral that neither adds
nor subtracts from the native music, but this is rare – if it exists at all!
Where possible, we test with the Equaliser (EQ) set to flat.
So, if we say something is warm and sweet, you can count on
it for good music.
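To make the boosted/recessed descriptions above concrete, here is a hedged Python sketch that turns three band levels (bass, mid, treble, in dB) into one of the signature names listed. It is not our measurement software: the function name `classify_signature`, the 1dB flat tolerance and the rule "boosted means above the average of the three bands" are all assumptions chosen for illustration.

```python
# (bass boosted?, mid boosted?, treble boosted?) -> name, as per the list above.
SIGNATURES = {
    (True, False, True): "Balanced (V-shaped)",
    (True, False, False): "Bass",
    (True, True, False): "Warm and Sweet",
    (False, True, True): "Bright Vocal",
    (False, False, True): "Analytical",
}


def classify_signature(bass_db, mid_db, treble_db, flat_tolerance_db=1.0):
    """Name the signature from three band levels in dB.

    Assumption: a band is 'boosted' if it sits above the mean of the
    three bands and 'recessed' otherwise. If all three bands are within
    flat_tolerance_db of each other, the result is 'Flat'.
    """
    levels = (bass_db, mid_db, treble_db)
    if max(levels) - min(levels) <= flat_tolerance_db:
        return "Flat"
    mean = sum(levels) / len(levels)
    shape = tuple(level > mean for level in levels)
    # Shapes that are not in the list above are reported as unclassified.
    return SIGNATURES.get(shape, "Unclassified")


if __name__ == "__main__":
    print(classify_signature(4, -2, 3))   # bass and treble up, mid down
    print(classify_signature(3, 2, -4))   # bass and mid up, treble down
    print(classify_signature(0.2, 0, -0.3))
```

A real device rarely fits one box perfectly, which is why a combination of two signatures is often the more honest description.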
Most sound devices are naturally mid-centric and use some form of psycho-acoustic trickery via ‘tuning’ the DAC (Digital-to-Analogue Converter) or an EQ to boost specific frequencies by several dB. This can add a little bass, mid or treble, but it is not the speaker’s native signature. We often refer to this as ‘synthetic sound’ – it is not bad, but it is not entirely natural. There is an excellent article here that delves into sound signature nuances, although it uses slightly different terms.
Speakers range from small 4-6mm earphone transducers to monster cones. But we have yet to find a single speaker that can do it all – reproduce the full range of frequency response. Physics precludes this in loudspeakers, because you need a larger speaker to push volumes of air for bass and a smaller, ‘shriller’ speaker for treble. Some headphones and earphones can come closer to this ideal, and some headphones get great bass because an excellent over-the-ear seal leaves less air to push. So, in a decent soundbar or hi-fi system, you will have separate speakers (and amps) for bass (sub-woofer), mid, and treble (tweeters).
Then there’s the separate issue of surround sound. Typical configurations are:
1.0 – mono speaker; some use passive radiators to increase bass or reflectors to increase 180-360° sound
2.0 – stereo speakers (L/R)
2.1 – L/R and subwoofer
3.1 – L/R, Centre (usually tuned for clear voice frequencies) and subwoofer
5.1 – Front L/R, Centre, surround L/R and a sub-woofer
5.1.2 – As above plus two up-firing (or ceiling) speakers tuned for spatial effects instead of reproducing front sound (Minimum for Dolby Atmos)
7.1.4 – As per 5.1.2 but with rear L/R speakers and four up-firing (or ceiling) speakers (tuned for spatial effects instead of reproducing front or rear sound)
The above cover the usual channels that a sound device can
down-mix to. For example, if a 2.0 TV or soundbar claims Dolby Atmos
compatibility, it means that it takes the 5/7.1.2/4 native signal and downmixes
an approximation to the physical number of amps and speakers. Conversely, DTS:X
upmixes (emulates) 1.0 or higher to the speakers’ capacity.
Most sound sources are better than most human hearing anyway
Definitions: (The higher, the better for all)
Bit rate is the amount of data used per second of audio, in kilobits per second (kbps). It determines the finished file size and relates to audio quality.
Sample rate in kHz – the number of times per second the sound is sampled. A sample rate can capture frequencies up to half its value, so 44.1kHz covers up to 22.05kHz.
Bit depth – 8-bit (256 levels or 48dB), 16-bit (65,536 levels or 96dB), 24-bit (16,777,216 levels or 144dB), 32-bit or more – is simply the granularity of the data stored and relates to the dynamic range (dB). We perceive lower bit depths as more noise. It is usually a simple hiss, but at low signal levels it can produce nasty effects. Those are usually dealt with by adding some dither noise.
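The figures in these definitions follow from simple arithmetic, sketched below in Python. The number of levels is 2 to the power of the bit depth, the dynamic range is roughly 6dB per bit (20 × log10 of the number of levels), and the highest frequency a sample rate can capture is half that rate. The function names are made up for this example.

```python
import math


def quantisation_levels(bits: int) -> int:
    """Number of distinct amplitude steps for a given bit depth."""
    return 2 ** bits


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range in dB: 20 * log10(levels), about 6.02dB per bit."""
    return 20 * math.log10(quantisation_levels(bits))


def highest_frequency_hz(sample_rate_hz: float) -> float:
    """Highest frequency a sample rate can capture (half the sample rate)."""
    return sample_rate_hz / 2


if __name__ == "__main__":
    for bits in (8, 16, 24):
        print(f"{bits}-bit: {quantisation_levels(bits):,} levels, "
              f"{dynamic_range_db(bits):.1f}dB")
    print("44.1kHz covers up to", highest_frequency_hz(44_100), "Hz")
```

These are theoretical ceilings; real recordings and real DACs fall short of them.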
MP3 uses a bit rate from 8-320kbps (typically 128kbps or ‘radio quality’) at a sample rate from 8-44.1kHz (generally 22kHz). It compresses large music files to smaller sizes, but the compression is lossy (it keeps only part of the original sound information). For example, a typical MP3 uses 128 kilobits for every second of audio (files are actually slightly larger, thanks to metadata including album covers).
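As a quick worked example of what a bit rate means for file size, the Python sketch below multiplies the bit rate by the track length and divides by eight to get bytes. It ignores metadata and album art, and the four-minute track length is just an assumed example.

```python
def audio_size_bytes(bit_rate_kbps: float, duration_s: float) -> float:
    """Approximate audio payload size in bytes (metadata not included)."""
    bits = bit_rate_kbps * 1000 * duration_s
    return bits / 8


if __name__ == "__main__":
    four_minutes = 4 * 60  # assumed track length in seconds
    for kbps in (128, 320):
        size_mb = audio_size_bytes(kbps, four_minutes) / 1_000_000
        print(f"{kbps}kbps for 4 minutes is about {size_mb:.2f}MB")
```

At 128kbps a four-minute track comes to roughly 3.84MB, and at 320kbps about 9.6MB.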
AAC has a variable bit rate of 8-256kbps (typically 230kbps
per channel) and a sample rate of 8-96kHz (almost always 44.1kHz). It is widely used
by Apple and can be less lossy. But it is the only Bluetooth
codec that makes use of psychoacoustic modelling* to transmit data, so it’s a
very processing-heavy codec.
CD sound requires 16-bit/44.1kHz (44,100 samples a second) sampling
and 1,411kbps data rate. Some audiophiles comment that CD sampling only covers
80% of the original sound information.
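The 1,411kbps figure is not arbitrary. Uncompressed (PCM) audio needs bit depth × sample rate × number of channels bits every second, as this short Python sketch shows. The function name is made up for the example, and stereo (two channels) is assumed for CD.

```python
def pcm_bit_rate_kbps(bit_depth: int, sample_rate_hz: int, channels: int = 2) -> float:
    """Uncompressed PCM data rate in kbps: bits per sample x samples per second x channels."""
    return bit_depth * sample_rate_hz * channels / 1000


if __name__ == "__main__":
    cd = pcm_bit_rate_kbps(16, 44_100)        # CD quality, stereo
    hi_res = pcm_bit_rate_kbps(24, 96_000)    # 24-bit/96kHz, stereo
    print(f"CD: {cd:,.1f}kbps")
    print(f"24-bit/96kHz: {hi_res:,.1f}kbps")
    print(f"CD is about {cd / 128:.1f} times the data of a 128kbps MP3")
```

CD quality works out to 1,411.2kbps, roughly eleven times the data rate of a 128kbps MP3.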
DVD and Blu-ray audio are typically 24-bit/96 or 192kHz and cover almost all the original sound information. By comparison, telephone quality is from 200Hz-3.2kHz and uses 8 or 12-bit and a 64-96kbps data rate, so you can see the voice frequency range is quite limited. Depending on your content type, the sound quality and frequency response vary. All tests should be at CD quality to be fair to the device. You can read more and listen to different bit rate clips here.
*Psychoacoustic modelling determines which sounds won’t be heard. For example, some sounds within a few milliseconds of louder sounds – even if they come first – won’t be heard. Models are used to determine those; then the encoder discards them. Conceptually it’s the same for all lossy compression systems: MP3, AAC and so on. Some just do it better than others.
Then there is the device interface type
Most devices have 3.5mm (cable) audio, RCA, optical Toslink, HDMI, USB, Thunderbolt 3, Bluetooth or Wi-Fi interfaces. 3.5mm (or analogue) audio inputs may make their way in pure analogue format to the speakers, but there’s no guarantee of that. Many devices simply convert analogue to digital for processing, before converting back to analogue for output.
In theory, pure analogue should be the best test of the speaker’s capability, but in practice, digital-to-analogue and analogue-to-digital conversion are so good that it’s largely indistinguishable from pure analogue, if competently performed.
But most music is digital and needs a chip to convert from
digital to analogue (called a DAC or digital-to-analogue converter). These can vary
enormously in quality and in their low and high filter capabilities.
Let’s remember that MP3 is 128-320kbps, and CD quality is 1,411kbps (1.411Mbps) (Source).
The difference is that MP3 – and other lossy compression
systems – toss out the content that the psychoacoustic models judge to be
inaudible. The higher the bitrate for a given codec, typically the better the sound.
Bluetooth Codecs (typical or maximum rate with 44.1kHz quality):
Standard Bluetooth codec (Sub-band coding or SBC) is 127 (mono) to 328kbps (stereo)
aptX (mono/stereo) is 128/256/352kbps, aptX LL (low latency) is 352kbps, aptX HD is 192/384/529kbps and aptX Adaptive is 276-420kbps
Advanced Audio Coding (AAC) is 8-576kbps (stereo) but typically 256/320kbps over BT – good on iPhone (Apple AAC) but not so good on Android that uses the Fraunhofer AAC codec
LDAC is variable at 330/660/990kbps
Codecs also suffer latency (lag):
SBC: 150-250 ms (typically 175ms)
aptX: 130-180 ms (typically 166ms and aptX LL
tries to keep this under 50ms)
AAC: 190-240 ms
LDAC: 160-210 ms
Bluetooth 1/2/3/4/5 is 1/25/25/50Mbps, but the codec will slow it down
USB 1/2 is 12/480Mbps, and 3/4 is 5/40Gbps
Thunderbolt 3 is 20/40Gbps
Optical is 3.072Mbps
HDMI and DisplayPort are 36.863Mbps
Ethernet up to 1Gbps
Wi-Fi ranges from 50Mbps to as high as AX 11Gbps
The bottom line is that you need to test as a cabled device
(if possible) to get the native sound signature and then preferably use a
high-res BT codec like LDAC and USB 3.0 to get the range of sound signatures.
Now back to where we started – sound signatures
Sorry for the long tome, but merely listening to a favourite music track does not cut it – it is plain wrong for all the reasons above. We use a tone generator to see the original signature of the speaker and measure this via a frequency response meter. There are tone generators for Windows, Android, macOS and iOS. Here is a YouTube tester:
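If you would rather make your own test tones, a basic tone generator is only a few lines of Python using the standard library. This is a minimal sketch, not the tool we use: the function names, the half-scale amplitude, the mono 16-bit/44.1kHz format and the output file names are all assumptions for illustration.

```python
import math
import struct
import wave


def sine_tone(freq_hz, duration_s, sample_rate=44_100, amplitude=0.5):
    """Return a list of 16-bit integer samples for a pure sine tone."""
    count = round(duration_s * sample_rate)
    peak = amplitude * 32_767  # 32,767 is full scale for 16-bit audio
    return [
        int(peak * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(count)
    ]


def write_wav(path, samples, sample_rate=44_100):
    """Write mono 16-bit samples to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


if __name__ == "__main__":
    # One tone from each region: bass, mid and treble (example frequencies).
    for freq in (60, 1_000, 8_000):
        write_wav(f"tone_{freq}hz.wav", sine_tone(freq, 2.0))
```

Play each file at the same volume and measure the level at the speaker; the differences between bass, mid and treble tones are the raw material for a sound signature. Keep the volume modest, as pure tones can be hard on both ears and tweeters.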
BUT, in deference to all reviewers who may have better ears
than I, here are the tracks I always use
The Blues Brothers Peter Gunn Theme
Those magnificent trumpets over a deep bass backbeat –
Just the facts ma’am! I could listen to Blues Brothers jazz all day long.
Next track is the Beach Boys Fun, Fun, Fun
It is a vocal track with electric guitars and synthesiser behind it.
Finally, Manhattan Transfer Twilight Zone
It mixes voice and heavy bass as well as using the complete directional sound stage. Here is a 432kHz version (almost high res).
A sound signature is the best guide to what speakers sound like. It works the way taste words do for food: you could say salty, sweet, bitter, sour, soft, hard, mushy, like chicken etc. Avoid any reviews that simply talk about their favourite tracks. I hope you can see that it’s all a science! Sound is a very important category at GadgetGuy, so we hope this helps you understand how you can rely on our reviews.