Written by Travis M. Moore
Last edited 27-Sep-2019
A reasonable question is: "Can we save time and plot a signal in the time and frequency domain at the same time?" That would, after all, give us all the information we need at once. Unless you have been sequestered in a palace your whole life, you have probably noticed that life is made up of give and take. It turns out that the physical world is the same way. The world will give us some information (e.g., time), but then take away other information (e.g., frequency). It really is true that we cannot eat our cake and have it too.
What does this give and take look like for acoustics in the real world? It simply means that the more precise you are about a signal's frequency, the less precise you can be about where it starts and stops in time. The converse is also true: defining exactly where a sound starts and stops in time means we cannot get a precise measure of the frequency of that sound. If you're feeling a little incredulous, that's ok. The uncertainty principle is most noticeable when we try to be incredibly precise. Oftentimes, simply being a little less precise means the uncertainty effects go unnoticed by human beings.
Now for a concrete example. Take a look at Figure 1. This is a plot we have seen many times now. A sine wave in time, and a single frequency from the FFT. Now look at Figure 2, which is the exact same as Figure 1, just zoomed in on the frequency domain. It turns out when we take an extremely close look, the "straight line" from Figure 1 isn't actually a single line. It widens at very low amplitudes to include other frequencies. This happens because our sine wave isn't really a perfect sine wave.
An ideal sine wave is infinite in duration and would have no beginning or end. Let's think about this. If the sine wave has no end, we cannot say anything about how long it is (i.e., quantify it in the time domain). In other words, there is absolute uncertainty about choosing a single point in time to describe "when" the wave occurred. If time is perfectly unquantifiable, then we can be perfectly certain of the frequency, and our measurement of the infinite sine wave would reveal a single line on our plot no matter how far we zoomed in.
Clearly we can't create an infinite sine wave, so we deal in less-than-perfect approximations. The shorter the sine wave, the more we can point to a single moment in time and define exactly when the sound existed. The 300-ms wave in Figures 1 and 2 is a whole lot shorter than infinite, which means we can point to a much more exact time frame of when the wave was "on" (i.e., 300 ms). This increase in resolution in the time domain means our view of frequency has to become a tad fuzzier. The slopes around the center frequency in the zoomed-in view of Figure 2 are the fuzziness: extra frequency information that creeps into our "pure" tone. While we cannot say with absolute precision that our sine wave is 1000 Hz, we're still pretty close. Listen to the audio sample beneath the figures to hear for yourself.
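To put rough numbers on this trade-off, here is a short worked relation. It is a sketch under stated assumptions, not something taken from the figures: the symbols are introduced here for illustration, with \sigma_t and \sigma_f standing for the RMS spread of a signal in time and in frequency, T for the duration of a tone that is switched on and off abruptly, and \Delta f for the null-to-null width of the main spectral peak of such a tone.

```latex
% Time-frequency uncertainty (Gabor limit) for any signal:
\sigma_t \, \sigma_f \;\ge\; \frac{1}{4\pi}

% Abruptly switched tone of duration T: width of the main spectral peak
\Delta f \;=\; \frac{2}{T}

% Example: T = 0.3 s
\Delta f \;=\; \frac{2}{0.3\ \text{s}} \;\approx\; 6.7\ \text{Hz}
```

A peak only a few hertz wide around 1000 Hz looks like a single line unless we zoom in very closely, which fits the description of the zoomed-in plot.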
Let's see what happens when we shorten the duration of our sine wave even further. Figure 3 shows the FFT for a quarter of a cycle (0.25 ms duration). The frequency domain has broadened quite a bit. Just playing 1/4 cycle of a 1000 Hz tone doesn't sound much like 1000 Hz. Try playing the audio sample to hear what the signal sounds like now. You may have to turn up your computer volume to hear it. Figure 4 shows the FFT of half of the quarter cycle (0.125 ms duration). Now the frequency spectrum is beginning to look like a flat line, including every frequency! Play the audio sample beneath Figure 4 and notice that there is more high-frequency energy now.
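The broadening described above can be checked numerically. The following is a minimal sketch, not the code used to make the figures: the sampling rate (48 kHz), the zero-padded FFT length, the +/- 50 Hz band, and the function name `energy_fraction_near` are all assumptions chosen for illustration. It reports how much of each signal's energy stays close to 1000 Hz.

```python
import numpy as np

def energy_fraction_near(freq_hz, duration_s, half_band_hz=50.0,
                         fs=48_000, n_fft=2**17):
    """Fraction of a tone burst's energy within +/- half_band_hz of freq_hz.

    The burst is an abruptly switched sine wave (no ramps). fs, n_fft and
    the band width are illustrative assumptions.
    """
    n = int(round(duration_s * fs))           # number of samples in the burst
    t = np.arange(n) / fs                     # time axis in seconds
    x = np.sin(2 * np.pi * freq_hz * t)       # sine starting at zero phase
    power = np.abs(np.fft.rfft(x, n=n_fft)) ** 2   # zero-padded power spectrum
    freqs = np.fft.rfftfreq(n_fft, d=1 / fs)       # frequency of each FFT bin
    in_band = np.abs(freqs - freq_hz) <= half_band_hz
    return power[in_band].sum() / power.sum()

# 300 ms tone, quarter cycle (0.25 ms), and half of that (0.125 ms)
for duration_s in (0.300, 0.00025, 0.000125):
    frac = energy_fraction_near(1000.0, duration_s)
    print(f"{duration_s * 1000:8.3f} ms: "
          f"{100 * frac:5.1f}% of energy within 1000 +/- 50 Hz")
```

With these settings the long tone should keep nearly all of its energy near 1000 Hz, while the two very short signals keep only a small percentage there, because their energy is spread across a wide range of frequencies.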
There is quite a difference between the audio samples of Figures 1 - 4. However, even the 300 ms audio sample from Figures 1 and 2 is recognizable as a "pure tone." The takeaway is that even with a sine wave a long way away from infinite, we can essentially accomplish the goal of creating a sine wave that is perceptible as such to a human being. So when does the uncertainty principle matter in daily life? Why have we spent so much time discussing this?
The patterns we've seen from Heisenberg's uncertainty principle might seem abstract, and hard to imagine being of much use on a daily basis. But believe it or not, this principle is at work in a big way in some of the most fundamental tests audiologists perform. Need convincing? Read on.
Stimuli available for the auditory brainstem response (ABR) can be divided into two categories based on their duration. The first type is extremely short: a 100-μs click. The second type (e.g., a toneburst) is still short, but its duration is measured in ms, not μs. The consequence is that stimuli with durations measured in ms are much more "tone-like" than click stimuli. In fact, a click is so short that it includes energy across all the frequencies audiologists are interested in.
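The difference between the two stimulus types can be sketched in a few lines. This is an illustration under assumptions, not a model of any particular ABR system: the click is taken to be a 100-μs rectangular pulse, the toneburst a 5-ms, 1000-Hz sine with a Hann window, the sampling rate 100 kHz, and the 20 dB criterion is arbitrary. The helper `levels_db_re_peak` is a hypothetical name.

```python
import numpy as np

FS = 100_000          # assumed sampling rate (Hz); 100 us = 10 samples
N_FFT = 2**16         # zero-padded FFT length for a finely sampled spectrum
AUDIOMETRIC_HZ = [250, 500, 1000, 2000, 4000, 8000]

def levels_db_re_peak(x, freqs_hz, fs=FS, n_fft=N_FFT):
    """Spectrum level at each requested frequency, in dB re the spectral peak."""
    mag = np.abs(np.fft.rfft(x, n=n_fft))
    bin_freqs = np.fft.rfftfreq(n_fft, d=1 / fs)
    idx = [int(np.argmin(np.abs(bin_freqs - f))) for f in freqs_hz]
    return 20 * np.log10(mag[idx] / mag.max() + 1e-12)

# 100-us rectangular click (assumed shape)
click = np.ones(10)

# 5-ms, 1000-Hz toneburst with a Hann window (assumed parameters)
n_burst = int(0.005 * FS)
t = np.arange(n_burst) / FS
toneburst = np.hanning(n_burst) * np.sin(2 * np.pi * 1000 * t)

click_db = levels_db_re_peak(click, AUDIOMETRIC_HZ)
burst_db = levels_db_re_peak(toneburst, AUDIOMETRIC_HZ)

for f, c, b in zip(AUDIOMETRIC_HZ, click_db, burst_db):
    print(f"{f:5d} Hz   click: {c:7.1f} dB   toneburst: {b:7.1f} dB")

# Frequencies with a level within 20 dB of each stimulus's spectral peak
click_hits = [f for f, c in zip(AUDIOMETRIC_HZ, click_db) if c > -20]
burst_hits = [f for f, b in zip(AUDIOMETRIC_HZ, burst_db) if b > -20]
```

Under these assumptions the click keeps every listed frequency within 20 dB of its peak, while the toneburst does so only at 1000 Hz. The click's spectrum is not perfectly flat: it rolls off gradually toward higher frequencies.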
Think about the implications of using the wrong stimulus for a specific test. For instance, to save time during screenings or to do a gross check of neural synchrony, a click is a great choice. It casts a wide net and stimulates most of the basilar membrane. If the only thing we're looking for is a single sign of auditory function somewhere, anywhere, then a click is a fine choice. (There are other scenarios where clicks are useful in ABR testing, but they are not as pertinent to the topic at hand.)
What about a diagnostic ABR where we want to measure thresholds at specific frequencies? Can we still use a click and make short work of the test? NO! For a frequency-specific test we need a more frequency-specific stimulus (e.g., toneburst or narrow-band chirp). Only when we present a frequency-specific stimulus can we interpret the waveforms from the ABR as telling us anything specific about different frequencies. Using a click stimulus to determine threshold at a specific frequency is not possible.
Heisenberg's uncertainty principle is also at play every time you present a tone using an audiometer. The audiometer output has been carefully synthesized to avoid contamination from multiple frequencies. You might be wondering why any adjustments are necessary at all - the 300 ms tone from Figure 1 sounded pretty good, and the audiometer presents a tone for roughly the same duration. What's the problem?
The problem is the onset and offset of a finite signal; in this case a pure tone from an audiometer. If the tone is abruptly turned on, we can point to exactly when in the time domain the tone started (and stopped with an abrupt offset). The top panel of Figure 5 ("Ungated Signal") shows an instant onset and offset. It is easy to see where the tone transitions from "off" to "on." However, well-defined onsets/offsets mean there will be extra frequencies creeping into the signal when it is switched on and off. Play the audio sample labeled "200 Hz Ungated" below and listen for some high-frequency distortion at the very beginning and end of the tone. Those unwanted high frequencies sound like little "pops" and would destroy the validity of the audiogram. Why? Because an audiogram is a frequency-specific test and it involves some very soft sounds. If you are interested in measuring threshold at 1000 Hz, but the patient hears that "pop" at one of the unwanted frequencies, the patient's response no longer tells you what threshold is at 1000 Hz. You're just measuring whatever frequency or frequencies the patient can detect from the broad spectrum of the "pop." By the way, those "pops" are referred to as spectral splatter or spectral transients.
The way to avoid spectral splatter is to ramp the tone on and off slowly. The middle panel of Figure 5 ("Envelope") shows the kind of shape we need. Applying an envelope to a sine wave is called gating, hence the title "Ungated Signal" in the top panel of Figure 5. The bottom panel of Figure 5 shows the sine wave after gating (i.e., applying the envelope). Listen to the audio sample below labeled "200 Hz Gated." No more onset/offset distortion! Also notice that we can no longer say exactly when the tone first started. Should you consider the middle of the ramp the onset? The beginning of the ramp has no amplitude so that can't be it. Maybe at the transition to the "fully on" part of the tone? But what if a patient had already detected the tone before that? There are many different ways to define where the onset (and offset) of a gated signal is, but we can all agree the timing of the onset got a lot less precise.
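Gating can be sketched in code as multiplying the tone by an envelope. The example below is illustrative only and is not how any particular audiometer synthesizes its output: the raised-cosine ramp shape, the 20-ms ramp duration, the 1000-Hz tone, the 48-kHz sampling rate, and the function names are all assumptions. It compares how much energy falls more than 100 Hz away from the tone frequency (a crude stand-in for spectral splatter) before and after gating.

```python
import numpy as np

FS = 48_000      # assumed sampling rate (Hz)
N_FFT = 2**17    # zero-padded FFT length

def raised_cosine_gate(n_samples, ramp_samples):
    """Envelope that ramps 0 -> 1 at the start and 1 -> 0 at the end."""
    env = np.ones(n_samples)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(ramp_samples) / ramp_samples))
    env[:ramp_samples] = ramp          # onset ramp
    env[-ramp_samples:] = ramp[::-1]   # offset ramp (mirror image)
    return env

def splatter_fraction(x, tone_hz, half_band_hz=100.0, fs=FS, n_fft=N_FFT):
    """Fraction of signal energy more than half_band_hz away from tone_hz."""
    power = np.abs(np.fft.rfft(x, n=n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1 / fs)
    out_of_band = np.abs(freqs - tone_hz) > half_band_hz
    return power[out_of_band].sum() / power.sum()

tone_hz = 1000.0
n = int(round(0.300 * FS))                      # 300-ms tone
t = np.arange(n) / FS
ungated = np.sin(2 * np.pi * tone_hz * t)       # abrupt onset and offset
envelope = raised_cosine_gate(n, int(0.020 * FS))
gated = envelope * ungated                      # gating = applying the envelope

ungated_splatter = splatter_fraction(ungated, tone_hz)
gated_splatter = splatter_fraction(gated, tone_hz)
print(f"ungated: {ungated_splatter:.2e}   gated: {gated_splatter:.2e}")
```

With these settings the gated tone should show far less off-frequency energy than the ungated one. The price is the one described above: the envelope has no single sample that can be called "the" onset.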