So far, I have focused on an ecological description of sound, that is, how sounds appear from the perspective of an organism in its environment: the structure of sound waves captured by the ears in relationship with the sound-producing object, and the structure of interaction with sounds. There is nothing psychological per se in this description. It only specifies what is available to our perception, in a way that does not presuppose knowledge about the world. I now want to describe subjective experience of sounds in the same way, without preconceptions about what it might be. Such a preconception could be, for example, to say: pitch is the perceptual correlate of the periodicity of a sound wave. I am not saying that this is wrong, but I want to describe the experience of pitch as it appears subjectively to us, independently of what we may think it relates to.
This is in fact the approach of phenomenology. Phenomenology is a branch of philosophy that describes how things are given to consciousness, our subjective experience. It was introduced by Edmund Husserl and developed by a number of philosophers, including Merleau-Ponty and Sartre. The method of “phenomenological reduction” consists in suspending all beliefs we may have about the nature of things, to describe only how they appear to consciousness.
Here I will briefly discuss the phenomenology of pitch, which is the percept associated with how high or low a musical note is. A vowel also produces a similar experience. First of all, a pure tone feels like a constant sound, unlike a tone modulated at low frequency (say, a few Hz). This simple remark is already quite surprising. A pure tone is not a constant acoustical wave at all: it oscillates at a fast rate. Yet we feel it as a constant sound, as if nothing were changing at all in the sound. At the same time, we are not insensitive to this rate of change of the acoustical wave: if we vary the frequency of the pure tone, it feels very different. This feeling is what is commonly associated with pitch: when the frequency is increased, the tone feels “higher”; when it is decreased, it feels “lower”. Interestingly, the language we use to describe pitch is that of space. I am too influenced by my own language and my musical background to tell whether we actually feel high sounds as being physically high, but it is an interesting observation. What is clear is that low-pitched sounds tend to feel larger than high-pitched sounds, which is again a spatial dimension.
A very distinct property of pitch is that changing the frequency of the tone, i.e., the temporal structure of the sound wave, does not produce a perceptual change along a temporal dimension. Pitch is not temporal in the sense of: there is one thing, and then there is another thing. With a pure tone, there always seems to be a single thing, not a succession of things. In contrast, with an amplitude-modulated tone, one can feel that the sound is sequentially (but continuously) louder and weaker. In the same way, if one hits a piano key, the loudness of the sound decreases. In both cases there is a distinct feel of time associated with the change in amplitude of the sound wave. This feel does not exist with the fast amplitude change of a pure tone. This simple observation demonstrates that phenomenological time is distinct from physical time.
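The contrast between a pure tone and a slowly modulated tone can be illustrated numerically. The sketch below is only an illustration under stated assumptions (an 8 kHz sample rate, a 440 Hz tone, a 4 Hz modulation, and 50 ms analysis windows, none of which come from the text above); it makes no claim about how the auditory system works. It shows that a quantity measured over tens of milliseconds, here the root-mean-square amplitude, is constant for the pure tone even though the wave itself oscillates, whereas it varies over time for the modulated tone.

```python
import math

SAMPLE_RATE = 8000  # Hz (assumed for this illustration)

def tone(freq_hz, duration_s, mod_hz=0.0):
    """Pure tone, optionally amplitude-modulated at mod_hz (full depth)."""
    samples = []
    for i in range(int(SAMPLE_RATE * duration_s)):
        t = i / SAMPLE_RATE
        if mod_hz:
            envelope = 0.5 * (1 + math.cos(2 * math.pi * mod_hz * t))
        else:
            envelope = 1.0
        samples.append(envelope * math.sin(2 * math.pi * freq_hz * t))
    return samples

def windowed_rms(samples, window_s=0.05):
    """RMS amplitude in consecutive, non-overlapping windows."""
    w = int(SAMPLE_RATE * window_s)
    return [math.sqrt(sum(x * x for x in samples[i:i + w]) / w)
            for i in range(0, len(samples) - w + 1, w)]

pure = windowed_rms(tone(440.0, 1.0))
modulated = windowed_rms(tone(440.0, 1.0, mod_hz=4.0))

# Spread of the slow measure: essentially zero for the pure tone,
# large for the tone modulated at a few Hz.
pure_spread = max(pure) - min(pure)
mod_spread = max(modulated) - min(modulated)
print(pure_spread, mod_spread)
```

The window length is the arbitrary choice here: with windows much shorter than one period of the tone, the pure tone would no longer look constant.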
Another very salient point is that when the loudness of the sound of the piano key decreases, the pitch does not seem to change. Somehow the pitch seems to be invariant to this change. I would qualify this statement, however, because this might not be true at low levels.
When a tone is accelerated (its frequency progressively increased), the sound seems to go higher, as when one asks a question. When it is decelerated (its frequency progressively decreased), it seems to go lower, as when one ends a sentence. Here there is a feeling of time (first it is lower, then it is higher), corresponding to the temporal structure of the frequency change at a fast timescale.
Now when one compares two different sounds from the same instrument in sequence, there is usually a distinct feeling of one sound being higher than the other one. However, when the two sounds are very close in pitch, for example when one tunes a guitar, it can be difficult to tell which one is higher, even though it may be clearer that they have distinct pitches. When one plays two notes of different instruments, it is generally easy to tell whether it is the same note, but not always which one is higher. In fact the confusion is related to the octave similarity: if two notes are played on a piano, differing by an octave (which corresponds to doubling the frequency), they sound very similar. If they are played together instead of sequentially, they seem to fuse, almost as a single note. It follows that pitch seems to have a somewhat circular or helicoidal topology: there is an ordering from low to high, but at the same time pitches of notes differing by an octave feel very similar.
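The helicoidal topology can be made concrete with a small geometric sketch. The mapping below is an assumption chosen for illustration (a reference of 440 Hz, one turn of the helix per octave, hypothetical function names); it is a way to picture the two properties described above, not a model of perception. Height along the helix orders notes from low to high, while the angular position is identical for notes that differ by an octave.

```python
import math

def helix_coordinates(freq_hz, ref_hz=440.0):
    """Map a frequency to a point (x, y, height) on a pitch helix.

    height counts octaves above the reference; (x, y) is the position
    within the octave, so it repeats every doubling of frequency.
    """
    octaves = math.log2(freq_hz / ref_hz)
    angle = 2 * math.pi * (octaves % 1.0)
    return (math.cos(angle), math.sin(angle), octaves)

def circular_distance(f1_hz, f2_hz):
    """Distance between two notes when the height is ignored."""
    x1, y1, _ = helix_coordinates(f1_hz)
    x2, y2, _ = helix_coordinates(f2_hz)
    return math.hypot(x1 - x2, y1 - y2)

octave_apart = circular_distance(220.0, 440.0)             # same angular position
half_octave_apart = circular_distance(440.0, 440.0 * 2 ** 0.5)  # opposite side
print(octave_apart, half_octave_apart)
```

In this picture two notes an octave apart sit directly above one another, which is one way to express that they are ordered and yet feel very similar.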
If one plays a melody on one instrument and then the same melody on another instrument, they feel like the same melody, even though the acoustic waves are very different, and certainly they sound different. If one plays a piano key, then it is generally easy to immediately sing the same note. Of course when we say “the same note”, it is actually a very different acoustical wave that is produced by our voice, yet it feels like it is the same level of “highness”. These observations certainly support the theory that pitch is the perceptual correlate of the periodicity of the sound wave, with the qualification that low repetition rates (e.g. 1 Hz) actually produce a feel of temporal structure (change in loudness or repeated sounds, depending on what is repeated in the acoustical wave) rather than a lower pitch.
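To illustrate what "the same periodicity in very different waves" means, here is a minimal sketch. Everything in it is an assumption made for the example: the two waveforms are arbitrary sums of three harmonics of 200 Hz standing in for two instruments (they are not realistic piano or voice sounds), the sample rate is 8 kHz, and the repetition rate is read off the largest peak of a plain autocorrelation, which is one simple way to measure periodicity and not a specific published model of pitch.

```python
import math

SAMPLE_RATE = 8000  # Hz (assumed)

def harmonic_wave(f0_hz, amplitudes, duration_s=0.1):
    """Sum of harmonics of f0_hz with the given amplitudes."""
    n = int(SAMPLE_RATE * duration_s)
    return [sum(a * math.sin(2 * math.pi * (k + 1) * f0_hz * i / SAMPLE_RATE)
                for k, a in enumerate(amplitudes))
            for i in range(n)]

def estimate_repetition_rate(samples, min_hz=80.0, max_hz=500.0):
    """Repetition rate (Hz) at the lag maximizing the raw autocorrelation."""
    min_lag = int(SAMPLE_RATE / max_hz)
    max_lag = int(SAMPLE_RATE / min_hz)

    def autocorr(lag):
        return sum(samples[i] * samples[i + lag]
                   for i in range(len(samples) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return SAMPLE_RATE / best_lag

# Two different waveforms with the same period (hypothetical "instruments").
wave_a = harmonic_wave(200.0, [1.0, 0.5, 0.25])
wave_b = harmonic_wave(200.0, [0.2, 1.0, 0.5])

rate_a = estimate_repetition_rate(wave_a)
rate_b = estimate_repetition_rate(wave_b)
print(rate_a, rate_b)
```

The search range (80 to 500 Hz) is a convenience for the example. As noted above, a repetition rate of a few Hz would not be heard as a low pitch, so periodicity alone does not settle what counts as pitch.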
The observation about singing is intriguing. We can repeat the pitch of a piano key with our voice, and yet most of us do not possess absolute pitch, the ability to name the piano key, even with musical training. It is intriguing because the muscular commands to the vocal system required to produce a given note are absolute, in the sense that they do not depend on musical context. This means, for most of us who do not possess absolute pitch, that these commands are not available to our consciousness as such. We can sing a note that we just heard, but we cannot sing a C. This suggests that we actually possess absolute pitch at a subconscious level.
I will come back to this point. Before that, we need to discuss relative pitch. What is meant by “relative pitch”? Essentially, it is the observation that two melodies played in different keys sound the same. This is not a trivial fact at all. Playing a melody in a different key means scaling the frequency of all notes by the same factor, or equivalently, playing the fine structure of the melody at a different rate. The resulting sound wave is not at all like the original sound wave, either in the temporal domain (at any given time the acoustical pressures are completely different) or in the frequency domain (spectra could be non-overlapping). The melody sounds the same when fundamental frequencies are multiplied by the same factor, not when they are shifted by the same quantity. Note also that the melody is still recognizable when the duration of notes or gaps is changed, when the tempo is different, when expressivity is changed (e.g. loudness of notes) or when the melody is played staccato. This fact questions neurophysiological explanations based on adaptation.
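The difference between multiplying and shifting frequencies can be checked with a few lines of arithmetic. The sketch below uses assumptions that are not in the text: a hypothetical three-note melody in equal temperament with A = 440 Hz, a transposition of five semitones, and a shift of 200 Hz. Intervals are expressed in semitones, i.e. 12 times the base-2 logarithm of the frequency ratio.

```python
import math

def semitone_intervals(freqs_hz):
    """Intervals between successive notes, in semitones."""
    return [12 * math.log2(b / a) for a, b in zip(freqs_hz, freqs_hz[1:])]

# Hypothetical melody: 0, 4 and 7 semitones above 440 Hz.
melody = [440.0 * 2 ** (k / 12) for k in (0, 4, 7)]

factor = 2 ** (5 / 12)                   # transpose up by five semitones
scaled = [f * factor for f in melody]    # every frequency multiplied
shifted = [f + 200.0 for f in melody]    # every frequency shifted by 200 Hz

original_iv = semitone_intervals(melody)
scaled_iv = semitone_intervals(scaled)
shifted_iv = semitone_intervals(shifted)

print(original_iv)   # intervals of the original melody
print(scaled_iv)     # same intervals after multiplication
print(shifted_iv)    # different intervals after the additive shift
```

Multiplying by a common factor leaves every frequency ratio, and therefore every interval, unchanged; adding a constant compresses the ratios, so the shifted sequence is a different melody.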
Thus, it seems that, at a conscious level, what is perceived is primarily musical intervals. But even this description is probably not entirely right. It suggests that the pitch of a note is compared to the previous one to make sense. But if one hears the national anthem with a note removed, it will not feel like a different melody, but like the same melody with an ellipsis. It is thus more accurate to say that a note makes sense within a harmonic context, rather than with respect to the previous note.
This point is in fact familiar to musicians. If a song is played and a singer is then asked to sing another song, the singer will tend to start the melody in the same key as the previous song. The two songs are unrelated, so thinking in terms of intervals does not make sense. But somehow there seems to be a harmonic context in which notes are interpreted.
Now the fact that there is such an effect of the previous song means that the harmonic context is maintained in working memory. It does not seem to require any conscious effort or attention, as when one tries to remember a phone number. Somehow it stays there, unconsciously, and determines the way in which future sounds are experienced. It does not even appear clearly whether there is a harmonic context in memory or if it has been “forgotten”.
Melodies can also be remembered for a long time. A striking observation is that it is impossible for most people to recall a known melody in the right key, the key in which it was originally played, and it is also impossible to tell whether the melody, played by someone else, is played in the right key. Somehow the original key is not memorized. Thus it seems that it is not the fundamental frequency of notes that is memorized. One could imagine that intervals are memorized rather than notes, but as I noted earlier, this is probably not right either. More plausible is the notion that it is the pitch of notes relative to the harmonic structure that is stored (i.e., pitch is relative to the key, not to the previous note).
We arrive at the notion that both the perception and the memory of pitch are relative, and they seem to be relative in a harmonic sense, i.e., relative to the key and not in the sense of intervals of successive notes. Now what I find very puzzling is that the fact that we can even sing means that, at a subconscious level but not at a conscious level, we must have a notion of absolute pitch.
Another intriguing point is that we can imagine a note, play it in our head, and then try to play it on a piano: it may sound like the note we imagined, or it may sound too high or too low. We are thus able to make a comparison between a note that is physically played and a note that we consciously imagine. But we are apparently not conscious of pitch in an absolute sense, in a way that relates directly to properties of physical sounds. The only way I can see to resolve this apparent contradiction is to say that we imagine notes as degrees in a harmonic context (or musical scale), i.e., “tonic” for the C note in a C key, “dominant” for the G note in a C key, etc., and in the same way we perceive notes as degrees. The absolute pitch, independent of the musical key, is also present but at a subconscious level.
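The idea of a note as a degree relative to a key can be written down as a simple lookup. This is only a sketch under assumptions not made in the text: major keys only, sharps-only note spelling, and a hypothetical function name. It shows that the same degree label applies to different absolute notes depending on the key, and that the same absolute note gets different labels in different keys.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets of the seven degrees of a major scale (assumed mode).
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]
DEGREE_NAMES = ["tonic", "supertonic", "mediant", "subdominant",
                "dominant", "submediant", "leading tone"]

def degree_in_major_key(note, key):
    """Name of the scale degree of `note` in the major key `key`.

    Returns None when the note does not belong to the scale.
    """
    offset = (NOTE_NAMES.index(note) - NOTE_NAMES.index(key)) % 12
    if offset not in MAJOR_SCALE_STEPS:
        return None
    return DEGREE_NAMES[MAJOR_SCALE_STEPS.index(offset)]

print(degree_in_major_key("C", "C"))  # tonic
print(degree_in_major_key("G", "C"))  # dominant
print(degree_in_major_key("G", "G"))  # tonic: same note, different role
print(degree_in_major_key("D", "G"))  # dominant: same role, different note
```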
I have only addressed a small portion of the phenomenology of pitch, since I have barely discussed harmony. But clearly, it appears that the phenomenology of pitch is very rich, and also not tied to the physics of sound in a straightforward way. It is deeply connected with the concepts of memory and time.
In light of these observations, it appears that current theories of pitch address very little of the phenomenology of pitch. In fact, all of them (both temporal and spectral theories) address the question of absolute pitch, something that most of us actually do not have conscious access to. It is even more limited than that: current models of pitch are meant to explain how the fundamental frequency of a sound can be estimated by the nervous system. Thus, they start from the physicalist postulate that pitch is the perceptual correlate of sound periodicity, which, as we have seen, is not unreasonable but remains a very superficial aspect of the phenomenology of pitch. They also focus on the problem of inference (how to estimate pitch) and not on the deeper problem of definition (what is pitch, why do some sounds produce pitch and not others, etc.).