What is sound? (XI) What is loudness?

In the previous post, I discussed proximal aspects of loudness, which depend on the acoustical wave at the ear. For example, when we say that a sound is too loud, we are referring to an unpleasant feeling related to the effect of acoustical waves sensed at the ear. That is, the same sound source would feel less loud if it were far from the ear.

But we can also perceive the “intrinsic” loudness of a sound source, that is, that aspect of loudness that is not affected by distance. This is a distal property, which I will call source loudness. The loudness of a sound can be defined as proximal or distal in the same way as a visual object has a size on the retina (proximal) and a physical size as an external object (distal).

First of all, what can possibly be meant by source loudness? We may consider that it is a perceptual correlate of an acoustical property of the sound source, for example the energy radiated by the source. The acoustical energy at a given point depends on the distance to the source through the inverse square law, but the total energy at a given distance (integrated over the whole surface of a sphere) is constant (neglecting reflections). However, we cannot sense this kind of invariant, since it would require sampling the acoustical wave in the entire space (but see the last comments in this post).
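As a quick check of this invariance, consider an idealized point source that radiates a total acoustic power P uniformly in all directions, with no reflections. The symbols P, I and r are introduced here only for illustration:

```latex
I(r) = \frac{P}{4\pi r^2},
\qquad
\oint_{\text{sphere of radius } r} I \, dS
  = 4\pi r^2 \cdot \frac{P}{4\pi r^2}
  = P .
```

The intensity at a single point falls by a factor of 4 (about 6 dB) for each doubling of the distance, while the integral over the whole sphere stays equal to P whatever the radius.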

An alternative is to consider that source loudness is that property of sound that does not vary with distance, and more generally with the acoustical environment. The problem with this definition is that it applies to all distal properties of sound (pitch, speaker identity, etc.). The fact that we refer to source loudness using the word loudness suggests that there is a relationship between proximal and distal loudness. Therefore, we may consider that source loudness is that property of sound that is univocally related to (expected) proximal loudness at a reference location (say, at arm's length). Defined in this way, source loudness indeed does not depend on distance. Indirectly, it is a property of the sound field radiated by the source, although it is defined with reference to the perceptual system (since proximal loudness is not an intrinsic property of acoustical waves, as I noted in the previous post).

Another way to define source loudness involves action. For example, we can produce sounds by hitting different objects. The loudness of the sound correlates with the energy we put into the action. So we could imagine that source loudness corresponds to the energy we estimate necessary to produce the sound. This also gives a definition that does not depend on source distance. However, hitting the ground produces a sound that feels louder when the ground is hard (concrete) than when it is soft (grass, snow). Some of the energy we put into the mechanical interaction is dissipated and some is radiated, and it seems that only the radiated energy contributes to perceived loudness. Therefore, this definition is not entirely satisfying. I would not entirely discard it, because as we have seen loudness is not a unitary percept. It might also be relevant for speech: when we say that someone screams or speaks softly, we are referring to the way the vocal cords are excited. Thus, this is a distal notion of loudness that is object-specific.

So we have two notions of source loudness, which are invariant with respect to distance. One is related to proximal loudness; the other one is related to the mechanical energy required to produce the sound and is source-specific (in the sense that the source must be known). The next question is: what exactly in the acoustical waves is invariant with respect to distance? In Gibsonian terms, what is the invariant structure related to source loudness?

Let us start with the second notion. What in the acoustical wave specifies the energy put into the mechanical interaction? If this property is invariant to distance, then it should be invariant with respect to scaling the acoustical wave. If the relationship between interaction strength and the resulting wave were linear, a stronger interaction would only scale the wave, just as a change in distance does, and the two could not be distinguished. It follows that such information can only be captured if this relationship is nonlinear. Source loudness in this sense is therefore a measure of nonlinearity that is source-specific.
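The argument can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the source is a single sinusoid, distance acts as a pure 1/distance scaling, and the nonlinear source adds a second harmonic that grows as the square of the interaction strength. This particular nonlinearity and all function names are hypothetical, chosen for simplicity rather than taken from any real sound source.

```python
import math

N = 256  # samples in one period of the fundamental (arbitrary choice)

def source_wave(strength, nonlinear):
    """One period of a toy source signal for a given interaction strength.

    Linear toy source: amplitude proportional to strength.
    Nonlinear toy source: adds a second harmonic growing as strength**2
    (a hypothetical nonlinearity, used only for illustration).
    """
    wave = []
    for n in range(N):
        phase = 2 * math.pi * n / N
        x = strength * math.sin(phase)
        if nonlinear:
            x += 0.1 * strength ** 2 * math.sin(2 * phase)
        wave.append(x)
    return wave

def at_ear(wave, distance):
    """Free-field assumption: pressure amplitude scales as 1/distance."""
    return [x / distance for x in wave]

def harmonic_amplitude(wave, k):
    """Amplitude of the k-th sine harmonic, by projection onto sin(k * phase)."""
    total = sum(x * math.sin(2 * math.pi * k * n / N) for n, x in enumerate(wave))
    return 2 * total / N

def harmonic_ratio(wave):
    """A scale-invariant feature: second harmonic relative to the fundamental."""
    return harmonic_amplitude(wave, 2) / harmonic_amplitude(wave, 1)

if __name__ == "__main__":
    for strength in (1.0, 2.0):
        for distance in (1.0, 5.0):
            ear = at_ear(source_wave(strength, nonlinear=True), distance)
            print(strength, distance, round(harmonic_ratio(ear), 6))
```

In this toy model the harmonic ratio equals 0.1 times the strength at any distance, so it specifies the strength of the interaction. For the linear source the ratio is zero for every strength, and a strength of 2 heard at distance 2 gives the same wave as a strength of 1 heard at distance 1.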

The first notion defines source loudness in relationship to proximal loudness at a reference distance. What in the acoustical wave specifies the proximal loudness that would be perceived at a different distance? One possibility is that this involves a highly inferential process: source loudness is first perceived in a source-specific way (previous paragraph), and then associated with proximal loudness. Another inferential process would be: distance is estimated using various cues, and then proximal loudness at a reference distance is inferred from proximal loudness at the current location. One such cue is the spectrum: the air absorbs high frequencies more than low frequencies, and therefore distant sounds have less high-frequency content. Of course this is an ambiguous cue, since the spectrum at the ear also depends on the spectrum at the source, so it is only a cue in a statistical sense (i.e., given the expected spectral shape of natural sounds).

There is another possibility, which was tested with psychophysical experiments in a very interesting study (Zahorik & Wightman, 2001). The subjects listened to noise bursts played from various distances at various intensities, and were asked to evaluate source loudness. The results show that 1) the evaluation of loudness does not depend on distance, and 2) the scale of loudness depends on source intensity in the same way as for proximal loudness (loudness at the ears). This may seem surprising: the sounds have no structure, they do not have the typical spectrum of natural sounds (which tends to decay as 1/f), and there is no nonlinearity involved. The key is that the sounds were presented in a reverberant environment (a room). The authors propose that loudness constancy is due to the properties of diffuse fields. In acoustics, a diffuse field has the property that it is identical at all spatial locations within the environment. This is never entirely true of natural environments, but some reverberant environments are close to it. This implies that the reverberant part of the signal depends linearly on the source signal but does not depend on distance. Therefore, the reverberant part is invariant with respect to source location and can provide the basis for the notion of source loudness that we are considering. Reverberation preserves the spectrum of the source signal, but not the temporal envelope (which is blurred). However, since reverberation depends on the specific acoustical environment, it is in principle only informative about the relative loudness of different sources. It is nonetheless important to observe that it allows comparisons between different types of sources.
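To make the diffuse-field argument concrete, here is a minimal sketch based on the classical textbook approximation for a point source in a room: a direct part that follows the inverse square law, plus a reverberant part that depends only on the source power and on the room. This is not the model used in the study. The `room_constant` parameter (in m², summarizing the absorption of the room) and the numerical values are assumptions made for illustration.

```python
import math

def direct_intensity(power, distance):
    """Direct sound of a point source: inverse square law."""
    return power / (4 * math.pi * distance ** 2)

def reverberant_intensity(power, room_constant):
    """Diffuse-field approximation: depends on the room, not on distance.

    room_constant (m^2) is an assumed parameter of the textbook model,
    not a value taken from the study discussed in the text.
    """
    return 4 * power / room_constant

def level_difference_db(a, b):
    """Level of a relative to b, in decibels."""
    return 10 * math.log10(a / b)

if __name__ == "__main__":
    room_constant = 50.0  # m^2, arbitrary illustrative value
    for power in (1e-3, 4e-3):  # two source powers, in watts
        for distance in (1.0, 2.0, 4.0, 8.0):
            d = direct_intensity(power, distance)
            r = reverberant_intensity(power, room_constant)
            print(f"P={power:.0e} W  r={distance:.0f} m  "
                  f"direct={d:.2e}  reverberant={r:.2e}")
```

In this approximation the reverberant term is the same at every distance and scales linearly with the source power, so multiplying the power by 4 raises it by about 6 dB wherever the listener stands. The direct term, in contrast, drops by a factor of 64 between 1 m and 8 m.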

Alternatively, the ratio between direct and reverberant energy provides a way to estimate the distance of the source, from which source loudness can be deduced. But we note that estimating the distance is in fact not necessary to estimate source loudness. The study does not mention cues due to early reflections on the ground. A reflection on the ground interferes with the direct signal at a specific frequency that is inversely proportional to the delay between the direct and reflected signals. This could be a monaural or binaural cue to distance (Gourévitch & Brette, 2012).
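The ground reflection cue can be sketched with simple geometry. The example below assumes flat ground, a specular reflection modelled by an image source mirrored below the ground, and a reflection that is not phase-inverted, so that the first cancellation occurs when the delay equals half a period. The heights, the speed of sound and the function names are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature

def reflection_delay(distance, source_height, ear_height):
    """Delay (s) of the ground reflection relative to the direct sound.

    distance is the horizontal distance between source and ear (m).
    The reflected path is computed from an image source below the ground.
    """
    direct = math.hypot(distance, source_height - ear_height)
    reflected = math.hypot(distance, source_height + ear_height)
    return (reflected - direct) / SPEED_OF_SOUND

def first_notch_frequency(delay):
    """First cancellation frequency (Hz), assuming no phase inversion."""
    return 1.0 / (2.0 * delay)

if __name__ == "__main__":
    for distance in (1.0, 2.0, 4.0, 8.0):
        delay = reflection_delay(distance, source_height=1.5, ear_height=1.5)
        print(f"{distance:.0f} m: delay {delay * 1e3:.2f} ms, "
              f"first notch {first_notch_frequency(delay):.0f} Hz")
```

With source and ear both 1.5 m above the ground and 4 m apart, the direct path is 4 m and the reflected path is 5 m, so the delay is 1/343 s (about 2.9 ms) and the first notch falls at 171.5 Hz. The delay shrinks as the source moves away, which shifts the notch to higher frequencies.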

To conclude this post, we have seen that loudness actually encompasses several distinct notions:

1) a proximal notion that is related to intelligibility (as in “not loud enough”), and therefore to the relationship between the signal of interest and the background, considered as a distracter;

2) a proximal notion that is related to biological responses to the acoustical signal (as in “too loud”), which may (speculatively) be numerous (energy consumption, risk of cochlear damage, startle reflex);

3) a distal notion that relates to the mechanical energy involved in producing the sound (a sensorimotor notion), which is source-specific;

4) a distal notion that relates to the sound field radiated from the source, independently of the distance, which may be defined as the expected proximal loudness at a reference distance.

Time

What is time and how is it perceived? This is of course a vast philosophical question, whose surface I will only scratch.

1) Time, space and existence

It is customary to describe time as “the fourth dimension”. This point of view comes from the equations of mechanics and is highly misleading, because it seems to imply that time is of the same kind as space. A century ago, Henri Poincaré noted that our concept of space, both perceptually and scientifically, derives from our physical interactions with the world. That is to say, knowing where something is is knowing how to get there. Space is defined by the laws that govern movements in the physical world and the structure of these laws (Euclidean geometry). A law, some property that does not change, can only be defined with respect to something that changes. Therefore, time, defined as the source of change in the world, is a prerequisite to space. Space exists only by its persistence through the passing of time.

2) Time and change

In fact, nothing exists without the passing of time, because the essence is precisely what does not change through the flow of time. If we see someone throwing a ball, that ball is moving. Our visual sensations change, but we see a ball in movement: this is to say that there is something in the visual signals that does not change, which characterizes the ball as such. We do not see an object in the flickering white noise of a TV set.

In the TV series Bewitched, Samantha the housewife twitches her nose and everyone freezes except her. Then she twitches her nose again and everyone unfreezes, without noticing that anything happened. For them, time has effectively stopped. This is to say that time is not perceived as such, but only through the changes it causes in our body. It is these changes that are perceived, not time per se (i.e., not time as in the variable in the equations of mechanics).

3) Irreversibility of time

From the fact that time is the perceived cause of changes, it follows that time has a direction, because physical processes are generally irreversible. This is also related to the theorem in information theory that states that information can only be lost, and never gained, when a process is applied to a variable. The current state of a physical system results from previous processes only, which constitutes “the past”.
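The theorem alluded to here is presumably the data processing inequality. In its usual form, if Y is obtained from X, and Z is obtained from Y alone (a Markov chain), then Z cannot carry more information about X than Y does. The symbols X, Y, Z are introduced here only for illustration:

```latex
X \to Y \to Z
\quad\Longrightarrow\quad
I(X;Z) \le I(X;Y),
```

where I(·;·) denotes mutual information.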

A physical system in which events occur (our body) can be seen as a dynamical system, or series of processes that make the state of the system evolve. From one state s, the system changes subsequently to state s’. There is a direction to this change: s -> s’. This is the action of time on the system, and it is directed (the “arrow of time”). If the system were isolated, then time would be arbitrary. One could consider any dimension that is isomorphic to time and preserves directionality, and call it “time”, without changing the organization of changes within the system. It would make no difference for the system.

4) The unity of time

This raises the question of the perceptual unity of time: if time is perceived through changes in our body, then why do we feel that time is a single thing, when lots of different things change in our body? How is it that an auditory event and a visual event can appear to occur “at the same time”, given that they impact different receptors? Why isn’t there a different time for each process in our body? What does it mean that an event occurs “before” another one?

Imagine two independent processes that are spatially separated. From the perspective of these processes, it would make no difference if time passed at a different pace. The unity of time must come from an interaction between processes. The interaction between different processes defines a common flow of time.

Going further would probably require a discussion of consciousness and working memory, so I will leave these questions mostly unanswered for now.

5) The grain of time

How fine is our perception of time? When we listen to an auditory click played through headphones with a 500 µs delay between the two ears, we do not hear two clicks. We hear a single click, lateralized towards one side. If we repeatedly play clicks at 50 Hz (every 20 ms), we do not hear a series of clicks. We hear a single continuous sound. When we listen to a pure tone at 50 Hz, the amplitude of the tone varies all the time but we do not hear this variation of amplitude. On the contrary, it feels like the tone has constant loudness.

These remarks suggest that our perception of time has a “grain” of a few tens of ms. That is, processes occurring within a few tens of ms are perceived as being caused by the same event, and the temporal occurrence of events within that time window is not perceived as time. Why?

To see how tricky this is, consider again the first example, when we listen to two clicks delayed by 500 µs between the two ears. The temporal order of the clicks can be clearly distinguished: if the click is first played in the left earphone, the sound is perceived as coming from the left, and conversely if the click is first played in the right earphone. In addition, if the delay between the two clicks is changed, then the sound is perceived as coming from a different direction (usually somewhere between the two ears), in a way that is reproducible. Such changes are perceived when the delay is changed by about 20 µs.
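For readers who want to try this kind of stimulus, here is a minimal sketch that builds the two channels of a click with a given interaural delay. It is an illustration only: the function name, the buffer length and the sign convention (a positive delay means the left ear leads) are assumptions, and the delay is rounded to a whole number of samples.

```python
def dichotic_click(itd_seconds, sample_rate, n_samples=256):
    """Return (left, right) sample lists, each holding one unit click.

    A positive itd_seconds delays the right channel, so the left ear leads;
    a negative value delays the left channel instead.
    The delay is rounded to a whole number of samples.
    """
    shift = round(abs(itd_seconds) * sample_rate)
    if shift >= n_samples:
        raise ValueError("delay does not fit in the buffer")
    left = [0.0] * n_samples
    right = [0.0] * n_samples
    lead, lag = (left, right) if itd_seconds >= 0 else (right, left)
    lead[0] = 1.0      # click in the leading ear
    lag[shift] = 1.0   # delayed click in the other ear
    return left, right

if __name__ == "__main__":
    left, right = dichotic_click(500e-6, sample_rate=100_000)
    print(left.index(1.0), right.index(1.0))
```

Note that whole-sample shifts limit the resolution: at 44.1 kHz one sample lasts about 22.7 µs, so steps of 20 µs cannot be rendered this way. The usage above assumes a sample rate of 100 kHz (10 µs per sample), at which a 500 µs delay corresponds to 50 samples.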

So from a computational point of view, time is processed with a grain of 20 µs. But phenomenologically, time appears to have a grain about a thousand times larger. Why such a difference? The perceptual grain of time does not appear to reflect the precision of neural processing, or in other words, the timescale at which states of the brain seem constant.

6) Duration

This post has probably raised more questions than I could answer. I will end it with a discussion of the concept of duration. Spinoza described it as follows: “Duration is an attribute under which we conceive the existence of created things insofar as they persevere in their actuality”. This is essentially the point I have developed at the beginning of this post. In contrast with, say, color and pitch, duration is not a quality of things. Duration is about existence (the fact that a thing exists), while color or pitch is about essence (what this thing is). Properties of objects are defined by their persistence through time, but duration does not persist through time. Rather, duration quantifies for how long some properties exist. For example, it can be said that a musical note has a timbre (the instrument), a pitch and a duration. These are not three independent qualities: duration is about the pitch and timbre (for how long they can be said to exist), but timbre is not about duration.

In summary: time is about existence, space is about essence.