is a basic concept in physics. Every symmetry of the
to a conservation law – this is
is also important in mathematics. Whatever mathematical structure is being studied,
a mathematician will ask: "Under what transformations is this structure invariant?"
Originally this insight about the importance of symmetry was applied to geometry by
the mathematician Felix Klein, who published the
In music science, symmetries are also important. For music there are three questions
we need to ask about symmetries:
- What symmetries exist?
- How do they occur?
- Why do they occur?
The pattern of "What, how and why" is not unique to the study of musical symmetries – it
applies to the study of any biological phenomenon:
- What is the phenomenon?
- How does it occur?
- Why does it occur?
("Why" questions are specific to biology, and for those who dislike the implication of unexplained
purpose, they can be rephrased as: "What selective pressure has caused this feature of a living organism to evolve?"
And if you think that music science is not part of biology, remember that music is something which is
performed and enjoyed by human beings, and human beings are living organisms, which is what biology is the study of.)
The symmetries of music are not the same thing as symmetrical music.
Symmetrical music is music that is unexpectedly symmetrical, in other words, more symmetrical than music
usually is: for example, music which is the same forwards as backwards,
or the same if played backwards and upside-down. It is not particularly common for music to have
these symmetries, and there aren't
necessarily any good examples of such music ("good" in the sense of being musical and popular).
By contrast, the symmetries of music apply to all music. They are the symmetries of music per se,
and not the symmetries of individual musical items.
The symmetries of music correspond to perceived properties of music that are invariant under certain transformations.
The "symmetries of music" can be better described as "symmetries of music perception",
highlighting that it is the perception of music or some aspect of music which is preserved under the
relevant transformation. However, I will continue to refer to them more simply as "symmetries of music".
The symmetries of music are not exact, not in the way that physical symmetries can be exact.
(Although even some important physical symmetries are not exact, and these symmetries correspond to
conservation laws that only hold under some circumstances.)
An example of the inexactness of a musical symmetry
is that invariance under pitch translation is not exact, because with a sufficiently large pitch translation,
the sounds of a musical item will be transformed into frequencies that are too low or too high to hear at all.
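The limit on pitch translation can be made concrete with a small sketch. It assumes twelve-tone equal temperament (each semitone multiplies frequency by the twelfth root of 2) and a nominal audible range of roughly 20 Hz to 20 kHz. The function names and the exact range limits are illustrative assumptions, not something stated above.

```python
# Nominal limits of human hearing in Hz (approximate, assumed for illustration).
AUDIBLE_MIN_HZ = 20.0
AUDIBLE_MAX_HZ = 20_000.0


def translate_pitch(freq_hz: float, semitones: float) -> float:
    """Shift a frequency by a number of equal-tempered semitones."""
    return freq_hz * 2.0 ** (semitones / 12.0)


def is_audible(freq_hz: float) -> bool:
    """Return True if the frequency lies inside the assumed audible range."""
    return AUDIBLE_MIN_HZ <= freq_hz <= AUDIBLE_MAX_HZ


if __name__ == "__main__":
    a4 = 440.0  # concert A
    # Small shifts stay audible; shifts of several octaves eventually do not.
    for shift in (0, 12, 60, 72, -60):
        shifted = translate_pitch(a4, shift)
        print(f"{shift:+4d} semitones: {shifted:10.2f} Hz, audible={is_audible(shifted)}")
```

In this sketch a one-octave shift of 440 Hz stays well inside the range, while a shift of six octaves up or five octaves down leaves it, which is the sense in which the symmetry holds only approximately.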
There are six major symmetries of music perception.
- Pitch translation invariance – shifting the whole of a musical item up or down by a fixed musical interval.
- Time scaling invariance – playing a musical item slower or faster.
- Time translation invariance – playing a musical item earlier or later. This symmetry can be regarded as
being exact for all practical purposes, apart from the limitation that any individual listener has only
a finite lifetime (and an individual's musical tastes can and do change slowly over time).
- Amplitude scaling invariance – playing music louder or softer. The musical properties of a musical item
are not substantially altered by changing the volume, except to the extent that if we like particular music,
then we usually enjoy it more if it is played more loudly.
- Octave translation invariance – shifting all or part of a musical item up or down by a fixed number of
octaves. The set of octave translations is a subgroup of the set of all possible pitch translations. However,
unlike arbitrary pitch translations,
octave translation can be applied to individual components of a musical item, especially bass notes, or chords, or
notes within chords,
without substantially altering the musical properties of the item. Octave translation invariance also applies
to most musical scales – that is, a musical scale repeats every octave.
- Pitch reflection invariance – reflecting pitch values about some fixed pitch. This symmetry does not
preserve all musical properties – however it does preserve the consonance of intervals, given that consonance
is a relationship between two pitch values which does not depend on any ordering or labelling of those two pitch values.
(If A to C is a consonant interval, then so is C to A.)
The A minor scale with an A minor home chord
can be regarded as a reflection of the C major scale with a C major home chord (reflected about the note D).
This may or may not be a consequence of the pitch reflection invariance of consonance (i.e. it might be a consequence, if
the "hominess" of a chord on a scale is largely a function of the mutual consonance relationships between notes in a scale).
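The reflection claim can be checked with a little arithmetic on pitch classes. The sketch below assumes the usual numbering of the twelve pitch classes (C = 0, C# = 1, and so on up to B = 11) and treats notes modulo the octave. The helper names are hypothetical and chosen only for this illustration.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

C_MAJOR_SCALE = {"C", "D", "E", "F", "G", "A", "B"}
C_MAJOR_CHORD = {"C", "E", "G"}
A_MINOR_CHORD = {"A", "C", "E"}


def pitch_class(name: str) -> int:
    """Map a note name to its pitch class number (C = 0 ... B = 11)."""
    return NOTE_NAMES.index(name)


def reflect(pc: int, axis: int) -> int:
    """Reflect a pitch class about an axis pitch class, modulo the octave."""
    return (2 * axis - pc) % 12


def reflect_notes(names: set, axis_name: str) -> set:
    """Reflect every note in a set about the named axis note."""
    axis = pitch_class(axis_name)
    return {NOTE_NAMES[reflect(pitch_class(n), axis)] for n in names}


def interval_class(a: int, b: int) -> int:
    """Size of the interval between two pitch classes, ignoring order and octave."""
    return min((a - b) % 12, (b - a) % 12)


if __name__ == "__main__":
    # The white-note scale maps onto itself, and C major maps to A minor.
    print(sorted(reflect_notes(C_MAJOR_SCALE, "D")))
    print(sorted(reflect_notes(C_MAJOR_CHORD, "D")))
```

Reflection about D sends C to E, E to C and G to A, so the C major chord becomes the A minor chord while the set of scale notes is unchanged. Because `interval_class` ignores ordering, it takes the same value before and after reflection, which matches the point that consonance does not depend on how the two pitch values are labelled.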
The symmetries of music can be classified according to functionality, generality and applicability to speech perception.
A functional symmetry is where a biological purpose is satisfied by perceiving two different sounds as "the same",
when the sounds are equivalent according to the symmetry.
A musical symmetry is general if it is relevant to the perception of all sounds
(and not just music, or just music and speech).
Some musical symmetries are specifically applicable to the perception of music and speech.
- Pitch translation invariance is functional and applicable to speech: it enables the "same" speech melodies to
be perceived in the speech of speakers with voices that have lower or higher pitch.
- Time-scaling invariance is functional and applicable to speech: it enables the "same" speech rhythms to
be perceived in speech that is faster or slower. (Note: language-specific speech rhythms are known to play an
important role in the perception of speech, particularly for identifying syllable and word boundaries.)
- Time translation invariance is a general functional symmetry: the same sound occurring on different occasions
should be perceived as being the same.
- Amplitude scaling invariance is a general functional symmetry: among other things, it enables a sound to be
perceived as being "the same" whether it is created nearby or far away. (Also, speech, like many other sounds, can
be generated more quietly or more loudly.)
- Octave translation invariance does not appear to serve any direct function – that is, there is no obvious
biological benefit to perceiving pitch values separated by an octave as being "the same".
One possible indirect benefit is that if
pitch values are represented in some cortical map modulo octaves, then pitch values can be represented more precisely
within a smaller area of cortex. This is a form of perceptual trade-off: the brain cares more about the precise position
of a pitch value within the octave, and less about which octave the pitch value is in. This trade-off may contribute
to the brain's ability to perceive very small pitch differences.
- Pitch reflection invariance may be a simple consequence of how consonance is perceived. The perception
of consonance is itself something that serves no obvious function (but see next item).
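The trade-off suggested for octave translation invariance can be illustrated numerically. The sketch below splits a frequency into an octave number and a position within the octave, then compares the pitch resolution of a fixed number of map cells spread over many octaves against the same cells spread over one octave. The reference frequency, the cell count and the ten-octave span are hypothetical numbers chosen for illustration and are not claims about actual cortical maps.

```python
import math

REFERENCE_HZ = 261.63  # roughly middle C; an arbitrary reference (assumption)


def octave_and_position(freq_hz: float) -> tuple:
    """Split a frequency into (whole octaves above reference, position in octave)."""
    octaves = math.log2(freq_hz / REFERENCE_HZ)
    octave = math.floor(octaves)
    return octave, octaves - octave  # position lies in the range [0, 1)


def cell_width_in_cents(num_cells: int, octaves_covered: float) -> float:
    """Pitch range covered by one cell, in cents (1200 cents per octave)."""
    return 1200.0 * octaves_covered / num_cells


if __name__ == "__main__":
    # Notes an octave apart share the same position within the octave.
    print(octave_and_position(440.0))
    print(octave_and_position(880.0))
    # The same hypothetical 1000 cells give finer resolution over one octave.
    print(cell_width_in_cents(1000, 10.0))
    print(cell_width_in_cents(1000, 1.0))
```

Under these assumed numbers, folding ten octaves into one makes each cell ten times narrower in pitch, which is the kind of precision gain the modulo-octave representation would offer.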
The functional symmetries of Pitch Translation Invariance and Time Scaling Invariance need to be calibrated.
We are so subjectively familiar with these two invariances that we might think it is "obvious" that a tune
shifted in pitch is the "same" tune, or that a rhythm played faster is the "same" rhythm.
But something being "obvious" to us humans is not the same thing as it being obvious how such "obviousness" is implemented
within the human brain.
So I ask: how does the brain "know" that a pitch-translated
version of a melody is the "same" melody, or that a time-scaled version of a rhythm is the "same" rhythm?
I will state a series of hypotheses:
both of these symmetries are non-trivial to implement, there is some measurable
cost to their implementation, and there must be, in each case, some mechanism by which the brain learns
that, according to the relevant symmetry, certain perceptions are the "same" as other perceptions.
For each of these two symmetries, we can identify plausible candidates for calibration targets, i.e.
perceptible phenomena which occur in such a manner that perceptual equivalence under the relevant transformations
can be learned. These are:
- For pitch translation invariance, the target is consonant intervals. That is, consonance of intervals is
a property of intervals which is invariant under pitch translation. A plausible model of how the brain "learns"
to identify consonance is given in
The Statistical Structure of Human Speech Sounds Predicts Musical Universals by Schwartz et al. (Note, the research in this paper does not posit consonance as a calibration target
for pitch translation invariance. Indeed, the authors assume pitch translation invariance a priori in their
model of how consonance is "learned" from perception of vowel sounds in speech. However, this assumption could be
easily removed from their model without affecting the main result – except that their 1-D plot of consonance as a function
of interval size would have to be replaced by a 2-D plot of consonance as a function of pairs of interval end-points.)
- For time scaling invariance, a plausible target is multiplication of time intervals by very small
numbers – mostly 2 or 3. For example, a time interval of x seconds can be compared to a time interval of 2x
seconds whenever one perceives 3 events A, B and C separated by two time intervals of x seconds.
In such a case, the interval
from A to C will be 2x seconds. The relationship of "2x seconds is twice as long as x seconds" is a
relationship which is invariant under time scaling. For example, if we scale x by some arbitrary number a,
e.g. y = ax, then the scaled relationship is "2y seconds is twice as long as y seconds".
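Both proposed calibration targets are quantities that stay fixed under the relevant transformation, and this can be checked directly. The sketch below assumes pitch translation is modelled as multiplying every frequency by a common factor, and time scaling as multiplying every onset time by a common factor. The function names and the sample numbers are illustrative assumptions.

```python
import math


def frequency_ratio(low_hz: float, high_hz: float) -> float:
    """Ratio between two frequencies, which characterises the interval."""
    return high_hz / low_hz


def translate_pitch(freqs_hz: list, factor: float) -> list:
    """Pitch translation modelled as multiplying all frequencies by one factor."""
    return [f * factor for f in freqs_hz]


def scale_time(onsets_s: list, a: float) -> list:
    """Time scaling modelled as multiplying all onset times by one factor."""
    return [t * a for t in onsets_s]


def span_ratio(onsets_s: list) -> float:
    """For events A, B, C: length of A-to-C divided by length of A-to-B."""
    a, b, c = onsets_s
    return (c - a) / (b - a)


if __name__ == "__main__":
    fifth = [220.0, 330.0]  # a 3:2 frequency ratio
    shifted = translate_pitch(fifth, 1.26)
    print(frequency_ratio(*fifth), frequency_ratio(*shifted))

    events = [0.0, 0.4, 0.8]  # A, B, C separated by equal intervals of x = 0.4 s
    slower = scale_time(events, 1.7)
    print(span_ratio(events), span_ratio(slower))
```

In both cases the printed pair of values agrees up to floating-point rounding: the frequency ratio survives pitch translation, and the "twice as long" relationship survives time scaling.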
There is a direct correspondence between the invariants used to calibrate functional symmetries specific to music and
speech, and the significant relationships that occur within music.
That is, the perceptual targets which are (plausibly) used to calibrate the functional symmetries of music happen
to be the same relationships which occur within music, as relationships between different elements of a musical item.
- Both musical scales and music harmony are determined to a large extent by consonant musical intervals.
That is, notes in the same scale, and even more so, notes in the same chord, are related to each other by
consonant intervals. (At least this is the case for the modern Western diatonic scale, which is
now the most popular scale for music throughout much of the world.)
- Both note lengths and rhythmic regular beats are related by multiples of 2 (more often) or 3 (sometimes).
Thus, if there are notes of length x seconds, there will be notes
of length x/2 seconds, and notes of length 2x seconds.
And if there is a regular beat of y notes per second, there will likely be a regular beat of 2y notes
per second and/or a regular beat of y/2 notes per second.
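The rhythmic half of this correspondence can be phrased as a simple test: the ratio between any two note lengths should factor into 2s and 3s only. The sketch below is one way to express that check using exact fractions. The function names and the sample durations (measured in beats) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def only_factors_two_and_three(ratio: Fraction) -> bool:
    """Return True if numerator and denominator contain no primes other than 2 and 3."""
    for part in (ratio.numerator, ratio.denominator):
        for prime in (2, 3):
            while part % prime == 0:
                part //= prime
        if part != 1:
            return False
    return True


def simply_related(durations: list) -> bool:
    """Check every pair of durations for a ratio built only from 2s and 3s."""
    return all(only_factors_two_and_three(a / b) for a, b in combinations(durations, 2))


if __name__ == "__main__":
    # Half a beat, one beat, two beats, three beats.
    print(simply_related([Fraction(1, 2), Fraction(1), Fraction(2), Fraction(3)]))
    # A note of 5/4 beats breaks the pattern.
    print(simply_related([Fraction(1), Fraction(5, 4)]))
```

Using `Fraction` keeps the ratios exact, so the check does not depend on floating-point rounding.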
The implementations of the functional symmetries of music and speech perception probably share mechanisms with
those of other continuous perceptual symmetries.
The symmetries of music and speech perception are not the only kind of perceptual symmetry
that the human brain deals with.
The most significant example of a perception with perceptual symmetries, one that applies not only to humans,
but to all animals that have eyes with retinas, is the visual perception of objects. 3-D objects
are projected onto the 2-D retina. We can consider various movements of an object relative to a viewer
which result in reversible transformations of 2-D images, which suggests that visual perception of
an object should be invariant under those 2-D transformations.
- If the viewer gets closer to the object, or farther away, the 2-D image of the object will increase
or decrease in size accordingly. Therefore perception of the object, after factoring out perceived
distance, should be invariant under 2-D scaling.
- When the viewer moves their head around in various ways (while continuing to look at one object),
the 2-D image is subject to translations and rotations. Therefore
object perception should be invariant under 2-D translations and rotations of the 2-D image. (There are limits
to this, for example we are not very good at recognising faces which are upside down. This
reflects the reality that normally we don't need to be able to recognise upside down faces.)
- If the object being viewed is a flat surface, rotations of the object result, in the most general case, in
projective transformations of the 2-D image, or, in the case
where the object subtends a "small" angle in the visual field,
affine transformations. So perception of flat surfaces
should be invariant under these transformations of the image.
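For the simpler cases in the list above (scaling, translation and rotation of the 2-D image), one quantity that survives the transformation is the set of distances between image points, once divided by the longest distance. The sketch below checks this for a combined scale, rotation and shift. It does not cover projective or affine transformations, and the function names and sample points are illustrative assumptions.

```python
import math


def similarity_transform(points: list, scale: float, angle_rad: float, shift: tuple) -> list:
    """Rotate by angle_rad, scale uniformly, then translate each 2-D point."""
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    dx, dy = shift
    return [
        (scale * (cos_a * x - sin_a * y) + dx, scale * (sin_a * x + cos_a * y) + dy)
        for x, y in points
    ]


def normalised_distances(points: list) -> list:
    """Pairwise distances divided by the longest one, in a fixed pair order."""
    dists = [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]
    longest = max(dists)
    return [d / longest for d in dists]


if __name__ == "__main__":
    triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
    moved = similarity_transform(triangle, scale=2.5, angle_rad=0.7, shift=(10.0, -3.0))
    print(normalised_distances(triangle))
    print(normalised_distances(moved))
```

The two printed lists agree up to floating-point rounding, so a description of the object built from such normalised distances would be unchanged by the viewer moving closer, shifting their gaze or tilting their head.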
We can also consider the general case where the object rotates around any axis (not just around
the axis in the line of
sight). If we assume the existence of a hypothetical reconstructed description of the object, inside
the viewer's brain, then a rotation of the 3-D object should correspond to some reversible transformation
of the internal representation of the object (ignoring issues of opacity, where different parts of the
object become hidden or not hidden as it rotates), and perception of the object should be invariant
under that type of transformation.
The visual perception of 3-D objects in a 3-D world is something that has an evolutionary history much
older than the perception of human speech and music, so it is plausible that the evolving perception of
speech and music has "borrowed" mechanisms of invariance which are relevant to visual perception.
Probabilistic Theories of Perceptual Symmetries
Relevant to any analysis of perceptual symmetries is the probabilistic approach championed by
Dale Purves, as explained in
A Primer on Probabilistic Approaches to Visual Perception.
(Note: Dale Purves is one of the co-authors of the paper
The Statistical Structure of Human Speech Sounds Predicts Musical Universals referred to above.)
Purves's theory can be regarded as an attempt to create a general theory of perceptual "learning"
that explains all situations where different percepts imply the perception of the "same" thing. Such a theory would
seem to subsume all situations where the equivalence of "different" percepts of the "same thing" can be
given an interpretation
as invariance under a set of transformations that belong to a formally defined mathematical group.
To Pre-Wire or Not to Pre-Wire?
From what I've read, Purves's research and that of his co-researchers seems to stop short
of making or testing any hypotheses about the "neural wiring" that supports this type of perceptual learning. However, even
if the relevant learning processes are driven by actual experience, we might expect the brain to be somewhat "pre-wired"
with the evolved expectation that certain types of symmetry are likely to occur in practice.
In particular, if we assume the existence of certain special calibration targets, which drive the learning of particular
perceptual symmetries, then these targets would have to be "pre-wired" somehow, as being of special interest
early on in life, even though the final outcome of the
learning process is not itself pre-wired, and even though the perceptual symmetry is eventually going to be applied to
the perceptions of things different to the initial calibration targets.
Also, perceptual symmetries can be partly pre-wired in the sense that
neurons representing different percepts that are likely to be learned as being the perception of the "same thing" can
be laid out in the cortex in a manner such that they are likely to be close to each other, facilitating the formation
of the required connections that form when the experiential learning process identifies those different percepts
as being the perception of the "same" (or very similar) thing.
My guess is that, even if the probabilistic explanation of perceptual symmetries is largely correct,
such "pre-wiring" does exist both for the symmetries of visual perception and for
the symmetries of music and speech perception.