Symmetry Matters in Physics and Mathematics and it also Matters in Music Science

Philip Dorrell, 23 July 2012

Symmetry is a basic concept in physics. Every continuous symmetry of the Lagrangian corresponds to a conservation law – this is Noether's Theorem.

Symmetry is also important in mathematics. Whatever mathematical structure is being studied, a mathematician will ask: "Under what transformations is this structure invariant?" Originally this insight about the importance of symmetry was applied to geometry by the mathematician Felix Klein, who published the Erlangen Program.

In music science, symmetries are also important. For music there are three questions we need to ask about symmetries:

What symmetries exist?
How do they occur?
Why do they occur?

The pattern of "What, how and why" is not unique to the study of musical symmetries – it applies to the study of any biological phenomenon:

What is the phenomenon?
How does it occur?
Why does it occur?

("Why" questions are specific to biology, and for those who dislike the implication of unexplained purpose, they can be rephrased as: "What selective pressure has caused this feature of a living organism to evolve?" And if you think that music science is not part of biology, remember that music is something which is performed and enjoyed by human beings, and human beings are living organisms, which is what biology is the study of.)

The symmetries of music are not the same thing as symmetrical music.

Symmetrical music is music that is unexpectedly symmetrical, in other words, more symmetrical than music usually is. For example, music which is the same forwards as backwards, or the same if played backwards and upside-down. It is not particularly common for music to have these symmetries, and there aren't necessarily any good examples of such music ("good" in the sense of being musical and popular).

By contrast, the symmetries of music apply to all music. They are the symmetries of music per se, and not the symmetries of individual musical items.

The symmetries of music correspond to perceived properties of music that are invariant under certain transformations.

The "symmetries of music" can be better described as "symmetries of music perception", highlighting that it is the perception of music or some aspect of music which is preserved under the relevant transformation. However, I will continue to refer to them more simply as "symmetries of music".

The symmetries of music are not exact, not in the way that physical symmetries can be exact.

(Although even some important physical symmetries are not exact, and these symmetries correspond to conservation laws that only hold under some circumstances.)

An example of the inexactness of a musical symmetry is that invariance under pitch translation is not exact, because with a sufficiently large pitch translation, the sounds of a musical item will be transformed into frequencies that are too low or too high to hear at all.

There are six major symmetries of music perception.

These are:
- Pitch translation invariance – shifting the whole of a musical item up or down by a fixed musical interval.
- Time scaling invariance – playing a musical item slower or faster.
- Time translation invariance – playing a musical item earlier or later. This symmetry can be regarded as being exact for all practical purposes, apart from the limitation that any individual musical listener only lives for a finite lifetime (and also an individual's musical tastes can and do change slowly over time).
- Amplitude scaling invariance – playing music louder or softer. The musical properties of a musical item are not substantially altered by changing the volume, except to the extent that if we like particular music, then we usually enjoy it more if it is played more loudly.
- Octave translation invariance – shifting all or part of a musical item up or down by a fixed number of octaves. The set of octave translations is a sub-group of the set of all possible pitch translations. However, unlike arbitrary pitch translations, octave translation can be applied to individual components of a musical item, especially bass notes, or chords, or notes within chords, without substantially altering the musical properties of the item. Octave translation invariance also applies to most musical scales – that is, a musical scale repeats every octave.
- Pitch reflection invariance – reflecting pitch values about some fixed pitch. This symmetry does not preserve all musical properties – however it does preserve the consonance of intervals, given that consonance is a relationship between two pitch values which does not depend on any ordering or labelling of those two pitch values. (If A to C is a consonant interval, then so is C to A.) The A minor scale with an A minor home chord can be regarded as a reflection of the C major scale with a C major home chord (reflected about the note D). This may or may not be a consequence of the pitch reflection invariance of consonance (i.e. it might be a consequence, if the "hominess" of a chord on a scale is largely a function of the mutual consonance relationships between notes in a scale).
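
Four of the transformations listed above can be sketched in a few lines of Python, together with a description of the melody that each one leaves unchanged. This is only an illustration under stated assumptions: the toy melody, the use of MIDI pitch numbers, and all function names are hypothetical and are not part of any model of perception. Time translation and amplitude scaling are left out because this representation has no onset times or loudness values.

```python
from fractions import Fraction

# Hypothetical toy melody: (MIDI pitch number, duration in beats).
MELODY = [(60, Fraction(1)), (64, Fraction(1)), (67, Fraction(2)), (72, Fraction(4))]


def transpose(melody, semitones):
    """Pitch translation: shift every note by the same number of semitones."""
    return [(pitch + semitones, dur) for pitch, dur in melody]


def time_scale(melody, factor):
    """Time scaling: multiply every duration by the same positive factor."""
    return [(pitch, dur * factor) for pitch, dur in melody]


def reflect(melody, axis):
    """Pitch reflection: mirror every pitch about the fixed pitch `axis`."""
    return [(2 * axis - pitch, dur) for pitch, dur in melody]


def intervals(melody):
    """Signed steps (in semitones) between successive notes."""
    pitches = [pitch for pitch, _ in melody]
    return [b - a for a, b in zip(pitches, pitches[1:])]


def duration_ratios(melody):
    """Ratios between the durations of successive notes."""
    durations = [dur for _, dur in melody]
    return [b / a for a, b in zip(durations, durations[1:])]


def pitch_classes(melody):
    """Pitches modulo the octave (12 semitones)."""
    return [pitch % 12 for pitch, _ in melody]


# Each transformation leaves one description of the melody unchanged.
assert intervals(transpose(MELODY, 5)) == intervals(MELODY)
assert duration_ratios(time_scale(MELODY, Fraction(3, 2))) == duration_ratios(MELODY)
assert pitch_classes(transpose(MELODY, -24)) == pitch_classes(MELODY)
assert [abs(i) for i in intervals(reflect(MELODY, 62))] == [abs(i) for i in intervals(MELODY)]

# Reflecting the C major triad (C, E, G) about D (MIDI 62) gives E, C, A,
# which are the notes of an A minor triad.
c_major_triad = [(60, Fraction(1)), (64, Fraction(1)), (67, Fraction(1))]
a_minor_from_reflection = sorted(pitch for pitch, _ in reflect(c_major_triad, 62))
```

Note that reflection preserves only the sizes of the intervals, not their directions, which matches the point that this symmetry does not preserve all musical properties.
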
The symmetries of music can be classified according to functionality, generality and applicability to speech perception.

A functional symmetry is one where a biological purpose is satisfied by perceiving two different sounds as "the same", when the sounds are equivalent according to the symmetry.

A musical symmetry is general if it is relevant to the perception of all sounds (and not just music, or just music and speech).

Some musical symmetries are specifically applicable to the perception of music and speech.
- Pitch translation invariance is functional and applicable to speech: it enables the "same" speech melodies to be perceived in the speech of speakers with voices that have lower or higher pitch.
- Time-scaling invariance is functional and applicable to speech: it enables the "same" speech rhythms to be perceived in speech that is faster or slower. (Note: language-specific speech rhythms are known to play an important role in the perception of speech, particularly for identifying syllable and word boundaries.)
- Time translation invariance is a general functional symmetry: the same sound occurring on different occasions should be perceived as being the same.
- Amplitude scaling invariance is a general functional symmetry: among other things, it enables a sound to be perceived as being "the same" whether it is created nearby or far away. (Also, speech, like many other sounds, can be generated more quietly or more loudly.)
- Octave translation invariance does not appear to serve any direct function – that is, there is no obvious biological benefit to perceiving pitch values separated by an octave as being "the same". One possible indirect benefit is that if pitch values are represented in some cortical map modulo octaves, then pitch values can be represented more precisely within a smaller area of cortex. This is a form of perceptual trade-off: the brain cares more about the precise position of a pitch value within the octave, and less about which octave the pitch value is in. This trade-off may contribute to the brain's ability to perceive very small pitch differences.
- Pitch reflection invariance may be a simple consequence of how consonance is perceived. The perception of consonance is itself something that serves no obvious function (but see next item).

The functional symmetries of Pitch Translation Invariance and Time Scaling Invariance need to be calibrated.

We are so subjectively familiar with these two invariances that we might think it is "obvious" that a tune shifted in pitch is the "same" tune, or that a rhythm played faster is the "same" rhythm.

But something being "obvious" to us humans is not the same thing as it being obvious how such "obviousness" is implemented within the human brain.

So I ask: how does the brain "know" that a pitch-translated version of a melody is the "same" melody, or that a time-scaled version of a rhythm is the "same" rhythm?

I will state a series of hypotheses: both of these symmetries are non-trivial to implement, there is some measurable cost to their implementation, and there must be, in each case, some mechanism by which the brain learns that, according to the relevant symmetry, certain perceptions are the "same" as other perceptions.

For each of these two symmetries, we can identify plausible candidates for calibration targets, i.e. perceptible phenomena which occur in such a manner that perceptual equivalence under the relevant transformations can be learned. These are:
- For pitch translation invariance, the target is consonant intervals. That is, consonance of intervals is a property of intervals which is invariant under pitch translation. A plausible model of how the brain "learns" to identify consonance is given in The Statistical Structure of Human Speech Sounds Predicts Musical Universals by Schwartz et al. (Note, the research in this paper does not posit consonance as a calibration target for the pitch translation invariance. Indeed, the authors assume pitch translation invariance a priori in their model of how consonance is "learned" from perception of vowel sounds in speech. However, this assumption could be easily removed from their model without affecting the main result – except that their 1-D plot of consonance as a function of interval size would have to be replaced by a 2-D plot of consonance as a function of pairs of interval end-points.)
- For time scaling invariance, a plausible target is multiplication of time intervals by very small numbers – mostly 2 or 3. For example, a time interval of x seconds can be compared to a time interval of 2x seconds whenever one perceives 3 events A, B and C separated by two time intervals of x seconds. In such a case, the interval from A to C will be 2x seconds. The relationship of "2x seconds is twice as long as x seconds" is a relationship which is invariant under time scaling. For example, if we scale x by some arbitrary number a, e.g. y = ax, then the scaled relationship is "2y seconds is twice as long as y seconds".
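
The two calibration targets above can be checked with a little exact arithmetic. The following is a minimal sketch, assuming that pitch translation is modelled as multiplying every frequency by a common factor; the function names and the example values (220 Hz, 330 Hz, a factor of 9/8) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def frequency_ratio(f_low, f_high):
    """Size of an interval as an exact ratio of its two frequencies."""
    return Fraction(f_high) / Fraction(f_low)


def translate_pitch(freqs, factor):
    """Pitch translation multiplies every frequency by the same factor."""
    return [Fraction(f) * factor for f in freqs]


def span_ratio(a, b, c):
    """Ratio of the A-to-C time interval to the A-to-B time interval."""
    return (c - a) / (b - a)


# A 3:2 interval (220 Hz and 330 Hz) is still 3:2 after pitch translation.
low, high = translate_pitch([220, 330], Fraction(9, 8))
fifth_before = frequency_ratio(220, 330)
fifth_after = frequency_ratio(low, high)

# Events A, B and C separated by x seconds, then the same events time-scaled.
x, scale = Fraction(1, 2), Fraction(7, 3)
ratio_before = span_ratio(0, x, 2 * x)
ratio_after = span_ratio(0, scale * x, scale * 2 * x)
```

In both cases the quantity that survives the transformation is a ratio, which is what makes it usable as a fixed reference while everything else about the percept changes.
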
There is a direct correspondence between the invariants used to calibrate functional symmetries specific to music and speech, and the significant relationships that occur within music.

That is, the perceptual targets which are (plausibly) used to calibrate the functional symmetries of music happen to be the same relationships which occur within music, as relationships between different elements of a musical item.

In particular:
- Both musical scales and musical harmony are determined to a large extent by consonant musical intervals. That is, notes in the same scale, and even more so, notes in the same chord, are related to each other by consonant intervals. (At least this is the case for the modern Western diatonic scale, which is now the most popular scale for music throughout much of the world.)
- Both note lengths and rhythmic regular beats are related by multiples of 2 (more often) or 3 (sometimes). Thus, if there are notes of length x seconds, there will be notes of length x/2 seconds, and notes of length 2x seconds. And if there is a regular beat of y notes per second, there will likely be a regular beat of 2y notes per second and/or a regular beat of y/2 notes per second.
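
As a rough illustration of the first point, the notes of a major chord can be compared pairwise. This sketch assumes a just-intonation tuning of the major scale (the text above does not specify a tuning, and modern instruments usually approximate these ratios with equal temperament); the table and function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Assumed just-intonation frequency ratios of the C major scale, relative to C.
JUST_MAJOR = {
    "C": Fraction(1), "D": Fraction(9, 8), "E": Fraction(5, 4), "F": Fraction(4, 3),
    "G": Fraction(3, 2), "A": Fraction(5, 3), "B": Fraction(15, 8),
}


def pairwise_ratios(notes):
    """Frequency ratio for every pair of notes (given in ascending order)."""
    return {(lo, hi): JUST_MAJOR[hi] / JUST_MAJOR[lo] for lo, hi in combinations(notes, 2)}


# The C major chord: every pair of its notes forms a small-integer ratio.
chord = pairwise_ratios(["C", "E", "G"])
```

Under this assumed tuning, the three intervals in the chord come out as 5:4, 3:2 and 6:5, all of which are ratios of small whole numbers.
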
The implementations of the functional symmetries of music and speech perception probably share mechanisms with those of other continuous perceptual symmetries.

The symmetries of music and speech perception are not the only kind of perceptual symmetry that the human brain deals with.

The most significant example of a perception with perceptual symmetries, one that applies not only to humans, but to all animals that have eyes with retinas, is the visual perception of objects. 3-D objects are projected onto the 2-D retina. We can consider various movements of an object relative to a viewer which result in reversible transformations of 2-D images, which suggests that visual perception of an object should be invariant under those 2-D transformations.

In particular:
- If the viewer gets closer to the object, or farther away, the 2-D image of the object will increase or decrease in size accordingly. Therefore perception of the object, after factoring out perceived distance, should be invariant under 2-D scaling.
- When the viewer moves their head around in various ways (while continuing to look at one object), the 2-D image is subject to translations and rotations. Therefore object perception should be invariant under 2-D translations and rotations of the 2-D image. (There are limits to this, for example we are not very good at recognising faces which are upside down. This reflects the reality that normally we don't need to be able to recognise upside down faces.)
- If the object being viewed is a flat surface, rotations of the object result, in the most general case, in projective transformations of the 2-D image, or, in the case where the object subtends a "small" angle in the visual field, affine transformations. So perception of flat surfaces should be invariant under these transformations of the image.
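
The first two cases above (scaling, translation and rotation of the 2-D image) can be sketched as follows. This is a toy illustration only: representing image points as complex numbers and using normalised pairwise distances as the "shape signature" are assumptions made for the example, not claims about how visual perception works. A general affine or projective transformation would not preserve this particular signature and would need a different invariant.

```python
import cmath
import math


def similarity(points, scale, angle, shift):
    """Scale, rotate and translate 2-D points stored as complex numbers."""
    factor = scale * cmath.exp(1j * angle)
    return [factor * z + shift for z in points]


def shape_signature(points):
    """All pairwise distances, divided by the first one to remove overall size."""
    dists = [abs(p - q) for i, p in enumerate(points) for q in points[i + 1:]]
    return [d / dists[0] for d in dists]


# A 3-4-5 triangle, then the same triangle enlarged, rotated and moved.
triangle = [0 + 0j, 4 + 0j, 0 + 3j]
moved = similarity(triangle, scale=2.5, angle=math.radians(40), shift=3 - 7j)

# The signature is unchanged (up to floating-point rounding).
same_shape = all(
    math.isclose(u, v, rel_tol=1e-9)
    for u, v in zip(shape_signature(triangle), shape_signature(moved))
)
```
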
We can also consider the general case where the object rotates around any axis (not just around the axis in the line of sight). If we assume the existence of a hypothetical reconstructed description of the object, inside the viewer's brain, then a rotation of the 3-D object should correspond to some reversible transformation of the internal representation of the object (ignoring issues of opacity, where different parts of the object become hidden or not hidden as it rotates), and perception of the object should be invariant under that type of transformation.

The visual perception of 3-D objects in a 3-D world is something that has an evolutionary history much older than the perception of human speech and music, so it is plausible that the evolving perception of speech and music has "borrowed" mechanisms of invariance which are relevant to visual perception.

Probabilistic Theories of Perceptual Symmetries

Relevant to any analysis of perceptual symmetries is the probabilistic approach championed by Dale Purves, as explained in A Primer on Probabilistic Approaches to Visual Perception. (Note: Dale Purves is one of the co-authors of the paper The Statistical Structure of Human Speech Sounds Predicts Musical Universals referred to above.)

Purves's theory can be regarded as an attempt to create a general theory of perceptual "learning" that explains all situations where different percepts imply the perception of the "same" thing. Such a theory would seem to subsume all situations where the equivalence of "different" percepts of the "same thing" can be interpreted as invariance under a set of transformations that belong to a formally defined mathematical group.

To Pre-Wire or Not to Pre-Wire?

From what I've read, the research of Purves and his co-researchers seems to stop short of making or testing any hypotheses about the "neural wiring" that supports this type of perceptual learning. However, even if the relevant learning processes are driven by actual experience, we might expect the brain to be somewhat "pre-wired" with the evolved expectation that certain types of symmetry are likely to occur in practice.

In particular, if we assume the existence of certain special calibration targets, which drive the learning of particular perceptual symmetries, then these targets would have to be "pre-wired" somehow, as being of special interest early on in life, even though the final outcome of the learning process is not itself pre-wired, and even though the perceptual symmetry is eventually going to be applied to the perceptions of things different to the initial calibration targets.

Also, perceptual symmetries can be partly pre-wired in the sense that neurons representing different percepts that are likely to be learned as being the perception of the "same thing" can be laid out in the cortex in a manner such that they are likely to be close to each other, facilitating the formation of the required connections that form when the experiential learning process identifies those different percepts as being the perception of the "same" (or very similar) thing.

My guess is that, even if the probabilistic explanation of perceptual symmetries is largely correct, such "pre-wiring" does exist both for the symmetries of visual perception and for the symmetries of music and speech perception.

This manifesto is a Propositional Manifesto. It is licensed under the Creative Commons Attribution-ShareAlike license.
