"Musicality" is that quality of music that distinguishes music from all things which are not musical. Assuming that musicality is one-dimensional gives rise to several conclusions:
A symmetry of music perception means that music can be transformed in some way, and some or all of the musical qualities remain unchanged. (The symmetries of music perception are not exact, which requires one to develop a theory of "inexact" symmetries; however, I will ignore this issue for the moment.)
The two alternatives of function and trade-off can also be explained as follows, assuming that music perception is actually perception of an aspect of speech:
To see how this might apply in practice, we need to look at each symmetry of music perception in turn.
Time-translation invariance is a general symmetry of all sound perception, so it does not require any special explanation with regard to speech or music perception. However, time-translation invariance of sound perception may not be completely trivial to implement, and the specific mechanisms of time-translation invariance, which detect similarity between segments of sound occurring within some interval in time, may be relevant to the occurrence of phrasal repetition within musical items.
A model for this is a cortical map which perceives very small changes in pitch via very small changes in the frequencies of harmonics. A feature of such a map is that pitch values an octave apart will seem very "close", because they have many matching harmonics.
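The claim about matching harmonics can be illustrated with a small count. The following is only a sketch under simplifying assumptions: tones are idealised as exact integer multiples of a fundamental, intervals are given as just-intonation frequency ratios, and only the first 16 harmonics are considered. The function name `shared_harmonics` and the parameter `n_harmonics` are hypothetical, chosen for illustration.

```python
from fractions import Fraction


def shared_harmonics(ratio, n_harmonics=16):
    """Count harmonics common to two idealised tones.

    The lower tone has fundamental 1 (arbitrary units) and the upper tone
    has fundamental `ratio`. Both are assumed to have exactly
    `n_harmonics` harmonics at integer multiples of the fundamental.
    """
    ratio = Fraction(ratio)
    lower = {Fraction(k) for k in range(1, n_harmonics + 1)}
    upper = {ratio * k for k in range(1, n_harmonics + 1)}
    # Exact rational arithmetic avoids floating-point near-misses.
    return len(lower & upper)


if __name__ == "__main__":
    intervals = [
        ("octave", Fraction(2, 1)),
        ("perfect fifth", Fraction(3, 2)),
        ("tritone", Fraction(45, 32)),
    ]
    for name, ratio in intervals:
        print(name, shared_harmonics(ratio))
```

Under these assumptions the octave shares more harmonics with the lower tone than any other interval larger than a unison, which is consistent with octave-separated pitch values seeming "close" on a map driven by harmonic frequencies.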
A second advantage of octave invariance is that if a cortical map represents pitch values modulo octaves, then that map can represent more precise information about pitch than a cortical map which represented all perceivable pitch values directly (consisting of 6 to 8 octaves).
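The precision argument can be made concrete with a rough calculation. This sketch assumes, purely for illustration, that a map has a fixed number of units spread evenly along a log-frequency axis, and that pitch is reduced modulo octaves relative to an arbitrary reference of 440 Hz. The names `pitch_class` and `cents_per_unit` are hypothetical.

```python
import math


def pitch_class(freq_hz, ref_hz=440.0):
    """Position of a pitch within the octave, in [0, 1), relative to ref_hz."""
    return math.log2(freq_hz / ref_hz) % 1.0


def cents_per_unit(n_units, octaves_spanned):
    """Width in cents covered by each unit of an evenly spaced map.

    One octave is 1200 cents; a smaller value means finer pitch resolution.
    """
    return 1200.0 * octaves_spanned / n_units


if __name__ == "__main__":
    n_units = 1200  # assumed map size, not a physiological estimate
    print("full range (7 octaves):", cents_per_unit(n_units, 7), "cents/unit")
    print("modulo octaves:        ", cents_per_unit(n_units, 1), "cents/unit")
    print(pitch_class(220.0), pitch_class(440.0), pitch_class(880.0))
```

With the same number of units, the map that represents pitch modulo octaves resolves pitch seven times more finely than one spanning seven octaves, at the cost of discarding which octave a pitch belongs to.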
A model for this is a cortical map where neurons respond to pitch values when they occur and for some time afterwards – consonance is recognised when neurons representing pitch values consonantly related to each other are concurrently active. A consequence of this model is that information about which pitch value in a pair of consonantly related values occurred first is lost, so the perceived relationship is necessarily symmetric between the two values. (This model also explains how harmony can be a component of the perception of a "melody" which consists of only one pitch value at any given time, yet is perceived more strongly when consonantly related pitch values occur simultaneously.)
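A toy version of this model shows why the order of the two pitch values is lost. It is a sketch only, with several assumptions that are not part of the model as stated: pitch values are MIDI note numbers reduced to pitch classes, time is divided into discrete steps with one note per step, a unit stays active for `persistence` steps after its note, and the set of consonant intervals is a conventional list of semitone distances. The names `CONSONANT_INTERVALS` and `consonant_pairs` are hypothetical.

```python
# Semitone distances (mod 12) treated as consonant: an assumed, conventional
# choice. The set is closed under inversion (3<->9, 4<->8, 5<->7).
CONSONANT_INTERVALS = {3, 4, 5, 7, 8, 9}


def consonant_pairs(melody, persistence=1):
    """Return the unordered pairs of pitch classes detected as consonant.

    melody: list of MIDI note numbers, one per time step.
    persistence: number of steps a unit stays active after its note sounds.
    """
    last_heard = {}  # pitch class -> time step at which it last sounded
    pairs = set()
    for t, note in enumerate(melody):
        pc = note % 12
        for other, t_other in last_heard.items():
            still_active = t - t_other <= persistence
            if still_active and (pc - other) % 12 in CONSONANT_INTERVALS:
                # Only co-activation is recorded, so the pair is unordered.
                pairs.add(frozenset((pc, other)))
        last_heard[pc] = t
    return pairs


if __name__ == "__main__":
    print(consonant_pairs([60, 67]))  # C then G
    print(consonant_pairs([67, 60]))  # G then C: same unordered pair
```

Because detection depends only on which units are active at the same time, a melody and its reversal produce the same set of pairs, which is the symmetry described above.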
Rhyming is a ubiquitous feature of modern popular music. The hypothesis here is that rhyming is not intrinsically musical, but it enforces the perception of the balanced binary structure of music, and it is this perceived balanced binary structure which has musicality.
Those features being:
Most if not all perceived aspects of music have analogs with perceived aspects of speech. But perception of speech is not purely aural – we also perceive gestures, facial expressions and body language of the speaker. It follows that there could exist musical analogs of these visual aspects of speech.
Apart from the obvious association of the timing of music and dance, and the very fact that people usually dance to music, dance has the following aspects which are plausibly interpreted as being super-stimuli for some aspect of visual perception of non-dance (i.e. "ordinary") human motion: