Linear algebra birthed a new field of music theory

How the most popular microtonal scales today were built with linear algebra in the 1990s


Linear Algebra and the Mathematics of Microtonal Music: How Regular Temperament Theory Constructs New Musical Worlds

Your knowledge of vector spaces, kernel spaces, and matrix rank could help you design entirely new musical tuning systems: Regular Temperament Theory (RTT) is a branch of music theory that is, at its core, applied linear algebra.

This article assumes you're comfortable with concepts like basis vectors, linear maps, null spaces, and lattices. It assumes almost nothing about music theory. By the end, you'll understand how mathematicians and musicians use the same tools you already know to systematically explore a vast space of possible musical tunings, many of which have never been heard in Western classical music.


1. What Is a Musical Scale, Mathematically?

Let's start from scratch. What actually is a musical note, from a physics standpoint?

A note is a periodic sound wave. Its defining property, for our purposes, is its frequency — measured in Hz (cycles per second). When musicians talk about the relationship between two notes, they talk about the ratio of their frequencies. This is the fundamental insight that makes everything else possible: the musical interval between two notes is determined entirely by the ratio of their frequencies, not by the frequencies themselves.

This means the relevant algebraic structure isn't additive; it's multiplicative. A note at 440 Hz and a note at 880 Hz sound, to human ears, like "the same note, one octave apart," because 880/440 = 2/1. A note at 220 Hz is also "the same note," one octave below 440 Hz, because 440/220 = 2/1. The frequency ratio 2:1 is called the octave, and it is by far the most consonant interval in music across virtually all human cultures.

A musical scale is simply a pattern of intervals that repeats, usually at the octave. For example, the major scale is a chain of "major second" and "minor second" intervals in a particular order.

Now here's where the linear algebra begins.


2. Just Intonation: The Free Abelian Group of Intervals

The Prime Factorization of Intervals

Just Intonation (JI) is the system of tuning where all intervals are expressed as ratios of positive integers. The simplest and most consonant intervals are small integer ratios: 2/1 (octave), 3/2 (perfect fifth), 4/3 (perfect fourth), 5/4 (major third), and so on.

Now, consider what happens when you multiply intervals together. If you go up a perfect fifth (3/2) and then up another perfect fifth (3/2), you get the ratio (3/2) × (3/2) = 9/4. If you then want to bring that back down into the same octave (between frequencies 1 and 2), you divide by 2 to get 9/8. This operation — multiplying frequency ratios — is how you combine intervals.

The set of all positive rational numbers under multiplication forms a group. But we can identify a much more useful structure. By the Fundamental Theorem of Arithmetic, every positive rational number has a unique prime factorization. So any frequency ratio p/q can be written as:

$$\frac{p}{q} = 2^{a_1} \cdot 3^{a_2} \cdot 5^{a_3} \cdot 7^{a_4} \cdots$$

where the exponents aᵢ are integers (positive, negative, or zero). For example:

  • 3/2 = 2⁻¹ · 3¹, so its exponent vector is (-1, 1, 0, 0, 0, ...)
  • 5/4 = 2⁻² · 5¹, so its exponent vector is (-2, 0, 1, 0, 0, ...)
  • 7/6 = 2⁻¹ · 3⁻¹ · 7¹, so its exponent vector is (-1, -1, 0, 1, 0, ...)

This means every just interval corresponds to a vector in ℤⁿ, where n is however many primes we decide to care about. Multiplying intervals corresponds to adding their vectors. The octave is the vector (1, 0, 0, 0, ...), the perfect fifth is (-1, 1, 0, 0, ...), and so on.

The set of all just intervals using primes up to some prime p forms a free abelian group, isomorphic to ℤⁿ where n is the number of primes up to and including p. We call this the p-limit just intonation group. For example:

  • 3-limit JI uses primes {2, 3}: intervals are vectors in ℤ²
  • 5-limit JI uses primes {2, 3, 5}: intervals are vectors in ℤ³
  • 7-limit JI uses primes {2, 3, 5, 7}: intervals are vectors in ℤ⁴
  • 11-limit JI uses primes {2, 3, 5, 7, 11}: vectors in ℤ⁵

This lattice of intervals is called the pitch space or JI lattice. It is the foundational object in Regular Temperament Theory.

Monzos: The Column Vector Representation

In RTT, an interval represented as its prime exponent vector is called a monzo, named after Joe Monzo. The convention is to write it as a column vector (or in a special bracket notation). The monzo for the ratio 3/2 is written |−1, 1, 0, 0, ...⟩ — a ket vector, borrowing notation from quantum mechanics.

For example, in 7-limit JI (primes 2, 3, 5, 7):

  • The octave 2/1 is the monzo |1, 0, 0, 0⟩
  • The fifth 3/2 is |−1, 1, 0, 0⟩
  • The major third 5/4 is |−2, 0, 1, 0⟩
  • The harmonic seventh 7/4 is |−2, 0, 0, 1⟩
  • The tritone 45/32 = 2⁻⁵ · 3² · 5¹ is |−5, 2, 1, 0⟩

Adding monzos corresponds to stacking intervals (multiplying their ratios). Negating a monzo inverts the interval. The monzo representation turns all of musical harmony into integer linear algebra.
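The ratio-to-monzo conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not part of any RTT library: the names `monzo`, `stack`, and `PRIMES_7_LIMIT` are assumptions made for this example.

```python
from fractions import Fraction

PRIMES_7_LIMIT = (2, 3, 5, 7)  # illustrative choice of prime basis


def monzo(ratio, primes=PRIMES_7_LIMIT):
    """Return the prime-exponent vector of a positive rational number."""
    ratio = Fraction(ratio)
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    num, den = ratio.numerator, ratio.denominator
    exponents = []
    for p in primes:
        e = 0
        while num % p == 0:   # factors of p in the numerator count up
            num //= p
            e += 1
        while den % p == 0:   # factors of p in the denominator count down
            den //= p
            e -= 1
        exponents.append(e)
    if num != 1 or den != 1:
        raise ValueError("ratio has prime factors outside the chosen limit")
    return tuple(exponents)


def stack(m1, m2):
    """Stacking two intervals adds their monzos component-wise."""
    return tuple(a + b for a, b in zip(m1, m2))


fifth = monzo(Fraction(3, 2))
print(fifth)                                          # (-1, 1, 0, 0)
print(monzo(Fraction(45, 32)))                        # (-5, 2, 1, 0)
print(stack(fifth, fifth) == monzo(Fraction(9, 4)))   # True
```

Using `Fraction` keeps the arithmetic exact, so the check that two stacked fifths equal 9/4 is an integer comparison rather than a floating-point one.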


3. The Problem: Just Intonation Is Infinite and Unwieldy

Here's the practical problem. The JI lattice ℤⁿ is infinite. If you want to build an instrument with a finite number of notes, you need to collapse this infinite lattice into something manageable — ideally a small cyclic or periodic structure.

The Syntonic Comma: A Famous Near-Zero Interval

Consider what happens when you go up four perfect fifths from a starting note and then come down two octaves. In terms of ratios: (3/2)⁴ ÷ 2² = 81/16 ÷ 4 = 81/64. This is supposed to approximate a major third. But the pure just major third is 5/4 = 80/64.

The difference between the two — 81/80 — is called the syntonic comma. Its monzo is |−4, 4, −1, 0⟩. It's a very small interval — about 21.5 cents (where 100 cents = one semitone in standard Western tuning). It's not zero, but it's small enough that for many musical purposes, it would be convenient if it were zero.

What does "setting an interval to zero" mean musically? It means we're declaring that two notes whose frequencies differ by 81/80 are to be considered the same note. This is called tempering out the comma. And when you do this, you are exactly defining a quotient group — you're collapsing the infinite JI lattice by identifying elements that differ by a multiple of the comma vector.
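The arithmetic behind the syntonic comma can be checked directly. The sketch below assumes only the definition of cents used in this article (1200 cents per octave); the helper name `cents` is chosen for illustration.

```python
import math
from fractions import Fraction


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


# Four fifths up, then two octaves down.
pythagorean_third = Fraction(3, 2) ** 4 / 4
just_third = Fraction(5, 4)
comma = pythagorean_third / just_third

print(pythagorean_third, comma)   # 81/64 81/80
print(round(cents(comma), 2))     # 21.51
```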

Commas as Elements of the Kernel

Here's the linear algebra framing. Suppose we have a map φ from the JI lattice ℤⁿ to some other group (representing our tempered tuning system). An interval r in the JI lattice is tempered out (mapped to the identity, i.e., mapped to the unison) if and only if r is in the kernel of φ.

The set of all tempered-out intervals forms a subgroup of ℤⁿ, specifically a sublattice. In RTT, a set of intervals chosen to generate the kernel (i.e., to be tempered out) is called a comma basis. The sublattice they generate is called the commatic unison vector subgroup or simply the temperament kernel.

The whole game of RTT is: choose a kernel sublattice, quotient out ℤⁿ by it, and study the resulting quotient group as a tuning system.


4. Vals: The Dual Space and Tuning Maps

What Is a Val?

We've been thinking about intervals as vectors in ℤⁿ. Now let's think about tuning systems as linear functionals on this space.

A val is a linear map v : ℤⁿ → ℤ that tells you, for each interval, how many steps of some generator that interval corresponds to in a given tuning system. Vals are row vectors in the dual space (ℤⁿ)*; they are elements of Hom(ℤⁿ, ℤ).

The most important val is the one associated with a given equal division of the octave. An equal temperament with N notes per octave (written N-EDO or N-ET) divides the octave into N equal steps, where each step has size 1200/N cents. The val for N-EDO tells you: for each prime p, how many of those N steps best approximate the pure prime p (that is, the interval p/1)?

For example, in 12-EDO (standard Western tuning with 12 notes per octave):

  • The octave 2/1 maps to 12 steps (exactly, by definition)
  • The fifth 3/2 maps to 7 steps (since 7 × 100 cents = 700 cents ≈ 702 cents for pure fifth)
  • The major third 5/4 maps to 4 steps (since 4 × 100 = 400 cents ≈ 386 cents for pure third)
  • The harmonic seventh 7/4 maps to 10 steps

So the 12-EDO val in 7-limit is the row vector ⟨12, 19, 28, 34| (where the entries are the number of steps approximating each prime: 2, 3, 5, 7). Wait — where did 19 come from? That's the number of 12-EDO steps approximating the prime 3 (i.e., the interval 3/1, which spans one octave plus a fifth — 19 semitones). And 28 steps for 5/1 (two octaves plus a major third), 34 steps for 7/1.

The notation ⟨...| is the bra-vector notation, dual to the monzo ket-vector. When you apply a val to a monzo, you take the dot product (called the val-monzo pairing or inner product), and the result is the number of steps the interval occupies in that tuning system:

$$\langle v \mid m \rangle = v \cdot m = \text{number of steps}$$

For example, how many steps does the fifth 3/2 = |−1, 1, 0, 0⟩ have in 12-EDO?

$$\langle 12, 19, 28, 34 \mid -1, 1, 0, 0 \rangle = 12(-1) + 19(1) + 28(0) + 34(0) = -12 + 19 = 7$$
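The val-monzo pairing above is an ordinary dot product. The sketch below builds an EDO val by rounding N · log₂(p) to the nearest integer for each prime (RTT writing often calls this the patent val) and applies it to a few monzos. The function names are illustrative assumptions.

```python
import math

PRIMES = (2, 3, 5, 7)


def patent_val(n_edo, primes=PRIMES):
    """Nearest-integer number of EDO steps for each prime."""
    return tuple(round(n_edo * math.log2(p)) for p in primes)


def steps(val, monzo):
    """Val-monzo pairing: number of EDO steps assigned to an interval."""
    return sum(v * m for v, m in zip(val, monzo))


val12 = patent_val(12)
print(val12)                          # (12, 19, 28, 34)
print(steps(val12, (-1, 1, 0, 0)))    # 7  (fifth 3/2)
print(steps(val12, (-2, 0, 1, 0)))    # 4  (major third 5/4)
print(steps(val12, (-2, 0, 0, 1)))    # 10 (harmonic seventh 7/4)
print(steps(val12, (-4, 4, -1, 0)))   # 0  (syntonic comma is tempered out)
```

Nearest-integer rounding is only one way to choose a val; other roundings give other vals for the same EDO.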

The Val as a Homomorphism

Algebraically, a val is a group homomorphism from (ℚ₊, ×) to (ℤ, +), factoring through the prime exponent representation. It maps each just interval to an integer (the number of scale steps), and it's linear: the val of a combined interval is the sum of the vals of its parts. In this sense, vals are the coordinate functions of a discrete tuning map.

The collection of all vals forms the dual lattice to the monzo lattice. This duality is central to RTT and mirrors the duality in linear algebra between a vector space and its dual.


5. Temperaments as Matrices: Rank, Generators, and Mapping Matrices

Classifying Temperaments by Rank

We're now ready to see how RTT classifies tuning systems. A regular temperament is defined by a mapping matrix M whose rows are vals. Each row tells you how a particular generator (a repeating interval from which all notes in the system are built) maps the primes.

The rank of the temperament (as a free abelian group) equals the number of independent generators, which is the rank of the mapping matrix. RTT classifies temperaments by this rank:

  • Rank 1 temperaments (equal temperaments): a single generator — one step of some N-EDO — generates all the notes. The mapping matrix is a single row vector (one val).
  • Rank 2 temperaments (linear temperaments): two generators (typically a period, often the octave, and a second generator such as a fifth). The mapping matrix has two rows.
  • Rank 3 temperaments: three generators, mapping matrix has three rows.
  • Full-rank (rank n) temperament: just intonation itself — no tempering at all.

Example: Meantone as a Rank-2 Temperament

Meantone is perhaps the most historically important temperament and a great illustrative example. It tempers out the syntonic comma 81/80, which has monzo |−4, 4, −1, 0⟩ (in 7-limit, though meantone is fundamentally a 5-limit temperament).

Meantone uses two generators: the octave (2/1) and a slightly flattened fifth. Every note in the system can be expressed as an integer combination of these two generators:

$$\text{note} = a \cdot \text{octave} + b \cdot \text{fifth}$$

The mapping matrix for 5-limit meantone is:

$$M = \begin{pmatrix} 1 & 0 & -4 \\ 0 & 1 & 4 \end{pmatrix}$$

Here, the columns correspond to primes 2, 3, 5. In this form the second generator is the tempered twelfth 3/1 (an octave plus the flattened fifth), which generates the same notes as the fifth itself. The first row gives the octave count for each prime: 2→1, 3→0, 5→−4. The second row gives the twelfth count: 2→0, 3→1, 5→4. So each prime, expressed as [octaves, twelfths], is:

  • The prime 2 (octave): 1 octave + 0 twelfths = [1, 0]
  • The prime 3 (twelfth = octave + fifth): 0 octaves + 1 twelfth = [0, 1]
  • The prime 5 (two octaves plus a major third): −4 octaves + 4 twelfths = [−4, 4]

This tells us that 5/1 is reached by stacking 4 twelfths and descending 4 octaves, which is the same as stacking 4 fifths. Dropping two more octaves gives the major third 5/4. That's the familiar construction: C → G → D → A → E. Four fifths up from C gives E, which (two octaves lower) is a major third above C. And by setting the syntonic comma to zero, we've made this exact, not an approximation.

The kernel of this mapping — the set of JI intervals that map to the zero vector [0, 0] — is exactly the sublattice generated by the syntonic comma |−4, 4, −1⟩. This is the commatic kernel of meantone.
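As a quick check of the kernel claim, the sketch below applies the 2 × 3 meantone mapping (rows (1, 0, −4) and (0, 1, 4), as described in the text) to a few 5-limit monzos. The helper name `apply_mapping` is an assumption for this example.

```python
MEANTONE = ((1, 0, -4),   # first generator (octave) count for primes 2, 3, 5
            (0, 1, 4))    # second generator count for primes 2, 3, 5


def apply_mapping(mapping, monzo):
    """Multiply an r x n mapping matrix by a monzo (column vector)."""
    return tuple(sum(r * m for r, m in zip(row, monzo)) for row in mapping)


print(apply_mapping(MEANTONE, (-4, 4, -1)))   # (0, 0)   syntonic comma 81/80
print(apply_mapping(MEANTONE, (-2, 0, 1)))    # (-6, 4)  just major third 5/4
print(apply_mapping(MEANTONE, (-6, 4, 0)))    # (-6, 4)  Pythagorean third 81/64
```

The last two lines show the tempering at work: 5/4 and 81/64 differ by the syntonic comma in JI, yet they land on the same generator coordinates.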

The Mapping Matrix and the Kernel

In general, for a rank-r temperament in n-limit JI (n primes), the mapping matrix M is an r × n integer matrix. The kernel of M (as a linear map ℤⁿ → ℤʳ) is the set of all monzos that are tempered to zero — the comma subgroup of the temperament.

Conversely, if you start with a comma subgroup (given as a list of comma vectors), the mapping matrix is related to a basis for the orthogonal complement of that comma subgroup — specifically, it's a basis for the annihilator of the kernel in the dual space.

This is a direct analogue of the relationship between a subspace and its annihilator in linear algebra. If K ⊆ ℤⁿ is the comma subgroup (the kernel), then the row space of M is the annihilator K⊥ in the dual lattice. The Rank-Nullity Theorem applies here beautifully:

$$\operatorname{rank}(M) + \dim(\ker M) = n$$

So the rank of the temperament (number of generators) plus the number of independent commas equals the number of primes. Tempering out more commas ↔ fewer generators ↔ lower rank ↔ more structure.


6. The Tonnetz and Higher-Dimensional Lattices

Visualizing the JI Lattice

In 5-limit JI (primes 2, 3, 5), we can mostly ignore the octave direction (the prime 2) for visualization purposes, since octave equivalence is pervasive in music. If we project the ℤ³ lattice onto the 2D plane orthogonal to the octave direction, we get the famous Tonnetz — a 2D lattice of notes where:

  • Moving right = stacking a perfect fifth (×3/2)
  • Moving up-right = stacking a major third (×5/4)
  • Moving down-right = stacking a minor third (×6/5), since a fifth minus a major third is a minor third

The Tonnetz (German for "tone network") has a rich history in music theory. Drawn with equilateral triangles, it is the same triangular lattice that appears as the weight lattice of the rank-2 root system A₂. Each note is a vertex; adjacent vertices are a consonant interval apart.

In 7-limit JI, the Tonnetz becomes 3-dimensional (with the 7/4 "harmonic seventh" adding a third axis). In 11-limit, it's 4-dimensional. In 13-limit, 5-dimensional. These higher-dimensional harmonic lattices are the playgrounds of microtonal composers.

Fundamental Domains and Scale Structures

When you define a temperament by its kernel, you're quotienting the JI lattice by a sublattice. The resulting quotient is a lower-dimensional object. For a rank-2 temperament like meantone, the 2D Tonnetz gets folded into a kind of infinite strip. A scale in this temperament corresponds to choosing a fundamental domain — a finite set of coset representatives of some further quotient.

In meantone temperament, the most natural scales come from taking a finite segment of the "chain of fifths." The standard 7-note major scale in meantone corresponds to 7 consecutive notes on the chain of fifths: F-C-G-D-A-E-B. The 12-note chromatic scale corresponds to 12 consecutive notes. This is a deeply geometric operation: you're choosing a Voronoi cell or a parallelotope fundamental domain in the tempered lattice.


7. The Wedge Product and Temperament Identification

Why We Need the Exterior Algebra

Here's a subtle but important point. Different sets of commas can define the same temperament. Different sets of vals can also define the same temperament (in terms of which intervals are identified). How do we get a canonical representation of a temperament that doesn't depend on the choice of basis?

The answer in RTT is the wedge product from exterior algebra, which produces a representation called the wedgie.

The Wedgie: A Coordinate-Free Invariant

Suppose you have a rank-2 temperament in 5-limit JI defined by two vals v₁ = ⟨a₁, b₁, c₁| and v₂ = ⟨a₂, b₂, c₂|. The wedgie of this temperament is their exterior product:

$$W = v_1 \wedge v_2$$

which is an element of the exterior algebra Λ²(ℤ³). In coordinates, the wedgie is the list of 2×2 minors of the 2×3 matrix formed by the two vals — exactly as in the definition of the cross product or the Plücker embedding.

For 5-limit rank-2 temperaments, the wedgie is a vector in ℤ³ (since the binomial coefficient C(3, 2) = 3). For 7-limit rank-2 temperaments, it's a vector in ℤ⁶ (since C(4, 2) = 6). The wedgie uniquely identifies the temperament up to sign and is independent of the choice of val basis; it's the Plücker coordinate of the subspace in the dual lattice.

The meantone wedgie in 5-limit is ⟨⟨1, 4, 4|| (the double-bracket notation is the RTT convention for wedgies of rank-2 temperaments, also called "bivectors").

The key property: two rank-2 temperaments are the same temperament if and only if their wedgies are equal (up to sign). This gives us a powerful tool for classifying temperaments — any two sets of commas that produce the same wedgie define the same tuning system.
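To see the basis independence concretely, the sketch below computes the 2 × 2 minors for two different pairs of vals that both describe 5-limit meantone: the generator-based rows (1, 0, −4) and (0, 1, 4), and the 12-EDO and 19-EDO vals. The sign convention (first non-zero entry positive) and the function name `wedgie` are assumptions for this example.

```python
from itertools import combinations


def wedgie(v1, v2):
    """Wedge product of two vals: the 2x2 minors in lexicographic column order."""
    minors = [v1[i] * v2[j] - v1[j] * v2[i]
              for i, j in combinations(range(len(v1)), 2)]
    # Normalize the overall sign so the first non-zero entry is positive.
    lead = next((m for m in minors if m != 0), 0)
    if lead < 0:
        minors = [-m for m in minors]
    return tuple(minors)


from_generators = wedgie((1, 0, -4), (0, 1, 4))    # generator-based mapping rows
from_edos = wedgie((12, 19, 28), (19, 30, 44))     # 12-EDO and 19-EDO vals
print(from_generators)   # (1, 4, 4)
print(from_edos)         # (1, 4, 4)
```

Before sign normalization the second pair gives (−1, −4, −4), which is why wedgies are compared only up to sign.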

Duality via the Hodge Star

For rank-r temperaments in n-limit JI, the wedgie lives in Λʳℤⁿ. But there's a dual description: the same temperament can be described by its comma basis, which lives in Λⁿ⁻ʳℤⁿ. The correspondence between these two descriptions is given by the Hodge star operator ★, which maps Λʳ → Λⁿ⁻ʳ via contraction with the unit n-vector.

This duality between "val-based" and "comma-based" descriptions of a temperament is mathematically the same as the duality between a subspace and its orthogonal complement, mediated by the exterior algebra. In RTT, this duality is exploited constantly: sometimes it's easier to describe a temperament by what generators it uses (vals), sometimes by what intervals it eliminates (commas).


8. Error, Optimization, and the Geometry of Tuning

Tuning a Temperament: Optimization in ℝⁿ

Up to now, we've been working over ℤ (integer arithmetic). But actual physical instruments need actual real-valued frequencies. When we implement a temperament on an instrument, we need to choose real-valued sizes for our generators.

This brings us to the tuning problem: given a rank-r mapping matrix M (integer-valued), find a vector t ∈ ℝʳ of generator sizes (in cents) that minimizes the tuning error — how far the tempered intervals are from their just counterparts.

The just value of prime pᵢ is log₂(pᵢ) × 1200 cents. Call the vector of just values j ∈ ℝⁿ. Then the tempered value of prime pᵢ under the tuning t is (Mᵀ t)ᵢ (using the mapping matrix and generator-size vector). The error vector is:

$$e = j - M^{\mathsf T} t \in \mathbb{R}^n$$

We want to choose t to minimize this error. This is exactly a least-squares problem! The optimal solution is:

$$t^* = (M M^{\mathsf T})^{-1} M j$$

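(assuming M has full row rank, i.e., the generators are independent). This is the least-squares tuning of the temperament, which minimizes the L² norm of the error. Minimizing the L∞ norm instead gives a minimax tuning, which has no comparable closed form and is typically found by linear programming.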
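As a worked instance of the unweighted least-squares formula, the sketch below tunes 5-limit meantone using the 2 × 3 mapping with rows (1, 0, −4) and (0, 1, 4). It solves the normal equations (M Mᵀ) t = M j with an explicit 2 × 2 inverse so that only the standard library is needed. The function name is an assumption, and the approach is hard-coded for rank 2.

```python
import math

M = ((1, 0, -4),
     (0, 1, 4))
PRIMES = (2, 3, 5)
j = [1200 * math.log2(p) for p in PRIMES]   # just sizes of the primes in cents


def least_squares_rank2(mapping, just):
    """Solve (M M^T) t = M j for a two-row mapping via the 2x2 inverse."""
    r0, r1 = mapping
    a = sum(x * x for x in r0)               # entries of the Gram matrix M M^T
    b = sum(x * y for x, y in zip(r0, r1))
    d = sum(y * y for y in r1)
    u = sum(x * v for x, v in zip(r0, just))  # entries of M j
    w = sum(y * v for y, v in zip(r1, just))
    det = a * d - b * b
    return ((d * u - b * w) / det, (a * w - b * u) / det)


t = least_squares_rank2(M, j)
tempered = [M[0][k] * t[0] + M[1][k] * t[1] for k in range(3)]
errors = [jk - tk for jk, tk in zip(j, tempered)]

print([round(x, 2) for x in t])        # [1202.61, 1899.35]
print(round(t[1] - t[0], 2))           # 696.74  (the tempered fifth)
print([round(e, 2) for e in errors])   # [-2.61, 2.61, -0.65]
```

The error vector comes out proportional to the syntonic comma's monzo (−4, 4, −1), as expected: the least-squares residual is orthogonal to the rows of M, so it lies along the kernel direction.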

Weighted Tuning and the Tenney Lattice

But not all primes are perceptually equal. Humans are more sensitive to errors in simple, low-prime intervals than to errors in complex ones. The standard weighting in RTT uses Tenney complexity: the weight for prime pᵢ is 1/log₂(pᵢ). This corresponds to measuring intervals in octave-equivalent Tenney "height".

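With weights wᵢ = 1/log₂(pᵢ), we form the weight matrix W = diag(w₁, ..., wₙ). The weighted least-squares problem becomes:

$$t^* = \arg\min_t \lVert W (j - M^{\mathsf T} t) \rVert_2^2 = (M W^2 M^{\mathsf T})^{-1} M W^2 j$$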

This defines a specific inner product on the monzo space — the Tenney inner product — which turns the JI lattice into a metric space called the Tenney lattice. The induced metric is related to the Tenney harmonic distance between intervals:

$$d(p/q, 1) = \log_2(\operatorname{numerator}(p/q)) + \log_2(\operatorname{denominator}(p/q))$$

This is a natural measure of the "complexity" of a just interval: 3/2 has Tenney height log₂(3) + log₂(2) ≈ 2.58, while 27/16 has Tenney height log₂(27) + log₂(16) ≈ 8.75, which is much more complex even though the two intervals differ in size by only a whole tone (9/8).
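The Tenney height formula is easy to evaluate exactly from a reduced fraction. In the sketch below, the name `tenney_height` is chosen for illustration.

```python
import math
from fractions import Fraction


def tenney_height(ratio):
    """log2(numerator) + log2(denominator) of the ratio in lowest terms."""
    r = Fraction(ratio)   # Fraction always stores the reduced form
    return math.log2(r.numerator) + math.log2(r.denominator)


print(round(tenney_height(Fraction(3, 2)), 2))     # 2.58
print(round(tenney_height(Fraction(27, 16)), 2))   # 8.75
print(round(tenney_height(Fraction(81, 80)), 2))   # 12.66
```

The last line shows why commas are "complex" in this sense: 81/80 is tiny in pitch but has a large Tenney height.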

The Tenney lattice, with its inner product, is the ambient Euclidean space in which RTT does its geometry. Temperaments are subspaces; tuning problems are projections; comma bases are orthogonal complements.

Badness: Complexity vs. Error

Given the Tenney metric, we can define a rigorous notion of temperament badness — a combined measure of how complex a temperament is and how large its tuning error is. Different formulas exist, but a common one is:

$$\text{badness} = \text{error} \times \text{complexity}^k$$

for some exponent k. This allows RTT to rank and compare temperaments systematically. The best temperaments (low badness) are the ones that achieve good harmonic approximations with few generators — the ones that "do a lot with a little."

This is, at its heart, an optimization problem over a structured mathematical space. The systematic enumeration and ranking of temperaments by badness is what gives RTT its power as a compositional tool.


9. Some Famous Temperaments and Their Structure

Let’s now look at several important temperaments through the RTT lens, connecting their mathematical structure to their musical properties.

12-EDO: The Standard Western Tuning

12-EDO (12 equal divisions of the octave) is the tuning system used in essentially all Western music since the Baroque period. It’s a rank-1 temperament — everything is generated by a single step of 100 cents.

The val is ⟨12, 19, 28, 34| (for 7-limit). We can verify:

  • 19 steps × (1200/12 cents/step) = 1900 cents ≈ 1902 cents (pure 3/1) ✓
  • 28 steps × 100 = 2800 cents ≈ 2786 cents (pure 5/1), about 14 cents sharp
  • 34 steps × 100 = 3400 cents ≈ 3369 cents (pure 7/1), about 31 cents sharp, the worst approximation of the three

12-EDO tempers out several important commas:

  • The syntonic comma 81/80 |−4, 4, −1⟩ (meantone)
  • The diesis 128/125 |7, 0, −3⟩ (three major thirds = one octave)
  • The Pythagorean comma 531441/524288 |−19, 12, 0⟩ (twelve fifths = seven octaves)

The set of commas tempered out by 12-EDO is the kernel of the 12-EDO val, which forms a sublattice of the JI lattice. This tells us exactly which JI intervals are “conflated” — treated as identical — in standard Western music.

31-EDO: A Meantone-Based Microtonal System

31-EDO divides the octave into 31 equal parts of approximately 38.7 cents each. Its val in 7-limit is ⟨31, 49, 72, 87|. This is a much better approximation of 5-limit harmony than 12-EDO, and it also handles the 7-limit (harmonic sevenths) well.

31-EDO is one of the best rank-1 approximations of meantone temperament. It tempers out the syntonic comma (like all meantone systems) and also the septimal kleisma 225/224 = |−5, 2, 2, −1⟩, making it a septimal meantone. (It does not temper out 64/63 = |6, −2, 0, −1⟩, which its val maps to one step.)

Musically, 31-EDO provides noticeably sweeter major thirds (they’re much closer to the pure 5/4 than in 12-EDO) and access to the harmonic seventh 7/4 as a distinct consonant interval — a sound virtually absent from standard Western music.

19-EDO: Another Meantone System

19-EDO with val ⟨19, 30, 44, 53| is the “other” commonly used meantone EDO. It tempers out the syntonic comma as well but has flatter major thirds than 31-EDO. It gives a good approximation of 5-limit harmony, with a nearly pure minor third (315.8 cents against 315.6 cents for 6/5).

19-EDO has the interesting property that its major third (378.9 cents) is slightly flat of 5/4 (386.3 cents), while 31-EDO’s major third (387.1 cents) is very slightly sharp.

53-EDO: The Pythagorean Perfection

53-EDO is beloved because it provides extraordinary approximations of both 3-limit (Pythagorean) and 5-limit harmony simultaneously. Its val is ⟨53, 84, 123, 149|. The fifth is only 0.07 cents from pure — essentially perfect — and the major third is only 1.4 cents flat.

53-EDO corresponds to a different temperament family from meantone: it tempers out the schisma 32805/32768 = |−15, 8, 1⟩ (giving schismatic temperament) rather than the syntonic comma. This means the major third is approached differently: 8 descending fifths rather than 4 ascending fifths. Pythagorean tuning and 5-limit just intonation become, to a good approximation, the same thing in 53-EDO.
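The claims about these four EDOs can be checked with the val-monzo pairing. The sketch below uses nearest-integer vals in the 5-limit and tests which of the two commas each EDO tempers out. The helper names are assumptions for this example.

```python
import math

PRIMES = (2, 3, 5)
SYNTONIC = (-4, 4, -1)   # 81/80
SCHISMA = (-15, 8, 1)    # 32805/32768
MAJOR_THIRD = (-2, 0, 1) # 5/4


def patent_val(n_edo):
    """Nearest-integer number of EDO steps for each prime."""
    return tuple(round(n_edo * math.log2(p)) for p in PRIMES)


def pairing(val, monzo):
    return sum(v * m for v, m in zip(val, monzo))


summary = {}
for n in (12, 19, 31, 53):
    val = patent_val(n)
    third_cents = pairing(val, MAJOR_THIRD) * 1200 / n
    summary[n] = (round(third_cents, 1),
                  pairing(val, SYNTONIC) == 0,   # meantone?
                  pairing(val, SCHISMA) == 0)    # schismatic?
    print(n, val, summary[n])

# 12 (12, 19, 28) (400.0, True, True)
# 19 (19, 30, 44) (378.9, True, False)
# 31 (31, 49, 72) (387.1, True, False)
# 53 (53, 84, 123) (384.9, False, True)
```

Note that 12-EDO tempers out both commas, which is part of why it supports both meantone-style and Pythagorean-style thinking.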

Miracle Temperament: An Astonishing Discovery

Miracle temperament is a remarkable rank-2 temperament in 11-limit JI, discovered by George Secor and further analyzed by Dave Keenan and Graham Breed. Its generators are the octave and a tiny interval called the secor (~116.7 cents).

What makes Miracle astonishing is that a short chain of secors approximates all of the fundamental consonances of 11-limit harmony simultaneously and accurately: up to octaves, 6 secors give 3/2, 7 secors down give 5/4, 2 secors down give 7/4, and 15 secors give 11/8. The mapping matrix is:

$$M = \begin{pmatrix} 1 & 1 & 3 & 3 & 2 \\ 0 & 6 & -7 & -2 & 15 \end{pmatrix}$$

The second row shows how many secors each prime requires, and the first row supplies the octaves. In 11-limit (primes 2, 3, 5, 7, 11):

  • Prime 2: 1 octave, 0 secors
  • Prime 3: 1 octave, 6 secors
  • Prime 5: 3 octaves, −7 secors (i.e., 7 secors down)
  • Prime 7: 3 octaves, −2 secors
  • Prime 11: 2 octaves, 15 secors

Miracle tempers out several commas simultaneously, including the ampersand comma |−25, 7, 6⟩, the septimal kleisma 225/224 = |−5, 2, 2, −1⟩, and the gamelisma 1029/1024 = |−10, 1, 0, 3⟩, among others. The Blackjack scale, a 21-note scale in Miracle temperament, is one of the most harmonically rich scales ever constructed.
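A small numerical check makes the accuracy of Miracle tangible. The sketch below assumes the commonly cited 11-limit Miracle mapping, with octave row (1, 1, 3, 3, 2) and secor row (0, 6, −7, −2, 15), a pure 1200-cent octave, and a secor of 116.7 cents. These values are assumptions for illustration, not an optimized tuning.

```python
import math

PRIMES = (2, 3, 5, 7, 11)
MIRACLE = ((1, 1, 3, 3, 2),       # octaves per prime (assumed standard form)
           (0, 6, -7, -2, 15))    # secors per prime
OCTAVE, SECOR = 1200.0, 116.7     # approximate generator sizes in cents


def apply_mapping(mapping, monzo):
    return tuple(sum(r * m for r, m in zip(row, monzo)) for row in mapping)


errors = {}
for p, octaves, secors in zip(PRIMES, *MIRACLE):
    tempered = octaves * OCTAVE + secors * SECOR
    errors[p] = tempered - 1200 * math.log2(p)
    print(p, round(errors[p], 1))
# 2 0.0
# 3 -1.8
# 5 -3.2
# 7 -2.2
# 11 -0.8

print(apply_mapping(MIRACLE, (-5, 2, 2, -1, 0)))    # (0, 0)  225/224
print(apply_mapping(MIRACLE, (-10, 1, 0, 3, 0)))    # (0, 0)  1029/1024
```

Every prime up to 11 lands within about 3 cents of just with this rough tuning, and both 7-limit commas map to the zero vector.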

Porcupine, Magic, and Pajara

The RTT catalog contains thousands of named temperaments. A few others worth mentioning:

  • Porcupine temperament: tempers out the 5-limit comma 250/243 = |1, −5, 3⟩. The generator is about 163 cents. An unusual system where the generator is neither a fifth nor a third but something between a semitone and a whole tone.
  • Magic temperament: 5-limit, tempers out 3125/3072 = |−10, −1, 5⟩. The major third (5/4) is the generator, and 5 major thirds make a twelfth (3/1, an octave plus a fifth). This produces scales with a distinctly “spiral” structure.
  • Pajara temperament: 7-limit, tempers out 50/49 = |1, 0, 2, −2⟩ and 64/63. Uses a half-octave period and a generator of ~106 cents. 22-EDO is a good realization of Pajara.

Each of these corresponds to a specific sublattice of the JI lattice — a specific geometric/algebraic structure. The diversity of temperaments available is a direct consequence of the richness of integer linear algebra in multiple dimensions.


10. Scales, Modes, and the MOS Property

Moment of Symmetry Scales

Given a rank-2 temperament (octave period + one generator), there’s a canonical way to build finite scales: stack N consecutive generators, reduce each note into the octave, and sort the notes by pitch. For suitable values of N, these are called MOS (Moment of Symmetry) scales.

A MOS scale has a special property: it contains exactly two step sizes (called “large” L and “small” s). This means the scale has a kind of self-similar structure — it looks the same from any starting position (modulo which of the two step sizes you’re starting on). This is precisely the property that makes Western music’s 7-note major scale feel so “natural”: it’s a MOS of the fifth-generator in meantone, with 5 large steps (whole tones) and 2 small steps (semitones).

Mathematically, the step sizes of a MOS scale are determined by the Euclidean algorithm (or equivalently the continued fraction expansion) of the generator-to-octave ratio. If the generator is g cents out of an octave of 1200 cents, then the MOS scale sizes occur at the denominators of the convergents and semiconvergents of the continued fraction of g/1200.

For example, the fifth g ≈ 701.96 cents gives g/1200 ≈ 0.5850. The continued fraction is [0; 1, 1, 2, 2, 3, 1, 5, 2, …], and the denominators of its convergents and semiconvergents are 1, 2, 3, 5, 7, 12, 17, 29, 41, 53, … These are exactly the MOS scale sizes for fifth-based temperaments, including 5 notes (pentatonic), 7 notes (diatonic), 12 notes (chromatic), 41 notes, and 53 notes.

This is the Stern-Brocot tree and Farey sequence structure of scale theory — a beautiful connection between rational approximation theory (Diophantine approximation), continued fractions, and musical scales.
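The two-step-size property can be tested by brute force. The sketch below stacks a generator, reduces each note into the octave, and counts the distinct step sizes. It assumes a pure fifth of 701.955 cents and rounds steps to three decimals to absorb floating-point noise. The names are illustrative.

```python
def step_sizes(generator, n_notes, period=1200.0, digits=3):
    """Distinct step sizes of the scale built from n_notes stacked generators."""
    notes = sorted((k * generator) % period for k in range(n_notes))
    notes.append(period)   # close the scale at the octave
    return sorted({round(b - a, digits) for a, b in zip(notes, notes[1:])})


PURE_FIFTH = 701.955   # cents, rounded value of 1200 * log2(3/2)

print(step_sizes(PURE_FIFTH, 7))   # [90.225, 203.91]
print(step_sizes(PURE_FIFTH, 6))   # three sizes, so not a MOS

mos_sizes = [n for n in range(2, 54) if len(step_sizes(PURE_FIFTH, n)) == 2]
print(mos_sizes)                   # [2, 3, 5, 7, 12, 17, 29, 41, 53]
```

The sizes found this way agree with the continued-fraction picture: each is the denominator of a convergent or semiconvergent of the generator-to-octave ratio.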

The Carey-Clampitt Theorem

A deep theorem due to Norman Carey and David Clampitt (1989) characterizes MOS scales more precisely: a scale is a MOS (or “well-formed” scale) if and only if it is generated by a single interval and satisfies the two-step-size condition. Furthermore, a scale is a MOS of its generator if and only if the generator and the period (octave) are related by the scale size through a Bezout coefficient — that is, if N is the scale size and M is the number of generators used, then gcd(N, M) = 1.

This is a direct application of the theory of cyclic groups and Bezout’s theorem from elementary number theory, showing how deep the connections run between number theory, algebra, and musical structure.


11. The Smith Normal Form and Temperament Calculations

Smith Normal Form in RTT

When computing with temperaments, a fundamental tool is the Smith Normal Form (SNF) of integer matrices. Recall that for any integer matrix A, there exist invertible integer matrices P and Q such that PAQ = D, where D is diagonal with non-negative entries d₁ | d₂ | … | dᵣ (the invariant factors, each dividing the next).

In RTT, the SNF of the mapping matrix M tells you:

  1. The rank of the temperament (number of non-zero diagonal entries)
  2. Whether the temperament is torsion-free (if all invariant factors are 1, the quotient group ℤⁿ / Im(Mᵀ) is free abelian)
  3. The torsion subgroup of the quotient (if any invariant factor > 1, there are finite cyclic components)

Torsion in a temperament quotient would mean there exist intervals that are “pure unison” when combined some number of times — a kind of wrap-around that makes the temperament inconsistent. Well-behaved temperaments have torsion-free mapping matrices.
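For a rank-2 mapping there is a shortcut that avoids computing the full SNF: the product of the invariant factors of a full-rank 2 × n integer matrix equals the gcd of its 2 × 2 minors, so all invariant factors are 1 exactly when that gcd is 1. The sketch below uses this fact. The function name and the deliberately defective second example are assumptions for illustration.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def minors_gcd(rows):
    """gcd of all 2x2 minors of a 2 x n integer matrix."""
    r0, r1 = rows
    minors = [r0[i] * r1[j] - r0[j] * r1[i]
              for i, j in combinations(range(len(r0)), 2)]
    return reduce(gcd, minors)


meantone = ((1, 0, -4), (0, 1, 4))
doubled = ((1, 0, -4), (0, 2, 8))   # second row doubled: a defective mapping

print(minors_gcd(meantone))   # 1  -> all invariant factors are 1
print(minors_gcd(doubled))    # 2  -> an invariant factor greater than 1
```

In the doubled mapping every interval uses an even number of the second generator, so half of the notes it nominally provides are never reached by any just interval.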

Hermite Normal Form for Canonical Representations

The Hermite Normal Form (HNF) of a matrix provides a canonical form for representing temperament mapping matrices. Every rank-r integer matrix is equivalent (under left-multiplication by unimodular matrices) to a unique matrix in Hermite normal form, where the pivot columns form an upper triangular matrix with positive diagonal entries, and the entries above each pivot are non-negative and smaller than the pivot.

RTT uses HNF to provide canonical names for temperaments: a temperament mapping matrix is “officially” represented in Hermite normal form. This allows unambiguous comparison and listing of temperaments in databases.

The actual computation of HNF and SNF involves the same reduction operations as Gaussian elimination, but restricted to elementary integer operations (adding integer multiples of one row to another, swapping rows, negating rows, and for SNF the corresponding column operations) so that all intermediate results remain integers. This is the territory of computational integer linear algebra and lattice algorithms.


12. The Geometric Picture: Temperament as Projection

Mapping JI Space to Generator Space

Let’s assemble the full geometric picture. Start with the JI pitch space ℝⁿ (the real extension of the monzo lattice — we allow real-valued combinations of prime logarithms). This is an n-dimensional real vector space.

A rank-r temperament picks out an r-dimensional subspace — call it the generator space — and projects ℝⁿ onto it. The projection map is the tuning map φ : ℝⁿ → ℝʳ. Intervals in JI space that project to the same point in generator space are “the same note” in the temperament.

The fibers of this projection — the preimages of single points — are (n−r)-dimensional affine subspaces of JI space. The “direction” of these fibers is exactly the commatic kernel — the set of intervals that are tempered to zero.

So a temperament is literally a projection of high-dimensional JI space onto a lower-dimensional “essential harmony” subspace. Different temperaments correspond to different projections. The art of temperament design is choosing the projection direction (which intervals to collapse) to preserve the harmonic relationships you care about while reducing the dimensionality of the system.

The Dual Picture: From Vals to Generators

The dual picture: the generator sizes (the “tuning” of the temperament) define a set of r linear functionals on ℝⁿ. These r functionals are the rows of the mapping matrix, weighted by the generator sizes. The composite map:

$$\phi : \mathbb{Z}^n \xrightarrow{\;M\;} \mathbb{Z}^r \xrightarrow{\;t\;} \mathbb{R}$$

sends each just interval to its size in cents in the temperament. Extended linearly to ℝⁿ, this map can be thought of as choosing a 1-dimensional projection of ℝⁿ (the size-in-cents map) that factors through the r-dimensional generator space.

The optimal tuning problem is then: among all such factored projections, which one minimizes the weighted distance to the “pure just” projection j = [log₂(2), log₂(3), log₂(5), …] × 1200?

This is a projection problem in a Euclidean space (with the Tenney inner product). The solution is the orthogonal projection of j onto the image of Mᵀ — exactly the least-squares formula we derived earlier. The geometry is completely standard; only the context is musical.
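The projection can be sketched numerically for 5-limit meantone. This example assumes Tenney weights of 1/log₂(p) per prime and solves the weighted normal equations directly; whether it matches the earlier formula exactly depends on the weighting convention used there. All variable names are illustrative.

```python
import math

primes = [2, 3, 5]
M = [[1, 1, 0], [0, 1, 4]]                       # assumed 5-limit meantone mapping
j = [1200.0 * math.log2(p) for p in primes]      # just tuning map in cents
w2 = [1.0 / math.log2(p) ** 2 for p in primes]   # squared Tenney weights (assumption)

# Normal equations A t = b for minimizing sum_k w2[k] * ((t M)_k - j_k)^2
A = [[sum(w2[k] * M[r][k] * M[c][k] for k in range(3)) for c in range(2)]
     for r in range(2)]
b = [sum(w2[k] * M[r][k] * j[k] for k in range(3)) for r in range(2)]

# Solve the 2x2 system with Cramer's rule
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
octave = (b[0] * A[1][1] - A[0][1] * b[1]) / det
fifth = (A[0][0] * b[1] - A[1][0] * b[0]) / det

print(f"octave = {octave:.3f} cents, fifth = {fifth:.3f} cents")
```

With these weights the octave comes out slightly wider than 1200 cents, because the error is spread across all three primes rather than forcing the octave to be pure. Constraining the octave to exactly 1200 cents is a different (constrained) optimization.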


13. Limits, Extensions, and Higher Primes

Adding New Primes: Extending a Temperament

Suppose you’ve been working in 5-limit JI and you have a beautiful temperament. Now you want to extend it to handle the prime 7 as well — you want to add the harmonic seventh to your harmonic vocabulary. How does this work in RTT?

You need to augment your mapping matrix with an additional column (for the new prime 7) and decide what value to assign: how many generator-steps will be used to approximate 7/4?

This is the extension problem. For a given rank-2 temperament with period p and generator g, you need to find integers a, b such that:

$$a \cdot p + b \cdot g \approx \log_2(7) \times 1200 \ \text{cents}$$

There are typically multiple reasonable choices of (a, b), each giving a different 7-limit extension of the same 5-limit temperament. For example, meantone has several 7-limit extensions, among them dominant, flattone, and septimal meantone, each using a different number of generators to approximate 7/4.

The different extensions correspond to different choices of which comma involving the prime 7 to temper out. Once you add the comma, the kernel grows — the temperament identifies more JI intervals — and you potentially get a more compact (or differently structured) quotient group.
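The search for (a, b) can be sketched as a brute-force scan. The generator size of 696.5 cents and the search window of ±12 generators are assumptions chosen for illustration, not part of the theory.

```python
import math

period, generator = 1200.0, 696.5    # cents; the fifth is an assumed meantone value
target = 1200.0 * math.log2(7)       # size of the prime 7 in cents

candidates = []
for b in range(-12, 13):             # search window is an arbitrary choice
    a = round((target - b * generator) / period)   # best octave count for this b
    error = a * period + b * generator - target
    candidates.append((abs(error), a, b))

for err, a, b in sorted(candidates)[:3]:
    print(f"a={a:+d}, b={b:+d}, |error|={err:.2f} cents")
```

With this particular fifth, the three closest candidates are (a, b) = (−3, +10), (8, −9), and (4, −2). These are the mappings of the prime 7 used by septimal meantone, flattone, and dominant respectively. Each extension is normally retuned with its own optimal generator, so the raw errors shown here are only a first indication.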

No-Twos and Subgroup Temperaments

RTT doesn’t require that we work with all primes up to some limit. We can choose any subgroup of the JI group — for instance, the subgroup generated by {3/2, 5/4, 7/4} without including the octave 2/1.

These subgroup temperaments are useful for various reasons. Some EDOs approximate a subset of primes much better than the full prime set. Some compositional styles naturally avoid the prime 5 (pure Pythagorean music) or work only with primes {2, 3, 7}.

Mathematically, a subgroup temperament starts from a free abelian subgroup H ⊆ ℤⁿ (instead of all of ℤⁿ) and applies the same formalism. The mapping matrix then maps H to the generator group. All the same tools — wedgies, SNF, least squares tuning — apply with minor modifications.


14. Practical Scale Construction: An Example

Let’s walk through a concrete example of constructing a scale from scratch using RTT principles.

Step 1: Choose a Prime Limit

Let’s work in 7-limit JI (primes {2, 3, 5, 7}), so our monzo space is ℤ⁴.

Step 2: Choose Commas to Temper Out

We’ll temper out two commas to get a rank-2 temperament:

  • The septimal kleisma 225/224 = |−5, 2, 2, −1⟩ (3²·5²/(2⁵·7); equates 7/4 with two major thirds plus a whole tone, since 5/4 · 5/4 · 9/8 = 225/128)
  • The syntonic comma 81/80 = |−4, 4, −1, 0⟩ (meantone)

These two commas define a specific temperament. Let’s find the mapping matrix.

Step 3: Compute the Mapping Matrix

We need to find a 2×4 integer matrix M such that:

  • M · |−5, 2, 2, −1⟩ = |0, 0⟩
  • M · |−4, 4, −1, 0⟩ = |0, 0⟩

One approach: use the commas as a basis for the kernel, then find a complementary subspace (the generator space). By performing integer column operations, we can reduce the 4×2 comma matrix to echelon form and read off the generators.

The result is septimal meantone: the generators are the octave (1200 cents) and a fifth (~696.5 cents for the “optimal” tuning). The mapping matrix is:

$$M = \begin{pmatrix} 1 & 1 & 0 & -3 \\ 0 & 1 & 4 & 10 \end{pmatrix}$$

(The columns are primes 2, 3, 5, 7; rows are octave-steps and fifth-steps.)
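A quick sanity check on a candidate mapping is to multiply it by each comma and confirm the result is zero. The sketch below assumes the commonly quoted septimal meantone mapping with rows ⟨1 1 0 −3] and ⟨0 1 4 10]. Septimal meantone is usually characterized as tempering out 81/80 together with 225/224 (and therefore also 126/125), so all three are tested. The helper name is illustrative.

```python
# Assumed septimal meantone mapping (columns: primes 2, 3, 5, 7; rows: octaves, fifths).
M = [
    [1, 1, 0, -3],
    [0, 1, 4, 10],
]

def apply_mapping(monzo):
    """Generator counts (octaves, fifths) for a 7-limit monzo."""
    return tuple(sum(m * e for m, e in zip(row, monzo)) for row in M)

commas = {
    "81/80": (-4, 4, -1, 0),
    "225/224": (-5, 2, 2, -1),
    "126/125": (1, 2, -3, 1),
}
for name, monzo in commas.items():
    print(name, apply_mapping(monzo))          # each prints (0, 0)

print("7/4 ->", apply_mapping((-2, 0, 0, 1)))  # (-5, 10): ten fifths up, five octaves down
```

The last line shows where the harmonic seventh lands: ten fifths up and five octaves down, which is the augmented sixth of meantone.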

Step 4: Choose a Scale Size

We want a finite scale. Using the continued fraction of the fifth-to-octave ratio ≈ 696.5/1200 ≈ 0.5804, the denominators of the convergents and semiconvergents (1/1, 1/2, 2/3, 3/5, 4/7, 7/12, 11/19, 18/31) give scale sizes 1, 2, 3, 5, 7, 12, 19, 31. Let’s use the 12-note MOS (the familiar chromatic scale in meantone).
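The list of scale sizes can be reproduced with exact rational arithmetic. This is a minimal sketch: the function name `mos_sizes` is hypothetical, and the fifth is taken to be exactly 696.5 cents so that the ratio is the fraction 6965/12000.

```python
from fractions import Fraction

def mos_sizes(ratio, limit):
    """Denominators of the convergents and semiconvergents of `ratio` (0 < ratio < 1)."""
    k_prev2, k_prev1 = 1, 0   # denominators k_{-2}, k_{-1} of the standard recurrence
    sizes = []
    x = ratio
    while True:
        a = x.numerator // x.denominator      # next continued-fraction term
        for m in range(1, a + 1):             # m < a: semiconvergent; m == a: convergent
            k = m * k_prev1 + k_prev2
            if k > limit:
                return sizes
            sizes.append(k)
        k_prev2, k_prev1 = k_prev1, a * k_prev1 + k_prev2
        frac = x - a
        if frac == 0:                         # expansion of a rational terminates
            return sizes
        x = 1 / frac

print(mos_sizes(Fraction(6965, 12000), 31))   # [1, 2, 3, 5, 7, 12, 19, 31]
```

The size 3 comes from a semiconvergent (2/3) rather than a true convergent, which is why both kinds are collected.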

Step 5: Compute the Actual Frequencies

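With the optimal tuning (minimizing weighted error), the fifth is ≈ 696.5 cents. Taking the 12-note chain of fifths from E♭ to G♯ and starting from A = 440 Hz:

  • A: 440.00 Hz
  • B♭ (5 fifths down, 117.5 cents above A): 440 × 2^(117.5/1200) ≈ 470.90 Hz
  • B (2 fifths up, 193.0 cents above A): 440 × 2^(193.0/1200) ≈ 491.89 Hz
  • C (3 fifths down, 310.5 cents above A): 440 × 2^(310.5/1200) ≈ 526.43 Hz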
  • … and so on
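The full table can be generated by stacking fifths and reducing each note into one octave. The sketch assumes the ~696.5-cent fifth quoted in Step 3 and a chain running from E♭ to G♯; both choices are illustrative.

```python
fifth = 696.5   # cents; assumed from the tuning quoted in Step 3
a4 = 440.0      # reference pitch in Hz

# Chain of fifths from E-flat (-6) to G-sharp (+5), counted from A (0).
names = ["Eb", "Bb", "F", "C", "G", "D", "A", "E", "B", "F#", "C#", "G#"]
scale = []
for k, name in zip(range(-6, 6), names):
    cents = (k * fifth) % 1200.0        # reduce into the octave above A
    hz = a4 * 2 ** (cents / 1200.0)
    scale.append((cents, name, hz))

scale.sort()
for cents, name, hz in scale:
    print(f"{name:>2}  {cents:7.1f} cents  {hz:7.2f} Hz")

# Step sizes between neighbours, closing the octave at 1200 cents.
upper = scale[1:] + [(1200.0, "A", 2 * a4)]
steps = {round(hi[0] - lo[0], 3) for lo, hi in zip(scale, upper)}
print(steps)   # exactly two step sizes, as expected for a MOS
```

The two step sizes are the chromatic semitone (75.5 cents) and the diatonic semitone (117.5 cents), so the scale is audibly unequal even though it has twelve notes.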

The result is a 12-note scale that is very close to (but mathematically distinct from) standard 12-EDO. Its major thirds are purer (about 386 cents vs. 400 cents), its harmonic seventh 7/4 is also well-tuned, and it supports a richer variety of consonant chords.


15. Connections to Other Areas of Mathematics

RTT doesn’t exist in isolation — it connects to several other areas of mathematics in beautiful and sometimes surprising ways.

Lattice Reduction: LLL and Music

The problem of finding the “simplest” comma basis for a temperament is related to lattice reduction — finding a “short” basis for a lattice. The LLL algorithm (Lenstra–Lenstra–Lovász) has been explicitly applied in RTT to find optimal comma bases. Shorter commas (in the Tenney metric) are musically simpler and easier to hear, so finding a reduced comma basis is musically meaningful.

Algebraic Number Theory

The p-adic valuations of rational numbers are precisely the coordinates of the monzo representation! The monzo |a₀, a₁, a₂, …⟩ of a rational r is simply the tuple of p-adic valuations v_p(r) for each prime p. This means the entire RTT framework can be viewed as working in the group of adèles (or a finite-prime analogue thereof) of the rational numbers.
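The correspondence between valuations and monzo coordinates is easy to make concrete. In this sketch the function names `valuation` and `monzo` are illustrative, and the list of primes is supplied by the caller.

```python
from fractions import Fraction

def valuation(n, p):
    """p-adic valuation of a positive integer n: the exponent of p in n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def monzo(ratio, primes):
    """Prime-exponent vector of a positive rational over the given primes."""
    r = Fraction(ratio)
    return [valuation(r.numerator, p) - valuation(r.denominator, p) for p in primes]

print(monzo(Fraction(81, 80), [2, 3, 5]))        # [-4, 4, -1]
print(monzo(Fraction(225, 224), [2, 3, 5, 7]))   # [-5, 2, 2, -1]
```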

Information Theory and Complexity

The Tenney height log₂(numerator × denominator) of a rational number r = p/q in lowest terms is approximately the number of bits needed to write down p and q, so it serves as a computable stand-in for the Kolmogorov complexity of the ratio. In this sense the Tenney metric is a natural information-theoretic measure of harmonic complexity, connecting RTT to the theory of Kolmogorov complexity and minimum description length.
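A short sketch comparing the Tenney height with a literal bit count of the numerator and denominator. The function name `tenney_height` is illustrative, and the bit count is only one simple way of measuring description length.

```python
import math
from fractions import Fraction

def tenney_height(ratio):
    """log2(numerator * denominator) for a ratio in lowest terms."""
    r = Fraction(ratio)   # Fraction reduces to lowest terms automatically
    return math.log2(r.numerator * r.denominator)

for r in [Fraction(3, 2), Fraction(5, 4), Fraction(81, 80)]:
    bits = r.numerator.bit_length() + r.denominator.bit_length()
    print(f"{r}: Tenney height {tenney_height(r):.3f}, binary digits {bits}")
```

The bit count always exceeds the Tenney height by at most 2, so the two measures rank intervals in nearly the same order.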

Crystallography

The study of JI lattices and their temperament quotients has deep analogies with crystallography — the study of crystal structures as quotients of periodic lattices in 3D space. The symmetry groups of harmonic lattices under temperament are analogous to space groups and point groups in crystallography. The Bravais lattices of 2D and 3D JI have been studied explicitly by some RTT researchers.

Combinatorics on Words

The theory of MOS scales is intimately connected to Sturmian sequences: infinite binary sequences with the lowest possible complexity for an aperiodic sequence (exactly n+1 distinct subwords of length n). The MOS pattern of large (L) and small (s) steps in a generated scale is the finite, circular counterpart of a Sturmian word (a rotation of a Christoffel word). This connects scale theory to combinatorics on words, symbolic dynamics, and the theory of aperiodic sequences (like the Fibonacci word, which gives pentatonic-to-diatonic MOS scales).
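The n+1 subword count can be checked on a finite scale by reading its step pattern around a circle. The sketch below builds the 7-note scale from a 696.5-cent generator (an assumed value) and counts distinct circular subwords. Both function names are illustrative, and `step_pattern` assumes the chosen size is a MOS, so that only two step sizes occur.

```python
def step_pattern(generator, period, size):
    """L/s step pattern of the scale built from `size` stacked generators."""
    notes = sorted((k * generator) % period for k in range(size))
    steps = [round(b - a, 6) for a, b in zip(notes, notes[1:] + [period])]
    large = max(steps)
    return "".join("L" if s == large else "s" for s in steps)

def circular_subwords(word, n):
    """Distinct length-n subwords of `word` read around a circle."""
    doubled = word + word
    return {doubled[i:i + n] for i in range(len(word))}

pattern = step_pattern(696.5, 1200.0, 7)
print(pattern)                                                      # LLLsLLs
print([len(circular_subwords(pattern, n)) for n in range(1, 7)])    # [2, 3, 4, 5, 6, 7]
```

For every length n below the scale size there are exactly n+1 distinct subwords, mirroring the defining property of Sturmian sequences.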


16. Software, Resources, and the Xenharmonic Community

Tools for RTT

Several software tools implement RTT calculations:

  • Scala: the classic microtonal scale tool, with a large database of scales
  • Temperament calculators by Graham Breed at x31eq.com: find temperaments given comma lists or val lists, compute optimal tunings
  • LMSO (Lumatone Mapping Software Optimizer): for mapping temperaments to physical keyboards
  • Xenharmonic Wiki tools: the Xenharmonic Wiki has interactive tools for computing wedgies, finding temperament extensions, and exploring EDOs

The Xenharmonic Community

The study and composition of microtonal music using RTT is an active community, centered around the Xenharmonic Wiki and various online forums (the Xenharmonic Alliance on Facebook, the /r/microtonal subreddit, and Discord servers).

Prominent figures in RTT development include:

  • Paul Erlich: developer of Pajara, author of the famous 22-tone paper, and a central figure in systematizing RTT
  • Dave Keenan: extensive theoretical contributions, co-developer of the Sagittal notation
  • Gene Ward Smith: systematic mathematical development of RTT, extensive temperament catalogs
  • Graham Breed: software tools and theoretical contributions
  • George Secor: discoverer of the Miracle temperament and co-developer (with Dave Keenan) of the Sagittal notation

Conclusion: Why Linear Algebra Is the Language of Harmony

We’ve covered a lot of ground, so let’s summarize the core correspondences.

Regular Temperament Theory demonstrates something profound: the structure of musical harmony is a structure of linear algebra over the integers. The consonances of nature (simple frequency ratios) generate a free abelian group; musical scales are quotients of this group; tuning systems are linear projections; the complexity of harmonies is measured by a natural metric; and the art of scale design reduces to a precise optimization problem in a high-dimensional lattice.

This isn’t a metaphor or an analogy. It’s a literal identification. When a composer working in RTT designs a new scale, they are — whether they know it or not — choosing a sublattice of ℤⁿ, computing a quotient group, solving a least-squares problem, and optimizing a projection under a Euclidean metric.

And the music that results — with its new harmonies, its unfamiliar but internally coherent scales, its notes that Western music has never named — sounds like nothing you’ve heard before.

That, in the end, is what mathematics can give to music: not just the ability to describe what already exists, but the tools to navigate a much larger space of the possible, and to return with sounds that could not have been found any other way.


For further reading, the Xenharmonic Wiki is the primary reference for RTT concepts. Paul Erlich’s paper “A Middle Path Between Just Intonation and the Equal Temperaments” is an excellent entry point, as is the RTT primer available at Graham Breed’s microtonal site. For the mathematical foundations, any treatment of free abelian groups, Smith Normal Form, and lattice theory will provide the algebraic background; the application to music is the community’s own development.
