In the discussion on my minimal chords post, someone commented:

I have no idea what any of this means (what is a chord? what is a major chord? what is a note? what is a first/fourth/fifth note? is there a 65th note? what is a scale? what is a major scale? what does it mean that a note is of a scale? what does it mean that a chord uses a note? is there a difference between a chord using a note of a scale and not of a scale?)

Here's an attempt to introduce enough music theory to answer these questions:

We hear changes in air pressure. If those changes are rapid enough and consistent enough, we hear them as pitch (frequency). We can talk about these in terms of how many changes we get per second, which we call "Hz". For example, a pitch could be 100Hz or 500Hz. When we say a pitch is "higher" or "above" another pitch, we mean more changes per second: 500Hz is higher than 100Hz.

A note is something that gives the impression of being a single pitch. For example, what you get when you play a single key on the piano, or pluck a string on a stringed instrument. Many instruments can only play one note at a time: trumpet, flute, saxophone.

The standard notes used in Western music differ in pitch by a factor of the 12th root of 2 (~1.06x). This means that if you go up twelve notes (which we call "half steps", confusingly) your pitch doubles (the 12th root of 2, multiplied by itself twelve times, is just 2). Two notes whose pitch differs by a factor of two (ex: 100Hz and 200Hz) are said to be an "octave" apart, and sound almost like the same note. We give notes that differ by some number of octaves the same name (ex: "C"), though when we want to be specific about which octave we're talking about we can append numbers ("C1" at ~32Hz is an octave above "C0" at ~16Hz).
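Here is a minimal sketch of that arithmetic, in case it helps to see it run. The names `HALF_STEP` and `shift` are made up for illustration, and the 100Hz starting pitch is just the example value from above.

```python
# Ratio between adjacent notes (one half step) when the octave is split
# into twelve equal steps.
HALF_STEP = 2 ** (1 / 12)


def shift(frequency_hz, half_steps):
    """Return the pitch reached by moving `half_steps` notes up (negative moves down)."""
    return frequency_hz * HALF_STEP ** half_steps


base = 100.0  # arbitrary starting pitch in Hz
print(round(HALF_STEP, 4))        # 1.0595
print(round(shift(base, 1), 2))   # one note up: 105.95
print(round(shift(base, 12), 2))  # twelve notes up, an octave: 200.0
```

Going down works the same way with a negative count: `shift(200.0, -12)` lands back at roughly 100Hz.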

A scale is a set of notes from an octave. We usually talk about a scale as being sorted from lowest note to highest. We can define a scale by the distances between its notes. Perhaps the simplest scale (the "chromatic scale") would be to go up by one note each time, playing every note: 111111111111. This typically doesn't sound very good, and we don't usually use it.

A "major scale" has the pattern 2212221: you go up by two notes, two notes, one note, etc. This gives you seven different notes in your octave. We can call these notes the "first", "second", etc notes of the major scale. We typically don't talk about "65th" notes because they would be way too high.

We name the notes with the letters A through G, which is only seven options for twelve notes. Each letter refers to a note that is one or two notes higher than the previous. For example, if we have the notes "A B C", to go from A to B we go up two notes, while from B to C we go up one note. To refer to the note we skipped when going from A to B we can say "A#" ("A sharp") which means "start at A and go up one note" or "Bb" ("B flat") which means "start at B and go down one note". This is all very silly, but it's what we're stuck with for historical reasons. If you start with C and go up through the notes of the major scale, you will use the seven named notes: "C D E F G A B".
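As a rough sketch, the 2212221 pattern can be walked mechanically over the twelve note names. The list `NOTE_NAMES` and the function `scale_notes` are illustrative names, and the list uses only sharp spellings (so "A#" rather than "Bb") to keep things short.

```python
# The twelve notes of one octave, using sharp names for the in-between notes.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_PATTERN = "2212221"  # how many notes to go up at each step of a major scale


def scale_notes(start, pattern=MAJOR_PATTERN):
    """Return the note names of the scale that begins on `start`."""
    index = NOTE_NAMES.index(start)
    notes = [start]
    # The last step only returns to the starting note an octave up, so skip it.
    for step in pattern[:-1]:
        index = (index + int(step)) % len(NOTE_NAMES)
        notes.append(NOTE_NAMES[index])
    return notes


print(scale_notes("C"))  # ['C', 'D', 'E', 'F', 'G', 'A', 'B']
```

Passing `"111111111111"` as the pattern gives the chromatic scale instead.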

A chord is multiple notes played at the same time. The chords I was talking about in my post were "triads", which means they are three simultaneous notes. A major chord is notes one, three, and five of a major scale. A minor chord is the same, but the middle note (three) is moved down one note, which we call "flat" or "minor". You can also skip the third and play just notes one and five ("open fifths" or "power chords") which I do a lot on mandolin.
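A small sketch of those three chord recipes, assuming sharp-only note names; all the function names here are invented for the example.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]


def major_scale(root):
    """Return the seven notes of the major scale starting on `root`."""
    index = NOTE_NAMES.index(root)
    notes = []
    for step in MAJOR_STEPS:
        notes.append(NOTE_NAMES[index])
        index = (index + step) % len(NOTE_NAMES)
    return notes


def lower_one_note(note):
    """Move a note down by one note (one half step)."""
    return NOTE_NAMES[(NOTE_NAMES.index(note) - 1) % len(NOTE_NAMES)]


def major_chord(root):
    scale = major_scale(root)
    return [scale[0], scale[2], scale[4]]  # notes one, three, and five


def minor_chord(root):
    first, third, fifth = major_chord(root)
    return [first, lower_one_note(third), fifth]  # same, with the third lowered


def power_chord(root):
    scale = major_scale(root)
    return [scale[0], scale[4]]  # notes one and five only


print(major_chord("C"))  # ['C', 'E', 'G']
print(minor_chord("C"))  # ['C', 'D#', 'G'], where D# is the same note as Eb
print(power_chord("C"))  # ['C', 'G']
```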

A key is the combination of a scale and a starting note. For example, "C major" is a major scale starting on a C, while "D major" is the same but starting on a D. Most songs in traditional, pop, folk, and rock music draw all their notes from a single key, and all their chords will be built out of notes from that key as well.
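To make "built out of notes from that key" concrete, here is a hedged sketch that lists the notes of a major key and checks whether a chord stays inside it. `major_key` and `fits_key` are hypothetical helper names, and notes are spelled with sharps only.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]


def major_key(start):
    """Return the notes of the major key that starts on `start`."""
    index = NOTE_NAMES.index(start)
    notes = []
    for step in MAJOR_STEPS:
        notes.append(NOTE_NAMES[index])
        index = (index + step) % len(NOTE_NAMES)
    return notes


def fits_key(chord, key_notes):
    """True if every note of the chord belongs to the key."""
    return all(note in key_notes for note in chord)


d_major = major_key("D")
print(d_major)                              # ['D', 'E', 'F#', 'G', 'A', 'B', 'C#']
print(fits_key(["A", "C#", "E"], d_major))  # True
print(fits_key(["C", "E", "G"], d_major))   # False: D major has C#, not C
```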


Thanks! This, together with gjm's comment, is very informative.

How is the base or fundamental frequency chosen? What is special about the standard ones?

There isn't really anything special: you could take almost any piece of music and shift it up or down a few percent without affecting how people experience it very much. On the other hand, if you have multiple instruments together, it matters a lot that they agree on what frequencies to use. We've generally standardized on setting A=440Hz, and everything else relative to that.

Aside: this was a real missed opportunity, because it puts C very close to 256Hz but not quite there. We could have had 2^N Hz be C for all N!
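A quick sketch of the numbers behind that aside. It assumes the C in question is the one nine notes below the A that is tuned to 440Hz (usually called middle C); the variable names are just for illustration.

```python
A4_HZ = 440.0            # the tuning standard
HALF_STEPS_C_TO_A = 9    # assumption: middle C sits nine notes below that A

# With A at 440Hz, where does C land?
c_hz = A4_HZ * 2 ** (-HALF_STEPS_C_TO_A / 12)
print(round(c_hz, 2))  # 261.63

# Conversely, what would A have to be for C to land exactly on 256Hz?
a_for_c256 = 256.0 * 2 ** (HALF_STEPS_C_TO_A / 12)
print(round(a_for_c256, 2))  # 430.54
```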

Some instruments have notes or keys that they are best at. For example, a singer will have some minimum and maximum note, and perhaps some areas in between that sound better or worse, which means that for any given piece of music there is a (often narrow) range of keys where it will fit best. Other instruments, like a flute or trumpet, have some keys that fall very naturally on the instrument (D and Bb respectively), while some other keys require awkward fingerings. Some instruments (bagpipes, anglo concertina, tin whistle) can only be played in one or a few keys, because they are missing notes that would be needed for other keys.

Thanks, I found this very helpful!  My daughter is taking guitar lessons and I think now I can maybe talk a bit more intelligently with her about it.

The standard notes used in Western music differ in pitch by a factor of the 12th root of 2 (~1.06x).

I was going to ask "why?" here and in some other places. But I'm guessing you answer the "why" later when you say:

This is all very silly, but it's what we're stuck with for historical reasons.

As other commenters have said, approximating integer ratios is important.

  • 1:2 is the octave
  • 2:3 is the perfect fifth
  • 3:4 is the perfect fourth
  • 4:5 is the major third
  • 5:6 is the minor third

and it just so happens that these ratios are close to powers of the 12th root of 2. 

  • 2^(12/12) is the octave
  • 2^(7/12) is the perfect fifth
  • 2^(5/12) is the perfect fourth
  • 2^(4/12) is the major third
  • 2^(3/12) is the minor third

You can do the math and verify those numbers are relatively close.
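For anyone who would rather run the comparison than do it by hand, here is a rough sketch. It reports the mismatch in cents, a standard unit where 100 cents is one equal-tempered half step and 1200 cents is an octave; the names `INTERVALS`, `cents`, and `errors` are made up for the example.

```python
import math
from fractions import Fraction

# (name, exact small-integer ratio, number of half steps in equal temperament)
INTERVALS = [
    ("octave", Fraction(2, 1), 12),
    ("perfect fifth", Fraction(3, 2), 7),
    ("perfect fourth", Fraction(4, 3), 5),
    ("major third", Fraction(5, 4), 4),
    ("minor third", Fraction(6, 5), 3),
]


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))


errors = {}
for name, just, half_steps in INTERVALS:
    tempered = 2 ** (half_steps / 12)
    # Positive means the equal-tempered interval is wider than the exact ratio.
    errors[name] = cents(tempered) - cents(just)
    print(f"{name:15} exact={float(just):.4f} tempered={tempered:.4f} "
          f"error={errors[name]:+.1f} cents")
```

The fourth and fifth come out within about two cents of the exact ratios, while the thirds are off by roughly fourteen to sixteen cents.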

It's important to recognize that this correspondence is relatively recently discovered; it was developed independently in China in 1584 and in Europe in 1605, and coexisted with other schemes for finding approximations of those ratios for hundreds of years, and there are still people who think that this system sucks and we should use a different one, because of minor differences in pitch. (Also, the Chinese system actually used 24th roots of 2, not 12th roots.) This system is called "Equal Temperament", and there are many other tuning systems that make slightly different choices.

Why not just use the exact integer ratios instead of the approximate ones? Well, if you're playing on a violin or singing, you can use exact integer ratios. But if you're using a fixed-note instrument, like a guitar (with frets) or a piano, then you have to deal with the issue that if you go up 1 octave and 1 minor third from a note, and also go up three perfect fourths, you get two notes that are almost identical, but different enough to go out of tune. (This is called the Syntonic comma.) So which one do you put on the piano? If you choose one, the other will sound a little wrong. Or, you could choose the average, and they'll both sound a little wrong.
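The mismatch between those two routes can be checked with exact fractions. This is only a sketch of the arithmetic described above, with illustrative variable names.

```python
from fractions import Fraction

octave = Fraction(2, 1)
minor_third = Fraction(6, 5)
perfect_fourth = Fraction(4, 3)

route_a = octave * minor_third   # up one octave and one minor third
route_b = perfect_fourth ** 3    # up three perfect fourths
comma = route_a / route_b        # how far apart the two results are

print(route_a, route_b, comma)   # 12/5 64/27 81/80
print(float(comma))              # 1.0125
```

So the two notes differ by a factor of 81/80, a bit over one percent, which is small but audible when both are held against the same starting note.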

The thing here is that ancient people discovered that notes which have frequencies in a ratio of small integers sound good together. E.g. 2:3.

For a long time, people were creating scales trying to have as many nice ratios as possible. This has problems. I’ll let you think about those yourself.

Then some guy figured out that the human ear is not perfect and we can’t really tell whether we hear 2:3 or 2:2.9966. And came up with the idea of doing those 12th roots of 2.

Now, try to do 2^(7/12). Try also 2^(4/12). You see?

You might ask "... but why do small-integer ratios sound good?". The most plausible explanation I know of is due to William Sethares, whose book Tuning, Timbre, Spectrum, Scale I highly recommend. He reckons (with some evidence) that it goes like this:

  • If you play pure sine waves together and ask people what sounds nice, there isn't any strong preference for small-integer ratios. What there is is a dislike of notes that are almost but not quite coincident.
  • Real musical notes are not pure sine waves, but they can be considered (via Fourier analysis) to be made of pure sine waves, and the frequency-discriminating hardware in our ears does in fact basically do that Fourier analysis.
  • For most (but not all) ways of generating musical notes, what you actually have is a "fundamental" frequency together with integer multiples of that frequency. The higher multiples have less energy in them. Sometimes you only have odd but not even multiples. The exact distribution of sound energy across the frequencies -- the so-called "spectrum" of the sound -- is one of the main things that determines what it sounds like to us.
  • So: suppose two notes sound good together when none of the sine waves making them up clash with one another by being close but not close enough to be indistinguishable to our ears. You might try to make that happen by having none of those frequencies close to one another, but it turns out that you can't, for real instruments that have lots of frequencies in their spectrum. The other thing you can do is to make the ones that come close together actually coincide, at least closely enough for human ears -- and the way you do that is by having the ratio of the two frequencies be in a small-integer ratio.
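A toy illustration of the argument in the list above: build the harmonics of two notes and count the pairs that are close together without coinciding. This is not Sethares's actual model. The 0.5% "sounds the same" tolerance and the 6% "close enough to clash" window (about one half step) are assumptions picked purely for the example, as are the function names.

```python
def harmonics(fundamental_hz, count=6):
    """Integer multiples of the fundamental: the idealised spectrum of a note."""
    return [fundamental_hz * n for n in range(1, count + 1)]


def count_clashes(f1, f2, count=6, same=0.005, near=0.06):
    """Count harmonic pairs that are near each other but not coincident.

    `same` and `near` are relative-difference thresholds chosen for this
    sketch, not measured properties of human hearing.
    """
    clashes = 0
    for a in harmonics(f1, count):
        for b in harmonics(f2, count):
            gap = abs(a - b) / min(a, b)
            if same < gap < near:
                clashes += 1
    return clashes


print(count_clashes(200, 300))  # exact 2:3 ratio: 0
print(count_clashes(200, 290))  # slightly off from 2:3: 2
```

With the exact 2:3 ratio, the harmonics that come close (600Hz and 1200Hz) land exactly on top of each other; nudge the upper note to 290Hz and those same harmonics end up 20Hz and 40Hz apart, which is the "almost but not quite coincident" situation.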

This account of consonance and dissonance has some interesting consequences. For instance, if you make an instrument whose overtones are not harmonic -- i.e., not all simple integer multiples of the fundamental frequency -- then the combinations of notes on that instrument that will sound good together will not be the same ones that work for harmonic instruments like violins, flutes, saxophones, and human voices. This typically happens for instruments where the most important resonating object isn't basically one-dimensional (like a violin string, or the column of air in an organ pipe) -- for instance, a drum or bell. And, indeed, if you listen to gamelan music, which is played on bells and drums, you will notice that it uses a different scale (in fact, two different scales) from the one that's common in "Western" music, and one that in fact is a better fit for the spectra of those instruments! (So says Sethares, anyway.)

And if you have some nonstandard scale and would like some of the possible chords you can play with it to sound good, you can make it so by constructing an instrument with a suitable spectrum. That's hard to do with actual physical instruments, but in these glorious days of computational everything it's pretty easy to do with a synthesized instrument. And lo, Sethares has e.g. constructed "instruments" in which one can play nice-sounding music in 10-tone equal temperament, even though none of its intervals is anything like a nice simple rational number.

Do you feel like a major triad is more consonant than a minor triad? (I do.) Sethares's theory can kinda explain that: you have the same set of intervals (a major third, a minor third, a perfect fifth) but the major third is "nicer" than the minor third, and in a major triad the less-consonant minor third occurs at higher pitch, which means that fewer of the overtones are present and more of them are up where your ears don't hear so well.

Sethares' theory is very nice: we don't hear "these two frequencies have a simple ratio", we hear "their overtones align". But I'm not sure it is the whole story.

If you play a bunch of sine waves in ratios 1:2:3:4:5, it will sound to you like a single note. That perceptual fusion cannot be based on aligning overtones, because sine waves don't have overtones. Moreover, if you play 2:3:4:5, your mind will sometimes supply the missing 1; this is known as the "missing fundamental". And if you play some sine waves slightly shifted from 1:2:3:4:5, you'll notice the inharmonicity (at least, I do). So we must have some facility to notice simple ratios, not based on overtone alignment. So our perception of chords probably uses this facility too, not only overtone alignment.

Just making something explicit that I think I missed for a minute when reading your comment: the point isn't "Sethares doesn't explain how our ears/brains determine what's one note and what's more, so his theory is incomplete" (his theory isn't trying to be a theory of that) but "our ears/brains seem to determine what's one note and what's more by doing something like looking for simple integer frequency multiples, and if there's a mechanism for that it seems likely that it's also involved in determining what combinations of tones sound good to us". I think there's something to that. Here are two things that seem like they push the other way:

  • On the face of it, this indicates machinery for identifying integer ratios, not necessarily rational ones. (Though maybe the missing-fundamental phenomenon suggests otherwise.)
  • Suppose you hear a violin and a flute playing the same note. You probably will not hear them as a single instrument. I think that whatever magic our ears/brains do to figure out what's one instrument and what's several also involves things like the exact times when spectral components appear and disappear, and which spectral components appear to be fluctuating together (in frequency or amplitude or both), and maybe even fitting spectral patterns to those of instruments we're used to hearing. (I suspect there's a pile of research on this. I haven't looked.) The more other things we use for that, the less confident we can be that integer-frequency-ratio identification is part of it.
    • Interesting experiment which I am too lazy to try: pick two frequencies with a highly irrational ratio, construct their harmonic series, and split each harmonic series into two groups. So we have A1 and A2 (splitting up the spectrum of note A) and B1 and B2 (splitting up the spectrum of note B). Now construct a sound built out of all those components -- but make A1 and B1 match closely in details of timing, slight fluctuations in frequency and amplitude, and anything else we can think of, and likewise for A2+B2. Does someone listening to this then hear two clashing tones, each of them nicely harmonic (i.e., A1+A2 versus B1+B2), or two weird inharmonic tones that somehow fit together (i.e., A1+B1 versus A2+B2)? What happens if first of all we play A1+B1, A2+B2, A1+A2, B1+B2 separately? The answer may well be that there isn't really an answer, alas.

If you play a bunch of sine waves in ratios 1:2:3:4:5, it will sound to you like a single note. That perceptual fusion cannot be based on aligning overtones, because sine waves don't have overtones.

The way I would explain this is that when hearing real sounds it is very common that you hear a frequency and its harmonics. Almost all the time, if you hear 1:2:3:4:5 etc that is because a single note just sounded. So, if you hear a bunch of sine waves in that ratio (ex: a determined group of people whistling) it sounds like one note.

For those who don't want to break out a calculator, Wikipedia has it here:

https://en.wikipedia.org/wiki/Equal_temperament#Comparison_with_just_intonation

You can see the perfect fourth and perfect fifth are very close to 4/3 and 3/2 respectively. This is basically just a coincidence and we use 12 notes per octave because there are these almost nice fractions. A major scale uses the 2212221 pattern because that hits all the best matches with low denominators, skipping 16/15 but hitting 9/8, for example. 

Even temperament, the twelfth root of two thing, was introduced to allow different instruments to be played together, and to create a finite set of notes and keys. Under the previous system, you get to use exact simple ratios like 3:2, which even temperament can only approximate, but repeatedly multiplying by 3:2 generates an infinite number of notes, not a cycle of twelve. Even temperament is always slightly off, but evenly so, hence the name. Guitars are naturally even tempered: a Pythagorean guitar would have frets zig-zagging instead of parallel and evenly spaced.
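A short sketch of why stacking exact 3:2 ratios never closes into a cycle. It multiplies by 3/2 twelve times, halving whenever the result leaves the octave, and compares that with twelve equal-tempered fifths; the variable names are illustrative.

```python
from fractions import Fraction

fifth = Fraction(3, 2)
ratio = Fraction(1)
for _ in range(12):
    ratio *= fifth
    # Drop down an octave whenever we climb past one, to stay within [1, 2).
    while ratio >= 2:
        ratio /= 2

print(ratio)         # 531441/524288
print(float(ratio))  # a little over 1.0136, not back at exactly 1

# The equal-tempered fifth is 2^(7/12); twelve of them make exactly seven octaves.
tempered = (2 ** (7 / 12)) ** 12 / 2 ** 7
print(round(tempered, 9))  # 1.0
```

The exact version can never return to 1, because a power of 3 is always odd and a power of 2 (other than 1) is always even, so no stack of 3:2 fifths equals a whole number of octaves.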
