A couple years ago I got really excited about the idea of controlling a synthesizer by whistling. I came back to this recently, figured out a much better algorithm, and it's much more musical now!

(Short version: go try my new program, in your browser!)

In my original version, I counted zero crossings to estimate the current pitch (detailed explanation). Then I fed this into simple additive synthesis code. It worked, but it had a few issues:

  • Sometimes it would detect whistling when there wasn't any, and make clicks (or beeps in a later version).

  • If you whistled too quietly or not clearly enough, it would fail to detect the whistling and you would get silence or dropouts.

  • It was very sensitive to a "gate" parameter: set it too high and you get the first problem above, too low and you get the second. In noisy environments there was often no good setting.

  • It was not steady enough to feed into an existing synthesizer, because those expect the precise input you get from a keyboard.

  • My simple custom synthesizer didn't sound very musical.

My new version is much better on all of these dimensions. The key idea is that because the input signal is very close to a sine wave, we can directly use it for synthesis. Here is a short snapshot of whistling:

As before, each time we get to an upward zero crossing we estimate the pitch:
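A minimal Python sketch of that estimate, not the code the browser program actually runs. The function names, the linear interpolation between samples, and the synthetic 1000 Hz sine standing in for a whistle are all assumptions made for illustration.

```python
import math

def upward_zero_crossings(samples):
    """Return fractional sample positions where the signal crosses zero going up."""
    crossings = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if prev < 0 <= cur:
            # Interpolate linearly between the two samples for sub-sample accuracy.
            crossings.append(i - 1 + (-prev) / (cur - prev))
    return crossings

def estimate_pitches(samples, sample_rate):
    """Return one frequency estimate (Hz) per cycle between upward crossings."""
    crossings = upward_zero_crossings(samples)
    return [sample_rate / (b - a) for a, b in zip(crossings, crossings[1:])]

sample_rate = 44100
# Synthetic stand-in for a whistle: a pure 1000 Hz sine wave.
whistle = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(2048)]
pitches = estimate_pitches(whistle, sample_rate)
```

On a clean sine like this every estimate lands very close to 1000 Hz. Real whistling is noisier, so the estimates will jitter from cycle to cycle.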

This time, however, instead of tuning a synthesizer, we start playback of the previous cycle at a multiple of the original speed. Let's say we move at half speed, to shift the pitch down an octave. When the next input cycle begins, we will also be beginning an output cycle. A quarter of the way through the input cycle, when we are at our maximum value, the output is still ramping up. This continues until we have finished an input cycle and half an output cycle:

At this point our zero crossing detector will give us another pitch estimate, so we kick off another half-speed replay. Since our previous replay is only halfway through the wave, however, we need to start this one with reversed polarity or else they will cancel out.
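Here is a rough offline sketch of that replay scheme in Python, under some simplifying assumptions: crossings are detected at whole-sample positions, each replay passes over the captured cycle exactly once, and polarity simply alternates from one replay to the next. The names `octave_down` and `add_replay` are hypothetical, and the real program works on streaming audio rather than a list.

```python
import math

def add_replay(out, cycle, start, speed, polarity):
    """Mix one pass over `cycle` into `out`, reading `speed` input samples per output sample."""
    length = int(len(cycle) / speed)
    for n in range(length):
        if start + n >= len(out):
            break
        pos = n * speed
        j = int(pos)
        frac = pos - j
        a = cycle[j]
        # Past the end of the cycle we are near a zero crossing, so treat it as 0.
        b = cycle[j + 1] if j + 1 < len(cycle) else 0.0
        out[start + n] += polarity * (a + frac * (b - a))

def octave_down(samples, speed=0.5):
    """At each upward zero crossing, replay the cycle that just ended at `speed`."""
    out = [0.0] * len(samples)
    last_crossing = None
    num_cycles = 0
    for i in range(1, len(samples)):
        if samples[i - 1] < 0 <= samples[i]:
            if last_crossing is not None:
                cycle = samples[last_crossing:i]
                # Flip polarity on every other replay so that overlapping
                # replays reinforce each other instead of cancelling.
                polarity = 1.0 if num_cycles % 2 == 0 else -1.0
                add_replay(out, cycle, i, speed, polarity)
                num_cycles += 1
            last_crossing = i
    return out

sample_rate = 44100
# Synthetic stand-in for a whistle: 0.1 seconds of a 1000 Hz sine wave.
whistle = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(4410)]
shifted = octave_down(whistle)
```

With a 1000 Hz input, most of the energy in `shifted` ends up around 500 Hz. At half speed each replay lasts about two input cycles, so two replays overlap at any moment, which is why the polarity flip matters.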

We can run each replay for several cycles, which has the nice benefit of reducing incidental noise: if there is a blip in the input, it will probably not be repeated in the next cycle. Possibly there is something more aggressive, where we take the median value from N recent cycles?
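One way the median idea might look, purely as a sketch of the open question above and not something the program is known to do. It assumes the recent cycles are stretched to a common length by linear interpolation before taking a per-position median, and `resample` and `median_cycle` are hypothetical names.

```python
from statistics import median

def resample(cycle, length):
    """Linearly interpolate `cycle` to exactly `length` points."""
    if length == 1:
        return [cycle[0]]
    scale = (len(cycle) - 1) / (length - 1)
    out = []
    for n in range(length):
        pos = n * scale
        j = int(pos)
        frac = pos - j
        nxt = cycle[min(j + 1, len(cycle) - 1)]
        out.append(cycle[j] + frac * (nxt - cycle[j]))
    return out

def median_cycle(recent_cycles):
    """Per-position median across cycles, at the length of the newest cycle."""
    length = len(recent_cycles[-1])
    aligned = [resample(c, length) for c in recent_cycles]
    return [median(column) for column in zip(*aligned)]

clean = [0.0, 1.0, 0.0, -1.0]
blip = [0.0, 1.0, 5.0, -1.0]  # one cycle with a spike in it
smoothed = median_cycle([clean, blip, clean])
```

Here the spike appears in only one of the three cycles, so the median discards it and `smoothed` matches the clean cycle.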

This gives you a basic octaver, which shifts the pitch of the input down an octave as long as the timbre is simple enough that zero crossings work well. (example)

We can play around with this, however, since there are two parameters we can tweak:

  • How quickly do we pass over the previous cycle? Changing this mostly affects timbre.

  • How do we decide what polarity to use? Changing this affects pitch and timbre. I've divided this into two parameters, cycle and mod, where polarity = (num_cycles * cycle) % mod == 0
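To get a feel for the polarity rule, here is the expression wrapped in a small Python function and evaluated for a few replays in a row. The function name is hypothetical, and the text does not say which polarity a true result selects, so this only shows the pattern of flips.

```python
def replay_polarity(num_cycles, cycle, mod):
    """Evaluate the polarity expression for the replay numbered `num_cycles`."""
    return (num_cycles * cycle) % mod == 0

# cycle=1, mod=2 flips on every replay.
alternating = [replay_polarity(n, 1, 2) for n in range(6)]
# -> [True, False, True, False, True, False]

# cycle=1, mod=3 is true on every third replay.
every_third = [replay_polarity(n, 1, 3) for n in range(6)]
# -> [True, False, False, True, False, False]
```

The first pattern alternates on every replay, which looks like the flip-every-time behavior of the half-speed octaver described earlier, though that mapping is my reading rather than something stated in the text.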

You can stack several of these replays with different parameters to build more complex sounds. Because whistling is so harmonically minimal, you generally want to layer several playbacks. If you sing, however, the harmonics are already complex enough that you typically only want one or maybe two playbacks.

I'm quite happy with it. It's much more musical than my earlier approaches, and much less finicky. Give it a try: in-browser pitch-shifter and synthesizer.
