A couple years ago I got really excited about the idea of controlling a
synthesizer by whistling. I came back to this recently, figured out a
much better algorithm, and it's much more musical now!

(Short version: go try my new program, in your
browser!)

In my original version, I counted zero crossings to estimate the
current pitch (detailed
explanation). Then I fed this into simple additive synthesis
code. It worked, but it had a few issues:

Sometimes it would detect whistling when there wasn't any, and
make clicks (or beeps in a later version).

If you whistled too quietly or not clearly enough, it would
fail to detect the whistling and you would get silence, or dropouts.

It was very sensitive to a "gate" parameter: set it too high and
you get the first problem above, too low and you get the second.
In noisy environments there was often no good setting.

It was not steady enough to feed into an existing
synthesizer, because those expect the precise input you get from my
keyboard.

My simple custom synthesizer didn't sound very musical.

My new version is much better on all of these dimensions. The key idea
is that because the input signal is very close to a sine wave, we can
directly use it for synthesis. Here is a short snapshot of whistling:

As before, each time we get to an upward zero crossing we estimate the
pitch:
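
To make the zero-crossing step concrete, here is a minimal Python sketch, not the program's actual code. The function names, the linear interpolation between samples, and the 44100 Hz sample rate are all assumptions for illustration.

```python
import math

def upward_zero_crossings(samples):
    """Return fractional sample positions where the signal crosses zero going up."""
    crossings = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if prev < 0 <= cur:
            # Assumption: interpolate linearly between the two samples
            # to place the crossing with sub-sample accuracy.
            crossings.append(i - 1 + (-prev) / (cur - prev))
    return crossings

def estimate_pitch_hz(samples, sample_rate):
    """Estimate pitch from the spacing of the two most recent upward crossings."""
    crossings = upward_zero_crossings(samples)
    if len(crossings) < 2:
        return None  # not enough signal to measure a full cycle
    period_in_samples = crossings[-1] - crossings[-2]
    return sample_rate / period_in_samples

# Usage: a clean 1000 Hz sine standing in for a whistle.
sample_rate = 44100
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate + 0.3) for n in range(2048)]
estimate = estimate_pitch_hz(tone, sample_rate)
print(estimate)  # close to 1000
```

This only works well when the input is close to a sine wave, which is the case the post relies on for whistling.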

This time, however, instead of tuning a synthesizer, we start playback
of the previous cycle at a multiple of the original speed. Let's say
we move at half speed, to shift the pitch down an octave. When the
next input cycle begins, we will also be beginning an output cycle.
A quarter of the way through the input cycle, when we are at our
maximum value, the output is still ramping up. This continues, until
we have finished an input cycle and half an output cycle:

At this point our zero crossing detector will give us another pitch
estimate, so we kick off another half-speed replay. Since our
previous replay is only halfway through the wave, however, we need to
start this one with reversed polarity or else they will cancel
out.
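
The half-speed replay with alternating polarity can be sketched offline in Python. This is an illustration under assumptions, not the actual implementation: `octave_down` and its `speed` parameter are hypothetical names, each replay plays the previous cycle exactly once, and samples are read with a nearest-neighbor lookup rather than interpolation.

```python
import math

def octave_down(samples, speed=0.5):
    """Replay each completed input cycle at `speed`, flipping polarity every cycle."""
    out = [0.0] * len(samples)
    last_crossing = None  # index where the current input cycle began
    polarity = 1.0        # sign applied to the next replay
    replays = []          # active replays as [cycle, read_position, sign]
    for i in range(len(samples)):
        # Upward zero crossing: the previous input cycle is now complete.
        if i > 0 and samples[i - 1] < 0 <= samples[i]:
            if last_crossing is not None:
                replays.append([samples[last_crossing:i], 0.0, polarity])
                polarity = -polarity  # reverse polarity for the next replay
            last_crossing = i
        still_active = []
        for replay in replays:
            cycle, pos, sign = replay
            out[i] += sign * cycle[int(pos)]  # nearest-neighbor read (assumption)
            replay[1] = pos + speed
            if replay[1] < len(cycle):
                still_active.append(replay)
        replays = still_active
    return out

# Usage: a sine with a 100-sample cycle, offset by half a sample so that
# no sample lands exactly on zero.
tone = [math.sin(2 * math.pi * (n + 0.5) / 100) for n in range(1200)]
shifted = octave_down(tone)
```

With this input the output settles into a 200-sample cycle at roughly twice the input amplitude, because the two overlapping replays reinforce each other instead of cancelling.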

We can run each replay for several cycles, which has the nice benefit
of reducing incidental noise: if there is a blip in the input, it will
probably not be repeated in the next cycle. Possibly there is
something more aggressive, where we take the median value from N
recent cycles?
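
One way the median idea could look, purely as a speculative sketch: resample each of the N most recent cycles to a common length, then take the pointwise median. The helper names and the linear-interpolation resampling are assumptions, and nothing here is taken from the program itself.

```python
import math
from statistics import median

def resample_cycle(cycle, length):
    """Stretch or squeeze one non-empty cycle to `length` samples (linear interpolation)."""
    if len(cycle) == 1 or length == 1:
        return [cycle[0]] * length
    out = []
    for k in range(length):
        pos = k * (len(cycle) - 1) / (length - 1)
        i = int(pos)
        frac = pos - i
        nxt = cycle[min(i + 1, len(cycle) - 1)]
        out.append(cycle[i] * (1 - frac) + nxt * frac)
    return out

def median_cycle(cycles, length=None):
    """Pointwise median over several recent cycles, aligned to a common length."""
    if not cycles:
        return []
    if length is None:
        length = len(cycles[-1])  # assumption: match the most recent cycle
    aligned = [resample_cycle(c, length) for c in cycles]
    return [median(column) for column in zip(*aligned)]

# Usage: three cycles of the same sine, one of them with a blip.
clean = [math.sin(2 * math.pi * (n + 0.5) / 100) for n in range(100)]
noisy = list(clean)
noisy[30] = 5.0  # a single-sample blip
smoothed = median_cycle([clean, noisy, clean])
```

Because the blip appears in only one of the three cycles, the median at that position falls back to the clean value.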

This gives you a basic octaver, which shifts the pitch of the input
down an octave as long as the timbre is simple enough that zero
crossings work well. (example)

We can play around with this, however, since there are two parameters we can tweak:

How quickly do we pass over the previous cycle? Changing this
mostly affects timbre.

How do we decide what polarity to use? Changing this affects
pitch and timbre. I've divided this into two parameters,
cycle and mod, used as
polarity = (num_cycles * cycle) % mod == 0
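
To see what the rule does, here is a small Python sketch that prints the polarity pattern for a few settings. Mapping a true result to positive polarity and a false one to reversed polarity is an assumption, as are the integer values chosen for cycle and mod.

```python
def replay_polarity(num_cycles, cycle, mod):
    """Evaluate the rule (num_cycles * cycle) % mod == 0 for one replay."""
    return (num_cycles * cycle) % mod == 0

def polarity_pattern(cycle, mod, count=8):
    """Polarity of the first `count` replays; '+' for True, '-' for False (assumption)."""
    return "".join("+" if replay_polarity(n, cycle, mod) else "-" for n in range(count))

print(polarity_pattern(1, 2))  # +-+-+-+-
print(polarity_pattern(1, 3))  # +--+--+-
print(polarity_pattern(1, 4))  # +---+---
```

Under this mapping, cycle=1 with mod=2 flips polarity on every cycle, which matches the half-speed octaver described earlier.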

You can stack several of these replays with different parameters to
build more complex sounds. Because whistling is so harmonically
minimal, you generally want to layer several playbacks. If you sing,
however, the harmonics are already complex enough that you typically
only want one or maybe two playbacks.

I'm quite happy with it. It's much more musical than my earlier approaches, and much less finicky. Give it a try: in-browser pitch-shifter and synthesizer.
