Before I get into the nuances of MTS non-realtime tuning changes and SysEx, I want to state very clearly the one simple requirement synth manufacturers need to follow to avoid the artifacts mentioned earlier.
That requirement is: when a tuning change comes in, the synth needs to zone the retuned note properly.
That means synth manufacturers need to ensure that, no matter what frequency a MIDI note number is retuned to, things like the choice of sample, filter settings, keyboard tracking, and so on depend only on the target frequency, not on the 12-EDO default frequency for that MIDI note.
An obvious example: if we're working with a sampler, and we retune the entire tuning table to 31-EDO, MIDI note 0 needs to get the appropriate sample for "two octaves below middle C," not the default sample for MIDI note 0 naively being resampled up like three octaves or whatever, which would sound ridiculous.
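To make the sampler example concrete, here's a minimal sketch of what "depends only on the target frequency" might look like. The zone table, sample names, and `pick_sample` helper are hypothetical illustrations, not any particular synth's API; the only point is that the lookup key is the tuned frequency.

```python
# Hypothetical key zones for a sampled piano: (low_hz, high_hz, sample file).
# The lookup key is the *tuned* frequency, never the 12-EDO default frequency
# of the incoming MIDI note number.
ZONES = [
    (0.0,     60.0,    "piano_low.wav"),
    (60.0,    240.0,   "piano_mid_low.wav"),
    (240.0,   960.0,   "piano_mid_high.wav"),
    (960.0,   20000.0, "piano_high.wav"),
]

def pick_sample(target_hz: float) -> str:
    """Choose the sample by the retuned frequency, not by the note number."""
    for low, high, sample in ZONES:
        if low <= target_hz < high:
            return sample
    raise ValueError(f"no zone covers {target_hz} Hz")

# MIDI note 0 retuned to roughly two octaves below middle C (~65.4 Hz)
# lands in the appropriate zone, instead of note 0's default sample being
# resampled up several octaves.
print(pick_sample(65.4))  # -> "piano_mid_low.wav"
```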
This is obvious to us, but it's one of those things that really needs to be stated explicitly. Most manufacturers, for instance, don't treat the existing channel pitch bend this way: if you load up a piano sample and pitch bend it up two octaves, you get timbre deformation from the naive resampling that's typically used. Since synth manufacturers don't know a priori how to handle MTS, they need to be told not to treat these realtime tuning changes as polyphonic pitch bends from a 12-EDO skeleton, which they may well do if they think we only care about 12-note scales. It needs to be spelled out.
This rather unambitious requirement turns out to have remarkable consequences. If you can take any note and tune it to anything, with the guarantee that the synth won't cling to that note's former 12-EDO interpretation in any way at all, then you can take note 127 and tune it down into the bass, or take note 0 and tune it up to middle C, or take anything and tune it to anything, and the synth always handles it properly.
This really defines a new paradigm for handling MIDI note numbers, since they no longer carry any inherent tuning information at all. In effect, they simply become 128 "voices," tuned to whatever you like. Following this one simple requirement means that MIDI becomes a protocol offering 128-voice polyphony with notes of arbitrary frequency.
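For reference, here's a sketch of how an arbitrary frequency gets expressed as MTS frequency data: a base MIDI note plus a 14-bit fraction of a semitone (roughly 0.0061-cent resolution). The helper name and the assumption of an A4 = 440 Hz reference are mine; the encoding itself is the one defined by the MIDI Tuning Standard.

```python
import math

A4_HZ = 440.0  # assuming the conventional A4 = 440 Hz reference

def hz_to_mts(freq_hz: float) -> tuple[int, int, int]:
    """Encode a frequency as MTS frequency data: the 12-EDO MIDI note nearest
    below the target, plus a 14-bit fraction of a semitone split into two
    7-bit bytes (MSB, LSB)."""
    semitones = 69.0 + 12.0 * math.log2(freq_hz / A4_HZ)
    base = int(math.floor(semitones))
    frac = int(round((semitones - base) * 0x4000))  # units of 2^-14 semitone (~0.0061 cents)
    if frac == 0x4000:          # rounding carried into the next semitone
        base, frac = base + 1, 0
    if not 0 <= base <= 127:
        raise ValueError(f"{freq_hz} Hz is outside the representable range")
    return base, (frac >> 7) & 0x7F, frac & 0x7F

# Middle C encodes the same way no matter which note number carries it:
print(hz_to_mts(261.6256))  # -> (60, 0, 0), give or take one 0.0061-cent step
```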
Retuners can do a tremendous amount of stuff under this scheme. One of those things is a particularly powerful algorithm that I call "freestyle" retuning, which works as follows:
1) In general, new notes are sounded with combination "tuning change + note on" messages.
2) All 128 voices are put in a queue.
3) Every time a new note comes on, the retuner just uses the note that was last "note offed" in the queue (see the sketch after this list).
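Here's a rough sketch of a freestyle retuner under those three rules. I'm assuming a FIFO reading of the queue (reuse the note number whose note-off is oldest, so any release tail has long since decayed), a hypothetical `send` callback that takes raw MIDI bytes, and the `hz_to_mts` encoder from the earlier sketch; the SysEx is the MTS real-time single-note tuning change.

```python
from collections import deque

class FreestyleRetuner:
    """Treat every MIDI note number as an anonymous voice, per rules 1-3."""

    def __init__(self, send, channel=0, device_id=0x7F, tuning_program=0):
        self.send = send                  # hypothetical callback: send(list_of_midi_bytes)
        self.channel = channel
        self.device_id = device_id
        self.tuning_program = tuning_program
        self.free = deque(range(128))     # rule 2: all 128 voices start in the queue
        self.sounding = {}                # caller's note id -> assigned MIDI note number

    def note_on(self, note_id, freq_hz, velocity=100):
        slot = self.free.popleft()        # rule 3: reuse the longest-silent note number
        self.sounding[note_id] = slot
        base, msb, lsb = hz_to_mts(freq_hz)
        # Rule 1: combination "tuning change + note on".
        # MTS real-time single-note tuning change (one key), then the note-on.
        self.send([0xF0, 0x7F, self.device_id, 0x08, 0x02,
                   self.tuning_program, 0x01, slot, base, msb, lsb, 0xF7])
        self.send([0x90 | self.channel, slot, velocity])

    def note_off(self, note_id):
        slot = self.sounding.pop(note_id)
        self.send([0x80 | self.channel, slot, 0])
        self.free.append(slot)            # back of the queue; only retuned again much later
```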
That's it. I find this simple idea very useful, because it means we don't need to require synth manufacturers to implement things like multitimbral channel linking. Instead, we can have them keep one-instrument-per-channel, as most do now, and have retuners do the job. In fact, you can use only realtime tuning messages with it and it still works flawlessly, with no artifacts. You get virtually zero chance of dopplering, because a note's tuning is only changed the next time it comes up in the queue.
I think that stuff like this will lead to the retuner becoming an integral part of the modern tuning workflow, so that the architecture ends up being controller->retuner->synth. We can simply ask synths to implement this one requirement, ask controllers to implement whatever is easiest, and let retuners bridge the gap.
I'll leave it there for now; there's lots more I could go into, but that's long enough.