Framework for physical Audio User Interfaces

Post by **^rooker** » Fri Oct 26, 2007 8:10 pm

Inspired by my work for serial-pyio (for monome devices) and problems I've personally encountered regarding user interfaces for musical applications, I was hoping to maybe contribute something useful back to the community.

This thread will contain ideas and document my progress regarding my diploma thesis at the technical university.

At first I was planning to focus on the problem of lacking "flow" when it comes to making electronic music.

Given the large number of articles and scientific papers on that topic, I thought: I probably won't be *the* genius who solves it in such a short time.

Since I didn't want to give up on that subject, I thought about other ways of improving the current situation for musicians dealing with electronic music. My idea is to increase the availability of alternative devices for the public - but how?

I thought about what keeps me from building my own physical interface, or using an existing one differently (e.g. joystick, wii-mote, ...). The work, time and know-how necessary to build the bridge between the device and an audio application of my choice is simply a lot.

Furthermore, I'm usually not even sure about the results of that work:
- will the new interface really solve some of my problems?
- will my code be reliable/stable enough for live performances?
- what about maintaining it?
- can I share my work?
- which audio applications will/should/can it work with?
- damn! so many docs to read for that one device...

Post by **^rooker** » Fri Oct 26, 2007 8:39 pm

What about alternative interfaces that you can simply go out and buy?

1) Most of them are so expensive that I guess a lot of people think twice before buying one (e.g. Lemur).

2) Very often they're designed to work with certain applications, and only with those. Furthermore, the audio apps targeted by commercial suppliers of audio interfaces are usually "mainstream only" - meaning Windows, and sometimes Mac - but hardly ever really platform independent, and without any support for Linux.

3) Open interfaces (e.g. monome) offer amazing capabilities, but a lot of the code provided by the community is designed too specifically for a certain software environment or application.

Examples:
- Programs for handling audio samples using a monome 40h have been written and are being used by a wide audience, but only in Max/MSP. Doing the same thing in a different application (e.g. PureData, SooperLooper, FreeWheeling, ...) therefore requires new programming work, although remapping of commands might do the job.

- There are videos of people using the Wii-mote for musical performances (e.g. wii-tar), but it is often a cumbersome task to set up one's own system to even try it out (see "lab environment" problem, below).

What about alternative interfaces as a result of research?

Some interfaces that have been built in the course of UI research are also unavailable to the public. Why?

a) Commercial research:
Research being done by companies will probably never be made available to anyone, unless some executives think that it will yield some profit. That's why major companies are more likely to stick to "well known" interfaces / devices. And even if they do come up with something, they make it closed (like the tenori-on) with the highest degree of interoperability being MIDI.

b) Scientific research:
UIs that emerged in a more "open" context (e.g. universities) were often built in what I call "lab environments", making it cumbersome or almost impossible to get them running in another environment. I've experienced that with some of my own works already, which I couldn't get up and running again after e.g. half a year, because the development setup (my "lab environment") was quite fragile and changing steadily... It simply was a prototype that was used only once.

Unfortunately, the setup of a proper lab environment (especially regarding software prerequisites) for a project is often lacking good documentation - or is too complicated for someone else to rebuild within reasonable time (e.g. Studierstube & PUC).

Post by **^rooker** » Fri Oct 26, 2007 9:23 pm

Here's a short overview of my idea to improve the current situation:

Make a framework for audio user interface developers, giving them a base to work on. As a programmer, I know that the hardest part is the start of a new project - if you don't have to start from scratch, but can improve existing stuff, it's less demotivating. And with a common ground, developers can exchange code and experiences more easily.

If that framework is done as I imagine, it should provide a number of reusable blocks (like lego blocks) making it possible even for non-programmers to link a hardware device to their favorite audio application (by using e.g. OSC or MIDI).

My framework idea consists of several parts:
(Entries marked with a * were contributed by Brian Crabtree (monome.org) - Thanks!)

1) Support different kinds of hardware interfaces by logically splitting each device into its elements (e.g. buttons, LEDs, faders...) and mapping each one of them to programming structures.

Examples:
- 2-axis, 4-button Joystick: 4 Buttons, 2 Faders.
- Wii-mote: 4 buttons, 3 Faders, 4 LEDs
- Monome 40h: 64 buttons, 64 LEDs

There are more elements than just buttons, leds and faders, but for now, I'll just focus on these most basic ones.
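To make the element idea more concrete, here is a minimal sketch of how a device could be described as a list of elements. All class and function names (`Element`, `Button`, `Fader`, `Led`, `Device`, `make_joystick`, `make_monome_40h`) are made up for illustration, and the 0.0 to 1.0 fader range is an assumption, not a decision.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """One physical control or indicator on a device (hypothetical base type)."""
    name: str


@dataclass
class Button(Element):
    pressed: bool = False


@dataclass
class Fader(Element):
    value: float = 0.0  # assumed normalised position, 0.0 .. 1.0


@dataclass
class Led(Element):
    lit: bool = False


@dataclass
class Device:
    """A device is nothing more than a named collection of elements."""
    name: str
    elements: List[Element] = field(default_factory=list)

    def count(self, kind: type) -> int:
        """Number of elements of the given kind, e.g. count(Button)."""
        return sum(isinstance(e, kind) for e in self.elements)


def make_joystick() -> Device:
    """2-axis, 4-button joystick: 4 buttons, 2 faders."""
    elements: List[Element] = [Button(f"button{i}") for i in range(4)]
    elements += [Fader("x"), Fader("y")]
    return Device("joystick", elements)


def make_monome_40h() -> Device:
    """Monome 40h: 64 buttons and 64 LEDs, laid out as an 8x8 grid."""
    elements: List[Element] = []
    for row in range(8):
        for col in range(8):
            elements.append(Button(f"button_{row}_{col}"))
            elements.append(Led(f"led_{row}_{col}"))
    return Device("monome40h", elements)
```

The point of the flat element list is that later layers only need to know "this is a button" or "this is a fader", not which device it came from.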

2) Value preprocessing:
Some devices, especially DIY ones (e.g. Arduino-based), require preprocessing of input/output values.

- Transformation (*)
e.g. some ADC returns values from 0-1023, but the target application expects a range from 0-100.

- Ramping (*)
e.g. Instead of immediately applying a new input/output value, one can define a ramp-time from the previous value to the new one.
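Both preprocessing steps are small enough to sketch. The function names `rescale` and `ramp` are hypothetical, and the ramp is expressed as a number of steps rather than a time, which assumes the caller sends one step per timer tick.

```python
def rescale(value, in_min, in_max, out_min, out_max):
    """Linearly map value from [in_min, in_max] to [out_min, out_max]."""
    if in_max == in_min:
        raise ValueError("input range must not be empty")
    ratio = (value - in_min) / (in_max - in_min)
    return out_min + ratio * (out_max - out_min)


def ramp(start, target, steps):
    """Return `steps` evenly spaced values leading from start to target.

    The start value itself is not included; the last value equals target.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [start + (target - start) * i / steps for i in range(1, steps + 1)]
```

For an ADC reading this would be called as `rescale(reading, 0, adc_max, 0, 100)`, where `adc_max` is whatever the converter's highest value is.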

3) These elements should be mappable to UI-widgets by a so-called "semantic layer". This grouping is supposed to be possible across devices.

Examples:
a) elementary:
- several LEDs as a meter row.
- several pushbuttons as radio group.
- several LEDs in a 2D array as output display

b) state keeping:
- button A on: joystick X-axis = volume,
button A off: joystick X-axis = panning
- 2 pushbuttons for increasing/decreasing a value (*)
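The two state-keeping examples could look roughly like this. It is only a sketch: `ModalFader` and `IncDecValue` are invented names, and the 0 to 127 default range is an assumption borrowed from MIDI.

```python
class ModalFader:
    """Routes one fader to different parameters depending on a button state."""

    def __init__(self, on_target, off_target):
        self.on_target = on_target
        self.off_target = off_target
        self.button_on = False

    def set_button(self, on):
        """Remember the button state; this is the 'state keeping' part."""
        self.button_on = bool(on)

    def move(self, value):
        """Return (parameter name, value) for the current button state."""
        target = self.on_target if self.button_on else self.off_target
        return (target, value)


class IncDecValue:
    """Two pushbuttons stepping a value up and down within limits."""

    def __init__(self, value=0, step=1, minimum=0, maximum=127):
        self.value = value
        self.step = step
        self.minimum = minimum
        self.maximum = maximum

    def increase(self):
        self.value = min(self.maximum, self.value + self.step)
        return self.value

    def decrease(self):
        self.value = max(self.minimum, self.value - self.step)
        return self.value
```

With `ModalFader("volume", "panning")`, the joystick X-axis would report `("volume", x)` while button A is on and `("panning", x)` while it is off.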

4) Mapping UI-widgets to OSC/MIDI messages.
Finally, each one of the previously defined virtual devices (=UI widgets) must be made available to audio applications. I think doing the communication using OSC is preferable, but MIDI might be necessary to provide compatibility with other apps, too.
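To get a feeling for how little is needed on the output side, here is a sketch that builds the raw bytes of an OSC message with one float argument and of a MIDI control change. It uses only the standard library; the function names and the `/volume` address are placeholders, and actually sending the bytes (UDP socket, MIDI port) is left out.

```python
import struct


def _osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    padding = (-len(data)) % 4
    return data + b"\x00" * padding


def osc_float_message(address, value):
    """Build an OSC message carrying a single big-endian float32 argument."""
    return _osc_string(address) + _osc_string(",f") + struct.pack(">f", value)


def midi_control_change(channel, controller, value):
    """Build a 3-byte MIDI control change (channel 0-15, controller and value 0-127)."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if not (0 <= controller <= 127 and 0 <= value <= 127):
        raise ValueError("controller and value must be 0..127")
    return bytes([0xB0 | channel, controller, value])
```

A widget value could then be sent as `osc_float_message("/volume", 0.5)` to an OSC-capable app, or as `midi_control_change(0, 7, 64)` to an app that only understands MIDI.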

Post by **^rooker** » Fri Oct 26, 2007 9:29 pm

The "semantic layer" is my personal favorite.
I'd like to make the semantics of devices configurable using XML files. These will include the connection between a virtual UI-widget and the physical element (button, LED, ...).
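Just to have something to discuss, a config file could look like the string below. The whole schema (tag names `widgets`, `widget`, `element` and the attributes `name`, `type`, `device`, `kind`, `id`) is invented for this sketch and nothing is decided yet. The parsing uses Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical config format: each widget lists the physical elements it uses.
# Note that the "mode" widget groups elements from two different devices.
EXAMPLE_CONFIG = """
<widgets>
  <widget name="vu_left" type="meter">
    <element device="monome40h" kind="led" id="0"/>
    <element device="monome40h" kind="led" id="1"/>
    <element device="monome40h" kind="led" id="2"/>
  </widget>
  <widget name="mode" type="radio">
    <element device="joystick" kind="button" id="0"/>
    <element device="monome40h" kind="button" id="63"/>
  </widget>
</widgets>
"""


def load_widgets(xml_text):
    """Parse the config and return {widget name: {"type": ..., "elements": [...]}}."""
    root = ET.fromstring(xml_text.strip())
    widgets = {}
    for widget in root.findall("widget"):
        elements = [
            (e.get("device"), e.get("kind"), e.get("id"))
            for e in widget.findall("element")
        ]
        widgets[widget.get("name")] = {
            "type": widget.get("type"),
            "elements": elements,
        }
    return widgets
```

Calling `load_widgets(EXAMPLE_CONFIG)` gives a plain dictionary that the widget code could be built from.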

It could even be imaginable to write an editor for such XMLs (similar to "mapd"). Although I'm not really a fan of frontends for config files, it could offer a possibility for absolute-non-coders to still use this framework.

Edit / Sidenote:
I've found it quite confusing when writing/talking/thinking about the semantic layer, because of its name - so from now on I'll refer to it as "widget layer".

Post by **^rooker** » Fri Nov 02, 2007 2:44 pm

Use cases
In order to move on towards implementation design, I thought it might make sense to define some use cases that this framework should be able to handle while trying to keep things as simple as possible.

My examples will be quite specific to get some concrete ideas on how they could possibly be managed and to avoid too many "could's" and "should's".

Furthermore, I do not want to assume that the target audio application is scriptable, like pd, Max or ChucK - so I'd like to have preprocessing capabilities within the framework.
However, that's still open for discussion, since if done badly it could add unnecessary complexity to the framework, causing undesired additional work on the programmer/musician side.

Here's some brainstorming:

1) 2 btn-Joystick to control parameters: speed (BPM), volume, effect (wet)
2) A series of LEDs as VU meter
3) A series of LEDs as position indicator
4) x buttons as radio group
5) A monome 40h displaying a 2D pattern
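Use cases 2) and 3) boil down to turning one value into a list of LED on/off states. A rough sketch, assuming the incoming level or position is already normalised to 0.0 .. 1.0 (the function names are made up):

```python
def _clamp01(value):
    """Limit value to the range 0.0 .. 1.0."""
    return min(1.0, max(0.0, value))


def meter_leds(level, led_count):
    """VU meter: light the first N LEDs of the row, N proportional to level."""
    lit = int(_clamp01(level) * led_count)
    return [i < lit for i in range(led_count)]


def position_leds(position, led_count):
    """Position indicator: light exactly one LED of the row."""
    index = min(led_count - 1, int(_clamp01(position) * led_count))
    return [i == index for i in range(led_count)]
```

Both functions return one boolean per LED, so the same widget works for a row of 4 LEDs on a Wii-mote or a row of 8 on a monome.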

Even though a single user will end up with a 1(device):1(app) mapping, my idea is to let others profit from the work and effort that one user has already put into getting a certain device to work with some app - and make it reusable with other apps, too (e.g. loop handling in MaxMSP, enabling someone else to reuse sample position display in SooperLooper, etc...)

Post by **^rooker** » Fri Nov 02, 2007 3:12 pm

How might one meaningfully use a joystick to control some parameter values (e.g. effect parameters, speed, volume, etc...)?

Let's assume we have a default 2-axis, 4 button joystick with a coolie-hat.
(The coolie-hat is somewhat tricky, because when using an analogue gameport, it's read as a 3rd axis where each of the 4 positions of the hat is mapped to a certain value on that axis)
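Decoding the hat from that 3rd axis could be done by looking for the nearest known value. Careful: the numbers in `HAT_VALUES` below are pure placeholders, the real values depend on the joystick and driver and would have to be measured. The names `HAT_VALUES` and `decode_hat` are made up as well.

```python
# Placeholder axis readings per hat position (NOT measured values).
HAT_VALUES = {
    "up": -32767,
    "right": -16384,
    "down": 0,
    "left": 16384,
    "centre": 32767,
}


def decode_hat(axis_value, tolerance=4000):
    """Return the hat position whose reference value is nearest to axis_value.

    Returns None if even the nearest reference is further away than
    `tolerance`, e.g. while the axis is still settling between positions.
    """
    name, reference = min(
        HAT_VALUES.items(), key=lambda item: abs(item[1] - axis_value)
    )
    if abs(reference - axis_value) > tolerance:
        return None
    return name
```

Once decoded like this, the hat can be treated as four more buttons in the element list.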

=== Things to consider: ===
a) Changing the value of only 1 axis at a time.
This causes some trouble, because it's quite hard to move a joystick in a single dimension only, without slightly affecting the other dimension, too.
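One possible way around this (just an idea, not a decision) is to only pass on the axis that has moved further and to ignore small movements altogether. The name `dominant_axis` and the dead zone of 3000 are assumptions for a signed 16-bit axis.

```python
def dominant_axis(x, y, dead_zone=3000):
    """Keep only the axis that is deflected furthest and zero the other one.

    Movements where both axes stay inside the dead zone are treated as
    'stick at rest' and reported as (0, 0).
    """
    if max(abs(x), abs(y)) < dead_zone:
        return (0, 0)
    if abs(x) >= abs(y):
        return (x, 0)
    return (0, y)
```

So a mostly horizontal push like `(20000, 1500)` comes out as `(20000, 0)`, and the accidental Y movement is dropped.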

b) Absolute vs. relative values:
absolute:
In some cases, one might want the value from an axis to be absolute - For example, if we assume a 16bit signed integer value, that would be:
-32767 / 0 / +32767
snapping back to 0 whenever one releases the joystick. This is nice for e.g. pitch-bending.

PROs / CONs
++) This method is nice when continuously modifying a value without interruption.

--) A drawback of this approach is that one cannot "hold" a certain value - and even if you had a "lock" feature (e.g. press/release button), it's hard to return to the value you've locked in case you'd like to gradually change the value relative to its locked position.

relative:
So, what about a relative approach? This would require the push/release of a button so that an application knows when to ignore the movement of the joystick (because it'll snap back to its center position).

However, this approach is easier for values that one will change at certain points in time, having a pause in between - or when you don't have your hand on the joystick all the time.
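The relative approach could be sketched as a fader with a "clutch" button: joystick movement only changes the value while the button is held, so the snap-back after release is ignored. `RelativeFader`, its method names and the 0.0 .. 1.0 value range are assumptions for illustration.

```python
class RelativeFader:
    """Accumulates joystick movement into a value while a button is held."""

    def __init__(self, value=0.0, minimum=0.0, maximum=1.0, sensitivity=1.0):
        self.value = value
        self.minimum = minimum
        self.maximum = maximum
        self.sensitivity = sensitivity
        self.engaged = False
        self._last = 0.0

    def press(self, axis_position):
        """Button pressed: start tracking from the current stick position."""
        self.engaged = True
        self._last = axis_position

    def release(self):
        """Button released: ignore movement (e.g. the snap back to centre)."""
        self.engaged = False

    def move(self, axis_position):
        """Feed a new stick position and return the (possibly updated) value."""
        if self.engaged:
            delta = (axis_position - self._last) * self.sensitivity
            self.value = min(self.maximum, max(self.minimum, self.value + delta))
            self._last = axis_position
        return self.value
```

The held value survives the release, which is exactly what the absolute approach can't do.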

=== Desired output: ===
Widgets:
(Note: I'm referring to an axis as a "fader" (see the elements theory in the previous posts above), and I'm writing it down in some pseudo code so that I can see which things will be necessary in the widget layer)

I'd like to have the fader-meaning depending on which button is pressed: