The ACIDS team at IRCAM (Institut de Recherche et Coordination Acoustique/Musique) – the French institute dedicated to research in music and sound – has introduced Neurorack, an AI-based real-time synthesizer that’s available as an open-source DIY project.
Neurorack is based on IRCAM research into Diet Deep Generative Audio Models With Structured Lottery (PDF).
The research explores the idea that deep models are highly over-parameterized, and the hypothesis that extremely efficient small sub-networks exist within them that can provide higher accuracy than the larger models if trained in isolation. The tested approach removes up to 95% of the model weights without significant degradation in accuracy, which makes it possible to run deep generative audio models on embedded platforms.
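The paper’s structured-lottery procedure is considerably more involved, but the core masking idea can be illustrated with plain magnitude pruning. A minimal NumPy sketch – not IRCAM’s actual algorithm, just the general technique:

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.95):
    """Zero out the smallest-magnitude weights, keeping the top (1 - sparsity).

    Toy masking step only: the lottery-ticket method additionally rewinds
    the surviving weights to their initial values and retrains them.
    """
    threshold = np.quantile(np.abs(weights), sparsity)  # e.g. 95th percentile
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

# Prune a random 256x256 layer down to ~5% of its weights.
w = np.random.randn(256, 256)
pruned, mask = magnitude_prune(w, sparsity=0.95)
print(f"weights remaining: {mask.mean():.1%}")  # ~5.0%
```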
The developers note:
“Deep learning models have provided extremely successful methods in most application fields, by enabling unprecedented accuracy in various tasks, including audio generation. However, the consistently overlooked downside of deep models is their massive complexity and tremendous computation cost.
In the context of music creation and composition, model reduction becomes eminently important to provide these systems to users in real-time settings and on dedicated lightweight embedded hardware, which is particularly pervasive in the audio generation domain. Hence, in order to design a stand-alone and real-time instrument, we first need to craft an extremely lightweight model in terms of computation and memory footprint.”
The Neurorack is a hardware demonstration of this approach, based on the NVIDIA Jetson Nano, a compact computer designed to run multiple neural networks in parallel. The prototype, shown in the demo embedded above, is designed to be compatible with Eurorack modular synthesizers.
The developers note four goals for Neurorack design:
- Musical: The generative model chosen is particularly interesting, as it produces sounds that are impossible to synthesize without using samples.
- Controllable: The interface was deliberately chosen to be easy to manipulate.
- Real-time: The hardware behaves like a traditional instrument and is just as responsive.
- Stand-alone: It is playable without a computer.
The initial model is designed for the generation of impact sounds. It can create a wide range of impact sounds with 7 adjustable ‘descriptors’ (see the conditioning sketch after the list):
- Loudness
- Percussivity
- Noisiness
- Tone-like
- Richness
- Brightness
- Pitch
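Conceptually, these descriptors amount to a small conditioning vector fed to the generative model at inference time. Here’s a hypothetical sketch of that interface – the names, the 0..1 ranges and the `generate()` call are assumptions for illustration, not the project’s actual API:

```python
import numpy as np

# Hypothetical descriptor vector matching the seven controls above.
DESCRIPTORS = ["loudness", "percussivity", "noisiness", "tone_like",
               "richness", "brightness", "pitch"]

def make_condition(**settings):
    """Pack descriptor settings (defaulting to 0.5) into a conditioning vector."""
    return np.array([float(settings.get(name, 0.5)) for name in DESCRIPTORS],
                    dtype=np.float32)

cond = make_condition(percussivity=0.9, brightness=0.7, pitch=0.3)
# audio = model.generate(cond)  # assumed call: the model consumes the 7-value vector
print(cond)
```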
It’s not obvious from the project’s admittedly cryptic demonstration video, but all of the impact sounds are generated by the Neurorack module itself.
The Neurorack design is open source and available now via GitHub.
Couple thoughts:
* the Jetson Nano dev kit is a CHONKY BOI, about 100mm x 80mm x 30mm. That means the module is at least 80mm deep
* it also likes power, A LOT. The marketing claims “only 5W” but that’s a whole ampere at 5V. Hope the other modules in the rack like it warm, too.
* the ADC board they’ve chosen (https://shop.pimoroni.com/products/ads1015-adc-breakout – linked on the “Hardware Choices” page) seems wholly unsuited to the Eurorack-level inputs it’s being used for; the underlying chip only accepts voltages between ground and Vcc (7V absolute max). The breakout board adds some TL062s, but they’re _also_ powered from 5V and ground. Putting a ±12V Eurorack signal into that thing will likely let out the magic smoke – sketch of a safer front-end below
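If you did want to feed it Eurorack levels, you’d attenuate and bias the input first. A quick back-of-envelope check – resistor values are illustrative, not a vetted design:

```python
# R1 feeds the ADC node from the input jack; R2 ties the node to a 2.5V
# bias. By superposition the node sits at v_bias + (v_in - v_bias) * k,
# with k = r2 / (r1 + r2), so a +/-12V swing is squeezed into 0..5V.
def adc_node_voltage(v_in, r1=560e3, r2=100e3, v_bias=2.5):
    k = r2 / (r1 + r2)  # ~0.15 attenuation
    return v_bias + (v_in - v_bias) * k

for v in (-12.0, 0.0, 12.0):
    print(f"{v:+5.0f} V in -> {adc_node_voltage(v):.2f} V at the ADC")
# -12 V -> 0.30 V, 0 V -> 2.12 V, +12 V -> 3.94 V: safely inside 0..5 V.
```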
Hey Matt, how would you build it?
What’s with the spongy module thing @ 0:19?
probably another hoax …?