Building with Open Source AI Music Models

We’ve been hearing a lot about AI music over the past year. But several years before this recent wave of audio generation, big tech companies like Google and OpenAI were testing the waters with open source AI music models. This inspired me to build DrumloopAI, a beat generator that offers an easy-to-use interface and pleasant user experience and is now used by over 65,000 beatmakers around the world. 
 

Nick Berns
Reading time: 5 minutes

We’ve been hearing a lot about AI music over the past year, with instant song generators like Suno and Udio dominating public discourse. These are closed-source commercial applications, which means that third-party developers can’t build on top of them or integrate with their platforms.

But several years before this recent wave of audio generation, big tech companies like Google and OpenAI were testing the waters with open-source AI MIDI generation models.

Google’s open source AI MIDI model was called Magenta (2016), while OpenAI’s was called MuseNet (2019). OpenAI shut down MuseNet the same week that ChatGPT launched, presumably to focus its server power on the LLM.

Magenta, on the other hand, runs easily on a local device and uses very little memory or compute. The Magenta team released a few browser applications for public experimentation and launched a free suite of MIDI VSTs called Magenta Studio.

As a lifelong musician and software developer, I was inspired by my experience with Magenta and MuseNet. It was clear that the dev teams were focused on the AI model more than the user experience and interface. It felt like there was an opportunity to build a web app that hooked into Magenta, but with a more colorful look and feel.

Over the past two years, I’ve brought that dream to life through a web application called DrumloopAI. The sequencer uses the Magenta AI MIDI model to generate syncopated lines and AI audio models to create short loopable samples and breaks. We currently have over 65,000 registered users and are growing month over month.

Google Magenta Studio

For those not familiar with Magenta, it might be helpful to first have a look at Google’s Magenta Studio interface. I’m specifically going to show you their Drumify plugin, to give you a sense of how limited it was. The plugin runs on their open source MIDI generator DrumRNN (RNN is short for recurrent neural network), and that’s the model we built DrumloopAI around.

The VST takes in a percussive MIDI reference file and then generates variations on that. As you can see in the screenshot below, it’s an extremely lightweight interface that depends heavily on a DAW’s piano roll and sequencers to begin actually making music. 

[Screenshot: Magenta Studio’s Drumify plugin]

OpenAI’s AI MIDI tool MuseNet

OpenAI’s MuseNet was a second popular AI MIDI tool that took MIDI inputs and then used a “continue” feature to expand on the idea. You can see the piano roll and four horizontal bars below it. Those four bars are clickable, and each one presents a different “extension” variation.

[Screenshot: MuseNet generating in the style of Chopin]

This was the extent of their music-making interface! So you can imagine how an indie hacker and musician like myself would feel motivated to improve the user experience. We’ll get into that next, with an overview of DrumloopAI.

Creating interfaces for open source AI models

Magenta and MuseNet were both viable models, but their developers spent far less time on the user interface than on the models themselves. So I built DrumloopAI, a multitrack sequencer based on the Magenta DrumRNN model, with a web app interface that people can use without a DAW.

[Screenshot: the DrumloopAI sequencer]

The first version of DrumloopAI was based on open source AI audio models like AudioLM, using text-to-beat workflows to prompt the system. However, the low-fidelity audio and minimal controls drove me to explore MIDI and audio sampling as a second layer of the app.

Today we have combined both AI audio and AI MIDI models into a single product, catering to beatmakers and electronic music artists.

Users have control over the BPM, musical style, drum kit, and AI parameters (represented by the groove, sequence and “wildness” sliders). Each time the user generates a beat, a new AI-generated sequence appears. Think of it as auto-complete: you provide the initial beat, and the AI writes a continuation of that beat and evolves it further. This is great for getting a jam started or exploring new ideas.
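To make the auto-complete idea a bit more concrete, here is a minimal sketch of how a seed beat could be represented before it is handed to a model, and how a “wildness” slider could be mapped to a sampling temperature. This is an illustration under stated assumptions rather than production code: the function names (grid_to_notes, wildness_to_temperature), the 16th-note grid, and the 0.5 to 1.5 temperature range are all hypothetical. The drum pitches follow the standard General MIDI percussion map.

```python
from dataclasses import dataclass

# General MIDI percussion pitches (standard mapping).
GM_DRUMS = {"kick": 36, "snare": 38, "closed_hat": 42}


@dataclass(frozen=True)
class DrumNote:
    pitch: int  # General MIDI percussion pitch
    step: int   # position on a 16th-note grid, starting at 0


def grid_to_notes(grid: dict[str, str]) -> list[DrumNote]:
    """Convert rows such as "x...x..." into notes sorted by step, then pitch."""
    notes = []
    for drum, row in grid.items():
        for step, cell in enumerate(row):
            if cell == "x":
                notes.append(DrumNote(GM_DRUMS[drum], step))
    return sorted(notes, key=lambda n: (n.step, n.pitch))


def wildness_to_temperature(wildness: float, low: float = 0.5, high: float = 1.5) -> float:
    """Map a 0..1 slider linearly onto a temperature range (the range is an assumption)."""
    clamped = min(max(wildness, 0.0), 1.0)
    return low + clamped * (high - low)


# A one-bar seed: kick on beats 1 and 3, snare on 2 and 4, hats on every 8th note.
seed = {
    "kick":       "x.......x.......",
    "snare":      "....x.......x...",
    "closed_hat": "x.x.x.x.x.x.x.x.",
}
seed_notes = grid_to_notes(seed)
temperature = wildness_to_temperature(0.7)
```

The resulting notes and temperature would then be passed to whatever continuation call the chosen model exposes. A higher temperature generally makes sampled output less predictable, which is why it is a natural candidate for a slider like this.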

Models like Magenta and MuseNet do not store records of prior sessions, so we built out a user library for managing those project files. This creates a better overall experience, since great ideas rarely come out in a finished state during the very first session.
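Because the model itself is stateless, everything needed to reopen a session has to be saved by the application. As a rough sketch of the idea, the snippet below stores each project as a JSON file and reads it back. The file layout, the field names, and the helper functions are hypothetical and chosen only for illustration; a hosted service would more likely keep this in a database keyed by user.

```python
import json
import tempfile
from pathlib import Path


def save_project(library_dir: Path, name: str, project: dict) -> Path:
    """Write one project to <library_dir>/<name>.json and return its path."""
    library_dir.mkdir(parents=True, exist_ok=True)
    path = library_dir / f"{name}.json"
    path.write_text(json.dumps(project, indent=2), encoding="utf-8")
    return path


def load_project(library_dir: Path, name: str) -> dict:
    """Read a previously saved project back into a dictionary."""
    path = library_dir / f"{name}.json"
    return json.loads(path.read_text(encoding="utf-8"))


def list_projects(library_dir: Path) -> list[str]:
    """Return the names of all saved projects, sorted alphabetically."""
    return sorted(p.stem for p in library_dir.glob("*.json"))


# Demo: save a session in a temporary library, then restore it.
with tempfile.TemporaryDirectory() as tmp:
    library = Path(tmp)
    session = {
        "bpm": 92,
        "kit": "lofi",  # hypothetical kit name
        "tracks": {"kick": "x.......x.......", "snare": "....x.......x..."},
    }
    save_project(library, "late-night-jam", session)
    restored = load_project(library, "late-night-jam")
    names = list_projects(library)
```

Storing the pattern and settings rather than rendered audio keeps projects small and lets a user pick up an idea and regenerate from it later.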

[Screenshot: the DrumloopAI generate view]

Today, we offer both the sequencer and a text-to-beat interface for users who prefer to work from a prompt. There, you control the output with your text prompt, and you can adjust the desired tempo and loop length.
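Tempo and loop length together determine how long the generated audio needs to be if the loop is going to repeat cleanly. The arithmetic is simple, and a short sketch makes it explicit. The function names are hypothetical, and the example assumes 4/4 time and a 44.1 kHz sample rate.

```python
def loop_duration_seconds(bpm: float, bars: int, beats_per_bar: int = 4) -> float:
    """Length of a loop in seconds: total beats times seconds per beat."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    return bars * beats_per_bar * 60.0 / bpm


def loop_length_samples(bpm: float, bars: int, sample_rate: int = 44100) -> int:
    """Number of audio samples the loop should contain at the given sample rate."""
    return round(loop_duration_seconds(bpm, bars) * sample_rate)


four_bars_at_120 = loop_duration_seconds(120, 4)   # 16 beats at 0.5 s per beat
samples_at_120 = loop_length_samples(120, 4)
```

At 120 BPM, four bars of 4/4 last 8 seconds, or 352,800 samples at 44.1 kHz. Trimming or stretching generated audio to that exact length is what lets it line up with the grid.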

Starting your own music software company

There’s a growing number of open source AI music and audio models available on the internet today. If you’re thinking about starting your own music software company, I can share some simple pointers here for getting started.

First, decide whether you want to run a subscription-based web application or a DAW plugin. These are very different paths, each with its own pros and cons. As a plugin provider, you’ll be using C++ with a software development framework called JUCE, whereas web developers who don’t know C++ may have a better time with a subscription service.

I fall into the second category, as a longtime software developer and agency founder. There’s a lot more maintenance required in a SaaS business model, but the upside is high too. You can acquire customers and iterate on your product quickly.

As a web service, you spend less time on operating system compatibility and more on browser compatibility (especially audio playback in browsers like Safari).

Take the web-based route and you’ll launch faster and reach a bigger audience. Go down the plugin route and you’ll be closer to the producers who live in the DAW. Both are viable; the right choice depends on your product.

My background in SEO and marketing helped a lot with launching the product as well. Building a product is only half the battle – getting people to come use it is a whole other ballgame. That’s a bit outside the scope of this article but I can recommend this short tutorial from my friends at Dotted, an agency that specializes in advertising for music tech startups.

If you have any questions about starting your own music business, feel free to reach out directly via our support portal at DrumloopAI. 

Happy hacking!

Nick Berns

Nick Berns is the founder of Bluelightweb.co.nz, a performance marketing agency with a strong focus on SEO & Ads. He is also the founder of Bernssoftware.com, a SaaS product studio which builds and acquires innovative companies such as DrumloopAI.com, HairstyleAI.com and TattoosAI.com.

