Emotion and disruption: artificial intelligence in music production


To what extent is artificial intelligence changing the way music is produced? Markus Deisenberger talked to five renowned music producers (Wolfgang Schrammel, Thomas Foster, David Piribauer, Zebo Adam and Georg Tomandl) about the upheaval in the markets, the associated losses and opportunities.


Markus Deisenberger




"At the threshold where we are now, or have already crossed, it will be disruptive," says Wolfgang Schrammel. Schrammel, a legendary Salzburg producer who has had a hand in countless outstanding music productions over the last few decades, believes that AI will lead to major efficiency gains, "but that nothing will be left standing, whether for photographers, graphic designers, recording studios, voice artists or technicians."

What does this mean for him and his studio? "Up until today [late summer at the time of our interview, ed.], I've only had three voice recordings. In the comparable period of recent years, there were on average between twenty and twenty-five by this point." So the market is collapsing, which is a real problem for someone who, in good old indie style, tried to keep recording in his own studio affordable for young, up-and-coming bands as long as the studio was earning good money from advertising work at the same time. Most people in the industry were also amazed, according to Schrammel, "at how quickly people left, because they obviously didn't care that the results achieved by AI didn't sound as good as a professional recording. We didn't expect customers to accept so quickly that the results would be lukewarm compared to what professional studios deliver."

And this despite the fact that when you rent a studio like his, you always want something very specific from your voice: "Voice color. Pitch. Melody. Ultimately, it's the emotion that I want to convey." The voice is not simply recorded in his studio. "The voice actors are given stage directions beforehand. The spoken word is then captured by a 5,000-euro microphone via a 3,500-euro pre-amp and a super converter, then edited." Three-dimensionality is achieved through pre-amplification and conversion. "You can hear every saliva sound." Nor does Schrammel use the same microphone for every person. In the end, the customers get exactly what they originally wanted. "I get something similar from the AI, but not the sound and the emotion, which is ultimately what matters."

Yes, it is now possible to "reinforce" the emotion and achieve different things with different interfaces, "but the result is still not the real thing." In the case of an answering machine that only announces a company's opening hours, this may not matter, but in other cases it does. Why? Because people normally want to communicate with people. And: "People also react badly if they feel they have been tricked."

But, Schrammel concedes: "The algorithms are getting better every week." It could be that at some point you will be able to prompt so well that the AI knows exactly where to slow down, and so on. His prediction: "Top speakers, the three or four most popular per country, will be less affected, but the average speakers will feel it strongly, or are already feeling it now."

Post-production and consultation

Logically, this also opens up new opportunities. For example, he receives files from companies "that were created by AI but sound so bad that they want me to polish them up. And then I use plug-ins to turn lukewarm sounds into better ones." A new market is therefore opening up for people who are professionals in the video and audio sector and can judge whether something could be improved. "The market is opening up for executive producers or consultants," says Schrammel. These are new markets, yes, but the question is whether people want to cultivate them. Schrammel doesn't want to switch to a laptop and play around with algorithms in his home office. "I am a thoroughbred musician and producer. I want to deliver quality and not be responsible for a mistake that has crept in through algorithms." Every time he has hired a native speaker in the past, they have found inconsistencies - "even in languages such as English or French. There are things that might be okay, but that a native speaker would say differently."

But quite apart from that, he can't vouch for whether something translated by an AI is accurate in Korean or Romanian. "You also see a lot of mistranslated things on Instagram. If a dog is playing and the text is wrong, nobody gets hurt. But it's very tricky when it comes to product promises." Schrammel doesn't see it as "that problematic" for singers. "They will be a little less affected, because with singers you also want to hear their personality," says the producer.

"The topic is over"

Thomas Foster

Thomas Foster takes a fundamentally different view. Together with his partner Peter Kent, he has composed many a catchy signature tune. Whether for "Zeit im Bild" on ORF or for radio stations in New York and Moscow, his signature tunes are in rotation. He also produces a steady stream of house tracks and much more besides. He is the main protagonist in a current 3Sat documentary about the use of artificial intelligence in music.

How did this come about? He had produced a video entitled "I'm unemployed" for his podcast. He experimented with AI tools and, while the AI-generated song was playing, reacted to it. In the end, he simply said: "Okay, now I'm unemployed." He normally gets between 1,000 and 2,000 clicks. This video, however, quickly passed 100,000, which apparently also caught the attention of someone at 3Sat.

"AI will revolutionize the way we produce music even more than the computer did," he is convinced. It may have been a massive shift from the tape machine, mixing desk and real synths to the computer, "but what is already possible with AI, and what will become possible, is much greater."

He came to the subject of AI because he has always been fascinated and excited by technical innovations. "Even if I already have nine plug-ins that can all do the same thing, I still have to try out the tenth one and get to grips with it."

He still remembers the first program he used to create vocals: he could enter notes and text and get vocals out. "It sounded terrible, a wooden robot voice." But in principle it worked. Nowadays, no one can hear the difference anymore. As part of the 3Sat documentary, he played several of his productions to Monika Ballwein, an Austrian singer who recently sang the lead in the successful Queen musical, and she had to guess which song had been sung by a real person and which had been generated entirely by AI. The error rate was 80 percent; she got only two out of ten right. "But if not even someone who deals intensively with singing on a daily basis can hear the difference, the topic is over," says Foster.

Has AI changed the way he produces? "Yes, fundamentally and in a positive way," he says, because singing has always been a "difficult subject" for him. Even though he enjoys working with other people and values the exchange, he has often found the dependence on others to be a disadvantage. "You write a great song, you're in a workflow, you're having fun working, and then at some point the subject of singing comes up. Until now, that meant stopping work, playing out the composed music with a piano, contacting singers. If it has to be in English, there are very few people who can sing in English without an accent, so in many cases it's better to record in L.A., London and so on. It takes two weeks before the lady or gentleman goes into the studio, which means that the project is on hold for two weeks and costs a lot of money, because the studio and the singer have to be paid. And if you're unlucky, you get a result that you're not happy with and the whole thing starts all over again ..." With AI, Foster now stays in the creative flow. He first tries out the vocals with AI, entering the notes. "This note with less vibrato, this note a little shorter, that note a little longer, and so on. The creativity that previously went to the singer has come back to me." For him, that's great. "The fact that it's terrible for singers is another story."

Foster gives another example: he recently started a collaboration with a successful DJ. "He had the idea of turning a Blondie song into a modern dance version. I said: 'Send it over and I'll do the vocals with AI right away.' He replied: 'Yes, but we're not going to use AI in the final production, are we?' I said: 'We don't have to, no. But let's work, and if we don't like the vocals at the end, we'll have someone else sing them. That's not an issue.'" The next day, he sent his collaboration partner the first examples of how it could sound. The DJ wrote back: "Oh my God. I didn't know things had come this far. That sounds great."

"UDIO" and "Suno" are the tools Foster uses to create vocals, though his main tool is called "Synthesizer V". And on lalals.com, you can convert your own voice into that of Ed Sheeran, Madonna or Michael Jackson. There are two or three other tools as well, but those are the most important.

Do natural, human-produced vocals no longer play a role in his studio? They do: he has just finished a track with vocals by Jan Johnston. Johnston has worked with all the big trance DJs, such as Armin van Buuren, Tiësto and Paul van Dyk. Not only was it great to get to know her and work with her; the collaboration also has added commercial value: "When I release a song with her, it's not just the people who follow me who are interested, but also the people who follow her. Her fans."

"A different tangibility"

David Piribauer


David Piribauer takes a completely different approach to AI. He runs the Mushroom Studio in Pinkafeld, well known to anyone who likes good, handmade pop and rock music. He has worked as a session drummer for Christine McVie (Fleetwood Mac) and Solange Knowles, and has also played in Alice Cooper's band. Piribauer is, in short, a dazzling figure in the Austrian music scene. Does AI play a role in his production methods?

"Not really," he says. "If you look in the recording room, there's a drum kit and a grand piano. It's built like a real music recording studio. I also use a lot of virtual instruments, sometimes this, sometimes that, but we usually record most of it here." With analog equipment, he says, a different sound, a different tangibility can be achieved. "Whether that's better or worse remains to be seen; it's just different." The working method has certain advantages, but also disadvantages. Basically, he has both: "Super Pro Tools and also the analog side." But: "Ultimately, they're just tools. You have to know what the strengths of the different things are and whether you really need them."

When using plug-ins, he distinguishes between analyzing and converting a file; he does that with AI, yes. What he doesn't do is have a voice or an instrument produced by AI. That would be pointless. "I do the whole thing for one reason: I want to have fun with it. I don't just do it to deliver a result. I enjoy creating. Music is art, and art has to do with skills. If someone plays the piano and the air vibrates in the room, or a band plays great music together and it vibrates, then that is unique." Perhaps an AI can also do something unique, "but then it's not these people who put their heart and soul into it."

For him, the difference lies very clearly in the production method: "If a production method has mainly been 'in the box' in the past - which is generally something I hardly ever do - then you are closer to AI. If I were doing advertising, it would certainly be different, then I would probably use AI. But when you work with artists like I do, it's about the personal, not about creating a background track, but about the truly individual production of an artist who has written it, represents it and then performs it live."

Goal-oriented or autonomous

Zebo Adam

Zebo Adam has a similar view. With his productions for the successful Austrian band Bilderbuch (such as the albums "Schick Schock" and "Magic Life"), the Viennese has attracted attention far beyond Austria's borders. This was followed by work for the Beatsteaks (the singles "Ticket" and "Mad River") and album productions for the Steaming Satellites and, most recently, Wanda. In principle, he considers the topic of AI extremely exciting, because at its core it is about the performance and computing power of programs. "And we are currently experiencing something that is perhaps only comparable to the shift from the time when music could not be documented except in the mind or on paper, when there were no recordings, to the time when it became possible to record music." The changes that are currently happening are "more vehement and more serious", he is sure of that.

But "without knowing where it will go", his personal approach is to keep more and more distance from it.

Why? "Because first and foremost, making music means a great deal of personal responsibility for me." And AI in music production does one thing above all: it takes that responsibility away.

Like Piribauer, he distinguishes between two fundamentally different approaches to music production. One is very goal-oriented, and here AI makes a lot of sense. "If the goal is to make music that meets certain parameters, AI is great. It used to take weeks to create a backing track. Today, it can be done in eight hours, in at least as good a quality. But does that automatically mean you make better music? That you make more interesting music? I don't think so, because I think the interesting thing about music is the personal decision." And that decision is often the result of a mistake. "When I think about recording music, I want to find solutions, and all the tools I use are just aids; I still have to climb the mountain myself," says Adam.

Georg Tomandl

Georg Tomandl, Managing Director of Sunshine Mastering, on the other hand, constantly uses AI tools for sound editing, such as "dxRevive" - "that's the best plug-in for getting rid of noise." This is particularly useful for film recordings, but sometimes also for poor music recordings. He also works with "SpectraLayers", a program that can be used to break down a song into its different components, for example if you want to separate vocals and music. "If a voice is too loud or too quiet, you can simply make it louder or quieter in the mastering process."

But the moment you start replacing creativity with AI, you cut off the branch you're sitting on, says Tomandl. "Making music fills us with something." And that, says the Viennese producer, is something that defines us as human beings. "If you tell the machines to do it and we just stand there and watch, you have to ask yourself: why?" Of course, you can't forbid anyone to let someone else do it, but "I think it's a shame if we take away creativity and the opportunity for artistic expression," says Tomandl. "We want to do creative things." On the other hand, there is AI with its concentrated computing power, which draws on a great deal of expertise and is therefore unstoppable. Tomandl is currently working on the dubbed version of an Austrian film, for example. He believes that development in this area in particular is unstoppable. "Even if you are in solidarity with speakers like I am, I fear that in the future it will only be done by AI." There are already too many voices on offer.

Independently of each other, Tomandl and Adam both talk enthusiastically about a music project that Thomas Rabitsch has launched to mark Hansi Lang's 70th birthday next year. "He is currently working on material that Hansi recorded back then," says Adam. AI was used to extract the voice of the late singer and build new music around it. "It's absolutely amazing what it makes possible and how you can make music with it." It's great, says Adam, because here AI is being used in a way that expands our ability to make music.

Whether we like it or not, whether we focus on analog quality and personal responsibility in music production or prefer fast, uncomplicated results, development is progressing. Algorithms are evolving and computing power is increasing. AI is being used even by its biggest critics, if only to improve something not produced by AI. The markets, especially the spoken-word and dubbing market, are already in a state of flux. But the streaming market has also changed dramatically in recent months: over 100,000 songs are uploaded to Spotify every day. It is not known how many of these were created by an AI, but it is assumed to be well over 50 percent, which would mean between 60,000 and 70,000 AI-created songs uploaded every day.

Nevertheless, Thomas Foster believes that the big hits of this world will not be made with AI in the future either. It is nerds who upload so much now and are happy when they get 1,000 clicks. "But the new song by Beyoncé or Lady Gaga? No. I can imagine someone working with AI to make them a better song, yes." The idea that AI could support the producer in composing because it knows music "that I haven't even heard yet" is extremely appealing to him. "Suddenly Mozart, Prince and Michael Jackson are sitting next to you and giving you tips on how to make a song even better," he says enthusiastically. "A machine that brings all the music knowledge in the world with it to support you."


This article first appeared on mica - music information center austria on November 6, 2024. Since the topic of music and AI is of particular interest to Sounding Future, we have republished the article here as part of a cooperation between mica and Sounding Future.




Markus Deisenberger

Markus Deisenberger, born in Salzburg in 1971, is a lawyer and freelance journalist who lives and works in Salzburg and Vienna. He is editor-in-chief of a Salzburg city magazine and publishes regularly in German and Austrian magazines. He also writes novels, most recently "Winter in Vienna".