Adobe Teases AI-Powered ‘Photoshop For Audio’

Adobe shared this sneak preview of a new project that promises to be a sort of AI-powered Photoshop for audio.

Project Music GenAI Control, an experimental project from Adobe Research, lets users generate music from text prompts, and then provides fine-grained control to edit that audio for their precise needs.

The new tools begin with a text prompt fed into a generative AI model: a user enters a prompt like “powerful rock,” “happy dance,” or “sad jazz” to generate music. Once the music is generated, fine-grained editing is integrated directly into the workflow.

Users can transform their generated audio based on a reference melody; adjust the tempo, structure, and repeating patterns of a piece of music; choose when to increase and decrease the audio’s intensity; extend the length of a clip; re-mix a section; or generate a seamlessly repeatable loop.

“One of the exciting things about these new tools is that they aren’t just about generating audio—they’re taking it to the level of Photoshop by giving creatives the same kind of deep control to shape, tweak, and edit their audio. It’s a kind of pixel-level control for music,” explains Nicholas Bryan, Senior Research Scientist at Adobe Research.

Project Music GenAI Control is being developed in collaboration with colleagues at the University of California, San Diego, and at the School of Computer Science, Carnegie Mellon University. Details on availability are to be announced, but Adobe describes it as ‘an early-stage generative AI music generation and editing tool’.

34 thoughts on “Adobe Teases AI-Powered ‘Photoshop For Audio’”

  1. So after buying Cool Edit, ruining Cool Edit, and dropping Cool Edit, Adobe brings us Cool Edit with an… umm, rather generic garage band clip package?

    Couldn’t you just bring back Cool Edit?
    (oh, right. Your product is 80% matched by the free ocenaudio? Aww, that’s harsh)

  2. I have a couple thoughts on this new technology….
    A) How cool that this opens up musical “creation” to anybody who wants to create…. and
    B) About time!! I’m tired of the industry and insiders telling me that I have to be of a certain level of proficiency in order to play in the game. That if I don’t “create” the way they do, that it’s “wrong”, that I’m not a “real musician”, that what I do isn’t music. Who are you to tell me that what I do isn’t real? Yeah, I’m all for this technological advancement.

    If the kid with the iphone wins the Oscar, then good for him!! Maybe you old school guys better try harder!! Lol lol

    1. Well, it sounds a bit like music, but it’s far from good; maybe it’s good enough as a hip-hop loop, where only short fragments are used anyway.

      1. This attitude is what will allow you to look back over your shoulder and watch everyone else stuck in the tar pits. The future has nothing to do with being creative, it has everything to do with “good enough” to get paid. I’m sure sports fans will complain when teams get wifi chips installed with ear pods… so they can be coached remotely with new tech. Everyone will cry and say… look you are ruining sports… as if $ hasn’t already. Might as well embrace it now or face the consequences of having to fake it till you make it. And 10-year-olds will be prompting “good enough” music… and probably will win an Oscar and be ahead of the game

    2. I’m sure AI *will* be good for creating ideas at some point, but that’s not possible with current tech.
      All these systems do very detailed and big scale pattern matching on data similar to training data. At every note (word) you get the most likely (usually good) result from a similar text.
      This is perfect to create school essays or wikipedia-like stuff, but not for new material.
      By definition, a new idea has LOW correlation with what was just said, but if you tell an AI to go off at a tangent, it just goes random.
      So, maybe next time, but not in this round.

      (of course, if I train it exclusively on Chick Corea’s Return to Forever and Kraftwerk, maybe we finally get good jazzy disco beats…? 😉 )

  3. What’s the point of watching that Fourier analysis window? For sure the results don’t come from there. Theirs is a music AI, not a sound one.

    1. Actually I think the way Riffusion works is by generating a spectrogram regarded as an image. So it might be how this one works too.

    1. Not exactly. Facebook/Meta (which does a lot of AI research) put out an open source library last year which makes it easy to separate a piece of music into stems. Akai is almost certainly using a variation of that because as far as I recall it had a very permissive license as Facebook doesn’t have any special interest in audio.

  4. Absolutely not sold on this. Every software company is trying to jump on the AI bandwagon to stake out market share, but generative AI for music, while powerful, does not have any associated concepts of how to edit etc. For images the technology is fairly far along to isolate elements and rotate/regenerate/adjust them, but those techniques were already highly developed for image editing software before the latest AI boom. The same is definitely NOT true for audio. Adobe is looking at musicians as a market to be gained, nothing more.

    Consider the history of their audio editing software, Audition. They needed something like this to complement their video editing product, so they bought CoolEdit, which used to be made by Syntrillium Software. CoolEdit was a fully featured audio/MIDI editor, a complete DAW but with a heavy focus on single-waveform editing, much like Sound Forge (but imho way better). After they acquired it, Adobe rebuilt the user interface a bit to make it compatible with their other tools, but added very little in the way of functionality, and eventually started *removing* functionality like MIDI even though users begged them not to. They spun off separate products just for dialog editing in film/video and then abandoned them, never coming through with promised features or further development.

    I was a beta tester for them and they were very unresponsive to change requests, even when a clear industry use case was provided. In their view users should have no input into design, only bug-catching. And once they’ve got some users hooked on a product, they’ll switch to a subscription model (or use that from the outset) and if you don’t keep paying, your software stops working.

    Nobody at Adobe cares about pro audio other than as a market to have $ squeezed out of it. This product was cooked up by managers putting together a feature list and bolting an AI model onto their existing software inventory. Sorry to be so negative but I’ve followed the history of this particular software for 25 years (since it was CoolEdit) and had a long relationship with Adobe. It’s gonna look great on the surface but only be an inch deep.

  5. I have an app like this for food and now I am a pretty decent food creator. No need to spend all that time learning to feed myself or my fam. I choose a prompt and it creates the food. It’s a bit more analogue though as a human delivers it on a bike. But I still created it. I had friends over last week I showed them how good I am at creating quite complicated sushi. They were all quite impressed when I said I’d made it.

    Isn’t it wonderful that we can both evolve and de-evolve concurrently? It’s like The Matrix, all these new instantaneous skills you can learn that enhance your soul. This app is gonna create some great stuff, whilst saving those soon-to-be-talented folk who use it the tedious, joyless, certainly-not-the-point-of-it process of making music, and give them some banging daytime-TV-type show theme tunes that sound like they’ve been round-robined through Google Translate a few times.

  6. Can you input “the new style that will come in 2025 that will follow EDM?” No…you can only input existing styles. It’s going to take the creativity of HUMANS to make that.

  7. A lot of electronic music isn’t compositionally rich or complicated, the emphasis is on sound design, so I wouldn’t be surprised if this can match the average EDM musician that relies on sound banks / sample libraries.

  8. I’ll just remind everyone that pretty much these exact same fears and arguments happened when drum machines and synth bass were coming on the scene in the late 70s/early 80s. “There will be no more creativity in music!!!,” they proclaimed. “It will all sound like a machine made it!,” they cried.

    Calm down, folks, they’re just tools.

  9. the tech is coming no matter how we feel about it.

    as said previously, this is just a tool, and people who like creating music will just use it as a tool in their arsenal.

  10. We’ve been seeing generative tools evolving both in the MIDI and audio realms, but more so with MIDI. Scaler 2 is a good example. You can pick a scale type (i.e., mood) and work out a chord progression, then have the thing spit out some pretty complex material from those choices.

    For many generative processors, the flow goes:
    1. Press MAKE button.
    2. If Yuck, press MAKE again, if Yum move on to next step.

    The problem with many of these generative tools is that their musical elements are so “basic”/rudimentary that they prevent you from finding something new and cool. They almost all are geared toward finding only low-hanging fruit.

    With AI, you have this “prompt” concept, but how rich will the vocabulary be? Can you ask for “Haven’t seen that goofy dog in hours, hope he’s ok.” Ok, maybe not a good example, but something with some layers? Also, if you type the same prompt 50 times will you get 50 completely different things? Can you have it generate something with more unusual time-signatures, chords, scales, tunings, etc.? Will it know how to keep it in the realm of controlled-chaos?

    Yes, the tech is coming, and it will stay. It will improve. Currently, with lots of hand-holding, it is capable of making harmless or even interesting content. As it quickly evolves, it will have an impact on how things are done and how earnings are made or unmade.

    1. i can see that. when old farts have lingered so long that they can be defined as old, and continue to advance in intensity after the initial emission, instead of dispersing, i imagine they would indeed intimidate everyone in the vicinity.

  11. People are responding to this rudimentary AI like, oh, AI is here, how do we respond to this level of AI? This is just the beginning of AI. AI will continue to evolve and progress in its sophistication and intelligence, including artistic creativity. Writing, music, visual art, science, philosophy, etc. This is very basic AI right now but it will evolve. I guess we humans have to ask ourselves, well, how do we fit in to this future superior intelligence???? AI is in its infancy.
