ksherlock 12 days ago

Not to be dismissive, but as far as I can tell, the heavy work is done by facebook's demucs and this is an electron front end to run the demucs cli (and I guess search youtube for videos to download). The demucs project page has more information.


  • lapink 12 days ago

    Original Demucs author here. Thanks for putting forward our research!

    I’m definitely happy to see more front ends for Demucs being developed and to read that it has been useful to other musicians!

    We are working on the next iteration of the model, and with more sources, hopefully released by the end of the year :)

    If you are interested in this research you can follow my Twitter (@honualx) or star the Demucs repo.

    • game-of-throws 12 days ago

      I'm curious, what is the business justification for funding development of Demucs, if you don't mind me asking? It doesn't seem very related to FB's core business.

      • FLT8 12 days ago

        Solving problems like audio source separation (e.g. distinguishing multiple speakers in a noisy environment, or picking speech out of a background where music is playing) seems very much in FB's wheelhouse.

      • lapink 12 days ago

        The goal of Meta AI Research is to do open research, not necessarily with direct applications at the time we start it. Indeed, the architecture, or the lessons learnt working on it, can become useful later for the company, for instance for remote presence with VR, to isolate the main speaker from their environment (https://arxiv.org/pdf/2206.15423.pdf).

      • nickserv 11 days ago

        Just a guess here, but I wouldn't be surprised if it's used to better spy on your messenger audio conversations. They already listen in and will pick up keywords to populate your FB ad stream.

    • dekhn 10 days ago

      Hi, I just downloaded demucs yesterday and started using it. It's amazing! I really appreciate all the work you put into making it easy to install and understand.

      Is there any chance you can disentangle guitar and keyboard? I work a lot with Grateful Dead music and I'd like to be able to pull Jerry's guitar out from the keyboard in live shows. Similarly, it would be cool if you could parse Shpongle into its constituent tracks, but I think that's probably impossible.

  • pininja 12 days ago

    There’s no need to be dismissive since they say this in the first sentence. Preparing an easy to use app for all platforms probably does get this into more creative hands, and that’s a net-positive contribution I can appreciate.

    • amelius 11 days ago

      It should be in the title.

    • TimTheTinker 12 days ago

      It does seem rather disingenuous that the product page makes no mention that the author didn't do the heavy lifting, and that at the same time it features a prominent donation button.

      If I didn't know who did the real work and benefitted a lot from this tool, I'd give to StemRoller in proportion to my gratitude -- which I'm sure others are liable to do.

      • frob 12 days ago

        It's in the first paragraph of the README in the GitHub repo and the second paragraph on the website. I'm not sure what more can be asked of the author.

        • TimTheTinker 12 days ago

          Thanks - I didn't even notice there was content below the fold on the web site (I'm on a desktop browser).

          How about saying so above the fold?

          • wszfahwbwbaha 12 days ago

            How about you stop being so pedantic and admit it when you're wrong instead of digging a deeper, dumber hole.

            • TimTheTinker 12 days ago

              This is a real design problem - many web sites do this.

              Saying so and suggesting an alternative isn't being pedantic.

              • drivebycomment 11 days ago

                > It does seem rather disingenuous that the product page makes no mention that the author didn't do the heavy lifting,

                It does seem rather disingenuous that your replies make no mention of admission for making a factually incorrect statement in the first reply.

                Not a big deal to me personally, but it is not surprising that some people see this as being petty.

                • TimTheTinker 11 days ago

                  Thanks for clarifying. I did think saying "Thanks" to the person who corrected me is a straightforward admission that I was wrong... if not - yes, I was wrong.

                  To the downvoters above: is a "thanks" to a correcting comment not enough on HN?

                  Also, was it rude to say hiding content "below the fold" is a design problem?

                  This is a very odd thread to me - like I was being chased down by pedants who are calling me pedantic.

                  • wszfahwbwbaha 9 days ago

                    Seems you think everyone besides yourself is always the issue...maybe self reflect a bit

thisiswater 12 days ago

Tried splitting a complex arrangement (Chicago by Sufjan Stevens). Drums, bass and vocals come out fairly well, though the drums stem seems to lack percussion elements outside of the core rock drum kit (e.g. tambourine), and cymbal hits are clipped rather than ringing. The 'other' stem, the rest of the instrumentation, keeps a fair bit of the percussion and there's bleed from the vocal melody.

The backing vocals seem to have disappeared for the most part, and are only audible in the vocals stem when the lead vocal is present (like they're reverse-ducked? Been a while since I did any production, the terms have escaped me...).

Not much use with complex arrangements to be honest, I was hoping to get things like the strings section separated from the rest of the arrangement.

Original: https://www.youtube.com/watch?v=tWX3El-slpY

Output: https://file.io/etpOQt57ziKe

  • pcf 12 days ago

    Did you use a FLAC/WAV file? That should yield the best results.

    (Only asking because you linked to YouTube, and I'm not sure if you used the YouTube audio for your source.)

    • thisiswater 12 days ago

      Perhaps you're right, I'd have to check.

      I typed the song in the search and pressed the first likely result, which is the youtube video I linked. Using the software as intended I believe.

      • marssaxman 9 days ago

        YouTube audio is optimized for bit rate, not quality (128K MP3). You will get better results with a higher-bitrate MP3 (320K would be good), better still with a lossless format like FLAC or WAV.

    • BitPirate 12 days ago

      Makes sense. MP3 tries to compress without losing information in the spectrum audible to humans, but the discarded information can still matter to algorithms.
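
For a sense of scale on the bitrates mentioned above, here is a rough back-of-the-envelope sketch. It assumes CD-quality PCM (44.1 kHz, 16-bit, stereo) as the uncompressed baseline, which is an assumption and not something stated in the thread. The function names are illustrative only.

```python
# Assumed baseline: CD-quality PCM (44.1 kHz, 16-bit, stereo).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def compression_ratio(lossy_kbps: float) -> float:
    """How many times smaller a lossy stream is than the assumed PCM baseline."""
    return pcm_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS) / lossy_kbps


if __name__ == "__main__":
    baseline = pcm_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
    print(f"PCM baseline: {baseline:.1f} kbps")
    for kbps in (128, 320):
        print(f"{kbps} kbps lossy is about {compression_ratio(kbps):.1f}x smaller")
```

Under that assumption, a 128 kbps stream keeps roughly a tenth of the bits of the baseline, which gives some intuition for why a separation model may have less to work with.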

djcannabiz 12 days ago

I tried throwing some underground rap artists at this app, as stem splitters usually struggle with them

I split https://www.youtube.com/watch?v=DDaL7KBjkDI

And it gave me this https://www.dropbox.com/sh/inyk38n2jrp5i45/AACpB0xXNFxamEmP3... I noticed some weird hissing with the 808s, but other than that it sounded pretty good

For more of a challenge, I inputted https://www.youtube.com/watch?v=uAwQ3njiU4M

and it came up with https://www.dropbox.com/sh/97lzke0puh9dzeo/AACE75vsbNS43UqqH... It was able to separate some of the kicks from the 808s, which is really impressive to me!

Overall, I'm very impressed! This sounds much better than lalal.ai to me

  • polishdude20 12 days ago

    I'd like to take a moment to mention how great dropbox's audio seeking thing is. It's super fast and works as intended. Great work whoever implemented this.

  • pelagic_sky 12 days ago

    I’ve found Lala to be my go to. If this is better, then I’m very interested in trying it out.

    • pelagic_sky 12 days ago

      Just a follow-up. In my two conversions so far, Lalal.ai has been better, especially at separating drums from instruments. I'll give StemRoller a few more tries as I am always looking for options.

      • pelagic_sky 10 days ago

        Update number three. I now just use both lalal and stemroller because each one seems to do better in certain cases. If I hadn’t paid for lalal, I’d probably just use stemroller as it’s way better than RX9

  • metadat 12 days ago

    Why do vocals.wav, other.wav, and instrumental.wav all start out the exact same (with vocal sounds)?

  • squeaky-clean 12 days ago

    Super impressive splitting there, wow. Just curious, was your source a lossless or compressed file?

    • djcannabiz 12 days ago

      The second file was lossless, the first was ripped from a CD.

elaus 12 days ago

This seems to run just fine under Linux as well, though not completely out of the box: it's basically missing builds and config for Linux, which can be built analogously to the existing Win/Mac stuff.

You also have to build the demucs-cxfreeze dependency (as described in its repo, https://github.com/stemrollerapp/demucs-cxfreeze).

  • elaus 12 days ago

    It's almost eerie how well this works with electronic music. Coming from an age where your best try to separate a track was using equalizers, I didn't have high hopes.

    Trying it out with Alan Walker's Alone, it separates the vocals and drums almost perfectly. Bass is really fine as well, only instrumental and 'other' was a bit mixed up in my try.

  • knicholes 12 days ago

    Whenever I see an "##Installation" section with more than one step, I immediately call DOCKER!

dylan604 12 days ago

"Download and extract the latest ffmpeg snapshot from evermeet.cx and place the ffmpeg executable inside"

Why? Why can't this just point to the location where ffmpeg is rather than making a copy of ffmpeg? symlink might work, but just do a $(which ffmpeg) or ask the user for the path ~/bin/ffmpeg /usr/local/bin/ffmpeg etc
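
A minimal sketch of the lookup order suggested here, written in Python purely for illustration (StemRoller itself is an Electron app, so this is not its actual code). The function name `find_ffmpeg` and the fallback list are hypothetical; the fallback paths are simply the ones named in the comment above.

```python
import os
import shutil
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical fallback locations, taken from the paths mentioned in the comment.
FALLBACK_PATHS = ("~/bin/ffmpeg", "/usr/local/bin/ffmpeg")


def find_ffmpeg(user_path: Optional[str] = None,
                fallbacks: Iterable[str] = FALLBACK_PATHS) -> Optional[str]:
    """Return a path to an ffmpeg executable, or None if none is found.

    Order: explicit user-supplied path, then a PATH lookup, then fallbacks.
    """
    candidates = []
    if user_path:
        candidates.append(user_path)
    on_path = shutil.which("ffmpeg")  # same idea as `which ffmpeg`
    if on_path:
        candidates.append(on_path)
    candidates.extend(fallbacks)

    for candidate in candidates:
        path = Path(candidate).expanduser()
        # Accept only regular files that the current user may execute.
        if path.is_file() and os.access(path, os.X_OK):
            return str(path)
    return None


if __name__ == "__main__":
    print(find_ffmpeg() or "ffmpeg not found; ask the user for a path")
```

This only locates a binary. It does not check that the installed version accepts the arguments the app would pass, which is the concern raised in a reply below.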

  • PaulDavisThe1st 12 days ago

    ffmpeg has not had a stable command line interface for some time. It can be a problem to assume that the system-installed version accepts the arguments you plan to give it.

  • Rodeoclash 12 days ago

    It's even easier than that. There are a few npm libs around that are dedicated to shipping a copy of ffmpeg with Electron.

    • dylan604 12 days ago

      even easier than what i already have on my system? what are you saying here, as it makes no sense to me

  • linux2647 12 days ago

    Maybe there’s some feature of bleeding edge ffmpeg that’s required for the app

setgree 12 days ago

Open Culture recently posted a link to Abbey Road but with only Paul's bass lines, but the actual content got taken down. [0] It was really cool though, in part because it's not nearly as precise as I would have thought, which made it feel really organic.

[0] https://www.openculture.com/2022/04/hear-the-beatles-abbey-r...

  • TylerE 12 days ago

    In the real world where tracks are cut live, there is a fair bit of microphone bleed

    • salmo 11 days ago

      I imagine studio-era Beatles in particular would be difficult.

      Microphone bleed, lots of overdubs (especially vocals), and repeated re-layering tracks on tape over and over due to channel limitations. They really were doing crazy stuff with limited tech.

      I think this would be hard for bands that really fill the spectrum and don’t have that clean treble, mid, bass separation. Or recordings really compressed into a frequency range.

      Now this makes me want to see what happens with like My Bloody Valentine and Husker Du :).

    • hammock 12 days ago

      Especially in the day and style that the Beatles recorded. Today, not so much

phonescreen_man 12 days ago

Been using demucs for a couple of weeks now, mostly taking music I produced early on, whose project files I have since lost, and giving it a remix and update. Gotta say I have been blown away by how good demucs is. I installed it following the repo instructions and then created a zsh alias to run it with any file name, e.g. $ ai_split mySong.mp3

Wait fifteen minutes and out pop four stems, flawless so far. I've even been messing around with mainstream tracks, using Ableton with warp applied to quickly build out remixes. Demucs is going to be (or already is) a game changer!
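
The alias itself is not shown, so the following is only a guess at an equivalent wrapper, written in Python. It assumes the `demucs` command is on PATH after installing from the repo, and it uses only the `-o` (output folder) and `--two-stems` options of the Demucs CLI. The names `build_demucs_command` and `ai_split` are hypothetical.

```python
import shutil
import subprocess
from typing import List, Optional


def build_demucs_command(track: str,
                         out_dir: Optional[str] = None,
                         two_stems: Optional[str] = None) -> List[str]:
    """Build the argument list for one Demucs CLI invocation."""
    cmd = ["demucs"]
    if out_dir:
        cmd += ["-o", out_dir]            # folder that receives the separated stems
    if two_stems:
        cmd += ["--two-stems", two_stems]  # e.g. "vocals": that stem plus everything else
    cmd.append(track)
    return cmd


def ai_split(track: str,
             out_dir: Optional[str] = None,
             two_stems: Optional[str] = None) -> int:
    """Run Demucs on one track and return its exit code."""
    if shutil.which("demucs") is None:
        raise FileNotFoundError(
            "demucs is not on PATH; install it per the Demucs repo instructions")
    cmd = build_demucs_command(track, out_dir, two_stems)
    return subprocess.run(cmd, check=False).returncode
```

Calling `ai_split("mySong.mp3")` would then run the same command as typing `demucs mySong.mp3` in a shell. Where the stems land depends on the Demucs version and model, so the sketch does not try to predict output paths.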

  • eyelidlessness 12 days ago

    This testimonial almost has me wanting to try it on an “album”[1] I recorded when I was in a “band”[2] in high school. I too lost all of the source files[3].

    1: On second thought maybe not. It has not aged well.

    2: Me and another kid, with a guitar, a pre-OS X Mac, a pirated copy of Rebirth, a pirated copy of SoundEdit 16, and literally the mic that Apple used to include with (some?) Macs. I’d back-reference[1], but our equipment was not the problem. Well, except for [3].

    3: I learned my lesson: I should have been older and had a job that would afford me a backup drive, so I could sample the sounds of that dying HDD and retcon the samples into my “album”[1].

  • pininja 12 days ago

    That’s awesome! I wonder if there are projects to create a repository of pre-split public domain music? Seems like something the internet archive could host once created.

Dwedit 12 days ago

Let's see how long it takes for some new Neil Cicierega remixes to appear now.

  • intvocoder 12 days ago

    With a tool like this, you could get back into the animutation scene. (Edit: I guess it's a bit of a non-sequitur, but I enjoyed Suzukisan, so there's that.)

nr2x 12 days ago

How is this similar/different than the Deezer one?

  • ksherlock 12 days ago

    I just did a quick test of demucs vs spleeter:4stems. demucs is significantly slower but the output is better.

    in a semi blind comparison, I prefer demucs for all 4 tracks (drum, bass, vocals, and other). bass and other stand out the most so let me say a couple words about them.

    bass - the demucs bass has less bleed from other instruments and the volume is consistent throughout. with spleeter, the volume varies a lot and there are multiple sections of 1-2 bars where it just drops out completely. In Capo, the demucs spectrogram is nice and clear whereas spleeter tends to look like pencil smudges for the most part.

    other - with spleeter, whenever there are vocals, the other instruments turn to mush. demucs is much better. Oh, you can tell people are singing -- the instruments get muffled -- but you can still hear them.

    • anigbrowl 12 days ago

      It's pretty decent. I threw a drum'n'bass track at it to see how it would cope with heavily produced material and the results were surprisingly good.

  • CharlesW 12 days ago

    I'd also be interested in how it compares to iZotope RX's Music Rebalance (examples from earlier releases here: https://www.izotope.com/en/learn/stem-isolation-music-rebala...).

    • avis 12 days ago

      I'd be interested to know how it compares to iZotope as well as phonicmind.

eshack94 12 days ago

I dabble in audio production in my free time outside of work, and I typically will use iZotope RX 9 or Neural Mix Pro for isolating vocals or stems. However, these are paid products, and it's encouraging to see more open source projects being built around this space.

I like the opportunity to view the source code and learn from it, as opposed to most paid products which are typically closed-source and a bit of a "black box".

Sure - this is mostly just an accessible frontend for Demucs, but that's still okay. The author clearly indicates that in his repo, giving credit where credit is due. Additionally, this helps less-technical creators be creative in new ways.

Thanks to all who contributed.

yarg 12 days ago

Honestly, this sort of thing is cool; but why (in general) is it necessary in the first place?

If the elements of the song are recorded in isolation, which they are in all studio versions, why can't we just move to a format that supports the layering?

  • gavinray 12 days ago

    Musicians and studios don't generally tend to offer the public access to original stems for songs (why would they?)

    Say that you want to make a remix, mashup, or otherwise use sound bites from a song. The easiest thing to do is use a tool like Spleeter/Demucs to separate the source layers so that you can then further process them in your DAW.

    This is what I do, but I just use the Demucs CLI because it's simple enough.


    • pabs3 12 days ago

      Are there no communities of "open source" music? It sounds like the stems are part of the "source code" for tracks.

      • jononor 12 days ago

        Many niches in electronic music have tight-knit communities of creators and producers who regularly remix each other's stuff. But it is not an open community: you've got to have decent standing (from making your own music or prior remixes) before someone is willing to send you their stems. For any musician who has a label/publisher, they also need to be in the loop to handle the royalties. So sharing stems happens regularly in the music industry, but it is not easily accessible. That makes tools like the one mentioned very useful for everyone else who would like to participate.

  • osigurdson 12 days ago

    It isn't really in the best interest of the artist to provide this. The final mix is part of the overall product / work of art. Providing all of the individual tracks (there could be 30 or more in total) would also take up a lot of space / increase processing requirements while benefiting very few.

  • spyrefused 12 days ago

    I usually use this kind of tool to get the bass score of some songs, for example. With the isolated elements it is much easier to know exactly what notes are sounding (I don't have a good ear). The same goes for drums or synth notes.

    Since sound quality doesn't matter much to me for this, I usually use iZotope RX, but I will try this tool.

  • amelius 11 days ago

    This is like asking why we need decompilers.

    • yarg 11 days ago

      > (In general)

      Yes, I agree.

atoav 12 days ago

For all who are looking for something like this, iZotope RX (the audio retouching software) has a function called "Music Rebalance" which is great for reducing spill or changing the balance in a live recording.

ccn0p 12 days ago

talk about a missed opportunity without examples. did I miss them somewhere?

nerfhammer 12 days ago

I've always wanted a way to extract just the kick drums in realtime but I don't understand this field well enough to understand whether it would be remotely possible or not.

  • jononor 12 days ago

    You want just the beat, ie the time markers of each kick? Or you want the isolated sound (ie audio) of each kick? Both are generally possible today, though the approach will differ a little bit.
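
For the first case (time markers only), one simple approach is a low-band energy onset detector. This is a swapped-in classic technique, not Demucs and not something described in the thread: low-pass the signal, measure energy frame by frame, and flag a frame whose energy jumps well above a running average. The sketch below is pure Python, causal (so it could run on a stream), and every parameter value is an assumption chosen for the synthetic demo signal.

```python
import math
from typing import Iterable, List


def detect_kicks(samples: Iterable[float], sample_rate: int,
                 cutoff_hz: float = 150.0, frame_size: int = 256,
                 ratio: float = 4.0, floor: float = 1e-3,
                 refractory_s: float = 0.1) -> List[float]:
    """Return approximate kick onset times in seconds (frame resolution)."""
    # One-pole low-pass filter keeps mostly the low end where kicks live.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    low = 0.0            # filter state
    frame_energy = 0.0   # energy accumulated in the current frame
    avg_energy = 0.0     # slow running average of past frame energies
    last_onset = -math.inf
    onsets: List[float] = []

    for i, x in enumerate(samples):
        low += alpha * (x - low)
        frame_energy += low * low
        if (i + 1) % frame_size == 0:
            energy = frame_energy / frame_size
            t = (i + 1 - frame_size) / sample_rate  # start time of this frame
            if (energy > floor and energy > ratio * avg_energy
                    and t - last_onset >= refractory_s):
                onsets.append(t)
                last_onset = t
            avg_energy = 0.9 * avg_energy + 0.1 * energy
            frame_energy = 0.0
    return onsets


def synth_demo(sample_rate: int = 8000, seconds: float = 2.0) -> List[float]:
    """Synthetic signal: decaying 60 Hz 'kicks' plus quieter 3 kHz 'hi-hats'."""
    kicks = (0.25, 0.75, 1.25, 1.75)
    hats = (0.5, 1.0, 1.5)
    out = []
    for n in range(int(sample_rate * seconds)):
        t = n / sample_rate
        s = 0.0
        for k in kicks:
            if t >= k:
                s += math.exp(-(t - k) / 0.05) * math.sin(2 * math.pi * 60 * (t - k))
        for h in hats:
            if t >= h:
                s += 0.3 * math.exp(-(t - h) / 0.02) * math.sin(2 * math.pi * 3000 * (t - h))
        out.append(s)
    return out


if __name__ == "__main__":
    print(detect_kicks(synth_demo(), 8000))
```

Real mixes are much harder than this synthetic signal, since bass notes and toms share the same band. Extracting the isolated kick audio (the second case) would need an actual separation model.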

screech 12 days ago

Just wow! There were methods extracting acapellas from tracks, but this tool here is another level. Fascinating how good the results are.

polishdude20 12 days ago

This is awesome! Tried it out on Rush's Tom Sawyer and it splits out the vocals great! I can see this being super useful!

abbusfoflouotne 12 days ago

Would appreciate an easier way to download and run this! The steps on the readme are pretty long, at least for me (Mac user)

interestica 12 days ago

How does it compare to lalal.ai ?

  • threefour 12 days ago

    It's free.

    • amelius 11 days ago

      And otherwise identical?

      • kbob 10 days ago

        Demucs did a much better job of isolating the bass on a blues track than LALAL. The bass actually sounded like a bass. LALAL got the note pitches but lost their attacks.

colecut 11 days ago

Anyone else just getting 'failed' on every song they try?

NonNefarious 12 days ago

How do you load a local file?

  • diimdeep 12 days ago

    There is no support for such a thing. This is software in the year 2022: never local, online first.

    • NonNefarious 11 days ago

      Hahah, I know, right? People actually believe that shit... until they get jacked by a service provider.

volkse 12 days ago

Is there a VST front end?

raydiatian 12 days ago

How does it perform compared to Deezer Spleeter or lalal.ai?

Else who cares