Spleeter is Deezer's open-source AI tool capable of isolating a song's vocals / Photo Credit: rafapress (via Shutterstock)
Splitting a song into instrumentals and vocals is tedious for producers and DJs, and people who just want to play around with isolated audio may find it too difficult, as reported by James Vincent of American technology news site The Verge. Other methods exist, but they are often time-consuming. An open-source AI tool now promises to make the process faster and easier.
Music streaming service Deezer developed Spleeter for research purposes and released it as an open-source package. The company published the code on GitHub for anyone who wants to download and use the software. All you have to do is feed it an audio file, and it will split the song into two, four, or five separate audio tracks, or stems. The results are not perfect, but the tracks are usable. When Spleeter is run on a dedicated GPU, it can “split audio files into four stems 100 times faster than real-time.”
You might be tempted to use Spleeter to create mashups, but you’ll need some technical expertise to use the software. If you already work with Python or TensorFlow, Google’s machine learning toolkit, you can download a few programs to run it. You’ll also have to get comfortable with a command-line interface rather than a “more accessible visual interface.”
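For readers comfortable with Python, the workflow described above can be sketched roughly as follows. This is an illustrative sketch, not Deezer's recommended setup: it assumes the `spleeter` package is installed and uses the `Separator` class and `separate_to_file` method from Spleeter's Python API. The helper names `expected_outputs` and `split_song` are hypothetical, and the output layout (one folder per song, one WAV file per stem) reflects Spleeter's default behavior as we understand it, which may differ between versions.

```python
from pathlib import Path

# Stem names for each pretrained configuration (2, 4, or 5 stems).
STEMS = {
    2: ["vocals", "accompaniment"],
    4: ["vocals", "drums", "bass", "other"],
    5: ["vocals", "drums", "bass", "piano", "other"],
}


def expected_outputs(audio_path, output_dir, n_stems=2):
    """Return the WAV paths we expect Spleeter to write for one song.

    Assumes the default layout: <output_dir>/<song name>/<stem>.wav
    """
    if n_stems not in STEMS:
        raise ValueError("n_stems must be 2, 4, or 5")
    track_dir = Path(output_dir) / Path(audio_path).stem
    return [track_dir / f"{name}.wav" for name in STEMS[n_stems]]


def split_song(audio_path, output_dir, n_stems=2):
    """Split one audio file into stems (hypothetical wrapper)."""
    # Validate the stem count before loading any model.
    outputs = expected_outputs(audio_path, output_dir, n_stems)
    # Imported here so the helper above works without Spleeter installed.
    from spleeter.separator import Separator

    separator = Separator(f"spleeter:{n_stems}stems")
    separator.separate_to_file(str(audio_path), str(output_dir))
    return outputs
```

Calling `split_song("song.mp3", "output", n_stems=4)` would, under these assumptions, write `vocals.wav`, `drums.wav`, `bass.wav`, and `other.wav` into `output/song/`. The first run also downloads the pretrained model, so it needs an internet connection.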
Deezer’s chief data and research officer Aurelien Herault told The Verge via email that the company trained Spleeter on 20,000 songs with pre-isolated vocals across various genres. From those examples, the software learned how to isolate the tracks on its own.
Spleeter is one more example of how AI tools can make creative work simpler. Machine learning already automates tasks such as removing a photo’s background or upscaling textures in old video games, and machine learning tools are built into consumer software like Photoshop and Runway ML. Deezer, however, does not plan to turn Spleeter into a consumer tool. Herault explained, “Internally, we’re using it as a pre-processing tool for complex research tasks such as music categorization, transcription and language detection.”