Building web applications:
Audio separation

Building a Simple Audio Separation Web Application Using Python

Abstract

Hello, fellow enthusiasts of technology, creativity and development! Welcome to another blog post on the JP Madsen website. I’m thrilled to share an exciting new web application that emerged from my personal passion for music composition and audio editing. In this article, we’ll explore the world of Audio Separation, a powerful tool that harnesses the magic of machine learning to transform the way we experience music.

Content

Unleash the potential of audio separation

Imagine having the ability to dissect a music track into its individual components, such as vocals, bass, drums, and other elements. This is precisely what the Audio Separation web application brings to the table. Inspired by the groundbreaking Demucs Music Source Separation package by Meta Research, this tool empowers you to delve into the intricate layers of your favorite tunes and even your original compositions.

How it works

At the heart of the Audio Separation web application is the Demucs package, which utilizes machine learning to separate the audio sources. The beauty lies in its simplicity. With just a few clicks, you can upload an audio file, and the tool will perform its magic. Once the process is complete, you’ll have access to individual audio tracks that were once entwined in a harmonious blend.

Setting up the environment

Before you embark on your journey of audio separation, it’s essential to set up the right environment. Create a Conda environment to isolate the project’s dependencies. Open your terminal and run the following commands:

Bash
conda create -n audio-separation python=3.8
conda activate audio-separation
pip install gradio scipy demucs
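
Before moving on, it can help to confirm that the three packages are importable from the active environment. The snippet below is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for illustration and is not part of Gradio, SciPy, or Demucs.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


# The three packages installed with pip in the step above.
required = ["gradio", "scipy", "demucs"]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported as missing, the most likely cause is that the Conda environment was not activated before running pip.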

You’re all set up to experience audio separation directly from your terminal with Demucs. But let’s take this a little further and build a web application that can be deployed on Hugging Face Spaces, as well as shared on different platforms such as your company website or SharePoint.

The infrastructure

To create our web application, I’ll rely on two essential libraries (apart from Demucs): Gradio and SciPy. Gradio provides an elegant way to design interactive interfaces for machine learning and other applications. SciPy is a general scientific computing library; here it is only needed for its scipy.io.wavfile module, which writes the uploaded audio to a WAV file that Demucs can read.
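
With `type="numpy"`, Gradio hands an uploaded clip to your function as a `(sample_rate, samples)` tuple, which is the same order `scipy.io.wavfile.write` expects after the filename. The sketch below illustrates that hand-off with a synthetic tone standing in for an upload. The helper `save_gradio_audio` and the temporary path are assumptions for this example, not part of the app.

```python
import os
import tempfile

import numpy as np
from scipy.io.wavfile import read, write


def save_gradio_audio(audio, path):
    """Write a Gradio-style (sample_rate, samples) tuple to a WAV file."""
    sample_rate, samples = audio
    write(path, sample_rate, samples)
    return path


# One second of a 440 Hz tone as 16-bit PCM, standing in for an upload.
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
tone = (0.3 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)

# Write to a temporary folder so the example leaves the project untouched.
wav_path = os.path.join(tempfile.mkdtemp(), "mix.wav")
save_gradio_audio((sample_rate, tone), wav_path)

# Read the file back to confirm the round trip.
rate_back, data_back = read(wav_path)
print(rate_back, data_back.shape, data_back.dtype)
```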

Code breakdown

The code behind the Audio Separation web application involves a series of steps that allow you to harness the power of the Demucs package for audio separation. Here’s a simplified breakdown of the process:

Python
# Import necessary libraries
import os
import gradio as gr
from scipy.io.wavfile import write
import subprocess

def inference(audio):
    # Create a directory to store output files
    os.makedirs("out", exist_ok=True)
    
    # Save the uploaded audio as a WAV file
    write('mix.wav', audio[0], audio[1])

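    # Run Demucs as a subprocess; an argument list avoids shell quoting issues
    command = ["python3", "-m", "demucs", "-n", "mdx_extra_q", "-d", "cpu",
               "mix.wav", "-o", "out"]
    process = subprocess.run(command,
                             stdin=subprocess.DEVNULL,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE
                            )

    # Surface Demucs errors instead of silently returning missing files
    if process.returncode != 0:
        raise gr.Error(process.stderr.decode(errors="replace")[-500:])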

    # Check if separated audio files exist
    files = ["./out/mdx_extra_q/mix/vocals.wav",
             "./out/mdx_extra_q/mix/bass.wav",
             "./out/mdx_extra_q/mix/drums.wav",
             "./out/mdx_extra_q/mix/other.wav"]
    
    for file in files:
        if not os.path.isfile(file):
            print(f"File not found: {file}")
        else:
            print(f"File exists: {file}")

    # Return paths to separated audio files
    return files

# Footer text shown below the interface (rendered as HTML)
article = "<p>Inspired by <a href='https://github.com/facebookresearch/demucs' target='_blank'>Demucs</a></p>\n<p>Copyright © 2023 JP Madsen</p>"

# Create the Gradio interface
demo = gr.Interface(
    inference,
    # gr.Audio replaces the deprecated gr.inputs / gr.outputs namespaces
    gr.Audio(type="numpy", label="Input"),
    [gr.Audio(type="filepath", label="Vocals"),
     gr.Audio(type="filepath", label="Bass"),
     gr.Audio(type="filepath", label="Drums"),
     gr.Audio(type="filepath", label="Other")
    ],
    article=article,
    theme='nuttea/Softblue',
    allow_flagging="never"  
)

# Launch the Gradio interface
demo.launch()
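
The listing hard-codes both the Demucs command and the four output paths. As a hedged sketch, the same information can be derived from the model name and the input filename, assuming Demucs keeps the `<output dir>/<model>/<track name>/<stem>.wav` layout seen in the paths above. The helper names (`build_demucs_command`, `expected_stem_paths`, `missing_stems`) are hypothetical, and using `sys.executable` instead of `python3` is a design choice of this sketch so that Demucs runs with the same interpreter as the app.

```python
import sys
from pathlib import Path

# The four stems returned by the app, in the order of its output widgets.
STEMS = ("vocals", "bass", "drums", "other")


def build_demucs_command(input_path, out_dir="out", model="mdx_extra_q", device="cpu"):
    """Return the Demucs CLI call as an argument list (no shell needed)."""
    return [sys.executable, "-m", "demucs", "-n", model, "-d", device,
            str(input_path), "-o", str(out_dir)]


def expected_stem_paths(input_path, out_dir="out", model="mdx_extra_q"):
    """Map each stem name to the WAV path Demucs is expected to write."""
    track_dir = Path(out_dir) / model / Path(input_path).stem
    return {stem: track_dir / f"{stem}.wav" for stem in STEMS}


def missing_stems(paths):
    """List the stem names whose output file is not on disk."""
    return [stem for stem, path in paths.items() if not path.is_file()]


command = build_demucs_command("mix.wav")
paths = expected_stem_paths("mix.wav")
print(command[1:])
print([p.as_posix() for p in paths.values()])
```

Inside `inference`, the list from `build_demucs_command` could be passed to `subprocess.run`, and a non-empty result from `missing_stems` would be a reason to report an error rather than return paths that do not exist.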

Deploying on Hugging Face Spaces

Take your Python application to the cloud by utilizing an online hosting service such as Hugging Face Spaces, offering free tier spaces with 2 CPUs and 16 GB memory. Create a free account, create a new Space with the Gradio SDK, and upload your script (named app.py by default) together with a requirements.txt file listing the extra packages it needs, such as scipy and demucs. Once deployed, your app will be accessible to users around the world. Naturally, you can also decide to self-host, but such an infrastructure will not be covered by this blog post.

Sharing on different platforms

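Want to embed your Audio Separation app on your company website or SharePoint? You can use web components to seamlessly integrate the app. Simply add the following code snippets to your website’s HTML code, replacing *YOUR_APP* with the subdomain of your Space and making sure the version number in the script URL matches the Gradio version your app runs on: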

HTML
<script
  type="module"
  src="https://gradio.s3-us-west-2.amazonaws.com/3.39.0/gradio.js"
></script>

<gradio-app src="https://*YOUR_APP*.hf.space" eager="true" info="false"></gradio-app>
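
If you embed the app on several pages, or upgrade Gradio later, it can be convenient to generate the two tags rather than edit them by hand. The following is a minimal sketch under stated assumptions: the function name `gradio_embed_snippet` is made up for this example, the Space URL is a placeholder, and the script URL pattern and the `eager`/`info` attributes are copied from the snippet above.

```python
from html import escape


def gradio_embed_snippet(space_url, gradio_version="3.39.0"):
    """Return the <script> and <gradio-app> tags for embedding a Space."""
    script_src = (
        "https://gradio.s3-us-west-2.amazonaws.com/"
        f"{gradio_version}/gradio.js"
    )
    return (
        f'<script type="module" src="{escape(script_src, quote=True)}"></script>\n'
        f'<gradio-app src="{escape(space_url, quote=True)}" '
        'eager="true" info="false"></gradio-app>'
    )


# Placeholder URL; substitute the address of your own Space.
snippet = gradio_embed_snippet("https://example-space.hf.space")
print(snippet)
```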

Conclusion

The Audio Separation web application is a testament to the power of technology in enhancing our understanding and experience of music. Whether you’re a musician, music lover, or simply curious about the intricate layers of sound, this tool opens up new possibilities for creativity and exploration.

Unlock the magic of music and explore the Audio Separation tool today.

Let your ears and imagination run wild!
