
10 Deep Learning Resources for Audio Processing

April 19, 2019 Posted by Programming

We’ve written before about the rise of voice assistants in the IoT market. As these devices become more and more sophisticated, we can expect the number of voice assistants to skyrocket to 8 billion by 2023. This emerging market creates an excellent opportunity for those interested in audio processing or audio anomaly detection. In the future, there may be more emphasis on creating valuable resources that help up-and-coming audio processing data scientists practise their craft. Even now, plenty of resources are available. Below are 10 resources for audio processing.


CS Machine Learning for Signal Processing

These are signal processing notes from a computer science class, and they cover the mathematical background you need before moving on to other tasks. The topics include DSP Primer, Perception and Features, Principal Component Analysis, ICA and NMF, KPCA and Manifold Methods, Detection and Matched Filters, Decision theory & classifiers, Nonlinear classifiers, Classification bits and pieces, Clustering, DTW and HMMs, Missing data & dynamical models, Arrays & source separation, Underconstrained separation, and Deep Learning.
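
To give a flavour of the "DSP Primer" material, here is a minimal sketch of a discrete Fourier transform written with only the Python standard library. It is not taken from the course notes. The function name, the 50 Hz test tone and the 400 Hz sample rate are all assumptions chosen for illustration, and a real project would normally use an optimized FFT routine instead.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the magnitude of each DFT bin from 0 up to the Nyquist bin."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        # Correlate the signal with a complex sinusoid at bin k.
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        magnitudes.append(abs(acc))
    return magnitudes


# Hypothetical test signal: a 50 Hz sine sampled at 400 Hz for one second.
SAMPLE_RATE = 400
signal = [math.sin(2 * math.pi * 50 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]

mags = dft_magnitudes(signal)
peak_bin = max(range(len(mags)), key=mags.__getitem__)
# Bin k corresponds to k * sample_rate / n Hz.
peak_hz = peak_bin * SAMPLE_RATE / len(signal)
print(f"Strongest component: {peak_hz:.1f} Hz")
```

The loop is quadratic in the signal length, so it is only suitable for short signals like this one, but it shows what a frequency bin actually measures.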



Audio Set

You can’t train a model without data. Audio Set provides millions of free sound clips.



A forum user posted a comprehensive guide about deep learning with audio.


Speech and Audio Understanding

Another CS course filled with lecture slides. These notes focus on auditory processing, covering both the biology of hearing and the underlying technical process.


Human and Machine Hearing: Extracting Meaning from Sound 

This book, written by Richard F. Lyon, will teach you how humans hear and how to build machines that respond to them. It’s worth a read for anyone involved in audio processing.



aubio

“aubio is a library to extract annotations from audio signals: it provides a set of functions that take an input audio signal, and output pitch estimates, attack times (onset), beat location estimates, and other annotation tasks.”
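
To make the idea of a "pitch estimate" concrete, here is a rough sketch of one classic approach, autocorrelation, using only the Python standard library. It does not use aubio's API and is not necessarily how aubio works internally. The function name, the search range of 80 to 1000 Hz and the 200 Hz test tone are assumptions made for this example.

```python
import math


def estimate_pitch(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency in Hz from the autocorrelation peak."""
    min_lag = int(sample_rate / fmax)  # shortest period we search for
    max_lag = int(sample_rate / fmin)  # longest period we search for
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation: the overlap shrinks as the lag grows,
        # which gently favours the fundamental period over its multiples.
        score = sum(
            samples[i] * samples[i + lag] for i in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Hypothetical test signal: a 200 Hz sine, 0.25 s at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE) for t in range(2000)]

pitch = estimate_pitch(tone, SAMPLE_RATE)
print(f"Estimated pitch: {pitch:.1f} Hz")
```

The estimate is limited to whole-sample lags, so real signals usually need interpolation and some protection against octave errors, which is where a dedicated library earns its keep.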



LibROSA

“LibROSA is a python package for music and audio analysis. It provides the building blocks necessary to create music information retrieval systems.”


Urban Sound Classification with Neural Networks in Tensorflow

“This post discuss techniques of feature extraction from sound in Python using open source library Librosa and implements a Neural Network in Tensorflow to categories urban sounds, including car horns, children playing, dogs bark, and more.”
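
The post itself relies on Librosa and TensorFlow. As a dependency-free illustration of what "feature extraction" means, the sketch below computes two simple per-frame features, RMS energy and zero-crossing rate. These two features, the frame size and the synthetic signals are our own assumptions rather than anything taken from the post, which works with richer features and real recordings.

```python
import math
import random


def frame_features(samples, frame_size=256):
    """Split a signal into non-overlapping frames; return (rms, zcr) per frame."""
    features = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        # RMS energy: a rough measure of how loud the frame is.
        rms = math.sqrt(sum(x * x for x in frame) / frame_size)
        # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        )
        features.append((rms, crossings / (frame_size - 1)))
    return features


def mean_zcr(samples):
    """Average the zero-crossing rate over all frames of a signal."""
    feats = frame_features(samples)
    return sum(zcr for _, zcr in feats) / len(feats)


# Hypothetical inputs: a 100 Hz tone and uniform noise, both 2048 samples at 8 kHz.
random.seed(0)
tone = [math.sin(2 * math.pi * 100 * t / 8000) for t in range(2048)]
noise = [random.uniform(-1.0, 1.0) for _ in range(2048)]

print(f"tone  mean ZCR: {mean_zcr(tone):.3f}")
print(f"noise mean ZCR: {mean_zcr(noise):.3f}")
```

A tonal sound crosses zero far less often than noise, so even these two numbers start to separate sound categories. Each frame's feature tuple is the kind of input a classifier would then be trained on.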


Audio Classification using FastAI and On-the-Fly Frequency Transforms

This article provides a more informal approach to Deep Learning and Audio Classification by introducing a practical technology called FastAI.


Science Wiki for Audio Processing

If those suggestions weren’t enough and you want to completely geek out on audio processing, search for audio processing in the Science Wiki.
