Python Audio Modules
Python has a few modules that deal with audio processing, from basic playback and recording to more complex audio manipulation. Here are some of the most common Python audio modules and their features:
1. pyaudio
Purpose:
pyaudio is a library for working with audio streams. It is commonly used for audio recording and playback.
Key Features:
- Supports audio input and output streams.
- Provides control over sample rate, channels, and buffer size.
- Supports several sample formats, such as 16-bit integer (paInt16) and 32-bit float (paFloat32).
Common Use Cases:
- Voice recording.
- Real-time audio processing.
- Basic playback of WAV files.
Example:
import pyaudio

p = pyaudio.PyAudio()

# Open an input stream: mono, 16-bit samples, 44.1 kHz
stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=44100,
                input=True,
                frames_per_buffer=1024)

print("Recording...")
frames = []
for _ in range(0, int(44100 / 1024 * 5)):  # 5 seconds of recording
    data = stream.read(1024)
    frames.append(data)
print("Done recording.")

# Release the stream and the PortAudio resources
stream.stop_stream()
stream.close()
p.terminate()
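The loop bound in the recording example above can be checked with a little arithmetic. The sketch below uses only the numbers from that example; the constant names (RATE, CHUNK, and so on) are illustrative and not part of the pyaudio API. It shows that truncating with int() drops the final partial buffer, so the recording comes out slightly shorter than 5 seconds.

```python
# Values taken from the recording example above (names are illustrative)
RATE = 44100      # frames (samples per channel) per second
CHUNK = 1024      # frames returned by each stream.read() call
SECONDS = 5       # requested recording length
CHANNELS = 1      # mono
SAMPLE_WIDTH = 2  # bytes per sample for 16-bit audio (paInt16)

# Same expression as the loop bound; int() discards the partial last chunk
num_chunks = int(RATE / CHUNK * SECONDS)

total_frames = num_chunks * CHUNK
actual_seconds = total_frames / RATE
total_bytes = total_frames * CHANNELS * SAMPLE_WIDTH

print(f"chunks read:   {num_chunks}")
print(f"frames stored: {total_frames}")
print(f"duration:      {actual_seconds:.3f} s")
print(f"raw PCM size:  {total_bytes} bytes")
```

If an exact duration matters, one option is to round the chunk count up with math.ceil and trim the extra frames afterwards.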
2. wave
Purpose:
wave is a module in Python’s standard library for reading and writing WAV audio files.
Key Features:
- Supports PCM (Pulse Code Modulation) audio data.
- Tools for extracting metadata like number of channels and sample width.
Common Use Cases:
- Saving audio data recorded using pyaudio.
- Reading WAV files for simple analysis or playback.
Example:
import wave
# Writing a WAV file
with wave.open('output.wav', 'wb') as wf:
    wf.setnchannels(1)                # Mono
    wf.setsampwidth(2)                # 16-bit samples (2 bytes each)
    wf.setframerate(44100)            # Sampling rate in Hz
    wf.writeframes(b''.join(frames))  # Frames from PyAudio
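The writing example above depends on frames captured with pyaudio. As a self-contained sketch that needs no microphone, the block below generates a short sine tone, writes it with wave, and then reads the metadata back. The tone parameters and the file name tone.wav are assumptions chosen for illustration.

```python
import math
import os
import struct
import tempfile
import wave

RATE = 44100       # frames per second
DURATION = 0.5     # seconds of audio to generate
FREQ = 440.0       # tone frequency in Hz
AMPLITUDE = 16000  # peak sample value, inside the signed 16-bit range

# Build the tone as little-endian signed 16-bit samples ("<h")
n_frames = int(RATE * DURATION)
pcm = b"".join(
    struct.pack("<h", int(AMPLITUDE * math.sin(2 * math.pi * FREQ * i / RATE)))
    for i in range(n_frames)
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tone.wav")  # illustrative file name

    # Write the PCM data with the same three header settings as above
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(RATE)
        wf.writeframes(pcm)

    # Read the file back and pull out its metadata and raw frames
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        sample_width = wf.getsampwidth()
        frame_rate = wf.getframerate()
        frame_count = wf.getnframes()
        raw = wf.readframes(frame_count)

# Decode the raw bytes back into integers for a simple peak measurement
samples = struct.unpack(f"<{frame_count}h", raw)
peak = max(abs(s) for s in samples)

print(f"channels={channels}, width={sample_width} bytes, rate={frame_rate} Hz")
print(f"frames={frame_count}, duration={frame_count / frame_rate:.2f} s, peak={peak}")
```

Decoding with struct works for small files; for longer recordings, a library that returns arrays is usually a better fit.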
3. soundfile
Purpose:
soundfile reads and writes audio files in a variety of formats, such as WAV and FLAC.
Key Features:
- Offers high-quality sound I/O.
- Supports 24-bit as well as floating-point audio data.
- Supports various audio file formats.
Common Use Cases:
- Advanced file manipulation with better performance than wave.
Example:
import soundfile as sf
# Reading a file
data, samplerate = sf.read('input.wav')
# Writing a file
sf.write('output.wav', data, samplerate)
4. librosa
Purpose:
librosa is a popular library for music and audio analysis.
Key Features:
- Spectral processing (e.g., Fourier Transforms).
- Pitch, tempo, and beat detection.
- Feature extraction (e.g., Mel-frequency cepstral coefficients – MFCCs).
Common Use Cases:
- Building music recommendation systems.
- Analyzing musical features of audio.
Example:
import librosa
# Load audio
audio_data, sr = librosa.load('audio.mp3')
# Extract features
tempo, beats = librosa.beat.beat_track(y=audio_data, sr=sr)
print(f"Tempo: {tempo}")
5. pygame
Purpose:
pygame is primarily a game development library, but it also includes support for basic audio playback.
Key Features:
- Simple WAV and MP3 file playback.
- Looping and volume control.
Common Use Cases:
- Game sound effects.
- Background game music.
Example:
import pygame
# Initialize pygame mixer
pygame.mixer.init()
# Load and play audio
pygame.mixer.music.load("background.mp3")
pygame.mixer.music.play()
6. audiorecorder
Purpose:
audiorecorder is a Python package for recording audio easily.
Key Features:
- Minimal setup for recording audio.
- Simple and intuitive API.
Example:
from audiorecorder import AudioRecorder
recorder = AudioRecorder(channels=1, rate=44100)
recorder.start() # Start recording
input("Press Enter to stop recording...")
recorder.stop()
recorder.save("recording.wav")
7. scipy
Purpose:
scipy.io.wavfile is a module of the scipy library for reading and writing WAV files.
Key Features:
- A simple interface for handling WAV files.
- Returns audio data as numpy arrays, ready for analysis.
Common Use Cases:
- Processing audio signals with mathematical operations.
Example:
from scipy.io import wavfile
# Read a WAV file
rate, data = wavfile.read('input.wav')
# Write a WAV file
wavfile.write('output.wav', rate, data)
8. speech_recognition
Purpose:
speech_recognition (distributed on PyPI as SpeechRecognition) is a library for recognizing speech in audio and converting it to text.
Key Features:
- Works with the Google Speech Recognition API and other engines.
- Provides a very simple API for speech recognition.
Common Use Cases:
- Building voice assistants.
- Speech-to-text applications.
Example:
import speech_recognition as sr
recognizer = sr.Recognizer()
with sr.AudioFile('speech.wav') as source:
    audio = recognizer.record(source)  # Read the entire file
text = recognizer.recognize_google(audio)
print(f"Recognized Text: {text}")
Comparison of Modules:
| Module | Key Use Case | Formats Supported |
|---|---|---|
| pyaudio | Recording & Playback | Real-time audio streams |
| wave | WAV file I/O | PCM WAV |
| soundfile | File I/O | WAV, FLAC, etc. |
| librosa | Audio Analysis | WAV, MP3 |
| pygame | Game Audio | WAV, MP3 |
| audiorecorder | Simple Recording | WAV |
| scipy.io.wavfile | Basic WAV Processing | PCM WAV |
| speech_recognition | Speech-to-text | WAV (via recognizers) |