Python Audio Modules

Python has several modules for audio processing, covering everything from basic playback and recording to analysis and speech recognition. Here are some of the most common ones and their features:

1. pyaudio

Purpose:

pyaudio provides Python bindings for PortAudio, a cross-platform audio I/O library. It is commonly used for recording and playing back audio streams.

Key Features:

  • Supports audio input and output streams.
  • Provides control over sample rate, channels, and buffer size.
  • Works with raw PCM sample formats such as 16-bit integer (paInt16) and 32-bit float (paFloat32).

Common Use Cases:

  • Voice recording.
  • Real-time audio processing.
  • Basic playback of WAV files.

Example:

import pyaudio

p = pyaudio.PyAudio()

# Open a stream
stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=44100,
                input=True,
                frames_per_buffer=1024)

print("Recording...")
frames = []

for _ in range(0, int(44100 / 1024 * 5)):  # 5 seconds of recording
    data = stream.read(1024)
    frames.append(data)

print("Done recording.")

stream.stop_stream()
stream.close()
p.terminate()
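
The loop bound in the recording example follows from the stream parameters: each read returns frames_per_buffer frames, so the number of reads is rate / frames_per_buffer * seconds, rounded down. A minimal sketch of that arithmetic, using only the standard library. The helper names chunks_needed and bytes_per_chunk are illustrative and are not part of pyaudio:

```python
def chunks_needed(rate, frames_per_buffer, seconds):
    """Number of full buffers to read to cover `seconds` of audio (rounded down)."""
    return int(rate / frames_per_buffer * seconds)


def bytes_per_chunk(frames_per_buffer, channels, sample_width):
    """Size in bytes of one buffer of interleaved PCM audio."""
    return frames_per_buffer * channels * sample_width


# Same parameters as the recording example: 44.1 kHz, mono, 16-bit, 1024 frames.
n_chunks = chunks_needed(44100, 1024, 5)
chunk_bytes = bytes_per_chunk(1024, channels=1, sample_width=2)

# Because the chunk count is rounded down, slightly less than 5 s is captured.
captured_seconds = n_chunks * 1024 / 44100
print(n_chunks, chunk_bytes, round(captured_seconds, 3))
```

Smaller buffers lower latency but require more reads per second, which is the trade-off behind the frames_per_buffer setting.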

2. wave

Purpose:

wave is part of Python’s standard library and reads and writes uncompressed WAV audio files.

Key Features:

  • Supports PCM (Pulse Code Modulation) audio data.
  • Exposes metadata such as the number of channels, sample width, and frame rate.

Common Use Cases:

  • Saving audio data recorded using pyaudio.
  • Reading WAV files for simple analysis or playback.

Example:

import wave

# Writing a WAV file
with wave.open('output.wav', 'wb') as wf:
    wf.setnchannels(1)        # Mono
    wf.setsampwidth(2)        # 16-bit
    wf.setframerate(44100)    # Sampling rate
    wf.writeframes(b''.join(frames))  # Frames from PyAudio
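
Reading works the same way in reverse. The sketch below writes a short synthetic tone so that it runs without a microphone, then reopens the file to read its metadata and frames. The file name tone_example.wav and the tone parameters are assumptions chosen for the demo:

```python
import math
import struct
import wave

RATE = 8000      # samples per second (kept low so the file stays small)
DURATION = 0.5   # seconds
FREQ = 440.0     # tone frequency in Hz

# Build 16-bit mono PCM samples for a sine tone at half of full scale.
n_samples = int(RATE * DURATION)
samples = [int(32767 * 0.5 * math.sin(2 * math.pi * FREQ * i / RATE))
           for i in range(n_samples)]
pcm = struct.pack(f"<{n_samples}h", *samples)  # little-endian signed shorts

with wave.open("tone_example.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(RATE)
    wf.writeframes(pcm)

# Reopen the file and read back its metadata and raw frames.
with wave.open("tone_example.wav", "rb") as wf:
    channels = wf.getnchannels()
    sample_width = wf.getsampwidth()
    frame_rate = wf.getframerate()
    n_frames = wf.getnframes()
    raw = wf.readframes(n_frames)

duration = n_frames / frame_rate
print(channels, sample_width, frame_rate, n_frames, duration)
```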

3. soundfile

Purpose:

soundfile reads and writes audio files in a variety of formats, such as WAV and FLAC. It is built on the libsndfile C library and represents audio as NumPy arrays.

Key Features:

  • Reads and writes audio data as NumPy arrays.
  • Supports 24-bit as well as floating-point audio data.
  • Supports the file formats provided by libsndfile, including WAV and FLAC.

Common Use Cases:

  • Reading and writing files in formats that wave does not handle, such as FLAC.

Example:

import soundfile as sf

# Reading a file
data, samplerate = sf.read('input.wav')

# Writing a file
sf.write('output.wav', data, samplerate)

4. librosa

Purpose:

librosa is a popular library for music and audio analysis.

Key Features:

  • Spectral processing (e.g., Fourier Transforms).
  • Pitch, tempo, and beat detection.
  • Feature extraction (e.g., Mel-frequency cepstral coefficients – MFCCs).

Common Use Cases:

  • Building music recommendation systems.
  • Analyzing musical features of audio.

Example:

import librosa

# Load audio (by default librosa resamples to 22,050 Hz and mixes down to mono)
audio_data, sr = librosa.load('audio.mp3')

# Extract features
tempo, beats = librosa.beat.beat_track(y=audio_data, sr=sr)
print(f"Tempo: {tempo}")
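
To illustrate the Fourier-transform idea behind spectral processing, here is a minimal sketch that finds the dominant frequency of a signal. It uses plain NumPy rather than librosa, and a synthetic 440 Hz tone stands in for loaded audio. The sample rate and tone are assumptions made for the demo. librosa builds higher-level tools, such as librosa.stft, on the same principle:

```python
import numpy as np

sr = 8000                    # sample rate in Hz (assumed for the demo)
t = np.arange(sr) / sr       # one second of sample times
signal = 0.8 * np.sin(2 * np.pi * 440.0 * t)  # synthetic 440 Hz tone

# Magnitude spectrum of the real-valued signal.
spectrum = np.abs(np.fft.rfft(signal))

# Frequency in Hz corresponding to each FFT bin.
freqs = np.fft.rfftfreq(len(signal), d=1 / sr)

# The bin with the largest magnitude marks the dominant frequency.
dominant = freqs[np.argmax(spectrum)]
print(f"Dominant frequency: {dominant:.1f} Hz")
```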

5. pygame

Purpose:

pygame is primarily a game development library, but its mixer module supports basic audio playback.

Key Features:

  • Simple WAV and MP3 file playback.
  • Looping and volume control.

Common Use Cases:

  • Game sound effects.
  • Background game music.

Example:

import pygame

# Initialize pygame mixer
pygame.mixer.init()

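# Load and play audio
pygame.mixer.music.load("background.mp3")
pygame.mixer.music.play()

# Playback runs in the background, so keep the script alive until it finishes
while pygame.mixer.music.get_busy():
    pygame.time.wait(100)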

6. audiorecorder

Purpose:

audiorecorder is a Python package for recording audio easily.

Key Features:

  • Minimal setup for recording audio.
  • Simple and intuitive API.

Example:

from audiorecorder import AudioRecorder

recorder = AudioRecorder(channels=1, rate=44100)
recorder.start()  # Start recording
input("Press Enter to stop recording...")
recorder.stop()
recorder.save("recording.wav")

7. scipy

Purpose:

scipy.io.wavfile is a module in the scipy library for reading and writing WAV files.

Key Features:

  • A simple interface for handling WAV files.
  • Returns audio data as NumPy arrays, ready for analysis.

Common Use Cases:

  • Processing audio signals with mathematical operations.

Example:

from scipy.io import wavfile

# Read a WAV file
rate, data = wavfile.read('input.wav')

# Write a WAV file
wavfile.write('output.wav', rate, data)
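
Because the samples arrive as a NumPy array, mathematical operations apply to them directly. The sketch below scales 16-bit samples to floats and peak-normalizes them. A small hand-made array stands in for the data array that wavfile.read returns for a 16-bit file, and the helper names to_float and peak_normalize are illustrative, not part of scipy:

```python
import numpy as np


def to_float(data):
    """Scale 16-bit integer samples to floats in the range [-1.0, 1.0)."""
    return data.astype(np.float32) / 32768.0


def peak_normalize(x, peak=0.9):
    """Rescale so the largest absolute sample equals `peak`."""
    max_abs = np.max(np.abs(x))
    # Leave silent input untouched to avoid dividing by zero.
    return x if max_abs == 0 else x * (peak / max_abs)


# Stand-in for the `data` array read from a 16-bit WAV file.
data = np.array([0, 8192, -16384, 4096], dtype=np.int16)
normalized = peak_normalize(to_float(data))
print(normalized)
```

Converting to float before processing avoids integer overflow when gains or sums push values beyond the 16-bit range.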

8. speech_recognition

Purpose:

speech_recognition (distributed on PyPI as SpeechRecognition) recognizes speech in audio and converts it to text.

Key Features:

  • Works with the Google Speech Recognition API and other engines.
  • Provides a very simple API for speech recognition.

Common Use Cases:

  • Building voice assistants.
  • Speech-to-text applications.

Example:

import speech_recognition as sr

recognizer = sr.Recognizer()

with sr.AudioFile('speech.wav') as source:
    audio = recognizer.record(source)

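try:
    text = recognizer.recognize_google(audio)
    print(f"Recognized Text: {text}")
except sr.UnknownValueError:
    # The engine could not make out any speech in the audio
    print("Speech was unintelligible.")
except sr.RequestError as e:
    # Network failure or the service rejected the request
    print(f"Could not reach the recognition service: {e}")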

Comparison of Modules:

Module              Key Use Case           Formats Supported
pyaudio             Recording & Playback   Real-time audio streams
wave                WAV file I/O           PCM WAV
soundfile           File I/O               WAV, FLAC, etc.
librosa             Audio Analysis         WAV, MP3
pygame              Game Audio             WAV, MP3
audiorecorder       Simple Recording       WAV
scipy.io.wavfile    Basic WAV Processing   PCM WAV
speech_recognition  Speech-to-text         WAV (via recognizers)