liuq 95d491e160 更新 hai 3 meses
..
linux 95d491e160 更新 hai 3 meses
macos 95d491e160 更新 hai 3 meses
windows 95d491e160 更新 hai 3 meses
README.md 95d491e160 更新 hai 3 meses
__init__.py 95d491e160 更新 hai 3 meses

README.md

WebRTC Audio Processing Python Wrapper

Python ctypes bindings for WebRTC Audio Processing Module, providing echo cancellation, noise suppression, automatic gain control, and other audio processing features.

Features

  • Echo Cancellation (AEC): Removes acoustic echo from audio signals
  • Noise Suppression: Reduces background noise
  • Automatic Gain Control (AGC): Maintains consistent audio levels
  • High-pass Filter: Removes low-frequency noise
  • Transient Suppression: Reduces keyboard clicks and other transient sounds
  • Platform Support: Linux, macOS, and Windows with automatic library loading

Installation

The wrapper automatically detects your platform and loads the appropriate native library:

  • Linux: linux/{arch}/libwebrtc_apm.so
  • macOS: macos/{arch}/libwebrtc_apm.dylib
  • Windows: windows/{arch}/libwebrtc_apm.dll

Supported architectures: x64, arm64, x86

Quick Start

from webrtc_apm import WebRTCAudioProcessing, create_default_config
import ctypes
import numpy as np

# Initialize audio processing
apm = WebRTCAudioProcessing()

# Configure with echo cancellation and noise suppression
config = create_default_config()
config.echo.enabled = True
config.noise_suppress.enabled = True
config.high_pass.enabled = True

# Apply configuration
apm.apply_config(config)

# Create stream configurations for 16kHz mono audio
sample_rate = 16000
num_channels = 1
capture_config = apm.create_stream_config(sample_rate, num_channels)
render_config = apm.create_stream_config(sample_rate, num_channels)

# Set echo delay
apm.set_stream_delay_ms(50)

# Process audio frames
frame_size = 160  # 10ms at 16kHz
capture_audio = np.random.randint(-1000, 1000, frame_size, dtype=np.int16)
render_audio = np.random.randint(-500, 500, frame_size, dtype=np.int16)

# Convert to ctypes arrays
capture_buffer = (ctypes.c_short * frame_size)(*capture_audio)
render_buffer = (ctypes.c_short * frame_size)(*render_audio)
processed_capture = (ctypes.c_short * frame_size)()
processed_render = (ctypes.c_short * frame_size)()

# Process render stream (echo reference)
apm.process_reverse_stream(render_buffer, render_config, render_config, processed_render)

# Process capture stream (apply processing)
apm.process_stream(capture_buffer, capture_config, capture_config, processed_capture)

# Clean up
apm.destroy_stream_config(capture_config)
apm.destroy_stream_config(render_config)

Configuration Options

Echo Cancellation

config.echo.enabled = True
config.echo.mobile_mode = False  # Use full AEC (not mobile version)
config.echo.enforce_high_pass_filtering = True

Noise Suppression

config.noise_suppress.enabled = True
config.noise_suppress.noise_level = 2  # 0=Low, 1=Moderate, 2=High, 3=VeryHigh

Automatic Gain Control (AGC1)

config.gain_control1.enabled = True
config.gain_control1.controller_mode = 0  # 0=AdaptiveAnalog, 1=AdaptiveDigital, 2=FixedDigital
config.gain_control1.target_level_dbfs = 3
config.gain_control1.compression_gain_db = 9
config.gain_control1.enable_limiter = True

High-pass Filter

config.high_pass.enabled = True
config.high_pass.apply_in_full_band = True

API Reference

Classes

WebRTCAudioProcessing

Main audio processing class.

Methods:

  • create_stream_config(sample_rate, num_channels) - Create stream configuration
  • destroy_stream_config(config_handle) - Destroy stream configuration
  • apply_config(config) - Apply processing configuration
  • process_stream(src, src_config, dest_config, dest) - Process capture audio
  • process_reverse_stream(src, src_config, dest_config, dest) - Process render audio
  • set_stream_delay_ms(delay_ms) - Set echo delay in milliseconds

Config

Configuration structure with all processing options.

Functions

create_default_config()

Returns a Config object with sensible default values.

Enums

  • DownmixMethod: How to convert multi-channel to mono
  • NoiseSuppressionLevel: Noise suppression intensity
  • GainController1Mode: AGC operating mode
  • ClippingPredictorMode: Clipping prediction algorithm

Examples

See example.py for comprehensive usage examples including:

  • Basic echo cancellation and noise suppression
  • AGC configuration
  • Full processing pipeline setup
  • Audio frame processing loop

Audio Format Requirements

  • Sample Rate: 8000, 16000, 32000, or 48000 Hz
  • Channels: 1 (mono) or 2 (stereo)
  • Sample Format: 16-bit signed integer (int16)
  • Frame Size: 10ms worth of samples (e.g., 160 samples at 16kHz)

Error Handling

All processing functions return status codes:

  • 0: Success
  • Non-zero: Error occurred
result = apm.process_stream(src, src_config, dest_config, dest)
if result != 0:
    print(f"Processing failed with code: {result}")

Performance Notes

  • Process audio in 10ms frames for optimal performance
  • Reuse stream configurations when possible
  • Call process_reverse_stream() before process_stream() for best echo cancellation
  • Set appropriate stream delay based on your audio system latency

Platform-Specific Notes

Linux

  • Requires the shared library libwebrtc_apm.so
  • Works on x64 and ARM64 architectures

macOS

  • Uses dynamic library libwebrtc_apm.dylib
  • Supports both Intel (x64) and Apple Silicon (arm64)

Windows

  • Requires libwebrtc_apm.dll
  • Supports x64 and x86 architectures
  • May need Visual C++ Redistributable

Troubleshooting

Library not found: Ensure the native library is in the correct platform subdirectory.

Processing errors: Check that audio format matches stream configuration (sample rate, channels).

Poor echo cancellation: Verify render stream is processed before capture stream and stream delay is set correctly.

High CPU usage: Use 10ms frames and appropriate sample rates (16kHz recommended for voice).