liuq 95d491e160 更新		4 tháng trước cách đây
..
linux	95d491e160 更新	4 tháng trước cách đây
macos	95d491e160 更新	4 tháng trước cách đây
windows	95d491e160 更新	4 tháng trước cách đây
README.md	95d491e160 更新	4 tháng trước cách đây
__init__.py	95d491e160 更新	4 tháng trước cách đây

WebRTC Audio Processing Python Wrapper

Python ctypes bindings for WebRTC Audio Processing Module, providing echo cancellation, noise suppression, automatic gain control, and other audio processing features.

Features

Echo Cancellation (AEC): Removes acoustic echo from audio signals
Noise Suppression: Reduces background noise
Automatic Gain Control (AGC): Maintains consistent audio levels
High-pass Filter: Removes low-frequency noise
Transient Suppression: Reduces keyboard clicks and other transient sounds
Platform Support: Linux, macOS, and Windows with automatic library loading

Installation

The wrapper automatically detects your platform and loads the appropriate native library:

Linux: linux/{arch}/libwebrtc_apm.so
macOS: macos/{arch}/libwebrtc_apm.dylib
Windows: windows/{arch}/libwebrtc_apm.dll

Supported architectures: x64, arm64, x86

Quick Start

from webrtc_apm import WebRTCAudioProcessing, create_default_config
import ctypes
import numpy as np

# Initialize audio processing
apm = WebRTCAudioProcessing()

# Configure with echo cancellation and noise suppression
config = create_default_config()
config.echo.enabled = True
config.noise_suppress.enabled = True
config.high_pass.enabled = True

# Apply configuration
apm.apply_config(config)

# Create stream configurations for 16kHz mono audio
sample_rate = 16000
num_channels = 1
capture_config = apm.create_stream_config(sample_rate, num_channels)
render_config = apm.create_stream_config(sample_rate, num_channels)

# Set echo delay
apm.set_stream_delay_ms(50)

# Process audio frames
frame_size = 160  # 10ms at 16kHz
capture_audio = np.random.randint(-1000, 1000, frame_size, dtype=np.int16)
render_audio = np.random.randint(-500, 500, frame_size, dtype=np.int16)

# Convert to ctypes arrays
capture_buffer = (ctypes.c_short * frame_size)(*capture_audio)
render_buffer = (ctypes.c_short * frame_size)(*render_audio)
processed_capture = (ctypes.c_short * frame_size)()
processed_render = (ctypes.c_short * frame_size)()

# Process render stream (echo reference)
apm.process_reverse_stream(render_buffer, render_config, render_config, processed_render)

# Process capture stream (apply processing)
apm.process_stream(capture_buffer, capture_config, capture_config, processed_capture)

# Clean up
apm.destroy_stream_config(capture_config)
apm.destroy_stream_config(render_config)

Configuration Options

Echo Cancellation

config.echo.enabled = True
config.echo.mobile_mode = False  # Use full AEC (not mobile version)
config.echo.enforce_high_pass_filtering = True

Noise Suppression

config.noise_suppress.enabled = True
config.noise_suppress.noise_level = 2  # 0=Low, 1=Moderate, 2=High, 3=VeryHigh

Automatic Gain Control (AGC1)

config.gain_control1.enabled = True
config.gain_control1.controller_mode = 0  # 0=AdaptiveAnalog, 1=AdaptiveDigital, 2=FixedDigital
config.gain_control1.target_level_dbfs = 3
config.gain_control1.compression_gain_db = 9
config.gain_control1.enable_limiter = True

High-pass Filter

config.high_pass.enabled = True
config.high_pass.apply_in_full_band = True

API Reference

Classes

`WebRTCAudioProcessing`

Main audio processing class.

Methods:

create_stream_config(sample_rate, num_channels) - Create stream configuration
destroy_stream_config(config_handle) - Destroy stream configuration
apply_config(config) - Apply processing configuration
process_stream(src, src_config, dest_config, dest) - Process capture audio
process_reverse_stream(src, src_config, dest_config, dest) - Process render audio
set_stream_delay_ms(delay_ms) - Set echo delay in milliseconds

`Config`

Configuration structure with all processing options.

Functions

`create_default_config()`

Returns a Config object with sensible default values.

Enums

DownmixMethod: How to convert multi-channel to mono
NoiseSuppressionLevel: Noise suppression intensity
GainController1Mode: AGC operating mode
ClippingPredictorMode: Clipping prediction algorithm

Examples

See example.py for comprehensive usage examples including:

Basic echo cancellation and noise suppression
AGC configuration
Full processing pipeline setup
Audio frame processing loop

Audio Format Requirements

Sample Rate: 8000, 16000, 32000, or 48000 Hz
Channels: 1 (mono) or 2 (stereo)
Sample Format: 16-bit signed integer (int16)
Frame Size: 10ms worth of samples (e.g., 160 samples at 16kHz)

Error Handling

All processing functions return status codes:

0: Success
Non-zero: Error occurred

result = apm.process_stream(src, src_config, dest_config, dest)
if result != 0:
    print(f"Processing failed with code: {result}")

Performance Notes

Process audio in 10ms frames for optimal performance
Reuse stream configurations when possible
Call process_reverse_stream() before process_stream() for best echo cancellation
Set appropriate stream delay based on your audio system latency

Platform-Specific Notes

Linux

Requires the shared library libwebrtc_apm.so
Works on x64 and ARM64 architectures

macOS

Uses dynamic library libwebrtc_apm.dylib
Supports both Intel (x64) and Apple Silicon (arm64)

Windows

Requires libwebrtc_apm.dll
Supports x64 and x86 architectures
May need Visual C++ Redistributable

Troubleshooting

Library not found: Ensure the native library is in the correct platform subdirectory.

Processing errors: Check that audio format matches stream configuration (sample rate, channels).

Poor echo cancellation: Verify render stream is processed before capture stream and stream delay is set correctly.

High CPU usage: Use 10ms frames and appropriate sample rates (16kHz recommended for voice).

README.md