Python ctypes bindings for WebRTC Audio Processing Module, providing echo cancellation, noise suppression, automatic gain control, and other audio processing features.
The wrapper automatically detects your platform and loads the appropriate native library:
- `linux/{arch}/libwebrtc_apm.so`
- `macos/{arch}/libwebrtc_apm.dylib`
- `windows/{arch}/libwebrtc_apm.dll`

Supported architectures: `x64`, `arm64`, `x86`.
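The detection logic can be pictured roughly as follows. This is a sketch mirroring the documented directory layout, not the wrapper's actual code (which lives in `__init__.py` and may differ); the helper name and architecture aliases are our own:

```python
import platform

# Common platform.machine() values mapped to the documented arch folders.
_ARCH_ALIASES = {"x86_64": "x64", "amd64": "x64",
                 "aarch64": "arm64", "arm64": "arm64",
                 "i386": "x86", "i686": "x86"}

# platform.system() -> (subdirectory, library filename), per the layout above.
_PLATFORMS = {"linux": ("linux", "libwebrtc_apm.so"),
              "darwin": ("macos", "libwebrtc_apm.dylib"),
              "windows": ("windows", "libwebrtc_apm.dll")}

def native_library_path(system=None, machine=None):
    """Relative path of the native library for the given (or current) platform."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    arch = _ARCH_ALIASES.get(machine, machine)
    folder, filename = _PLATFORMS[system]
    return f"{folder}/{arch}/{filename}"
```

The resulting path would then be handed to `ctypes.CDLL` to load the library.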
```python
from webrtc_apm import WebRTCAudioProcessing, create_default_config
import ctypes
import numpy as np

# Initialize audio processing
apm = WebRTCAudioProcessing()

# Configure with echo cancellation and noise suppression
config = create_default_config()
config.echo.enabled = True
config.noise_suppress.enabled = True
config.high_pass.enabled = True

# Apply configuration
apm.apply_config(config)

# Create stream configurations for 16 kHz mono audio
sample_rate = 16000
num_channels = 1
capture_config = apm.create_stream_config(sample_rate, num_channels)
render_config = apm.create_stream_config(sample_rate, num_channels)

# Set echo delay
apm.set_stream_delay_ms(50)

# Process audio frames
frame_size = 160  # 10 ms at 16 kHz
capture_audio = np.random.randint(-1000, 1000, frame_size, dtype=np.int16)
render_audio = np.random.randint(-500, 500, frame_size, dtype=np.int16)

# Convert to ctypes arrays
capture_buffer = (ctypes.c_short * frame_size)(*capture_audio)
render_buffer = (ctypes.c_short * frame_size)(*render_audio)
processed_capture = (ctypes.c_short * frame_size)()
processed_render = (ctypes.c_short * frame_size)()

# Process render stream (echo reference)
apm.process_reverse_stream(render_buffer, render_config, render_config, processed_render)

# Process capture stream (apply processing)
apm.process_stream(capture_buffer, capture_config, capture_config, processed_capture)

# Clean up
apm.destroy_stream_config(capture_config)
apm.destroy_stream_config(render_config)
```
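After processing, the filled ctypes buffers can be converted back to NumPy arrays for playback, queuing, or file output. A small sketch, independent of the wrapper itself:

```python
import ctypes
import numpy as np

frame_size = 160
# Stand-in for a processed buffer; in practice this comes from process_stream()
processed = (ctypes.c_short * frame_size)(*range(frame_size))

# Zero-copy view backed by the ctypes buffer (only valid while it is alive)
view = np.ctypeslib.as_array(processed)

# Independent copy, safe to keep after the buffer is reused
copy = np.frombuffer(bytes(processed), dtype=np.int16).copy()
```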
```python
# Echo cancellation (AEC)
config.echo.enabled = True
config.echo.mobile_mode = False  # Use full AEC (not the mobile version)
config.echo.enforce_high_pass_filtering = True

# Noise suppression
config.noise_suppress.enabled = True
config.noise_suppress.noise_level = 2  # 0=Low, 1=Moderate, 2=High, 3=VeryHigh

# Automatic gain control (AGC1)
config.gain_control1.enabled = True
config.gain_control1.controller_mode = 0  # 0=AdaptiveAnalog, 1=AdaptiveDigital, 2=FixedDigital
config.gain_control1.target_level_dbfs = 3
config.gain_control1.compression_gain_db = 9
config.gain_control1.enable_limiter = True

# High-pass filter
config.high_pass.enabled = True
config.high_pass.apply_in_full_band = True
```
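As a point of reference, `target_level_dbfs` follows the WebRTC AGC convention of dB *below* full scale, so `3` targets a level about 3 dB under the maximum. A quick sketch of what that means for 16-bit samples (the helper is ours, not part of the wrapper's API):

```python
def dbfs_to_amplitude(level_dbfs, full_scale=32767):
    """Peak int16 amplitude corresponding to level_dbfs dB below full scale."""
    return round(full_scale * 10 ** (-level_dbfs / 20))
```

So `target_level_dbfs = 3` corresponds to a peak of roughly 23,200 out of 32,767.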
`WebRTCAudioProcessing`: Main audio processing class.

Methods:

- `create_stream_config(sample_rate, num_channels)` - Create a stream configuration
- `destroy_stream_config(config_handle)` - Destroy a stream configuration
- `apply_config(config)` - Apply a processing configuration
- `process_stream(src, src_config, dest_config, dest)` - Process capture audio
- `process_reverse_stream(src, src_config, dest_config, dest)` - Process render audio
- `set_stream_delay_ms(delay_ms)` - Set the echo delay in milliseconds

`Config`: Configuration structure with all processing options.

`create_default_config()`: Returns a `Config` object with sensible default values.

Enums:

- `DownmixMethod`: How to convert multi-channel audio to mono
- `NoiseSuppressionLevel`: Noise suppression intensity
- `GainController1Mode`: AGC operating mode
- `ClippingPredictorMode`: Clipping prediction algorithm

See `example.py` for comprehensive usage examples.
All processing functions return status codes:

- `0`: Success
- Non-zero: an error occurred

```python
result = apm.process_stream(src, src_config, dest_config, dest)
if result != 0:
    print(f"Processing failed with code: {result}")
```
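For larger applications it can be convenient to turn these status codes into exceptions instead of checking each call by hand. A sketch; `APMError` and `check` are our own names, not part of the wrapper:

```python
class APMError(RuntimeError):
    """Raised when a WebRTC APM call returns a non-zero status code."""
    def __init__(self, code):
        super().__init__(f"WebRTC APM call failed with code {code}")
        self.code = code

def check(result):
    """Pass through on success (0); raise APMError otherwise."""
    if result != 0:
        raise APMError(result)
    return result
```

Usage would then look like `check(apm.process_stream(src, src_config, dest_config, dest))`.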
Call `process_reverse_stream()` before `process_stream()` for best echo cancellation.

- Library not found: Ensure the native library (`libwebrtc_apm.so`, `libwebrtc_apm.dylib`, or `libwebrtc_apm.dll`) is in the correct platform subdirectory.
- Processing errors: Check that the audio format matches the stream configuration (sample rate, channels).
- Poor echo cancellation: Verify the render stream is processed before the capture stream and that the stream delay is set correctly.
- High CPU usage: Use 10 ms frames and appropriate sample rates (16 kHz is recommended for voice).
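The 10 ms frame-size rule generalizes across sample rates; a tiny helper (ours, not part of the wrapper) makes the arithmetic explicit:

```python
def frame_size_10ms(sample_rate_hz):
    """Samples per 10 ms frame at the given sample rate."""
    return sample_rate_hz // 100
```

At 16 kHz this gives the 160-sample frames used in the quick-start example; at 48 kHz it gives 480.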