| author | Roxanne Skelly <roxie@lindenlab.com> | 2025-09-12 17:07:51 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-09-12 20:07:51 -0400 |
| commit | a6d4c1d394eef2cea41f6c6bcd751fec746ec17d | |
| tree | 23f87bb894d5d6113acf412d8a707d9cb4cf0562 /indra/newview/llvoicewebrtc.cpp | |
| parent | 42695904d600c3a51893e5e5c718165956086591 | |
[WebRTC] Rework device handling sequence so that we can handle unplugging/re-plugging devices (#4593)
* [WebRTC] Rework device handling sequence so that we can handle unplugging/re-plugging devices
The device handling was not processing device updates in the proper sequence: features like AEC (echo cancellation)
use both input and output devices, and devices like headsets are both, so unplugging one
resulted in various mute conditions and sometimes even a crash.
Now, we update both capture and render devices at once, in the proper sequence (a rough sketch follows the test guidance below).
Test Guidance:
* Bring two users to the same place in a webrtc-enabled region.
* The 'listening' one should have a headset or similar device set as 'Default'.
* Press 'talk' on one, and verify the other can hear.
* Unplug the headset from the listening one.
* Validate that audio changes from the headset to the speakers.
* Plug the headset back in.
* Validate that audio changes from speakers to headset.
* Do the same type of test with the headset user talking.
* The microphone in use should switch from the headset to the computer's microphone (assuming the computer has one).
Do various other device tests, such as setting devices explicitly, exercising the device selector, etc.
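
A rough sketch of the reworked sequencing (hypothetical names; the real logic is in llvoicewebrtc.cpp and the llwebrtc device interface). The point is that capture and render devices are applied together in one pass, so processing that spans both, such as AEC, never sees a half-updated pair:

```cpp
#include <iostream>
#include <string>

// Hypothetical stand-ins for the real device plumbing in llwebrtc.
static void stopAudio()    { std::cout << "audio stopped\n"; }
static void restartAudio() { std::cout << "audio restarted\n"; }
static void setCaptureDeviceInternal(const std::string& id) { std::cout << "capture -> " << id << '\n'; }
static void setRenderDeviceInternal(const std::string& id)  { std::cout << "render  -> " << id << '\n'; }

// Apply capture and render devices together so that processing which uses
// both (AEC, for example) never runs against a half-updated device pair.
void applyDevices(const std::string& capture, const std::string& render)
{
    stopAudio();                        // quiesce the stream before touching devices
    setCaptureDeviceInternal(capture);  // update input ...
    setRenderDeviceInternal(render);    // ... and output in the same pass
    restartAudio();                     // processing restarts seeing both devices
}

int main()
{
    // e.g. a headset (both input and output) was unplugged; fall back to defaults.
    applyDevices("Default", "Default");
}
```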
* Fix race condition when multiple change device requests might come in at once
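
A minimal sketch of one way to close such a race (illustrative only, not the viewer's actual code): let the in-flight deployment absorb requests that arrive while it runs.

```cpp
#include <mutex>

// Illustrative request coalescing: if a device deploy is already running,
// fold the new request into it instead of starting a second, racing one.
std::mutex gDeployMutex;
bool gDeployRunning = false;
bool gDeployPending = false;

void requestDeviceDeploy()
{
    {
        std::lock_guard<std::mutex> lock(gDeployMutex);
        if (gDeployRunning)
        {
            gDeployPending = true;  // coalesce with the deploy in flight
            return;
        }
        gDeployRunning = true;
    }

    bool again = true;
    while (again)
    {
        // ... perform the actual device update here, without the lock held ...

        std::lock_guard<std::mutex> lock(gDeployMutex);
        again          = gDeployPending;  // another request arrived meanwhile?
        gDeployPending = false;
        gDeployRunning = again;
    }
}

int main() { requestDeviceDeploy(); }
```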
* Update to m137
The primary feature of this commit is updating libwebrtc from m114
to m137. This is needed to keep webrtc buildable, as m114 no longer builds
with the current toolset.
m137 made some changes to the API, which required renaming some calls or changing their namespace.
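
One well-known example of this kind of churn (illustrative only; the viewer's actual renames live in the llwebrtc module) is libwebrtc's migration of rtc:: types into the webrtc:: namespace:

```cpp
// Illustrative m114 -> m137 call-site change; exact renames vary by API.
// m114-era spelling:
//     rtc::scoped_refptr<webrtc::AudioProcessing> apm;
// m137-era spelling, after rtc:: types were folded into webrtc:::
//     webrtc::scoped_refptr<webrtc::AudioProcessing> apm;
```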
Additionally, this PR moves from a callback mechanism for gathering the energy
levels for tuning to a wrapper AudioDeviceModule, which gives us more control
over the audio stream.
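
A compact sketch of what the wrapper buys (hypothetical names, not the actual llwebrtc classes): it sits in the capture path, derives an energy level from each frame for the tuning meter, and passes the frame through to the inner module.

```cpp
#include <atomic>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper in the capture path: measure each frame's energy for
// the tuning meter, then hand the frame to the inner audio device module.
class EnergyTappingCapture
{
public:
    // Called with each 10 ms capture frame before it reaches the inner module.
    void onCaptureFrame(const int16_t* samples, size_t count)
    {
        double sumSquares = 0.0;
        for (size_t i = 0; i < count; ++i)
        {
            double s = samples[i] / 32768.0;  // normalize to [-1, 1)
            sumSquares += s * s;
        }
        float rms = count ? static_cast<float>(std::sqrt(sumSquares / count)) : 0.0f;
        mLevel    = 0.7f * mLevel.load() + 0.3f * rms;  // smooth so the meter doesn't flicker

        // ... forward `samples` to the wrapped AudioDeviceModule here ...
    }

    float tuningAudioLevel() const { return mLevel; }  // polled by the tuning UI

private:
    std::atomic<float> mLevel{0.0f};
};
```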
Finally, the new m137-based webrtc build has been updated to allow 192 kHz audio
streams.
* Properly pass the observer setting into the inner audio device module
* Update to m137 and get rid of some noise
This change updates to m137 from m114, which required a few API changes.
Additionally, this fixes the hiss that happens shortly after someone unmutes: https://github.com/secondlife/server/issues/2094
There was also an issue with a slight amount of repeated audio after unmuting if there was audio right before unmuting. This is because
the audio processing and buffering still held audio from the previous speaking session. Now, we inject nearly a half second of
silence into the audio buffers/processor after unmuting to flush things.
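
A rough sketch of that flush, using the timing from the new MUTE_FADE_DELAY_MS constant in the diff below (a 20 ms fade, then 480 ms of silence); the function name and framing are hypothetical:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

constexpr uint32_t kFadeMs    = 20;   // ramp the first frames down to zero
constexpr uint32_t kSilenceMs = 480;  // then hold pure silence to flush buffers

// Called per 10 ms capture frame while the post-unmute flush is active.
// Returns true while still flushing.
bool applyUnmuteFlush(int16_t* samples, size_t count, uint32_t& elapsedMs)
{
    if (elapsedMs >= kFadeMs + kSilenceMs)
        return false;  // flush complete; pass audio through normally

    if (elapsedMs < kFadeMs)
    {
        // Linear fade toward zero across the fade window.
        float gain = 1.0f - static_cast<float>(elapsedMs) / kFadeMs;
        for (size_t i = 0; i < count; ++i)
            samples[i] = static_cast<int16_t>(samples[i] * gain);
    }
    else
    {
        std::fill(samples, samples + count, int16_t{0});  // inject silence
    }
    elapsedMs += 10;  // one 10 ms frame consumed
    return true;
}
```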
* Install NSIS on Windows
* Use the newer digital AGC pipeline
m137 improved the AGC pipeline, and the existing analog-style control is going away,
so move to the new digital pipeline.
Also, some tweaking of the audio levels so that we don't see in-world voice bars while tuning,
one's own bars appear a reasonable size, and so on.
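
For reference, switching pipelines looks roughly like this against libwebrtc's public AudioProcessing config (assumes libwebrtc headers; the viewer drives this through its llwebrtc wrapper rather than calling it directly):

```cpp
#include "modules/audio_processing/include/audio_processing.h"

// Disable the legacy analog-style AGC and enable the digital pipeline (AGC2).
void enableDigitalAgc(webrtc::AudioProcessing* apm)
{
    webrtc::AudioProcessing::Config cfg = apm->GetConfig();
    cfg.gain_controller1.enabled = false;  // analog-style AGC, being retired upstream
    cfg.gain_controller2.enabled = true;   // newer digital AGC
    cfg.gain_controller2.adaptive_digital.enabled = true;
    apm->ApplyConfig(cfg);
}
```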
* Install NSIS during the Windows signing and package build step
* Try pinning the packaging job to Windows 2022 to deal with missing NSIS
* Adjust gain calculation and audio level calculations for tuning and peer connections
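
The adjustment boils down to a linear map from the level reported by the device interface to the displayed energy; the constants below are taken from the diff at the bottom of this page:

```cpp
// Constants introduced by this commit (see the diff below).
const float TUNING_LEVEL_SCALE       = 0.01f;
const float TUNING_LEVEL_START_POINT = 0.8f;
const float LEVEL_SCALE              = 0.005f;
const float LEVEL_START_POINT        = 0.18f;

// Tuning-mode meter (tuningGetEnergy) and in-world self meter (updateOwnVolume).
float tuningEnergy(float level)  { return TUNING_LEVEL_START_POINT - TUNING_LEVEL_SCALE * level; }
float inworldEnergy(float level) { return LEVEL_START_POINT - LEVEL_SCALE * level; }
```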
* Update with mac universal webrtc build
* Tune the voice indicators for one's own avatar, both in tuning mode and in-world.
* Redo device deployment to handle cases where multiple deploy requests pile up
Also, mute when leaving webrtc-enabled regions or parcels,
and unmute when voice comes back.
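
Part of how piled-up requests are smoothed is visible in the diff below: setMute takes a delay (SET_HIDDEN_RESTORE_DELAY_MS) so rapid mute/unmute pairs around teleports collapse into one change. A hypothetical sketch of that debounce:

```cpp
#include <chrono>

// Hypothetical debounce: mute takes effect immediately, but unmute is delayed
// slightly so that a rapid mute/unmute burst collapses into a single change.
struct DelayedUnmute
{
    std::chrono::steady_clock::time_point unmuteAt{};
    bool muted = false;

    void request(bool mute, std::chrono::milliseconds unmuteDelay)
    {
        if (mute)
        {
            muted    = true;  // apply right away
            unmuteAt = {};    // cancel any pending unmute
        }
        else
        {
            unmuteAt = std::chrono::steady_clock::now() + unmuteDelay;
        }
    }

    void tick()  // polled from the update loop
    {
        if (muted && unmuteAt != std::chrono::steady_clock::time_point{} &&
            std::chrono::steady_clock::now() >= unmuteAt)
        {
            muted = false;
        }
    }
};
```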
* Fix pre-commit issue
Diffstat (limited to 'indra/newview/llvoicewebrtc.cpp')
| -rw-r--r-- | indra/newview/llvoicewebrtc.cpp | 139 |
1 file changed, 98 insertions, 41 deletions
diff --git a/indra/newview/llvoicewebrtc.cpp b/indra/newview/llvoicewebrtc.cpp
index 34f3e22182..b26a48fd5f 100644
--- a/indra/newview/llvoicewebrtc.cpp
+++ b/indra/newview/llvoicewebrtc.cpp
@@ -82,9 +82,15 @@ const std::string WEBRTC_VOICE_SERVER_TYPE = "webrtc";
 namespace
 {
-    const F32 MAX_AUDIO_DIST = 50.0f;
-    const F32 VOLUME_SCALE_WEBRTC = 0.01f;
-    const F32 LEVEL_SCALE_WEBRTC = 0.008f;
+    const F32 MAX_AUDIO_DIST           = 50.0f;
+    const F32 VOLUME_SCALE_WEBRTC      = 0.01f;
+    const F32 TUNING_LEVEL_SCALE       = 0.01f;
+    const F32 TUNING_LEVEL_START_POINT = 0.8f;
+    const F32 LEVEL_SCALE              = 0.005f;
+    const F32 LEVEL_START_POINT        = 0.18f;
+    const uint32_t SET_HIDDEN_RESTORE_DELAY_MS = 200; // 200 ms to unmute again after hiding during teleport
+    const uint32_t MUTE_FADE_DELAY_MS = 500; // 20ms fade followed by 480ms silence gets rid of the click just after unmuting.
+                                             // This is because the buffers and processing is cleared by the silence.

     const F32 SPEAKING_AUDIO_LEVEL = 0.30;
@@ -201,7 +207,6 @@ bool LLWebRTCVoiceClient::sShuttingDown = false;

 LLWebRTCVoiceClient::LLWebRTCVoiceClient() :
     mHidden(false),
-    mTuningMode(false),
     mTuningMicGain(0.0),
     mTuningSpeakerVolume(50), // Set to 50 so the user can hear themselves when he sets his mic volume
     mDevicesListUpdated(false),
@@ -348,25 +353,45 @@ void LLWebRTCVoiceClient::updateSettings()
         static LLCachedControl<std::string> sOutputDevice(gSavedSettings, "VoiceOutputAudioDevice");
         setRenderDevice(sOutputDevice);
-        LL_INFOS("Voice") << "Input device: " << std::quoted(sInputDevice()) << ", output device: " << std::quoted(sOutputDevice()) << LL_ENDL;
+        LL_INFOS("Voice") << "Input device: " << std::quoted(sInputDevice()) << ", output device: " << std::quoted(sOutputDevice())
+                          << LL_ENDL;

         static LLCachedControl<F32> sMicLevel(gSavedSettings, "AudioLevelMic");
         setMicGain(sMicLevel);

         llwebrtc::LLWebRTCDeviceInterface::AudioConfig config;
+        bool audioConfigChanged = false;
+
         static LLCachedControl<bool> sEchoCancellation(gSavedSettings, "VoiceEchoCancellation", true);
-        config.mEchoCancellation = sEchoCancellation;
+        if (sEchoCancellation != config.mEchoCancellation)
+        {
+            config.mEchoCancellation = sEchoCancellation;
+            audioConfigChanged = true;
+        }

         static LLCachedControl<bool> sAGC(gSavedSettings, "VoiceAutomaticGainControl", true);
-        config.mAGC = sAGC;
+        if (sAGC != config.mAGC)
+        {
+            config.mAGC = sAGC;
+            audioConfigChanged = true;
+        }

-        static LLCachedControl<U32> sNoiseSuppressionLevel(gSavedSettings,
+        static LLCachedControl<U32> sNoiseSuppressionLevel(
+            gSavedSettings,
             "VoiceNoiseSuppressionLevel",
             llwebrtc::LLWebRTCDeviceInterface::AudioConfig::ENoiseSuppressionLevel::NOISE_SUPPRESSION_LEVEL_VERY_HIGH);
-        config.mNoiseSuppressionLevel = (llwebrtc::LLWebRTCDeviceInterface::AudioConfig::ENoiseSuppressionLevel)(U32)sNoiseSuppressionLevel;
-
-        mWebRTCDeviceInterface->setAudioConfig(config);
+        auto noiseSuppressionLevel =
+            (llwebrtc::LLWebRTCDeviceInterface::AudioConfig::ENoiseSuppressionLevel)(U32)sNoiseSuppressionLevel;
+        if (noiseSuppressionLevel != config.mNoiseSuppressionLevel)
+        {
+            config.mNoiseSuppressionLevel = noiseSuppressionLevel;
+            audioConfigChanged = true;
+        }
+
+        if (audioConfigChanged)
+        {
+            mWebRTCDeviceInterface->setAudioConfig(config);
+        }
     }
 }
@@ -695,21 +720,38 @@ void LLWebRTCVoiceClient::OnDevicesChangedImpl(const llwebrtc::LLWebRTCVoiceDevi
     std::string outputDevice = gSavedSettings.getString("VoiceOutputAudioDevice");
     LL_DEBUGS("Voice") << "Setting devices to-input: '" << inputDevice << "' output: '" << outputDevice << "'" << LL_ENDL;
-    clearRenderDevices();
-    for (auto &device : render_devices)
+
+    // only set the render device if the device list has changed.
+    if (mRenderDevices.size() != render_devices.size() || !std::equal(mRenderDevices.begin(),
+            mRenderDevices.end(),
+            render_devices.begin(),
+            [](const LLVoiceDevice& a, const llwebrtc::LLWebRTCVoiceDevice& b) {
+                return a.display_name == b.mDisplayName && a.full_name == b.mID; }))
     {
-        addRenderDevice(LLVoiceDevice(device.mDisplayName, device.mID));
+        clearRenderDevices();
+        for (auto& device : render_devices)
+        {
+            addRenderDevice(LLVoiceDevice(device.mDisplayName, device.mID));
+        }
+        setRenderDevice(outputDevice);
     }
-    setRenderDevice(outputDevice);

-    clearCaptureDevices();
-    for (auto &device : capture_devices)
+    // only set the capture device if the device list has changed.
+    if (mCaptureDevices.size() != capture_devices.size() || !std::equal(mCaptureDevices.begin(),
+            mCaptureDevices.end(),
+            capture_devices.begin(),
+            [](const LLVoiceDevice& a, const llwebrtc::LLWebRTCVoiceDevice& b)
+            { return a.display_name == b.mDisplayName && a.full_name == b.mID; }))
     {
-        LL_DEBUGS("Voice") << "Checking capture device:'" << device.mID << "'" << LL_ENDL;
+        clearCaptureDevices();
+        for (auto& device : capture_devices)
+        {
+            LL_DEBUGS("Voice") << "Checking capture device:'" << device.mID << "'" << LL_ENDL;

-        addCaptureDevice(LLVoiceDevice(device.mDisplayName, device.mID));
+            addCaptureDevice(LLVoiceDevice(device.mDisplayName, device.mID));
+        }
+        setCaptureDevice(inputDevice);
     }
-    setCaptureDevice(inputDevice);

     setDevicesListUpdated(true);
 }
@@ -762,7 +804,14 @@ bool LLWebRTCVoiceClient::inTuningMode()

 void LLWebRTCVoiceClient::tuningSetMicVolume(float volume)
 {
-    mTuningMicGain = volume;
+    if (volume != mTuningMicGain)
+    {
+        mTuningMicGain = volume;
+        if (mWebRTCDeviceInterface)
+        {
+            mWebRTCDeviceInterface->setTuningMicGain(volume);
+        }
+    }
 }

 void LLWebRTCVoiceClient::tuningSetSpeakerVolume(float volume)
@@ -774,21 +823,10 @@ void LLWebRTCVoiceClient::tuningSetSpeakerVolume(float volume)
     }
 }

-float LLWebRTCVoiceClient::getAudioLevel()
-{
-    if (mIsInTuningMode)
-    {
-        return (1.0f - mWebRTCDeviceInterface->getTuningAudioLevel() * LEVEL_SCALE_WEBRTC) * mTuningMicGain / 2.1f;
-    }
-    else
-    {
-        return (1.0f - mWebRTCDeviceInterface->getPeerConnectionAudioLevel() * LEVEL_SCALE_WEBRTC) * mMicGain / 2.1f;
-    }
-}
-
 float LLWebRTCVoiceClient::tuningGetEnergy(void)
 {
-    return getAudioLevel();
+    float rms = mWebRTCDeviceInterface->getTuningAudioLevel();
+    return TUNING_LEVEL_START_POINT - TUNING_LEVEL_SCALE * rms;
 }

 bool LLWebRTCVoiceClient::deviceSettingsAvailable()
@@ -824,6 +862,11 @@ void LLWebRTCVoiceClient::setHidden(bool hidden)

     if (inSpatialChannel())
     {
+        if (mWebRTCDeviceInterface)
+        {
+            mWebRTCDeviceInterface->setMute(mHidden || mMuteMic,
+                                            mHidden ? 0 : SET_HIDDEN_RESTORE_DELAY_MS); // delay 200ms so as to not pile up mutes/unmutes.
+        }
         if (mHidden)
         {
             // get out of the channel entirely
@@ -990,7 +1033,6 @@ void LLWebRTCVoiceClient::updatePosition(void)
         {
             if (participant->mRegion != region->getRegionID())
             {
                 participant->mRegion = region->getRegionID();
-                setMuteMic(mMuteMic);
             }
         }
     }
@@ -1115,13 +1157,14 @@ void LLWebRTCVoiceClient::sendPositionUpdate(bool force)
 // Update our own volume on our participant, so it'll show up
 // in the UI. This is done on all sessions, so switching
 // sessions retains consistent volume levels.
-void LLWebRTCVoiceClient::updateOwnVolume() {
-    F32 audio_level = 0.0;
-    if (!mMuteMic && !mTuningMode)
+void LLWebRTCVoiceClient::updateOwnVolume()
+{
+    F32 audio_level = 0.0f;
+    if (!mMuteMic)
     {
-        audio_level = getAudioLevel();
+        float rms = mWebRTCDeviceInterface->getPeerConnectionAudioLevel();
+        audio_level = LEVEL_START_POINT - LEVEL_SCALE * rms;
     }
-
     sessionState::for_each(boost::bind(predUpdateOwnVolume, _1, audio_level));
 }
@@ -1518,6 +1561,17 @@ void LLWebRTCVoiceClient::setMuteMic(bool muted)
     }

     mMuteMic = muted;
+
+    if (mIsInTuningMode)
+    {
+        return;
+    }
+
+    if (mWebRTCDeviceInterface)
+    {
+        mWebRTCDeviceInterface->setMute(muted, muted ? MUTE_FADE_DELAY_MS : 0); // delay for 40ms on mute to allow buffers to empty
+    }
+
     // when you're hidden, your mic is always muted.
     if (!mHidden)
     {
@@ -1556,7 +1610,10 @@ void LLWebRTCVoiceClient::setMicGain(F32 gain)
     if (gain != mMicGain)
     {
         mMicGain = gain;
-        mWebRTCDeviceInterface->setPeerConnectionGain(gain);
+        if (mWebRTCDeviceInterface)
+        {
+            mWebRTCDeviceInterface->setMicGain(gain);
+        }
     }
 }
