std::optional<std::string> GetApplicationId(intptr_t window_id) {
we need to return value if not windows?
std::string GetAppMainProcessId(intptr_t window_id);
Can we have API descritpion?
GetWindowThreadProcessId(main_uwp_app_core_window, &main_process_id);
we need to validate the return value from the OS API, for all this file?
HANDLE hProcess = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, processId);
process_handle?
HANDLE hProcess = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, processId);
we need to validate if it is valid?
OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, false, parent_id));
add /*bInheritHandle=*/?
"mmdevapi.lib",
if getDisplayMedia for echo isn't major scenario, then we can make this as delay load?
static std::string GetApplicationId(std::string_view device_id);
I wonder who creates application device id? I assume that this file has that API to reduce any chance of error and improve readiability?
MEDIA_EXPORT bool IsApplicationLoopbackCaptureSupported();
same here because application loopback feature is already supported for other OSs and this CL is only for Windows?
"ApplicationAudioCapture",
I wonder if we need to specify the Windows somehow and with if BUILDFLAG(IS_WIN)?
AudioParameters params = audio_manager_->GetInputStreamParameters(device_id);
I wonder if GetInputStreamParameters can handel application device id as I don't see the AudioManager change in this CL?
class CompletionHandler
Can you add comment for the class?
: public Microsoft::WRL::RuntimeClass<
probably, you don't want to remove this one like ComPtr by "using Microsoft::WRL::Runtimeclass", same for other Com classes.
activate_operation->GetActivateResult(&hr_activate, &audio_client_);
please null check in case of the API does wrong behavior. we have to harden against third party (OS) APIs as they keep changing and introduced bug from time to time.
if (AudioDeviceDescription::IsApplicationDevice(device_id_)) {
It isn't clear what this API means by just reading a name itself? code wise, device_id_ is audio output device id, if it has something else, then we have to explain it here or its declaration on the header.
.ActivationType = AUDIOCLIENT_ACTIVATION_TYPE_PROCESS_LOOPBACK,
It might be help to reader if it has a link to the MSDN for these values use
hr = completion_handler->WaitAndGetAudioClient(audio_client_);
can we make this as &audio_client that will be more consistent with the below Activate all, clearer to out param.
// Doesn't support these
Is this complete comment?
if (!AudioDeviceDescription::IsApplicationDevice(device_id_)) {
can we have comment why we can't use simple_audio_volume_ for the application device id?
DCHECK(!(AudioDeviceDescription::IsApplicationDevice(device_id) &&
data_flow != eCapture));
it looks like it follows the above pattern, but it is difficult to read though.
how about CHECK(!AudioDeviceDescription::IsApplicationDevice(device_id) || (data_flow == eCapture)), but it is up to you though as it could be easier for somoneone.
// Loopback audio streams must be input streams.
update the comment?
DCHECK(!(AudioDeviceDescription::IsLoopbackDevice(device_id) &&
is_output_device) &&
!(AudioDeviceDescription::IsApplicationDevice(device_id) &&
is_output_device));
it can be updated to have sinlge is_output_device?
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
std::optional<std::string> GetApplicationId(intptr_t window_id) {
we need to return value if not windows?
Done
Can we have API descritpion?
Done
GetWindowThreadProcessId(main_uwp_app_core_window, &main_process_id);
we need to validate the return value from the OS API, for all this file?
Done
HANDLE hProcess = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, processId);
Gabriel Britoprocess_handle?
Done
HANDLE hProcess = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, processId);
we need to validate if it is valid?
Done
OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, false, parent_id));
Gabriel Britoadd /*bInheritHandle=*/?
Done
if getDisplayMedia for echo isn't major scenario, then we can make this as delay load?
Done
static std::string GetApplicationId(std::string_view device_id);
I wonder who creates application device id? I assume that this file has that API to reduce any chance of error and improve readiability?
chrome/browser/media/webrtc/desktop_capture_devices_util.cc creates it.
"ApplicationAudioCapture",
I wonder if we need to specify the Windows somehow and with if BUILDFLAG(IS_WIN)?
I don't think it is necessary. If I do that, I will need to wrap all feature flag checks in ifdef's. Also, I think this feature will be implemented for other platforms in the future.
// DCHECK_GE(now_time, capture_time);
Gabriel Brito?
This DCHECK fails sometimes on my local debug build, making it crash. So I commented it temporarilly. I don't believe that this is related to my changes. I think that it is related to this issue: https://issues.chromium.org/issues/40061089
AudioParameters params = audio_manager_->GetInputStreamParameters(device_id);
I wonder if GetInputStreamParameters can handel application device id as I don't see the AudioManager change in this CL?
The parameters being used for the prototype are the same used by loopback device (system audio). Please check the changes in media/audio/win/core_audio_util_win.cc.
activate_operation->GetActivateResult(&hr_activate, &audio_client_);
please null check in case of the API does wrong behavior. we have to harden against third party (OS) APIs as they keep changing and introduced bug from time to time.
Done
if (AudioDeviceDescription::IsApplicationDevice(device_id_)) {
It isn't clear what this API means by just reading a name itself? code wise, device_id_ is audio output device id, if it has something else, then we have to explain it here or its declaration on the header.
How is application loopback any different from system audio? Both are input streams, because they receive audio data from the OS audio rendering engine.
// Doesn't support these
Gabriel BritoIs this complete comment?
Added more information.
if (!AudioDeviceDescription::IsApplicationDevice(device_id_)) {
can we have comment why we can't use simple_audio_volume_ for the application device id?
Done
// Loopback audio streams must be input streams.
Gabriel Britoupdate the comment?
Done
DCHECK(!(AudioDeviceDescription::IsLoopbackDevice(device_id) &&
is_output_device) &&
!(AudioDeviceDescription::IsApplicationDevice(device_id) &&
is_output_device));
it can be updated to have sinlge is_output_device?
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
*audio_client = std::move(audio_client_);
if (!audio_client_)
return E_FAIL;
}
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Commit-Queue | +1 |
*audio_client = std::move(audio_client_);
Gabriel Britoif (!audio_client_)
return E_FAIL;
}
Done. Thanks!
hr = completion_handler->WaitAndGetAudioClient(audio_client_);
Gabriel Britocan we make this as &audio_client that will be more consistent with the below Activate all, clearer to out param.
Done
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Thank you very much for the prototype! I added some early comments, but I need to dig into the stack more before I can give more feedback.
#include <windows.h>
Should this be included on all platforms? The CL builds on my mac so it seems like <windows.h> exists for me, but should it?
std::optional<std::string> application_id = GetApplicationId(media_id.id);
I think we need to fully understand the implications of adding the application ID (which is currently the main process id) to the device ID.
Will this expose the process ID to the Renderer? It looks like blink::MediaStreamDevice has many usages there.
Is it a privacy issue to do so?
std::string GetAppMainProcessId(intptr_t window_id) {
Is this something we should be doing in the browser process? If this goes wrong and causes a crash, the whole browser will crash. If we instead do it in the AudioService, only the AudioService will crash.
Can we send the window ID to the AudioService instead, and get the process ID when we create the stream there?
return true;
You can remove the if-statement above, and replace this line with
`return base::FeatureList::IsEnabled(features::kApplicationAudioCapture);`
bool is_first_audio_sample_ready_event = true;
We can do this in a way where we don't have to call IsApplicationDevice on every callback. Simply set this variable to `AudioDeviceDescription::IsApplicationDevice(device_id_)`, and then remove that condition from the if-statement in the loop below. You might want to rename the variable to something that makes more sense, like `should_create_fifo_on_next_callback`.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Commit-Queue | +1 |
Should this be included on all platforms? The CL builds on my mac so it seems like <windows.h> exists for me, but should it?
Thanks for pointing this out. This should have been removed when I created `desktop_capture_devices_util_win.h`. Removed it.
std::optional<std::string> application_id = GetApplicationId(media_id.id);
I think we need to fully understand the implications of adding the application ID (which is currently the main process id) to the device ID.
Will this expose the process ID to the Renderer? It looks like blink::MediaStreamDevice has many usages there.
Is it a privacy issue to do so?
Yes. It will expose the process id to the renderer process. I am sorry I forgot about this. I will check how to make use of the HMAC util functions and store this data in `MediaStreamManager::DeviceRequest` as suggested on the design doc.
FWIW, the device id data related to desktop capture is already exposed - e.g. Windows HWND handles. So I am not sure of how bad of a privacy issue this is.
std::string GetAppMainProcessId(intptr_t window_id) {
Is this something we should be doing in the browser process? If this goes wrong and causes a crash, the whole browser will crash. If we instead do it in the AudioService, only the AudioService will crash.
Can we send the window ID to the AudioService instead, and get the process ID when we create the stream there?
This was discussed during the design doc review. Actually this was my first idea too. But we want to keep the audio process doing only audio-related tasks and perform the process tree traversal in the Browser process. Here's a link to the discussion: https://docs.google.com/document/d/1eyO0pTChGUMN7J8tpEmb1UZ8OxMnxIUeoBkvOOiKA3k/edit?disco=AAABN8odm3w
IMO, If we handle the system calls correctly, this shouldn't crash. I am marking this as resolved, but feel free to reopen if you feel like it should.
You can remove the if-statement above, and replace this line with
`return base::FeatureList::IsEnabled(features::kApplicationAudioCapture);`
Thanks!
class CompletionHandler
Gabriel BritoCan you add comment for the class?
Done
bool is_first_audio_sample_ready_event = true;
We can do this in a way where we don't have to call IsApplicationDevice on every callback. Simply set this variable to `AudioDeviceDescription::IsApplicationDevice(device_id_)`, and then remove that condition from the if-statement in the loop below. You might want to rename the variable to something that makes more sense, like `should_create_fifo_on_next_callback`.
I think that `AudioDeviceDescription::IsApplicationDevice` will be called just once on the first loop iteration and that's it. After the first iteration where the `audio_samples_ready_event_` has been set, `is_first_audio_sample_ready_event` will be set to `false`. Then, on the following iterations, `AudioDeviceDescription::IsApplicationDevice` will never be called again, because `is_first_audio_sample_ready_event` will be `false`, shortcircuiting the "and" statement and making it return early.
But I agree that this part could be made clearer. What if we just initialize the fifo queue when the first sample is ready for all types of capture? Then we can avoid the application capture checks altogether.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
bool is_first_audio_sample_ready_event = true;
Gabriel BritoWe can do this in a way where we don't have to call IsApplicationDevice on every callback. Simply set this variable to `AudioDeviceDescription::IsApplicationDevice(device_id_)`, and then remove that condition from the if-statement in the loop below. You might want to rename the variable to something that makes more sense, like `should_create_fifo_on_next_callback`.
I think that `AudioDeviceDescription::IsApplicationDevice` will be called just once on the first loop iteration and that's it. After the first iteration where the `audio_samples_ready_event_` has been set, `is_first_audio_sample_ready_event` will be set to `false`. Then, on the following iterations, `AudioDeviceDescription::IsApplicationDevice` will never be called again, because `is_first_audio_sample_ready_event` will be `false`, shortcircuiting the "and" statement and making it return early.
But I agree that this part could be made clearer. What if we just initialize the fifo queue when the first sample is ready for all types of capture? Then we can avoid the application capture checks altogether.
Ah right! I didn't think about the shortcircuiting and-statement. Yes then your way should be equally efficient. I think it would be clearer if we have a single bool as the source of truth for when we create the fifo, so that we use it both in the if-statement for if we should call CreateFifo() above, and below in the while-loop. So we create the bool above the first of the CreateFifo() if-statements, and use it in both of them.
I think it is probably better to create the fifo as early as possible (before the while-loop when possible), so that we do as little work as possible on the capture path.
Not a complete review but adding some comments on the way.
// Input device id used to identify an application audio capture stream.
I might be picky but "application audio capture stream" is a buit hard to read. Can it be made more clear? E.g. "Input device ID used to identify an application's audio input stream."
bool AudioDeviceDescription::IsApplicationDevice(std::string_view device_id) {
AFAIK, std::string_view is not fully approved in Chrome yet. See https://chromium.googlesource.com/chromium/src/+/1865f277bb4b77b6d12ccfa566d67a187d5d00e6/styleguide/c++/c++11.md#std_string_view-tbd
std::string_view device_id) {
base::StringPiece
if (colon_pos == std::string::npos) {
base::StringPiece::npos
// DCHECK_GE(now_time, capture_time);
Why is this commented out?
// Activates the IAudioClient interface with the adequate parameters.
Can you add some more comments here please.
Why isn't IMMDevice::Activate sufficient and why is an async activation needed. Does it have anything to do with that some applications that we capture audio from are UWP apps? Chrome itself is not an UWP app so is this really needed?
hr = ActivateAudioClientInterface();
Add comment about two different main cases in this method since it is not clear from the method that there is a ig difference for the "application device".
if (base::FeatureList::IsEnabled(features::kApplicationAudioCapture)) {
I think this should be taken into account earlier and reflected in capture_audio parameter - shouldn't it?
std::optional<std::string> application_id = GetApplicationId(media_id.id);
Gabriel BritoI think we need to fully understand the implications of adding the application ID (which is currently the main process id) to the device ID.
Will this expose the process ID to the Renderer? It looks like blink::MediaStreamDevice has many usages there.
Is it a privacy issue to do so?
Yes. It will expose the process id to the renderer process. I am sorry I forgot about this. I will check how to make use of the HMAC util functions and store this data in `MediaStreamManager::DeviceRequest` as suggested on the design doc.
FWIW, the device id data related to desktop capture is already exposed - e.g. Windows HWND handles. So I am not sure of how bad of a privacy issue this is.
guidou@ - WDYT?
HRESULT WaitAndGetAudioClient(ComPtr<IAudioClient>* audio_client) {
I'm not sure we want to make synchronous calls like this. Can we avoid that? Can it result in a deadlock?
henrika@ - what's your take on it?
HRESULT WASAPIAudioInputStream::ActivateAudioClientInterface() {
Could you document it?
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Fogot to ssend this: that looks very interesting, thank you!
HRESULT WaitAndGetAudioClient(ComPtr<IAudioClient>* audio_client) {
I'm not sure we want to make synchronous calls like this. Can we avoid that? Can it result in a deadlock?
henrika@ - what's your take on it?
I have asked similar questions in my review round. See e.g.https://chromium-review.googlesource.com/c/chromium/src/+/6210165/comment/14419678_19588014/. It is possible that it is required to be able to capture audio from UWP apps but I am not sure.
if (base::FeatureList::IsEnabled(features::kApplicationAudioCapture)) {
I think this should be taken into account earlier and reflected in capture_audio parameter - shouldn't it?
Correct. Removing the flag check. Thanks!
std::optional<std::string> application_id = GetApplicationId(media_id.id);
Gabriel BritoI think we need to fully understand the implications of adding the application ID (which is currently the main process id) to the device ID.
Will this expose the process ID to the Renderer? It looks like blink::MediaStreamDevice has many usages there.
Is it a privacy issue to do so?
Olga SharonovaYes. It will expose the process id to the renderer process. I am sorry I forgot about this. I will check how to make use of the HMAC util functions and store this data in `MediaStreamManager::DeviceRequest` as suggested on the design doc.
FWIW, the device id data related to desktop capture is already exposed - e.g. Windows HWND handles. So I am not sure of how bad of a privacy issue this is.
guidou@ - WDYT?
In the latest patch, I removed the device type check from `MediaStreamManager::TranslateDeviceIdToSourceId`. I think that this should be enough to hash ids for both audio (PID) and video (HWND) devices used in window capture. WDYT? @gui...@chromium.org
// Input device id used to identify an application audio capture stream.
I might be picky but "application audio capture stream" is a buit hard to read. Can it be made more clear? E.g. "Input device ID used to identify an application's audio input stream."
No problem. Changed it!
bool AudioDeviceDescription::IsApplicationDevice(std::string_view device_id) {
AFAIK, std::string_view is not fully approved in Chrome yet. See https://chromium.googlesource.com/chromium/src/+/1865f277bb4b77b6d12ccfa566d67a187d5d00e6/styleguide/c++/c++11.md#std_string_view-tbd
I don't know why this is still TBD. Maybe we should ask someone to update this? I remember there was a recent effort to use `string_view` in the codebase: https://issues.chromium.org/issues/40506050. Also, it seems that base::StringPiece was even removed from chromium.src: https://chromium-review.googlesource.com/c/chromium/src/+/5729956
Can I keep `std::string_view` or should I use `std::string` anyways?
// DCHECK_GE(now_time, capture_time);
Why is this commented out?
This DCHECK fails sometimes on my local debug build, making it crash. So I commented it temporarilly. I don't believe that this is related to my changes. I think that it is related to this issue: https://issues.chromium.org/issues/40061089
void CreateFifo();
Gabriel BritoAdd comment.
Added. Thanks!
// Activates the IAudioClient interface with the adequate parameters.
Can you add some more comments here please.
Why isn't IMMDevice::Activate sufficient and why is an async activation needed. Does it have anything to do with that some applications that we capture audio from are UWP apps? Chrome itself is not an UWP app so is this really needed?
I replied to this on this other comment: https://chromium-review.googlesource.com/c/chromium/src/+/6210165/comment/77d585e5_9846cc0f/
Also added a comment to the method's header. Please let me know if it needs improvements. Thanks!
HRESULT WaitAndGetAudioClient(ComPtr<IAudioClient>* audio_client) {
Henrik AndreassonI'm not sure we want to make synchronous calls like this. Can we avoid that? Can it result in a deadlock?
henrika@ - what's your take on it?
I have asked similar questions in my review round. See e.g.https://chromium-review.googlesource.com/c/chromium/src/+/6210165/comment/14419678_19588014/. It is possible that it is required to be able to capture audio from UWP apps but I am not sure.
We an avoid a deadlock by specifying a timeout in millisecs in WaitForSingleObject: https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitforsingleobject.
Henrik's comment for context:
>Why isn't IMMDevice::Activate sufficient and why is an async activation needed. Does it have anything to do with that some applications that we capture audio from are UWP apps? Chrome itself is not an UWP app so is this really needed?
Unfortunatelly, this is how the Windows API is implemented.
.ActivationType = AUDIOCLIENT_ACTIVATION_TYPE_PROCESS_LOOPBACK,
Gabriel BritoIt might be help to reader if it has a link to the MSDN for these values use
Done. Does it look good to you?
Add comment about two different main cases in this method since it is not clear from the method that there is a ig difference for the "application device".
Added the comment. Thanks!
bool is_first_audio_sample_ready_event = true;
Gabriel BritoWe can do this in a way where we don't have to call IsApplicationDevice on every callback. Simply set this variable to `AudioDeviceDescription::IsApplicationDevice(device_id_)`, and then remove that condition from the if-statement in the loop below. You might want to rename the variable to something that makes more sense, like `should_create_fifo_on_next_callback`.
Fredrik HernqvistI think that `AudioDeviceDescription::IsApplicationDevice` will be called just once on the first loop iteration and that's it. After the first iteration where the `audio_samples_ready_event_` has been set, `is_first_audio_sample_ready_event` will be set to `false`. Then, on the following iterations, `AudioDeviceDescription::IsApplicationDevice` will never be called again, because `is_first_audio_sample_ready_event` will be `false`, shortcircuiting the "and" statement and making it return early.
But I agree that this part could be made clearer. What if we just initialize the fifo queue when the first sample is ready for all types of capture? Then we can avoid the application capture checks altogether.
Ah right! I didn't think about the shortcircuiting and-statement. Yes then your way should be equally efficient. I think it would be clearer if we have a single bool as the source of truth for when we create the fifo, so that we use it both in the if-statement for if we should call CreateFifo() above, and below in the while-loop. So we create the bool above the first of the CreateFifo() if-statements, and use it in both of them.
I think it is probably better to create the fifo as early as possible (before the while-loop when possible), so that we do as little work as possible on the capture path.
Ok. I adopted your suggestion. Thanks!
HRESULT WASAPIAudioInputStream::ActivateAudioClientInterface() {
Gabriel BritoCould you document it?
Added some comments. Please let me know if you think that I should add something else. Thanks!
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
bool AudioDeviceDescription::IsApplicationDevice(std::string_view device_id) {
Gabriel BritoAFAIK, std::string_view is not fully approved in Chrome yet. See https://chromium.googlesource.com/chromium/src/+/1865f277bb4b77b6d12ccfa566d67a187d5d00e6/styleguide/c++/c++11.md#std_string_view-tbd
I don't know why this is still TBD. Maybe we should ask someone to update this? I remember there was a recent effort to use `string_view` in the codebase: https://issues.chromium.org/issues/40506050. Also, it seems that base::StringPiece was even removed from chromium.src: https://chromium-review.googlesource.com/c/chromium/src/+/5729956
Can I keep `std::string_view` or should I use `std::string` anyways?
All I can say at this stage is that there are other examples of usage of std::_string_view in media so I guess I don't have a string argument againts your usage.
I am not an owner of this particular area and don't want to block something based on the existing vague TBD statement.
Hence, don't change this now.
HRESULT WaitAndGetAudioClient(ComPtr<IAudioClient>* audio_client) {
Henrik AndreassonI'm not sure we want to make synchronous calls like this. Can we avoid that? Can it result in a deadlock?
henrika@ - what's your take on it?
Gabriel BritoI have asked similar questions in my review round. See e.g.https://chromium-review.googlesource.com/c/chromium/src/+/6210165/comment/14419678_19588014/. It is possible that it is required to be able to capture audio from UWP apps but I am not sure.
We an avoid a deadlock by specifying a timeout in millisecs in WaitForSingleObject: https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitforsingleobject.
Henrik's comment for context:
>Why isn't IMMDevice::Activate sufficient and why is an async activation needed. Does it have anything to do with that some applications that we capture audio from are UWP apps? Chrome itself is not an UWP app so is this really needed?Unfortunatelly, this is how the Windows API is implemented.
- Why do we need to use `ActivateAudioInterfaceAsync`? Because it is the only way available to setup process audio capture.
- Why can't we use `ActivateAudioInterfaceAsync` to also activate the `endpoint_`? Because only "interfaces" can be activated and endpoints are not audio interfaces.
- Why isn't IMMDevice::Activate sufficient? Process loopback cannot be activated off an endpoint. The official documentation might make look like it is possible (https://learn.microsoft.com/en-us/windows/win32/api/mmdeviceapi/nf-mmdeviceapi-immdevice-activate), but this is wrong. It will be corrected.
But this pattern is not used today and we have support for loopback devices. Are you saying that the current implementation needs similar changed for looback as well?
if (!AudioDeviceDescription::IsApplicationDevice(device_id_)) {
What I meant with my other comment is, we could define `is_first_audio_sample_ready_event` above this if-statement and change the if-statement to `if (!is_first_audio_sample_ready_event) {...}`
Then we will have the same variable in each place that concerns when to create the fifo, which I think will be clearer to the reader. Then the variable should probably also have a new name.
std::optional<std::string> application_id = GetApplicationId(media_id.id);
Gabriel BritoI think we need to fully understand the implications of adding the application ID (which is currently the main process id) to the device ID.
Will this expose the process ID to the Renderer? It looks like blink::MediaStreamDevice has many usages there.
Is it a privacy issue to do so?
Olga SharonovaYes. It will expose the process id to the renderer process. I am sorry I forgot about this. I will check how to make use of the HMAC util functions and store this data in `MediaStreamManager::DeviceRequest` as suggested on the design doc.
FWIW, the device id data related to desktop capture is already exposed - e.g. Windows HWND handles. So I am not sure of how bad of a privacy issue this is.
Gabriel Britoguidou@ - WDYT?
In the latest patch, I removed the device type check from `MediaStreamManager::TranslateDeviceIdToSourceId`. I think that this should be enough to hash ids for both audio (PID) and video (HWND) devices used in window capture. WDYT? @gui...@chromium.org
I propose creating a CL (or several if necessary) for hasing device IDs for screen capture (guarded by a feature flag), independent of this work. Then we have full freedom to add anything we want to the device IDs.
std::optional<std::string> application_id = GetApplicationId(media_id.id);
Gabriel BritoI think we need to fully understand the implications of adding the application ID (which is currently the main process id) to the device ID.
Will this expose the process ID to the Renderer? It looks like blink::MediaStreamDevice has many usages there.
Is it a privacy issue to do so?
Olga SharonovaYes. It will expose the process id to the renderer process. I am sorry I forgot about this. I will check how to make use of the HMAC util functions and store this data in `MediaStreamManager::DeviceRequest` as suggested on the design doc.
FWIW, the device id data related to desktop capture is already exposed - e.g. Windows HWND handles. So I am not sure of how bad of a privacy issue this is.
Gabriel Britoguidou@ - WDYT?
Guido UrdanetaIn the latest patch, I removed the device type check from `MediaStreamManager::TranslateDeviceIdToSourceId`. I think that this should be enough to hash ids for both audio (PID) and video (HWND) devices used in window capture. WDYT? @gui...@chromium.org
I propose creating a CL (or several if necessary) for hasing device IDs for screen capture (guarded by a feature flag), independent of this work. Then we have full freedom to add anything we want to the device IDs.
Ok. I created this bug https://issues.chromium.org/issues/401522577. I set it up with limited visibility, because it is a privacy isse. Please feel free to modify the fields if you feel like it.
@ol...@chromium.org, it seems that hashing will be itself another project. Should we proceed with the window audio capture implementation and get back to hashing after we have window capture landed?
static std::string GetApplicationId(std::string_view device_id);
Gabriel BritoI wonder who creates application device id? I assume that this file has that API to reduce any chance of error and improve readiability?
chrome/browser/media/webrtc/desktop_capture_devices_util.cc creates it.
Done
bool AudioDeviceDescription::IsApplicationDevice(std::string_view device_id) {
Gabriel BritoAFAIK, std::string_view is not fully approved in Chrome yet. See https://chromium.googlesource.com/chromium/src/+/1865f277bb4b77b6d12ccfa566d67a187d5d00e6/styleguide/c++/c++11.md#std_string_view-tbd
Henrik AndreassonI don't know why this is still TBD. Maybe we should ask someone to update this? I remember there was a recent effort to use `string_view` in the codebase: https://issues.chromium.org/issues/40506050. Also, it seems that base::StringPiece was even removed from chromium.src: https://chromium-review.googlesource.com/c/chromium/src/+/5729956
Can I keep `std::string_view` or should I use `std::string` anyways?
All I can say at this stage is that there are other examples of usage of std::_string_view in media so I guess I don't have a string argument againts your usage.
I am not an owner of this particular area and don't want to block something based on the existing vague TBD statement.
Hence, don't change this now.
Ok. I will resolve this comment then. Thanks!
std::string_view device_id) {
Gabriel Britobase::StringPiece
Resolving it because of https://chromium-review.googlesource.com/c/chromium/src/+/6210165/comment/afa59060_d2c970c7/
if (colon_pos == std::string::npos) {
Gabriel Britobase::StringPiece::npos
Resolving it because of https://chromium-review.googlesource.com/c/chromium/src/+/6210165/comment/afa59060_d2c970c7/
"ApplicationAudioCapture",
I wonder if we need to specify the Windows somehow and with if BUILDFLAG(IS_WIN)?
Gabriel BritoI don't think it is necessary. If I do that, I will need to wrap all feature flag checks in ifdef's. Also, I think this feature will be implemented for other platforms in the future.
Done
AudioParameters params = audio_manager_->GetInputStreamParameters(device_id);
Gabriel BritoI wonder if GetInputStreamParameters can handel application device id as I don't see the AudioManager change in this CL?
The parameters being used for the prototype are the same used by loopback device (system audio). Please check the changes in media/audio/win/core_audio_util_win.cc.
Done
HRESULT WaitAndGetAudioClient(ComPtr<IAudioClient>* audio_client) {
Henrik AndreassonI'm not sure we want to make synchronous calls like this. Can we avoid that? Can it result in a deadlock?
henrika@ - what's your take on it?
Gabriel BritoI have asked similar questions in my review round. See e.g.https://chromium-review.googlesource.com/c/chromium/src/+/6210165/comment/14419678_19588014/. It is possible that it is required to be able to capture audio from UWP apps but I am not sure.
Henrik AndreassonWe an avoid a deadlock by specifying a timeout in millisecs in WaitForSingleObject: https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitforsingleobject.
Henrik's comment for context:
>Why isn't IMMDevice::Activate sufficient and why is an async activation needed. Does it have anything to do with that some applications that we capture audio from are UWP apps? Chrome itself is not an UWP app so is this really needed?Unfortunatelly, this is how the Windows API is implemented.
- Why do we need to use `ActivateAudioInterfaceAsync`? Because it is the only way available to setup process audio capture.
- Why can't we use `ActivateAudioInterfaceAsync` to also activate the `endpoint_`? Because only "interfaces" can be activated and endpoints are not audio interfaces.
- Why isn't IMMDevice::Activate sufficient? Process loopback cannot be activated off an endpoint. The official documentation might make look like it is possible (https://learn.microsoft.com/en-us/windows/win32/api/mmdeviceapi/nf-mmdeviceapi-immdevice-activate), but this is wrong. It will be corrected.
But this pattern is not used today and we have support for loopback devices. Are you saying that the current implementation needs similar changed for looback as well?
No need to change the system audio loopback implementation. The `IAudioClient` is only setup for loopback capture when we call `IAudioClient::Initialize` later in `InitializeAudioEngine`. This means that neither `IMMDevice::Activate` nor `ActivateAudioInterfaceAsync` have no information about whether we are creating a loopback capture stream or not.
We need these 2 patterns basically because the stream source is different for each scenario:
I need to confirm, but I just noticed that it seems that we don't need to set `endpoint_device_` for application capture.
.ActivationType = AUDIOCLIENT_ACTIVATION_TYPE_PROCESS_LOOPBACK,
Gabriel BritoIt might be help to reader if it has a link to the MSDN for these values use
Done. Does it look good to you?
Done
if (!AudioDeviceDescription::IsApplicationDevice(device_id_)) {
What I meant with my other comment is, we could define `is_first_audio_sample_ready_event` above this if-statement and change the if-statement to `if (!is_first_audio_sample_ready_event) {...}`
Then we will have the same variable in each place that concerns when to create the fifo, which I think will be clearer to the reader. Then the variable should probably also have a new name.
Got it. I renamed the variable and moved the definition further up. Thanks!
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
// Given a window's HWND pointer, returns a string containing the PID of the
// application's "main process" for audio capture purposes. let "window process"
Why a string? Did you mean `std::optional<DWORD>` perhaps?
int class_name_length = GetClassName(hwnd, class_name, MAX_PATH);
A method called GetClassName returns the name's **length**?
// compare class_name with kUWPAppCoreWindowClassName
Remove this comment, it's too obvious.
return false;
The method's documentation should explain when true/false is returned.
bool IsUWPApplication(std::wstring_view app_base_name) {
return app_base_name == kApplicationFrameHostName;
}
Should this be inlined instead? A comment would likely suffice.
// Need to change the string?
Please remember to remove this comment.
// When this feature is enabled, gDM will be able to capture audio from an
// application whose window is selected by the user through the picker UI.
It might be more accurate to say that when applications call `gDM({audio: true})`, Chromium will offer the user the option to also capture app-audio.
// Given a window's HWND pointer, returns a string containing the PID of the
// application's "main process" for audio capture purposes. let "window process"
Why a string? Did you mean `std::optional<DWORD>` perhaps?
I am returning a `std::string` because I didn't want `chrome/browser/media/webrtc/desktop_capture_devices_util.cc` to need to include Windows headers. Is there a better way to ensure this or it is not important?
int class_name_length = GetClassName(hwnd, class_name, MAX_PATH);
A method called GetClassName returns the name's **length**?
It returns the number of copied chars to the `class_name` buffer. Is just `length` ok to you?
https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-getclassname
Remove this comment, it's too obvious.
Done
The method's documentation should explain when true/false is returned.
Added the comment and a link to the Microsoft documentation. Thanks!
bool IsUWPApplication(std::wstring_view app_base_name) {
return app_base_name == kApplicationFrameHostName;
}
Should this be inlined instead? A comment would likely suffice.
Personally, I prefer having a named function for clarity as I can read the code and understand without resorting to comments. But I understand your point since this check is only used in one place. Still, do you want me to get rid of the function?
Please remember to remove this comment.
Done. Thanks!
// When this feature is enabled, gDM will be able to capture audio from an
// application whose window is selected by the user through the picker UI.
It might be more accurate to say that when applications call `gDM({audio: true})`, Chromium will offer the user the option to also capture app-audio.
Ok. Made the change and just complemented with `for window screen captures` to be more specific. Thanks!
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
// Given a window's HWND pointer, returns a string containing the PID of the
// application's "main process" for audio capture purposes. let "window process"
Gabriel BritoWhy a string? Did you mean `std::optional<DWORD>` perhaps?
I am returning a `std::string` because I didn't want `chrome/browser/media/webrtc/desktop_capture_devices_util.cc` to need to include Windows headers. Is there a better way to ensure this or it is not important?
The conversion of what is essentially a number to a string consisting of a known prefix (`"application:"`) and a stringification of a number appears unusual, unexpected, and feels error-prone. Barring a strong reason, I'd avoid it. I have not yet heard a strong enough reason, imho; but I keep an open mind.
int class_name_length = GetClassName(hwnd, class_name, MAX_PATH);
Gabriel BritoA method called GetClassName returns the name's **length**?
It returns the number of copied chars to the `class_name` buffer. Is just `length` ok to you?
https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-getclassname
Okay, so it
int class_name_length = GetClassName(hwnd, class_name, MAX_PATH);
Gabriel BritoA method called GetClassName returns the name's **length**?
It returns the number of copied chars to the `class_name` buffer. Is just `length` ok to you?
https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-getclassname
Ah, right...
Isn't it a problem that you're providing `MAX_PATH` as the maximum length there? Shouldn't that be `kUWPAppCoreWindowClassNameLength + 1`, at least with the code currently in PS13? Wouldn't that risk memory corruption?
Also, note that you likely want to allocate at least **two** character more than kUWPAppCoreWindowClassNameLength. That is, you want one more for the `\0`, and one more so that `Windows.UI.Core.CoreWindow_MORE_CHARS` would not get misidentified as equal to `Windows.UI.Core.CoreWindow` due to GetClassName truncating the result to [nMaxCount](https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-getclassname#parameters). Let me know if I'm wrong, though.
bool IsUWPApplication(std::wstring_view app_base_name) {
return app_base_name == kApplicationFrameHostName;
}
Gabriel BritoShould this be inlined instead? A comment would likely suffice.
Personally, I prefer having a named function for clarity as I can read the code and understand without resorting to comments. But I understand your point since this check is only used in one place. Still, do you want me to get rid of the function?
I understand the principle and sometimes I apply it myself. But I think the balance in this particular case leans towards getting rid of this helper.
If you think `name == kApplicationFrameHostName` would be insufficiently clear at the call site, you can try other mitigations, such as assigning it to a `const bool` that you name `is_uwp_app`, which will give you just as much clarity as the helper, but without the need to jump around the code to see how it's implemented.
But anyhow, that's a nit either way. I wouldn't block over something like this.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
static const char kApplicationDeviceId[];
Where is it used to form the device id? I can't seem to find the place.
std::string AudioDeviceDescription::GetApplicationId(
This method cannot be here it should not be possible to extract application id from AudioDeviceDescription without unhashing; and ADD does not have access to that logic.
Generally, AudioDeviceDescription is used for device enumerations, and application loopbacks are never enumerated. So I suggest we move all this logic out into some other helper.
std::optional<std::string> application_id = GetApplicationId(media_id.id);
Gabriel BritoI think we need to fully understand the implications of adding the application ID (which is currently the main process id) to the device ID.
Will this expose the process ID to the Renderer? It looks like blink::MediaStreamDevice has many usages there.
Is it a privacy issue to do so?
Olga SharonovaYes. It will expose the process id to the renderer process. I am sorry I forgot about this. I will check how to make use of the HMAC util functions and store this data in `MediaStreamManager::DeviceRequest` as suggested on the design doc.
FWIW, the device id data related to desktop capture is already exposed - e.g. Windows HWND handles. So I am not sure of how bad of a privacy issue this is.
Gabriel Britoguidou@ - WDYT?
Guido UrdanetaIn the latest patch, I removed the device type check from `MediaStreamManager::TranslateDeviceIdToSourceId`. I think that this should be enough to hash ids for both audio (PID) and video (HWND) devices used in window capture. WDYT? @gui...@chromium.org
Gabriel BritoI propose creating a CL (or several if necessary) for hasing device IDs for screen capture (guarded by a feature flag), independent of this work. Then we have full freedom to add anything we want to the device IDs.
Ok. I created this bug https://issues.chromium.org/issues/401522577. I set it up with limited visibility, because it is a privacy isse. Please feel free to modify the fields if you feel like it.
@ol...@chromium.org, it seems that hashing will be itself another project. Should we proceed with the window audio capture implementation and get back to hashing after we have window capture landed?
Addressed by Fredrik's change: https://chromium-review.googlesource.com/c/chromium/src/+/6404482
// Given a window's HWND pointer, returns a string containing the PID of the
// application's "main process" for audio capture purposes. let "window process"
Gabriel BritoWhy a string? Did you mean `std::optional<DWORD>` perhaps?
Elad AlonI am returning a `std::string` because I didn't want `chrome/browser/media/webrtc/desktop_capture_devices_util.cc` to need to include Windows headers. Is there a better way to ensure this or it is not important?
The conversion of what is essentially a number to a string consisting of a known prefix (`"application:"`) and a stringification of a number appears unusual, unexpected, and feels error-prone. Barring a strong reason, I'd avoid it. I have not yet heard a strong enough reason, imho; but I keep an open mind.
Now returning a `base::ProcessId`, which is defined as a `DWORD` on Windows.
int class_name_length = GetClassName(hwnd, class_name, MAX_PATH);
Gabriel BritoA method called GetClassName returns the name's **length**?
Elad AlonIt returns the number of copied chars to the `class_name` buffer. Is just `length` ok to you?
https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-getclassname
Ah, right...
Isn't it a problem that you're providing `MAX_PATH` as the maximum length there? Shouldn't that be `kUWPAppCoreWindowClassNameLength + 1`, at least with the code currently in PS13? Wouldn't that risk memory corruption?
Also, note that you likely want to allocate at least **two** character more than kUWPAppCoreWindowClassNameLength. That is, you want one more for the `\0`, and one more so that `Windows.UI.Core.CoreWindow_MORE_CHARS` would not get misidentified as equal to `Windows.UI.Core.CoreWindow` due to GetClassName truncating the result to [nMaxCount](https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-getclassname#parameters). Let me know if I'm wrong, though.
This shouldn't be a problem anymore. Calling `base::win::GetWindowClass` now.
bool IsUWPApplication(std::wstring_view app_base_name) {
return app_base_name == kApplicationFrameHostName;
}
Gabriel BritoShould this be inlined instead? A comment would likely suffice.
Elad AlonPersonally, I prefer having a named function for clarity as I can read the code and understand without resorting to comments. But I understand your point since this check is only used in one place. Still, do you want me to get rid of the function?
I understand the principle and sometimes I apply it myself. But I think the balance in this particular case leans towards getting rid of this helper.
If you think `name == kApplicationFrameHostName` would be insufficiently clear at the call site, you can try other mitigations, such as assigning it to a `const bool` that you name `is_uwp_app`, which will give you just as much clarity as the helper, but without the need to jump around the code to see how it's implemented.
But anyhow, that's a nit either way. I wouldn't block over something like this.
Done
Where is it used to form the device id? I can't seem to find the place.
Addressed by Fredrik's change: https://chromium-review.googlesource.com/c/chromium/src/+/6404482
std::string AudioDeviceDescription::GetApplicationId(
This method cannot be here it should not be possible to extract application id from AudioDeviceDescription without unhashing; and ADD does not have access to that logic.
Generally, AudioDeviceDescription is used for device enumerations, and application loopbacks are never enumerated. So I suggest we move all this logic out into some other helper.
Also addressed by Fredrik's change: https://chromium-review.googlesource.com/c/chromium/src/+/6404482
MEDIA_EXPORT bool IsApplicationLoopbackCaptureSupported();
same here because application loopback feature is already supported for other OSs and this CL is only for Windows?
Yes.
probably, you don't want to remove this one like ComPtr by "using Microsoft::WRL::Runtimeclass", same for other Com classes.
I will keep it just because it is the pattern for the rest of the file.
if (AudioDeviceDescription::IsApplicationDevice(device_id_)) {
It isn't clear what this API means by just reading a name itself? code wise, device_id_ is audio output device id, if it has something else, then we have to explain it here or its declaration on the header.
How is application loopback any different from system audio? Both are input streams, because they receive audio data from the OS audio rendering engine.
DCHECK(!(AudioDeviceDescription::IsApplicationDevice(device_id) &&
data_flow != eCapture));
it looks like it follows the above pattern, but it is difficult to read though.
how about CHECK(!AudioDeviceDescription::IsApplicationDevice(device_id) || (data_flow == eCapture)), but it is up to you though as it could be easier for somoneone.
Gabriel Brito
I will keep it like this for now. But take it into consideration when pushing the final CLs.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
[WIP] Application audio capture prototype for Windows
Still WIP?
return l10n_util::GetStringFUTF16(
IDS_MEDIA_WINDOW_CAPTURE_WITH_AUDIO_NOTIFICATION_TEXT,
application_title);
Let's wrap this in `#if BUILDFLAG(IS_WIN)` for now.
if (media_id.type == content::DesktopMediaID::TYPE_WEB_CONTENTS) {
This block is growing too long; please refactor it into a helper:
```
out_devices.audio_device = SomeHelper(...);
```
For bonus points, have this mini-refactor in a separate CL. (Optional)
request.audio_type, application_id.value(), "Application Audio");
Are we exposing this to (i) the render process and/or (ii) the JS application? The first would be bad and the second worse.
// window
// with the "Windows.UI.Core.CoreWindow" class name.
Auto-formatting error here.
// created the windows identified by the provided HWND. Depending on the type of
s/windows/window
// Given a window's HWND pointer, returns the PID of the application's "main
// process" for audio capture purposes. let "window process" be the process that
Following the logic of my earlier comment, this could have been your first sentence.
// process" for audio capture purposes. let "window process" be the process that
Everything from here on out is the implementation detail, and does not belong in the header file.
Also: s/let/Let
// creates and owns the window might not be the same that plays renders audio -
I think you meant to erase one of these verbs and keep the other.
This is great documentation - thank you for you diligence!
Just a pointer - start out with what the function does, and only explain why or how thereafter. It's easier for readers when a traditional structure is followed, even if it requires (briefly) stating the obvious; in this case, this could be "given a ID of a window, return the ID of the process with which the window is associated." When you *then* explain why a process ID is useful, the reader already has the right frame of mind.
constexpr wchar_t kApplicationFrameHostName[] = L"ApplicationFrameHost.exe";
constexpr wchar_t kUWPAppCoreWindowClassName[] = L"Windows.UI.Core.CoreWindow";
Each of these is only used once, so please consider if it's really helpful to define them in file-scope, requiring the reader to skip around in the file to find them. (If you think there's is explanatory value in named constants instead of string literals, you can still define them closer to where they are used.)
// window is found, this function should return false, signaling to
\`false\`
BOOL CALLBACK IsUWPAppCoreWindow(HWND hwnd, LPARAM lParam) {
nit: Should you be returning `FALSE`/`TRUE` then?
std::array<wchar_t, MAX_PATH + 1> executable_path_buffer;
Is this the one defined in `base/win/windows_types.h`? Can you #include the right header?
if (QueryFullProcessImageName(process_handle, 0,
Invert the condition and use early-exit if it fails. This reduces complexity and indentation.
// the user the option to also capture app-audio for window screen captures.
Remove "also", or else readers might understand it as "in addition to just capturing the window's own audio."
DCHECK(!(AudioDeviceDescription::IsApplicationLoopbackDevice(device_id) &&
data_flow != eCapture));
New DCHECKs are discouraged (CHECKs are encouraged) unless the condition is expensive to evaluate. Is that the case?