Thanks! I haven't responded to some comments that require a bit more thought.
// Probes a domain with the given |probe_name| (used for metrics) and |host|Brian Begnochenit: for new comments prefer backticks: https://chromium.googlesource.com/chromium/src/+/main/styleguide/c++/c++-dos-and-donts.md#comment-style
Done
HostResolver::ResolveHostRequest* request,Brian Begnocheoptional: if we use `const HostResolver::ResolveHostRequest&` for this then it's clear it can never be NULL and we can remove the DCHECK.
Done
HostPortPair host_port_pair(host, 80);Brian Begnochenit: can std::move(host) here
Done
std::move(request), std::move(callback), probe_name));Brian BegnocheTo avoid the use-after-move, at least for callback you may want to use SplitOnceCallback. idk what's best to do for request
Good catch. I think this is cleaned up now.
OnProbeComplete(std::move(request), std::move(callback), probe_name,Brian Begnochenit: can std::move(probe_name) here
Done
base::StrCat({"Net.DNS.CanaryDomainService.", probe_name, ".Result"}),Brian Begnochenit: can std::move(probe_name) here
I'm less clear on whether this still makes sense, since it's now owned now by the probe struct?
base::StrCat({"Net.DNS.CanaryDomainService.", probe_name, ".Result"}),Brian BegnocheCan we do a pass double-checking includes? I think we're missing at least the base/strings/strcat.h include
strcat.h is added. Will check for others.
service->ProbeSecureDnsDomain(Brian BegnocheI think this could be simplified to:
```
base::test::TestFuture<bool> future;
service->ProbeSecureDnsDomain(future.GetCallback());
bool result = future.Get();
```
Done
Brian BegnocheIt might be useful to add a SERVFAIL case as well to test the default case
I tried really hard to make this happen. I don't think it's possible, at least not trivially.
The short of it is that `MockHostResolver::Resolve` sees no rule that resolves to an IP and returns `ERR_NAME_NOT_RESOLVED`, even if the rule specifies some other error code.
doh_fallback_allowed_by_canary_domain_ = allowed;Brian BegnocheI wonder if the more consistent way of applying this change would be to have an entry in DnsConfigOverrides for this boolean and setting that here. Then we don't need the `doh_fallback_allowed_by_canary_domain_ == allowed` check, because IIUC that's handled by UpdateDnsConfig
I thought about this too. The reason I didn't do that was that I didn't want `SetConfigOverrides` to clobber this. So it'd need to do something to persist this value after updating the overrides. It's like an override of overrides, so kind of odd.
Alternatively, it could be a second DnsConfigOverrides that just stores this override, but then `UpdateDnsConfig` needs to be updated regardless. I guess there is an advantage to this in that it would be a canned `DnsConfigOverrides` that can (1) set this value (2) override `secure_dns_mode` (3) clear `doh_config.servers`.
I think I would prefer the latter approach, if anything. What are your thoughts?
dict.Set("doh_fallback_allowed_by_canary_domain",Brian BegnocheI don't think we need this in addition to the change in `DnsConfig::ToDict`, right?
Nope, removed.
config.nameservers = {IPEndPoint(IPAddress(1, 2, 3, 4), 53)};Brian BegnocheI think these exist already here: https://source.chromium.org/chromium/chromium/src/+/main:net/dns/dns_client_unittest.cc;l=90-92;drc=57cc7a30fed828b5ff5f3f075b374c6e1ca035bb
Done
config.nameservers = {IPEndPoint(IPAddress(1, 2, 3, 4), 53)};Brian BegnocheI think these exist already here: https://source.chromium.org/chromium/chromium/src/+/main:net/dns/dns_client_unittest.cc;l=90-92;drc=57cc7a30fed828b5ff5f3f075b374c6e1ca035bb
Done
nullptr /* url_request_context= */, false /* enable_caching */);Brian Begnochenit: for new code prefer using the format `/*url_request_context=*/nullptr` so that the tooling will ensure that the comment correctly indicates the parameter name
https://clang.llvm.org/extra/clang-tidy/checks/bugprone/argument-comment.html
Ack. This is resolved with the chagne to use CreateResolver.
I'll look for this anywhere else. Please comment if I miss any.
std::unique_ptr<HostResolver> resolver = std::make_unique<ContextHostResolver>(this, std::move(context));Brian BegnocheCan we use HostResolver::CreateResolver for this? It seems to do the same things
Done
MaybeProbeSecureDnsCanaryDomain();Brian BegnocheJust to check, do we need to do this in the !changed case?
No, moved.
DnsConfig config_copy = config.value_or(DnsConfig());Brian Begnochelooks like we don't need this anymore
Done
TEST_F(HostResolverManagerDnsTest, SecureDnsCanaryDomainProbe_KeepsDohFallbackEnabled) {Brian Begnochenit: It'd probably be worth a subclass that enables the feature since we have multiple tests here using that
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
params.source = HostResolverSource::SYSTEM;Brian BegnocheI wonder if `HostResolverSource::ANY` should be used to allow use of the built-in resolver, given that we don't require use of the system provider. I'm not sure the full implications of this, though.
I don't know either, but I don't have a reason to prevent the built-in resolver. I'll leave this comment open for net owners to comment.
base::StrCat({"Net.DNS.CanaryDomainService.", probe_name, ".Result"}),Brian Begnochenit: can std::move(probe_name) here
I'm less clear on whether this still makes sense, since it's now owned now by the probe struct?
In any case, the probe callback is moved, so moving the name is fine too. Done.
Brian BegnocheIt might be useful to add a SERVFAIL case as well to test the default case
I tried really hard to make this happen. I don't think it's possible, at least not trivially.
The short of it is that `MockHostResolver::Resolve` sees no rule that resolves to an IP and returns `ERR_NAME_NOT_RESOLVED`, even if the rule specifies some other error code.
I've addressed this the best I can with reworking the result that is logged and to test appropriately here.
I was wrong with what I said before. `MockHostResolver` is WAI, but `HostResolver::SquashErrorCode()` squashes various error codes to `ERR_NAME_NOT_RESOLVED`.
config->secure_dns_mode = SecureDnsMode::kOff;Brian BegnocheI don't think we should change secure_dns_mode, can we not just avoid modifying doh_config in the case when the canary domain indicates we should fallback to Google DNS? That's how the code works today
I can see both ways. IMO it makes sense to clearly categorize this as a case where Secure DNS is turned off because fallback is not possible.
I'd like to hear thoughts from net owners.
bool doh_fallback_allowed_by_canary_domain = true;Brian BegnocheShould probably default to false? Should we disable the fallback until we hear back that it's OK?
I think so, after thinking about it more. Done.
MaybeProbeSecureDnsCanaryDomain();Brian BegnocheShould we invalidate the existing canary domain check status on system DNS config change prior to the results of the query coming back?
I reworked `CanaryDomainService` to do this. There should only ever be one pending probe at any given time now.
canary_domain_service_->ProbeSecureDnsDomain(Brian BegnocheI wonder if we should set a boolean when a probe is currently running and not start a new one if so
I think that this is addressed now. Please reopen or add a new comment if you still see an issue.
Records the result of a canary domain probe used to determine whether SecureBrian BegnocheI think we'll need a bit more description for these, specifically when it is recorded
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
I haven't looked into tests yet.
explicit CanaryDomainService(std::unique_ptr<HostResolver> host_resolver,nit: Remove `explicit`.
params.source = HostResolverSource::SYSTEM;I wonder if `HostResolverSource::ANY` should be used to allow use of the built-in resolver, given that we don't require use of the system provider. I'm not sure the full implications of this, though.
Yeah, why do we want to restrict to use SYSTEM?
params.source = HostResolverSource::ANY;If we use `ANY`, we don't have to set this since `ANY` is the default initial value.
params.dns_query_type = DnsQueryType::UNSPECIFIED; // both A and AAAAWe don't have to set this since `UNSPECIFIED` is the default initial value (also UNSPECIFIED may query HTTPS).
NetLogWithSource::Make(net_log, NetLogSourceType::DNS_TRANSACTION)) {}This doesn't seems an appropriate source type. Maybe we should add a new source type.
Probe("SecureDns", features::kSecureDnsCanaryDomainHost.Get(),Just inline `Probe()` here?
HostPortPair host_port_pair(host, 80);Why do we use `80` here? If arbitrary port works, using `0` might be a better one (since it doesn't imply HTTP)
host_resolver_->CreateRequest(host_port_pair,nit: std::move(host_port_pair), or inline.
void CanaryDomainService::OnProbeComplete(const std::string& probe_name,We don't need this since `pending_probe_` has it?
CanaryDomainService::PendingProbe::PendingProbe(
std::unique_ptr<HostResolver::ResolveHostRequest> request,
Callback callback,
std::string probe_name)
: request(std::move(request)),
callback(std::move(callback)),
probe_name(std::move(probe_name)) {}
CanaryDomainService::PendingProbe::~PendingProbe() = default;nit: Move these above CanaryDomainService methods (we usually do so).
config->doh_config =Kenichi IshibashiI was originally wondering whether the canary domain check and the test DoH probes would need to happen in that order. I had convinced myself that they didn't need to be ordered and that once both succeeded we'd upgrade queries that happen after. After thinking about it again, though, it seems like before the test DoH probes run we need `config->doh_config` to be set to the DoH config we should send probes for (which wouldn't be the case in this if block)... I'm not sure what the right thing to do here is
It seems we should have a state (enum class) instead of having a boolean flag to distinguish the states? Probably kUnknown (checking), kPositive, kNegative.
config->secure_dns_mode = SecureDnsMode::kOff;Brian BegnocheI don't think we should change secure_dns_mode, can we not just avoid modifying doh_config in the case when the canary domain indicates we should fallback to Google DNS? That's how the code works today
I can see both ways. IMO it makes sense to clearly categorize this as a case where Secure DNS is turned off because fallback is not possible.
I'd like to hear thoughts from net owners.
I'm wondering if we should add kDisabledByCanaryDomainCheck to categorize precisely.
doh_fallback_allowed_by_canary_domain_ = allowed;I wonder if the more consistent way of applying this change would be to have an entry in DnsConfigOverrides for this boolean and setting that here. Then we don't need the `doh_fallback_allowed_by_canary_domain_ == allowed` check, because IIUC that's handled by UpdateDnsConfig
I thought about this too. The reason I didn't do that was that I didn't want `SetConfigOverrides` to clobber this. So it'd need to do something to persist this value after updating the overrides. It's like an override of overrides, so kind of odd.
Alternatively, it could be a second DnsConfigOverrides that just stores this override, but then `UpdateDnsConfig` needs to be updated regardless. I guess there is an advantage to this in that it would be a canned `DnsConfigOverrides` that can (1) set this value (2) override `secure_dns_mode` (3) clear `doh_config.servers`.
I think I would prefer the latter approach, if anything. What are your thoughts?
I had the same question. It's a good point we don't want to clobber canary domain check. Regardless how we handle it, could you document the requirement?
canary_domain_service_ = std::make_unique<CanaryDomainService>(std::move(resolver), net_log);Kenichi IshibashiI'm not sure if it makes the most sense for HostResolverManager to own this... I wonder if it should be owned by the NetworkService the way DelayedDohProbeActivator is (so, is a sibling to HostResolverManager):
WDYT?
I think using a delayed approach like DelayedDohProbeActivator is worth doing as well. For IP Protection we triggered requests right at NetworkService creation time and particularly on Android we had a lot of issue with requests failing. Those were HTTP requests instead of DNS requests, but still, it seems safest to a wait a bit.
It seems NetworkService is an appropriate owner of CanaryDomainService. NetworkService listens to config changes via NetworkChangeNotifier::DNSObserver (or SystemDnsConfigChangeNotifier, less preferable) and uses host_resolver_manager_ to create an inner HostResolver.
IIUC we will move DnsClient from HostResolverManager to ResolveContext in the future so if we make HostResolverManager owns CanaryDomainService I guess we would need to update it later as well.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
explicit CanaryDomainService(std::unique_ptr<HostResolver> host_resolver,Brian Begnochenit: Remove `explicit`.
Done
params.source = HostResolverSource::SYSTEM;Kenichi IshibashiI wonder if `HostResolverSource::ANY` should be used to allow use of the built-in resolver, given that we don't require use of the system provider. I'm not sure the full implications of this, though.
Yeah, why do we want to restrict to use SYSTEM?
This is now ANY.
If we use `ANY`, we don't have to set this since `ANY` is the default initial value.
Removed
params.dns_query_type = DnsQueryType::UNSPECIFIED; // both A and AAAAWe don't have to set this since `UNSPECIFIED` is the default initial value (also UNSPECIFIED may query HTTPS).
I removed this.
About UNSPECIFIED possibly using HTTPS, does this present an issue with being consistent with the canary domain check for prefetch proxy? It specifically mentions just A and AAAA, so I'm not sure whether allowing anything else makes sense.
https://developer.chrome.com/docs/privacy-security/private-prefetch-proxy-for-network-admins#details_of_the_signaling_mechanism?
NetLogWithSource::Make(net_log, NetLogSourceType::DNS_TRANSACTION)) {}This doesn't seems an appropriate source type. Maybe we should add a new source type.
Added a new `CANARY_DOMAIN_RESOLUTION` source type
Probe("SecureDns", features::kSecureDnsCanaryDomainHost.Get(),Brian BegnocheJust inline `Probe()` here?
Done
Why do we use `80` here? If arbitrary port works, using `0` might be a better one (since it doesn't imply HTTP)
I just picked this up from an example. I appreciate your scrutinizing the details.
Changed to 0.
nit: std::move(host_port_pair), or inline.
Done
void CanaryDomainService::OnProbeComplete(const std::string& probe_name,We don't need this since `pending_probe_` has it?
I did this to be able to log the case where `std::move(pending_probe_)` returns null. What do you think? Unnecessary?
CanaryDomainService::PendingProbe::PendingProbe(
std::unique_ptr<HostResolver::ResolveHostRequest> request,
Callback callback,
std::string probe_name)
: request(std::move(request)),
callback(std::move(callback)),
probe_name(std::move(probe_name)) {}
CanaryDomainService::PendingProbe::~PendingProbe() = default;nit: Move these above CanaryDomainService methods (we usually do so).
Done
doh_fallback_allowed_by_canary_domain_ = allowed;Brian BegnocheI wonder if the more consistent way of applying this change would be to have an entry in DnsConfigOverrides for this boolean and setting that here. Then we don't need the `doh_fallback_allowed_by_canary_domain_ == allowed` check, because IIUC that's handled by UpdateDnsConfig
Kenichi IshibashiI thought about this too. The reason I didn't do that was that I didn't want `SetConfigOverrides` to clobber this. So it'd need to do something to persist this value after updating the overrides. It's like an override of overrides, so kind of odd.
Alternatively, it could be a second DnsConfigOverrides that just stores this override, but then `UpdateDnsConfig` needs to be updated regardless. I guess there is an advantage to this in that it would be a canned `DnsConfigOverrides` that can (1) set this value (2) override `secure_dns_mode` (3) clear `doh_config.servers`.
I think I would prefer the latter approach, if anything. What are your thoughts?
I had the same question. It's a good point we don't want to clobber canary domain check. Regardless how we handle it, could you document the requirement?
I still need to act on this comment on way or the other. I'm hesitant to introduce a precedent for overrides that aren't overridable, or something novel like that, so I'm still a bit inclined to keep this as is.
Ack to bashi to document whatever we end up with.
I'd appreciate hearing any strong opinions against leaving it like this.
canary_domain_service_ = std::make_unique<CanaryDomainService>(std::move(resolver), net_log);Kenichi IshibashiI'm not sure if it makes the most sense for HostResolverManager to own this... I wonder if it should be owned by the NetworkService the way DelayedDohProbeActivator is (so, is a sibling to HostResolverManager):
WDYT?
I think using a delayed approach like DelayedDohProbeActivator is worth doing as well. For IP Protection we triggered requests right at NetworkService creation time and particularly on Android we had a lot of issue with requests failing. Those were HTTP requests instead of DNS requests, but still, it seems safest to a wait a bit.
It seems NetworkService is an appropriate owner of CanaryDomainService. NetworkService listens to config changes via NetworkChangeNotifier::DNSObserver (or SystemDnsConfigChangeNotifier, less preferable) and uses host_resolver_manager_ to create an inner HostResolver.
IIUC we will move DnsClient from HostResolverManager to ResolveContext in the future so if we make HostResolverManager owns CanaryDomainService I guess we would need to update it later as well.
I've moved CanaryDomainService to be owned by NetworkService. Relevant tests are now in network_service_unittest.cc instead of host_resolver_manager_unittest.cc.
When HRM is updated to manage a DnsClient owned by each ResolveContext, `SetDohFallbackAllowedByCanaryDomain` will need to propagate the call to each DnsClient instead of just the one.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Brian BegnocheI will close this once I've added tests to host_resolver_manager_unittest.cc.
Done
void CanaryDomainService::OnProbeComplete(const std::string& probe_name,Brian BegnocheWe don't need this since `pending_probe_` has it?
I did this to be able to log the case where `std::move(pending_probe_)` returns null. What do you think? Unnecessary?
In any case, this is moot now that I've made this callback specifically for Secure DNS probe to match.
config->doh_config =I was originally wondering whether the canary domain check and the test DoH probes would need to happen in that order. I had convinced myself that they didn't need to be ordered and that once both succeeded we'd upgrade queries that happen after. After thinking about it again, though, it seems like before the test DoH probes run we need `config->doh_config` to be set to the DoH config we should send probes for (which wouldn't be the case in this if block)... I'm not sure what the right thing to do here is
It seems we should have a state (enum class) instead of having a boolean flag to distinguish the states? Probably kUnknown (checking), kPositive, kNegative.
Andrew and I talked more about this timing thing. Our conclusion is that timing is OK for two reasons:
1. Because the code here doesn't allow fallback upgrade until the canary domain check finishes successfully, the prober when fallback is needed won't occur until that happens and a system config update triggers the probe.
2. `HostResolverManager::PushDnsTasks` won't push the `SECURE_DNS` task until the following two things both happen:
(a) The canary domain check is positive and retriggers config update here to allow the fallback upgrade.
(b) The DoH prober doesn't record enough failures to make `DnsClient::FallbackFromSecureTransactionPreferred` start returning false.
I've also updated the histogram to log an enum.
config->secure_dns_mode = SecureDnsMode::kOff;I don't think we should change secure_dns_mode, can we not just avoid modifying doh_config in the case when the canary domain indicates we should fallback to Google DNS? That's how the code works today
Kenichi IshibashiI can see both ways. IMO it makes sense to clearly categorize this as a case where Secure DNS is turned off because fallback is not possible.
I'd like to hear thoughts from net owners.
I'm wondering if we should add kDisabledByCanaryDomainCheck to categorize precisely.
Andrew and I talked about this. I'm happy to sync with you on this if needed, Kenichi.
The gist is that we want to remain consistent with the way the Doh prober works today. That prober only prevents pushing the `SECURE_DNS` task for DNS lookups. The way this happens is that `DnsClient::FallbackFromSecureTransactionPreferred` returns false when the prober has failed, and the `SECURE_DNS` task is not allowed.
The failed probe doesn't set `secure_dns_mode` to OFF, so we won't do that here either. I've added mention of this in the design doc too.
doh_fallback_allowed_by_canary_domain_ = allowed;Brian BegnocheI wonder if the more consistent way of applying this change would be to have an entry in DnsConfigOverrides for this boolean and setting that here. Then we don't need the `doh_fallback_allowed_by_canary_domain_ == allowed` check, because IIUC that's handled by UpdateDnsConfig
Kenichi IshibashiI thought about this too. The reason I didn't do that was that I didn't want `SetConfigOverrides` to clobber this. So it'd need to do something to persist this value after updating the overrides. It's like an override of overrides, so kind of odd.
Alternatively, it could be a second DnsConfigOverrides that just stores this override, but then `UpdateDnsConfig` needs to be updated regardless. I guess there is an advantage to this in that it would be a canned `DnsConfigOverrides` that can (1) set this value (2) override `secure_dns_mode` (3) clear `doh_config.servers`.
I think I would prefer the latter approach, if anything. What are your thoughts?
Brian BegnocheI had the same question. It's a good point we don't want to clobber canary domain check. Regardless how we handle it, could you document the requirement?
I still need to act on this comment on way or the other. I'm hesitant to introduce a precedent for overrides that aren't overridable, or something novel like that, so I'm still a bit inclined to keep this as is.
Ack to bashi to document whatever we end up with.
I'd appreciate hearing any strong opinions against leaving it like this.
I added discussion on why `SetDohFallbackAllowedByCanaryDomain` is added to record and retain the canary domain result.
Please comment where you feel that comments in code are not inadequate.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
params.dns_query_type = DnsQueryType::UNSPECIFIED; // both A and AAAAWe don't have to set this since `UNSPECIFIED` is the default initial value (also UNSPECIFIED may query HTTPS).
I removed this.
About UNSPECIFIED possibly using HTTPS, does this present an issue with being consistent with the canary domain check for prefetch proxy? It specifically mentions just A and AAAA, so I'm not sure whether allowing anything else makes sense.
https://developer.chrome.com/docs/privacy-security/private-prefetch-proxy-for-network-admins#details_of_the_signaling_mechanism?
I just realized we wouldn't query HTTPS in this check since we use HostPortPair to call HostResolver::CreateRequest(). We only query HTTPS when scheme (`https://`) is specified.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
public net::NetworkChangeNotifier::DNSObserver {I think we're missing the NetworkChangeNotifier::AddDNSObserver / NetworkChangeNotifier::RemoveDNSObserver calls needed to hook this up
MaybeProbeSecureDnsCanaryDomain();Should we call `host_resolver_manager()->SetDohFallbackAllowedByCanaryDomain(false)` here so that we don't attempt to use secure DNS until the new canary domain check comes back?
I'm not sure whether OnDNSChanged will trigger before the state of the probes change (triggering resending the probes), though, so if not it seems unfortunate that we might:
- Received a system DNS change
- Start the probes
- This method gets called and we update the DnsConfig to indicate that the canary domain check hasn't succeeded
- Cancel the DoH probes (I think?)
- Do the canary domain check and update the DnsConfig if they pass
- Restart the DoH probes
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
bashi's feedback is to incorporate awillia's proposal (crrev.com/c/7567908) to shift to CanaryDomainService ownership by NetworkContext. I have done that.
if (base::FeatureList::IsEnabled(I have added this here, which means that the canary domain check won't start until after the DoH probe delay. Basically they start together.
I pondered and concluded that this is reasonable. I've become more and more convinced that the more that these two are consistent and work in concert, the better.
If we don't want to delay on the canary domain check, then we could add an `NetworkContext::ActivateCanaryDomainCheck` method that just does this added bit, and then `NetworkService::ActivateAllDohProbes` would call this new method to avoid the timer delay.
I think we're missing the NetworkChangeNotifier::AddDNSObserver / NetworkChangeNotifier::RemoveDNSObserver calls needed to hook this up
You're right. I did this and didn't upload. This is a moot point now after moving `CanaryDomainService` ownership to `NetworkContext`.
MaybeProbeSecureDnsCanaryDomain();Should we call `host_resolver_manager()->SetDohFallbackAllowedByCanaryDomain(false)` here so that we don't attempt to use secure DNS until the new canary domain check comes back?
I'm not sure whether OnDNSChanged will trigger before the state of the probes change (triggering resending the probes), though, so if not it seems unfortunate that we might:
- Received a system DNS change
- Start the probes
- This method gets called and we update the DnsConfig to indicate that the canary domain check hasn't succeeded
- Cancel the DoH probes (I think?)
- Do the canary domain check and update the DnsConfig if they pass
- Restart the DoH probes
This is a moot point now that `CanaryDomainService` creation and triggering the canary domain probe are moved to `NetworkContext::ActivateDohProbes`.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
(Still haven't looked into tests)
base::WeakPtr<ResolveContext> resolve_context_;
// The HostResolver that this service uses to perform probes.
base::WeakPtr<HostResolver> host_resolver_;What's the expected life time relation between `resolve_context_`, `host_resolver_` and this service? IIUC NetworkContext owns ResolveContext (via UrlRequestContext, ContextHostResolver). If ResolveContext/HostResolver are expected to outlive `this`, probably using base::SafeRef would be better.
// Service for probing canary domains. It is owned by a HostResolverManager.NetworkContext?
HostResolver::ResolveHostParameters::CacheUsage::DISALLOWED;nit: Could you add a comment why we disable secure DNS for this probe (it may not be obvious if the reader is not very familiar with canary domain check)?
NetLogSourceType::CANARY_DOMAIN_SERVICE)) {}Add `CHECK(base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain))` ?
if (resolve_context_) {Should we CHECK(resolve_context_) instead?
resolve_context_->RegisterDohStatusObserver(this);optional: Does this mean Start() should be called only once. Should we somehow ensure the precondition?
if (base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain)) {Should this be `CHECK(base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain))`?
If we add CHECK in the ctor, the check would be redundant.
resolve_context_->canary_domain_check_status() ==
ResolveContext::CanaryDomainCheckStatus::kNotStarted) {awillia@: Does this look reasonable (I guess the answer is yes)? Or should we always (re)start probe? If so, should we reset the check status?
#include "net/dns/canary_domain_service.h"Missing licence header comment.
UMA_HISTOGRAM_ENUMERATION("Net.DNS.FallbackFromSecureTransactionPreferred",nit: Prefer base::UmaHistogramEnumeration().
https://chromium.googlesource.com/chromium/src.git/+/HEAD/tools/metrics/histograms/README.md#coding-emitting-to-histograms
virtual std::unique_ptr<CanaryDomainService> CreateCanaryDomainService();Please add a description comment for virtual method that is considered as an interface to HostResolver.
enum class CanaryDomainCheckStatus {nit: Please add description comments.
protected:Why do we need make this change? Can't we use `task_environment()` and `service()` in tests?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
base::WeakPtr<ResolveContext> resolve_context_;
// The HostResolver that this service uses to perform probes.
base::WeakPtr<HostResolver> host_resolver_;What's the expected life time relation between `resolve_context_`, `host_resolver_` and this service? IIUC NetworkContext owns ResolveContext (via UrlRequestContext, ContextHostResolver). If ResolveContext/HostResolver are expected to outlive `this`, probably using base::SafeRef would be better.
I've changed to SafeRef and added documentation on these, as well as on the constructor above and HostResolver::CreateCanaryDomainService to clarify lifetime expectations.
// Service for probing canary domains. It is owned by a HostResolverManager.Brian BegnocheNetworkContext?
Done
HostResolver::ResolveHostParameters::CacheUsage::DISALLOWED;nit: Could you add a comment why we disable secure DNS for this probe (it may not be obvious if the reader is not very familiar with canary domain check)?
Done
Add `CHECK(base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain))` ?
Done. Also checked for canary domain. I added a comment in .h file to specify that this feature needs to be enabled.
if (resolve_context_) {Brian BegnocheShould we CHECK(resolve_context_) instead?
This happens now implicitly since it is a SafeRef.
resolve_context_->RegisterDohStatusObserver(this);optional: Does this mean Start() should be called only once. Should we somehow ensure the precondition?
You're right. `RegisterDohStatusObserver` will enforce this implicitly by failing on a check that the passed in observer isn't already registered.
I'm going to assume that this is sufficient and not needing additional, more explicit enforcement. I added a comment on this method in the .h file to call this out.
if (base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain)) {Should this be `CHECK(base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain))`?
If we add CHECK in the ctor, the check would be redundant.
CHECK is added to the constructor. Removed this check here.
resolve_context_->canary_domain_check_status() ==
ResolveContext::CanaryDomainCheckStatus::kNotStarted) {awillia@: Does this look reasonable (I guess the answer is yes)? Or should we always (re)start probe? If so, should we reset the check status?
awillia and I talked about this a bit, and I think we agree with your suggestion to restart. I've updated ProbeSecureDnsDomain to reset the check status and added a test.
#include "net/dns/canary_domain_service.h"Brian BegnocheMissing licence header comment.
Done
UMA_HISTOGRAM_ENUMERATION("Net.DNS.FallbackFromSecureTransactionPreferred",nit: Prefer base::UmaHistogramEnumeration().
https://chromium.googlesource.com/chromium/src.git/+/HEAD/tools/metrics/histograms/README.md#coding-emitting-to-histograms
Done
UMA_HISTOGRAM_ENUMERATION("Net.DNS.FallbackFromSecureTransactionPreferred",nit: Prefer base::UmaHistogramEnumeration().
https://chromium.googlesource.com/chromium/src.git/+/HEAD/tools/metrics/histograms/README.md#coding-emitting-to-histograms
Done
virtual std::unique_ptr<CanaryDomainService> CreateCanaryDomainService();Please add a description comment for virtual method that is considered as an interface to HostResolver.
Done
nit: Please add description comments.
Done
Why do we need make this change? Can't we use `task_environment()` and `service()` in tests?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
I've reviewed this change and overall the approach seems reasonable to me. I'll let Kenichi do the actual review though, so I'm moving myself from R= to CC=.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
base::RunLoop().RunUntilIdle();We shouldn't use RunUntilIdle() in new tests since it's one of major sources of flakiness.
Instead, we probably want to add SetSecureDnsProbeCompleteCallbackForTesting() to CanaryDomainService, in a way similar to https://source.chromium.org/chromium/chromium/src/+/main:net/http/http_stream_pool_test_util.cc;l=438;drc=21d7e84d240f65df7594e5f3526d63e4439dd356
(I don't think we need an extra PostTask in the callback for these tests)
config.dns_over_tls_hostname)))) {Q: Do we support DoT? I'm not aware of DoT support and I'm wondering why we need to check it.
enum class FallbackFromSecureTransactionPreferredReason {nit: We prefer putting enum declarations before methods/constructors.
https://google.github.io/styleguide/cppguide.html#Declaration_Order
TEST_F(DnsClientTest, FallbackFromSecureTransactionPreferred_Failures) {Does this mean the previous name was wrong?
DnsConfig config = ValidConfigWithDoh(false /* doh_only */);nit: `/*doh_only=*/false`. Ditto below.
// Note: Current implementation of ShouldPerformDohFallbackUpgrade has a bug
// that prevents fallback from being added if the initial upgrade results in
// an empty config. Updating test to match current behavior.I'm uncertain about this comment. Do we need to fix ShouldPerformDohFallbackUpgrade, or should we update this test?
Either way, I'm not very comfortable just modifying this test. We probably need to fix the bug first (or at least file an issue and put TODO comment with the issue number)?
class MockTransaction;nit: Put Forward class declaration after `public:`.
NOTREACHED();nit: Nice to have a comment similar to GetContextForTesting().
void SetResolveContextForTesting(ResolveContext* resolve_context) {
resolve_context_ = resolve_context;
}nit: We usually prefer putting `*ForTesting()` methods after non `*ForTesting()` methods.
void SetResolveContextForTesting(ResolveContext* resolve_context) {Might want to pass `base::SafeRef<ResolveContext>` directly since we call GetSafeRef() on this?
void set_doh_fallback_canary_domain_check_status(CanaryDomainCheckStatus status) {nit: Is this correctly formatted?
kNotStarted,Probably `kUnknown` might be better since we could reset the status to this value when OnDohServerUnavailable() is called?
enum class CanaryDomainCheckStatus {nit: Put enum declaration before methods/constructors (see other comment)
{{"canary_domain_host", "positive.test"}});Please separate tests for positive and negative cases. Having separate tests is useful which case actually works or fails when running tests.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
We shouldn't use RunUntilIdle() in new tests since it's one of major sources of flakiness.
Instead, we probably want to add SetSecureDnsProbeCompleteCallbackForTesting() to CanaryDomainService, in a way similar to https://source.chromium.org/chromium/chromium/src/+/main:net/http/http_stream_pool_test_util.cc;l=438;drc=21d7e84d240f65df7594e5f3526d63e4439dd356
(I don't think we need an extra PostTask in the callback for these tests)
Done
base::RunLoop().RunUntilIdle();Brian BegnocheDitto.
Done
Q: Do we support DoT? I'm not aware of DoT support and I'm wondering why we need to check it.
Apparently, yes:
I only see that that histogram is logged for Android, which is consistent with the comment there. I suspect that net/dns/dns_config_service_android.h is the only thing that sets this config value.
In any case, here I want to maintain that if any nameserver has already been copied to `doh_coonfig`, then this should return false.
Please re-open if I got something wrong, or if there is still follow up needed.
enum class FallbackFromSecureTransactionPreferredReason {nit: We prefer putting enum declarations before methods/constructors.
https://google.github.io/styleguide/cppguide.html#Declaration_Order
Done
TEST_F(DnsClientTest, FallbackFromSecureTransactionPreferred_Failures) {Does this mean the previous name was wrong?
Yes, actually. It is testing the `FallbackFromSecureTransactionPreferred` method.
The source of the confusion or perhaps how it was misnamed could be that things evolved. Maybe this other method used to exist. But what I suspect is that the term "fallback" is overloaded in a confusing way. It can be used for DoH fallback, which refers to using Google DNS when the system config DNS doesn't support DoH. In contrast, this `FallbackFromSecureTransactionPreferred` method uses "fallback" to mean not allowing Secure DNS even though the config specifies values to do so.
That is to say, yes, this method is testing behavior around "fallback" with the meaning of reverting back *from* configured Secure DNS to actually using insecure DNS.
DnsConfig config = ValidConfigWithDoh(false /* doh_only */);nit: `/*doh_only=*/false`. Ditto below.
Done
// Note: Current implementation of ShouldPerformDohFallbackUpgrade has a bug
// that prevents fallback from being added if the initial upgrade results in
// an empty config. Updating test to match current behavior.I'm uncertain about this comment. Do we need to fix ShouldPerformDohFallbackUpgrade, or should we update this test?
Either way, I'm not very comfortable just modifying this test. We probably need to fix the bug first (or at least file an issue and put TODO comment with the issue number)?
Ah. This was Gemini off the rails at one point, and I missed that this comment had leaked in. I thought I had undone that bad detour.
Note that this test wasn't actually changed meaningfully, and I've reverted these changes.
nit: Put Forward class declaration after `public:`.
Done
NOTREACHED();Brian Begnochenit: Nice to have a comment similar to GetContextForTesting().
Done
void SetResolveContextForTesting(ResolveContext* resolve_context) {
resolve_context_ = resolve_context;
}nit: We usually prefer putting `*ForTesting()` methods after non `*ForTesting()` methods.
Done.
I will note also that perhaps the `ForTesting` suffix is redundant. I'm not sure what the best practice is, but the entirety of a mock is for testing.
void SetResolveContextForTesting(ResolveContext* resolve_context) {Might want to pass `base::SafeRef<ResolveContext>` directly since we call GetSafeRef() on this?
I think that SafeRef doesn't make sense. `resolve_context_ would need to be set in the constructor, whereas this is only needed in combination with `CreateCanaryDomainService`.
I could see passing a `unique_ptr<ResolveContext>` instead and taking ownership. In the spots where this is used (network_context_unittest.cc), that would require the pattern of creating a unique pointer, getting a raw pointer, passing ownership, then accessing via the raw pointer. I'm disinclined to do that in the absence of more reason, but please do re-open if that is more appropriate.
Probably `kUnknown` might be better since we could reset the status to this value when OnDohServerUnavailable() is called?
I still think that `kNotStarted` makes more sense. Resetting the status is resetting back to a not started state, and this is a known state, both to start and after reset.
I could see a case for having both `kUnknown` and `kNotStarted`, if there was a need for when this enum isn't set to something meaningful, but I don't think that need actually exists in this case.
Well, I guess it makes sense for the case where the canary domain check isn't enabled at all. So OK, I've *added* `kUnknown`, but I've changed `CanaryDomainService::ProbeSecureDnsDomain` to set the status to `kStarted`. (I think it was incorrect that it set the status to `kNotStarted`.) See added comments there.
nit: Put enum declaration before methods/constructors (see other comment)
Done
Please separate tests for positive and negative cases. Having separate tests is useful which case actually works or fails when running tests.
Done
Please separate tests for positive and negative cases. Having separate tests is useful which case actually works or fails when running tests.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Looking good! Before signing off, I want to check the flakiness (https://ci.chromium.org/ui/p/chromium/builders/try/android-x64-rel/807223/overview) is a noise or it's actually flaky.
My experience is that the flakiness detector actually detects test flakiness.
// If the resolution completes synchronously (unlikely for uncached DNS),I'm not 100% sure it's unlikely that the resolution completes synchronously. For example, if the canary domain host is present in /etc/hosts (or equivalent), Start() could complete synchronously.
So I think it's good to remove the comment and add a test to cover the path.
base::RunLoop().RunUntilIdle();Why do we need this? Probably remove RunProbe() entirely?
bool FallbackFromSecureTransactionPreferred(awillia@: Does this change looks reasonable to you? Just want to confirm.
void SetResolveContextForTesting(ResolveContext* resolve_context) {
resolve_context_ = resolve_context;
}Brian Begnochenit: We usually prefer putting `*ForTesting()` methods after non `*ForTesting()` methods.
Done.
I will note also that perhaps the `ForTesting` suffix is redundant. I'm not sure what the best practice is, but the entirety of a mock is for testing.
Usually we start with `ForTesting` suffix if the method is only called from tests. We can remove the suffix if we use the method in prod code.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
// If the resolution completes synchronously (unlikely for uncached DNS),I'm not 100% sure it's unlikely that the resolution completes synchronously. For example, if the canary domain host is present in /etc/hosts (or equivalent), Start() could complete synchronously.
So I think it's good to remove the comment and add a test to cover the path.
I've parameterized CanaryDomainServiceTest so that every test case is tested with and without synchronous mode. Synchronous mode tests the case that hits line 127 here.
// If the resolution completes synchronously (unlikely for uncached DNS),I'm not 100% sure it's unlikely that the resolution completes synchronously. For example, if the canary domain host is present in /etc/hosts (or equivalent), Start() could complete synchronously.
So I think it's good to remove the comment and add a test to cover the path.
Done
Why do we need this? Probably remove RunProbe() entirely?
Yes, removed. I just forgot to remove it.
void SetResolveContextForTesting(ResolveContext* resolve_context) {
resolve_context_ = resolve_context;
}Brian Begnochenit: We usually prefer putting `*ForTesting()` methods after non `*ForTesting()` methods.
Kenichi IshibashiDone.
I will note also that perhaps the `ForTesting` suffix is redundant. I'm not sure what the best practice is, but the entirety of a mock is for testing.
Usually we start with `ForTesting` suffix if the method is only called from tests. We can remove the suffix if we use the method in prod code.
Ack. This is my understanding too.
My point is that a mock is never called in prod.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Looking good! Before signing off, I want to check the flakiness (https://ci.chromium.org/ui/p/chromium/builders/try/android-x64-rel/807223/overview) is a noise or it's actually flaky.
My experience is that the flakiness detector actually detects test flakiness.
FYI that appears to be an overall flakiness issue with net_unittests. I see loads of tests in log search that have this failure occasionally.
[0219/181837.682407:FATAL:net/test/net_test_suite.cc:56] DCHECK failed: !g_current_net_test_suite.
I'm rerunning commit queue (dry) and rerunning checks, and I will continue to look at this if it persists.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
lgtm on my side (except for comment on FallbackFromSecureTransactionPreferred), thanks!
void SetOnProbeCompleteCallbackForTesting(base::RepeatingClosure callback);nit: Sorry I missed this before, but why is this RepeatingClosure instead of OnceClosure? We prefer OnceClosure if the callback is called only once.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
I think this looks good overall, but leaving some remaining feedback, questions, and comments. Thanks Brian, so close!
* `DnsClient::FallbackFromSecureTransactionPreferred` is updated so that both at least one available DoH server *and* a positive canary domain result are needed to allow Secure DNS (return false).nit: wrap long lines (Gerrit can do this automatically if you click Edit -> Format) and also update this to say something like "are needed to allow Secure DNS to the fallback provider)"
// When enabled, upon reading a new DNS config that enables Secure DNS, thenit: maybe update this to reflect that the canary domain check will be sent when probes are sent
params.cache_usage =optional: it might be useful to have a test demonstrating that the canary domain checks aren't cached / get re-issued when the network changes.
const HostResolver::ResolveHostRequest* request,nit: can use a const reference here instead and then remove the `DCHECK(request)`
https://google.github.io/styleguide/cppguide.html#Inputs_and_Outputs
// Reset any previous result. Secure DNS should only be allowed if a new probenit: "Secure DNS to the fallback provider should only be allowed"
std::move(host_port_pair), NetworkAnonymizationKey::CreateTransient(),nit: this version of CreateRequest takes a const reference for the first parameter, so no need to std::move here. Since we only ever create a HostPortPair with this member, we could create the HostPortPair in the constructor and store that instead.
weak_ptr_factory_.GetWeakPtr()));since we own pending_probe_request_ we can use base::Unretained(this) here instead of needing to use a weak pointer (and could get rid of this classes WeakPtrFactory then, I think)
if (!probe_request) {Is this case still possible? it seems like it isn't, because if pending_probe_request_ is destroyed then the provided callback won't be called
If this case is possible we may want to consider calling set_doh_fallback_canary_domain_check_status so the status doesn't remain kStarted, and ideally we'd have a test.
if (net_error != net::ERR_ABORTED) {is ERR_ABORTED possible here? There was a comment elsewhere that "HostResolver squashes ERR_ABORTED to ERR_NAME_NOT_RESOLVED.". If it is possible, it seems like it'd still be better to call set_doh_fallback_canary_domain_check_status so that the status isn't kStarted?
base::StrCat({"Net.DNS.CanaryDomainService.SecureDns.Result"}), result);can remove base::StrCat here (and above if we keep that code)
if (on_probe_complete_callback_for_testing_) {If we keep the if `if (!probe_request)` block above we may want to consider whether this should be called before the return there for consistency given the name
nit: remove new line since these are all part of the same interface
// conditions. See SetSecureDnsAllowedByCanaryDomain().nit: update `SetSecureDnsAllowedByCanaryDomain` to the new name
// LINT.IfChange(FallbackFromSecureTransactionPreferredReason)nit: we should add the header comment about these being logged to histograms:
config.dns_over_tls_hostname)))) {Brian BegnocheQ: Do we support DoT? I'm not aware of DoT support and I'm wondering why we need to check it.
Apparently, yes:
I only see that that histogram is logged for Android, which is consistent with the comment there. I suspect that net/dns/dns_config_service_android.h is the only thing that sets this config value.
In any case, here I want to maintain that if any nameserver has already been copied to `doh_coonfig`, then this should return false.
Please re-open if I got something wrong, or if there is still follow up needed.
We don't support use of DoT, but for Android we do support checking whether the system is configured to use DoT and if so, we use the DoT server hostname to determine whether we have a corresponding entry in our DoH provider list so we can use DoH. I looked into how this worked a few years ago and IIRC it was related to this: https://developer.android.com/reference/android/net/LinkProperties.html#isPrivateDnsActive()
config.dns_over_tls_hostname)))) {Looking again at how UpdateConfigForDohUpgrade is implemented, we don't attempt the fallback at all unless config.dns_over_tls_hostname is empty...
If we keep this separate logic instead of using a boolean, I think it'd be better for consistency and simplicity if we replace `config.doh_config == DnsOverHttpsConfig(GetDohUpgradeServersFromDotHostname(config.dns_over_tls_hostname))` with a `if (!config->dns_over_tls_hostname.empty()) { return false }` check instead
config.dns_over_tls_hostname)))) {Following up on the conversation from: https://chromium-review.googlesource.com/c/chromium/src/+/7567908/comment/b954f680_c90ed469/
I can see your point about not wanting the DnsConfig to contain metadata about how it was constructed, but if we added a new boolean named something like `should_perform_doh_fallback_upgrade` (and set it to true when doh_config corresponds to the fallback provider DoH config) I think it would feel more like a configuration option than metadata, and then we could use that in FallbackFromSecureTransactionPreferred more simply instead.
I don't think I realized exactly how complex the logic in UpdateConfigForDohUpgrade is and that we'd have to replicate each piece of that in this function as well. I think having to mirror the two is more error prone and possible to break over time than just using the boolean approach. Something to consider (I know, I know, thinking through how best to implement all this has already required way too much brainpower!)
awillia@: Does this change looks reasonable to you? Just want to confirm.
yeah this new logic LGTM, with some thoughts in other comments about whether using a boolean in DnsConfig instead of ShouldPerformDohFallbackUpgrade might be worthwhile. thanks!
const std::vector<base::WeakPtr<MockTransaction>>& delayed_transactions()Are the changes to this file still needed? I don't see where delayed_transactions() is called
std::unique_ptr<CanaryDomainService> CreateCanaryDomainService() override;nit: move into the "// HostResolver methods:" section above
TEST_F(NetworkContextTest, CanaryDomainServiceProbe_PositiveResult) {optional: could combine this and the test below into one parameterized test (actually I've found Gemini pretty good at doing this type of thing)
net::ResolveContext resolve_context(nullptr, false);nit:
```suggestion
net::ResolveContext resolve_context(/*url_request_context=*/nullptr, /*enable_caching=*/false);
```
here and below
auto* mock_resolver = static_cast<net::MockHostResolverBase*>(Do we need these four lines (and resolve_context created above) for this test?
<!-- LINT.ThenChange(/net/dns/canary_domain_service.cc:CanaryDomainResult) -->nit (thanks Gemini):
```suggestion
<!-- LINT.ThenChange(/net/dns/canary_domain_service.h:CanaryDomainResult) -->
```