Brian BegnocheI will close this once I've added tests to host_resolver_manager_unittest.cc.
Done
void CanaryDomainService::OnProbeComplete(const std::string& probe_name,Brian BegnocheWe don't need this since `pending_probe_` has it?
Brian BegnocheI did this to be able to log the case where `std::move(pending_probe_)` returns null. What do you think? Unnecessary?
In any case, this is moot now that I've made this callback specifically for Secure DNS probe to match.
config->doh_config =Kenichi IshibashiI was originally wondering whether the canary domain check and the test DoH probes would need to happen in that order. I had convinced myself that they didn't need to be ordered and that once both succeeded we'd upgrade queries that happen after. After thinking about it again, though, it seems like before the test DoH probes run we need `config->doh_config` to be set to the DoH config we should send probes for (which wouldn't be the case in this if block)... I'm not sure what the right thing to do here is
Brian BegnocheIt seems we should have a state (enum class) instead of having a boolean flag to distinguish the states? Probably kUnknown (checking), kPositive, kNegative.
Andrew and I talked more about this timing thing. Our conclusion is that timing is OK for two reasons:
1. Because the code here doesn't allow fallback upgrade until the canary domain check finishes successfully, the prober when fallback is needed won't occur until that happens and a system config update triggers the probe.
2. `HostResolverManager::PushDnsTasks` won't push the `SECURE_DNS` task until the following two things both happen:
(a) The canary domain check is positive and retriggers config update here to allow the fallback upgrade.
(b) The DoH prober doesn't record enough failures to make `DnsClient::FallbackFromSecureTransactionPreferred` start returning false.
I've also updated the histogram to log an enum.
config->secure_dns_mode = SecureDnsMode::kOff;Brian BegnocheI don't think we should change secure_dns_mode, can we not just avoid modifying doh_config in the case when the canary domain indicates we should fallback to Google DNS? That's how the code works today
Kenichi IshibashiI can see both ways. IMO it makes sense to clearly categorize this as a case where Secure DNS is turned off because fallback is not possible.
I'd like to hear thoughts from net owners.
Brian BegnocheI'm wondering if we should add kDisabledByCanaryDomainCheck to categorize precisely.
Andrew and I talked about this. I'm happy to sync with you on this if needed, Kenichi.
The gist is that we want to remain consistent with the way the Doh prober works today. That prober only prevents pushing the `SECURE_DNS` task for DNS lookups. The way this happens is that `DnsClient::FallbackFromSecureTransactionPreferred` returns false when the prober has failed, and the `SECURE_DNS` task is not allowed.
The failed probe doesn't set `secure_dns_mode` to OFF, so we won't do that here either. I've added mention of this in the design doc too.
doh_fallback_allowed_by_canary_domain_ = allowed;Brian BegnocheI wonder if the more consistent way of applying this change would be to have an entry in DnsConfigOverrides for this boolean and setting that here. Then we don't need the `doh_fallback_allowed_by_canary_domain_ == allowed` check, because IIUC that's handled by UpdateDnsConfig
Kenichi IshibashiI thought about this too. The reason I didn't do that was that I didn't want `SetConfigOverrides` to clobber this. So it'd need to do something to persist this value after updating the overrides. It's like an override of overrides, so kind of odd.
Alternatively, it could be a second DnsConfigOverrides that just stores this override, but then `UpdateDnsConfig` needs to be updated regardless. I guess there is an advantage to this in that it would be a canned `DnsConfigOverrides` that can (1) set this value (2) override `secure_dns_mode` (3) clear `doh_config.servers`.
I think I would prefer the latter approach, if anything. What are your thoughts?
Brian BegnocheI had the same question. It's a good point we don't want to clobber canary domain check. Regardless how we handle it, could you document the requirement?
Brian BegnocheI still need to act on this comment on way or the other. I'm hesitant to introduce a precedent for overrides that aren't overridable, or something novel like that, so I'm still a bit inclined to keep this as is.
Ack to bashi to document whatever we end up with.
I'd appreciate hearing any strong opinions against leaving it like this.
I added discussion on why `SetDohFallbackAllowedByCanaryDomain` is added to record and retain the canary domain result.
Please comment where you feel that comments in code are not inadequate.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
params.dns_query_type = DnsQueryType::UNSPECIFIED; // both A and AAAABrian BegnocheWe don't have to set this since `UNSPECIFIED` is the default initial value (also UNSPECIFIED may query HTTPS).
Kenichi IshibashiI removed this.
About UNSPECIFIED possibly using HTTPS, does this present an issue with being consistent with the canary domain check for prefetch proxy? It specifically mentions just A and AAAA, so I'm not sure whether allowing anything else makes sense.
https://developer.chrome.com/docs/privacy-security/private-prefetch-proxy-for-network-admins#details_of_the_signaling_mechanism?
I just realized we wouldn't query HTTPS in this check since we use HostPortPair to call HostResolver::CreateRequest(). We only query HTTPS when scheme (`https://`) is specified.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
public net::NetworkChangeNotifier::DNSObserver {I think we're missing the NetworkChangeNotifier::AddDNSObserver / NetworkChangeNotifier::RemoveDNSObserver calls needed to hook this up
MaybeProbeSecureDnsCanaryDomain();Should we call `host_resolver_manager()->SetDohFallbackAllowedByCanaryDomain(false)` here so that we don't attempt to use secure DNS until the new canary domain check comes back?
I'm not sure whether OnDNSChanged will trigger before the state of the probes change (triggering resending the probes), though, so if not it seems unfortunate that we might:
- Received a system DNS change
- Start the probes
- This method gets called and we update the DnsConfig to indicate that the canary domain check hasn't succeeded
- Cancel the DoH probes (I think?)
- Do the canary domain check and update the DnsConfig if they pass
- Restart the DoH probes
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
bashi's feedback is to incorporate awillia's proposal (crrev.com/c/7567908) to shift to CanaryDomainService ownership by NetworkContext. I have done that.
if (base::FeatureList::IsEnabled(I have added this here, which means that the canary domain check won't start until after the DoH probe delay. Basically they start together.
I pondered and concluded that this is reasonable. I've become more and more convinced that the more that these two are consistent and work in concert, the better.
If we don't want to delay on the canary domain check, then we could add an `NetworkContext::ActivateCanaryDomainCheck` method that just does this added bit, and then `NetworkService::ActivateAllDohProbes` would call this new method to avoid the timer delay.
I think we're missing the NetworkChangeNotifier::AddDNSObserver / NetworkChangeNotifier::RemoveDNSObserver calls needed to hook this up
You're right. I did this and didn't upload. This is a moot point now after moving `CanaryDomainService` ownership to `NetworkContext`.
Should we call `host_resolver_manager()->SetDohFallbackAllowedByCanaryDomain(false)` here so that we don't attempt to use secure DNS until the new canary domain check comes back?
I'm not sure whether OnDNSChanged will trigger before the state of the probes change (triggering resending the probes), though, so if not it seems unfortunate that we might:
- Received a system DNS change
- Start the probes
- This method gets called and we update the DnsConfig to indicate that the canary domain check hasn't succeeded
- Cancel the DoH probes (I think?)
- Do the canary domain check and update the DnsConfig if they pass
- Restart the DoH probes
This is a moot point now that `CanaryDomainService` creation and triggering the canary domain probe are moved to `NetworkContext::ActivateDohProbes`.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
(Still haven't looked into tests)
base::WeakPtr<ResolveContext> resolve_context_;
// The HostResolver that this service uses to perform probes.
base::WeakPtr<HostResolver> host_resolver_;What's the expected life time relation between `resolve_context_`, `host_resolver_` and this service? IIUC NetworkContext owns ResolveContext (via UrlRequestContext, ContextHostResolver). If ResolveContext/HostResolver are expected to outlive `this`, probably using base::SafeRef would be better.
// Service for probing canary domains. It is owned by a HostResolverManager.NetworkContext?
HostResolver::ResolveHostParameters::CacheUsage::DISALLOWED;nit: Could you add a comment why we disable secure DNS for this probe (it may not be obvious if the reader is not very familiar with canary domain check)?
NetLogSourceType::CANARY_DOMAIN_SERVICE)) {}Add `CHECK(base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain))` ?
if (resolve_context_) {Should we CHECK(resolve_context_) instead?
resolve_context_->RegisterDohStatusObserver(this);optional: Does this mean Start() should be called only once. Should we somehow ensure the precondition?
if (base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain)) {Should this be `CHECK(base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain))`?
If we add CHECK in the ctor, the check would be redundant.
resolve_context_->canary_domain_check_status() ==
ResolveContext::CanaryDomainCheckStatus::kNotStarted) {awillia@: Does this look reasonable (I guess the answer is yes)? Or should we always (re)start probe? If so, should we reset the check status?
#include "net/dns/canary_domain_service.h"Missing licence header comment.
UMA_HISTOGRAM_ENUMERATION("Net.DNS.FallbackFromSecureTransactionPreferred",nit: Prefer base::UmaHistogramEnumeration().
https://chromium.googlesource.com/chromium/src.git/+/HEAD/tools/metrics/histograms/README.md#coding-emitting-to-histograms
virtual std::unique_ptr<CanaryDomainService> CreateCanaryDomainService();Please add a description comment for virtual method that is considered as an interface to HostResolver.
enum class CanaryDomainCheckStatus {nit: Please add description comments.
protected:Why do we need make this change? Can't we use `task_environment()` and `service()` in tests?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
base::WeakPtr<ResolveContext> resolve_context_;
// The HostResolver that this service uses to perform probes.
base::WeakPtr<HostResolver> host_resolver_;What's the expected life time relation between `resolve_context_`, `host_resolver_` and this service? IIUC NetworkContext owns ResolveContext (via UrlRequestContext, ContextHostResolver). If ResolveContext/HostResolver are expected to outlive `this`, probably using base::SafeRef would be better.
I've changed to SafeRef and added documentation on these, as well as on the constructor above and HostResolver::CreateCanaryDomainService to clarify lifetime expectations.
// Service for probing canary domains. It is owned by a HostResolverManager.Brian BegnocheNetworkContext?
Done
HostResolver::ResolveHostParameters::CacheUsage::DISALLOWED;nit: Could you add a comment why we disable secure DNS for this probe (it may not be obvious if the reader is not very familiar with canary domain check)?
Done
Add `CHECK(base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain))` ?
Done. Also checked for canary domain. I added a comment in .h file to specify that this feature needs to be enabled.
if (resolve_context_) {Brian BegnocheShould we CHECK(resolve_context_) instead?
This happens now implicitly since it is a SafeRef.
resolve_context_->RegisterDohStatusObserver(this);optional: Does this mean Start() should be called only once. Should we somehow ensure the precondition?
You're right. `RegisterDohStatusObserver` will enforce this implicitly by failing on a check that the passed in observer isn't already registered.
I'm going to assume that this is sufficient and not needing additional, more explicit enforcement. I added a comment on this method in the .h file to call this out.
if (base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain)) {Should this be `CHECK(base::FeatureList::IsEnabled(features::kProbeSecureDnsCanaryDomain))`?
If we add CHECK in the ctor, the check would be redundant.
CHECK is added to the constructor. Removed this check here.
resolve_context_->canary_domain_check_status() ==
ResolveContext::CanaryDomainCheckStatus::kNotStarted) {awillia@: Does this look reasonable (I guess the answer is yes)? Or should we always (re)start probe? If so, should we reset the check status?
awillia and I talked about this a bit, and I think we agree with your suggestion to restart. I've updated ProbeSecureDnsDomain to reset the check status and added a test.
#include "net/dns/canary_domain_service.h"Brian BegnocheMissing licence header comment.
Done
UMA_HISTOGRAM_ENUMERATION("Net.DNS.FallbackFromSecureTransactionPreferred",nit: Prefer base::UmaHistogramEnumeration().
https://chromium.googlesource.com/chromium/src.git/+/HEAD/tools/metrics/histograms/README.md#coding-emitting-to-histograms
Done
UMA_HISTOGRAM_ENUMERATION("Net.DNS.FallbackFromSecureTransactionPreferred",nit: Prefer base::UmaHistogramEnumeration().
https://chromium.googlesource.com/chromium/src.git/+/HEAD/tools/metrics/histograms/README.md#coding-emitting-to-histograms
Done
virtual std::unique_ptr<CanaryDomainService> CreateCanaryDomainService();Please add a description comment for virtual method that is considered as an interface to HostResolver.
Done
nit: Please add description comments.
Done
Why do we need make this change? Can't we use `task_environment()` and `service()` in tests?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
I've reviewed this change and overall the approach seems reasonable to me. I'll let Kenichi do the actual review though, so I'm moving myself from R= to CC=.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
base::RunLoop().RunUntilIdle();We shouldn't use RunUntilIdle() in new tests since it's one of major sources of flakiness.
Instead, we probably want to add SetSecureDnsProbeCompleteCallbackForTesting() to CanaryDomainService, in a way similar to https://source.chromium.org/chromium/chromium/src/+/main:net/http/http_stream_pool_test_util.cc;l=438;drc=21d7e84d240f65df7594e5f3526d63e4439dd356
(I don't think we need an extra PostTask in the callback for these tests)
config.dns_over_tls_hostname)))) {Q: Do we support DoT? I'm not aware of DoT support and I'm wondering why we need to check it.
enum class FallbackFromSecureTransactionPreferredReason {nit: We prefer putting enum declarations before methods/constructors.
https://google.github.io/styleguide/cppguide.html#Declaration_Order
TEST_F(DnsClientTest, FallbackFromSecureTransactionPreferred_Failures) {Does this mean the previous name was wrong?
DnsConfig config = ValidConfigWithDoh(false /* doh_only */);nit: `/*doh_only=*/false`. Ditto below.
// Note: Current implementation of ShouldPerformDohFallbackUpgrade has a bug
// that prevents fallback from being added if the initial upgrade results in
// an empty config. Updating test to match current behavior.I'm uncertain about this comment. Do we need to fix ShouldPerformDohFallbackUpgrade, or should we update this test?
Either way, I'm not very comfortable just modifying this test. We probably need to fix the bug first (or at least file an issue and put TODO comment with the issue number)?
class MockTransaction;nit: Put Forward class declaration after `public:`.
NOTREACHED();nit: Nice to have a comment similar to GetContextForTesting().
void SetResolveContextForTesting(ResolveContext* resolve_context) {
resolve_context_ = resolve_context;
}nit: We usually prefer putting `*ForTesting()` methods after non `*ForTesting()` methods.
void SetResolveContextForTesting(ResolveContext* resolve_context) {Might want to pass `base::SafeRef<ResolveContext>` directly since we call GetSafeRef() on this?
void set_doh_fallback_canary_domain_check_status(CanaryDomainCheckStatus status) {nit: Is this correctly formatted?
kNotStarted,Probably `kUnknown` might be better since we could reset the status to this value when OnDohServerUnavailable() is called?
enum class CanaryDomainCheckStatus {nit: Put enum declaration before methods/constructors (see other comment)
{{"canary_domain_host", "positive.test"}});Please separate tests for positive and negative cases. Having separate tests is useful which case actually works or fails when running tests.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
We shouldn't use RunUntilIdle() in new tests since it's one of major sources of flakiness.
Instead, we probably want to add SetSecureDnsProbeCompleteCallbackForTesting() to CanaryDomainService, in a way similar to https://source.chromium.org/chromium/chromium/src/+/main:net/http/http_stream_pool_test_util.cc;l=438;drc=21d7e84d240f65df7594e5f3526d63e4439dd356
(I don't think we need an extra PostTask in the callback for these tests)
Done
base::RunLoop().RunUntilIdle();Brian BegnocheDitto.
Done
Q: Do we support DoT? I'm not aware of DoT support and I'm wondering why we need to check it.
Apparently, yes:
I only see that that histogram is logged for Android, which is consistent with the comment there. I suspect that net/dns/dns_config_service_android.h is the only thing that sets this config value.
In any case, here I want to maintain that if any nameserver has already been copied to `doh_coonfig`, then this should return false.
Please re-open if I got something wrong, or if there is still follow up needed.
enum class FallbackFromSecureTransactionPreferredReason {nit: We prefer putting enum declarations before methods/constructors.
https://google.github.io/styleguide/cppguide.html#Declaration_Order
Done
TEST_F(DnsClientTest, FallbackFromSecureTransactionPreferred_Failures) {Does this mean the previous name was wrong?
Yes, actually. It is testing the `FallbackFromSecureTransactionPreferred` method.
The source of the confusion or perhaps how it was misnamed could be that things evolved. Maybe this other method used to exist. But what I suspect is that the term "fallback" is overloaded in a confusing way. It can be used for DoH fallback, which refers to using Google DNS when the system config DNS doesn't support DoH. In contrast, this `FallbackFromSecureTransactionPreferred` method uses "fallback" to mean not allowing Secure DNS even though the config specifies values to do so.
That is to say, yes, this method is testing behavior around "fallback" with the meaning of reverting back *from* configured Secure DNS to actually using insecure DNS.
DnsConfig config = ValidConfigWithDoh(false /* doh_only */);nit: `/*doh_only=*/false`. Ditto below.
Done
// Note: Current implementation of ShouldPerformDohFallbackUpgrade has a bug
// that prevents fallback from being added if the initial upgrade results in
// an empty config. Updating test to match current behavior.I'm uncertain about this comment. Do we need to fix ShouldPerformDohFallbackUpgrade, or should we update this test?
Either way, I'm not very comfortable just modifying this test. We probably need to fix the bug first (or at least file an issue and put TODO comment with the issue number)?
Ah. This was Gemini off the rails at one point, and I missed that this comment had leaked in. I thought I had undone that bad detour.
Note that this test wasn't actually changed meaningfully, and I've reverted these changes.
nit: Put Forward class declaration after `public:`.
Done
NOTREACHED();Brian Begnochenit: Nice to have a comment similar to GetContextForTesting().
Done
void SetResolveContextForTesting(ResolveContext* resolve_context) {
resolve_context_ = resolve_context;
}nit: We usually prefer putting `*ForTesting()` methods after non `*ForTesting()` methods.
Done.
I will note also that perhaps the `ForTesting` suffix is redundant. I'm not sure what the best practice is, but the entirety of a mock is for testing.
void SetResolveContextForTesting(ResolveContext* resolve_context) {Might want to pass `base::SafeRef<ResolveContext>` directly since we call GetSafeRef() on this?
I think that SafeRef doesn't make sense. `resolve_context_ would need to be set in the constructor, whereas this is only needed in combination with `CreateCanaryDomainService`.
I could see passing a `unique_ptr<ResolveContext>` instead and taking ownership. In the spots where this is used (network_context_unittest.cc), that would require the pattern of creating a unique pointer, getting a raw pointer, passing ownership, then accessing via the raw pointer. I'm disinclined to do that in the absence of more reason, but please do re-open if that is more appropriate.
Probably `kUnknown` might be better since we could reset the status to this value when OnDohServerUnavailable() is called?
I still think that `kNotStarted` makes more sense. Resetting the status is resetting back to a not started state, and this is a known state, both to start and after reset.
I could see a case for having both `kUnknown` and `kNotStarted`, if there was a need for when this enum isn't set to something meaningful, but I don't think that need actually exists in this case.
Well, I guess it makes sense for the case where the canary domain check isn't enabled at all. So OK, I've *added* `kUnknown`, but I've changed `CanaryDomainService::ProbeSecureDnsDomain` to set the status to `kStarted`. (I think it was incorrect that it set the status to `kNotStarted`.) See added comments there.
nit: Put enum declaration before methods/constructors (see other comment)
Done
Please separate tests for positive and negative cases. Having separate tests is useful which case actually works or fails when running tests.
Done
Please separate tests for positive and negative cases. Having separate tests is useful which case actually works or fails when running tests.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Looking good! Before signing off, I want to check the flakiness (https://ci.chromium.org/ui/p/chromium/builders/try/android-x64-rel/807223/overview) is a noise or it's actually flaky.
My experience is that the flakiness detector actually detects test flakiness.
// If the resolution completes synchronously (unlikely for uncached DNS),I'm not 100% sure it's unlikely that the resolution completes synchronously. For example, if the canary domain host is present in /etc/hosts (or equivalent), Start() could complete synchronously.
So I think it's good to remove the comment and add a test to cover the path.
base::RunLoop().RunUntilIdle();Why do we need this? Probably remove RunProbe() entirely?
bool FallbackFromSecureTransactionPreferred(awillia@: Does this change looks reasonable to you? Just want to confirm.
void SetResolveContextForTesting(ResolveContext* resolve_context) {
resolve_context_ = resolve_context;
}Brian Begnochenit: We usually prefer putting `*ForTesting()` methods after non `*ForTesting()` methods.
Done.
I will note also that perhaps the `ForTesting` suffix is redundant. I'm not sure what the best practice is, but the entirety of a mock is for testing.
Usually we start with `ForTesting` suffix if the method is only called from tests. We can remove the suffix if we use the method in prod code.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
// If the resolution completes synchronously (unlikely for uncached DNS),I'm not 100% sure it's unlikely that the resolution completes synchronously. For example, if the canary domain host is present in /etc/hosts (or equivalent), Start() could complete synchronously.
So I think it's good to remove the comment and add a test to cover the path.
I've parameterized CanaryDomainServiceTest so that every test case is tested with and without synchronous mode. Synchronous mode tests the case that hits line 127 here.
// If the resolution completes synchronously (unlikely for uncached DNS),I'm not 100% sure it's unlikely that the resolution completes synchronously. For example, if the canary domain host is present in /etc/hosts (or equivalent), Start() could complete synchronously.
So I think it's good to remove the comment and add a test to cover the path.
Done
Why do we need this? Probably remove RunProbe() entirely?
Yes, removed. I just forgot to remove it.
void SetResolveContextForTesting(ResolveContext* resolve_context) {
resolve_context_ = resolve_context;
}Brian Begnochenit: We usually prefer putting `*ForTesting()` methods after non `*ForTesting()` methods.
Kenichi IshibashiDone.
I will note also that perhaps the `ForTesting` suffix is redundant. I'm not sure what the best practice is, but the entirety of a mock is for testing.
Usually we start with `ForTesting` suffix if the method is only called from tests. We can remove the suffix if we use the method in prod code.
Ack. This is my understanding too.
My point is that a mock is never called in prod.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Looking good! Before signing off, I want to check the flakiness (https://ci.chromium.org/ui/p/chromium/builders/try/android-x64-rel/807223/overview) is a noise or it's actually flaky.
My experience is that the flakiness detector actually detects test flakiness.
FYI that appears to be an overall flakiness issue with net_unittests. I see loads of tests in log search that have this failure occasionally.
[0219/181837.682407:FATAL:net/test/net_test_suite.cc:56] DCHECK failed: !g_current_net_test_suite.
I'm rerunning commit queue (dry) and rerunning checks, and I will continue to look at this if it persists.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
lgtm on my side (except for comment on FallbackFromSecureTransactionPreferred), thanks!
void SetOnProbeCompleteCallbackForTesting(base::RepeatingClosure callback);nit: Sorry I missed this before, but why is this RepeatingClosure instead of OnceClosure? We prefer OnceClosure if the callback is called only once.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
I think this looks good overall, but leaving some remaining feedback, questions, and comments. Thanks Brian, so close!
* `DnsClient::FallbackFromSecureTransactionPreferred` is updated so that both at least one available DoH server *and* a positive canary domain result are needed to allow Secure DNS (return false).nit: wrap long lines (Gerrit can do this automatically if you click Edit -> Format) and also update this to say something like "are needed to allow Secure DNS to the fallback provider)"
// When enabled, upon reading a new DNS config that enables Secure DNS, thenit: maybe update this to reflect that the canary domain check will be sent when probes are sent
params.cache_usage =optional: it might be useful to have a test demonstrating that the canary domain checks aren't cached / get re-issued when the network changes.
const HostResolver::ResolveHostRequest* request,nit: can use a const reference here instead and then remove the `DCHECK(request)`
https://google.github.io/styleguide/cppguide.html#Inputs_and_Outputs
// Reset any previous result. Secure DNS should only be allowed if a new probenit: "Secure DNS to the fallback provider should only be allowed"
std::move(host_port_pair), NetworkAnonymizationKey::CreateTransient(),nit: this version of CreateRequest takes a const reference for the first parameter, so no need to std::move here. Since we only ever create a HostPortPair with this member, we could create the HostPortPair in the constructor and store that instead.
weak_ptr_factory_.GetWeakPtr()));since we own pending_probe_request_ we can use base::Unretained(this) here instead of needing to use a weak pointer (and could get rid of this classes WeakPtrFactory then, I think)
if (!probe_request) {Is this case still possible? it seems like it isn't, because if pending_probe_request_ is destroyed then the provided callback won't be called
If this case is possible we may want to consider calling set_doh_fallback_canary_domain_check_status so the status doesn't remain kStarted, and ideally we'd have a test.
if (net_error != net::ERR_ABORTED) {is ERR_ABORTED possible here? There was a comment elsewhere that "HostResolver squashes ERR_ABORTED to ERR_NAME_NOT_RESOLVED.". If it is possible, it seems like it'd still be better to call set_doh_fallback_canary_domain_check_status so that the status isn't kStarted?
base::StrCat({"Net.DNS.CanaryDomainService.SecureDns.Result"}), result);can remove base::StrCat here (and above if we keep that code)
if (on_probe_complete_callback_for_testing_) {If we keep the if `if (!probe_request)` block above we may want to consider whether this should be called before the return there for consistency given the name
nit: remove new line since these are all part of the same interface
// conditions. See SetSecureDnsAllowedByCanaryDomain().nit: update `SetSecureDnsAllowedByCanaryDomain` to the new name
// LINT.IfChange(FallbackFromSecureTransactionPreferredReason)nit: we should add the header comment about these being logged to histograms:
config.dns_over_tls_hostname)))) {Brian BegnocheQ: Do we support DoT? I'm not aware of DoT support and I'm wondering why we need to check it.
Apparently, yes:
I only see that that histogram is logged for Android, which is consistent with the comment there. I suspect that net/dns/dns_config_service_android.h is the only thing that sets this config value.
In any case, here I want to maintain that if any nameserver has already been copied to `doh_coonfig`, then this should return false.
Please re-open if I got something wrong, or if there is still follow up needed.
We don't support use of DoT, but for Android we do support checking whether the system is configured to use DoT and if so, we use the DoT server hostname to determine whether we have a corresponding entry in our DoH provider list so we can use DoH. I looked into how this worked a few years ago and IIRC it was related to this: https://developer.android.com/reference/android/net/LinkProperties.html#isPrivateDnsActive()
config.dns_over_tls_hostname)))) {Looking again at how UpdateConfigForDohUpgrade is implemented, we don't attempt the fallback at all unless config.dns_over_tls_hostname is empty...
If we keep this separate logic instead of using a boolean, I think it'd be better for consistency and simplicity if we replace `config.doh_config == DnsOverHttpsConfig(GetDohUpgradeServersFromDotHostname(config.dns_over_tls_hostname))` with a `if (!config->dns_over_tls_hostname.empty()) { return false }` check instead
config.dns_over_tls_hostname)))) {Following up on the conversation from: https://chromium-review.googlesource.com/c/chromium/src/+/7567908/comment/b954f680_c90ed469/
I can see your point about not wanting the DnsConfig to contain metadata about how it was constructed, but if we added a new boolean named something like `should_perform_doh_fallback_upgrade` (and set it to true when doh_config corresponds to the fallback provider DoH config) I think it would feel more like a configuration option than metadata, and then we could use that in FallbackFromSecureTransactionPreferred more simply instead.
I don't think I realized exactly how complex the logic in UpdateConfigForDohUpgrade is and that we'd have to replicate each piece of that in this function as well. I think having to mirror the two is more error prone and possible to break over time than just using the boolean approach. Something to consider (I know, I know, thinking through how best to implement all this has already required way too much brainpower!)
awillia@: Does this change looks reasonable to you? Just want to confirm.
yeah this new logic LGTM, with some thoughts in other comments about whether using a boolean in DnsConfig instead of ShouldPerformDohFallbackUpgrade might be worthwhile. thanks!
const std::vector<base::WeakPtr<MockTransaction>>& delayed_transactions()Are the changes to this file still needed? I don't see where delayed_transactions() is called
std::unique_ptr<CanaryDomainService> CreateCanaryDomainService() override;nit: move into the "// HostResolver methods:" section above
TEST_F(NetworkContextTest, CanaryDomainServiceProbe_PositiveResult) {optional: could combine this and the test below into one parameterized test (actually I've found Gemini pretty good at doing this type of thing)
net::ResolveContext resolve_context(nullptr, false);nit:
```suggestion
net::ResolveContext resolve_context(/*url_request_context=*/nullptr, /*enable_caching=*/false);
```
here and below
auto* mock_resolver = static_cast<net::MockHostResolverBase*>(Do we need these four lines (and resolve_context created above) for this test?
<!-- LINT.ThenChange(/net/dns/canary_domain_service.cc:CanaryDomainResult) -->nit (thanks Gemini):
```suggestion
<!-- LINT.ThenChange(/net/dns/canary_domain_service.h:CanaryDomainResult) -->
```
* `DnsClient::FallbackFromSecureTransactionPreferred` is updated so that both at least one available DoH server *and* a positive canary domain result are needed to allow Secure DNS (return false).nit: wrap long lines (Gerrit can do this automatically if you click Edit -> Format) and also update this to say something like "are needed to allow Secure DNS to the fallback provider)"
Done
// When enabled, upon reading a new DNS config that enables Secure DNS, thenit: maybe update this to reflect that the canary domain check will be sent when probes are sent
Done
void SetOnProbeCompleteCallbackForTesting(base::RepeatingClosure callback);nit: Sorry I missed this before, but why is this RepeatingClosure instead of OnceClosure? We prefer OnceClosure if the callback is called only once.
Done
optional: it might be useful to have a test demonstrating that the canary domain checks aren't cached / get re-issued when the network changes.
Added a test.
nit: can use a const reference here instead and then remove the `DCHECK(request)`
https://google.github.io/styleguide/cppguide.html#Inputs_and_Outputs
Done
// Reset any previous result. Secure DNS should only be allowed if a new probenit: "Secure DNS to the fallback provider should only be allowed"
Done
// Reset any previous result. Secure DNS should only be allowed if a new probenit: "Secure DNS to the fallback provider should only be allowed"
Done
std::move(host_port_pair), NetworkAnonymizationKey::CreateTransient(),nit: this version of CreateRequest takes a const reference for the first parameter, so no need to std::move here. Since we only ever create a HostPortPair with this member, we could create the HostPortPair in the constructor and store that instead.
Done
since we own pending_probe_request_ we can use base::Unretained(this) here instead of needing to use a weak pointer (and could get rid of this classes WeakPtrFactory then, I think)
Done
Is this case still possible? it seems like it isn't, because if pending_probe_request_ is destroyed then the provided callback won't be called
If this case is possible we may want to consider calling set_doh_fallback_canary_domain_check_status so the status doesn't remain kStarted, and ideally we'd have a test.
I think you're right. I've removed it.
The only possibility I would see is on a sync result (ERR_IO_PENDING) that short-circuits straight to this callback and there's a race condition with another call of ProbeSecureDomain that destroys pending_probe_request. But I think that's not a concern since these events are all executed on the same network thread.
is ERR_ABORTED possible here? There was a comment elsewhere that "HostResolver squashes ERR_ABORTED to ERR_NAME_NOT_RESOLVED.". If it is possible, it seems like it'd still be better to call set_doh_fallback_canary_domain_check_status so that the status isn't kStarted?
MockHostResolver squashes it. I updated the comment.
Basically I want to test this case, but MockHostResolver's behavior makes it impossible, IIUC. It doesn't seem worth mucking with it, but I did leave the tests that I would like to have in place. Maybe they're more confusing than they're worth.
But actually, thinking about this more now, I don't think the specific check for ERR_ABORTED is needed. It was in the initial attempt where only a negative result disabled DoH. Now that this needs a positive result to enable DoH, I think it's OK for ERR_ABORTED to be treated like any other error and report the result via callback.
I've removed this block entirely and removed the tests. Please re-open if you disagree.
base::StrCat({"Net.DNS.CanaryDomainService.SecureDns.Result"}), result);can remove base::StrCat here (and above if we keep that code)
Done
If we keep the if `if (!probe_request)` block above we may want to consider whether this should be called before the return there for consistency given the name
I removed that if block.
nit: remove new line since these are all part of the same interface
Done
// conditions. See SetSecureDnsAllowedByCanaryDomain().nit: update `SetSecureDnsAllowedByCanaryDomain` to the new name
This addition actually no longer makes sense since this added method doesn't exist anymore, and the canary domain doesn't affect the effective config that's returned.
Removed these added sentences entirely.
// LINT.IfChange(FallbackFromSecureTransactionPreferredReason)nit: we should add the header comment about these being logged to histograms:
Done
Following up on the conversation from: https://chromium-review.googlesource.com/c/chromium/src/+/7567908/comment/b954f680_c90ed469/
I can see your point about not wanting the DnsConfig to contain metadata about how it was constructed, but if we added a new boolean named something like `should_perform_doh_fallback_upgrade` (and set it to true when doh_config corresponds to the fallback provider DoH config) I think it would feel more like a configuration option than metadata, and then we could use that in FallbackFromSecureTransactionPreferred more simply instead.
I don't think I realized exactly how complex the logic in UpdateConfigForDohUpgrade is and that we'd have to replicate each piece of that in this function as well. I think having to mirror the two is more error prone and possible to break over time than just using the boolean approach. Something to consider (I know, I know, thinking through how best to implement all this has already required way too much brainpower!)
OK, agreed, and done.
Looking again at how UpdateConfigForDohUpgrade is implemented, we don't attempt the fallback at all unless config.dns_over_tls_hostname is empty...
If we keep this separate logic instead of using a boolean, I think it'd be better for consistency and simplicity if we replace `config.doh_config == DnsOverHttpsConfig(GetDohUpgradeServersFromDotHostname(config.dns_over_tls_hostname))` with a `if (!config->dns_over_tls_hostname.empty()) { return false }` check instead
I had to think about this, but yes, this makes sense. I added a comment as well. Please comment if that doesn't cover it sufficiently.
const std::vector<base::WeakPtr<MockTransaction>>& delayed_transactions()Are the changes to this file still needed? I don't see where delayed_transactions() is called
No, I thought I reverted these. Should be reverted now.
std::unique_ptr<CanaryDomainService> CreateCanaryDomainService() override;nit: move into the "// HostResolver methods:" section above
Done
TEST_F(NetworkContextTest, CanaryDomainServiceProbe_PositiveResult) {optional: could combine this and the test below into one parameterized test (actually I've found Gemini pretty good at doing this type of thing)
Done
net::ResolveContext resolve_context(nullptr, false);nit:
```suggestion
net::ResolveContext resolve_context(/*url_request_context=*/nullptr, /*enable_caching=*/false);
```here and below
Done
auto* mock_resolver = static_cast<net::MockHostResolverBase*>(Do we need these four lines (and resolve_context created above) for this test?
No, thanks for the catch. Removed.
<!-- LINT.ThenChange(/net/dns/canary_domain_service.cc:CanaryDomainResult) -->nit (thanks Gemini):
```suggestion
<!-- LINT.ThenChange(/net/dns/canary_domain_service.h:CanaryDomainResult) -->
```
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
David, can you +1 for histogram changes?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Kenichi-san, can you PTAL at dns_config.{h,cc} again and re-up your +1? Thank you.
| Code-Review | +1 |
LGTM with one final comment. Great work on this Brian!
&CanaryDomainService::OnSecureDnsProbeComplete, base::Unretained(this)));optional: it's recommended to add a comment above uses of base::Unretained regarding why it is safe (here because we own pending_probe_request_)
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
&CanaryDomainService::OnSecureDnsProbeComplete, base::Unretained(this)));optional: it's recommended to add a comment above uses of base::Unretained regarding why it is safe (here because we own pending_probe_request_)
Done
&CanaryDomainService::OnSecureDnsProbeComplete, base::Unretained(this)));optional: it's recommended to add a comment above uses of base::Unretained regarding why it is safe (here because we own pending_probe_request_)
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
base::SequencedTaskRunner::GetCurrentDefault()->PostTask(I've updated this to post a task. Without this, the DNS resolution can fail with `DNS_CACHE_INVALIDATION_IN_PROGRESS`. I discovered this while changing the Secure DNS setting in chrome://settings/security and seeing that the result when changing the setting was `DNS_CACHE_INVALIDATION_IN_PROGRESS` instead of the correct result. So I think this is resolved.
But I also notice that if I spam changes to the setting, sometimes there's a DCHECK failure here in `HostResolverManager::InvalidateCaches`. I'm not sure how to make it happen deterministically either.
`DCHECK_EQ(num_jobs, jobs_.size());`
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
Stamp-ish lgtm to unblock, as I'm OOO today. I might take a closer look tomorrow.
base::SequencedTaskRunner::GetCurrentDefault()->PostTask(I've updated this to post a task. Without this, the DNS resolution can fail with `DNS_CACHE_INVALIDATION_IN_PROGRESS`. I discovered this while changing the Secure DNS setting in chrome://settings/security and seeing that the result when changing the setting was `DNS_CACHE_INVALIDATION_IN_PROGRESS` instead of the correct result. So I think this is resolved.
But I also notice that if I spam changes to the setting, sometimes there's a DCHECK failure here in `HostResolverManager::InvalidateCaches`. I'm not sure how to make it happen deterministically either.
`DCHECK_EQ(num_jobs, jobs_.size());`
Coincidentally, I was looking into the root cause (https://crbug.com/486443374). Here is Gemini's summary: https://paste.googleplex.com/5640716798787584
WIP CL: https://crrev.com/c/7601235
The proposed fix doesn't seem to work (e.g. break existing tests). I had a long conversation with gemini-cli. Still figuring out what's a better way to fix the issue.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
base::SequencedTaskRunner::GetCurrentDefault()->PostTask(Kenichi IshibashiI've updated this to post a task. Without this, the DNS resolution can fail with `DNS_CACHE_INVALIDATION_IN_PROGRESS`. I discovered this while changing the Secure DNS setting in chrome://settings/security and seeing that the result when changing the setting was `DNS_CACHE_INVALIDATION_IN_PROGRESS` instead of the correct result. So I think this is resolved.
But I also notice that if I spam changes to the setting, sometimes there's a DCHECK failure here in `HostResolverManager::InvalidateCaches`. I'm not sure how to make it happen deterministically either.
`DCHECK_EQ(num_jobs, jobs_.size());`
Coincidentally, I was looking into the root cause (https://crbug.com/486443374). Here is Gemini's summary: https://paste.googleplex.com/5640716798787584
WIP CL: https://crrev.com/c/7601235The proposed fix doesn't seem to work (e.g. break existing tests). I had a long conversation with gemini-cli. Still figuring out what's a better way to fix the issue.
Thanks for the pointers. I will consider this an existing issue not created or exacerbated by this change.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (!resolve_context_->IsDohFallbackProbeEnabled()) {I realized that the canary domain probe was being triggered for every secure_dns_mode and not just AUTOMATIC w/ DoH fallback. This added check addresses this.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
if (!resolve_context_->IsDohFallbackProbeEnabled()) {I realized that the canary domain probe was being triggered for every secure_dns_mode and not just AUTOMATIC w/ DoH fallback. This added check addresses this.
I'm wondering if we could prevent CanaryDomainService from running when the mode is set to AUTOMATIC w/o DoH fallback in the first place.
resolve_context_->set_doh_fallback_canary_domain_check_status(
CanaryDomainCheckStatus::kStarted);nit: Move this above (below `pending_probe_request_.reset()`) and update the comment to reset the status as well, so that we can remove line 120.
config, base::BindRepeating([](int, int) -> int { return 0; }), nullptr);nit: `/*net_log=*/nullptr`
config, base::BindRepeating([](int, int) -> int { return 0; }), nullptr);nit: `/*net_log=*/nullptr`. Ditto below.
We may also consider adding a lambda to create a DnsSession instance (and possibly other set up) to reduce duplications.
raw_ptr<ResolveContext> resolve_context_ = nullptr;nit: IIUC nullptr is the default value so we don't need to explicitly set this.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
LGTM for enums.xml and histograms.xml
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (!resolve_context_->IsDohFallbackProbeEnabled()) {Kenichi IshibashiI realized that the canary domain probe was being triggered for every secure_dns_mode and not just AUTOMATIC w/ DoH fallback. This added check addresses this.
I'm wondering if we could prevent CanaryDomainService from running when the mode is set to AUTOMATIC w/o DoH fallback in the first place.
This was my first thought and attempt too, by updating ContextHostResolver::CreateCanaryDomainService to return null in any case other than AUTOMATIC+DoHF.
Doing it that way would require some additional event plumbing into CanaryDomainService. I found that this works to only trigger the probe on startup with AUTO+DoHF, but then changing the mode in chrome://settings/security would not retrigger, ultimately because that doesn't trigger recreating network contexts, apparently.
This way, every time the setting changes, `OnSessionChanged` and `OnDohServerUnavailable` both fire to reset and retrigger the probe.
I could have CanaryDomainService self-register as a `NetworkChangeNotifier::DNSObserver` to catch the settings changes, but I'm not convinced that it's actually better. The CDS would still need to be created to respond to events. Perhaps NetworkContext could register and create or tear down the CDS depending.
This is simpler and more in line with mirroring DoH probe behavior, so I will stick with this unless there is strong reason not to.
resolve_context_->set_doh_fallback_canary_domain_check_status(
CanaryDomainCheckStatus::kStarted);nit: Move this above (below `pending_probe_request_.reset()`) and update the comment to reset the status as well, so that we can remove line 120.
I'm not sure that I follow. Line 120 sets the status to `kNotStarted`, whereas this line sets it to `kStarted`.
This line is after the IsDohFallbackProbeEnabled check, because it should enter a started state if that check passes.
I did remove line 120, however, because it is actually redundant. The two paths to get to `ProbeSecureDnsDomain()`:
1. `Start()` after initialization, including setting status to `kNotStarted`
2. `OnDohServerUnavailable()` called after `OnSessionChanged()` is called and resets the status to `kNotStarted`.
`OnSessionChanged()` also resets `pending_probe_request_`, so actually resetting that here and resetting the status are both redundant. Both should have already occurred when this method called. So I removed lines 116-117 too.
config, base::BindRepeating([](int, int) -> int { return 0; }), nullptr);Brian Begnochenit: `/*net_log=*/nullptr`
Done
config, base::BindRepeating([](int, int) -> int { return 0; }), nullptr);Brian Begnochenit: `/*net_log=*/nullptr`
Done
config, base::BindRepeating([](int, int) -> int { return 0; }), nullptr);nit: `/*net_log=*/nullptr`. Ditto below.
We may also consider adding a lambda to create a DnsSession instance (and possibly other set up) to reduce duplications.
Done
config, base::BindRepeating([](int, int) -> int { return 0; }), nullptr);nit: `/*net_log=*/nullptr`. Ditto below.
We may also consider adding a lambda to create a DnsSession instance (and possibly other set up) to reduce duplications.
Done
config, base::BindRepeating([](int, int) -> int { return 0; }), nullptr);nit: `/*net_log=*/nullptr`. Ditto below.
We may also consider adding a lambda to create a DnsSession instance (and possibly other set up) to reduce duplications.
Done
raw_ptr<ResolveContext> resolve_context_ = nullptr;nit: IIUC nullptr is the default value so we don't need to explicitly set this.
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
40 is the latest approved patch-set.
The change was submitted with unreviewed changes in the following files:
```
The name of the file: net/dns/canary_domain_service_unittest.cc
Insertions: 2, Deletions: 1.
The diff is too large to show. Please review the diff.
```
```
The name of the file: net/dns/context_host_resolver_unittest.cc
Insertions: 9, Deletions: 6.
The diff is too large to show. Please review the diff.
```
```
The name of the file: net/dns/canary_domain_service.cc
Insertions: 0, Deletions: 5.
The diff is too large to show. Please review the diff.
```
```
The name of the file: tools/metrics/histograms/metadata/net/histograms.xml
Insertions: 34, Deletions: 0.
The diff is too large to show. Please review the diff.
```
```
The name of the file: net/dns/mock_host_resolver.h
Insertions: 1, Deletions: 1.
The diff is too large to show. Please review the diff.
```
```
The name of the file: services/network/network_context_unittest.cc
Insertions: 4, Deletions: 3.
The diff is too large to show. Please review the diff.
```
Implement Canary Domain check to disable DoH fallback.
When Secure DNS is enabled and DoH probers start (guarded by feature
flag):
* Also perform a canary domain check of a flag-specified host.
* `DnsClient::FallbackFromSecureTransactionPreferred` is updated so
that both at least one available DoH server *and* a positive canary
domain result are needed to allow Secure DNS to the fallback
provider)
Canary domain results:
* Positive response: OK with any A or AAAA records (allow DoH fallback)
* Negative response: OK w/o records, or error such as NXDOMAIN or SERVFAIL
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |