For posterity, here is a more detailed explanation:
App Engine used to let the Go runtime terminate the TLS connection, so that I could check in the code whether we were using TLS, and select 'http' or 'https' for all generated absolute URLs.
Nowadays it doesn't, which means that some parts of the app thought we were running without TLS.
Thus, some links got created with 'http' instead of 'https'.
This mostly worked even though the app requires TLS everywhere, since the app, for some reason I don't understand, redirected 'http' requests to the same URL with 'https' for most things.
For the naked SVG map however, it didn't - I don't know why.
So I made the main binary (app.go) select DefaultScheme in github.com/zond/goaeoas (which generates most links) based on appengine.IsDevAppServer(). Then, in all the places where links were generated outside goaeoas, I used DefaultScheme instead of checking the TLS value in the incoming request.
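A minimal sketch of that idea, not the actual app.go or goaeoas code: the DefaultScheme variable here is a local stand-in for the one in goaeoas, schemeFor is a hypothetical helper, and its isDev argument stands in for the result of appengine.IsDevAppServer(). I'm assuming the dev server should get 'http' and everything else 'https'.

```go
package main

import "fmt"

// DefaultScheme is a local stand-in for the package-level setting in goaeoas.
var DefaultScheme = "https"

// schemeFor is a hypothetical helper. isDev stands in for the value of
// appengine.IsDevAppServer(); the local dev server is assumed to run
// without TLS, everything else with it.
func schemeFor(isDev bool) string {
	if isDev {
		return "http"
	}
	return "https"
}

func main() {
	// Decide once at startup instead of inspecting each request's TLS state.
	DefaultScheme = schemeFor(false)
	fmt.Println(DefaultScheme + "://example.com/games")
}
```

The point of deciding once at startup is that the result no longer depends on who terminates the TLS connection.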
The breakage I caused while doing this happened because I didn't read the documentation for https://github.com/gorilla/mux properly, and thought that I could use the router-generated URLs and just set Scheme and Host on them to make working absolute URLs.
What I didn't know was that this has a side effect: the instance that created the URL also added a new matcher to the router, which then didn't work, since a lot of incoming requests don't seem to have Host or Scheme set (I don't know why).
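One way to get an absolute URL without touching the router at all is to build a separate url.URL value from the path the router generated. This is only a sketch using the standard library: absoluteFromPath is a hypothetical helper, and the path and host below are made-up examples, not real routes from the app.

```go
package main

import (
	"fmt"
	"net/url"
)

// absoluteFromPath is a hypothetical helper. routerPath stands in for a path
// generated by the router; scheme and host are set on a fresh url.URL value,
// so nothing is registered on the router or its routes.
func absoluteFromPath(scheme, host, routerPath string) string {
	u := url.URL{
		Scheme: scheme,
		Host:   host,
		Path:   routerPath,
	}
	return u.String()
}

func main() {
	// Prints "https://example.com/Game/123/Map".
	fmt.Println(absoluteFromPath("https", "example.com", "/Game/123/Map"))
}
```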
Then when all this was fixed, I noticed that some games didn't resolve. Now I found that error as well! It's because the google.golang.org/appengine/delay package has changed behaviour between Go 1.9 and 1.11 - the generated keys for the delayed functions are now different, so all delayed functions created before my update will fail when they try to run.
The weird thing is that we switched from Go 1.9 to Go 1.11 well before yesterday, so this error shouldn't be new. I have no idea why it happened now... Maybe I hadn't updated my SDK or something?
I noticed that the code for google.golang.org/appengine/delay isn't provided by the server, but by me when I deploy (that's why I suspect an older local SDK might have caused the changed behaviour). So I changed some bits in delay.go:
--- a/delay/delay.go
+++ b/delay/delay.go
@@ -323,9 +323,20 @@ func runFunc(c context.Context, w http.ResponseWriter, req *http.Request) {
 	f := funcs[inv.Key]
 	if f == nil {
-		log.Errorf(c, "delay: no func with key %q found", inv.Key)
-		log.Warningf(c, "delay: dropping task")
-		return
+		invSplit := strings.Split(inv.Key, ":")
+		keys := []string{}
+		for k, fnc := range funcs {
+			foundSplit := strings.Split(k, ":")
+			if len(invSplit) == 2 && len(foundSplit) == 2 && invSplit[1] == foundSplit[1] {
+				f = fnc
+			}
+			keys = append(keys, k)
+		}
+		if f == nil {
+			log.Errorf(c, "delay: no func with key %q found among %+v", inv.Key, keys)
+			log.Warningf(c, "delay: dropping task")
+			return
+		}
 	}
that I hope will mitigate this for a while. The next release from a different machine than mine will remove this patch, but hopefully all old tasks will be finished by then.
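The fallback in the patch can be sketched outside the delay package like this. It is an illustration only: lookup is a hypothetical function, the map holds plain strings instead of delayed functions, and the keys are made up (I'm only assuming they have two parts separated by ":", as the patch does).

```go
package main

import (
	"fmt"
	"strings"
)

// lookup first tries an exact match on key. If that fails, it falls back to
// any registered key whose part after the ":" equals the part after the ":"
// in key, mirroring the comparison in the patch.
func lookup(funcs map[string]string, key string) (string, bool) {
	if f, ok := funcs[key]; ok {
		return f, true
	}
	want := strings.Split(key, ":")
	if len(want) != 2 {
		return "", false
	}
	for k, f := range funcs {
		got := strings.Split(k, ":")
		if len(got) == 2 && got[1] == want[1] {
			return f, true
		}
	}
	return "", false
}

func main() {
	// Made-up keys: the registered key has a new prefix, the task an old one.
	funcs := map[string]string{"new/prefix.go:resolve": "resolver"}
	f, ok := lookup(funcs, "old/prefix.go:resolve")
	fmt.Println(f, ok)
}
```

Note that if several registered keys share the same second part, which one gets picked depends on map iteration order, both here and in the patch. That is acceptable for a temporary mitigation, but not something to keep around.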
NB: I'm pretty annoyed with the engineers who decided to fail all previously generated tasks when you upgrade from Go 1.9 to Go 1.11. Especially since the tasks are just dropped, without any retries or efforts to resolve the problem.