Message:
Hello golan...@googlegroups.com (cc: golan...@googlegroups.com),
I'd like you to review this change to
https://go.googlecode.com/hg/
Description:
cmd/go: allow short custom domain imports
For discussion.
With this CL, I can now type:
$ go get camlistore.org/testlib
Instead of:
$ go get camlistore.org/r/p/camlistore.git/testlib
... which is an ugly import path.
It does this by fetching a config file:
https://camlistore.org/.well-known/golang-vcs.json
The .well-known part is defined in:
http://tools.ietf.org/html/rfc5785
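To make the mapping concrete, here is a minimal sketch of the path rewrite, separated from the HTTP fetch. The file contents shown are an assumption inferred from the two import paths above, and qualify is a hypothetical helper name, not part of the patch:

```go
package main

import (
	"encoding/json"
	"fmt"
	"path"
	"strings"
)

// vcsConfig mirrors the struct in the patch: the JSON file carries a
// single pathPrefix field.
type vcsConfig struct {
	PathPrefix string `json:"pathPrefix"`
}

// qualify (hypothetical name) expands a short import path using the given
// config file contents. It does no network access; the caller supplies
// the JSON text.
func qualify(importPath, configJSON string) (string, bool) {
	slash := strings.Index(importPath, "/")
	if slash == -1 {
		return "", false // no path below the host, nothing to expand
	}
	host := importPath[:slash]

	var conf vcsConfig
	if err := json.Unmarshal([]byte(configJSON), &conf); err != nil {
		return "", false
	}
	// path.Join cleans the duplicate slash left by importPath[slash:].
	return path.Join(host, conf.PathPrefix, importPath[slash:]), true
}

func main() {
	// Assumed contents of /.well-known/golang-vcs.json on camlistore.org.
	const config = `{"pathPrefix": "r/p/camlistore.git"}`

	full, ok := qualify("camlistore.org/testlib", config)
	fmt.Println(full, ok)
}
```

With that assumed config, the short path expands to the long one that the existing vcsPaths rules already recognize.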
Please review this at http://codereview.appspot.com/5660051/
Affected files:
M src/cmd/go/vcs.go
Index: src/cmd/go/vcs.go
===================================================================
--- a/src/cmd/go/vcs.go
+++ b/src/cmd/go/vcs.go
@@ -8,8 +8,10 @@
"bytes"
"encoding/json"
"fmt"
+ "net/http"
"os"
"os/exec"
+ "path"
"regexp"
"strings"
)
@@ -275,6 +277,14 @@
// import path corresponding to the root of the repository
// (thus root is a prefix of importPath).
func vcsForImportPath(importPath string) (vcs *vcsCmd, repo, root string, err error) {
+ return vcsForImportPathDiscover(importPath, true, "")
+}
+
+// vcsForImportPathDiscover is the recursive implementation of vcsForImportPath.
+// If discover is true (only in the first recursive call), discovery
+// is attempted to map the domain name to a VCS URL, fetching a JSON
+// config file at an RFC5785-compliant URL.
+func vcsForImportPathDiscover(importPath string, discover bool, overrideRoot string) (vcs *vcsCmd, repo, root string, err error) {
for _, srv := range vcsPaths {
if !strings.HasPrefix(importPath, srv.prefix) {
continue
@@ -320,11 +330,54 @@
}
}
}
- return vcs, match["repo"], match["root"], nil
+ root := match["root"]
+ if overrideRoot != "" {
+ root = overrideRoot
+ }
+ return vcs, match["repo"], root, nil
+ }
+ if discover {
+ if path, root, ok := qualifiedImportPath(importPath); ok {
+ println("qualified path is", path)
+ return vcsForImportPathDiscover(path, false, root)
+ }
}
return nil, "", "", fmt.Errorf("unrecognized import path %q", importPath)
}
+type vcsConfig struct {
+ PathPrefix string `json:"pathPrefix"`
+}
+
+func qualifiedImportPath(importPath string) (newImportPath, overrideRoot string, ok bool) {
+ slash := strings.Index(importPath, "/")
+ if slash == -1 {
+ return "", "", false
+ }
+ host := importPath[:slash]
+
+ // Fetch from an RFC5785-compliant URL:
+ for _, scheme := range []string{"https", "http"} {
+ url := fmt.Sprintf("%s://%s/.well-known/golang-vcs.json", scheme, host)
+ println("fetching", url)
+ res, err := http.Get(url)
+ if err != nil {
+ continue
+ }
+ if res.StatusCode != 200 {
+ continue
+ }
+ var conf vcsConfig
+ err = json.NewDecoder(res.Body).Decode(&conf)
+ if err == nil {
+ println("Path prefix: ", conf.PathPrefix)
+ return path.Join(host, conf.PathPrefix, importPath[slash:]), host, true
+ }
+ println("decode error: ", err.Error())
+ }
+ return "", "", false
+}
+
// expand rewrites s to replace {k} with match[k] for each key k in match.
func expand(match map[string]string, s string) string {
for k, v := range match {
Dave.
-rob
Dave.
> because it doesn't have .git in it.
I thought the go tool did some probing to work out the VCS type if it
couldn't deduce it from the URL.
Dave.
an early draft tried every possible vcs at every possible element
along the path, but that was rejected as expensive
and error prone.
how does that help you if you have an
import path that turns out to be a subdirectory?
do you have to redirect the entire tree to
parallel paths? is that easy on most web servers?
Suggestion: If the URL is not recognized, send a HEAD request for the URL. If the response is a redirect, use the redirect location to find the repository using the usual rules.
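A rough sketch of the client side of this suggestion, using only net/http. The function names and the example.com target are hypothetical, and the local test server stands in for a real site:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// redirectLocation sends a HEAD request to url without following
// redirects and returns the Location header if the response is a 3xx.
func redirectLocation(url string) (string, bool) {
	client := &http.Client{
		// Stop at the first response instead of following the redirect.
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	res, err := client.Head(url)
	if err != nil {
		return "", false
	}
	defer res.Body.Close()
	if res.StatusCode < 300 || res.StatusCode > 399 {
		return "", false
	}
	loc := res.Header.Get("Location")
	return loc, loc != ""
}

// newRedirectServer starts a local test server that redirects every
// request to target. It exists only to exercise redirectLocation.
func newRedirectServer(target string) (baseURL string, stop func()) {
	srv := httptest.NewServer(http.RedirectHandler(target, http.StatusMovedPermanently))
	return srv.URL, srv.Close
}

func main() {
	base, stop := newRedirectServer("https://example.com/ugly/path/testlib.git/")
	defer stop()

	loc, ok := redirectLocation(base + "/testlib/")
	fmt.Println(loc, ok)
}
```

The returned location would then be matched against the known hosting patterns, as the suggestion says.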
I know it's easy for Apache, and it's easy enough to do with a Go web server.
Dave.
I care a lot more about people running non-programmable web servers.
What is the .htaccess line? Is it one line?
Advanced .htaccess configuration tends to vary between servers, but at
least for Apache it's easy:
RedirectMatch 301 /testlib/(.*) /ugly/path/testlib.git/$1
Dave.
I want to preserve the current property that an import path is
a URL you can load in a browser. That property is not actually
true all the time even now (I have feature requests in at
Google Code, GitHub, and Bitbucket), so ideally whatever fix
we come up with would make it true more of the time, not less.
I also want it to be easy/trivial for people to set up. You
shouldn't have to write your own web server (not that there's
anything wrong with that) to set up a delegation.
All those properties suggest that something like Gary proposed
is a better choice than .well-known. The Accept header is
probably overkill; Gary says it is hard to handle, and I will
add that it is hard to test using a browser.
Maybe a query parameter, like http://importpath?go-get-package=1.
That can be handled in Apache with
RewriteEngine On
RewriteCond %{QUERY_STRING} ^go-get-package=1$
RewriteRule ^/my/package/dir$ https://github.com/me/package/dir [R=302,L]
If we do this, we should change the canonical import paths for
the subrepositories to be golang.org/crypto, golang.org/net,
and so on.
These redirects would be a little restricted in their form:
a redirect for a specific import path would have to go to
a known hosting import path with a location within the repo
that was already contained in the original URL. For example,
the fact that you can't do a checkout of just a subdirectory
from git or hg means that swtch.com/csearch cannot delegate
to code.google.com/p/codesearch/cmd/csearch, because
there would be nowhere to write the cmd directory.
Here the path within the repo is cmd/csearch, and that path
must appear in the original for this to be well-defined.
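One way to read that restriction, as a small sketch: the path of the target inside its repository has to be the tail of the original import path. canDelegate is a hypothetical helper, and the second import path in main is made up for contrast:

```go
package main

import (
	"fmt"
	"strings"
)

// canDelegate (hypothetical) reports whether importPath may delegate to
// target, where target lives in the repository rooted at repoRoot.
// The path of target inside the repository must already appear at the
// end of importPath, so the checkout has a place for every directory.
func canDelegate(importPath, repoRoot, target string) bool {
	if target == repoRoot {
		return true // assumed: pointing at the repository root is fine
	}
	if !strings.HasPrefix(target, repoRoot+"/") {
		return false // target is not inside the repository
	}
	sub := strings.TrimPrefix(target, repoRoot) // e.g. "/cmd/csearch"
	return len(importPath) > len(sub) && strings.HasSuffix(importPath, sub)
}

func main() {
	const root = "code.google.com/p/codesearch"
	const target = root + "/cmd/csearch"

	// The case from the message: cmd/ has nowhere to go.
	fmt.Println(canDelegate("swtch.com/csearch", root, target))

	// A made-up path that does contain cmd/csearch.
	fmt.Println(canDelegate("swtch.com/codesearch/cmd/csearch", root, target))
}
```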
Russ
The ability to send people and go get to different locations would be
great indeed.
> If we do this, we should change the canonical import paths for
> the subrepositories to be golang.org/crypto, golang.org/net,
> and so on.
Or golang.org/pkg/net. Either that, or change godoc so it serves
documentation at /<pkg>.
> that was already contained in the original URL. For example,
> the fact that you can't do a checkout of just a subdirectory
> from git or hg means that swtch.com/csearch cannot delegate
> to code.google.com/p/codesearch/cmd/csearch, because
> there would be nowhere to write the cmd directory.
This sounds like a good idea anyway, even without considering the
technical limitation.
--
Gustavo Niemeyer
http://niemeyer.net
http://niemeyer.net/plus
http://niemeyer.net/twitter
http://niemeyer.net/blog
-- I'm not absolutely sure of anything.
Indeed it is similar for Apache, but it'd be nice to be able to easily
emulate the behavior of go get from a browser for testing and
introspection purposes, and making it unconditional is not an option
if we want to be able to send people and go get to different places
(doc vs. repo).
I see. That's fine, but it still doesn't address testing.
I want to be able to test in a browser.
Russ
Why not?
How many people run web sites directly out of S3
and would want to use those domain names as import paths?
A redirect is a pretty fundamental concept for a web site.
Russ
What if instead we said that you fetch the page at that URL and look
for a <meta> tag?
<meta name="go-import" content="swtch.com/codesearch hg
https://code.google.com/p/codesearch">
The three space-separated fields are import path prefix, vcs, repo root
corresponding to that import path prefix. There can be more than one
meta tag, but if we just fetched the HTML for x.com/y/z then we're only
interested in the tag with a prefix that is a prefix of x.com/y/z.
In the most trivial case, you can write a list of all your repositories and
put it in a global HTML template or in the 404 page. You don't have to
generate a different line for each URL you serve (like you'd have to
generate a different redirect for each URL), it works with static content
servers, and it is still trivially testable in a browser. In fact it encourages
people to make their import paths work in a browser.
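To illustrate, a minimal sketch of the lookup using only the standard library. The names goImport and findGoImport are hypothetical, the second tag in the sample page is invented, and matching the prefix on whole path elements is an assumption about what "prefix" should mean:

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// goImport holds the three space-separated fields of one meta tag.
type goImport struct {
	Prefix, VCS, RepoRoot string
}

// findGoImport scans page for <meta name="go-import"> tags and returns
// the first one whose prefix covers importPath.
func findGoImport(page, importPath string) (goImport, bool) {
	d := xml.NewDecoder(strings.NewReader(page))
	d.Strict = false // tolerate HTML that is not well-formed XML
	for {
		tok, err := d.RawToken()
		if err != nil {
			return goImport{}, false // end of input or parse error
		}
		e, ok := tok.(xml.StartElement)
		if !ok || !strings.EqualFold(e.Name.Local, "meta") {
			continue
		}
		var name, content string
		for _, a := range e.Attr {
			switch strings.ToLower(a.Name.Local) {
			case "name":
				name = a.Value
			case "content":
				content = a.Value
			}
		}
		if name != "go-import" {
			continue
		}
		f := strings.Fields(content)
		if len(f) != 3 {
			continue // malformed tag, skip it
		}
		// Assumption: the prefix must match on a path-element boundary.
		if importPath == f[0] || strings.HasPrefix(importPath, f[0]+"/") {
			return goImport{f[0], f[1], f[2]}, true
		}
	}
}

func main() {
	const page = `<html><head>
<meta name="go-import" content="swtch.com/other git https://example.com/other">
<meta name="go-import" content="swtch.com/codesearch hg https://code.google.com/p/codesearch">
</head><body>hello</body></html>`

	imp, ok := findGoImport(page, "swtch.com/codesearch/cmd/csearch")
	fmt.Println(imp.Prefix, imp.VCS, imp.RepoRoot, ok)
}
```

The same page can list every repository on the site, since tags whose prefix does not cover the requested path are skipped.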
Russ
That looks nice, but can we please introduce the aspect of "go get"
using a query argument? Without something like that, we can't
distinguish who's being served at the server side, which restricts
possibilities like redirecting people to an external documentation
site like Gary's gopkgdoc, for example, or even generating the page
for go get dynamically without interfering with the normal site
content.
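A sketch of what that could look like on the server, assuming a query parameter named go-get-package (one of the spellings floated earlier in this thread, not a settled name). The documentation URL and the page contents are placeholders:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

const (
	// goGetParam is an assumed parameter name; the thread has not fixed one.
	goGetParam = "go-get-package"

	// docURL is a placeholder for an external documentation site.
	docURL = "https://example.com/docs/codesearch"

	// metaPage is what the tool would receive.
	metaPage = `<meta name="go-import" content="swtch.com/codesearch hg https://code.google.com/p/codesearch">`
)

// handler serves the meta tag to requests carrying the query parameter
// and redirects everyone else to the documentation.
func handler(w http.ResponseWriter, r *http.Request) {
	if r.URL.Query().Get(goGetParam) == "1" {
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		fmt.Fprintln(w, metaPage)
		return
	}
	http.Redirect(w, r, docURL, http.StatusFound)
}

// serve runs handler against an in-memory request and returns the
// status code and body, so the behavior can be checked without a network.
func serve(target string) (status int, body string) {
	rec := httptest.NewRecorder()
	handler(rec, httptest.NewRequest("GET", target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	status, _ := serve("/codesearch?" + goGetParam + "=1")
	fmt.Println("tool request:", status)

	status, _ = serve("/codesearch")
	fmt.Println("browser request:", status)
}
```

Appending the parameter by hand in a browser shows exactly what the tool would see, which keeps the scheme testable.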
> In the most trivial case, you can write a list of all your repositories and
> put it in a global HTML template or in the 404 page. You don't have to
Implementing this with Apache is still no harder than using a
redirect. It may glob a prefix and serve a static page for everything
under it, assuming one doesn't want to render the data as part of the
normal site.
ok
> What if instead we said that you fetch the page at that URL and look
> for a <meta> tag?
Nice. +1 here.
Would the <meta> tag be optional? That is, could those who want a short package URL define one, while the current behavior remains when no <meta> tag is found?
> What if instead we said that you fetch the page at that URL and look
> for a <meta> tag?
Isn't this worse than bradfitz's original JSON-based method?
No.
Are you planning to update this CL to implement the scheme we converged on?