code review 5660051: cmd/go: allow short custom domain imports (issue 5660051)

156 views
Skip to first unread message

brad...@golang.org

unread,
Feb 14, 2012, 12:34:30 AM2/14/12
to golan...@googlegroups.com, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
Reviewers: golang-dev_googlegroups.com,

Message:
Hello golan...@googlegroups.com (cc: golan...@googlegroups.com),

I'd like you to review this change to
https://go.googlecode.com/hg/


Description:
cmd/go: allow short custom domain imports

For discussion.

With this CL, I can now type:

$ go get camlistore.org/testlib

Instead of:

$ go get camlistore.org/r/p/camlistore.git/testlib

... which is an ugly import path.

It does this by fetching a config file:

https://camlistore.org/.well-known/golang-vcs.json

The .well-known part is defined in:

http://tools.ietf.org/html/rfc5785

Please review this at http://codereview.appspot.com/5660051/

Affected files:
M src/cmd/go/vcs.go


Index: src/cmd/go/vcs.go
===================================================================
--- a/src/cmd/go/vcs.go
+++ b/src/cmd/go/vcs.go
@@ -8,8 +8,10 @@
"bytes"
"encoding/json"
"fmt"
+ "net/http"
"os"
"os/exec"
+ "path"
"regexp"
"strings"
)
@@ -275,6 +277,14 @@
// import path corresponding to the root of the repository
// (thus root is a prefix of importPath).
func vcsForImportPath(importPath string) (vcs *vcsCmd, repo, root string,
err error) {
+ return vcsForImportPathDiscover(importPath, true, "")
+}
+
+// vcsForImportPathDiscover is the recursive implementation of
vcsForImportPath.
+// If discover is true (only in the first recursive call), discovery
+// is attempted to map the domain name to a VCS URL, fetching a JSON
+// config file at a RFC5785-compliant URL.
+func vcsForImportPathDiscover(importPath string, discover bool,
overrideRoot string) (vcs *vcsCmd, repo, root string, err error) {
for _, srv := range vcsPaths {
if !strings.HasPrefix(importPath, srv.prefix) {
continue
@@ -320,11 +330,54 @@
}
}
}
- return vcs, match["repo"], match["root"], nil
+ root := match["root"]
+ if overrideRoot != "" {
+ root = overrideRoot
+ }
+ return vcs, match["repo"], root, nil
+ }
+ if discover {
+ if path, root, ok := qualifiedImportPath(importPath); ok {
+ println("qualified path is", path)
+ return vcsForImportPathDiscover(path, false, root)
+ }
}
return nil, "", "", fmt.Errorf("unrecognized import path %q", importPath)
}

+type vcsConfig struct {
+ PathPrefix string `json:"pathPrefix"`
+}
+
+func qualifiedImportPath(importPath string) (newImportPath, overrideRoot
string, ok bool) {
+ slash := strings.Index(importPath, "/")
+ if slash == -1 {
+ return "", "", false
+ }
+ host := importPath[:slash]
+
+ // Fetch from an RFC5785-compliant URL:
+ for _, scheme := range []string{"https", "http"} {
+ url := fmt.Sprintf("%s://%s/.well-known/golang-vcs.json", scheme, host)
+ println("fetching", url)
+ res, err := http.Get(url)
+ if err != nil {
+ continue
+ }
+ if res.StatusCode != 200 {
+ continue
+ }
+ var conf vcsConfig
+ err = json.NewDecoder(res.Body).Decode(&conf)
+ if err == nil {
+ println("Path prefix: ", conf.PathPrefix)
+ return path.Join(host, conf.PathPrefix, importPath[slash:]), host, true
+ }
+ println("decode error: ", err.Error())
+ }
+ return "", "", false
+}
+
// expand rewrites s to replace {k} with match[k] for each key k in match.
func expand(match map[string]string, s string) string {
for k, v := range match {


David Symonds

unread,
Feb 14, 2012, 12:36:54 AM2/14/12
to brad...@golang.org, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
I'm not a fan. It's adding more magic to something that is already
fairly magical.


Dave.

Brad Fitzpatrick

unread,
Feb 14, 2012, 12:40:58 AM2/14/12
to David Symonds, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
I'm not a fan of,

1) ugly import paths
2) making github, bitbucket et al 1st class while relegating people running their own servers to crappy import paths

Rob 'Commander' Pike

unread,
Feb 14, 2012, 12:44:02 AM2/14/12
to brad...@golang.org, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
i appreciate the end but not the means.

-rob

David Symonds

unread,
Feb 14, 2012, 12:44:23 AM2/14/12
to Brad Fitzpatrick, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
Can't you set up a symlink or something on the server-side so that
camlistore.org/testlib looks like a git repo and thus works with the
go tool?


Dave.

Brad Fitzpatrick

unread,
Feb 14, 2012, 12:46:26 AM2/14/12
to David Symonds, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
no

because it doesn't have .git in it.

The shortest I could make it is:


which is still gross.

David Symonds

unread,
Feb 14, 2012, 12:50:55 AM2/14/12
to Brad Fitzpatrick, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
On Tue, Feb 14, 2012 at 4:46 PM, Brad Fitzpatrick <brad...@golang.org> wrote:

> because it doesn't have .git in it.

I thought the go tool did some probing to work out the VCS type if it
couldn't deduce it from the URL.


Dave.

Russ Cox

unread,
Feb 14, 2012, 12:52:21 AM2/14/12
to David Symonds, Brad Fitzpatrick, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
On Tue, Feb 14, 2012 at 00:50, David Symonds <dsym...@golang.org> wrote:
> I thought the go tool did some probing to work out the VCS type if it
> couldn't deduce it from the URL.

an early draft tried every possible vcs at every possible element
along the path, but that was rejected as expensive
and error prone.

Brad Fitzpatrick

unread,
Feb 14, 2012, 12:52:29 AM2/14/12
to David Symonds, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
not the current one.  did the old one?

the current one requires .git, .hg, .svn immediately in the URL, and then the only probing it does is which sub-protocol of the git, hg, svn to use (http, https, native)


Russ Cox

unread,
Feb 14, 2012, 12:53:35 AM2/14/12
to Brad Fitzpatrick, David Symonds, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
i like the general redirect idea but not so much
the implementation. i'll sleep on it.

Brad Fitzpatrick

unread,
Feb 14, 2012, 12:58:02 AM2/14/12
to r...@golang.org, David Symonds, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
Cool. I'm not wed to specific implementations.

Gary Burd

unread,
Feb 14, 2012, 1:21:48 AM2/14/12
to golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com
Suggestion:

If the URL is not recognized, then send a HEAD request for the URL.  If the response is redirect, then use the redirect location to find the repository using the usual rules.

Russ Cox

unread,
Feb 14, 2012, 1:24:39 AM2/14/12
to Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com

how does that help you if you have an
import path that turns out to be a subdirectory?
do you have to redirect the entire tree to
parallel paths? is that easy on most web servers?

Brad Fitzpatrick

unread,
Feb 14, 2012, 1:25:52 AM2/14/12
to Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com
On Tue, Feb 14, 2012 at 5:21 PM, Gary Burd <gary...@gmail.com> wrote:
Suggestion:

If the URL is not recognized, then send a HEAD request for the URL.  If the response is redirect, then use the redirect location to find the repository using the usual rules.

That does mean that domain.org/PKGNAME must exist, but that's probably a good thing.

If we did that, we'd want to send a certain Accept header from the go tool, so domain.org/PKGNAME can vary its redirect for humans vs. the tool.  (redirecting to a source browser / info page, instead of a raw git URL).

It is harder for most people to configure redirects than upload files, though.  Especially for something like HEAD-vs-GET redirects.

David Symonds

unread,
Feb 14, 2012, 1:26:54 AM2/14/12
to r...@golang.org, Gary Burd, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com

I know it's easy for Apache, and it's easy enough to do with a Go web server.


Dave.

Russ Cox

unread,
Feb 14, 2012, 1:30:00 AM2/14/12
to David Symonds, Gary Burd, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
On Tue, Feb 14, 2012 at 01:26, David Symonds <dsym...@golang.org> wrote:
> I know it's easy for Apache, and it's easy enough to do with a Go web server.

I care a lot more about people running non-programmable web servers.
What is the .htaccess line? Is it one line?

Gary Burd

unread,
Feb 14, 2012, 1:30:52 AM2/14/12
to golan...@googlegroups.com, Gary Burd, David Symonds, re...@codereview-hr.appspotmail.com, r...@golang.org
The entire tree should be redirected to parallel paths.  This is easy on many web servers.

David Symonds

unread,
Feb 14, 2012, 1:39:17 AM2/14/12
to r...@golang.org, Gary Burd, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com

Advanced .htaccess configuration tends to vary between servers, but at
least for Apache it's easy:
RedirectMatch 301 /testlib/(.*) /ugly/path/testlib.git/$1


Dave.

Gary Burd

unread,
Feb 14, 2012, 2:43:07 AM2/14/12
to golan...@googlegroups.com, Gary Burd, David Symonds, re...@codereview-hr.appspotmail.com
> we'd want to send a certain Accept header from the go tool, 
> so domain.org/PKGNAME can vary its redirect for humans 
> vs. the tool

This is where my suggestion runs into some trouble.  Apache supports conditional redirects based on the Accept header, but the server configuration is a big step up in complexity from the unconditional redirect. 

>  Especially for something like HEAD-vs-GET redirects.

I am not suggesting that the server redirect differently for HEAD and GET requests.  Because the go tool does not care about the response body in this scenario, the tool can send a HEAD instead of a GET.

Russ Cox

unread,
Feb 14, 2012, 11:53:27 AM2/14/12
to Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com
I think it is important that this work across servers.
I like the idea that you can delegate your own import path space
to hosting services, so that you are not tied to physical repo
location or choice of version control system.

I want to preserve the current property that an import path is
a URL you can load in a browser. That property is not actually
true all the time even now (I have feature requests in at
Google Code, GitHub, and Bitbucket), so ideally whatever fix
we come up with would make it true more of the time, not less.

I also want it to be easy/trivial for people to set up. You
shouldn't have to write your own web server (not that there's
anything wrong with that) to set up a delegation.

All those properties suggest that something like Gary proposed
is a better choice than .well-known. The Accept header is
probably overkill; Gary says it is hard to handle, and I will
add that it is hard to test using a browser.

Maybe a query parameter, like http://importpath?go-get-package=1.
That can be handled in Apache with

RewriteEngine On
RewriteCond %{QUERY_STRING} ^go-get-vcs=1$
RewriteRule ^/my/package/dir$ https://github.com/me/package/dir [R=302,L]

If we do this, we should change the canonical import paths for
the subrepositories to be golang.org/crypto, golang.org/net,
and so on.

These redirects would be a little restricted in their form:
a redirect for a specific import path would have to go to
a known hosting import path with a location within the repo
that was already contained in the original URL. For example,
the fact that you can't do a checkout of just a subdirectory
from git or hg means that swtch.com/csearch cannot delegate
to code.google.com/p/codesearch/cmd/csearch, because
there would be nowhere to write the cmd directory.
Here the path within the repo is cmd/csearch, and that path
must appear in the original for this to be well-defined.

Russ

Gustavo Niemeyer

unread,
Feb 14, 2012, 12:36:32 PM2/14/12
to r...@golang.org, Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com
> RewriteEngine On
> RewriteCond %{QUERY_STRING} ^go-get-vcs=1$
> RewriteRule ^/my/package/dir$ https://github.com/me/package/dir [R=302,L]

The ability to send people and go get to different locations would be
great indeed.

> If we do this, we should change the canonical import paths for
> the subrepositories to be golang.org/crypto, golang.org/net,
> and so on.

Or golang.org/pkg/net. Either that, or change godoc so it serves
documentation at /<pkg>.

> that was already contained in the original URL.  For example,
> the fact that you can't do a checkout of just a subdirectory
> from git or hg means that swtch.com/csearch cannot delegate
> to code.google.com/p/codesearch/cmd/csearch, because
> there would be nowhere to write the cmd directory.

This sounds like a good idea anyway, even without considering the
technical limitation.

--
Gustavo Niemeyer
http://niemeyer.net
http://niemeyer.net/plus
http://niemeyer.net/twitter
http://niemeyer.net/blog

-- I'm not absolutely sure of anything.

Gary Burd

unread,
Feb 14, 2012, 12:58:53 PM2/14/12
to golan...@googlegroups.com, Gary Burd, David Symonds, re...@codereview-hr.appspotmail.com, r...@golang.org
The Accept header is probably overkill; Gary says it is hard to handle,

The Accept header configuration is similar to your query string configuration. Conditional redirects are more complex than unconditional redirects on Apache because conditional redirects require some knowledge of mod_rewrite.

Gustavo Niemeyer

unread,
Feb 14, 2012, 1:04:34 PM2/14/12
to Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com, r...@golang.org

Indeed it is similar for Apache, but it'd be nice to be able to easily
emulate the behavior of go get from a browser for testing and
introspection purposes, and making it unconditional is not an option
if we want to be able to send people and go get to different places
(doc vs. repo).

Russ Cox

unread,
Feb 14, 2012, 1:07:45 PM2/14/12
to Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com
On Tue, Feb 14, 2012 at 12:58, Gary Burd <gary...@gmail.com> wrote:
> The Accept header configuration is similar to your query string
> configuration. Conditional redirects are more complex than unconditional
> redirects on Apache because conditional redirects require some knowledge of
> mod_rewrite.

I see. That's fine, but it still doesn't address testing.
I want to be able to test in a browser.

Russ

Gary Burd

unread,
Feb 14, 2012, 1:29:21 PM2/14/12
to golan...@googlegroups.com, Gary Burd, David Symonds, re...@codereview-hr.appspotmail.com, r...@golang.org
A problem with redirects is that sites hosted on Amazon S3 and similar services cannot use redirects.

Here are some other suggestions:

- Redirect using regexp and expand template pairs found in a /.well-known file.

- If the URL is not understood, then fetch the resource at the URL and look an alternate location specified in a <link rel="alternate"> element.  This requires a one to one mapping between documents on the site and packages in the repository. 

Russ Cox

unread,
Feb 14, 2012, 1:49:34 PM2/14/12
to Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com
On Tue, Feb 14, 2012 at 13:29, Gary Burd <gary...@gmail.com> wrote:
> A problem with redirects is that sites hosted on Amazon S3 and similar
> services cannot use redirects.

Why not?

Gary Burd

unread,
Feb 14, 2012, 2:25:13 PM2/14/12
to golan...@googlegroups.com, Gary Burd, David Symonds, re...@codereview-hr.appspotmail.com, r...@golang.org
Amazon S3 does not have the feature.  There's no way to specify redirect rules or control the response status code for an object. 

Russ Cox

unread,
Feb 14, 2012, 2:29:43 PM2/14/12
to Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com
On Tue, Feb 14, 2012 at 14:25, Gary Burd <gary...@gmail.com> wrote:
> Amazon S3 does not have the feature.  There's no way to specify redirect
> rules or control the response status code for an object.

How many people run web sites directly out of S3
and would want to use those domain names as import paths?
A redirect is a pretty fundamental concept for a web site.

Russ

Gary Burd

unread,
Feb 14, 2012, 2:45:43 PM2/14/12
to golan...@googlegroups.com, Gary Burd, David Symonds, re...@codereview-hr.appspotmail.com, r...@golang.org
How many people run web sites directly out of S3

Amazon added the features to run a site out of S3 one year ago (http://www.allthingsdistributed.com/2011/02/website_amazon_s3.html). Github supported static sites before that.

I don't know how many people are using static sites, but the popularity seems to be increasing based on discussions at news.ycombinator.com (http://www.google.com/search?q=jekyll+site:news.ycombinator.com).

I host my site on S3. Werner Vogels' site linked above is also hosted on S3. 

Russ Cox

unread,
Feb 14, 2012, 3:01:58 PM2/14/12
to Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com
Okay, let's suppose we're not going to use the redirect. It has this S3
problem but also has the implicit constraints I mentioned earlier.

What if instead we said that you fetch the page at that URL and look
for a <meta> tag?

<meta name="go-import" content="swtch.com/codesearch hg
https://code.google.com/p/codesearch">

The three space-separated fields are import path prefix, vcs, repo root
corresponding to that import path prefix. There can be more than one
meta tag, but if we just fetched the HTML for x.com/y/z then we're only
interested in the tag with a prefix that is a prefix of x.com/y/z.

In the most trivial case, you can write a list of all your repositories and
put it in a global HTML template or in the 404 page. You don't have to
generate a different line for each URL you serve (like you'd have to
generate a different redirect for each URL), it works with static content
servers, and it is still trivially testable in a browser. In fact it encourages
people to make their import paths work in a browser.

Russ

Gustavo Niemeyer

unread,
Feb 14, 2012, 3:39:13 PM2/14/12
to r...@golang.org, Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com
On Tue, Feb 14, 2012 at 18:01, Russ Cox <r...@golang.org> wrote:
> What if instead we said that you fetch the page at that URL and look
> for a <meta> tag?
>
> <meta name="go-import" content="swtch.com/codesearch hg
> https://code.google.com/p/codesearch">

That looks nice, but can we please introduce the aspect of "go get"
using a query argument? Without something like that, we can't
distinguish who's being served at the server side, which restricts
possibilities like redirecting people to an external documentation
site like Gary's gopkgdoc, for example, or even generating the page
for go get dynamically without interfering with the normal site
content.

> In the most trivial case, you can write a list of all your repositories and
> put it in a global HTML template or in the 404 page.  You don't have to

Implementing this with Apache is still no harder than using a
redirect. It may glob a prefix and serve a static page for everything
under it, assuming one doesn't want to render the data as part of the
normal site.

Russ Cox

unread,
Feb 14, 2012, 3:41:38 PM2/14/12
to Gustavo Niemeyer, Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com
On Tue, Feb 14, 2012 at 15:39, Gustavo Niemeyer <gus...@niemeyer.net> wrote:
> That looks nice, but can we please introduce the aspect of "go get"
> using a query argument?

ok

Daniel Krech

unread,
Feb 14, 2012, 4:05:01 PM2/14/12
to r...@golang.org, Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com

On Feb 14, 2012, at 3:01 PM, Russ Cox wrote:

> What if instead we said that you fetch the page at that URL and look
> for a <meta> tag?

Nice. +1 here.

Would the <meta> tag be optional for those wanting to define a short package url and those wanting to be explicit with the current behavior remaining when no <meta> tag is found?


Russ Cox

unread,
Feb 14, 2012, 4:28:24 PM2/14/12
to Daniel Krech, Gary Burd, golan...@googlegroups.com, David Symonds, re...@codereview-hr.appspotmail.com
Recognized paths (github, bitbucket, things with .git in the
import path) would not follow this process.

unthe...@googlemail.com

unread,
Feb 16, 2012, 1:47:25 AM2/16/12
to brad...@golang.org, golan...@googlegroups.com, dsym...@golang.org, r...@google.com, r...@golang.org, gary...@gmail.com, gus...@niemeyer.net, eik...@eikeon.com, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
On 2012/02/14 20:01:58, rsc wrote:
> Okay, let's suppose we're not going to use the redirect. It has this
S3
> problem but also has the implicit constraints I mentioned earlier.

> What if instead we said that you fetch the page at that URL and look
> for a <meta> tag?

Isn't this worse than the original bradftz's json based method?

http://codereview.appspot.com/5660051/

Russ Cox

unread,
Feb 16, 2012, 2:27:18 PM2/16/12
to brad...@golang.org, golan...@googlegroups.com, dsym...@golang.org, r...@google.com, r...@golang.org, gary...@gmail.com, gus...@niemeyer.net, eik...@eikeon.com, unthe...@googlemail.com, re...@codereview-hr.appspotmail.com
On Thu, Feb 16, 2012 at 01:47, <unthe...@googlemail.com> wrote:
> Isn't this worse than the original bradftz's json based method?

No.

Brad Fitzpatrick

unread,
Feb 21, 2012, 4:59:52 AM2/21/12
to golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
On Mon, Feb 13, 2012 at 9:34 PM, <brad...@golang.org> wrote:
Reviewers: golang-dev_googlegroups.com,

Message:
Hello golan...@googlegroups.com (cc: golan...@googlegroups.com),

I'd like you to review this change to
https://go.googlecode.com/hg/


Description:
cmd/go: allow short custom domain imports

For discussion.

With this CL, I can now type:

$ go get camlistore.org/testlib

Instead of:

$ go get camlistore.org/r/p/camlistore.git/testlib

... which is an ugly import path.

I've deleted testlib and testlib2 from the camlistore.org repo.

To test this CL, try instead:


etc
 

Russ Cox

unread,
Feb 21, 2012, 2:09:00 PM2/21/12
to Brad Fitzpatrick, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
Are you planning to update this CL to implement the scheme we converged on?
I am not a fan of /.well-known because, among other things, it requires control
of the root directory of the web server.

Brad Fitzpatrick

unread,
Feb 21, 2012, 5:03:34 PM2/21/12
to r...@golang.org, golan...@googlegroups.com, re...@codereview-hr.appspotmail.com
On Wed, Feb 22, 2012 at 6:09 AM, Russ Cox <r...@golang.org> wrote:
Are you planning to update this CL to implement the scheme we converged on?

Sure.  I didn't know we'd converged.

I've created http://code.google.com/p/go/issues/detail?id=3099 ... please update that if I missed any design details.
Reply all
Reply to author
Forward
0 new messages