Making PATH_INFO possibly unrelated to the request URI and defining REQUEST_URI

Carl Lerche

Apr 7, 2009, 1:52:43 PM
to Rack Development
There is no current rack spec describing what REQUEST_URI should be.
Currently, both Merb and Rails use REQUEST_URI instead of PATH_INFO to
figure out the current path of the request. This obviously is not OK
if both frameworks are going to be fully rack compatible.

Now, PATH_INFO (and SCRIPT_NAME) are currently defined as being
subsets of the REQUEST_URI, SCRIPT_NAME being the first part to the
root of the application and PATH_INFO being the remainder defining the
path relative to the root of the rack application.

The question is, what is the reasoning behind having PATH_INFO be a
subset of REQUEST_URI? I have two use cases where it might be
interesting to change PATH_INFO to something unrelated to REQUEST_URI.

First, we have made each controller in Merb and Rails a rack
application. The question is how the controller would know which
action to dispatch the request to. Currently, we're setting some
information in a special key in the rack env, but it seems like it
could make more sense to just mutate PATH_INFO to reflect the action
name to dispatch to, so when #call is invoked on the controller,
env['PATH_INFO'] == "/<action_name>", in which case the PATH_INFO would
be unrelated to REQUEST_URI. It would also make action dispatching
consistent with the way Sinatra currently works.
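
A minimal sketch of what this could look like, assuming a made-up
PostsController and a made-up dispatch helper (neither is Merb or Rails
code). The router rewrites PATH_INFO to "/<action_name>" on a copy of the
env, and the controller, acting as a plain rack application, picks the
action from it:

```ruby
# Hypothetical controller that is itself a rack application.
# The action to run is read from PATH_INFO ("/show" -> #show).
class PostsController
  ACTIONS = %w[index show].freeze

  def call(env)
    action = env["PATH_INFO"].to_s.sub(%r{\A/}, "")
    action = "index" if action.empty?
    unless ACTIONS.include?(action)
      return [404, { "Content-Type" => "text/plain" }, ["no such action"]]
    end

    [200, { "Content-Type" => "text/plain" }, [public_send(action, env)]]
  end

  def index(_env)
    "listing posts"
  end

  def show(env)
    "showing post for #{env['QUERY_STRING']}"
  end
end

# Hypothetical router step: rewrite PATH_INFO on a copy of the env
# before handing the request to the controller.
def dispatch(controller, action, env)
  controller.call(env.merge("PATH_INFO" => "/#{action}"))
end

env = { "SCRIPT_NAME" => "", "PATH_INFO" => "/posts/1", "QUERY_STRING" => "id=1" }
status, _headers, body = dispatch(PostsController.new, "show", env)
```

Note that once PATH_INFO is rewritten, the controller can no longer tell
that the client asked for /posts/1, which is the motivation for an
immutable copy of the original path.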

The second use case could be something like an authentication rack
middleware. Daniel Neighman (hassox, who did merb-auth) is trying to
work on this right now. He is building it such that you can specify a
rack application to call when the authentication failed. The question
he is having is similar to the controller scenario: how does the rack
application know that it is getting a failed authentication? One idea
would be to set PATH_INFO to /unauthenticated or something similar.
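
As a rough illustration only (this is not Daniel's middleware; the class
name AuthGate and the header check are assumptions made for the example),
a middleware could route failed requests to a separate rack application
with PATH_INFO set to /unauthenticated:

```ruby
# Hypothetical authentication middleware. On failure it calls a
# separate "failure" rack app with PATH_INFO rewritten.
class AuthGate
  def initialize(app, failure_app)
    @app = app
    @failure_app = failure_app
  end

  def call(env)
    # Stand-in check: any Authorization header counts as authenticated.
    return @app.call(env) if env["HTTP_AUTHORIZATION"]

    @failure_app.call(env.merge("PATH_INFO" => "/unauthenticated"))
  end
end

ok_app = ->(_env) { [200, { "Content-Type" => "text/plain" }, ["hello"]] }

# The failure app learns why it was called from PATH_INFO alone.
failure_app = lambda do |env|
  if env["PATH_INFO"] == "/unauthenticated"
    [401, { "Content-Type" => "text/plain" }, ["please log in"]]
  else
    [404, { "Content-Type" => "text/plain" }, ["not found"]]
  end
end

stack = AuthGate.new(ok_app, failure_app)
denied  = stack.call({ "PATH_INFO" => "/admin" })
allowed = stack.call({ "PATH_INFO" => "/admin", "HTTP_AUTHORIZATION" => "Basic dXNlcjpwYXNz" })
```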

The proposal then would be to add REQUEST_URI to the rack spec and for
it to be immutable. SCRIPT_NAME would remain a subset of REQUEST_URI
(the initial portion of the request URL's "path" that corresponds to the
application object). PATH_INFO, on the other hand, would possibly not
be related to REQUEST_URI, but it would STILL designate the virtual
"location" of the request's target within the application. This is
basically adding "rewriting" ability to the spec.

Thoughts?

Carl Lerche
Engine Yard

Yehuda Katz

Apr 7, 2009, 1:59:46 PM
to rack-...@googlegroups.com
+1.

The biggest issue here is standardizing REQUEST_URI, which would allow us to use PATH_INFO as a true "virtual root".

Adding URL rewriting to Rack would open up a lot of interesting possibilities that would allow more cross-use of components without having to know which specific env keys to rewrite.

-- Yehuda
--
Yehuda Katz
Developer | Engine Yard
(ph) 718.877.1325

Christian Neukirchen

Apr 7, 2009, 3:34:31 PM
to rack-...@googlegroups.com
Carl Lerche <carl....@gmail.com> writes:

> There is no current rack spec describing what REQUEST_URI should be.

Trouble starts in that there is absolutely no specification on what
REQUEST_URI even means. Try four different webservers, get four
different results.

I think a not-to-be-changed copy of the entire request path could be
useful, but let's not call it REQUEST_URI.

--
Christian Neukirchen <chneuk...@gmail.com> http://chneukirchen.org

Yehuda Katz

Apr 7, 2009, 3:41:19 PM
to rack-...@googlegroups.com
What do you think about making PATH_INFO rewritable, making it a potentially true virtual location?

-- Yehuda

Jon Crosby

Apr 7, 2009, 3:51:10 PM
to rack-...@googlegroups.com
+1

I am in favor of providing the ability to 1) see a virtual mount point
and 2) know the full original information about the incoming request.
Making it part of the spec would enable middleware and app cooperation
without one-off ENV hacks.

Jon Crosby
http://joncrosby.me

Christian Neukirchen

Apr 7, 2009, 4:17:58 PM
to rack-...@googlegroups.com
Yehuda Katz <wyc...@gmail.com> writes:

> What do you think about making PATH_INFO rewritable, making it a potentially
> true virtual location?

This may be specified awkwardly, but already is supposed to work.

Yehuda Katz

Apr 7, 2009, 4:19:12 PM
to rack-...@googlegroups.com
w00t. It's basically an internal redirect. Nice!

-- Yehuda

Carl Lerche

Apr 7, 2009, 5:03:06 PM
to Rack Development
My concern with rewriting PATH_INFO to something completely different
is that the rack end point will not know what the original request URI
was. I'm fine with not using REQUEST_URI as the key, but let's pick a
name.

REQUEST_PATH, ORIGINAL_PATH, FULL_PATH, any other ideas?

Carl Lerche
Engine Yard


Daniel N

Apr 7, 2009, 7:00:34 PM
to rack-...@googlegroups.com
On Wed, Apr 8, 2009 at 7:03 AM, Carl Lerche <carl....@gmail.com> wrote:

> My concern with rewriting PATH_INFO to something completely different
> is that the rack end point will not know what the original request URI
> was. I'm fine with not using REQUEST_URI as the key, but let's pick a
> name.
>
> REQUEST_PATH, ORIGINAL_PATH, FULL_PATH, any other ideas?
>
> Carl Lerche
> Engine Yard

+1 for REQUEST_PATH as the original immutable version

Yehuda Katz

Apr 7, 2009, 7:05:45 PM
to rack-...@googlegroups.com
REQUEST_PATH would probably not include the host, scheme, etc.

what about CLIENT_URI?

-- Yehuda

Daniel N

Apr 7, 2009, 7:15:22 PM
to rack-...@googlegroups.com
On Wed, Apr 8, 2009 at 9:05 AM, Yehuda Katz <wyc...@gmail.com> wrote:
> REQUEST_PATH would probably not include the host, scheme, etc.
>
> what about CLIENT_URI?

Ok, I thought we were talking just about the path, since mutating PATH_INFO doesn't affect the scheme or host anyway.  So long as I know which version to use I'm happy with whatever ;)

Cheers
Daniel
 

boug...@gmail.com

Apr 7, 2009, 7:29:00 PM
to rack-...@googlegroups.com
Request#fullpath already exists, does it not?


Yehuda Katz

Apr 7, 2009, 7:35:52 PM
to rack-...@googlegroups.com
    def path
      script_name + path_info
    end
   
    def fullpath
      query_string.empty? ? path : "#{path}?#{query_string}"
    end

Which doesn't help for the reasons discussed above :P
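
To make the point concrete, here is a small sketch using a stand-in
struct (MiniRequest is hypothetical; only the two method bodies are
taken from the snippet above). Because fullpath is derived from
SCRIPT_NAME and PATH_INFO, it changes as soon as PATH_INFO is rewritten:

```ruby
# Hypothetical stand-in that mirrors the two methods quoted above.
MiniRequest = Struct.new(:env) do
  def script_name
    env["SCRIPT_NAME"].to_s
  end

  def path_info
    env["PATH_INFO"].to_s
  end

  def query_string
    env["QUERY_STRING"].to_s
  end

  def path
    script_name + path_info
  end

  def fullpath
    query_string.empty? ? path : "#{path}?#{query_string}"
  end
end

env = { "SCRIPT_NAME" => "/blog", "PATH_INFO" => "/posts/1", "QUERY_STRING" => "page=2" }
before = MiniRequest.new(env).fullpath # "/blog/posts/1?page=2"

# A middleware rewrites PATH_INFO, e.g. to an action name.
env["PATH_INFO"] = "/show"
after = MiniRequest.new(env).fullpath  # "/blog/show?page=2"
```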

-- Yehuda

Christian Neukirchen

Apr 8, 2009, 7:20:56 AM
to rack-...@googlegroups.com
Carl Lerche <carl....@gmail.com> writes:

> REQUEST_PATH, ORIGINAL_PATH, FULL_PATH, any other ideas?

rack.original_path

Do we need to keep the SERVER_NAME etc as well? Then maybe rather

rack.original_uri

Either way, it needs to be specced well.
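
One possible shape for this, sketched as an outermost middleware. The
key name rack.original_path is only the name floated in this thread and
is not part of the rack spec; the class name OriginalPath is likewise
made up. Freezing the string only guards against in-place mutation,
since a later middleware could still reassign the key, which is why it
would need to be specced:

```ruby
# Hypothetical outermost middleware that records the path as first seen.
class OriginalPath
  KEY = "rack.original_path" # proposed name only; not in the rack spec

  def initialize(app)
    @app = app
  end

  def call(env)
    # Record once; later rewrites of PATH_INFO do not touch this key.
    env[KEY] ||= (env["SCRIPT_NAME"].to_s + env["PATH_INFO"].to_s).freeze
    @app.call(env)
  end
end

# A rewriting middleware and an endpoint, both made up for the example.
rewriter = ->(app) { ->(env) { app.call(env.merge("PATH_INFO" => "/unauthenticated")) } }
endpoint = ->(env) { [200, {}, ["#{env['PATH_INFO']} (was #{env[OriginalPath::KEY]})"]] }

stack = OriginalPath.new(rewriter.call(endpoint))
_status, _headers, body = stack.call({ "SCRIPT_NAME" => "/app", "PATH_INFO" => "/admin" })
```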

Carl Lerche

Apr 8, 2009, 7:34:15 PM
to Rack Development
The spec does not seem to indicate that SERVER_NAME can be changed in
middleware. In which case, I don't think we need to track it.

Sam Roberts

Apr 9, 2009, 12:15:26 AM
to rack-...@googlegroups.com
On Tue, Apr 7, 2009 at 12:34 PM, Christian Neukirchen
<chneuk...@gmail.com> wrote:
>
> Carl Lerche <carl....@gmail.com> writes:
>
>> There is no current rack spec describing what REQUEST_URI should be.
>
> Trouble starts in that there is absolutely no specification on what
> REQUEST_URI even means.  Try four different webservers, get four
> different results.
>
> I think a not-to-be-changed copy of the entire request path could be
> useful, but let's not call it REQUEST_URI.

I agree.

I had trouble finding the url to my app's rack mount point in a way
that worked with cgi and mongrel, and worked with .htaccess url
rewriting.

Basically, I have a Sinatra app that can be mounted at arbitrary
points using rack, and that needs to know its location to return html
that points back at subpaths within it. Also, I need to strip the
request paths of query parameters.

I ended up with this:

require "uri"

# Complete path, as requested by the client. Take care about CGI
# path rewriting.
def request_path
  # Using .to_s because rack/request.rb does, though I think the Rack
  # spec requires these to be strings already.
  begin
    URI.parse(env["SCRIPT_URI"].to_s).path
  rescue
    env["SCRIPT_NAME"].to_s + env["PATH_INFO"].to_s
  end
end

# Complete path, as requested by the client, without the env's PATH_INFO.
# This is the path to whatever is "handling" the request.
#
# Recent discussions on how PATH_INFO must be decoded leads me to think
# this might not work if the path had any URL encoded characters in it.
def script_path
  # Regexp.escape keeps characters such as "." or "+" in PATH_INFO
  # from being treated as regexp syntax.
  request_path.sub(/#{Regexp.escape(env["PATH_INFO"].to_s)}$/, "")
end

It is particularly difficult to find the original paths in the face of
url rewriting; it would be nice if the rack spec forced the handlers
to gather this information in a coherent and well-defined way from the
servers, and pass it through as "rack." env variables.

Cheers,
Sam

Carl Lerche

Apr 9, 2009, 1:04:21 AM
to Rack Development
What is needed to move forward with this?

candlerb

Apr 9, 2009, 3:28:15 AM
to Rack Development
> Basically, I have a Sinatra app that can be mounted at arbitrary
> points using rack, and that needs to know it's location to return html
> that points back at subpaths within it. Also, I need to strip the
> request paths of query parameters.

I wrote a simple link helper like this:

helpers do
  def l(*args)
    args.compact!
    query = args.pop if args.last.is_a?(Hash)
    path = env["SCRIPT_NAME"] + "/" + args.map { |a| escape(a) }.join("/")
    path << "?" << build_query(query) if query
    path
  end
end

-# examples:
-# /application.css
%link{:rel=>"stylesheet",:type=>"text/css",:media=>"screen",:href=>l("application.css")}

-# /files/new
%a{:href=>l(:files,:new)}

-# /files/new?subdir=xxx
%a{:href=>l(:files,:new,"subdir"=>params["subdir"])}

I believe this works with Rack::URLMap rewriting and see no reason why
it shouldn't work with CGI, so I guess the problem is specifically
running as a CGI under Apache which is also doing .htaccess
mod_rewrite.

Running an application framework like this as a CGI - even a
comparatively small one like Sinatra+Rack - is going to have a pretty
painful startup overhead per request, and so supporting this wouldn't
be my number one priority. Perhaps FastCGI or SCGI are more important
to support. However, I don't know how mod_rewrite and SCRIPT_NAME
interact for those.

However, if you are using mod_proxy and rewriting the URL to a
different path, then this will have the same problem. I can't see any
option other than explicitly configuring the application with its
mountpoint, because the proxied HTTP request will not carry the
'original' URL anyway.

Perhaps the thing to do is to look at Rails and see how its link
helpers work? I know that at worst you can override what it generates
(using config.action_controller.relative_url_root which in turn
defaults to ENV['RAILS_RELATIVE_URL_ROOT'], and/or def
default_url_options in your controller)

In any case, even modular applications still need to be able to link
to each other, and hence need to know where these other apps are
mounted. IMO, knowing where the app itself is mounted is just one
extra piece of configuration information.

Regards,

Brian.

Magnus Holm

Apr 9, 2009, 6:31:39 AM
to rack-...@googlegroups.com
I've been using this middleware to clean up after mod_rewrite. It's been a while since I wrote it, so I'm not absolutely sure how it works… Maybe something like this should go in contrib?

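  class ModRewriteFixer
    def initialize(app)
      @app = app
    end

    def call(env)
      # r: REQUEST_URI without the query string (everything before "?").
      r = env["REQUEST_URI"][0..(env["REQUEST_URI"].index("?")||0)-1]
      # d: length of the part of r that comes before PATH_INFO.
      d = r.length - (env["PATH_INFO"]||'/').length
      # Rebuild both keys from the original request URI.
      env["SCRIPT_NAME"] = r[0...d]
      env["PATH_INFO"] = r[d..-1]
      @app.call(env)
    end
  end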
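
A worked example may help show what the arithmetic does. The sketch
below pulls the same computation into a standalone helper
(split_request_uri is a made-up name) and assumes, as the middleware
does, that the path in REQUEST_URI ends with PATH_INFO:

```ruby
# Hypothetical helper mirroring the middleware's computation.
# Returns [script_name, path_info] recovered from the request URI.
def split_request_uri(request_uri, path_info)
  path = request_uri.split("?", 2).first.to_s # drop the query string
  path_info ||= "/"
  cut = path.length - path_info.length        # where PATH_INFO starts
  [path[0...cut], path[cut..-1]]
end

# With mod_rewrite, the CGI env might say SCRIPT_NAME=/cgi.rb while the
# client actually requested /blog/whatever?page=2.
script_name, path_info = split_request_uri("/blog/whatever?page=2", "/whatever")
# script_name is "/blog", path_info is "/whatever"
```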

//Magnus Holm

Yehuda Katz

Apr 9, 2009, 12:09:09 PM
to rack-...@googlegroups.com
You wouldn't need this middleware if rack servers were forced to supply a rack.original_uri key that had the info you needed. :)

-- Yehuda

Carl Lerche

Apr 9, 2009, 1:07:48 PM
to Rack Development
Indeed, this is not a middleware issue, this is a spec issue. It needs
to be specced that rack.original_uri cannot be mutated by middleware.

Sam Roberts

Apr 9, 2009, 1:11:32 PM
to rack-...@googlegroups.com
On Thu, Apr 9, 2009 at 9:09 AM, Yehuda Katz <wyc...@gmail.com> wrote:
> You wouldn't need this middleware if rack servers were forced to supply a
> rack.original_uri key that had the info you needed. :)

+100!

I like that rack uses a CGI-like env as its base, and doesn't hack it.
But there is information that isn't available in a standard fashion in
the CGI environment, and non-CGI adapters are different, anyhow.

Knowing where you are, and in particular having the information
available in a URL-encoded form (so it isn't damaged, and it's
reversible), would be really handy. Note the recent problems with
PATH_INFO, and the questions about what its form is (encoded vs.
decoded).

I've only worked with 2 (or 3?) adapters, and I had to run them all in
debug mode, examine the env, and develop a strategy for finding my
app's location. It shouldn't be necessary to run every rack adapter to
write a rack-based app!

Sam Roberts

Apr 9, 2009, 1:52:48 PM
to rack-...@googlegroups.com
On Thu, Apr 9, 2009 at 12:28 AM, candlerb <b.ca...@pobox.com> wrote:
> Sam wrote:
>> Basically, I have a Sinatra app that can be mounted at arbitrary
>> points using rack, and that needs to know it's location to return html
>> that points back at subpaths within it. Also, I need to strip the
>> request paths of query parameters.
>
> I wrote a simple link helper like this:
...

> path = env["SCRIPT_NAME"] + "/" + args.map { |a| escape(a) }.join

SCRIPT_NAME isn't what you want when deploying under CGI, and you
shouldn't have to know that, or test with every rack adapter.

> Running an application framework like this as a CGI - even a
> comparatively small one like Sinatra+Rack - is going to have a pretty
> painful startup overhead per request, and so supporting this wouldn't
> be my number one priority. Perhaps FastCGI or SCGI are more important
> to support. However, I don't know how mod_rewrite and SCRIPT_NAME
> interact for those.

Important to who? Rack isn't all about web apps with small low-latency
requests, I hope!

Whether CGI overhead is "painful" depends on how much work the script
does; if it performs a high-latency task, the overhead is
unnoticeable.

Note that CGI is:

- trivial to deploy, often just involving copying the exe into cgi-bin/

- is trivially parallelizable even with ruby 1.8, in the sense that
every request is its own process, so if your http server runs CGI
scripts in parallel, you get true parallelism, no need for event
driven co-ordination through rack to the server. This is particularly
nice when it takes a long time to service requests and you don't want
the whole server blocked - note that this is the case where CGI
overhead is irrelevant. For long-latency service points in a SOA
architecture, this sidesteps various blocking issues.

- is naturally in "development mode", since it always does reloading
by its nature, so all the difficulties getting sinatra apps to reload
disappear (that whole rack middleware thing where you use a non-CGI
server, but then fork and run everything in another ruby instance is
basically convoluted CGI).

I'm not trying to convince anybody to use CGI, running merb or rails
under it would probably classify as criminally incompetent, but rack
can be agnostic as to the kinds of apps built on it.

> However, if you are using mod_proxy and rewriting the URL to a
> different path, then this will have the same problem. I can't see any
> option other than explicitly configuring the application with its
> mountpoint, because the proxied HTTP request will not carry the
> 'original' URL anyway.

I'm surprised the original external-facing request paths aren't passed
onwards with proxies, though. Too bad.

Anyhow, not the same thing, it's the difference between possible and
impossible. If proxying/http forwarding masks the external URL, then
you do indeed need configuration.

For Apache's CGI implementation, it provides external facing URL info
(pre-rewritten), and post-rewritten, so configuration wasn't necessary
for me running under the two rack handlers I tested.

Sam

Magnus Holm

Apr 9, 2009, 2:01:21 PM
to rack-...@googlegroups.com
I'm still wondering how we're going to handle internal "redirects" in other parts of the stack.

When you use CGI + mod_rewrite, the SERVER_NAME will be /cgi.rb (or maybe /config.ru) while PATH_INFO is /whatever. However, the "real" URL may be /blog/whatever. The standard approach to generate URLs for an app is to use SERVER_NAME+/some_url, but this won't work properly unless you use the ModRewriteFixer above. rack.original_uri might help me figure out the *current* URI (omg, I need to learn the difference between URL & URI soon), but SERVER_NAME is still "broken".

It appears that the only way to figure out the true mount-point (aka SERVER_NAME) is to take REQUEST_URI and strip away the PATH_INFO. Even though we now will save the REQUEST_URI, changing PATH_INFO makes it impossible to detect SERVER_NAME unless this is done right in the handlers (so that the applications don't need to think about it).

Instead of using the SERVER_NAME which (F)CGI gives you, I believe it's more correct to set it to REQUEST_URI - PATH_INFO.

(I just received Sam's message now, and it looks like we're saying pretty much the same. Gah, I'm way too slow!)

//Magnus Holm

candlerb

Apr 9, 2009, 3:04:53 PM
to Rack Development
> > I wrote a simple link helper like this:
> ...
> >      path = env["SCRIPT_NAME"] + "/" + args.map { |a| escape(a) }.join
>
> SCRIPT_NAME isn't what you want when deploying under CGI

Isn't it? Why not? It seems to work for me under Apache 2.2.8 (stock
from Ubuntu Hardy):

--- env.rb ---
#!/usr/local/bin/ruby
$:.unshift "/home/brian/git/rack/lib"
$:.unshift "/home/brian/git/sinatra/lib"
require 'sinatra'

set :server, :cgi
get '/*' do
  "SCRIPT_NAME=#{ENV['SCRIPT_NAME'].inspect}; " +
    "PATH_INFO=#{ENV['PATH_INFO'].inspect}"
end

-----

$ telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /cgi-bin/env.rb/foo/bar HTTP/1.0
Host: localhost

HTTP/1.1 200 OK
Date: Thu, 09 Apr 2009 18:58:04 GMT
Server: Apache/2.2.8 (Ubuntu)
Content-Length: 51
Connection: close
Content-Type: text/html

SCRIPT_NAME="/cgi-bin/env.rb"; PATH_INFO="/foo/bar"

Maybe I've overlooked something, but there seems to be sufficient to
construct paths there.

Re performance: yes you're right, some applications don't care about
per-request startup overhead. However a quick test on my
(underpowered, 1GHz VIA + 1GB RAM) machine gives response time of
between 0.4 and 0.5 seconds for that script:

$ time curl http://localhost/cgi-bin/env.rb/foo/bar
SCRIPT_NAME="/cgi-bin/env.rb"; PATH_INFO="/foo/bar"
real 0m0.431s
user 0m0.032s
sys 0m0.020s

Using rubygems instead of explicit paths to rack and sinatra takes it
to over 1.1 seconds.

Regards,

Brian.

candlerb

Apr 9, 2009, 3:07:36 PM
to Rack Development
> When you use CGI + mod_rewrite, the SERVER_NAME will be /cgi.rb

I was pretty sure that under Apache, SERVER_NAME is the configured
virtual host name?

Carl Lerche

Apr 9, 2009, 3:23:05 PM
to Rack Development
I checked with chris. It is legal for middleware to change SCRIPT_NAME
to something unrelated to the original request path. In those
situations, your code would not work.

Sam Roberts

Apr 9, 2009, 5:37:21 PM
to rack-...@googlegroups.com
On Thu, Apr 9, 2009 at 12:04 PM, candlerb <b.ca...@pobox.com> wrote:
>
>> > I wrote a simple link helper like this:
>> ...
>> > path = env["SCRIPT_NAME"] + "/" + args.map { |a| escape(a) }.join
>>
>> SCRIPT_NAME isn't what you want when deploying under CGI
>
> Isn't it? Why not? It seems to work for me under Apache 2.2.8 (stock
> from Ubuntu Hardy):

Try with mod_rewrite to hide the cgi-bin/script.rb

Magnus Holm just described the behaviour under those conditions, also
see my original code.

> $ time curl http://localhost/cgi-bin/env.rb/foo/bar
> SCRIPT_NAME="/cgi-bin/env.rb"; PATH_INFO="/foo/bar"
> real 0m0.431s
> user 0m0.032s
> sys 0m0.020s
>
> Using rubygems instead of explicit paths to rack and sinatra takes it
> to over 1.1 seconds.

Weird, my garbage rack playground (which is currently dying and
triggering the exception middleware) uses gems, and returns in 1/10 of
a second. I'm in Vancouver, webfaction is in Texas, AFAIK.

time curl http://hello.octetcloud.com/ > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 51720  100 51720    0     0  76770      0 --:--:-- --:--:-- --:--:--  146k
curl http://hello.octetcloud.com/ > /dev/null  0.01s user 0.00s system 1% cpu 0.681 total

Cheers,
Sam

Christian Neukirchen

Apr 10, 2009, 7:20:20 AM
to rack-...@googlegroups.com
Sam Roberts <vieu...@gmail.com> writes:

> SCRIPT_NAME isn't what you want when deploying under CGI, and you
> shouldn't have to know that, or test with every rack adapter.

When you use mod_rewrite or something, *you* need to ensure that
SCRIPT_NAME is the part before PATH_INFO when the app finally gets
called, e.g. with an overriding middleware, or something like LighttpdFix.
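
A minimal sketch of such an overriding middleware, assuming the
deployer knows the public mount point and configures it by hand
(ForceScriptName is a made-up name, and /blog is just the example path
used earlier in the thread):

```ruby
# Hypothetical middleware that replaces whatever SCRIPT_NAME the
# server reported with an explicitly configured mount point.
class ForceScriptName
  def initialize(app, mount_point)
    @app = app
    @mount_point = mount_point.chomp("/")
  end

  def call(env)
    @app.call(env.merge("SCRIPT_NAME" => @mount_point))
  end
end

# An app that builds a link from SCRIPT_NAME, like the helper above.
link_app = ->(env) { [200, {}, [env["SCRIPT_NAME"] + "/files/new"]] }

stack = ForceScriptName.new(link_app, "/blog/")
# What CGI + mod_rewrite might hand over for a request to /blog/whatever:
_status, _headers, body = stack.call({ "SCRIPT_NAME" => "/cgi.rb", "PATH_INFO" => "/whatever" })
```

With this in place the app generates /blog/files/new rather than
/cgi.rb/files/new, at the cost of one extra piece of configuration.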

Hongli Lai

Apr 19, 2009, 5:24:45 AM
to Rack Development
On Apr 7, 9:34 pm, Christian Neukirchen <chneukirc...@gmail.com>
wrote:
> Carl Lerche <carl.ler...@gmail.com> writes:
> > There is no current rack spec describing what REQUEST_URI should be.
>
> Trouble starts in that there is absolutely no specification on what
> REQUEST_URI even means.  Try four different webservers, get four
> different results.
>
> I think a not-to-be-changed copy of the entire request path could be
> useful, but let's not call it REQUEST_URI.

What are the current possible definitions for REQUEST_URI? I thought
REQUEST_URI is simply the entire URI that comes after the host name,
but without the '?' and the query string?

Tim Carey-Smith

Apr 19, 2009, 6:04:45 AM
to rack-...@googlegroups.com

REQUEST_URI = the whole part after the hostname
REQUEST_PATH = PATH_INFO = everything up to the first "?"


check out http://zelhory.schkocr.cz/phpinfo.php?foo=bar

Also, check out the URI syntax at http://www.w3.org/Protocols/rfc1945/rfc1945
Section 3.2.1.

relativeURI = net_path | abs_path | rel_path
rel_path = [ path ] [ ";" params ] [ "?" query ]

And the thin parser shows how the parts are added to the ENV.
http://github.com/macournoyer/thin/blob/master/ext/thin_parser/common.rl

rel_path = ( path? %request_path (";" params)? ) ("?" %start_query query)?;

Ciao,
Tim

Christian Neukirchen

Apr 19, 2009, 10:14:46 AM
to rack-...@googlegroups.com
Tim Carey-Smith <g...@spork.in> writes:

> REQUEST_URI = the whole part after the hostname

Which would not be a URI at all.

Please let's not use REQUEST_URI for anything, it is severely
unspecified. Cook up a new name for what it should be.

Scytrin dai Kinthra

Apr 19, 2009, 4:08:22 PM
to rack-...@googlegroups.com
There's no definition of REQUEST_URI in any of the specifications I've
found, but it typically seems to be the portion of the URI after the
hostname, inclusive of the leading slash. This is nicely perceivable
as the absolute path within the realm of the http host. Running with
that, SCRIPT_NAME and PATH_INFO are exclusive but fully encompassing
substrings of the part of REQUEST_URI before the '?', providing a
good guide on where the application sits within the hierarchy on the
httpd service.

I would have no problem with placing an immutable definition of the
script-URI (as in the context of the CGI 1.1 spec) in some rack
specific variable such as 'rack.request_uri'. It would be awesome if
this was an instance of URI; then we could simply track PATH_INFO and
SCRIPT_NAME as being within env['rack.request_uri'].path, and
REQUEST_URI could be left to vary willy-nilly.

I am opposed to altering the definition of these meta-variables away
from the CGI spec. It's already a bit mangled in terms of the rack
routing with middlewares like URLMap, but it still provides a nice
definition: SCRIPT_NAME is something I want to keep in any URLs that
send requests back to this specific application, and PATH_INFO
is malleable within my application or provides additional information.

--
stadik.net

Tim Carey-Smith

Apr 19, 2009, 6:04:01 PM
to rack-...@googlegroups.com
On 20/04/2009, at 2:14 AM, Christian Neukirchen wrote:

> Tim Carey-Smith <g...@spork.in> writes:
>
>> REQUEST_URI = the whole part after the hostname
>
> Which would not be a URI at all.
>
> Please let's not use REQUEST_URI for anything, it is severely
> unspecified. Cook up a new name for what it should be.

I believe the place where someone might have decided REQUEST_URI was
useful was RFC 1945 [1], and you are correct that it isn't really a URI.

In this instance, the Request-URI (section 5.1.2) is describing either
the "abs_path" (/foo/bar;params?query=string) or the "absoluteURI" for
proxy requests.

It does seem to have some ambiguity in what it specifies, if indeed it
specifies anything!

It seems that most HTTP parsers extract the parts of the "Request-Line"
into their own values, and that itself is reasonably standard (aside
from nginx-passenger :).

Request-Line = HTTP_METHOD " " PATH_INFO QUERY_STRING " " HTTP_VERSION "\r\n"
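
As a rough sketch of that extraction (this is not how thin or any
particular parser does it; parse_request_line and the key names in the
returned hash are choices made for the example), splitting a
Request-Line on spaces and then on the first "?" gives the pieces:

```ruby
# Hypothetical helper: split an HTTP/1.0 Request-Line into its parts.
def parse_request_line(line)
  method, target, version = line.chomp.split(" ", 3)
  path, query = target.split("?", 2) # everything up to the first "?"
  {
    "REQUEST_METHOD" => method,
    "REQUEST_PATH"   => path,
    "QUERY_STRING"   => query.to_s,  # empty string when there is no "?"
    "HTTP_VERSION"   => version
  }
end

parts = parse_request_line("GET /foo/bar;params?query=string HTTP/1.0\r\n")
# parts["REQUEST_PATH"] is "/foo/bar;params"
# parts["QUERY_STRING"] is "query=string"
```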

I think I'm verging on mindless rambling now,
Tim

[1] http://www.ietf.org/rfc/rfc1945.txt
