[Cherokee] Migration from Apache to Cherokee: URI tweaking

Anfré Littoz

May 1, 2013, 6:27:40 AM
to cher...@lists.octality.com
Hi,

I don't know if this is the proper place for such a question, so excuse me if it is considered "noise".

I'm presently testing Cherokee as an alternative to Apache for the LXR project (see http://lxr.sourceforge.net). As long as I use elementary LXR configuration features, it works fine.

But when I try to serve several databases with a single application instance (for short, "database" is the closest ordinary concept to an LXR "service"), a trick is played on the URI, and I can't figure out how to convert it.

LXR is driven by this kind of URI:

http://hostname/LXR_service_signature/DB_id/script_file/path_for_script?arguments

i.e. an argument-like element is interspersed inside the web path for the script. Under Apache, the AliasMatch directive strips off this information and simultaneously routes the request to an alternate document root. The important point is that the original URI is not changed and remains available, unaltered, for parsing by the script, which retrieves the DB_id from it.
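To make the URI layout concrete, here is a minimal Python sketch of how a script could pull the pieces out of the unaltered URI. It is an illustration only, not LXR's actual code: the function name split_lxr_uri and the fixed segment positions are assumptions based on the pattern shown above.

```python
from urllib.parse import urlsplit


def split_lxr_uri(uri, signature="LXR_service_signature"):
    """Split an LXR-style URI into (db_id, script, path_info, query).

    Hypothetical helper: assumes the layout
    /signature/DB_id/script_file/path_for_script?arguments
    """
    parts = urlsplit(uri)
    segments = parts.path.lstrip("/").split("/")
    if len(segments) < 3 or segments[0] != signature:
        raise ValueError(f"not an LXR URI: {uri!r}")
    db_id, script = segments[1], segments[2]
    # Whatever follows the script name is the path handed to the script.
    path_info = "/" + "/".join(segments[3:]) if len(segments) > 3 else ""
    return db_id, script, path_info, parts.query


uri = "http://hostname/LXR_service_signature/DB_id/script_file/path_for_script?arguments"
print(split_lxr_uri(uri))
# ('DB_id', 'script_file', '/path_for_script', 'arguments')
```

This only works if the script sees the original URI, which is exactly what AliasMatch preserves.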

In my conversion attempt, I used either a directory rule (on LXR_service_signature) or a regexp rule, both with a redirect handler to remove the DB_id and other non-path-related bits. Unfortunately, this rewrites the URI and defeats the script processing, which can no longer retrieve the DB_id.

Is there a means in Cherokee to launch a script whose command line is generated from the groups ($1, $2, ...) captured by the regexp-based rule, so that the URI is unaltered (i.e. the environment variables reflect the initial URI)?

ajl

Daniel Lo Nigro

May 1, 2013, 8:33:43 AM
to cherokee List
I haven't used Cherokee for a while, but this is known as a rewrite rule ("internal" redirect).


_______________________________________________
Cherokee mailing list
Cher...@lists.octality.com
http://lists.octality.com/listinfo/cherokee


Anfré Littoz

May 1, 2013, 8:57:09 AM
to cherokee List
Yes, this is a rewrite rule, and consequently it changes the URI.

What I need is a rule that can be flagged "final" and causes execution of a CGI script whose home directory is known (e.g. /cgi-bin) and whose name and possibly arguments are taken from the URI ($x substitutions), while the environment variables SERVER_NAME, SCRIPT_NAME, ... are set from the original URI.

From the documentation, I'm afraid this is not possible with Cherokee.

The alternative would be to remove DB_id from the "script web path" part, since putting it there was a bad design decision that mixes script path and argument, but this involves a major rewrite of LXR initialisation and configuration. It may be the wise direction, since it could greatly simplify integration with web servers and allow for more server candidates.

ajl


Daniel Lo Nigro

May 1, 2013, 9:03:46 AM
to Anfré Littoz, cherokee List
Hmm, I thought rewrite rules shouldn't change the URI if they're marked as "internal". Does LXR have instructions for Nginx or Lighttpd? If so, you should be able to convert those rules to Cherokee rules. Apache has so many configuration options that sometimes it's hard to find the exact match.

Anfré Littoz

May 1, 2013, 11:05:36 AM
to Daniel Lo Nigro, cherokee List
I tried both "internal" and "external" with the same effect. "external" has the debugging advantage of showing the result of the substitution.

Nginx and lighttpd are based on original concepts, different from Apache.

* lighttpd:
You regexp-match on the URL and declare that a given initial URL part maps to a given document root, e.g.:
     $HTTP["url"] =~ "^/LXR_signature/" {
            alias.url += ( "/LXR_signature/DB_id" => "common_LXR_root_directory" )
     }
and there is another directive to name the files which are scripts.

* nginx:
There is no notion of a "master" DocumentRoot and Alias as in Apache. Every URL can be individually diverted to its own root. Of course, in the simplest case this is equivalent to DocumentRoot, or to Cherokee's directory rule or default rule. Part of a URL is regexp-matched by a location directive, and you state what to do with the captured bits, without any rewrite, e.g.:
    server { ...
        location ~ ^/LXR_signature/[^/]+(.*)$ {
            alias /LXR_root_directory/$1 ;     # for ordinary files like .css or images
            location ~ ^(/LXR_signature/[^/]+/)(script_names) {
                set $virtroot $1;
                set $scriptname $2;
                alias /LXR_root_directory;
                include fastcgi.conf;
                fastcgi_param SCRIPT_FILENAME $document_root$scriptname;   # compute which script to launch
                fastcgi_param SCRIPT_NAME $virtroot$scriptname; # unaltered CGI variable
                fastcgi_pass unix://...;
            }
        }
    }

My idea was to mimic nginx's "action" block. Unfortunately, I could not figure out how to simultaneously "untweak" the initial URL part and launch the correct script. I had to break it into two separate rules. The first one identifies the LXR service and removes DB_id to provide an "ordinary" script path, but in doing so I lose DB_id. The second one is a common directory rule with a CGI handler, but since the URI has changed, the called script fails because it takes a segment of the web directory as the DB_id.
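The failure can be reproduced outside any web server. The following Python sketch is an illustration under assumptions, not Cherokee or LXR code: the regular expression stands in for the first rule, script_name is a placeholder, and db_id_before_script is a hypothetical helper that takes the segment just before the script name as the DB_id.

```python
import re

# Stand-in for the first rule: drop the segment that follows the signature.
REWRITE = re.compile(r"^(/LXR_signature)/[^/]+(/.*)$")


def rewrite(path):
    """Return the path with the DB_id segment removed."""
    return REWRITE.sub(r"\1\2", path)


def db_id_before_script(path, script="script_name"):
    """Return the path segment located just before the script name."""
    segments = path.strip("/").split("/")
    index = segments.index(script)
    if index < 1:
        raise ValueError("no segment before the script name")
    return segments[index - 1]


original = "/LXR_signature/DB_id/script_name/some/file.c"
rewritten = rewrite(original)

print(db_id_before_script(original))   # DB_id
print(rewritten)                       # /LXR_signature/script_name/some/file.c
print(db_id_before_script(rewritten))  # LXR_signature
```

Once the rewritten path is the only one the script can see, a segment of the web directory is mistaken for the DB_id.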

Anyway, considering the tweaks needed to port LXR to various web servers, I'm more and more convinced that putting the DB_id in the middle of the web directory name (to be exact, just before the script name) was a bad design choice (but ages ago, you had Apache, full stop). That information belongs in the script parameters, maybe as the root of PATH_INFO. Quite apart from the compatibility issue with existing LXR sites, redesigning this needs a lot of effort (first for a neat design, then in trying not to break the core code).
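As a rough idea of what "DB_id as the root of PATH_INFO" could look like on the script side, here is a minimal Python sketch. It is only an assumption about a possible redesign, not existing LXR behaviour; split_path_info and the sample environment values are hypothetical.

```python
def split_path_info(path_info):
    """Return (db_id, file_path) from a PATH_INFO such as '/DB_id/dir/file.c'."""
    db_id, _, rest = path_info.lstrip("/").partition("/")
    if not db_id:
        raise ValueError("PATH_INFO carries no DB_id")
    return db_id, "/" + rest


# Sample CGI environment (hypothetical values): the script path no longer
# contains the DB_id, so no server-side URI tweaking is needed.
environ = {
    "SCRIPT_NAME": "/LXR_signature/script_name",
    "PATH_INFO": "/DB_id/dir/file.c",
}

print(split_path_info(environ["PATH_INFO"]))
# ('DB_id', '/dir/file.c')
```

With this layout, any server that sets SCRIPT_NAME and PATH_INFO in the standard CGI way would pass the DB_id through untouched.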

So if you know of a Cherokee variable that holds the original URI (something like nginx's $request_uri), it could temporarily solve the problem.

Thanks for your answers and your patience.

