Setting Xerxes Language by Language HTTP Request Header

Luke O'Sullivan

Aug 1, 2012, 9:41:33 AM
to xerxes...@googlegroups.com
Hi Folks,

I've had a request to set the language used by Xerxes based on any values set by the user's browser.

I plan to check for the Accept-Language header on the user's first visit to Xerxes. Does Xerxes have a session handler similar to the registry handler or will I have to create one from scratch / use cookies?

Thanks,

Luke

Walker, David

Aug 1, 2012, 9:46:47 AM
to xerxes...@googlegroups.com

Xerxes_Framework_Request has methods for session info, Luke.

  http://code.google.com/p/xerxes-portal/source/browse/trunk/lib/framework/Request.php#589

It's little more than a wrapper around $_SESSION.

We're currently setting the language based on the language param in the URL, so you might update the code here to fall back to the session or HTTP header:

  http://code.google.com/p/xerxes-portal/source/browse/trunk/lib/framework/FrontController.php#134

--Dave

-------------------------

David Walker

Interim Director, Systemwide Digital Library Services

California State University

562-355-4845

--
You received this message because you are subscribed to the Google Groups "xerxes-portal" group.
To view this discussion on the web visit https://groups.google.com/d/msg/xerxes-portal/-/ELK-PzVcFw8J.
To post to this group, send email to xerxes...@googlegroups.com.
To unsubscribe from this group, send email to xerxes-porta...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/xerxes-portal?hl=en.

Jonathan Rochkind

Aug 1, 2012, 10:15:58 AM
to xerxes...@googlegroups.com, Walker, David
I'd suggest that the language param in the URL should override the HTTP
Accept-Language header if both are present.

Luke O'Sullivan

Aug 1, 2012, 11:44:22 AM
to xerxes...@googlegroups.com
Thanks both :)

Luke O'Sullivan

Aug 1, 2012, 11:48:11 AM
to xerxes...@googlegroups.com, Walker, David
The only problem I can see with that is if a user happens to be using a browser over which they have little or no control.
For example, if I were on holiday in Greece and (for some unknown reason) wanted to check Xerxes, I would be stuck in Greek even if I wanted English...

Cheers,

Luke


Jonathan Rochkind

Aug 1, 2012, 12:06:50 PM
to xerxes...@googlegroups.com, Luke O'Sullivan, Walker, David
That's why the language param in the URL should override the HTTP
Accept-Language header.

Of course, the standard Xerxes UI doesn't provide any actual links to
change language, so there is no way for the user to know to add the
language param to the URL or to select a language preference other than
the default.

But if the URL param overrides the HTTP Accept-Language header, it would
be possible to create such a UI if it were actually necessary. (I suspect
it will not be a real-world problem, but it is important that the design
allows for it in case it is.)

Jonathan Rochkind

Aug 1, 2012, 12:11:05 PM
to xerxes...@googlegroups.com, Luke O'Sullivan, Walker, David
Also, you mentioned session: why would you store the findings from the
HTTP Accept-Language header in the session? There should be no need for
that, as every single request carries the header; just look at the
headers for the current request when processing the current request.
Storing the value in the session instead could lead to weird,
hard-to-debug bugs (say, if a browser changes its language preferences
mid-session), and there's no reason for it.

Luke O'Sullivan

Aug 2, 2012, 6:52:02 AM
to xerxes...@googlegroups.com, Luke O'Sullivan, Walker, David
Hi Jonathan,

Sorry - I read your suggestion the wrong way round - I thought you were saying the Accept-Language header should take precedence. I'd also just performed a similar task which sets the language using cookies rather than checking for a URL parameter, so I was getting the two mixed up in my head.

So the process will be (with no need for persistent / session storage):

1) Check for URL param
2) Check Accept-Language header
3) Apply default
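
The order above could be sketched roughly as follows. This is only an illustration, not the Xerxes code: resolveLanguage() and the $headerLookup callback are hypothetical names, and "lang" is assumed here as the name of the URL parameter.

```php
<?php
// Hypothetical helper: picks a language code using the three-step order.
//   $query        - URL parameters (e.g. $_GET)
//   $server       - server variables (e.g. $_SERVER)
//   $headerLookup - callable that maps an Accept-Language header value to
//                   a Xerxes language code, or returns null if none matches
//   $default      - the configured default language code
function resolveLanguage(array $query, array $server, $headerLookup, $default)
{
    // 1) an explicit URL parameter always wins
    if (!empty($query['lang'])) {
        return $query['lang'];
    }

    // 2) otherwise look at the header sent with this request
    if (!empty($server['HTTP_ACCEPT_LANGUAGE'])) {
        $fromHeader = call_user_func($headerLookup, $server['HTTP_ACCEPT_LANGUAGE']);
        if ($fromHeader !== null) {
            return $fromHeader;
        }
    }

    // 3) fall back to the default
    return $default;
}
```

Because the header is read from the current request each time, nothing needs to be kept in the session.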

One issue I'm aware of is that there may be a difference between the language standards used by HTTP headers and by Xerxes. I believe the headers use the IETF language tag, which is based on ISO 639-1 and ISO 3166‑1, whereas Xerxes uses ISO 639-2/B. For example, Xerxes uses "eng" for English where the headers are likely to be en, en-us or en-gb. The same also applies to Welsh, which is "wel" for Xerxes but cy for headers. Do you know of any mechanism that can be used to convert between the two?

Thanks,

Luke

helix84

Aug 2, 2012, 7:07:58 AM
to xerxes...@googlegroups.com, Luke O'Sullivan, Walker, David
On Thu, Aug 2, 2012 at 12:52 PM, Luke O'Sullivan
<datavoy...@gmail.com> wrote:
> One issue I'm aware of is that there may be a difference between the
> language standards used by HTTP headers and by Xerxes. I believe the headers
> use the IETF language tag, which is based on ISO 639-1 and ISO 3166‑1, whereas
> Xerxes uses ISO 639-2/B. For example, Xerxes uses "eng" for English where the
> headers are likely to be en, en-us or en-gb. The same also applies to Welsh,
> which is "wel" for Xerxes but cy for headers. Do you know of any mechanism
> that can be used to convert between the two?

Hi Luke,

This is actually not derived from the 3-letter ISO 639-2/B language
code, but from the locale attribute associated with it in the config
file. Take a look at the rfc1766 variable in includes.xsl. So this
shouldn't be a problem.

Regards,
~~helix84

Luke O'Sullivan

Aug 2, 2012, 1:02:15 PM
to xerxes...@googlegroups.com, Luke O'Sullivan, Walker, David, hel...@centrum.sk
Hi Helix,

Thanks for the heads up - I have no experience with language codes etc.

The locale I have for Welsh is cy_GB.utf8, but the HTTP headers will be either cy or cy-gb. I can convert underscores to dashes, remove the .utf8 bit and lowercase the string, but I'm not sure if that will be consistent enough across all languages.
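
The string conversion described above might look something like this sketch. localeToLanguageTag() is a hypothetical helper, not part of Xerxes, and it assumes locale names of the form language_TERRITORY.codeset.

```php
<?php
// Hypothetical helper: turns a system locale name such as "cy_GB.utf8"
// into a lowercase tag that can be compared with Accept-Language values.
function localeToLanguageTag($locale)
{
    // drop the codeset (".utf8") and any modifier ("@euro")
    $tag = preg_replace('/[.@].*$/', '', $locale);

    // underscores become dashes, and the comparison form is lowercase
    return strtolower(str_replace('_', '-', $tag));
}
```

Header values would need to be lowercased in the same way before comparing, since language tags are case-insensitive.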

While I'm on the topic, I have

    if ($code == "wel") {
        return "Cymraeg";
    }

at the top of the getNameFromCode function in languages.php to ensure that "Cymraeg" is displayed rather than "Welsh". Is there a "proper" way to do this?

Thanks,

Luke

helix84

Aug 2, 2012, 1:27:30 PM
to Luke O'Sullivan, xerxes...@googlegroups.com, Walker, David
On Thu, Aug 2, 2012 at 7:02 PM, Luke O'Sullivan
<datavoy...@gmail.com> wrote:
> The locale I have for Welsh is cy_GB.utf8, but the HTTP headers will be either
> cy or cy-gb. I can convert underscores to dashes, remove the .utf8 bit and
> lowercase the string, but I'm not sure if that will be consistent enough
> across all languages.

I already prepared the code in includes.xsl; see the commented-out
section in the definition of the rfc1766 variable. I just didn't think it
was necessary.

It should be consistent. If you want to verify, take a look at RFC
1766 and the two RFCs superseding it and also at what standards Linux
uses for locales (you won't find a normative standard), but I'm sure I
already checked it when I wrote the code.

> if ($code == "wel") {
> return "Cymraeg";
> }
>
> at the top of the getNameFromCode function in languages.php in order to
> ensure that "Cymraeg" is displayed rather than Welsh?

Well, yes, that's a simple workaround, it should work.

> Is there a "proper" way to do this?

I designed it to work with any language, so it looks up the localized
language names from the iso-codes package installed in the system
(works in Debian, Ubuntu etc). You will need to have the cy_GB.utf8
locale installed (which you already should have if Xerxes translation
works) and the string should be translated in the iso-codes package in
the target language. I know it sounds complicated, but this allows us
1) to drop the translations (which are rather large) from Xerxes and
2) to always have up-to-date translations from the system.

Regards,
~~helix84

Luke O'Sullivan

Aug 6, 2012, 7:35:18 AM
to xerxes...@googlegroups.com
Hi Folks,

I've added the following to Languages.php

        if ( $lang == null ) {
            // parse the header only once, then fall back to the default
            $headerLang = $this->getLanguageHeaders();
            $lang = ($headerLang != null)
                ? $headerLang
                : $objRegistry->defaultLanguage();
            $objRequest->setProperty("lang", $lang);
        }

and

    private function getLanguageHeaders() {

        $langs = array();

        // nothing to do if the browser sent no Accept-Language header
        if (empty($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
            return null;
        }

        // break up string into pieces (languages and q factors);
        // the q pattern also accepts "0" and "1.0" so that "q" itself
        // is never picked up as a language
        preg_match_all('/([a-z]{1,8}(-[a-z]{1,8})?)\s*(;\s*q\s*=\s*(1(?:\.0+)?|0(?:\.[0-9]+)?))?/i', $_SERVER['HTTP_ACCEPT_LANGUAGE'], $lang_parse);

        if (count($lang_parse[1])) {
            // create a list like "en" => 0.8
            $langs = array_combine($lang_parse[1], $lang_parse[4]);

            // set default to 1 for any without q factor
            foreach ($langs as $lang => $val) {
                if ($val === '') $langs[$lang] = 1;
            }

            // sort list based on value, highest preference first
            arsort($langs, SORT_NUMERIC);
        }

        if (count($langs) > 0) {

            $objRegistry = Xerxes_Framework_Registry::getInstance();
            $validLangs = $objRegistry->getConfig("languages");

            if ( $validLangs != null )
            {
                // First pass: exact match, e.g. "cy-gb" against "cy_GB.utf8"
                foreach ($langs as $headerLanguage => $rating ) {

                    $localeString = str_replace("-", "_", $headerLanguage) . ".utf8";

                    // Loop through Xerxes Languages
                    foreach ( $validLangs->language as $language )
                    {
                        $locale = (string) $language->attributes()->locale;

                        if ( strtolower($locale) == strtolower($localeString))
                        {
                            return (string) $language->attributes()->code;
                        }
                    }
                }

                // Second pass: match on the language part only, e.g. "cy"
                foreach ($langs as $headerLanguage => $rating ) {

                    $partialMatch = explode("-", $headerLanguage);
                    $partialMatch = strtolower($partialMatch[0]);

                    // Loop through Xerxes Languages
                    foreach ( $validLangs->language as $language )
                    {
                        $locale = (string) $language->attributes()->locale;

                        if ($locale == '') {
                            continue;
                        }

                        // compare against the language part of the locale
                        // ("cy" in "cy_GB.utf8") rather than searching the
                        // whole string, which could match the territory
                        $localeLang = strtolower(preg_replace('/[_.@].*$/', '', $locale));

                        if ($localeLang == $partialMatch)
                        {
                            return (string) $language->attributes()->code;
                        }
                    }
                }
            }
        }

        return null;
    }

Will this be of use to anyone else?

Cheers,

Luke

Luke O'Sullivan

Aug 13, 2012, 5:34:25 AM
to xerxes...@googlegroups.com
Hi Folks,

Here's a patch for setting the language using the language headers.

Cheers,

Luke
xerxesLanguage.patch