Enabling Unicode characters in URL


Kenneth

May 3, 2016, 10:10:39 PM
to web2py-users
Hello there,

I'd like to create URLs containing Unicode characters, but I get an "Invalid Request" error when I do. I tried reconfiguring the Nginx side but couldn't get it to work properly.

For example, this curl call will output "Invalid Request".

curl http://localhost:8080/与

I'd be grateful for a way to detect and read URLs containing Unicode characters.

My web2py server is currently running behind Nginx with WSGI.

Cheers,


Leonel Câmara

May 3, 2016, 10:15:36 PM
to web2py-users
Unicode characters are not valid in URLs; you need to percent-encode them.
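
As a rough illustration of what percent-encoding means here, the following sketch uses Python 3's standard urllib.parse module (the thread dates from Python 2, where the equivalent functions live in urllib). The character is the one from the curl example earlier in the thread; the variable names are only for illustration.

```python
from urllib.parse import quote, unquote

# Percent-encode the UTF-8 bytes of a non-ASCII path segment.
segment = "与"
encoded = quote(segment)
url = "http://localhost:8080/" + encoded

# Decoding reverses the process and recovers the original character.
decoded = unquote(encoded)

print(url)
print(decoded)
```

A client such as curl would need to be given the encoded form of the URL rather than the raw character.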

Kenneth

May 3, 2016, 10:58:59 PM
to web2py-users
Hi Leonel,

Thank you for your quick reply.

If that is the case, is it possible to percent-encode a Unicode URL right after it hits the server, so that the request can then be redirected to the properly encoded URL?

Best,

Leonel Câmara

May 4, 2016, 7:55:53 AM
to web2py-users
I don't think that's possible; they need to arrive pre-encoded. Don't worry, the browser will show the Unicode characters in the address bar and not those ugly percent signs.

Niphlod

May 4, 2016, 8:09:22 AM
to web2py-users
BTW: you NEED to get your hands dirty with routes.py, because a percent-encoded path segment won't ever be mapped to a function or a controller.

Kenneth

May 4, 2016, 12:26:58 PM
to web2py-users
Hi Niphlod,

Do I need to configure "routes.py" to accept a percent-encoded function name? Is "args_match" the setting I need to configure?

I just tried passing the encoded string "%ED%95%9C%EA%B8%80" as an argument, and I am getting "Invalid request".

Thank you.

Leonel Câmara

May 4, 2016, 2:31:17 PM
to web2py-users
For functions you do, especially since in Python 2 identifiers must be ASCII. If percent-encoded args don't work, then that's actually a bug or missing feature in web2py, which doesn't unquote args. Please file an issue on GitHub so the devs get properly motivated to close it. In the meantime you will have to pass the values through request.vars.
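
A minimal sketch of the query-string workaround, assuming Python 3 and only the standard library. The parameter name q is hypothetical, and the value is the decoded form of the string quoted earlier in the thread. Inside a web2py controller the value would be read from request.vars instead of being parsed by hand as it is here.

```python
from urllib.parse import urlencode, parse_qs

# Build a query string that carries the non-ASCII value as a variable
# instead of as a path argument. "q" is a made-up parameter name.
query = urlencode({"q": "한글"})

# Parsing the query string percent-decodes the value again.
value = parse_qs(query)["q"][0]

print(query)
print(value)
```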

Kenneth

May 5, 2016, 9:02:02 PM
to web2py-users
Just filed an issue. Thank you, Leonel.

Is there any interim solution to this problem? 

Niphlod

May 6, 2016, 3:15:38 AM
to web2py-users
You need to think it through.
web2py defaults to /a/c/f, with a being your app, c the name of a controller file, and f a valid Python function identifier.

If you want to support /whatever, you NEED to use routes.py.

A simple

routes_in = (
    ('/welcome/static/$anything', '/welcome/static/$anything'),
    ('/(?P<any>.*)', '/welcome/default/index/\g<any>'),
)
routes_out = (
    ('/welcome/static/$anything', '/welcome/static/$anything'),
    ('/welcome/default/index/(?P<any>.*)', '/\g<any>'),
)

will route / to /welcome/default/index/. request.args won't be usable because of the aforementioned "bug" (or lack of feature), but request.raw_args will be there for you to parse as you wish.
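
One way to do that parsing, as a sketch only: it assumes request.raw_args is a slash-separated string of still-encoded path pieces, and the helper name parse_raw_args is made up for illustration. It is written for Python 3; under Python 2 the unquote function comes from urllib and returns byte strings.

```python
from urllib.parse import unquote

def parse_raw_args(raw_args):
    """Split a raw args string on '/' and percent-decode each piece."""
    if not raw_args:
        return []
    # Split first and decode second, so an encoded slash (%2F) inside
    # one argument is not mistaken for a separator.
    return [unquote(part) for part in raw_args.split("/") if part]

# In a controller this would be parse_raw_args(request.raw_args).
args = parse_raw_args("%ED%95%9C%EA%B8%80/page/2")
print(args)
```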

Kenneth

May 6, 2016, 9:20:11 PM
to web2py-users
Thank you, Niphlod.

I racked my brain over this all day today, and here is what I found:

- I was using parameter-based routes and didn't know that I had to use pattern-based routes to use routes_in and routes_out. The two routing styles can't be mixed.
- I couldn't activate request.raw_args with my parameter-based routes file.
- I couldn't figure out how to set up languages for pattern-based routes.

For example, I have this snippet in my parameter-based routes file:
routers = {
    app: dict(
        default_language = 'en',
        languages =['ko']
    )
}

Is it possible to use request.raw_args with a parameter-based routes file?

For now, I will just pause on this issue, since I don't think I'd be able to reconfigure all my apps to work properly with pattern-based routes. :P

Thank you for the great replies, as always, Niphlod.

Kenneth

May 6, 2016, 9:47:17 PM
to web2py-users
Quick update

By adding "args_match =r'([ㄱ-ㅣ가-힣\w@ -_=])" under "BASE = dict(", I was able to read all Unicode characters.

I hope this regex is safe to use.
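
A quick way to see what that character class accepts, as a hedged sketch in Python 3. It tests only the bracketed class on its own, on the assumption that the posted pattern is missing nothing but its closing quote. Note also that \w is Unicode-aware for Python 3 str patterns, which differs from Python 2 byte strings.

```python
import re

# The bracketed class from the post, tested in isolation (assumption:
# the posted pattern lacks only its closing quote).
char_class = re.compile(r"[ㄱ-ㅣ가-힣\w@ -_=]")

# " -_" is read as a range from space (0x20) to underscore (0x5F), so it
# admits much more punctuation than a literal space, hyphen and underscore.
probe = '/<>"%.~{'
accepted = [ch for ch in probe if char_class.fullmatch(ch)]
print(accepted)
```

If only a literal hyphen was intended, placing it last in the class or escaping it avoids the range.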

