Exposing database Id via URL

140 views
Skip to first unread message

Krishgy

unread,
Jul 3, 2008, 10:37:27 PM7/3/08
to pylons-discuss
Hi All,

Currently we expose the database tables primary key value in the URL.

For example, to display the User profile, I use www.example.com/profile/view/12345
where profile is my controller and view controller function and 12345
is actually user id (table: profile, column: uid).

Is this recommended approach? How secure is this?

Regards,

Kriah

Jonathan Vanasco

unread,
Jul 4, 2008, 1:48:14 AM7/4/08
to pylons-discuss
i'll often let people see the ids of objects they manage , but when it
comes to public facing info i use hexids that are unique across the
app.

Mike Orr

unread,
Jul 4, 2008, 1:18:33 PM7/4/08
to pylons-...@googlegroups.com

There are two issues. One, does it contain special characters
(HTML/url markup, filesystem separator, etc? Most IDs are numeric or
restricted to Python "identifier" characters ([A-Za-z_-]), so while
you're checking it for validity you're simultaneously guaranteeing it
doesn't contain troublesome characters. Numeric IDs are especially
easy: 'if not id.isdigit(): abort(404. "Invalid ID")'.

The other issue is, does it matter if the public sees the ID? That
depends on the application. In two apps I have running, there's
nothing secret about the IDs; we don't mind if the user types the URL
for convenience rather than going through all the screens and clicks.
The only reason we don't promote it is users would get confused
("what's this meaningless number for? does it have meaning outside
your agency?") If the IDs were user-specific, such as bank account
numbers, then there may be a reason to disguise them.

--
Mike Orr <slugg...@gmail.com>

Jonathan Vanasco

unread,
Jul 4, 2008, 2:01:03 PM7/4/08
to pylons-discuss
just some points on 'hiding' ids-

- if you're doing a social media site, with numeric ids your
competitors and the annoying industry blogs will be judging and
guaging your popularity and success by sequence ids

- by using the ids, you're good on a pylons app... but lets say you
need to offload something onto php or another system accesses the same
database -- one that is not hardened against sql injection attacks.
you have now exposed your ids - which are fkeys and indexes - to the
public through pylons and have a vulnerability elsewhere. the
security risk might not be in pylons, but you've opened the door for
abuse on your db through other apps.

our practice has needed us to ensure security to clients, and i'm sick
of reading bloggers judging the success of sites based on sequence
numbers and not on the spirit and activity of the active members. so
we hide that, and in all companies i consult to, i insist that they
hide numeric ids on everything.

jerry

unread,
Jul 4, 2008, 8:58:06 PM7/4/08
to pylons-discuss
But how? What encryption/hashing method could be used to transform the
numeric IDs to something less obvious?

Sincerely,
Jerry

Jonathan Vanasco

unread,
Jul 5, 2008, 1:57:20 PM7/5/08
to pylons-discuss

On Jul 4, 8:58 pm, jerry <jerryji1...@gmail.com> wrote:
> But how? What encryption/hashing method could be used to transform the
> numeric IDs to something less obvious?

all my apps have somthing like this in the db:

table hexkey_types
id , len , name
---
1, 10, useraccount:hex_id
2, 6, group:hex_id
3, 6, assetname:hex_id

table hexkey
hex_id primary, hexkey_type (fkey on hexkey_type(id))

table useraccount
id , hex_id fkey on hexkey10


i just create random md5s whenever i make an entry into the db.

when someone requests i record, i validate to make sure the hexid
matches for the function ( are we requesting a 10char hex for a user?
if so check memcached with db failover. if not , then don't
bother ). storing a hex-to-numeric mapping in memcached makes it
really low-impact. and no one knows what my internal mappings are.

jerry

unread,
Jul 5, 2008, 4:06:11 PM7/5/08
to pylons-discuss
Thanks Jonthan.

However, I wonder how an md5 string can be squeezed into a 10, or even
6-character field with no concern of (future) collision -- or am I mis-
understanding your db schema?

Sincerely,
Jerry

Jonathan Vanasco

unread,
Jul 5, 2008, 6:23:47 PM7/5/08
to pylons-discuss
On Jul 5, 4:06 pm, jerry <jerryji1...@gmail.com> wrote:
> However, I wonder how an md5 string can be squeezed into a 10, or even
> 6-character field with no concern of (future) collision -- or am I mis-
> understanding your db schema?

You're misunderstanding the concept.

1. md5(random+time) to get a random string with good dispersion.
2. Then truncate to 6/8/10/however many chars I need
3. check for collision, if found goto 1
4. insert into db ( in unique field )

there's no need to reverse mapping or use the md5 as a checksum. its
just a way to find random numbers.

the strings are just unique ids, they don't have to map backwards and
forwards. md5 is just a very efficient way to create a reasonably
unique id, and then you can do a quick check for collision.

in terms of large-scale sites, i believe both bebo and imeem both do
this with 6char ids

Yoan Blanc

unread,
Jul 6, 2008, 10:35:28 AM7/6/08
to pylons-...@googlegroups.com
It doesn't look very safe to me.

if you don't want to expose your id and create fake ids.

random.seed(id + SALT)

string_id ="%x" % random.randint(0, 0xffffffff)

and now you have an 8 chars id, with very few risks of collison. Thank
to Mersenne Twister a pretty good pseudo-random number generator
(http://docs.python.org/lib/module-random.html)

Then hex encoding waste a lot of place, because it's a very small alphabet.

0123456789abcdef

you can use a bigger one like base32, base 64 (see wikipedia and
replace "/" by something else like "_") or your own alphabet:

>>> import encode # function I've written, pretty trivial: while value > 0: value, rest = divmod(value, len(alphabet)); ...
>>> import random
>>> random.seed(1)
>>> x = random.randint(0, 0xffffffff)
>>> x
577090034
>>> "%x" % x
'2265b1f2'
>>> encode.encode(x, encode.BASE32)
'h6bcfi'
>>> encode.encode(x, encode.BASE64)
'ypr7O'

hex, base16 -> 8
base32 -> 7
base64 -> 6
...

I would say: avoid using MD5 as a random string generator because it
hasn't been created for that purpose.

Does anyone see any flaws with this, apart that the max is 2**32
items, and it's always bad to have a ceiling limit.

Cheers,

-- Yoan

Shannon -jj Behrens

unread,
Jul 7, 2008, 8:19:29 PM7/7/08
to pylons-...@googlegroups.com
> I would say: avoid using MD5 as a random string generator because it
> hasn't been created for that purpose.

If you want universally unique IDs and you have Python 2.5, see
http://docs.python.org/lib/module-uuid.html.

For my situation, normal autoincremented IDs are fine.

Happy Hacking!
-jj

--
It's a walled garden, but the flowers sure are lovely!
http://jjinux.blogspot.com/

Reply all
Reply to author
Forward
0 new messages