[bitly-api] Bitly/Twitter short url character set

Alexander Sicular

Apr 19, 2010, 11:29:30 AM
to bitl...@googlegroups.com
Hi All,

My question is: what is the allowable character set for bitly and the
soon-to-be Twitter short URLs? Is it simply [0-9A-Za-z], which would
make it base62 and not a true base64? Are there characters I'm not
including? Will Twitter's set match bitly's character set? Please
share links if you've got 'em!

Thank you, Alexander

(cross post on the twitter dev talk group)

--
You are subscribed to the bit.ly API discussion group.
To post, email to bitl...@googlegroups.com
For more options, visit http://groups.google.com/group/bitly-api

Jehiah Czebotar

Apr 19, 2010, 12:08:25 PM
to bitl...@googlegroups.com
Alexander, can you be a little more specific about what you are
looking for? Are you asking about valid characters for bit.ly short
links (e.g. http://bit.ly/1234) or links in general that appear on
Twitter (short or otherwise)?

bit.ly links could include Unicode characters in IDN domains (for
domains other than bit.ly which we power). While the path component
(the part following the domain) currently contains primarily
[-_a-zA-Z0-9]+, you should look for any properly formed links, as you
will often see links which accidentally include periods or trailing
parentheses at the end (and there is always the possibility that we
will expand the valid character set in the future).
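
As a rough sketch of that advice (not bit.ly's own code), the Python below
pulls anything URL-shaped out of a message, trims trailing punctuation, and
only then checks whether the path happens to fit today's character set. The
names extract_links, uses_current_charset and TRAILING_JUNK are made up for
illustration, and the list of punctuation to trim is an assumption.

```python
import re
from urllib.parse import urlsplit

# Anything that looks like an http(s) URL up to the next whitespace.
URL_CANDIDATE = re.compile(r"https?://\S+")

# Assumption: punctuation that commonly gets glued onto the end of a link.
TRAILING_JUNK = ".,;:!?)]}'\""

# The path characters mentioned above; treat this as "current", not "final".
CURRENT_PATH_CHARS = re.compile(r"[-_a-zA-Z0-9]+")


def extract_links(text):
    """Return well-formed links from text with trailing punctuation removed."""
    links = []
    for match in URL_CANDIDATE.finditer(text):
        url = match.group(0).rstrip(TRAILING_JUNK)
        # Keep the link only if it still has a host after trimming.
        if urlsplit(url).netloc:
            links.append(url)
    return links


def uses_current_charset(url):
    """True if the path (without the leading slash) fits [-_a-zA-Z0-9]+."""
    path = urlsplit(url).path.lstrip("/")
    return CURRENT_PATH_CHARS.fullmatch(path) is not None


if __name__ == "__main__":
    message = "Try this (http://bit.ly/1234). Or this one: http://bit.ly/a_B-9, maybe."
    for link in extract_links(message):
        print(link, uses_current_charset(link))
```

Trimming a closing parenthesis will damage the occasional long URL that
really does end in one, so the character-set check is better used as a hint
than as a filter.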

http://www.faqs.org/rfcs/rfc1738.html
http://www.faqs.org/rfcs/rfc2396.html

(for reference Alexander's cross-post is here
http://groups.google.com/group/twitter-development-talk/browse_thread/thread/a2a9da245a5608e2)

--
Jehiah

siculars

Apr 19, 2010, 12:16:46 PM
to bitly API
Thank you, Jehiah. That's exactly what I was looking for. So it looks
like you are using a base64 set with modified characters, as described
at http://en.wikipedia.org/wiki/Base64#URL_applications, a.k.a.
'base64url' encoding. -Alexander

Jehiah Czebotar

Apr 19, 2010, 12:25:31 PM
to bitl...@googlegroups.com
While we do use 64 different primary characters in our links, the path
is not a reversible base64 encoding.
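
To make the distinction concrete, here is a small Python sketch comparing
the base62 set from the original question with the 64-character set
[-_a-zA-Z0-9]. It only reasons about alphabets and path counts. The names
BASE62, LINK_CHARS, in_link_alphabet and keyspace are illustrative, and
nothing here reflects how short paths are actually generated.

```python
import string

# Alphabets discussed in this thread.
BASE62 = set(string.digits + string.ascii_letters)  # [0-9A-Za-z]
LINK_CHARS = BASE62 | {"-", "_"}                    # [-_a-zA-Z0-9]


def in_link_alphabet(path):
    """True if a non-empty path uses only the 64 primary characters."""
    return bool(path) and set(path) <= LINK_CHARS


def keyspace(alphabet_size, length):
    """Number of distinct paths of exactly `length` characters."""
    return alphabet_size ** length


if __name__ == "__main__":
    print(len(BASE62), len(LINK_CHARS))
    print(in_link_alphabet("a_B-9"), in_link_alphabet("12.34"))
    print(keyspace(62, 6), keyspace(64, 6))
```

Sharing an alphabet with base64url says nothing about the payload: the path
is best treated as an opaque key to look up, not as something to decode
back into the long URL.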

--
Jehiah