What is 140 characters?
Craig Hockenberry  
From: Craig Hockenberry <craig.hockenbe...@gmail.com>
Date: Fri, 6 Mar 2009 09:43:45 -0800 (PST)
Local: Fri, Mar 6 2009 12:43 pm
Subject: What is 140 characters?
Some discussion about this thread popped up on Twitter yesterday:

<http://groups.google.com/group/twitter-development-talk/browse_thread/thread/44be91d5ec5850fa>

Alex states that it's 140 bytes per tweet. So, of course, Loren
Brichter and I tried to prove that. With the following results:

1) 140 characters, including ones that are converted to HTML entities:
<http://twitter.com/gnitset/status/1286202252>

At the time of posting, this tweet showed up on the site and in feeds
with all 140 characters. After a few hours, the "<" was converted to
"&lt;", increasing the count per character from one to four bytes and
decreasing the tweet length from 140 characters to 69. (You can see
this truncation at the end of the tweet: the "&" is from "&lt;")

Presumably, this happens as tweets in the memcache are written through
to the backing store.

I also see a lot of Twitter clients that don't realize how special the
&lt; and &gt; entities are. It took me a LONG time to figure out what
was going on here.
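
A minimal Python sketch of the effect described above, assuming the stored form replaces only "<" and ">" with entities and is then cut at 140. The helper name and the sample tweet are hypothetical; this is not Twitter's actual code.

```python
def escape_angle_brackets(text: str) -> str:
    """Replace only < and > with entities (assumed storage behavior)."""
    return text.replace("<", "&lt;").replace(">", "&gt;")


# Hypothetical 140-character tweet: one letter followed by 139 "<".
tweet = "x" + "<" * 139
stored = escape_angle_brackets(tweet)

# Cutting the escaped form at 140 can leave a dangling entity fragment.
truncated = stored[:140]

print(len(tweet))      # 140
print(len(stored))     # 557
print(truncated[-3:])  # &lt
```

A client that wants to predict this would have to count each "<" or ">" as four characters instead of one.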

2) 140 Unicode _multi-byte_ characters:
<http://twitter.com/atebits/status/1286199010>

What's curious is that Loren's example with 140 characters uses the
Unicode 27A1 glyph. It uses 3 bytes in UTF-8. Why didn't it get
truncated? This seems to contradict Alex's statement in the thread
mentioned above.

As people start to use things like Emoji, tinyarro.ws and generally
figure out that Unicode (UTF-8) is a valid type of data on Twitter,
our clients should adapt and display more accurate "characters
remaining" counts. I can count bytes instead of characters, but I'm
not sure if I should or not.

No one likes a truncated tweet: we need an explicit statement on how
to count and submit multi-byte characters and entities.
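
To make the two candidate counts concrete, here is a small Python sketch of a "characters remaining" counter that can count either code points or UTF-8 bytes. The function name and the 140 default are assumptions for illustration; which mode matches the service is exactly the open question.

```python
def remaining(text: str, limit: int = 140, count_bytes: bool = False) -> int:
    """Return how much room is left, counting code points or UTF-8 bytes."""
    used = len(text.encode("utf-8")) if count_bytes else len(text)
    return limit - used


# U+27A1 is the glyph from the test tweet; it takes 3 bytes in UTF-8.
tweet = "\u27a1" * 140

print(len(tweet))                          # 140
print(len(tweet.encode("utf-8")))          # 420
print(remaining(tweet))                    # 0
print(remaining(tweet, count_bytes=True))  # -280
```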

-ch


 
Alex Payne
From: Alex Payne <a...@twitter.com>
Date: Fri, 6 Mar 2009 11:18:57 -0800
Local: Fri, Mar 6 2009 2:18 pm
Subject: Re: [twitter-dev] What is 140 characters?
I'm taking this email to our Service Team, the folks who work on the
back-end of the service. The whole "message body changing as it moves
from cache to backing store" thing is totally unacceptable. Answers
soon.

On Fri, Mar 6, 2009 at 09:43, Craig Hockenberry

--
Alex Payne - API Lead, Twitter, Inc.
http://twitter.com/al3x

 
Craig Hockenberry
From: Craig Hockenberry <craig.hockenbe...@gmail.com>
Date: Fri, 6 Mar 2009 11:38:05 -0800 (PST)
Local: Fri, Mar 6 2009 2:38 pm
Subject: Re: What is 140 characters?
This truncation as data moves throughout your system occurs in other
places. I've seen the same behavior when setting a user's location and
bio, for example.

-ch

On Mar 6, 11:18 am, Alex Payne <a...@twitter.com> wrote:


 
Matt Sanford
From: Matt Sanford <m...@twitter.com>
Date: Fri, 6 Mar 2009 11:43:37 -0800
Local: Fri, Mar 6 2009 2:43 pm
Subject: Re: [twitter-dev] Re: What is 140 characters?

I deployed a batch of explicit length checks this week to try and stop  
that madness. I didn't do the same for status text because it has  
another validation routine altogether. The Service Team should be able  
to help out with making that more sane.

— Matt

On Mar 6, 2009, at 11:38 AM, Craig Hockenberry wrote:


 
Cameron Kaiser
From: Cameron Kaiser <spec...@floodgap.com>
Date: Fri, 6 Mar 2009 11:47:56 -0800 (PST)
Local: Fri, Mar 6 2009 2:47 pm
Subject: Re: [twitter-dev] What is 140 characters?

> 2) 140 Unicode _multi-byte_ characters:
> <http://twitter.com/atebits/status/1286199010>

> What's curious is that Loren's example with 140 characters uses the
> Unicode 27A1 glyph. It uses 3 bytes in UTF-8. Why didn't it get
> truncated? This seems to contradict Alex's statement in the thread
> mentioned above.

> As people start to use things like Emoji, tinyarro.ws and generally
> figure out that Unicode (UTF-8) is a valid type of data on Twitter,
> our clients should adapt and display more accurate "characters
> remaining" counts. I can count bytes instead of characters, but I'm
> not sure if I should or not.

FWIW, I had a number of users complain about truncated UTF-8 sequences a
while back, which may be a symptom of this same problem. TTYtter now counts
bytes explicitly, and this seems to have dealt with the issue.
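
As an illustration of byte-based counting (a hypothetical Python sketch, not TTYtter's actual code), a client could trim text to a byte budget without leaving a partial UTF-8 sequence at the end:

```python
def truncate_utf8(text: str, max_bytes: int = 140) -> str:
    """Trim text to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops the incomplete sequence left by the byte slice.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


trimmed = truncate_utf8("\u27a1" * 140)

print(len(trimmed))                  # 46
print(len(trimmed.encode("utf-8")))  # 138
```

The 140-byte slice ends two bytes into the 47th arrow, so only 46 whole characters survive.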

--
------------------------------------ personal: http://www.cameronkaiser.com/ --
  Cameron Kaiser * Floodgap Systems * www.floodgap.com * ckai...@floodgap.com
-- Make welfare as hard to get as building permits. ---------------------------


 
Alex Payne
From: Alex Payne <a...@twitter.com>
Date: Mon, 9 Mar 2009 15:17:47 -0700
Local: Mon, Mar 9 2009 6:17 pm
Subject: Re: [twitter-dev] Re: What is 140 characters?
Just to keep the group updated: one of our engineers has claimed this
issue. It will be dealt with with EXTREME prejudice.

--
Alex Payne - API Lead, Twitter, Inc.
http://twitter.com/al3x

 
atebits
From: atebits <loren.brich...@gmail.com>
Date: Mon, 9 Mar 2009 16:27:36 -0700 (PDT)
Local: Mon, Mar 9 2009 7:27 pm
Subject: Re: What is 140 characters?
Just to confirm: "EXTREME prejudice" as in "140 *bytes* as defined by
UTF-8 with HTML entity encoding only for special (< > &) characters"?

So my tweet should *NOT* have worked?
http://twitter.com/atebits/status/1286199010

Would love to see some kind of official *this is how we determine how
long some hunk of unicode is* blurb on the API docs.

Loren

On Mar 9, 3:17 pm, Alex Payne <a...@twitter.com> wrote:


 
Alex Payne
From: Alex Payne <a...@twitter.com>
Date: Mon, 9 Mar 2009 17:03:02 -0700
Local: Mon, Mar 9 2009 8:03 pm
Subject: Re: [twitter-dev] Re: What is 140 characters?
Once the guys on the backend team get back to me, I'll provide as much.

--
Alex Payne - API Lead, Twitter, Inc.
http://twitter.com/al3x

 
TjL
From: TjL <luo...@gmail.com>
Date: Tue, 10 Mar 2009 20:11:20 -0400
Local: Tues, Mar 10 2009 8:11 pm
Subject: Re: [twitter-dev] Re: What is 140 characters?

On Mon, Mar 9, 2009 at 7:27 PM, atebits <loren.brich...@gmail.com> wrote:

> Just to confirm: "EXTREME prejudice" as in "140 *bytes* as defined by
> UTF-8 with HTML entity encoding only for special (< > &) characters"?

Just to interject: & has not been specially encoded except during
a brief time when " was also converted to &quot; (adding 5
characters) and & was converted to &amp; (adding 4). This was undone
in a matter of days, if not less.

I'd reiterate that there's no need to encode > as &gt;

If you are encoding < as &lt; there's no risk of someone getting an
<img> tag or <a href tag to work, so maybe there is an argument for a
left tag, but there's really no need to encode a right tag.

Figured I'd throw that out there FWIW

TjL


 
Nicolas Steenhout
From: Nicolas Steenhout <vavr...@gmail.com>
Date: Wed, 11 Mar 2009 13:21:30 +1300
Local: Tues, Mar 10 2009 8:21 pm
Subject: Re: [twitter-dev] Re: What is 140 characters?

> I'd reiterate that there's no need to encode > as &gt;

What about &eacute; or &agrave; ?? :)

Nic


 
Alex Payne
From: Alex Payne <a...@twitter.com>
Date: Tue, 10 Mar 2009 17:26:08 -0700
Local: Tues, Mar 10 2009 8:26 pm
Subject: Re: [twitter-dev] Re: What is 140 characters?
We consider the issue neither acute nor grave.

On Tue, Mar 10, 2009 at 17:21, Nicolas Steenhout <vavr...@gmail.com> wrote:

>> I'd reiterate that there's no need to encode > as &gt;

> What about &eacute; or &agrave; ?? :)

> Nic

--
Alex Payne - API Lead, Twitter, Inc.
http://twitter.com/al3x

 
Cameron Kaiser
From: Cameron Kaiser <spec...@floodgap.com>
Date: Tue, 10 Mar 2009 17:32:10 -0700 (PDT)
Local: Tues, Mar 10 2009 8:32 pm
Subject: Re: [twitter-dev] Re: What is 140 characters?

> > > I'd reiterate that there's no need to encode > as &gt;

> > What about &eacute; or &agrave; ?? :)

> We consider the issue neither acute nor grave.

You were waiting all day for that, weren't you?

--
------------------------------------ personal: http://www.cameronkaiser.com/ --
  Cameron Kaiser * Floodgap Systems * www.floodgap.com * ckai...@floodgap.com
-- I use my C128 because I am an ornery, stubborn, retro grouch. -- Bob Masse -


 
TjL
From: TjL <luo...@gmail.com>
Date: Tue, 10 Mar 2009 21:07:00 -0400
Subject: Re: [twitter-dev] Re: What is 140 characters?

On Tue, Mar 10, 2009 at 8:26 PM, Alex Payne <a...@twitter.com> wrote:
> We consider the issue neither acute nor grave.

UNFOLLOW.

Oh, wait, crap.


 
Andrew Badera
From: Andrew Badera <and...@badera.us>
Date: Tue, 10 Mar 2009 21:24:56 -0400
Local: Tues, Mar 10 2009 9:24 pm
Subject: Re: [twitter-dev] Re: What is 140 characters?
if you listen real hard, you can hear *groan*'s from the East Coast.


 
Nicolas Steenhout
From: Nicolas Steenhout <vavr...@gmail.com>
Date: Wed, 11 Mar 2009 16:46:59 +1300
Local: Tues, Mar 10 2009 11:46 pm
Subject: Re: [twitter-dev] Re: What is 140 characters?
You're all a bunch of degenera....  errr, geeks, right, that's right, GEEKS! :)

<shaking head>

Nic


 
Dossy Shiobara
From: Dossy Shiobara <do...@panoptic.com>
Date: Wed, 11 Mar 2009 00:09:08 -0400
Local: Wed, Mar 11 2009 12:09 am
Subject: Re: [twitter-dev] Re: What is 140 characters?
On 3/10/09 11:46 PM, Nicolas Steenhout wrote:

> You're all a bunch of degenera....  errr, geeks, right, that's right, GEEKS! :)

> <shaking head>

"All" is such an inclusive term, isn't it?  :-)

ONE OF US!  ONE OF US!

--
Dossy Shiobara              | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
   "He realized the fastest way to change is to laugh at your own
     folly -- then you can let go and quickly move on." (p. 70)


 
Nicolas Steenhout
From: Nicolas Steenhout <vavr...@gmail.com>
Date: Wed, 11 Mar 2009 18:13:03 +1300
Local: Wed, Mar 11 2009 1:13 am
Subject: Re: [twitter-dev] Re: What is 140 characters?

> "All" is such an inclusive term, isn't it?  :-)

> ONE OF US!  ONE OF US!

Hey!  I resemble that :)

But I fear we're somewhat West of the OT's question :)

And we still don't know what to do about encoding html entities used
for accents in languages such as French, Spanish, etc ;)

Nic


 
Dossy Shiobara
From: Dossy Shiobara <do...@panoptic.com>
Date: Wed, 11 Mar 2009 01:49:57 -0400
Local: Wed, Mar 11 2009 1:49 am
Subject: Re: [twitter-dev] Re: What is 140 characters?
On 3/11/09 1:13 AM, Nicolas Steenhout wrote:

> And we still don't know what to do about encoding html entities used
> for accents in languages such as French, Spanish, etc ;)

Transcode to their UTF-8 codepoints and punt.  Let the Twitter
developers figure out how to handle the data.  :-)
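
One way to sketch that transcoding step in Python is with the standard library's html.unescape, which turns named entities into the corresponding Unicode characters. The wrapper name and sample text are hypothetical, and note that it also decodes &lt;, &gt; and &amp;.

```python
import html


def decode_entities(text: str) -> str:
    """Replace HTML entities such as &eacute; with the characters they name."""
    return html.unescape(text)


source = "caf&eacute; &agrave; la carte"
decoded = decode_entities(source)

print(decoded)                       # café à la carte
print(len(decoded))                  # 15
print(len(decoded.encode("utf-8")))  # 17
```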

--
Dossy Shiobara              | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
   "He realized the fastest way to change is to laugh at your own
     folly -- then you can let go and quickly move on." (p. 70)


 
Craig Hockenberry
From: Craig Hockenberry <craig.hockenbe...@gmail.com>
Date: Wed, 11 Mar 2009 08:39:54 -0700 (PDT)
Local: Wed, Mar 11 2009 11:39 am
Subject: Re: What is 140 characters?
DO NOT ENCODE WITH HTML ENTITIES.

The only reason that < and > are encoded as &lt; and &gt; is because
these values are represented within an XML <text> element. This is
invalid XML:

<text>This <-- is a test</text>

And this is valid XML:

<text>This &lt;-- is a test</text>

If you use HTML entities, they will only show up correctly in a web
browser. SMS and other media will display &crap;.
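
The XML point can be checked with a short Python sketch using only the standard library. The wrap_status helper is hypothetical and only mirrors the <text> element from the example above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def wrap_status(text: str) -> str:
    """Build a <text> element, escaping &, < and > in the content."""
    return "<text>" + escape(text) + "</text>"


raw = "This <-- is a test"

# The unescaped form is not well-formed XML, so the parser rejects it.
try:
    ET.fromstring("<text>" + raw + "</text>")
    raw_is_valid = True
except ET.ParseError:
    raw_is_valid = False

# The escaped form parses, and the parser hands back the original text.
element = ET.fromstring(wrap_status(raw))

print(raw_is_valid)         # False
print(wrap_status(raw))     # <text>This &lt;-- is a test</text>
print(element.text == raw)  # True
```

An XML parser undoes the escaping on the way out, so a client that parses the feed properly never needs to handle &lt; by hand.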

-ch

On Mar 10, 10:13 pm, Nicolas Steenhout <vavr...@gmail.com> wrote:


 
Craig Hockenberry
From: Craig Hockenberry <craig.hockenbe...@gmail.com>
Date: Tue, 24 Mar 2009 07:56:37 -0700 (PDT)
Local: Tues, Mar 24 2009 10:56 am
Subject: Re: What is 140 characters?
Any news from the Service Team? I'd really like to get the counters
right in an upcoming release...

-ch

On Mar 6, 12:18 pm, Alex Payne <a...@twitter.com> wrote:


 
Alex Payne
From: Alex Payne <a...@twitter.com>
Date: Tue, 24 Mar 2009 18:36:50 -0700
Local: Tues, Mar 24 2009 9:36 pm
Subject: Re: [twitter-dev] Re: What is 140 characters?
Unfortunately, nothing definitive. We're still looking into this.

On Tue, Mar 24, 2009 at 07:56, Craig Hockenberry

--
Alex Payne - API Lead, Twitter, Inc.
http://twitter.com/al3x

 
Bill Robertson
From: Bill Robertson <billrobertso...@gmail.com>
Date: Tue, 24 Mar 2009 19:08:22 -0700 (PDT)
Local: Tues, Mar 24 2009 10:08 pm
Subject: Re: What is 140 characters?
I have been wondering too.  If it's a character, it should be a
character, whether it's an 'A', 'À' or '的'

On Mar 24, 9:36 pm, Alex Payne <a...@twitter.com> wrote:


 
TjL
From: TjL <luo...@gmail.com>
Date: Tue, 8 Sep 2009 17:05:09 -0400
Local: Tues, Sep 8 2009 5:05 pm
Subject: Re: [twitter-dev] Re: What is 140 characters?
It's been nearly 6 months. Has this question been answered? If so I missed it.


 
Matt Sanford
From: Matt Sanford <m...@twitter.com>
Date: Tue, 8 Sep 2009 22:07:14 -0700 (PDT)
Local: Wed, Sep 9 2009 1:07 am
Subject: Re: What is 140 characters?
Hi There,

    I'm sorry this never got updated. Some changes have been made and
are waiting to go out now. When I switched from working on the
Platform (formerly API) team to my focus on international I took over
this issue.
    Once this current fix is deployed (probably in a week or so since
I'm traveling at the moment) the definition of a character will be
consistent throughout our API. The new change will always compute
length based on the Unicode NFC [1] version of the string. Using the
NFC form makes the 140 character limit based on the length as
displayed rather than some under-the-cover byte arithmetic.
    I more than agree with the above statement that a character is a
character and Twitter shouldn't care. Data should be data. The main
issue with that is that some clients compose characters and some
don't. My common example of this is é. Depending on your client
Twitter could get:

é - 1 byte
   - URL Encoded UTF-8: %C3%A9
   - http://www.fileformat.info/info/unicode/char/00e9/index.htm

-- or --

é - 2 bytes
   - URL Encoded UTF-8: %65%CC%81
   - http://www.fileformat.info/info/unicode/char/0065/index.htm
     + plus: http://www.fileformat.info/info/unicode/char/0301/index.htm

    So, my fix will make it so that, no matter the client, if the user
sees é it counts as a single character. I'll announce something in the
change log once my fix is deployed.
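
A rough Python sketch of NFC-based counting, using the standard unicodedata module. The nfc_length name is an assumption for illustration, not the deployed code. It compares the two forms of é by code points, UTF-8 bytes, and length after NFC normalization.

```python
import unicodedata

composed = "\u00e9"     # é as one code point (U+00E9)
decomposed = "e\u0301"  # e followed by a combining acute accent (U+0301)


def nfc_length(text: str) -> int:
    """Count code points after normalizing the text to Unicode NFC."""
    return len(unicodedata.normalize("NFC", text))


for label, s in (("composed", composed), ("decomposed", decomposed)):
    print(label, len(s), len(s.encode("utf-8")), nfc_length(s))
# composed 1 2 1
# decomposed 2 3 1
```

Both forms count as one character after normalization. Sequences with no precomposed form keep more than one code point under NFC, so the count can still exceed what the user sees.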

Thanks;
  — Matt Sanford / @mzsanford

[1] - http://www.unicode.org/reports/tr15/

On Sep 9, 6:05 am, TjL <luo...@gmail.com> wrote:


 
Charles A. Lopez
From: "Charles A. Lopez" <charlesalo...@gmail.com>
Date: Wed, 9 Sep 2009 05:49:58 -0400
Local: Wed, Sep 9 2009 5:49 am
Subject: Re: [twitter-dev] Re: What is 140 characters?

2009/9/9 Matt Sanford <m...@twitter.com>

isn't that 2 bytes?

> -- or --

> é - 2 bytes
>   - URL Encoded UTF-8: %65%CC%81
>   - http://www.fileformat.info/info/unicode/char/0065/index.htm
>     + plus: http://www.fileformat.info/info/unicode/char/0301/index.htm

and this three bytes?


 