Two issues with descriptions in the API

maxpower47

Jul 3, 2013, 10:32:48 AM
to pinboa...@googlegroups.com
I've recently been looking at two issues with bookmark descriptions (the "extended" url parameter):
  • Line feeds - it looks like any line feeds in the description get replaced by a single space when returned from the api.  Not only does this mean I can't display the description as it actually appears on the site, but if I'm trying to sync back and forth then the next time that bookmark gets edited, the version of the description on the site will get replaced by the one with no line breaks.
  • 414 URI Too Long - I've apparently found the pinboard api URI length limit.  It looks like any api call with a URI longer than 4103 characters (4096 if you remove the https:/) will get denied with a 414 error.  Is this the intended behavior?  This can put a sharp limit on the length of the description added to a bookmark.  The site allows for longer descriptions, and the api will return longer descriptions (I haven't tested the limits of either except to see that they are higher than 4096).  The api specs seem to indicate that text type fields (like extended) have a limit of 65536 characters, so this seems like a bug to me.  At the very least the documentation should be updated.
I'd imagine that these probably won't be addressed until v2, but I thought it was good to bring them up so they can be added to the wishlist.
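
A client can guard against the 414 by measuring the encoded request before sending it. The following is only a rough sketch: the endpoint and the `url`/`description`/`extended` parameter names are assumed to match the v1 `posts/add` call, the 4103-character threshold is just the value observed above rather than a documented constant, and `build_add_url` and `fits_in_uri` are hypothetical helper names.

```python
from urllib.parse import urlencode

# Assumptions: v1 posts/add endpoint, and the limit observed in this thread.
API_ENDPOINT = "https://api.pinboard.in/v1/posts/add"
OBSERVED_URI_LIMIT = 4103


def build_add_url(url, title, extended):
    """Return the full GET URL for adding a bookmark (authentication omitted)."""
    query = urlencode({"url": url, "description": title, "extended": extended})
    return f"{API_ENDPOINT}?{query}"


def fits_in_uri(url, title, extended, limit=OBSERVED_URI_LIMIT):
    """True if the percent-encoded request URL stays within the observed limit."""
    return len(build_add_url(url, title, extended)) <= limit


if __name__ == "__main__":
    note = "first line\nsecond line"
    # Percent-encoding inflates the text: each line feed becomes "%0A".
    print(build_add_url("http://example.com", "Example", note))
    print(fits_in_uri("http://example.com", "Example", note))
    print(fits_in_uri("http://example.com", "Example", "x" * 5000))
```

Note that the check has to run on the encoded URL, since spaces, line feeds and non-ASCII characters all take more room in the URI than in the original description.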

Matt Schmidt
PinDroid

Stephen Darlington

Jul 3, 2013, 11:12:33 AM
to pinboa...@googlegroups.com
The first will happen if you're using XML format. XML parsers are supposed to convert linefeeds in attribute values into spaces. (See section 3.3.3, http://www.xml.com/axml/testaxml.htm.)

I'm surprised you're getting away with sending such long requests! Some mobile proxy servers barf with less than 2000 characters.

The first might be solved by using JSON rather than XML. The latter only by version 2.0 of the API, which I hope will use sensible HTTP methods rather than mimicking the nutty Delicious.com ones.

Cheers,
Stephen

------------------------------------------------------------------------
                    Stephen Darlington (www.zx81.org.uk)
                    "The sea monkeys have my money."
------------------------------------------------------------------------




maxpower47

Jul 3, 2013, 12:03:02 PM
to pinboa...@googlegroups.com
Well, I can retract the first one.  I was looking at the xml returned by the api in Chrome and it appeared that the line breaks were getting replaced by spaces, but I later realized that it was an XML pretty print extension that was doing that.  Looking at the source of the xml shows the line breaks.

Matt Schmidt
PinDroid

maciej

Jul 3, 2013, 12:14:25 PM
to pinboa...@googlegroups.com


On Wednesday, July 3, 2013 7:32:48 AM UTC-7, maxpower47 wrote:
  • 414 URI Too Long - I've apparently found the pinboard api URI length limit.  It looks like any api call with a URI longer than 4103 characters (4096 if you remove the https:/) will get denied with a 414 error.  Is this the intended behavior?  This can put a sharp limit on the length of the description added to a bookmark.  The site allows for longer descriptions, and the api will return longer descriptions (I haven't tested the limits of either except to see that they are higher than 4096).  The api specs seem to indicate that text type fields (like extended) have a limit of 65536 characters, so this seems like a bug to me.  At the very least the documentation should be updated.
This must be a length limit in Pound, Varnish or Apache - I don't throw it explicitly.  Another great reason this stuff should not be in the form of GET requests!

In any case, you are correct that I won't fix this until version 2 of the API.  That's basically where all API development effort is going.

maxpower47

Jul 3, 2013, 2:27:05 PM
to pinboa...@googlegroups.com
So it looks like even though the line breaks are being included in the XML, the XML standard requires a parser to normalize all whitespace characters (including newlines) in attribute values to plain space characters:

Before the value of an attribute is passed to the application or checked for validity, the XML processor must normalize it as follows:

  • a character reference is processed by appending the referenced character to the attribute value
  • an entity reference is processed by recursively processing the replacement text of the entity
  • a whitespace character (#x20, #xD, #xA, #x9) is processed by appending #x20 to the normalized value, except that only a single #x20 is appended for a "#xD#xA" sequence that is part of an external parsed entity or the literal entity value of an internal parsed entity
  • other characters are processed by appending them to the normalized value
So any standard parser (like SAX) replaces the line breaks with spaces before the value is handed off for use.  Hopefully v2 will move away from returning all fields in XML attributes and use child elements instead (at least for the free-text type fields like description).
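
The normalization is easy to reproduce outside the API. Here is a small sketch using Python's standard `xml.etree.ElementTree`; the `<post>` fragments are made up for illustration and are not real API output. It compares a literal line feed in an attribute, the same line feed written as a character reference, and the text placed in a child element.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments, not actual API responses.
as_attribute = '<post extended="line one\nline two"/>'      # literal line feed
as_char_ref = '<post extended="line one&#10;line two"/>'    # escaped line feed
as_child = '<post><extended>line one\nline two</extended></post>'

attr_value = ET.fromstring(as_attribute).get("extended")
char_ref_value = ET.fromstring(as_char_ref).get("extended")
child_value = ET.fromstring(as_child).find("extended").text

print(repr(attr_value))      # literal line feed normalized to a space
print(repr(char_ref_value))  # character reference survives as a line feed
print(repr(child_value))     # element text is not attribute-normalized
```

So the line break can only survive in an attribute if the server writes it as a character reference such as `&#10;`; child elements avoid the problem altogether.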

As an aside, it looks like the json output retains the newlines with \n.
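
For comparison, a minimal sketch of the JSON case. The payload is a hypothetical fragment with only an `extended` field, not a full response.

```python
import json

# Hypothetical fragment of a JSON response; the line feed arrives escaped as \n.
payload = '{"extended": "line one\\nline two"}'

post = json.loads(payload)
lines = post["extended"].splitlines()
print(lines)
```

After decoding, the description contains a real line feed again, so it can be displayed or synced back without losing the break.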

Matt Schmidt
PinDroid