This is partly a "heads up" to other client developers who may be
massaging the content of Blogger posts to be more easily editable, and
partly a "proof of concept" for why I continue to argue that Blogger's
unpredictable insertion of <br /> tags into the post content is
annoying and sets up a fragile relationship with clients that allow
verbatim editing of the content.
If a client sends literal post content such as this to Blogger:
-
Paragraph 1
Paragraph 2
Paragraph 3
-
It seems to be stored literally in the database with the newlines
intact, but for display purposes and (unfortunately) for generating
the Atom-based feed for editors, it has <br /> tags inserted. Up until
very recently, the inserted tags completely replaced the newlines.
With the recent change, the tags now appear in addition to the newlines.
In order to give users the impression that Blogger has not mucked with
their content, I have been replacing the <br /> tags with verbatim
newlines in my editor. Of course, with this recent change, I am now
adding too many newlines, since the newlines are already there.
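
A minimal sketch of a client-side conversion that tolerates both formats, assuming the feed contains either a bare <br /> tag or a <br /> tag followed by one newline. It is written in Python purely for illustration; the name br_to_newlines is hypothetical and this is not MarsEdit's actual code.

```python
import re

# Matches <br>, <br/>, or <br /> (any case), plus at most one newline
# that immediately follows the tag.
_BR_WITH_OPTIONAL_NEWLINE = re.compile(r"<br\s*/?>(?:\r?\n)?", re.IGNORECASE)


def br_to_newlines(feed_html: str) -> str:
    """Turn each inserted break tag back into exactly one newline."""
    return _BR_WITH_OPTIONAL_NEWLINE.sub("\n", feed_html)


# Assumed shapes of the two feed formats discussed in this thread.
old_format = "Paragraph 1<br />Paragraph 2<br />Paragraph 3"
new_format = "Paragraph 1<br />\nParagraph 2<br />\nParagraph 3"

# Both formats collapse to the same editable text.
print(br_to_newlines(old_format) == br_to_newlines(new_format))
```

Consuming the trailing newline together with the tag is what keeps the editor from doubling line breaks when the server switches formats.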
Ideally, I would love to have Blogger simply return via AtomPub the
same content that I had sent it. This fragile relationship is a setup
for a bad user experience, and makes it impossible for remote
clients to reliably match the experience that users get when using the
web-based editor.
Now, the format of the markup seems to be changing by the minute,
between the "new" and the "old." Is this because I'm getting
load-balanced to a different server running different code, or is
somebody indecisive about how the server should behave, or ... is
somebody just messin' with me ;)
I'd like to know if the switch to the new "break tags + newlines"
format is something Blogger plans to stick with, so I can release an
update of MarsEdit that adapts to the new formatting.
Daniel
On Jun 28, 3:56 pm, Daniel Jalkut <jal...@gmail.com> wrote:
> This is partly a "heads up" to other client developers who may be
> massaging the content of Blogger posts to be more easily editable, and
> partly a "proof of concept" for why I continue to argue that Blogger's
> unpredictable insertion of <br /> tags into the post content is
> annoying and sets up a fragile relationship with clients that allow
> verbatim editing of the content.
> If a client sends literal post content such as this to Blogger:
> -
> Paragraph 1
> Paragraph 2
> Paragraph 3
> -
> It seems to be stored literally in the database with the newlines
> intact, but for display purposes and (unfortunately) for generating
> the Atom-based feed for editors, it has <br /> tags inserted. Up until
> very recently, the format of the inserted tags was as complete
> replacements for the newlines:
> In order to give users the impression that Blogger has not mucked with
> their content, I have been replacing the <br /> tags with verbatim
> newlines in my editor. Of course, with this recent change, I am now
> adding too many newlines, since the newlines are already there.
> Ideally, I would love to have Blogger simply return via AtomPub the
> same content that I had sent it. This fragile relationship is a setup
> for a bad user experience, and makes it impossible for remote
> clients to reliably match the experience that users get when using the
> web-based editor.
On Sat, Jun 28, 2008 at 5:01 PM, Daniel Jalkut <jal...@gmail.com> wrote:
> Now, the format of the markup seems to be changing by the minute,
> between the "new" and the "old." Is this because I'm getting
> load-balanced to a different server running different code, or is
> somebody indecisive about how the server should behave, or ... is
> somebody just messin' with me ;)
> I'd like to know if the switch to the new "break tags + newlines"
> format is something Blogger plans to stick with, so I can release an
> update of MarsEdit that adapts to the new formatting.
Hey Daniel. Apologies for the bug. We pushed out a fix this afternoon to revert to the old behavior of changing "\n" to "<br>" instead of "<br>\n". You were probably testing during the push, hence the changing behavior.
As for getting back out what you put in, I can't guarantee that we'll always preserve that exactly. For one, the Atom API has to serve double-duty as both an editing format and a syndication format, so we'll always output HTML that resembles what we'd send to browsers.
That being said, it's a bug if we're sending HTML that does not roundtrip correctly if you gave it back to us. In other words:
We actually still have an issue that if you send \ns in your HTML, those will be interpreted according to the blog's Convert Line Breaks setting. This bug has been around for a while, and I'll probably fix it fairly soon to not treat linebreaks from API clients with any significance. We would return them as they were sent.
Hopefully no one out there is assuming that \ns in HTML sent to Blogger will always make <br> tags, as that's not the case in blogs with Convert Line Breaks off.
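
As a rough illustration of the behavior described above, here is a toy model in Python. It is not Blogger's code: render_content and its convert_line_breaks flag are hypothetical stand-ins for the server and for the blog's Convert Line Breaks setting.

```python
def render_content(stored: str, convert_line_breaks: bool) -> str:
    """Toy model: newlines become <br> only when the setting is on."""
    if convert_line_breaks:
        return stored.replace("\n", "<br>")
    return stored


def round_trips(sent: str, convert_line_breaks: bool) -> bool:
    """True if feeding the output back in yields the same output again."""
    first = render_content(sent, convert_line_breaks)
    second = render_content(first, convert_line_breaks)
    return first == second


sent = "Paragraph 1\nParagraph 2"
# The same input yields different feed content depending on the setting,
# so a client cannot assume that "\n" always comes back as <br>.
print(render_content(sent, convert_line_breaks=True))
print(render_content(sent, convert_line_breaks=False))
```

In this model the "\n" to "<br>" output round-trips under both settings because it contains no newlines for the server to reinterpret.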
Thanks for the info. I am glad you guys decided to revert the
behavior. Makes it easier for me ;)
I appreciate the challenges of using a single feed for both HTML
syndication and for the AtomPub clients. But what it makes me wonder
is if it wouldn't make sense to offer a different collection URL for
AtomPub editors. Something that could access the "raw" material and
therefore have a chance at presenting the exact same content that the
web interface does.
I like the idea of Blogger not automatically converting newlines to
<br /> ... although ... if you "fix" this issue, I hope you'll give
some thought to the idea that users are by now quite used to being
able to send chunks of text separated by double-newlines, and having
those interpreted by Blogger in a way that sets those chunks visually
as "paragraphs".
WordPress and other systems also apply a set of default "manipulations"
to the incoming HTML, to convert the plain newlines into the
appropriate markup. I don't think most users mind this, but the <br />
tags that Blogger is inserting are not particularly sophisticated, I
think because they are mainly for presentation at render-time.
In summary, I wouldn't mind, and I don't think most users would mind,
if instead of the "ugly" linebreak tags, Blogger were to massage
incoming text by wrapping double-newline separated text with <p> tags
(with appropriate newlines preserved for pretty editing).
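
The paragraph wrapping suggested here could be sketched roughly as follows. This is a naive Python illustration under the assumption that the input is plain text with no block-level HTML of its own; wrap_paragraphs is a hypothetical name, not anything Blogger or WordPress provides.

```python
import re


def wrap_paragraphs(text: str) -> str:
    """Wrap chunks separated by blank lines in <p> tags.

    A blank line between the resulting paragraphs is kept so the markup
    stays readable when edited by hand.
    """
    # Two or more consecutive newlines mark a paragraph boundary.
    chunks = re.split(r"\n{2,}", text.strip())
    return "\n\n".join(f"<p>{chunk}</p>" for chunk in chunks if chunk)


print(wrap_paragraphs("Paragraph 1\n\nParagraph 2\n\nParagraph 3"))
```

Single newlines inside a chunk are left untouched, which is a simplification; a fuller version would have to decide how to treat them and how to avoid wrapping existing block elements.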
Daniel
On Jun 28, 9:33 pm, "Pete Hopkins ♬☠" <phopk...@google.com> wrote:
> On Sat, Jun 28, 2008 at 5:01 PM, Daniel Jalkut <jal...@gmail.com> wrote:
> > Now, the format of the markup seems to be changing by the minute,
> > between the "new" and the "old." Is this because I'm getting load-
> > balanced to a different server running different code, or is somebody
> > indecisive about how the server should behave, or ... is somebody just
> > messin' with me ;)
> > I'd like to know if the switch to the new "break tags + newlines"
> > format is something Blogger plans to stick with, so I can release an
> > update of MarsEdit that adapts to the new formatting.
> Hey Daniel. Apologies for the bug. We pushed out a fix this afternoon
> to revert back to the old behavior of changing "\n" to "<br>" instead
> of "<br>\n". You probably were testing during the push, hence the
> changing behavior.
> As for getting back out what you put in, I can't guarantee that we'll
> always preserve that exactly. For one, the Atom API has to serve
> double-duty as both an editing format and a syndication format, so
> we'll always output HTML that resembles what we'd send to browsers.
> That being said, it's a bug if we're sending HTML that does not
> roundtrip correctly if you gave it back to us. In other words:
> We actually still have an issue that if you send \ns in your HTML,
> those will be interpreted according to the blog's Convert Line Breaks
> setting. This bug has been around for a while, and I'll probably fix
> it fairly soon to not treat linebreaks from API clients with any
> significance. We would return them as they were sent.
> Hopefully no one out there is assuming that \ns in HTML sent to
> Blogger will always make <br> tags, as that's not the case in blogs
> with Convert Line Breaks off.
On Sun, Jun 29, 2008 at 12:50 AM, Daniel Jalkut <jal...@gmail.com> wrote:
> Thanks for the info. I am glad you guys decided to revert the
> behavior. Makes it easier for me ;)
I was poking through the code for the new editor and noticed that we're creating <br>\n for newlines made with the new editor. I'd like to keep this because it makes the generated HTML in the blog easier to read.
(Note that this does not reintroduce the <pre> bug from this weekend, as the new editor does not replace \ns with <br>s in <pre> tags.)
I would recommend ignoring any newlines you get in an entry, as they are definitely not significant. As I mentioned before, I think that it's a bug that we treat them as significant on the way back in.
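
A client following that recommendation might do something like the sketch below. It is a Python illustration only, and it assumes newlines in the entry appear next to tags such as <br> rather than between words; drop_insignificant_newlines is a hypothetical name. Newlines inside <pre> blocks are kept, since they are significant there.

```python
import re

# Splits the content into <pre>...</pre> blocks and everything else.
# The capturing group makes re.split keep the <pre> blocks in the result.
_PRE_BLOCK = re.compile(r"(<pre\b.*?</pre>)", re.IGNORECASE | re.DOTALL)


def drop_insignificant_newlines(entry_html: str) -> str:
    """Remove newlines everywhere except inside <pre> blocks."""
    parts = _PRE_BLOCK.split(entry_html)
    # With one capturing group, odd-indexed parts are the <pre> blocks.
    return "".join(
        part if index % 2 == 1 else part.replace("\r", "").replace("\n", "")
        for index, part in enumerate(parts)
    )


print(drop_insignificant_newlines("a<br>\nb<pre>x\ny</pre>\nc"))
```

If newlines could also separate words, replacing them with a single space instead of deleting them would be the safer choice.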
> I appreciate the challenges of using a single feed for both HTML
> syndication and for the AtomPub clients. But what it makes me wonder
> is if it wouldn't make sense to offer a different collection URL for
> AtomPub editors. Something that could access the "raw" material and
> therefore have a chance at presenting the exact same content that the
> web interface does.
> I like the idea of Blogger not automatically converting newlines to
> <br /> ... although ... if you "fix" this issue, I hope you'll give
> some thought to the idea that users are by now quite used to being
> able to send chunks of text separated by double-newlines, and having
> those interpreted by Blogger in a way that sets those chunks visually
> as "paragraphs".
Does MarsEdit send <br> tags or newlines? I'd like to get into the habit of having API clients send HTML, with its whitespace semantics.
If you (and anyone else out there, please chime in) are relying on being able to send \ns in the entry and have those converted to <br/>s, we might have to tie the ignore-\n behavior to some sort of format change (perhaps the upcoming AtomPub 1.0 endpoint for Blogger).
> WordPress and other systems also apply a set of default "manipulations"
> to the incoming HTML, to convert the plain newlines into the
> appropriate markup. I don't think most users mind this, but the <br />
> tags that Blogger is inserting are not particularly sophisticated, I
> think because they are mainly for presentation at render-time.
> In summary, I wouldn't mind, and I don't think most users would mind,
> if instead of the "ugly" linebreak tags, Blogger were to massage
> incoming text by wrapping double-newline separated text with <p> tags
> (with appropriate newlines preserved for pretty editing).
This is something I'm looking into. I definitely prefer <p> tags to <br/>s, though the code to do this without trashing complicated HTML is hairy.
Nevertheless, I think that this is an issue for API clients. I'd prefer for you to generate the HTML for <p>s or whatever and send it to us, rather than send newlines and have us interpret them.
That being said, if there's pushback and you want to use our algorithms for linebreaking/paragraphing, perhaps that can be triggered by an additional attribute or special content type.
Hi all. To bring some closure to the newline stuff:
The current newline behavior (we try not to send any out, and treat those that come in according to the blog's Convert Line Breaks setting) will be preserved for all clients and both the old and new rich text editors if you do nothing. I'm writing code to strip the newlines out of the new rich text editor's output in order to maintain this compatibility.
The upcoming AtomPub 1.0 version of the Blogger API will treat entry content as HTML, meaning newlines are whitespace only and will not ever get converted to <br>s. To preserve compatibility with shipped software, your app will have to specially find or request the AtomPub 1.0 version of the API.
More details on the AtomPub 1.0 stuff (how to find / enable the endpoints, etc.) will come when it's all ready (at least a few weeks, given where we are currently in Blogger's release cycles).
Thanks for the update, Pete. I'm glad to hear the line for
compatibility will be drawn at the AtomPub interface (and that you'll
be offering a separate endpoint URL to avoid confusion).
Look forward to trying the new AtomPub stuff.
Daniel
On Jul 31, 7:12 pm, "Pete Hopkins ♬☠" <phopk...@google.com> wrote:
> Hi all. To bring some closure to the newline stuff:
> The current newline behavior (we try not to send any out, and treat
> those that come in according to the blog's Convert Line Breaks
> setting) will be preserved for all clients and both the old and new
> rich text editors if you do nothing. I'm writing code to strip the
> newlines out of the new rich text editor's output in order to maintain
> this compatibility.
> The upcoming AtomPub 1.0 version of the Blogger API will treat entry
> content as HTML, meaning newlines are whitespace only and will not
> ever get converted to <br>s. To preserve compatibility with shipped
> software, your app will have to specially find or request the AtomPub
> 1.0 version of the API.
> More details on the AtomPub 1.0 stuff (how to find / enable the
> endpoints, etc.) will come when it's all ready (at least a few weeks,
> given where we are currently in Blogger's release cycles).