Monday, October 25, 2010

Blogger's 3 year paragraph debacle - the case of universal line break conversion

(Post title revised to reflect updates.)

I'm increasingly running across old posts that I've not touched where paragraphs have now vanished.

Not content with ruining formatting on newer posts, Google (Blogger) is now blowing up older posts.

I need to find an alternative to Blogger.

Update: I asked at Blogger's help group, but, based on the questions there, I doubt it will get any attention. Here's a sample of the damage. I have hundreds to thousands of old posts like this ...

Update 10/29/10: This has been going on since 2007. Three years of screwing up.

Update 10/29/10b: I've figured out part of this, thanks to a hint in that 2007 article. Blogger has a feature in settings that turns out to have devastating side effects:

I believe the default setting is "convert line breaks". I changed it to NO to see if non-conversion would help with Google Composer's longstanding paragraph and format mangling. It never occurred to me that I was changing a setting that would be applied to every post in my blog. I reversed this setting on tech.kateva.org and my old posts now have line feeds again.

On notes.kateva.org I'd never changed the setting, so it wasn't disrupted.
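Here is a minimal sketch of why a single setting can change every post at once. It assumes the conversion happens when a post is displayed rather than when it is saved, which is consistent with old posts recovering once I reversed the setting. The function name `render_post` and its flag are hypothetical; this is not Blogger's actual code.

```python
def render_post(body: str, convert_line_breaks: bool) -> str:
    """Hypothetical display step: the stored post body is never modified."""
    if convert_line_breaks:
        # Normalize Windows and old Mac line endings, then emit one <br /> per newline.
        normalized = body.replace("\r\n", "\n").replace("\r", "\n")
        return normalized.replace("\n", "<br />")
    # With conversion off, raw newlines pass through and browsers collapse
    # them into ordinary whitespace, so paragraphs appear to vanish.
    return body


old_post = "First paragraph.\n\nSecond paragraph."
print(render_post(old_post, convert_line_breaks=True))
# prints: First paragraph.<br /><br />Second paragraph.
```

Under this assumption, an old post that relied on blank lines for paragraph breaks renders as one run-on block as soon as the flag is turned off, even though nobody edited it.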

Incidentally, I have two new insights on what's wrong with Blogger's various editors. MarsEdit's HTML view illustrated the second bug:

  1. Blogger's rich text editor paragraph controls get confused when a paragraph begins with bold text. Frequently, but not always, this triggers an extra line feed.
  2. Blogger's editor sometimes inserts <div> tags when it should insert <p> tags. In the rich text editor these create paragraphs, but browser behavior is variable. To quote Jennifer Kyrnin: "The <div> tag is not a replacement <p> tag. The <p> tag is for paragraphs, only, while the <div> tag defines more general divisions within a document." Others have been confused about this distinction.

Update 10/29/10c: It appears that the editor is inserting two <br> tags and a <div> tag instead of a <p> tag. Both the current standard editor and the draft editor do this; I think the old editor might have inserted a single <br> tag and a <div> tag. This is a terrible practice. See this Stack Overflow discussion and this one.
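For anyone cleaning up exported posts, here is a rough sketch of turning the double <br> pattern back into real paragraphs. It is a regex-based illustration only: the function name `br_runs_to_paragraphs` is my own, it leaves <div> tags untouched, and a real cleanup would want a proper HTML parser.

```python
import re

# Two or more consecutive <br> tags in any common spelling: <br>, <br/>, <br />.
DOUBLE_BR = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def br_runs_to_paragraphs(html: str) -> str:
    """Split on runs of two or more <br> tags and wrap each chunk in <p>."""
    chunks = [chunk.strip() for chunk in DOUBLE_BR.split(html)]
    # Skip empty chunks so leading or trailing <br> runs do not create empty paragraphs.
    return "\n".join(f"<p>{chunk}</p>" for chunk in chunks if chunk)


print(br_runs_to_paragraphs("One.<br><br>Two.<BR /><br/>Three."))
```

A single <br> is left alone, on the assumption that it was an intentional line break inside a paragraph.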

Update 10/30/10: The MarsEdit forum has a 2008 post on Blogger's flailing about with paragraph breaks, and there's a companion thread in the Blogger developer forum. The developer group is only moderately interesting; it's been invaded by desperate end users seeking support. There is a "new developer relations engineer", perhaps because his predecessor was last seen drinking heavily in an Alaskan bar.

I wonder if there's a fundamental flaw in Atom Pub 1.0 that somehow led to Blogger's twisted implementation of the paragraph.
