Saturday, December 31, 2011

Manipulating JSON data in a traditional relational database (Microsoft Access, FileMaker Pro, Converters)

While I wait to see if Pinboard can fix their Google Reader JSON (JavaScript Object Notation) import, and while I consider Google Reader Share JSON import into WordPress, I'm also exploring JSON import/export tools. If, for example, I could import JSON into FileMaker Pro or another data management tool, I might be able to manipulate the archive and produce a more useful WordPress import.

StackOverflow and its kin have a good set of references on this topic. Note that CSV can manage only very simple JSON; we really want native importers similar to what Microsoft Access tried with XML [1]. I suspect one approach might be to convert JSON to XML then use Microsoft Access 2010 import.
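Here is a minimal sketch of the JSON-to-XML idea, assuming Python and only its standard library. The function names and the "item" tag used for list entries are my own choices, not anything Access requires. It also assumes every JSON key is a legal XML element name, which real archives may not honor.

```python
import json
import xml.etree.ElementTree as ET


def json_to_element(tag, value):
    """Recursively convert a parsed JSON value into an XML element."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        # Each key becomes a child element (assumes keys are valid XML names).
        for key, child in value.items():
            elem.append(json_to_element(key, child))
    elif isinstance(value, list):
        # Lists have no names in JSON, so each entry gets a generic "item" tag.
        for child in value:
            elem.append(json_to_element("item", child))
    elif value is not None:
        elem.text = str(value)
    return elem


def json_to_xml(json_text, root_tag="root"):
    """Return an XML string for a JSON document."""
    root = json_to_element(root_tag, json.loads(json_text))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = '{"title": "A post", "tags": ["a", "b"]}'
    print(json_to_xml(sample, "entry"))
```

The nested "tags" list is exactly the kind of structure that a flat CSV row can't hold without inventing extra columns, which is why an XML (or native JSON) importer is the better fit.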

Incidentally, this topic veered off unexpectedly into something that's actually relevant to my work life and a Strata conference I'm attending in a few weeks.

As of today here are some of my pointers ...

For me this DivConq series was particularly useful because it placed JSON NoSQL processing in a familiar context - Microsoft Access.

Maybe I should start using Apache Cassandra to manipulate my Google Reader JSON archive and prepare it for WordPress processing. For example ... Cassandra Development Environment in Mac OS Snow Leopard « BigDiver.

[1] I doubt JSON has truly significant advantages over XML as a data interchange format (see JSON Example and wikipedia xml/json). Alas, nobody asked me. Fashion is more powerful than geeks imagine.

Wednesday, December 28, 2011

Pinboard imports Google Reader JSON exports

Pinboard is the first service I know of that will import a Google Reader Social (shared item) JSON file:

Pinboard: howto page
Google Reader: Click the gear icon in the upper right corner of the page. Select the Import/Export tab. Choose either items you have starred or items you have shared and click the Reader JSON link (the rightmost column).

When I stumble unexpectedly over something I've been looking for, I look for who else found it. Then I add them to my reading list. Google gave me only these references:

Pinboard has a feed; I don't know if importing will trigger feed actions (probably not).
See also:

Update: I paid my $10 and imported my Google Reader shared item JSON file. I have 3 days to cancel. I used Amazon payments.

Here are the results; as of today the most recent post is 7 weeks old. I may also try importing the JSON for my Reader shared items, which may produce some duplicates.

  • http://pinboard.in/u:jgordon - my pinboard collection - really my Google Reader shared items. Note my user name is a part of the URL, so it's nice that 'jgordon' was available. Posts show a title, a bookmark, and an excerpt. I think my GR annotations precede the excerpt. It's more like Google Reader Social than I'd expected.
  • http://feeds.pinboard.in/rss/u:jgordon/ - the public feed for my collection. I viewed this in Google Reader; gave me a real sense of deja vu. Alas, GR only pulled in 44 items.

I'm still studying the results. So far Pinboard is only showing a fraction of the JSON file, there are no tags, and every item shows a date of '9 weeks ago'. I don't see a convenient way to navigate across the entire collection.

Update 12/31/11: Pinboard has now imported 2 months of Reader shares - about 1100 items or roughly 1% of the total.

Sharing and annotation: Instapaper's supporting apps

I haven't found a replacement for the rough annotation-share-feed ecosystem that had grown up around Google Reader Social (RIP). I've given up, for example, on using Twitter as a Reader Social replacement.

Yes, I miss Google 1.0. I even miss Microsoft these days.

So I'm continuing to explore the pieces of the post-Google world; trying to see where this micro-market may go. This is poorly tracked territory, but today I came across an unexpected guide in the Instapaper: Supporting iPhone and iPad Apps page.

Instapaper has an ecosystem, and although it doesn't have a feed, it will post to Tumblr, Twitter and Pinboard. Tumblr has a feed (barely), Twitter can be turned into a feed (awkwardly) and Pinboard has a feed (and, mercifully, it's not free).

So what can I do with these pieces? Can I archive the output of Pinboard as WordPress posts?

I'll find out.

See also:
Update
  • I tried Instapaper's bookmarklet, but it hangs in Chrome with a "saving" status in the tab.

Friday, December 23, 2011

Recovering Shared Reader items: JSON import into WordPress

Google amputated a portion of my distributed memory, but they left me a frozen JSON remnant.

(Yes, we are living in a cyberpunk novel. Sigh.)

Across the net there are unanswered questions about what to do with these JSON archives. Google has been silent. I believe the Google humans who might help are ashamed or demoralized or fearful. Google 2.0 is not a happy place for them.

I want to represent my JSON archives as posts in a WordPress blog, perhaps with some kind of synthetic title. Then they will be available to search and link. Eventually I hope to add new annotations and shares to that archive, though there will be a gap of several months that will be difficult to fill.
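As a rough sketch of what that conversion might look like, the Python below turns a Reader export into a list of post drafts with a synthetic title when an item has none. The field names ("items", "title", "published", "origin", "alternate", "annotations", "summary", "content") are my assumptions about the Reader JSON layout and should be checked against a real export. The output dictionaries are a hypothetical intermediate form, not a WordPress import format.

```python
import json
from datetime import datetime, timezone


def published_utc(item):
    """Read the item's timestamp (assumed to be Unix seconds) as a UTC datetime."""
    return datetime.fromtimestamp(item.get("published", 0), tz=timezone.utc)


def synthetic_title(item, max_len=60):
    """Use the item's own title, else build one from its source feed and date."""
    title = (item.get("title") or "").strip()
    if title:
        return title[:max_len]
    source = item.get("origin", {}).get("title", "Shared item")
    return f"{source} ({published_utc(item):%Y-%m-%d})"


def items_to_posts(reader_json):
    """Convert a Reader JSON export (as text) into a list of post drafts."""
    posts = []
    for item in json.loads(reader_json).get("items", []):
        links = item.get("alternate") or [{}]
        notes = [a.get("content", "") for a in item.get("annotations", [])]
        # Assumed: the body lives under "content" or "summary", each with a "content" key.
        body = (item.get("content") or item.get("summary") or {}).get("content", "")
        posts.append({
            "title": synthetic_title(item),
            "date": published_utc(item).isoformat(),
            "link": links[0].get("href", ""),
            "annotation": "\n".join(notes),
            "excerpt": body,
        })
    return posts
```

From here the drafts would still need to be written out in whatever form the WordPress importer accepts; that step is not covered by this sketch.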

This feels doable, but so far Google (the search engine) hasn't told me how. This is what I have found so far. When I do find an answer, I'm going to answer some of the dangling questions across the net ...

I'll update this post as I learn more. Seth's contribution suggests a fix is close; he needs to tweak some of his code.

Update 12/25/2011: Seth writes that he won't have time to work on this further but he recommends downloading his PHP file from his linked zip. I'll have to learn how to run PHP scripts from my Dreamhost account, but I don't think that's too hard.

Update 5/3/2012: Coping With Google Reader Changes | Much Ado About IT - accessing the lost items.

Thursday, December 22, 2011

The cost of repairing a Mac is less than expected

When my iMac 11,1's 2yo 1TB Seagate drive developed metastatic blockitis I was most unhappy.

It wasn't that the drive is dying young, just two weeks after my AMEX extended warranty lapsed. Two years is short for a drive, but this machine runs all the time and goes through two full disk backups every single night. The drive has had a hard life.

The fact that Apple's diagnostics missed my drive's unmasked bad blocks is annoying. There's no magic to a disk scan; I shouldn't have had to buy TechTool Pro to make a diagnosis. Windows diagnostics have managed this for twenty years.

Worse, though, is the cost of the repair. FirstTech, a well regarded local shop, gave me a $625 quote to install a 2TB replacement. (They can't get 1TB drives.) I was amazed; I'm used to paying $150 or so for a drive and doing my own installation. That's what I did when my old fully serviceable G5 drive died.

The problem is that the lovely 27" iMac is not user serviceable. Elegant, quiet design with special thermal sensor cables turns out to have a high post-purchase price. That's why I wrote ...

Gordon's Tech: Mac drive diagnostics: TechTools Pro and Drive Genius find problems OS X missed

... When you consider that iMac 27" hard drives are NOT user serviceable, the iMac is more expensive than it seems. The iMac G5 was entirely user serviceable. Design has its price....

I was wrong though. Today I checked what the cost would be for an Apple store repair. They quoted me about $200 for a 1TB drive replacement. (They don't do upgrades, only like-for-like replacements.)

How do they do that? Apple has a flat $40 service fee, regardless of the complexity of the repair. Apple offsets the ownership cost of their elegant designs by subsidizing repair. (In this case they have another advantage -- they have an inventory of 1TB drives with bundled thermal cables even as the world runs short of hard drives.)

I still prefer my G5 iMac's design -- but that was a hot and noisy machine. My 11,1 (i5, 27" 2009) iMac is quiet and cool most of the time. Apple's subsidy of post-warranty repairs makes that tradeoff more palatable - at least if you live near an Apple store!

Update 1/6/12: One warning: they will want to keep the hard drive. This fits with their out-of-warranty repair following their in-warranty process. Apple should make this better known. It means if you bring your machine in for an Apple Store repair, you need to do a secure wipe first. Some additional tips:

  • Create an admin account with no password that Apple can use for testing. I didn't think of this, and my machine has guest account disabled.
  • They will want to recreate the problem -- even though I have to pay for the repair. Again, their out-of-warranty repair is basically in-warranty with a parts-charge and subsidized labor.
Update 9/2/2012: In comments Alex points to a 1TB Seagate drive replacement program! Alas, my serial number wasn't accepted. I bet this is what hit me though. I wonder if Apple will eventually extend the program. I'm W8946HAH5PJ.

Update 8/3/2013: Sometime in the past year Apple sent me a check for the costs of this drive. They did eventually extend the replacement program.

Tuesday, December 20, 2011

Mac drive diagnostics: TechTools Pro and Drive Genius find problems OS X missed

I've been having suspicious application crashes lately. In the Mac world, that suggests a hardware problem. (Once upon a time software was the cause, but these days it's hardware.)

I knew from some backup issues that I had 3 unreadable files. That suggests my 24+ month old 1TB iMac drive is dying youngish. It was time for some diagnostics, so I plugged in my old Apple Keyboard and mouse. (Most non-trivial diagnostic work requires a wired keyboard and mouse; Apple's bluetooth keyboard/mouse drivers may be unavailable when needed.)

After I deleted the bad (non-critical happily) files I ran Disk Utility - but the drive passed. Then I ran my Apple Hardware Test - extended, and loop mode. Still no problems.

I didn't believe it. Something had to be wrong. So I checked out Disk Warrior, Drive Genius and TechTool Pro - 3 reputable diagnostic apps. They're all $100. Disk Warrior has a good reputation, but Drive Genius has a trial version. It found about 58 bad blocks -- out of 1.8 billion. That seems a modest number, but DG said I needed to replace the drive. (Incidentally, Drive Genius has a built in uninstall feature -- very nice. Yes, the Mac needs an OS level uninstaller.)

I decided to get a second opinion. Andy M clued me to a MacUpdate bundle, so I got TechTool Pro 6 for $50 (plus a bunch of other apps I don't care about). It one-upped Drive Genius; as it found bad blocks it told me which of them had files (none in this case).

TTP also found bad blocks - 56 (so two fewer than Drive Genius, but I don't make much of that either way).

I wasn't sure what to make of this. After all, 56 out of 1.8 billion is minuscule. Unfortunately, a modern SATA drive shouldn't have any bad blocks. The excellent TTP manual explains why ...
... TechTool Pro should not normally report bad blocks for these types of drives. The drive controller in them automatically tries to map out bad blocks as they are encountered. It will do this unless either the bad block is in a critical area that cannot be mapped out at the moment or the bad block table is full. If this occurs, TechTool Pro will report a bad block and you will ultimately need to do a low level reinitialization of the drive. When the drive is reinitialized, the entire platter is accessible so that bad blocks can be mapped out if possible no matter where they occur...

... You can use Apple's Disk Utility to reinitialize your drive. Be sure to choose the Security Option to "zero out data." Choosing this option will map out bad blocks, if possible, during the reinitialization. This may take several hours (depending on the size of your drive). If the reinitialization is successful, the drive should be fine at that point. We suggest, however, that you do a Surface Scan a few times in the next month or two just to be sure no new bad blocks are developing. If they are, then the drive is probably failing and you should consider replacing it. If a low level reinitialization fails, this indicates the drive is faulty and needs to be replaced...
So the bad blocks I see now are probably a small fraction of the number that have already been mapped out. I'm seeing the overflow, including blocks that went bad after they'd been written to.

My i5 iMac is 24 months and 2 weeks old - so it's past even my AMEX extended warranty (by two weeks!). If the drive were user serviceable (like my old G5 iMac!) I'd simply replace it. Since it's not user serviceable I'll probably bring a new drive and the machine to FirstTech in Minneapolis for a $200 24 hour turnaround replacement. I'll make a bootable clone before I do that. (My usual Carbon Copy Cloner backups are to an encrypted image for offsite transfer, so not bootable.)

If it doesn't I'll try the reinitialization. (See Update - this drive is on death row.)

Update 12/21/2011: Various notes and reflections the day after ...
  • About 24 months is a short lifespan for a hard drive. I bought this machine early in its lifecycle; I wonder if there will be more failures in this product line.
  • Modern drives don't write to bad blocks. Based on the dates of the affected files, the blocks went bad in the past month. That fits with Carbon Copy Cloner not complaining until recently. (See my backup issue post for a twist to this story.)
  • I'm glad I bought TechTool Pro - I think I'll get good use of it. From what I know now though, I didn't really need it. In a modern drive a single bad block in a file, especially a relatively recently written file, means replacement. Carbon Copy Cloner told me 3 files were bad.
  • The TTP manual suggests reformatting. I suspect that might work if there was an initial formatting problem, but in this case I know existing blocks are going bad. This drive is on death row.
  • Carbon Copy Cloner complained about bad sectors in files during backup, but Time Machine didn't. That may be because Time Machine only reads files that have changed?
  • Most of the bad sectors are in unused parts of the drive. I suspect they were randomly distributed but were hidden by the drive OS as they were discovered in the parts of the drive that have been used (about 500GB of 1TB).
  • Disk Utility and Hardware Test didn't find any problems, but both Drive Genius 3 and TechTool Pro failed the drive and Carbon Copy Cloner complained too. The SMART diagnostics still pass the drive - even today! I'm a bit surprised; this isn't rocket science. Apple could do better. TechTool Pro gives more SMART diagnostics than Disk Utility -- my drive was complaining about heat (it's cold in this room!)
  • It's clearly worth running TechTool Pro or an equivalent drive scan on a new drive, then every few months. (11/23/11: TTP just crashed during a routine drive scan. I'm not impressed.)
  • I think Windows scandisk/chkdsk are superior diagnostic tools to Disk Utility.
  • The TechTool Pro DVD includes an image for burning a PPC DVD. Nice touch. I still have an old PPC.
  • It's good to know the drive is dying before it dies. I have time to do extra backups and to move selected files to other machines -- including my Aperture and iPhoto Libraries and perhaps iTunes.
  • When you consider that iMac 27" hard drives are NOT user serviceable, the iMac is more expensive than it seems. The iMac G5 was entirely user serviceable. Design has its price.
  • Since I know the drive is dying I've disconnected my clone backup. It's my known good repository. I'll take it to my office then create a new clone, then disconnect that clone. I won't be saving data to this machine, I'll be treating it as a "guest" machine until I get it serviced. I may turn off Time Machine too.
  • OWC (Other World Computing - great Mac shop) showed me how to find my Model Identifier (System Profiler), it's iMac11,1. I can only go to 2TB of storage. I'm not confident that their options are correct however.
See also:

Backups - why you need two methods and abundant paranoia

I can't say I feel good about my backups. I believe data wants to die; it wants to be free of the burden of order. Against the despair of data, even the best backup is barely adequate.

Consider tonight, when everything almost failed - Time Capsule and Carbon Copy Cloner alike.

The Time Capsule serves all the machines in our home over a wireless network. I was surprised at first that backup would work over wireless, but it does. Each machine has its own unencrypted disk image; one is on the TC's old internal 500 GB drive, and two others are on an external 2TB drive. The TC sits in a closet upstairs; it's unlikely to be stolen but fire would destroy it. I have done 1-2 file Time Machine restores from that image, so I know it can work. The only test of a backup, of course, is a restore.

I don't trust Time Machine as much as old-time Dantz Retrospect, but it seems Apple has gotten most of the bugs out.

I trust Carbon Copy Cloner [3] more. Each day it clones my server, on which all the important data lives. It's more than a cloner; CCC keeps copies of changed or deleted files in "_CCC Archives". I've configured CCC to use an encrypted image it automatically mounts every night. Since that backup is encrypted I can take it offsite, which I do every few weeks. Ok, every month or two. Offsite rotation relies on me, so it's prone to failure. Still, even if the house burns, I am unlikely to lose more than a month of images and videos. I can live with that.

So I have two backup methods, both fully automated, both relatively independent [2]. If each is 95% reliable each day, then the chance both fail on a given day is 1/400. If the daily chance of a server drive failure is 1/1000, the odds of all three failing on the same day are about 1/400,000 [2], [4].
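The arithmetic can be checked with a short Python sketch. It assumes the three failures are fully independent, which is the simplification the figures above rely on; the function and variable names are mine.

```python
from fractions import Fraction


def joint_failure(*daily_failure_probs):
    """Probability that independent failures all occur on the same day."""
    result = Fraction(1)
    for p in daily_failure_probs:
        result *= p
    return result


backup_fail = Fraction(5, 100)   # each backup assumed 95% reliable per day
drive_fail = Fraction(1, 1000)   # assumed daily chance of a server drive failure

both_backups = joint_failure(backup_fail, backup_fail)
all_three = joint_failure(backup_fail, backup_fail, drive_fail)

print(both_backups)  # 1/400
print(all_three)     # 1/400000
```

Exact fractions are used rather than floats so the results match the 1/400 and 1/400,000 figures without rounding. Any shared failure point breaks the independence assumption and makes the real odds worse.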

Tonight though, my data got within a few miles of the cliff it wants to meet.

My server has been having worrisome memory exception (EXC_BAD_ACCESS) crashes, and a TV show I recently downloaded had a file error [1]. There's something wrong on my 2yo i5 iMac; I need to run Apple Hardware Test (again). So I know my server data is at risk.

Time Capsule has had problems too -- it's reporting a "communications error" periodically. I think that error message is a scarlet herring related to the iMac issues, but clearly I can't trust that backup.

Happily there's good old CCC -- but when I restarted my server for the first time in weeks it reported a problem. The backup drive didn't mount. That was easy to diagnose -- I'd unplugged it. Probably when I was debugging my Aperture crash 3 weeks ago. Why didn't CCC report the error? Maybe it had crashed.

I wasn't that close to data loss -- but I was in a bad neighborhood. As paranoid as I am, I'm almost not paranoid enough.

It's good to have two fully automatic and completely independent backup methods. Data wants to die, and backup is still an unsolved problem.

-fn-

[1] Incidentally, you can't easily report a purchase problem to Apple until they process a charge, and to reduce transaction costs they wait a few days before they process. This is very annoying! Also, the UI for reporting a purchase problem is suspiciously clumsy. More on that experience when I see what they do.
[2] In reality they share common failure points of course - me, computer memory, etc. There is the older offsite backup though, so the chance of complete and total data loss is probably less than 1/1,000,000.
[3] Donationware. I donated. I wish donationware apps would let us set a 'reminder' so I could donate yearly. I suppose I should just make donationware donations every year on my birthday against the apps I use.
[4] I'd love to have automated offsite backup too, but I've never found an offsite vendor I trusted and I expect ISPs to eventually charge for bandwidth use.

See also:

Update 12/21/2011: I was closer to the cliff than I realized.