Sunday, January 06, 2008

Interoperability and my Contact information: Microsoft Outlook and Access, FileMaker Pro and Palm Contacts

[This is written for the very few people who will ever try to do something like this, and will Google for an explanation.]

I'd say this was harder than I imagined, but really I knew it would be bad. The reasons it's bad are the same reasons that medical software seems to be stuck in the 1980s.

The problem of reconciling data models and data capabilities is a much harder problem than relatively trivial tasks like natural language processing, speech recognition, syntax specifications, quantum computing, and developing multiprocessor compilers. The more knowledge a system contains, the more difficult it becomes to reconcile different semantics.

That's why loss-free interoperability of complex healthcare software is always ten years away.

It doesn't help that the problem is either unrecognized or underestimated.

Ok, I digress. The problem I had to solve was far simpler, though it was a large part of why Palm went from a growing billion dollar company to near bankruptcy.

I had to reconcile several address book data models.

Over several years, platform migrations and a 2003 synchronization screwup had scattered my personal contact information between the PalmOS, FileMaker Pro 8 (Windows/OS X) and Microsoft Outlook. (My corporate contacts are in a different Outlook 2003/Exchange environment [1].)

I needed to merge the information into a single data management environment, identify duplicates and conflicts, and create a reconciled view.
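Here is a minimal sketch of that merge-and-compare idea in Python rather than Access, just to make the logic concrete. The field names (`first`, `last`, `phone`, `email`), the sample records, and the name-based matching key are all assumptions for illustration; real matching needs fuzzier rules.

```python
from collections import defaultdict

def match_key(contact):
    """Build a crude matching key: lower-cased, whitespace-trimmed name parts."""
    first = (contact.get("first") or "").strip().lower()
    last = (contact.get("last") or "").strip().lower()
    return (first, last)

def reconcile(sources):
    """Group records from several sources by match key.

    Returns (merged, conflicts): merged maps key -> combined record,
    conflicts lists (key, field, set_of_values) where sources disagree.
    """
    groups = defaultdict(list)
    for source_name, records in sources.items():
        for record in records:
            groups[match_key(record)].append((source_name, record))

    merged, conflicts = {}, []
    for key, members in groups.items():
        combined = {}
        seen = defaultdict(set)
        for _source, record in members:
            for field, value in record.items():
                if value:  # skip empty strings and None
                    seen[field].add(value.strip())
        for field, values in seen.items():
            if len(values) == 1:
                combined[field] = next(iter(values))
            else:
                conflicts.append((key, field, values))
        merged[key] = combined
    return merged, conflicts

# Hypothetical sample data, one list of records per environment.
sources = {
    "palm": [{"first": "Ann", "last": "Lee", "phone": "555-0100"}],
    "outlook": [{"first": "Ann ", "last": "Lee", "phone": "555-0199",
                 "email": "ann@example.com"}],
    "filemaker": [{"first": "Bob", "last": "Ray", "phone": ""}],
}
merged, conflicts = reconcile(sources)
```

Fields that agree across sources land in the merged record; fields that disagree (the two phone numbers here) are held back in `conflicts` for a human to resolve.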

The first step was to figure out which environment to use. FileMaker Pro 8 has far better layout and user interface capabilities than Access 2003, but I'm regrettably very familiar with Microsoft Access 2003 queries and data transformation. More importantly, until I switch to an iPhone, the true home of this data is in my Palm Tungsten E/2 and Microsoft Outlook 2003 (synchronized).

Reconciliation had to occur in Access 2003.

The next step was to identify what fields to use (which data model, and, more abstractly, which semantic model), which data types, etc. I had to find something that could work across all these environments, and which would allow me to port data from the rich FileMaker environment.

It took me almost a half-day of work - my New Year's Day resolution project. I'll summarize just the key points here for anyone who wants to do something like this, followed by a review of the final table structure (single table). The various matching algorithms and data updates turned out to be simpler than I'd expected.
  1. Outlook 2003's internal data model is probably not a relational model. In theory one can create a data link from Access 2003 to Outlook 2003, but this link exposes only a small portion of Outlook's contact data. The export to file/Access works best, omitting only a few odd fields.
  2. I think (though a lot was happening at this point) that Outlook will export and import a Contacts field (column) called "Keywords". Oddly enough, it's not accessible in the Outlook GUI! Ignore it.
  3. Outlook import is more limited than export. In particular, Outlook uses different column labels for import and export. Mostly it matches them despite the name changes (hidden mapping), but it fails to match "email" to "E-mail".
  4. The best connection to FileMaker is ODBC. I experimented both directions, but ended up using FileMaker as an ODBC server and Access as a client. This requires setting up FileMaker 8's weird ODBC services -- I think this is easier in later versions of FM.
  5. Access maps FM fields via ODBC to Memo fields. I did a full import and changed all but Notes to Text (255 character UTF-8).
  6. Some of the data I imported from either Outlook or FileMaker had columns in Access that were empty but not NULL. I resolved this by finding all values that were not NULL but had a string length of 0, then setting them to NULL.
  7. Some FM and Outlook text data contained carriage returns. Outlook 2003 has a lot of trouble with these. I had to replace the CRs with spaces using an obscure technique.
  8. FileMaker hides its internal row identifiers, but I exposed them using Get(RecordID) and stored them in my new database. (Access doesn't even have these; Oracle does. Longstanding complaint about Access.)
  9. Palm allows four "User fields". Outlook has 8 "custom fields", but not all of them are easy to get at. I used three "User fields" mapped to 3 "custom fields". In Palm Contact Options I name them (see below).
  10. Outlook import will only manage text, memo and date types.
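Point 3 (the label mismatch) can be worked around by relabeling the header row before import. The sketch below assumes the data passes through a CSV file and that the working column is called "email" while the import wants "E-mail"; only that one pair comes from the notes above, and the direction of the rename is my guess. The function name and the other column names are hypothetical.

```python
import csv
import io

# Hypothetical mapping from working column names to the labels the import
# step expects. Only the "email" -> "E-mail" pair comes from the text.
IMPORT_LABELS = {"email": "E-mail"}

def relabel_for_import(csv_text, labels=IMPORT_LABELS):
    """Return CSV text with header names rewritten and data rows untouched."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    rows[0] = [labels.get(name, name) for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()

exported = "First Name,Last Name,email\nAnn,Lee,ann@example.com\n"
relabeled = relabel_for_import(exported)
```

Keeping the mapping in one dictionary makes it easy to add further pairs as other mismatches turn up.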
This screenshot shows the fields in the reconciliation Access 2003 database. I've never figured out how to get a useful report on these things -- most databases allow one to write queries against internal metadata/schemas but Access doesn't. This was the best I could easily do:


Keeping track of the identifiers is obviously important. RecordID is an Access 2003 Autonumber field. I store an Access Autonumber Synch identifier (a GUID) and a legacy FileMaker identifier in two fields accidentally omitted from the screenshot (sorry).

User1 - User3 contain keywords, date revised and a text data type copy of the Access record id.

--
[1] I am the only person on earth who wants to synchronize my work data to a unified device/platform but not synch my personal data to work. This is proof that I'm an alien.

Update 5/19/09: With my new hacked together setup, I can use Access to manipulate Outlook and have the changes reflected through MobileMe to OS X Address Book to my iPhone.

Permissions: the most messed up part of OS X

This is true for 10.4.11 on a multi-user machine:
  1. User Tim moves folder to User John's Drop Box.
  2. User John moves folder to John's desktop.
  3. John cannot edit folder contents.
The owner of the folder contents is still Tim. Other users have read-only access.
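To see how much of a folder is affected, a short script can list every item a given user does not own and cannot write as "other". This is a rough sketch that assumes a POSIX system and ignores group membership and ACLs; the function name `locked_out_entries` is made up for illustration.

```python
import os
import stat

def locked_out_entries(root, uid):
    """List paths under `root` that `uid` does not own and that grant no
    write bit to 'other' users (group membership and ACLs are ignored)."""
    locked = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)
            if info.st_uid != uid and not (info.st_mode & stat.S_IWOTH):
                # Record the path, the owning uid, and a readable mode string.
                locked.append((path, info.st_uid, stat.filemode(info.st_mode)))
    return sorted(locked)
```

Calling `locked_out_entries(folder, os.getuid())` on the moved folder should list everything still owned by the original user.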

OS X doesn't change the ownership, even though the act of moving to John's Public Drop Box is an indisputable transition of ownership.

This behavior has been broken since 10.0.

I'd be pleasantly surprised if it were fixed in 10.5.

This ain't rocket science.

The old Apple would have nailed this one long ago.

[Yes, I know how to change permissions. That's not the point. This is 'not caring' style design.]

Saturday, January 05, 2008

Quick Look - more than I'd thought

I won't have 10.5 for a few months yet, but this is good to file away: 10 ways to get the most out of Quick Look.

I've always liked this kind of functionality. I started out first with Norton Utilities DOS NCView, then the DOS based Norton Commander with integrated NCView (F3 key I think). There was something similar for Windows 95 too, but I can't remember the name of it.

Funny how this sort of capability comes and goes. I hope it stays this time.

Windows Live Writer cursed by Google's bugs

You know you're in a new era when Microsoft is the humble good guy doing the noble thing, and Google is the arrogant foe of justice.

Google's hacked-together Blogger-Picasa pseudo-integration breaks when image-containing posts authored using Windows Live Writer are migrated to a personalized domain or to an ftp site.

The honorable WLW team has put together a partial solution, but really this is Google's bug.

I'm a longtime user of both Blogger and Picasa. Google is not wasting any of their billions on funding those products. I'd guess it's some manifestation of old-style revenue-funding business discipline. Personally, I'd prefer Google sell both properties to someone who's willing to fund them. I'm more than willing to pay for value delivered; Google's low-cost B-team funding approach is really annoying.

Thursday, January 03, 2008

manchester time: another nice Google feature

Cute.

Type "manchester time" into Google and the top of the page shows a nice clock icon with times in various Manchesters.

Omnidirectional antennae aren't

Impossible antenna, only $50!

Excellent. All new to me and fascinating.

Python Quick Reference

Bruce Eckel likes the Python Quick Reference (Ruby Needs One!). It is impressive!

If I end up doing more simple programming, my choices are likely to be AppleScript (ugh) and Python (yay).

Wednesday, January 02, 2008

AppleScript - summarize email is useful

I have a library of AppleScript books, but I've never done much with it -- in part because I always thought it was one step away from extinction. It's also a really lousy programming language (scoping anyone?).

Well, whatever its past status it's still with us, and Apple has even fixed up their once decrepit AppleScript: AppleScript Examples page. Automator and AppleScript have been revised in 10.5, the documentation finally left the 20th century, and Python hasn't taken over completely ... yet (alas).

Even in 10.4 I'm rediscovering useful things. Take, for example, the little known "Summarize Message" script buried away in the Mail Scripts folder. Here's what it does:
... This script demonstrates how to write a script that can be executed
directly from the Scripts menu in Mail. It acts on the selected messages,
which are passed in to the 'perform mail action with messages' handler.

This script will take the selected message, create a summary using the
summarize command built into the Standard Additions, then speak the
summary using the say command, also built into Standard Additions...
My mother's vision is failing. This is something she could use, though I've already programmed one key to activate the built-in generic reading engine. Too bad Mail.app doesn't let me attach a script to a nice fat icon, but I might create a rule that would routinely read each message she opens. (Rules are hidden away in mail preferences -- which is not a logical place for that function.)

By the way, my favorite 10.4 voice is "Vicki". I hear 10.5 has even better voices.

Update: After a bit of experimenting I created an Application from Summarize Email. I then gave it a nice icon from the Icon Factory and put it in the Dock. So it's easy to click on whenever my mother is reviewing her email.

Merging PDFs with OS X Automator

I'm an Adobe Acrobat guru -- too bad it's such a poor quality product. Bad as Acrobat is in XP, it's worse in OS X. Among other things, Adobe can't figure out the concept of a non-admin user.

So I stick with OS X Preview and built in OS X PDF tools. The main thing I miss is the ability to merge and split PDFs.

There are a few OS X utilities to do merges (and more), but it turns out Automator will do the trick (macosxhints.com).

See the macosxhints writeup for the full story. I saved my script as a Finder Plug-In (stored in /Library/Workflows/Applications/Finder), so now I can select any set of PDFs, choose Merge PDF, and they're assembled into a single (oddly named) file on my desktop. The script appends in alphabetic order, so I use a numeric prefix if I want a particular order.
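One caution on numeric prefixes: I don't know exactly how Automator's sort compares names, but if it is a plain character-by-character comparison (an assumption), "10_" sorts before "2_". Zero-padding the prefix avoids that. A small Python illustration with made-up file names:

```python
names = ["2_body.pdf", "10_appendix.pdf", "1_cover.pdf"]
plain = sorted(names)  # plain string sort puts "10_" before "2_"

def pad_prefix(name, width=2):
    """Zero-pad a leading numeric prefix: '2_body.pdf' -> '02_body.pdf'."""
    prefix, sep, rest = name.partition("_")
    if sep and prefix.isdigit():
        return prefix.zfill(width) + sep + rest
    return name  # no numeric prefix, leave the name alone

padded = sorted(pad_prefix(n) for n in names)
```

With padded prefixes the cover comes first and the appendix last, whichever comparison is in use.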

This is the first Automator script I've tried that's really useful!

This is what my script looks like:
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
AMApplicationBuild: 88.2
AMApplicationVersion: 1.0.5
AMDocumentVersion: 1
Owning Application: /System/Library/CoreServices/Finder.app
Ok, so that's not very useful. Here's the outline:
  1. Get Specified Finder Items (don't have any values there when you save the script)
  2. Sort Finder Items
  3. Combine PDF pages
  4. Move Finder Items
  5. Open Finder Items
Note that if you omit the Move step the files are saved in an occult (invisible) tmp folder.

Removing embedded carriage returns from Microsoft Access

Carriage return. Such a wonderfully archaic term for the hidden byte that ends a line of text. My children have no idea what kind of carriage can return.

Speaking of archaic, applications like Microsoft Access have trouble with carriage returns. They can't easily be inserted into a text field, but ODBC imports from more sophisticated applications, such as FileMaker Pro, can insert carriage returns into Access fields.

Problem is, there's no easy way to remove them. Fields with embedded CRs behave oddly when edited, and exports and queries break. Search and replace won't work.

I found a method that works.

Export the key column and the troublesome field as XML (no XSD). Then use a text editor to replace every carriage return with a space. The result is a single line, but this doesn't affect XML import. Reimport the XML and the carriage returns are gone.
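The text-editor step can also be scripted. The sketch below does the same blunt replacement on the exported XML text and then parses the result to confirm it is still well-formed. The element names and sample content are hypothetical, not taken from a real export.

```python
import xml.etree.ElementTree as ET

def flatten_line_breaks(xml_text):
    """Replace every CRLF, CR, or LF with one space, as a text editor
    search-and-replace would. The whole document becomes a single line."""
    return xml_text.replace("\r\n", " ").replace("\r", " ").replace("\n", " ")

# Hypothetical export: element names are made up for illustration.
exported = (
    "<dataroot>\r\n"
    "<Contacts><RecordID>1</RecordID>"
    "<Notes>line one\r\nline two</Notes></Contacts>\r\n"
    "</dataroot>\r\n"
)
cleaned = flatten_line_breaks(exported)
root = ET.fromstring(cleaned)  # parses, so the XML is still well-formed
notes = root.find("Contacts/Notes").text
```

Because the key column travels with the cleaned field, the reimported rows can be matched back to the originals.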

There are probably better methods.

Update 1/2/08: I ran into another issue where an Access field appeared empty, but it was not NULL. I used 'not null' and 'len=0' to identify these fields, then set them to NULL. Probably another character set problem. I have finally licked all the problems with creating a database that works with Outlook 2003, the PalmOS, sync to Palm, and FileMaker (via ODBC). More on that after I get some sleep.
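For the record, the 'not null' plus 'len=0' test looks roughly like this. The sketch uses SQLite through Python as a stand-in, since I can't show the Access query here; the table and column names are hypothetical, and I'd expect the Access version to use Len() where SQLite uses length().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, company TEXT)")
conn.executemany(
    "INSERT INTO contacts (id, company) VALUES (?, ?)",
    [(1, "Acme"), (2, ""), (3, None)],
)

# Rows that look empty but are not NULL: zero-length strings.
hidden = conn.execute(
    "SELECT id FROM contacts WHERE company IS NOT NULL AND length(company) = 0"
).fetchall()

# Convert those zero-length strings to real NULLs.
conn.execute(
    "UPDATE contacts SET company = NULL "
    "WHERE company IS NOT NULL AND length(company) = 0"
)
rows = conn.execute("SELECT id, company FROM contacts ORDER BY id").fetchall()
```

Running the SELECT first shows which rows the UPDATE will touch, which is a cheap sanity check before changing data.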

Monday, December 31, 2007

Yet another identity of mine: MyOpenID

I posted a month or so ago about the identity sources I've committed to. Now I can add MyOpenID: John Faughnan, per Jon Udell's implicit recommendation.

A few comments on MyOpenID (and OpenID services in general):

  • MyOpenID is a provider of OpenID services. If you have the time, geekiness and a domain/website you control you can roll your own. Blogger now supports OpenID authentication for comments, Microsoft is baking OpenID into .NET Windows CardSpace, and Firefox 3 is supposed to have some OpenID management functionality.
  • The "blurb" function of the primary page is broken today.
  • Each account can have multiple personae, any of which can be exposed publicly.
  • There's an easy-to-find function to delete an account.

I'd add my GrandCentral number as yet another identity, but I presume that will be bound to my "113" Google persona. I also have Yahoo and Microsoft Live identities, but I try not to think of them.

I particularly like the personae management capabilities. Anonymity needs to shrink considerably on the web, but in its place we need the ability to manage multiple personae (aka avatars), each with their own managed reputation -- and the ability to create and destroy personae as needed.

I'm thinking these identity management skills will be second nature to our children, but I'm not confident I know how this will all develop.

JanRain runs MyOpenID; I assume they're aiming to be acquired by Google.

Update 3/4/2008: ClaimID.com provides similar services, also recommended by Udell.

GrandCentral: child accounts, features explained and annoyances

I'm planning to give all the children a GrandCentral number. I'll control the number and email of course -- until each child snatches the pebble from my hand.

At first the number will route to our home, but in time it will route to their personal phones. When they're old enough they get the password and route calls as they will.

So as I set this up I'm exploring GrandCentral features. I liked the explanation one blogger provided for "CallSwitch" (click the link for a description of all GC features):
Web Application Developer: Grand Central - All The Phone Services You Wanted

... CallSwitch: That name is really deceiving. It really should be called 'phone switch', because it lets you re-ring your phones in the middle of a call. So - someone calls. It rings the house. The problem is that I'm getting ready to get out the door. I could call them back on the cell, or I could hit a button on my phone, and the call will ring my phones again so I can pick up the cell. Again on the cell-minute-saver, if I'm outside mowing the lawn, and get a call on the cell, I can take it, walk into the house, hit a button, switch the call to the house phone, and save the minutes...
I've noticed a few annoyances so far with GrandCentral:
  1. I'd like to be able to send all calls from my own cell to voice mail, so I can use my phone to capture thoughts and ideas as I drive. This works with MaxEmail, but GrandCentral handles all calls from my cell number as a request to check my voice mail.

  2. Given #1, I think it won't work to add my home number as a GC number; if I do that, nobody using my home phone will be able to reach me. I suspect the GC team were too young to have children.

  3. I wish they supported RSS for notification instead of email. I can work around this though because Bloglines supports creating an email address that generates a feed. So I use this as my GrandCentral notification email.

  4. They don't support fax-in; I'd be glad to pay for this service.

  5. There's no integration with the Gmail contact service and the Gmail import is very old fashioned (doesn't use the Gmail API at all).
GrandCentral is obviously ripe for a vast amount of improvement; we'll have to see how clever Google gets with the service.

GrandCentral introduces visual voice mail for any cell phone

I don't think Google's GrandCentral is open to new subscribers yet, but it's interesting that in advance of Google Android they now have visual voice mail.

Why hasn't anyone improved Blogger's BlogThis! tool?

Blogger is certainly a proletarian blogging tool, but even so it does have a vast number of users, some of whom must be qualified geeks. It also has a well documented API.

So I'm surprised that we're still using the same crummy bookmarklet that we used before Google owned Blogger:
What is BlogThis! ?

....BlogThis! is an easy way to make a blog post without visiting blogger.com. Once you add the BlogThis! link to your browser's toolbar, blogging will be a snap. Or rather, a click. Clicking BlogThis! creates a mini-interface to Blogger prepopulated with a link to the web page you are visiting, as well as any text you have highlighted on that page. Add additional text if you wish and then publish or post from within BlogThis!...
The Google Firefox toolbar includes a similar function (SendTo Blogger) that may actually be inferior to the original Blogger bookmarklet.

I've used these two solutions for years. They're crummy. Let me name a few of the problems:
  1. No access to tags (labels) from the SendTo Blogger UI or the bookmarklet.
  2. Variable bugs -- lately the SendTo Blogger window has acquired its own redundant scrollbars when used in the latest version of Firefox.
  3. Limited toolbar (no bullets, no image, video, upload)
  4. Using Blockquote tags in RTF when the start of string includes a link creates an empty href tag preceding quoted text.
  5. Many bugs with copying highlighted text into the post, lately truncates text.
  6. No 'edit this post' button on the post-submit dialog. Instead need to right click on edit posts, choose open in new window, then find the draft post in list then click on draft post.
The list goes on.

So why hasn't some Googler devoted a portion of their 20% time to fixing this functionality? Why hasn't any hacker created a Firefox extension to replace the bookmarklet/toolbar function?

I think if we knew the answers to these two questions we'd understand something about a lot of other modern software frustrations.

HD Photo (JPEG XR) file format: an update

Bill Crow's HD Photo Blog is an excellent information source on Microsoft's HD Photo file format. It's written by the responsible Microsoft Program Manager, and like most Microsoft blogs it's a vast improvement on the usual marketing junk.

Microsoft's stated goal is to make HD Photo into JPEG XR -- a standard they won't control. Microsoft promises a royalty free grant to patents they hold.

I'm not exactly a Microsoft fan, but I'm hoping this one works out. JPEG is really inadequate (though if you shoot raw, edit the raw, and save as JPEG you can get around some of JPEG's worst limitations), but JPEG 2000 seems to be stillborn.

I was really hoping JPEG 2000 would work, but I've read that it probably contains lethal patent bombs. (Patent holders will stay silent until JPEG 2000 is well used, then attack.)

Crow's posts also include a dense discussion of color spaces and gamma. I've read this stuff before (see also: one, two, three), and discussions come in two flavors: wrong and impenetrable. That is, most of the discussions are misleading, but the reliable ones are very dense. I'm convinced not a few famous manufacturers and programmers have messed up their color profile support because the topic was too complex for them to understand. (Trust me, very large companies can have a lot of trouble with complex topics.)

I'm disappointed though that Crow doesn't discuss metadata and HD Photo. Exif headers in JPEG have been extremely valuable -- even though there's no real standard. A Wikipedia article on JPEG XR has more information:
HD Photo metadata, optional XMP metadata stored as RDF/XML, and optional Exif metadata, in IFD tags
It would be amazing if Adobe's XMP metadata standards were to make it into JPEG XR. (See also: PNG, metadata and archival formats).

If Microsoft pulls this one off as an honest broker (the devil will be in the details of those patent grants) I'll have to say something nice about the Devil.

Update 1/9/10: Sadly, Microsoft waffled on its licensing. So they were true to their satanic heritage.