Saturday, September 11, 2010

My ScanSnap S1300 document scanner review

I bought this scanner because Joe Kissell loved it and Kissell is a good geek. By my standards it was a bit of a shot in the dark, but I'm happy. Even so, I only gave it 3 stars in my Amazon review -- the software is worse than Kissell described (he uses DevonThink Pro Office, $150, so he didn't get the full software experience).

Read the Kissell review, then my own Amazon review. I'll probably do some updating here later, but I wanted to get this out (emphases added) ...
Amazon.com: Customer Reviews: Fujitsu ScanSnap S1300 Instant PDF Sheet-Fed Mobile Scanner (PA03603-B005)
I've been looking for this scanner for 15 years. It's good enough. It could be better, but it's good enough. If it lasts for two years I'll happily buy it again at the same price.
The hardware is essentially perfect. It's a bit annoying that you need two USB cables if you want to avoid the generic (mediocre) power brick, but blame that on USB. We should all be using either old style firewire or never coming USB 3, but we're stuck with USB 2. It scans both sides of paper at once. Yes, DUPLEX.
Although it's primarily a document scanner, I've used it to scan color prints. The results were not professional quality, but they were darned good and fast.
The 300 page user guide documentation is excellent.
The software is mediocre. Some of the bundled OS X software is so old it's non-native on Intel machines; fortunately you can omit that install. Unlike the higher end machines you don't get Adobe's superb PDF/OCR combination (yes, once Adobe was competent), you get a much less efficient product called ABBYY FineReader. Even so, it does produce PDF images with searchable OCRd text indices.
Most importantly, OS X Spotlight WILL index the text associated with these PDF image files.
The mediocrity extends to the ScanSnap Manager UI and workflow. Clearly this was a low bid contract. Don't expect much in the way of upgrades or future products. The scans, however, can be sent to products like DevonThink Pro ($150) for processing.
The scanner uses proprietary drivers. This is the biggest concern. If they're not upgraded we can be sure that within 3 years they won't work on OS X. Fujitsu, notoriously, does not provide new versions of ScanSnap Manager without a hardware purchase.
There are other problems with the software, but so far it hasn't been unstable.
In summary, 2 star software, 5 star hardware, gives a 3 star review. Surprisingly, I still love the product. If Apple were ever to produce a scanner, it would be a lot like this, though with a better power adapter and infinitely better software.
If you prefer 200K OCRd B&W documents to 8MB grayscale/color you need to set and use Profiles. The software isn't smart enough to make that choice for you.

It occupies a corner of my desk where papers used to pile up. It uses less room than the papers, which now live in the recycling.

Update 9/14/10: various notes I really don't have time to assemble into a coherent whole, but will be of interest if you read this far....
  • There's a Carrying case offer
  • ScanSnap Manager includes an online update option from the Help menu
  • Uses a standard OS X Apple installer and documentation has clear uninstall directions
  • 1.2GB installation - watch carefully for the custom install option and disable Cardiris (143MB, needs Rosetta, not useful). ABBYY is 526MB, ScanSnap Manager is 2576MB
  • Fujitsu sells consumables - cleaning kit, pad assembly (10,000 sheet or 1 year), Pick Roller 100,000 sheets or 1 year. Fujitsu is used to selling to the high end!
  • 1 year warranty, no active exchange
  • Options like 1 page vs. multi-page are not obvious.
  • You can change options for a single scan without clicking Apply (which is actually save) but the progress UI shows the saved settings, not the current modified settings
  • Profile Management has glitches with OS X Spaces and multiple monitors
Update 9/28/10: OS X black screened on me. First time in a very long time. SSM has to be #1 suspect.

Update 7/8/11: I can't recall what that 9/28/10 crash was, but it wasn't due to ScanSnap. Posting now because a comment on my Amazon review tells us Fujitsu has updated the S1300 drivers for Lion. Indeed, they seem to have updated all their current drivers. This was one of my concerns with the ScanSnap and I'm very pleased to see them do this. I'm not on Lion yet, so I'll hold off updating. I will download a copy however.

Update 4/14/2013: Today, after one mangled page too many, I decided this scanner was a bad idea. The sheet feeder simply isn't good enough.

Friday, September 10, 2010

Twitter feed for my Google Shared Items

Five months ago I started using twitterfeed to route my Google Shares to my (real name) personal twitter account. I pretty much forgot about it, I don't have much use for Twitter.

So I was surprised one day to discover people follow that twitter stream -- including people I know. A stream that has my (c) real name on it! A discoverable stream!

That won't do. The reason I changed all the names on my blog to my first and middle names (John Gordon) is so that my ideas and shares wouldn't be trivially discoverable. Trivial discovery is a poor match to my boring work life.

So today I created a duplicate Twitter feed for my Google Shared items -- as  John Gordon (jgordonshare).

If it keeps working I'll gradually promote that stream, and remind people following my true name Twitter feed to switch over. Then I'll turn off my true name stream.

Identity management is a subspecialization of reputation management.

-- My Google Reader Shared items (feed) (twitter)

Wednesday, September 08, 2010

After the Gmail hack - passwords and security

My Google (gmail) account was hacked. Interestingly, I've yet to discover any consequences.

My 58,000 emails seem intact. There are no obvious changes to my documents. Passwords were not changed. Spam was not sent. Our financial accounts do not appear to have been hacked.

It's curious.

So what am I doing differently?

I've always followed Schneier-approved security practices. That is I've calibrated my security measures to the value of what I was protecting, and balanced the cost and benefit of security. Since the hack I've not made any radical changes, but I have adopted somewhat more restrictive practices. I fear the cloud more than ever.

I have no reason to expect that my password database, stored in 1Password on my iPhone and desktop, and in a FileMaker 7 database on an encrypted disk image at home, was exposed, but of course control of my email account would facilitate password resets. I'm gradually going through passwords and updating those I care about. That's probably less than 30 of the 1,500 or so entries in my password database. A gmail search of my email for the string "password" did not find much of interest.

Here's what I do now:
  1. I revised the passwords on my Gmail account (obviously) and all of our Google accounts. I used the free Password Assistant utility to invoke OS X password assistant to help choose good passwords. I use mostly "readable" passwords or, where needed, the number/letter options. I store these in two places - 1Password and FileMaker Pro [1].
  2. I'm incrementally working through the passwords on all of our financial accounts. That's worth doing anyway. Fidelity used to require weak passwords, now they allow reasonably strong passwords. In one case that will go unnamed, their security remains appallingly weak. In several cases the security arrangements remain, essentially, insane.
  3. We are storing less in Google documents. We didn't store much, but I'd considered putting some shared material in spreadsheets there.
  4. I'm deleting email more. No sense keeping what I don't need. I might send myself a password to enter into my password database, but why keep that around?
  5. I printed all passwords modified in the past two years for Emily and wrote directions on the printout for using the encrypted shares. That's non-electronic and stored in a secured place she controls. If I kick off, she has all she needs to get at the complete set - no passwords required.
  6. I don't enter my Gmail/Google credentials on machines I don't control.
The last is the biggest change. It's doable now that I carry an iPhone around.
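For anyone without Password Assistant handy, here is a minimal Python sketch of the two password styles mentioned in step 1: a "readable" word-based password and a plain letter/number one. This is not how Password Assistant works internally; the function names and the tiny `WORDS` list are hypothetical, and a real word list would need thousands of entries to be worth anything.

```python
import secrets
import string

# Hypothetical, far-too-small word list, for illustration only.
WORDS = ["maple", "river", "copper", "lantern", "orbit", "pepper", "violet", "harbor"]


def readable_password(words=4, sep="-"):
    """Join randomly chosen words; easier to read and type than random characters."""
    return sep.join(secrets.choice(WORDS) for _ in range(words))


def random_password(length=16):
    """Build a password from letters and digits using a cryptographic RNG."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(readable_password())
    print(random_password())
```

The `secrets` module is used rather than `random` because it draws from the operating system's cryptographic random source.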

These are the changes I'm considering and will probably implement:
  1. Move my email archives off Gmail. 58,000 emails is a rich attack surface. I may decide to keep only a few hundred emails there.
  2. Create Google Apps/Gmail accounts that have limited access to things like my contacts, calendar, blogs and so on. Use these primarily, and limit use of my core Google account. Think of these as perimeter defense that can fall to the enemy.
[1] I don't trust 1Password completely, but there's no easy way to put FileMaker data on an iPhone in a robust encrypted store. So I end up using FMP as my source of truth, and 1Password more or less updates itself and serves as a backup. Both are included in my routine backups, including the encrypted backup I take offsite. I've used both of these for some time.

Update 9/13/10: xkcd on why having a robust password is not enough - creating honey pot services to attract passwords (Ping.FM?). iPhone/Android apps can do the same thing. This could be considered a form of social engineering/phishing. In my case I didn't reuse the Google password.

Tuesday, September 07, 2010

Operators in Windows Search and Spotlight - Common and Similar

This is a narrowcast post. It's of interest to someone who ...
  1. Is a serious geek.
  2. Has to routinely find things in very large document and email collections.
  3. Uses both Windows Search (built into Vista/7, add-on for XP) and Spotlight for OS X.
If you're still reading we need to go out for a beer the next time you're in MSP. There are only 2-3 like us on earth.

In an earlier post I discussed operators in Spotlight. When I first posted I complained about the difficulty of reconciling Windows Search operators and Spotlight operators. It's tough enough to learn one set, but learning two is kinda painful.

My first impression was wrong though. It turns out that several operators work in both Spotlight and Windows Search. Below is a list of common operators, followed by a list of differing operators and conventions. I'll update both lists over time. I'm only including the ones I use; there are many more.

Common operators (work in both Windows Search and Spotlight)
  • author:
  • kind:folder
  • kind:contacts
  • kind:email
  • kind:music
  • date:>7/4/1776
  • Boolean rules with parens (AND, OR, NOT)
Differing operators (W|S)
  • Windows uses () to contain phrases, Spotlight uses quotes
  • kind:docs | kind:document
  • not available | kind:application
  • modified:3/7/08..3/10/08 | modified:3/7/08-3/10/08 (hyphen might work in Win)
I think Windows Search accepts a number of variations, so I'm going to try more OS X Spotlight operators and syntax with Windows Search and document what works. Even now, however, it's impressive how much commonality there is.
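To avoid keeping two syntaxes in my head, the differences listed above could be captured in a small helper. This is only a sketch in Python: `build_query` and `KIND_NAMES` are hypothetical names, and the mappings encode nothing beyond the lists in this post (phrase delimiters, kind:docs vs. kind:document, kind:application missing on Windows, and the date range separator).

```python
# Generic kind -> (Windows Search name, Spotlight name); None means not available.
KIND_NAMES = {
    "folder": ("folder", "folder"),
    "contacts": ("contacts", "contacts"),
    "email": ("email", "email"),
    "music": ("music", "music"),
    "document": ("docs", "document"),
    "application": (None, "application"),
}


def build_query(target, phrase=None, kind=None, modified=None):
    """Build a search string for target 'windows' or 'spotlight'."""
    if target not in ("windows", "spotlight"):
        raise ValueError("target must be 'windows' or 'spotlight'")
    is_windows = target == "windows"
    parts = []
    if phrase:
        # Windows uses parentheses to contain a phrase, Spotlight uses quotes.
        parts.append(f"({phrase})" if is_windows else f'"{phrase}"')
    if kind:
        name = KIND_NAMES[kind][0 if is_windows else 1]
        if name is None:
            raise ValueError(f"kind '{kind}' is not available in {target}")
        parts.append(f"kind:{name}")
    if modified:
        start, end = modified
        # Date ranges: '..' in Windows Search, '-' in Spotlight.
        separator = ".." if is_windows else "-"
        parts.append(f"modified:{start}{separator}{end}")
    return " ".join(parts)


if __name__ == "__main__":
    for target in ("windows", "spotlight"):
        print(build_query(target, phrase="tax return", kind="document",
                          modified=("3/7/08", "3/10/08")))
```

The same inputs give `(tax return) kind:docs modified:3/7/08..3/10/08` for Windows Search and `"tax return" kind:document modified:3/7/08-3/10/08` for Spotlight.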

Monday, September 06, 2010

Archiving email

In the 90s Slashdot was hot. There were no blogs, no feeds, just Slashdot and their commenting system.

Even at its peak, however, you could see the problems. There were hordes of comments on stories, but most were worthless. Good comments often arrived late, and were never ranked so never seen. Realtime before its time, and flawed in the same way realtime is now.

Slashdot is still around, but I rarely find anything novel there. Today was different. Someone asked a question I've wondered about for years ..
Ask Slashdot | Best Way To Archive Emails For Later Searching? (Anonymous)
... I have kept every email I have ever sent or received since 1990, with the exception of junk mail (though I kept a lot of that as well). I have migrated my emails faithfully from Unix mail, to Eudora, to Outlook, to Thunderbird and Entourage, though I have left much of the older stuff in Outlook PST files. To make my life easier I would now like to merge all the emails back into a single searchable archive — just because I can. 
But there are a few problems: a) Moving them between email systems is SLOW; while the data is only a few GB, it is hundreds of thousands of emails and all of the email systems I have tried take forever to process the data. b) Some email systems (i.e. Outlook) become very sluggish when their database goes over a certain size. c) I don't want to leave them in a proprietary database, as within a few years the format becomes unsupported by the current generation of the software. d) I would like to be able to search the full text, keep the attachments, view HTML emails correctly and follow email chains. e) Because I use multiple operating systems, I would prefer platform independence. f) Since I hope to maintain and add emails for the foreseeable future, I would like to use some form of open standard. So, what would you recommend? ...

I think I might still have my NCMail (Norton Commander Mail) archives, back before the public internet, when MCIMail was a great service. That was, by the way, one of the best email clients ever written.

Here are some of the suggestions, with my comments:
  • Run an IMAP server and host them there
  • Notmuch (Linux)
  • Gmail
  • MailSteward for OS X: Uses SQLite or MySQL and processes mbox files from Eudora and Entourage. Works with Mail.app. I'm going to see if this can process my PC Eudora files.
  • Maildir storage format uses system directories for mail folders and is indexable. It's used by the Dovecot IMAP server.
  • mairix - email index and search tool (unix)
Sadly, most of the comments are as worthless as I remember, except they degenerate to mod disputes faster than ever.
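The Maildir suggestion is easy to try, because Python's standard `mailbox` module reads mbox files and writes Maildir directories. The sketch below is my own illustration, not something from the Slashdot thread: `mbox_to_maildir` is a hypothetical helper, and it assumes the old mail is already in (or exported to) mbox format.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a Maildir; return the count."""
    # create=False: fail loudly if the source mbox does not exist.
    source = mailbox.mbox(str(mbox_path), create=False)
    # create=True: make the Maildir (cur/new/tmp subdirectories) if needed.
    dest = mailbox.Maildir(str(maildir_path), create=True)
    copied = 0
    try:
        for message in source:
            dest.add(message)  # each message becomes its own file
            copied += 1
        dest.flush()
    finally:
        source.close()
        dest.close()
    return copied


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        print(mbox_to_maildir(sys.argv[1], sys.argv[2]), "messages copied")
```

Since a Maildir holds one plain file per message, the operating system's own indexer has a chance of searching the archive directly, which is the "indexable" property mentioned in the list.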

Incidentally, Sarbanes-Oxley means CEOs can go to jail for corporate malfeasance. This is inspiring corporate rules around email retention and especially email deletion. So the email archive management industry is spinning up.

Update: MailSteward failed Gordon's Law of Software Acquisition #4:
Inspect the uninstaller. The best apps don't need one - just delete the app. After that look for something built into the app. Then look for something that downloads with the app. If there's no uninstaller stop immediately.
MailSteward has an Apple installer, but neither the FAQ nor the Manual seems to discuss uninstallation.

That ended my MailSteward evaluation.

Google Apps aliases can stop working

I don't think this is related to recent password changes, but I just learned today that the email address for this blog wasn't working (jgordon@kateva.org).

It was configured as an alias on a Google Apps account at kateva.org. I removed the alias then restored it, now it's working again. So Google Apps aliases can stop working.

Sunday, September 05, 2010

Better Spotlight in 10.6: search current Finder folder and more

This is new in 10.6. I just read of it. For me it's one of the very best things about Snowie ...
TidBITS Problem Solving: Find Files More Easily in Mac OS X
... you can restrict Spotlight to search the current Finder folder by default, instead of This Mac. To do this, choose Finder -> Preferences, click the Advanced button, and choose Search the Current Folder from the pop-up menu. From then on, when you invoke the Finder's Find command by choosing File > Find (Command-F), searches will be limited to the current folder showing in the frontmost Finder window....
The search window in the Finder menu will also default to searching the currently selected folder.

Drives me crazy that the best features of 10.6 are bloody secrets.

Now if I could only search by file name instead of all contents...
... you can make sure the Search bar at the top of the Finder window is set to File Name without requiring an additional click. Hold down the Shift key, and choose File > Find by Name (Command-Shift-F). This command is available in both Mac OS X 10.5 Leopard and 10.6 Snow Leopard...
Auggghhhahaha! Now they tell me. I just did it. It works.

Wow. Search scoped by folder context, default to file name.... It doesn't get better than this. If only this were the default behavior ...

Yeah, search by file name can be mapped to Cmd-F -- but it requires a logout and login to work. You can also tweak the search results window layout and add a "Last Modified" column to the list view. Please read the original article and send TidBITS some love.

Oh, one last thing. Suppose you've done all of the above but you want to restrict your search to only folder names and modified after 1/1/2010. That looks like this:
kind:folder date:>1/1/2010
Yes, you can use the same sort of operators with Spotlight that you can use with Windows Search. Alas, they aren't identical, so if you do both you are more or less doomed. (I was wrong, they do overlap.)
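The same folder-scoped search can be approximated from a script with the `mdfind` command line tool, which accepts an `-onlyin` directory argument. This is a hedged Python sketch: the helper names are hypothetical, and whether `mdfind` interprets keywords like `kind:` and `date:` exactly as the Finder search field does is an assumption I have not verified.

```python
import shutil
import subprocess


def spotlight_query(terms="", kind=None, after=None):
    """Assemble a Spotlight-style query string such as 'kind:folder date:>1/1/2010'."""
    parts = [terms] if terms else []
    if kind:
        parts.append(f"kind:{kind}")
    if after:
        parts.append(f"date:>{after}")
    return " ".join(parts)


def mdfind_command(folder, query):
    """Return the argument list for an mdfind search limited to one folder."""
    return ["mdfind", "-onlyin", folder, query]


if __name__ == "__main__":
    # "/Users/Shared" is only an example folder.
    command = mdfind_command("/Users/Shared",
                             spotlight_query(kind="folder", after="1/1/2010"))
    print(command)
    # Only attempt the search where mdfind is actually installed (OS X).
    if shutil.which("mdfind"):
        subprocess.run(command, check=False)
```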

These operators are usually described as "undocumented", including in this excellent 7/10 CNET article. That article gives us examples like:
"Apple Computer" kind:pdf OR "Apple Computer" kind:text NOT (Google OR Yahoo OR "Microsoft Corporation")
In fact in 10.6 these features are documented in the little known OS X feature known only as "Help". (It's still not as good as Windows Help, but it no longer sucks.) Search Help on "Spotlight" and look for these Help articles:
  • Performing a Boolean or metadata search
  • Searching for specific types of items