Tuesday, May 20, 2008

Microsoft Office Document Imaging - a little known gem

If you have any Windows version of Microsoft Office, you have some version of a little gem of an app almost nobody knows about.

It's called "Microsoft Office Document Imaging" (MODI). There's an icon for it in your Office Tools folder, along with truly exotic beasts such as "Picture Manager". You'll also probably see an icon for "Microsoft Office Document Scanning", which is basically a shortcut to the scanning dialog in MODI.

The Wikipedia article on MODI claims MODI was introduced in Office XP:

Microsoft Office Document Imaging - Wikipedia, the free encyclopedia

Microsoft Office Document Imaging (MODI) is a Microsoft Office application that supports editing documents scanned by Microsoft Office Document Scanning. It was first introduced in Microsoft Office XP and is included in later Office versions including Office 2003 and Office 2007.

Maybe, but I used to be very familiar with the Xerox-authored document imaging app that was once bundled with Windows 98 and 2000. MODI looks awfully similar to that twenty-year-old app.

MODI feels like ancient code, crafted by lost Giants who dreamt in assembler. It's blazingly fast on today's hardware.

It's also the simplest scanning application I've ever seen. Remember to click the "prompt for additional pages" checkbox, and even on a single-page scanner it will assemble the images into a single TIFF. It even does quite decent, and very fast, OCR. If you have Adobe Acrobat you can readily convert the output to PDF, and if you don't you can probably "print" to PDF using freeware products.

Or you could just leave it as TIFF.

I use the B&W settings when scanning expense receipts on an old personal scanner I brought to work (nobody wanted the scanner after I bought a higher-end model). Very nice results.


UHsoccer said...

Interesting background information.

I was told by someone from SnagIt (a cool screen-grab application from TechSmith, a company in MI) that I might look at MODI as an OCR system.

Here is the environment:
1) Faxes are received from clients and converted to PDF files
2) The content is typically images, but they contain the sender's fax number
3) I need to batch-process the PDF files and apply character recognition to retrieve that fax number

The number of scans to be processed on any given day might be hundreds.

Any advice, or a pointer to some other method, is appreciated.
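
Whatever tool ends up doing the OCR, the last step of the workflow described above (pulling the fax number out of the recognized text) is mostly pattern matching. Here is a minimal Python sketch of that step only. It assumes the OCR text for each page is already available as a plain string and that senders use ten-digit North American numbers; the pattern and the function names are hypothetical and are not part of MODI or SnagIt.

```python
import re

# Assumed format: ten-digit North American numbers such as
# "(555) 123-4567", "555-123-4567" or "555.123.4567".
FAX_PATTERN = re.compile(r"\(?\b(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})\b")


def extract_fax_numbers(ocr_text):
    """Return every phone-like number in the OCR text as a digits-only string."""
    return ["".join(match.groups()) for match in FAX_PATTERN.finditer(ocr_text)]


def first_fax_number(pages):
    """Scan OCR text page by page and return the first number found, or None."""
    for text in pages:
        numbers = extract_fax_numbers(text)
        if numbers:
            return numbers[0]
    return None


if __name__ == "__main__":
    # Hypothetical OCR output for a two-page fax.
    pages = ["COVER SHEET", "FROM: ACME  FAX: (555) 123-4567  PAGE 1"]
    print(first_fax_number(pages))
```

Real OCR output is noisier than this (a letter "O" read in place of a zero, for example), so a production version would probably need a cleanup pass before matching, and a person should spot-check the results on a sample of faxes.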

Nick Curtis said...

Great post. I've been using Microsoft Office Document Scanning for the past several years, after discovering it completely by accident one day. I like how the files it creates are extremely legible, yet also smaller in file size than most PDFs.

As a real estate agent, I'm often using my scanner to send contracts and other paperwork to clients. To date, I've never once had a client tell me that they've been unable to open or view a file I've created in Microsoft Office Document Scanning!

Designworks San Jose said...

If you're looking for a more robust scanning and imaging solution, you might try Content Cabinet, a document imaging management solution from AboutScan.