Tuesday, January 04, 2005

Copernic/AOL: current leader in the sponsored (freebie) Windows desktop search race

Copernic Desktop Search - The Search Engine for Your PC

Updated 1/5/05 with, unsurprisingly, some more of the negatives.
Update 5/10/05: I think Yahoo Desktop Search (X1 freebie) is the best PC desktop search tool -- except it doesn't index Eudora. X1 commercial does -- for $80. There was a rumor of a Google Desktop Search plugin, but it's not there yet. So, for home, where I still use Eudora on my PC, I'm going to try Copernic 1.5 again.

Introduction

I really like Lookout for searching and managing Outlook. I'm sticking with that one for now. I tried Google desktop search and a few others, including Lookout's desktop search -- didn't like 'em.

Since MSN search uses Microsoft's built-in indexing, which I don't like, I haven't tried it.

I think Yahoo/X1 will be interesting, but it's not a freebie yet.

Copernic/AOL is my latest. It looks good at first glance, but it has some fundamental deficits.

Some key features:
  1. You can configure the location of the indices. I store them in a folder that I exclude from backup, including backup via ConnectedTLM. You really don't want to back up search indices. All my various search tools store their index files in this folder.
  2. You can readily control which folders are indexed. I turned off Outlook search since I use Lookout.
  3. It indexes PDF.
  4. You can tell it not to index very large files.
  5. You can control when it builds the indices, including time-of-day scheduling. It will do low-level background indexing (not a default). Index builds seem very fast and "smart".
  6. If you map a network share to a drive letter, it appears that it can index that drive. (Performance may be poor; I haven't pursued this capability as I don't need it at work.)
  7. It indexes folder names. (A major flaw in Lookout 1.2's file system search.)

Some fundamental defects:
  1. It's "stupid" in how it does search rankings. In particular, it doesn't use NTFS file metadata; it doesn't weight metadata >> folder name >> file name >> document content; and it doesn't differentially weight aspects of documents (titles, early text, etc.).
    (One of the reasons Lookout works so well for email search is that there's so much rich metadata available in Outlook. The typical PC document filestore has very little metadata. I'm very interested in the 2005 OS X Tiger search because of the way it uses metadata.)
  2. It doesn't return a folder or directory as a search result. The only search results are files. This throws away a lot of the intelligence and metadata that may exist in a file store.
  3. You can't search within a result set easily.
  4. You can't sort result sets by metadata (file size, type, date created, date modified, etc).
  5. The Help and Submit Bug buttons are broken. There's no documentation.
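
To make the ranking complaint in defect 1 concrete, here is a minimal sketch of the kind of field weighting I have in mind. It is not how Copernic or any other tool works; the numeric weights, the `Doc` class and the function names are all made up for illustration, and only the ordering (metadata >> folder name >> file name >> document content) comes from the list above.

```python
from dataclasses import dataclass

# Hypothetical weights: only the ordering is taken from the post,
# the actual numbers are an assumption.
FIELD_WEIGHTS = {"metadata": 8.0, "folder": 4.0, "filename": 2.0, "content": 1.0}


@dataclass
class Doc:
    path: str            # e.g. "projects/budget/notes.txt"
    content: str = ""    # extracted document text
    metadata: str = ""   # title, author, keywords, if the file has any


def fields(doc):
    """Split a document into the four fields that get different weights."""
    folder, _, filename = doc.path.replace("\\", "/").rpartition("/")
    return {
        "metadata": doc.metadata,
        "folder": folder,
        "filename": filename,
        "content": doc.content,
    }


def score(doc, query):
    """Sum the field weight once for every query term found in that field."""
    terms = query.lower().split()
    total = 0.0
    for name, text in fields(doc).items():
        text = text.lower()
        hits = sum(1 for term in terms if term in text)
        total += FIELD_WEIGHTS[name] * hits
    return total


def rank(docs, query):
    """Return matching documents, best score first; non-matches are dropped."""
    scored = [(score(d, query), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for s, d in scored if s > 0]
```

With weights like these, a file whose metadata mentions "budget" outranks one that merely sits in a "budget" folder, which in turn outranks a file that only mentions the word somewhere in its text.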

My configuration
  1. Limited search to the My Documents folder only. (Removed all other folders; those that can't be removed I set to ignore via the "modify" button.)
  2. Moved indices to my "Cache" folder (no backup).
  3. Update index daily at 1am.
  4. Max file size to index: 10MB. (I may shrink this further.)
  5. Various other small tweaks to enhance performance.
Conclusion

It's the best of a bad bunch, but I'm not impressed. I'll compare it to X1 when Yahoo launches it. For now I'm staying with Copernic (it beats Google Desktop!), but I still get more value from a directory string search I hacked together. It emulates the immensely underappreciated Norton Change Directory feature of Norton Commander and Norton Utilities, circa 1989.
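
For the curious, the idea behind a directory string search is simple enough to sketch. What follows is not my actual hack, just a rough illustration of the approach: walk a folder tree and list every directory whose name contains the typed fragment. The function name `find_dirs` and the "shallowest match first" ordering are assumptions made for this example.

```python
import os


def find_dirs(root, fragment):
    """Return directories under root whose name contains fragment (case-insensitive)."""
    needle = fragment.lower()
    matches = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            if needle in name.lower():
                matches.append(os.path.join(dirpath, name))
    # Assumption: shallower directories are more likely what you want,
    # so sort by depth first and then alphabetically.
    return sorted(matches, key=lambda p: (p.count(os.sep), p.lower()))


if __name__ == "__main__":
    # Example: list every folder under the home directory with "budget" in its name.
    for path in find_dirs(os.path.expanduser("~"), "budget"):
        print(path)
```

Note that this searches folder names only, which is exactly the thing the desktop search tools above refuse to return as a result.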

1 comment:

JGF said...

Blogger is supposed to send me email when someone comments, but I don't think I received a notice of David Dawson's comment. I'll review my Blog settings.

I'm one of the few people who'd actually tried Microsoft's native full text indexing. I built a web app to create a UI. (The extension to search didn't work for me, I forget why -- something about my environment didn't fit Microsoft's assumptions.) I found it pretty dumb in terms of how it ranked search results, so I gave up on it.

The biggest problem, though, is that MSN search doesn't index PDFs. Until it provides PDF support I'm not going to bother with it.

And, of course, I use Firefox, not IE. So if MSN search forces me to use IE, it's not an option. (I don't know how tight the integration is.)

If it does provide integrated PDF support I'll give it a try.