Thursday, May 31, 2012

My Google Custom Search just died. Did I offend the GoogleNet? (fixed)

Two days ago my much loved Google Custom Search was working beautifully (emphasis added) ...
Why coupons? Price concealment information and memetic archeology in the pre-web world 
... I found that reference through my pinboard/wordpress microblog/memory management infrastructure now integrated into my personal google custom search...
My latest enhancement was paying off; my ("free" = ad supported) personal custom search engine was now successfully indexing a blog that archived my shares and annotations [1] as well as my ancient web pages (archived) and my blogs.

My extended memory was better than ever!

Until it died. [2] As of yesterday my custom search engine is returning very few results.

My first thought was that I'd unwittingly committed a Class One transgression against the GoogleNet. Perhaps Google considers my link/annotation blog to be a link-farm-equivalent -- and had blacklisted my entire domain. Perhaps I had broken an unwritten rule of the GoogleNet (formerly known as the Internet, home of Archie and Veronica [3]).

However, I'm still able to find my notes.kateva.or and even posts in Google's standard search (if I restrict by domain). So I'm not certain I've transgressed. If the search doesn't work soon I'll try recreating that engine. If that doesn't work, or if I detect more signs of transgression, I'll have to remove my pinboard archive and beg mercy of The Google.

I've a broken iPhone I could burn. Perhaps that will appease.
[1] It's my tawdry substitute for my long lost and much mourned Google Reader Share page. I hate the way it looks, but its primary use is RSS consumption and index fodder. I am looking for a better template but WordPress themes/templates are a rat's nest of complexity.
[2] Echoes of losing Google Reader Share!
[3] If you know what that means you either used Google or you are a very old geek.

Update 5/31/2012: It's back.

I followed some of the advice that "omr" (not a Google employee) generously gave on the Google Search product forum. Instead of creating a new CSE, however, I replaced many of the entries of the old CSE with the patterns he suggested. Perhaps most importantly, I changed the setting for indexing.

I'd previously opted to index all entries and all linked pages. Considering I add about 20-60 links a day, I think that was a tad ambitious. I now index only the text of this shared items/pinboard (micro) blog.

For reference, here's an edited version of omr's recommendations:
In the "Sites to search" box, enter this URL Pattern:
If you wish to include some of your other sites, enter additional URL Patterns to match them.  (Enter one URL Pattern per line.)  For example, if you want to include the msptrails site, add:
For more information about URL Patterns, see
Please include only a limited number of sites.  Start with the minimum number of sites that you consider necessary to include; or, if you wish to include several, preferably no more than ten.  (If you own some older or less-active sites that you don't need to search anymore, don't include them.)
Click the new CSE's "control panel" link (which takes you to the "Basics" page of the control panel).
Leave the "Search engine keywords" box empty.
Near the bottom of the page, note the "Show automatic thumbnail" option.  The automated thumbnail-image selection is not always ideal, so perhaps you may want to turn off that option.  (Click to remove the check-mark, then click the "Save Changes" button BELOW the option.)
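
A rough way to picture what those URL patterns do is the Python sketch below. It assumes the patterns behave like simple wildcard globs, which is only an assumption; Google's actual matching rules may differ. The example.com and example.org domains and the matches_any helper are hypothetical, made up for illustration.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical patterns, one per line, as they might be typed into "Sites to search".
PATTERNS = """
blog.example.com/*
www.example.org/trails/*
""".split()

def normalize(url):
    """Drop the scheme so 'http://host/path' compares as 'host/path'."""
    parts = urlsplit(url if "://" in url else "//" + url)
    return parts.netloc.lower() + (parts.path or "/")

def matches_any(url, patterns=PATTERNS):
    """Return True if the URL matches at least one glob-style pattern."""
    target = normalize(url)
    return any(fnmatchcase(target, p) for p in patterns)

if __name__ == "__main__":
    for candidate in ("http://blog.example.com/2012/05/post.html",
                      "http://www.example.org/about.html"):
        print(candidate, matches_any(candidate))
```

Under that glob assumption, a short pattern list means a page is either clearly inside the engine or clearly outside it, which fits the advice to start with the minimum number of sites.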


Mary said...

I must be a medium-aged geek: I knew about Archie, but not Veronica (had to Google her) :-)

JGF said...

Veronica was, as you now know, the search engine for the Gopher protocol. Archie was the search engine for FTP. Wikipedia tells us that Veronica was rewritten in 2010! (There are still uber-geeks running Gopher servers.)

I'm sure there are prior art patents in those pre-web search engines :-).

JGF said...

Alas, I just checked, and Chrome doesn't even give a nice error message if you try a gopher protocol url: