Thursday 14 October 2004

Google Desktop Search

Today, Google announced Google Desktop Search, which indexes the contents of your computer and provides full-text search over your email, files, chats, and the web pages you've viewed.

According to Rael Dornfest:

The Google Desktop is your own private little Google server. It sits in the background, slogging through your files and folders, indexing your incoming and outgoing email messages, listening in on your instant messenger chats, and browsing the Web right along with you. Just about anything you see and summarily forget, the Google Desktop sees and memorizes for you.

And it operates in real time.

It seems to be limited to Windows at the moment: specifically, Windows XP and Windows 2000 SP3 and above. It appears to work with Firefox, but it cannot index Firefox's cache, though it can index Internet Explorer's. It can also search email stored in Outlook or Outlook Express, but if you use a different email client such as Thunderbird or The Bat!, you are currently out of luck (Google Desktop is a beta, by the way).

Google Desktop Search seems to be what I have been looking for. I have over 1 gigabyte of Microsoft Knowledge Base articles on this computer, which I downloaded years ago from Microsoft's FTP site and still keep up to date on a weekly basis. I have been unable to get the most out of them: I haven't been disciplined enough to craft the necessary regexes in Perl or any other scripting language to pull data out of those articles, and I outgrew Windows Grep months ago.
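For what it's worth, the regex approach I keep putting off does not have to be elaborate. Here is a minimal sketch in Python rather than Perl, assuming the articles are plain .txt files somewhere under one folder. The function name search_articles, the .txt extension, and the latin-1 encoding are my own assumptions for illustration, not anything about how the Knowledge Base archive is actually laid out.

```python
import re
from pathlib import Path


def search_articles(root, pattern, encoding="latin-1"):
    """Yield (path, line_number, line) for each line under root that matches.

    Assumptions (hypothetical): articles are plain-text *.txt files and
    decode acceptably as latin-1. Adjust both for the real archive.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    # Walk the folder tree in a stable, sorted order.
    for path in sorted(Path(root).rglob("*.txt")):
        with open(path, encoding=encoding, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")
```

Calling something like search_articles(r"D:\kb", r"Q\d{6}") (both the folder and the pattern are made up) would list every line that mentions a six-digit Q number. It rereads every file on every search, which is exactly the cost an index is supposed to remove.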

Unfortunately, I will not be able to install Google Desktop Search because it requires 500 megabytes of hard disk space for its index (the installer is only 400 KB, by the way). I have about 1.3 gigabytes left on my system partition, and I suspect it will want to store the index there. I cannot allow that. A comment I read today (I have read so many that I cannot recall exactly where) suggests that it will only install to the system partition and nowhere else.

(Note to self, for when I get to play with it myself: Google Desktop runs a web server on 127.0.0.1, port 4664, so I need to reconfigure my firewall to allow traffic to 127.0.0.1 on that port. A user reports that Google Desktop also listens on TCP 0.0.0.0:3049; I must run netstat -ano and confirm this for myself. The same user reports that Google Desktop installs a Winsock LSP (Layered Service Provider) to intercept network data, probably for searching HTTPS pages and IM text; unfortunately, I have no idea how to confirm this one. It looks like Google Desktop Search will let you move the index off the system partition to another partition via the registry; I need to look into this.)
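As a complement to netstat -ano, a quick probe can at least tell me whether anything is accepting TCP connections on the two reported ports. This is only a rough sketch: the helper name is_port_open is mine, and a successful connection says nothing about which process owns the port, so netstat is still needed for that part.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat all of these as closed.
        return False


if __name__ == "__main__":
    # Ports taken from the reports above; a listener bound to 0.0.0.0
    # should also answer on the loopback address.
    for port in (4664, 3049):
        state = "open" if is_port_open("127.0.0.1", port) else "closed"
        print(f"127.0.0.1:{port} is {state}")
```

If 3049 shows as open here, the next step would be matching the PID from netstat -ano against the Google Desktop process to confirm it really is the one listening.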

There is a nice write-up by Rael Dornfest covering the installation and use of Google Desktop. Danny Sullivan takes a closer look at privacy and desktop search, John Battelle as usual drools over the Google Desktop, and Jason Kottke thinks it is a baby step towards GooOS. Danny Sullivan also offers a roundup of links on his blog, and Joe Wilcox offers a broad perspective on search and how it affects Microsoft. At the time of writing, there is no word from Microsoft's corporate blogger.

Update: Jon Udell discusses how to get Google Desktop Search to index Firefox's cache. PC World discovers that Google Desktop Search can bypass the user names and passwords that secure web-based email programs and view personal messages sent and received on public PCs. Meanwhile, manageability.org contends that GDS is only able to index Microsoft content because it makes use of the index.dat files found on Microsoft systems. That sounds plausible until you ask why Microsoft hasn't used those index.dat files to beef up its own ailing search utility.
