Google Desktop is Full
After a day away from my PC in Canberra yesterday, I came into the office this morning to find my PC unusually sluggish. A quick look at the Process Table in Windows XP showed that GoogleDesktopCrawl and GoogleDesktopIndex had obviously been hard at work during my absence. I’ve had Google Desktop Search installed for quite a few months now, so I was curious about this sudden activity. I jumped over to the GDS status page and was greeted by the following message:
Google Desktop Search has reached its maximum size. New items will no longer be indexed. You can still search for old items.
Argh!! A quick Google search (how ironic) reveals that I’m not alone.
Google’s GDS Help Centre pages acknowledge the problem, but the only solution they offer is that "you may want to try re-indexing. You can do this by uninstalling and reinstalling Google Desktop". Not exactly satisfactory (I don’t really want to lose my 40,000+ pages of web browsing history).
It’s also not clear to me what the problem is. The Google Help Page referenced above says: "As you may know, your personal Google Desktop index requires no more than four gigabytes of space on your hard drive. In most cases, you shouldn’t run out of space for your Google Desktop index."
"In most cases"? Google also admits that "we are aware of a bug that may cause you to see this error message, and our Google Desktop engineering team is working to find a solution." So I’m a bit puzzled as to whether I’m seeing a manifestation of this bug, or whether I am outside the bounds of "most cases".
I’m somewhat suspicious that this message has appeared so soon after my GDS index size has reached 500,000 items. For the benefit of other people who have hit this problem, my current index stats are as follows:
| Category | Number of items | Time of newest item |
| --- | --- | --- |
| Total searchable items | 500,566 | 9:31am |
| Emails | 142,378 | 9:01am |
| Chats | 0 | - |
| Web history | 40,630 | 9:31am |
| Files | 317,558 | 9:13am |
I’ve found speculation that filesystem file size limits may be the cause. The effect would be indirect in my case, since GDS is running on an NTFS filesystem, but the theory is that Google may be working within the limits of the FAT filesystem to ensure maximum compatibility. Has anyone out there actually found a solution for this problem?
Exponential RSS Growth
A few weeks back, I attended a talk by Rodney Brooks (from the CS and AI Lab at MIT). As I blogged at the time, his talk centred around the idea of exponentials as key to future research directions (of which Moore’s law is almost certainly the best known). Examples he showcased included the disk capacity of an iPod (roughly doubling every year for a given price point) and the number of transistors in the world.
I bring this up because I’ve just read a TechCrunch story on the recently released Feedburner statistics, which show the growth of Feedburner subscribers and managed feeds over the last 18 months. Both show clearly exponential trends – the number of feed subscribers appears to be doubling about every two months (or even faster in some cases)! The number of managed feeds has increased by more than an order of magnitude over the past year.
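To put those two growth rates on the same footing, here is a rough back-of-the-envelope sketch in Python. It is my own arithmetic rather than anything taken from the Feedburner figures, and it assumes idealised, perfectly steady exponential growth; the function names are made up for illustration.

```python
import math


def annual_growth_factor(doubling_months: float) -> float:
    """Growth multiple over 12 months for a given doubling time in months."""
    return 2 ** (12 / doubling_months)


def doubling_time_months(annual_factor: float) -> float:
    """Doubling time in months implied by a given 12-month growth multiple."""
    return 12 * math.log(2) / math.log(annual_factor)


if __name__ == "__main__":
    # Subscribers: doubling about every two months (figure quoted above).
    print(f"Doubling every 2 months is {annual_growth_factor(2):.0f}x per year")

    # Managed feeds: at least a tenfold (order of magnitude) increase in a year.
    months = doubling_time_months(10)
    print(f"10x per year is a doubling time of about {months:.1f} months")
```

Under those assumptions, a two-month doubling time is a much steeper curve than a tenfold increase per year, so the subscriber numbers would be growing considerably faster than the feed count.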
If we consider these Feedburner stats as a proxy for RSS feeds in general (which may or may not be valid – anyone have any thoughts on the matter?), it’s interesting to think about what the effects of massive RSS overload will be in the not too distant future. What kinds of new opportunities will all those feeds open up? What will be possible which isn’t possible now? I imagine tailored summarisation and synthesis of RSS feeds is likely to get a lot more attention in the future.
Discover the science behind everything!
If you remember back a couple of months, I was involved in filming for Scope, a new kids’ science show on Channel 10. Well, Scope premiered on Channel 10 at 4pm this afternoon, and apparently the finished product turned out quite well. I must confess that I haven’t even seen it yet, but Michelle managed to co-opt a television in some academic’s office at work so she and her colleagues could sit around and witness my 2 minutes or so of fame. Even more exciting was that Michelle herself also made a cameo appearance in one of the stories. So we’re a family of stars for the day 8-).
I think we’ll have to go out and celebrate when we find a spare moment! Hopefully sometime I’ll get hold of a VHS copy, digitise it, and host a copy here on SGI.nu, so you can see us in all our TV star glory.
The weird thing is that I actually met someone from CSIRO corporate for the first time in a meeting last week, and as soon as I introduced myself, he started telling me about how fantastic my segment on the Scope show was. It transpires that in his capacity as a member of the CSIRO Corporate Rewards committee, he had seen a copy of the show as part of a promotion case for the show’s host, Dr Rob, who is a CSIRO Education employee. Small world, I guess.
Update: It turns out there’s at least one of my segments available to view online on the Scope website. Find out how your mp3 player works in the eyes of a 12-year-old!
Natural Language Processing/Language Technology Jobs
I’ve just created a new RSS feed for language technology jobs in Australia and New Zealand on the Australasian Language Technology Association site (which I am currently looking after while our main webmaster is away on sabbatical).
If anyone out there comes across jobs in speech technology or natural language processing/computational linguistics, please let me know (andrew at bebox dot nu). If you’re studying or working in the area, I suggest you keep an eye on the RSS feed.