Where we ponder how filesystem-based document management is being replaced by web servers, wikis, and search engines.
When we started out organizing our files, back in the day when PCs were young, we naturally used the filesystem to do it. We created directories that made sense and helped us find the information we needed. As we got better and better at producing more and more files and documents, and vendors gave us improved directory naming capabilities, we created even more extensive and nested directory structures.
Then PCs joined with other PCs on a network and it wasn't just our own directory structures anymore; we created group directory structures to store all of our files together. All these documents became organized into a nice directory structure for easy access.
This was great! Except it's not. Or at least it isn't any more. With the advent of 100 GB hard drives, even for laptops, the potential for creating a directory structure that is meaningful and helpful, even for personal use, is all but gone. We have too much information to bury it down inside a hierarchy, out of sight. Combine that with the growth of spam, both intentional spam in email and the sheer volume of documents that don't necessarily need our immediate attention, and we end up with a filesystem organization that is a mess and impossible to really use or maintain. It becomes a stale archive of hardly used documents.
Enter Google. Google has rapidly become the central organization tool for bloggers. Can't remember where you stored that bookmark, link, idea, whatever? Google it... Now Google is approaching email organization the same way. Don't bother with a huge set of nested folders to organize historical email, folders that need constant care and attention. Just store all of your email in a flat archive and let the search engine find what you need. Automatic tagging of email based on predefined searches lets you organize your inbox so it is easy to deal with each email once ... and then forget about it until you need to search for it. File it and forget it.
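To make the "file it and forget it" idea concrete, here is a minimal sketch of a flat archive with tagging driven by predefined searches. It is only an illustration, not how Google's mail service actually works: the `Message` class, the `SAVED_SEARCHES` table, and the function names are all hypothetical, and "search" here just means "every query word appears somewhere in the message".

```python
import re
from dataclasses import dataclass, field


@dataclass
class Message:
    subject: str
    body: str
    tags: set = field(default_factory=set)


# Hypothetical saved searches: tag name -> words that must all appear.
SAVED_SEARCHES = {
    "travel": {"flight", "itinerary"},
    "invoices": {"invoice"},
}


def words(text):
    """Lowercase the text and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def auto_tag(message, saved_searches=SAVED_SEARCHES):
    """Attach every tag whose saved search matches the message."""
    tokens = words(message.subject + " " + message.body)
    for tag, required in saved_searches.items():
        if required <= tokens:
            message.tags.add(tag)
    return message


def search_archive(archive, query):
    """Return messages containing every word of the query, in archive order."""
    wanted = words(query)
    return [m for m in archive if wanted <= words(m.subject + " " + m.body)]


# The archive is a flat list: no folders, only tags and search.
archive = [auto_tag(m) for m in [
    Message("Your flight itinerary", "Flight 12 departs Monday."),
    Message("Invoice 2041", "Please find the invoice attached."),
    Message("Lunch?", "Are you free on Monday?"),
]]
```

Nothing in this sketch ever decides where a message "lives". Tags are just the results of searches that were run ahead of time, and anything untagged is still reachable through `search_archive`.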
Or Furl. Furl is a specialized archive tool for web pages you are interested in keeping around. It basically stores copies of web pages and their metadata for you and then allows you to search your collection of copies. It even gives you recommendations for other documents you might be interested in based on the documents you've stored recently.
Even though Furl lets you categorize your saved pages, once your archive reaches a certain size, the categorization becomes more of a publishing tool and you find yourself relying on search tools to find what you are looking for. An effective search tool is again the key, and the recommendations are based on metadata created without any additional effort beyond storing the page in the archive.
Or Wikis. Wikis are built around a completely different organizing model: self-organization that emerges from the users' interaction with the system. Wikis are by nature a flat archive of topics with links between them. Full text search makes it all available.
This is where I think we are going for Personal Information Management. In the end, the sheer volume of data we need to comb through makes us entirely dependent on full text search of our data and non-hierarchical organization models like wikis.
Jon Udell has been thinking about a Google PC that ends up dropping traditional methods of organizing documents and information and instead relies on a search spider that indexes the whole PC. Here is where he is going:
On the Google PC, you wouldn’t need third-party add-ons to index and search your local files, e-mail, and instant messages. It would just happen. The voracious spider wouldn’t stop there, though. The next piece of low-hanging fruit would be the Web pages you visit. These too would be stored, indexed, and made searchable. More ambitiously, the spider would record all your screen activity along with the underlying event streams. Even more ambitiously, it would record phone conversations, convert speech to text, and index that text. Although speech-to-text is a notoriously imperfect art, even imperfect results can support useful search.
A side effect of this approach is that the metadata about your files (which files are important, which are related to each other, etc.) arises out of the use of the files themselves, which means that the system gathers the metadata with little effort on the user's part. This, I think, is a critical key to Personal Knowledge / Information Management.
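The simplest piece of such a spider, indexing local files so that their location stops mattering, can be sketched in a few lines. This is a toy under stated assumptions, not Udell's design or any real desktop search product: the function names `build_index` and `search_index` are made up for illustration, every file is treated as plain text, and the index is a simple in-memory map from words to file paths.

```python
import os
import re
from collections import defaultdict


def build_index(root):
    """Walk root and map each word to the set of file paths containing it."""
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    text = handle.read()
            except OSError:
                continue  # skip files that cannot be read
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add(path)
    return index


def search_index(index, query):
    """Return the paths containing every word of the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results
```

Calling `search_index(build_index(some_directory), "budget draft")` returns matching files no matter how deeply they are nested, which is the point: the directory hierarchy is still there, but nobody has to remember it.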
In the same vein, Bill de hÓra writes about how he is getting away from thinking about his content's organization, preferring a fire-and-forget approach:
Well, over the years I've moved away from a place where I would think hard about how to file everything away (where what I could do was predetermined by the file system at hand). I haven't be able or willing to do that for years - there's too much to classify and too many ways to classify it and I'm not paying myself to be a librarian. Then consider that folder based classification doesn't help with retrieval anyway unless you carry that classification scheme in your head all the time. Life's too short.
I prefer the fire and forget mode that is enabled by giving things URIs and putting them behind web servers. Everything else I've tried or seen was too complicated. I could imagine never classifying or sorting anything based on folders within a couple of years, preferring something like a Topic Map instead to tag the files with metadata - not that as I user I'd actually care how it's done.
Recent experience building a taxonomy into a wiki at work has me a little sour on structure. It appears to be helpful for people who are really new to the wiki, but after using it for a while, you gravitate towards either the What Changed RSS feed or the full text search tools to find what you are looking for, or at least an entry point into the topic you are interested in. Then you follow the links from there.
My own efforts to use a Wiki as a Personal Information Management tool have shown me that even for my own documents, having a flat structure that is logged (in a what changed sense) and fully searchable really pays off. If we could apply it to my whole laptop, that would be even better.
This would mean that you can get away from worrying about where your files are stored, which is Bill de hÓra's dream of never classifying or sorting anything based on folders. Where things are located just isn't relevant. The search tools just find them.
Just like Furl. Just like Google.