From El Reg


On Google’s PR blog, we learned that Google’s index had doubled overnight to 8 billion pages. (Where had they been keeping the new 4 billion pages all this time, you might well ask.)

“Together these pages represent a good chunk of the world’s information, but hardly all of it,” wrote Google’s VP of engineering Bill Coughran, in what might be the understatement of the century.

Precious little of the “world’s information” is even written down. Much of it is encoded in enduring transmission mechanisms such as music, the visual arts, religion and myths. And almost all of the stuff that is written down isn’t ever going to be accessible through the public internet, for very practical reasons. You can get some of this piped into your computer if you’re lucky enough to belong to a local library, but that’s because a consensual social mechanism has been invoked to bypass such restrictions. What the internet’s public search engines are left to work with is a toxic wasteland largely characterized by the generation of real-time noise - both private and commercial - and what the machines churn out in answer to our hopeful “queries” isn’t of much use to the rest of us.

To technologists, the solution is obvious: it’s going to require either a technical fix, or some huge change in social behavior: the creation of a world where we’re all moored to our computers twenty-four hours a day, so making society conform to the limitations of today’s machines. But we all know this isn’t going to happen. Fortunately, there are better ways out of this conundrum.

Just as governments have realized that using collective, centralized bargaining power against large pharmaceutical companies is a great way of reducing the cost of drugs used by the population, so one day governments will realize that collective social bargaining with copyright and database owners can help bring good quality information to the population at large. This is a win-win agreement that makes the copyright holders rich beyond their wildest dreams, and gives us high quality databases that we could never before have been able to access.

If the cult of “information” is as important as technophiliacs tell us it is, we need to develop social mechanisms, not fancier search engines, to get us to the Holy Land. Don’t look to the privatized information scavengers of the web for answers.

So all the while we were consumed with the “search engine wars”, what we were really looking at was the “library wars”. And whoever has the best library wins, in this case.

This is so spot on, I feel like crying.

On an average day, I have to log into Medline, CINAHL, PsycINFO, Embase, PsycLIT, ProQuest, The Cochrane Library, and umpteen other Athens password-protected databases. If I’ve managed to ‘obtain’ a password, I might add another half dozen to that list. In that process, I may be required to refine and rerun my search strategy several dozen times. Finally, I am left with an array of data files that I must reconstruct through clunky bibliographic software. As I peruse the hundreds of articles that result, I have to re-enter several other passwords to access electronic versions. If I am lucky, 10% might be available to me.

We need to take back the means of ‘scholarly communication’, and let everyone have access to everything. There are some great ideas here.