A quick bookmark post on Clay Shirky’s Web 2.0 speech from last week. It’s a great read in its entirety. Here are my favorite sections:

So I tell [a television producer] all this stuff, and I think, “Okay, we’re going to have a conversation about authority or social construction or whatever.” That wasn’t her question. She heard this story and she shook her head and said, “Where do people find the time?” That was her question. And I just kind of snapped. And I said, “No one who works in TV gets to ask that question. You know where the time comes from. It comes from the cognitive surplus you’ve been masking for 50 years.”

The “masking” in that quote refers to the place television holds in our post-industrial revolution lives.

So how big is that surplus? So if you take Wikipedia as a kind of unit, all of Wikipedia, the whole project–every page, every edit, every talk page, every line of code, in every language that Wikipedia exists in–that represents something like the cumulation of 100 million hours of human thought. I worked this out with Martin Wattenberg at IBM; it’s a back-of-the-envelope calculation, but it’s the right order of magnitude, about 100 million hours of thought.
And television watching? Two hundred billion hours, in the U.S. alone, every year.
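
To put the two figures side by side, here is the arithmetic as a quick sketch. Both inputs are Shirky's own back-of-the-envelope estimates from the quote above; the variable names are mine.

```python
# Shirky's back-of-the-envelope figures (estimates, not measurements).
WIKIPEDIA_HOURS = 100_000_000            # ~100 million hours of thought
US_TV_HOURS_PER_YEAR = 200_000_000_000   # ~200 billion hours per year

# How many Wikipedia-sized projects fit into one year of U.S. television?
wikipedias_per_year = US_TV_HOURS_PER_YEAR / WIKIPEDIA_HOURS

# What share of one year's viewing equals one Wikipedia?
share_of_viewing = WIKIPEDIA_HOURS / US_TV_HOURS_PER_YEAR

print(f"{wikipedias_per_year:,.0f} Wikipedias per year")
print(f"{share_of_viewing:.2%} of annual viewing")
```

In other words, one year of U.S. television viewing is roughly 2,000 Wikipedias' worth of time, and a single Wikipedia is about 0.05% of that year.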

But this takes the cake:

I was having dinner with a group of friends about a month ago, and one of them was talking about sitting with his four-year-old daughter watching a DVD. And in the middle of the movie, apropos nothing, she jumps up off the couch and runs around behind the screen. That seems like a cute moment. Maybe she’s going back there to see if Dora is really back there or whatever. But that wasn’t what she was doing. She started rooting around in the cables. And her dad said, “What you doing?” And she stuck her head out from behind the screen and said, “Looking for the mouse.”

Here’s something four-year-olds know: A screen that ships without a mouse ships broken. Here’s something four-year-olds know: Media that’s targeted at you but doesn’t include you may not be worth sitting still for.

Free-time web surfing as “cognitive surplus” is a great concept. Shirky also sketches a potential model for the social web:

But media is actually a triathlon, it’s three different events. People like to consume, but they also like to produce, and they like to share.

A site I stumbled upon tonight that shows this tendency perfectly is Young Me - Now Me. One look at the site and you understand it, you want to view all of the pairings, you want to create your own, and you want to share.

Update: Jay Rosen weighs in with a thoughtful piece.

Until Zotero 2.0 comes out, which will be server-based rather than client-based, keeping track of research across multiple computers is a challenge. The solution I’ve been using on my Windows boxes is a combination of Firefox Portable, Zotero, and Microsoft SyncToy. Firefox Portable lets you install an instance of the browser on a thumb drive, or any other drive, on your system. That way you can keep it with you and access your citations from any machine. Because running anything off of a thumb drive is excruciatingly slow, what I end up doing is using the thumb drive to transfer all my Zotero files between computers (letting SyncToy manage the updating) and running Firefox from the hard drive, like this:

MS Synctoy
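
For readers who want to see the idea in code, here is a minimal sketch of the "copy whatever is newer" step. It is not how SyncToy works internally, just a one-way copy that never deletes anything; the function name `sync_newer` and the drive paths in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def sync_newer(source: Path, target: Path) -> list[Path]:
    """Copy files from source to target when missing or older there.

    One-way only; nothing is ever deleted from target.
    Returns the relative paths that were copied.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        dst_file = target / rel
        if (not dst_file.exists()
                or src_file.stat().st_mtime > dst_file.stat().st_mtime):
            dst_file.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, dst_file)  # copy2 also keeps the modification time
            copied.append(rel)
    return copied
```

With a thumb drive mounted as `E:`, something like `sync_newer(Path("E:/FirefoxPortable"), Path("C:/FirefoxPortable"))` would pull the latest files onto the hard drive, and swapping the arguments would push changes back before you unplug.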

It works great, but what I’d really like to do is turn the whole synchronization over to the cloud. SugarSync is an interesting tool that will synchronize any folder on your computer (Macs included), doing much of what SyncToy does but without a thumb drive.

Sugarsync and Zotero

It works well, and may be a good solution. It’s fee-based and you get 10GB of space. The only drawback is that you can’t run Firefox Portable remotely (which makes sense). I’ll keep testing, and waiting for Zotero 2.0.

Very good persona analysis [pdf] coming out of Macquarie University in Sydney, Australia. The thing I like best, aside from the nice personas they’ve developed, is the way they’ve mapped them against experience/seriousness vs. frequency of use/need for the library:

Audience Segmentation

This fits in nicely with several assumptions I’ve had regarding the audience we serve at our academic library, and may be true for most academic libraries. It’s a nice way to contextualize the personas being delivered, and I bet it helps sell them as well.

Here’s a very cool image browsing service at Hard Rock Cafe powered by Microsoft’s Silverlight and Seadragon technologies. Requires a software download, but worth the effort.

Over the weekend, On The Media had an interesting interview with Clay Shirky about his new book, Here Comes Everybody, where he talked about the potential dark side of unmediated communication.

He’s saying that our conception of the “audience” is moving away from a simple media consumer and towards an actor with real power. He has a couple of good examples, especially a quote from the target of a flashmob’s anger, who said “You and what army?” As it turned out, the “army” ended up being a worldwide audience, and the police were forced to step in on a case they would normally ignore. (It involved a stolen camera.)

This past weekend another good example came up. A Vespa interest group held a rally on a California highway the same day a Lexus interest group held theirs. Unfortunately one of the Vespas was run over and the rider was sent to the hospital:

The very next post was a request for the license plate, and eventually a photo of the driver was posted. They then found a link to the sponsoring Lexus IG forum and a back & forth between forums broke out.

Ideally this kind of thing would result in a collection being raised to help pay for medical bills. What I hadn’t considered before was the potential for an “audience” to get out of control. I’d be willing to bet that governments around the world are starting to think along those same lines as well.

For the past year I’ve been working on a project using Archive.org to categorize a sub-set of academic library home pages according to their browse and search structures. Next week I’ll present this data at ALA during the Monday 6/26 poster session at 1pm. But for now, here’s a quick overview of some of the findings. This is an update to the data presented last year (2006) at Internet Librarian.

Methodology: Using Archive.org and Google’s list of academic websites, I semi-randomly selected a group of websites (around 276) to see how their browse structure and search placement were organized. The goal was to analyze a group of libraries from a wide cross-section of resource-enabled entities in order to get an idea of how the library community is thinking about library website design. Here is how the various categories were broken down:

Browse strategies: (i.e. if there’s such a thing as “browse DNA”, what is it? I personally don’t have a preference between these two groupings.)

  • Splash home pages: (i.e. the top page browse structure disappears once secondary pages are reached)
  • Frame home pages: (i.e. the browse structure is persistent from the top page to the secondary pages)

Search strategies: (i.e. how is search implemented from the home page?)

Browse Findings: (the “error” result indicates a problem with archive.org data)

Site Browse Trends

When I started this, I expected to find the “grid” structure to be gaining in popularity. I was surprised to see that the idea of persistent framing is definitely a trend with legs. When grid, cascade, and radial are combined (as “splash”) the trend is still there, and still surprising:

Splash vs. Frame Designs
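
As a rough sketch of how the combined view can be computed: grid, cascade, and radial labels are collapsed into “splash” and compared with “frame”. The labels in the example are made up for illustration and are not the study’s data; the function name is hypothetical, and leaving “error” results out of the denominator is my assumption about how to treat them.

```python
from collections import Counter

# Browse categories named in the post; grid, cascade, and radial are
# treated as "splash" designs, and "error" marks unusable archive.org data.
SPLASH_TYPES = {"grid", "cascade", "radial"}


def splash_vs_frame(labels):
    """Collapse per-site browse labels into splash/frame percentage shares.

    Sites labelled "error" are left out of the denominator (an assumption).
    """
    counts = Counter()
    for label in labels:
        if label in SPLASH_TYPES:
            counts["splash"] += 1
        elif label == "frame":
            counts["frame"] += 1
        # anything else (e.g. "error") is skipped
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: round(100 * n / total, 1) for kind, n in counts.items()}


# Made-up labels for illustration only; these are NOT the study's data.
sample_year = ["grid", "frame", "frame", "radial", "error", "frame", "cascade", "frame"]
print(splash_vs_frame(sample_year))
```

Running the same function over each year’s labels gives the year-over-year shares that a chart like the one above plots.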

Search Findings: Here are the results for search placement on the home page. There’s a definite trend towards placing a one-stop search box directly on the top page, where patrons won’t have to click through to use it:

2007 Search Trends

Conclusion: The way things are going, if you’ve embarked upon a library site redesign you would have the most company if you adopted a persistent browse structure with a comprehensive search box located on the home page.

Data:

Creative Commons License

This library website analysis data [2007 Academic Library Website Analysis (spreadsheet)] is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.

The same goes with all images above, as well as the original list of library websites found here.

It’s been one of those months where several ideas seem to congeal all at once. Last year I conducted usability interviews with students where I asked them if they were worried about the authority of the documents they were finding. 100% of that sample said they were not. That made little sense to me until this month when I received my copy of Weinberger’s Everything is Miscellaneous and also stumbled upon the idea of Social Proof.

I vividly remember learning in Library School that everything is not miscellaneous. That point was brought up repeatedly and emphatically. But if that’s true, why is the rest of the Internet blithely ignoring us? It seems that Librarians have lost the authority to assert the claim because the users have routed around us and built their own structures based on miscellany.

From Everything is Miscellaneous:

[The business value of content organization] creates a conundrum for businesses as they enter the digital order. If they don’t allow their users to structure information for themselves, they’ll lose their patrons. If they do allow patrons to structure information for themselves, the organizations will lose much of their authority, power, and control.

The Paradox is already resolving itself. Customers, patrons, users, and citizens are not waiting for permission to take control of finding and organizing information. And we’re doing it not just as individuals. Knowledge–its content and its organization–is becoming a social act. [p. 133]

Our users aren’t waiting on Information Scientists to organize their world because they are their own authorities. For a vivid representation of this zeitgeist, see this photo of street art found via StumbleUpon:

The New Authority

In Cialdini’s book Influence: The Psychology of Persuasion he talks about the enormous influence of Social Proof. Here’s the cartoon version found on page 120 (it reminds me strongly of how Digg, Del.icio.us, and other tagging systems work):

Even the Gods Look Up

In that cartoon, I love how even the Gods (Angels, whatever) have to start paying attention eventually. This type of behavior is happening all over the place. Take for example the ongoing effect the internet is having on Journalism. Old-school journalists are being forced to start paying attention to the opinions and reporting of bloggers, their own opinion pages are losing authority, their classified advertising model is basically dead, and it’s all because of the phenomenon addressed in that cartoon. It’s simply more profitable for a user to pay attention to their peers than to traditional authorities.

According to Cialdini, what’s going on is the “awesome influence of the behavior of similar others” [p. 152], i.e. Social Proof. By now we all know the drill for finding “similar others”: in order to find anything on your topic, consult with others similarly interested (usually via Wikipedia, the popularity indexing of Google, Digg, Technorati, etc., maybe even a bibliography). When the technique is so successful, why worry about the authority of what you’ve found?

It seems clear to me that some type of social similarity ranking mechanism needs to be made available by librarians for their patrons. Either that or we wait around for them to build their own, because they will and they are.
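
To make “social similarity ranking” a little more concrete, here is one minimal way it could work: score how much two patrons’ interests overlap and sort by that score. This is only a sketch under my own assumptions. Jaccard overlap is one common similarity measure among many, and the patron names and topic tags are invented.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two sets: |a ∩ b| / |a ∪ b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_similar_patrons(me: str, interests: dict) -> list:
    """Return (patron, score) pairs sorted from most to least similar to `me`."""
    mine = interests[me]
    scores = [
        (other, jaccard(mine, theirs))
        for other, theirs in interests.items()
        if other != me
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical patrons and topic tags, invented for illustration.
interests = {
    "ana": {"usability", "folksonomy", "search"},
    "ben": {"folksonomy", "search", "metadata"},
    "cy": {"cataloging", "metadata"},
}
print(rank_similar_patrons("ana", interests))
```

Here “ben” ranks above “cy” for “ana” because they share two of four distinct tags, so what “ben” has been reading or citing would be the first thing to surface.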


Afterwards:

DoshDosh has a good writeup on Social Proof from a marketing perspective.

Researcher Nikhil Bhatla has a highly technical (but fascinating) take.

Columbia professor Duncan Watts appears to be researching in this space.

Another business-related writeup, this time from Take Back Your Brain!

A quick comment on the latest privacy analysis conducted by Privacy International. Their methodology is a very interesting read because it appears to be directly counter to what Google is doing with search and data aggregation. Two points in particular:

Data collection and processing
What type of information does the site collect, with and without consent? On some sites the personal information submitted by customers is necessary (e.g. billing addresses) but there are many sites that collect information that may be unnecessary (age, marital status, home address, preferences, medical information, extraneous financial information) from customers without adequate information about why this information is needed and how it is used. Some companies may collect and mine other information, such as viewing habits and preferences (e.g. musical genre, lifestyle choices etc.) Here, it is also important to note the status of ‘Internet Protocol Addresses’ (IP addresses). Many companies state that they see this data as non-personal - even anonymous - information, permitting them to collect and track users’ movements around the site to determine what a specific user reads. This approach permits profiling of a user’s habits and interests.
Data retention
Some companies delete the information they collect once it is no longer needed. Other companies are not quite so clear, and a few sites are quite open that they do not intend to delete personal information at all (or at least not until they are ready to do so). With increased consumer concern about information breaches from stolen and lost computing resources, or through malicious hackers gaining access to resources, companies need to be aware that the risk to their market position and customer base may be proportionate to the amount of personal data they store.

Both of those points directly contradict how I use Google daily. I rely on Google in my ceaseless and compulsive drive deep into the Long Tail. For me, it’s all about localization and frankly, I live in a small community. That’s true physically, esthetically, professionally, and personally. And I don’t think I’m alone in that.

In my experiment this year (using Google tools for email, blog reading, calendaring, and more) what they are doing for me is aggregating my online experience. If you’ve ever had the frightening opportunity of typing in your email password on a third-party social site, you know what I mean. That screams privacy suicide, but it’s also incredibly effective. Just today I took the plunge on Facebook and by comparing their database to my Gmail contact history I found several people I would like to be connected with. It’s a genuine service that Google is offering and it relies on their having collected extraneous information from me, using it in a way that I couldn’t have foreseen, and then storing it long-term even though I can’t remember ever giving them permission to do so [although I'm sure a Google lawyer would correct me on that point]. So I don’t think Google is in the wrong on those two points, at least in terms of their long-term vision.

Here’s where I think they’re doing us a disservice: “Openness and Transparency” and “Responsiveness” (from Privacy International’s metrics). If Google were to let me into their system as an account holder and allow me to selectively weed out information that I didn’t want on their servers, I would be much more comfortable. Of course this type of activity would be a huge albatross for a private company to take on, especially taking into account the changeability of Internet business. They would have to make sure that all services offered, now and in the future, were designed in a way that allows me to access the underlying data regardless of how it is structured and where it’s located. I’m assuming they wouldn’t take on this task unless forced to.

Here’s where academic digital libraries could lead (assuming the availability of sufficient resources, of course) by enabling our researchers’ natural and vital drive into various “long tail” topics using socially derived data as jumping-off points. This could be done by building a consortial database of personal information (demographics, searches, keywords, citations used, email contacts, professional memberships, conferences attended, etc.) with the caveat that members can easily weed personalized data at will. And this data would need to be stored and accessible in perpetuity.

It would be a dream come true for researchers, and potentially our worst nightmare in the wrong hands.

There’s an interesting, but brief, writeup on the new Overdrive and Recorded Books services in this month’s print edition of Library Journal.

Recorded Books, a supplier of downloadable material to libraries, plans to announce an agreement with major studios at the annual conference of the American Library Assn. June 21-27 in Washington, according to Recorded Books VP Brian Downing.

The company already offers titles from indie house Film Movement along with public domain and other films, such as The Autobiography of Miss Jane Pittman. The company also offers travel and cooking programming and will soon add children’s entertainment.

Called MyLibraryDV the download service went live in February and already has almost 600 libraries signed up, Downing says, including all the libraries in the entire state of Wyoming. Some libraries are seeing “thousands of downloads and thousands of users,” he says.

Exciting stuff. The service seems to be targeting public libraries, but we have several academic classes which require viewing mainstream and indie films as part of the curriculum.

I’ve been playing around with the beta release of Jott [http://jott.com/] and it’s an interesting example of the potential for the cell phone to become the next search interface. It’s a free service: you give it a call, it recognizes your phone number as belonging to an account holder, and then it transcribes your message and sends it to yourself or others as an email.

Playing around with the service, it definitely has some downsides. My first test message wasn’t very well planned and I haltingly said something like “umm, test message to myself”. When I got the email, the transcription read “Tapped own weapons”. A couple more emails like that and I’ll get a visit from the FBI!

I can see something like this being used in a library if a patron wanted to save an overheard citation (as an example) and email it to their account. So my next message was the following:

Pitts, M. G., & Browne, G. J. (2007). Improving requirements elicitation: An empirical investigation of procedural prompts. Information Systems Journal, 17, 89-110.

The transcription took several minutes (or at least the email took several minutes to arrive), and when it did arrive it read like this:

Pip and Brown 2007 improving requirements [unclear speech, please listen] and imperial investigation of procedural prompts. Information system Journal IM-17, page of 89 to 110.

While that’s not perfect, there’s enough information there to figure out what I was trying to remember. Also, the email arrives with a speaker/audio icon which allows me to listen to the original message.

It seems we’re a long way off because of the quality of transcriptions, but applying this technology to an OPAC search doesn’t seem quite so distant now.