Business can benefit by observing the evolution and maintenance of information on the web, writes KARLIN LILLINGTON
TO SOME, online social networks and sites of huge group interaction like the Wikipedia encyclopedia, microblogging site Twitter and information tagging communities like Delicious.com may seem like just a bit of fun for people with a lot of time on their hands.
But a research team at Parc – the Palo Alto Research Centre in Silicon Valley, where much of modern computing was invented in the 1970s – thinks these social network systems reveal insights valuable to businesses and organisations about how groups can efficiently analyse, filter and use information, harnessing the “wisdom of the crowd”.
“We realised a lot of very interesting things happen in these kinds of systems,” says Ed Chi, principal scientist at Parc and manager of a group that studies human-computer interaction and large-scale social systems.
His group’s research vision is to understand how social computing systems can enhance the ability of a group of people to remember, think and reason. They have shown that long-standing research algorithms used in the social sciences work for masses of online data just as they do for traditional social research.
Casual assumptions about how such huge systems work often prove to be wrong, Chi says. Wikipedia, for example, is commonly viewed as a product of benevolent co-operation, in which people with an interest work together to produce an informative encyclopedia entry. But Chi and his team found it was actually conflict that drove productivity on Wikipedia – altering and adding to someone else’s prior contribution.
“And even when people agree on a point of view, they will have arguments on how to present the points they want to make.”
Typical of the conflict that gives rise to many active contributions on a Wikipedia article is an entry on the history of radio. There’s a constant tussle between US and Italian contributors as to which nation can lay claim to inventing the radio, Chi says.
The most active points of contention are often minor, even seemingly innocuous, details. Chi notes that during the last two US presidential elections “conflict wars” broke out on Wikipedia over the age at which George W Bush was convicted of driving under the influence of alcohol, and whether that age could be described as “young” – and could therefore be considered more a youthful mishap and indiscretion.
Chi says he finds Wikipedia’s “social transparency” particularly fascinating. “When you read an article, you really have no idea whether you are listening to one voice or many; you have no idea how many voices may have been synthesised into a single article.”
To try to understand this process better, his team came up with an application that provides a “visual dashboard”. Found at Wikidashboard.com, the application enables anyone to see who has edited and has been influencing an article. “It’s kind of like a pub argument” in visual form, he says. It also forms what he calls a “living laboratory”, because it applies algorithms used to analyse large data sets to see what people do live, in the real world, on Wikipedia.
Another goal of the team has been to try to understand Wikipedia’s growth model. “It’s been growing exponentially until March 2007, then it settled into a very different growth regime, more linear. We have been very interested in trying to understand that.” What they found was that Wikipedia functioned like a Darwinian ecosystem – based on conflict and limited resources, with success going to the members of the system who most readily adapt and gain the most expertise. “The system is a lot more like a living, breathing system than a runaway bacteria colony with an unlimited food supply,” he says.
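The contrast Chi draws, between a colony with an unlimited food supply and a system constrained by limited resources, can be illustrated with a textbook comparison of exponential and logistic growth. The sketch below is only that: the function name `simulate`, the growth rate and the capacity are all assumptions chosen for illustration, it is not the model the Parc team used, and it does not attempt to reproduce the roughly linear phase Chi describes.

```python
def simulate(steps, rate, capacity=None, start=1.0):
    """Iterate a simple growth model and return the size at every step.

    With capacity=None growth is unconstrained (exponential).
    With a capacity, growth slows as the size approaches it (logistic).
    All parameters are hypothetical illustration values.
    """
    size = start
    history = [size]
    for _ in range(steps):
        if capacity is None:
            growth = rate * size  # unlimited food supply
        else:
            growth = rate * size * (1 - size / capacity)  # limited resources
        size += growth
        history.append(size)
    return history


unlimited = simulate(steps=30, rate=0.5)
limited = simulate(steps=30, rate=0.5, capacity=1000.0)

print(f"unlimited after 30 steps: {unlimited[-1]:.0f}")
print(f"limited after 30 steps:   {limited[-1]:.0f}")
```

In the unconstrained run the size keeps multiplying, while the constrained run levels off near the assumed capacity, which is the qualitative difference between a “runaway” colony and a resource-limited ecosystem.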
The more linear growth pattern settled in after Wikipedia introduced a system of elite editors. With that system, many of the arguments get settled and often the articles become more stable.
Chi says administrators and “elite users” of Wikipedia – longtime users who contribute regularly to articles – behave differently from novice users.
Over time, the “revert rate” for novice users – the rate at which edits and additions made by those users are deleted so that an article reverts to its previous form – has risen to one in four. By contrast, elite users account for only 1 per cent of reverts, Chi says. Overall, he thinks Wikipedia is moving from rapid expansion into “maintenance mode”, similar to the stabilisation of a biological system.
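As a rough illustration of the “revert rate” described above, the following sketch computes, for each class of user, the fraction of that class’s edits that were reverted. The function name `revert_rates`, the `(user_class, was_reverted)` record layout and the sample data are all invented for the example and are not Parc’s data or method. Note that this per-class rate is a different measure from the share of all reverts that a class accounts for, which is the figure quoted for elite users.

```python
from collections import Counter


def revert_rates(edits):
    """Return {user_class: fraction of that class's edits that were reverted}.

    `edits` is an iterable of (user_class, was_reverted) pairs.
    This record layout is a hypothetical one chosen for illustration.
    """
    totals = Counter()
    reverted = Counter()
    for user_class, was_reverted in edits:
        totals[user_class] += 1
        if was_reverted:
            reverted[user_class] += 1
    return {cls: reverted[cls] / totals[cls] for cls in totals}


# Made-up sample: one of four novice edits reverted, no elite edits reverted.
sample = [
    ("novice", True),
    ("novice", False),
    ("novice", False),
    ("novice", False),
    ("elite", False),
    ("elite", False),
]

rates = revert_rates(sample)
print(rates)
```

With this invented sample the novice rate works out to one in four, matching the proportion mentioned in the article purely by construction.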
Why does all this matter? “What’s really interesting is how this might apply to other systems,” says Chi. “Consider sites like Yelp.com or Amazon. Do they have a growth mode and a maintenance mode? These are really serious issues that have business implications.”
He says it’s important to understand what motivates people to contribute to such systems. It is easy to incorrectly assume that people are motivated by community, when it may actually be vigorous conflict and debate that is more productive.
A related area of interest is to examine how information gets delivered to groups of people, by looking at huge social network systems like Delicious and Twitter. Chi says these large systems produce user-generated “optimisation – if you need to know it, information should be easy to find”. With such networks, users themselves begin to find ways to most effectively tag, highlight and find information. People using such networks select and then mark or forward information they think is of interest or of use: “Those acts of forwarding are ways for the system to self-organise and self-optimise.”
Understanding how people make these local decisions that affect a global social network has led his group to analyse the kind of information on Twitter that gets “re-tweeted”.
The group uses algorithms to crunch through millions of re-tweets to understand what elements of a post, URL or topic cause one tweet to spread virally and another to go nowhere. Understanding this better could help people and businesses better filter their “information overload”.
Chi is interested in the problem posed by the late Twitter joiner who might have equally valuable information and URLs to tweet but doesn’t stand out because – as with Wikipedia – they have come late to a party where earlier users have a bigger share of the follower pie.
Again, this is a major problem for businesses which may want to harness the knowledge of their employees, or glean more information about their clients and suppliers externally.
The work the group does may emerge as useful services for businesses, he says, because “the optimal distribution of knowledge across an organisation is how an organisation operates at ultimate efficiency”.