A letter from the Social Search frontier: good news, bad news

By Manuel Cebrian & Iyad Rahwan.

Our ability to search social networks for people and information is fundamental to our success. We use our social networks to look for new job opportunities, to seek advice about which products to buy, to find a good physician, to identify business partners, and so on. Remarkably, we seem to be able to search social networks efficiently: we need only a few introductions before finding the answer to our question, or the kind of person we are seeking. How come?

In folk culture, the answer to this question is that we live in a “small world”; that we can find any target person (who has a skill or information we seek) through a handful of personal introductions. When the catchphrase was coined in 1929 by Hungarian author Frigyes Karinthy, there was no quantitative evidence that this was the case. And it took almost 40 years to find any.

In 1967, psychologist Stanley Milgram conducted a ground-breaking experiment to test this “small world” hypothesis. He started with random individuals in the U.S. Midwest, and asked them to get a package to a target person in Boston, Massachusetts, whose address was not given. They could contribute to this “search” only by sending the package to an individual they knew on a first-name basis. Scientists surveyed by Milgram expected that successful searches (if any) would require hundreds of individuals along the chain from the initial sender to the final recipient.

Surprisingly, however, Milgram found that the average path length was somewhere between 5.5 and 6 individuals. While the majority of packages never arrived (in one experiment, only 64 out of about 300 made it to the target), this was attributed to participants’ lack of interest in forwarding, rather than to the non-existence of a short path. Whenever there was interest, a short path was found and used, making social search astonishingly efficient.

Although Milgram’s experiment drew some methodological criticism, its findings were profound. What it did not answer, though, is why social networks have such short paths in the first place. The reason the answer is not obvious is that social networks are very cliquish: your friends’ friends are more likely to also be your friends. This “clustering” suggests that our search through the social network can easily get “trapped” within our close social community.

Again it took a long time, this time more than 30 years, before this riddle was solved. In a seminal paper in Nature, Duncan Watts and Steven Strogatz came up with a simple mathematical model that explains the existence of these short paths. They started from a social network that is highly clustered: most of your friends are also friends of one another, or friends of friends of one another. In this model, the world is “large”, since the average path length is very long. However, if we take only a tiny fraction of these connections (say 1%) and rewire them to random locations in the network, that same world suddenly becomes “small”. These random connections allow individuals to jump to faraway communities very quickly, reducing the average path length in dramatic fashion.
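The effect is easy to see in simulation. Below is a minimal sketch using the networkx library; the network size, degree, and rewiring probabilities are illustrative choices of ours, not parameters from the original paper.

```python
# A minimal sketch of the Watts-Strogatz "small world" effect.
# Sizes and probabilities are illustrative, not from the paper.
import networkx as nx

n, k = 1000, 10  # 1,000 people, each starting with ties to their 10 nearest neighbors

for p in [0.0, 0.01, 0.1]:  # fraction of ties rewired to random locations
    G = nx.connected_watts_strogatz_graph(n, k, p, seed=42)
    print(f"p={p}: average path length = {nx.average_shortest_path_length(G):.1f}, "
          f"clustering = {nx.average_clustering(G):.2f}")
```

In a typical run, no rewiring gives an average of about fifty hops between random pairs; rewiring just 1% of the ties typically cuts that by roughly an order of magnitude, while the clustering that makes the network cliquish barely moves.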

While this theoretical insight suggests that social networks are searchable due to the existence of short paths, it does not yet say much about the “algorithm” that people use to find these paths. There is no reason, a priori, that we should know how to find these short chains, especially since no individual has knowledge of the network structure beyond their immediate community.

Soon after Watts and Strogatz came up with this model at Cornell University, a computer scientist across campus, Jon Kleinberg, set out to investigate whether all such “small world” networks are searchable. In a landmark Nature article titled “Navigation in a Small World”, he showed that in a two-dimensional world (like the surface of the earth), search is easy without global knowledge of the network, but only for a very specific form of long-range connectivity: the probability of knowing somebody far away must decay with distance at exactly the right rate (in two dimensions, as the inverse square of the distance). With the advent of publicly available social media datasets such as LiveJournal, David Liben-Nowell and colleagues showed that real-world social networks indeed exhibit this particular long-range connection probability. It appears the social structure of the world we inhabit is remarkably fine-tuned for searchability.
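To make the idea concrete, here is a toy version of Kleinberg’s setting: people live on a grid, know their four lattice neighbors, and have one long-range contact drawn with probability proportional to distance to the power of minus two, the exponent that makes greedy forwarding work. It is a simplified sketch of ours (for brevity, the long-range contact is re-drawn at every hop), not Kleinberg’s exact construction.

```python
# A toy sketch of greedy routing in Kleinberg's navigable small world.
# Simplification: we re-draw each node's long-range contact on the fly
# rather than fixing the whole network up front.
import random

N = 50  # a 50x50 grid of "people"

def dist(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])  # Manhattan distance

def long_range_contact(u, r=2.0):
    # One faraway acquaintance, chosen with probability ~ dist^-r.
    # r = 2 is the exponent Kleinberg showed makes 2-D search easy.
    others = [(x, y) for x in range(N) for y in range(N) if (x, y) != u]
    weights = [dist(u, v) ** -r for v in others]
    return random.choices(others, weights)[0]

def greedy_route(s, t):
    # Like Milgram's subjects: forward to whichever contact seems
    # closest to the target, using purely local knowledge.
    steps, u = 0, s
    while u != t:
        x, y = u
        contacts = [(x + dx, y + dy) for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]
                    if 0 <= x + dx < N and 0 <= y + dy < N]
        contacts.append(long_range_contact(u))
        u = min(contacts, key=lambda v: dist(v, t))
        steps += 1
    return steps

random.seed(0)
hops = [greedy_route((random.randrange(N), random.randrange(N)),
                     (random.randrange(N), random.randrange(N))) for _ in range(20)]
print("average greedy chain length:", sum(hops) / len(hops))
```

The striking part of Kleinberg’s theorem is that at any other exponent, greedy chains grow polynomially with the size of the world, while at exactly two they grow only polylogarithmically: short chains are findable, not merely existent.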

Circa 2005, it seemed that the many pieces of the puzzle of Milgram’s fundamental discovery were in place: the right network structure, the correct parameters to navigate that structure efficiently, and so on. But network scientists still needed solid empirical evidence of real individuals performing real searches efficiently.

Duncan Watts, again, this time with his Columbia University lab members, set out to find such evidence experimentally. Reproducing Milgram’s experiment in the Internet world, they found that some people were indeed able to efficiently find and reach their targets. The problem was that most people were not. Financial incentives were suggested as the missing ingredient: their absence would explain the very large observed attrition rate, whereas the right incentives would super-charge the network to route efficiently. This is a network-level, game-theoretic problem: how do you persuade individuals to convince other individuals, and so on, to help with the search, and to do so efficiently?

It turns out the question is quite difficult to answer, even for folks as smart as Jon Kleinberg and Prabhakar Raghavan. They found a partial solution: a sophisticated technique called “query incentive networks”, in which recruiters offer fixed contracts to their recruited friends to help with the search. If the offers from recruiter to recruit are generous enough, everything goes smoothly, and the elusive information or individual is found by the network. The problem with this technique is that, for the offers to promote efficient search, everyone needs many, many friends they can persuade and negotiate with. In particular, for these contracts to work nicely, individuals need to recruit more than two others on average. Unfortunately, it is well known that viral recruitment has a reproductive number (the average number of people an individual recruits) much smaller than 2; in fact, it is generally slightly smaller than 1. Thus, these contracts never suffice to find what the crowd is looking for. Realistic reproductive numbers make individuals greedy and their offers insufficiently generous, causing social search to falter. After so much research, it seemed we were at a dead end: until we found the right way to structure financial incentives, social search was just too inefficient.
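To see why the reproductive number matters so much, consider a crude branching-process sketch. It is our own illustration with made-up numbers, not the Kleinberg-Raghavan model itself.

```python
# Viral recruitment as a branching process: each recruit brings in a
# Poisson-distributed number of further recruits with mean r, the
# "reproductive number". Numbers here are purely illustrative.
import numpy as np
from statistics import median

rng = np.random.default_rng(1)
CAP = 20_000  # stop counting once a cascade is clearly "viral"

def recruitment_tree_size(r):
    frontier, size = 1, 1
    while frontier and size < CAP:
        frontier = int(rng.poisson(r, frontier).sum())
        size += frontier
    return size

for r in [0.9, 2.1]:
    sizes = [recruitment_tree_size(r) for _ in range(500)]
    viral = sum(s >= CAP for s in sizes) / len(sizes)
    print(f"r={r}: median tree size {median(sizes):.0f}, share going viral {viral:.0%}")
```

With r just below 1, which is where real recruitment sits, the median tree fizzles out after a handful of people and essentially no cascade goes viral; fixed contracts demand the kind of explosive growth seen only above 2, which real recruitment simply does not deliver.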

As in many other situations in life, it takes pressure for humans to innovate and solve problems otherwise thought unsolvable. And nothing put so much pressure on “social search” researchers as the now-famous DARPA Network Challenge, in which teams around the world competed to find 10 red weather balloons scattered across the continental U.S. The first team to find them all would get $40,000.

The winning team, based at the MIT Media Lab, designed a modification of the Kleinberg-Raghavan contracts that turned them upside down. Instead of the recruiter making an offer to the recruit, it would be the other way around: the recruit would offer a split of the reward back to the recruiter, payable once the information was found. The researchers called these “split contracts.” This modification gave the MIT team a massive advantage over its 4,000+ competitors, landing them the prize. It took three more years for theoretical computer scientists at the University of California, San Diego to figure out why split contracts provide a competitive advantage over the Kleinberg-Raghavan fixed contracts. But after quite a few theorems, they got it: split contracts allow query incentive networks to work for branching factors near 1 or higher, which makes them usable in the real world. With this, split contracts became the gold standard in time-critical social mobilization.
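Concretely, the winning scheme split each balloon’s reward geometrically up the recruitment chain: the finder got half, the person who recruited the finder got a quarter, and so on. Here is a minimal sketch (the names and the per-balloon reward are illustrative):

```python
# A minimal sketch of a split-contract payout along one recruitment
# chain, using the geometric halving of the MIT balloon scheme.
def split_payouts(chain, reward=4000.0):
    """`chain` runs from the root recruiter down to the balloon finder."""
    payouts, share = {}, reward / 2
    for person in reversed(chain):  # finder first, then up the chain
        payouts[person] = share
        share /= 2
    return payouts

print(split_payouts(["alice", "bob", "carol", "dave"]))
# {'dave': 2000.0, 'carol': 1000.0, 'bob': 500.0, 'alice': 250.0}
# dave found the balloon; the undistributed remainder can go to charity.
```

Because every recruiter profits whenever anyone below them succeeds, recruiting is worthwhile even when each person expects to sign up only one other, which is exactly the regime where fixed contracts break down.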

They were used, for instance, to win the Tag Challenge, a transcontinental version of the DARPA Network Challenge, in which 5 simulated jewel thieves were hiding in 5 different European and North American cities. Participants were able to recruit others in a targeted manner, leading to remarkable convergence towards the target cities. By then, it seemed that time-critical social mobilization was pretty much figured out.

However, split contracts suffered a brutal defeat in the DARPA Shredder Challenge, arguably the most difficult task ever posed to time-critical social mobilization. Participants in this challenge had to recruit thousands of people to assemble 5 puzzles of extreme difficulty, totalling around 10,000 pieces. Every puzzle was a shredded document that had to be reconstructed, with an increasing level of sophistication in the shredding, oiling, and blurring of the pieces. A team based at UC San Diego recruited over 3,500 participants, who worked on the puzzles around the clock via an online virtual assembly board. However, upon completion of the first three puzzles, they were subjected to a series of sabotage attacks that shattered, scrambled, and completely disrupted the puzzles the participants were working on. Eventually, these attacks not only halted collective progress but caused an exodus of crowd workers, scared off by the prospect of future attacks.

There was very little that the UC San Diego team leaders, or the crowd itself, could do to avert these attacks. They were victims of the very strength of crowdsourcing: its openness to participation. Split contracts, and crowdsourcing in general, were designed to be open to massive influxes of participants joining, recruiting others, and so on: the more the merrier. Unfortunately, sabotage, misinformation, and evildoing were far from the minds of scientists working on ways to increase the productivity of crowdsourcing. But the threat was there, it was definitely there.

Some recent efforts have tried to disincentivize sabotage in crowdsourcing. If verification is rewarded along the recruitment tree, just as recruitment itself is, then the individuals who recruited the saboteurs have a clear incentive to verify, halt, and punish them. This theoretical solution is yet to be tested in practice.
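One way such a scheme might look, as a purely hypothetical extension of the split contract sketched above (the bonus rule and all numbers are our own illustration, not a published mechanism):

```python
# Hypothetical: split contracts with a verification bonus. Recruiters
# who check their recruits' submissions earn extra, so waving a
# saboteur through carries an opportunity cost. Illustrative only.
def split_payouts_with_verification(chain, reward=4000.0, bonus_rate=0.25,
                                    verifiers=frozenset()):
    payouts, share = {}, reward / 2
    for person in reversed(chain):  # finder first, then up the chain
        bonus = share * bonus_rate if person in verifiers else 0.0
        payouts[person] = share + bonus
        share /= 2
    return payouts

print(split_payouts_with_verification(["alice", "bob", "carol"],
                                      verifiers={"bob"}))
# {'carol': 2000.0, 'bob': 1250.0, 'alice': 500.0}
```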

If theory is to be believed, the outlook for reducing sabotage in social search is not promising. The “Crowdsourcing Dilemma” was recently proposed: a game-theoretic analysis of a fundamental tradeoff between the potential for increased productivity of social search and the possibility of being set back by malicious behavior, such as misinformation. The results show that in competitive scenarios, such as those with multiple social searches competing for the same information, malicious behavior is the norm, not an anomaly, a result contrary to conventional wisdom. Counterintuitively, making sabotage more costly does not deter saboteurs, but leads all the competing teams to a less desirable outcome.
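The flavor of the result can be seen in a toy payoff matrix of our own making (the published model is richer; every number below is illustrative). Two teams race for the same prize; each can play fair or pay a cost to sabotage the other, which slows the rival down:

```python
# A toy two-team race in the spirit of the Crowdsourcing Dilemma.
# All payoffs are illustrative inventions, not the published model.
import itertools

PRIZE, COST, SLOWDOWN = 10.0, 1.0, 0.6  # being sabotaged cuts a team's speed to 0.6

def win_prob(me_attacked, rival_attacked):
    mine = SLOWDOWN if me_attacked else 1.0
    theirs = SLOWDOWN if rival_attacked else 1.0
    return mine / (mine + theirs)  # race decided in proportion to speed

for a, b in itertools.product(["fair", "attack"], repeat=2):
    pay_a = PRIZE * win_prob(b == "attack", a == "attack") - (COST if a == "attack" else 0)
    pay_b = PRIZE * win_prob(a == "attack", b == "attack") - (COST if b == "attack" else 0)
    print(f"A {a:6} / B {b:6}: A gets {pay_a:.2f}, B gets {pay_b:.2f}")
```

In this toy, attacking is each team’s dominant strategy, yet mutual attack (4.00 each) leaves both worse off than mutual fair play (5.00 each): the prisoner’s-dilemma structure the authors identify.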

Our empirical and theoretical findings have cautionary implications for the future of social search, and for crowdsourcing in general. Social search is surprisingly efficient, cheap, easy to implement, and useful across many applications. But there are also surprises in the amount of evildoing that social searchers will stumble upon while recruiting.

Talent is out there to be searched for and recruited. But so are evil and malice. Ultimately, our job as scientists is to find more of the former, while deterring more of the latter.