Why Google Needs To Validate Human Beings

September 06, 2011

I just finished reading this great post on the spam arms race that Google and other search engines face. It is a good summary of the problem in general. A really interesting implied point of the article is that, in some ways, the algorithmic approach to search indexing is hopelessly flawed. It puts algorithm gamers perpetually ahead of great content, because people focused on great content aren't gaming algorithms (or, more realistically, they are afraid to because they might get blacklisted or penalized). So there will always be some lag before great content rises to the top of search results, while search engines work to filter out the gamed results.

Near the end of the post he says:

    "The good news is that webmasters who don’t invest in gaming will still see the best long-term results. Focusing on quality, basic promotion through guest blogging or social sites, and honestly providing value will get organic results over time – and won’t be tossed to the sidelines with algorithm updates."

Why? I mean, we all hope this is true, but what is the actual next step in addressing the problems Rob raises? What advantage could search engines get back over the algorithm gamers? There are human indexes, but they can't seem to keep up with all the content on the web, especially emerging, fresh, or daily content. One solution might be to crowdsource the problem. I think Google has attempted to do this a little bit with its +1 button. But this could be gamed as well by sending out bots to vote up content.

In every case, the problem keeps coming back to automated processes. And there are only two solutions to this problem that I can see:

1. Build an automated process to identify automated processes. Presumably, automated processes have patterns or signals that can be identified. Content that is promoted or generated by an automated process could then be given lower value by a search ranking system.
2. Somehow harness the abilities of real human beings to help identify valuable content.

The problem with #1 is that it remains an arms race. With each development comes a new battleground. With every adjustment Google makes comes a counter-adjustment by anyone looking to game them. #1 will presumably always be a part of the search indexing process, but it alone is not enough.

#2 is the answer, because at the end of the day humans know what they want. They may even *want* what spammers are trying to get to them. But Google doesn't necessarily care about that; it just wants to pair searchers with what they are looking for. So by threading real human feedback into the quality of the results, you can ensure that stuff people don't want gets marginalized or deprioritized.
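
To make the idea a little more concrete, here is a minimal sketch of what threading human feedback into ranking might look like. It is only an illustration under stated assumptions, not how Google or any search engine actually works: the `Vote` class, the `voter_trust` number (a made-up 0.0 to 1.0 estimate of how likely the voter is a real person), and the page names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    page: str
    # Hypothetical estimate (0.0 to 1.0) that the voter is a real person.
    voter_trust: float


def weighted_scores(votes):
    """Sum votes per page, counting each vote by the voter's trust estimate."""
    scores = {}
    for vote in votes:
        scores[vote.page] = scores.get(vote.page, 0.0) + vote.voter_trust
    return scores


# Illustrative data only: two trusted votes versus four bot-like votes.
votes = [
    Vote("useful-guide", 0.9),
    Vote("useful-guide", 0.8),
    Vote("spam-page", 0.05),
    Vote("spam-page", 0.05),
    Vote("spam-page", 0.05),
    Vote("spam-page", 0.05),
]

scores = weighted_scores(votes)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy data the spam page collects more raw votes but still ranks below the page that trusted voters liked. The hard part is hidden inside `voter_trust`: working out who is a real human being is the validation problem itself.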

Of course, easier said than done.

The gamers may create an army of things that look human, or even employ cheap labor in order to create networks of people who are directed to promote their product, creating an artificial demand that would get noticed by Google.

Last Friday I talked about the different kinds of personas that may exist on the web. Tomorrow I will take a look at the different ways human beings can be validated and therefore used to improve search quality.

About the author
Tom DiCicco

Tom DiCicco is a Client Partnership Director at Productive Edge.
