6.05.2014

What's Wrong with Google's Semantic Search and Amazon's Bestseller Lists?

What's wrong with social media? A lot, and most of the time, we're not aware of it. We are so deep into our digital life that we don't notice what's going on. Yet, willy-nilly, our "likes" and "dislikes" are shaped by machines that "think" in a binary mode...

Here I'll focus on Google's and Amazon's search systems, but much the same can be said of how prominence is given to items on Facebook or LinkedIn. All major social media are driven by algorithms that rank items by number of hits.

I plead guilty: I'm simplifying a lot and I won't go into the technicalities, but please bear with me. I want to get to the principle at the heart of all those systems. And the principle is incredibly simple: the higher the number of "hits", the higher the ranking.

Now "hits" can be driven by many things, and Google has tried to give a different "weight" to different things.

What things? In Google's world it can be summarized in one word: "authorship". If someone is an "important author", i.e. considered an "expert" in his or her field (whatever that might be and however defined - which, incidentally, further complicates the matter), then a "plus" from that person on Google+ counts for more than one from someone who doesn't rank as an author, or at least doesn't rank as high. The concept of "semantic search" is a further refinement that allows Google's computers to take into account your past search history, which expresses your interests, and thus put your searches and you in context, ensuring more relevant answers to your questions...
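To make the "weight" idea concrete, here is a minimal sketch in Python. It is an illustration only, not Google's actual method: the names `AUTHORITY`, `weighted_scores` and `rank`, and the weights themselves, are all made up for the example. It simply shows how one "plus" from an "expert" can outrank two from ordinary users.

```python
from collections import defaultdict

# Hypothetical weights: a vote from an "expert" counts three times as much.
# These numbers are invented for illustration only.
AUTHORITY = {"expert": 3.0, "regular": 1.0}


def weighted_scores(votes):
    """Sum the weighted votes for each page.

    votes is an iterable of (page, voter_kind) pairs.
    """
    scores = defaultdict(float)
    for page, voter_kind in votes:
        # Unknown voter kinds fall back to the ordinary weight of 1.0.
        scores[page] += AUTHORITY.get(voter_kind, 1.0)
    return dict(scores)


def rank(votes):
    """Return the pages ordered from highest to lowest weighted score."""
    scores = weighted_scores(votes)
    return sorted(scores, key=lambda page: scores[page], reverse=True)


votes = [
    ("page_a", "regular"),
    ("page_a", "regular"),
    ("page_b", "expert"),
]
print(rank(votes))  # ['page_b', 'page_a']
```

Note that nothing in this sketch looks at what the pages actually say: the score depends only on who voted and how often.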

But all that is based on an assumption that you search for what you are interested in.

And that, in itself, could be a dangerous assumption to make. For example, it's all "off" in the case of fiction writers like myself: I will search for things the characters in my novels are interested in, but that doesn't mean I am! And I assume that's even more true of a crime writer searching for the specifications of guns and other firearms: presumably, it doesn't indicate at all that he's in the market to buy a weapon... Also, you could be deeply interested in something today and not give a damn about it tomorrow.

Personally, I love to play games with Google and look for the most unlikely stuff. Their computers probably think I'm off my rocker...

Now take a look at Amazon: the situation is very similar.

Though Amazon has never described its ranking system, one suspects it too is driven by some variant of the "authorship" concept - things like popularity, number of customer reviews and the "star" average resulting from all reviews. For example, a product with a straight 5-star review average resulting from over 50 reviews will obviously "weigh" more than one with a 4-star review average resulting from fewer than 50 reviews, etc. It can get quite complicated, but it's nothing that a good computer algorithm can't handle.
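Since Amazon's real formula is not public, the following is only a guess at the general shape of such a calculation. It uses a "damped" (Bayesian-style) average, a common way to combine a star average with the number of reviews behind it. The function names, the sample products and the `prior_mean` and `prior_weight` values are all assumptions chosen for the example.

```python
def damped_average(star_avg, n_reviews, prior_mean=3.0, prior_weight=10):
    """Pull a star average toward a neutral prior when reviews are few.

    prior_mean and prior_weight are hypothetical tuning values: with only
    a handful of reviews the score stays near prior_mean, and with many
    reviews it approaches the product's own star average.
    """
    total = prior_weight * prior_mean + star_avg * n_reviews
    return total / (prior_weight + n_reviews)


def rank_products(products):
    """Order product names by damped average, best first."""
    return sorted(
        products,
        key=lambda name: damped_average(*products[name]),
        reverse=True,
    )


# Made-up data: name -> (star average, number of reviews)
products = {
    "book_a": (5.0, 60),
    "book_b": (4.0, 20),
    "book_c": (5.0, 2),
}
print(rank_products(products))  # ['book_a', 'book_b', 'book_c']
```

In this sketch a book with a perfect 5-star average from only two reviews ends up below a 4-star book with twenty: the count of reviews matters as much as their content.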

All this means that if you search for something - anything, from a bicycle on Google to a book on Amazon - the top returns will be those that got the highest number of "hits".

Read that sentence again.

See what I mean?

Personally, I am terrified. Because it means that there's no choice or decision-making criteria at work here. No evidence of any quality evaluation carried out by anyone. No sign that this is a product floating to the top because it is "better".

No.

What we have here is a classic "snowball" effect. The more votes something gets, the more votes it accumulates. It's a vicious circle, and a very vicious one indeed.
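The snowball effect is easy to simulate. The sketch below is a toy model, not a description of any real site: it assumes every item starts with one hit and that each new visitor picks an item with a probability proportional to the hits it already has, which is one simple way of modelling "top results get seen more". The function name `simulate_snowball` and its parameters are made up for the example.

```python
import random


def simulate_snowball(n_items=5, rounds=1000, seed=0):
    """Simulate hits piling up on items that already have hits.

    All items are identical in "quality" and start with one hit each.
    In every round one visitor picks an item with probability
    proportional to its current hit count, then adds a hit to it.
    """
    rng = random.Random(seed)  # seeded so that a run can be repeated
    hits = [1] * n_items
    for _ in range(rounds):
        chosen = rng.choices(range(n_items), weights=hits, k=1)[0]
        hits[chosen] += 1
    return hits


hits = simulate_snowball()
print(sorted(hits, reverse=True))
```

Try it with different `seed` values: which item ends up on top changes from run to run, even though the items are identical. In this toy model, at least, the final ranking reflects early luck and nothing else.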

What's on top of searches - whether Google's or Amazon's or for that matter, what is "popular" and "liked" on Facebook - is not necessarily "the best". Some very good stuff could have been by-passed, ignored, forgotten.

Why? Sheer bad luck, lack of "good marketing". Yes, if you throw a lot of money at something and "force" it up the ranks, at some point it will start to float up by itself. Or maybe it won't. Because luck also has its place here and can destroy the marketing strategies of the savviest marketeers.

Conclusion: we know that a number of good, valuable products will always stay "at the bottom". That's in the nature of algorithms based on number of "hits" - no matter how you try to give weight to some aspects, like popularity or authorship.

That "giving weight" is inevitably a simplification - it's never a thoughtful in-depth evaluation of a product - or, what interests me as a writer, a book. Hey, this is a "binary" world!

You're in (value=1) or out (value=0).

Is our cultural life really down to two numbers, zero and one?

Now, the traditional publishing industry has for a long time used a complex system of literary evaluation - it works on multiple levels, ranging from expert opinions/reviews of books published in specialized magazines like Granta and the New Yorker to articles reviewing books in the mainstream media like the UK Guardian and the New York Times to high-level literary competitions like the Man Booker Prize or the Pulitzer. And all those systems are linked and interact: when someone wins the Pulitzer, the news reverberates through the media - with a snowball effect.

Not the same snowball effect you find on Amazon when a book title hits the top 100 sellers. That list is driven by sales, not by expert opinion. Computer algorithms simply record that, based on the number of copies sold, you're in that list... or out. It says nothing about book quality.

Now this is a number-driven system that indie writers - those who've had the guts and bravery to self-publish - are ignoring at their peril. Because when you self-publish, you hand your books over to the search systems of Amazon and Google (and similar systems on the Internet). You don't have access to the traditional publishing world and its complex system of literary evaluation. Your writer's blog and your books are driven by algorithms rather than in-depth, quality evaluations. And the numbers can easily go against you - at the end of the day, it's largely a matter of luck. Of course, book marketing can help counteract the onslaught of cold numbers, but not always and only up to a point.

Let me repeat: it's largely a matter of luck.

And that means only one thing: the digital revolution has connected us all very closely, more closely than ever before in human history, but it has also injected a "binary numbering" system between us that is not the equivalent of serious discussion and quality evaluation. A lot of good stuff sinks out of sight on both Google and Amazon. And of course on Facebook and other media.

The binary system drives the content that is noticed, popular and talked about.

The binary system impoverishes our intellectual and cultural life. The digital revolution has spawned the binary society.

You're either in or out. No two ways about it.

Your views?

PS. This is not in praise of the Pulitzer or Man Booker, far from it. Such literary competitions have their own drawbacks, as brilliantly depicted in the novel I'm currently reading, "Lost for Words" by Edward St. Aubyn. Let me quote just one very appropriate passage:
"Everybody thinks they understand the joke of reality TV, but the real joke is that there is no other reality! There can be no civilization because we are living in the desert of the Real. All our experience has been mediated by a system whose tyranny is precisely that no one controls it. Its tyranny is the absence of the tyrant..."

Yes, and the "absence" is filled by Google's algorithms ceaselessly at work... So we may not really have the "tyrants" - for example, the gatekeepers of "taste" and culture found in traditional publishing - but we do have the binary numbers to guide us!

The Brunn-bodied Cadillac belonging to Joseph Pulitzer's son...
Those were the days!
