India: Facebook Blocks a Community With “Chutia” Names In a Slang-Surname Mix-Up

Facebook has blocked hundreds of accounts that use the word “Chutia” in their names. Read as a transliteration of Hindi, the word is slang roughly equivalent to “asshole.”

A Tibetan surfs a Facebook page at an internet café in this file photo from New Delhi. Facebook has banned numerous accounts from India in a transliteration-related confusion that translated certain surnames as a swear-word. (AP Photo/Tsering Topgyal)

If you have ever transliterated words from your native tongue into English, you can easily connect with this story. If you haven’t had the privilege, consider this Transliteration 101. To give an example, a city in the state of Kerala is written as Kozhikode but pronounced “KoyiKod.” Why? I don’t know. Why is “Cat” written as “Cat” and not “Kat”?

Looks like we have our first case study in censoring social media. Kapil Sibal should be reading this in the morning papers. And for Zuckerberg and co, it is just another PR nightmare. I was this close to saying that Facebook couldn’t care less, but then realized that with 44 million users, India is big for Facebook. Which means Facebook can’t ignore it for long.

What happened?

“Chutia” is a surname from Assam which is pronounced “Sutiya.” A whole community which was using the surname Chutia was apparently banned by Facebook.

Apparently there may be an Indian team monitoring Facebook accounts. They might know the popular slang Chutia but not the surname, which is written exactly the same way but pronounced differently in a different part of the country.

Now it’s clear that not even Indians know everything there is to know about India. Though this looks like an honest yet ill-informed mistake, it should serve as a lesson to all those who want to monitor social media. Monitoring all the written content coming out of a country which has some 438 spoken languages, even more dialects and, as we now know, even more transliterations, isn’t child’s play. It isn’t Zuck’s play either.

Don’t throw the moniker of Big Data around this, because this isn’t Big Data. This is Big Outrageous Data.

And that my friends, is how keyword-based filtering works.
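
For the curious, here is a minimal sketch of what a naive keyword filter might look like. Nobody outside Facebook knows how its checks actually work, so the blocklist, the function name `is_blocked` and the sample profile names are all made up for illustration. The point is only that a filter matching on spelling has no idea how a name is pronounced or where it comes from.

```python
# Hypothetical blocklist: the real list and matching rules are not public.
BLOCKED_KEYWORDS = {"chutia"}


def is_blocked(display_name: str) -> bool:
    """Return True if any word in the name exactly matches a blocked keyword."""
    words = display_name.lower().split()
    return any(word in BLOCKED_KEYWORDS for word in words)


# Made-up profile names. The first uses the Assamese surname as it is written,
# the second spells the same surname the way it is pronounced.
for name in ["Example Chutia", "Example Sutiya"]:
    print(name, "->", "blocked" if is_blocked(name) else "allowed")
```

The filter sees only the spelling, so the written form of the surname is treated exactly like the slang word, while the phonetic spelling passes untouched.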

Via ZDNet