September 2nd, 2007

So, here I am, permanent resident of Japan, living here for 11 years, doing professional translation work, and there’s one thing that I suck at, and will continue to suck at, until eternity: Googling.

I’m good with Google in English.  Give me a topic, and in a few minutes I’ll trawl down the perfect search results.  But when it comes to Googling in Japanese, I’m as bad as my mom (sorry, Mom).

The problem is (as always in this blog) some idiosyncrasy in Japanese; partly the language itself, and partly the way it’s used.

Let’s imagine, for example, that you want to look up something about servers.  In English, in Google, you’d type “server”, and click “search” (yes, I know, that’s a horrible search, and you’ll get a billion results, but if all you have to go on is the single word “server”, it can’t be helped).  In Japanese, however, you’ve got way more fragmentation.

First off, there are two proper ways to write “server”: “サーバ”, and “サーバー”.

Next, there is the possibility that it is written in English, so let’s add “server”.

Next, because we’re dealing with a technical topic, there is a very high likelihood of silliness.  (Think about the folks who write “Micro$oft”.  Now imagine that they do that to pretty much any common term, whether they like the company/product/item or not).  One common trick is to take a word that is properly written in katakana and write it in hiragana instead.  So now we add “さーば” and “さーばー”.  But there’s more silliness: “server” is pronounced (roughly) “saba” in Japanese.  “Saba” is also the name for a kind of fish (I think it’s mackerel in English).  The mackerel saba is written with the kanji “鯖”.  And that’s what a lot of techies use on their own personal blogs.

So to look up something about servers, you end up using as your search string:

Google (J): server OR サーバ OR サーバー OR さーば OR さーばー OR 鯖
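
For the curious, here is a rough Python sketch of how that string could be put together automatically. It leans on the fact that katakana letters sit exactly 0x60 code points above their hiragana twins in Unicode, so the hiragana spellings can be derived from the katakana ones. The function names (`katakana_to_hiragana`, `or_query`) are made up for this illustration, and the joke spelling “鯖” still has to be added by hand, since nothing in the script can guess it.

```python
def katakana_to_hiragana(text):
    """Map katakana letters to hiragana; leave everything else (including ー) as-is."""
    out = []
    for ch in text:
        code = ord(ch)
        # U+30A1..U+30F6 are the katakana letters whose hiragana twin sits 0x60 below.
        if 0x30A1 <= code <= 0x30F6:
            out.append(chr(code - 0x60))
        else:
            out.append(ch)
    return "".join(out)


def or_query(variants):
    """Join spelling variants with OR, dropping duplicates but keeping order."""
    seen = []
    for variant in variants:
        if variant not in seen:
            seen.append(variant)
    return " OR ".join(seen)


# Hypothetical variant list for "server"; the kanji pun is supplied manually.
katakana = ["サーバ", "サーバー"]
server_variants = (
    ["server"] + katakana + [katakana_to_hiragana(k) for k in katakana] + ["鯖"]
)
server_query = or_query(server_variants)
print(server_query)
```

This only covers the katakana-to-hiragana part of the problem. Variants like the optional trailing “ー” or kanji puns would still need a hand-maintained list.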

You get something similar (though a little less involved) with even regular non-techy non-taking-the-piss terms: To look up a thing that goes “moo” in English, you would use the following Google search string:

Google (E): cow

To look up that same thing in Japanese, you’d use the following Google search string:

Google (J): 牛 OR うし OR ウシ

…and on and on, with all kinds of words.  Sure, not all of them are a nightmare, but there are so many that they start to stack up fast.

Lord help the person who wants to find a case mod of a server that looks like a cow.

Google (E): server mod cow


Google (J): (server OR サーバ OR サーバー OR さーば OR さーばー OR 鯖) AND (mod OR モッド OR 改造) AND (牛 OR うし OR ウシ)

…Except, heck, Google doesn’t support parentheses, so I wouldn’t even know how to find that page.
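
One brute-force way around missing parentheses, sketched here in Python, would be to expand the grouped search into every plain combination and run each one separately. This assumes the search engine treats a space between words as an implicit AND; the helper name `expand_queries` is invented for this example, and the variant lists are simply the ones from the searches above.

```python
from itertools import product

# Spelling variants taken from the search strings in this post.
server = ["server", "サーバ", "サーバー", "さーば", "さーばー", "鯖"]
mod = ["mod", "モッド", "改造"]
cow = ["牛", "うし", "ウシ"]


def expand_queries(*groups):
    """Return one space-separated query per combination of variants."""
    return [" ".join(combo) for combo in product(*groups)]


queries = expand_queries(server, mod, cow)
print(len(queries))  # 6 * 3 * 3 combinations
for query in queries[:3]:
    print(query)
```

That comes to 54 separate searches for one cow-shaped server, which rather proves the point.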

    More seriously, I think the language is one more very good manifestation of what I see as a Japanese character trait: a lack of desire to simplify things. You see it everywhere, from the language to addresses to business organization. The Japanese are not bothered by (or not willing to do anything about) levels of arbitrary complexity that drive westerners mad.

    You don't need to worry about doing searches for both Kanji and Hiragana. Google is doing that under the covers anyway. They have people who are dedicated to improving search quality for all languages, including Japanese, so that users don't need to think about how they need to construct the search…

    Huh. I wasn't aware of that. That's pretty cool, actually.

