Localization support for language identifier

Something’s wrong when a language identifier doesn’t have localization support. So I cooked up a little localization code for What Language Is This?, which proved to be harder than one might guess. That’s because some of the web app’s text lives in static HTML, some is generated by PHP, and some is generated in JavaScript. I wanted a single source of localized strings for all three output paths, to make it easier to review, translate, change, and add strings.

I’m not sure if there’s a good existing solution for this, so I rolled my own. Each translation keeps its strings in a text file formatted like an ini file, with string ids and localized strings separated by an equals sign. You can view the English and Japanese raw text files if you like. The file is read into a PHP array (used as a dictionary). To choose which file to read, the script first looks at the language code in the URL (/en for English, /ja for Japanese, or any other code). If the URL doesn’t specify one, it looks at the languages the browser is set to prefer, as sent in the Accept-Language HTTP header. If the requested language is not available, it defaults to English.
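The loading and language selection described above could look roughly like the sketch below. The post doesn’t show the actual code, so the function names (parse_strings, pick_language), the “;” comment convention, and the decision to ignore q-values in Accept-Language are all my assumptions for illustration.

```php
<?php
// Hypothetical helpers; names and details are assumptions, not the site's real code.

// Parse "id = localized string" lines into an associative array.
function parse_strings($text) {
    $strings = array();
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        // Skip blank lines and ";" comment lines (an assumed convention).
        if ($line === '' || $line[0] === ';') continue;
        $pos = strpos($line, '=');
        if ($pos === false) continue;  // no equals sign, not a string entry
        // Split on the first "=" only, so the string itself may contain "=".
        $strings[trim(substr($line, 0, $pos))] = trim(substr($line, $pos + 1));
    }
    return $strings;
}

// Choose a language code: the URL wins if it names one, otherwise the
// Accept-Language header is consulted; anything unavailable falls back to English.
function pick_language($url_lang, $accept_header, $available) {
    if ($url_lang !== null && $url_lang !== '') {
        $candidates = array($url_lang);
    } else {
        $candidates = array();
        // Simplification: keep header order and ignore q-values.
        foreach (explode(',', (string)$accept_header) as $entry) {
            $parts = explode(';', $entry);
            // "ja-JP" and "ja" both reduce to "ja".
            $candidates[] = strtolower(substr(trim($parts[0]), 0, 2));
        }
    }
    foreach ($candidates as $code) {
        if (in_array($code, $available, true)) return $code;
    }
    return 'en';
}
```

With array('en', 'ja') available, a request for /ja picks Japanese, and a request with no language in the URL and a header that lists only unavailable languages ends up with English. PHP’s built-in parse_ini_file might also do the parsing, but it gives some punctuation special meaning, so a hand-rolled parser seemed the safer assumption here.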

To localize the HTML output, the PHP script that reads through and configures the app looks for string ids enclosed in percent signs, like %string id%, and replaces them with the localized strings from the dictionary. (The plain HTML file can also run on its own, offline, but only for debugging purposes.) The PHP-generated content is trivially changed to look up its strings in the dictionary. On the JavaScript side, I wanted access to the same string dictionary that I had on the PHP side, so the PHP dictionary is converted to JSON and inserted into a <script> block of the generated HTML output as a JavaScript object. String id lookups can then be done on this object from the JavaScript code, just like on the PHP side.
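As a rough sketch of the two steps above (placeholder replacement and handing the dictionary to JavaScript), something like the following would do. The function names and the global variable name STRINGS are hypothetical, and leaving unknown ids untouched is my own choice, not something the post specifies.

```php
<?php
// Hypothetical helpers; the real script is not shown in the post.

// Replace %string id% placeholders with entries from the dictionary.
// Unknown ids are left as-is so a missing translation stays visible.
function localize_html($html, $strings) {
    return preg_replace_callback('/%([^%\r\n]+)%/', function ($m) use ($strings) {
        return isset($strings[$m[1]]) ? $strings[$m[1]] : $m[0];
    }, $html);
}

// Emit the same dictionary as a JavaScript object inside a <script> block.
// JSON_HEX_TAG and JSON_HEX_AMP escape <, > and & so that a string
// containing "</script>" cannot end the block early.
function strings_script($strings) {
    $json = json_encode($strings, JSON_HEX_TAG | JSON_HEX_AMP | JSON_UNESCAPED_UNICODE);
    return "<script>var STRINGS = $json;</script>";
}
```

On the JavaScript side a lookup would then be a plain property access such as STRINGS['title'], mirroring $strings['title'] in PHP.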

あれ何語? What Language Is This? in 日本語

It all works pretty well and meets my goals. The only downside is that it relies on the server to do some processing, so the strings aren’t available when I develop on the offline version. Instead I get to see the raw string ids, which can be useful too, but you have to rely on imagination to envision the end result. Isn’t programming always like that anyway, though?

The first translated version of What Language Is This? is of course Japanese, done by myself and my wife (初めての共同作業, our “first joint effort”? lol). That’s not just because it’s easy for me to do. According to the AddThis stats, Japan is the top-ranking country, and as you know, average English skills in Japan are pretty bad, so I suspect there is demand for a Japanese translation. Looking at the access stats, and discounting countries with good English skills (India, the Netherlands, and Scandinavia, for example), next in line would most likely be Spanish, French, and German, in that order. Anyone feel like helping? If so, please drop me a comment. I can offer proper credit and a link back from the site in return.


We’ll Always Have C

The other day there was an interview in Dr. Dobb’s Journal with the managing director of TIOBE Software, the company that publishes the TIOBE Programming Community Index, a ranking of programming language popularity. It was also discussed on Slashdot.

The methodology TIOBE uses to calculate a language’s popularity is basically the good old ad-hoc voodoo index of Google hits, using “[language] programming” as the query. This measures the “web presence” of a programming language.

First of all, it’s obvious to you and me that this measures something: the number of web pages that include the term “[language] programming”. There’s nothing wrong with this method, as long as you are aware of what you’re measuring. But is it fair to call this the popularity of a language?


Look at this blog, for example. I mostly mention JavaScript and PHP here, just like everyone else. Throw in some Ruby and Python too to max out the buzz factor. There is no mention of relics such as C in this blog. But you know what language I use ten times more than any other? C. I’d love to have a job hacking away in JavaScript, Ruby, and Python all day, but I’d have to settle for half the salary. So here goes: C programming. Index that. Embedded, heavily multi-threaded, efficient, minimum memory, hardcore badass C programming, that’s what I do, and I love doing it.

Most coders can’t do C. That’s why you see all these Visual This and Dot That and scripting languages on the ranking, because these kids blog about every little insignificant hobby project they manage to cut and paste together, just like I do. But let there be no mistake about it: real programmers can code in C. They do syntactically correct typedefs of function pointers in their sleep. (Just kidding, that’s impossible.)


At work I also hack in Python, Perl, and Makefile. At home it’s mostly JavaScript, PHP, Ruby, Python… Lately Python has replaced Ruby as my language of choice for home hacking because of its decent Unicode support. (Although I’ve had to hack the Python standard library in some places where it didn’t properly support Unicode. I read that a coming version of Python (3.0, I believe) will use Unicode strings by default, which is great, and only ten years late.) I also sold my soul the other day and installed Visual C# 2008 Express Edition for some hobby hacking. It turned out not to be much fun, but I haven’t given up yet.

At my previous job I used C++ for doing essentially the same thing as I do in C now. I’m completely convinced that C is the right tool for the job. I’m also convinced C does object orientation better than C++, but that is a topic for another post. And I used to be a Java fan, but now I consider Java one of the best examples of software suckiness ever. It’s a volatile industry, technologies come and go, but no amount of blogging will convince me that the C programming language is anything but #1.

I’m saying it because it’s true: We’ll always have C. Because we’ve got jobs to do.


Caching in PHP

By default, PHP tries as hard as it can to keep the web browser from caching pages. While I can understand the rationale behind this a bit, sometimes you want caching. Caching is actually a good thing, you know. It means faster load times and lower bandwidth and processing requirements.

So I was surprised by how hard it is to turn off this aggressive non-caching policy. I googled for a few minutes and browsed the PHP documentation without finding an easy way of doing it. OK, so you can use the following code snippet to enable caching in PHP (the argument to the function is the number of seconds the page stays valid):

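// Send HTTP headers that let the browser cache this page for $expire seconds.
function send_cache_headers($expire) {
  header("Cache-Control: max-age=$expire");  // freshness lifetime in seconds
  header("Pragma: cache");                   // replaces any "Pragma: no-cache" set earlier
  header("Expires: " . gmdate('D, d M Y H:i:s \G\M\T', time() + $expire));  // absolute expiry time in GMT
}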