*An important thing to keep in mind*:
Do not forget to set the charset in the .po file!
For example:
"Content-Type: text/plain; charset=UTF-8\n"
Then PHP will be able to find the .mo file you generated, using msgfmt, from the .po file WITH CHARSET SET.
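A quick way to check this before running msgfmt is to look for the charset in the .po header. The snippet below is only a sketch: poCharset() is a hypothetical helper name, not part of the gettext extension, and it assumes the header uses the usual Content-Type line shown above.

```php
<?php
/**
 * Return the charset declared in a .po header, or null if none is declared.
 * Hypothetical helper, not part of the gettext extension.
 */
function poCharset(string $poContents): ?string
{
    // The header entry carries a line such as:
    // "Content-Type: text/plain; charset=UTF-8\n"
    if (preg_match('/^"Content-Type:[^"]*charset=([A-Za-z0-9._-]+)/mi', $poContents, $m)) {
        // A fresh template from xgettext still has the placeholder "CHARSET".
        return strtoupper($m[1]) === 'CHARSET' ? null : $m[1];
    }
    return null;
}
```

Pass it the file contents, for example the result of file_get_contents() on your own .po file; a null result means the charset still has to be set.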
Because of this I wasted a lot of time debugging my code, testing every single little change people suggested in this manual and around the Internet:
<?php
//this:
setlocale( LC_MESSAGES, 'pt_BR');
//or this:
setlocale( LC_MESSAGES, 'pt_BR.utf8');
//or this:
setlocale( LC_MESSAGES, '');
//this:
putenv("LANG=pt_BR.utf8");
//or this:
putenv("LANGUAGE=pt_BR.utf8");
//this:
bindtextdomain('mydomain', dirname(__FILE__).'/locale');
//or this:
bindtextdomain("*", dirname(__FILE__).'/locale');
//or this:
bindtextdomain('*', dirname(__FILE__).'/locale');
//setting or not "bind_textdomain_codeset()":
bind_textdomain_codeset("mydomain", 'UTF-8');
?>
As well as what locale directory name to set:
./locale/pt_BR.UTF8/LC_MESSAGES/mydomain.mo
or
./locale/pt_BR/LC_MESSAGES/mydomain.mo
or
./locale/pt/LC_MESSAGES/mydomain.mo
Finally, the code which brought the right translated strings (also with the correct charset) was:
<?php
$directory = dirname(__FILE__).'/locale';
$domain = 'mydomain';
$locale ="pt_BR.utf8";
//putenv("LANG=".$locale); //not needed for my tests, but people say it's useful for windows
setlocale( LC_MESSAGES, $locale);
bindtextdomain($domain, $directory);
textdomain($domain);
bind_textdomain_codeset($domain, 'UTF-8');
?>
All three directory names worked with the pt_BR.utf8 locale. (For my tests I restarted Apache and then tried each directory.)
I hope to help someone else not to waste as much time as I've wasted... =P
Using:
Ubuntu 8.04 (hardy)
Apache 2.2.8
PHP 5.2.4-2ubuntu5.6
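Since all three directory names worked for pt_BR.utf8, it can help to list the names worth trying for a given locale and see which one actually holds a catalog. This is a simplified sketch: localeDirCandidates() and findCatalog() are hypothetical helpers, and the real lookup inside libintl is more involved (it also normalizes codeset spellings, which this does not attempt).

```php
<?php
/**
 * List directory names worth trying for a locale, most specific first.
 * Simplified illustration only; not the actual libintl algorithm.
 */
function localeDirCandidates(string $locale): array
{
    $noCodeset = explode('.', $locale)[0];        // pt_BR.utf8 -> pt_BR
    $languageOnly = explode('_', $noCodeset)[0];  // pt_BR -> pt
    return array_values(array_unique([$locale, $noCodeset, $languageOnly]));
}

/**
 * Return the first existing .mo path under $directory, or null if none exists.
 */
function findCatalog(string $directory, string $domain, string $locale): ?string
{
    foreach (localeDirCandidates($locale) as $name) {
        $path = "{$directory}/{$name}/LC_MESSAGES/{$domain}.mo";
        if (is_file($path)) {
            return $path;
        }
    }
    return null;
}
```

Calling findCatalog(dirname(__FILE__).'/locale', 'mydomain', 'pt_BR.utf8') and getting null back is a strong hint that the problem is the directory layout rather than the PHP code.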
Warning for Linux (Ubuntu) users! Your system will *only* support the locales installed on your OS, in the *exact* format given by your OS. (See also the PHP setlocale man page.) To get a list of them, enter locale -a, which will give you something like this:
C
en_US.utf8
ja_JP.utf8
POSIX
So this machine only has English and Japanese! To add, e.g., Finnish, install the package:
sudo apt-get install language-pack-fi-base
Rerun locale -a, and "fi_FI.utf8" should appear. Make sure you're using the same name in your PHP code:
setlocale(LC_ALL, "fi_FI.utf8");
Adjust your po paths so that they match, e.g. "./locale/fi_FI.utf8/LC_MESSAGES/messages.po".
Now restart Apache, and it should finally work. Figuring this out took quite a while...
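Because setlocale() returns false when the requested locale is not installed, it is worth checking its return value instead of assuming the call worked. A minimal sketch, where pickLocale() is a hypothetical helper and the candidate spellings are whatever locale -a printed on your machine:

```php
<?php
/**
 * Try each candidate spelling and return the one the OS accepted, or null.
 * Hypothetical helper; the candidate list is up to you.
 */
function pickLocale(array $candidates): ?string
{
    // setlocale() tries the array elements in order and returns false
    // when none of them is available on this system.
    $result = setlocale(LC_ALL, $candidates);
    return $result === false ? null : $result;
}
```

For example, pickLocale(array('fi_FI.utf8', 'fi_FI.UTF-8', 'fi_FI')) gives you the name that was actually accepted, which is also the name to use for the directory under ./locale. A null result means the language pack is still missing.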
And what about pgettext and npgettext? They are in the gettext documentation, but they aren't available in PHP. They're very useful when the same message appears in different contexts and so should be translated separately, and probably differently.
Fortunately, there is a simple work-around, which may help:
From the gettext.h header one can see that pgettext() is only a macro that calls dcgettext() internally with suitably mangled parameters: msgctxt and msgid are glued together with ASCII character 4 (EOT, End of Transmission). They are stored the same way in .mo files, so it's possible to look them up this way.
Here's my "emulated" pgettext() function:
<?php
if (!function_exists('pgettext')) {
    function pgettext($context, $msgid)
    {
        // Context and msgid are joined with EOT (\004), as in the .mo file.
        $contextString = "{$context}\004{$msgid}";
        $translation = dcgettext('messages', $contextString, LC_MESSAGES);
        // No translation found: fall back to the plain msgid.
        if ($translation === $contextString) return $msgid;
        else return $translation;
    }
}
?>
By default, xgettext doesn't recognize the pgettext function in PHP source files, but there is a parameter that works around this. Here's how I call xgettext:
xgettext --force-po --keyword="pgettext:1c,2" -c -o messages.po sourceFile.php
In sourceFile.php I use the following test code:
pgettext('menu', 'Open'); //Substitute "Otwórz"
pgettext('forum', 'Open'); //Substitute "Otwarty", different context
Generated .po file fragment:
msgctxt "menu"
msgid "Open"
msgstr "Otwórz"
msgctxt "forum"
msgid "Open"
msgstr "Otwarty"
I've tested it out and everything works fine :-) If anyone has further suggestions or fixes, please write ;-)
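The same trick should extend to npgettext(). The following is an untested-in-production sketch that assumes the plural form is mangled with the same \004 separator and that the domain is 'messages', as in the pgettext() emulation; it relies on dcngettext() from the gettext extension.

```php
<?php
if (!function_exists('npgettext')) {
    function npgettext($context, $singular, $plural, $n)
    {
        // Glue the context onto both forms with EOT (\004).
        $ctxSingular = "{$context}\004{$singular}";
        $ctxPlural = "{$context}\004{$plural}";
        $translation = dcngettext('messages', $ctxSingular, $ctxPlural, $n, LC_MESSAGES);
        // No translation found: fall back to the plain, unmangled form.
        if ($translation === $ctxSingular || $translation === $ctxPlural) {
            return $n == 1 ? $singular : $plural;
        }
        return $translation;
    }
}
```

For extraction, the matching xgettext keyword would be --keyword="npgettext:1c,2,3".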
My PHP app was made for UTF-8, but I had a problem: gettext always returned all texts in ISO-8859-2 instead of UTF-8.
Then I found out that you have to set the locale in PHP to exactly the encoding you want. So when I wanted Czech UTF-8 I used:
setlocale(LC_ALL, "cs_CZ.UTF-8");
Now it works...
It is noteworthy that, according to the GNU gettext FAQ[1], * source code must be ASCII *. That means that you can't write code like
<?= _("áááàààãã")?>
in UTF-8, ISO-8859-1 or whatever, and hope that that will work. It won't.
[1]http://www.gnu.org/software/gettext/FAQ.html#nonascii_strings
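If you want to catch such strings early, a small check can flag msgids that contain non-ASCII bytes. This is just a sketch; isAsciiOnly() is a hypothetical helper, and how you feed it the msgids from your sources is up to you.

```php
<?php
/**
 * Return true when the string contains only 7-bit ASCII bytes.
 * Hypothetical lint helper for msgids.
 */
function isAsciiOnly(string $msgid): bool
{
    // Any byte outside 0x00-0x7F means the string is not plain ASCII.
    return preg_match('/[^\x00-\x7F]/', $msgid) === 0;
}
```

Strings that fail the check are better written as an ASCII msgid in the source, with the accented text kept in the msgstr of the .po file.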
To get this working properly on Debian, install the locales-all package. I just spent a few hours hunting a bug where it wouldn't work because that package was missing.
Gettext is a function that looks up translated text in a table. It's the common function used to internationalize a web site.