190 lines
7.3 KiB
Text
190 lines
7.3 KiB
Text
|
PHP-gettext 1.0
|
|||
|
|
|||
|
Copyright 2003, 2006 -- Danilo "angry with PHP[1]" Segan
|
|||
|
Licensed under GPLv2 (or any later version, see COPYING)
|
|||
|
|
|||
|
[1] PHP is actually cyrillic, and translates roughly to
|
|||
|
"works-doesn't-work" (UTF-8: Ради-Не-Ради)
|
|||
|
|
|||
|
|
|||
|
Introduction
|
|||
|
|
|||
|
How many times did you look for a good translation tool, and
|
|||
|
found out that gettext is best for the job? Many times.
|
|||
|
|
|||
|
How many times did you try to use gettext in PHP, but failed
|
|||
|
miserably, because either your hosting provider didn't support
|
|||
|
it, or the server didn't have adequate locale? Many times.
|
|||
|
|
|||
|
Well, this is a solution to your needs. It allows using gettext
|
|||
|
tools for managing translations, yet it doesn't require gettext
|
|||
|
library at all. It parses generated MO files directly, and thus
|
|||
|
might be a bit slower than the (maybe provided) gettext library.
|
|||
|
|
|||
|
PHP-gettext is a simple reader for GNU gettext MO files. Those
|
|||
|
are binary containers for translations, produced by GNU msgfmt.
|
|||
|
|
|||
|
Why?
|
|||
|
|
|||
|
I got used to having gettext work even without gettext
|
|||
|
library. It's there in my favourite language Python, so I was
|
|||
|
surprised that I couldn't find it in PHP. I even Googled for it,
|
|||
|
but to no avail.
|
|||
|
|
|||
|
So, I said, what the heck, I'm going to write it for this
|
|||
|
disguisting language of PHP, because I'm often constrained to it.
|
|||
|
|
|||
|
Features
|
|||
|
|
|||
|
o Support for simple translations
|
|||
|
Just define a simple alias for translate() function (suggested
|
|||
|
use of _() or gettext(); see provided example).
|
|||
|
|
|||
|
o Support for ngettext calls (plural forms, see a note under bugs)
|
|||
|
You may also use plural forms. Translations in MO files need to
|
|||
|
provide this, and they must also provide "plural-forms" header.
|
|||
|
Please see 'info gettext' for more details.
|
|||
|
|
|||
|
o Support for reading straight files, or strings (!!!)
|
|||
|
Since I can imagine many different backends for reading in the MO
|
|||
|
file data, I used imaginary abstract class StreamReader to do all
|
|||
|
the input (check streams.php). For your convenience, I've already
|
|||
|
provided two classes for reading files: FileReader and
|
|||
|
StringReader (CachedFileReader is a combination of the two: it
|
|||
|
loads entire file contents into a string, and then works on that).
|
|||
|
See example below for usage. You can for instance use StringReader
|
|||
|
when you read in data from a database, or you can create your own
|
|||
|
derivative of StreamReader for anything you like.
|
|||
|
|
|||
|
|
|||
|
Bugs
|
|||
|
|
|||
|
Plural-forms field in MO header (translation for empty string,
|
|||
|
i.e. "") is treated according to PHP syntactic rules (it's
|
|||
|
eval()ed). Since these should actually follow C syntax, there are
|
|||
|
some problems.
|
|||
|
|
|||
|
For instance, I'm used to using this:
|
|||
|
Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : \
|
|||
|
n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
|
|||
|
but it fails with PHP (it sets $plural=2 instead of 0 for $n==1).
|
|||
|
|
|||
|
The fix is usually simple, but I'm lazy to go into the details of
|
|||
|
PHP operator precedence, and maybe try to fix it. In here, I had
|
|||
|
to put everything after the first ':' in parenthesis:
|
|||
|
Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : \
|
|||
|
(n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);
|
|||
|
That works, and I'm satisfied.
|
|||
|
|
|||
|
Besides this one, there are probably a bunch of other bugs, since
|
|||
|
I hate PHP (did I mention it already? no? strange), and don't
|
|||
|
know it very well. So, feel free to fix any of those and report
|
|||
|
them back to me at <danilo@kvota.net>.
|
|||
|
|
|||
|
Usage
|
|||
|
|
|||
|
Put files streams.php and gettext.php somewhere you can load them
|
|||
|
from, and require 'em in where you want to use them.
|
|||
|
|
|||
|
Then, create one 'stream reader' (a class that provides functions
|
|||
|
like read(), seekto(), currentpos() and length()) which will
|
|||
|
provide data for the 'gettext_reader', with eg.
|
|||
|
$streamer = new FileStream('data.mo');
|
|||
|
|
|||
|
Then, use that as a parameter to gettext_reader constructor:
|
|||
|
$wohoo = new gettext_reader($streamer);
|
|||
|
|
|||
|
If you want to disable pre-loading of entire message catalog in
|
|||
|
memory (if, for example, you have a multi-thousand message catalog
|
|||
|
which you'll use only occasionally), use "false" for second
|
|||
|
parameter to gettext_reader constructor:
|
|||
|
$wohoo = new gettext_reader($streamer, false);
|
|||
|
|
|||
|
From now on, you have all the benefits of gettext data at your
|
|||
|
disposal, so may run:
|
|||
|
print $wohoo->translate("This is a test");
|
|||
|
print $wohoo->ngettext("%d bird", "%d birds", $birds);
|
|||
|
|
|||
|
You might need to pass parameter "-k" to xgettext to make it
|
|||
|
extract all the strings. In above example, try with
|
|||
|
xgettext -ktranslate -kngettext:1,2 file.php
|
|||
|
what should create messages.po which contains two messages for
|
|||
|
translation.
|
|||
|
|
|||
|
I suggest creating simple aliases for these functions (see
|
|||
|
example/pigs.php for how do I do it, which means it's probably a
|
|||
|
bad way).
|
|||
|
|
|||
|
|
|||
|
Usage with gettext.inc (standard gettext interfaces emulation)
|
|||
|
|
|||
|
Check example in examples/pig_dropin.php, basically you include
|
|||
|
gettext.inc and use all the standard gettext interfaces as
|
|||
|
documented on:
|
|||
|
|
|||
|
http://www.php.net/gettext
|
|||
|
|
|||
|
The only catch is that you can check return value of setlocale()
|
|||
|
to see if your locale is system supported or not.
|
|||
|
|
|||
|
|
|||
|
Example
|
|||
|
|
|||
|
See in examples/ subdirectory. There are a couple of files.
|
|||
|
pigs.php is an example, serbian.po is a translation to Serbian
|
|||
|
language, and serbian.mo is generated with
|
|||
|
msgfmt -o serbian.mo serbian.po
|
|||
|
There is also simple "update" script that can be used to generate
|
|||
|
POT file and to update the translation using msgmerge.
|
|||
|
|
|||
|
Interesting TODO:
|
|||
|
|
|||
|
o Try to parse "plural-forms" header field, and to follow C syntax
|
|||
|
rules. This won't be easy.
|
|||
|
|
|||
|
Boring TODO:
|
|||
|
|
|||
|
o Learn PHP and fix bugs, slowness and other stuff resulting from
|
|||
|
my lack of knowledge (but *maybe*, it's not my knowledge that is
|
|||
|
bad, but PHP itself ;-).
|
|||
|
|
|||
|
(This is mostly done thanks to Nico Kaiser.)
|
|||
|
|
|||
|
o Try to use hash tables in MO files: with pre-loading, would it
|
|||
|
be useful at all?
|
|||
|
|
|||
|
Never-asked-questions:
|
|||
|
|
|||
|
o Why did you mark this as version 1.0 when this is the first code
|
|||
|
release?
|
|||
|
|
|||
|
Well, it's quite simple. I consider that the first released thing
|
|||
|
should be labeled "version 1" (first, right?). Zero is there to
|
|||
|
indicate that there's zero improvement and/or change compared to
|
|||
|
"version 1".
|
|||
|
|
|||
|
I plan to use version numbers 1.0.* for small bugfixes, and to
|
|||
|
release 1.1 as "first stable release of version 1".
|
|||
|
|
|||
|
This may trick someone that this is actually useful software, but
|
|||
|
as with any other free software, I take NO RESPONSIBILITY for
|
|||
|
creating such a masterpiece that will smoke crack, trash your
|
|||
|
hard disk, and make lasers in your CD device dance to the tune of
|
|||
|
Mozart's 40th Symphony (there is one like that, right?).
|
|||
|
|
|||
|
o Can I...?
|
|||
|
|
|||
|
Yes, you can. This is free software (as in freedom, free speech),
|
|||
|
and you might do whatever you wish with it, provided you do not
|
|||
|
limit freedom of others (GPL).
|
|||
|
|
|||
|
I'm considering licensing this under LGPL, but I *do* want
|
|||
|
*every* PHP-gettext user to contribute and respect ideas of free
|
|||
|
software, so don't count on it happening anytime soon.
|
|||
|
|
|||
|
I'm sorry that I'm taking away your freedom of taking others'
|
|||
|
freedom away, but I believe that's neglible as compared to what
|
|||
|
freedoms you could take away. ;-)
|
|||
|
|
|||
|
Uhm, whatever.
|