[Sputnik-list] Translation table format

Yuri Takhteyev yuri at sims.berkeley.edu
Fri Jul 6 00:58:05 GMT+3 2007


Yes, if string IDs will be treated as just IDs (as we discussed in an
earlier thread), then it makes sense to treat all translations
equally.  So, each table will look something like this:

    EDIT = {
       en = "Edit",
       ru = "Редактировать",
       pt = "Editar",
    }

The similarity between "EDIT" (the variable name) and "Edit" (the
value for "en") will be treated as just a coincidence.  All
translations will need to be specified explicitly.

> If the default language really needs to be taken from the locale table (and
> not from a config option elsewhere), something like the following syntax
> might be better:

I think the default language should be identified by its standard
code in each table, but there should be a variable in the file that
specifies either the default language or the order of languages to
try.  I.e., either:

    DEFAULT_LANGUAGE = "en"

or

    LANGUAGE_ORDER = {"en", "es", "pt", "ru"}
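
To make the second option concrete, here is a rough sketch of how the
lookup might use LANGUAGE_ORDER.  The function name pick_translation
and its arguments are made up for illustration; nothing like this
exists in Sputnik yet.

```lua
-- Sketch only: pick_translation is a hypothetical helper.
LANGUAGE_ORDER = {"en", "es", "pt", "ru"}

EDIT = {
   en = "Edit",
   ru = "Редактировать",
   pt = "Editar",
}

-- Returns the translation for `preferred` if the entry has one;
-- otherwise the first one found while walking LANGUAGE_ORDER;
-- nil if the entry defines none of those languages.
function pick_translation(entry, preferred)
   if preferred and entry[preferred] then
      return entry[preferred]
   end
   for _, lang in ipairs(LANGUAGE_ORDER) do
      if entry[lang] then
         return entry[lang]
      end
   end
   return nil
end
```

With the table above, asking for "es" (which EDIT does not define)
would fall through to the first language in the list that is defined.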

> We might also change the Brazilian Portuguese locale Key to br, instead of
> pt, to distinguish from an eventual future Portugal / African Portuguese.

Actually, what I wanted to do is the following: Each language should
be identified by its two-letter or three-letter code (see
http://www.iana.org/assignments/language-subtag-registry).  In the case
of Portuguese, it's "pt" for European, Brazilian, or Cape Verdean
Portuguese alike.  More specific varieties of each language would be
identified by subtags, which could be regional or other, using IANA
subtags when possible, e.g. "pt_BR", "pt_CV", "en_GB" (British),
"es_AR", etc.  The code that does the translation will check both.
I.e., if the "pt_BR" interface is chosen, then the pt_BR value will be
used if defined; if not, we'll use "pt"; and if that is missing too,
then "en".  Note that if we only have one Portuguese translation, we
should label it "pt".  Same with English.  We can start with

    HI_USER = {
       en = "Hi, $user!"
    }

Then move to

    HI_USER = {
       en = "Hi, $user!",
       en_GB = "How do you do, $user?",
       en_AU = "G'day, $user!"
    }
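
The lookup order described above (specific variety, then the base
language, then the default) could be sketched roughly like this.  The
function name translate is an assumption for illustration, and the
sketch only strips everything after the first underscore, so it does
not try intermediate forms of longer tags.

```lua
-- Sketch only: translate is a hypothetical helper.
DEFAULT_LANGUAGE = "en"

HI_USER = {
   en = "Hi, $user!",
   en_GB = "How do you do, $user?",
   en_AU = "G'day, $user!"
}

-- Tries the full code ("en_GB"), then the base language ("en"),
-- then DEFAULT_LANGUAGE.  Returns nil if none of them is defined.
function translate(entry, lang)
   -- "pt_BR" -> "pt"; nil when the code has no subtag
   local base = lang:match("^(%a+)_")
   return entry[lang]
       or (base and entry[base])
       or entry[DEFAULT_LANGUAGE]
end
```

So a request for "en_GB" gets the British string, while "en_US" (not
defined) and "pt_BR" (no Portuguese at all in this table) both end up
with the "en" value.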

Note that with this approach one only needs to "translate" into
specific varieties of English or Portuguese those strings that
actually need to be different.  E.g., we can just have

    LOGIN = {
       en = "Login"
    }

rather than:

    LOGIN = {
       en_US = "Login",
       en_GB = "Login",
       en_AU = "Login"
    }

This basically means following RFC 4646
(http://www.rfc-editor.org/rfc/rfc4646.txt) when possible, with the
only exception of using underscores instead of hyphens as separators, to
make it easier to write this in Lua.
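
The reason underscores are easier: en_GB is a valid Lua identifier, so
it can be written as a bare field name, while a hyphenated tag would
have to be quoted as ["en-GB"].  If tags ever arrive in the RFC form
(say, from a browser), a small conversion would be enough.  The name
tag_to_key below is hypothetical.

```lua
-- Sketch only: tag_to_key is a hypothetical helper.
-- Converts an RFC 4646 style tag ("pt-BR") into the underscore form
-- used as a table key ("pt_BR").
function tag_to_key(tag)
   -- "%-" escapes the hyphen, which is a magic character in Lua
   -- patterns; the outer parentheses drop gsub's second result.
   return (tag:gsub("%-", "_"))
end

LOGIN = {
   en = "Login",
   ["en-GB"] = "Login",   -- hyphenated keys need brackets and quotes
   en_AU = "Login",       -- underscore keys can be written bare
}
```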

> I have finished the first pass on the (portuguese) translation page and
> we'll be refining it, maybe in an on-line Sputnik version (so everyone can
> contribute) in the next days.

You mean your pt_BR translation, right?  :)  Anyway, dado's
translation is in SVN now.

 - yuri

-- 
http://www.freewisdom.org/

