Migration path to xml2po

This discussion is connected to the gimp-docs-list.gnome.org mailing list which is provided by the GIMP developers and not related to gimpusers.com.


Migration path to xml2po Nickolay V. Shmyrev 22 Sep 04:36
  Migration path to xml2po Marco Ciampa 22 Sep 04:46
  Migration path to xml2po Sally C. Barry 22 Sep 05:11
   Migration path to xml2po Nickolay V. Shmyrev 22 Sep 06:43
    Migration path to xml2po Sally C. Barry 22 Sep 07:49
     Migration path to xml2po Sally C. Barry 22 Sep 07:58
      Migration path to xml2po Nickolay V. Shmyrev 22 Sep 08:08
       Migration path to xml2po Axel Wernicke 22 Sep 08:17
     Migration path to xml2po Axel Wernicke 22 Sep 08:04
     Migration path to xml2po Nickolay V. Shmyrev 22 Sep 08:16
    Migration path to xml2po Sally C. Barry 22 Sep 10:55
   Migration path to xml2po Marco Ciampa 22 Sep 06:57
  Migration path to xml2po Axel Wernicke 22 Sep 07:29
   Migration path to xml2po Nickolay V. Shmyrev 22 Sep 07:40
  Migration path to xml2po Roman Joost 26 Sep 00:21
   Migration path to xml2po Sally C. Barry 26 Sep 09:49
   Migration path to xml2po Nickolay V. Shmyrev 27 Sep 03:36
    Migration path to xml2po Roman Joost 28 Sep 00:29
     Migration path to xml2po Nickolay V. Shmyrev 28 Sep 03:06
   Migration path to xml2po Marco Ciampa 27 Sep 08:17
Migration path to xml2po William Skaggs 22 Sep 09:00
Migration path to xml2po William Skaggs 27 Sep 07:43
  Migration path to xml2po Axel Wernicke 27 Sep 08:39
Nickolay V. Shmyrev
2006-09-22 04:36:03 UTC (over 17 years ago)

Migration path to xml2po

Skipped content of type multipart/mixed

Marco Ciampa
2006-09-22 04:46:43 UTC (over 17 years ago)

Migration path to xml2po

On Fri, Sep 22, 2006 at 03:35:36PM +0400, Nickolay V. Shmyrev wrote:

Hi all

What about moving all of gimp-help to the xml2po way of translation? That will certainly make the translation process more manageable.

[..]
I personally agree, but how do you plan to resolve the glossary problem?

Sally C. Barry
2006-09-22 05:11:55 UTC (over 17 years ago)

Migration path to xml2po

Hello Nickolay, Marco and All -

What about moving all of gimp-help to the xml2po way of translation? That will certainly make the translation process more manageable.

I'm not familiar with these tools, so I am eager to follow the discussions. It seems to me that as more languages are added, the files get bigger and less manageable -- and it gets harder and harder to find a "corner" of the doc tree to work in.

1. We'll lower the barrier to contributing translations. The translation process will be faster, since translators won't need to care about DocBook and can just use existing tools.

As long as the "existing tools" means just text editors and such. :-)

1. Translators who use po files will have to follow the English content precisely. Note that it will still be possible to write additional translated content directly in the source files, so I think it's a minor problem.

I must note that when I've been updating the English, I've also looked at the content of other languages to see if there's anything I can add to the English at the same time. :-) Would this still be possible?

2. Development will require additional tools: Python and libxml2-python. I think it's a minor requirement.

See my previous comment. I don't build or validate my files at all. I simply download them from viewcvs, edit them, and send them to Julien for validation (and they ultimately go to Axel for commit). I don't have or use the tools you mentioned. Is it still possible for me to contribute with this scheme?

I personally agree, but how do you plan to resolve the glossary problem?

Marco, I'm not sure what glossary problem you refer to. :-( Can you elaborate, please?

Between early versions of Gimp and 2.0 (?) there was a major change in the documentation toolset, so I understand. Perhaps this could be considered at the next update of Gimp (2.4?), so the current arrangement isn't disturbed for now. Just a thought.

Thank you very much for thinking about the issues and coming up with a proposal.

Regards,

Sally

Nickolay V. Shmyrev
2006-09-22 06:43:16 UTC (over 17 years ago)

Migration path to xml2po

On Fri, 22/09/2006 at 08:17 -0400, Sally C. Barry wrote:

Hello Nickolay, Marco and All -

What about moving all of gimp-help to the xml2po way of translation? That will certainly make the translation process more manageable.

I'm not familiar with these tools, so I am eager to follow the discussions. It seems to me that as more languages are added, the files get bigger and less manageable -- and it gets harder and harder to find a "corner" of the doc tree to work in.

Probably you would like to read about gettext and po files http://www.gnu.org/software/gettext/manual/html_mono/gettext.html#SEC161

In short, translations are committed to special text files in the po subdirectory and later merged into the source. When the source is updated, the po files can be updated too. New strings will be marked for translation, and strings that changed slightly will be invalidated.
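The update cycle described above can be sketched roughly as follows. This is only an illustration in Python: the function name `update_catalog`, the status labels and the sample strings are made up for the example, and `difflib` stands in for whatever fuzzy matching the real gettext tools apply.

```python
import difflib


def update_catalog(old_catalog, new_sources, cutoff=0.6):
    """Carry existing translations over to a new list of source strings.

    old_catalog maps an English source string to its translation.
    Returns a dict mapping each new source string to (status, translation).
    """
    updated = {}
    for source in new_sources:
        if source in old_catalog:
            # Unchanged string: the translation stays valid.
            updated[source] = ("translated", old_catalog[source])
            continue
        # Slightly changed string: reuse the old translation, but flag it for review.
        close = difflib.get_close_matches(source, list(old_catalog), n=1, cutoff=cutoff)
        if close:
            updated[source] = ("fuzzy", old_catalog[close[0]])
        else:
            # Brand-new string: nothing to reuse.
            updated[source] = ("untranslated", "")
    return updated


old = {
    "Open the File menu.": "Öffnen Sie das Menü Datei.",
    "Select a brush.": "Wählen Sie einen Pinsel.",
}
new = ["Open the File menu.", "Select a brush tool.", "Press OK."]
result = update_catalog(old, new)
```

In this sketch the first string keeps its translation, the second is flagged for review because the English changed slightly, and the third is new and waits for a translator.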

1. We'll lower the barrier to contributing translations. The translation process will be faster, since translators won't need to care about DocBook and can just use existing tools.

As long as the "existing tools" means just text editors and such. :-)

For you as a content writer, the tools will include only editors. Translators will use editors too, but will benefit from the many translation tools such as KBabel and Poedit.

1. Translators who use po files will have to follow the English content precisely. Note that it will still be possible to write additional translated content directly in the source files, so I think it's a minor problem.

I must note that when I've been updating the English, I've also looked at the content of other languages to see if there's anything I can add to the English at the same time. :-) Would this still be possible?

Content that differs from the English content will stay in the DocBook files, so yes, it will be possible. We'll remove only content that precisely follows the English version, and not all at once.

2. Development will require additional tools: Python and libxml2-python. I think it's a minor requirement.

See my previous comment. I don't build or validate my files at all. I simply download them from viewcvs, edit them, and send them to Julien for validation (and they ultimately go to Axel for commit). I don't have or use the tools you mentioned. Is it still possible for me to contribute with this scheme?

Let me describe the ways contributors will work with gimp-help:

1. An inexperienced content writer will check out the DocBook files from CVS, edit them and send patches to an experienced content writer, the same way as now.

2. An experienced content writer will accept patches, validate them and commit.

3. An inexperienced translator will check out po files from CVS and translate them in a text editor or even in a web-based tool like Rosetta: https://launchpad.net/rosetta

4. An experienced translator will install Python and libxml2-python, create po files, add content and mark some sections for translation with the lang attribute. They will also accept patches for the po file and, after validating the result, commit them.

I personally agree, but how do you plan to resolve the glossary problem?

Marco, I'm not sure what glossary problem you refer to. :-( Can you elaborate, please?

Me too; I don't quite understand what the problem with the glossary is. The old terms will stay in the same place, although I think glossary entries should be common to all languages.

Between early versions of Gimp and 2.0 (?) there was a major change in the documentation toolset, so I understand. Perhaps this could be considered at the next update of Gimp (2.4?), so the current arrangement isn't disturbed for now. Just a thought.

Well, it won't be a major change. You'll still be able to work with the same tools and process. Only new translators will benefit from it.

Thank you very much for thinking about the issues and coming up with a proposal.

Regards,

Sally


Marco Ciampa
2006-09-22 06:57:26 UTC (over 17 years ago)

Migration path to xml2po

On Fri, Sep 22, 2006 at 08:17:59AM -0400, Sally C. Barry wrote:

Hello Nickolay, Marco and All -

I personally agree, but how do you plan to resolve the glossary problem?

Marco, I'm not sure what glossary problem you refer to. :-( Can you elaborate, please?

The glossary problem is this: the whole manual could (and IMHO should) be a one-to-many translation from English. This is the best thing (TM), because in this manner there is one and only one reference, in a language that everyone can understand. But... the glossary is complicated by the fact that it contains explanations of _native_-language definitions and terms that:

1) do not have the same first letter as in English, so the order is different
2) sometimes exist as a technical term in only one language
3) are sometimes better explained in other languages than in English

The first problem could be resolved by keeping the English definitions as a reference, as I tried to propose in an old message, with the help of a special XML tag like gloss_term_letter="f" (perhaps in a special comment) and some (Python?) scripting to compile a localized glossary, ordered by those foreign letters.
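A rough sketch of such a script, in Python, might look like this. It assumes each glossary entry is keyed by a stable English id and carries its term in every available language; the data layout and the names `build_glossary`, `sort_key` and `ENTRIES` are invented for the illustration, and the accent-stripping key is a simplification rather than proper locale-aware collation.

```python
import unicodedata
from itertools import groupby


def sort_key(term):
    """Case-insensitive key that ignores accents (a simplification of real collation)."""
    decomposed = unicodedata.normalize("NFKD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).casefold()


def first_letter(term):
    """Letter under which the term is filed in the localized glossary."""
    return sort_key(term)[0].upper()


def build_glossary(entries, lang):
    """Return [(letter, [(term, entry_id), ...]), ...] ordered for `lang`.

    Entries without a term in `lang` fall back to the English term.
    """
    localized = [(e["terms"].get(lang, e["terms"]["en"]), e["id"]) for e in entries]
    localized.sort(key=lambda pair: sort_key(pair[0]))
    return [
        (letter, list(group))
        for letter, group in groupby(localized, key=lambda pair: first_letter(pair[0]))
    ]


# Hypothetical sample data: the English id stays the common reference.
ENTRIES = [
    {"id": "alpha-channel", "terms": {"en": "Alpha channel", "de": "Alphakanal"}},
    {"id": "color", "terms": {"en": "Color", "de": "Farbe"}},
    {"id": "layer", "terms": {"en": "Layer", "de": "Ebene"}},
]

english = build_glossary(ENTRIES, "en")
german = build_glossary(ENTRIES, "de")
```

With this sample data the English glossary is filed under A, C, L, while the German one is filed under A, E, F, so "color" ends up under "F" for "Farbe" without any hand-maintained letter attribute.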

The second is simple: erase the terms that exist in just one language or do not exist in English. Perhaps this problem does not apply, since no such terms exist in the glossary right now.

The third is just a matter of asking the translators to contribute to the English version of the manual. If they are afraid of writing bad English, no problem: a native English writer will apply the corrections.

bye

Axel Wernicke
2006-09-22 07:29:28 UTC (over 17 years ago)

Migration path to xml2po

Hi Nickolay,

I'm not familiar with the po concept at all, but isn't it more about translation of strings than about writing huge structured documents like a book?

Is there any reference of a book or documentation developed with the technology you describe?

Greetings, lexA

On 22.09.2006 at 13:35, Nickolay V. Shmyrev wrote:

Hi all

What about moving all of gimp-help to the xml2po way of translation? That will certainly make the translation process more manageable.

Advantages:

1. We'll lower the barrier to contributing translations. The translation process will be faster, since translators won't need to care about DocBook and can just use existing tools.

2. We'll track changes more easily and keep translated content up to date.

3. Translators won't affect each other or the content writers, and will work with their own files.

Disadvantages

1. Translators who use po files will have to follow the English content precisely. Note that it will still be possible to write additional translated content directly in the source files, so I think it's a minor problem.

2. Development will require additional tools: Python and libxml2-python. I think it's a minor requirement.

3. Minor modifications to the content will still be required: you'll need to assign headers.

4. There is currently no way to translate images from po; those translations have to go directly into the content. We'll solve this technical problem later.

Proposal

I propose applying the attached files. Basically, they modify profiling in the following way. If the po file for a language does not exist, it works as before. If the file exists, it profiles for both $lang and en, replaces en with translations according to the po files, and then strips untranslated en content. It works nicely, so I propose just committing it and encouraging new translators to work with po instead, while keeping the existing content in place. For example, we can start translating the menus part into Russian with po files.
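The profiling behaviour described in this proposal can be approximated by the following Python sketch. It is not the attached patch: the function `profile`, the sample document and the dictionary standing in for a po file are assumptions made for the illustration, and it only handles paragraphs with plain text and a single language per lang attribute.

```python
import xml.etree.ElementTree as ET


def profile(root, lang, catalog=None):
    """Keep content for `lang`; with a catalog, fill gaps from translated English.

    catalog is None      -> no po file: keep only elements marked with `lang`.
    catalog is a mapping -> English elements are replaced by their translation
                            when one exists and stripped otherwise.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            child_lang = child.get("lang")
            if child_lang is None or child_lang == lang:
                continue  # language-neutral or native content stays as it is
            translation = None
            if catalog is not None and child_lang == "en":
                translation = catalog.get((child.text or "").strip())
            if translation:
                child.text = translation
                child.set("lang", lang)
            else:
                parent.remove(child)  # other languages and untranslated English
    return root


SOURCE = """<sect1>
  <para lang="en">Open the File menu.</para>
  <para lang="ru">Откройте меню «Файл».</para>
  <para lang="en">Select a brush.</para>
  <para lang="en">Press OK.</para>
  <para lang="de">Wählen Sie einen Pinsel.</para>
</sect1>"""

# Stand-in for the messages a translator has completed in ru.po.
CATALOG = {"Select a brush.": "Выберите кисть."}

with_po = [p.text for p in profile(ET.fromstring(SOURCE), "ru", CATALOG).iter("para")]
without_po = [p.text for p in profile(ET.fromstring(SOURCE), "ru").iter("para")]
```

Without a catalog only the paragraph written directly in Russian survives, as before; with the catalog the translated English paragraph is added and the untranslated one is dropped. A real implementation would also have to decide what happens when an English paragraph has both a native counterpart and a po translation, which this sketch ignores.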

I am waiting for comments. I also remember seeing a discussion about this before, but failed to find it now, so links are welcome.

Nickolay V. Shmyrev
2006-09-22 07:40:05 UTC (over 17 years ago)

Migration path to xml2po

Hi Nickolay,

I'm not familiar with the po concept at all, but isn't it more about translation of strings than about writing huge structured documents like a book?

Is there any reference of a book or documentation developed with the technology you describe?

Greetings, lexA

Right, it's the default framework used by translators; a description can be found in the gettext manual:

http://www.gnu.org/software/gettext/manual/html_mono/gettext.html#SEC8

Thanks to the work done on xml2po, it's possible to use the whole gettext framework while writing DocBook documentation. The current GNOME documentation is developed this way, and it's also very large. It allows translators to solve the problems described in the original letter, such as change tracking and cooperation troubles.

You should probably just try the patch and see how it works; I'll be happy to help with its installation.

Sally C. Barry
2006-09-22 07:49:20 UTC (over 17 years ago)

Migration path to xml2po

Hello Nickolay, Marco and All -

Probably you would like to read about gettext and po files http://www.gnu.org/software/gettext/manual/html_mono/gettext.html#SEC161

Thank you! I'll take a look at it. I've been very curious about po files. I discovered them in the Gimp (C) source code tree and found them to be a great source for the exact command names and keyboard shortcuts, without having to start up Gimp and find the thing in question. It even works for finding foreign commands. :-)

For you as a content writer, the tools will include only editors. Translators will use editors too, but will benefit from the many translation tools such as KBabel and Poedit.

Sounds good for me ...

1. An inexperienced content writer will check out the DocBook files from CVS, edit them and send patches to an experienced content writer, the same way as now.
2. An experienced content writer will accept patches, validate them and commit.

... and other content writers, but ...

3. An inexperienced translator will check out po files from CVS and translate them in a text editor or even in a web-based tool like Rosetta: https://launchpad.net/rosetta
4. An experienced translator will install Python and libxml2-python, create po files, add content and mark some sections for translation with the lang attribute. They will also accept patches for the po file and, after validating the result, commit them.

... isn't this more work for the inexperienced translators? And a *LOT* more work for the experienced translators? And what if there is no experienced translator for a language?

I just don't want to make it more difficult for the new translator of a "new" language to start to work. We shouldn't set the hurdles too high. (Though maybe xml is a pretty high hurdle, to begin with. :-) )

I have been concerned as I make changes to the English, though, that the languages which already have translations in my new file might not revisit it again for a while. I hate to mark every English paragraph with a revision date, but it almost seems necessary. From what I understand of your proposal, that might be a lot easier??

Now, regarding the glossary:

1) do not have the same first letter as in English, so the order is different
2) sometimes exist as a technical term in only one language
3) are sometimes better explained in other languages than in English

Even in the English files I just worked on, the English terms didn't seem to be in alphabetical order in all cases. I just left them that way for now.

The first problem could be resolved by keeping the English definitions as a reference, as I tried to propose in an old message, with the help of a special XML tag like gloss_term_letter="f" (perhaps in a special comment) and some (Python?) scripting to compile a localized glossary, ordered by those foreign letters.

In my recent edits, I saw some MARK comments that already existed. They are at the top of the English definition and point to the definition in other languages, if they're not together with the English one. I added the MARK comments for *ALL* the other foreign terms, but it's not likely to stay up to date. And that isn't really a good tool for having an automatic script come through and re-order everything.

The second is simple: erase the terms that exist in just one language or do not exist in English. Perhaps this problem does not apply, since no such terms exist in the glossary right now.

Well, if it doesn't exist in English, perhaps it should! I don't think I'd want to *exclude* all new definitions. But the proposed new term (and a rough translation of it to English) could be sent to an English editor for inclusion in the English text. (BTW, I just added new terms to the glossary: tile and parasite. :-) )

The third is just a matter of asking the translators to contribute to the English version of the manual. If they are afraid of writing bad English, no problem: a native English writer will apply the corrections.

That works, too.

Like you, I wish there were some way of keeping the terms together for all of the languages and sorting them alphabetically for each language when the files are processed. But I'm not prepared to write such a thing. :-(

Anyway, we seem to be discussing two topics here, although they are related. Does anyone else have any input?

Best,

Sally

Sally C. Barry
2006-09-22 07:58:46 UTC (over 17 years ago)

Migration path to xml2po

Hello again -

The other thing I was thinking about is this. I'm assuming that somehow, the whole thing will be put together to make up one big .xml file at the end. We need to make sure that there is some way to have the input to that file (.po files?) still include the xml tags. It's hard enough going through to put and tags in as it is. :-) We don't want to have to leave that to the software, too.

Sally

Axel Wernicke
2006-09-22 08:04:40 UTC (over 17 years ago)

Migration path to xml2po

Hi all,

As I see it, this is a topic that won't be solved within a couple of posts. Before changing our technology we should consider the pros and cons very carefully. I agree that editing the XML DocBook files is not that easy, but right now I don't see how using the po technology solves that problem. To me it seems that we would just add another non-trivial technology on top of the DocBook XML files?!

Maybe we should have a closer look at other projects doing multi-language documentation in a distributed environment. The GNOME documentation team seems to be well organised - are they doing their work in different languages?

Greetings, lexA

Nickolay V. Shmyrev
2006-09-22 08:08:18 UTC (over 17 years ago)

Migration path to xml2po

On Fri, 22/09/2006 at 11:04 -0400, Sally C. Barry wrote:

Hello again -

The other thing I was thinking about is this. I'm assuming that somehow, the whole thing will be put together to make up one big .xml file at the end. We need to make sure that there is some way to have the input to that file (.po files?) still include the xml tags. It's hard enough going through to put and tags in as it is. :-) We don't want to have to leave that to the software, too.

Sally

The po files can include inline docbook elements like acronym and quote but they shouldn't include things like para and section. The translation should quite precisely follow the original content.
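To make the distinction concrete, here is a small Python sketch of how messages could be pulled out of a DocBook fragment so that block elements such as para and title become message boundaries while inline elements stay inside the message text. It is an illustration only, not how xml2po is implemented; the set `TRANSLATABLE`, the helper names and the sample markup are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical choice of block elements whose content becomes one message each.
TRANSLATABLE = {"title", "para"}


def inner_markup(element):
    """Serialize the content of an element, keeping inline child elements as markup."""
    parts = [element.text or ""]
    # ET.tostring() includes the text that follows each child (its tail).
    parts.extend(ET.tostring(child, encoding="unicode") for child in element)
    return "".join(parts).strip()


def extract_messages(root):
    """Return one message per translatable block element, in document order."""
    return [inner_markup(el) for el in root.iter() if el.tag in TRANSLATABLE]


SOURCE = (
    "<sect1><title>Clock</title>"
    "<para>Click <guibutton>Close</guibutton> to leave the "
    "<application>Clock</application> preferences.</para></sect1>"
)
messages = extract_messages(ET.fromstring(SOURCE))
```

Here the translator would see two messages: the title text, and the paragraph text with its guibutton and application tags still in place but without the surrounding para or sect1.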

For example, look at how the documentation for the GNOME clock applet is made.

Here is the source:

http://cvs.gnome.org/viewcvs/gnome-panel/help/clock/C/clock.xml?rev=1.8&view=markup

Here is po file:

http://cvs.gnome.org/viewcvs/gnome-panel/help/clock/de/de.po?rev=1.5&view=markup

Note that some translations have inline elements like guibutton and application.

Here is the result (de translation):

http://nshmyrev.narod.ru/temp/clock.xml


Nickolay V. Shmyrev
2006-09-22 08:16:51 UTC (over 17 years ago)

Migration path to xml2po

On Fri, 22/09/2006 at 10:55 -0400, Sally C. Barry wrote:

Hello Nickolay, Marco and All -

[snip]

3. An inexperienced translator will check out po files from CVS and translate them in a text editor or even in a web-based tool like Rosetta: https://launchpad.net/rosetta
4. An experienced translator will install Python and libxml2-python, create po files, add content and mark some sections for translation with the lang attribute. They will also accept patches for the po file and, after validating the result, commit them.

... isn't this more work for the inexperienced translators? And a *LOT* more work for the experienced translators? And what if there is no experienced translator for a language?

I just don't want to make it more difficult for the new translator of a "new" language to start to work. We shouldn't set the hurdles too high. (Though maybe xml is a pretty high hurdle, to begin with. :-) )

I have been concerned as I make changes to the English, though, that the languages which already have translations in my new file might not revisit it again for a while. I hate to mark every English paragraph with a revision date, but it almost seems necessary. From what I understand of your proposal, that might be a lot easier??

I don't think so; translators are happy and very familiar with their tools. For a new language, one should just submit a new po file and make some corrections. Newbies, for example, can just use a web site, which is much easier than editing DocBook XML :)

PO files are easier to review, so if there is no translator who can validate the result, we can always ask the GNOME language team to look through it. That's all they are doing anyway. After that, the po file can be committed as usual.

About changes: when you change something in English, the po files will be updated, and the translator will see that a message has changed and will update it too. Until the translator updates the po file, this paragraph will simply be hidden. No need to mark revisions or worry about versions.

Now, regarding the glossary:

1) do not have the same first letter as in English, so the order is different
2) sometimes exist as a technical term in only one language
3) are sometimes better explained in other languages than in English

Even in the English files I just worked on, the English terms didn't seem to be in alphabetical order in all cases. I just left them that way for now.

The first problem could be resolved by keeping the English definitions as a reference, as I tried to propose in an old message, with the help of a special XML tag like gloss_term_letter="f" (perhaps in a special comment) and some (Python?) scripting to compile a localized glossary, ordered by those foreign letters.

In my recent edits, I saw some MARK comments that already existed. They are at the top of the English definition and point to the definition in other languages, if they're not together with the English one. I added the MARK comments for *ALL* the other foreign terms, but it's not likely to stay up to date. And that isn't really a good tool for having an automatic script come through and re-order everything.

The second is simple: erase the terms that exist in just one language or do not exist in English. Perhaps this problem does not apply, since no such terms exist in the glossary right now.

Well, if it doesn't exist in English, perhaps it should! I don't think I'd want to *exclude* all new definitions. But the proposed new term (and a rough translation of it to English) could be sent to an English editor for inclusion in the English text. (BTW, I just added new terms to the glossary: tile and parasite. :-) )

The third is just a matter of asking the translators to contribute to the English version of the manual. If they are afraid of writing bad English, no problem: a native English writer will apply the corrections.

That works, too.

Like you, I wish there were some way of keeping the terms together for all of the languages and sorting them alphabetically for each language when the files are processed. But I'm not prepared to write such a thing. :-(

Anyway, we seem to be discussing two topics here, although they are related. Does anyone else have any input?

I am for the third way :) About sorting, I think it's a docbook-xsl bug; glossary terms should be sorted, and it's probably possible to get them sorted with a little stylesheet tuning. We'll look into it later.

Best,

Sally

Axel Wernicke
2006-09-22 08:17:58 UTC (over 17 years ago)

Migration path to xml2po

Here is the source:

http://cvs.gnome.org/viewcvs/gnome-panel/help/clock/C/clock.xml?rev=1.8&view=markup

Here is po file:

http://cvs.gnome.org/viewcvs/gnome-panel/help/clock/de/de.po?rev=1.5&view=markup

Note that some translations have inline elements like guibutton and application.

Here is the result (de translation):

http://nshmyrev.narod.ru/temp/clock.xml

This looks to me like the birth of translation slaves. I don't like slavery.
You completely lose the context of what you are doing: just single lines of text to be translated without any context. What about images that are language-specific? Do we really want to do half of the DocBook structure in the po files and the other half outside of them? To be honest, right now I'm far from convinced that this will lead us to the future of manual writing.

lexA

William Skaggs
2006-09-22 09:00:31 UTC (over 17 years ago)

Migration path to xml2po

I have always favored splitting the various languages into separate files, because I couldn't imagine hand-editing an xml file with, say, 20 languages all mixed together. If this were done, then I don't see what would prevent having po translations for new languages, while languages that are already far along, such as German, French, Chinese, Italian, and perhaps Czech, could maintain their existing text if they wanted to.

There have already been several cases where people were interested in translating GIMP-Help but were scared away by the xml.

-- Bill



Sally C. Barry
2006-09-22 10:55:56 UTC (over 17 years ago)

Migration path to xml2po

Hello Nickolay and All -

I looked at the URL you cited about gettext. Since I have some previous exposure to the "message catalog" method, and since I've been peeking at the Gimp sources, it was not a complete surprise to me. Thanks again for the good resource.

In short, translations are committed to special text files in the po subdirectory and later merged into the source. When the source is updated, the po files can be updated too. New strings will be marked for translation, and strings that changed slightly will be invalidated.

I see it as a real advantage, if changed lines in the English are automatically flagged in the .po files.

But there is a caution. In doc files in particular, since they are more often in the form of paragraphs than short phrases, arbitrarily moving the line breaks (to fit within the max chars per line) would automatically flag all the translations as invalid, as would minor changes in xml tags or punctuation.
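Whether re-wrapped lines really invalidate a translation depends on how the extraction tool builds its message keys, which should be checked against the tool itself. As a hedged illustration in Python, assuming the tool collapses runs of whitespace before comparing paragraphs (the function `message_id` is hypothetical), re-flowing a paragraph would leave the key unchanged while a punctuation change would not:

```python
def message_id(paragraph):
    """Collapse all runs of whitespace, including line breaks, into single spaces."""
    return " ".join(paragraph.split())


original = "Select a brush from the\nBrushes dialog."
reflowed = "Select a brush\nfrom the Brushes dialog."
edited = "Select a brush from the\nBrushes dialog!"

# Moving the line break does not change the key ...
reflow_is_harmless = message_id(original) == message_id(reflowed)
# ... but a change in punctuation (or in an XML tag) still does.
edit_changes_id = message_id(original) != message_id(edited)
```

Under that assumption only the second half of the concern remains: small edits to tags or punctuation would still flag the existing translations for review.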

Until the translator updates the po file, this paragraph will simply be hidden. No need to mark revisions or worry about versions.

Hmmm. I don't know if this is a good thing or not. English changes a single word in a paragraph and the whole thing disappears from the help files and web site until somebody fixes it in each language? It would be better to have a broken paragraph displayed than a "hole" in the middle of an explanation, I think.

Axel said:

This looks to me like the birth of translation slaves. I don't like slavery.

I understand. I think Gimp docs have always allowed a certain amount of translator discretion. But the alternating paragraphs of the current docs do sort of follow this scheme, anyway, don't they?

You completely lose the context of what you are doing. Just single lines of text to be translated without a context.

The Clock example showed some whole paragraphs, not just single lines. And as for the context, it's certainly no worse than the current alternating paragraphs system.

What about images that are language specific.

I see this as potentially being more of a difficulty. If the translator has to edit the xml file, as well as the po file, to deal with the picture -- then he's *really* likely to lose context. :-)

Bill said:

I have always favored splitting the various languages into separate files, because I couldn't imagine hand-editing an xml file with, say, 20 languages all mixed together.

As long as I was editing the menus/ files with usually just en, de and fr, it wasn't too bad. Now that I'm looking at concepts/ with far more languages than that, even *I* am having trouble keeping the context. Since I don't build the files, they're always a surprise and a delight when I see them on the website. And I've often been surprised at how little English text some of these files actually contain! With all of the languages, it's hard to keep that perspective.

Nickolay said:

current GNOME documentation is developed this way, and it's also very large.

If there's already a doc project done this way, where can we find it? (OK, maybe everybody else knows the answer except me!) It would seem that we could at least look at how good or bad that one is.

Regarding XML and PO hurdles, the translator would have to know both syntaxes, so he could edit the PO and also add XML tags (not to mention possibly updating the XML for images.) Pro: perhaps many translators already know PO format. Con: PO format may be fairly simple, but it's not just a straight text file, either.

As Axel said:

As I see it, this is a topic that won't be solved within a couple of posts. Before changing our technology we should consider the pros and cons very carefully.

True, I don't think it's a simple matter of "let's just accept Nickolay's files and check them in and start using them tomorrow." OK, this is an open source project and nobody really makes a top-down decision. But I'd still like to hear from Roman.

Regards,

Sally

Roman Joost
2006-09-26 00:21:35 UTC (over 17 years ago)

Migration path to xml2po

Hi Nickolay and others,

sorry for replying so late, but I haven't had time to reply earlier.

Let me first say that this discussion always comes up when new contributors join us. We should put some of the conclusions up on our wiki, so we don't have to argue about the same points again and again. Although it can sometimes be good to get new ideas. Anyway ...

Let me please first reply to the disadvantages:

On Fri, Sep 22, 2006 at 03:35:36PM +0400, Nickolay V. Shmyrev wrote:

Disadvantages

1. Translators who use po files will have to follow the English content precisely. Note that it will still be possible to write additional translated content directly in the source files, so I think it's a minor problem.

And IMHO this is our problem: we don't have a finished English manual yet. If we want to use PO files, we need a reference to make a translation for. DocBook lets us do more than produce a plain translation of a reference manual written in one specific language. It gives you (IMHO this is the advantage) the possibility to write your "own" translated manual. Of course the structure is and should be the same, and so is the content. But whether you describe e.g. the quickmask in a single sentence or in a big article is up to you.

So we have currently:
-> more flexibility
-> no reference language to create a translation for

2. Development will require additional tools - python and libxml2-python. I think it's a minor requirement.

Yep - think so as well.

[...] 3.

4. No way to translate images right now from po, translation should go directly to the content. We'll solve this technical problem later.

Excuse my ignorance, but that's always how people come up with new ideas. I love to have new ideas about how we can improve the way we're writing content for the manual. But new technologies also come with a lot of disadvantages.

To make it short: I think we just don't gain any benefit from changing to po. We will run into other problems equivalent to what we have now.

Advantages:

1. We'll lower the barrier to contributing translations. The translation process will be faster, since translators won't have to care about docbook and will just use existing tools.

2. We'll track changes more simply and keep translated content up to date.

3. Translators won't affect each other or the content writers, and will work with their own files.

Yep - everything a *pro* of course.

Proposal

I propose to apply the attached files. Basically, it modifies profiling the following way. If po file for language does not exist, it works as before. If file exists, it profiles for both $lang and en, replaces en with translations according to po files, and then strips untranslated en content. It works nicely, so I propose just to commit it and encourage new translators to work with po instead while keeping the existing content in place. For example, we can start to translate menus part into Russian with po files.

Did I understand that correctly that we can have the advantages of both worlds? (Sorry haven't looked at your patch, but it'll probably cost me another day to reply to your mail then which I don't want.)

Greetings,

Sally C. Barry
2006-09-26 09:49:49 UTC (over 17 years ago)

Migration path to xml2po

Hi All -

Let me first say that this discussion always comes up when new contributors join us. We should put some of the conclusions up on our wiki, so we don't have to argue about some points again and again.

Sorry I missed the previous discussions about it. But you could be right; it might be a good idea to say something about it on the Wiki.

And IMHO this is our problem: we don't have a finished English manual yet. If we want to use PO files, we need a reference to make a translation for.

A very good point. As the only currently (very) active English writer, and a relative newcomer at that, I feel this very acutely.

4. No way to translate images right now from po, translation should go directly to the content. We'll solve this technical problem later.

Excuse my ignorance, but that's always how people come up with new ideas. I love to have new ideas about how we can improve the way we're writing content for the manual. But new technologies also come with a lot of disadvantages.

Am I correct in assuming that this has to do with the example images (menus, results of various operations on picture files, etc.)? If so, I've been thinking that it might be possible to have the program which creates the blank po files include the lines, so they can be "translated". In this case, the fileref= could be changed to include the /fr or /de or /ru, so it points to the localized image file.
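As an illustration only, the fileref idea above could look roughly like the following Python sketch. It assumes the language directory goes directly before the file name (as in the /fr, /de, /ru example), and the function name `localize_filerefs` is made up for this sketch; it is not part of xml2po or of the existing build.

```python
import posixpath
import xml.etree.ElementTree as ET


def localize_filerefs(xml_text, lang):
    """Rewrite every fileref attribute so it points into a per-language directory.

    Hypothetical helper: the placement of the language directory is an assumption.
    """
    root = ET.fromstring(xml_text)
    for node in root.iter():
        fileref = node.get("fileref")
        if fileref is None:
            continue  # element carries no image reference
        directory, name = posixpath.split(fileref)
        # Insert the language code as the last directory before the file name.
        node.set("fileref", posixpath.join(directory, lang, name))
    return ET.tostring(root, encoding="unicode")
```

A translated copy of a document would then reference fr/ images without anybody editing the paths by hand; whether the localized file actually exists still has to be checked separately.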

At any rate, I do agree that it doesn't make too much sense, as long as the English part of the manual is in its current state.

Regards,

Sally

Nickolay V. Shmyrev
2006-09-27 03:36:09 UTC (over 17 years ago)

Migration path to xml2po

On Tue, 26/09/2006 at 09:20 +0200, Roman Joost wrote:

Hi Nickolay and others,

sorry for replying so late, but I haven't had time to reply earlier.

Let me first say that this discussion always comes up when new contributors join us. We should put some of the conclusions up on our wiki, so we don't have to argue about some points again and again. Although it might sometimes be good to get new ideas. Anyways ...

I completely agree with that point. Every change breaks something and introduces new bugs; the migration process is a very important thing and it should be documented. I'll create a wiki page about po vs docbook soon with the results of our discussion here.

Let me please first reply to the disadvantages:

On Fri, Sep 22, 2006 at 03:35:36PM +0400, Nickolay V. Shmyrev wrote:

Disadvantages

1. Translators who use po files will have to follow the English content precisely. Note that it will still be possible to write additional translated content directly in the source files, so I think it's a minor problem.

And IMHO this is our problem: we don't have a finished English manual yet. If we want to use PO files, we need a reference to make a translation for. DocBook lets us do more than produce a plain translation of a reference manual written in one specific language. It gives you (IMHO this is the advantage) the possibility to write your "own" translated manual. Of course the structure is and should be the same, and so is the content. But whether you describe e.g. the quickmask in a single sentence or in a big article is up to you.

Well, I agree completely. But one of the reasons we don't have complete English content is that we are actually working on different contents, as you pointed out above. It gives you the freedom to ignore English completely, and many of us use this freedom, leaving the English content unfinished. If a translator has to follow the English content, he will try to improve it as well. But it's just a thought :)

So we have currently:
-> more flexibility
-> no reference language to create a translation for

2. Development will require additional tools - python and libxml2-python. I think it's a minor requirement.

Yep - think so as well.

[...] 3.

4. No way to translate images right now from po, translation should go directly to the content. We'll solve this technical problem later.

Excuse my ignorance, but that's always how people come up with new ideas. I love to have new ideas about how we can improve the way we're writing content for the manual. But new technologies also come with a lot of disadvantages.

Let me explain the situation with images since Sally is also interested. xml2po has nice support for translation of images - it stores the hash of the image in po file in the following way:

#. When image changes, this message will be marked fuzzy or untranslated for you.
#. It doesn't matter what you translate it to: it's not used at all.
#: C/evince.xml:146(None)
msgid "@@image: 'figures/evince_start_window.png'; md5=7f4da5e33bcac35738a268d93d497d47"
msgstr "@@image: 'figures/evince_start_window.png'; md5=7f4da5e33bcac35738a268d93d497d47"

So if the original image changes, the translator will be aware of it. It also allows transparent switching to the untranslated image when a translated one is not present.
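To make the mechanism concrete, here is a small Python sketch of how such an image marker could be built and checked. It only mirrors the message format shown in the PO excerpt; the function names are invented for this example and this is not xml2po's actual code.

```python
import hashlib
import re

# Pattern for the marker format shown in the PO excerpt.
MARKER = re.compile(r"@@image: '(?P<path>[^']+)'; md5=(?P<md5>[0-9a-f]{32})")


def image_msgid(path, data):
    """Build an image marker from a path and the file's bytes (hypothetical helper)."""
    digest = hashlib.md5(data).hexdigest()
    return "@@image: '%s'; md5=%s" % (path, digest)


def image_changed(recorded_msgid, data):
    """Return True when the bytes no longer match the hash stored in the marker."""
    match = MARKER.match(recorded_msgid)
    if match is None:
        raise ValueError("not an image marker: %r" % recorded_msgid)
    return hashlib.md5(data).hexdigest() != match.group("md5")
```

When the original image is replaced, its hash changes, the msgid changes with it, and the usual gettext merge then flags the old entry, which is how the translator gets notified.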

But the way it works is a bit incompatible with our current way. More precisely, it relies on the following directory structure:

C/figures/image1.png
ru/figures/image1.png

You see that the lang should be at the beginning of the path. Currently we have a slightly different layout:

menus/figures/image1.png
menus/figures/ru/image1.png

So we can't translate images yet, but I think I'll patch xml2po soon so that it uses our layout.
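The difference between the two layouts can be written down as a short Python sketch. The function names are made up for illustration, and the fallback in `pick_image` is only an assumption about how "use the untranslated image when no translated one exists" might be handled for our layout.

```python
import os
import posixpath


def xml2po_image_path(fileref, lang):
    """Layout xml2po expects: language directory first, e.g. ru/figures/image1.png."""
    return posixpath.join(lang, fileref)


def gimp_image_path(fileref, lang):
    """Layout described above: language directory just before the file name,
    e.g. menus/figures/ru/image1.png."""
    directory, name = posixpath.split(fileref)
    return posixpath.join(directory, lang, name)


def pick_image(fileref, lang, exists=os.path.exists):
    """Use the localized image if it is present, otherwise keep the original."""
    localized = gimp_image_path(fileref, lang)
    return localized if exists(localized) else fileref
```

For example, `gimp_image_path("menus/figures/image1.png", "ru")` gives `menus/figures/ru/image1.png`, while `xml2po_image_path("figures/image1.png", "ru")` gives `ru/figures/image1.png`.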

To make it short: I think we just don't gain any benefit from changing to po. We will run into other problems equivalent to what we have now.

Advantages:

1. We'll lower the barrier to contributing translations. The translation process will be faster, since translators won't have to care about docbook and will just use existing tools.

2. We'll track changes more simply and keep translated content up to date.

3. Translators won't affect each other or the content writers, and will work with their own files.

Yep - everything a *pro* of course.

Proposal

I propose to apply the attached files. Basically, it modifies profiling the following way. If po file for language does not exist, it works as before. If file exists, it profiles for both $lang and en, replaces en with translations according to po files, and then strips untranslated en content. It works nicely, so I propose just to commit it and encourage new translators to work with po instead while keeping the existing content in place. For example, we can start to translate menus part into Russian with po files.

Did I understand that correctly that we can have the advantages of both worlds? (Sorry haven't looked at your patch, but it'll probably cost me another day to reply to your mail then which I don't want.)

Yes, please look at it; it really tries to merge both worlds instead of replacing one with the other. Thus we will be able to keep the previous content and the previous way of translating, and we will also be able to use the new way. And I completely agree that we should identify the problems first and try to solve them, not just do a migration because somebody else did it.
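The "both worlds" behaviour from the proposal boils down to one decision per language. The Python sketch below only models that decision; the step names, the `po/<lang>.po` location and the function itself are assumptions for illustration and are not taken from the attached patch.

```python
import os


def profiling_steps(lang, po_dir="po", exists=os.path.exists):
    """Return the build steps for one language, following the proposal."""
    po_file = os.path.join(po_dir, lang + ".po")  # hypothetical file location
    if not exists(po_file):
        # No PO file for this language: profile exactly as before.
        return ["profile:" + lang]
    return [
        "profile:%s+en" % lang,        # keep both $lang and en content
        "translate-en:" + po_file,     # replace en text using the PO file
        "strip-untranslated-en",       # drop en content that has no translation
    ]
```

A language without a PO file keeps its current build untouched, so existing translations are not affected until somebody adds a PO file for them.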

Greetings,

William Skaggs
2006-09-27 07:43:47 UTC (over 17 years ago)

Migration path to xml2po

Well, having spent a lot of time in 2004 generating English documentation, I believe the main reason there isn't more of it is the pain of writing raw xml. I believe I would have been able to write twice as much material in a sane format such as LaTeX. In this modern age, it seems ridiculous to me that our "sophisticated" system is such a burden to use.

In short, the thing that is most of all needed is some system that will allow people to write original documentation without having to hand-code all the xml cruft.

-- Bill


______________ ______________ ______________ ______________ Sent via the CNPRC Email system at primate.ucdavis.edu

Marco Ciampa
2006-09-27 08:17:15 UTC (over 17 years ago)

Migration path to xml2po

On Tue, Sep 26, 2006 at 09:20:38AM +0200, Roman Joost wrote:

Disadvantages

1. Translators who use po files will have to follow the English content precisely. Note that it will still be possible to write additional translated content directly in the source files, so I think it's a minor problem.

And IMHO this is our problem: we don't have a finished English manual yet. If we want to use PO files, we need a reference to make a translation for. DocBook lets us do more than produce a plain translation of a reference manual written in one specific language. It gives you (IMHO this is the advantage) the possibility to write your "own" translated manual. Of course the structure is and should be the same, and so is the content. But whether you describe e.g. the quickmask in a single sentence or in a big article is up to you.

So we have currently:
-> more flexibility
-> no reference language to create a translation for

Well, actually I see these two as disadvantages, for the same reasons but from a different point of view:

1) more flexibility. I do not see any point in writing different versions of the manual for a single program like GIMP, which moreover is written with its reference commands and strings in English, so English is the _natural_ reference. You, like everyone else, are free to fork and create a DE-GIMP, an IT-GIMP, a FR-GIMP, but again, I do not see any point in differentiating the functions and commands by language. There is no fork in view any time soon, and I really doubt that anyone will ever think of doing one in the future, so this "flexibility" is totally useless and actually more of a drawback.

Instead I see many people in trouble, since they have to redesign the manual again and again for every language because other people concentrate on improving their mother-tongue version more than the reference. English is not my mother tongue, you see, but I see it more as a lingua franca, a tool, much more useful for all translators than for its own sake.

2) no reference language. So we must decide on a reference language! Because we really _need_ a reference language for writing the reference manual. And above all, we need a reference for the manual. I do not see any future without a reference, simply because the languages in the manual are growing every month and I, for example, do _not_ want to write a different manual for Italian. I just want to _translate_ the manual into Italian.

Really great work has been done on the manual structure (many thanks, I couldn't have done such good work), work that is good for every language; in fact 95% or more of the manual has the same structure in every language it is translated into. I'm convinced that I, as an Italian, like Chinese or Russian people for example, am more confident translating text from English than from any other language, and again, we must think more of the whole international project than of a single version of it. We must think global, the good globalization of free/open source!

2. Development will require additional tools - python and libxml2-python. I think it's a minor requirement.

Yep - think so as well.

[...] 3.

4. No way to translate images right now from po, translation should go directly to the content. We'll solve this technical problem later.

Excuse my ignorance, but that's always how people come up with new ideas. I love to have new ideas about how we can improve the way we're writing content for the manual. But new technologies also come with a lot of disadvantages.

This is _not_ a new idea. Many other groups have already solved the problem we face; we just need the courage to first admit it is a problem in order to have any chance of solving it.

To make it short: I think we just don't gain any benefit from changing to po. We will run into other problems equivalent to what we have now.

This is your opinion, and no one in any other translation group shares it. Whenever I talk about the gimp manual, describing it as a single multilanguage set of docbook xml files, everyone laughs at me...

Advantages:

1. We'll lower the barrier to contributing translations. The translation process will be faster, since translators won't have to care about docbook and will just use existing tools.

2. We'll track changes more simply and keep translated content up to date.

3. Translators won't affect each other or the content writers, and will work with their own files.

Yep - everything a *pro* of course.

Proposal

I propose to apply the attached files. Basically, it modifies profiling the following way. If po file for language does not exist, it works as before. If file exists, it profiles for both $lang and en, replaces en with translations according to po files, and then strips untranslated en content. It works nicely, so I propose just to commit it and encourage new translators to work with po instead while keeping the existing content in place. For example, we can start to translate menus part into Russian with po files.

Did I understand that correctly that we can have the advantages of both worlds? (Sorry haven't looked at your patch, but it'll probably cost me another day to reply to your mail then which I don't want.)

I vote YES, PLEASE!

Axel Wernicke
2006-09-27 08:39:56 UTC (over 17 years ago)

Migration path to xml2po

Hi,

-------- Original Message --------
Date: Wed, 27 Sep 2006 07:43:25 -0700
From: "William Skaggs"
To: gimp-docs@lists.XCF.Berkeley.EDU
Subject: Re: [Gimp-docs] Migration path to xml2po

Well, having spent a lot of time in 2004 generating English documentation, I believe the main reason there isn't more of it is the pain of writing raw xml. I believe I would have been able to write twice as much material in a sane format such as LaTeX. In this modern age, it seems ridiculous to me that our "sophisticated" system is such a burden to use.

In short, the thing that is most of all needed is some system that will allow people to write original documentation without having to hand-code all the xml cruft.

There are tools which make writing xml very easy. Even WYSIWYG editors are available. Unfortunately none of them fulfills all of our requirements. I once suggested having a closer look at http://www.xmlmind.com/xmleditor/ - it has everything one needs and is free (although not open source), but it does some xml formatting which didn't suit the taste of our vim users :)

greetings,

lexA


Roman Joost
2006-09-28 00:29:12 UTC (over 17 years ago)

Migration path to xml2po

On Wed, Sep 27, 2006 at 02:35:43PM +0400, Nickolay V. Shmyrev wrote:

On Tue, 26/09/2006 at 09:20 +0200, Roman Joost wrote:

Proposal

I propose to apply the attached files. Basically, it modifies profiling the following way. If po file for language does not exist, it works as before. If file exists, it profiles for both $lang and en, replaces en with translations according to po files, and then strips untranslated en content. It works nicely, so I propose just to commit it and encourage new translators to work with po instead while keeping the existing content in place. For example, we can start to translate menus part into Russian with po files.

Did I understand that correctly that we can have the advantages of both worlds? (Sorry haven't looked at your patch, but it'll probably cost me another day to reply to your mail then which I don't want.)

Yes, please look at it; it really tries to merge both worlds instead of replacing one with the other. Thus we will be able to keep the previous content and the previous way of translating, and we will also be able to use the new way. And I completely agree that we should identify the problems first and try to solve them, not just do a migration because somebody else did it.

Well, that sounds great actually. I still want to rely on the XML only approach as it is now, but there are a lot of authors (like Marco for example) who want to use the po strategy.

Having said that, the only point which keeps me from saying "GO!" is the fact that using both strategies could lead to a conglomerate of .po files and xml files. To be pessimistic: a whole mess. Have you thought about how we can prevent people from messing this up, or how to manage it? Has any of the authors thought about this? Marco - do you think, if we introduce the xml2po approach, that you will switch to xml2po completely, or do you want to use it only for future work?

I'm very much in favor of taking this step. It seems to support both kinds of authors.

Also, thanks a lot for spending time on making the results of this discussion available on our wiki page.

Greetings,

Nickolay V. Shmyrev
2006-09-28 03:06:55 UTC (over 17 years ago)

Migration path to xml2po

Well, that sounds great actually. I still want to rely on the XML only approach as it is now, but there are a lot of authors (like Marco for example) who want to use the po strategy.

Having said that, the only point which keeps me from saying "GO!" is the fact that using both strategies could lead to a conglomerate of .po files and xml files. To be pessimistic: a whole mess. Have you thought about how we can prevent people from messing this up, or how to manage it? Has any of the authors thought about this? Marco - do you think, if we introduce the xml2po approach, that you will switch to xml2po completely, or do you want to use it only for future work?

Actually I am also afraid of such a mess. Probably some languages will switch to xml2po completely, and that will help a bit; others will stay as they are. I've created http://wiki.gimp.org/gimp/GimpDocsWorkflow with some extracts from our discussion; feel free to complete this page.

I've also created a pot file for the menus subdirectory. It's at

http://nshmyrev.narod.ru/temp/menus.pot.gz

I see the menus part isn't translated into Russian and Italian, so we can start with it to identify the problems we'll have. Translations of the pot file above are appreciated.
