Markup in translations? - Kore Nordmann

Markup in translations?

In web applications that I want to make accessible to users who speak different languages, I normally have 3 types of strings / texts:

Markup in static contents

Static contents, like those on the contact information page, may contain some markup, or at least some links. If you don't want to use markup there, you have to put all links (as it is done there) below the real text, so the translator won't see them.
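The "links below the text" approach could be sketched roughly like this in Python. The function name `render_static_page` and its parameters are made up for illustration. The assumption is that the translator only delivers plain text and link labels, while the URLs come from the application:

```python
from html import escape


def render_static_page(text: str, links: list[tuple[str, str]]) -> str:
    """Render markup-free translated text, followed by a list of links.

    `text` is the translated plain text (paragraphs separated by blank lines),
    `links` is a list of (label, url) pairs. Both names are hypothetical.
    """
    # The translated text is escaped completely, so it cannot carry markup.
    paragraphs = [f"<p>{escape(p)}</p>" for p in text.split("\n\n") if p.strip()]
    # The links are appended below the text instead of being embedded in it.
    items = [
        f'<li><a href="{escape(url, quote=True)}">{escape(label)}</a></li>'
        for label, url in links
    ]
    return "\n".join(paragraphs + ["<ul>"] + items + ["</ul>"])
```

The price is visible right away: a link can never sit inside the sentence it belongs to.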

I just saw my brother translating text for his website, which runs on top of Zope 3. He used (X)HTML for both the original and the translated text. The application of course outputs it unfiltered and unescaped (otherwise the markup would not work), so translators could use it to introduce XSS, unlikely as that is. This is no problem in his case, because he is the only one translating the contents...
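To make the trade-off concrete, here is a small Python sketch (not the Zope 3 code, and the translated string is invented): escaping the translation neutralises an injected script, but also turns the intended link into literal text.

```python
from html import escape

# Hypothetical translated string: intended markup plus an injected script.
translated = 'See the <a href="/contact">contact page</a>.<script>alert(1)</script>'

# Output as-is: the link works, but so would the script.
unescaped = translated

# Escaped output: the script is harmless, but the link is shown as plain text.
escaped = escape(translated)

print(escaped)
```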

Which markup language to use?

(X)HTML is a quite domain-specific markup language, though one known by a lot of translation agencies. But the attack vector and the domain-specific nature of the language make it somewhat awkward to use. You could of course filter the (X)HTML using HTMLPurifier or something similar.
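As a rough idea of what such filtering means, here is a minimal allowlist filter in Python, built on the standard library's `html.parser`. This is not how HTMLPurifier works internally, and it is far less thorough. The allowlist, the permitted URL schemes, and the names `TranslationFilter` and `filter_translation` are assumptions chosen for the example:

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlist: tags a translator may use, and the attributes allowed on each.
ALLOWED_TAGS = {"a": {"href"}, "em": set(), "strong": set(), "p": set(), "br": set()}
# Assumed set of acceptable link targets.
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class TranslationFilter(HTMLParser):
    """Re-emit only allowlisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is still kept
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_TAGS[tag] or value is None:
                continue  # drops e.g. onclick
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # drops e.g. javascript: URLs
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag != "br":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def filter_translation(markup: str) -> str:
    parser = TranslationFilter()
    parser.feed(markup)
    parser.close()  # flushes any buffered text
    return "".join(parser.out)
```

For example, `filter_translation('<a href="javascript:alert(1)">x</a>')` keeps the link text but drops the `href`. Note that the sketch does not balance or nest tags correctly, which is one of the things a real filter has to take care of.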

Are there any other, better markup languages to use in this case? BBCode, wiki markup or RST won't work either, I suppose, because only very few translation agencies will know them. A translation may sometimes make it necessary to completely restructure a sentence or even a whole paragraph, so the translator needs to understand the markup.

Which markup language do you use in such cases?