PART TWO: The Most Efficient Translation Approach

In Part One, we discovered a host of ways to translate content in Sitecore. Regrettably, none of them is very scalable, efficient or reliable.

Leaving content in Sitecore and inviting translators into the Content Editor is risky, requires training, and denies your translators access to their specialized tools, such as Translation Memory and desktop editing tools. These tools provide tremendous efficiencies, so you really DO want to let your translators use them.

So it’s widely agreed that the content must be sent over to your translation firm. The vast majority of clients resort to a manual copy-and-paste approach, but for larger-scale or quick-turnaround projects, this approach typically fails.

Let’s examine what it would take to automate this approach. Wisely, Sitecore does offer an out-of-the-box XML export feature. However, it’s an all-or-nothing dump, and it wraps all of the content into one big file. That’s barely survivable if you’re using one translator for just one target language. But imagine that you want to translate into several languages, or want to use multiple resources to speed up the process. You’d either have to split that file up manually (probably more painful than an item-by-item copy-and-paste effort) or export all the content more than once and forward these multiple mega-files to various resources.

So the XML export idea is out. Too bad, because XML is a handy file format, widely embraced by both translation firms and translation technologies.

Now, the good news is that Sitecore is pretty extensible, with well-documented APIs and powerful pipeline logic schemas. For the crafty developer, the logical step is to create some sort of export process that gathers up content and spits it out the back of Sitecore in an XML file for delivery to the translators. Clay Tablet has built just such a plug-in for Sitecore that does all of what I’ll describe here and much more. But to provide more insight into the best practices of CMS<>Translation integrations, I’ll dig into a host of the considerations here, without giving away too much of our rocket sauce recipe.

Step One: Export content.

Sounds easy, but consider these challenges. You’ll need some sort of UI so content editors can choose their target languages and set requested completion dates. You’ll also need email notifications to both the translation firm and the content editors upon content return, and some means of determining the destination for the content.

Then pay some attention to the architecture of the export process. You’ll possibly want to bundle items together into “projects”, but you’ll need to be able to track which items go where once they are returned. Oh, and mind versioning too: you’ll want to ensure that everything stays aligned by version and target language. And be careful about which content fields you do and do not send for export. While you’re at it, you might want to build some functionality around filtering fields, in case there are times you don’t want to send certain content fields out for translation (dimensions or currency, for example).
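To make those export considerations a little more concrete, here is a minimal sketch in Python. It is not how Sitecore or our connector does it. The `Item` class, the XML element names, and the `export_project` function are all made up for illustration. It shows three of the ideas above: one file per target language, item ID and version carried along with every item so content can be matched up on return, and a simple field filter.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for a CMS item; not a Sitecore API type."""
    item_id: str
    version: int
    fields: dict  # field name -> source-language text


def build_export(project_id, items, source_lang, target_lang, excluded_fields=()):
    """Build one XML document (as a string) for a single target language."""
    root = ET.Element("project", id=project_id, source=source_lang, target=target_lang)
    for item in items:
        # Item ID and version travel with the content so it can be re-inserted later.
        node = ET.SubElement(root, "item", id=item.item_id, version=str(item.version))
        for name, text in item.fields.items():
            # Skip filtered fields (prices, dimensions, ...) and empty fields.
            if name in excluded_fields or not text.strip():
                continue
            field = ET.SubElement(node, "field", name=name)
            field.text = text
    return ET.tostring(root, encoding="unicode")


def export_project(project_id, items, source_lang, target_langs, excluded_fields=()):
    """Return a dict of target language -> XML string, one file per language."""
    return {
        lang: build_export(project_id, items, source_lang, lang, excluded_fields)
        for lang in target_langs
    }
```

Calling `export_project("launch-1", items, "en", ["fr-FR", "de-DE"], excluded_fields={"Price"})` would give you two separate payloads, each of which can go to a different translator. A real export would also have to deal with rich text, shared fields, and media, which this sketch ignores.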

Step Two: Deliver to your Translation Provider.

This sounds easy, and if you just export to XML and dump the files on a drive to be emailed over, it sort of is. But it’s a long way from best practices. For real efficiency wins, sort out how to deliver those files DIRECTLY into the translation management system (TMS) of your translation provider. Most leading TMSs have open APIs that you can code to, allowing this automation. You’ll need to do some research: SDL’s TMS is quite different from their WorldServer product, which is utterly unrelated to TransPerfect’s GlobalLink or to open source solutions like GlobalSight from Welocalize.

Beware of too tight an integration though, or your integration will suffer as the various technologies progress through various versions. The best architecture we’ve found, shown below, is to use an intermediary layer that abstracts the Sitecore connection from the translation connection. This allows systems to be mixed and matched, and prevents version updates for either system from requiring a ground-up rebuild of your integration. Throw a bit of routing logic into that connectivity platform and you can send content to various translation providers based on things like target language. Oh, and pay some attention to security: use an SSL connection and encrypt files while they’re in transit to be safe.

[Figure: suggested connectivity platform]
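As a rough illustration of the routing idea, here is a small Python sketch of an intermediary layer. Everything in it is an assumption for the sake of the example: the `TranslationRouter` class, the provider names, and the idea that each provider connector is just a function taking a target language and a payload. A real connector would wrap that provider’s own API over an encrypted connection.

```python
from typing import Callable, Dict

# A connector is any callable that delivers a payload for a target language.
# Real connectors would wrap a provider's API; these are hypothetical.
Connector = Callable[[str, str], None]


class TranslationRouter:
    """Sits between the CMS export and the provider connectors."""

    def __init__(self, connectors: Dict[str, Connector],
                 routes: Dict[str, str], default_provider: str):
        # Fail early if a route points at a provider we have no connector for.
        for provider in list(routes.values()) + [default_provider]:
            if provider not in connectors:
                raise ValueError(f"no connector registered for {provider!r}")
        self.connectors = connectors
        self.routes = {lang.lower(): provider for lang, provider in routes.items()}
        self.default_provider = default_provider

    def provider_for(self, target_lang: str) -> str:
        """Match the full code first (fr-CA), then the primary language (fr)."""
        lang = target_lang.lower()
        primary = lang.split("-")[0]
        return self.routes.get(lang) or self.routes.get(primary) or self.default_provider

    def dispatch(self, exports: Dict[str, str]) -> Dict[str, str]:
        """Send each language's payload to its provider; return who got what."""
        sent = {}
        for lang, payload in exports.items():
            provider = self.provider_for(lang)
            self.connectors[provider](lang, payload)
            sent[lang] = provider
        return sent
```

The point of the design is that the CMS side only ever talks to the router. Swapping a provider, or upgrading one connector, means changing one entry in `connectors` rather than rebuilding the export.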

Step Three: Return Content into Sitecore.

Although this is the last step, it’s actually the hardest by far. You’ll have to architect a means to track every version of every item and every language version too, so that you can reliably reinsert the translated content back into each field in every item. You’ll need to create target language versions as well, but beware how and when you do this. Nothing bogs down a Sitecore Content Editor server more than the automated creation of 10 language versions of 500 items.
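Here is a minimal Python sketch of the return trip, under some loud assumptions. The XML layout (a `project` element with a `target` attribute, containing `item` elements with `id` and `version`, containing `field` elements) is invented for this example. The `store` dictionary stands in for the CMS, and `current_versions` stands in for a lookup of each item’s current source version. The sketch shows two of the concerns above: matching returned content to the right item, version, language and field, and writing in small batches rather than all at once.

```python
import xml.etree.ElementTree as ET
from itertools import islice


def parse_return(xml_text):
    """Yield (item_id, version, target_lang, field_name, text) records."""
    root = ET.fromstring(xml_text)
    target = root.get("target")
    for item in root.iter("item"):
        for field in item.iter("field"):
            yield (item.get("id"), int(item.get("version")),
                   target, field.get("name"), field.text or "")


def batched(records, size):
    """Split any iterable into lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def import_translations(xml_text, current_versions, store, batch_size=50):
    """Write translations into `store`; return IDs of items skipped as stale.

    current_versions: item_id -> current source version (hypothetical lookup)
    store: dict keyed by (item_id, target_lang, field_name), standing in for the CMS
    """
    stale = set()
    for chunk in batched(parse_return(xml_text), batch_size):
        for item_id, version, lang, name, text in chunk:
            # If the source changed while the content was out for translation,
            # do not overwrite; flag the item for a human to look at instead.
            if current_versions.get(item_id) != version:
                stale.add(item_id)
                continue
            store[(item_id, lang, name)] = text
        # A real integration could pause or yield to other work between batches.
    return sorted(stale)
```

In a real system the write step would create the target language version and save the fields through the CMS API, and the batch size would be tuned so the authoring server stays responsive.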

Lastly, take the handy workflow you’ll have created to allow Step One, and modify it to allow internal reviews before publishing. If the translated content needs changing, be sure to create a means to send those changes BACK to your translation provider so they can update their translation memories. Otherwise, your next round of translation will recreate the errors you just fixed.
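One simple way to find what needs to go back is to compare what the provider delivered with what your reviewers approved. The Python sketch below assumes you have kept three dictionaries keyed by item ID and field name: the source text, the delivered translation, and the reviewed translation. The function name and record layout are made up for this example, and the format your provider actually wants for translation memory updates will vary.

```python
def collect_corrections(source, delivered, reviewed):
    """Return one record per field that reviewers changed.

    All three arguments are dicts keyed by (item_id, field_name).
    """
    corrections = []
    for key, new_text in reviewed.items():
        old_text = delivered.get(key)
        # Only report fields the provider actually translated and reviewers edited.
        if old_text is not None and new_text != old_text:
            corrections.append({
                "item_id": key[0],
                "field": key[1],
                "source": source.get(key, ""),
                "delivered": old_text,
                "corrected": new_text,
            })
    return corrections
```

Each record pairs the source text with the corrected translation, which is the information a provider needs to update the matching translation memory entry.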

If all this sounds like a whole lot of coding and learning by trial and error, you’re absolutely right. And we haven’t even broached topics like status monitoring, queue management, and batch gathering and filtering! Clay Tablet spent well over 3000 development hours and learned from dozens of client installs worldwide as we built our Sitecore Translation Connector.

But frankly, this is precisely what it takes to efficiently move large but specific batches of content to and from various translation providers in a timely, secure, scalable and reliable fashion. And all the “nice to have” features I’ve alluded to become utterly critical in the heat of battle, as you’re trying to track down that one last page of content that has to be back from translation before you can launch the new product website tomorrow! If you have a stable, reliable, scalable integration with your translation providers, though, you’ll make that deadline, no problem.