Problems with WF TMX files Thread poster: John Fossey
John Fossey Canada Local time: 01:51 ProZ.com member since 2008 French to English + ...
A posting has been made on the Wordfast Yahoogroups forum about an agency that doesn't want translators to use WF because of "technical problems" with the TMX created by WF.
I have noticed major problems with importing TMX files created by WFC into Studio 2011. Studio will typically import a few hundred TUs and fail with an error about an unexpected token or invalid character. Of course, Studio fails the import completely, while other tools will skip the bad segment and continue. So I will often have to import the TMX into another tool that is more forgiving, such as Workbench or Olifant, and export it again, in order to successfully import it into Studio.
So from my experience there does appear to be a problem with the TMX files created by WFC. Has anyone else had this experience?
esperantisto Local time: 09:51 ProZ.com member since 2006 English to Russian + ... SITE LOCALIZER
... It's barking up the wrong tree. Using WF TMX files with OmegaT is not problematic, thus, the problems are probably about Trados.
John Fossey Canada Local time: 01:51 ProZ.com member since 2008 French to English + ... THREAD POSTER Good to have feedback | Feb 19, 2013
esperantisto wrote:
... It's barking up the wrong tree. Using WF TMX files with OmegaT is not problematic, thus, the problems are probably about Trados.
Thanks for the feedback. In that case, the complaint by the agency could well be a problem with their software as well.
FarkasAndras Local time: 07:51 English to Hungarian + ... How do you know? | Feb 19, 2013
esperantisto wrote:
... It's barking up the wrong tree. Using WF TMX files with OmegaT is not problematic, thus, the problems are probably about Trados.
The fact that OmT accepts the files doesn't necessarily mean that they are good files (i.e. that they meet the TMX spec). It may just be that OmT is very permissive.
Studio is unfortunately very picky when it comes to accepting TMX files, rejecting many files that other tools, including earlier Trados versions, accept. Some of those files are good, some of them not so much (i.e. they are bad, they just aren't malformed enough to fail completely on other tools).
It'd be interesting to know what category the WF files fall into and what the exact problem is.
All that said, SDL should get its act together and write a TMX import filter that can skip malformed segments and move on with the import while issuing meaningful error messages - much like they would do well to write a doc/docx export filter that can tolerate certain flaws in the sdlxliff. Even partial, mangled but completed operations would be better than the current practice of leaving the user high and dry - better by a long shot.
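The kind of tolerant import described above can be sketched in a few lines of Python. This is only an illustration and not how Studio or any other CAT tool works: the function name `import_tolerantly` and the sample data are made up, the unescaped `&` stands in for whatever the real defect in a given file may be, and the regex assumes that `<tu>` elements are never nested.

```python
import re
import xml.etree.ElementTree as ET

# Matches one <tu ...>...</tu> block; assumes <tu> elements are never nested.
TU_PATTERN = re.compile(r"<tu\b.*?</tu>", re.DOTALL)


def import_tolerantly(tmx_text):
    """Parse each translation unit on its own; return (good_units, errors)."""
    good, errors = [], []
    for index, match in enumerate(TU_PATTERN.finditer(tmx_text), start=1):
        try:
            good.append(ET.fromstring(match.group(0)))
        except ET.ParseError as exc:
            # Keep going, but remember which unit failed and why.
            errors.append((index, str(exc)))
    return good, errors


# Hypothetical sample: the second unit contains an unescaped ampersand.
sample = """<tmx version="1.4"><body>
<tu><tuv xml:lang="fr"><seg>Bonjour</seg></tuv><tuv xml:lang="en"><seg>Hello</seg></tuv></tu>
<tu><tuv xml:lang="fr"><seg>R&D</seg></tuv><tuv xml:lang="en"><seg>R&D</seg></tuv></tu>
<tu><tuv xml:lang="fr"><seg>Merci</seg></tuv><tuv xml:lang="en"><seg>Thanks</seg></tuv></tu>
</body></tmx>"""

good, errors = import_tolerantly(sample)
print(len(good), "units imported;", len(errors), "skipped")
for index, message in errors:
    print(f"TU #{index}: {message}")
```

Parsing unit by unit trades strictness for robustness: a single bad segment costs one TU instead of the whole import, and the error list tells the user which unit to repair.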
[Edited at 2013-02-19 18:35 GMT]
Milan Condak
John Fossey wrote:
So from my experience there does appear to be a problem with the TMX files created by WFC.
I have to edit it. The text in "" is not visible.
1. Standard TMX is in UTF-8 and this is declared:
""
""
""
(from LF-Aligner)
2. TMX from WFC 6.03 looks so:
""
""
Some CATs can recognize that data are in Unicode, some not.
3. New tool WfConvertor converts data from more formats into TXT Wordfast TM in Unicode. Conversion from TM into TMX is in non-standard Unicode.
4. The solution is very simple, to use older tool WF2TMX and on radio-button select an encoding UTF-8. The conversion is very fast and clear, see declaration of TMX:
""
""
"
FarkasAndras Local time: 07:51 English to Hungarian + ...
You need to use character references for < and > to get them to show up.
Like so:
Milan Condak wrote:
John Fossey wrote:
So from my experience there does appear to be a problem with the TMX files created by WFC.
I have to edit it. The text in "<" and ">" is not visible.
1. Standard TMX is in UTF-8 and this is declared:
"<?xml version="1.0" encoding="utf-8" ?>"
"<!DOCTYPE tmx SYSTEM "tmx14.dtd">"
"<tmx version="1.4">"
(from LF-Aligner)
2. TMX from WFC 6.03 looks so:
"<?xml version="1.0" ?>"
"<tmx version="1.4">"
Some CATs can recognize that data are in Unicode, some not.
3. New tool WfConvertor converts data from more formats into TXT Wordfast TM in Unicode. Conversion from TM into TMX is in non-standard Unicode.
4. The solution is very simple, to use older tool WF2TMX and on radio-button select an encoding UTF-8. The conversion is very fast and clear, see declaration of TMX:
"<?xml version="1.0" encoding="UTF-8"?>"
"<tmx version="1.4">"
"<header"
creationtool="Wf2Tmx.exe"
creationtoolversion="1.0.11.41"
---
Milan Condak
Czech WF Trainer
[Edited at 2013-02-19 19:12 GMT]
[Edited at 2013-02-19 19:13 GMT]
[Edited at 2013-02-19 19:14 GMT]
[Edited at 2013-02-19 19:37 GMT]
esperantisto Local time: 09:51 ProZ.com member since 2006 English to Russian + ... SITE LOCALIZER
FarkasAndras wrote:
The fact that OmT accepts the files doesn't necessarily mean that they are good files (i.e. that they meet the TMX spec). It may just be that OmT is very permissive.
Good point, still…
Studio is unfortunately very picky
This only confirms that the problem is about Studio.
Anyway, a question to John Fossey: how did you produce the TMX file(s)? By export using the WF data editor? If yes, try converting the source WF translation memory using Olifant. Or try exporting it using Anaphraseus. Obviously, no guarantee…
esperantisto Local time: 09:51 ProZ.com member since 2006 English to Russian + ... SITE LOCALIZER Lemme correct it | Feb 20, 2013
Milan Condak wrote:
1. Standard TMX is in UTF-8 and this is declared:
Code:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE tmx SYSTEM "tmx14.dtd">
<tmx version="1.4">
(from LF-Aligner)
2. TMX from WFC 6.03 looks so:
Code:
<?xml version="1.0" ?>
<tmx version="1.4">
Some CATs can recognize that data are in Unicode, some not.
3. New tool WfConvertor converts data from more formats into TXT Wordfast TM in Unicode. Conversion from TM into TMX is in non-standard Unicode.
4. The solution is very simple, to use older tool WF2TMX and on radio-button select an encoding UTF-8. The conversion is very fast and clear, see declaration of TMX:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
<header
creationtool="Wf2Tmx.exe"
creationtoolversion="1.0.11.41"
Hint: use & l t ; (without spaces) for the less-than sign and & g t ; (without spaces) for the greater-than sign. Note that post preview will ruin it.
P. S. This forum allows using code tags but handles them wrongly. I’m going to submit a support ticket.
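For anyone who would rather not type the character references from the hint by hand, Python's standard `html.escape` function produces them. This is just a small sketch; the sample line is the declaration quoted earlier in the thread.

```python
from html import escape

# One of the TMX declaration lines quoted in this thread.
line = '<?xml version="1.0" encoding="utf-8" ?>'

# quote=False leaves the double quotes alone and escapes only &, < and >.
print(escape(line, quote=False))
# prints: &lt;?xml version="1.0" encoding="utf-8" ?&gt;
```

The escaped text can then be pasted into a forum post that would otherwise swallow anything between angle brackets.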
Milan Condak Presentation | Feb 20, 2013
esperantisto wrote:
Hint: use & l t ; (without spaces) for the less-than sign and & g t ; (without spaces) for the greater-than sign. Note that post preview will ruin it.
I created a presentation: WF2TMX, Unicode vs. UTF-8
http://condak.net/tmx/wfconverter/cs/00.html
I hope that it makes clear how to import a TMX into WFC and how to import a TMX converted in WF2TMX.
Thank you, esperantisto, for the hint. I found it in the HTML editor, too.
Milan
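The difference between a TMX that declares its encoding and one that does not can also be checked with a short script. The sketch below is not taken from the presentation. It assumes that "Unicode" here means UTF-16 with a byte-order mark, which the thread does not confirm, and the helper names `describe_encoding` and `to_declared_utf8` are invented for illustration.

```python
import codecs
import re

# Byte-order marks checked here; only UTF-8 and UTF-16 are covered.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Captures the value of encoding="..." inside the XML declaration.
DECLARATION = re.compile(r'<\?xml[^>]*?encoding\s*=\s*["\']([^"\']+)["\']')


def describe_encoding(raw):
    """Return (codec implied by the BOM or None, declared encoding or None)."""
    bom_codec = next((codec for bom, codec in BOMS if raw.startswith(bom)), None)
    head = raw[:200].decode(bom_codec or "ascii", errors="replace")
    found = DECLARATION.search(head)
    return bom_codec, found.group(1) if found else None


def to_declared_utf8(raw):
    """Re-encode the file as UTF-8 with an explicit encoding declaration."""
    bom_codec, _ = describe_encoding(raw)
    text = raw.decode(bom_codec or "utf-8")
    # Drop the old declaration, if any, and write a new explicit one.
    text = re.sub(r"^\s*<\?xml[^>]*\?>\s*", "", text, count=1)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n' + text).encode("utf-8")


# Hypothetical file content: UTF-16 with a BOM and no declared encoding.
wfc_like = '<?xml version="1.0" ?>\n<tmx version="1.4"></tmx>'.encode("utf-16")
print(describe_encoding(wfc_like))   # ('utf-16', None)

fixed = to_declared_utf8(wfc_like)
print(describe_encoding(fixed))      # (None, 'UTF-8')
```

A file without a byte-order mark and without a declaration is treated as UTF-8 by this sketch, which matches the XML default but may not match what a particular tool actually wrote.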
[Edited at 2013-02-20 09:01 GMT]