Thanks for this new build. Which files should we copy to the existing CT folder?
Best regards
Wolfgang
Thanks for the wishes. For the new year 2012, let me wish health, happiness
and prosperous contracts to everybody here and to your families.
Two files need to be updated (Cafetran.jar and mt.jar). I will send
you the instructions along with the update.
Best regards,
Igor
CafeTran website: http://www.cafetran.republika.pl
CafeTran support: cafetran...@gmail.com
Regards
Wolfgang
CafeTran does not impose any strict rules on where translators place the
memories on their system, leaving this decision to them.
Using TMX memories directly was one of the really good features of Cafetran, I thought.
Hi,
can anybody here please explain to me the concept of Database vs. Memory? Isn't it enough to create a memory? What do I need a database for?
Any explanation is appreciated ;-)
On Fri, Jan 6, 2012 at 2:23 AM, Wolfgang <wsch...@gmail.com> wrote:
> Hi,
> can anybody here please explain to me the concept of Database vs. Memory? Isn't it enough to create a memory? What do I need a database for?
> Any explanation is appreciated ;-)
Of course a TM is a database. So is a glossary.
It is absolutely enough, and the original design was to keep
everything in one universal file format (TMX). But it turned out that
translators had their own preferences about where to store their segments
and terms. Some say that nothing is better for terms than keeping them
in simple tab-delimited files (glossaries in CT terms). Others
request direct access to SQL databases such as Oracle, MySQL,
OpenOffice database or MS Access because over the years they have
learned how to organize their terminology there, not to mention the
complex TBX format, which CafeTran does not support. So the simple
"CafeTran world" expanded. However, those options are just OPTIONS,
and you can still keep everything in TMX files.
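To make the two file-based options concrete, here is a minimal Python sketch that reads the same kind of source/target pairs from a tab-delimited glossary and from a TMX file. It is only an illustration of the two formats, not CafeTran's own code; the sample data and the function names (read_glossary, read_tmx) are made up for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample data, invented for this illustration.
GLOSSARY_TEXT = "house\tHaus\ntranslation memory\tÜbersetzungsspeicher\n"
TMX_TEXT = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="de"><seg>Speichern Sie die Datei.</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_glossary(text):
    """Return (source, target) pairs from tab-delimited glossary text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


def read_tmx(text, src_lang, tgt_lang):
    """Return (source, target) segment pairs from TMX text."""
    root = ET.fromstring(text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline tags.
                segs[tuv.get(XML_LANG, "").lower()] = "".join(seg.itertext())
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs


print(read_glossary(GLOSSARY_TEXT))
print(read_tmx(TMX_TEXT, "en", "de"))
```

Both functions end up with the same plain list of pairs, which is the point made above: the storage format is a matter of preference, not of what the tool can do with the data.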
Hi Igor,
> Igor wrote to me some days ago when I asked him about the use of databases
> instead of a 'simple tmx file':
>
> "The answer is MANAGEMENT. Either through external SQL tools/commands or
> right from CT. Database can also be accessed in the server mode (from
> another location across the network). Hans, translators do have/keep their
> linguistic bases in SQL databases. Just a few days ago I received a request
> to check the possibility of accessing MS Access database from within CT. And
> I guess, all major CATs provide some kind of SQL database storage.
>
> In CT this is just an option and you can skip the procedure choosing a
> different method (TMX or a tab delimited glossary)."
When you wrote "translators do have/keep their linguistic bases in SQL
databases", you should rather have written "a very tiny minority of
translators do have/keep their linguistic bases in SQL databases".
And when you wrote "I guess all major CATs provide some kind of SQL
database storage", I have to say your guess was wrong. Few CAT tools
support SQL and even among those that do support it, only a minority
of power users actually take advantage of the possibilities it offers.
Generally speaking, my advice to you (if you want CT to gain broader
acceptance) would be not to bow too much to power users' demands,
because CT will turn into a piece of "frankenware" that will only
appeal to a small minority of geeks/nerds and confuse the vast
majority of potential users (let's face it, most translators are
anything but geeks/nerds). If you do include advanced features to
please a few geeks, try to hide them away in the interface, so they
won't confuse or discourage "normal" users.
Cheers,
Dominique
Best
Wolfgang
On Jan 6, 2012, at 10:59 AM, Dominique Pivard wrote:
> When you wrote "translators do have/keep their linguistic bases in SQL
> databases", you should rather have written "a very tiny minority of
> translators do have/keep their linguistic bases in SQL databases".
I am a Transit NXT user.
For some reason STAR decided to store terminology in an MS SQL Express database.
You have the strange situation that they use (and advertise with) an open file format (XML) for storing the segments (creating a deletable index on the fly), whereas the (IMO) most valuable part (the terminology) is locked in a stupid MS thingy. As this DB grows, it gets slower and slower.
I see big potential for CafeTran here to take over Transit's torch.
Hans
> When you wrote "translators do have/keep their linguistic bases in SQL
> databases", you should rather have written "a very tiny minority of
> translators do have/keep their linguistic bases in SQL databases".
True, minority but I would reflect on the "very tiny" part.
>
> And when you wrote "I guess all major CATs provide some kind of SQL
> database storage", I have to say your guess was wrong. Few CAT tools
> support SQL and even among those that do support it, only a minority
> of power users actually take advantage of the possibilities it offers.
I've just googled and found that Trados MultiTerm, DVX and Swordfish have
something to do with SQL. Correct me if I'm wrong.
> Generally speaking, my advice to you (if you want CT to gain broader
> acceptance) would be not to bow too much to power users' demands,
> because CT will turn into a piece of "frankenware" that will only
> appeal to a small minority of geeks/nerds and confuse the vast
> majority of potential users (let's face it, most translators are
> anything but geeks/nerds). If you do include advanced features to
> please a few geeks, try to hide them away in the interface, so they
> won't confuse or discourage "normal" users.
This is good advice. SQL databases are complex beasts. The CT SQL
interface (External DB) could not be simpler (it covers just basic operations
like searching and editing), and I don't intend to extend it, leaving the more
complicated SQL operations to the dedicated SQL database tools.
> You have the strange situation that they (advertise with the) use an
> open file format (XML) for storing the segments (creating a deletable
> index on the fly), whereas the (IMO) most valuable part (the
> terminology) is locked in a stupid MS thingy. As this DB grows, it gets
> slower and slower.
That would never be a problem for CafeTran because DB segments or terms
used for translation are loaded into fast RAM in exactly the same
way a TMX memory is loaded into RAM.
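As a rough illustration of the "load once, then look up in RAM" idea, the following Python sketch reads all rows from a SQL table into a dictionary and then answers lookups from that dictionary only. This is not CafeTran's implementation; the table name (terms), its columns and the load_terms function are assumptions made up for the example, and an in-memory SQLite database stands in for whatever external database would be used in practice.

```python
import sqlite3


def load_terms(conn):
    """Read every (source, target) row once and return them as a dict."""
    rows = conn.execute("SELECT source, target FROM terms")
    return {source: target for source, target in rows}


# Stand-in for an external database; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE terms (source TEXT PRIMARY KEY, target TEXT)")
conn.executemany(
    "INSERT INTO terms VALUES (?, ?)",
    [("house", "Haus"), ("tree", "Baum")],
)

terms = load_terms(conn)
conn.close()  # the database is no longer needed for lookups

# Lookups now hit the in-memory dict, so their speed does not depend
# on how the database performs as it grows.
print(terms.get("house"))
print(terms.get("unknown term"))
```

The trade-off of this design is memory use and start-up time: everything has to fit in RAM and be read once up front.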
I am pretty sure that the reason why we have relational databases in such software is because the designers did not have the incentive to produce good data matching/retrieving systems.
Databases introduce an extra layer of management and I'd argue that SQL systems are mostly useful for project managers who need to export TM data based on very specific criteria, and much less so for freelancers when they need to actually use the TM.
Jean-Christophe Helary
----------------------------------------
fun: http://mac4translators.blogspot.com
work: http://www.doublet.jp (ja/en > fr)
tweets: http://twitter.com/brandelune
> That would never be a problem for CafeTran because DB segments or terms used for translation are loaded into fast RAM in exactly the same way a TMX memory is loaded into RAM.
Yeah, yeah, yeah!
Hans
Hi Igor,
>> When you wrote "translators do have/keep their linguistic bases in SQL
>> databases", you should rather have written "a very tiny minority of
>> translators do have/keep their linguistic bases in SQL databases".
>
> True, minority but I would reflect on the "very tiny" part.
OK, maybe many translators do keep their linguistic bases in SQL
databases, but just because they happen to use a tool that relies on
an SQL database, like Déjà Vu. What I meant is that only a small
minority of these actually use SQL to massage their bases.
> I've just googled and found that Trados MultiTerm, DVX and Swordfish have
> something to do with SQL. Correct me if I'm wrong.
You're definitely right about DVX and Swordfish, though I wouldn't put
them in the "major" league (in terms of market share). OK, I know DV
users will accuse me of bashing their beloved tool one more time, but
the fact is DV is lagging far behind Trados and Wordfast (which are
the tools I would consider as "major") in terms of market share. As to
MultiTerm, it isn't really a CAT tool, it is a terminologist's tool
that happens to be sold bundled with a CAT tool. I'm not sure if
Studio 2011 relies on an SQL database for its TM's. Wordfast and memoQ
definitely don't.
> This is a good advice. SQL Databases are complex beasts. The CT SQL
> interface (External DB) cannot be simpler (just for basic operations like
> searching and editing) and I don't intend to extend it leaving the more
> complicated SQL operations for the particular SQL database tools.
I agree with that approach! Design CT primarily with basic users in
mind, at least if you want to sell more copies of it.
Cheers,
Dominique
> Databases introduce an extra layer of management and I'd argue that SQL systems are mostly useful for project managers who need to export TM data based on very specific criteria, and much less so for freelancers when they need to actually use the TM.
Certainly true. Regarding DV, I think one reason why some people use
SQL has to do with the fact that, until very recently, you could only
have a single TM in use at any given time. This caused people to
maintain "big mommas", with a need to do some management with them.
Nowadays, most tools (including the latest version of DV) let you
access multiple TM's simultaneously, which means many users will tend
to have smaller TM's, e.g. project-based or client-specific. This means
less need for management with SQL.
Cheers,
Dominique
> Has Wordfast become so big? I was still thinking of Wf as a niche program. I
> thought the big elephants in the business were Trados and Transit.
Look at the number of subscribers to the mailing lists for each tool.
I think it's a rather good indicator of their respective market share,
at least among freelance translators, who usually have to rely on such
lists for technical assistance (corporate users are another matter):
Wordfast: 6773 subscribers
Trados (TW_Users): 5632
Déjà Vu: 2388
OmegaT: 1678
memoQ: 893
Transit: 659
Swordfish: 278
For Wordfast, you can add subscribers from the Wordfast Pro list
(745), the French list (695), the Finnish list (339), the Anywhere
list (262) etc.
How many Transit users do you know personally? I can count them on one hand.
> Studio 2009 has TMs in "XLIFF" format, which look pretty compact when viewed
> in a low-level editor. I would think 2011 uses the same format.
XLIFF is the format used by Studio for *documents*. I'm not sure what
database they use for their TM, maybe something derived from MS
Access?
Cheers,
Dominique
A quick Google search for Studio showed me the following:
http://en.wikipedia.org/wiki/SDL_Trados#Handling_of_translation_memories_and_glossaries
And assuming that the TM format of Trados is SDLTM, I found this:
http://multilingual.texterity.com/multilingual/200909?pg=25#pg25
See the first paragraph in the "Compatibility issues" chapter, from which I
gather it is SQL-based.
>> XLIFF is the format by Studio used for *documents*. I'm not sure what
>> database they use for their TM, maybe something derived from MS
>> Access?
>
> Sorry, my bad. You are right. The Studio 2009 TMs are not text files. They
> are full of code, and the header line says "SQLite format 3". So yes, a
> database format, alas. Good thing is that you can export them to TMX.
That's correct. SDL TMs are simple SQLite databases that you can manipulate by issuing SQL commands directly from the command line if you have sqlite installed.
MultiTerm termbases use Access.
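For anyone who wants to verify this on their own files, here is a small Python sketch that checks for the "SQLite format 3" header mentioned above and lists the tables in the file. It makes no assumption about the actual table layout of an SDL TM: the demo file and its example_units table are created on the spot as stand-ins, and the function names are made up for the example.

```python
import os
import sqlite3
import tempfile

# Every SQLite 3 database file starts with these 16 bytes.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_sqlite_file(path):
    """Return True if the file starts with the SQLite 3 header."""
    with open(path, "rb") as fh:
        return fh.read(16) == SQLITE_MAGIC


def list_tables(path):
    """Return the names of all tables in a SQLite database file."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        return [name for (name,) in rows]
    finally:
        conn.close()


# Build a small stand-in file; a real TM will have a different schema.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.sdltm")
conn = sqlite3.connect(demo_path)
conn.execute(
    "CREATE TABLE example_units (id INTEGER PRIMARY KEY, source TEXT, target TEXT)"
)
conn.commit()
conn.close()

print(is_sqlite_file(demo_path))
print(list_tables(demo_path))
```

Work on a copy of a real TM rather than the original, since writing to the file with generic SQL tools could leave it in a state the CAT tool does not expect.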
> Very few. But I know that Transit has offices all over the world, including
> here in Tokyo, and that they have some really big customers and handle huge
> projects. E.g. I turned down some huge jobs for Toyota shop manuals, because
> Transit is a requirement.
>
> I just don't think Wordfast is in the same league. Maybe for the number of
> mailing list subscribers, but certainly not for the volume of material
> handled.
I'd put Transit in the same league as Across, Idiom World Server and
whatever Lionbridge uses: translators who use it do it mostly in order
to be able to take part in the big projects you mention, but few
translators would pick Transit / Across / Idiom as their tools of
choice if there weren't such projects. There are exceptions, of
course, like Hans on this list.
Trados, Wordfast, Déjà Vu, memoQ, OmegaT, Swordfish, CafeTran etc.,
OTOH, are selected by translators mostly out of their "free will".
This is of course debatable in the case of Trados (often imposed by
agencies), Wordfast Pro (there's one big agency that makes it a
requirement) or even memoQ (which has gained market share among
agencies these last two years).
But out of translators who choose a CAT tool mostly for themselves,
I'd say Trados is nr 1 and Wordfast nr 2.
I know my view is biased: I've never worked with Across, Idiom,
Transit etc.; I don't have big agencies as clients, in fact, I have
never translated a single TTX so far.
Cheers,
Dominique
PS: btw, I omitted MetaTexis from my ranking. Its mailing list has 554
subscribers.
> Few CAT tools
> support SQL and even among those that do support it, only a minority
> of power users actually take advantage of the possibilities it offers.
FWIW I find regular expressions much simpler to work with.
Thanks for the interesting points, with which I agree!
On Jan 6, 2012, at 2:36 PM, Dominique Pivard wrote:
> Certainly true. Regarding DV, I think one reason why some people use
> SQL has to do with the fact that, until very recently, you could only
> have a single TM in use at any given time. This caused people to
> maintain "big mommas", with a need to do some management with them.
> Nowadays, most tools (including the latest version of DV) let you
> access multiple TM's simultaneously, which means many users will tend
> to have smaller TM's, eg. project-based or client-specific. This means
> less needs for management with SQL.
One could even argue that the most sophisticated approach would be to save a project TM, in XML, for every project in its own folder, making it easy to combine folders and subfolders when setting up a new project. Tools like WildEdit can be used for on-the-fly changes across the folders.
Or does this actually already exist?
Hans
(ducking)
> Look at the number of subscribers to the mailing lists for each tool.
> I think it's a rather good indicator of their respective market share,
Not necessarily. At least not for Transit. I suspect it has a lot of users in big companies who are not able/allowed to participate in forums.
BTW The Transit forum is the only closed forum I know. But now I'm drifting off ...
Hans
Very few. But I know that Transit has offices all over the world, including here in Tokyo, and that they have some really big customers and handle huge projects. E.g. I turned down some huge jobs for Toyota shop manuals, because Transit is a requirement.
You conveniently cut the end of my post:
> I think it's a rather good indicator of their respective market share,
> at least among freelance translators, who usually have to rely on such
> lists for technical assistance (corporate users are another matter):
So can we agree most Transit users are found in companies/agencies,
and that few freelancers use Transit, unless they work for Star? And
wouldn't most freelancers who work for Star typically use the free
version of Transit (was it called Satellite?)
Cheers,
Dominique
> 1. You are aware of Transit's ability to create XLIFF files (that can be
> translated with other tools).
memoQ can import Transit projects (PXF) directly:
http://kilgray.com/memoq/50/help-en/index.html?import_transit_project.html
I understand that it works very well.
Cheers,
Dominique
> memoQ can import Transit projects (PXF) directly:
>
> http://kilgray.com/memoq/50/help-en/index.html?import_transit_project.html
>
> I understood that works very well
The best tool for working with Transit files is ...
;P