Converting a strangely formatted glossary file


Steven P. Venti

Aug 6, 2014, 8:42:21 PM
to felix...@googlegroups.com
I have a strangely formatted glossary file
(eic_pdic_je.txt from http://www.eic.or.jp/library/dic/download.php).

The format is:

日本語の単語1<cr>
English1<cr>
日本語の単語2<cr>
English2<cr>
...etc
日本語の単語N<cr>
EnglishN<cr>

There are close to 10,000 terms total, so I really don't have the time to
convert this manually to a Felix glossary, nor am I savvy enough to write a
macro for either Word or Excel that would reformat it.

Does anyone know of a tool that might be of help or a way to be able to use
this data in Felix that won't cost me a day of manual labor?

TIA

-----------------------------------------------------------------
Steve Venti
spv...@gmail.com

Arima 1-11-5-203
Miyamae-ku, Kawasaki, Kanagawa 216-0003
Tel: 090-8045-5128

-----------------------------------------------------------------

Ryan Ginstrom

Aug 6, 2014, 10:11:58 PM
to felix...@googlegroups.com
This macro should turn it into tab-separated form, which you could then paste into Excel and from there get into Felix or another format.

Important: Delete any extra lines between the top of the document and the first source term. Otherwise, you could get terms in the wrong order.

Sub MakeGloss()
    '
    ' Takes a glossary in the following form:
    ' SOURCE1
    ' TRANS1
    ' SOURCE2
    ' TRANS2

    ' ... and converts it into a tab-delimited format
    ' MAKE SURE FIRST LINE OF DOC IS FIRST SOURCE
    '
    ' first, go to the top of the document
    Selection.HomeKey Unit:=wdStory

    While 1 = 1
        Selection.EndKey Unit:=wdLine
        Selection.Delete Unit:=wdCharacter, Count:=1
        Selection.TypeText Text:=vbTab

        currentPos = Selection.Start

        Selection.MoveDown Unit:=wdLine, Count:=1

        ' if the position hasn't changed, we're at the end
        ' of the document
        If currentPos = Selection.Start Then
            Exit Sub
        End If

        Selection.HomeKey Unit:=wdLine
    Wend

End Sub

Regards,
Ryan

Ryan Ginstrom
Felix Computer Assisted Translation
sup...@felix-cat.com
http://felix-cat.com/
> --
> You received this message because you are subscribed to the Google Groups
> "felix-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to felix-users...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Charles Aschmann

Aug 6, 2014, 10:31:29 PM
to felix...@googlegroups.com
Hi Steve,

The following Word macro should do it.
I am sure there are better macro writers who could give you a faster
one, but I made this one up by recording a procedure and then wrapping
it in a loop. I was able to convert your file.

Sub LinesToText()
'
' LinesToText Macro
' Converts alternating line files to tab delimited
'
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    ' Search for paragraph marks (^p) and replace them with tabs (^t)
    With Selection.Find
        .Text = "^p"
        .Replacement.Text = "^t"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchByte = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = False
        .MatchFuzzy = False
    End With
    ' Find the first paragraph mark
    Selection.Find.Execute
    While Selection.Find.Found
        With Selection
            If .Find.Forward = True Then
                .Collapse Direction:=wdCollapseStart
            Else
                .Collapse Direction:=wdCollapseEnd
            End If
            ' Replace the paragraph mark that was just found with a tab
            .Find.Execute Replace:=wdReplaceOne
            If .Find.Forward = True Then
                .Collapse Direction:=wdCollapseEnd
            Else
                .Collapse Direction:=wdCollapseStart
            End If
            .Find.Execute
        End With
        ' This extra search moves on to the following paragraph mark, which
        ' appears to be why only every other mark gets replaced
        Selection.Find.Execute
    Wend
End Sub

It should work on any glossary file with the two languages in alternating
lines separated by paragraph breaks, like this file. If the file has
glitches or extra line breaks, it will foul up.
If need be, I can send you the file privately.
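
Since stray blank lines or a missing partner line would throw off the pairing, it may help to check the file before converting it. The following is a minimal Python sketch of such a check, not something from this thread: the function name `find_glitches` and the sample data are made up for illustration, and it only looks for blank lines and an odd line count.

```python
def find_glitches(lines):
    """Return (line_number, problem) pairs for an alternating source/translation list."""
    problems = []
    for number, line in enumerate(lines, start=1):
        # A blank line shifts every pair that follows it
        if not line.strip():
            problems.append((number, "blank line"))
    # An odd total means some term has no partner
    if len(lines) % 2 != 0:
        problems.append((len(lines), "odd number of lines"))
    return problems


# Hypothetical sample with one stray blank line
sample = ["日本語の単語1", "English1", "", "日本語の単語2", "English2"]
for number, problem in find_glitches(sample):
    print(f"line {number}: {problem}")
```

A clean file should produce no output at all. Anything the check reports would need to be fixed by hand before running either macro.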

Charles Aschmann

Charles Aschmann

Aug 6, 2014, 10:40:47 PM
to felix...@googlegroups.com
Obviously Ryan's macro is much better than mine, so please ignore the
one I sent.

Charles Aschmann

Charles Aschmann

Aug 6, 2014, 10:51:34 PM
to felix...@googlegroups.com
I tested Ryan's macro. While it is definitely superior macro writing,
some entries in this file are long enough to wrap onto a second line.
The macro works line by line, so a wrapped entry reverses the order of
the terms around the tabs, and for some reason the macro then stops at
one of the long lines later on.
In a file with short lines, Ryan's macro would work great and faster than
mine, but in this particular file, my cobbled-together macro gets all
the way through, creating a proper tab-delimited file. Just dumb luck, I
think, but you might be able to use it.
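
Another way around the wrapping problem is to pair the lines outside Word, where only real line breaks count and on-screen wrapping does not matter. Below is a minimal Python sketch under stated assumptions: `pair_lines` and `convert_file` are hypothetical names, the file's encoding is not known from this thread and has to be supplied by the user, and blank lines are simply skipped (which could hide a misaligned pair).

```python
def pair_lines(lines):
    """Join alternating source/translation lines into tab-separated rows."""
    # Strip line endings, then drop lines that are completely blank
    cleaned = [line.rstrip("\r\n") for line in lines]
    cleaned = [line for line in cleaned if line.strip()]
    if len(cleaned) % 2 != 0:
        raise ValueError("odd number of non-blank lines; cannot pair terms")
    # Even positions are source terms, odd positions are translations
    return [f"{source}\t{target}"
            for source, target in zip(cleaned[0::2], cleaned[1::2])]


def convert_file(in_path, out_path, encoding="utf-8"):
    """Read an alternating-line glossary and write a tab-separated file."""
    # The encoding default is an assumption; pass the file's actual encoding
    with open(in_path, encoding=encoding) as src:
        rows = pair_lines(src)
    with open(out_path, "w", encoding=encoding, newline="\n") as dst:
        dst.write("\n".join(rows) + "\n")
```

Calling something like `convert_file("eic_pdic_je.txt", "glossary.tsv")` (the output name is made up) should give a file that can be opened in Excel as tab-delimited text. If the text comes out garbled, the encoding argument is the first thing to try changing.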

Charles Aschmann



Steven P. Venti

Aug 7, 2014, 2:19:43 AM
to felix...@googlegroups.com
Thanks much to both Charlie and Ryan for helping me out with this issue.

I was able to make both an Excel file and a Felix glossary file for this
content, either of which I would be happy to share with anyone on this list.
Feel free to contact me off-list and I will send you a copy.

Cheers,

Steve Venti
