MDF export: inserting line breaks


Nichim

Sep 29, 2008, 1:57:02 AM
to Shoebox/Toolbox Field Linguist's Toolbox
Hello kind folks! I am struggling with a desire to output a dictionary
from Toolbox with line breaks before each new Sense and Example (v)
field, in order to break up long blocks of text for the readers. I
have done this manually for a few entries in Word and we like the way
it looks, but ideally Iʼd like to be able to automate it for
consistency with future editions.

It seems that the best way to output this from Toolbox would be to
insert some appropriate Newline commands in the file MDFDict2.cct. I
would also like to apply a separate paragraph style to the resulting
paragraphs, so that I could then tweak that style in the .dot. Iʼve
looked at the .cct file, but I havenʼt done any programming since I
was a kid and theyʼd just come out with Basic, so Iʼm not really sure
what to do. Can someone tell me if Iʼm on the right track, and if so,
give me a nudge in the right direction?

Thanks kindly,

Sarah Braun Hamilton

ToolboxSupport

Sep 30, 2008, 5:31:55 PM
to Shoebox/Toolbox Field Linguist's Toolbox
Dear Sarah,

It takes a brave person to look into MDFDict2.cct!

Let me give you a short guided tour of the points of interest for you
at the moment:

Starting at the beginning of the file, search for
define(dOutput) >

This will bring you to the beginning of the output routine.

Now change your search to the word
paragraph

The *second* match to paragraph is in the section titled "Alternate
Hierarchy". It should have stopped at the first of these two lines:

'\IP' nl c Indented Paragraph
'\sn ' out(sn) ')|{~}' nl c 1997-12-04 MRP: |{~} notation

It appears that in the alternate hierarchy, the sense number does
start a paragraph!

Select and copy the first line.

Now do a find for
'\sn

That is
apostrophe backslash s n

It will probably stop at the line shown above. If so, go on to the
next place it stops. That will be the third line of the following.

if(sn) begin c Sense Number
if(px) '\px ' out(px) nl '\BP' nl clear(px) endif
'\sn ' out(sn) ')|{~}' nl c 1997-12-04 MRP: |{~} notation
end endif


Paste the line you copied earlier, about the indented paragraph, so
that it now looks like this:

if(sn) begin c Sense Number
if(px) '\px ' out(px) nl '\BP' nl clear(px) endif
'\IP' nl c Indented Paragraph
'\sn ' out(sn) ')|{~}' nl c 1997-12-04 MRP: |{~} notation
end endif

Give that a try. I haven't, but I think it will work.

===================================
For the sentences, try the following modification:

Look for
if(xvBundle) out(xvBundle)
endif

Change it to
if(xvBundle) '\IP' nl out(xvBundle)
endif

Note that these are the same type of paragraph. MDF also has \BP
available as a paragraph break. It is used for subentries and for 2nd+
parts of speech.

If you need it, I can tell you how to add another paragraph type to
MDF.

Toolbox Support

Sarah Braun Hamilton

Oct 1, 2008, 12:58:02 AM
to ShoeboxToolbox-Fiel...@googlegroups.com
Thank you so much. I was actually so brave that I experimented with this for some time after I posted my question and came to more or less this same solution. However, I didn't track what I did, so having these clear instructions will be a lifesaver in the future. In order to add a new paragraph style, I just modified '\IP' to '\IP1' etc. and it worked wonderfully. Would this be what you would recommend?
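
A sketch of how the sense-number block would look with that change, assuming the rest of the block is unchanged from the instructions above; IP1 is a new marker name, not one that MDF defines:

```
if(sn) begin c Sense Number
if(px) '\px ' out(px) nl '\BP' nl clear(px) endif
'\IP1' nl c New paragraph marker for senses (IP1 is not a standard MDF marker)
'\sn ' out(sn) ')|{~}' nl c 1997-12-04 MRP: |{~} notation
end endif
```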

Thanks again,
Sarah 

ToolboxSupport

Oct 1, 2008, 7:47:42 AM
to Shoebox/Toolbox Field Linguist's Toolbox
Dear Sarah,

I'm impressed! I found MDFDict2.cct rather intimidating when I first
started looking at it. But with care it is possible to make useful
changes.

Just putting it into the CC table might be enough. When Toolbox sees
an unexpected marker in a file that it's loading, it adds it to the
list with the Field Name * (asterisk), and it comes in as paragraph
style (which I consider a bug, and which may get fixed). Such a
nameless marker would be sent to Word with the style name of the
marker itself -- in this case IP1.

But it all seems a bit chancy. I'm not sure that MDF uses the regular
import methods. Too many "surprise" markers seem to come through just
as markers. To be sure, you need to modify the database type
MDF_RTF.typ. This is *not* the type you think of your dictionary as
using. MDF changes database types as it processes the data. (This is
why so many attempts at adding new markers, changing languages, etc.,
go awry.)

Do Project, Database types. You will see MDF_RTF in the list of types
but "unused" (not in bold type). Modify this type and add the new
marker (IP1). Be sure it is paragraph style for export, and give it a
useful Field Name for the Word style.

That should do it. All the other features of the paragraph style are
dealt with in Word. (Be sure to click "Add to Template" when you
modify it, or the changes will only apply to the current document.)

Toolbox Support


Oumar

Oct 1, 2008, 4:07:18 PM
to Shoebox/Toolbox Field Linguist's Toolbox
Dear Toolbox Support,

I would like the opposite of Sarah: to remove the indented paragraphs
in multiple sense entries, multiple part of speech entries or after
the usage field. In short: to get an entry with a block paragraph. How
would I proceed? Thanks.

Oumar

ToolboxSupport

Oct 2, 2008, 6:22:58 PM
to Shoebox/Toolbox Field Linguist's Toolbox
Dear Oumar,

First, in CC a "comment" is signalled by a c either at the beginning
of a line or after a space or tab. The c must also be followed by a
space.
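
As a small illustration of that rule, here is a stand-alone CC table. It is not taken from MDFDict2.cct, and the search and replacement strings are made up for the example:

```
c A c at the start of a line makes the whole line a comment
'abc' > 'xyz' c everything after this c is ignored
c 'def' > 'uvw' c this rule is disabled until the leading c is removed
```

The third line is the pattern used below: the rule stays in the file, but CC ignores it until the leading c and space are deleted.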

To remove the unwanted paragraphs, the safest approach is to turn them
into comments by placing a c followed by a space at the beginning of
the relevant lines. Then if you want one of them back, you can restore
it easily.

As I look at MDFDict2, the paragraphs all appear to be near to one
another. They are also teamed with the outputting of any graphics
(px) -- but that is done at the lexeme also, so they won't all
disappear. (I hope.)

Here's the code below. Note the five c's that I have placed on the
left margin to comment out the paragraph lines and the graphic
paragraphs -- they're the ones without the string of hyphens
following. (You didn't say you didn't want the subentries as
paragraphs but I commented them out too. You don't have to.)

If you want to keep the possibility of graphics in various places in
the entry, you will still need a paragraph break wherever a graphic
occurs -- for that, see the next section further on.

======================.

end else begin c Standard hierarchy: \lx \se \ps \sn
c ----------------------------------------------------------------
if(se) begin c Subentry
c if(px) '\px ' out(px) nl clear(px) endif
c '\IP' nl c Indented Paragraph
'\se ' out(se) '|fs{ ' d9 '}' nl c 1999-08-11 MRP
set(Headword)
end endif
c ---------------------------------------------------------------
if(ph) '\ph [' out(ph) ']' nl endif c Phonetic/phonemic
c ---------------------------------------------------------------
if(ps) begin c Part of Speech
ifn(Headword) begin c If not the first \ps after the headword
c if(px) '\px ' out(px) nl clear(px) endif
c '\BP |{emdash}' nl c Block Paragraph em-dash
end endif clear(Headword)
if(pn)
'\pn ' out(pn) '.' nl
else
'\ps ' out(ps) '.' nl
endif
end endif
c ----------------------------------------------------------------
if(sn) begin c Sense Number
c if(px) '\px ' out(px) nl '\BP' nl clear(px) endif
'\sn ' out(sn) ')|{~}' nl c 1997-12-04 MRP: |{~} notation
end endif

======================.

To preserve the ability to sprinkle graphics throughout the entry, do
the following instead. Note that I inserted the paragraph code into
the conditional for the graphic. I've added the comment
c <<<
to flag where I made changes. The sense number paragraph was already
conditional -- based on the presence of a graphic.

======================.
end else begin c Standard hierarchy: \lx \se \ps \sn
c ----------------------------------------------------------------
if(se) begin c Subentry
if(px) '\px ' out(px) nl '\IP' nl clear(px) endif c <<<
c '\IP' nl c Indented Paragraph
'\se ' out(se) '|fs{ ' d9 '}' nl c 1999-08-11 MRP
set(Headword)
end endif
c ---------------------------------------------------------------
if(ph) '\ph [' out(ph) ']' nl endif c Phonetic/phonemic
c ---------------------------------------------------------------
if(ps) begin c Part of Speech
ifn(Headword) begin c If not the first \ps after the headword
if(px) '\px ' out(px) nl '\BP' nl clear(px) endif c <<<
c '\BP |{emdash}' nl c Block Paragraph em-dash
end endif clear(Headword)
if(pn)
'\pn ' out(pn) '.' nl
else
'\ps ' out(ps) '.' nl
endif
end endif
c ----------------------------------------------------------------
if(sn) begin c Sense Number
if(px) '\px ' out(px) nl '\BP' nl clear(px) endif
'\sn ' out(sn) ')|{~}' nl c 1997-12-04 MRP: |{~} notation
end endif

======================

I haven't tried this so it might not work. Give it a try and let me
know if there are any problems.

To make a true block paragraph instead of a hanging indent paragraph
(which is how the Entry Paragraph is usually done), go to Styles and
Formatting in Word, select Entry Paragraph from the list of styles,
then do Format, Paragraph and modify the paragraph style appropriately.

Don't forget to check "Add to Template" or the style change will be
only for that copy of the document.

Toolbox Support

Nichim

Oct 2, 2008, 7:56:04 PM
to Shoebox/Toolbox Field Linguist's Toolbox
Just a note: when I went to modify the MDF_RTF.typ file, I found the
new markers (IP1 etc.) that I had inserted into MDFDict2.cct to
already be there. So now I can change the names to something more
useful, but everything seems to be okay for now.

Thanks again!
