WORD X: VBA macro including find/replace with styles much slower than regular macro

Pierre Igot

Jul 20, 2002, 9:53:25 AM
Hi,

Since I often have to publish texts both in printed form and for the
Web, I use both character styles and paragraph styles in Word for
every element in my text that will need a specific tag in HTML.

When it comes time to "prepare" my texts for HTML editing (which I do
in BBEdit), I then use a couple of VBA macros to search for
occurrences of these styles and add the tags to the occurrences before
transferring the text to BBEdit (which will strip it of all its
formatting).

My problem is that VBA's Find/Replace commands run much more slowly
when style formatting is involved.

For example, here's the code I use for Find/Replace with styles:

-----

Sub webStyles()
styleToTag myStyle:="Emphasis", myTag:="em"
styleToTag myStyle:="Italics", myTag:="i"
styleToTag myStyle:="Bold", myTag:="b"
styleToTag myStyle:="Strong", myTag:="strong"
styleToTag myStyle:="Cite", myTag:="cite"
styleToTag myStyle:="Book Title - English", myTag:="i"
styleToTag myStyle:="Book Title - French", myTag:="i"
paraStyleToTag myStyle:="Citation", myTag:="blockquote"
paraStyleToTag myStyle:="Heading 1", myTag:="h1"
paraStyleToTag myStyle:="Heading 2", myTag:="h2"
paraStyleToTag myStyle:="Heading 3", myTag:="h3"
End Sub

Sub styleToTag(myStyle, myTag)
For Each sty In ActiveDocument.Styles
If sty = myStyle Then
With Selection.Find
.ClearFormatting
.Style = ActiveDocument.Styles(myStyle)
.Replacement.ClearFormatting
.Replacement.Style = ActiveDocument.Styles(myStyle)
.Text = ""
.Replacement.Text = "<" & myTag & ">^&</" & myTag & ">"
.Forward = True
.Wrap = wdFindStop
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
.Execute Replace:=wdReplaceAll
End With
End If
Next sty
End Sub

Sub paraStyleToTag(myStyle, myTag)
If Selection.Type <> wdSelectionIP Then
For I = 1 To Selection.Paragraphs.Count
If Selection.Paragraphs(I).Style = myStyle Then
Set myPara = Selection.Paragraphs(I).Range
myPara.MoveEnd Unit:=wdCharacter, Count:=-1
With myPara
.InsertBefore Text:=("<" & myTag & ">")
.InsertAfter Text:=("</" & myTag & ">")
End With
End If
Next I
Else
MsgBox "No selection."
End If
End Sub

-----

As a comparison, here's the code for another macro I use to "prepare"
the punctuation of the text for Web publishing. This macro does NOT
involve any style formatting, just plain old find/replace of textual
elements.

-----

Sub webPunctuation()
With Selection.Find
.ClearFormatting
.Replacement.ClearFormatting
.MatchCase = False
End With
characterSwitch myOld:="&", myNew:="&amp;"
characterSwitch myOld:="^s", myNew:="&nbsp;"
characterSwitch myOld:="(c)", myNew:="&copy;"
characterSwitch myOld:="(tm)", myNew:="&trade;"
characterSwitch myOld:="(r)", myNew:="&reg;"
End Sub

Sub characterSwitch(myOld, myNew)
With Selection.Find
.Text = myOld
.Replacement.Text = myNew
.Forward = True
.Wrap = wdFindStop
.MatchCase = False
.Format = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
End Sub

----

This second macro is MUCH faster than the first one. I would say that
it is about TEN TIMES faster. For a short text, it is almost
instantaneous (in Word X SR1), whereas the macro involving styles takes
longer than 10 seconds. For longer articles, it's even worse.

Is this "normal"?

Pierre
---
LATEXT - Literature and Visual Arts http://www.latext.com
"Apple Peel" Columnist at Applelust.com

John McGhie [MVP - Word]

Jul 21, 2002, 7:57:00 PM
Hi Pierre:

OK, I think I can see the problem. I am cross-posting this into two of the
VBA groups where people with better coding skills than mine can take a look
(fellas: this is Mac Word v.X, which has Word 2000's VBA less the Enums).

This responds to microsoft.public.mac.office.word on 20 Jul 2002 06:53:25
-0700, appl...@applelust.com (Pierre Igot):

In your webStyles() macro, you are setting two variables then calling
styleToTag. I think this causes styleToTag to recompile on each iteration.
I think you will get a speedup if you set up a two-dimensional array within
styleToTag, and add another level of nested "for each..." to walk down the
array.

The only other suggestion I can make is to work with the Selection object
rather than the Range object. For reasons we cannot figure out, the
Selection object is dramatically faster. This is counter-intuitive to me: I
think I smell a large lump of inline assembler in the mechanism for the
Selection Find object :-)

Others who know much more about VBA will have more to say on this.

Cheers


Please post all comments to the newsgroup to maintain the thread.

John McGhie, Consultant Technical Writer
McGhie Information Engineering Pty Ltd
Sydney, Australia. GMT + 10 Hrs
+61 4 1209 1410, mailto:jo...@mcghie-information.com.au

Antony

Jul 22, 2002, 12:01:35 PM
Hello Pierre & John

"John McGhie [MVP - Word]" <jo...@mcghie-information.com.au> wrote in message news:<8mhmjus559alf74o0...@4ax.com>...
> Hi Pierre:


>
>
> In your webStyles() macro, you are setting two variables then calling
> styleToTag. I think this causes styleToTag to recompile on each iteration.

Is this right? I thought the VBA engine interpreted compiled bytecode.
If what you say is true, then no wonder!

> I think you will get a speedup if you set up a two-dimensional array within
> styleToTag, and add another level of nested "for each..." to walk down the
> array.

Do you mean so that it eliminates all the subroutine calls (sorry I
have trouble visualising this sort of thing)? Pierre, your second example
has half as many subroutine calls as the first. Also, the second (faster)
example is effectively searching and replacing plain text. This is fast.
Your first example is having to read (and write) style information contained
in the paragraph as well. The style information is not stored with the
characters. So presumably this means slower searching and replacing. But
anyway, is 10 seconds to have Word format an article for you really a problem?

Please note, however, that I've never had to write 'performance' VBA code. I've
not investigated this; I'm just guessing based on what I think would be logical
(often the cause of much grief, will I ever learn? :).

Regards,

Antony

~~~~~~~~~~


John McGhie continued...

John McGhie [MVP - Word]

Jul 22, 2002, 7:49:29 PM
Hi Anthony:

This responds to microsoft.public.mac.office.word on 22 Jul 2002 09:01:35
-0700, ad_sc...@postmaster.co.uk (Antony):

> > In your webStyles() macro, you are setting two variables then calling
> > styleToTag. I think this causes styleToTag to recompile on each iteration.
>
> Is this right? I thought the VBA engine interpreted compiled bytecode.
> If what you say is true, then no wonder!

It does. Each module is compiled at first call. But there are complicated
rules for when the compiled bytecode persists between calls and when it is
re-compiled. One of the reasons I cross-posted was that I hoped that you
blokes over there would know :-)

> Do you mean so that it eliminates all the subroutine calls (sorry I
> have trouble visualising this sort of thing)?

Yeah. Within a subroutine, it should automatically pass parameters by
reference; when calling externally, I wonder whether it is trying to pass ByVal?

> the second (faster)
> example is effectively searching and replacing plain text. This is fast.

Given that it is not opening the ParagraphFormatting object at all, yes.
But I thought it obtained these things by pointer anyway, which is very
quick because the pointers and formatting tables should all be resident in
memory. I don't notice the thing being slow when I search for formatting,
but maybe I am not looking properly.

Cheers

Antony

Jul 23, 2002, 8:23:08 AM
Hello Pierre

I have never bothered with VBA performance issues
because these days I use it for dead simple stuff
and batch processing that I'm happy to leave running
in the background. But you've piqued my interest so
I've had a look at your code. I'm afraid the problem
isn't VBA, it's your algorithms.

[Aside:

"John McGhie [MVP - Word]" <jo...@mcghie-information.com.au> wrote in

message news:<l66pju85dlg5ofvj7...@4ax.com>...


> It does. Each module is compiled at first call. But there's complicated
> rules for when the compiled bytecode persists between calls and when it is
> re-compiled. One of the reasons I cross-posted was that I hoped that you
> blokes over there would know :-)

So what does Debug | Compile Project do? Can't find
any info on it anywhere.]

paraStyleToTag
--------------
Your algorithm is very slow. On my 200 page test
document I got bored of waiting for it to finish
and broke out of it after several minutes. My
version below does it in about one and a half
seconds (the paragraph bit) by using the following
techniques.

Your paraStyleToTag loops over all the paragraphs
in the selection every time it's called and looks
for the matching style. But really you only need
to loop over the paragraphs once, and check for
all the styles in one go. Place the names of the
styles you want to search for and the replacement
tags in a 2-dimensional array. I think this is what
John McGhie suggested. Put the most common styles at
the beginning of the array. So the webStyles macro
could be written as follows. (I've left the
styleToTag stuff as it is for the moment.)

Sub webStyles()
Const PARASTYLE As Integer = 0
Const MYTAG As Integer = 1

Dim para As Paragraph
Dim my_array_of_styles As Variant
Dim i As Integer

my_array_of_styles = Array( _
Array("Citation", "blockquote"), _
Array("Heading 1", "h1"), _
Array("Heading 2", "h2"), _
Array("Heading 3", "h3") _
)

styleToTag myStyle:="Emphasis", MYTAG:="em"
styleToTag myStyle:="Italics", MYTAG:="i"
styleToTag myStyle:="Bold", MYTAG:="b"
styleToTag myStyle:="Strong", MYTAG:="strong"
styleToTag myStyle:="Cite", MYTAG:="cite"
styleToTag myStyle:="Book Title - English", MYTAG:="i"
styleToTag myStyle:="Book Title - French", MYTAG:="i"

If Selection.Type = wdSelectionIP Then
MsgBox "no selection"
Else
For Each para In Selection.Paragraphs
For i = 0 To UBound(my_array_of_styles)
If para.Style = my_array_of_styles(i)(PARASTYLE) Then
With para.Range
.MoveEnd Unit:=wdCharacter, Count:=-1
.InsertBefore Text:=("<" & my_array_of_styles(i)(MYTAG) & ">")
.InsertAfter Text:=("</" & my_array_of_styles(i)(MYTAG) & ">")
End With
End If
Next i
Next para
End If

End Sub

So now there are no subroutine calls for the paragraph
formatting and you loop through the paragraphs only once.


styleToTag
----------
This will benefit from a similar approach. But also
you are looping through every style in the document
(of which there are quite a lot) for every call to
the routine. You do this to double-check that the
style you are searching for is actually defined. This
checking incurs a performance hit.

You don't need it. Assume that all the styles are
defined (because most of them will be) and use
an error handler to take care of the case when they
are not.


Sub styleToTag(myStyle, myTag)
On Error GoTo err_handler


With Selection.Find
.ClearFormatting
.Style = ActiveDocument.Styles(myStyle)
.Replacement.ClearFormatting
.Replacement.Style = ActiveDocument.Styles(myStyle)
.Text = ""
.Replacement.Text = "<" & myTag & ">^&</" & myTag & ">"
.Forward = True
.Wrap = wdFindStop
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
.Execute Replace:=wdReplaceAll
End With

Exit Sub 'normal completion: don't fall into the handler

err_handler:
If Err.Number <> 5941 Then 'mustn't ignore any other errors,
Err.Raise Err.Number, Err.Source, Err.Description 'so pass them on to the caller
End If 'error 5941: style doesn't exist,
'so there is nothing to tag
End Sub

Well, testing on my 200 page document this alone
saves you a second and a half without implementing
the techniques I used for paraStyleToTag.

So yes, you were right John, searching for styles
is inherently quick.

Hope this helps Pierre.

Antony

~~~~~~~~~~

"John McGhie [MVP - Word]" <jo...@mcghie-information.com.au> wrote in message news:<l66pju85dlg5ofvj7...@4ax.com>...

Howard Kaikow

Jul 23, 2002, 9:28:03 AM
It is my understanding that all of the Mac VBA versions were based on VBA 5,
not VBA 6.

--
http://www.standards.com/; Programming and support for Word macros,
including converting from WordBasic to VBA; Technical reviewing; Standards;
Product functional/design/specifications
------------------------------------------------


"John McGhie [MVP - Word]" <jo...@mcghie-information.com.au> wrote in message

news:8mhmjus559alf74o0...@4ax.com...

Paul Berkowitz

Jul 23, 2002, 11:37:06 PM
That is correct. It's the same VBA as Word 97, not Word 2000 without the
enums as John said.

--
Paul Berkowitz
MVP Entourage

Please "Reply To Newsgroup" to reply to this message. Emails will be
ignored.

PLEASE always state which version of Entourage you are using - 2001 or X.
It's often impossible to answer your questions otherwise.

Antony

Jul 24, 2002, 5:33:30 AM
Sorry, forgot to add an Exit For for a little
extra zip:

If Selection.Type = wdSelectionIP Then
MsgBox "no selection"
Else
For Each para In Selection.Paragraphs
For i = 0 To UBound(my_array_of_styles)
If para.Style = my_array_of_styles(i)(PARASTYLE) Then
With para.Range
.MoveEnd Unit:=wdCharacter, Count:=-1
.InsertBefore Text:=("<" & my_array_of_styles(i)(MYTAG) & ">")
.InsertAfter Text:=("</" & my_array_of_styles(i)(MYTAG) & ">")
End With

Here>> Exit For


End If
Next i
Next para
End If

Regards,

Antony

~~~~~~~~~~
ad_sc...@postmaster.co.uk (Antony) wrote in message news:<ac6de390.02072...@posting.google.com>...

John McGhie [MVP]

Jul 24, 2002, 9:24:05 AM
Hi Anthony:

Lovely... Eloquent :-)

Now *I* have a question: Would we squeeze a few more CPU cycles out of it if
we moved the evaluation of UBound(my_array_of_styles) out of the loop? Or
did you leave it there because it's a very short array, so the overhead's not
significant?

Would it be quicker to use para.range.start.insertbefore and
(para.range.end - 1).insert.after to perform the insertion directly without
having to move the range? Or does that effectively re-evaluate the range
anyway?

I love this stuff :-)

Many thanks

This responds to article <ac6de390.02072...@posting.google.com>,
from "Antony" <ad_sc...@postmaster.co.uk> on 24/7/02 7:33 PM:

--
Please post replies to the newsgroup to maintain the thread.

John McGhie, Microsoft MVP: Word for Macintosh and Word for Windows
Consultant Technical Writer
<jo...@mcghie-information.com.au>
+61 4 1209 1410; Sydney, Australia: GMT + 10 hrs

Jonathan West

Jul 24, 2002, 10:33:18 AM

"John McGhie [MVP]" <jo...@mcghie-information.com.au> wrote in message
news:B964E795.1A177%jo...@mcghie-information.com.au...

> Hi Anthony:
>
> Lovely... Eloquent :-)
>
> Now *I* have a question: Would we squeeze a few more CPU cycles out of it if
> we moved the evaluation of UBound(my_array_of_styles) out of the loop? Or
> did you leave it there because it's a very short array, so the overhead's not
> significant?

I suspect that this will make little difference. It is a traditional code
optimisation technique, but with Word VBA macros, the commands that
manipulate numbers and strings are already lightning fast compared to those
which manipulate the Word object model. Even as it is, the UBound function
is only evaluated once per paragraph, on first entry to the inner loop.
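
The point about the loop bound can be checked with a minimal sketch. The names CountedLimit and DemoLoopBound are made up for illustration and appear nowhere in the thread; CountedLimit simply stands in for UBound(my_array_of_styles). VBA evaluates the upper bound of a For loop when the loop is entered, so the helper should report a single call however many passes the loop makes.

```vba
Option Explicit

Private limitCalls As Long   ' how many times CountedLimit has run

' Hypothetical helper: stands in for UBound(my_array_of_styles)
Private Function CountedLimit() As Long
    limitCalls = limitCalls + 1
    CountedLimit = 3
End Function

Sub DemoLoopBound()
    Dim i As Long
    Dim passes As Long

    limitCalls = 0
    For i = 0 To CountedLimit()   ' bound is evaluated on entry only
        passes = passes + 1
    Next i

    ' Reports the number of loop passes and of bound evaluations
    MsgBox "Passes: " & passes & ", bound evaluated " & limitCalls & " time(s)"
End Sub
```

If the message reports one evaluation, hoisting UBound out of the inner loop would save only one cheap call per paragraph.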

>
> Would it be quicker to use para.range.start.insertbefore and
> (para.range.end - 1).insert.after to perform the insertion directly without
> having to move the range? Or does that effectively re-evaluate the range
> anyway?

This is a more fruitful line of attack. However, better still might be to
use this to insert at the end of the para

.Characters.Last.InsertBefore Text:=("</" & my_array_of_styles(i)(MYTAG) & ">")

This works because .Characters.Last is a range marking the last character of
the paragraph i.e. the paragraph mark. Use InsertBefore to insert the tag in
front of it.
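
As a rough sketch of how that could look for a whole paragraph: TagParagraph and DemoTagFirstParagraph are hypothetical names, not from the thread, and the demo assumes the active document has at least one paragraph. It relies only on Range.InsertBefore and Characters.Last, so the end of the range never has to be moved.

```vba
' Hypothetical helper, not from the thread: wraps one paragraph in a tag
' without moving the end of the range first.
Sub TagParagraph(para As Paragraph, tagName As String)
    With para.Range
        .InsertBefore "<" & tagName & ">"
        ' .Characters.Last is the paragraph mark, so the closing tag
        ' lands just in front of it
        .Characters.Last.InsertBefore "</" & tagName & ">"
    End With
End Sub

Sub DemoTagFirstParagraph()
    ' Assumes the active document has at least one paragraph
    TagParagraph ActiveDocument.Paragraphs(1), "h1"
End Sub
```

Whether this is measurably faster than MoveEnd followed by InsertAfter would need timing on a real document.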

One other speedup approach might be worth trying. Using a For Each loop to
go through one of Word's built-in collections can be extremely slow for
larger collections. It may be faster to use the Paragraph.Next method to
get the next paragraph on from the current one, and then use
the InRange method to see whether the next paragraph is still inside the
selection. The outer loop would therefore become a Do While loop rather than
a For Each ... Next loop.

If there are more than a couple of hundred paragraphs in the selection you
should see a noticeable difference. If there are more than a thousand, you
should see a difference sufficient to transform the performance of the
macro.

There is one problem, however, with this approach. If the selection contains a
table with vertically merged cells, then the next paragraph within the table
can take you back to a cell that has already been checked. You need to have
code that guards against this. It's simple enough to do: you just check the
value of .Range.End against that of the previous paragraph. If it is less,
you need to take alternative action to move on.
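
The Do While traversal described above might be sketched like this. Everything here is an assumption for illustration: the Sub name is made up, the selection is assumed to cover whole paragraphs, and the merged-cell guard simply stops instead of attempting the unspecified "alternative action". The sketch only counts paragraphs, so it changes no text.

```vba
' Hypothetical sketch, not from the thread. Assumes the selection covers
' whole paragraphs; it only counts them, so no text is changed.
Sub CountParagraphsWithNext()
    Dim selRange As Range
    Dim para As Paragraph
    Dim lastEnd As Long
    Dim n As Long

    If Selection.Type = wdSelectionIP Then Exit Sub

    Set selRange = Selection.Range
    Set para = selRange.Paragraphs(1)
    lastEnd = -1

    Do While Not para Is Nothing
        ' Stop once the paragraph is no longer inside the selection
        If Not para.Range.InRange(selRange) Then Exit Do

        ' Guard for vertically merged cells: if Next has jumped backwards,
        ' this sketch simply stops rather than trying to recover
        If para.Range.End <= lastEnd Then Exit Do

        lastEnd = para.Range.End
        n = n + 1             ' per-paragraph work (style check, tagging) goes here
        Set para = para.Next  ' expected to be Nothing after the last paragraph
    Loop

    MsgBox n & " paragraph(s) visited"
End Sub
```

Timing this against the For Each version on the same selection is the only way to tell which is faster for a given document.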

>
> I love this stuff :-)

<aol>Me too!</aol>

--
Regards
Jonathan West - Word MVP
MultiLinker - Automated generation of hyperlinks in Word
Conversion to PDF & HTML
http://www.multilinker.com
Word FAQs at http://www.multilinker.com/wordfaq
Please post any follow-up in the newsgroup. I do not reply to Word questions
by email


Pierre Igot

Jul 24, 2002, 2:48:31 PM
> Sorry, forgot to add an Exit For for a little
> extra zip:
>
> If Selection.Type = wdSelectionIP Then
> MsgBox "no selection"
> Else
> For Each para In Selection.Paragraphs
> For i = 0 To UBound(my_array_of_styles)
> If para.Style = my_array_of_styles(i)(PARASTYLE) Then
> With para.Range
> .MoveEnd Unit:=wdCharacter, Count:=-1
> .InsertBefore Text:=("<" & my_array_of_styles(i)(MYTAG) & ">")
> .InsertAfter Text:=("</" & my_array_of_styles(i)(MYTAG) & ">")
> End With
> Here>> Exit For
> End If
> Next i
> Next para
> End If
>
> Regards,
>
> Antony

Thanks for all the suggestions, Antony. I will try them out whenever I
get the chance.

This confirms my suspicions, however, i.e. that macro programming in
VBA is not for the faint-hearted :).

abc

Jul 24, 2002, 6:27:47 PM
Hello all,

(Still the same me!)

----- Original Message -----
From: "Jonathan West" <jw...@mvps.org>
Newsgroups: microsoft.public.mac.office.word,microsoft.public.word.vba.customization,microsoft.public.word.vba.general
Sent: Wednesday, July 24, 2002 3:33 PM
Subject: Re: WORD X: VBA macro including find/replace with styles much slower than regular macro


>


> "John McGhie [MVP]" <jo...@mcghie-information.com.au> wrote in message
> news:B964E795.1A177%jo...@mcghie-information.com.au...
> > Hi Anthony:
> >
> > Lovely... Eloquent :-)
> >
> > Now *I* have a question: Would we squeeze a few more CPU cycles out of it if
> > we moved the evaluation of UBound(my_array_of_styles) out of the loop? Or
> > did you leave it there because it's a very short array, so the overhead's not
> > significant?
>
> I suspect that this will make little difference. It is a traditional code
> optimisation technique, but with Word VBA macros, the commands that
> manipulate numbers and strings are already lightning fast compared to those
> which manipulate the Word object model. Even as it is, the UBound function
> is only evaluated once per paragraph, on first entry to the inner loop.
>

Agreed. Things like ubound() don't cause bottlenecks unless they
are being called time and time again in a tight loop. I don't think it's
really getting hammered in this instance.
(On the other hand I unwittingly got hammered by the lookup functions
when I first start programming with Access; they seemed so convenient...
So sometimes it does pay to investigate this sort of thing.)

But anyway, seeing as your definition of the array is hardcoded,
you could have the upper bound of the array as a constant, also
hard coded. But I don't like magic numbers in code (that's why
I defined the array the way I did, almost out of petulance!). Once
you've got the basic algorithm sorted out, all other optimization is a
case of diminishing returns at the expense of readability and
maintainability. Any optimizations should be carefully commented.

Plus I wanted to leave as much of Pierre's original code intact so
that my ideas would be as clear as possible to him. But you are right,
there are further optimizations if you need time-critical code, as you have
described below. But for this sort of batch processing thing it's not really
necessary.

More thoughts:

<begin stable numbering>

1: On one job I had, I went home at the end of the day and just left stuff
running overnight. No worries. Unless it went wrong! :(

2: I reduced the time for my apx. 2000 paragraph test
document by a couple orders of magnitude. That's the
point where I give up. But as you say, it's still good fun!

3: Don't make it so fast that there's no time left to make a pretty
status bar move across the screen to impress the boss ;)

4: My version seemed fast, but waiting for Pierre to test
on /real/ documents. That's the acid test.

<end stable numbering>

Regards,

Antony
~~~~~~~~~~

Antony Scriven

Jul 24, 2002, 6:31:37 PM
Eeep! Knew I was mailing from a different account,
but didn't know I had *that* for my name and e-mail!

Antony

~~~~~~~~~~
"abc" <a...@b.com> wrote in message news:ahn9kq$pf6$1...@newsg2.svr.pol.co.uk...

Antony Scriven

Jul 24, 2002, 6:43:17 PM
Hello Pierre

"Pierre Igot" <appl...@applelust.com> wrote in message
news:c274558d.02072...@posting.google.com...

> Thanks for all the suggestions, Antony. I will try them out whenever I
> get the chance.
>
> This confirms my suspicions, however, i.e. that macro programming in
> VBA is not for the faint-hearted :).
>
> Pierre
> ---
> LATEXT - Literature and Visual Arts http://www.latext.com
> "Apple Peel" Columnist at Applelust.com

Not so, Pierre. You wrote a macro that worked, that saved you time,
and that was a lot quicker than doing the processing by hand. So, by
that judgement your code was a great success. You just wanted a little
extra speed for your code. You can get this by applying general
programming principles, as I have shown, because I am no great VBA
wizard.

Happy programming,

Antony
~~~~~~~~~~


Jonathan West

Jul 24, 2002, 6:47:55 PM
Hi Anthony,

> >
> > More thoughts:
> >
> > <begin stable numbering>
> >
> > 1: one job I had I went home at the end of the, just left stuff running
> > overnight. No worries. Unless it went wrong! :(

I do quite a few of those. I'm posting from the vba.general group, and I
have to say that for leaving macros running overnight, the only stable
combination I have found is Word 2K running under Win2K. Win 9x just doesn't
have the stability, and Word 2002 seems a bit less stable under Win2K than
Word 2K is. I haven't yet explored the wonders of Win XP, so I don't know
whether that makes things better. I haven't used a Mac lately, so I can't
comment on which combinations are stable platforms for long-running batches.

> >
> > 2: I reduced the time for my apx. 2000 paragraph test
> > document by a couple orders of magnitude. That's the
> > point where I give up. But as you say, it's still good fun!

2 orders of magnitude is good. For a 2000 para test, I suspect that there's
still another order of magnitude to be got. A 2000 para collection is an
unwieldy thing to manipulate.

> >
> > 3: Don't make it so fast that there's no time left to make a pretty
> > status bar move across the screen to impress the boss ;)

Awwwww! :-)

> >
> > 4: My version seemed fast, but waiting for Pierre to test
> > on /real/ documents. That's the acid test.

Very true.

Antony

Jul 25, 2002, 8:43:09 AM
Hello all,

*LOL* I put the Exit For in the wrong place in my post!
No wonder it's fast!! (Definitive listing for the paragraph
styles below.)

"Jonathan West" <jw...@mvps.org> wrote in message news:<Osgett2MCHA.944@tkmsftngp10>...


> Hi Anthony,
>
> > >
> > > More thoughts:
> > >
> > > <begin stable numbering>
> > >
> > > 1: one job I had I went home at the end of the, just left stuff running
> > > overnight. No worries. Unless it went wrong! :(
>
> I do quite a few of those. I'm posting from the vba.general group, and I
> have to say that for leaving macros running overnight, the only stable
> combination I have found is Word 2K running under Win2K. Win 9x just doesn't
> have the stability, and Word 2002 seems a bit less stable under Win2K than
> Word 2K is. I haven't yet explored the wonders of Win XP, so I don't know
> whether that makes things better. I haven't used a Mac lately, so I can't
> comment on which combinations are stable platforms for long-running batches.
>

Um. Sorry was speaking in general,
not necessarily about Word or any Microsoft operating system!
So: I've not left Word macros running overnight. Have done million+
mail merges in word6 on Win3.1. Apart from memory leak,
no bad stability problems. You just need an eggtimer to
remind you when next to reboot :) I have run macros taking 1hr + on
Word95 and upwards on winNT with no real problems. Have
left stuff running when I've gone home; everything okay in the
morning. Never used a Mac, sorry.

> > >
> > > 2: I reduced the time for my apx. 2000 paragraph test
> > > document by a couple orders of magnitude. That's the
> > > point where I give up. But as you say, it's still good fun!
>
> 2 orders of magnitude is good. For a 2000 para test, I suspect that there's
> still another order of magnitude to be got. A 2000 para collection is an
> unwieldy thing to manipulate.
>

It goes without saying that you need to turn off screenupdating,
but I didn't do that in my original code, so Pierre, make sure
you do that. (And turn it back on afterwards. The number of times
I've sat there wondering why my macro is taking so long... :)

Right my new test doc: 10k pages, about 40k paragraphs, all
formatted with styles. 19.5Mb file. BTW, I'm running Word2k/Win2k
on goodness knows what hardware--using win2k pizza box. Load on
the server will make a difference to the test times, but I will
do several runs, make sure the variance isn't too large, and take
an average.

Damn! It's trying to save the AutoRecovery file!

Okay. Gonna just concentrate on this paragraph formatting bit
and remove the character styles code. Going to display a
count in the status bar so I know it's doing something. (But
I won't put total number of paragraphs in the status bar: for
a large document this will take a /long/ time to enumerate.)
Going to use Timer() to measure execution time in seconds
(execution time is v.long so Timer() is accurate enough).
Here's the code I've used.

____Start of listing____

Sub webStyles()

On Error Goto err_handler



Const PARASTYLE As Integer = 0
Const MYTAG As Integer = 1

Dim para As Paragraph
Dim my_array_of_styles As Variant

Dim i As Long 'index for main loop
Dim paracounter As Long 'count number of paras processed
Dim starttime As Single

starttime = Timer


my_array_of_styles = Array( _
Array("Citation", "blockquote"), _
Array("Heading 1", "h1"), _
Array("Heading 2", "h2"), _
Array("Heading 3", "h3") _
)

Application.ScreenUpdating = False

paracounter = 1


For Each para In Selection.Paragraphs
For i = 0 To UBound(my_array_of_styles)
If para.Style = my_array_of_styles(i)(PARASTYLE) Then
With para.Range
.MoveEnd Unit:=wdCharacter, Count:=-1
.InsertBefore Text:=("<" & my_array_of_styles(i)(MYTAG) & ">")

.InsertAfter Text:=("</" & my_array_of_styles(i)(MYTAG) & ">")
End With


Exit For
End If
Next i

StatusBar = paracounter
paracounter = paracounter + 1
Next para

Application.ScreenUpdating = True
MsgBox Timer - starttime

Exit Sub

err_handler:
Application.Screenupdating = True
On Error Goto 0
Resume
End Sub

____End of listing____

The time taken is 95 seconds on average. If you know how to divide
that by 10 then /please/ let me know!

JW: One other speedup approach might be worth trying. Using a
JW: For Each loop to go through one of Word's built in
JW: collections can be extremely slow for larger collections. It
JW: may be faster to use the paragraphs.next property to get the
JW: location of the next paragraph on from the current one, and
JW: then use the InRange method to see whether the next
JW: paragraph is still inside the selection. The outer loop
JW: would therefore become a Do While loop rather than a For
JW: Each Next loop.

I thought that For Each ... Next was quick. You let VBA return
each member of the collection in the way it sees fit.
This results in speedy access to each member, but the
order of access is unpredictable. At least that's how I
thought it worked.

I've looked at Paragraph.Next and it seems to be slow to me.
I guess this is because VBA is trying to return the paragraphs
in the order that they appear in the document. Perhaps it is
traversing a linked list of paragraph structures. I would
guess that For Each ... Next just goes through memory allocated
for the pointers to paragraph structures sequentially which would
be quicker. I may be completely wrong, I'm just speculating.

To me the code looks as fast as it can reasonably get. I don't
think you can do any more with it unless you have knowledge of
Word's internal data structures and functions. And then you
would need to seriously comment your code so that you know what
it means a week later!

Pierre, your code seemed so innocent; look what you've done! :)
Never done this sort of thing before; I've learnt quite a bit
about VBA with these experiments.

Cheers,

Antony

~~~~~~~~~~

John McGhie [MVP - Word]

Jul 26, 2002, 8:27:28 AM
Hi Anthony:

This responds to microsoft.public.mac.office.word on 25 Jul 2002 05:43:09
-0700, ad_sc...@postmaster.co.uk (Antony):

> Pierre, your code seemed so innocent; look what you've done! :)
> Never done this sort of thing before; I've learnt quite a bit
> about VBA with these experiments.

Nowhere near as much as I have :-)

I am most grateful to you, Many thanks
