MailMergeDataSource iteration extremely slow

amr

Mar 4, 2010, 6:40:01 AM

Hi

I am reading addresses from a MailMergeDataSource in Word 2007.
My problem is that this is extremely slow: every call to get_Item
takes about a second.
The app is a C# Word 2007 application add-in.

Is there a way to speed this up, or another approach to solve the
issue?

Code:

object indexLastname = MailMergeDataFieldIndices.Lastname;
object indexFirstname = MailMergeDataFieldIndices.Firstname;
object indexCompany = MailMergeDataFieldIndices.Company;
object indexStreet = MailMergeDataFieldIndices.Street;
object indexZip = MailMergeDataFieldIndices.Zip;
object indexCity = MailMergeDataFieldIndices.City;

object index = MailMergeDataFieldIndices.None;
for (int i = 0; i < count; i++) {
    string name = fields.get_Item(ref indexFirstname).Value + " " +
        fields.get_Item(ref indexLastname).Value;
    string company = fields.get_Item(ref indexCompany).Value;
    string street = fields.get_Item(ref indexStreet).Value;
    string zip = fields.get_Item(ref indexZip).Value;
    string city = fields.get_Item(ref indexCity).Value;

    dataSrc.ActiveRecord =
        Word.WdMailMergeActiveRecord.wdNextDataSourceRecord;
}

Thanks in advance
amr

Peter Jamieson

Mar 4, 2010, 11:54:38 AM

I created a very small VS2008 Addin with the following code as a test:

public partial class ThisAddIn
{
    private Word.Application oWord;

    private void ThisAddIn_Startup(object sender, System.EventArgs e)
    {
        oWord = this.Application;
        oWord.MailMergeBeforeRecordMerge +=
            new Word.ApplicationEvents4_MailMergeBeforeRecordMergeEventHandler(
                oWord_MailMergeBeforeRecordMerge);
    }

    private void ThisAddIn_Shutdown(object sender, System.EventArgs e)
    {
    }

    // Appends the value of every data field of the current record to the document.
    private void oWord_MailMergeBeforeRecordMerge(
        Microsoft.Office.Interop.Word.Document doc, ref bool Cancel)
    {
        for (int i = 1; i <= doc.MailMerge.DataSource.DataFields.Count; i++)
        {
            object j = i;
            doc.Content.InsertAfter(
                doc.MailMerge.DataSource.DataFields.get_Item(ref j).Value);
        }
    }
}

It seems to run pretty fast (small Excel data source with 4 columns). I
wouldn't execute anything like
dataSrc.ActiveRecord = Word.WdMailMergeActiveRecord.wdNextDataSourceRecord
inside that handler: when you're using MailMerge events you shouldn't
move between records within the event handler, because Word tends to get
confused. So I hope you aren't doing that!

However, assuming that you aren't, my next best guess would be that it's
moving from record to record that's causing the problem. That might be
the case if you are using a data source that Word needs to open with one
of its internal/external converters, e.g. a Word document, text file
with more than 255 fields, etc. But that's not what you're saying, so
I'm stuck.

If you can provide a complete, working chunk of code it would be easier
to check whether it is slow here.

Peter Jamieson

http://tips.pjmsn.me.uk

amr

Mar 5, 2010, 4:26:06 AM

Hi Peter,
thanks for investigating.

I don't execute this code during a mail merge event.
After the mail merge addresses have been selected, the user can open a dialog and
edit a product list. Each address has a product. When the dialog is initialised, I
load the addresses from the data source. The source is an Outlook address book
with two members.
When I try your example code during the mail merge event, you're right, it's
pretty fast.

The code below should work once the addresses have been loaded.

using System;
using System.Collections.Generic;
using System.Windows.Data;

[ValueConversion(typeof(Word.MailMergeDataSource), typeof(List<Address>))]
public class MailMergeDataSourceConverter : IValueConverter {

    private List<Address> addressList;

    #region IValueConverter Member

    public object Convert(object value, Type targetType, object parameter,
            System.Globalization.CultureInfo culture) {
        if (value != null) {
            try {
                Word.MailMergeDataSource dataSrc =
                    (Word.MailMergeDataSource)value;

                int count = dataSrc.RecordCount;
                Log.Info(string.Format(
                    "Converting MailMergeDataSource (Count: {0})", count));

                if (count > 0) {
                    addressList = new List<Address>();

                    Word.MailMergeDataFields fields = dataSrc.DataFields;

                    object indexLastname = MailMergeDataFieldIndices.Lastname;
                    object indexFirstname = MailMergeDataFieldIndices.Firstname;
                    object indexCompany = MailMergeDataFieldIndices.Company;
                    object indexStreet = MailMergeDataFieldIndices.Street;
                    object indexZip = MailMergeDataFieldIndices.Zip;
                    object indexCity = MailMergeDataFieldIndices.City;

                    object index = MailMergeDataFieldIndices.None;
                    for (int i = 0; i < count; i++) {

                        int fieldCount = fields.Count;

                        // Crap
                        string name = (fieldCount >= (int)indexFirstname) ?
                            fields.get_Item(ref indexFirstname).Value + " " +
                            fields.get_Item(ref indexLastname).Value : "";

                        string company = (fieldCount >= (int)indexCompany) ?
                            fields.get_Item(ref indexCompany).Value : "";
                        string street = (fieldCount >= (int)indexStreet) ?
                            fields.get_Item(ref indexStreet).Value : "";
                        string zip = (fieldCount >= (int)indexZip) ?
                            fields.get_Item(ref indexZip).Value : "";
                        string city = (fieldCount >= (int)indexCity) ?
                            fields.get_Item(ref indexCity).Value : "";

                        addressList.Add(new Address() {
                            Name = name,
                            Company = company,
                            Street = street,
                            Zip = zip,
                            City = city
                        });

                        // Advance to the next record in the data source.
                        dataSrc.ActiveRecord =
                            Word.WdMailMergeActiveRecord.wdNextDataSourceRecord;
                    }
                }
            }
            catch (Exception ex) {
                Log.Error(ex);
            }
        }

        return addressList;
    }

    public object ConvertBack(object value, Type targetType, object parameter,
            System.Globalization.CultureInfo culture) {
        throw new NotImplementedException();
    }

    #endregion
}

Maybe it's a matter of interop. I don't really know how this datasource
behaves or works under the hood.

Peter Jamieson

Mar 5, 2010, 9:13:14 AM

> The code below should work after addresses were loaded.

Sorry, to look at the problem I would need a much simpler piece of code
that
a. I can get into operation without knowing a whole lot more about C#
than I do, and the specific interfaces you are using
b. defines stuff such as MailMergeDataFieldIndices, which is not
declared etc. etc.

If you have the inclination and can provide the code for a very small
Addin that would do that, perhaps fired from a single ribbon button, I
would be happy to give it a go.

> Maybe it's a matter of interop.

That could be, but I can't say I have enough experience of working with
Mailmerge via Interop to have any idea how likely that would be.

> I don't really know how this datasource
> behaves or works under the hood.

Well, Word can connect to a data source in four basic ways. I don't
know whether or not you know what type of data source will be used
(possibly "whatever the user throws at it"?), but for test purposes you
would presumably have to set up a document with a suitable data source.
Word uses one of four "connection methods":
a. an internal/external text converter. It would do that if the data
source was another Word document containing a table with one row
for headers and subsequent rows for data.
b. DDE (obsolescent, but users still sometimes use it to solve a few
formatting problems).
c. ODBC (obsolescent)
d. OLE DB. It would do that if, for example, you are using Word XP or
later and your data source is an Excel worksheet.

If you can set up some small data sources that you could open with each
method, you may find that the slowdown only occurs with some connection
types.
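
To compare connection types, something like the following could be used to time field reads for whichever data source is currently attached. This is only a sketch: the class and method names are made up for illustration, it assumes the Word interop assembly is referenced with the usual Word alias, and it only reads the fields of the active record.

```csharp
using System.Diagnostics;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the code posted in this thread.
public static class DataSourceTiming
{
    // Returns the milliseconds needed to read every data field of the
    // currently active record once via get_Item.
    public static long TimeFieldReads(Word.MailMergeDataSource dataSrc)
    {
        Word.MailMergeDataFields fields = dataSrc.DataFields;
        int fieldCount = fields.Count;

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 1; i <= fieldCount; i++)   // DataFields is 1-based
        {
            object index = i;
            string unused = fields.get_Item(ref index).Value;
        }
        watch.Stop();

        // dataSrc.Type reports how Word opened the source (a WdMailMergeDataSource value).
        Debug.WriteLine(string.Format("{0}: {1} fields in {2} ms",
            dataSrc.Type, fieldCount, watch.ElapsedMilliseconds));
        return watch.ElapsedMilliseconds;
    }
}
```

Running it once per test document (Outlook, .csv, .xls, Word table) should show whether the cost per get_Item call depends on the connection method.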

Peter Jamieson

http://tips.pjmsn.me.uk

amr

Mar 5, 2010, 11:33:03 AM

A simple callback for a ribbon button is below. Before you press the button,
you have to load a data source. I loaded my Outlook contacts.

public void TestDataSource(Microsoft.Office.Core.IRibbonControl ribbon) {

    Word.MailMergeDataSource dataSrc =
        Application.ActiveDocument.MailMerge.DataSource;

    int count = dataSrc.RecordCount;

    if (count > 0) {
        // Jagged array: one row per record, allocated inside the loop.
        string[][] addressList = new string[count][];

        Word.MailMergeDataFields fields = dataSrc.DataFields;

        object indexLastname = 2;
        object indexFirstname = 1;
        object indexCompany = 4;
        object indexStreet = 8;
        object indexZip = 11;
        object indexCity = 9;

        object index = 0;

        for (int i = 0; i < count; i++) {
            int fieldCount = fields.Count;
            addressList[i] = new string[5];

            addressList[i][0] = (fieldCount >= (int)indexFirstname) ?
                fields.get_Item(ref indexFirstname).Value + " " +
                fields.get_Item(ref indexLastname).Value : "";

            addressList[i][1] = (fieldCount >= (int)indexCompany) ?
                fields.get_Item(ref indexCompany).Value : "";
            addressList[i][2] = (fieldCount >= (int)indexStreet) ?
                fields.get_Item(ref indexStreet).Value : "";
            addressList[i][3] = (fieldCount >= (int)indexZip) ?
                fields.get_Item(ref indexZip).Value : "";
            addressList[i][4] = (fieldCount >= (int)indexCity) ?
                fields.get_Item(ref indexCity).Value : "";

            // Advance to the next record in the data source.
            dataSrc.ActiveRecord =
                Microsoft.Office.Interop.Word.WdMailMergeActiveRecord.wdNextDataSourceRecord;
        }
    }

}

Peter Jamieson

Mar 5, 2010, 2:46:07 PM

I made some minor changes and am running the code in a DocumentOpen
event, and yes, with some types of data source it does appear to be very
slow. However, it's not obvious what the common factor is - so far I
have tried
a. an Outlook source (i.e. connecting from Word). That would use the
Jet/ACE OLE DB provider and the Outlook/Exchange IISAM. Slow.
b. More or less the same data in a .csv file. In this case, Word would
use the same provider but the Text IISAM (may not mean anything to you).
Slow.

Either way, I see the following message, which doesn't mean much to me
but may mean something to you:

A first chance exception of type
'System.Runtime.InteropServices.SEHException' occurred in WordAddIn3.DLL

c. More or less the same data in a .xls file. In this case, Word would
probably use the same provider but the Excel IISAM. Much quicker!

Not really enough to draw much of a conclusion, except that the type of
data source does seem to make a significant difference.

Peter Jamieson

http://tips.pjmsn.me.uk

Peter Jamieson

Mar 6, 2010, 4:16:42 AM

BTW, another thing you /may/ need to take account of in your code is that

dataSrc.RecordCount

does not necessarily return the number of records in the data source.
With some data sources (e.g. a Word document) it may return -1.

Also, depending on what your code needs to achieve, you may have to take
account of the fact that the user can include/exclude individual
records, so that even when RecordCount is not -1, it may not reflect the
count that is to be merged.

You may be able to circumvent that by
a. setting .ActiveRecord to wdFirstRecord
b. setting .ActiveRecord to wdNextRecord until you get an exception
(In VBA, I see

5853 Invalid parameter.

when I try to go past "end of file")
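
In C#, steps a. and b. might look roughly like the sketch below. The helper name is hypothetical, and it assumes the VBA error surfaces through interop as a COMException. As an extra precaution that is not part of the steps above, it also stops if the record number fails to advance.

```csharp
using System;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the code posted in this thread.
public static class RecordWalker
{
    // Calls visit once per record, starting at wdFirstRecord, and returns
    // the number of records visited. Does not rely on RecordCount.
    public static int ForEachRecord(Word.MailMergeDataSource dataSrc, Action visit)
    {
        int visited = 0;
        dataSrc.ActiveRecord = Word.WdMailMergeActiveRecord.wdFirstRecord;

        while (true)
        {
            visit();
            visited++;

            // Reading ActiveRecord yields the current record number.
            int before = (int)dataSrc.ActiveRecord;
            try
            {
                dataSrc.ActiveRecord = Word.WdMailMergeActiveRecord.wdNextRecord;
            }
            catch (COMException)
            {
                break;  // assumed to be the "5853 Invalid parameter" case
            }

            // Defensive assumption: no movement means we are on the last record.
            if ((int)dataSrc.ActiveRecord == before)
            {
                break;
            }
        }
        return visited;
    }
}
```

The visit callback would read whatever fields are needed from the data source for the record that is active at that moment.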

Peter Jamieson

http://tips.pjmsn.me.uk

amr

Mar 9, 2010, 5:38:04 AM

Hi Peter,

Sorry for the delay.
Thank you for your help and suggestions.

Greets
amr

amr

Mar 26, 2010, 9:03:01 AM

Solution to speed it up:

Instead of using self-declared indices and the get_Item method, we have to
use a Word enum.

Word.MappedDataFields fields = dataSrc.MappedDataFields;

string company =
fields[Microsoft.Office.Interop.Word.WdMappedDataFields.wdCompany].Value;
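
Extended to the whole address, the same idea could look like the sketch below. The helper name is hypothetical, and the choice of wdAddress1 for the street and wdPostalCode for the zip is an assumption about how the source's columns are mapped.

```csharp
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the code posted in this thread.
public static class MappedAddressReader
{
    // Returns { name, company, street, zip, city } for the active record,
    // using Word's mapped fields instead of numeric column indices.
    public static string[] ReadActiveRecord(Word.MailMergeDataSource dataSrc)
    {
        Word.MappedDataFields fields = dataSrc.MappedDataFields;

        string firstName = fields[Word.WdMappedDataFields.wdFirstName].Value;
        string lastName = fields[Word.WdMappedDataFields.wdLastName].Value;
        string company = fields[Word.WdMappedDataFields.wdCompany].Value;
        string street = fields[Word.WdMappedDataFields.wdAddress1].Value;   // assumed mapping
        string zip = fields[Word.WdMappedDataFields.wdPostalCode].Value;    // assumed mapping
        string city = fields[Word.WdMappedDataFields.wdCity].Value;

        string name = (firstName + " " + lastName).Trim();
        return new[] { name, company, street, zip, city };
    }
}
```

Calling it once per record, after moving ActiveRecord, gives the same five values the earlier get_Item loop collected.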

Maybe this helps somebody.

Greets
amr

Peter Jamieson

Mar 27, 2010, 4:41:14 AM

Thanks for posting back. Wow!

Peter Jamieson

http://tips.pjmsn.me.uk