For test 1, I set CommitBatchSize to 5 and CommitBatchThreshold to 10,
then issued 7 update statements that affected 1 row each. In profiler
I saw one batch with 5 update statements (all with the same
transaction ID) followed by a second batch with 2 update statements
(also all with the same transaction ID, but not the same transaction
ID as the first batch of 5 updates) . No problem, I expect it to work
this way.
For test 2, same settings as test 1, I issued a single update
statement that affected 20 rows. In profiler I saw one batch with 20
update statements, all with the same transaction ID.
For test 3, I set CommitBatchSize to 100 and CommitBatchThreshold to
5. I issued the same 7 update statements as in test 1. In profiler I
saw one batch with 6 update statements (all with the same transaction
ID) followed by a second batch with 1 update statement (different
transaction ID than the last).
Based on what I see in profiler I can only conclude that when
CommitBatchThreshold > CommitBatchSize the value for
CommitBatchThreshold is essentially ignored. However, when
CommitBatchThreshold <= CommitBatchSize the value for
CommitBatchThreshold is what dictates how many commands are delivered
in a batch. Even stranger, it appears that in the second case the
number of commands that are delivered is actually CommitBatchThreshold
+ 1.
Am I missing something obvious?
However what is key to remember in this case is that the distribution agent
always honors transactional boundaries.
The only problem with your tests is test #3. If it was 5 and 2 it would make
more sense. I am wondering if perhaps they was a gap between the first 6 and
the last one.
If BOL was correct, it would be possible to break the ACID properties of a
transaction via a small value for -CommitBatchThreshold. In BOL I think the
"COMMIT" that is referred to is not actually anything to do with transactions
- it could more reasonably be the no of records/transactions obtained in a
network round trip and applied at the subscriber. Empirically I find that the
transaction boundaries are always respected regardless of these settings and
I checked this using log explorer at the subscriber. So, if I'm right, then
the BOL definitions should read:
CommitBatchSize : "Is the maximum number of transactions to be read from the
distributor before being applied at the subscriber".
CommitBatchThreshold : "Is the maximum number of commands to be read from
the distributor before being applied at the subscriber".
I've added the "maximum" statement because if there are only a few
transactions of commands, these are still sent over on synchronization.
Let's request verification from MS for this section.
Cheers,
Paul Ibison
-CommitBatchSize commit_batch_size
Is the number of transactions to be issued to the Subscriber before a COMMIT
statement is issued. The default is 100.
-CommitBatchThreshold commit_batch_threshold
Is the number of replication commands to be issued to the Subscriber before
a COMMIT statement is issued. The default is 1000.
The posters problem is with singleton updates.
" I issued the same 7 update statements as in test 1. In profiler I
saw one batch with 6 update statements (all with the same transaction
ID) followed by a second batch with 1 update statement (different
transaction ID than the last).
"
He does 7 separate singleton updates, each of them affect 1 row, each of
them a transaction.
6 of them are applied in a batch (the commitbatchsize of 5 should have
limited it to 5), and then the other one is in a different batch -different
transaction id. At least this is how I read the post.
What bol does not make clear is how it handles transactions and its
constituent commands.
If in the first case I did a batch update which affected 20 rows, this 20
row update would be replicated in a single transaction event though my
commitbatchthreshold was 10. The distribution agent respects transactional
boundaries.
Where you would see it is if you issued two transactions, one of 5 rows and
one of 6 rows.
Commitbatchsize would not trigger a commit as its two transactions, but
commitbatchthreshold would as its 11 commands.
This makes producing the rules in the original post "CommitBatchThreshold >
CommitBatchSize" etc quite tricky.
On the point of the poster's strange issue with 7 singleton updates becoming
2 transactions: one with 6 commands and one with one command, I have no ideas
- I've just tested on SQL Server 2000 with Log Explorer and it is totally
reproducible there also! Have requested clarification for this logic from our
friends at MS.
Cheers,
Paul Ibison
I think we are chasing straws really. These parameters have great impact on
performance, whether they are one off or not.
PS Happy Diwali!
just understood your test: "Where you would see it is if you issued two
transactions, one of 5 rows and one of 6 rows. Commitbatchsize would not
trigger a commit as its two transactions, but commitbatchthreshold would as
its 11 commands." with Commitbatchsize = 5 and commitbatchthreshold = 10.
Carried it out and what actually happens is the both transactions go over.
However the same occurs when the updates hit 5 rows and 4 rows (ie one less
than commitbatchthreshold) which is consistent with my proposal that the
commitbatchsize and commitbatchthreshold are maximums.
Not too sure if commitbatchthreshold only ever applies to "singleton
transactions" but it seems so.
Cheers,
Paul Ibison
What happens here is like a bar maid at the pubs you frequent to all hours
and probably are at while you are reading this message.
Let's suppose the bar maid on duty tonight is the high IQ type. She can
memorize 50 orders (commitbatchThreshold), but her tray is only so large, so
she can carry 5 beers (commit batch size).
If you give her two orders, is she going to wait around for 3 more orders?
Of course not (unless she is chatting u up). She will go and grab the beers
no matter that her brain can hold 50 orders and her tray can hold 5 beers.
Same thing is going on here. You issue 2 transactions with a varied amount
of commands. Is the distribution agent going to wait 7 years for the next 3
transactions to trip a commit? No, its going to trip when it hits the polling
interval or some other internal mechanism.
It only when its things get busy that the bar maid start delivering orders
with a full bar tray, and then there is no time to chat for her to chat you
up anymore, until you mention your ailment and immediately clear the bar.
You can only observe this action when pimping, oops typo there, pumping
large numbers of transactions and commands to the distribution agent.
What is puzzling however is the split in test 3, however.
I also confirmed the bit about transactional boundries; I had
CommitBatchThreshold set to 5 and CommitBatchSize set to 1,000, then I
ran a transaction (one the publisher) where I update 10 rows one at a
time and commit. In Profiler I see a single transaction with 10
updates. When I run the 10 updates outside of a transaction on the
publisher I see transactions with 6 and 4 updates at the subscriber.
Unfortunately I don't think the concept of transaction vs. command is
very clearly explained either, since a command is really autocommit
transaction. It would probably cause less confusion to explain how the
values work if it's phrased more in terms of explicit vs. autocommit
transactions.
Thanks for your help,
Kendal Van Dyke
http://kendalvandyke.blogspot.com/
Commithreshold is honored at a transaction boundary. Let's take the
following example
Commitbatchsize 10
Commitbatchthreshold 20
Transaction1 has 19 commands
Transaction 2 has 11 commands
Transaction 3 has 5 commands
With the above setting, the distribution agent reader thread will set the
commit flag for the command block after we read all the commands in
transaction 2. This is because we check the commitbatch size and
commitbatchthreshold after a transaction boundary is reached. We dont issue
a commit after we read the first command (20th command in that batch) in
transaction 2. So this commit batch on the subscriber will have committed 2
transactions with 30 commands even though the threshold is 20.
So the batch will be committed either when we hit the commitbatchsize OR
when a transaction crosses the command threshold before the commitbatchsize
is hit.
With regard to the threshold adding one more command that seems to be a bug.
Hope this helps.
"Paul Ibison" <Paul....@ReplicationAnswers.Com.(donotspam)> wrote in
message news:0A8275F6-3C9A-4388...@microsoft.com...
No bug here.
On Oct 28, 8:53 pm, "Gopal Ashok [MSFT]" <gopalas...@hotmail.com>
wrote:
> Just had a chat with Paul on this one
>
> Commithreshold is honored at a transaction boundary. Let's take the
> following example
>
> Commitbatchsize 10
> Commitbatchthreshold 20
>
> Transaction1 has 19 commands
> Transaction 2 has 11 commands
> Transaction 3 has 5 commands
>
> With the above setting, the distribution agent reader thread will set the
> commit flag for the command block after we read all the commands in
> transaction 2. This is because we check the commitbatch size and
> commitbatchthreshold after a transaction boundary is reached. We dont issue
> a commit after we read the first command (20th command in that batch) in
> transaction 2. So this commit batch on the subscriber will have committed 2
> transactions with 30 commands even though the threshold is 20.
>
> So the batch will be committed either when we hit the commitbatchsize OR
> when a transaction crosses the command threshold before the commitbatchsize
> is hit.
>
> With regard to the threshold adding one more command that seems to be a bug.
>
> Hope this helps.
>
> "Paul Ibison" <Paul.Ibi...@ReplicationAnswers.Com.(donotspam)> wrote in
> messagenews:0A8275F6-3C9A-4388...@microsoft.com...