Optimization about writeAscii_slow

26 views
Skip to first unread message

Yibo Peng

unread,
May 19, 2021, 9:04:03 AM5/19/21
to kryo-users
writeAscii_slow is used only in two positions, but there is some inefficient logic, the code as below:


private void writeAscii_slow (String value, int charCount) throws KryoException {
if (charCount == 0) return;
if (position == capacity) require(1); // Must be able to write at least one character.
int charIndex = 0;
byte[] buffer = this.buffer;
int charsToWrite = Math.min(charCount, capacity - position);
while (charIndex < charCount) {
value.getBytes(charIndex, charIndex + charsToWrite, buffer, position);
charIndex += charsToWrite;
position += charsToWrite;
charsToWrite = Math.min(charCount - charIndex, capacity);
if (require(charsToWrite)) buffer = this.buffer;
}
}


what I think is not efficient is the bold code block, this while code block will repeat allocating memory and copy bytes to the new memory. if the repeat times is greater than 2, The previous call to require and getValues is useless

why we just do something like this to replace current writeAscii_slow:
  require(charCount);
  value.getBytes(0, charCount, buffer, position);
  position += charCount;

It is more readable and efficient.

By the way, I know this optimization is not useful in most cases because huge string value is not common. so such pr could be accepted?



Thomas Heigl

unread,
May 19, 2021, 9:22:46 AM5/19/21
to kryo-...@googlegroups.com
Hi,

Can you benchmark the performance difference with JMH? If it outperforms the old code and all tests pass, I'm happy to accept a PR.

Best regards,

Thomas

--
You received this message because you are subscribed to the "kryo-users" group.
http://groups.google.com/group/kryo-users
---
You received this message because you are subscribed to the Google Groups "kryo-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kryo-users+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kryo-users/33c5fefb-9b12-408d-bbf6-c3bbce4744b9n%40googlegroups.com.

Yibo Peng

unread,
May 20, 2021, 4:11:21 AM5/20/21
to kryo-users
Of course, I will take some time to test it in next weeks

Yibo Peng

unread,
Jun 2, 2021, 9:04:30 AM6/2/21
to kryo-users
Hi, I have do the benchmark and have some questions.

The StringBenchmark not test the huge string(the so called long string is just several hundreds bytes)
and the initial output buffersize is set very large (1024 * 512)
so what I talked about the redundant allocation and copy will not happe 

On the other hand, the output of the state in benchmark will be reused in each operation.
so except the first series of requires to this output object, the remained batchsize - 1 calls to writeAscii will not trigger too much allocation
because the output bufferSize will be very large.

But If there is such a case(and I do a benchmark for this rare case), the performance will have huge improvement in writeAscii.
1. the length of string is larger than the initial output buffer size. (which will cause the unecessary require operation(allocate and copy))
2. the output is not reused, new output each time (I think the best practice is reused the output if possible)

I write a simple benchmark like this
@Benchmark
public void writeAsciiHuge (InputOutputState state) {
Output output = new Output(1024, 1024 * 512); // not reused output
output.writeAscii(hugeString); // hugeString is 36864 bytes
}

The result shows: 

StringBenchmark.writeAsciiHuge smallOutput ss 12 2.007 ± 0.138 s/op (optimized)

StringBenchmark.writeAsciiHuge smallOutput ss 12 3.102 ± 0.252 s/op (original)


And do some little Optimization like this:

private int maxRequired(){
  if(outputStream != null){return maxCapacity;}
  else{return maxCapacity - position;}
}


private void writeAscii_slow (String value, int charCount) throws KryoException {
if (charCount == 0) return;
if (position == capacity) require(1); // Must be able to write at least one character.
int charIndex = 0;
byte[] buffer = this.buffer;
int charsToWrite = Math.min(charCount, capacity - position);
while (charIndex < charCount) {
value.getBytes(charIndex, charIndex + charsToWrite, buffer, position);
charIndex += charsToWrite;
position += charsToWrite;
// charsToWrite = Math.min(charCount - charIndex, capacity);
                        // charsToWrite = Math.min(charCount - charIndex, maxRequired);
if (require(charsToWrite)) buffer = this.buffer;
}


The change is easy to understand, and can improve the performance in some rare conditions.
So what do you think about this change?



On Wednesday, May 19, 2021 at 9:22:46 PM UTC+8 thomas...@gmail.com wrote:

Thomas Heigl

unread,
Jun 3, 2021, 7:11:09 AM6/3/21
to kryo-...@googlegroups.com
This sounds good to me! Please create a PR and I'll take a closer look.

Yibo Peng

unread,
Jun 14, 2021, 12:36:50 PM6/14/21
to kryo-users
I have submitted a PR , feel free to take a look

Yibo Peng

unread,
Jun 20, 2021, 11:30:25 PM6/20/21
to kryo-users
Hi, could you take a closer look for this PR?
Reply all
Reply to author
Forward
0 new messages