Corruption in GrBlockAllocator

14 views
Skip to first unread message

ahujab...@gmail.com

unread,
Nov 15, 2021, 9:45:00 AMNov 15
to skia-discuss

Hi Guys,

 We just updated to Skia M91( I know it's a little older version, we started long back and then had to pause due to some other work). We are noticing some memory corruption with GrBlockAllocator in M91. 

 During the creation of GPU canvas, we are hitting an Assert

SkASSERT(count == fAllocator->metadata());

Call stack

sk_abort_no_print() SkMemory_malloc.cpp:49

GrTBlockList<GrClipStack::SaveRecord, 2>::count() const::'lambda'()::operator()() const GrTBlockList.h:180

GrTBlockList<GrClipStack::SaveRecord, 2>::count() const GrTBlockList.h:180

GrTBlockList<GrClipStack::SaveRecord, 2>::empty() const GrTBlockList.h:189

GrClipStack::currentSaveRecord() const GrClipStack.h:289

GrClipStack::clipState() const GrClipStack.h:48

SkGpuDevice::onGetClipType() const SkGpuDevice.cpp:310

SkCanvas::computeDeviceClipBounds() const SkCanvas.cpp:1676

SkCanvas::init(sk_sp<SkBaseDevice>) SkCanvas.cpp:455

SkCanvas::SkCanvas(sk_sp<SkBaseDevice>) SkCanvas.cpp:491

I dug little deep and found out we are setting metadata on related block in pushItem

Call stack

GrTBlockList<GrClipStack::SaveRecord, 2>::pushItem() GrTBlockList.h:274

GrClipStack::SaveRecord& GrTBlockList<GrClipStack::SaveRecord, 2>::emplace_back<SkIRect const&>(SkIRect const&) GrTBlockList.h:93

GrClipStack::GrClipStack(SkIRect const&, SkMatrixProvider const*, bool) GrClipStack.cpp:1153

SkGpuDevice::SkGpuDevice(GrRecordingContext*, std::__ndk1::unique_ptr<GrSurfaceDrawContext, std::__ndk1::default_delete<GrSurfaceDrawContext> >, unsigned int) SkGpuDevice.cpp:136

SkGpuDevice::Make(GrRecordingContext*, std::__ndk1::unique_ptr<GrSurfaceDrawContext, std::__ndk1::default_delete<GrSurfaceDrawContext> >, SkGpuDevice::InitContents) SkGpuDevice.cpp:92

SkSurface::MakeFromBackendRenderTarget(GrRecordingContext*, GrBackendRenderTarget const&, GrSurfaceOrigin, SkColorType, sk_sp<SkColorSpace>, SkSurfaceProps const*, void (*)(void*), void*) SkSurface_Gpu.cpp:653

Push Item Code

void* pushItem() {

        // 'template' required because fAllocator is a template, calling a template member

        auto br = fAllocator->template allocate<alignof(T)>(sizeof(T));

        SkASSERT(br.fStart == br.fAlignedOffset ||

                 br.fAlignedOffset == First(fAllocator->currentBlock()));

        br.fBlock->setMetadata(br.fAlignedOffset);

        int value = fAllocator->metadata() + 1;

        int64_t thisAddress = reinterpret_cast<std::uintptr_t>(this);

        int64_t allocaterAddress = fAllocator->GetThisAddress();

  int64_t headBlockAddress =  reinterpret_cast<std::uintptr_t>(fAllocator->headBlock());

        LOGE( "object %lld, with allocator %lld head block %lld, has been set with metadata value %d", thisAddress, allocaterAddress, headBlockAddress, value);   //debug code I added

        fAllocator->setMetadata(fAllocator->metadata() + 1);

        void* returnPtr = br.fBlock->ptr(br.fAlignedOffset);

        int64_t returnPtrAddress = reinterpret_cast<std::uintptr_t>(this);

        LOGE( "retuned address %lld and size of T is %d", returnPtrAddress, sizeof(T)); );   //debug code I added  

        return br.fBlock->ptr(br.fAlignedOffset);

    }

By looking at the logs ( I added, in bold ) I think this issue is occurring due to an overlap of returned address and metadata block address.

object 3653399144, with allocator 3653399144 head block 3653399160, has been set with metadata value 1

setmetadata calling on object 3653399144 with head block 3653399160, value 1

retuned address 3653399144 and size of T is 64

3653399144 ( returned address )

3653399160( headed block )

Difference 16( headed block -  returned address )

so object(GrClipStack::SaveRecord::SaveRecord size 64) created on returned address ends up overwriting block metadata.

Based on the above analysis I think there is some issue with retuned allocator address and the allocator is messed up. Wanted your opinion on this, am I in the right direction? can you guys give some pointers so that I can speed up this investigation? Btw it tried running address sanitiser but didn't get anything useful.

We have seen other issues in this upgrade due to some bugs in our NDK so its possible that this is also due to some bug in our build environment, but any pointers that might help are highly appreciated.

Thanks

Bharat

ahujab...@gmail.com

unread,
Nov 16, 2021, 8:40:13 AMNov 16
to skia-discuss
Update : there was some inlining issue in our compiler, we were able to fix this by non-inlining, some code.

Now, the address sanitiser has caught one more issue, I am working on that, will reach out if need any help.

Thanks
Bharat



Bharat Ahuja

unread,
Nov 18, 2021, 8:19:52 AMNov 18
to skia-d...@googlegroups.com
Hi Guys,

I noticed a double delete in BlockAllocator which is causing corruption while using SkRunTimeEffects.

sk_sp<SkShader> GetBitMapImageShader()
SkBitmap bitmap;   
bitmap.allocPixels(SkImageInfo::MakeN32Premul(12, 11));   
bitmap.erase(SK_ColorRED, SkIRect::MakeXYWH(0, 0, 100, 100));   

sk_sp<SkImage> fImage0 = SkImage::MakeFromBitmap(bitmap);   

SkMatrix matrix; 
matrix.setTranslate(10, 10);

auto imageShader = fImage0->makeShader (SkTileMode::kDecal, SkTileMode::kDecal, SkSamplingOptions(SkFilterMode::kNearest), matrix);   
return imageShader;
}

const char* GetEffectCodeSimplified()
{
   const char* sksl= R"(

   uniform shader input1;
   uniform shader input2;

   uniform float R;
   uniform float G;
   uniform float B;
   uniform float random;

   half4 main(float2 p)
   {
      // Sample the source texture
      float4 src1 = float4( sample( input1, p ) ); // src
      float4 src2 = float4( sample( input2, p ) ); // dst
      return  half4( src1.rgb * src2.a / 0.5, src2.a );
   }
   )";

   return sksl;
}

void Draw( SkCanvas* pCanvas ) noexcept
{
   SkPaint paint;

   sk_sp<SkShader> imageShader = GetBitMapImageShader();
   sk_sp<SkShader> imageShader2 = GetBitMapImageShader();

   const char* sksl = GetEffectCodeSimplified();
    auto [effect, err] = SkRuntimeEffect::Make(SkString(sksl));

    if(!effect)
    {
        return;
    }

   srand( (unsigned)time( NULL ) );
   float hue = float(rand()%256)/256;
   float r = float(rand()%256)/256;
   float g = float(rand()%256)/256;
   float b = float(rand()%256)/256;

   pCanvas ->clear( SK_ColorWHITE );

   SkRuntimeShaderBuilder builder( effect );
   builder.child( "input1" ) = imageShader;
   builder.child( "input2" ) = imageShader2;
   builder.uniform( "R" ) = r;
   builder.uniform( "G" ) = g;
   builder.uniform( "B" ) = b;
   builder.uniform( "random" ) = hue;

   paint.setShader( builder.makeShader( nullptr /*localTransform*/, false /*fOpaque*/ ) );

   pCanvas ->drawPaint(paint);
   pCanvas ->flush();
}

I added some logs to catch this double delete.

nop statement creation started nop index 74
GrBlockAllocator.h:buffer allocated for no-op
GrMemoryPool.cpp: block allocated with header address 2439038080 header start 8316, block address 2439029760
SkPool.h: returned address for object creation from pool 2439038096 size 16
SkNop.h : no op statement created at address 2439038096 index 74
nop statement creation finished nop index 74

SkSLCompiler.cpp: program conversion successful, detaching program pool from thread1
block deleted with address 2439029760 due to resetScratchSpace called
done detaching program pool from thread

nop statement destruction started at address 2439038096 index 74
releasing block with header address 2439038080 header start 8316186
failed check: trying to release block, block address 2439029760 sentinel address 2439029760

it seems like resetScratchSpace is deleting the block which is already allocated to an object (Nop)

void GrBlockAllocator::resetScratchSpace()
{   
if (fHead.fPrev)
 {                          
LOGE( "block deleted with adress due to resetScratchSpace called %lld", (uint64_t)reinterpret_cast<std::uintptr_t >(fHead.fPrev));       
delete fHead.fPrev;
fHead.fPrev = nullptr;   
}
}

Can you guys please help me figure out what is wrong here? I can see, there are two possibilities either resetScratchSpace should check whether the block is occupied before deleting or fHead.fPrev is pointing to the wrong block and I should check why it is pointing to the occupied block?

Thanks
Bharat

Michael Ludwig

unread,
Nov 18, 2021, 4:39:14 PMNov 18
to skia-discuss
Hi Bharat,

I definitely want to get to the bottom of this, because it sounds like either a bug in GrBlockAllocator (now SkBlockAllocator), GrMemoryPool, or in the SkSLCompiler's handling of statements.  What compiler are you on? Is this a debug build, and can you confirm that no other asserts are firing in this latest example? I am assuming the issues from your first message are resolved with the compiler inlining fix, although I wouldn't mind knowing more about that. Was it a bug in your compiler's inlining? Or is it actually a bug in our code that produces incorrect results when inlined?

Bharat Ahuja

unread,
Nov 19, 2021, 8:34:32 AMNov 19
to skia-d...@googlegroups.com
Hi Michael,

Thanks so much for reverting back.

The inline crash was in android emulator x86 ship builds and had been there in the old build too. 
I missed merging it in a recent update. The Inlining issue was due to a bug in our NDK version and was causing corruption in SkString.

I manged to fix it by non inlining below function

SkString& GrGLSLShaderBuilder::code()
{   
   return fShaderStrings[fCodeIndex];
}

This new crash is happening with recent update only and right now only reproing in emulator x86 debug build. we are hitting an assert

opensource/googleskia\src/gpu/GrBlockAllocator.h:610 assert(block->fSentinel == kAssignedMarker)

The reason for asserting is double freeing of Block allocated to no-op statement. I think SkSLCompiler's code is fine and it is converting the program as expected and after success, it is calling program->fPool->detachFromThread() which in turn is calling GrBlockAllocator::resetScratchSpace. resetScratchSpace is freeing the block (1 delete ) and then when the program is destroyed, the corresponding stored statements are also destroyed deleting the block again and resulting in assert getting hit.
there are only two possibilities here either resetScratchSpace should check whether the block is occupied before deleting or head.fPrev is pointing to the wrong block and we need to figure out why it is pointing to the occupied block?

(sk_abort_no_print()+6) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(_ZZN16GrBlockAllocator11owningBlockILj8ELj0EEEPNS_5BlockEPKviENKUlvE_clEv+86) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(_ZN16GrBlockAllocator11owningBlockILj8ELj0EEEPNS_5BlockEPKvi+89) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(GrMemoryPool::release(void*)+173) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(SkSL::Pool::FreeMemory(void*)+77) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(SkSL::Poolable::operator delete(void*)+47) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(SkSL::Nop::~Nop()+65) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(SkTArray<std::__ndk1::unique_ptr<SkSL::Statement, std::__ndk1::default_delete<SkSL::Statement>>, false>::~SkTArray()+120) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(_ZN9SkSTArrayILi2ENSt6__ndk110unique_ptrIN4SkSL9StatementENS0_14default_deleteIS3_EEEELb0EED2Ev+44) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(SkSL::Block::~Block()+85) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(SkSL::Block::~Block()+49) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(SkSL::FunctionDefinition::~FunctionDefinition()+85) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(SkSL::FunctionDefinition::~FunctionDefinition()+49) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(void std::__ndk1::allocator_traits<std::__ndk1::allocator<std::__ndk1::unique_ptr<SkSL::ProgramElement, std::__ndk1::default_delete<SkSL::ProgramElement>>>>::__destroy<std::__ndk1::unique_ptr<SkSL::ProgramElement, std::__ndk1::default_delete<SkSL::ProgramElement>>>(std::__ndk1::integral_constant<bool, true>, std::__ndk1::allocator<std::__ndk1::unique_ptr<SkSL::ProgramElement, std::__ndk1::default_delete<SkSL::ProgramElement>>>&, std::__ndk1::unique_ptr<SkSL::ProgramElement, std::__ndk1::default_delete<SkSL::ProgramElement>>*)+57) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(void std::__ndk1::allocator_traits<std::__ndk1::allocator<std::__ndk1::unique_ptr<SkSL::ProgramElement, std::__ndk1::default_delete<SkSL::ProgramElement>>>>::destroy<std::__ndk1::unique_ptr<SkSL::ProgramElement, std::__ndk1::default_delete<SkSL::ProgramElement>>>(std::__ndk1::allocator<std::__ndk1::unique_ptr<SkSL::ProgramElement, std::__ndk1::default_delete<SkSL::ProgramElement>>>&, std::__ndk1::unique_ptr<SkSL::ProgramElement, std::__ndk1::default_delete<SkSL::ProgramElement>>*)+57) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(SkSL::Program::~Program()+311) (BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)
(BuildId: bb4bb12e58f22cd7a88a84cd4f246adc2d860bfa)

Bharat Ahuja

unread,
Nov 22, 2021, 8:30:29 AM (11 days ago) Nov 22
to skia-d...@googlegroups.com
Hi Michael,

I think I figured out the issue. It seems like an alignment issue.

    #ifdef SK_FORCE_8_BYTE_ALIGNMENT
        // This is an issue for WASM builds using emscripten, which had std::max_align_t = 16, but
        // was returning pointers only aligned to 8 bytes.
        //
        // Setting this to 8 will let GrBlockAllocator properly correct for the pointer address if
        // a 16-byte aligned allocation is requested in wasm (unlikely since we don't use long
        // doubles).
        static constexpr size_t kAddressAlign = 8;
    #else
        // The alignment Block addresses will be at when created using operator new
        // (spec-compliant is pointers are aligned to max_align_t).
        static constexpr size_t kAddressAlign = alignof(std::max_align_t);
    #endif

  alignas(kAddressAlign) Block fHead;

The returned head address was not aligned properly, forcing an 8-byte alignment, solved the issue.

Thanks
Bharat
Reply all
Reply to author
Forward
0 new messages