
Alignment hack...


Chris M. Thomasson

Feb 21, 2020, 4:38:22 AM
Fwiw, this is some old C code I just cobbled up to work with C++; used
it as a region allocator in the past:

https://groups.google.com/forum/#!original/comp.lang.c/7oaJFWKVCTw/sSWYU9BUS_QJ

Well, "work" with C++ or even C is very loose here. It's a total hack to
force align objects on large boundaries. This is very useful wrt
designing different exotic algorithms. However, I think it's forever
doomed wrt UB. I am not sure how to ever make it work in a 100% portable
way. When I say a large boundary, I mean, say, 2048 bytes or much
bigger. Well, here is some code; can you even get it to run without
tripping an assert or getting a throw?
______________________
#include <iostream>
#include <new>
#include <cassert>
#include <cstdlib>
#include <cstddef>
#include <cstdint>


// Doctor Hackinstein!
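// mp_align is widened to std::uintptr_t so the mask covers every address
// bit, even where unsigned long is narrower than a pointer.
#define CT_RALLOC_ALIGN_UP(mp_ptr, mp_align) \
    ((unsigned char*)( \
        (((std::uintptr_t)(mp_ptr)) + (((std::uintptr_t)(mp_align)) - 1)) \
        & ~(((std::uintptr_t)(mp_align)) - 1) \
    ))

#define CT_RALLOC_ALIGN_ASSERT(mp_ptr, mp_align) \
    (((unsigned char*)(mp_ptr)) == CT_RALLOC_ALIGN_UP(mp_ptr, mp_align))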


// Hackish indeed!
template<std::size_t T_size>
struct ct_local_mem
{
    unsigned char m_bytes[T_size];

    template<typename T>
    unsigned char* align_mem()
    {
        return align_mem<T>(alignof(T));
    }

    template<typename T>
    unsigned char* align_mem(unsigned long align)
    {
        if (!align) align = alignof(T);

        unsigned char* base = m_bytes;
        unsigned char* aligned = CT_RALLOC_ALIGN_UP(base, align);

        assert(CT_RALLOC_ALIGN_ASSERT(aligned, align));

        std::size_t size = aligned - m_bytes;

        if (size + sizeof(T) + align > T_size)
        {
            // A bare `throw;` with no active exception calls std::terminate,
            // so throw a real exception object instead.
            throw std::bad_alloc();
        }

        return aligned;
    }
};



// A test program...
struct foo
{
    int m_a;
    int m_b;

    foo(int a, int b) : m_a(a), m_b(b)
    {
        std::cout << this << "->foo::foo.m_a = " << m_a << "\n";
        std::cout << this << "->foo::foo.m_b = " << m_b << "\n";
    }

    ~foo()
    {
        std::cout << this << "->foo::~foo.m_a = " << m_a << "\n";
        std::cout << this << "->foo::~foo.m_b = " << m_b << "\n";
    }
};


int main()
{
    {
        // create some memory on the stack
        ct_local_mem<4096> local = { '\0' };

        // create a foo f
        std::cout << "Naturally aligned...\n";
        foo* f = new (local.align_mem<foo>(alignof(foo))) foo(1, 2);

        // destroy f
        f->~foo();

        // create a foo f aligned on a large byte boundary
        std::size_t alignment = 2048;
        std::cout << "\n\nForced aligned on a " << alignment << " byte boundary...\n";

        // ensure the alignment of foo is okay with the boundary
        assert((alignment % alignof(foo)) == 0);

        f = new (local.align_mem<foo>(alignment)) foo(3, 4);

        assert(CT_RALLOC_ALIGN_ASSERT(f, alignment));

        // destroy f
        f->~foo();
    }

    {
        std::cout << "\n\nFin\n";
        std::cout.flush();
        std::cin.get();
    }

    return 0;
}
______________________


Here is some output, notice the addresses on the large boundary:
______________________
Naturally aligned...
0x7fffb1f56ba0->foo::foo.m_a = 1
0x7fffb1f56ba0->foo::foo.m_b = 2
0x7fffb1f56ba0->foo::~foo.m_a = 1
0x7fffb1f56ba0->foo::~foo.m_b = 2


Forced aligned on a 2048 byte boundary...
0x7fffb1f57000->foo::foo.m_a = 3
0x7fffb1f57000->foo::foo.m_b = 4
0x7fffb1f57000->foo::~foo.m_a = 3
0x7fffb1f57000->foo::~foo.m_b = 4
______________________


Notice how the latter pointer values have zeros at the end? There are
many fun things we can do here, but I am afraid it is all UB. ;^o
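
Fwiw, the round-up arithmetic itself can be checked on plain integers, away from any pointer conversions. This is only a sketch: `align_up` is a hypothetical helper name, not part of the code above, and it assumes the alignment is a power of two. The two constants are the addresses from the output above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the post): round addr up to the next
// multiple of align. Assumes align is a power of two.
constexpr std::uint64_t align_up(std::uint64_t addr, std::uint64_t align)
{
    return (addr + (align - 1)) & ~(align - 1);
}

// The naturally aligned address from the output, rounded up to 2048,
// gives the forced-aligned address from the output.
static_assert(align_up(0x7fffb1f56ba0ull, 2048) == 0x7fffb1f57000ull,
              "matches the posted output");

// An address that is already aligned stays where it is.
static_assert(align_up(0x7fffb1f57000ull, 2048) == 0x7fffb1f57000ull,
              "already aligned");
```

The low 11 bits of the result are zero, which is where the trailing zeros in the second set of addresses come from.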

Alf P. Steinbach

Feb 21, 2020, 6:50:26 AM
Not sure, I just cooked this up, but I believe the following is
standard-compliant and does what you want:


#include <assert.h> // assert
#include <limits.h> // CHAR_BIT

#include <bitset> // std::bitset
#include <exception> // std::terminate
#include <iostream> // std::(cin, cout)
#include <memory> // std::align
#include <stddef.h> // size_t
#include <new> // std::bad_alloc
using std::align, std::bad_alloc, std::terminate, std::bitset,
    std::cout, std::cin;

using Byte = unsigned char;
const int bits_per_byte = CHAR_BIT;
template< class T > constexpr int bits_per_ = sizeof( T )*bits_per_byte;
template< class T > using Type_ = T;

struct Buffer_view{ void* p_start; size_t size; };

template< class Int >
auto pop_count( const Int value )
    -> int
{ return static_cast<int>( bitset<bits_per_<Int>>( value ).count() ); }

auto operator new( const size_t size, Buffer_view& buffer, const size_t alignment )
    -> void*
{
    assert( pop_count( alignment ) == 1 );
    if( auto p = align( alignment, size, buffer.p_start, buffer.size ) ) {
        return p;
    }
    throw bad_alloc();
}

auto operator new( const size_t size, Buffer_view&& buffer, const size_t alignment )
    -> void*
{ return operator new( size, buffer, alignment ); }

// Called if constructor throws.
void operator delete( const Type_<void*>, Buffer_view&, const size_t )
{
    terminate();    // Clean-up can be supported by more info in Buffer_view.
}

void operator delete( const Type_<void*> p, Buffer_view&& b, const size_t a )
{
    operator delete( p, b, a );
}

// A test program...
struct foo
{
    int m_a;
    int m_b;

    foo( const int a, const int b ):
        m_a( a ), m_b( b )
    {
        cout << this << "->foo::foo.m_a = " << m_a << "\n";
        cout << this << "->foo::foo.m_b = " << m_b << "\n";
    }

    ~foo()
    {
        cout << this << "->foo::~foo.m_a = " << m_a << "\n";
        cout << this << "->foo::~foo.m_b = " << m_b << "\n";
    }
};

void test()
{
    // create some memory on the stack
    Byte local[4096] = {};
    const auto buffer_view = [&]{ return Buffer_view{ &local, sizeof( local ) }; };

    // create a foo f
    cout << "Naturally aligned...\n";
    foo* f = new( buffer_view(), alignof( foo ) ) foo( 1, 2 );
    f->~foo();

    // create a foo f aligned on a large byte boundary
    size_t alignment = 2048;
    cout << "\n\nForced aligned on a " << alignment << " byte boundary...\n";

    // ensure the alignment of foo is okay with the boundary
    assert( alignment % alignof( foo ) == 0 );

    foo* f2 = new( buffer_view(), alignment ) foo( 3, 4 );
    f2->~foo();
}

auto main()
    -> int
{
    test();
    cout << "\n\nFin\n";
    cin.get();
}


- Alf

Chris M. Thomasson

Feb 22, 2020, 8:02:03 PM
On 2/21/2020 3:50 AM, Alf P. Steinbach wrote:
> On 21.02.2020 10:38, Chris M. Thomasson wrote:
>> Fwiw, this is some old C code I just cobbled up to work with C++; used
>> it as a region allocator in the past:
>>
>> https://groups.google.com/forum/#!original/comp.lang.c/7oaJFWKVCTw/sSWYU9BUS_QJ
>>
>>
>> Well, "work" with C++ or even C is very loose here. Its a total hack
>> to force align objects on large boundaries. This is very useful wrt
>> designing different exotic algorithms. However, I think its forever
>> doomed wrt UB. I am not sure how to ever make it work in a 100%
>> portable way. When I say a large boundary, I mean say, 2048 bytes are
>> much bigger. Well, here is some code, can you even get it to run
>> without tripping an assert or getting a throw?
[...]
>> Forced aligned on a 2048 byte boundary...
>>
>> 0x7fffb1f57000->foo::foo.m_a = 3
>>
>> 0x7fffb1f57000->foo::foo.m_b = 4
>>
>> 0x7fffb1f57000->foo::~foo.m_a = 3
>>
>> 0x7fffb1f57000->foo::~foo.m_b = 4
>> ______________________
>>
>>
>> Notice how the latter pointer values have zeros at the end? There are
>> many fun things we can do here, but I am afraid it all UB. ;^o
>
> Not sure, I just cooked this up, but I believe the following is
> standard-compliant and does what you want:
[...]

Okay, will have some more time tonight to look at your code. Btw, the
reason I want to forcefully align objects on large boundaries is for
performance reasons wrt exotic algorithms. We can steal bits from highly
aligned addresses, and/or we can round a pointer down to the lowest large
boundary to get at metadata for a high performance allocator.
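
A minimal sketch of the bit-stealing part, assuming C++11 or later and a compiler that accepts a 2048-byte `alignas` (over-aligned types are only conditionally supported). The names `node`, `pack`, `unpack_ptr` and `unpack_tag` are hypothetical, made up for illustration. The pointer/integer mapping is implementation-defined, so this is not a claim of 100% portability.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t k_align = 2048;                // must be a power of two
constexpr std::uintptr_t k_low_mask = k_align - 1;   // the 11 low bits

// Hypothetical payload type, forced onto a 2048-byte boundary.
struct alignas(k_align) node
{
    int value;
};

// Store a small tag in the low bits of an aligned pointer.
inline std::uintptr_t pack(node* p, std::uintptr_t tag)
{
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    assert((bits & k_low_mask) == 0);   // pointer really is aligned
    assert(tag <= k_low_mask);          // tag fits in the stolen bits
    return bits | tag;
}

// Recover the original pointer by clearing the stolen bits.
inline node* unpack_ptr(std::uintptr_t word)
{
    return reinterpret_cast<node*>(word & ~k_low_mask);
}

// Recover the tag from the stolen bits.
inline std::uintptr_t unpack_tag(std::uintptr_t word)
{
    return word & k_low_mask;
}
```

With a 2048-byte boundary there are 11 spare bits per pointer, e.g. room for a version counter or a few state flags packed into one word.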

Öö Tiib

Feb 22, 2020, 11:13:51 PM
I sometimes feel that we are going too far with our pursuit of
performance. Quality is always more important. My discussions
with end users have always revealed that they find it absurd when
a shop adds features or performance instead of fixing known bugs.
No one cares how fast they get wrong answers. And that is
obvious ... no one, zero, zip, zilch, nada.

I may be unjust here, but to me it feels far trickier to
achieve quality (than performance) with C++.

Chris M. Thomasson

Feb 22, 2020, 11:32:32 PM
A fun part can be allowing an allocator's free operation to take an
address and round it down to the lowest large boundary; say we align on
an 8192-byte boundary. Then the result of the rounding down directly gets
at the metadata for its master block, so to speak. Easy, simple and
efficient... However, it's embracing dr. hackinstein! This master block
can have a lifo list that the freed block of memory can use for a cache.
Man, the last time I worked on this was way back in the early-to-mid 2000s.
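
Roughly, the shape of it can be sketched like this. Everything here is hypothetical (the names `master_block`, `block_alloc`, `block_free`, the 64-byte chunk size), single-threaded, and assumes C++17 for aligned `operator new`. It still leans on implementation-defined pointer/integer conversions and on byte-offset arithmetic inside raw storage, so it does not escape the hack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

constexpr std::size_t k_block_size = 8192;   // boundary and block size
constexpr std::size_t k_chunk_size = 64;     // fixed chunk size (assumption)

struct free_node { free_node* next; };

// Metadata living at the very start of each 8192-byte aligned block.
struct master_block
{
    free_node* cache;    // LIFO list of freed chunks
    std::size_t next;    // byte offset of the next never-used chunk
};
static_assert(sizeof(master_block) <= k_chunk_size, "header fits in chunk 0");

inline master_block* block_create()
{
    void* raw = ::operator new(k_block_size, std::align_val_t{k_block_size});
    return new (raw) master_block{nullptr, k_chunk_size};
}

inline void* block_alloc(master_block* b)
{
    if (free_node* n = b->cache) {           // reuse a cached chunk first
        b->cache = n->next;
        return n;
    }
    if (b->next + k_chunk_size > k_block_size) return nullptr;  // block full
    void* p = reinterpret_cast<unsigned char*>(b) + b->next;
    b->next += k_chunk_size;
    return p;
}

// Round any chunk address down to the boundary to find its master block.
inline master_block* block_of(void* p)
{
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    std::uintptr_t mask = ~(std::uintptr_t{k_block_size} - 1);
    return reinterpret_cast<master_block*>(bits & mask);
}

// free() needs only the chunk address: push it on its block's LIFO cache.
inline void block_free(void* p)
{
    master_block* b = block_of(p);
    b->cache = new (p) free_node{b->cache};
}

inline void block_destroy(master_block* b)
{
    ::operator delete(b, std::align_val_t{k_block_size});
}
```

The point is that `block_free` takes no allocator handle at all; the address alone locates the metadata, and the most recently freed chunk is the first one handed back out.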


Fwiw, working on an animation right now, should have some more time tonight.

Fwiw, here is my current work.

https://youtu.be/xbm4r45S2Xs

Chris M. Thomasson

Feb 22, 2020, 11:38:04 PM
On 2/22/2020 8:13 PM, Öö Tiib wrote:
Well, wrt this alignment hack, it can help create things that _need_
high performance, like an allocator.

Chris M. Thomasson

Feb 22, 2020, 11:40:54 PM
On 2/21/2020 3:50 AM, Alf P. Steinbach wrote:
[...]

Thank you for pointing my C mind to:

https://en.cppreference.com/w/cpp/memory/align

Never used it. It should work fine.
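
For reference, a small sketch of how `std::align` could drive a bump-style region. `carve` is a hypothetical helper name, not from Alf's code. On success `std::align` moves the pointer up to the boundary and shrinks the space by the padding it skipped; on failure it returns a null pointer and leaves both arguments untouched. The caller still has to step past the object itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper: carve `size` bytes aligned to `alignment` out of the
// region [cursor, cursor + space). Returns nullptr if it does not fit.
inline void* carve(void*& cursor, std::size_t& space,
                   std::size_t alignment, std::size_t size)
{
    void* p = std::align(alignment, size, cursor, space);
    if (!p) return nullptr;     // cursor and space are left unchanged

    // std::align only accounts for the padding; consume the object too.
    cursor = static_cast<unsigned char*>(p) + size;
    space -= size;
    return p;
}
```

Usage would be something like `void* cur = buf; std::size_t space = sizeof buf;` followed by `carve(cur, space, 2048, sizeof(foo))` and a placement new into the result.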


Chris M. Thomasson

Feb 22, 2020, 11:41:43 PM
On 2/22/2020 8:40 PM, Chris M. Thomasson wrote:
> On 2/21/2020 3:50 AM, Alf P. Steinbach wrote:
>> On 21.02.2020 10:38, Chris M. Thomasson wrote:
>>> Fwiw, this is some old C code I just cobbled up to work with C++;
>>> used it as a region allocator in the past:
>>>
>>> https://groups.google.com/forum/#!original/comp.lang.c/7oaJFWKVCTw/sSWYU9BUS_QJ
[...]
>> Not sure, I just cooked this up, but I believe the following is
>> standard-compliant and does what you want:
> [...]
>
> Thank you for pointing my C mind to:
>
> https://en.cppreference.com/w/cpp/memory/align
>
> Never used it. It should work fine.

Need to snip better. Sorry.

Melzzzzz

Feb 23, 2020, 8:27:08 AM
I worry about performance only if circumstances require. This is not
Python.


--
press any key to continue or any other to quit...
I enjoy nothing as much as my status as an INVALID -- Zli Zec
We are all witnesses - about 3 years of intensive propaganda is enough to drive a nation mad -- Zli Zec
In the Wild West there was not actually that much violence, precisely because everyone
was armed. -- Mladen Gogala