The core of the application is an endless loop which, every frame,
updates all entities of the game World. Currently this World loop is
single-threaded. It's working just fine, but I'd really like to make it
scale better on multicore machines.
Since World entities exist in Locations, I was thinking about running
each Location (and its update logic) in a separate thread. This means
I need to properly synchronize the World entities. And this is what I
really don't want to do since, I believe, explicit locking in the
methods of domain classes is wrong and makes development, maintenance
and debugging much, much more difficult.
At first I was thinking about isolating Location entities from
entities in different Locations by forbidding any calls between them.
What are possible ways to achieve this? Store entities of each
Location in a thread local storage so that they are not accessible
from outside? Or maybe use processes instead of a thread per
Location? (But that's going to complicate everything a lot.)
However, even if Location entities are nicely isolated, there is another
problem: persistence. I already have a simple generic persistence
service which runs in a separate thread. It can be used in async mode:
it accepts an object to be saved and returns a special future object
which can be used to track the persistence process. I would love to use
this service; however, since it's running in a separate thread, I again
need to properly synchronize access to domain classes. One possible
option here would be to implement proper cloning of domain objects, so
that the persistence service accepts a copy of the object to be saved
and no explicit locking is needed...
Hence the question: is everything described above worth it? Or maybe I
should simply add explicit synchronization logic to all domain classes
and be done with it? Or maybe there is some better option I'm not aware
of?
Thanks in advance
> At first I was thinking about isolating Location entities from
> entities in different Locations by forbidding any calls between them.
> What are possible ways to achieve this? Store entities of each
> Location in a thread local storage so that they are not accessible
> from outside? Or maybe use processes instead of a thread per
> Location? (But that's going to complicate everything a lot.)
It's hard to understand your problem because you're weak on details,
but the paragraph above reflects some extremely serious
misunderstandings about what threads are and how they work. Threads
are not at war with each other, they cooperate to achieve a common
objective. Thread-local storage is a workaround for certain unusual
cases where normal parameter passing doesn't work, it's most certainly
*not* to keep threads out of things they're not supposed to mess with.
You should simply code a thread to work with the data it's supposed to
-- if you need something special to stop it from doing something else,
something is really wrong with your design or reasoning.
DS
I'll try to be more specific. Currently the World updates each of its
entities every frame as follows in a single thread:
- World::update(dt) // dt is delta time since the last frame
  - Location::update(dt)
    - WorldEntity::update(dt)
    - WorldEntity::update(dt)
    - ...
  - Location::update(dt)
    - WorldEntity::update(dt)
I'd like to make it rather this way:
thr1
  - Location::update(dt)
    - WorldEntity::update(dt)
    - WorldEntity::update(dt)
...
thrN
  - Location::update(dt)
> Thread-local storage is a workaround for certain unusual
> cases where normal parameter passing doesn't work, it's most certainly
> *not* to keep threads out of things they're not supposed to mess with.
But isn't it a tool for completely isolating the thread data?
Why would you want to "isolate" the thread data? You should code your
threads to work on different data.
For your particular problem you could use a task queue: each of the
worker threads would extract a task from the queue (such as updating a
Location) and execute it. It is not feasible to create a thread per
Location if the number of Locations is high.
Of course you would need to synchronize access to the queue.
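As a rough illustration of the task-queue approach described above, here is a minimal sketch using only the C++11 standard library. `TaskQueue`, `run_tasks` and the `close()` shutdown protocol are hypothetical names made up for this example. A real pool would most likely keep its workers alive between frames instead of creating them on every call.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical queue: the only place in this sketch that takes a lock.
class TaskQueue {
public:
    void push(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
    }

    // Signals that no more tasks will arrive, so idle workers can exit.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until a task is available; returns false once the queue is
    // closed and fully drained.
    bool pop(std::function<void()>& task) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !tasks_.empty(); });
        if (tasks_.empty()) return false;
        task = std::move(tasks_.front());
        tasks_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> tasks_;
    bool closed_ = false;
};

// Runs every task on a fixed number of worker threads and returns once
// all workers have drained the queue and exited.
void run_tasks(std::vector<std::function<void()>> tasks, std::size_t workers) {
    TaskQueue queue;
    std::vector<std::thread> pool;
    for (std::size_t i = 0; i < workers; ++i) {
        pool.emplace_back([&queue] {
            std::function<void()> task;
            while (queue.pop(task)) task();
        });
    }
    for (auto& t : tasks) queue.push(std::move(t));
    queue.close();
    for (auto& th : pool) th.join();
}
```

With Locations, each task would be something like `[loc, dt] { loc->update(dt); }`, assuming a `Location::update(dt)` as in the outline earlier in the thread. The mutex inside the queue is then the only lock involved, provided no task touches data that belongs to another task.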
>> Thread-local storage is a workaround for certain unusual
>> cases where normal parameter passing doesn't work, it's most certainly
>> *not* to keep threads out of things they're not supposed to mess with.
>
> But isn't it a tool for completely isolating the thread data?
>
No, not really. Essentially, information has both DATA, the information
you're storing, and ADDRESS, a way to find that data.
Normal global data, e.g. an 'extern', has the address in a symbol table
entry, so that all compiled code can find the data. A static or local
variable has exactly the same data; but the address is restricted to a
particular compilation scope. Heap storage (malloc, new) has data
allocated at runtime, and you manage the address dynamically (where you
store the return value from malloc or new).
Thread local storage (TLS), or Thread specific data (TSD), doesn't have
any impact on the data -- frequently the data is allocated in heap, but
(at least for TSD) needn't be. (TLS hides the data allocation; but it's
usually heap.) What you're really restricting to a particular thread
context is the ADDRESS; the way your code locates the DATA. The idea of
both TLS and TSD is that all threads perform the same access sequence,
but each retrieves a different memory address from that sequence.
However, the data is still in the process address space, and all threads
are capable of accessing it or, perhaps more importantly, scribbling
over it. Like a phone number... you might have an unlisted number, which
will prevent anyone from just picking up a phone directory and calling
you; but someone who randomly dials valid phone numbers is just as
likely to find yours as anyone else's.
So if you're worried about a thread scribbling over another thread's
data, TLS or TSD won't help you a bit. And if you're only concerned with
threads going about their proper and expected business, then what
difference does it make? ... They won't be touching data they weren't
programmed to touch.
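The point about ADDRESS versus DATA can be demonstrated with a small sketch, assuming a C++11 compiler with `thread_local` support. The function name `tls_value_after_foreign_write` is made up for this example. Each thread gets its own instance of the variable, yet once the address escapes, any other thread can write through it.

```cpp
#include <thread>

// Each thread gets its own instance of this variable.
thread_local int tls_counter = 0;

// Returns the calling thread's `tls_counter` after a second thread has
// written to it through a plain pointer.
int tls_value_after_foreign_write() {
    tls_counter = 1;
    int* address = &tls_counter;  // address of *this* thread's instance

    std::thread other([address] {
        tls_counter = 100;  // touches only the other thread's own instance
        *address = 42;      // but nothing stops it from writing to ours
    });
    other.join();  // join orders the write before the read below

    return tls_counter;
}
```

The name `tls_counter` resolves to a different object in each thread, which is all that thread-local storage promises. The memory itself is ordinary process memory.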
> But isn't it a tool for completely isolating the thread data?
Absolutely, unequivocally, 100% *NO*. Again, if you think that, your
understanding of thread-specific data is *completely* incorrect.
Threads, by definition, share all their address space. Thread-local
data lives in the process address space. Therefore, it is shared by
all threads.
DS
Thanks a lot for the detailed explanation!
And now back to my original question... What threading model would you
recommend in order to get rid of explicit locking in domain model
classes?
I'm currently looking at Erlang and really like its process messaging
idea. Does it make sense to implement something similar for my C++
application? (I can't afford to rewrite the application in Erlang.)
> And now back to my original question... What threading model would you
> recommend in order to get rid of explicit locking in domain model
> classes?
The simplest way to avoid the need for locks is to ensure that each
thread is processing its own data: you only need locks where there is
mutable shared state.
> I'm currently looking at Erlang and really like its process messaging
> idea. Does it make sense to implement something similar for my C++
> application? (I can't afford to rewrite the application in Erlang.)
The prime reason Erlang's message passing avoids the need for locks is
that Erlang is a /functional/ language, so all data is essentially
immutable. If you pass copies of your data around rather than pointers or
references to shared state then you too can avoid the need for locks.
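A minimal sketch of the "pass copies" idea, applied to the asynchronous persistence case from the original question. `EntityState`, `serialize` and `save_async` are hypothetical stand-ins invented for this example, and `std::async` is used here in place of the poster's own persistence service, whose interface is not shown in the thread.

```cpp
#include <future>
#include <string>

// Hypothetical domain object: a plain value type that is cheap to copy.
struct EntityState {
    int id;
    int health;
};

// Stand-in for the persistence work. It receives its own copy, so it
// never reads memory that the game thread is still mutating.
std::string serialize(EntityState snapshot) {
    return std::to_string(snapshot.id) + ":" + std::to_string(snapshot.health);
}

// Hands a copy of `live` to a background task and returns a future that
// can be used to track the result.
std::future<std::string> save_async(const EntityState& live) {
    EntityState snapshot = live;  // copy taken on the caller's thread
    return std::async(std::launch::async, serialize, snapshot);
}
```

Because the snapshot is taken before the background task starts, the caller is free to keep modifying the live object immediately after `save_async` returns. The cost is the copy itself, which may matter for large object graphs.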
Alternatively, if you have a strict ownership scheme such that threads
only process data they own then you can avoid the need for locks that
way.
One way to do this is to parcel up the data and the work to do on it as
a self-contained "task" object. You then put the tasks on a queue, and
the worker threads then take tasks off the queue to process. If the
tasks are truly self-contained then the only synchronization needed is
in the queue. There have been lots of queue implementations posted to
this newsgroup, and I've got a simple implementation on my blog:
http://www.justsoftwaresolutions.co.uk/threading/implementing-a-threa...
Anthony
--
Author of C++ Concurrency in Action | http://www.manning.com/williams
just::thread C++0x thread library | http://www.stdthread.co.uk
Just Software Solutions Ltd | http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976
Yep, I feel like this is the way to go, since cloning the Erlang model
in C++ is quite awkward. I feel a bit nervous, though, about the fact
that I can't enforce the ownership scheme at the compiler level. For
example, in my case, if I'm going to use one thread per Location, it
should be forbidden for all Location entities to access entities in any
other Location. Unfortunately, it seems I can enforce this restriction
only at the "documentation level" and not at the "interface level".
> One way to do this is to parcel up the data and the work to do on it as
> a self-contained "task" object. You then put the tasks on a queue, and
> the worker threads then take tasks off the queue to process. If the
> tasks are truly self-contained then the only synchronization needed is
> in the queue. There have been lots of queue implementations posted to
> this newsgroup, and I've got a simple implementation on my blog:
>
> http://www.justsoftwaresolutions.co.uk/threading/implementing-a-threa...
Oh, yes, thanks, I'm aware of this post. I also really like your
boost::future implementation :)
I could integrate it with boost::threadpool using the example from the
"C++ Concurrency in Action" book (which is great, btw).
>> Alternatively, if you have a strict ownership scheme such that threads
>> only process data they own then you can avoid the need for locks that
>> way.
>
> Yep, I feel like this is the way to go, since cloning the Erlang model
> in C++ is quite awkward. I feel a bit nervous, though, about the fact
> that I can't enforce the ownership scheme at the compiler level. For
> example, in my case, if I'm going to use one thread per Location, it
> should be forbidden for all Location entities to access entities in any
> other Location. Unfortunately, it seems I can enforce this restriction
> only at the "documentation level" and not at the "interface level".
Yes. If you've got shared memory, then it's hard to enforce ownership at
compile time (though you could potentially do it with a template-based
tagging system).
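The post does not say what such a tagging system would look like, so the following is only one guess at it, under the assumption that the set of Location groups is known at compile time. `ForestTag`, `CityTag`, `Entity`, `attack` and `can_attack` are all hypothetical names for this sketch. Each entity type carries a tag, and member functions only accept entities with the same tag.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical tags, one per group of Locations updated by the same thread.
struct ForestTag {};
struct CityTag {};

template <typename LocationTag>
class Entity {
public:
    explicit Entity(int health) : health_(health) {}

    // Only entities carrying the same tag are accepted; passing an
    // Entity<OtherTag> does not compile.
    void attack(Entity<LocationTag>& target, int damage) {
        target.health_ -= damage;
    }

    int health() const { return health_; }

private:
    int health_;
};

// Detects at compile time whether A::attack can be called with a B.
template <typename A, typename B, typename = void>
struct can_attack : std::false_type {};

template <typename A, typename B>
struct can_attack<A, B,
    decltype(std::declval<A&>().attack(std::declval<B&>(), 0))>
    : std::true_type {};

static_assert(can_attack<Entity<ForestTag>, Entity<ForestTag> >::value,
              "calls within one location group are allowed");
static_assert(!can_attack<Entity<ForestTag>, Entity<CityTag> >::value,
              "calls across location groups are rejected");
```

The obvious limitation is that tags are types, so this only helps when Locations can be grouped statically. It also does nothing against code that deliberately sidesteps the type system with casts or raw pointers.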
>> One way to do this is to parcel up the data and the work to do on it as
>> a self-contained "task" object. You then put the tasks on a queue, and
>> the worker threads then take tasks off the queue to process. If the
>> tasks are truly self-contained then the only synchronization needed is
>> in the queue. There have been lots of queue implementations posted to
>> this newsgroup, and I've got a simple implementation on my blog:
>>
>> http://www.justsoftwaresolutions.co.uk/threading/implementing-a-threa...
>
> Oh, yes, thanks, I'm aware of this post. I also really like your
> boost::future implementation :)
> I could integrate it with boost::threadpool using the example from the
> "C++ Concurrency in Action" book (which is great, btw).
Yes, boost::threadpool plus a task queue would work well.
Thanks for the kind words.
You could try out a scheme like:
__________________________________________________________________
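struct complete
{
    atomic<unsigned long> m_count; // = 0
    event m_event; // auto-reset

    void set(unsigned long count)
    {
        // One extra count for the waiter: `wait()' decrements as well,
        // so the counter may only reach its final decrement after all
        // `count' calls to `finished()' have been made.
        m_count.store(count + 1, memory_order_relaxed);
    }

    void wait()
    {
        if (m_count.fetch_sub(1, memory_order_acquire) != 1)
        {
            m_event.wait();
        }
    }

    void finished()
    {
        if (m_count.fetch_sub(1, memory_order_release) == 1)
        {
            m_event.set();
        }
    }
};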

struct world_entity
{
    void update(delta_time_t dt)
    {
        // whatever
    }
};

struct location
{
    collection<world_entity> m_entitys;
    complete& m_complete;
    delta_time_t m_dt;

    location(complete& complete)
    : m_complete(complete)
    {
    }

    void update()
    {
        for each world_entity in m_entitys as i
        {
            i.update(m_dt);
        }
        m_complete.finished();
    }
};

struct world
{
    collection<location*> m_locations;
    complete m_complete;

    void populate()
    {
        // fill `m_locations' with allocated location objects
    }

    void update(delta_time_t dt)
    {
        m_complete.set(m_locations.count());
        for each location in m_locations as i
        {
            // maintain consistent delta time for each location
            // relative to this specific update.
            i.m_dt = dt;
            i.update();
        }
        m_complete.wait();
    }

    void run()
    {
        for (;;)
        {
            update(calc_delta_time());
        }
    }
};

static thread_safe_queue<location*> g_queue;

struct worker
{
    void entry()
    {
        for (;;)
        {
            g_queue.pop()->update();
        }
    }
};
__________________________________________________________________
Each `location' is completely isolated from every other `location' when
running in the context of the `worker::entry()' threads. Therefore, all of
a location's `world_entity' objects are isolated as well. No locks are
needed on a per-location basis.
The only synchronization needed would be in the
`thread_safe_queue<location*>' object. The `world' object is meant to be
single-threaded. However, multiple locations can be updated in parallel
because the single-threaded world object issues commands to the worker
threads then waits for everything to complete. And you can maintain
consistency of certain things... Notice how the delta time is the same in
all locations during an update? Is that even important to you?
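The fork-and-wait structure described above can also be approximated with the standard library alone. This is a sketch under assumptions, not the scheme from the pseudo code: it replaces the queue and the completion counter with one `std::async` task per Location, and it uses a vector of floats as a stand-in for real `WorldEntity` objects. The type and member names are chosen for this example.

```cpp
#include <future>
#include <vector>

struct Location {
    std::vector<float> entity_positions;  // stand-in for WorldEntity objects

    // Touches only this Location's own data, so no locking is needed here.
    void update(float dt) {
        for (float& x : entity_positions) x += dt;
    }
};

struct World {
    std::vector<Location> locations;

    // Single-threaded from the caller's point of view: forks one task per
    // Location, then waits for all of them before returning.
    void update(float dt) {
        std::vector<std::future<void> > pending;
        pending.reserve(locations.size());
        for (Location& loc : locations) {
            // Every Location sees the same dt for this frame.
            pending.push_back(std::async(std::launch::async,
                                         [&loc, dt] { loc.update(dt); }));
        }
        // Barrier: the frame ends only when every Location has finished.
        for (auto& f : pending) f.get();
    }
};
```

Launching a task per Location every frame is likely too expensive when there are many Locations, which is where a fixed pool of workers fed from a queue becomes the better fit.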
Thanks for the idea; however, it looks like your pseudo code is missing
the actual submission of the work to the thread pool ;)
BTW, since boost::threadpool is already available, the code can
be greatly simplified into something like this:
void world::update(dt)
{
    // optionally we can resize the number of worker threads
    // threadpool_.resize(locations_.size());
    foreach (Location* loc, locations_)
        // bind needs the object pointer as well as the argument
        threadpool_.schedule(boost::bind(&Location::update, loc, dt));
    threadpool_.wait();
}
> Notice how the delta time is the same in
> all locations during an update? Is that even important to you?
It should be fine