Re: [MOO-talk] MOO Multithreading (was: Would this be a good idea?)


Wolfgang Faust

Nov 2, 2018, 3:34:53 PM
to Ethin Probst, MOO-talk
What are you planning on using threads for?

I'm struggling to imagine how this would work with object modifications. Are you planning on adding a mutex to every object and altering every verb to obtain locks before changing them? 

On Friday, November 2, 2018, Ethin Probst <harlyd...@gmail.com> wrote:
So, I've been thinking about multithreading in MOO (particularly with
Stunt) and have wondered about implementing a threading and mutex API.
Something similar to the thread support library in C++
(https://en.cppreference.com/w/cpp/thread), but not quite. So to
create a thread you might call:
id = thread_create(obj object, str verb, list args);
And for a mutex:
"id holds the mutex number. Mutexes are globally shared across the
entire server.";
id = mutex_create();
To lock the mutex you'd call:
mutex_lock(int mutex);
or
mutex_try_lock(int mutex);
Where mutex is the number of the mutex given by mutex_create().
mutex_lock() blocks until the lock succeeds. (I'm not sure I should
even add that; it would be very easy to hang the MOO, and incredibly
easy to cause deadlocks.) mutex_try_lock() returns immediately if the
lock fails.
So, here are the functions I'm thinking about adding:
Thread control:
thread_create(object, verb, args) -- creates and starts a new thread
by calling object:verb(args)
thread_destroy(thread) -- terminates the thread and destroys it
thread_join(thread) -- waits for a thread to finish execution,
blocking the calling task if needed
thread_detach(thread) -- permits the thread to execute independently
of the master thread
thread_joinable(thread) -- determines whether the thread is joinable
thread_hardware_concurrency() -- returns the maximum number of
concurrent threads
yield() -- suggests that the implementation reschedule execution of
threads (from the above linked page)

Mutex control:
mutex_create() - returns a mutex ID, which is used in the following functions
mutex_lock(mutex) - locks the mutex, blocks if the mutex is not
available (my above concerns apply)
mutex_try_lock(mutex) - tries to lock the mutex, returns if the mutex
is not available
mutex_unlock(mutex) - unlocks the mutex
mutex_destroy(mutex) - destroys all references to the mutex and allows
the system to reclaim resources for it.
Those are all the functions I want to implement for now; I don't want
to implement too much in one go.
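A server-side sketch of how these builtins might be backed, assuming C++11 and Stunt's C++ build; the registry, `next_id`, and everything other than the proposed builtin names are hypothetical:

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <unordered_map>

// Global registry mapping the integer ids handed to MOO code onto real
// mutexes. unique_ptr keeps each mutex's address stable across rehashes.
static std::unordered_map<int, std::unique_ptr<std::mutex>> mutexes;
static std::mutex registry_lock;  // guards the registry itself
static int next_id = 0;

int mutex_create() {
    std::lock_guard<std::mutex> g(registry_lock);
    int id = ++next_id;
    mutexes[id] = std::unique_ptr<std::mutex>(new std::mutex);
    return id;
}

bool mutex_try_lock(int id) {
    std::mutex *m;
    {
        std::lock_guard<std::mutex> g(registry_lock);
        auto it = mutexes.find(id);
        if (it == mutexes.end()) return false;  // E_INVARG in a real builtin
        m = it->second.get();
    }
    return m->try_lock();  // never blocks, so it cannot hang the server
}

void mutex_unlock(int id) {
    // A real builtin would also verify the calling task owns the lock;
    // unlocking a mutex you don't hold is undefined behavior.
    std::lock_guard<std::mutex> g(registry_lock);
    auto it = mutexes.find(id);
    if (it != mutexes.end()) it->second->unlock();
}
```

Only mutex_try_lock is shown because it sidesteps the blocking/deadlock worry above; a blocking mutex_lock would just call m->lock() instead.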

--
You received this message because you are subscribed to the Google Groups "MOO Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to MOO-talk+u...@googlegroups.com.
To post to this group, send email to MOO-...@googlegroups.com.
Visit this group at https://groups.google.com/group/MOO-talk.
For more options, visit https://groups.google.com/d/optout.

Ethin Probst

Nov 2, 2018, 4:27:36 PM
to Wolfgang Faust, MOO-talk
No. Normal verbs would not use them; just as in languages like C++, it
would be up to the programmer to acquire and release mutex locks. I
could, however, make thread_create() acquire a mutex lock for the
duration of the thread's execution.
--
Signed,
Ethin D. Probst

Luke Dashjr

Nov 2, 2018, 4:37:19 PM
to MOO-...@googlegroups.com, Wolfgang Faust, Ethin Probst
On Friday 02 November 2018 19:34:52 Wolfgang Faust wrote:
> What are you planning on using threads for?

One use case would be to make the MOO itself distributed/decentralised.
Without the requirement for a single thread, you can have each user (in
the extreme example) run the code for their own objects, and set up an
RPC interface for interaction between different instances.

> I'm struggling to imagine how this would work with object modifications.
> Are you planning on adding a mutex to every object and altering every verb
> to obtain locks before changing them?

Automatically managing locks would be difficult, but probably not impossible.
I would think it's possible so long as there's a way to make changes atomic
and roll them back if there's contention. But I haven't given it much
thought.

Luke

Wolfgang Faust

Nov 2, 2018, 4:44:15 PM
to Ethin Probst, MOO-talk
So threads can never modify objects that a regular verb might touch (and vice versa), or they'll get race conditions from concurrent modification? That's a pretty big limitation, particularly since most of the uses I can think of for this feature involve modifying objects.

I wrote some notes at some point about possible ways to make a multiprocessing MOO; I will see if I can find and post them shortly.

Ethin Probst

Nov 2, 2018, 4:51:30 PM
to Wolfgang Faust, MOO-talk
This feature would be nice to have. Originally I was going to try to
add a statement like
thread
<code>
endthread
But this might be a better idea, since I'm not well-versed in Yacc/Bison.

Wolfgang Faust

Nov 2, 2018, 5:12:09 PM
to Ethin Probst, MOO-talk
Here are my notes -- a bit of a brain dump. There's a lot of detail here, but the gist is that I think the best way to add concurrency to LambdaMOO is to implement transactions, and run each task as a transaction which either commits or rolls back and retries if there's a conflict. I was looking at this from the perspective of sharding a very large world, but Luke mentioned the possibility of interaction between independent instances instead.

- MOO tasks (=fibers) are easy to reason about:
- Full control until suspend - no need to worry about concurrent modification etc
- Suspends are explicit: suspend()/fork/read() are the only way for a task to be suspended
- Threads are hard to reason about: (see http://wiki.c2.com/?ThreadsConsideredHarmful )
- Resources which need locks:
- Player I/O (read()/notify())
- Objects (e.g. read-increment-write problems) - also may have invariant across multiple properties
- Cross-object invariants? (Esp. location<->contents, but also potentially custom -- e.g. linked teleport ring pairs)
- Compatibility with existing code:
- Dedicated single thread for "classic" tasks, which autolocks objects
- Thread-per-task with autolock algorithm (extremely complicated -- possibly halting-problem-hard)
- Modification of most/all existing code to take locks - complicated and error-prone
- Completely changes mental model of MOO code
- Race conditions
- Deadlocks
- Concurrent modification - usually requires explicit locks
- Probably a bad idea
- Message passing/Actor model - neat idea, not MOO-like
- Shared state is fundamental; Actors aren't supposed to have any
- but maybe a new paradigm? - to model physical entities
- Instead: Implicit transactions w/ auto-rollback
- Preserves semantics & easy reasoning of MOO code
- Worker pool picks up from thread queue
- Begin transaction at start of task execution, commit on suspend/end
- Roll back and requeue task if conflict
- In addition to property changes, queue notify() until commit - then send all at once.
- Potential problem: high-conflict task with multiple rollbacks
- Unlikely in practice?
- Possibly transaction prioritization - higher if task has rolled back
- Implementations:
- TODO: Look at how relational databases do this.
- Compare-and-swap:
- Single commit thread/semaphore
- On property write, save both old and new value
- On property read, save copy of value? (not sure if necessary - TODO this needs more thought)
- On commit:
- Compare all saved property values; if a saved value differs from
the object's current value, roll back
- Apply writes and notifies
- Advantage: Can be implemented across high-latency networks (only requires one round-trip)
- Object sharding!
- Note - for multiple object shards, needs two-phase commit
- Disadvantage: Conflicts only discovered at commit time (wasted work)
- Disadvantage: Requires serialized commit - problem for scaling a single datastore
- Lock
- On property read, take shared lock
- On property write, take exclusive lock and store old value
- If taking a lock fails, roll back:
- Delete notify queue
- For all shared locks, release
- For all exclusive locks, re-write old value then release
- For extra performance, queued lock acquisition? i.e. if lock fails:
- Mark all active locks "semi"
- Await on failed lock
- If semi-locks are requested by someone else, roll back and release all locks
- When await finishes, re-acquire full locks
- ^ The overhead of all this only makes sense if tasks frequently roll back multiple times. Profile before implementing.
- Advantage: No commit bottleneck
- Advantage: Conflicts identified immediately (fail-fast)
- Disadvantage: Lots of locks (at least per-object, possibly per-property?)
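The compare-and-swap branch of these notes can be sketched in miniature. This is a toy model, not LambdaMOO code: "properties" are entries in a plain map, and a single commit mutex stands in for the serialized commit thread:

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

static std::map<std::string, int> props;  // stand-in for the object store
static std::mutex commit_lock;            // serialized commit, as in the notes

struct Transaction {
    std::map<std::string, int> reads;   // value observed at first read
    std::map<std::string, int> writes;  // pending writes, applied on commit

    int read(const std::string &p) {
        if (writes.count(p)) return writes[p];       // read-your-writes
        if (!reads.count(p)) reads[p] = props[p];    // snapshot on first read
        return reads[p];
    }
    void write(const std::string &p, int v) { read(p); writes[p] = v; }

    // Returns false (caller rolls back and requeues the task) on conflict.
    bool commit() {
        std::lock_guard<std::mutex> g(commit_lock);
        for (auto &r : reads)
            if (props[r.first] != r.second) return false;  // stale read
        for (auto &w : writes) props[w.first] = w.second;
        return true;
    }
};
```

A real implementation would also queue notify() output inside the transaction and flush it only after a successful commit, exactly as the notes suggest.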

Todd Sundsted

Nov 2, 2018, 8:30:58 PM
to MOO Talk
I haven't thought about this in depth for a while, but I agree that transactions applied/rolled-back at existing task execution boundaries are the way to go!

Wolfgang Faust

Nov 3, 2018, 12:06:43 AM
to harlyd...@gmail.com, moo-...@googlegroups.com
On Fri, Nov 2, 2018 at 5:39 PM Ethin Probst <harlyd...@gmail.com> wrote:
> I like your lock algorithm. For the lock maintenance, could an
> std::unordered_map<int, <lock_type>> work? That would make it easy for
> verbs to access the map if they need to forcefully unlock one of them
> (or create new locks).

Premature unlocking seems likely to be a recipe for disaster. Bear in
mind that locks will only be held while the task is actually running,
so for $server_options.(bg/fg)_(ticks/seconds) at most. Once the task
suspends all the locks will be released. I also can't think of any
uses for accessing locks where you wouldn't just fetch the lock from
the object again.

> Or even just an std::vector<mutex>. That would
> be even better; then you just do something like this to unlock all the
> locks:
> std::for_each(locks.begin(), locks.end(), [&](std::mutex lock) {
> lock.unlock();
> });
> Plus, that final algorithm has far more advantages than disadvantages,
> and would probably be easier to implement.

I don't do C++ so I'm not familiar with what it offers, but I agree
that a simple list of locks to be released is probably the better
option. As I mentioned before I can't think of any uses for a map of
locks.

One thing that has just occurred to me is that whatever lock
implementation you choose needs a way of *upgrading* from a shared
lock to an exclusive one--otherwise a task would always deadlock
itself when writing to a property it had previously read!

> It wouldn't be overly
> complicated to make notify()/read() run in their own thread now; the
> issue is actually (for read()) taking control from the interpreter and
> passing it to the thread, then returning it when the thread
> terminates.

I'd imagine that you would leave all of the I/O on the main thread
(it's certainly not a bottleneck), and pass the task notify() queue
over to it for transmission as part of the commit process. I don't
think that you need a lock on the network connection, though I haven't
put much thought into this. read() obviously does not require any
locks, since calling it will *release* all the locks as part of
suspending the task.

It has also occurred to me that there are a few more internal details
that will need to be covered:

- create() at least needs to make sure it performs an atomic increment
of the next object number. (Fortunately recycle() is not involved in
any of this.) For strict correctness, it should actually take an
exclusive lock on this value, and max_object() should take a shared
lock. Otherwise the manual's comment that "the next object created
will be assigned the object number one larger than the value of
max_object()" may not hold true, if another object is created
concurrently. I have no idea if there is any code that actually
depends on this.

- I've been assuming that it may be possible to take a lock on
individual properties instead of an entire object. However, this would
make object_bytes() return the wrong value if it is called while
another task is editing a property. Likewise this would cause problems
when adding and deleting properties and verbs. I'd guess that
individual property locks would have too much overhead anyway so this
may not matter.

- chparent() and move() need to also take locks on the parents and
containers, respectively, so that their children() and .contents don't
have bogus values.

- renumber() seems likely to cause complications, since it moves the
entire object while other things may be trying to use it. It probably
makes sense to disable it unless the server is started in
single-threaded mode, since it's not used in regular operation anyway.

- Checkpoints can only be started when there are no tasks running,
otherwise objects may be in a partially edited state. Easiest way to
do this is to request that all of the worker threads halt, fork (which
as I understand it doesn't like threads anyway), and then start the
workers again.

- Queue manipulation functions (queued_tasks(), queue_info(), etc)
need to take a lock on the entire queue and stop any other tasks from
being run. Otherwise they could try to get info on a task that has
just completed and no longer exists.

- There are undoubtedly other things I've missed that would need to be handled.

Clearly this is not going to be a trivial task, though it does not
seem insurmountable either.

(P. S. - You sent this email to me only instead of the group. I've
taken the liberty of adding the group back in to my reply.)

Luke Dashjr

Nov 3, 2018, 2:27:13 AM
to MOO-...@googlegroups.com, Wolfgang Faust, harlyd...@gmail.com
On Saturday 03 November 2018 04:06:29 Wolfgang Faust wrote:
> - renumber() seems likely to cause complications, since it moves the
> entire object while other things may be trying to use it. It probably
> makes sense to disable it unless the server is started in
> single-threaded mode, since it's not used in regular operation anyway.

Hmm, I suspect renumber() may break the assumptions of existing MOO code
already. Storing the object in a variable and sleeping is basically unsafe...

> - Checkpoints can only be started when there are no tasks running,
> otherwise objects may be in a partially edited state. Easiest way to
> do this is to request that all of the worker threads halt, fork (which
> as I understand it doesn't like threads anyway), and then start the
> workers again.

They need to halt somehow regardless, unless you're planning to get rid of the
tick/second limits (which seems like a bad idea).

But if you're just building up a map of changes to commit at halt, instead of
changing them in place, you can checkpoint without waiting for things to halt
by just ignoring those maps.

> - Queue manipulation functions (queued_tasks(), queue_info(), etc)
> need to take a lock on the entire queue and stop any other tasks from
> being run. Otherwise they could try to get info on a task that has
> just completed and no longer exists.

Or just refcount the task ids?

Luke

Ethin Probst

Nov 3, 2018, 1:20:14 PM
to Luke Dashjr, MOO-...@googlegroups.com, Wolfgang Faust
Yeah, I know, this would be no trivial task. But modernizing MOO in
general is no trivial task anyway. MOO would be a hell of a lot
better, IMO, if it took advantage of more modern technologies (e.g.
LLVM for the parser like Tyler said, a different database format that
can be trivially modified offline, or multithreading).
@Tyler, your ideas are nice; the database could be updated to use a
different system altogether. In fact, I was poking around at various
relational and NoSQL DBMSs to see what I could find that would satisfy
MOO's requirements, especially Stunt's. Unfortunately most relational
DBs don't qualify, since you'd need something like five tables per
object, which would be an outrageously huge database. Something like
etcd might work, since it's key/value and you can have directories in
etcd, or even MongoDB.
The LLVM parser is going to be a huge, messy problem. One way to make
it as fast as possible (probably even faster than MOO is now) would be
to JIT-compile each verb as soon as possible (preferably when
set_verb_code() is called), and then store the compiled machine code
in memory. That somewhat defeats the purpose of JIT -- compiling on
the fly, when it's actually needed -- but if you're not going to
modify a verb for a while, making the server recompile it every time
it's executed seems inefficient. The only time I can see this causing
problems is when a verb has already begun execution and a programmer
edits and recompiles it. (This especially applies to forked tasks.)
Will the old machine code be effectively "hotpatched" by the new
machine code? Or will the old code be allowed to finish executing and
the new code take over afterward?
Also, one final issue that I've discovered while playing with Stunt:
try/catch. This is in the server, not in MOO (try/except still works
in MOO code, thank god), but try/catch in the server code doesn't
work. That led me to a rather concerning bug that crashes the server
hard: if you call open_network_connection() with a valid host but a
closed port, *** B O O M ***. The server crashes instantly -- no panic
or anything. Or mine does, anyway. I've backtraced it to this function
in net_multi.cc:
#ifdef OUTBOUND_NETWORK

enum error
network_open_connection(Var arglist, server_listener sl)
{
    int rfd, wfd;
    const char *local_name, *remote_name;
    enum error e;

    e = proto_open_connection(arglist, &rfd, &wfd, &local_name, &remote_name);
    if (e == E_NONE)
        make_new_connection(sl, rfd, wfd, local_name, remote_name, 1);

    return e;
}
#endif
Furthermore, this problem seems to stem from proto_open_connection(),
in net_bsd_tcp.cc. So I went digging and found this portion of that
function that uses try/catch:
try {
    id = set_timer(server_int_option("outbound_connect_timeout", 5),
                   timeout_proc, 0);
    result = connect(s, (struct sockaddr *) &addr, sizeof(addr));
    cancel_timer(id);
}
catch (timeout_exception& exception) {
    result = -1;
    errno = ETIMEDOUT;
    reenable_timers();
}
The error seems to be from:
result = connect(s, (struct sockaddr *)&addr, sizeof(addr));
The confusing thing is why this doesn't work. The server is being
compiled in C++ mode (otherwise stunt wouldn't even run), so I'd
presume that try/catch handlers would be executed if available. I've
even added catch (std::exception &ex) and catch (...) to this code
before and no go. So it seems like all exception handlers are just
being skipped, as if the try/catch block wasn't even there, behavior
that I've never seen before in a C++ program. This is the code for
both the exception and timer_proc, also relevant to this error:
class timeout_exception: public std::exception
{
  public:
    timeout_exception() throw() {}

    virtual ~timeout_exception() throw() {}

    virtual const char* what() const throw() {
        return "timeout";
    }
};

static void
timeout_proc(Timer_ID id, Timer_Data data)
{
    throw timeout_exception();
}
This same issue also happened when I added a bunch of hash functions
(K12, M14, SHA-512, and so on) to extensions.cc. Most of those
built-in functions needed try/catch, since most of the functions I
called could throw std::runtime_error, so I added handlers for those.
But again, the same thing happened: the exception handlers were
skipped and abort() was called as if I'd never defined them.
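One plausible explanation -- an assumption on my part, not something confirmed in the thread -- is that set_timer() delivers a signal (SIGALRM or similar) and timeout_proc throws from inside the signal handler. Throwing a C++ exception out of an asynchronous signal handler is undefined behavior: unless the build uses something like GCC's -fnon-call-exceptions and every intervening frame is unwindable, the unwinder cannot cross the signal frame and std::terminate()/abort() fires no matter what catch clauses are present. A signal-free way to get the same timeout is a non-blocking connect() plus poll(), sketched here for POSIX:

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// connect() with a millisecond timeout and no timers or signals involved.
// Returns 0 on success, -1 on error/timeout (errno set accordingly).
int connect_with_timeout(int s, const sockaddr *addr, socklen_t len,
                         int timeout_ms) {
    int flags = fcntl(s, F_GETFL, 0);
    fcntl(s, F_SETFL, flags | O_NONBLOCK);
    int r = connect(s, addr, len);
    if (r < 0 && errno == EINPROGRESS) {
        pollfd pfd = { s, POLLOUT, 0 };
        if (poll(&pfd, 1, timeout_ms) <= 0) {   // timed out (or poll error)
            errno = ETIMEDOUT;
            r = -1;
        } else {
            int err = 0;
            socklen_t elen = sizeof err;
            getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &elen);
            if (err) { errno = err; r = -1; } else r = 0;
        }
    }
    fcntl(s, F_SETFL, flags);  // restore the original blocking mode
    return r;
}
```

Against a reachable host with a closed port this returns -1 with ECONNREFUSED instead of crashing, which is exactly the case that was aborting the server.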

Michael Munson

Nov 3, 2018, 1:36:51 PM
to Ethin Probst, Luke-Jr, MOO-...@googlegroups.com, wolfg...@gmail.com
I think it is easily doable. Well, easily is relative of course. Don't try to re-invent the wheel. There are tons of C++ libraries that would make this easier.

For example, if you could change the MOO main loop into an event based system using either boost asio or libuv then you would have done most of the work to make MOO threadsafe right there.

Littlefield, Tyler

Nov 3, 2018, 3:44:49 PM
to Wolfgang Faust, harlyd...@gmail.com, moo-...@googlegroups.com
I'm still confused about where this would be useful, and it sounds like it's still premature. I love the concept of sharding, but sharding would look very different from a multithreaded MOO. I think this is great in theory, but there are so many internal details to iron out, and unless someone has call graphs and profile statistics showing where the bottlenecks are, various other things would be more useful for speeding up MOO. For example:
1. Rewriting the parser to go through LLVM, making the VM obsolete. With LLVM you get a JIT, which would presumably provide a decent performance boost.
2. Writing code optimizations such as DCE and constant folding (10*10 just becomes 100).
3. Finding a better, non-flat-file way to store MOO objects. I'm envisioning something that moves us closer to transactional writes, which would in turn move us closer to threads.
4. Implementing a nice telnet parser, MCCP, etc.

I'm sure MOO has bottlenecks, I'm just really not sure what making it multithreaded accomplishes right now.
I also put comments inline.

On 11/3/2018 12:06 AM, Wolfgang Faust wrote:
> On Fri, Nov 2, 2018 at 5:39 PM Ethin Probst <harlyd...@gmail.com> wrote:
>> I like your lock algorithm. For the lock maintenance, could an
>> std::unordered_map<int, <lock_type>> work? That would make it easy for
>> verbs to access the map if they need to forcefully unlock one of them
>> (or create new locks).

I'm struggling to see where a map would be useful and why a verb would be forcefully unlocking anything.

> Premature unlocking seems likely to be a recipe for disaster. Bear in
> mind that locks will only be held while the task is actually running,
> so for $server_options.(bg/fg)_(ticks/seconds) at most. Once the task
> suspends all the locks will be released. I also can't think of any
> uses for accessing locks where you wouldn't just fetch the lock from
> the object again.

>> Or even just an std::vector<mutex>. That would
>> be even better; then you just do something like this to unlock all the
>> locks:
>> std::for_each(locks.begin(), locks.end(), [&](std::mutex lock) {
>> lock.unlock();
>> });
>> Plus, that final algorithm has far more advantages than disadvantages,
>> and would probably be easier to implement.


C++11 called; you don't need that atrocity of code. It's also worth noting that the more locks you pile on, the more likely it is that you'll go from wide to narrow (or narrow to wide) and try locking something that's already locked. That won't be good! There are recursive mutexes, but they only partially solve the larger problem. As I've said a few times, you have to be incredibly careful where you apply locks.


> I don't do C++ so I'm not familiar with what it offers, but I agree
> that a simple list of locks to be released is probably the better
> option. As I mentioned before I can't think of any uses for a map of
> locks.
>
> One thing that has just occurred to me is that whatever lock
> implementation you choose needs a way of *upgrading* from a shared
> lock to an exclusive one--otherwise a task would always deadlock
> itself when writing to a property it had previously read!
boost::shared_mutex provides this via upgrade_lock (std::shared_mutex has no upgrade path), but you would have to find some clever ways of locking things -- what if I want to operate on multiple objects? You can acquire locks for those objects, but the more you take, the slower things get.



>> It wouldn't be overly
>> complicated to make notify()/read() run in their own thread now; the
>> issue is actually (for read()) taking control from the interpreter and
>> passing it to the thread, then returning it when the thread
>> terminates.
Sending data wouldn't be overly complex, but even here you'll need to make sure that notify and read don't have race conditions, so you don't end up sending lines out of order. MOO uses various methods -- I believe one is polling for input -- and I don't think this is the bottleneck.

Michael Munson

Nov 5, 2018, 4:53:15 AM
to sorressean ., Wolfgang Faust, Ethin Probst, moo-...@googlegroups.com
If we can convert each operation (function call, verb call, database access or modification) to be completely async using boost::asio's io_service, we can change the Objects data structure to a simple STL container and wrap every database operation in a boost::strand. Then we wouldn't need any mutexes, because strands don't need explicit synchronization across threads: a strand guarantees that no two handlers execute concurrently.

What would need to be done is look at the main_loop and convert each individual action it does into instead using an io_worker to do it asynchronously. Then we'd have to find various tasks that happen that aren't generated by the main loop, like new connections, player input, etc.

So long as we didn't actually run it multithreaded until everything was converted it would probably still work even when half was async and half wasn't.
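The serialization guarantee itself doesn't need Boost to demonstrate. Here's a minimal strand-like executor -- one worker draining a queue, so no two posted handlers ever run concurrently (real asio strands multiplex many strands over a thread pool; this sketch dedicates one thread, and the class name is invented):

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// A minimal "strand": handlers posted to it never run concurrently,
// because a single worker drains the queue in post order. Handlers
// therefore never need their own mutexes for strand-owned data.
class Strand {
    std::queue<std::function<void()>> q;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
    std::thread worker;
public:
    Strand() : worker([this] {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m);
                cv.wait(lk, [this] { return done || !q.empty(); });
                if (q.empty()) return;  // done and fully drained
                job = std::move(q.front());
                q.pop();
            }
            job();  // runs strictly one at a time
        }
    }) {}
    void post(std::function<void()> f) {
        { std::lock_guard<std::mutex> lk(m); q.push(std::move(f)); }
        cv.notify_one();
    }
    ~Strand() {  // drain remaining handlers, then join
        { std::lock_guard<std::mutex> lk(m); done = true; }
        cv.notify_one();
        worker.join();
    }
};
```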

I think creating an Objects class and enforcing the only way to access the Objects data structure is through methods on that class, which would essentially be a container wrapped in methods for access and modification. That would be kind of a pain in the butt since a lot of the codebase just uses Objects willy nilly.

I would be willing to work on this if I had help and if we agreed to use boost asio instead of libuv.

David Given

Nov 5, 2018, 11:06:42 AM
to ty...@tysdomain.com, Wolfgang Faust, harlyd...@gmail.com, moo-...@googlegroups.com
On Sat, 3 Nov 2018 at 20:44 Littlefield, Tyler <ty...@tysdomain.com> wrote:
1. Rewriting the parser to go through LLVM and making the vm be obsolete. With LLVM you get JIT which would provide presumably a decent performance boost.

I've done something like this. In real life the performance boost depends hugely on the workload. If you're spending all your time in library code (e.g. mutating persistent data structures in the database) it may not actually be worth it.

Also, I've worked with the LLVM API and it is huge, complex and changes a lot; not actually bad to work with, but consistently annoying. (See http://cowlark.com/calculon/, which is a pain to update every time there's a new LLVM library.) Given MOO's essentially dynamic nature you might want to consider something like LuaJIT instead, where MOO code is translated into Lua and thrown at the VM. The performance is great, it's tiny, the API is stable, C interoperability is good, and you also get a battle-tested garbage collector and data structure library for free (although this last may not be terribly useful if all data accesses need to go via the MOO database so they can be tracked).
 
3) Finding a better non flat-file way to store MOO objects. I'm invisioning something that moves us closer to the concept of transaction writing, which would move us closer to threads.

In my experience there are only three ways to store data on disk:

- flat files, using automatic serialisation via protobufs or equivalent.
- sqlite.
- full-on server-based RDBs.

I've found that anything else is effectively not a good use of time. Sqlite is amazing: as soon as your data becomes even slightly complex it's not worth using anything else, even if your data isn't relational. Even as a key/value datastore it's blisteringly fast, with proper transaction management etc.

One day I want to write a simple interpreter or something which stores all state in an Sqlite database, just to see if the performance is acceptable; I suspect it might be.

Regarding threading, I'd very, very strongly suggest that some alternative form of concurrency would work better. Shared memory threading is hugely difficult to reason about. MOO code, which is essentially actor-based with heavyweight independent objects, should lend itself well to a CSP-based system, though; effectively, this means that each object runs in its own thread and has its own event loop, and data is copied to other actors rather than shared (to prevent mutability nightmares). See https://www.ponylang.io/ for an example of a language which does this.
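A minimal sketch of that object-per-thread shape -- one actor owning its state, a mailbox of copied messages, and no shared mutable data (the class and method names here are invented for illustration):

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// One heavyweight "object" = one thread + one mailbox. Other actors only
// send it copies of data; its state (here, a counter) is never shared.
class CounterActor {
    std::queue<int> mailbox;  // messages are copied in, CSP-style
    std::mutex m;
    std::condition_variable cv;
    bool closing = false;
    int total = 0;            // private state, touched only by the loop
    std::thread loop;
public:
    CounterActor() : loop([this] {
        std::unique_lock<std::mutex> lk(m);
        for (;;) {
            cv.wait(lk, [this] { return closing || !mailbox.empty(); });
            if (mailbox.empty()) return;  // closed and drained
            total += mailbox.front();
            mailbox.pop();
        }
    }) {}
    void send(int v) {  // the only way other threads interact with it
        { std::lock_guard<std::mutex> lk(m); mailbox.push(v); }
        cv.notify_one();
    }
    int shutdown_and_read() {  // join first, then reading state is safe
        { std::lock_guard<std::mutex> lk(m); closing = true; }
        cv.notify_one();
        loop.join();
        return total;
    }
};
```

The mailbox lock here is internal plumbing; the MOO-level programming model would see only message sends, as in Pony or Erlang.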

Luke Dashjr

Nov 5, 2018, 12:56:58 PM
to MOO-...@googlegroups.com, Michael Munson, sorressean ., Wolfgang Faust, Ethin Probst
On Monday 05 November 2018 09:54:39 Michael Munson wrote:
> If we can convert each operation (function call, verb call, database access
> or modification) to completely async using boost asio io_service we can
> just change the Objects data structure to a simple STD container and wrap
> every database operation in a boost::strand we wouldn't need any mutexes
> because strands don't need explicit synchronization across threads. A
> strand guarantees that no two handlers execute concurrently.

Then you end up with the joke of "threading" in CPython.

But not only that, it also won't work.

Consider the verb:

if (this.a > 0)
  --this.a
  ++this.b
endif

If you're not careful, running two instances of this in parallel could execute
like so:

Initial state: this.a = 1
Task A: if (this.a > 0) => TRUE
Task B: if (this.a > 0) => TRUE
Task A: --this.a => this.a is now 0
Task B: --this.a => this.a is now -1
Task A: ++this.b => this.b is now 1
Task B: ++this.b => this.b is now 2
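That interleaving is exactly what the transaction discussion earlier in the thread is meant to prevent: the verb body has to execute as one atomic unit. A sketch with a plain mutex standing in for per-task transactions or strand serialization:

```cpp
#include <cassert>
#include <mutex>
#include <thread>

static int a = 1, b = 0;        // this.a and this.b from Luke's example
static std::mutex verb_lock;

// The verb made atomic: the test and both updates happen under one lock,
// so whichever task runs second must observe the first task's writes.
void verb() {
    std::lock_guard<std::mutex> g(verb_lock);
    if (a > 0) {
        --a;
        ++b;
    }
}
```

Without the lock, the interleaving above can leave a == -1 and b == 2; with it, exactly one task takes the branch.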

Michael Munson

Nov 5, 2018, 7:00:32 PM
to Luke-Jr, MOO-...@googlegroups.com, sorressean ., Wolfgang Faust, Ethin Probst
I am enlightened.

Ryan Daum

Nov 15, 2018, 1:40:35 PM
to MOO-...@googlegroups.com
In this day and age, if I wanted to make a MOO-type system that took advantage of concurrency and didn't hold a global lock, I'd look into some kind of software transactional memory around the whole shared world and avoid the world of mutexes entirely.

I think it'd be a real PITA to get there from the existing LambdaMOO codebase though.

R

Michael Munson

Dec 2, 2018, 12:20:03 AM
to ryan...@gmail.com, MOO-...@googlegroups.com
Didn't processors just start supporting transactional memory in hardware fairly recently? Like in the last couple of years? Aside from that, you do have a point. Mutexes do have better performance than STM when conflicts are high -- which might be the case in a MOO Objects container if multiple threads are executing multiple verbs.

Ethin Probst

Dec 2, 2018, 1:29:01 AM
to Michael Munson, ryan...@gmail.com, MOO-...@googlegroups.com
Mutexes may have higher performance, but they are also incredibly
dangerous to use (not to mention the difficulty of avoiding deadlocks
and race conditions). Granted, STM has race conditions too, so that
problem would still exist; deadlocks, though, would most likely be
avoided by STM. I'm not even sure of the status of C/C++ compilers
that support STM, though I'd think most of them would support it (the
standard, ISO/IEC TS 19841:2015, was published in October 2015). I'd
raise the problem that STM is non-standard C++, but MOO is
non-standard as is.
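Independent of compiler TS support, the optimistic read-compute-commit-retry discipline at the heart of STM can be sketched over a single word with std::atomic. Real STM tracks whole read/write sets; this toy only shows the retry loop, and the class name is made up:

```cpp
#include <atomic>
#include <cassert>

// A single transactional integer in the optimistic style STM systems use:
// read a snapshot, compute, then publish only if nobody else committed in
// between; otherwise retry with the fresh value. No locks, no deadlock.
struct TVar {
    std::atomic<int> value{0};

    template <typename F>
    int atomically(F f) {
        int seen = value.load();
        for (;;) {
            int next = f(seen);  // pure function of the snapshot
            if (value.compare_exchange_weak(seen, next))
                return next;     // commit succeeded
            // on failure, seen now holds the conflicting value;
            // looping is the "rollback and retry"
        }
    }
};
```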

Ethin Probst

Dec 2, 2018, 1:31:45 AM
to Michael Munson, ryan...@gmail.com, MOO-...@googlegroups.com
I also forgot to append to my first statement that mutexes (naturally)
aren't dangerous in the sense that a virus is "dangerous", but in the
sense that they are difficult to get right in certain cases (though
C++ mostly absolves you of that problem with features like
std::lock_guard).

Michael Munson

Dec 2, 2018, 2:40:50 AM
to Ethin Probst, ryan...@gmail.com, MOO-...@googlegroups.com
Yes, I think using lock_guard would be necessary considering the unpredictable access of an Object's data structure. I don't think just internal locking would be sufficient.

Michael Munson

unread,
Dec 2, 2018, 3:00:12 AM12/2/18
to Wolfgang Faust, Ethin Probst, MOO-...@googlegroups.com
If I were using Boost, I would change Object from a structure to a class that inherits from basic_lockable_adapter. Then I'd find all the functions that do dirty deeds to an individual Object (rename, move, chparent -- basically everything in db_objects.cc and a lot in objects.cc, etc.), move them onto the class, and use a lock guard before changes are made.

For example, instead of the global function db_set_object_flag we'd have a set_flag method on the Object class, which would do something like this before changes were made (std::lock_guard would work the same way):
boost::lock_guard<Object> guard(*this);
It would be a lot of work to Classify MOO objects, but it wouldn't be insurmountable. A lot of the work would be changing all the calls to things like do_move(object, ...) into object.move(...), for example. Consider how many things affect objects from random function calls. MOST of them are in objects.cc and db_objects.cc -- but not all; property setting, for example.

Michael Munson

unread,
Dec 2, 2018, 3:40:58 AM12/2/18
to Wolfgang Faust, Ethin Probst, MOO-...@googlegroups.com
Here is an extremely basic example of a "classified" Object, to give you an idea of what would need to be done. This is the bare minimum and not nearly enough, but it shows the shape of the change:

class Object : public boost::basic_lockable_adapter<boost::mutex> {
private:
    Objid _id;
    Objid _owner;
    const char *_name;
    int _flags; /* see db.h for `flags' values */
    Var _location;
    Var _contents;
    Var _parents;
    Var _children;
    Pval *_propval;
    unsigned int _nval;
    Verbdef *_verbdefs;
    Proplist _propdefs;
    /* The nonce marks changes to the propval layout caused by changes
     * to parentage, property addition/deletion, etc.  Every value is
     * globally unique.
     */
    unsigned int _nonce;
public:
    Objid id() const {return _id;}
    void id(const Objid var) {boost::lock_guard<Object> guard(*this); _id = var;}
    Objid owner() const {return _owner;}
    void owner(const Objid var) {boost::lock_guard<Object> guard(*this); _owner = var;}
    const char *name() const {return _name;}
    void name(const char *var) {boost::lock_guard<Object> guard(*this); _name = var;}
    int flags() const {return _flags;}
    void flags(const int var) {boost::lock_guard<Object> guard(*this); _flags = var;}
    Var location() const {return _location;}
    void location(const Var var) {boost::lock_guard<Object> guard(*this); _location = var;}
    Var contents() const {return _contents;}
    void contents(const Var var) {boost::lock_guard<Object> guard(*this); _contents = var;}
    Var parents() const {return _parents;}
    void parents(const Var var) {boost::lock_guard<Object> guard(*this); _parents = var;}
    Var children() const {return _children;}
    void children(const Var var) {boost::lock_guard<Object> guard(*this); _children = var;}
    Pval* propval() const {return _propval;}
    void propval(Pval* var) {boost::lock_guard<Object> guard(*this); _propval = var;}
    unsigned int nval() const {return _nval;}
    void nval(const unsigned int var) {boost::lock_guard<Object> guard(*this); _nval = var;}
    Verbdef* verbdefs() const {return _verbdefs;}
    void verbdefs(Verbdef* var) {boost::lock_guard<Object> guard(*this); _verbdefs = var;}
    Proplist propdefs() const {return _propdefs;}
    void propdefs(const Proplist var) {boost::lock_guard<Object> guard(*this); _propdefs = var;}
    unsigned int nonce() const {return _nonce;}
    void nonce(const unsigned int var) {boost::lock_guard<Object> guard(*this); _nonce = var;}
};

Properties would need a mutex as well for sure, and after that the various functions would need to be moved to the Object class and any thing that called the old function would need to be changed to reference the object's member function.


Michael Munson

unread,
Dec 2, 2018, 3:45:28 AM12/2/18
to Wolfgang Faust, Ethin Probst, MOO-...@googlegroups.com
I think it wouldn't be out of bounds to try to shift into the 21st century by replacing all the linked lists and the void* data you have to increment by so many bytes with modern standard containers, either. I mean, really.

Ethin Probst

unread,
Dec 2, 2018, 4:55:26 AM12/2/18
to Michael Munson, Wolfgang Faust, MOO-...@googlegroups.com
Lots of excellent ideas here. Perhaps, even, we could do the following
while we're at it:
1: change all the object references to boost::flyweights? It makes sense, and would probably decrease memory usage by a large amount (word processors do this, for example, and each object/character can be different).
2: Perhaps, even, alter your class to be more ACID compliant. Something like:
#include <atomic>
#include <string>

class Object : public boost::basic_lockable_adapter<boost::mutex> {
private:
    std::atomic<Objid> _id;
    std::atomic<Objid> _owner;
    /* std::atomic<std::string> won't compile (std::atomic requires a
     * trivially copyable type), so the name stays lock-guarded instead. */
    std::string _name;
    std::atomic<int> _flags; /* see db.h for `flags' values */
    /* These compile only if Var and Proplist are trivially copyable. */
    std::atomic<Var> _location;
    std::atomic<Var> _contents;
    std::atomic<Var> _parents;
    std::atomic<Var> _children;
    std::atomic<Pval*> _propval;
    std::atomic<unsigned int> _nval;
    std::atomic<Verbdef*> _verbdefs;
    std::atomic<Proplist> _propdefs;
    /* The nonce marks changes to the propval layout caused by changes
     * to parentage, property addition/deletion, etc.  Every value is
     * globally unique.
     */
    std::atomic<unsigned int> _nonce;
public:
    Objid id() const {return _id.load();}
    void id(const Objid var) {boost::lock_guard<Object> guard(*this); _id.store(var);}
    Objid owner() const {return _owner.load();}
    void owner(const Objid var) {boost::lock_guard<Object> guard(*this); _owner.store(var);}
    std::string name() const {
        /* _name isn't atomic, so readers must lock too */
        boost::lock_guard<Object> guard(const_cast<Object &>(*this));
        return _name;
    }
    void name(const std::string &var) {boost::lock_guard<Object> guard(*this); _name = var;}
    int flags() const {return _flags.load();}
    void flags(const int var) {boost::lock_guard<Object> guard(*this); _flags.store(var);}
    Var location() const {return _location.load();}
    void location(const Var var) {boost::lock_guard<Object> guard(*this); _location.store(var);}
    Var contents() const {return _contents.load();}
    void contents(const Var var) {boost::lock_guard<Object> guard(*this); _contents.store(var);}
    Var parents() const {return _parents.load();}
    void parents(const Var var) {boost::lock_guard<Object> guard(*this); _parents.store(var);}
    Var children() const {return _children.load();}
    void children(const Var var) {boost::lock_guard<Object> guard(*this); _children.store(var);}
    Pval* propval() const {return _propval.load();}
    void propval(Pval* var) {boost::lock_guard<Object> guard(*this); _propval.store(var);}
    unsigned int nval() const {return _nval.load();}
    void nval(const unsigned int var) {boost::lock_guard<Object> guard(*this); _nval.store(var);}
    Verbdef* verbdefs() const {return _verbdefs.load();}
    void verbdefs(Verbdef* var) {boost::lock_guard<Object> guard(*this); _verbdefs.store(var);}
    Proplist propdefs() const {return _propdefs.load();}
    void propdefs(const Proplist var) {boost::lock_guard<Object> guard(*this); _propdefs.store(var);}
    unsigned int nonce() const {return _nonce.load();}
    void nonce(const unsigned int var) {boost::lock_guard<Object> guard(*this); _nonce.store(var);}
};
This may or may not work, but something like it might be an improvement, and may go further toward adding threading protection by ensuring that when an object is changed, either the change is applied completely or it is not applied at all.

Michael Munson

unread,
Dec 2, 2018, 5:25:43 AM12/2/18
to Ethin Probst, Wolfgang Faust, MOO-...@googlegroups.com
boost::flyweights would definitely be useful for things like names and in the Var union for string values.
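To show the sharing that makes flyweights attractive for names and string Vars, here is a hand-rolled interning table in standard C++. This is NOT boost::flyweight's actual API -- `intern` is a made-up name -- just the underlying idea: identical strings are stored once and shared.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Flyweight idea by hand: every caller asking for the same string gets a
// pointer to one shared, immutable copy. A threaded server would also
// need to lock this pool; omitted here for brevity.
std::shared_ptr<const std::string> intern(const std::string &s) {
    static std::unordered_map<std::string,
                              std::shared_ptr<const std::string>> pool;
    auto it = pool.find(s);
    if (it == pool.end())
        it = pool.emplace(s, std::make_shared<const std::string>(s)).first;
    return it->second;
}
```

A thousand objects named "generic thing" would then hold a thousand cheap pointers to one string, which is where the memory savings come from.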

Another thing I would like is replacing Objid (it's currently just an int64) with a tuple of a UUID and a string. The string would be how you referenced the object in-game, like #333. However, with the UUID in the background we could invalidate references to objects that have been destroyed or recycled (there would be some method to regen a UUID on an existing object to force this behavior so things like $recycler could work.) Presumably if the UUID is blank in the Objid it would function like the old-style Objids.

In that case flyweights would be ideal for those types of Objids.

We'd also definitely need to replace the Objects array with something like a std::unordered_map, so we wouldn't need to walk through an array to find an Object that isn't referenced by a number.
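A minimal sketch of that container swap, assuming the in-game ID travels as a string; `Obj` and `find_object` are illustrative names, not the server's:

```cpp
#include <string>
#include <unordered_map>

// The unordered_map keyed on the in-game ID string replaces the Objects
// array: lookup is O(1) on average, and IDs no longer need to be dense
// integers, so nothing has to be renumbered.
struct Obj {
    std::string uuid;
    std::string id; // e.g. "#333", but any string would work
};

std::unordered_map<std::string, Obj> objects;

Obj *find_object(const std::string &id) {
    auto it = objects.find(id);
    return it == objects.end() ? nullptr : &it->second;
}
```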

Ethin Probst

unread,
Dec 2, 2018, 6:12:29 AM12/2/18
to Michael Munson, Wolfgang Faust, MOO-...@googlegroups.com
How would the invalidation work, though? Would an object be invalidated, and then that allocated object number would become reusable later? Currently, if recycle() is called directly, an object number becomes permanently useless -- something I was unhappy with as soon as I figured it out, since it never really made sense to me. Reusing a UUID could only be done one way: caching. The only other way would be to subvert the UUID generation algorithm (which would negate the universal uniqueness of the resultant string). We'd also need a back-mapping list, so you'd have a map of UUIDs to strings and vice versa. Something like a std::unordered_map/std::map would do nicely.
What did you think of my atomic version of your class?
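That back-mapping could be as simple as two unordered_maps kept in sync, one per direction (the names here are made up):

```cpp
#include <string>
#include <unordered_map>

// Bidirectional UUID <-> object-number lookup: two maps, both updated
// together, so either direction resolves in O(1) on average.
std::unordered_map<std::string, std::string> uuid_to_id;
std::unordered_map<std::string, std::string> id_to_uuid;

void register_object(const std::string &uuid, const std::string &id) {
    uuid_to_id[uuid] = id;
    id_to_uuid[id] = uuid;
}
```

Removal would need to erase from both maps in one step as well, or the two views drift apart.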
What did you think of my atomic version of your class?

Michael Munson

unread,
Dec 2, 2018, 6:08:28 PM12/2/18
to Ethin Probst, Wolfgang Faust, MOO-...@googlegroups.com
An invalidated Objid would resolve to #-1 or something similar that would return false for a valid() check, if the Object with that UUID no longer existed. This is roughly how it is supposed to work now: the server expects you to recycle() objects, which eliminates the problem of dangling object-ID references pointing at new Objects. However, a lot of us have issues with that; we like to reuse object numbers. That occasionally leaves object IDs referencing things that shouldn't exist anymore -- or rather, referencing something new that has no bearing on what the ID used to mean. If you consider that, you'll realize it's the reason for a good half of isa() calls: to verify that an object ID actually points at what we expect.

We'd still reference things by the object numbers (although technically we could have them be any arbitrary string!) The UUID is just so that it can check if the Object is still the object it was referring to since each Objid object would basically be a pair of <UUID,std::string> -- Although having UUIDs might be kind of nice if we ever develop a useful MOO communication protocol.

UUIDs wouldn't be reused at all, not gonna run out.

The bonus to this is that there would no longer be any overhead for incrementing object numbers and "losing" them when you recycled() an object. We could have it function as before, or even have a function to change the ID on an Object on the fly, since an unordered hash map doesn't care one jimmy. Because the Objects array would no longer be a shitty array, it would no longer have to be renumbered if you wanted to reuse an already-used object number. It would pretty much eliminate the need for the $recycler object that most MOOs use.
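A hedged sketch of how such a <UUID, string> Objid and its valid() check might look; the registry and field names here are assumptions for illustration, not anything in the server:

```cpp
#include <set>
#include <string>

// Registry of UUIDs for objects that currently exist. recycle() would
// erase the UUID, instantly invalidating every stale reference, even if
// the object number (e.g. "#333") is later handed to a new object.
std::set<std::string> live_uuids;

struct Objid {
    std::string uuid;
    std::string number; // e.g. "#333"
    bool valid() const {
        // A blank UUID behaves like an old-style Objid (no check).
        return uuid.empty() || live_uuids.count(uuid) > 0;
    }
};
```

This is the half of isa() mentioned above done for free: a reference carries proof of which incarnation of the number it meant.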


Ethin Probst

unread,
Dec 2, 2018, 8:07:56 PM12/2/18
to Michael Munson, Wolfgang Faust, MOO-...@googlegroups.com
It most certainly would. It would make the renumber() built-in function useful, instead of just something that's there but not really used by anyone because it poses far too great a risk to actually be used.