Hi Tim, this is Michael Wong, chair of SG5. I have noted a number of opinions you gave to SG5 and SG7, all related to transactional memory in some way. We would like to reach out to get your feedback and comments on our direction, and specifically on this proposal.
If you like, we can forward to you the latest version.
Would you be able to get on a call (you name the time) with our group to talk about the future direction for TM? We are currently gathering user feedback, and your opinion, coming from industry, will be particularly helpful. Thanks.
Tim Sweeney has been occasionally pinging us or talking about us:
From Jan 29 meeting:
In my view, this is going the wrong way. Here's a summary of reasons, and an alternative path forward.
First, let's recognize that the following topics are closely linked:
- Transactional memory (how do we track all reads and writes to all shared memory?)
- Persistence (how do we allocate, find, persist, and manage data long-term without corruption?)
- Garbage collection (how do we find out what memory is actively being used?)
- ABI (how do we provide interface and data backwards-compatibility over multiple program invocations and even across platforms?)
- Reflection (what is the format of all of our data?)
There are several different ways that SG5 could approach this topic.
The current approach is to try to expose transactional memory at the language level. This is difficult, expensive, and not fully orthogonal to the other topics above. More generally, it seems aloof from the C++ way, which is to expose general abstractions to programmers so we can implement specific features. Examples of abstractions include functions (1960), templates (1990), and reflection (2020?).
My view is: Give us a great reflection spec, and we'll do the other things ourselves in libraries. Doing this in libraries would be a good thing because:
- Developers can experiment and discover what works best, as opposed to mandating a solution that's only roughly prototyped.
- Designing transactional, persistent, garbage-collected, binary-forward-compatible containers leads to very different designs than std.
- New transactional, persistent, garbage-collected code will need to coexist and interoperate with existing libraries, so fine-grained control will be needed -- which is natural with a library solution to these problems, and unnatural with cross-cutting language features.
- Reflection provides the full toolset needed to build the features above. Transactions via new templated container types; persistence via anything from serialization to patching memory to upgrade versions in-place; garbage collection via metadata; and ABI compatibility by automatically creating forward-compatible wrappers and adapters.
A minimalist alternative for SG5 is to simply bless (via std extensions) the kind of accelerated-but-not-guaranteed restricted transactional memory of Intel's TSX and similar proposals. These are well-understood low-level features that libraries can build on to implement full transactional memory and the other things, on an opt-in basis.
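As an editorial illustration of what "accelerated-but-not-guaranteed" means for a library author, here is a minimal sketch that attempts a hardware transaction and always keeps a lock-based fallback. The names `fallback_lock` and `run_atomically`, and the retry count of 3, are hypothetical. The `_xbegin`, `_xend`, and `_xabort` intrinsics are Intel's RTM intrinsics and are only used when the compiler defines `__RTM__`; the sketch assumes such a build runs on hardware that actually supports RTM. Otherwise the code simply takes the fallback path.

```cpp
#include <atomic>
#include <cassert>
#if defined(__RTM__)
#include <immintrin.h>  // _xbegin, _xend, _xabort (Intel RTM intrinsics)
#endif

// Hypothetical spin lock used when a hardware transaction is unavailable or aborts.
class fallback_lock {
    std::atomic<bool> held_{false};
public:
    bool is_held() const { return held_.load(std::memory_order_acquire); }
    void acquire() { while (held_.exchange(true, std::memory_order_acquire)) {} }
    void release() { held_.store(false, std::memory_order_release); }
};

// Runs body() atomically with respect to other callers that use the same lock.
// Returns true if a hardware transaction committed, false if the lock was used.
// Assumption: body() does not throw.
template <class F>
bool run_atomically(fallback_lock& lock, F&& body) {
#if defined(__RTM__)
    for (int attempt = 0; attempt < 3; ++attempt) {
        if (_xbegin() == _XBEGIN_STARTED) {
            // Reading the lock adds it to the transaction's read set, so a
            // concurrent fallback acquisition aborts this transaction.
            if (lock.is_held()) _xabort(0xff);
            body();
            _xend();
            return true;
        }
        // The transaction aborted; the hardware gives no guarantee it ever succeeds.
    }
#endif
    lock.acquire();  // guaranteed path
    body();
    lock.release();
    return false;
}
```

The point of the design is that the transaction is purely an optimization: correctness rests on the fallback lock, which is why such a facility can be offered on an opt-in basis.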
From Tim Sweeney to SG7 reflection on 12/28/17
With just reflection, and no reliance on a future generative C++ proposal, we can generate specialized functions that mimic the behavior of constructors and destructors but are customized for special usage cases. For example: a "deserializing constructor" to generate a new instance of a class from a stream, or helper constructors for optimized garbage-collection schemes.
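To make the "deserializing constructor" idea concrete, here is a minimal C++17 sketch (an editorial illustration, not code from the message). Since static reflection is not assumed to be available, a hand-written `members()` list stands in for what reflection would enumerate automatically. `Player`, `members()`, and `deserialize` are hypothetical names, and the whitespace-separated text format is an assumption.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <tuple>

// Hypothetical type. members() is the hand-written stand-in for static
// reflection: it lists every data member as a pointer-to-member.
struct Player {
    std::string name;
    int level = 0;
    double health = 0.0;

    static constexpr auto members() {
        return std::make_tuple(&Player::name, &Player::level, &Player::health);
    }
};

// Generic "deserializing constructor": creates a T and reads each listed
// member from the stream, in declaration order.
template <class T>
T deserialize(std::istream& in) {
    T value{};
    std::apply([&](auto... member) { ((in >> value.*member), ...); },
               T::members());
    return value;
}
```

With reflection, the `members()` function would not need to be written by hand, so `deserialize` would work for any class whose members can be read from a stream.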
One thing we can't do with reflection alone is member-specific customization of smart pointers to classes. For example, given "gc_smart_pointer<t> p", we can implement "gc_smart_pointer<t>::operator->()", but it has to behave uniformly for all types.
Could we have a per-class overloadable templated variant of operator->(t&) which receives a meta object describing the particular member being accessed? Then it can customize its behavior according to the type and member being accessed. This would be useful for optimized garbage-collection schemes (where accessing a POD can be optimized compared to a garbage-collector-managed type); software transactional memory schemes (which would like to store data in a class as a simple type, but access it using a wrapper type); marshaling layers that connect C++ to scripting languages; etc.
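The requested operator does not exist in C++, but its effect can be approximated today with an explicit accessor that takes a pointer-to-member. The sketch below is an editorial illustration under stated assumptions: `access`, `barrier_hits`, `Actor`, and `Managed` are hypothetical, and "pointer-typed member" is used as a crude stand-in for "garbage-collector-managed member".

```cpp
#include <cassert>
#include <type_traits>

struct Managed { int id = 0; };   // hypothetical GC-managed type

struct Actor {
    int hit_points = 100;         // plain data: can be accessed directly
    Managed* target = nullptr;    // managed reference: access needs extra work
};

template <class T>
class gc_smart_pointer {
    T* object_;
public:
    int barrier_hits = 0;         // counts accesses that took the slow path

    explicit gc_smart_pointer(T* object) : object_(object) {}

    // Stand-in for a member-aware operator->: the member's type M is known at
    // compile time, so behavior can differ per member.
    template <class M>
    M& access(M T::* member) {
        if constexpr (std::is_pointer_v<M>) {
            ++barrier_hits;       // placeholder for a GC barrier or STM wrapper
        }
        return object_->*member;
    }
};
```

The call site becomes `p.access(&Actor::hit_points)` instead of `p->hit_points`, which is exactly the syntactic cost the proposed language feature would remove.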
From Tim Sweeney to SG7
Will P0194 be extended to support lambdas, and specifically reflecting on the number and type of lambda captures?
Reflecting on lambda captures is critically important in the case of implementing a garbage collector on top of standard C++, without hardcoding knowledge of memory layout or other things.
Background
In standard C++, a general-purpose garbage collector can be implemented on top of smart pointers with reference counting. Any allocation with a nonzero reference count is treated as a GC root. To get from this starting point to real garbage collection, we can provide a mechanism for certain types (such as containers) which are themselves heap-allocated and reference counted, to release the reference counts of their contents once they're initialized.
This can be automated by replacing "new t(parms)" with "newref<t>(parms)", which allocates memory, calls a constructor, and ensures smart pointers release their reference counts immediately rather than in their destructor. This approach breaks the reference-counting cycles for those heap-allocated types, while ensuring everything on the stack remains a GC root.
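The scheme can be sketched in a few dozen lines. This is an editorial, single-threaded illustration under stated assumptions, not the collector described in the message: `Object`, `RefBase`, `ref`, `visit_refs`, `heap`, `mark`, `collect`, and `Node` are hypothetical names, and `visit_refs` is the hand-written function that reflection could generate.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>
#include <vector>

struct Object;
struct RefBase {                  // type-erased part of a smart pointer
    Object* target = nullptr;
    bool rooted = true;           // rooted pointers hold a reference count
};
using Visitor = std::function<void(RefBase&)>;

struct Object {                   // hypothetical base of every collected type
    int root_count = 0;           // number of rooted pointers to this object
    bool marked = false;
    virtual ~Object() = default;
    virtual void visit_refs(const Visitor&) {}  // hand-written per type
};

inline std::unordered_set<Object*>& heap() {
    static std::unordered_set<Object*> objects;
    return objects;
}

template <class T>
class ref : public RefBase {
public:
    ref() = default;
    explicit ref(T* p) { target = p; if (p) ++p->root_count; }
    ref(const ref& other) : ref(static_cast<T*>(other.target)) {}
    ref& operator=(const ref& other) {
        if (rooted) {             // unrooted members only retarget
            if (other.target) ++other.target->root_count;
            if (target) --target->root_count;
        }
        target = other.target;
        return *this;
    }
    ~ref() { if (rooted && target) --target->root_count; }
    T* operator->() const { return static_cast<T*>(target); }
};

// Allocates, constructs, then makes the new object's member pointers release
// their counts immediately, so only pointers outside the heap act as roots.
template <class T, class... Args>
ref<T> newref(Args&&... args) {
    T* object = new T(std::forward<Args>(args)...);
    heap().insert(object);
    object->visit_refs([](RefBase& r) {
        if (r.rooted && r.target) --r.target->root_count;
        r.rooted = false;
    });
    return ref<T>(object);
}

inline void mark(Object* object) {
    if (!object || object->marked) return;
    object->marked = true;
    object->visit_refs([](RefBase& r) { mark(r.target); });
}

// Marks everything reachable from objects with a nonzero count, frees the rest.
inline std::size_t collect() {
    for (Object* o : heap()) o->marked = false;
    for (Object* o : heap()) if (o->root_count > 0) mark(o);
    std::vector<Object*> dead;
    for (Object* o : heap()) if (!o->marked) dead.push_back(o);
    for (Object* o : dead) { heap().erase(o); delete o; }
    return dead.size();
}

struct Node : Object {            // hypothetical user type
    ref<Node> next;
    void visit_refs(const Visitor& visit) override { visit(next); }
};

inline std::size_t demo_cycle() {
    {
        ref<Node> a = newref<Node>();
        ref<Node> b = newref<Node>();
        a->next = b;              // a -> b
        b->next = a;              // b -> a: a cycle plain counting would leak
    }
    return collect();             // no stack roots remain, so both are freed
}
```

In this sketch a cycle between two heap objects does not keep either alive, because their member pointers hold no counts; only the stack variables `a` and `b` act as roots.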
Using this approach, I have a neat concurrent, nonblocking garbage collector up and running on top of standard C++17. Without static reflection, this requires manually implementing reference-count-releasing functions for essential types.
With C++2a, reflection could make it completely automatic. So, instead of using raw pointers and new, you use a smart pointer and newref, and get free, safe GC within standard C++. For this to work well, we'd need to reflect lambda captures. If we can't reflect lambda captures, then we are nearly certain that they'll be held forever due to reference-counting cycles, because a lambda's purpose is often to manipulate an object it's stored in. Thus the lambda pins the object, and the object contains the lambda, so it's never released. Eager functional languages like ML require garbage collection solely because of these cyclic references between containers and lambdas within them.
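The lambda cycle is easy to reproduce with today's standard library. The following editorial sketch uses `std::shared_ptr` as the reference-counted smart pointer; `Button` and `leaks_via_lambda_cycle` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <memory>

struct Button {
    std::function<void()> on_click;  // the object stores the lambda
    int clicks = 0;
};

// Returns true if the Button is still alive after its last external owner
// has gone out of scope.
inline bool leaks_via_lambda_cycle() {
    std::weak_ptr<Button> observer;
    {
        auto button = std::make_shared<Button>();
        observer = button;
        // The lambda captures a counted pointer to the object that stores it:
        // the lambda pins the object, and the object contains the lambda.
        button->on_click = [button] { ++button->clicks; };
    }
    return !observer.expired();
}
```

Because the capture is opaque to any library code, nothing outside the lambda can discover or release the captured pointer, which is why reflecting on captures matters for a library-level collector.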
Aside: Garbage Collection in Future C++
I believe C++ will fundamentally require concurrent GC in order to scale to many-threaded programs with complex data dependencies and asynchronous execution. Objects, lambdas, and futures interact in so many subtle ways that manual memory management seems intractable. (Unreal Engine has relied on a hand-coded C++ garbage collector since 1998.)
I feel the N2670 garbage collection track is neither tenable nor desirable. C++ isn't about heavyweight runtime plumbing; it's about giving the programmer control, and relying on standard and user libraries to solve common problems. Give us a thorough version of P0194 and we'll have garbage collection soon enough. And it won't be a conservative kluge that stops all threads and scans all stacks and memory looking for pointer-like things; it will be a standard C++ implementation which users can opt in to, while remaining safe and composable with all other libraries, whether they use GC or not.