Slow record access

75 views
Skip to first unread message

Neal Alexander

unread,
Oct 28, 2021, 5:45:56 AM10/28/21
to chez-scheme
I've used define-record to create a 3-tuple to use as a drop-in replacement for (cons X (cons Y Z)), but it is surprisingly a lot slower despite the extra pointer indirection involved with cons.

In fact, just using a two element record in place of cons is over twice as slow for no apparent reason.

What is the best way to create a three element immutable record?

Andy Keep

unread,
Oct 28, 2021, 10:53:51 PM10/28/21
to chez-scheme
Are you using generative or nongenerative records?  Is the record sealed? Generally, nongenerative and sealed records will be slightly faster, because it ensures the most efficient code for access and type checking will be generated.  Cross-library optimization should allow inlining of predicates, accessors, and constructors, but it is worth noting that built-in types like pair and vector always get their accessors inlined, since they are built-in primitives.  If efficiency is of concern, you might try optimize-level 3 and/or using vectors in place of records or nested pairs.

Neal Alexander

unread,
Oct 29, 2021, 4:20:59 AM10/29/21
to chez-scheme
Vectors and sealed non-generative records are still still slower than cons even at both the default and third optimize-level.

** Summary
(cons X (cons Y Z)) : 0.562500000s
vector: 0.671875000s 
sealed-nongen-record: 0.750000000s
record:  0.765625000s

The CPU is an Intel Core i9 - 9880H
Chez 9.5.3 with threaded runtime / windows

The benchmark code branches through a binary tree indexed by the bits of a fixnum key and node labels (critbit patricia)
(tree has 30,000 random elements)

----
** cons, cons?, cadr, cddr, car
optimize-level 3:
    no collections
    0.562500000s elapsed cpu time
    0.582575600s elapsed real time
    0 bytes allocated
----

** make-vector 3, vector-ref
default optimization level:
    no collections
    0.671875000s elapsed cpu time
    0.678558100s elapsed real time
    0 bytes allocated
optimize-level 3:
    no collections
    0.671875000s elapsed cpu time
    0.652121100s elapsed real time
    0 bytes allocated
----

** (define-record tree (label left right))
default optimization level:
no collections
    0.812500000s elapsed cpu time
    0.821295400s elapsed real time
    0 bytes allocated
optimize-level 3:
    no collections
    0.765625000s elapsed cpu time
    0.773954600s elapsed real time
    0 bytes allocated
---

**: (define-record-type tree (fields label left right) (nongenerative) (sealed #t))
optimize-level 3:
    no collections
    0.750000000s elapsed cpu time
    0.753506700s elapsed real time
    0 bytes allocated

Dorian Corompt

unread,
Oct 29, 2021, 7:30:27 AM10/29/21
to Neal Alexander, chez-scheme
Hello, it seems obvious than a record is more expensive than two cons cells.
Cons cells are the main building blocks of all LISP and need to be highly optimized in terms of allocation time,  GC collection and memory consumption.
Some potential optimizations based on preallocated vectors (one for cars and one for cdrs) are shown in SICP (which is a reference). For a record, you need to allocate on the heap and need n+1 slot (type info) for a n fields structure. No benchmarking needed.



--
You received this message because you are subscribed to the Google Groups "chez-scheme" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chez-scheme...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chez-scheme/86186c7a-4fd8-4f44-9693-fe91868c07een%40googlegroups.com.

Neal Alexander

unread,
Oct 29, 2021, 2:02:09 PM10/29/21
to chez-scheme
It is great that cons cells are allocated quickly, but the benchmark performs no allocations or collections.

Even if the system uses pointer tagging or other tricks to avoid having to do a memory load for RTTI of cons cells, it should only save one memory read in this example, which still isn't less than a vector or structure should need to access a field.

Does the multi threaded runtime have to do locking on records and vectors or something?
Reply all
Reply to author
Forward
0 new messages