I think if you truly want to measure unbounded memory growth, you need to watch the total memory of the process, under load, over time. It will grow to a certain degree, then should stay constant. In this case, there is caching going on that will make it seem like memory is growing in the short term, which also explains why the Dialect and some expression-related constructs are held onto for a while. The Dialect is a stateful object that accumulates information about a particular database URL, and is also used to cache information about various dialect-specific constructs like SQL statements and types. The "many engine" use case might someday be helped by gaining the ability to share the same Dialect among many engines that connect to identically-behaving databases, since it's really the Dialect that has the larger memory footprint.
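Just to illustrate what I mean by watching the process, a quick sketch of that kind of measurement might look like the following. This is not the attached script; the in-memory SQLite URL and the trivial statement are stand-ins for your real workload, and resource.getrusage() is a Unix-only way of reading the process's peak resident set size:

import resource
from sqlalchemy import create_engine, text

def rss():
    # peak resident set size of this process (KB on Linux, bytes on OS X)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

engine = create_engine("sqlite://")   # stand-in; point this at your real database

samples = []
for i in range(5000):
    with engine.connect() as conn:
        conn.execute(text("select 1")).fetchall()
    if i % 500 == 0:
        samples.append(rss())

print("rss samples:", samples)

The thing to look for is that the samples level off once the various caches have warmed up, rather than climbing without bound.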
The attached test case illustrates the size management of another cache, mapper._compiled_cache, which keys compiled forms of SQL statements to Dialects. This cache is an LRU cache that manages its size at around 150 items. If you run the script, it prints these sizes; you'll see the count grow to about 150 and then get chopped down by 50. The test also prints the length of gc.get_objects(), which grows over the first few dozen iterations but then levels out into a stepwise pattern:
sample gc sizes: [16939, 16997, 17045, 17093, 17141, 17189, 17237, 17285, 17333, 17381, 17429, 17477, 17525, 17573, 17621, 17669, 17717, 17765, 17813, 17861, 17909, 17957, 18005, 18053, 18101, 18149, 18197, 18245, 18293, 18341, 18389, 18437, 18485, 18533, 18581, 18629, 18677, 18725, 18773, 18821, 18869, 18917, 18965, 19013, 19061, 19109, 19157, 19205, 19253, 19301, 19349, 19397, 19445, 19493, 19541, 19589, 19637, 19685, 19733, 19781, 19829, 19877, 19925, 19973, 20021, 20069, 20117, 20165, 20213, 20261, 20309, 20357, 20405, 20453, 20501, 20549, 20597, 20645, 20693, 20741, 20789, 20837, 20885, 20933, 20981, 21029, 21077, 21125, 21173, 21221, 21269, 21317, 21365, 21413, 21461, 21509, 21557, 21605, 21653, 21701, 21749, 21797, 21845, 21893, 21941, 21989, 22037, 22085, 22133, 22181, 22229, 22277, 22325, 22373, 22421, 22469, 22517, 22565, 22613, 22661, 22709, 22757, 22805, 22853, 22901, 22949, 22997, 23045, 23093, 23141, 23189, 23237, 23285, 23333, 23381, 23429, 23477, 23525, 23573, 23621, 23669, 23717, 23765, 23813, 23861, 23909, 23957, 24005, 24053, 21683, 21731, 21779, 21827, 21875, 21923, 21971, 22019, 22067, 22115, 22163, 22211, 22259, 22307, 22355, 22403, 22451, 22499, 22547, 22595, 22643, 22691, 22739, 22787, 22835, 22883, 22931, 22979, 23027, 23075, 23123, 23171, 23219, 23267, 23315, 23363, 23411, 23459, 23507, 23555, 23603, 23651, 23699, 23747, 23795, 23843, 23891, 23939, 23987, 24035, 24083, 21683, 21731, 21779, 21827, 21875, 21923, 21971, 22019, 22067, 22115, 22163, 22211, 22259, 22307, 22355, 22403, 22451, 22499, 22547, 22595, 22643, 22691, 22739, 22787, 22835, 22883, 22931, 22979, 23027, 23075, 23123, 23171, 23219, 23267, 23315, 23363, 23411, 23459, 23507, 23555, 23603, 23651, 23699, 23747, 23795, 23843, 23891, 23939, 23987, 24035]
The number of objects climbs to around 24000, then bounces back down to around 21700, twice in this sample, and continues to do so as the LRU cache grows and then reduces its size (the numbers are slightly different in 0.7, but it's the same idea). So this is not a leak.
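If it helps to see the mechanism in isolation, here's a toy stand-in for that kind of cache. This is not SQLAlchemy's actual implementation, just a dict that's allowed to grow to a capacity and is then pruned in a chunk, which is what produces the sawtooth in the object counts above:

from collections import OrderedDict

class ToyLRU(OrderedDict):
    """Simplified stand-in, not SQLAlchemy's LRU cache: grow to `capacity`,
    then drop the least recently used entries in one chunk."""

    def __init__(self, capacity=150, prune_by=50):
        super().__init__()
        self.capacity = capacity
        self.prune_by = prune_by

    def __getitem__(self, key):
        value = super().__getitem__(key)
        self.move_to_end(key)          # mark as most recently used
        return value

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.move_to_end(key)
        if len(self) > self.capacity:
            # prune a chunk at once rather than one entry per insert
            for old_key in list(self)[: self.prune_by]:
                del self[old_key]

cache = ToyLRU()
sizes = []
for i in range(500):
    cache[i] = "compiled statement %d" % i
    sizes.append(len(cache))

print(max(sizes), min(sizes[200:]))   # oscillates roughly between 100 and 150

Pruning a chunk of entries at once is cheaper than evicting on every insert, and it's also why the total object count looks like a sawtooth rather than a flat line.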
If you have other cases to test, you can send them back here by modifying the attached script to illustrate your scenario.
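For reference, the measurement loop in a script like that boils down to something along these lines (again, not the attached script itself; the SQLite engine and the varying "select N" statement are placeholders for whatever your scenario does):

import gc
from sqlalchemy import create_engine, text

engine = create_engine("sqlite://")   # substitute your engine / ORM setup here

gc_sizes = []
for i in range(250):
    with engine.connect() as conn:
        # vary the SQL a bit so any statement-level caches see more than one entry
        conn.execute(text("select %d" % i)).fetchall()
    gc.collect()
    gc_sizes.append(len(gc.get_objects()))

print("sample gc sizes:", gc_sizes)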