I've probably gone through this before on this list, but worth doing again (hey, I'm unemployed and got time on my hands).
Animal Logic's most recent incarnation of their pipeline did as you say: we basically bypassed references entirely. That's one thing. The other big thing is huge levels of automation in the rest of the pipe: lighting scenes and assets are built on the farm, so lighters get the final result quickly rather than waiting for Maya to load 4000 houses or 50,000 trees every time they build a shot.
Re bypassing references, a lot of it came down to identifying the things artists need to do. If a task was likely to hit Maya's slow processing of .ma's, or creating connections, or render layers, or anything else we just knew would suck, it'd be funnelled out into a separate process, usually with a separate file format. Some examples:
-animation is baked into a geo cache format (as most places do), but the file format can also handle basic material assignment, LOD, obj replacement, curves, uv assignment (via another external uvxml file), etc etc... basically it was Alembic well before Alembic became available.
-material connections themselves are stored separately; connections are recorded and exported, then remade in the lighting scene
-cameras are stored in their own independent format
-lights are extracted and stored separately
-lighting layers/render passes are stored in a separate text file, and edited as text (lighters hate this at first, then love it by the end)
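To make the "record, export, remake" idea for material connections a bit more concrete, here's a rough Python sketch. It is not Animal Logic's actual code or file format: the JSON layout and the function names are made up for illustration, and the `connect` and `exists` callbacks are stand-ins for whatever the host app provides (in Maya that would presumably be something along the lines of `cmds.connectAttr` and `cmds.objExists`).

```python
import json


def export_connections(connections):
    """Serialise (source, destination) attribute pairs to a JSON string.

    The JSON layout here is hypothetical, chosen only for illustration.
    """
    return json.dumps([{"src": s, "dst": d} for s, d in connections], indent=2)


def replay_connections(text, connect, exists=lambda node: True):
    """Remake recorded connections in a freshly built scene.

    `connect(src, dst)` and `exists(node)` are supplied by the caller, so this
    sketch runs without any DCC app. Returns the connections that were skipped
    because one end is missing from the scene.
    """
    skipped = []
    for entry in json.loads(text):
        src, dst = entry["src"], entry["dst"]
        # The node name is everything before the first "." of "node.attr".
        if exists(src.split(".", 1)[0]) and exists(dst.split(".", 1)[0]):
            connect(src, dst)
        else:
            skipped.append((src, dst))
    return skipped
```

Because the recording is just data, a farm job can replay it against whatever geometry it has just built, and report the skipped entries instead of dying halfway through.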
All this modularity means that scenes can be built and rendered on the farm from a known set of good versions, and lighters often don't have to load an entire scene to get it ready for rendering. We bypass a lot of the problems we used to have with references.
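Building "from a known set of good versions" could be sketched like this. Everything here is an assumption made for illustration (the `Component` fields, the `published` lookup, the step strings); the point is only that a shot is described as pinned component versions, and the build fails up front if any of them hasn't been published.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    kind: str      # e.g. "camera", "sets", "chars", "fx", "lights"
    name: str
    version: int   # the pinned, known-good version


def resolve(manifest, published):
    """Turn a manifest of pinned components into an ordered list of build steps.

    `published` maps (kind, name) to the set of versions that exist on disk.
    Raises LookupError if any pinned version is missing, so a farm job fails
    before doing hours of work rather than partway through.
    """
    steps, missing = [], []
    for comp in manifest:
        if comp.version in published.get((comp.kind, comp.name), ()):
            steps.append(f"build {comp.kind}/{comp.name} v{comp.version:03d}")
        else:
            missing.append(comp)
    if missing:
        names = ", ".join(f"{c.kind}/{c.name} v{c.version:03d}" for c in missing)
        raise LookupError(f"unpublished components: {names}")
    return steps
```

A real version would hand each step to the farm as its own job; here the steps are just strings so the sketch stays self-contained.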
That said, we'd generate a pre-built Maya scene for each major component of a shot (camera, sets, chars, fx, lights) and reference them all in. Pre-building each of those components might take anywhere from 5 minutes to 10 hours on the farm (some of the bigger Gatsby Manhattan sets involved thousands of objects), but loading the resulting pre-built scene would take under 2 minutes. Oh, we also worked out which prebuilds we might need to debug, and so keep as .ma's, and which we were never likely to open, and so could move to .mb's. The answer was that almost no-one hand-dissects .ma's anymore, and moving everything to .mb gave us huge speed gains.
Those kinds of assets would hit exactly the same problems you describe: if we referenced, we'd hit slowdowns, hangs, all sorts of annoying stuff. Until we realised that because we were recording all the changes we made to a shot in a modular fashion, there was no need to reference. We'd just import, update the lightrigs/text descriptions of the render scene, save those out separately, then close the scene without rendering. We'd treat the Maya session as a temp working scene; once the changes were recorded, we could throw the scene away. For rendering, we'd just tell the farm to go rebuild the scene from its components. Lighters wouldn't have to be there to see it, it was all offline, and it made us much more efficient.
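For a feel of what "text descriptions of the render scene" might look like, here's a small sketch of a render-pass file that can be read, edited and written back without opening Maya at all. The format is invented for this example (a `pass` line, followed by `include` and `override` lines); the real one isn't described in this post.

```python
def parse_passes(text):
    """Parse a (hypothetical) plain-text render pass description into a dict."""
    passes = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # allow trailing "# comments"
        if not line:
            continue
        keyword, _, rest = line.partition(" ")
        if keyword == "pass":
            current = passes.setdefault(rest.strip(), {"include": [], "override": {}})
        elif current is None:
            raise ValueError(f"line outside any pass: {raw!r}")
        elif keyword == "include":
            current["include"].extend(rest.split())
        elif keyword == "override":
            key, sep, value = rest.partition("=")
            if not sep:
                raise ValueError(f"override needs key=value: {raw!r}")
            current["override"][key.strip()] = value.strip()
        else:
            raise ValueError(f"unknown keyword: {keyword!r}")
    return passes


def dump_passes(passes):
    """Write the dict back out in the same text format."""
    lines = []
    for name, body in passes.items():
        lines.append(f"pass {name}")
        if body["include"]:
            lines.append("  include " + " ".join(body["include"]))
        for key, value in body["override"].items():
            lines.append(f"  override {key}={value}")
    return "\n".join(lines) + "\n"
```

Since the file round-trips cleanly, it is the text that acts as the record of what the lighter changed; the Maya scene it was edited alongside can be thrown away and rebuilt later.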
I see there are steps in the latest builds of Maya to make referencing and whatnot easier, but honestly, it doesn't seem worth the hassle.