There's more than one way to load data into your graph, so I needed to know exactly what you were doing. Anyway, you're using graph.io(). With that approach you're using a GryoReader underneath, which is a simple single-threaded loader that keeps a vertex cache. For TinkerGraph that cache is a bit of a waste because the graph is already holding its vertices in memory. The cache doesn't release or evict vertices at any point, so you have to throw a lot of -Xmx at the thing to make it work, and when you also consider the points Robert Dale mentioned, you can see why things might seize up.
For larger datasets and TinkerGraph, I'd prefer a custom loader (i.e. just a Gremlin script to run in the Gremlin Console). Unfortunately, it's not really safe to do parallel writes to TinkerGraph, as it isn't proven completely thread-safe for that (though I think parallel reads are OK).
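To give a rough idea of what I mean by a custom loader, here is a minimal sketch of a Gremlin Console script. The file names (vertices.csv, edges.csv), the CSV column layout, and the 'name' property key are all made up for illustration, so adapt them to whatever your data actually looks like. It only uses the plain structure API (addVertex, vertices, addEdge) and stays single-threaded.

```groovy
// Sketch of a single-threaded custom loader for the Gremlin Console.
// ASSUMPTIONS (hypothetical, not from the question): the file names, the CSV
// layout, and the 'name' property key are placeholders for your own data.
import org.apache.tinkerpop.gremlin.structure.T
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

graph = TinkerGraph.open()

// vertices.csv (hypothetical layout): id,label,name
new File('vertices.csv').eachLine { line ->
    def f = line.split(',')
    graph.addVertex(T.id, Long.parseLong(f[0]), T.label, f[1], 'name', f[2])
}

// edges.csv (hypothetical layout): outVertexId,inVertexId,edgeLabel
new File('edges.csv').eachLine { line ->
    def f = line.split(',')
    // Look the vertices up in the graph itself rather than in a separate cache.
    def outV = graph.vertices(Long.parseLong(f[0])).next()
    def inV = graph.vertices(Long.parseLong(f[1])).next()
    outV.addEdge(f[2], inV)
}
```

Since TinkerGraph already holds every vertex in memory, the edge pass resolves vertices by id straight from the graph, so there is no second copy of them sitting in a loader cache. Note that the ids are parsed as Long in both passes, so the lookups use the same id type the vertices were created with.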
I'm curious to see what happens when this merges:
as it opens up io() as a first-class citizen of the Gremlin language, and perhaps we'll see graph providers put legit bulk loaders behind that step, or at least make use of Hadoop Input/OutputFormat with CloneVertexProgram. For TinkerGraph, I'm not sure what we'll do. Maybe there could be a more TinkerGraph-specific GryoReader that dropped the caching, transaction checking, etc. That might be good.