Considerably more sleuthing has brought us to the following:
When requesting 4 GPUs from slurm, all is well.
When requesting 1 GPU from slurm, and then attempting to provision 1 GPU from DyNet, I get the following:
$ python -c "import dynet" --dynet-gpu 1
[dynet] initializing CUDA
CUDA failure in cudaGetDeviceCount(&nDevices)
unknown error
terminate called after throwing an instance of 'dynet::cuda_exception'
what(): cudaGetDeviceCount(&nDevices)
Aborted
This seems to match the "slurm/DyNet GPU numbering problem" hypothesis, but I'm not quite sure how. I've tried installing DyNet during a 4-GPU-provisioned slurm interactive run as well as during a 1-GPU-provisioned run, and the effect is the same: DyNet only works for me when I request all 4 GPUs on a machine, regardless of how many I tell DyNet to use.
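In case it helps anyone chasing the same thing, here is a minimal check I would run inside the Slurm allocation before importing DyNet. It is only a sketch: it assumes the cluster's GRES setup exports CUDA_VISIBLE_DEVICES per job and that nvidia-smi is on the path (the script name check_gpus.py is just my own label).

# check_gpus.py -- run inside the Slurm allocation, before importing dynet
import os
import subprocess

# Which devices has Slurm actually exposed to this job?
# (Assumption: the cluster's GRES plugin sets CUDA_VISIBLE_DEVICES per allocation.)
print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES", "<unset>"))

# What does the NVIDIA driver itself report inside this allocation? If this
# already fails or lists no devices, the problem is at the driver/cgroup layer
# rather than in DyNet's device numbering.
result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
print(result.stdout or result.stderr)

If the 1-GPU and 4-GPU allocations print different things here, that would at least narrow down whether Slurm's device masking or DyNet's numbering is the culprit.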
Posting this partly for posterity and in case I find a fix, but any help would also be appreciated.
-John