Hi,
The current implementation of the partial trace in QuTiP, though
generic, has a heavy memory footprint once the system becomes
moderately large, even for pure states (tracing out 7 of 13 spins
requires about 2 GB of RAM). For pure states I have implemented an
algorithm that requires much less memory and is much faster for large
systems (about 10 times faster when tracing out 6 of 12 spins). I do
not have sufficient
understanding of the code structure of QuTiP to implement the
algorithm in the library directly. However, if someone is interested
and has the required expertise, they are welcome to copy the code into
QuTiP. A self-contained program that compares this implementation with
QuTiP's can be found here:
https://www.dropbox.com/s/bzhc629e28ffun8/ptrace.py?dl=0
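
For readers who have not opened the link, here is a minimal NumPy
sketch of one standard way to get this kind of saving for a pure
state. It is only an illustration under stated assumptions and is not
the linked program: the function name ptrace_pure and its arguments
are hypothetical, and the state is assumed to be a plain 1-D array
rather than a QuTiP object. The idea is to reshape the state vector
into a matrix M whose rows index the kept spins and whose columns
index the traced spins, so that the reduced density matrix is M times
its conjugate transpose and the full density matrix is never built.

```python
import numpy as np


def ptrace_pure(psi, dims, keep):
    """Reduced density matrix of a pure state without forming |psi><psi|.

    Hypothetical helper for illustration only.
    psi  : 1-D complex array of length prod(dims)
    dims : list of subsystem dimensions, e.g. [2] * 13 for 13 spins
    keep : indices of the subsystems to keep (all others are traced out)
    """
    dims = list(dims)
    keep = sorted(keep)
    traced = [i for i in range(len(dims)) if i not in keep]
    d_keep = int(np.prod([dims[i] for i in keep]))
    d_traced = int(np.prod([dims[i] for i in traced]))

    # One tensor axis per subsystem, then move the kept axes to the front.
    tensor = np.asarray(psi).reshape(dims)
    m = tensor.transpose(keep + traced).reshape(d_keep, d_traced)

    # rho_keep[a, b] = sum_t m[a, t] * conj(m[b, t])
    return m @ m.conj().T


# Usage: trace out the second spin of a two-spin Bell state.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_first = ptrace_pure(bell, [2, 2], keep=[0])
```

For 13 spins with 6 kept, M is only 64 x 128, whereas a dense
8192 x 8192 density matrix in complex128 takes about 1 GiB on its own.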
I hope it is useful to someone else too.
Cheers,
Rajeev