Also, I want to confirm the following understanding: kets are themselves used
to describe ensembles of systems, so an ensemble of kets is in some sense an
ensemble of ensembles - but a ket, by definition, cannot be used to describe
anything but a pure state.
--
"The power of this battlestation is _insignificant_ when compared with
the power of the Farce." - D. Vader.
>I am curious as to how one traditionally describes systems that are
>in 'mixed states'. The notational problem is that given any
>two kets, |u> + |v> is interpreted as a superposition. What one wants
>to describe is an ensemble of kets, I believe. However, it's not clear
>how this is written --- {|u>, |v>}?
I believe mixed states are best represented by a "density matrix" or
"density operator" instead of ket vectors. The density operator for
a mixture of states |u> and |v> would be
D = a |u><u| + b |v><v|.
Here, |u><u| is an "outer product." The real numbers a and b are the
probability weights of the two states |u> and |v>, and ensure that
trace(D) = 1; in other words, that the total probability of the mixture
over any basis is 1.
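This recipe is easy to check numerically. Here is a minimal numpy sketch;
the particular kets and weights are made up for illustration:

```python
import numpy as np

# Hypothetical example: a mixture of two spin-1/2 kets |u> and |v>
# with probability weights a and b (a + b = 1).
u = np.array([1, 0], dtype=complex)               # |u> = spin up along z
v = np.array([1, 1], dtype=complex) / np.sqrt(2)  # |v> = spin up along x

a, b = 0.3, 0.7

# Outer products |u><u| and |v><v| built with np.outer(ket, bra)
D = a * np.outer(u, u.conj()) + b * np.outer(v, v.conj())

# The weights ensure trace(D) = 1: total probability over any basis is 1.
assert np.isclose(np.trace(D).real, 1.0)
```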
>Also, I want to confirm the following understanding: kets are themselves used
>to describe ensembles of systems, so an ensemble of kets is in some sense an
>ensemble of ensembles - but a ket, by definition, cannot be used to describe
>anything but a pure state.
>--
Under some interpretations of quantum mechanics, the state vector must describe
an ensemble, but others identify it realistically with a single particle.
Interestingly, Hawking showed that particles can go from pure to mixed
states, even on a fundamental level! So the idea that a mixture is a
macroscopic ensemble may give way--perhaps on a fundamental level, particles
should be described by density operators which do not have to represent
pure states.
Bram Boroson
bor...@spot.colorado.edu
>>I am curious as to how one traditionally describes systems that are
>>in 'mixed states'. The notational problem is that given any
>>two kets, |u> + |v> is interpreted as a superposition. What one wants
>>to describe is an ensemble of kets, I believe. However, it's not clear
>>how this is written --- {|u>, |v>}?
>I believe mixed states are best represented by a "density matrix" or
>"density operator" instead of ket vectors.
Yes. (Nice intro deleted.) Let me again remind everyone that I am on a
crusade to teach the universe the difference between superpositions and
mixtures --- the latter being what Vidugiris is seeking to understand.
See the end of this article for my spiel.
>Interestingly, Hawking showed that particles can go from pure to mixed
>states, even on a fundamental level! So the idea that a mixture is a
>macroscopic ensemble may give way--perhaps on a fundamental level, particles
>should be described by density operators which do not have to represent
>pure states.
Let's note that Hawking's work, while brilliant, is highly controversial
and certainly without experimental verification: the idea is that
virtual black holes are constantly popping into existence, munching
(possibly virtual) particles, and disappearing in a burst of thermal
Hawking radiation, thus increasing the entropy of the world. This
builds upon Hawking's earlier and already controversial approach to the
black hole information loss problem. (See the latest "This Week's
Finds".) So we can't really say he "showed" this, only that he
suggested it might be the case.
Okay, my rap, from "week27":
In quantum theory, associated to any physical system there are
states and observables. An observable is a real-valued quantity
we might conceivably measure about the system. A state
represents what we might conceivably know about the system. The
previous sentence is quite vague; all it really means is this:
given a state and an observable there is a mathematical recipe
that lets us calculate a probability distribution on the real
number line, which represents the probability of measuring the
observable to have a value lying in any subset of the real line.
We call this the probability distribution of the observable in
the state. Using this we can, if we want, calculate the mean of this
probability distribution (let us assume it exists!), which we
call the expectation value of the observable in the state.
Given two states Psi and Phi, and a number c between 0 and 1
there is a recipe for getting a new state, called c Psi +
(1-c)Phi. This can be described roughly in words as follows:
"with probability c, the system is in state Psi; with probability
1-c it is in state Phi." This is called a *mixture* of the
states Psi and Phi. If a state is a mixture of two different
states, with c not equal to 0 or 1, we call that state a *mixed*
state. If a state is not mixed it is *pure*. Roughly speaking,
a pure state is a state with as little randomness as possible.
(More precisely, it has as little entropy as possible.)
All the remarks so far apply to classical mechanics as well as
quantum mechanics. A simple example from classical mechanics is
a 6-sided die. If we ignore everything about the die except
which side is up, we can say there are six pure states: the state
in which the side of the die showing one dot is up, the state in
which the side showing two dots is up, etc.. Call these states
1,2,3,4,5, and 6. If it's a fair die, and we roll it and don't
look at it, the best state we can use to describe what we know
about the die is a mixed state which is a mixture: 1/6 of state 1
plus 1/6 of state 2, etc.. Note that if you peek at the die and
see that side 4 is actually up, you will be inclined to use a
different state to describe your knowledge: a pure state, state
4. Your honest friend, who didn't peek, will still want to use
a mixed state. There is no contradiction here; the state simply
is a way of keeping track of what you know about the system, or
more precisely, a device for calculating expectation values of
observables; which state you use reflects your knowledge, and
some people may know more than others.
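The die example can be written out in one or two lines of numpy, which makes
the "mixture = probability weights on pure states" idea concrete:

```python
import numpy as np

# The fair-die mixture above: the mixed state is just a probability
# vector assigning weight 1/6 to each of the six pure states.
p = np.full(6, 1 / 6)            # mixed state: 1/6 of state 1, 1/6 of state 2, ...
dots = np.arange(1, 7)           # the observable: number of dots facing up

# Expectation value of the observable in the mixed state.
assert np.isclose(p @ dots, 3.5)
```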
Things get trickier in quantum mechanics. They also get trickier
when the system being described includes the person doing the
describing. They get even trickier when the system being
described is the whole universe -- for example, some people rebel
at the thought that the universe has "many different states" --
after all, it is how it is, isn't it? (Gell-Mann gave a talk at
this conference, which unfortunately does not appear in this
volume, entitled "Quantum mechanics of this specific universe."
I have a hunch it deals with this issue, which falls under the
heading of "quantum cosmology.")
The first way things get trickier in quantum mechanics is that
something we are used to in classical mechanics fails. In
classical mechanics, pure states are always dispersion-free --
that is, for *every* observable, the probability measure assigned
by the state to that observable is a Dirac delta measure, that
is, the observable has a 100% chance of being some specific value
and a 0% chance of having any other value. (Consider the example
of the dice, with the observable being the number of dots on the
face pointing up.) In quantum mechanics, pure states need NOT
be dispersion-free. In fact, they usually aren't.
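A quick sanity check of that claim, using a standard spin-1/2 example (not
from the original article): the pure state "spin up along z" has nonzero
dispersion for the spin-x observable.

```python
import numpy as np

# Pure state |up_z> and the observable sigma_x (spin along x, units of hbar/2).
psi = np.array([1, 0], dtype=complex)
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)

mean = np.vdot(psi, sigma_x @ psi).real               # <sigma_x> = 0
mean_sq = np.vdot(psi, sigma_x @ sigma_x @ psi).real  # <sigma_x^2> = 1
variance = mean_sq - mean**2

# Nonzero dispersion: the probability measure is NOT a Dirac delta,
# even though the state is pure.
assert np.isclose(variance, 1.0)
```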
A second, subtler way things get trickier in quantum mechanics
concerns systems made of parts, or subsystems. Every observable
of a subsystem is automatically an observable for the whole
system (but not all observables of the whole system are of that
form; some involve, say, adding observables of two different
subsystems). So every state of the whole system gives rise to,
or as we say, "restricts to," a state of each of its subsystems.
In classical mechanics, pure states restrict to pure states. For
example, if our system consisted of 2 dice, a pure state of the
whole system would be something like "the first die is in state 2
and the second one is in state 5;" this restricts to a pure state
for the first die (state 2) and a pure state for the second die
(state 5). In quantum mechanics, it is *not* true that a pure
state of a system must restrict to a pure state of each
subsystem.
It is this latter fact that gave rise to a whole bunch of quantum
puzzles such as the Einstein-Podolsky-Rosen puzzle and Bell's
inequality. And it is this last fact that makes things a bit
tricky when one of the two subsystems happens to be *you*.
It is possible, and indeed very common, for the following thing
to happen when two subsystems interact as time passes. Say the
whole system starts out in a pure state which restricts to a pure
state of each subsystem. After a while, this need no longer be
the case! Namely, if we solve Schroedinger's equation to
calculate the state of the system a while later, it will
necessarily still be a pure state (pure states of the whole
system evolve to pure states), but it need no longer restrict to
pure states of the two subsystems. If this happens, we say that
the two subsystems have become "entangled."
In fact, this is the sort of thing that often happens when one of
the systems is a measuring apparatus and the other is something
measured. Studying this issue, by the way, does *not* require a
general definition of what counts as a "measuring apparatus" or a
"measurement" -- on the contrary, this is exactly what is not
needed, and is probably impossible to attain. What is needed is a
description in quantum theory of a *particular* kind of measuring
apparatus, possibly quite idealized, but hopefully reasonably
realistic, so that we can study what goes on using quantum
mechanics and see what it actually predicts will occur.
For example, taking a very idealized case for simplicity:
Our system consists of two subsystems, the "detector" and an
"electron." The systems starts out, let's suppose, in a pure
state which restricts to a pure state of each subsystem: the
detector is "ready to measure the electron's spin in the z
direction" and the electron is in a state with its spin pointing
along the x axis. After a bit of time passes, if we restrict the
state of the whole system to the first subsystem, the detector,
we get a mixed state like "with 50% probability it has measured
the spin to be up, and with 50% probability it has measured the
spin to be down." Meanwhile, if we restrict the state to the
second subsystem, the electron, it is in the mixed state "with 50%
chance it has spin up, and with 50% chance it has spin down." In
fact these two mixed states are *correlated* in an obvious sense.
Namely, the observable of the *whole* system that equals 1 if the
reading on the detector agrees with the spin of the electron, and
0 otherwise, will have expectation value 1 (if the detector is
accurate). The catchy term "entangled," which is a little silly,
really just refers to this correlation. I don't want to delve
into the math of correlations, but it is perhaps not surprising
that, in classical or quantum mechanics, interesting correlations
can only occur between subsystems if both of them are in mixed
states. What's sneaky about quantum mechanics is that the whole
system can be in a pure state which when restricted to each
subsystem gives a mixed state, and that these mixed states are
then correlated (necessarily, as it turns out). That's what
"entanglement" is all about.
It was through analyses like this, but more detailed, that
Everett realized what was going on in a quantum system composed
of two subsystems, one of which was a measuring apparatus (or
person, for that matter), the other of which was something
measured. The post-Everett work amounts to refining Everett's
analysis by looking at more realistic examples, and more varied
examples. In particular, it is interesting to study situations
where nothing very controlled like a scientific "measurement" is
going on. For example, one subsystem might be an atom in outer
space, and the other subsystem might be its environment (a bunch
of other atoms or radiation). If one started out in a state
which restricted to a pure state of each subsystem, how fast
would the subsystems become entangled? And exactly *how* would
they become entangled? -- this is very interesting. When we are
doing a scientific measurement, it's pretty clear what sort of
correlation is involved in the entanglement. In the above
example, say, the detector reading is becoming correlated to the
electron's spin about the z axis. If all we have is an atom
floating about in space, it's not so clear. Can we think of the
environment as doing something analogous to "measuring" something
about the atom, which establishes correlations of a particular
kind? This is the kind of thing Zurek and Unruh are studying.
In my description above I have tried to be very matter-of-fact,
but probably you all know that this subject is shrouded in
mystery, largely because of the misty and dramatic rhetoric
people like to use, which presumably makes it seem more profound.
At least "entangled" has a precise technical meaning. But anyone
studying this subject will soon run into "collapse of the
wavefunction," "branches," "the many-worlds interpretation," the
"observer," and so on. These things mean many things to many
people, and nothing in particular to many more, so one must
always be on the alert.
Now for a little math to ground the above discussion. To keep
life simple suppose we have a quantum system described by a
n-dimensional Hilbert space H which we'll just think of as C^n,
n-dimensional complex space. The main thing to get straight is
the difference between superpositions and mixtures of quantum
states. An observable in quantum theory is described by a
self-adjoint operator A, which for us is just an nxn self-adjoint
matrix. A state is something that assigns to each observable a
number called its expectation value, in a manner that is 1)
linear, 2) positive, and 3) normalized. To explain this let us
call our state Psi. Linearity means Psi(A + B) = Psi(A) + Psi(B)
and Psi(cA) = c Psi(A) for all observables A,B and real numbers
c. Positivity means Psi(A) >= 0 when A is a matrix that
has non-negative eigenvalues (a so-called non-negative matrix).
And the normalization condition is that Psi(1) = 1.
This may seem unfamiliar, and that is because elementary quantum
mechanics only considers states of the form
Psi(A) = <v, Av>
where v is a unit vector in H. Not all states are of this
form, but they are an extremely important special class of
states. It is also important to consider states that are
represented as ``density matrices,'' which are non-negative
matrices D with trace 1:
tr(D) = Sum_i D_{ii} = 1
Such a density matrix defines a state Psi by
Psi(A) = tr(AD).
It's worth checking that this really meets the definition of a
``state'' given above!
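Here is that check carried out numerically for a randomly generated density
matrix on C^3 (the matrices are arbitrary; only the three defining properties
matter):

```python
import numpy as np

rng = np.random.default_rng(0)

# A generic density matrix: M M^dagger is non-negative; normalize to trace 1.
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
D = M @ M.conj().T
D /= np.trace(D).real

def Psi(A):
    """The state defined by the density matrix: Psi(A) = tr(AD)."""
    return np.trace(A @ D)

# Two random self-adjoint "observables".
A = rng.standard_normal((3, 3)); A = A + A.T
B = rng.standard_normal((3, 3)); B = B + B.T

assert np.isclose(Psi(A + B), Psi(A) + Psi(B))   # 1) linearity
P = A @ A.T                                      # a non-negative matrix
assert Psi(P).real >= -1e-12                     # 2) positivity
assert np.isclose(Psi(np.eye(3)).real, 1.0)      # 3) normalization
```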
The states corresponding to unit vectors in H are in fact a
special case of the density matrices. Namely, if v is a unit
vector in H we can let D be the self-adjoint matrix corresponding
to projection onto v. I.e., the matrix D acts on any other
vector, say w, by
Dw = <v,w> v.
It's not too hard to check that the matrix D really is a density
matrix (do it!) and that this density matrix defines the same
state as does the vector v, that is,
tr(AD) = <v, Av>
for any observable A.
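That check, too, takes only a few lines of numpy (random vector and
observable chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# A unit vector v in C^3 and the projection D onto it: Dw = <v,w> v.
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v /= np.linalg.norm(v)
D = np.outer(v, v.conj())          # rank-one projection, a density matrix

A = rng.standard_normal((3, 3)); A = A + A.T   # a self-adjoint observable

# The density-matrix recipe tr(AD) reproduces the vector-state recipe <v, Av>.
assert np.isclose(np.trace(A @ D), np.vdot(v, A @ v))
assert np.isclose(np.trace(D).real, 1.0)
```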
The entropy of a state Psi corresponding to the density matrix D is
defined to be
S(Psi) = -tr(D lnD)
where one calculates D lnD by working in a basis where D is diagonal and
replacing each eigenvalue x of D by the number x lnx, which we
decree to be 0 if x = 0. Check that if D corresponds to
a *pure* state as above then D lnD = 0 so the entropy is zero.
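The eigenvalue recipe for the entropy translates directly into code. A small
sketch, comparing a pure state with the maximally mixed state on C^2:

```python
import numpy as np

def entropy(D):
    """Von Neumann entropy S = -tr(D ln D), computed from the eigenvalues
    of D, with the convention 0 ln 0 = 0."""
    eigs = np.linalg.eigvalsh(D)
    eigs = eigs[eigs > 1e-12]      # drop (numerically) zero eigenvalues
    return float(-np.sum(eigs * np.log(eigs)))

v = np.array([1, 0], dtype=complex)
pure = np.outer(v, v.conj())       # projection onto a unit vector
mixed = np.eye(2) / 2              # the maximally mixed state

assert np.isclose(entropy(pure), 0.0)          # pure states have zero entropy
assert np.isclose(entropy(mixed), np.log(2))   # mixed states do not
```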
Now about superpositions versus mixtures. They teach you how to take
superpositions in basic quantum mechanics. They usually don't tell you
about density matrices; all they teach you about is the states
that correspond to unit vectors in Hilbert space. Given two unit vectors in
H, one can take any linear combination of them and, if it's not zero,
normalize it to be a unit vector again, which we call a superposition.
Mixtures are an utterly different sort of linear combination. Given
two states Psi and Phi -- which recall are things that assign numbers
to observables in a linear way -- and given any number c between 0 and
1, we can form a new state by taking
c Psi + (1-c) Phi
This is called a mixture of Psi and Phi. Finally, some
nontrivial exercises:
Exercise: Recall that a pure state is defined to be a state which
is not a mixture of two different states with 0 < c < 1. Show
that the states corresponding to unit vectors in Hilbert space
are pure.
Exercise: Conversely, show (in the finite-dimensional case we are
considering) that all the pure states correspond to unit vectors
in Hilbert space.
Exercise: Show that every density matrix is a mixture of states
corresponding to unit vectors in Hilbert space.
Exercise: Show (in the finite-dimensional case we are
considering) that all states correspond to density matrices.
Show that such a state is pure if and only if its entropy is
zero.
)I believe mixed states are best represented by a "density matrix" or
)"density operator" instead of ket vectors. The density operator for
)a mixture of states |u> and |v> would be
)
) D = a |u><u| + b |v><v|.
)
)Here, |u><u| is an "outer product." The real numbers a and b are the
)probability weights of the two states |u> and |v>, and ensure that
)trace(D)=1--in other words, that the total probability of the mixture
)over any basis is 1.
Isn't |u><u| the projection operator? I.e. |u><u|(v) = |u><u|v>, a ket
proportional to u, with coefficient <u|v>. I thought an outer product
was something else. (??)
I guess my new #1 question is - what good are kets, and why does everyone seem
to use them? I was doing more reading, and d'Espagnat also seems to feel
that the density matrix can represent things that an ensemble of kets can't.
So, what good are they, and why not stick with the density matrix formulation
from day 1?
As a secondary question, I'm not quite sure I understand why density matrices
are more general than ensembles of kets. Is the problem one with infinite
dimensions? And is this representation issue known to be one of physical
significance - i.e. are there known physical systems that can't be modelled
with a state being an ensemble of kets {pure states}, or is this just a
mathematical curiosity, or is this matter still open?
Take spin, for example. [I'm using it as a specific example to work through
the formalisms]. <u|S|u> gives the average value of the 'observable' S.
u can be represented by a 2-element vector. In the density matrix form,
U is a 2x2 matrix instead, and one takes the trace of SU.
But it seems to me that all this really does is something that could be
done with an ensemble of two kets - i.e. whatever U happens to be in
the particular case
  ( u1  u2 )
  ( u3  u4 )
one breaks it up into two kets, taking the columns:
  |k1> = ( u1 )      |k2> = ( u2 )
         ( u3 )             ( u4 )
[I'm using column vectors to represent kets, I hope this is reasonably
standard].
and the result of the trace operation is just the sum of <k1|S|k1> + <k2|S|k2>.
This is what makes me ask if it is a dimensionality issue, btw - an infinite
sum could be a problem.
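The bookkeeping being groped for here can be stated exactly: for a mixture
D = sum_i p_i |k_i><k_i|, the trace recipe tr(SD) equals the
probability-weighted sum of the single-ket expectations. A sketch with made-up
kets and weights:

```python
import numpy as np

S = np.array([[1, 0], [0, -1]], dtype=complex)    # spin observable sigma_z

k1 = np.array([1, 0], dtype=complex)
k2 = np.array([1, 1], dtype=complex) / np.sqrt(2)
p1, p2 = 0.25, 0.75                               # probability weights

# The density matrix of the mixture.
D = p1 * np.outer(k1, k1.conj()) + p2 * np.outer(k2, k2.conj())

# tr(SD) = p1 <k1|S|k1> + p2 <k2|S|k2>
weighted_sum = p1 * np.vdot(k1, S @ k1) + p2 * np.vdot(k2, S @ k2)
assert np.isclose(np.trace(S @ D), weighted_sum)
```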
>Isn't |u><u| the projection operator? I.e. |u><u|(v) = |u><u|v>, a ket
>proportional to u, with a magnitude of <u|v>. Thought an outer product
>was something else. (??).
It is the projection operator. It is also an outer product. Note that
density matrices can also have things like |u><v| in them.
>I guess my new #1 question is - what good are kets, and why does everyone seem
>to use them? I was doing more reading, and d'Espagnat also seems to feel
>that the density matrix can represent things that an ensemble of kets can't.
>So, what good are they, and why not stick with the density matrix formulation
>from day 1?
It's the same reason that you don't use the formalism of statistical mechanics
when studying the mechanics of a single particle in intro classical
physics. When we can describe the state as completely as QM allows,
it can be described with a ket (at least in the sort of QM I was taught),
and kets are easier to use than density matrices, just as it's easier not
to use the formalism of velocity distributions when doing single-particle
mechanics.
>As a secondary question, I'm not quite sure I understand why density matrices
>are more general than ensembles of kets. Is the problem one with infinite
>dimensions? And is this representation issue known to be one of physical
>significance - i.e. are there known physical systems that can't be modelled
>with a state being an ensemble of kets {pure states}, or is this just a
>mathematical curiosity, or is this matter still open.
A nice thing about the density matrix is that it makes more explicit the
basis independence of the mixed state. It is a remarkable fact in QM that if
you have an ensemble of, say, two orthogonal spin states that have certain
classical probabilities, this ensemble has no observable difference from one
that you can construct out of two other orthogonal spin states that are
linear combinations of these. With the density matrix the transformation
between bases of states is fairly obvious, just from the fact that it's
really a basis-independent density operator.
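That remarkable fact can be verified in a couple of lines. With equal weights,
the mixture of the two z-spin states and the mixture of the two x-spin states
give the very same density matrix, so no measurement can tell them apart:

```python
import numpy as np

up_z = np.array([1, 0], dtype=complex)
dn_z = np.array([0, 1], dtype=complex)
up_x = (up_z + dn_z) / np.sqrt(2)      # x-spin states as combinations of z-spin
dn_x = (up_z - dn_z) / np.sqrt(2)

# Equal-weight mixtures in the two different bases.
D_z = 0.5 * np.outer(up_z, up_z.conj()) + 0.5 * np.outer(dn_z, dn_z.conj())
D_x = 0.5 * np.outer(up_x, up_x.conj()) + 0.5 * np.outer(dn_x, dn_x.conj())

# Identical density matrices: both are the maximally mixed state I/2.
assert np.allclose(D_z, D_x)
```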
--
Matt 01234567 <-- The original Indent-o-Meter
McIrvin ^ Someday, tab damage will light our homes!
The reader should note that the representation is not unique.
Keith Ramsay
)>Here's the crux of the question. If every density matrix is a mixture
)>of states corresponding to unit vectors in Hilbert space (kets), this
)>would seem to imply that a mixture (ensemble) of kets would be an equivalent
)>representation to a density matrix. I.e. just compute the expected value
)>of the observable O as:
)
)>The sum over all kets {p(ket) * <ket|O|ket>}, where p(ket) is the probability
)>of the ket occurring in the ensemble.
)
)>Or does this just happen to work out in this example because of the
)>restriction to finite dimensionality in this particular 'exercise'?
)
)Yes, density matrices are all mixtures of kets in the manner you
)describe, even if one is working with an infinite dimensional Hilbert
)space. In the infinite dimensional case, one of course needs an
)*infinite* sum, with convergence being a consequence of the condition
)\sum p(ket) = 1.
)
)This fact is a consequence of the fact that any density matrix can be
)diagonalized.
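Diagonalization does all the work here; a short numpy sketch for a randomly
generated 3x3 density matrix (finite-dimensional, to keep it concrete):

```python
import numpy as np

rng = np.random.default_rng(2)

# A generic density matrix on C^3.
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
D = M @ M.conj().T
D /= np.trace(D).real

# Diagonalize: eigenvalues are the probabilities p(ket), eigenvectors the kets.
p, kets = np.linalg.eigh(D)            # kets[:, j] is the j-th eigenvector

assert np.all(p > -1e-12)              # non-negative weights...
assert np.isclose(p.sum(), 1.0)        # ...summing to 1, as required

# Reassemble: D = sum_j p_j |Psi_j><Psi_j|, a mixture of kets.
recon = sum(p[j] * np.outer(kets[:, j], kets[:, j].conj()) for j in range(3))
assert np.allclose(recon, D)
```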
To me, that implies that the 'ensemble of kets' description and the density
matrix description of mixed states are equivalent. Equivalent as far as
observables/things that can be observed go, I should add (someone is bound to
mention this point sooner or later).
There did seem to me to be a marked preference towards the latter in both
some of the books I've been reading and your posts, though - much more marked
than I would expect from them being apparently (?) basically equivalent.
It was mentioned that there may be several different ways to describe the
same density matrix as different ensembles, even though the density matrix
has all information that is needed to compute the values of observables.
As an example, I will take the following quotes: (from D'Espagnat):
"An ensmeble E obtained by combining all the elments of several subensmebles
Ea is a mixture, although, as we shall see, this statement is not a
generic definition of the concept."
This seems to me to be much stronger than his later statement (several pages
later):
"Since such a mixture is, as shown above, equivalent to not _one_ but to
_several_ physical mixtures such as E, the old formulation, in terms of
state vectors, necessarily implies some arbitrariness in the choice of
the latter. The new formulation has no such defect."
I presume I was reading more into what he (and you) said than was
intended.
Actually, I had momentarily confused the outer product (which I don't use much)
with the tensor product (which I also don't use much). [I still don't have
a written definition of outer product handy, I'll have to take your word for
it]. I do appreciate the effort to stick with classical linear algebra
terminology, even if it didn't work out in this case :-(.
....
)A nice thing about the density matrix is that it makes more explicit the
)basis independence of the mixed state. It is a remarkable fact in QM that if
)you have an ensemble of, say, two orthogonal spin states that have certain
)classical probabilities, this ensemble has no observable difference from one
)that you can construct out of two other orthogonal spin states that are
)linear combinations of these. With the density matrix the transformation
)between bases of states is fairly obvious, just from the fact that it's
)really a basis-independent density operator.
The advantage of state vector basis independence (i.e. that there are several
different ket/state vector descriptions for the same density matrix, and that
the density matrix gives all information needed to compute observables) I
will acknowledge.
>To me, that implies that the 'ensemble of kets' description and the density
>matrix description of mixed states are equivalent. Equivalent as far as
>observables/things that can be observed go, I should add (someone is bound to
>mention this point sooner or later).
>There did seem to me to be a marked preference towards the latter in both
>some of the books I've been reading and your posts, though - much more marked
>than I would expect from them being apparently (?) basically equivalent.
Eh? When two things are equivalent, it often means that you can
forget about one and only use the other. That's what people have done
by using density matrices instead of mixtures of kets. (Mixtures is the
technical term for what you are calling ensembles here.) This sort of
thing often confuses people who think for themselves, like you, and
wonder why the heck nobody ever comes out and says some other approach
is equivalent to the usual one.
>john baez writes
>> Yes, density matrices are all mixtures of kets in the manner you
>> describe, even if one is working with an infinite dimensional Hilbert
>> space.
>
>The reader should note that the representation is not unique.
>
>Keith Ramsay
For what it's worth, the representation is unique if you demand that
the density matrix be a mixture of orthogonal pure states. That is,
suppose we write a density matrix D in the following way:
D = sum over j of C_j |Psi_j><Psi_j|
Then to represent D as a mixture, we need to solve for C_j and |Psi_j>.
If we assume that the |Psi_j> are orthonormal,
<Psi_i|Psi_j> = delta_{i,j}, then it becomes an eigenvalue problem:
D |Psi_j> = C_j |Psi_j>
The C_j are just the eigenvalues of D, and the |Psi_j> are the corresponding
eigenvectors.
Daryl McCullough
ORA Corp.
Ithaca, NY
>>The reader should note that the representation is not unique.
>For what it's worth, the representation is unique if you demand that
>the density matrix be a mixture of orthogonal pure states.
This is a very good point, but it's not *always* true -- due to the
old problem with repeated eigenvalues. E.g., if we have two orthonormal
bases of a 2d Hilbert space, say x,y and z,w, we have
(|x><x| + |y><y|)/2 = (|z><z| + |w><w|)/2