In all cases, yes. However, this is not the recommended or optimal configuration.
_______________________________________________
Lustre-discuss mailing list
Lustre-...@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
> 1) Could the "Client" and "MGS" run at one node together? or
> could "Client" and "OSS" run at one node together? 2) Suppose
> I had deployed them at one node, what potential shortcomings
> or harm are there?
Running the MGS and MDS on the same node is customary; see:
http://wiki.lustre.org/manual/LustreManual20_HTML/LustreOperations.html#50438194_24122
Running the MGS, MDS, and OSS services on the same node is
possible and fairly common in very small setups, usually those
with only 1-2 nodes.
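As an illustration of the combined MGS/MDS case (not from the thread; device, fsname, and mount point are assumptions, and exact flags vary by Lustre version, so check mkfs.lustre(8) for yours), one device is typically formatted as both targets in a single step:

```
# Illustrative sketch: format one device as both the MGS and the MDT
# (device name, fsname, and mount point are assumptions).
mkfs.lustre --fsname=testfs --mgs --mdt /dev/sdb
mount -t lustre /dev/sdb /mnt/mgs-mdt
```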
It is possible to use the client code on all types of Lustre
servers, but at least in the case of running the client on an
OSS there is a non-negligible possibility of a resource
deadlock if the client uses the OSS on the same node: the
client and OSS code compete for memory, so in the past this has
been discouraged.
This is documented here:
http://wiki.lustre.org/manual/LustreManual20_HTML/LustreOperations.html#50438194_84876
«Caution - Do not do this when the client and OSS are on the
same node, as memory pressure between the client and OSS can
lead to deadlocks.»
With most bonding modes, packets can get sent across different links
in the bond. This results in out-of-order packets and can slow down
a TCP stream (like your Lustre connection) unnecessarily. LACP will
route packets to a given destination across exactly one link in the
bond - but that will limit each TCP stream to the link speed of a
single bond member.
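For reference, a hedged sketch of what that LACP setup might look like as bonding module options (the file path and values are assumptions; consult your distribution's bonding documentation):

```
# /etc/modprobe.d/bonding.conf (illustrative)
# mode=802.3ad is LACP; xmit_hash_policy=layer3+4 hashes on
# source/destination IPs and TCP ports to pick a link per stream.
options bonding mode=802.3ad miimon=100 xmit_hash_policy=layer3+4
```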
You can improve upon single link speeds with Lustre, because Lustre
will let you stripe large files across multiple OSSes. A client will
build a separate TCP connection to each OSS, so as long as traffic
passes over different links you can use all the available bandwidth.
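For example (paths and stripe values here are assumptions for illustration; see lfs(1) for your Lustre version), striping files across several OSTs looks like:

```
# Stripe new files in this directory across 4 OSTs, 1 MiB per stripe
# (directory and counts are illustrative).
lfs setstripe -c 4 -s 1M /mnt/lustre/bigdir
# Verify the resulting layout:
lfs getstripe /mnt/lustre/bigdir
```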
The way traffic is scattered across the bond members is controlled by
the xmit_hash_policy option; layer3+4 uses an XOR of the source and
destination addresses, combined with the source and destination TCP
ports (modulo number of links in the bond) to pick the specific link
for that stream. If you're using sequential IPs for your OSSes, you
should be able to get a good scattering effect (since your source
address and port won't change, but your destination address will vary
across the OSSes).
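To make the scattering concrete, here is a simplified sketch of that hash, matching the description above but not the bonding driver's exact arithmetic (see the kernel's bonding documentation for that). All IPs and ports below are assumptions:

```shell
# Simplified layer3+4-style hash: XOR of address and port fields,
# modulo the number of bond members. Values are illustrative only.
nlinks=2
src_ip_last=5      # assumed client IP 10.0.0.5 (last octet)
src_port=1021      # assumed fixed client source port
dst_port=988       # Lustre service port
for dst_ip_last in 10 11 12 13; do   # assumed OSS IPs 10.0.0.10-13
  link=$(( (src_port ^ dst_port ^ src_ip_last ^ dst_ip_last) % nlinks ))
  echo "OSS 10.0.0.${dst_ip_last} -> bond link ${link}"
done
```

With these assumed values the four streams alternate between link 0 and link 1, which is the scattering effect described above.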
A few years ago, I was using Lustre 1.4 and 1.6 and saw 200+ MB/sec
across two gigE links bonded together on the client (using striped
files). Your mileage may vary, of course. Caveats include:
- Small files may have poorer performance than usual, due to
  transaction overhead to multiple OSSes (if your bond is on the
  client). Similarly, non-striped files will only see the speed of a
  single link.
- If your OSTs start to fill, Lustre's load balancing may not give you
  an ideal distribution of stripes across OSSes, causing multiple TCP
  streams to land on the same bond member on the client. Unfortunately,
  this will present as slowdowns for certain files on certain clients
  (because the number of bond members that can be used is a function of
  both which OSSes are used in the file and the client's IP in the hash
  policy).
- All metadata accesses are limited to the speed of a single bond
  member on the client.
If your bonds are on the server, then (as long as you have a number of
clients) you should see a nice increase in overall IO throughput. It
won't be as marked a boost as 10gigE or Infiniband, but bonds are
inexpensive and generally better than a single link (to multiple
clients).
Hope this helps - good luck!
--
Mike Shuey
Theoretically, the client and OST cannot be on the same node due to a
potential memory deadlock. When a node that is a Lustre client grows
short of memory, it flushes its cache to free some memory up.
However, if the OST that it needs to flush pages to is also on the same
node, that OST will need to allocate memory to receive the pages from
the client, which it will not be able to do, since the node is already
short of the very memory whose scarcity is forcing the client to flush.
Cheers,
b.
--
Brian J. Murrell
Senior Software Engineer
Whamcloud, Inc.