Proxy and storage nodes

Blair Bethwaite

Oct 21, 2011, 10:54:39 PM

to openst...@googlegroups.com

Hi list,

When planning an OpenStack Swift deployment there are two main components to consider: storage nodes and proxy nodes. All access to the actual storage goes through the proxy, so the proxy obviously needs a reasonable pipe into the public network. Storage traffic is stateless and thus easily load-balanced, so you can add proxies to handle more requests. What's not clear to me is why you would necessarily physically separate the proxy and storage nodes (assuming multiple proxies)...

It seems like the most horizontally scalable, flat solution would be to spread the proxies out over the whole storage cluster. I must be missing something.

--
Blair Bethwaite
Researcher, Developer, SysAdmin, Nimrod and Grid support specialist
Monash eScience and Grid Engineering Lab (http://www.messagelab.monash.edu.au/)
+61 3-9903-2800

Tom Fifield

Nov 1, 2011, 1:43:13 AM

to OpenStack Australia

Cool question,

Thoughts:

* Increases the cost of your storage nodes
The proxy has far greater network requirements than a storage node. For
example, if your storage is connected at 10Gbit, you would ideally
have 40Gbit for your proxy (3*10Gbit for 3 replicas to the storage
net, 1*10Gbit to the public network). Having the proxy on the storage
node does not reduce this network requirement, so proxy-on-storage-node
would mean every storage node needs four 10Gbit cards. This might
be less of an issue if your storage only uses 1Gbit connections.
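
The arithmetic above can be sketched in a few lines of Python. This is only
a back-of-the-envelope helper, not anything from Swift itself: the function
names are made up for illustration, and it assumes the public side runs at
the same link speed as the storage side, as in the 10Gbit example.

```python
import math


def proxy_bandwidth_gbit(storage_link_gbit, replicas=3, public_links=1):
    """Ideal proxy bandwidth: one storage-speed link per replica
    towards the storage network, plus the public-facing link(s).

    Assumption: public links run at the same speed as storage links.
    """
    storage_side = replicas * storage_link_gbit
    public_side = public_links * storage_link_gbit
    return storage_side + public_side


def ports_needed(total_gbit, port_gbit):
    """Number of ports of a given speed needed to carry total_gbit."""
    return math.ceil(total_gbit / port_gbit)


# Compare the 10Gbit case with the cheaper 1Gbit case.
for link in (10, 1):
    total = proxy_bandwidth_gbit(link)
    ports = ports_needed(total, link)
    print(f"{link}Gbit storage links: {total}Gbit per proxy, "
          f"{ports} x {link}Gbit ports")
```

With proxy-on-storage, that port count applies to every storage node rather
than to a handful of dedicated proxies.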

The other thing to look at is CPU and RAM usage. Depending on the
setup, these requirements can differ between proxy and storage, which
could add cost if every box has to be sized to run a proxy as well. I
don't have a good feel for this yet, but look into the contention on
the bus between network traffic and disk I/O.

* Removes the public-private split - bandwidth problems
The storage network, that is, the switch sitting between the storage
nodes, gets very chatty. Storage nodes are constantly communicating
with each other to balance the cluster.

This network gets hammered especially hard when a disk or node
fails. If you're putting your public interfaces into the
same kit, it will likely reduce the performance of all proxies, or at
least the proxies on nodes doing large transfers.

Together, these could offset the benefit you get from load
balancing...

Regards,

Tom

Peter Jung

Dec 9, 2011, 1:35:27 AM

to OpenStack Australia

From the Rackspace Reference Architecture,
"Object Storage Public Storage – Multiple Cabinet (Rack) Physical
Architecture":

All storage nodes connect to their respective L2 (TOR) switches in the
cabinet.
The L2 switches are up-linked to L3 (inter-VLAN, multi-layer,
aggregation) switches via MLAG (Multi-chassis Link Aggregation).

Note that Proxy nodes are set up as a separate entity from the storage
nodes to allow for better scalability in this architecture. This
implementation allows for maximum scalability of the environment and
optimum performance.

Object, Storage and Account (OSA) services run on the storage nodes in
this implementation, with the Proxy service running on the proxy nodes.

Proxy nodes are similar to web servers in a DMZ, and can be
load-balanced via an L4-7 switch with a VIP.
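
As a rough illustration of why stateless proxies are easy to balance, here
is a minimal round-robin sketch in Python. The proxy addresses are
hypothetical, and a real L4-7 switch would also do health checks and
connection tracking, which this leaves out.

```python
from itertools import cycle

# Hypothetical proxy node addresses sitting behind one VIP.
PROXIES = ["10.0.0.11:8080", "10.0.0.12:8080", "10.0.0.13:8080"]


def round_robin(backends):
    """Return an endless iterator that rotates through the backends.

    Plain rotation is enough only because requests carry no session
    state, so any proxy can serve any request.
    """
    return cycle(backends)


picker = round_robin(PROXIES)

# Six incoming requests get spread evenly over the three proxies.
assignments = [next(picker) for _ in range(6)]
for request_id, backend in enumerate(assignments):
    print(f"request {request_id} -> {backend}")
```

Adding a proxy is then just a matter of adding another entry to the list.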

Cheers,
Peter
