1. Q: In many layered protocols, each layer has its own header. Surely
it would be more efficient to have a single header at the front of each
message with all the control in it than all these separate headers. Why
is this not done?
A: Each layer must be independent of the other ones. The data passed from layer k + 1 down to layer k contains both header and data, but layer k cannot tell which is which. Having a single big header that all the layers could read and write would destroy this transparency and make changes in the protocol of one layer visible to other layers. This is undesirable.
2. Q: Why are transport-level communication services often inappropriate for building distributed applications?
A: They hardly offer distribution transparency, meaning that application developers are required to pay significant attention to implementing communication, often leading to proprietary solutions. The effect is that distributed applications built directly on top of, for example, sockets are difficult to port and to interoperate with other applications.
4. Q: Consider a procedure incr with two integer parameters. The procedure
adds one to each parameter. Now suppose that it is called with the same
variable twice, for example, as incr(i, i). If i is initially 0, what
value will it have afterward if call-by-reference is used? How about if
copy/restore is used?
A: If call by reference is used, a pointer to i is passed to incr. It will be incremented two times, so the final result will be two. However, with copy/restore, i will be passed by value twice, each value initially 0. Both will be incremented, so both will now be 1. Now both will be copied back, with the second copy overwriting the first one. The final value will be 1, not 2.
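The two parameter-passing disciplines can be simulated in a few lines of Python (a sketch: a one-item list stands in for a pointer, since Python has no true call-by-reference):

```python
def incr_call_by_reference():
    # Both parameters alias one storage cell, modeled as a one-item list.
    cell = [0]
    params = (cell, cell)          # incr(i, i): both names refer to the same cell
    for p in params:
        p[0] += 1                  # each increment hits the shared cell
    return cell[0]                 # final value: 2

def incr_copy_restore():
    i = 0
    a, b = i, i                    # copy-in: two independent copies of i
    a, b = a + 1, b + 1            # each local copy is incremented once
    i = a                          # copy-back of the first parameter
    i = b                          # copy-back of the second overwrites it
    return i                       # final value: 1
```

The last line of `incr_copy_restore` shows why the result is 1: the second copy-back simply overwrites the first.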
6. Q: One way to handle parameter conversion in RPC systems is to
have each machine send parameters in its native representation, with
the other one doing the translation, if need be. The native system
could be indicated by a code in the first byte. However, since locating
the first byte in the first word is precisely the problem, can this
actually work?
A: First of all, when one computer sends byte 0, it always arrives in byte 0. Thus the destination computer can simply access byte 0 (using a byte instruction) and the code will be in it. It does not matter whether this is the low-order byte or the high-order byte. An alternative scheme is to put the code in all the bytes of the first word. Then no matter which byte is examined, the code will be there.
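The alternative scheme can be sketched in Python (the format code value is hypothetical); replicating the code in every byte of the first word makes byte order irrelevant:

```python
import struct

FORMAT_CODE = 0x2A   # hypothetical code identifying the sender's representation

def header_word(code: int) -> bytes:
    # Replicate the one-byte code into all four bytes of the first word,
    # so the receiver finds the code no matter which byte it examines.
    return bytes([code]) * 4

word = header_word(FORMAT_CODE)
assert all(b == FORMAT_CODE for b in word)
# Interpreting the word as little- or big-endian makes no difference:
assert struct.unpack('<I', word)[0] == struct.unpack('>I', word)[0]
```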
7. Q: Assume a client calls an asynchronous RPC to a server, and
subsequently waits until the server returns a result using another
asynchronous RPC. Is this approach the same as letting the client
execute a normal RPC? What if we replace the asynchronous RPCs with
one-way RPCs?
A: No, this is not the same. An asynchronous RPC returns an acknowledgement to the caller, meaning that after the client's first call an additional message is sent across the network. Likewise, the server receives an acknowledgement that its response has been delivered to the client. With one-way RPCs these acknowledgements disappear, so combining two one-way RPCs may be the same as a normal RPC, but only provided reliable communication is guaranteed. This is generally not the case.
8. Q: Instead of letting a server register itself with a daemon as
in DCE, we could also choose to always assign it the same endpoint.
That endpoint can then be used in references to objects in the server's
address space. What is the main drawback of this scheme?
A: The main drawback is that it becomes much harder to dynamically allocate objects to servers. In addition, many endpoints need to be fixed, instead of just one (i.e., the one for the daemon). For machines possibly having a large number of servers, static assignment of endpoints is not a good idea.
10. Q: Describe how connectionless communication between a client and a server proceeds when using sockets.
A: Both the client and the server create a socket, but only the server binds the socket to a local endpoint. The server can then subsequently do a blocking read call in which it waits for incoming data from any client. Likewise, after creating the socket, the client simply does a blocking call to write data to the server. There is no need to close a connection.
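This exchange can be sketched with Python's socket API using UDP (loopback address and messages chosen for illustration):

```python
import socket

# Server: create a socket and bind it to a local endpoint.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(('127.0.0.1', 0))              # any free local port
addr = server.getsockname()

# Client: create a socket and write directly; no connect, no accept.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b'hello', addr)

data, client_addr = server.recvfrom(1024)  # blocking read, from any client
server.sendto(b'world', client_addr)       # reply to whoever sent the datagram

reply, _ = client.recvfrom(1024)
client.close(); server.close()             # no connection to tear down
```

Note that neither side ever calls connect or accept; the server simply replies to whatever address the datagram came from.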
11. Q: Explain the difference between the primitives mpi_bsend and mpi_isend in MPI.
A: The primitive mpi_bsend uses buffered communication, by which the caller passes an entire buffer containing the messages to be sent to the local MPI runtime system. When the call completes, the messages have either been transferred or copied to a local buffer. In contrast, with mpi_isend the caller passes only a pointer to the message to the local MPI runtime system, after which it immediately continues. The caller is responsible for not overwriting the message that is pointed to until it has been copied or transferred.
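The difference in buffer ownership can be simulated without real MPI (a sketch; the two functions only mimic the copy-versus-pointer semantics of mpi_bsend and mpi_isend):

```python
sent_with_bsend = []
sent_with_isend = []

def bsend(msg: bytearray):
    # Buffered send: the runtime copies the message immediately,
    # so the caller may reuse its buffer as soon as the call returns.
    sent_with_bsend.append(bytes(msg))     # copy taken here

def isend(msg: bytearray):
    # Non-blocking send: only a reference (pointer) is handed over;
    # the caller must not overwrite msg until the transfer completes.
    sent_with_isend.append(msg)            # reference, no copy

buf = bytearray(b'payload')
bsend(buf)
isend(buf)
buf[:] = b'OVERWR!'                        # caller reuses the buffer too early

# bsend's copy is safe; isend now observes the corrupted contents.
assert sent_with_bsend[0] == b'payload'
assert bytes(sent_with_isend[0]) == b'OVERWR!'
```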
12. Q: Suppose that you could make use of only transient asynchronous
communication primitives, including only an asynchronous receive
primitive. How would you implement primitives for transient synchronous
communication?
A: Consider a synchronous send primitive. A simple implementation is to send a message to the server using asynchronous communication, and subsequently let the caller continuously poll for an incoming acknowledgement or response from the server. If we assume that the local operating system stores incoming messages into a local buffer, then an alternative implementation is to block the caller until it receives a signal from the operating system that a message has arrived, after which the caller does an asynchronous receive.
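The polling variant can be sketched with queues standing in for the asynchronous primitives (names and timings are illustrative):

```python
import queue, threading, time

# "Asynchronous send" = put the message on the network queue and return;
# "asynchronous receive" = non-blocking check of the incoming queue.
to_server = queue.Queue()
to_client = queue.Queue()

def server():
    msg = to_server.get()              # server receives the request
    to_client.put(msg)                 # and sends back its response

def synchronous_send(msg):
    to_server.put(msg)                 # asynchronous send
    while True:                        # caller polls for the response
        try:
            return to_client.get_nowait()   # asynchronous receive
        except queue.Empty:
            time.sleep(0.001)          # back off briefly between polls

threading.Thread(target=server, daemon=True).start()
result = synchronous_send('hello')     # blocks, by polling, until replied
```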
13. Q: Suppose that you could make use of only transient
synchronous communication primitives. How would you implement
primitives for transient asynchronous communication?
A: This situation is actually simpler. An asynchronous send is implemented by having the caller append its message to a buffer that is shared with a process that handles the actual message transfer. Each time a client appends a message to the buffer, it wakes up the send process, which subsequently removes the message from the buffer and sends it to its destination using a blocking call to the original send primitive. The receiver is implemented in a similar fashion by offering a buffer that can be checked for incoming messages by an application.
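A minimal sketch of the sending side, with a helper thread playing the role of the transfer process and a stub standing in for the original blocking send primitive:

```python
import queue, threading

outgoing = queue.Queue()               # buffer shared with the send process
delivered = []

def blocking_send(msg):
    # Stand-in for the original synchronous send primitive.
    delivered.append(msg)

def sender_loop():
    while True:
        msg = outgoing.get()           # woken up when a message is appended
        if msg is None:                # sentinel to stop the helper
            break
        blocking_send(msg)             # the helper blocks here, not the caller

def async_send(msg):
    outgoing.put(msg)                  # append to the buffer and return at once

t = threading.Thread(target=sender_loop)
t.start()
async_send('a'); async_send('b')       # neither call blocks
outgoing.put(None)
t.join()
```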
15. Q: In
the text we stated that in order to automatically start a process to
fetch messages from an input queue, a daemon is often used that
monitors the input queue. Give an alternative implementation that does
not make use of a daemon.
A: A simple scheme is to piggyback the check on the receiving process's own queue operations: each time that process puts a message in its own queue, it also checks its input queue and fetches any messages that have arrived in the meantime. No separate daemon is then needed, at the price of incoming messages waiting until the process next performs a queue operation.
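A sketch of this piggybacking scheme (queue names and the handler are hypothetical):

```python
import queue

incoming = queue.Queue()   # input queue that would otherwise need a daemon
handled = []

def handle(msg):
    handled.append(msg)    # application-level processing of a fetched message

def put_outgoing(out_queue, msg):
    # Instead of a daemon, every enqueue operation also drains
    # whatever has arrived on the input queue in the meantime.
    out_queue.put(msg)
    while True:
        try:
            handle(incoming.get_nowait())
        except queue.Empty:
            break

out = queue.Queue()
incoming.put('req-1'); incoming.put('req-2')   # messages arrive meanwhile
put_outgoing(out, 'response')                  # triggers the piggybacked check
```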
16. Q: Routing tables in IBM WebSphere, and in many other
message-queuing systems, are configured manually. Describe a simple way
to do this automatically.
A: The simplest implementation is to have a centralized component in which the topology of the queuing network is maintained. That component simply calculates all best routes between pairs of queue managers using a known routing algorithm, and subsequently generates routing tables for each queue manager. These tables can be downloaded by each manager separately. This approach works in queuing networks where there are only relatively few, but possibly widely dispersed, queue managers.
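The centralized component can be sketched as follows (the topology and queue-manager names are invented; breadth-first search serves as the known routing algorithm, assuming unit link costs):

```python
from collections import deque

topology = {                     # adjacency list of queue managers
    'QM-A': ['QM-B'],
    'QM-B': ['QM-A', 'QM-C'],
    'QM-C': ['QM-B', 'QM-D'],
    'QM-D': ['QM-C'],
}

def routing_table(source):
    # BFS from source; record, for each destination, the first hop on a
    # shortest path. The resulting table is what the manager downloads.
    table, visited, q = {}, {source}, deque()
    for nb in topology[source]:
        visited.add(nb); table[nb] = nb; q.append(nb)
    while q:
        node = q.popleft()
        for nb in topology[node]:
            if nb not in visited:
                visited.add(nb)
                table[nb] = table[node]   # inherit the first hop
                q.append(nb)
    return table
```

For instance, `routing_table('QM-A')` maps every destination to first hop QM-B, since QM-A has only one neighbor.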
17. Q: With persistent communication,
a receiver generally has its own local buffer where messages can be
stored when the receiver is not executing. To create such a buffer, we
may need to specify its size. Give an argument why this is preferable,
as well as one against specification of the size.
A: Having the user specify the size makes its implementation easier. The system creates a buffer of the specified size and is done. Buffer management becomes easy. However, if the buffer fills up, messages may be lost. The alternative is to have the communication system manage buffer size, starting with some default size, but then growing (or shrinking) buffers as need be. This method reduces the chance of having to discard messages for lack of room, but requires much more work of the system.
18. Q: Explain why transient synchronous communication has inherent scalability problems, and how these could be solved.
A: The problem is the limited geographical scalability. Because synchronous communication requires that the caller is blocked until its message is received, it may take a long time before a caller can continue when the receiver is far away. The only way to solve this problem is to design the calling application so that it has other useful work to do while communication takes place, effectively establishing a form of asynchronous communication.
19. Q: Give an example where multicasting is also useful for discrete data streams.
A: Passing a large file to many users as is the case, for example, when updating mirror sites for Web services or software distributions.
21. Q: How could you guarantee a maximum end-to-end delay when a
collection of computers is organized in a (logical or physical) ring?
A: We let a token circulate the ring. Each computer is permitted to send data across the ring (in the same direction as the token) only when holding the token. Moreover, no computer is allowed to hold the token for more than T seconds. Effectively, if we assume that communication between two adjacent computers is bounded, then the token will have a maximum circulation time, which corresponds to a maximum end-to-end delay for each packet sent.
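Under the stated assumptions (n computers, maximum holding time T, bounded neighbor-to-neighbor delay d), a conservative bound can be computed; the exact formula below is one illustrative choice, not the only possible bound:

```python
def max_token_circulation(n, T, d):
    # Each of the n computers may hold the token at most T time units,
    # and passing it to the next neighbor takes at most d time units.
    return n * (T + d)

def delay_bound(n, T, d):
    # Conservative end-to-end bound: a sender waits at most one full
    # circulation for the token, then the packet traverses at most
    # one full ring to reach its destination.
    return 2 * max_token_circulation(n, T, d)
```

With, say, 10 computers, T = 5 and d = 1, the token circulates in at most 60 time units, giving an end-to-end bound of 120.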
22.
Q: How could you guarantee a minimum end-to-end delay when a collection
of computers is organized in a (logical or physical) ring?
A: Strangely enough, this is much harder than guaranteeing a maximum delay. The problem is that the receiving computer should, in principle, not receive data before some elapsed time. The only solution is to buffer packets as long as necessary. Buffering can take place either at the sender, the receiver, or somewhere in between, for example, at intermediate stations. The best place to temporarily buffer data is at the receiver, because at that point there are no more unforeseen obstacles that may delay data delivery. The receiver need merely remove data from its buffer and pass it to the application using a simple timing mechanism. The drawback is that enough buffering capacity needs to be provided.
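A sketch of such a receiver-side timing mechanism (the minimum delay value and class are illustrative; packets are released only once the required delay since sending has elapsed):

```python
import heapq

MIN_DELAY = 5.0   # assumed minimum end-to-end delay, in seconds

class DelayBuffer:
    def __init__(self, min_delay):
        self.min_delay = min_delay
        self.heap = []                      # (release_time, packet)

    def arrive(self, packet, send_time):
        # Hold the packet until at least send_time + min_delay.
        heapq.heappush(self.heap, (send_time + self.min_delay, packet))

    def deliverable(self, now):
        # Pass to the application everything whose release time has passed.
        out = []
        while self.heap and self.heap[0][0] <= now:
            out.append(heapq.heappop(self.heap)[1])
        return out

buf = DelayBuffer(MIN_DELAY)
buf.arrive('p1', send_time=0.0)             # arrives "early"; must be held
early = buf.deliverable(now=3.0)            # too soon: 0 + 5 > 3, nothing out
late = buf.deliverable(now=6.0)             # delay satisfied, p1 released
```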
25. Q: When searching for files in an unstructured peer-to-peer
system, it may help to restrict the search to nodes that have similar
files as yourself. Explain how gossiping can help to find those nodes.
A: The idea is very simple: if, during gossiping, nodes exchange membership information, every node will eventually get to know about all other nodes in the system. Each time it discovers a new node, it can be evaluated with respect to its semantic proximity, for example, by counting the number of files in common. The semantically nearest nodes are then selected for submitting a search query.
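The selection step can be sketched as follows (file names and peer identities are invented; the shared-file count serves as the semantic proximity measure mentioned above):

```python
my_files = {'a.mp3', 'b.mp3', 'c.doc'}

discovered_peers = {                 # filled in as gossiping proceeds
    'peer1': {'a.mp3', 'b.mp3', 'x.doc'},
    'peer2': {'z.iso'},
    'peer3': {'a.mp3', 'b.mp3', 'c.doc'},
}

def semantic_proximity(files):
    # Number of files in common is a simple proximity score.
    return len(my_files & files)

def nearest_peers(k):
    # Select the k semantically nearest peers for submitting a query.
    ranked = sorted(discovered_peers,
                    key=lambda p: semantic_proximity(discovered_peers[p]),
                    reverse=True)
    return ranked[:k]
```

With the data above, peer3 (three shared files) and peer1 (two) would be queried first, while peer2 (none) is skipped.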