Wentao 's Blog

Distributed Multi-Server Messaging Network

This is a multi-server system for broadcasting activity stream objects between a number of clients. The multi-server system will:

  1. Load balance client requests over the servers, using a redirection mechanism to ask clients to reconnect to another server.

  2. Allow clients to register a username and secret, that can act as an authentication mechanism. Clients can login and logout as either anonymous or using a username/secret pair.

  3. Allow clients to broadcast an activity object to all other clients connected at the time

  4. Allow servers to join the network at any time.

  5. Allow clients to join and leave the network at any time.

  6. Maintain that a given user-name can only be registered once over the server network.

  7. Guarantee that an activity message sent by a client reaches all clients that are connected to the network at the time that the message was sent.

  8. Guarantee that all activity messages sent by a client are delivered in the same order at each receiving client however the order that messages are sent by different clients does not need to be enforced at receiving clients.

  9. Ensure that clients are evenly distributed over the servers as much as possible.

Protocol

AUTHENTICATE

Sent from one server to another always and only as the first message when connecting.

{
  "command" : "AUTHENTICATE", 
  "secret" : "a0fb99496214f0d166e3ce4fe5e077ba"
}

Receiver replies with:

  • AUTHENTICATION_FAIL if the secret is incorrect

  • INVALID_MESSAGE if anything is incorrect about the message, or if the server had already successfully authenticated

  • No reply if the authentication succeeded

If anything other than authentication succeeded, then the connection is closed immediately after sending the response.

INVALID_MESSAGE

A general message used as a reply if there is anything incorrect about the message that was received. This can be used by both clients and servers.

{
  "command" : "INVALID_MESSAGE", 
  "info" : "the received message did not contain a command"
}

{
  "command" : "INVALID_MESSAGE", 
  "info" : "JSON parse error while parsing message"
}

After sending an INVALID_MESSAGE the sender will close the connection. Receivers of such a message must therefore close the connection as well.

AUTHENTICATION_FAIL

Sent by the server when either a server fails to authenticate or when a client fails to login.

{
  "command" : "AUTHENTICATION_FAIL", 
  "info" : "the supplied secret is incorrect: a0fb99496214f0d166e3ce4fe5e077ba"
}

After sending an AUTHENTICATION_FAIL the server will close the connection.

A client or server that receives AUTHENTICATION_FAIL must close the connection.

LOGIN

Sent from a client to a server.

{
  "command" : "LOGIN", 
  "username" : "test1", 
  "secret" : "a0fb99496214f0d166e3ce4fe5e077ba"
}

A username must be present. The value of username may be anonymous in which case no secret field should be given (it should be ignored). Otherwise a secret must be given.

The server replies with:

  • LOGIN_SUCCESS only if the server recognizes the combination of username and secret in its local storage.

  • LOGIN_FAILED in cases where the secret does not match that in the local storage, for the supplied username, or when the username is not found.

  • INVALID_MESSAGE in any case where the message is incorrect.

LOGIN_SUCCESS

Sent from server to client to indicate that the client successfully logged in.

{
  "command" : "LOGIN_SUCCESS", 
  "info" : "logged in as user aaron"
}

The server will follow up a LOGIN_SUCCESS message with a REDIRECT message if the server knows of any other server with a load at least 2 clients less than its own.

REDIRECT

Sent from a server to a client, to request the client to reconnect to a new server.

{
  "command" : "REDIRECT", 
  "hostname" : "123.456.78.9", 
  "port" : 1234
}

After sending a REDIRECT message the server will close the connection.

After the client receives a REDIRECT message it must close the connection and make a new connection to the system, presumably to the server as given in the message. The client starts the protocol afresh.

LOGIN_FAILED

Sent from server to client to indicate that the client did not successfully log in.

{
  "command" : "LOGIN_FAILED", 
  "info" : "attempt to login with wrong secret"
}

After sending a LOGIN_FAILED the server will close the connection.

LOGOUT

Sent from client to server to indicate that the client is closing the connection.

{
  "command" : "LOGOUT"
}

The client will close the connection after sending. The server will close the connection after receiving.

ACTIVITY_MESSAGE

Sent from client to server when publishing an activity object.

Example:

{
  "command" : "ACTIVITY_MESSAGE", 
  "username" : "test1", 
  "secret" : "a0fb99496214f0d166e3ce4fe5e077ba", 
  "activity" : { ... }
}

The actual activity object is not shown above.

The server will respond with:

  • AUTHENTICATION_FAIL if the username is not anonymous or if the username and secret do not match the logged in the user, or if the user has not logged in yet. After sending such a message the connection is closed.

  • INVALID_MESSAGE if the message is incorrect in any way, and close the connection.

If the request succeeds then the server will broadcast the activity object to all servers and they in turn will broadcast to all connected clients. The activity will be processed (see later) before being broadcast.

SERVER_ANNOUNCE

Broadcast from every server to every other server (between servers only) on a regular basis (once every 5 seconds by default). The id is unique value that each server chooses for itself and maintains constant throughout operation. The load is the number of clients currently connected to the server. The hostname and port is the address of the server.

{
  "command" : "SERVER_ANNOUNCE", 
  "id" : "a0fb99496214f0d166e3ce4fe5e077ba", 
  "load" : 5, 
  "hostname" : "128.250.13.46", 
  "port" : 3570, 
  "level": 1
}

Servers respond with an INVALID_MESSAGE if the message is incorrect or if the message is received from an unauthenticated server (i.e. on a connection that has not yet authenticated with the server secret). Then the server will close connection.

ACTIVITY_BROADCAST

Message broadcast between servers, and from each server to its clients, that contains a processed activity object.

{
  "command" : "ACTIVITY_BROADCAST", 
  "activity" : { ... }
}

Servers respond with an INVALID_MESSAGE if the message is incorrect or if the message is received from an unauthenticated server (i.e. on a connection that has not yet authenticated with the server secret). Then the server will close connection.

REGISTER

Sent from client to server when the client wishes to register a new username.

{
  "command" : "REGISTER", 
  "username" : "aaron", 
  "secret" : "a0fb99496214f0d166e3ce4fe5e077ba"
}

The client selects the secret that it wishes to register with. The server replies with:

  • REGISTER_FAILED if the username is already known (registered) by any server in the system. Connection is closed.

  • REGISTER_SUCCESS if the username was not already known by any server in the system, and now the user can login using the username and secret.

  • INVALID_MESSAGE if anything is incorrect about the message, or if receiving a REGISTER message from a client that has already logged in on this connection. Connection is closed by server.

REGISTER_FAILED

Sent by server to client to indicate that an attempt to register a username and client has failed.

{
  "command" : "REGISTER_FAILED", "info" : "aaron is already registered with the system"
}

The connection is closed by the server after sending this message.

REGISTER_SUCCESS

Sent by server to client to indicate that an attempt to register a username and client has succeeded.

{
  "command" : "REGISTER_SUCCESS", "info" : "register success for aaron"
}

LOCK_REQUEST

Broadcast from a server to all other servers (between servers only) to indicate that a client is trying to register a username with a given secret.

{
  "command" : "LOCK_REQUEST", 
  "username" : "aaron", 
  "secret" : "a0fb99496214f0d166e3ce4fe5e077ba"
}

A server that receives this message will:

  • Broadcast a LOCK_DENIED to all other servers (between servers only) if the username is already known to the server with a different secret.

  • Broadcast a LOCK_ALLOWED to all other servers (between servers only) if the username is not already known to the server. The server will record this username and secret pair in its local storage.

  • Send an INVALID_MESSAGE if anything is incorrect about the message or if it receives a LOCK_REQUEST from an unauthenticated server (i.e. the sender has not authenticated with the server secret). The connection is closed.

LOCK_DENIED

Broadcast from a server to all other servers (between servers only), if the server received a LOCK_REQUEST and to indicate that a username is already known, and the username and secret pair should not be registered.

{
  "command" : "LOCK_DENIED", 
  "username" : "aaron", 
  "secret" : "a0fb99496214f0d166e3ce4fe5e077ba"
}

When a server receives this message, it will remove the username from its local storage only if the secret matches the associated secret in its local storage.

Send an INVALID_MESSAGE if anything is incorrect about the message or if it receives a LOCK_DENIED from an unauthenticated server (i.e. the sender has not authenticated with the server secret). The connection is closed.

LOCK_ALLOWED

Broadcast from a server to all other servers if the server received a LOCK_REQUEST and the username was not already known to the server in its local storage.

{
  "command" : "LOCK_ALLOWED",   
  "username" : "aaron", 
  "secret" : "a0fb99496214f0d166e3ce4fe5e077ba"
}

The server will send an INVALID_MESSAGE if anything is incorrect about the message or if it receives a LOCK_ALLOWED from an unauthenticated server (i.e. the sender has not authenticated with the server secret). The connection is closed.

SERVER_INFO

Sent from a parent server to its child servers after a successful authentication.

{
  "command" : "SERVER_INFO", 
  "id" : "a0fb99496214f0d166e3ce4fe5e077ba", 
  "level" : 2, 
  "users" : [
    "UserA:Passw0rd1",
    "UserB:Passw0rd2"
  ]
}

LOCK_CANCEL

Sent from the original server to all servers after a failed registration.

{
  "command": "LOCK_CANCEL", 
  "username": "userA", 
  "secret": "passw0rd"
}

When a server receives this message, it will remove the user-name and password from its local storage.

Design Rationale

Failure Model

The basic principle of failure model is that each server (except the one on the root) should ensure an available outgoing connection to its parent server. servers do not care about their child servers’ dead or alive, while child servers do care about their parent server. Once a child server detects the disconnection to its parent server, it will immediately try to establish a new connection to another server.

There are mainly two cases that a server will lose the connection to its parent server, and will be handled in the failure model: one is the parent server quits and closes the TCP socket; its child servers will be immediately aware of the disconnection and initiate a reconnection. The other is, the network of the parent server is interrupted and only the parent server will be aware of the disconnection. To handle the second case, the child server will make use of the information obtained through SERVER ANNOUNCE. Each server checks whether it has received an announcement for its parent server on a regular basis; if not, the child server will assume the parent server is disconnected from the network, and establish a new connection.

Reconnect

Consequently, each (child) server should have a strategy of reconnection. To help each server maintain a basic structure of the network, we have added a level property to each server, which represents the level of the server in the tree structure demonstrated as Figure 1.

Figure 1. Tree structure with defined level

The level of the first (root) server is 0, and its successors can know the level of their parent servers by SERVER INFO, which is sent after a successful authentication. Also, we appended a level field to SERVER ANNOUNCE above. Since the structure of the multi-server system is a tree, we have defined server’s level indicating the level of current server in this tree structure. For instance, while the default level of the initial server is 0, its child severs are all in level 1.

Therefore, each server has the id and level information of any other server. All these servers are sorted by level and id in descending order and ascending order respectively. Then, when they lose the connection with their parents, they will need to find a new parent server, thus the child server will traverse the list of known servers and try to find the first one with a smaller level number like figure 2 has shown.

Figure 2. Reconnecting to new parent, case 1

If massive failures happen like the Figure 3 below (in which server D has the smallest id in the level 2 servers), then a level 2 server will traverse for the second time to find the one with the same level number. A server will never reconnect to another server with greater level numbers, as that may result in a cycle. If the smallest numbered child server (Server D) can not find a server to reconnect, it will then become the new root server, and will wait for re-connections from other servers.

Figure 3. Reconnecting to new parent, case 2

Buffer

The connection disconnects at time $T_1$, and the reconnection to a new parent server needs time $\Delta T$. Between time $T_1$ and $T_1+\Delta t$, the child server can still receive ACTIVITY_MESSAGE from a client although the outgoing connection is unavailable. To ensure that the messages send at $T_1$ can be delivered to all servers connected at $T_1$, these messages will be stored in a queue. After the reconnection is accomplished, all messages held in the queue will be resent to the new parent server.

Registering

Since any server can quit at any time, we have to adopt a different registering strategy. When a server receives a registration request from a client while the username does not appear in its local storage, it will still broadcast LOCK_REQUEST to all directly connected servers. But rather than storing the number of all known servers, the server will store the connections that it has sent LOCK_REQUEST to. When another server receives a LOCK_REQUEST, it will check the availability in its storage. If the username is not available, it will response with a LOCK_DENIED immediately. However, if the username is actually available, this server will not do any response but pass the LOCK_REQUEST to the other directly connected servers, storing a list of these servers (List A), save the pair of username and secret, and, finally, wait for their responses. Also, each server maintains a list (List B) of all connections directly connected to it at any time. When getting a response of LOCK_DENIED, the server will send the same LOCK_DENIED to the server that requested this user name. When getting a LOCK_ALLOWED, the server will compare List A and List B to check if all currently connected servers have responded with a LOCK_ALLOWED. By this recursive checking process, the original server with the client will get a LOCK_DENIED or LOCK_ALLOWED from all its directly connected servers.

If the registration is unsuccessful, the original server would broadcast a LOCK_CANCEL message to all other servers, aiming at the removal of the pair of username and secret from local storage of every server.

This strategy can handle the case that one or more servers in the network may quit during a registration. Let us assume that, there is an initiation of LOCK_REQUEST from server A to servers B, C and D. At this moment, server B dies before any response can be sent to server A. Will that become a problem? No, server A will also remove server B from the connected servers list, and will not expect any response from server B.

New Servers

If a server joins the network after some clients have joined, the new server will get the information of all registered users by SERVER_INFO. It will also know the level n of its parent server, and will set the level of itself to n+1.

Concurrency Issues

It can be always found that two users are trying to register in a system with same username and password to two servers. On the one hand, during the LOCK_REQUEST processing time, there is a possibility that both servers do not receive response and before getting the LOCK_REQUEST from each other, they will respond LOCK_ALLOWED to each other. On the other hand, if both servers have responded LOCK DENIED , neither of the two registrations will be successful.

To avoid such issue, servers save the user information in its local storage before sending LOCK_REQUEST message, therefore, LOCK_REQUEST with same credentials will be certainly denied, and only one user can register with that name and secret. This solution contains all servers in a system: once a LOCK_REQUEST is sent to a server, it saves the user details before passing them to further servers. When this request gets denied in other servers and the LOCK_DENIED is broadcasted, the server must remove these user details.

Other problems include situations like two users trying to register on a same server with same user information. For two LOCK_REQUEST requests, their user credentials are same, and servers are not different; how can we define one from the other and prevent duplicate registrations? Here we would like to introduce the idea of numbers; in specific, that means adding an ID to each LOCK_REQUEST . With a unique ID, the system can identify requests even if they are from a same server with same usernames and secrets. Moreover, other revisions may apply for concurrency problems. For example, synchronizing all servers with a full list of registered users may also reduce the rate or concurrency registrations. Such arrangement requires each server sending its user list to its child nodes, once the nodes are connected to it.

Scalability

In this system, message objects are sent from one server to its connected servers in order. If we do not set restrictions, the worst case in a system (with server number n) is connecting all servers in line, from which we can conclude the complexity of $O(n)$. This complexity can be improved by setting tree restrictions when adding new servers. When we set the max depth of the tree as 5, new server will not be connected to nodes with depth 5; another node with less height will be assigned to it as parent to keep balance of the tree. For such tree structures, the complexity can be reduced to logarithmic of n, depending on the depth of trees.

According CAP theorem, only two of Consistency, Availability and Network partitioning can be satisfied. In this system, we place consistency and partitioning over availability. By default, the SERVER ANNOUNCE message is broadcasted every 5 seconds from a server. If we turn back to see the LOCK_REQUEST mechanism, the protocol will only allow a client to connect after receiving LOCK_ALLOWED from every server. In a large-scale distributed system, considering these 5 seconds, the time cost could be great and efficiency improvements may be necessary. Because we have designed the system as every sever knows its parent and children, this whole lock procedures can be simplified by assigning the announcement tasks to parent nodes; the server which sends LOCK_REQUEST can then get all server information from the parents only.


This project was finished with Y. Tang, L. Zhu, and I. Garg. The source code related to this project is hosted on GitHub, but is not publicly available yet.