rust-networking-proxy

Summary:

A proxy written in Rust that supports multiple applications. Each application has a set of ports and a set of targets; the proxy listens on each port and routes incoming requests to one of the targets. The configuration is written in JSON.
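
For illustration, a config of roughly this shape would describe one app with two ports and two targets (the field names here are illustrative, not necessarily the real schema — see config.json for the actual one):

```json
{
  "apps": [
    {
      "name": "app1",
      "ports": [8080, 8081],
      "targets": ["127.0.0.1:9001", "127.0.0.1:9002"]
    }
  ]
}
```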

Some features:
1 - It accepts connections on all listed ports.
2 - It alternates between an app's target servers (it does not use the same server over and over).
3 - It handles bad target servers (if a server is down, another one is used, when possible).
4 - If no servers are available for an app, an appropriate message is sent to the user.

How it works:

We handle each app separately (and asynchronously). For each app, we listen on each of its specified ports.
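
In tokio terms, this looks roughly like the following sketch (names and details are illustrative, not the actual code):

```rust
use tokio::net::TcpListener;

// One accept loop per (app, port); each connection is handled in its own task.
// Sketch only: error handling and the actual routing are omitted.
async fn listen_on(port: u16) -> std::io::Result<()> {
    let listener = TcpListener::bind(("0.0.0.0", port)).await?;
    loop {
        let (socket, _peer) = listener.accept().await?;
        tokio::spawn(async move {
            // hand `socket` to the routing logic (see the next sketch)
            let _ = socket;
        });
    }
}
```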

We support concurrent users through asynchronous programming with tokio (https://tokio.rs/). For each connection, we read the request and route it to one of the available servers in the application's set of targets. The proxy attempts to balance the load between servers by picking a different one each time, when possible. This is done with the following algorithm (sketched in code after the list):

  1. For each app, keep track of the last target we successfully routed to.
  2. For the next request, try the next server on the (circular) list until one succeeds.
  3. Save the target that succeeded as the new last target.
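
A minimal sketch of this rotation, assuming a plain list of target addresses (the function name and signature are illustrative):

```rust
use tokio::net::TcpStream;

/// Try targets in circular order, starting just after `last`; return the
/// first one that accepts a connection, together with its index.
/// Sketch only; the real code also has to share `last` between tasks.
async fn pick_target(targets: &[String], last: usize) -> Option<(usize, TcpStream)> {
    for offset in 1..=targets.len() {
        let i = (last + offset) % targets.len();
        if let Ok(stream) = TcpStream::connect(targets[i].as_str()).await {
            return Some((i, stream)); // the caller stores `i` as the new `last`
        }
    }
    None // every target failed; the proxy reports this to the user
}
```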

How I landed on the design:

I tried to follow some code examples online. One big inspiration was this video by Lily Mara: "Creating a Chat Server with async Rust and Tokio" (https://www.youtube.com/watch?v=Iapc-qGTEBQ).
I learned the async part mostly from the Tokio docs: https://tokio.rs/tokio/tutorial
My biggest struggle was getting the shared state needed for the load-balancing algorithm (we need to track the last used target for each app). I ended up using a Mutex around a HashMap, which has obvious drawbacks that I discuss in later sections.
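
Concretely, the shared state looks roughly like this (a sketch; the type alias and names are mine, not the actual code):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// One entry per app: the index of the last target we successfully routed to.
// The Arc lets every connection task share the map; the Mutex serializes access.
type LastTargets = Arc<Mutex<HashMap<String, usize>>>;

fn new_state() -> LastTargets {
    Arc::new(Mutex::new(HashMap::new()))
}
```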

What might break under a production load:

The load-balancing algorithm is a problem. Note that we only keep track of the last target that was successfully used. This means that if one available target is followed by a lot of unavailable ones, all of those will be tried on every request.
Example:
the 1st target is available
the 2nd, 3rd, and 4th are down
the proxy will always try the 2nd, 3rd, and 4th before successfully routing back to the 1st.
This means the user may wait a long time while the proxy retries servers that are consistently bad.

Another point of consideration is that my implementation uses a single Mutex for the whole proxy. Under load, each connection may wait a long time for the Mutex to be released. An improvement would be one Mutex per app; a better one would be to get rid of the Mutex entirely (maybe with message passing and some queue?).
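
As a hedged sketch of the per-app variant (names are illustrative): give each app its own lock, so contention on one app's counter never blocks the others.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// The outer map is built once at startup from the config, so it needs no lock;
// only each app's counter is protected, independently of the other apps.
type PerAppState = Arc<HashMap<String, Mutex<usize>>>;

fn new_state(app_names: &[String]) -> PerAppState {
    Arc::new(
        app_names
            .iter()
            .map(|name| (name.clone(), Mutex::new(0)))
            .collect(),
    )
}
```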

What needs to happen before the proxy is production-ready:

We could implement a better load-balancing strategy:

  1. Periodically check each target server for availability and load, and keep track of which servers are best and which are down (see the sketch after this list).
  2. Maybe route requests geographically in some way (this would require gathering some information about the user; I don't know how feasible that is in this context).
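
For the first point, a background health checker could be as simple as the sketch below, assuming a TCP connect is an acceptable probe (it says nothing about load, and all names are illustrative):

```rust
use std::sync::{Arc, Mutex};
use std::time::Duration;
use tokio::net::TcpStream;

/// Periodically probe each target and record which ones are reachable.
/// The routing code can then skip targets whose flag is false.
async fn health_check_loop(targets: Vec<String>, healthy: Arc<Mutex<Vec<bool>>>) {
    loop {
        for (i, addr) in targets.iter().enumerate() {
            let up = TcpStream::connect(addr.as_str()).await.is_ok();
            healthy.lock().unwrap()[i] = up; // lock is held only for the write
        }
        tokio::time::sleep(Duration::from_secs(5)).await;
    }
}
```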

A production-ready proxy would also have much more robust error handling, which I have skipped here.

Notes about the proxy testing tool:

I do not really know how to approach testing such code. I ended up writing a very simple tool in Go, with not-great tests. Below is what my code accomplishes and what I think it lacks.

Pros:

  1. The "tests" indicate that the proxy is indeed routing the requests.
  2. The "tests" indicate that the proxy is balancing the load between available targets.
  3. The "tests" indicate that the proxy is handling unavailable servers (routing the request to other servers)

Cons:

  1. The code is structured very simply: a bunch of functions that assert some results. This would be better expressed with a testing framework, so that each function becomes a real test.
  2. The "tests" are not reliable. Some depend on the live state of the proxy and targets, and they are not particularly good tests anyway.
  3. The "tests" use hard-coded values from config.json. We should have a configuration tailored to the tests, and values should be read from it automatically, not copied by hand.
  4. I could not really test the concurrency of the proxy. I created a bunch of users with goroutines, but I cannot be sure whether the requests are merely issued concurrently by Go or also processed concurrently by the proxy. (Manually testing with telnet works, though.)