Application architectures use clustering or load balancing for two main purposes:
To maximize availability of the application by running it on multiple servers. If one server hangs or fails, the remaining servers continue processing and the application stays available to users.
To increase processing throughput by distributing the load across multiple servers. The distribution method can be simple or complex. The most basic method is round-robin: the first user request is assigned to server one, the second to server two, and so on, returning to the first server once every server has received a request. More complex algorithms take into account the complexity of each request. Under such an algorithm, one server might be handling a single complex request while another server handles multiple simple requests.
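
The two distribution approaches can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the server names, and especially the idea that each request arrives with a known numeric "cost", which stands in for the request complexity mentioned above. Real load balancers estimate load in other ways (open connections, response times), so this is not how any particular product works.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer: hand each request to the next server in a fixed rotation."""

    def __init__(self, servers):
        # cycle() repeats the server list endlessly, so the rotation wraps around.
        self._rotation = cycle(servers)

    def assign(self, request):
        # The request content is ignored; only arrival order matters.
        return next(self._rotation)


class LeastLoadBalancer:
    """Hypothetical balancer: hand each request to the server with the least outstanding work."""

    def __init__(self, servers):
        # Outstanding work per server, in the same arbitrary cost units as the requests.
        self.load = {server: 0 for server in servers}

    def assign(self, request_cost):
        # min() returns the first server with the lowest load, so ties go to
        # the server listed first.
        server = min(self.load, key=self.load.get)
        self.load[server] += request_cost
        return server


servers = ["server-1", "server-2"]

# Round-robin: four requests alternate between the two servers.
rr = RoundRobinBalancer(servers)
rr_assignments = [rr.assign(f"request-{i}") for i in range(1, 5)]

# Cost-aware: one complex request (cost 10) followed by three simple ones (cost 1 each).
ll = LeastLoadBalancer(servers)
ll_assignments = [ll.assign(cost) for cost in [10, 1, 1, 1]]

print(rr_assignments)
print(ll_assignments)
print(ll.load)
```

With these inputs, round-robin alternates between the two servers regardless of request size, while the cost-aware balancer leaves server-1 with the single complex request and sends all three simple requests to server-2. The sketch never decreases a server's load; a fuller version would subtract the cost when a request completes.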