Image Credit: http://www.flickr.com/photos/getbutterfly/6317955134/
The COPIOUS Engineering team regularly launches world-class web applications on cutting-edge technologies. From the drawing board to the database, our digital products start with the user.
First, we need to know the best- and worst-case scenarios for how many users an application expects to have and what kind of growth curve is anticipated. We need to know what the users are going to be doing with the system and how, including which actions need instant feedback. Often, too, work needs to happen behind the scenes to connect a user to their goal. These are all accounted for in our cluster design process.
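As a rough illustration of how user numbers and a growth curve can turn into a server count, here is a minimal Ruby sketch. The method names, the compound-growth model, and every figure in it are hypothetical placeholders, not numbers from a real project.

```ruby
# Rough capacity sketch. All names and numbers here are hypothetical.

# Project a user count forward, assuming compound monthly growth.
def projected_users(current_users, monthly_growth_rate, months)
  (current_users * (1 + monthly_growth_rate)**months).ceil
end

# Estimate the app servers needed to serve peak traffic with some headroom.
def app_servers_needed(users, peak_active_fraction:, requests_per_user_per_sec:,
                       requests_per_server_per_sec:, headroom: 1.5)
  peak_rps = users * peak_active_fraction * requests_per_user_per_sec
  servers = (peak_rps * headroom / requests_per_server_per_sec).ceil
  [servers, 1].max # always keep at least one server
end

# Compare a pessimistic and an optimistic growth curve over one year.
{ "worst case" => 0.02, "best case" => 0.20 }.each do |label, growth|
  users = projected_users(10_000, growth, 12)
  servers = app_servers_needed(users,
                               peak_active_fraction: 0.25,
                               requests_per_user_per_sec: 0.5,
                               requests_per_server_per_sec: 100)
  puts "#{label}: #{users} users, #{servers} app servers"
end
```

Running both ends of the range shows how far apart the two cluster sizes are, which is usually what drives the decision about how much room to leave for scaling.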
With baseline user numbers in hand, we can proceed to layering the servers. We’ll generally use three kinds of servers in a cluster: application servers for user-facing software, database servers for storage, and utility servers for software that needs to run behind the scenes. We write Chef scripts to install and configure our servers, and Capistrano scripts to automate deployment and rollback. This makes pushing a build or a fix a single-step process. For our largest clusters, we use the Jenkins continuous integration server to keep our deployed systems in step with our code paths in GitHub.
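Single-step deployment and rollback is cheap largely because of how releases are laid out on disk: each deploy goes into its own timestamped directory, and a `current` symlink points at the active one. The sketch below imitates that layout in plain Ruby. It is not Capistrano's code or API; the `ReleaseManager` class and its method names are hypothetical.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for a symlink-based deployer (not Capistrano's API).
class ReleaseManager
  def initialize(root)
    @releases = File.join(root, "releases")
    @current  = File.join(root, "current")
    FileUtils.mkdir_p(@releases)
  end

  # Create a new release directory and point `current` at it.
  def deploy(name)
    dir = File.join(@releases, name)
    FileUtils.mkdir_p(dir)
    link(dir)
    dir
  end

  # Point `current` back at the release just before the active one.
  def rollback
    names = Dir.children(@releases).sort
    index = names.index(current_release)
    raise "no earlier release to roll back to" if index.nil? || index.zero?
    link(File.join(@releases, names[index - 1]))
  end

  # Name of the release that `current` points at, or nil before any deploy.
  def current_release
    File.symlink?(@current) ? File.basename(File.readlink(@current)) : nil
  end

  private

  # Build the new symlink beside the old one, then rename it into place,
  # so `current` never points at a half-finished state.
  def link(dir)
    tmp = "#{@current}.tmp"
    File.unlink(tmp) if File.symlink?(tmp)
    File.symlink(dir, tmp)
    File.rename(tmp, @current)
  end
end

Dir.mktmpdir do |root|
  manager = ReleaseManager.new(root)
  manager.deploy("20120101120000")
  manager.deploy("20120102120000")
  manager.rollback
  puts manager.current_release
end
```

Because a rollback only moves a symlink, reverting a bad build takes about as long as deploying a good one.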
Our code paths are laid out in the standard trio: master, staging, and production, using feature branches for development. When code is feature-complete, tested, and reviewed, it’s merged into master for internal preview. Once we’ve signed off, it goes to staging for a customer-facing preview, then on to production when everyone is satisfied. The staging cluster will generally be a pared-down version of the production cluster: identical in architecture, but sized smaller because it does not have to handle active user load.
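The promotion rules in this workflow can be written down as a small table of gates. The sketch below is only an illustration: the branch order and the checks come from the description above, while the `GATES` constant, the check names, and the `furthest_branch` method are hypothetical.

```ruby
# Checks a change must pass before it may be merged into each branch, in order.
# Branches and checks follow the text; the structure itself is hypothetical.
GATES = {
  "master"     => %i[feature_complete tested reviewed],
  "staging"    => %i[internal_signoff],
  "production" => %i[customer_signoff]
}.freeze

# Return the furthest branch a change may reach given the checks it has
# passed, or nil if it has to stay on its feature branch.
def furthest_branch(passed_checks)
  reached = nil
  GATES.each do |branch, required|
    break unless (required - passed_checks).empty?
    reached = branch
  end
  reached
end

puts furthest_branch(%i[feature_complete tested reviewed]).inspect
puts furthest_branch(%i[feature_complete tested reviewed internal_signoff]).inspect
```

The gates are evaluated in order, so a customer sign-off cannot move code to production if the internal sign-off is missing.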
The architecture of the application servers is generally one or more load balancers, an app master, and one or more app slaves. These servers run nothing other than the latest Ruby interpreter, the Rails framework, and daemons for the app, web, and caching systems. For web serving, we’ll generally go with nginx, and on modern Rails apps we tend to use Unicorn rather than Passenger due to better performance under concurrent load (Unicorn runs a pool of forked worker processes rather than threads). Finally, for caching and proxying, we’ll generally go with Varnish, Redis, or HAProxy, depending on what’s being cached where in the application.
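To make the load balancer's role concrete, here is a minimal round-robin sketch in plain Ruby. In a real cluster this job belongs to dedicated load-balancing software, not application code; the `RoundRobinBalancer` class and the backend names are hypothetical, and round-robin is just one simple strategy chosen for illustration.

```ruby
# Hypothetical round-robin balancer: hands out backends in rotation.
class RoundRobinBalancer
  def initialize(backends)
    raise ArgumentError, "need at least one backend" if backends.empty?
    @backends = backends
    @index = 0
  end

  # Return the backend that should receive the next request.
  def pick
    backend = @backends[@index]
    @index = (@index + 1) % @backends.size
    backend
  end
end

balancer = RoundRobinBalancer.new(%w[app-master app-slave-1 app-slave-2])
4.times { puts balancer.pick }
```

With three backends, the fourth request wraps around to the first server again, so adding app slaves spreads the same traffic over more machines.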
The application servers process the user’s requests, but we still need a place to store data. Most of our large clusters use both relational and non-relational (“NoSQL”) data persistence strategies. On the relational side, we prefer PostgreSQL over MySQL, and on the non-relational side, we’ll most often use MongoDB as a document database and Redis as a data structure server. Each database is deployed across at least two, and generally three, servers, depending on its unique architectural needs, to ensure uptime and data consistency through network outages and load spikes.
Finally, where there are calculations that need to happen behind the scenes, we’ll deploy utility instances to handle that load. Mike Perham from our client The Clymb wrote and released a great tool called Sidekiq for handling background tasks with Ruby. These tasks could be calculating ranking data, running geographic indexes, or sending out emails, but none of them are tasks the user should have to wait on. Keeping these tasks out of the main app cluster keeps the application responsive and fast.
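The idea behind background processing can be shown with nothing but Ruby's standard library. The sketch below is an in-process stand-in, not Sidekiq's API: the `BackgroundWorker` class and the job contents are hypothetical, and a real deployment would run jobs on separate utility servers backed by a persistent queue.

```ruby
require "thread"

# Hypothetical in-process worker: runs queued jobs on a separate thread.
class BackgroundWorker
  def initialize
    @jobs = Queue.new
    @thread = Thread.new do
      # Run jobs in order until the :stop sentinel arrives.
      while (job = @jobs.pop) != :stop
        job.call
      end
    end
  end

  # Queue a block of work and return immediately, so the caller never waits.
  def enqueue(&job)
    @jobs << job
    nil
  end

  # Let the queued jobs finish, then stop the worker thread.
  def shutdown
    @jobs << :stop
    @thread.join
  end
end

done = []
worker = BackgroundWorker.new
worker.enqueue { done << "welcome email sent" }
worker.enqueue { done << "rankings recalculated" }
worker.shutdown
puts done.inspect
```

The request path only pays for `enqueue`. The slow work happens elsewhere, which is the same reason for moving it onto utility instances.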
Fast, stable applications are important to keep users happy, and they come from talented designers and engineers using the best tools and processes, customized to the digital product being built. Happy clusters lead to delighted users, one of the goals of the COPIOUS Engineering team.