Archive for the ‘ragenet’ Category

Sizing your infrastructure before launch

Wednesday, March 12th, 2008

So you’ve got a web app. How do you decide how many servers to deploy? Even if you are still in development and don’t have a single outside user, you can make an informed decision on how big to build and what your future network infrastructure will look like.

By gathering some data and doing a little load testing you can launch a new application confident in the fact that you know how many users your application will support.

I will outline the process you can use to size your infrastructure. I’ll be discussing it in the context of a web-based application but these methods can be applied to other types of applications. At my last client, Avvenu, half the network communication was not HTTP based and I used these methods to scale it regardless.

At the end of this process you’ll have a spreadsheet where you’ll be able to plug in arbitrary numbers and get out the scaling information you need. If bizdev asks “what happens if we close this deal and double our user base?” or if engineering finds a way to increase server performance by 100% you’ll be able to quickly answer what the impact on your network would be.

Understanding your usage

The first step in building the scaling model is to understand how your users use the system. There is a series of questions you’ll need to answer to get an idea of what that usage looks like.

First you’ll need to know how many active users to expect in the future. This data often comes from your marketing department.

The data is usually presented something like – in one month we’ll have X active users, in two months we’ll have Y, in three months we’ll have Z. You’ll need all these for your scaling spreadsheet.

Next you’ll need to find out how the typical user either uses the site (for existing sites) or is expected to use the site (for new sites). You’ll want this data in a given time period, such as per week. Some examples of what you’ll want to know are:

  • How many times a week does he visit?
  • When he visits, what does he do?
      • Downloads a large file?
      • Looks at pages that require a large amount of processing? How many times, and which ones?
      • Looks at images that are dynamically created?
      • Looks at static pages?
      • Uploads data?

How much data do you have to maintain per user? This includes files, database rows, or in some applications constantly open connections. This also has to be accounted for in your scaling model.

For an existing application you’ll be able to mine your access logs. Always keep and archive these logs when at all possible; they come in handy when you need usage pattern data. Throw together some scripts to extract the answers from your access logs.
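
As one way to throw such a script together, here is a minimal Ruby sketch that tallies requests per client and per path. It assumes your logs are in the Common Log Format and that the client IP is a good-enough stand-in for a user; the `summarize` method and the `CLF` pattern are illustrative names, not part of any particular tool.

```ruby
# Matches one Common Log Format line and captures:
# 1 = client, 2 = method, 3 = path, 4 = status, 5 = bytes sent (or "-").
CLF = /\A(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) (\d+|-)/

# Tally requests per client and per path, plus total bytes sent.
# Keying on the client IP is an assumption; use a session or login ID
# instead if your logs carry one.
def summarize(lines)
  by_client = Hash.new(0)
  by_path   = Hash.new(0)
  bytes     = 0
  lines.each do |line|
    m = CLF.match(line) or next   # skip lines that do not parse
    by_client[m[1]] += 1
    by_path[m[3]]   += 1
    bytes += m[5].to_i            # "-" converts to 0
  end
  { clients: by_client, paths: by_path, bytes: bytes }
end
```

Calling `summarize(File.foreach("access.log"))` on a week of logs gives you the raw counts; dividing the totals by the number of distinct clients gives a first cut at per-user usage.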

For new sites put together a detailed but not overly technical questionnaire for your product manager. The answers from the questionnaire can be used to model typical visitor usage patterns.

One final note on usage patterns: you’ll find that some users look at a few pages every couple of months, while others integrate your site into their daily routine. You’ll need to find the /average/ across all your active users.

Distilling the estimated traffic

Now you know how many users you will have and how active each user is, so you can determine how many requests your service will have to handle. Multiply the number of users by the number of operations per user, then divide by the number of seconds in your time period (e.g. 604,800 for a week) to find the average number of operations you’ll have to perform per second.

An important note when sizing your bandwidth: file sizes are measured in BYTES and bandwidth in BITS. Multiply all file sizes by 8 to find the number of bits they will be when crossing Ethernet.
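
The arithmetic above can be sketched in a few lines of Ruby. The method names and the sample figures (100,000 users, 50 page views and 2 MB of transfer per user per week) are made up for illustration; plug in your own numbers.

```ruby
SECONDS_PER_WEEK = 7 * 24 * 60 * 60   # 604,800

# Average operations per second for the whole user base.
def avg_ops_per_second(active_users, ops_per_user_per_week)
  active_users * ops_per_user_per_week / SECONDS_PER_WEEK.to_f
end

# Average bandwidth in bits per second. File sizes are in bytes,
# so multiply by 8 to get bits.
def avg_bits_per_second(active_users, bytes_per_user_per_week)
  active_users * bytes_per_user_per_week * 8 / SECONDS_PER_WEEK.to_f
end

# Hypothetical inputs, for illustration only.
ops  = avg_ops_per_second(100_000, 50)
bits = avg_bits_per_second(100_000, 2_000_000)
puts format("%.1f requests/sec, %.2f Mbit/sec", ops, bits / 1_000_000)
```

With these made-up inputs the average works out to roughly 8 requests per second and a little over 2.6 Mbit/sec. Both are averages, not peaks.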

Load Testing

Once you’ve determined what your average user will do, you’ll need to automate that behavior for load testing. Typically you’ll set up a load testing cluster, or just test against your pre-production or development environment during off hours. Make sure the load-generating machines that run your load testing scripts do not become your bottleneck. In this phase it is very useful to be running server monitoring and graphing software like Nagios and Cacti. Make sure your server graphing captures CPU, disk, memory, network, and process utilization so that you can identify which machines bottleneck and which parts of those machines have to be scaled. Sometimes you’ll think an application should bottleneck on CPU and find it bottlenecks on memory. Knowing this helps you make informed purchasing decisions when you buy new machines for your production environment.

You can set up scripts and use tools such as ab (ApacheBench) to throw traffic at your servers and determine the number of operations per second your servers can handle. You’ll have to try to isolate each class of machine (DB, HTTP, etc.) and determine its maximum load. With unlimited resources you could load test a single web server to determine its limits, then throw 100 load testers against 100 web servers to find your DB’s load limits. For most of us this is impractical, so you may have to be clever: profile the database traffic generated by the web server load testing, then create a script to drive simulated load at your DB server directly.
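
If you script your ab runs, a small helper can pull the throughput figure out of each report so it can go straight into the spreadsheet. This is only a sketch: it relies on the "Requests per second" line that ab prints in its summary, and `requests_per_second` is a name chosen here for illustration.

```ruby
# Pull the mean throughput out of the text report that ab prints.
# Returns nil when the report has no "Requests per second" line,
# for example when the run failed before completing.
def requests_per_second(ab_report)
  match = ab_report.match(/^Requests per second:\s+([\d.]+)/)
  match && match[1].to_f
end
```

A run might capture the report with backticks, for example `requests_per_second(%x(ab -n 10000 -c 50 http://staging.example.com/))`, where the host name, request count and concurrency are placeholders for your own test target.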

It is important in this step to discover any horizontal scaling issues. If you find that adding new servers does NOT increase your capacity as you expect, then you’ll need to work with your software engineering team to fix the scaling problems, or warn management that there is a likely hard limit of X users the system will support.
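
One simple way to spot such a problem is to compare the throughput you measure with N servers against N times the single-server figure. The sketch below does that; the method name and the idea of expressing the result as an efficiency ratio are choices made here, not a standard metric.

```ruby
# measurements maps a server count to the requests per second measured
# for the whole pool at that size. A single-server measurement (key 1)
# is required as the baseline.
# Returns [server_count, efficiency] pairs, where 1.0 means perfectly
# linear scaling and lower values mean each added server buys less.
def scaling_efficiency(measurements)
  baseline = measurements.fetch(1).to_f
  measurements.sort.map do |servers, rps|
    [servers, rps / (baseline * servers)]
  end
end
```

If, say, one server handles 400 requests per second but four together handle only 1,300, the efficiency at four servers is about 0.81, which suggests a shared bottleneck worth chasing before you buy more hardware.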

Peak vs. Average usage

You will need to determine the peak usage hour(s) of your service and how these relate to your average usage.

I have found that your peak usage will typically be double your average usage. If you have no other data then go ahead and size for that.

If you are sizing an existing application, you can read your ratio of peak to average usage straight from your log data.
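
Given request counts bucketed by hour (however you extract them from your logs), the ratio is a one-liner. `peak_to_average` is an illustrative name, and hourly buckets are an assumption; use whatever granularity matches your peak.

```ruby
# hourly_counts: the number of requests seen in each hour of the
# sample period, e.g. 168 entries for one week.
# Returns the busiest hour divided by the average hour.
def peak_to_average(hourly_counts)
  average = hourly_counts.sum / hourly_counts.size.to_f
  hourly_counts.max / average
end
```

The result is the peak/avg multiplier used in the sizing formula; a value near 2.0 would match the rule of thumb above.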

Building the Spreadsheet

TOTAL        (users * per-user-usage / time-period-in-seconds) * peak/avg
REQUIRED  =  --------------------------------------------------------------
SERVERS           benchmarked-requests-per-second-per-server

Do this for each class of server: web servers, app servers, DB servers, etc. Then make a column for each month of growth. Make your formula round up the number of servers; you can’t deploy 2.3333333 servers, can you?
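
The same calculation, written as a Ruby sketch with one row per month of growth. Every number here (the user projections, 200 requests per user per week, a 2x peak ratio, 15 requests per second per server) is hypothetical and stands in for your own marketing and benchmark data.

```ruby
SECONDS_PER_WEEK = 7 * 24 * 60 * 60

# Servers needed for one class of machine, rounded up to a whole server.
def required_servers(users, ops_per_user_per_week, peak_ratio, rps_per_server)
  peak_rps = users * ops_per_user_per_week / SECONDS_PER_WEEK.to_f * peak_ratio
  (peak_rps / rps_per_server).ceil
end

# Hypothetical growth projections, one entry per month.
projections = { "month 1" => 50_000, "month 2" => 120_000, "month 3" => 300_000 }

projections.each do |month, users|
  puts "#{month}: #{required_servers(users, 200, 2.0, 15)} web servers"
end
```

Repeat the call with each server class’s own usage and benchmark figures to fill in the other rows of the spreadsheet.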

Often I’ll break this down into the number of active users each server can support. I can then divide the number of projected users by that figure to get the number of required servers.

USERS          benchmarked-requests-per-second-per-server
PER       =  -------------------------------------------------------
SERVER       (per-user-usage / time-period-in-seconds) * peak/avg

TOTAL                 USERS
REQUIRED  =  ---------------------
SERVERS         USERS PER SERVER
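
Expressed as a Ruby sketch, the users-per-server form makes "what if" questions quick to answer. The method names and the sample figures are illustrative assumptions.

```ruby
SECONDS_PER_WEEK = 7 * 24 * 60 * 60

# Active users one server can carry at peak. This is the benchmark
# figure divided by the per-user peak request rate, rearranged so the
# division happens last.
def users_per_server(rps_per_server, ops_per_user_per_week, peak_ratio)
  (rps_per_server * SECONDS_PER_WEEK / (ops_per_user_per_week * peak_ratio.to_f)).floor
end

# Servers needed for a projected user count, rounded up.
def servers_for(users, capacity_per_server)
  (users / capacity_per_server.to_f).ceil
end

# Hypothetical: 15 requests/sec per server, 200 requests per user
# per week, peak at twice the average.
capacity = users_per_server(15, 200, 2.0)
puts "#{capacity} users per server"
puts "#{servers_for(50_000, capacity)} servers now, " \
     "#{servers_for(100_000, capacity)} if the user base doubles"
```

Rounding the capacity down and the server count up keeps both estimates on the conservative side.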

Your total server numbers can drive other parts of the spreadsheet as well. Every so many servers you’ll need a new Ethernet switch, another rack at the colo, and perhaps increased headcount (try to reduce this by automating as much as possible!).

Make sure your spreadsheet also accounts for the amount of static data you have to maintain per user. For example, how many file servers will you need for the files your users upload? How many users will the disks on your DB server support?
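
The storage side follows the same pattern as the request side. In this sketch the method name, the 50 MB per user and the 2 TB of usable disk per file server are all made-up figures for illustration.

```ruby
MB = 1024**2
TB = 1024**4

# File servers needed to hold per-user data, rounded up.
# usable_bytes_per_server should be what is left after RAID,
# filesystem overhead and whatever headroom you want to keep.
def storage_servers(users, bytes_per_user, usable_bytes_per_server)
  (users * bytes_per_user / usable_bytes_per_server.to_f).ceil
end

puts storage_servers(300_000, 50 * MB, 2 * TB)
```

Turning the formula around (usable bytes divided by bytes per user) tells you how many users a single server’s disks will support.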

Your model should also determine the maximum network traffic at peak times so that you’ll understand when you’ll need to order more bandwidth from your connectivity provider or will need bigger routers and load balancers.

In Conclusion

Using this process has allowed me to help size networks for many internet startups and kept my network operations groups from being caught with their pants down. Determining your scalability and using this data to anticipate required infrastructure growth will help you and the rest of your organization have confidence going forward with a growing userbase.

RoR: Testing with simple_captcha & HTTP-Auth

Saturday, February 9th, 2008

While developing a small Ruby on Rails application for The Pilot’s Camping Directory website I ran into a few problems that weren’t solved by a simple Google search, so I’m documenting them here for posterity and future googling. I had problems with testing when using some security features to keep out riff-raff. It was not obvious how to handle simple_captcha or simple_http_auth while testing, so I scratched around the net and pieced together a solution for each problem. These work with Rails 1.2. With Rails 2.0 YMMV, but then 2.0 breaks every Rails tutorial ever written so I don’t feel bad if this blows up in 2.0.

Using Mocks for testing with simple_captcha

Tests will fail when trying to save something protected by a captcha, obviously, as stopping automated lever-pulling is exactly what a captcha is designed to do. In my application I use the captcha at the model level, so I simply override the save_with_captcha method with a simple save.

Here’s what my mocks/test/recipient.rb looks like:

# Can't fake captcha for testing - so we mock it out.
require_dependency 'models/recipient'
class Recipient < ActiveRecord::Base
  # Defined as an instance method, since save_with_captcha is called on a record.
  def save_with_captcha
    save
  end
end

Functional Testing HTTP-Auth

To test HTTP Authorization / Authentication you must set up your request environment to pass the http authorization into the application. This is known to work with the simple_http_auth plugin, the plugin that I used for my application. Specify this in the setup section of your functional test.

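def setup
  # The right-hand sides were lost from this post; these are the usual Rails 1.2 values.
  @controller = YourController.new   # placeholder: the controller under test
  @request    = ActionController::TestRequest.new
  @response   = ActionController::TestResponse.new
  # encode64 appends a newline, so strip it before using it as a header value.
  @request.env['HTTP_AUTHORIZATION'] = "Basic " + Base64.encode64(ADMIN_USER + ':' + ADMIN_PASSWORD).strip
end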

Integration Testing HTTP-Auth

Integration testing simulates making requests directly to the webserver. To work with http authorization here you must pass in the appropriate authentication headers when making each get/post request. An example is below:

@htauth = "Basic " + Base64.encode64(ADMIN_USER+':' + ADMIN_PASSWORD )
get("/supersecret/index", nil , {:authorization => @htauth})

Sharpening the saw, html and graphics.

Wednesday, January 16th, 2008

In my off-season (winter) I am usually traveling internationally, mostly to places that are sunnier and warmer than the San Francisco Bay Area. It’s often the perfect time for me to sharpen my various skills, being unconstrained by the usual grand infrastructure projects I do in the summer.

It’s often at these times that I brush up on my HTML, coding, and graphics skills. Wi-Fi here in Puerto Vallarta has become much more ubiquitous and reliable, so I’ve got connectivity almost as good as back in SF. I’ve been diving back into apps like GIMP, Aptana, and Inkscape.

I also enjoy catching up on the avant-garde of web artistry and seeing what people are creating with HTML and CSS. I appreciate simple designs, so I really enjoyed the sites on display at the link below:

25 Beautiful, Minimalistic Website Designs – Part 2 | Vandelay Website Design


Home Fabrication

Monday, May 21st, 2007

This weekend I went to the Maker Faire here in Silicon Valley, put on by Make Magazine. Make is geared towards folks who enjoy making things with their own hands, inventing and creating instead of simply consuming what’s available at the store.

The most amazing technology at the fair was the home fabrication / 3D printer technology. There were several units there, but one caught my eye. The Fab @ Home unit, designed for hobbyists, is a unit that can be assembled for just two thousand dollars in parts and a weekend of work.

This will prove to be one of the most disruptive technologies to come along. Home fabrication will make the copyright issues with MP3s look like a cakewalk once you can print your own furniture, clothing, and other housewares just by downloading designs from your friends.

Yahoo Pipes, a very neat app!

Thursday, February 8th, 2007

Well, Web 2.0, for me, has a lot to do with making data available in an agnostic manner, whether that be via RSS or via a web services API. Data tied to a presentation layer, such as a traditional website, is data that has no future outside that website. The rise of mash-ups is enabled by data being decoupled from its presentation. Being combined with other data makes that data more valuable.

Until now you’ve needed to be a reasonably adept programmer to put together different data sources to create mash-ups. But not now. Yahoo has just launched an application that allows anyone with the most rudimentary conceptual knowledge of programming to create new mash-ups.

Yahoo Pipes is the new application, and it allows anyone to easily string together web data sources and funnel them through some rudimentary filters to create new mash-ups. Yahoo has been a bit absent with the whole innovation thing since Google became the industry’s darling, but I think this marks their comeback in a big way.

There is a good series of articles on the O’Reilly Radar about why it’s important and how it works. TechCrunch has a good mention about Yahoo! Launching Pipes, and there’s a nice bit about it from Yahoo MySQL guru Jeremy Zawodny.

The excitement about this product is very high in the tech community, resulting in someone as big as Yahoo having their new service overwhelmed. So be patient when trying it out until they’ve got some new servers spun up!


Self-healing networks

Thursday, February 1st, 2007

Last year I wrote an article on building a self-healing network with off-the-shelf software components. If you are responsible for managing a large UNIX/Linux network it’s a must-read…

An excerpt from the article:

Computer immunology is a hot topic in system administration. Wouldn’t it be great to have our servers solve their own problems? System administrators would be free to work proactively, rather than reactively, to improve the quality of the network.

This is a noble goal, but few solutions have made it out of the lab and into the real world. Most real-world environments automate service monitoring, then notify a human to repair any detected fault. Other sites invest a large amount of time creating and maintaining a custom patchwork of scripts for detecting and repairing frequently recurring faults. This article demonstrates how to build a self-healing network infrastructure using mature open source software components that are widely used by system administrators. These components are NAGIOS and Cfengine. — Building a Self-Healing Network


Project Management Tools.

Tuesday, January 30th, 2007

I’ve been trying to find some project management software lately that’s compatible with the Getting Real development methodology. For our new ‘Locomotive’ project I’d love to find a tool where I can set out all the required tasks and assign an hour value to each, to create a timeline for how long each part of the project will take. This should help me decide if parts of the project should be trimmed. It also jibes with the “Getting Real” suggestion that tasks be broken into 4-hour chunks.

So I went off into the web (version 2.0) to evaluate several web-based project management systems. The first one I tried was Basecamp, the 37signals offering. I found there’s no time management available in the base product. So I continued on and checked out Zoho Projects. This was also missing the functionality I required. Finally I tried out Devshop. This had really great time management tools, and I would recommend it for medium-sized team development projects. The screencasts were very impressive. However, it didn’t quite do what I wanted, as the smallest unit of time it supported was 1 day. So for now it looks like I’m back to the web 0.0 pen-and-paper method, or perhaps an Excel spreadsheet. If you have any project management suggestions please leave them in the comments!