Alen Peacock and Dr. J. Kelly Flanagan, Computer Science
Popular Internet web sites experience a common problem: as the number of people trying to access a site grows, the wait for page downloads for each of those users grows accordingly. In extreme cases, the site becomes completely inaccessible as the server refuses connections over a certain threshold (1).
In order to reduce the effects of heavy load on web servers, we must pinpoint the areas of the system that slow it down, areas commonly called bottlenecks. A typical server consists of a number of components, which can be divided into two categories: the physical (hardware) and the virtual (software). Bottleneck candidates among the hardware components include network cards, hard drives, memory, and CPUs. Among software components, the low-level networking routines in the operating system, the operating system itself, and the web server software are good targets when searching for bottlenecks.
The first step in analyzing these bottlenecks is to imitate the conditions of heavy load. To this end, I have spent the past months developing a tool I call “WebBeater.” WebBeater consists of two applications. The first, WBMaster, runs on a single computer (Fig. 1). The other, WBClient, runs on multiple computers simultaneously. The WBClients have a simple job: to bombard a specified web server with page requests, collecting statistics on response time, connect time, errors, and throughput. It is the job of WBMaster to control the WBClients, starting and stopping their execution and collecting and combining their data. The tool is nearing completion but is still unfinished; additional features are slated, including the ability to imitate the standard web server workloads defined by the SPEC group (2).
In its current state, WebBeater is capable of performing simple tests and collecting data useful for discovering bottlenecks. In the following example, two machines are compared. Both run the same operating system and server software, so the test tells us something about how the hardware differs in performance. The most notable difference between the two machines is the number of CPUs: two in the machine named Macaque, four in the machine named Gibbon. Some results are shown in the accompanying graphs (Figures 3 and 4).
From these graphs we can see that the four-CPU machine, Gibbon, had better throughput on average as well as slightly quicker response times, but the differences in both cases were small. The test set of downloads for this experiment consisted of simple static documents. Requesting more complex documents that require processing by the server results in a larger performance gap between Gibbon and Macaque. Similar experiments are used to determine exactly which components slow the system most, so that strategies can be devised to remove these bottlenecks. For continually updated information on this project, see http://pel.cs.byu.edu/
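The per-client numbers behind graphs like these must be merged by the master before they can be plotted. A minimal sketch of such a combining step follows; the field names and units are assumptions, since the article does not specify WBMaster's data format.

```python
# Hypothetical sketch of how WBMaster might combine per-client results.
# Field names and units are assumptions; the article gives no data format.
from statistics import mean


def combine(client_stats):
    """Merge the statistics reported by each WBClient into one summary.

    Throughputs add across concurrent clients; response times are
    summarized by their mean and worst case; errors are totaled.
    """
    return {
        "clients": len(client_stats),
        "total_throughput_MBps": sum(c["throughput_MBps"] for c in client_stats),
        "mean_response_ms": mean(c["mean_response_ms"] for c in client_stats),
        "max_response_ms": max(c["max_response_ms"] for c in client_stats),
        "errors": sum(c["errors"] for c in client_stats),
    }


# Example: two clients hammering the same server concurrently.
summary = combine([
    {"throughput_MBps": 1.4, "mean_response_ms": 35.0,
     "max_response_ms": 120.0, "errors": 0},
    {"throughput_MBps": 1.1, "mean_response_ms": 48.0,
     "max_response_ms": 310.0, "errors": 2},
])
print(summary)
```

Summing throughput while taking the maximum of response times reflects how the two quantities behave under load: bytes delivered accumulate across clients, but a single slow response is what a user actually experiences.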
References
- Lucky, Robert. 1996. Reflections on Technology and Society. Address given at the Marriott Center of Brigham Young University, November 26.

[Figure: Throughput (MB/s) vs. number of clients (2–30), Gibbon and Macaque]
[Figure: Maximum Response Time (ms) vs. number of clients (2–30), Gibbon and Macaque]
- The Standard Performance Evaluation Corporation. 1996. The Workload for the SPECweb96 Benchmark. http://www.specbench.org/osg/web96/workload.html