Published 04/05/1999
For more information, contact:
Duane Wessels, NLANR, 303-497-1822,
wessels@ircache.net
Web Cache Competition:
http://bakeoff.ircache.net/
Web Caching Workshop:
http://workshop.ircache.net/
Report:
http://bakeoff.ircache.net/bakeoff-01/
NLANR:
http://www.nlanr.net/
UNIVERSITY OF CALIFORNIA, SAN DIEGO -- The results of the first Web Cache "Bakeoff" competition were released at the Fourth International Web Caching Workshop in San Diego on April 2. The competition was organized by the IRCACHE team of the National Laboratory for Applied Network Research (NLANR), an independent research and support organization for high-performance networking funded by the National Science Foundation.
Six major sources of Web Cache systems -- IBM, InfoLibria, Network Appliance, Novell (in an OEM agreement with Dell), the University of Wisconsin, and NLANR itself -- participated in the competition, which was open to any company or organization with a Web cache product. Several other vendors, while initially interested, declined to participate in a head-to-head comparison.
"A 'bakeoff' is a test of several similar products that perform similar functions," said NLANR's Duane Wessels, organizer of the competition. "In this case, we evaluated the performance of systems that speed up information retrieval over the Internet -- Web cache servers available from commercial vendors and non-profit organizations."
More than half of the traffic on the Internet backbone is related to the World Wide Web. In the basic client-server transaction model, each Web browser client connects directly to each server, so a single file is often transmitted many times over the same network paths as different clients request it. This mode of operation can cause severe congestion on many of the Internet's wide-area links.
Caching is a way to reduce client-server traffic and improve response time. Instead of connecting directly to servers, clients are configured to connect to an "HTTP proxy server" at the Internet Service Provider (ISP), which requests Web objects from their source servers (or from other caches) and then saves these files for use in response to future requests. Popular objects collect in the caches and might be used many times without reloading from remote sites.
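The idea can be sketched in a few lines of code. The toy cache below (Python, purely illustrative; the class and URLs are invented for this example and it omits the HTTP proxy protocol, expiration, and validation) fetches an object from its origin server on the first request and answers repeat requests from memory:

    import urllib.request

    class TinyObjectCache:
        """Toy illustration of proxy caching: fetch once, reuse many times."""

        def __init__(self):
            self.store = {}      # URL -> cached response body
            self.hits = 0
            self.misses = 0

        def get(self, url):
            if url in self.store:            # hit: no traffic to the origin server
                self.hits += 1
                return self.store[url]
            self.misses += 1                 # miss: fetch from the origin and save
            with urllib.request.urlopen(url) as response:
                body = response.read()
            self.store[url] = body
            return body

    if __name__ == "__main__":
        cache = TinyObjectCache()
        for _ in range(3):                   # three clients ask for the same page
            cache.get("http://example.com/")
        print(cache.hits, "hits,", cache.misses, "miss")   # 2 hits, 1 miss

Only the first request travels to the origin server; every later request for the same object is served locally, which is the traffic and response-time saving described above.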
NLANR operates the IRCACHE Web Caching project under funding from the National Science Foundation's Directorate for Computer and Information Sciences and Engineering. NLANR created and maintains nine high-level Web caches located throughout the United States. These caches are directly connected to approximately 400 other caches, and indirectly to 1100 worldwide. Collectively, the NLANR caches receive approximately 7,000,000 requests per day from the others. The operational system of intermeshed Web caches has been credited with significantly reducing Internet congestion. NLANR also developed and distributes the Squid open source software package, which is widely used in the ISP community.
The Bakeoff Competition
Several Web cache products have entered the marketplace within the past couple of years. Unfortunately, competing performance claims have been difficult to verify, and the metrics behind them haven't meant quite the same thing from one vendor to the next.
Wessels and his colleagues Alex Rousskov and Glenn Chisholm have developed a freely available software package called Web Polygraph. Polygraph simulates Web clients and servers and is becoming a de facto benchmarking standard for the Web caching industry. The package is designed to give Web caches a workout: a single client-server pair generates about 1,000 requests per second on a 100baseT network, and the workload can specify such important parameters as hit ratio, response sizes, and server-side delays.
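As a rough illustration of how such workload parameters shape a benchmark run, the sketch below (Python; it is not Web Polygraph, and every name and number is invented for this example) drives a toy workload with a configurable target hit ratio and a simulated server-side delay, then reports the measured hit ratio and mean response time:

    import random

    def simulate_run(requests=1000, target_hit_ratio=0.55, server_delay_s=0.040):
        """Toy workload generator: revisit old URLs to approximate a hit ratio."""
        seen = []                 # URLs requested at least once (hits on revisit)
        hits = 0
        total_time = 0.0
        for i in range(requests):
            if seen and random.random() < target_hit_ratio:
                random.choice(seen)          # repeat request: counts as a cache hit
                hits += 1
                total_time += 0.002          # assume a cached reply is nearly instant
            else:
                seen.append(f"http://server.test/obj/{i}")   # new URL: a guaranteed miss
                total_time += server_delay_s                 # a miss pays the origin delay
        return hits / requests, total_time / requests

    if __name__ == "__main__":
        hit_ratio, mean_rt = simulate_run()
        print(f"hit ratio {hit_ratio:.2f}, mean response time {mean_rt * 1000:.1f} ms")

Even in this toy version, small changes to the hit ratio or the server-side delay shift the measured response time noticeably, which is why the bakeoff fixed these parameters identically for every participant.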
"To ensure a valid comparison, every product was tested under identical conditions within a short period of time and at a single location," Wessels explained. The Web Cache Bakeoff competition was held March 15 through 17 in Redwood City, CA, in an industrial space donated for the occasion by Paul Vixie of Vixie Enterprises ( http://www.vix.com/vix/).
Seemingly minor variations in workload parameters or system configuration can markedly affect performance. "In our report, we specify as much detail as possible about our benchmarking environment so the results of our tests can be reliably reproduced by others," Wessels said.
The bakeoff took place over three days. The first day was used for testing the network and computer systems, and the next two days were dedicated to running the benchmark. In principle, all benchmarking could have been finished by the end of day two, so the third day served as a safety net. Participants also had the option of repeating some runs if necessary.
Each vendor was allowed to bring more than one product to the bakeoff; each tested product was considered an independent participant, with a separate benchmarking harness (bench) for every participant. More than 80 Compaq Pentium II computers were rented for use as Polygraph clients and servers in the bakeoff.
IBM, InfoLibria (two entries), Network Appliance (two entries), Novell (in an OEM agreement with Dell, two entries), the University of Wisconsin, and NLANR all entered products. The precise parameters of the test were arrived at by discussion and mutual agreement among the competitors and researchers.
"The bakeoff set a high standard both for design and execution, and for the cache robustness required for completion," said Abdelsalam A. Heddaya, InfoLibria's VP of Research and Architecture. "Because it was the first truly independent benchmark of network caches, we believe it will be of tremendous value to the industry."
CacheFlow, Cisco, Entera, and Inktomi had expressed strong interest in the bakeoff, but eventually decided not to participate. In addition, IBM and Network Appliance decided not to disclose the results of their trials after the bakeoff, a "bail-out" option previously agreed upon by the competitors and testers.
"Certainly we are disappointed by their choice," Wessels said. "We feel that benchmarking results are more useful when there are more results to compare. At the same time, we take it as a compliment that our benchmark was taken very seriously -- by those who competed, and by those who didn't."
The Competitors
The Results
The results of the Web Cache Bakeoff were released on April 2 at the Fourth International Web Caching Workshop in San Diego. The conference was organized by NLANR and CAIDA (the Cooperative Association for Internet Data Analysis). Detailed information about the competition and the formal report of its results can be found at http://bakeoff.ircache.net/bakeoff-01/. There is no single absolute measure of performance for all situations -- some customers will place the highest value on throughput, while others emphasize bandwidth or response time savings. Maximizing cache hit ratio is essential at many Web cache installations. For some sites price is important, for others it's the price/performance ratio.
IBM and Network Appliance declined to make their results public.
Detailed results of each test, with graphs and analyses, are contained in the report at http://bakeoff.ircache.net/bakeoff-01/.
"We strongly caution against drawing hasty conclusions from these benchmarking results," Wessels said. "Since the tested caches differ a lot, it is tempting to draw conclusions about participants based on a single performance graph or pricing table. We believe such conclusions will virtually always be wrong."
"Our report contains a lot of performance numbers and configuration information; take advantage of it," he continued. "Compare several performance factors: throughput, response time, and hit ratio, and weigh each factor based on your preferences. Don't overlook pricing information and price/performance analysis. And always read the Polyteam and Participant Comments sections in the Appendices.
"Our benchmark addresses only the performance aspects of Web cache products. Any given cache will have numerous features that are not addressed here. For example, we think that manageability, reliability, and correctness are very important attributes that should be considered in any buying decisions."
Most customers also have to consider price in their decision making process. The report summarizes the pricing of the participating products and gives detailed product configurations. Note that these costs represent list prices of the equipment only. In reality, there are many additional costs of owning and operating a Web cache. These may include software/technical support, power and cooling requirements, rack/floor space, etc. A more thorough cost analysis might try to determine something like a two-year cost of ownership figure.
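As a back-of-the-envelope sketch of such an analysis (Python; every figure below is a made-up placeholder rather than a number from the report), one might combine list price, support, rack space, and power into a rough two-year estimate:

    def price_performance(list_price_usd, throughput_req_per_s):
        # dollars paid per request-per-second of sustained throughput
        return list_price_usd / throughput_req_per_s

    def two_year_cost(list_price_usd, annual_support_usd, power_watts,
                      usd_per_kwh=0.10, annual_rack_usd=600):
        # energy over two years: watts -> kilowatt-hours, running 24 hours x 365 days
        energy_usd = power_watts / 1000 * 24 * 365 * 2 * usd_per_kwh
        return list_price_usd + 2 * (annual_support_usd + annual_rack_usd) + energy_usd

    if __name__ == "__main__":
        # hypothetical appliance: $25,000 list, 800 req/s, $2,000/yr support, 300 W
        print(f"${price_performance(25000, 800):.2f} per req/s")
        print(f"two-year cost of ownership ~ ${two_year_cost(25000, 2000, 300):,.0f}")

In this hypothetical case the support and facility costs add noticeably to the list price, which is the point of the report's caution: list price alone understates what a cache actually costs to own and operate.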
The Competitors' Views
In the interest of fairness, all of the competitors were invited to comment on the results. Participant comments contain forward-looking statements concerning, among other things, future performance results. All such forward-looking statements are, by necessity, only estimates of future results and actual results achieved by the Participant may differ substantially from these statements.
We have received the following information from the participating vendors:
Conclusions
The Web caching industry's thirst for a benchmarking standard led to the creation of the Web Polygraph suite and the launch of a series of IRCACHE bakeoffs. We consider the first bakeoff to be a success. Despite the absence of several big players in the industry, the IRCACHE team collected a representative set of interesting performance data, and prepared the first industry document that provides a fair performance comparison of a variety of caching proxies. We hope the performance numbers and our analysis will be used by buyers and developers of caching products.
The IRCACHE team applauds the vendors who came to the bakeoff and disclosed their results. We regret that other cache vendors did not show the same leadership. We certainly hope that more companies will participate in future benchmarking events and will have the courage to disclose their results.
We expect discussions of the bakeoff and its results to appear, possibly including attempts to denounce bakeoffs in general. We believe that, while not perfect, this first bakeoff's rules and workload give knowledgeable customers a lot of valid, useful, and unique performance data. Future bakeoffs will further improve the quality and variety of our tests. We do not know of a better substitute for a fair same-rules, same-time competition.
Finally, we expect some companies will try to mimic bakeoff experiments in private labs, and we certainly welcome such activities. We trust the reader will be able to separate unsubstantiated speculation and semi-correct bakeoff clones from true performance analysis. If unsure about the validity of vendor tests, consult this report and Polyteam members directly.
The second Web Cache Bakeoff has been tentatively scheduled for six months from now.
---
The National Laboratory for Applied Network Research (NLANR) has as its primary goal to provide technical, engineering, and traffic analysis support for NSF High-Performance Connections sites and high-performance network service providers such as Internet 2, Next Generation Internet, the NSF/MCI vBNS, and STAR TAP. Founded by the National Science Foundation's Computer and Information Science and Engineering Directorate in 1995, NLANR is a "distributed national laboratory" with researchers and engineers at the San Diego Supercomputer Center, the National Center for Supercomputing Applications, the Pittsburgh Supercomputing Center, and the National Center for Atmospheric Research, among other sites. See
http://www.nlanr.net/ for more information.
The Cooperative Association for Internet Data Analysis is a collaborative undertaking among government, industry, and the research community to promote greater cooperation in the engineering and maintenance of a robust, scalable global Internet infrastructure. It is based at the San Diego Supercomputer Center (SDSC) at the University of California, San Diego (UCSD) and includes participation by Internet providers and suppliers, as well as the NSF and the Defense Advanced Research Project Agency (DARPA). CAIDA focuses on the engineering and traffic analysis requirements of the commercial Internet community. Current priorities include the development and deployment of traffic measurement, visualization, and analysis tools and the analysis of Internet traffic data. For more information, see http://www.caida.org/, or contact Tracie Monk, CAIDA, 619-822-0943, tmonk@caida.org.
The San Diego Supercomputer Center (SDSC) is a research unit of the University of California, San Diego, and the leading-edge site of the National Partnership for Advanced Computational Infrastructure (http://www.npaci.edu/). SDSC is sponsored by the National Science Foundation through NPACI and by other federal agencies, the State and University of California, and private organizations. For additional information about SDSC, see http://www.sdsc.edu/, or contact Ann Redelfs at SDSC, 619-534-5032, redelfs@sdsc.edu.