Metrics gathered about system hardware and supporting software, but NOT about application code. For instance, Web server throughput and hits per second when accessing a large graphic can be determined through benchmark testing.
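The throughput figure mentioned above can be derived from a batch of timed requests. A minimal sketch, assuming illustrative request counts and timings rather than real server measurements:

```python
# Sketch: derive Web-server throughput (hits per second) from a timed run.
# The request count and elapsed time below are illustrative placeholders.

def hits_per_second(completed_requests, elapsed_s):
    """Throughput = completed requests / total wall-clock time."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return completed_requests / elapsed_s

# Example: 120 requests for a large graphic completed in 30 seconds.
throughput = hits_per_second(completed_requests=120, elapsed_s=30.0)
print(f"{throughput:.1f} hits/sec")  # 4.0 hits/sec
```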
Compares the performance of new or unknown target-of-test to a known reference standard such as existing software or measurement(s) and industry standards.
A benchmark, or light load, scenario generally uses a small community of users compared to the target load. This community must be large enough to approximate a reasonable sample of the entire user community model while still being significantly smaller than the expected system capacity; 15% of the total expected user load is generally a good benchmark volume. Executing benchmark tests confirms that the testing environment behaves as expected under light load and validates that the scripts have been developed correctly.
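Sizing the benchmark community at roughly 15% of the expected load, as suggested above, can be sketched as follows (the 500-user target load is a made-up example):

```python
import math

# Sketch: size a benchmark (light-load) scenario at ~15% of the expected
# user load. The expected-load figure is an illustrative placeholder.

BENCHMARK_FRACTION = 0.15

def benchmark_user_count(expected_users, fraction=BENCHMARK_FRACTION):
    """Round up so the light-load community is never empty."""
    return max(1, math.ceil(expected_users * fraction))

print(benchmark_user_count(500))  # 75 users for a 500-user target load
```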
Benchmarking compares the performance of a target product to a known reference standard, such as software already in production or industry standards. It is also possible to define a standard for performance with early testing of a new product and then compare all further test results to this standard.
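One way to read "compare all further test results to this standard" is a tolerance check against stored reference numbers. A minimal sketch, where the metric names, values, and 10% tolerance are all assumptions for illustration:

```python
# Sketch: compare new test results to a previously established standard.
# Metric names, values, and the 10% tolerance are illustrative assumptions.

STANDARD = {"response_time_s": 1.2, "throughput_rps": 40.0}

def meets_standard(results, standard=STANDARD, tolerance=0.10):
    """Return the metrics that drift more than `tolerance` from the standard."""
    failures = {}
    for metric, reference in standard.items():
        drift = abs(results[metric] - reference) / reference
        if drift > tolerance:
            failures[metric] = drift
    return failures

# A run whose response time regressed 25% fails; throughput within 5% passes.
print(meets_standard({"response_time_s": 1.5, "throughput_rps": 42.0}))
```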
Benchmarking helps us deploy applications that are up to their task, but there are many opportunities for benchmarking to go wrong. Here are some common 'gotchas' to watch out for when benchmarking:
- Are pre-existing 'benchmarks' applicable? Just because some application, survey, or study reports a particular throughput for a particular transaction, it does not mean that you must meet that throughput in your product. The 'benchmark' may simply not be applicable to your application, clients, or resource levels. Identify your own competing baselines, find out what is achievable, and then determine whether you can reach a similar level of performance.
- Doing it all at once. A product typically comprises a group of tasks or workflows. Avoid trying to benchmark a total system all at once: it will be costly, take more time than you have, and make it difficult to remain focused, especially as the system grows larger and more complex. It is better to select several workflows that form a key part of the total system, work with them first, and then move on to the next part of the system.
- This is not research. Benchmarking assumes that you are working on an existing product that has been in use long enough to have some data about its effectiveness and its performance. Commencing a new product, such as developing a new web application by collecting information about other similar applications and taking ideas or requirements from them, is research, not benchmarking.
- Measuring the 'ilities'. Many non-functional requirements are used to assess the quality of a product, but qualities such as usability or reliability are difficult to measure as a whole. Encourage your benchmarking team instead to select a part of the topic that can be observed and measured (a functional requirement).
- Establish the baseline. Make sure your benchmarking team is very clear about what it wants to learn and about the current state of the product's performance before beginning the benchmarking effort. Benchmarking results for a new application form the baseline for the next iteration of the application. It is important to compare real-life production results with the original findings and update those findings to improve the baseline numbers.
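The last point, where benchmark results become the baseline for the next iteration and are refined with production data, can be sketched as follows (the metric values and the 0.5 smoothing weight are illustrative assumptions):

```python
# Sketch: keep a baseline from benchmark results and refine it with
# real-life production measurements. The smoothing weight (0.5) and
# metric values are illustrative assumptions, not a prescribed method.

class Baseline:
    def __init__(self, initial_metrics):
        # Results of the first benchmark run form the initial baseline.
        self.metrics = dict(initial_metrics)

    def update_from_production(self, production_metrics, weight=0.5):
        """Blend observed production numbers into the stored baseline."""
        for name, observed in production_metrics.items():
            current = self.metrics.get(name, observed)
            self.metrics[name] = (1 - weight) * current + weight * observed

baseline = Baseline({"response_time_s": 2.0})
baseline.update_from_production({"response_time_s": 1.6})
print(baseline.metrics["response_time_s"])  # 1.8
```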
A standard against which measurements or comparisons can be made. Benchmarks can include workloads, baseline systems, and system support environments.
A test in which a benchmark mix of demands is run against the system being tested.
Testing a system by comparing its behavior to another system (the benchmark).
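This last definition, comparing a target system's behavior against a benchmark system, can be sketched as a ratio of mean latencies (the latency samples here are placeholders, not real measurements):

```python
from statistics import mean

# Sketch: compare a target system's latencies against a benchmark system's.
# The sample values are illustrative assumptions.

def latency_ratio(target_latencies, benchmark_latencies):
    """Ratio of mean target latency to mean benchmark latency."""
    return mean(target_latencies) / mean(benchmark_latencies)

benchmark = [0.9, 1.0, 1.1]  # known reference system
target = [1.1, 1.2, 1.3]     # system under test
ratio = latency_ratio(target, benchmark)
print(f"target runs at {ratio:.2f}x the benchmark's mean latency")
```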