Performance Testing

Why do we performance test?

The need for performance and load testing can be linked to the following requirements:

  • Ability to provide stakeholders with information about their application regarding speed, stability and scalability.
  • Ability to analyse the system in a realistic environment so that improvements and enhancements can be made before live deployment.
  • Ability to conform to the following GDS Service Standards on performance and durability:
    • Standard 6: Evaluate tools and systems - the evaluation and use of tools for your system and development environment.
    • Standard 10: Test the end-to-end service - as well as providing evidence of how the system has been designed and changed in response to user research, the NHSBSA needs to provide evidence that the system is technically tested end to end (functional, regression, performance, accessibility and mobile).
    • Standard 15: Collect performance data - performance and load testing contributes to this standard by helping to prove how robust and stable the proposed systems are.

How do we performance test?

We performance test using the open-source Apache JMeter tool, which is available from the JMeter website.

The scope of the performance tests should initially be designed to test the easiest route (the happy path) through the system, to get an initial indication of what the system can handle. The scripts can then be expanded to explore more of the system and attain full coverage.
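As a minimal sketch, a happy-path plan can be run from the command line in JMeter’s non-GUI mode; the plan and output file names below are illustrative:

```sh
# Run a happy-path test plan in non-GUI mode (recommended for load generation).
# -n     non-GUI mode
# -t     the test plan to run
# -l     file for the raw sample results
# -e -o  generate the HTML report dashboard into the given folder
jmeter -n -t happy-path.jmx -l results.jtl -e -o report/
```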

Before commencing any performance testing, the following test criteria need to be defined and clear:

  • Entry criteria (conditions that need to exist or be met before testing can start, e.g. all prerequisites, monitoring and reporting ready).
  • Exit criteria (conditions for concluding testing, e.g. application performance assessed against requirements, faults and issues documented, bottlenecks fixed and performance goals met).
  • Definition of ready (the state, i.e. clear acceptance criteria, that stories or tests need to reach before any testing can commence).
  • Definition of done (the state after testing at which the tester is confident that the test is complete).

What types of performance tests do we do?

Load Tests

A load test is conducted to understand the behaviour of the system under a specified, expected load. This load should be the expected number of concurrent users performing a set number of transactions or activities within a set duration.
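As a sketch, the expected load can be passed in from the command line as JMeter properties, assuming the Thread Group fields are wired up with the __P function; the plan and property names here are illustrative:

```sh
# Expected load passed in as properties, assuming the Thread Group is set to
# ${__P(users,1)} threads, ${__P(rampup,0)} ramp-up and ${__P(loops,1)} loops.
# Here: 50 concurrent users, started over 60 seconds, repeating 3 times each.
jmeter -n -t expected-load.jmx -l load-results.jtl \
  -Jusers=50 -Jrampup=60 -Jloops=3
```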

Soak Tests

This type of performance test verifies a system’s stability and performance characteristics over an extended period of time. It is typical in this type of test to maintain a certain level of user concurrency for an extended period: the user intervals are kept constant while the ramp period and execution cycles are changed to extend the testing period.
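A sketch of extending the same parameterised plan into a soak, assuming the Thread Group scheduler’s duration field reads a duration property:

```sh
# Hold the same 50-user level but extend the run to cover operational hours,
# assuming the Thread Group scheduler duration is set to ${__P(duration,300)}.
# 28800 seconds = an 8-hour soak.
jmeter -n -t expected-load.jmx -l soak-results.jtl \
  -Jusers=50 -Jrampup=60 -Jduration=28800
```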

Stress Tests

This type of test helps to understand the upper limits of capacity within the system. It helps to determine the system’s robustness under extreme load, and helps application administrators to determine whether the system will perform sufficiently if the current load goes well above the expected maximum. This test can often be combined with load tests, as with the LIS Eligibility Checker load tests, where the system limit was found during interval testing beyond 100 users.
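One way to sketch this, re-using the parameterised plan from above, is to step the user count upwards across successive runs until response times or error rates degrade; the increments shown are illustrative:

```sh
# Step the user count upwards run by run to find the upper capacity limit.
for users in 50 100 150 200 250; do
  jmeter -n -t expected-load.jmx -l "stress-${users}-users.jtl" \
    -Jusers="${users}" -Jrampup=60 -Jloops=3
done
```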

Spike Tests

Spike testing is performed by suddenly increasing or decreasing the load generated by a very large number of users, and observing the behaviour of the system. The goal is to determine whether performance will suffer, the system will fail, or it will be able to handle dramatic changes in load.
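As an illustrative sketch using the same parameterised plan, a spike can be approximated by a steady baseline run followed by a sudden jump in users with no ramp-up:

```sh
# Establish a steady baseline, then apply a sudden jump in users with zero
# ramp-up and observe whether the system degrades, fails or copes.
jmeter -n -t expected-load.jmx -l spike-baseline.jtl \
  -Jusers=20 -Jrampup=60 -Jloops=3
jmeter -n -t expected-load.jmx -l spike-peak.jtl \
  -Jusers=200 -Jrampup=0 -Jloops=1
```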

Configuration Tests

These tests are created to determine the effects of configuration changes to the system’s components on its performance and behaviour. A common example would be experimenting with different methods of load balancing. For the cloud solutions the load balancers are controlled by AWS / Arcus, so testers do not have the required authority or permissions to alter these settings; any changes would have to be arranged through platform services.

How do we measure performance testing?

When we performance test, we look at three main criteria that JMeter provides:

  • User Intervals - the number of concurrent users that will access the system under the test conditions.
  • Ramp Periods - the period of time over which those users will start accessing the system. For example, a ramp-up of 0 means all of the above users will access the system at the same time. This setting is used more for soak testing purposes (testing the system over longer periods of time, e.g. operational hours or peak times).
  • Execution Cycles - the number of times the whole test will be repeated. There should be no fewer than 3 execution cycles for each incremental user interval; this allows comparative results to be gained and eliminates the unknown or unusual activity often found in solo runs. A sketch of how these criteria map onto a JMeter run follows this list.
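As a sketch (property and file names illustrative): user intervals correspond to the Thread Group’s thread count, ramp periods to its ramp-up field, and execution cycles to repeated runs at the same interval:

```sh
# Three repeat runs at the same 100-user interval give comparable result
# sets, per the guidance above.
for run in 1 2 3; do
  jmeter -n -t expected-load.jmx -l "100-users-run-${run}.jtl" \
    -Jusers=100 -Jrampup=30 -Jloops=1
done
```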

Improve the playbook

If you spot anything factually incorrect with this page or have ideas for improvement, please share your suggestions.

Before you start, you will need a GitHub account. GitHub is an open forum where we collect feedback.