Performance Testing Series
This post is part of a series of blog posts about my performance tests in Azure SQL Database. For the first post in this series (and links to all of the other posts) please see here. For a summary of all the results please see here.
Performance Test Architecture
All elements of the test setup run in Azure, in the same region (West Europe).
A Cloud Service (Worker Role) runs each test. It reads the definition of each test from a shared control database before executing the test against a target test database – i.e. preparing the database with test data, generating requests of the type and at the rate specified in the test definition, then cleaning up the test database.
The cloud service can scale to whatever number of request generating threads are required (provided a large enough cloud service instance is configured in Azure).
The cloud service also tracks and captures details of the test (response timings, data volumes, etc) and records this information in the control database.
For more details, please continue reading below.
Performance Test Groups
The performance tests were divided into four types:
Quick Tests (QT)
The QT tests were initial fifteen-minute tests designed to quickly probe the performance levels and to help design the other tests.
Generally speaking, the results of these quick tests are not discussed in these posts at all, since these initial results were repeated (in more detail) using the other test types. The results of these quick tests were consistent with the more detailed tests.
Unlimited Tests (UT)
The UT tests allow a specified number of threads in the request generator (the cloud service) to generate an “unlimited” number of requests against the test database, i.e. the rate of the requests is not limited by the request generator. Within each thread, as soon as one request is completed another is generated. Each thread has at most one request pending at any one time (i.e. a single thread does not generate more than one request concurrently – this is true across all test types).
The UT tests are primarily focussed on exploring the maximum write rates to disk and any CPU throttling. The UT tests used relatively small data sets (to minimise the set-up time and cost of each test). This also means the measured read rates are for reading from the buffer pool and not the physical disk (the metrics show that at no point was the physical disk hit and all data was in memory throughout the tests). This provides an interesting contrast to the read rates determined in the Sequential Select and Scale Tests described below.
Each UT test lasted 30 minutes. Each type of test was repeated three times against each of the service tiers except P3.
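The one-outstanding-request-per-thread behaviour described above can be sketched as follows (Python is used purely for illustration – the actual request generator is C#/.NET, and all names here are hypothetical):

```python
import threading
import time

def ut_worker(send_request, duration_s, results):
    """One UT thread: issue requests back-to-back for the test duration,
    never with more than one request outstanding at a time."""
    deadline = time.monotonic() + duration_s
    count = 0
    while time.monotonic() < deadline:
        send_request()   # blocks until the request completes...
        count += 1       # ...only then is the next request generated
    results.append(count)

# Illustrative run: four threads against a simulated ~1 ms request
results = []
threads = [threading.Thread(target=ut_worker,
                            args=(lambda: time.sleep(0.001), 0.3, results))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Scaling the offered load in a UT test is then simply a matter of starting more such threads.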
Limited Tests (LT)
The LT tests generate requests up to a specified maximum rate. The request rate starts low and is gradually increased over subsequent tests. A maximum thread limit (for the request generator) is specified in the definition of each test to prevent the request generator overwhelming the target test database. Once the limit of the target database has been passed (and it cannot service requests fast enough to meet the rate specified in the test) the request generator tracks the number of missed requests (i.e. requests that it couldn’t send) and records this information.
These tests are designed to investigate the behaviour of the database as it approaches the limit of the service tier. Like the UT tests, the LT tests used relatively small sized data sets and are focussed on exploring the maximum write rates to disk and any CPU throttling. Again, read rates are generally speaking for reading from the buffer pool.
LT tests also lasted 30 minutes. Multiple LT tests of each type were performed against each environment (gradually increasing the request rate).
LT tests are more time consuming than the UT tests and so were only conducted against the Web, Basic and S1 service tiers. It is expected/assumed (but not proven here!) that throttling behaviour is pretty similar for the higher-spec service tiers as their respective limits are approached.
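The missed-request accounting described above can be illustrated with a simplified capacity model (Python for illustration – the real generator is C#/.NET; the function and figures below are hypothetical):

```python
def run_limited_second(target_rate, max_threads, request_time_s):
    """Model one second of an LT test: attempt to issue `target_rate`
    requests with at most `max_threads` concurrent worker threads.
    Returns (sent, missed)."""
    # Each thread completes roughly 1 / request_time_s requests per second,
    # so the pool's capacity for this second is:
    capacity = max_threads * round(1 / request_time_s)
    sent = min(target_rate, capacity)
    missed = target_rate - sent   # requests the generator couldn't send
    return sent, missed

# Once the database slows (request_time_s rises), capacity drops below
# the specified rate and missed requests start being recorded:
sent, missed = run_limited_second(target_rate=500, max_threads=10,
                                  request_time_s=0.05)
```

With ten threads and 50 ms responses the pool can only send around 200 requests in the second, so 300 of the 500 requested are recorded as missed.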
Sequential Select Tests (SeqSel)
The SeqSel tests read large sets of data sequentially in order to measure the maximum read rates from disk. This is in contrast to all of the previous test types, which generally speaking did not hit the physical disk and read only data that was already in memory.
Each SeqSel test lasted 60 minutes and was conducted three times against each service tier. These tests were conducted against all of the service tiers except P3.
Scale Tests (ST)
These tests run one larger workload against each edition / tier with a mixture of insert, select, update and delete activity. This allows the combined effect of the read and write limits to be seen on one example test workload.
Each ST test lasted 120 minutes and was conducted three times against each service tier. These tests were conducted against all of the service tiers except P3.
Performance Test Types
Test Type Overview
The tests have been specified to perform each different type of activity individually then build up to more mixed tests. The table below describes the different types of test as well as whether the target database is populated with some test data prior to the test beginning (or whether it is empty).
| # | Test Type | Description | Pre-populated |
|---|-----------|-------------|:---:|
| 1 | Inserts | Single row inserts via stored procedure. | |
| 2 | Bulk Load | Rows inserted via the SQL Server bulk load interface. | |
| 3 | Direct Selects | Single row selects by primary key via stored procedure. | ∗ |
| 4 | Indirect Selects | Multiple row selects by various different fields via text SQL. | ∗ |
| 5 | Updates | Single row updates via stored procedure. | ∗ |
| 6 | Inserts and Deletes | 50% single row inserts via stored procedure and 50% single row deletes by primary key via stored procedure. | ∗ |
| 7 | Ins, Sel (Dir. only), Upd and Del | 20% single row inserts, 30% single row selects, 30% single row updates and 20% single row deletes – all via stored procedure. | ∗ |
| 8 | Ins, Sel (Dir. & Ind.), Upd and Del | 20% single row inserts, 30% selects, 30% single row updates and 20% single row deletes – all via stored procedure. The selects are split into single row selects via stored procedure (75%) and multiple row selects by various different fields via text SQL (25%). | ∗ |
| 9 | Mini Seq. Sel | Sequential selects of small data sets. | ∗ |
| 10 | Seq. Sel | Sequential selects of relatively large data sets. | ∗ |
| 11 | Scale Test | Combined workload against a larger test database. | ∗ |
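The weighted operation mixes in test types 7 and 8 amount to a weighted random choice per request. A minimal sketch (Python for illustration – the actual generator is C#/.NET, and these names are hypothetical):

```python
import random

# Operation mix for test type 7, taken from the table above
MIX = [("insert", 20), ("select", 30), ("update", 30), ("delete", 20)]

def pick_operation(rng):
    """Pick the next operation according to the weighted mix."""
    ops, weights = zip(*MIX)
    return rng.choices(ops, weights=weights, k=1)[0]

rng = random.Random(42)
sample = [pick_operation(rng) for _ in range(10_000)]
# Over many draws the observed proportions settle close to 20/30/30/20.
```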
More specific details of each type of test are provided in later posts.
Summary of Test Types by Test Group
The table below summarises the types of test that were conducted in each group of tests:
| 6 | Inserts and Deletes | ∗ | ∗ |
| 7 | Ins, Sel (Dir. only), Upd and Del | ∗ |
| 8 | Ins, Sel (Dir. & Ind.), Upd and Del | ∗ | ∗ |
| 9 | Mini Seq. Sel | ∗ |
Performance Test Configurations
The diagram below details the general configurations used for the tests. A few tests were run with configurations slightly different to that shown below – this detail is described in later posts.
The data volumes described above are the volume of test data created prior to each test beginning. These volumes apply only to those test types that require test data being present (i.e. not to the Insert and Bulk Load tests, as described in the previous section).
∗ A more extensive series of UT tests was conducted on the Business Edition test database to provide more data to compare the results of the Premium Service Tier tests with. These Business Edition UT tests scaled the number of request generating threads and size of test data. Tests ranged from a single thread operating against 200 MB of test data in a single test table up to eight threads operating against 1.2 GB of test data spread across eight test tables.
In More Detail: Cloud Service
The Cloud Service (request generator), responsible for running the tests, runs in an Azure Worker Role and is written in C# on .NET 4.5.
It contains classes to govern the timing of tests, including the set up and tear down of each type of test. For those types of test that require test data to be present in the database before the test begins, the cloud service pre-populates the tables in a pre-test set up activity. Test data values are all randomly generated.
Gaps were left between tests (and between set up / tear down activities and tests). This allows system performance to equalise and also leaves clear gaps in the resource usage data from Azure to aid in the analysis.
The cloud service also includes classes that can govern the rate at which requests are generated, to accurately generate up to a specified number of requests per second, distributing those requests reasonably evenly in time at the second and sub-second level.
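The even-in-time distribution described above amounts to computing evenly spaced target send times within each second. A minimal sketch (Python for illustration – the real rate governor is C#/.NET, and these names are hypothetical):

```python
def schedule_times(rate_per_second, start=0.0):
    """Spread `rate_per_second` request send times evenly across one
    second, starting at `start` (seconds)."""
    interval = 1.0 / rate_per_second
    return [start + i * interval for i in range(rate_per_second)]

times = schedule_times(8)
# Eight requests per second, evenly spaced 125 ms apart at the
# sub-second level rather than fired in a burst at the top of the second.
```

A worker loop would then sleep until each scheduled time before issuing the corresponding request.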
For all tests except the SeqSel tests, requests are distributed randomly across the test tables and across the range of primary key values in the test tables. The Cloud service maintains lists of primary key values in memory to ensure requests do not target primary key values that don’t exist.
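Maintaining in-memory lists of live primary key values can be sketched as follows (Python for illustration – the actual cloud service is C#/.NET; the class and its methods are hypothetical):

```python
import random

class PkTracker:
    """Track live primary key values for one test table so that
    generated requests never target a RowId that does not exist."""
    def __init__(self):
        self.keys = []       # live RowIds (order irrelevant)
        self.next_id = 1     # inserts use sequentially generated RowIds

    def insert(self):
        row_id = self.next_id
        self.next_id += 1
        self.keys.append(row_id)
        return row_id

    def pick_existing(self, rng):
        """Target of a select or update: a randomly chosen live key."""
        return rng.choice(self.keys)

    def delete(self, rng):
        """Remove and return a random live key (swap-remove keeps it O(1))."""
        i = rng.randrange(len(self.keys))
        self.keys[i], self.keys[-1] = self.keys[-1], self.keys[i]
        return self.keys.pop()

tracker = PkTracker()
rng = random.Random(7)
for _ in range(100):
    tracker.insert()
for _ in range(10):
    tracker.delete(rng)
# 90 live keys remain; pick_existing only ever returns one of them
```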
Data access is entirely via the .NET SqlClient classes.
Collecting Performance Metrics
In addition to generating requests, the cloud service also gathers other data helpful to measure performance:
- Database Activity Metrics – the number of actions of different types performed, row counts, data volumes and associated response times – captured at per minute level during the tests.
- Database Actions – the time taken, row count and data volumes for specific database actions – for longer running operations, in particular associated with the SeqSel tests.
- Request Metrics – a breakdown of the number of requests generated per minute (i.e. submitted to the DB) and missed per minute (i.e. no free worker thread) – to enable a detailed understanding of the activity occurring within the request generator and in particular to check that it is not the bottleneck in any performance tests.
- Cloud Service Counters – captures the average CPU utilisation and available memory in the cloud service VM once a second during the tests, again to help check whether this is the bottleneck in performance.
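Rolling raw request timings up into the per-minute activity metrics above can be sketched as follows (Python for illustration – the actual metrics capture is C#/.NET, and these names are hypothetical):

```python
from collections import defaultdict

def aggregate_per_minute(events):
    """Roll raw (timestamp_s, action, response_ms) events up into
    per-minute counts and average response times."""
    buckets = defaultdict(lambda: {"count": 0, "total_ms": 0.0})
    for ts, action, ms in events:
        b = buckets[(int(ts // 60), action)]   # key: (minute, action type)
        b["count"] += 1
        b["total_ms"] += ms
    return {k: {"count": v["count"],
                "avg_ms": v["total_ms"] / v["count"]}
            for k, v in buckets.items()}

events = [(5, "insert", 2.0), (30, "insert", 4.0), (70, "select", 1.0)]
metrics = aggregate_per_minute(events)
# minute 0: two inserts averaging 3.0 ms; minute 1: one select at 1.0 ms
```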
Scaling the Cloud Service
The request generator is multi-threaded and uses asynchronous programming for more efficient processing.
Generating a larger load on the target database simply requires firing up a larger number of request generating threads. The Request Metrics and Cloud Service Counters allow the workload of the cloud service to be monitored – enabling it to be scaled appropriately for each Azure Database Edition.
In More Detail: Test Database
The test database used is the same across all tests and service tiers. It contains 50 tables (imaginatively named Table1 to Table50), all of identical structure:
The RowId field was the primary key and also comprised the clustered index. Values for this field (generated for inserts and bulk load) were sequentially generated.
The text values are generated to be, on average, just over 2/3 the maximum length of the fields. This leads to an average row length (i.e. just summing the data type sizes and average text value lengths) of around 375 bytes. This applies across all test types. Running the data generation routines on-premises and examining sys.dm_db_index_physical_stats reports the average row size as close to 400 bytes.
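The ~375 byte figure can be reproduced from the column definitions in the appendix (sizes below are the usual SQL Server storage sizes; the 2/3 text-length assumption is as stated above):

```python
# Approximate per-column storage sizes in bytes for the fixed-width columns
fixed = {
    "RowId (int)": 4,
    "4 x bit": 1,          # bit columns pack up to 8 per byte
    "2 x tinyint": 2,
    "2 x smallint": 4,
    "5 x int": 20,
    "4 x datetime": 32,
    "2 x float": 16,
    "3 x money": 24,
    "char(2)": 2,
    "char(10)": 10,
}

# varchar columns average just over 2/3 of their maximum length
varchar_avg = sum(round(n * 2 / 3) for n in (40, 100, 250))

avg_row_data = sum(fixed.values()) + varchar_avg
# ≈ 375 bytes of column data per row; per-row overhead (row header, null
# bitmap, variable-length offset array) pushes the stored size seen in
# sys.dm_db_index_physical_stats towards ~400 bytes
```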
The tests against the high-spec service tiers make use of a larger number of the tables. Specific numbers are provided in later posts.
I make no claim at all that this is the best database design for performance testing – only that it has the virtue of simplicity!
The test database also contains the following stored procedures:
- iTable1 to iTable50 – to insert a single row into a test table
- sTable1 to sTable50 – to select a single row from a test table
- uTable1 to uTable50 – to update a single row in a test table
- dTable1 to dTable50 – to delete a single row from a test table
The object definitions are included in an appendix below for reference.
It is worth noting that although the database was created with 50 tables, most tests used no more than 10. Generally speaking, the number of tables used was directly proportional to the number of threads, e.g. a test running with four threads would use four tables. The obvious exceptions to this are the larger scale Sequential Select and Scale Tests, which typically used 36 tables.
All test databases were entirely standalone, no synchronisation etc. was running.
SQL Database Performance Metrics
Azure SQL Database provides a number of performance metrics via system views. For these performance tests, the data within the sys.resource_stats view was captured. This view details the percentage of CPU quota, log write quota and physical disk read quota that has been used per 5 minute period per database.
It is worth noting that there were occasional data points missing (i.e. random five minute data periods), notably so during the last few days of these tests leading up to 29th June (it appears a bug or infrastructure fault was occurring). Fortunately these omissions are not significant enough to affect the general results (e.g. the results of a test may be from an average of 15 data points rather than 18).
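The capture described above amounts to periodically reading sys.resource_stats and averaging the five-minute samples per test window, tolerating the occasional missing period. A sketch (Python for illustration – the actual capture is C#/.NET; the query text and function are hypothetical, though the view and columns are as documented):

```python
# Query issued against the master database to capture the per-5-minute
# quota usage figures for the test database
RESOURCE_STATS_QUERY = """
SELECT start_time, avg_cpu_percent, avg_data_io_percent, avg_log_write_percent
FROM sys.resource_stats
WHERE database_name = @dbname
ORDER BY start_time
"""

def average_quota_usage(samples):
    """Average the captured five-minute samples for one test window.
    Missing periods (None) simply reduce the sample count, e.g. a result
    averaged over 15 data points rather than 18."""
    present = [s for s in samples if s is not None]
    return sum(present) / len(present)

# A 90-minute window is 18 five-minute periods; here 3 are missing
samples = [50.0] * 15 + [None] * 3
avg = average_quota_usage(samples)
```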
Appendix: Database Object Definitions
CREATE TABLE [dbo].[Table1](
    [RowId] [int] NOT NULL,
    [Bit1] [bit] NULL, [Bit2] [bit] NULL, [Bit3] [bit] NULL, [Bit4] [bit] NULL,
    [TinyInt1] [tinyint] NULL, [TinyInt2] [tinyint] NULL,
    [SmallInt1] [smallint] NULL, [SmallInt2] [smallint] NULL,
    [Int1] [int] NULL, [Int2] [int] NULL, [Int3] [int] NULL, [Int4] [int] NULL, [Int5] [int] NULL,
    [DateTime1] [datetime] NULL, [DateTime2] [datetime] NULL, [DateTime3] [datetime] NULL, [DateTime4] [datetime] NULL,
    [Float1] [float] NULL, [Float2] [float] NULL,
    [Money1] [money] NULL, [Money2] [money] NULL, [Money3] [money] NULL,
    [Text1] [char](2) NULL, [Text2] [char](10) NULL,
    [Text3] [varchar](40) NULL, [Text4] [varchar](100) NULL, [Text5] [varchar](250) NULL,
    CONSTRAINT [PK_T001_DataLoading] PRIMARY KEY CLUSTERED ([RowId] ASC)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
GO

CREATE proc [dbo].[iTable1]
    @RowId int, @Bit1 bit, @Bit2 bit, @Bit3 bit, @Bit4 bit,
    @TinyInt1 tinyint, @TinyInt2 tinyint, @SmallInt1 smallint, @SmallInt2 smallint,
    @Int1 int, @Int2 int, @Int3 int, @Int4 int, @Int5 int,
    @DateTime1 datetime, @DateTime2 datetime, @DateTime3 datetime, @DateTime4 datetime,
    @Float1 float, @Float2 float, @Money1 money, @Money2 money, @Money3 money,
    @Text1 char(2), @Text2 char(10), @Text3 varchar(40), @Text4 varchar(100), @Text5 varchar(250)
as
begin
    INSERT INTO [dbo].[Table1]
        ([RowId], [Bit1], [Bit2], [Bit3], [Bit4],
         [TinyInt1], [TinyInt2], [SmallInt1], [SmallInt2],
         [Int1], [Int2], [Int3], [Int4], [Int5],
         [DateTime1], [DateTime2], [DateTime3], [DateTime4],
         [Float1], [Float2], [Money1], [Money2], [Money3],
         [Text1], [Text2], [Text3], [Text4], [Text5])
    VALUES
        (@RowId, @Bit1, @Bit2, @Bit3, @Bit4,
         @TinyInt1, @TinyInt2, @SmallInt1, @SmallInt2,
         @Int1, @Int2, @Int3, @Int4, @Int5,
         @DateTime1, @DateTime2, @DateTime3, @DateTime4,
         @Float1, @Float2, @Money1, @Money2, @Money3,
         @Text1, @Text2, @Text3, @Text4, @Text5)
end
GO

CREATE proc [dbo].[sTable1]
    @RowId int
as
begin
    select RowId, Bit1, Bit2, Bit3, Bit4,
           TinyInt1, TinyInt2, SmallInt1, SmallInt2,
           Int1, Int2, Int3, Int4, Int5,
           DateTime1, DateTime2, DateTime3, DateTime4,
           Float1, Float2, Money1, Money2, Money3,
           Text1, Text2, Text3, Text4, Text5
    from dbo.Table1
    where RowId = @RowId
end
GO

CREATE proc [dbo].[uTable1]
    @RowId int, @Bit1 bit, @Bit2 bit, @Bit3 bit, @Bit4 bit,
    @TinyInt1 tinyint, @TinyInt2 tinyint, @SmallInt1 smallint, @SmallInt2 smallint,
    @Int1 int, @Int2 int, @Int3 int, @Int4 int, @Int5 int,
    @DateTime1 datetime, @DateTime2 datetime, @DateTime3 datetime, @DateTime4 datetime,
    @Float1 float, @Float2 float, @Money1 money, @Money2 money, @Money3 money,
    @Text1 char(2), @Text2 char(10), @Text3 varchar(40), @Text4 varchar(100), @Text5 varchar(250)
as
begin
    UPDATE [dbo].[Table1]
    SET [Bit1] = @Bit1, [Bit2] = @Bit2, [Bit3] = @Bit3, [Bit4] = @Bit4,
        [TinyInt1] = @TinyInt1, [TinyInt2] = @TinyInt2,
        [SmallInt1] = @SmallInt1, [SmallInt2] = @SmallInt2,
        [Int1] = @Int1, [Int2] = @Int2, [Int3] = @Int3, [Int4] = @Int4, [Int5] = @Int5,
        [DateTime1] = @DateTime1, [DateTime2] = @DateTime2, [DateTime3] = @DateTime3, [DateTime4] = @DateTime4,
        [Float1] = @Float1, [Float2] = @Float2,
        [Money1] = @Money1, [Money2] = @Money2, [Money3] = @Money3,
        [Text1] = @Text1, [Text2] = @Text2, [Text3] = @Text3, [Text4] = @Text4, [Text5] = @Text5
    WHERE RowId = @RowId
end
GO

CREATE proc [dbo].[dTable1]
    @RowId int
as
begin
    delete from dbo.Table1 where RowId = @RowId
end