Microsoft Azure SQL Database Performance Tests: Test Architecture, Test Types and Test Configurations

Performance Testing Series

This post is part of a series of blog posts about my performance tests in Azure SQL Database. For the first post in this series (and links to all of the other posts) please see here.  For a summary of all the results please see here.

Quick Navigation:        << Previous post in series <<        >> Next post in series >>

Performance Test Architecture

All elements of the test setup run in Azure, in the same region (West Europe).

[Diagram: performance test architecture]

A Cloud Service (Worker Role) runs each test. It reads the definition of each test from a shared control database before executing the test against a target test database – i.e. preparing the database with test data, generating requests of the type and rate specified in the test definition, then cleaning up the test database.

The cloud service can scale to whatever number of request generating threads are required (provided a large enough cloud service instance is configured in Azure).

The cloud service also tracks and captures details of the test (response timings, data volumes, etc) and records this information in the control database.

For more details, please continue reading below.

Performance Test Groups

The performance tests were divided into five groups:

Quick Tests (QT)

The QT tests were initial fifteen-minute tests designed to quickly probe the performance levels and help design the other tests.

Generally speaking, the results of these quick tests are not discussed in these posts, since they were repeated (in more detail) using the other test types. The quick test results were consistent with those of the more detailed tests.

Unlimited Tests (UT)

The UT tests allow a specified number of threads in the request generator (the cloud service) to generate an “unlimited” number of requests against the test database, i.e. the rate of the requests is not limited by the request generator. Within each thread, as soon as one request is completed another is generated.  Each thread has at most one request pending at any one time (i.e. a single thread does not generate more than one request concurrently – this is true across all test types).
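
For illustration, here is a minimal sketch of one such thread (hypothetical names – the actual harness code is not reproduced in these posts), using a single-row select as the example request type:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Diagnostics;

// Illustrative sketch of one "unlimited" test thread: at most one request is
// in flight, and the next request is issued as soon as the previous one
// completes, until the test duration elapses.
static class UnlimitedThread
{
    public static void Run(string connectionString, TimeSpan duration, int maxRowId)
    {
        var rng = new Random();            // per-thread; Random is not thread-safe
        var clock = Stopwatch.StartNew();
        while (clock.Elapsed < duration)
        {
            // Opening per request is cheap thanks to connection pooling.
            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand("dbo.sTable1", conn))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                // Target a key known to exist (the harness keeps key lists in memory).
                cmd.Parameters.AddWithValue("@RowId", rng.Next(1, maxRowId + 1));
                conn.Open();
                TimeSpan started = clock.Elapsed;
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read()) { }  // consume the single row
                }
                TimeSpan responseTime = clock.Elapsed - started;
                // The real harness records this response time in the control database.
            }
        }
    }
}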

The UT tests are primarily focussed on exploring the maximum write rates to disk and any CPU throttling. The UT tests used relatively small data sets (to minimise the set-up time and cost for each test). This also means the read rates measured are for reading from the buffer pool and not the physical disk (the metrics show that at no point was the physical disk hit – all data was in memory throughout the tests). This provides an interesting contrast to the read rates determined in the Sequential Select and Scale Tests described below.

Each UT test lasted 30 minutes. Each type of test was repeated three times against each of the service tiers except P3.

Limited Tests (LT)

The LT tests generate requests up to a specified maximum rate. The request rate starts low and is gradually increased over subsequent tests. A maximum thread limit (for the request generator) is specified in the definition of each test to prevent the request generator from overwhelming the target test database. Once the limit of the target database has been passed (i.e. it cannot service requests fast enough to meet the rate specified in the test), the request generator tracks and records the number of missed requests (i.e. requests that it couldn’t send).
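
To illustrate, here is a minimal sketch (hypothetical names, not the actual harness classes) of such a throttled generator built around a semaphore sized to the thread limit: requests fall due at a fixed interval, and any request that falls due when no worker thread is free is counted as missed rather than queued.

using System;
using System.Diagnostics;
using System.Threading;

// Illustrative sketch of rate-limited request generation with missed-request
// tracking, as described above.
class LimitedRateScheduler
{
    public long MissedRequests;   // requests that fell due with no free thread

    public void Run(int requestsPerSecond, int maxThreads, TimeSpan duration, Action sendRequest)
    {
        var pool = new SemaphoreSlim(maxThreads, maxThreads);
        var clock = Stopwatch.StartNew();
        var interval = TimeSpan.FromSeconds(1.0 / requestsPerSecond);
        var nextDue = TimeSpan.Zero;

        while (clock.Elapsed < duration)
        {
            // Spread requests evenly in time (the real generator paces more
            // precisely at the sub-second level than Thread.Sleep allows).
            var wait = nextDue - clock.Elapsed;
            if (wait > TimeSpan.Zero) Thread.Sleep(wait);
            nextDue += interval;

            if (pool.Wait(0))     // is a worker thread free?
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try { sendRequest(); }
                    finally { pool.Release(); }
                });
            }
            else
            {
                // The target database cannot keep up: record a missed request.
                Interlocked.Increment(ref MissedRequests);
            }
        }
    }
}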

These tests are designed to investigate the behaviour of the database as it approaches the limit of the service tier. Like the UT tests, the LT tests used relatively small data sets and are focussed on exploring the maximum write rates to disk and any CPU throttling. Again, the read rates measured are generally for reading from the buffer pool.

LT tests also lasted 30 minutes. Multiple LT tests of each type were performed against each environment (gradually increasing the request rate).

LT tests are more time-consuming than the UT tests and so were only conducted against the Web, Basic and S1 service tiers. It is expected/assumed (but not proven here!) that throttling behaviour is pretty similar for the higher-spec service tiers as their respective limits are approached.

Sequential Select Tests (SeqSel)

The SeqSel tests read large sets of data sequentially in order to measure the maximum read rates from disk. This contrasts with all of the previous test types, which generally did not hit the physical disk and read only data that was already in memory.
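
For illustration, a minimal sketch (hypothetical names; the table name comes from trusted test configuration, not user input) of streaming a whole test table:

using System;
using System.Data.SqlClient;
using System.Diagnostics;

// Illustrative sketch of a sequential select: stream an entire test table
// through a SqlDataReader, forcing the database to read every page, and
// record the row count and elapsed time.
static class SequentialSelect
{
    public static void ReadWholeTable(string connectionString, string tableName)
    {
        var clock = Stopwatch.StartNew();
        long rows = 0;
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("SELECT * FROM dbo." + tableName, conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read()) rows++;   // stream rows; do not buffer them
            }
        }
        Console.WriteLine("{0}: {1} rows in {2}", tableName, rows, clock.Elapsed);
    }
}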

Each SeqSel test lasted 60 minutes and was conducted three times against each service tier. These tests were conducted against all of the service tiers except P3.

Scale Tests (ST)

These tests run one larger workload against each edition / tier with a mixture of insert, select, update and delete activity.  This allows the combined effect of the read and write limits to be seen on one example test workload.

Each ST test lasted 120 minutes and was conducted three times against each service tier. These tests were conducted against all of the service tiers except P3.

Performance Test Types


Test Type Overview

The tests have been specified to perform each different type of activity individually then build up to more mixed tests. The table below describes the different types of test as well as whether the target database is populated with some test data prior to the test beginning (or whether it is empty).

# Name Workload Test Data
1 Inserts Single row inserts via stored procedure. Empty
2 Bulk Load Rows inserted via the SQL Server bulk load interface. Empty
3 Direct Selects Single row selects by primary key via stored procedure. Pre-populated
4 Indirect Selects Multiple row selects by various different fields via text SQL. Pre-populated
5 Updates Single row updates via stored procedure. Pre-populated
6 Inserts and Deletes 50% single row inserts and 50% single row deletes by primary key, all via stored procedure. Pre-populated
7 Ins, Sel (Dir. only), Upd and Del 20% single row inserts, 30% single row selects, 30% single row updates and 20% single row deletes – all via stored procedure (a sketch of this weighted mix follows the table). Pre-populated
8 Ins, Sel (Dir. & Ind.), Upd and Del As test 7, except the selects are split into single row selects via stored procedure (75%) and multiple row selects by various different fields via text SQL (25%). Pre-populated
9 Mini Seq. Sel Sequential selects of small data sets. Pre-populated
10 Seq. Sel Sequential selects of relatively large data sets. Pre-populated
11 Scale Test Combined workload against a larger test database. Pre-populated
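
As referenced in the table above, tests 7 and 8 require the request generator to pick an operation per request according to the stated percentages. A minimal sketch of such a weighted choice (hypothetical names, not the actual harness classes):

using System;

// Illustrative sketch of driving a mixed workload such as test type 7:
// each request picks its operation by weighted random choice.
static class MixedWorkload
{
    public enum Op { Insert, Select, Update, Delete }

    public static Op NextOp(Random rng)
    {
        int p = rng.Next(100);           // uniform in [0, 100)
        if (p < 20) return Op.Insert;    // 20% single row inserts
        if (p < 50) return Op.Select;    // 30% single row selects
        if (p < 80) return Op.Update;    // 30% single row updates
        return Op.Delete;                // 20% single row deletes
    }
}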

More specific details of each type of test are provided in later posts.

Summary of Test Types by Test Group

The table below summarises the types of test that were conducted in each group of tests:

# Name UT LT Larger Scale
1 Inserts
2 Bulk Load
3 Direct Selects
4 Indirect Selects
5 Updates
6 Inserts and Deletes
7 Ins, Sel (Dir. only), Upd and Del
8 Ins, Sel (Dir. & Ind.), Upd and Del
9 Mini Seq. Sel
10 Seq. Sel
11 Scale Test

Performance Test Configurations

The diagram below details the general configurations used for the tests. A few tests were run with configurations slightly different from those shown below – these details are described in later posts.

[Diagram: general test configurations]

The data volumes described above are the volume of test data created prior to each test beginning. These volumes apply only to those test types that require test data to be present (i.e. not to the Insert and Bulk Load tests, as described in the previous section).

∗ A more extensive series of UT tests was conducted against the Business Edition test database to provide more data against which to compare the Premium service tier results. These Business Edition UT tests scaled both the number of request generating threads and the size of the test data, ranging from a single thread operating against 200 MB of test data in a single test table up to eight threads operating against 1.2 GB of test data spread across eight test tables.

In More Detail: Cloud Service


Composition

The Cloud Service (request generator), responsible for running the tests, runs in an Azure Worker Role and is written in C# on .NET 4.5.

It contains classes to govern the timing of tests, including the set up and tear down of each type of test. For those types of test that require test data to be present in the database before the test begins, the cloud service pre-populates the tables in a pre-test set up activity. Test data values are all randomly generated.

Gaps were left between tests (and between set up / tear down activities and tests). This allows system performance to equalise and also leaves clear gaps in the resource usage data from Azure to aid in the analysis.

The cloud service also includes classes that can govern the rate at which requests are generated, to accurately generate up to a specified number of requests per second, distributing those requests reasonably evenly in time at the second and sub-second level.

For all tests except the SeqSel tests, requests are distributed randomly across the test tables and across the range of primary key values in the test tables. The cloud service maintains lists of primary key values in memory to ensure requests do not target primary key values that don’t exist.
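
A minimal sketch of such key tracking (hypothetical names, not the actual classes) – inserts add the new RowId, deletes remove a random one, and selects/updates pick a random existing key:

using System;
using System.Collections.Generic;

// Illustrative sketch: per-table list of primary key values known to exist,
// so generated requests never target a missing row.
class KeyTracker
{
    private readonly List<int> keys = new List<int>();
    private readonly Random rng = new Random();
    private readonly object gate = new object();

    public void Add(int rowId) { lock (gate) keys.Add(rowId); }

    public int PickRandom()               // for selects and updates
    {
        lock (gate) return keys[rng.Next(keys.Count)];
    }

    public int RemoveRandom()             // for deletes
    {
        lock (gate)
        {
            int i = rng.Next(keys.Count);
            int rowId = keys[i];
            keys[i] = keys[keys.Count - 1];   // O(1) removal: swap with last
            keys.RemoveAt(keys.Count - 1);
            return rowId;
        }
    }
}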

Data access is entirely via the .NET SqlClient classes.

Collecting Performance Metrics

In addition to generating requests, the cloud service also gathers other data helpful to measure performance:

  • Database Activity Metrics – the number of actions of different types performed, row counts, data volumes and associated response times – captured at per-minute level during the tests (a sketch of this aggregation follows this list).
  • Database Actions – the time taken, row count and data volumes for specific database actions – for longer running operations, in particular associated with the SeqSel tests.
  • Request Metrics – a breakdown of the number of requests generated per minute (i.e. submitted to the DB) and missed per minute (i.e. no free worker thread) – to enable a detailed understanding of the activity occurring within the request generator and in particular to check that it is not the bottleneck in any performance tests.
  • Cloud Service Counters – captures the average CPU utilisation and available memory in the cloud service VM once a second during the tests, again to help check whether this is the bottleneck in performance.
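
For illustration, here is a minimal sketch (hypothetical names) of the per-minute aggregation behind the first bullet: each worker folds its response times into a bucket keyed by the minute of the test, and the buckets are written to the control database afterwards.

using System;
using System.Collections.Concurrent;

// Illustrative sketch of per-minute activity aggregation.
class MinuteMetrics
{
    private class Bucket { public long Count; public long TotalTicks; }

    // Bucket key = whole minutes elapsed since the test started.
    private readonly ConcurrentDictionary<long, Bucket> buckets =
        new ConcurrentDictionary<long, Bucket>();

    public void Record(TimeSpan sinceTestStart, TimeSpan responseTime)
    {
        var bucket = buckets.GetOrAdd((long)sinceTestStart.TotalMinutes, _ => new Bucket());
        lock (bucket)
        {
            bucket.Count++;
            bucket.TotalTicks += responseTime.Ticks;   // 1 tick = 100 ns
        }
    }

    public double AverageMilliseconds(long minute)
    {
        Bucket bucket;
        if (!buckets.TryGetValue(minute, out bucket) || bucket.Count == 0) return 0;
        lock (bucket) return TimeSpan.FromTicks(bucket.TotalTicks / bucket.Count).TotalMilliseconds;
    }
}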

Scaling the Cloud Service

The request generator is multi-threaded and uses asynchronous programming for more efficient processing.

Generating a larger load on the target database simply requires firing up a larger number of request generating threads. The Request Metrics and Cloud Service Counters allow the workload of the cloud service to be monitored – enabling it to be scaled appropriately for each Azure Database Edition.

In More Detail: Test Database


Composition

The test database used is the same across all tests and service tiers. It contains 50 tables (imaginatively named Table1 to Table50), all of identical structure:

Field Data Type
RowId int (PK)
Bit1 bit
Bit2 bit
Bit3 bit
Bit4 bit
TinyInt1 tinyint
TinyInt2 tinyint
SmallInt1 smallint
SmallInt2 smallint
Int1 int
Int2 int
Int3 int
Int4 int
Int5 int
DateTime1 datetime
DateTime2 datetime
DateTime3 datetime
DateTime4 datetime
Float1 float
Float2 float
Money1 money
Money2 money
Money3 money
Text1 char(2)
Text2 char(10)
Text3 varchar(40)
Text4 varchar(100)
Text5 varchar(250)

The RowId field was the primary key and also formed the clustered index. Values for this field (generated for inserts and bulk load) were sequentially generated.

The text values are generated to be, on average, just over two thirds of the maximum length of the fields. This leads to an average row length (i.e. just summing the data type sizes and average text value lengths) of around 375 bytes. This applies across all test types. Running the data generation routines on-premises and examining sys.dm_db_index_physical_stats reports the average row size as close to 400 bytes (the difference presumably reflecting per-row storage overhead).
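
As a rough breakdown of that figure: the four bit fields pack into a single byte; the two tinyints contribute 2 bytes; the two smallints 4; the six ints (including RowId) 24; the four datetimes 32; the two floats 16; the three moneys 24; the two char fields 12; and the three varchar fields, at roughly two thirds of their combined 390 byte maximum, around 260 – a total of approximately 375 bytes.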

The tests against the high-spec service tiers make use of a larger number of the tables. Specific numbers are provided in later posts.

I make no claim at all that this is the best database design for performance testing – only that it has the virtue of simplicity!

The test database also contains the following stored procedures:

  • iTable1 to iTable50 – to insert a single row into a test table
  • sTable1 to sTable50 – to select a single row from a test table
  • uTable1 to uTable50 – to update a single row in a test table
  • dTable1 to dTable50 – to delete a single row from a test table

The object definitions are included in an appendix below for reference.

It is worth noting that although the database was created with 50 tables, most tests used no more than 10. Generally speaking, the number of tables used was directly proportional to the number of threads, e.g. a test running with four threads would use four tables. The obvious exceptions are the larger scale Sequential Select and Scale Tests, which typically used 36 tables.

All test databases were entirely standalone – no synchronisation etc. was running against them.

SQL Database Performance Metrics

Azure SQL Database provides a number of performance metrics via system views. For these performance tests, the data within the sys.resource_stats view was captured. This view details the percentage of the CPU quota, log write quota and physical disk read quota that has been used per five-minute period, per database.
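
For illustration, a minimal sketch (hypothetical names) of capturing this view via the same .NET SqlClient access used elsewhere in the harness. Note that sys.resource_stats is queried in the logical server’s master database, and that the column names below follow the current documentation – the view’s schema has changed over time, so treat them as an assumption here.

using System;
using System.Data.SqlClient;

// Illustrative sketch: read the per-five-minute resource usage figures for
// one database from the logical server's master database.
static class ResourceStats
{
    public static void Dump(string masterConnectionString, string databaseName)
    {
        const string sql =
            "SELECT start_time, avg_cpu_percent, avg_data_io_percent, avg_log_write_percent " +
            "FROM sys.resource_stats WHERE database_name = @db ORDER BY start_time";

        using (var conn = new SqlConnection(masterConnectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@db", databaseName);
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())   // one row per five-minute period
                {
                    Console.WriteLine("{0:u} cpu={1}% read={2}% log={3}%",
                        reader["start_time"], reader["avg_cpu_percent"],
                        reader["avg_data_io_percent"], reader["avg_log_write_percent"]);
                }
            }
        }
    }
}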

It is worth noting that occasional data points were missing (i.e. random five-minute periods), notably during the last few days of these tests leading up to 29th June (it appears a bug or infrastructure fault was occurring). Fortunately these omissions are not significant enough to affect the general results (e.g. the results of a test may be from an average of 15 data points rather than 18).

Appendix: Database Object Definitions

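-- Test table. Table1 is shown; Table2 to Table50 are identical in structure.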
CREATE TABLE [dbo].[Table1](
 [RowId] [int] NOT NULL,
 [Bit1] [bit] NULL,
 [Bit2] [bit] NULL,
 [Bit3] [bit] NULL,
 [Bit4] [bit] NULL,
 [TinyInt1] [tinyint] NULL,
 [TinyInt2] [tinyint] NULL,
 [SmallInt1] [smallint] NULL,
 [SmallInt2] [smallint] NULL,
 [Int1] [int] NULL,
 [Int2] [int] NULL,
 [Int3] [int] NULL,
 [Int4] [int] NULL,
 [Int5] [int] NULL,
 [DateTime1] [datetime] NULL,
 [DateTime2] [datetime] NULL,
 [DateTime3] [datetime] NULL,
 [DateTime4] [datetime] NULL,
 [Float1] [float] NULL,
 [Float2] [float] NULL,
 [Money1] [money] NULL,
 [Money2] [money] NULL,
 [Money3] [money] NULL,
 [Text1] [char](2) NULL,
 [Text2] [char](10) NULL,
 [Text3] [varchar](40) NULL,
 [Text4] [varchar](100) NULL,
 [Text5] [varchar](250) NULL,
 CONSTRAINT [PK_T001_DataLoading] PRIMARY KEY CLUSTERED 
(
 [RowId] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
GO


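-- iTable1: inserts a single row into Table1 (iTable2 to iTable50 are equivalent).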
CREATE proc [dbo].[iTable1]
 @RowId int, @Bit1 bit, @Bit2 bit, @Bit3 bit, @Bit4 bit,
 @TinyInt1 tinyint, @TinyInt2 tinyint, 
 @SmallInt1 smallint, @SmallInt2 smallint,
 @Int1 int, @Int2 int, @Int3 int, @Int4 int, @Int5 int,
 @DateTime1 datetime, @DateTime2 datetime, 
 @DateTime3 datetime, @DateTime4 datetime, 
 @Float1 float, @Float2 float, 
 @Money1 money, @Money2 money, @Money3 money,
 @Text1 char(2), @Text2 char(10), @Text3 varchar(40), 
 @Text4 varchar(100), @Text5 varchar(250)
as
begin
INSERT INTO [dbo].[Table1]
 ([RowId], [Bit1], [Bit2], [Bit3], [Bit4]
,[TinyInt1], [TinyInt2], [SmallInt1], [SmallInt2]
,[Int1], [Int2], [Int3], [Int4], [Int5]
,[DateTime1], [DateTime2], [DateTime3], [DateTime4]
,[Float1], [Float2], [Money1], [Money2], [Money3] 
,[Text1], [Text2], [Text3], [Text4], [Text5])
VALUES
 (@RowId, @Bit1, @Bit2, @Bit3, @Bit4, 
 @TinyInt1, @TinyInt2, @SmallInt1, @SmallInt2, 
 @Int1, @Int2, @Int3, @Int4, @Int5,
 @DateTime1, @DateTime2, @DateTime3, @DateTime4,
 @Float1, @Float2, @Money1, @Money2, @Money3,
 @Text1, @Text2, @Text3, @Text4, @Text5) 
end
GO


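-- sTable1: selects a single row from Table1 by primary key.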
CREATE proc [dbo].[sTable1]
 @RowId int
as
begin
select RowId, Bit1, Bit2, Bit3, Bit4, 
 TinyInt1, TinyInt2, SmallInt1, SmallInt2, 
 Int1, Int2, Int3, Int4, Int5,
 DateTime1, DateTime2, DateTime3, DateTime4,
 Float1, Float2, Money1, Money2, Money3,
 Text1, Text2, Text3, Text4, Text5
from dbo.Table1 where RowId = @RowId
end
GO


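-- uTable1: updates a single row in Table1 by primary key.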
CREATE proc [dbo].[uTable1]
 @RowId int, @Bit1 bit, @Bit2 bit, @Bit3 bit, @Bit4 bit,
 @TinyInt1 tinyint, @TinyInt2 tinyint, 
 @SmallInt1 smallint, @SmallInt2 smallint,
 @Int1 int, @Int2 int, @Int3 int, @Int4 int, @Int5 int,
 @DateTime1 datetime, @DateTime2 datetime, 
 @DateTime3 datetime, @DateTime4 datetime, 
 @Float1 float, @Float2 float, 
 @Money1 money, @Money2 money, @Money3 money,
 @Text1 char(2), @Text2 char(10), @Text3 varchar(40), 
 @Text4 varchar(100), @Text5 varchar(250)
as
begin
UPDATE [dbo].[Table1]
SET [Bit1] = @Bit1, [Bit2] = @Bit2, [Bit3] = @Bit3, [Bit4] = @Bit4,
 [TinyInt1] = @TinyInt1, [TinyInt2] = @TinyInt2, 
 [SmallInt1] = @SmallInt1, [SmallInt2] = @SmallInt2,
 [Int1] = @Int1, [Int2] = @Int2, [Int3] = @Int3, 
 [Int4] = @Int4, [Int5] = @Int5,
 [DateTime1] = @DateTime1, [DateTime2] = @DateTime2, 
 [DateTime3] = @DateTime3, [DateTime4] = @DateTime4,
 [Float1] = @Float1, [Float2] = @Float2, 
 [Money1] = @Money1, [Money2] = @Money2, [Money3] = @Money3,
 [Text1] = @Text1, [Text2] = @Text2, [Text3] = @Text3, 
 [Text4] = @Text4, [Text5] = @Text5
WHERE RowId = @RowId
end
GO


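-- dTable1: deletes a single row from Table1 by primary key.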
CREATE proc [dbo].[dTable1]
 @RowId int
as
begin
 delete from dbo.Table1 where RowId = @RowId
end
GO
