Performance Testing Series
This post is part of a series of blog posts about my performance tests in Azure SQL Database. For the first post in this series (and links to all of the other posts) please see here. For a summary of all the results please see here.
For a general overview of the test architecture, test components and test types, please see here.
Indirect Select Test Overview
The Indirect Select test involves executing a text SQL command to select multiple rows from a pre-populated test table. The aim of the test is to investigate the performance of a read workload.
The Cloud Service fires up a number of worker threads as specified in the test definition. Each worker thread picks a Row Id (the primary key value and clustered index field) at random from an in-memory list of primary key values that exist in the database – this is the start of the search range. The worker thread also chooses the number of rows to scan at random: 100 rows (30% probability), 200 rows (20%), 300 rows (15%), 400 rows (10%) and 600/800/1200/2500/5000 rows (each 5%). This means the average number of rows examined per request is 660. A further criterion is then generated against one of the other fields in the test table, chosen at random; the range used in this criterion is also random. Some examples of WHERE clauses that can be generated include:
- WHERE (RowId BETWEEN @i1 AND @i2) AND (TinyInt1 BETWEEN @p1 AND @p2)
- WHERE (RowId BETWEEN @i1 AND @i2) AND (SmallInt2 BETWEEN @p1 AND @p2)
- WHERE (RowId BETWEEN @i1 AND @i2) AND (Int5 BETWEEN @p1 AND @p2)
- WHERE (RowId BETWEEN @i1 AND @i2) AND (DateTime2 BETWEEN @p1 AND @p2)
- WHERE (RowId BETWEEN @i1 AND @i2) AND (Float1 BETWEEN @p1 AND @p2)
- WHERE (RowId BETWEEN @i1 AND @i2) AND (Money3 BETWEEN @p1 AND @p2)
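The weighted row-count selection described above can be sketched as follows. This is a minimal illustration in Python (the actual harness is a .NET Cloud Service; the function names are hypothetical), using the probabilities listed in the text:

```python
import random

# Row-count buckets and their selection probabilities (from the test description).
ROW_COUNTS = [100, 200, 300, 400, 600, 800, 1200, 2500, 5000]
WEIGHTS    = [0.30, 0.20, 0.15, 0.10, 0.05, 0.05, 0.05, 0.05, 0.05]

def expected_rows_per_request():
    """Expected number of rows examined per request (works out to 660)."""
    return sum(c * w for c, w in zip(ROW_COUNTS, WEIGHTS))

def pick_scan_range(row_ids):
    """Pick a random start Row Id and a weighted-random scan length,
    returning an inclusive (start, end) range for the BETWEEN clause."""
    start = random.choice(row_ids)
    length = random.choices(ROW_COUNTS, weights=WEIGHTS, k=1)[0]
    return start, start + length - 1
```
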
The worker thread then executes this SQL against the database. So, in summary, each request selects multiple rows by a primary key range – also the clustered index – and by another field – hence the test name “Indirect Select”.
As in the Direct Select test, all fields from the test table are returned (roughly speaking “SELECT * “).
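Putting the pieces together, the generated command text can be sketched as below. The secondary field list follows the WHERE clause examples above; the table name and helper are illustrative assumptions, not the harness's actual code:

```python
import random

# Secondary fields that can appear in the extra criterion (from the examples above).
SECONDARY_FIELDS = ["TinyInt1", "SmallInt2", "Int5", "DateTime2", "Float1", "Money3"]

def build_query(table="TestTable"):
    """Build the parameterised SELECT text; the secondary field is chosen
    at random per request. Parameter values are bound separately."""
    field = random.choice(SECONDARY_FIELDS)
    return (f"SELECT * FROM {table} "
            f"WHERE (RowId BETWEEN @i1 AND @i2) AND ({field} BETWEEN @p1 AND @p2)")
```
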
The worker threads in the UT tests are not limited to a specified request rate. As soon as a SQL call has completed another is generated. The worker threads in the LT tests are constrained to generate requests at a limited rate specified in the test definition. The LT tests gradually increase the rate over the course of a number of tests.
Both the UT and LT Indirect Select tests run against only a small set of test data. This means the entire data set generally exists in the SQL Server buffer pool (the test table is pre-populated immediately before the test begins). This test therefore investigates whether reads from the buffer pool are constrained by edition / service tier. In contrast to these tests, the later Sequential Select and Scale Tests purposefully target reading from the physical disk.
UT Test Results
Results from the 30 UT Indirect Select tests are shown in the two charts below. In these charts, the “T” values are the number of worker threads generating requests against the database, e.g. “Business 4T” = a test against Business edition, running with 4 threads continuously generating requests.
The two charts are similar because the average row size was around 400 bytes throughout. The data volume figures here are based on the data content (i.e. summing the sizes of the data values, according to the number of bytes each data value requires in the SQL Server Data Type scheme but ignoring internal SQL Server overheads).
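The relationship between the two charts can be sketched with the conversion below. The ~400 bytes per row figure comes from the text above; the use of binary megabytes is an assumption:

```python
def mb_per_minute(rows_per_second, avg_row_bytes=400):
    """Convert a row throughput into a data-content throughput (MB per minute),
    based on the average data content per row and ignoring SQL Server overheads."""
    return rows_per_second * avg_row_bytes * 60 / (1024 * 1024)
```

For example, at roughly 14,000 rows per second this gives around 320 MB per minute, in the same ballpark as the figures in the results appendix.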
Reading multiple rows per request is a CPU-intensive activity. We can see the results scale roughly with the number of threads reading data (remember, this is for data already in the buffer pool; disk is not involved). The Business 8T test ran on an A3 VM, while the Prem P2 8T test ran on an A2 VM. The A2 box was effectively maxed out on CPU, which is why the Business 8T result looks so much higher.
These charts suggest that there are no significant performance limits in place that restrict the rate at which data in the buffer pool can be read. Even Basic edition manages 10,000 rows per second.
The charts below show the performance profiles over the lifetime of the tests in the charts above. Since 30 lines is rather a lot to fit onto one chart, data for the different editions / tiers has been split across several charts.
It is also worth noting that, due to the way Azure SQL Database logs metrics into sys.resource_stats, data is only available at five-minute intervals, and the last data point in each test (which would have been plotted as minute 27 in these charts) is often not useful and so is generally omitted.
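The metric collection can be sketched as below. The column names follow the documented sys.resource_stats view (queried in the master database); the "?" placeholder is a pyodbc-style assumption, and `plottable_intervals` is a hypothetical helper implementing the last-point omission described above:

```python
# Pulls the five-minute interval rows recorded for one database.
RESOURCE_STATS_QUERY = """
SELECT start_time, end_time, avg_cpu_percent
FROM sys.resource_stats
WHERE database_name = ?
ORDER BY start_time
"""

def plottable_intervals(rows):
    """Drop the final five-minute interval, which often covers only part
    of the test run and so is generally omitted from the charts."""
    return rows[:-1] if rows else rows
```
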
Basic to P1 Tiers
The new service tiers show good consistency. Basic tier was close to being CPU bound for this workload. None of the tests hit the physical disk.
Std2 to P2 Tiers
Business Edition showed quite good consistency in these tests.
LT Test Results
Three series of LT tests were carried out, against the Web, Basic and S1 editions / tiers.
The chart below illustrates how the performance of the Basic Tier varies as the workload (number of requests per second) is increased. A 30 minute test was conducted at each request rate.
This all makes sense. As the request rate increases, throughput also increases. The maximum throughput here matches that seen in the UT test. Beyond a rate of around 70 requests per second the additional requests aren't processed and throughput plateaus. Again, at no point do we hit the physical disk.
This chart looks very different to the previous Basic Edition chart. Despite these tests always starting only a few minutes after the test data was created, the data is no longer resident in the buffer pool. All of these tests were constrained by the physical disk read rate. The maximum read rate of 2000 or so rows per second is considerably less than the nearly 10,000 rows per second seen in the Basic Edition test earlier (I have deliberately left the green vertical axis scale the same as on the previous chart to highlight the difference in throughput).
The Web Edition tests do not hit the physical disk. The maximum throughput rate seen here is actually notably lower than that found in the UT tests. This is very likely due to the A0 cloud instance used to run this test being insufficiently powerful to process the incoming rows.
Indirect Select Test Conclusions
When selecting multiple rows via a combination of a range in the clustered index key and another non-indexed field, and when the data is already in the buffer pool, performance of the Web / Business editions and the new service tiers is approximately equal.
Appendix – UT Test Configuration
| Configuration | Web 1T | Busn 1T | Busn 2T | Busn 4T | Busn 8T | Basic 1T | Std S1 1T | Std S2 2T | Prem P1 4T | Prem P2 8T |
|---|---|---|---|---|---|---|---|---|---|---|
| Cloud Svc Inst. Size | A1 | A2 | A2 | A2 | A3 | A1 | A1 | A2 | A2 | A2 |
| Req. Gen. Thread Count | 1 | 1 | 2 | 4 | 8 | 1 | 1 | 2 | 4 | 8 |
| Initial Test Data (MB) | 0.1 | 0.2 | 0.2 | 0.2 | 0.8 | 0.1 | 0.1 | 0.4 | 0.8 | 1.2 |
As noted in the text, to conduct a fair test the Business 8T and Prem P2 8T tests should both have been conducted on A3 instances. Later tests (e.g. the Scale Tests) did this.
Appendix – UT Test Results
| Configuration | Avg Rows Per Second (run 1) | (run 2) | (run 3) | Avg MB Per Minute (run 1) | (run 2) | (run 3) |
|---|---|---|---|---|---|---|
| Std S1 1T | 13979 | 14539 | 15401 | 300 | 313 | 331 |
| Std S2 2T | 27595 | 28058 | 28358 | 595 | 604 | 610 |
| Prem P1 4T | 36190 | 35301 | 36044 | 779 | 760 | 775 |
| Prem P2 8T | 39048 | 39121 | 39260 | 841 | 842 | 846 |
| Configuration | SQL Avg Disk Read % (run 1) | (run 2) | (run 3) | SQL Avg CPU % (run 1) | (run 2) | (run 3) |
|---|---|---|---|---|---|---|
| Std S1 1T | 0.0 | 0.0 | 0.0 | 46.1 | 47.2 | 49.9 |
| Std S2 2T | 0.0 | 0.0 | 0.0 | 28.2 | 27.6 | 27.6 |
| Prem P1 4T | 0.0 | 0.0 | 0.0 | 21.8 | 19.3 | 48.4 |
| Prem P2 8T | 0.0 | 0.0 | 0.0 | 9.8 | 9.8 | 10.0 |
| Configuration | Cloud Svc Avg CPU % (run 1) | (run 2) | (run 3) | Error Count (run 1) | (run 2) | (run 3) |
|---|---|---|---|---|---|---|
| Std S1 1T | 55.3 | 51.4 | 52.8 | 0 | 0 | 0 |
| Std S2 2T | 51.9 | 52.6 | 53.7 | 0 | 0 | 0 |
| Prem P1 4T | 72.0 | 70.2 | 71.7 | 0 | 0 | 0 |
| Prem P2 8T | 80.2 | 80.6 | 81.5 | 0 | 0 | 0 |
Appendix – LT Test Configuration
| Configuration | Web | Basic | Std S1 |
|---|---|---|---|
| Cloud Svc Inst. Size | A0 | A0 | A0 |
| Req. Gen. Thread Count | 1 | 1 | 1 |
| Initial Test Data (MB) | 0.25 | 0.25 | 0.5 |
Appendix – LT Test Results
| Threads | Req. Per Sec | Rows/Sec (Web) | Rows/Sec (Basic) | Rows/Sec (Std 1) | MB/Min (Web) | MB/Min (Basic) | MB/Min (Std 1) |
|---|---|---|---|---|---|---|---|
| Threads | Req. Per Sec | Disk Read % (Web) | Disk Read % (Basic) | Disk Read % (Std 1) | CPU % (Web) | CPU % (Basic) | CPU % (Std 1) |
|---|---|---|---|---|---|---|---|
The Requests per Second maximum applies in total, not per thread.