Performance Testing Series
This post is part of a series of blog posts about my performance tests in Azure SQL Database. For the first post in this series (and links to all of the other posts) please see here. For a summary of all the results please see here.
For a general overview of the test architecture, test components and test types, please see here.
Scale Test Overview
This test combines the activities of the earlier Inserts and Deletes, Direct Selects, Indirect Selects and Updates tests. Please refer to those posts for a more detailed description of these activities.
The type of request generated by each worker thread at any point in time is randomly selected, based on the following probabilities:
- Inserts – 20%
- Selects – 30% – which breaks down as 75% Direct Selects and 25% Indirect Selects
- Updates – 30%
- Deletes – 20%
As in those earlier tests, the worker threads in these tests are not limited to a specified request rate.
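To make the request mix concrete, the sketch below shows one way a worker thread could pick an operation on each iteration using the probabilities above, issuing the next request as soon as the previous one completes. This is only an illustration of the selection logic – the names (pick_operation, handlers, etc.) are hypothetical and not taken from the actual test application.

```python
import random

# Request mix from the list above; the 30% Selects split 75/25 into Direct/Indirect.
OPERATION_WEIGHTS = {
    "insert":          0.20,
    "select_direct":   0.30 * 0.75,   # 22.5%
    "select_indirect": 0.30 * 0.25,   # 7.5%
    "update":          0.30,
    "delete":          0.20,
}

def pick_operation(rng: random.Random) -> str:
    """Randomly choose the next operation type according to the weights above."""
    ops, weights = zip(*OPERATION_WEIGHTS.items())
    return rng.choices(ops, weights=weights, k=1)[0]

def worker_loop(rng: random.Random, handlers: dict, stop_flag) -> None:
    """Unthrottled worker: issue the next request as soon as the previous one
    completes -- there is no target request rate."""
    while not stop_flag.is_set():
        op = pick_operation(rng)
        handlers[op]()   # e.g. a callable that performs an Insert, Direct Select, ...
```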
The earlier tests ran against a relatively small set of data that was held entirely in memory. These tests run against a newly restored and much larger 36 GB database (except for the Basic edition, where a 2 GB database was used). Because the database is large and newly restored, none of the data exists in the buffer pool at the start of the test, so the results also include the effect of each tier's physical disk read rates.
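As a rough way of confirming that cold-cache starting point, the amount of the database currently cached can be checked via sys.dm_os_buffer_descriptors. The snippet below is a hypothetical check using pyodbc (connection details omitted); it was not part of the original test application.

```python
import pyodbc

# Hypothetical connection string -- substitute real server/database/credentials.
CONN_STR = "Driver={ODBC Driver 17 for SQL Server};Server=...;Database=...;..."

# Each buffer pool page is 8 KB, so page count * 8 / 1024 gives MB currently cached.
QUERY = """
SELECT COUNT(*) * 8 / 1024.0 AS cached_mb
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID();
"""

with pyodbc.connect(CONN_STR) as conn:
    cached_mb = conn.cursor().execute(QUERY).fetchone()[0]
    print(f"Data currently in buffer pool: {cached_mb:.1f} MB")
```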
Using the same workload against every tier is, on one hand, not entirely fair – clearly the lower tiers/performance levels cannot be expected to perform as well as the top end. On the other hand, running an identical workload is a handy way to illustrate exactly how large those differences are.
The previous Insert, Select, Update and Delete tests randomly picked a row from any point in the database with equal probability. Since those tests were working with data entirely in the buffer pool, that was fine. In most databases, certain data is more likely to be hit than other data, e.g. typically more recent data is more “active”.
To at least loosely replicate that, two distribution functions were used in these tests so that data towards the end of each test table (the highest row ID values) was more likely to be accessed than data at the start. The first distribution function, used for the single row operations (i.e. Select Direct, Update and Delete), had a high probability of choosing data in the last 5% of each table. The second distribution function, used for the row range operations (i.e. Select Indirect), had a slightly higher probability of selecting slightly older data – i.e. it was less peaked at the end of the table.
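Purely as an illustration of the idea, the sketch below shows one way such end-weighted row selection could be implemented using a simple power-law skew. The skew values here are assumptions chosen for illustration – the actual functions used in the tests are the ones charted in the appendix.

```python
import random

def pick_row_id(max_row_id: int, skew: float, rng: random.Random) -> int:
    """Pick a row ID with probability skewed towards the end of the table.

    A uniform sample u in [0, 1) is raised to 1/skew; larger skew values
    concentrate more of the picks near max_row_id (the most recent rows).
    """
    u = rng.random() ** (1.0 / skew)
    return int(u * max_row_id) + 1

rng = random.Random(42)

# Strongly peaked: most picks land in the last few percent of the table
# (single row operations: Select Direct, Update, Delete).
single_row_id = pick_row_id(10_000_000, skew=20.0, rng=rng)

# Less peaked: still favours recent rows but reaches further back
# (row range operations: Select Indirect).
range_start_id = pick_row_id(10_000_000, skew=5.0, rng=rng)
```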
Probability distribution charts for these functions can be found in an appendix at the end of this post.
I am not at all claiming that this workload and these distributions are representative of the majority of databases in use. They are merely one example workload and data access pattern. The single row distribution function in particular is quite “peaked”; however, that choice was deliberate – such a peak should allow some data to be cached in the buffer pool relatively quickly.
Each test lasted 120 minutes and was executed three times against each tier. Tests were performed against Business edition with 1, 2, 4 and 8 threads generating requests. This is due to the opportunistic nature of that edition, where more threads mean more throughput, at least up to a point. This is less true of the new service tiers, where the resource quotas come into play and additional threads do not improve performance once a quota limit is reached (which generally happens in the tests here).
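For reference, the overall run matrix can be summarised as in the sketch below. The tier names and per-tier thread counts here are placeholders rather than the exact configurations used – see the appendices for the actual test parameters.

```python
from itertools import product

TEST_DURATION_MINUTES = 120
RUNS_PER_CONFIGURATION = 3

# Business edition is tested at several thread counts because throughput
# there scales (up to a point) with the number of request-generating threads.
business_configs = [("Business", threads) for threads in (1, 2, 4, 8)]

# The new service tiers are quota-limited, so extra threads stop helping once
# the quota is the bottleneck.  Tier names and thread counts are placeholders.
new_tier_configs = [(tier, 1) for tier in ("Basic", "S1", "S2", "P1", "P2", "P3")]

test_matrix = [
    (tier, threads, run)
    for (tier, threads), run in product(
        business_configs + new_tier_configs, range(1, RUNS_PER_CONFIGURATION + 1)
    )
]

print(f"{len(test_matrix)} test runs of {TEST_DURATION_MINUTES} minutes each")
```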
More details of the test parameters and configurations can be found in the appendices at the end of this post.
Test Results
Overview
Results from the 30 Scale Tests are shown in the charts below. In these charts, the “T” values are the number of worker threads generating requests against the database, e.g. “Business 4T” = a test against Business edition, running with 4 threads continuously generating requests.
