Has Azure SQL Database Business Edition Got Slower?

Introduction

Since Microsoft released the Azure SQL Database v12 Preview in mid-December, some users have suggested there was, around the same time, a noticeable drop in the performance of the legacy Web/Business editions of Azure SQL Database.

In this post I re-run some of my earlier tests to see if that really is true.  I was planning to revisit Business Edition in any case, to get some better and more up-to-date metrics for comparison with v12 when it heads into General Availability.

July 2014 Performance

The Sequential Select read tests performed in July 2014 estimated that the Web/Business editions read from physical disk at an average of around 550 MB per minute (peaking at around 650 MB per minute).  See those earlier tests here for details.

Several write tests were also performed in July 2014.  All were reasonably consistent and estimated the maximum write rate at around 90 MB per minute.  See for example the Bulk Load tests here.

Quick Test

Before setting up any elaborate tests, I ran a quick initial read test – basically a sequential read of the whole of a 37 GB database.  This took 2 hours 50 minutes, equating to a read rate of around 225 MB per minute.  That is clearly much less than the July 2014 results, and a big enough difference to warrant further investigation.
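(As a quick sanity check on that figure: 37 GB is roughly 37,900 MB and 2 hours 50 minutes is 170 minutes, so 37,900 ÷ 170 ≈ 223 MB per minute.)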

Re-running the Sequential Select Read Tests:  Read Rate into a VM

The first full tests to run were a repeat of the July 2014 Sequential Select Tests.  This should allow a direct comparison between performance in July 2014 and January 2015.

The Sequential Select test involves reading an entire database to get a simple estimate of maximum read rates. The data is read table-by-table, and each table is read sequentially (i.e. roughly speaking SELECT * with no ORDER BY).

The test involves using a newly restored database. The database is 36 GB in size.  The test lasts 60 minutes.

The size of the database and the fact that it is newly restored mean that none of the data exists in the buffer pool, so the test estimates the maximum read rate from physical disk.

The data is read into a Windows Azure Worker Role (i.e. a VM in the same Azure data centre) which measures the data read rate and captures various other metrics.  The test uses a single reader (i.e. a single thread) which is typically sufficient to hit the read rate limit (more details on this below).
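For illustration, a minimal sketch of the per-table reads the single reader thread issues is shown below.  The real harness runs in the Worker Role and measures the data rate on the client side; the dynamic-SQL approach and the use of sys.tables here are my assumptions, not the actual test code.

    -- Build one "SELECT * FROM <table>" statement per user table; no ORDER BY,
    -- so each table is scanned sequentially.  The Worker Role executes the
    -- statements in turn and records MB received per minute on the client side.
    DECLARE @sql nvarchar(max) = N'';

    SELECT @sql += N'SELECT * FROM ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
                 + N';' + CHAR(10)
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id;

    PRINT @sql;   -- the harness would execute each statement rather than just print the list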

Results

[Chart 20150126-01: read rate over time for three runs of the January 2015 Sequential Select test]

The chart shows three different runs of the same Sequential Select test.  The read rate is quite volatile, flipping between around 200 MB per minute and around 500 MB per minute.  The first test ran almost entirely at 500 MB per minute, the second test mostly at 200 MB per minute, the third test flipped between the two rates.

Comparing with the July 2014 test results:

[Chart 20150126-02: January 2015 read rates compared with the July 2014 Sequential Select test results]

The read rate was much more stable in the July tests at around 550 MB to 600 MB per minute, though not completely so – the third Business Edition test shows a brief drop to 200 MB per minute.

This suggests Business Edition read performance has become more volatile since July 2014.

Alternative Read Test:  Maximum Read Rate Test inside SQL Server

The older test, described above, was based on reading from the test database into an Azure Worker Role – i.e. a VM outside SQL Server.

The more recent tests I have conducted against the newer service tiers in v11 and preview v12 (see previous blog posts in December 2014) involved measuring the read rate using a simple aggregate query (i.e. select count(*) from … where …).  This performs table scans across an entire database but the data does not leave SQL Server (only the final count(*) value is returned by the query).  Compared to the previous test, this has the advantage of not involving any significant network activity between SQL Server and an external VM – activity which may have constrained the read rate in the earlier test.  It should therefore give a better estimate of the maximum read rate from disk, free of any network constraints.  For more details of this test, see the “Test 1 – Sequential Table Scans by a SQL Query from Physical Disk into Memory” heading in the previous post here.
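As a rough sketch, the in-server measurement for a single 1 GB table might look like the following.  The table and column names are illustrative only – the post does not show the actual test schema or query.

    -- Time a full scan of one 1 GB table; only the COUNT(*) result leaves SQL Server.
    -- The WHERE clause mirrors the "select count(*) from ... where ..." shape of the
    -- test query; SomeColumn is a hypothetical non-indexed column, so a full scan is
    -- still required.
    DECLARE @start datetime2 = SYSUTCDATETIME();

    SELECT COUNT(*) FROM dbo.TestTable_01 WHERE SomeColumn >= 0;

    DECLARE @seconds float = DATEDIFF(MILLISECOND, @start, SYSUTCDATETIME()) / 1000.0;
    SELECT 1024.0 / (@seconds / 60.0) AS [MB per minute];   -- 1 GB = 1024 MB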

Results

[Chart 20150126-03: read rate for each GB read, in-server aggregate query test, three runs]

Above: The data was read from the database table-by-table.  Each table was 1 GB in size.  The time taken to read each GB was measured.  The chart above shows the read rate for each GB of data read during the test.  For example, the 24th GB in test 2 was read at a rate of around 275 MB per minute.  The test was repeated three times.

The flipping between two different read rates is even more clearly shown in this test.  The maximum read rate here is actually 800 MB per minute, higher than the 550 MB per minute seen in the earlier tests.  As suggested above, this higher rate is very likely explained by a network I/O restriction somewhere between SQL Server and the Azure Worker Role VM used in the earlier test.

More interesting, however, is what this flipping suggests: internal logic in Business Edition whereby, once a given SQL Server becomes overloaded (i.e. when all of the different databases from all of the different users on that server attempt to read too much data too quickly), the read rate for some or all databases on that server is limited to around 200 MB per minute until the overload passes.  This is probably a simpler mechanism, and one that provides a more reliable database service, than failing databases over to other SQL Servers to balance the workload (causing outages in the process).

Re-running the Scale Tests

The Scale Tests from July 2014 involved executing a mixed workload of inserts, (single and multi-row) selects, updates and deletes against copies of the same 36 GB test database.  This test was re-run as it provides a more complex workload than a simple sequential read – which is clearly not a typical workload.

Again, the test starts with a newly-restored database.  The requests are not uniformly distributed across the database (mirroring most typical OLTP workloads) so that the SQL Server Buffer Pool should have an increasing effect as the test runs and more data is cached in memory.  See the earlier post here for more details on this test.

In July 2014, tests were performed against Business Edition with different numbers of worker threads (i.e. numbers of concurrent requests).  The test re-run here is the heaviest of those – the 8T test from July 2014 – i.e. 8 worker threads running in parallel in the VM, generating requests against the test database.
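To give a flavour of the workload, the statement shapes below are illustrative only – the real schema, key distribution and parameter values are not shown in this post.  The 8 worker threads issue parameterised versions of statements like these, with the keys skewed towards a subset of the data.

    -- Hypothetical table and column names; the harness picks @id / @customerId
    -- at random using a non-uniform distribution, so "hot" rows become cached
    -- in the buffer pool as the test progresses.
    DECLARE @id int = 12345, @customerId int = 678, @amount decimal(10, 2) = 99.95;

    SELECT * FROM dbo.Orders WHERE OrderId = @id;                      -- single-row select
    SELECT TOP (50) * FROM dbo.Orders WHERE CustomerId = @customerId;  -- multi-row select

    INSERT INTO dbo.Orders (CustomerId, OrderDate, Amount)             -- insert
    VALUES (@customerId, SYSUTCDATETIME(), @amount);

    UPDATE dbo.Orders SET Amount = @amount WHERE OrderId = @id;        -- update

    DELETE FROM dbo.Orders WHERE OrderId = @id;                        -- delete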

Results

[Chart 20150126-04: operations per second – July 2014 runs (blue) vs January 2015 runs (green), three runs each]

The lines on the chart show the number of operations (i.e. SQL statements) executed per second.  The blue lines are the July 2014 results.  The test was run three times in July 2014.  The green lines are the January 2015 results – again the test was run three times.

Clearly, performance is comparable between July 2014 and January 2015.  The performance is quite volatile in all six tests (as is to be expected in Business Edition where all users share/compete for server resources).

Averaging the three results from each set of tests:

[Chart 20150126-05: operations per second, averaged across the three runs in each set of tests]

The latter stages of the test perform better in the more recent tests.  There is no difference in the types of request generated as the test progresses.  The only thing that changes as the test progresses is that the SQL Server Buffer Pool will have more data cached (i.e. the cache starts empty and fills as requests are processed).  Therefore, this difference between July 2014 and January 2015 could be explained by an increase in the size of the SQL Server Buffer Pool per database since July 2014 allowing more data to be cached – or it could simply be a random result given the inherent volatility in Business Edition.

Write Rates

The write test involved using a SQL query to write 100 MB chunks into a database, up to 10 GB in total, and measuring the write time for each chunk.  The query was executed directly on the server – i.e. no external VM and no VM-to-SQL Server network hops were involved.  Again, the test was executed three times.
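A rough sketch of this kind of chunked write test is shown below.  The post does not give the actual query, so the table definition, row size and row source here are assumptions chosen so that each batch is approximately 100 MB.

    -- Write ~100 MB per iteration (102,400 rows of ~1 KB each), up to ~10 GB in
    -- total, timing each chunk.
    CREATE TABLE dbo.WriteTest
    (
        Id     int IDENTITY(1, 1) PRIMARY KEY,
        Filler char(1000) NOT NULL
    );

    DECLARE @chunk int = 1;

    WHILE @chunk <= 100   -- 100 chunks x ~100 MB = ~10 GB
    BEGIN
        DECLARE @start datetime2 = SYSUTCDATETIME();

        INSERT INTO dbo.WriteTest (Filler)
        SELECT TOP (102400) REPLICATE('x', 1000)
        FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b;   -- cheap row source

        DECLARE @seconds float = DATEDIFF(MILLISECOND, @start, SYSUTCDATETIME()) / 1000.0;
        SELECT @chunk AS Chunk, 100.0 / (@seconds / 60.0) AS [MB per minute];

        SET @chunk += 1;
    END;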

Results

[Chart 20150126-06: write rate for each 100 MB chunk written, three runs]

Above: The data was written to the database in 100 MB chunks, up to a total of 10 GB.  The time taken to write each 100 MB chunk was measured.  The chart above shows the write rate for each chunk written during the test.  The test was repeated three times.

Clearly, the write rates are quite volatile (just as they were in July 2014).  Smoothing (via a rolling average):

[Chart 20150126-07: write rates smoothed with a rolling average]

The average write rate in the first test was around 87 MB per minute.  In the second and third tests it was almost exactly 100 MB per minute.  This is entirely consistent with the July 2014 results.
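For what it's worth, the smoothing itself is straightforward.  Assuming the per-chunk rates had been captured into a (hypothetical) log table, a trailing window average such as the following would produce the smoothed lines; the window size used for the chart above may differ.

    -- Trailing 10-chunk rolling average over the measured write rates.
    -- dbo.WriteRateLog (Chunk int, MBPerMinute float) is a hypothetical log table.
    SELECT  Chunk,
            MBPerMinute,
            AVG(MBPerMinute) OVER (ORDER BY Chunk
                                   ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) AS RollingAvg
    FROM    dbo.WriteRateLog
    ORDER BY Chunk;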

Conclusion

The Business Edition of Azure SQL Database has always featured / suffered from variable performance.  These tests appear to show that performance is more volatile in January 2015 than it was in July 2014.  However, there is no clear evidence that performance has been “turned down” – e.g. no lower maximum performance limit appears to have been imposed.

The ratio of SQL Servers to user databases in Azure SQL Database has likely always varied as new user databases and new batches of servers are deployed at different times.  It is probable that last July there was more spare capacity and so a lower database density (i.e. number of databases per server) across the SQL Servers in the Azure data centre.  Now, it is possible that the database density has risen and, as a result, performance for individual databases is being trimmed back more often to balance workloads (such as the flipping between higher and lower read rates seen in these tests).

Conspiracy theorists may suggest Microsoft have increased database density to reduce performance.  Personally I think that is unlikely.  Business Edition has always been volatile and some users have always had to endure periods of more volatile / lower performance.

Indeed, the entirety of this post shows why Business Edition is difficult to work with and will inevitably lead to difficulties between Microsoft and users.  Performance varies and drifts over time, and there are no performance benchmarks to plan against, test against, measure against, etc.

The long-term solution to this variable performance is to move to one of the newer service tiers, where the performance boundaries between different databases are much more rigorously enforced.  Previously this would have meant a much greater cost than Business Edition.  However, the new S3 service tier in the upcoming Azure SQL Database v12 (currently in preview) should help here, providing a closer price/performance fit to Business Edition than has been available to date.  See my earlier post here for more details.

And Finally…

These tests were carried out in the Western Europe Data Centre.  Performance may be different in other data centres (e.g. where the ratio of SQL Servers to user databases may be different).
