Hebtech

Azure SQL Database Elastic Scale Part 2 - The Shard Map Manager Database

Azure SQL Database Elastic Scale Part 2 - The Shard Map Manager Database
MAY 1, 2015

In my last post, I discussed some of the concepts around Sharding and how we could use Sharding and Azure Database Elastic Scale in a real world application to implement Horizontal Scaling. In this post, I'll introduce an Azure SQL Database Runbook that uses Powershell to implement the Sharding, and also talk about what is happening behind the scenes in Azure SQL Database. In my final post (and I promise that I'll get to it quicker than this post), I'll finish talking about the Runbook, and introduce a sample C# application that queries the Sharded Databases.
 

Review of the Business Case 
Before getting started, I just wanted to review the fictictious business case that we were going to solve that was covered in the previous post. We have an Employee Payroll System that we want to Scale Horizontally. We will do this by Sharding the Payroll database into 4 Shards with Employee ID's 1-25 in the first Shard, 26-50 in the second Shard, 51-75 in the third Shard and 76-100 in the final Shard. Using this approach, we can distribute the load across 4 databases instead of having all of the load handled by a single database. The layout of the Sharded databases can be seen in the diagram below.


Payroll System Database Setup

Setting up the Runbooks
The next thing that we have to do is to set up the Powershell Libraries and the Azure SQL Database Runbook that we are going to use to create the Elastic Scale Databases to implement the Sharding. I'll go through these steps in case you want to set up the Runbook on your own Azure SQL Database. Where possible, I'm going to provide links for all of the steps because these steps have been covered in a lot better detail in other places.
  1. Create the Server - The easiest way to create an Azure Server for use with the Elastic Scale Databases is to create a sample database by going through the steps in this post. Remember the name of the Server, the Admin login and the Password that you created because we will need to add them to the Runbook later.
  2. Create an Azure Automation Account - The next thing that needs to happen is to create an Automation Account. In order to create an Automation Account, you first have to enable Automation in the Preview Portal. Once that is done, you just need to click on "Automation" on the left hand side of the Azure Portal, click the "Create" icon at the bottom of the screen and give a "Name" and "Region" for your Automation Account. Again, these steps (with pictures) are all summed up really well in the post Microsoft Azure Automation.
  3. Upload PowerShell module to Azure Automation as an Asset - The next thing that we need to do is to upload a Powershell module that contains the Azure Elastic Scale dll (Microsoft.Azure.SqlDatabase.ElasticScale.Client.dll), some database helper functions that create and populate the databases (SqlDatabaseHelpers.psm1) and a Powershell data file that contains information about what is contained in the Powershell module (PayrollElasticScaleModule.psd1). In order to upload the Powershell module, please perform the following steps:

    1. Click on the Automation Account that you created in Step 2.
    2. Click on "Assets" at the top of the screen.
    3. Click "Import Module" at the bottom of the screen.
    4. Select PayrollElasticScaleModule.zip that is included with this post and then click the "Check Mark" to complete.
  4. Import the Azure Runbook - The last thing that we have to do is to import the Runbook. In order to import the Runbook, click "Runbooks" at the top of the screen, and then click "Import" at the bottom of the screen. Select PayrollElasticScale.ps1 that is included with this post and then click the "Check Mark" to complete.
The Powershell Runbook
If you've set up everything correctly, you should now see the PayrollElasticScale Runbook deployed to Azure as shown below.
 

PayrollElasticScale Runbook
 
If you click the PayrollElasticScale Runbook and then click "Author", you'll see the code for the Runbook. The first thing that we'll have to do is to change the Runbook so that it can run in your environment. In order to do that, change the variables $SqlServerName, $UserName and $Password from "xxxxxx" to match what you have created in Step 1 of "Setting up the Runbooks" (above). In the code below, you'll also see that we added a reference to the Azure Elastic Scale dll that we uploaded in Step 3.
 

Change Server Name, Username and Password
 
The next thing that happens in the script is that we have to explicitly create the Shard Map Manager database - PayrollShardMapManagerDb. This database keeps track of the type of Shard Map that is being used and the information about each Shard. At this point, the database is empty, and has been created using one of the functions in the Powershell module that was uploaded in Step 3.
 

Create Shard Map Manager Database
 
The next bit of Powershell is where things start to get a bit interesting. The piece of Powershell below populates the Shard Map Manager Database with all of the tables that control how the Shard databases are queried.
 

Populate the Shard Map Manager Database
 
When this bit of Powershell is run, the database PayrollShardMapManagerDb is populated with the 6 following tables:
  • [__ShardManagement].[ShardMapManagerGlobal] - Stores version information about the Elastic Scale Database
  • [__ShardManagement].[ShardMapsGlobal] - Stores the ShardMapId and the name of the Shard Map that has been created. All of the rest of the information that we create in the Shard Map Manager Database will link back to this ShardMapId.
  • [__ShardManagement].[ShardedDatabaseSchemaInfosGlobal] - Stores the schema information about the tables that are being Sharded.
  • [__ShardManagement].[ShardMappingsGlobal] - Stores information about the Shards that get created such as the minimum and maximum values for each Shard.
  • [__ShardManagement].[ShardsGlobal] - Stores information about which database contains each Shard.
  • [__ShardManagement].[OperationsLogGlobal] - Stores information about operations that have been performed in the Shard Map Manager Database

We then add information about the tables that we are going to Shard by running the Powershell shown below. This step is quite important because it is here that we define the Shard Key, which in our case is the EmployeeID, and the table to which that key belongs - the Employee table. We also define the reference tables: the Payroll table and the PayrollDetail table.


Add Schema Information to the Shard Map Manager

At this point, only 3 tables in the Shard Map Manager database have data: [__ShardManagement].[ShardMapManagerGlobal], [__ShardManagement].[ShardMapsGlobal] and [__ShardManagement].[ShardedDatabaseSchemaInfosGlobal]. When we look at the [__ShardManagement].[ShardMapManagerGlobal] table, we can see the version information of the Elastic Scale technology that is being used.


[__ShardManagement].[ShardMapManagerGlobal] table

When we look at the [__ShardManagement].[ShardMapsGlobal] table, we can see the ShardMapId and the Name that we have assigned to that Shard Map. We can also see that we have set up a Range Shard Map (ShardMapType = 2) and that our Shard Key is an integer (KeyType = 1).


[__ShardManagement].[ShardMapsGlobal] table

Finally, when we look at the [__ShardManagement].[ShardedDatabaseSchemaInfosGlobal] table, we can see that for this particular Shard Map, an XML document has been defined that contains all of the information about the tables that we are going to Shard.


[__ShardManagement].[ShardedDatabaseSchemaInfosGlobal] table

When we look at the XML document, we can see that it contains the Sharding Key (EmployeeID) and the table to which the Sharding Key belongs (Employee table).


XML Scema Information for the Shard Map Manager Database

Conclusion
I think that I'll probably wrap things up here for now as this post is starting to go on a bit. In the next post, I'll go into detail about the rest of the scripts which create the Shards and the C# application that can be used to run a query against all of the Shards.
 
Before going, I just wanted to point out some other great resources where you can go to get some more Azure SQL Database Sharding Scripts. The Runbook and Powershell Module for this post are based on the following posts from Sidney Higa at Microsoft - Elastic database Split-Merge tool tutorial and Azure SQL Database Elastic Scale: Shard Elasticity. They've been rewritten a lot to make them much simpler so they are easier to follow in a blog post. If you are going to use Azure Database Elastic Scale in production, you would want to make the scripts a lot more robust and modular as Sydney has done in his scripts.
 

References
http://azure.microsoft.com/en-us/documentation/articles/sql-database-get...
http://blogs.technet.com/b/cbernier/archive/2014/04/08/microsoft-azure-a...
http://azure.microsoft.com/en-us/documentation/articles/sql-database-ela...
https://gallery.technet.microsoft.com/scriptcenter/Elastic-Scale-Shard-c...

 

 

Add new comment