“It’s still magic even if you know how it’s done.”
– Terry Pratchett
For a long time now I have worked heavily on Redgate’s SQL Provision software. I was involved from the very beginning when SQL Clone was but a few months old and just finding it’s little sheepish (teehee) legs in the world and before Redgate Data Masker was “a thing”.
The one thing I’ve never comprehensively been able to help a customer with though, was PaaS. Platform as a Service has been something that has plagued my time with this wonderful software and that is because you simply cannot take an Image (one of SQL Clones VHDX files) from an Azure SQL Database or from an Amazon AWS RDS Instance directly, helpfully.
But then in January 2019 I did some research and I wrote this article on how you could achieve Provisioning from Azure, through the BACPAC file export method, this was great and several customers decided this method was good enough for them and have adopted it, and in fact completely PowerShell-ed out the process (links to something similar in my GitHub which I used for my PASS Summit Demo 2019), however this never solved my AWS problem.
I’ll be the first to admit, I didn’t even try. AWS for me was “here be dragons” and I was a complete n00b; I didn’t even know what the dashboard would look like! However, in early December 2019 I was on a call with a customer who mentioned that they would like to Provision directly from RDS SQL Server and they don’t want any “additional hops” like the BACPAC Azure method. On the same day, Kendra Little (sorry Kendra, you seem to be the hero of most of my blogs!) shared some insight that it was possible, with AWS, to output .bak files directly to an S3 bucket. That got me thinking, if we can get access to a .bak file directly from S3, surely we could provision it all the way to dev with little- to-no involvement in the process?
My reaction to this news was that it was the universe telling me to get off my backside and to do some thing about it, so with renewed determination and looking a little bit like this:
I set off into the world of AWS.
1 – Setup
Now naturally, I am not a company. Shock. So i don’t have any pre-existing infrastructure available in AWS for me to tinker with, and that was the first challenge. “Can I use anything in AWS for free?” – The answer? Actually, yes! AWS has a free tier for people like myself who are reeeeeally
stingy curious which at the very least will let me better understand how to interact with the various moving parts for this.
First step. I’m going to need a Database in RDS, so I went to my trusty DMDatabase (scripts here for y’all) which I use for EVERYTHING, on-premise, in Azure, EV-ERY-THING.
In AWS I went to RDS and setup a SQL Server Express instance called dmdatabaseprod (which fortunately kept it on the free tier). Luckily, AWS provides an easy getting started tutorial for this which you can find here – why re-invent the wheel? After creating the DB I had some major issues actually connecting to it in SQL Server Management Studio; I thought I had allowed all the correct ports for traffic, put it in the right security groups blah blah blah… and guess what it was?
Public accessibility. Was set. To “No“. *cough* well that problem was sorted quickly so it was onto the next challenge.
2 – Backing up to an S3 bucket
I can take no credit for this step whatsoever because it was the wonderful Josh Burns Tech who saved me. He created a video showing exactly what I wanted to do and you can see this, with full instructions and scripts here: https://joshburnstech.com/2019/06/aws-rds-sql-server-database-restore-and-backup-using-s3/
After following the advice of Josh and walking through his steps, getting a new S3 bucket setup and configured and creating a new backup of my DMDatabase, I was a good step of the way there! As you can see my .bak was nicely sat in my S3 bucket – marvelous!
3 – Making the S3 bucket visible to SQL Server
This was the tricky bit. My approach to solving this problem was “I need SQL Server to be able to see the .bak file to be able to create an image and clones from it. So, logically, I need it to be mapped as a network drive of some kind?” – simple, no? It turns out that it was the best approach from what I found online but there were a number of ways I found of tackling it.
I started out using this article from Built With Cloud which was super informative and helpful, I managed to get rClone running and the S3 bucket was showing as a local drive, which was exactly what I wanted:
But I ran into a problem – SQL Server could not access the mapped drive.
So is there another way? I found a bunch of resources online for CloudBerry, TnT Drive and MountainDuck but, like I mentioned I’m on a very limited budget ($0) so naturally… I put this on twitter. I received a tonne of replies giving some examples and some ideas and the one idea that kept coming up time and time again was AWS Storage Gateway. I had never heard of it, nor did I have any idea of how it worked.
So. Back to Google (or in my case Ecosia, it’s a search engine that plants trees if you search with them, what’s not to love???)
To simplify it. Storage Gateway is solution that is deployed “on-premise” i.e. as a hardware gateway appliance or a virtual machine, and it allows you to effectively use your S3 (or other AWS cloud storage service) locally by acting as the middle-person between AWS and your on-premise systems, and it does fancy local caching which means super low latency network and disk performance. There are a few different types you can utilize but for this exercise I went with “File Gateway”, from Amazon: “A File Gateway is a type of Storage Gateway used to integrate your existing on-premise application with the Amazon S3. It provides NFS (Network File System) and SMB (Server Message Block) access to data in S3 for any workloads that require working with objects.”
Sounds ideal. Time to set it up!
I have access to VMWare Workstation Pro on my machine so I downloaded the OVF template for VMWare ESXi and loaded it up in VMWare (the default username and password threw me a little but it turns out it’s admin and password as standard, and you can change it as you configure):
Then it was a bit of a checkbox exercise from there:
Now I wasn’t 101% sure of exactly how best to set up my fancy new Gateway, so fortunately I found this super helpful video, funnily enough from Teach Me Cloud as opposed to the aforementioned Built With Cloud, and although it was a little out of date, I also had one of Redgate’s finest engineers (the wonderful Nick) on hand to help out. Between the video and us (mostly Nick) we were able to get everything connected!
But I ran into the same problem. SQL Server couldn’t access the backup file.
Fortunately though, after some frantic Googling we managed to find a very straightforward article that fixed all of our pain! We needed to map the drive in SQL Server itself – Thanks Ahmad! Now, yes, I did use XP_CMDSHELL (insert DBAReactions gif at the mere mention of it here) but this was for testing purposes anyway, I’m sure there are other ways to get around this problem!
…and guess what? It worked. Huzzah!
If you can substitute my poorly named image “blah” and assume instead it says “HeyLookImAnImageFromAnRDSBackupFileArentIClever“, this means I can now schedule my PowerShell process to create new images at the beginning of every week to refresh my DMDatabase Clone environment, no manual steps needed!
Whilst there are a number of steps involved, you can easily take advantage of some of the fantastic features offered by AWS like Storage Gateway and even if your database is hosted in RDS, you can fully provision copies back into IaaS (Infrastructure as a Service) or On-Premise workflows to keep your Pre-Production copies up to date and useful in development!
Just remember to mask it too!
P.S. I’m sure you could probably find some clever way of using the free rClone method I also mentioned and having this readable by SQL Server, but I haven’t figured it out yet, but will blog when I do!