Provisioning local Dev copies from AWS RDS (SQL Server)

“It’s still magic even if you know how it’s done.”
Terry Pratchett

For a long time now I have worked heavily with Redgate’s SQL Provision software. I was involved from the very beginning, when SQL Clone was but a few months old and just finding its little sheepish (teehee) legs in the world, and before Redgate Data Masker was “a thing”.

The one thing I’ve never comprehensively been able to help a customer with, though, was PaaS. Platform as a Service has plagued my time with this wonderful software, because you simply cannot take an Image (one of SQL Clone’s VHDX files) directly from an Azure SQL Database or from an Amazon AWS RDS instance.

But then in January 2019 I did some research and wrote this article on how you could achieve provisioning from Azure through the BACPAC file export method. This was great: several customers decided this method was good enough for them and have adopted it, and in fact completely PowerShell-ed out the process (links to something similar are in my GitHub, which I used for my PASS Summit 2019 demo). However, this never solved my AWS problem.

I’ll be the first to admit, I didn’t even try. AWS for me was “here be dragons” and I was a complete n00b; I didn’t even know what the dashboard would look like! However, in early December 2019 I was on a call with a customer who mentioned that they would like to provision directly from RDS SQL Server, without any “additional hops” like the BACPAC Azure method. On the same day, Kendra Little (sorry Kendra, you seem to be the hero of most of my blogs!) shared some insight that it was possible, with AWS, to output .bak files directly to an S3 bucket. That got me thinking: if we can get access to a .bak file directly from S3, surely we could provision it all the way to dev with little-to-no manual involvement in the process?

My reaction to this news was that it was the universe telling me to get off my backside and do something about it, so with renewed determination and looking a little bit like this:

Ready Lets Go GIF by Leroy Patterson

I set off into the world of AWS.

1 – Setup

Now naturally, I am not a company. Shock. So I don’t have any pre-existing infrastructure available in AWS for me to tinker with, and that was the first challenge. “Can I use anything in AWS for free?” – The answer? Actually, yes! AWS has a free tier for people like myself who are reeeeeally stingy curious, which at the very least would let me better understand how to interact with the various moving parts for this.

First step. I’m going to need a Database in RDS, so I went to my trusty DMDatabase (scripts here for y’all) which I use for EVERYTHING, on-premise, in Azure, EV-ERY-THING.

In AWS I went to RDS and set up a SQL Server Express instance called dmdatabaseprod (which fortunately kept it on the free tier). Luckily, AWS provides an easy getting-started tutorial for this, which you can find here – why re-invent the wheel? After creating the DB I had some major issues actually connecting to it in SQL Server Management Studio; I thought I had allowed all the correct ports for traffic, put it in the right security groups blah blah blah… and guess what it was?

Public accessibility. Was set. To “No“. *cough* well that problem was sorted quickly so it was onto the next challenge.

2 – Backing up to an S3 bucket

I can take no credit for this step whatsoever because it was the wonderful Josh Burns Tech who saved me. He created a video showing exactly what I wanted to do and you can see this, with full instructions and scripts here: https://joshburnstech.com/2019/06/aws-rds-sql-server-database-restore-and-backup-using-s3/

After following the advice of Josh and walking through his steps – getting a new S3 bucket set up and configured, and creating a new backup of my DMDatabase – I was a good step of the way there! As you can see, my .bak was nicely sat in my S3 bucket – marvelous!
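For completeness, the key piece of Josh’s walkthrough is that RDS exposes native backup and restore through stored procedures in msdb (you’ll need the SQLSERVER_BACKUP_RESTORE option group attached to the instance first). Kicking off the backup looked something like this – the bucket and file names below are placeholders for my own:

-- Take a native backup of the RDS database straight into the S3 bucket
-- (bucket ARN and file name are placeholders)
EXEC msdb.dbo.rds_backup_database
     @source_db_name = 'DMDatabase',
     @s3_arn_to_backup_to = 'arn:aws:s3:::dmdatabase-backups/DMDatabase.bak',
     @overwrite_s3_backup_file = 1;

-- The backup runs as an asynchronous task, so poll its progress with:
EXEC msdb.dbo.rds_task_status @db_name = 'DMDatabase';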

3 – Making the S3 bucket visible to SQL Server

This was the tricky bit. My approach to solving this problem was: “I need SQL Server to be able to see the .bak file so it can create an image and clones from it. So, logically, I need the bucket mapped as a network drive of some kind?” – simple, no? From what I found online this was indeed the best approach, but there were a number of ways of tackling it.

I started out using this article from Built With Cloud, which was super informative and helpful. I managed to get rclone running and the S3 bucket showing as a local drive, which was exactly what I wanted:

But I ran into a problem – SQL Server could not access the mapped drive. (In hindsight this makes sense: mapped drives are per-user, so a drive mapped in my own Windows session was invisible to the SQL Server service account.)

So is there another way? I found a bunch of resources online for CloudBerry, TnT Drive and Mountain Duck but, like I mentioned, I’m on a very limited budget ($0), so naturally… I put this on Twitter. I received a tonne of replies with examples and ideas, and the one that kept coming up time and time again was AWS Storage Gateway. I had never heard of it, nor did I have any idea of how it worked.

So. Back to Google (or in my case Ecosia, it’s a search engine that plants trees if you search with them, what’s not to love???)

To simplify it: Storage Gateway is a solution that is deployed “on-premise”, i.e. as a hardware gateway appliance or a virtual machine, and it allows you to effectively use your S3 (or other AWS cloud storage services) locally by acting as the middle-person between AWS and your on-premise systems; it also does fancy local caching, which means super-low-latency network and disk performance. There are a few different types you can utilize, but for this exercise I went with “File Gateway”. From Amazon: “A File Gateway is a type of Storage Gateway used to integrate your existing on-premise application with the Amazon S3. It provides NFS (Network File System) and SMB (Server Message Block) access to data in S3 for any workloads that require working with objects.”

Sounds ideal. Time to set it up!

I have access to VMware Workstation Pro on my machine, so I downloaded the OVF template for VMware ESXi and loaded it up in VMware (the default username and password threw me a little, but it turns out they’re admin and password as standard, and you can change them as you configure):

Then it was a bit of a checkbox exercise from there:

Now I wasn’t 101% sure of exactly how best to set up my fancy new Gateway, so fortunately I found this super helpful video, funnily enough from Teach Me Cloud as opposed to the aforementioned Built With Cloud, and although it was a little out of date, I also had one of Redgate’s finest engineers (the wonderful Nick) on hand to help out. Between the video and us (mostly Nick) we were able to get everything connected!

But I ran into the same problem. SQL Server couldn’t access the backup file.

angry hate GIF

Fortunately though, after some frantic Googling we managed to find a very straightforward article that fixed all of our pain! We needed to map the drive in SQL Server itself – thanks, Ahmad! Now, yes, I did use xp_cmdshell (insert DBAReactions gif at the mere mention of it here), but this was for testing purposes anyway; I’m sure there are other ways to get around this problem!
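For anyone hitting the same wall, the fix boils down to mapping the Storage Gateway share from inside SQL Server, so the mapping belongs to the service account rather than to your own Windows session. A minimal sketch – the drive letter, IP address and share name are placeholders for my own setup:

-- Enable xp_cmdshell (requires sysadmin; consider turning it back off afterwards)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;

-- Map the File Gateway's SMB share under the SQL Server service account
-- (\\10.0.0.50\dmdatabase-bucket is a placeholder for your gateway's share)
EXEC xp_cmdshell 'net use S: \\10.0.0.50\dmdatabase-bucket';

-- Sanity check: if this reads the header, SQL Server (and SQL Clone) can see the .bak
RESTORE HEADERONLY FROM DISK = 'S:\DMDatabase.bak';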

…and guess what? It worked. Huzzah!

If you can substitute my poorly named image “blah” and assume instead it says “HeyLookImAnImageFromAnRDSBackupFileArentIClever”, this means I can now schedule my PowerShell process to create new images at the beginning of every week to refresh my DMDatabase Clone environment – no manual steps needed!

Conclusion

Whilst there are a number of steps involved, you can easily take advantage of some of the fantastic features offered by AWS, like Storage Gateway, and even if your database is hosted in RDS you can fully provision copies back into IaaS (Infrastructure as a Service) or on-premise workflows to keep your Pre-Production copies up to date and useful in development!

Just remember to mask it too!

P.S. I’m sure you could probably find some clever way of making the free rclone method I mentioned readable by SQL Server too; I haven’t figured it out yet, but I will blog when I do!

DBAle hits 5000 listens

“We see our customers as invited guests to a party, and we are the hosts. It’s our job every day to make every important aspect of the customer experience a little bit better.”
Jeff Bezos

The date is June 6th 2018. Our producer Louise has just set up a Slack channel called #dbale with the topic: “Hi, thought i’d just set this up as it might be the easiest way to chat about the podcast idea at this early stage.” – purely off the back of a conversation we had after a practice run for this webinar, where I suggested to Chris and Lou that I loved working with the two of them, and that we should do some kind of podcast together.

We produced our first DBAle episode and it went live around the 12th of July 2018. The production quality was very low (like, abysmal) and we didn’t really know what we were doing, but it was up. It had its own landing page, we were on Spotify and Google Play Music. It was “a thing”.

Little did we know though, it started something. I will never try to quantify that something, because it was a weird something; something that could only be produced by two Chrises getting in front of a microphone, drinking a beer and talking about data.

There are a hundred different podcasts out there that are super good if you’re looking to learn more about data, like SQL Data Partners or Dear SQL DBA, and we wanted to be a bit different. So fortunately people are able to join us on our own journey of discovery, and have a laugh on the way!

Fortunately, as a podcast backed by Redgate we get the added benefit of having access to some superstars as well, like episode 17 with the incredible Steve Jones, or the live episode with the incredible Chris Yates. I have to admit I’m very excited for the next few episodes for this very reason! Look out for February’s episode as it’s sure to be wonderful, and hopefully we’ll even persuade none other than Kendra Little herself to join us soon! (Hint, Kendra, hint!!)

I guess this blog post wasn’t to give insight into how to sustain a podcast, keep motivated, or even what the formula for success is, as I’m not sure we know any of these (at least one point would be: invest in a good microphone – we found that out the hard way). But there is one thing I do know: we keep doing what we love, and at the time of writing we are at 5,300 unique downloads, so I wanted to take the opportunity to say thank you.

Thank you to everyone who has supported us, thank you to everyone who has been on or helped produce the show, and a massive thank you to everyone who has listened.

Here’s to the next 5000 listens, a year of fun, beer and friendship… oh, and data. Here’s a picture of the crate of beer and scratch-off bucket list Redgate (thanks Louise!) got us for hitting the 5,000 mark:

P.S. Everyone asks me what my favorite episode of DBAle was so far, so if you’re interested then check out these two (as I couldn’t pick!)

Servers that go bump in the night(time): https://www.red-gate.com/hub/events/online-events/dbale-05-servers-go-bump-nighttime

Monitor the state of the 8%: https://www.red-gate.com/hub/events/online-events/dbale-09-monitor-the-state-of-the-8

Using things weirdly – Part 1: The SQL Clone Config Database

“If the challenge exists, so must the solution.”
Rona Mlnarik

I like to use things in the way they weren’t intended. It leads to behaviors you weren’t expecting… and that’s pretty fun.

Often though, when you work for a software company, people want your solutions to do things they just were not designed to do, or have no capability to do at that exact point in time. The answer that will often come back is a resounding: “I’m sorry, that’s not available at this time.” Of course, that’s if you’re lucky enough to get an answer at all.

It’s no secret though: none of us who answer those questions like saying that. There is nothing worse than not being able to help people with what they’re trying to do.

Now don’t get me wrong – some people’s questions should be challenged. You cannot immediately assume that the person you’re talking to has all of the available variables, or has thought of any alternative means of achieving their end goal – which is why it is always important to ask ‘the why behind the why’. Every question has at least one immediate answer:

“I am going to buy a car.”
“Why?”
“Because I want to drive to work.”

This seems like a perfectly reasonable response. But what if we haven’t thought about every available option here? I can take this at face value and sell you a car, but I haven’t asked where you live and work; perhaps you live within the congestion zone in London, where driving is impractical. Do you actually have off-street parking available? Will you need the car for other trips, or just for work? With a little additional questioning and thinking we might establish that you live right near a London Underground station and it would be more cost-effective and slightly faster to commute by Tube instead.

This is not just a sales tactic; rather, it’s one that should be used to better understand every requirement, personal or professional – there is no substitute for good understanding.

But there are times when a perfectly reasonable request is made with good justification and unfortunately you’re still unable to help. Even then, there are sometimes ways to help, even if the request doesn’t natively fall within the remit of the solution you’re working with. I had one such question recently:

“I would like to receive a list of Clones per image that have been created, which are older than 2 weeks and who created them, so I know which developers to nudge to refresh their Cloned environments and so we can create new images.”

Again this is a perfectly reasonable request, and the obvious answer was, well, “check the dashboard” – there’s a full list there with all of that information. When you have a good many developers all using SQL Clone, though, it can get a little difficult to scroll through everything and identify the culprits. SQL Clone has no native alerting or reporting (as of writing), so it was assumed that was the best we could do.

Like I said though, I like to use things as they weren’t intended. Tee-hee.

When you install SQL Clone it creates a back-end SQLClone_Config database on an instance of your choosing (normally where you’ve installed the solution) for recording operations carried out within the software. This database is structured so logically that anyone can understand it and leverage it to meet their requirements.
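If you want a feel for what it records, a quick query against the system catalog shows the tables SQL Clone maintains:

-- Have a look at what SQL Clone keeps track of in its config database
USE SQLClone_Config;

SELECT name
FROM sys.tables
ORDER BY name;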

By thinking a little more about their requirement we realized that it would actually be possible to leverage the database by running a very small snippet of T-SQL which could show us:

  • The name of the Clone
  • Who created it
  • Which Image it belongs to
  • When it was created, and
  • How old it is

This was as straightforward as:

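-- Run against the SQLClone_Config database: list clones older than two weeks,
-- who created them, and the image each one belongs to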
SELECT C.Name                                AS CloneName
     , C.CreatedBy                           AS CloneOwner
     , I.FriendlyName                        AS BelongsToImage
     , O.Timestamp                           AS CreatedOn
     , DATEDIFF(DAY, O.Timestamp, GETDATE()) AS AgeOfCloneInDays
FROM dbo.Clones              C
    LEFT JOIN dbo.Images     I
        ON I.Id = C.ParentImageId
    LEFT JOIN dbo.Operations O
        ON I.Id = O.ImageId
WHERE O.Type = 1
      AND O.Timestamp < DATEADD(DAY, -14, GETDATE())
      AND C.DeletedOn IS NULL;

This gave us exactly what we needed:

It was just that simple. But we didn’t want to leave it there! So we set up an email to be sent to the heads of each development team from a SQL Server Agent Job with the output of this query each Monday morning, so they could raise it with the relevant members just after stand-up. Now, I had never done this before, so of course who would come to my rescue but the wonderful, knowledgeable (and just all-round awesome-sauce-person) Jes Schultz, with this post from 2014!
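Following Jes’s pattern, the Agent job step boils down to a single sp_send_dbmail call with the query results attached. A minimal sketch, assuming Database Mail is already configured – the mail profile, recipients and filename below are placeholders:

-- Agent job step: email the stale-clone report every Monday morning
EXEC msdb.dbo.sp_send_dbmail
     @profile_name = 'DBA Mail',                      -- placeholder Database Mail profile
     @recipients = 'dev-team-leads@example.com',      -- placeholder recipients
     @subject = 'SQL Clone: clones older than two weeks',
     @execute_query_database = 'SQLClone_Config',
     @query = N'SELECT C.Name, C.CreatedBy, I.FriendlyName, O.Timestamp
                FROM dbo.Clones C
                    LEFT JOIN dbo.Images I ON I.Id = C.ParentImageId
                    LEFT JOIN dbo.Operations O ON I.Id = O.ImageId
                WHERE O.Type = 1
                      AND O.Timestamp < DATEADD(DAY, -14, GETDATE())
                      AND C.DeletedOn IS NULL;',
     @attach_query_result_as_file = 1,
     @query_attachment_filename = 'stale_clones.csv';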

And with that – we used something that isn’t exactly supposed to be used in this way, but got something valuable up and running for their teams. It’s a simple answer, but it was quick, and it worked.

What’s a quick win you’ve had recently?