Delays are not DevOps, delaying DevOps is worse: Why we need better working practices now more than ever.

“The time is always right to do what is right.”
– Martin Luther King Jr.

Over the past few months, we have been on lock-down. The product of a devastating and deadly disease that has well and truly stamped it’s legacy on human history forever. But it is out of these times that we receive a glimpse, a look into what is possible, and what humanity can do. It is out of this fight, out of these ever decreasing odds that we finally see what a combined effort can do, and what a focus on our fellow people can bring about. Don’t know what I mean, take a look at the Good News Network and subscribe (just like me) to see the best of us.

But it only works, we only triumph, when we work together.

Recent Example: Scientists at Oxford University have seen a tremendous breakthrough with their virus efforts and are making unprecedented strides towards a viable vaccine – but it involves an incredibly strong partnership with pharmaceutical companies and governments worldwide for staged testing, large scale results gathering and continuous improvement.

So. Why then in the last few months have I been speaking to people who say things like:

  • We have delayed our non-BAU process implementation, because we have seen a spike in usage, and we need all hands on deck.
  • We have been investigating tooling and processes to help our development teams, but this is on hold as we come to terms with this newer way of working.
  • Our teams are keen to adopt more agile ways of working, but they are overburdened at the moment, so we have decided to postpone any research into this for the next few months.

It is feasible, amid a global panic, that people and companies will do (and definitely have done) what comes naturally to us; that is to “bunker down“. We believe that if we shift all efforts from projects and ongoing testing / new processes, we can have all hands available to deal with anything that comes our way. Processes are established for a reason, right? Legacy methods of dealing with ad-hoc changes and semi-frequent deployments, waterfall-esk development cycles and decade(s) old systems represent the familiar, the safe… Supposedly.

Now so, more than ever, it is time to actually change direction and to put more effort into some of the key principles and processes that will lead us to DevOps nirvana; it is this trinity of people, processes and tooling that can ultimately be the salvation for many global companies as they try to maintain their agility and competitiveness within an uncertain, shifting post-pandemic international market. There are many reasons why I, and many others, believe this but I have detailed 3 key reasons below:

1 – Delaying DevOps creates waste and costs businesses money

DevOps is a culmination of learning, experience and effort and it cannot be classified as one single thing, however it is possible to define a number of things that DevOps is and what it most certainly is not. One of the things that DevOps is, is “the constant delivery of value to end users“; the idea that by adopting certain technical measures and working practices we can minimize the time to delivery of new features and functions, which equates to greater value for us, our end users and significantly more agility to shift in different directions as required.

These ideas of flow and value streams are covered quite nicely by Lean IT, which extended from Lean Manufacturing principles and it is nicely explained here (and I would highly recommend you read The Phoenix Project if you haven’t already) – but the purpose of it (in a similar vein to agile principles) in this sense, is that it defines a number of things that don’t add value to the resulting product or service. These ‘things’ are referred to as waste and this is precisely what we should be looking to remove from our existing, legacy processes because why would we work on anything that doesn’t deliver any value?

There are a few different types of waste but I want to highlight three important ones that often exist as a result of legacy processes still being in place:

  • Defects Includes lack of testing (poor execution) and hot fixing environments (unauthorized changes)
  • Waiting Including everything from waiting for refreshed environments to waiting for feedback/results and even manual processes like deployment approvals
  • Motion (excess)Effectively doing the same thing over and over again, fire-fighting problems that arise on a near daily basis, engaging and monopolizing resources constantly who could otherwise be working on other, more important or value-add tickets (for those of you who HAVE read the Phoenix Project, see Brent as an example!)

The product of this waste is very simple and it fits into 3 main buckets: poor customer experience, increased costs and lost productivity. All of these things boil down to one fundamental truth – bad practices cost us money and reputation, transformation now could help us prevent this, and people will remember us for stepping up when we needed to.

2 – The workforce is increasingly distributed/remote and needs to collaborate better

At the beginning of 2020, one of the biggest questions faced by companies all around the globe was “how do I find and retain talent?”; this is not a new question and had already been around for years. Companies restricted to their offices (base OR satellite) realized they were increasingly fighting for candidates in one of the most competitive markets served by an ever dwindling local pool of options.

This was a situation which necessitated companies to stretch beyond their existing capabilities to enable a better quality experience for remote and distance workers and/or teams – a situation which would later be exacerbated 100x-fold by the global pandemic crisis. Not only has this crisis confirmed that most companies, certainly those feeding software markets, can work remotely, it has also posed the question of if we should work remotely more often, and has fundamentally changed the way we as a workforce will continue to work in the future.

One thing is for sure, whilst we will try to “return to normal” as much as possible, normal has been forever changed and remote working and collaboration is here to stay. That’s the important word in play here, collaboration.

The spirit of DevOps as I’ve mentioned before is good quality communication, collaboration and accountability. But at the heart of those three ideas is visibility. In an office we can over-hear, we can drop-in or bring things up “over the water cooler”/”at the coffee machine”. In a remote working context, that isn’t possible. So we have to adhere to 2 of the most important practices in modern day software development: transfer knowledge and record decisions.

When we work and communicate better in a remote/distributed workforce, and use tools and processes available to us, people don’t make unauthorized changes, or make decisions that affect you that you weren’t aware of until 3 weeks after they were made. It becomes easier to make decisions and generate better work faster, rather than being paralyzed by indecision and uncertainty as to whether you hold the latest version of the truth or if it is outdated. When we adopt the right processes and tools into our DevOps methodology, we know for certain what we should be doing, and why we should be doing it.

Process-wise this can easily take the form of common functions, many already at the disposal of teams when remote; stand-ups, retrospectives, mob- and pair- programming, OKR and sprint planning, there are lots of different ways for us to work well and know what we’re supposed to be working on at all times (and why). Tooling-wise we can then match these how we will be doing something with the respective record of what is being done, what decisions have been made and crucially, why. Using work management software like Trello, bug/feature tracking software like JIRA or Azure DevOps work items, source controlling everything (even having a strong branching and merging strategy to control workflow) with rigorous testing routines, policies and pull requests and automation all lead to better informed, happier*, well-performing developers with a crucial sense of purpose.

*Important side note: It is also crucial that we don’t simply lose ourselves in the business benefits completely – developers, testers etc. are all human and we all crave job satisfaction and happiness in our roles. Yes you might be able to increase your deployment frequency, minimize costs or complaints, but nothing compares to a satisfied, motivated team, which DevOps can help breed and inspire.

3 – DevOps breeds innovation and improves company performance, with a tangible return on investment (and not just financial!)

Automation is one of the single greatest ways we can modernize our processes, and is often the first principle we think of when adopting DevOps practices; taking something that is manual or held together by legacy scripts that forever fail and cause outages, and instead continuously integrating, continuously testing and continuously improving using the latest processes and tooling available to us. Automation allows us to create high cycle rates, enhance and multiply the feedback options we have within our pipeline(s) and allows us to reduce manual concerns and issues, to dedicate teams to the very thing they were employed to do: innovate. We only need watch how Netflix does DevOps to realize what we can unlock.

On a weekly basis I discuss existing processes with developers all over the world and one key trend always emerges that we need to focus on: they have a process that is currently manual, and it needs improving. I have lost track of the number of times I’ve been told about a process where developers generate scripts, manually test themselves and then “do x” with it, whether that be just deploying to Production themselves (without review), or putting it in an ever mounting pile of scripts in a folder on a file share for someone to sift through periodically.

Across every single one of the conversations mentioned above that I have, there is not a single discussion that doesn’t include some kind of quantifiable cost to the business, whether that be downtime, customer refunds, regulatory penalties and even in extreme cases, high developer churn. All can still be expressed in terms of 2 things: Time and Money.

If we remove those roadblocks for our developers, if we give them tools to enable them to more easily do their jobs, we put in place processes that allow them to more easily deliver that innovation, and tight automated controls to remove error-prone, manual jobs – we end up with something more akin to harmony. By this of course I mean the “constant delivery of value to end users“. This creates a more positive user experience, allows us to respond more quickly in an uncertain market and make decisions on what we should or, just as importantly, shouldn’t do, faster.

The world at the moment is a very uncertain place and has destroyed jobs, companies and whole industries. We should expect that consumer confidence is at an all time low and as we all come out of lock-down across the globe, we should be prepared to metaphorically “put our best foot forward” to help our developers believe in what we’re doing, stimulate faith in our industries and ultimately deliver more value bidirectionally.

But this idea only works if we do this now.

Conclusion

DevOps is more than just “picking up some new tools” or “rolling out agile” to development teams. It is fundamentally a mindset change that can drastically and fundamentally alter the underlying motivations and thinking within an organisation, allowing you to focus on the most important thing – delivering value, faster.

There are always times where delaying large scale roll-outs is a pertinent decision to make, and a hard one at that! But DevOps seeks to unify every part of of the development cycle; giving you greater visibility, communication, accountability and control, with maximum flexibility to test, validate and even pivot where needed.

But the time to do DevOps is not “once everything improves“, “once we get back to normal” or “when we have more time“… it is now. Now is the opportunity we all have to capitalize on the wave of change we have been consistently waiting to implement, to strengthen our position and future growth in our markets. DevOps is how we can come out on the other side of this disaster ready to embrace new technologies and ways of thinking, to respond to our customers needs, and deliver value and speed at scale.

Flyway and tSQLt – migrating to warmer test climates

“If you truly have faith in your convictions, then your convictions should be able to stand criticism and testing.”
DaShanne Stokes

Welcome fellow TestDriven-Development enthusiasts… is what I would say if i actually ever did TDD and didn’t just, you know… write regular unit tests after the fact instead.

I’m going to be honest, I love the idea of TDD but have I ever actually been able to do it? No. Have competent developers been able to do it successfully? Yes, of course. Don’t know anything about TDD? You’re in luck! Click here for an introduction (don’t worry though, THIS post is not going to be about TDD anyway, so you can also keep reading).

But one thing we can all agree on is that testing is pretty important. Testing has evolved over the years though and there are a million-and-one ways to test your code, but one of the most difficult and frustrating things to test, from experience, is database code.

gilmore girls shot of cynicism GIF

Some people argue that the days of testing, indeed, the days of stored procedures themselves are gone and that everything we do in databases should be tested using a combination of different logic and scripting languages like Python or PowerShell… but we’re not quite there yet, are we?

Fortunately though we’re not alone in this endeavor, we have access to one of the best ways to test T-SQL code: tsqlt. You can read more about tsql at the site here but in short – we have WAYS to test your SQL Server* code. The only problem is, when you’re using a migrations approach… how?

*There are also many ways to unit test code from other RDBMS’ of course, like utPLSQL for Oracle Database or pgTAP for PostgreSQL – would this method work for those? Maybe! Try adapting the method below and let me know how you get on!

I’ve already talked about how implementing tests is easier for state based database source control in a previous post because we can easily filter tests out when deploying to later stage environments, however with migrations this can be a real pain because you have to effectively work on tests like you would any normal database changes, and maybe even check them in at the same time – so ultimately, they should be managed in the same way as database schema migrations… but we can’t filter them out of migrations or easily pick and choose what migrations get run against test and Prod, without a whole lot of manual intervention.

Basically. It’s a mess.

mess fail GIF

But during my last post about Flyway I was inspired. This simple and easy to use technology just seems to make things really easy and seemingly has an option for EVERYTHING, so the question I started asking myself was: “How hard would it be to adapt this pipeline to add unit tests?” and actually although there were complications, it was still easier than I thought it would be! Here’s how you can get up and running with the tSQLt framework and Flyway migrations.

1 – Download the scripts to create the tSQLt framework and tests from the site

Ok this was the easiest step of them all, largely because in the zip file you download from the tsqlt website all you have is a set of scripts, first needed to enable CLR and the second to install the tsqlt framework:

As part of my previous pipeline I’m actually using Azure SQL Database as my development environment, where RECONFIGURE is not a supported keyword and where we don’t need to run the CLR script anyway, so all I needed was the tSQLt.class.sql file.

The good thing about this is that we can copy it across into a migration and have this as our base test class migration, and then any tests we write on top of it will just extend it – so as long as we remember to update it _fairly_ frequently with any new tsqlt update, we should be fine! (Flyway won’t throw an error because these are non persistent build objects, so no awkward checksum violations to worry about!)

2 – Adapt the folder structure in the repository for tests

I added 2 new folders to my _Migrations top level folder, a Schema_Migrations folder and a Test_Migrations folder. When you pass Flyway a location for migrations, it will recursively scan folders in that location looking for migrations to run in order. I copied the migrations I had previously into the Schema Migrations folder and then my new tSQLt creating migration into the Test Migrations folder. This allows them to be easily coupled by developers, whether you’re writing unit tests or practicing TDD:

You’ll have noticed I called my base testing migration V900__ – this is because I do still want complete separation and if we have a V5 migration in schema migrations and a V5 testing migration, we’re going to have some problems.

3 – Add a callback to handle removal of the objects

As I was putting this together, I noticed that I could use flyway migrate to run the tSQLt framework against my Dev database, but every time I tried to then flyway clean that database I got a very nasty error stating that the tSQLt assembly could not be removed because of dependent objects.

Flyway does not handle complex dependencies very well unfortunately, that’s where you’d use an industry leading comparison tool like SQL Compare so, with some advise from teh wonderful Flyway team, I set to work on a callback. A callback is how you can hook into Flyway’s own processes, telling it to do something before, during or after certain commands. In my case we were going to remove all of the tSQLt objects prior to running Flyway clean to remove the rest of the schema. To make it future proof (in case objects are added or removed from the tSQLt framework), I wrote a couple of cursors to go through the different objects that were dependent on the assembly and remove them, rather than generating a script I know to have all of the tSQLt objects in right now. You can find the code for the callback in my GitHub here, you are welcome to it!

Animated GIF

All you have to do is name it beforeClean.sql and ensure it is in the directory with your other sql migrations so that it will pick this up and run it – I put it in my Test_Migrations folder, because I only want it to run this callback when cleaning the build DB, as this is the only place we’re utilizing automated unit tests… for now!

4 – Update the Azure DevOps pipeline

I’ve got my callback, I’ve got my tSQLt migration and the folder structure is all correct and is pushed to Azure DevOps but naturally it is breaking the build *sad* but fortunately all we now have to do is update the YAML pipeline file:

trigger:
- master

pool:
  vmImage: 'ubuntu-latest'

steps:
- task: DockerInstaller@0
  inputs:
    dockerVersion: '17.09.0-ce'
  displayName: 'Install Docker'

- task: Bash@3
  inputs:
    targettype: 'inline'
    script: docker run -v $(FLYWAY_LOCATIONS)/Test_Migrations:/flyway/sql -v $(FLYWAY_CONFIG_FILES):/flyway/conf flyway/flyway clean -enterprise
  displayName: 'Clean build schema'

- task: Bash@3
  inputs:
    targettype: 'inline'
    script: docker run -v $(FLYWAY_LOCATIONS)/Schema_Migrations:/flyway/sql -v $(FLYWAY_CONFIG_FILES):/flyway/conf flyway/flyway migrate -enterprise
  displayName: 'Run flyway for schema'

- task: Bash@3
  inputs:
    targettype: 'inline'
    script: docker run -v $(FLYWAY_LOCATIONS)/Test_migrations:/flyway/sql -v $(FLYWAY_CONFIG_FILES):/flyway/conf flyway/flyway migrate -enterprise
  displayName: 'Run flyway for tSQLt'

You will notice a couple of important things that I have highlighted above:

  1. I’m cleaning the build schema using the Test_Migrations repository – this is because that is where my callback is and I need that to run before the clean otherwise it will fail due to the tSQLt assembly issue (line 17)
  2. I am running the migrate for the tests and the schema separately in the file, instead of just calling flyway to recursively run everything in the _Migrations folder. This is because I want them to be 2 separate steps, in case I need to modify or remove either one of them, or insert other steps in between and so that I can see the testing output in a separate stage of the CI pipeline (lines 23 and 29).

Caveat: As a result of (Option 2) running the 2 processes separately, it means running Flyway twice but specifying the Schema_Build and Test_Build folders in the YAML as being mapped to Flyway’s sql directory (lines 16 and 22 in the file above) but the problem this causes is that the second time Flyway runs, when it recursively scans the Test_Migrations folder it will not find the migrations that are present in the Flyway_Schema_History table, resulting in an error as Flyway is unable to find and resolve the migrations locally.

The way to fix this though is pretty simple – you find the line in the Flyway Config file that says “IgnoreMissingMigrations” which will allow it to easily continue. We wouldn’t have to worry about this setting though, if we were just recursively looking to migrate the Schema and Test migrations in the same step (but I’m a control freak tee-hee).

Now, once committed this all runs really successfully. Velvety smooth one might even say… but we’re not actually testing anything yet.

5 – Add some tests!

I’ve added a single tSQLt test to my repository (also available at the same GitHub link), it was originally created by George Mastros and is part of the SQLCop analysis tests – checking if I have any user procedures named “SP_”, as we know that is bad practice – and I have wrapped it up in a new tSQLt test class ready to run.

You’ll notice I also have a V999.9__ migration in the folder too, the purpose of this was to ‘top and tail’ the migrations; first have a script to set up tSQLt that could be easily maintained in isolation and then end with a script that lets me do just 1 thing: execute all of the tests. You can do this by simply executing:

EXEC tSQLt.RunAll

and we should be able to capture this output in the relevant stage of the pipeline.

Some of you may be asking why I chose to have the run unit tests as part of the setting up of the testing objects – this was because I had 2 options:

  1. I’m already executing scripts against the DB with Flyway, I may as well just carry on!
  2. The only other way I could think to do it was via a PowerShell script or run SQL job in Azure DevOps but the 2 plugins I tried fell over because I was using a Ubuntu machine for the build.

So naturally being the simple person I am, I opted for 1! But you could easily go for the second if you prefer!

6 – Test, Test, Test

Once you’ve handled the setup, got the callback in place (and also followed the steps from the last blog post to get this set up in the first place!) you should be able to commit it all these changes and have a build that runs, installs tSQLt and then runs your tests:

I realize there are a lot of “Warnings” in there, but that is just Azure DevOps capturing the output, the real part of this we’re interested in is lines 31-40 and if we clean up the warnings a little you’ll get:

+----------------------+
|Test Execution Summary|
+----------------------+
|No|Test Case Name|Dur(ms)|Result |
+--+---------------------------------------+-------+-------+ 
|1 |[somenewclass].[testProceduresNamedSP_]|144|Success|
------------------------------------------------------------
Test Case Summary: 
1 test case(s) executed, 1 succeeded, 0 failed, 0 errored. 
------------------------------------------------------------------

But if I introduce a migration to Flyway with a new Repeatable Migration that creates a stored procedure named SP_SomeNewProc…

+----------------------+
|Test Execution Summary|
+----------------------+
|No|Test Case Name|Dur(ms)|Result |
+--+---------------------------------------+-------+-------+ 
|1 |[somenewclass].[testProceduresNamedSP_]|184|Failure|
------------------------------------------------------------
Test Case Summary: 
1 test case(s) executed, 0 succeeded, 1 failed, 0 errored. 
------------------------------------------------------------------

It even tells us the name of the offending sproc:

All I have to do now is make the corresponding change to remove SP_ in dev against a bug fix branch, push it, create a PR, approve and merge it in and then boom, the build is right as rain again:

Thus bringing us back into line with standard acceptable practice, preventing us from delivering poor coding standards later in the pipeline and ensuring that we test our code before deploying.

Conclusion

Just because you adopt a more agile, migrations based method of database development and deployment, doesn’t mean that you have to give up on automated testing during Continuous Integration, and you can easily apply these same principles to any pipeline. With just a couple of tweaks you can easily have a fully automated Flyway pipeline (even xRDBMS) and incorporate Unit Tests too!