Status: Working from home

“What you stay focused on will grow.”
Roy T. Bennett

As of Monday 16th March 2020 I have been a remote worker.

Ever since I left University I have worked in busy, bustling offices where the air runs thick with collaboration, questions and social commentary.

But now… I’m at home. I guess you could say I’m not quite sure how to come to terms with this. It certainly is strange knowing that my commute in the morning has gone from 50 minutes on the bus to 50 seconds from bedroom to dining room (via the kitchen for coffee). It’s even stranger though that I don’t get to see colleagues; friends, who I’ve worked with for years and who’s smiling faces, cheerful demeanor and indomitable spirits have been huge contributing factors to my desire to tackle the working day and make as much of it as i possibly can.

In the past remote working was simply the odd days I worked here and there from home, with nothing but my laptop because it was a quieter day when I could do some learning as well as some other life task like going to the doctor or dropping the car off to be serviced – therein lies the problem.

I had come to associate working from home with quieter periods of time, when I could focus on 1 thing at a time, maybe work from the sofa and treat myself to the occasional snack.

As of Monday however, the entire office has been locked down and we are _all_ working from home… for how long? I don’t think anybody knows. The only certainty in the world right now is that nothing is certain – so it is time to put into practice something that I always preach, but rarely am pushed to exercise. I will have to change my mindset.

I’m not alone in this at all – you may be reading this thinking “but Chris, working from home is so easy, I actually get more done and I’m more focused!” and if that is the case I applaud you, and I wish I could share those same abilities. However if you’re like me and suddenly working from home by directive has jarred you, here are my top 3 tips for working from home that have helped me get to grips with it and maintain my productivity:

1- Find a routine

This is perhaps the most important on my personal list because in the past I have found myself getting out of bed at 7.30/8am on days where I worked from home. This is ~2 hours after I would normally get up when commuting into the office and whilst this sort of lay in is great at the weekend to get some rest, it also leads to me being groggy and not fully awake when your start your morning meetings/calls/work and doesn’t help you focus or build a reasonable mental list of priorities.

Take aspects of your normal daily routine and replace them so that you build up a Monday-Friday (for example) that represents something similar to the structure you enjoyed when working from the office. Here is an example of how I have changed my schedule to adapt to my new working situation; I would normally catch the bus to the office which would take anywhere between 40 and 50 minutes first thing in the morning. Now at exactly the same time in the morning I go for a 30-40 minute brisk walk to simulate that commute – match something you did to something creative, compelling, healthy or otherwise you can do from home instead.

The thing to bear in mind is that finishing work for the day should also factor into your plan. It is very very easy when one works from home, to simply leave the laptop open or code running or join a late call. These things will rapidly eat into your personal life though so once you hit your magic “clocking off” time… clock off.

2 – Create separation between work and home

Even though they are now one and the same, you have to keep a separation between your work and home lives. I’ve heard it described by many people that they have set up in the study or in the office – but what do you do when you live in a small flat? Or if you live with your parents who are _also_ working from home and you have to say, work in your bedroom?

My solution to this, as I have a similar problem, is to make certain touches to transform myself and the space i’m working in to make it feel as though there is a transition between going from home to work.

Once I have come back from my walk I relax and make a coffee and some toast or cereal whilst watching YouTube videos, then at 7.55am precisely, my Amazon Echo buzzes an alarm, I turn the television off and I begin “building” my work space. A process which involves taking out the monitor and accompanying cables, setting them up on the dining room table, neatly arranging where everything should go and plugging in my phone and finally ensuring that I have a full bottle of water on standby to stay hydrated.

Conversely at the end of the day I will go through the same ritual in reverse to the point where we are all able to sit around the dining room table and have dinner with no trace that I was ever there working. This act of transformation somehow makes the space seem different, and gives it a different energy, which means i’m not thinking about work problems whilst I eat with my family.

3 – Communicate, Communicate, Communicate

Ok. This one is a no-brainer, but there are subtle levels to communication that can really help when you’re feeling the weight of working from home hovering above you like so much foreboding.

Whenever you’re feeling the pinch of distraction or you’re lacking motivation, message a colleague you would normally speak to on a daily basis at the office and ask them how they are. It seems simple but if you rotate who this person is and just check in on them you make them feel though about, cared about and appreciated. Not only does this give them the motivation to carry on but is a selfless act that can also revamp your own spirits and the resulting conversation can even inspire the thoughts you need to help with whatever you’re working on.

There are so many posts about working from home and engaging with others on a technical and personal level, such as Kendra’s post on SSC, Kathi’s post on SimpleTalk or this incredibly detailed and insightful post from Alice Goldfuss, that many of these will be repeats on what others have said. However, I think my personal top 3 on communication specifically are:

  1. Turn your webcam on – be seen and see other people and enjoy the smiles and the thinking and the body language you would otherwise be missing out on.
  2. React and be reacted to -if you use Teams or Slack or any kind of collaboration tool, make sure that when people make statements or ask questions, you either offer feedback or at least react. A thumbs up emoji at least lets people know that you are listening, allowing them to feel validated and supported and not think they’re just shouting into the void.
  3. Learn to use a mute button – we have all been in situations where someone has mouth breathed the meeting to postponement or someone has had a coughing fit or their child(ren) have come in asking questions – all of these are perfectly fine because we’re humans… but. That is no excuse for everyone in the meeting internal or external to hear that. Liberally use your mute button on your meeting software or headset and make sure you are able to contribute meaningfully, but that when you’re not contributing, you’re also not hindering others from doing so.

Conclusion

Right now is a difficult time. Very much so – and I hope you’re all doing well and staying healthy and happy. Working from home makes up a small portion of what everyone is feeling and struggling against at this point in time and so we should remember to be mindful and grateful that we even are a part of companies who are able to support us working from home as the ability to even do so is a blessing.

Remember to take your mindset and find a way to turn each of the things you struggle with into positives; recognize them, appreciate them and ask yourself why you’re feeling that way, and then find a way to deal with it that makes you more productive and/or happier – most companies have a wealth of people who are experienced in or who are dedicated to help you working from home, don’t be afraid to ask for help if you need it and see if someone can help you tackle some of the problems you’re facing. Everyone is different and will need different things, just because you don’t feel comfortable working from home, for any reason, doesn’t mean you’re doing things wrong. We’re all learning together.

So stay safe, stay healthy and stay awesome and I’ll leave you with my favorite tweet so far on the matter:

P.S. We also just recorded a DBAle Special Episode on working from home, where all podcast participants were working from their respective homes, if you’re interested, you’ll be able to listen on Spotify, Apple Music and directly here: https://www.red-gate.com/hub/events/entrypage/dbale when it goes live in the next few days 🙂

Shall we begin? (With data classification)

“If I had an hour to solve a problem I’d spend 55 minutes thinking about the problem and 5 minutes thinking about solutions.”
Albert Einstein

So I just spent about 20 minutes trying to come up with a suitable title for this blog post, and then it struck me – one of my favorite movies of all time (which I will soon be ensuring my wife and I watch again) is Star Trek into Darkness, featuring the magnificent Benedict Cumberbatch, in which he masterfully growls the phrase “shall we begin?”.

shall we begin benedict cumberbatch GIF

This sums up perfectly where people find themselves at the beginning of their classification activities. “Where do I start?” is usually the first question I get asked – regardless if you’re using an excel sheet, Azure Data Catalog or Redgate SQL Data Catalog to carry out this process, you will find yourself in the same place, asking the same question.

Classifying your SQL Server Tables and Columns is not straight forward, I’ll say that up front – you have to be prepared for whatever may come your way – but give yourself a fighting chance! Whether you’re looking to better understand your data, protect it, or you’re just hoping to prepare your business to be more able to deal with things such as Subject Access Requests (SARs), the Right to be Forgotten *cough* I’m looking at _you_ GDPR *cough* – or even just key development in systems containing sensitive information; this is the ultimate starting point. As per my blog post on data masking here, you can’t protect what you don’t know you have.

This is my effort to give you the best possible start with your classification process, whether this feeds into a wider data lineage process, data retention or, of course, data masking. So… shall we begin?

Get a taxonomy set up

This is perhaps the most crucial part of your success. Even if you have the best classification process in the world it really means nothing if you’ve basically described your data in one of say, 3 possible ways. The thing to bear in mind before getting started is that the data cataloging process is not specific to one job.

You may think at this point in time that you’re going to use it to highlight what you want to mask for Dev/Test environments, or maybe it’s your hit list for implementing TDE or column level encryption – but this _thing_ you’re building is going to be useful for everyone.

  • DBAs will be able to use this to help them prioritize systems they look after and being more proactive when it comes to security checks or updates, backups etc.
  • Developers will be able to use this to better understand the tables and environments they are working on, helping them contextualize their work and therefore engage and work with any other teams or individuals who may be affected or who may need to be involved.
  • Governance teams and auditors will be able to use this to better understand what information is held by the business, who is responsible for keeping it up to date and how it is classified and protected.

The list goes on.

So all of the above will need to be engaged in a first run to help actually describe the data you’re working with. What do you actually care about? What do you want to know about data at a first glance? Below is the standard taxonomy that comes out of the box with Redgate’s Data Catalog:

Some of my favorites are in here, which I would encourage you to include as well! If nothing else, having Classification Scope as a category is an absolute must – but I’ll come to this soon. You can see though, how being able to include tags such as who owns the data (and is therefore in charge of keeping it up to date), what regulation(s) it falls under and even what our treatment policy is in line with any of those regulations is, gives us so much more to go on. We can be sure we are appropriately building out our defensible position.

Having a robust Taxonomy will enable you to not only know more about your data but to easily communicate and collaborate with others on the data you hold and the structure of your tables.

Decide who is in charge

This seems like an odd one, but actually one of the most common questions I get is about who will be carrying out the classification process, and this is where the true nature of collaboration within a company is going to be absolutely critical.

Some people believe that a DBA or a couple of developers will suffice but as you’ll see later on, this is not a simple process that only 1 or 2 people can handle by themselves. Be prepared to spend hours on this and actually the implementation of classification means by nature you are going to need a team effort in the first instance.

You will need representation from people who know the database structure, people who know the function of the various tables and people who know the business and how data should be protected. You will require representation on this team and the collaboration between Dev, DBAs, Testers, Governance and DevOps, and you will need someone central to coordinate this effort. When you have key representation from these teams, it will make it easier to identify and collaborate on hot spots of data, so ensure you have this knowledge up front.

Get rid of what doesn’t matter

You may be surprised that the next step is technically an execution step, but it is an important point nonetheless and will absolutely help with the classification effort. This is where the Classification Scope category comes in, and this is why it’s my favorite.

One of the biggest problems that people face when actually executing on their classification is the sheer enormity of the task. There is no “average” measure we can rely on unfortunately but even small schemas can be not insubstantial – recently, some work I did with a customer meant they provided me with just ONE of their database schemas which had well in advance of 1800 columns across dozens of tables. When you scale that same amount to potentially hundreds of databases, it will become rapidly clear that going over every single column is going to be unmanageable.

To start then, the knowledge brought by the team mentioned above will be invaluable because we’re going to need to “de-scope” everything that is not relevant to this process. It is very rare to find a company with more than 50% of columns per database which contain PII/PHI and even if you are one of those companies, this process can help you too.

There could be many reasons that something shouldn’t be included in this process. Perhaps it is an empty table that exists as part of a 3rd party database schema, such as in an ERP or CRM solution. It could be a purely system specific table that holds static/reference data or gathers application specific information. Regardless what the table is, use the knowledge the team has to quickly identify these and then assign them all with the necessary “Out of Scope” tag.

This will not only help you reduce the number of columns you’re going to need to process significantly, but will give you greater focus on what does need to be processed. One of the greatest quotes I’ve heard about this process comes from @DataMacas (a full on genius, wonderful person and someone who over the years I have aspired to learn as much from as possible) who referred to it as “moving from a battleships style approach to one more akin to minesweeper“. Which is just so incredibly accurate.

In my example database below with only 150 odd columns, using the “Empty Tables” filter, and then filtering down to system tables I know about, I was able to de-scope just under half of the database, just as a starting point:

Figure out patterns and speed up

Many of the people carrying out this process will already have some anecdotal knowledge of the database as I’ve already mentioned, but now it’s time to turn this from what _isn’t_ important, to what is.

The fastest way to do this is to build up some examples of column naming conventions you already have in place across multiple databases – there will likely be columns with names that contain things like Name, Email or SSN in some format. Both SQL Data Catalog from Redgate and Microsoft’s Azure Data Catalog have suggestions out of the box that will look at your column names and make suggestions as to what might be sensitive for you to check and accept the classification tags.

Now these suggestions are incredibly helpful but they do both have a reduced scope because they’re just matching against common types of PII, so it’s important to customize them to better reflect your own environments. You can do this fairly easily, one or both of the following ways:

1 – Customize the suggestions

In Redgate’s SQL Data Catalog you can actually, prior to even looking at the suggestions and accepting them, customize the regular expressions that are being run over the column naming convention to actually check that they are more indicative of your own schemas – either by editing existing rules or by creating your own rules and choosing which tags should be associated with the columns as a result:

You can then go through and accept these suggestions if you so wish, obviously making sure to give them a sense check first:

2 – POWER ALL THE SHELL

In both of the aforementioned solutions you can call the PowerShell API to actually carry out mass classification against columns with known formats – this will allow you to rapidly hit any known targets to further reduce the amount of time spent looking directly at columns, an example of the SQL Data Catalog PowerShell in action is the below, which will classify any columns it finds where the name is like Email but not like ID (as Primary and Foreign keys may, in most cases, fall under the de-scoping work we did above) with a tag for sensitivity and information type (full worked example here):

Finally – get classifying

This is the last stage, or the “hunt” stage. It’s time for us to get going with classifying what’s left i.e. anything that wasn’t de-scoped and wasn’t caught by your default suggestions or PowerShell rules.

You can obviously start going through each column one by one, but it makes the most sense to start by filtering down by tables which have the highest concentration of columns (i.e. the widest tables) or the columns that are specifically known as containing sensitive information (anecdotally or by # of tags) and classifying those as in or out of scope and what information they hold, who owns it and what the treatment intent is at the very least.

The approach i take in this instance is to use filtering to it’s utmost – in SQL Data Catalog we can filter by table and column names but also by Data Type. Common offenders can be found with certain NVARCHAR, XML or VARBINARY types, such as NVARCHAR(MAX) – to me, that sounds like an XML, JSON, document or free-text field which will likely contain some kind of difficult to identify but ultimately sensitive information.

Following the NVARCHAR classification I move on and look at DATETIME and INT/DECIMAL fields for any key candidates like dates when someone attended an event or even a Date of Birth field. This helps especially when the naming conventions don’t necessarily reflect the information that is stored in the column.

Finally, one thing to add is that you will need access to the databases or tables at some point. You can’t truly carry out a full-on data classification process purely against the schema, especially for the reason above. Data will often exist in places you weren’t aware of, and knowing the contents, format and sensitivity of the data can only reasonably be found and tagged if you have the support of the data team to do so.

Conclusion?

This is not a one time thing. The initial classification is going to be the hardest part but that can be mitigated if you follow some of the processes in this post and ultimately work as a team.

Classification though, is ongoing. It is there to be an evergreen solution, something that provides context for any data governance or DevOps processes that you have in place and therefore should be maintained and treated as what it is. The place that everyone is able to use to gather information about what they will be using, and what the company might have at risk.