Category Archives: DevOps

What I talk about when I talk about DevOps

What does ‘DevOps’ mean, anyway?

Earlier in my career, I was a DevOps consultant – and we were trying to hire other DevOps consultants. But the software industry is actually quite confused about the term ‘DevOps’ and what it really means. I was starting to wonder whether putting ‘DevOps’ in our job ads might actually be counter-productive – and sitting in a seminar at YOW! one week, I finally understood why.

Elabor8 (where I worked at the time) had a booth at YOW!, and I gave a couple of lightning talks there. Probably the biggest crowd-pleaser was my talk on resiliency patterns in distributed systems. I covered some difficult topics like the importance of having idempotent rollback steps in compensating transactions and how lessons learned from the ship-building industry help us craft better distributed systems, all presented in 10 minutes in a crowded event space.

Hang on, you may well ask. If I’m a DevOps consultant, why am I talking about atomicity and consistency in distributed systems? Shouldn’t I be talking about cool PowerShell tips and how to set up Jenkins?

As is so common with rhetorical questions, the answer is a resolute ‘No’.

When I talk about DevOps, I talk about Software Engineering.

When I do DevOps work, I’m doing software engineering. When I hire for DevOps roles, I hire software engineers. But I don’t hire just any software engineers: I want the ones who care about the delivery process. They know how to build software, and also how to put it in front of the user – and how to keep on putting that software in front of users, sprint after sprint, story after story, rapidly, efficiently, and without breaking things.

‘DevOps Engineer’ is the next ‘Full-Stack Developer’ – and not because it’s the next hype-cycle in tech hiring. In the same way full-stack developers expanded their scope to cover both the UI and the back-end, DevOps engineers have expanded their scope beyond writing software – and into the realm of how we get that software in front of users, in the fastest and most reliable way possible.

No longer are we content to just build back-end systems and UI layers on top of them: as a profession, we’re coming to understand that software engineering is bigger than just churning out vertically-sliced user stories. Software engineering is about building the right thing and keeping it running – and a DevOps engineer is a software engineer who cares about both steps. You don’t want a team full of DevOps engineers – but you definitely need at least one.

When I talk about DevOps, I talk about Agile.

You start to see the real benefits of automation when you’re deploying to production regularly. Conversely, if you’re only shipping a few times a year, the overhead of manual releases doesn’t hurt you that much. If you have a 6-week QA pipeline and a 2-month UAT window, you don’t need DevOps (yet) (but you do need to change something! Ouch!). Once you start trying to deploy regularly – getting your cycle time down, keeping your WIP low, delivering value to the user faster and more frequently – that’s when the overhead starts to hurt.

Once you introduce agility to your process, that’s when you need to pay attention to the DevOps movement. That’s when you need some software engineers who care about automation. Please note though that I’m not saying you need “some DevOps” – there’s no such thing as “some DevOps”, and anyone who tries to sell you “some DevOps” is doing you no more favours than someone who tries to sell you “some Agile”. What you need is some smart software engineers who care about DevOps – and you need to give them the time and resources they need to do their job.

When I talk about DevOps, I talk about teams.

DevOps engineers are software engineers, but that doesn’t mean you should fill your software teams up with DevOps engineers. DevOps engineers tend to be passionate about a bunch of really interesting stuff: resilience patterns, testing, automation, source control, and release management. The great thing about multi-disciplinary, cross-functional teams, however, is that you get a bunch of people with different passions together, and that breadth gives you the ability to do great things. Don’t try to hire DevOps engineers (or worse, to build a DevOps team). Hire software engineers, and when you find ones who are great at DevOps, keep them.

Having cross-functional teams also gives your team members the chance to cross-skill as they work alongside one another, which is why…

When I talk about DevOps, I talk about teaching.

Great DevOps engineers have a genuine enthusiasm for quality software engineering and release management, and their enthusiasm is infectious. Great DevOps engineers don’t hoard their knowledge, but help their fellow software engineers to learn more about the DevOps mindset by sharing what they know. They also learn from their colleagues who specialise in other software engineering fields, becoming more well-rounded themselves as they help others do the same.

Finally: When I talk about DevOps, I talk about people.

DevOps is a branch of software engineering – and whatever you might hear, software engineering is all about people. It’s about the people who use our software, and the people who build it. DevOps is the intersection of those two groups: users and developers. Our users are not just end-users, who enjoy higher-quality software, but also our fellow engineers, who rely on our automation to work more efficiently. They’re our managers, who rely on the insights we give them into the release process. Our users are the junior software engineers who may one day specialise in DevOps engineering – or who might use what they can learn from us to be better at some other branch of software engineering.

Update (2021): If you’re a software engineer who cares about the difference between GitFlow and GitHub Flow; who has made calls to the Octopus API; or who loves showing other developers how to write a better test or add more context to a log message; please talk to me about joining Squiz. It’s a great team, and whether or not we’re actively hiring for a role, I can look to see if there’s a place for you here. If you want to take the engineering you care about and have a broader impact on more people, you want to join us. Get in touch.

Measure What Matters

What’s your average API response time? Do you know? Is it important to your business? What about the 90th percentile? Do response times suffer during peak demand?

Do you think about those questions? How about these ones:

How long does it take to get a software change reviewed? Do you know? Is it important to your business? Is it a bottleneck? Do reviews get skipped during busy periods?

If you care about code reviews, you should measure them. Put them on your system dashboard. They’re as much an indicator of the health of your software environment as your API response times. Minimising Work In Progress and Mean-Time-To-Release are important parts of your QA process, and making sure your pull requests are reviewed and merged in a timely fashion is a great way to improve those numbers.

What existing products are out there to do this? Depending on the tools you use, you can probably pull out a few relevant reports. Jira is popular, and I’ve seen PMs produce some great graphs to include in their monthly management update. The problem is, the numbers you get out of these tools don’t give you direct, real-time feedback. Their very nature as longer-term averages means they can’t represent a call to action.

Enter TeamLab

As a software shop, if the tools I’m using don’t do what I want, I have an option: build something. This is a dangerous option to have, and countless business hours have been wasted solving the wrong problems, but I really needed a nice visual prompt of how we’re doing at our code reviews in the moment. I also wanted a side-project for the team to tinker with new ideas for writing web applications – so even if the project didn’t turn out to be useful, the experiment would teach us something.

I had a specific technology I wanted to try out: React Storybook. This is a really nice way to visualise your React components in various states, and I wanted something relevant to use as a demo for the team. It was very quick and easy to get up and running with a create-react-app project including Storybook, and I hacked together a quick picture of what my PR display should look like:
[Screenshot: Storybook mock-up of the PR display]
On the right, you can see my quick mock-up of a board displaying ten pull requests, and the left is the Storybook control panel.

I decided it would be useful to colour-code the pull requests, and display any reviewers and approvers on the PR cards. A new PR is yellow, and an approved one is green. A PR with reviewers turns blue, and most importantly, any PR which is older than 48 hours turns red.
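
To pin those rules down, here’s the colour logic as a minimal sketch in C# (TeamLab itself is a React app, so this type, its parameters, and the precedence of red over the other states are my assumptions, not the real implementation):

using System;

public enum CardColour { Yellow, Blue, Green, Red }

public static class PrCard
{
    public static CardColour ColourFor(DateTime createdUtc, int reviewerCount, bool approved)
    {
        // Stale PRs trump everything: anything older than 48 hours turns red.
        if (DateTime.UtcNow - createdUtc > TimeSpan.FromHours(48))
            return CardColour.Red;
        if (approved) return CardColour.Green;         // approved: green
        if (reviewerCount > 0) return CardColour.Blue; // has reviewers: blue
        return CardColour.Yellow;                      // new PR: yellow
    }
}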
[Screenshot: colour-coded PR cards in Storybook]

This was a nice little mock-up, but there was no real data behind it at this stage. Fortunately, the Git server we use has a fairly straightforward API, and so it didn’t take long to get some real data behind this component.
[Screenshot: the PR board backed by real data]

It’s really easy to see when we have PRs which are starting to get stale. Quick – at a glance, how many PRs here have been hanging around too long and need attention?
[Screenshot: a busier PR board]

This has become the go-to way of seeing our outstanding PRs at a glance, and has since gone up on a big screen on the wall in our dev team office. I soon got requests for a few other widgets to go on the same dashboard, and our little side project has become a key part of our DevOps toolkit.
[Screenshot: the dashboard on the dev team’s wall-mounted screen]

Has It Worked?

Having those cards up where we can see them during the day has been good – but the biggest signal is during stand-up each morning. A quick glance at the TeamLab PR board has become part of the ritual, and if those cards start to build up – especially if they start to turn red – the team has a really strong signal that we’re getting behind on our code reviews.

I don’t currently have a report which tells me the Mean-Time-To-Merge for our PRs – but I don’t think I need it. Mean-Time-To-Merge isn’t as strong or immediate a signal as a pile of glaring red PR cards looming over our morning stand-up, nor does it provide the immediate sense of relief when we clear the board.

What Next?

I’m not sure what will go on the dashboard next, but I have some idea what kinds of things I’m looking for.

I need things I can measure – things I can pull straight out of an API. Things which can directly influence numbers like Mean-Time-To-Release – but I don’t want to display averages like that. I’m going to give people a dial they can turn directly. I’ll pick an angry colour like red for things which are outside targets, and nice friendly colours like blue and green for things which are on track. Once something is off the list, I’ll make it go away.

In short, I want to find things which I can measure, which team members can directly influence, and which will improve our overall quality – and I want to put them up where everyone can see them.

Announcing NSchemer 1

If you already know all about NSchemer, you can jump straight to the Version 1 release notes.

What is NSchemer?

Database schema management has been an interest of mine for a very long time. I’ve seen all sorts of approaches tried: folders full of .sql files, schema version tracking in Excel, and of course the tried-and-true manual approach using schema diffing tools. I’m a keen proponent of automated schema management. Automated deployment is all the rage these days, and if you can’t automate your schema updates, you can’t automate your releases.

I prefer to go one step further: I like to aim towards single code path schema management. Any database, whether a brand-new one to support a new installation, or an ancient database restored from a backup for a returning client, should get to the current version using the same code path – or at least, as close as possible.

When I started pushing this idea – that developers should write their own SQL migrations as they went, rather than leaving it to the designated DBA to do in the lead-up to a release – I got some push-back. Some of my team didn’t want to write SQL. Thus, NSchemer was born. The framework languished in alpha status for many years, despite being actively used in a number of production systems. Recently, I finally decided to tidy up the API, add a few new features I’d been meaning to for a while, and bump it up to version 1.

Why Automated Schema Management?

The number one reason for automating your schema management is testing and reliability. Assuming, for a moment, that you have test environments, automated schema management means your test environments should go through the same migrations as your production environments will – automatically, with no opportunity for a manual step to get skipped or done incorrectly. This gives you a lot of confidence that when you hit the Big Red Button to go live with a new version, your schema migrations will work: the same automated set of steps which have run against all of your other environments will run against production.

You get a lot of other nice bonuses, as well. When you merge master into your own branch, not only do you get all of the new code; you also get the migrations that update your database schema to match. No more pulling in another branch, only to find you have to update your local database schema by hand.

Digging up a backup from a couple of years ago? No worries, NSchemer will bring it up to the current version without any hassle at all.

Installation

While you can install NSchemer into an existing assembly, I typically create a new assembly just for managing schema transitions (I use a console app, so I can run the transitions from a script during deployment). Once you’ve created YourProjectName.Schema, just

install-package NSchemer

and you’re ready to go.

Show me the code!

NSchemer uses a single class which inherits from SqlClientDatabase to represent a versioned schema. Just create one, implement the Versions collection, and start writing transitions (beginning from 1 – NSchemer uses version 0 internally). If you’re starting with an existing schema, just use your favourite SQL tool to generate a full CREATE script, and drop it in as version 1 (use the embedded resource transition mentioned below).

public class TestSchema : SqlClientDatabase
{
    public TestSchema(string connectionString) : base(connectionString) {}

    // The ordered list of schema versions. NSchemer records which of these
    // have already run, and applies the rest in order when Update() is called.
    public override List<ITransition> Versions
    {
        get
        {
            return new List<ITransition>
            {
                new CodeTransition(1, "Initial Schema", BuildTheWorld),
                new CodeTransition(2, "Add Widget Table", "This script adds a very important table", AddWidgets)
            };
        }
    }

    private bool BuildTheWorld()
    {
        CreateTable("Thing",
            new Column("ThingId", DataType.BIGINT).AsIdentity(1, 1).AsPrimaryKey(),
            new Column("ThingName", DataType.STRING, 50)
        );
        CreateTable("ThingAnnotation",
            new Column("AnnotationId", DataType.BIGINT).AsIdentity(1, 1).AsPrimaryKey(),
            new Column("Text", DataType.STRING, 50),
            new Column("ThingId", DataType.BIGINT, false).AsForeignKey("Thing", "ThingId")
        );
        return true; // returning false (or throwing) marks the transition as failed
    }

    private bool AddWidgets()
    {
        // Raw SQL is fine too, for anything the helper methods don't cover.
        RunSql(@"CREATE TABLE DBO.Widget (WidgetId [int],WidgetName [nvarchar](50)) ON [PRIMARY]");
        return true;
    }
}

The core of your schema class is the Versions collection, which contains a numbered list of transitions to be run in order. NSchemer will automatically create a table to track which versions have and haven’t been run, and whenever you call Update() on your class, it will work out which transitions haven’t run yet, and apply them. Assuming your transitions all ran without exceptions and returned true, your schema should now be up-to-date, and the version history table updated. If any of your transitions either returned false, or threw an exception, Update() will throw an exception.
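
In practice, that means bringing any database up to date – whether brand new or restored from an old backup – is a couple of lines. A minimal sketch, with a placeholder connection string:

var schema = new TestSchema(@"Server=.\sqlexpress;Database=MyApp;Integrated Security=true");
schema.Update(); // applies any transitions which haven't run yet; throws on failure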

Why not just SQL?

You’ll notice, in the sample above, that as well as being able to write SQL transitions, there are helper methods like CreateTable. These are here for four reasons:

  1. Some developers refuse to write SQL.
  2. Some developers write terrible SQL.
  3. They’re actually pretty convenient.
  4. If/when NSchemer officially supports non-MSSQL databases, your transitions should be cross-platform.

If you don’t want to use these convenience methods at all and you’re happy just using SQL, you could also look at some of the SQL-only frameworks which do the same thing as NSchemer, such as DbUp (which has explicit support for a range of other databases as well as Microsoft SQL Server).

If you have larger blocks of SQL to run, don’t put them all into a string like in the example above: NSchemer also supports resource files. You can create a transition like this:

new SqlScriptTransition(3, "Add another table", "NSchemer.SystemTests.EmbeddedFile.sql")

It will look in the same assembly for an embedded resource file with that name. There is also an overload which allows you to specify a different assembly for the resource file.

NSchemer uses a similar format to SQL Server Management Studio: it uses GO on a line by itself as a command separator, allowing you to submit multiple chunks of the file as separate commands.

Configuration

NSchemer has a couple of options you can use to control its behaviour. I’m afraid the API is inconsistent and the options are limited: I plan to address this when I do the overhaul in 2.0 (see below).

  • VersionTable
    Override this property to change the name of the table NSchemer uses to track which versions have run. Default: NSCHEMER_VERSION
  • SchemaName
    Set this property to use a schema other than dbo (use with caution: this has limited test coverage – see the sketch below).
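
Both options are just members of your schema class. Here’s a minimal sketch (I’m assuming plain string properties, per the descriptions above – check the exact types against your version of NSchemer):

public class AuditSchema : SqlClientDatabase
{
    public AuditSchema(string connectionString) : base(connectionString)
    {
        SchemaName = "audit"; // create everything in [audit] instead of [dbo]
    }

    // Track applied versions in a custom table instead of NSCHEMER_VERSION.
    public override string VersionTable
    {
        get { return "AUDIT_SCHEMA_VERSION"; }
    }

    public override List<ITransition> Versions
    {
        get { return new List<ITransition>(); } // transitions go here
    }
}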

Can I rely on NSchemer?

You should be testing all of your migrations in testing and staging environments before they make it to production. These tests will also ensure NSchemer is behaving itself in your environment. Should you start using NSchemer in production without ensuring it goes through a testing pipeline? No, but you shouldn’t be running your own code that way either.

If you find any bugs or problems in NSchemer, please report them on the NSchemer GitHub repository. I use NSchemer myself, and I’m keen to fix any bugs you find as soon as possible. I’m also open to pull requests.

Implementation Advice

I like to run NSchemer in two different ways: one for development environments, and another for deployed environments (testing, staging, UAT, production, whatever you prefer to call them).

To run in development environments, I throw some guards in (to make really sure it never runs in a deployed environment), and put it somewhere it will run on application start-up. It might look like this:

// Only auto-run migrations against a local dev database, with a debugger attached.
if (Debugger.IsAttached && _connectionString.Contains(@".\sqlexpress")) {
    new MySchemaClass(_connectionString).Update();
}

For deployed environments, I make sure the assembly containing my transitions is a console app, run the transitions from there, and return or output a success/failure message so my deployment tool knows whether to continue or alert me that something went wrong.
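
That console app doesn’t need to be anything fancy. Here’s a minimal sketch (the environment variable and exit-code convention are my own, not anything NSchemer prescribes):

using System;

public static class Program
{
    public static int Main()
    {
        try
        {
            // Hypothetical: read the connection string from wherever your deployment provides it.
            var connectionString = Environment.GetEnvironmentVariable("DB_CONNECTION_STRING");
            new MySchemaClass(connectionString).Update();
            Console.WriteLine("Schema update succeeded.");
            return 0; // deployment continues
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine("Schema update failed: " + ex);
            return 1; // deployment tool stops and alerts
        }
    }
}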

Upgrading from 0.x

The only changes you should need are to sprinkle a few using NSchemer.Sql statements at the top of your files.

You should notice some API improvements:

  • When you create columns, there are new options you can provide using a fluent syntax (sketched together after this list):
    • .AsPrimaryKey()
      Only allowed when creating a new table. Specifies that this column is part of the primary key. Supports composite keys.
    • .AsForeignKey(…)
      Allowed when either creating a table or adding columns to an existing table. Allows you to indicate the referenced table, column, and (optionally) cascade options.
    • .AsIdentity(…)
      Allows you to create the column as an IDENTITY (auto-increment) column. You can specify the initial seed and increment value.
  • Description is no longer required on transitions.
  • CreateTable(…) now has a params syntax, so you don’t need to create a List every time you use it.
  • You can now specify the nullable status of a column for data types which don’t require a length.
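
Put together, a transition using the new fluent syntax might look like this sketch (the table and column names are made up, and I’ve only used calls which appear elsewhere in this post):

private bool AddOrders()
{
    CreateTable("Order",
        new Column("OrderId", DataType.BIGINT).AsIdentity(1, 1).AsPrimaryKey(),
        new Column("Reference", DataType.STRING, 20),
        // The third argument sets the nullable status for types which don't need a length.
        new Column("CustomerId", DataType.BIGINT, false).AsForeignKey("Customer", "CustomerId")
    );
    return true;
}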

The Future: NSchemer 2.0

As I’ve used NSchemer, I’ve discovered a few shortcomings of the existing API. I made some minor breaking changes when I bumped NSchemer to 1.0, mostly just relating to namespaces, but 2.0 is likely to be a more significant overhaul of the API.

I also want to split Microsoft SQL Server support out into a separate package, and provide official support for other database servers. If you have strong knowledge of MySQL, PostgreSQL, Oracle, or another database platform and would like to help maintain support for that platform, please get in touch.

License

I’ve chosen to release NSchemer under the LGPL because I want it to be generally usable, but I want to make sure any improvements are available to everyone who uses it. If you want to use NSchemer but the LGPL is a problem for you, please let me know. I’m sure we can work something out.