“Every time an Agile team switches to properly formatted Kanban an angel gets its wings” - me
I used to think anything Agile was awesome, and different strokes were just equally good but for different folks. But now I do not think that, at all. The TLDR hits my main point, but I’ve bunched up a number of stories to support some common Scrum anti-patterns, culminating in a suggestion that Kanban can fix the most scary structural one. I hope you’ll read on - the issues may seem subtle but with the benefit of hindsight I think they’re the drivers of much of the “but I thought we’d get more done” pondering I’ve seen over the years.
A special thanks to Stuart Corrigan’s short, free e-book (Fr)Agile which I learned about at a Critical Chain project management conference. It sort of red-pilled me on this subject. I feel I added onto it along the way, but what he wrote helped me to see reality more clearly.
This one pulls together a lot of the operating concepts I’ve been preaching - along with stories of Scrum gone wild. Then I try to tie it all back together on what I think should be true about a throughput based Agile approach. Which Kanban more or less satisfies given the (almost) tenets of the TLDR below.
Without further delay LFG!
TLDR;
Make sure you have great OKR’s so you truly know what’s important. Use my guide or any of the more famous ones.
Read “Goldratt’s Rules of Flow” and make sure you understand how critical it is to have everything ready for a task (full kit) and how bad multitasking truly is within agile processes.
Switch to Kanban so you can really visualize and reduce WIP
Avoid Scrum because the perceived upside “we let the product owner fill in the capacity for a period so they leave us alone” has a fatal flaw of “the product owner has a huge incentive to use all the capacity.” Using all the capacity sounds fantastic. But … that’s a case study of focusing on local efficiency (utilization) vs. global throughput. Spoiler: you care a LOT more about throughput.
Make sure your entire management focus is on the prioritized release of work and unblocking people so they don’t multitask and/or wait on blockers.
Another recommendation to read the damn software flow book I keep bringing up. It probably is just like 2 hours of your time. But of course read the rest of this first.
You’re welcome. ;-)
Longer version: Throughput > Efficiency
This is the section where I suggest that several things that we think are great about Agile processes are totally wrong and should be abolished. Unless you don’t care about velocity and achieving highly leveraged business results.
No … I’m not going to argue for Waterfall, per se. Even if I believed in the merits of some of it (actually, I do) - I don’t quite feel like being burned at the stake. But I am going to argue that a few things in Scrum tend to get abused, and there’s basically an unalterable flaw in the system that leads me to believe you’re better off not using it the way most of us have been using it.
Apologies, this section does expect some prior knowledge of Agile/Scrum and some of the associated lingo. But I’ll try my best to not make things require total nerddom in the topic. I’ll start with some context…
A biased history of the amazing shift that is Agile.
The tablets come down from the mountaintop
There was a time, believe it or not, when Agile was controversial1. Folks were saying things like “what are those newfangled whippersnappers up to?” and screaming “get off my lawn!” with some regularity. Maybe a slight exaggeration. But there was real skepticism of the practice, though that was also still an age where some devs would feel they didn’t need any kind of source control system. It was sort of the wild west now that I come to think about it.
To my recollection, back then “Agile” more or less meant “Scrum.” I’ve certainly heard lots of different views of why Scrum is awesome - and lots of them, like breaking work down for continuous delivery and having milestones that are verified (and celebrated) during team demos, make a ton of sense. Of course I’ve always loved the continuous learning mechanism of the team retro.
Though when I read the early Agile manifesto one thing jumped out at me as the true reason for the approach - the backlog and the sprint planning meetings. Specifically the process of having a team estimate units of work and the forcing function of the product owner selecting only what will fit in the next sprint. And THEN LEAVING THE TEAM ALONE WITHOUT CHANGING THEIR MIND FOR 2-4 WEEKS.
To this day, admittedly without any evidence, I remain convinced that the reason that Scrum spread like wildfire at a let’s spill gasoline and play with matches convention is because it was impossible to not read the idea and think “OMG - I can get to actually code for 2-4 weeks without someone changing their flipping mind on the requirements.” Even with the predictable problem that folks would feel they didn’t need to really think through the requirements anymore2, the tradeoff on focus seemed like the deal of the century.
Undesirable effects - rise of the heretic
Over time I began to feel there were some cracks in the armor of execution perfection promised by Scrum. This wasn’t super obvious in the best, smaller teams - those where the various folks required for something to come together were in perfect sync. But as work became complicated, getting a project done started to require input from product managers, scientists, engineers, designers, operational teams, etc. All things have weak points. Scrum provided flexibility, but if one didn’t address the potential pitfalls intentionally plenty of undesirable things could happen.
Before I get back to my main thesis, I’m just going to riff a bit on undesirable aspects of Scrum I’ve noticed over time in the next few sections. This isn’t intended to be a guide to “how to do Scrum well.” Therefore things like ensuring you have a definition of “done”, a well groomed backlog, and how to run an effective retrospective are being left out for now. Feel free to check out this article for what I think are some important things you’ll want to make sure are true for any Agile implementation.
Low rigor in defining “what we want to accomplish” (the requirements)
With Scrum it’s a bit too easy to not have a high bar on rigor around “what are we building?” In the (not entirely so) good old days folks would be tasked with creating a requirements document that detailed everything to solve the problem at hand. Shockingly this one and done document didn’t always work out and epic stories of changing requirements were common. Scrum’s sprint cadence provides a natural way to buffer the shocks of such changes by explicitly allowing the product owner to learn and pivot every few weeks.
For those readers who’ve worked with humans before the likely undesirable effect is probably obvious. While a requirements document may be written, it’s all too easy to either skip the step entirely or just not apply a high bar on the documents. Much as people in media production joke that when a terrible mess happens they’ll just “fix it in post” (production), this underinvestment in up front analysis is rationalized on the grounds that it can be fixed in a later sprint. This is true but with a but that Sir Mix-a-Lot would love. You don’t want to come up with the requirements as you go for the same reason you want to be crystal clear when setting goals what the problem is you’re trying to solve and why that’s important. Otherwise you’re just going to be lurching from idea to idea - which itself is almost always a sign that you don’t really know what the most important thing for your customers is.
Here's a suggestion for those looking for a fast test to assess whether your product team has nailed the requirements at the 80%+ level. Does the product manager feel they need to attend the team’s daily standup? Or worse - does the dev team feel they need the PM there at the daily standup? If so, then it’s likely more proactive thinking around the requirements (and likely your OKR’s) needs doing.
I get pushback almost every time I make this point. But “We love having Mary attend the standup, we work better when everyone is in sync and she often has great insights!” Or the PM says “If I don’t go to stand-up then how will I know what the team is doing?” While often at the same time each side is complaining that the product manager is behind on some other thing that they probably should be thinking ahead on (ironically, often the next stage of requirements).
My advice at this stage (whether you buy my argument or not) is to be curious and ask - “what specifically will go wrong if Mary isn’t at the standup for this sprint at all?” Of course, push deeper on anything that’s nonspecific in the answer. I suspect in a meaningful number of cases you’ll hear something that reduces to “if they’re not there I won’t know what to do next” or “if I guess at what to do next there’s a good chance I’ll be wrong because the requirement/user-story/napkin isn’t clear enough.” Bingo! This is the point I’m trying to make. If you go back to the Scrum origin story - the entire point of the exercise is that the team enters the Sprint with clear information on what they are expected to do. Whether this is a user story, API definitions, or just adequate context to understand “done” the intent is for the execution team to be able to execute.
If the product manager (PM) is needed to handhold so consistently that they’re always meeting with the devs then they’re either (a) not specifying what successful completion of the Sprint components is, (b) doing some important aspect of the Engineering lead/manager’s job, or (c) changing their mind as they go. All of these are bad - and often it’s a combination of these.
I’m not saying anyone should ban the PM from hanging with the dev team. But I am saying it should be a “want” and not a “need.” Also, you should be very suspicious that the PM has so much time to hang out with the devs in this way. If so there’s a good chance they’re then underinvesting in the incredibly hard and complex job of thinking through the longer term execution aligned with the core problem the business needs solving. This harder work gets put behind the easier dopamine thrill of hanging out with the developers and feeling like you’re contributing to the coding part of things.
Throughput is your goal, not self organizing autonomy
One of the things that also gets a lot of positive attention is how Scrum supports a working team autonomously self organizing their work by how they choose elements from the backlog. In theory everything going into a Sprint has passed through the “Sprint Planning Meeting.” During this meeting the product owner (often the PM) sits down with the team and reviews the “groomed backlog.” This is a list of things that the group wants to do - including product features, important bug fixes, and other work that improves the team’s ability to build now and in the future (i.e., infrastructure/platforming projects). In theory, using a well defined prioritization rubric the product owner selects what items from this backlog should go into the plan for the next sprint. The team, having provided the input estimates, then reviews and “commits” that they will work over the Sprint cycle to complete everything chosen.
There are a few things that tend to go a bit wrong here. The product owner is highly incentivized to include as much as they can in the next sprint, leaving not enough slack/buffer for things that go sideways. Additionally, there is often a long simmering battle over investments the team feels are technically valuable that the PM doesn’t want to do now. Typically this is because the devs want to have a simpler codebase allowing them to go faster, and the PM feels they only get measured on what’s shipped (without much risk of being blamed if things break after launch). These are both big issues - but they’re not the problem I now refer to as the core Steve problem of Scrum.
To understand what I’m referring to it’s important to recount how Scrum teams typically organize the doing of work within the Sprint (2-4 week period). Going back to my recollection of the manifesto writings, the commitment of the team is to complete the work agreed to in the planning meeting - but no agreement is made on the ordering of work. This is intentional, because one of the issues with people changing their minds (and a reason for keeping the product owner from poking their nose into the sprint) is that they tend to try to get teams to deliver specific things in specific order within the sprint. This violates the separation of focus concerns and autonomy of the team. Which is viewed as bad. For the record I’m extremely in favor of teams being able to operate with high autonomy. As mentioned in my writings on OKR’s, a huge benefit of that process is the ability to ensure high alignment (through objectives and context setting) while also providing high autonomy. These are often seen as in conflict - but they’re not, and you need both for a high achieving organization in the long term.
The problem comes from the simple fact that no plan survives contact with reality. With the corollary that almost no Sprint finishes all the work committed to. Given that the commitment is only to complete the work for the sprint, and not in any order - most teams default to a work assignment model that is “pull driven and interest based.” Meaning that while maybe there is team discussion, in some way the individual devs select which item they will work on next. This often means that only one dev will work on each item, and more often than not when a dev gets blocked in some way they select a new item from the Sprint’s backlog while they try to figure out how to complete the last item.
This is possibly the most important point so I’m going to write it out again in a super annoying way.
Having the product owner cram in as much as they can within the “sprint budget” encourages a focus on accounting efficiency (“did I use all the resources I could?”).
This is the direct enemy of throughput - which is gained by maximizing focus on the most important thing, protecting it from distraction,
and ensuring it’s completed before the next thing.
Ok, now that I’m done shouting I’m going to share a real life example of how this sort of happens. I’d call it the Tao of Steve except I’m pretty sure I saw a movie with that title something like 25 years ago.
The Steve parable - start something and finish it, don’t just do lots of work in parallel.
So let me stop and share the story of Steve and his team when I was at IMDb. Before I get started it’s worth me being explicit that Steve was (and likely still is) an incredibly gifted and trustworthy engineer. This was a strong team, with an excellent manager - this was 100% NOT a team that had performance issues. They were responsible for a key middleware layer that enabled our collection of mobile apps to function, and powered the app teams to build quickly without having to understand all the nuances of the larger IMDb ecosystem. Key stuff, staffed by good and conscientious people.
One of the projects they had on the backlog was switching over the authentication mechanism we used for IMDb logins. Logged in users were important in the long run. Without a logged in experience the site was valuable, but just not as sticky. Ensuring there was low login friction was a key controllable3 input.
The team took on this project with enthusiasm, and a clear requirement. But month after month as we reviewed their sprint outcomes the project wasn’t completed - likely it “just needed another sprint or so.” While we hadn’t drawn a line in the sand for when it MUST be completed - it was exceeding the estimated investment, and we wanted the business benefits. Even though I was not the direct manager, nor the manager’s direct manager, I started to be super curious what was going on with these delays (relative to repeated sprint commitments).
Probably a few sprints had gone by this way with me hearing about it indirectly. Most likely the narrative was around unexpected (and detailed) technical issues, folks being out of office, or other specifics. Not something systemic in how work was being done. Eventually, and admittedly too late, I sat down with the team to deepen my own understanding.
What I eventually learned would both haunt me, and inform my views for years to come. While there were 6 people on the team, it slowly dawned on me that only Steve was working on this problem each sprint. Being one of the most senior folks on the team, Steve also had other responsibilities - such as jumping in to help teammates, helping prep other estimates for new work, and so on. Because it was “Steve’s project” and others were pulling tasks of interest there wasn’t a bias to jump the other way to help Steve out. Eventually by being curious about everyone’s assumptions I realized the key mental model of the team was “we do work in parallel in order to go fast and preserve individual autonomy and interests4.” It was a lightning bolt moment of feeling stupid for me. We (and by “we” that means “I”) had not communicated the context on the relative importance of the authentication work to the team. That let them make this series of decisions based on the otherwise reasonable belief that everything was equally important.
Discussing it more as a group, we made a larger fundamental change. The mental model became “don’t do things in parallel” but instead that everyone should work together to complete the next big thing (the authentication work), rather than trying to do everything all at once. Once we reset this expectation for the team, things got done a lot faster. It also shouldn’t surprise students of flow that more also got done5.
This concept of “swarming” isn’t new and I know it’s often been described as a best practice - well before I had this epiphany. Outside of software development, operational processes have famously sought to streamline the number of things going on at once - epitomized by the goal of single piece flow from the Toyota system. Even so, this Scrum tendency is worth keeping an eye out for because at almost every place I went to after this story I observed teams proudly working on 8 “things” in a sprint with 6 people. So the idea of focusing and minimizing the number of workstreams with only one person on them is not a solved problem in the wild.
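If you want to see the effect in numbers, here is a minimal sketch of the idea, not a model of any real team. It assumes the team is a single pool that can put one day of effort into one item at a time, that work is perfectly divisible, and that switching between items costs nothing (which flatters multitasking, since real context switches are not free). The function names and the three 5-day items are made up for illustration.

```python
from typing import List


def finish_days_focused(durations: List[int]) -> List[int]:
    """Day each item finishes when the pool completes items one at a time, in order."""
    finish_days = []
    elapsed = 0
    for duration in durations:
        elapsed += duration
        finish_days.append(elapsed)
    return finish_days


def finish_days_multitasked(durations: List[int]) -> List[int]:
    """Day each item finishes when each day of effort rotates across all open items."""
    remaining = list(durations)
    finish_days = [0] * len(durations)
    day = 0
    while any(r > 0 for r in remaining):
        for i in range(len(remaining)):
            if remaining[i] > 0:
                day += 1               # spend one day on item i
                remaining[i] -= 1
                if remaining[i] == 0:
                    finish_days[i] = day
    return finish_days


# Hypothetical sprint: three items, five days of effort each.
items = [5, 5, 5]
focused = finish_days_focused(items)
multitasked = finish_days_multitasked(items)

print("focused:    ", focused, "average", sum(focused) / len(focused))
print("multitasked:", multitasked, "average", sum(multitasked) / len(multitasked))
```

Both approaches burn the same 15 days of effort, so the "utilization" looks identical. The difference is when value shows up: focused work ships the first item on day 5, while the rotating approach ships nothing until day 13.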
But … taking this further, this story points to the fundamental flaw in the scrum methodology if you’re seeking to maximize flow (aka velocity) in your development process.
Kanban - what is it good for?
For many years I’d heard about and even worked with teams using the Kanban approach to Agile6. I knew a few basic things about it. First, that it was based upon a system pioneered at Toyota that was designed to control work in process across stations in a factory - using a visible card to signal when it was time to do the next piece of production work. Second, that when people explained it to me they said “it was like scrum, but instead of designing a sprint ‘we just pull the next thing to do off a list’ and do it, oh - and we have a cool board too!”.
I’d ask “Tell me about this board” and be told that each “story” was on a notecard/sticky and that the team tracked all work by placing the cards on the proper place of a highly visible physical board. Where the board’s layout consisted of the following columns:
Backlog -> Doing -> Review -> Done
My main thought each of these times was along the lines of “this seems sort of cool, but I don’t really see why it’s better?” Plus “this ‘pick any piece of work you want’ approach seems to violate the forced structure of carefully ordering work by product value/priority.”
I WAS WRONG. Full stop.
Clue #1: The one time I walked into a team room led by a manager named Patrick I saw the spark of difference. He’s the one who initially made me sure I was missing something, even if it took me years to realize what exactly7. The thing was that when I walked into that team room there was a sense of clarity that the team knew exactly what they were doing in which order and why. I just couldn’t put my finger on the reason for it.
Clue #2: Much, much later I think I figured it out - aided by a mini book by a Stuart Corrigan8 and some self introspection after reading Goldratt’s Rules of Flow and attending a Critical Chain virtual conference.
If you’re just having people pull whatever type of work they want next from the backlog then Kanban is nice, but not really unique in the Agile universe. An approach that addresses the undesirable effects of Scrum without introducing some negative second order outcome is what’s needed. Therefore you want a method that:
Ensures that team members are working on the most important work as defined by your objectives and key results.
Retains a high degree of team9 autonomy for how the team builds and organizes ways of working.
Controls work in progress (WIP)10 from the top down so as to enable high rates of flow focused on global needs.
Visually highlights constraints to flow such as bad multitasking or too much WIP.
It turns out Kanban addresses all of these as long as you ensure that there is coordination on the release of work (starting new development units), the release of work is according to a very simple prioritization rubric aligned with OKR’s, and your team doesn’t just pick up new work when blocked on something11.
Isn’t coordinating work anti-Agile in some way?
The coordination of release of work may run counter to a lot of Agile talk in the past - and could be seen as “non-empowering.” But my view is that we want to empower teams to maximize impact. Impact is throughput aligned against true goals - what you’ve written your OKR’s against, taking into account your unique system bottlenecks. That holds even if this assignment in the short run occasionally runs counter to what would be most personally interesting for a team member.
I’m not saying that personal development and other interests shouldn’t be balanced. But we all mostly have a tendency to want to do things that give us a sense of accomplishment, and often that’s at the expense of doing the harder, but more critical things first. Therefore, this shouldn’t be 100% left up to any one person’s feelings at the moment. Unless that’s one of the very few priorities we’ve decided to use as our decision rubric.
Step 1 is to release work for development (as tracked on your Kanban board) in the order it is needed, i.e., aligned with your prioritization rubric12. Step 2 is that once released the work must be completed before moving onto something else. People get stuck/blocked all the time - one should expect that to happen. But once someone is blocked, then it’s important that the team react with the belief that unblocking that person is the most important thing that can happen. If a manager for example sees a team member is blocked, then the manager’s top job at that moment is restoring the flow (aka - helping break the blockage).
If someone on the team (including the dev) has to wait a bit without doing “productive work” then that’s perfectly OK. Because we are maximizing the flow - NOT maximizing the “efficiency” of that developer. This is pretty different from what normally happens in my experience, where you get blocked, so you switch over to something else while you wait for whatever is blocking you. I do this too - unless I use a structured tool to avoid it - it’s very very easy to just think multi-tasking is the right call. It’s not, sorry.
Likely as a manager (or a PM) you hear “I’m blocked waiting for ____ so I pulled another task” pretty regularly without it triggering you to spring into action to resolve the issue. But if we care about flow that’s what must happen. What’s great about Kanban is that the board provides an ongoing visual perspective as to whether people are blocked and WIP is piling up. This creates a hard to miss feedback loop that, if used with these simple expectations, should cause an ongoing focus on maximizing throughput.
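For the code-minded, the two steps above can be sketched as a tiny board with rules baked in. This is my own illustrative reading, not a standard Kanban implementation: the `Card` and `Board` classes, the `wip_limit` value, the numeric `priority` field (lower means more important), and the card titles are all hypothetical. The one opinionated rule is that nothing new gets released while any in-progress card is blocked.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Card:
    title: str
    priority: int            # hypothetical rubric: lower number = more important
    blocked: bool = False


@dataclass
class Board:
    wip_limit: int
    backlog: List[Card] = field(default_factory=list)
    doing: List[Card] = field(default_factory=list)
    done: List[Card] = field(default_factory=list)

    def blocked_cards(self) -> List[Card]:
        """Cards in progress that someone needs help with right now."""
        return [card for card in self.doing if card.blocked]

    def release_next(self) -> Optional[Card]:
        """Step 1: release the top-priority card, only if flow allows it."""
        if not self.backlog:
            return None
        if len(self.doing) >= self.wip_limit:
            return None      # too much WIP: finish something first
        if self.blocked_cards():
            return None      # unblock first, do not start something new
        card = min(self.backlog, key=lambda c: c.priority)
        self.backlog.remove(card)
        self.doing.append(card)
        return card

    def finish(self, card: Card) -> None:
        """Step 2: released work moves to done before anything else starts."""
        self.doing.remove(card)
        self.done.append(card)


# Made-up example cards.
board = Board(
    wip_limit=2,
    backlog=[Card("Search tweaks", 3), Card("New login flow", 1), Card("Logging cleanup", 2)],
)
login = board.release_next()        # highest priority goes first
cleanup = board.release_next()      # second highest
print(board.release_next())         # None: at the WIP limit

board.finish(cleanup)
login.blocked = True
print(board.release_next())         # None: someone is blocked, go help them
login.blocked = False
print(board.release_next().title)   # Search tweaks
```

The point of returning `None` instead of quietly handing out more work is the same as the point of the physical board: a blocked card or a full "Doing" column should be impossible to ignore.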
Conclusion
That’s it - that’s like 80% of the magic, and why I strongly recommend ditching Scrum in favor of Kanban. If you can avoid the anti-patterns I called out your throughput will likely rise, and with it overall internal and external satisfaction. You cannot do it without:
great OKR’s set in your customer’s world,
tenets to preload decisions,
simple prioritization rubrics that focus attention,
treating tech-debt as a throughput and not a moral problem, and
strong blameless culture that supports psychological safety.
But if you combine these things (and a few others) with WIP control and a zeal to stamp out bad multi-tasking, magic can happen.
If you’re new to the Substack I encourage you to read through my prior ramblings. As a set the whole may be worth more than the sum of the parts.
Yes, I’m that old. Yes I know how to use a rotary phone. No - I didn’t ride a dinosaur to school (my daughter occasionally asks). Yes - of course it was uphill both ways. GET OFF MY LAWN!!!
This is basically another way of saying the PMs don’t really know what they’re doing. A favorite complaint from members of the development tribe. ;-)
or at least an “influenceable” input.
Not literally, as if they had written tenets to that extent. But this was my translation of it. BTW - this is one of those areas where writing tenets for what’s needed (everyone focuses together on completion) vs. wanted (pulling interesting work and owning it individually) could be super useful.
This is another of my obligatory references to Goldratt’s Rules of Flow - you’d think I was getting paid for referrals or something. For the record, sadly, I’m not.
Look, sometimes when you’re a manager of managers you miss learning something important because it’s buried in the one out of six teams that do something differently. Since this point I’ve tried not to make that mistake - looking for why teams make different choices is an important way of looking for bright spots in your org.
This is almost certainly my fault and
I’m not saying I 100% agree with everything in the book. I think there’s maybe more good in Agile given the history of software development than one might take from it. But on the core point about a serious limitation on the “let’s fill up the hopper with work” and not tie these to excellent OKR’s I think it’s been a game changer for me.
Individual autonomy has positives, but it’s not the goal here. I think there are other ways to scratch that itch and I’m happy to write more about it in the future if there’s interest.
Work in progress is what it sounds like. If you have 6 people in a dev unit and 8 parallel things that is too much work in progress. That’s very bad for flow - as repeatedly demonstrated in physical manufacturing. It’s also very bad for most types of projects - including software development. WIP = lots of multitasking = less throughput.
I’m going to keep quoting the key point that Eliyahu Goldratt makes when it comes to reducing multitasking - “start something and finish it. Period.”
Of course technical considerations are likely to come into play here in the real world.