Many of my clients and the people I talk to are using Jira – it seems to be the default choice here in Australia and no doubt in some other countries for Agile teams. However, I have personally found the Kanban implementation lacking many things and over the years have had to implement a number of “Hacks” or workarounds to get it to work somewhat closer to the Kanban that I understand. Ideally, if you want to go deeper with Kanban you should be looking at tools like Kanbanize and SwiftKanban, but in true Kanban spirit, we’re going to “Start with what you do now” and look at how you can get more out of Jira.
The “No sub-columns” problem
This problem is fairly straightforward: Jira doesn’t allow you to have sub-columns. Here’s an example of part of a board in Kanbanize that has sub-columns:
What is useful about this is that you can very clearly see which items are being worked on and which items are ready for the next step in the process. You might say that “Jira can do that, it’s just two separate columns”, which is true. However, the key to this is that the WiP limit sits above both sub-columns. This prevents people from starting new items if the next part of the process hasn’t picked up the finished ones – in this example, developers can move downstream and help testers fix bugs, write automation or do anything else to get the flow moving again. Usually what happens in Jira implementations is that you put a WiP limit of 5 above each column (creating higher overall WiP), or perhaps you split it 3 & 2. Either way, at some point the “Done” column fills up and something else gets finished in “In Progress”. Where does this go? Essentially, either you break the WiP limit in “Done” or you have a “Done” item sitting in “In Progress”. Now your Kanban system doesn’t reflect reality, and either there isn’t enough stress on the system to enable change or the increased WiP is throwing out lead times & predictability. Additionally, your data is going to be out – it looks like items are “In Progress” rather than queued up, which further reduces the stress on the system to fix the queuing and WiP problems.
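To make the difference concrete, here is a minimal Python sketch of a column whose sub-columns share one WiP limit. The class and field names are hypothetical and only illustrate the rule; this is not how Jira or Kanbanize implement it.

```python
from dataclasses import dataclass


@dataclass
class Column:
    """A board column whose two sub-columns share a single WiP limit."""
    name: str
    wip_limit: int
    in_progress: int = 0  # cards actively being worked on
    done: int = 0         # cards finished here, waiting for the next step

    def total_wip(self) -> int:
        # Queued "Done" cards still count: they have not left the column.
        return self.in_progress + self.done

    def can_pull(self) -> bool:
        """True if a new card may be started in this column."""
        return self.total_wip() < self.wip_limit


development = Column("Development", wip_limit=5, in_progress=2, done=3)
print(development.can_pull())  # False: 2 + 3 already equals the limit of 5
```

With two separate limits (say 3 & 2), the same situation would show free capacity in “In Progress” even though nothing downstream has moved, which is exactly the signal you lose.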
The way you can get around this is to add both statuses to a single column – here is the view in Jira for the board settings:
However, this leaves you with a view on the board where you still can’t really distinguish between what’s in progress and what’s done:
You can do something like the following below to add the status to the cards on the board so that you can distinguish them:
Here’s a view on the board:
Alternatively, you could change the card colours based on the statuses to give you a better visual:
The “Single Workflow” Problem
Again, the problem is fairly simple – essentially you can’t have different workflows on the same board. Here is an example (straight from one of the template Marketing boards) from SwiftKanban:
What we want to show is different work item types with different workflows that are done by the same team or are related services. We want to see all of the work together, reflecting the actual process. You can see in the above that general marketing tasks have a different workflow compared to running events. There are also different WiP limits involved and we want to control those independently – however, I’ll deal with WiP limits more broadly below.
One way people have tried to solve this is to come up with “common” workflows for both and put them on the one board. I don’t really think this is a solution – it’s more of putting a square peg in a round hole. The problem with this approach is that neither workflow is a true reflection of what is going on and you run into problems understanding what’s going on in the flow as well as making improvements. I would not recommend doing this.
The only really viable solution to this is to have two separate boards – one with each workflow. It’s not ideal, because you can’t see all of the work together, but at least you can see each workflow and get accurate data so that you can track it and make improvements on the system of work. You may need assistance from a Jira admin with status management at this point as well. Essentially, if you have a Jira project and the Simplified Workflow, all statuses for all types will appear in the drop-down. What commonly occurs is that people, whilst in the issue view, update the status to something outside of the workflow for that type. You will need your Jira admin to create a specific workflow for each type in your Jira project so that statuses don’t get mixed up and cards don’t go missing from the boards.
The “WiP Limit” Problems
There are several problems here:
Can’t assign WiP Limit to a swimlane – This can allow you to control the flow based on things like work item type or class of service. See below where “All Marketing tasks” has a WiP limit of 50, whereas “Events” has a WiP limit of 5.
Can’t assign WiP Limit to different sub-columns – although above we talk about putting a WiP limit above a column, sometimes you want the sub-parts to have WiP limits as well (see below under Events -> Execute)
One thing that is positive about the way Jira has implemented WiP limits is that they’ve included minimum WiP limits. This is useful upstream to make sure you have enough demand for the capacity, but also if you want to implement CONWIP style limits (i.e. the minimum and maximum would be the same).
Can’t assign WiP Limit to a swimlane
There is no way in the tool to achieve this. The only way to do it is to establish a policy for the board and make sure when you perform replenishment that you do a count of the items to make sure you haven’t gone over the WiP limit for that lane. This becomes more difficult to manage for “on demand” replenishment as there may be other policies in place to allow the team to “pull” work into the system.
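The count at replenishment can be done by hand, but it is simple enough to script. The following is a minimal Python sketch, assuming you can list the committed, unfinished cards with their swimlane; the function name and the card data are hypothetical, and the limits mirror the SwiftKanban example above.

```python
from collections import Counter

# Policy agreed by the team; the numbers mirror the SwiftKanban example.
SWIMLANE_LIMITS = {"All Marketing tasks": 50, "Events": 5}


def breached_swimlanes(cards, limits):
    """Return {swimlane: (count, limit)} for every lane over its WiP limit.

    `cards` is an iterable of (swimlane, status) pairs for committed,
    unfinished work. How you obtain them from your board is up to you.
    """
    counts = Counter(lane for lane, _status in cards)
    return {
        lane: (counts[lane], limit)
        for lane, limit in limits.items()
        if counts[lane] > limit
    }


# Hypothetical snapshot of the board taken before replenishment.
cards = [("Events", "Plan")] * 4 + [("Events", "Execute")] * 2
cards += [("All Marketing tasks", "In Progress")] * 10
print(breached_swimlanes(cards, SWIMLANE_LIMITS))  # {'Events': (6, 5)}
```

An empty result means every lane is within its policy and you can pull new work into it.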
Can’t assign WiP Limit to different sub-columns
Here’s one way that you might want to deal with this – in this case I’ve added a WiP limit of 2 to the “Development – In Progress” sub-column / status using a Quick Filter:
Note that the limit is included in the filter so you can do a quick check / count of what’s on the board. I’m not aware of any other ways to make this more visible / obvious that the WiP limit has been breached on the board – it also requires you to actively click the filter to check for problems.
The “No Policies on Board” Problem
The policies around how the board works can’t be attached to the board in Jira. That is, things like “Definition of Ready” / “Definition of Done” can’t be placed onto the board.
Here’s an example in the Kanbanize tool:
What’s important here is that the definitions are present in their context (just hover over the column name to see it). Team members can see this right away and understand collectively “how the work works”. It’s useful for new starters and auditing as well because you’ve effectively made the system policies explicit in the workflow. It also means that when there is an issue, the team can collectively review the policies and make improvements to the system of work immediately.
Jira doesn’t allow you to do this and it often leads to confusion as to how the workflow works.
In the days pre-COVID, with a number of the teams I worked with, I actually replicated the boards in Jira with a physical board. You could put all the policies you like on the physical board and ensure understanding and improvements take place. However, since COVID this is not an option for most and we’re more reliant on the tools than ever. The other downside for this is that you’re now maintaining duplicate information and usually one will be out of sync with the other in some way.
Another possibility is to record the policies somewhere else. Teams with Jira often also buy Confluence, so this might be a natural place to store this policy information. However, this is less than ideal as it’s detached from the context of the board and in an “information refrigerator”. Oftentimes this is ignored, unknown and is just not used. It’s not ideal, but it’s really the best option I can think of without a physical board.
The “Single Assignee” Problem
Jira, by default, doesn’t allow for more than one assignee per card. Using the Kanban practices, we often see improvements in collaboration and teamwork and often get to the point where multiple people could be working on a card. We should see that reflected on the board so that we can see the work everyone is working on.
One way to do this is to add an additional field. Now, you can either add a single field with the user picker, or add a Text Field and include people referenced by the name (using the @ character). Generally, the user picker is better because it has greater query ability and visibility on the cards, but it is restricted to one user. If you commonly have 3 people collaborating, you may want to add 2 additional fields.
You will need to be a Jira admin to add the new field and apply it to the necessary screens:
Once it’s available to projects, the users can see it in their issue view and assign it.
You will need to edit your board layout if you want to see this extra field on the cards:
You can now see collaborators on the board:
The “Poor Data” Problem
There are a couple of key problems here:
There aren’t many data reporting options – really only the Cumulative Flow Diagram and the Control Chart (the CFD is actually not too bad)
The Control chart is not particularly good
I’ve never been a fan of Jira’s control chart. I find just looking at it can be really jarring and I find it hard to get useful / actionable information from it. For example, here’s an image from Jira:
Note the Y axis – the scale is changing as you go higher up the chart. I don’t know about you, but when I was taught basic graphing skills in high school, this was always supposed to be a consistent scale. This assumes the interesting things will be at the bottom because it has the most real estate on the chart. In reality, it’s probably the items at the top that you need to focus on. This actually distorts your view on the data and limits, or even taints, what you can get from this kind of report.
The other is that it’s using standard deviation to improve “predictability of cycle time”. This assumes that predictability is the goal (“fit for purpose”). You may want to focus on raw speed rather than predictability, or perhaps something else such as profit / costs. This won’t necessarily help you with that. My opinion is that this is a manufacturing take on using data, rather than one that is specific for knowledge work where the Jira tool is aimed. I’d prefer to see other metrics such as percentiles so that I can make a decision on what to do based on the purpose of the team / service.
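As an illustration of the difference, here is a minimal Python sketch that summarises the same cycle times with the mean and standard deviation and with percentiles. The data is made up, and the nearest-rank method used here is only one of several common ways to calculate a percentile.

```python
import math
import statistics

# Hypothetical cycle times in days for finished work items.
cycle_times = [2, 3, 3, 4, 4, 5, 5, 6, 8, 9, 12, 15, 21, 30, 45]


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value that has at least
    pct percent of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


print("mean:", round(statistics.mean(cycle_times), 1))
print("standard deviation:", round(statistics.stdev(cycle_times), 1))
print("50th percentile:", percentile(cycle_times, 50))  # 6
print("85th percentile:", percentile(cycle_times, 85))  # 21
print("95th percentile:", percentile(cycle_times, 95))  # 45
```

A statement such as “85% of items finish within 21 days” is something a team can hold up against its customers’ expectations, whichever goal the service is trying to be fit for.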
My first suggestion to anyone who’s using Jira and can’t move to another more suitable Kanban platform is to install the Nave plugin for Jira. This has fantastic metrics and will allow you to make a myriad of improvements using the data.
It has histograms and scatterplots for Cycle Time and Throughput, plus it has a great “Aging chart” where you can see problems in the system right now. One thing I find really useful is that if you use the “Flag” feature in Jira for blocking items, you can get additional data for this.
I know there are some people out there that can’t add plugins for various reasons into Jira. If you’re one of those people, then here’s another potential option for you (although it’s a bit manual). For each status, add a field with “Date Time Picker” type.
Create an automation to update that date every time a card moves into that status on the board:
Then, when someone moves the card, it will update the date:
You can query the data and pull out the dates you require using the CSV export on the “Filter results” screen. Then, you can copy the key dates into something like Troy Magennis’ Cycle Time Calculator spreadsheet, which will produce charts for you. You can find it on his GitHub page http://bit.ly/SimResources
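If you would rather calculate the cycle times yourself, the exported file can be processed with a few lines of Python. This is a sketch under assumptions: the column names (“Issue key”, “Entered In Progress”, “Entered Done”) stand in for whatever you called your own date fields, and the date format must be adjusted to match what your Jira instance exports.

```python
import csv
import io
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M"  # assumption: adjust to match your export


def cycle_times_from_csv(csv_file, start_col, end_col, key_col="Issue key"):
    """Return {issue key: elapsed days} for rows that have both dates."""
    result = {}
    for row in csv.DictReader(csv_file):
        start, end = row.get(start_col), row.get(end_col)
        if not start or not end:
            continue  # the card has not reached both statuses yet
        started = datetime.strptime(start, DATE_FORMAT)
        finished = datetime.strptime(end, DATE_FORMAT)
        result[row[key_col]] = (finished - started).total_seconds() / 86400
    return result


# Hypothetical export; in practice pass an open file instead of StringIO.
sample = io.StringIO(
    "Issue key,Entered In Progress,Entered Done\n"
    "MKT-1,2021-03-01 09:00,2021-03-04 09:00\n"
    "MKT-2,2021-03-02 09:00,2021-03-02 21:00\n"
    "MKT-3,2021-03-03 09:00,\n"
)
times = cycle_times_from_csv(sample, "Entered In Progress", "Entered Done")
print(times)  # {'MKT-1': 3.0, 'MKT-2': 0.5}
```

Cards that have not yet reached the end status are skipped rather than counted as zero, so unfinished work doesn’t distort the results.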
The only other option is to call the Jira APIs and extract the data yourself and create the visualisations – but if you’re doing that, you may as well buy the Nave plugin.
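For a sense of what that extraction involves, here is a rough Python sketch that pulls status entry times out of one issue’s change history. It assumes the issue has already been fetched as JSON with its changelog included and that the changelog has the shape shown in the sample (a list of histories, each with a created timestamp and a list of changed items); check this against your own Jira version before relying on it. The fetching itself is left out, and the sample issue is made up.

```python
from datetime import datetime

JIRA_TIMESTAMP = "%Y-%m-%dT%H:%M:%S.%f%z"  # assumed timestamp format


def status_entry_times(issue):
    """Map each status name to the first time the issue entered it."""
    entered = {}
    histories = issue.get("changelog", {}).get("histories", [])
    for history in histories:
        when = datetime.strptime(history["created"], JIRA_TIMESTAMP)
        for item in history.get("items", []):
            if item.get("field") != "status":
                continue  # ignore changes to other fields
            status = item.get("toString")
            if status not in entered or when < entered[status]:
                entered[status] = when
    return entered


# Hypothetical issue, trimmed to the parts this sketch reads.
issue = {
    "key": "MKT-1",
    "changelog": {"histories": [
        {"created": "2021-03-01T09:00:00.000+1100",
         "items": [{"field": "status", "toString": "In Progress"}]},
        {"created": "2021-03-02T10:00:00.000+1100",
         "items": [{"field": "assignee", "toString": "Sam"}]},
        {"created": "2021-03-04T09:00:00.000+1100",
         "items": [{"field": "status", "toString": "Done"}]},
    ]},
}
entered = status_entry_times(issue)
print((entered["Done"] - entered["In Progress"]).days)  # 3
```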
That’s all for now folks – there are other gaps, but I haven’t covered everything here. If there’s anything important that you think should be covered, please let me know.
Jira has a very basic Kanban implementation – it may be useful to get you started, but it will continue to hold back the maturity of your organisation unless you use some of the “outside of the box” solutions above. For anyone using Jira for Kanban, I would recommend using Nave. If you can, you might want to try one of the other tools such as Kanbanize and SwiftKanban, which were designed with Kanban front of mind and suit those looking for greater organisational maturity without the workarounds.
I wrote this article a number of years ago, but I’ve been thinking about it again recently, so I thought I’d republish this with a few additions.
I have been pondering the Kanban commit point for some time and it occurs to me that there is a striking similarity between this and the formation of contracts in the eyes of the legal system. I’m taking this from my perspective here in Australia, however I believe that the basics of contract law are very similar in many western legal systems. In all cases, if you are seeking to rely on opinions of the law applicable to you, please consult with your local legal professional. I found this a useful way of understanding the commit point – I hope you do as well.
One of the key reasons for understanding this is the importance of the commit point to the operation of a Kanban system. This is the point from which we measure the lead time and start to make improvements. It’s the point at which the customer and service provider are, or should be, of a common mind in terms of what service the customer is getting and when they’ll get their output. In contract law this is referred to as a “meeting of the minds”. This sounds simple enough, but as anyone who has been practicing service delivery will know, real life situations are not necessarily simple.
There are some basic concepts in contract law that may assist Kanban service providers to better understand their customers and ultimately provide an improved service.
Offer & Acceptance
Two of the key aspects of contract law are offer and acceptance. The third is consideration, but for the purpose of understanding the Kanban commit point this is not necessarily important. For the purpose of this conversation, where consideration is mentioned take it to mean simply this: for the customer, the consideration is the price paid and for the service provider, the consideration is the service. Just as in contract law, I would argue that the Kanban commit point requires both offer and acceptance to be in place in order for the commitment to be made. There is often a lot of confusion because people don’t actively listen for acceptance and assume that because something is on a backlog it’s committed (see also Use Options Instead of Backlogs).
An offer is where one party (it may be the customer or the service provider in a Kanban context) communicates the basic terms to the other. For example:
Customer offer – I’ll give you $100 if you can deliver me a widget by the first Monday of next month.
Service Provider offer – Advertises that they can deliver anyone a widget of specification AxB for $150 within 10 business days of you placing an order on their website.
Note that at this stage, there is no actual acceptance in either scenario. In the Kanban world this then remains an Option for the person receiving the offer.
Additionally, the offer needs to be communicated to the other party – in terms of the Service Provider offer this could be direct to a particular customer, to a class of customers or even the world at large.
Acceptance is where the party receiving the offer clearly communicates that they agree to it (verbally, in writing or through conduct). To continue the examples above:
Service Provider acceptance – Yes, we’ll get this for you by that date.
Customer Acceptance – Logs onto the Service Provider’s website, fills in an order form and clicks submit.
The form of the communication can vary from something as simple as a nod of the head to a formal written contract with all the detailed terms and conditions. This is a key thing to examine in your Kanban system – are your customers getting a clear signal of the acceptance? Are they even aware that you have committed and do they properly understand what this means? Are they mistaking a “receipt of the offer” for commitment?
In many circumstances, a party may come back with a counter offer, which similarly must be accepted before the contract is formed / Kanban commit point is reached. To continue the original examples:
Service Provider Counter Offer – If you give me $150, I’ll deliver it 3 business days before that.
Customer Counter Offer – I want a widget with specification AxC, I’ll give you $200 if you can deliver it in the 10 days.
Of course, at that point you may get counter-counter offers and it may go back and forth until an agreement is finally made. For example, in Scenario 2 above, the Service Provider might respond by saying, “I’ll need to retool to provide that specification, therefore I can’t get it to you in 10 days, but I can in 15”. In Scenario 1 above, there may be indications that a different class of service is involved – this may become a fixed date class of service. This is where misunderstanding can occur – it’s worthwhile summarising what the work item / outcome being agreed to is before committing it to your Kanban system.
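The back and forth of offer, counter offer and acceptance can be sketched as a small state machine. The Python below is purely illustrative, with hypothetical class and method names: a work item only counts as committed once the party that did not make the latest offer accepts it.

```python
from dataclasses import dataclass, field


@dataclass
class Negotiation:
    """Tracks the offers for one work item until one is accepted."""
    history: list = field(default_factory=list)  # (party, terms) pairs
    committed: bool = False

    def offer(self, party: str, terms: str) -> None:
        """Make an offer or counter offer; it replaces the open offer."""
        if self.committed:
            raise ValueError("already committed; renegotiate explicitly")
        self.history.append((party, terms))

    def accept(self, party: str) -> str:
        """Accept the open offer; only the other party can do this."""
        if not self.history:
            raise ValueError("nothing has been offered yet")
        offered_by, terms = self.history[-1]
        if party == offered_by:
            raise ValueError("a party cannot accept its own offer")
        self.committed = True
        return terms


deal = Negotiation()
deal.offer("customer", "$100 for a widget by the first Monday of next month")
deal.offer("provider", "$150, delivered 3 business days earlier")
print(deal.committed)  # False: still an option, not a commitment
agreed_terms = deal.accept("customer")
print(deal.committed)  # True: the commit point has been reached
```

Returning the accepted terms from `accept` reflects the advice above: summarise what is being agreed to at the moment of commitment.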
Keeping it small
A problem that commonly occurs in practice, which Kanban systems try to control, is batch size. However, contracts are often written on the basis that it’s simpler to go through the negotiation process only once and then wait for the end outcome. That is, the contract formation process is often based on large batches. Instead, you might want to consider the shape of the offer when creating contracts. This will allow your Kanban system to keep its options open for longer and preserve flow. Of course, you may lose some of the certainty that large contracts bring to a business – but with some additional negotiation as a Service Provider you should find a way around that. Indeed, the regular reliable delivery that often results from a Kanban implementation will often negate the need for larger batches.
These are further offer / counter offer examples that demonstrate keeping batch size small:
Customer Offer – I need 1000 widgets at $100 each, but I’d like them in 5 months.
Service Provider – We can provide a maximum of 300 widgets per month. We’ll accept orders for 300 or fewer widgets per month at $100 each. Please notify our sales staff 1 week prior to the beginning of the month as to how many you need and we’ll deliver these by the end of the month.
In this scenario, the Customer tells the Service Provider that they’re happy with that arrangement because they get some of their widgets early. They then place orders on the first, second and third month of 300 and for 100 on the fourth month.
Customer enquiry – I need 10 widgets of specification AxB, 10 of AxC and 20 of AxD, what can you provide?
Service Provider offer – we can produce a maximum of 12 widgets every 10 days. The price for AxC specification items is $200 and the price of AxD is $250. Our online order form will prevent more orders coming in for that 10 day period once we have reached our maximum and will accept an order for the next period up to 10 business days prior to the period commencing.
Customer response – No one else can make type AxC & AxD, plus your AxB widgets are of higher quality than your competitor, so we’ll go ahead with that. Expect to see our first order shortly. They put in an order for 10 AxB and 2 AxC in the first period, 8 AxC and 4 AxD in the second period, 12 AxD in the third period and 4 AxD in the fourth.
In both the above cases, there are multiple contracts at play. Each order is a separate contract, under the terms negotiated earlier. The first part of the negotiation is around the terms of contracts, as each order is placed a separate commitment is made. Until the order is placed and accepted, these are options for both the customer and service provider.
In Scenario 1, there are flow control mechanisms being put in place. A maximum limit is described as well as the timing.
In Scenario 2, the order form is shut down, preventing the acceptance from being communicated – this is baked into the system of work, ensuring the service provider doesn’t commit to items outside of its capacity.
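The order form in Scenario 2 is effectively a WiP limit on commitments. A minimal Python sketch of the idea follows; the class and method names are hypothetical, and the capacity of 12 comes from the example above.

```python
class OrderPeriod:
    """One delivery period with a fixed capacity for accepted orders."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.accepted = {}  # specification -> quantity committed

    def remaining(self) -> int:
        return self.capacity - sum(self.accepted.values())

    def place_order(self, spec: str, quantity: int) -> bool:
        """Accept the order only if it fits; otherwise it stays an option."""
        if quantity <= 0 or quantity > self.remaining():
            return False  # the form refuses, so no commitment is made
        self.accepted[spec] = self.accepted.get(spec, 0) + quantity
        return True


first_period = OrderPeriod(capacity=12)
print(first_period.place_order("AxB", 10))  # True
print(first_period.place_order("AxC", 2))   # True
print(first_period.place_order("AxD", 1))   # False: the period is full
```

A refused order has to be placed again in a later period, which is what the customer does in the example by spreading the remaining widgets over the second, third and fourth periods.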
A possible drawback of this approach is that it may no longer be fit for purpose for the customer to break things down into smaller pieces of work. If so, you should be prepared for your kanban system to have larger batches and the associated variability that comes with them. Often, the variability will result in higher costs to cover the inherent risk associated with it. You may wish to describe / quote these as different terms to your customer to highlight / make visible the risk of large batches. However, there may be reasons such as competitive forces at work in your context that prevent this, so of course you should also pay attention to these. This does, however, raise the question of whether this is the right type of option to pursue for your business.
Upstream discovery work is a different type of negotiation than normal service delivery. This is about exploring options and assessing risk. The nature of this kind of work and outcome is different, as are the parties, and as such you should consider a different kind of agreement here – generally speaking, you’re “buying information” to assist with decision making.
If you’re part of the same company, often it’s for providing leaders the information they need to make the key decisions around the strategy for their company, as is often the case with product development. In the context of providing services to external clients and in the case of specialised or “one off” products, often understanding what the parties want is an essential first step. An “Idea” is often not enough to make a reliable commitment to deliver by – at least, the variance for such things is often so large as to provide no solid guidance as to whether a commitment can be fulfilled.
Possible solutions to this problem are to agree to a different kind of work – it might be based on time and materials or constrained through timeboxes, number of experiments or other means. In the same way as the downstream kanban systems described above, it should be clear prior to pulling the Upstream Idea for exploration that the requestor and service provider both understand that this is the necessary first step in understanding what is to be delivered. The terms of this can also stipulate the framework for delivery later on, but should keep options open for both parties – after all, the idea may prove to be invalid for any number of reasons. Alternatively, the terms may leave this open for later agreement once the details of the Idea are known.
Service Delivery teams should not pull Options through the upstream discovery process without this in mind – that is, Ideas are options and should not be committed to automatically. It may also be the case that downstream systems will receive work items to explore this upstream process – you may need to ensure you have capacity to give the upstream process to ensure overall flow.
Kanban has cadences set up for dealing with this kind of thing. Typically, the “Replenishment” meeting is where work items will be committed to. Thus, prior to having these meetings, it’s often useful to understand the details above so that this meeting is run relatively quickly with a pre-understood way of working together (essentially, the “terms”). The key parts will become “system policies” – such as “ready to commit”. For upstream, the commitment would be “to explore option A”, whereas for downstream it’s to deliver the final service. During these sessions, we usually move the ticket representing the work over the commit point – a signal to all involved that the contract / agreement has been formed.
There are some similarities between Kanban commit points and contract law – I hope you find them useful. Indeed, gaining a clearer understanding of the request will assist in providing services that are fit for your customer’s purpose. This “meeting of the minds” is important and may be something you’re overlooking in your Kanban system. Accepting items of work onto your system of work without understanding the nature of the agreement may well expose your business & kanban system to potentially avoidable risk. Make commitment policies explicit and well understood and build them into your process for a better outcome for all involved.
The use of explicit policies will help you to scale. In the same way that protocols help technology to scale, such as the way the internet has been designed, explicit policies in Kanban systems will help you scale your services. Using explicit policies will not only improve your managers’ ability to manage their services, it will also benefit the teams involved by allowing them to further “self organise” to get the work done, guided by the policies.
When I refer to an explicit policy, it means a rule or protocol that determines / describes how the system of work operates. Explicit policies are known by all and consistently followed by those operating within the system. They’re effectively the “rules” of the system of work.
When scaling, I tend to think of the internet – here’s something that has scaled to the extreme with many people around the world now using it as part of their daily lives. So long as you follow the protocols defined by the system, you can become a part of it quite easily and start to interact with the other parts of the system.
Examples of policies are things like definitions of ready and done, WiP limits, the order in which items are pulled through the system and many others that may be customised to your work system’s needs. Making them clear and understood by all allows teams to “get on with” the work without having to have detailed discussions at various potential touchpoints. The key here is making sure that there actually is a common understanding of those simple rules and practices and that they are adhered to by all. To assist with that, keep them simple – don’t make them massive “pre-flight” checklists that are too detailed to understand. Put them on the board or in other places where the team will see and refer to them during their daily work – allow team members to see them and do a “spot check” when they’re moving a card. Having them hidden away in a large document that is rarely used, or only used for “audit purposes”, won’t be sufficient.
Another key part to it is that these rules should never be 100% static – they should be revised. Through the kanban feedback loops, you’ll discover improvements and you may even iterate through STATIK to find new or updated policies that need to be applied. As these mature, you’ll see more of them apply to situations involving scaling.
When scaling kanban you’ll start to move to different levels of abstraction for services across the organisation. At these different levels, you might discover new and different policies and making them explicit helps flow at that level. Furthermore, you might start to scale out – understanding how teams interact by coming up with “rules of engagement” (policies) that describe how teams collaborate to achieve a collective service goal.
Managers no longer need to direct teams in how they work, but they work to make the policies and interactions within the system explicit and understood so that the teams can self organise around the work. Being explicit on your policies when scaling up, across and down the organisation will help make it easier for all of the groups to interact together.
There are 3 key feedback loops in Kanban – the board, the cadences and the data. Each of these feedback loops plays its part in enabling and improving flow. What you might not have noticed right away is that these three feedback loops support each other and without one of them, or with a lower maturity version of any of them, you’re likely not getting the most out of your kanban system. The following outlines how they support and affect each other and how you can use and improve them to enable better outcomes.
Here’s an overview of how the 3 relate to each other:
Boards and Cadences
These two really support each other. For example, during the Kanban meeting we often walk the board and in the process of doing so we may make some updates to items. This is also the case with the Replenishment meeting – you’re moving items over the commit point as a part of this. In that way, the cadences are supporting the boards in terms of keeping them up to date and relevant. Imagine if you didn’t have a daily kanban meeting – how likely is your board to fall behind or even into disuse? Additionally, the boards are supporting the cadences because imagine having these meetings without the boards – you wouldn’t have all the latest information at hand to make those decisions.
In terms of the improvement cadences, often things like the service delivery review will look at the board and how it’s designed. You might even do a small STATIK increment and redesign the board with a new work item type or class of service that you’ve discovered. Thus the improvement cadences are keeping the structure of the board(s) up to date as well.
Boards and Data
This is relatively simple – essentially the data that we have from kanban systems is derived from the board(s). We capture data about the flow, blockages, work types, classes of service and any other data points that you may think relevant. For physical boards this data capture is often manual – you might at the end of the week round up all of the things that have been unblocked and record the details of the blockages for a subsequent blocker clustering activity. Alternatively, you might capture lead time data at the end of the week or even daily.
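Once the blockage details have been recorded, the clustering itself is a simple grouping exercise. Here is a minimal Python sketch, assuming each blocker has been recorded as a cause and the number of days the item was blocked; the causes and figures are made up.

```python
from collections import defaultdict


def cluster_blockers(blockers):
    """Group blockers by cause and return (cause, occurrences, total days)
    rows, with the cause that cost the most blocked days first."""
    totals = defaultdict(lambda: [0, 0])  # cause -> [occurrences, days]
    for cause, days_blocked in blockers:
        totals[cause][0] += 1
        totals[cause][1] += days_blocked
    rows = [(cause, count, days) for cause, (count, days) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical blockers collected from the board at the end of the week.
recorded = [
    ("Waiting on legal review", 4),
    ("Test environment down", 2),
    ("Waiting on legal review", 6),
    ("Test environment down", 1),
    ("Customer unavailable", 5),
]
for cause, count, days in cluster_blockers(recorded):
    print(f"{cause}: {count} blockers, {days} days blocked")
```

Sorting by total days rather than by occurrences puts the causes that hurt flow the most at the top of the list.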
For electronic systems, these are often built into the kanban tool itself. Purpose built kanban tools such as Kanbanize and SwiftKanban have these built in from the beginning. If you’re using something else such as Jira or Azure DevOps, you might want to consider something like Nave (pictured above) to get the metrics directly from your board.
Data and Cadences
The data really does help to support the cadences – particularly the improvement cadences. You really can’t do the Service Delivery Review effectively without data – a key part of it is looking at things like lead times to see if your service is truly fit for purpose and, if not, what actions you should take. Furthermore, after taking actions, you’ll go back to the data points to see if the experiment you tried worked, or if you need to roll forward, roll back or try something else entirely.
Using data is also useful in the delivery meetings as well. For example, if you know which work items have aged to an extent that puts their lead times at risk, you might want to focus on those and have others in the team help out to get them done.
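A simple way to find those items is to compare the age of each in-progress item with a threshold taken from your lead time data, for example the 85th percentile. The Python sketch below assumes you know the date each item crossed the commit point; the item keys, dates and the threshold of 10 days are hypothetical.

```python
from datetime import date


def at_risk_items(in_progress, today, threshold_days):
    """Return (key, age in days) for every item whose age has reached
    the threshold, oldest first."""
    aged = [
        (key, (today - committed_on).days)
        for key, committed_on in in_progress.items()
    ]
    risky = [item for item in aged if item[1] >= threshold_days]
    return sorted(risky, key=lambda item: item[1], reverse=True)


# Hypothetical in-progress items and the date each one was committed.
in_progress = {
    "MKT-7": date(2021, 3, 1),
    "MKT-9": date(2021, 3, 20),
    "MKT-12": date(2021, 3, 29),
}
print(at_risk_items(in_progress, date(2021, 3, 31), threshold_days=10))
# [('MKT-7', 30), ('MKT-9', 11)]
```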
The kanban feedback loops work together and help support each other. You’ll find that as you refine one, you also add to the maturity of the others. For example, modifying the board will lead to new data which you’ll then review in your cadences. All of these serve to continually enable the delivery and improvement of the services that you offer your customers. How are your feedback loops serving your customers? If you want to learn more about feedback loops, please come along to my Kanban Systems Improvement course.
Understanding the difference between work item types and classes of service can sometimes be confusing to those who are new to Kanban. These are two different concepts and whilst there is often a 1:1 mapping between a work item type and class of service, they are distinguishable concepts. Here’s a reminder if you’re getting confused between the two which will help you out in future STATIK exercises.
Work Item types
The work item type is closely related to the request that’s coming from the customer. Customers can often have different types of requests / different types of services / work required. These requests can go through different types of workflow.
This is the key to work item types – the customer request / service needed and the process we go through to provide the service.
For example, if you were to take a coffee shop, a customer could request a take away coffee or they could order a coffee to have in the coffee shop. These two are conceivably different work item types – in the first example the customer will expect to get the coffee in a disposable cup that they can take away. In the second example, the coffee will come in a reusable mug and a staff member will bring it out to the table at which the customer has been seated. Notice that the two have different workflows. Also, in either case there might be different expectations as to the level of service. For take away, you might want to get it as soon as possible, whereas for dine in service, you might be happy to wait a little longer as you’ve likely got someone there with you that you’re having a conversation with.
Classes of Service
Classes of service represent different policies on how to deal with a customer request / work. For example, for an urgent or expedited item the policies may be:
Stop what you’re doing and focus on this task
Swarm on it (use as many people as required)
Don’t wait in queues
Release when ready
Policies for standard items might be:
Only pull this in when capacity is available
First in, first out (FIFO)
Note how the policies describe how the team is supposed to treat the request.
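To make the idea of policies concrete, the expedite and standard policies above could be sketched as a simple pull rule. This is only an illustration: the `Request` class, the class-of-service labels and the function name are assumptions for the example, and a real team would encode its own policies.

```python
from dataclasses import dataclass


@dataclass
class Request:
    key: str
    class_of_service: str  # "expedite" or "standard" (illustrative labels)
    arrival_order: int     # lower means it arrived earlier


def next_to_pull(queue, wip, wip_limit):
    """Pick the next request to start, or None if nothing should be pulled."""
    expedited = [r for r in queue if r.class_of_service == "expedite"]
    if expedited:
        # Expedite policy: don't wait in queues, even if the WiP limit is reached.
        return min(expedited, key=lambda r: r.arrival_order)
    if wip >= wip_limit:
        # Standard policy: only pull when capacity is available.
        return None
    standard = [r for r in queue if r.class_of_service == "standard"]
    # Standard policy: first in, first out.
    return min(standard, key=lambda r: r.arrival_order, default=None)


queue = [
    Request("S-1", "standard", 1),
    Request("E-1", "expedite", 3),
    Request("S-2", "standard", 2),
]
print(next_to_pull(queue, wip=3, wip_limit=3).key)
```

Note that the work item type does not appear anywhere in this rule. The policy only looks at the class of service, which is exactly the distinction between the two concepts.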
Compare and Contrast
Sometimes a work item type will have a class of service to itself. Some work item types naturally have a higher priority than others. I think this is where some people can get confused, when there’s a 1:1 mapping between a work item type and a class of service. However, please remember to distinguish based on what was discussed above so that you can get a better understanding of how the system is operating.
Alternatively, some work item types will have different classes of service. There are situations where certain requests will have a different cost of delay compared to others. Thus, with those different levels of urgency the team will need to handle it in a different way.
Both types taken together can make up the service level expectations. Take the following examples:
Class of Service
Average lead time of 12 days, with 85% completed within 16 days.
Average lead time of 6 days, with 90% completed within 8 days.
Average lead time of 10 days, with 80% completed within 15 days.
Lead time expectations
As you can see from the above, they both have distinct elements, but they work together to help inform customers of service expectations.
Work item types and classes of service are two distinct concepts. They need to both be understood in order to form a kanban system. When designing your system, take the above into consideration and you should be able to walk through the STATIK steps more easily.
This is something that I’ve been thinking about over the last several years. After experiencing / running some of these events and having seen them played out with various clients I had an instinctive reaction as to the waste behind PI planning. I haven’t had the full language to express exactly the problems with this and how it fits into the overall organisational maturity and growth up until now. Recently, I attended an Okaloa flow simulation on multi-team flow and after stewing on the insights from that for a week or so, whilst planning an inception (I didn’t want to use the PI planning words), I realised a number of connections that I’d like to share with you.
Firstly, as if to contradict what I’ve just said above, even though PI planning is ultimately waste, it can be a useful stepping stone for chaotic / lower maturity agile implementations. I think this is similar to many things in SAFe – some of the practices are useful to begin with, the painful part is when they get hardened and become the norm rather than looking to rise above and find better ways to work. Let’s consider some of the useful points of PI Planning first:
It gets alignment (across leadership and teams)
It gets the teams to think about their capacity
It can help bootstrap the upstream (described below)
This last point is worth expanding a little. Some teams haven’t yet developed Upstream Kanban or seen it emerge. The PI Planning event can actually help bootstrap this if it isn’t there. However, many agilists (yes, even those who consider themselves coaches!) don’t have knowledge of Upstream Kanban, so they miss the subtleties that occur upstream and how it is different to delivery. What is important to understand is that work flows through your overall system – some aspects prepare work for delivery and discard poor options, whilst others build / deliver it. Where flow doesn’t exist, the PI planning event (being large and expensive) creates an impetus to make sure “features” are prepared enough to run through during the event. Often what happens is that during PI planning teams commit to features to deliver, but also determine which features they want to “shape up” for delivery at the next event. This is one of the downsides if you’re not careful – it locks you into a 6 month lead time for features (is this agile!?!). But importantly, teams are concurrently thinking about getting items ready whilst they’re delivering other ones.
Where PI Planning comes unstuck
One of the key places where this comes unstuck is that the event is usually held every three months. That’s problematic because teams align and commit to a number of work items only once every three months. The result is that you essentially only get 4 of these per year. Is 4 points per year to plan / change course agile? This is really a large batch transfer that gets underway and it undermines the continuous flow that was starting to emerge through the bootstrapping of upstream mentioned above. Agility comes from fast feedback, and having only 4 pivot points per year where you can adjust the work based on feedback is the opposite of what I believe the agile manifesto authors were trying to achieve.
Another example of where I have seen waste with PI planning is when an assumption made during planning turns out to be false. I’ve seen this happen within 2-4 weeks of the PI planning exercise, and the teams in those instances dove back into a PI planning process. There’s a lot of cost involved in getting these PI planning events up and running – when a few lessons learned down the track cause the whole process to come unstuck, I’ve got to question whether it is the right process.
Whilst working within a “train” is part of the PI Planning, if you have to synchronise work outside of your train then more coordination is needed. I think this comes down to the mechanistic nature of SAFe itself that denies the complex and somewhat organic nature of the organisational network. Although you may be able to predict the nature of interdependencies and organise around that, there’s always going to be new demand that will create new connections in the organisation that you have to deal with. Now we’re talking about pre and post PI planning events to ensure items are synchronised across streams. Again, these are often large events with a multitude of people involved. A couple of problems with that are that, once again, plans can change and you need an intervening process to keep things aligned once underway. Is there a way to achieve the outcome without the large cost of these events?
One key theme from all of this is that we’re committing too early. Whatever happened to the lean thinking that talks about committing at the “last responsible moment”? Now I’ve heard arguments that PI planning isn’t about “committing” but about planning and that plan will change. However, I don’t think that’s in practice what most people take away from it. Particularly so when teams do the “fist of five” – it seems like they’re “committing to the plan” (although the usefulness and safety of this technique is questionable). Alternatively, this perhaps could be argued to be a reversible commitment – that things later in the planning can change. But what’s not readily apparent is the abort costs and other potential waste of this behaviour.
If we were instead to defer commitment until we have capacity to be able to fulfil the order / option, then we have greater flexibility to deal with unexpected events with greater clarity for all. Oftentimes, we see urgent work enter the system after PI planning – this causes other work to either be paused, postponed or even aborted. All that time we spent planning and now we have to replan for this new item. We also have to reset expectations about when those other items will get delivered. Perhaps we shouldn’t do such detailed planning for items down the road, perhaps there’s another way to schedule / commit to work.
Replace “Plan” with “Forecast”
I think that the word “Plan” inherently has issues with it. When people hear the word “plan” they often associate it with a commitment. This confuses the expectations around the event and misunderstandings will arise. Instead of using “Plan”, use the word “Forecast” and give a likelihood of something being done. For example, say something along the lines of “on average our features take 65 days to deliver, and we can deliver 90% of them within 110 days”. Note the difference between the average and the 90th percentile – it allows for the potential that an urgent item might slip into the flow and delay it – and there’s also a chance that it can take longer. This also requires teams to capture some basic flow metrics – again something else that might need to be bootstrapped through the process before you can rise to this next level. However, this gives stakeholders a much clearer understanding of when something will actually come their way.
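As a rough sketch of how such a forecast statement could be derived from basic flow metrics, the Python below computes an average and a 90th percentile from a list of completed lead times. The lead times are made up for illustration, and the nearest-rank percentile is just one simple method I have assumed here; your analytics tool may calculate percentiles differently.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of
    the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def forecast(lead_times_days, pct=90):
    """Phrase the lead time history as a forecast rather than a plan."""
    avg = round(mean(lead_times_days))
    upper = percentile(lead_times_days, pct)
    return (f"On average our features take {avg} days to deliver, "
            f"and we can deliver {pct}% of them within {upper} days.")


# Made-up lead times (in days) for ten completed features.
history = [25, 38, 44, 50, 55, 58, 62, 68, 110, 140]
print(forecast(history))
```

Notice that one feature in the sample took longer than the 90th percentile. That is the remaining chance the forecast is honest about, and it is why the statement is a likelihood rather than a commitment.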
Once the upstream has been bootstrapped, you can start to move away from the “stop/start” nature that tends to occur with PI Planning. This tends to be more of a batch transfer than flow and will start to impact the organisation’s real agility. Start to look at your flow and impose some WiP limits around the parts of the process – batch transfers happen when there are no WiP limits around those parts. Doing so will create a more sustainable pace for your teams and allow you to defer commitment until you need to make those decisions.
As your deployment practices mature, you’ll find that you’ll be able to start to deploy on demand. It begs the question why you can’t replenish on demand in the same way. Why would you want to wait for the 3 monthly cycle to pull new work in? The answer is that you probably shouldn’t – the problem is that the transaction costs of replenishment via PI Planning are now too high, so it’s probably best you avoid those costs and find a better way.
Continuous improvement inherently built in
One of the other problems with the PI planning cycle is the “Innovation and Planning Iteration”. This totally goes against the ideas of flow and continuous improvement. Using the last iteration of the PI for this is inherently bad. What tends to happen in practice most times is that this gets left out when initiatives run into risk / dark matter expansion and teams have to fill it with work anyway. Also, leaving all innovation to one single point in time seems contrary to how innovation actually works.
Instead, plan in time along the way to do the preparation of new items and to include innovation as a matter of course – part of the weekly workload. At the team level, you can allocate capacity to intangible class of service items on the board to cover off innovation items as they’re needed (usually with its own card colour or swimlane).
Tokens guide capacity
An alternative that was made clear to me through Okaloa flowlab simulations was that you can use tokens to guide capacity and commitment. Teams understand how many features they can build concurrently and offer that many tokens. Let’s say it’s 4 concurrent features. They work on 4 features and when one is done, they have a free token. This token is then available for a new commitment. When a new feature is ready that requires the token (usually it needs tokens from a few teams), that token can be allocated to the feature and it can be started (once tokens from all teams required are available).
This is a simple solution that allows deferred commitment and reduces the transaction costs of PI planning, and it will ultimately help the agility of the organisation. Oftentimes, teams in SAFe style arrangements don’t have a very good understanding of their capacity – or they have to do lots of estimation and need other details to understand it. The tokens provide a simple way to understand capacity without the need to go into detailed estimation.
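The token mechanism described above can be sketched in a few lines of Python. This is my own minimal illustration rather than anything taken from the Okaloa simulation itself; the `TokenPool` class and the team and feature names are hypothetical.

```python
class TokenPool:
    """Tracks how many features each team can work on concurrently."""

    def __init__(self, tokens_per_team):
        self.free = dict(tokens_per_team)  # team -> tokens currently available
        self.in_progress = {}              # feature -> teams holding a token for it

    def try_start(self, feature, teams):
        """Commit to a feature only if every required team has a free token."""
        if any(self.free.get(team, 0) < 1 for team in teams):
            return False  # defer commitment until capacity frees up
        for team in teams:
            self.free[team] -= 1
        self.in_progress[feature] = list(teams)
        return True

    def finish(self, feature):
        """Finishing a feature returns its tokens to the teams involved."""
        for team in self.in_progress.pop(feature):
            self.free[team] += 1


# Hypothetical teams: one offers a single token, the other offers two.
pool = TokenPool({"payments": 1, "mobile": 2})
print(pool.try_start("feature-1", ["payments", "mobile"]))  # both teams have a token
print(pool.try_start("feature-2", ["payments"]))            # payments has none left
pool.finish("feature-1")
print(pool.try_start("feature-2", ["payments"]))            # token is free again
```

No estimation is involved: the only question asked at commitment time is whether every team the feature needs has a free token.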
PI Planning builds in large transaction costs and creates batch transfers which are waste. Both of these effects can impact your organisation’s agility in the long term. These events can be useful in the short term to provide alignment and help bootstrap flow. You should avoid hardening in such costs and anchoring your agility and instead take it to the next level by concentrating on how to enable real flow in your organisation. There are other alternatives and you should look at how they can be implemented to help your people, your customers and your organisation’s strategic objectives.
In my early days in agile, I remember statements from folks like Martin Fowler who would talk about continuous integration and his approach that “if something is hard to do, do it more often”. Through this, continuous integration was developed by folks who leaned into a stressor on the system of work. Later, the same kind of idea proved to be the foundation for other advances like continuous deployment / delivery.
Stressors on the system of work often appear – you may not notice them at first for what they are, but they’re often an opportunity to improve the system of work. Indeed in these examples, by leaning into the stressor teams have been able to unlock new opportunities previously unheard of. You can choose to listen to the stressors in the system of work or avoid them – the maturity of your system of work will reflect your group’s ability to deal with stress on the system.
Another example of a stressor that I often see on systems of work is WiP limits. Teams that are new to this often need some time to adjust. It can be hard not to follow old habits like continually pulling new work into the system without first clearing the items that are in progress. Often systems are set up to reward / recognise those who are “busy”, but not necessarily those who have created a sustainable, reliable flow. In particular, I’ve seen teams go over their WiP limits and have had to work with them to reinforce the need to stay within them. Those that can manage to lean into the stress often see the benefits of the systemic improvements and can move onto next level problems like ensuring they really are fit for their customers’ purpose.
There are many other examples, such as “backflow” (items that are rejected, move backwards in the flow), other quality issues, discovering and dealing with fat tailed lead times, cultural issues such as heroism and rockstars as well as dealing with things that are blocking flow. All of these can create stress on the system of work and often the people in it.
In all of these cases it’s important to be able to detect the stressor. Often visualisation systems help with this, but what’s also important is the supporting cadences and feedback loops. Equally important is to have an inspection mechanism / point where you can more deeply understand what is happening to the system and why, so that you can take a more logical, reasoned response to the causes. Many people respond to these stressful situations emotionally as they too are under stress, but if you focus on the system of work you can help relieve the people in it.
I would urge you to have courage when facing into these stressors. It’s not going to be easy, but if you handle it correctly the system will ultimately improve, relieving stress on the system of work and the people within it. Remember our Continuous Integration example above – it would have taken quite some work to go from no CI to a working CI practice, but the benefits are clearly there. Sometimes it will be difficult and require patience, but don’t back away from it because that will only hide or exacerbate the problem.
Leaning into the stressor in a system of work is difficult, but often very beneficial. You will uncover better ways of working through doing it and improve your ability to service your customers and your market. To do so effectively you’ll need to set yourself up to be able to start: be able to see / visualise the problems, have a feedback mechanism to understand causes and catalyse action, and keep a little slack in your system so that you can allocate capacity to create the improvement. The good news is, you can start putting these things in place today!
This is a format of retrospective that I first started using around 2013 and still use today to help teams understand their work and the system of work in more detail. It’s particularly useful for teams starting out with Kanban and can be used as not just a reflection point for the team, but one for coaching and learning. It’s fairly simple and doesn’t need a lot of preparation – you can just go with this in your next retrospective.
It’s called the campfire retrospective because essentially the team gather around the board like a campfire and tell stories. You look at the board and prompt with questions like “what’s going on here” and let the team tell the story. I also often ensure that key data such as lead time distribution / scatterplots are to the side of the board so we can discuss this in relation to the stories on the board. After telling a few stories and understanding some more of what’s going on, you can start to adjust your board right there in the retrospective. Here are some things you can look out for / prompt for:
Does the workflow look correct or should we adjust it?
How are the WiP limits going – do we need to adjust?
How’s collaboration – are people helping each other to achieve better flow?
Are any items blocked and what’s causing it?
Are we seeing some key repeating dependencies?
Are there any new work item types or classes of service emerging?
Are the policies working or do we need to update them?
What is the data telling you?
Are recent changes working as you expected or do we need to adjust again?
What are the customer expectations and interactions around the tickets?
There are so many things that you can look at and talk about around the Kanban campfire. Importantly, once you discuss these issues, you should look to make improvements. Some you can make straight away – for example to adjust the policy just do it there and then on the board. Others you might need to add a task to the board to track a larger improvement body of work.
This is a simple form of retrospective that often provides a great way for the team to collaborate and improve. Often they may start this with improving the system to help the team, but over time, you’ll find you can shift this to help them focus more on the customer outcomes with their Kanban system. The other thing that’s great about this is that you don’t need permission to get started – you can just schedule this for your next retrospective and watch the team take ownership and leadership. Have you tried the Kanban campfire retrospective yet?
This is a question that I often ask people when they’re having trouble adapting to Work in Process (WiP) limits. WiP limits can be challenging to some when they’re new – all those competing demands telling you that you need to do more. How do you respond? By pulling more work into the system in the hope that busyness will equal better performance. It can be hard to make the switch to say “not yet“.
Often the decline into high WiP will occur over time – and it will be so gradual that you don’t notice it happening. Until you’re snowed under – there are so many things on the go that you spend your entire day just trying to keep all the balls in the air – and you don’t actually make much, if any, forward progress. There are so many things on the go that you don’t have time to consider how to fix the problem – and any attempt to improve the system is met with a negative, sometimes aggressive response because you’re too busy to even think about it.
People can react this way when they’re under stress – from too much work in the system. People don’t want to take on a WiP limit and don’t understand how it can help them and they often think that it’s OK to task switch a lot. It’s not OK – especially in knowledge work. To do knowledge work effectively you need time to think through deeper problems and try and solve them in ways that are intensive on the “necktop computer”. Task switching takes you away from this ability and is part of the source of the frustration.
Which is where my question comes in – if you think a little bit of task switching is OK, then I take it to the extreme. “Is it ok to work on 1000 things at once?”. To which, the inevitable answer is “No” (I haven’t heard anyone say Yes to this yet). To which I respond, “Good, so you agree there needs to be a WiP limit. Now let’s talk about where that limit should be for your context”.
Getting started with WiP limits can be difficult – sometimes you need to take it to the extreme to help people understand what’s really going on. If you find yourself bogged down with too much work, don’t use hope as a method. Limiting your WiP will help you focus on what’s important to meet commitments and achieve balance. What are you waiting for?
This question comes up quite a lot – particularly with those who are new to Kanban. I was reminded of this by Alexei Zheglov recently in one of his tweets and I recall some recent conversations on this topic. The way you handle this kind of situation reflects on the maturity of your Kanban implementation. On lower maturity systems, you will see cards moving backwards quite a lot. However, with more mature implementations this should rarely occur, if at all.
It comes down to whether your board is representing a workflow with handoffs and backflow for problems or if you’re modelling the knowledge discovery process. If you’re modelling your basic process or purely specialities of work you will see things move back and forth as people create those handoffs. In this kind of board you’re describing the specialities – so although it may be a useful transition point you run the risk of hardening those distinctions, rather than focusing on the overall flow. You may use it briefly as a transition point – to demonstrate the problems, but please use it as a catalyst for further action and improvement rather than sticking with it, otherwise you’ll continue to face problems.
When we teach Kanban, we recommend that your board visualises the knowledge discovery process. What that means is on the left we have little knowledge of the item and as it moves through the flow we gather more knowledge and information about the problem we’re facing, its implementation / rollout to customers. It also means that the titles on the board describe the predominant knowledge discovery activity – it doesn’t exclude other types of activities. Take this simple board:
Just because the title says analysis, it doesn’t mean no development or testing can occur. For example perhaps as part of the analysis you might need a developer to do a quick spike to see if something will work in a particular way. Additionally, the development column doesn’t mean no analysis can occur – for example the developer may analyse the various data scenarios in order to create the appropriate set of unit tests to handle the scenarios. Testing doesn’t mean development activities can’t occur – maybe you’re developing the automated test suite here or fixing bugs.
This brings me to my next point – how to handle these situations. The testing scenario where we find a bug is one that is quite common. In that case we often leave the original card in testing and create a new card for the bug which also traverses the flow. Once the bug reaches testing and is verified as fixed the original card can continue on its way. Another way to deal with this is using blocking stickies. I discussed this in one of my earlier posts on blockers.
If you don’t treat this as a knowledge discovery process and instead move cards back and forth in the flow you’ll hit a number of issues that will make it very difficult – if not impossible, to manage flow effectively:
Your metrics are going to be out – if you look at your CFD charts they’ll do all sorts of strange things such as lines going down (that’s usually a sign you messed something up).
You’ll break your WiP limits – these are there to help you manage the flow based on capacity. Items moving back and forth will constantly be breaking WiP limits and you’ll find it very difficult to manage flow as you have no control over the movement back and forth
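To see why backflow makes the lines on a cumulative flow diagram (CFD) go down, consider this small Python sketch. It assumes a simple CFD calculation where each stage counts the cards that have reached that stage or any later one; the stage names and card keys are made up, and real tools may compute their charts differently.

```python
STAGES = ["Analysis", "Development", "Testing", "Done"]


def cfd_row(board):
    """For each stage, count the cards that have reached that stage or any later one."""
    position = {stage: i for i, stage in enumerate(STAGES)}
    return {
        stage: sum(1 for card_stage in board.values() if position[card_stage] >= i)
        for i, stage in enumerate(STAGES)
    }


day_1 = {"card-1": "Testing", "card-2": "Development", "card-3": "Analysis"}
# On day 2 a bug is found and card-1 is moved back from Testing to Development.
day_2 = {"card-1": "Development", "card-2": "Development", "card-3": "Analysis"}

print(cfd_row(day_1))
print(cfd_row(day_2))
```

Between the two days the cumulative count for Testing drops, which is the downward line on the chart. Leaving card-1 in Testing and raising a separate bug card keeps every line flat or rising.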
Furthermore, if you’re modelling the knowledge discovery process it makes no sense to move something backwards. Cards upstream denote that you have less knowledge about them. The trigger for moving something backwards is usually that you’ve discovered something about it – some kind of problem. It makes no sense to move it to a place on the board which denotes you have less knowledge – quite the opposite: the discovery shows that you now have more knowledge about this ticket than you did yesterday, so it should not move backwards. You may be better off examining your policies to make sure items flow cleanly without the backflow.
Moving cards backwards on a kanban board doesn’t make sense unless you’re using it as a catalyst to bring about improvement. There are better alternatives available that take into consideration the whole of the system. You’re better off going to a place of understanding the knowledge discovery process so that you can manage the flow effectively.
Many organisations don’t recognise the difference between upstream and downstream. Some do recognise, but don’t consciously think about the nature of the difference between the two and consequently don’t design their systems of work in the most effective way. I hope this article will help you recognise the difference between the two and secondly, how to handle each situation.
Both upstream and downstream involve knowledge work. I consider knowledge work anything that requires the use of the “necktop computer”. That is, it’s inherently human and requires skills and knowledge, but it’s also about knowledge discovery. The following diagram is a visual representation of flow moving through upstream and downstream.
Upstream is to the left of the “commit point” (the red line). You can see a few columns there describing a basic flow, plus a discarded area at the bottom. Different types of work may flow through at different rates, but usually the high value items will take longer to move through this flow.
Upstream is about discovering and selecting from your options. This is why it’s slightly different here – you can see the discarded area where we drop options that are not worthy of our time in delivery. We have other things like minimum WiP limits to help ensure that we don’t starve downstream.
Key to upstream is that you’re buying information. That is, you’ll need to make decisions thus you should understand what decisions you’re looking to make and get the appropriate information for those decisions. Doing so, you should usually find the cheapest and fastest ways to get information that is “good enough” for making decisions.
I also like to overlay this with the Cynefin framework. I quite often consider a lot of the work that occurs upstream to be in the complex domain. We often probe to find the information we need to make decisions. We may also apply some expertise / analysis (complicated domain) – for example to get the appropriate information to be considered “Ready” for downstream consumption, but the main high value work will usually involve some probes.
There are also linkages with upstream once an item is “Done”. There may be some validation that occurs once items are deployed which will feed new options into upstream. Ensuring that appropriate feedback loops and data capture are in place should also be considered when designing your system of work.
Downstream we’re looking to convert the options chosen into outcomes. This is quite different to upstream – whilst still knowledge work the goals are different and we need to adapt the system appropriately. I often also see / refer to this as “Delivery” as well.
This is why in downstream we consider things like efficiency and quality. Often downstream is quite costly – there may be a variety of skills and competencies here that aren’t cheap to buy so we want to make sure we’re getting the best “bang for our buck”. Here we’re looking to avoid blockages and bottlenecks and convert the option as quickly as possible, or at least to do so in a way that’s fit for the customer’s purpose.
As this conversion of options requires quite a lot of skill, often we see most of the work in the Complicated domain of Cynefin. That’s not to say that we won’t see elements of complexity or best practice, just that predominantly it will be complicated. The system of work that we create should be adjusted to reflect this nature. We see this in agile software development quite a lot, where teams might be working on stories that fulfil user / customer needs, but there might be other types in there such as “Spikes” which are a form of probe.
The quality aspect here downstream usually gets us to ask “are we building the thing right”. Conversely, upstream we are often asking the question “are we building the right thing”. Downstream we focus on making sure the customer’s need was met through the service we’re providing. We’re also looking more deeply at “how” we do it rather than “why” – upstream should have already answered the question of “why”.
As you can see, there are considerable differences between upstream and downstream. Just knowing the differences will help you create a better design for your system of work. Please consider both and make sure your system has been optimised in the right way at the right points to enable better outcomes.
If you want to learn more about upstream, we cover this off in more detail in the Kanban Systems Improvement course. I’d encourage you to do Kanban System Design first if you haven’t yet done so to understand more of the basics of how to create Kanban systems.
I was recently asked by the Adelaide Agile community to give a talk at their Scaling Agility meetup. I had a lot on my plate the last few months with adapting all of my training to the virtual world and running some TKP and KMP sessions, but I thought this is a topic that I’d like to address. I think that many organisations have been doing “Agile at scale” in the wrong way so I hope that the folks from Adelaide – and others around the world, got something out of this recent talk.
I’ll start with what I think folks have been doing wrong. Firstly, the whole “we need to reorganise first” is completely wrong. There are a number of organisations doing this to implement things like SAFe and “the Spotify model”. I think this is completely unreasonable because it puts people offside in the organisation from the start and guarantees that for about the next 12 months your organisation is going to be massively internally focused and lose track of customers and purpose. If I were a competitor and you announced you were going to go down this path I’d be smiling away in the background knowing that I’d have a distinct competitive advantage for the immediate future that I should capitalise on.
Also, when it comes to things like the “Spotify Model” – not even the folks at Spotify think this is a thing. If you think the benefit is there, then go right ahead, but you’re missing the point that each of these organisations is unique and has particular needs. What was going on at Spotify was based upon their culture, their domain, their market – so a variety of factors that you likely don’t align with. I’ve always believed that organisations should find their own path to agility that is unique to their context – sure there may be patterns that you can apply, but don’t do so blindly.
Which is really my key point – understand the problems you’re trying to solve, understand your customers and understand your purpose. There’s no such thing as perfect and you’ll need to continually adapt. Models like SAFe are more internally focused and don’t put these things at the forefront – it feels more like a solution looking for a problem. I’ve mentioned this before in posts like “Focus on core problems“.
Scaling with Kanban is different because you focus on problems facing your customers and your teams. You look to understand purpose – customer and organisation, and evolve towards it in a humane way. There are different ways you can look at scaling such as height, width and depth. These are the kinds of things I cover off in my Kanban Management Professional courses. If you really want to go deeper as well, I’d suggest checking out the Kanban Maturity Model which has gone through many evolutions and is now packed full of useful, pragmatic advice.
Many thanks to Jason Cameron for reaching out to me and asking me to do this talk. It might be somewhat different than what your group is used to, but I hope that everyone got a lot out of it.