A Year of Whiteboard Evolution


Back in December last year I started supporting Red Gate’s .NET Developer Tools Division. As of this month we’ve restructured the company, and from next week the old division will be no more (although the team are still in place in their new home).

When I joined the team things were going OK, but they had the potential to be so much more, so I paired up with Dom, their project manager, and we set to work.

The ANTS Performance Profiler 8.0 project was well under way already and the team had a basic Scrum-like process in place (without retrospectives), a simple whiteboard and a wall for sharing the “big picture”.

I spent the first week on the team simply getting to know everyone, how things worked and observing the board, the standups and the team activities.

We learned some time ago here at Red Gate that when you ask a team to talk you through their whiteboard, they tell the story of their overall process and how it works. Our whiteboards capture a huge amount about what we do and how we do it.

I attempted to document and capture at least some key parts of the journey we’ve had over the year, in which we released over a dozen large product updates across our whole suite of tools. This post is picture-heavy with quite limited narrative, but I hope you’ll enjoy the process voyeurism :) If there are any specifics you have questions about, please ask and I’ll expand.

Next time, I think I’ll get a fixed camera and take daily photos!

The end result? Multiple releases of all 5 of our .NET tools, including a startup and quality overhaul for our 2 most popular products, support for a bunch of new database platforms, full VS2013 support (before VS2013 was publicly released), Windows 8 and 8.1 compatibility and a huge boost for the morale of the team. See for yourself if you’re interested!

Of course this is just one aspect of what I’ve been up to. You might notice the time between photos over the summer grew a little. See my last post for more insights into what happens at Red Gate Towers.

Some Thoughts on Project “Closure”


I’m currently working with the team of function heads at Red Gate developing a “skills map” for all our development roles. (I’m primarily covering Project Management).

Having “bought in” skills inventories at previous companies and failed – with months of time and effort wasted – we’ve built our approach from the ground up, knowing our role definitions are pretty unique to us, and managed to get to “usable” for trials in less than 2 weeks! (We’ll be sharing what we did in the near future.)

Having mapped out the set of skills and directions at a high level, we’re in the process of taking each skill and defining what it actually means to us, why it’s important, what “good” looks like and where to turn for more info or help. Each of us on the team is trying a couple out for starters, and then we plan to crowd-source the rest.

I’ve decided to start with “Project Closure” as a skill that’s not too simplistic, is really important to us and is an area where, through most of my career, I’ve not often seen a “great job” done.

Before I progress, what I mean by Project Closure is everything required to get from an in-flight project that’s running happily along to “done” – the ramping-down. This is not what PMI, PRINCE2, APM or most other standard PM definitions define as “Closure”.

This disconnect is why I decided to write this post. Most PM resources talk about closure being sign-off, completion of project documentation and lessons-learned meetings or retrospectives (and a few other process and stakeholder type things).

Having done a bit of hunting, it turns out there’s very little information available on “ramp-down”. It’s no surprise projects overrun given the lack of support and resources that exist around this area! “Traditional” project closure happens once everything’s already been delivered.

Forgive me if I’m being a bit ranty here but surely that’s after the horse has bolted, right?

What about all that effort of putting all those loose project tentacles back in the box first?

An Octopus in a treasure chest

How do we get from an “in-flight” project to “done”?

My best analogy to this comes from the “String and Octopus Guide to Parenthood” that my first ever boss sent me a copy of over 15 years ago.

5. Dressing small children is not as easy as it seems: first buy an octopus and a string bag. Attempt to put the octopus into the string bag so that none of the arms hang out.

Time allowed for this – all morning.

Closing out projects well is hard, and it takes longer than you ever expect it to.

I’ve not finished writing up the full skill definition yet but I wanted to share where I am so far for anyone out there that wants at least a slight pointer in the right direction…

(Note, this is my definition based on the context I work in at the moment. It doesn’t conform to any certifiable standard so don’t scream if you answer your PMI exam questions on closure with stuff from this and get it wrong.)

A Short Definition

Project Closure (as a skill) is the ability to take a project from actively running to successful delivery and completion with a happy team, low defects, no panics, and in a timely fashion.

This usually means meeting a planned release date if one is set, but for some teams it can also mean getting to a point where we’re shipping production-quality software with completed, working features on a frequent, regular basis, and having the ability to move a team successfully off a project stream (ready to work on another) and close it down in a way that satisfies the team, customers, product management and the business if needed.

Why Is This Valuable?

Unless we ship valuable working software regularly we don’t keep customers happy and we find it harder to attract new customers. Working on products that don’t ship for long periods of time also has a pretty negative impact on morale for everyone involved – we love getting our stuff in the hands of customers and making them happy & successful!

We also risk building up large quantities of undelivered code or features (“inventory” waste in Lean terms), racking up undiscovered defects, increasing project risk and technical debt. The longer a project or feature is under active development, typically the longer it takes to shut down and ship.

Being able to ship and close down a project is vital to our ability to balance investment across projects and products, as it’s likely we’ll always have fewer teams than things we want to deliver and will need to be flexible around team size, availability and commitments.

What Does “Mastery” Look Like Here?

Someone handling project closure really well will have a good handle on the state of development in relation to the time left for a project to run and how long it’s been since we last shipped a successful release.

You’d probably expect them to be looking at what’s needed to close out the project from the point it’s about 2/3 complete. As an example – if you’re running a 3 month project, then most of the effort from the PM in the last month will be around closing things down.

They’ll be winding things down in a controlled manner with no surprises and no extreme rushes to the goal. Known bugs will be visible, regularly reviewed and going down; release testing will be planned, scheduled and resourced – chances are there will be a full-team release test towards the very end. Documentation will be well under control, and a UI freeze will be planned and completed. Features will be winding down or completed, and bug fixing will be moving from high-risk, breaking-type bug fixes to showstoppers, visible, cosmetic or quality issues and small functional gaps for tying off.

They’ll have worked with our marketing team or product manager to ensure marketing is ready to go. Our sales and support teams will be briefed on the new features, and we’ll probably have completed at least one bug hunt.

In the week or 2 before closure we’d expect to see a controlled code freeze, and whilst the PM might be having to negotiate and plan what’s coming next, they’ll still have a really tight rein on scope, be making decisions on any new incoming issues and be setting clear priorities.

They’ll be encouraging the team to support each other across roles and swarming on areas that need a real push whilst ensuring that we don’t lose sight of the original project goals, customers’ needs and product quality.

 

That’s it – so far. If you have anything useful to add to this, please dive in and comment (mind the tentacles)

 

 

Back to the Oubliette – How We’re Digging Our Way Out


Back in December I moved to a new team. After 18 months on DevOps and email marketing projects I’m back to managing commercial product development. After a week of getting up to speed on the product and project (it was already in flight when I joined) I started looking at the bug backlog in Jira.

One problem with bug backlog tools is that they allow you to create bug backlogs. I considered shutting them all down but knew we’d find value in being more considered in our approach.

For this project I was pairing with another project manager, Dom Smith (yes, pairing isn’t just for developers). The project was a major new release for one of our longstanding, successful tools.

This is the second time I’ve pair-led a project, and each time the overall outcome and workload have been so much better. One day the rest of the industry will realise that having a single PM running multiple projects will always get a suboptimal outcome. In fact, another critical project (fixed time, fixed quality, fixed schedule, fixed scope!) elsewhere in the building is also being pair-led right now and is proving to be a great success.

Dom was a technical author in a previous life and is therefore rather good at keeping track of everything a team does over the life of a project. Another team were interested in what we’d done so fortunately Dom managed to capture everything we did to get the oubliette under control. Today’s post is brought to you courtesy of his work.

At the start of our review we had ~1200 open issues, of which about 400 were targeted at the “next version”. The backlog had reached a state where nobody wanted to look at it because it was “too big” to resolve.

Fortunately, breaking down big problems is something I tend to specialise in.

If I’d said at the start we were going to spend 4 months doing this, I think we’d have struggled. Looking back, we spent about 50-60 hours of our time and about 100 hours from the development team to get things under control.

We started with just one hour’s “space” and simply asked, “OK, what can we achieve in 10 minutes?” Fortunately, bug backlogs have many “seams” to work through. We worked through the smallest and easiest first, and whenever things slowed down or value started to diminish, we’d move to a fresh seam and start digging again.

We started by setting up a one-hour slot once a week to pair up and pull the things we were going to fix into the current version. In the process of this, we found many duplicates we could close. We also ensured that rough priorities (to be clarified later) were set on everything we pulled through, and that they all had components (the functional or technical product area) set.

After a few weeks repeating step one, we started tackling the other end of the list: things that were out-of-date or really not going to get done.

Anything more than 3 years old was checked on a case-by-case basis, with the starting assumption that it would be closed as ‘won’t fix’ unless we could argue a reason not to (primarily based on prevalence and customer impact). Pairing made these sensible conversations rather than hasty decisions.

“Autobugs” – we have an automated bug reporting system, so when users hit an exception they can choose to report it directly into Jira with no additional effort. We applied two rules to these:

  • Any autobug raised prior to the last shipped major release and not seen in the latest live release was closed. We were working on v8.0, so any bug reported against v6.x with no reports in v7.x was closed.
  • Any autobug raised only by one of our development team on an unreleased build over 6 months old was closed.
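We applied these rules (plus the 3-year cut-off above) by hand during the pairing sessions, but they’re mechanical enough to express as code. Here’s a minimal, hypothetical Python sketch of the triage logic – the field names, thresholds and issue records are purely illustrative, not our actual Jira schema or filters.

```python
from datetime import date, timedelta

# Illustrative thresholds and issue fields - not our real Jira schema.
LATEST_LIVE_MAJOR = 7                       # v7.x was the newest shipped release
STALE_AGE = timedelta(days=3 * 365)         # the "more than 3 years old" rule
UNRELEASED_BUILD_AGE = timedelta(days=180)  # the "over 6 months old" rule

def triage(issue, today):
    """Suggest an action for one backlog issue, or None to keep it."""
    age = today - issue["created"]

    # Anything over ~3 years old defaults to "won't fix" unless prevalence
    # or customer impact gives a reason to keep it - a human still decides.
    if age > STALE_AGE:
        return "review: close as won't fix unless impact justifies keeping"

    if issue["type"] == "autobug":
        # Raised against an old major release and never seen in the live one.
        if issue["affected_major"] < LATEST_LIVE_MAJOR and not issue["seen_in_live"]:
            return "close: not reproduced in the current live release"
        # Only ever hit by our own team on a stale unreleased build.
        if issue["internal_only"] and issue["unreleased_build"] and age > UNRELEASED_BUILD_AGE:
            return "close: internal-only report on a stale unreleased build"

    return None  # keep it in the backlog for prioritisation

# Example: an autobug reported against v6.x that never reappeared in v7.x.
example = {
    "type": "autobug",
    "created": date(2012, 6, 1),
    "affected_major": 6,
    "seen_in_live": False,
    "internal_only": False,
    "unreleased_build": False,
}
print(triage(example, today=date(2013, 12, 1)))  # -> close: not reproduced ...
```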

After cleaning up the “noise”, we added a weekly, 45-minute meeting with the whole development team to review all the new issues added in the last 7 days, set a clear and appropriate priority, ensure component information was present, and set an explicit release target (either “Current” (v8.0), “Next Minor” or “Next Major”). This helped keep new bugs under control as they arrived – we “stopped the bleeding”.

We adapted our weekly pairing slot to tackle the issues assigned to people no longer on the team (the aim being to ensure they ended up with no issues assigned to them). Many of these could be closed as No Longer Valid, but there were a few that should have been prioritized rather than sat on.

Again during the weekly pairing, we started working on closing bugs targeted at “improbable releases”. We had placeholder releases called ‘Future – Possible’ and ‘Future – Unlikely’ which, by their very nature, aren’t likely to happen. Again, we ruthlessly reviewed each in turn. Many had already been fixed and just needed 5 minutes of testing to verify they were resolved.

We added a further weekly, hour-long session with our product manager to prioritize any issues we’d found through this process that looked more important but needed a product management decision. Having set the impacted component on everything we weren’t closing, we found clusters of problem areas that had previously been hidden; these became candidate focus areas for our next minor release (due to ship imminently). We also checked any queries coming from the development team’s review of issues from the last 7 days, and the scope of work for the current release, to determine ‘Must Fix’ vs ‘High’ vs ‘Would like to fix’.

In parallel to these activities, we worked with our Sales team to ensure CRM references were added to any feature requests in JIRA (both new requests and existing ones that were being asked for repeatedly). This helps our UX specialists and product managers with prioritizing these in the future and gives them direct customer links to follow up on actual needs and problems.

In the space of 4 months we closed over 50% of the backlog issues without doing any development work – just by investing in three weekly meetings. We fixed another 25% by setting clear priorities, targets and sensible groupings of issues.

We aren’t there yet though. Bugs are still coming in, and there are still two big ugly piles in “Next Minor” and “Next Major” that we have yet to tackle, but we’ll continue to put in a little effort on a regular basis to keep improving our position. Better still, the last release we shipped had significantly fewer bugs, and the one we’re about to ship has fewer again.

We’re digging our way out!

 

Convergence & Divergence


image demonstrating convergence and divergence using people and tools

It’s been a while since I’ve had time to write, but I’m enjoying a brief respite between product releases to collect everything I’ve learned in the last 9 months. I have a few rather significant new ideas I’ve been using that I want to share. This is the first. It’s long, in raw, unedited form, and likely makes a few mental leaps, so please take your time, digest, think about how all this can apply to your environments and ask clarifying questions. I’d like to get this into a more digestible form someday…

Before I start, a huge thank you to Matthew Dodkins, who discovered that a brief comment Joh & I made whilst presenting at Agile On The Beach in Cornwall last year was something far more significant than we’d realised…

Complex organisational structures cannot be static; in order to remain healthy they must ebb and flow, centralizing & decentralizing around roles, reporting lines, business needs, good practices and knowledge.

Having brought this observation to a more conscious point, I and my fellow managers at my current employer have put this into active use…

Most experienced facilitators are used to talking about a diverge->converge process as used in moving from brainstorming to actions.

  • Diverge = generate ideas
  • Converge = choose the most promising

What I’m talking about here has very little to do with facilitation and is a reversed view (converge, then diverge); however, it’s more accurate to view this as a natural, continuous tidal cycle.

A loop showing convergence and divergence

Theory: The Converge-Diverge Cycle

The Converge-Diverge Cycle applies to organizational structure, behavior, process and even technology.

Although at any single point in time an organizational structure may appear static, many organizations develop natural rhythms of restructuring over time. At General Electric that cycle was roughly every 2-3 years at a wholesale organizational level. At Red Gate it’s closer to 6 monthly but cycles incrementally through divisions and operational functions.

At worst, poorly managed organizational tides are pretty toxic but at best they help re-balance interactions, interpersonal relationships, behaviours and practices across diverging groups.

Here’s the cool bit…

Which direction are you heading?

You’re always in either a convergent or divergent cycle. If you know which way you’re heading you can make an explicit decision to let the tide continue in its current direction or to turn it back. The longer a tide moves in a divergent direction, the longer and more painful the resulting convergence may be. The longer things remain converged, the more constrained your learning, innovation and evolution.

(unnecessary or diminished-value convergence can also occur – think “spork“)

Simply being aware that convergent or divergent cycles exist makes it possible to step back, observe, interpret the current direction of travel and make a conscious choice.

Your Convergence may be my Divergence

In organisational structures, convergence can shift from one position to another, causing divergence in its wake. (I’m getting a bit philosophical so I’ll back this up with some real examples shortly.) Think about the age-old “Us vs Them” mentality – most often seen between the “business” and “development”, or in bad cases between “development” and “testing”, or between you and your supplier or customer.

Whilst it’s important to take individual responsibility for increasing the bandwidth of communication from email to phone to face-to-face, it’s really hard to do – particularly in a conflict situation. The more physically and organisationally separated groups or individuals are, the stronger the “Us vs Them” pull may be. Beyond relying on individuals to play nicely, how about changing the structure and dynamic of the groups to foster closer collaborative working?

Equilibrium

So now it gets tricky…

Each “Us” or “Them” has a set of existential goals and motivations. These aren’t necessarily combative but often they aren’t aligned to each other. We know this lack of alignment can cause problems but what happens when we realign?

The best example of this that I frequently see is organisational structures that provide a centralised (converged) service. The centralised service convergence drives common “best practices”, efficiency and greater synchronization across the group, who are all trying to provide a similar service to multiple “customers”.

This convergence is often at the cost of quality of service, responsiveness, tailoring and specialisation to the needs of a given customer.

The “Us & Them” conversation is between the service and each of their customers.

From this shared starting point, if we reverse this – diverge and decentralise the group, with the service team distributed to directly serve each customer – we find that responsiveness improves, tailoring increases and localised “best practices” emerge, but this has its own costs and compromises.

Standardization

I spent some years running an organizational standardization programme and have been involved in many others. I’ve seen the pain and cost of these first-hand.  Often the primary benefits are in management reporting and process certification, not improvement. I’m a strong believer in no best practices. Following a “best practice” implies there is nothing better. This mindset will stifle continuous improvement. There are however best known practices for your current context, situation and experience.

Process or tool standardization is not a sensible goal in itself. Ask what it’s trying to achieve, and whether it’s really the end point or simply a pivoting point between converging and diverging.

Reconsider where you’re standardising today. If these stayed locked-down for a decade and only evolved in a single direction, would this be bad for your teams and your business? If so, what’s your divergence strategy?

Who’s “Us” and who’s “Them”

Initially the ties of a previously converged team remain strong. You benefit from the strength of prior working relationships and team bonds to maintain synchronization.

This only lasts so long. After a surprisingly short period of time – particularly if groups are under pressure (less than 6 months in my experience) – the “Us” vs “Them” returns; however, this time it’s between groups with diverging and potentially competing approaches. The former ties have weakened and new ones have formed. And this increases over time.

There is no long-term equilibrium. There is always an “Us” and a “Them” somewhere.

Great, we’re doomed to hate each other eventually and our evolution is over – so what do I do?

Deliberate Convergence & Divergence

There are many existing strategies for forging and retaining ties between groups, and a bunch of organisational team-building resources are available with a bit of searching. Most of these need time set aside and essentially keep some bonds alive, but they rarely provide opportunities for continuous evolutionary improvement.

Here’s the cycle that usually happens

  1. We’re heading in the wrong direction.
  2. Someone realises there’s a problem that requires convergence in a different direction.
  3. An organizational change programme is initiated.
  4. Time and money are invested in effecting a large organizational change.
  5. Productivity and morale take a beating.
  6. The changes bake in.
  7. Rinse & repeat.

Here’s the “deliberate” equivalent

  1. As part of a regular review, current direction is assessed.
  2. We determine if we’re converging or diverging on the right things, what’s working well & what’s not.
  3. We make an explicit decision whether to change direction.
  4. We give things a “prod” in the direction we want to go.
  5. We do this regularly enough to stay fresh but slowly enough to be stable.

Here’s a couple of real examples from where I am right now…

In Practice: Organizational Convergence & Divergence

We have a number of teams who are responsible for providing centralised services. HR, Publishing, IS, DevOps, Finance. These groups are co-located teams and interact with their internal customers.

Team 1: HR

Our HR team assessed how they were working and made an explicit decision to diverge.

Each division now has an HR specialist who sits co-located with their customer.

The HR team still get together regularly but they’re now a diverged group.

They’ve structured things so they won’t diverge any further but also know that all decisions are reversible. We may re-centralise them in a year or two’s time in order to put a load of energy into a large centralised change.

Team 2: DevOps

Our DevOps team were a centralised group for over a year. They worked closely with IS but provided a service to sales, development and finance (among others).

Because of the nature of projects and changes on the cards we had 2 choices.

  1. Disband DevOps, put skilled DevOps staff in each delivery team and diverge the service
  2. Converge DevOps, Business Intelligence and IS even further to run under a single pillar of the organization.

We chose further convergence. The team have radically altered their processes and focused on the top company-wide priorities. Some other teams will miss out on their requests, but in the larger scheme this was seen as the most effective structure for our current needs.

Team 3: Sales

Our product organization was divided into a number of divisions. Each division had its own sales force co-located with the development teams. This was working great for development, but it gave a really bad customer experience when a customer owned products from multiple divisions. We’ve converged the sales team and brought their processes and practices together to give a better customer experience. We’ve sacrificed the closeness to development and had to drop some division-specific process tailoring, but we’re having better conversations with our customers.

In all 3 of these cases, I fully expect these changes to cycle in a different direction at some point in the next 2-5 years. As do the teams – it’s simply how we evolve.

Because we know things will change again, we’re less entrenched in the status quo.

As an aside, we don’t hot-desk here but because our organisation is so fluid, most staff have moved desks 3 or 4 times in the last 18 months. We treat this as normal. We still have our own space, desks, walls & possessions but we stay mobile enough to move ourselves in under an hour’s effort.

This week we’ve asked all our development staff to propose where they each feel they can best contribute for the next 6 months and based on those preferences I’m expecting to see over 20% of our teams move again.

In Practice: Process & Tool Convergence & Divergence

I mentioned earlier that standardization is a directional way-point, not a goal.

I spent nearly 2 years “standardizing” Scrum, Lean & Agile with one employer. At the time it was the right thing to do. Most parts of the software group had never had any form of large-scale convergence. As a result the effort and pain were significant; however, we took teams who had been working in their own evolutionary islands for 10-20 years and brought increased commonality to them. We got very close to standardisation but deliberately designed our ISO processes to support limited positive divergence.

At my next employer, my first action as head of project management was to explicitly state that we would not be standardising our project practices. All teams had standardized on Scrum 3 years earlier and had been slowly diverging. Having observed the teams and contexts it was clear we had scope to diverge further whilst still remaining healthy. In fact, it gives us the ability for concurrent parallel learning and improvement. We learn more and we learn faster.

2 years later we’re at the point where I want to introduce some convergence. Rather than a complete standardisation we’ll be developing a playbook of “known good practices”. It won’t be exhaustive, it’ll start with the top 25 things we think are a no-brainer and we’ll work from there. Some of these things won’t be common across the teams when we start. Eventually they will be…

… and then we’ll decide whether to continue converging or to diverge again.

Great, now what?

Review where you currently stand.

  • Which groups are converging and diverging? – Observe and interpret.
  • How long have they been moving in that direction?
  • Are you willing to continue or is it time to adjust? – Make an explicit decision
  • How large and fast a change is needed? (if any)
  • How will you push in the direction you need? – Deliberate action
  • How long do you expect the next cycle to last? – Tidal change
  • When will you next check?

Whew! That was epic. Thanks for reading to the end. As always, I’d love your feedback.


Stopping The Line


A few weeks ago the company I work for celebrated its 13th birthday. As part of the celebrations we were each given the latest copy of the “BoRG” – the company book. On reading through the pages I found that one of the teams I’m responsible for had received an award.

This isn’t quite as positive as it sounds, and I’m pretty sure the incident will become legendary within the company. The lessons learned, what we did afterwards and the forward-thinking attitude of our senior management are, however, truly worth celebrating.

The (rough) Story

Over the summer one of our teams was working on some updates to our product deployment tools (we deploy upwards of 50 new releases across our product portfolio every month). Part of the automated process involves uploading a packaged installer for our software to a download location and updating our web site to point to the update.

Due to a mix-up between environments and configurations, one of our internal tests made it into the outside world. The problem was spotted and resolved fast but something was clearly wrong for this to have been possible.

This alone would have been rather embarrassing; however, this was about the 6th or 7th significant incident that had come from our operations teams in as many weeks. We’d recently restructured team ownership of parts of the codebase and were making a large number of significant infrastructure, library, test and build changes to our systems – mostly legacy code (code without sufficient tests). Moreover, we’d added a whole new team onto the codebase with a very different remit in terms of approach and pace, and the volume of churn in the code had massively increased.

This was all during the height of holiday and conference season so many of us weren’t fully aware of the inner carnage that had been occurring. Handing over from one manager to the next repeatedly meant we’d not seen the bigger picture.

I returned from Agile 2012 to a couple of mails from my boss (who was now on holiday) to fill me in on what had happened, with the (paraphrased) words: “Can I leave this safe in your hands?”

Over the first 2 hours of my return I was briefed by managers and team members on the situation. Everything was sorted, no problems were in the wild but the team’s credibility had taken a beating.

I’d seen similar things happen in other companies and had always been certain of the right course of action. This was the first time I actually felt safe to lead what I knew was right.

I donned my “Lean” hat and started my nemawashi campaign with our senior managers.

I spoke to each manager individually – they were already well aware of the problems, which made things much easier. I simply said:

“These problems can’t continue, we’re going to ‘stop the line’. All projects are going to stop until we’re confident that we can progress again safely.”

I went a step further and set expectations on timescales.

We’d be stopping development work for nearly 20 staff for at least a week. We’d monitor progress daily, and approval to continue would be on the condition that we were confident problems would not reoccur.

By lunchtime I had unanimous support. It was described as a “brave” thing to do by our CEO but all agreed it was right.

A side-benefit of Lean is the shared language it provides. In every case when I approached our management team and explained that I wanted to “stop the line” they immediately understood what I meant plus the impact, value and message behind such an action.

Now of course you can’t prevent new problems with hindsight but you can identify patterns of failure and address these.  In our case I had a good understanding of what had been going on.

Initially I was strongly against performing a full root-cause analysis.  There were half a dozen independent incidents and a strong chance of finger-pointing if we’d gone through these. I was already “pretty sure” where our problems lay. The increased pace had led to a fall in technical discipline coupled with an increased pressure to deliver faster and a lack of sufficient safety net (insufficient smoke tests).

I divided the group into 3 teams to focus on 3 areas.

  • “before release” – technical practices
  • “at the point of release” – smoke tests
  • “after release” – system monitoring

After an initial briefing and idea workshop, I stepped back and left the 3 teams to deliver.

The technical practices team developed a team “technical charter”.  We brought all participants together for a review, revised and then published this. Individuals have since signed up to follow this charter and we review it regularly to ensure it’s working.

The smoke testing team developed a battery of smoke tests for the most critical customer-facing areas (shopping cart, downloads etc). These are live and running daily.

The monitoring team developed a digital dashboard (that I can still see from my desk every day). This shows the status of the last run of smoke tests (and history), build status, system performance metrics and a series of alerts for key business metrics that would indicate a potential problem with the site – e.g. a tail-off in volume of downloads or invoices.
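The business-metric alerts were conceptually simple: compare the latest figures against a recent baseline and shout if there’s a tail-off. Here’s a rough, illustrative sketch of that idea – the data, thresholds and baseline window are assumptions, not our actual alerting rules.

```python
from statistics import mean

def check_tail_off(daily_counts, threshold=0.5):
    """Alert if the latest day's count drops below `threshold` x the trailing
    7-day average - e.g. a sudden fall in downloads or invoices."""
    if len(daily_counts) < 8:
        return None  # not enough history to judge
    baseline = mean(daily_counts[-8:-1])  # the previous 7 days
    latest = daily_counts[-1]
    if baseline > 0 and latest < threshold * baseline:
        return (f"ALERT: latest count {latest} is below {threshold:.0%} "
                f"of the 7-day average ({baseline:.0f})")
    return None

# Example: downloads per day, with a sharp drop on the last day.
downloads = [410, 395, 402, 388, 420, 415, 407, 150]
print(check_tail_off(downloads))
```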

They also implemented some server-side status monitoring and alerts that we subscribe to via email.

Since these have been in place we *have* had a couple more incidents, but in every case we’ve spotted and resolved them early.

Subsequently, a couple of the teams have self-selected to perform a root-cause analysis on a couple of issues. This is exactly the behaviour I love to see: it’s not a management push; they simply wanted to ensure we’d pinned things down and done the right thing. Moreover, they published the results to the whole company.

The award…