Sunday 26 April 2015

Reviewing code




Code reviews. Love them, hate them - they have to be done. They also remain largely untouched by the Agile ceremonies framework; apart from pair programming, and other assorted ideas, people have scope for doing them as they like.

I'm one of those people, and having been doing a few of those each day for a number of years, I've acquired a number of practices. Naturally, there is no better place for sharing them than this blog!

Usually, when a new review lands at my (virtual) desk, I try answering this question:

How would I have done it?




This is before even looking at how the other guy wrote his/her/its code. 
Why? Because if I can't think of an approach myself, then I'm obviously not qualified to critically review the design which lies before me. Also, all competitions need at least two entrants; if there is just one, the winner, the one that passes review, may not be fit for purpose.

Of course, in some cases we would have selected a low-level design before coding even began (see post on planning spikes). Then, we are already looking at the right approach.
However, this luxury is not always there; many small items, minor features and featurettes, simply don't justify a separate planning task, and we end up reviewing both design and code at the same time.

The five-minute test

Ok, let's assume I have a solution rotating in slow 3D in my head. Now, it's time to look at the artefact to be reviewed (for the benefit of my spell checker and US colleagues, artifact).

Here we come to the 5-minute test. It's as simple as tests go: if I can't figure out in five minutes what this does and how it does it, then the review travels back with a few well-chosen question marks attached. If circumstances allow, we'll also have an informal design review rather than bouncing the code back and forth.

The reasoning is simple: the task is fresh in my mind, and I understand what the code is supposed to do. In a year's time, neither statement will hold. What's worse, it might not even be my memory that matters, as there are (and will be) other developers on the team, some of them new.
Hence, if I can't quickly figure out what it does today, then tomorrow I, or indeed someone else, might not figure it out at all. 

Of course, there are shades of grey. It does not often happen that the entire approach is incomprehensible; it might be a mother of an "if" condition, or a function with 6 nesting levels, or a convoluted class hierarchy.
Then, I prefer getting those refactored before proceeding with the review, as they may obscure other problems. 
If the code deserves to be complex, then the explanation should make its way into a code comment.
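
As a hedged illustration (the names, data layout and the business rule here are invented for the example), this is the kind of condition that tends to bounce back from a review, together with the friendlier shape it usually ends up in:

    # Hard to review: the intent is buried in one long condition
    def grant_access_v1(user, quota, grace_list):
        return (user["active"] and not user["locked"]
                and (user["role"] == "admin" or user["id"] in grace_list)
                and quota["remaining"] > 0)

    # Easier to review: the intent has a name, and the comment carries
    # the one non-obvious piece of reasoning
    def is_privileged(user, grace_list):
        # Hypothetical rule: non-admins keep access only while on the grace list
        return user["role"] == "admin" or user["id"] in grace_list

    def grant_access_v2(user, quota, grace_list):
        return (user["active"] and not user["locked"]
                and is_privileged(user, grace_list)
                and quota["remaining"] > 0)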

Once the implemented design is understood, and understood well, it's compared against the one I have in slowmo 3D rotation. If, for some (and I hasten to say, objective) reason, the preconceived option is better, we have a discussion.
It might cause a rewrite, but more often than not, I'll be comprehensively shown that the implemented design is better; after all, the other guy spent more time thinking about it.
This is fine: it is a good reason for another couple of comments, and it gives me a better idea of what is being reviewed.

Ok, we've cleared the low-level design barrier; what comes next? Next come
  
Personal angles


Every reviewer has their own pet subjects. Someone hunts for Python loops and swoops in to convert them to list comprehensions. Another determinedly looks for C++ raw pointers in a fashion popularised by guys with metal detectors on beaches. In short, everyone has some form of bias.
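
To pick on the first of those (the widget list below is invented purely for illustration):

    widgets = [{"name": "save", "visible": True},
               {"name": "debug", "visible": False}]

    # The version that attracts the comprehension hunters
    names = []
    for widget in widgets:
        if widget["visible"]:
            names.append(widget["name"])

    # The version they will suggest instead
    names = [widget["name"] for widget in widgets if widget["visible"]]

Neither version is wrong; the point is simply that each reviewer arrives with a favourite axe to grind.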

Even putting those aside, in most cases, you'd be looking for the latest defect you've found or experienced. It's hard to cast that away, which is why it is worthwhile doing at least a couple of rounds on important reviews.

Usually, I'm following this order:

  • Code reuse (or lack thereof)
  • Encapsulation, interfaces and self-explanatory variable names
  • Correctness, i.e. no obvious defects with inverted conditions, missed basic use cases etc.
  • Performance and efficiency
  • More esoteric errors, such as off-by-ones, imprecise variable types, signed/unsigned, proper variable scope and so on
  • Auto-tests and code coverage
  • Correct merges to release branches and external dependencies.

Not all changes are born equal, and a one-liner defect fix won't go through all the trials and tribulations above.
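
As a hedged illustration of the middle items on that list (the code and the defects are invented for the example), a snippet like this would collect several comments in a single pass:

    def last_n_lines(path, n):
        with open(path) as f:
            lines = f.readlines()
        # Off-by-one: this slice returns n + 1 lines, not n
        return lines[-(n + 1):]

    def is_enabled(config):
        # Inverted condition: reports "enabled" exactly when the flag is off
        return not config.get("feature_enabled", False)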

The more important point is that for bigger reviews I tend to stop if serious issues are found at any stage. For example, if a developer duplicated a chunk of code, it won't even be reviewed; I'll just ask them to adapt or reuse what we already have.
Similarly, if I find basic functional issues, I won't be looking for performance, auto-tests or advanced defects. Basic correctness should be verified by auto-tests, not me, so it's back-to-drawing-board time.

Let's rotate this by 180 degrees, and say

What code review is not

It is not a coding standard check. Every self-respecting language has tools that verify naming conventions, spaces vs. tabs and so on. No need to waste highly paid humans' time on this.
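
For instance, in Python-land, tools such as flake8 and pylint will, between them, flag the spacing and the naming in the sketch below without any human involvement:

    # A style checker's job, not a reviewer's: odd naming and spacing
    def CalcTotal( Items ):
        total=0
        for i in Items:
            total+=i
        return total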

It is not a validity test. Code should be working before it is submitted for review. A computer will tell you much faster than a human whether something is completely and utterly non-functional.

It is not like reading a book. If you're going over the sources top to bottom in alphabetical fashion (or whichever order is imposed by your code review tool), then you won't find many defects.
You might first hit some internal helper module, then auto-tests, then core functionality, and then wrapper functionality. Tracing the narrative that way is hard, since the sensible order is: core functionality -> helper modules -> wrapper -> auto-tests.

Moreover, you might have to go over the sources a few times as you understand better what is going on (as covered above).

It is not a management metric. I've seen places where the time spent on code reviews was measured, and if reviewers spent less than X seconds per LOC, they got a slap on the wrist. 
I've also seen places where you could not approve a review unless the tool confirmed that you scrolled over every single line of code.
They were probably done with the best of intentions, but if you, for valid reasons, don't trust developers to review properly, then I assure you - no metric will help; the problems lie much deeper. What's more - if engineers are smart enough to get paid IT salaries, they are also smart enough to work around the metrics.
The only number I've found useful is the count of defects found during code review compared to those that went on to QA and production.

(Ok, there's also this one)

Code review is not a formal meeting. Let's add a weasel statement first - sometimes formal code walkthroughs are ok. 
If you're developing mission-critical software, where a minor failure carries untold consequences, then yes: throw whatever you can at it, including formal, line-by-line code reviews with numerous people present.
For the remaining 99% of us, a formal code walkthrough is one of the most unproductive practices known to man. Everyone goes through the sources in their own order, has a different starting point, and arrives with different baggage. Sitting in a room, eyeballing source lines one by one, and wearily commanding the reviewee to move to the next line just does not provide much excitement.
Informal chats and mini-meetings are good and useful though, especially when dealing with complicated solutions to complicated problems. 




Code review is not about transferring responsibility. If you created a defect (which happens to everyone who writes code), then you're still its author irrespective of how many people reviewed it. Code inspection is a service, not an ownership transfer.

Wrapping up

That was too much talk and not enough content. Next time, I'd like to compensate by showing how it all works in practice on a willing victim (i.e. a chunk of code).



Saturday 18 April 2015

Scrum - Part XI: Module teams or feature teams?


As this blog has seen its fair share of dilemmas, it's time to add another one: namely, the eternal debate of module vs. feature teams.  

Just in case you have a tiny question mark hovering over your head: 
  • Module teams are organised around technology and components. For example, a module team may own driver abstraction in an OS kernel, a database layer, or a processing engine.
  • Feature teams deliver features end-to-end and are comprised of engineers from different modules. E.g. a single feature team may deliver the telemetry example from my previous posts. 
Module teams - a thing of the past? 

There's plenty of prior art on this topic from Agile gurus, and it tends to paint pictures similar to the one below:



You can easily recognise the feature teams on the left panel and observe a gradual descent into module teams on the right. I might be a bit cynical here, but you're welcome to check Table 1 in this link, and then come back here.

The upshot is that component teams are passé, that they trigger Conway's law, give rise to complex coordination, and inhibit Agile practices, such as common sprint goals and backlog.
And, actually, some of that is sometimes correct in some cases (see below). 

However, whenever a guru comes over and draws a comparison where there's heaven on one side and hell on the other, a warning light should start blinking. Conceptual simplicity might be good for pre-Renaissance art, but it is not necessarily as good when it comes to software development.

The mantra that's going through this blog is that management is an art, and delicate decisions require deliberation, experience and strong understanding of the company and its products. Decisions do not come much more delicate than deciding how teams should be organised.


Feature teams - can they backfire?

Well, yes, they can. There are quite a few situations where feature teams have their own cons:

  • Many small changes. Let's say you have a fairly mature product, and you spend a significant chunk of your time dealing with customer escalations, assisting Technical Support and resolving defects.
    Then it's rare that interaction between components is required; the changes are small and contained. There's little benefit in having developers from other modules in the Scrum, while on the other hand, engineers from the same module can contribute to the sprint goal.
    The same would apply if we have many self-contained featurettes.
  • Refactoring is an inevitable reality. Interfaces, classes, source directories appear as features add up, and design mutates beyond its original purpose. 
    Note: In case you disagree, or the term is unfamiliar, have a look here.
    This is almost always a modular change; there's more oomph behind developers who know the same component inside out, and can design and refactor together.
  • Varying team experience. Imagine you have a team with a few senior guys and a few newcomers to the industry. Placing a newcomer into a feature team is a bit like throwing them to the lions as they'll be expected to represent and develop their component on their own.
    This actually means that they'll have to lean on the senior guys in order to represent their component, and that in turn means that the latter belong indirectly to several feature teams.
    And, again, in turn, it means that we have de-facto module teams. That leads into the next caveat:
  • Periodic team meetings and backlog grooming. Back at the start of this blog, I argued that daily stand-ups have their own quirks. However, let's assume that you diligently and regularly hold some form of meeting where engineers share what they do (and you really should).
    An ideal dialogue in such a meeting is:


    Developer A: "I'm working on this new dynamic evaluation interface to retrieve widgets by their moniker, but was stuck yesterday figuring out the widget class hierarchy."

    Developer B: "Hey, I refactored those classes two years ago; I'll run it past you on the whiteboard after the stand-up."


    Now, what are the chances that developer B will be available in a feature team Scrum as opposed to a component team?  
  • Shared function and team size disparity. Let's say we have a handful of specialists who are needed on most projects, but don't do the bulk of work there. For example, a UX designer, a TechDoc engineer, or a Product Manager. Should they belong to all the feature teams? If yes, will they have to attend all the ceremonies in 5-odd teams and keep their sanity at the same time? Should they attend only a few ceremonies?
    Should they be instead in a separate unit providing services/dependencies to feature teams?
    If so, then we do not have a pure feature-based Scrum.


Feature teams - do they always backfire?

Absolutely not! There are teams/products/times when they work, and work well.
 

The art of management is recognising those, and, as we start playing the "spot-the-feature-team" game, greenfield projects come into view closely followed by large cross-functional roadmaps.

These get the most benefit from a project Scrum team, with few of the drawbacks.  
  1. The project is big enough, so several people from each team work on it.
  2. There are few disruptions and customer escalations.
  3. No refactoring to worry about (at least for greenfield projects)
  4. Enough room to accommodate dedicated UX/TechDocs. 
This is the "sometimes correct some of the time" I alluded to above. Feature teams is definitely an option on big new projects with few outside interruptions. Established, mature products with an incremental roadmap is a different beast altogether.




But what about Conway's law?

Indeed, what about it? I'm advocating old-school module teams for stable products; wouldn't that constrain architecture and solutions along the team structure?

There are two counter-arguments to that: firstly, architectures get spawned when new projects come into existence. This is exactly when project Scrum teams work best, so there is no contradiction.

The second point is even more important: component teams do not have to operate as silos.

In layman's terms: if I'm in a module team, it does not mean I should not care about the rest of the system. Of course, I might not understand another team's module well enough to refactor it, but I should be aware of what it is, and avoid designs restricted purely to what my own team does.


Yes, you might claim that this is not what happens in real life and that silos do appear. Maybe so, but from personal experience - it is not always the case. 
Feature teams do not hold a monopoly on understanding the system end-to-end, and there are many ways of spreading awareness across teams: demos, lunchtime engineering sessions, cross-functional design reviews and so on.

Summary

As systems become more mature, there's less and less reason for feature teams, and a stronger case for module teams.


Moreover, in the same way that few people do pure Waterfall, few run pure module teams, where everything outside of one's own component is "not my business"(TM).

Scrum teams can be reorganised as products evolve, though it always comes at a cost. It's up to the powers that be to decide whether the benefit of feature teams at the start of a project is worth the price tag.

Saturday 11 April 2015

Presenting status updates: do-s and don't-s


Presentation is one of the most neglected skills among software engineers. It's hardly surprising: software professionals are rarely hired for their flamboyant keynotes, ability to entertain, elevate and amuse, or massive PowerPoint skills. 

Nevertheless, as people progress through their careers, they start consciously or subconsciously improving their soft skills: written comms, cross-team collaboration etc. Yet, it is rare that someone wakes up one morning and decides that from now on they shall be a great presenter. 
The prevailing opinion is that this is what Marketing and Sales do, while we code and test.


But do we really have to present?


Yes, we do. We're Agile now, so engineers do demos, iteration kick-offs, retrospectives and what not. (I'm not even going into more advanced exec or customer technology presentations - not yet).

For example, one of my favourites is the department update meeting. The chances are that you know the drill: a department consisting of several teams gathers and each group goes on about what they've done and what they are going to do.

An example update from a team might look like this:



Now, I'll paint a temporary target over the bulletpoints, but will leave them alone for now. Let's listen to the presenter.

Right, so we developed widget A, and then we refactored module B, and, um, we did not quite deploy feature C, but we had to look at urgent customer problem D. 

Sounds familiar? If not, then count yourself lucky, and classify this post as a rant. If yes, then you probably had the same mental reaction as me: "I can read! I don't need you to vocalise the bulletpoints!".

The reason people do this is simple: reading and writing bulletpoints is easy, but conveying memorable information is not.

When we mix that with the tendency of IT professionals towards introversion, and hence a keen desire to tick the meeting off rather than revel in the opportunity, we get a rather boring result.


Ok, stop the rant. How can it be done better?

Well, for starters, PowerPoint is a visual aid, not a subtitle. Whenever I present, my mental goal is not to repeat anything that the slide already has.
More importantly, I need to understand what my audience cares about, and how the information can be rotated in their direction.

In this case, our audience is fellow engineers; what do they care about when reclining in their chairs during a departmental update meeting?

In the order of priority:
  1. Is this meeting going to be quick and when can I get back to my work?
  2. How is anything he/she is saying going to impact me?
  3. Is there any cool technology involved?
  4. What are other teams working on and why?
Of course, I am being facetious here, and many engineers might well have number 4 as their top priority. Many will also care about the business side of things. However, let's keep things simple and stick with these four. Here's how I would go about presenting the same slide:

We were finding more and more defects in our logging layer, so we refactored it into a plug-in system. We might also want to simplify the back-end interface, so back-end team - we'll come and talk to you. 

Feature C was our first deployment using Docker, but we missed some network configuration specific to production, so it got delayed. If anyone is interested in the details, please go to the Wiki page. 

We had an interesting defect uncovered by customer X - turns out we had a classic off by one in our report calculation, and they managed to pick it up. 

Main thing: we finished our cool new customisable skin engine; here are a couple of quick slides on what it looks like.

This might also elicit little reaction, but at least there is a genuine attempt to convey new information and share what we've done. It's also not hard to extrapolate and talk about what we are going to do.

There are also many little techniques to keep the audience engaged:

  • For example, rather than explain the off-by-one defect, I could pull out a slide with the erroneous report, and ask whether someone can spot the issue. People enjoy showing how smart they are, and more importantly, that they are smarter than everyone else around them.
  • I would also jump randomly across topics rather than go top to bottom - the last thing a presenter needs to be is predictable.
  • Giving a nod to someone else (e.g. "and here we used the Docker techniques shown by Mike in the last demo") is also very important psychologically - it shows Mike and everyone else that these meetings and their efforts are not done for nought, and it emphasises collaboration.
  • It also helps to gauge the audience's reaction visually, and to decide what we should zoom in on.

There's not much to it really: just a combination of small rules and an ounce of self-confidence - far simpler than understanding the intricacies of modern programming languages.
It won't make one a world-travelled keynote presenter, but it should go far enough to keep colleagues interested.

What about bulletpoints then?

I don't like them. To be precise, I don't mind them being in the presenter's notes, but not on a 16x9 projected screen in front of 50-odd people.

Why? They're not very visual. It's me who conveys the bulk of the information, and the job of PowerPoint is to visualise it. Bulletpoints are very good as talking-point reminders, and they are fine as an offline list for those who can't be at the meeting or who listen to the recording. 
But that can't take away the fact that they are bland, and the last thing any presentation needs to be is bland.

So, coming back to the departmental update meeting - I'd move the bulletpoints to the notes section where they belong, and use the slides to show off screenshots, high-level architecture diagrams or PowerPoint SmartArt.

Of course, someone from Marketing would run circles around this, and create slides that put my list to shame. That's fine - we are not visual arts majors, our job is to create adequate, not great, presentations.

Do the same techniques work for other presentations?

Not exactly. A customer meeting and an exec presentation have completely different audiences and focus. There's less merit in jumping across topics or giving credit to someone in the audience. Of course, the attendees' agenda also sits at a completely different angle.
Maybe a good topic or two for another post...

Sunday 5 April 2015

Scrum - Part X: Planning spikes and low-level design


Last time I talked about a small technique that eases sprint management: task pinning. What I skimmed over was its elder brother, the planning spike, for reasons of prior art: Google helpfully pointed out the commonality of this approach.

However, when I started digging into said prior art, it differed somewhat from what I've been using. Agile gurus treat planning spikes as a form of high-level design and/or UX planning ahead of a major project. Moreover, some recommend treating them as the exception rather than the rule, and fixing planning gaps during retrospectives.

I have already expressed the opinion that people put too much faith in retrospectives. No post-mortem can fix technical gaps, and these, not processes, are the most common root causes of struggling sprints. They are much better handled before people start the sprint and raise their left (or, as the case may be, right) foot over the chasm. 

So what are planning spikes?

My bad. I spent three paragraphs talking about what the subject isn't, rather than saying what it is.
Easy to fix - here's the definition: a planning activity that provides a fine-grained, low-level task breakdown.

No, it's not going to be carved in granite any time soon, and neither is it meant to amaze the community of Scrum practitioners. Frankly, it's one of the most obvious definitions you're going to meet.

The more interesting and harder question is:

How often should I do planning spikes?

This is where my road starts diverging a bit from the wisdom unearthed by the Google search. My take is that all sizeable tasks should have a planning spike - with a few notable exceptions due to be unveiled.

The reason is simple: people get estimates wrong. It is not a subjective opinion, and it is not a sample of a specific company. It is an observation.

Software engineering and testing is a very complex discipline, and it depends on many unknowns. If an estimate is more than a day and a half (or 5, 13, 21 story points, name your poison), or if the person did not spend more than a few minutes dropping in a cost tag, then it's very likely to be inaccurate.

This looks like an overgeneralisation, so let's pull out a few reasons why tasks get under- or over-estimated:

  1. Missed error cases.
  2. Did not account for release and code merging activities.
  3. Code review took longer than expected.
  4. Missed code reuse opportunities; refactoring post initial code delivery.
  5. Test bed unstable or not ready.
  6. Missed downstream dependencies.
  7. Low-level interface had assumptions which proved to be incorrect.
  8. Specific test cases were hard to execute; required cumbersome setup (e.g. install of an atypical OS).
  9. Expected to reuse specific functions, but these had somewhat differing interfaces.
I am not compiling an encyclopedia, but I am sure you have a private list of grievances from past efforts. 
Now, many people realise that a monolithic estimate of 4 days is not worth a lot, and that breaking it down is important. The breakdown they come up with then looks like this:

  • Design - one day
  • Implementation - two days
  • Testing - one day
I spent 2 minutes typing this, and by coincidence, this is exactly how long the estimation process takes in those cases. This helps neither man nor beast.

The important point is not how fine-grained an estimate is, but the thinking behind it. If you planned a work item through, then granularity is a beneficial by-product.

So why don't retrospectives help?

Now, with a few examples and definitions behind us, it's easier to reiterate the reasoning about retrospectives (or rather, their limits).

Let's say we missed a set of error cases in the long-running telemetry example. The middleware team's task of recording the set of formats took twice as long as expected because they had not considered invalid codecs, codec families and so on.
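
As a hedged sketch (the names and interfaces below are invented; the real middleware looked nothing like this), the gap between the estimated task and the delivered one can look like this:

    # What was estimated: record the codec of each stream
    def record_format(stream, store):
        store.add(stream["codec"])

    # What was actually needed once invalid codecs and unknown codec
    # families turned up
    def record_format_v2(stream, store, known_families):
        codec = (stream.get("codec") or "").strip().lower()
        if not codec:
            store.add("unknown")
            return
        family = known_families.get(codec)
        if family is None:
            store.add("unsupported:" + codec)
            return
        store.add(family + "/" + codec)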
Now the team conducts a retrospective. What corrective actions can they agree on?

(a) "Never miss error cases."

While we're at it, we can also solve world poverty or stop writing bugs. Easier said than done.

(b) "Consider invalid formats when dealing with codec-specific tasks."

This looks better, but won't help us with the next wrong estimate (e.g. processing and categorisation of client's headers).

(c) "Always consider error cases when planning tasks"

This pulls the sanity dial back towards the middle, but still does not provide the ultimate answer, since error cases are not that easy to flush out. Few people can look at a task, spend ten minutes, and rattle off the full and ultimate list of error conditions. 
Moreover, check the sample list above of other reasons why estimates go wrong; our dial might still be erring on the "too specific" side.

(d) "Perform low-level design prior to starting risky tasks"

This creates a better tradeoff, but you do not need to wait for retrospectives to come up with something like that.

Anyhow, the (a)-(d) exploration above is not a mathematical proof, so if you think there is a gaping hole, feel free to point it out in comments.

The upshot of all of this is the same point I made a few posts back. Retrospectives are good for process corrections, while most sprint under-achievements are down to technical reasons.

The return of the low-level design

You might have picked up on a keyphrase above: "low-level design". Yep, I slipped it in on purpose.

In not so many words: a planning spike is often a low-level design; and, of course, this is not restricted to development activities only. Laying down detailed QA test cases, or mapping out a DevOps deployment or troubleshooting plan, are all fair game.

At this point, you might have a slight feeling of betrayal. Many paragraphs of rhetoric and examples to read: all just to say that we should do low-level design and detailed planning? This is exactly why I deliberated over skipping this topic; however, the devil is in the details, and it's time to say
What low-level design is and isn't

While advocating a planning spike (this modern name looks fancier than just saying "figure out what you're going to do"), it is important to define its boundaries:

a)  The output is 100% technical. Only people who are going to do the task, and/or can provide useful insight need to see it.

b) It is not a living document. It's a trampoline to do the work, and it is going to become obsolete shortly after the work is complete.

c) Its job is to flush out time-consuming activities. It need not explore every nook and cranny, especially if those are small enough.

Most importantly, 

d) It need not be too long! If we estimate a task from afar at 4 days, we should not spend longer than a day or two doing the low-level design.

When planning spikes are unnecessary

I talked at length about the benefits of upfront planning. There are however a few notable cases where they do more harm than good.

Exactly the same task has been done before. Manual regression testing or repetitive deployment activities are examples that come to mind. Yes, they should be automated, but there is the occasional distance between "should" and the reality of today.

The tasks are minor. For example, we have a set of 5-odd defect fixes that we understand at a superficial level. Some might take a bit longer, some might take half an hour, but the law of averages works in our favour.

There is urgency. We have a critical fix to make, and it had to be done yesterday. We still need to figure out what to do, but we have to do it all this sprint, and can't accommodate a separate planning spike.

Isn't this the same as backlog grooming?

Yes and no. The idea and purpose are the same. 
The execution is somewhat different; doing low-level design in a meeting just does not cut it. It needs preparation: a dedicated offline effort from someone. The review can happen in a meeting, though note my previous comments on the validity of those in diverse teams.

In short, people need dedicated time and the right environment to plan their work. Backlog grooming meetings are better suited to elucidating requirements.

Sprint duration

The main weakness of spraying planning spikes around is the time it takes to execute. Let's assume the worst case, and say that your sprint is four weeks long.
You can't stick both the spike and the actual work in the same sprint; that would defeat the purpose of design as a means of getting fine-grained, accurate sprint commitments.
This in turn means that we face eight weeks between deciding to do something and delivering it. It might be good in some situations, but it's not that Agile(TM).

In other words, this technique encourages shorter sprints: ideally two, and definitely not longer than three, weeks.
There are forces opposing shorter sprints, which I'll come to in future posts. Getting the duration right is an important, but by no means straightforward, decision for a given department.

Typical acceptance criteria

As a reference, here are the typical success criteria I've been using while wearing a Product Owner hat:

Development. Description of main functions and interfaces to be changed. For more complex changes, flow diagrams (UML optional). Intended unit-test coverage, with possible TDD (Test-Driven Development) focus.

Testing. Description of the test environment. Low-level design of automation, e.g. how many new UI elements do we need to control. A reasonable estimate of the amount of test cases.

Deployment. Quality gates during deployment stages. Manual effort required during validation of those, if any. Collection of metrics, if for example, we are replacing or augmenting a technology. Training sessions for Technical Support and other interested folk.
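
To make the development criterion above a little more concrete, here is a hedged sketch (every name is invented) of the kind of interface outline and test list a planning spike might leave behind:

    # Planning spike output: the interface we intend to add, with the
    # error cases and intended tests called out before any real coding
    class ReportExporter:
        """Hypothetical new class; replaces the ad-hoc CSV dump."""

        def export(self, report, destination):
            """Must handle: empty report, unreachable destination, partial write."""
            raise NotImplementedError

    # Intended unit tests (names only at this stage):
    #   test_export_empty_report
    #   test_export_unreachable_destination
    #   test_export_retries_partial_write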

Summary

Agile techniques do not obviate the need to do low-level design prior to starting on tasks. To properly and confidently slot tasks into sprints, preparatory work is required, and this can't be done purely through meetings and ceremonies.

There are situations where a separate planning spike does not work, and using this technique in anger requires shorter sprints.