Headshot-color me@jbrains.ca Find out where I'm appearing

Making decisions doesn't have to be so hard

I enjoy collaborating on decisions, but only with groups that agree to use a technique I refer to as consent-based decision making. I might not use that term exactly as the coiners intended, so let me explain what I mean. I characterize consent-based decision making by putting forth a proposal, then looking for reasoned objections. When no-one raises such an objection, we accept the proposal. I find this style of decision making quicker, easier, and better for team unity than typical approaches.

Consent-based decision making contrasts sharply with a typical decision-making exercise, which tends to follow these steps:

  1. Present a need or problem.
  2. Present options.
  3. Solicit more options from the group.
  4. Discuss the merits of each option in detail.
  5. Propose solutions.
  6. Combine the proposed solutions into a kind of hybrid solution to which everyone can assent.
  7. Ask for any last-minute objections.
  8. Decide.

I feel tired trying to make decisions this way, and it seems that the fatigue rises with at least the square of the number of people involved. Not only do I find it difficult to make decisions this way, but that difficulty encourages me to exclude people who might otherwise have valuable input. It puts me in a place of wanting to prune ideas, rather than generate them. You can understand why I’d want to avoid this kind of consensus building.

Some objections

When I have taught consent-based decision making to my clients, some of them have pointed out that it leads to a particularly negative culture: “Don’t bring me problems; bring me solutions.” I agree that, practised mindlessly, that could happen, but because I insist we practise mindfully, I think we can avoid this problem. Still others have pointed out that this style of decision-making stifles creativity because it intimidates people who want to point out a flaw in a proposal without necessarily having a better proposal. Again I agree, but we can mitigate this risk by looking at one of consent-based decision making’s greatest strengths: separating generating ideas from selecting a solution.

I remember dozens of meetings in which I participated in making decisions by building consensus, specifically how inconsistently engaged I felt. I would enter some of these meetings with a desire to generate ideas, brainstorm, and explore solutions; and I would enter others tired of sifting through ideas, craving to decide on a course of action. I can’t tell you why I felt the way I felt, but rather just that it varied, and that I felt it strongly. Sometimes I wanted to expand the solution space, and other times I wanted desperately to contract it. So far, I don’t see a problem with that, but there are, of course, other people.

When you and I enter a decision-making meeting with different goals, we create problems for each other. When you want to generate ideas and I want to select a solution, we fight for air time, for space, and indeed for life… at least, it feels that way to me. I struggle to bring us to a sensible solution, and as my wish comes close to coming true, you trample on it with yet another pie-in-the-sky idea. Worse, your idea might fit perfectly, but I simply won’t see it that way. I will interpret your every new idea as an attempt to prolong the agony, whereas you will interpret my every attempt to choose a solution as an attempt to shut you down. Result? War. Root cause? Cross purposes. Remedy? Alignment. (Surprised?)

Two goals, two meetings

Consider, instead, having two separate meetings: one to generate proposals, and one to select a proposal. You might find that that helps.

In the first meeting, we generate ideas, brainstorm, solicit opinions, run impact studies… whatever we need to do to generate proposals. The people who come to this meeting might or might not care about making the decision. Whatever happens, we can feel certain that whoever shows up wants to lend their ideas to the group, and most importantly, we ignore any attempt at choosing a solution. We agree in advance to ignore them, because we have a different goal in this meeting. We agree not to chastise people for attempting to choose a solution because they form part of our natural impulse to jump to a conclusion. We agree to recognize each other’s humanity by allowing each other to compare or rank solutions, but we generally ignore those attempts in a quiet, friendly manner. (See Ask Why, But Don’t Answer for an explanation.)

The second meeting starts with a proposal. The group may ask clarifying questions, but by now we shouldn’t need to ask too many. The Proposer then asks the group to vote, which the group does by signaling thumb up, sideways, or down. A thumb up means “I accept the proposal”. A thumb sideways means “I will go with the rest of the group”. A thumb down means “I reject the proposal”. Of course, if you reject the proposal, then you must make a counter-proposal right away, otherwise the group feels free to ignore your vote. The group repeats this process until either it makes a decision or reaches a deadlock. If it reaches a deadlock, then we immediately adjourn the meeting and schedule another one to explore the competing proposals in depth. Why schedule another meeting? In part, to discourage people from rejecting a proposal just for the sake of rejecting it, and to give those people an opportunity to sleep on it before beginning another round of brainstorming.

No silver bullet

Naturally, people could abuse this system. And yes, when we have to make particularly tough decisions, this system could take longer than the consensus-building approach. Even so, for more routine decisions, consent-based decision making works more quickly and easily, and I’d rather make easy things easy and hard things possible than optimize for the hard decisions.

March 10, 2010 08:00 people, article, coaching

How test-driven development works (and more!)

It surprises me, from time to time, how much I still need to justify test-driven development to prospects and would-be course attendees. Many feel that TDD has crossed the chasm, while others still see TDD as a cultish practice worth marginalizing. I take some blame for those who find TDD cultish, because until now I haven’t had a strong, sensible, theoretical basis to justify TDD as an idea. I could do no better than “it works for me” or “my friends like it”. That has changed since I’ve started giving my talk “Introduction to Agile with the Theory of Constraints” in which I use concepts from Theory of Constraints to motivate the practices of agile software development, notably those of extreme programming. If you buy in to ideas from Theory of Constraints or Lean Manufacturing, then I think I now have a stronger argument to justify the core programming practices in extreme programming in particular and agile software development in general. I don’t even need all of the Theory of Constraints but rather a simple appeal to fundamental concepts in Queuing Theory.

Queuing Theory?

Yes, Queuing Theory. (And I don’t plan to capitalize that any longer.) I don’t claim to have any particular expertise in this area, but I have already seen how to use queuing theory ideas in optimizing network-based systems, and I see no reason we couldn’t extend that to software delivery systems. Better, I only need to appeal to a single idea from queuing theory to make my point.

Given a process B, which follows a process A, sometimes in performing B we need to perform some of A again. We can remove the need to rework by taking some portion of process B and performing it before process A [1].

This merits a diagram. If we have this problem

then we can solve it by doing this

and the resulting system will work more efficiently by removing wasteful rework. I assume here that we derive no significant benefit from the rework itself, which I suppose I must justify, but let’s not ruin a good story with the truth. Here I’ve described the general problem, and by applying it to software development, I can… well, I find it more effective if I save the punchline for the end.
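To put rough numbers on the rework loop, here is a toy model in Python. It is not taken from the queuing theory literature: it assumes that each pass through B independently sends us back to A with a fixed probability, so the number of A-then-B passes follows a geometric distribution. The function names, costs, and probabilities are all hypothetical, chosen only for illustration, and the model ignores any change in cost from moving part of B ahead of A.

```python
def expected_passes(p_rework: float) -> float:
    """Expected number of A-then-B passes when each pass through B
    independently triggers rework with probability p_rework (assumed model)."""
    if not 0.0 <= p_rework < 1.0:
        raise ValueError("p_rework must be in [0, 1)")
    return 1.0 / (1.0 - p_rework)


def expected_cost(cost_a: float, cost_b: float, p_rework: float) -> float:
    """Expected total cost: every pass pays for A once and for B once."""
    return (cost_a + cost_b) * expected_passes(p_rework)


# Hypothetical numbers: A costs 5 units, B costs 3 units.
before = expected_cost(5, 3, p_rework=0.5)  # B strictly follows A
after = expected_cost(5, 3, p_rework=0.2)   # part of B done first, less rework
print(before, after)
```

Under these assumptions, the only lever is the rework probability: anything that lowers it shrinks the expected number of trips around the loop.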

Winston Royce, 1970, revisited

I imagine you know this diagram

and appreciate that Royce wrote in his now infamous paper that this single-phase waterfall is risky and invites failure. If you don’t appreciate that, then I cannot recommend strongly enough that you read the original paper in its entirety, rather than stopping after page 2 as most people have done [2].

We can apply the queuing theory result I’ve just cited to this diagram and generate some interesting conclusions. I’ll start by focusing in on this portion of the system

We write code, then we test it. Sadly, we occasionally find a bug [3] which makes us change the code we wrote after we thought we’d finished it. That makes a loop of the type we can unravel with our queuing theory result.

Since “coding” is process A and “testing” is process B, we need to do some testing before we start coding.

It doesn’t take long for this to become a virtuous loop where we write only the code we need to write in order to pass the tests we write.

I use the term test-first programming to describe this cycle [4]. When we practise test-first programming, we design as much detail as we can before writing the first test, then use the tests to help us type in our implementation correctly. Most teams most of the time can use test-first programming to reduce their mistake count to near zero, which increases their productivity and improves their ability to deliver, by helping them waste less time agonizing over whether to fix mistakes late in a release. I started this way in 2000 when I first discovered JUnit and stopped making silly mistakes in the code I wrote, which I found significantly beneficial in helping me code more confidently. I still designed most of what I built mostly up front.

After a while, though, I recognized a new process loop: I found some parts of my design difficult to test, or I found some parts of my design didn’t fit together when I tried to type them in.

Returning to our queuing theory result, since “designing” is process A and “doing test-first programming” is process B, we need to do some test-first programming before we start designing.

It doesn’t take long for this to become a virtuous loop where we check our design ideas as we think of them and implement only the parts of the design we can justify needing. When we include refactoring in our practice, we can confidently “under-design” compared to the level of design we expect to need by the end of a task, which I believe amounts to designing appropriately for the code we need to implement right now. This virtuous loop combines test-first programming and evolutionary design, including guiding principles like “you aren’t gonna need it” and the four elements of simple design into test-driven development, where we check our implementation by running tests and we check our design ideas by writing tests.

Where test-first programming helps most teams most of the time reduce their mistake count to near zero, test-driven development helps them reduce their design inventory—mostly code that gets in our way because it doesn’t actively help us deliver a feature—to near zero. This further increases productivity and improves their ability to deliver by helping them waste less time agonizing over design problems they find costly to fix. I waited until I’d spent an entire release practising test-first programming before doing more test-driven development. My transition consisted of trying to do less and less up-front design for each task, letting myself feel comfortable with each new step. Within two years I estimate I designed about 5% as much up front as I did before I started practising test-first programming. I can’t measure the corresponding improvement in my design, but I look back at projects that took 3 months before I practised test-driven development that I now feel confident I could complete—truly complete—in one week. Of course, we can’t stop here!

Enter our friend analysis. To simplify the discussion, I will treat analysis as “discovering the features we want in our software” without forcing myself to state too precisely how that happens [5]. Once again, we have our familiar situation.

Once again, we face the situation where in the process of implementing features we discover new features we need, current features we don’t need, and learn new things about features we know we need to build. This adds to our analysis, meaning that we should try test-driving some features before we try to implement others.

It doesn’t take long for this to become a virtuous loop in which our desire to implement (and deliver!) features drives them ever smaller, as we extract more concentrated value out of each one [6]. When we implement feature 12 we learn something about features 23, 30 and 52. We might decide not to deliver feature 30 any more. We might decide to expand feature 23 to encompass a few more key cases. We might decide to rush feature 52 to the top of the pile. Most teams most of the time find that this cycle helps them reduce the number of rarely- or infrequently-used features in their system [7]. This yet again increases productivity and improves their ability to deliver meaningful software to their stakeholders by eliminating the time wasted on delivering too much of a feature too soon, the time wasted on entire features we thought we needed but realized we don’t, and the time wasted arguing about what a feature means, rather than writing examples together: business-oriented tests that describe how a feature works in enough detail for the business and technical project community to agree on the conditions of satisfaction for delivering the feature.

I call this behavior-driven development, and refuse to spell it with the u that provides as much value to the word as your appendix does to your body [8].

Once again, I didn’t coin the phrase, and some might argue against the way I use it, but I find it apt. This cycle includes practices like business and technical people writing examples together, feature injection, feature splitting, and value-based (rather than cost-based) planning.

At this point, I think I’ve done my job. I believe I’ve justified not only test-first programming or test-driven development, but full-on behavior-driven development, using only a single result from fundamental queuing theory. I’ve made only a single assumption—that we agree on the appropriateness of applying queuing theory to a software development system. I’ve tried to add as little as possible to my reasoning in order to keep it as context-free as possible. As a result I claim that most teams most of the time will benefit from moving along the path from code-and-fix to test-first programming to test-driven development to behavior-driven development.

Now, for homework, what happens when we consider these processes?

Surely at least once you’ve needed to deliver more features for software you’d already deployed. How well does that work? What problems do you encounter? What if you applied our new favorite queuing theory result to that rework loop?


[1] I really need a citation for this, and when I find it, I will place it here.

[2] I digress, but I really can’t help myself on that one.

[3] Also known as defect or, for the truly congruent, mistake.

[4] Clearly I didn’t coin the phrase, but I know many people who treat “test-driven development” as a simple renaming of “test-first programming”, and I believe making a stronger distinction adds real value to the conversation.

[5] I don’t think “gathering requirements”, as though we could pick them like berries, fits as a metaphor. I like “trawling for requirements”, which I believe I first read in Mike Cohn’s User Stories Applied.

[6] We can easily apply the “Pareto Distribution” here in that we can deliver 80% of the value from implementing 20% of the feature.

[7] You recall that Jim Johnson of the Standish Group reported in 1994 that 45% of developed features are “never used”. As I recall, only 7% of features were used very frequently.

[8] My Canadian and British brethren and sistren be damned. I assert my right as a Canadian to choose the British spelling when I prefer it and the American spelling when it saves me time.

The World's Shortest Article on Behavior-Driven Development, revisited

I added more to this article on September 18, 2009.

On May 21, 2006, I wrote the world’s shortest article on Behavior-Driven Development. Although the title links to the entire article, it is so short that I can reproduce it here.

What is Behavior-Driven Development (BDD)?

It is Test-Driven Development (TDD) practiced correctly; nothing more.

At the time, I wrote this in anger, for reasons that I’m too tired to get into just now (it is 4:30 AM on the last day of Agile 2006), but I wanted to share with you that my anger is changing to some more positive emotion regarding this topic.

The fact that BDD and TDD are equivalent—isomorphic, even—has its good points and bad points. I am unclear at the present moment whether the good outweigh the bad or the other way around.

What I dislike about the existence of two (perhaps three or more) different names for the same thing is that it can confuse people and divide them. Think of a single language written in two alphabets: while the speakers understand one another, they cannot read one another’s literature. I would hate to see that happen.

What I like about it is that we have two (perhaps three or more) standard approaches to explaining the technique that suit different audiences. To some, the word “test” resonates well, and to others, the words “behavior” or “example” resonate well. Rather than haphazardly sprinkling the word “behavior” into conversations about TDD, we can use an entire, cohesive vocabulary to explain TDD to someone who prefers to talk about behaviors over tests. I imagine this would help.

I would like to thank the people in room 2411 of the Hyatt Regency in Minneapolis for their willingness to participate in a spirited debate on this topic. It was tiring, and it was late, but I found it worth the effort.

Times have changed

In the time since I first wrote this article, BDD has evolved and my opinion of it has evolved as well. I now see how BDD ideas map well to the way I deliver features, complete with Feature Injection and the inner BDD design cycle. The BDD community have described how they set up a pull system for features, which I’ve been doing for years. As always seems the case, we had much more in common with one another than we originally thought!

Thanks to all the BDDers who have patiently worked with me on this unification, even when they didn’t know they were doing it: Dan North, Chris Matts, Olav Maassen, Aslak Hellesøy and Liz Keogh.

September 19, 2009 08:00 testing, agile, people, agile 2006, article

Part 4: Surely we need integration tests for the Mars rover!

Recently, “Guest” commented about my Agile 2009 tutorial, Integration Tests Are A Scam. “Guest” wrote this:

A Mars rover mission failed because of a lack of integration tests. The parachute system was successfully tested. The system that detaches the parachute after the landing was successfully – but independently – tested. On Mars when the parachute successfully opened the deceleration “jerked” the lander, then the detachment system interpreted the jerking as a landing and successfully detached the parachute. Oops. Integration tests may be costly but they are absolutely necessary.

I don’t doubt the necessity of integration tests. I depend on them to solve difficult system-level problems. By contrast, I routinely see teams using them to detect unexpected consequences, and I don’t think we need them for that purpose. I prefer to use them to confirm an uneasy feeling that an unintended consequence lurks.

Let’s consider a clean implementation of the situation my commenter describes. I see this design, comprising the lander, the parachute, the detachment system, an accelerometer and an altimeter. A controller connects all these things together. Let’s look at the “code”, which I’ve written in a fantasy language that looks a little like Java/C# and a little like Ruby.

Ashley Moran has posted a working Ruby version of this example. If you speak Ruby, then I highly recommend looking at that example after you’ve read this.

Controller.initialize() {
  // Create the lander first, because the parachute needs it.
  accelerometer = Accelerometer.new()
  lander = Lander.new(accelerometer, Altimeter.new())
  parachute = Parachute.new(lander)
  detachment_system = DetachmentSystem.new(parachute)
  accelerometer.add_observer(detachment_system)
}
          
Parachute {
  needs a lander
  
  open() {
    lander.decelerate()
  }
  
  detach() {
    if (lander.has_landed == false)
      raise "You broke the lander, idiot."
  }
}
                        
AccelerationObserver is a role {
  handle_acceleration_report(acceleration) {
    raise "Subclass responsibility"
  }
}
                        
DetachmentSystem acts as AccelerationObserver {
  needs a parachute
  
  handle_acceleration_report(acceleration) {
    if (acceleration <= -50.ms2) {
      parachute.detach()
    }
  }
}
 
Accelerometer acts as Observable {
  manages many acceleration_observers
                                    
  report_acceleration(acceleration) {
    acceleration_observers.each() {
      each.handle_acceleration_report(acceleration)
    }
  }
}
 
Lander {
  needs an accelerometer
  needs an altimeter
  
  decelerate() {
    // I know how much to decelerate by
    accelerometer.report_acceleration(how_much)
  }
}
 

I need to test what happens when I open the parachute. The lander should decelerate.

testOpenParachute() {
  parachute = Parachute.new(lander = mock(Lander))
  lander.expects().decelerate()
  
  parachute.open()
}
 
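For readers who would rather run something, here is a minimal Python sketch of the same collaboration test using the standard library's unittest.mock. The Parachute class below is a stand-in written for this sketch, not the original implementation. Note one difference from the fantasy language: unittest.mock verifies the expectation after the call rather than declaring it up front.

```python
from unittest.mock import Mock


class Parachute:
    """Python stand-in for the Parachute above; it needs a lander."""

    def __init__(self, lander):
        self.lander = lander

    def open(self):
        # Opening the parachute tells the lander to decelerate.
        self.lander.decelerate()


def test_open_parachute():
    lander = Mock(spec=["decelerate"])  # plays the Lander role
    parachute = Parachute(lander)

    parachute.open()

    # Verify the expectation after the fact.
    lander.decelerate.assert_called_once_with()


test_open_parachute()
```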

Since this test expects the lander to decelerate, I have to test that. When the lander decelerates, the accelerometer should report its deceleration.

testLanderDecelerates() {
  accelerometer = mock(Accelerometer)
  lander = Lander.new(accelerometer)
  accelerometer.expects().report_acceleration(-50.ms2)
  
  lander.decelerate()
}
 

Since this test shows that the accelerometer can report acceleration of -50 m/s², I have to test that.

testAccelerometerCanReportRapidAcceleration() {
  accelerometer = Accelerometer.new()
  accelerometer.add_observer(observer = mock(AccelerationObserver))
  observer.expects().handle_acceleration_report(-50.ms2)
  
  accelerometer.report_acceleration(-50.ms2)
}
 

Since this test shows that any acceleration observer must be prepared to handle an acceleration report of -50 m/s², I have to test that.

First, the general test for the contract of the interface:

AccelerationObserverTest {
  testAccelerationObserverCanHandleRapidAcceleration() {
    observer = create_acceleration_observer() // subclass responsibility
    this_block {
      observer.handle_acceleration_report(-50.ms2)
    }.should execute_without_incident
  }
}
 

Now the test for DetachmentSystem, which acts as an AccelerationObserver. What should it do if it detects such sudden deceleration? It should detach the parachute.

DetachmentSystemTest extends AccelerationObserverTest {
  // I inherit testAccelerationObserverCanHandleRapidAcceleration()
  
  create_acceleration_observer() {
    detachment_system = DetachmentSystem.new(parachute = mock(Parachute))
    parachute.expects().detach()
    detachment_system // hand the observer back to the inherited test
  }
}
 
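The contract-test arrangement can be sketched in Python's unittest like this. It assumes one common way to share contract tests: a plain mixin class holds the inherited test, and each implementation's test case supplies the factory method. The class names mirror the listings above, but the Python code itself is illustrative only.

```python
import unittest
from unittest.mock import Mock


class DetachmentSystem:
    def __init__(self, parachute):
        self.parachute = parachute

    def handle_acceleration_report(self, acceleration):
        if acceleration <= -50:  # m/s^2, threshold from the listings above
            self.parachute.detach()


class AccelerationObserverContract:
    """Contract tests that every AccelerationObserver must pass."""

    def create_acceleration_observer(self):
        raise NotImplementedError("subclass responsibility")

    def test_can_handle_rapid_acceleration(self):
        observer = self.create_acceleration_observer()
        observer.handle_acceleration_report(-50)  # must not raise


class DetachmentSystemTest(AccelerationObserverContract, unittest.TestCase):
    def create_acceleration_observer(self):
        self.parachute = Mock(spec=["detach"])
        return DetachmentSystem(self.parachute)

    def test_detaches_on_rapid_deceleration(self):
        self.create_acceleration_observer().handle_acceleration_report(-50)
        self.parachute.detach.assert_called_once_with()


suite = unittest.defaultTestLoader.loadTestsFromTestCase(DetachmentSystemTest)
result = unittest.TextTestRunner().run(suite)
```

Because the mixin does not extend unittest.TestCase, the runner never tries to execute the contract on its own; it runs only through concrete subclasses.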

You might find that easier to read this way, by inlining the method create_acceleration_observer():

DetachmentSystemTest {
  testRespondsToRapidAcceleration() {
    detachment_system = DetachmentSystem.new(parachute = mock(Parachute))
    parachute.expects().detach()
    this_block {
      detachment_system.handle_acceleration_report(-50.ms2)
    }.should execute_without_incident
  }
}
 

Since this test expects the parachute to be able to detach, I have to test that. Now, detaching only works if we’ve landed. (I’ve simplified on purpose. Suppose the parachute can’t survive a drop from any height. It’s easy to add that detail in later.)

ParachuteTest {
  testDetachingWhileLanded() {
    parachute = Parachute.new(lander = mock(Lander))
    lander.stubs().has_landed().to_return(true)
    this_block {
      parachute.detach()
    }.should execute_without_incident
  }
  
  testDetachingWhileNotLanded() {
    parachute = Parachute.new(lander = mock(Lander))
    lander.stubs().has_landed().to_return(false)
    this_block {
      parachute.detach()
    }.should raise("You broke the lander, idiot.")
  }
}
 

Hm. I notice that parachute.detach() might fail. But I just wrote a test that uses parachute.detach() and doesn’t yet show how it handles that method failing. I have to test that.

DetachmentSystemTest {
  testRespondsToDetachFailing() {
    detachment_system = DetachmentSystem.new(parachute = mock(Parachute))
    parachute.stubs().detach().to_raise(AnyException)
 
    this_block {
      detachment_system.handle_acceleration_report(-50.ms2)
    }.should raise(AnyException)
  }
}
 
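In Python, stubbing a method to fail is a matter of setting side_effect on the mock. The sketch below assumes a hypothetical ParachuteFailure exception in place of AnyException, and a DetachmentSystem that does nothing to handle the failure, which is exactly what the test is meant to expose.

```python
from unittest.mock import Mock


class ParachuteFailure(Exception):
    """Hypothetical exception type standing in for AnyException."""


class DetachmentSystem:
    def __init__(self, parachute):
        self.parachute = parachute

    def handle_acceleration_report(self, acceleration):
        if acceleration <= -50:
            self.parachute.detach()  # any failure propagates to the caller


def test_responds_to_detach_failing():
    parachute = Mock(spec=["detach"])
    parachute.detach.side_effect = ParachuteFailure("not landed")
    detachment_system = DetachmentSystem(parachute)

    try:
        detachment_system.handle_acceleration_report(-50)
    except ParachuteFailure:
        return True  # the failure reached the caller
    return False


propagated = test_responds_to_detach_failing()
```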

Hm. So handling an acceleration report of -50 m/s² can fail. Who might issue such a report? The accelerometer. Since the detachment system doesn’t handle this failure, I have to test what the accelerometer does when issuing an acceleration report might fail.

testAccelerometerCanRespondToFailureWhenReportingAcceleration() {
  accelerometer = Accelerometer.new()
  accelerometer.add_observer(observer = mock(AccelerationObserver))
  observer.stubs().handle_acceleration_report().to_raise(AnyException)
 
  this_block {
    accelerometer.report_acceleration(-50.ms2)
  }.should raise(AnyException)
}
 

It turns out that the accelerometer might fail when reporting acceleration of -50 m/s². When might it do that? When the lander decelerates. What happens then?

testLanderDeceleratesRespondsToFailure() {
  accelerometer = mock(Accelerometer)
  lander = Lander.new(accelerometer)
  accelerometer.stubs().report_acceleration().to_raise(AnyException)
 
  this_block {
    lander.decelerate()
  }.should raise(AnyException)
}
 

Hm. So decelerating could fail! All right, who causes the lander to decelerate? That code might fail. Oh yes… the parachute opening!

testOpenParachuteRespondsToFailure() {
  parachute = Parachute.new(lander = mock(Lander))
  lander.stubs().decelerate().to_raise(AnyException)
  
  this_block {
    parachute.open()
  }.should raise(AnyException)
}
 

So opening the parachute could fail! We probably want to nail down when that happens. We have a test that shows us when:

testDetachingWhileNotLanded() {
  parachute = Parachute.new(lander = mock(Lander))
  lander.stubs().has_landed().to_return(false)
  this_block {
    parachute.detach()
  }.should raise("You broke the lander, idiot.")
}
 

So the parachute opening could cause it to detach because the lander hasn’t landed yet. I don’t know about you, but I think the parachute provides the most value when it helps the lander land, and not once it has landed. That tells me that someone, somewhere needs to handle the exception that detach() would raise, or at least prevent detach() from happening while the altimeter reads above a few meters off the ground.

testDoNotDetachWhenTheLanderIsTooHighUp() {
  altimeter = mock(Altimeter)
  altimeter.stubs().altitude().to_return(5.m)
  
  detachment_system = DetachmentSystem.new(parachute = mock(Parachute))
  parachute.expects(no_invocations_of).detach()
  
  detachment_system.handle_acceleration_report(-50.ms2)
  
  // ???
}
 

In writing this test, I see that in order to stop the detachment system from telling the parachute to detach, it needs access to the altimeter.

Integration problem detected. When I wire the detachment system up to the altimeter, even the collaboration test shows how to ensure that the parachute doesn’t detach in this kind of dangerous situation.

testDoNotDetachWhenTheLanderIsTooHighUp() {
  detachment_system = DetachmentSystem.new(parachute = mock(Parachute), altimeter = mock(Altimeter))
  altimeter.stubs().altitude().to_return(5.m)
  parachute.expects(no_invocations_of).detach()
  
  detachment_system.handle_acceleration_report(-50.ms2)
}
 

This means I have to add the following production behavior.

DetachmentSystem acts as AccelerationObserver {
  needs a parachute
  needs an altimeter // NEW!
  
  handle_acceleration_report(acceleration) {
    if (acceleration <= -50.ms2 and altimeter.altitude() < 5.m) {
      parachute.detach()
    }
  }
}
 
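As a runnable approximation, the final DetachmentSystem and both sides of its altitude check might look like this in Python. The thresholds come from the listing above; the constant names and the make_system helper are made up for this sketch.

```python
from unittest.mock import Mock

RAPID_DECELERATION = -50  # m/s^2, threshold from the listing above
SAFE_ALTITUDE = 5         # metres, threshold from the listing above


class DetachmentSystem:
    def __init__(self, parachute, altimeter):
        self.parachute = parachute
        self.altimeter = altimeter

    def handle_acceleration_report(self, acceleration):
        near_ground = self.altimeter.altitude() < SAFE_ALTITUDE
        if acceleration <= RAPID_DECELERATION and near_ground:
            self.parachute.detach()


def make_system(altitude):
    """Hypothetical helper: wire a DetachmentSystem to mocks."""
    parachute = Mock(spec=["detach"])
    altimeter = Mock(spec=["altitude"])
    altimeter.altitude.return_value = altitude
    return DetachmentSystem(parachute, altimeter), parachute


# Too high up: the jerk from the parachute opening must not detach it.
system, parachute = make_system(altitude=5)
system.handle_acceleration_report(-50)
parachute.detach.assert_not_called()

# Close to the ground: the same report detaches the parachute.
system, parachute = make_system(altitude=1)
system.handle_acceleration_report(-50)
parachute.detach.assert_called_once_with()
```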

Integration problem solved with no integration tests. Instead, I have a bunch of collaboration tests, one important contract test, and a systematic approach to choosing the next test, which I describe below. (I originally wrote “the ability to notice things” here.) Any questions?

Dan Fabulich rightly jumped on me for using the phrase “an ability to notice things” just a little earlier in this article. I chose that phrase lazily because I didn’t want to patronize you by writing, “an ability to perform basic reasoning”. Oops. I thought about how I choose the next test, and I decided to take the time to include that here. Enjoy.

In this example, I used no magic to choose the next test, but rather some fundamental reasoning.

Every time I say “I need a thing to do X” I introduce an interface. In my current test, I end up stubbing or mocking one of those interfaces.

(See A sign you’re mocking too much for more about when I avoid interfaces and when I routinely create them.)

Every time I stub a method, I make an assumption about what values that method can return. To check that assumption, I have to write a test that expects the return value I’ve just stubbed. I use only basic logic there: if A depends on B returning x, then I have to know that B can return x, so I have to write a test for that.

Every time I mock a method, I make an assumption about a service the interface provides. To check that assumption, I have to write a test that tries to invoke that method with the parameters I just expected. Again, I use only basic logic there: if A causes B to invoke c(d, e, f) then I have to know that I’ve tested what happens when B invokes c(d, e, f), so I have to write a test for that.

Every time I introduce a method on an interface, I make a decision about its behavior, which forms the contract of that method. To justify that decision, I have to write tests that help me implement that behavior correctly whenever I implement that interface. I write contract tests for that. Once again, I use only basic logic there: if A claims to be able to do c(d, e, f) with outcomes x, y, and z, then when B implements A, it must be able to do c(d, e, f) with outcomes x, y, and z (and possibly other non-destructive outcomes).
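The stubbing rule can be illustrated with a small Python sketch. The Lander below is hypothetical: it assumes that the lander decides whether it has landed by reading its altimeter, which the article does not specify. The point is only the pairing of the two checks: one stubs has_landed() to return False, and the other shows that a real Lander can return that value.

```python
from unittest.mock import Mock


class Lander:
    """Hypothetical Lander: treats zero altitude as 'landed'."""

    def __init__(self, altimeter):
        self.altimeter = altimeter

    def has_landed(self):
        return self.altimeter.altitude() <= 0


class Parachute:
    def __init__(self, lander):
        self.lander = lander

    def detach(self):
        if not self.lander.has_landed():
            raise RuntimeError("Detached before landing")


# Collaboration test: Parachute (A) assumes the lander (B) can answer False.
lander = Mock(spec=["has_landed"])
lander.has_landed.return_value = False  # the stubbed assumption
try:
    Parachute(lander).detach()
    detach_failed = False
except RuntimeError:
    detach_failed = True

# Matching test: a real Lander can indeed return the value we stubbed.
altimeter = Mock(spec=["altitude"])
altimeter.altitude.return_value = 100
lander_can_report_not_landed = Lander(altimeter).has_landed() is False
```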

I simply kept applying these points over and over again until I stopped needing tests. Along the way, I found a problem and fixed it before it left my hands.

If I can describe the steps well enough for others to follow – and I posit I’ve just done that here – then I don’t agree to labeling it “magic”.

Interlude: Basic Correctness

I tried to respond to a comment from James Bach, but surpassed the 3000-character limit, so I’ve decided to add this long comment as a short article.

James Bach: I don’t understand the point of calling a failure discovered by running a test “unjustifiable.” Let me offer you a justification: I WANT TO FIND BUGS. :)

J. B. Rainsberger: Now now, changing the definition of a term on me constitutes dirty pool. Stop that! :)

James Bach: When you say ‘Presumably, we have tests that intend to test step 2, which justifiably fail’ I would say that sounds like a dangerous presumption. Just because we write a test, and that test has a purpose, does not mean the test achieves its purpose. In fact, as far as we know a test NEVER achieves its deeper purpose of finding all possible interesting bugs in the thing it is testing. Of course, when I test, I want to find all interesting bugs, and of course, I will never know that I have found all of them worth finding.

J. B. Rainsberger: I think you’ve taken my specific definition of “justifiable failure” and extended it past how I chose to use the term. When I write a test that fails because of a defect in the code that test intends to focus on, then I call that failure justifiable. All other failures are not justifiable. As a result, the statement you zeroed in on appears tautological to me… except my sloppy use of “presumably”. Let me clarify what I left unexpressed. Let’s assume that we have used integration tests to test the five-step process minimally broadly (in other words, we have written tests to find at least all basic correctness defects). That means that we’ve written some tests specifically to check step 2. Suppose now the existence of a defect in only step 2. When our test for step 4 fails because of the defect in step 2, the tests for step 2 fail justifiably, while the tests for step 4 fail unjustifiably. I haven’t yet turned my attention to latent and lurking defects, for the reason I provide at the end of this comment.

James Bach: … That’s why I use a diversified test strategy. It seems to me that complicated integration tests that cover ground also covered in other tests is a reasonable strategy, as long as it is not too expensive to produce or maintain. There is a cost/benefit that must be weighed against an opportunity cost, of course.

J. B. Rainsberger: I agree; however, I see too many programmers (in particular) using integration tests to help them find or avoid defects that much less expensive isolated object tests would better help them find or avoid. I center my entire argument on the thesis that too many programmers use these tests in a way that leads to considerable wasted effort in maintenance.

James Bach: So, perhaps you are talking about a restricted context where it’s not worth the effort of testing a particular function indirectly. Maybe not, but I bet it’s worth considering that sort of testing. Personally, I like automation that touches a lot of things in a lot of places, as long as I can create and maintain it without too much disruption of my sapient testing.

J. B. Rainsberger: Indeed so! I have wanted to reveal these points gradually so as to avoid writing 20,000 words at once, but I limit the arguments in this series to a specific context: programmers writing tests to show the basic correctness of their code. By basic correctness I refer to the myth of perfect technology: if I ran the system on perfect technology, would it (eventually) compute the right answer every time? I would call such a system entirely basically correct. While integration tests offer value in other contexts, too many programmers use them to show basic correctness, and when they do that they waste a tremendous amount of time and effort.
