Monday, June 13, 2011

The Doctor Is In...

One of the tenets of agile development is continuous integration, which helps minimize the time spent fixing the build when a committed change breaks it. We have had an automated build process for a few years now; our SCM engineer spent countless hours on the phone with the guys over at Urban Code to make sure we had everything set up just how we wanted.

The last step was to figure out how we would be notified that a build failed. We were given the option of an email or an IM whenever there was a build failure. The team immediately nixed the idea of an email notification. I mean, don't we all get enough email throughout the day as it is? Who wants another email filling up an already bloated inbox? So we chose the IM route.

Everything was going smoothly: code was being committed to source control; builds were being kicked off upon a commit; and any build that failed usually got fixed in less than an hour due to the IM notification from the build server that was sent to the whole team. A typical build history would look something like this:

Over time, the team started to ignore the IM about the failure; some people had actually stopped signing into IM just so they wouldn't be "distracted" by the failure notification. Kind of defeated the purpose of having continuous integration in the first place, don't you think?

As a result, the build was not getting fixed until either the SCM group noticed several failures in a row and emailed the team, or, even worse, until the QA portion of the team was trying to deploy a build to our test environment. People were now waiting on the team to get a good build when there were mechanisms in place to avoid this very scenario.

After one of these episodes, we would be more diligent about watching the build and fixing it as soon as possible if it failed again... for a while; then it would be back to the same old thing. The last straw was when I noticed the build had been failing for four or five days straight; there had to be a better way. In the days that followed, I came across an excerpt from the book 97 Things Every Programmer Should Know: Collective Wisdom from the Experts that had just what we needed:
"You need to give your project a voice. This can be done by email or instant messaging, informing the developers about the latest decline or improvement in numbers. But it's even more effective to embody the project in your office by using an extreme feedback device (XFD)."
After doing some additional research on extreme feedback devices, I checked whether our build tool had a web API I could use to look up the latest build status. Much to my delight, Urban Code included a relatively rich API that gave me exactly what I needed. The next step was to order some home automation gear from X10 and download the SDK. Before long, I was programmatically turning a light in the other room of my house on and off.

After reacquainting myself with some of the nuances of creating a Windows Service, and a few .NET classes later, I was in business. The only thing left to do was figure out what I would be switching off and on. From a lot of the posts I read, red and green lava lamps seemed to be the most popular: a glowing green lamp meant all was well; a red lamp starting to bubble meant the build was broken. I figured lava lamps wouldn't go over too well with our facilities management folks, so I opted for a large, annoying siren: large so it would be tough to ignore; annoying so we would want to fix the build as soon as possible to make it turn off.
After finding a nice central location for the siren and doing some additional testing to make sure it was within radio range of the X10 gear, we were good to go. Needless to say, the device received a fair amount of attention at first, along with feedback that ranged from "We should attach this to the ceiling" to "Can we get something that makes noise?" and just about everything in between.
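
For anyone curious about the shape of the thing, here is a minimal sketch of the polling logic in Python. To be clear, this is not the actual code: the real version was a .NET Windows Service talking to the Urban Code API and the X10 SDK, neither of which is shown here. The status strings ("failed", "success"), the two-minute default interval, and the `fetch_status` and `switch_siren` callables are all placeholders I made up for illustration; you would plug in whatever your build tool and hardware actually provide.

```python
import time
from typing import Callable, Optional


def siren_should_be_on(build_status: str) -> bool:
    """Return True when the latest build status means the build is broken.

    The "failed" status string is an assumption, not a real build-tool value.
    """
    return build_status.strip().lower() == "failed"


def check_once(
    fetch_status: Callable[[], str],
    switch_siren: Callable[[bool], None],
    siren_on: Optional[bool],
) -> bool:
    """Fetch the latest status and flip the siren only if its state must change."""
    wanted = siren_should_be_on(fetch_status())
    if wanted != siren_on:
        switch_siren(wanted)
    return wanted


def run(
    fetch_status: Callable[[], str],
    switch_siren: Callable[[bool], None],
    interval_seconds: float = 120.0,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll the build status on a fixed interval.

    fetch_status and switch_siren are hypothetical hooks: the first would wrap
    the build server's web API, the second the home automation SDK.
    max_checks exists only so the loop can be stopped in a test.
    """
    siren_on: Optional[bool] = None  # unknown until the first check
    checks = 0
    while max_checks is None or checks < max_checks:
        try:
            siren_on = check_once(fetch_status, switch_siren, siren_on)
        except Exception as exc:
            # A flaky network call should not kill the monitor.
            print(f"status check failed: {exc}")
        checks += 1
        if max_checks is None or checks < max_checks:
            sleep(interval_seconds)
```

Remembering the last siren state means the switch only gets a command when something actually changes, rather than being sent the same "on" or "off" on every check.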

The whole setup has been dubbed the "build doctor" (not to be confused with this one) as it makes house calls every couple of minutes to monitor the health of our builds. After word spread through the company that a flashing siren meant the build was broken, the effect has been increased visibility into the build, along with more accountability and attention to detail among the team when it comes to our build. The $60 or so to get the whole thing set up and running was a small price to pay for the short-term morale boost, more transparency, and increased accountability regarding the build status. It has also led to less waiting for a good build by our testers, which was the ultimate goal.

I'd be curious to know what others have done in the way of extreme feedback devices, and to hear any tips on motivating a team to maintain a high level of awareness of the build status.
