Fred Brooks’s law that “adding manpower to a late software project makes it later” is one most of us have tried to prove wrong… and failed! I was at Agile 2008 and saw an interesting session, “Breaking Brooks’s Law,” from Menlo Innovations, a Michigan-based Java development company. They claimed to have disproved this law, and demonstrated the working environment and techniques that allowed them to do so. Although the presentation was only 45 minutes, we were in the room for almost two hours asking questions to determine how robust their techniques were and to gain more insight into the conditions their developers work under.

Menlo’s results are based on a three-year project where the customer had a deadline to demonstrate the product at a show. More features were required for the show than were currently in the plan, so rather than re-prioritize, Menlo decided to add more developers to complete the work. They managed to deliver the project on time with all of the added functionality.

The environment at Menlo is quite unique. All developers are co-located in one large room (no offices or cubes) and pair program 100% of the time; they follow strict XP practices. A scheduling team determines which projects developers work on and who they pair with on a weekly basis, so developers work with different team members, and possibly on different projects, every week. Also, as part of the contract, the customer comes to Menlo every week to prioritize the work for the next sprint.

These techniques may appear somewhat draconian (100% pairing, for example). I managed to catch up with the team and interview them to discuss the project further, including bug rates, staff attrition rates, and how project managers can make the case for pairing to senior management and directors (see video).

I thoroughly enjoyed talking with the team from Menlo, and they invite anyone passing by to stop in and take a look at how they operate. Their interview process involves a large number of candidates performing a number of tasks, including pair programming; with an appointment, you can observe this too. A detailed paper about their techniques and their contact details are here.
If you’ve been reading here for a while, you will know by now that we have been doing a lot of number-crunching at Enerjy lately. We’re on a quest to find a correlation between metrics and defect rates, and to answer the question: “what metrics are good predictors of bugginess?”
There is a different but, in a way, very similar quest going on at Netflix, where, for the past 13 months or so, they have been running a competition to try to improve their recommendation engine. That’s the software that figures out that, if I gave Bambi five stars, I would probably also enjoy watching Reservoir Dogs. Clearly, improving the recommendation engine is important to Netflix: back in October 2006 they offered up $1m as a prize to anyone who could improve their current algorithm by 10%. Despite the best efforts of the nearly 24,000 teams working on the problem, no one has achieved the 10% improvement yet, and as of now (December 2007) the best improvement stands at 8.5%.
But the job of a recommendation engine is very similar to what we are doing here. We’re analyzing tens of thousands of source code files, collecting data on a couple of hundred metrics for each one, and then statistically correlating those to the number of defects that were found in each of those files. The good news is that we’ve found some strong correlations between certain combinations of metrics and defect rates, which allow us to make a pretty accurate prediction of whether the code we are looking at is going to be bug-prone or not. We’ll be launching a product based on that analysis in the new year. Once we’re done with that, maybe we’ll take a crack at the Netflix Prize…
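For the curious, here is a minimal sketch of the kind of calculation involved: a Pearson correlation between one metric and per-file defect counts, in Java. The arrays are hypothetical stand-ins for the real data, and the actual analysis combines a couple of hundred metrics rather than one:

```java
// Minimal sketch: Pearson correlation between one code metric and the
// defect count per file. The sample values below are invented.
public class MetricCorrelation {
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double sumX = 0, sumY = 0, sumXY = 0, sumX2 = 0, sumY2 = 0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
            sumXY += x[i] * y[i];
            sumX2 += x[i] * x[i];
            sumY2 += y[i] * y[i];
        }
        double cov  = n * sumXY - sumX * sumY;
        double varX = n * sumX2 - sumX * sumX;
        double varY = n * sumY2 - sumY * sumY;
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        // e.g. cyclomatic complexity per file vs. defects found in that file
        double[] complexity = {3, 12, 7, 25, 4, 18};
        double[] defects    = {0,  4, 1,  9, 0,  5};
        System.out.printf("r = %.3f%n", pearson(complexity, defects));
    }
}
```

A coefficient near +1 says that the files scoring high on the metric also tend to be the buggy ones.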
By the way, if you’re interested in reading more about number crunching, and some of its applications in everyday life, I can highly recommend Super Crunchers by Ian Ayres. Fascinating stuff.
In previous postings, I’ve been talking about code quality and coding standards – writing code right. This month I’d like to talk about the other side of successful product development: writing the right code.
Pretty much my entire career has been in the area of developing shrink-wrapped software and I’ve been behind the creation of around twenty new products. For reasons that I’ll talk more about in coming months, data mining is an area of research that I’m very interested in right now, so I thought it would be interesting to see if data mining is useful as a means of predicting the success of a new product. Obviously, twenty products is a small dataset and I’m not sure that we’ll be significantly redesigning our innovation process based on these results, but I think there are some general conclusions we can draw.
This analysis looks only at products that were offered for sale; I haven’t included products that never made it out of R&D. For each product, I recorded three key characteristics and the ultimate outcome. Each outcome is one of Successful, Marginal, or Failed. A Successful product is one that you can build a business on: it is profitable even after including all the overheads of running an (appropriately sized) business. A Marginal product is still profitable and would make a valuable addition to a suite, but it isn’t strong enough to stand on its own. Failed products were not able to cover the costs of developing and selling them.
The characteristics I measured for each product were the source of the product idea, whether we used an external design partner, and whether the product was in a new area for us. The source was either Internal or External. Internal ideas are those that were generated purely within the company. External ideas originally came from a customer or prospect. A design partner is a customer or prospect who worked with us throughout the product development. Finally, I defined a new area to be an area in which we did not yet have a successful product: it may be a totally new area or an area in which we had introduced a marginal or even a failed product.
A golden rule of data mining is to start with the simplest model you can, since often adding additional complexity to the model has very little impact on the accuracy. A good starting point is the 1R (for 1-Rule) algorithm that attempts to predict the outcome based on a single characteristic. That didn’t work too well for our data, with all characteristics having an error rate of just below 50%. For the record, though, if forced to make a prediction based on a single characteristic, the best guess is that products in a new area will be successful whereas products in an existing area will be marginal.
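Here is a minimal sketch of the 1R idea in Java, using invented rows rather than the real product data: for each characteristic, predict the majority outcome for each of its values, and keep whichever characteristic misclassifies the fewest products:

```java
import java.util.*;

// Minimal 1R (OneR) sketch over nominal attributes. The attribute names
// and rows are invented for illustration only.
public class OneRule {
    public static void main(String[] args) {
        String[] attrs = {"source", "designPartner", "newArea"};
        String[][] rows = {            // source, designPartner, newArea
            {"internal", "yes", "yes"},
            {"external", "no",  "yes"},
            {"internal", "no",  "no"},
            {"external", "yes", "no"},
        };
        String[] outcomes = {"successful", "marginal", "failed", "successful"};

        int bestAttr = -1, bestErrors = Integer.MAX_VALUE;
        for (int a = 0; a < attrs.length; a++) {
            // Count outcome frequencies for each value of this attribute.
            Map<String, Map<String, Integer>> counts = new HashMap<>();
            for (int r = 0; r < rows.length; r++) {
                counts.computeIfAbsent(rows[r][a], k -> new HashMap<>())
                      .merge(outcomes[r], 1, Integer::sum);
            }
            // Rule: predict the majority outcome for each value; the error
            // is every row that isn't in the majority.
            int errors = 0;
            for (Map<String, Integer> byOutcome : counts.values()) {
                int total = byOutcome.values().stream()
                                     .mapToInt(Integer::intValue).sum();
                errors += total - Collections.max(byOutcome.values());
            }
            if (errors < bestErrors) { bestErrors = errors; bestAttr = a; }
        }
        System.out.println("Best single-attribute rule uses: " + attrs[bestAttr]
                + " (" + bestErrors + "/" + rows.length + " errors)");
    }
}
```

On this toy data the design-partner column happens to win; the point is only to show the mechanics, not to reproduce my results.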
The next step up in complexity is the Naïve Bayes algorithm. This is similar to the technique used by some spam filters to identify spam email based on the words in the message. For each combination of characteristics, the Bayesian model generates probabilities for the outcome of the resulting product. While I can’t share the raw data (many of the products are still being sold), here are the numbers from the Bayesian analysis.
For products in a new area:
| | Successful (%) | Marginal (%) | Failed (%) |
|---|---|---|---|
| Internal, no DP | 28 | 28 | 44 |
| External, no DP | 29 | 40 | 31 |
For products in an existing area:
| | Successful (%) | Marginal (%) | Failed (%) |
|---|---|---|---|
| Internal, no DP | 19 | 61 | 20 |
| External, no DP | 17 | 71 | 12 |
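To make the scoring mechanics concrete, here is a minimal Naive Bayes sketch in Java. The priors and conditional probabilities below are invented for illustration; they are not the trained values behind the tables above. The model multiplies the class prior by the per-characteristic conditionals, then normalizes:

```java
// Minimal Naive Bayes sketch for the three product characteristics.
// All probabilities here are hypothetical placeholders.
public class NaiveBayesSketch {
    public static void main(String[] args) {
        String[] classes = {"successful", "marginal", "failed"};
        double[] prior   = {0.35, 0.40, 0.25};   // P(outcome), hypothetical

        // Hypothetical P(characteristic | outcome), indexed like `classes`:
        double[] pExternal = {0.5, 0.6, 0.4};    // idea source = external
        double[] pPartner  = {0.7, 0.3, 0.2};    // design partner used
        double[] pNewArea  = {0.5, 0.3, 0.8};    // product in a new area

        // Score an external idea, with a design partner, in a new area.
        double[] score = new double[classes.length];
        double total = 0;
        for (int c = 0; c < classes.length; c++) {
            score[c] = prior[c] * pExternal[c] * pPartner[c] * pNewArea[c];
            total += score[c];
        }
        for (int c = 0; c < classes.length; c++) {
            // Normalize so the scores read as percentages, like the tables.
            System.out.printf("%s: %.0f%%%n", classes[c], 100 * score[c] / total);
        }
    }
}
```

A real model would also need smoothing to cope with characteristic/outcome combinations that never occur in a dataset this small, which I’ve glossed over here.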
The Bayesian model has one immediate result: for any combination of the other product characteristics, using a design partner roughly doubles the likelihood that the product will be successful. It also significantly reduces, sometimes as much as halving, the likelihood that the product will be marginal or a failure.
There is less difference than you might expect between internal and external ideas. In a new area, both internal and external ideas have an essentially equal chance of success; external ideas are, however, 10-15% more likely to end up marginal, and correspondingly less likely to fail outright. In an existing area, internal ideas are more likely to succeed or to fail, whereas external ideas tend to be marginal. That makes a lot of sense, since customer ideas are often variations on what we already have, and thus likely to be successful but not outstandingly so. Internal ideas in an area we know well are likely to be more innovative, trading a slightly higher chance of failure for a higher chance of success.
So what do we make of the influence of design partners? My feeling is that the design partner’s actual contribution is not the cause of the product’s success: we certainly build better products when a design partner helps us, but I’m not sure that the difference is enough to account for a different outcome for the product. No, my view is that the very fact that we can find a design partner willing to work with us on the product shows that the product is addressing a problem that companies are prepared to invest time in solving.
That means that if we aren’t able to find a design partner for a new product, we know we’re taking a gamble: internal ideas in a new area have a 45% chance of failure with no design partner. However, we can’t solve that by, for example, offering money to a potential design partner, since it’s their very investment that validates the product.
On the subject of failure, our overall failure rate is around 25%. Not surprisingly, it varies significantly between products in new areas (35%) and existing areas (10%). I admit to being a little embarrassed by that number: it’s way too low. I remember one talk where Seth Godin, the former VP of Permission Marketing for Yahoo, revealed his failure rate was up in the 80% range. Nowadays it’s so cheap to bring a product to market (see Guy Kawasaki’s post where he creates and launches a Web 2.0 site for $12,107.09) that there’s no reason not to try more things. Sure there’ll be more failures, but there will be successes too.
One of my favorite business quotes of all time is from Herb Kelleher, former CEO of Southwest Airlines. He said “We have a ‘strategic plan.’ It’s called doing things.” (Herb Kelleher, quoted by Tom Peters – (PowerPoint link)). Given that Southwest is the only major airline to remain profitable after 9/11, that plan seems to be working out very nicely for them.
If you’re interested in becoming a design partner for code quality tools and you’re doing Java development using Eclipse, I’m interested in hearing from you. Drop me a line (mark underscore dixon at enerjy dot com) and I’ll follow up with you with more information on what we’re looking for. But, like I said, I’m afraid we won’t be paying you!