Glitch Watch - United flights delayed, canceled

June 29th, 2007 by Nigel Cheshire. Posted in Glitch Watch

Problems with the software that calculates the weight balance of aircraft before take-off caused a two-hour delay for United flights across their system last week. In all, 300 flights were delayed, and two dozen were canceled.

In other news, 3,000 veterans found themselves forced to pay accumulated unpaid medical co-payments after a software glitch in the Togus Veterans’ Affairs billing system meant that no bills were generated between February 2006 and March 2007.

Continuous integration buzz

June 28th, 2007 by Nigel Cheshire. Posted in Software Quality

It seems like everywhere I’ve turned this week, I’ve run into articles about Continuous Integration. Seems to me like this is another concept that, like static analysis, is really starting to pick up steam. First, I opened my copy of SD Times to read Andrew Binstock’s weekly column, which starts out:

One of the most important trends emerging in software development is the adoption of continuous integration. While developers, especially agile developers, have known of the benefits of continuous integration since the late 1990s, only now is it starting to gain real traction.

Binstock goes on to talk about how you can use your CI server to run unit tests, collect coverage data and run static analysis.

Then I read Jez Humble’s piece on the ThoughtWorks blog about CI. At heart, he says, CI tools are really timers that kick off the build process. But there’s more to CI than the build, Humble says, and he then goes on to detail six key things you should automate as part of the CI process. Worth a read.

Finally, if you are in the DC area, you might want to check out Stelligent’s CI War Stories session that they are holding at their office this evening. I heard that all over the country, people are already in line outside retail outlets, eagerly anticipating the June 29 release of Paul Duvall’s book Continuous Integration: Improving Software Quality and Reducing Risk. Oh wait, maybe that’s something else they are waiting in line for.

Top five reasons not to use static analysis

June 26th, 2007 by Mark Dixon. Posted in Coding Standards, Software Quality

We all know that finding bugs as soon as possible saves time and money. Static analysis is one of the best ways of finding bugs early, yet in my experience very few development teams have built it into their process. This post takes a look at five of the most common objections to using static analysis, and gives some suggestions for overcoming them.

1. “Static analysis is just for finding style mistakes, not real bugs.”

There are two problems with this objection. First, it’s no longer true. One of our customers identified 50 bugs the first time they used static analysis. The cost of those bugs reaching even QA would have exceeded the cost of bringing in static analysis. If any of those bugs had reached production, the costs would have been even higher. At another customer site, we started the day by presenting a brief summary of static analysis results to the development team. Before we left that afternoon, over 20 bugs had been fixed in the code.

There is an ongoing debate in the academic community on the effectiveness of static analysis for identifying bugs. My view is that static analysis is fundamentally limited in its ability to detect bugs. Static analysis can go some way towards identifying whether or not your code reliably implements the algorithm you’ve coded. It has no way of telling whether that algorithm is appropriate for solving the problem at hand. However, even if static analysis can find only some of your bugs, say one in 10, why on earth would you not use it and free yourself to concentrate on finding the hard ones?

One of my favorite analogies here is with medical practice. There is evidence that maybe thousands of lives could be saved every year if hospital staff simply washed their hands more often. Washing your hands won’t cure cancer or AIDS. People will still get sick. But if you have a simple measure that can solve part of a problem it’s stupid to not use it.

Having said all that, just suppose that static analysis couldn’t find bugs, and could only ensure that your code was consistent with your style guidelines. How valuable would that be? Think about why you have style guidelines in the first place. There may be some practices in there that are designed to prevent bugs, but the primary purpose is to make sure that all of the code in the project looks the same. That way anyone on your team can look at any code in your project and at least have some signposts to help them find their way around. In an often-quoted article, Peter Hallam claims that professional developers spend 75-80% of their time understanding existing code. Anything you can do to make that process more efficient is likely to offer significant payback.

If you have only ever worked with responsible developers who instinctively use coding standards then this argument probably sounds rather forced. As someone who has worked with a developer who simply couldn’t get the hang of indenting his code consistently, I can’t begin to explain how hard it is to understand code when all your subconscious cues for detecting control flow have been destroyed. I remember another developer I worked with whose code was extremely robust but who didn’t use consistent coding standards. It always seemed to me that his code worked better than it ought to, and I found it very hard to maintain.

2. “It’s too noisy.”

A common reaction when talking to developers about static analysis tools is a rolling of the eyes and the recounting of the time they tried to use Lint on their project. It ran for hours and produced tens of thousands of warnings. No one had any idea what to do with the reams of data from the tool, so it went into a folder to be looked at some day and that was the end of the experiment.

The solution to this is to choose and use a tool that only generates data you are going to use. First, different static analysis tools have different philosophies on how to deal with false positives. A false positive is a situation where the tool reports a problem with the code, even though the code is actually correct. Typically, the more ambitious the rule, the greater the likelihood of false positives. For example, highlighting class names that don’t match some predefined template is straightforward and would never produce false positives. Identifying possible race conditions in a multi-threaded program is more valuable, but also much more likely to produce false positives.
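To make the simple end of that spectrum concrete, here is a minimal sketch of a class-name rule in Java. The upper-camel-case template and the names `ClassNameRule` and `isValid` are assumptions made for illustration, not the rule set of any particular tool. The check is a pure pattern match on the name, so it cannot disagree with the template it encodes.

```java
import java.util.regex.Pattern;

/** Hypothetical naming rule: class names must be UpperCamelCase. */
final class ClassNameRule {
    // Assumed template: one capital letter followed by letters or digits only.
    private static final Pattern TEMPLATE = Pattern.compile("[A-Z][A-Za-z0-9]*");

    /** Returns true when the simple class name matches the template. */
    static boolean isValid(String simpleClassName) {
        return TEMPLATE.matcher(simpleClassName).matches();
    }

    public static void main(String[] args) {
        // Report each sample name as conforming or as a violation.
        for (String name : new String[] {"OrderService", "order_service", "Http2Client"}) {
            System.out.println(name + " -> " + (isValid(name) ? "ok" : "violation"));
        }
    }
}
```

A race-condition rule has no equivalent of this one-line test, because it has to reason about how the code might behave at run time. That is where the false positives come from.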

In Enerjy Code Analyzer, we took the view that a static analysis tool is only useful if it is used constantly, so we excluded any rules that had significant likelihood of false positives. If we were writing a specialist analysis tool that perhaps would be used as part of an extensive QA process, then a different trade-off may have been appropriate. The key point is that you must match the false-positive rate of your tool to the time you have available to deal with the output.

Second, having selected a tool, you should configure it to produce only enough output to be useful to you. I describe how to do this in #5 below but, in brief, unless you’re going to take immediate action to deal with issues flagged by a particular analysis rule, turn the rule off. It’s very tempting to keep all the rules on as a todo list, but once that list becomes more than a few tens of items long, it’s too long to be useful.

3. “It’s inaccurate.”

This objection stems primarily from experience of using static analysis in the C++ world. C++ is a terrible language for tool vendors to handle: there are only a handful of people in the world capable of writing an accurate parser to read and understand C++ source files in all their template-ridden complexity. On top of this, you have to deal with the preprocessor transforming the visible source code into something quite different before it hits the compiler. Unless you give your static analysis tool the exact same set of preprocessor defines that your compiler is using, you can expect to be overwhelmed with spurious or plain incorrect warnings.

Thankfully, the designers of Java long ago decided to do away with the preprocessor, so static analysis in Java tends to be much more robust. As long as you get the source code compatibility level and class path correct, any tool will analyze the code correctly.

The best solution to this problem, though, is to use a static analysis tool that can read from the same build scripts (or project files, if you’re using an IDE) that you use to compile your code. Most Java tools support Eclipse and Ant integration. In the C++ world, at least on Windows, the Visual Lint and LintProject tools from Riverblade allow you to run the excellent Gimpel Lint tool using the exact same settings that Visual Studio is using.

4. “It takes too long.”

I am still baffled at the number of static analysis tools that need to be explicitly executed by the developer. Eclipse made great strides with its smart incremental compiler guaranteeing that your Java code is constantly fully compiled and up to date, yet the static analysis framework within TPTP needs to be invoked from a menu. Why would you want to immediately check your code for compilation errors, but only check for bugs or security problems when you happen to have a few minutes and remember to run static analysis?

A design goal for the Enerjy Code Analyzer from day one was that it had to run whenever you saved a file and display the results of the analysis along with the compiler output. The Eclipse Checkstyle plugin also runs just like the compiler, analyzing every file as soon as it is saved and showing the results in the problems view. No commands to invoke. No opening a new view to look at the results. Some static analysis tools even require switching to a different perspective in Eclipse to view the results!

Developers are busy and the only way to make static analysis work is to integrate it completely into their existing workflow. If you treat a static analysis violation just the same way you would treat a compiler warning there is no overhead to using static analysis. And if your existing tool won’t let you work that way, find one that will.

5. “I’m in the middle of a project right now - I’ll use it for my next project.”

This is related to objection #2. Sure, I’ll write clean code next time, but for now I need to deal with an existing code base that wasn’t written with static analysis in place and contains too many violations to manage.

The solution is the same. The only way to use static analysis is with a zero-tolerance policy. Your entire project must build with no static analysis warnings at all times. That means that you must use a tool that supports escapements, i.e., the ability to suppress messages from the static analysis tool in situations where you know the code is correct. This is usually achieved by inserting magic comments or, in Java, annotations.
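As a rough illustration of both styles, the sketch below uses the standard Java `@SuppressWarnings` annotation and a trailing `// NOPMD` comment, which is the marker PMD recognises. The class and method names are hypothetical. Which marker your own tool honours, and which rule it silences, is an assumption to check against that tool's documentation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical adapter around older code, used only to show suppression markers. */
final class LegacyAdapter {
    /**
     * The cast is unchecked, but this (assumed) caller only ever stores strings,
     * so the warning is suppressed on the narrowest scope and the reason is recorded.
     */
    @SuppressWarnings("unchecked")
    static List<String> asStrings(List<?> legacy) {
        return (List<String>) legacy;
    }

    /** Parses a number, falling back to a default when the text is malformed. */
    static int parseOrDefault(String text, int fallback) {
        int value = fallback;
        try {
            value = Integer.parseInt(text.trim());
        } catch (NumberFormatException e) { // NOPMD - malformed input is expected; keep the fallback
        }
        return value;
    }

    public static void main(String[] args) {
        List<Object> raw = new ArrayList<>();
        raw.add("alpha");
        System.out.println(asStrings(raw).get(0));
        System.out.println(parseOrDefault("n/a", -1));
    }
}
```

Either way, the suppression sits next to the code it excuses and carries a reason, so the next reader can judge whether it still holds.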

Some static analysis is more valuable than no static analysis. Most tools have a severity setting, so try enabling only the most serious violations and then running the tool against your code. The chances are it will find few, if any, violations. The most serious violations typically correspond to bugs, and your existing unit and functional tests have probably flushed out those bugs already.

If there are still too many violations, then there’s probably a mismatch between your coding style and the default settings for the tool. Next project you can review your coding standards, but for now simply disable any rules that are too noisy to deal with. Remember, the goal is to get static analysis up and running on the project that you have today.

Once you get the number of violations down to a manageable number, take the time to check out any messages and resolve them. Typically this may show up a few bugs but in most cases the solution will be to mark up the source code to suppress incorrect warnings. That’s not a bad thing in itself; if the tool is confused by the code then the next maintenance programmer probably will be too, and they’ll thank you for the explanatory comment.

You now have a static analysis safety net for your project. You may even want to round out the rule set over time; it’s usually easy to deal with the number of warnings introduced by enabling one new rule at a time. Even if you don’t, I guarantee that you’ll be as surprised and happy as I am whenever my static analysis tool reminds me that I’m about to run code containing an infinite loop. Or that I never use the value I’ve just written hundreds of lines of code to compute…
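For readers who have not run into those two slips, here is a small Java sketch of where they hide. The `Totals` class is made up for illustration, and the code shown is the correct version, with comments marking the one-line changes that would introduce each problem. Whether a given tool flags them depends on its rule set.

```java
/** Hypothetical helper showing where an infinite loop or an unused result can creep in. */
final class Totals {
    /** Sums the values in the array. */
    static int sum(int[] values) {
        int total = 0;
        int i = 0;
        while (i < values.length) {
            total += values[i];
            i++; // drop this increment and the loop condition never changes
        }
        return total; // return 0 here by mistake and 'total' is computed but never used
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3}));
    }
}
```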

Better Software Conference roundup

June 22nd, 2007 by Rich Sharpe. Posted in Software Quality

I’m at the Better Software Conference in Las Vegas this week, where one of the key focal points has been managing people and the culture of the organization rather than just the software itself.

Jeffery Payne (CEO, Cigital Inc.) discussed, in his keynote, how the ‘Measurement Crisis’ is hurting business executives: their decisions are based on measurements from all parts of the company (NPV, ROI, marketing, etc.), but comparable measurements are still lacking for software development.

Payne believes this is in part due to many software engineers being mathematicians and feeling uncomfortable reporting low numbers (for justification reasons), or metrics that do not compare or correlate easily. Personally, I recall a few years ago being in a meeting where the marketing manager reported our Web site’s click-through rate was 3.1%. I thought this was an amazingly low number and wondered what would happen if I reported that the unit test coverage rate was 3.1%… I did not know that the industry average click-through rate at the time was 2.7%, so the number was actually very good.

This has been a familiar theme in all of the measurement/metrics sessions I’ve attended this week – read behind the numbers! Understanding the metrics reported is vital to success if you are going to use them to good effect.

This is a cultural issue, and Payne recommends that managers should start by getting teams measuring something to get them interested and comfortable with the process.

This is something we advocate strongly. Starting to measure something, even if the numbers are initially low, and tracking them to see the metrics increase for the better is not only the beginning of a better quality process, but is motivational and encourages people to extend the measurement process into other areas. For example, start by tracking unit test pass rates, and once you are comfortable with the process, move on to coverage.

Alan Shalloway, CEO of NetObjectives, discussed how ‘Sustaining the code is different from sustaining the team.’ In the midst of much discussion about tools and results, this excellent point is often missed by organizations looking to implement quality processes. Sometimes, focusing on all the tools available can blind you to the most important part of the organization – the people. Shalloway went on to explain how Lean, Agile and Scrum processes can facilitate motivation and getting the best out of the team.

The agile sessions attracted the most people, and the low number of attendees at some of the great metric/measurement sessions was disappointing. On the basis of the many questions asked at these sessions, though, metrics seem to be an area of discovery for many organizations, so I expect these numbers to be on the increase in the future.

As the conference is in Las Vegas, there were numerous gambling-related anecdotes and quotes, the best of which included:

“The similarity between software development and poker is that they are both very expensive when done poorly.”

and

“The difference between software development and poker is that bluffing [lying] is an acceptable strategy in both.”

And my personal favorite from Payson Hall (Catalysis Group) in his keynote on risk management:

“You all know how [brittle] software is. Do you know how many lines of code there are in a Boeing 747? And yet you all chose to fly here…”

The dark side of estimation optimism

June 21st, 2007 by Nigel Cheshire. Posted in Software Quality

Yesterday, I was heading from my office in Beverly, MA, to a meeting in Canton, MA. According to Google Maps, that’s a distance of 49.3 miles, almost all of which (43.0 miles, or 87%) is on Route 128 - a four- to eight-lane highway. Google estimates that it will take me exactly one hour to make that drive, which, conveniently enough, indicates an average speed of 49.3 miles per hour.

So, to get to my meeting on time (8:30 am), what time should I leave? Given that the speed limit on Rt 128 is 55 mph, and I have to travel right in the middle of rush hour, that one hour estimate seems a bit optimistic, but I will certainly take it into account. The smart thing would be to assume that I won’t be able to average more than 25 mph, and allow two hours for the drive. If I get there an hour early, I’ll grab a cup of coffee and catch up on my RSS feeds. But here’s where things start to go wrong. There’s that little part of my brain, the “rational” part, that says “it couldn’t possibly take twice as long as the Google estimate, could it?” I mean, there’s realism, and then there’s going over the top.

So, I decide to leave a little before 7:00 am - giving myself plenty of time to get there - after all, I added 50% to the original estimate. I’m sure you can guess the end of the story, especially if you live in the Greater Boston area: I arrive about 20 minutes late.

Any of this sound familiar? We humans have a tendency to be over-optimistic when it comes to time estimation. And time is such a scarce resource in our culture that we don’t want to give the impression that we have any of it to spare. And so, we submit our hopelessly optimistic software delivery estimates, and as the deadline approaches, we start to cut corners, and that is where quality suffers.

Fortunately, I have never been involved in a serious car accident. But it seems likely to me that, if it were ever to happen, it would be when I was running late to get somewhere.

Software quality and the U.S. Open Golf Championship

June 19th, 2007 by Rich Sharpe. Posted in Software Quality

When the 107th US Open Golf Championship started last week, quality would have been at the forefront of every player’s mind. Hours of work, practicing different techniques and consulting with people who know the environment can lead either to glorious victory or to frustration at having a bad day and finding numerous hurdles, both expected and unexpected, in your path. Sound familiar?

Over the last few weeks I have had my first set of golf lessons in 15 years. Afterwards, I realized that many of the techniques I was learning, tweaking, trying out etc. follow the experiences I had in developing software. Over the years I had slipped into a couple of bad habits and had not realized it, because you do not watch yourself when playing. One of the issues I needed to resolve was poor posture at set up. This is a fundamental part of the whole golf swing: if you set up badly then it is very difficult to get things right! Again, sound familiar?

This all relates to why we have code reviews and mentors (as well as training courses). These are the controls that stop the rot setting into our code. Frequent tweaks and technique improvements ensure not only that the current code gets reviewed from a quality perspective, but also that we learn for the next piece of code we write.

There were many areas that were comparable: preparation, environment, technique, execution and rhythm. I had to think about this last point, but having a constant release rhythm allows customers to be prepared for what is coming and, internally, sets a schedule for other people not directly involved in the day to day development – marketing, document authors, training etc.

Probably the most important thing I realized was that learning a new technique often has a negative impact on productivity before you start getting good at what you’ve learnt. This was very noticeable on the golf course, but made me think about unit testing. To start with, we learn simple tests using basic assertions and think ‘well, OK but how can I test this in my real-world, more complex situation?’. As time goes by, we learn to move onto these more complex areas of code and after a few months writing tests becomes an automatic part of the process. Thus, the quality of our work improves and, due to the many well known benefits of unit testing, so does productivity, over time.

So, like the players at the Open, the next time I am out on the golf course, I’ll be hoping that the exceptions will be few and far between, and when they do occur, they will be handled gracefully.

The Mythical Man-Month - still relevant 30 years on

June 18th, 2007 by Rich Sharpe. Posted in General, Software Quality

After our recent discussions on ‘Keeping Your Estimates Honest’ and ‘No Silver Bullets’, I decided to blow the dust off my copy of The Mythical Man-Month (Frederick P. Brooks, Jr., 20th Anniversary Ed., Addison-Wesley, 1995) and reread it.

What amazes me about this book is that it is over 30 years old, and yet it is still very relevant today. Am I amazed that people haven’t learned from these well-documented problems and solutions? No. It reconfirms that software development is not an easy task, and managing these projects from estimation through conclusion is more stressful now (with greater business demands and competition) than it was 30 years ago.

The book covers many topics, but a recurring point Brooks makes is that Conceptual Integrity is central to product quality. If there is not purity of design that is applied to the whole application, this lack of integrity will cause problems down the road. Brooks discusses at length the art of growing software rather than building it, and suggests using ‘Surgical Teams’ and division of responsibilities (similar to that seen in Scrum practices today), which leads to a better-integrated team and a higher quality result.

All of this reminds me that the various manufacturing metaphors that are often applied to the software development process are unhelpful at best. While it is important to ensure consistency of behavior of the finished application, you can’t get away from the fact that the software development process is, by its nature, a creative activity and demands freedom of creative thought.

It also raises questions in my mind about the viability of outsourcing: we have several clients who outsource the development of various parts of their applications to different groups, often at different external vendors in different countries. (I wonder what impact that approach has on Conceptual Integrity.)

Another area Brooks comments on is “adding manpower to a late project makes it later” (Brooks’ Law). This is one lesson that I believe the industry has learnt over the past 30 years. Brooks’ observation “The bearing of a child takes nine months no matter how many women are assigned” is one I have heard many times in my own career.

Brooks wrote The Mythical Man-Month largely from the experience he gained when managing the IBM System/360 project back in the 1960s. But most of the concepts of the book are still relevant today; fundamentally, this book is about managing and organizing teams to become more productive. Brooks expands on this subject in the last chapter of the 20th Anniversary Edition, “The Mythical Man-Month after 20 years”, and many of his conclusions are backed up with references to numerous projects and studies. In my opinion this will still be staple reading for any project manager or development manager for many years to come.

Not just a thing of great beauty

June 15th, 2007 by Nigel Cheshire. Posted in Software Quality

I admit it. I’m a gadget guy, an early adopter, on the bleeding edge of any new technology I can lay my hands on. I still have my Apple Newton in the back of a drawer somewhere, at home I listen to Pandora via Sonos, and I control the GPS functions of my car via the voice recognition system. Actually, I don’t, because it doesn’t work well enough, but that’s a whole different blog posting.

I also worry about oil. Both in the “carbon emissions, global warming” sense, as well as the “dependence on unstable economies that are clearly out of our control” sense. So I’ve been interested in the concept of electric vehicles for a while, but man, are they uncool. At least they were, up until Tesla Motors came along. And so I have been carefully watching Tesla’s progress, reading their blog, and desperately trying to figure out how to come up with $98,000 so I can buy a Tesla Roadster.

What does all this have to do with software quality, you ask. Well, a couple of things, I think.

To start with, one of the things that bothers me about the car I drive is that infernal internal combustion engine under the hood. Fundamentally, that is 100-year-old technology. That would be fine, if it weren’t so inelegant. One of the things we strive for in software design is elegance - simplicity.

The other thing I like about Tesla is that they understand the idea of conceptual integrity. There is a clarity of purpose to what they are doing that drives every design decision that they make, and they have some means of communicating that internally. Here’s an extract from Tuesday’s posting by David Vespremi:

Sure, the engineers could have specified steamroller-sized wheels and tires to put even more rubber on the road and inflate the lateral grip numbers for the magazine statistics. That would have compromised the Tesla Roadster’s tactile response, and created a car that tramlines over grooves in the pavement and thuds over bumps and potholes, but posts amazing dry skid pads numbers. At the end of the day, shouldn’t a car be fun to drive and not simply a way of generating numbers that look good in a magazine?

Driving dynamics is one of the core design goals for the Tesla Roadster, and everyone who works there understands that. How much better could our software be if every engineer understood every design goal perfectly?

Tesla is a fascinating example of a company that’s rethinking an industry, and anywhere that kind of innovative thinking is going on, there will be lessons to be learned.

Dropping hints about what's coming next at Enerjy

June 14th, 2007 by Nigel Cheshire. Posted in Enerjy

<shameless promotion>
Sys-con.tv just posted an interview that Roger Strukhoff did with me at JavaOne last month. While there’s not too much you can say about software quality in 9 or 10 minutes, I did drop a few hints about what we are working on here at Enerjy, which we are pretty excited about. I’ll start to blog about that here soon, but in the meantime you can go check out my funny British accent here.
</shameless promotion>

Glitch Watch - Fire alarm on space station caused by software glitch

June 13th, 2007 by Nigel Cheshire. Posted in Glitch Watch

The New York Times reports that a fire alarm on board the International Space Station today was caused by a software glitch on Russian computers that control navigation and functions of the station.

Clayton C. Anderson, an astronaut on the space station, called down to mission control and said, “We looked around and smelled around,” and found no evidence of fire.

The alarm came one day after the space station received a new set of solar panels, whose deployment went smoothly. Unlike the last time the solar panels were unfurled, back in September, when there was a three-hour delay caused by… a software glitch.