Monitoring and Improving the Health of your Source Code

There are several factors that contribute to the health of your source code:

  • Accuracy – whether the code does what it is expected to do
  • Robustness – whether the code gracefully handles exceptional conditions
  • Extensibility – how easily the code can be changed without affecting accuracy or requiring changes to a large amount of other code
  • Readability – how easily the code can be understood (in a team environment, this matters both for the efficiency of other team members and for the overall level of code comprehension, which in turn reduces the risk that changes will affect accuracy)

Maven has reports that can help with each of these health factors, and this section will look at three:

  • PMD
  • Checkstyle
  • Tag List

PMD takes a set of either predefined or user-defined rule sets and evaluates the rules across your Java source code. The result can help identify bugs, copy-and-pasted code, and violations of a coding standard. Figure 6-7 shows the output of a PMD report on proficio-core, which is obtained by running mvn pmd:pmd.

Figure 6-7: An example PMD report

As you can see, some source files are identified as having problems that could be addressed, such as unused methods and variables. Also, since the JXR report was included earlier, the line numbers in the report are linked to the actual source code so you can browse the issues.

Adding the default PMD report to the site is just like adding any other report – you can include it in the reporting section in the proficio/pom.xml file:


[...]
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-pmd-plugin</artifactId>
</plugin>
[...]

The default PMD report includes the basic, unused code, and imports rule sets. The “basic” rule set includes checks on empty blocks, unnecessary statements and possible bugs – such as incorrect loop variables. The “unused code” rule set will locate unused private fields, methods, variables and parameters. The “imports” rule set will detect duplicate, redundant or unused import declarations.

You can add rule sets by passing the rulesets configuration to the plugin. However, once you configure this parameter, you must list every rule set you want, including the defaults. For example, to include the default rule sets plus the finalizers rule set, add the following to the plugin configuration you declared earlier:

[...]
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-pmd-plugin</artifactId>
  <configuration>
    <rulesets>
      <ruleset>/rulesets/basic.xml</ruleset>
      <ruleset>/rulesets/imports.xml</ruleset>
      <ruleset>/rulesets/unusedcode.xml</ruleset>
      <ruleset>/rulesets/finalizers.xml</ruleset>
    </rulesets>
  </configuration>
</plugin>
[...]

You may find that you like some rules in a rule set, but not others. Or, you may use the same rule sets in a number of projects. In either case, you can choose to create a custom rule set. For example, you could create a rule set with all the default rules, but exclude the “unused private field” rule. To try this, create a file in the proficio-core directory of the sample application called src/main/pmd/custom.xml, with the following content:

<?xml version="1.0"?>
<ruleset name="custom">
  <description>
    Default rules, no unused private field warning
  </description>
  <rule ref="/rulesets/basic.xml" />
  <rule ref="/rulesets/imports.xml" />
  <rule ref="/rulesets/unusedcode.xml">
    <exclude name="UnusedPrivateField" />
  </rule>
</ruleset>

To use this rule set, override the configuration in the proficio-core/pom.xml file by adding:

[...]
<reporting>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
      <configuration>
        <rulesets>
          <ruleset>${basedir}/src/main/pmd/custom.xml</ruleset>
        </rulesets>
      </configuration>
    </plugin>
  </plugins>
</reporting>
[...]

For more examples on customizing the rule sets, see the instructions on the PMD Web site at http://pmd.sf.net/howtomakearuleset.html. It is also possible to write your own rules if you find that existing ones do not cover recurring problems in your source code.

One important question is how to select appropriate rules. For PMD, try the following guidelines from the Web site at http://pmd.sf.net/bestpractices.html:

  • Pick the rules that are right for you. There is no point having hundreds of violations you won’t fix.
  • Start small, and add more as needed. The basic, unusedcode, and imports rule sets are useful in most scenarios and their violations are easily fixed. From this starting point, select the rules that apply to your own project.

Once you have selected the right rules and are correcting the issues they uncover, you need to make sure the code stays that way. The PMD plugin’s check goal does this by failing the build when violations are found.

Try this now by running mvn pmd:check on proficio-core. You’ll see that the build fails, reporting 3 PMD violations:

[INFO] ---------------------------------------------------------------------------
[INFO] Building Maven Proficio Core
[INFO] task-segment: [pmd:check]
[INFO] ---------------------------------------------------------------------------
[INFO] Preparing pmd:check
[INFO] [pmd:pmd]
[INFO] [pmd:check]
[INFO] ---------------------------------------------------------------------------
[ERROR] BUILD FAILURE
[INFO] ---------------------------------------------------------------------------
[INFO] You have 3 PMD violations.
[INFO] ---------------------------------------------------------------------------

Before correcting these errors, you should include the check in the build, so that it is regularly tested. This is done by binding the goal to the build life cycle. To do so, add the following section to the proficio/pom.xml file:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
      <executions>
        <execution>
          <goals>
            <goal>check</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
    [...]
  </plugins>
</build>

Note: You may have noticed that there is no configuration here, but recall from the Configuring Reports and Checks section of this chapter that the reporting configuration is applied to the build as well.

By default, the pmd:check goal is run in the verify phase, which occurs after the packaging phase. If you need to run checks earlier, you could add the following to the execution block to ensure that the check runs just after all sources exist:

<phase>process-sources</phase>
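
For context, here is a sketch of where that element sits, assuming the same build section shown earlier. Only the phase line is new; everything else repeats the earlier declaration.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
      <executions>
        <execution>
          <!-- Overrides the default verify binding so the check runs early -->
          <phase>process-sources</phase>
          <goals>
            <goal>check</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```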

To test this new setting, try running mvn verify in the proficio-core directory. You will see that the build fails. To correct this, fix the errors in the src/main/java/com/exist/mvnbook/proficio/DefaultProficio.java file by adding a //NOPMD comment to the unused variables and method:

[...]
// Trigger PMD and checkstyle
int i; // NOPMD
[...]
int j; // NOPMD
[...]
private void testMethod() // NOPMD
{
}
[...]

If you run mvn verify again, the build will succeed.

While this check is very useful, it can be slow and obtrusive during general development. For that reason, you can add the check to a profile that is activated only in an appropriate environment, which makes the check optional for developers but mandatory in an integration environment. See the Continuous Integration with Continuum section in the next chapter for information on using profiles and continuous integration.
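
As a rough sketch of that idea, the check execution could be moved out of the main build section and into a profile in the POM. The profile id code-checks is a hypothetical name chosen for illustration, and how the profile is activated in your integration environment is left open here.

```xml
<profiles>
  <profile>
    <!-- Hypothetical profile id; any name will do -->
    <id>code-checks</id>
    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-pmd-plugin</artifactId>
          <executions>
            <execution>
              <!-- Runs pmd:check only when this profile is active -->
              <goals>
                <goal>check</goal>
              </goals>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>
```

With this in place, a plain mvn verify would skip the check, while mvn verify -Pcode-checks would run it. This assumes the execution has been removed from the main build section.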

While the PMD report allows you to run a number of different rules, there is one that is in a separate report. This is the CPD, or copy/paste detection report, and it includes a list of duplicate code fragments discovered across your entire source base. An example report is shown in Figure 6-8. This report is included by default when you enable the PMD plugin in your reporting section, and will appear as “CPD report” in the Project Reports menu.

Figure 6-8: An example CPD report

In a similar way to the main check, pmd:cpd-check can be used to fail the build if duplicate source code is found. However, the CPD report has only one setting to configure: minimumTokenCount, which defaults to 100 and lets you fine-tune the size of the copies detected. This may not give you enough control to set an effective rule for the source code, and developers may try to avoid detection by making only slight modifications rather than identifying a possible refactoring of the source code. Whether to use the report only or to enforce a check will depend on the environment in which you are working.
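
If you do decide to enforce the check, one possible configuration is sketched below. The value 50 is purely illustrative. The fragment also relies on the earlier note in this chapter that reporting configuration is applied to the build as well.

```xml
<reporting>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
      <configuration>
        <!-- Illustrative value; the default is 100 -->
        <minimumTokenCount>50</minimumTokenCount>
      </configuration>
    </plugin>
  </plugins>
</reporting>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
      <executions>
        <execution>
          <goals>
            <!-- Fail the build on PMD violations and on duplicated code -->
            <goal>check</goal>
            <goal>cpd-check</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```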

There are other alternatives for copy and paste detection, such as Checkstyle, and a commercial product called Simian (http://www.redhillconsulting.com.au/products/simian/). Simian can also be used through Checkstyle and has a larger variety of configuration options for detecting duplicate source code.

Checkstyle is a tool that is, in many ways, similar to PMD. It was originally designed to address issues of format and style, but has more recently added checks for other code issues.

Depending on your environment, you may choose to use it in one of the following ways:

  • Use it to check code formatting only, and rely on other tools for detecting other problems.
  • Use it to check code formatting and selected other problems, and still rely on other tools for greater coverage.
  • Use it to check code formatting and to detect other problems exclusively.

This section focuses on the first usage scenario. If you need to learn more about the available modules in Checkstyle, refer to the list on the Web site at http://checkstyle.sf.net/availablechecks.html.

The figure below shows the Checkstyle report obtained by running mvn checkstyle:checkstyle from the proficio-core directory. Some of the extra summary information for overall number of errors and the list of checks used has been trimmed from this display.

Figure 6-9: An example Checkstyle report

You’ll see that each file with notices, warnings or errors is listed in a summary, and then the errors are shown, with a link to the corresponding source line – if the JXR report was enabled.

That’s a lot of errors! By default, the rules used are those of the Sun Java coding conventions, but Proficio is using the Maven team’s code style.

This style is also bundled with the Checkstyle plugin, so to include the report in the site and configure it to use the Maven style, add the following to the reporting section of proficio/pom.xml:

[...]
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <configuration>
    <configLocation>config/maven_checks.xml</configLocation>
  </configuration>
</plugin>

Table 6-3 shows the configurations that are built into the Checkstyle plugin.

Table 6-3: Built-in Checkstyle configurations

Configuration | Description | Reference
config/sun_checks.xml | Sun Java Coding Conventions | http://java.sun.com/docs/codeconv/
config/maven_checks.xml | Maven team's coding conventions | http://maven.apache.org/guides/development/guide-m2-development.html#Maven%20Code%20Style
config/turbine_checks.xml | Conventions from the Jakarta Turbine project | http://turbine.apache.org/moving.html
config/avalon_checks.xml | Conventions from the Apache Avalon project | No longer online - the Avalon project has closed. These checks are for backwards compatibility only.

The configLocation parameter can also be set to a file within your build, a URL, or a resource within a special dependency.
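
As an illustration of the first two options, the fragment below points configLocation at a file inside the project, with a URL shown as a commented-out alternative. Both the file path and the URL are hypothetical and would need to match wherever your configuration actually lives.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <configuration>
    <!-- Hypothetical path to a configuration file kept in the project -->
    <configLocation>${basedir}/src/main/config/checkstyle.xml</configLocation>
    <!-- Alternative: a URL (hypothetical address)
    <configLocation>http://example.com/checkstyle/company_checks.xml</configLocation>
    -->
  </configuration>
</plugin>
```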

It is a good idea to reuse an existing Checkstyle configuration for your project if possible – if the style you use is common, then it is likely to be more readable and easily learned by people joining your project. The built-in Sun and Maven standards are quite different, and typically, one or the other will be suitable for most people. However, if you have developed a standard that differs from these, or would like to use the additional checks introduced in Checkstyle 3.0 and above, you will need to create a Checkstyle configuration.

While this chapter will not go into an example of how to do this, the Checkstyle documentation provides an excellent reference at http://checkstyle.sf.net/config.html.

The Checkstyle plugin itself has a large number of configuration options that allow you to customize the appearance of the report, filter the results, and parameterize the Checkstyle configuration so that a baseline organizational standard can be customized by individual projects. It is also possible to share a Checkstyle configuration among multiple projects, as explained at http://maven.apache.org/plugins/maven-checkstyle-plugin/tips.html.

Before completing this section it is worth mentioning the Tag List plugin. This report, known as “Task List” in Maven 1.0, will look through your source code for known tags and provide a report on those it finds. By default, this will identify the tags TODO and @todo in the comments of your source code.

To try this plugin, add the following to the reporting section of proficio/pom.xml:

[...]
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>taglist-maven-plugin</artifactId>
  <configuration>
    <tags>
      <tag>TODO</tag>
      <tag>@todo</tag>
      <tag>FIXME</tag>
      <tag>XXX</tag>
    </tags>
  </configuration>
</plugin>
[...]

This configuration will locate any instances of TODO, @todo, FIXME, or XXX in your source code. It is possible to achieve this using Checkstyle or PMD rules; however, this plugin is a more convenient way to get a simple report of items that need to be addressed later.

PMD, Checkstyle, and Tag List are just three of the many tools available for assessing the health of your project’s source code. Some other similar tools, such as FindBugs, JavaNCSS and JDepend, have beta versions of plugins available from the http://mojo.codehaus.org/ project at the time of this writing, and more plugins are being added every day.