PHPMD - PHP Mess Detector


How to write a Rule for PHPMD

:Author: Manuel Pichler
:Copyright: All rights reserved
:Description: This article describes how to develop custom rule classes for PHPMD (PHP Mess Detector). You will learn how to develop different rule types and how to configure them in a custom rule set file, so that PHPMD can use those rules for its analysis runs. Additionally you will learn about several other aspects of PHPMD, like the violation message template engine and how to write customizable rule classes.
:Keywords: NPM, Number of Public Methods, Software Metrics, PHPMD, PMD, rule set, rule, xml, violation, AST, Abstract Syntax Tree

This article describes how you can extend PHPMD with custom rule classes that can be used to detect design issues or errors in the analyzed source code.

Let us start with some architecture basics behind PHPMD. All rules in PHPMD must at least implement the \PHPMD\Rule interface. You can also extend the abstract rule base class \PHPMD\AbstractRule, which already provides an implementation of all required infrastructure methods and application logic, so that the only task left to you is implementing the concrete validation code of your rule. For this validation code the PHPMD rule interface declares the apply() method, which the application invokes during the source analysis phase:

require_once 'PHPMD/AbstractRule.php';

class Com_Example_Rule_NoFunctions extends \PHPMD\AbstractRule
{
    public function apply(\PHPMD\AbstractNode $node)
    {
        // Check constraints against the given node instance
    }
}

The apply() method receives an instance of \PHPMD\AbstractNode as its argument. This node instance represents one of the high-level code artifacts found in the analyzed source code; in this context, a high-level artifact is an interface, class, method or function. But how do we tell PHPMD which of these artifacts our rule is interested in, without duplicating the decision code in every rule? To solve this problem PHPMD uses so-called marker interfaces. The only purpose of these interfaces is to label a rule class, as if it were saying "Hey, I'm interested in nodes of type class and interface" or "I am interested in function artifacts". The available marker interfaces include \PHPMD\Rule\ClassAware, \PHPMD\Rule\InterfaceAware, \PHPMD\Rule\MethodAware and \PHPMD\Rule\FunctionAware.

With these marker interfaces we can now extend the Com_Example_Rule_NoFunctions skeleton, so that the rule will be called for functions found in the analyzed source code:

class Com_Example_Rule_NoFunctions
       extends \PHPMD\AbstractRule
    implements \PHPMD\Rule\FunctionAware
{
    public function apply(\PHPMD\AbstractNode $node)
    {
        // Check constraints against the given node instance
    }
}
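How a runner can use marker interfaces to decide which rules to call may be easier to see in a small standalone sketch. Everything in it (the Rule, FunctionAware and ClassAware stand-ins, the applyRules() helper and the node names) is hypothetical and only mimics the idea; it is not PHPMD's actual dispatch code.

```php
<?php
// Stand-in types for illustration only; these are NOT PHPMD's real classes.
interface Rule
{
    public function apply(string $nodeName): string;
}

interface FunctionAware {}
interface ClassAware {}

final class NoFunctions implements Rule, FunctionAware
{
    public function apply(string $nodeName): string
    {
        return "Please do not use functions: $nodeName";
    }
}

final class TooManyPublics implements Rule, ClassAware
{
    public function apply(string $nodeName): string
    {
        return "Check public methods of: $nodeName";
    }
}

/**
 * Applies only those rules whose marker interface matches the node kind.
 *
 * @param Rule[] $rules
 * @return string[] messages produced by the matching rules
 */
function applyRules(array $rules, string $nodeKind, string $nodeName): array
{
    // Hypothetical mapping from node kind to the marker interface it requires.
    $markers = ['function' => FunctionAware::class, 'class' => ClassAware::class];
    $marker = $markers[$nodeKind];

    $messages = [];
    foreach ($rules as $rule) {
        // The marker carries no methods; it is only checked with instanceof.
        if ($rule instanceof $marker) {
            $messages[] = $rule->apply($nodeName);
        }
    }

    return $messages;
}

$rules = [new NoFunctions(), new TooManyPublics()];
$functionMessages = applyRules($rules, 'function', 'my_helper');
$classMessages = applyRules($rules, 'class', 'FooBar');
```

In this sketch the decision logic lives in one place, so a rule opts in to a node kind simply by naming the marker in its implements clause.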

Because our coding guideline forbids functions, every call to the rule's apply() method will result in a rule violation. Such a violation can be reported to PHPMD through the addViolation() method. The rule inherits this helper method from its parent class \PHPMD\AbstractRule:

class Com_Example_Rule_NoFunctions // ...
{
    public function apply(\PHPMD\AbstractNode $node)
    {
        $this->addViolation($node);
    }
}

That's it. The only thing left to do is to add a configuration entry for this rule to a rule set file. This rule set file is an XML document in which all settings of one or more rules can be configured, so that everyone can customize an existing rule without any changes to the rule's source. The syntax of the rule set file is adopted completely from PMD, the project that inspired PHPMD. To get started with a custom rule set you should take a look at one of the existing XML files and then adapt one of the rule configurations for your newly created rule. The most important elements of a rule configuration are the name attribute (a human-readable name for the rule), the message attribute (the violation message shown in the report), the class attribute (the fully qualified class name of the rule) and the priority element (a value from 1 to 5, where 1 is the highest priority):

<ruleset name="example.com rules"
       xmlns="http://pmd.sf.net/ruleset/1.0.0"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://pmd.sf.net/ruleset/1.0.0 http://pmd.sf.net/ruleset_xml_schema.xsd"
       xsi:noNamespaceSchemaLocation="http://pmd.sf.net/ruleset_xml_schema.xsd">

    <rule name="FunctionRule"
          message = "Please do not use functions."
          class="Com_Example_Rule_NoFunctions"
          externalInfoUrl="http://example.com/phpmd/rules.html#functionrule">

        <priority>1</priority>
    </rule>
</ruleset>

The previous listing shows a basic rule set file that configures all required settings for the created example rule. For more details on PHPMD's rule set file format you should take a look at the Create a custom rule set tutorial.

Finally, the real-world test. Let's assume we have saved the rule class in a file Com/Example/Rule/NoFunctions.php that is somewhere in the PHP include_path and we have saved the rule set in a file named example-rule.xml. Now we can test the rule from the command line with the following command:

~ $ phpmd /my/source/example.com text /my/rules/example-rule.xml

/my/source/example.com/functions.php:2    Please do not use functions.
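For orientation, here is a guess at what an analyzed file producing this output could look like. The file content and the function name example_greeting are invented for illustration; the only assumption taken from the report line is that the function declaration starts on line 2.

```php
<?php
function example_greeting($name)
{
    // Any global function would do; the rule objects to functions as such.
    return 'Hello, ' . $name . '!';
}
```

Moving such a function into a class method would stop it from being passed to a rule that is only FunctionAware.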

That's it. Now we have a first custom rule class that can be used with PHPMD.

Writing a rule based on an existing Software Metric

Since the original goal behind the development of PHPMD was to provide a simple and user-friendly interface for PHP_Depend, this section shows how to develop a rule class that uses a software metric measured by PHP_Depend as input data.

In this section you will learn how to access software metrics for a given \PHPMD\AbstractNode instance. You will also learn how to use PHPMD's configuration backend in such a way that thresholds and other settings can be customized without changing any PHP code. Additionally you will see how the information content of an error message can be improved.

The first thing we need now is a software metric that we want to use as the basis for the new rule. A complete and up-to-date list of available software metrics can be found in PHP_Depend's metric catalog. For this article we choose the Number of Public Methods (npm) metric and we define an upper and a lower threshold for our rule. The upper threshold is 10, because we think a class with more public methods exposes too much of its internals and should be refactored into two or more classes. For the lower threshold we choose 1, because a class without any public method does not expose any service to the surrounding application.

The following code listing shows the entire rule class skeleton. As you can see, this class implements the \PHPMD\Rule\ClassAware interface, so that PHPMD knows that this rule will only be called for classes:

class Com_Example_Rule_NumberOfPublicMethods
       extends \PHPMD\AbstractRule
    implements \PHPMD\Rule\ClassAware
{
    const MINIMUM = 1,
          MAXIMUM = 10;

    public function apply(\PHPMD\AbstractNode $node)
    {
        // Check constraints against the given node instance
    }
}

Now that we have the rule skeleton we must access the npm metric that is associated with the given node instance. All software metrics calculated for a node object can be accessed directly through the getMetric() method of the node instance. This method takes a single parameter, the abbreviation/acronym of the metric as documented in PHP_Depend's metric catalog:

class Com_Example_Rule_NumberOfPublicMethods
       extends \PHPMD\AbstractRule
    implements \PHPMD\Rule\ClassAware
{
    const MINIMUM = 1,
          MAXIMUM = 10;

    public function apply(\PHPMD\AbstractNode $node)
    {
        $npm = $node->getMetric('npm');
        if ($npm < self::MINIMUM || $npm > self::MAXIMUM) {
            $this->addViolation($node);
        }
    }
}

That's the coding part for the metric based rule. Now we must add this class to a rule set file.

<ruleset name="example.com rules"
       xmlns="http://pmd.sf.net/ruleset/1.0.0"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://pmd.sf.net/ruleset/1.0.0 http://pmd.sf.net/ruleset_xml_schema.xsd"
       xsi:noNamespaceSchemaLocation="http://pmd.sf.net/ruleset_xml_schema.xsd">

    <!-- ... -->

    <rule name="NumberOfPublics"
          message = "The context class violates the NPM metric."
          class="Com_Example_Rule_NumberOfPublicMethods"
          externalInfoUrl="http://example.com/phpmd/rules.html#numberofpublics">

        <priority>3</priority>
    </rule>
</ruleset>

Now we can run PHPMD with this rule set file and it will report all classes that do not fulfill our requirement for the NPM metric. But as promised, we will make this rule more customizable, so that it can be adjusted for different project requirements. To do so, we replace the two constants MINIMUM and MAXIMUM with properties that can be configured in the rule set file. Let us start with the modified rule set file:

<ruleset name="example.com rules"
       xmlns="http://pmd.sf.net/ruleset/1.0.0"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://pmd.sf.net/ruleset/1.0.0 http://pmd.sf.net/ruleset_xml_schema.xsd"
       xsi:noNamespaceSchemaLocation="http://pmd.sf.net/ruleset_xml_schema.xsd">

    <!-- ... -->

    <rule name="NumberOfPublics"
          message = "The context class violates the NPM metric."
          class="Com_Example_Rule_NumberOfPublicMethods"
          externalInfoUrl="http://example.com/phpmd/rules.html#numberofpublics">

        <priority>3</priority>
        <properties>
            <property name="minimum"
                      value="1"
                      description="Minimum number of public methods." />
            <property name="maximum"
                      value="10"
                      description="Maximum number of public methods." />
        </properties>
    </rule>
</ruleset>

In PMD rule set files you can define as many properties for a rule as you like. All of them will be injected into the rule instance by PHPMD's runtime environment and can then be accessed through the get<type>Property() methods. PHPMD currently provides getter methods such as getBooleanProperty(), getIntProperty() and getStringProperty().

So now let's modify the rule class and replace the hard-coded constants with the configurable properties:

class Com_Example_Rule_NumberOfPublicMethods
       extends \PHPMD\AbstractRule
    implements \PHPMD\Rule\ClassAware
{
    public function apply(\PHPMD\AbstractNode $node)
    {
        $npm = $node->getMetric('npm');
        if ($npm < $this->getIntProperty('minimum') ||
            $npm > $this->getIntProperty('maximum')
        ) {
            $this->addViolation($node);
        }
    }
}
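The idea behind typed property getters can be sketched without PHPMD: values arrive from XML as strings and are converted on access. The PropertyBag class and the isWithinThresholds() helper below are hypothetical stand-ins written for this illustration, and the conversion shown is an assumption rather than PHPMD's actual implementation.

```php
<?php
// Hypothetical stand-in for a rule's configured properties; not PHPMD code.
final class PropertyBag
{
    /** @var array<string, string> raw values as they appear in the XML file */
    private array $values;

    public function __construct(array $values)
    {
        $this->values = $values;
    }

    // Returns the named property converted to an integer.
    public function getInt(string $name): int
    {
        if (!array_key_exists($name, $this->values)) {
            throw new OutOfBoundsException("Property \"$name\" is not configured.");
        }

        return (int) $this->values[$name];
    }
}

// True when the metric value lies inside the configured inclusive range.
function isWithinThresholds(int $npm, PropertyBag $properties): bool
{
    return $npm >= $properties->getInt('minimum')
        && $npm <= $properties->getInt('maximum');
}

$properties = new PropertyBag(['minimum' => '1', 'maximum' => '10']);
$ok = isWithinThresholds(5, $properties);
$tooMany = isWithinThresholds(42, $properties);
$tooFew = isWithinThresholds(0, $properties);
```

Because the thresholds live in configuration, a project with different conventions only has to change the value attributes in its rule set file.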

Now we are nearly done, but one issue is still open. When we execute this rule, the user will get the message "The context class violates the NPM metric.", which is not very informative: the user still has to check manually whether the upper or the lower threshold was violated and what the actual thresholds are. To provide more information about a rule violation you can use PHPMD's minimalistic template/placeholder engine for violation messages. With this engine you can define violation messages with placeholders that will be replaced with actual values. The format for such placeholders is '{' + \d+ '}', that is, a numeric index enclosed in curly braces, such as {0}:

<ruleset name="example.com rules"
       xmlns="http://pmd.sf.net/ruleset/1.0.0"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://pmd.sf.net/ruleset/1.0.0 http://pmd.sf.net/ruleset_xml_schema.xsd"
       xsi:noNamespaceSchemaLocation="http://pmd.sf.net/ruleset_xml_schema.xsd">

    <!-- ... -->

    <rule name="NumberOfPublics"
          message = "The class {0} has {1} public method, the threshold is {2}."
          class="Com_Example_Rule_NumberOfPublicMethods"
          externalInfoUrl="http://example.com/phpmd/rules.html#numberofpublics">

        <priority>3</priority>
        <properties>
            <property name="minimum"
                      value="1"
                      description="Minimum number of public methods." />
            <property name="maximum"
                      value="10"
                      description="Maximum number of public methods." />
        </properties>
    </rule>
</ruleset>

Now we can adjust the rule class so that it sets the correct values for the placeholders {0}, {1} and {2}:

class Com_Example_Rule_NumberOfPublicMethods
       extends \PHPMD\AbstractRule
    implements \PHPMD\Rule\ClassAware
{
    public function apply(\PHPMD\AbstractNode $node)
    {
        $min = $this->getIntProperty('minimum');
        $max = $this->getIntProperty('maximum');
        $npm = $node->getMetric('npm');

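        // Pass the analyzed class's name; get_class($node) would yield the
        // PHP class of the node object instead of the name of the class.
        if ($npm < $min) {
            $this->addViolation($node, array($node->getName(), $npm, $min));
        } elseif ($npm > $max) {
            $this->addViolation($node, array($node->getName(), $npm, $max));
        }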
    }
}

If we run this version of the rule we will get an error message like the following:

The class FooBar has 42 public method, the threshold is 10.
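The placeholder substitution can be mimicked in a few lines of plain PHP. The renderViolationMessage() function below is a hypothetical helper written for this article; it only illustrates the {0}, {1}, {2} format and is not the code PHPMD uses internally.

```php
<?php
/**
 * Replaces {N} placeholders with the value at index N of $args.
 * Placeholders without a matching argument are left untouched.
 */
function renderViolationMessage(string $template, array $args): string
{
    $rendered = preg_replace_callback(
        '/\{(\d+)\}/',
        static function (array $match) use ($args): string {
            $index = (int) $match[1];

            return array_key_exists($index, $args)
                ? (string) $args[$index]
                : $match[0];
        },
        $template
    );

    // preg_replace_callback() returns null on a regex error; fall back to the template.
    return $rendered ?? $template;
}

$message = renderViolationMessage(
    'The class {0} has {1} public method, the threshold is {2}.',
    ['FooBar', 42, 10]
);
```

The order of the values in the array passed as the second argument to addViolation() therefore has to match the indexes used in the message attribute.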

Writing a rule based on the Abstract Syntax Tree

Now we will learn how to develop a PHPMD rule that uses PHP_Depend's abstract syntax tree to detect violations or possible errors in the analyzed source code. Access to PHP_Depend's abstract syntax tree gives you the most powerful way to write rules for PHPMD, because you can analyze nearly all aspects of the software under test. The syntax tree can be accessed through the getFirstChildOfType() and findChildrenOfType() methods of the \PHPMD\AbstractNode class.

In this example we will implement a rule that detects usage of the controversial PHP feature goto. Because we all know and agree that goto was already bad in Basic, we would like to prevent our developers from using it. Therefore we implement a PHPMD rule that searches PHP_Depend's abstract syntax tree for the goto language construct.

Because the goto statement cannot appear directly in classes and interfaces, but only in methods and functions, the new rule class must implement the two marker interfaces \PHPMD\Rule\FunctionAware and \PHPMD\Rule\MethodAware:

namespace PHPMD\Rule\Design;

use PHPMD\AbstractNode;
use PHPMD\AbstractRule;
use PHPMD\Rule\MethodAware;
use PHPMD\Rule\FunctionAware;

class GotoStatement extends AbstractRule implements MethodAware, FunctionAware
{
    public function apply(AbstractNode $node)
    {
        foreach ($node->findChildrenOfType('GotoStatement') as $goto) {
            $this->addViolation($goto, array($node->getType(), $node->getName()));
        }
    }
}

As you can see, we are searching for the string GotoStatement in the previous example. This is a shortcut notation used by PHPMD to address concrete PHP_Depend syntax tree nodes. All abstract syntax tree classes in PHP_Depend have the following format:

\PDepend\Source\AST\ASTGotoStatement

where

\PDepend\Source\AST\AST

is fixed and everything else depends on the node type. This fixed part of the class name can be omitted in PHPMD when searching for an abstract syntax tree node. To implement additional rules you should take a look at PHP_Depend's Code package, where you can find all currently supported code nodes.
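The shortcut notation amounts to simple string handling, as the following sketch suggests. The resolveAstClassName() function is a hypothetical helper for illustration; how PHPMD performs this lookup internally is not shown here.

```php
<?php
/**
 * Expands a shortcut such as "GotoStatement" to the full node class name.
 * Names that already carry the fixed prefix are returned unchanged
 * (apart from a normalized leading backslash).
 */
function resolveAstClassName(string $shortName): string
{
    $prefix = 'PDepend\\Source\\AST\\AST';
    $trimmed = ltrim($shortName, '\\');

    if (strpos($trimmed, $prefix) === 0) {
        return '\\' . $trimmed;
    }

    return '\\' . $prefix . $trimmed;
}

$className = resolveAstClassName('GotoStatement');
$unchanged = resolveAstClassName('\\PDepend\\Source\\AST\\ASTGotoStatement');
```

Both calls yield \PDepend\Source\AST\ASTGotoStatement, which is the class name given above for the goto node.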

Conclusion

In this article we have shown you several ways to implement custom rules for PHPMD. If you think one of your rules could be reusable for other projects and users, don't hesitate to propose it on the project's issue tracker at GitHub or to open a pull request.
