AnalysisTool

State of the project

AnalysisTool was originally created to serve two main purposes: to provide an easy-to-use executable binary of the Clang static analyzer and to customize Clang by providing some additional checks. When the Clang static analyzer was in its early stages, the only option for developers to try it out was to check out the latest source code of LLVM and Clang, compile it, and use the analyzer from the command line. AnalysisTool provided an easy-to-use GUI and removed the need to touch the Clang source code. It also provided automatic updates, so that users of AnalysisTool could always use the latest Clang static analyzer.

However, things have moved forward since AnalysisTool was created in 2008. The Clang static analyzer has become a part of the Apple developer tools and is now integrated with Xcode. Those who want to use the latest static analyzer version can always download the latest binary build from the Clang Static Analyzer web site. AnalysisTool has served its purpose well, but there's not much need for it as a GUI frontend anymore.

The original implementation of AnalysisTool's custom analyses tied them very tightly to Clang's underlying source code. Since Clang was (and still is) a rapidly moving target, these analyses were very hard to maintain. The additional analyses still have their uses, and my long-term goal is still to have them integrated into official Clang and/or to turn them into a Clang plugin once plugins become supported. At the moment, the descriptions of AnalysisTool's additional analyses can serve as a source of ideas for Clang static analyzer developers. I do not have a schedule for when these analyses will work with the latest Clang.

My current recommendation is to either use the Clang static analyzer which is bundled with Xcode, or to download the latest official build of the Clang static analyzer.

About

AnalysisTool is a Mac OS X application which provides GUI and CLI frontends to the LLVM/Clang static analyzer, a tool that finds bugs in C and Objective-C programs. AnalysisTool includes a custom version of LLVM/Clang static analyzer. AnalysisTool's version of Clang includes some additional analyses and features which are not included in the official LLVM/Clang static analyzer distribution. These additional checks can be turned off. Do not send bug reports concerning AnalysisTool or our internal checks to the Clang team.

The AnalysisTool project was started by Nikita Zhuk as part of his master's thesis for Marko Karppinen & Co. LLC in 2008. The LLVM/Clang revision on which AnalysisTool's custom Clang is based is stated in the release notes and in the About window of AnalysisTool. AnalysisTool is a completely separate project from Clang. It's not supported by Apple or the people behind Clang. The Clang team should not be bothered with feedback about AnalysisTool itself (the wrapper) or its custom checks.

AnalysisTool was primarily developed for in-house use at Marko Karppinen & Co. LLC, so it contains some analyses which check coding conventions and style issues. For many ObjC developers this will cause false positives. Do not trust AnalysisTool results blindly. There are some cases where it would be useful to disable certain analyses for e.g. a single line of code. As soon as Clang supports some form of analysis disabling, it will be possible to disable all custom analyses provided by AnalysisTool in the same way.

GUI frontend

AnalysisTool provides a simple GUI frontend which allows running both official LLVM/Clang analyses and custom Clang analyses on a single Xcode project. The selected target of the Xcode project is built with the xcodebuild command-line tool, and the source code is analyzed while it's being compiled. AnalysisTool creates a new directory called Static analysis as a subdirectory of the project directory. Build products are saved into the Static analysis/build directory and the results of static analysis are saved into the Static analysis/results/<date> directory. One can also specify custom directories for both build products and static analysis results in the preferences dialog.

[Screenshot: AnalysisTool GUI frontend]

To run a full analysis on your Xcode project, check the "Clean project" checkbox. This causes a clean build to be run. If the "Clean project" checkbox isn't selected, AnalysisTool analyzes only the source files which have changed since the last AnalysisTool run.

All analyses which are run by AnalysisTool are listed in the Preferences dialog, and they can be enabled or disabled one by one. Currently there's no GUI for setting the configuration options of individual analyses. This may improve in future versions of AnalysisTool.

CLI frontend

AnalysisTool provides a CLI frontend to the custom Clang distribution. The CLI frontend allows one to run the same analyses as the GUI version from the command line. The CLI frontend is run from inside the AnalysisTool bundle with the following command-line arguments:

./AnalysisTool.app/Contents/MacOS/cli/analysistool <path-to-xcodeproj-file> <path-to-output-directory> [name-of-XCode-target-to-analyze] [-no-run-subdirs] [<analysis-flag-1>] [<analysis-flag-2>] [<analysis-flag-n>] ...

[Screenshot: AnalysisTool CLI frontend]

If no analysis flags are given, all available analyses are run. If the -no-run-subdirs flag is set, no date-stamped result output subdirectories will be created for each run of AnalysisTool.

The CLI frontend is useful when AnalysisTool is integrated into an automated build system such as Buildbot. It can also be used within Xcode itself by creating an Xcode target which runs AnalysisTool in a Run Script build phase.

Additional analyses

AnalysisTool includes several custom analyses which are not included in the official LLVM/Clang distribution. The custom analyses are mostly simple, syntactic analyses which check that the code follows our various in-house coding conventions. These analyses have a much higher false positive rate than the official ones, because their primary goal is often different from that of the official analyses.

The goal of the official analyses is to find real bugs and to keep the false positive rate as low as possible while being able to analyze code from various developers, using various frameworks and following various coding conventions and coding styles. Our goal is different: we want to develop a set of coding conventions which, if followed, make it less likely that we introduce new bugs. These conventions are born from our coding experience and the results of our code reviews. Conventions are then represented as executable analyses which become a part of AnalysisTool. This way, coding experience and knowledge about various coding patterns and practices can be collected and shared in-house and with other developers in executable form.

AnalysisTool reports violations of these conventions. It should be noted that a violation of a coding convention doesn't necessarily mean that there's a bug in the code. One should review any violation report very carefully and make changes to the code only if the violation report seems relevant. A good unit test suite with high code coverage helps to make sure that these changes don't cause any unanticipated bugs.

Our current goal is to produce Objective-C and C code which is consistent in style, follows similar coding patterns, can be compiled and run in garbage collected and reference counted modes and is 64-bit safe. We follow Apple's coding style and pattern recommendations when appropriate and deviate from them when necessary.

This section lists all custom analyses which are included in AnalysisTool and provides some code examples which cause violations to be reported. All violation reports generated by AnalysisTool use the AnalysisTool: prefix to distinguish them from official analyses' reports.

Access control

Use of public instance variables is discouraged because they break class encapsulation.

@interface MyClass { 
@public
	id mIvar; // access control level of instance variable 'mIvar' is 'public'.
}
@end	

Coercion

Coercion analyses report violations when data of one type is coerced by the compiler to another type in such a way that the value of the data may change. Some of the coercion analyses were inspired by GCC's Wcoercion project.

AnalysisTool also warns of suspicious use of signed and unsigned values, for example when an if statement contains the following comparison: unsigned-value < 0. Coercion analyses are made for implicit coercions only, so warnings are not generated if an explicit cast is found (e.g. (int)unsigned-value < 0).

unsigned char c = 1;
unsigned long long ll = INT_MAX;
c = ll; // coercion from 'unsigned long long' to 'unsigned char' may alter its value.

/* ----- */

unsigned int a = 0x7fffffff;
unsigned int b = 0x00000100;
return (a & b); // coercion from 'unsigned int' to 'char' may alter its value.

/* ----- */

char c = 0;
int i = 0;
long long ll = 0;
float f = 0.0;
double d = 0.0;

c = i; // coercion from 'int' to 'char' may alter its value.
c = ll; // coercion from 'long long' to 'char' may alter its value.
c = f; // coercion from 'float' to 'char' may alter its value.
c = d; // coercion from 'double' to 'char' may alter its value.
i = ll; // coercion from 'long long' to 'int' may alter its value.
i = d; // coercion from 'double' to 'int' may alter its value.

/* ----- */

int x = 0;
int y = 1;
int z = 2;

return x = (y == z); // coercion from 'int' to 'BOOL' may alter its value.

/* ----- */

int x = 0;
int y = 1;
int z = 2;

// x = y == z implies x = (y == z)
return x = y == z; // coercion from 'int' to 'BOOL' may alter its value.

/* ----- */

unsigned char c = 'c';
c = c | UCHAR_MAX+1; // coercion from 'int' to 'unsigned char' may alter its value.

c = c | CHAR_MIN; // coercion from 'int' to 'unsigned char' may alter its value.
c = c | CHAR_MIN-1; // coercion from 'int' to 'unsigned char' may alter its value.

/* ----- */

unsigned int i = 32000;
unsigned char c = i & 0xff; // ok

c = (i+1) & 0xffff;  // coercion from 'unsigned int' to 'unsigned char' may alter its value.

/* ----- */

unsigned char c = 0;
c = c + 256; // coercion from 'int' to 'unsigned char' may alter its value.

/* ----- */

const int i2 = 256;
unsigned char c = 0;
c = c + i2; // coercion from 'int' to 'unsigned char' may alter its value.

/* ----- */

enum {
  enumValMinus1 = -1,
  enumValZero = 0
};

typedef enum {
	typeValMinus1 = -1,
	typeValZero = 0
} TypeMinus1OrZero;

typedef enum _AnotherTypeMinus1OrZero{
	anotherTypeValMinus1 = -1,
	anotherTypeValZero = 0
} AnotherTypeMinus1OrZero;

unsigned a = 0;
unsigned b = -1; // coercion from negative signed value of type 'int' 
// to unsigned value of type 'unsigned int' will cause overflow.

unsigned i = 0;
signed si = 0;

while(i >= 0) // unsigned value is always non-negative.
	i--;

while(0 <= i) // unsigned value is always non-negative.
	i--;

if(i < 0) // unsigned value is always non-negative.
	i--;

if(0 > i) // unsigned value is always non-negative.
	i--;

if(i == -1) // coercion from negative signed value of type 'int' to unsigned value of type 'unsigned int' will cause overflow.
	i--;

if(i <= -1) // coercion from negative signed value of type 'int' to unsigned value of type 'unsigned int' will cause overflow.
	i--;

if(i > -1) // coercion from negative signed value of type 'int' to unsigned value of type 'unsigned int' will cause overflow.
	i--;

if(i >= -1) // coercion from negative signed value of type 'int' to unsigned value of type 'unsigned int' will cause overflow.
	i--;

if(i != -1) // coercion from negative signed value of type 'int' to unsigned value of type 'unsigned int' will cause overflow.
	i--;

for(i = 100; i >= 0; i--) // unsigned value is always non-negative.
	i--;

TypeMinus1OrZero val = typeValZero; // no-warning
val = -1; // The given value '-1' isn't one of enumerated values of type 'TypeMinus1OrZero'.
val = 0; // The given value '0' isn't one of enumerated values of type 'TypeMinus1OrZero'.
val = 999; // The given value '999' isn't one of enumerated values of type 'TypeMinus1OrZero'.

AnotherTypeMinus1OrZero val2 = anotherTypeValZero; // no-warning
val2 = anotherTypeValMinus1; // no-warning
val2 = 1 ? anotherTypeValZero : anotherTypeValMinus1; // no-warning
val2 = 1 ? 0 : -1; // The given value '1 ? 0 : -1' isn't one of enumerated values of type 'AnotherTypeMinus1OrZero'.

val2 = typeValZero; // The given value 'typeValZero' isn't one of enumerated values of type 'AnotherTypeMinus1OrZero'.
val2 = typeValMinus1; // The given value 'typeValMinus1' isn't one of enumerated values of type 'AnotherTypeMinus1OrZero'.
val2 = -1; // The given value '-1' isn't one of enumerated values of type 'AnotherTypeMinus1OrZero'.
val2 = 0; // The given value '0' isn't one of enumerated values of type 'AnotherTypeMinus1OrZero'.
val2 = 999; // The given value '999' isn't one of enumerated values of type 'AnotherTypeMinus1OrZero'.

/* ----- */

@interface MyClass
+(int)m2;
@end

TypeMinus1OrZero val5 = [MyClass m2]; // The given value '[MyClass m2]' isn't one of 
// enumerated values of type 'TypeMinus1OrZero'.
const TypeMinus1OrZero val6 = [MyClass m2]; // The given value '[MyClass m2]' isn't one of 
// enumerated values of type 'TypeMinus1OrZero const'.
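
To make the effect of the coercions above more concrete, here is a minimal C sketch of what some of the flagged statements do at run time and how they could be written so that the intent is explicit. The helper names are hypothetical and are not part of AnalysisTool or Clang; the sketch assumes an 8-bit char.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* What an implicit `c = ll;` does: the value is reduced
   modulo UCHAR_MAX + 1, so the high bits are silently dropped. */
static unsigned char narrow_to_uchar(unsigned long long value)
{
    return (unsigned char)value;
}

/* Hypothetical range-checked alternative: stores the value and
   returns true only if it fits into an unsigned char. */
static bool checked_narrow_to_uchar(unsigned long long value, unsigned char *out)
{
    assert(out != NULL);
    if (value > UCHAR_MAX)
        return false;
    *out = (unsigned char)value;
    return true;
}

/* Counting down with an unsigned counter: `i >= 0` would always be
   true, so the loop condition tests `i > 0` instead. */
static unsigned count_down_iterations(unsigned start)
{
    unsigned iterations = 0;
    for (unsigned i = start; i > 0; i--)
        iterations++;
    return iterations;
}
```

Under these assumptions, narrow_to_uchar(256) yields 0, while checked_narrow_to_uchar(256, &c) refuses the value and leaves c untouched.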

Complexity

AnalysisTool calculates four complexity metrics of your source code: cyclomatic complexity of methods, conditional logical complexity, depth of conditional nesting and depth of loop nesting. Cyclomatic complexity is calculated as the number of conditional statements, such as 'if' statements and loops. Conditional logical complexity is the number of logical operators inside a single conditional statement. The other two metrics are self-explanatory.
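
As an illustration of the four metrics, the following C sketch annotates a small function with one plausible tally based on the definitions above. The function is a made-up example, and the exact counting rules AnalysisTool applies (for instance, whether loops count towards conditional nesting) are not specified here, so the numbers are an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: counts the even values that lie in [lo, hi]. */
static size_t count_even_in_range(const int *values, size_t n, int lo, int hi)
{
    size_t count = 0;

    assert(values != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {               /* conditional statement 1; loop nesting depth 1 */
        if (values[i] >= lo && values[i] <= hi) {  /* conditional statement 2; one logical operator (&&) */
            if (values[i] % 2 == 0)                /* conditional statement 3; conditional nesting depth 2 */
                count++;
        }
    }
    return count;
}

/* Tally under the definitions in the text:
     conditional statements (cyclomatic complexity as described): 3
     conditional logical complexity (max operators in one condition): 1
     depth of conditional nesting: 2
     depth of loop nesting: 1 */
```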

AnalysisTool uses preconfigured maximum thresholds for each of the four complexity metrics. If a complexity value calculated by AnalysisTool exceeds its maximum threshold, a warning is generated. Currently there's no GUI to change these thresholds. Current maximum thresholds are defined as follows:

Dealloc safety

Dealloc is a special method which is called by the runtime when an object is deallocated (in a non-GC environment). There are many things which should and shouldn't be done in dealloc so that the code is correct, maintainable and GC-safe. AnalysisTool contains various analyses which generate warnings if certain unsafe methods are called from the implementation of the dealloc method.

Method calls to avoid in dealloc:

There are a couple of reasons why these method calls should be avoided in dealloc. The first is GC safety: dealloc is not called by the runtime under GC, so these method calls will not be made at all under GC. Unregistering key-value observation or unbinding objects must be done by hand in both reference counted and GC environments, so dealloc isn't the right place for these method calls. The second reason is that even if the code doesn't have to be GC-safe, it's still invalid to post KVO change notifications or NSNotifications from dealloc. The object sending these notifications is already being deallocated (its retain count has already reached zero), so it should not pass itself to other objects which may try to use and/or retain it.

Declaration conventions

The element variable of an ObjC 2.0 fast enumeration for...in loop must be declared within the scope of the for loop. The goal of this convention is to keep namespaces as clean as possible by limiting the scope of variables to the minimum.

NSString *string = nil;
for(string in stringArray) {...} // For-in-loop element variable should be declared 
// inside the for loop to ensure minimal scope of the variable.

Discouraged method calls

We want to keep our code consistent by forbidding the use of some methods and recommending the use of other methods instead. The current list of discouraged method calls includes only one method: new. The new method is discouraged because it hides the fact that it uses the default init method. If the default init method is later changed to accept arguments, the call to new has to be changed to an alloc/init... call anyway. Another reason is that we want to reserve new for our C++ code, which we sometimes use within Objective-C classes.

NSArray *array = [NSArray new]; // For the sake of consistency, 
		// use alloc init pattern instead of new. Reserve new for C++ code.

Error handling

If a method takes NSError** as argument, we want to use that argument to get as much information as we can about the potential failure. Ignoring information about reasons for failure usually makes debugging harder and more time-consuming than necessary. The two rules which are checked by the error handling analyses are:

  1. Always pass a valid NSError argument to methods which take one
  2. Always read the NSError value returned by reference at least once (e.g. print it to a log or pass it to another method)

NSError *error = nil;
[self methodWithError:&error]; // NSError** argument is passed to a method but it's never read after the method call. 
// NSError object should be read at least once after method call which may have set it.

/* ----- */

[self methodWithError:nil]; // 'nil' is passed as NSError** argument. 
// Although it's legal to do so, ignoring NSError makes debugging 
// potential problems more difficult than necessary.

Extra parentheses

Extra parentheses analysis complements GCC's -Wparentheses flag. GCC's -Wparentheses issues a warning when parentheses are missing around an assignment used as a truth value, e.g. a warning is issued for if(self = [super init]), the correct form being if((self = [super init])). This analysis warns when there are extra parentheses around a comparison used as a truth value, e.g. a warning is issued for if((self == [super init])), the correct form being, once again, if((self = [super init])).
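
The same pair of conventions can be illustrated in plain C. The following is a minimal sketch, not code from AnalysisTool; the cursor type and both function names are hypothetical. Double parentheses mark an intentional assignment used as a truth value, while a comparison gets a single pair.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over an array of ints. */
typedef struct {
    const int *items;
    size_t count;
    size_t pos;
} cursor;

/* Returns a pointer to the next item, or NULL when the cursor is exhausted. */
static const int *cursor_next(cursor *c)
{
    assert(c != NULL);
    return c->pos < c->count ? &c->items[c->pos++] : NULL;
}

/* Sums items until the sentinel value or the end of the cursor is reached. */
static int sum_until(cursor *c, int sentinel)
{
    const int *p;
    int total = 0;

    while ((p = cursor_next(c))) {  /* assignment as truth value: double parentheses are intentional */
        if (*p == sentinel)         /* comparison: a single pair of parentheses */
            break;
        total += *p;
    }
    return total;
}
```

Writing the comparison as if ((*p == sentinel)) would compile to the same code, but it looks like the assignment idiom and is therefore the kind of pattern this analysis is meant to flag.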

Format strings

Format string analysis parses all Objective-C format strings which are passed as arguments to a set of methods which are known to accept format strings as arguments and warns if:

KVO

KVO analyses check that the keys used in willChange/didChange method pairs are string literals and that each willChange... call is properly paired with a subsequent didChange... call with the same key.

- (void)willChange1 // willChange/didChange mismatch.
{
	[self willChangeValueForKey:@"key1"];
	[self didChangeValueForKey:@"key1"];
  
	[self didChangeValueForKey:@"key2"]; // willChange/didChange mismatch.
	[self willChangeValueForKey:@"key2"];
}

-(void)m2 // willChange/didChange mismatch.
{
	[self didChangeValueForKey:@"key"]; // willChange/didChange mismatch.
	[self willChangeValueForKey:@"key"];
}

-(void)m3 // willChange/didChange mismatch.
{
	[self willChangeValueForKey:@"key"];
}

-(void)m4
{
	[self didChangeValueForKey:@"key"]; // willChange/didChange mismatch.
}

-(void)m5
{
	// Wrong order
	[self willChangeValueForKey:@"literalKey"];
	[self willChangeValueForKey:kKeyPath];
	[self willChangeValueForKey:stKeyPath];
	
	[self didChangeValueForKey:@"literalKey"]; // Expected first argument of method 
// 'didChangeValueForKey:' was 'stKeyPath'. Actual key was '@"literalKey"'.
	[self didChangeValueForKey:kKeyPath];
	[self didChangeValueForKey:stKeyPath]; // Expected first argument of method
// 'didChangeValueForKey:' was '@"literalKey"'. Actual key was 'stKeyPath'.
}

Memory management

Memory management analyses complement the excellent CFRefCount checker which comes with the official Clang. These analyses are syntactic and rely on the fact that certain memory management conventions are followed. These conventions include:

Shadowing

Shadowing analysis warns if two variables with the same name are visible in the same scope. This analysis generates far fewer warnings than GCC's -Wshadow flag, which usually produces too many false positives to be useful in any less-than-trivial Obj-C application. Warnings are generated if:

Dependency graphs

AnalysisTool is capable of generating dependency graphs from ObjC source code. Dependencies are generated from @implementations of ObjC classes and categories. Dependencies are categorized into four different categories:

AnalysisTool generates an XML file which includes all dependencies which AnalysisTool found during the analysis. There's a tool called DependencyAnalyser which provides an interactive interface to XML files generated by AnalysisTool. DependencyAnalyser can also calculate various metrics from dependency graphs, including Class Reachability Set Size (CRSS).
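
To give an idea of what a reachability metric measures, here is a minimal C sketch that counts how many other classes can be reached from a given class by following dependencies transitively. It assumes the common reading of CRSS as the size of that reachable set; how DependencyAnalyser actually computes the metric (for example, whether the class itself is counted) is not stated here. The adjacency-matrix representation, MAX_CLASSES and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CLASSES 8

/* deps[a][b] is true when class a depends directly on class b.
   Returns the number of classes other than `start` that are reachable
   from `start`, using an iterative depth-first search. */
static size_t reachability_set_size(bool deps[MAX_CLASSES][MAX_CLASSES],
                                    size_t n, size_t start)
{
    bool visited[MAX_CLASSES] = { false };
    size_t stack[MAX_CLASSES];   /* each class is pushed at most once */
    size_t top = 0;
    size_t reached = 0;

    assert(n <= MAX_CLASSES && start < n);

    visited[start] = true;
    stack[top++] = start;

    while (top > 0) {
        size_t current = stack[--top];
        for (size_t next = 0; next < n; next++) {
            if (deps[current][next] && !visited[next]) {
                visited[next] = true;
                stack[top++] = next;
                reached++;
            }
        }
    }
    return reached;
}
```

Because visited classes are never pushed twice, dependency cycles do not cause the search to loop.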

The XML file can also be transformed with an XSLT stylesheet to any other format for further processing, e.g. visualization. AnalysisTool includes a default XSLT stylesheet which transforms the XML output into dot format, which can be visualized with OmniGraffle or GraphViz. The XSLT stylesheet generates a dot file where each dependency type is represented with a different arrow style, so it's easy to focus on just one dependency type when using OmniGraffle for visualization.

[Figure: A sample class dependency graph generated from the open source project Sparkle.]

To enable dependency graph generation, activate it in the preferences dialog of AnalysisTool and include the following preprocessor macro in your target's Debug build configuration:

XCODEBUILD_PROJECT_FILE_PATH="$(PROJECT_FILE_PATH)"

The class dependency graph generator can group classes into projects and targets. This is especially useful with large Xcode projects, which include multiple targets and/or depend on multiple Xcode projects. The default XML-to-dot XSLT stylesheet uses project and target info to create subgraphs in the dot file. To enable project and target grouping, include the following preprocessor macros in your Debug build configuration:

XCODEBUILD_TARGET_NAME="$(TARGET_NAME)"
XCODEBUILD_PROJECT_NAME="$(PROJECT_NAME)"

Contact

If you have any questions or feedback, contact Nikita Zhuk directly by mail - nikita dot zhuk at karppinen dot fi. I announce new AnalysisTool versions which have significant new features on Twitter, so you can also follow me if you want to be notified about them.

Release notes

View the latest release notes.

AnalysisTool uses three numbers to differentiate between different versions: LLVM/Clang revision, Clang fork revision and AnalysisTool build number. These numbers are displayed in the About window of AnalysisTool GUI frontend and are usually mentioned in each release note entry.

LLVM/Clang revision is the revision number of LLVM and Clang projects on which AnalysisTool is built. Clang fork revision is the revision number of the set of custom patches which are applied to the official LLVM/Clang distribution. AnalysisTool build number is the revision number of the whole AnalysisTool application.