
Code Artistry - Code Analyzer

Initial Thoughts:

Code analysis is an interesting topic that takes us into many of the areas covered by our course sequence, especially CSE681 - Software Modeling and Analysis, CSE687 - Object Oriented Design, and CSE776 - Design Patterns. Before reading this blog you may wish to review the Code Parsing Blog, as parsing is the most important and complex activity for the Analyzer. I'll assume that you are familiar with that material and not repeat those discussions here.

The mission of the Analyzer is to compute size and complexity metrics for all the source code files residing in the directory tree rooted at a specified path. It can also show you the Abstract Syntax Tree (AST) it built to hold the analysis results, and show the files analyzed with their line counts. Another goal of this design is to provide a structure that can easily be extended to other kinds of analysis, e.g., finding dependencies between all files in the working set, or identifying some of the flaws in a design.

The design and implementation come in two parts. The first part is a console application written in C++, using some of the features of the latest C++11 standard. It takes all of its inputs from its command line and writes its output to the console and also, if that option is chosen, to a log file deposited at the root of the analysis path. This first part implements the entire analysis functionality of the Analyzer.

The second part is a Graphical User Interface, written in the managed .Net language C++/CLI, using the Windows Presentation Foundation (WPF) framework¹. The purpose of the GUI is to support easily browsing to the root directory for analysis and to collect user settings. It builds the path and settings into a command line and then starts the Analyzer console application process with that command line.

The implementation was divided into these two parts because browsing for a starting path is much easier with a GUI using an OpenFileDialog² than from the command line with a lot of CDing to navigate. A console, however, is a better place to send a large volume of analysis results, which need space and flexibility to display.

In this discussion we look at the package structure of the analyzer, its activities, and the output it generates. At the end we will draw some conclusions about what makes this design interesting and areas for improvement.
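The step where the GUI turns a path and user settings into a command line can be illustrated with a small sketch. This is not the Analyzer's actual code: the `AnalyzerSettings` struct, the function names, and the option spellings (`/m`, `/a`, `/f`) are all hypothetical, chosen only to show the idea of assembling one string that is then handed to the console process.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical settings collected by the GUI. The field names and the
// option spellings used below are illustrative, not the Analyzer's own.
struct AnalyzerSettings {
    std::string rootPath;               // root of the directory tree to analyze
    std::vector<std::string> patterns;  // file patterns, e.g. "*.h", "*.cs"
    bool showMetrics = false;           // display size and complexity metrics
    bool showAst = false;               // display the Abstract Syntax Tree
    bool writeLog = false;              // also write output to a log file
};

// Wrap an argument in double quotes if it contains a space, so that a path
// like C:\My Projects arrives at the console application as one argument.
// Simplification: embedded quotes and trailing backslashes are not handled.
std::string quoteIfNeeded(const std::string& arg) {
    if (arg.find(' ') == std::string::npos) return arg;
    return "\"" + arg + "\"";
}

// Assemble the command line: executable, root path, patterns, then options.
std::string buildCommandLine(const std::string& exe, const AnalyzerSettings& s) {
    assert(!s.rootPath.empty() && "a root path is required");
    std::string cmd = quoteIfNeeded(exe) + " " + quoteIfNeeded(s.rootPath);
    for (const auto& pattern : s.patterns) cmd += " " + pattern;
    if (s.showMetrics) cmd += " /m";  // hypothetical option
    if (s.showAst)     cmd += " /a";  // hypothetical option
    if (s.writeLog)    cmd += " /f";  // hypothetical option
    return cmd;
}
```

Because the GUI only produces a string and starts a process, the console application stays usable on its own, with the same arguments typed by hand.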
  1. Packages:

    The Code Analyzer package structure is shown in the figure at the right. The packages focus on three areas:
  2. Activities:

    Activities are separated, in the activities diagram at the right, into two rows. The top row shows GUI activities and the bottom shows the Analyzer activities.
  3. Output:

    You see, in the figure at the right, a typical output for VisualCodeAnalyzer execution. The user has selected both C++ and C# file analysis, browsed to the root folder of the CodeAnalyser Solution, selected metric display, and started the CodeAnalyzer. The command line parameters for the console application are shown at the top of the display, along with the date and time of execution.

    It is convenient to have the controls and output in separate windows. We can look at the console output, decide to change some execution parameters, do that in the GUI window while still observing the output, and run again. One other point - one-way communication from the GUI application to the console is a lot easier to set up and manage than two-way communication between the executing code and GUI displays.

    Notice that public data is shown in the console window just below the class or struct that owns that data, and the display also localizes it to a particular package. In this application the public data are members of a couple of structs that are strictly private to the implementation. They are never returned from functions that another code author might have to deal with. I treat public data from classes, and any construct that other code has to use, as an error of design. I do not do that for data held in private structs.

Summary - The Good, The Bad, and the Ugly⁵:

Here are things I like about this design:

  1. The combination of GUI and Console Application. GUI for browsing and setting parameters combined with a console analysis application for execution of the analysis is simple, works well, and is very usable.
  2. The structure is fairly simple for processing as complex as code analysis. We've built something close to a compiler front end. It is of course simpler, because we need to recognize only a small part of the C++ and C# languages. It's also surprising how much code is common for the analysis of both C++ and C# code.
  3. The individual parts are all recognizable by name and function and they distribute the program's complexity fairly uniformly among themselves.
  4. Not much need to change:
    The structure is such that other applications like dependency analysis will keep almost all the packages intact, only modifying a few, like Window, ActionsAndRules, and Display, and those modifications will be small. Almost all of the rest of the fifteen packages will not need to change.

Here are things I don't like about the design:

  1. Processing is not concurrent.
    Analysis of each file is independent of that for every other file. The only thing that is shared is use of the Abstract Syntax Tree and scope stack.
    • That means that we could make the expensive parsing part concurrent, provided that we let each analyzer thread have its own Abstract Syntax Tree and Scope Stack. We just run each file's analysis on a thread pool thread.
    • That means that we have to build mechanics to merge the ASTs for each file, but that is close to trivial to accomplish.
    • We would also have to construct a parser for each file because the parser is welded to its scanner data source and that could work only on a single file at a time. It turns out that is also easy to do. In fact some of my parser demos do just that.
    So making this application concurrent should be relatively easy to do. I just haven't done that yet.
  2. It's incomplete.
    I haven't gotten around to trying it on Java code. I expect that since it works on C# it's very likely to need next to no changes for Java.

Here are the ugly parts:

  1. Well, I really can't think of anything I think is ugly about this design!

Footnotes:

  1. The C++/CLI compiler tool chain does not have a Xaml processor so all of the WPF functionality needs to be implemented with code - no declarative implementation.
  2. You might think we would use a FolderBrowserDialog for selecting directories. I did not for two reasons. The first is that the FolderBrowserDialog control doesn't work very well. It does not scroll down to the selected path when you open it. You have to manually scroll and that gets to be a pain. The second reason is that sometimes we want to select specific files to process, not all those matching a pattern. For that you need the OpenFileDialog.
  3. C++ surprisingly does not have a directory manipulation library, so I wrote one a couple of years ago and use that here.
  4. We build views programmatically because we can't do that declaratively with C++/CLI. See footnote 1.
  5. Thanks Clint! Great movie.
