There are several types of software tests, each with its own specific objectives:
Construction Tests - run by Developers:
As we build a software package we lay out its structure with declarations of one or more classes.
These are implemented by adding a few lines of code to a class method, adding a test for that method
in a construction test stub1, then building and executing the new test. Doing this continuously allows us
to keep the software always running - at first just doing simple things, but progressively adding more
detail, until the class meets its complete set of obligations. We do that for each class until
our package is complete. Then we repeat with the next package and so on until the project has been completed.
Unit Tests - run by Developers:
It is essential that we make packages, on which other packages depend, as free of errors as is practical.
Any errors in a low-level package imply that the operation of most packages that depend on the flawed
package will also be flawed. A unit test is an eyes-on, debugger walk-through that exercises every line
of code, function-by-function, toggles every predicate, and examines the results carefully to ensure correct operation.
These are expensive tests, and we often choose not to unit test high-level packages, especially
Graphical User Interfaces2, because their operations should be simple and clear.
Integration Tests - run by Developers and QA:
Integration tests are automated tests that exercise changed or newly introduced
code in the context of the entire project baseline3, or some large subset of that.
When we create a new package or modify an existing package, the changes we make may break not only packages
that directly depend on the new version, but also other packages that are not obviously connected. For
example, a package that manipulates data in a database may introduce changes to the database that break other
code that depends on the database state. This can happen even if the new version has no compilation dependency
relationships with the other package.
Integration tests are aggregated
over the course of development of a large project and are run frequently to ensure that new code doesn't break
the existing baseline. Each integration test driver (there will be many of them) has, embedded in comments of its
source code, a test description that defines what the test is trying to demonstrate and the steps it uses to do that.
Regression Tests - run by QA:
Regression tests are very similar to Integration tests. The only difference is their purpose. We use
regression testing when porting an application to a new or modified platform, and when we deploy the
application to another site. Here we run a sequence of tests on the entire baseline to show that it still
functions in the same way it did in its original environment.
Performance and Stress Testing - run by SW Architect and technical Developers:
Performance tests are run to see if the system is meeting its allocated time and CPU resource budgets. Stress
tests are designed to exceed the required input load capacity and demand more of the platform's resources than
expected. The purpose of these tests is to ensure that the system survives unusual loads and resource constraints,
continuing to operate, although it may not satisfy all its obligations in those extreme environments.
Qualification Tests - run by QA under supervision of a Program Manager:
The final demonstration of a large software system, called a Qualification Test, is used to demonstrate that the
system meets all of its obligations to the
customer as laid out in a system specification. This qualification testing is a legalistic, not technical, test.
It should be very organized, detailed, and boring4. Qualification testing uses both test automation, like
regression tests, and hands-on probing and demonstration of the system's operational details to the customer.
Implementing Tests:
Construction tests are implemented manually as you work on the initial construction of a system. Every package has
a main function that is included or excluded from compilation by a compiler directive. We refer to this main function
as a test stub; a minimal sketch appears after the steps below. During construction testing for each package we:
Lay out class declarations that are initially empty. This allows us to think about the functionality required
and the language we want to provide for higher level packages to use.
Begin populating classes with methods, one-at-a-time. Each method starts with a few lines of code.
For each new small addition of code we may add a test in the test stub, then build and execute. When there are flaws
in our implementation, we know where to look - it is very likely to be in the code we just added.
Repeat this process until the package has become fully functional.
Repeat all of the steps, above, for each package in the project.
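As a concrete illustration, here is a minimal sketch of such a package in C# (the names are hypothetical, not a prescribed design). The test stub main function is enclosed in a compiler directive, as described in the endnotes, so the package can be built either for stand-alone construction testing or for integration into a larger body of code.

    // StringUtil.cs - hypothetical package illustrating a construction test stub
    using System;

    public class StringUtil
    {
        // first small increment of functionality
        public static string Reverse(string s)
        {
            char[] chars = s.ToCharArray();
            Array.Reverse(chars);
            return new string(chars);
        }

    #if TEST_STRINGUTIL
        // test stub - compiled only when TEST_STRINGUTIL is defined
        static void Main()
        {
            Console.WriteLine("testing StringUtil.Reverse");
            Console.WriteLine(StringUtil.Reverse("construction"));   // expect "noitcurtsnoc"
        }
    #endif
    }

Building with csc -define:TEST_STRINGUTIL StringUtil.cs yields a stand-alone test executable; building with -target:library and without the symbol yields a DLL ready for integration into larger code.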
Unit tests walk through every line of code with a debugger, ensuring that every branch has been
activated by test inputs. We do this function-by-function for each class in the package.
This means that we will need to:
Do some initial planning for test execution.
Build a test driver that supplies inputs and may log package state and results.
The driver provides inputs that drive the code into each of its possible states5.
For complex functions we may need to accept data stored in files or a database and record results in files or database state.
Each test driver records, in comments, the resources it uses6, e.g., files and databases.
When unit testing of a package is complete, the test driver is saved in a repository for future use.
For critical packages we may repeat unit tests for every new version.
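A hand-written unit test driver of the kind described above might look like the sketch below (C#, with hypothetical names). The driver supplies inputs chosen to toggle each predicate in the tested function, logs time-stamped results, records in a comment the resources it uses, and is walked through under a debugger line-by-line.

    // UnitTestDriver.cs - hypothetical unit test driver
    // resources used: none - all inputs are supplied in-line
    using System;

    public static class Grader
    {
        // function under test - the driver must toggle both branches
        public static string Classify(int score)
        {
            if (score < 0 || score > 100)
                throw new ArgumentOutOfRangeException(nameof(score));
            return score >= 60 ? "pass" : "fail";
        }
    }

    public static class UnitTestDriver
    {
        static void Check(string label, string expected, string actual)
        {
            string status = (expected == actual) ? "passed" : "FAILED";
            Console.WriteLine("{0}  {1}: expected \"{2}\", got \"{3}\" - {4}",
                              DateTime.Now, label, expected, actual, status);
        }

        static void Main()
        {
            Check("boundary pass", "pass", Grader.Classify(60));   // drives the true branch
            Check("boundary fail", "fail", Grader.Classify(59));   // drives the false branch
            try { Grader.Classify(-1); Console.WriteLine("out-of-range input: FAILED - no exception"); }
            catch (ArgumentOutOfRangeException) { Console.WriteLine("{0}  out-of-range input: passed", DateTime.Now); }
        }
    }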
There are also frameworks like NUnit and JUnit that support testing .Net and Java code, respectively. Developers
call these Unit test frameworks, but the testing they support is a combination of parts of unit testing and integration
testing.
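For comparison, here is a minimal sketch of the same checks written as NUnit tests for the hypothetical Grader class from the previous sketch; the framework discovers the [Test] methods, runs them, and reports results, taking over part of the driver and logging work.

    // GraderTests.cs - hypothetical NUnit tests for the Grader class
    using NUnit.Framework;

    [TestFixture]
    public class GraderTests
    {
        [Test]
        public void Classify_ScoreOfSixty_ReturnsPass()
        {
            Assert.AreEqual("pass", Grader.Classify(60));
        }

        [Test]
        public void Classify_ScoreBelowSixty_ReturnsFail()
        {
            Assert.AreEqual("fail", Grader.Classify(59));
        }

        [Test]
        public void Classify_NegativeScore_Throws()
        {
            Assert.Throws<System.ArgumentOutOfRangeException>(() => Grader.Classify(-1));
        }
    }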
Integration tests are a fundamental part of continuous software integration and are used by
each developer and Quality Assurance (QA) to test code against the
current project baseline or some significant part of the baseline. These are automated tests for which we need:
A Test Harness that loads and executes tests on demand.
For each test, the Test Harness loads a dynamic link library (DLL)7 that contains
a test driver and the code to be tested8.
The test driver derives from an ITest interface and supplies a factory class with a static method to create instances
of all the classes used by the test.
A source of test inputs that may be supplied by the developer or, where feasible, by the test harness.
A mechanism for logging test results and possibly internal state values. These logs are named, identify the test
developer and the code tested, including its version, and are time-date stamped for each test execution.
A code Repository that holds the current project baseline and can deliver, on demand, DLLs for all the code on which
the tested code depends or may affect. Note that this implies the repository has the capability to evaluate calling dependency
relationships.
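The details above are project-specific, but a minimal sketch of those pieces might look like the following in C# (the exact shape of ITest is an assumption, and reflection over the loaded types stands in for the factory class described above): a test driver that implements ITest, and a harness fragment that loads a test DLL and instantiates every driver it finds.

    // ITest.cs - contract shared by the Test Harness and every test driver (assumed shape)
    public interface ITest
    {
        bool Test();        // returns true if the test passed
        string GetLog();    // returns the log accumulated while testing
    }

    // GraderTestDriver.cs - built into the test DLL along with the code it tests
    public class GraderTestDriver : ITest
    {
        private readonly System.Text.StringBuilder log = new System.Text.StringBuilder();

        public bool Test()
        {
            bool passed = Grader.Classify(60) == "pass" && Grader.Classify(59) == "fail";
            log.AppendLine(System.DateTime.Now + "  GraderTestDriver: " + (passed ? "passed" : "FAILED"));
            return passed;
        }

        public string GetLog() { return log.ToString(); }
    }

    // Harness fragment - loads a named test DLL and runs every ITest implementation it contains
    public static class HarnessLoader
    {
        public static void Run(string dllPath)
        {
            var assembly = System.Reflection.Assembly.LoadFrom(dllPath);
            foreach (var type in assembly.GetTypes())
            {
                if (typeof(ITest).IsAssignableFrom(type) && !type.IsInterface && !type.IsAbstract)
                {
                    var test = (ITest)System.Activator.CreateInstance(type);
                    System.Console.WriteLine("{0} -> {1}", type.Name, test.Test() ? "passed" : "failed");
                    System.Console.Write(test.GetLog());
                }
            }
        }
    }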
Test requests consist of a message, perhaps in XML format, that lists a series of one or more tests to execute. The
Test Harness will execute each request on its own thread, probably using a thread pool, and execute
each test in the request sequentially. A developer is likely to submit test requests with only a few tests while
QA will submit test requests with many tests, perhaps all of the tests that have been defined for a specific subsystem.
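A sketch of how the harness might dispatch such requests is shown below, assuming a simple TestRequest type (hypothetical) that lists the test DLLs to run; each request is handed to a thread-pool thread, and the tests inside it run sequentially, as described above.

    // Hypothetical request dispatching inside the Test Harness
    using System.Collections.Generic;
    using System.Threading.Tasks;

    public class TestRequest
    {
        public string Author { get; set; }
        public List<string> TestDlls { get; } = new List<string>();   // tests to run, in order
    }

    public static class RequestDispatcher
    {
        // each request gets its own thread-pool thread; its tests run one after another
        public static Task Dispatch(TestRequest request)
        {
            return Task.Run(() =>
            {
                foreach (string dll in request.TestDlls)
                    HarnessLoader.Run(dll);    // loader from the previous sketch
            });
        }
    }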
Test drivers and test resources are stored in the Repository, as they will be used many times, some perhaps
thousands of times over the lifetime of a project. Essentially test drivers are part of the software baseline.
Ideally the Repository and Test Harness are designed to work together.
Existing configuration management systems can be made
to work as repositories in the sense we discuss here, but it is hard to get them to deliver just what is needed for a given
test. More likely they clone massive parts of the baseline and give us many more parts to deal with than needed by each test.
That happens because they don't make dependency graphs available. They just provide clones of named parts. Therefore, the
developer has to figure out what is needed, or, more likely, will just clone big globs of code that hopefully contain all of
the depended upon code. Note that a developer certainly knows the packages on which his code depends directly, but is unlikely
to know all of the dependency descendants of those packages.
For the Repository, I prefer a dedicated server that stores each version of each package only once,
has metadata for each package that describes its dependency relationships - thus forming a virtual dependency graph for each
package - and contains useful descriptive information in the metadata to support browsing and analysis. It supplies, on demand,
the entire dependency graph of a named component9.
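A sketch of the metadata walk such a Repository might perform (described further in the endnotes), with assumed types, is shown here: each package version records the package versions it depends on, and the Repository flattens that virtual graph into the list of filespecs the Test Harness needs.

    // Hypothetical Repository metadata and dependency-graph walk
    using System.Collections.Generic;

    public class PackageMetadata
    {
        public string Name { get; set; }
        public string Version { get; set; }
        public string FileSpec { get; set; }      // where this package version's DLL is stored
        public List<string> Dependencies { get; } = new List<string>();   // keys of depended-on packages
    }

    public class Repository
    {
        // key is "Name-Version", e.g. "Grader-1.2"
        private readonly Dictionary<string, PackageMetadata> catalog =
            new Dictionary<string, PackageMetadata>();

        // returns the filespecs of the named package and all of its dependency descendants
        public List<string> DependencyClosure(string key)
        {
            var visited = new HashSet<string>();
            var files = new List<string>();
            Walk(key, visited, files);
            return files;
        }

        private void Walk(string key, HashSet<string> visited, List<string> files)
        {
            if (!visited.Add(key)) return;     // already visited - avoids repeats and cycles
            PackageMetadata meta = catalog[key];
            files.Add(meta.FileSpec);
            foreach (string child in meta.Dependencies)
                Walk(child, visited, files);
        }
    }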
The Repository will need to provide build facilities for creating DLLs needed for testing. These will almost certainly
be cached in the Repository baseline. It will also need to support versioning information and ownership policies to determine
who is allowed to check in changes to any package.
Here's a link that discusses using git for large builds and suggests using Maven to manage project dependencies - Maven was designed
to support Java development:
Code Dependencies - pain point with classic config mgrs like git
There are dependency managers for .Net, none of which I've used yet. Here are some links:
Paket - package manager with dependency resolution
nuget - package manager - dependency resolution?
Using nuget with repositories
Regression Tests are really just the same tests used for integration, but are used for other purposes, as
described in the Initial Thoughts section. Their implementation will not differ from integration tests in any significant way.
Performance Tests are specialized tests that use the integration test framework with drivers that are
designed specifically for timing and throughput analysis. Instead of running many tests on different parts of the baseline,
they run many tests repeatedly on the same code, collecting execution times for each run and then building outputs that
provide statistics on execution times and throughput.
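For example, a performance test driver built on that idea might look like the sketch below (hypothetical names, using the Grader class from earlier sketches as a stand-in for the code under measurement): it runs the same operation many times under a Stopwatch, then reports simple timing and throughput figures that a fuller driver would extend with percentile statistics.

    // Hypothetical performance test driver fragment
    using System;
    using System.Diagnostics;
    using System.Linq;

    public static class PerfDriver
    {
        static void Main()
        {
            const int runs = 1000;
            var timesMs = new double[runs];
            var sw = new Stopwatch();

            for (int i = 0; i < runs; ++i)
            {
                sw.Restart();
                Grader.Classify(60);                 // operation under measurement (stand-in)
                sw.Stop();
                timesMs[i] = sw.Elapsed.TotalMilliseconds;
            }

            Console.WriteLine("runs:       {0}", runs);
            Console.WriteLine("mean:       {0:F4} ms", timesMs.Average());
            Console.WriteLine("max:        {0:F4} ms", timesMs.Max());
            Console.WriteLine("throughput: {0:F0} ops/sec", runs / (timesMs.Sum() / 1000.0));
        }
    }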
Stress tests are specialized tests that work much like performance tests. For performance testing we
provide inputs and an environment that is specified for the application. For stress testing we provide inputs that
exceed the specified load and modify environments in ways that the application was not expected to tolerate. We look
critically for any error states from which the application cannot recover while still running. We also look for
corruption of persistent state in databases or files that will prevent the system from successfully restarting into
a fully functional state.
Qualification Tests are very thorough examinations of the operations of a system to ensure that it
meets all of its specified obligations. A Test Harness and Repository can be
excellent resources for Qualification tests.
The extent of Qualification testing is very much dependent on the size of each project and the context in which the
system is built, e.g., large projects funded by government agencies vs. internal industrial projects carried out to
extend existing products or to implement a new product.
You may wish to look at CSE681 - SMA: Project #5 to see how these parts,
Test Harness and Repository,
fit into a Software Development Environment.
Your use of Testing in Courses:
In the courses you will take in your program of study, as you work on class projects, you should always use construction
testing, starting with a simple system shell that doesn't do much but does compile and run. You add functionality,
package-by-package, in small pieces, adding construction tests with each small addition of functionality. This way
you keep the system always running and progressively acquire more functionality until you are finished or the Project
due date has arrived.
We discuss test frameworks in CSE681 - Software Modeling and Analysis (SMA), and also briefly in
CSE687 - Object Oriented Design (OOD).
For example, in the Fall of 2016 we are developing, in CSE681 - Software Modeling and Analysis, a test harness that
could be used in a collaboration system for software development.
You probably will not engage in Regression, Performance, Stress, or Qualification testing as part of your program of study, but
may acquire that as on the job training once you are working professionally. We used to do some of that in CSE784 - Software Studio,
but unfortunately we are no longer able to offer that course.
Each package contains a main function enclosed with compiler directives that allow us to compile for stand-alone operation
using the main, or for integration into a larger body of code, without the main. We call this main function a
test stub.
Graphical User Interfaces should focus on accepting program data from the user and displaying results. They should do
no other computation. Any computation they need should be delegated to another package that can be tested effectively
in a test framework.
The project baseline is all the code associated with the project's current state. It includes only code that has been
successfully tested and checked into the project's Repository. That includes both product code and test drivers.
If a Qualification test is exciting, we have problems. Qualification should proceed in an orderly fashion, demonstrating
to the customer(s) that the system works as specified. We don't want any unexpected behavior or lack of expected behavior.
If a function has high complexity, it may be untestable. That is, it may be far easier to throw away the function and
rebuild using a simpler structure than to test all the permutations of its many states.
These comments form a test description for the tested code. It would be relatively easy to build an analysis tool that
extracts these "descriptions" from all of the test drivers and builds a test document in real time - always up-to-date
and as accurate as the comments in the driver code.
A dynamic link library (DLL) is a compiled binary that is constructed to load into an executable image at run-time.
Unix and Linux developers refer to these as "shared" libraries.
It would be equally effective to build separate DLLs for the test driver and tested code, where the test driver loads
the tested code DLL when it is loaded.
The Repository recursively scans metadata, starting at the node for the requested component, and recursing into all its
descendants, capturing a list of all the filespecs it encounters. This list is what the Test Harness uses to get the
DLLs it needs for any given test. Obviously it would be desirable to cache the downloaded DLLs so we don't keep sending
some of the same components test-after-test.