
Recorded Test

The book has now been published and the content of this chapter has likely changed substantially.
Please see page 278 of xUnit Test Patterns for the latest information.
Also known as: Record and Playback Test, Robot User Test, Capture/Playback Test

How do we prepare automated tests for our software?

We automate tests by recording interactions with the application and playing them back using a test tool.

Sketch Recorded Test embedded from Recorded Test.gif

Automated tests serve several purposes. They can be used for regression testing software after it has been changed. They can help document the behavior of the software. They can also be used to specify the behavior of the software before it has been written. How we prepare the automated test scripts affects which purpose they can be used for, how robust they are to changes in the system under test (SUT) and how much skill and effort it takes to prepare them.

Recorded Tests allow us to rapidly create regression tests after the SUT has been built and before it is changed.

How It Works

We use a tool that monitors our interactions with the SUT as we use it. This tool keeps track of most of what the SUT communicates to us and our responses to it. When the recording session is done, we can save the session to a file for later playback. When we are ready to run the test, we start up the "playback" part of the tool and point it at the recorded session. It starts up the SUT and feeds it our recorded inputs in response to the SUT's outputs. It may also compare the SUT's outputs with what the SUT emitted during the recording session. Failure to match may be cause for failing the test.

Some Recorded Test tools allow us to adjust the sensitivity of the comparisons that the tool makes between what the SUT said to us during the recording session and what it says during playback. Most Recorded Test tools interact with the SUT through the user interface.
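
To make the mechanism concrete, here is a minimal sketch in C of what the playback half of such a tool does. Everything in it is an assumption for illustration: the RecordedStep structure and the next_step, sut_send and sut_read functions stand in for whatever session format and SUT driver a real tool provides behind its user interface.

#include <stdio.h>
#include <string.h>

typedef struct {
   char input[128];    /* the response we gave the SUT while recording */
   char output[128];   /* what the SUT displayed after that response   */
} RecordedStep;

/* Hypothetical hooks supplied by the playback tool and a SUT driver: */
extern RecordedStep *next_step(void);          /* next step of the saved session   */
extern void          sut_send(const char *in); /* feed a recorded input to the SUT */
extern const char   *sut_read(void);           /* capture the SUT's latest output  */

int play_back_session(void)
{
   RecordedStep *step;
   int failures = 0;

   while ((step = next_step()) != NULL) {
      sut_send(step->input);                   /* replay the recorded input       */
      const char *actual = sut_read();         /* see what the SUT says this time */
      if (strcmp(actual, step->output) != 0) { /* exact match; real tools let us  */
         failures++;                           /* relax this "sensitivity"        */
         printf("MISMATCH: expected \"%s\" but got \"%s\"\n", step->output, actual);
      }
   }
   return failures == 0;  /* the test passes only if every output matched */
}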

When To Use It

Once an application is up and running and we don't expect a lot of changes to it, we can use Recorded Tests to do regression testing. We could also use Recorded Tests when we have an existing application that needs to be refactored (in anticipation of modifying the functionality) and we do not have Scripted Tests (page X) available to use as regression tests. It is typically much quicker to produce a set of Recorded Tests than to prepare Scripted Tests for the same functionality. In theory, the test recording can be done by anyone who knows how to operate the application; very little technical expertise should be required. In practice, many of the commercial tools have a steep learning curve. As well, some technical expertise may be required to add "checkpoints", to adjust the sensitivity of the playback tool, or to adjust the test scripts when the recording tool becomes confused and records the wrong information.

Because most Recorded Test tools interact with the SUT through the user interface, they are particularly prone to fragility if the user interface of the SUT is evolving (Interface Sensitivity (see Fragile Test on page X)). Even small changes, such as changing the internal name of a button or field, may be enough to cause the playback tool to stumble. The tools also tend to record information at a very low and detailed level, making the tests hard to understand (Obscure Test (page X)) and therefore hard to repair by hand when they are broken by changes to the SUT. Therefore, we should plan on rerecording the tests fairly regularly if we continue to evolve the SUT.

If we want to use the Tests as Documentation (see Goals of Test Automation on page X) or we want to use the tests to drive new development, we should consider using Scripted Tests. These goals are difficult to address with commercial Recorded Test tools because most do not let us define a Higher Level Language (see Principles of Test Automation on page X) for the test recording. This can be addressed by building the Recorded Test capability into the application itself or by using Refactored Recorded Test.

Variation: Refactored Recorded Test

A hybrid of the two strategies is to use the "Record, Refactor, Playback" sequence (to the best of my knowledge, the name "Record, Refactor, Playback" was coined by Adam Geras) to extract a set of "action components" or "verbs" from the newly Recorded Tests and then rewire the test cases to call these "action components" instead of containing detailed in-line code. Most commercial capture/replay tools provide the means to turn Literal Values (page X) into parameters that can be passed into the "action component" by the main test case. When a screen changes, we simply rerecord the "action component"; all the test cases continue to function by automatically using the new "action component" definition. This is effectively the same as using Test Utility Methods (page X) to interact with the SUT in unit tests. It opens the door to using the Refactored Recorded Test components as a Higher Level Language in Scripted Tests. Tools such as BPT use this paradigm for scripting tests top-down; once the high-level scripts are developed and the components required for the test steps are specified, more technical people can either record or hand-code the individual components.

Implementation Notes

There are two basic choices when using a Recorded Test strategy: We can either acquire third-party tools that record while we interact with the application or we can build a "record&playback" mechanism right into our application.

Variation: External Test Recording

There are many test recording tools available commercially and each has its own strengths and weaknesses. The best choice will depend on the nature of the user interface of the application, our budget, how complex the functionality to be verified is and possibly other factors.

If we want to use the tests to drive development, we need to pick a tool whose test recording file format is editable by hand and easily understood. We'll need to hand-craft the contents, which is really an example of a Scripted Test even if we are using a "Record and Playback" tool to execute the tests.

Variation: Built-In Test Recording

It is also possible to build a Recorded Test capability into the SUT. We have done this on several projects and have found it to be very valuable. The test scripting "language" can be defined at a fairly high level, high enough to make it possible to hand-script the tests even before the system is built. It has also been reported that the VBA macro capability of Microsoft's Excel spreadsheet started out as a mechanism for automated testing of Excel.

Example: Built-In Test Recording

On the surface, it doesn't make sense to provide a code sample for a Recorded Test because the pattern is about how the test is produced, not how it is represented. When the test is played back, it is in effect a Data-Driven Test (page X). Likewise, we don't often refactor to Recorded Test, as this is often the first test automation strategy attempted on a project. The one case where we might introduce Recorded Tests after attempting Scripted Tests is when we discover that we are missing tests because the cost of scripting them by hand is too high. In that case, we won't be trying to turn existing Scripted Tests into Recorded Tests; we'd just record new tests.

Here's an example of a test recorded by the application itself. The tests were used to regression test a safety-critical application after it was ported from C on OS/2 to C++ on Windows. Note how the recorded information forms a domain-specific Higher Level Language that is quite readable by a user:

<interaction-log>
   <commands>
      <!-- more commands omitted -->
      <command seqno="2" id="Supply Create">
         <field name="engineno" type="input">
            <used-value>5566</used-value>
            <expected></expected>
            <actual status="ok"/>
         </field>
         <field name="direction" type="selection">
            <used-value>SOUTH</used-value>
            <expected>
               <value>SOUTH</value>
               <value>NORTH</value>
            </expected>
            <actual>
               <value status="ok">SOUTH</value>
               <value status="ok">NORTH</value>
            </actual>
         </field>
      </command>
      <!-- more commands omitted -->
   </commands>
</interaction-log>
Example SutRecordedTest embedded from RecordedTests/TrainControlTest.xml

This is actually the output of having played back the tests. The actual elements were inserted by the built-in playback mechanism. The status attributes indicate whether each actual value matched the expected value. We applied a style sheet to these files to format them much like a Fit test with color-coded results. All the recording, replaying and result analysis was done by the business users on the project.

This recording was made by inserting hooks into the presentation layer of the software to record the lists of choices offered to the user and how the user responded. An example of one of these hooks is:

if (playback_is_on()) {
   /* feed the choice captured in the recording instead of asking the user */
   choice = get_choice_for_playback(dialog_id, choices_list);
} else {
   /* normal operation: let the user choose from the dialog */
   choice = display_dialog(choices_list, row, col, title, key);
}

if (recording_is_on()) {
   /* log the choices offered, the choice taken and the dialog key */
   record_choice(dialog_id, choices_list, choice, key);
}
Example SutRecordingHooks embedded from RecordedTests/TrainControlHooks.txt

The method record_choice generates the actual element and does the "assertions" against the expected elements, recording the result in the status attribute of each element.
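
The chapter does not show record_choice itself; the following is a hypothetical sketch of what it might do during playback, assuming the XML interaction log shown above and assuming the values captured in the recording are passed in as the expected list:

#include <stdio.h>
#include <string.h>

/* Hypothetical sketch only: the parameter list and the "expected" values
   (loaded from the earlier recording) are assumptions, not the real code. */
void record_choice(FILE *log, const char *field_name, const char *chosen,
                   const char **choices, int n_choices,
                   const char **expected, int n_expected)
{
   fprintf(log, "<field name=\"%s\" type=\"selection\">\n", field_name);
   fprintf(log, "   <used-value>%s</used-value>\n", chosen);
   fprintf(log, "   <actual>\n");
   for (int i = 0; i < n_choices; i++) {
      /* the "assertion": each choice offered now must match the recorded one */
      int ok = (i < n_expected) && (strcmp(choices[i], expected[i]) == 0);
      fprintf(log, "      <value status=\"%s\">%s</value>\n",
              ok ? "ok" : "failed", choices[i]);
   }
   fprintf(log, "   </actual>\n");
   fprintf(log, "</field>\n");
}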

Example: Commercial Record and Playback Test Tool

Just about every commercial testing tool uses a "record&playback" metaphor. Each also defines its own Recorded Test file format. Most of these formats are very verbose. The following is a "short" excerpt from a test recorded using Mercury Interactive's QuickTest Professional (QTP) tool. It is shown in "Expert View", which exposes what is really recorded: a VBScript program! I've inserted comments (preceded by "@@") manually to get a better handle on what this test is doing; these comments would be lost if the test were rerecorded after a change to the application caused the test to no longer run.

@@
@@ GoToPageMaintainTaxonomy()
@@
Browser("Inf").Page("Inf").WebButton("Login").Click 
Browser("Inf").Page("Inf_2").Check CheckPoint("Inf_2")
Browser("Inf").Page("Inf_2").Link("TAXONOMY LINKING").Click
Browser("Inf").Page("Inf_3").Check CheckPoint("Inf_3")
Browser("Inf").Page("Inf_3").Link("MAINTAIN TAXONOMY").Click
Browser("Inf").Page("Inf_4").Check CheckPoint("Inf_4")
@@
@@ AddTerm("A","Top Level", "Top Level Definition")
@@
Browser("Inf").Page("Inf_4").Link("Add").Click 
wait 4
Browser("Inf_2").Page("Inf").Check CheckPoint("Inf_5")
Browser("Inf_2").Page("Inf").WebEdit("childCodeSuffix").Set "A" 
Browser("Inf_2").Page("Inf").WebEdit("taxonomyDto.descript").Set "Top Level"
Browser("Inf_2").Page("Inf").WebEdit("taxonomyDto.definiti").Set "Top Level Definition"
Browser("Inf_2").Page("Inf").WebButton("Save").Click 
wait 4
Browser("Inf").Page("Inf_5").Check CheckPoint("Inf_5_2")
@@
@@ SelectTerm("[A]-Top Level")
@@
Browser("Inf").Page("Inf_5").WebList("selectedTaxonomyCode").Select "[A]-Top Level"
@@
@@ AddTerm("B","Second Top Level", "Second Top Level Definition")
@@
Browser("Inf").Page("Inf_5").Link("Add").Click 
wait 4
Browser("Inf_2").Page("Inf_2").Check CheckPoint("Inf_2_2")   
   infofile_;_Inform_Alberta_21.inf_;_hightlight id_;
      _Browser("Inf_2").Page("Inf_2")_;_
@@
@@ and it goes on, and on, and on ....
Example QtpRecordedTest embedded from RecordedTests/IA QuickTest Sample-editted.txt

Note how the test is very much focused on the user interface of the application. It suffers from two main issues: an Obscure Test caused by the detailed nature of the recorded information, and Interface Sensitivity resulting in a Fragile Test.

Refactoring Notes

We can make this test more useful as documentation, reduce High Test Maintenance Cost (page X), and enable composition of other tests in a Higher Level Language by applying a series of Extract Method [Fowler] refactorings.

Example: Refactored Commercial Recorded Test

This is an example of the same test refactored to Communicate Intent (see Principles of Test Automation):

GoToPageMaintainTaxonomy()
AddTerm("A","Top Level", "Top Level Definition")
SelectTerm("[A]-Top Level")
AddTerm("B","Second Top Level", "Second Top Level Definition")
Example QtpRefactoredTest embedded from RecordedTests/IA QuickTest Sample-refactored.txt

Note how much more intent-revealing this test has become. The Test Utility Methods we extracted look like this:

Method GoToPageMaintainTaxonomy()
   Browser("Inf").Page("Inf").WebButton("Login").Click 
   Browser("Inf").Page("Inf_2").Check CheckPoint("Inf_2")
   Browser("Inf").Page("Inf_2").Link("TAXONOMY LINKING").Click
   Browser("Inf").Page("Inf_3").Check CheckPoint("Inf_3")
   Browser("Inf").Page("Inf_3").Link("MAINTAIN TAXONOMY").Click
   Browser("Inf").Page("Inf_4").Check CheckPoint("Inf_4")
End

Method AddTerm( code, name, description)
   Browser("Inf").Page("Inf_4").Link("Add").Click 
   wait 4
   Browser("Inf_2").Page("Inf").Check CheckPoint("Inf_5")
   Browser("Inf_2").Page("Inf").WebEdit("childCodeSuffix").Set code
   Browser("Inf_2").Page("Inf").WebEdit("taxonomyDto.descript").Set name
   Browser("Inf_2").Page("Inf").WebEdit("taxonomyDto.definiti").Set description
   Browser("Inf_2").Page("Inf").WebButton("Save").Click 
   wait 4
   Browser("Inf").Page("Inf_5").Check CheckPoint("Inf_5_2")
end

Method SelectTerm( path )
   Browser("Inf").Page("Inf_5").WebList("selectedTaxonomyCode").Select path
   Browser("Inf").Page("Inf_5").Link("Add").Click 
   wait 4
end
Example QtpRefactoredTestUtilities embedded from RecordedTests/IA QuickTest Sample-refactored.txt

This example is one I hacked together to illustrate the similarities to what we do in xUnit; don't try running it at home as it is probably not syntactically correct.

Further Reading

The paper "Agile Regression Testing Using Record & Playback" [ARTRP] describes our experiences building a Recorded Test mechanism into an application to facilitate porting it to another platform.


