Testing User Input: One Approach

An Aside on Styles in Passing Arguments to Functions
Back to Calling the Test

The Test Subroutine
Adapting to Perl's Standard Testing Framework
Frequently Asked Questions

Submission for YAPC::NA::2005, Toronto

2nd draft: Sun Apr 10 09:43:24 EDT 2005

From time to time we have to write Perl programs which are run on a batch basis by someone who knows nothing about Perl. If we are smart about it, we will thoroughly test such programs, because if something does go wrong with such a program, the program operator is in no position to fix it.

But what if such a program requires input from the operator, such as responses to questions posed in a GUI message box or a terminal prompt? How do we test that we have properly handled operator input? More specifically, how do we incorporate testing for user input in files built on Perl's standard testing apparatus?

In this presentation I describe one way to do it -- a relatively simple and unsophisticated way that works for me and may work for you as well. I'm not going to speak about handling user input from a GUI. I'm only going to speak about "pre-empting standard input."

The Production Problem

In my day job as an administrator in a psychiatric facility, I had to figure out a way to schedule treatment groups by group leader, time and room, and then to tally and report statistics as to whether those groups actually took place or not. Naturally, I wanted to embody this process in a Perl program. But, unlike some of my earlier on-the-job efforts using Perl, I wanted to delegate the day-to-day operation of this program to a colleague who knew nothing about Perl. I wanted her to be able to open up a terminal window, call a Perl script, and enter responses at command prompts. I wanted this process to be as smooth and painless as possible. For example, I didn't want her to have to see a single Perl warning about uninitialized variables.

I wrote a suite of modules and we used it successfully for two years. Then my boss asked me to report the data in a different way -- a way that, to my chagrin, would require substantial overhaul of my modules. The modules had to be able to compute the statistics in both the old and new ways -- and to get both right. How would I know if I had gotten both right? By consulting my test suite, of course. The only problem was: I hadn't written a test suite. I had simply relied on visual inspection of the output files. That would no longer suffice.

So before I could overhaul my modules I was faced with the task of retroactively writing a test suite for the modules as I had created them two years ago. That meant writing tests for those methods which called for operator input. It meant creating a set of testing data which could be provided to the test suites at exactly the same points where the production program would call for standard input. It called for pre-empting standard input.

Example: Entering a New Room into the Database

Room.pm is the module containing methods for entering, changing and deleting data about rooms in which treatment groups are held. Apart from its constructor, it has three publicly callable methods:

    enter_new_room()
    change_room_data()
    delete_room_data()

To enter a new room in the database, the operator must respond to command-prompts for:

room number
section of the building where room is located
room capacity
comment: special features of room

Within method enter_new_room(), each of these questions is represented by an internal subroutine:

    _room()
    _section()
    _capacity()
    _comment()

So far we can anticipate having to provide four pieces of information for a passing test. enter_new_room() also calls an internal subroutine _verify() which asks, "Do you wish to make any changes in this data?" The operator's response constitutes a fifth piece of information needed for a passing test.

Since the command-prompts appear in a particular order, the data which substitutes for STDIN must appear in a particular order as well -- which suggests that they would be held in a Perl array:

    @responses = (
        4001,
        'A',
        20,
        'Kitchen:  sink and refrigerator',
        '',           # no changes requested
    );

But some of these internal subroutines themselves have internal error-checking capacity. For example, since all the rooms used for treatment groups have numbers between 4000 and 4099, _room() checks to make sure that the data entered by the operator at the command-prompt is all numerical and in the proper range. That implies that in order to provide a thorough test of enter_new_room(), we have to write some tests which provide incorrect entries and see if the program prompts the operator for better input. In this case, @responses would need more than 5 elements to represent a passing test.

    @responses = (
        'Room 4001',  # incorrect: not strictly numeric
        3001,         # incorrect: out of range
        4001,         # correct
        'A',
        20,
        'Kitchen:  sink and refrigerator',
        '',           # no changes requested
    );

You can anticipate that in our test we will be shifting one element at a time off @responses and feeding it to the testing routine.
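
To see why the incorrect entries have to come first in the array, here is a minimal sketch of a prompt loop that keeps asking until it gets a valid room number. The names prompt_for_room and $next_response are hypothetical and are not taken from Room.pm; only the 4000-4099 range comes from the description above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the kind of checking _room() performs:
# keep asking until the entry is all digits and between 4000 and 4099.
sub prompt_for_room {
    my ($next_response) = @_;    # code ref that supplies one response per call
    while (1) {
        my $try = $next_response->();
        die "responses exhausted\n" unless defined $try;
        chomp $try;
        return $try if $try =~ /^\d+$/ && $try >= 4000 && $try <= 4099;
    }
}

my @responses = ('Room 4001', 3001, 4001, 'A', 20);

# Each call shifts exactly one element off @responses, in order.
my $room = prompt_for_room(sub { shift @responses });

print "room accepted: $room\n";
print "responses left: ", scalar(@responses), "\n";
```

The two bad entries are consumed and rejected before the good one is accepted, so the remaining elements still line up with the prompts that follow.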

Calling the Test

In a minute, we'll look at how we actually run a test whose data pre-empts standard input. First, let's look at the test's interface, i.e., how the test would appear in a standard Perl test file.

I found in practice that routines for entering, changing or deleting data tend to have their own peculiarities. So I found it convenient to write a testing subroutine for each of those operations. For example, to test the major methods of Room.pm I have three testing subroutines:

    test_enter_new_room()
    test_change_room_data()
    test_delete_room_data()

Most of the code within these three subroutines is repeated and can be refactored out into internal subroutines. So the testing subroutines are actually rather simple.

test_enter_new_room() takes as arguments a hash reference and a reference to @responses.

    test_enter_new_room(\%args, \@responses);

The elements of %args are largely invariant. They include information on the location of the database, the name of the directory where temporary files are to be created, the name of the class, and so forth. Perhaps the single most important element in %args is size, the number of data records in the testing database.

    $args{'size'} = 29;

This argument is important for two reasons. First, we want to make sure that we have correctly counted the number of items in our testing database. Our count has to match any count conducted by a constructor or object method before we try to change that count.

Second, we need to know whether a method like enter_new_room() correctly increments the number of records in our testing database. If we had 29 rooms in our database before calling enter_new_room(), we want to have 30 rooms after.

An Aside on Styles in Passing Arguments to Functions

In Perl Debugged, Peter J. Scott recommends the use of a single hash reference rather than a list of arguments, particularly when the number of data points to be passed is three or more. Why? Because as the list of arguments passed increases, you have to pay more attention, both in calling the subroutine and in the subroutine's own code, to the order in which the arguments are passed. By passing a single hash reference, you sidestep the problem of the order in which arguments are passed. You simply dereference the hash as needed inside the subroutine.
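
The difference between the two styles can be sketched as follows. The subroutine names and the file name rooms.db are hypothetical, chosen only for illustration; the keys class, source and size are the ones used elsewhere in this paper.

```perl
use strict;
use warnings;

# Positional style: the caller must remember the order of every argument.
sub describe_positional {
    my ($class, $source, $size) = @_;
    return "$class: $size records in $source";
}

# Hash-reference style: order no longer matters; values are looked up by name.
sub describe_named {
    my ($argsref) = @_;
    return "$argsref->{class}: $argsref->{size} records in $argsref->{source}";
}

my %args = (
    size   => 29,
    source => 'rooms.db',    # hypothetical file name
    class  => 'Room',
);

my $positional = describe_positional('Room', 'rooms.db', 29);
my $named      = describe_named(\%args);

print "$named\n";
```

Note that the keys of %args are listed in a different order from the positional arguments, yet both calls produce the same string.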

Peter's point was brought to my attention by David H. Adler and in the actual production code I followed it strictly. For example, test_enter_new_room() took exactly one hash reference; a reference to @responses was one element in %args. But while all the other elements in %args were largely invariant, @responses changed with each individual test. So now I think it makes more sense to have the testing subroutine take two arguments, one holding largely invariant information, the other holding the information that changes with each test.

Back to Calling the Test

So, a particular instance of calling test_enter_new_room() will look like this:

    @responses = (
        4001,
        'A',
        20,
        'Kitchen:  sink and refrigerator',
        '',           # no changes requested
    );
    test_enter_new_room(\%args, \@responses);

The Test Subroutine

Here is a simplified version of test_enter_new_room():

    sub test_enter_new_room { 
        my ($argsref, $responsesref) = @_;
        my %args      = %{$argsref};
        my $class     = $args{'class'};
        my $source    = $args{'source'};
        my $size      = $args{'size'};

        my @responses = @{$responsesref};

        my ($rm, $initial_count) = 
            _get_initial_object_count($class, $source, $size);
        
        tie *STDIN, 'Preempt::Stdin', @responses;

        ok($rm->enter_new_room(),
            "enter_new_room() executed successfully");

        untie *STDIN;

        ok($initial_count + 1 == _get_revised_count($class, $source), 
            "data count correctly incremented"); 
    }

The first six lines pull in the arguments and dereference them for clarity. _get_initial_object_count is an internal subroutine (not shown here) that creates the Room object, tests that the object has been created (ok($rm, "Room object created");), and tests that the number of records in the object (in this case, rooms) matches what was predicted in $args{'size'}.

The real Perl magic takes place in:

    tie *STDIN, 'Preempt::Stdin', @responses;

Filehandle STDIN is tied via class Preempt::Stdin such that the elements of @responses pre-empt responses normally coming from standard input.

The code for Preempt::Stdin is devilishly simple and is adapted from Chapter 14 of the Camel book:

    package Preempt::Stdin;
    use strict;
    use Carp;

    sub TIEHANDLE {
        my $class = shift;
        my @lines = @_;
        bless \@lines, $class;
    }

    sub READLINE {
        my $self = shift;
        if (@$self) {
            shift @$self;
        } else {
            croak "List of prompt responses has been exhausted";
        }
    }

    1;

When you call tie FILEHANDLE, Perl expects you to supply as your first argument the name of the class which will build the tied object. That class, in turn, must have a constructor called sub TIEHANDLE. The constructor then processes all other arguments supplied to the tie call. In our case, @responses is pulled into @lines and a reference to @lines is blessed into the tied object.

We then call method enter_new_room(). Wherever that method would normally get input from the operator, it instead gets the first remaining element of @responses as shifted off by the READLINE subroutine of Preempt::Stdin.
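
A self-contained way to watch this happen, without Room.pm, is sketched below. The package repeats the logic of Preempt::Stdin shown above so that the script runs on its own; ask_two_questions is a hypothetical stand-in for a method that prompts the operator.

```perl
use strict;
use warnings;

package Preempt::Stdin;    # same logic as the class shown above
use Carp;

sub TIEHANDLE {
    my $class = shift;
    my @lines = @_;
    return bless \@lines, $class;
}

sub READLINE {
    my $self = shift;
    croak "List of prompt responses has been exhausted" unless @$self;
    return shift @$self;
}

package main;

# Hypothetical stand-in for a method that asks the operator two questions.
sub ask_two_questions {
    my $room    = <STDIN>;    # each read calls READLINE on the tied object
    my $section = <STDIN>;
    chomp($room, $section);
    return ($room, $section);
}

tie *STDIN, 'Preempt::Stdin', 4001, 'A';
my ($room, $section) = ask_two_questions();
untie *STDIN;

print "room=$room section=$section\n";
```

Nothing is typed at the keyboard: both reads are satisfied from the list passed to tie, in the order given.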

If in our test we haven't provided enough responses in the array passed to test_enter_new_room(), READLINE will croak and the test will die. We can even write tests for this case which capture the error and test to see if the correct error message was generated via croak.
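
One way such a test might look is sketched here, using eval to trap the exception and Test::More's like() to inspect the message. This is an assumption about how the check could be written, not the production test; the package again repeats the Preempt::Stdin logic so the script is runnable by itself.

```perl
use strict;
use warnings;
use Test::More tests => 2;

package Preempt::Stdin;    # same logic as the class shown above
use Carp;

sub TIEHANDLE { my $class = shift; my @lines = @_; return bless \@lines, $class }

sub READLINE {
    my $self = shift;
    croak "List of prompt responses has been exhausted" unless @$self;
    return shift @$self;
}

package main;

tie *STDIN, 'Preempt::Stdin', 4001;    # deliberately supply only one response

my $first = <STDIN>;                   # consumes the only response

# The second read has nothing left to return, so READLINE croaks.
my $survived = eval { my $line = <STDIN>; 1 };
my $error    = $@;

untie *STDIN;

is($first, 4001, "first response delivered");
like($error, qr/^List of prompt responses has been exhausted/,
    "running out of responses croaks with the expected message");
```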

Once we've run enter_new_room() we untie STDIN to free it up for the next test. We then call another internal subroutine, _get_revised_count (not shown), to see whether the number of records in the Room object has been correctly incremented.

Adapting to Perl's Standard Testing Framework

In order to make testing of user input play nicely with ExtUtils::MakeMaker and make test, I structure my t/ directory like this:

    t/
      enter_new_room.t
      change_room_data.t
      delete_room_data.t
      testlib/
        Preempt/
          Stdin.pm
        RoomSpecial.pm

where t/testlib/ holds files and subdirectories which in turn hold all testing functions not provided by Test::More. So I'll place test_enter_new_room(), along with internal subroutines like _get_initial_object_count() and _get_revised_count(), into a package in t/testlib/RoomSpecial.pm. My test file will then start out something like this:

    # t/enter_new_room.t
    use Test::More qw(no_plan); 
    BEGIN { 
        use_ok('Room');
        use_ok('Carp');
        use lib ( "./t/testlib" );
        use_ok('Preempt::Stdin');
        use_ok('RoomSpecial', qw{ 
            test_enter_new_room 
        });
    };
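
For the use_ok('RoomSpecial', ...) call to import test_enter_new_room, the package has to export it. A minimal sketch of how RoomSpecial.pm might do that follows; the placeholder body and the single-file layout are assumptions made so the example runs on its own, and the real subroutine is the one shown earlier.

```perl
use strict;
use warnings;

package RoomSpecial;    # would live in t/testlib/RoomSpecial.pm
use Exporter 'import';
our @EXPORT_OK = qw( test_enter_new_room );

sub test_enter_new_room {
    my ($argsref, $responsesref) = @_;
    # Placeholder body: the real subroutine runs the ok() tests shown earlier.
    return scalar @{$responsesref};
}

package main;

# In a real test file, "use RoomSpecial qw(test_enter_new_room);" does this.
RoomSpecial->import('test_enter_new_room');

my $count = test_enter_new_room(
    { size => 29 },
    [ 4001, 'A', 20, 'Kitchen:  sink and refrigerator', '' ],
);
print "responses supplied: $count\n";
```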

Frequently Asked Questions

Now let me touch upon a number of topics that occurred to me in developing this approach.

Q: How do I know how many tests I have to write to test a method which calls for input from STDIN?

1. If you took computer science, you probably learned how to diagram the flow of a program through its various branches. Dig out your textbooks, draw a flow chart of each method to be tested and write a test for each branch.
2. If, like me, you did not take computer science, you cheat. You use Paul Johnson's Devel::Cover module from CPAN and you keep writing tests till all branches and conditions are covered.

Of course, in reality you probably want to use some combination of the two.

Q: A program which gets input from STDIN only gets it by prompting for it on STDOUT. But I don't want the STDOUT prompts cluttering up the screen when I run my test suite. What do I do?

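One option is to capture STDOUT while the test runs, using the CPAN module IO::Capture::Stdout. IO::Capture::Stdout::Extended adds methods for examining what was captured.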
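
For readers who want to see the idea behind output capture without installing anything, here is a minimal sketch using only core Perl. It is a stand-in for, not a reproduction of, what the IO::Capture modules do, and the helper name capture_stdout is hypothetical.

```perl
use strict;
use warnings;

# Run a code ref with STDOUT redirected into an in-memory buffer,
# then return whatever was printed.
sub capture_stdout {
    my ($code) = @_;
    my $buffer = '';
    {
        local *STDOUT;    # the real STDOUT is restored when this block ends
        open STDOUT, '>', \$buffer or die "Cannot redirect STDOUT: $!";
        $code->();
        close STDOUT;
    }
    return $buffer;
}

# The prompts below never reach the screen; they end up in $prompts.
my $prompts = capture_stdout(sub {
    print "Enter room number:  ";
    print "Enter section:  ";
});

print "captured ", length($prompts), " characters of prompt text\n";
```

Because the prompts are returned as a string, a test can also check that the right questions were asked, in addition to keeping them off the screen.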

Q: Have you discovered any bugs or limitations in this approach?

Yes. Preempt::Stdin does not pre-empt input that the program reads with the empty diamond operator:

    print "Enter room whose data you wish to enter:  ";
    chomp ($try = <>);

This doesn't work. You have to hard-code STDIN instead.

    chomp ($try = <STDIN>);

I don't know why this happens. If this bothers you, look at IO::Scalar which is reported to handle the diamond operator properly, but which, IMHO, has a more complex interface.

Q: Do you have references to other approaches to this problem?