=head1 Testing User Input:  One Approach

=head2 James E Keenan

=head2 Submission for YAPC::NA::2005, Toronto

Second draft:  Sun Apr 10 09:43:24 EDT 2005

=head2 Introduction

From time to time we have to write Perl programs which are run on a
batch basis by someone who knows nothing about Perl.  If we are smart
about it, we will thoroughly test such programs, because if something does
go wrong with such a program, the program operator is in no position 
to fix it.

But what if such a program requires input from the operator, such as
responses to questions posed in a GUI message box or a terminal prompt?
How do we test that we have properly handled operator input?  More
specifically, how do we incorporate testing for user input in files
built on Perl's standard testing apparatus?

In this presentation I describe one way to do it -- a relatively simple
and unsophisticated way that works for me and may work for you as well.
I'm not going to speak about handling user input from a GUI.  I'm only
going to speak about "pre-empting standard input."

=head2 The Production Problem

In my day job as an administrator in a psychiatric facility, I had to
figure out a way to schedule treatment groups by group leader, time and
room, and then to tally and report statistics as to whether those groups
actually took place or not.  Naturally, I wanted to embody this process
in a Perl program.  But, unlike some of my earlier on-the-job efforts
using Perl, I wanted to delegate the day-to-day operation of this
program to a colleague who knew nothing about Perl.  I wanted her to be
able to open up a terminal window, call a Perl script, and enter
responses at command prompts.  I wanted this process to be as smooth and
painless as possible.  For example, I didn't want her to have to see a
single Perl warning about uninitialized variables.

I wrote a suite of modules and we used it successfully for two years.
Then my boss asked me to report the data in a different way -- a way
that, to my chagrin, would require substantial overhaul of my modules.
The modules had to be able to compute the statistics in both the old and
new ways -- and to get both right.  How would I know if I had gotten
both right?  By consulting my test suite, of course.  The only problem was:  I
hadn't written a test suite.  I had simply relied on visual inspection of
the output files.  That would no longer suffice.

So before I could overhaul my modules I was faced with the task of
retroactively writing a test suite for the modules as I had created them
two years ago.  That meant writing tests for those methods which called
for operator input.  It meant creating a set of testing data which could
be provided to the test suites at exactly the same points where the
production program would call for standard input.  It called for
pre-empting standard input.

=head2 Example:  Entering a New Room into the Database

F<Room.pm> is the module containing methods for entering, changing and
deleting data about rooms in which treatment groups are held.  Apart 
from its constructor, it holds 3 publicly callable methods:

=over 4

    enter_new_room()
    change_room_data()
    delete_room_data()

=back

To enter a new room in the database, the operator must
respond to command-prompts for:

=over 4

=item 1 room number

=item 2 section of the building where room is located

=item 3 room capacity

=item 4 comment:  special features of room

=back

Within method C<enter_new_room()>, each of these questions is
represented by an internal subroutine:

    _room()
    _section()
    _capacity()
    _comment()

So far we can anticipate having to provide four pieces of information for a
passing test.  C<enter_new_room()> also calls an internal subroutine
C<_verify()> which asks, "Do you wish to make any changes in this
data?"  The operator's response constitutes a fifth piece of
information needed for a passing test.

Since the command-prompts appear in a particular order,
the data which substitute for STDIN must appear in the
same order -- which suggests holding them in a
Perl array:

    @responses = (
        4001,
        'A',
        20,
        'Kitchen:  sink and refrigerator',
        '',           # no changes requested
    );

But some of these internal subroutines themselves have
internal error-checking capacity.  For example, since all the rooms used
for treatment groups have numbers between 4000 and 4099, C<_room()> checks 
to make sure that the data entered by the operator at the command-prompt
is all numerical and in the proper range.  That implies that in order to
provide a I<thorough> test of C<enter_new_room()>, we have to write some
tests which provide I<incorrect> entries and see if the program prompts
the operator for better input.  In this case, C<@responses> would
need I<more than 5> elements to represent a passing test.

    @responses = (
        'Room 4001',  # incorrect: not strictly numeric
        3001,         # incorrect: out of range
        4001,         # correct
        'A',
        20,
        'Kitchen:  sink and refrigerator',
        '',           # no changes requested
    );

You can anticipate that in our test we will be shifting one element at a
time off C<@responses> and feeding it to the testing routine.
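
As a rough sketch of why the incorrect entries lengthen the list, the
loop below shifts responses one at a time until one passes validation.
The subroutine C<valid_room()> is a hypothetical stand-in for the checks
described above (all digits, between 4000 and 4099); it is not the
production C<_room()>.

    use strict;
    use warnings;

    my @responses = ('Room 4001', 3001, 4001);

    # Hypothetical validator:  all digits and within 4000..4099.
    sub valid_room {
        my $try = shift;
        return $try =~ /^\d+$/ && $try >= 4000 && $try <= 4099;
    }

    my ($room, $attempts) = (undef, 0);
    while (@responses) {
        my $try = shift @responses;     # next canned response
        $attempts++;
        if (valid_room($try)) {
            $room = $try;
            last;
        }
        print "'$try' rejected; prompting again\n";
    }
    print "accepted $room after $attempts attempts\n";

Each rejected entry consumes one element, so a test which exercises two
error branches needs two more elements than the happy path.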

=head2 Calling the Test

In a minute, we'll look at how we actually run a test of data which pre-empts
standard input.  First, let's look at the test's interface,
I<i.e.>, how the test would appear in a standard Perl test file.

I found in practice that routines for entering, changing or
deleting data tend to have their own peculiarities.  So I found it
convenient to write a testing subroutine for each of those operations.
For example, to test the major methods of F<Room.pm> I have three
testing subroutines:

=over 4

    test_enter_new_room()
    test_change_room_data()
    test_delete_room_data()

=back

Most of the code within these three subroutines is repeated and can be
refactored out into internal subroutines.  So the testing subroutines are
actually rather simple.

C<test_enter_new_room()> takes as arguments a hash reference and a
reference to C<@responses>.

    test_enter_new_room(\%args, \@responses);

The elements of C<%args> are largely invariant.  They include
information on the location of the database, the name of the directory
where temporary files are to be created, the name of the class, and so
forth.  Perhaps the single most important element in C<%args> is
C<size>, the number of data records in the testing database.

    $args{'size'} = 29;

This argument is important for two reasons.  First, we want to make sure
that we have correctly counted the number of items in our testing
database.  Our count has to match any count conducted by a constructor
or object method before we try to change that count.

Second, we need to know whether a method like C<enter_new_room()>
correctly increments the number of records in our testing database.  If
we had 29 rooms in our database before calling 
C<enter_new_room()>, we want to have 30 rooms after.

=head3 An Aside on Styles in Passing Arguments to Functions

In I<Perl Debugged>, Peter J. Scott recommends the use of a single hash
reference rather than a list of arguments, particularly when the number
of data points to be passed is three or more.  Why?  Because as the list
of arguments passed increases, you have to pay more attention, both in
calling the subroutine and in the subroutine's own code,  to the
I<order> in which the arguments are passed.  By passing a single hash
reference, you sidestep the problem of the order in which arguments are
passed.  You simply dereference the hash as needed inside the
subroutine.
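
A minimal sketch of the contrast follows.  The subroutine names and
hash keys are hypothetical, chosen only for illustration; they do not
come from the production code.

    use strict;
    use warnings;

    # Positional style:  the caller must remember the order.
    sub describe_positional {
        my ($room, $section, $capacity) = @_;
        return "Room $room, section $section, seats $capacity";
    }

    # Hash-reference style:  the order of the keys does not matter.
    sub describe_named {
        my $argsref = shift;
        return "Room $argsref->{room}, section $argsref->{section}, "
             . "seats $argsref->{capacity}";
    }

    print describe_positional(4001, 'A', 20), "\n";
    print describe_named({
        capacity => 20,
        room     => 4001,
        section  => 'A',
    }), "\n";

Both calls produce the same string even though the second lists its
values in a different order.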

Peter's point was brought to my attention by David H. Adler, and in the
actual production code I followed it strictly.  For example,
C<test_enter_new_room()> took exactly one hash reference; a
reference to C<@responses> was one element in C<%args>.  But while all the
other elements in C<%args> were largely invariant, C<@responses>
changed with each individual test.  So now I think it makes more sense
to have the testing subroutine take two arguments, one holding largely
invariant information, the other holding the information that changes
with each test.

=head3 Back to Calling the Test

So, a particular instance of calling C<test_enter_new_room()> will look
like this:

    @responses = (
        4001,
        'A',
        20,
        'Kitchen:  sink and refrigerator',
        '',           # no changes requested
    );
    test_enter_new_room(\%args, \@responses);

=head2 The Test Subroutine

Here is a simplified version of C<test_enter_new_room()>:

    sub test_enter_new_room { 
        my ($argsref, $responsesref) = @_;
        my %args      = %{$argsref};
        my $class     = $args{'class'};
        my $source    = $args{'source'};
        my $size      = $args{'size'};

        my @responses = @{$responsesref};

        my ($rm, $initial_count) = 
            _get_initial_object_count($class, $source, $size);
        
        tie *STDIN, 'Preempt::Stdin', @responses;

        ok($rm->enter_new_room(),
            "enter_new_room() executed successfully");

        untie *STDIN;

        ok($initial_count + 1 == _get_revised_count($class, $source), 
            "data count correctly incremented"); 
    }

The first six lines pull in the arguments and dereference them for clarity.
C<_get_initial_object_count> is an internal subroutine (not shown here)
that creates the Room object, tests that the object has been created
(C<ok($rm, "Room object created");>), and tests that the number of 
records in the object (in this case, rooms) matches what was predicted in
C<$args{'size'}>.

The real Perl magic takes place in:

    tie *STDIN, 'Preempt::Stdin', @responses;

Filehandle C<STDIN> is tied via class C<Preempt::Stdin> such that the
elements of C<@responses> pre-empt responses normally coming from
standard input.

The code for C<Preempt::Stdin> is devilishly simple and is adapted from
Chapter 14 of the Camel book:

    package Preempt::Stdin;
    use strict;
    use Carp;

    # Constructor called by tie():  stores the canned responses.
    sub TIEHANDLE {
        my $class = shift;
        my @lines = @_;
        bless \@lines, $class;
    }

    # Called for each <STDIN>:  hands back the next canned response.
    sub READLINE {
        my $self = shift;
        if (@$self) {
            shift @$self;
        } else {
            croak "List of prompt responses has been exhausted";
        }
    }

    1;

When you call C<tie> on a filehandle, Perl expects the handle itself
(here C<*STDIN>) as the first argument and, as the second, the name of
the class which will build the tied object.
That class, in turn, must have a constructor called C<TIEHANDLE>.
The constructor then receives all other arguments supplied to the
C<tie> call.  In our case, C<@responses> is pulled into C<@lines> and a
reference to C<@lines> is blessed into the tied object.

We then call method C<enter_new_room()>.  Wherever that method
would normally get input from the operator, it instead gets the first
remaining element of C<@responses> as shifted off by the C<READLINE>
subroutine of C<Preempt::Stdin>.
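
The whole mechanism can be tried out in a single file.  In this sketch
the class is condensed from the listing above, and C<ask_room()> is a
hypothetical stand-in for a production method that prompts the operator.

    use strict;
    use warnings;

    {
        package Preempt::Stdin;
        use Carp;

        sub TIEHANDLE {
            my ($class, @lines) = @_;
            return bless \@lines, $class;
        }

        sub READLINE {
            my $self = shift;
            croak "List of prompt responses has been exhausted"
                unless @$self;
            return shift @$self;
        }
    }

    # Hypothetical stand-in for a method that prompts the operator.
    sub ask_room {
        print "Enter room number:  ";
        chomp(my $room = <STDIN>);
        return $room;
    }

    tie *STDIN, 'Preempt::Stdin', 4001, 'A';
    my $room    = ask_room();    # receives 4001 without any keyboard input
    my $section = <STDIN>;       # receives 'A'
    untie *STDIN;

    print "\nroom=$room section=$section\n";

Note that C<ask_room()> reads from C<STDIN> explicitly and needs no
changes to be tested this way.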

If in our test we haven't provided enough elements in C<@responses>
to answer every prompt issued by C<enter_new_room()>, C<READLINE> will
C<croak> and the test will die.  We can even
write tests for this case which capture the error and test to see if the
correct error message was generated via C<croak>.
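
One way such a test might look is sketched below.  It assumes the
C<Preempt::Stdin> class shown earlier (condensed here so that the
example stands alone) and traps the error with a block C<eval>; the
test descriptions are illustrative.

    use strict;
    use warnings;
    use Test::More tests => 2;

    {
        package Preempt::Stdin;
        use Carp;

        sub TIEHANDLE {
            my ($class, @lines) = @_;
            return bless \@lines, $class;
        }

        sub READLINE {
            my $self = shift;
            croak "List of prompt responses has been exhausted"
                unless @$self;
            return shift @$self;
        }
    }

    tie *STDIN, 'Preempt::Stdin', 4001;    # only one response supplied

    my $first = <STDIN>;                   # consumes the only response

    # A second read must croak; eval traps the error in $@.
    my $ok    = eval { my $second = <STDIN>; 1 };
    my $error = $@;

    untie *STDIN;

    is($first, 4001, "first response delivered");
    like($error, qr/responses has been exhausted/,
        "exhaustion reported via croak");

Copying C<$@> into C<$error> immediately guards against later code
resetting it before the assertion runs.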

Once we've run C<enter_new_room()> we untie C<STDIN> to free it up for
the next test.  We then call another internal subroutine,
C<_get_revised_count> (not shown), to see whether the number of records
in the Room object has been correctly incremented.

=head2 Adapting to Perl's Standard Testing Framework

In order to make testing of user input play nicely with
F<ExtUtils::MakeMaker> and C<make test>, I structure my C<t/> directory
like this:

    t/
      enter_new_room.t
      change_room_data.t
      delete_room_data.t
      testlib/
        Preempt/
          Stdin.pm
        RoomSpecial.pm

where F<t/testlib/> holds files and subdirectories which in turn hold
all testing functions not provided by F<Test::More>. So I'll place
C<test_enter_new_room()>, along with internal subroutines like
C<_get_initial_object_count()> and C<_get_revised_count()>, into a package
in F<t/testlib/RoomSpecial.pm>.  My test file will then start out
something like this:

    # t/enter_new_room.t
    use Test::More qw(no_plan); 
    BEGIN { 
        use_ok('Room');
        use_ok('Carp');
        use lib ( "./t/testlib" );
        use_ok('Preempt::Stdin');
        use_ok('RoomSpecial', qw{ 
            test_enter_new_room 
        });
    };

=head2 Frequently Asked Questions

Now let me touch upon a number of topics that occurred to me in
developing this approach.

=over 4

=item * I<Q:  How do I know how many tests I have to write to test a method
which calls for input from STDIN?>

A:  Two ways.

=over 4

=item 1.

If you took computer science, you probably learned
how to diagram the flow of a program through its various branches.  Dig
out your textbooks, draw a flow chart of each method to be tested and
write a test for each branch.

=item 2.

If, like me, you did I<not> take computer science, you cheat.  You
use Paul Johnson's F<Devel::Cover> module from CPAN and you keep writing
tests till all branches and conditions are covered.

=back

Of course, in reality you probably want to use some combination of the
two.

=item * I<Q:  A program which gets input from STDIN only gets it by 
prompting for it on STDOUT.  But I don't want the STDOUT prompts 
cluttering up the screen when I run my test suite.  What do I do?>

A:  Capture STDOUT with a CPAN module such as F<IO::Capture::Stdout>.  I've
got my own version of this which corrects bugs in the CPAN version, and
I've got an extension up on CPAN, coincidentally called
F<IO::Capture::Stdout::Extended>,  which provides additional subroutines
useful in a testing context.

=item * I<Q: Have you discovered any bugs or limitations in this approach?> 

A:  Yes.  F<Preempt::Stdin> doesn't DWIM when the source code it's
testing uses only the Perl diamond operator for standard input.

    print "Enter room whose data you wish to enter:  ";
    chomp ($try = <>);

This doesn't work.  You have to hard-code C<STDIN> instead.

    chomp ($try = <STDIN>);

I don't know why this happens.  If this bothers you, look at
F<IO::Scalar> which is reported to handle the diamond operator properly,
but which, IMHO, has a more complex interface.

=item * I<Q: Do you have references to other approaches to this problem?> 

A:  Yes.  If this talk gets accepted for YAPC, I'll look them up and
insert them here!

=back
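
On the question of silencing STDOUT prompts:  for readers who want to
see the idea without installing anything, the sketch below redirects
C<STDOUT> to an in-memory file using only core Perl.  This is not the
interface of F<IO::Capture::Stdout>; it is a substitute technique, and
C<prompt_for_room()> is a hypothetical stand-in for a method that
prints a prompt.

    use strict;
    use warnings;

    # Hypothetical stand-in for a method that prompts on STDOUT.
    sub prompt_for_room {
        print "Enter room number:  ";
        return 4001;
    }

    my $captured = '';
    my $room;
    {
        # Point STDOUT at an in-memory file for the duration of this block.
        local *STDOUT;
        open STDOUT, '>', \$captured
            or die "Cannot redirect STDOUT: $!";
        $room = prompt_for_room();
        close STDOUT;
    }

    # STDOUT is restored here; the prompt never reached the screen.
    print "captured prompt: '$captured'\n";

The captured text can then be checked in a test, for instance to
confirm that the expected prompt was issued.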

Copyright 2005 James E Keenan.  All rights reserved.  Send e-mail to:
jkeenan (at) cpan (dot) org

=cut