NAME

What do we expect out of "paragraph mode"?

GOALS

We want to be able to answer questions like these:

How is paragraph mode documented in the core distribution? In the Camel book?
- Do we have tests for each kind of functionality so documented?
In tests in the core distribution (or other codebases surveyed, e.g., CPAN), what kinds of "paragraphs" are found within the files we process in paragraph mode?
- Can we imagine files whose "paragraphs" do not conform to, or are not limited to, those we are currently testing?
- If so, does $/="" DWIM on such files and "paragraphs"?

PARAGRAPH MODE IN THE CORE DISTRIBUTION

pod/ and perlfaq

Of the following, pod/perlvar.pod is the most authoritative.

pod/perlform.pod

Used in code sample illustrating format.

pod/perlfunc.pod

In documentation of chomp built-in:

    When in paragraph mode (C<$/ = ''>), [C<chomp>] removes all trailing newlines from the string.
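
A minimal sketch of that chomp behavior, using an illustrative string rather than anything taken from the core tests. The variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $record = "some text\n\n\n";
my $removed;
{
    local $/ = '';               # paragraph mode, limited to this block
    $removed = chomp $record;    # chomp returns the number of characters removed
}
print "removed $removed newline(s), left '$record'\n";
```

With the default $/ of "\n", the same chomp would remove only one newline.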

pod/perlop.pod

Used in code samples explaining the \G assertion.

pod/perlvar.pod

Main documentation for $INPUT_RECORD_SEPARATOR:

    $INPUT_RECORD_SEPARATOR
    $RS
    $/      The input record separator, newline by default. This influences
            Perl's idea of what a "line" is. Works like awk's RS variable,
            including treating empty lines as a terminator if set to the
            null string (an empty line cannot contain any spaces or tabs).
    ...
            Setting to "" will treat two or more consecutive
            empty lines as a single empty line.
    ...

However, in perlvar there are no specific examples of $/='' and the discussion is heavily weighted toward $/=undef (slurp mode).

Discussion in Programming Perl, 4th edition, p. 778 is very similar and, once again, contains no specific code samples for $/=''.

cpan/perlfaq/lib/perlfaq5.pod

Starting at line 1175:

  How can I read in a file by paragraphs?
    Use the $/ variable (see perlvar for details). You can either set it to
    "" to eliminate empty paragraphs ("abc\n\n\n\ndef", for instance, gets
    treated as two paragraphs and not three), or "\n\n" to accept empty
    paragraphs.

    Note that a blank line must have no blanks in it. Thus
    "fred\n \nstuff\n\n" is one paragraph, but "fred\n\nstuff\n\n" is two.

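The perlfaq5 claims above can be checked with a short sketch that reads the FAQ's own strings through in-memory filehandles. The helper records_for is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read every record from a string under a given $/.
sub records_for {
    my ($text, $separator) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = $separator;
    my @records = <$fh>;
    close $fh;
    return @records;
}

# "" collapses the run of newlines; "\n\n" yields an empty paragraph too.
my @collapsed = records_for("abc\n\n\n\ndef", "");
my @literal   = records_for("abc\n\n\n\ndef", "\n\n");

# A line holding a single space is not an empty line in paragraph mode.
my @with_blank = records_for("fred\n \nstuff\n\n", "");
my @without    = records_for("fred\n\nstuff\n\n", "");

printf "%-28s %d record(s)\n", 'abc/def with $/ = "":',     scalar @collapsed;
printf "%-28s %d record(s)\n", 'abc/def with $/ = "\n\n":', scalar @literal;
printf "%-28s %d record(s)\n", 'fred, space on blank line:', scalar @with_blank;
printf "%-28s %d record(s)\n", 'fred, truly empty line:',    scalar @without;
```

The counts should come out as 2, 3, 1 and 2, matching the FAQ text.
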
cpan/perlfaq/lib/perlfaq6.pod

Starting at line 81:

  I'm having trouble matching over more than one line. What's wrong?
    ...
    There are many ways to get multiline data into a string. If you want it
    to happen automatically while reading input, you'll want to set $/
    (probably to '' for paragraphs or "undef" for the whole file) to allow
    you to read more than one line at a time.

Code samples starting at line 108:

    $/ = '';          # read in whole paragraph, not just one line
    while ( <> ) {
        while ( /\b([\w'-]+)(\s+\g1)+\b/gi ) {     # word starts alpha
            print "Duplicate $1 at paragraph $.\n";
        }
    }
    ...
    $/ = '';          # read in whole paragraph, not just one line
    while ( <> ) {
        while ( /^From /gm ) { # /m makes ^ match next to \n
            print "leading from in paragraph $.\n";
        }
    }
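
The two perlfaq6 samples depend on $. counting paragraphs rather than lines, and on /m letting ^ match after an embedded newline. A self-contained sketch of both points follows; the input text is invented for the example and an in-memory filehandle stands in for <>.

```perl
use strict;
use warnings;

my $text = "From alice\nhello\n\nbody text\nFrom bob\n\nnothing here\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my @hits;
{
    local $/ = '';    # one record per paragraph
    while (my $para = <$fh>) {
        # $. is the number of records (paragraphs) read so far from $fh
        push @hits, $. if $para =~ /^From /m;
    }
}
close $fh;

print "paragraphs containing a line that starts with 'From ': @hits\n";
```

Without /m, only a paragraph whose very first line starts with "From " would match.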

Test programs maintained by p5p

t/base/rs.t

rs here means "record separator".

Starting at line 193:

    # Does paragraph mode work?
    $/ = '';
    $bar = <FH>;
    if ($bar ne "1234\n12345\n\n") {print "not ";}
    print "ok $test_count # \$/ = ''\n";
    $test_count++;

Believe it or not, that's the full extent of formal, explicit testing of paragraph mode in the core distribution.
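
As a starting point for broader coverage, here is a sketch of a table-driven test using Test::More. It is not part of the core test suite: the helper paragraphs and the four cases are assumptions chosen to exercise the documented behavior, and they read from in-memory filehandles rather than real files.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper: return all paragraph-mode records read from a string.
sub paragraphs {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = '';
    my @records = <$fh>;
    close $fh;
    return \@records;
}

# Each case: description, input, expected records.
my @cases = (
    [ 'leading blank lines are skipped',
      "\n\none\n\ntwo\n",  [ "one\n\n", "two\n" ] ],
    [ 'a run of blank lines ends a single record',
      "one\n\n\n\ntwo\n",  [ "one\n\n", "two\n" ] ],
    [ 'final record may lack a trailing newline',
      "one\n\ntwo",        [ "one\n\n", "two" ] ],
    [ 'a whitespace-only line does not separate paragraphs',
      "one\n \ntwo\n",     [ "one\n \ntwo\n" ] ],
);

for my $case (@cases) {
    my ($name, $input, $expected) = @$case;
    is_deeply( paragraphs($input), $expected, $name );
}

done_testing();
```

Cases for files on disk, other PerlIO layers, or CRLF line endings would need their own fixtures and are left out of this sketch.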

t/op/chop.t

Starting at line 61:

    $_ = "f\n\n\n\n\n";
    $/ = "";
    $got = chomp();
    is ($got, 5, 'check return value when chomp in paragraph mode on string ending with 5 newlines');
    is ($_, "f", 'chomp in paragraph mode on string ending with 5 newlines');

    $_ = "f\n\n";
    $/ = "";
    $got = chomp();
    is ($got, 2, 'check return value when chomp in paragraph mode on string ending with 2 newlines');
    is ($_, "f", 'chomp in paragraph mode on string ending with 2 newlines');

    $_ = "f\n";
    $/ = "";
    $got = chomp();
    is ($got, 1, 'check return value when chomp in paragraph mode on string ending with 1 newline');
    is ($_, "f", 'chomp in paragraph mode on string ending with 1 newlines');

    $_ = "f";
    $/ = "";
    $got = chomp();
    is ($got, 0, 'check return value when chomp in paragraph mode on string ending with no newlines');
    is ($_, "f", 'chomp in paragraph mode on string lacking trailing newlines');

t/porting/copyright.t

Starting at line 56:

  open my $readme, '<', '../README' or die "Opening README failed: $!";

  # The copyright message is the first paragraph:
  local $/ = '';
  my $copyright_msg = <$readme>;

Other core programs maintained by p5p

configpm

Starting at line 1011:

    if ($Opts{glossary}) {
      open(GLOS, '<', $Glossary) or die "Can't open $Glossary: $!";
    }
    ...
    $/ = '';
    ...
    if ($Opts{glossary}) {
      <GLOS>;                # Skip the "DO NOT EDIT"
      <GLOS>;                # Skip the preamble
      while (<GLOS>) {
        process;
        print CONFIG_POD;
      }
    ...

installhtml

Starting at line 239:

    open(H, '<', "$file.html") ||
        die "$0: error opening $file.html for input: $!\n";
    $/ = "";
    my @data = ();
    while (<H>) {
        last if m!<h1 id="NAME">NAME</h1>!;
        $_ =~ s{href="#(.*)">}{
            my $url = "$file/@{[anchorify(qq($1))]}.html" ;
            $url = relativize_url( $url, "$file.html" )
                if ( ! defined $Options{htmlroot} || $Options{htmlroot} eq '' );
            "href=\"$url\">" ;
        }egi;
        push @data, $_;
    }
    close(H);

Starting at line 404:

    sub splitpod {
        my($pod, $poddir, $htmldir, $splitdirs) = @_;
        my(@poddata, @filedata, @heads);
        my($file, $i, $j, $prevsec, $section, $nextsec);

        print "splitting $pod\n" if $verbose;

        # read the file in paragraphs
        $/ = "";
        open(SPLITIN, '<', $pod) ||
            die "$0: error opening $pod for input: $!\n";
        @filedata = <SPLITIN>;
        close(SPLITIN) ||
            die "$0: error closing $pod: $!\n";

lib/diagnostics.pm

Starting at line 316:

    local $/ = '';
    local $_;
    my $header;
    my @headers;
    my $for_item;
    my $seen_body;
    while (<POD_DIAG>) {

pod/buildtoc

    sub podset {
        my ($pod, $file) = @_;

        open my $fh, '<:raw', $file or my_die "Can't open file '$file' for $pod: $!";
        ...

        seek $fh, 0, 0 or my_die "Can't rewind file '$file': $!";
        local $/ = '';

        while(<$fh>) {
            tr/\015//d;
            if (s/^=head1 (NAME)\s*/=head2 /) {
                unhead1();
                $OUT .= "\n\n=head2 ";
                $_ = <$fh>;
                # Remove svn keyword expansions from the Perl FAQ
                s/ \(\$Revision: \d+ \$\)//g;
                if ( /^\s*\Q$pod\E\b/ ) {
                    s/$pod\.pm/$pod/;       # '.pm' in NAME !?
                } else {
                    s/^/$pod, /;
                }
            }
            elsif (s/^=head1 (.*)/=item $1/) {
                unhead2();
                $OUT .= "=over 4\n\n" unless $inhead1;
                $inhead1 = 1;
                $_ .= "\n";
            }
            elsif (s/^=head2 (.*)/=item $1/) {
                unitem();
                $OUT .= "=over 4\n\n" unless $inhead2;
                $inhead2 = 1;
                $_ .= "\n";
            }
            elsif (s/^=item ([^=].*)/$1/) {
                next if $pod eq 'perldiag';
                s/^\s*\*\s*$// && next;
                s/^\s*\*\s*//;
                s/\n/ /g;
                s/\s+$//;
                next if /^[\d.]+$/;
                next if $pod eq 'perlmodlib' && /^ftp:/;
                $OUT .= ", " if $initem;
                $initem = 1;
                s/\.$//;
                s/^-X\b/-I<X>/;
            }
            else {
                unhead1() if /^=cut\s*\n/;
                next;
            }
            $OUT .= $_;
        }
    }

pod/perlmodlib.PL

Starting at line 46:

    for my $filename (@files) {
        unless (open MOD, '<', $filename) {
            warn "Couldn't open $filename: $!";
            next;
        }

        my ($name, $thing);
        my $foundit = 0;
        {
            local $/ = "";
            while (<MOD>) {
                next unless /^=head1 NAME/;
                $foundit++;
                last;
            }
        }
        unless ($foundit) {
            next if pod_for_module_has_head1_NAME($filename);
            die "p5p-controlled module $filename missing =head1 NAME\n"
                if $filename !~ m{^(dist/|cpan/)}n # under our direct control
                && $filename !~ m{/_[^/]+\z}       # not private
                && $filename ne 'lib/meta_notation.pm'      # no pod
                && $filename ne 'lib/overload/numbers.pm';  # no pod
            warn "$filename missing =head1 NAME\n" unless $Quiet;
            next;
        }
        my $title = <MOD>;
        chomp $title;
        close MOD
            or die "Error closing $filename: $!";

        ($name, $thing) = split /\s+--?\s+/, $title, 2;

        unless ($name and $thing) {
            warn "$filename missing name\n"  unless $name;
            warn "$filename missing thing\n" unless $thing or $Quiet;
            next;
        }

        $name =~ s/[^A-Za-z0-9_:\$<>].*//;
        $name = $exceptions{$name} || $name;
        $thing =~ s/^perl pragma to //i;
        $thing = ucfirst $thing;
        $title = "=item $name\n\n$thing\n\n";

        if ($name =~ /[A-Z]/) {
            push @mod, $title;
        } else {
            push @pragma, $title;
        }
    }

Starting at line 98:

    sub pod_for_module_has_head1_NAME {
        my ($filename) = @_;
        (my $pod_file = $filename) =~ s/\.pm\z/.pod/ or return 0;
        return 0 if !-e $pod_file;
        open my $fh, '<', $pod_file
            or die "Can't open $pod_file for reading: $!\n";
        local $/ = '';
        while (my $para = <$fh>) {
            return 1 if $para =~ /\A=head1 NAME$/m;
        }
        return 0;
    }
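
The pod/perlmodlib.PL excerpts above show a recurring pattern: read POD paragraph by paragraph until "=head1 NAME", then take the next record as the title. The sketch below is a simplified illustration of that pattern, not the script's own code. The function name_and_summary and the Foo::Bar sample POD are hypothetical, and unlike the excerpt, which reads the title after its local block has ended, this version keeps paragraph mode on for both reads.

```perl
use strict;
use warnings;

# Hypothetical helper: return (name, summary) from the paragraph that
# follows "=head1 NAME", or an empty list if there is none.
sub name_and_summary {
    my ($pod) = @_;
    open my $fh, '<', \$pod or die "Cannot open in-memory handle: $!";
    local $/ = '';    # paragraph mode for every read in this sub
    while (my $para = <$fh>) {
        next unless $para =~ /\A=head1 NAME$/m;
        my $title = <$fh>;
        return unless defined $title;
        chomp $title;    # in paragraph mode this strips all trailing newlines
        return split /\s+--?\s+/, $title, 2;
    }
    return;
}

my $pod = "=head1 NAME\n\nFoo::Bar - does things\n\n=head1 SYNOPSIS\n\n    use Foo::Bar;\n";
my ($name, $summary) = name_and_summary($pod);
print "name: $name\nsummary: $summary\n";
```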

pod/splitpod

Applies to almost the entire file.

Porting/sort_perldiag.pl

Applies to almost the entire file.

dist/ and ext/ modules maintained by p5p

dist/Tie-File/lib/Tie/File.pm

Tie-File does not implement paragraph mode:

"An undefined value is not permitted as a record separator. Perl's special "paragraph mode" semantics (à la $/ = "") are not emulated."

ext/B/t/OptreeCheck.pm

Starting at line 972:

    foreach my $file (@files) {
        open (my $fh, '<', $file) or die "cant open $file: $!\n";
        $/ = "";
        my @chunks = <$fh>;
        print preamble (scalar @chunks);
        foreach my $t (@chunks) {
            print "\n\n=for gentest\n\n# chunk: $t=cut\n\n";
            print OptreeCheck::gentest ($t);
        }
    }

ext/PerlIO-scalar/t/scalar.t

Starting at line 168:

    {
        # [perl #35929] verify that works with $/ (i.e. test PerlIOScalar_unread)
        my $s = <<'EOF';
    line A
    line B
    a third line
    EOF
        open(F, '<', \$s) or die "Could not open string as a file";
        local $/ = "";
        my $ln = <F>;
        close F;
        is($ln, $s, "[perl #35929]");
    }

ext/Pod-Html/t/anchorify.t

Starting at line 8:

    my @filedata;
    {
        local $/ = '';
        @filedata = <DATA>;
    }
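
Several excerpts in this survey assign to $/ globally, while others, like the one above, confine the change with local inside a block. A small sketch of what that scoping buys, assuming the default $/ of "\n" on entry; the sample text and variable names are invented for the illustration.

```perl
use strict;
use warnings;

my $text = "first\nparagraph\n\nsecond\nthird\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my $inside;
{
    local $/ = '';       # paragraph mode only until the closing brace
    $inside = <$fh>;     # reads the whole first paragraph
}
my $outside = <$fh>;     # $/ has been restored, so this reads one line
close $fh;

print "inside the block:  ", length($inside),  " characters\n";
print "outside the block: ", length($outside), " characters\n";
```

A plain assignment to $/ without local would instead stay in effect for every later read in the program.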

cpan/ modules shipped with core

cpan/ExtUtils-Install/t/Packlist.t

Starting at line 79:

    local *IN;
    ...
        $file_is_ready = open(IN, 'eplist');
    ...
        skip("cannot open file for reading: $!", 5) unless $file_is_ready;
        my $file = do { local $/ = <IN> };

cpan/ExtUtils-MakeMaker/t/Mkbootstrap.t

Starting at line 95:

    $file_is_ready = open(IN, 'dasboot.bs');
    ok( $file_is_ready, 'should have written a new .bs file' );
    ...
    my $file = do { local $/ = <IN> };

Starting at line 129:

    $file_is_ready = open(IN, 'dasboot.bs');
    ...
    my $file = do { local $/ = <IN> };

cpan/IO-Compress/t/compress/generic.pl

Starting at line 697:

            {
                local $/ = "";  # paragraph mode
                my $io = $UncompressClass->new($name);
                is $., 0;
                is $io->input_line_number, 0;
                ok ! $io->eof;
                my @lines = $io->getlines();
                is $., 2;
                is $io->input_line_number, 2;
                ok $io->eof;
                ok @lines == 2
                    or print "# Got " . scalar(@lines) . " lines, expected 2\n" ;
                ok $lines[0] eq "This is an example\nof a paragraph\n\n\n"
                    or print "# $lines[0]\n";
                ok $lines[1] eq "and a single line.\n\n";
            }

Starting at line 882:

            {
                local $/ = "";  # paragraph mode
                my $io = $UncompressClass->new($name);
                ok ! $io->eof;
                my @lines = $io->getlines;
                is $., 2;
                is $io->input_line_number, 2;
                ok $io->eof;
                ok @lines == 2
                    or print "# expected 2 lines, got " . scalar(@lines) . "\n";
                ok $lines[0] eq "This is an example\nof a paragraph\n\n\n"
                    or print "# [$lines[0]]\n" ;
                ok $lines[1] eq "and a single line.\n\n";
            }

cpan/IO-Compress/t/compress/newtied.pl

Starting at line 196:

            {
                local $/ = "";  # paragraph mode
                my $io = $UncompressClass->new($name);
                ok ! $io->eof;
                my @lines = <$io>;
                ok $io->eof;
                ok @lines == 2
                    or print "# Got " . scalar(@lines) . " lines, expected 2\n" ;
                ok $lines[0] eq "This is an example\nof a paragraph\n\n\n"
                    or print "# $lines[0]\n";
                ok $lines[1] eq "and a single line.\n\n";
            }

cpan/IO-Compress/t/compress/tied.pl

Starting at line 237:

            {
                local $/ = "";  # paragraph mode
                my $io = $UncompressClass->new($name);
                ok ! $io->eof;
                my @lines = <$io>;
                ok $io->eof;
                ok @lines == 2
                    or print "# Got " . scalar(@lines) . " lines, expected 2\n" ;
                ok $lines[0] eq "This is an example\nof a paragraph\n\n\n"
                    or print "# $lines[0]\n";
                ok $lines[1] eq "and a single line.\n\n";
            }

Starting at line 365:

            {
                local $/ = "";  # paragraph mode
                my $io = $UncompressClass->new($name);
                ok ! $io->eof;
                my @lines = <$io>;
                ok $io->eof;
                ok @lines == 2
                    or print "# expected 2 lines, got " . scalar(@lines) . "\n";
                ok $lines[0] eq "This is an example\nof a paragraph\n\n\n"
                    or print "# [$lines[0]]\n" ;
                ok $lines[1] eq "and a single line.\n\n";
            }

cpan/Text-Tabs/t/Wrap-JLB.t

Starting at line 20:

    $/ = q();
    binmode(DATA, ":utf8") || die "can't binmode DATA to utf8: $!";

    our @DATA = (
        [ # paragraph 0
        sub { die "there is no paragraph 0" }
        ],
        { # paragraph 1
        OLD => { BYTES =>    44, CHARS =>   44, CHUNKS =>   44, WORDS =>   7, TABS =>  3, LINES =>  4 },
        NEW => { BYTES =>    44, CHARS =>   44, CHUNKS =>   44, WORDS =>   7, TABS =>  3, LINES =>  4 },
        },
        { # paragraph 2
        OLD => { BYTES =>  1766, CHARS => 1635, CHUNKS => 1507, WORDS => 275, TABS =>  0, LINES =>  2 },
        NEW => { BYTES =>  1766, CHARS => 1635, CHUNKS => 1507, WORDS => 275, TABS =>  0, LINES => 24 },
        },
        { # paragraph 3
        OLD => { BYTES =>   157, CHARS =>  148, CHUNKS =>  139, WORDS =>  27, TABS =>  0, LINES =>  2 },
        NEW => { BYTES =>   157, CHARS =>  148, CHUNKS =>  139, WORDS =>  27, TABS =>  0, LINES =>  3 },
        },
        { # paragraph 4
        OLD => { BYTES =>    30, CHARS =>   25, CHUNKS =>   24, WORDS =>   3, TABS =>  4, LINES =>  1 },
        NEW => { BYTES =>    30, CHARS =>   25, CHUNKS =>   24, WORDS =>   3, TABS =>  4, LINES =>  1 },
        },
    );

PARAGRAPH MODE ON CPAN

Search on grep.metacpan.org

Some CPAN distributions found in that search:

    B-DeparseTree
    BioPerl
    CPAN-DistnameInfo
    GnuPG
    IO-Socket-SSL
    IO-String
    IO-stringy
    Parse-RecDescent
    PerlBench
    Regexp-Grammars