1 Subroutines

#!/usr/local/bin/perl -w
# main program

open(IN,"test.html") or die "cannot open test.html: $!";
@file=<IN> ;
foreach $line (@file){
    find_closing_tags($line);
}
close IN;

# end of main
sub find_closing_tags {
    my $line = $line;    # copies the global $line set by the foreach loop
    if ($line =~ /(<\/.*?>)/) {    # first closing tag in the line, e.g. </p>
        print "$1\n";
    }
}

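Note: Instead of my $line = $line; it is better to write my $line = $_[0]; because arguments are passed to subroutines in the @_ array. The first version is easier to remember, but it works only as long as the calling code uses a global variable with exactly the same name ($line). It stops working if the caller uses a different variable name or declares its variable with "my".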
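The difference can be seen in a minimal sketch. The subroutine name first_closing_tag, the caller's variable $row and the sample lines are made up for this illustration and are not part of the script above. Because the value travels through @_, the caller's variable does not have to be called $line.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Returns the first closing tag in the line passed as argument,
# or undef if the line contains none.
# (first_closing_tag is a hypothetical name used only in this sketch.)
sub first_closing_tag {
    my ($line) = @_;              # same effect as: my $line = $_[0];
    if ($line =~ /(<\/.*?>)/) {
        return $1;
    }
    return;                       # no closing tag found
}

# The caller's variable is named $row, not $line.
foreach my $row ("<p>first</p>\n", "no tag here\n") {
    my $tag = first_closing_tag($row);
    print defined $tag ? "$tag\n" : "(none)\n";
}
```

Here the subroutine returns its result instead of printing it, so the caller decides what to do with it.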

1.1 Exercises

1) Add a second subroutine to the example that prints all tags that are not closing tags. Before each subroutine is called, let the main routine print a heading: "these are the opening tags" and "these are the closing tags". (Note: this script finds only the first tag in each line. To find all tags, you could either first split the lines after each tag or insert a line break after each tag; compare exercise 11 from week 5.)

1.2 Example

#!/usr/local/bin/perl -w
#
# main routine
#
$file="input.txt";
@textfile= read_file($file);
push (@textfile,"this is the last line");
write_file($file,@textfile);
system('more input.txt');
##############################
sub read_file {
    my $inputfile = $file;    # copy of the global $file
    open (INPUT, $inputfile) or die "cannot open $inputfile: $!";
    my @text = <INPUT>;
    chomp @text;
    close (INPUT);
    return @text;
}
#########################
sub write_file {
    my $outputfile = $file;    # copy of the global $file
    my @text = @textfile;      # copy of the global @textfile
    open (OUTPUT, ">$outputfile") or die "cannot write $outputfile: $!";
    foreach my $line (@text) {
        print OUTPUT "$line\n";
    }
    close (OUTPUT);
}

The main routine passes $file as an argument to the subroutine read_file and receives the contents of @text as the return value. A subroutine sends values back to the main routine with "return".
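The following sketch shows one way the same two subroutines could look if they took their values from @_ instead of copying global variables. It is an illustration under assumptions, not a replacement for the script above: the file name sketch_demo.txt is hypothetical, and the lexical file handles ($in, $out) and the three-argument form of open are choices made for this sketch.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Writes every element of the list to the named file, one per line.
sub write_file {
    my ($outputfile, @text) = @_;    # first argument: name, rest: lines
    open(my $out, '>', $outputfile) or die "cannot write $outputfile: $!";
    print $out "$_\n" foreach @text;
    close($out);
}

# Reads the named file and returns its lines without line endings.
sub read_file {
    my ($inputfile) = @_;
    open(my $in, '<', $inputfile) or die "cannot read $inputfile: $!";
    my @text = <$in>;
    chomp @text;
    close($in);
    return @text;
}

my $file = "sketch_demo.txt";        # hypothetical file name
write_file($file, "first line", "second line");
my @lines = read_file($file);
print scalar(@lines), " lines, last one: $lines[-1]\n";
unlink $file;                        # remove the demo file again
```

Because a scalar and an array are passed together, the scalar has to come first in the argument list: an array in @_ absorbs all remaining arguments.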

1.3 Exercises

2) Write a third subroutine for the script, called get_input. It takes @textfile as its argument and returns the modified array. The subroutine asks the user to input some text, which is appended to the end of the file. Optional: the user is also asked for the line at which the text will be inserted.

2 Local and Global Variables

my ($number)=0;
local ($string)="";

Each subroutine should use its own set of variables so that different subroutines do not interfere with each other. "Global variables" are variables that are visible in the main routine and in all subroutines. In the previous script $file and @textfile are global variables, but inside the subroutines their values are copied into the private variables $inputfile, $outputfile and @text. All variables in subroutines should be declared with "my".

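Variables declared with "local" are global variables that receive a temporary value. That value is visible in the subroutine and in all subroutines that are called from it; the previous value is restored when the subroutine returns.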

If "use strict" is used, undeclared global variables are not permitted: every variable must be declared (with "my" or "our") before it is used. In that case it is best to pass all values to subroutines using the @_ array.
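The difference between "my" and "local" can be illustrated with a short sketch. The variable $level and the subroutine names show, with_my and with_local are invented for this example. The global variable is declared with "our" so that the sketch also runs under "use strict".

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

our $level = "global";     # a declared global variable

sub show {
    return "level is $level";    # always reads the global $level
}

sub with_my {
    my $level = "my";      # private to with_my; show() cannot see it
    return show();
}

sub with_local {
    local $level = "local";    # temporary value for the global $level
    return show();             # show() sees the temporary value
}

print with_my(), "\n";     # prints: level is global
print with_local(), "\n";  # prints: level is local
print show(), "\n";        # prints: level is global (old value restored)
```

A "my" variable exists only inside the block that declares it, whereas "local" changes what every called subroutine sees until the declaring subroutine returns.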

2.1 Exercises

3) Modify the previous example so that the user is asked for his/her first name, last name and email address. The information is stored in the file in the following format:
Mary | Smith | mary@somecompany.com
John | Miller | miller@somecompany.com

A fourth subroutine is added that asks a user for her/his last name and retrieves the email address from the file. To do that every line of the file is split into an array. The second element in that array is compared to the current user's last name.

3 Searching a Webpage

The following CGI script expects a URL and a keyword as input from a form.

#!/usr/local/bin/perl -w
use IO::Socket;
use CGI qw(:standard -debug);

############# beginning of main routine ####################

my $url = param("url");
my $keyword = param("keyword");

start_webpage();
($host, $document) = parse_input($url);
@page_content= read_page($host,$document);
search($keyword, @page_content);

############## end of main routine ###########################

sub start_webpage {
    print header();
    print "<HTML>
<HEAD>
<TITLE>Search Results</TITLE>
</HEAD>
<BODY>
<H3>Search Results</H3>\n";
}

##############################################################

sub parse_input {
    my $current_url = $_[0];
    # $2 = host name, $3 = path of the document (may be empty)
    $current_url =~ /(http:\/\/)?([^\/]*)(.*)/;
    my $host = $2;
    my $document = $3;
    if ($document eq "") {$document = "/";}
    return ($host, $document);
}

########################################################################

sub read_page {
    my $current_host = $_[0];
    my $current_doc = $_[1];
    # open a TCP connection to the web server on port 80
    my $remote = IO::Socket::INET->new(Proto => "tcp",
        PeerAddr => $current_host,
        PeerPort => "http(80)",
    );
    if (!$remote) { die "cannot connect to http daemon on $current_host"; }
    $remote->autoflush(1);
    # send a minimal HTTP request; the empty line ends the request headers
    print $remote "GET $current_doc HTTP/1.0\r\n";
    print $remote "Host: $current_host\r\n\r\n";
    my @output = <$remote>;    # status line, headers and page content
    close $remote;
    return @output;
}

##################################################################
sub search {
    my ($term, @text) = @_;    # first argument: keyword, rest: lines of the page
    print "<p>The results for $term are:";
    print @text;
}
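
To see what parse_input extracts from a URL, the following sketch applies the same regular expression to two sample inputs. The subroutine name split_url and the host www.example.com are placeholders chosen for this illustration. No network connection is needed to run it.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Same regular expression as in parse_input.
# (split_url is a hypothetical name used only in this sketch.)
sub split_url {
    my ($url) = @_;
    $url =~ /(http:\/\/)?([^\/]*)(.*)/;
    my ($host, $document) = ($2, $3);
    $document = "/" if $document eq "";    # no path given: ask for the root
    return ($host, $document);
}

foreach my $url ("http://www.example.com/dir/page.html", "www.example.com") {
    my ($host, $document) = split_url($url);
    print $url, " gives host ", $host, " and document ", $document, "\n";
}
```

The first URL yields the host www.example.com and the document /dir/page.html; the second yields the same host and the document /. Note that the pattern only recognizes the prefix http://. For a URL that starts with https:// it would return "https:" as the host.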

3.1 Exercises

4) Write a form that lets users input a URL and a keyword and invokes the CGI script.
5) Change the subroutine "search" so that it actually searches for the keyword and prints the lines that contain the keyword.