L548: Computer Programming for Information Management

School of Library and Information Science
Indiana University
Spring 2000

Instructor: Uta Priss
Email: upriss@indiana.edu
Office: 029 SLIS
Phone: 812-855-2793
Office hours: Monday 2.30 - 3.30 or by appointment

This syllabus is electronically available at http://php.indiana.edu/~upriss/l548/548-Sp00-syllabus.html

Course Syllabus

Some class-related links:

Student projects
Some information on Ella
Documentation for the CGI Perl module

Introduction

This course introduces basic skills for programming and manipulating text-based information systems. Information management is a major task for librarians and information professionals, who are asked to extract information from sources on the WWW, design interactive text-based web interfaces to information systems, use text that is stored (or is to be stored) in a markup format, or preprocess information for storage in databases. This course teaches computer-based approaches to these tasks.

Currently the class is taught using Perl/CGI. Perl provides a good introduction to general programming concepts. These concepts include basic programming structures, such as control structures, file handling and program design strategies, as well as more advanced topics, such as networking, text-based user interfaces, and basic retrieval concepts. Perl allows rapid prototyping, which is appropriate for applications in a fast-changing environment such as the WWW. Furthermore, Perl is well suited to search engines, parsers and mark-up languages. Students will develop a small information systems application as a project for this class. The concepts are therefore not taught abstractly but as hands-on experiences with WWW applications.

Course Objectives

This course
  1. teaches basic programming concepts and structures.
  2. introduces basic information processing and management concepts.
  3. uses small-scale but realistic examples of information management tasks.
  4. teaches the basics of Perl and Perl/CGI.
  5. provides an introduction to more advanced topics such as object-oriented programming.

Prerequisites

L401 (either completed in a prior semester, or with an approved waiver in the student's file) or consent of instructor. Especially important: basic Unix skills, i.e. understanding of the Unix directory structure and ability to edit and save files on a Unix computer; ability to create HTML web forms.

Class Organization

The class is taught as a combination of lecture and lab sessions. The students will work on a semester project either as a team of two members or individually. The results of the projects will be presented during the last class session.

Computer Lab

The lab session is taught in GY226 (a Unix lab). All students must create an account on the Unix Nations cluster at least 24 hours before the first lab session. If students want to practise in the Unix lab during other times, they should first check the on-line availability schedule for the lab. (Select month and lab "GY226" on this page.)

Readings

Required Textbook:
Randal L. Schwartz & Tom Christiansen: Learning Perl, 2nd Edition, July 1997, O'Reilly

The book should be treated as an "encyclopedia". It contains detailed technical information, which may sometimes be difficult for programming beginners to understand. Some on-line tutorials may serve as complementary, easier reading: Day 2 on this page or the Perl Tutorial. Additional resources can be found at www.perl.com and on a local IU resource page.

Grading

Grades are given according to the SLIS grading standards. Good work that meets the course expectations will be assigned a grade of B. To get a grade higher than B, students must demonstrate above-average comprehension of the course materials, knowledge and/or effort.

The final course grade will be computed for each student on the basis of grades assigned for the following:

Class contribution and listserv discussions 1/5
Project 2/5
Final exam 2/5

Each student is expected to complete all course work by the end of the term. A grade of incomplete (I) will be assigned only if exceptional circumstances warrant. Late work will be accepted only at the discretion of the instructor and in every case will be automatically downgraded by 1/3 grade (e.g., a B+ becomes a B, a B- becomes a C+, etc.).

Class contribution and listserv discussions

The class contribution grade will be calculated based on class attendance (25%), contributions to class discussions and to discussions on the majordomo distribution list upriss_l548@indiana.edu (50%), and three multiple choice tests (25%). It is required that every student demonstrate respect for the ideas, opinions, and feelings of all other members of the class.
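
As a worked illustration of how these percentages combine with the course-level weights from the Grading section, the following short sketch computes the share of the final grade carried by each component. It is written in Python purely for illustration (the course itself uses Perl), and the dictionary names are hypothetical; only the weights themselves come from this syllabus.

```python
from fractions import Fraction

# Course-level weights, as listed in the Grading section.
course_weights = {
    "class contribution": Fraction(1, 5),
    "project": Fraction(2, 5),
    "final exam": Fraction(2, 5),
}

# Breakdown of the class contribution grade, as listed in this section.
contribution_parts = {
    "attendance": Fraction(25, 100),
    "discussions": Fraction(50, 100),
    "multiple choice tests": Fraction(25, 100),
}

# Share of the final course grade carried by each individual component.
effective = {
    name: share * course_weights["class contribution"]
    for name, share in contribution_parts.items()
}
effective["project"] = course_weights["project"]
effective["final exam"] = course_weights["final exam"]

for name, weight in effective.items():
    print(f"{name}: {float(weight):.0%}")
```

Under these weights, attendance and the multiple choice tests each account for 5% of the final grade, and the discussions account for 10%.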

Project

Teams and topics

Students can work on their projects either as a team of two members or individually. The teams must be formed during the first week of class. Each project will consist of developing an information processing or information management tool. The tool should have a CGI-based user interface.

Examples of projects are: a mail filtering system (allows users to extract mail messages from a standard Unix mail folder based on certain preferences), a search engine, an information extraction tool for webpages (allows users to extract certain information from a set of webpages), an HTML or XML viewer (displays markup pages as text with certain formatting), a data preprocessing tool (prepares data for input into a database or formats the output from a database), a library access tool (formats user input for search in an electronic library catalog), an indexing tool (parses text and identifies important words based on frequencies). Students may suggest other, similar topics. Some of these topics require additional knowledge (such as databases or XML) and should only be chosen by students who have acquired such knowledge prior to this class. The students should discuss their choice of topic with the instructor.
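
To give a rough idea of the scale of such a tool, here is a minimal sketch of the core of the indexing example: it splits a text into words, drops very common words, and reports the most frequent remaining ones. It is written in Python purely for illustration (projects for this class are written in Perl); the function name `frequent_words`, the stop-word list and the sample text are all hypothetical, and a real project would add file handling and a CGI-based interface.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real tool would use a much longer one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for"}


def frequent_words(text, top_n=2):
    """Return the top_n most frequent non-stop-words with their counts."""
    # Lower-case the text and keep runs of letters only.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(top_n)


sample = ("Perl is a language for text. "
          "Text processing in Perl is the topic of the text.")
print(frequent_words(sample))
```

Ranking by raw frequency is the simplest possible criterion for "important" words; it is used here only because it matches the description above.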

Project presentation, assignments and final project report

The students will present their tools during the last lab session (April 25th, 2000). Parts of the projects will be handed in as assignments during the semester (see the Class Schedule). The final project report is due on April 25th, 2000. It must contain information on the purpose, features and limits of the software and indicate possible future extensions and improvements. The source code of the tool should not be included in the documentation but it must be made available for evaluation by the instructor.

Grading of the projects

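A total of up to 100 points will be given for the project. Each assignment is worth 10 points, the presentation is worth 10 points, and 30 points will be given for the final project report and the project as a whole. With the six assignments to be handed in during the semester (see the Class Schedule), this adds up to 60 + 10 + 30 = 100 points.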

The project will be evaluated according to the following criteria:

Final Exam

The final exam will be a take-home exam consisting of several small information management tasks for which the students will write appropriate Perl scripts. The exam will be distributed at the conclusion of the class on April 17th and will be due on Monday, May 1st, at 12:00 noon. Team work is not allowed for the final exam.

A note on plagiarism

The students must clearly indicate if they use materials from other sources, such as textbooks or Internet webpages. Full citation information must be given for such sources. Academic and personal misconduct by students in this class are defined and dealt with according to the procedures in the Code of Student Ethics.

Class Schedule

Week 1. Programming basics

Jan 10, 11

Topics: Introduction to information processing tasks; simple Perl programs; scalar variables
Assignments:

  1. Exercises
  2. Read chapter 2 p. 31-36 in Learning Perl
  3. Develop a plan for your information processing tool: what do you want to accomplish with the tool? Which components will your tool have? What are possible features and limits? Find a name for your software tool.

Week 2. Operators and if statements

Jan 18 (there is no lecture on Monday)

Topics: operators, if statements and debugging
Assignments:

  1. Exercises
  2. Read the rest of chapter 2 and chapter 6 in Learning Perl
  3. To be handed in by Jan 25: Email the name of your project and a short description to the discussion list.

Week 3. Logical expressions

Jan 24, 25

Topics: Logical "and", "or", and "not", truth tables
Assignments:

  1. Exercises
  2. Optional reading: A Logic Tutorial

Week 4. Program design and control structures

Jan 31, Feb 1

Topics: Program design; flowcharts; control structures
Assignments:

  1. Exercises
  2. Read chapter 4 and p. 101 and 102 in Learning Perl
  3. To be handed in by Feb 7: Draw flowcharts for components of your information processing tool.

Week 5. Arrays, hashes and file handling

Feb 7, 8

Topics: Arrays and hashes; file handling
Assignments:

  1. Exercises
  2. Read chapters 3 and 5 and p. 108 - 111 of chapter 10 in Learning Perl
  3. Chapter 3 exercises 1 and 2

Week 6. CGI I

Feb 14, 15

Topics: HTML forms and how to process them with CGI
Assignments:

  1. Exercises
  2. Read chapter 19, pages 180 - 186, in Learning Perl
  3. To be handed in by Feb 21: Create forms for your project and email the URL of the forms to upriss@indiana.edu.

Week 7. Regular expressions I

Feb 21, 22

Topics: Regular expressions
Assignments:

  1. Exercises
  2. Read chapter 7, p. 76 - 81, in Learning Perl

Week 8. Regular expressions II

Feb 28, 29

Topics: Regular expressions; substitution, transliteration and split
Assignments:

  1. Exercises
  2. Read the rest of chapter 7 and chapter 15 in Learning Perl
  3. Chapter 7 exercise 1

Week 9. Programming in the large

Mar 6, 7

Topics: Functions, modular program design, local and global variables
Assignments:

  1. Exercises
  2. Read chapter 8 in Learning Perl
  3. Chapter 8 exercise 1
  4. To be handed in by Mar 20: Write the main (sub)routine of your project. Print the source code of your main routine and hand it in.

Week 10. A Perl networking client

Mar 20, 21

Topics: Retrieving documents from the web via Perl
Assignments:

  1. Exercises
  2. Read Appendix C in Learning Perl

Week 11. CGI II

Mar 27, 28

Topics: Searching web pages on-line; security
Assignments:

  1. Exercises
  2. Read chapter 19, pages 187 - 192, 203 - 206 in Learning Perl
  3. To be handed in by Apr 3: Process the form input for your project in a secure manner. Print the source code of the subroutine that processes the form input and hand it in.

Week 12. CGI III

Apr 3, 4

Topics: Environment variables, hidden text and cookies
Assignments:

  1. Exercises

Week 13. The object-oriented paradigm I

Apr 10, 11

Topics: Objects, classes, methods
Assignments:

  1. Exercises
  2. Here is an optional reading on object-oriented Perl. Follow the links: Object-oriented programming, Objects, ..., Using Modules on that page.
  3. To be handed in by Apr 17: Write a two page user manual for your project. Print the manual and hand it in.


Week 14. The object-oriented paradigm II

Apr 17, 18

Topics: Class hierarchy, inheritance, polymorphism and encapsulation
Assignments:

  1. Exercises
Final exam is handed out

Week 15. Outlook and Team presentations

Apr 24: Outlook
Apr 25: Presentation of projects

Project report is due
Final exam is due: May 1