Projects for the Programming Languages Course
Math & Computer Science College of the Holy Cross One College Street Worcester, MA 01610 LA@cs.holycross.edu |
Computer Science & Math Ithaca College 1212 Williams Hall Ithaca, NY 14850-7284 barr@ithaca.edu |
Computer Science The College of William and Mary McGlothlin-Street Hall Williamsburg, VA 23187-8795 coleman@cs.wm.edu |
Abstract
The last few years has seen renewed interest in teaching
programming-in-the-large (PIL) and programming-in-context of a larger existing
program (PIC) throughout the computer science curriculum. Although these
skills have been a focus of software engineering courses and capstone projects,
there is an emphasis to teach these skills in other courses across the
curriculum. This paper addresses incorporation of PIL and PIC in the programming
language course, and presents specific PIC and PIL projects using an interpreter
for SLic, a simple logic (declarative) language. SLic itself is part of
a family of interpreters in MuLE, a software environment designed to support
interpreter-based projects in the programming languages course. MuLE is
written in DrScheme (from Riceís PLT software project distributed under
the GNU Library General Public License) and runs under Windows 95/98/NT/2000,
MacOS, and Unix/X.
In industry, large software projects developed by
teams of programmers are common. Unfortunately, the requisite skills for
success in such settings are sometimes undervalued by academia and can
be ignored in the curriculum. These skills include the ability to work
in teams and to modify and maintain pieces of much larger software systems.
Thus there is a gap between what we teach and what many computer scientists
face in commercial software development. We should close this gap.
Advocacy for teaching team-oriented, PIL and PIC software development skills has always been present in curriculum discussions, but the skills were often cloistered in a software engineering course or as a capstone project. There is momentum to place new emphasis on these skills throughout the curriculum. [9, 13] Furthermore it is recognized that "education should not be constrained to development" and that "students gain valuable insight in the process of reading, deciphering, and evaluating the design and implementation of an existing software system." [13] The ACM computing curricula 2001 may even incorporate team skills and PIL skills into their new curriculum. [7]
It is important to include PIL or PIC projects in courses
other than software engineering or as capstone projects. Here, we describe
PIL and PIC oriented projects in the programming languages course using
MuLE (Multiple Language Environment), a software environment developed
to support incorporation of interpreter-based projects into the programming
languages course. This differs from Kaminís interpreter based approach
in that one similar tool (MuLE) for several paradigms is used. [10] In
particular we will describe a set of projects concerned with the implementation
of MuLEís interpreter for SLic (Simple LogIC language) a simple logic (declarative)
language.
As the programming languages course is concerned with
the implementation of languages, and this implementation often comes in
the form of large software packages, i.e. compilers and interpreters, the
course is a natural setting for teaching PIC and PIL skills. There have
been a number of papers and books supporting the use of interpreters in
this course with the goal, as Friedman et. al. put it, "to give students
a deep, hands-on understanding of the essential concepts of programming
languagesÖ". [8] Furthermore, the ACM 91 curriculum goals to give students
extensive practice with virtual machines, language translation, semantics
and programming paradigms are also in harmony with implementation projects.
[1] Thus the implementation or modification of these large programs is
well established pedagogically in the programming language course. [5,
6, 10]
The difficulty is to avoid bogging students down in a large interpreter implementation at the expense of other teaching goals, such as broadening student experience with a variety of language features and design concerns. We developed MuLE to address this concern, and to provide a simplified setting for interpreter projects as ancillary teaching tools.
In spite of its relatively small size and simplicity compared
to interpreters for "real" programming languages, MuLE has still proved
an adequate vehicle for PIC and PIL projects for the following reasons.
First, what is important for some PIC and PIL skills is not the absolute
size of the software system, but rather the size and complexity of the
software to be modified relative to the modifierís understanding the system.
Second, MuLEís apparent size is magnified as MuLE is written in Scheme,
which is often a new language for the student. Consequently, a project
involving reading, deciphering and modifying a 2000 line MuLE interpreter
written in an unfamiliar language still offers valuable experience (and
teaching opportunities) on working within a software system one does not
comprehend in its totality.
The PIC and PIL projects described subsequently involve
the MuLE system. MuLE consists of a launcher program, a utilities file,
and a set of four interpreters for each of four major programming language
paradigms: imperative, functional, object-oriented and declarative. MuLE
permits students to easily experiment with a number of different language
paradigms and to then extend or modify the interpreters for the languages
to gain insight into the relationship between the implementation of languages
and their use.
The built-in interpreters are SLam (Simple Lambda language, a functional language), SPoc (Simple Procedural language, an imperative language), SOOP (Simple Object Oriented Programming language), and SLic (Simple LogIC language, a declarative language). The interpreters are designed to share as much code as possible to facilitate students moving from one interpreter to another for projects.
MuLE is started by running the launcher that provides an interface for starting each of the interpreters. Upon launch, an interpreter displays a double paned window in which the user can enter, edit, and evaluate programs. A user can start any or all of the interpreters and can even start multiple copies of a particular interpreter.
Each interpreter is a separate module which consists of
a parse function (or possibly a set of parse functions) and an execute
function. Projects can be based on writing a particular function, a complete
component (like the parser), or on extending the features of a language.
Specific details of MuLE and example projects for functional and procedural
languages are given in [5, 3, 4, 2]. Each of MuLEís interpreters provides
a context for PIC or PIL projects, but here we concentrate on SLic projects.
SLic is a logic language that provides some of the
basic features found in declarative languages like Prolog. Logic languages
are normally taught as part of the programming languages course both because
it is a unique paradigm that has features not found in other languages
and because logic is a basic concept that is important in many parts of
the computer science curriculum.
In general, declarative programs describe properties of the solution as a collection of "facts" and "rules" (called the database), but do not specify how to compute the solution. Instead, the database engine derives the actual solution using predicate calculus to infer valid statements. For example, in a procedural language, when we need to sort a list of items, we write procedures to compare two items, to swap two items, and finally, an algorithm to specify when to compare items and when to swap them. In a declarative language, we simply state the current list as a "fact" and that a sorted list is a permutation of the order of the elements in the list such that for any two consecutive items in the list, a given relationship holds. The database engine will use inference rules to deduce the sorted list.
Interpreters for declarative languages also employ unique algorithms (resolution, backward or forward chaining, and unification) not found in other language paradigms. Although the interpreter still has a parser, instead of an "execute" component it has a database engine consisting of search, unification, inference, and output components (with auxiliary functions). In operation, an expression is read by the interpreter and parsed to determine whether the expression is a fact (which describes an attribute), a goal (which is the question to be answered), or a rule (which describes implications). Depending on the expression, the interpreter either extends the database (by a fact or a rule) or searches the database (under some constraints). Consequently, the SLic interpreter implementation shares the least code with MuLEís other interpreters.
SLic searches with a back-chaining algorithm and relies unification to identify unique matches. Within unification infinite recursion must be prevented with what is usually called the occur check. The SLic language does not allow recursive rules, however the interpreter does have an occur function for future language growth. The following algorithm describes the basic iterative loop structure of the search engine of SLic.
SLic implements the search algorithm via three recursive
functions, back-chain(), prove(), and prover(), which
each represent one level of the triple nested for loop in the algorithm.
SLic also includes a parser function, a unification function, and a collection
of auxiliary helper functions for variable substitution, time-stamping,
pretty printing the answer sets and so forth. The functional decomposition
of SLic makes it especially useful for PIC programming since each function
can be implemented independently if the interfaces between the functions
is specified with sufficient care. Projects using SLic can be derived from
the parser, the unification function, the inference engine (back-chaining,
proving and the answer set of functions) or from the auxiliary functions.
The described projects are of three flavors: introductory
projects which introduce the MuLE environment and its implementation language,
Scheme; PIC projects which require students to understand the overall design
of the SLic interpreter and the flow of information through the interpreter
to implement a specific component; and PIL projects where students must
design and implement 500-1,000 line modules that interface with several
other components of the interpreter.
This project requires students to write some of the
ancillary functions used by the SLic interpreter. The first task is to
write a simple output function, printAnswer, to print the results
of the database search. The second task is to write two functions, rename
and unname, to rename variables before and after unification.
The printAnswer function takes a single argument
which is either the token ëfail or a list of items, each of which
is itself a list of valid variable assignments with the following specification:
The second part of the project is to implement some variable manipulation functions required by the unification function. When the same variables are used in multiple facts, rules, and queries, variable context is not clear within the database engine. Therefore prior to unification, the variables are renamed using different time stamps for each expression. The function rename inputs an expression and a timestamp and replaces all variables in the expression with a variable-timestamp pair according to the following specification:.
After unification, a similar unname function must be written which takes an expression and replaces renamed variables with their original names. Another boolean helper function is provided to determine whether an item x is a renamed variable or not.
The second project gives students hands-on experience
with the fundamental unification algorithm used in the inference engine.
We are motivated by the opinion that "we should always expose our students
to systems that we wouldnít expect them to develop from scratch." [13]
Students are given the code for SLic with the unification function removed
and are expected to write the missing function. This requires students
to understand the SLic interpreterís basic structure, unification in theory
and how to match two expressions in practice. Thus, this project is a classic
PIC project where students must supply a small part to a large program.
Students are asked to implement the original unification algorithm proposed by J. A. Robinson. [11] Two expressions are said to unify (or match) if there exists a set of substitutions that, when applied to both expressions, results in identical expressions. Unification is the process of finding such a set of substitutions, if that set exists. The concept of unification remains the standard method to match expressions in database engines.
In the context of the SLic interpreter, the unification function is a recursive function, unif, which takes three arguments, the two expressions to be unified and a list representing the current set of variable assignments with the following specification:
( unif
(lambda (expr1 expr2 s)
Ö
))
unif should either return a list of valid variable
assignments or return the token ëfail if no such valid assignments
are possible. For example, consider when unif is called with (eats
(X 0) (Y 0)) as expr1, (eats shark (X 1)) as expr2,
and s as the empty list (meaning no variables have been assigned
values). Unification should return the list (((X 0) shark) ((Y 0) (X
1))) which indicates that assigning (X 0) the value shark,
and (Y 0) same value as (X 1) is a valid assignment set.
However, for the same expressions, if s were (((X 0) bear)),
unification would fail because (X 0) has been assigned to be bear
and cannot be shark, so we cannot create a valid assignment set.
Note that per the introductory project, variables have been renamed.
For the project, students are given the skeleton interpreter for SLic designed to work within the MuLE environment, the unification function, and some ancillary functions but no inference engine. Possibly working in teams, students are required to implement the inference engine using the backward-chaining algorithm. Of the methods to find answers to queries, backward chaining is still the most straightforward technique. With backward chaining, we begin with the goal and find a sequence of matching facts and/or rules that lead to true statements with no unassigned variables. Thus, to find the set of answers to a query, we attempt to unify the query with each item in the database. A fact that succeeds represents one correct answer, since variables are not allowed in facts. For rules, we attempt to unify the left hand side of the rule with the goal. Success means that we must begin another query with the right hand side as the goal.
Implementation requires a firm understanding of all the principles of both unification and backward chaining. The primary challenge is to organize the search through the database, sub-goals, and potential answers. Our "standard solution" as described in the earlier discussion of the SLic interpreter uses three recursive functions, but other solutions are possible.
The specific PIL skills to be emphasized can be varied.
Although the project itself only involves writing 500 to 1,000 lines of
code, it is sufficiently large to be done with small teams. One opportunity
is to emphasize team-oriented skills such as the use of design meetings,
requirements documents, requirements reviews, and code inspections. Another
opportunity exists to emphasize project requirement concerns such as extensibility
and robustness, and promote critical thinking and design with these requirements
in mind. Additionally, standard software engineering practices are taught
and encouraged to ensure that students get these benefits of a PIL experience
as well. One could even require student teams to use a public domain version
control system, such as CVS, to access the source code, with teams
modifying different parts of MuLE, but integrating into a single student
version of MuLE.
References
[1] ACM/IEEE-CS Joint Curriculum Task Force (1991). Computing
Curricula 1991. New York, ACM Press.
[2] Barr, John and L.A. Smith King, "Interpreter-based Projects for a Traditional Programming Languages Course", The Journal of Computing in Small Colleges, Volume 10, Number 2, November 1994.
[3] Barr, John and L.A. Smith King, "An Environment for Interpreter-based Programming Language Projects", twenty-sixth SIGCSE Technical Symposium on Computer Science Education, Volume 26, Number 1, March 1995.
[4] John Barr and L.A. Smith King, "Teaching Programming Languages by Counter-Example", The Proceedings of the Eleventh Annual Eastern Small College Computing Conference, New Rochelle, NY, October 20-21, 1995.
[5] Barr, John and L.A. Smith King, "An Environment for Interpreter-based Projects for the Programming Languages Course", National Science Foundation DUE CCLI-EMD 9952398, January 2000 ó June 2001.
[6] Bruce, Kim, "Formal Semantics and Interpreters in a Principles of Programming Languages Course", Thirtieth SIGCSE Technical Symposium on Computer Science Education, Volume 39, Number 1, March 1999.
[7] http://www.computer.org/education/cc2001/index.htm, accessed 8/22/2000.
[8] Friedman, D., Wand, M. and Haynes, C., Essentials of Programming Languages, The MIT Press, 1992.
[9] Haines, Jimmie, "Collaborative Learning in Undergraduate Information Science Education", Panel, Twenty-Sixth SIGSCE Technical Symposium on Computer Science Education, Volume 27, Number 1, March 1995.
[10] Kamin, Samuel, Programming Languages: An Interpreter-Based Approach, Addison-Wesley Publishing Company, 1990.
[11] Robinson, J.A., "A Machine Oriented Logic Based on the Resolution Principle", J.A.C.M. Volume 12, Number 1, 1965, pages 23-44.
[12] Sebesta, Robert W., Concepts of Programming Languages, 4th Edition, Addison-Wesley, 1999.
[13] Stevens, K. Todd, John Lewis, et.al. "Using Large Projects in a Computer Science Curriculum", Panel, Thirty-first SIGSCE Technical Symposium on Computer Science Education, Volume 32, Number 1, March 2000.