Ethan Frome

What Could Be More SLic?:

Projects for the Programming Languages Course

L.A.Smith King
Math & Computer Science
College of the Holy Cross
One College Street
Worcester, MA 01610
LA@cs.holycross.edu

John Barr
Computer Science & Math
Ithaca College
1212 Williams Hall
Ithaca, NY 14850-7284
barr@ithaca.edu

Ben Coleman
Computer Science
The College of William and Mary
McGlothlin-Street Hall
Williamsburg, VA 23187-8795
coleman@cs.wm.edu

Abstract

The last few years has seen renewed interest in teaching programming-in-the-large (PIL) and programming-in-context of a larger existing program (PIC) throughout the computer science curriculum. Although these skills have been a focus of software engineering courses and capstone projects, there is an emphasis to teach these skills in other courses across the curriculum. This paper addresses incorporation of PIL and PIC in the programming language course, and presents specific PIC and PIL projects using an interpreter for SLic, a simple logic (declarative) language. SLic itself is part of a family of interpreters in MuLE, a software environment designed to support interpreter-based projects in the programming languages course. MuLE is written in DrScheme (from Riceís PLT software project distributed under the GNU Library General Public License) and runs under Windows 95/98/NT/2000, MacOS, and Unix/X.

Introduction

In industry, large software projects developed by teams of programmers are common. Unfortunately, the requisite skills for success in such settings are sometimes undervalued by academia and can be ignored in the curriculum. These skills include the ability to work in teams and to modify and maintain pieces of much larger software systems. Thus there is a gap between what we teach and what many computer scientists face in commercial software development. We should close this gap.

Advocacy for teaching team-oriented, PIL and PIC software development skills has always been present in curriculum discussions, but the skills were often cloistered in a software engineering course or as a capstone project. There is momentum to place new emphasis on these skills throughout the curriculum. [9, 13] Furthermore it is recognized that "education should not be constrained to development" and that "students gain valuable insight in the process of reading, deciphering, and evaluating the design and implementation of an existing software system." [13] The ACM computing curricula 2001 may even incorporate team skills and PIL skills into their new curriculum. [7]

It is important to include PIL or PIC projects in courses other than software engineering or as capstone projects. Here, we describe PIL and PIC oriented projects in the programming languages course using MuLE (Multiple Language Environment), a software environment developed to support incorporation of interpreter-based projects into the programming languages course. This differs from Kaminís interpreter based approach in that one similar tool (MuLE) for several paradigms is used. [10] In particular we will describe a set of projects concerned with the implementation of MuLEís interpreter for SLic (Simple LogIC language) a simple logic (declarative) language.

PIL and PIC in the PL Course

As the programming languages course is concerned with the implementation of languages, and this implementation often comes in the form of large software packages, i.e. compilers and interpreters, the course is a natural setting for teaching PIC and PIL skills. There have been a number of papers and books supporting the use of interpreters in this course with the goal, as Friedman et. al. put it, "to give students a deep, hands-on understanding of the essential concepts of programming languagesÖ". [8] Furthermore, the ACM 91 curriculum goals to give students extensive practice with virtual machines, language translation, semantics and programming paradigms are also in harmony with implementation projects. [1] Thus the implementation or modification of these large programs is well established pedagogically in the programming language course. [5, 6, 10]

The difficulty is to avoid bogging students down in a large interpreter implementation at the expense of other teaching goals, such as broadening student experience with a variety of language features and design concerns. We developed MuLE to address this concern, and to provide a simplified setting for interpreter projects as ancillary teaching tools.

In spite of its relatively small size and simplicity compared to interpreters for "real" programming languages, MuLE has still proved an adequate vehicle for PIC and PIL projects for the following reasons. First, what is important for some PIC and PIL skills is not the absolute size of the software system, but rather the size and complexity of the software to be modified relative to the modifierís understanding the system. Second, MuLEís apparent size is magnified as MuLE is written in Scheme, which is often a new language for the student. Consequently, a project involving reading, deciphering and modifying a 2000 line MuLE interpreter written in an unfamiliar language still offers valuable experience (and teaching opportunities) on working within a software system one does not comprehend in its totality.

The MuLE System

The PIC and PIL projects described subsequently involve the MuLE system. MuLE consists of a launcher program, a utilities file, and a set of four interpreters for each of four major programming language paradigms: imperative, functional, object-oriented and declarative. MuLE permits students to easily experiment with a number of different language paradigms and to then extend or modify the interpreters for the languages to gain insight into the relationship between the implementation of languages and their use.

The built-in interpreters are SLam (Simple Lambda language, a functional language), SPoc (Simple Procedural language, an imperative language), SOOP (Simple Object Oriented Programming language), and SLic (Simple LogIC language, a declarative language). The interpreters are designed to share as much code as possible to facilitate students moving from one interpreter to another for projects.

MuLE is started by running the launcher that provides an interface for starting each of the interpreters. Upon launch, an interpreter displays a double paned window in which the user can enter, edit, and evaluate programs. A user can start any or all of the interpreters and can even start multiple copies of a particular interpreter.

Each interpreter is a separate module which consists of a parse function (or possibly a set of parse functions) and an execute function. Projects can be based on writing a particular function, a complete component (like the parser), or on extending the features of a language. Specific details of MuLE and example projects for functional and procedural languages are given in [5, 3, 4, 2]. Each of MuLEís interpreters provides a context for PIC or PIL projects, but here we concentrate on SLic projects.

The SLic Interpreter

SLic is a logic language that provides some of the basic features found in declarative languages like Prolog. Logic languages are normally taught as part of the programming languages course both because it is a unique paradigm that has features not found in other languages and because logic is a basic concept that is important in many parts of the computer science curriculum.

In general, declarative programs describe properties of the solution as a collection of "facts" and "rules" (called the database), but do not specify how to compute the solution. Instead, the database engine derives the actual solution using predicate calculus to infer valid statements. For example, in a procedural language, when we need to sort a list of items, we write procedures to compare two items, to swap two items, and finally, an algorithm to specify when to compare items and when to swap them. In a declarative language, we simply state the current list as a "fact" and that a sorted list is a permutation of the order of the elements in the list such that for any two consecutive items in the list, a given relationship holds. The database engine will use inference rules to deduce the sorted list.

Interpreters for declarative languages also employ unique algorithms (resolution, backward or forward chaining, and unification) not found in other language paradigms. Although the interpreter still has a parser, instead of an "execute" component it has a database engine consisting of search, unification, inference, and output components (with auxiliary functions). In operation, an expression is read by the interpreter and parsed to determine whether the expression is a fact (which describes an attribute), a goal (which is the question to be answered), or a rule (which describes implications). Depending on the expression, the interpreter either extends the database (by a fact or a rule) or searches the database (under some constraints). Consequently, the SLic interpreter implementation shares the least code with MuLEís other interpreters.

SLic searches with a back-chaining algorithm and relies unification to identify unique matches. Within unification infinite recursion must be prevented with what is usually called the occur check. The SLic language does not allow recursive rules, however the interpreter does have an occur function for future language growth. The following algorithm describes the basic iterative loop structure of the search engine of SLic.

for each sub-goal g' in the goal list g

for each set s' of variable assignments in s

for each item d in the database

sp = unify (g', d, s')

if sp = 'fail

remove s' from s

else

replace sí with sp in s

The algorithm checks each database item against each sub-goal of the expression and finds all valid variable assignments satisfying the goal. The variable g represents the goal, and is created when the interpreter parses a goal expression into a list of sub-goals (conjuncts). The variable s represents the current list of potential variable assignments sets. As combinations of sub-goals and database items are examined, the set s will shrink when an invalid assignment is eliminated, or grow when a new, possibly valid assignment is found. Those that remain after the outer loop terminates represent the variable assignments that are valid for all database items across all sub-goals.

SLic implements the search algorithm via three recursive functions, back-chain(), prove(), and prover(), which each represent one level of the triple nested for loop in the algorithm. SLic also includes a parser function, a unification function, and a collection of auxiliary helper functions for variable substitution, time-stamping, pretty printing the answer sets and so forth. The functional decomposition of SLic makes it especially useful for PIC programming since each function can be implemented independently if the interfaces between the functions is specified with sufficient care. Projects using SLic can be derived from the parser, the unification function, the inference engine (back-chaining, proving and the answer set of functions) or from the auxiliary functions.

SLic Project Examples

The described projects are of three flavors: introductory projects which introduce the MuLE environment and its implementation language, Scheme; PIC projects which require students to understand the overall design of the SLic interpreter and the flow of information through the interpreter to implement a specific component; and PIL projects where students must design and implement 500-1,000 line modules that interface with several other components of the interpreter.

An Introductory Project

This project requires students to write some of the ancillary functions used by the SLic interpreter. The first task is to write a simple output function, printAnswer, to print the results of the database search. The second task is to write two functions, rename and unname, to rename variables before and after unification.

The printAnswer function takes a single argument which is either the token ëfail or a list of items, each of which is itself a list of valid variable assignments with the following specification:

(printAnswer

(lambda (s)

Ö..

))

printAnswer

ëfail

The second part of the project is to implement some variable manipulation functions required by the unification function. When the same variables are used in multiple facts, rules, and queries, variable context is not clear within the database engine. Therefore prior to unification, the variables are renamed using different time stamps for each expression. The function rename inputs an expression and a timestamp and replaces all variables in the expression with a variable-timestamp pair according to the following specification:.

(rename

(lambda (expr ts)

ÖÖ

))

The return value is the expression, expr, in which all variables are renamed. For example, if rename were called with (eats X Y) and a time stamp of 0, the resulting expression would be (eats (X 0) (Y 0)). We also provide a boolean helper function used to determine if an item is a variable or not.

After unification, a similar unname function must be written which takes an expression and replaces renamed variables with their original names. Another boolean helper function is provided to determine whether an item x is a renamed variable or not.

A PIC Project

The second project gives students hands-on experience with the fundamental unification algorithm used in the inference engine. We are motivated by the opinion that "we should always expose our students to systems that we wouldnít expect them to develop from scratch." [13] Students are given the code for SLic with the unification function removed and are expected to write the missing function. This requires students to understand the SLic interpreterís basic structure, unification in theory and how to match two expressions in practice. Thus, this project is a classic PIC project where students must supply a small part to a large program.

Students are asked to implement the original unification algorithm proposed by J. A. Robinson. [11] Two expressions are said to unify (or match) if there exists a set of substitutions that, when applied to both expressions, results in identical expressions. Unification is the process of finding such a set of substitutions, if that set exists. The concept of unification remains the standard method to match expressions in database engines.

In the context of the SLic interpreter, the unification function is a recursive function, unif, which takes three arguments, the two expressions to be unified and a list representing the current set of variable assignments with the following specification:

    ( unif
        (lambda (expr1 expr2 s)
            Ö
    ))

unif should either return a list of valid variable assignments or return the token ëfail if no such valid assignments are possible. For example, consider when unif is called with (eats (X 0) (Y 0)) as expr1, (eats shark (X 1)) as expr2, and s as the empty list (meaning no variables have been assigned values). Unification should return the list (((X 0) shark) ((Y 0) (X 1))) which indicates that assigning (X 0) the value shark, and (Y 0) same value as (X 1) is a valid assignment set. However, for the same expressions, if s were (((X 0) bear)), unification would fail because (X 0) has been assigned to be bear and cannot be shark, so we cannot create a valid assignment set. Note that per the introductory project, variables have been renamed.

A PIL Project

The final project concerns programming-in-the-large. Students are asked to complete either the entire interpreter (after seeing the other interpreters) or a substantial portion of the interpreter. In the other MuLE languages, a logical choice here might be to complete the parser for the interpreter given the specification (e.g. BNF grammar) for the input language and the format of the parsed code. However, the SLic parser is very simple in comparison, so a more challenging PIL project within SLic entails writing the inference engine for the SLic interpreter.

For the project, students are given the skeleton interpreter for SLic designed to work within the MuLE environment, the unification function, and some ancillary functions but no inference engine. Possibly working in teams, students are required to implement the inference engine using the backward-chaining algorithm. Of the methods to find answers to queries, backward chaining is still the most straightforward technique. With backward chaining, we begin with the goal and find a sequence of matching facts and/or rules that lead to true statements with no unassigned variables. Thus, to find the set of answers to a query, we attempt to unify the query with each item in the database. A fact that succeeds represents one correct answer, since variables are not allowed in facts. For rules, we attempt to unify the left hand side of the rule with the goal. Success means that we must begin another query with the right hand side as the goal.

Implementation requires a firm understanding of all the principles of both unification and backward chaining. The primary challenge is to organize the search through the database, sub-goals, and potential answers. Our "standard solution" as described in the earlier discussion of the SLic interpreter uses three recursive functions, but other solutions are possible.

The specific PIL skills to be emphasized can be varied. Although the project itself only involves writing 500 to 1,000 lines of code, it is sufficiently large to be done with small teams. One opportunity is to emphasize team-oriented skills such as the use of design meetings, requirements documents, requirements reviews, and code inspections. Another opportunity exists to emphasize project requirement concerns such as extensibility and robustness, and promote critical thinking and design with these requirements in mind. Additionally, standard software engineering practices are taught and encouraged to ensure that students get these benefits of a PIL experience as well. One could even require student teams to use a public domain version control system, such as CVS, to access the source code, with teams modifying different parts of MuLE, but integrating into a single student version of MuLE.

Conclusion

PIC and PIL software development skills are more important than ever for our students who graduate to careers in industry. We should incorporate teaching PIC and PIL skills throughout the curriculum and not isolate them to a software engineering course or a capstone project. Our experience thus far has been positive using MuLe to address concerns with teaching PIC and PIL projects and formal assessment is planned for spring 2001 and fall 2001. We have used MuLE in a variety of ways in the programming languages course as well as capstone experiences. The first version of the SLic interpreter was itself a senior capstone project. MuLE has been used to reinforce lecture material on any number of implementation ideas, teaching by counter-example and as a vehicle for group projects. A number of PIC and PIL projects based on language interpreters are possible and in harmony with the teaching goals of programming language course. A set of projects concerned with the implementation of MuLEís interpreter for SLic have been described. Although large capstone projects are valuable and worthwhile, it is not the only vehicle for effective presentation of PIC and PIL issues. Smaller MuLE sized projects such as the SLic projects described herein also effectively expose students to PIC and PIL issues.

References
[1] ACM/IEEE-CS Joint Curriculum Task Force (1991). Computing Curricula 1991. New York, ACM Press.

[2] Barr, John and L.A. Smith King, "Interpreter-based Projects for a Traditional Programming Languages Course", The Journal of Computing in Small Colleges, Volume 10, Number 2, November 1994.

[3] Barr, John and L.A. Smith King, "An Environment for Interpreter-based Programming Language Projects", twenty-sixth SIGCSE Technical Symposium on Computer Science Education, Volume 26, Number 1, March 1995.

[4] John Barr and L.A. Smith King, "Teaching Programming Languages by Counter-Example", The Proceedings of the Eleventh Annual Eastern Small College Computing Conference, New Rochelle, NY, October 20-21, 1995.

[5] Barr, John and L.A. Smith King, "An Environment for Interpreter-based Projects for the Programming Languages Course", National Science Foundation DUE CCLI-EMD 9952398, January 2000 ó June 2001.

[6] Bruce, Kim, "Formal Semantics and Interpreters in a Principles of Programming Languages Course", Thirtieth SIGCSE Technical Symposium on Computer Science Education, Volume 39, Number 1, March 1999.

[7] http://www.computer.org/education/cc2001/index.htm, accessed 8/22/2000.

[8] Friedman, D., Wand, M. and Haynes, C., Essentials of Programming Languages, The MIT Press, 1992.

[9] Haines, Jimmie, "Collaborative Learning in Undergraduate Information Science Education", Panel, Twenty-Sixth SIGSCE Technical Symposium on Computer Science Education, Volume 27, Number 1, March 1995.

[10] Kamin, Samuel, Programming Languages: An Interpreter-Based Approach, Addison-Wesley Publishing Company, 1990.

[11] Robinson, J.A., "A Machine Oriented Logic Based on the Resolution Principle", J.A.C.M. Volume 12, Number 1, 1965, pages 23-44.

[12] Sebesta, Robert W., Concepts of Programming Languages, 4th Edition, Addison-Wesley, 1999.

[13] Stevens, K. Todd, John Lewis, et.al. "Using Large Projects in a Computer Science Curriculum", Panel, Thirty-first SIGSCE Technical Symposium on Computer Science Education, Volume 32, Number 1, March 2000.