Unification Lab

Unification Lab[1]

The SLic Interpreter.

Assumptions

This labs assumes knowledge of the MuLE environment and familiarity with the SLic (Simple Logic) language.

Objective

To understand unification, the process of matching two expressions.

Overview of SLic

SLic is a logic language that provides some of the basic features found in declarative languages like PROLOG. The declarative/logic paradigm is a unique paradigm that has features not found in languages of other paradigms. In general, declarative programs describe properties of the solution as a collection of "facts" and "rules" (called the database), but do not specify how to compute the solution. Instead, the database engine derives the actual solution using predicate calculus to infer valid statements.

In a SLic program, the user interacts with the database of facts and rules by asking queries. A query is a question to the database asking if certain properties can be deduced based on the current knowledge in the database.

Overview of Laboratory Assignment

A fundamental problem in finding query results is to determine if two expressions unify, or match. Inside the database, the fact

eats { bear fish }

would look like:

((eats bear fish))

and the rule

predator { X } if eats { X Y } and animal { Y }

would look like:

((predator X)(eats X Y)(animal Y))

In general, a fact or rule is represented as a list with a sub-list representing each expression. For these expressions, the functor is

the car of the list and the operands the cdr.

Unify will be passed two expressions from either facts, rules, or the query. They will be of the form (eats bear fish) or (predator (X 0)). Note there is only one level of ()s. Note also that the variable X has been replaced by the list (X 0) representing a time-stamped variable. This renaming process is done outside unification.

Explanation diversion: To understand why renaming is necessary, consider the expressions (father X Y) and (father Y X). Unification would conclude that only if X = Y do these terms match. From our implied meaning of this example, that could only occur if someone was his own father - which makes no sense from a genealogical standpoint. To avoid this problem, all variables are renamed using a time-stamp.

Two expressions unify if they match both in functor and operands. When no variables are present, this is a simple task of checking each pair of items to make sure they match.

The inclusion of variables makes this process more complicated for two reasons. First, there could be variables in both expressions and these variables may occur more than once. Also, because unification is ultimately used to answer queries of the database, there may be predetermined values for certain variables. For example, we may want to unify (foo (X 0) d) and (bar c (Y 0)) with the limitation that (Y 0) = f. In this case the two expressions would not unify.

The final answer to the question of whether two expressions unify is either a list of variable assignments that cause the expressions to match, or "fail," meaning the two expressions did not match.

Example 1:

Unify

(predator (X 0))

and

(predator lion)

with no initial variable assignments.

The result of unification would be

(((X 0) lion))

meaning the two expressions unify or match when (X 0) is lion.

Example 2:

Unify

(father (X 0) ben)

and

(father joe jennifer)

with no initial variable assignments.

The result of unification would be

'fail

because "ben" and "jennifer" do not match.

The Unification Algorithm in Detail

Unification is a recursive process. The unif function takes three parameters, e1, e2, and s. e1 and e2 are the expressions we are attempting to unify and s is the current variable assignment list (VAL). In general, we work from left to right determining if each pair of symbols unify. When we encounter a variable in one expression, the value in the other expression becomes its assignment and it is added to the VAL. The return value of each call of unif is the current VAL.

At the most basic level, the two expressions could be symbols. If the two are equal (using the function eq?), then the two expressions unify and the return value is the current VAL. (Note that within the SLic window, symbols are case-sensitive but in the interpreter window all symbols are case-insensitive).

If either (or both) of the expressions is a variable, then the two expressions unify. This means we have just found another variable assignment to add to the VAL. The function add has the following form:

(add

(lambda (x val s)

...

where x is a variable, val is the value assigned the that variable, and s is a VAL. The return value is the VAL with the inclusion of x assigned val.

Optional: In a full implementation of a logic language, we would have to make sure the variable does not occur in the other expression. Since SLic is a simple logic language, recursive facts and rules are not allowed and the occur check can be omitted.)

The next check is to see if either of the expressions is a symbol (using the function symbol?). Since neither is a variable (from the above test) and the two don't match directly, if either expression is a symbol, it is not possible for the two to unify. This check is required because the remaining conditions assume both expressions are lists (i.e. not symbols). If either is a symbol, the return value is 'fail.

At this point, we know both are lists. If the two lists do not have the same length, the two cannot unify, and the return value is 'fail.

Otherwise, we need to check to see if the first two sub-expressions of the expressions unify using the current VAL. Let news (read new 's') be the return value of this attempt at unification. If the recursive call returned 'fail, then this attempt should also return 'fail. Else, we now want to see if the cdr of the two expressions unify using

the new VAL, news.

Note that when we encountered a variable, we did not check to see if it already was assigned a value in the VAL. Rather than doing this check when we find a variable, we replace all variables with their assigned values. Since we have potentially updated the VAL, we need to make sure all variables have been assigned before we recursively

call unif. The function appsub looks like

(appsub

(lambda (te s)

...

where te is the expression to be modified and s is the variable list. Thus, the parameters to the recursive call to unif are e1 and e2 after we have applied the substitution, and news.

Subtle issue:

When two expressions are passed in with a non-empty VAL, the first two items in the expressions are symbols representing the functors. Therefore, given that the two functors unify, appsub will be called to assign all the variables with their assigned values.

Consider the example of attempting to unify

e1 = (foo (X 0) b (X 0))

e2 = (foo c (Y 0) d)

s = '()

Since the two expressions are both lists, they don't match directly, neither is a variable, and neither is a constant. Therefore, we check to see if the car of each list unifies. The second call to unify is given

e1 = foo

e2 = foo

s = '()

In this case, the two match directly and we can continue checking the rest of each list with a recursive call to unify

e1 = ((X 0) b (X 0))

e2 = (c (Y 0) d)

s = '()

As with the first call, the two don't match directly, neither is a variable or a constant so we check the first item:

e1 = (X 0)

e2 = c

s = '()

In this case, the first expression is a variable so we add the variable assignment (X 0) = c to the VAL.

Before we recursively call unify again, we must apply the known variable assignments. Thus, the recursive call to unif will have the parameters:

e1 = (b c)

e2 = ((Y 0) d)

s = (((X 0) c))

Note that we have changed the (X 0) in the first expression to a c and that the variable list now denotes the assignment.

As before, we will have to check the first expressions

e1 = b

e2 = (Y 0)

s = (((X 0) c))

and we will discover that (Y 0) must be b. We use add to add this assignment to the VAL.

Thus, the next recursive call to unify will be

e1 = (c)

e2 = (d)

s = (((X 0) c)((Y 0) b))

Since the two don't match directly, neither is a variable or constant, we attempt to unify the first expressions

e1 = c

e2 = d

s = (((X 0) c)((Y 0) b))

Here, we discover that the two don't match, neither is a variable, but both are constants. Thus we have discovered that the two do not unify.

This call to unif will return 'fail.

The call with (c) and (d) will return 'fail.

The call with (b c) and ((Y 0) d) will return 'fail.

The call with ((X 0) b (X 0)) and (c (Y 0) d) will return 'fail and ultimately the call with (foo (X 0) b (X 0)) and (foo c (Y 0) d) will return 'fail.

(A diagram with the various calls to unif would be helpful in the final document).

Now consider the example with

e1 = (foo (X 0) b (X 0))

e2 = (foo c (Y 0) c)

s = '()

In this case, the trace will be almost identical except during the last check, we will find that

e1 = c

e2 = c

s = (((X 0) c)((Y 0) b))

and this will return (((X 0) c)((Y 0) b)) since the two are equal. This return value will propagate through the recursive calls and be the final answer.

Assignment

You are to implement the function unif which has the following form:

(unif

(lambda (e1 e2 s)

...

e1 and e2 are the two expressions to unify and s is the VAL. In addition to the standard scheme functions, you may (should) use the following functions in your implementation.

(var?

(lambda (str)

...

var? returns whether str is a variable.

(add

(lambda (x val s)

...

x is a variable, val is a value for that variable and s is a VAL. add returns an updated VAL with x assigned val.

(appsub

(lambda (te s)

...

Using the VAL s, appsub applies all variable substitutions possible in te and returns the modified expression.

(occur?

(lambda (x te)

...

occur? returns whether the variable x occurs in the expression te.

[1] This work was partially funded by National Science Foundation CCLI-DUE # 9952398.