Unification Lab[1]
The SLic
Interpreter.
Assumptions
This labs assumes knowledge of the MuLE environment and
familiarity with the SLic (Simple Logic) language.
Objective
To understand unification, the process of matching two
expressions.
Overview
of SLic
SLic is a logic language that provides some of the basic
features found in declarative languages like PROLOG. The declarative/logic paradigm is a unique paradigm that has
features not found in languages of other paradigms. In general, declarative programs describe properties of the
solution as a collection of "facts" and "rules" (called the
database), but do not specify how to compute the solution. Instead, the database engine derives the
actual solution using predicate calculus to infer valid statements.
In a SLic program, the user interacts with the database of
facts and rules by asking queries. A
query is a question to the database asking if certain properties can be deduced
based on the current knowledge in the database.
Overview
of Laboratory Assignment
A fundamental problem in finding query results is to
determine if two expressions unify, or match.
Inside the database, the fact
eats { bear fish
}
would look like:
((eats bear
fish))
and the rule
predator { X } if
eats { X Y } and animal { Y }
would look like:
((predator
X)(eats X Y)(animal Y))
In general, a fact or rule is represented as a list with a
sub-list representing each expression.
For these expressions, the functor
is
the car of the
list and the operands the cdr.
Unify will be passed two expressions from either facts,
rules, or the query. They will be of
the form (eats bear fish) or (predator (X 0)). Note there is only one level of ()s. Note also that the variable X has been
replaced by the list (X 0) representing a time-stamped variable. This renaming process is done outside
unification.
Explanation
diversion: To understand why
renaming is necessary, consider the expressions (father X Y) and (father
Y X). Unification would conclude
that only if X = Y do these terms match. From our implied meaning of this example, that could only occur if
someone was his own father - which makes no sense from a genealogical
standpoint. To avoid this problem, all
variables are renamed using a time-stamp.
Two expressions unify if they match both in functor and operands. When no variables
are present, this is a simple task of checking each pair of items to make sure
they match.
The inclusion of variables makes this process more
complicated for two reasons. First,
there could be variables in both expressions and these variables may occur more
than once. Also, because unification is
ultimately used to answer queries of the database, there may be predetermined
values for certain variables. For
example, we may want to unify (foo (X 0) d) and (bar c (Y 0)) with the
limitation that (Y 0) = f. In this
case the two expressions would not unify.
The final answer to the question of whether two expressions
unify is either a list of variable assignments that cause the expressions to match,
or "fail," meaning the two expressions did not match.
Example 1:
Unify
(predator
(X 0))
and
(predator lion)
with no initial variable assignments.
The result of unification would be
(((X 0) lion))
meaning the two expressions unify or match when (X 0) is lion.
Example 2:
Unify
(father (X 0)
ben)
and
(father joe
jennifer)
with no initial variable assignments.
The result of unification would be
'fail
because "ben" and
"jennifer" do not match.
The
Unification Algorithm in Detail
Unification is a recursive process. The unif function
takes three parameters, e1, e2, and s. e1 and e2 are the expressions we are attempting to unify and s is the
current variable assignment list (VAL). In general, we work from left to right
determining if each pair of symbols unify.
When we encounter a variable in one expression, the value in the other
expression becomes its assignment and it is added to the VAL. The return value of each call of unif is the
current VAL.
At the most basic level, the two expressions could be
symbols. If the two are equal (using
the function eq?), then the two expressions unify and the return
value is the current VAL. (Note that within the SLic window, symbols are case-sensitive
but in the interpreter window all symbols are case-insensitive).
If either (or both) of the expressions is a variable, then
the two expressions unify. This means
we have just found another variable assignment to add to the VAL. The function add has the following form:
(add
(lambda (x val s)
...
where x is a variable, val is the
value assigned the that variable, and s is a VAL. The return value is the VAL with the
inclusion of x assigned val.
Optional: In a full implementation of a logic language,
we would have to make sure the variable does not occur in the other expression.
Since SLic is a simple logic language,
recursive facts and rules are not allowed and the occur check can be omitted.)
The next check is to see if either of the expressions is a
symbol (using the function symbol?). Since neither is a variable (from the above test) and the two
don't match directly, if either expression is a symbol, it is not possible for
the two to unify. This check is required
because the remaining conditions assume both expressions are lists (i.e. not
symbols). If either is a symbol, the
return value is 'fail.
At this point, we know both are lists. If the two lists do not have the same
length, the two cannot unify, and the return value is 'fail.
Otherwise, we need to check to see if the first two sub-expressions
of the expressions unify using the current VAL. Let news (read new 's') be the
return value of this attempt at unification.
If the recursive call returned 'fail, then
this attempt should also return 'fail. Else, we now want to see if the cdr of the
two expressions unify using
the new VAL, news.
Note that when we encountered a variable, we did not check
to see if it already was assigned a value in the VAL. Rather than doing this check when we find a
variable, we replace all variables with their assigned values. Since we have potentially updated the VAL, we need to
make sure all variables have been assigned before we recursively
call unif.
The function appsub looks like
(appsub
(lambda (te s)
...
where te is the expression to be modified and s is the
variable list. Thus, the parameters to
the recursive call to unif are e1 and e2 after we
have applied the substitution, and news.
Subtle
issue:
When two expressions are passed in with a non-empty VAL, the
first two items in the expressions are symbols representing the functors. Therefore, given that the two functors
unify, appsub will be called to assign all the variables with
their assigned values.
Consider the example of attempting to unify
e1 = (foo (X 0) b (X 0))
e2 = (foo c (Y 0) d)
s = '()
Since the two expressions are both lists, they don't match
directly, neither is a variable, and neither is a constant. Therefore, we check to see if the car of each
list unifies. The second call to unify
is given
e1 = foo
e2 = foo
s = '()
In this case, the two match directly and we can continue
checking the rest of each list with a recursive call to unify
e1 = ((X 0) b (X 0))
e2 = (c (Y 0) d)
s = '()
As with the first call, the two don't match directly,
neither is a variable or a constant so we check the first item:
e1 = (X 0)
e2 = c
s = '()
In this case, the first expression is a variable so we add
the variable assignment (X 0) = c to the VAL.
Before we recursively call unify again, we must apply the
known variable assignments. Thus, the
recursive call to unif will have the parameters:
e1 = (b c)
e2 = ((Y 0) d)
s = (((X 0) c))
Note that we have changed the (X 0) in the
first expression to a c and that the variable list now denotes the
assignment.
As before, we will have to check the first expressions
e1 = b
e2 = (Y 0)
s = (((X 0) c))
and we will discover that (Y 0) must be b. We use add to add this assignment to the VAL.
Thus, the next recursive call to unify will be
e1 = (c)
e2 = (d)
s = (((X 0) c)((Y
0) b))
Since the two don't match directly, neither is a variable or
constant, we attempt to unify the first expressions
e1 = c
e2 = d
s = (((X 0) c)((Y
0) b))
Here, we discover that the two don't match, neither is a
variable, but both are constants. Thus
we have discovered that the two do not unify.
This call to unif will
return 'fail.
The call with (c) and (d) will
return 'fail.
The call with (b c) and ((Y 0) d) will
return 'fail.
The call with ((X 0) b (X 0)) and (c (Y 0)
d) will return 'fail and ultimately the call with (foo (X
0) b (X 0)) and (foo c (Y 0) d) will return 'fail.
(A diagram with the various calls to unif would be
helpful in the final document).
Now consider the example with
e1 = (foo (X 0) b (X 0))
e2 = (foo c (Y 0) c)
s = '()
In this case, the trace will be almost identical except
during the last check, we will find that
e1 = c
e2 = c
s = (((X 0) c)((Y
0) b))
and this will return (((X 0) c)((Y 0) b)) since the
two are equal. This return value will
propagate through the recursive calls and be the final answer.
Assignment
You are to implement the function unif which has
the following form:
(unif
(lambda (e1 e2 s)
...
e1 and e2 are the two expressions to unify and s is the VAL. In addition to the standard scheme
functions, you may (should) use the following functions in your implementation.
(var?
(lambda (str)
...
var? returns whether str is a
variable.
(add
(lambda (x val s)
...
x is a variable, val is a
value for that variable and s is a VAL. add returns
an updated VAL with x assigned val.
(appsub
(lambda (te s)
...
Using the VAL s, appsub applies
all variable substitutions possible in te and
returns the modified expression.
(occur?
(lambda (x te)
...
occur? returns whether the variable x occurs in
the expression te.