CS 161 W '02 PROJECT Due date (for both Parts I & II): Pass in to TA by noon, Fri Mar 15. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Overview: -------- This project involves writing (in Scheme) a forward-chainer (Fig. 9.1, p.273 of textbook) that is applied to a (very) short story. You may work alone (Part I) or form a team with another person (at most 2 per team) and do extra work (Part II). Part I: If you work alone then you are expected to write the following ------ functions/data: 1. UNIFY : a unifier. 2. FCHAIN : a forward-chainer (that calls the unifier). 3. READLOGIC : takes as input the short story in logic format and calls FCHAIN repeatedly to augment the knowledge base. 4. KB : a knowledge base, consisting of logic facts and rules. 5. You will also write a brief (1-3 page) discussion of the final status of your project and any problems or issues you encountered and how you dealt with them (or why they were too difficult to solve). If you work in a team, then you are expected to do 1-4 above and also to do 6-12 of Part II. Part II: Working in a team will enable those (who are interested) to ------- explore in more detail a few of the issues involved in Natural Language Processing. 6. LEX : a lexicon that holds two-way associations of word patterns with logic patterns 7. ENGLOG : a function that maps an English sentence into a logic representation 8. LOGENG : a function that maps a logic representation into an English sentence 9. READSTORY : a function that reads the English story, interspersed with possible questions, augments the KB and answers the questions 10. QA : a function that takes a question and returns an answer in English 11. You will also hand in a description of how the work was divided up (who coded what; who debugged what; who wrote up what) 12. You will each separately write a 2-3 page description of the final status of Part II of your team project and problems you encountered with Part II. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - DETAILS for Part I ------------------ i) Short story and logical equivalent: ---------------------------------- The short story is labeled THIEVERY. It consists of a list of 9 simple English sentences (as sublists). THIEVERY = ( (JOHN ADMIRED FRANK) (MARY LIKED JOHN) (FRANK WAS GREEDY) (JOHN DROVE TO WORK) (FRANK SAW JOHNS HONDA) (MARY SAW FRANK STEAL JOHNS HONDA) (MARY TOLD JOHN) (JOHN WALKED HOME) (JOHN WAS FURIOUS) ) LOGTHIEV consists of the same story, but already represented as logical sentences (in Scheme-like syntax) which represents, say, Likes(John, Mary) as (Likes John Mary). Notice that nearly every logical sentence is augmented with a symbol representing a different situation and that situations follow each other. LOGTHIEV = ( (ADMIRES JOHN FRANK S0) (FOLLOWS S0 S1) (LIKES MARY JOHN S1) (FOLLOWS S2) (GREEDY FRANK) (DRIVES JOHN WORK S2) (FOLLOWS S2 S3) (SEES FRANK (HONDA (OWNEDBY JOHN)) S3) (FOLLOWS S3 S4) (SEES MARY (STEALS FRANK (HONDA (OWNEDBY JOHN)) S4) S4) (FOLLOWS S4 S5) (TELLS MARY JOHN S5) (FOLLOWS S5 S6) (WALKS JOHN HOME S6) (FOLLOWS S6 S7) (ANGRY JOHN S7) ) ii) Knowledge Base (KB): ------------------- KB consists of one or more facts and a greater number of rules, which together represent the system's initial knowledge of the world. The knowledge base is described informally below in an English-like format. Notice that situations are not specified. Part of your task is to re-write the knowledge base into a Scheme-style logic format and include the appropriate situation variables. - Facts in KB: MARY is HUMAN MARY LIKES JOHN - Rules in KB: a) Inferring Likes: 1. If x LIKES y and x POSSESS z and y STEALS z and x not KNOW that y STEALs z Then x still LIKES y 2. If x POSSESS y Then x LIKES y 3. If x STEALs y Then x LIKES y 4. If x ADMIRES y in s and y is HUMAN Then x LIKES y b) Inferring Angryat: 1. If x POSSESS z and x KNOWS that y STEALS z Then x ANGRYAT y 2. If x ANGRY and x KNOWS just recently that y STEALS from x Then x ANGRYAT y c) Inferring Steals: 1. If x is GREEDY and x SEES y and y is not HUMAN Then x STEALS y d) Inferring Tells: 1. If x is HUMAN and x KNOWS that y STEALS z and x LIKES w and w POSSESS z in recent past Then x TELLS w that y STEALS z 2. If x TELLS y and x SEES z just before Then x TELLS y about z e) Inferring Knows: 1. If x TELLS y z and y is HUMAN Then y KNOWS z 2. If x is HUMAN and x SEES that y STEALS z Then x KNOWS that y STEALS z f) Inferring Locations: 1. If x DRIVES to y Then x is ATLOC y 2. If x WALKS to y Then x is ATLOC y g) Inferring Works/Quitswork: 1. If x is HUMAN and x is ATLOC WORK Then x WORKS 2. If x WORKS and x is FURIOUS at y Then x QUITS WORK h) Inferring Walks: 1. If x QUITSWORK and x not POSSESS CAR Then x WALKS HOME i) Inferring Human: 1. If x DRIVES to y Then x is HUMAN 2. If x is GREEDY Then x is HUMAN 3. If x ADMIRES y Then x is HUMAN 4. If x WORKS Then x is HUMAN 5. If x TELLS y z Then x is HUMAN j) Inferring Mortal: 1. If x is HUMAN Then x is MORTAL k) Inferring After: 1. If s2 FOLLOWS s1 and s3 FOLLOWS s2 Then s3 is AFTER s1. l) Inferring Possess: 1. If x SEES y and y = (HONDA (OWNEDBY z) Then z POSSESS the HONDA 2. If x STEALS y Then x POSSESS y 3. If x POSSESS (HONDA (OWNEDBY z)) Then x POSSESS the HONDA Updating Situations: The rules above have all been written in an informal manner that leaves out situations. But you will have to put situations in for all the situation-dependent predicates. Some of the rules above are independent of situations while others are situation-dependent. Specifically: - Being HUMAN and MORTAL are situation independent. If you are human you are always human. (We ignore what dying might do to being human.) - Being GREEDY is also situation independent. In this story-world we will assume that being greedy is a character trait that never changes. Once greedy then always greedy. - All other predicates are situation dependent. A character might possess, like, hate, be angry, drive, work, admire, be at a location, walk etc. in one situation but not in another. For instance, if you work in situation s1 and walk home in situation s2 then you are no longer working in s2. If you admire x in situ s1 and you find out x stole from you in situ s2 then you no longer admire x in situ s2, and so on. For example, if you work then you are a human independent of the situation (so no situation variable is needed). But if John admires Frank and John finds out that Frank steals from John then John will no longer admire Frank in later situations and John will be angry at Frank, so admiring someone is situation-dependent. Here is a situation update of the DRIVES rule: (If x DRIVES to y Then x is ATLOC y) It would become something like this: If x DRIVES to z in s1 and s2 FOLLOWS s1 and x ATLOC y in s1 Then x ATLOC z in s2 and x not ATLOC y in s2 Here is an update of the ANGRYAT rule: (If x POSSESS z and x KNOWS that y STEALS z Then x ANGRYAT y) It would become something like this: If s2 AFTER s1 and x ADMIRES y in s1 and x KNOWS that y STEALS z in s2 Then x ANGRYAT y in s2 Encoding rules in Scheme: Rules are represented in Scheme using Scheme syntax and are of the form: ((premise1 premise2 ... ) single-conclusion) where variables are indicated by the read-macro ? For example, ?X expands into (*V X) during read-time, which indicates that X is a variable (to distinguish it from constants). For example, here are two rules in Scheme format: 1. Infer knows: (((TELLS ?X ?Y ?Z) (HUMAN ?Y)) (KNOWS ?Y ?Z)) (You have to be human to know something when being told it. E.g., if a person tells a dog Fido that Frank stole John's car then we cannot conclude that Fido knows this.) 2. Infer human: (((TELLS ?X ?Y?Z)) (HUMAN ?X)) (i.e., only humans can tell each other various things, so x TELLS y implies that x is human.) iii) Forward Chainer: ---------------- FCHAIN takes as input a single logic sentence LS and a knowledge base KB. It returns as output a new knowledge base augmented with LS and with all nonredundant inferences that are entailed from both LS and KB. For example, if KB1 = ( (((TELLS ?X ?Y?Z)) (HUMAN ?X)) ) and LS = (TELLS JOHN FIDO (LOVES JOHN FIDO)) then (FCHAIN LS KB1) returns: ( (HUMAN JOHN) {inferred} (TELLS JOHN FIDO (LOVES JOHN FIDO)) {comes from LS} (((TELLS ?X ?Y?Z)) (HUMAN ?X)) {already in prior KB} ) Below are some informal trace of new entailments (==>) found by FCHAIN (which is being called over and over again by READLOGIC) and some few comments. Not all inferences are shown; just some of the major ones. (ADMIRES JOHN FRANK S0) ==> (MORTAL MARY) (HUMAN JOHN) (MORTAL JOHN) { Frank is not inferred to be human because Frank need not be human (he could be, say, a painting). Mary is inferred to be mortal because there was already a fact that she is human in KB} (FOLLOWS S0 S1) ==> no inferences (LIKES MARY JOHN S1) ==> no inferences (FOLLOWS S1 S2) ==> (AFTER S0 S2) (GREEDY FRANK) ==> (HUMAN FRANK) (MORTAL FRANK) {only humans are capable of greed} (DRIVES JOHN WORK S2) ==> (HUMAN JOHN) (MORTAL JOHN) {only humans can drive so John is human} (FOLLOWS S2 S3) ==> (ATLOC JOHN WORK S3) (AFTER S0 S3) (AFTER S1 S3) (WORKS JOHN S3) { Notice that the new location is inferred only after (FOLLOWS S2 S3) is input, so x is at work in the situation that FOLLOWS the situation of x driving. } (SEES FRANK (HONDA (OWNEDBY JOHN)) S3) ==> (POSSESS JOHN HONDA S3) (FOLLOWS S3 S4) ==> (STEALS FRANK (HONDA (OWNEDBY JOHN)) S4) {he wouldn't be able to steal it if it were human} (LIKES FRANK (HONDA (OWNEDBY JOHN)) S4) {since he stole it we know he must like it} (POSSESS FRANK (HONDA (OWNEDBY JOHN)) S4) (POSSESS FRANK HONDA S4) {there is only one Honda in this microworld} {there will be more AFTER inferences following each new FOLLOWS} {(POSSESS FRANK HONDA S4) will also infer LIKES but it will be redundant since it is an identical LIKES in the same situation} (SEES MARY (STEALS FRANK HONDA (OWNEDBY JOHN) S4) S4) ==> (FOLLOWS S4 S5) ==> (KNOWS MARY (STEALS FRANK HONDA (OWNEDBY JOHN) S4) S5) { Mary knows in next situation what she saw in prior situation} (TELLS MARY JOHN S5) ==> (TELLS MARY JOHN (STEALS FRANK (HONDA (OWNEDBY JOHN)) S4) S5) { Mary tells John because she is human, she knows, Jonn possessed it before Frank and she likes John} (FOLLOWS S5 S6) ==> (KNOWS JOHN (STEALS FRANK HONDA (OWNEDBY JOHN) S4) S6) (ANGRYAT JOHN FRANK S6) (QUITS JOHN WORK S6) (WALKS JOHN HOME S6) ==> no new inferences (FOLLOWS S6 S7) ==> (ATLOC JOHN HOME S7) (ANGRY JOHN S7) ==> (ANGRYAT JOHN FRANK S7) { John is angry but we don't know AT who, but we have a rule that infers that you are angry at a recent thief } iv) Unifier: ------- UNIFY takes 3 arguments: - two expressions exp1, exp2; and - vbl (a list of variable bindings). If successful but no bindings you can have Unify return (( )) or (T). If successful with bindings b1 b2 then you can have it return either (bindings) or (T bindings). If unification fails, then it should return (). In either case there should be a difference between a successful unification with no bindings vs. no successful unification at all. For example, if we use the (T) approach then if UNIFY is asked to unify the following while passed a single binding of ?W = BILL we should get: (UNIFY '(KNOWS ?X (STEALS FRANK (HONDA (OWNEDBY ?X)) ?S1) ?S2) '(KNOWS JOHN (STEALS ?Y ?Z S4) ?S) '(T (?W BILL))) then UNIFY should return the following bindings: (T ((*V X) JOHN) ((*V Y) FRANK) ((*V Z) (HONDA (OWNEDBY JOHN))) ((*V S1) S4) ((*V S2) (*V S)) ((*V W) BILL)) (UNIFY '(KNOWS JOHN MARY) '(KNOWS MARY JOHN) ( T )) should return: ( ) (UNIFY '(KNOWS JOHN ?X) '(KNOWS MARY JOHN) (T (?X MARY))) should return: ( ) (UNIFY '(KNOWS JOHN MARY) (KNOWS JOHN MARY) ( )) should return: ( T ) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - DETAILS for Part I ------------------ LEX is a lexicon of word-logic associations. For example, we want to map the English: (MARY SAW FRANK STEAL JOHNS HONDA) into the logic representation: (SEES MARY (STEALS FRANK (HONDA (OWNEDBY JOHN) S4) S4) LEX will contain an English-Logic pattern-pair associated with each word or phrase. For example: 1. ((--x SAW --y STEAL ++z) (SEES x y z)) 2. ((JOHNS --x) (x (OWNEDBY JOHN)) ENGLOG will use these associations (and the pattern-matcher from HW#1) to replace English with logic. ENGLOG (or READSTORY) will also have to generate situation symbols S0, S1, etc. and insert them appropriately into the logic sentences. LEX will also contain Logic-English pairs to replace logic with its equivalent English. For example: 1. ((STEALS --x --y --z) (x STOLE y FROM Z)) 2. ((ANGRYAT --x --y) (x WAS FURIOUS WITH y)) LOGENG will use this part of LEX to replace logic with suitable English. READSTORY will take THIEVERY as input and produce LOGTHIEV which it will then pass to READLOGIC. QA will take the final KB (that is the result of forward chaining) and a question. QA will handle only two types of questions: WHY and WHEN questions. QA will hunt through prior situations in the KB for an answer, which it will pass to LOGENG in order to generate an answer in English. It is up to you to decide what kind of search strategy to implement in order to pick out an appropriate answer. Assume that KBFINAL is the final knowledge base, after processing the story. Here are 4 examples of the kind of I/O behavior that you might get working for QA: (QA '(WHY DID JOHN WALK HOME) KBFINAL) ==> (BECAUSE JOHN QUIT WORK AND JOHN DID NOT OWN A CAR) (QA '(WHY DID FRANK STEAL JOHNS HONDA) KBFINAL) ==> (BECAUSE FRANK WAS GREEDY AND HE SAW JOHNS HONDA) (QA '(WHEN WAS JOHN AT WORK) KBFINAL) ==> (AFTER JOHN DROVE TO WORK) (QA '(WHEN DID JOHN GET MAD AT FRANK) KBFINAL) ==> (AFTER MARY TOLD JOHN FRANK STOLE JOHNS HONDA) Whatever strategy you use, you should not search through the English text in THIEVERY. You should try to get at least 2 WHY and 2 WHEN questions answered in some manner. I suggest that every WHY be answered with (BECAUSE ...) and every WHEN with (AFTER ...). The English answers do not have to be identical to the English examples above. Note: There is no single correct output for Part I. There will be variations in how people encode the informally stated rules. Also there is variation in how different functions are implemented and how they call each other. There will be much more variation for Part II because many more aspects are left up to your team. If you work in a team, make sure that you show each part working independently.