I have made a domain that can wake a sleeping rabbit! I’m very excited. While trying plan recognition on existing CRISP domains, I realized I was misunderstanding some key aspects of CRISP’s approach. I started fixing those up for plan recognition, and have ended up making significant progress towards what my modifications will be. (Described below.)
Plan recognizer: I got plan recognition working on several benchmark domains, and hit a wall on CRISP domains. It turned out that CRISP’s initial state—and all its actions—explicitly describe the end goal. Every action relies on information that wouldn’t be available in a parsing scenario, so it’s pointless to estimate CRISP’s plan recognition accuracy until that’s resolved.
Language PDDL: So I started fixing those issues and ended up writing my own version of CRISP’s approach. (Details below.) I still haven’t found the code for XTAG to CRISP XML translation, but signs point to it existing!
UROP funding application: I’ve decided not to apply for next semester’s UROP. I’m allowed two semesters of UROP funding. I’ve used one, and I think I will use the other during my last semester. I’m researching during the summer in part because I won’t have the time for it next semester—so I can’t commit to UROP’s ten hours a week.
Grad School List: No progress this week.
Broader AI Studies: I finished this report on AI and discrimination, plus a few scattered news-y articles. I’m still working through my thoughts on the discrimination report. I recommend reading it.
- Language PDDL: Continue fiddling with what I’ve got! Work up some alternative models for the questions detailed below, and maybe work on increasing the vocabulary.
- Investigate contacting a nearby linguistics professor about the best approach for describing LTAGs. I will also ask Rogelio (my mentor at Utah) for advice on how to maximize planning efficiency.
- Plan Recognizer: Assuming language PDDL goes okay, toss some of that in the plan recognizer. Think on how to enumerate possible hypotheses.
- Figure out plan recognition for mutually exclusive possible observations. (We see the word “the”, but don’t know where it’s adjoining into the syntax tree.)
- Lists: Continue on grad school list and employment list. Also, school-year housing.
- Broader AI Studies: Read an actual academic paper in a topic I’m unfamiliar with.
A Crash Course in CRISP + Developing My Own CRISP
CRISP describes language in terms of adjoining LTAGs[1]. Every LTAG gets an action that describes how that LTAG attaches to other LTAGs, as in the image. These actions also describe what the LTAG means. So, the action for `nx0Vnx-likes` describes not only the tree pictured, but also states that we don’t need to express `likes(mary rabbit1)` anymore. This action also introduces some problems: two dissatisfied nodes, and distractor sets for `mary` and `rabbit1`. Those nodes (in red) need more LTAGs to satisfy them. The `mary` distractor set contains every object except `mary`; the goal is to remove everything in this set, so that the listener is sure the liker is `mary`. (Likewise for the `rabbit1` distractor set.)
We reduce the `mary` distractor set (and satisfy a red node) using the `propername-mary` action. This deletes everything not named “Mary” from the `mary` distractor set, which deletes everything, in this case. If there were two people named Mary, CRISP would have to further specify. Similarly, the action `n-rabbit` satisfies the other red node and empties the `rabbit1` distractor set. It adds a new grammatical need (the yellow adjoin! node), which is then satisfied by the
Until this week, I misunderstood how this worked. I thought that `likes(mary rabbit1)` wasn’t expressed until the end, and that `mary` would still remain in her distractor set at the end, not excluded from it entirely. The way it actually works poses a problem for goal recognition. Notice that every action takes, as a parameter, the object it’s trying to specify. For goal recognition, the entire point is trying to figure out which object is being referenced. Our input is a sequence of words, not words plus the objects they’re referring to.
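To make the bookkeeping concrete, here is a minimal Python sketch of the distractor-set mechanism described above. It is not CRISP’s actual encoding: the `WORLD` table, its `name` and `kind` attributes, the extra `goblin1` object, and the function names are all assumptions made for illustration. The thing to notice is that each action has to be handed the referent, which is exactly the information a recognizer would not have.

```python
# Hypothetical toy world; the objects and attributes are assumed for illustration.
WORLD = {
    "mary":    {"name": "Mary", "kind": "person"},
    "rabbit1": {"name": None,   "kind": "rabbit"},
    "goblin1": {"name": None,   "kind": "goblin"},
}

def initial_distractors(referent):
    """Everything except the referent could be confused with it."""
    return {obj for obj in WORLD if obj != referent}

def propername(referent, distractors, name):
    """Sketch of a propername-* action: usable only if the referent bears the
    name; it removes every distractor that does not."""
    if WORLD[referent]["name"] != name:
        raise ValueError(f"{referent} is not named {name}")
    return {d for d in distractors if WORLD[d]["name"] == name}

def noun(referent, distractors, kind):
    """Sketch of an n-* action such as n-rabbit."""
    if WORLD[referent]["kind"] != kind:
        raise ValueError(f"{referent} is not a {kind}")
    return {d for d in distractors if WORLD[d]["kind"] == kind}

# Both calls must be told the referent up front.
mary_left = propername("mary", initial_distractors("mary"), "Mary")
rabbit_left = noun("rabbit1", initial_distractors("rabbit1"), "rabbit")
print(mary_left, rabbit_left)  # set() set()
```

With a second object named Mary in `WORLD`, `mary_left` would not be empty, which corresponds to the case where CRISP has to specify further.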
This week I also realized that CRISP’s goal didn’t include `expressed(like(mary rabbit1))`; rather, the goal was to have nothing left in need of expressing. (A key difference: the initial state includes `need-to-express(like(mary rabbit1))`, making goal recognition rather trivial. The goal is always `forall(x,y): not(need-to-express(x y))`.)
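A small sketch of why this makes recognition trivial, with the state represented as a plain Python set of facts still to be expressed. That representation and the function names are assumptions for illustration, not CRISP’s PDDL.

```python
# Assumed representation: the state is the set of facts still in need of expressing.
initial_state = {("like", "mary", "rabbit1")}

def goal_reached(state):
    """CRISP-style goal: nothing is left in need of expressing."""
    return len(state) == 0

def express(state, fact):
    """Effect of an action that expresses one fact."""
    return state - {fact}

def recognize_goal(initial):
    """The 'hidden' communicative goal can simply be read off the initial state."""
    return set(initial)

print(recognize_goal(initial_state))
print(goal_reached(express(initial_state, ("like", "mary", "rabbit1"))))
```

The recognizer never has to look at a single observed word, which is why measuring its accuracy on such a domain says very little.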
I fixed this last issue in about an hour, including setup time. The issue with every action taking the object specified as input is trickier. It’s integral to CRISP’s methodology. If I were to fix it, I might as well write my own language domain that enables in-game actions. So I did!
My approach is largely the same as CRISP’s, with regard to LTAGs and distractor sets. I replaced the `nx0Vnx-likes` action with `nx0Vnx-wake`, and it doesn’t express `wake(mary rabbit1)`; it expresses `action-is(wake)`. It instantiates two distractor sets, one for each of those unfulfilled red nodes. Instead of identifying these with `mary` and `rabbit1`, I identify them with particular roles for the `wake` action. Now, every object (including `mary`) is in the `wake actor`[2] distractor set. (I’m borrowing from frame theory here, so I need to study more on it.)
Now the actions like `n-rabbit` whittle down the `wake actor` and `wake subject` distractor sets until there is just one object left. When only one object is left in every `wake` distractor set, no grammatical need is left, and `action-is(wake)` holds, the planner can call `compile-wake(mary rabbit1)`. This puts it all together, and if the in-game `wake` action’s preconditions are satisfied (`mary` is awake, `rabbit1` is asleep), `rabbit1` is no longer asleep.
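The role-based version can be sketched in Python as follows. This is only a rough model of the idea, not my PDDL: the `WORLD` table, the `Utterance` class, and the Python function names are assumptions for illustration, and grammatical needs are left out entirely. Unlike the CRISP-style actions, the word actions here take a role, never the object being referred to.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# Hypothetical game world; names and attributes are assumed for illustration.
WORLD = {
    "mary":    {"name": "Mary", "kind": "person", "awake": True},
    "rabbit1": {"name": None,   "kind": "rabbit", "awake": False},
}

@dataclass
class Utterance:
    action: Optional[str] = None
    # role -> objects that could still fill that role
    roles: Dict[str, Set[str]] = field(default_factory=dict)

def nx0Vnx_wake(utt):
    """Verb tree: records action-is(wake) and opens one set per role."""
    utt.action = "wake"
    utt.roles = {"actor": set(WORLD), "subject": set(WORLD)}

def propername_mary(utt, role):
    utt.roles[role] = {o for o in utt.roles[role] if WORLD[o]["name"] == "Mary"}

def n_rabbit(utt, role):
    utt.roles[role] = {o for o in utt.roles[role] if WORLD[o]["kind"] == "rabbit"}

def compile_wake(utt):
    """Fire the in-game wake action once every role is pinned to one object."""
    if utt.action != "wake" or any(len(s) != 1 for s in utt.roles.values()):
        return False
    (actor,) = utt.roles["actor"]
    (subject,) = utt.roles["subject"]
    # In-game preconditions: the actor is awake, the subject is asleep.
    if not WORLD[actor]["awake"] or WORLD[subject]["awake"]:
        return False
    WORLD[subject]["awake"] = True
    return True

utt = Utterance()
nx0Vnx_wake(utt)
propername_mary(utt, "actor")
n_rabbit(utt, "subject")
print(compile_wake(utt), WORLD["rabbit1"]["awake"])  # True True
```

Calling `compile_wake(utt)` a second time returns `False`, because the rabbit is already awake and the in-game precondition no longer holds.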
After a few simplified iterations of this, I finally got the hang of PDDL syntax, compiled it all, and got the sentence “the rabbit wake the rabbit” and a very awake `rabbit1`. I was rather giddy after that. After a little more tinkering, I’ve now got it reliably expressing `I wake the rabbit`. (I only use Mary as an example. I’m more interested in player commands, so I also added ‘I’ to the dictionary.) The only goal is for the rabbit to be awake, not to express any particular thing. Using this domain, the only way to wake the rabbit is to say “I wake the rabbit”. That’s a big step closer to D&D.
I still have questions remaining, however, mostly around those syntax nodes. I can’t wrap my head around how CRISP gets the syntax tree ordered properly, or what its `syntaxnode`s even represent. I took an approach where I tossed the planner a dozen blank `syntaxnode`s and let it use them like connector joints. It’s been tricky to keep it from using the same `syntaxnode` for everything, and I’m not sure how to ensure the syntax tree keeps words in order. (No “wake rabbit I the”.)
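One possible way to think about the ordering question, sketched in Python rather than PDDL: if every syntax node stores its children in numbered slots, the sentence is just the left-to-right yield of the tree, no matter what order the planner attached things in. The `Node` class, the slot numbers, and the tree shape below are assumptions for illustration; this is not how CRISP does it, and I have not tried encoding it in the planner.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Node:
    label: str
    word: Optional[str] = None                      # set on nodes that carry a word
    children: Dict[int, "Node"] = field(default_factory=dict)  # slot -> child

def attach(parent: Node, slot: int, child: Node) -> Node:
    """Fill one numbered slot; refusing reuse keeps one node from serving twice."""
    if slot in parent.children:
        raise ValueError(f"slot {slot} of {parent.label} is already filled")
    parent.children[slot] = child
    return child

def sentence(node: Node) -> List[str]:
    """Left-to-right yield: visit children in slot order."""
    if node.word is not None:
        return [node.word]
    words: List[str] = []
    for slot in sorted(node.children):
        words.extend(sentence(node.children[slot]))
    return words

# Attach in a deliberately scrambled order, as a planner might.
s = Node("S")
vp = attach(s, 1, Node("VP"))
obj = attach(vp, 1, Node("NP"))
attach(obj, 1, Node("N", word="rabbit"))
attach(vp, 0, Node("V", word="wake"))
attach(obj, 0, Node("Det", word="the"))
attach(s, 0, Node("NP", word="I"))
print(" ".join(sentence(s)))  # I wake the rabbit
```

Integer slots also leave room for a node with an arbitrary number of children, though whether a planner can assign them efficiently is a separate question.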
Over the next week+ I’ll be working on that problem, and the questions/tasks below.
- CRISP takes a condensed approach, and doesn’t describe syntax nodes that can’t be adjoined to. Should I do that, or fully specify the syntax tree?
- How do I translate from a plan to a syntax tree to a sentence?
- How should I ensure order in the syntax tree, especially if a node can have an arbitrary number of children?
- Conjunctions and plurals??? “I shoot the rabbit and the goblin.” or “I shoot the rabbits.”
- What about non-action sentences? When we merely wish to express a fact? Can I model that as an action, with a
- Add adjoining back into the domain. (I took it out for simplicity.)
- Add more actions, words, versions of words, etc. Deal with any questions/problems that arise with it.
- Investigate frame theory and, if useful, adopt their terminology.
Notes
1. LTAGs are incomplete syntax trees, each with a grounding word. See the image.
2. More generally, ?frame-name ?role distractor