988,609 days

“Still—if you had a server farm, you could do it in like three years.”

– My husband

TL;DR:

I’ve implemented my formulation for plan recognition from complex observations, and written some more of the paper. Next up is evaluating it, and revising the paper. As the plan is now, I’ve got a lot of settings to vary over, which results in a combinatorial explosion.

Also, Matt and I got our living situation figured out for the school year, and it’s pet friendly! So now I have an excuse to make Shadow the featured picture.

This Week

Goal Recognition:

  • Drafted introduction, kind of. It needs work.
  • Got feedback on the formal definitions.
  • Implemented my parser atop the Ramirez recognizer. Now I can generate PDDL with observation actions in the correct ordering, with mutexes, and even with 0-cost fluent observations!

Apartment Hunting: Matt and I got an apartment lined up for the school year. I hate making phone calls.

Next Week

  • Work on introduction section more
  • Reorder the formal definitions for better flow, and integrate informal English explanations.
  • Document my code, add readme, etc.
  • Code for translating a partially observed observation like (pickup A ?) to an option group of all possibilities like (|(pickup A A),(pickup A B),...,(pickup A Z)|) (see the sketch after this list).
  • Begin evaluation phase
    • Trim the plan below
    • Obtain a plan tracer to calculate the intermediate states between actions in a plan.
      • Or write it, if I can’t find it. :/
    • Prototype code to obscure a plan trace, as in the below plan
    • Prototype code to sample random fixed orderings of a complex observation group
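
For the option-group item above, something like this little sketch is what I have in mind (hypothetical helper, not my actual parser code; it assumes actions come in as plain strings with a single missing parameter):

```python
# Sketch of the option-group translation: expand a partially observed action
# like (pickup A ?) into every grounding of the missing parameter.

def expand_option_group(obs, objects):
    """Expand a single '?' parameter into an option group string."""
    name, *args = obs.strip("()").split()
    options = []
    for obj in objects:
        filled = [obj if a == "?" else a for a in args]
        options.append("({} {})".format(name, " ".join(filled)))
    return "(|{}|)".format(",".join(options))

print(expand_option_group("(pickup A ?)", ["A", "B", "C"]))
# -> (|(pickup A A),(pickup A B),(pickup A C)|)
```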

Evaluation Plan

So my ‘complex’ observations differ from normal observations in a few ways:

  • Observation of both actions and states (as opposed to just actions)
  • Observations can be ordered, but order might not be known for some
  • Partial observations are possible, and expressed as a set of possible observations

I want to highlight these features in my evaluation. The way previous work has evaluated itself is by generating a bunch of optimal plans from some standard domains, removing a percentage of the plan steps, and using what remains as observations. The plan recognizer then has to guess the correct goal, given those observations.

Below is my initial plan for evaluation, which I’m sure will get revised. Still, it’s useful to write it all out and do the math.

We will have three modes for observations: actions only (A), mixed actions and fluents (A+F), and fluents only (F). Previous work cannot handle fluent observations, so we exclude fluent observations when evaluating it on A+F, and do not evaluate it on F. Observations for A+F and F are generated from a plan trace (actions and their intermediate states).
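
To make the modes concrete, here's roughly how I picture pulling the three observation sequences out of a plan trace (just a sketch; the trace format, an action string paired with the state it produces, is an assumption for illustration):

```python
# Sketch: deriving A, A+F, and F observation sequences from a plan trace,
# where the trace is a list of (action, resulting state) pairs and a state
# is a set of fluent strings (format is an assumption, not my real code).

def observations_from_trace(trace, mode):
    """Return the observation sequence for mode 'A', 'A+F', or 'F'."""
    obs = []
    for action, state in trace:
        if mode in ("A", "A+F"):
            obs.append(("action", action))
        if mode in ("A+F", "F"):
            obs.append(("fluents", frozenset(state)))
    return obs

trace = [
    ("(unstack A B)", {"(holding A)", "(clear B)"}),
    ("(put-down A)", {"(ontable A)", "(clear A)", "(clear B)", "(handempty)"}),
]
print(observations_from_trace(trace, "A+F"))
```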

Within these three modes, we will obscure optimal plans / plan traces, varying over the following settings:

  • Percentage of observations seen (100% / 50% / 10%)
  • Percentage of observations of unknown order (chosen randomly, such that each unordered observation is in a group of at least 3) (100% / 50% / 0%)
    • We evaluate previous work by its average performance when a random ordering is fixed for the unordered observations.
  • Percentage of ‘obscured’ observations (observations missing one parameter, expressed as an option group) (50% / 25% / 0%)
    • We evaluate previous work both by its average performance when a random option is chosen, and by its performance when obscured actions are omitted.
    • This applies to both fluent observations and action observations. If an observation has no parameters, another observation is chosen.
  • Fluent visibility (Percentage of fluents included in a fluent observation. Includes ‘not’ values for missing fluents) (100% / 50% / 10%)
    • Not a variable for the A mode.

A setting of (100% / 0% / 0% / 100%) results in no obscuration, while a setting of (10% / 100% / 50% / 10%) reduces a plan to a small subset of completely unordered and highly obscured observations.
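
As a quick sanity check on how big that settings grid is (values straight from the list above):

```python
# Counting the settings grid: mode A varies over three knobs, A+F and F over four.
from itertools import product

seen       = [1.00, 0.50, 0.10]  # percentage of observations seen
unordered  = [1.00, 0.50, 0.00]  # percentage of observations of unknown order
obscured   = [0.50, 0.25, 0.00]  # percentage of obscured observations
fluent_vis = [1.00, 0.50, 0.10]  # fluent visibility (not used in mode A)

settings_A  = list(product(seen, unordered, obscured))
settings_AF = list(product(seen, unordered, obscured, fluent_vis))
print(len(settings_A), len(settings_AF))  # 27 81
```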

We will generate optimal plans for X goals in each domain, and for each setting above, obscure each optimal plan Y times to create X*Y observation-goal pairings. We run both our goal recognizer and the Ramirez goal recognizer on each obscured observation set, paired with G goal hypotheses. If a setting involves unordered or obscured observations, we run the Ramirez recognizer several times to compute average performance, as specified above. We report performance as time to compute, how often the correct goal was indicated, and the average fraction the correct goal made up of all goals the recognizer indicated.
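
Per recognizer run, that boils down to recording something like this (a sketch of how I read those metrics; the names are made up):

```python
# Sketch of per-run bookkeeping: wall-clock time, whether the true goal was
# among the goals the recognizer indicated, and what fraction of the
# indicated goals the true goal accounted for (0 if it was missed).

def score_run(indicated_goals, true_goal, seconds):
    hit = true_goal in indicated_goals
    fraction = 1.0 / len(indicated_goals) if hit else 0.0
    return {"seconds": seconds, "hit": hit, "fraction": fraction}

print(score_run({"goal-1", "goal-3"}, "goal-1", 12.4))
# {'seconds': 12.4, 'hit': True, 'fraction': 0.5}
```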

This is a lot of settings, and I dearly hope my advisor helps me trim it down. I went and did the math and, well…

The Ramirez recognizer is not run for mode F. Mode A has only 27 settings, while the other modes have 81 settings. The Ramirez recognizer must actually be run R times for 12 of A's settings and 36 of A+F's settings, using random samplings. In total:

A = 2 recognizers * (12 settings * R samples + 15 settings) * X goals * Y obscurations * G hypotheses * 2 plan runs per hypothesis = 4*X*Y*G*(12*R + 15) runs

A+F = 2 recognizers * (36 settings * R samples + 45 settings) * X goals * Y obscurations * G hypotheses * 2 plan runs per hypothesis = 4*X*Y*G*(36*R + 45) runs

F = 1 recognizer * 81 settings * X goals * Y obscurations * G hypotheses * 2 plan runs per hypothesis = 162*X*Y*G runs

Previous work has set X around 15, Y around 13, and G depends on the domain but is around 15 as well. R we'll set at 150, for an okay-ish random sampling. Subbing in those values, that's around 85,415,850 plan runs per domain. Time per run varies a lot by domain, but we'll say it's around 1000 seconds. 85,415,850,000 seconds = 1,423,597,500 minutes = 23,726,625 hours = 988,609 days.

Per domain.
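
For the curious, here's the back-of-the-envelope math in code form (the constants are just the rough values from previous work mentioned above):

```python
# Back-of-the-envelope check of the run counts and wall-clock estimate above.
X, Y, G, R = 15, 13, 15, 150              # goals, obscurations, hypotheses, samples

runs_A  = 4 * X * Y * G * (12 * R + 15)   # mode A
runs_AF = 4 * X * Y * G * (36 * R + 45)   # mode A+F
runs_F  = 162 * X * Y * G                 # mode F
total   = runs_A + runs_AF + runs_F

seconds = total * 1000                    # ~1000 s per plan run
print(total, round(seconds / 60 / 60 / 24))  # plan runs and days, per domain
```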

So I’m gonna need to trim something. But still, it’s good practice to write it all out.
