4.2. Evaluation of Iterative Bounding LAO*
and goals, as well as the average plan depth of the optimal solution for each
problem, are listed below:
Problem          rover1  rover2  rover3  rover4  rover5  rover6
No. of fluents       68      68      68      68      68     119
No. of actions      100     100     100     100     100     235
No. of goals          1       1       1       1       3       3
Average depth         5     6.5    6.33     7.5      16   18.33
We will evaluate the algorithms in the same way as in the previous section
on six problem instances from the rovers domain of increasing difficulty.
The lower bound heuristic will be the full observability heuristic hfo, which
is a better heuristic than hmin for this domain. The upper bound will be a
constant value of 10000.
The results are shown in Tables 4.3–4.5. For the first four problems we only
show the results for ε = 0.001 since these problems were easily solved by
all algorithms. For the more difficult problems we show the results when the
algorithms have found solutions with provable error bounds of ε = 0.1, 0.01,
and 0.001 respectively.
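The stopping test behind such provable error bounds can be sketched as follows. The relative-gap formula and the helper name are illustrative assumptions, not necessarily the exact definition used in the thesis:

```python
def provable_error(lower, upper):
    # Hypothetical helper: relative gap between the lower and upper
    # bounds on the optimal expected cost at the root.  If epsilon is
    # defined this way, the incumbent solution is provably within a
    # factor (1 + epsilon) of optimal.
    return (upper - lower) / lower

# With a constant upper bound of 10000 and a small lower bound, the
# gap is huge until a complete path to a goal tightens the upper bound.
print(provable_error(5.0, 10000.0))  # 1999.0
print(provable_error(8.0, 8.25))     # 0.03125, below a 0.1 threshold
```

Search can then terminate as soon as the gap at the root state falls below the requested threshold.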
We see that the weighted variant of IBLAO* is the fastest for all problems
and all error thresholds except the tightest, where the unweighted variant
of IBLAO* is slightly faster. For the larger thresholds, however, the
difference between the unweighted and the weighted variant of IBLAO* is
significant. This is because the upper bound heuristic is a constant, so the
upper bound cannot be reduced until a complete path to a goal state is
found. For unweighted IBLAO*, this happens only shortly before a complete
solution is found, which, consequently, is an optimal solution. Therefore,
the performance for high error thresholds is almost the same as for the
lower thresholds.
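Why a constant upper bound is so inert can be seen from a Bellman backup on the upper-bound value function. The toy states, costs, and dictionary-based interface below are illustrative assumptions:

```python
CONST_UPPER = 10000.0  # constant upper-bound heuristic, as in the text

# Toy fringe: 'mid' has not yet been shown to reach the goal, so its
# upper bound is still the constant, and 's0' cannot improve on it.
U = {'goal': 0.0, 'mid': CONST_UPPER, 's0': CONST_UPPER}

def backup(U, options):
    # options: action -> (cost, [(prob, successor), ...])
    return min(c + sum(p * U[s] for p, s in outs)
               for c, outs in options.values())

# Before a complete path to the goal exists, the backup gains nothing:
U['s0'] = backup(U, {'a': (1.0, [(1.0, 'mid')])})    # 10001.0
# Once 'mid' is shown to reach the goal, the bounds collapse:
U['mid'] = backup(U, {'b': (1.0, [(1.0, 'goal')])})  # 1.0
U['s0'] = backup(U, {'a': (1.0, [(1.0, 'mid')])})    # 2.0
```

Only when every outcome along some policy reaches a goal state does the upper bound at the root drop below the initial constant.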
Weighted IBLAO* avoids this by first finding a suboptimal solution using
the weighted heuristic and then gradually improving the solution. The many
adjustments of the weight cause the weighted variant of IBLAO* to perform
more backups, but this is outweighed by the smaller number of expansions.
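The weight schedule can be pictured as follows. The halving rule and the placeholder heuristic are illustrative assumptions, not the schedule from the thesis:

```python
def weighted_lower(h_lower, weight):
    # Inflate an admissible lower-bound heuristic by a weight > 1.
    # Searching with the inflated bound yields a (suboptimal) solution
    # quickly; lowering the weight then forces the solution to improve.
    return lambda s: weight * h_lower(s)

h = lambda s: 4.0            # placeholder admissible lower bound
w, weights_used = 2.0, []
while True:
    f = weighted_lower(h, w)
    weights_used.append(w)
    # ... one search iteration using heuristic f would run here ...
    if w == 1.0:
        break                # weight 1 recovers the admissible bound
    w = max(1.0, w / 2.0)
```

Each reduction of the weight triggers a new round of backups over the current best partial solution, which is the extra backup cost the text refers to.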
A constant upper bound is less problematic for the RTDP-based algorithms,
because the trials are depth-first and therefore many, possibly suboptimal,
paths to a goal state are found early. However, because of the many possible
choices of actions, the trials become less efficient, since each sampled
path becomes less likely to be part of an optimal solution. Therefore, even
though these algorithms make few backups, they explore a much larger portion
of the search space.