## Abstract

In Bayesian approaches to utility learning from preferences, the objective is to infer a posterior belief distribution over an agent’s utility function based on previously observed agent preferences. From this posterior, one can estimate quantities such as the expected utility of a decision or the probability of an unobserved preference, which can then be used to make or suggest future decisions on behalf of the agent. However, it remains an open question how to represent beliefs over agent utilities, perform Bayesian updating based on observed pairwise preferences, and make inferences with the resulting posterior distribution in exact, closed form. In this paper, we build on Bayesian pairwise preference learning models under the assumptions of linearly additive multi-attribute utility functions and a bounded uniform utility prior. These assumptions lead to a posterior that is a uniform distribution over a convex polytope, and we demonstrate how to perform exact, closed-form inference with respect to this posterior, i.e., without resorting to sampling or other approximation methods.
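As an illustrative sketch of why the posterior takes this form, consider the following setup. The notation is assumed here for illustration and is not taken from the paper: weights $w \in [0,1]^d$ with a uniform prior, utility $u(x) = w^\top x$ for an item with attribute vector $x$, and a noise-free likelihood in which an observed preference $a_i \succ b_i$ simply requires $u(a_i) \ge u(b_i)$.

$$
\begin{aligned}
p(w \mid D) &\propto \mathbb{1}\left[w \in [0,1]^d\right] \prod_{i=1}^{n} \mathbb{1}\left[w^\top (a_i - b_i) \ge 0\right] \\
&= \mathbb{1}\left[w \in P\right], \qquad P = \left\{ w \in [0,1]^d : (a_i - b_i)^\top w \ge 0,\ i = 1, \dots, n \right\}.
\end{aligned}
$$

Each preference contributes one linear half-space constraint, so $P$ is an intersection of half-spaces with a box and hence a convex polytope, and the posterior is uniform on it. Under these assumptions, the expected utility of an item is linear in the polytope's centroid: $\mathbb{E}[u(x) \mid D] = x^\top \bar{w}$, where $\bar{w} = \frac{1}{\operatorname{vol}(P)} \int_P w \, dw$.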