... maximized.
1
Most RLP formulations maximize some discounted expected reward, with a discount factor $\gamma$. In the limit $\gamma \to 1$, our formulation is obtained, which we find more intuitive.