How should we evaluate and compare the performance of policy institutions? We propose evaluating institutions on their reaction function, i.e., on how well they responded to the different shocks that hit the economy. We show that reaction function evaluation requires only two sufficient statistics: (i) the impulse responses of the policy objectives to non-policy shocks, which capture what an institution did on average to counteract these shocks, and (ii) the impulse responses of the policy objectives to policy shocks, which capture what an institution could have done to counteract the shocks. A regression of (i) on (ii), that is, a regression in impulse response space, makes it possible to compute the distance to the optimal reaction function, and thereby to evaluate and rank institutions. We use our methodology to evaluate US monetary policy, from the Gold Standard period, the early Fed years, and the Great Depression to the post-World War II period and the post-Volcker regime. We find no material improvement in the reaction function over the first 100 years, and it is only in the last 30 years that we estimate large and uniform improvements in the conduct of monetary policy.
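
To make the idea of a regression in impulse response space concrete, the following is a minimal sketch and not the estimator used in the paper. It assumes an equally weighted quadratic loss over the stacked impulse responses and ignores estimation uncertainty; the function name `reaction_adjustment` and the arrays `gamma` and `theta` are hypothetical. Under these assumptions, a regression coefficient of zero means the policy shocks could not have offset the non-policy shock any further, while a nonzero coefficient indicates a distance to the optimal reaction.

```python
import numpy as np

def reaction_adjustment(gamma, theta):
    """Regress non-policy-shock impulse responses on policy-shock ones.

    gamma: shape (H,), stacked responses of the policy objectives to one
           non-policy shock (statistic (i) in the text).
    theta: shape (H,) or (H, K), responses of the same objectives to K
           policy shocks (statistic (ii) in the text).
    Returns the OLS coefficients and the share of the (assumed) quadratic
    loss that a different reaction could have removed.
    """
    gamma = np.asarray(gamma, dtype=float)
    theta = np.asarray(theta, dtype=float).reshape(len(gamma), -1)
    # Ordinary least squares in impulse response space.
    beta, *_ = np.linalg.lstsq(theta, gamma, rcond=None)
    residual = gamma - theta @ beta
    # Assumption: loss is the unweighted sum of squared responses.
    avoidable_share = 1.0 - (residual @ residual) / (gamma @ gamma)
    return beta, avoidable_share

# Illustrative numbers only, not estimates from any data set.
beta, share = reaction_adjustment(gamma=[2.0, 1.0], theta=[1.0, 0.5])
print(beta, share)
```

In this toy case the non-policy response is an exact multiple of the policy response, so the whole loss would have been avoidable; when the two responses are orthogonal, the coefficient is zero and nothing more could have been done.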