Reducing Model Bias in Reinforcement Learning