Our work on jointly searching architecture and quantization policies in an end-to-end differentiable manner has been accepted to TMLR!