Reinforcement learning (RL) is a machine learning approach to sequential decision problems that has achieved great success in many areas. This talk discusses recent progress on the convergence of exact policy gradient methods for RL, with an emphasis on the projected policy gradient method; the convergence of other methods will be briefly mentioned if time permits.