User:Leticia Echevarría

From Wikipedia, the free encyclopedia

What is dynamic programming?

"The term "Dynamic Programming" (DP) refers to a collection of algorithms (definition) that can be used to compute optimal policies given a perfect model of the environment as a Markov decision process (MDP). Classical DP algorithms are of limited utility in reinforcement learning both because of their assumption of a perfect model and because of their great computational expense, but they are still very important theoretically. DP provides an essential foundation for the understanding of the methods presented in the rest of this book. In fact, all of these methods can be viewed as attempts to achieve much the same effect as DP, only with less computation and without assuming a perfect model of the environment."

This paragraph is taken from:
http://www-anw.cs.umass.edu/~rich/book/4/node1.html

Useful links:
http://plus.maths.org/issue3/dynamic
http://www.nist.gov/dads/HTML/dynamicprog.html
http://benli.bcc.bilkent.edu.tr/~omer/research/dynprog.html
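
The quoted paragraph describes classical DP algorithms that compute optimal policies from a perfect MDP model. As a minimal illustration, here is a value-iteration sketch on a tiny hypothetical two-state MDP; the states, actions, transition probabilities, and rewards below are invented for the example and are not from the quoted text.

```python
# Value iteration on a toy MDP (illustrative assumptions throughout).
# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}
gamma = 0.9  # discount factor

def value_iteration(transitions, gamma, tol=1e-10):
    # Start from zero value estimates and apply the Bellman optimality
    # update until the largest change falls below the tolerance.
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Extract the greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: sum(p * (r + gamma * V[s2])
                                          for p, s2, r in actions[a]))
        for s, actions in transitions.items()
    }
    return V, policy

V, policy = value_iteration(transitions, gamma)
print(V, policy)
```

This requires sweeping over every state on every iteration, which hints at the "great computational expense" the quote mentions: the cost grows with the size of the state space, which is why a perfect and tractable model is such a strong assumption.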