Variational Dynamic Programming for Stochastic Optimal Control

Marc Lambert; Francis Bach; Silvère Bonnabel

Preprints, Working Papers, ... Year : 2024

Marc Lambert (PersonId : 1377005): Statistical Machine Learning and Parsimony; DGA

Francis Bach (PersonId : 863126): Statistical Machine Learning and Parsimony

Silvère Bonnabel (PersonId : 866182, IdHAL : silvere-bonnabel, ORCID : 0000-0002-6001-7766): Mines Paris - PSL (École nationale supérieure des mines de Paris); Centre de Robotique

Abstract

We consider stochastic optimal control problems in which the state-feedback control policies take the form of probability distributions and a penalty on the entropy is added. By viewing the cost function as a Kullback-Leibler (KL) divergence between two Markov chains, we bring the tools of variational inference to bear on the optimal control problem. This yields a dynamic programming principle in which the value function is itself defined as a KL divergence. We then approximate the control policies with Gaussian distributions and apply the theory to control-affine nonlinear systems with quadratic costs. This results in closed-form recursive updates that generalize LQR control and the backward Riccati equation. We illustrate this novel method on the simple problem of stabilizing an inverted pendulum.
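For orientation, the classical baseline that the abstract says the method generalizes can be sketched as follows. This is a minimal illustration of the standard finite-horizon, discrete-time LQR backward Riccati recursion, not the variational update derived in the paper. The function name, the linearized pendulum model, the Euler discretization, and all numerical values are assumptions chosen for the example.

```python
import numpy as np


def lqr_backward_riccati(A, B, Q, R, Q_final, horizon):
    """Finite-horizon discrete-time LQR via the backward Riccati recursion.

    Dynamics: x_{t+1} = A x_t + B u_t
    Cost: sum_t (x_t' Q x_t + u_t' R u_t) + x_T' Q_final x_T
    Returns the gains K_t (with u_t = -K_t x_t) and cost-to-go matrices P_t.
    """
    P = Q_final
    gains = []
    cost_to_go = [P]
    for _ in range(horizon):
        # K = (R + B' P B)^{-1} B' P A, computed with a linear solve.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update: P <- Q + A' P (A - B K).
        P = Q + A.T @ P @ (A - B @ K)
        P = 0.5 * (P + P.T)  # symmetrize against round-off
        gains.append(K)
        cost_to_go.append(P)
    # The recursion runs backward in time; reverse to index by t = 0, 1, ...
    return gains[::-1], cost_to_go[::-1]


# Hypothetical example (not from the paper): pendulum linearized about the
# upright position, theta_ddot = (g / length) * theta + u, Euler-discretized.
dt, g, length = 0.05, 9.81, 1.0
A = np.array([[1.0, dt], [dt * g / length, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)
R = np.array([[0.1]])

gains, cost_to_go = lqr_backward_riccati(A, B, Q, R, Q_final=Q, horizon=200)
K0 = gains[0]
closed_loop = A - B @ K0
# A spectral radius below 1 means the first-step feedback stabilizes the
# linearized model.
spectral_radius = max(abs(np.linalg.eigvals(closed_loop)))
```

The gain is obtained with a linear solve rather than an explicit matrix inverse, which is the usual numerically safer choice. In this deterministic baseline the policy is a single feedback law per time step; the abstract's setting instead uses probability distributions over controls with an entropy penalty.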

Keywords

Variational Approximation; Linear Quadratic Control; Dynamic Programming (DP); Maximum entropy

Domains

Optimization and Control [math.OC]; Automatic

Main file

VariationalDP.pdf (645.34 KB)

Origin : Files produced by the author(s)

Contributor : Marc Lambert

https://inria.hal.science/hal-04553255

Submitted on : Tuesday, April 23, 2024, 7:22:02 PM

Last modification on : Monday, April 29, 2024, 3:14:40 AM

Dates and versions

hal-04553255 , version 1 (22-04-2024)

hal-04553255 , version 2 (23-04-2024)

Identifiers

HAL Id : hal-04553255 , version 2

Cite

Marc Lambert, Francis Bach, Silvère Bonnabel. Variational Dynamic Programming for Stochastic Optimal Control. 2024. ⟨hal-04553255v2⟩

Collections

ENSMP ENS-PARIS CNRS INRIA ENSMP_CAOR INRIA2 TDS-MACS PSL ENSMP_DR ANR PRAIRIE-IA
