By Anna Kahkoska

 

We have been actively involved in developing reinforcement learning methods for estimating dynamic treatment regimes in mobile health (mHealth) settings, and we have applied these methods to generate data-driven decision rules for the day-to-day management of blood glucose levels in individuals living with type 1 diabetes. In contrast to the standard settings where dynamic treatment regimes are used, mobile health applications provide a large number of observations per individual at very fine time granularity. Such densely sampled systems are difficult to model, which in turn makes it difficult to estimate an optimal treatment policy.

We have developed a reinforcement learning method, V-learning, which attempts to alleviate these difficulties by estimating an optimal policy without imposing modeling assumptions on the data-generating process (Luckett et al., 2020). V-learning has exhibited excellent performance in simulation studies and in an application to an observational data set of individuals with type 1 diabetes wearing continuous glucose monitors (for glucose data) and accelerometers (for physical activity data). Furthermore, V-learning readily allows the estimated decision rule to be updated continually as new data are collected. We will use V-learning to estimate a policy that minimizes how far a patient’s current blood glucose level is outside the normal range.
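To make the stated objective concrete, here is a minimal Python sketch of a reward that penalizes the distance of a glucose reading from a target range. It is an illustration only, not the utility function from Luckett et al. (2020): the function names and the default bounds of 70 and 180 mg/dL are assumptions chosen for the example and should be replaced with whatever range a given study defines as normal.

```python
def out_of_range_distance(glucose_mg_dl, low=70.0, high=180.0):
    """Return how far a reading lies outside [low, high], or 0.0 if inside.

    The default bounds are illustrative assumptions, not values from the source.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    if glucose_mg_dl < low:
        return low - glucose_mg_dl
    if glucose_mg_dl > high:
        return glucose_mg_dl - high
    return 0.0


def reward(glucose_mg_dl, low=70.0, high=180.0):
    """Hypothetical reward: the negative out-of-range distance.

    Maximizing this reward is equivalent to minimizing how far the
    reading falls outside the target range.
    """
    return -out_of_range_distance(glucose_mg_dl, low, high)
```

Under these assumed bounds, a reading inside the range earns a reward of zero, and the penalty grows linearly with the excursion on either side.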

For more information, see:
Daniel J. Luckett, Eric B. Laber, Anna R. Kahkoska, David M. Maahs, Elizabeth Mayer-Davis & Michael R. Kosorok (2020) Estimating Dynamic Treatment Regimes in Mobile Health Using V-Learning, Journal of the American Statistical Association, 115:530, 692-706. https://doi.org/10.1080/01621459.2018.1537919

Lab Members Involved: