By Hunyong Cho

 

For cancer patients who undergo multiple rounds of chemotherapy, clinicians may need to choose between two or more treatment options at every decision point. Dynamic treatment regimes (DTRs) formalize this decision-making mechanism. A DTR is a sequence of decision rules, one per stage, that adaptively recommends the (optimal) treatment based on the patient's history up to each visit.

When the end goal is to maximize patients’ survival time, deriving an optimal DTR is not trivial and requires careful handling of censoring, because some patients' survival times are only partially observed. For this, we have developed a Q-learning framework that uses random forests. Our method is flexible in several respects:

  1. It allows conditionally independent censoring.
  2. The resulting decision rules are unrestricted (e.g., non-linear).
  3. It allows a different number of treatment options at each stage.
  4. It can maximize either the mean survival time or the survival probability at a given time point.
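
To make the two criteria in the last item concrete, here is a minimal Python sketch that computes both from a single estimated survival curve. It is an illustration only, not the dtrSurv API or the method of the paper: it swaps in a plain Kaplan-Meier estimator (which assumes independent censoring with no covariates) in place of the random forests, and the function names `kaplan_meier`, `survival_probability`, and `truncated_mean_survival`, as well as the toy data, are hypothetical. The mean is truncated at a horizon `tau`, since the mean survival time beyond the observed follow-up cannot be estimated from censored data.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve (illustrative stand-in, not the paper's estimator).

    times  : observed follow-up times
    events : 1 if the death was observed, 0 if the observation was censored
    Returns a list of (t, S(t)) pairs at each distinct observed death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Deaths and censored subjects both leave the risk set.
        n_at_risk -= removed
    return curve


def survival_probability(curve, t0):
    """Criterion 1: S(t0), the probability of surviving beyond time t0."""
    s = 1.0
    for t, value in curve:
        if t > t0:
            break
        s = value
    return s


def truncated_mean_survival(curve, tau):
    """Criterion 2: E[min(T, tau)], the area under S(t) from 0 to tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # S(t) is a step function
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)
    return area


# Toy data (made up for illustration): five patients, two censored.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(survival_probability(curve, t0=4))
print(truncated_mean_survival(curve, tau=6))
```

The two criteria can rank treatments differently: one option may give a higher survival probability at a fixed time point while another gives a longer truncated mean, which is why the choice of criterion is left to the analyst.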

Our estimator is consistent, has a polynomial regret bound, and shows strong empirical performance. An R package, dtrSurv, is available on CRAN.

For more information, see:
Cho, H., Holloway, S. T., Couper, D. J., & Kosorok, M. R. (2022). Multi-stage optimal dynamic treatment regimes for survival outcomes with dependent censoring. Biometrika. https://doi.org/10.1093/biomet/asac047

Lab Members Involved: