CONICET | Buscador de Institutos y Recursos Humanos

The dynamic complexity of the glucose?insulin metabolism in diabetic patients is the main obstacle towards widespread use of an artificial pancreas. The significant level of subject-specific glycemic variability requires continuously adapting the control policy to successfully face daily changes in patient?s metabolism and lifestyle. In this paper, an on-line selective reinforcement learning algorithm that enables real-time adaptation of a control policy based on ongoing interactions with the patient so as to tailor the artificial pancreas is proposed. Adaptation includes two online procedures: on-line sparsification and parameter updating of the Gaussian process used to approximate the control policy. With the proposed sparsification method, the support data dictionary for on-line learning is modified by checking if in the arriving data stream there exists novel information to be added to the dictionary in order to personalize the policy. Results obtained in silico experiments demonstrate that on-line policy learning is both safe and efficient for maintaining blood glucose variability within the normoglycemic range.