RESEARCHERS
GARELLI Fabricio
Conferences and scientific meetings
Title:
Comparative analysis of reward functions in RL adaptation of glucose control
Author(s):
CECILIA SERAFINI; NICOLÁS ROSALES; FABRICIO GARELLI
Location:
Berlin
Meeting:
Conference; 16th International Conference on Advanced Technology & Treatments for Diabetes; 2023
Abstract:
Background and Aims
Previous work with Reinforcement Learning (RL) techniques applied to closed-loop glucose control has shown that the reward definition can severely affect the performance of RL adaptation strategies.

Methods
In this work, we use the previously presented Q-learning adaptation technique for the Automatic Regulation of Glucose (ARG) algorithm [1], modifying the reward definition to train different agents. The agents were trained on the UVA simulator for 20,000 episodes of 10 steps, each step being 24 h with 4 meals. The trained agents were tested on 5 pre-classified adult patients under a common scenario: a 16-day profile with highly demanding intra-patient variability.

Four schemes were compared:
• Ad hoc strategy based on medical protocols.
• Test 1: piecewise reward.
• Test 2: exponentially shaped, continuous reward.
• Test 3: same as Test 2, with a factor that gives pre-meal hyperglycemia a more negative reward.

Results
The ad hoc strategy achieves a good average percentage of time in range (%TIR) but does not successfully avoid hypoglycemic episodes. All adaptation schemes using RL agents avoid hypoglycemic episodes, but those trained under shaped rewards also achieve better %TIR by avoiding hyperglycemia. Adding a reward penalty for pre-meal hyperglycemia (Test 3) improves the results further.

Conclusions
Reward shaping has been shown to play a key role in RL, but in the case of glucose control, some form of patient pre-classification might be needed to train agents properly. Given this, agents trained under continuous rewards may greatly improve adaptation techniques.

[1] Serafini, Rosales, Garelli (2022). "Long-Term Adaptation of Closed-Loop Glucose Regulation Via Reinforcement Learning Tools". IFAC-PapersOnLine 55(7), pp. 649–654.
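The three reward definitions compared in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything numeric here is an assumption chosen for illustration: the 70–180 mg/dL band (the commonly used time-in-range limits), the hypothetical set point `TARGET`, the reward magnitudes, the `width` of the exponential, and the pre-meal `factor`. The function names are likewise hypothetical and do not come from the authors' implementation.

```python
import math

# Assumed thresholds in mg/dL; 70-180 is the commonly used time-in-range band.
HYPO, HYPER = 70.0, 180.0
TARGET = 110.0  # hypothetical set point, not taken from the abstract


def piecewise_reward(g):
    """Test 1 style: one constant reward per glucose band (illustrative values)."""
    if g < HYPO:
        return -2.0  # hypoglycemia is penalised most heavily (assumption)
    if g > HYPER:
        return -1.0
    return 1.0


def shaped_reward(g, width=40.0):
    """Test 2 style: continuous, exponentially shaped reward peaking at TARGET."""
    # A narrower width below the target makes lows lose reward faster (assumption).
    w = width / 2.0 if g < TARGET else width
    return 2.0 * math.exp(-((g - TARGET) / w) ** 2) - 1.0


def premeal_reward(g, pre_meal, factor=2.0):
    """Test 3 style: shaped reward, amplified when hyperglycemia occurs before a meal."""
    r = shaped_reward(g)
    if pre_meal and g > HYPER and r < 0:
        r *= factor  # more negative reward for pre-meal hyperglycemia
    return r


def time_in_range(trace):
    """Percentage of glucose samples inside [HYPO, HYPER] (%TIR)."""
    return 100.0 * sum(HYPO <= g <= HYPER for g in trace) / len(trace)
```

Unlike the piecewise version, the shaped reward changes smoothly with glucose, so an agent sees a graded signal as glucose moves toward or away from the target rather than a jump at the band edges. This is one plausible reading of why continuous rewards helped, not a claim about the authors' exact design.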