Digital health technologies, such as smart toothbrushes, hold great potential for unobtrusively monitoring brushing behavior in home settings using motion sensors, including accelerometers, gyroscopes, and magnetometers. Some studies have modified toothbrushes by attaching these sensors to the brush handle, while others have employed external devices, such as wristwatches, to identify the dental regions being brushed during a brushing session. Although these studies show promising preliminary results, they were conducted under structured brushing conditions in controlled laboratory settings (e.g. limited head and body movement, a predefined brushing sequence) rather than the free-form brushing observed in real-world settings.
To address this gap, we collected a dataset of 187 brushing sessions, including free-form brushing, and present, to the best of our knowledge, the first motion-sensor dataset obtained during free-form brushing. We label these brushing sessions using concurrent video recordings as the ground truth. Because the brush moves quickly and frequently between regions, labeling is prone to errors (the noisy-label scenario in machine learning); to address this challenge, we propose a relabeling algorithm for dataset cleaning. For the statistical analysis of brushing behavior, we use, for the first time, a zero-inflated mixed-effect generalized linear regression model to examine the brushing time for each dental region. Previous studies frequently relied on non-parametric tests, such as the Wilcoxon test, which are not suitable for repeated-measures studies like ours, in which multiple participants each brush over multiple sessions. Our results show that individuals generally brushed their buccal surfaces (i.e. outer teeth surfaces) more than twice as long as their occlusal surfaces (i.e. flat teeth surfaces; 2.18 times longer, 95\% CI 1.42, 3.35; p < 0.001) and lingual surfaces (i.e. inner teeth surfaces; 2.22 times longer, 95\% CI 1.62, 3.10; p < 0.001). We also propose a three-stage method (pre-processing, change point detection (CPD), and time-series classification) to detect the teeth surfaces brushed during a session from motion-sensor data. We present CluMing, a novel clustering-based CPD algorithm that outperforms other CPD algorithms in our application. We compare multiple machine learning methods for time-series classification, including a feature-engineering approach, LSTM, and Transformer models. Our findings indicate that high classification accuracy can be achieved using a random train-test split of the data (i.e. k-fold cross-validation); however, given the high variation in brushing data, generalization beyond the participants in the training set (i.e. leave-one-subject-out cross-validation) may require additional aid such as domain knowledge transfer or personalization. We validate our findings by applying our proposed method to our dataset as well as to datasets of toothbrushing in controlled settings.
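
The difference between the two evaluation protocols mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy participant labels (`P1` to `P3`), and the sample counts are all hypothetical, and only the Python standard library is used. The sketch shows that a random k-fold split typically places samples from the same participant in both the training and test sets, whereas a leave-one-subject-out split never does.

```python
import random
from collections import defaultdict


def random_kfold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k random folds over sample indices."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds, round-robin.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


def leave_one_subject_out_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one split per subject."""
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)
    for held_out in sorted(by_subject):
        test = by_subject[held_out]
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train, test


def shared_subjects(train, test, subject_ids):
    """Return the subjects that appear on both sides of a split."""
    return {subject_ids[i] for i in train} & {subject_ids[i] for i in test}


# Hypothetical toy data: 3 participants with 4 labeled samples each.
subject_ids = ["P1"] * 4 + ["P2"] * 4 + ["P3"] * 4

for train, test in random_kfold_splits(len(subject_ids), k=4):
    print("k-fold overlap:", sorted(shared_subjects(train, test, subject_ids)))

for held_out, train, test in leave_one_subject_out_splits(subject_ids):
    print("held out", held_out, "overlap:", sorted(shared_subjects(train, test, subject_ids)))
```

In this toy setup every k-fold test set holds three samples while each participant contributes four, so at least one sample from every tested participant remains in the training set. This kind of subject overlap is one plausible reason why random splits can report higher accuracy than subject-wise splits.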