Article
Open access
Published: 26 April 2023

Assessing nocturnal scratch with actigraphy in atopic dermatitis patients

npj Digital Medicine volume 6, Article number: 72 (2023)

Abstract

Nocturnal scratch is one major factor leading to impaired quality of life in atopic dermatitis (AD) patients. Therefore, objectively quantifying nocturnal scratch events aids in assessing the disease state, treatment effect, and AD patients’ quality of life. In this paper, we describe the use of actigraphy, highly predictive topological features, and a model-ensembling approach to develop an assessment of nocturnal scratch events by measuring scratch duration and intensity. Our assessment is tested in a clinical setting against the ground truth obtained from video recordings. The new approach addresses unmet challenges in existing studies, such as the lack of generalizability to real-world applications, the failure to capture finger scratches, and the limitations in the evaluation due to imbalanced data in the current literature. Furthermore, the performance evaluation shows agreement between derived digital endpoints and the video annotation ground truth, as well as patient-reported outcomes, which demonstrated the validity of the new assessment of nocturnal scratch.

Introduction

Most atopic dermatitis (AD) patients experience itching more often and severely in the evening or nighttime than during the day^1,2. Sleep disturbance is a concern for AD patients, and itch-related scratch is a major cause. Arousal does not occur prior to itching but ensues after scratch^3,4. Studies show that itching and scratching are not synonymous concepts, and generally, scratching is not required for mild itch². Patient-reported outcome (PRO) assessments commonly used in trials do not explicitly reflect nocturnal scratches. Furthermore, subjective measures are limited in accurately reflecting the true scratch severity, especially when patients are unconsciously experiencing discomfort that leads to being awake. Therefore, objectively evaluating and quantifying nocturnal scratches can bring additional benefits in understanding the patient’s disease state and life quality. Our study aims to objectively track the duration and intensity of nocturnal scratches for AD patients using wrist-worn actigraphy devices.

Many recent studies leverage wearable sensor technologies to objectively measure and assess the health-related quality of life of patients across various therapeutic areas. These studies demonstrate the benefit of actigraphy devices as cost-effective, noninvasive, and user-friendly ways of enabling proactive personal health management, continuous health monitoring, early detection of symptoms, and context awareness as healthcare costs increase^5,6,7,8. Many of these wearable sensor studies explore nocturnal scratch detection from wrist-worn actigraphy or smartwatch applications. Yang et al.⁹ provide a systematic review of studies published before 2021 that compares methods of measuring scratch objectively and points out the large variability in performance and the limited evaluation of specificity due to imbalanced data. In addition, lack of generalizability and failure to capture finger scratches introduce additional errors. Figure 1 shows finger and hand scratch examples.

**Fig. 1: Finger scratch and hand scratch events.**

The generalizability of nocturnal scratch detection depends on two aspects: the spontaneous scratch motion of the AD patient population and the exact definition of nocturnal scratch. Feuerstein et al. and Petersen et al.^10,11 built algorithms from simulated scratches performed by healthy adults, which may not generalize to the spontaneous scratches of the AD population. Moreau et al.¹² proposed a bidirectional recurrent neural network classifier to detect scratches among AD patients, with improvement over previous studies. However, the reported evaluation metric is based on data re-balanced by over-sampling scratch events, which over-estimates performance in practice, where most of the time contains non-scratch events. Mahadevan et al.¹³ proposed a two-step approach that excludes periods without hand movement and detects scratching only when hand movement is present. Mahadevan’s method helps to balance the scratch and non-scratch events, but the false negative rate of the movement detection filter is not reported. As a result, finger scratches may be identified as non-movement and falsely excluded before binary classification. On the other hand, Mahadevan et al. only evaluate scratch during the total sleep opportunity (TSO) window to provide a generalized definition of nocturnal scratch. The TSO window was first defined by van Hees et al.¹⁴ as the period in which subjects intend to sleep. It can be detected from the actigraphy signal with a heuristic algorithm. In a free-living environment, various scenarios of non-wear or sleep patterns exist. Therefore, a more specific heuristic rule based on the actigraphy signal and the time of day is needed to capture the “nighttime, intend to sleep” period and define nocturnal scratch more precisely.

Although many studies have explored nocturnal scratch detection from actigraphy devices with accelerometer sensors, few have utilized gyroscopes. Gyroscopes measure angular velocity, from which the device orientation can be tracked, and they offer additional benefit for motion detection and gesture recognition on top of the accelerometer, especially for low-amplitude motion such as finger scratches^15,16. In addition, gyroscopes enable more precise gravity removal from accelerometer measurements, because the device orientation can be obtained by solving a differential equation that depends on the angular velocity¹⁷. However, the advantage of including a gyroscope sensor in actigraphy devices for nocturnal scratch detection has yet to be fully explored.

To improve the model performance, we advanced both feature extraction and modeling methods. Besides the interpretable features in the time and frequency domain commonly used in literature, we develop additional features with topological data analysis (TDA). TDA, which originated from algebraic topology in pure mathematics, is a rapidly growing field and has proven successful in several scientific areas^18,19,20,21. TDA tools can capture intrinsic shape information, provide effective features or fingerprints, and are robust to noise. Researchers have recently found that using TDA offers different aspects in analyzing time series data^22,23,24,25. We adapt the methods from Chung et al.²² to extract topological features. In addition, we ensemble models to boost the performance by deriving extra features from deep learning (DL) models as input to a top-layer LightGBM classifier. Topology-based and DL model-derived features are the top predictors selected by LightGBM.

In this work, we refine the heuristic approach to derive TSO to help define nocturnal scratches more precisely in a free-living environment. We improve the movement detection algorithm to achieve a sensible balance between scratch event prevalence and false negative rate (scratch falsely detected as non-movement). We also demonstrate the benefit of gyroscopes in nocturnal scratch detection by providing more precise gravity removal and additional features. Furthermore, we develop highly predictive topology-based and DL model-derived features to improve model performance. In addition, performance is evaluated on the test set without resampling to reflect the actual performance on imbalanced data, which is most often the case. Finally, digital endpoints measuring scratch duration and intensity are derived and compared with video annotation ground truth and subjective PROs to demonstrate the feasibility, consistency, and reliability of the objective digital measures.

Results

Demographics for study participants

The nocturnal scratch model is built based on 96 nights of data from 20 AD patients. The proportion of male participants is 80%, and the average age is 31 years. The overall disease severity in this study is mild to moderate, with a Severity Scoring of Atopic Dermatitis Index (SCORAD) score mean of 25.8 and an average number of wakings each night of 2. Table 1 includes detailed information on the demographics of study participants.

Table 1 Demographics of study participants.

Movement detection results

Across all 96 study nights, there are 33,559 s of scratch events from the video annotations, corresponding to a 0.98% scratch event rate. After applying the movement detection, the scratch event rate increased to 17.8%. The overall performance of the movement detection is shown in Table 2 together with the comparison to the method in Mahadevan et al.¹³. We reduced the rate at which scratch is falsely detected as non-movement while still greatly increasing the prevalence of scratch compared to the situation without any movement detection filter. The false negative rate dropped from 36.1% to 10.1% with only a small loss in prevalence (from 20.3% to 17.8%). There are two possible reasons for the 10.1% of scratches that were falsely detected as non-movement. The first is low-intensity scratch, such as finger scratches. The scratch pattern varies from person to person, which leads to large variations in subject-level performance, as reported in Supplementary Table S1. The second is human error in the video annotation labels. During wrongly annotated periods, the actigraphy signals appear flat, without wrist motion, but were still annotated as scratches. Given how difficult it is to annotate video precisely by hand, such errors cannot be avoided completely. Example plots of these false negative scratch actigraphy signals are shown in Supplementary Fig. S3.

Table 2 Performance of movement detection algorithm.

Nocturnal scratch classifier

With only 16 features for the accelerometer data model and 18 features for the accelerometer and gyroscope model, the area under the curve (AUC) reaches more than 99% of the AUC of the models with all features. In the feature importance plots shown in Fig. 2, topology-based features (persistence statistics and norm of Gaussian persistence curves) are the top selected features for both models. Features derived from DL models are highly predictive as well.

The leave-one-subject-out prediction evaluation is performed within the movement period identified by the movement detection algorithm. Table 3 compares two versions of binary classifiers. The average AUC increased from 0.77 to 0.80 after including the gyroscope signals. The F1 score increased from 0.39 to 0.44. A subject-level evaluation is provided in Supplementary Table S2. The performance difference varies largely among subjects. Subjects with results showing more advantage by adding the gyroscope are likely to have low-intensity motion scratches such as finger scratches. For example, subject 8 benefits from the gyroscope with AUC increased by more than 0.1, and F1 increased by 0.08. Subjects 4 and 6 have AUC increased by more than 0.06 and F1 increased by more than 0.1 and 0.3, respectively. However, subjects 1, 7, and 20 do not differ much, with or without including the gyroscope.

Table 3 Leave-one-subject-out evaluation of scratch binary classifier.

Digital endpoints validation

For each subject night, the total scratch duration was derived based on the binary classifier output. The model-derived scratch duration averaged across all 96 study nights is 2.13 s shorter than the ground truth from the video. For most subject-nights, there is no discrepancy between the two measures. The histogram and Bland–Altman plots in Fig. 3 show no systematic bias in model-derived scratch duration compared to ground truth. The blue line in the Bland–Altman plot is the mean difference between the two measures, and the red lines are the upper and lower limits of the 95% confidence interval for the average difference. There are a few outliers where the model underestimated the total scratch duration by more than 5 min compared to video annotation, and one outlier where it overestimated by more than 10 min.

**Fig. 3: Assessing consistency between ground truth and model-derived endpoints.**
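The agreement summary behind a Bland–Altman plot can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function name and the input values are hypothetical, and a normal approximation (1.96) is assumed for the interval. It returns both the 95% confidence interval of the mean difference, which is what the red lines are described as above, and the more conventional limits of agreement for comparison.

```python
import numpy as np

def bland_altman(derived, truth):
    """Summarize agreement between two paired measurements."""
    derived = np.asarray(derived, dtype=float)
    truth = np.asarray(truth, dtype=float)
    diff = derived - truth                 # per-night difference
    bias = diff.mean()                     # mean difference (blue line)
    sd = diff.std(ddof=1)                  # sample SD of the differences
    half_ci = 1.96 * sd / np.sqrt(diff.size)
    return {
        "bias": bias,
        # 95% CI of the mean difference (normal approximation, assumed).
        "ci95": (bias - half_ci, bias + half_ci),
        # Classical limits of agreement, shown only for comparison.
        "loa": (bias - 1.96 * sd, bias + 1.96 * sd),
    }

# Hypothetical nightly scratch durations in seconds (not study data).
summary = bland_altman([10, 20, 30, 40], [12, 18, 33, 41])
```

A negative bias here means the derived measure is shorter on average than the reference, matching the sign convention used for the 2.13 s difference in the text.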

Overall, there is a weak to moderate correlation between digital endpoints and PROs. Since the patient reports SCORAD only once on their first visit, the correlation shown in Fig. 4 is between SCORAD and subject-level averaged scratch duration. The correlation between the SCORAD score and true scratch duration (0.44) is slightly higher than the correlation with the derived scratch duration (0.37). The correlation between ADSS and digital endpoints is computed across all 96 nights. Similarly, the correlation between the number of awakenings and true hourly scratch duration (0.35) is slightly higher than the correlation with the derived scratch duration (0.22). This result confirms the consistency between digital and PRO measures and also suggests that objective digital endpoints can complement PROs by assessing disease states from different perspectives.

**Fig. 4: Correlation between patient-reported outcomes and digital endpoints.**

Digital endpoints and ADSS were measured after participants spent each night in the clinic. The intraclass correlation (ICC) shown in Table 4 was calculated based on repeated measures. Both true, from the video annotations, and derived scratch duration have ICC slightly higher than 0.4, indicating weaker consistency or reproducibility compared to the number of awakenings reported by patients.

Table 4 Intraclass correlation (ICC).

Discussion

The results of this study showed the validity of nocturnal scratch endpoints derived from wrist-worn actigraphy devices in AD patients. Compared to the video annotation ground truth and PROs, nocturnal scratch endpoints accurately capture the disease state. More importantly, nocturnal scratches provide an additional objective assessment of the disease state that subjective measures cannot provide. Scratch detection performance can be over-estimated relative to reality, where the majority of time consists of non-scratch behaviors, so our metrics are reported on the dataset without balancing scratch and non-scratch events. Since this study enrolled patients with mild to moderate disease severity, the low prevalence of scratch events makes achieving a high precision or F1 value more challenging. In a high-prevalence population, a sample that tests positive is more likely to be truly positive than in a low-prevalence population. Therefore, we expect higher precision of scratch detection among patients with longer scratch duration. Another way to greatly increase the scratch event rate is to apply a movement detection filter before the binary scratch classifier. We further improved the movement detection algorithm by lowering the false negative rate compared to the existing method while preserving its purpose of increasing the scratch event rate.
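The dependence of precision on prevalence follows from Bayes' rule, and a short sketch makes the effect concrete. The sensitivity and specificity values below are hypothetical and are not reported by the study; only the two prevalence values (0.98% before and 17.8% after movement detection) come from the text.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Precision (PPV) of a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical classifier operating point (NOT a result from the study).
SENS, SPEC = 0.80, 0.90

# Prevalence values taken from the text: without and with the movement filter.
ppv_unfiltered = positive_predictive_value(SENS, SPEC, 0.0098)
ppv_filtered = positive_predictive_value(SENS, SPEC, 0.178)
```

With the same assumed sensitivity and specificity, precision rises from roughly 0.07 at 0.98% prevalence to roughly 0.63 at 17.8% prevalence, which illustrates why raising the scratch event rate before classification matters.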

Another innovation of this work is the development of topology-based features from actigraphy signals as predictors for nocturnal scratches. Because Betti numbers extracted from the shape of the signals are robust, these features are interpretable and highly predictive compared to existing features in the time and frequency domains. In the model with the full set of 348 features derived from accelerometer data, 11 out of 16 top predictive features are topology-based. The same idea can also be extended to predicting certain physiological behaviors using wearable device signals.

We also explored the benefit of adding gyroscope data from two aspects: gravity removal and predictive feature extraction. The advantage of gyroscope signals in providing a more precise gravity removal in accelerometer data can also be applied to other actigraphy use cases beyond nocturnal scratch detection. Features extracted from gyroscope signals are highly predictive, given that 7 of 18 selected features are extracted from gyroscope data. Model performance improvement due to adding the gyroscope varies across subjects. This variation is expected because scratch patterns differ from person to person. In general, including gyroscope signals in nocturnal scratch detection is recommended. Investigating gyroscope benefits in a larger population is left for future work. Another major challenge in nocturnal scratch quantification in a free-living environment is identifying a reasonable TSO window, especially when patients may fail to wear devices on both wrists all the time. We extended the existing approach of identifying TSO for each hand separately to obtaining a combined TSO from both hands when valid data are available. Otherwise, if a subject only wears a device on one wrist, TSO and digital endpoints are derived from that hand only. In addition, we leveraged movement detection and temperature information to identify the non-wear period and, thus, better identify the TSO in a free-living setting.

Finally, we examined the validity of derived digital endpoints analytically and clinically by comparing them with the true scratch duration from video annotations and PROs. The model-derived scratch duration is consistent with ground truth without any systematic bias. The weak to moderate correlation between digital endpoints and PROs indicates that the digital endpoints are consistent with existing PROs and can assess disease states from different aspects than PROs. The weaker ICC of digital endpoints compared to PROs is due to the finer granularity that digital endpoints measure from night to night. Therefore, collecting device data and averaging them across one to two weeks from each patient is recommended for studies using actigraphy to assess disease state.

Given that nocturnal scratch is detected based on wrist motions in this work, there is a limitation in capturing other forms of scratching. Some examples include non-hand scratches and rubbing part of the body on a pillow, sheets, or other bedding. To fully consider non-hand scratches, we need to leverage other sensor technologies. As in many existing studies, the sample size is also a limitation. To further improve the algorithm’s robustness and generalizability, refining the algorithm with more subjects and across different disease severity groups will be future work. In addition, the ability to assess treatment efficacy through digital endpoints is not yet fully explored. A future study is required to collect actigraphy data from patients receiving a treatment with known efficacy.

Methods

Study design and data collection

There were 23 AD patients enrolled, with up to five nights of data collected from each participant, and three subjects were excluded due to device malfunction and software issues. This left 96 nights of data from 20 subjects to train the model. During each study night, participants slept at the clinical research unit (CRU) on a mattress with an EMFIT bed sensor, wore an AX6 actigraphy device on each wrist, and were recorded by an infrared camera throughout the night. Triaxial accelerometer and triaxial gyroscope signal data were collected from the AX6 to capture the wrist motion of both hands. Two independent reviewers annotated the videos, with an arbitrator reconciling discrepancies. The data source and description are shown in Table 5. WIRB-Copernicus Group Institutional Review Boards approved this study. All enrolled participants provided written informed consent approved by the ethical review board governing the CRU.

Table 5 Description of data source.

Algorithm overview

Given the scope of nocturnal scratch detection, the first step is identifying the TSO window. The TSO window is defined as the period of nighttime in which participants are ready for sleep or trying to sleep. In this study, participants were instructed to wear the actigraphy devices only during the nighttime after arrival at the CRU. After participants woke up and left the CRU in the morning, devices were placed on the table (still turned on). In this case, TSO is identified by the EMFIT bed sensor. The moment when participants sit or lie on the mattress each study night is the start of the TSO window, and the moment when participants leave their bed in the morning is the end of the TSO window. We also developed a rule-based approach using accelerometer and temperature data to obtain TSO for the case where no bed sensor is available and participants wear devices for 24 h^13,14. After getting the TSO window, we followed the steps shown in Fig. 5 to process signals, extract features, build models, and derive endpoints. Given that the AX6 device collects both accelerometer and gyroscope data, we built two versions of the signal processing and modeling steps for comparison: version one involves accelerometer data only, and version two uses both accelerometer and gyroscope data. Detailed descriptions of each step are given in the following subsections.

Signal processing

To ensure the model was trained with correctly annotated data, the first step was to align the timestamps between the actigraphy device and video annotation for each study night. Factors such as initial actigraphy device configuration and drift from the internal real-time clock (RTC) can impact the alignment. The initial time configuration of the AX6 device is synced to the time of the connected computer to the nearest second. In some cases, the connected computer may have been set to a different time zone, which caused hours of alignment discrepancies. Under normal operating conditions, the RTC drift for AX6 is specified to be up to ±4.32 s per day. To correct for initial device configuration and RTC drift, we plotted accelerometer and gyroscope signals along with the annotation. Then, we visually measured the amount of alignment correction needed, to the nearest second, based on the movement patterns in the sensor signals for each subject-night. The amount of RTC drift during a night is typically less than ±1 s, so we did not correct for any slight variations of RTC drift within each night. We used the visual measurements to shift the AX6 actigraphy data to align with the video annotations. Figure 6 shows a sample of the actigraphy and video annotation data after alignment.

**Fig. 6: Overlaid actigraphy data and video annotations.**

The raw accelerometer signal was calibrated to local gravity to account for the device-level noise²⁶. Since we are interested in patient movement, not, e.g., hand orientation in a global reference frame, we need to remove gravity from the measured acceleration. To this end, we employed two distinct approaches, depending on whether the downstream model leverages gyroscopic information. Many clinicians may opt to forgo gyroscope utilization due to the additional energy burden placed on the device that requires more frequent device charging and potential issues with patient compliance. Unfortunately, it is impossible to remove gravity exactly with only accelerometer data. We followed the standard approximation for this version and used a high-pass first-order Butterworth filter with a cutoff frequency of 0.25 Hz. When we also have gyroscopic information, it is possible to remove gravity exactly (up to device noise and numerical drift). Here we used an in-house approach that solves a quaternion ODE that leverages stationary regions with sensor fusion to mitigate drift, as described in Supplementary Information Section 1.
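The accelerometer-only approximation described above can be sketched with SciPy. This is a minimal sketch under stated assumptions: the function name, the default sampling rate of 50 Hz, and the use of zero-phase (forward-backward) filtering are illustrative choices that the text does not specify. The gyroscope-based quaternion approach is described only in the Supplementary Information and is not reproduced here.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def remove_gravity_highpass(acc, fs=50.0, cutoff_hz=0.25):
    """Approximate gravity removal with a first-order Butterworth high-pass.

    acc: array of shape (n_samples, 3) with X, Y, Z acceleration in g.
    fs:  sampling rate in Hz (assumed value; set to the device rate).
    """
    sos = butter(1, cutoff_hz, btype="highpass", fs=fs, output="sos")
    # Zero-phase filtering is an assumption, chosen to avoid time shifts
    # relative to the video annotations.
    return sosfiltfilt(sos, acc, axis=0)

# Synthetic check: constant gravity on Z plus a 2 Hz wrist oscillation.
fs = 50.0
t = np.arange(0, 20, 1 / fs)
acc = np.zeros((t.size, 3))
acc[:, 2] = 1.0 + 0.1 * np.sin(2 * np.pi * 2.0 * t)
motion = remove_gravity_highpass(acc, fs=fs)
```

The filter removes the slowly varying gravity component while leaving the 2 Hz oscillation nearly untouched, which is why it is only an approximation: slow wrist rotations change the gravity projection at frequencies that overlap with real movement.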

Similar to the approach taken in Mahadevan et al., our model aims to detect scratches generated by hand or wrist motion. Thus, a threshold-based filter was applied to exclude periods without hand movement. The model only classifies scratch and non-scratch within periods where hand or wrist motion is present. As scratch only happens with hand motion, this reduces non-scratch time and helps to increase the share of scratch labels in the train and test sets, especially in the mild disease state population. We compared two versions of the movement algorithm, with and without gyroscope data, and observed only a small difference in performance. The current movement detection algorithm is therefore based on accelerometer data only, to simplify the approach and avoid extra parameters. Figure 7 shows the process of movement extraction. After resampling raw data to a 20 Hz sampling frequency, we computed the vector magnitude (VM) \(\sqrt{X^{2}+Y^{2}+Z^{2}}\) within each 1-s window. A low-pass filter (6th order with 3 Hz cutoff) and a high-pass filter (1st order with 0.25 Hz cutoff) were applied to remove noise and the constant component in VM, respectively. The movement period was detected based on two threshold quantities. The first quantity is the rolling coefficient of variation (CoV), computed by sliding a 1-s window. The second quantity is the maximum standard deviation (SD) among the X, Y, and Z axes within each 1-s window. For each second, if more than half of the CoV values are greater than 0.41 and the maximum SD is greater than 0.013, we marked this second as hand movement. The choice of thresholds reflects a trade-off between how much true scratch is falsely excluded and the percentage of scratch in the train and test sets. Here 0.41 and 0.013 were picked as the 8% quantiles from the entire dataset to achieve a good trade-off.
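The two-threshold rule can be sketched as below. Several details are assumptions rather than the study's implementation: the CoV is taken as SD divided by the absolute mean of the filtered VM, the rolling window is forward-looking and padded at the end, the filters are applied zero-phase, and the function name is hypothetical. The filter orders, cutoffs, 20 Hz rate, and thresholds follow the text.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 20                # Hz, resampled rate stated in the text
COV_THRESHOLD = 0.41   # thresholds stated in the text
SD_THRESHOLD = 0.013

def detect_movement(acc, fs=FS):
    """Return one boolean per whole second: True where movement is flagged.

    acc: array of shape (n_samples, 3), accelerometer X, Y, Z in g at fs Hz.
    """
    n_sec = acc.shape[0] // fs
    acc = acc[: n_sec * fs]

    # Vector magnitude, low-pass (noise) then high-pass (constant offset).
    vm = np.linalg.norm(acc, axis=1)
    lowpass = butter(6, 3.0, btype="lowpass", fs=fs, output="sos")
    highpass = butter(1, 0.25, btype="highpass", fs=fs, output="sos")
    vm = sosfiltfilt(highpass, sosfiltfilt(lowpass, vm))

    # Rolling CoV over a sliding 1-s window (assumed: SD / |mean|).
    windows = np.lib.stride_tricks.sliding_window_view(vm, fs)
    cov = windows.std(axis=1) / (np.abs(windows.mean(axis=1)) + 1e-12)
    cov = np.pad(cov, (0, fs - 1), mode="edge")   # one value per sample
    frac_high_cov = (cov.reshape(n_sec, fs) > COV_THRESHOLD).mean(axis=1)

    # Maximum per-axis SD inside each non-overlapping 1-s window.
    max_sd = acc.reshape(n_sec, fs, 3).std(axis=1).max(axis=1)

    return (frac_high_cov > 0.5) & (max_sd > SD_THRESHOLD)

# Synthetic example: 10 s at rest, then 10 s of 2 Hz motion along Z.
rng = np.random.default_rng(0)
t = np.arange(20 * FS) / FS
acc = rng.normal(0.0, 0.001, size=(t.size, 3))
acc[:, 2] += 1.0
acc[t >= 10, 2] += 0.2 * np.sin(2 * np.pi * 2.0 * t[t >= 10])
flags = detect_movement(acc)
```

In this sketch the SD threshold does most of the work of rejecting still periods, because the CoV of a high-pass filtered signal is large whenever the window mean is near zero.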

**Fig. 7: Detailed diagrams of pipeline.**

Feature engineering

The extracted movement period was further segmented into 3-s windows with 1.5-s overlap to increase the sample size. The literature suggests that 3-s windows achieve a good trade-off between temporal resolution and detection performance^10,11,13. The binary label for each 3-s window was generated based on video annotation. A window with more than 1 s of scratch present was labeled as scratch. Edge cases (27.6%), where a 3-s window covers both movement and non-movement periods, are also included, and video annotations determine their labels.
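The windowing and labeling rule can be sketched as follows, assuming a 20 Hz signal and a per-sample boolean scratch annotation. The function and argument names are hypothetical.

```python
import numpy as np

def segment_and_label(signal, scratch_mask, fs=20,
                      win_s=3.0, step_s=1.5, min_scratch_s=1.0):
    """Cut a signal into overlapping windows and label each window.

    signal:       1-D array sampled at fs Hz.
    scratch_mask: boolean array of the same length, True where the video
                  annotation marks scratch.
    """
    win, step = int(win_s * fs), int(step_s * fs)
    starts = range(0, len(signal) - win + 1, step)
    windows = np.stack([signal[s:s + win] for s in starts])
    # A window is scratch when it contains more than 1 s of annotated scratch.
    labels = np.array(
        [scratch_mask[s:s + win].sum() / fs > min_scratch_s for s in starts]
    )
    return windows, labels

# 9 s of data with scratch annotated from 3.0 s to 5.0 s.
signal = np.arange(180, dtype=float)
mask = np.zeros(180, dtype=bool)
mask[60:100] = True
windows, labels = segment_and_label(signal, mask)
```

Here the five windows start at 0, 1.5, 3, 4.5, and 6 s. Only the second and third contain more than 1 s of scratch, so only those two are labeled as scratch.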

Accelerometer and gyroscope sensor signals were transformed by computing the VM and the first and second principal components (PC1, PC2) to make the signal orientation invariant. This transformation eliminates the dependence on wrist orientation by effectively changing the signal basis into the two-dimensional plane in which scratch occurs^13,16. For each sensor, we have 12 channels of signals (X, Y, Z, PC1, PC2, VM, and their corresponding derivatives) as input to extract features.
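A sketch of the 12-channel construction for one sensor is given below. It assumes the principal components are fitted per input segment via SVD and that derivatives are taken with a finite-difference gradient; neither choice is specified in the text, and the function name is hypothetical.

```python
import numpy as np

def orientation_invariant_channels(xyz, fs=20):
    """Build 12 channels: X, Y, Z, PC1, PC2, VM and their time derivatives.

    xyz: array of shape (n_samples, 3) from one triaxial sensor.
    """
    xyz = np.asarray(xyz, dtype=float)
    vm = np.linalg.norm(xyz, axis=1)

    # PCA via SVD of the mean-centered signal; rows of vt are the axes.
    centered = xyz - xyz.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pcs = centered @ vt[:2].T                      # PC1 and PC2 scores

    base = np.column_stack([xyz, pcs, vm])         # 6 base channels
    derivs = np.gradient(base, 1.0 / fs, axis=0)   # assumed derivative
    return np.hstack([base, derivs])               # shape (n_samples, 12)

# Rotating the device leaves VM unchanged and PC scores unchanged up to sign.
rng = np.random.default_rng(0)
xyz = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.2])
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))       # random orthogonal matrix
original = orientation_invariant_channels(xyz)
rotated = orientation_invariant_channels(xyz @ q.T)
```

The sign of a principal component is arbitrary, so downstream features should not depend on it; the VM channel has no such ambiguity.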

There are two types of features extracted from the transformed signals. The first type of feature is interpretable, with meanings in the time and frequency domains that reflect the signal’s range, periodicity, smoothness, and other properties. In addition to the features commonly used in the literature^10,11,13,16, we developed additional features with TDA. TDA is a rapidly growing field and has proven successful across various scientific disciplines^21,27. Recently, researchers have also found that TDA tools offer different perspectives for analyzing time series data^22,23,24,25,28. In this work, we consider 11 TDA-based features. Ten of them are persistence statistics²², which are summary statistics of two sets of numbers: lifespan persistence and midlife persistence. The last feature is the norm of the Gaussian persistence curve²⁹. Details about these 11 features can be found in Supplementary Information Section 2. We use the GUDHI package for computing persistence diagrams³⁰. The second type of feature is non-interpretable and derived from DL models. We derived ten features by taking the penultimate layers from the convolutional neural network (CNN) and recurrent neural network (RNN) models. In total, we have 348 features for the model with only accelerometer data, of which 338 are interpretable features and 10 are layer outputs from the CNN and RNN models. The model with accelerometer and gyroscope data has 686 features, of which 676 are interpretable and 10 are layer outputs from the CNN and RNN models. Feature selection was performed with recursive feature elimination. For the accelerometer data model, 16 features are selected. For the model with both accelerometer and gyroscope data, 18 features are selected. This decision is based on the elbow plots shown in Supplementary Fig. S2.
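To illustrate what persistence statistics look like, the sketch below summarizes a persistence diagram that is assumed to be already computed (the study uses GUDHI for that step). It uses the common definitions lifespan = death - birth and midlife = (birth + death) / 2. The particular statistics chosen here are illustrative and are not claimed to be the ten used in the study, which are listed in its Supplementary Information.

```python
import numpy as np

def persistence_statistics(diagram):
    """Summary statistics of lifespan and midlife for a persistence diagram.

    diagram: iterable of (birth, death) pairs; infinite bars are dropped.
    """
    diagram = np.asarray(diagram, dtype=float)
    diagram = diagram[np.isfinite(diagram).all(axis=1)]
    birth, death = diagram[:, 0], diagram[:, 1]

    lifespan = death - birth            # how long each feature persists
    midlife = (birth + death) / 2.0     # where on the scale it lives

    stats = {}
    for name, values in (("lifespan", lifespan), ("midlife", midlife)):
        q25, q75 = np.percentile(values, [25, 75])
        stats[f"{name}_mean"] = values.mean()
        stats[f"{name}_std"] = values.std()
        stats[f"{name}_median"] = np.median(values)
        stats[f"{name}_iqr"] = q75 - q25
    return stats

# Hypothetical diagram with three finite bars.
stats = persistence_statistics([(0.0, 2.0), (1.0, 2.0), (0.5, 1.5)])
```

Because each statistic is a fixed-length summary of a variable-length diagram, the result can be appended directly to the time and frequency domain features of a window.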

Nocturnal scratch detection model

A binary classifier is trained to detect scratch events. We employed a hierarchical ensembling approach with LightGBM as the top layer. Lower levels consist of physics-based, topology-based, and DL model-derived features (CNN/RNN), as shown in Fig. 7a. This ensembling allows us to effectively leverage the inductive biases offered by a variety of different approaches in order to boost the overall performance. For the DL model-derived features, we used modest architectures for our CNN and RNN. The CNN consists of two layers of convolution, batch normalization, and max pooling, followed by a 3-layer MLP with layer output dimensions (16,5,1). The RNN consists of a bidirectional LSTM layer, again followed by a 3-layer MLP with layer output dimensions (16,5,1). In each case, we trained the model to classify scratch, removed the final layer from the trained network, and used the penultimate layer output as features. We chose a conservative hidden dimension for the penultimate layer to obtain five distinct features from each model.

Given that the future deployment of this model is on new patients, we evaluate the leave-one-subject-out performance to mimic reality. For each fold, we hold out data from one subject. For the remaining subjects, we randomly split data into train and validation sets with an 8:2 ratio. The validation set is used to evaluate early stopping criteria to prevent model overfitting. Hyperparameters, including the number of leaves and the tree’s maximum depth, were tuned with the train set. Since participants in this study had mild to moderate disease states, the scratch event prevalence is low. To address this, “scale_pos_weight” was included as a hyperparameter to over-weight scratch samples and balance the train set. However, we kept the original prevalence in the test (hold-out) set for evaluation to reflect the actual performance in reality, especially for mild to moderate AD patients, where scratch events are rare. Feature selection was performed with recursive feature elimination within each fold to reduce dimension. To evaluate the additional benefit of the gyroscope sensor, we fit the model with two sets of features for comparison. The first set includes features processed and extracted only from accelerometer data. The second set includes features extracted and processed with accelerometer and gyroscope data.
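The fold construction can be sketched without any modeling library. The assumptions here are that the 8:2 split is done at the sample level, and that the class weight starts from the ratio of negative to positive training samples, a common heuristic for LightGBM's `scale_pos_weight`; the study tuned this value as a hyperparameter. The function name is hypothetical.

```python
import numpy as np

def loso_folds(subject_ids, labels, val_fraction=0.2, seed=0):
    """Yield leave-one-subject-out folds with a train/validation split.

    Yields (held_out, train_idx, val_idx, test_idx, pos_weight).
    """
    subject_ids = np.asarray(subject_ids)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)

    for held_out in np.unique(subject_ids):
        # Test set keeps its natural prevalence: no resampling or weighting.
        test_idx = np.flatnonzero(subject_ids == held_out)

        # Remaining subjects: random 80/20 split at the sample level.
        rest = rng.permutation(np.flatnonzero(subject_ids != held_out))
        n_val = int(round(val_fraction * rest.size))
        val_idx, train_idx = rest[:n_val], rest[n_val:]

        # Starting value for over-weighting scratch samples in training.
        n_pos = int(labels[train_idx].sum())
        pos_weight = (train_idx.size - n_pos) / max(n_pos, 1)

        yield held_out, train_idx, val_idx, test_idx, pos_weight

# Toy data: 3 subjects, 10 windows each, 20% scratch.
subjects = np.repeat([1, 2, 3], 10)
labels = np.tile([0, 0, 0, 0, 1], 6)
folds = list(loso_folds(subjects, labels))
```

Each fold's `pos_weight` would be passed to the classifier for training only, while metrics are computed on `test_idx` as is, so the reported precision and F1 reflect the true event rate.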

Model deployment, digital endpoints extraction and analysis

This subsection describes the steps to deploy the model and derive nocturnal scratch endpoints using the proposed approach. After processing the raw signal with calibration, gravity removal, and resampling to 20 Hz, the movement detection filter is applied to exclude all non-movement periods. Then, the non-wear time, when the actigraphy device is not worn on the wrist, is identified based on temperature and non-movement, as seen in Fig. 8. For example, if a non-movement period has a temperature lower than 25 °C for more than 10 min, it is considered a non-wear period. Next, the TSO periods are retrieved. For studies without information on TSO start and end time, the TSO can be identified by the heuristic algorithm proposed by van Hees et al.¹⁴ together with the rules in Fig. 7c to obtain a combined TSO from the left and right hands that covers nighttime only. After excluding all non-wear time within the TSO window, the time series data are further segmented into 3-s, non-overlapping windows to apply the movement detection algorithm. All 3-s windows with at least 1 s identified as movement have features extracted and passed as input to the binary classifier to detect scratch.
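The non-wear example rule can be sketched on per-second data. This reads the rule as: a contiguous run of seconds that are both non-movement and below 25 °C counts as non-wear when it lasts longer than 10 min. That reading, the per-second resolution, and the function name are assumptions; the full rule set is in the study's Fig. 8.

```python
import numpy as np

def detect_nonwear(movement, temperature_c,
                   temp_threshold=25.0, min_duration_s=600):
    """Flag non-wear seconds from movement flags and device temperature.

    movement:      boolean array, one value per second.
    temperature_c: device temperature in degrees Celsius, one per second.
    """
    movement = np.asarray(movement, dtype=bool)
    cold = np.asarray(temperature_c, dtype=float) < temp_threshold
    candidate = ~movement & cold

    # Locate contiguous runs of candidate seconds.
    padded = np.concatenate(([False], candidate, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    nonwear = np.zeros_like(candidate)
    for start, stop in zip(edges[::2], edges[1::2]):
        if stop - start > min_duration_s:     # longer than 10 min
            nonwear[start:stop] = True
    return nonwear

# Toy night: no movement, one 700-s cold run and one 300-s cold run.
movement = np.zeros(2000, dtype=bool)
temp = np.full(2000, 30.0)
temp[100:800] = 22.0
temp[1000:1300] = 22.0
nonwear = detect_nonwear(movement, temp)
```

Only the 700-s run exceeds the duration threshold, so the shorter cold run stays classified as wear time, for example a hand resting outside the blanket.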

For each subject-night, two endpoints are extracted from the binary classifier output to measure scratch duration and intensity. Information from both hands is pooled. Hourly scratch duration is computed as the average scratch duration, in seconds, per hour within the TSO window. Averaged scratch intensity is the mean dominant frequency, in Hz, of all scratch windows.
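As a rough illustration of the two endpoints, the sketch below takes per-window classifier outputs and dominant frequencies for one subject-night. It assumes that each predicted scratch window contributes its full 3 s to the duration, that hourly duration is total scratch seconds divided by TSO hours, and that pooling the two hands means concatenating their window lists; none of these details is spelled out in the source, and the function name is hypothetical.

```python
def nightly_endpoints(scratch_flags, dominant_freqs_hz, tso_hours, window_s=3):
    """Return (hourly scratch duration in s/h, mean dominant frequency in Hz)
    for one subject-night from pooled per-window outputs."""
    scratch_seconds = window_s * sum(scratch_flags)
    hourly_duration = scratch_seconds / tso_hours
    scratch_freqs = [f for flag, f in zip(scratch_flags, dominant_freqs_hz) if flag]
    # Intensity is undefined on a night with no detected scratch windows.
    mean_intensity = sum(scratch_freqs) / len(scratch_freqs) if scratch_freqs else None
    return hourly_duration, mean_intensity

# Toy night: five movement windows, three classified as scratch, 8-h TSO.
flags = [1, 0, 1, 1, 0]
freqs = [2.0, 0.5, 3.0, 4.0, 1.0]
duration, intensity = nightly_endpoints(flags, freqs, tso_hours=8)
```

Here the three scratch windows give 9 s of scratch over 8 h, and the intensity averages only the frequencies of those three windows.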

Analyses are applied to evaluate the digital endpoints’ accuracy, reliability, and consistency. First, the endpoints derived from the models are compared, using correlation, against the ground truth endpoints from the video annotations. Second, the association between the digital endpoints and PROs (SCORAD and ADSS) is assessed. Finally, the intraclass correlation coefficient (ICC) is used to assess test-retest reliability, that is, whether the digital endpoints are reproducible for the same patient across time points with a similar disease condition.
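To make the test-retest idea concrete, the following sketch computes a one-way random-effects ICC, often written ICC(1,1), from repeated endpoint values per patient. The source does not state which ICC form was used, so this choice, the function name, and the toy values are assumptions for illustration only.

```python
def icc_oneway(measurements):
    """One-way random-effects ICC(1,1) for a list of per-subject rows,
    each row holding the same number k of repeated measurements."""
    n = len(measurements)
    k = len(measurements[0])
    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    row_means = [sum(row) / k for row in measurements]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(measurements, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Toy data: three patients, two nights each (hypothetical endpoint values).
perfect = icc_oneway([[1, 1], [2, 2], [3, 3]])
shifted = icc_oneway([[1, 2], [2, 3], [3, 4]])
```

Identical repeated values give an ICC of 1, while any night-to-night variation within a patient lowers it, which is the sense in which the coefficient measures reproducibility.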

Platform and pipeline

Digital endpoints, such as the objective and quantifiable measures of nocturnal scratch duration and intensity, offer unique insights into AD patients’ disease states and life quality. However, the opportunity comes with its own challenges. The data collected in this study (and many others) include sensor signals from actigraphy, PROs from hand-held devices, and video annotation ground truth information for algorithm development and model building. Handling these data is a big data problem. For example, with a sampling frequency of 50 Hz, over 4 million 3-axial data points are collected from each of the accelerometer and gyroscope sensors in a single day to understand a patient’s daily activities. Developing measures is an iterative process that requires repeated trial-and-error cycles to develop, validate, and confirm our hypotheses. Discovering accurate and meaningful measures also requires data aggregation across different levels, which can often be tedious and error-prone. One emerging capability in the industry is a fit-for-purpose ecosystem to collect, visualize, analyze, and report digital datasets efficiently and effectively across the typical data life-cycle to develop the desired digital endpoints.
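The data volume quoted above follows from simple arithmetic, sketched here under the assumption of continuous 24-h recording with no gaps; the variable names are illustrative.

```python
SAMPLING_HZ = 50
SECONDS_PER_DAY = 24 * 60 * 60

# One tri-axial sample is recorded per sensor at every tick.
samples_per_sensor = SAMPLING_HZ * SECONDS_PER_DAY

# Each tri-axial sample holds three scalar values (x, y, z).
values_per_sensor = samples_per_sensor * 3

# Accelerometer plus gyroscope on one device.
values_both_sensors = values_per_sensor * 2
```

This gives 4.32 million tri-axial samples per sensor per day, or roughly 26 million scalar values per device per day once both sensors and all three axes are counted.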

Toward this goal, we developed a home-grown digital data platform at Eli Lilly & Company. The typical data flow enabled by this platform is shown in Fig. 9. Sensor signals and PRO data arrive at our platform’s storage layer through high-performance transfer tools and services. Once data lands, various processing pipelines are triggered, either in parallel (when there is no dependency) or in a particular order (i.e., chained pipelines when the sequence matters). One example is a Spark pipeline for raw data cleaning and quality checking, followed by concurrent feature extraction pipelines, each of which handles a specific sensor data type. Our most frequently used pipeline, for actigraphy data processing, involves the following modules: (1) calibration to local gravity; (2) resampling; (3) gravity removal; (4) noise removal; (5) data segmentation; and (6) feature calculation. Once data from all channels (e.g., accelerometer, gyroscope) are ready, a data aggregation pipeline launches to perform aggregation at different levels: hourly, daily, weekly, or per visit in the study.
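The distinction between parallel and chained pipelines can be illustrated with a small dependency graph. This is only a sketch of the scheduling idea, not the platform's implementation: the pipeline names are hypothetical, and it uses the standard-library `graphlib` module (Python 3.9 or later) rather than whatever orchestration the platform relies on.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline names; each maps to the set of pipelines it depends on.
PIPELINES = {
    "clean_and_qc": set(),
    "accelerometer_features": {"clean_and_qc"},
    "gyroscope_features": {"clean_and_qc"},
    "aggregate": {"accelerometer_features", "gyroscope_features"},
}

def execution_stages(dependencies):
    """Group pipelines into ordered stages; pipelines within one stage do not
    depend on each other and could therefore run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages

stages = execution_stages(PIPELINES)
```

In this toy graph, cleaning runs first, the two feature extraction pipelines form a single stage that can run concurrently, and aggregation starts only after both have finished.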

**Fig. 9: A typical data flow in the data platform.**

Data artifacts from running data pipelines are pushed automatically into the storage layer and the data portal (see, e.g., Fig. 10 for a complete sensor dataset collected from a participant in this study), making them immediately accessible through graphical web interfaces or programmatically via APIs for digital measure development and validation.

**Fig. 10: An example of data visualization.**

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

Upon request, and subject to review, Eli Lilly and Company will provide the aggregated and raw data that support the findings of this study.

Code availability

Analysis of datasets was performed with Python version 3.6+ and R version 4.2.2, and code will be shared upon request.

References

1. Lavery, M. J. et al. Nocturnal pruritus: prevalence, characteristics, and impact on itchyqol in a chronic itch population. Acta Derm. Venereol. 97, 513–515 (2017).
2. Martin, S. A. et al. The atopic dermatitis itch scale: development of a new measure to assess pruritus in patients with atopic dermatitis. J. Dermatol. Treat. 31, 484–490 (2020).
3. Lavery, M. J., Stull, C., Kinney, M. O. & Yosipovitch, G. Nocturnal pruritus: the battle for a peaceful night’s sleep. Int. J. Mol. Sci. 17, 425 (2016).
4. Podder, I., Mondal, H. & Kroumpouzos, G. Nocturnal pruritus and sleep disturbance associated with dermatologic disorders in adult patients. Int. J. Womens Dermatol. 7, 403–410 (2021).
5. Pantelopoulos, A. & Bourbakis, N. G. A survey on wearable sensor-based systems for health monitoring and prognosis. IEEE Trans. Syst. Man Cybern. C Appl. Rev. 40, 1–12 (2009).
6. Banaee, H., Ahmed, M. U. & Loutfi, A. Data mining for wearable sensors in health monitoring systems: a review of recent trends and challenges. Sensors 13, 17472–17500 (2013).
7. Majumder, S., Mondal, T. & Deen, M. J. Wearable sensors for remote health monitoring. Sensors 17, 130 (2017).
8. Rodgers, M. M., Pai, V. M. & Conroy, R. S. Recent advances in wearable sensors for health monitoring. IEEE Sens. J. 15, 3119–3126 (2014).
9. Yang, A. F. et al. Use of technology for the objective evaluation of scratching behavior: a systematic review. JAAD Int. 5, 19–32 (2021).
10. Feuerstein, J., Austin, D., Sack, R. & Hayes, T. L. Wrist actigraphy for scratch detection in the presence of confounding activities. In 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 3652–3655 (IEEE, 2011).
11. Petersen, J., Austin, D., Sack, R. & Hayes, T. L. Actigraphy-based scratch detection using logistic regression. IEEE J. Biomed. Health Inform. 17, 277–283 (2013).
12. Moreau, A. et al. Detection of nocturnal scratching movements in patients with atopic dermatitis using accelerometers and recurrent neural networks. IEEE J. Biomed. Health Inform. 22, 1011–1018 (2017).
13. Mahadevan, N. et al. Development of digital measures for nighttime scratch and sleep using wrist-worn wearable devices. NPJ Digit. Med. 4, 1–10 (2021).
14. van Hees, V. T. et al. Estimating sleep parameters using an accelerometer without sleep diary. Sci. Rep. 8, 1–11 (2018).
15. Lefebvre, G., Berlemont, S., Mamalet, F. & Garcia, C. Inertial gesture recognition with blstm-rnn. In Artificial Neural Networks, 393–410 (Springer, 2015).
16. Lee, J. et al. Itchtector: a wearable-based mobile system for managing itching conditions. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 893–905 (2017).
17. Boyle, M. The integration of angular velocity. Adv. Appl. Clifford Algebras 27, 2345–2374 (2017).
18. Carlsson, G. Topology and data. Bull. Am. Math. Soc. 46, 255–308 (2009).
19. Chazal, F. & Michel, B. An introduction to topological data analysis: fundamental and practical aspects for data scientists. Front. Artif. Intell. 4, 667963 (2021).
20. Zomorodian, A. Topological data analysis. Adv. Appl. Comput. Topol. 70, 1–39 (2012).
21. Carlsson, G. & Vejdemo-Johansson, M. Topological Data Analysis with Applications (Cambridge University Press, 2021).
22. Chung, Y.-M., Hu, C.-S., Lo, Y.-L. & Wu, H.-T. A persistent homology approach to heart rate variability analysis with an application to sleep-wake classification. Front. Physiol. 12, 202 (2021).
23. Karan, A. & Kaygun, A. Time series classification via topological data analysis. Expert Syst. Appl. 183, 115326 (2021).
24. Lawson, A., Chung, Y.-M. & Cruse, W. A hybrid metric based on persistent homology and its application to signal classification. In 2020 25th International Conference on Pattern Recognition (ICPR), 9944–9950 (IEEE, 2021).
25. Perea, J. A. & Harer, J. Sliding windows and persistence: an application of topological methods to signal analysis. Found. Comput. Math. 15, 799–838 (2015).
26. Van Hees, V. T. et al. Autocalibration of accelerometer data for free-living physical activity assessment using local gravity and temperature: an evaluation on four continents. J. Appl. Physiol. 117, 738–744 (2014).
27. Patania, A., Vaccarino, F. & Petri, G. Topological analysis of data. EPJ Data Sci. 6, 1–6 (2017).
28. Chung, Y.-M., Nikooienejad, A. & Zhang, B. Automatic eating behavior detection from wrist motion sensor using Bayesian, gradient boosting, and topological persistence methods. In IEEE Big Data 2022 (IEEE, 2022).
29. Chung, Y.-M., Hull, M., Lawson, A. & Pritchard, N. Gaussian persistence curves. Preprint at arXiv:2205.11353 (2022).
30. The GUDHI Project. GUDHI User and Reference Manual, 3.4.1 edn (GUDHI Editorial Board). https://gudhi.inria.fr/doc/3.4.1/ (2021).

Acknowledgements

This work was sponsored by Eli Lilly and Company Inc. We thank all internal and external colleagues who supported this study.

Author information

Authors and Affiliations

Eli Lilly & Company, Inc., Indianapolis, IN, USA
Ju Ji, Jordan Venderley, Hui Zhang, Mengjue Lei, Guangchen Ruan, Neel Patel, Yu-Min Chung, Regan Giesting & Leah Miller

Contributions

J.J., J.V., and H.Z. designed the study. J.J., J.V., M.L., and Y.-M.C. analyzed the data and built the algorithm. H.Z., G.R., N.P., R.G., and L.M. supported the platform and portal. J.J., H.Z., G.R., N.P., R.G., and L.M. collected the data. J.J., J.V., H.Z., N.P., and Y.-M.C. drafted the manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Ju Ji.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ji, J., Venderley, J., Zhang, H. et al. Assessing nocturnal scratch with actigraphy in atopic dermatitis patients. npj Digit. Med. 6, 72 (2023). https://doi.org/10.1038/s41746-023-00821-y

Received: 09 December 2022
Accepted: 04 April 2023
Published: 26 April 2023
DOI: https://doi.org/10.1038/s41746-023-00821-y
