In recent years, we have seen progress in the use of deep
Reinforcement Learning (RL) for sequential decision-making problems, such as games [
2,
67,
77], robotics [
18], autonomous driving [
57,
61,
78], quantitative trading strategies [
37], and healthcare systems [
46]. The systems assisted by RL have shown tremendous promise in games [
50,
67], robotics [
18], and natural language processing [
22,
51]. This can be attributed to the fact that RL systems in dynamic environments can learn from history and adapt better to the environment. In this article, we show that because the health of an electrode in a
Digital Microfluidic Biochip (DMFB) dynamically changes over time, we can utilize innovations in RL to ensure more reliable droplet transportation in DMFBs.
1.1 Digital Microfluidic Biochips
The rapid worldwide spread and impact of the COVID-19 virus has created an urgent need for reliable, accurate, and affordable testing on a massive scale. For example, the National Institutes of Health (NIH) has launched the Rapid Acceleration of Diagnostics (RADx) initiative to develop and implement technologies for COVID-19 testing [
54]. One of the most promising technologies for realizing this goal is digital microfluidics. A digital microfluidic biochip (DMFB) manipulates tiny amounts of fluids to automatically execute biochemical protocols for point-of-care clinical diagnosis with high efficiency and fast sample-to-result turnaround [
16,
65,
74]. Because of these characteristics, the RADx initiative has awarded grants to several biomedical diagnostic companies to develop microfluidic technologies that could dramatically increase testing capacity and throughput [
53,
56]. Other applications of DMFBs include screening of newborn infants [
33,
69], drug discovery [
38], and clinical diagnostics [
8,
62].
A DMFB consists of a two-dimensional electrode array that controls the movement of discrete liquid droplets. Upon actuation by a sequence of control voltages, the electrode array can perform a variety of fluidic operations, such as dispensing, mixing, and splitting [
7,
25]. Figure
1(a) shows a DMFB in which two droplets are present on a patterned electrode array. Nanoliter droplets on this platform are transported using the principle of
Electrowetting-on-Dielectric (EWOD) [
60]. This principle refers to the modulation of the interfacial tension between a conductive fluid and a solid electrode coated with a dielectric layer through the application of an electric field between them. See Figure
1(b).
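The EWOD effect is commonly summarized by the Young-Lippmann relation. The equation below is the standard textbook form and is not stated in the original text; the symbols are assumptions introduced here for illustration: \(\theta_0\) is the contact angle with no applied voltage, \(\theta(V)\) the contact angle at voltage \(V\), \(\varepsilon_0\) the vacuum permittivity, \(\varepsilon_r\) and \(d\) the relative permittivity and thickness of the dielectric layer, and \(\gamma\) the interfacial tension between the droplet and the surrounding medium.

```latex
\[
  \cos\theta(V) \;=\; \cos\theta_0 \;+\; \frac{\varepsilon_0\,\varepsilon_r}{2\,\gamma\,d}\,V^{2}
\]
```

Under this relation, a larger applied voltage lowers the contact angle over the actuated electrode, and the resulting wettability difference between neighboring electrodes is what pulls the droplet.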
Illumina commercialized digital microfluidics for sample preparation in 2015 through NeoPrep—a nearly $40K instrument that automates the preparation of up to 16 sequencing libraries at a time [
31]. GenMark has also deployed digital microfluidic technology for infectious disease testing [
59], and Baebies uses this technology to detect lysosomal storage diseases in newborns [
26].
However, reliability remains a major concern in DMFB systems. Illumina halted the sale of NeoPrep in February 2017. In its letter to customers, Illumina cited reliability issues observed in-house and far worse ones in the field. Even though biochips are tested after production, defects such as electrode degradation can occur during the system lifetime [
13,
71]. As the electrodes are actuated over time, two types of electrode degradation can occur: charge residual and charge trapping. Charge residual is caused by accumulated charges and can be mitigated by inserting grounding vectors [
58]. Charge trapping occurs when charges become trapped in the dielectric insulator; this phenomenon is irreversible [
5]. A consequence of electrode degradation is that droplet movement is impeded [
72]. An example of electrode degradation is shown in Figure
2. The figure shows two droplets on the biochip—one located on a degraded electrode. Two electrodes are actuated to move these droplets. However, one of these operations fails because the degraded electrode exerts additional surface-tension force. Detailed analyses of the relationship between electrode defects and fluidic operations can be found in the work of Drygiannakis et al. [
14].
1.2 Motivating RL-Guided Droplet Routing
In a typical use model for DMFBs [
70], a bioassay protocol with fluidic operations is obtained from biologists. Next, a synthesis technique maps these operations to groups of electrodes, referred to as fluidic modules, of a biochip to perform the required operations [
4]. A droplet has to be transported from one module to the next. The problem of determining droplet transportation paths between modules is referred to as
droplet routing. A number of droplet routing techniques have been proposed in the literature for bioassay applications [
73,
82,
86]. Su et al. [
73] proposed the first systematic droplet routing approach, which adopted the Lee algorithm and minimized the number of electrodes used for droplet routing. Xu and Chakrabarty [
82] proposed a droplet-routing-aware synthesis tool, which was based on parallel recombinative simulated annealing. Zhao and Chakrabarty [
86] proposed an integer-linear-programming-based method to co-optimize droplet routing and pin mapping.
However, these methods overlook the fact that droplet transportation may fail if the electrodes on the routing path degrade over time.
Example. Figure
3(a) shows a pre-computed routing path. We can see that this route is the shortest path between the start and the destination points. Droplet transportation can be successful because the biochip is healthy (i.e., no electrode degradation has occurred). Conversely, Figure
3(b) shows that droplet transportation to the destination fails because degraded electrodes exist in the associated path. If an online droplet router knows the locations of the degraded electrodes, it can generate another route that involves only healthy electrodes. An alternative route is shown in Figure
3(c); note that this is a shortest path, and it avoids electrodes that are degraded.
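The rerouting idea in this example can be illustrated with a Lee-style breadth-first search on a small electrode grid. This is a minimal sketch under stated assumptions, not the routing method of any cited work: the function name `shortest_route`, the 5 x 5 grid, and the position of the degraded electrode are all hypothetical, and the router is assumed to already know which electrodes are degraded.

```python
from collections import deque


def shortest_route(width, height, start, goal, degraded=frozenset()):
    """Lee-style breadth-first search over a width x height electrode grid.

    Returns a list of (x, y) cells from start to goal that avoids every
    cell in `degraded`, or None if no such route exists.
    """
    if start in degraded or goal in degraded:
        return None
    parent = {start: None}          # also serves as the visited set
    frontier = deque([start])
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            # Walk the parent links back to the start to recover the route.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            on_chip = 0 <= nx < width and 0 <= ny < height
            if on_chip and nxt not in degraded and nxt not in parent:
                parent[nxt] = cell
                frontier.append(nxt)
    return None


# Hypothetical scenario: healthy chip versus one degraded electrode on the path.
healthy = shortest_route(5, 5, (0, 0), (4, 0))
rerouted = shortest_route(5, 5, (0, 0), (4, 0), degraded={(2, 0)})
print(len(healthy) - 1, len(rerouted) - 1)  # number of droplet moves: 4 6
```

On the healthy grid the route runs straight along the bottom row; with electrode (2, 0) degraded, the search returns a detour that is two moves longer but touches only healthy electrodes. The sketch relies on the degraded set being known in advance, which is exactly the information that is unavailable in practice.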
In Figure
3, a different color is used to indicate the degraded electrodes. However, in reality, we cannot identify degraded electrodes by simple examination because degradation results from charge trapped in the insulator. When routing errors occur, simply replacing the degraded DMFB with a new one will not only increase the cost but also lead to undesirable wastage of biosamples. Droplets that are in the middle of an unfinished operation, such as mixing or diluting, need to be abandoned. The wastage of droplets is particularly undesirable in some applications, such as newborn screening [
32] and forensic analysis [
83], since the biosamples are limited in volume and availability. For example, in the newborn screening test provided by Baebies Inc., the entire screening test contains 10 to 20 different assays, and each assay needs 100 nl of dried blood spot extract [
32]. Thus, a newborn screening test needs at least 1,000 nl of dried blood spot extract, which requires 200 to 300 \(\mu\)L (4–6 drops) of whole blood [
3]. Prior work has led to synthesis methods that prevent excessive usage of a few electrodes by evenly distributing fluidic operations to multiple electrodes [
5,
88]. However, these methods can only postpone the occurrence of electrode degradation, which still happens as electrodes are actuated over time. If such electrode degradation happens during bioassay execution and a route is associated with degraded electrodes, bioassay execution will fail, and it will need to be re-executed on a new biochip [
29]. Furthermore, the locations of degraded electrodes may vary from biochip to biochip because the electrode degradation process is affected by geometric variations and different electrode actuation times [
24].
Several methods have been proposed to perform error recovery when routing tasks fail [
1,
40,
66]. However, these methods focus on recovery after routing failures and do not proactively reduce the occurrence of erroneous behaviors caused by electrode degradation. Recently, an RL-based routing framework was developed to identify degradation-aware routing strategies for
Micro-Electrode-Dot-Array (MEDA) biochips [
15]. However, this method cannot be used for DMFBs due to the inherent difference between DMFBs and MEDA biochips: MEDA biochips provide the real-time degradation status of each electrode using built-in sensing circuits. This is not the case for non-MEDA DMFBs. In this work, we adopt RL techniques to respond to dynamically degrading environments, which is not possible with existing offline routing methods.
Numerous papers have been published in recent years to advance applications that leverage RL theory [
9,
11,
49]. Our work aims to introduce RL to a new application, namely the droplet routing problem on DMFBs. We target an RL formulation for the droplet routing problem to address the dynamic degradation of electrodes. An RL-based droplet router addresses the electrode degradation problem and ensures reliable bioassay execution in three ways. First, it provides real-time decisions for droplet routing. Second, it can “learn” from the prior experience associated with electrodes that start malfunctioning; the droplet router can therefore generate routing paths that include only healthy electrodes. Third, even though the degradation processes may differ for two DMFBs, the router can generate different, yet reliable, routing paths on distinct DMFBs for the same routing objective.
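To make the "learning from prior experience" idea concrete, the following is a minimal tabular Q-learning sketch. It is not the formulation proposed in this article: the environment model, the reward of -1 per actuation, the hyperparameters, and all names (`step`, `train`, `greedy_route`) are assumptions chosen for illustration. The key assumption is that the router never observes which electrodes are degraded; it only sees that a move onto a degraded electrode fails and leaves the droplet in place.

```python
import random

ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # right, left, up, down


def step(pos, action, size, degraded):
    """Hypothetical environment: a move onto a degraded or off-chip cell fails."""
    nxt = (pos[0] + action[0], pos[1] + action[1])
    on_chip = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    if not on_chip or nxt in degraded:
        return pos  # droplet stays put; the router only observes the failed move
    return nxt


def train(size, start, goal, degraded, episodes=3000, alpha=0.5,
          gamma=0.95, epsilon=0.3, max_steps=100, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {((x, y), a): 0.0
         for x in range(size) for y in range(size) for a in range(n)}
    for _ in range(episodes):
        pos = start
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(n)                          # explore
            else:
                a = max(range(n), key=lambda i: q[(pos, i)])  # exploit
            nxt = step(pos, ACTIONS[a], size, degraded)
            # Each actuation costs 1; the goal is terminal, so its future value is 0.
            future = 0.0 if nxt == goal else max(q[(nxt, i)] for i in range(n))
            q[(pos, a)] += alpha * (-1.0 + gamma * future - q[(pos, a)])
            pos = nxt
            if pos == goal:
                break
    return q


def greedy_route(q, size, start, goal, degraded, max_steps=50):
    """Follow the learned greedy policy and record the visited cells."""
    route, pos = [start], start
    for _ in range(max_steps):
        if pos == goal:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q[(pos, i)])
        pos = step(pos, ACTIONS[a], size, degraded)
        route.append(pos)
    return route


degraded = {(2, 0)}  # hidden from the learner; used only by the environment
q_table = train(5, (0, 0), (4, 0), degraded)
route = greedy_route(q_table, 5, (0, 0), (4, 0), degraded)
print(route)
```

Because a failed move costs an actuation without making progress, the learned values steer the droplet around the degraded electrode without any explicit degradation sensing. Training the same code with a different `degraded` set yields a different route for the same start and goal, which mirrors the third point above.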