WO2024211659A1 - DNA writing device - Google Patents

DNA writing device

Info

Publication number
WO2024211659A1
Authority
WO
WIPO (PCT)
Prior art keywords
substrate
fluid
nucleic acid
module
collection
Prior art date
Application number
PCT/US2024/023201
Other languages
English (en)
Inventor
Sean MIHM
James Loomis
Original Assignee
Catalog Technologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Catalog Technologies, Inc.
Publication of WO2024211659A1

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J19/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J19/0046 Sequential or parallel reactions, e.g. for the synthesis of polypeptides or polynucleotides; Apparatus and devices for combinatorial chemistry or for making molecular arrays
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00277 Apparatus
    • B01J2219/00279 Features relating to reactor vessels
    • B01J2219/00306 Reactor vessels in a multiple arrangement
    • B01J2219/00324 Reactor vessels in a multiple arrangement the reactor vessels or wells being arranged in plates moving in parallel to each other
    • B01J2219/00326 Movement by rotation
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00583 Features relative to the processes being carried out
    • B01J2219/00603 Making arrays on substantially continuous surfaces
    • B01J2219/00605 Making arrays on substantially continuous surfaces the compounds being directly bound or immobilised to solid supports
    • B01J2219/00612 Making arrays on substantially continuous surfaces the compounds being directly bound or immobilised to solid supports the surface being inorganic
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00583 Features relative to the processes being carried out
    • B01J2219/00603 Making arrays on substantially continuous surfaces
    • B01J2219/00659 Two-dimensional arrays
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00583 Features relative to the processes being carried out
    • B01J2219/00603 Making arrays on substantially continuous surfaces
    • B01J2219/00675 In-situ synthesis on the substrate
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/0068 Means for controlling the apparatus of the process
    • B01J2219/00686 Automatic
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00718 Type of compounds synthesised
    • B01J2219/0072 Organic compounds
    • B01J2219/00722 Nucleotides

Definitions

  • Nucleic acid digital data storage is a stable approach for encoding and storing information for long periods of time, with data stored at higher densities than magnetic tape or hard drive storage systems. Additionally, digital data stored in nucleic acid molecules that are stored in cold and dry conditions can be retrieved as long as 60,000 years later or longer.
  • nucleic acid molecules may be sequenced.
  • nucleic acid digital data storage may be an ideal method for storing data that is not frequently accessed but may have a high volume of information to be stored or archived for long periods of time.
  • nucleic acid molecules that store digital information.
  • nucleic acid molecules e.g., identifiers
  • component nucleic acid molecules e.g., components
  • the components are printed or dispensed at the same location (e.g., coordinate) on the substrate so as to be co-located.
  • the components are configured to self-assemble, or otherwise sort themselves in a predetermined order, to form the identifier nucleic acid molecules (identifiers).
  • Each identifier corresponds to a particular symbol value (e.g., bit or series of bits), or that symbol’s position (e.g., rank or address), in a string of symbols (e.g., a bitstream).
  • the system prints or dispenses a reaction mix onto the same location, which causes the components to align themselves to form identifiers.
  • the system may alternatively or additionally provide a condition necessary to physically link the components, such as a particular temperature that causes the components to align.
  • multiple identifiers may be combined into a pool of identifiers, where the pool is representative of at least a portion of the entire string of symbols.
  • a system for assembling an identifier nucleic acid molecule encoding digital information includes a substrate comprising a plurality of coordinates.
  • the substrate is removably mounted on a carrier.
  • the system includes a writing module configured to dispense a droplet of a solution including a nucleic acid molecule onto one of the plurality of coordinates on the substrate.
  • the system includes a collection module configured to collect droplets from the plurality of coordinates.
  • the system includes a cleaning module configured to clean the substrate after collection of the droplets.
  • the system includes a control system configured to actuate the carrier to transfer the substrate between the writing module, the collection module, and the cleaning module.
  • a method for assembling an identifier nucleic acid molecule encoding digital information includes actuating, using a control system, a substrate comprising a plurality of coordinates.
  • the substrate is removably mounted on a carrier.
  • the method includes actuating, using the control system, a writing module to dispense a droplet of a solution including a nucleic acid molecule onto one of the plurality of coordinates on the substrate.
  • the method includes actuating, using the control system, a collection module to collect droplets from the plurality of coordinates.
  • the method includes actuating, using the control system, a cleaning module to clean the substrate after collection of the droplets.
  • the method includes transferring the substrate between the writing module, the collection module, and the cleaning module.
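
The method steps above (write, collect, clean, with the substrate transferred between modules) can be sketched as a simple control loop. This is a minimal sketch only: the `Substrate` and `ControlSystem` classes, the station names, and the `run_cycle` helper are hypothetical names introduced for illustration and are not taken from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Substrate:
    """Hypothetical model of a substrate with addressable coordinates."""
    droplets: dict = field(default_factory=dict)  # coordinate -> list of solutions
    station: str = "writing"


class ControlSystem:
    """Hypothetical controller that moves a carrier-mounted substrate between modules."""

    def __init__(self, substrate):
        self.substrate = substrate
        self.log = []

    def _require(self, station):
        # Each module may only act while the substrate is at its station.
        if self.substrate.station != station:
            raise RuntimeError(f"substrate is not at the {station} module")

    def transfer(self, station):
        self.substrate.station = station
        self.log.append(f"transfer:{station}")

    def write(self, coordinate, solution):
        self._require("writing")
        # Several solutions dispensed at one coordinate become co-located.
        self.substrate.droplets.setdefault(coordinate, []).append(solution)

    def collect(self):
        self._require("collection")
        # Pool the contents of every coordinate, in coordinate order.
        return [s for c in sorted(self.substrate.droplets)
                for s in self.substrate.droplets[c]]

    def clean(self):
        self._require("cleaning")
        self.substrate.droplets.clear()


def run_cycle(control, jobs):
    """Run one write -> collect -> clean cycle and return the collected pool."""
    control.transfer("writing")
    for coordinate, solution in jobs:
        control.write(coordinate, solution)
    control.transfer("collection")
    pool = control.collect()
    control.transfer("cleaning")
    control.clean()
    return pool
```

For example, `run_cycle(ControlSystem(Substrate()), [((0, 0), "X1"), ((0, 0), "Y1")])` dispenses two solutions at the same coordinate, pools them, and leaves the substrate empty for reuse.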
  • FIG. 1 illustrates an example system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 2A depicts a schematic of an example carrier serving as a platform for one or more implements (e.g., mechanical and/or electronic components discussed below) for handling nucleic acids suspended in a liquid.
  • FIG. 2B depicts a schematic of an example configuration of a set of components of a system for conveying nucleic acid droplets / reaction spots in a DNA writer system.
  • FIG. 3A and FIG. 3B depict a schematic of an example configuration of a set of components of a system for conveying nucleic acid droplets / reaction spots in a DNA writer system.
  • FIG. 4 depicts a schematic of an example working principle of an example system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 5 depicts a diagram illustrating various example components of an example system for assembling an identifier nucleic acid molecule encoding digital information and their respective operational characteristics and requirements.
  • FIG. 6 depicts a diagram illustrating an example general communication relationship between the different components of an example system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 7 depicts a diagram illustrating the rotating elements of an example system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 8 depicts a diagram illustrating electronic communication relationships between various rotating elements of an example system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 9 depicts a diagram illustrating electric connections between various rotating elements of an example system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 10 depicts a diagram illustrating an example configuration of a system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 11 depicts a top view schematic of an example slice that can be used with the technologies described in this specification.
  • FIG. 12A and FIG. 12B depict perspective view schematics of an example slice and plate that can be used with the technologies described in this specification.
  • FIG. 13A and FIG. 13B depict a top view schematic of an example slice and plate mounted on a carrier that can be used with the technologies described in this specification.
  • FIG. 14 depicts a diagram illustrating example features of the modules of a system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 15 depicts a diagram illustrating an example writing module of a system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 16 depicts a diagram illustrating an example printhead arrangement of a writing module of a system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 17 depicts a diagram illustrating an example workflow for an example collection module of a system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 18 depicts a perspective view schematic of an example collection module on an example slice of a system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 19 depicts a diagram illustrating an example collection module of a system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 20 depicts a diagram illustrating an example separator for use in a system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 21A to FIG. 21E depict diagrams illustrating an operation of an example separator for use in a system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 22A depicts a perspective view of a rendering of an example system for assembling an identifier nucleic acid molecule encoding digital information.
  • FIG. 22B depicts a perspective view of a rendering of the example system showing modules and carrier of the system.
  • FIG. 22C depicts another perspective view of a rendering of the example system.
  • FIG. 23 depicts a perspective view of a rendering of an array of example systems for assembling an identifier nucleic acid molecule encoding digital information.
  • component generally refers to a nucleic acid sequence.
  • a component may be a distinct nucleic acid sequence.
  • a component may be concatenated or assembled with one or more other components to generate other nucleic acid sequences or molecules.
  • layer generally refers to a group or pool of components. Each layer may comprise a set of distinct components such that the components in one layer are different from the components in another layer. Components from one or more layers may be assembled to generate one or more identifiers.
  • identifier generally refers to a nucleic acid molecule or a nucleic acid sequence that represents the position and value of a bit-string within a larger bit-string. More generally, an identifier may refer to any object that represents or corresponds to a symbol in a string of symbols. In some implementations, identifiers may comprise one or multiple concatenated components.
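
The relationship between layers, components, and identifiers can be illustrated with a short sketch. It assumes that an identifier is formed by taking exactly one component from each layer and concatenating them in layer order; the component labels (`X1`, `Y1`, and so on) and the function name `combinatorial_space` are hypothetical placeholders, not actual sequences or names from the specification.

```python
from itertools import product


def combinatorial_space(layers):
    """List every identifier label formed by one component per layer.

    Assumption: components are concatenated in layer order.
    """
    return ["".join(combo) for combo in product(*layers)]


# Two layers with 2 and 3 distinct components give 2 * 3 = 6 identifiers.
layers = [["X1", "X2"], ["Y1", "Y2", "Y3"]]
space = combinatorial_space(layers)
print(len(space), space)
```

The number of possible identifiers is the product of the layer sizes, so a small set of components can address a much larger identifier space.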
  • a nucleic acid may include one or more subunits selected from adenosine (A), cytosine (C), guanine (G), thymine (T), and uracil (U), or variants thereof.
  • a nucleotide can include A, C, G, T, or U, or variants thereof.
  • a nucleotide can include any subunit that can be incorporated into a growing nucleic acid strand.
  • Such subunit can be A, C, G, T, or U, or any other subunit that may be specific to one or more complementary A, C, G, T, or U, or complementary to a purine (i.e., A or G, or variant thereof) or pyrimidine (i.e., C, T, or U, or variant thereof).
  • a nucleic acid may be single-stranded or double-stranded. In some cases, a nucleic acid is circular.
  • nucleic acid molecule or “nucleic acid sequence,” as used herein, generally refer to a polymeric form of nucleotides, or polynucleotide, that may have various lengths, either deoxyribonucleotides (DNA) or ribonucleotides (RNA), or analogs thereof.
  • nucleic acid sequence may refer to the alphabetical representation of a polynucleotide; alternatively, the term may be applied to the physical polynucleotide itself. This alphabetical representation can be input into databases in a computer having a central processing unit and used for mapping nucleic acid sequences or nucleic acid molecules to symbols, or bits, encoding digital information.
  • Nucleic acid sequences or oligonucleotides may include one or more non-standard nucleotide(s), nucleotide analog(s) and/or modified nucleotides.
  • oligonucleotide generally refers to a single-stranded nucleic acid sequence, and is typically composed of a specific sequence of four nucleotide bases: adenine (A), cytosine (C), guanine (G), and thymine (T), or uracil (U) when the polynucleotide is RNA.
  • A adenine
  • C cytosine
  • G guanine
  • T thymine
  • U uracil
  • modified nucleotides include, but are not limited to, diaminopurine, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl)uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-
  • primer generally refers to a strand of nucleic acid that serves as a starting point for nucleic acid synthesis, such as polymerase chain reaction (PCR).
  • PCR polymerase chain reaction
  • an enzyme that catalyzes replication starts replication at the 3'-end of a primer attached to the DNA sample and copies the opposite strand.
  • polymerase or “polymerase enzyme,” as used herein, generally refers to any enzyme capable of catalyzing a polymerase reaction.
  • polymerases include, without limitation, a nucleic acid polymerase.
  • the polymerase can be naturally occurring or synthesized.
  • An example polymerase is a Φ29 (phi29) polymerase or derivative thereof.
  • a transcriptase or a ligase is used (i.e., enzymes which catalyze the formation of a bond) in conjunction with polymerases or as an alternative to polymerases to construct new nucleic acid sequences.
  • Examples of polymerases include a DNA polymerase, an RNA polymerase, a thermostable polymerase, a wild-type polymerase, a modified polymerase, E. coli DNA polymerase I, T7 DNA polymerase, bacteriophage T4 DNA polymerase, Φ29 (phi29) DNA polymerase, Taq polymerase, Tth polymerase, Tli polymerase, Pfu polymerase, Pwo polymerase, VENT polymerase, DEEPVENT polymerase, Ex-Taq polymerase, LA-Taq polymerase, Sso polymerase, Poc polymerase, Pab polymerase, Mth polymerase, ES4 polymerase, Tru polymerase, Tac polymerase, Tne polymerase, Tma polymerase, Tca polymerase, Tih polymerase, Tfi polymerase, Platinum Taq polymerases, Tbr polymerase, Tfl polymerase, Pfuturbo polymerase, Pyrobest
  • Digital information, such as computer data, in the form of binary code can comprise a sequence or string of symbols.
  • a binary code may encode or represent text or computer processor instructions using, for example, a binary number system having two binary symbols, typically 0 and 1, referred to as bits.
  • Digital information may be represented in the form of non-binary code which can comprise a sequence of non-binary symbols. Each encoded symbol can be re-assigned to a unique bit string (or “byte”), and the unique bit string or byte can be arranged into strings of bytes or byte streams.
  • a bit value for a given bit can be one of two symbols (e.g., 0 or 1).
  • a byte, which can comprise a string of N bits, can have a total of 2^N unique byte-values.
  • a byte comprising 8 bits can produce a total of 2^8 or 256 possible unique byte-values, and each of the 256 bytes can correspond to one of 256 possible distinct symbols, letters, or instructions which can be encoded with the bytes.
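
The bit and byte arithmetic above can be checked with a few lines of code. This is an illustrative sketch; the helper names `byte_values` and `to_bitstring` are hypothetical, and the choice of 8 bits per byte follows the example in the text.

```python
def byte_values(n_bits):
    """Number of distinct values representable by a string of n_bits bits."""
    return 2 ** n_bits


def to_bitstring(data):
    """Render a bytes object as a string of '0' and '1' symbols, 8 bits per byte."""
    return "".join(f"{byte:08b}" for byte in data)


print(byte_values(8))      # 256
print(to_bitstring(b"A"))  # 01000001
```

The resulting bit string is the kind of symbol string that the encoding methods described in this document take as input.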
  • Raw data e.g., text files and computer instructions
  • Zip files, or compressed data files comprising raw data, can also be stored in byte streams. These files can be stored as byte streams in a compressed form and then decompressed into raw data before being read by the computer.
  • Information can be stored in nucleic acid sequences.
  • methods to encode digital information into identifiers which are built from one or more components.
  • Each component can comprise a nucleic acid sequence.
  • a print-based system, known as the DNA Writer, may be used to collocate and assemble components for construction of identifiers.
  • a DNA Writer may comprise one system, a printer which dispenses both the components and reaction mix onto a substrate.
  • two or more subsystems may be attached and dependent on each other for individual function. In other implementations, the two or more subsystems may be disjoint and capable of functioning independently.
  • a method for encoding information into nucleic acid sequences may comprise (a) translating the information into a string of symbols, (b) mapping the string of symbols to a plurality of identifiers, and (c) constructing an identifier library comprising at least a subset of the plurality of identifiers.
  • An individual identifier of the plurality of identifiers may comprise one or more components.
  • An individual component of the one or more components may comprise a nucleic acid sequence.
  • Each symbol at each position in the string of symbols may correspond to a distinct identifier.
  • the individual identifier may correspond to an individual symbol at an individual position in the string of symbols.
  • one symbol at each position in the string of symbols may correspond to the absence of an identifier.
  • a string of binary symbols e.g., bits
  • each occurrence of ‘0’ may correspond to the absence of an identifier.
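
The mapping from a string of symbols to a set of identifiers can be sketched as follows. The sketch assumes a binary string, a two-layer identifier space, and an ordering in which bit position i maps to the i-th identifier in the product of the layers; these choices, the component labels, and the function name `encode` are illustrative assumptions and not the specific mapping of the specification.

```python
from itertools import product


def encode(bits, layers):
    """Return the set of identifier labels that represents a bit string.

    Assumed ordering: position i maps to the i-th identifier in the product
    of the layers. A '1' means the identifier is constructed; a '0' is
    represented by the absence of the identifier.
    """
    space = ["".join(combo) for combo in product(*layers)]
    if len(bits) > len(space):
        raise ValueError("bit string is longer than the identifier space")
    return {space[i] for i, bit in enumerate(bits) if bit == "1"}


layers = [["X1", "X2"], ["Y1", "Y2", "Y3"]]  # hypothetical component labels
library = encode("101111", layers)
print(sorted(library))
```

Under these assumptions, the bit string `101111` yields the library {X1Y1, X1Y3, X2Y1, X2Y2, X2Y3}; the single `0` at the second position is represented by the absence of X1Y2.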
  • a method for nucleic acid-based computer data storage may comprise (a) receiving computer data, (b) synthesizing nucleic acid molecules comprising nucleic acid sequences encoding the computer data, and (c) storing the nucleic acid molecules having the nucleic acid sequences.
  • the computer data may be encoded in at least a subset of nucleic acid molecules synthesized and not in a sequence of each of the nucleic acid molecules.
  • the present disclosure provides methods for writing and storing information in nucleic acid sequences.
  • the method may comprise (a) receiving or encoding a virtual identifier library that represents information, (b) physically constructing the identifier library, and (c) storing one or more physical copies of the identifier library in one or more separate locations.
  • An individual identifier of the identifier library may comprise one or more components.
  • An individual component of the one or more components may comprise a nucleic acid sequence.
  • a method for nucleic acid-based computer data storage may comprise (a) receiving computer data, (b) synthesizing a nucleic acid molecule comprising at least one nucleic acid sequence encoding the computer data, and (c) storing the nucleic acid molecule comprising the at least one nucleic acid sequence. Synthesizing the nucleic acid molecule may be in the absence of base-by-base nucleic acid synthesis.
  • a method for writing and storing information in nucleic acid sequences may comprise (a) receiving or encoding a virtual identifier library that represents information, (b) physically constructing the identifier library, and (c) storing one or more physical copies of the identifier library in one or more separate locations.
  • An individual identifier of the identifier library may comprise one or more components.
  • An individual component of the one or more components may comprise a nucleic acid sequence.
  • a method for reading information encoded in nucleic acid sequences may comprise (a) providing an identifier library, (b) identifying the identifiers present in the identifier library, (c) generating a string of symbols from the identifiers present in the identifier library, and (d) compiling information from the string of symbols.
  • An identifier library may comprise a subset of a plurality of identifiers from a combinatorial space. Each individual identifier of the subset of identifiers may correspond to an individual symbol in a string of symbols.
  • An identifier may comprise one or more components.
  • a component may comprise a nucleic acid sequence.
  • Information may be written into one or more identifier libraries as described elsewhere herein. Identifiers may be constructed using any method described elsewhere herein. Stored data may be copied and accessed using any method described elsewhere herein.
  • the identifier may comprise information relating to a location of the encoded symbol, a value of the encoded symbol, or both the location and the value of the encoded symbol.
  • An identifier may include information relating to a location of the encoded symbol and the presence or absence of the identifier in an identifier library may indicate the value of the symbol.
  • the presence of an identifier in an identifier library may indicate a first symbol value (e.g., first bit value) in a binary string and the absence of an identifier in an identifier library may indicate a second symbol value (e.g., second bit value) in a binary string.
  • basing a bit value on the presence or absence of an identifier in an identifier library may reduce the number of identifiers assembled and, therefore, reduce the write time.
  • the presence of an identifier may indicate a bit value of ‘1’ at the mapped location and the absence of an identifier may indicate a bit value of ‘0’ at the mapped location.
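
Reading a bit string back from the presence or absence of identifiers can be sketched in the same spirit. The sketch assumes that the set of identifiers detected in the library is already known (for example, from sequencing), that the identifier space is the product of two layers, and that position i maps to the i-th identifier in that product; the labels and the function name `decode` are hypothetical.

```python
from itertools import product


def decode(detected, layers, length):
    """Rebuild a bit string from the set of identifiers detected in a library.

    Assumed ordering: position i maps to the i-th identifier in the product
    of the layers. Presence reads as '1' and absence reads as '0'.
    """
    space = ["".join(combo) for combo in product(*layers)]
    if length > len(space):
        raise ValueError("requested length exceeds the identifier space")
    return "".join("1" if space[i] in detected else "0" for i in range(length))


layers = [["X1", "X2"], ["Y1", "Y2", "Y3"]]  # hypothetical component labels
detected = {"X1Y1", "X1Y3", "X2Y1", "X2Y2", "X2Y3"}
print(decode(detected, layers, 6))  # 101111
```

Only the identifiers mapped to ‘1’ bits need to be assembled, which is why a string with fewer ‘1’ bits requires fewer identifiers.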
  • decoding nucleic acid encoded data may be achieved by base-by-base sequencing of the nucleic acid strands, such as Illumina® Sequencing, or by utilizing a sequencing technique that indicates the presence or absence of specific nucleic acid sequences, such as fragmentation analysis by capillary electrophoresis.
  • the sequencing may employ the use of reversible terminators.
  • the sequencing may employ the use of natural or non-natural (e.g., engineered) nucleotides or nucleotide analogs.
  • decoding nucleic acid sequences may be performed using a variety of analytical techniques, including but not limited to, any methods that generate optical, electrochemical, or chemical signals.
  • polymerase chain reaction (PCR)
  • digital PCR
  • Sanger sequencing
  • high-throughput sequencing
  • sequencing-by-synthesis
  • single-molecule sequencing
  • sequencing-by-ligation
  • RNA-Seq (Illumina)
  • next generation sequencing
  • Digital Gene Expression (Helicos)
  • Clonal Single MicroArray (Solexa)
  • shotgun sequencing
  • Maxam-Gilbert sequencing
  • massively-parallel sequencing
  • the efficiency of encoding and decoding data may be increased by recoding input bit strings to enable the use of fewer nucleic acid molecules. For example, if an input string is received with a high occurrence of ‘111’ substrings, which may map to three nucleic acid molecules (e.g., identifiers) with an encoding method, it may be recoded to a ‘000’ substring which may map to a null set of nucleic acid molecules. The alternate input substring of ‘000’ may also be recoded to ‘111’. This method of recoding may reduce the total amount of nucleic acid molecules used to encode the data because there may be a reduction in the number of ‘1’s in the dataset.
  • the total size of the dataset may be increased to accommodate a codebook that specifies the new mapping instructions.
  • An alternative method for increasing encoding and decoding efficiency may be to recode the input string to reduce the variable length. For example, ‘11’ may be recoded to ‘00’ which may shrink the size of the dataset and reduce the number of ‘1’s in the dataset.
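
The ‘111’ and ‘000’ swap described above can be sketched as a block-wise recoding. The text does not say how substrings are delimited, so the sketch assumes fixed-width, aligned 3-bit blocks; the function name `recode_blocks` and the sample input are hypothetical.

```python
def recode_blocks(bits, block=3, a="111", b="000"):
    """Swap every aligned block equal to `a` with `b`, and vice versa.

    Assumption: the input is split into fixed-width, aligned blocks.
    Applying the function twice restores the original string, so the
    same codebook entry serves for encoding and decoding.
    """
    out = []
    for i in range(0, len(bits), block):
        chunk = bits[i:i + block]
        if chunk == a:
            out.append(b)
        elif chunk == b:
            out.append(a)
        else:
            out.append(chunk)
    return "".join(out)


original = "111111000101"
recoded = recode_blocks(original)
print(recoded, original.count("1"), recoded.count("1"))
```

For this input, which has more ‘111’ blocks than ‘000’ blocks, the recoded string is `000000111101` and the number of ‘1’ bits drops from 8 to 5. If ‘000’ blocks were more common than ‘111’ blocks, the swap would increase the count instead, so it would only be applied when it helps.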
  • nucleic acid sequences e.g., identifiers
  • nucleic acid sequences comprising a majority of nucleotides that are easier to call and detect based on their optical, electrochemical, chemical, or physical properties.
  • Engineered nucleic acid sequences may be either single or double stranded.
  • Engineered nucleic acid sequences may include synthetic or unnatural nucleotides that improve the detectable properties of the nucleic acid sequence.
  • Engineered nucleic acid sequences may comprise all natural nucleotides, all synthetic or unnatural nucleotides, or a combination of natural, synthetic, and unnatural nucleotides.
  • Synthetic nucleotides may include nucleotide analogues such as peptide nucleic acids, locked nucleic acids, glycol nucleic acids, and threose nucleic acids.
  • Unnatural nucleotides may include dNaM, an artificial nucleoside containing a 3-methoxy-2-naphthyl group, and d5SICS, an artificial nucleoside containing a 6-methylisoquinoline-1-thione-2-yl group.
  • Example chemical moieties may include, but are not limited to, fluorescent moieties, chemiluminescent moieties, acidic or basic moieties, hydrophobic or hydrophilic moieties, and moieties that alter oxidation state or reactivity of the nucleic acid sequence.
  • a sequencing platform may be designed specifically for decoding and reading information encoded into nucleic acid sequences.
  • the sequencing platform may be dedicated to sequencing single or double stranded nucleic acid molecules.
  • the sequencing platform may decode nucleic acid encoded data by reading individual bases (e.g., base-by-base sequencing) or by detecting the presence or absence of an entire nucleic acid sequence (e.g., component) incorporated within the nucleic acid molecule (e.g., identifier).
  • the sequencing platform may include the use of promiscuous reagents, increased read lengths, and the detection of specific nucleic acid sequences by the addition of detectable chemical moieties.
  • the use of more promiscuous reagents during sequencing may increase reading efficiency by enabling faster base calling which in turn may decrease the sequencing time.
  • the use of increased read lengths may enable longer sequences of encoded nucleic acids to be decoded per read.
  • the addition of detectable chemical moiety tags may enable the detection of the presence or absence of a nucleic acid sequence by the presence or absence of a chemical moiety. For example, each nucleic acid sequence encoding a bit of information may be tagged with a chemical moiety that generates a unique optical, electrochemical, or chemical signal. The presence or absence of that unique optical, electrochemical, or chemical signal may indicate a ‘0’ or a ‘1’ bit value.
  • the nucleic acid sequence may comprise a single chemical moiety or multiple chemical moieties.
  • the chemical moiety may be added to the nucleic acid sequence prior to use of the nucleic acid sequence to encode data. Alternatively or in addition to, the chemical moiety may be added to the nucleic acid sequence after encoding the data, but prior to decoding the data.
  • the chemical moiety tag may be added directly to the nucleic acid sequence or the nucleic acid sequence may comprise a synthetic or unnatural nucleotide anchor and the chemical moiety tag may be added to that anchor.
  • an original identifier library with identifiers {X1Y1, X1Y3, X2Y1, X2Y2, X2Y3} may be supplemented to include checksums to become the following pool: {X1Y1, X1Y3, X2Y1, X2Y2, X2Y3, X1Y6, X2Y7, X3Y4, X6Y1, X5Y2, X6Y3}.
  • the checksum sequences may also be used for error correction. For example, absence of X1Y1 from the above dataset and the presence of X1Y6 and X6Y1 may enable inference that the X1Y1 nucleic acid molecule is missing from the dataset.
  • the checksum sequences may indicate whether identifiers are missing from a sampling of the identifier library or an accessed portion of the identifier library. In the case of a missing checksum sequence, access methods such as PCR or affinity tagged probe hybridization may amplify and/or isolate it. In some implementations, the checksums may not be supplemental nucleic acid sequences. The checksums may be coded directly into the information such that they are represented by identifiers.
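
The checksum example above is consistent with one simple scheme, sketched here as an inference and not as the method of the specification. The sketch assumes a 3 x 3 data space (X1 to X3, Y1 to Y3), that the count c of identifiers in row Xi is recorded as the identifier XiY(4 + c), and that the count c in column Yj is recorded as X(4 + c)Yj. Identifiers are modelled as (x, y) index pairs, and the function names are hypothetical.

```python
def row_col_checksums(data, n_x, n_y):
    """Checksum identifiers recording how many data identifiers each row and column holds.

    Inferred scheme (an assumption): a row count c for Xi is stored as
    (i, n_y + 1 + c); a column count c for Yj is stored as (n_x + 1 + c, j).
    """
    rows = {(x, n_y + 1 + sum((x, y) in data for y in range(1, n_y + 1)))
            for x in range(1, n_x + 1)}
    cols = {(n_x + 1 + sum((x, y) in data for x in range(1, n_x + 1)), y)
            for y in range(1, n_y + 1)}
    return rows | cols


def infer_missing(observed, n_x, n_y):
    """Return data identifiers that lie on both a short row and a short column."""
    data = {(x, y) for x, y in observed if x <= n_x and y <= n_y}
    short_rows, short_cols = [], []
    for x in range(1, n_x + 1):
        actual = sum((x, y) in data for y in range(1, n_y + 1))
        for y in range(n_y + 1, 2 * n_y + 2):  # possible row checksum indices
            if (x, y) in observed and y - n_y - 1 > actual:
                short_rows.append(x)
    for y in range(1, n_y + 1):
        actual = sum((x, y) in data for x in range(1, n_x + 1))
        for x in range(n_x + 1, 2 * n_x + 2):  # possible column checksum indices
            if (x, y) in observed and x - n_x - 1 > actual:
                short_cols.append(y)
    return {(x, y) for x in short_rows for y in short_cols if (x, y) not in data}


data = {(1, 1), (1, 3), (2, 1), (2, 2), (2, 3)}
pool = data | row_col_checksums(data, 3, 3)
print(sorted(f"X{x}Y{y}" for x, y in pool))
print(infer_missing(pool - {(1, 1)}, 3, 3))
```

Under these assumptions the computed checksums are X1Y6, X2Y7, X3Y4, X6Y1, X5Y2, and X6Y3, matching the pool in the example, and removing X1Y1 leaves row X1 and column Y1 each one short, which identifies X1Y1 as the missing identifier. With several identifiers missing at once, the candidates returned may be ambiguous.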
  • Noise in data encoding and decoding may be reduced by constructing identifiers palindromically, for example, by using palindromic pairs of components rather than single components in the product scheme. Then the pairs of components from different layers may be assembled to one another in a palindromic manner (e.g., YXY instead of XY for components X and Y). This palindromic method may be expanded to larger numbers of layers (e.g., ZYXYZ instead of XYZ) and may enable detection of erroneous cross reactions between identifiers.
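  • As a minimal sketch of the palindromic ordering described above, the following Python snippet arranges one component per layer so that the first layer sits in the middle and each later layer flanks it on both sides. The function names `palindromic_order` and `is_palindromic` are hypothetical and serve only as illustration; components are treated as opaque labels rather than sequences.

```python
def palindromic_order(layers):
    """Arrange one component per layer palindromically.

    ["X", "Y"] becomes ["Y", "X", "Y"] and ["X", "Y", "Z"] becomes
    ["Z", "Y", "X", "Y", "Z"], matching the YXY and ZYXYZ examples.
    """
    if not layers:
        raise ValueError("at least one layer component is required")
    outer = list(layers[1:])
    # Later layers flank the first layer symmetrically on both sides.
    return outer[::-1] + [layers[0]] + outer


def is_palindromic(order):
    """Return True if the observed component order reads the same both ways."""
    return list(order) == list(order)[::-1]
```

  • An observed component order that is not symmetric (for example Z, Y, X, Y, A) fails the `is_palindromic` check, which is one simple way an erroneous cross reaction might be flagged.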
  • Adding supplemental nucleic acid sequences in excess (e.g., vast excess) to the identifiers may prevent sequencing from recovering the encoded identifiers.
  • Prior to decoding the information, the identifiers may be enriched from the supplemental nucleic acid sequences. For example, the identifiers may be enriched by a nucleic acid amplification reaction using primers specific to the identifier ends.
  • the information may be decoded without enriching the sample pool by sequencing (e.g., sequencing by synthesis) using a specific primer. In both decoding methods, it may be difficult to enrich or decode the information without having a decoding key or knowing something about the composition of the identifiers.
  • Alternative access methods may also be employed such as using affinity tag-based probes.
  • a system for encoding digital information into nucleic acids can comprise systems, methods and devices for converting files and data (e.g., raw data, compressed zip files, integer data, and other forms of data) into bytes and encoding the bytes into segments or sequences of nucleic acids, typically DNA, or combinations thereof.
  • a system for encoding binary sequence data using nucleic acids may comprise a device and one or more computer processors.
  • the device may be configured to construct an identifier library.
  • the one or more computer processors may be individually or collectively programmed to (i) translate the information into a string of symbols, (ii) map the string of symbols to the plurality of identifiers, and (iii) construct an identifier library comprising at least a subset of a plurality of identifiers.
  • An individual identifier of the plurality of identifiers may correspond to an individual symbol of the string of symbols.
  • An individual identifier of the plurality of identifiers may comprise one or more components.
  • An individual component of the one or more components may comprise a nucleic acid sequence.
  • a system for reading binary sequence data using nucleic acids may comprise a database and one or more computer processors.
  • the database may store an identifier library encoding the information.
  • the one or more computer processors may be individually or collectively programmed to (i) identify the identifiers in the identifier library, (ii) generate a plurality of symbols from identifiers identified in (i), and (iii) compile the information from the plurality of symbols.
  • the identifier library may comprise a subset of a plurality of identifiers. Each individual identifier of the plurality of identifiers may correspond to an individual symbol in a string of symbols.
  • An identifier may comprise one or more components.
  • a component may comprise a nucleic acid sequence.
  • Non-limiting implementations of methods for using the system to encode digital data can comprise steps for receiving digital information in the form of byte streams, parsing the byte streams into individual bytes, mapping the location of a bit within the byte using a nucleic acid index (or identifier rank), and encoding sequences corresponding to either bit values of 1 or bit values of 0 into identifiers.
  • Steps for retrieving digital data can comprise sequencing a nucleic acid sample or nucleic acid pool comprising sequences of nucleic acid (e.g., identifiers) that map to one or more bits, referencing an identifier rank to confirm if the identifier is present in the nucleic acid pool and decoding the location and bit-value information for each sequence into a byte comprising a sequence of digital information.
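  • The encoding and retrieval steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed method: the identifier rank is assumed to be `8 * byte_index + bit_index` with the most significant bit first, and only bit values of 1 are represented by a present identifier. The function names are hypothetical.

```python
def encode_bytes(data):
    """Return the set of identifier ranks to construct for a byte stream.

    Assumption: rank = 8 * byte_index + bit_index (most significant bit
    first), and an identifier is constructed only where the bit value is 1.
    """
    present = set()
    for byte_index, byte in enumerate(data):
        for bit_index in range(8):
            if (byte >> (7 - bit_index)) & 1:
                present.add(byte_index * 8 + bit_index)
    return present


def decode_bytes(present, n_bytes):
    """Rebuild the byte stream from the ranks detected in the pool."""
    out = bytearray(n_bytes)
    for rank in present:
        byte_index, bit_index = divmod(rank, 8)
        if byte_index < n_bytes:
            # A detected identifier sets the bit at its mapped location.
            out[byte_index] |= 1 << (7 - bit_index)
    return bytes(out)
```

  • In this sketch an absent rank decodes as a 0 bit, so the number of bytes must be known when decoding.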
  • Systems for encoding, writing, copying, accessing, reading, and decoding information encoded and written into nucleic acid molecules may be a single integrated unit or may be multiple units configured to execute one or more of the aforementioned operations.
  • a system for encoding and writing information into nucleic acid molecules may include a device and one or more computer processors.
  • the one or more computer processors may be programmed to parse the information into strings of symbols (e.g., strings of bits).
  • the computer processor may generate an identifier rank.
  • the computer processor may categorize the symbols into two or more categories.
  • One category may include symbols to be represented by a presence of the corresponding identifier in the identifier library and the other category may include symbols to be represented by an absence of the corresponding identifiers in the identifier library.
  • the computer processor may direct the device to assemble the identifiers corresponding to symbols to be represented by the presence of an identifier in the identifier library.
  • the device may comprise a plurality of regions, sections, or partitions.
  • the reagents and components to assemble the identifiers may be stored in one or more regions, sections, or partitions of the device, e.g., in separate ink tanks or cartridges. Layers may be stored in separate regions or sections of the device. A layer may comprise one or more unique components. The component in one layer may be unique from the components in another layer.
  • the regions or sections may comprise vessels and the partitions may comprise wells. Each layer may be stored in a separate vessel or partition. Each reagent or nucleic acid sequence may be stored in a separate vessel or partition. Alternatively, or in addition to, reagents may be combined to form a master mix for identifier construction.
  • the device may transfer reagents, components, and templates from one section of the device to be combined in another section.
  • the device may provide the conditions for completing the assembly reaction, e.g., using a writer system 100. For example, the device may provide heating, agitation, and detection of reaction progress.
  • the constructed identifiers may be directed to undergo one or more subsequent reactions to add barcodes, common sequences, variable sequences, or tags to one or more ends of the identifiers.
  • the identifiers may then be directed to a region or partition to generate an identifier library.
  • One or more identifier libraries may be stored in each region, section, or individual partition of the device.
  • the device may transfer fluid (e.g., reagents, components, templates) using pressure, vacuum, or suction.
  • the identifier libraries may be stored in the device or may be moved to a separate database.
  • the database may comprise one or more identifier libraries.
  • the database may provide conditions for long term storage of the identifier libraries (e.g., conditions to reduce degradation of identifiers).
  • the identifier libraries may be stored in a powder, liquid, or solid form. Aqueous solutions of identifiers may be lyophilized for more stable storage. Alternatively, identifiers may be stored in the absence of oxygen (e.g. anaerobic storage conditions).
  • the database may provide Ultra-Violet light protection, reduced temperature (e.g., refrigeration or freezing), and protection from degrading chemicals and enzymes.
  • the device that copies the information may extract an aliquot of an identifier library from the device and combine that aliquot with the reagents and constituents to amplify a portion of or the entire identifier library.
  • the device may control the temperature, pressure, and agitation of the amplification reaction.
  • the device may comprise partitions and one or more amplification reactions may occur in the partition comprising the identifier library.
  • the device may copy more than one pool of identifiers at a time.
  • the copied identifiers may be transferred from the copy device to an accessing device.
  • the accessing device may be the same device as the copy device.
  • the access device may comprise separate regions, sections, or partitions.
  • the access device may have one or more columns, bead reservoirs, or magnetic regions for separating identifiers bound to affinity tags.
  • the access device may have one or more size selection units.
  • a size selection unit may include agarose gel electrophoresis or any other method for size selecting nucleic acid molecules. Copying and extraction may be performed in the same region of a device or in different regions of a device.
  • the accessed data may be read in the same device or the accessed data may be transferred to another device.
  • the reading device may comprise a detection unit to detect and identify the identifiers.
  • the detection unit may be part of a sequencer, hybridization array, or other unit for identifying the presence or absence of an identifier.
  • a sequencing platform may be designed specifically for decoding and reading information encoded into nucleic acid sequences.
  • the sequencing platform may be dedicated to sequencing single or double stranded nucleic acid molecules.
  • the sequencing platform may decode nucleic acid encoded data by reading individual bases (e.g., base-by-base sequencing) or by detecting the presence or absence of an entire nucleic acid sequence (e.g., component) incorporated within the nucleic acid molecule (e.g., identifier).
  • the sequencing platform may be a system such as Illumina® Sequencing or fragmentation analysis by capillary electrophoresis.
  • decoding nucleic acid sequences may be performed using a variety of analytical techniques implemented by the device, including but not limited to, any methods that generate optical, electrochemical, or chemical signals.
  • Information storage in nucleic acid molecules may have various applications including, but not limited to, long term information storage, sensitive information storage, and storage of medical information.
  • a person (e.g., medical history and records)
  • the information may be stored external to the body (e.g., in a wearable device) or internal to the body (e.g., in a subcutaneous capsule).
  • a sample may be taken from the device or capsule and the information may be decoded with the use of a nucleic acid sequencer.
  • nucleic acid molecules may provide an alternative to computer and cloud-based storage systems. Personal storage of medical records in nucleic acid molecules may reduce the instance or prevalence of medical records being hacked.
  • Nucleic acid molecules used for capsule-based storage of medical records may be derived from human genomic sequences. The use of human genomic sequences may decrease the immunogenicity of the nucleic acid sequences in the event of capsule failure and leakage.
  • reactions and methods provided herein can be used in systems described herein for assembling identifiers from one or more components.
  • different reaction mixtures for different chemical methods provided herein can be used in the finisher of the system to assemble different components.
  • components can be assembled in a reaction comprising polymerase and dNTPs (deoxynucleotide triphosphates comprising dATP, dTTP, dCTP, dGTP or variants or analogs thereof).
  • Components can be single stranded or double stranded nucleic acids.
  • Components to be assembled adjacent to each other may have complementary 3' ends, complementary 5' ends, or homology between one component's 5' end and the adjacent component's 3' end.
  • the OEPCR may comprise cycling between three temperatures: a melting temperature, an annealing temperature, and an extension temperature.
  • the melting temperature is intended to turn double stranded nucleic acids into single stranded nucleic acids, as well as remove the formation of secondary structures or hybridizations within a component or between components.
  • the melting temperature is high, for example above 95 degrees Celsius.
  • the melting temperature may be at least 96, 97, 98, 99, 100, 101, 102, 103, 104, or at least 105 degrees Celsius.
  • the melting temperature may be at most 95, 94, 93, 92, 91, or at most 90 degrees Celsius.
  • the annealing temperature is intended to facilitate the formation of hybridization between complementary 3' ends of intended adjacent components (or their complements).
  • the annealing temperature may match the calculated melting temperature of the intended hybridized nucleic acid formation.
  • the annealing temperature may be within 10 degrees Celsius or more of said melting temperature.
  • the annealing temperature may be at least 25, 30, 50, 55, 60, 65, or at least 70 degrees Celsius.
  • the melting temperature may depend on the sequence of the intended hybridization region between components. Longer hybridization regions have higher melting temperatures, and hybridization regions with higher percent content of Guanine or Cytosine nucleotides may have higher melting temperatures. It may therefore be possible to design components for OEPCR reactions intended to assemble optimally at particular annealing temperatures. Annealing temperatures may be applied to the reaction for at least 1, 5, 10, 15, 20, 25, or at least 30 seconds, or above.
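  • To illustrate how hybridization region length and GC content both raise the melting temperature, the sketch below computes GC fraction and a rough melting temperature estimate. The estimate uses the Wallace rule of thumb (2 degrees Celsius per A or T, 4 degrees Celsius per G or C), which is not taken from this description and is only a coarse approximation for short oligonucleotides. The function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are guanine or cytosine."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees Celsius (Wallace rule).

    Assumption: a rule of thumb for short oligonucleotides, used here
    only to show the trend with length and GC content.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

  • Under this estimate, a longer region or a region with more G and C bases yields a higher value, consistent with the trend stated above.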
  • the extension temperature is intended to initiate and facilitate the nucleic acid chain elongation of hybridized 3' ends catalyzed by one or more polymerase enzymes.
  • the extension temperature may be set at the temperature in which the polymerase functions optimally in terms of nucleic acid binding strength, elongation speed, elongation stability, or fidelity.
  • the extension temperature may be at least 30, 40, 50, 60, or at least 70 degrees Celsius, or above. Annealing temperatures may be applied to the reaction for at least 1, 5, 10, 15, 20, 25, 30, 40, 50, or at least 60 seconds or above. Recommended extension times may be around 15 to 45 seconds per kilobase of expected elongation.
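  • As a small worked example of the recommended extension time of around 15 to 45 seconds per kilobase, the following sketch converts an expected elongation length into a time range. The function name `extension_time_range` is hypothetical.

```python
def extension_time_range(expected_length_bases):
    """Return (min_seconds, max_seconds) for the extension step.

    Uses the 15 to 45 seconds per kilobase guideline stated above.
    """
    if expected_length_bases <= 0:
        raise ValueError("expected length must be positive")
    kilobases = expected_length_bases / 1000
    return (15 * kilobases, 45 * kilobases)
```

  • For an expected elongation of 2000 bases, this gives a range of 30 to 90 seconds.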
  • the annealing temperature and the extension temperature may be the same.
  • a 2-step temperature cycle may be used instead of a 3-step temperature cycle.
  • Examples of combined annealing and extension temperatures include 60, 65, or 72 degrees Celsius.
  • OEPCR may be performed with one temperature cycle. Such implementations may involve the intended assembly of just two components. In other implementations, OEPCR may be performed with multiple temperature cycles. Any given nucleic acid in OEPCR may only assemble to at most one other nucleic acid in one cycle. This is because assembly (or extension or elongation) may only occur at the 3' end of a nucleic acid and each nucleic acid only has one 3' end. Therefore, the assembly of multiple components may require multiple temperature cycles. For example, assembling four components may involve 3 temperature cycles. Assembling 6 components may involve 5 temperature cycles. Assembling 10 components may involve 9 temperature cycles. In some implementations, using more temperature cycles than the minimum required may increase assembly efficiency.
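  • The cycle counts given above follow a simple pattern of one fewer cycle than the number of components. A minimal sketch, with the hypothetical function name `minimum_cycles`, is:

```python
def minimum_cycles(n_components):
    """Minimum OEPCR temperature cycles for n components.

    Follows the counts stated above: 2 components need 1 cycle,
    4 need 3, 6 need 5, and 10 need 9.
    """
    if n_components < 2:
        raise ValueError("assembly needs at least two components")
    return n_components - 1
```

  • Using more cycles than this minimum may still be preferable where it increases assembly efficiency.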
  • Hybridization regions with high guanine or cytosine content may hybridize more efficiently at a given temperature than hybridization regions with low guanine or cytosine content. This is because guanine forms a more stable base-pair with cytosine than adenine does with thymine.
  • Hybridization regions may have a guanine or cytosine content (also known as GC content) of anywhere from 0% to 100%.
  • hybridization regions may have a guanine or cytosine content from 0% to 5%, from 5% to 10%, from 10% to 15%, from 15% to 20%, from 20% to 25%, from 25% to 30%, from 30% to 35%, from 35% to 40%, from 40% to 45%, from 45% to 50%, from 50% to 55%, from 55% to 60%, from 60% to 65%, from 65% to 70%, from 70% to 75%, from 75% to 80%, from 80% to 85%, from 85% to 90%, from 90% to 95%, or from 95% to 100%.
  • In addition to hybridization region length and GC content, there are many more aspects of the nucleic acid sequence design that may affect the efficiency of the OEPCR. For example, the formation of undesired secondary structures within a component may interfere with its ability to form a hybridization product with its intended adjacent component. These secondary structures may include hairpin loops. The types of possible secondary structures and their stability (for example melting temperature) for a nucleic acid may be predicted based on the sequence. Design space search algorithms may be used to determine nucleic acid sequences that meet proper length and GC content criteria for efficient OEPCR, while avoiding sequences with potentially inhibitory secondary structures.
  • homodimers (nucleic acid molecules that hybridize with nucleic acid molecules of the same sequence)
  • unwanted heterodimers (nucleic acid sequences that hybridize with other nucleic acid sequences aside from their intended assembly partner)
  • OEPCR may be optimized by using long hybridization regions with high GC content but short non-hybridization regions with low GC content.
  • the overall length of nucleic acids may be at least 10, 20, 30, 40, 50, 60, 70, 80, 90, or at least 100 bases, or above. In some implementations, there may be an optimal length and optimal GC content for the hybridization regions of nucleic acids where the assembly efficiency is optimized.
  • Additives may be included in the OEPCR reaction to improve assembly efficiency.
  • Additive content weight per volume may be at least 0%, 1%, 5%, 10%, or at least 20%, or more.
  • E. coli DNA polymerase I, T7 DNA polymerase, bacteriophage T4 DNA polymerase, Φ29 (phi29) DNA polymerase, Taq polymerase, Tth polymerase, Tli polymerase, Pfu polymerase, Pwo polymerase, VENT polymerase, DEEPVENT polymerase, Ex-Taq polymerase, LA-Taq polymerase, Sso polymerase, Poc polymerase, Pab polymerase, Mth polymerase, ES4 polymerase, Tru polymerase, Tac polymerase, Tne polymerase, Tma polymerase, Tca polymerase, Tih polymerase, Tfi polymerase, Platinum Taq polymerases, Tbr polymerase, Phusion polymerase, KAPA polymerase, Q5 polymerase, Tfl polymerase, PfuTurbo polymerase, Pyrobest polymerase, KOD polymerase, Bst polymerase, Sac polymerase, Klenow fragment polymerase with 3'
  • polymerases may be stable and function optimally at different temperatures. Moreover, different polymerases have different properties. For example, some polymerases, such as Phusion polymerase, may exhibit 3' to 5' exonuclease activity, which may contribute to higher fidelity during nucleic acid elongation. Some polymerases may displace leading sequences during elongation, while others may degrade them or halt elongation. Some polymerases, like Taq, incorporate an adenine base at the 3' end of nucleic acid sequences. This process is referred to as A-tailing and may be inhibitory to OEPCR as the addition of an adenine base may disrupt the designed 3' complementarity between intended adjacent components. OEPCR may also be referred to as polymerase cycling assembly (or PCA).
  • In ligation assembly, separate nucleic acids are assembled in a reaction comprising one or more ligase enzymes and additional co-factors.
  • Co-factors may include Adenosine Tri-Phosphate (ATP), Dithiothreitol (DTT), or Magnesium ion (Mg2+).
  • Components in a ligation reaction may be blunt-ended double stranded DNA (dsDNA), single stranded DNA (ssDNA), or partially hybridized single-stranded DNA.
  • When a double stranded nucleic acid has an overhang strand on one end, the other strand on the same end may be referred to as a “cavity.” Together, a cavity and overhang form a “sticky end”, also known as a “cohesive-end.”
  • a sticky end may be either a 3' overhang and a 5' cavity, or a 5' overhang and a 3' cavity.
  • the sticky-ends between two intended adjacent components may be designed to have complementarity such that the overhang of both sticky ends hybridize such that each overhang ends directly adjacent to the beginning of the cavity on the other component.
  • nick (a double stranded DNA break)
  • the top and bottom strand of a molecule that forms a sticky end may move between associated and dissociated states, and therefore the sticky end may be a transient formation.
  • Once the nick along one strand of a sticky end duplex between two components is sealed, that covalent linkage remains even if the members of the opposite strand dissociate.
  • the linked strand may then become a template to which the intended adjacent members of the opposite strand can bind and once again form a nick that may be sealed.
  • the digestion and ligation may occur together in the same reaction if the endonuclease and ligase are compatible.
  • the reaction may occur at a uniform temperature, such as 4, 10, 16, 25, or 37 degrees Celsius.
  • the reaction may cycle between multiple temperatures, such as between 16 degrees Celsius and 37 degrees Celsius. Cycling between multiple temperatures may enable the digestion and ligation to each proceed at their respective optimal temperatures during different parts of the cycle.
  • nucleic acids may be separated from enzymes through phenol-chloroform extraction, ethanol precipitation, magnetic bead capture, and/or silica membrane adsorption, washing, and elution.
  • Multiple endonucleases may be used in the same reaction, though care should be taken to ensure that the endonucleases do not interfere with each other and function under similar reaction conditions. Using two endonucleases, one may create orthogonal (non-complementary) sticky ends on both ends of a dsDNA component.
  • Endonuclease digestion will leave sticky ends with phosphorylated 5' ends.
  • Ligases may only function on phosphorylated 5' ends, and not on non-phosphorylated 5' ends. As such, there may not be any need for an intermediate 5' phosphorylation step in between digestion and ligation.
  • a digested dsDNA component with a palindromic overhang on its sticky end may ligate to itself. To prevent self-ligation, it may be beneficial to dephosphorylate said dsDNA component prior to ligation.
  • Multiple endonucleases may target different restriction sites, but leave compatible overhangs (overhangs that are the reverse complement of each other).
  • the product of ligation of sticky ends created with two such endonucleases may result in an assembled product that does not contain a restriction site for either endonuclease at the site of ligation.
  • Such endonucleases form the basis of assembly methods, such as biobricks assembly, that may programmably assemble multiple components using just two endonucleases by performing repetitive digestion-ligation cycles.
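  • The notions of a palindromic overhang (which may self-ligate) and of compatible overhangs (reverse complements of each other) can be checked with a few lines of Python. This is a minimal sketch in which overhangs are written 5' to 3' on their own strand; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def can_self_ligate(overhang):
    """A palindromic overhang equals its own reverse complement."""
    return overhang.upper() == reverse_complement(overhang)


def are_compatible(overhang_a, overhang_b):
    """Two overhangs are compatible if one is the reverse complement of the other."""
    return overhang_a.upper() == reverse_complement(overhang_b)
```

  • For example, the overhang AATT is palindromic and is flagged by `can_self_ligate`, whereas ATGC is not palindromic but is compatible with GCAT.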
  • the endonucleases used to create sticky ends may be type IIS restriction enzymes. These enzymes cleave a fixed number of bases away from their restriction sites in a particular direction, therefore the sequence of the overhangs that they generate may be customized. The overhang sequences need not be palindromic.
  • the same type IIS restriction enzyme may be used to create multiple different sticky ends in the same reaction, or in multiple reactions.
  • one or multiple type IIS restriction enzymes may be used to create components with compatible overhangs in the same reaction, or in multiple reactions.
  • the ligation site between two sticky ends generated by type IIS restriction enzymes may be designed such that it does not form a new restriction site.
  • Type IIS restriction enzyme sites may be placed on a dsDNA such that the restriction enzyme cleaves off its own restriction site when it generates a component with a sticky end. Therefore the ligation product between multiple components generated from type IIS restriction enzymes may not contain any restriction sites.
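  • The following sketch illustrates how a type IIS enzyme that cuts a fixed distance from its site produces a customizable overhang and removes its own site from the resulting component. BsaI (recognition sequence GGTCTC, cutting 1 base downstream on the top strand and 5 bases downstream on the bottom strand) is used as an illustrative choice and is not named in this description. Only a site in the forward orientation on the top strand is handled, and the function name is hypothetical.

```python
SITE = "GGTCTC"     # BsaI recognition sequence (illustrative assumption)
TOP_OFFSET = 1      # top strand is cut 1 base after the site
BOTTOM_OFFSET = 5   # bottom strand is cut 5 bases after the site


def digest_downstream(top_strand):
    """Cut at the first forward site and return (overhang, component).

    The overhang is the single stranded 5' stretch left on the downstream
    component; the component is the downstream top strand, 5' to 3'.
    Returns None if no complete cut is possible.
    """
    top_strand = top_strand.upper()
    start = top_strand.find(SITE)
    if start < 0:
        return None
    site_end = start + len(SITE)
    top_cut = site_end + TOP_OFFSET
    bottom_cut = site_end + BOTTOM_OFFSET
    if bottom_cut > len(top_strand):
        return None
    # Bases between the two cuts are single stranded after cleavage.
    overhang = top_strand[top_cut:bottom_cut]
    component = top_strand[top_cut:]
    return overhang, component
```

  • Because the overhang is read from the bases next to the site rather than from the site itself, its sequence can be chosen freely when the dsDNA is designed.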
  • Type IIS restriction enzymes may be mixed in a reaction together with ligase to perform the component digestion and ligation together.
  • the temperature of the reaction may be cycled between two or more values to promote optimal digestion and ligation.
  • the digestion may be performed optimally at 37 degrees Celsius and the ligation may be performed optimally at 16 degrees Celsius. More generally, the reaction may cycle between temperature values of at least 0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, or at least 65 degrees Celsius or above.
  • a combined digestion and ligation reaction may be used to assemble at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 components, or more. Examples of assembly reactions that leverage Type IIS restriction enzymes to create sticky ends include Golden Gate Assembly (also known as Golden Gate Cloning) or Modular Cloning (also known as MoClo).
  • exonucleases may be used to create components with sticky ends.
  • 3' exonucleases may be used to chew back the 3' ends from dsDNA, thus creating 5' overhangs.
  • 5' exonucleases may be used to chew back the 5' ends from dsDNA thus creating 3' overhangs.
  • Different exonucleases may have different properties.
  • exonucleases may differ in the direction of their nuclease activity (5' to 3' or 3' to 5'), whether or not they act on ssDNA, whether they act on phosphorylated or non-phosphorylated 5' ends, whether or not they are able to initiate on a nick, or whether or not they are able to initiate their activity on 5' cavities, 3' cavities, 5' overhangs, or 3' overhangs.
  • Exonucleases include Lambda exonuclease, RecJf, Exonuclease III, Exonuclease I, Exonuclease T, Exonuclease V, Exonuclease VIII, Exonuclease VII, Nuclease BAL 31, T5 Exonuclease, and T7 Exonuclease.
  • Exonuclease may be used in a reaction together with ligase to assemble multiple components. The reaction may occur at a fixed temperature or cycle between multiple temperatures, each ideal for the ligase or the exonuclease, respectively.
  • Polymerase may be included in an assembly reaction with ligase and a 5'-to-3' exonuclease.
  • the components in such a reaction may be designed such that components intended to assemble adjacent to each other share homologous sequences on their edges.
  • a component X to be assembled with component Y may have a 3' edge sequence of the form 5'-z-3', and the component Y may have a 5' edge sequence of the form 5'-z-3', where z is any nucleic acid sequence.
  • Homologous edge sequences of such a form may be referred to as ‘Gibson overlaps’.
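  • The shared edge sequence z between component X and component Y can be illustrated with a short Python sketch that finds the longest suffix of X matching a prefix of Y and joins the two components through it. This models only the sequence logic, not the enzymatic reaction; the function names and the default minimum overlap of 20 bases are illustrative assumptions.

```python
def shared_edge(x, y, min_len=1):
    """Longest suffix of x that equals a prefix of y (the sequence z)."""
    shortest = max(min_len, 1)
    for n in range(min(len(x), len(y)), shortest - 1, -1):
        if x[-n:] == y[:n]:
            return x[-n:]
    return ""


def assemble(x, y, min_len=20):
    """Join x and y through their shared edge, keeping one copy of z."""
    z = shared_edge(x, y, min_len)
    if not z:
        raise ValueError("no shared edge sequence of sufficient length")
    return x + y[len(z):]
```

  • The assembled product contains the shared edge once, with the remainder of X upstream and the remainder of Y downstream.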
  • Gibson assembly may be performed by using T5 exonuclease, Phusion polymerase, and Taq ligase, and incubating the reaction at 50 degrees Celsius.
  • the use of the thermophilic ligase, Taq, enables the reaction to proceed at 50 degrees Celsius, a temperature suitable for all three types of enzymes in the reaction.
  • Gibson assembly may generally refer to any assembly reaction involving polymerase, ligase, and exonuclease.
  • Gibson assembly may be used to assemble at least 2, 3, 4, 5, 6, 7, 8, 9, or at least 10, or more components. Gibson assembly may occur as a one-step, isothermal reaction or as a multi-step reaction with one or more temperature incubations. For example, Gibson assembly may occur at temperatures of at least 30, 40, 50, 60, or at least 70 degrees, or more.
  • the incubation time for a Gibson assembly may be at least 1, 5, 10, 20, 40, or at least 80 minutes.
  • Gibson assembly reactions may occur optimally when gibson overlaps between intended adjacent components are a certain length and have sequence features, such as sequences that avoid undesirable hybridization events such as hairpins, homodimers, or unwanted heterodimers.
  • gibson overlaps of at least 20 bases are recommended.
  • Gibson overlaps may be at least 1, 2, 3, 5, 10, 20, 30, 40, 50, 60, or at least 100, or more bases in length.
  • the GC content of a gibson overlap may be anywhere from 0% to 100%.
  • the GC content of a gibson overlap may be from 0% to 5%, from 5% to 10%, from 10% to 15%, from 15% to 20%, from 20% to 25%, from 25% to 30%, from 30% to 35%, from 35% to 40%, from 40% to 45%, from 45% to 50%, from 50% to 55%, from 55% to 60%, from 60% to 65%, from 65% to 70%, from 70% to 75%, from 75% to 80%, from 80% to 85%, from 85% to 90%, from 90% to 95%, or from 95% to 100%.
  • components with sticky ends may be created synthetically, as opposed to enzymatically, by mixing together two single stranded nucleic acids, or oligos, that do not share full complementarity.
  • the index region and hybridization region(s) of oligos in sticky-end ligation may be designed to facilitate the proper assembly of components.
  • Components with long overhangs may hybridize more efficiently with each other at a given annealing temperature compared with components with short overhangs.
  • Overhangs may have a length of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or at least 30, or more bases.
  • Components with overhangs that contain high guanine or cytosine content may hybridize more efficiently to their complementary component at a given temperature than components with overhangs that contain low guanine or cytosine content. This is because guanine forms a more stable base-pair with cytosine than adenine does with thymine.
  • Overhangs may have a guanine or cytosine content (also known as GC content) of anywhere between 0% and 100%.
  • the GC content and length of the index region of an oligo may also affect ligation efficiency. This is because sticky-end components may assemble more efficiently if the top and bottom strand of each component are stably bound. Therefore, index regions may be designed with higher GC content, longer sequences, and other features that promote higher melting temperatures.
  • There are many more aspects of the oligo design, for both the index region and overhang sequence(s), that may affect the efficiency of the ligation assembly. For example, the formation of undesired secondary structures within a component may interfere with its ability to form an assembled product with its intended adjacent component. This may occur due to either secondary structures in the index region, in the overhang sequence, or in both. These secondary structures may include hairpin loops.
  • Design space search algorithms may be used to determine oligo sequences that meet proper length and GC content criteria for the formation of effective components, while avoiding sequences with potentially inhibitory secondary structures.
  • Design space search algorithms may include genetic algorithms, heuristic search algorithms, meta-heuristic search strategies like tabu search, branch-and-bound search algorithms, dynamic programming-based algorithms, constrained combinatorial optimization algorithms, gradient descent-based algorithms, randomized search algorithms, or combinations thereof.
  • homodimers (oligos that hybridize with oligos of the same sequence)
  • unwanted heterodimers (oligos that hybridize with other oligos aside from their intended assembly partner)
  • the formation of homodimers and heterodimers may be predicted and accounted for during oligo design using computation methods and design space search algorithms.
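  • As a minimal sketch of one of the listed strategies, a randomized search, the snippet below draws candidate oligo sequences until one meets a GC content window and contains no short self-complementary stretch. The self-complementarity test is a crude proxy for hairpin and homodimer potential and is an assumption of this example, not a structure prediction method from this description. All function names and default parameters are hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are guanine or cytosine."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_self_complementary_stretch(seq, stem=4):
    """True if the reverse complement of any stem-length window occurs in seq."""
    for i in range(len(seq) - stem + 1):
        if reverse_complement(seq[i:i + stem]) in seq:
            return True
    return False


def random_search(length, gc_min, gc_max, rng, max_tries=10000, stem=4):
    """Return the first random candidate meeting all criteria, or None."""
    for _ in range(max_tries):
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if not gc_min <= gc_fraction(candidate) <= gc_max:
            continue
        if has_self_complementary_stretch(candidate, stem):
            continue
        return candidate
    return None
```

  • Passing a seeded generator such as `random.Random(0)` makes a run repeatable; screening against unwanted heterodimers would additionally require comparing each candidate with the other oligos in the reaction.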
  • Longer oligo sequences or higher GC content may create increased formation of unwanted secondary structures, homodimers, and heterodimers within the ligation reaction. Therefore, in some implementations, the use of shorter oligos or lower GC content may lead to higher assembly efficiency. These design principles may counteract the design strategies of using long oligos or high GC content for more efficient assembly. As such, there may be an optimal length and optimal GC content for the oligos that make up each component such that the ligation assembly efficiency is optimized.
  • the overall length of oligos to be used in ligation may be at least 10, 20, 30, 40, 50, 60, 70, 80, 90, or at least 100 bases, or above.
  • the overall GC content of oligos to be used in ligation may be anywhere from 0% to 100%.
  • the overall GC content of oligos to be used in ligation can be from 0% to 5%, from 5% to 10%, from 10% to 15%, from 15% to 20%, from 20% to 25%, from 25% to 30%, from 30% to 35%, from 35% to 40%, from 40% to 45%, from 45% to 50%, from 50% to 55%, from 55% to 60%, from 60% to 65%, from 65% to 70%, from 70% to 75%, from 75% to 80%, from 80% to 85%, from 85% to 90%, from 90% to 95%, or from 95% to 100%.
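The length, GC-content, and secondary-structure criteria above can be combined into a simple randomized design-space search. The sketch below is a minimal illustration only: the function names, the 40% to 60% GC window, and the 4-base self-complementarity screen are hypothetical choices, and the screen is a crude stand-in for the structure-predicting software a real design would use.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_self_complementarity(seq, k=4):
    """Crude hairpin/homodimer screen (an assumption, not a thermodynamic
    model): flag any k-mer whose reverse complement also occurs in seq."""
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return any(reverse_complement(kmer) in kmers for kmer in kmers)


def random_search(length, gc_min=0.4, gc_max=0.6, k=4, max_tries=100000, seed=0):
    """Randomized search for one oligo that meets the length, GC, and
    self-complementarity criteria. Returns None if no candidate is found."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if not gc_min <= gc_content(candidate) <= gc_max:
            continue
        if has_self_complementarity(candidate, k):
            continue
        return candidate
    return None
```

Calling `random_search(20)` returns a 20-base candidate that passes both filters. A heterodimer screen against the other oligos in a pool could be added in the same style.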
  • ligation may also occur between single-stranded nucleic acids using staple (or template or bridge) strands.
  • This method can be referred to as staple strand ligation (SSL), template directed ligation (TDL), or bridge strand ligation.
  • two single stranded nucleic acids hybridize adjacently onto a template, thus forming a nick that may be sealed by a ligase.
  • the same nucleic acid design considerations for sticky end ligation also apply to TDL. Stronger hybridization between the templates and their intended complementary nucleic acid sequences may lead to increased ligation efficiency.
  • sequence features that improve the hybridization stability (or melting temperature) on each side of the template may improve ligation efficiency.
  • These features may include longer sequence length and higher GC content.
  • the length of nucleic acids in TDL, including templates, may be at least 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, or at least 100 bases, or above.
  • the GC content of nucleic acids, including templates, may be anywhere from 0% to 100%.
  • the GC content of nucleic acids, including templates, can be from 0% to 5%, from 5% to 10%, from 10% to 15%, from 15% to 20%, from 20% to 25%, from 25% to 30%, from 30% to 35%, from 35% to 40%, from 40% to 45%, from 45% to 50%, from 50% to 55%, from 55% to 60%, from 60% to 65%, from 65% to 70%, from 70% to 75%, from 75% to 80%, from 80% to 85%, from 85% to 90%, from 90% to 95%, or from 95% to 100%.
  • In TDL, as with sticky end ligation, care may be taken to design component and template sequences that avoid unwanted secondary structures by using nucleic acid structure-predicting software with sequence space search algorithms. As the components in TDL may be single stranded instead of double stranded, there may be higher incidence of unwanted secondary structures (as compared to sticky end ligation) due to the exposed bases.
  • TDL may also be performed with blunt-ended dsDNA components.
  • in order for the staple strand to properly bridge two single-stranded nucleic acids, the staple may first need to displace or partially displace the full single-stranded complements.
  • the dsDNA may initially be melted with incubation at a high temperature. The reaction may then be cooled thus allowing staple strands to anneal to their proper nucleic acid complements. This process may be made even more efficient by using a relatively high concentration of template compared to dsDNA components, thus enabling the templates to outcompete the proper full-length ssDNA complements for binding.
  • ligation of blunt-ended dsDNA with TDL may be improved through multiple rounds of melting (incubation at higher temperatures) and annealing (incubation at lower temperatures). This process may be referred to as Ligase Cycling Reaction, or LCR.
  • Proper melting and annealing temperatures depend on the nucleic acid sequences. Melting and annealing temperatures may be at least 4, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 degrees Celsius. The number of temperature cycles may be at least 1, 5, 10, 15, 20, 25, 30, or more.
  • All ligations may be performed in fixed temperature reactions or in multi-temperature reactions.
  • Ligation temperatures may be at least 0, 4, 10, 20, 30, 40, 50, or 60 degrees Celsius or above.
  • the optimal temperature for ligase activity may differ depending on the type of ligase.
  • the rate at which components adjoin or hybridize in the reaction may differ depending on their nucleic acid sequences. Higher incubation temperatures may promote faster diffusion and therefore increase the frequency with which components temporarily adjoin or hybridize. However, increased temperature may also disrupt base pair bonds and therefore decrease the stability of those adjoined or hybridized component duplexes.
  • the optimal temperature for ligation may depend on the number of nucleic acids to be assembled, the sequences of those nucleic acids, the type of ligase, as well as other factors such as reaction additives. For example, two sticky end components with 4-base complementary overhangs may be assembled faster at 4 degrees Celsius with T4 ligase than at 25 degrees Celsius with T4 ligase. But two sticky-end components with 25-base complementary overhangs may assemble faster at 25 degrees Celsius with T4 ligase than at 4 degrees Celsius with T4 ligase, and perhaps faster than ligation with 4-base overhangs at any temperature. In some implementations of ligation, it may be beneficial to heat and slowly cool the components for annealing prior to ligase addition.
  • Ligation may be used to assemble at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more nucleic acids.
  • Ligation incubation times may be at most 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, 20 minutes, 30 minutes, 1 hour, or longer. Longer incubation times may improve ligation efficiency.
  • Ligation may require nucleic acids with 5' phosphorylated ends. Nucleic acid components without 5' phosphorylated ends may be phosphorylated in a reaction with polynucleotide kinase, such as T4 polynucleotide kinase (or T4 PNK). Other co-factors may be present in the reaction such as ATP, magnesium ion, or DTT. Polynucleotide kinase reactions may occur at 37 degrees Celsius for 30 minutes. Polynucleotide kinase reaction temperatures may be at least 4, 10, 20, 30, 40, 50, or 60 degrees Celsius.
  • Polynucleotide kinase reaction incubation times may be at most 1 minute, 5 minutes, 10 minutes, 20 minutes, 30 minutes, 60 minutes, or more.
  • the nucleic acid components may be synthetically (as opposed to enzymatically) designed and manufactured with a modified 5' phosphorylation. Only nucleic acids being assembled on their 5' ends may require phosphorylation. For example, templates in TDL may not be phosphorylated as they are not intended to be assembled.
  • Additives may be included in a ligation reaction to improve ligation efficiency.
  • DMSO Dimethyl sulfoxide
  • PEG polyethylene glycol
  • 1,2-Propanediol (1,2-Prd)
  • glycerol, Tween-20, or combinations thereof.
  • PEG6000 may be a particularly effective ligation enhancer.
  • PEG6000 may increase ligation efficiency by acting as a crowding agent.
  • the PEG6000 may form aggregated nodules that take up space in the ligase reaction solution and bring the ligase and components to closer proximity.
  • Additive content weight per volume
  • ligases may be used for ligation.
  • the ligases can be naturally occurring or synthesized.
  • Examples of ligases include T4 DNA Ligase, T7 DNA Ligase, T3 DNA Ligase, Taq DNA Ligase, 9°N™ DNA Ligase, E. coli DNA Ligase, and SplintR DNA Ligase.
  • Different ligases may be stable and function optimally at different temperatures. For example, Taq DNA Ligase is thermostable and T4 DNA Ligase is not.
  • different ligases have different properties. For example, T4 DNA Ligase may ligate blunt-ended dsDNA while T7 DNA Ligase may not.
  • Fork adapters may be used to asymmetrically attach adapters to a nucleic acid library with either blunt ends or sticky ends that are equivalent at each end (such as A-tails).
  • Ligation may be inhibited by heat inactivation (for example incubation at 65 degrees Celsius for at least 20 minutes), addition of a denaturant, or addition of a chelator such as EDTA.
  • restriction enzymes such as DpnI and AfeI
  • DpnI and AfeI may cut their restriction sites in the center, thus leaving blunt-ended dsDNA products.
  • Other restriction enzymes, such as EcoRI and AatII, cut their restriction sites off-center, thus leaving dsDNA products with sticky ends (or staggered ends).
  • Some restriction enzymes may target discontinuous restriction sites.
  • the restriction enzyme AlwNI recognizes the restriction site CAGNNNCTG, where N may be either A, T, C, or G. Restriction sites may be at least 2, 4, 6, 8, 10, or more bases long.
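A discontinuous site such as CAGNNNCTG can be located in a sequence by translating it into a regular expression. The following is a minimal sketch; the helper name `find_sites` and the reduced code table (only A, C, G, T, and N) are illustrative assumptions. Because CAGNNNCTG reads the same on both strands, scanning a single strand is sufficient for this particular site.

```python
import re

# Reduced nucleotide code table (assumption: only N is needed as a wildcard).
CODES = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_sites(sequence, site="CAGNNNCTG"):
    """Return the 0-based start of every match of site in sequence.
    A lookahead is used so that overlapping matches are also reported."""
    body = "".join(CODES[ch] for ch in site)
    pattern = re.compile("(?=(" + body + "))")
    return [match.start() for match in pattern.finditer(sequence)]
```

For instance, `find_sites("TTCAGACGCTGAA")` reports a single site starting at position 2, while `find_sites("CAGCTG")` reports none because the three-base spacer is missing.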
  • Type II restriction enzymes cleave nucleic acids outside of their restriction sites.
  • the enzymes may be sub-classified as either Type IIS or Type IIG restriction enzymes.
  • Said enzymes may recognize restriction sites that are non-palindromic.
  • Examples of said restriction enzymes include BbsI, which recognizes GAAGAC and creates a staggered cleavage 2 (same strand) and 6 (opposite strand) bases further downstream.
  • Another example is BsaI, which recognizes GGTCTC and creates a staggered cleavage 1 (same strand) and 5 (opposite strand) bases further downstream.
  • Said restriction enzymes may be used for golden gate assembly or modular cloning (MoClo).
  • restriction enzymes such as BcgI (a Type IIG restriction enzyme) may create a staggered cleavage on both ends of its recognition site.
  • Restriction enzymes may cleave nucleic acids at least 1, 5, 10, 15, 20, or more bases away from their recognition sites. Because said restriction enzymes may create staggered cleavages outside of their recognition sites, the sequences of the resulting nucleic acid overhangs may be arbitrarily designed. This is as opposed to restriction enzymes that create staggered cleavages within their recognition sites, where the sequence of a resulting nucleic acid overhang is coupled to the sequence of the restriction site.
  • Nucleic acid overhangs created by restriction digests may be at least 1, 2, 3, 4, 5, 6, 7, 8, or more bases long. When restriction enzymes cleave nucleic acids, the resulting 5' ends contain a phosphate.
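Because the cut positions are fixed relative to the recognition site, the overhang produced by such an enzyme can be read directly from the sequence downstream of the site. The sketch below assumes a BsaI-like enzyme (site GGTCTC, cuts 1 base downstream on the same strand and 5 bases downstream on the opposite strand); the function name and its defaults are hypothetical, and only the first site on the given strand is considered.

```python
def type_iis_overhang(sequence, site="GGTCTC", top_offset=1, bottom_offset=5):
    """Return (top_cut_index, overhang) for the first site found on the
    given strand, or None if the site is absent or too close to the end.

    top_offset and bottom_offset are the number of bases between the end
    of the recognition site and the cut on each strand."""
    start = sequence.find(site)
    if start == -1:
        return None
    site_end = start + len(site)
    top_cut = site_end + top_offset
    bottom_cut = site_end + bottom_offset
    if bottom_cut > len(sequence):
        return None
    # The single-stranded overhang spans the bases between the two cuts.
    return top_cut, sequence[top_cut:bottom_cut]
```

With `"AAGGTCTCAACGTTT"` the function returns `(9, "ACGT")`. Changing the four bases after the one-base spacer changes the overhang without touching the recognition site, which is what allows the overhang to be designed freely.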
  • One or more nucleic acid sequences may be included in a restriction digest reaction.
  • one or more restriction enzymes may be used together in a restriction digest reaction.
  • Restriction digests may contain additives and cofactors including potassium ion, magnesium ion, sodium ion, BSA, S-Adenosyl-L-methionine (SAM), or combinations thereof.
  • Restriction digest reactions may be incubated at 37 degrees Celsius for one hour. Restriction digest reactions may be incubated in temperatures of at least 0, 10, 20, 30, 40, 50, or 60 degrees Celsius. Optimal digest temperatures may depend on the enzymes. Restriction digest reactions may be incubated for at most 1, 10, 30, 60, 90, 120, or more minutes. Longer incubation times may result in increased digestion.
  • Nucleic acid amplification may be executed with polymerase chain reaction, or PCR.
  • a starting pool of nucleic acids (referred to as the template pool or template) may be combined with polymerase, primers (short nucleic acid probes), nucleotide triphosphates (such as dATP, dTTP, dCTP, dGTP, and analogs or variants thereof), and additional cofactors and additives such as betaine, DMSO, and magnesium ion.
  • the template may be single stranded or double stranded nucleic acids.
  • the primer may be a short nucleic acid sequence built synthetically to complement and hybridize to a target sequence in the template pool.
  • While PCR typically refers to reactions specifically of said form, it may also be used more generally to refer to any nucleic acid amplification reaction.
  • PCR may comprise cycling between three temperatures: a melting temperature, an annealing temperature, and an extension temperature.
  • the melting temperature is intended to turn double stranded nucleic acids into single stranded nucleic acids, as well as remove the formation of hybridization products and secondary structures.
  • the melting temperature is high, for example above 95 degrees Celsius.
  • the melting temperature may be at least 96, 97, 98, 99, 100, 101, 102, 103, 104, or 105 degrees Celsius. In other implementations the melting temperature may be at most 95, 94, 93, 92, 91, or 90 degrees Celsius.
  • a higher melting temperature will improve dissociation of nucleic acids and their secondary structures, but may also cause side effects such as the degradation of nucleic acids or the polymerase.
  • Melting temperatures may be applied to the reaction for at least 1, 2, 3, 4, 5 seconds, or above, such as 30 seconds, 1 minute, 2 minutes, or 3 minutes.
  • a longer initial melting temperature step may be recommended for PCR with complex or long template.
  • the annealing temperature is intended to facilitate the formation of hybridization between the primers and their target templates.
  • the annealing temperature may match the calculated melting temperature of the primer.
  • the annealing temperature may be within 10 degrees Celsius or more of said melting temperature.
  • the annealing temperature may be at least 25, 30, 50, 55, 60, 65, or 70 degrees Celsius.
  • the melting temperature may depend on the sequence of the primer. Longer primers may have higher melting temperatures, and primers with higher percent content of Guanine or Cytosine nucleotides may have higher melting temperatures. It may therefore be possible to design primers intended to assemble optimally at particular annealing temperatures.
  • Annealing temperatures may be applied to the reaction for at least 1, 5, 10, 15, 20, 25, or 30 seconds, or above.
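The dependence of primer melting temperature on length and GC content can be illustrated with a rough estimate. The sketch below uses the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) degrees Celsius, which is an assumption brought in for illustration and is only a coarse approximation for short oligos; nearest-neighbor thermodynamic models are generally preferred for real primer design. The function name is hypothetical.

```python
def wallace_tm(primer):
    """Rough melting temperature in degrees Celsius by the Wallace rule:
    2 degrees per A or T plus 4 degrees per G or C."""
    primer = primer.upper()
    at_count = primer.count("A") + primer.count("T")
    gc_count = primer.count("G") + primer.count("C")
    return 2 * at_count + 4 * gc_count
```

Under this estimate a 10-base primer of only A and T comes out at 20 degrees Celsius and a 10-base primer of only G and C at 40 degrees Celsius, and lengthening a primer always raises the estimate.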
  • the primer concentrations may be at high or saturating amounts.
  • Primer concentrations may be 500 nanomolar (nM).
  • Primer concentrations may be at most 1 nM, 10 nM, 100 nM, 1000 nM, or more.
  • the extension temperature is intended to initiate and facilitate the 3' end nucleic acid chain elongation of primers catalyzed by one or more polymerase enzymes.
  • the extension temperature may be set at the temperature in which the polymerase functions optimally in terms of nucleic acid binding strength, elongation speed, elongation stability, or fidelity.
  • the extension temperature may be at least 30, 40, 50, 60, or 70 degrees Celsius, or above. Extension temperatures may be applied to the reaction for at least 1, 5, 10, 15, 20, 25, 30, 40, 50, or 60 seconds or above. Recommended extension times may be approximately 15 to 45 seconds per kilobase of expected elongation.
  • PCR may be performed with one temperature cycle. Such implementations may involve turning targeted single stranded template nucleic acid into double stranded nucleic acid. In other implementations, PCR may be performed with multiple temperature cycles. If the PCR is efficient, it is expected that the number of target nucleic acid molecules will double each cycle, thereby creating an exponential increase in the number of targeted nucleic acid templates from the original template pool. The efficiency of PCR may vary.
  • the actual percent of targeted nucleic acid that is replicated each round may be more or less than 100%.
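The effect of per-cycle efficiency on yield can be made concrete with the usual idealized model, in which each cycle multiplies the number of target molecules by (1 + E), where E is the fraction of targets replicated per cycle. The sketch below assumes a constant E and unlimited reagents; the function name is hypothetical.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Expected number of target molecules after a number of cycles,
    assuming a constant per-cycle efficiency (1.0 means perfect doubling)."""
    return initial_copies * (1.0 + efficiency) ** cycles
```

With perfect doubling, 100 starting molecules become 102,400 after 10 cycles; at 90% efficiency the same run yields roughly 61,000, which shows how small per-cycle losses compound.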
  • Each PCR cycle may introduce undesirable artifacts such as mutated and recombined nucleic acids.
  • a polymerase with high fidelity and high processivity may be used.
  • a limited number of PCR cycles may be used. PCR may involve at most 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, or more cycles.
  • multiple distinct target nucleic acid sequences may be amplified together in one PCR. If each target sequence has common primer binding sites, then all nucleic acid sequences may be amplified with the same set of primers.
  • PCR may comprise multiple primers intended to each target distinct nucleic acids. Said PCR may be referred to as multiplex PCR. PCR may involve at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more distinct primers.
  • each PCR cycle may change the relative distribution of the targeted nucleic acids. For example, a uniform distribution may become skewed or non-uniformly distributed.
  • optimal polymerases (e.g., with high fidelity and sequence robustness) may be used.
  • optimal PCR conditions may be used. Factors such as annealing and extension temperature and time may be optimized. In addition, a limited number of PCR cycles may be used.
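The way cycling can skew an initially uniform pool can be sketched by giving each sequence its own per-cycle efficiency and comparing relative abundances before and after amplification. The model below is an idealized assumption (constant efficiencies, no reagent limits), and the names and efficiency values are hypothetical.

```python
def relative_abundance(initial, efficiencies, cycles):
    """Return the fraction of the pool held by each sequence after cycling.

    initial: dict mapping sequence name to starting copy number.
    efficiencies: dict mapping sequence name to per-cycle efficiency."""
    final = {
        name: initial[name] * (1.0 + efficiencies[name]) ** cycles
        for name in initial
    }
    total = sum(final.values())
    return {name: count / total for name, count in final.items()}


pool = {"seq_a": 1000, "seq_b": 1000}
rates = {"seq_a": 1.0, "seq_b": 0.9}
after_30 = relative_abundance(pool, rates, 30)
```

In this example the two sequences start at equal abundance, but after 30 cycles `seq_a` makes up more than 80% of the pool, which is why fewer cycles help preserve the original distribution.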
  • a primer with base mismatches to its targeted primer binding site in the template may be used to mutate the target sequence.
  • a primer with an extra sequence on its 5' end (known as an overhang) may be used to attach a sequence to its targeted nucleic acid.
  • primers containing sequencing adapters on their 5' ends may be used to prepare and/or amplify a nucleic acid library for sequencing. Primers that target sequencing adapters may be used to amplify nucleic acid libraries to sufficient enrichment for certain sequencing technologies.
  • linear-PCR (or asymmetric-PCR) is used wherein primers only target one strand (not both strands) of a template.
  • the replicated nucleic acid from each cycle is not complementary to the primers, so the primers do not bind it. Therefore, the primers only replicate the original target template with each cycle, hence the linear (as opposed to exponential) amplification.
  • Although the amplification from linear-PCR may not be as fast as conventional (exponential) PCR, the maximal yield may be greater.
  • the primer concentration in linear-PCR may not become a limiting factor with increased cycles and increased yield as it would with conventional PCR.
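The linear growth follows from the fact that only the original templates are copied in each cycle. A minimal sketch of that idealized model is given below; the function name and the constant-efficiency assumption are illustrative.

```python
def linear_pcr_copies(template_copies, cycles, efficiency=1.0):
    """Number of newly synthesized strands after linear PCR.

    Each cycle copies only the original templates, so the product grows
    by template_copies * efficiency per cycle instead of doubling."""
    return template_copies * efficiency * cycles
```

Starting from 100 templates, 10 cycles give 1,000 product strands, compared with roughly 100,000 molecules for ideal exponential amplification over the same number of cycles.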
  • Linear-After-The-Exponential-PCR (or LATE-PCR) is a modified version of linear-PCR that may be capable of particularly high yields.
  • In some implementations of nucleic acid amplification, the process of melting, annealing, and extension may occur at a single temperature.
  • Such PCR may be referred to as isothermal PCR.
  • Isothermal PCR may leverage temperature-independent methods for dissociating or displacing the fully-complemented strands of nucleic acids from each other in favor of primer binding. Strategies include loop-mediated isothermal amplification, strand displacement amplification, helicase-dependent amplification, and nicking enzyme amplification reaction.
  • Isothermal nucleic acid amplification may occur at temperatures of at most 20, 30, 40, 50, 60, or 70 degrees Celsius or more.
  • PCR may further comprise a fluorescent probe or dye to quantify the amount of nucleic acid in a sample.
  • the dye may intercalate into double stranded nucleic acids.
  • An example of said dye is SYBR Green.
  • a fluorescent probe may also be a nucleic acid sequence attached to a fluorescent unit. The fluorescent unit may be released upon hybridization of the probe to a target nucleic acid and subsequent modification from an extending polymerase unit. Examples of said probes include TaqMan probes. Such probes may be used in conjunction with PCR and optical measurement tools (for excitation and detection) to quantify nucleic acid concentration in a sample. This process may be referred to as quantitative PCR (qPCR) or real-time PCR (rtPCR).
  • a PCR may be performed on a single-molecule template (in a process that may be referred to as single-molecule PCR), rather than on a pool of multiple template molecules.
  • emulsion-PCR (ePCR)
  • the water droplets may also contain PCR reagents, and the water droplets may be held in a temperature-controlled environment capable of requisite temperature cycling for PCR. This way, multiple self-contained PCR reactions may occur simultaneously in high throughput.
  • the stability of oil emulsions may be improved with surfactants.
  • the movement of droplets may be controlled with pressure through microfluidic channels.
  • Microfluidic devices may be used to create droplets, split droplets, merge droplets, inject material into droplets, and to incubate droplets.
  • the size of water droplets in oil emulsions may be at least 1 picoliter (pL), 10 pL, 100 pL, 1 nanoliter (nL), 10 nL, 100 nL, or more.
  • single-molecule PCR may be performed on a solid-phase substrate.
  • Examples of single-molecule PCR on a solid-phase substrate include the Illumina solid-phase amplification method or variants thereof.
  • the template pool may be exposed to a solid-phase substrate, wherein the solid phase substrate may immobilize templates at a certain spatial resolution. Bridge amplification may then occur within the spatial neighborhood of each template thereby amplifying single molecules in a high throughput fashion on the substrate.
  • High-throughput, single-molecule PCR may be useful for amplifying a pool of distinct nucleic acids that may interfere with each other. For example, if multiple distinct nucleic acids share a common sequence region, then recombination between the nucleic acids along this common region may occur during the PCR reaction, resulting in new, recombined nucleic acids. Single-molecule PCR would prevent this potential amplification error as it compartmentalizes distinct nucleic acid sequences from each other so they may not interact. Single-molecule PCR may be particularly useful for preparing nucleic acids for sequencing. Single-molecule PCR may also be useful for absolute quantitation of a number of targets within a template pool. For example, digital PCR (or dPCR) uses the frequency of distinct single-molecule PCR amplification signals to estimate the number of starting nucleic acid molecules in a sample.
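One common way to turn the frequency of positive partitions into a molecule count is a Poisson correction, which accounts for partitions that received more than one template. The text above does not specify an estimator, so the sketch below is an assumption: it supposes templates are distributed randomly and independently across equal partitions, and the function name is hypothetical.

```python
import math


def dpcr_estimate(positive_partitions, total_partitions):
    """Estimate the number of starting molecules from a digital PCR run.

    Under a Poisson model the mean number of molecules per partition is
    -ln(1 - p), where p is the fraction of partitions showing a signal."""
    if positive_partitions >= total_partitions:
        raise ValueError("every partition is positive; the sample is too concentrated to estimate")
    fraction_positive = positive_partitions / total_partitions
    mean_per_partition = -math.log(1.0 - fraction_positive)
    return mean_per_partition * total_partitions
```

For example, 100 positive partitions out of 1,000 correspond to an estimate of about 105 starting molecules rather than 100, because a few partitions are expected to hold two or more templates.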
  • a group of nucleic acids may be non-discriminately amplified using primers for primer binding sites common to all nucleic acids. For example, primers for primer binding sites flanking all nucleic acids in a pool. Synthetic nucleic acid libraries may be created or assembled with these common sites for general amplification.
  • PCR may be used to selectively amplify a targeted subset of nucleic acids from a pool. For example, by using primers with primer binding sites that only appear on said targeted subset of nucleic acids.
  • Synthetic nucleic acid libraries may be created or assembled such that nucleic acids belonging to potential sub-libraries of interest all share common primer binding sites on their edges (common within the sub-library but distinct from other sub-libraries) for selective amplification of the sub-library from the more general library.
  • PCR may be combined with nucleic acid assembly reactions (such as ligation or OEPCR) to selectively amplify fully assembled or potentially fully assembled nucleic acids from partially assembled or mis-assembled (or unintended or undesirable) by-products.
  • the assembly may involve assembling a nucleic acid with a primer binding site on each edge sequence such that only a fully assembled nucleic acid product would contain the requisite two primer binding sites for amplification.
  • a partially assembled product may contain neither or only one of the edge sequences with the primer binding sites, and therefore should not be amplified.
  • a mis-assembled (or unintended or undesirable) product may contain neither or only one of the edge sequences, or both edge sequences but in the incorrect orientation or separated by an incorrect number of bases. Therefore, said mis-assembled product should either not amplify or amplify to create a product of incorrect length.
  • the amplified mis-assembled product of incorrect length may be separated from the amplified fully assembled product of correct length by nucleic acid size selection methods, such as DNA electrophoresis in an agarose gel followed by gel extraction.
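The selection logic above can be sketched as an in-silico check that a candidate product carries both primer binding sites in the correct orientation and at the expected spacing. The example below is illustrative only: the primer sequences and function names are hypothetical, matching is exact, and only the given strand is scanned.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def amplicon_length(template, forward_primer, reverse_primer):
    """Length of the product expected from the given strand, or None if a
    primer binding site is missing or in the wrong orientation."""
    fwd_start = template.find(forward_primer)
    if fwd_start == -1:
        return None
    # The reverse primer binds the opposite strand, so its site appears on
    # this strand as its reverse complement, downstream of the forward site.
    rev_site = reverse_complement(reverse_primer)
    rev_start = template.find(rev_site, fwd_start + len(forward_primer))
    if rev_start == -1:
        return None
    return rev_start + len(rev_site) - fwd_start


FWD = "ACGTAC"  # hypothetical forward primer
REV = "TTGGCA"  # hypothetical reverse primer
full = FWD + "A" * 8 + reverse_complement(REV)
partial = FWD + "A" * 8
misassembled = FWD + "A" * 16 + reverse_complement(REV)
```

Here `amplicon_length(full, FWD, REV)` gives 20, the partial product gives `None` and so would not amplify, and the mis-assembled product gives 28, a length that size selection could distinguish from the intended 20 bases.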
  • Additives may be included in the PCR to improve the efficiency of nucleic acid amplification.
  • Additive content weight per volume may be at least 0%, 1%, 5%, 10%, 20%, or more.
  • polymerases may be used for PCR.
  • the polymerase can be naturally occurring or synthesized.
  • An example polymerase is Φ29 polymerase or a derivative thereof.
  • a transcriptase or a ligase is used (i.e., enzymes which catalyze the formation of a bond) in conjunction with polymerases or as an alternative to polymerases to construct new nucleic acid sequences.
  • examples of polymerases include a DNA polymerase, an RNA polymerase, a thermostable polymerase, a wild-type polymerase, a modified polymerase, E. coli DNA polymerase I, T7 DNA polymerase, bacteriophage T4 DNA polymerase, Φ29 (phi29) DNA polymerase, Taq polymerase, Tth polymerase, Tli polymerase, Pfu polymerase, Pwo polymerase, VENT polymerase, DEEPVENT polymerase, Ex-Taq polymerase, LA-Taq polymerase, Sso polymerase, Poc polymerase, Pab polymerase, Mth polymerase, ES4 polymerase, Tru polymerase, Tac polymerase, Tne polymerase, Tma polymerase, Tca polymerase, Tih polymerase, Tfi polymerase, Platinum Taq polymerases, Tbr polymerase, Phusion polymerase, KAPA polymerase, Q5 polymerase, Tfi polymerase, Pfuturbo polymerase, Pyrobest polymerase, KOD polymerase, Bst polymerase, Sac polymerase, Klenow fragment polymerase with 3'
  • Different polymerases may be stable and function optimally at different temperatures. Moreover, different polymerases have different properties. For example, some polymerases, such as Phusion polymerase, may exhibit 3' to 5' exonuclease activity, which may contribute to higher fidelity during nucleic acid elongation. Some polymerases may displace leading sequences during elongation, while others may degrade them or halt elongation. Some polymerases, like Taq, incorporate an adenine base at the 3' end of nucleic acid sequences.
  • Some polymerases may have higher fidelity and processivity than others and may be more suitable for PCR applications, such as sequencing preparation, where it is important for the amplified nucleic acid yield to have minimal mutations and for the distribution of distinct nucleic acids to remain uniform throughout amplification.
  • Nucleic acids of a particular size may be selected from a sample using size-selection techniques.
  • size-selection may be performed using gel electrophoresis or chromatography.
  • Liquid samples of nucleic acids may be loaded onto one terminal of a stationary phase or gel (or matrix).
  • a voltage difference may be placed across the gel such that the negative terminal of the gel is the terminal at which the nucleic acid samples are loaded and the positive terminal of the gel is the opposite terminal. Since the nucleic acids have a negatively charged phosphate backbone, they will migrate across the gel to the positive terminal. The size of the nucleic acid will determine its relative speed of migration through the gel. Therefore nucleic acids of different sizes will resolve on the gel as they migrate.
  • Voltage differences may be 100V or 120V. Voltage differences may be at most 50V, 100V, 150V, 200V, 250V, or more. Larger voltage differences may increase the speed of nucleic acid migration and size resolution. However, larger voltage differences may also damage the nucleic acids or the gel. Larger voltage differences may be recommended for resolving nucleic acids of larger sizes.
  • Typical migration times may be between 15 minutes and 60 minutes. Migration times may be at most 10 minutes, 30 minutes, 60 minutes, 90 minutes, 120 minutes, or more. Longer migration times, similar to higher voltage, may lead to better nucleic acid resolution but may lead to increased nucleic acid damage. Longer migration times may be recommended for resolving nucleic acids of larger sizes. For example, a voltage difference of 120V and a migration time of 30 minutes may be sufficient for resolving a 200-base nucleic acid from a 250-base nucleic acid.
  • the properties of the gel, or matrix may affect the size-selection process.
  • Gels typically comprise a polymer substance, such as agarose or polyacrylamide, dispersed in a conductive buffer such as TAE (Tris-acetate-EDTA) or TBE (Tris-borate-EDTA).
  • the content (weight per volume) of the substance (e.g. agarose or acrylamide) in the gel may be at most 0.5%, 1%, 2%, 3%, 5%, 10%, 15%, 20%, 25%, or higher. Higher content may decrease migration speed. Higher content may be preferable for resolving smaller nucleic acids.
  • Agarose gels may be better for resolving double stranded DNA (dsDNA).
  • Polyacrylamide gels may be better for resolving single stranded DNA (ssDNA).
  • the preferred gel composition may depend on the nucleic acid type and size, the compatibility of additives (e.g., dyes, stains, denaturing solutions, or loading buffers), as well as the anticipated downstream applications (e.g., gel extraction then ligation, PCR, or sequencing).
  • Agarose gels may be simpler for gel extraction than polyacrylamide gels.
  • TAE, though not as good a conductor as TBE, may also be better for gel extraction because borate (an enzyme inhibitor) carry-over in the extraction process may inhibit downstream enzymatic reactions.
  • Gels may further comprise a denaturing solution such as SDS (sodium dodecyl sulfate) or urea.
  • Urea may be used, for example, to denature proteins or to separate nucleic acids from potentially bound proteins.
  • Urea may be used to denature secondary structures in DNA.
  • urea may convert dsDNA into ssDNA, or urea may convert a folded ssDNA (for example a hairpin) to a non-folded ssDNA.
  • Urea-polyacrylamide gels further comprising TBE may be used for accurately resolving ssDNA.
  • Samples may be incorporated into gels with different formats.
  • gels may contain wells in which samples may be loaded manually.
  • One gel may have multiple wells for running multiple nucleic acid samples.
  • the gels may be attached to microfluidic channels that automatically load the nucleic acid sample(s).
  • Each gel may be downstream of several microfluidic channels, or the gels themselves may each occupy separate microfluidic channels.
  • the dimensions of the gel may affect the sensitivity of nucleic acid detection (or visualization). For example, thin gels or gels inside of microfluidic channels (such as in bioanalyzers or tapestations) may improve the sensitivity of nucleic acid detection.
  • the nucleic acid detection step may be important for selecting and extracting a nucleic acid fragment of the correct size.
  • a ladder may be loaded into a gel for nucleic acid size reference.
  • the ladder may contain markers of different sizes to which the nucleic acid sample may be compared. Different ladders may have different size ranges and resolutions. For example, a 50 base ladder may have markers at 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, and 600 bases. Said ladder may be useful for detecting and selecting nucleic acids within the size range of 50 to 600 bases. The ladder may also be used as a standard for estimating the concentration of nucleic acids of different sizes in a sample.
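Comparing a sample band against ladder markers can be sketched as an interpolation between the two markers that bracket the band. The example below assumes the commonly used approximation that the logarithm of fragment size varies roughly linearly with migration distance over a limited range; that approximation, the function name, and the example distances are assumptions for illustration.

```python
import math


def estimate_size(distance, ladder):
    """Estimate fragment size in bases from its migration distance.

    ladder: list of (size_in_bases, migration_distance) pairs with distinct
    distances. log10(size) is interpolated linearly between the two markers
    whose distances bracket the sample band."""
    points = sorted(ladder, key=lambda marker: marker[1])
    for (size_a, dist_a), (size_b, dist_b) in zip(points, points[1:]):
        if dist_a <= distance <= dist_b:
            t = (distance - dist_a) / (dist_b - dist_a)
            log_size = math.log10(size_a) + t * (math.log10(size_b) - math.log10(size_a))
            return 10 ** log_size
    raise ValueError("distance falls outside the ladder range")


# Hypothetical ladder: smaller fragments migrate farther (distances in mm).
ladder = [(100, 40.0), (200, 30.0), (400, 20.0)]
```

A band that lies halfway between the 400-base and 200-base markers in this hypothetical ladder is estimated at about 283 bases, the geometric mean of the two marker sizes.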
  • Nucleic acid samples and ladders may be mixed with loading buffer to facilitate the gel electrophoresis (or chromatography) process.
  • Loading buffer may contain dyes and markers to help track the migration of the nucleic acids.
  • Loading buffer may further comprise reagents (such as glycerol) that are denser than the running buffer (e.g., TAE or TBE), to ensure that nucleic acid samples sink to the bottom of the sample loading wells (which may be submerged in the running buffer).
  • Loading buffer may further comprise denaturing agents such as SDS or urea.
  • Loading buffer may further comprise reagents for improving the stability of nucleic acids.
  • loading buffer may contain EDTA to protect nucleic acids from nucleases.
  • the gel may comprise a stain that binds the nucleic acid and that may be used to optically detect nucleic acids of different sizes. Stains may be specific for dsDNA, ssDNA, or both. Different stains may be compatible with different gel substances. Some stains may require excitation from a source light (or electromagnetic wave) in order to visualize. The source light may be UV (ultraviolet) or blue light. In some implementations, stains may be added to the gel prior to electrophoresis. In other implementations, stains may be added to the gel after electrophoresis. Examples of stains include Ethidium Bromide (EtBr), SYBR Safe, SYBR Gold, silver stain, or methylene blue.
  • a reliable method for visualizing dsDNA of a certain size may be to use an agarose TAE gel with a SYBR Safe or EtBr stain.
  • a reliable method for visualizing ssDNA of a certain size may be to use a urea-polyacrylamide TBE gel with a methylene blue or silver stain.
  • the migration of nucleic acids through gels may be driven by other methods besides electrophoresis.
  • gravity, centrifugation, vacuums, or pressure may be used to drive nucleic acids through gels so that they may resolve according to their size.
  • Nucleic acids of a certain size may be extracted from gels using a blade or razor to excise the band of gel containing the nucleic acid.
  • Proper optical detection techniques and DNA ladders may be used to ensure that the excision occurs precisely at a certain band and that the excision successfully excludes nucleic acids that may belong to different, undesirable size bands.
  • the gel band may be incubated with buffer to dissolve it, thus releasing the nucleic acids into the buffer solution. Heat or physical agitation may speed the dissolution.
  • the gel band may be incubated in buffer long enough to allow diffusion of the DNA into the buffer solution without requiring gel dissolution.
  • the buffer may then be separated from the remaining solid-phase gel, for example by aspiration or centrifugation.
  • the nucleic acids may then be purified from the solution using standard purification or buffer-exchange techniques, such as phenol-chloroform extraction, ethanol precipitation, magnetic bead capture, and/or silica membrane adsorption, washing, and elution. Nucleic acids may also be concentrated in this step.
  • nucleic acids of a certain size may be separated from a gel by allowing them to run off the gel.
  • Migrating nucleic acids may pass through a basin (or well) either embedded in the gel or at the end of the gel.
  • the migration process may be timed or optically monitored such that when the nucleic acid group of a certain size enters the basin, the sample is collected from the basin. The collection may occur, for example, by aspiration.
  • the nucleic acids may then be purified from the collected solution using standard purification or buffer-exchange techniques, such as phenol-chloroform extraction, ethanol precipitation, magnetic bead capture, and/or silica membrane adsorption, washing, and elution. Nucleic acids may also be concentrated in this step.
  • Nucleic acid size selection may include mass spectrometry or membrane-based filtration.
  • In membrane-based filtration, nucleic acids are passed through a membrane (for example, a silica membrane) that may preferentially bind dsDNA, ssDNA, or both.
  • the membrane may be designed to preferentially capture nucleic acids of at least a certain size.
  • Membranes may be designed to filter out nucleic acids of less than 20, 30, 40, 50, 70, 90, or more bases. Said membrane-based size-selection techniques may not be as stringent as gel electrophoresis or chromatography.
  • Affinity-tagged nucleic acids may be used as sequence specific probes for nucleic acid capture.
  • the probe may be designed to complement a target sequence within a pool of nucleic acids. Subsequently, the probe may be incubated with the nucleic acid pool and hybridized to its target.
  • the incubation temperature may be below the melting temperature of the probe to facilitate hybridization.
  • the incubation temperature may be up to 5, 10, 15, 20, 25, or more degrees Celsius below the melting temperature of the probe.
  • the hybridized target may be captured to a solid-phase substrate that specifically binds the affinity tag.
  • the solid-phase substrate may be a membrane, a well, a column, or a bead. Multiple rounds of washing may remove all non-hybridized nucleic acids from the targets.
  • the washing may occur at a temperature below the melting temperature of the probe to facilitate stable immobilization of target sequences during the wash.
  • the washing temperature may be up to 5, 10, 15, 20, 25, or more degrees Celsius below the melting temperature of the probe.
  • A final elution step may recover the nucleic acid targets from the solid-phase substrate, as well as from the affinity-tagged probes.
  • the elution step may occur at a temperature above the melting temperature of the probe to facilitate the release of nucleic acid targets into an elution buffer.
  • the elution temperature may be up to 5, 10, 15, 20, 25, or more degrees Celsius above the melting temperature of the probe.
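The relationship between probe melting temperature and the hybridization, wash, and elution temperatures described above can be sketched as follows. This is a minimal sketch under stated assumptions: the melting temperature is estimated with the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C), which is only a rough approximation for short oligonucleotides and is not a method named in this specification. The function names, the example probe, and the 5 °C offset are hypothetical.

```python
def wallace_tm(seq):
    """Rough melting temperature (deg C) via the Wallace rule: 2*(A+T) + 4*(G+C)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc

def capture_temperatures(probe, offset_c=5):
    """Pick hybridization/wash temperatures below Tm and an elution temperature above Tm."""
    tm = wallace_tm(probe)
    return {
        "tm": tm,
        "hybridize": tm - offset_c,  # below Tm to favor probe-target hybridization
        "wash": tm - offset_c,       # below Tm so targets stay immobilized
        "elute": tm + offset_c,      # above Tm to release targets into the buffer
    }

plan = capture_temperatures("ACGTACGTACGTACGTAC")
print(plan)
```

In practice, a nearest-neighbor model with salt correction would give a better estimate of the melting temperature than this rule of thumb.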
  • the oligonucleotides bound to a solid-phase substrate may be removed from the solid-phase substrate, for example, by exposure to conditions such as acid, base, oxidation, reduction, heat, light, metal ion catalysis, displacement or elimination chemistry, or by enzymatic cleavage.
  • the oligonucleotides may be attached to a solid support through a cleavable linkage moiety.
  • the solid support may be functionalized to provide cleavable linkers for covalent attachment to the targeted oligonucleotides.
  • the linker moiety may be of six or more atoms in length.
  • the cleavable linker may be a TOPS (two oligonucleotides per synthesis) linker, an amino linker, or a photocleavable linker.
  • biotin may be used as an affinity tag that is immobilized by streptavidin on a solid-phase substrate.
  • Biotinylated oligonucleotides, for use as nucleic acid capture probes, may be designed and manufactured. Oligonucleotides may be biotinylated on the 5' or 3' end. They may also be biotinylated internally on thymine residues. Increased biotin on an oligo may lead to stronger capture on the streptavidin substrate. A biotin on the 3' end of an oligo may block the oligo from extending during PCR.
  • the biotin tag may be a variant of standard biotin.
  • The biotin variant may be biotin-TEG (triethylene glycol), dual biotin, PC biotin, DesthioBiotin-TEG, or biotin azide. Dual biotin may increase the biotin-streptavidin affinity.
  • Biotin-TEG attaches the biotin group onto a nucleic acid separated by a TEG linker. This may prevent the biotin from interfering with the function of the nucleic acid probe, for example its hybridization to the target.
  • a nucleic acid biotin linker may also be attached to the probe.
  • the nucleic acid linker may comprise nucleic acid sequences that are not intended to hybridize to the target.
  • The biotinylated nucleic acid probe may be designed with consideration for how well it may hybridize to its target. Nucleic acid probes with higher designed melting temperatures may hybridize to their targets more strongly. Longer nucleic acid probes, as well as probes with higher GC content, may hybridize more strongly due to increased melting temperatures. Nucleic acid probes may have a length of at least 5, 10, 15, 20, 30, 40, 50, or 100 bases, or more. Nucleic acid probes may have a GC content anywhere between 0 and 100%. Care may be taken to ensure that the melting temperature of the probe does not exceed the temperature tolerance of the streptavidin substrate.
  • Nucleic acid probes may be designed to avoid inhibitory secondary structures such as hairpins, homodimers, and heterodimers with off-target nucleic acids. There may be a tradeoff between probe melting temperature and off-target binding. There may be an optimal probe length and GC content at which melting temperature is high and off-target binding is low.
  • a synthetic nucleic acid library may be designed such that its nucleic acids comprise efficient probe binding sites.
  • the solid-phase streptavidin substrate may be magnetic beads. Magnetic beads may be immobilized using a magnetic strip or plate. The magnetic strip or plate may be brought into contact with a container to immobilize the magnetic beads to the container. Conversely, the magnetic strip or plate may be removed from a container to release the magnetic beads from the container wall into a solution.
  • Beads may have varying sizes. For example, beads may be anywhere between 1 and 3 micrometers (µm) in diameter. Beads may have a diameter of at most 1, 2, 3, 4, 5, 10, 15, 20, or more micrometers. Bead surfaces may be hydrophobic or hydrophilic. Beads may be coated with blocking proteins, for example BSA. Prior to use, beads may be washed or pre-treated with additives, such as blocking solution, to prevent them from non-specifically binding nucleic acids.
  • a biotinylated probe may be coupled to the magnetic streptavidin beads prior to incubation with the nucleic acid sample pool. This process may be referred to as direct capture. Alternatively, the biotinylated probe may be incubated with the nucleic acid sample pool prior to the addition of magnetic streptavidin beads. This process may be referred to as indirect capture. The indirect capture method may improve target yield. Shorter nucleic acid probes may require a shorter amount of time to couple to the magnetic beads.
  • Optimal incubation of the nucleic acid probe with the nucleic acid sample may occur at a temperature that is 1 to 10 degrees Celsius or more below the melting temperature of the probe. Incubation temperatures may be at most 5, 10, 20, 30, 40, 50, 60, 70, 80, or more degrees Celsius.
  • the recommended incubation time may be 1 hour. The incubation time may be at most 1, 5, 10, 20, 30, 60, 90, 120, or more minutes. Longer incubation times may lead to better capture efficiency.
  • An additional 10 minutes of incubation may occur after the addition of the streptavidin beads to allow biotin-streptavidin coupling. This additional time may be at most 1, 5, 10, 20, 30, 60, 90, 120, or more minutes.
  • Hybridization of the probe to its target may be improved if the nucleic acid pool is single-stranded nucleic acid (as opposed to double-stranded).
  • Preparing a ssDNA pool from a dsDNA pool may entail performing linear-PCR with one primer that commonly binds the edge of all nucleic acid sequences in the pool. If the nucleic acid pool is synthetically created or assembled, then this common primer binding site may be included in the synthetic design.
  • the product of the linear-PCR will be ssDNA. More starting ssDNA template for the nucleic acid capture may be generated with more cycles of linear-PCR.
  • The beads may be immobilized by a magnet and several rounds of washing may occur. Three to five washes may be sufficient to remove non-target nucleic acids, but more or fewer rounds of washing may be used. Each incremental wash may further decrease non-targeted nucleic acids, but it may also decrease the yield of target nucleic acids.
  • a low incubation temperature may be used. Temperatures as low as 60, 50, 40, 30, 20, 10, or 5 degrees Celsius or less may be used.
  • the washing buffer may comprise Tris buffered solution with sodium ion.
  • Optimal elution of the hybridized targets from the magnetic bead-coupled probes may occur at a temperature that is equal to or higher than the melting temperature of the probe. Higher temperatures will facilitate the dissociation of the target from the probe. Elution temperatures may be at most 30, 40, 50, 60, 70, 80, or 90 degrees Celsius, or more. Elution incubation time may be at most 1, 2, 5, 10, 30, 60 or more minutes. Typical incubation times may be approximately 5 minutes, but longer incubation times may improve yield.
  • Elution buffer may be water or tris-buffered solution with additives such as EDTA.
  • Nucleic acid capture of target sequences containing at least one or more of a set of distinct sites may be performed in one reaction with multiple distinct probes for each of those sites.
  • Nucleic acid capture of target sequences containing every member of a set of distinct sites may be performed in a series of capture reactions, one reaction for each distinct site using a probe for that particular site. The target yield after a series of capture reactions may be low, but the captured targets may subsequently be amplified with PCR. If the nucleic acid library is synthetically designed, then the targets may be designed with common primer binding sites for PCR.
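The difference between a single multi-probe reaction ("at least one site") and a series of single-probe reactions ("every site") can be sketched as a pair of filters. This is a simplified illustration, not the described chemistry: each binding site is modeled as an exact substring of the pooled strand, yield losses are ignored, and the sequences and function names are hypothetical.

```python
def capture_any(pool, sites):
    """One reaction with a probe per site: keep sequences containing at least one site."""
    return [seq for seq in pool if any(site in seq for site in sites)]

def capture_all(pool, sites):
    """Series of reactions, one probe per reaction: keep sequences containing every site."""
    captured = list(pool)
    for site in sites:
        # Each round only sees what the previous round captured.
        captured = [seq for seq in captured if site in seq]
    return captured

pool = ["AAGGCTTTCCGA", "AAGGCTTTTTTT", "TTTTTTTTCCGA", "TTTTTTTTTTTT"]
sites = ["AAGGC", "TCCGA"]

print(capture_any(pool, sites))
print(capture_all(pool, sites))
```

The serial filter also mirrors the use of two edge probe sites to isolate fully assembled products: only a sequence carrying both sites survives both rounds.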
  • Synthetic nucleic acid libraries may be created or assembled with common probe binding sites for general nucleic acid capture.
  • These common sites may be used to selectively capture fully assembled or potentially fully assembled nucleic acids from assembly reactions, thereby filtering out partially assembled or mis-assembled (or unintended or undesirable) by-products.
  • the assembly may involve assembling a nucleic acid with a probe binding site on each edge sequence such that only a fully assembled nucleic product would contain the requisite two probe binding sites necessary to pass through a series of two capture reactions using each probe.
  • a partially assembled product may contain neither or only one of the probe sites, and therefore should not ultimately be captured.
  • a mis-assembled (or unintended or undesirable) product may contain neither or only one of the edge sequences. Therefore said mis-assembled product may not ultimately be captured.
  • probe binding sites may be included on each component of an assembly.
  • A subsequent series of nucleic acid capture reactions using a probe for each component may isolate only fully assembled product (containing each component) from any by-products of the assembly reaction.
  • Subsequent PCR may improve target enrichment, and subsequent size-selection may improve target stringency.
  • Nucleic acid capture may be used to selectively capture a targeted subset of nucleic acids from a pool, for example, by using probes with binding sites that only appear on said targeted subset of nucleic acids.
  • Synthetic nucleic acid libraries may be created or assembled such that nucleic acids belonging to potential sub-libraries of interest all share common probe binding sites (common within the sub-library but distinct from other sub-libraries) for the selective capture of the sub-library from the more general library.
  • Lyophilization is a dehydration process. Both nucleic acids and enzymes may be lyophilized. Lyophilized substances may have longer lifetimes. Additives such as chemical stabilizers may be used to maintain functional products (e.g., active enzymes) through the lyophilization process. Disaccharides, such as sucrose and trehalose, may be used as chemical stabilizers.
  • sequences of nucleic acids for building synthetic libraries (e.g., identifier libraries) may be designed to avoid synthesis, sequencing, and assembly complications. Moreover, they may be designed to decrease the cost of building the synthetic library and to improve the lifetime over which the synthetic library may be stored.
  • Nucleic acids may be designed to avoid long strings of homopolymers (or repeated base sequences) that may be difficult to synthesize. Nucleic acids may be designed to avoid homopolymers of length greater than 2, 3, 4, 5, 6, 7 or more. Moreover, nucleic acids may be designed to avoid the formation of secondary structures, such as hairpin loops, that may inhibit their synthesis process. For example, predictive software may be used to generate nucleic acid sequences that do not form stable secondary structures. Nucleic acids for building synthetic libraries may be designed to be short. Longer nucleic acids may be more difficult and expensive to synthesize. Longer nucleic acids may also have a higher chance of mutations during synthesis. Nucleic acids (e.g., components) may be at most 5, 10, 15, 20, 25, 30, 40, 50, 60 or more bases.
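The homopolymer constraint described above can be checked with a short run-length scan. The following is a minimal sketch; the function names and the default limit of 3 are assumptions chosen for illustration.

```python
from itertools import groupby

def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base (0 for an empty sequence)."""
    return max((len(list(run)) for _, run in groupby(seq.upper())), default=0)

def passes_homopolymer_limit(seq, max_run=3):
    """True if no homopolymer in seq is longer than max_run bases."""
    return longest_homopolymer(seq) <= max_run

print(longest_homopolymer("ACGGGGTA"))
print(passes_homopolymer_limit("ACGGGGTA", max_run=3))
```

A candidate component that fails the check would be redesigned or discarded before synthesis.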
  • Nucleic acids to become components in an assembly reaction may be designed to facilitate that assembly reaction. Efficient assembly reactions typically involve hybridization between adjacent components. Sequences may be designed to promote these on-target hybridization events while avoiding potential off-target hybridizations. Nucleic acid base modifications, such as locked nucleic acids (LNAs), may be used to strengthen on-target hybridization. These modified nucleic acids may be used, for example, as staples in staple strand ligation or as sticky ends in sticky-strand ligation.
  • Modified bases that may be used for building synthetic nucleic acid libraries (or identifier libraries) include 2,6-Diaminopurine, 5-Bromo dU, deoxyUridine, inverted dT, inverted diDeoxy-T, Dideoxy-C, 5-Methyl dC, deoxyinosine, Super T, Super G, or 5-Nitroindole.
  • Nucleic acids may contain one or multiple of the same or different modified bases.
  • Some of the said modified bases are natural base analogs (for example, 5-Methyl dC and 2,6-Diaminopurine) that have higher melting temperatures and may therefore be useful for facilitating specific hybridization events in assembly reactions.
  • Other modified bases are universal bases (for example, 5-Nitroindole) that can bind to all natural bases and may therefore be useful for facilitating hybridization with nucleic acids that may have variable sequences within desirable binding sites.
  • these modified bases may be useful in primers (e.g., for PCR) and probes (e.g., for nucleic acid capture) as they may facilitate the specific binding of primers and probes to their target nucleic acids within a pool of nucleic acids.
  • Nucleic acids may be designed to facilitate sequencing.
  • nucleic acids may be designed to avoid typical sequencing complications such as secondary structure, stretches of homopolymers, repetitive sequences, and sequences with too high or too low of a GC content. Certain sequencers or sequencing methods may be error prone.
  • Nucleic acid sequences (or components) that make up synthetic libraries (e.g., identifier libraries) may be designed with Hamming distances of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more base mutations. Distance metrics other than Hamming distance may also be used to define a minimum requisite distance between designed nucleic acids.
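The minimum-distance requirement above can be verified for a candidate set of equal-length components as follows. This is a minimal sketch; the example sequences and function names are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(seqs):
    """Smallest Hamming distance over all pairs in the set (needs at least two sequences)."""
    return min(hamming(a, b) for a, b in combinations(seqs, 2))

components = ["ACGTAC", "TGCATG", "ACCTTC"]
print(min_pairwise_distance(components))
```

A set would be accepted only if its minimum pairwise distance meets the chosen threshold, so that sequencing errors at a few bases do not turn one component into another.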
  • Some sequencing methods and instruments may require input nucleic acids to contain particular sequences, such as adapter sequences or primer-binding sites. These sequences may be referred to as “method-specific sequences”. Typical preparatory workflows for said sequencing instruments and methods may involve assembling the method-specific sequences to the nucleic acid libraries.
  • For a synthetic nucleic acid library (e.g., identifier library), these method-specific sequences may be designed into the nucleic acids (e.g., components) that comprise the library.
  • sequencing adapters may be assembled onto the members of a synthetic nucleic acid library in the same reaction step as when the members of a synthetic nucleic acid library are themselves assembled from individual nucleic acid components.
  • Nucleic acids may be designed to avoid sequences that may facilitate DNA damage. For example, sequences containing sites for site-specific nucleases may be avoided. As another example, UVB (ultraviolet-B) light may cause adjacent thymines to form pyrimidine dimers which may then inhibit sequencing and PCR. Therefore, if a synthetic nucleic acid library is intended to be stored in an environment exposed to UVB, then it may be beneficial to design its nucleic acid sequences to avoid adjacent thymines (i.e., TT).
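A design pipeline could screen candidate sequences for damage-prone motifs such as adjacent thymines or nuclease recognition sites. The sketch below is one possible approach under stated assumptions: the motif list is illustrative (the EcoRI site GAATTC stands in for whichever site-specific nucleases are relevant), and the function name is hypothetical.

```python
def find_motifs(seq, motifs):
    """Return {motif: [start positions]} for each motif found in seq (overlapping hits included)."""
    s = seq.upper()
    hits = {}
    for motif in motifs:
        positions = [i for i in range(len(s) - len(motif) + 1) if s.startswith(motif, i)]
        if positions:
            hits[motif] = positions
    return hits

# Illustrative motifs to avoid: adjacent thymines (UVB dimer risk) and the EcoRI site.
AVOID = ["TT", "GAATTC"]

print(find_motifs("ACGAATTCGT", AVOID))
print(find_motifs("ACGTACGT", AVOID))
```

An empty result indicates that the sequence contains none of the listed motifs.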
  • the technologies described in this specification include a system for writing digital information into nucleic acid molecules, e.g., a set of identifiers as described above.
  • the system includes a number of electrical and mechanical components discussed below.
  • the system includes a main system controller including a processor and a memory.
  • the main system controller receives and processes user input and controls the devices and processes described below via a set of electrically controlled implements, including motors, pumps, valves, and/or actuators.
  • An example system 100 is shown in FIG. 1.
  • An example system 100 includes a substantially rigid substrate for receiving, storing, and/or handling nucleic acids (e.g., DNA).
  • An example substrate is a single contiguous flat ring or one of a plurality of separate annular sectors or “slices.”
  • a set of reconfigurable modules each perform a discrete action on a ring section or slice.
  • a closed-loop motion control system moves ring sections or slices through the set of (fixed in place) modules.
  • The slices are arranged in a close-packed series or spaced apart, forming a “virtual continuous web.”
  • droplets or reaction spots (or, “spots”) are printed on the substrate (write operation).
  • the droplets or reaction spots contain nucleic acids encoding digital information as described in this specification and are used for storage of the nucleic acids and/or as environment for biochemical reactions (e.g., ligation assembly, PCR amplification, etc.) described above.
  • the droplets / spots are then encapsulated or emulsified, collected, and transferred to a holding container (a collector or “scoop”).
  • the ring sections or slices are cleaned and/or sterilized allowing the process to repeat (cleaning or “Wash”).
  • Each module operates independently and communicates with a main system controller.
  • An example system 100 includes a carrier 110 configured as a rotating disc connected to a motor and a controller configured to rotate the carrier either continuously or in predefined increments (FIG. 2A).
  • the carrier 110 serves as a platform for one or more implements (e.g., mechanical and/or electronic components discussed below) for handling nucleic acids suspended in a liquid, including one or more nucleic acid substrates.
  • A carrier 110 is implemented as a large (e.g., 0.5-2 m, e.g., approximately 1.2 m diameter) rotating disk.
  • a disk offers numerous advantages over other types of motion systems, including simple reliable control (single drive motor and encoder feedback), low cost, rigidity, etc.
  • The various modules operating on the nucleic acids on a substrate are mounted in fixed positions above the rotating carrier 110.
  • The carrier 110 rotates at a certain speed (e.g., 5-500 rpm, e.g., 5-50 rpm, e.g., 10 rpm) such that a substrate or section thereof passes through all modules and completes a full revolution in, e.g., 10-600 seconds, e.g., approximately 60 seconds.
  • The carrier 110 itself is mainly a mechanical support component, e.g., to serve as the chassis for implements connecting substrates to the carrier (“plates”) and/or other required equipment (e.g., vacuum pump, etc., discussed below).
  • The carrier 110 also includes one or more electronic carrier master controllers 111 attached or otherwise connected to the carrier 110 to serve as the intermediary for communication with each plate and/or substrate, and to relay information and commands between the main system controller and a plate or substrate.
  • FIG. 2B illustrates an example configuration of a set of components of a system 100 for conveying nucleic acid droplets / reaction spots in a DNA writer system 100.
  • a set of plates 120 are mounted on a carrier 110.
  • the plates 120 are electronically connected (e.g., via cables 112) to a carrier master controller 111, which is configured to coordinate communication between plates 120 and the main system controller (not shown).
  • each plate is electronically connected to a single carrier master controller 111.
  • a plate 120 is fixedly mounted on carrier 110 and establishes a removable connection to a slice 130.
  • the plate 120 is configured as a data, power, and/or communications link between carrier 110 and the slice 130.
  • a plate 120 includes or is configured as a printed circuit board.
  • a plurality of slices 130 are arranged on a carrier 110 to form a “virtual continuous web.”
  • FIG. 3A and FIG. 3B illustrate another example configuration of a set of components of a system 100 for conveying nucleic acid droplets / reaction spots in a DNA writer system 100.
  • a substrate is configured as a single ring 530 instead of a slice.
  • the ring 530 is removably mounted on carrier 510 via ring plate 520.
  • the carrier 510 is driven by motor 540.
  • a ring plate 520 is fixedly mounted on carrier 510 and establishes a removable connection to ring 530.
  • the ring plate 520 is configured as a data-, power-, and/or communications-link between carrier 510 and the ring 530.
  • the ring 530 includes a plurality of sections configured to progress through the modules analogously to slices 130. In an implementation, each section is electronically connected to a single carrier master controller 111.
  • The ring is metal or metal-coated and has a thickness of 1-5 mm, an outer radius of about 500 mm, and an inner radius of about 400 mm.
  • a slice or ring is made of metal or plastic.
  • the material is polished to achieve a mirror-like finish.
  • the finish is added as a coating.
  • a substrate has one or more different surface chemistries/coatings (e.g., ligase, enzymes, capture probes, etc.) to support different chemical reactions and processing.
  • Slice and ring configurations can serve different applications.
  • A slice-based system would be used if the DNA is kept on reaction spots on the disk for a longer period of time (given that slices can be removed and stored in a “hotel”).
  • a ring is used for a continuous application.
  • a ring (mounted on the carrier 110) rotates continuously - the system writes, images, collects, and cleans the ring within a (single) 360-degree rotation.
  • the system prints a full single ring worth of DNA - however collection and cleaning modules would be paused during the writing to allow the reaction spots to stay on the ring for some predetermined period of time.
  • the carrier starts spinning again (or is kept spinning the entire time) and the clean and/or decontamination modules are brought back online to collect the droplets / reaction spots (this operation may be referred to as the “disk as a hotel” option).
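The two ring operating modes described above (continuous operation versus the “disk as a hotel” option) can be summarized as a small decision function that reports which modules are online. This is a hypothetical sketch: the mode names, parameters, and the assumption that writing stops once the ring is fully printed are illustrative and not specified in this document.

```python
from enum import Enum

class Mode(Enum):
    CONTINUOUS = "continuous"  # write, collect, and clean within each rotation
    HOTEL = "hotel"            # print a full ring, hold, then collect and clean

def modules_online(mode, ring_fully_written=False, hold_time_s=0.0, elapsed_hold_s=0.0):
    """Return which modules are active for the given mode and progress."""
    if mode is Mode.CONTINUOUS:
        return {"write": True, "collect": True, "clean": True}
    if not ring_fully_written:
        # Collection and cleaning are paused while the ring is being printed.
        return {"write": True, "collect": False, "clean": False}
    # After printing, the spots stay on the ring until the hold time has elapsed.
    hold_done = elapsed_hold_s >= hold_time_s
    return {"write": False, "collect": hold_done, "clean": hold_done}

print(modules_online(Mode.CONTINUOUS))
print(modules_online(Mode.HOTEL, ring_fully_written=True, hold_time_s=600, elapsed_hold_s=120))
```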
  • a slice or ring as described in this specification is configured to support and provide the necessary conditions for a number of chemical processes occurring in the reaction spots, including ligation, a combination of ligation and PCR, PCR-based component stitching, click chemistry, and/or a combination of ligation and isothermal amplification.
  • the system 100 is configured to provide real-time concentration readout of nucleic acid copy count. For example, reaction spots are imaged or samples are collected for spectrophotometry.
  • FIG. 4 illustrates an example working principle of a system 100.
  • The rotating components (e.g., the components mounted on carrier 110 or 510) pass a slice or ring section, once or repeatedly, through a printing or writing module (where liquids containing the nucleic acid molecules are deposited on a substrate), a collection module (to collect the emulsion or encapsulated nucleic acids, e.g., for collection into a vessel for off-instrument processing), and/or a cleaning module.
  • the system 100 optionally includes a sterilization or decontamination module, which is a stand-alone module or part of the cleaning module.
  • the modules can be removed and/or swapped, e.g., for maintenance, testing, or upgrades.
  • each module includes its own onboard controller, with a standardized set of electrical, communications, and/or utility connections.
  • FIG. 5 illustrates various example components of a system 100 and their respective operational characteristics and/or requirements.
  • FIG. 6 illustrates an example general communication relationship between the different components of system 100.
  • the main system controller is in electronic communication (e.g., wireless communication) with the rotating components of the system 100.
  • The main system controller exchanges data with one or more carrier master controllers (111), which are in communication with an onboard processor of a plate 120 / ring plate 520.
  • The onboard processor of a plate 120 / ring plate 520 is configured to receive data from slice 130 / ring 530 and relay the data to the carrier master controller.
  • the main system controller is in electronic communication with each non-rotating component of the system, including the writing, collection, and cleaning modules.
  • the electric/electronic connections are standardized for easy replacement or swapping of modules.
  • the main system controller is also in electric/electronic communication with ancillary systems, such as cameras, pumps, actuators, and the like.
  • FIG. 7 is a cross-sectional diagram illustrating the rotating elements of an example system 100.
  • slices 130 are removably held by plates 120, and plates 120 are mounted on the carrier 110 driven by a motor 140, which includes a motor controller connected to main system controller 300.
  • ring 530 is removably held by ring plate 520 mounted on carrier 510.
  • Mounted on or in the carrier 110/510 is at least one carrier master controller 111.
  • a carrier 110/510 holds a single carrier master controller 111.
  • a slice 130 or ring 530 is held in place on plate 120 or ring plate 520 through a pneumatic/suction mechanism including an onboard vacuum header, distribution lines, and/or valves, which are controlled by the carrier master controller 111.
  • the system 100 includes a vacuum pump 150 mounted on the carrier, or an external vacuum pump and vacuum distribution header (as required for suction cups to retain slices or rings).
  • FIG. 8 illustrates example electronic communication relationships between various rotating elements.
  • each of the plates 120 / ring plate 520 has an onboard microcontroller in electronic communication with carrier master controller 111.
  • Given that the carrier 110/510 is rotating, a hardwired connection 143 between carrier master controller 111 and main system controller 300 is established using a slip ring noise-rated for data transfer.
  • Wireless communication (e.g., Bluetooth) may be used instead.
  • The communication between carrier master controller 111 and main system controller 300 includes transmitting information from plates 120 / ring plate 520 to main system controller 300, e.g., information that the slices 130 / ring 530 are loaded or unloaded, together with their serial numbers for logging, as well as transmitting information from main system controller 300 to plates 120 / ring plate 520, e.g., information relating to timing for slice release and/or temperature settings.
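The two directions of traffic described above can be pictured as two message types relayed by the carrier master controller. The sketch below is purely illustrative: the message classes, field names, and JSON encoding are assumptions, since this specification does not define a message format.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class SliceStatus:
    """Hypothetical plate -> main system controller message."""
    plate_id: int
    slice_serial: str  # serial number reported for logging
    loaded: bool       # True when a slice is loaded, False when unloaded

@dataclass
class PlateCommand:
    """Hypothetical main system controller -> plate message."""
    plate_id: int
    release_at_s: Optional[float] = None   # timing for slice release
    temperature_c: Optional[float] = None  # temperature setting

def encode(msg):
    """Serialize a message to JSON, tagging it with its type name."""
    return json.dumps({"type": type(msg).__name__, **asdict(msg)})

print(encode(SliceStatus(plate_id=3, slice_serial="SL-0001", loaded=True)))
print(encode(PlateCommand(plate_id=3, temperature_c=37.0)))
```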
  • Slices 130 / ring 530 are generally passive - a plate 120 / ring plate 520 is connected to a slice or ring via spring connection 123, e.g., to read an integrated circuit on the slice / ring and/or to transmit power.
  • Plates 120 / ring plates 520 are electrically connected to carrier master controller 111 via two-way communication, e.g., via ribbon cable 121.
  • carrier master controller 111 is electrically connected to a vacuum gauge 151 and/or to vacuum pump 150 to control the attachment and release of slices or rings to/from plates.
  • Main system controller 300 receives and processes information from wheel encoder 141, which senses (e.g., optically or electromagnetically) the rotation of the carrier 110/510. Based on the wheel encoder data, main system controller 300 can determine the speed and/or position of carrier 110/510 and actuate motor 140/540 to move a slice 130 or section of ring 530 to a desired position.
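Encoder-based positioning of this kind can be sketched as a conversion from encoder counts to carrier angle, followed by a lookup of which slice sits under a fixed module. All parameters here are assumptions made for illustration (4096 counts per revolution, 12 equal slices, slice 0 starting at the zero mark, single-direction rotation); none are taken from this specification.

```python
def encoder_angle_deg(count, counts_per_rev):
    """Carrier angle in degrees, in [0, 360), from a cumulative encoder count."""
    return (count % counts_per_rev) * 360.0 / counts_per_rev

def rotation_to_target_deg(current_deg, target_deg):
    """Forward rotation in [0, 360) needed to bring the carrier to the target angle."""
    return (target_deg - current_deg) % 360.0

def slice_under_module(carrier_deg, module_deg, n_slices):
    """Index of the slice currently under a module fixed at module_deg.

    Assumes slice i spans [i*w, (i+1)*w) degrees on the carrier, with w = 360/n_slices.
    """
    width = 360.0 / n_slices
    relative = (module_deg - carrier_deg) % 360.0
    return int(relative // width)

angle = encoder_angle_deg(1024, counts_per_rev=4096)
print(angle)
print(rotation_to_target_deg(350.0, 10.0))
print(slice_under_module(angle, module_deg=95.0, n_slices=12))
```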
  • FIG. 9 illustrates example electric connections between various rotating elements.
  • electrical supply to the carrier 110/510 includes a slip ring, e.g., to establish electronic connection between carrier master controller 111 and main system controller 300 and/or to establish electrical connection between an external power source (not shown) and the electrically powered rotating components, e.g., the vacuum pump 150 and the carrier master controller 111.
  • a single set of power lines delivers electrical power to carrier 110/510 (e.g., at 120 V or 240 V).
  • a regulator, e.g., as part of or connected to carrier master controller 111, regulates down the voltage to other required voltages onboard the carrier 110/510 (e.g., due to complexity of rotational motion).
  • a battery and/or capacitor bank is included, e.g., on a carrier 110/510, to smooth noise/voltage fluctuations.
  • electrical power is transferred to the carrier wirelessly.
  • FIG. 10 illustrates an example configuration of a system 100 including a writing module 170, a collection module 180, and a cleaning module 190 disposed above carrier 110/510, and their respective inputs and outputs. Additionally, compressed air and vacuum implements (e.g., headers) are included for use by the modules - for example, the writing module 170 includes or is connected to an ink supply system that requires compressed air.
  • the system 100 includes a compressed air system 155 for use in the collection module 180 and/or cleaning module 190 and includes a vacuum / suction system including a vacuum pump 150a.
  • the vacuum pump 150a is the same pump as vacuum pump 150.
  • Substrates as described in this specification are devices capable of being moved within or between multiple devices or modules (e.g., different types of writers) that perform diverse physical and chemical tasks on the slice or ring, including printing/writing from diverse sources (e.g., using tips, inkjets, pipettes), as well as moving, measuring, and/or sensing substances deposited on slices or rings with multiple different modalities.
  • Slices or rings as described in this specification offer several advantages over a “consumable web” configuration.
  • a consumable web refers to a print surface of a single-use, disposable material that is consumed in a similar fashion to how a printing press would use paper: clean, unused web is fed into the input of the writer and the contaminated, used web is retrieved in the output.
  • the reusable substrates used in the technologies described in this specification utilize the concept of a “Smart Slice” or “Smart Ring,” which provides for instrumenting the slice or ring to provide additional functionality.
  • a slice or ring is or includes a silicon wafer and/or a printed circuit board.
  • slices or rings are custom PCBs comprising a top nucleic acid (e.g., DNA) printing surface (which contains a (sector-sized) print area, hydrophobic coating, fiducial or other alignment marks, and/or a product detect barcode, QR code, or similar).
  • the bottom side of the slice or ring contains various surface mount integrated circuit (IC) components and their associated traces.
  • Components / traces provide for, e.g., embedded heaters, temperature sensors, electrowetting pads, unique serial number chips, and/or vibrating/ultrasonic elements, etc.
  • each removable slice or ring is controlled by a semi-permanent device (e.g., plate 120 or ring plate 520), which is fixed to the motion control system carrier (e.g., carrier 110/510).
  • a carrier is a closed-loop moving carrier system, e.g., a rotating disk as described above, or a conveyor belt.
  • plate 120 locations are dictated by the size and number of slices 130 on the carrier 110.
  • Plates are also custom PCB-designed components such that, e.g., one plate 120 interacts with a single slice 130 at a time.
  • a ring plate 520 interacts with a section or multiple sections of ring slice 530.
  • each plate or ring plate has an embedded microcontroller that handles slice or ring control tasks. These tasks include running a low-speed Proportional-Integral-Derivative (PID) heating loop, sensing for when a slice or ring is loaded, reading each slice or ring or ring section’s unique serial number, etc.
  • plates or ring plates also serve as the mechanical fixturing component for the slice or ring (e.g., via a pneumatic or magnetic retention system, or using tapes or other mechanical means).
  • a slice or ring contains a ferromagnetic material.
  • the slice or ring is made at least in part of a ferromagnetic metal.
  • the slice or ring has a multi-layer configuration with a ferromagnetic bottom layer, a substrate layer, a reflective layer, and a hydrophobic coating layer.
  • a slice or ring (e.g., slice 130 or ring slice 530) includes minimal “smarts.”
  • a slice or ring does not have an onboard microcontroller.
  • the slice or ring includes one or more embedded resistive heating elements, which are controlled through the plate 120 or ring plate 520.
  • the heating elements are configured to ensure uniform temperature on the slice or section of ring during the nucleic acid droplet / reaction spot printing process.
  • a slice or ring includes a number, e.g., 1, 2, 3, 4, 5, 6, or more, vibration elements, which are mounted on the backside (facing the plate or ring plate), which can help emulsify and/or break free the reaction spots.
  • a slice, ring, or section of a ring includes a unique serial number IC, and/or a temperature sensing IC.
  • a slice or ring or section of a ring has no physical protrusions on or above the top surface of the slice or ring.
  • a plate (e.g., plate 120 or ring plate 520) is configured as or includes a nonremovable PCB mounted to the carrier, e.g., carrier 110/510.
  • a plate or ring plate includes an onboard microcontroller configured to, e.g., control an embedded heater on the slice or ring slice using a proportional-integral-derivative (PID) loop to pulse-width modulate (PWM) a solid-state relay (SSR) to regulate temperature.
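  • The heater control described above can be sketched as a discrete PID loop whose output is clamped to a duty cycle for slow pulse-width modulation of a solid-state relay. This is only an illustration under stated assumptions: the gains, the PWM period, the class and function names, and the absence of anti-windup handling are all hypothetical and not taken from this specification.

```python
from dataclasses import dataclass


@dataclass
class HeaterPID:
    """Discrete PID controller whose output is an SSR duty cycle in [0, 1]."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, setpoint_c: float, measured_c: float, dt_s: float) -> float:
        error = setpoint_c - measured_c
        self.integral += error * dt_s                     # integral term state
        derivative = (error - self.prev_error) / dt_s     # derivative term
        self.prev_error = error
        raw = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Clamp to a valid duty cycle (no anti-windup in this simple sketch).
        return min(1.0, max(0.0, raw))


def ssr_on_time_s(duty: float, pwm_period_s: float = 1.0) -> float:
    """Convert a duty cycle into the SSR on-time within one PWM period."""
    return duty * pwm_period_s
```

  • For example, with a proportional gain of 0.1 per degree Celsius and no integral or derivative action, a slice that is 5 degrees below its setpoint would yield a 50% duty cycle.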
  • the onboard controller of a plate or ring plate is configured to, e.g., read a temperature sensor and/or serial number of a slice or ring, and/or to perform an alignment check of the slice or ring (e.g., optically or electronically, e.g., through an electrical contact).
  • the controller of a plate or ring plate coordinates this and other information (e.g., “slice is loaded, thus this position is ready for printing”) to the carrier master controller 111.
  • a plate or ring plate includes one or more indication LEDs, e.g., for technician/operator situational awareness.
  • an onboard controller of a plate or ring plate controls a solenoid valve connected to a low pressure / vacuum line or pump (e.g., vacuum pump 150) connected to a suction system configured to (temporarily) hold a slice or ring in place.
  • an onboard controller of a plate or ring plate controls an electromagnetic device or a spring or an electric motor device, each configured to (temporarily) hold a slice or ring in place.
  • A top view schematic of an example slice 230 that can be used with the technologies described in this specification is shown in FIG. 11.
  • One or more or all features or elements illustrated for slice 230 can be applied to and used in slice 130 and ring 530.
  • One, more, or all features or elements of slice 230 can be used alone or in combination with each other.
  • the edges of slice 230 are shown as straight, but in other implementations are curved to match the radius of the carrier.
  • the example slice 230 includes a heater that is or includes a copper thermal pad or metal coil. The heater is embedded in the slice 230 or is mounted on the bottom side of slice 230.
  • Example slice 230 includes a print region 231 on which the droplets or spots containing nucleic acids can be deposited.
  • a ring 530 includes 1, 2, 3, 4, 5, 6, or more print regions, e.g., one print region for each ring section.
  • print region 231 is 160 mm x 60 mm.
  • the print region is square, rectangular, circular, or of any other flat shape.
  • the print region has a surface treatment, e.g., the region is polished and/or includes a superhydrophobic coating.
  • a slice or ring is made of magnetic grade stainless steel (430) or similar, is about 2-3 mm thick, and has a surface that is super-polished/lapped to achieve a mirror finish, which is important for optical properties (e.g., for imaging of reaction spots).
  • a line-scan camera with a telecentric lens and special light setup including blue, green, red, or white diffuse light, a collimated film light, or a telecentric light is used.
  • a surface of a slice or ring is coated with a clear hydrophobic coating, which facilitates removal of DNA reaction spots and resists DNA binding or contamination.
  • a slice or ring is coated with a (super-)hydrophobic durable coating (e.g., comprising ceramic or siloxane).
  • the surface is textured, e.g., for increased hydrophobicity (e.g., through brush polymer coating, nanotexture, and/or laser nanopatterning, e.g., to enhance printing performance).
  • the surface of a slice or ring is resistant to cleaning, decontamination, and contact, and is replaceable (e.g., if wear becomes apparent).
  • Example slice 230 includes at least one onboard temperature sensor 232.
  • a ring 530 includes 1, 2, 3, or more temperature sensors, e.g., one for each ring section.
  • Example slice 230 includes a serial number 233 or identifier (e.g., an optical identifier, e.g., a barcode or a QR code).
  • QR codes, barcodes, etc. are designed such that a slice or ring is compatible with a variety of different pieces of equipment/hardware devices.
  • Example slice 230 includes a product detection device or region 234 on the top, bottom, or both, of the slice 230.
  • the product detection device 234 is read by a sensor of a plate (e.g., plate 120, ring plate 520) to detect presence and/or proper alignment of the slice.
  • Example slice 230 includes velocity detection fiducials 235, e.g., optical markings, that are used, e.g., by a camera of system 100 to read actual velocity of the slice 230 as it moves along its path.
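  • One way to read velocity from such fiducials, shown here only as a sketch, is to divide a known fiducial spacing by the time between two consecutive fiducials passing a fixed camera line. The fiducial pitch, the timestamps, and the function name are hypothetical; the specification does not state how the camera data is converted to velocity.

```python
def slice_velocity_mm_s(fiducial_pitch_mm: float,
                        t_first_s: float, t_second_s: float) -> float:
    """Velocity from the times two consecutive fiducials cross a fixed line."""
    dt = t_second_s - t_first_s
    if dt <= 0:
        raise ValueError("timestamps must be strictly increasing")
    return fiducial_pitch_mm / dt
```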
  • the ring includes one or more of the features illustrated for example slice 230 (e.g., serial numbers, barcodes, and velocity detection fiducials) described in this specification.
  • FIG. 12A and FIG. 12B are perspective view schematics showing generalized mockups of an example slice 230 / example plate 220 mounting.
  • One or more or all features and elements illustrated for plate 220 can be applied to and used in plate 120 and ring plate 520.
  • One, more, or all features or elements of plate 220 can be used alone or in combination with each other.
  • Example plate 220 includes a power indicator LED 1201 to indicate that the plate has electrical power to perform one or more functions described herein.
  • Example plate 220 includes one or more microcontroller connections 1202 (e.g., I/O connections) and one or more spring contacts 1203, e.g., to electrically connect example plate 220 with example slice 230.
  • Example plate 220 includes one or more suction cups 1204, which are fluidically connected to a low pressure or vacuum source, e.g., vacuum pump 150. The suction cups 1204 are used to removably attach slice 230 to plate 220.
  • Example plate 220 includes spacer pins 1206.
  • Example plate 220 includes a power delivery connection 1207, e.g., to deliver power to slice 230.
  • Example plate 220 includes alignment features 1208, which are configured to provide proper alignment of slice 230 on plate 220.
  • Example plate 220 includes one or more status LEDs 1209 for optical indication of a status, e.g., faulty connections or misalignment.
  • Example plate 220 includes a manual release implement 1210 to manually release a slice 230 from plate 220, e.g., by manually suspending the vacuum/low pressure or magnetic field holding the slice 230 in place.
  • Example plate 220 includes a power input 1211 implement to receive electrical power (e.g., at 12 V) from the electrical system of system 100 to power one or more devices on plate 220 and/or slice 230.
  • FIG. 13 illustrates an example plate 220 with slice 230 mounted on an example carrier 210.
  • the carrier 210 is configured to transport plate 220/slice 230 between modules described in this specification.
  • a carrier 110/210/510 includes a number of (optional) features, all of which can be used alone or in combination with each other.
  • An example carrier 110/210/510 includes an onboard microcontroller, e.g., carrier master controller 111, that is electrically/electronically networked with each plate, e.g., plate 120, 220, or ring plate 520 (e.g., via a physical cable).
  • the carrier master controller 111 is electronically connected to main system controller 300 (e.g., wirelessly).
  • the carrier master controller 111 transmits identifying information of a slice or ring, e.g., serial number 233, e.g., read by a reader on a plate, from each slice or ring section to main system controller 300 for logging. Analogously, carrier master controller 111 also reports this information whenever a slice or ring is loaded/removed from a plate.
  • the carrier master controller 111 transmits (e.g., continuously) temperature information from each slice 230 or section of ring 530 to main system controller 300 for logging. In an implementation, temperature data is logged at a rate of between 0.1 Hz and 10 Hz, e.g., at about 0.1 Hz, about 1 Hz, or about 10 Hz.
  • the carrier master controller 111 receives temperature settings from main system controller 300 and relays these to plates 120/220 or ring plate 520.
  • the carrier master controller 111 receives “Slice Release” commands from main system controller 300 and relays these to plates 120/220 or ring plate 520, e.g., causing vacuum or magnets to be turned off and to release slice 130/230 or ring 530.
  • a manual release implement 1210 e.g., a push-button mechanism
  • the carrier master controller 111 monitors vacuum header pressure and reports any deviations/anomalies to the main system controller 300.
  • the carrier master controller 111 logs operational data, e.g., cumulative numbers of cycles of components (pumps, valves, etc.) as well as cumulative run time of continuously operating components.
  • the modules are of a similar design / have similar features regardless of operation.
  • modules have substantially the same size, shape, and/or connections, e.g., to power or data. This feature facilitates both initial system design, as well as that for future modules.
  • Example features of the modules of system 100 are shown in FIG. 14.
  • Each module has a module controller including a processor and a memory.
  • the module controller receives inputs from one or more sensors of the module (e.g., temperature or humidity) and provides control output to hardware of the module, e.g., motors, valves, pumps, and/or actuators.
  • the module controller receives input from one or more detectors of the module (e.g., cameras and/or electromagnetic detectors).
  • Example detectors are a slice / section entry detection device (e.g., a camera or a barcode reader), a slice / ring position monitor (e.g., a reflectance sensor or a camera), and/or a slice / section exit detector (e.g., a camera).
  • the module controller is in electronic communication with main system controller 300 via cable or wireless connection to receive and/or transmit data between, e.g., sensors and the main system controller 300.
  • the module includes a regulator to regulate voltage or electrical power.
  • modules are designed to be quickly swapped out without “breaking” the rest of the system.
  • each module e.g., Write, Collect, Clean
  • This system either performs (or does not perform, as applicable) an action every time a slice or section of a ring enters the module.
  • one or more actions taken by a module are determined by main system controller 300.
  • An example process implemented on the system 100 described in this specification is illustrated below:
  • An example process starts with module (e.g., Write, Collect, Clean) actions.
  • a product detect sensor registers a plate/slice or section of a ring entering the module.
  • the module queries the main system controller 300 whether it needs to perform any action and awaits the response from the main system controller 300.
  • the main system controller 300 takes the following actions.
  • the main system controller 300 receives a query from the module(s).
  • Main system controller 300 uses wheel encoder (141) data to determine which plate / ring section location has entered the module.
  • Main system controller 300 checks whether that plate is loaded with a slice (or if a ring is loaded). If affirmative, the main system controller 300 retrieves the slice’s (or ring’s or section’s) unique serial number. This information was sent over by the carrier master controller 111, e.g., during system startup.
  • the architecture of system 100 allows for performing different actions/chemistries on different nucleic acids (e.g., DNA) during the same print run. These instructions are encoded in a data file during file encoding.
  • Main system controller 300 then uses the slice / ring / section serial number to check the nucleic acids present on the slice / ring section. With this information, the main system controller 300 determines whether these nucleic acids need to be operated on by the module. For example, assume that the collection module (for collection of emulsions or encapsulated nucleic acids, described below) reports a slice or ring section entering. Suppose the main system controller 300 references the slice or ring section serial number and determines that it actually contains firing checks. Nucleic acids from this slice or ring section would not be collected. Therefore, the main system controller 300 would communicate instructions to the collection module to NOT perform any actions (and allow the firing checks to proceed to the wash module to be rinsed off). Otherwise, assuming that the nucleic acids were to be collected, the main system controller 300 would communicate instructions to the collection module to perform its action and collect the reaction spots.
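  • The decision described above can be sketched as a lookup from plate location to slice record, followed by a per-module policy check. The data structures, the serial numbers, the content labels ("data", "firing_check"), and the policy table are hypothetical and serve only to illustrate the firing-check example; they are not part of this specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SliceRecord:
    serial: str
    contents: str  # hypothetical labels: "data" or "firing_check"


# Hypothetical table, e.g., filled from carrier master controller reports at startup.
SLICE_AT_PLATE: dict[int, Optional[SliceRecord]] = {
    0: SliceRecord("SN-0001", "data"),
    1: SliceRecord("SN-0002", "firing_check"),
    2: None,  # plate with no slice loaded
}

# Assumed policy: which contents each module should act on.
MODULE_POLICY = {
    "collect": {"data"},                 # firing checks are not collected
    "clean": {"data", "firing_check"},   # everything is rinsed off
}


def should_act(module: str, plate_index: int) -> bool:
    """Answer a module query: act only if a slice is loaded and the policy allows."""
    record = SLICE_AT_PLATE.get(plate_index)
    if record is None:
        return False
    return record.contents in MODULE_POLICY[module]
```

  • With these assumed tables, a query from the collection module for plate 1 returns False, so the firing checks pass through untouched and are later handled by the cleaning step.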
  • the module receives a response from the main system controller 300.
  • the module uses one or more onboard sensors to monitor slice / ring section position as it passes through the module.
  • the module uses location trigger points relative to a slice or ring / section position to perform one or more preset series of actions, e.g.: turning pumps / valves / spray ON or OFF, printing, aspirating, or imaging.
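  • A minimal sketch of location-triggered actions follows: a sorted table of trigger positions is scanned each time a new position reading arrives, and every action whose trigger point was crossed since the previous reading is returned. The trigger positions, action names, and function name are illustrative assumptions.

```python
from bisect import bisect_right

# Hypothetical trigger table: (position along the slice in mm, action), sorted by position.
TRIGGERS = [
    (0.0, "spray_on"),
    (160.0, "spray_off"),
    (170.0, "vacuum_on"),
    (330.0, "vacuum_off"),
]


def actions_between(prev_mm: float, curr_mm: float, triggers=TRIGGERS) -> list[str]:
    """Return actions whose trigger position p satisfies prev_mm < p <= curr_mm."""
    positions = [p for p, _ in triggers]
    lo = bisect_right(positions, prev_mm)
    hi = bisect_right(positions, curr_mm)
    return [name for _, name in triggers[lo:hi]]
```

  • Using the half-open interval means each trigger fires exactly once as the slice advances, even if position readings arrive at irregular intervals.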
  • actual location data from a slice or ring section is used in the print engine to trigger each printhead (e.g., instead of encoder data).
  • the module reports to the main system controller 300 that the action sequence is complete (optionally, the main system controller logs this data). Subsequently, the slice or ring section leaves the module, and the module awaits the next slice / ring section to enter.
  • Each module only acts on a single slice or ring section at a time, and either performs or does not perform a preset sequence of actions. In an implementation, a module does not require precision timing (except the printing module).
  • the main system controller 300 logs all sensor data reported from the modules, e.g., tracking and logging the slice / ring / ring section serial number for each reaction spot, etc. Camera systems acting as quality control and feedback tools can generate large volumes of information (100s of MB/s), which are transmitted to the main system controller 300 for processing and/or storing. To qualify system operation and assess operational parameters, e.g., which are the most crucial, all data can be logged.
  • the main system controller 300 logs on which exact slice or ring section it was printed, which modules operated on it (with time stamps), collects time stamp logs of slice / ring section temperature, module temperature, etc. during the lifetime of the (reaction) spot, etc.
  • An example purpose is to monitor the system 100 performance, e.g., to perform optimization, calibration, troubleshooting, and the like, and also to reduce multiplicity.
  • the collected data can be used in the following way: for example, after sequencing, for missing identifiers, a user can trace back the overall writing / printing process to look at the exact parameters of, e.g., when any missing identifiers were printed.
  • the logged data can be used to check whether the neighbor on the slice of the missing identifier(s) was created correctly, or to check if a subsequent identifier in the identical location the next time the same slice went through the writing module was created (correctly), etc.
  • logging of all events for each module allows a user to track e.g., exact number of times each slice / ring section was printed, number of times each valve/switch/solenoid/etc. was actuated, and the like. Metrics like these combined with mean time between failures (MTBF) information from spec sheets can be used to derive maintenance and replacement schedules.
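  • As a hedged illustration of how logged actuation counts might feed a maintenance schedule, the sketch below compares cumulative cycle counts against a rated cycle life per component and lists components that are near the end of that life. The rated-life figures, the 10% threshold, and the function names are assumptions; a real schedule would rely on the actual spec-sheet data (e.g., MTBF) for each component.

```python
def remaining_life_fraction(cycles_logged: int, rated_cycles: int) -> float:
    """Fraction of the rated cycle life remaining, floored at 0."""
    return max(0.0, 1.0 - cycles_logged / rated_cycles)


def due_for_service(counters: dict[str, int], rated: dict[str, int],
                    threshold: float = 0.1) -> list[str]:
    """List components whose remaining life fraction is at or below threshold."""
    return sorted(name for name, n in counters.items()
                  if remaining_life_fraction(n, rated[name]) <= threshold)
```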
  • the system 100 includes a writing module 170 to deposit droplets of an “ink” containing oligonucleotides encoding digital information onto a surface of a slice 130 or ring 530 (FIG. 15).
  • the writing module 170 utilizes writing software designed with an emphasis on high throughput data management and manipulation.
  • the system 100 receives print files from a user/external machine in a certain format (e.g., in a bit string format to minimize file sizes).
  • the coder/decoder operates off-instrument either using an on-prem compute architecture or a virtual machine on the cloud, converting input data files into “print files” (including, e.g., terabytes / petabytes of data).
  • the CODEC operates on system 100 (e.g., on main system controller 300) either using an on-prem compute architecture or a virtual machine on the cloud, converting input data files into “print files.”
  • the main system controller 300 converts these print files into “print instructions” that are compatible with the printhead controller on the writing module 170 and the printheads.
  • the main system controller 300 receives (additional) instructions from a graphical user interface (GUI) and/or other human-machine interface (HMI).
  • the main system controller 300 receives carrier position data (e.g., from encoder 141) and information regarding ink levels and/or other information related to the ink supply (e.g., temperature, pressure, type).
  • the ink used by the system is supplied by one or more cartridges (e.g., a disposable cartridge) mounted on the writing module, or by an external tank, e.g., a refillable tank.
  • the cartridge or tank includes a sample port.
  • the cartridge or tank is fluidically connected to a cleaning / sterilization system configured to flush the cartridge or tank after use.
  • the ink is supplied to system 100 in a cartridge or tank ready to use.
  • the ink is supplied to system 100 at high concentration of oligonucleotides and is diluted (e.g., using an onboard dilution system) prior to delivery to a printhead.
  • an ink supply system includes a tubing-based spectrophotometer to measure DNA concentration during printing.
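  • Two small calculations relate to the dilution and concentration measurement described above, sketched here under stated assumptions. The first applies the standard dilution relation C1·V1 = C2·V2. The second uses the commonly cited approximation that one absorbance unit at 260 nm over a 1 cm path corresponds to roughly 33 ng/µL of single-stranded DNA; the appropriate factor depends on the oligonucleotide sequence and the instrument, and the function names and example values are hypothetical.

```python
def stock_volume_ul(stock_conc: float, target_conc: float,
                    final_volume_ul: float) -> float:
    """Stock volume V1 such that C1 * V1 = C2 * V2 (same concentration units)."""
    if target_conc > stock_conc:
        raise ValueError("cannot dilute to a higher concentration")
    return target_conc * final_volume_ul / stock_conc


def ssdna_conc_ng_per_ul(a260: float, path_length_cm: float = 1.0,
                         factor: float = 33.0) -> float:
    """Approximate ssDNA concentration from absorbance at 260 nm."""
    return a260 * factor / path_length_cm
```

  • For instance, diluting a stock at 100 units to 10 units in a final volume of 500 µL would call for 50 µL of stock and 450 µL of diluent.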
  • a writing module includes one or more printheads.
  • Each printhead of said plurality includes and/or dispenses one or more components.
  • a writing module includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more printheads.
  • each printhead comprises a different component.
  • Each printhead comprises at least one nozzle.
  • each printhead comprises a row of nozzles.
  • each printhead comprises at least 1, 2, 3, 4, or more rows of nozzles.
  • a printhead may be considered a set of nozzles each dispensing the same ink.
  • the row of nozzles dispenses the same ink.
  • a particular subset of nozzles in a row of nozzles dispense different ink from the other nozzles in said row of nozzles.
  • the row of nozzles comprises at least 20, 40, 60, 80, 100, 150, 200, 250, 300, 350, 400, or more nozzles.
  • some or all of the nozzles in a row of nozzles may be disjoint.
  • said printhead dispenses a droplet comprising said component onto a substrate, e.g., a slice or ring.
  • said printhead dispenses a droplet comprising a reaction mix onto said substrate.
  • said droplet is at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 picoliter in volume. In an implementation, said droplet is at least 10, 20, 30, 40, 50, 60, 70, or 80 picoliter in volume.
  • a writing module includes a register, a spot imager, and/or a spot dryer. In an implementation, said spot imager is a camera. In an implementation, said one or more components is in solution. In an implementation, said one or more components is a dry component. In an implementation, a droplet or reaction spot includes a reaction mixture.
  • an additive provides compatibility of one or more components with said printhead. In some implementations, an additive is a solute, a humectant, or a surfactant.
  • said reaction mixture comprises a ligase.
  • the ligase can be used to ligate different components comprising nucleic acid sequences.
  • said condition is a temperature condition.
  • said one or more components further include a dye.
  • said reaction mix comprises a dye.
  • the dye can be any nucleic acid dye.
  • the dye can be a visible dye.
  • the main system controller 300 transmits one or more of print instructions, encoder signals, and/or configuration values (e.g., printhead configurations) to one or more printhead controllers (e.g., including a prime controller card (PCC)).
  • the printhead controllers include a processor and a memory and are mounted either on the print module or are integrated into the main system controller 300.
  • the writing module 170 includes devices to align, calibrate (e.g., using a calibration jig with cameras and motors for automated calibration), and/or move printheads such that a specific printing / overprinting pattern can be obtained. Movement systems include manual or electromechanical systems, e.g., controlled via a feedback loop with positioning elements and motors. In an implementation, printheads are movably arranged on a track configured for lateral movement in a plane parallel to a slice or ring. The writing module 170 also includes an alignment feature to ensure a slice or ring section is set to a predetermined position while passing through the writing module. An example printhead arrangement / configuration is shown in FIG. 16.
  • printheads 171 are grouped and arranged in a row (left to right in the image) on a track.
  • printheads are arranged in (removable) blocks and are indexed to increase combinatorial space.
  • three printheads are mounted movably along a single track.
  • an entire array of printheads e.g., five rows of three printheads
  • the slice or ring section is moved relative to the array.
  • the relative motion of slice and printheads is in increments corresponding to the length of a printhead.
  • one of the three printheads per track is positioned over a target area, e.g., along the strip in the diagram, and dispenses bioink. Moving the slice forward and/or backward along the arrow, a target area can be reached by each of the 15 printheads.
  • combinations of ink can be deposited on each position on a slice or ring section.
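  • To illustrate the combinatorial space of the example array (five rows of three printheads), the sketch below enumerates the ink combinations that could be deposited on one target area. It assumes, purely for illustration, that every printhead holds a distinct ink and that exactly one printhead per track dispenses onto a given target area; the ink labels are hypothetical.

```python
from itertools import product

TRACKS = 5            # rows in the example array
HEADS_PER_TRACK = 3   # printheads per row in the example array

# Assumed labelling: one distinct ink per printhead, named by track and head.
inks_by_track = [[f"T{t}H{h}" for h in range(HEADS_PER_TRACK)]
                 for t in range(TRACKS)]

# Choosing one ink from every track yields one combination per target area.
combinations = list(product(*inks_by_track))
```

  • Under these assumptions the array yields 3^5 = 243 distinct combinations; allowing tracks to be skipped or several printheads per track to fire would enlarge the space further.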
  • printheads are arranged not in straight rows/columns, but in arcs following the curvature of the substrate.
  • printheads are positioned perpendicular over a slice or ring section, or are positioned at an angle.
  • reaction spots are either collected as printed or are encased / encapsulated. In the latter case, reaction spots are encased in a (temporary) shell, e.g., a polymer/lipid/hydrogel.
  • the encapsulation is performed either by the writing module 170 or by a separate encapsulation module after the slice or ring section has passed through the writing module.
  • An encapsulating agent is dispensed from a printhead of the writing module 170 or a separate dispenser.
  • a cross-linker is dispensed from a printhead of the writing module 170 or a separate dispenser.
  • the dispenser includes a reservoir and a nozzle to dispense a droplet or spray, e.g., of a cross-linker or gelation mixture.
  • a slice or ring section includes one or more elements to repel/attract DNA through application of an electric charge.
  • the ink composition contains a cross-linking agent that reacts with another material (e.g., PDMS) which is dispensed (e.g., sprayed) onto a reaction spot.
  • the ink is aqueous and the encapsulation agent is or includes an oil.
  • the encapsulation is implemented as a water-in-oil emulsion.
  • the surface of the slice or ring section is oily (e.g., aided by textured elements on the surface).
  • the encapsulation process is configured such that vesicles form in a reaction spot.
  • ion exchange is performed as part of the encapsulation process: to quench a ligation reaction, magnesium is removed, for example, using a chelator, e.g., EDTA.
  • chelating action occurs during formation of the vesicle.
  • a ligation reaction is quenched without the use of EDTA.
  • writing module 170 or a separate dispenser dispenses a liquid containing nucleases, e.g., to remove undesired oligonucleotides in the reaction spots.
  • a writing module 170 optionally includes an inspection or quality control system including a processor and a memory.
  • the (closed-loop) quality control system includes an optical system (e.g., a camera) for imaging of a print area of a slice or ring section.
  • the images are processed to determine, e.g., whether a writing process has been completed on all required spots, e.g., through detection of a fluorescent signal.
  • Output data from the inspection system is then relayed to main system controller 300 and/or a network attached storage (NAS). If the main system controller 300 determines that the writing process has been completed successfully, main system controller 300 actuates motor 140/540 to move the printed slice or ring section to the collection module 180.
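  • The completion check described above can be sketched as a comparison between the set of spot positions that were required to be written and the fluorescence measured at each position. The coordinate representation, the normalized signal values, the threshold, and the function names are assumptions for illustration only.

```python
def missing_spots(required: set[tuple[int, int]],
                  signal: dict[tuple[int, int], float],
                  threshold: float) -> set[tuple[int, int]]:
    """Return required spot positions whose signal is absent or below threshold."""
    return {xy for xy in required if signal.get(xy, 0.0) < threshold}


def write_complete(required: set[tuple[int, int]],
                   signal: dict[tuple[int, int], float],
                   threshold: float = 0.5) -> bool:
    """True when every required spot shows a signal at or above threshold."""
    return not missing_spots(required, signal, threshold)
```

  • In this sketch, a True result would correspond to the condition under which the printed slice or ring section is advanced to the collection module.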
  • the system 100 includes an example collection module 180 to collect the (final) droplets / reaction spots and prepare the polynucleotides stored therein for storage or further processing.
  • a collection module includes one or more liquid dispensers to apply liquid to a surface, e.g., sprayers or spray headers as described below.
  • deposited (aqueous) droplets or reaction spots are collected in a collection fluid that contains an oil/surfactant mix, creating an emulsion of aqueous droplets (which contain the nucleic acids) in oil.
  • the collection fluid is aqueous and contains a cross-linker to cross-link the spots and encapsulate nucleic acids in a bead.
  • the collection fluid e.g., an aqueous collection fluid
  • the collection fluid contains agents that inhibit ligation or that otherwise terminate the nucleic acids assembly upon contact.
  • Reaction spot emulsification or encapsulation and collection can be implemented as follows: first, after (final) droplet / reaction spot deposition on a slice or ring section, the reaction spots are sprayed with one or more first collection fluids.
  • a collection fluid is an oil/surfactant solution or, if applicable, a cross-linker or other encapsulation agent.
  • a larger volume of a (second) collection fluid e.g., “neat” oil spray or other collection fluid, is sprayed or otherwise deposited on the surface of the slice or ring section.
  • high-frequency vibration (or sonication) elements connected to the slice or ring energize to assist in breaking the (e.g., emulsified) reaction spots free from the surface of the slice or ring.
  • the surface is hydrophobic or superhydrophobic.
  • a low-pressure system e.g., a “vacuum scoop” aspirates the emulsion spot / oil slurry mix from the slice or ring.
  • This mix is conveyed to one or more systems for processing.
  • the mix is conveyed to a density separation system for collection and further off-instrument processing.
  • an “air blade”-type fan follows the vacuum scoop to dry out any remaining liquid on the surface of the slice or ring.
  • reaction spots are recovered using mechanical scoops or other devices that sweep the surface of the slice or ring (e.g., a wiper or “squeegee”).
  • the collection fluid is a high viscosity fluid or a fluid that can be polymerized or otherwise solidified for mechanical removal, e.g., peeling the material off the surface.
  • recovery is assisted by rotating, tilting, or flipping a slice or ring.
  • droplets / reaction spots are cooled or frozen, e.g., using liquid nitrogen, to solidify droplets / reaction spots.
  • the solidified reaction spots are aspirated or swept from the surface of the slice or ring.
  • An example workflow for an example collection module is illustrated in FIG. 17.
  • a slice or ring section moves from left to right and relative to the (fixed) elements of the collection module.
  • the slice or ring section remains stationary, and the elements of the collection module move relative to the slice or ring.
  • the operation sequence is triggered by, e.g., a position sensor in the collection module confirming the appropriate position of a slice or ring section in or under the collection module.
  • the aspirator vacuum / low pressure
  • a (first) vibration element (“Buzzer”) in a plate 120 / ring plate 520 is turned on to vibrate the slice or ring.
  • the sprayer e.g., oil/surfactant/cross-linker sprayer
  • a (second) vibration element in a plate 120 / ring plate 520 is turned on to vibrate the slice or ring, and the first vibration element is turned off.
  • a (third) vibration element in a plate 120 / ring plate 520 is turned on to vibrate the slice or ring, and the second vibration element is turned off.
  • the sprayer is then turned off.
  • the third vibration element is turned off, and the aspirator is turned off thereafter. The process is then repeated with the next slice or ring section.
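The collection sequence above can be expressed as an ordered list of on/off steps. The sketch below is an illustration under stated assumptions: the device names are hypothetical, and the steps that switch the aspirator and the sprayer on are inferred from the later steps that switch them off.

```python
# Ordered (device, turn_on) steps for one slice or ring section.
SEQUENCE = [
    ("aspirator", True),     # assumed: inferred from the later "turned off" step
    ("vibration_1", True),
    ("sprayer", True),       # assumed: inferred from the later "turned off" step
    ("vibration_2", True),
    ("vibration_1", False),
    ("vibration_3", True),
    ("vibration_2", False),
    ("sprayer", False),
    ("vibration_3", False),
    ("aspirator", False),
]


def run_sequence(section_in_position, sequence=SEQUENCE):
    """Apply the steps in order and log which devices are active after each one.

    Nothing runs unless the position sensor confirms the section is in place.
    """
    state = {device: False for device, _ in sequence}
    log = []
    if not section_in_position:
        return state, log
    for device, turn_on in sequence:
        state[device] = turn_on
        active = sorted(name for name, is_on in state.items() if is_on)
        log.append((device, turn_on, active))
    return state, log
```

Every device ends in the off state, so the same function can be called again for the next slice or ring section.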
  • a collection module includes or is connected to a quality control system.
  • This quality control system is configured to detect an amount of nucleic acid molecules in aspirated liquid and/or recycled collection fluid, e.g., using a spectrophotometer.
  • the quality control system detects and analyzes the amount of surfactant in oil, e.g., using a tensiometer. The results of this analysis are used to adjust the ratio of recycled collection fluid to new / neat collection fluid.
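One way such an adjustment could work is a simple two-stream mass balance. The sketch below assumes that the tensiometer reading has already been converted to a surfactant concentration and that a target concentration is known; neither the conversion nor the blending rule is specified in the text, and the function name is hypothetical.

```python
def recycled_fraction(c_recycled: float, c_new: float, c_target: float) -> float:
    """Fraction f of recycled fluid in the blend, from
    f * c_recycled + (1 - f) * c_new = c_target, clamped to [0, 1]."""
    if c_recycled == c_new:
        # Both streams are identical, so any ratio gives the same blend;
        # prefer recycled fluid in that case.
        return 1.0
    f = (c_target - c_new) / (c_recycled - c_new)
    return min(1.0, max(0.0, f))
```

Clamping means that when the target cannot be reached by blending, the sketch falls back to whichever single stream is closer to the target.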
  • FIG. 18 illustrates an example implementation of elements of a collection module 280.
  • the example collection module 280 includes an aspirator 281 mounted via alignment feature 283 on a sled 285.
  • the sled 285 is stationary and “glides” along the top of the slice or ring as the carrier moves the slice or ring through collection module 280.
  • the front of the sled 285 contains a liquid atomizer/sprayer(s) (not shown) for dispensing of the first and/or second collection fluid, and the rear of the sled contains the aspirator 281 (vacuum collection mechanism).
  • Collection module 380 is configured for continuous recirculating polynucleotide capture and collection.
  • the printed nucleic acids e.g., DNA in droplets / reaction spots
  • the mixture is then aspirated and collected in a tank.
  • the collected fluid comprising the nucleic acids is then filtered.
  • the purified collection fluid is recirculated for reuse.
  • the components disposed above the slice 230 or ring 530 are in a fixed position; the slice or ring moves relative to the collection module 380.
  • the slice or ring is stationary, and one or more components of the collection module 380 move relative to the surface of the ring or slice.
  • the collection module 380 includes an example coating device 1301 for coating the printed droplet / reaction spots 105 with a collection fluid 104, e.g., an emulsifier (e.g., an oil/surfactant mix as described below) or with a cross-linking agent.
  • the collection fluid 104 is applied in two steps: a first amount is applied to coat the reaction spots 105, and subsequently, a second amount is applied to provide additional amount of liquid on the slice or ring for aspiration and collection. The amount of liquid deposited is sufficient to cover the printed DNA, but not enough to spill over off the surface of the slice or ring.
  • the coating device 1301 applies the collection fluid 104 using a spray header 1310 controlled by a spray controller 1313 in electronic communication with carrier master controller 111.
  • the coating device 1301 has a liquid level sensor to detect low level of collection fluid 1312 and a sensor to detect high level of collection fluid 1311.
  • the coated droplets/reaction spots 106 are conveyed (e.g., through rotation of the carrier) to the recovery device 1302.
  • the recovery device 1302 is configured as an aspirator connected to a low pressure / vacuum source as described above (not shown) and uses an aspiration header 1320 to aspirate the coated droplets/reaction spots 106.
  • the aspiration header 1320 or the recovery device 1302, or both span the width of the slice or ring (e.g., about 100 mm).
  • the recovery device includes a reservoir or liquid trap with a fluid level sensor 1321.
  • a transfer pump 1324 conveys, continuously or intermittently, the liquid through transfer header 1325 to a filter device 1303.
  • the droplet / reaction spots 105 and/or nucleic acids (e.g., DNA) contained therein are separated from the collection fluid, e.g., using a filter 1333.
  • filtration and/or separation includes one or more of separation of the aqueous droplet / reaction spots 105 from the collection fluid 104, and collection of the polynucleotides from the droplet / reaction spots 105.
  • Filter mechanisms used in the filter device 1303 include, but are not limited to, charge-based filter capture, bead-based capture, diatomaceous earth, cellulose.
  • filter device 1303 includes means to release nucleic acids from a gel or bead or crosslinked material, e.g., through chemical degradation of the bead and/or application of a restriction enzyme to release the nucleic acid.
  • the filter device 1303 includes a separated droplet/reaction spot fluid level sensor 1331 and a collection fluid level sensor 1332.
  • the filter device 1303 is fluidically connected to a vacuum / low pressure pump 1336 and a vacuum break valve 1337.
  • the separated collection fluid 104 is conveyed from the filter device 1303 through return header 1335 back to the coating device 1301 using return pump 1334. Any losses, e.g., during filtration or due to contamination, can be replenished from a collection fluid supply reservoir 1304.
  • the fluidic system of the collection module 380 is primed with collection fluid from collection fluid supply reservoir 1304 prior to operation.
  • the collection fluid supply reservoir 1304 has a liquid level sensor to detect low level of collection fluid 1341 and a sensor to detect high level of collection fluid 1342.
  • top-up collection fluid is supplied to coating device 1301 using fill pump 1344.
  • collection fluid is supplied to aspiration header 1320 using flush pump 1345 and flush header 1346, e.g., to flush the aspiration header after the aspiration process.
  • the collection fluid 104 is filtered, e.g., using nanofiltration or reverse osmosis before being delivered to the coating device 1301.
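The low-level and high-level sensors on the coating device suggest a simple hysteresis rule for the fill pump. The following sketch assumes that rule; the text names the sensors and the pump but does not describe the control logic, and the function name is hypothetical.

```python
def fill_pump_command(low_level_tripped: bool,
                      high_level_tripped: bool,
                      pump_running: bool) -> bool:
    """Return True if the fill pump should run on the next control cycle.

    low_level_tripped:  fluid is at or below the low-level sensor.
    high_level_tripped: fluid is at or above the high-level sensor.
    """
    if high_level_tripped:
        return False         # stop before the coating device overfills
    if low_level_tripped:
        return True          # start topping up from the supply reservoir
    return pump_running      # between the marks: keep the current state
```

Holding the current state between the two marks avoids rapid on/off cycling when the level hovers near a single threshold.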
  • a separator is used to retrieve the emulsified droplets containing the nucleic acids.
  • An example separator 482 is shown in FIG. 20.
  • the separator 482 includes an emulsion collection container 1401 in fluidic connection with a conduit having a fill valve 1404 at the top and having a recovery valve 1405 at the bottom.
  • the separator 482 includes a pooler vessel 1402 in fluidic communication with the emulsion collection container 1401 via emulsion drain line 1406, oil overflow return line 1407, and pressure equalization line 1408. Pooler vessel 1402 has a fill line 1409.
  • the separator 482 includes a disposal oil container 1403 in fluidic communication with the pooler vessel 1402 via oil overflow disposal line 1410 and pressure equalization line 1411.
  • Disposal oil container 1403 is in fluidic connection with a conduit having a disposal oil drain valve 1413.
  • Pooler vessel 1402 includes two optional baffles 1412a, b to prevent disruption of the separation by oil overflow return and prevent accidental discharge to the disposal oil container 1403.
  • the pooler vessel 1402 is connected at the top to the aspiration (vacuum) system (not shown) to receive the emulsion containing the droplets / reaction spots from the slice or ring.
  • Example operation of the separator 482 is illustrated in FIGS. 21A-21E.
  • fill valve 1404 is opened to fill emulsion collection container 1401 at least in part with a low-density oil 1394 (e.g., FC-40) (FIG. 21A).
  • Low-density oil 1394 is added until the level of the oil in the pooler vessel 1402 reaches fill line 1409 (FIG. 21A).
  • emulsion containing the example reaction spots 1395 and oil-surfactant mix (e.g., HFE-7500 + Surfactant) plus collection fluid are aspirated from the slice or ring and transferred to the pooler vessel 1402 (FIG. 21B).
  • the example reaction spots 1395 are an aqueous emulsion in an oil (e.g., HFE-7500) plus a surfactant.
  • Clean HFE-7500 (with or without surfactant) is used as a collection fluid 1396.
  • the density of the low-density oil is lower than the density of water.
  • the density of water is lower than the density of the oil in the oil-surfactant-mix and collection fluid. Therefore, the emulsion (water) 1395 and oil-surfactant mix / collection fluid sink to the bottom of the pooler vessel 1402 (FIG. 21C).
  • the emulsion (water) 1395 and oil-surfactant mix / collection fluid 1396 flow down emulsion drain line 1406 into emulsion collection container 1401 (FIG. 21D).
  • An equal volume of low-density oil flows through the oil overflow return line 1407 and enters the pooler vessel 1402 through baffle 1412a.
  • the fluid level in pooler vessel 1402 increases. Once the oil level reaches the inlet to the oil overflow disposal line 1410, low-density oil flows through oil overflow disposal line 1410 into disposal oil container 1403.
  • Pressure equalization line 1411 prevents unintended buildup/overflow.
  • Baffle 1412b prevents emulsions collected in the pooler vessel from entering the oil overflow disposal line 1410.
  • After completion of one or more writing and collecting operations (e.g., after collecting reaction spots from one or more slices or ring sections), the emulsion collection container 1401 contains all the collected emulsion and thus all collected polynucleotides (FIG. 21E). The contents of emulsion collection container 1401 are then drained through recovery valve 1405 for further processing. The low-density oil in disposal oil container 1403 is drained through disposal oil drain valve 1413 and can be re-used, e.g., as collection fluid.
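The separation relies only on the density ordering stated above: low-density oil is lighter than water, and water is lighter than the oil in the oil-surfactant mix and collection fluid. The sketch below orders the phases accordingly. The numeric densities are placeholders chosen to respect that ordering, not measured values for any named fluid, and the phase names are hypothetical.

```python
def layer_order(densities: dict) -> list:
    """Return phase names from top (lowest density) to bottom (highest density)."""
    return [name for name, _ in sorted(densities.items(), key=lambda kv: kv[1])]


# Placeholder densities in g/mL; only their relative order follows the text.
example = {
    "collection_oil": 1.6,
    "low_density_oil": 0.8,
    "aqueous_emulsion": 1.0,
}
layers = layer_order(example)
# The last entries sink toward the emulsion drain line; the first stays on top
# and is displaced toward the overflow lines.
```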
  • an example system 100 has at least two collection tanks (one on service and one reserve) to support continuous print runs. Also, an example system includes conduits, valves (e.g., solenoid valves) and other required features to provide automated sequencing, e.g., on the collection module 180 or elsewhere on the system 100.
  • the collection module 180 optionally includes implements for removing and/or retaining polynucleotides.
  • collection module 180 includes membranes or resin/gel-based materials in one or more traps to hold oligonucleotides as they are recovered from the surface of a slice or ring.
  • the trap is a removable cartridge that can be (manually) collected and transferred to another device for postprocessing.
  • the system 100 includes a cleaning module 190 to reduce or eliminate cross-contamination of reaction spots after collection.
  • the slices or ring sections are rinsed with fluid (e.g., de-ionized water) to remove potential residual DNA.
  • the cleaning module 190 includes a cleaning liquid spray (e.g., DI water spray instead of an oil/surfactant spray), and a vacuum recovery system for collecting the used cleaning liquid and for transfer to a waste container.
  • the cleaning module 190 includes implements to reduce risk of oligonucleotide build-up / contamination on a slice or ring section (e.g., in a decontamination module). Such implements are implemented on a batch-wise basis or are run continuously.
  • the decontamination module is a separate module that is (permanently) mounted on system 100 in addition to cleaning module 190 or is mounted (periodically) instead of cleaning module 190.
  • the decontamination module is configured to apply a decontamination medium (e.g., solution or gel) to a slice or ring section (e.g., using a spray nozzle) and to remove the decontamination medium (e.g., using a vacuum system), and restore a slice or ring section surface to its original state.
  • the decontamination module dispenses a solution (e.g., bleach, nucleases, and/or oxidizing agent) onto a print surface, e.g., using a spray nozzle, e.g., a pneumatically driven nozzle.
  • More viscous media are either directly or indirectly dispensed onto reaction spots or the print surface using a printhead-type device.
  • the decontamination module includes a high-pressure steam nozzle aimed directly at the surface of a slice or ring section.
  • the decontamination module includes an implement for sonicating liquid on the surface of a slice or ring section.
  • decontamination module includes decontamination fluid dispensing system including a pressure washer configured to dispense decontamination fluid (e.g., a water-based fluid) at a flow rate of at least 1, 2, 3, 4, 5 or more liters/min.
  • Decontamination media are removed by one or more of the following means.
  • decontamination media are removed by spraying a rinse solution (e.g., DI water) onto the surface of a slice or ring section using a pneumatically driven spray nozzle.
  • decontamination media are removed through application of high-pressure steam directly to a surface via directed nozzle.
  • the collection module 180 can be used as described above and remove decontamination liquid from a surface.
  • decontamination media are removed by conversion of liquid to solid, e.g., using a gelation mixture.
  • decontamination module includes a solid cleaning implement to remove any solid / liquid residue.
  • the finishing of a surface of a slice or ring section is restored to its original state.
  • This restoration is achieved via one or more of the following:
  • surface finishing is restored using high pressure steam applied directly to a surface via directed nozzle.
  • surface finishing is restored using a replaceable soft, nonabrasive buffing surface moving (e.g., rotationally or linearly) directly contacting the surface of a slice or ring section.
  • the cleaning module includes a quality control system.
  • this quality control system includes UV illumination and a camera (fluorescent) to assess cleanliness prior to printing (the main system controller 300 may stop or prevent the writing process if cleanliness levels are insufficient).
  • a short-wave infrared camera is used to check if the slice or ring section is dry prior to writing.
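The two pre-write checks above can be combined into a single gate. This is a minimal sketch: the function name, the residual-fluorescence metric, and its threshold are assumptions, and the text only states that the controller may stop or prevent writing when cleanliness is insufficient.

```python
def may_start_writing(residual_fluorescence: float,
                      max_allowed_fluorescence: float,
                      surface_dry: bool) -> bool:
    """Allow writing only if the section is clean enough and dry.

    residual_fluorescence: signal from the UV-illuminated camera image.
    surface_dry: result of the short-wave infrared dryness check.
    """
    clean = residual_fluorescence <= max_allowed_fluorescence
    return clean and surface_dry
```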
  • the decontamination medium is disposed in a waste receptacle (e.g., onboard system 100) or is reused.
  • system 100 includes a recirculation system configured such that used decontamination medium is recirculated to the dispensers described above.
  • the recirculation system includes a filter or purification implement to clean liquid (e.g., aqueous) decontamination media.
  • the decontamination system optionally includes a condenser for capturing excess water in the system and to recirculate the water, e.g., to be re-used for cleaning.
  • the decontamination liquid is converted into a solid (e.g., using a gelation mixture) or is absorbed by a high-absorbency material within a reservoir.
  • An example writer device implementing the system 100 described in this specification is shown in FIGS. 22A-22C.
  • FIG. 22A shows the example device during normal writing operation.
  • the upper module of the device housing system 100 is encased in “smart glass” having changeable opacity.
  • FIG. 22B shows the device in inspection mode.
  • the smart glass is transparent providing full view of the writing, collecting, and cleaning modules and carrier of system 100.
  • FIG. 22C shows another view of the device with LCD display/HMI and an integrated print engine in the upper module.
  • Ancillary sub-systems of system 100 are housed in the lower module.
  • FIG. 23 shows an example arrangement of an array of example devices implementing system 100.
  • the software architecture of one or more systems 100 is designed to support scaling up to an array of writers deployed, e.g., in a factory.
  • One such implementation is a series of writers, each supplied (from the factory) with a 220 V / 1 kW power supply, a compressed air system (e.g., supply > 120 psi, regulated down to 100 psi on instrument), a vacuum / low pressure line, and a DI water line.
  • Example waste streams include DI water waste and oil/surfactant waste.
  • the output of the writer device would be an “emulsion slurry” or “encapsulated nucleic acid slurry”, i.e., the reaction spots after collection, which are stored on-machine and then (manually) moved to a post-processing unit.
  • Prior to operation, the writer is loaded with virgin oil/surfactant mix and nucleic acid ink, e.g., concentrated DNA inks (diluted down to writing concentration using DI water by the instrument). Writing, collecting, and cleaning are performed as described above.
  • a system for assembling an identifier nucleic acid molecule encoding digital information comprising: a substrate comprising a plurality of coordinates, the substrate removably mounted on a carrier; a writing module configured to dispense a droplet of a solution comprising a nucleic acid molecule onto one of the plurality of coordinates on the substrate; a collection module configured to collect droplets from the plurality of coordinates; a cleaning module configured to clean the substrate after collection of the droplets; and a control system configured to actuate the carrier to transfer the substrate between the writing module, the collection module, and the cleaning module.
  • A2 The system of item A1, wherein the substrate is one of a plurality of substrates, each substrate being an individually removable slice.
  • A5 The system as in any one of items A1-A4, wherein the carrier is configured as a closed-loop carrier system.
  • A11 The system as in any one of items A1-A10, wherein the substrate comprises a silicon wafer or a printed circuit board (PCB).
  • A13 The system as in any one of items A1-A12, wherein the substrate comprises a heating element or a temperature sensor, or both.
  • A14 The system of item A13, wherein the heating element comprises a copper thermal pad.
  • A15 The system as in any one of items A1-A14, wherein the substrate comprises at least one electrowetting pad.
  • A16 The system as in any one of items A1-A15, wherein the substrate is configured to perform cyclic heating and cooling.
  • A17 The system as in any one of items A1-A16, wherein the substrate comprises a chip representing or storing a unique serial number.
  • A18 The system as in any one of items A1-A17, wherein the substrate comprises one or more vibrating/ultrasonic elements.
  • A19 The system as in any one of items A1-A18, wherein the substrate has a smooth top surface.
  • A20 The system as in any one of items A1-A19, wherein the top surface comprises a hydrophobic coating.
  • A21 The system as in any one of items A1-A20, wherein the substrate comprises one or more alignment features configured to align the substrate with the carrier.
  • A22 The system as in any one of items A1-A21, wherein the carrier comprises at least one plate or ring plate configured to removably retain the substrate on the carrier.
  • A23 The system of item A22, wherein the at least one plate or ring plate comprises a PCB configured to interact with a PCB of a substrate.
  • A24 The system as in any one of items A22-A23, wherein the at least one plate or ring plate comprises a plate controller comprising processor and a memory, the plate controller being in electronic communication with the control system and configured to read or write data to the PCB of a substrate mounted on the plate or ring plate.
  • A25 The system of item A24, wherein the plate controller is in electronic communication with the control system via a slip ring connection.
  • A28 The system as in any one of items A22-A27, wherein the at least one plate or ring plate comprises a sensor for detecting the presence of a substrate mounted on the plate or ring plate.
  • A29 The system as in any one of items A22-A28, wherein the at least one plate or ring plate comprises a reading device configured to read a substrate’s unique serial number.
  • A30 The system as in any one of items A22-A29, wherein the at least one plate or ring plate comprises a mechanical system controlled by the plate controller and configured to removably mount the substrate to the plate or ring plate.
  • A31 The system of item A30, wherein the mechanical system comprises a pneumatic mounting system.
  • A32 The system of item A31, wherein the pneumatic mounting system comprises a vacuum pump in fluid communication with a suction device configured to contact a surface of the substrate to removably retain the substrate on a plate.
  • A33 The system of item A30, wherein the mechanical system comprises a magnetic mounting system.
  • A34 The system as in any one of items A22-A29, wherein the at least one plate comprises an adhesive device configured to removably mount the substrate to the plate.
  • A35 The system as in any one of items A1-A34, wherein the carrier comprises a carrier controller comprising a processor and a memory, the carrier controller being in electronic communication with the control system and configured to control rotation of the carrier.
  • each plate or ring plate comprises one or more alignment features configured to align the plate or ring plate with the substrate.
  • A37 The system as in any one of items A1-A36, wherein the writing module comprises one or more printheads configured to dispense a first liquid onto the substrate, each printhead being in electronic communication with a printhead controller.
  • A38 The system of item A37, comprising an alignment device configured to set and/or maintain a position of a substrate within the writing module.
  • A39 The system as in any one of items A37-A38, wherein the first liquid is a bioink comprising nucleic acid molecules suspended therein.
  • A40 The system as in any one of items A37-A39, wherein one or more of the one or more printheads are configured to dispense a droplet of bioink at a discrete location on the substrate.
  • A41 The system as in any one of items A37-A40, comprising: (a) a first printhead configured to dispense a first droplet of a first solution comprising a first component nucleic acid molecule onto a coordinate on the substrate; (b) a second printhead configured to dispense a second droplet of a second solution comprising a second component nucleic acid molecule onto the coordinate on the substrate, such that the first and second component nucleic acid molecules are collocated on the substrate.
  • A42 The system as in any one of items A37-A41, wherein one or more of the one or more printheads are configured to dispense a droplet of a reaction mix at a discrete location on the substrate.
  • A43 The system of item A42, wherein one or more of the one or more printheads are configured to dispense a droplet of a reaction mix on the coordinate on the substrate, such that the first and second component nucleic acid molecules and the reaction mix are collocated on the substrate.
  • reaction mix comprises one or more biochemicals for ligation or PCR.
  • A45 The system as in any one of items A41-A44, wherein one or more of the one or more printheads are configured to dispense a droplet of an oil/surfactant mix on the coordinate on the substrate, such that the first and second component nucleic acid molecules and the oil/surfactant mix are collocated on the substrate.
  • A46 The system as in any one of items A37-A45, comprising one or more dispensers configured to dispense a droplet of a reaction mix or an oil/surfactant mix on the coordinate on the substrate, such that the first and second component nucleic acid molecules and reaction mix and/or the oil/surfactant mix are collocated on the substrate.
  • A47 The system as in any one of items A37-A46, wherein the one or more printheads are grouped and movably arranged on a plurality of tracks.
  • A48 The system as in any one of items A37-A47, wherein the one or more printheads are in fluid communication with a bioink supply system.
  • A49 The system as in any one of items A37-A48, wherein the printhead controller is in electronic communication with the control system.
  • A50 The system as in any one of items A37-A49, wherein the first liquid comprises a polymer, a lipid, or a hydrogel.
  • A51 The system as in any one of items A37-A50, wherein the first liquid is configured to form one or more vesicles.
  • A52 The system as in any one of items A37-A51, wherein the first liquid is configured to form an emulsion.
  • A53 The system as in any one of items A37-A52, comprising a dispenser configured to dispense a second liquid onto the substrate.
  • A54 The system of item A53, wherein the dispenser comprises a nozzle or a printhead.
  • A55 The system as in any one of items A53-A54, wherein the first liquid, the second liquid, or both, comprise a cross-linker.
  • A56 The system as in any one of items A53-A55, wherein the second liquid comprises an encapsulation agent.
  • A57 The system as in any one of items A53-A56, wherein the first liquid, the second liquid, or both, comprise a nuclease.
  • A58 The system as in any one of items A1-A57, comprising an optical inspection system configured to image a substrate and transmit image information to the control system.
  • A59 The system as in any one of items A1-A58, wherein the collection module comprises a fluid delivery system configured to deliver a fluid to a surface of a substrate and a fluid collection system.
  • A60 The system of item A59, wherein the fluid delivery system is configured to deliver a collection fluid to the surface.
  • A61 The system of item A60, wherein the collection fluid comprises a mixture comprising oil and a surfactant.
  • A62 The system of item A60, wherein the collection fluid comprises a cross-linker.
  • A63 The system as in any one of items A41-A62, comprising:
  • a dispenser configured to dispense a droplet of collection fluid on a coordinate on the substrate, such that the droplet of bioink and the droplet of the collection fluid are collocated on the substrate.
  • A64 The system as in any one of items A63, wherein the fluid delivery system is configured to deliver an amount of collection fluid to the surface such that the collection fluid contiguously covers a plurality of droplets.
  • A65 The system as in any one of items A59-A64, wherein the collection module comprises a fluid collection system comprising a low-pressure source in fluidic communication with a suction device, the low-pressure source configured to aspirate fluid from the substrate.
  • A66 The system as in any one of items A59-A65, wherein the fluid delivery system and/or the fluid collection system are at least in part mounted on a sled movable on a track.
  • A67 The system as in any one of items A59-A66, wherein the collection module comprises a fluid separation system configured to separate an emulsion from an oil/surfactant mixture.
  • A69 The system as in any one of items A39-A68, wherein the bioink deposited on the substrate is aqueous.
  • A70 The system as in any one of items A67-A69, wherein the fluid separation system is configured to separate the emulsion from the oil/surfactant mixture by contacting the emulsion and the oil/surfactant mixture with a volume of oil that has a lower density than the emulsion and the oil/surfactant mixture.
  • A71 The system as in any one of items A59-A70, comprising a membrane or gel configured to retain oligonucleotides.
  • A72 The system of item A71, wherein the collection module comprises the membrane or gel configured to retain oligonucleotides.
  • A73 The system as in any one of items A59-A71, comprising a nanofiltration system configured to remove impurities from at least a part of the aspirated fluid.
  • A74 The system of item A73, wherein the collection module comprises the nanofiltration system configured to remove impurities from at least a part of the aspirated fluid.
  • A75 The system as in any one of items A1-A74, wherein the cleaning module comprises a cleaning fluid dispensing system and a cleaning module fluid collection system.
  • A76 The system of item A75, wherein the cleaning fluid comprises water, a detergent, or a solvent.
  • A77 The system as in any one of items A75-A76, wherein the cleaning module fluid collection system comprises a low-pressure source in fluidic communication with a suction device, configured to aspirate fluid from the substrate.
  • A78 The system as in any one of items A75-A77, wherein the cleaning module fluid collection system comprises a waste reservoir.
  • A79 The system as in any one of items A1-A78, comprising a decontamination module configured to prevent or reduce build-up of biological substance on the substrate.
  • A80 The system of item A79, comprising a decontamination fluid dispensing system and a decontamination module fluid collection system.
  • A81 The system of item A80 wherein the decontamination fluid comprises bleach, a nuclease, an oxidizing agent, a gelation agent, or a combination thereof.
  • A82 The system as in any one of items A79-A81, wherein the decontamination fluid dispensing system comprises a printhead.
  • A83 The system as in any one of items A79-A81, wherein the decontamination fluid dispensing system comprises a pressure washer configured to dispense decontamination fluid at a flow rate of at least 3 liters/min.
  • A84 The system as in any one of items A79-A83, wherein the decontamination module comprises a solid cleaning implement.
  • A86 The system as in any one of items A1-A85, wherein the identifier nucleic acid molecule is configured to represent a position and a value of a symbol in a string of symbols.
  • A87 The system as in any one of items A41-A86, wherein the identifier nucleic acid molecule is assembled from at least the first component nucleic acid molecule and the second component nucleic acid molecule.
  • a method for assembling an identifier nucleic acid molecule encoding digital information comprising: actuating, using a control system, a substrate comprising a plurality of coordinates, the substrate removably mounted on a carrier; actuating, using the control system, a writing module to dispense a droplet of a solution comprising a nucleic acid molecule onto one of the plurality of coordinates on the substrate; actuating, using the control system, a collection module to collect droplets from the plurality of coordinates; actuating, using the control system, a cleaning module to clean the substrate after collection of the droplets; and transferring the substrate between the writing module, the collection module, and the cleaning module.
  • B5 The method as in any one of items B1-B4, wherein the carrier is configured as a closed-loop carrier system.
  • B11 The method as in any one of items B1-B10, wherein the substrate comprises a silicon wafer or a printed circuit board (PCB).
  • B12 The method as in any one of items B1-B11, comprising detecting a product detect barcode or QR code configured to uniquely identify the substrate.
  • B13 The method as in any one of items B1-B12, comprising heating the substrate, measuring the temperature of the substrate, or both.
  • B15 The method as in any one of items B1-B14, wherein the substrate comprises at least one electrowetting pad.
  • B16 The method as in any one of items B1-B15, comprising performing cyclic heating and cooling.
  • B17 The method as in any one of items B1-B17, wherein the substrate comprises a chip representing or storing a unique serial number.
  • B19 The method as in any one of items B1-B18, wherein the substrate has a smooth top surface.
  • B20 The method as in any one of items B1-B19, wherein the top surface comprises a hydrophobic coating.
  • the at least one plate or ring plate comprises a plate controller comprising a processor and a memory, the plate controller being in electronic communication with the control system and configured to read or write data to the PCB of a substrate mounted on the plate or ring plate.
  • B27 The method as in any one of items B22-B26, comprising heating the at least one plate or ring plate using a heating loop and/or a temperature sensor.
  • B28 The system as in any one of items B22-B27, comprising detecting the presence of a substrate mounted on the plate or ring plate using a sensor on the at least one plate or ring plate.
  • B29 The system as in any one of items B22-B28, comprising reading a substrate’s unique serial number using a reading device on or in the at least one plate or ring plate.
  • B30 The system as in any one of items B22-B29, comprising removably mounting the substrate to the plate or ring plate on the at least one plate or ring plate using a mechanical system controlled by the plate controller.
  • B32 The system of item B31, wherein the pneumatic mounting system comprises a vacuum pump in fluid communication with a suction device configured to contact a surface of the substrate to removably retain the substrate on a plate.
  • B35 The method as in any one of items B1-B34, wherein the carrier comprises a carrier controller comprising a processor and a memory, the carrier controller being in electronic communication with the control system and configured to control rotation of the carrier.
  • each plate or ring plate comprises one or more alignment features configured to align the plate or ring plate with the substrate.
  • B37 The method as in any one of items B1-B36, comprising dispensing a first liquid onto the substrate using one or more printheads of the writing module, each printhead being in electronic communication with a printhead controller.
  • B40 The method as in any one of items B37-B39, wherein one or more of the one or more printheads dispense a droplet of bioink at a discrete location on the substrate.
  • B41 The method as in any one of items B37-B40, comprising: (a) dispensing from a first printhead a first droplet of a first solution comprising a first component nucleic acid molecule onto a coordinate on the substrate; (b) dispensing from a second printhead a second droplet of a second solution comprising a second component nucleic acid molecule onto the coordinate on the substrate, such that the first and second component nucleic acid molecules are collocated on the substrate.
  • B42 The method as in any one of items B37-B41, wherein one or more of the one or more printheads dispense a droplet of a reaction mix at a discrete location on the substrate.
  • B43 The method of item B42, wherein one or more of the one or more printheads dispense a droplet of a reaction mix on the coordinate on the substrate, such that the first and second component nucleic acid molecules and the reaction mix are collocated on the substrate.
  • B44 The method of item B43, wherein the reaction mix comprises one or more biochemicals for ligation or PCR.
  • B46 The method as in any one of items B37-B45, comprising dispensing a droplet of a reaction mix or an oil/surfactant mix on the coordinate on the substrate, such that the first and second component nucleic acid molecules and reaction mix and/or the oil/surfactant mix are collocated on the substrate.
  • B48 The method as in any one of items B37-B47, wherein the one or more printheads are in fluid communication with a bioink supply system.
  • B50 The method as in any one of items B37-B49, wherein the first liquid comprises a polymer, a lipid, or a hydrogel.
  • B51 The method as in any one of items B37-B50, wherein the first liquid is configured to form one or more vesicles.
  • B52 The method as in any one of items B37-B51, wherein the first liquid is configured to form an emulsion.
  • B53 The method as in any one of items B37-B52, comprising dispensing a second liquid onto the substrate.
  • B54 The method of item B53, comprising dispensing from a dispenser comprising a nozzle or a printhead.
  • B58 The method as in any one of items B1-B57, comprising imaging a substrate and transmitting image information to the control system.
  • B59 The method as in any one of items B1-B58, comprising delivering a fluid to a surface of a substrate and collecting fluid from the surface, using a fluid delivery system and a fluid collection system of the collection module.
  • B60 The method of item B59, comprising delivering a collection fluid to the surface.
  • B61 The method of item B60, wherein the collection fluid comprises a mixture comprising oil and a surfactant.
  • B64 The method as in any one of items B63, comprising, using the fluid delivery system, delivering an amount of collection fluid to the surface such that the collection fluid contiguously covers a plurality of droplets.
  • B66 The method as in any one of items B59-B65, wherein the fluid delivery system and/or the fluid collection system are at least in part mounted on a sled movable on a track.
  • B69 The method as in any one of items B39-B68, wherein the bioink deposited on the substrate is aqueous.
  • B70 The method as in any one of items B67-B69, comprising separating the emulsion from the oil/surfactant mixture by contacting the emulsion and the oil/surfactant mixture with a volume of oil that has a lower density than the emulsion and the oil/surfactant mixture.
  • B71 The method as in any one of items B59-B70, comprising using a membrane or gel configured to retain oligonucleotides.
  • B73 The method as in any one of items B59-B71, comprising removing impurities from at least a part of the aspirated fluid using a nanofiltration system.
  • B74 The method of item B73, wherein the collection module comprises the nanofiltration system configured to remove impurities from at least a part of the aspirated fluid.
  • B75 The method as in any one of items B1-B74, comprising dispensing a cleaning fluid on a substrate and collecting the cleaning fluid from the substrate using a cleaning fluid dispensing system and a cleaning module fluid collection system of the cleaning module.
  • B80 The method of item B79, comprising dispensing and collecting a fluid using a decontamination fluid dispensing system and a decontamination module fluid collection system of the decontamination module.
  • decontamination fluid comprises bleach, a nuclease, an oxidizing agent, a gelation agent, or a combination thereof.
  • B83 The method as in any one of items B79-B81, comprising dispensing decontamination fluid at a flow rate of at least 3 liters/min from a pressure washer of the decontamination fluid dispensing system.
  • B86 The method as in any one of items B1-B85, wherein the identifier nucleic acid molecule represents a position and a value of a symbol in a string of symbols.
  • B87 The method as in any one of items B41-B86, comprising assembling the identifier nucleic acid molecule from at least the first component nucleic acid molecule and the second component nucleic acid molecule.
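
Items B86 and B87 describe an identifier that represents the position and value of a symbol and that is assembled from a first and a second component nucleic acid molecule. The following is a minimal sketch of one way such a mapping could be pictured. The rank formula, the two-layer component library, and every name in the code (`identifier_for`, `decode`, `encode_string`, `alphabet_size`, `layer2_size`) are illustrative assumptions and are not taken from the items above.

```python
from typing import List, Sequence, Tuple

ComponentPair = Tuple[int, int]  # (first component index, second component index)


def identifier_for(position: int, value: int,
                   alphabet_size: int, layer2_size: int) -> ComponentPair:
    """Map a symbol's (position, value) to a pair of component indices.

    Assumption: each (position, value) gets a unique integer rank, and the
    rank is split into one index per component layer.
    """
    if position < 0:
        raise ValueError("position must be non-negative")
    if not 0 <= value < alphabet_size:
        raise ValueError("value outside the alphabet")
    rank = position * alphabet_size + value
    return divmod(rank, layer2_size)


def decode(pair: ComponentPair, alphabet_size: int,
           layer2_size: int) -> Tuple[int, int]:
    """Invert identifier_for: recover (position, value) from a component pair."""
    rank = pair[0] * layer2_size + pair[1]
    return divmod(rank, alphabet_size)


def encode_string(symbols: Sequence[int], alphabet_size: int,
                  layer2_size: int) -> List[ComponentPair]:
    """Return one component pair per symbol in the string."""
    return [identifier_for(p, v, alphabet_size, layer2_size)
            for p, v in enumerate(symbols)]


if __name__ == "__main__":
    pairs = encode_string([1, 0, 1, 1], alphabet_size=2, layer2_size=4)
    for pair in pairs:
        print(pair, "->", decode(pair, alphabet_size=2, layer2_size=4))
```

In this sketch each pair would name the two component solutions to be collocated at one coordinate, in the manner of item B41; how real components are chosen and joined is outside what the sketch models.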

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)

Abstract

Technologies for storing data in nucleic acids include a system for assembling an identifier nucleic acid molecule encoding digital information. The system comprises a substrate removably mounted on a carrier, a writing module configured to dispense a droplet of a solution comprising a nucleic acid molecule onto the substrate, a collection module configured to collect droplets from the substrate, a cleaning module to clean the substrate after collection of the droplets, and a control system for controlling the carrier to transfer the substrate between the writing module, the collection module, and the cleaning module.
PCT/US2024/023201 2023-04-05 2024-04-05 DNA writing device WO2024211659A1 (fr)
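
The abstract describes a control system that moves one substrate through a writing module, a collection module, and a cleaning module. The sketch below simulates that sequence in software only, as an aid to reading. The `Substrate` class, the function names, and the in-memory representation of droplets are hypothetical and do not describe the actual control system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Coordinate = Tuple[int, int]
Job = Tuple[Coordinate, str]  # (coordinate on the substrate, solution name)


@dataclass
class Substrate:
    # Solutions currently sitting at each coordinate (hypothetical model).
    droplets: Dict[Coordinate, List[str]] = field(default_factory=dict)
    is_clean: bool = True


def write(substrate: Substrate, jobs: List[Job]) -> None:
    """Writing module: dispense each solution at its coordinate."""
    for coord, solution in jobs:
        substrate.droplets.setdefault(coord, []).append(solution)
    substrate.is_clean = False


def collect(substrate: Substrate) -> List[Tuple[Coordinate, Tuple[str, ...]]]:
    """Collection module: gather the droplets from every coordinate."""
    pooled = [(coord, tuple(solutions))
              for coord, solutions in sorted(substrate.droplets.items())]
    substrate.droplets.clear()
    return pooled


def clean_substrate(substrate: Substrate) -> None:
    """Cleaning module: remove any residue and mark the substrate reusable."""
    substrate.droplets.clear()
    substrate.is_clean = True


def run_cycle(substrate: Substrate, jobs: List[Job]):
    """Visit the three modules in order, as the carrier would present them."""
    visited: List[str] = []
    pooled: List[Tuple[Coordinate, Tuple[str, ...]]] = []
    for station in ("writing", "collection", "cleaning"):
        visited.append(station)  # stands in for transferring the substrate
        if station == "writing":
            write(substrate, jobs)
        elif station == "collection":
            pooled = collect(substrate)
        else:
            clean_substrate(substrate)
    return pooled, visited


if __name__ == "__main__":
    jobs = [((0, 0), "component_1"), ((0, 0), "component_2"), ((1, 0), "component_3")]
    print(run_cycle(Substrate(), jobs))
```

Cleaning comes last in the sketch so that the substrate returns to a reusable state before the next writing pass, which matches the order given in the abstract.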

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363457271P 2023-04-05 2023-04-05
US63/457,271 2023-04-05

Publications (1)

Publication Number Publication Date
WO2024211659A1 (fr) 2024-10-10

Family

ID=90971557

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/023201 WO2024211659A1 (fr) 2023-04-05 2024-04-05 DNA writing device

Country Status (1)

Country Link
WO (1) WO2024211659A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019222562A1 (fr) * 2018-05-16 2019-11-21 Catalog Technologies, Inc. Printer-finisher system for data storage in DNA
WO2019226314A1 (fr) * 2018-05-22 2019-11-28 Microsoft Technology Licensing, Llc DNA production, storage, and access system
WO2022272068A1 (fr) * 2021-06-25 2022-12-29 Catalog Technologies, Inc. Processing methods for nucleic acid data storage

Similar Documents

Publication Publication Date Title
US11305527B2 (en) Printer-finisher system for data storage in DNA
US20230211308A1 (en) De novo synthesized gene libraries
US11867672B2 (en) Flow cell with one or more barrier features
US20240060954A1 (en) Obtaining information from a biological sample in a flow cell
Yu et al. High-throughput DNA synthesis for data storage
US20210147833A1 (en) Systems and methods for information storage and retrieval using flow cells
WO2024211659A1 (fr) DNA writing device
WO2022272068A1 (fr) Processing methods for nucleic acid data storage
WO2024086294A1 (fr) Noise reduction for data storage in DNA
CA3195364A1 (fr) Temperature-controlled fluidic reaction system
AU2023228860A1 (en) Dna microarrays and component level sequencing for nucleic acid-based data storage and processing