
CN104293940B - Method for constructing a sequencing library and application thereof - Google Patents

Method for constructing a sequencing library and application thereof

Info

Publication number
CN104293940B
Authority
CN
China
Prior art keywords
sequencing data
sequence
sequencing
chain
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410521540.4A
Other languages
Chinese (zh)
Other versions
CN104293940A (en)
Inventor
管彦芳
钱朝阳
吕小星
常连鹏
易鑫
朱红梅
杨玲
吴仁花
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TIANJIN BGI TECHNOLOGY Co Ltd
BGI Shenzhen Co Ltd
Original Assignee
TIANJIN BGI TECHNOLOGY Co Ltd
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TIANJIN BGI TECHNOLOGY Co Ltd, BGI Shenzhen Co Ltd filed Critical TIANJIN BGI TECHNOLOGY Co Ltd
Priority to CN201410521540.4A priority Critical patent/CN104293940B/en
Publication of CN104293940A publication Critical patent/CN104293940A/en
Application granted granted Critical
Publication of CN104293940B publication Critical patent/CN104293940B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1082: Preparation or screening gene libraries by chromosomal integration of polynucleotide sequences, HR-, site-specific-recombination, transposons, viral vectors
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Virology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Plant Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Disclosed are a method for constructing a sequencing library and applications thereof. The method comprises: (a) ligating an adapter to each end of a double-stranded DNA fragment to obtain a ligation product; (b) dissociating the ligation product into single-stranded DNA fragments; (c) screening the single-stranded DNA fragments with probes; (d) performing a strand extension reaction on the single-stranded DNA fragments with a first primer to obtain a strand extension product; and (e) amplifying the strand extension product to obtain an amplification product, which constitutes the sequencing library. Also disclosed are a sequencing method, a method for determining a nucleic acid sequence, a device for constructing a sequencing library, a sequencing apparatus, and a system for determining a nucleic acid sequence.

Description

Method for constructing a sequencing library and application thereof
Technical field
The present invention relates to the field of biomedicine. Specifically, it relates to a method for constructing a sequencing library, a sequencing method, a method for determining a nucleic acid sequence, a device for constructing a sequencing library, a sequencing apparatus, and a system for determining a nucleic acid sequence.
Background
High-throughput sequencing is attracting growing attention, but its ability to detect low-frequency mutations still needs improvement.
Summary of the invention
The present invention aims to solve at least one of the technical problems in the prior art. To this end, according to embodiments of the invention, a method for constructing a sequencing library and means for detecting low-frequency mutations are provided.
In a first aspect, the present invention provides a method for constructing a sequencing library. According to embodiments of the invention, the method comprises: (a) ligating an adapter to each end of a double-stranded DNA fragment to obtain a ligation product, wherein the adapter comprises a first strand and a second strand, the first strand and the second strand are partially complementary so as to define a double-stranded region and two single-stranded tails on the adapter, and the first strand comprises a first tag sequence, the first tag being contained in the sequence of one of the two single-stranded tails; (b) dissociating the ligation product into single-stranded DNA fragments; (c) screening the single-stranded DNA fragments with probes, wherein the probes specifically recognize a predetermined region comprising one of the following: (1) at least one of the genes shown in Table 1; (2) the CDS regions of (1); and (3) regions extending at least 10 bp upstream and downstream of (2); (d) performing a strand extension reaction on the single-stranded DNA fragments with a first primer to obtain a strand extension product, wherein the first primer comprises a second tag sequence and is capable of forming a double-stranded structure with the first strand of the adapter, except that there are mismatches between the first tag sequence and the second tag sequence; and (e) amplifying the strand extension product to obtain an amplification product, which constitutes the sequencing library, wherein the amplification uses primers suitable for amplifying the first tag sequence and the second tag sequence simultaneously.
Thus, using the method for structure sequencing library according to embodiments of the present invention, sequencing library can be effectively built, Meanwhile, in constructed sequencing library, for every of identical double chain DNA fragment (also referred herein as " source sequence ") Chain, obtains the amplified production with the first sequence label and the second sequence label respectively, thus, in point of follow-up sequencing result In analysis, mutual correction can be carried out according to the sequencing result of two kinds of labels, improve the reliability of analysis result.
Embodiments in accordance with the present invention, the double chain DNA fragment is obtained through the following steps:Sample of nucleic acid is carried out End is repaired, to obtain the sample of nucleic acid by reparation;And base A is added in 5 ' ends of the sample of nucleic acid, so as to Obtain two ends has cohesive end base A sample of nucleic acid respectively, and the two ends have cohesive end base A nucleic acid sample respectively This composition double chain DNA fragment.Thus, it is possible in subsequent operation, easily be added at the two ends of the double chain DNA fragment Joint.So as to improve the efficiency for building sequencing library.
Embodiments in accordance with the present invention, the sample of nucleic acid is at least a portion or free nucleic acid of human gene group DNA.Root According to embodiments of the invention, people's free nucleic acid is extracted from the peripheral blood of patient.Embodiments in accordance with the present invention, it is described Patient suffers from colorectal cancer.Thus, using the method for the embodiment of the present invention, effectively the gene of human patient can be dashed forward Change is effectively analyzed, and then can be effective for the early diagnosis of colorectal cancer, personalized medicine and postoperative monitoring etc..
Embodiments in accordance with the present invention, at least a portion of the human gene group DNA is by being carried out to human gene group DNA Interrupt and obtain at random.Thus, it is possible in subsequent operation, easily add joint at the two ends of the double chain DNA fragment. So as to improve the efficiency for building sequencing library.
Embodiments in accordance with the present invention, the joint has 3 ' base T cohesive ends.Thus, it is possible in subsequent operation, Easily joint is added at the two ends of the double chain DNA fragment.So as to improve the efficiency for building sequencing library.
Embodiments in accordance with the present invention, the Single-stranded DNA fragments are by the way that connection product progress denaturation treatment is obtained .Thus, it is possible to fast and effectively obtain Single-stranded DNA fragments.According to some embodiments of the present invention, the denaturation treatment can Think thermal denaturation processing or alkaline denaturation processing.
Embodiments in accordance with the present invention, the probe is provided in the form of chip.Thus, it is possible to improve probe screening Efficiency.
Embodiments in accordance with the present invention, when there is UDG enzymes/FPG enzymes, carry out the chain extension reaction.Thus, it is possible to have Effect ground is repaired to the DNA that there is damage during chain extension, reduces the generation of false positive, is improved and is built sequencing library Quality.
According to embodiments of the invention, the first tag sequence and the second tag sequence are each independently 4 to 10 nt in length. According to embodiments of the invention, the first tag sequence and the second tag sequence are each 8 nt in length. According to embodiments of the invention, there are at least 2 nt of mismatch between the first tag sequence and the second tag sequence. The inventors unexpectedly found that this arrangement effectively improves the efficiency of correction based on the first tag sequence and the second tag sequence in subsequent analysis.
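The length and mismatch constraints on a tag pair can be checked mechanically. The following is a minimal sketch, not part of the patent: the function name `is_valid_tag_pair` and the example tag sequences are hypothetical (the actual tags are given by SEQ ID NO: 3-10, which are not reproduced here), and it assumes the two tags have equal length so that mismatches can be counted position by position.

```python
def is_valid_tag_pair(tag1: str, tag2: str,
                      min_len: int = 4, max_len: int = 10,
                      min_mismatch: int = 2) -> bool:
    """Check a (first tag, second tag) pair against the stated constraints.

    Assumption (not stated in the source): both tags have the same length,
    so mismatches are counted as a position-wise Hamming distance.
    """
    if len(tag1) != len(tag2):
        return False
    if not (min_len <= len(tag1) <= max_len):
        return False
    # Count positions where the two tags differ.
    mismatches = sum(a != b for a, b in zip(tag1, tag2))
    return mismatches >= min_mismatch


# Hypothetical 8-nt tags differing at the last two positions.
print(is_valid_tag_pair("ACGTACGT", "ACGTACCA"))
# Hypothetical 8-nt tags differing at only one position.
print(is_valid_tag_pair("ACGTACGT", "ACGTACGA"))
```

Requiring at least two mismatches means that a single sequencing error in a tag cannot turn one tag into the other, which is presumably why the constraint helps downstream correction.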
According to embodiments of the invention, the first strand of the adapter has the sequence shown in SEQ ID NO: 1, the second strand of the adapter has the sequence shown in SEQ ID NO: 2, the first tag has the sequence shown in any one of SEQ ID NO: 3-6, the second tag has the sequence shown in at least one of SEQ ID NO: 7-10, the first primer has the sequence shown in SEQ ID NO: 11, and the primers suitable for amplifying the first tag sequence and the second tag sequence simultaneously have the sequences shown in SEQ ID NO: 12 and SEQ ID NO: 13.
Here, "XXXXXXXX" in the sequence of the first strand of the adapter represents the first tag sequence, and "XXXXXXXX" in the sequence of the first primer represents the second tag sequence.
According to embodiments of the invention, the tags include but are not limited to the 4 pairs described above; further tag pairs can be designed as needed for the simultaneous detection of multiple samples.
In a second aspect, the present invention provides a sequencing method, comprising: constructing a sequencing library according to the foregoing method; and sequencing the sequencing library.
According to embodiments of the invention, the sequencing is carried out on a Hiseq2000 or Hiseq2500. The efficiency of sequencing can thus be improved effectively. In addition, the features and advantages described above for the method for constructing a sequencing library apply equally to this sequencing method and are not repeated here.
In a third aspect, the present invention provides a method for determining a nucleic acid sequence, comprising: sequencing a nucleic acid sample according to the foregoing method to obtain a sequencing result consisting of multiple sequencing data; building at least one sequencing data subset based on the sequencing result, wherein all sequencing data in each subset correspond to the same source sequence in the nucleic acid sample; for each sequencing data subset, determining the sequencing data corresponding to the first tag sequence to be plus-strand sequencing data and the sequencing data corresponding to the second tag sequence to be minus-strand sequencing data; for each sequencing data subset, correcting the sequencing data based on the plus-strand sequencing data and the minus-strand sequencing data to determine corrected sequencing data; and determining the sequence of the nucleic acid sample based on the corrected sequencing data. Correction can thus be carried out effectively on the basis of plus-strand and minus-strand sequencing data, which improves the reliability of the analysis.
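The strand-assignment step can be illustrated with a short sketch. It is an illustration under assumptions rather than the patented implementation: each read is assumed to arrive as a `(tag, sequence)` tuple with the tag already parsed out, the function name `split_by_strand` is hypothetical, tags are matched exactly, and reads carrying neither tag are simply dropped.

```python
from typing import Iterable, List, Tuple


def split_by_strand(reads: Iterable[Tuple[str, str]],
                    first_tag: str,
                    second_tag: str) -> Tuple[List[str], List[str]]:
    """Split one sequencing data subset into plus- and minus-strand reads.

    reads: (tag, sequence) tuples; the tag is assumed to have been
    extracted from the read beforehand (how is not specified here).
    Returns (plus_strand_reads, minus_strand_reads).
    """
    plus, minus = [], []
    for tag, seq in reads:
        if tag == first_tag:        # first tag marks the plus strand
            plus.append(seq)
        elif tag == second_tag:     # second tag marks the minus strand
            minus.append(seq)
        # Reads with any other tag are ignored in this sketch.
    return plus, minus


subset = [("ACGTACGT", "AAA"), ("ACGTACCA", "CCC"), ("TTTTTTTT", "GGG")]
plus_reads, minus_reads = split_by_strand(subset, "ACGTACGT", "ACGTACCA")
print(plus_reads, minus_reads)
```

A production pipeline would more likely tolerate a one-base error in the tag; exact matching is used here only to keep the example short.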
According to embodiments of the invention, the sequencing is paired-end sequencing, and the sequencing result consists of multiple pairs of paired sequencing data.
According to embodiments of the invention, building at least one sequencing data subset based on the sequencing result is carried out by the following steps: for each pair of the paired sequencing data, determining a paired sequencing data index, which consists of the first N bases of each member of the pair, where N is an integer from 10 to 20; building at least one preliminary sequencing data subset based on the paired sequencing data indexes, wherein all sequencing data in a preliminary subset share the same paired sequencing data index; and subdividing the at least one preliminary sequencing data subset based on the Hamming distances between the sequencing data within it, to obtain multiple sequencing data subsets.
According to embodiments of the invention, N is 12.
According to embodiments of the invention, within each of the multiple sequencing data subsets, the Hamming distance between any two pairs of paired sequencing data is no more than 20.
According to embodiments of the invention, each of the multiple sequencing data subsets contains at least two plus-strand sequencing data and at least two minus-strand sequencing data.
According to embodiments of the invention, determining the corrected sequencing data based on the plus-strand sequencing data and the minus-strand sequencing data follows this principle: each base in the corrected sequencing data is supported by at least 50% of the plus-strand sequencing data and, at the same time, by at least 50% of the minus-strand sequencing data.
According to embodiments of the invention, each base in the corrected sequencing data is supported by at least 80% of the plus-strand sequencing data and, at the same time, by at least 80% of the minus-strand sequencing data.
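The support rule can be sketched as a per-position vote. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `duplex_consensus` is hypothetical, all reads are assumed to be already oriented to the same strand and aligned position by position, and a position with no base meeting the threshold on both strands is written as `N` (the source does not say how such positions are handled).

```python
from collections import Counter
from typing import Sequence


def duplex_consensus(plus_reads: Sequence[str],
                     minus_reads: Sequence[str],
                     min_support: float = 0.5,
                     no_call: str = "N") -> str:
    """Call each base only if it is supported by at least `min_support`
    of the plus-strand reads AND at least `min_support` of the
    minus-strand reads; otherwise emit `no_call`."""
    if not plus_reads or not minus_reads:
        raise ValueError("need reads from both strands")
    # Only positions covered by every read are considered.
    length = min(len(r) for r in list(plus_reads) + list(minus_reads))
    called = []
    for i in range(length):
        plus_counts = Counter(r[i] for r in plus_reads)
        minus_counts = Counter(r[i] for r in minus_reads)
        base_call = no_call
        for base, n in plus_counts.most_common():
            plus_frac = n / len(plus_reads)
            minus_frac = minus_counts[base] / len(minus_reads)
            if plus_frac >= min_support and minus_frac >= min_support:
                base_call = base
                break
        called.append(base_call)
    return "".join(called)


plus = ["ACGT", "ACGT", "ACTT"]
minus = ["ACGT", "ACGA"]
print(duplex_consensus(plus, minus, min_support=0.5))  # ACGT
print(duplex_consensus(plus, minus, min_support=0.8))  # ACNN
```

At exactly 50% two different bases can both qualify at one position; this sketch then keeps whichever the plus-strand counter lists first, which is an arbitrary choice. The stricter 80% threshold removes that ambiguity.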
According to embodiments of the invention, the method further comprises: aligning the corrected sequencing data to a reference sequence, and deleting sequencing data whose alignment quality is below 30.
According to embodiments of the invention, the method further comprises: performing SNV analysis or Indel analysis based on the sequence of the nucleic acid sample.
In a fourth aspect, the present invention provides a device for constructing a sequencing library. According to embodiments of the invention, the device comprises: a ligation unit for ligating an adapter to each end of a double-stranded DNA fragment to obtain a ligation product, wherein the adapter comprises a first strand and a second strand, the first strand and the second strand are partially complementary so as to define a double-stranded region and two single-stranded tails on the adapter, and the first strand comprises a first tag sequence, the first tag being contained in the sequence of one of the two single-stranded tails; a dissociation unit for dissociating the ligation product into single-stranded DNA fragments; a screening unit for screening the single-stranded DNA fragments with probes before the strand extension is carried out, wherein the probes specifically recognize a predetermined region comprising one of the following: (1) at least one of the genes shown in Table 1; (2) the CDS regions of (1); and (3) regions extending at least 10 bp upstream and downstream of (2); a strand extension unit for performing a strand extension reaction on the single-stranded DNA fragments with a first primer to obtain a strand extension product, wherein the first primer comprises a second tag sequence and is capable of forming a double-stranded structure with the first strand of the adapter, except that there are mismatches between the first tag sequence and the second tag sequence; and an amplification unit for amplifying the strand extension product to obtain an amplification product, which constitutes the sequencing library, wherein the amplification uses primers suitable for amplifying the first tag sequence and the second tag sequence simultaneously.
According to embodiments of the invention, the above device can effectively implement the method for constructing a sequencing library described above and can effectively construct a sequencing library. Moreover, in the constructed library, the strands of the same double-stranded DNA fragment (also referred to herein as the "source sequence") yield amplification products carrying the first tag sequence and the second tag sequence, respectively. In the subsequent analysis of sequencing results, the sequencing results for the two tags can therefore be used to correct each other, which improves the reliability of the analysis.
According to embodiments of the invention, the device further comprises: an end repair unit for performing end repair on a nucleic acid sample to obtain an end-repaired nucleic acid sample; and an end modification unit for adding a base A to the 5′ ends of the nucleic acid sample to obtain a nucleic acid sample having a sticky-end base A at each end, which constitutes the double-stranded DNA fragment.
According to embodiments of the invention, the probes are provided in the form of a chip.
According to embodiments of the invention, the strand extension reaction is carried out in the presence of UDG and/or FPG enzymes. Damaged DNA can thus be repaired effectively during strand extension, which reduces false positives and improves the quality of the constructed sequencing library.
According to embodiments of the invention, the first tag sequence and the second tag sequence are each independently 4 to 10 nt in length.
According to embodiments of the invention, the first tag sequence and the second tag sequence are each 8 nt in length.
According to embodiments of the invention, there are at least 2 nt of mismatch between the first tag sequence and the second tag sequence.
According to embodiments of the invention, the first strand of the adapter has the sequence shown in SEQ ID NO: 1, the second strand of the adapter has the sequence shown in SEQ ID NO: 2, the first tag has the sequence shown in any one of SEQ ID NO: 3-6, the second tag has the sequence shown in at least one of SEQ ID NO: 7-10, the first primer has the sequence shown in SEQ ID NO: 11, and the primers suitable for amplifying the first tag sequence and the second tag sequence simultaneously have the sequences shown in SEQ ID NO: 12 and SEQ ID NO: 13.
According to embodiments of the invention, the tags include but are not limited to the 4 pairs described above; further tag pairs can be designed as needed for the simultaneous detection of multiple samples.
Those skilled in the art will appreciate that the features and advantages described above for the method for constructing a sequencing library apply equally to the device for constructing a sequencing library and are not repeated here.
In a fifth aspect, the present invention provides a sequencing apparatus. According to embodiments of the invention, the sequencing apparatus comprises: the foregoing device for constructing a sequencing library; and a sequencing device for sequencing the sequencing library.
The efficiency of sequencing can thus be improved effectively. In addition, the features and advantages described above for the method and device for constructing a sequencing library apply equally to this sequencing apparatus and are not repeated here.
According to embodiments of the invention, the sequencing device is a Hiseq2000 or Hiseq2500.
In a sixth aspect, the present invention provides a system for determining a nucleic acid sequence. According to embodiments of the invention, the system comprises: the foregoing sequencing apparatus, for sequencing a nucleic acid sample to obtain a sequencing result consisting of multiple sequencing data; a sequencing data subset building device, for building at least one sequencing data subset based on the sequencing result, wherein all sequencing data in each subset correspond to the same source sequence in the nucleic acid sample; a sequencing data classification device, for determining, for each sequencing data subset, the sequencing data corresponding to the first tag sequence to be plus-strand sequencing data and the sequencing data corresponding to the second tag sequence to be minus-strand sequencing data; a sequencing data correction device, for correcting, for each sequencing data subset, the sequencing data based on the plus-strand sequencing data and the minus-strand sequencing data to determine corrected sequencing data; and a sequence determination device, for determining the sequence of the nucleic acid sample based on the corrected sequencing data. Thus, the system for determining a nucleic acid sequence according to embodiments of the invention can effectively implement the method for determining a nucleic acid sequence described above, so that correction can be carried out effectively on the basis of plus-strand and minus-strand sequencing data, which improves the reliability of the analysis.
According to embodiments of the invention, the sequencing is paired-end sequencing, and the sequencing result consists of multiple pairs of paired sequencing data.
According to embodiments of the invention, the sequencing data subset building device comprises: a sequencing data index determination device, for determining, for each pair of the paired sequencing data, a paired sequencing data index, which consists of the first N bases of each member of the pair, where N is an integer from 10 to 20; a preliminary screening device, for building at least one preliminary sequencing data subset based on the paired sequencing data indexes, wherein all sequencing data in a preliminary subset share the same paired sequencing data index; and a secondary screening device, for subdividing the at least one preliminary sequencing data subset based on the Hamming distances between the sequencing data within it, to obtain multiple sequencing data subsets.
According to embodiments of the invention, N is 12.
According to embodiments of the invention, within each of the multiple sequencing data subsets, the Hamming distance between any two pairs of paired sequencing data is no more than 20.
According to embodiments of the invention, each of the multiple sequencing data subsets contains at least two plus-strand sequencing data and at least two minus-strand sequencing data.
According to embodiments of the invention, determining the corrected sequencing data based on the plus-strand sequencing data and the minus-strand sequencing data follows this principle: each base in the corrected sequencing data is supported by at least 50% of the plus-strand sequencing data and, at the same time, by at least 50% of the minus-strand sequencing data.
According to embodiments of the invention, each base in the corrected sequencing data is supported by at least 80% of the plus-strand sequencing data and, at the same time, by at least 80% of the minus-strand sequencing data.
According to embodiments of the invention, the system is further configured to align the corrected sequencing data to a reference sequence and to delete sequencing data whose alignment quality is below 30.
According to embodiments of the invention, the system further comprises a sequence analysis device for performing SNV analysis or Indel analysis based on the sequence of the nucleic acid sample.
Those skilled in the art will appreciate that the advantages and features described above for the method for determining a nucleic acid sequence apply equally to the system for determining a nucleic acid sequence and are not repeated here.
Additional aspects and advantages of the invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of embodiments taken together with the drawings, in which:
Fig. 1 shows a flow chart of a method for constructing a sequencing library according to one embodiment of the invention;
Fig. 2 shows the analysis result for clusters of reads sharing the same index, according to one embodiment of the invention; and
Fig. 3 shows the mutation spectrum analysis result according to one embodiment of the invention.
Detailed description
The invention is described below by way of specific examples. It should be noted that these examples are for illustration only and are not to be construed as limiting the invention in any way.
General methods
Unless otherwise stated, the following examples were carried out according to the general methods below:
I. Probe design
Based on the human genome HG19, the exon sequences of the relevant genes were retrieved. Considering the size of the capture region and the cost, the final chip covers only the CDS regions of these genes, each extended by 20 bp upstream and downstream. The chip carries abundant capture probes, with probe coverage of up to 98% of the target region, so that target DNA fragments can be enriched from a complex genome and genomic regions can be captured on a single chip with high specificity and high coverage.
II. Sequencing library construction and sequencing
Referring to Fig. 1, the steps of library construction and sequencing are as follows:
1. Draw 5 ml of peripheral blood from the patient and separate plasma and leukocytes by centrifugation. Extract DNA from the plasma sample and the leukocyte sample separately; the DNA extracted from leukocytes later serves as the control for detecting somatic mutations.
2. The cell-free circulating DNA extracted from plasma averages about 170 bp. It is taken directly through the three enzymatic steps of a conventional library construction protocol: end repair, addition of "A", and ligation of a specially designed sequencing adapter (the adapter carries an 8 bp tag, named index1, which not only distinguishes different samples but is also used later to mark the plus strand).
3. The ligation product is subjected to hybrid capture on the Colorectalpan chip. The eluted single-stranded template product is then amplified for 1 round of 1 cycle with a primer carrying the index2 tag, so that the reverse strand is labeled. UDG/FPG enzymes are added during this PCR and incubated to eliminate DNA damage carried by the template strand and reduce false positives.
4. The product, in which both the plus and reverse strands now carry their index labels, is purified and subjected to a second round of PCR for enrichment, completing library preparation.
5. Sequencing is carried out on a Hiseq 2000 or Hiseq 2500; a suitable sequencing platform can be chosen flexibly according to the required sequencing throughput and the number of samples.
The specific steps are as follows:
1. cfDNA extraction
About 2-3 ml of plasma is separated from 5 ml of peripheral blood, and plasma cfDNA is extracted according to the instructions of the QIAamp Circulating Nucleic Acid Kit. The extracted DNA is quantified with Qubit (Invitrogen, Quant-iT™ dsDNA HS Assay Kit); the total amount is about 5 to 50 ng.
2. Sample library preparation:
The cfDNA extracted from plasma is then taken through the three enzymatic steps according to the library construction instructions of the KAPA LTP Library Preparation Kit.
1) End repair
Afterwards, 120 μL of Agencourt AMPure XP reagent is added for magnetic bead purification, followed by elution in 42 μL ddH2O; the next step is carried out with the beads retained.
2) Addition of A
Afterwards, 90 μL of PEG/NaCl SPRI solution is added and mixed thoroughly for magnetic bead purification, followed by elution in (35 − adapter volume) μL ddH2O; the next step is carried out with the beads retained.
3) Adapter ligation
Afterwards, 50 μL of PEG/NaCl SPRI solution is added on each of two occasions for two rounds of magnetic bead purification, followed by final elution in 25 μL ddH2O.
3. Chip hybridization capture
The present invention uses Colorectalpan, a colorectal cancer early-screening chip designed by the inventors, and hybrid capture is carried out with reference to the instructions provided by the chip manufacturer. Finally, elute in 21 μL ddH2O, keeping the hybridization elution beads.
4. Double-index labeling of the positive and negative strands and enrichment:
Two rounds of PCR are carried out in total: PCR1 labels the negative strand and repairs template DNA damage, and PCR2 performs amplification and enrichment, completing library preparation.
1) PCR1
PCR1 program:
First remove the hybridization elution beads, then add 40 μL of Agencourt AMPure XP reagent and carry out magnetic bead purification; finally elute in 20 μL ddH2O and carry the beads into the next reaction.
2) PCR2
PCR2 program:
First remove the beads from the previous step, then add 50 μL of fresh Agencourt AMPure XP reagent and carry out magnetic bead purification; finally elute in 25 μL ddH2O, then perform QC and load onto the sequencer.
III. Analysis of sequencing results
1. For each pair of paired reads (paired sequencing data), the first 12 bp of read 1 and the first 12 bp of read 2 (i.e., the breakpoint sequences) are joined into a 24 bp short sequence, which serves as the index of the read pair; the positive and negative strands are marked according to the index.
2. The reads are externally sorted by index so that copies of the same DNA template are brought together.
3. The reads that share the same index are clustered around centers: according to the Hamming distance between their sequences, each large cluster of reads with the same index is divided into several small clusters, such that the Hamming distance between any two read pairs within a small cluster does not exceed 10. This distinguishes reads that share the same index but come from different DNA templates.
4. The clusters of copies of the same DNA template obtained in step 3 are screened; a cluster proceeds to subsequent analysis only if the positive-strand and negative-strand reads each number at least 2 pairs.
5. Error correction is performed on the clusters that satisfy the condition in step 4, producing one pair of error-free new reads. For each sequenced base of the DNA template, if the concordance of a given base type reaches 80% among the positive-strand reads and also reaches 80% among the negative-strand reads, that base of the new read is recorded as this base type; otherwise it is recorded as N. This yields new reads that represent the original DNA template sequence.
6. The new reads are realigned to the genome with the bwa mem algorithm, and reads with an alignment quality below 30 are filtered out.
7. SNV analysis:
1) From the reads obtained in step 6, compute the base-type distribution at each site in the capture region; a base type that differs from the dominant base type (a base type with a proportion greater than 15%) is regarded as a mutant base type. Compute the target region coverage size, average sequencing depth, positive/negative strand pairing rate, low-frequency mutation rate, and so on.
2) Annotate the SNPs using CCDS, the human genome database (NCBI36.3), and dbSNP (v130) information to determine the gene in which each mutation occurs, its coordinates, mRNA site, amino acid change, SNP function (missense mutation/nonsense mutation/alternative splice site), the SIFT prediction of the SNP's effect on protein function, and so on;
3) Call somatic mutations by comparing the patient sample with the control sample. In addition, SNPs that appear in dbSNP, HAPMAP, the 1000 Genomes Project, and other exome sequencing projects are removed from the candidate SNVs, leaving the final disease-related candidate SNVs.
8. INDEL analysis:
1) From the reads obtained in step 6 that contain indels, compile all indels and select those supported by 2 or more reads as reliable mutant indels;
2) Annotate the indels using CCDS, the human genome database (NCBI36.3), and dbSNP (v130) information to determine the gene in which each mutation occurs, its coordinates, mRNA site, change in the coding region sequence, effect on amino acids, and indel function (amino acid insertion/amino acid deletion/frameshift mutation);
3) Call somatic mutations by comparing the patient sample with the control sample. In addition, indels that appear in dbSNP and other exome sequencing projects are removed from the candidate indels, leaving the final disease-related candidate indels.
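Steps 1 to 5 of the analysis above can be illustrated with a minimal Python sketch. It is not the implementation used in the patent. The assumptions are: all reads have equal length and are already oriented the same way, each read pair carries a hypothetical `strand` field ("+" for index1, "-" for index2), grouping is done in memory rather than by external sort, and the "central clustering" of step 3 is replaced by a simple greedy rule in which a pair joins a cluster only if it is within the distance limit of every member. All function and field names are illustrative.

```python
from collections import Counter, defaultdict

INDEX_LEN = 12        # bases taken from the start of each read (step 1)
MAX_HAMMING = 10      # maximum distance inside a small cluster (step 3)
MIN_PER_STRAND = 2    # read pairs required on each strand (step 4)
MIN_AGREEMENT = 0.8   # per-strand concordance needed to call a base (step 5)


def pair_index(read1, read2):
    """Join the first 12 bases of each mate into a 24-base index."""
    return read1[:INDEX_LEN] + read2[:INDEX_LEN]


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def pair_seq(pair):
    """Concatenate both mates so a pair can be compared as one string."""
    return pair["read1"] + pair["read2"]


def split_by_distance(pairs):
    """Greedy split: a pair joins the first cluster in which it is within
    MAX_HAMMING of every existing member, otherwise it starts a new one."""
    clusters = []
    for pair in pairs:
        for cluster in clusters:
            if all(hamming(pair_seq(pair), pair_seq(m)) <= MAX_HAMMING
                   for m in cluster):
                cluster.append(pair)
                break
        else:
            clusters.append([pair])
    return clusters


def strand_consensus(seqs):
    """Per position, return (most common base, its fraction) for one strand."""
    result = []
    for column in zip(*seqs):
        base, count = Counter(column).most_common(1)[0]
        result.append((base, count / len(column)))
    return result


def duplex_consensus(plus_seqs, minus_seqs):
    """Keep a base only if both strands agree on it at >= MIN_AGREEMENT."""
    out = []
    for (pb, pf), (mb, mf) in zip(strand_consensus(plus_seqs),
                                  strand_consensus(minus_seqs)):
        ok = pb == mb and pf >= MIN_AGREEMENT and mf >= MIN_AGREEMENT
        out.append(pb if ok else "N")
    return "".join(out)


def correct_reads(pairs):
    """pairs: dicts with keys 'read1', 'read2', 'strand' ('+' or '-').
    Returns one corrected (read1, read2) tuple per qualifying cluster."""
    by_index = defaultdict(list)
    for pair in pairs:
        by_index[pair_index(pair["read1"], pair["read2"])].append(pair)
    corrected = []
    for group in by_index.values():
        for cluster in split_by_distance(group):
            plus = [p for p in cluster if p["strand"] == "+"]
            minus = [p for p in cluster if p["strand"] == "-"]
            if len(plus) < MIN_PER_STRAND or len(minus) < MIN_PER_STRAND:
                continue  # step 4: both strands need at least 2 pairs
            corrected.append(tuple(
                duplex_consensus([p[k] for p in plus], [p[k] for p in minus])
                for k in ("read1", "read2")
            ))
    return corrected
```

A position at which the two strands disagree, or at which either strand falls below 80% concordance, is emitted as N, so an error present on only one strand does not reach the corrected read.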
Embodiment 1: Early screening for colorectal cancer
I. Chip design
1) Design of the colorectal cancer early-screening chip:
Based on databases such as TCGA, ICGC, and COSMIC and on the relevant literature, the gene chip Colorectalpan for early screening of colorectal cancer was designed using an iterative algorithm. The Colorectalpan chip covers colorectal cancer driver genes, high-frequency mutated genes, and important genes in the 12 cancer signaling pathways, for a total of 60 genes and 123 kb.
The chip design process is divided into 4 steps:
1. For each exon interval of the colorectal cancer driver genes in the COSMIC database, count the number of samples with variants, the variant samples themselves, the number of samples at the hottest variant site, and the PI value, and sort the intervals by PI value in descending order. The PI value assesses how frequently patients recur on each exon: PI = cumulative number of patients carrying a mutation on the exon / exon length. Then apply an iterative algorithm: take the variant samples of the first exon interval as the sample database and count, for every other interval, the number of its samples that differ from the sample database; the interval with the most differing samples becomes the second selected chip interval. The variant samples of the two selected intervals then form the sample database used to select the third interval in the same way, and so on until the sample database contains all samples, which yields the set of exon intervals. For any gene from which no interval was selected, all of its intervals are added to the chip.
2. Based on databases such as TCGA and ICGC, take as candidate intervals the hotspot variant intervals that contain variants in at least 5 samples (SNV >= 5), excluding the driver gene intervals, and repeat the iterative calculation of the previous step.
3. Based on databases such as TCGA and ICGC, and excluding the intervals already selected, take as candidate intervals first those with PI >= 30 and SNV >= 3 and then those with PI >= 20 and SNV >= 3. Select as the first chip interval the one that reduces the number of samples in the single-sample database the most, and repeat the above process iteratively.
4. Add intervals such as fusion regions.
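The PI value and the iterative selection described in step 1 amount to a greedy coverage procedure, sketched below in Python. This is one reading of the description, not the inventors' code. The data layout (a dict mapping an interval name to its variant sample set and length) and the function names are hypothetical, and ties are broken in favor of the interval with the higher PI value, which the text does not specify.

```python
def pi_value(n_mutated_patients, exon_length):
    """PI = cumulative number of patients carrying a mutation / exon length."""
    return n_mutated_patients / exon_length


def select_intervals(intervals):
    """Greedy interval selection.

    intervals: dict mapping an interval name to
               {'samples': set of sample IDs with a variant, 'length': int}.
    Starts from the interval with the highest PI value, then repeatedly adds
    the interval that contributes the most samples not yet in the sample
    database, until every sample is covered.
    """
    if not intervals:
        return []
    all_samples = set().union(*(iv["samples"] for iv in intervals.values()))
    ranked = sorted(
        intervals,
        key=lambda n: pi_value(len(intervals[n]["samples"]),
                               intervals[n]["length"]),
        reverse=True,
    )
    chosen = [ranked[0]]
    covered = set(intervals[ranked[0]]["samples"])
    while covered != all_samples:
        # max() keeps the first maximum, i.e. the higher-PI interval on ties
        best = max(
            (n for n in ranked if n not in chosen),
            key=lambda n: len(intervals[n]["samples"] - covered),
        )
        chosen.append(best)
        covered |= intervals[best]["samples"]
    return chosen
```

With made-up toy data in which four intervals share seven samples, the procedure picks the highest-PI interval first and then the intervals that add the most uncovered samples, stopping as soon as all seven samples are covered.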
The gene list is given in Table 1.
Table 1
KRAS SRC TLR3 EP300 TMPRSS13 EPHA5
BRAF PTEN MC4R CYLD PHF2 EPHA3
APC AXIN1 MLH1 FBN2 OPRD1 PTPRD
TP53 FLG AKT1 NF1 LILRB5 NTRK3
PIK3CA LIG1 CASD1 ASXL1 COL18A1 NTRK1
CTNNB1 MAP2K1 PTCH1 SMAD4 LARP4B ALK
NRAS PIK3R1 ADAMTS18 IRF5 DMKN ROS1
EGFR ERBB2 MSH2 DOCK3 ROBO2 RET
FBXW7 STK11 BAP1 MYOM1 KCNN3 PDGFRA
ARID1A IL7R CTNNA1 NEFH INHBA FGFR1
II. Sequencing analysis
Using the present invention, colorectal cancer early-screening sequencing was carried out on one intestinal polyp patient according to the steps of the above method. The results are as follows:
The sequencing data statistics are shown in the table below:
Notes: Positive/negative strand pairing rate: among clusters with 3 or more reads, the ratio of clusters that contain both positive-strand and negative-strand reads to all clusters with 3 or more reads, used to assess strand pairing in the usable data. Effective data utilization rate: the ratio of the number of reads remaining after error correction of the clusters that satisfy at least 2+/2- to the total number of sequenced reads. Average sequencing depth: the average coverage of target-region bases after error correction of the effective data.
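The two cluster-level statistics defined in the notes can be computed as in the following Python sketch. It follows one reading of those definitions and is an assumption rather than the patent's code: each cluster is represented as a hypothetical (positive-strand count, negative-strand count) tuple, and each qualifying cluster is taken to contribute exactly one corrected read.

```python
def strand_pairing_rate(clusters, min_reads=3):
    """Among clusters with at least min_reads reads, the fraction that
    contain both positive-strand and negative-strand reads."""
    eligible = [(p, m) for p, m in clusters if p + m >= min_reads]
    if not eligible:
        return 0.0
    paired = sum(1 for p, m in eligible if p > 0 and m > 0)
    return paired / len(eligible)


def effective_utilization(clusters, total_reads, min_per_strand=2):
    """Corrected reads (one per cluster meeting 2+/2-) over all sequenced reads."""
    corrected = sum(
        1 for p, m in clusters if p >= min_per_strand and m >= min_per_strand
    )
    return corrected / total_reads
```

For example, with clusters `[(2, 2), (3, 0), (1, 1), (4, 1), (0, 5)]`, four clusters have 3 or more reads and two of them contain both strands, so the pairing rate is 0.5.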
Cluster analysis:
The analysis of clusters of reads with the same index is shown in Fig. 2, where the abscissa represents the duplication (dup) number of a cluster and the ordinate represents the total number of reads in clusters with a given dup number. Fig. 2 shows that the dup number of the vast majority of clusters is around 6, that most clusters satisfy the 2 positive + 2 negative condition, that the final effective data utilization rate is 5.12%, and that the average sequencing depth is 1033X.
Mutation spectrum analysis:
The mutation spectrum analysis results are shown in Fig. 3. For molecules (DNA) derived from a double strand, complementary mutation types should in theory have essentially the same mutation frequency. The abscissa represents the type of base mutation and the ordinate represents the number of mutations. Fig. 3 shows that the distribution of mutant base types is essentially balanced, with a mutation frequency (mutations per nucleotide) of 2.2×10⁻⁶.
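The balance check between complementary mutation types can be sketched in Python as follows. This is an illustrative sketch, not the analysis code behind Fig. 3; the input format (a list of strings such as "C>T") and the function names are assumptions.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complementary(mutation):
    """Return the complementary substitution, e.g. 'C>T' -> 'G>A'."""
    ref, alt = mutation.split(">")
    return f"{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"


def paired_spectrum(mutations):
    """Group substitution counts into complementary pairs.

    Returns a dict mapping (type, complementary type) to their two counts,
    so that an imbalance within a pair is easy to spot.
    """
    counts = Counter(mutations)
    pairs = {}
    for mutation in sorted(counts):
        key = tuple(sorted((mutation, complementary(mutation))))
        pairs[key] = (counts[key[0]], counts[key[1]])
    return pairs
```

A pair whose two counts differ markedly, for example many C>A but few G>T, would suggest damage or error confined to one strand rather than a true double-stranded mutation.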
Details of the detected variants (counted over exon regions and nonsynonymous mutations):
Gene      Base mutation   Amino acid mutation   Mutation type       Mutation frequency
SMAD4     c.2119G>A       p.Y301F               Missense mutation   2.8%
ARID1A    c.817C>T        p.A1872T              Missense mutation   2.34%
APC       c.217A>C        p.A426T               Missense mutation   1.80%
Interpretation of results: According to relevant databases and literature such as TCGA, COSMIC, ClinVar, and HMGD, the SMAD4 p.Y301F and APC p.A426T driver mutations detected in the patient's plasma suggest that the patient has a higher risk of developing cancer. It is recommended that the patient undergo more comprehensive testing at an appropriate medical institution and take the relevant intervention measures.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "illustrative example", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. It should also be noted that those skilled in the art will understand that the order of the steps included in the scheme proposed by the present invention can be adjusted, and such adjustments are also within the scope of the present invention.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and purpose of the present invention; the scope of the invention is defined by the claims and their equivalents.

Claims (46)

1. A method for constructing a sequencing library, characterized in that it comprises:
(a) ligating an adapter to each of the two ends of a double-stranded DNA fragment to obtain a ligation product, wherein the adapter comprises a first strand and a second strand, the first strand and the second strand are partially paired and the first strand comprises a first tag sequence, so that a double-stranded region and two single-stranded tails are defined on the adapter, and the first tag is contained in the sequence of one of the two single-stranded tails;
(b) splitting the ligation product into single-stranded DNA fragments;
(c) screening the single-stranded DNA fragments using a probe, wherein the probe specifically recognizes a predetermined region, and the predetermined region comprises one of the following:
(1) at least one of the TLR3, TMPRSS13, MC4R, PHF2, OPRD1, FLG, LILRB5, LIG1, CASD1, COL18A1, LARP4B, ADAMTS18, IRF5, DMKN, DOCK3, MYOM1, KCNN3, and NEFH genes;
(2) the CDS region of (1); and
(3) a region of at least 10 bp upstream and downstream of (2);
(d) carrying out a strand extension reaction on the single-stranded DNA fragments using a first primer to obtain a strand extension product, wherein the first primer comprises a second tag sequence, and the first primer is suitable for forming a double-stranded structure with the first strand of the adapter, except that there is a mismatch between the first tag sequence and the second tag sequence;
(e) amplifying the strand extension product to obtain an amplification product, the amplification product constituting the sequencing library, wherein the amplification uses primers suitable for simultaneously amplifying the first tag sequence and the second tag sequence, the primers being a second primer and a third primer.
2. The method according to claim 1, characterized in that the double-stranded DNA fragment is obtained by the following steps:
subjecting a nucleic acid sample to end repair to obtain a repaired nucleic acid sample; and
adding a base A to the 5' ends of the nucleic acid sample to obtain a nucleic acid sample having a sticky-end base A at each of its two ends, the nucleic acid sample having a sticky-end base A at each of its two ends constituting the double-stranded DNA fragment.
3. The method according to claim 2, characterized in that the nucleic acid sample is at least a portion of human genomic DNA or cell-free nucleic acid.
4. The method according to claim 3, characterized in that the cell-free nucleic acid is extracted from the peripheral blood of a patient.
5. The method according to claim 4, characterized in that the patient has colorectal cancer.
6. The method according to claim 3, characterized in that the at least a portion of human genomic DNA is obtained by randomly fragmenting human genomic DNA.
7. The method according to claim 1, characterized in that the adapter has a 3' base T sticky end.
8. The method according to claim 1, characterized in that the single-stranded DNA fragments are obtained by subjecting the ligation product to denaturation treatment.
9. The method according to claim 1, characterized in that the probe is provided in the form of a chip.
10. The method according to claim 1, characterized in that the strand extension reaction is carried out in the presence of UDG enzyme/FPG enzyme.
11. The method according to claim 1, characterized in that the first tag sequence and the second tag sequence are each independently 4 to 10 nt in length.
12. The method according to claim 1, characterized in that the first tag sequence and the second tag sequence are 8 nt in length.
13. The method according to claim 1, characterized in that there is a mismatch of at least 2 nt between the first tag sequence and the second tag sequence.
14. The method according to claim 1, characterized in that the nucleotide sequence of the first strand of the adapter is the sequence shown in SEQ ID NO: 1, the nucleotide sequence of the second strand of the adapter is the sequence shown in SEQ ID NO: 2, the nucleotide sequence of the first tag is the sequence shown in at least one of SEQ ID NO: 3-6, the nucleotide sequence of the second tag is the sequence shown in at least one of SEQ ID NO: 7-10, the nucleotide sequence of the first primer is the sequence shown in SEQ ID NO: 11, the nucleotide sequence of the second primer is the sequence shown in SEQ ID NO: 12, and the nucleotide sequence of the third primer is the sequence shown in SEQ ID NO: 13.
15. A sequencing method, the method being for non-diagnostic purposes, characterized in that it comprises:
constructing a sequencing library according to the method of any one of claims 1 to 14; and
sequencing the sequencing library.
16. The method according to claim 15, characterized in that the sequencing is carried out on a Hiseq2000 or Hiseq2500.
17. A method for determining a nucleic acid sequence, the method being for non-diagnostic purposes, characterized in that it comprises:
sequencing a nucleic acid sample according to the method of claim 15 or 16 to obtain a sequencing result consisting of a plurality of sequencing data;
constructing, based on the sequencing result, at least one sequencing data subset, wherein all sequencing data in each sequencing data subset correspond to the same original sequence on the nucleic acid sample;
for each sequencing data subset, determining that the sequencing data corresponding to the first tag sequence are positive-strand sequencing data and that the sequencing data corresponding to the second tag sequence are negative-strand sequencing data;
for each sequencing data subset, correcting the sequencing data based on the positive-strand sequencing data and the negative-strand sequencing data, respectively, to determine corrected sequencing data; and
determining the sequence of the nucleic acid sample based on the corrected sequencing data.
18. The method according to claim 17, characterized in that the sequencing is paired-end sequencing, and the sequencing result consists of multiple pairs of paired sequencing data.
19. The method according to claim 17, characterized in that constructing at least one sequencing data subset based on the sequencing result is carried out by the following steps:
for each pair of the multiple pairs of paired sequencing data, determining a paired sequencing data index, the paired sequencing data index being composed of the initial N bases of each read of the paired sequencing data, wherein N is an integer between 10 and 20;
constructing at least one preliminary sequencing data subset based on the paired sequencing data index, wherein every sequencing datum in the preliminary sequencing data subset has the same paired sequencing data index; and
subdividing the at least one preliminary sequencing data subset based on the Hamming distance between sequencing data in the preliminary sequencing data subset to obtain a plurality of sequencing data subsets.
20. The method according to claim 19, characterized in that N is 12.
21. The method according to claim 19, characterized in that in each of the plurality of sequencing data subsets, the Hamming distance between any two pairs of paired sequencing data does not exceed 20.
22. The method according to claim 19, characterized in that in each of the plurality of sequencing data subsets, there are at least two positive-strand sequencing data and at least two negative-strand sequencing data.
23. The method according to claim 22, characterized in that determining the corrected sequencing data based on the positive-strand sequencing data and the negative-strand sequencing data is carried out according to the following principle:
each base in the corrected sequencing data is simultaneously supported by at least 50% of the positive-strand sequencing data and at least 50% of the negative-strand sequencing data.
24. The method according to claim 23, characterized in that each base in the corrected sequencing data is simultaneously supported by at least 80% of the positive-strand sequencing data and at least 80% of the negative-strand sequencing data.
25. The method according to claim 23, characterized in that it further comprises:
aligning the corrected sequencing data to a reference sequence and deleting sequencing data with an alignment quality below 30.
26. The method according to claim 17, characterized in that SNV analysis or indel analysis is carried out based on the sequence of the nucleic acid sample.
27. A device for constructing a sequencing library, characterized in that it comprises:
a ligation unit, for ligating an adapter to each of the two ends of a double-stranded DNA fragment to obtain a ligation product, wherein the adapter comprises a first strand and a second strand, the first strand and the second strand are partially paired and the first strand comprises a first tag sequence, so that a double-stranded region and two single-stranded tails are defined on the adapter, and the first tag is contained in the sequence of one of the two single-stranded tails;
a splitting unit, for splitting the ligation product into single-stranded DNA fragments;
a screening unit, for screening the single-stranded DNA fragments using a probe before strand extension is carried out, wherein the probe specifically recognizes a predetermined region, and the predetermined region comprises one of the following:
(1) at least one of the TLR3, TMPRSS13, MC4R, PHF2, OPRD1, FLG, LILRB5, LIG1, CASD1, COL18A1, LARP4B, ADAMTS18, IRF5, DMKN, DOCK3, MYOM1, KCNN3, and NEFH genes;
(2) the CDS region of (1); and
(3) a region of at least 10 bp upstream and downstream of (2);
a strand extension unit, for carrying out a strand extension reaction on the single-stranded DNA fragments using a first primer to obtain a strand extension product, wherein the first primer comprises a second tag sequence, and the first primer is suitable for forming a double-stranded structure with the first strand of the adapter, except that there is a mismatch between the first tag sequence and the second tag sequence;
an amplification unit, for amplifying the strand extension product to obtain an amplification product, the amplification product constituting the sequencing library, wherein the amplification uses a second primer and a third primer, the second primer recognizes the second strand of the adapter, and the third primer is configured to be suitable for simultaneously amplifying the first tag sequence and the second tag sequence.
28. The device according to claim 27, characterized in that it further comprises:
an end repair unit, for subjecting a nucleic acid sample to end repair to obtain a repaired nucleic acid sample; and
an end modification unit, for adding a base A to the 5' ends of the nucleic acid sample to obtain a nucleic acid sample having a sticky-end base A at each of its two ends, the nucleic acid sample having a sticky-end base A at each of its two ends constituting the double-stranded DNA fragment.
29. The device according to claim 27, characterized in that the probe is provided in the form of a chip.
30. The device according to claim 27, characterized in that the strand extension reaction is carried out in the presence of UDG enzyme/FPG enzyme.
31. The device according to claim 27, characterized in that the first tag sequence and the second tag sequence are each independently 4 to 10 nt in length.
32. The device according to claim 27, characterized in that the first tag sequence and the second tag sequence are 8 nt in length.
33. The device according to claim 27, characterized in that there is a mismatch of at least 2 nt between the first tag sequence and the second tag sequence.
34. The device according to claim 27, characterized in that the nucleotide sequence of the first strand of the adapter is the sequence shown in SEQ ID NO: 1, the nucleotide sequence of the second strand of the adapter is the sequence shown in SEQ ID NO: 2, the nucleotide sequence of the first tag is the sequence shown in at least one of SEQ ID NO: 3-6, the nucleotide sequence of the second tag is the sequence shown in at least one of SEQ ID NO: 7-10, the nucleotide sequence of the first primer is the sequence shown in SEQ ID NO: 11, the nucleotide sequence of the second primer is the sequence shown in SEQ ID NO: 12, and the nucleotide sequence of the third primer is the sequence shown in SEQ ID NO: 13.
35. A sequencing apparatus, characterized in that it comprises:
the device for constructing a sequencing library according to any one of claims 27 to 34; and
a sequencing device, for sequencing the sequencing library.
36. The sequencing apparatus according to claim 35, characterized in that the sequencing device is a Hiseq2000 or Hiseq2500.
37. A system for determining a nucleic acid sequence, characterized in that it comprises:
the sequencing apparatus of claim 35 or 36, for sequencing a nucleic acid sample to obtain a sequencing result consisting of a plurality of sequencing data;
a sequencing data subset construction device, for constructing, based on the sequencing result, at least one sequencing data subset, wherein all sequencing data in each sequencing data subset correspond to the same original sequence on the nucleic acid sample;
a sequencing data classification device, for determining, for each sequencing data subset, that the sequencing data corresponding to the first tag sequence are positive-strand sequencing data and that the sequencing data corresponding to the second tag sequence are negative-strand sequencing data;
a sequencing data correction device, for correcting, for each sequencing data subset, the sequencing data based on the positive-strand sequencing data and the negative-strand sequencing data, respectively, to determine corrected sequencing data; and
a sequence determination device, for determining the sequence of the nucleic acid sample based on the corrected sequencing data.
38. The system according to claim 37, characterized in that the sequencing is paired-end sequencing, and the sequencing result consists of multiple pairs of paired sequencing data.
39. The system according to claim 37, characterized in that the sequencing data subset construction device comprises:
a sequencing data index determination device, for determining, for each pair of the multiple pairs of paired sequencing data, a paired sequencing data index, the paired sequencing data index being composed of the initial N bases of each read of the paired sequencing data, wherein N is an integer between 10 and 20;
a preliminary screening device, for constructing at least one preliminary sequencing data subset based on the paired sequencing data index, wherein every sequencing datum in the preliminary sequencing data subset has the same paired sequencing data index; and
a secondary screening device, for subdividing the at least one preliminary sequencing data subset based on the Hamming distance between sequencing data in the preliminary sequencing data subset to obtain a plurality of sequencing data subsets.
40. The system according to claim 39, characterized in that N is 12.
41. The system according to claim 39, characterized in that in each of the plurality of sequencing data subsets, the Hamming distance between any two pairs of paired sequencing data does not exceed 20.
42. The system according to claim 39, characterized in that in each of the plurality of sequencing data subsets, there are at least two positive-strand sequencing data and at least two negative-strand sequencing data.
43. The system according to claim 42, characterized in that determining the corrected sequencing data based on the positive-strand sequencing data and the negative-strand sequencing data is carried out according to the following principle:
each base in the corrected sequencing data is simultaneously supported by at least 50% of the positive-strand sequencing data and at least 50% of the negative-strand sequencing data.
44. The system according to claim 43, characterized in that each base in the corrected sequencing data is simultaneously supported by at least 80% of the positive-strand sequencing data and at least 80% of the negative-strand sequencing data.
45. The system according to claim 43, characterized in that it further comprises:
aligning the corrected sequencing data to a reference sequence and deleting sequencing data with an alignment quality below 30.
46. The system according to claim 37, characterized in that it further comprises a sequence analysis device, the sequence analysis device being used to carry out SNV analysis or indel analysis based on the sequence of the nucleic acid sample.
CN201410521540.4A 2014-09-30 2014-09-30 Build the method and its application of sequencing library Active CN104293940B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410521540.4A CN104293940B (en) 2014-09-30 2014-09-30 Build the method and its application of sequencing library

Publications (2)

Publication Number Publication Date
CN104293940A CN104293940A (en) 2015-01-21
CN104293940B true CN104293940B (en) 2017-07-28

Family

ID=52313887

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410521540.4A Active CN104293940B (en) 2014-09-30 2014-09-30 Build the method and its application of sequencing library

Country Status (1)

Country Link
CN (1) CN104293940B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106434856B (en) * 2015-08-06 2020-02-07 深圳华大智造科技有限公司 Method for running and testing sequencer
CN105332063B (en) * 2015-08-13 2017-04-12 厦门飞朔生物技术有限公司 Construction method of single-tube and high-flux sequencing library
CN105602936A (en) * 2015-11-18 2016-05-25 中国人民解放军第四军医大学 Construction method of dual-barcode next-generation sequencing library
CN107034267B (en) * 2016-02-03 2021-06-08 深圳华大智造科技股份有限公司 Method and device for preparing candidate sequencing probe set and application of candidate sequencing probe set
CN107038349B (en) * 2016-02-03 2020-03-31 深圳华大生命科学研究院 Method and apparatus for determining pre-rearrangement V/J gene sequence
CN105950709A (en) * 2016-03-30 2016-09-21 广州精科生物技术有限公司 Kit, library building method, and method and system for detecting variation of object region
CN107312822A (en) * 2016-04-26 2017-11-03 厦门飞朔生物技术有限公司 A kind of construction method in oncogene variation library detected for high-flux sequence and its application
CN109994155B (en) * 2019-03-29 2021-08-20 北京市商汤科技开发有限公司 Gene variation identification method, device and storage medium
CN114464252B (en) * 2022-01-26 2023-06-27 深圳吉因加医学检验实验室 Method and device for detecting structural variation

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101967476A (en) * 2010-09-21 2011-02-09 深圳华大基因科技有限公司 Joint connection-based deoxyribonucleic acid (DNA) polymerase chain reaction (PCR)-free tag library construction method
CN102409048A (en) * 2010-09-21 2012-04-11 深圳华大基因科技有限公司 DNA label library construction method based on high-throughput sequencing
CN103806111A (en) * 2012-11-15 2014-05-21 深圳华大基因科技有限公司 Construction method and application of high-throughout sequencing library
WO2014145078A1 (en) * 2013-03-15 2014-09-18 Verinata Health, Inc. Generating cell-free dna libraries directly from blood

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104294371B (en) * 2014-09-30 2017-07-04 天津华大基因科技有限公司 Build method and its application of sequencing library
CN104264231B (en) * 2014-09-30 2017-04-19 天津华大基因科技有限公司 Method for constructing sequencing library and application of sequencing library
CN104293938B (en) * 2014-09-30 2017-11-03 天津华大基因科技有限公司 Build the method and its application of sequencing library

Also Published As

Publication number Publication date
CN104293940A (en) 2015-01-21

Similar Documents

Publication Publication Date Title
CN104293940B (en) Method for constructing sequencing library and application thereof
CN104264231B (en) Method for constructing sequencing library and application of sequencing library
US11447813B2 (en) Systems and methods to detect rare mutations and copy number variation
JP7119014B2 (en) Systems and methods for detecting rare mutations and copy number variations
CN104293938B (en) Method for constructing sequencing library and application thereof
CN107475375B (en) DNA probe library, detection method and kit for hybridization to microsatellite loci related to microsatellite instability
CN105518151B (en) Identification and use of circulating nucleic acid tumor markers
AU2016293025A1 (en) System and methodology for the analysis of genomic data obtained from a subject
CN104294371B (en) Method for constructing sequencing library and application thereof
CN108949941A (en) Low-frequency mutation detection method, kit and device
CN105132407B (en) Method for enrichment sequencing of low-frequency mutations in exfoliated-cell DNA
CN104293941B (en) Method for constructing sequencing library and application of sequencing library
US12031186B2 (en) Homologous recombination repair deficiency detection
CN110093417B (en) Method for detecting tumor single cell somatic mutation
CN108229103A (en) Method and device for processing circulating tumor DNA repetitive sequences
CN103114150A (en) Single nucleotide polymorphism site identification method based on digestion library-establishing and sequencing and bayesian statistics
CN112639983A (en) Microsatellite instability detection
CN105925664A (en) Method and system for determining nucleic acid sequence
CN112980961A (en) Method and device for jointly detecting SNV (single nucleotide variant), CNV (copy number variation) and gene fusion
CN111511930A (en) Genetic modulation of immune responses through chromosomal interactions
CN107760783A (en) Gastric cancer peritoneal metastasis prediction model based on 108 genes and application thereof
CN108359723B (en) Method for reducing deep sequencing errors
CN105950709A (en) Kit, library building method, and method and system for detecting variation of object region
US20200095641A1 (en) Means and methods for anti-vegf therapy
KR20220074756A (en) Method for tracking the generation order of the generated strands by linking information of the strands generated during the PCR process to create a cluster

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant