
CN111863125A - Mono-parent diploid detection method based on NGS-trio and application - Google Patents

Publication number
CN111863125A
Authority
CN
China
Prior art keywords
sites
mutation
trio
upd
inheritance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010774623.XA
Other languages
Chinese (zh)
Other versions
CN111863125B (en)
Inventor
刘晶星
于世辉
喻长顺
向丽娜
陈白雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Kingmed Diagnostics Group Co ltd
Guangzhou Kingmed Diagnostics Central Co Ltd
Original Assignee
Guangzhou Kingmed Diagnostics Group Co ltd
Guangzhou Kingmed Diagnostics Central Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Kingmed Diagnostics Group Co ltd, Guangzhou Kingmed Diagnostics Central Co Ltd
Priority to CN202010774623.XA
Publication of CN111863125A
Application granted
Publication of CN111863125B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/10 Ploidy or copy number detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50 Mutagenesis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to an NGS-trio-based uniparental disomy (UPD) detection method and its application, and belongs to the technical field of bioinformatics analysis. By acquiring and analyzing NGS-trio sequencing data, the method directly infers the parental origin of a proband's chromosomes and thus determines directly whether UPD is present (rather than inferring UPD indirectly from LOH), improving the diagnostic positive rate at no additional cost. The method can also assist in identifying large heterozygous deletions; depending on the density of mutation sites, the resolution can reach 1 Mbp, giving excellent detection performance.

Description

Uniparental disomy detection method based on NGS-trio and application
Technical Field
The invention relates to the technical field of bioinformatics analysis, in particular to an NGS-trio-based uniparental disomy detection method and application thereof.
Background
Genomic imprinting (also called genetic imprinting) is a genetic process that marks a gene or genomic domain with information about its parental origin by biochemical means. Such genes are called imprinted genes; whether they are expressed depends on the parental origin (paternal or maternal) of the chromosome carrying them and on whether the gene is silenced on that chromosome (the silencing mechanism is primarily methylation). Some imprinted genes are expressed only from the maternal chromosome, and others only from the paternal chromosome.
In a normal diploid, the two homologous chromosomes of a pair are derived from the father and the mother respectively. Uniparental disomy (UPD) means that both homologous chromosomes of a pair (or segments of them) are derived from the same parent; if the affected segment contains imprinted genes, gene expression can be disrupted. The current method for diagnosing UPD is to determine whether methylation levels are consistent between the corresponding segments of a pair of homologous chromosomes.
In most cases, UPD arises because two homologous chromosomes fail to separate during meiosis, producing a gamete with an abnormal chromosome copy number: 2 or 0 copies instead of the 1 copy in a normal gamete. Fertilization then produces a zygote with an abnormal copy number (trisomy or monosomy). Finally, the zygote becomes euploid either through trisomy rescue, as shown in FIG. 1, in which one chromosome is randomly lost, or through monosomy rescue, as shown in FIG. 2, in which the single chromosome is duplicated. Trisomy rescue results in UPD with probability 1/3, whereas monosomy rescue always results in UPD.
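By way of illustration only, the 1/3 figure can be checked by enumerating the three equally likely chromosome losses in a trisomic zygote. The Python sketch below is not part of the original disclosure; the function name and the chromosome labels are hypothetical, and it assumes each of the three homologs is lost with equal probability.

```python
from fractions import Fraction


def upd_probability_after_trisomy_rescue(chromosomes):
    """Return the fraction of single-chromosome losses that leave a uniparental pair.

    chromosomes: three (label, parent) tuples describing the trisomic zygote.
    Assumption: each homolog is equally likely to be the one that is lost.
    """
    uniparental_outcomes = 0
    for lost in range(len(chromosomes)):
        remaining = [c for i, c in enumerate(chromosomes) if i != lost]
        parents = {parent for _, parent in remaining}
        if len(parents) == 1:  # both remaining homologs come from one parent
            uniparental_outcomes += 1
    return Fraction(uniparental_outcomes, len(chromosomes))


# Two maternal homologs (from a disomic egg) plus one paternal homolog.
trisomy = [("M1", "mother"), ("M2", "mother"), ("P", "father")]
print(upd_probability_after_trisomy_rescue(trisomy))  # 1/3
```

Only the loss of the single paternal homolog leaves two maternal chromosomes, which is why one outcome in three is uniparental.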
UPD generated by monosomy rescue makes the entire chromosome homozygous, so it can be inferred indirectly by detecting LOH (loss of heterozygosity). UPD generated by trisomy rescue, however, produces only occasional local LOH through meiotic recombination, and local LOH more often has other causes (such as consanguineous marriage), so UPD cannot be determined with certainty from LOH.
Moreover, the conventional methylation-based methods for detecting UPD can only handle small local chromosome segments, and a different experiment must be designed for each region; they are therefore inefficient and slow, and unsuitable for genome-wide screening.
SNP-array methods are more expensive, and because their probes target polymorphic sites they cannot simultaneously detect other pathogenic small variants (point mutations, small insertions and deletions, and the like).
Whole-exome sequencing is currently the most common method for detecting genetic disorders; it can detect pathogenic point mutations, small insertions and deletions, copy number variation and the like, and is the first choice for most patients. However, based on sequencing data from a single sample, UPD can only be inferred indirectly from LOH, as disclosed in CN 110211630A.
Disclosure of Invention
In view of the above, it is necessary to provide an NGS-trio-based uniparental disomy detection method that directly infers the parental origin of a proband's chromosomes, thereby determining directly whether UPD is present (instead of inferring UPD indirectly from LOH) and improving the diagnostic positive rate at no additional cost.
An NGS-trio-based uniparental disomy detection method comprises the following steps:
data acquisition: acquiring NGS sequencing data of one set of trio samples;
mutation site screening: in each sample, selecting the mutation sites that meet preset conditions and defining them as the qualified mutation sites of that sample, and designating the mutation sites that are screened out as the unqualified mutation sites of that sample;
site data merging: taking the union of the unqualified mutation sites of all samples in the trio, collecting the chromosome coordinates of each unqualified mutation site, and removing from the qualified sites of every sample any mutation site whose coordinates match an unqualified site; then, based on the remaining qualified mutation sites in the trio, filling in the genotype of any sample that has no mutation at a position as homozygous for the reference sequence;
inheritance pattern classification: classifying the trio genotype combination at each mutation site by inheritance pattern, dividing the mutation sites into: sites consistent with biparental inheritance, sites consistent only with uniparental inheritance, and sites inconsistent with the laws of inheritance;
kinship determination: if the number of sites inconsistent with the laws of inheritance is smaller than a preset value, proceeding to the subsequent analysis; if it is greater than or equal to the preset value, judging the sample to be unqualified;
uniparental fragment determination: if the span covered by consecutive sites consistent only with uniparental paternal inheritance exceeds a preset value, judging the region to be a fragment of uniparental paternal origin; if the span covered by consecutive sites consistent only with uniparental maternal inheritance exceeds a preset value, judging the region to be a fragment of uniparental maternal origin;
UPD determination: analyzing the sequencing coverage depth of each fragment judged to be uniparental; if the depth indicates that the region is single-copy, judging it to be a deletion; otherwise, judging the region to be a UPD region;
pathogenic UPD screening: checking whether the UPD region covers an imprinted gene or the corresponding chromosome band; if not, judging the UPD region to be benign; if so, indicating a risk of pathogenic UPD.
As sequencing costs fall, more and more whole-exome sequencing protocols test the proband together with both parents. Based on such trio family data, the method directly infers the parental origin of the proband's chromosomes, thereby determining directly whether UPD is present and improving the diagnostic positive rate at no additional cost.
It will be appreciated that the NGS sequencing data described above may be either whole-exome sequencing data or whole-genome sequencing data.
In one embodiment, in the mutation site screening step, the mutation sites are selected as follows:
1) selecting high-quality mutation sites in the NGS sequencing data;
2) removing mutation sites located on the Y chromosome;
3) selecting point mutation sites within genes;
4) excluding suspected false-positive sites according to Hardy-Weinberg equilibrium;
5) for heterozygous sites, removing those with mutation frequency higher than 70%; for homozygous sites, removing those with mutation frequency lower than 85%;
6) typing the mutations at each position, and removing sites with more than 2 alleles;
7) the remaining sites are the mutation sites that meet the preset conditions.
In mutation analysis, because humans are diploid, a position has at most 2 alleles; more than two usually indicates a sequencing error. For example, chr1:69849 G>A, Het is typed as chr1:69849[A/G], and chr1:69849 G>A, Hom is typed as chr1:69849[A/A]. If both chr1:69849 G>A, Het and chr1:69849 G>T, Het are present, the typing is chr1:69849[A/G/T], i.e. more than 2 alleles, and the site must be removed.
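The typing rule can be sketched in Python as follows. This is an illustrative sketch only, not the implementation of the disclosure: the tuple layout of a call and the function name `type_sites` are assumptions made for the example.

```python
from collections import defaultdict


def type_sites(calls):
    """Type each position and flag positions with more than 2 alleles.

    calls: iterable of (chrom, pos, ref, alt, zygosity) tuples, where zygosity
    is "Het" or "Hom" (hypothetical layout chosen for this sketch).
    Returns (typed, removed): typed maps (chrom, pos) to a 2-allele genotype,
    removed is the set of positions carrying more than 2 alleles.
    """
    alleles = defaultdict(set)
    for chrom, pos, ref, alt, zygosity in calls:
        key = (chrom, pos)
        alleles[key].add(alt)
        if zygosity == "Het":
            alleles[key].add(ref)  # a heterozygous call keeps the reference allele

    typed, removed = {}, set()
    for key, observed in alleles.items():
        if len(observed) > 2:
            removed.add(key)  # e.g. [A/G/T]: treated as a sequencing error
        elif len(observed) == 1:
            allele = next(iter(observed))
            typed[key] = (allele, allele)  # homozygous, e.g. [A/A]
        else:
            typed[key] = tuple(sorted(observed))  # heterozygous, e.g. [A/G]
    return typed, removed


typed, removed = type_sites([
    ("chr1", 69849, "G", "A", "Het"),
    ("chr1", 69849, "G", "T", "Het"),
])
print(removed)  # {('chr1', 69849)}
```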
It will be appreciated that a qualified mutation site must satisfy every selection condition and trigger none of the removal conditions.
It can be understood that, according to the Hardy-Weinberg law of equilibrium, the genotype frequencies and allele frequencies at a locus remain unchanged and in genetic equilibrium when the population is infinitely large and there is random mating, no mutation, no selection and no genetic drift. False-positive sites can therefore be excluded by a chi-square test. The frequencies of the genotypes AA, AB and BB at a locus follow a fixed relationship: for example, if a local population database contains 10,000 persons and the allele frequency of A is 0.4 and of B is 0.6, the expected number of persons with genotype AA is 1600, with BB 3600 and with AB 4800. The observed and expected counts in the population database are compared by a chi-square test, and sites whose observed counts deviate too far from the expected counts (i.e. likely false-positive sites) are excluded.
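The worked example above can be reproduced with a short Python sketch. The function names are hypothetical, and the cut-off is an assumption: the text does not state one, so the conventional chi-square critical value for 1 degree of freedom at the 0.05 level (3.841) is used here purely for illustration.

```python
def hwe_expected_counts(freq_a, n_people):
    """Expected AA, AB and BB counts under Hardy-Weinberg equilibrium."""
    freq_b = 1.0 - freq_a
    return (freq_a ** 2 * n_people,
            2 * freq_a * freq_b * n_people,
            freq_b ** 2 * n_people)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with HWE expectations."""
    n_people = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * n_people)  # allele frequency from the counts
    expected = hwe_expected_counts(freq_a, n_people)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Assumed cut-off: chi-square critical value, 1 degree of freedom, alpha = 0.05.
CHI2_CRITICAL = 3.841

print(hwe_expected_counts(0.4, 10_000))  # approximately (1600, 4800, 3600)
print(hwe_chi_square(3000, 2000, 5000) > CHI2_CRITICAL)  # True: site would be excluded
```

Counts of 1600, 4800 and 3600 give a statistic of essentially zero, so such a site would be kept; the second call shows counts with the same allele frequency but far too few heterozygotes.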
Conventional NGS results contain a large number of poor-quality sites, which would strongly interfere with the subsequent UPD determination; using all sites gives poor detection performance. Screening the mutation sites as described above therefore improves the accuracy of the analysis.
In one embodiment, in the mutation site screening step:
the high-quality mutation sites are mutation sites meeting the following criteria: GATK-VQSR filter status PASS, total coverage > 20X, and mutation frequency > 25%.
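As a minimal sketch, the numeric screening conditions (quality criteria, Y chromosome removal and the zygosity-dependent frequency bounds) can be combined into one predicate. The `Call` record and its field names are hypothetical; only the thresholds come from the text, and the mutation frequency is assumed to be expressed as a fraction between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One variant call in one sample (hypothetical record layout)."""
    chrom: str
    filter_status: str  # e.g. "PASS" after GATK VQSR
    depth: int          # total coverage at the site
    vaf: float          # mutation frequency as a fraction of reads
    zygosity: str       # "Het" or "Hom"


def is_qualified(call):
    """Apply the numeric screening conditions described in the text."""
    if call.filter_status != "PASS":
        return False
    if call.depth <= 20 or call.vaf <= 0.25:
        return False  # high-quality criteria: coverage > 20X, frequency > 25%
    if call.chrom in ("chrY", "Y"):
        return False  # sites on the Y chromosome are removed
    if call.zygosity == "Het" and call.vaf > 0.70:
        return False  # heterozygous call with implausibly high frequency
    if call.zygosity == "Hom" and call.vaf < 0.85:
        return False  # homozygous call with implausibly low frequency
    return True


print(is_qualified(Call("chr1", "PASS", 35, 0.48, "Het")))  # True
print(is_qualified(Call("chr1", "PASS", 35, 0.80, "Hom")))  # False
```

The Hardy-Weinberg and allele-count conditions are left out of this predicate because they need population-level or per-position information rather than a single call.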
In one embodiment, in the data acquiring step, the same set of trio samples includes a paternal sample, a maternal sample and a proband sample;
in the site data merging step, mutation site data with consistent coordinates are arranged according to the sequence of proband-father-mother.
The detection method of the invention requires samples from the proband and both parents; none can be omitted.
In one embodiment, in the inheritance pattern classification step, the sites consistent with biparental inheritance are divided into:
type 1: sites consistent only with biparental inheritance;
type 0: sites consistent with both biparental inheritance and uniparental inheritance;
the sites consistent only with uniparental inheritance are divided into:
type 3F: sites that can only arise from paternal monosomy rescue;
type 2F: sites that can arise from either paternal monosomy rescue or paternal trisomy rescue;
type 3M: sites that can only arise from maternal monosomy rescue;
type 2M: sites that can arise from either maternal monosomy rescue or maternal trisomy rescue;
the sites inconsistent with the laws of inheritance are divided into:
type -1: one parent is inconsistent with the laws of inheritance;
type -2: both parents are inconsistent with the laws of inheritance.
It is understood that the sites consistent with biparental inheritance are sites at which both alleles of the proband can be traced to the parents, including sites consistent only with biparental inheritance (i.e. type 1, such as Aa-AA-aa) and sites consistent with both biparental inheritance and uniparental inheritance (i.e. type 0).
In one embodiment, in the uniparental fragment determination step, if more than 8 consecutive type 2F or 3F sites occur and the span they cover exceeds 1 Mbp, the region is judged to be a fragment of uniparental paternal origin; if more than 8 consecutive type 2M or 3M sites occur and the span they cover exceeds 1 Mbp, the region is judged to be a fragment of uniparental maternal origin.
It is understood that consecutive sites are sites not interrupted by a type 1 site: more than 8 consecutive type 2F or 3F sites with no type 1 site between them, or more than 8 consecutive type 2M or 3M sites with no type 1 site between them.
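The run detection can be sketched as a single pass over the classified sites of one chromosome. This is an illustrative Python sketch under stated assumptions: the function name and input layout are hypothetical, and a site attributed to the other parent is assumed to end a run just as a type 1 site does (the text only names type 1 sites as interruptions). Type 0, -1 and -2 sites are skipped without interrupting a run.

```python
PARENT_OF = {"2F": "father", "3F": "father", "2M": "mother", "3M": "mother"}


def call_uniparental_segments(sites, min_sites=8, min_span=1_000_000):
    """Find runs of uniparental sites on one chromosome.

    sites: (position, site_type) pairs sorted by position.
    A run is reported when it has more than min_sites sites and spans
    more than min_span base pairs.
    """
    segments, parent, positions = [], None, []
    # A trailing type 1 sentinel closes whatever run is still open.
    for pos, site_type in list(sites) + [(None, "1")]:
        site_parent = PARENT_OF.get(site_type)
        if site_parent is not None and site_parent == parent:
            positions.append(pos)  # the current run continues
            continue
        if site_parent is None and site_type != "1":
            continue  # type 0, -1, -2: ignored, run not interrupted
        # The run ends here: a type 1 site, or a site from the other parent.
        if len(positions) > min_sites and positions[-1] - positions[0] > min_span:
            segments.append((parent, positions[0], positions[-1], len(positions)))
        parent = site_parent
        positions = [pos] if site_parent is not None else []
    return segments


paternal = [(1_000_000 + i * 200_000, "2F") for i in range(9)]
print(call_uniparental_segments(paternal))
# [('father', 1000000, 2600000, 9)]
```

Inserting a single type 1 site in the middle of this run splits it into runs of 4 and 5 sites, neither of which meets the site-count threshold.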
In one embodiment, in the UPD determination step, each fragment judged to be uniparental is compared with the copy number analysis result from whole-exome sequencing; if the copy number analysis indicates that the region is single-copy, it is judged to be a deletion; otherwise, it is judged to be UPD.
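The decision between deletion and UPD, followed by the imprinted-region check from the pathogenic UPD screening step, can be sketched as below. The function name, the return strings and the region coordinates are hypothetical placeholders; the copy number is assumed to come from a separate copy number analysis as an integer estimate.

```python
def interpret_segment(segment, copy_number, imprinted_regions):
    """Interpret one uniparental fragment.

    segment: (chrom, start, end) of the fragment.
    copy_number: integer copy number estimate for the fragment (assumed input).
    imprinted_regions: list of (chrom, start, end) imprinted genes or bands.
    """
    chrom, start, end = segment
    if copy_number == 1:
        return "deletion"  # single copy: heterozygous deletion, not UPD
    overlaps = any(
        region_chrom == chrom and region_start <= end and start <= region_end
        for region_chrom, region_start, region_end in imprinted_regions
    )
    return "UPD, risk of pathogenic UPD" if overlaps else "UPD, benign"


# Placeholder coordinates for illustration only, not real imprinted loci.
regions = [("chrA", 2_000_000, 3_000_000)]
print(interpret_segment(("chrA", 2_500_000, 4_000_000), 2, regions))
# UPD, risk of pathogenic UPD
```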
The invention also discloses use of the NGS-trio-based uniparental disomy detection method in the research, development or manufacture of a device for screening pathogenic UPD.
The invention also discloses an NGS-trio-based uniparental disomy screening device, comprising: a data acquisition module, a data analysis module and a UPD determination module;
the data acquisition module is used for acquiring NGS sequencing data of one set of trio samples;
the data analysis module is used for analyzing the sequencing data and dividing the mutation sites into: sites consistent with biparental inheritance, sites consistent only with uniparental inheritance, and sites inconsistent with the laws of inheritance;
the UPD determination module is used for performing UPD determination on the mutation sites according to preset rules to obtain a determination result;
the data analysis module performs the analysis according to the following steps:
mutation site screening: in each sample, selecting the mutation sites that meet preset conditions and defining them as the qualified mutation sites of that sample, and designating the mutation sites that are screened out as the unqualified mutation sites of that sample;
site data merging: taking the union of the unqualified mutation sites of all samples in the trio, collecting the chromosome coordinates of each unqualified mutation site, and removing from the qualified sites of every sample any mutation site whose coordinates match an unqualified site; then, based on the remaining qualified mutation sites in the trio, filling in the genotype of any sample that has no mutation at a position as homozygous for the reference sequence;
inheritance pattern classification: classifying the trio genotype combination at each mutation site by inheritance pattern, dividing the mutation sites into: sites consistent with biparental inheritance, sites consistent only with uniparental inheritance, and sites inconsistent with the laws of inheritance;
the UPD determination module performs the analysis according to the following steps:
kinship determination: if the number of sites inconsistent with the laws of inheritance is smaller than a preset value, proceeding to the subsequent analysis; if it is greater than or equal to the preset value, judging the sample to be unqualified;
uniparental fragment determination: if the span covered by consecutive sites consistent only with uniparental paternal inheritance exceeds a preset value, judging the region to be a fragment of uniparental paternal origin; if the span covered by consecutive sites consistent only with uniparental maternal inheritance exceeds a preset value, judging the region to be a fragment of uniparental maternal origin;
UPD determination: analyzing the sequencing coverage depth of each fragment judged to be uniparental; if the depth indicates that the region is single-copy, judging it to be a deletion; otherwise, judging the region to be a UPD region;
pathogenic UPD screening: checking whether the UPD region covers an imprinted gene or the corresponding chromosome band; if not, judging the UPD region to be benign; if so, indicating a risk of pathogenic UPD.
In one embodiment, in the mutation site screening step, the mutation sites are selected as follows:
1) selecting high-quality mutation sites in the NGS sequencing data;
2) removing mutation sites located on the Y chromosome;
3) selecting point mutation sites within genes;
4) excluding suspected false-positive sites according to Hardy-Weinberg equilibrium;
5) for heterozygous sites, removing those with mutation frequency higher than 70%; for homozygous sites, removing those with mutation frequency lower than 85%;
6) typing the mutations at each position, and removing sites with more than 2 alleles;
7) the remaining sites are the mutation sites that meet the preset conditions.
In one embodiment, in the mutation site screening step:
the high-quality mutation sites are mutation sites meeting the following criteria: GATK-VQSR filter status PASS, total coverage > 20X, and mutation frequency > 25%.
In one embodiment, in the data acquisition module, the set of trio samples includes a paternal sample, a maternal sample and a proband sample;
in the site data merging step, mutation site data with matching coordinates are arranged in the order proband-father-mother.
In one embodiment, in the inheritance pattern classification step, the sites consistent with biparental inheritance are divided into:
type 1: sites consistent only with biparental inheritance;
type 0: sites consistent with both biparental inheritance and uniparental inheritance;
the sites consistent only with uniparental inheritance are divided into:
type 3F: sites that can only arise from paternal monosomy rescue;
type 2F: sites that can arise from either paternal monosomy rescue or paternal trisomy rescue;
type 3M: sites that can only arise from maternal monosomy rescue;
type 2M: sites that can arise from either maternal monosomy rescue or maternal trisomy rescue;
the sites inconsistent with the laws of inheritance are divided into:
type -1: one parent is inconsistent with the laws of inheritance;
type -2: both parents are inconsistent with the laws of inheritance.
It is understood that the sites consistent with biparental inheritance are sites at which both alleles of the proband can be traced to the parents, including sites consistent only with biparental inheritance (i.e. type 1, such as Aa-AA-aa) and sites consistent with both biparental inheritance and uniparental inheritance (i.e. type 0).
In one embodiment, in the uniparental fragment determination step, if more than 8 consecutive type 2F or 3F sites occur and the span they cover exceeds 1 Mbp, the region is judged to be a fragment of uniparental paternal origin; if more than 8 consecutive type 2M or 3M sites occur and the span they cover exceeds 1 Mbp, the region is judged to be a fragment of uniparental maternal origin.
In one embodiment, in the UPD determination step, each fragment judged to be uniparental is compared with the copy number analysis result from whole-exome sequencing; if the copy number analysis indicates that the region is single-copy, it is judged to be a deletion; otherwise, it is judged to be UPD.
The invention also discloses a storage medium which comprises a stored program, and the program realizes the functions of the modules.
The invention also discloses a processor, which is used for running a program, and the program realizes the functions of the modules.
Compared with the prior art, the invention has the following beneficial effects:
With the NGS-trio-based uniparental disomy detection method of the invention, the presence of UPD, and in particular UPD in high-risk imprinted regions, can be determined from whole-exome or whole-genome trio sequencing data at the same time as the routine search for pathogenic mutations, with no additional experimental or labor cost.
In addition, the method can assist in identifying large heterozygous deletions; depending on the density of mutation sites, the resolution can reach 1 Mbp, giving excellent detection performance.
Drawings
FIG. 1 is a schematic diagram of trisomy rescue in the background art;
FIG. 2 is a schematic diagram of monosomy rescue in the background art;
FIG. 3 is a flow chart of the NGS-trio-based uniparental disomy detection method in example 1;
FIG. 4 is a schematic view of the screening device modules in example 2;
FIG. 5 is a schematic view of a normal sample in example 3;
FIG. 6 is a schematic diagram of the analysis of the trio sample set NP21S0557-NP21S0558-NP21S0549 in example 4;
FIG. 7 is an enlarged view of the framed portion of FIG. 6;
FIG. 8 is a schematic diagram of the analysis of the trio sample set NP19E0911-NP19E0910-NP19E0912 in example 4;
FIG. 9 is an enlarged view of the framed portion of FIG. 8;
FIG. 10 is a schematic diagram of the analysis of the trio sample set NP20E957-NP20E956-NP20E958 in example 4;
FIG. 11 is an enlarged view of the framed portion of FIG. 10;
FIG. 12 is a schematic diagram of the analysis of the trio sample set NP21F6166-NP21F6167-NP21F6168 in example 5;
FIG. 13 is an enlarged view of the framed portion of FIG. 12;
FIG. 14 is a schematic diagram of the analysis of the trio sample set NP19F0315-NP19F0313-NP19F0314 in example 5;
FIG. 15 is an enlarged view of the framed portion of FIG. 14;
FIG. 16 is a schematic diagram of the analysis of the trio sample set NP21F3536-NP21F3567-NP21F3537 in example 5;
FIG. 17 is an enlarged view of the framed portion of FIG. 16;
FIG. 18 is a schematic diagram of the analysis of the trio sample set NP19E1380-NP19E1381-NP19E1382 in example 6;
FIG. 19 is an enlarged view of the framed portion of FIG. 18;
FIG. 20 is a schematic diagram of the analysis of the trio sample set NP19E0056-NP9E0057-NP9E0055 in example 6;
FIG. 21 is an enlarged view of the framed portion of FIG. 20;
wherein: in FIGS. 5, 6, 8, 10, 12, 14, 16, 18 and 20, the abscissa is the chromosome number; the lower half of each figure shows the proportion of the chromosome length covered by consecutive homozygous fragments, and the upper half shows the distribution of mutation sites on each chromosome;
in the enlarged views of FIGS. 7, 9, 11, 13, 15, 17, 19 and 21, the different types of sites on each chromosome are shown, from left to right, as follows: the cross unInherit_2 denotes type -2 sites, the round dot unInherit_1 denotes type -1 sites, the diamond Norm denotes normal sites, the solid line exome_bed denotes the whole-exome sequencing coverage, imprint location denotes imprinted regions, imprint gene denotes the extent of imprinted genes, the inverted triangle Mother denotes uniparental maternal sites (3M and 2M), and the upright triangle Father denotes uniparental paternal sites (3F and 2F).
Detailed Description
To facilitate an understanding of the invention, the invention will now be described more fully with reference to the accompanying drawings. Preferred embodiments of the present invention are shown in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Example 1
An NGS-trio-based uniparental disomy detection method, whose flow chart is shown in FIG. 3, comprises the following steps:
Step 1: Data acquisition.
NGS sequencing data are obtained for one set of trio samples. It is understood that the NGS sequencing data may be whole-exome sequencing data or whole-genome sequencing data.
The sample set must be complete: the proband sample, the paternal sample and the maternal sample are all required.
Step 2: Mutation site screening.
For a set of trio samples, the mutation sites that meet preset conditions are selected in each sample and defined as the qualified mutation sites of that sample, and the mutation sites that are screened out are designated the unqualified mutation sites of that sample. Screening proceeds as follows:
1. selecting high-quality mutation sites in the whole-exome sequencing data (GATK-VQSR filter status PASS, total coverage > 20X, mutation frequency > 25%);
2. removing mutation sites located on the Y chromosome;
3. selecting point mutation sites within genes;
4. excluding likely false-positive sites according to Hardy-Weinberg equilibrium in the local population frequency database;
5. for heterozygous sites, removing those with mutation frequency higher than 70%; for homozygous sites, removing those with mutation frequency lower than 85%;
6. typing the mutations at each position and removing sites with more than 2 alleles (humans are diploid, so a position has at most 2 alleles; more than two usually indicates a sequencing error). For example, chr1:69849 G>A, Het is typed as chr1:69849[A/G], and chr1:69849 G>A, Hom is typed as chr1:69849[A/A]. If both chr1:69849 G>A, Het and chr1:69849 G>T, Het are present, the typing is chr1:69849[A/G/T], i.e. more than 2 alleles, and the site must be removed;
7. summarizing and recording the qualified sites and the unqualified sites separately.
A qualified site must satisfy every selection condition and trigger none of the removal conditions.
Step 3: Site data merging.
1. The union of the unqualified mutation sites of the three samples (proband, father and mother) in the trio is taken, the chromosome coordinates of each unqualified mutation site are collected, and any mutation site whose coordinates match an unqualified site is removed from the qualified sites of every sample; that is, if a site fails quality control in any one sample, it is also rejected in the other two samples.
2. Based on the remaining qualified mutation sites in the trio, the genotype of any sample that has no mutation at a position is filled in as homozygous for the reference sequence. For example, if the proband is chr1:69849[A/G], the father is chr1:69849[A/A] and the mother has no mutation at this position, then, because the reference base at this position is G, the mother is typed as chr1:69849[G/G].
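The two merging operations above can be sketched in Python as follows. The dictionary layouts, the `reference` lookup and the function name are assumptions made for illustration; in practice the reference base would be read from the reference genome.

```python
TRIO_ORDER = ("proband", "father", "mother")


def merge_trio(qualified, unqualified, reference):
    """Merge per-sample sites into proband-father-mother genotype combinations.

    qualified: {sample: {(chrom, pos): (allele1, allele2)}}
    unqualified: {sample: set of (chrom, pos)}
    reference: {(chrom, pos): reference base} (hypothetical lookup table)
    """
    # A site that fails in any one sample is dropped from all three.
    failed = set().union(*(unqualified[s] for s in TRIO_ORDER))
    kept = set().union(*(qualified[s] for s in TRIO_ORDER)) - failed

    merged = {}
    for key in sorted(kept):
        ref = reference[key]
        # A sample without a call at this position is homozygous reference.
        merged[key] = tuple(qualified[s].get(key, (ref, ref)) for s in TRIO_ORDER)
    return merged


site = ("chr1", 69849)
qualified = {"proband": {site: ("A", "G")}, "father": {site: ("A", "A")}, "mother": {}}
unqualified = {"proband": set(), "father": set(), "mother": set()}
print(merge_trio(qualified, unqualified, {site: "G"}))
# {('chr1', 69849): (('A', 'G'), ('A', 'A'), ('G', 'G'))}
```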
After this processing, about 50,000 qualified mutation-site trio combinations are typically obtained from whole-exome sequencing data. The genotypes in each trio combination are ordered proband-father-mother; for example, Aa-AA-Aa means that the proband is Aa, the father is AA and the mother is Aa.
And fourthly, classifying the genetic patterns.
The classification of the genetic pattern was performed for the trio combinations for each mutation site, dividing the mutation site into: sites conforming to the inheritance of parents, sites conforming to the inheritance of only single parents and sites not conforming to the genetic rule. The method specifically comprises the following steps:
1. Sites consistent with biparental inheritance: each of the proband's two alleles can be traced to a parent. The Aa-AA-aa type can only arise through biparental inheritance, and such sites are marked as type 1 (sites consistent only with biparental inheritance). Other sites, such as Aa-Aa-Aa and AA-AA-Aa, are consistent with biparental inheritance but also with uniparental inheritance; they cannot serve as the basis for any judgment and are marked as type 0 (sites consistent with both biparental and uniparental inheritance).
2. Sites consistent only with uniparental inheritance: both alleles of the proband can only have been inherited from one parent. Taking inheritance from the father as an example, there are two cases, AA-Aa-aa and AA-AA-aa. AA-Aa-aa can only arise through monosomy rescue and is marked as type 3F; AA-AA-aa can arise through either monosomy rescue or trisomy rescue and is marked as type 2F. Likewise, the corresponding types inherited from the mother are marked as 3M and 2M.
3. The remaining sites, which are inconsistent with the rules of inheritance: a small number of sporadic sites may be caused by gene mutation, sequencing error and the like; if such sites are widespread, the possibility that a parent is not the biological parent must be considered. There are two cases: the AA-aa-aa type, in which neither parent can be the source, marked as type -2; and the Aa-AA-AA type, in which one parent cannot be the source, marked as type -1.
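One way to express this classification is the following sketch. It is not taken from the patent: the rule table is derived here from Mendelian reasoning over biallelic genotypes, and the function name `classify_trio`, the `"A/G"` genotype strings and the restriction to biallelic sites are all assumptions.

```python
def classify_trio(proband, father, mother):
    """Return the site type for one proband-father-mother genotype trio.

    Genotypes are strings such as "A/G" (hypothetical layout). The sketch
    assumes at most two distinct alleles across the whole trio.
    """
    p1, p2 = proband.split("/")
    f, m = set(father.split("/")), set(mother.split("/"))
    if len({p1, p2} | f | m) > 2:
        raise ValueError("sketch assumes a biallelic site")
    # One allele from each parent explains the proband.
    biparental = (p1 in f and p2 in m) or (p2 in f and p1 in m)
    # Both alleles can be drawn from a single parent.
    paternal = p1 in f and p2 in f
    maternal = p1 in m and p2 in m
    if biparental:
        return "0" if (paternal or maternal) else "1"
    if paternal:
        # Heterozygous father, homozygous proband: one allele was duplicated.
        return "3F" if len(f) == 2 else "2F"
    if maternal:
        return "3M" if len(m) == 2 else "2M"
    # Not explainable by either model: count how much is unexplained.
    explained = p1 in f or p1 in m or p2 in f or p2 in m
    return "-1" if explained else "-2"
```

Under these assumptions, a heterozygous proband with opposite-homozygous parents is type 1, a homozygous proband whose allele is absent from the mother is 2F or 3F depending on whether the father is homozygous or heterozygous, and a proband allele found in neither parent yields type -1 or -2.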
Fifth, judge the parental relationship.
If the number of sites inconsistent with the rules of inheritance is smaller than the preset value, proceed with the subsequent analysis; if it is greater than or equal to the preset value, the sample is judged to be unqualified.
Normally, gene mutation and sequencing errors produce a few sporadic type -1 and type -2 sites, typically no more than 100, whereas when even one parent is not the biological parent there are thousands of type -1 sites.
Accordingly, a trio with more than 800 type -1 and type -2 sites is judged to be non-parental; that is, in this embodiment the preset value (threshold) for sites inconsistent with the rules of inheritance is set to 800.
If the trio is judged to be non-parental, subsequent analysis cannot be performed. If the sample passes the relationship check, it proceeds to the next step.
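The relationship check reduces to counting discordant sites, as in this minimal sketch. The function name `relationship_ok` and the list-of-labels input are hypothetical; the comparison follows the "smaller than the preset value" wording, with 800 as the threshold of this embodiment.

```python
from collections import Counter

def relationship_ok(site_types, threshold=800):
    """Return (passes, discordant_count) for a list of site-type labels.

    site_types: iterable of labels such as "1", "0", "2F", "-1", "-2".
    The trio passes only if the number of type -1 and type -2 sites is
    below the threshold.
    """
    counts = Counter(site_types)
    discordant = counts["-1"] + counts["-2"]
    return discordant < threshold, discordant
```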
Sixth, judge the uniparental segments.
If the span covered by consecutive sites consistent only with uniparental paternal inheritance exceeds a preset value, the region is judged to be a uniparental paternal segment; if the span covered by consecutive sites consistent only with uniparental maternal inheritance exceeds a preset value, the region is judged to be a uniparental maternal segment.
Specifically, in this embodiment, uniparental paternal and maternal segments are determined as follows: a run of 8 or more consecutive type 2F or 3F sites (not interrupted by a type 1 site) whose span exceeds 1 Mbp is judged to be a uniparental paternal segment; likewise, a run of 8 or more consecutive type 2M or 3M sites (not interrupted by a type 1 site) whose span exceeds 1 Mbp is judged to be a uniparental maternal segment.
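A possible sketch of the run detection for one chromosome is shown below. The source states only that a type 1 site interrupts a run; treating every other site type (type 0, sporadic -1/-2, or the other parent's types) as neutral is an assumption of this example, as are the function name and the input layout.

```python
def find_uniparental_segments(sites, parent, min_sites=8, min_span=1_000_000):
    """Find uniparental segments on one chromosome.

    sites:  list of (position, site_type), sorted by position
            (hypothetical layout)
    parent: "F" for paternal runs (2F/3F) or "M" for maternal runs (2M/3M)
    Returns a list of (start, end, n_sites) tuples.
    """
    wanted = {"2" + parent, "3" + parent}
    segments, run = [], []

    def flush():
        # Keep the run only if it has enough sites and spans enough bases.
        if len(run) >= min_sites and run[-1] - run[0] > min_span:
            segments.append((run[0], run[-1], len(run)))
        run.clear()

    for pos, site_type in sites:
        if site_type in wanted:
            run.append(pos)
        elif site_type == "1":
            flush()  # a biparental-only site ends the current run
    flush()
    return segments
```

Both conditions must hold at once: eight sites packed into a few kilobases do not qualify, and neither does a long span supported by fewer than eight sites.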
Seventh, judge UPD.
Analyze the sequencing coverage depth of each region judged to be a uniparental segment; if the region appears to be single-copy, it is judged to be a deletion; otherwise it is judged to be a UPD segment. Specifically:
Combine this with the copy number variation (CNV) analysis of the whole-exome sequencing data, i.e., compare the coverage depth of the uniparental paternal or maternal segment with that of other samples in the same batch. If the CNV analysis indicates that the segment is single-copy, it is judged to be a deletion; otherwise it is judged to be UPD. Notably, large deletions are generally lethal: if the segment spans more than half of a chromosome, or even the entire chromosome, a deletion can essentially be ruled out when the sample is not of embryonic origin.
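As a rough illustration of the depth comparison, the sketch below contrasts a segment's depth in the proband with the same segment in other samples of the batch. It stands in for a real CNV caller, which the source uses but does not describe; the function name, the assumption that depths are already normalized, and the 0.75 cutoff (midway between the ratios expected for one and two copies) are all hypothetical.

```python
from statistics import median

def judge_upd(sample_depth, batch_depths, single_copy_max_ratio=0.75):
    """Classify a uniparental segment as "deletion" or "UPD".

    sample_depth: normalized mean depth of the segment in the proband
    batch_depths: normalized mean depths of the same segment in other
                  samples of the batch
    A ratio near 0.5 suggests one copy; a ratio near 1.0 suggests two.
    """
    ratio = sample_depth / median(batch_depths)
    call = "deletion" if ratio <= single_copy_max_ratio else "UPD"
    return call, ratio
```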
Eighth, screen for pathogenic UPD.
Check whether the UPD segment covers an imprinted gene or the corresponding chromosome band; if not, the UPD is judged to be benign; if so, a risk of pathogenic UPD is indicated.
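The imprinting check is an interval-overlap test, sketched below. The region names and coordinates in the usage note are placeholders invented for the example, not real imprinting annotations; a real run would need a curated list of imprinted genes or bands, which the source does not provide.

```python
def screen_pathogenic(segment, imprinted_regions):
    """Report whether a UPD segment overlaps any imprinted region.

    segment:           (chrom, start, end)
    imprinted_regions: {name: (chrom, start, end)} (hypothetical layout)
    Returns (verdict, names of overlapping regions).
    """
    chrom, start, end = segment
    hits = [
        name
        for name, (r_chrom, r_start, r_end) in imprinted_regions.items()
        # Closed intervals overlap when each starts before the other ends.
        if r_chrom == chrom and r_start <= end and start <= r_end
    ]
    verdict = "risk of pathogenic UPD" if hits else "benign"
    return verdict, hits
```

For instance, with a placeholder table `{"region_A": ("chr15", 20_000_000, 30_000_000)}`, a segment `("chr15", 25_000_000, 90_000_000)` is flagged, while a segment on another chromosome is reported as benign.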
Example 2
An NGS-trio-based uniparental disomy screening device, as shown in FIG. 4, comprising: a data acquisition module, a data analysis module and a UPD judgment module.
The data acquisition module is used to acquire NGS sequencing data of the same group of trio samples.
The data analysis module is used to analyze the sequencing data and divide the mutation sites into: sites consistent with biparental inheritance, sites consistent only with uniparental inheritance, and sites inconsistent with the rules of inheritance; the data analysis module performs the analysis according to steps two through four of example 1.
The UPD judgment module is used to perform UPD judgment on the mutation sites according to preset rules and obtain a judgment result; the UPD judgment module performs the judgment according to steps five through eight of example 1.
Example 3
NGS-trio-based uniparental disomy screening was performed on a group of clinical samples (NP19E1936-NP19E1937-NP19F0086) using the screening device of example 2.
As shown in FIG. 3, the sample has almost only Norm (normal) sites; other site types appear only sporadically, attributable to sequencing errors or to new mutations arising during inheritance, so the result indicates a normal sample.
Example 4
NGS-trio-based uniparental disomy screening using the screening device of example 2, illustrated with 3 groups of clinical samples.
1. Trio sample group: NP21S0557-NP21S0558-NP21S0549.
The results are shown in FIGS. 4-5. The sample contains sites consistent with biparental inheritance, sites consistent only with uniparental inheritance, and sites inconsistent with the rules of inheritance, all evenly distributed. There are 11443 type -1 and type -2 sites, exceeding 800, so the result is judged to be unqualified: neither parent is the biological parent, or the samples are mismatched, and the subsequent judgment cannot be carried out.
2. Trio sample group: NP19E0911-NP19E0910-NP19E0912.
The results are shown in FIGS. 6-7. The sample contains sites consistent with biparental inheritance, sites consistent only with uniparental maternal inheritance, and sites inconsistent with the rules of inheritance, all evenly distributed, while uniparental paternal sites are lacking (almost no 2F or 3F sites). There are 5878 type -1 and type -2 sites, exceeding 800, so the result is judged to be unqualified: the father is not the biological father, or the samples are mismatched, and the subsequent judgment cannot be carried out.
3. Trio sample group: NP20E957-NP20E956-NP20E958.
The results are shown in FIGS. 8-9. The sample contains sites consistent with biparental inheritance, sites consistent only with uniparental paternal inheritance, and sites inconsistent with the rules of inheritance, all evenly distributed, while uniparental maternal sites are lacking (almost no 2M or 3M sites). There are 6044 type -1 and type -2 sites, exceeding 800, so the result is judged to be unqualified: the mother is not the biological mother, or the samples are mismatched, and the subsequent judgment cannot be carried out.
Analysis shows that these samples do not meet the requirements for a trio: the paternal and/or maternal sample is effectively missing, and the subsequent analysis cannot continue.
Example 5
NGS-trio-based uniparental disomy screening using the screening device of example 2, illustrated with 3 groups of clinical samples.
1. Trio sample group: NP21F6166-NP21F6167-NP21F6168.
As shown in FIGS. 10-11, chr15 in this sample carries only sites consistent with uniparental maternal inheritance, while the remaining autosomes carry almost exclusively sites consistent with biparental inheritance, evenly distributed; sites consistent with uniparental paternal inheritance and sites inconsistent with the rules of inheritance are lacking (almost no 2F, 3F, -1 or -2 sites). Because chr15 carries 180 consecutive type 2M or 3M sites covering about 72 Mbp and the CNV result shows no abnormality, the result is judged to be maternal UPD of chr15. The UPD segment covers multiple imprinted regions, indicating high-risk pathogenic UPD.
2. Trio sample group: NP19F0315-NP19F0313-NP19F0314.
As shown in FIGS. 12-13, chr6 in this sample carries only sites consistent with uniparental paternal inheritance, while the remaining autosomes carry almost exclusively sites consistent with biparental inheritance, evenly distributed; sites consistent with uniparental maternal inheritance and sites inconsistent with the rules of inheritance are lacking (almost no 2M, 3M, -1 or -2 sites). Because chr6 carries 813 consecutive type 2F or 3F sites covering about 169 Mbp and the CNV result shows no abnormality, the result is judged to be paternal UPD of chr6. The UPD segment covers multiple imprinted regions, indicating high-risk pathogenic UPD.
3. Trio sample group: NP21F3536-NP21F3567-NP21F3537.
As shown in FIGS. 14-15, chr20 in this sample carries only sites consistent with uniparental maternal inheritance, while the remaining autosomes carry almost exclusively sites consistent with biparental inheritance, evenly distributed; sites consistent with uniparental paternal inheritance and sites inconsistent with the rules of inheritance are lacking (almost no 2F, 3F, -1 or -2 sites). Because chr20 carries 197 consecutive type 2M or 3M sites covering about 63 Mbp and the CNV result shows no abnormality, the result is judged to be maternal UPD of chr20. The UPD segment covers multiple imprinted regions, indicating high-risk pathogenic UPD.
Analysis shows that these samples carry a risk of pathogenic UPD.
Example 6
NGS-trio-based uniparental disomy screening using the screening device of example 2, illustrated with 2 groups of clinical samples.
1. Trio sample group: NP19E1380-NP19E1381-NP19E1382.
The results are shown in FIGS. 16-17. A short local segment of chr15 in this sample carries only sites consistent with uniparental paternal inheritance, while the rest of chr15 and the remaining autosomes carry almost exclusively sites consistent with biparental inheritance, evenly distributed; sites consistent with uniparental maternal inheritance and sites inconsistent with the rules of inheritance are lacking (almost no 2M, 3M, -1 or -2 sites). Because chr15 carries 16 consecutive type 2F or 3F sites covering about 4 Mbp, and the CNV result indicates a heterozygous deletion of about 4 Mbp in the same region of chr15, the result is judged to be a local maternal deletion of chr15, i.e., only one copy, the paternal one, is present in this region (the clinical impact is similar to that of paternal UPD). The segment covers multiple imprinted regions, indicating a high-risk pathogenic maternal heterozygous deletion.
2. Trio sample group: NP19E0056-NP9E0057-NP9E0055.
The results are shown in FIGS. 18-19. A short local segment of chr8 in this sample carries only sites consistent with uniparental maternal inheritance (an isolated paternal site within it may be due to sequencing error or other causes and does not affect the overall analysis), while the rest of chr8 and the remaining autosomes carry almost exclusively sites consistent with biparental inheritance, evenly distributed; sites consistent with uniparental paternal inheritance and sites inconsistent with the rules of inheritance are lacking (almost no 2F, 3F, -1 or -2 sites). Because chr8 carries 69 consecutive type 2M or 3M sites covering about 11 Mbp, and the CNV result indicates a heterozygous deletion of about 11 Mbp in the same region of chr8, the result is judged to be a local paternal deletion of chr8, i.e., only one copy, the maternal one, is present in this region (the clinical impact is similar to that of maternal UPD). The segment covers multiple imprinted regions, indicating a high-risk pathogenic paternal heterozygous deletion.
Analysis shows that these samples carry high-risk pathogenic heterozygous deletions, whose clinical impact is similar to that of UPD from the parent opposite to the deleted copy (e.g., a paternal heterozygous deletion has a clinical impact similar to that of maternal UPD).
Example 7
UPD screening was performed on 792 whole-exome trio sequencing cases at this testing center using the screening device of example 2; the results are shown in the table below.
Table 1. UPD screening results in 792 whole-exome trio sequencing cases
Figure BDA0002617920140000111
Figure BDA0002617920140000121
Note: "uniparental origin detected" means that UPD (14 groups) or a heterozygous deletion (32 groups) was detected.
"PWS-AS" refers to the pathogenic conditions caused by chr15 UPD: maternal UPD causes Prader-Willi syndrome (PWS) and paternal UPD causes Angelman syndrome (AS).
chr15 UPD is a common pathogenic condition for which methylation-based detection methods are currently available on the market. The 7 cases of chr15 UPD identified in this example were all verified by methylation testing, and the results agreed in every case, showing that the method gives highly accurate results.
The technical features of the embodiments described above may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described, but any combination should be considered within the scope of this specification as long as the combined features do not contradict each other.
The embodiments described above represent only several embodiments of the present invention. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make variations and modifications without departing from the inventive concept, and these all fall within the scope of protection of the present invention. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (17)

1. An NGS-trio-based uniparental disomy detection method, characterized by comprising the following steps:
data acquisition: acquiring NGS sequencing data of the same group of trio samples;
mutation site screening: in each sample, selecting the mutation sites that meet preset conditions and defining them as the qualified mutation sites of that sample, and defining the mutation sites removed by screening as the unqualified mutation sites of that sample;
site data merging: taking the union of the unqualified mutation sites of all samples in the same group of trio samples, collecting the chromosome coordinates of each unqualified mutation site, and removing from the qualified sites of each sample any mutation site whose coordinates match an unqualified mutation site; then, based on the qualified mutation sites remaining in the group of samples, filling in the genotype of any sample without a mutation at a given position as homozygous for the reference sequence;
inheritance pattern classification: classifying the trio combination at each mutation site by inheritance pattern, dividing the mutation sites into: sites consistent with biparental inheritance, sites consistent only with uniparental inheritance, and sites inconsistent with the rules of inheritance;
parental relationship judgment: if the number of sites inconsistent with the rules of inheritance is smaller than a preset value, performing the subsequent analysis; if it is greater than or equal to the preset value, judging the sample to be unqualified;
uniparental segment judgment: if the span covered by consecutive sites consistent only with uniparental paternal inheritance exceeds a preset value, judging the region to be a uniparental paternal segment; if the span covered by consecutive sites consistent only with uniparental maternal inheritance exceeds a preset value, judging the region to be a uniparental maternal segment;
UPD judgment: analyzing the sequencing coverage depth of each region judged to be a uniparental segment, and judging the region to be a deletion if it appears to be single-copy; otherwise, judging the region to be a UPD segment;
pathogenic UPD screening: checking whether the UPD segment covers an imprinted gene or the corresponding chromosome band; if not, judging the UPD to be benign; if so, indicating a risk of pathogenic UPD.
2. The NGS-trio-based uniparental disomy detection method according to claim 1, wherein in the mutation site screening step, mutation sites are selected as follows:
1) selecting high-quality mutation sites in the NGS sequencing data;
2) removing mutation sites located on the Y chromosome;
3) selecting point-mutation sites within genes;
4) removing suspected false-positive sites according to Hardy-Weinberg equilibrium;
5) for heterozygous sites, removing sites with a mutation frequency higher than 70%, and for homozygous sites, removing sites with a mutation frequency lower than 85%;
6) genotyping the mutations at each position and removing sites with more than 2 alleles;
7) the remaining sites are the mutation sites that meet the preset conditions.
3. The NGS-trio-based uniparental disomy detection method according to claim 1, wherein in the mutation site screening step, the high-quality mutation sites are mutation sites meeting the following criteria: GATK-VQSR quality control PASS, total coverage >20X, and mutation frequency >25%.
4. The NGS-trio-based uniparental disomy detection method according to any one of claims 1-3, wherein in the data acquisition step, the same group of trio samples comprises paternal, maternal and proband samples;
in the site data merging step, mutation site data with matching coordinates are arranged in the order proband-father-mother.
5. The NGS-trio-based uniparental disomy detection method according to claim 4, wherein in the inheritance pattern classification step, the sites consistent with biparental inheritance are divided into:
type 1: sites consistent only with biparental inheritance;
type 0: sites consistent with both biparental inheritance and uniparental inheritance;
the sites consistent only with uniparental inheritance are divided into:
type 3F: sites that can only arise through paternal monosomy rescue;
type 2F: sites that can arise through either paternal monosomy rescue or paternal trisomy rescue;
type 3M: sites that can only arise through maternal monosomy rescue;
type 2M: sites that can arise through either maternal monosomy rescue or maternal trisomy rescue;
the sites inconsistent with the rules of inheritance are divided into:
type -1: sites inconsistent with the rules of inheritance with respect to one parent;
type -2: sites inconsistent with the rules of inheritance with respect to both parents.
6. The NGS-trio-based uniparental disomy detection method according to claim 5, wherein in the uniparental segment judgment step, a run of 8 or more consecutive type 2F or 3F sites whose span exceeds 1 Mbp is judged to be a uniparental paternal segment; and a run of 8 or more consecutive type 2M or 3M sites whose span exceeds 1 Mbp is judged to be a uniparental maternal segment.
7. The NGS-trio-based uniparental disomy detection method according to claim 1, wherein in the UPD judgment step, the data judged to be a uniparental segment are compared with the result of the whole-exome sequencing copy number analysis; if the copy number analysis indicates that the segment is single-copy, the segment is judged to be a deletion; otherwise, it is judged to be UPD.
8. Use of the NGS-trio-based uniparental disomy detection method according to any one of claims 1-7 in the development or manufacture of a device for pathogenic UPD screening.
9. An NGS-trio-based uniparental disomy screening device, characterized by comprising: a data acquisition module, a data analysis module and a UPD judgment module;
the data acquisition module is used to acquire NGS sequencing data of the same group of trio samples;
the data analysis module is used to analyze the sequencing data and divide the mutation sites into: sites consistent with biparental inheritance, sites consistent only with uniparental inheritance, and sites inconsistent with the rules of inheritance;
the UPD judgment module is used to perform UPD judgment on the mutation sites according to preset rules and obtain a judgment result;
the data analysis module performs the analysis according to the following steps:
mutation site screening: in each sample, selecting the mutation sites that meet preset conditions and defining them as the qualified mutation sites of that sample, and defining the mutation sites removed by screening as the unqualified mutation sites of that sample;
site data merging: taking the union of the unqualified mutation sites of all samples in the same group of trio samples, collecting the chromosome coordinates of each unqualified mutation site, and removing from the qualified sites of each sample any mutation site whose coordinates match an unqualified mutation site; then, based on the qualified mutation sites remaining in the group of samples, filling in the genotype of any sample without a mutation at a given position as homozygous for the reference sequence;
inheritance pattern classification: classifying the trio combination at each mutation site by inheritance pattern, dividing the mutation sites into: sites consistent with biparental inheritance, sites consistent only with uniparental inheritance, and sites inconsistent with the rules of inheritance;
the UPD judgment module performs the analysis according to the following steps:
parental relationship judgment: if the number of sites inconsistent with the rules of inheritance is smaller than a preset value, performing the subsequent analysis; if it is greater than or equal to the preset value, judging the sample to be unqualified;
uniparental segment judgment: if the span covered by consecutive sites consistent only with uniparental paternal inheritance exceeds a preset value, judging the region to be a uniparental paternal segment; if the span covered by consecutive sites consistent only with uniparental maternal inheritance exceeds a preset value, judging the region to be a uniparental maternal segment;
UPD judgment: analyzing the sequencing coverage depth of each region judged to be a uniparental segment, and judging the region to be a deletion if it appears to be single-copy; otherwise, judging the region to be a UPD segment;
pathogenic UPD screening: checking whether the UPD segment covers an imprinted gene or the corresponding chromosome band; if not, judging the UPD to be benign; if so, indicating a risk of pathogenic UPD.
10. The NGS-trio-based uniparental disomy screening device according to claim 9, wherein in the mutation site screening step, mutation sites are selected as follows:
1) selecting high-quality mutation sites in the NGS sequencing data;
2) removing mutation sites located on the Y chromosome;
3) selecting point-mutation sites within genes;
4) removing suspected false-positive sites according to Hardy-Weinberg equilibrium;
5) for heterozygous sites, removing sites with a mutation frequency higher than 70%, and for homozygous sites, removing sites with a mutation frequency lower than 85%;
6) genotyping the mutations at each position and removing sites with more than 2 alleles;
7) the remaining sites are the mutation sites that meet the preset conditions.
11. The NGS-trio-based uniparental disomy screening device according to claim 9, wherein in the mutation site screening step, the high-quality mutation sites are mutation sites meeting the following criteria: GATK-VQSR quality control PASS, total coverage >20X, and mutation frequency >25%.
12. The NGS-trio-based uniparental disomy screening device according to claim 9, wherein, for the data acquisition module, the same group of trio samples comprises paternal, maternal and proband samples;
in the site data merging step, mutation site data with matching coordinates are arranged in the order proband-father-mother.
13. The NGS-trio-based uniparental disomy screening device according to claim 12, wherein in the inheritance pattern classification step, the sites consistent with biparental inheritance are divided into:
type 1: sites consistent only with biparental inheritance;
type 0: sites consistent with both biparental inheritance and uniparental inheritance;
the sites consistent only with uniparental inheritance are divided into:
type 3F: sites that can only arise through paternal monosomy rescue;
type 2F: sites that can arise through either paternal monosomy rescue or paternal trisomy rescue;
type 3M: sites that can only arise through maternal monosomy rescue;
type 2M: sites that can arise through either maternal monosomy rescue or maternal trisomy rescue;
the sites inconsistent with the rules of inheritance are divided into:
type -1: sites inconsistent with the rules of inheritance with respect to one parent;
type -2: sites inconsistent with the rules of inheritance with respect to both parents.
14. The NGS-trio-based uniparental disomy screening device according to claim 13, wherein in the uniparental segment judgment step, a run of 8 or more consecutive type 2F or 3F sites whose span exceeds 1 Mbp is judged to be a uniparental paternal segment; and a run of 8 or more consecutive type 2M or 3M sites whose span exceeds 1 Mbp is judged to be a uniparental maternal segment.
15. The NGS-trio-based uniparental disomy screening device according to claim 9, wherein in the UPD judgment step, the data judged to be a uniparental segment are compared with the result of the whole-exome sequencing copy number analysis; if the copy number analysis indicates that the segment is single-copy, the segment is judged to be a deletion; otherwise, it is judged to be UPD.
16. A storage medium, characterized in that the storage medium comprises a stored program which implements the functions of the modules of any one of claims 9-15.
17. A processor, characterized in that the processor is adapted to run a program which implements the functions of the modules of any one of claims 9-15.
CN202010774623.XA 2020-08-04 2020-08-04 Method for detecting single parent diploid based on NGS-trio and application Active CN111863125B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010774623.XA CN111863125B (en) 2020-08-04 2020-08-04 Method for detecting single parent diploid based on NGS-trio and application

Publications (2)

Publication Number Publication Date
CN111863125A true CN111863125A (en) 2020-10-30
CN111863125B CN111863125B (en) 2024-04-12

Family

ID=72953607

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010774623.XA Active CN111863125B (en) 2020-08-04 2020-08-04 Method for detecting single parent diploid based on NGS-trio and application

Country Status (1)

Country Link
CN (1) CN111863125B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112375829A (en) * 2020-11-25 2021-02-19 苏州赛美科基因科技有限公司 Method and device for identifying UPD (user Equipment) by using family WES (family WES) data and electronic equipment
CN112687336A (en) * 2021-03-11 2021-04-20 北京贝瑞和康生物技术有限公司 Method, computing device and storage medium for determining UPD type
CN113593644A (en) * 2021-06-29 2021-11-02 广东博奥医学检验所有限公司 Method for detecting chromosome uniparental disomy by low-depth sequencing based on family
CN114255821A (en) * 2021-12-31 2022-03-29 天津金域医学检验实验室有限公司 Family three-sample high-throughput sequencing risk grouping screening method and system
CN117025753A (en) * 2023-08-15 2023-11-10 广州女娲生命科技有限公司 Method and device for detecting chromosomal variation

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070184467A1 (en) * 2005-11-26 2007-08-09 Matthew Rabinowitz System and method for cleaning noisy genetic data from target individuals using genetic data from genetically related individuals
CN110029157A (en) * 2018-01-11 2019-07-19 北京大学 A method of the unicellular genome monoploid of detection tumour copies number variation
CN110211630A (en) * 2019-06-06 2019-09-06 广州金域医学检验中心有限公司 The screening apparatus and storage medium and processor of pathogenic uniparental disomy
CN112201306A (en) * 2020-09-21 2021-01-08 广州金域医学检验集团股份有限公司 True and false gene mutation analysis method based on high-throughput sequencing and application
CN114566213A (en) * 2022-01-20 2022-05-31 四川省妇幼保健院 Single-parent diploid analysis method and system for family high-throughput sequencing data
CN114921536A (en) * 2022-06-28 2022-08-19 苏州贝康医疗器械有限公司 Method, device, storage medium and equipment for detecting uniparental diploid and loss of heterozygosity

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CAROLINE F. WRIGHT, et al.: "Paediatric genomics: diagnosing rare disease in children", 《GENETICS》, vol. 19, pages 1 - 16 *
DANIELA DEL GAUDIO, PHD, et al.: "Diagnostic testing for uniparental disomy: a points to consider statement from the American College of Medical Genetics and Genomics (ACMG)", 《GENETICS IN MEDICINE》, pages 1133 - 1140 *
KEVIN YAUY, et al.: "Accurate detection of clinically relevant uniparental disomy from exome sequencing data", 《GENETICS IN MEDICINE》, pages 803 - 807 *
王煜: "Method for detecting genome-wide copy number variation, loss of heterozygosity, and uniparental disomy based on target capture sequencing", 《China Master's Theses Full-text Database, Basic Sciences》, pages 006 - 134 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112375829A (en) * 2020-11-25 2021-02-19 苏州赛美科基因科技有限公司 Method and device for identifying UPD using family WES data, and electronic equipment
CN112687336A (en) * 2021-03-11 2021-04-20 北京贝瑞和康生物技术有限公司 Method, computing device and storage medium for determining UPD type
CN112687336B (en) * 2021-03-11 2021-06-22 北京贝瑞和康生物技术有限公司 Method, computing device and storage medium for determining UPD type
CN113593644A (en) * 2021-06-29 2021-11-02 广东博奥医学检验所有限公司 Method for detecting chromosome uniparental disomy by low-depth sequencing based on family
CN113593644B (en) * 2021-06-29 2024-03-26 广东博奥医学检验所有限公司 Method for detecting chromosomal uniparental disomy based on family low-depth sequencing
CN114255821A (en) * 2021-12-31 2022-03-29 天津金域医学检验实验室有限公司 Family three-sample high-throughput sequencing risk grouping screening method and system
CN114255821B (en) * 2021-12-31 2024-08-06 天津金域医学检验实验室有限公司 Family three-sample high-throughput sequencing risk grouping screening method and system
CN117025753A (en) * 2023-08-15 2023-11-10 广州女娲生命科技有限公司 Method and device for detecting chromosomal variation

Also Published As

Publication number Publication date
CN111863125B (en) 2024-04-12

Similar Documents

Publication Publication Date Title
CN111863125A (en) Mono-parent diploid detection method based on NGS-trio and application
McCarroll et al. Integrated detection and population-genetic analysis of SNPs and copy number variation
Castel et al. Tools and best practices for data processing in allelic expression analysis
Fujimoto et al. Whole-genome sequencing and comprehensive variant analysis of a Japanese individual using massively parallel sequencing
Wong et al. Deep whole-genome sequencing of 100 southeast Asian Malays
Gorcenco et al. New generation genetic testing entering the clinic
US8090543B2 (en) Computer algorithm for automatic allele determination from fluorometer genotyping device
Adams et al. Analysis of DNA sequence variants detected by high‐throughput sequencing
CA3160566A1 (en) Systems and methods for predicting homologous recombination deficiency status of a specimen
CN110211630A (en) Screening apparatus, storage medium, and processor for pathogenic uniparental disomy
Scharpf et al. Hidden Markov models for the assessment of chromosomal alterations using high-throughput SNP arrays
Pankratov et al. Prioritizing autoimmunity risk variants for functional analyses by fine-mapping mutations under natural selection
US20140274749A1 (en) Systems and Methods for SNP Characterization and Identifying off Target Variants
CN114921536A (en) Method, device, storage medium and equipment for detecting uniparental diploid and loss of heterozygosity
WO2022027212A1 (en) Method for detecting uniparental disomy on basis of ngs-trio and use thereof
Oliveira et al. Homozygosity mapping using whole-exome sequencing: a valuable approach for pathogenic variant identification in genetic diseases
CN114566213A (en) Single-parent diploid analysis method and system for family high-throughput sequencing data
O’Rielly et al. Genetic Epidemiology of Complex Phenotypes
Sun et al. A genetical genomics approach to genome scans increases power for QTL mapping
CN115579056B (en) Gene group for evaluating molecular typing of schizophrenia, and diagnostic product and application thereof
Oliveira et al. Evaluating runs of homozygosity in exome sequencing data-utility in disease inheritance model selection and variant filtering
CN109402114A (en) Method for assisting in the identification of height and its special primer set
CN112687336B (en) Method, computing device and storage medium for determining UPD type
Kurki et al. Contribution of rare and common variants to intellectual disability in a high-risk population sub-isolate of Northern Finland
Zukauskaite et al. Allele Frequency Analysis Suggests Potentially Protective Effect in the Lithuanian Population

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: No. 10, Helix 3 Road, International Biological Island, Huangpu District, Guangzhou City, Guangdong Province, 510320

Applicant after: GUANGZHOU KINGMED CENTER FOR CLINICAL LABORATORY

Applicant after: GUANGZHOU KINGMED DIAGNOSTICS GROUP Co.,Ltd.

Address before: 510335 3rd floor, 2429 Xingang East Road, Haizhu District, Guangzhou City, Guangdong Province

Applicant before: GUANGZHOU KINGMED CENTER FOR CLINICAL LABORATORY

Country or region before: China

Applicant before: GUANGZHOU KINGMED DIAGNOSTICS GROUP Co.,Ltd.

GR01 Patent grant