
WO2018049946A1 - Biomarker composition for detection of adenomyosis and application thereof - Google Patents

Biomarker composition for detection of adenomyosis and application thereof

Info

Publication number
WO2018049946A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
adenomyosis
sample
nucleic acid
biomarker
Prior art date
Application number
PCT/CN2017/096248
Other languages
French (fr)
Chinese (zh)
Inventor
贾慧珏
钟焕姿
宋晓蕾
王子榕
陈晨
Original Assignee
深圳华大基因研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大基因研究院 filed Critical 深圳华大基因研究院
Priority to CN201780047953.5A priority Critical patent/CN109689890B/en
Publication of WO2018049946A1 publication Critical patent/WO2018049946A1/en

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/11DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00Oligonucleotides characterized by their use
    • C12Q2600/136Screening for pharmacological compounds

Definitions

  • The present application relates to the field of biomarkers, and in particular to a biomarker combination for adenomyosis detection or disease risk assessment and its use.
  • Adenomyosis is a condition caused by the endometrium and its glands invading the myometrium. Under normal circumstances the endometrium lies against the myometrium with a clear boundary between them. When the endometrium and the superficial muscle layer are damaged, for example by childbirth, multiple abortions, or curettage, endometrial tissue can take advantage of the breach, grow into the myometrium, and stimulate the proliferation of the surrounding muscle cells, forming adenomyosis.
  • The endometrium within the myometrium can behave like normal endometrium, with periodic hyperemia, edema, and even hemorrhage following the menstrual cycle, causing intense uterine contractions and severe abdominal pain. The patient's uterus becomes uniformly enlarged and hard, with menorrhagia and prolonged menstrual periods; severe cases can lead to anemia.
  • Adenomyosis has occurred mostly in women over 40 years old, but in recent years patients have tended to be younger, which may be related to the increase in cesarean sections, induced abortions, and other operations.
  • The clinical diagnosis of adenomyosis depends mainly on symptoms, pelvic examination, and ultrasound examination. Ultrasound scanning can show the entire uterus enlarged and the uterine wall, especially the posterior wall, thickened; a thickness exceeding 2.5 cm is almost certainly abnormal. If a distinct mass is present, it may be a fibroid or an adenomyoma. The two can also be distinguished by ultrasound, because an adenomyoma has no capsule around its periphery whereas a fibroid does, and the ultrasound echo of an adenomyoma is stronger than that of a fibroid. In addition, the tumor marker CA125 can assist in diagnosis. However, none of the above methods can achieve early detection or risk assessment of adenomyosis.
  • The purpose of the present application is to provide a biomarker combination for adenomyosis detection or disease risk assessment, and its use in adenomyosis test kits, detection tools, or drug screening.
  • One aspect of the present application discloses a biomarker combination for adenomyosis detection or disease risk assessment, the biomarker combination comprising at least one of forty-four nucleic acids, the forty-four nucleic acids being the sequences shown in Seq ID No. 1 to Seq ID No. 44 or sequences having 97% or more similarity to them.
  • The forty-four nucleic acids of the present application were identified through research as nucleic acid sequences associated with adenomyosis, and each of them is individually associated with adenomyosis. Therefore, where the required accuracy of the judgment is low, they may be used alone or in combination for adenomyosis detection or disease risk assessment. In a preferred embodiment of the present application, however, the forty-four nucleic acids are not only used together but are also classified according to a specific rule into a plurality of marker groups, and the marker groups are used together for adenomyosis detection or disease risk assessment, as described in detail in the following preferred technical solutions.
  • The nucleic acids of the present application were obtained by clustering sequences at 97% or more similarity and then selecting the most representative sequence from each operational taxonomic unit (abbreviated OTU) as its seed sequence. Forty-four of these seed sequences are associated with adenomyosis and constitute the biomarker combination of the present application. Therefore, in the biomarker combination of the present application, the forty-four nucleic acids are not limited to the sequences shown in Seq ID No. 1 to Seq ID No. 44; each may also be a sequence having 97% or more similarity to the sequence shown in Seq ID No. 1 to Seq ID No. 44.
  • The biomarker combination of the present application does not detect adenomyosis or assess disease risk directly from the presence or absence of the biomarkers. Instead, after the biomarker combination is detected, a random forest model is applied, and whether the test subject has adenomyosis, or the subject's risk of adenomyosis, is judged from the probability output by the random forest model. This is explained in detail in the following technical solutions.
  • Another aspect of the present application discloses a biomarker combination for adenomyosis detection or disease risk assessment, the biomarker combination comprising at least one of a first marker group, a second marker group, and a third marker group;
  • the first marker group consists of eighteen nucleic acids, which are respectively the sequences shown in Seq ID No. 1 to Seq ID No. 18, or sequences having 97% or more similarity to the sequences shown in Seq ID No. 1 to Seq ID No. 18;
  • the second marker group consists of twenty-two nucleic acids, which are respectively the sequences shown in Seq ID No. 1, Seq ID No. 4, Seq ID No. 5, Seq ID No. 7, Seq ID No. 10, Seq ID No. 11, Seq ID No. 13, Seq ID No. 15, and Seq ID No. 18 to Seq ID No. 31, or sequences having 97% or more similarity to those sequences;
  • the third marker group consists of eighteen nucleic acids, which are respectively the sequences shown in Seq ID No. 1, Seq ID No. 2, Seq ID No. 13, Seq ID No. 19, Seq ID No. 28, and Seq ID No. 32 to Seq ID No. 44, or sequences having 97% or more similarity to those sequences.
  • The forty-four nucleic acids are divided, with some nucleic acids belonging to more than one group, into three marker groups, namely a first marker group, a second marker group, and a third marker group;
  • a comprehensive judgment over the three marker groups can greatly improve the accuracy with which the biomarker combination of the present application detects adenomyosis or assesses disease risk.
  • the first marker group is a CL marker group for performing adenomyosis detection or risk assessment of a sample from the lower third of the vagina.
  • the second marker group is a CU marker group for performing adenomyosis detection or risk assessment of a sample from the posterior vaginal fornix.
  • the third marker group is a CV marker group for performing adenomyosis detection or risk assessment of a sample from the cervical canal.
  • The forty-four nucleic acids in the biomarker combination of the present application actually represent 28 kinds of microorganisms in the lower third of the vagina, the posterior vaginal fornix, and the cervical canal. In the present application, the forty-four nucleic acids of these 28 microorganisms were detected at the three sites, the relationship between their relative abundance and adenomyosis was statistically analyzed, and a random forest model was established to determine whether the subject has adenomyosis or is at risk of developing it. The three marker groups therefore correspond to the three sampling sites: the sample from each site is analyzed and judged independently with its own marker group. A comprehensive judgment based on the results from the three sites, however, can improve the accuracy with which the biomarker combination of the present application detects adenomyosis or assesses disease risk.
  • The microorganisms at these sites number far more than 28 species, and the nucleic acids of the 28 microorganisms number far more than the 44 described in the present application;
  • the present application screens out forty-four nucleic acids of 28 microorganisms with the random forest model as biomarkers for adenomyosis detection, providing a new approach to the detection and evaluation of adenomyosis.
  • The CL marker group is the marker group for samples from the lower third of the vagina, which is abbreviated as CL; the CU marker group is the marker group for posterior vaginal fornix samples, the posterior vaginal fornix being abbreviated as CU; and the CV marker group is the marker group for cervical canal samples, the cervical canal being abbreviated as CV.
  • Another aspect of the present application discloses a kit for adenomyosis detection or disease risk assessment, comprising a primer pair for detecting the biomarker combination of the present application; the forward primer of the primer pair is the sequence shown in SEQ ID No. 45, and the reverse primer is the sequence shown in SEQ ID No. 46.
  • biomarker combination of the present application can be present as a standard reference.
  • the primer pair is used directly for PCR amplification of the biomarker combination in the sample to be tested.
  • Another aspect of the present application discloses the use of the biomarker combination of the present application in drug screening for adenomyosis, or in the preparation of a kit or detection tool for adenomyosis detection or risk assessment.
  • The biomarker combination of the present application was itself developed for adenomyosis and can of course be used for the detection or risk assessment of adenomyosis. It can also be integrated into a dedicated kit or tool for detecting adenomyosis in order to facilitate detection and evaluation; any such kit or tool that employs the biomarker combination of the present application is within the scope of protection of the present application.
  • Because the biomarker combination of the present application can detect adenomyosis or assess the risk of adenomyosis, it can also be used to compare the disease state or disease risk before and after medication, and thereby determine whether the drug used is effective, for the purpose of drug screening.
  • a further aspect of the present application discloses a method for detecting adenomyosis, comprising the following steps,
  • the level of each nucleic acid is the relative abundance of each nucleic acid; the reference data set or reference value is the level of each nucleic acid in the biomarker combination derived from adenomyosis patients and non-adenomyosis controls.
  • the reference data set or reference value in step (2) is at least one of Table 5, Table 6, or Table 7; comparing the level of each nucleic acid with the reference data set or reference value to obtain a detection result specifically includes using a multivariate statistical model to calculate the probability of disease; preferably, the multivariate statistical model is a random forest model.
  • The sample collection of the sample to be tested in step (1) includes collecting a sample from the lower third of the vagina, a posterior vaginal fornix sample, and a cervical canal sample.
  • The biomarker combination of the present application consists of nucleic acids that have been studied and found to be associated with adenomyosis. Therefore, by analyzing samples collected from different sites of the test subject for the level, that is, the relative abundance, of the corresponding biomarker combination, it is possible to detect whether the subject has the disease or to judge the risk of disease.
  • A further aspect of the present application discloses the use of a method for determining adenomyosis by detecting a biomarker in preparing a kit or tool for assessing adenomyosis or the risk of adenomyosis, wherein the biomarker is the biomarker combination of the present application;
  • a method for determining adenomyosis by detecting a biomarker includes the following steps,
  • the level of each nucleic acid is the relative abundance of each nucleic acid; the reference data set or reference value is the level of each nucleic acid in the biomarker combination derived from adenomyosis patients and non-adenomyosis controls.
  • the reference data set or reference value in step (2) is at least one of Table 5, Table 6, or Table 7; comparing the level of each nucleic acid with the reference data set or reference value to obtain a detection result specifically includes using a multivariate statistical model to calculate the probability of disease; preferably, the multivariate statistical model is a random forest model.
  • a further aspect of the present application discloses a method of screening for a drug candidate for treating adenomyosis, comprising the following steps,
  • step 2) comparing the levels of each nucleic acid in the sample before and after administration, specifically including calculating the probability of disease using a multivariate statistical model, preferably, the multivariate statistical model is a random forest model.
  • a further aspect of the present application discloses a method for detecting a microbiota in a female reproductive tract, comprising the following steps:
  • the level of each nucleic acid is the relative abundance of each nucleic acid; the reference data set or reference value is the level of each nucleic acid in the biomarker combination derived from adenomyosis patients and non-adenomyosis controls.
  • the reference data set or reference value in step (2) is at least one of Table 5, Table 6, or Table 7; comparing the level of each nucleic acid with the reference data set or reference value to obtain a detection result specifically includes using a multivariate statistical model to calculate the probability of disease; more preferably, the multivariate statistical model is a random forest model.
  • Collecting the microbial sample from the genital tract of the test subject specifically comprises collecting a sample from the lower third of the vagina, a posterior vaginal fornix sample, and a cervical canal sample of the test subject.
  • the collection of the microbial samples in the reproductive tract can be carried out by using a conventional nylon fluff swab, which is not specifically limited herein.
  • The biomarker combination of the present application is based on the relationship between microbial DNA in the female reproductive tract and adenomyosis; that is, the biomarkers of the present application are in fact microbial DNA in the female reproductive tract.
  • a further aspect of the present application discloses a method of preparing a combination of adenomyosis biomarkers, comprising the steps of
  • Collecting the microbial sample from the genital tract specifically comprises collecting a sample from the lower third of the vagina, a posterior vaginal fornix sample, and a cervical canal sample.
  • The key to the preparation method of the adenomyosis biomarker combination in the present application is to use a random forest model to fit and verify the association between microbial DNA in the reproductive tract and adenomyosis, and finally to obtain a combination of biomarkers capable of assessing adenomyosis or the risk of adenomyosis.
  • The preparation method is not limited to preparing a biomarker combination for adenomyosis; it can also be used to prepare biomarker combinations for similar conditions associated with microbial DNA in the reproductive tract, for example a biomarker combination for endometriosis.
  • The biomarker combination for adenomyosis detection of the present application provides a new approach to the detection or risk assessment of adenomyosis. It can be used for early diagnosis of adenomyosis, avoiding the delay in diagnosis or treatment that results from relying on conventional tests such as symptoms, pelvic examination, or ultrasound examination.
  • Other key advantages of this application include:
  • the biomarker of the present application is used for the detection of adenomyosis or the risk assessment of a disease, and has the advantages of high sensitivity and high specificity, and has important application value.
  • the genital tract sample as a biomarker detection sample has the advantages of convenient material selection, simple operation steps and continuous in vitro detection.
  • biomarkers of the present application are useful for the detection of adenomyosis or for assessing the risk of disease with reproducible characteristics.
  • FIG. 1 is a graph showing the results of identifying adenomyosis based on the CL marker group of the lower third of the vagina in the embodiment of the present application, in which a is the error rate distribution of five repeats of 10-fold cross-validation of the random forest identification of adenomyosis as the number of OTUs increases, and b is the cross-validated combined receiver operating characteristic curve (abbreviated ROC curve); the area under the curve (abbreviated AUC) is 0.8668, the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5;
  • FIG. 2 is a graph showing the results of identifying adenomyosis based on the CU marker group of the posterior vaginal fornix in the embodiment of the present application, in which a is the error rate distribution of five repeats of 10-fold cross-validation of the random forest identification of adenomyosis as the number of OTUs increases, and b is the cross-validated combined ROC curve; the area under the curve is 0.8404, the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5;
  • FIG. 3 is a graph showing the results of identifying adenomyosis based on the CV marker group of the cervical canal in the embodiment of the present application, in which a is the error rate distribution of 10-fold cross-validation of the random forest identification of adenomyosis as the number of OTUs increases, and b is the cross-validated combined ROC curve; the area under the curve is 0.8369, the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5;
  • FIG. 4 is a ROC curve of the CL marker group of the lower third of the vagina for identifying adenomyosis in a second population in the embodiment of the present application;
  • FIG. 5 is a ROC curve of the CU marker group of the posterior vaginal fornix for identifying adenomyosis in a second population in the embodiment of the present application;
  • FIG. 6 is a ROC curve of the cervical canal CV marker group for identifying adenomyosis in a second population in the embodiment of the present application;
  • The biomarkers of the present application were obtained from the relationship between the microbial DNA at three sites of the subject and adenomyosis.
  • The biomarkers of the present application are in fact the microbial OTUs at the three sites that are associated with the adenomyosis state. Specifically, in a preparation method of the present application, the biomarkers are obtained by taking the relative abundance of the OTU seed sequences as the first object and the adenomyosis state (diseased or non-diseased) as the second object, fitting the two with a random forest model, and finally selecting the markers through five repeats of 10-fold cross-validation. Through rigorous calculation and experimental study, the present application finally obtained forty-four nucleic acids of 28 microorganisms at the three sites as the biomarkers of the present application.
  • The marker group of each of the three sites can independently evaluate adenomyosis or its risk, but accuracy is higher when the probabilities from the three sites are combined to determine whether the subject has adenomyosis or is at risk of adenomyosis.
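  • The text does not state how the probabilities from the three sites are combined. As one possible illustration only, the sketch below averages the per-site random forest probabilities and applies a cutoff. The averaging rule, the 0.5 cutoff, and the function names are assumptions made for this example and are not taken from the application.

```python
def combined_probability(site_probs):
    """Average the per-site disease probabilities.

    site_probs maps a site label ("CL", "CU", "CV") to the probability
    output by that site's model. Simple averaging is an assumption for
    illustration; the source does not specify a combination rule.
    """
    if not site_probs:
        raise ValueError("at least one site probability is required")
    for site, p in site_probs.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability for {site} must be in [0, 1]")
    return sum(site_probs.values()) / len(site_probs)


def classify(site_probs, cutoff=0.5):
    """Return True when the combined probability reaches the (assumed) cutoff."""
    return combined_probability(site_probs) >= cutoff


if __name__ == "__main__":
    example = {"CL": 0.8, "CU": 0.6, "CV": 0.7}
    print(combined_probability(example), classify(example))
```

  • Other rules, such as majority voting across sites or a second-stage model trained on the three probabilities, would fit the same description equally well.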
  • The "adenomyosis" of the present application is a diffuse or localized lesion formed by endometrial glands and stroma invading the myometrium. Like endometriosis, it is a common gynecological disease that is difficult to treat.
  • the levels of the biomarker materials of the present application are indicated by relative abundance.
  • the reference value refers to a reference value or a normal value of a healthy control. It will be apparent to those skilled in the art that in the case where the number of samples is sufficient, the range of normal values, i.e., absolute values, of each biomarker can be obtained by inspection and calculation.
  • a biomarker may be any substance in an individual as long as they are related to a specific biological state of the individual to be examined, such as a disease.
  • Such biomarkers can be, for example, nucleic acid markers (e.g., DNA), protein markers, cytokine markers, chemokine markers, carbohydrate markers, antigenic markers, antibody markers, species markers (species/genus markers), and functional markers (KO/OG markers).
  • the biomarkers of the present application are specifically DNA nucleic acid markers.
  • OTU refers to an operational taxonomic unit: in phylogenetic or population genetics research, a uniform label assigned, for convenience of analysis, to a given taxonomic unit (such as a strain, species, genus, or group).
  • the sequence is divided into an OTU according to a 97% similarity threshold, whereby a plurality of OTUs can be obtained for each of the three sites, and each OTU is regarded as a microbial species.
  • the microbial diversity in the samples and the abundance of different microorganisms are based on an analysis of the OTU.
  • The subject refers to an animal, particularly a mammal such as a primate; in the examples of the present application the subject is a human.
  • Sample collection in this example was assisted by gynaecologists at Shenzhen Peking University Hospital. Cases with inflammation were excluded. The subjects were women who were not menstruating, pregnant, or lactating, had no endocrine or autoimmune diseases, and had normal liver and kidney function. They had used no hormones or antibiotics for some time before sampling, had received no vaginal medication, vaginal lavage, or cervical treatment, and had had no sexual activity within 48 hours before sampling. According to these criteria, 95 women of childbearing age were selected as the first group. For every individual meeting the criteria, detailed phenotypic information was recorded, covering medical history, family history, medication history, and lifestyle habits, and all individuals signed informed consent.
  • Lower genital tract sampling was performed after the individual was admitted to the hospital. Without disinfection and after the bladder was emptied, secretion samples were collected on the gynecological examination bed from each of the lower third of the vagina (abbreviated CL), the posterior vaginal fornix (abbreviated CU), and the cervical canal (abbreviated CV).
  • The sample numbers and sampling information of the 95 subjects are as follows. The fourteen subjects numbered C033, C038, C043, C051, C057, C062, C063, C065, T023, T069, T078, T089, T092, T095 are patients with adenomyosis, and CL, CU, and CV samples were collected from all fourteen; numbers C023, C026, C028, C035, C039, C040, C041, C042, C045, C047, C048, C050, C053, C055, C056, C058, C059, C060, C064, C066, C067, C068, T022, T024, T025, T026, T027, T028, T029, T030, T031, T032, T033, T035, T036, T038, T039, T040, T041, T042, T043, T044, T045, T046, T047, T048, T049, T05
  • Nylon fluff swabs were purchased from Chenyang Global Group CY-93050 and CY-98000. After sampling, the swab head was quickly frozen with liquid nitrogen, stored at -80 ° C, and transported to Shenzhen Huada Gene Research Institute with dry ice for subsequent experiments.
  • DNA extraction was performed using the QIAamp DNA Mini Kit (purchased from QIAGEN). The specific extraction steps are carried out in accordance with the instructions provided by the manufacturer.
  • Primers specific to the V4-V5 hypervariable region of the 16S rRNA gene were used for amplification. The two primers were V4-515F and V5-907R; V4-515F is the sequence shown in Seq ID No. 45, and V5-907R is the sequence shown in Seq ID No. 46.
  • Seq ID No. 45 5'-GTGCCAGCMGCCGCGGTAA-3'
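  • The forward primer above contains the degenerate base M, which in the standard IUPAC nucleotide code stands for A or C. The following minimal sketch expands a degenerate primer into the concrete sequences it represents; the function name `expand_degenerate` is introduced here for illustration only.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"  # Seq ID No. 45 as given in the text

if __name__ == "__main__":
    for variant in expand_degenerate(PRIMER_515F):
        print(variant)
```

  • Because the primer contains a single two-fold degenerate position, the expansion yields two concrete sequences.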
  • The PCR program was as follows: initial denaturation at 94 °C for 3 min; then 25 cycles of denaturation at 94 °C for 45 s, annealing at 50 °C for 60 s, and extension at 72 °C for 90 s; after the cycles, a final extension at 72 °C for 10 min.
  • The PCR product was purified with AMPure Beads (Axygen). Sequencing was carried out by chip lane, with a plurality of samples pooled and sequenced together. Library construction therefore required ligating a 10 bp barcode sequence to the outer end of the primer sequence of each sample before adding the adapter sequence.
  • V5-V4 reverse sequencing was performed on the Ion Torrent PGM sequencing platform. The above library construction and sequencing were carried out by Shenzhen Huada Gene.
  • the raw data was extracted and pre-processed from the PGM system using Mothur software (V1.33.3).
  • The criteria for a high-quality sequence are: 1) a length greater than 200 bp; 2) fewer than 2 mismatched bases against the degenerate PCR primer; 3) an average quality score greater than 25.
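  • The three criteria above can be sketched as a simple read filter. This is not the Mothur procedure used in the example; it is a plain-Python illustration that assumes the primer sits at the 5' end of the read after barcode removal, and the function names and read layout are hypothetical.

```python
from statistics import mean

# Concrete forms of the degenerate forward primer (M = A or C).
PRIMER_VARIANTS = ["GTGCCAGCAGCCGCGGTAA", "GTGCCAGCCGCCGCGGTAA"]


def primer_mismatches(read_start, primer_variants):
    """Smallest number of mismatched positions against any primer variant."""
    return min(
        sum(a != b for a, b in zip(read_start, variant))
        for variant in primer_variants
    )


def is_high_quality(seq, quals, primer_variants=PRIMER_VARIANTS,
                    min_len=200, max_mismatch=2, min_avg_q=25):
    """Apply the three criteria: length, primer mismatches, mean quality."""
    if len(seq) <= min_len:              # 1) length greater than 200 bp
        return False
    if mean(quals) <= min_avg_q:         # 3) average quality greater than 25
        return False
    plen = len(primer_variants[0])
    # 2) fewer than 2 mismatches against the primer (assumed at the read start)
    return primer_mismatches(seq[:plen], primer_variants) < max_mismatch


if __name__ == "__main__":
    read = PRIMER_VARIANTS[0] + "A" * 200
    print(is_high_quality(read, [30] * len(read)))
```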
  • the OTU was clustered using the QIIME uclust method, and the similarity threshold was set to 97%.
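  • To illustrate what clustering at a 97% similarity threshold means, the sketch below implements a simplified greedy, seed-based clustering. It is not the uclust algorithm itself: it assumes sequences are already aligned and of equal length, measures similarity as the fraction of identical positions, and uses the first sequence of each cluster as its seed, all of which are simplifications made for this example.

```python
def identity(a, b):
    """Fraction of identical positions; assumes equal-length, aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length in this sketch")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs, threshold=0.97):
    """Assign each sequence to the first seed it matches at >= threshold.

    Returns (seeds, members), where members[i] lists the sequences
    assigned to seeds[i]. A sequence matching no seed starts a new OTU.
    """
    seeds, members = [], []
    for s in seqs:
        for i, seed in enumerate(seeds):
            if identity(s, seed) >= threshold:
                members[i].append(s)
                break
        else:
            seeds.append(s)
            members.append([s])
    return seeds, members


if __name__ == "__main__":
    ref = "ACGT" * 25              # 100 bp toy sequence
    near = "TT" + ref[2:]          # 2 substitutions -> 98% identity
    far = "T" * 12 + ref[12:]      # 9 substitutions -> 91% identity
    seeds, members = greedy_cluster([ref, near, far])
    print(len(seeds), [len(m) for m in members])
```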
  • A seed sequence was selected for each OTU and annotated using the reference gene information gg_13_8_otus in the Greengenes database.
  • the relative abundance of each OTU in each sample is calculated, where the relative abundance of an OTU is the ratio of the abundance of the OTU in a sample to the sum of all OTU abundances in the sample.
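  • The relative abundance defined above is a per-sample normalization. A minimal sketch follows, assuming OTU abundances are available as a dictionary of counts per sample; the OTU labels and counts are made up for illustration.

```python
def relative_abundance(counts):
    """Divide each OTU's abundance by the sum of all OTU abundances in the sample."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counts")
    return {otu: n / total for otu, n in counts.items()}


if __name__ == "__main__":
    sample = {"OTU_1": 60, "OTU_2": 30, "OTU_3": 10}  # hypothetical counts
    print(relative_abundance(sample))
```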
  • This example uses the Sørensen index (Sørensen-Dice index) to measure the similarity of the microbial populations at different sites in the same individual, calculated as follows: QS = 2C / (A + B),
  • where A and B represent the number of OTUs in samples A and B, respectively, and C represents the number of OTUs shared by the two samples.
  • QS is the similarity index and ranges from 0 to 1.
  • The similarity index of CL and CU, the similarity index of CL and CV, and the similarity index of CU and CV were calculated.
  • The closer the similarity index is to 1, the more similar the microbiota at the two sampling sites.
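  • The similarity index can be computed directly from the sets of OTUs detected in two samples. The sketch below uses the standard Sørensen-Dice definition, QS = 2C / (A + B); the OTU labels are hypothetical.

```python
def sorensen_index(otus_a, otus_b):
    """Sørensen-Dice similarity between two collections of OTU identifiers."""
    a, b = set(otus_a), set(otus_b)
    if not a and not b:
        raise ValueError("both samples are empty")
    shared = len(a & b)                      # C: OTUs present in both samples
    return 2 * shared / (len(a) + len(b))    # QS = 2C / (A + B)


if __name__ == "__main__":
    cl = {"otu1", "otu2", "otu3"}            # hypothetical CL sample
    cu = {"otu2", "otu3", "otu4", "otu5"}    # hypothetical CU sample
    print(sorensen_index(cl, cu))
```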
  • The relative abundance of the OTUs of each sample was fitted to the adenomyosis state using the randomForest toolkit in R software (3.1.2RC) with default parameters. Only OTUs present in at least 10% of the samples were used; that is, OTUs detected in less than 10% of all samples to be tested at each site were excluded. Five repeats of 10-fold cross-validation were then performed, and the error curves of the five repeats were averaged. The lowest error of the averaged curve plus the standard error at that point was taken as the threshold of acceptable error. Among the OTU sets whose error is below this threshold, the set with the fewest OTUs is the optimal OTU combination and is used as the biomarker combination for identifying adenomyosis.
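  • Two steps of the selection procedure above, the 10% prevalence filter and the choice of the smallest OTU set within the acceptable error, can be sketched as follows. This is a plain-Python illustration rather than the R randomForest workflow used in the example. Treating the standard error as the sample standard deviation across repeats divided by the square root of the number of repeats is an assumption, and all error values are made up.

```python
from math import sqrt
from statistics import mean, stdev


def filter_by_prevalence(abundance, min_fraction=0.10):
    """Keep OTUs detected (abundance > 0) in at least min_fraction of samples.

    abundance maps an OTU label to its relative abundance in each sample.
    """
    kept = {}
    for otu, values in abundance.items():
        detected = sum(v > 0 for v in values)
        if detected / len(values) >= min_fraction:
            kept[otu] = values
    return kept


def smallest_acceptable_size(cv_errors):
    """Pick the fewest OTUs whose mean CV error is within the threshold.

    cv_errors maps a number of OTUs to the error from each CV repeat.
    Threshold = lowest mean error + standard error at that point
    (standard error computed as stdev / sqrt(repeats): an assumption).
    """
    avg = {n: mean(errs) for n, errs in cv_errors.items()}
    best_n = min(avg, key=avg.get)
    se = stdev(cv_errors[best_n]) / sqrt(len(cv_errors[best_n]))
    threshold = avg[best_n] + se
    return min(n for n, err in avg.items() if err <= threshold)


if __name__ == "__main__":
    errors = {  # hypothetical error rates from five CV repeats
        2: [0.30, 0.32, 0.31, 0.29, 0.33],
        10: [0.16, 0.15, 0.16, 0.15, 0.16],
        18: [0.14, 0.16, 0.15, 0.13, 0.17],
        30: [0.15, 0.16, 0.15, 0.16, 0.16],
    }
    print(smallest_acceptable_size(errors))
```

  • In this toy input the lowest mean error occurs at 18 OTUs, but 10 OTUs already fall within one standard error of it, so the smaller set is chosen.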
  • This example additionally used an independent test population, the second population, for verification.
  • In the second population, there were 4 adenomyosis patients and 36 non-adenomyosis individuals for CL and CU; for CV, there were 4 adenomyosis patients and 37 non-adenomyosis individuals.
  • this example calculates the distance between samples of the same individual.
  • Relative to the sample from the lower third of the vagina (CL), the weighted UniFrac distance increased sequentially from the posterior vaginal fornix (CU) and cervical (CV) mucus to the uterus and peritoneal fluid, again indicating that the community structure of the female reproductive tract changes continuously from bottom to top along the anatomy.
  • The cervical mucus was sampled through the vagina and through the uterine cavity, respectively. The bacterial distributions of the samples taken by the two routes showed a high degree of similarity, further indicating that the state of the uterine cavity microbiota can be evaluated by analyzing easily obtained cervical canal samples.
  • this example establishes a random forest model.
  • The specific steps are as follows: (1) using the relative abundance of OTUs as input features, a random forest model was built on the first population; (2) the random forest model was evaluated by 10-fold cross-validation, with the first population divided into two classes, individuals with adenomyosis and individuals without; the ROC curves of the random forest models were obtained, and the area under each ROC curve (AUC) was used as the evaluation index.
  • a random forest model was used, combined with a 10-fold cross-validation, to obtain the optimal biomarkers for each part, as shown in Table 1, for identifying adenomyosis.
  • Tables 2 to 4 show the enrichment information of the marker group of the three sites in the sample, and Tables 5 to 7 respectively show the relative abundance information of the marker group of the three sites in the first population sample.
  • The results of identifying adenomyosis with the biomarkers of the three sites are shown in Figures 1 to 3.
  • Figure 1 shows adenomyosis identification by the marker group at the lower third of the vagina (CL).
  • Figure 2 shows adenomyosis identification by the marker group at the posterior vaginal fornix (CU).
  • Figure 3 shows adenomyosis identification by the marker group at the cervical canal (CV).
  • The adenomyosis group comprises the samples with adenomyosis among the 95 of the first population.
  • The control group comprises the samples without adenomyosis among the 95 of the first population.
  • Figure 1 shows adenomyosis identification by the marker group at the lower third of the vagina (CL).
  • Panel a shows the distribution of error rates over five rounds of 10-fold cross-validation of random forest identification of adenomyosis as the number of OTUs increases. The model was trained on the relative abundance of OTUs in the CL samples of 14 individuals with adenomyosis and 80 individuals without. The black line is the mean of the 5 trials, each gray line is one of the 5 trials, and the black vertical line marks the number of OTUs in the optimal combination.
  • Panel b shows the receiver operating characteristic (ROC) curve of the cross-validated combination. The area under the curve (AUC) is 0.8668, the shaded area is the 95% confidence interval, and the diagonal line is the curve with an AUC of 0.5.
  • Figure 2 shows adenomyosis identification by the marker group at the posterior vaginal fornix (CU).
  • Panel a shows the distribution of error rates over five rounds of 10-fold cross-validation of random forest identification of adenomyosis as the number of OTUs increases. The model was trained on the relative abundance of OTUs in the CU samples of 14 individuals with adenomyosis and 81 individuals without. The black line is the mean of the 5 trials, each gray line is one of the 5 trials, and the black vertical line marks the number of OTUs in the optimal combination.
  • Panel b shows the receiver operating characteristic (ROC) curve of the cross-validated combination. The area under the curve (AUC) is 0.8404, the shaded area is the 95% confidence interval, and the diagonal line is the curve with an AUC of 0.5.
  • Figure 3 shows adenomyosis identification by the marker group at the cervical canal (CV).
  • Panel a shows the distribution of error rates over five rounds of 10-fold cross-validation of random forest identification of adenomyosis as the number of OTUs increases. The model was trained on the relative abundance of OTUs in the CV samples of 14 individuals with adenomyosis and 81 individuals without. The black line is the mean of the 5 trials, each gray line is one of the 5 trials, and the black vertical line marks the number of OTUs in the optimal combination.
  • Panel b shows the receiver operating characteristic (ROC) curve of the cross-validated combination. The area under the curve (AUC) is 0.8369, the shaded area is the 95% confidence interval, and the diagonal line is the curve with an AUC of 0.5.
  • The OTU biomarker groups at the three sites can distinguish individuals with adenomyosis from individuals without; the areas under the ROC curves are 0.8668 (CL), 0.8404 (CU), and 0.8369 (CV).
  • AUC is the area under the curve; the larger the value, that is, the closer it is to 1, the stronger and more accurate the discrimination.
  • the OTU biomarkers obtained from the random forest were verified in the second population samples, and the results are shown in Table 8, Table 9, and Table 10.
  • sample numbers such as C002CL, C002CU, and C002CV respectively indicate samples of three parts of CL, CU, and CV collected from the same C002 sample object.
  • Tables 8 to 10 show the probability of adenomyosis that each of the three marker groups predicts for each individual; the resulting ROC curves are shown in Figures 4 to 6, in that order.
  • A probability > 0.5 is taken to mean that, according to the marker group at that site, the individual has adenomyosis or is at risk of it.
  • Table 9 CU marker group at CU site predicts the probability of adenomyosis in a second population sample
  • Table 10 CV marker group at the CV site predicts the probability that the second population sample has adenomyosis
  • Figure 4 shows that the CL marker group, applied to the CL site, predicts the probability of adenomyosis with an AUC of 0.8750; Figure 5 shows that the CU marker group, applied to the CU site, does so with an AUC of 0.840; Figure 6 shows that the CV marker group, applied to the CV site, does so with an AUC of 0.9189. All three marker groups therefore discriminate well and can be used for the detection of adenomyosis, which is consistent with the results in Tables 8 to 10.
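
The step from per-sample probabilities (as in Tables 8 to 10) to an AUC and to the > 0.5 call can be sketched as follows. This is a minimal illustration in Python rather than the R tooling the source mentions; the function names and the toy numbers are hypothetical and are not taken from the patent's tables.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    labels: 1 for adenomyosis, 0 for control.
    scores: predicted probabilities of adenomyosis.
    A tie between a case and a control counts as half a concordant pair.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def call_at_threshold(scores, threshold=0.5):
    """Flag a sample as affected or at risk when its probability exceeds the threshold."""
    return [s > threshold for s in scores]


# Hypothetical toy data, NOT values from Tables 8 to 10.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_calls = call_at_threshold(toy_scores)
```

An AUC of 0.5 corresponds to the diagonal in the ROC panels: a case is then no more likely than a control to receive the higher probability.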

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Analytical Chemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biomedical Technology (AREA)
  • Pathology (AREA)
  • Plant Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Provided are a biomarker composition for detection or risk assessment of adenomyosis, and application thereof. The biomarker composition comprises at least one of 44 nucleic acids. The 44 nucleic acids respectively have sequences represented by Seq ID No. 1 to Seq ID No. 44, or sequences having similarity of 97% or more to those represented by Seq ID No. 1 to Seq ID No. 44.

Description

Biomarker combination for adenomyosis detection and its application
Technical field
The present application relates to the field of biomarkers, and in particular to a biomarker combination for the detection or risk assessment of adenomyosis and to its application.
Background art
Adenomyosis is a condition caused by the endometrium and its glands invading the myometrium. Normally the endometrium lies beneath the myometrium, separated from it by a boundary. When the endometrium and the superficial muscle layer are injured, for example by childbirth, repeated induced abortion, or curettage, endometrial tissue can enter the myometrium, grow there, and stimulate proliferation of the surrounding muscle cells, producing adenomyosis. Like normal endometrium, the endometrial tissue within the myometrium undergoes cyclic congestion, edema, and even bleeding over the menstrual cycle, which causes strong uterine contractions and severe lower abdominal pain. The uterus is also uniformly enlarged and firm, menstruation is heavy and prolonged, and severe cases can lead to anemia.
Adenomyosis is currently treated mainly in three ways: (1) surgical removal of the uterus; (2) conservative surgery; (3) traditional Chinese medicine. Each has advantages and drawbacks. Adenomyosis used to occur mostly in parous women over 40, but in recent years patients have been getting younger, which may be related to the increase in procedures such as cesarean section and induced abortion.
Clinical diagnosis of adenomyosis relies mainly on symptoms, pelvic examination, and ultrasound. Ultrasound shows an overall enlarged uterus; the uterine wall, most often the posterior wall, is thicker than 2.5 cm, and a thickness above 2.5 cm is almost certainly abnormal. A localized mass may be a fibroid or an adenoma, and ultrasound can distinguish the two: an adenoma has no surrounding capsule whereas a fibroid does, and the ultrasound echo of an adenoma is stronger than that of a fibroid. The tumor marker CA125 can also assist diagnosis. None of these methods, however, allows early detection of adenomyosis or assessment of disease risk.
Finding sensitive and specific biomarkers for adenomyosis is therefore an urgent need.
Summary of the invention
The purpose of the present application is to provide a biomarker combination for the detection or risk assessment of adenomyosis, and its application in adenomyosis detection kits, detection tools, and drug screening.
To achieve this purpose, the present application adopts the following technical solutions:
One aspect of the present application discloses a biomarker combination for the detection or risk assessment of adenomyosis. The combination comprises at least one of forty-four nucleic acids, which have the sequences shown in Seq ID No. 1 to Seq ID No. 44, respectively, or sequences with 97% or more similarity to the sequences shown in Seq ID No. 1 to Seq ID No. 44.
It should be noted that the forty-four nucleic acids were identified through research as sequences associated with adenomyosis, and each of them is individually associated with the disease. Where accuracy is not a concern, or the accuracy requirement is low, they can therefore be used singly or in combination for detection or risk assessment. In a preferred embodiment of the present application, however, the forty-four nucleic acids are not only used together but are also classified according to a specific rule into several marker groups that are used jointly for detection or risk assessment, as described in detail in the preferred technical solutions below.
It should also be noted that the sequences were clustered at a similarity of 97% or more, and the most representative sequence of each operational taxonomic unit (OTU) was taken as its seed sequence. The forty-four seed sequences associated with adenomyosis make up the biomarker combination of the present application. The forty-four nucleic acids are therefore not limited to the sequences shown in Seq ID No. 1 to Seq ID No. 44; they may also be sequences with 97% or more similarity to those sequences.
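The 97% criterion can be made concrete with a small check. The sketch below assumes two sequences that are already aligned and of equal length, and it measures similarity as simple per-position identity; the source does not state how similarity is computed, and real OTU clustering tools use proper alignment. The sequences and helper names are hypothetical and are not Seq ID No. 1 to 44.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length, pre-aligned sequences agree."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def mutate(seq, positions):
    """Return seq with the base at each listed position swapped for a different base."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    return "".join(swap[b] if i in positions else b for i, b in enumerate(seq))


def matches_seed(seed, candidate, threshold=97.0):
    """True when the candidate meets the similarity threshold against the seed."""
    return percent_identity(seed, candidate) >= threshold


# Hypothetical 100-nt seed sequence, for illustration only.
toy_seed = "ACGT" * 25
three_mismatches = mutate(toy_seed, {0, 10, 20})
four_mismatches = mutate(toy_seed, {0, 10, 20, 30})
```

With a 100-nt seed, three mismatches leave the candidate at 97% identity and inside the claimed range, while a fourth mismatch takes it outside.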
It should be added that the biomarker combination of the present application is not used by simply testing for the presence or absence of the biomarkers. Instead, after the biomarker combination is detected, a random forest model is applied, and the probability it outputs is used to determine whether the subject has adenomyosis or to assess the subject's risk of adenomyosis, as explained in detail in the technical solutions below.
Preferably, another aspect of the present application discloses a biomarker combination for the detection or risk assessment of adenomyosis, comprising at least one of a first marker group, a second marker group, and a third marker group. The first marker group consists of eighteen nucleic acids, which have the sequences shown in Seq ID No. 1 to Seq ID No. 18, respectively, or sequences with 97% or more similarity to them. The second marker group consists of twenty-two nucleic acids, which have the sequences shown in Seq ID No. 1, Seq ID No. 4, Seq ID No. 5, Seq ID No. 7, Seq ID No. 10, Seq ID No. 11, Seq ID No. 13, Seq ID No. 15, and Seq ID No. 18 to Seq ID No. 31, respectively, or sequences with 97% or more similarity to them. The third marker group consists of eighteen nucleic acids, which have the sequences shown in Seq ID No. 1, Seq ID No. 2, Seq ID No. 13, Seq ID No. 19, Seq ID No. 28, and Seq ID No. 32 to Seq ID No. 44, respectively, or sequences with 97% or more similarity to them.
It should be noted that in this preferred embodiment the forty-four nucleic acids are divided, with repeats allowed, into three marker groups: the first marker group, the second marker group, and the third marker group. Combining the judgments of the three marker groups can greatly improve the accuracy with which the biomarker combination detects adenomyosis or assesses disease risk.
Preferably, the first marker group is the CL marker group, used for adenomyosis detection or risk assessment on samples from the lower third of the vagina.
Preferably, the second marker group is the CU marker group, used for adenomyosis detection or risk assessment on samples from the posterior vaginal fornix.
Preferably, the third marker group is the CV marker group, used for adenomyosis detection or risk assessment on samples from the cervical canal.
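Because the three marker groups share some sequences, it can help to write their membership out as sets. The sketch below encodes the Seq ID numbers exactly as listed for the first (CL), second (CU), and third (CV) groups; the helper name `seq_ids` and the dictionary layout are illustrative choices, not part of the application.

```python
def seq_ids(*parts):
    """Expand ints and inclusive (start, end) tuples into a set of Seq ID numbers."""
    ids = set()
    for part in parts:
        if isinstance(part, tuple):
            start, end = part
            ids.update(range(start, end + 1))
        else:
            ids.add(part)
    return ids


# Membership as listed in the text for each marker group.
MARKER_GROUPS = {
    "CL": seq_ids((1, 18)),
    "CU": seq_ids(1, 4, 5, 7, 10, 11, 13, 15, (18, 31)),
    "CV": seq_ids(1, 2, 13, 19, 28, (32, 44)),
}

group_sizes = {site: len(ids) for site, ids in MARKER_GROUPS.items()}
all_markers = set().union(*MARKER_GROUPS.values())
shared_by_all = set.intersection(*MARKER_GROUPS.values())
```

The group sizes come out as 18, 22, and 18, and the union covers Seq ID No. 1 to 44, which agrees with the counts stated in the text. Under this encoding only Seq ID No. 1 and No. 13 appear in all three groups.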
It should be noted that the forty-four nucleic acids in the biomarker combination actually represent 28 microorganisms found at three sites: the lower third of the vagina, the posterior vaginal fornix, and the cervical canal. The present application detects the forty-four nucleic acids of these 28 microorganisms, statistically analyzes the relationship between their relative abundance and adenomyosis, and builds a random forest model to determine whether a subject has adenomyosis or is at risk of it. The three marker groups thus correspond to the three sampling sites: a sample from each site is analyzed and judged independently against its own marker group. Combining the three results, however, improves the accuracy of detection or risk assessment.
It should also be noted that these three sites harbor far more than 28 microbial species, and the 28 microorganisms have far more nucleic acids than the 44 described in the present application. Using the random forest model, the present application selected forty-four nucleic acids of 28 microorganisms from among them as biomarkers for adenomyosis detection, providing a new approach to the detection and assessment of adenomyosis.
It should be added that, of the three marker groups, the CL marker group is the marker group for samples from the lower third of the vagina (abbreviated CL); the CU marker group is the marker group for samples from the posterior vaginal fornix (abbreviated CU); and the CV marker group is the marker group for samples from the cervical canal (abbreviated CV).
Another aspect of the present application discloses a kit for the detection or risk assessment of adenomyosis. The kit contains a primer pair for detecting the biomarker combination of the present application: the forward primer has the sequence shown in SEQ ID No. 45 and the reverse primer has the sequence shown in SEQ ID No. 46.
It should be noted that the biomarker combination itself may be included in the kit as a reference standard, while the primer pair is used directly for PCR amplification of the biomarker combination in the sample under test.
Another aspect of the present application discloses the application of the biomarker combination in screening drugs for adenomyosis, or in preparing a kit or detection tool for the detection or risk assessment of adenomyosis.
The biomarker combination was developed specifically for adenomyosis and can naturally be used for its detection or risk assessment. It can also be built into kits or tools dedicated to adenomyosis detection to make detection and assessment more convenient; any such use of the biomarker combination falls within the scope of protection of the present application. Because the combination can detect adenomyosis or assess its risk, it can also be used to compare disease status or disease risk before and after medication, to judge whether the drug is effective, and thus to screen drugs.
A further aspect of the present application discloses a method for detecting adenomyosis, comprising the following steps:
(1) collecting a sample from the subject, detecting the biomarker combination of the present application in the collected sample, and determining the level of each nucleic acid in the biomarker combination;
(2) comparing the level of each nucleic acid measured in step (1) with a reference data set or reference value to obtain a detection result.
Preferably, the level of each nucleic acid is its relative abundance, and the reference data set or reference value is the level of each nucleic acid of the biomarker combination in adenomyosis patients and non-adenomyosis controls.
More preferably, the reference data set or reference value in step (2) is at least one of Table 5, Table 6, and Table 7. Comparing the level of each nucleic acid with the reference data set or reference value to obtain a detection result specifically includes calculating the probability of disease with a multivariate statistical model, preferably a random forest model.
More preferably, sample collection in step (1) includes collecting a sample from the lower third of the vagina, a sample from the posterior vaginal fornix, and a sample from the cervical canal.
It should be noted that the biomarker combination of the present application consists of nucleic acids found through research to be associated with adenomyosis. Analyzing the level, that is, the relative abundance, of the corresponding biomarker combination in samples collected from different sites of the subject therefore makes it possible to detect whether the subject has the disease or to judge the subject's disease risk.
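Since the "level" of each nucleic acid is its relative abundance, the input to the model can be pictured as a vector of fractions, one per marker OTU. The sketch below assumes relative abundance is the OTU's read count divided by the sample's total read count, which is the usual definition but is not spelled out in the source; the OTU names and counts are hypothetical.

```python
def relative_abundance(otu_counts):
    """Convert per-OTU read counts for one sample into fractions that sum to 1."""
    total = sum(otu_counts.values())
    if total <= 0:
        raise ValueError("sample has no reads")
    return {otu: count / total for otu, count in otu_counts.items()}


def marker_profile(abundance, marker_otus):
    """Relative abundance of each marker OTU, in order; 0.0 when an OTU is undetected."""
    return [abundance.get(otu, 0.0) for otu in marker_otus]


# Hypothetical read counts for one sample; the OTU names are placeholders.
toy_counts = {"OTU_a": 600, "OTU_b": 300, "OTU_c": 100}
toy_markers = ["OTU_a", "OTU_c", "OTU_z"]

toy_abundance = relative_abundance(toy_counts)
toy_profile = marker_profile(toy_abundance, toy_markers)
```

A profile like `toy_profile` is the kind of feature vector a trained classifier would receive for one site; undetected markers are kept as zeros so every sample has the same feature layout.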
A further aspect of the present application discloses the application of a method for determining adenomyosis by detecting biomarkers in the preparation of a kit or tool for adenomyosis detection or risk assessment, wherein the biomarkers are the biomarker combination of the present application.
The method for determining adenomyosis by detecting biomarkers comprises the following steps:
(1) collecting a sample from the subject, detecting the biomarker combination of the present application in the collected sample, and determining the level of each nucleic acid in the biomarker combination;
(2) comparing the level of each nucleic acid measured in step (1) with a reference data set or reference value to obtain a detection result.
Preferably, the level of each nucleic acid is its relative abundance, and the reference data set or reference value is the level of each nucleic acid of the biomarker combination in adenomyosis patients and non-adenomyosis controls.
More preferably, the reference data set or reference value in step (2) is at least one of Table 5, Table 6, and Table 7. Comparing the level of each nucleic acid with the reference data set or reference value to obtain a detection result specifically includes calculating the probability of disease with a multivariate statistical model, preferably a random forest model.
A further aspect of the present application discloses a method of screening candidate drugs for treating adenomyosis, comprising the following steps:
1) detecting the biomarker combination of the present application in samples taken before and after administration, and determining the level of each nucleic acid in the biomarker combination;
2) evaluating the candidate drug by comparing the levels of each nucleic acid in the samples taken before and after administration.
In step 2), comparing the levels of each nucleic acid in the samples taken before and after administration specifically includes calculating the probability of disease with a multivariate statistical model, preferably a random forest model.
A further aspect of the present application discloses a method for detecting the microbiota of the female reproductive tract, comprising the following steps:
(1) collecting a microbial sample from the reproductive tract of the subject, detecting the biomarker combination of the present application in the collected sample, and determining the level of each nucleic acid in the biomarker combination;
(2) comparing the level of each nucleic acid measured in step (1) with a reference data set or reference value to obtain a detection result.
Preferably, the level of each nucleic acid is its relative abundance, and the reference data set or reference value is the level of each nucleic acid of the biomarker combination in adenomyosis patients and non-adenomyosis controls.
Preferably, the reference data set or reference value in step (2) is at least one of Table 5, Table 6, and Table 7. Comparing the level of each nucleic acid with the reference data set or reference value to obtain a detection result specifically includes calculating the probability of disease with a multivariate statistical model, more preferably a random forest model.
Preferably, collecting a microbial sample from the reproductive tract in step (1) specifically includes collecting a sample from the lower third of the vagina, a sample from the posterior vaginal fornix, and a sample from the cervical canal. The samples can be collected with conventional nylon flocked swabs; no specific limitation is imposed here.
It should be noted that the biomarker combination of the present application was derived from the relationship between the DNA of the microbiota in the female reproductive tract and adenomyosis; that is, the biomarkers are the microbial OTUs in the female reproductive tract that reflect adenomyosis status. The present application therefore proposes a method for detecting the microbiota of the female reproductive tract, which provides a basis for judging and assessing adenomyosis or the risk of it.
A further aspect of the present application discloses a method of preparing an adenomyosis biomarker combination, comprising the following steps:
(1) collecting reproductive tract microbial samples from adenomyosis patients and from non-patients, and performing 16S sequencing on each collected sample;
(2) clustering the 16S sequencing results to obtain OTUs and the seed sequence of each OTU, and calculating the relative abundance of each OTU;
(3) fitting the relative abundance of each OTU to adenomyosis status with a random forest model and performing five rounds of 10-fold cross-validation to obtain the optimal OTU combination; the seed sequences of the OTUs in the optimal combination make up the adenomyosis biomarker combination.
Preferably, collecting reproductive tract microbial samples in step (1) specifically includes collecting a sample from the lower third of the vagina, a sample from the posterior vaginal fornix, and a sample from the cervical canal.
It should be noted that the key to this preparation method is using a random forest model to fit and validate the association between the DNA of the reproductive tract microbiota and adenomyosis, which yields a biomarker combination that can assess the presence or risk of adenomyosis. The preparation method and its underlying idea are not limited to adenomyosis; they can also be used to prepare biomarker combinations for similar conditions associated with the DNA of the reproductive tract microbiota, for example endometriosis.
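Step (3) leaves open how the "optimal" OTU combination is chosen from the repeated cross-validation. The snippet list earlier on this page describes it: the five error curves are averaged, the minimum of the averaged curve plus the standard error at that point becomes the acceptable-error threshold, and the smallest OTU set under that threshold is kept. A minimal Python sketch of that selection rule follows. The error curves are hypothetical, and the standard error is assumed to be the standard deviation across repeats divided by the square root of the number of repeats, which the source does not specify.

```python
from math import sqrt
from statistics import mean, stdev


def pick_otu_count(error_curves, otu_counts):
    """Choose the smallest OTU count whose mean CV error is within the threshold.

    error_curves: one list of error rates per cross-validation repeat,
                  each aligned with otu_counts.
    Returns (chosen_otu_count, acceptable_error_threshold).
    """
    n_repeats = len(error_curves)
    columns = list(zip(*error_curves))  # one tuple of repeats per OTU count
    avg = [mean(col) for col in columns]
    # Assumed definition of the standard error (not stated in the source).
    se = [stdev(col) / sqrt(n_repeats) for col in columns]
    best = min(range(len(avg)), key=avg.__getitem__)
    threshold = avg[best] + se[best]
    for count, err in sorted(zip(otu_counts, avg)):
        if err <= threshold:
            return count, threshold
    raise RuntimeError("unreachable: the minimum itself is within the threshold")


# Hypothetical error curves from five repeats, NOT data from the application.
toy_otu_counts = [5, 10, 20, 40]
toy_curves = [
    [0.30, 0.19, 0.18, 0.20],
    [0.34, 0.23, 0.22, 0.24],
    [0.26, 0.15, 0.14, 0.16],
    [0.30, 0.19, 0.18, 0.20],
    [0.30, 0.19, 0.18, 0.20],
]
toy_choice, toy_threshold = pick_otu_count(toy_curves, toy_otu_counts)
```

In the toy data the lowest mean error occurs at 20 OTUs, but 10 OTUs is already within one standard error of it, so the rule prefers the smaller set. This trades a negligible loss in cross-validated error for a shorter marker list.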
By adopting the above technical solutions, the present application has the following beneficial effects:
The biomarker combination of the present application provides a new approach to the detection or risk assessment of adenomyosis. It can be used for early diagnosis, avoiding the delays in diagnosis or treatment that come from relying on conventional methods such as symptoms, pelvic examination, or ultrasound. Other main advantages of the present application include:
(a) The biomarkers of the present application, when used for adenomyosis detection or risk assessment, offer high sensitivity and high specificity and have important application value.
(b) Reproductive tract samples are convenient to obtain as biomarker test samples, require only simple handling, and allow repeated in vitro testing.
(c) The biomarkers of the present application give reproducible results when used for adenomyosis detection or risk assessment.
Description of the Drawings
Figure 1 shows the results of identifying adenomyosis with the marker group of the lower third of the vagina (CL) in the example of the present application. Panel a shows the distribution of error rates over five rounds of 10-fold cross-validation of random forest identification of adenomyosis as the number of OTUs increases. Panel b shows the receiver operating characteristic curve (ROC curve) of the cross-validated combination, with an area under the curve (AUC) of 0.8668; the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5;
Figure 2 shows the results of identifying adenomyosis with the marker group of the posterior vaginal fornix (CU) in the example of the present application. Panel a shows the distribution of error rates over five rounds of 10-fold cross-validation of random forest identification of adenomyosis as the number of OTUs increases. Panel b shows the ROC curve of the cross-validated combination, with an area under the curve of 0.8404; the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5;
Figure 3 shows the results of identifying adenomyosis with the marker group of the cervical canal (CV) in the example of the present application. Panel a shows the distribution of error rates over five rounds of 10-fold cross-validation of random forest identification of adenomyosis as the number of OTUs increases. Panel b shows the ROC curve of the cross-validated combination, with an area under the curve of 0.8369; the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5;
Figure 4 is the ROC curve of the CL marker group (lower third of the vagina) for identifying adenomyosis in the second population in the example of the present application;
Figure 5 is the ROC curve of the CU marker group (posterior vaginal fornix) for identifying adenomyosis in the second population in the example of the present application;
Figure 6 is the ROC curve of the CV marker group (cervical canal) for identifying adenomyosis in the second population in the example of the present application;
In the figures, the number of variables refers to the number of OTUs; sensitivity = true positives / (true positives + false negatives); specificity = true negatives / (true negatives + false positives).
Detailed Description
The biomarkers of the present application are derived from the relationship between adenomyosis and the microbiota DNA at three sites of the sampled subjects; the biomarkers are, in effect, the microbial OTUs at these three sites that reflect adenomyosis status. Specifically, in one preparation method of the present application, this correspondence (that is, the biomarkers) is obtained by taking the relative abundance of the OTU seed sequences as one variable and adenomyosis status (diseased or non-diseased) as the second variable, fitting the two with a random forest model, and finally applying five rounds of 10-fold cross-validation. Through rigorous calculation and experimental study, the present application finally obtained forty-four nucleic acids from 28 microorganisms at the three sites as its biomarkers.
In one implementation of the present application, the marker group of each of the three sites can be used independently to assess the presence or risk of adenomyosis; however, combining the probabilities from the three sites to judge whether the subject has adenomyosis, or is at risk of adenomyosis, gives higher accuracy.
The terms used in the present application have the meanings commonly understood by those of ordinary skill in the art. For a better understanding of the present application, some definitions and related terms are explained as follows:
"Adenomyosis" in the present application refers to diffuse or localized lesions formed when endometrial glands and stroma invade the myometrium. Like endometriosis, it is a common and difficult gynecological disease.
The level of a biomarker substance of the present application is indicated by its relative abundance.
In one embodiment of the present application, the reference value refers to the reference value or normal value of healthy controls. It is clear to those skilled in the art that, given a sufficiently large number of samples, the range of normal values (that is, absolute values) of each biomarker can be obtained by testing and calculation.
A "biomarker" in the present application, also called a "biological marker", refers to a measurable indicator of the biological state of an individual. Such a biomarker may be any substance in the individual, provided that it is related to a specific biological state of the individual being examined, such as a disease. Such biomarkers may be, for example, nucleic acid markers (e.g., DNA), protein markers, cytokine markers, chemokine markers, carbohydrate markers, antigen markers, antibody markers, species markers (species/genus markers), and functional markers (KO/OG markers). The biomarkers of the present application are specifically DNA nucleic acid markers.
"OTU" in the present application refers to an operational taxonomic unit (abbreviated OTU): a uniform label assigned artificially to a taxonomic unit (such as a strain, species, genus, or group) to facilitate analysis in phylogenetic or population genetics studies. In the present application, sequences are grouped into an OTU at a 97% similarity threshold, so that multiple OTUs are obtained for the samples of each of the three sites, and each OTU is regarded as one microbial species. Both the microbial diversity in a sample and the abundance of the different microorganisms are based on the analysis of OTUs.
An "individual" in the present application refers to an animal, in particular a mammal such as a primate; in the example of the present application it refers to a human.
The present application is described in further detail below by way of a specific example and the accompanying drawings. The following example only further illustrates the present application and should not be construed as limiting it.
Example
1. Materials and methods
1.1 Sample collection
Sample collection in this example was assisted by obstetricians and gynecologists at Peking University Shenzhen Hospital. Cases of inflammation were excluded. All subjects were women who were not menstruating, pregnant, or lactating, had no endocrine or autoimmune disease, and had normal liver and kidney function. For a period before sampling they had not used hormones or antibiotics and had not undergone vaginal medication, vaginal douching, or cervical treatment, and they had not had sexual intercourse within 48 hours before sampling. According to these criteria, 95 women of childbearing age were selected as the first population. Detailed phenotypic information was recorded for every individual meeting the criteria, covering medical history, family history, medication history, and lifestyle, and all of them signed informed consent forms.
Lower genital tract sampling was performed after the individual was admitted to hospital, without disinfection and after she had emptied her bladder. Secretion samples were collected on a gynecological examination bed from three sites: the lower third of the vagina (abbreviated CL), the posterior vaginal fornix (abbreviated CU), and the cervical canal (abbreviated CV). The sample numbers and sampling information of the 95 subjects are as follows.
The fourteen subjects numbered C033, C038, C043, C051, C057, C062, C063, C065, T023, T069, T078, T089, T092, and T095 were adenomyosis patients; samples from all three sites (CL, CU, and CV) were collected from each of them.
The eighty-one subjects numbered C023, C026, C028, C035, C039, C040, C041, C042, C045, C047, C048, C050, C053, C055, C056, C058, C059, C060, C064, C066, C067, C068, T022, T024, T025, T026, T027, T028, T029, T030, T031, T032, T033, T035, T036, T038, T039, T040, T041, T042, T043, T044, T045, T046, T047, T048, T049, T051, T052, T053, T054, T055, T056, T057, T058, T059, T060, T061, T062, T063, T064, T065, T066, T067, T068, T070, T071, T072, T073, T074, T075, T076, T084, T085, T086, T087, T088, T090, T091, T093, and T094 were non-adenomyosis subjects. Samples from all three sites (CL, CU, and CV) were collected from each of them except T048, from whom only CU and CV samples were collected and no CL sample was taken.
Samples were collected with nylon flocked swabs purchased from Chenyang Global Group (models CY-93050 and CY-98000). After sampling, the swab heads were snap-frozen in liquid nitrogen, stored at -80°C, and shipped on dry ice to BGI-Shenzhen (Shenzhen Huada Gene Research Institute) for the subsequent experiments.
1.2 DNA extraction and 16S sequencing
In this example, DNA was extracted with the QIAamp DNA Mini Kit (purchased from QIAGEN), following the manufacturer's instructions. Amplification used primers specific to the V4-V5 hypervariable region of the 16S rRNA gene. The two primers are V4-515F, whose sequence is shown in Seq ID No. 45, and V5-907R, whose sequence is shown in Seq ID No. 46.
Seq ID No. 45: 5'-GTGCCAGCMGCCGCGGTAA-3'
Seq ID No. 46: 5'-CCGTCAATTCMTTTRAGT-3'
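Both primers contain IUPAC degenerate codes (M = A or C, R = A or G), so comparing a read with a primer means checking each read base against the set of bases the code allows. The following is a minimal Python sketch of that comparison, not the pipeline used in this example; the function names `primer_mismatches` and `expand_primer` are hypothetical and introduced only for illustration.

```python
# IUPAC nucleotide codes that occur in the two primers (M and R are degenerate).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC",  # M = A or C
    "R": "AG",  # R = A or G
}

PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"  # Seq ID No. 45
PRIMER_907R = "CCGTCAATTCMTTTRAGT"   # Seq ID No. 46


def primer_mismatches(primer: str, read_prefix: str) -> int:
    """Count positions where the read base is not allowed by the primer code."""
    if len(read_prefix) < len(primer):
        raise ValueError("read is shorter than the primer")
    return sum(
        base not in IUPAC[code]
        for code, base in zip(primer, read_prefix.upper())
    )


def expand_primer(primer: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    variants = [""]
    for code in primer:
        variants = [v + base for v in variants for base in IUPAC[code]]
    return variants
```

Under these definitions V4-515F expands to two concrete sequences and V5-907R to four, and a read is compared against all of them at once by `primer_mismatches`.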
The PCR program was as follows: denaturation at 94°C for 3 min; then 25 cycles of denaturation at 94°C for 45 s, annealing at 50°C for 60 s, and extension at 72°C for 90 s; after the cycles, a final extension at 72°C for 10 min. The PCR products were purified with AMPure Beads (Axygen). Sequencing used the chip-lane method, in which multiple samples are pooled and sequenced together. Library construction therefore requires ligating a 10 bp barcode sequence to the outer end of the primer sequence of each sample and then adding the adapter sequence. Giving each sample a different barcode sequence (the sample identification sequence) allows the samples to be distinguished. After library construction, V5-V4 reverse sequencing was performed on the Ion Torrent PGM sequencing platform. Library construction and sequencing were carried out by BGI-Shenzhen.
1.3 16S sequencing data processing
Raw data were extracted from the PGM system and preprocessed with Mothur (V1.33.3). The criteria for a high-quality sequence were: 1) length greater than 200 bp; 2) fewer than 2 mismatched bases relative to the degenerate PCR primers; 3) average quality score greater than 25. Based on the 16S rRNA gene sequences, OTUs were clustered with the uclust method in QIIME at a similarity threshold of 97%. The seed sequence of each OTU was selected and annotated against the reference gene set gg_13_8_otus of the Greengenes database. The relative abundance of every OTU in every sample was then calculated, where the relative abundance of an OTU is the ratio of the abundance of that OTU in a sample to the sum of the abundances of all OTUs in that sample.
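The relative-abundance definition above amounts to dividing each OTU count by the sample total. A minimal Python sketch follows; the function name `relative_abundance` and the OTU labels in the usage line are hypothetical, and the counts are made up for illustration.

```python
def relative_abundance(otu_counts: dict[str, int]) -> dict[str, float]:
    """Divide each OTU's abundance by the summed abundance of all OTUs in the sample."""
    total = sum(otu_counts.values())
    if total == 0:
        raise ValueError("sample contains no reads assigned to any OTU")
    return {otu: count / total for otu, count in otu_counts.items()}


# Hypothetical read counts for one sample (not data from this example).
example = relative_abundance({"OTU_1": 60, "OTU_7": 30, "OTU_80": 10})
```

By construction the values for one sample sum to 1, which makes samples of different sequencing depth comparable.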
1.4 Analysis of microbiota consistency between samples from different sites
Based on the presence or absence of OTUs, this example uses the Sorenson index (Sørensen-Dice index) to measure the similarity of the microbiota of samples from different sites of the same individual. It is calculated as follows:

$$QS = \frac{2C}{A + B}$$

where A and B are the numbers of OTUs in sample A and sample B, respectively, and C is the number of OTUs shared by the two samples. QS is the similarity index and ranges from 0 to 1. In this example, the similarity index was calculated between CL and CU, between CL and CV, and between CU and CV. The closer the similarity index is to 1, the more similar the microbiota of the two sampling sites.
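The index defined above depends only on which OTUs are present in each sample, so it can be computed from two sets. A minimal Python sketch, assuming each sample is represented as a set of OTU identifiers (the function name and the OTU labels are hypothetical):

```python
def sorensen_dice(otus_a: set[str], otus_b: set[str]) -> float:
    """QS = 2C / (A + B): A and B are OTU counts per sample, C the shared OTUs."""
    a, b = len(otus_a), len(otus_b)
    if a + b == 0:
        raise ValueError("both samples are empty; the index is undefined")
    c = len(otus_a & otus_b)
    return 2 * c / (a + b)


# Hypothetical OTU sets for two sites of one individual (illustration only).
cl_sample = {"OTU_1", "OTU_7", "OTU_80"}
cu_sample = {"OTU_1", "OTU_7", "OTU_36", "OTU_61"}
qs = sorensen_dice(cl_sample, cu_sample)  # 2 * 2 / (3 + 4)
```

Identical OTU sets give 1 and sets with no OTU in common give 0, matching the stated range of QS.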
1.5 Random forest classifier
To build a model capable of identifying samples in the abnormal state, for each sampling site the relative abundance of the OTUs of each sample was fitted to adenomyosis status with the randomForest package in R (3.1.2RC), using default parameters. Only OTUs present in at least 10% of the samples were used; that is, OTUs detected in fewer than 10% of all samples of a given site were excluded. Five rounds of 10-fold cross-validation were then performed and the five error curves were averaged. The minimum error of the averaged curve plus the standard error at that point was taken as the threshold of acceptable error. Among the OTU sets whose classification error was below this threshold, the set with the fewest OTUs was taken as the optimal OTU combination, which serves as the biomarker combination for identifying adenomyosis.
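The prevalence filter and the threshold rule described above can be sketched independently of the random forest itself. The Python sketch below is illustrative only: it does not call R's randomForest, the function names are hypothetical, and it assumes that "the standard error at that point" means the standard deviation of the five cross-validation errors divided by the square root of five, which the text does not specify.

```python
from math import sqrt
from statistics import mean, stdev


def filter_by_prevalence(samples: list[dict[str, float]],
                         min_fraction: float = 0.10) -> list[str]:
    """Keep OTUs detected (abundance > 0) in at least min_fraction of the samples."""
    n = len(samples)
    otus = sorted({otu for sample in samples for otu in sample})
    return [
        otu for otu in otus
        if sum(1 for s in samples if s.get(otu, 0.0) > 0) / n >= min_fraction
    ]


def select_otu_count(error_curves: dict[int, list[float]]) -> int:
    """Pick the smallest OTU count whose mean CV error is below the threshold.

    error_curves maps a number of OTUs to the error of each CV repeat.
    Threshold = lowest mean error + standard error at that point (assumed SD/sqrt(n)).
    """
    mean_err = {k: mean(errs) for k, errs in error_curves.items()}
    best_k = min(mean_err, key=lambda k: (mean_err[k], k))
    errs = error_curves[best_k]
    threshold = mean_err[best_k] + stdev(errs) / sqrt(len(errs))
    candidates = [k for k, e in mean_err.items() if e < threshold]
    # With zero spread no curve is strictly below the threshold; fall back to best_k.
    return min(candidates) if candidates else best_k


# Hypothetical error curves from five repeats (not results from this example).
curves = {
    1: [0.30, 0.32, 0.28, 0.31, 0.29],
    5: [0.15, 0.16, 0.15, 0.15, 0.15],
    10: [0.15, 0.16, 0.14, 0.15, 0.15],
    20: [0.15, 0.17, 0.15, 0.16, 0.17],
}
chosen = select_otu_count(curves)
```

In the hypothetical curves the lowest mean error occurs at 10 OTUs, but 5 OTUs already fall within the threshold, so the rule prefers the smaller set.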
1.6 Biomarker validation
To validate the biomarkers obtained in this example, an independent group of subjects, the second population, was used. In the second population, CL and CU samples were each available from 4 adenomyosis patients and 36 non-adenomyosis individuals, and CV samples were available from 4 adenomyosis patients and 37 non-adenomyosis individuals.
2. Experimental results
2.1 Structural characteristics and trends of the microbiota of the upper and lower genital tract within the same individual
To explore the relationship between the microbiota of different regions of the genital tract, this example calculated the distances between samples from the same individual. Relative to the samples from the lower third of the vagina (CL), the weighted UniFrac distances to the posterior vaginal fornix (CU), the cervical canal (CV) mucus, the uterus, and the peritoneal fluid increased in that order. This again indicates that the community structure of the female genital tract changes continuously from bottom to top along the anatomical structure.
Samples from different sites of the same individual were highly correlated, and the Sorenson indices between samples from different sites were consistent with their anatomical arrangement. Cervical (CV) mucus and peritoneal fluid samples were significantly correlated, with a mean Sorenson index of 0.255, indicating that in the general population the health of the uterine cavity and abdominal cavity can be evaluated by analyzing easily obtained cervical mucus samples.
In addition, this example sampled cervical mucus both through the vagina and through the bottom of the uterine cavity. The bacterial distributions of the samples obtained by the two routes were highly similar, which further indicates that the microbiota of the uterine cavity can be evaluated by analyzing easily obtained cervical canal samples.
2.2 Disease-related microorganisms
To obtain the OTU biomarkers for identifying adenomyosis, this example built random forest models in the following steps: (1) with OTU relative abundance as the input feature, a random forest model was built on the first population; (2) for the random forest model, a 10-fold cross-validation procedure was set up that classifies the first population into adenomyosis individuals and non-adenomyosis individuals; the ROC curve of each random forest model was obtained, and the area under each ROC curve (AUC) was used as the evaluation metric.
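The repeated 10-fold cross-validation used here can be illustrated by generating the train/test index splits. The Python sketch below is an assumption-laden illustration: it uses plain shuffled folds (the text does not say whether folds were stratified by disease status), and the function name `repeated_kfold` and its parameters are hypothetical.

```python
import random
from typing import Iterator


def repeated_kfold(n_samples: int, n_folds: int = 10, n_repeats: int = 5,
                   seed: int = 0) -> Iterator[tuple[int, list[int], list[int]]]:
    """Yield (repeat, train_indices, test_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal the shuffled indices into n_folds folds of near-equal size.
        folds = [order[i::n_folds] for i in range(n_folds)]
        for held_out in folds:
            held = set(held_out)
            train = sorted(i for i in order if i not in held)
            yield repeat, train, sorted(held_out)


# 95 subjects, as in the first population: 5 repeats x 10 folds = 50 splits.
splits = list(repeated_kfold(95))
```

Within each repeat every sample is held out exactly once, so each repeat yields one cross-validated prediction per sample and one error value per OTU-set size.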
Using the random forest model combined with 10-fold cross-validation, this example obtained the optimal biomarkers for each site for identifying adenomyosis, as shown in Table 1. Tables 2 to 4 give the enrichment information of the marker groups of the three sites in the samples, and Tables 5 to 7 give the relative abundance of the marker groups of the three sites in the samples of the first population. The results of identifying adenomyosis with the biomarkers of the three sites are shown in Figures 1 to 3: Figure 1 for the marker group of the lower third of the vagina (CL), Figure 2 for the marker group of the posterior vaginal fornix (CU), and Figure 3 for the marker group of the cervical canal (CV).
Table 1. Biomarkers and the sites to which they belong
Seq ID No. | OTU number | OTU classification | CL / CU / CV marks
1 | 7 | Acinetobacter sp. |
2 | 80 | Anaerococcus sp. | --
3 | 83 | Finegoldia sp. | -- --
4 | 36 | Ochrobactrum sp. | --
5 | 1 | Lactobacillus crispatus | --
6 | 421 | Lactobacillus iners | -- --
7 | 61 | Lactobacillus sp. | --
8 | 71 | Ruminococcaceae | -- --
9 | 550 | Lactobacillus sp. | -- --
10 | 274 | Peptoniphilus sp. | --
11 | 157 | Bifidobacteriaceae | --
12 | 56 | Staphylococcus sp. | -- --
13 | 34 | Comamonadaceae |
14 | 304 | Peptoniphilus sp. | -- --
15 | 204 | Lactobacillus iners | --
16 | 59 | Lactobacillus iners | -- --
17 | 13 | Bifidobacteriaceae | -- --
18 | 184 | Lactobacillus iners | --
19 | 44 | Enterobacteriaceae | --
20 | 12 | Delftia sp. | -- --
21 | 11 | Vagococcus sp. | -- --
22 | 307 | Corynebacterium sp. | -- --
23 | 37 | Pseudomonas viridiflava | -- --
24 | 26 | Shewanella sp. | -- --
25 | 101 | Lactobacillus iners | -- --
26 | 95 | Paracoccus sp. | -- --
27 | 38 | Lactobacillus sp. | -- --
28 | 41 | Pseudomonas sp. | --
29 | 306 | Lactobacillus iners | -- --
30 | 138 | Lactobacillus iners | -- --
31 | 60 | Lactobacillus iners | -- --
32 | 30 | Stenotrophomonas sp. | -- --
33 | 43 | Pseudochrobactrum sp. | -- --
34 | 89 | Oxalobacteraceae | -- --
35 | 112 | Pseudomonas sp. | -- --
36 | 533 | Pseudomonas sp. | -- --
37 | 315 | Corynebacterium sp. | -- --
38 | 99 | Micrococcus luteus | -- --
39 | 419 | Tissierellaceae | -- --
40 | 492 | Paenibacillus sp. | -- --
41 | 147 | Shewanella sp. | -- --
42 | 17 | Pseudomonas fragi | -- --
43 | 98 | Vagococcus sp. | -- --
44 | 81 | Sphingobium sp. | -- --
In Table 1, the markers of the three sites CL, CU, and CV can each be used on their own to make a judgment. "√" marks a biomarker that is needed when judging on the basis of that site, and "--" marks one that is not needed.
When testing a sample, the relative abundance of the OTUs marked "√" for the site is calculated and entered into the random forest model; the result is used to judge whether the subject has adenomyosis.
Table 2. Abundance information for each OTU of the marker group in CL
Figure PCTCN2017096248-appb-000003
Table 3. Abundance information for each OTU of the marker group in CU
Figure PCTCN2017096248-appb-000004
Figure PCTCN2017096248-appb-000005
Table 4. Abundance information for each OTU of the marker group in CV
Figure PCTCN2017096248-appb-000006
In Tables 2 to 4, the adenomyosis group refers to the samples from those of the 95 subjects of the first population who had adenomyosis, and the control group refers to the samples from those of the 95 subjects of the first population who did not have adenomyosis.
Table 5. Abundance of each OTU of the CL marker group in the first population
Figure PCTCN2017096248-appb-000007
Figure PCTCN2017096248-appb-000008
Figure PCTCN2017096248-appb-000009
Figure PCTCN2017096248-appb-000010
Table 6. Abundance of each OTU of the CU marker group in the first population
Figure PCTCN2017096248-appb-000011
Figure PCTCN2017096248-appb-000012
Figure PCTCN2017096248-appb-000013
Figure PCTCN2017096248-appb-000014
Figure PCTCN2017096248-appb-000015
Figure PCTCN2017096248-appb-000016
Table 7. Abundance of each OTU of the CV marker group in the first population
Figure PCTCN2017096248-appb-000017
Figure PCTCN2017096248-appb-000018
Figure PCTCN2017096248-appb-000019
Figure PCTCN2017096248-appb-000020
Figure 1 shows the identification of adenomyosis with the marker group of the lower third of the vagina (CL). Panel a shows the distribution of error rates over five rounds of 10-fold cross-validation of random forest identification of adenomyosis as the number of OTUs increases. The model was trained on the relative abundance of the OTUs in the samples, using CL samples from 14 adenomyosis individuals and 80 non-adenomyosis individuals in total. The black line is the mean of the five trials, the gray lines are the five individual trials, and the black vertical line marks the number of OTUs in the optimal combination. Panel b shows the receiver operating characteristic (ROC) curve of the cross-validated combination, with an area under the curve (AUC) of 0.8668; the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5.
Figure 2 shows the identification of adenomyosis with the marker group of the posterior vaginal fornix (CU). Panel a shows the distribution of error rates over five rounds of 10-fold cross-validation of random forest identification of adenomyosis as the number of OTUs increases. The model was trained on the relative abundance of the OTUs in the samples, using CU samples from 14 adenomyosis individuals and 81 non-adenomyosis individuals in total. The black line is the mean of the five trials, the gray lines are the five individual trials, and the black vertical line marks the number of OTUs in the optimal combination. Panel b shows the ROC curve of the cross-validated combination, with an AUC of 0.8404; the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5.
Figure 3 shows the identification of adenomyosis with the marker group of the cervical canal (CV). Panel a shows the distribution of error rates over five rounds of 10-fold cross-validation of random forest identification of adenomyosis as the number of OTUs increases. The model was trained on the relative abundance of the OTUs in the samples, using CV samples from 14 adenomyosis individuals and 81 non-adenomyosis individuals in total. The black line is the mean of the five trials, the gray lines are the five individual trials, and the black vertical line marks the number of OTUs in the optimal combination. Panel b shows the ROC curve of the cross-validated combination, with an AUC of 0.8369; the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5.
The results in Figures 1 to 3 show that the OTU biomarker groups of the three sites can distinguish adenomyosis individuals from non-adenomyosis individuals; the AUC values of the ROC curves are 0.8668 (CL), 0.8404 (CU), and 0.8369 (CV). AUC is the area under the curve: the larger the value, that is, the closer it is to 1, the stronger the discriminating ability and the more accurate the judgment.
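The AUC can be read as the probability that a randomly chosen adenomyosis sample receives a higher predicted probability than a randomly chosen non-adenomyosis sample, with ties counted as one half. The Python sketch below computes it by direct pairwise comparison; it is an illustration under that standard definition, not the software used to draw Figures 1 to 3, and the labels and scores in the usage lines are hypothetical.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for adenomyosis, 0 for non-adenomyosis; ties count as half a win.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for five samples (illustration only).
toy_auc = roc_auc([1, 1, 0, 0, 0], [0.9, 0.4, 0.5, 0.3, 0.1])  # 5 of 6 pairs
```

A classifier that ranks every patient above every control scores 1, and one that assigns the same score to everyone scores 0.5, which corresponds to the diagonal in the figures.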
2.3 Biomarker validation
The OTU biomarkers obtained with the random forest were validated in the samples of the second population; the results are shown in Tables 8, 9, and 10. In Tables 8 to 10, sample numbers such as C002CL, C002CU, and C002CV denote the samples collected from the CL, CU, and CV sites of the same subject, C002. Tables 8 to 10 give the probability, predicted by each of the three marker groups, that an individual has adenomyosis; the ROC curves obtained from them are shown in Figures 4 to 6, respectively. In Tables 8 to 10, a probability greater than 0.5 is taken to mean that, according to the marker group of that site, the individual is at risk of adenomyosis or has adenomyosis.
Table 8. Probability of adenomyosis predicted by the CL marker group for CL samples of the second cohort
Sample ID    Adenomyosis (N: no; Y: yes)    Probability
C001CL       N                              0.445
C002CL       N                              0.168
C003CL       Y                              0.289
C004CL       N                              0.011
C005CL       N                              0.358
C007CL       N                              0.166
C008CL       N                              0.000
C009CL       N                              0.095
C011CL       N                              0.447
C012CL       Y                              0.550
C014CL       N                              0.477
C016CL       N                              0.311
C018CL       N                              0.213
C019CL       Y                              0.855
C020CL       N                              0.132
C021CL       N                              0.376
T000CL       N                              0.117
T001CL       N                              0.109
T003CL       N                              0.526
T005CL       N                              0.570
T006CL       N                              0.079
T007CL       N                              0.013
T008CL       N                              0.382
T009CL       N                              0.055
T010CL       N                              0.038
T011CL       N                              0.195
T012CL       N                              0.147
T013CL       N                              0.016
T014CL       N                              0.348
T015CL       Y                              0.540
T016CL       N                              0.352
T017CL       N                              0.394
T018CL       N                              0.053
T019CL       N                              0.159
T020CL       N                              0.766
T021CL       N                              0.061
T080CL       N                              0.006
T081CL       N                              0.532
T082CL       N                              0.089
T083CL       N                              0.228
Table 9. Probability of adenomyosis predicted by the CU marker group for CU samples of the second cohort
Sample ID    Adenomyosis (N: no; Y: yes)    Probability
C001CU       N                              0.495
C002CU       N                              0.074
C003CU       Y                              0.316
C004CU       N                              0.040
C005CU       N                              0.302
C007CU       N                              0.000
C008CU       N                              0.033
C009CU       N                              0.083
C011CU       N                              0.427
C012CU       Y                              0.234
C014CU       N                              0.244
C016CU       N                              0.346
C018CU       N                              0.489
C019CU       Y                              0.798
C020CU       N                              0.012
C021CU       N                              0.069
T000CU       N                              0.077
T001CU       N                              0.017
T002CU       N                              0.097
T003CU       N                              0.274
T005CU       N                              0.201
T006CU       N                              0.163
T007CU       N                              0.071
T008CU       N                              0.244
T009CU       N                              0.061
T010CU       N                              0.001
T011CU       N                              0.172
T013CU       N                              0.090
T014CU       N                              0.027
T015CU       Y                              0.240
T016CU       N                              0.000
T017CU       N                              0.000
T018CU       N                              0.076
T019CU       N                              0.056
T020CU       N                              0.701
T021CU       N                              0.020
T080CU       N                              0.007
T081CU       N                              0.150
T082CU       N                              0.136
T083CU       N                              0.017
Table 10. Probability of adenomyosis predicted by the CV marker group for CV samples of the second cohort
Sample ID    Adenomyosis (N: no; Y: yes)    Probability
C002CV       N                              0.404
C003CV       Y                              0.374
C004CV       N                              0.118
C005CV       N                              0.403
C007CV       N                              0.429
C008CV       N                              0.278
C009CV       N                              0.377
C011CV       N                              0.466
C012CV       Y                              0.547
C014CV       N                              0.333
C016CV       N                              0.408
C018CV       N                              0.081
C019CV       Y                              0.587
C020CV       N                              0.349
C021CV       N                              0.020
T000CV       N                              0.370
T001CV       N                              0.346
T002CV       N                              0.000
T003CV       N                              0.004
T004CV       N                              0.000
T005CV       N                              0.000
T006CV       N                              0.366
T007CV       N                              0.406
T008CV       N                              0.066
T009CV       N                              0.249
T010CV       N                              0.100
T011CV       N                              0.317
T012CV       N                              0.344
T013CV       N                              0.409
T014CV       N                              0.138
T015CV       Y                              0.645
T016CV       N                              0.371
T017CV       N                              0.000
T018CV       N                              0.000
T019CV       N                              0.024
T020CV       N                              0.640
T021CV       N                              0.031
T080CV       N                              0.316
T081CV       N                              0.355
T082CV       N                              0.312
T083CV       N                              0.387
Figure 4 shows the adenomyosis probability determined for the CL site with the CL marker group, with an AUC of 0.8750; Figure 5 shows the probability determined for the CU site with the CU marker group, with an AUC of 0.840; Figure 6 shows the probability determined for the CV site with the CV marker group, with an AUC of 0.9189. The three marker groups therefore have high discriminating ability and can be used for the detection of adenomyosis, in agreement with the results in Tables 8 to 10. For the results in Tables 8 to 10, if at least one of the probabilities predicted by the three marker groups is greater than 0.5, the individual is judged to be at risk of adenomyosis or to have adenomyosis; the judgments obtained in this way are consistent with the actual status.
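The combined decision rule stated above can be sketched as follows. This is an illustrative Python sketch only: the function name `flag_adenomyosis` and the dictionary layout are assumptions, and a site for which a subject has no sample is simply left out of the dictionary.

```python
THRESHOLD = 0.5  # cut-off stated in the text


def flag_adenomyosis(site_probabilities, threshold=THRESHOLD):
    """Return True if at least one site-specific probability
    (CL, CU, or CV) is strictly greater than the threshold."""
    return any(p > threshold for p in site_probabilities.values())


# Probabilities taken from Tables 8 to 10 for two subjects.
c019 = {"CL": 0.855, "CU": 0.798, "CV": 0.587}
c002 = {"CL": 0.168, "CU": 0.074, "CV": 0.404}

print(flag_adenomyosis(c019))  # True
print(flag_adenomyosis(c002))  # False
```

Passing the probabilities as a dictionary keeps the rule usable for subjects that appear in only one or two of the tables.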
The above is a further detailed description of the present application in conjunction with specific embodiments, and the implementation of the present application should not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the present application belongs may make a number of simple deductions or substitutions without departing from the concept of the present application.

Claims (18)

  1. A biomarker combination for the detection of adenomyosis or the assessment of disease risk, characterized in that the biomarker combination comprises at least one of forty-four nucleic acids, the forty-four nucleic acids being the sequences shown in Seq ID No.1 to Seq ID No.44, respectively, or sequences having at least 97% similarity to the sequences shown in Seq ID No.1 to Seq ID No.44, respectively.
  2. A biomarker combination for the detection of adenomyosis or the assessment of disease risk, characterized in that the biomarker combination comprises at least one of a first marker group, a second marker group, and a third marker group;
    the first marker group consists of eighteen nucleic acids, which are the sequences shown in Seq ID No.1 to Seq ID No.18, respectively, or sequences having at least 97% similarity to the sequences shown in Seq ID No.1 to Seq ID No.18, respectively;
    the second marker group consists of twenty-two nucleic acids, which are the sequences shown in Seq ID No.1, Seq ID No.4, Seq ID No.5, Seq ID No.7, Seq ID No.10, Seq ID No.11, Seq ID No.13, Seq ID No.15, and Seq ID No.18 to Seq ID No.31, respectively, or sequences having at least 97% similarity to the sequences shown in Seq ID No.1, Seq ID No.4, Seq ID No.5, Seq ID No.7, Seq ID No.10, Seq ID No.11, Seq ID No.13, Seq ID No.15, and Seq ID No.18 to Seq ID No.31, respectively;
    the third marker group consists of eighteen nucleic acids, which are the sequences shown in Seq ID No.1, Seq ID No.2, Seq ID No.13, Seq ID No.19, Seq ID No.28, and Seq ID No.32 to Seq ID No.44, respectively, or sequences having at least 97% similarity to the sequences shown in Seq ID No.1, Seq ID No.2, Seq ID No.13, Seq ID No.19, Seq ID No.28, and Seq ID No.32 to Seq ID No.44, respectively.
  3. The biomarker combination according to claim 2, characterized in that the first marker group is a CL marker group, used for adenomyosis detection or disease risk assessment on samples from the lower third of the vagina.
  4. The biomarker combination according to claim 2, characterized in that the second marker group is a CU marker group, used for adenomyosis detection or disease risk assessment on samples from the posterior vaginal fornix.
  5. The biomarker combination according to claim 2, characterized in that the third marker group is a CV marker group, used for adenomyosis detection or disease risk assessment on samples from the cervical canal.
  6. A kit for the detection of adenomyosis or the assessment of disease risk, characterized in that the kit comprises a primer pair for detecting the biomarker combination according to any one of claims 1-5, the forward primer of the primer pair being the sequence shown in SEQ ID No.45 and the reverse primer being the sequence shown in SEQ ID No.46.
  7. Use of the biomarker combination according to any one of claims 1-5 in screening drugs for adenomyosis, or in the preparation of a kit or detection tool for adenomyosis detection or disease risk assessment.
  8. A method for detecting adenomyosis, characterized by comprising the following steps:
    (1) collecting a sample from the subject to be tested, detecting the biomarker combination according to any one of claims 1-5 in the collected sample, and analyzing the level of each nucleic acid in the biomarker combination;
    (2) comparing the level of each nucleic acid measured in step (1) with a reference data set or reference value to obtain a detection result;
    preferably, the level of each nucleic acid is the relative abundance of that nucleic acid, and the reference data set or reference value is the level of each nucleic acid of the biomarker combination in adenomyosis patients and non-adenomyosis controls.
  9. The detection method according to claim 8, characterized in that the reference data set or reference value in step (2) is at least one of Table 5, Table 6, and Table 7; comparing the level of each nucleic acid with the reference data set or reference value to obtain a detection result specifically comprises calculating a disease probability with a multivariate statistical model; preferably, the multivariate statistical model is a random forest model.
  10. The detection method according to claim 8 or 9, characterized in that collecting a sample from the subject to be tested in step (1) comprises collecting a sample from the lower third of the vagina, a sample from the posterior vaginal fornix, and a sample from the cervical canal of the subject.
  11. Use of a method for determining adenomyosis by detecting biomarkers in the preparation of a kit or tool for adenomyosis detection or disease risk assessment, the biomarkers being the biomarker combination according to any one of claims 1-5;
    the method for determining adenomyosis by detecting biomarkers comprises the following steps:
    (1) collecting a sample from the subject to be tested, detecting the biomarker combination according to any one of claims 1-5 in the collected sample, and analyzing the level of each nucleic acid in the biomarker combination;
    (2) comparing the level of each nucleic acid measured in step (1) with a reference data set or reference value to obtain a detection result;
    preferably, the level of each nucleic acid is the relative abundance of that nucleic acid, and the reference data set or reference value is the level of each nucleic acid of the biomarker combination in adenomyosis patients and non-adenomyosis controls.
  12. The use according to claim 11, characterized in that the reference data set or reference value in step (2) is at least one of Table 5, Table 6, and Table 7; comparing the level of each nucleic acid with the reference data set or reference value to obtain a detection result specifically comprises calculating a disease probability with a multivariate statistical model; preferably, the multivariate statistical model is a random forest model.
  13. A method for screening candidate drugs for the treatment of adenomyosis, characterized by comprising the following steps:
    1) determining the biomarker combination according to any one of claims 1-5 in samples taken before and after administration, respectively, and analyzing the level of each nucleic acid in the biomarker combination;
    2) evaluating the candidate drug by comparing the level of each nucleic acid in the samples before and after administration;
    in step 2), comparing the level of each nucleic acid in the samples before and after administration specifically comprises calculating a disease probability with a multivariate statistical model; preferably, the multivariate statistical model is a random forest model.
  14. A method for detecting the microbiota of the female reproductive tract, characterized by comprising the following steps:
    (1) collecting a microbial sample from the reproductive tract of the subject to be tested, detecting the biomarker combination according to any one of claims 1-5 in the collected sample, and analyzing the level of each nucleic acid in the biomarker combination;
    (2) comparing the level of each nucleic acid measured in step (1) with a reference data set or reference value to obtain a detection result;
    preferably, the level of each nucleic acid is the relative abundance of that nucleic acid, and the reference data set or reference value is the level of each nucleic acid of the biomarker combination in adenomyosis patients and non-adenomyosis controls.
  15. The detection method according to claim 14, characterized in that the reference data set or reference value in step (2) is at least one of Table 5, Table 6, and Table 7; comparing the level of each nucleic acid with the reference data set or reference value to obtain a detection result specifically comprises calculating a disease probability with a multivariate statistical model; preferably, the multivariate statistical model is a random forest model.
  16. The detection method according to claim 14 or 15, characterized in that collecting a microbial sample from the reproductive tract of the subject to be tested in step (1) specifically comprises collecting a sample from the lower third of the vagina, a sample from the posterior vaginal fornix, and a sample from the cervical canal of the subject.
  17. A method for preparing a biomarker combination for adenomyosis, characterized by comprising the following steps:
    (1) collecting microbial samples from the reproductive tract of adenomyosis patients and of non-patients, and performing 16S sequencing on each collected sample;
    (2) clustering the 16S sequencing results to obtain OTUs and the seed sequence of each OTU, and calculating the relative abundance of each OTU;
    (3) fitting the relative abundance of each OTU to adenomyosis status with a random forest model and performing five repetitions of 10-fold cross-validation to obtain the optimal OTU combination; the seed sequences of the OTUs in the optimal OTU combination constitute the biomarker combination for adenomyosis.
  18. The method according to claim 17, characterized in that collecting microbial samples from the reproductive tract in step (1) specifically comprises collecting a sample from the lower third of the vagina, a sample from the posterior vaginal fornix, and a sample from the cervical canal of the subject.
PCT/CN2017/096248 2016-09-19 2017-08-07 Biomarker composition for detection of adenomyosis and application thereof WO2018049946A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201780047953.5A CN109689890B (en) 2016-09-19 2017-08-07 Biomarker combination for adenomyosis detection and application thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610831104.6A CN107858415B (en) 2016-09-19 2016-09-19 Biomarker combination for adenomyosis detection and application thereof
CN201610831104.6 2016-09-19

Publications (1)

Publication Number Publication Date
WO2018049946A1 true WO2018049946A1 (en) 2018-03-22

Family

ID=61619297

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/096248 WO2018049946A1 (en) 2016-09-19 2017-08-07 Biomarker composition for detection of adenomyosis and application thereof

Country Status (2)

Country Link
CN (2) CN107858415B (en)
WO (1) WO2018049946A1 (en)

Also Published As

Publication number Publication date
CN107858415A (en) 2018-03-30
CN109689890A (en) 2019-04-26
CN107858415B (en) 2021-05-28
CN109689890B (en) 2022-03-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17850148

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17850148

Country of ref document: EP

Kind code of ref document: A1