CN115424669A - LR score-based triple negative breast cancer curative effect and prognosis evaluation model - Google Patents
- Publication number
- CN115424669A (application CN202210993850.0A)
- Authority
- CN
- China
- Prior art keywords
- score
- pair
- breast cancer
- pairs
- prognosis
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/50—Mutagenesis
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/10—Gene or protein expression profiling; Expression-ratio estimation or normalisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B5/00—ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/106—Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/118—Prognosis of disease development
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses an LR score-based model for evaluating curative effect and prognosis in triple-negative breast cancer. The model is constructed by: (1) analyzing the LR pairs related to the prognosis of triple-negative breast cancer using LASSO-Cox regression to obtain primary screening LR pairs; (2) filtering the primary screening LR pairs with the stepAIC strategy in the MASS software package to obtain the target LR pairs with the lowest stepAIC value; and (3) obtaining the coefficient of each target LR pair through multivariate Cox regression analysis and constructing a risk assessment model. The risk assessment model performs well in evaluating both treatment effect and prognosis in triple-negative breast cancer, has a high AUC value, and can accurately identify different risk groups. It thereby provides technical support for stratifying patients by risk and prompts medical workers at an early stage to provide appropriate classified management and personalized treatment plans.
Description
Technical Field
The invention relates to the field of molecular biology, in particular to an LR score-based model for evaluating curative effect and prognosis in triple-negative breast cancer.
Background
Breast cancer is currently one of the most common cancers in women, accounting for 11.7% of all cancer cases. Clinically, breast cancer is classified into three major subtypes according to the expression of molecular markers, namely the estrogen receptor, the progesterone receptor, and human epidermal growth factor receptor 2 (HER2): the hormone receptor-positive/HER2-negative subtype (70%), the HER2-positive subtype (15%-20%), and the triple-negative subtype (TNBC, a tumor type lacking all three of these standard molecular markers; 15%). Of the three subtypes, triple-negative breast cancer (TNBC) is the most aggressive and has the worst prognosis. Based on analyses of various clinical, pathological, and genetic factors reported in related studies, researchers have regarded triple-negative breast cancer as a single heterogeneous breast cancer subtype. Multi-omics studies have also provided new insight into the biological heterogeneity of TNBC, and its classification has evolved into distinct molecular subtypes defined by recurrent genetic aberrations, transcription patterns, tumor microenvironment characteristics, and the like. Accurate classification of these molecular subtypes, and prognosis prediction based on their genetic profiles, may help advance research on personalized treatment. However, multi-omics analysis involves complex steps, high detection and time costs, and demanding personnel requirements, so it cannot be widely adopted, and the prior art offers no other means of effectively typing the different molecular subtypes of triple-negative breast cancer.
Tumors are heterogeneous mixtures of cancerous and non-cancerous cells. Intercellular communication mediated by ligand-receptor interactions in the tumor microenvironment (TME) has profound effects on tumor progression, and studies have shown that communication between the cells within a tumor is critical for that progression. This communication is achieved through ligands produced by cells (proteins, peptides, fatty acids, steroids, gases, and other low-molecular-weight compounds), which are secreted or presented on the cell surface and act on receptors on or within the target cell. The literature indicates that most cells express tens to hundreds of ligands and receptors, forming a highly connected signaling network through multiple ligand-receptor pairs. The biological importance and availability of receptors and their corresponding ligands make them particularly useful clinical targets in cancer, but they have not yet been applied to triple-negative breast cancer.
At present there is no treatment guideline or prognostic management method specific to triple-negative breast cancer, so patients are generally treated according to the conventional standard treatment for breast cancer. However, triple-negative breast cancer is harder to treat than ordinary breast cancer, has a lower survival rate and a very poor prognosis, and an effective prognosis is achieved in only 10-20% of cases. For an individual patient, the molecular type and specific properties of the tumor determine the final prognosis, yet the current typing schemes cannot effectively reveal significant differences in treatment and prognosis between types. Developing an efficient and accurate new subtype typing method for triple-negative breast cancer is therefore of great significance for its early treatment and prognosis evaluation.
Disclosure of Invention
The present invention aims to solve at least one of the above problems in the prior art. To this end, it provides a model constructed on the basis of the LR score. By analyzing the gene expression parameters of a triple-negative breast cancer patient with the model, the therapeutic effect of anti-tumor drugs on the patient and the associated prognosis can be judged effectively and accurately. Compared with conventional methods, the model better reflects the patient's treatment effect and survival prognosis and predicts the course of the disease, so that more effective treatment measures can be applied and the mortality of triple-negative breast cancer can be reduced.
In a first aspect of the present invention, a method for constructing a risk assessment model based on ligand-receptor pairs (LR pairs) is provided, which includes the following steps:
(1) Analyzing the LR pairs related to the prognosis of the triple-negative breast cancer by using a Cox regression with LASSO punishment, and eliminating the unimportant LR pairs by reducing the weight of model parameters to obtain primary screening LR pairs;
(2) Filtering the primary screening LR pair through a stepAIC strategy in an MASS software package to obtain a target LR pair with the lowest stepAIC value;
(3) And obtaining the coefficient of each target LR pair through multivariate Cox regression analysis, and constructing a risk assessment model.
According to the first aspect of the present invention, in some embodiments of the present invention, the prognosis-related LR pair for triple negative breast cancer in step (1) comprises: APOB-ENO1, CXCL12-ITGA4, GPI-AMFR, MUC7-SELL, SELPLG-SELL, PODXL-SELL, BSG-SLC16A7, CD22-PTPRC, PTPRC-CD22, PPBP-CXCR1, IL7-IL7 3245 zxft 3219-CCR 7, SERPING1-SELP, CCL16-CCR2, PLG-F2RL1, CXCL13-ACKR4, IL11-IL6ST, ICOS-ICOSLG, ICOSLG-ICOS, CCL19-ACKR4, CXCL3-CXCR1, ICAM4-ITGA4 CCL16-CCR5, CD2-CD48, CD48-CD2, TNFSF13B-TNFRSF13B, ADAM-ITGA 4, HBEGF-ERBB2, CALR-SCARF1, CXCL13-CXCR5, LGI3-STX1A, CD-CD 48, CD48-CD244, HLA-A-KIR3DL1, HLA-B-KIR3DL1, HLA-F-KIR3DL1, LGI3-FLOT1, VEGFA-GPC1, EFNA4-EPHA5, QRFP-P2RY14, FGF4-FGFRL1, FGFRL1 CXCL8-CXCR1, NMS-NMUR2, GLG1-SELE, LY9-LY9, EDN3-KEL, CXCL13-CCR10, CCL19-CXCR3, CCL21-CXCR3, ADM-RAMP2, CFH-SELL, CXCL12-ITGB1, CD34-SELL, NMB-BRS3, CCL21-CCR7, PODXL2-SELL, SERPING1-SELE, CLEC2B-KLRF1, KLRF1-CLEC2B, CXCL-CXCR 3, SEMA4D-MET, ADM-GPRX 2, EBI3-IL6ST, SELP-IL 7 zxft 3536 2M-CD1B, ADAM-ITGA 4, WNT8A-FZD5, ADAM7-ITGB7, GNRH 1-MRRH 35BR 7, RHG-CALR 1-CCL 19, RCH 2-CCR 2, CCL 2-CCR 3, CCL 2-TIC 3, CLC 2-TIR 4, SEMA4D-MET 2, CRF 2, CRL 3, CRF 2-CCR 7, CRF 2-OCR 7, CRF 2, CRL 7, PRLL 3-OCR 7, CRL 7, CRS 2-OCR 7, and CRS 2, DKK2-LRP6, SERPINE1-LRP1, EFNA4-EPHA6, NMB-NMBR, MMP7-ERBB4, POMC-OPRD1, CLEC2D-KLRB1, KLRB1-CLEC2D, GDF-TDGF 1, GUCA2A-GUCY2C, IL-KCNJ 10, FGF3-FGFRL1, LRAP 1-SORT1, CXCL11-CXCR3, WNT9A-FZD10, APOB-LSR, CD70-CD27, ANGPTL1-TEK, TNF-TNFRSF1B, CXCL-CD 4, NECTN 2-TIIT, NECTN 4-TIIT, TIGIN 2, TIGN-LG-FAS, FASMC-POMC 3 25 zitf 4325-LRB 4325, APP 1-LRP 22, APP 22-PLRA 22-CCL 22-SEL 22-L22-SEL TNFSF13-FAS, CD160-TNFRSF14, TNFRSF14-CD160, FGF6-FGFR1, SPP1-ITGB1, MADCAM1-ITGA4, OXT-AVPR1A, CXCL-CXCR 3, CXCL5-ACKR1, BTLA-TNFRSF14, TNFRSF14-BTLA, CCL5-CCR3, TAC3-TACR1, CXCL10-SDC4, VEGFA-KDR, EFNA1-EPHA5, CCL5-ACKR1, EFNA1-EPHA2, DKK4-KREMEN2, POMC-MC2R, GNAS-PTGDR, CCL13-CCR2, IL18-IL18R1, MYL9-CD69, CXCL6-CXCR1, RSPO 3-CD 6, LRPAMMBN-CD 
63, CALR-ITHR, CALCM 3-TSICA 1, and NMCLL 3-NMURDLL 4.
In the present invention, by analyzing known LR pairs related to triple negative breast cancer (TNBC), the inventors screened a total of 145 LR pairs associated with TNBC prognosis, of which 44 pairs are associated with poor prognosis and 101 pairs with good prognosis.
The 44 LR pairs associated with poor prognosis are: APOB-ENO1, GPI-AMFR, BSG-SLC16A7, PPBP-CXCR1, PLG-F2RL1, CXCL3-CXCR1, HBEGF-ERBB2, CALR-SCARF1, LGI3-STX1A, LGI3-FLOT1, VEGFA-GPC1, EFNA4-EPHA5, FGF4-FGFRL1, CXCL8-CXCR1, NMS-NMUR2, ADM-RAMP2, NMB-BRS3, ADM-MRGPRX2, WNT8A-FZD5, ADM-CALCR, VTN-PLAUR, LRFN3-LRFN3, SERPINE1-LRP1, EFNA4-EPHA6, NMB-NMBR, GDF3-TDGF1, FGF3-FGFRL1, LRPAP1-SORT1, APOB-LSR, APP-LRP1, IL22-IL22RA1, FGF6-FGFR1, SPP1-ITGB1, OXT-AVPR1A, TAC3-TACR1, VEGFA-KDR, EFNA1-EPHA5, EFNA1-EPHA2, DKK4-KREMEN2, CXCL6-CXCR1, AMBN-CD63, CALR-TSHR, NMU-NMUR1, DLL4-NOTCH3.
The 101 LR pairs associated with good prognosis are: CXCL12-ITGA4, MUC7-SELL, SELPLG-SELL, PODXL-SELL, CD22-PTPRC, PTPRC-CD22, IL7-IL7R, CCL-CCR 7, SERPING1-SELP, CCL16-CCR2, CXCL13-ACKR4, IL11-IL6ST, ICOS-ICOSLG, ICOSLG-ICOS, CCL19-ACKR4, ICAM4-ITGA4, CCL16-CCR5, CD2-CD48, CD48-CD2, TNFSF13B-TNFRSF13B, ADAM-ITGA 4, CXCL13-CXCR5, CD244-CD48, CD48-CD244, HLA-A-KIR3DL1, HLA-B-KIR3DL1, HLA-F-KIR3DL1, QRFP-P2RY14, GLG1-SELE, LY9-LY9, EDN3-KEL, CXCL13-CCR10, CCL19-CXCR3, CCL21-CXCR3, CFH-SELL, CXCL12-ITGB1, CD34-SELL, CCL21-CCR7, PODXL2-SELL, SERPING1-SELE, CLEC2B-KLRF1, CD48-CD244, KLRF1-CLEC2B, CXCL-CXCR 3, SEMA4D-MET, EBI3-IL6ST, TSLP-IL7R, B2M-CD1B, ADAM-ITGA 4, ADAM7-ITGB7, GNRH1-GNRHR, LTB-LTBR, COL14A1-CD44, F2-F2RL2, DEFB103A-CCR6, SLIT3-ROBO1, CCL25-ACKR2, CCL19-CCR10, CXCL9-CCR3, MRC1-PTPRC, PTPRC-MRC1, CD58-CD2, CYTL1-CCR2, CTL 2 DKK2-LRP6, MMP7-ERBB4, POMC-OPRD1, CLEC2D-KLRB1, KLRB1-CLEC2D, GUCA2A-GUCY2C, IL-KCNJ 10, CXCL11-CXCR3, WNT9A-FZD10, CD70-CD27, ANGPTL1-TEK, TNF-TNFRSF1B, CXCL12-CD4, NECTIN2-TIGIT, NECTIN4-TIGIT, TIGIT-NECTIN2, FASLG-FAS, POMC-MC3R, VCAM1-ITGB7, SELG-SELE, CCL21-ACKR4, TNFSF13-FAS, CD160-TNFRSF14, TNFRSF14-CD160, MADCAM1-ITGA4, CXCL9-CXCR3, CXCL5-ACKR1, BTLA-TNFRSF14, TNFRSF14-BTLA, CCL5-CCR3, CXCL10-SDC4, CCL5-ACKR1, POMC-MC2R, GNAS-PTGDR, CCL13-CCR2, IL18-IL18R1, MYL9-CD69, RSPO3-LRP6, ICAM3-ITGAL.
During the development of cancer, crosstalk between cancer cells and stromal cells is coordinated by numerous ligand-receptor interactions to produce a tumor microenvironment (TME) that favors tumor growth. In the TME, LR-based cell-cell communication underlies the poor prognosis of various cancers (such as pancreatic ductal adenocarcinoma and colorectal cancer), so further study of receptors, ligands and their interactions is a major focus of the field. In the present invention, the LR pairs are all taken from the literature-curated database connectomeDB2020, which integrates 2293 LR interaction pairs. Using these 2293 LR pairs as the basis for screening, the inventors obtained 145 LR pairs from the TNBC datasets; of course, those skilled in the art can select LR pairs from other database sources according to actual requirements.
In the present invention, enrichment analysis of the above 145 LR pairs showed that 10 pathways were most enriched: viral protein interaction with cytokine and cytokine receptor, cytokine-cytokine receptor interaction, cell adhesion molecules (CAMs), chemokine signaling pathway, intestinal immune network for IgA production, rheumatoid arthritis, proteoglycans in cancer, malaria, neuroactive ligand-receptor interaction, and hematopoietic cell lineage.
In some embodiments of the invention, the primary screening LR pairs in step (1) are selected by LASSO-Cox regression analysis with 10-fold cross-validation.
In some embodiments of the invention, the primary screening LR pairs are: CXCL9-CCR3, GPI-AMFR, IL18-IL18R1, HLA-F-KIR3DL1, PODXL-SELL and PLG-F2RL1.
In the present invention, the above-described preliminary screening LR pairs present non-zero coefficients in the fitted LASSO-COX regression model.
In some embodiments of the invention, the target LR pairs in step (3) are: CXCL9-CCR3, GPI-AMFR, IL18-IL18R1 and PLG-F2RL1.
In some embodiments of the invention, the risk assessment model is:
LR pair score = -0.08996361 × (CXCL9-CCR3 expression level) + 0.27093847 × (GPI-AMFR expression level) - 0.29143116 × (IL18-IL18R1 expression level) + 0.28034741 × (PLG-F2RL1 expression level).
In the present invention, the risk assessment model is derived from LR pair score = Σ βi × Expi, where Expi is the expression level of the i-th LR pair and βi is the coefficient of that pair in the multivariate Cox regression. After z-score normalization with 0 as the threshold, patients can be classified into a high-risk group (high LR pair score group) and a low-risk group (low LR pair score group).
In some embodiments of the present invention, the detection product for detecting the expression level of the LR pairs includes, but is not limited to, detection products based on semi-quantitative RT-PCR, Northern blot, real-time fluorescence quantitative PCR, and the like. The relevant specific primers, probes and the like can be obtained by routine methods in the art.
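Written out in standard notation, the scoring rule described above can be summarized as follows. This is a sketch of the notation only: n = 4 target LR pairs in the disclosed model, and taking μ_S and σ_S to be the cohort mean and standard deviation of the raw scores is an assumption based on the usual definition of a z-score, since the text does not spell out the normalization.

```latex
\[
S = \sum_{i=1}^{n} \beta_i \,\mathrm{Exp}_i ,
\qquad
z = \frac{S - \mu_S}{\sigma_S},
\qquad
\text{group} =
\begin{cases}
\text{high LR pair score (high risk)}, & z > 0 \\
\text{low LR pair score (low risk)}, & \text{otherwise}
\end{cases}
\]
```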
In some embodiments of the invention, the detection product includes, but is not limited to, a detection reagent, a detection kit, and a gene chip.
In a second aspect of the present invention, there is provided an application of the risk assessment model constructed by the construction method of the first aspect of the present invention in the assessment of the treatment effect of triple negative breast cancer.
In some embodiments of the invention, the smaller the LR pair score obtained by the risk assessment model, the better the treatment effect of an antitumor agent on triple negative breast cancer; conversely, the larger the LR pair score, the poorer the treatment effect of the antitumor agent on triple negative breast cancer.
In some embodiments of the invention, the therapeutic effect is manifested by decreased resistance to the antineoplastic agent and increased response to the treatment.
In the present invention, the inventors verified experimentally that the LR pair score is associated with tumor immunity and immune checkpoint genes: 18 of 19 immune checkpoints were differentially expressed between the two LR pair score groups, with the high LR pair score group showing the larger response. Compared with the low LR pair score group, the high LR pair score group also showed a significantly up-regulated T cell exclusion score and a significantly down-regulated T cell dysfunction score, while the TIDE score did not differ significantly between the two groups. This demonstrates that the LR pair score can be used to characterize tumor immune-related effects.
In the present invention, the inventors verified experimentally that, taking anti-PD-L1 as an example drug, the proportion of patients responding to anti-PD-L1 treatment was significantly higher in the low LR pair score group than in the high LR pair score group. This demonstrates that the LR pair score can be used to characterize the therapeutic effectiveness of a drug.
In the present invention, the inventors verified experimentally that, taking conventional antitumor drugs as examples, the IC50 values of the drugs were significantly lower in the low LR pair score group than in the high LR pair score group, indicating that the low LR pair score group may be more sensitive to the four drug treatments. This demonstrates that the LR pair score can be used to characterize patient drug resistance and drug sensitivity.
In some embodiments of the invention, the magnitude of the LR pair score is defined based on z-score normalization of the output of the risk assessment model.
In some embodiments of the invention, the z-score normalization is paired with a threshold: an LR pair score greater than the threshold is defined as a high LR pair score; otherwise it is defined as a low LR pair score.
In some embodiments of the invention, the threshold is 0.
In a third aspect of the present invention, there is provided an application of the risk assessment model constructed by the construction method of the first aspect of the present invention in the assessment of the prognosis effect of triple negative breast cancer.
In some embodiments of the invention, the smaller the LR score obtained by the risk assessment model, the higher the prognosis survival rate and the longer the survival time of the subject; conversely, the lower the prognosis survival rate of the subject, the shorter the survival time.
In the present invention, the inventors verified experimentally that patients with low LR pair scores show significantly more favorable survival outcomes, and this conclusion was confirmed across multiple training and validation sets.
In the present invention, the inventors found experimentally that, in the GSE58812 validation set, the AUC values of the LR pair scoring model were 0.72, 0.75 and 0.67 at 3, 5 and 10 years, respectively; in the GSE21653 validation set, the AUC values were 0.90, 0.87 and 0.78 for 1-year, 3-year and 5-year survival, respectively. The model thus shows excellent accuracy and effectiveness.
In some embodiments of the invention, the magnitude of the LR pair score is defined based on z-score normalization of the output of the risk assessment model.
In some embodiments of the invention, the z-score normalization is paired with a threshold: an LR pair score greater than the threshold is defined as a high LR pair score; otherwise it is defined as a low LR pair score.
In some embodiments of the invention, the threshold is 0.
The invention has the beneficial effects that:
The invention provides a method for constructing a risk assessment model. The method yields an assessment model based on ligand-receptor (LR) pairs; by analyzing the gene expression parameters of a triple negative breast cancer patient with this model, the treatment effect of an antitumor drug on the patient and the related prognosis can be judged effectively and accurately. Compared with traditional methods, it better reflects the patient's treatment effect and prognostic survival and predicts the course of the disease, so that more effective treatments can be developed and the mortality of triple negative breast cancer can be effectively reduced.
The risk assessment model has excellent application effect in the triple negative breast cancer treatment effect assessment and the triple negative breast cancer prognosis effect assessment, has high AUC value, and can accurately and efficiently obtain different risk groups, thereby providing effective technical support for the division of different risk groups, and further prompting medical workers to provide a proper classified management and personalized treatment scheme as soon as possible.
Drawings
FIG. 1 shows the screening of prognosis-related LR pairs in the present invention, wherein A is the screening flowchart; B is the prognostic volcano plot of the 145 LR pairs; C is the interaction network of the 145 LR pairs.
FIG. 2 shows the 10 most highly enriched KEGG pathways of the 145 LR pairs.
FIG. 3 shows the identification of three TNBC subtypes based on LR pairs, wherein A is the consensus clustering cumulative distribution function (CDF) for K = 2 to 9; B is the delta area curve of the consensus clustering of samples in METABRIC; C is the consensus clustering heatmap of the samples at K = 3.
FIG. 4 is a Kaplan-Meier analysis of overall survival (OS) for the three subtypes in the METABRIC dataset.
FIG. 5 shows Kaplan-Meier analysis plots of OS for the three molecular subtypes in the GSE58812 dataset (A) and the GSE21653 dataset (B).
FIG. 6 shows the clinical characteristics and genomic alterations of the LR pair-based molecular subtypes, wherein A is the distribution of stage, grade, age, PAM50 + Claudin-low molecular subtype and survival status for each subtype in the METABRIC database; B is the distribution of age and survival status for each subtype in the GSE58812 cohort.
FIG. 7 shows the age and survival status distributions of the three subtypes in the GSE21653 dataset.
FIG. 8 is a waterfall plot of somatic mutations and CNVs for the three subtypes of the present invention in the METABRIC database (chi-square test).
FIG. 9 shows the results of functional analysis between the LR pair-based molecular subtypes, wherein A is a GSEA bubble plot of the C1 and C3 subtypes in the METABRIC cohort; B is a GSEA bubble plot of the C1 and C3 subtypes in the three cohorts; C is a heatmap of GSEA normalized enrichment scores (NES) for C1 vs C2, C1 vs C3 and C2 vs C3, with the vertical axis representing the comparison groups and the horizontal axis representing pathway names; D is a radar plot of the pathways consistently activated in C1 vs C2 and C2 vs C3 in the METABRIC database.
FIG. 10 shows the estimated proportions of 22 immune cell types in the LR pair-based molecular subtypes in the METABRIC (A), GSE58812 (B) and GSE21653 (C) cohorts; p-values were calculated by the Kruskal-Wallis test; ns: p > 0.05, *: p < 0.05, **: p < 0.01, ***: p < 0.001, ****: p < 0.0001.
FIG. 11 shows the stromal score, immune score and ESTIMATE score of the three LR pair-based molecular subtypes in the METABRIC (A), GSE58812 (B) and GSE21653 (C) cohorts; p-values were calculated by the Kruskal-Wallis test; ns: p > 0.05, *: p < 0.05, **: p < 0.01, ***: p < 0.001, ****: p < 0.0001.
FIG. 12 shows the LASSO-Cox regression analysis of the 6 LR pairs and the fitted LASSO-Cox regression model curve.
FIG. 13 shows the 4 LR pairs and the coefficients corresponding to these predictors in the Cox regression model.
FIG. 14 shows the LR pair scores of the three LR pair-based subtypes in the METABRIC cohort (A), the GSE58812 cohort (B) and the GSE21653 cohort (C); Kruskal-Wallis test.
FIG. 15 shows Kaplan-Meier comparisons of estimated OS for samples with different LR pair scores in the METABRIC cohort (A), the GSE58812 cohort (C) and the GSE21653 cohort (D), log-rank test; and the time-dependent ROC curve showing the predictive power of the LR pair score in the METABRIC cohort (B); ns: p > 0.05, *: p < 0.05, **: p < 0.01, ***: p < 0.001, ****: p < 0.0001.
FIG. 16 shows time-dependent ROC curves of the predictive power of the LR pair score in the GSE58812 cohort (A) and the GSE21653 cohort (B).
FIG. 17 is a forest plot of the coefficients of univariate (A) and multivariate (B) Cox regression and their confidence intervals, covering factors such as the LR pair score, patient age, stage, grade and outcome in the METABRIC cohort.
FIG. 18 shows the correlation of the LR pair score with immune composition and immune-related pathways, wherein A is the Pearson correlation analysis between the ssGSEA scores of KEGG pathways and the LR pair score in METABRIC, r > 0.4; B is the relative abundance of 22 immune cell types in the high and low LR pair score groups in the METABRIC cohort, Wilcoxon test; C is the estimated immune score of the high and low LR pair score groups in the METABRIC cohort, Wilcoxon test; D is the Pearson correlation analysis of the LR pair score and immune cell components; ns: p > 0.05, *: p < 0.05, **: p < 0.01, ***: p < 0.001, ****: p < 0.0001.
FIG. 19 shows the correlation of the LR pair score with immune checkpoint gene expression, Wilcoxon test; ns: p > 0.05, *: p < 0.05, **: p < 0.01, ***: p < 0.001, ****: p < 0.0001.
FIG. 20 shows the correlation between the LR pair scoring model and the exclusion score (A), dysfunction score (B) and TIDE score (C).
FIG. 21 shows the difference in LR pair score (A) between the complete response (CR)/partial response (PR) and stable disease (SD)/progressive disease (PD) groups in the IMvigor210 cohort, and the survival curves for different LR pair score groups in the IMvigor210 cohort; ns: p > 0.05, *: p < 0.05, **: p < 0.01, ***: p < 0.001, ****: p < 0.0001.
FIG. 22 shows the response of patients with different LR pair scores in the IMvigor210 cohort to anti-PD-L1 treatment, log-rank test.
FIG. 23 shows the relationship between the LR pair score and drug sensitivity, wherein A is the correlation between the LR pair score and the AUC of the drug sensitivity curve, Spearman correlation analysis; B is the difference in estimated IC50 values of paclitaxel, veliparib, olaparib and talazoparib between the LR pair score groups, Wilcoxon test; ns: p > 0.05, *: p < 0.05, **: p < 0.01, ***: p < 0.001, ****: p < 0.0001.
FIG. 24 shows exemplary data for 4 samples of other LR pairs.
Detailed Description
In order to make the objects, technical solutions and technical effects of the present invention more clear, the present invention will be described in further detail with reference to specific embodiments. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are given by way of illustration only.
The experimental materials and reagents used are, unless otherwise specified, all consumables and reagents which are conventionally available from commercial sources.
In the embodiments of the invention, the TNBC data are drawn mainly from the METABRIC dataset (a breast cancer dataset) hosted by the cBio Cancer Genomics Portal (cBioPortal). The METABRIC dataset was downloaded from cBioPortal (http://cbioportal.org/) and screened for usability. The final dataset used in the present invention included genomic variation data for 318 TNBC samples and expression profiles for 298 samples (both from the METABRIC dataset), as well as microarray data for 10783 TNBC samples collected from the GSE58812 and GSE21653 datasets in the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/).
In the present examples, all statistical analyses were performed in R 4.0.2. Kaplan-Meier survival curves and receiver operating characteristic (ROC) curves were visualized with the "survminer" and "timeROC" packages, respectively. The LR pair score and clinical parameters were included in a Cox proportional hazards regression to identify independent predictors of TNBC prognosis. The p-value cutoff was set to 0.05.
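For readers unfamiliar with the Kaplan-Meier estimator behind these survival curves, the following is a minimal sketch in Python using only the standard library. It is not the R workflow used in the examples; the function name and the toy follow-up data are hypothetical and serve only to illustrate the product-limit calculation.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
        i += leaving
    return curve


# Hypothetical follow-up data (illustrative only).
km_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored patients reduce the number at risk without lowering the survival estimate, which is why the curve only steps down at times 1, 2 and 3 in this toy example.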
In the present invention, the expression level of an LR pair refers to the sum of the expression levels of the two genes in the pair. For example, "CXCL9-CCR3 expression level" or "CXCL9-CCR3" means the CXCL9 expression level plus the CCR3 expression level.
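The definition above can be sketched in a few lines of Python. The function name and the expression values are hypothetical; only the rule "pair expression = ligand expression + receptor expression" comes from the text.

```python
def lr_pair_expression(expression, pair):
    """Return the LR pair expression level: ligand expression + receptor expression."""
    ligand, receptor = pair
    return expression[ligand] + expression[receptor]


# Hypothetical expression values for one sample (illustrative only).
sample = {"CXCL9": 2.5, "CCR3": 1.0}
cxcl9_ccr3 = lr_pair_expression(sample, ("CXCL9", "CCR3"))
```

Passing the pair as a (ligand, receptor) tuple rather than a hyphenated string avoids ambiguity for names such as HLA-F-KIR3DL1, where the gene symbol itself contains a hyphen.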
In the present invention, each LR pair or gene can be quantitatively detected based on the existing detection kit or detection product, or can be detected by using primers and/or probes by a conventional means in the art, and the model construction and evaluation effects in the present invention are not limited by the choice of the detection product.
Ligand receptor pair acquisition and screening
The inventors downloaded 2293 interacting ligand-receptor (LR) pairs from the literature-curated database connectomeDB2020 for use in screening. The screening criterion is as follows: a patient is defined as high expression for an LR pair if the sum of the expression of the two genes in the pair is equal to or greater than the median of that sum across all patients; otherwise, the patient is defined as low expression. The R package "survival" was used to analyze the correlation between each LR pair and TNBC patient survival in each cohort. Statistical significance was assessed by the Peto and Peto modification of the Gehan-Wilcoxon test, and the hazard ratio (HR) was calculated as the exponentiated coefficient of a Cox regression model. The "sump" function in the "metap" package was used to integrate the P values of the different cohorts based on the Edgington method, and multiple testing correction was performed based on the Storey method.
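The high/low expression rule in this screening criterion can be sketched as follows in Python. The function name and patient values are hypothetical; the only element taken from the text is the cutoff, namely that a value equal to or greater than the cohort median counts as high expression.

```python
from statistics import median


def split_by_median(pair_expression):
    """Label each patient 'high' if the LR pair expression is >= the cohort median, else 'low'."""
    cutoff = median(pair_expression.values())
    return {
        patient: ("high" if value >= cutoff else "low")
        for patient, value in pair_expression.items()
    }


# Hypothetical LR pair expression (ligand + receptor) per patient.
cohort = {"P1": 3.5, "P2": 1.0, "P3": 2.0, "P4": 5.0}
groups = split_by_median(cohort)
```

The two resulting groups are what a survival test would then compare for each LR pair.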
The results are as follows.
The screening flow chart is shown in FIG. 1A.
As described above, to screen for LR pairs relevant to TNBC prognosis, the inventors performed survival analysis of the LR pairs in METABRIC, GSE58812 and GSE21653 and combined the prognostic P values obtained from the three cohorts in a meta-analysis: the P values of the three cohorts were integrated based on the Edgington method using the "sump" function in the "metap" software package, and multiple testing correction was performed based on the Storey method using the "qvalue" software package. As a result, a total of 145 LR pairs associated with TNBC prognosis were identified, 44 associated with poor prognosis and 101 associated with good prognosis (FIG. 1B). An interaction network was then drawn for these prognosis-related LR pairs (FIG. 1C).
Enrichment analysis was then performed by mapping the LR pairs to KEGG (FIG. 2). Among the 145 LR pairs, 10 pathways were found to be most enriched: viral protein interaction with cytokine and cytokine receptor, cytokine-cytokine receptor interaction, cell adhesion molecules (CAMs), chemokine signaling pathway, intestinal immune network for IgA production, rheumatoid arthritis, proteoglycans in cancer, malaria, neuroactive ligand-receptor interaction, and hematopoietic cell lineage.
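As an illustration of how P values from several cohorts can be merged by Edgington's sum-of-p method, here is a minimal Python sketch. It is not the R implementation used in the examples; the function name and the three input P values are hypothetical. It relies on the textbook result that the sum of k independent uniform P values follows the Irwin-Hall distribution.

```python
from math import comb, factorial, floor


def edgington_combine(p_values):
    """Combine independent p-values with Edgington's sum-of-p method.

    Under the null hypothesis the sum S of k uniform p-values follows the
    Irwin-Hall distribution; the combined p-value is its CDF evaluated at S.
    """
    k = len(p_values)
    s = sum(p_values)
    total = 0.0
    for j in range(floor(s) + 1):
        total += (-1) ** j * comb(k, j) * (s - j) ** k
    return total / factorial(k)


# Hypothetical per-cohort p-values for one LR pair (three cohorts).
combined_p = edgington_combine([0.01, 0.04, 0.10])
```

When the sum of the P values is below 1, the formula reduces to S^k / k!, so three small P values combine into a much smaller one. The alternating sum can lose precision when many P values are combined, which is not a concern for three cohorts.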
LR subtype typing based on consensus clustering
Samples were clustered by consensus clustering based on the expression of the TNBC prognosis-related LR pairs. The k-means algorithm with "1 - Pearson correlation" as the distance measure was specified to divide the samples into K groups, with 500 bootstrap replicates each involving 80% of the samples. The consensus clustering heatmap was generated with the R package "pheatmap". The number of clusters was determined from the consensus cumulative distribution function (CDF) plot and the delta area plot, using as criteria high consistency within clusters, a low coefficient of variation, and no significant increase in the area under the CDF curve.
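The "1 - Pearson correlation" distance named above can be sketched as follows in Python. This covers only the distance measure, not the consensus clustering procedure itself; the function name and the toy expression vectors are hypothetical.

```python
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1 - covariance / (spread_x * spread_y)


# Two hypothetical samples with perfectly correlated LR pair expression
# are at distance 0; perfectly anti-correlated samples are at distance 2.
d_same = pearson_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
d_opposite = pearson_distance([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Because the distance depends on the shape of the expression profile rather than its absolute level, samples with the same pattern of LR pair expression are grouped together even if their overall signal intensity differs.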
The results are as follows.
Using the above method, the inventors examined whether the TNBC samples could be divided into distinct subtypes (three TNBC subtypes) according to the diversity of the expression patterns of their prognosis-related LR pairs. The significant prognosis-related LR pairs were included in the clustering analysis, with the expression abundance of each LR pair represented by the sum of the expression of its ligand and receptor genes. In the METABRIC cohort, 298 TNBC samples were clustered by consensus clustering analysis. In optimizing the number of clusters K, the cumulative distribution function (CDF) curve indicated that K = 3 produces a stable clustering result (FIG. 3A and 3B), and K = 3 was therefore selected as the final choice (FIG. 3C).
Further analysis of the prognostic signatures then revealed that there were significant differences in prognosis between the three subtypes. Overall Survival (OS) for C1 was the most unfavorable, with OS for C3 being the longest of the three subtypes and OS for C2 being between the two subtypes (fig. 4). Furthermore, the inventors have also formed the corresponding three molecular subtypes after applying the same molecular subtype determination method to the TNBC patient cohort of GSE58812 and GSE21653, and observed significant and similar differences in prognosis among the three subtypes also in the survival analysis (fig. 5A and 5B).
The resulting matrix is illustrated below using ADAM28-ITGA4 as an example (Table 1). A subject's data are entered into the matrix to obtain clustering data, and the subject's subtype can be determined from these clustering data together with the clustering and typing information of the validation sets in the present invention. Exemplary data for other LR pairs are shown in FIG. 24.
Analysis of mutations and Copy Number Variations (CNV) between different triple-negative breast cancer subtypes
The genomic data types integrated by cBioPortal include somatic mutations, copy number alterations, gene expression, and DNA methylation. In this example, the inventors queried and downloaded somatic mutation and copy number alteration data directly from cBioPortal and analyzed them following the cBioPortal procedure. The "maftools" package was used to visualize the mutation data. Differences among subtypes in genes with significant CNV gain or loss were compared using the chi-square test.
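A chi-square comparison of this kind can be sketched in Python as follows. The counts are hypothetical and the function names are illustrative; the closed-form p-value is valid only for 2 degrees of freedom, which is the case for a table of three subtypes by two categories (altered, not altered).

```python
from math import exp


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_df2(statistic):
    """Upper-tail p-value of a chi-square statistic with exactly 2 degrees of freedom."""
    return exp(-statistic / 2)


# Hypothetical counts: rows are subtypes C1, C2, C3; columns are
# [samples with a CNV gain of one gene, samples without it].
counts = [[30, 70], [20, 80], [10, 90]]
chi2 = chi_square_statistic(counts)
p_value = p_value_df2(chi2)
```

For tables with other dimensions the degrees of freedom change and a general chi-square distribution function would be needed instead of the closed form used here.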
The results are as follows.
The inventors considered that different clinical characteristics and genomic mutations may also contribute to the different prognostic outcomes, and therefore analyzed the clinical characteristics of each subtype in the three TNBC datasets. In the METABRIC database, no significant correlation was found between the molecular subtypes and clinical variables such as tumor stage, age and sex, but significant differences were noted in the distribution of the 5 widely accepted intrinsic molecular subtypes of breast cancer (Luminal A, Luminal B, HER2-enriched, Basal-like and Claudin-low) among the three LR pair-based subtypes. Claudin-low samples accounted for a large proportion of the C3 subtype, and Basal-like samples accounted for a large proportion of the C1 subtype. There was also a significant difference in mortality between C1 and C3: more than 60% of the C1 patients died, whereas more than 55% of the C3 patients survived (FIG. 6A). In the GSE58812 cohort, the age distributions of C1 and C3 showed opposite trends, and there were also statistically significant differences in survival status among the three subtypes (FIG. 6B). In the GSE21653 dataset, there were no significant differences in age distribution among the three subtypes, but the proportion of surviving patients differed markedly between C1 and C3, being higher in C3 (FIG. 7).
The top 10 most frequently altered genes among the three subtypes were further shown as waterfall plots, and the top 10 CNV-deleted genes and CNV-amplified genes in these plots showed relatively high mutation rates and mutation diversity in C1 and C2 (FIG. 8).
Functional enrichment analysis
The Hallmark gene sets were retrieved and downloaded from the Molecular Signatures Database (MSigDB). Gene Set Enrichment Analysis (GSEA) was performed on the filtered LR set using the GSEA software, and the most significantly enriched signaling pathways were selected according to the normalized enrichment score (NES), with a screening criterion of false discovery rate (FDR) < 0.05.
The results are as follows.
To further explore the molecular biological differences among the three LR pair-based molecular subtypes, the inventors performed GSEA on all three TNBC datasets. In the METABRIC database, 14 pathways showed significantly increased activity in C1 compared with C3, mainly cell cycle-related signaling pathways such as MYC targets, E2F targets and G2M checkpoint, and cancer-related pathways, as well as glycolysis, hypoxia and others; 11 pathways showed significantly reduced activity, mainly immune-related pathways such as complement, inflammatory response, interferon-alpha response, allograft rejection and interferon-gamma response (FIG. 9A). In the C1 versus C3 comparison across the three TNBC datasets, glycolysis, hypoxia and early estrogen response were significantly up-regulated, while 10 pathways, including apoptosis, TNFA signaling via NF-κB and complement, were significantly down-regulated (FIG. 9B). The inventors further compared the activity of the pathways between the C1 and C2 subtypes and between the C2 and C3 subtypes in the METABRIC cohort and found 6 pathways that were consistently activated, including glycolysis, hypoxia, epithelial-mesenchymal transition, MYC targets, myogenesis, and early and late estrogen responses (FIG. 9C, FIG. 9D).
Immunoassay
The proportions of stromal and immune cells in the tumor samples were inferred from the expression profiles, and immune scores and stromal scores were calculated using the R package "ESTIMATE". The higher the immune score and the stromal score, the higher the proportion of the corresponding component in the TME. The extent of infiltration of 22 immune cell types in TNBC was further quantified by the CIBERSORT algorithm.
The results are as follows.
After running CIBERSORT, an estimated proportion of immune cells for 22 molecular subtypes based on LR pairs was available in the three TNBC cohorts. The Kruskal-Wallis test results show that most immune cells (16 in total) that differ in the estimated ratios between the three LR-pair-based molecular subtypes are in the metabolic cohort, including naive B cells, memory B cells, CD 8T cells, naive CD4T cells, activated CD4 memory T cells, delta-gamma T cells, resting and activated NK cells, M0 macrophages, M1 macrophages, M2 macrophages, resting dendritic cells, activated dendritic cells, resting and activated mast cells, and neutrophils (fig. 10A). There were significant differences in the estimated ratios of naive B cells, naive CD4T cells, activated CD4 memory T cells, delta-gamma T cells, activated NK cells, M0 macrophages, M1 macrophages, M2 macrophages, and activated mast cells in all three molecular LR pair-based subtypes in the TNBC cohort (fig. 10B and 10C). Matrix scores, immune scores and estimated scores were compared between each subtype by the KruskalWallis test. The immune scores between the three molecular subtypes in each cohort showed significant differences, with p values <0.01. The immune/estimated scores between the three molecular subtypes in each cohort also showed highly significant differences, with p values <0.0001. Regardless of which of the three scores, C3 always > C2> C1 (fig. 11A, 11B, 11C).
Risk model construction based on LR pairs
Important genes were screened from the prognosis-related LR pairs, and a risk model was constructed.
The specific screening steps are as follows:
Prognosis-related LR pairs were analyzed using LASSO-penalized Cox regression, in which unimportant LR pairs were eliminated by shrinking the weights of the model parameters, giving the primary-screened LR pairs. The primary-screened LR pairs were then filtered through the stepAIC strategy in the MASS software package. An LR pair scoring model was established using the LR pairs that gave the lowest AIC value in stepAIC, and the coefficient of each LR pair was obtained by multivariate Cox regression analysis.
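The AIC-based filtering step can be sketched as a greedy backward elimination. This is a simplified illustration and not the MASS implementation of stepAIC; the search direction, the `fit_loglik` callback, the pair names and the toy log-likelihood values are all assumptions. In practice the callback would fit a Cox model on the given LR pairs and return its partial log-likelihood.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def backward_select(features, fit_loglik):
    """Drop one feature at a time for as long as doing so lowers the AIC."""
    current = list(features)
    best = aic(fit_loglik(current), len(current))
    while len(current) > 1:
        candidates = []
        for feature in current:
            reduced = [f for f in current if f != feature]
            candidates.append((aic(fit_loglik(reduced), len(reduced)), reduced))
        candidate_aic, candidate = min(candidates, key=lambda c: c[0])
        if candidate_aic >= best:
            break  # no single removal improves the criterion
        best, current = candidate_aic, candidate
    return current, best


# Hypothetical stand-in for a model fit: each pair adds a fixed amount
# to the log-likelihood (illustrative numbers, not results from the source).
TOY_GAIN = {"pair_A": 5.0, "pair_B": 4.0, "pair_C": 0.3, "pair_D": 0.1}


def toy_loglik(features):
    return -100.0 + sum(TOY_GAIN[f] for f in features)


selected, selected_aic = backward_select(list(TOY_GAIN), toy_loglik)
```

In this toy run the two pairs that contribute little to the fit are dropped, because removing each one lowers the AIC by more than it costs in likelihood.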
The results are as follows.
To select the LR pairs best suited to predicting TNBC prognosis, the inventors performed LASSO-Cox regression analysis on the 145 LR pairs selected in the above example. With 10-fold cross-validation, 6 LR pairs were selected because they had non-zero coefficients in the fitted LASSO-Cox regression model (fig. 12). Four of these LR pairs (CXCL9-CCR3, GPI-AMFR, IL18-IL18R1 and PLG-F2RL1) were finally selected by stepAIC multifactor regression analysis, as these 4 LR pairs balanced the statistical fit of the model against the number of parameters used for the fit.
The resulting model formula is:
LR pair score = -0.08996361 × (CXCL9-CCR3 expression level) + 0.27093847 × (GPI-AMFR expression level) - 0.29143116 × (IL18-IL18R1 expression level) + 0.28034741 × (PLG-F2RL1 expression level).
Here, "CXCL9-CCR3 expression level" means the expression level of CXCL9 + the expression level of CCR3, and likewise for the other LR pairs.
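As a minimal sketch, the formula can be evaluated for one sample as follows. The coefficients are those given in the formula; the function name, the dictionary layout and the example expression values are assumptions made for illustration.

```python
# Coefficients of the four LR pairs, taken from the model formula.
COEFFICIENTS = {
    ("CXCL9", "CCR3"): -0.08996361,
    ("GPI", "AMFR"): 0.27093847,
    ("IL18", "IL18R1"): -0.29143116,
    ("PLG", "F2RL1"): 0.28034741,
}


def lr_pair_score(expression):
    """Compute the LR pair score from a gene -> expression level mapping.

    The expression level of an LR pair is the ligand expression level
    plus the receptor expression level.
    """
    score = 0.0
    for (ligand, receptor), beta in COEFFICIENTS.items():
        pair_expression = expression[ligand] + expression[receptor]
        score += beta * pair_expression
    return score


# Hypothetical expression levels for a single sample (illustrative only).
sample = {
    "CXCL9": 1.0, "CCR3": 1.0,
    "GPI": 1.0, "AMFR": 1.0,
    "IL18": 1.0, "IL18R1": 1.0,
    "PLG": 1.0, "F2RL1": 1.0,
}
sample_score = lr_pair_score(sample)
```

A missing gene raises a `KeyError`, which makes incomplete expression profiles fail loudly rather than silently lowering the score.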
In essence, the above model formula gives the risk score of each patient as LR pair score = Σ βi × Expi, where Expi is the expression level of LR pair i and βi is the coefficient of that LR pair in the multivariate Cox regression. After z-score treatment, patients can be classified into a high-risk group (high LR pair score group) and a low-risk group (low LR pair score group) using "0" as the threshold. When the model is used for prognosis analysis, survival curves can further be drawn by the Kaplan-Meier method to show the prognostic risk visually, with the significance of the difference determined by the log-rank test.
Of course, it should be understood that the LR pair score used in the embodiments of the present invention is essentially a continuous variable, and a cut-off value may be defined as the situation requires when division into graded categories is needed. In the present embodiment, the threshold is set to "0".
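The grouping and survival-curve steps can be sketched as below. This is an illustrative sketch under stated assumptions, not the inventors' code: the function names and toy data are hypothetical, the z-score uses the population standard deviation (the choice of standard deviation does not change which side of 0 a sample falls on), and the log-rank test is not implemented.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def split_by_threshold(scores, threshold=0.0):
    """Label each sample 'high' if its z-scored LR pair score exceeds the threshold."""
    return ["high" if z > threshold else "low" for z in zscore(scores)]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival probability)] at each event time.

    events[i] is 1 if the event (death) was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored samples both leave the risk set
    return curve


# Hypothetical LR pair scores and follow-up data (illustrative only).
groups = split_by_threshold([1.0, 2.0, 3.0, 6.0])
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
```

In practice one curve would be estimated per group, and the two curves would then be compared with the log-rank test.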
The coefficients of these predictors in the Cox regression model are shown in fig. 13. Based on the 4 LR pairs, an LR pair scoring model was constructed, and the resulting LR pair score was used for quantitative analysis of the LR pair patterns of TNBC samples. The inventors also found that the LR pair scores of the C1 subtype were significantly higher than those of the C2 and C3 subtypes in the METABRIC, GSE58812 and GSE21653 cohorts (fig. 14A, 14B, 14C). To analyze the clinical relevance of the LR pair score, the TNBC samples in each cohort were divided into two groups according to LR pair score. In the METABRIC cohort, patients with low LR pair scores showed significantly more favorable survival outcomes (fig. 15A). The area under the time-dependent ROC curve (AUC) of the LR pair score was 0.72, 0.63, 0.65 and 0.66 at 1, 3, 5 and 10 years, respectively (fig. 15B). The reliability of the LR pair score was further verified using 107 samples from GSE58812 and 83 samples from GSE21653; in both validation sets, samples with high LR pair scores showed higher mortality and shorter survival time (fig. 15C, 15D). In the GSE58812 validation set, the AUC values of the LR pair scoring model were 0.72, 0.75 and 0.67 at 3, 5 and 10 years, respectively (fig. 16A). The LR pair scoring model performed better in the GSE21653 validation cohort, with AUCs of 0.90, 0.87 and 0.78 for 1-year, 3-year and 5-year survival, respectively (fig. 16B).
Furthermore, univariate Cox regression analysis in METABRIC showed that stage, age and LR pair score were significantly correlated with the prognosis of TNBC (fig. 17A). In the multivariate Cox regression model, all of these factors could be considered independent prognostic factors for TNBC (fig. 17B).
Correlation between LR pair scores and immune composition and immune-related pathways
To identify the pathways most relevant to the LR pair score, the inventors further analyzed the METABRIC samples using the R package "GSVA".
The results are as follows.
Single-sample GSEA (ssGSEA) scores of different functions were obtained for the METABRIC samples with "GSVA", and 30 pathways significantly correlated with the LR pair score were identified by Pearson correlation analysis. Of these, 2 pathways were positively correlated and 28 pathways were negatively correlated with the LR pair score. The ssGSEA scores of immune-related pathways (e.g., chemokine signaling pathway, antigen processing and presentation, natural killer cell-mediated cytotoxicity, toll-like receptor signaling pathway and T cell receptor signaling pathway) were significantly negatively correlated with the LR pair score (fig. 18A). Further analysis of the relationship between the LR pair score and tumor immune components revealed that, in general, the 22 immune cell types differed significantly between the high LR pair score samples and the low LR pair score samples (fig. 18B).
Furthermore, Pearson correlation analysis between the LR pair score and immune cells (fig. 18C) showed a significant negative correlation between the LR pair score and CD8 T cells, activated CD4 memory T cells and macrophages, but a positive correlation with M0 macrophages and M2 macrophages (fig. 18D), indicating a correlation between the LR pair score and tumor immunity.
Performance of the LR pair scoring model in predicting clinical treatment response
The relationship between the LR pair score and the expression levels of immune checkpoint genes was assessed by the Wilcoxon test and visualized as box plots. Tumor Immune Dysfunction and Exclusion (TIDE) predicts the response of a sample to immune checkpoint blockade (ICB) treatment by modeling the gene expression signatures of two tumor immune escape mechanisms.
The inventors downloaded drug sensitivity data for about 1000 cancer cell lines from Genomics of Drug Sensitivity in Cancer (GDSC) (http://www.cancerrxgene.org; GDSC is the largest public resource for cancer cell drug sensitivity and molecular markers of drug response). Data for breast cell lines were mainly downloaded, giving a total of 50 cell lines treated with 190 drugs. Using the area under the curve (AUC) of each antitumor drug in the tumor cell lines as the drug response index, the correlation between drug sensitivity and LR pair score was calculated by Spearman correlation analysis. The adjusted FDR was calculated using the Benjamini-Hochberg method. Correlations with |Rs| > 0.2 and FDR < 0.05 were considered statistically significant. In addition, the half-maximal inhibitory concentrations (IC50) of paclitaxel, veliparib, olaparib and talazoparib, antitumor drugs recommended in TNBC treatment, were estimated using the pRRophetic software package and compared between the LR pair score groups.
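The correlation-and-filtering procedure can be sketched as follows. This is a minimal illustration, not the inventors' pipeline: the function names, drug labels and numbers are hypothetical, and the per-drug p-values are taken as given inputs rather than computed from the Spearman coefficient.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(results, min_abs_rs=0.2, max_fdr=0.05):
    """Keep (drug, Rs, FDR) rows with |Rs| > min_abs_rs and FDR < max_fdr."""
    fdr = benjamini_hochberg([p for _, _, p in results])
    return [(drug, rs, q) for (drug, rs, _), q in zip(results, fdr)
            if abs(rs) > min_abs_rs and q < max_fdr]


# Hypothetical (drug, Rs, raw p-value) rows, for illustration only.
rows = [("drug_1", 0.45, 0.01), ("drug_2", 0.10, 0.04),
        ("drug_3", -0.30, 0.03), ("drug_4", 0.25, 0.005)]
kept = significant(rows)
```

Here `drug_2` is dropped because its correlation is too weak, even though its adjusted p-value is below 0.05.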
The results are as follows.
In view of the association between the LR pair score and tumor immunity disclosed in the above examples, the inventors further analyzed the association between the LR pair score and immune checkpoint genes. In terms of expression levels, 18 of the 19 immune checkpoints differed between the two LR pair score groups, with higher levels in the high LR pair score group (fig. 19). The high LR pair score group also showed a significantly up-regulated T cell exclusion score and a significantly down-regulated T cell dysfunction score compared with the low LR pair score group, whereas the TIDE score did not differ significantly between the two groups (fig. 20).
Further, the inventors examined the ability of the LR pair score to predict the response to immune checkpoint inhibitor (ICI) treatment in the immunotherapy cohort IMvigor210 (anti-PD-L1). Samples with stable disease (SD) and progressive disease (PD) were found to have significantly higher LR pair scores than samples with complete response (CR) and partial response (PR) (fig. 21A). Samples treated with anti-PD-L1 were divided into a low LR pair score group and a high LR pair score group. In the IMvigor210 cohort, the prognosis of the high LR pair score samples was again significantly worse than that of the low LR pair score samples (fig. 21B). The proportion of patients responding positively to anti-PD-L1 treatment was significantly higher among patients with low LR pair scores than among patients with high LR pair scores (fig. 22).
The GDSC database stores treatment response data for various anti-cancer drugs, as well as the gene expression profiles of a large number of cancer cell lines. The inventors conducted Spearman correlation analysis on the GDSC data and found that the LR pair score was significantly correlated with the treatment response to 29 drugs, as measured by the area under the drug sensitivity curve (AUC). Of these, 28 correlations were positive, indicating that a high LR pair score in tumors was associated with resistance to these drugs (fig. 23A). Furthermore, comparison of the IC50 estimates of paclitaxel, veliparib, olaparib and talazoparib between the two LR pair score groups showed that the IC50 values of the four drugs were significantly lower in the low LR pair score group than in the high LR pair score group, indicating that the low LR pair score group may be more sensitive to treatment with the four drugs (fig. 23B).
In conclusion, by performing TNBC survival analysis on 2293 LR pairs, the inventors screened out 145 LR pairs significantly related to TNBC prognosis, and then obtained three LR pair-based subtypes of TNBC by unsupervised clustering of the expression of these 145 LR pairs. Of the three subtypes, C1 had the worst prognosis: the proportion of the basal-like subtype, the most aggressive breast cancer subtype, was significantly higher in the C1 group than in the other two groups, and the C1 group also had the highest proportion of deaths among the corresponding clinical characteristics. In addition, the C1 group showed the lowest anti-tumor immune response, with lower tumor-infiltrating lymphocytes (naive B cells, CD8 T cells, naive CD4 T cells), stromal scores and immune scores, which may be responsible for the poor prognosis of the C1 subtype.
In addition to typing TNBC based on the 145 LR pairs, the present invention also performed LASSO regression and Cox analysis on the 145 LR pairs and selected 4 LR pairs to construct an LR pair scoring model. The significance of the LR pair scoring model for prognostic evaluation was confirmed in the TCGA data set and both GEO data sets. In this model, samples with high LR pair scores showed significantly shorter survival times than samples with low LR pair scores. It is well known in the art that chemokine signaling pathways promote anti-tumor responses of the immune system by recruiting immune cells; that antigen processing and presentation, as the initiation of the adaptive immune response, plays a key role in anti-tumor immunity; that the strength of the T cell receptor signaling pathway is a key determinant of T cell-mediated anti-tumor responses; that natural killer cell-mediated cytotoxicity is an important effector mechanism of the immune system against cancer; and that activation of toll-like receptor signaling pathways can be used to enhance immune responses against malignant cells. The present invention verifies that the LR pair score is not only significantly negatively correlated with chemokine signaling pathways, antigen processing and presentation, T cell receptor signaling pathways, natural killer cell-mediated cytotoxicity and toll-like receptor signaling pathways, but also reflects the stromal score, the immune score, and the infiltration of CD8 T cells, activated CD4 memory T cells and macrophages. Furthermore, there was no significant difference in the TIDE score between the high and low LR pair score groups, so immune escape may have no significant effect on the LR pair score. Taking all the above results together, it can be considered that TNBC samples with high LR pair scores do not have strong anti-tumor immunity.
Various ligands expressed by cancer cells bind to cell surface receptors on immune cells, trigger inhibitory pathways (e.g., PD-1/PD-L1) and promote immune tolerance of the immune cells. In the present invention, the inventors validated the ability of the LR pair score (based on the scoring model of 4 LR pairs) to predict the response to immune checkpoint inhibitor (ICI) treatment using an anti-PD-L1 cohort. Patients with complete or partial remission of the disease were found to have significantly lower LR pair scores than patients with stable or progressive disease. The clinical benefit of anti-PD-L1 treatment was significantly greater in patients with low LR pair scores than in patients with high LR pair scores, which confirms the effectiveness of the LR pair scoring model in predicting the response to anti-PD-L1 treatment.
Some molecularly targeted antitumor drugs can prevent resistance of cancer to immunotherapy, but a stable therapeutic effect cannot be achieved with single-drug therapy alone, whereas combining antitumor drugs with ICI immunotherapy can greatly improve the prognosis of patients. In the present example, the inventors identified 29 significant correlations between the LR pair score and drug sensitivity in the GDSC database by Spearman correlation analysis, of which 28 showed a significant positive correlation between the area under the drug sensitivity curve (AUC) and the LR pair score (only Wnt-C59 showed sensitivity correlated with the LR pair score). This suggests that resistance to these drugs is associated with the LR pair score, and that targeted drug development based on these LR pairs may effectively yield highly potent drugs against resistance.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (10)
1. A method for constructing a risk assessment model based on LR pairs, comprising the following steps:
(1) analyzing the LR pairs related to the prognosis of triple negative breast cancer by LASSO-penalized Cox regression, and eliminating unimportant LR pairs by shrinking the weights of the model parameters to obtain primary-screened LR pairs;
(2) filtering the primary-screened LR pairs through the stepAIC strategy in the MASS software package to obtain the target LR pairs with the lowest stepAIC value;
(3) obtaining the coefficient of each target LR pair through multivariate Cox regression analysis, and constructing the risk assessment model.
2. The method of claim 1, wherein the LR pairs associated with prognosis of triple negative breast cancer in step (1) comprise: APOB-ENO1, CXCL12-ITGA4, GPI-AMFR, MUC7-SELL, SELPLG-SELL, PODXL-SELL, BSG-SLC16A7, CD22-PTPRC, PTPRC-CD22, PPBP-CXCR1, IL7-IL7 3245 zxft 3219-CCR 7, SERPING1-SELP, CCL16-CCR2, PLG-F2RL1, CXCL13-ACKR4, IL11-IL6ST, ICOS-ICOSLG, ICOSLG-ICOS, CCL19-ACKR4, CXCL3-CXCR1, ICAM4-ITGA4 CCL16-CCR5, CD2-CD48, CD48-CD2, TNFSF13B-TNFRSF13B, ADAM-ITGA 4, HBEGF-ERBB2, CALR-SCARF1, CXCL13-CXCR5, LGI3-STX1A, CD-CD 48, CD48-CD244, HLA-A-KIR3DL1, HLA-B-KIR3DL1, HLA-F-KIR3DL1, LGI3-FLOT1, VEGFA-GPC1, EFNA4-EPHA5, QRFP-P2RY14, FGF4-FGFRL1, FGFRL1 CXCL8-CXCR1, NMS-NMUR2, GLG1-SELE, LY9-LY9, EDN3-KEL, CXCL13-CCR10, CCL19-CXCR3, CCL21-CXCR3, ADM-RAMP2, CFH-SELL, CXCL12-ITGB1, CD34-SELL, NMB-BRS3, CCL21-CCR7, PODXL2-SELL, SERPING1-SELE, CLEC2B-KLRF1, KLRF1-CLEC2B, CXCL-CXCR 3, SEMA4D-MET, ADM-GPRX 2, EBI3-IL6ST, SELP-IL 7 zxft 3536 2M-CD1B, ADAM-ITGA 4, WNT8A-FZD5, ADAM7-ITGB7, GNRH 1-MRRH 35BR 7, RHG-CALR 1-CCL 19, RCH 2-CCR 2, CCL 2-CCR 3, CCL 2-TIC 3, CLC 2-TIR 4, SEMA4D-MET 2, CRF 2, CRL 3, CRF 2-CCR 7, CRF 2-OCR 7, CRF 2, CRL 7, PRLL 3-OCR 7, CRL 7, CRS 2-OCR 7, and CRS 2, DKK2-LRP6, SERPINE1-LRP1, EFNA4-EPHA6, NMB-NMBR, MMP7-ERBB4, POMC-OPRD1, CLEC2D-KLRB1, KLRB1-CLEC2D, GDF-TDGF 1, GUCA2A-GUCY2C, IL-KCNJ 10, FGF3-FGFRL1, LRAP 1-SORT1, CXCL11-CXCR3, WNT9A-FZD10, APOB-LSR, CD70-CD27, ANGPTL1-TEK, TNF-TNFRSF1B, CXCL-CD 4, NECTN 2-TIIT, NECTN 4-TIIT, TIGIN 2, TIGN-LG-FAS, FASMC-POMC 3 25 zitf 4325-LRB 4325, APP 1-LRP 22, APP 22-PLRA 22-CCL 22-SEL 22-L22-SEL TNFSF13-FAS, CD160-TNFRSF14, TNFRSF14-CD160, FGF6-FGFR1, SPP1-ITGB1, MADCAM1-ITGA4, OXT-AVPR1A, CXCL-CXCR 3, CXCL5-ACKR1, BTLA-TNFRSF14, TNFRSF14-BTLA, CCL5-CCR3, TAC3-TACR1, CXCL10-SDC4, VEGFA-KDR, EFNA1-EPHA5, CCL5-ACKR1, EFNA1-EPHA2, DKK4-KREMEN2, POMC-MC2R, GNAS-PTGDR, CCL13-CCR2, IL18-IL18R1, MYL9-CD69, CXCL6-CXCR1, RSPO 3-CD 6, LRPAMMBN-CD 63, CALR-ITHR, CALCM 3-TSICA 1, and NMCLL 3-NMURDLL 4.
3. The construction method according to claim 1, wherein the target LR pairs in step (3) are: CXCL9-CCR3, GPI-AMFR, IL18-IL18R1 and PLG-F2RL1.
4. The construction method according to claim 1, wherein the risk assessment model is:
LR pair score = -0.08996361 × (CXCL9-CCR3 expression level) + 0.27093847 × (GPI-AMFR expression level) - 0.29143116 × (IL18-IL18R1 expression level) + 0.28034741 × (PLG-F2RL1 expression level).
5. Use of the risk assessment model constructed by the construction method according to any one of claims 1 to 4 in the assessment of the treatment effect of triple negative breast cancer.
6. The use of claim 5, wherein the lower the LR pair score obtained by the risk assessment model, the better the therapeutic effect of an antitumor drug on triple negative breast cancer; conversely, the higher the LR pair score, the poorer the therapeutic effect of the antitumor drug on triple negative breast cancer; wherein a better therapeutic effect is manifested as reduced resistance to the antitumor drug and an enhanced treatment response.
7. Use of the risk assessment model constructed by the construction method according to any one of claims 1 to 4 in the prognosis effect assessment of triple negative breast cancer.
8. The use of claim 7, wherein the lower the LR pair score obtained by the risk assessment model, the higher the survival rate and the longer the survival time of the subject; conversely, the higher the LR pair score, the lower the survival rate and the shorter the survival time of the subject.
9. The use of claim 6 or 8, wherein the magnitude of the LR pair score is defined based on z-score treatment of the output of the risk assessment model, wherein a threshold is defined for the z-score treatment, and wherein an LR pair score greater than the threshold is defined as a high LR pair score and otherwise as a low LR pair score.
10. The use according to claim 9, wherein the threshold is 0.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210993850.0A CN115424669B (en) | 2022-08-18 | 2022-08-18 | LR score-based triple negative breast cancer curative effect and prognosis evaluation model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115424669A true CN115424669A (en) | 2022-12-02 |
CN115424669B CN115424669B (en) | 2023-06-13 |
Family
ID=84198067
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210993850.0A Active CN115424669B (en) | LR score-based triple negative breast cancer curative effect and prognosis evaluation model | 2022-08-18 | 2022-08-18 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115424669B (en) |
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016186582A2 (en) * | 2015-05-19 | 2016-11-24 | Amwise Diagnostics Pte. Ltd. | Gene expression profiles and uses thereof in breast cancer |
EP3298167A2 (en) * | 2015-05-19 | 2018-03-28 | Amwise Diagnostics Pte. Ltd. | Gene expression profiles and uses thereof in breast cancer |
CN110382712A (en) * | 2017-01-24 | 2019-10-25 | 基因技术有限公司 | The improved method of risk for assessment development breast cancer |
US20180246108A1 (en) * | 2017-02-21 | 2018-08-30 | University Of South Carolina | Left-Right Gene Expression Signature for Triple Negative Breast Cancer |
US20210177895A1 (en) * | 2018-08-24 | 2021-06-17 | Yeda Research And Development Co. Ltd. | Methods of modulating m2 macrophage polarization and use of same in therapy |
CN113227359A (en) * | 2018-08-24 | 2021-08-06 | 耶达研究及发展有限公司 | Methods of modulating polarization of M2 macrophages and uses thereof in therapy |
CN110917353A (en) * | 2019-11-20 | 2020-03-27 | 中国人民解放军陆军军医大学 | Reagent for resisting premature senility escape of colorectal cancer cells and application thereof |
CN114300139A (en) * | 2022-01-13 | 2022-04-08 | 澳门科技大学 | Construction of breast cancer prognosis model, application method and storage medium thereof |
CN114480650A (en) * | 2022-02-08 | 2022-05-13 | 深圳市陆为生物技术有限公司 | Marker and model for predicting three-negative breast cancer clinical prognosis recurrence risk |
CN114496066A (en) * | 2022-04-13 | 2022-05-13 | 南京墨宁医疗科技有限公司 | Construction method and application of gene model for prognosis of triple negative breast cancer |
CN114891887A (en) * | 2022-05-13 | 2022-08-12 | 西安交通大学 | Method for screening triple negative breast cancer prognosis gene marker |
Non-Patent Citations (2)
Title |
---|
WAKS AG et al.: "Breast cancer treatment: A review" *
王常; 彭理; 潘博; 周易冬; 茅枫; 林燕; 孙强: "Prognostic value of leptin receptor in early breast cancer", no. 12 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116312802A (en) * | 2023-02-01 | 2023-06-23 | 中国医学科学院肿瘤医院 | Screening method of triple negative breast cancer prognosis characteristic gene and application thereof |
CN116312802B (en) * | 2023-02-01 | 2023-11-28 | 中国医学科学院肿瘤医院 | Application of characteristic gene TRIM22 in preparation of reagent for regulating and controlling breast cancer related gene expression |
CN117153241A (en) * | 2023-09-21 | 2023-12-01 | 浙江省肿瘤医院 | Prediction model of triple negative breast cancer prognosis effect and application thereof |
Also Published As
Publication number | Publication date |
---|---|
CN115424669B (en) | 2023-06-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||