CN112085067A - Method for high-throughput screening of DNA damage response inhibitor - Google Patents

Publication number
CN112085067A
Authority
CN
China
Prior art keywords
cell nucleus
image
network model
dna damage
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010829597.6A
Other languages
Chinese (zh)
Other versions
CN112085067B (en
Inventor
王毅
王锐
荀德金
陈雪纯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202010829597.6A
Publication of CN112085067A
Application granted
Publication of CN112085067B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/136 Segmentation; Edge detection involving thresholding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/155 Segmentation; Edge detection involving morphological operators
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50 Molecular design, e.g. of drugs
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/10 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30096 Tumor; Lesion

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Epidemiology (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a method for high-throughput screening of DNA damage response inhibitors, comprising the following steps: S1, training a cell nucleus segmentation network model based on the U-Net network; S2, constructing and training a cell nucleus type judgment network model; S3, imaging cells treated with a DNA damage response inhibitor using high-content imaging equipment to obtain an image to be analyzed, inputting the image to be analyzed into the cell nucleus segmentation network model and then into the cell nucleus type judgment network model, and counting the proportion of damaged cell nuclei for each DNA damage response inhibitor, wherein the smaller the proportion of damaged cell nuclei, the better the effect of the inhibitor. The method can automatically segment and classify, in batches, the images acquired by high-content imaging equipment, and can preliminarily screen out compounds worth further study through statistical analysis.

Description

Method for high-throughput screening of DNA damage response inhibitor
Technical Field
The invention relates to the technical field of DNA damage, drug screening and deep learning, in particular to a method for screening a DNA damage response inhibitor in a high-throughput manner.
Background
DNA damage is caused when organisms are subjected to various endogenous and exogenous factors (e.g., reactive oxygen species, DNA replication errors, ultraviolet radiation, ionizing radiation and genotoxic agents). The accumulation of DNA damage has been shown to be closely related to organ aging and cancer progression.
Although it remains an open question whether inhibiting DNA damage or optimizing the DNA repair process slows aging in humans, evidence suggests that preventing DNA damage and promoting DNA repair are key therapeutic targets for age-related diseases, including vascular, metabolic and neurodegenerative diseases.
In addition, DNA damage response (DDR) inhibitors are also useful in the treatment of cancer, because tumor tissue is highly likely to accumulate DNA damage. Therefore, the development of a rapid and accurate high-throughput screening method for DDR inhibitors has important academic value.
The occurrence of nuclear foci is a common indicator of DNA damage and has wide applications in biometrics, individual radiosensitivity assessment, and toxicity assessment. The formation of nuclear foci is caused by the accumulation or modification of certain DDR proteins at double strand breaks.
DDR proteins include γH2AX, 53BP1, RAD51, the MRE11/RAD50/NBS1 complex and the like. Foci can be visualized under a fluorescence microscope by immunofluorescence, immunohistochemical analysis or labeling with fluorescent proteins. In general, the number of foci is closely related to the radiation dose, and researchers can quantify DNA damage by counting the foci per nucleus or per DNA region.
Several automated methods that allow batch processing exist, but they are not satisfactory in all situations.
Among current open-source software, FoCo has a friendly graphical user interface, but because brightness varies between individual cells and between acquisition batches, its intensity parameters must be adjusted manually, which often introduces large errors.
Focinator is an ImageJ-based macro that detects foci using only a maximum criterion, and it has limitations similar to those of FoCo.
FindFoci allows parameters to be trained manually, but manually marking foci is laborious and error-prone, especially when background interference is large. In addition, when the cell density is high, some cell nuclei adhere to each other and cannot be separated well by threshold segmentation.
Therefore, a method capable of processing a large amount of image data acquired by a high-content imaging platform in batch, performing image segmentation rapidly and accurately, determining whether cell nuclei are damaged, and finally performing drug screening by using statistical analysis is urgently needed.
Disclosure of Invention
The invention provides a method for screening a DNA damage response inhibitor in a high-throughput manner, which can automatically perform single cell nuclear segmentation and class decision on images acquired by high content imaging equipment in batches, and can preliminarily screen out compounds with further research value through statistical analysis.
A method for high-throughput screening of DNA damage response inhibitors comprises the following steps:
S1, training a cell nucleus segmentation network model based on the U-Net network;
S2, constructing and training a cell nucleus type judgment network model;
S3, imaging the cells treated with the DNA damage response inhibitor using high-content imaging equipment to obtain an image to be analyzed; inputting the image to be analyzed into the cell nucleus segmentation network model and then into the cell nucleus type judgment network model; and counting the proportion of damaged cell nuclei for each DNA damage response inhibitor, wherein the smaller the proportion of damaged cell nuclei, the better the effect of the inhibitor.
The cell nucleus segmentation model can automatically segment images captured by different models of high-content imaging platforms, adapts automatically to images from different batches, improves the separation of adhered cell nuclei, and improves segmentation robustness when background interference is large.
The cell nucleus segmentation model uses a deep learning method and adopts the U-Net network architecture, which comprises an encoder and a decoder. The encoder automatically extracts features; as the number of layers increases, the extracted features become more abstract and reflect higher-dimensional information. Its input is an image acquired by a high-content imaging platform, and its output is the extracted feature map.
The decoder gradually restores the details and spatial dimensions of the objects, and shortcut connections between the encoder and the decoder help the decoder restore target details better. Its input is the feature map extracted by the encoder, and its output is a mask image of the same size as the input image. The mask image can be used to segment individual nuclei in the image.
The cell nucleus type judgment model can judge with high accuracy whether the nucleus in an input single cell nucleus image is damaged, which reduces the time consumed by manual counting, and it can output the probability of each nucleus belonging to each type to facilitate subsequent statistical analysis.
The cell nucleus type judgment model uses a deep learning method with the VGG-19 network architecture. Convolutional layers extract features and pooling layers downscale the image; after several groups of convolution and pooling, higher-dimensional feature information is obtained and finally used to classify the image. The input is a single cell nucleus image, and the output is the type judgment result for that nucleus.
The method statistically analyzes the nucleus type judgment results and compares them with those of the control group and the positive drug: the proportion of damaged nuclei in the images acquired for each DNA damage response inhibitor is calculated, and the inhibitors are then ranked. The top-ranked compounds are selected for subsequent efficacy verification experiments, such as dose-effect curve experiments and comet assays.
Compared with the prior art, the invention has the following effects:
The method for high-throughput screening of DNA damage response inhibitors can automatically process images from different drug sources with high accuracy, processes in batches the large amount of image data acquired by a high-content imaging platform, and provides a foundation for drug efficacy experiments.
Drawings
FIG. 1 is a general flow chart of the method for high-throughput screening of DNA damage response inhibitors according to the present invention, in which FociNet refers to the cell nucleus segmentation model and the cell nucleus type judgment model.
FIG. 2 is a schematic diagram of a network architecture of a nuclear segmentation model according to the present invention.
Fig. 3 is a schematic diagram of a network architecture of the cell nucleus type determination model according to the present invention.
Fig. 4 is a flow chart of DDR drug screening using the established cell nucleus segmentation model and cell nucleus type determination model in the present invention, wherein FociNet in the figure refers to the cell nucleus segmentation model and the cell nucleus type determination model.
Detailed Description
The technical solution of the present invention is further described in detail below with reference to the accompanying tables and examples. The following examples are carried out on the premise of the technical scheme of the invention, and detailed embodiments and processes are given, but the scope of the invention is not limited to the following examples.
As shown in FIG. 1, this example provides a method for high throughput screening of DNA damage response inhibitors.
S1, training the cell nucleus segmentation network model based on the U-Net network.
The cell nucleus segmentation network model is used for carrying out image segmentation on image data shot by the high content imaging equipment to obtain a mask image corresponding to the image data, and then the mask image is used for cutting the original image to obtain a single cell nucleus image.
S11, constructing a first U-Net network and a second U-Net network.
Each U-Net network comprises an encoder and a decoder with shortcut connections between them. The encoder comprises 4 to 5 sub-blocks. Every sub-block except the last comprises two convolutional layers and a pooling layer, uses elu as the activation function, and has a Dropout layer between the two convolutional layers; the last sub-block comprises only two convolutional layers, with a Dropout layer between them and elu as the activation function.
The encoder automatically extracts features; as the number of layers increases, the extracted features become more abstract and reflect higher-dimensional information. The encoder input is an image acquired by a high-content imaging platform, and its output is the extracted feature map. The decoder gradually restores the details and spatial dimensions of the objects, and the shortcut connections between the encoder and the decoder help the decoder restore target details better. The decoder input is the feature map extracted by the encoder, and its output is a mask image of the same size as the input image.
The number of sub-blocks in the encoder affects the segmentation performance of the model. With too few sub-blocks, the model is insufficiently trained and cannot extract higher-dimensional features; with too many, training is slow and the model accumulates redundant parameters. The number of sub-blocks is typically 4 to 5.
As shown in fig. 2, the structure of the encoder according to this embodiment includes 5 sub-blocks, each of which includes a certain number of convolutional layers and pooling layers.
The first sub-block contains two convolutional layers and a pooling layer, with elu being used as the activation function, with a Dropout layer added between the two convolutional layers to randomly drop some features during the training process to prevent over-fitting and increase the robustness of the model. The convolutional layer can be used for extracting features, and the pooling layer is used for scaling the image to extract higher-dimensional features;
similarly, the second sub-block, the third sub-block, and the fourth sub-block also include two convolutional layers and a pooling layer, with elu being used as the activation function, and a Dropout layer being added between the two convolutional layers;
the fifth subblock contains only two convolutional layers, again with a Dropout layer added between them, using elu as the activation function.
The decoder in this embodiment comprises 4 sub-blocks, each containing a transposed convolutional layer and a shortcut connection layer. The transposed convolutional layer scales the feature map back up to the size of the preceding level, and the shortcut connection layer joins the encoder feature map of the corresponding size to the upscaled output of the transposed convolutional layer; this information sharing helps the decoder restore target details better. The decoder follows the encoder. In each sub-block, a transposed convolutional layer first rescales the features, the result is joined with the encoder feature map of the corresponding size, and two convolutional layers follow, with a Dropout layer between them and elu as the activation function. Finally, one more convolutional layer after the fourth sub-block outputs the final mask image.
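The pairing of encoder and decoder sub-blocks can be checked with a small size trace. The following is only a sketch under stated assumptions: the text does not give the pooling window or the stride of the transposed convolutions, so a halving per pooling layer and a doubling per transposed convolution are assumed, as is size-preserving padding in the convolutional layers. The function name `unet_feature_sizes` is hypothetical.

```python
def unet_feature_sizes(input_size=512, encoder_blocks=5):
    """Return the spatial size handled by each encoder and decoder sub-block."""
    encoder = []
    size = input_size
    for block in range(encoder_blocks):
        encoder.append(size)              # size seen by the two conv layers
        if block < encoder_blocks - 1:    # the last sub-block has no pooling
            size //= 2                    # ASSUMPTION: pooling halves the size
    decoder = []
    # One decoder sub-block per pooling step; each one is paired with the
    # encoder feature map of the same size through the shortcut connection.
    for skip_size in reversed(encoder[:-1]):
        size *= 2                         # ASSUMPTION: transposed conv doubles the size
        assert size == skip_size          # sizes must match before joining
        decoder.append(size)
    return encoder, decoder


encoder_sizes, decoder_sizes = unet_feature_sizes()
```

With 5 encoder sub-blocks there are 4 pooling steps, which is why 4 decoder sub-blocks are enough to return to the input size.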
After the encoder and decoder structures are built, they learn from the input samples, that is, their parameters are optimized, which yields an encoder and a decoder capable of cell nucleus segmentation. Because images acquired by high-content imaging have no corresponding mask images, and manual labeling consumes a large amount of time, existing public data sets were searched for first, and two training sets were found.
S12, the DATA-SCIENCE-BOWL-2018 data set is first passed through a resize function and then used to train the first U-Net network.
The DATA-SCIENCE-BOWL-2018 data set is derived from https: com/kamalkraj/DATA-SCIENCE-BOWL-2018/tree/master/dada. Its backgrounds differ greatly and its image sources are complex, so it can be used to train a coarse network that suppresses cases with large background interference. The loss function is a cross entropy loss function.
Images captured by high-content imaging platforms from different sources differ in size, because platforms and experimenters' settings differ, so a resize function is added before the U-Net network model. If the length and width of the original image are unequal, the resize function first crops the image so that its length and width are equal; the cropped image is then scaled, so that all images are unified to a size of 512 before being input into the U-Net network model.
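A minimal sketch of such a resize step is given below. The text does not say where the crop is taken or which interpolation is used, so a centered crop and nearest-neighbor sampling are assumptions made here for illustration, and the image is represented as a plain list of pixel rows. All function names are hypothetical.

```python
def center_crop_square(image):
    """Crop a 2-D image (list of rows) to its largest centered square."""
    height, width = len(image), len(image[0])
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return [row[left:left + side] for row in image[top:top + side]]


def scale_nearest(image, size):
    """Scale a square image to size x size with nearest-neighbor sampling."""
    side = len(image)
    return [
        [image[r * side // size][c * side // size] for c in range(size)]
        for r in range(size)
    ]


def resize(image, size=512):
    """Crop to equal width and height, then scale to size x size."""
    return scale_nearest(center_crop_square(image), size)


# Example: a 600 x 1000 single-channel image becomes 512 x 512.
example = [[(r + c) % 256 for c in range(1000)] for r in range(600)]
resized = resize(example)
```

In practice an image library with smoother interpolation would likely be used; the point of the sketch is only the order of operations, crop first and scale second.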
S13, the BBBC039 image data set is first passed through a resize function and then used to train the second U-Net network.
In the BBBC039 data set, the contrast between the background and the region of interest is very pronounced. Because the input is first preprocessed by the network trained on the former data set, the poor applicability of a network trained only on the latter data set under large background interference is avoided; in addition, many cell nuclei in the latter set of images adhere to each other, which makes it very suitable for this segmentation scenario. The model trained with this network performs a finer segmentation of the image preprocessed by the previous network, and finally yields the mask image of the original input image.
An image input into the cell nucleus segmentation network model first passes through the first U-Net network to obtain a first mask image; the image preprocessed with the first mask image then passes through the second U-Net network to obtain a second mask image; and after the image is multiplied by the second mask image, the positions of all pixels of each connected region are obtained through a connected-component algorithm, and each connected region is then cut out separately.
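The two-stage flow can be sketched as follows. This is an illustration under assumptions, not the patented implementation: it assumes that the first mask is applied to the image by pixel-wise multiplication before the second network sees it, and it replaces the trained U-Net networks with a hypothetical stand-in (`threshold_net`) so that the example runs on its own.

```python
def apply_mask(image, mask):
    """Multiply an image by a binary mask, pixel by pixel."""
    return [[p * m for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


def segment_cascade(image, first_net, second_net):
    """Run two segmentation stages in series; return masked image and mask."""
    first_mask = first_net(image)
    coarse = apply_mask(image, first_mask)   # ASSUMPTION: background suppressed by multiplication
    second_mask = second_net(coarse)
    return apply_mask(image, second_mask), second_mask


def threshold_net(threshold):
    """Hypothetical stand-in for a trained U-Net: a fixed intensity threshold."""
    def predict(image):
        return [[1 if p > threshold else 0 for p in row] for row in image]
    return predict


image = [[5, 40, 200],
         [0, 90, 220],
         [3, 10, 180]]
masked, mask = segment_cascade(image, threshold_net(20), threshold_net(100))
```

The masked image keeps the original pixel values inside the final mask and zeros elsewhere, which is the input expected by the connected-region step.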
With the mask image, the positions of all pixels contained in each connected region of the image can be obtained using a connected-component algorithm, and each connected region can then be cut out separately. Because cell nuclei differ in size, cropping each nucleus to its own length and width would complicate the subsequent cell nucleus type judgment model. Based on prior knowledge of cell biology, each connected region (usually a single cell nucleus, occasionally a few nuclei that could not be separated) is therefore placed in a 256 × 256 container, in which all pixels outside the extracted nucleus region have the value 0. Each connected region is placed in the middle of the container. In this way, each individual cell nucleus region is cut out of the original image.
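The extraction and centering step might look like the sketch below. The connectivity rule is not stated in the text, so 4-connectivity is an assumption, as is centering by the bounding box of the region. The function names are hypothetical, and the small container size in the example is chosen only to keep the demonstration short (the text uses 256).

```python
from collections import deque


def connected_regions(mask):
    """Return regions as lists of (row, col) pixels (ASSUMPTION: 4-connectivity)."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r0 in range(height):
        for c0 in range(width):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue, region = deque([(r0, c0)]), []
            while queue:                      # breadth-first flood fill
                r, c = queue.popleft()
                region.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append(region)
    return regions


def place_in_container(image, region, size=256):
    """Copy one region into the middle of a zero-filled size x size container."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    top, left = min(rows), min(cols)
    height = max(rows) - top + 1
    width = max(cols) - left + 1
    if height > size or width > size:
        raise ValueError("region does not fit in the container")
    off_r = (size - height) // 2
    off_c = (size - width) // 2
    container = [[0] * size for _ in range(size)]
    for r, c in region:
        container[r - top + off_r][c - left + off_c] = image[r][c]
    return container


mask = [[1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1]]
image = [[7, 8, 0, 0],
         [0, 0, 0, 9],
         [0, 0, 0, 6]]
regions = connected_regions(mask)
crops = [place_in_container(image, region, size=8) for region in regions]
```

Because every crop has the same fixed size, the classifier that follows can take the crops as input without any further per-nucleus resizing.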
S2, constructing and training a cell nucleus type judgment network model.
S21, constructing the cell nucleus type judgment network model based on the VGG-19 network.
As shown in Fig. 3, we adopt the network architecture of VGG-19.
The number of sub-blocks affects the classification performance of the model. With too few sub-blocks, the model is insufficiently trained and cannot extract the high-dimensional features needed for the subsequent classification decision; with too many, training is slow and many redundant parameters are produced. The number of sub-blocks is usually 4 to 5.
The cell nucleus type judgment network model consists of 5 sub-blocks and 2 fully connected layers connected in sequence. Each of the first two sub-blocks comprises two convolutional layers and a pooling layer, and each of the last 3 sub-blocks comprises 4 convolutional layers and a pooling layer; all convolutional layers use relu as the activation function. The first fully connected layer uses relu as its activation function, and the second uses softmax.
The input of the model is a single cell nucleus image cropped out using the cell nucleus segmentation model. After the 5 sub-blocks, the resulting feature map is flattened into a one-dimensional vector.
The number of fully connected layers also affects the classification performance to some extent. In general, with too few layers the model is insufficiently trained and tends to under-fit, so it cannot classify well; with too many layers it tends to over-fit and cannot be applied to real classification scenarios. While optimizing the network structure, we found that the features extracted by the preceding 5 sub-blocks suit our classification scenario very well, so two fully connected layers are enough to achieve good classification and no further fully connected layers are added. After the framework is constructed, the network learns from the input samples, that is, its parameters are optimized, and training finally yields a model capable of classifying cell nuclei.
S22, obtaining a single cell nucleus image data set to train the cell nucleus type judgment network model.
The loss function is a cross entropy loss function.
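For a three-class classifier whose last layer uses softmax, the cross entropy loss of one sample reduces to the negative logarithm of the probability assigned to the true class. The sketch below illustrates this with made-up scores; the class order and the score values are assumptions for illustration only.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, true_index):
    """Categorical cross entropy for one sample with a one-hot label."""
    return -math.log(probabilities[true_index])


CLASSES = ("damaged", "undamaged", "no-signal")   # ASSUMPTION: class order is arbitrary
probs = softmax([2.0, 0.5, -1.0])                 # hypothetical network scores
loss_if_damaged = cross_entropy(probs, CLASSES.index("damaged"))
loss_if_no_signal = cross_entropy(probs, CLASSES.index("no-signal"))
```

The loss is small when the network puts most of the probability on the true class and grows as that probability shrinks, which is what drives the parameter updates during training.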
Images of the control group and the positive drug group captured by high-content imaging are segmented with the cell nucleus segmentation network to obtain single cell nucleus images. From these, 2000 images per category are manually selected, each strictly screened and reviewed, for three categories: damaged, undamaged and no-signal. Nuclei with a diffuse EGFP signal and no aggregated fluorescent spots, or with 1 to 4 EGFP foci, are labeled as undamaged; nuclei with more than 4 EGFP foci are labeled as damaged. Nuclei without EGFP signal or showing pan-nuclear noise correspond to cells that do not express EGFP or that were poorly illuminated, and are therefore labeled as no-signal.
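The labeling rule can be written down compactly. This is only a restatement of the criteria above as code; the function name and the `has_signal` flag (true when the annotator sees a usable EGFP signal, false for no signal or pan-nuclear noise) are hypothetical.

```python
def label_nucleus(has_signal, foci_count):
    """Map a manual inspection result to one of the three training labels."""
    if not has_signal:
        return "no-signal"       # no EGFP signal or pan-nuclear noise
    if foci_count > 4:
        return "damaged"         # more than 4 EGFP foci
    return "undamaged"           # diffuse signal (0 foci) or 1 to 4 foci
```

For example, `label_nucleus(True, 4)` falls on the undamaged side of the threshold and `label_nucleus(True, 5)` on the damaged side.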
For the labeled data set, we use data augmentation: each original image is rotated by 90, 180 and 270 degrees, which expands the final data set to (2000 × 3) × (1 + 3) = 24000 images. The data set is then randomly divided into a training set and a validation set at a ratio of 4:1.
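The augmentation arithmetic and the 4:1 split can be reproduced with a short sketch. Images are represented as lists of pixel rows, the rotation direction (clockwise) and the fixed random seed are assumptions, and the function names are hypothetical.

```python
import random


def rotate90(image):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return the original image plus its 90, 180 and 270 degree rotations."""
    variants = [image]
    for _ in range(3):
        variants.append(rotate90(variants[-1]))
    return variants


def split_dataset(samples, train_parts=4, val_parts=1, seed=0):
    """Shuffle samples and split them train:validation = 4:1 by default."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)     # ASSUMPTION: seed fixed for repeatability
    cut = len(shuffled) * train_parts // (train_parts + val_parts)
    return shuffled[:cut], shuffled[cut:]


total = (2000 * 3) * (1 + 3)                  # 24000 images after rotation
train, val = split_dataset(range(total))      # indices stand in for images
```

With these numbers the split gives 19200 training images and 4800 validation images.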
The training set participates directly in model training and is used to adjust the model parameters. The validation set participates indirectly: after each batch of training is completed, the model is evaluated on the validation set, which is used to adjust the hyper-parameters and to make a preliminary assessment of the model's capability. In addition, 300 single cell nucleus images are labeled separately as a test set, which does not participate in training and is used directly to evaluate the final model. In the end, the accuracy of the model reaches 99.03% on the training set, 99.15% on the validation set and 99.02% on the test set. The trained model can then predict the category of each subsequently input single cell nucleus image.
S3, imaging the cells treated with the DNA damage response inhibitor using high-content imaging equipment to obtain an image to be analyzed; inputting the image to be analyzed into the cell nucleus segmentation network model and then into the cell nucleus type judgment network model; and counting the proportion of damaged cell nuclei for each DNA damage response inhibitor, wherein the smaller the proportion of damaged cell nuclei, the better the effect of the inhibitor.
S31, imaging the cells treated with the DNA damage response inhibitor using high-content imaging equipment to obtain an image to be analyzed;
S32, inputting the image to be analyzed into the trained cell nucleus segmentation network model to obtain single cell nucleus images;
S33, inputting the single cell nucleus images into the trained cell nucleus type judgment network model for classification;
S34, counting the proportion of damaged cell nuclei for each DNA damage response inhibitor, wherein the smaller the proportion of damaged cell nuclei, the better the effect of the inhibitor.
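The statistics of the final step can be sketched as follows. The text does not state whether no-signal nuclei count toward the denominator; this sketch assumes that all classified nuclei are counted, which is an assumption rather than the patented definition. The compound names and label lists are made up for illustration.

```python
from collections import Counter


def damaged_ratio(labels):
    """Fraction of damaged nuclei (ASSUMPTION: denominator is all classified nuclei)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return counts["damaged"] / total if total else 0.0


def rank_compounds(predictions):
    """Sort compounds by damaged ratio, smallest (best) first."""
    ratios = {name: damaged_ratio(labels) for name, labels in predictions.items()}
    return sorted(ratios.items(), key=lambda item: item[1])


# Hypothetical per-compound classifier outputs.
predictions = {
    "compound_A": ["damaged"] * 6 + ["undamaged"] * 4,
    "compound_B": ["damaged"] * 2 + ["undamaged"] * 7 + ["no-signal"],
    "compound_C": ["damaged"] * 4 + ["undamaged"] * 6,
}
ranking = rank_compounds(predictions)
```

Compounds at the head of the ranking are the candidates for follow-up verification; excluding no-signal nuclei from the denominator would be an equally plausible convention and may change the order for borderline compounds.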
As shown in Fig. 4, a control group, a radiation damage group and a positive drug group are selected and imaged with high-content imaging equipment. The images are segmented with the trained cell nucleus segmentation model to obtain a series of single cell nucleus images, which are then input into the cell nucleus type judgment model for classification. The proportion of damaged cell nuclei in all images of each group is then counted.
There was a significant difference between the control group and the radiation damage group: the control group had a lower proportion of damaged nuclei, and the radiation damage group a higher one. After intervention with the positive drug WR-1065, the proportion of damaged cell nuclei was comparable to that of the control group and significantly different from that of the radiation damage group, indicating that this DNA damage response inhibitor can inhibit the DNA damage response process to a certain extent; it can then be further verified through dose-effect curves, comet assays and the like.

Claims (8)

1. A method for high-throughput screening of DNA damage response inhibitors, characterized by comprising the following steps:
S1, training a cell nucleus segmentation network model based on the U-Net network;
S2, constructing and training a cell nucleus type judgment network model;
S3, imaging the cells treated with the DNA damage response inhibitor using high-content imaging equipment to obtain an image to be analyzed; inputting the image to be analyzed into the cell nucleus segmentation network model and then into the cell nucleus type judgment network model; and counting the proportion of damaged cell nuclei for each DNA damage response inhibitor, wherein the smaller the proportion of damaged cell nuclei, the better the effect of the inhibitor.
2. The method for high-throughput screening of DNA damage response inhibitors according to claim 1, wherein the cell nucleus segmentation network model comprises two U-Net networks connected in series; the image to be analyzed that is input into the cell nucleus segmentation network model first passes through the first U-Net network to obtain a first mask image; and after the image to be analyzed is multiplied by the second mask image, the positions of all pixels of each connected region are obtained through a connected-component algorithm, and each connected region is then cut out separately.
3. The method for high-throughput screening of DNA damage response inhibitors according to claim 1, wherein the U-Net-based cell nucleus segmentation network model is trained as follows:
S11, constructing a first U-Net network and a second U-Net network;
S12, resizing the DATA-SCIENCE-BOWL-2018 data set with a resize function and then using it to train the first U-Net network;
S13, resizing the BBBC039 image data set with a resize function and then using it to train the second U-Net network.
4. The method for high-throughput screening of DNA damage response inhibitors according to claim 3, wherein the first U-Net network and the second U-Net network each comprise an encoder and a decoder, with a shortcut connection between the encoder and the decoder;
the encoder comprises 4-5 sub-blocks; each sub-block except the last comprises two convolutional layers and a pooling layer connected in sequence, uses ELU as the activation function, and has a Dropout layer between the two convolutional layers; the last sub-block comprises two convolutional layers with ELU as the activation function and a Dropout layer between them;
the decoder comprises 4-5 sub-blocks; each sub-block first applies a transposed convolutional layer followed by two convolutional layers, with a Dropout layer between the two convolutional layers, both of which use ELU as the activation function; a convolutional layer is connected after the last sub-block and outputs the final mask image.
5. The method for high-throughput screening of DNA damage response inhibitors according to claim 1, wherein the cell nucleus type judgment network model is based on a VGG-19 network, ResNet or DenseNet.
6. The method for high-throughput screening of DNA damage response inhibitors according to claim 1 or 5, wherein the cell nucleus type judgment network model is constructed and trained as follows:
S21, constructing a cell nucleus type judgment network model based on the VGG-19 network;
S22, obtaining a single-cell image data set and using it to train the cell nucleus type judgment network model.
7. The method for high-throughput screening of DNA damage response inhibitors according to claim 6, wherein obtaining the single-cell image data set and training the cell nucleus type judgment network model is carried out as follows:
S221, segmenting the images captured by the high-content equipment using the cell nucleus segmentation network model trained in S1 to obtain the corresponding single-nucleus images;
S222, manually selecting 1800-2200 images of damaged cell nuclei, undamaged cell nuclei and no signal from the single-nucleus images, and labeling them;
S223, augmenting the original three classes of images using a data augmentation method, and then dividing them proportionally into a training set and a validation set;
S224, using the training set to train the cell nucleus type judgment network model and adjust the model parameters; the validation set participates in training indirectly: after each batch of training is finished, the validation set is used for validation and the hyper-parameters of the model are adjusted.
8. The method for high-throughput screening of DNA damage response inhibitors according to claim 1, wherein inputting the image to be analyzed into the cell nucleus segmentation network model and then into the cell nucleus type judgment network model is specifically as follows: inputting the image to be analyzed into the trained cell nucleus segmentation network model to obtain single-nucleus images; and then inputting the single-nucleus images into the trained cell nucleus type judgment network model for classification.
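
The post-processing named in claim 2 (multiply the image by a mask, find connected regions, crop each region) can be sketched in plain Python as below. This is a hedged illustration under stated assumptions, not the claimed implementation: the function name `crop_nuclei`, the nested-list image representation, the use of 4-connectivity, and the breadth-first flood fill are all choices made here for the example, and the `mask` argument simply stands in for a binary mask produced by a segmentation network.

```python
from collections import deque


def crop_nuclei(image, mask):
    """Mask the image, then cut out each 4-connected region of the mask.

    image and mask are equally sized 2-D lists; mask holds 0/1 values.
    Returns a list of (bounding_box, crop) pairs, where bounding_box is
    (row_min, col_min, row_max, col_max) with inclusive bounds.
    """
    rows, cols = len(mask), len(mask[0])
    # Element-wise product: background pixels become 0.
    masked = [[image[r][c] * mask[r][c] for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    results = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill collects every pixel of one region.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            pixels = set()
            while queue:
                r, c = queue.popleft()
                pixels.add((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            r_min = min(r for r, _ in pixels)
            r_max = max(r for r, _ in pixels)
            c_min = min(c for _, c in pixels)
            c_max = max(c for _, c in pixels)
            # Keep only pixels of this region inside its bounding box, so a
            # neighbouring nucleus that overlaps the box is zeroed out.
            crop = [
                [masked[r][c] if (r, c) in pixels else 0 for c in range(c_min, c_max + 1)]
                for r in range(r_min, r_max + 1)
            ]
            results.append(((r_min, c_min, r_max, c_max), crop))
    return results


# Toy 3x4 image with two separate regions; values are arbitrary.
toy_image = [[5, 6, 0, 0], [7, 8, 0, 9], [0, 0, 0, 9]]
toy_mask = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 0, 1]]
for box, crop in crop_nuclei(toy_image, toy_mask):
    print(box, crop)
```

Each returned crop corresponds to one single-nucleus image that could then be passed to a classifier. In practice an array library would likely replace the nested lists; the pure-Python version is used here only to keep the sketch self-contained.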
CN202010829597.6A 2020-08-17 2020-08-17 Method for high-throughput screening of DNA damage response inhibitor Active CN112085067B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010829597.6A CN112085067B (en) 2020-08-17 2020-08-17 Method for high-throughput screening of DNA damage response inhibitor

Publications (2)

Publication Number Publication Date
CN112085067A true CN112085067A (en) 2020-12-15
CN112085067B CN112085067B (en) 2022-07-12

Family

ID=73728333

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010829597.6A Active CN112085067B (en) 2020-08-17 2020-08-17 Method for high-throughput screening of DNA damage response inhibitor

Country Status (1)

Country Link
CN (1) CN112085067B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107002122A (en) * 2014-07-25 2017-08-01 华盛顿大学 Methods for determining the tissues and/or cell types giving rise to cell-free DNA, and methods for identifying a disease or disorder using the same
CN108364288A (en) * 2018-03-01 2018-08-03 北京航空航天大学 Segmentation method and device for breast cancer pathological images
CN109102498A (en) * 2018-07-13 2018-12-28 华南理工大学 Method for segmenting clustered cell nuclei in cervical smear images
CN109886179A (en) * 2019-02-18 2019-06-14 深圳视见医疗科技有限公司 Mask-RCNN-based image segmentation method and system for cervical cell smears
CN110619639A (en) * 2019-08-26 2019-12-27 苏州同调医学科技有限公司 Method for segmenting radiotherapy images by combining a deep neural network and a probabilistic graphical model
CN110880001A (en) * 2018-09-06 2020-03-13 银河水滴科技(北京)有限公司 Training method, device and storage medium for semantic segmentation neural network
WO2020052668A1 (en) * 2018-09-15 2020-03-19 北京市商汤科技开发有限公司 Image processing method, electronic device, and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
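Zhu Linlin et al.: "Research on a multi-active-contour cell segmentation method based on the U-Net network", Infrared and Laser Engineering *
Gu Guiying et al.: "Application of computer image recognition in the morphological diagnosis of leukemia", China Medical Devices *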

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116313115A (en) * 2023-05-10 2023-06-23 浙江大学 Drug action mechanism prediction method based on mitochondrial dynamic phenotype and deep learning
CN116313115B (en) * 2023-05-10 2023-08-15 浙江大学 Drug action mechanism prediction method based on mitochondrial dynamic phenotype and deep learning

Also Published As

Publication number Publication date
CN112085067B (en) 2022-07-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant