CN117496044A

CN117496044A - Lung CT image reconstruction method based on deep learning

Info

Publication number: CN117496044A
Application number: CN202311207973.8A
Authority: CN
Inventors: 康帅; 陈飞蛟; 李昕帝; 张东霞; 任娟
Original assignee: Shanghai Ruitouch Technology Co ltd
Current assignee: Shanghai Ruitouch Technology Co ltd
Priority date: 2023-09-19
Filing date: 2023-09-19
Publication date: 2024-02-02

Abstract

The invention relates to a lung CT image reconstruction method based on deep learning, which comprises the following steps: (1) acquisition and pretreatment of lung CT images; (2) Extracting a local area only including lung parenchyma based on image information of human lung parenchyma; (3) Simulating and generating an X-ray image based on a DRR (digital radiography) method of digital reconstruction, and constructing a data set containing the X-CT image; (4) Constructing an X2CT-Net model, learning a mapping relation between an X-ray image and a CT image, and reconstructing a three-dimensional CT image; (5) Taking the reconstructed CT image as input, constructing a lung nodule detection model based on YOLOv5, and verifying the quality of the generated image; the invention uses the X-ray images of two orthogonal planes as input, builds the X2CT-Net model based on the generation of the antagonism network GAN to reconstruct the lung CT image, solves the problems of the prior art, and effectively relieves the problems of medical resource shortage and medical resource unbalance.

Description

Lung CT image reconstruction method based on deep learning

[ technical field ]

The invention belongs to the technical field of medical image processing, and particularly relates to a lung CT image reconstruction method based on deep learning.

[ background Art ]

Currently, computed tomography (Computed tomography, CT) has been widely used in clinical examinations. Traditional CT reconstruction methods acquire hundreds of X-ray projections from multiple angles through rotational scanning for three-dimensional reconstruction. However, CT scanning can bring more radiation dose to the patient. Also, CT scanners are much more costly than X-ray machines.

It would therefore be of great practical importance if an alternative method could be provided for generating CT images using X-ray images: (1) The CT image reconstruction is carried out by using a small amount of X-ray images, so that the radiation dose to a patient can be obviously reduced, and the health risk is reduced; (2) The X-ray image is used for reconstruction more conveniently and rapidly, so that rapid screening is realized, and the problem of medical resource shortage is relieved; (3) Compared with CT scanning, the X-ray machine has lower cost and can effectively relieve the problem of unbalanced medical resources.

Under the condition of meeting clinical demands, generating CT images by using low-dose X-ray images is a hot spot problem of current medical imaging, and has practical significance and reference value. However, existing CT reconstruction techniques still have the following limitations:

(1) Traditional CT reconstruction algorithms such as the filtered back projection algorithm (FBP) depend strongly on the quality of the acquired projection data. However, the loss of original details of the image is inevitably caused in the projection data denoising process, and the resolution of the reconstructed CT image is reduced.

(2) The CT reconstruction technology based on the deep learning constructs a convolutional neural network model through a single X-ray image to implicitly learn the nonlinear mapping relation between the X-ray image and the CT image, and reconstructs the CT image. However, there is a serious ambiguity in the information inside the X-ray image, and the mapping relationship cannot be fully learned only by a single X-ray image, resulting in limitation in algorithm performance.

[ summary of the invention ]

The invention aims to solve the defects and provide a lung CT image reconstruction method based on deep learning, which uses X-ray images of two orthogonal planes as input, builds an X2CT-Net model based on generation of an antagonistic network GAN to reconstruct lung CT images, solves the problems of the prior art, and effectively relieves the problems of medical resource shortage and medical resource imbalance.

In order to achieve the above purpose, a lung CT image reconstruction method based on deep learning is designed, which comprises the following steps:

(1) Collecting and preprocessing lung CT images;

(2) Extracting a local area only including lung parenchyma based on image information of human lung parenchyma;

(3) Simulating and generating an X-ray image based on a DRR (digital radiography) method of digital reconstruction, and constructing a data set containing the X-CT image;

(4) And constructing an X2CT-Net model, learning a mapping relation between an X-ray image and a CT image, and reconstructing a three-dimensional CT image.

Further, in step (1), the lung region is placed in a scanning range, and the lung CT image is obtained by emitting an X-ray beam through a CT scanner and recording image data, and is subjected to operations including but not limited to resampling and normalization to complete preprocessing.

Further, in the step (2), firstly, performing rough segmentation on a lung CT image based on a global threshold segmentation method according to the difference between a lung parenchyma HU value and other organs; secondly, removing interference through morphological closing operation, inversion, setting a maximum communication area and other operations; then, obtaining a mask of the trunk, obtaining a mask of the lung parenchyma by filling holes in the mask and subtracting the holes from the mask, and removing a lung trachea by setting the size of a communication area; finally, multiplying the mask of the lung parenchyma with the original image to obtain a segmented lung parenchyma image.

Further, in the step (3), a region of interest is selected from the CT scan data, a set of projection rays are released from each view angle at the light source position to pass through the region of interest, equidistant sampling is performed on each ray, and an attenuation coefficient corresponding to the CT value of the sampling point is calculated from 8 voxels nearest to the sampling point by using an interpolation method; then, the attenuation coefficients of all the sampling points are added to obtain the gray value of the pixel point on the imaging plane corresponding to the X-ray; this operation is repeated for each X-ray, the pixels are synthesized into a DRR image, and a simulated X-ray image of the posterior front and side is generated by the algorithm, constituting an X-CT paired dataset.

Further, in the step (4), an X2CT-Net model is built based on the generation of the countermeasure network, and the model comprises a generator and a discriminator; the generator comprises two U-shaped networks which are respectively used for learning the mapping relation between the X-ray images and the CT images of two different planes; the input of the generator is X-ray images of two orthogonal planes, and the U-shaped network is a symmetrical structure of the coder-decoder; the encoder is composed of 4 residual convolution modules and is used for extracting X-ray image characteristics, wherein each residual convolution module comprises a 3 multiplied by 3 two-dimensional convolution layer, a batch normalization layer and a ReLU activation function; the decoder is composed of 4 up-sampling modules, three-dimensional volume data are reconstructed from the extracted two-dimensional features, and an output result CT image is obtained, wherein each up-sampling module comprises a 3 multiplied by 3 three-dimensional convolution layer and a ReLU activation function; the arbiter consists of a 4 x 4 three-dimensional convolution layer, a batch normalization layer and a ReLU activation function, for distinguishing the generated samples from the real samples.

In the step (4), three jump connection modes are provided according to the characteristics of the image coding stage; firstly, a jump connection module Concata is constructed by expanding a full connection layer, and the module stretches the output flattening of the last encoder into a one-dimensional vector so as to be better reconstructed into three-dimensional characteristics; secondly, a jump connection module ConcatB is constructed between the encoder and the decoder at the same stage, characteristic channels at two ends of the encoder and the decoder are aligned through a two-dimensional convolution module, two-dimensional image characteristics are copied and expanded into three-dimensional image characteristics, three-dimensional convolution modules are used for carrying out characteristic coding, the modules transmit low-level two-dimensional image characteristics to the three-dimensional decoder, image details in the three-dimensional reconstruction process are made up, and the outline shape between input and output has high correlation; and finally, constructing a jump connection module Concat between two decoders of the parallel U-shaped network, splicing the output of each stage of decoder to obtain fusion characteristics so as to fuse the characteristics of two different plane X rays, enabling the model to learn to obtain a more sufficient mapping relation, calculating the similarity degree between the output of the two decoders, and updating parameters by back propagation.

Further, in the step (4), according to the characteristics of the target task and the image, the loss function is improved, and a mixed loss function optimization model training process is provided; the hybrid loss function is mainly composed of a contrast loss, a reconstruction loss and a projection loss, wherein the contrast loss is used for distinguishing a real data distribution and a corresponding generated data distribution, and the expression is as follows:

wherein X is an input X-ray image of two orthogonal planes, and y is a CT image; lambda (lambda) ₁ =0.1 is the counterloss function weight;

by using the mean square error MSE as reconstruction loss, the expression is as follows:

wherein lambda is ₂ =10 is the reconstruction loss function weight;

in order to simplify the matching process between the generated sample and the real sample, three orthogonal planes for generating the three-dimensional CT image are selected and matched with the plane corresponding to the real sample, and the three orthogonal planes are used as projection loss, and the expression is as follows:

wherein P is _ax 、P _co 、P _sa Is a projection of axial, coronal and sagittal planes; the L1 norms of the projection and the real projection are generated through calculation and are used for enhancing the learning capacity of the image edge; lambda (lambda) ₃ =10 is the projection loss function weight;

thus, the final mixing loss function is defined as follows:

further comprising the step (5): the reconstructed CT image is taken as input, and a lung nodule detection model based on YOLOv5 is constructed to verify the quality of the generated image.

Further, in the step (5), the lung nodule with the size larger than 3mm is marked through the lung CT image generated in the step (4), a lung nodule detection data set is constructed, and divided, wherein the training set, the verification set and the test set are 8:1:1, a step of; the Focal Loss function is added on the basis of the original Loss function, and the Focal Loss function is as follows:

FL(p _t )＝-α _t (1-p _t ) ^γ log(p _t )

wherein alpha is _t Taking a default value of 0.8 for balancing factors for balancing positive and negative samples; gamma is a balance factor for balancing the difficult sample, and takes a default value of 2.

Further, in the step (5), a backbone network of the lung nodule detection model adopts a pre-trained ResNet-34 network on an ImageNet for initializing model parameters and enhancing training effects; aiming at the image characteristics and distribution characteristics of the existing data set, the original loss function is improved, and the unbalanced characteristics of data types are relieved; and finally, testing and verifying the quality of the generated CT image by using the test image.

Compared with the prior art, the invention provides a lung CT image reconstruction method based on deep learning, which uses two orthogonal plane X-ray images as input, builds an X2CT-Net model based on a generated countermeasure network (Generative Adversarial Networks, GAN) to reconstruct the lung CT image, thereby solving the problems in the prior art; and, by taking the reconstructed CT image as input, constructing a lung nodule detection model based on YOLOv5, and carrying out lung nodule detection, thereby verifying the quality of the generated image and verifying the validity of the invention. In addition, the invention has the following remarkable practical significance: the CT image reconstruction is carried out by using a small amount of X-ray images, so that the radiation dose to a patient can be obviously reduced, and the health risk is reduced; the X-ray image is used for reconstruction more conveniently and rapidly, so that the rapid screening is realized, and the problem of medical resource shortage can be effectively relieved; and compared with CT scanning, the X-ray machine has lower cost and can effectively relieve the problem of unbalanced medical resources.

[ description of the drawings ]

FIG. 1 is a flowchart of a global threshold-based lung parenchyma segmentation algorithm of the present invention;

FIG. 2 is a schematic diagram of the structure of an X2CT-Net model according to the invention;

FIG. 3 is a schematic diagram of an improved hop link module configuration of the present invention;

FIG. 4 is a schematic diagram of a lung nodule detection model of the present invention;

FIG. 5 (a) is a contrast plot of a CT reconstructed image of the present invention;

FIG. 5 (b) is a contrast plot of a CT reconstructed image in accordance with the present invention;

FIG. 5 (c) is a contrast plot of a CT reconstructed image in accordance with the present invention;

FIG. 6 (a) is a schematic representation of the lung nodule labeling results of the present invention;

FIG. 6 (b) is a schematic representation of the lung nodule detection results of the present invention.

Detailed description of the preferred embodiments

The invention provides a lung CT image reconstruction method based on deep learning, which comprises the following steps:

(1) The acquisition and pretreatment of lung CT images comprise human lung CT images;

(3) Simulating the generation of an X-ray image based on a digital reconstructed radiogram method (Digitally Reconstructed Radiograph, DRR), constructing a dataset comprising X-CT images;

(4) Constructing an X2CT-Net model, learning a mapping relation between an X-ray image and a CT image, and reconstructing a three-dimensional CT image;

(5) And taking the reconstructed CT image as input, constructing a lung nodule detection model based on YOLOv5, and verifying the quality of the generated image.

The following specifically describes the above steps:

in the step (1), the lung region is positioned in the scanning range, the X-ray beam is emitted by a CT scanner, the image data is recorded to obtain a lung CT image, and the lung CT image is subjected to resampling, normalization and other operations to complete preprocessing. The resolution of the processed CT image is 128 x 128, and the pixel spacing is (1, 1).

In the step (2), the lung CT image is segmented by an image processing technology, and noise interference is removed. The lung CT primary image comprises lung parenchyma, lung trachea, trunk, scanning bed board and the like, wherein the lung parenchyma contains key information such as tumor, nodule, blood vessel and the like. To enhance the reconstruction effect, the lung parenchymal portion is preserved. During the processing of CT images of the lungs, the HU value of the lungs was-500, so by retaining the region of HU value within [ -1000,400], the disease associated with the lungs can be monitored, the overall flow is shown in FIG. 1. First, rough segmentation is performed based on a global thresholding method based on differences in the HU value of the lung parenchyma and other organs. And secondly, removing interference caused by a bed board, acquisition equipment and the like through morphological closing operation, inversion, setting of a maximum communication area and other operations. Then, a mask of the torso is obtained. The mask of the lung parenchyma is obtained by filling and subtracting the holes of the mask, and the lung trachea is removed by setting the size of the communication area. Finally, multiplying the mask of the lung parenchyma with the original image to obtain a segmented lung parenchyma image.

In step (3), the digital reconstructed radiometric image method is an image processing algorithm for medical imaging for generating a two-dimensional projection image of an analog X-ray image. By selecting a region of interest in the CT scan data, a set of projection rays is released from each view angle at the source location through the region of interest. Equidistant sampling is carried out on each ray, and an attenuation coefficient corresponding to the CT value of the sampling point is calculated by 8 voxels nearest to the sampling point through interpolation. And then adding attenuation coefficients of all the sampling points to obtain the gray value of the pixel point on the imaging plane corresponding to the X-ray. This operation is repeated for each X-ray, and the pixel points are synthesized into a DRR image. A simulated X-ray image of the posterior front and side is generated by the algorithm, forming an X-CT paired dataset.

In step (4), based on generating an X2CT-Net model which improves an antagonism network model, a nonlinear mapping relation of X-CT is learned, and the nonlinear mapping relation comprises a generator and a discriminator. The generator comprises two U-shaped networks which are respectively used for learning the mapping relation between the X-ray images and the CT images of two different planes, and the structure is shown in figure 2. Wherein the input of the generator is an X-ray image of two orthogonal planes. The generator consists of a U-shaped network, which is a symmetrical structure of the codec. The encoder is composed of 4 residual convolution modules for extracting X-ray image features. Residual convolution increases residual connection on the basis of an original convolution module, so that the feature extraction capability of a model is enhanced, and the problem of gradient disappearance in the training process is effectively solved. Each residual convolution module comprises a 3×3 two-dimensional convolution layer, a batch normalization layer Batch Normalization and a ReLU activation function. The decoder is composed of 4 up-sampling modules, and reconstructs three-dimensional volume data from the extracted two-dimensional features to obtain an output result CT image. Each up-sampling module comprises 3 of x 3 is described herein, and a ReLU activation function.

In step (4), three improved jump connection modes are proposed to improve the quality of CT reconstruction according to the characteristics of the image coding stage, and the structure of the jump connection modes is shown in fig. 3. First, in order to compensate for the loss of feature information in the process of expanding two-dimensional information to three-dimensional information, a new jump connection module ConcatA is constructed by expanding the full connection layer, as shown in (a) of fig. 3. The module stretches the output flattening of the last encoder into a one-dimensional vector, and can be better reconstructed into three-dimensional features. However, this procedure inevitably results in loss of two-dimensional spatial information, and thus the module is used only once between codecs. Next, similar to the U-Net model structure, a jump connection module concab is constructed between the encoder and decoder at the same stage, as shown in (b) of fig. 3. And aligning the characteristic channels at two ends of the coder and the decoder through a two-dimensional convolution module, copying the two-dimensional image characteristics, and expanding the two-dimensional image characteristics into three-dimensional image characteristics. And uses a three-bit convolution module to perform feature coding. The module can transmit low-level two-dimensional image characteristics to a three-dimensional decoder, make up image details in the three-dimensional reconstruction process, and enable outline shapes between input and output to have high correlation. Finally, in order for the generator to generate more accurate results, a jump connection module concac is built between the decoders of the two parallel U-shaped networks, as shown in fig. 3 (c). The model is input as two orthogonal planar X-ray images, each containing different image information. However, two parallel U-shaped networks cannot fuse the features that make up for each other, so the outputs of the decoders at each stage are channel spliced to obtain the fused features. Jump connection is constructed between two encoders of the U-shaped network, and features of two different plane X rays are fused, so that a model is learned more fully. Parameter updating is performed by calculating the degree of similarity between the two decoder outputs and propagating them back.

In step (4), the arbiter consists of a 4 x 4 three-dimensional convolution layer, a batch normalization layer Batch Normalization and a ReLU activation function, for distinguishing the generated samples from the real samples. In this way the ability of the generator to generate high quality samples can be improved.

In the step (4), according to the characteristics of the target task and the image, the loss function is improved, and a mixed loss function optimization model training process is provided. The mixing loss function is mainly composed of the contrast loss, reconstruction loss, and projection loss. The countermeasures are used for distinguishing the real data distribution from the corresponding generated data distribution, and the expression is as follows:

where X is the input X-ray image of two orthogonal planes and y is the CT image. The calculation of the countering loss by using the least square loss is helpful for stabilizing the training process, and more real details are learned. Lambda (lambda) ₁ =0.1 is the counterloss function weight.

Countering the loss allows the model to generate as much more realistic an image as possible. However, since the medical image has a relatively fixed internal structure anatomically, the resulting sample is relatively structurally sound due to other strong constraints. By using the mean square error MSE as reconstruction loss, the expression is as follows:

wherein lambda is ₂ =10 is the reconstruction loss function weight.

In order to simplify the matching process between the generated sample and the real sample, three orthogonal planes for generating the three-dimensional CT image are selected and matched with the planes corresponding to the real sample to be used as projection loss. The training process can be effectively improved by the orthogonal matching mode, and the expression is as follows:

wherein P is _ax 、P _co 、P _sa Is a projection of axial, coronal and sagittal planes. The L1 norms of the projection and the real projection are generated through calculation and are used for enhancing the learning ability of the image edge. Lambda (lambda) ₃ =10 is the projection loss function weight.

Thus, the final mixing loss function is defined as follows:

in step (5), an improved lung nodule detection model based on YOLOv5 is constructed. Aiming at the characteristics of small proportion and changeable morphology of the lung nodule characteristics in lung parenchyma, a characteristic pyramid structure based on cavity convolution is added into a backbone network of an original detection model. Feature maps of different sizes can be extracted and fused by using convolution kernels of different expansion rates, and the feature maps have better learning ability on targets with smaller forms such as lung nodules. And (3) constructing a lung nodule detection data set by the lung CT image generated in the step (4) and marking lung nodules larger than 3mm, and dividing the lung nodule detection data set. Wherein, training set, verification set, test set are 8:1:1. because the constructed data set has serious problems of unbalanced category and unbalanced positive and negative samples, the data category is only 2, so the improvement is carried out on the basis of the original Loss function, and the Focal Loss function is added, as shown in the following:

FL(p _t )＝-α _t (1-p _t ) ^γ log(p _t )

wherein alpha is _t For the balancing factor for balancing the positive and negative samples, a default value of 0.8 is taken. Gamma is a balance factor for balancing the difficult sample, and takes a default value of 2.

In step (5), to verify that the generated CT image has a reference value, a YOLOv 5-based lung nodule detection model is constructed, as shown in fig. 4. The backbone network of the lung nodule detection model employs a pre-trained ResNet-34 network on ImageNet for initializing model parameters and enhancing training effects. Aiming at the image characteristics and distribution characteristics of the existing data set, the original loss function is improved, and the unbalanced characteristics of the data are relieved. And finally, testing and verifying the quality of the generated CT image by using the test image.

The invention is further described below with reference to the accompanying drawings and specific examples:

because of the lack of a data set of X-ray image and CT image pairings in the actual environment, the present invention trains on the common lung data set, the lid-IDRI data set, and constructs the X-CT pairing data set. To ensure the effectiveness of the training, the raw CT data is first resampled to 1X 1mm ³ And performing lung parenchyma segmentation through a global threshold segmentation algorithm. Then, the process is carried out, cut it to 320X 320mm ³ And acquiring an X-ray image generated through simulation through a DRR algorithm. Two orthogonal angles of 0 DEG and 90 DEG are selected in the algorithm, the size of an X-ray image is 256 multiplied by 1, and a final X-CT data set is constructed.

The quality of the generated CT image is evaluated during the training process of the X2CT-Net model by using a plurality of evaluation indexes, including absolute mean error (Mean Absolute Error, MAE), cosine similarity (Cosine Similarity), structural similarity index (Structural Similarity Index, SSIM) and Peak Signal-to-Noise-Ratio (PSNR).

The invention is programmed by using Python language, and is developed by using a PyTorch framework, and model training is performed by using 4 pieces of NVIDIAGeForce RTX and 4090. Training parameter setting: the learning rate is set to be lr to be 0.0002, and the magnitude of the learning rate is adjusted by using a cosine annealing algorithm; the optimizer uses a random gradient descent method with momentum to optimize the update parameters; the BatchSize is set to 4; a total of 100 epochs were trained. Finally, the PSNR of Cosine Similarity,36.21 and SSIM of 71.76 of 92.74 MAE,0.9549 were obtained on the test set, and the reconstruction effects thereof are shown in fig. 5 (a), 5 (b) and 5 (c), wherein fig. 5 (a) is a CT original image, fig. 5 (b) is a 3D-CNN reconstruction result, and fig. 5 (c) is an X2CT-Net reconstruction result of the present invention. Therefore, the invention can reconstruct high-quality CT images.

In order to verify the quality of a CT image reconstructed by the X2CT-Net model, the lung nodule detection is carried out on the CT image based on the YOLOv5 model, the accuracy of 0.81576, the recall of 0.7625 and the mAP of 0.77032 are obtained, the detection results are shown in fig. 6 (a) and 6 (b), the result of marking the lung nodule is shown in fig. 6 (a), and the result of detecting the lung nodule is shown in fig. 6 (b). From fig. 6 (a) and 6 (b), it can be verified that the CT image reconstructed according to the present invention has the ability to provide clinical reference value.

What is not described in detail in this specification belongs to the prior art known to those skilled in the art, and is not described in detail herein. The present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications that do not depart from the spirit and principles of the invention are intended to be equivalent substitutes and are included in the scope of the invention.

Claims

1. The lung CT image reconstruction method based on the deep learning is characterized by comprising the following steps of:

(1) Collecting and preprocessing lung CT images;

2. The method of claim 1, wherein: in step (1), the lung region is placed in a scanning range, an X-ray beam is emitted by a CT scanner, image data is recorded to obtain a lung CT image, and the lung CT image is subjected to operations including but not limited to resampling and normalization to complete preprocessing.

3. The method of claim 1, wherein: in the step (2), firstly, performing rough segmentation on a lung CT image based on a global threshold segmentation method according to the difference between a lung parenchyma HU value and other organs; secondly, removing interference through morphological closing operation, inversion, setting a maximum communication area and other operations; then, obtaining a mask of the trunk, obtaining a mask of the lung parenchyma by filling holes in the mask and subtracting the holes from the mask, and removing a lung trachea by setting the size of a communication area; finally, multiplying the mask of the lung parenchyma with the original image to obtain a segmented lung parenchyma image.

4. The method of claim 1, wherein: in the step (3), a region of interest is selected from CT scan data, a group of projection rays are released from each view angle at a light source position to pass through the region of interest, equidistant sampling is carried out on each ray, and an attenuation coefficient corresponding to the CT value of a sampling point is calculated by 8 voxels closest to the sampling point by utilizing an interpolation method; then, the attenuation coefficients of all the sampling points are added to obtain the gray value of the pixel point on the imaging plane corresponding to the X-ray; this operation is repeated for each X-ray, the pixels are synthesized into a DRR image, and a simulated X-ray image of the posterior front and side is generated by the algorithm, constituting an X-CT paired dataset.

5. The method of claim 1, wherein: in the step (4), an X2CT-Net model is built based on the generation of an countermeasure network, and the model comprises a generator and a discriminator; the generator comprises two U-shaped networks which are respectively used for learning the mapping relation between the X-ray images and the CT images of two different planes; the input of the generator is X-ray images of two orthogonal planes, and the U-shaped network is a symmetrical structure of the coder-decoder; the encoder is composed of 4 residual convolution modules and is used for extracting X-ray image characteristics, wherein each residual convolution module comprises a 3 multiplied by 3 two-dimensional convolution layer, a batch normalization layer and a ReLU activation function; the decoder is composed of 4 up-sampling modules, three-dimensional volume data are reconstructed from the extracted two-dimensional features, and an output result CT image is obtained, wherein each up-sampling module comprises a 3 multiplied by 3 three-dimensional convolution layer and a ReLU activation function; the arbiter consists of a 4 x 4 three-dimensional convolution layer, a batch normalization layer and a ReLU activation function, for distinguishing the generated samples from the real samples.

6. The method of claim 5, wherein: in the step (4), three jump connection modes are provided according to the characteristics of an image coding stage; firstly, a jump connection module Concata is constructed by expanding a full connection layer, and the module stretches the output flattening of the last encoder into a one-dimensional vector so as to be better reconstructed into three-dimensional characteristics; secondly, a jump connection module ConcatB is constructed between the encoder and the decoder at the same stage, characteristic channels at two ends of the encoder and the decoder are aligned through a two-dimensional convolution module, two-dimensional image characteristics are copied and expanded into three-dimensional image characteristics, three-dimensional convolution modules are used for carrying out characteristic coding, the modules transmit low-level two-dimensional image characteristics to the three-dimensional decoder, image details in the three-dimensional reconstruction process are made up, and the outline shape between input and output has high correlation; and finally, constructing a jump connection module Concat between two decoders of the parallel U-shaped network, splicing the output of each stage of decoder to obtain fusion characteristics so as to fuse the characteristics of two different plane X rays, enabling the model to learn to obtain a more sufficient mapping relation, calculating the similarity degree between the output of the two decoders, and updating parameters by back propagation.

7. The method of claim 6, wherein in step (4), the loss function is improved according to the target task and the image characteristics, and a mixed loss function optimization model training process is provided; the hybrid loss function is mainly composed of a contrast loss, a reconstruction loss and a projection loss, wherein the contrast loss is used for distinguishing a real data distribution and a corresponding generated data distribution, and the expression is as follows:

wherein X is an input X-ray image of two orthogonal planes, and y is a CT image; lambda (lambda) ₁ =0.1 is the counterdamageLoss function weights;

wherein lambda is ₂ =10 is the reconstruction loss function weight;

thus, the final mixing loss function is defined as follows:

8. the method of any one of claims 1 to 7, further comprising step (5): the reconstructed CT image is taken as input, and a lung nodule detection model based on YOLOv5 is constructed to verify the quality of the generated image.

9. The method of claim 8, wherein in step (5), lung nodules greater than 3mm are labeled with the lung CT images generated in step (4), and a lung nodule detection dataset is constructed and partitioned, wherein the training set, the validation set, and the test set are 8:1:1, a step of; the FocalLoss loss function is added on the basis of the original loss function, and the FocalLoss loss function is as follows:

FL(p _t )＝-α _t (1-p _t ) ^γ log(p _t )

10. The method of claim 9, wherein: in the step (5), a backbone network of the lung nodule detection model adopts a pre-trained ResNet-34 network on an ImageNet for initializing model parameters and enhancing training effects; aiming at the image characteristics and distribution characteristics of the existing data set, the original loss function is improved, and the unbalanced characteristics of data types are relieved; and finally, testing and verifying the quality of the generated CT image by using the test image.