Most defect inspection methods used in semiconductor manufacturing require a design layout or golden die images. In contrast, this paper presents a method for automatic defect inspection in semiconductor images that uses only a single image. First, we devise a method that uses cosine similarity to classify images into four types: flat, linear, patterned, and complex. For linear and patterned images, we obtain defect-free images that retain the structure. A flat image is then obtained by subtracting the defect-free image from the input image. The FAST-MCD method then estimates the parameters of the inlier distribution of the flat image, and these parameters are used to detect defects. For complex images, a segmentation neural network is used to detect defects. Unlike conventional methods that work only on a specific structure, our method classifies structures and finds defects in each one. In experiments on 16 defective images, our method detects all 16, whereas the conventional methods detect fewer.
This work was supported by Samsung Electronics Co., Ltd. (IO201216-08216-01).
All authors contributed to the study conception and design. Data collection and analysis were performed by Jinkyu Yu and Songhee Han. The first draft of the manuscript was written by Jinkyu Yu and Chang-Ock Lee. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
The authors declare no competing interests.
Appendix A
A.1 Selection of the parameter \(t_{i}\)
Here, we experimentally show the role of the parameter \(t_{i}\) used in the image classification in Sect. 3.1. We obtain the repeated region \(P_{i} = \{(x,y) \mid CS_{i}(x,y) > t_{i} \text { for } x=1,\ldots ,h \text { and } y=1,\ldots ,w\}\) using the threshold \(t_{i}\). We overlap the regions \(P_{i}\), aligned at the centroid of each kernel, and call the result the overall repeated region \(P\subset [1, \overline{h}]\times [1, \overline{w}]\). Since the value of \(CS_{i}\) at the centroid of \(K_{i}\) is always 1, \(P_{i}\) contains the centroid of \(K_{i}\). This means that P contains the center of the domain \([1, \overline{h}]\times [1, \overline{w}]\). For R, the connected region of P containing the center of the domain, we compute the moment tensor I as
where the components are defined as
Fig. 13 Experimental results for various \(t_{i}\). The first column shows the input images. Columns 2 through 6 show P for \(r = 0.1, 0.15, 0.2, 0.25, 0.3\). The red dot indicates the center of domain, and the green region shows the connected region R containing it. The blue dots indicate the peak points
For the moment tensor I, we compute the eigenvalues and the corresponding eigenvectors. If an image has a linear structure, P has a long connected region R whose axis ratio, defined as the ratio of the larger eigenvalue to the smaller one, is large. The dominant direction of R is defined as the direction of the major eigenvector.
Figure 13 shows the overall repeated region P for the thresholds \(t_{i} = (1 - r) + r \min CS_{i}\) with \(r = 0.1, 0.15, 0.2, 0.25, 0.3\). The axis ratio of R is displayed at the top of each P. The maximum axis ratio among the nonlinear images is 7.53, and the minimum among the linear images is 41.83. Therefore, we take 25 as the threshold for identifying linear images. In these experiments, linear images have an axis ratio greater than 25 regardless of the value of \(t_{i}\). The peak points of the patterned images are aligned on specific lines.
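The linearity test described above can be sketched in a few lines. This is only an illustration under stated assumptions: the moment tensor is taken to be the second central moment matrix of the pixel coordinates in R (the paper's component definitions may differ), and the names `axis_ratio` and `is_linear` are hypothetical.

```python
import math

def axis_ratio(region):
    """Return (axis ratio, major-axis angle) for a list of (x, y) pixels.

    Assumption: the moment tensor is the second central moment matrix
    [[ixx, ixy], [ixy, iyy]] of the coordinates in the region.
    """
    n = len(region)
    cx = sum(x for x, _ in region) / n
    cy = sum(y for _, y in region) / n
    ixx = sum((x - cx) ** 2 for x, _ in region) / n
    iyy = sum((y - cy) ** 2 for _, y in region) / n
    ixy = sum((x - cx) * (y - cy) for x, y in region) / n
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = (ixx + iyy) / 2
    diff = math.hypot((ixx - iyy) / 2, ixy)
    lam_max, lam_min = mean + diff, mean - diff
    # Angle of the major eigenvector in the (x, y) coordinate plane.
    theta = 0.5 * math.atan2(2 * ixy, ixx - iyy)
    ratio = math.inf if lam_min <= 1e-12 else lam_max / lam_min
    return ratio, theta

def is_linear(region, threshold=25.0):
    """Apply the axis-ratio threshold of 25 chosen in the text."""
    return axis_ratio(region)[0] > threshold

# A thin 61 x 3 strip is elongated; an 11 x 11 square is not.
strip = [(x, y) for x in range(61) for y in range(3)]
square = [(x, y) for x in range(11) for y in range(11)]
print(is_linear(strip), is_linear(square))
```

The closed-form eigenvalue expression avoids any dependency on a linear algebra library; for larger pipelines a library eigen-decomposition would serve equally well.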
A.2 Kernel size for cosine similarity
Here, we give a one-dimensional example of the cosine similarity for different kernel sizes. Consider a long vector in which (10110) is repeated. Then, the cosine similarities for the three types of one-dimensional kernels are as follows:
If the kernel does not include a pattern, as in the first case, the repeated region of CS cannot find the pattern. On the other hand, if the kernel represents a pattern, as in the second and third cases, the repeated region of CS finds the pattern. Therefore, we suggest small M and N values such as 2, 3, and 4 so that the kernel includes the pattern.
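The effect of the kernel size can be reproduced with a small one-dimensional experiment. The kernels used here, (1), (1011), and (10110), are illustrative assumptions and not necessarily the three kernels of the original example; the function names, the threshold 0.99, and the convention that an all-zero window has similarity 0 are also assumptions.

```python
import math

def cosine_similarity_1d(signal, kernel):
    """Slide `kernel` over `signal` and return the cosine similarity per offset."""
    k = len(kernel)
    k_norm = math.sqrt(sum(v * v for v in kernel))
    out = []
    for i in range(len(signal) - k + 1):
        window = signal[i:i + k]
        w_norm = math.sqrt(sum(v * v for v in window))
        dot = sum(a * b for a, b in zip(window, kernel))
        # Assumed convention: similarity with an all-zero window is 0.
        out.append(dot / (k_norm * w_norm) if w_norm > 0 else 0.0)
    return out

def repeated_offsets(signal, kernel, t=0.99):
    """Offsets whose cosine similarity exceeds the threshold t."""
    cs = cosine_similarity_1d(signal, kernel)
    return [i for i, c in enumerate(cs) if c > t]

signal = [1, 0, 1, 1, 0] * 6  # a long vector in which (10110) is repeated

# A kernel too small to contain the pattern matches every 1 in the signal.
print(repeated_offsets(signal, [1]))
# Kernels that contain the pattern match only at multiples of the period 5.
print(repeated_offsets(signal, [1, 0, 1, 1]))
print(repeated_offsets(signal, [1, 0, 1, 1, 0]))
```

With the length-1 kernel the matches are irregularly spaced, so the period cannot be read off; with the longer kernels the matches fall exactly on offsets 0, 5, 10, and so on.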
Algorithm 3 GHT method of extracting the lattice vectors [32]
A.3 JSD between histogram and chi-squared distribution
We explain the details of the JSD between the histogram of \(d_{M}^{2}\) for the MCD-solution S and the chi-squared distribution. Let \(h_{S}(x)\) be the histogram of \(d_{M}^{2}(x_{i}, \mu , \Sigma )\) for the MCD-solution S. For a defect-free image, the maximum value of \(d_{M}^{2}\) in the MCD-solution S is close to \(\chi _{1,1-\alpha }^2\), where the quantity \(\chi _{1,p}^{2}\) is defined to satisfy \(P({X > \chi _{1,p}^{2}}) = p\) for \(X \sim \chi _{1}^{2}\). For defective images, however, the MCD-solution S lies outside the defects, and the maximum value of \(d_{M}^{2}\) in S increases. The size of this increase depends on the size of the defects. Therefore, we use a slightly modified chi-squared distribution, defined as follows. Let \(f_{1}(x)\) be the probability density function of the \(\chi _{1}^{2}\) distribution and l be the maximum value of \(d_{M}^{2}\) in the MCD-solution S. Then, the cropped probability density function \(\overline{f}_{1}(x)\) is
In Sect. 3.1, we measure the JSD between \(h_{S}(x)\) and \(\overline{f}_{1}(x)\) to check whether \(h_{S}(x)\) is close to \(\overline{f}_{1}(x)\) or not.
Figure 14 shows the JSDs for flat and complex images. For flat images, JSD is below \(0.01 \log 2\). For complex images, JSD is greater than \(0.1 \log 2\). Therefore, we take the value \(0.05 \log 2\) as the threshold for determination of flat images.
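The flat-image test can be sketched as follows. The sketch assumes that "cropping" means restricting \(f_{1}\) to \([0, l]\) and renormalizing it to integrate to 1, that the JSD uses the natural logarithm (so that it is bounded by \(\log 2\)), and that the histogram uses 50 equal-width bins; none of these choices is confirmed by the text. All function names are hypothetical.

```python
import math

def chi2_1_cdf(x):
    """CDF of the chi-squared distribution with 1 degree of freedom."""
    return math.erf(math.sqrt(x / 2.0)) if x > 0 else 0.0

def kl(p, q):
    """Kullback-Leibler divergence between discrete distributions (natural log)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence; lies in [0, log 2]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def jsd_to_cropped_chi2(d2_values, bins=50):
    """JSD between the histogram of d_M^2 values and the cropped chi-squared pdf."""
    l = max(d2_values)  # crop point: maximum d_M^2 in the MCD-solution
    width = l / bins
    counts = [0] * bins
    for d in d2_values:
        counts[min(int(d / width), bins - 1)] += 1
    hist = [c / len(d2_values) for c in counts]
    # Assumed cropping: bin masses of f_1 on [0, l], renormalized by F(l).
    total = chi2_1_cdf(l)
    ref = [(chi2_1_cdf((k + 1) * width) - chi2_1_cdf(k * width)) / total
           for k in range(bins)]
    return jsd(hist, ref)

def is_flat(d2_values, threshold=0.05 * math.log(2)):
    """Apply the threshold 0.05 log 2 chosen in the text."""
    return jsd_to_cropped_chi2(d2_values) < threshold
```

Here `d2_values` stands for the squared Mahalanobis distances of the points in the MCD-solution S; computing S itself with FAST-MCD is outside the scope of this sketch.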
A.4 Lattice vector extraction
In this appendix, we briefly describe how to extract the lattice vectors from the overall repeated region P in Sect. 3.2.2. Before starting, the locations of peak points are regarded as vectors originating from the center of domain \([1, \overline{h}]\times [1, \overline{w}]\). The main idea of the generalized Hough transform (GHT) method in [32] of extracting lattice vectors is to build a parallelogram grid with each pair of linearly independent vectors and score how close the peak points are to the grid (see Algorithm 3). The usage of \(1/\Vert {v_{i}}\Vert \) leads to a high score in L for \(v_{i}\) with small length. However, this method only uses the vectors \(v_{\hat{i}}\) and \(v_{\hat{j}}\) to determine the lattice vectors and does not use other vectors that are constant multiples of them.
We modified the method slightly to use more linearly dependent vectors to obtain the lattice vectors. We consider a straight line passing through the center of domain. As mentioned in Sect. 3.1, a straight line passing through three peak points is judged to have pattern information. If the line passes through more peak points, the pattern is better represented. To find the lattice vectors, we take two straight lines containing the largest number of peak points. If the straight lines have the same number of peak points, we take the line with the smaller distance between the points. The proposed lattice vector extraction algorithm is described in Algorithm 4. The lattice points generated from the lattice vectors extracted by our proposed algorithm are more accurate because the error is reduced by using more linearly dependent vectors. The amount of computation of our proposed extraction algorithm is also less than that of the GHT method.
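The line-based selection described in this paragraph can be sketched as below. This is not Algorithm 4 itself: the grouping of peak points into lines by an angular tolerance, the tolerance value, and the least-squares fit of integer multiples used to turn each selected line into one lattice vector are all assumptions made for illustration, and `lattice_vectors` is a hypothetical name.

```python
import math

def lattice_vectors(peaks, angle_tol=0.02):
    """Estimate two lattice vectors from peak offsets relative to the center.

    peaks: list of (x, y) vectors from the center of domain to each peak point.
    """
    lines = []  # each entry: [direction angle in [0, pi), points on that line]
    for x, y in peaks:
        if x == 0 and y == 0:
            continue  # the center itself carries no direction
        ang = math.atan2(y, x) % math.pi
        for line in lines:
            diff = abs(ang - line[0])
            if min(diff, math.pi - diff) < angle_tol:
                line[1].append((x, y))
                break
        else:
            lines.append([ang, [(x, y)]])

    def spacing(pts):
        # Distance from the center to the nearest peak on the line.
        return min(math.hypot(x, y) for x, y in pts)

    # Most peak points first; ties go to the line with the smaller spacing.
    lines.sort(key=lambda ln: (-len(ln[1]), spacing(ln[1])))

    result = []
    for _, pts in lines[:2]:
        bx, by = min(pts, key=lambda p: math.hypot(*p))
        b2 = bx * bx + by * by
        num_x = num_y = den = 0.0
        for x, y in pts:
            # Signed integer multiple of the shortest vector (assumed model).
            m = round((x * bx + y * by) / b2)
            num_x += m * x
            num_y += m * y
            den += m * m
        # Least-squares estimate of v from the relations p = m * v.
        result.append((num_x / den, num_y / den))
    return result

# Peaks of an ideal lattice spanned by (10, 0) and (3, 8).
peaks = [(10 * i + 3 * j, 8 * j) for i in range(-2, 3) for j in range(-2, 3)]
print(lattice_vectors(peaks))
```

Fitting every collinear peak, instead of keeping only the shortest vector, is one way to realize the error reduction that the text attributes to using more linearly dependent vectors.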
Figure 15a shows the lattice points produced by the two methods on the input image. Blue dots and red dots represent the lattice points generated by the GHT method and our method, respectively. A flattened image obtained with the blue lattice points is shown in Fig. 15b. Near the boundary of the image, traces of structures remain. Figure 15c shows the flattened image obtained by our method. Compared to the GHT method, our method produces more accurate lattice points.
Yu, J., Han, S. & Lee, CO. Defect inspection in semiconductor images using FAST-MCD method and neural network. Int J Adv Manuf Technol 129, 1547–1565 (2023). https://doi.org/10.1007/s00170-023-12287-z
