CN108875451B - Method, device, storage medium and program product for positioning an image
- Publication number
- CN108875451B CN108875451B CN201710325609.XA CN201710325609A CN108875451B CN 108875451 B CN108875451 B CN 108875451B CN 201710325609 A CN201710325609 A CN 201710325609A CN 108875451 B CN108875451 B CN 108875451B
- Authority
- CN
- China
- Prior art keywords
- feature
- region
- template
- picture
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/414—Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Abstract
A method, apparatus, storage medium and program product for locating an image. The method comprises: determining, from an acquired picture, at least one matching area that matches an image template; determining a target area according to the at least one matching area; extracting first feature information from the target area according to a feature matching algorithm; determining at least one effective area in the target area according to preset identification conditions and the distribution characteristics of the first feature information in the target area; and determining a result image according to the distribution characteristics of the feature information in the at least one effective area. This scheme effectively improves both the efficiency of recognizing a specific type of picture and the accuracy of the recognized picture.
Description
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method, an apparatus, a storage medium, and a program product for positioning an image.
Background
At present, traffic systems and parking lots generally locate and identify vehicle license plates to facilitate traffic management or payment management. License plate location identifies plates mainly by color components: the captured picture is converted from the RGB color model to the HSV color model, the blue component is extracted, areas where the blue component exceeds a threshold are set as the foreground, the rest of the picture is treated as the background, and the license plate is then identified within the foreground using prior knowledge about license plates. If identity cards were identified by color components in the same way, the white and black of a captured identity-card picture would, by the nature of the card, very likely also appear in the background, so the identity-card picture could not be identified accurately.
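For illustration, a minimal sketch of this color-component approach, assuming OpenCV and NumPy; the hue/saturation/value bounds below are illustrative assumptions, not values from the patent:

```python
import cv2
import numpy as np

def blue_foreground_mask(bgr_image: np.ndarray) -> np.ndarray:
    """Mark pixels with a strong blue component as foreground.

    A sketch of the prior-art license-plate approach: convert the picture
    to the HSV color model and threshold the blue hue range. The bounds
    below are assumed for the example, not taken from the patent.
    """
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    lower_blue = np.array([100, 80, 46])    # assumed H/S/V lower bounds
    upper_blue = np.array([124, 255, 255])  # assumed H/S/V upper bounds
    # 255 = foreground (blue component above threshold), 0 = background
    return cv2.inRange(hsv, lower_blue, upper_blue)
```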
For identity cards, a method based on edge information is currently the main approach: convert the color picture to a grayscale picture, perform edge detection on it with various operators, search the edge image for edges conforming to the shape of an identity card, and take the area enclosed by edge segments conforming to identity-card characteristics as the identity-card area. However, when an identity-card image is detected against a complex background, this edge-based method is inefficient, and the accuracy of the detected identity-card image is low.
Disclosure of Invention
The application provides a method, an apparatus, a storage medium and a program product for positioning an image, which address the prior art's low efficiency and accuracy.
A first aspect of the present application provides a method for positioning an image, the method comprising:
determining, from an acquired picture, at least one matching area that matches an image template;
determining a target area according to the at least one matching area;
extracting first feature information from the target area according to a feature matching algorithm;
determining at least one effective area in the target area according to preset identification conditions and the distribution characteristics of the first feature information in the target area;
and determining a result image according to the distribution characteristics of the feature information in the at least one effective area.
A second aspect of the present application provides an apparatus for recognizing an image, which has the function of implementing the method for recognizing an image corresponding to the first aspect provided above. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function, and the modules may be software and/or hardware. In one possible design, the apparatus for recognizing an image in the present application may include:
an acquisition module, used for acquiring a picture;
a processing module, used for determining at least one matching area that matches an image template from the picture acquired by the acquisition module;
determining a target area according to the at least one matching area;
extracting first feature information from the target area according to a feature matching algorithm;
determining at least one effective area in the target area according to preset identification conditions and the distribution characteristics of the first feature information in the target area;
and determining a result image according to the distribution characteristics of the feature information in the at least one effective area.
Yet another aspect of the present application provides a computer storage medium comprising instructions which, when executed on a computer, cause the computer to perform the method of the above-described aspects.
Yet another aspect of the present application provides a computer program product containing instructions which, when run on a computer, cause the computer to perform the method of the above-described aspects.
Compared with the prior art, in the scheme provided by the application, at least one matching area that matches the image template is first determined from the picture, which narrows the recognition range. A target area is then determined from those matching areas, first feature information is extracted from the target area according to a feature matching algorithm, and at least one effective area in the target area is determined according to preset identification conditions and the distribution characteristics of the first feature information in the target area, so that a result image can be determined from the at least one effective area. This scheme effectively improves both the efficiency of recognizing a specific type of image and the accuracy of the recognized image.
Drawings
Fig. 1 is a schematic diagram of a network topology of a credit investigation system according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a method for locating an image according to an embodiment of the present invention;
FIG. 3-1 is a diagram illustrating a relationship between an image template and a feature template according to an embodiment of the present invention;
FIG. 3-2 is a schematic diagram of the determination of a target area using a matching area according to an embodiment of the present invention;
FIG. 3-3 is another schematic diagram illustrating the use of matching areas to determine target areas according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of matching 3 matching regions by using an identity card front feature template in the embodiment of the present invention;
FIG. 5 is a diagram illustrating filtering invalid match regions from the 3 match regions according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of matching 1 matching area by using an identity card front feature template in the embodiment of the present invention;
FIG. 7 is another diagram illustrating filtering invalid match regions from the 3 match regions in accordance with an embodiment of the present invention;
FIG. 8 is a schematic diagram illustrating an embodiment of determining SIFT feature points in a back area of an identification card;
FIG. 9 is a diagram illustrating determination of SIFT effective lines in the front area of an identity card according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of an embodiment of the present invention for determining an ID card region according to the determined SIFT valid line;
FIG. 11 is a schematic diagram of an embodiment of an apparatus for positioning an image;
FIG. 12 is a schematic diagram of a configuration of a mobile phone for positioning images according to an embodiment of the present invention;
fig. 13 is a schematic structural diagram of a server for positioning an image according to an embodiment of the present invention.
Detailed Description
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprise," "include," and "have," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules expressly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article, or apparatus, wherein the division of modules presented herein is merely a logical division and may be implemented in a practical application in a different manner, such that a plurality of modules may be combined or integrated into another system or that certain features may be omitted or not implemented, and wherein shown or discussed as coupled or directly coupled or communicatively coupled to each other via interfaces and indirectly coupled or communicatively coupled to each other via electrical or other similar means, all of which are not intended to be limiting in this application. The modules or sub-modules described as separate components may or may not be physically separated, may or may not be physical modules, or may be distributed in a plurality of circuit modules, and some or all of the modules may be selected according to actual needs to achieve the purpose of the present disclosure.
The embodiments of the application provide a method, an apparatus, a storage medium, and a program product for positioning an image, which can be used in credit investigation systems, image recognition systems, and similar fields, for example the credit investigation system or face recognition system of an e-commerce platform. Fig. 1 shows the network topology of a credit investigation system: a user uploads a captured identity-card picture to a server through a mobile phone; the server then identifies the identity card in the uploaded picture and performs credit verification on the identified identity card.
In order to solve the above technical problem, the embodiments of the present application mainly provide the following technical solutions:
the detection technology based on the Haar features locates the face on the front side of the identity card or the national emblem on the back side of the identity card in the picture so as to determine the approximate area of the identity card, the sift features are extracted from the approximate area, and the accurate position of the identity card is determined through the statistical distribution of the sift feature points. Among them, haar features are mainly classified into three categories: and combining the edge characteristic, the linear characteristic, the central characteristic and the diagonal characteristic into a characteristic template. The characteristic template is provided with two rectangles of white and black, and the characteristic value of the template is defined as the sum of the white rectangle pixels minus the sum of the black rectangle pixels. The Haar characteristic value can reflect the gray level change condition of the image. A plurality of Haar characteristic templates can be trained in advance based on Haar characteristics, and the Haar characteristic templates used for the front side and the back side of the identity card are different.
The apparatus for locating an image in the present application may be a server, a terminal device acting as a server, or a client (also called an interactive application) installed on the terminal device.
Referring to fig. 2, a method for positioning an image according to an embodiment of the present application is described below; the method may include:
201. Determine, from the acquired picture, at least one matching area that matches the image template.
The image template is a template obtained by pre-training on various types of pictures and can be used to recognize a specific type of picture within the whole picture. The specific type may be, for example, an identity card, a bank card, a driving license, a license plate, or a human face; the application is not limited in this respect. Considering reuse of the image template and the uncertainty of the picture, in the embodiments of the present application local features may be recognized in local regions separately and then combined to obtain the at least one matching area. Specifically, the image template may include at least two feature templates, each corresponding to one type of specific feature, so that when matching areas in the picture are identified, the same or similar features can be matched against each feature template separately.
On one hand, more matching areas satisfying the features in a feature template can be matched in the picture, and more matching areas reduce missed detections. On the other hand, matching with local feature templates improves matching accuracy and precision and filters out useless matching areas that matching with the whole image template would produce, so combining several feature templates narrows the search range for the positioning features. For example, when only the whole image template is used to match features in a picture, features that overlap each other may cause the specific type of picture containing them to be filtered out directly, so that even if a picture of the type to be recognized is present, it cannot be recognized. Matching local features with several feature templates in combination reduces such cases.
Optionally, in another embodiment, there may be many specific types of pictures to be recognized, and some of them may be locally identical or similar; for example, identity documents such as identity cards and residence permits, various graduation certificates, various professional certificates, and even some legally formatted or standardized documents may share the same or similar features, such as address information or a user portrait. Given the number of picture types to be recognized, the load on the database increases accordingly. To reduce that load and ease database management, feature templates can be reused, raising the reuse rate of some templates. Reused feature templates may be called common feature templates, while templates usable only for a certain specific type of picture may be called dedicated feature templates.
For example, when recognizing a specific type of picture, the type contained in the picture may first be roughly determined from the feature information of the current picture; then the image template may be called directly, or at least two feature templates may be called. The called feature templates may include only dedicated feature templates, only common feature templates, or both; the application does not limit which feature templates are selected. For example, the at least two feature templates may include at least one type of dedicated feature template and at least one common feature template. Fig. 3-1 is a schematic diagram of the relationship between an image template and its feature templates (including dedicated and common feature templates).
A matching area is an area detected by comparing a called feature template with the features in the picture and finally matching the features corresponding to that feature template; a matching area may be matched by the whole image template or by an individual feature template.
As shown in fig. 4, three matching areas (matching areas 1-3) are obtained using the image template; these are candidate matching areas. Matching area 3 can be matched by the image template, but the features in it are not the features to be recognized, so matching area 3 needs to be filtered out.
If the called feature templates include only dedicated feature templates, the server or terminal device may determine the at least one matching area from the picture according to those dedicated feature templates. If they include both dedicated and common feature templates, the server or terminal device may determine, from the acquired picture, matching areas that match the at least one type of dedicated feature template and matching areas that match the at least one common feature template.
202. Determine a target area according to the at least one matching area.
Optionally, in some embodiments, as described in step 201, the at least one matching area includes a first matching area, and the first matching area is matched by a first feature template. Correspondingly, the target area is determined from the at least one matching area as follows:
judge whether a first feature in the first matching area corresponds to a second feature in the first feature template; if so, take the first matching area as a candidate area, and if not, ignore the first matching area. Note that the first matching area here refers to any matching area among the at least one matching area obtained in step 201; the other matching areas are judged in the same way and are not described again. In this way each matching area (or several matching areas) from step 201 can be judged separately, finally yielding at least one candidate area.
The obtained at least one candidate area is then taken as the target area; the target area is the area containing the features in the picture that satisfy the identification conditions.
The position of the first feature in the result image corresponds to the position of the second feature in the image template, and the position of the first candidate area in the result image corresponds to the position of the first feature template in the image template. For example, as shown in fig. 4, in the scenario of recognizing an identity-card picture, three matching areas (matching areas 1-3) are determined using a face feature template and a text feature template; these are candidate matching areas. Matching area 3 can be matched by the image template, but the features in it are not the features to be recognized, so it is filtered out; the resulting identity-card area is shown in fig. 5. Filtering out unqualified matching areas by feature comparison improves the accuracy of image recognition and also reduces the computational complexity to some extent.
Optionally, in some embodiments, the means for determining the target area according to the at least one matching area mainly includes the following:
(1) Judge, from the positions of the matching areas in the picture and from prior knowledge, whether there are areas that can conform to the characteristics of a certain specific type of image, and then take the qualifying matching areas as a target area. For example, the at least one matching area may include a face matching area, a user-information matching area, a national emblem matching area, and an identity-card-number matching area. From prior knowledge it is clearly impossible for the national emblem matching area to form a specific type of picture together with the other three, but the face matching area, user-information matching area, and identity-card-number matching area can form a picture carrying identity information, such as an identity-card picture or a residence-permit picture. To distinguish between the two, the feature points in each matching area can be analyzed further, for example against the identity-card format or the residence-permit format, which is not detailed here.
(2) According to a matching area, delimit around it an area of the specific type that contains the matching area (as shown in fig. 3-2), detect through prior knowledge whether the delimited area contains all or some features of the specific type of picture, and judge whether it satisfies the judgment conditions for that type of picture; if so, take the delimited area as a target area. By analogy, each matching area can delimit an area of the specific type as needed, which is then detected and judged. Of course, two or more matching areas may also jointly delimit an area of the specific type (as shown in fig. 3-3); the application does not limit the specific implementation.
203. Extract first feature information from the target area according to a feature matching algorithm.
Optionally, in some embodiments, the first feature information includes a plurality of feature points, each effective area is the area enclosed between two feature lines, and the distance between the two feature lines is smaller than a preset threshold.
The feature matching algorithm may be the Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), ORB, or the like; the application is not limited. The SIFT algorithm detects and describes local features in an image by searching for extreme points across spatial scales and extracting their position, scale, and rotation invariants. The SIFT algorithm has two main stages: generation of the SIFT feature vectors, and matching of those feature vectors.
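A minimal sketch of step 203 under the SIFT option, assuming OpenCV's standard SIFT API; the crop-box format is an assumption for illustration:

```python
import cv2

def extract_sift_points(picture, target_box):
    """Extract SIFT keypoints and descriptors inside the target area.

    target_box is assumed to be (x0, y0, x1, y1) from the previous step;
    the SIFT call itself is the standard OpenCV API, not a
    patent-specific variant.
    """
    x0, y0, x1, y1 = target_box
    gray = cv2.cvtColor(picture[y0:y1, x0:x1], cv2.COLOR_BGR2GRAY)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    return keypoints, descriptors
```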
204. Determine at least one effective area in the target area according to preset identification conditions and the distribution characteristics of the first feature information in the target area.
205. Determine a result image according to the distribution characteristics of the feature information in the at least one effective area.
Compared with the existing mechanism, in the scheme provided by the application, at least one matching area that matches the image template is first determined from the picture, which narrows the recognition range. A target area is then determined from those matching areas, first feature information is extracted from the target area according to a feature matching algorithm, and at least one effective area in the target area is determined according to preset identification conditions and the distribution characteristics of the first feature information in the target area, so that the specific image can be determined from the at least one effective area. This scheme therefore effectively improves both the efficiency of recognizing the specific image and the accuracy of the recognized image.
Optionally, in some embodiments of the invention, when the SIFT algorithm is adopted, SIFT feature information is extracted from the target area according to the SIFT feature matching algorithm, and at least one SIFT effective area in the target area is determined according to preset identification conditions and the distribution characteristics of the SIFT feature information in the target area, so that the image of the specific type can be determined from the at least one SIFT effective area.
Accordingly, the first feature information may include a plurality of SIFT feature points, each SIFT effective area being the area enclosed between two SIFT feature lines whose distance is smaller than a preset threshold. SIFT features are highly distinctive and information-rich, so SIFT feature points can be matched quickly and accurately in a massive feature database. They are also plentiful (even a few objects generate many SIFT feature vectors), and the algorithm is fast enough to meet real-time requirements, which makes it particularly suitable for processing pictures uploaded simultaneously by many terminal devices. In addition, SIFT feature points are local features of the picture and remain stable under affine transformation of the picture. Therefore, using the SIFT algorithm to recognize specific types of pictures improves both the recognition efficiency and the accuracy of the recognized pictures.
Optionally, in some embodiments of the present invention, the server or terminal device may not know what specific type of picture is contained in the picture; the picture may also contain many features, sources of interference, or several similar specific types of pictures, and even features that do not overlap may interfere with the specific type of picture to be recognized. The number of matching attempts against different feature templates then increases, and some of those matching operations are necessarily useless. To reduce the computational load and improve the accuracy of recognizing a specific type of picture, before determining the at least one matching area that matches the image template from the acquired picture, second feature information can be extracted from the picture, and the image template that matches the second feature information can then be queried.
For example, suppose a captured picture contains an identity-card picture but the terminal device does not know this. The terminal device may call various types of locally stored feature templates to match against it; the identity-card picture in the picture can eventually be matched, but possibly only after several attempts and after traversing many feature templates. With the present scheme, the terminal device can first extract feature information from the picture and then call the various certificate feature templates (for example, calling the social-security-card, residence-permit, and identity-card feature templates in sequence), finally matching the identity-card picture through the identity-card feature template.
Optionally, in some embodiments of the invention, the second feature information includes at least one information element among a graphic, text, a logo, an icon, a point, a line, a gradient between pixels, or a positional relationship between pixels. After the second feature information is extracted from the picture and before the image template matching it is queried, at least one feature may further be extracted from the second feature information; the extracted feature may be a specific type of feature, such as a face, address information, a name, or a logo.
Querying the image template matched with the second feature information then comprises:
querying the image template that matches the at least one feature, so that when a suitable image template is looked up, the candidate range can be narrowed further and the number of traversals reduced.
Optionally, in some embodiments of the invention, considering that the acquired picture may be of low quality, to improve the efficiency of recognizing specific types of pictures, the picture may be preprocessed before the second feature information is extracted from it. The preprocessing includes at least one of image smoothing, image transformation, image enhancement, and image restoration.
Image enhancement turns an originally unclear image into a clear one, or emphasizes certain features of interest while suppressing those not of interest, improving image quality and enriching information content to strengthen interpretation and recognition; it mainly comprises frequency-domain and spatial-domain methods.
Image restoration uses prior knowledge of the degradation process to restore the original appearance of a degraded image. For remote-sensing image data it can correct atmospheric effects, perform geometric correction, and correct scan-line dropouts and misalignment caused by equipment, reconstructing the degraded image into an ideal image close to, or entirely free of, degradation.
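A minimal preprocessing sketch covering two of the listed modes, image smoothing and image enhancement; the Gaussian kernel size and the use of histogram equalization are illustrative choices, not prescribed by the patent:

```python
import cv2

def preprocess(bgr_image):
    """Smooth the picture, then enhance contrast before feature extraction."""
    smoothed = cv2.GaussianBlur(bgr_image, (5, 5), 0)  # image smoothing
    gray = cv2.cvtColor(smoothed, cv2.COLOR_BGR2GRAY)
    enhanced = cv2.equalizeHist(gray)                  # image enhancement
    return enhanced
```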
For ease of understanding, take as an example a server recognizing the identity-card picture in a photo uploaded by a user.
First, it may be determined through prior knowledge (for example, the second feature information extracted in the foregoing embodiments) whether the photo contains front or back image information of an identity card. Once it is determined that the uploaded photo contains the front or back of an identity card, the following identity-card image recognition operations can be performed:
the method can be used for positioning the face on the front side of the identity card or the national emblem on the back side of the identity card in the picture based on the Haar feature detection technology so as to determine the approximate region of the identity card, and extracting SIFT features in the approximate region so as to determine the accurate position of the identity card through the statistical distribution of SIFT feature points. Among them, haar features are mainly classified into three categories: and combining the edge characteristic, the linear characteristic, the central characteristic and the diagonal characteristic into a characteristic template. The feature template has two rectangles of white and black, and the feature value of the template is defined as the sum of the white rectangular pixel and the subtracted black rectangular pixel. The Haar characteristic value can reflect the gray level change condition of the image. A plurality of Haar characteristic templates can be trained in advance based on Haar characteristics, and the Haar characteristic templates used on the front side and the back side of the identity card are different.
The following steps (1) to (7) describe how the identity-card picture is accurately recognized in the photo:
(1) Detect areas in the picture that match the feature templates, using the trained Haar feature templates.
Specifically, a face feature template may be used to detect matching areas in the photo that match it, possibly yielding two or more matching areas for faces. Similarly, a national emblem feature template may be used to detect matching areas for the national emblem, again possibly yielding two or more. A plurality of matching areas can thus be detected; a schematic of this can be seen in the identity-card front detection shown in fig. 6, where each dashed box represents a matching area. Prior knowledge establishes that matching area 3 in fig. 6 is not the face on a real identity-card front, so matching area 3 can be ignored, finally yielding the areas shown in fig. 7.
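A sketch of this detection step, assuming the trained Haar templates are packaged as OpenCV cascade classifiers; the cascade file names are placeholders, and the detection parameters are common defaults rather than values from the patent:

```python
import cv2

def detect_matching_areas(gray, cascade_path):
    """Return candidate matching areas (x, y, w, h) found by a trained
    Haar cascade, e.g. a face cascade for the card front or a national
    emblem cascade for the back."""
    cascade = cv2.CascadeClassifier(cascade_path)
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

photo = cv2.imread("photo.jpg")  # hypothetical input file
gray_photo = cv2.cvtColor(photo, cv2.COLOR_BGR2GRAY)
face_areas = detect_matching_areas(gray_photo, "face_front.xml")     # placeholder name
emblem_areas = detect_matching_areas(gray_photo, "emblem_back.xml")  # placeholder name
```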
(2) If the picture contains the front of the identity card, face matching areas whose upper-left corner lies at less than 1/2 of the picture width are excluded; if the picture contains the back of the identity card, the national emblem matching area 4, whose upper-left corner lies beyond 1/3 of the picture width, is ignored. The effect is shown in fig. 7.
(3) Determine a target area according to the obtained matching areas.
Let the length H be the vertical extent of the matching area and the width W its horizontal extent; the size of the target area is set according to the length and width of the matching area, and both the matching area and the target area are rectangular. If the photo contains the front of the identity card, the coordinates of the target area's upper-left corner are computed by formula (1) and those of its lower-right corner by formula (2):
(max(0, int(X - 3.58*W)), max(0, int(Y - 0.48*H))) (1)
(min(WW, int(X + 1.9*W)), min(HH, int(Y + 2.5*H))) (2)
If the photo contains the back of the identity card, the upper-left corner of the target area is computed by formula (3) and the lower-right corner by formula (4):
(max(0, int(X - 0.8*W)), max(0, int(Y - 0.4*H))) (3)
(min(WW, int(X + 6*W)), min(HH, int(Y + 3.6*H))) (4)
The picture's coordinate system is defined with the picture's upper-left corner as the origin, rightward as the positive horizontal axis, and downward as the positive vertical axis. X and Y are the abscissa and ordinate of the matching area's upper-left corner, WW is the maximum width of the picture, HH is its maximum length, the max() function returns the largest of its arguments, the min() function the smallest, and the int() function the integer closest to its argument.
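Formulas (1) to (4) translate directly into code; a minimal sketch (the function and parameter names are ours, not the patent's):

```python
def nearest_int(v: float) -> int:
    # the patent defines int() as the integer closest to the argument
    return int(round(v))

def target_area(X: int, Y: int, W: int, H: int, WW: int, HH: int, front: bool):
    """Corners of the target area derived from a matching area with
    upper-left corner (X, Y), width W and length H, inside a picture of
    maximum width WW and maximum length HH; equations (1)-(2) apply to
    the card front, (3)-(4) to the back."""
    if front:
        top_left = (max(0, nearest_int(X - 3.58 * W)), max(0, nearest_int(Y - 0.48 * H)))
        bottom_right = (min(WW, nearest_int(X + 1.9 * W)), min(HH, nearest_int(Y + 2.5 * H)))
    else:
        top_left = (max(0, nearest_int(X - 0.8 * W)), max(0, nearest_int(Y - 0.4 * H)))
        bottom_right = (min(WW, nearest_int(X + 6 * W)), min(HH, nearest_int(Y + 3.6 * H)))
    return top_left, bottom_right
```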
(4) Extract the SIFT features in the target area.
SIFT feature points tend to occur where edges change sharply, and they are unaffected by scale changes; on the identity-card area they are therefore generally found in the text areas, with few on the edges and in the photo area. In this step, referring to fig. 8, the SIFT feature points in the identity-card back area can be detected; the dashed box in fig. 8 marks the position of the target area in the picture, and the hollow points within it are the SIFT feature points in the target area. Detection of the SIFT feature points in the identity-card front area follows the same principle as in fig. 8 and is not described again.
(5) Determine the SIFT effective areas according to the identity-card format and the distribution characteristics of the SIFT feature points in the photo.
In step (3) above, the present solution has already determined a target area, and a SIFT effective area may be defined as follows:
if the vertical distance between two SIFT effective lines does not exceed two lines, the area between each pair of SIFT effective lines is called a SIFT effective area; if the vertical distance exceeds two lines, the features of those two effective lines can be considered insignificant and ignored. Because the format of the identity card is fixed, the card's edge layout and the typesetting of its text and graphics can be used to filter out unsuitable target areas, and the area formed by the qualifying target areas is finally taken as the identity-card area. A schematic of the detected identity-card area can be seen in fig. 9, where each solid line represents a SIFT effective line. The identity-card area judgment rules are as follows:
if the picture contains the picture of the front face of the identity card, a SIFT effective region (corresponding to an address region) with the length of 0.2-0.6H is arranged at a position 0.7-1.6H below the upper left corner of the face matching region, and a SIFT effective region with the length of 0.2-0.6H is arranged at a position 1.7-2.3W below the lower left corner of the face matching region, wherein the SIFT region is a region consisting of a plurality of SIFT effective lines and regions among the SIFT effective lines.
If the picture contains the back of the identity card, there is a SIFT effective area of 0.3-0.5H (the text segment corresponding to "People's Republic of China") and a SIFT effective area of 0.4-0.6H (the text segment corresponding to "Resident Identity Card") at a position 0.2-0.5W to the right of the national emblem matching area, and a SIFT effective area of 0.1-0.2H at a position 1.0-1.3H below the lower-right corner of the national emblem matching area.
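A sketch of how SIFT effective lines and the areas between them might be derived from the row distribution of the feature points; the patent fixes only the spacing rule, so the row height and per-row point threshold here are assumptions:

```python
def sift_effective_areas(keypoints, row_height=8, min_points_per_row=3, max_gap_rows=2):
    """Group SIFT keypoints (e.g. cv2.KeyPoint objects, read via kp.pt) into
    horizontal rows of row_height pixels; rows with at least
    min_points_per_row points are treated as SIFT effective lines.
    Effective lines whose vertical gap is at most max_gap_rows rows are
    merged into one effective area; larger gaps are ignored, per the rule
    above. row_height and min_points_per_row are assumed values."""
    counts = {}
    for kp in keypoints:
        row = int(kp.pt[1]) // row_height
        counts[row] = counts.get(row, 0) + 1
    effective_rows = sorted(r for r, c in counts.items() if c >= min_points_per_row)

    areas = []  # list of (first_row, last_row), in row units
    for row in effective_rows:
        if areas and row - areas[-1][1] <= max_gap_rows:
            areas[-1] = (areas[-1][0], row)  # extend the current effective area
        else:
            areas.append((row, row))         # start a new effective area
    # convert row indices back to pixel coordinates (top, bottom)
    return [(r0 * row_height, (r1 + 1) * row_height) for r0, r1 in areas]
```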
(6) According to prior knowledge and the judgment rules in step (5), determine the SIFT effective areas that conform to the identity-card format as the identity-card area; if several SIFT effective areas satisfy the identity-card format, select the target area with the largest area as the identity-card area. Fig. 10 is a schematic diagram of the final identity-card front area; the back area is determined in a similar manner.
(7) If no SIFT effective area satisfies the conditions, return information prompting that the photo is invalid.
In this way the identity-card area in the picture (comprising the front and back areas) can be located accurately and quickly, providing clean input for subsequent identity-card rectification and identity-card OCR. Because prior knowledge such as the identity-card edges and text typesetting is used jointly to select the SIFT effective areas, the scheme provided by the application improves both the speed and the accuracy of identity-card picture recognition.
It should be noted that the present application can recognize pictures containing both the front and the back of an identity card, pictures containing only the front, or pictures containing only the back; whether the picture to be recognized contains both sides is not limited. Recognition of the front in a front-only picture proceeds in the same or a similar way as recognition of the back in a back-only picture; only the types of image templates used differ.
Any features of the matching areas, image templates, feature templates, and the like described with respect to figs. 1 to 11 also apply to figs. 12 to 13 below; refer to the descriptions of figs. 1 to 11, which are not repeated.
A method for positioning an image in an embodiment of the present application is described above, and an apparatus for performing the method for positioning an image is described below.
Referring to fig. 11, an apparatus 110 for positioning an image in an embodiment of the present application is described below; the apparatus 110 includes:
an obtaining module 1101 is configured to obtain a picture.
A processing module 1102, configured to determine at least one matching region matching an image template from the picture acquired by the acquisition module 1101;
determining a target area according to the at least one matching area;
extracting first feature information in the target area according to a feature matching algorithm;
determining at least one effective area in the target area according to preset identification conditions and distribution characteristics of the first characteristic information in the target area;
and determining a result image according to the distribution characteristics of the characteristic information in the at least one effective area.
Compared with the existing mechanism, in the scheme provided by the application, the processing module 1102 first determines at least one matching area that matches the image template from the picture, which narrows the recognition range; it then determines a target area from those matching areas, extracts first feature information from the target area according to a feature matching algorithm, and determines at least one effective area in the target area according to preset identification conditions and the distribution characteristics of the first feature information in the target area, so that an image of the specific type can be determined from the at least one effective area. This scheme therefore effectively improves both the efficiency of recognizing the specific image and the accuracy of the recognized image.
Optionally, in some embodiments of the invention, before determining, from the acquired picture, the at least one matching area that matches the image template, the processing module 1102 is further configured to:
extracting second feature information of the picture from the picture;
and querying the image template matched with the second feature information.
Optionally, in some embodiments of the invention, before extracting the second feature information of the picture from the picture, the processing module 1102 is further configured to:
preprocessing the picture, where the preprocessing includes at least one of image smoothing, image transformation, image enhancement, and image restoration.
Optionally, in some embodiments of the invention, the second feature information includes at least one information element among a graphic, text, a logo, an icon, a point, a line, a gradient between pixels, or a positional relationship between pixels; after extracting the second feature information of a picture from the picture and before querying the image template matched with the second feature information, the processing module 1102 is further configured to:
extracting at least one feature from the second feature information;
the processing module is specifically configured to:
querying the image template matching the at least one feature.
Optionally, in some embodiments of the invention, the image template includes at least two feature templates, each corresponding to one type of specific feature; the at least two feature templates include at least one type of dedicated feature template and at least one common feature template, and the processing module is specifically configured to:
determine, from the picture, matching areas that match the at least one type of dedicated feature template and matching areas that match the at least one common feature template.
Optionally, in some embodiments of the present invention, the at least one matching area includes a first matching area, and the processing module 1102 is specifically configured to:
judging whether a first feature in the first matching area corresponds to a second feature in the first feature template; if so, taking the first matching area as a candidate area, and if not, ignoring the first matching area;
taking the obtained at least one candidate region as the target region;
the position of the first feature in the result image corresponds to the position of the second feature in the image template, and the position of the first candidate region in the result image corresponds to the position of the first feature template in the image template.
Optionally, in some embodiments of the invention, the first feature information includes a plurality of feature points, each effective area is the area enclosed between two feature lines, and the distance between the two feature lines is smaller than a preset threshold.
The apparatus in the embodiment of the present invention is described above from the perspective of modularized functional entities; it is described below from the perspective of hardware processing.
An embodiment of the present invention further provides a terminal device, specifically the terminal device described in the method for recognizing an image. As shown in fig. 12, for convenience of description only the parts related to the embodiment of the present invention are shown; for specific technical details not disclosed, refer to the method part of the embodiments of the present invention. The terminal device in this application may be a device that provides voice and/or data connectivity to a user, a handheld device with wireless connection capability, or another processing device connected to a wireless modem. A wireless terminal may communicate with one or more core networks via a Radio Access Network (RAN), and may be a mobile terminal, such as a mobile phone (also called a "cellular" phone) or a computer with a mobile terminal, for example a portable, pocket-sized, handheld, computer-embedded, or vehicle-mounted mobile device that exchanges voice and/or data with the RAN. Examples of such devices include Personal Communication Service (PCS) phones, cordless phones, Session Initiation Protocol (SIP) phones, Wireless Local Loop (WLL) stations, and Personal Digital Assistants (PDAs). A wireless terminal may also be called a system, a Subscriber Unit, a Subscriber Station, a Mobile Station, a Remote Station, an Access Point, a Remote Terminal, an Access Terminal, a User Terminal, a Terminal Device, a User Agent, a User Device, or User Equipment; examples include a mobile phone, a point-of-sale (POS) terminal, and a vehicle-mounted computer. The following takes a mobile phone as the terminal device:
fig. 12 is a block diagram showing a partial structure of a cellular phone related to a terminal provided in an embodiment of the present invention. Referring to fig. 12, the cellular phone includes: radio Frequency (RF) circuit 1212, memory 1220, input unit 1230, display unit 1240, sensor 1250, audio circuit 1260, wireless fidelity (WiFi) module 1270, processor 1280, and power supply 1290. Those skilled in the art will appreciate that the handset configuration shown in fig. 12 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following describes each component of the mobile phone in detail with reference to fig. 12:
The memory 1220 may be used to store software programs and modules, and the processor 1280 executes various functional applications and data processing of the mobile phone by operating the software programs and modules stored in the memory 1220. The memory 1220 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, etc. Further, the memory 1220 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 1230 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile phone. Specifically, the input unit 1230 may include a touch panel 1231 and other input devices 1232. The touch panel 1231, also referred to as a touch screen, can collect touch operations of a user on or near it (for example, operations performed on or near the touch panel 1231 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 1231 may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch position, detects the signal produced by the touch operation, and passes the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch-point coordinates, and sends them to the processor 1280, and it can also receive and execute commands from the processor 1280. In addition, the touch panel 1231 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 1231, the input unit 1230 may include other input devices 1232, which may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
The display unit 1240 may be used to display information input by the user, information provided to the user, and the various menus of the mobile phone. The display unit 1240 may include a display panel 1241, which may optionally be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like. Further, the touch panel 1231 may cover the display panel 1241; when the touch panel 1231 detects a touch operation on or near it, it transmits the operation to the processor 1280 to determine the type of the touch event, and the processor 1280 then provides a corresponding visual output on the display panel 1241 according to that type. Although in fig. 12 the touch panel 1231 and the display panel 1241 implement the input and output functions of the phone as two independent components, in some embodiments the touch panel 1231 and the display panel 1241 may be integrated to implement the input and output functions.
The mobile phone may also include at least one sensor 1250, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor may adjust the brightness of the display panel 1241 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 1241 and/or the backlight when the phone is moved to the ear. As one kind of motion sensor, an accelerometer can detect the magnitude of acceleration in each direction (generally three axes) and, at rest, the magnitude and direction of gravity; it can be used for applications that recognize the attitude of the phone (such as landscape/portrait switching, related games, and magnetometer attitude calibration) and for vibration-recognition functions (such as a pedometer or tap detection). Other sensors that can be configured on the phone, such as a gyroscope, barometer, hygrometer, thermometer, and infrared sensor, are not described further here.
WiFi is a short-range wireless transmission technology. Through the WiFi module 1270, the mobile phone can help the user send and receive e-mails, browse web pages, and access streaming media, providing wireless broadband Internet access. Although fig. 12 shows the WiFi module 1270, it is understood that it is not an essential component of the handset and may be omitted as needed without changing the essence of the invention.
The processor 1280 is the control center of the mobile phone. It connects the various parts of the entire phone using various interfaces and lines, and performs the phone's functions and processes data by running or executing the software programs and/or modules stored in the memory 1220 and calling the data stored in the memory 1220, thereby monitoring the phone as a whole. Optionally, the processor 1280 may include one or more processing units. Preferably, the processor 1280 may integrate an application processor, which mainly handles the operating system, user interfaces, and application programs, and a modem processor, which mainly handles wireless communication. It can be understood that the modem processor may alternatively not be integrated into the processor 1280.
The mobile phone further includes a power supply 1290 (e.g., a battery) for supplying power to the various components. Preferably, the power supply may be logically connected to the processor 1280 through a power management system, thereby managing charging, discharging, and power consumption through the power management system.
Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which will not be described herein.
In the embodiment of the present invention, the processor 1280 included in the terminal device also controls execution of the method flow performed by the above-described apparatus for recognizing an image.
An embodiment of the present invention further provides a server. As shown in fig. 13, the server 1300 may include one or more Central Processing Units (CPUs) 1322 (e.g., one or more processors), a memory 1332, and one or more storage media 1330 (e.g., one or more mass storage devices) storing applications 1342 or data 1344. The memory 1332 and the storage medium 1330 may be transitory or persistent storage. The program stored on the storage medium 1330 may include one or more modules (not shown), each of which may include a sequence of instruction operations on the server. Still further, the central processing unit 1322 may be configured to communicate with the storage medium 1330 and to execute, on the server 1300, the sequence of instruction operations stored in the storage medium 1330.
The server 1300 may also include one or more power supplies 1326, one or more wired or wireless network interfaces 1350, one or more input/output interfaces 1358, and/or one or more operating systems 1341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
The steps performed by the apparatus for recognizing an image in the above-described embodiment of the invention may be based on the server configuration shown in fig. 13.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the system, the apparatus, and the module described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only one kind of logical functional division, and other divisions are possible in practice; for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be implemented through interfaces, and the indirect couplings or communication connections between devices or modules may be electrical, mechanical, or in other forms.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware, or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When software is used, the implementation may take the form, in whole or in part, of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or a wireless manner (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium accessible by a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a Solid State Disk (SSD)).
The technical solutions provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the above descriptions are only intended to help understand the method and core ideas of the present application. Meanwhile, a person skilled in the art may, according to the ideas of the present application, make changes to the specific implementations and the application scope. In summary, the content of this specification should not be construed as limiting the present application.
Claims (14)
1. A method of locating an image, the method comprising:
determining at least one matching region matched with the image template from the acquired picture;
determining a target region according to the at least one matching region;
extracting first feature information in the target region according to a Scale Invariant Feature Transform (SIFT) algorithm; the first feature information comprises a plurality of SIFT feature points;
determining at least one SIFT effective region in the target region according to a preset identification condition and the distribution characteristics of the first feature information in the target region; each SIFT effective region is a region included between two SIFT effective rows, and the distance between the two SIFT effective rows is smaller than a preset threshold value; each SIFT effective row comprises more than two SIFT feature points;
and determining a result image according to the distribution characteristics of the feature information in the at least one SIFT effective region.
2. The method of claim 1, wherein prior to said determining at least one matching region matched with the image template from the acquired picture, the method further comprises:
extracting second feature information of the picture from the picture;
and querying the image template matched with the second feature information.
3. The method according to claim 2, wherein before the extracting of the second feature information of the picture from the picture, the method further comprises:
preprocessing the picture, wherein the preprocessing comprises at least one of image smoothing, image transform-domain processing, image enhancement, or image restoration.
4. The method according to claim 2, wherein the second feature information includes at least one information element of a figure, text, a logo, an icon, a point, a line, a gradient between pixel points, or a positional relationship between pixel points, and after the extracting of the second feature information of the picture from the picture and before the querying of the image template matching the second feature information, the method further includes:
extracting at least one feature from the second feature information;
the querying the image template matched with the second feature information comprises:
querying the image template matching the at least one feature.
5. The method according to any one of claims 1 to 4, wherein the image template comprises at least two feature templates, each feature template corresponds to one type of feature, the at least two feature templates comprise at least one specific feature template and at least one common feature template, and the determining at least one matching region matched with the image template from the acquired picture comprises:
determining a matching region from the picture that matches the at least one specific feature template, and determining a matching region from the picture that matches the at least one common feature template.
6. The method of claim 5, wherein the at least one matching region comprises a first matching region that matches a first feature template, and wherein determining a target region from the at least one matching region comprises:
judging, according to a second feature in the first feature template, whether a first feature in the first matching region corresponds to the second feature; if so, taking the first matching region as a candidate region, and if not, ignoring the first matching region;
taking the obtained at least one candidate region as the target region;
wherein the position of the first feature in the result image corresponds to the position of the second feature in the image template, and the position of the first candidate region in the result image corresponds to the position of the first feature template in the image template.
7. An apparatus for localizing images, the apparatus comprising:
the acquisition module is used for acquiring pictures;
the processing module is used for determining at least one matching region matched with the image template from the picture acquired by the acquisition module, and determining a target region according to the at least one matching region;
extracting first feature information in the target region according to a Scale Invariant Feature Transform (SIFT) algorithm; the first feature information comprises a plurality of SIFT feature points;
determining at least one SIFT effective region in the target region according to a preset identification condition and the distribution characteristics of the first feature information in the target region; each SIFT effective region is a region included between two SIFT effective rows, and the distance between the two SIFT effective rows is smaller than a preset threshold value; each SIFT effective row comprises more than two SIFT feature points;
and determining a result image according to the distribution characteristics of the feature information in the at least one SIFT effective region.
8. The apparatus of claim 7, wherein the processing module, prior to determining the at least one matching region matched with the image template from the acquired picture, is further configured to:
extract second feature information of the picture from the picture;
and query the image template matched with the second feature information.
9. The apparatus of claim 8, wherein the processing module, prior to extracting the second feature information of the picture from the picture, is further configured to:
preprocess the picture, wherein the preprocessing comprises at least one of image smoothing, image transform-domain processing, image enhancement, or image restoration.
10. The apparatus according to claim 8, wherein the second feature information includes at least one information element of a figure, text, a logo, an icon, a point, a line, a gradient between pixel points, or a positional relationship between pixel points, and the processing module, after extracting the second feature information of the picture from the picture and before querying the image template matching the second feature information, is further configured to:
extracting at least one feature from the second feature information;
the processing module is specifically configured to:
querying the image template matching the at least one feature.
11. The apparatus according to any of claims 7-10, wherein the image template comprises at least two feature templates, each feature template corresponding to one type of feature, the at least two feature templates comprising at least one specific feature template and at least one common feature template, the processing module being specifically configured to:
determine a matching region from the picture that matches the at least one specific feature template, and determine a matching region from the picture that matches the at least one common feature template.
12. The apparatus of claim 11, wherein the at least one matching region comprises a first matching region, and wherein the processing module is specifically configured to:
judge, according to a second feature in the first feature template, whether a first feature in the first matching region corresponds to the second feature; if so, take the first matching region as a candidate region, and if not, ignore the first matching region;
take the obtained at least one candidate region as the target region;
wherein the position of the first feature in the result image corresponds to the position of the second feature in the image template, and the position of the first candidate region in the result image corresponds to the position of the first feature template in the image template.
13. A computer storage medium comprising instructions which, when executed on a computer, cause the computer to perform the method of any one of claims 1 to 6.
14. A terminal device, comprising a memory, a processor;
wherein the memory is used for storing programs;
the processor, when executing the program in the memory, is adapted to implement the method of any of claims 1-6.
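To make the claimed flow concrete, the following sketch strings together the core steps of claim 1: locating a matching region with an image template, extracting SIFT feature points in the target region, and grouping those points into "SIFT effective rows" and "SIFT effective regions". The patent discloses no code, so everything below is an illustrative assumption: the OpenCV calls stand in for the unspecified matching and SIFT implementations, and the thresholds and per-pixel-row bucketing are placeholder choices, not the claimed method.

```python
# Hedged sketch of the claim-1 flow; all constants and API choices are assumptions.
import cv2

MIN_POINTS_PER_ROW = 2   # assumed: an "effective row" holds more than two SIFT points
MAX_ROW_GAP = 40         # assumed: preset threshold (pixels) between two effective rows

def find_matching_region(picture, template, score_threshold=0.8):
    """Determine a region of `picture` matching `template`
    (normalized cross-correlation as a stand-in for the claimed matching)."""
    scores = cv2.matchTemplate(picture, template, cv2.TM_CCOEFF_NORMED)
    _, best_score, _, best_loc = cv2.minMaxLoc(scores)
    if best_score < score_threshold:
        return None
    h, w = template.shape[:2]
    x, y = best_loc
    return picture[y:y + h, x:x + w]

def sift_effective_regions(target_region):
    """Extract SIFT feature points, keep rows holding more than
    MIN_POINTS_PER_ROW points ("effective rows"), and return the bands
    between consecutive effective rows closer than MAX_ROW_GAP
    ("effective regions")."""
    sift = cv2.SIFT_create()
    keypoints = sift.detect(target_region, None)

    # Bucket feature points by their (rounded) row coordinate.
    points_per_row = {}
    for kp in keypoints:
        points_per_row.setdefault(int(round(kp.pt[1])), []).append(kp)

    effective_rows = sorted(
        y for y, pts in points_per_row.items() if len(pts) > MIN_POINTS_PER_ROW
    )

    regions = []
    for y0, y1 in zip(effective_rows, effective_rows[1:]):
        if y1 - y0 < MAX_ROW_GAP:
            regions.append(target_region[y0:y1 + 1, :])
    return regions

if __name__ == "__main__":
    # Hypothetical input files, for illustration only.
    picture = cv2.imread("picture.png", cv2.IMREAD_GRAYSCALE)
    template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)
    region = find_matching_region(picture, template)
    if region is not None:
        print(f"found {len(sift_effective_regions(region))} SIFT effective region(s)")
```

In a real implementation, the row bucketing would likely tolerate a few pixels of vertical jitter (for example by quantizing the row coordinate into small bins) instead of requiring feature points to share an exact pixel row.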
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710325609.XA CN108875451B (en) | 2017-05-10 | 2017-05-10 | Method, device, storage medium and program product for positioning image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108875451A CN108875451A (en) | 2018-11-23 |
CN108875451B true CN108875451B (en) | 2023-04-07 |
Family
ID=64287926
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710325609.XA Active CN108875451B (en) | 2017-05-10 | 2017-05-10 | Method, device, storage medium and program product for positioning image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108875451B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019070240A1 (en) | 2017-10-03 | 2019-04-11 | Visa International Service Association | System, method, and computer program product for authenticating identification documents |
WO2020107267A1 (en) * | 2018-11-28 | 2020-06-04 | 华为技术有限公司 | Image feature point matching method and device |
CN111243015B (en) * | 2018-11-29 | 2023-05-12 | 合肥泰禾智能科技集团股份有限公司 | Container position detection method and device |
CN111401110A (en) * | 2019-01-03 | 2020-07-10 | 百度在线网络技术(北京)有限公司 | Method and device for extracting information |
CN109933788B (en) * | 2019-02-14 | 2023-05-23 | 北京百度网讯科技有限公司 | Type determining method, device, equipment and medium |
CN110427929B (en) * | 2019-07-19 | 2023-04-28 | 易诚高科(大连)科技有限公司 | APP interface mode identification method based on multilevel element fusion |
CN118053165A (en) * | 2020-02-10 | 2024-05-17 | 支付宝实验室(新加坡)有限公司 | Certificate type recognition template generation method, certificate recognition method and device |
CN112558699B (en) * | 2020-12-23 | 2024-04-26 | 联想(北京)有限公司 | Touch control method, device, equipment and computer readable storage medium |
CN112569591B (en) * | 2021-03-01 | 2021-05-18 | 腾讯科技(深圳)有限公司 | Data processing method, device and equipment and readable storage medium |
CN112835807B (en) * | 2021-03-02 | 2022-05-31 | 网易(杭州)网络有限公司 | Interface identification method and device, electronic equipment and storage medium |
CN113076379B (en) * | 2021-04-27 | 2022-11-29 | 上海德衡数据科技有限公司 | Method and system for distinguishing element number areas based on digital ICD |
CN112966164A (en) * | 2021-05-19 | 2021-06-15 | 航天宏图信息技术股份有限公司 | Rapid positioning method and device for extracting mass remote sensing image target |
CN113808209B (en) * | 2021-09-23 | 2024-01-19 | 深圳市优必选科技股份有限公司 | Positioning identification method, positioning identification device, computer equipment and readable storage medium |
CN118691848A (en) * | 2022-10-31 | 2024-09-24 | 荣耀终端有限公司 | Repeated texture recognition method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015058600A1 (en) * | 2013-10-22 | 2015-04-30 | Tencent Technology (Shenzhen) Company Limited | Methods and devices for querying and obtaining user identification |
WO2016025071A1 (en) * | 2014-08-14 | 2016-02-18 | Alibaba Group Holding Limited | Method and system for verifying user identity using card features |
US9349076B1 (en) * | 2013-12-20 | 2016-05-24 | Amazon Technologies, Inc. | Template-based target object detection in an image |
WO2016107259A1 (en) * | 2014-12-31 | 2016-07-07 | 努比亚技术有限公司 | Image processing method and device therefor |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103839058A (en) * | 2012-11-21 | 2014-06-04 | 方正国际软件(北京)有限公司 | Information locating method for document image based on standard template |
CN104680130A (en) * | 2015-01-09 | 2015-06-03 | 安徽清新互联信息科技有限公司 | Chinese character recognition method for identification cards |
CN106250938B (en) * | 2016-07-19 | 2021-09-10 | 易视腾科技股份有限公司 | Target tracking method, augmented reality method and device thereof |
2017-05-10: Application CN201710325609.XA filed in China (CN); granted as CN108875451B, status Active.
Non-Patent Citations (2)
Title |
---|
SIFT-based detection of image copy-cover tampering; Li Shenghong et al.; Journal of PLA University of Science and Technology (Natural Science Edition), No. 4, pp. 37-41 *
Research on a video object retrieval method based on color and SIFT features; Li Wei et al.; Science Technology and Engineering, No. 29, pp. 265-269 *
Also Published As
Publication number | Publication date |
---|---|
CN108875451A (en) | 2018-11-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108875451B (en) | Method, device, storage medium and program product for positioning image | |
CN106296617B (en) | The processing method and processing device of facial image | |
CN107038681B (en) | Image blurring method and device, computer readable storage medium and computer device | |
CN108022274B (en) | Image processing method, image processing device, computer equipment and computer readable storage medium | |
CN111464716B (en) | Certificate scanning method, device, equipment and storage medium | |
CN108259758B (en) | Image processing method, image processing apparatus, storage medium, and electronic device | |
CN109002787B (en) | Image processing method and device, storage medium and electronic equipment | |
CN104143078A (en) | Living body face recognition method and device and equipment | |
CN108346175B (en) | Face image restoration method, device and storage medium | |
WO2019105457A1 (en) | Image processing method, computer device and computer readable storage medium | |
CN107748856A (en) | Two-dimensional code identification method, terminal and computer-readable recording medium | |
CN108269220B (en) | Method and device for positioning digital watermark | |
WO2015003606A1 (en) | Method and apparatus for recognizing pornographic image | |
CN108198159A (en) | A kind of image processing method, mobile terminal and computer readable storage medium | |
CN110796157A (en) | Image difference identification method and device and storage medium | |
CN110431563B (en) | Method and device for correcting image | |
CN104200249A (en) | Automatic clothes matching method, device and system | |
CN110378276B (en) | Vehicle state acquisition method, device, equipment and storage medium | |
CN105678242A (en) | Focusing method and apparatus in the mode of holding certificate in hands | |
CN110765924A (en) | Living body detection method and device and computer-readable storage medium | |
CN107749046A (en) | A kind of image processing method and mobile terminal | |
CN113656627B (en) | Skin color segmentation method and device, electronic equipment and storage medium | |
CN112541489A (en) | Image detection method and device, mobile terminal and storage medium | |
CN110717486B (en) | Text detection method and device, electronic equipment and storage medium | |
CN113780291B (en) | Image processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TG01 | Patent term adjustment |