CN103593458A - Mass image searching system based on color features and inverted indexes - Google Patents
Mass image searching system based on color features and inverted indexes
- Publication number
- CN103593458A (application number CN201310601630.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- color
- images
- mass
- colors
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5838—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
Landscapes
- Engineering & Computer Science (AREA)
- Library & Information Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
- Processing Or Creating Images (AREA)
Abstract
The invention provides a mass image searching system based on color features and inverted indexes. The system works in the following steps. First, the CIE 1976 L*a*b* color space (Lab for short), which has good uniformity, is selected, and K-means clustering is applied to it to obtain n colors. Second, the pixels of each image to be searched are mapped to these colors according to the principle of minimum color difference, yielding dimensionality-reduced images. Third, the images obtained in the second step are divided into grids, and the dominant color of each grid cell is taken as its representative color, so that each image is described by a set of representative colors. Fourth, the representative colors obtained in the third step are encoded with a user-defined coding scheme, which yields a text-like document ("class text") made up of character codes; this document is uploaded to an inverted index server, which completes index building for the mass images and thereby enables image search.
Description
Field of the invention
The invention belongs to the technical field of pattern recognition and information processing. It relates to the processing of massive image collections on e-commerce platforms, and in particular to a method for massive image retrieval based on color features and an inverted index.
Background
The e-commerce service industry is in the golden age of its development. By 2015, the revenue of China's e-commerce service sector is expected to exceed one trillion yuan, at which point China will have the largest and leading e-commerce service industry in the world. As e-commerce flourishes, the number of product images is growing geometrically, so fast and effective retrieval over massive product image collections has become a new research direction. Product images differ clearly in shape; tops and trousers, for example, have very different shapes. Color is the most critical and also the most commonly used feature, but processing the RGB colors of a color image directly is very time-consuming, so reducing the complexity of color statistics is the first difficulty that massive image processing has to face. The invention proposes a method based on color space quantization and feature coding to obtain image color features quickly, further extracts the dominant color information of an image by dividing it into a grid, and finally builds an inverted index over the massive image collection through feature coding.
Summary of the invention
The object of the invention is to solve the problem of fast image retrieval over the massive image collections produced by rapidly developing e-commerce, so that consumers facing a massive number of images can quickly find the products they care about. To this end, a fast and effective massive image retrieval method for e-commerce platforms is provided.
To achieve this object, the invention adopts the following technical solution:
A massive image retrieval system based on color features and an inverted index, characterized in that it comprises the following steps:
Step 1: To overcome the curse of dimensionality that arises when color features are computed in the RGB color space, and taking the uniformity of the color space into account, the CIE 1976 L*a*b* color space, which has good uniformity, is selected and clustered with the K-means method into 256 colors.
Step 2: All images to be retrieved are obtained from the e-commerce platform. The RGB colors of each image are first converted into CIE 1976 L*a*b* colors, and each pixel color is then mapped, according to the principle of minimum color difference, to one of the 256 colors obtained in Step 1, so that every pixel of the resulting image takes one of only 256 values.
Step 3: The image obtained in Step 2 is divided into an 8*8 grid. The dominant color of each grid cell is computed and taken as the representative color of that cell, so that every image is finally described by 64 representative colors.
Step 4: The 64 representative colors obtained in Step 3 are converted into character codes by a user-defined coding rule, so that each image corresponds to a text-like document ("class text") of 64 character codes. This document is uploaded to an inverted index server, which completes index building for the massive image collection and thereby enables image retrieval.
By drawing on the speed and effectiveness of text retrieval, the invention converts image features into text and thereby solves the problem of fast and effective retrieval over massive image collections. The invention has the following advantages:
One, from the consumer's point of view, converting image features into text-like documents meets the user-experience requirement of fast and effective image retrieval.
Two, from the e-commerce platform's point of view, the invention integrates the massive image information on the platform effectively through color features, giving users a better shopping experience and bringing more website traffic.
Three, from the information-processing point of view, the invention makes good use of the advantages of text retrieval and divides the image into a grid, which retains part of the image's contour information; it therefore works well for product images in which contour matters.
Brief description of the drawings
Fig. 1 is the framework diagram of the retrieval system;
Fig. 2 is a dominant-color map;
Fig. 3 shows the custom encoding;
Fig. 4 shows a character-code text;
Fig. 5 shows part of the test results.
Embodiment
To make the object, technical solution and beneficial effects of the invention clearer, the invention is described in more detail below with a concrete case and with reference to the drawings.
The invention is a method for retrieving similar images from the massive image collection of an e-commerce platform. The method converts image features into text-like keyword features that can be indexed, and then uses an inverted-index search engine to retrieve images quickly. The method meets users' demand for fast and effective retrieval and can considerably improve the user experience of an e-commerce platform; in practice it also demonstrates the benefit of combining image retrieval and text retrieval, two methods that were originally unrelated.
The test environment was as follows.
Hardware environment:
Computer type: desktop computer
CPU: Pentium(R) Dual-Core CPU E5600 @ 2.93GHz
Memory: 4.00 GB (3.49 GB usable)
System type: 32-bit operating system
Graphics card: integrated graphics card
Software environment:
IDE: Visual Studio 2010
Image processing SDK: OpenCV 2.3.1
Search engine: Apache Solr 1.4.1
Development languages: C++, Python
As shown in the retrieval flow chart of Fig. 1, the method for retrieving similar product images comprises the following steps:
Step 1: The first problem to solve when building an efficient index from image color features is the dimensionality of those features. The RGB color space contains 16777216 distinct colors (256*256*256); without dimensionality reduction, retrieval based on color features is impractical. To overcome this curse of dimensionality, and taking the uniformity of the color space into account, the invention selects the CIE 1976 L*a*b* color space, which has good uniformity, and clusters it with the K-means method into 256 colors. If these 256 colors contain no gray levels, dimensionality reduction works poorly for black-and-white images and for images with pale colors. We therefore cluster the chromatic colors into 248 colors and the gray levels into 8 colors, which gives a 256-color space of chromatic colors plus grays. The invention calls this the standard color space, and each color in it a standard color.
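The palette construction of Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the patent does not say how the Lab space is sampled or how K-means is initialized, so the functions `kmeans` and `build_standard_palette`, the sample lists passed to them, and the seeded random initialization are all hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centers.

    Assumption: initial centers are k points drawn at random with a fixed
    seed. The patent does not specify an initialization strategy.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centers.append(centers[i])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def build_standard_palette(color_samples, gray_samples, n_color=248, n_gray=8):
    """Cluster chromatic and gray Lab samples separately, then concatenate.

    With the defaults this yields 248 + 8 = 256 standard colors.
    """
    return kmeans(color_samples, n_color) + kmeans(gray_samples, n_gray)
```

Clustering the gray samples separately guarantees that the palette always contains gray entries, which is the stated reason for the 248 + 8 split.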
Step 2: All images to be retrieved are obtained from the e-commerce platform and their features are extracted in batches. To describe the feature extraction flow more clearly, we use the feature extraction of a single image as an example. First, the RGB colors of the image are converted into CIE 1976 L*a*b* colors. Then, for each pixel of the converted image, the color difference between its color value and every color of the standard color space obtained in Step 1 is computed (the invention uses the CIE 1976 color difference formula, given as Formula 1), and the standard color with the smallest color difference is selected as the representative color of that pixel. In this way all pixels of the image, whose colors originally ranged over 16777216 possible values, are mapped to the 256 standard colors. We call the image obtained by this process the map image.
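$$\Delta E_{1976} = \sqrt{(L^{*}_{2}-L^{*}_{1})^{2} + (a^{*}_{2}-a^{*}_{1})^{2} + (b^{*}_{2}-b^{*}_{1})^{2}} \qquad (1)$$
where $\Delta E_{1976}$ is the color difference between two CIE 1976 L*a*b* colors $(L^{*}_{1}, a^{*}_{1}, b^{*}_{1})$ and $(L^{*}_{2}, a^{*}_{2}, b^{*}_{2})$.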
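The minimum-color-difference mapping of Step 2 can be sketched as below. The CIE 1976 color difference is the Euclidean distance between two L*a*b* colors. The patent does not state which RGB-to-Lab conversion it uses, so this sketch assumes 8-bit sRGB input and the D65 reference white; the function names `srgb_to_lab`, `delta_e_76`, `nearest_standard_color` and `map_image` are hypothetical.

```python
import math

# Assumption: sRGB primaries and D65 reference white (2-degree observer).
WHITE_D65 = (0.95047, 1.0, 1.08883)


def srgb_to_lab(r, g, b):
    """Convert an 8-bit sRGB color to CIE 1976 L*a*b*."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear sRGB to CIE XYZ.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        # Cube root above the CIE threshold, linear segment below it.
        return t ** (1 / 3) if t > 216 / 24389 else (24389 / 27 * t + 16) / 116

    fx, fy, fz = f(x / WHITE_D65[0]), f(y / WHITE_D65[1]), f(z / WHITE_D65[2])
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e_76(lab1, lab2):
    """CIE 1976 color difference: Euclidean distance in L*a*b*."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))


def nearest_standard_color(lab, palette):
    """Index of the palette entry with the smallest color difference."""
    return min(range(len(palette)), key=lambda i: delta_e_76(lab, palette[i]))


def map_image(rgb_rows, palette):
    """Replace every (r, g, b) pixel by the index of its nearest standard color."""
    return [
        [nearest_standard_color(srgb_to_lab(*pixel), palette) for pixel in row]
        for row in rgb_rows
    ]
```

This brute-force search compares each pixel with every palette entry; for large batches a lookup table or a spatial index over the palette would be a natural optimization, though the patent does not describe one.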
Step 3: The map image obtained in Step 2 is divided into a grid. Suppose the image size is 200*200 and the grid is 8*8; each grid cell is then 25*25 pixels. A color histogram is computed within each grid cell, and the color with the largest share is taken as the dominant color (that is, the representative color) of that cell. Every image is thus finally described by 64 dominant colors, which we call the dominant-color map, as shown in Fig. 2.
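The grid statistics of Step 3 can be sketched as follows, assuming the map image is given as a list of rows of standard-color indices and is at least n pixels wide and high. The function name `dominant_color_grid` is hypothetical, and the tie-breaking rule (the color seen first in the cell wins) is an assumption, since the patent does not say how ties are resolved.

```python
from collections import Counter


def dominant_color_grid(index_image, n=8):
    """Return an n x n grid holding the most frequent color index of each cell.

    index_image: list of rows, each row a list of standard-color indices.
    Cell boundaries use integer division, so image sizes that are not a
    multiple of n are still covered completely.
    """
    height, width = len(index_image), len(index_image[0])
    grid = []
    for gy in range(n):
        y0, y1 = gy * height // n, (gy + 1) * height // n
        row = []
        for gx in range(n):
            x0, x1 = gx * width // n, (gx + 1) * width // n
            # Color histogram of the cell; the most common entry is the dominant color.
            histogram = Counter(
                index_image[y][x] for y in range(y0, y1) for x in range(x0, x1)
            )
            row.append(histogram.most_common(1)[0][0])
        grid.append(row)
    return grid
```

For a 200*200 map image and n=8, each cell covers 25*25 pixels and the result contains 64 dominant colors, matching the example in the text.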
Step 4: To make good use of an inverted-index search engine, which excels at text retrieval, the dominant-color map obtained in Step 3 is converted into character codes by a user-defined coding rule: each dominant color is converted into a character code of four letters, analogous to a keyword in text retrieval. The first two letters of the character code record the coordinate information of the dominant color, and the last two letters record its color value; the conversion process is shown in Fig. 3. Each image thus corresponds to a text-like document of 64 character codes, which the invention calls the character-code text, as shown in Fig. 4. The character-code text and the ID of the corresponding image are uploaded to the inverted index server, which completes index building for the massive image collection and thereby enables image retrieval.
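The encoding and indexing of Step 4 can be sketched as follows. The actual coding rule is defined in Fig. 3, which is not reproduced here, so the base-26 scheme in `two_letters` is a hypothetical stand-in that only preserves the stated layout (two letters for position, two letters for color value). The `InvertedIndex` class is a toy in-memory substitute for the inverted index server, not the Apache Solr API.

```python
import string
from collections import Counter, defaultdict

LETTERS = string.ascii_lowercase


def two_letters(value):
    """Encode an integer in [0, 676) as two lowercase letters (hypothetical rule)."""
    if not 0 <= value < 26 * 26:
        raise ValueError("value out of range for a two-letter code")
    return LETTERS[value // 26] + LETTERS[value % 26]


def encode_grid(grid):
    """Turn a dominant-color grid into four-letter codes: position + color value."""
    return [
        two_letters(gy * len(row) + gx) + two_letters(color)
        for gy, row in enumerate(grid)
        for gx, color in enumerate(row)
    ]


class InvertedIndex:
    """Toy inverted index: maps each character code to the image IDs containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, image_id, codes):
        for code in codes:
            self.postings[code].add(image_id)

    def search(self, codes, limit=10):
        """Rank images by the number of character codes shared with the query."""
        scores = Counter()
        for code in codes:
            for image_id in self.postings.get(code, ()):
                scores[image_id] += 1
        return scores.most_common(limit)
```

Because the position is part of each code, two images match on a code only when the same standard color dominates the same grid cell, which is how the scheme retains coarse layout information. Scoring by the count of shared codes is an assumption; a real search engine would apply its own relevance ranking.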
Effect of the method of the invention:
To verify the effect of the invention, we collected a large sporting-goods image dataset from a domestic e-commerce platform. The test dataset contains 10000 images in total, covering sports jackets, sports T-shirts, sports shoes, sports trousers and various balls. The image retrieval time did not exceed 14 ms, which gives good real-time performance. Part of the test results of the invention is shown in Fig. 5.
Claims (1)
1. A massive image retrieval system based on color features and an inverted index, characterized in that it comprises the following steps:
Step 1: the CIE 1976 L*a*b* color space, which has good uniformity, is selected and clustered with the K-means method into 256 colors;
Step 2: all images to be retrieved are obtained; the RGB colors of each image are first converted into CIE 1976 L*a*b* colors, and each pixel color in the image is mapped, according to the principle of minimum color difference, to one of the 256 colors obtained in Step 1, so that every pixel of the resulting image takes one of 256 values;
Step 3: the image obtained in Step 2 is divided into a grid of size n*n; the dominant color of each grid cell is computed and taken as the representative color of that cell, so that every image is finally described by n*n representative colors;
Step 4: the 64 representative colors obtained in Step 3 are converted into character codes by a user-defined coding rule, so that each image corresponds to a text-like document of 64 character codes; this document is uploaded to an inverted index server, which completes index building for the massive image collection and thereby enables image retrieval.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310601630.XA CN103593458A (en) | 2013-11-21 | 2013-11-21 | Mass image searching system based on color features and inverted indexes |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310601630.XA CN103593458A (en) | 2013-11-21 | 2013-11-21 | Mass image searching system based on color features and inverted indexes |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103593458A true CN103593458A (en) | 2014-02-19 |
Family
ID=50083599
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310601630.XA Pending CN103593458A (en) | 2013-11-21 | 2013-11-21 | Mass image searching system based on color features and inverted indexes |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103593458A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7015931B1 (en) * | 1999-04-29 | 2006-03-21 | Mitsubishi Denki Kabushiki Kaisha | Method and apparatus for representing and searching for color images |
CN101506840A (en) * | 2006-06-23 | 2009-08-12 | 卡勒兹普麦迪亚公司 | Method of discriminating colors of color based image code |
CN101714257A (en) * | 2009-12-23 | 2010-05-26 | 公安部第三研究所 | Method for main color feature extraction and structuring description of images |
CN102523367A (en) * | 2011-12-29 | 2012-06-27 | 北京创想空间商务通信服务有限公司 | Real-time image compression and reduction method based on plurality of palettes |
Non-Patent Citations (1)
Title |
---|
缑西梅: "基于内容的图像检测技术研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104778281A (en) * | 2015-05-06 | 2015-07-15 | 苏州搜客信息技术有限公司 | Image index parallel construction method based on community analysis |
CN104978565A (en) * | 2015-05-11 | 2015-10-14 | 厦门翼歌软件科技有限公司 | Universal on-image text extraction method |
CN104978565B (en) * | 2015-05-11 | 2019-06-28 | 厦门翼歌软件科技有限公司 | A kind of pictograph extracting method of universality |
CN105447451A (en) * | 2015-11-13 | 2016-03-30 | 东方网力科技股份有限公司 | Method and device for retrieving object markers |
CN105447451B (en) * | 2015-11-13 | 2019-01-22 | 东方网力科技股份有限公司 | A kind of method and apparatus for retrieving object marker object |
CN107832359A (en) * | 2017-10-24 | 2018-03-23 | 杭州群核信息技术有限公司 | A kind of picture retrieval method and system |
CN107832359B (en) * | 2017-10-24 | 2021-06-08 | 杭州群核信息技术有限公司 | Picture retrieval method and system |
CN110413824A (en) * | 2019-06-20 | 2019-11-05 | 平安科技(深圳)有限公司 | A kind of search method and device of similar pictures |
WO2020253063A1 (en) * | 2019-06-20 | 2020-12-24 | 平安科技(深圳)有限公司 | Method and device for searching for similar images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20140219 |
RJ01 | Rejection of invention patent application after publication |