CN103593458A - Mass image searching system based on color features and inverted indexes - Google Patents

Mass image searching system based on color features and inverted indexes

Publication number
CN103593458A
Authority
CN
China
Prior art keywords
image
color
images
mass
colors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310601630.XA
Other languages
Chinese (zh)
Inventor
董乐
封宁
梁燕
王冉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201310601630.XA
Publication of CN103593458A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5838Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention provides a mass image retrieval system based on color features and inverted indexes. The system works in four steps. First, the CIE 1976 L*a*b* (Lab for short) color space, which has good perceptual uniformity, is selected, and K-means clustering is applied to it to obtain n colors. Second, the pixels of each image to be indexed are mapped to these colors according to the minimum color difference principle, yielding a dimension-reduced image. Third, the image obtained in the second step is divided into a grid, and the dominant color in each grid cell is taken as that cell's representative color, so that each image is described by a set of representative colors. Fourth, the representative colors obtained in the third step are encoded with a user-defined coding rule, producing a text-like document composed of character codes; this document is uploaded to an inverted index server, which completes index construction for the mass of images and thereby enables image retrieval.

Description

A mass image retrieval system based on color features and inverted indexes
Field of the invention
The invention belongs to the technical field of pattern recognition and information processing and relates to the processing of large numbers of images on e-commerce platforms, and in particular to an implementation of mass image retrieval based on color features and inverted indexes.
Background art
The e-commerce service industry is experiencing a golden age of development. It is expected that by 2015 the revenue of China's e-commerce service industry will exceed one trillion yuan, at which point China will have the largest and leading e-commerce service industry in the world. As e-commerce flourishes, the number of commodity images is growing geometrically. How to retrieve this mass of commodity images quickly and effectively has therefore become a new research direction. Commodity images differ clearly in shape; for example, tops and trousers have very different shapes. Color is the most critical and most commonly used feature, but processing the RGB colors of a color image directly is very time-consuming, so reducing the complexity of color statistics is the first problem that mass image processing faces. The invention proposes a method based on color space quantization and feature coding to obtain image color features quickly, uses image gridding to further extract the main color information of the image, and finally builds an inverted index over the mass of images through feature coding.
Summary of the invention
The object of the invention is to solve the problem of fast image retrieval over the large number of images produced by rapidly developing e-commerce, so that consumers faced with a mass of images can quickly find the commodities they care about. To this end, a fast and effective mass image retrieval method for e-commerce platforms is provided.
To achieve the above object, the invention adopts the following technical solution:
A mass image retrieval system based on color features and inverted indexes, characterized in that it comprises the following steps:
Step 1: To address the curse of dimensionality that arises when color features are computed in the RGB color space, while also taking the uniformity of the color space into account, select the CIE 1976 L*a*b* color space, which has good uniformity, and cluster it with the K-means method into 256 colors.
Step 2: Obtain all images to be retrieved from the e-commerce platform. First convert the RGB colors of each image to CIE 1976 L*a*b* colors, then map each pixel color to one of the 256 colors obtained in Step 1 according to the minimum color difference principle, so that the color dimension of the resulting image is reduced to 256.
Step 3: Divide the image obtained in Step 2 into an 8*8 grid. Compute the dominant color in each grid cell and take it as the representative color of that cell. Each image is then described by 64 representative colors.
Step 4: Encode the 64 representative colors obtained in Step 3 as character codes according to a user-defined coding rule, so that each image corresponds to a text-like document composed of 64 character codes. Upload this document to the inverted index server to complete index construction for the mass of images, which in turn enables image retrieval.
By exploiting the speed and effectiveness of text retrieval, the invention converts image features into text and thereby solves the problem of fast and effective retrieval over a mass of images. The invention has the following advantages:
1. From the consumer's point of view, and in response to user experience requirements, converting image features into a text-like form makes fast and effective image retrieval possible.
2. From the e-commerce platform's point of view, the invention can effectively integrate the mass of image information on the platform through color features, giving users a better shopping experience and bringing the site more traffic.
3. From the information processing point of view, the invention makes good use of the strengths of text retrieval and grids the image, which preserves part of the image's contour information and gives good results for commodity images in which contour information matters.
Brief description of the drawings
Fig. 1 is the framework diagram of the retrieval system;
Fig. 2 is a dominant color map;
Fig. 3 shows the custom coding;
Fig. 4 shows a character-coded text;
Fig. 5 shows part of the test results.
Detailed description of the embodiments
To make the object, technical solution and beneficial effects of the invention clearer, the invention is described in more detail below with reference to a concrete case and the accompanying drawings.
The invention is a method for retrieving similar images from the mass of images on an e-commerce platform. The method converts image features into text-like keyword features that can be indexed, and then uses an inverted index search engine to retrieve images quickly. This retrieval method meets users' demand for fast and effective search and can considerably improve the user experience of an e-commerce platform. It also demonstrates in practice the benefit of combining image retrieval and text retrieval, two retrieval methods that were originally unrelated.
The environment of our test experiments was as follows.
Hardware environment:
Computer type: desktop computer
CPU: Pentium(R) Dual-Core CPU E5600 @ 2.93GHz
Memory: 4.00 GB (3.49 GB usable)
System type: 32-bit operating system
Graphics card: integrated graphics
Software environment:
IDE: Visual Studio 2010
Image processing SDK: OpenCV 2.3.1
Search engine: Apache Solr 1.4.1
Development languages: C++, Python
As shown in the retrieval flow chart of Fig. 1, the retrieval method for similar commodity images comprises the following steps:
Step 1: The first problem to solve in building an efficient index from image color features is the dimensionality of those features. The RGB color space has 16777216 dimensions (256*256*256); without dimensionality reduction, retrieval based on color features would be impractical. To address this curse of dimensionality, while also taking the uniformity of the color space into account, the invention selects the CIE 1976 L*a*b* color space, which has good uniformity, and clusters it with the K-means method into 256 colors. Because 256 colors obtained in this way do not include gray levels, dimensionality reduction works poorly for black-and-white images and for images with pale colors. We therefore cluster the color space into 248 dimensions and the gray space into 8 dimensions, obtaining a 256-dimensional color space of color plus gray. The invention calls this the standard color space, and each color in it a standard color.
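The palette construction in Step 1 can be sketched as follows. This is a minimal illustration, not the inventors' implementation: the function names, the plain Lloyd-style K-means, and the chroma threshold used to separate gray samples from chromatic ones are all assumptions, since the text does not say how the gray space is isolated or how the clustering is initialized.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means on 3-tuples; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # needs len(points) >= k
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((u - v) ** 2 for u, v in zip(p, centroids[i])),
            )
            buckets[nearest].append(p)
        # Recompute each centroid as the mean of its bucket;
        # keep the old centroid when a bucket ends up empty.
        centroids = [
            tuple(sum(axis) / len(bucket) for axis in zip(*bucket)) if bucket else centroids[i]
            for i, bucket in enumerate(buckets)
        ]
    return centroids

def build_standard_palette(lab_samples, k_color=248, k_gray=8, chroma_threshold=5.0):
    """Cluster chromatic and near-gray L*a*b* samples separately.

    chroma_threshold is a hypothetical cut-off on sqrt(a^2 + b^2);
    the source only says that color and gray are clustered separately.
    """
    def chroma(p):
        return (p[1] ** 2 + p[2] ** 2) ** 0.5

    gray = [p for p in lab_samples if chroma(p) < chroma_threshold]
    color = [p for p in lab_samples if chroma(p) >= chroma_threshold]
    # Chromatic standard colors first, gray standard colors last.
    return kmeans(color, k_color) + kmeans(gray, k_gray)
```

With the default arguments the returned list has 248 + 8 = 256 entries, matching the size of the standard color space described above, provided the sample set contains at least 248 chromatic and 8 near-gray points.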
Step 2: Obtain all images to be retrieved from the e-commerce platform and extract image features in batch. To describe the feature extraction flow more clearly, we take the feature extraction of a single image as an example. First, convert the RGB colors of the image to CIE 1976 L*a*b* colors. Then compute the color difference between each pixel color of the converted image and every color in the standard color space obtained in Step 1 (the invention uses the CIE 1976 color difference formula, CIEDE1976, given as Formula 1). The standard color with the smallest color difference is selected as the representative color of the pixel, so that all pixels of the image, originally drawn from 16777216 possible colors, are mapped to the 256 standard colors. We call the image obtained by this process the map image.
$$\Delta E_{1976}(x_1, x_2) = \sqrt{(\Delta L^*)^2 + (\Delta a^*)^2 + (\Delta b^*)^2} \qquad \text{(Formula 1)}$$
where $x_1 = [L_1^*, a_1^*, b_1^*]^T$, $x_2 = [L_2^*, a_2^*, b_2^*]^T$, $\Delta L^* = L_1^* - L_2^*$, $\Delta a^* = a_1^* - a_2^*$, $\Delta b^* = b_1^* - b_2^*$, and $\Delta E_{1976}(x_1, x_2)$ is the color difference between the two CIE 1976 L*a*b* colors $x_1$ and $x_2$.
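The pixel mapping of Step 2 can be sketched like this. The source does not state which RGB working space or white point is used for the conversion, so the sketch assumes sRGB with a D65 white point; the function names are hypothetical. The color difference is the Euclidean distance of Formula 1.

```python
import math

D65_WHITE = (0.95047, 1.0, 1.08883)  # assumed reference white (not stated in the source)

def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB values to CIE 1976 L*a*b* (D65 assumed)."""
    def linearize(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear sRGB to CIE XYZ.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / D65_WHITE[0]), f(y / D65_WHITE[1]), f(z / D65_WHITE[2])
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def delta_e_1976(lab1, lab2):
    """Formula 1: Euclidean distance between two L*a*b* colors."""
    return math.dist(lab1, lab2)

def nearest_standard_color(lab, palette):
    """Index of the standard color with the smallest color difference."""
    return min(range(len(palette)), key=lambda i: delta_e_1976(lab, palette[i]))

def map_image(rgb_pixels, palette):
    """Map rows of (r, g, b) pixels to rows of standard color indices."""
    return [[nearest_standard_color(srgb_to_lab(*px), palette) for px in row]
            for row in rgb_pixels]
```

Comparing every pixel against all 256 standard colors is the brute-force reading of the text; a production system would likely cache the mapping per distinct RGB value, but that is a design choice the source does not discuss.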
Step 3: Divide the image obtained in Step 2 into a grid. Assuming the image size is 200*200 and the grid is 8*8, each grid cell is 25*25 pixels. Compute the color histogram in each grid cell and take the color with the largest share as the dominant color (that is, the representative color) of the cell. Each image is then described by 64 dominant colors, which we call the dominant color map, as shown in Fig. 2.
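A minimal sketch of the gridding in Step 3, assuming the map image is available as a list of rows of standard color indices (0 to 255). The function name is hypothetical, and ties between equally frequent colors are resolved arbitrarily because the text does not say how they are handled.

```python
from collections import Counter

def dominant_color_grid(index_image, n=8):
    """Split an image of palette indices into n*n cells and return
    the most frequent index of each cell as an n*n list of lists."""
    height, width = len(index_image), len(index_image[0])
    grid = []
    for gy in range(n):
        y0, y1 = gy * height // n, (gy + 1) * height // n
        row = []
        for gx in range(n):
            x0, x1 = gx * width // n, (gx + 1) * width // n
            # Per-cell color histogram over the cell's pixels.
            histogram = Counter(
                index_image[y][x] for y in range(y0, y1) for x in range(x0, x1)
            )
            row.append(histogram.most_common(1)[0][0])
        grid.append(row)
    return grid
```

For a 200*200 image and n = 8 the cell boundaries fall every 25 pixels, which reproduces the 25*25 cells of the example above. The image must be at least n pixels in each direction so that no cell is empty.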
Step 4: To make better use of inverted-index search engines, which are good at text retrieval, we encode the dominant color map obtained in Step 3 as character codes according to a user-defined coding rule. Each dominant color is converted into a character code of four letters, which plays the role of a keyword in text retrieval. The first two characters of the code record the coordinate information of the dominant color and the last two record its color value; the conversion process is shown in Fig. 3. Each image then corresponds to a text-like document composed of 64 character codes, which the invention calls the character-coded text and which is shown in Fig. 4. The character-coded text and the ID of the corresponding image are uploaded to the inverted index server, which completes index construction for the mass of images and in turn enables image retrieval.
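The encoding and indexing of Step 4 might look like the sketch below. The actual coding rule is defined in Fig. 3, which is not reproduced here, so the letter scheme is purely hypothetical: one letter each for grid row and column, and two letters for the color index in base 16. The in-memory index is likewise only a stand-in for the inverted index server (Apache Solr in the test environment) and ranks images by the number of shared four-letter codes.

```python
from collections import Counter, defaultdict

LETTERS = "abcdefghijklmnop"  # 16 symbols, so two letters cover indices 0..255

def encode_grid(grid):
    """Turn an 8*8 grid of color indices into 64 space-separated codes.
    Hypothetical rule: row letter + column letter + two color letters."""
    tokens = []
    for row_idx, row in enumerate(grid):
        for col_idx, color in enumerate(row):
            tokens.append(
                LETTERS[row_idx] + LETTERS[col_idx]
                + LETTERS[color // 16] + LETTERS[color % 16]
            )
    return " ".join(tokens)

class InvertedIndex:
    """Toy inverted index: character code -> set of image IDs."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, image_id, code_text):
        for token in code_text.split():
            self.postings[token].add(image_id)

    def search(self, code_text, top_k=10):
        """Return (image_id, shared_code_count) pairs, best match first."""
        hits = Counter()
        for token in set(code_text.split()):
            for image_id in self.postings.get(token, ()):
                hits[image_id] += 1
        return hits.most_common(top_k)
```

Because each code carries both position and color, two images share a code only when the same grid cell has the same dominant color, which is how the text-like document keeps part of the spatial layout.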
Effect of the method of the invention:
To verify the effect of the invention, we obtained a large sports goods image data set from a domestic e-commerce platform. The test data set contains 10000 images in total, covering sports jackets, sports T-shirts, sports shoes, sports trousers and various balls. The retrieval time per image is no more than 14 ms, which gives good real-time performance. Part of the test results of the invention are shown in Fig. 5.

Claims (1)

1. A mass image retrieval system based on color features and inverted indexes, characterized in that it comprises the following steps:
Step 1: select the CIE 1976 L*a*b* color space, which has good uniformity, and cluster it with the K-means method into 256 colors;
Step 2: obtain all images to be retrieved; first convert the RGB colors of each image to CIE 1976 L*a*b* colors, then map each pixel color to one of the 256 colors obtained in Step 1 according to the minimum color difference principle, so that the color dimension of the resulting image is reduced to 256;
Step 3: divide the image obtained in Step 2 into an n*n grid; compute the dominant color in each grid cell and take it as the representative color of that cell, so that each image is described by n*n representative colors;
Step 4: encode the 64 representative colors obtained in Step 3 as character codes according to a user-defined coding rule, so that each image corresponds to a text-like document composed of 64 character codes; upload this document to the inverted index server to complete index construction for the mass of images, which in turn enables image retrieval.
CN201310601630.XA 2013-11-21 2013-11-21 Mass image searching system based on color features and inverted indexes Pending CN103593458A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310601630.XA CN103593458A (en) 2013-11-21 2013-11-21 Mass image searching system based on color features and inverted indexes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310601630.XA CN103593458A (en) 2013-11-21 2013-11-21 Mass image searching system based on color features and inverted indexes

Publications (1)

Publication Number Publication Date
CN103593458A true CN103593458A (en) 2014-02-19

Family

ID=50083599

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310601630.XA Pending CN103593458A (en) 2013-11-21 2013-11-21 Mass image searching system based on color features and inverted indexes

Country Status (1)

Country Link
CN (1) CN103593458A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778281A (en) * 2015-05-06 2015-07-15 苏州搜客信息技术有限公司 Image index parallel construction method based on community analysis
CN104978565A (en) * 2015-05-11 2015-10-14 厦门翼歌软件科技有限公司 Universal on-image text extraction method
CN105447451A (en) * 2015-11-13 2016-03-30 东方网力科技股份有限公司 Method and device for retrieving object markers
CN107832359A (en) * 2017-10-24 2018-03-23 杭州群核信息技术有限公司 A kind of picture retrieval method and system
CN110413824A (en) * 2019-06-20 2019-11-05 平安科技(深圳)有限公司 A kind of search method and device of similar pictures

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7015931B1 (en) * 1999-04-29 2006-03-21 Mitsubishi Denki Kabushiki Kaisha Method and apparatus for representing and searching for color images
CN101506840A (en) * 2006-06-23 2009-08-12 卡勒兹普麦迪亚公司 Method of discriminating colors of color based image code
CN101714257A (en) * 2009-12-23 2010-05-26 公安部第三研究所 Method for main color feature extraction and structuring description of images
CN102523367A (en) * 2011-12-29 2012-06-27 北京创想空间商务通信服务有限公司 Real-time image compression and reduction method based on plurality of palettes

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
缑西梅: "基于内容的图像检测技术研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778281A (en) * 2015-05-06 2015-07-15 苏州搜客信息技术有限公司 Image index parallel construction method based on community analysis
CN104978565A (en) * 2015-05-11 2015-10-14 厦门翼歌软件科技有限公司 Universal on-image text extraction method
CN104978565B (en) * 2015-05-11 2019-06-28 厦门翼歌软件科技有限公司 A kind of pictograph extracting method of universality
CN105447451A (en) * 2015-11-13 2016-03-30 东方网力科技股份有限公司 Method and device for retrieving object markers
CN105447451B (en) * 2015-11-13 2019-01-22 东方网力科技股份有限公司 A kind of method and apparatus for retrieving object marker object
CN107832359A (en) * 2017-10-24 2018-03-23 杭州群核信息技术有限公司 A kind of picture retrieval method and system
CN107832359B (en) * 2017-10-24 2021-06-08 杭州群核信息技术有限公司 Picture retrieval method and system
CN110413824A (en) * 2019-06-20 2019-11-05 平安科技(深圳)有限公司 A kind of search method and device of similar pictures
WO2020253063A1 (en) * 2019-06-20 2020-12-24 平安科技(深圳)有限公司 Method and device for searching for similar images

Similar Documents

Publication Publication Date Title
US10540579B2 (en) Two-dimensional document processing
CN107909039B (en) High-resolution remote sensing image earth surface coverage classification method based on parallel algorithm
US10467743B1 (en) Image processing method, terminal and storage medium
CN103593458A (en) Mass image searching system based on color features and inverted indexes
CN101937549B (en) Picture guidance system for network shopping guidance
CN106156284B (en) Extensive nearly repetition video retrieval method based on random multi-angle of view Hash
EP4322031A1 (en) Recommendation method, recommendation model training method, and related product
CN102289671A (en) Method and device for extracting texture feature of image
CN101986295B (en) Image clustering method based on manifold sparse coding
CN108460400B (en) Hyperspectral image classification method combining various characteristic information
CN103955952A (en) Extraction and description method for garment image color features
Cai et al. Improving sampling-based image matting with cooperative coevolution differential evolution algorithm
CN103049340A (en) Image super-resolution reconstruction method of visual vocabularies and based on texture context constraint
CN103593853A (en) Remote-sensing image multi-scale object-oriented classification method based on joint sparsity representation
CN114463637A (en) Winter wheat remote sensing identification analysis method and system based on deep learning
CN102831161B (en) For the semi-supervised sequence learning method based on manifold regularization of image retrieval
CN111177450B (en) Image retrieval cloud identification method and system and computer readable storage medium
CN112307352A (en) Content recommendation method, system, device and storage medium
Pang et al. SCA-CDNet: A robust siamese correlation-and-attention-based change detection network for bitemporal VHR images
CN106933905B (en) Method and device for monitoring webpage access data
CN111723222A (en) Image search and training system
CN110020123B (en) Popularization information delivery method, device, medium and equipment
Chen et al. A new patch-based LBP with adaptive weights for gender classification of human face
CN104142978A (en) Image retrieval system and image retrieval method based on multi-feature and sparse representation
CN117197479A (en) Image analysis method, device, computer equipment and storage medium applying corn ear outer surface

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140219

RJ01 Rejection of invention patent application after publication