CN107545007A - Electric power big data quick-searching engine - Google Patents

Info

Publication number
CN107545007A
Authority
CN
China
Prior art keywords
engine
big data
data
electric power
retrieval
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610493720.5A
Other languages
Chinese (zh)
Inventor
杨庆双
刘建宇
徐志丹
赵宏振
李振雷
常月廷
刘金华
曹北建
张金禄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC and State Grid Tianjin Electric Power Co Ltd
Priority to CN201610493720.5A
Publication of CN107545007A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an electric power big data quick-searching engine comprising enterprise applications, a secondary open interface, a big data search engine core, an operating system, and cloud server resources. The engine automatically partitions and indexes data according to the query characteristics of the application, takes full advantage of modern multi-core PC servers with large memory, introduces a flexible multi-engine mechanism, and provides an open secondary development interface. The system supports column storage, which gives efficient access to specific data columns and speeds up grouped statistics and sorting on those columns. The engine keeps a retrieval cache on each individual node as well as a cache of the merged overall retrieval results. This multi-level, multi-granularity design greatly improves the cache hit rate and relieves the pressure on retrieval nodes under high concurrency, thereby substantially improving the system's data retrieval capability under high concurrency.

Description

Electric power big data quick-searching engine
Technical field
The present invention relates to a search engine, in particular an electric power big data quick-searching engine.
Background
Big data refers to data sets whose content cannot be retrieved and managed with conventional tools under present conditions. Current research on big data knowledge discovery focuses mainly on partitioning, clustering, retrieval, and incremental (batch, online, or parallel) learning.
Electric power big data is the application of big data theory, technology, and methods in the power industry. It involves data analysis, data mining, and data visualization across units, disciplines, and business lines in every link of the chain: generation, transmission, transformation, distribution, consumption, and dispatching. Electric power big data consists of structured and unstructured data. With the construction of the smart grid and the application of the Internet of Things, unstructured data is growing rapidly, and its volume will substantially exceed that of structured data. Electric power big data exhibits the five characteristics of big data: large volume (Volume), fast processing (Velocity), many data types (Variety), high value (Value), and high accuracy (Veracity).
Big data retrieval has advanced in recent years, but there is still relatively little research on retrieval of electric power big data. The core of big data management is the big data search engine, or in other words a big data management system that incorporates search engine technology. The search engine is the basis for efficient management and intelligent analysis of big data. When retrieving electric power big data, users generally want to obtain what they need from all the data quickly. This raises the question of how to trade off speed against accuracy. In the past, big data retrieval favored accuracy. Over the last decade, with the growing popularity of networks and the emergence of electric power big data, exact retrieval alone can no longer meet users' needs; retrieval of electric power big data must now also support high concurrency and respond quickly to user requests.
Summary of the invention
The technical problem to be solved by the invention is to provide a highly reliable electric power big data quick-searching engine.
To solve the above technical problem, the electric power big data quick-searching engine of the invention includes:
Enterprise applications, which comprise business applications for electric power enterprises, specifically enterprise search, a data mining engine, vertical search, public opinion management, and content management;
A secondary open interface, which connects the enterprise applications to the big data search engine core and provides the mainstream interfaces, specifically an HTTP interface, a C interface, a JAVA interface, and a .NET interface;
The big data search engine core, which includes search nodes, Hadoop, and HDFS. Each search node includes a scheduling module and a data dictionary and, through a search engine adapter, connects to a general search engine, a professional retrieval engine, and an image retrieval engine, supporting flexible multi-engine technology. The big data management system uses a multi-engine mechanism and defines a standard engine interface; different engines can serve different application requirements, and users can also build their own engines to extend the system's data processing capability, supporting unified retrieval of heterogeneous data, whether structured, semi-structured, or unstructured. The big data search engine core uses a flat, elastically scalable design in which the nodes are fully peer-to-peer and every node can serve external requests; the system has no single point of failure, and the failure of any one node does not prevent the system from providing service. The flat architecture gives the system good scalability: adding new nodes online is enough to raise the system's capacity and service capability. Hadoop and HDFS enable an efficient partitioned indexing mechanism that automatically partitions and indexes data according to the query characteristics of the application, takes full advantage of modern multi-core PC servers with large memory, and uses parallel indexing with multi-way merging to turn random reads and writes into sequential ones, achieving high-speed index creation and meeting the demand for centralized and fast indexing of massive data; and
An operating system and cloud server resources, both of which are cloud-based infrastructure service resources that provide the operating system, virtual machines, cloud computing, and cloud storage, and supply the deployment environment for the electric power big data search engine.
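The standard engine interface and search engine adapter described above could be sketched as follows. This is only an illustration under stated assumptions: the patent does not specify the interface's methods, so the names `Engine`, `GeneralEngine`, `ImageEngine`, `EngineAdapter`, and the `search` signature are hypothetical, and the toy engines stand in for real retrieval back ends.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """Hypothetical standard engine interface; the patent does not define its methods."""

    @abstractmethod
    def search(self, query: str) -> list[str]:
        """Return identifiers of matching records."""


class GeneralEngine(Engine):
    """Toy general engine: case-insensitive substring match over documents."""

    def __init__(self, docs: dict[str, str]) -> None:
        self.docs = docs

    def search(self, query: str) -> list[str]:
        q = query.lower()
        return sorted(doc_id for doc_id, text in self.docs.items() if q in text.lower())


class ImageEngine(Engine):
    """Toy image engine: exact match on a tag attached to each image."""

    def __init__(self, tags: dict[str, set[str]]) -> None:
        self.tags = tags

    def search(self, query: str) -> list[str]:
        return sorted(img for img, image_tags in self.tags.items() if query in image_tags)


class EngineAdapter:
    """Routes a request to whichever registered engine the caller names."""

    def __init__(self) -> None:
        self._engines: dict[str, Engine] = {}

    def register(self, name: str, engine: Engine) -> None:
        # User-built engines plug in here as long as they implement Engine.
        self._engines[name] = engine

    def search(self, name: str, query: str) -> list[str]:
        if name not in self._engines:
            raise KeyError(f"no engine registered under {name!r}")
        return self._engines[name].search(query)


adapter = EngineAdapter()
adapter.register("general", GeneralEngine({"d1": "Transformer outage report", "d2": "Billing archive"}))
adapter.register("image", ImageEngine({"img1": {"substation", "thermal"}}))
print(adapter.search("general", "outage"))  # ['d1']
print(adapter.search("image", "thermal"))   # ['img1']
```

Because callers depend only on the abstract interface, a new engine can be added by registering another `Engine` subclass without touching the adapter or the existing engines.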
The beneficial effects of the invention are as follows: the electric power big data quick-searching engine meets users' need to retrieve electric power data quickly. Addressing the characteristics of electric power big data (massive power customer archive and service information, power topology and the diversity of heterogeneous data, and the diversity of application requirements), it achieves fast retrieval of electric power big data through an efficient, reliable, and intelligent search engine designed for big data.
Brief description of the drawings
Fig. 1 is a schematic diagram of the electric power big data quick-searching engine of the invention.
Detailed description
The invention is described in further detail below with reference to the drawing and specific embodiments:
Referring to Fig. 1, the electric power big data quick-searching engine of the invention includes enterprise applications 1, a secondary open interface 2, a big data search engine core 3, an operating system 4, and cloud server resources 5. The enterprise applications 1 comprise business applications for electric power enterprises, such as enterprise search, a data mining engine, vertical search, public opinion management, and content management. The enterprise applications 1 connect to the big data search engine core 3 through the secondary open interface 2, which provides the mainstream interfaces, including an HTTP interface, a C interface, a JAVA interface, and a .NET interface.
The big data search engine core 3 is the main part of the invention. It implements fast search over electric power big data through search nodes 301, Hadoop 302, and HDFS 303. Each search node 301 includes a scheduling module and a data dictionary and, through a search engine adapter, connects to a general search engine, a professional retrieval engine, and an image retrieval engine, supporting flexible multi-engine technology. The big data management system uses a multi-engine mechanism and defines a standard engine interface. Different engines can serve different application requirements, and users can also build their own engines to extend the system's data processing capability, supporting unified retrieval of heterogeneous data, whether structured, semi-structured, or unstructured.
The big data search engine core 3 uses a flat, elastically scalable design in which the nodes are fully peer-to-peer and every node can serve external requests. The system has no single point of failure, and the failure of any one node does not prevent the system from providing service. The flat architecture gives the system good scalability: adding new nodes online is enough to raise the system's capacity and service capability.
Hadoop 302 and HDFS 303 enable an efficient partitioned indexing mechanism that automatically partitions and indexes data according to the query characteristics of the application, takes full advantage of modern multi-core PC servers with large memory, and uses parallel indexing with multi-way merging to turn random reads and writes into sequential ones, achieving high-speed index creation and meeting the demand for centralized and fast indexing of massive data. Partitioned indexing also narrows the range of index entries that must be matched during retrieval, which shortens the retrieval response time.
The operating system 4 and the cloud server resources 5 are cloud-based infrastructure service resources that provide the operating system, virtual machines, cloud computing, and cloud storage, and supply the deployment environment for the electric power big data search engine.
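The partitioned indexing idea (parallel index building, multi-way merging of sorted runs, and a narrower match range at query time) can be illustrated in miniature. The sketch below is not the patented implementation and does not use Hadoop or HDFS. The partition key (month of a timestamp), the record layout, and the function names are all assumptions made for the example, and threads are used only to keep it self-contained; on a real multi-core server a process pool or a cluster job would do the parallel work.

```python
import heapq
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_of(record: dict) -> str:
    """Assumed partition key: the month of the record's timestamp, e.g. '2016-06'."""
    return record["ts"][:7]


def build_run(records: list[dict]) -> list[tuple[str, int]]:
    """Index one batch: emit (term, record id) pairs, sorted so runs can be merged."""
    return sorted((term, r["id"]) for r in records for term in r["text"].lower().split())


def build_index(records: list[dict], batch_size: int = 2) -> dict[str, list[tuple[str, int]]]:
    """Build one sorted posting list per partition from sorted runs built in parallel."""
    by_part: dict[str, list[dict]] = defaultdict(list)
    for r in records:
        by_part[partition_of(r)].append(r)

    index: dict[str, list[tuple[str, int]]] = {}
    with ThreadPoolExecutor() as pool:
        for part, recs in by_part.items():
            batches = [recs[i:i + batch_size] for i in range(0, len(recs), batch_size)]
            runs = list(pool.map(build_run, batches))
            # The k-way merge consumes every run front to back, so each run is
            # read sequentially instead of being probed at random positions.
            index[part] = list(heapq.merge(*runs))
    return index


def search(index: dict[str, list[tuple[str, int]]], term: str, month: str) -> list[int]:
    """Partition pruning: only the partition for `month` is scanned."""
    return [rid for t, rid in index.get(month, []) if t == term]


records = [
    {"id": 1, "ts": "2016-05-03", "text": "feeder fault"},
    {"id": 2, "ts": "2016-06-10", "text": "transformer fault"},
    {"id": 3, "ts": "2016-06-11", "text": "meter reading"},
    {"id": 4, "ts": "2016-06-12", "text": "feeder fault cleared"},
]
index = build_index(records)
print(search(index, "fault", "2016-06"))  # [2, 4]
```

A query restricted to June never touches the May partition, which is the sense in which partitioning reduces the index range matched during retrieval.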
The electric power big data search engine is designed for the characteristics of big data and retrieves it efficiently, reliably, and intelligently. It supports unified management and search of structured, semi-structured, and unstructured data, manages massive data at the PB scale, and supports highly concurrent access by very large numbers of users. The engine automatically partitions and indexes data according to the query characteristics of the application, takes full advantage of modern multi-core PC servers with large memory, introduces a flexible multi-engine mechanism, and provides an open secondary development interface. Data tables can be built in memory to serve applications whose data volume is small but whose demands on query concurrency and response speed are very high. The system supports column storage, which gives efficient access to specific data columns and speeds up grouped statistics and sorting on those columns. The engine keeps a retrieval cache on each individual node as well as a cache of the merged overall retrieval results. This multi-level, multi-granularity design greatly improves the cache hit rate and relieves the pressure on retrieval nodes under high concurrency, thereby substantially improving the system's data retrieval capability under high concurrency.
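The two cache levels described above, one per node and one for merged results, might look like the following minimal sketch. The patent does not name an eviction policy or a cache key, so the LRU policy, the use of the query term as the key, and the class names `LRUCache`, `SearchNode`, and `Cluster` are assumptions chosen for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Small least-recently-used cache built on OrderedDict (assumed policy)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry


class SearchNode:
    """One retrieval node with its own result cache (first cache level)."""

    def __init__(self, docs: dict[int, str], cache_size: int = 128) -> None:
        self.docs = docs
        self.cache = LRUCache(cache_size)
        self.scans = 0  # counts how often the node actually scans its data

    def search(self, term: str) -> list[int]:
        hit = self.cache.get(term)
        if hit is not None:
            return hit
        self.scans += 1
        result = sorted(i for i, text in self.docs.items() if term in text)
        self.cache.put(term, result)
        return result


class Cluster:
    """Fans a query out to all nodes and caches the merged result (second level)."""

    def __init__(self, nodes: list[SearchNode], cache_size: int = 128) -> None:
        self.nodes = nodes
        self.merged_cache = LRUCache(cache_size)

    def search(self, term: str) -> list[int]:
        hit = self.merged_cache.get(term)
        if hit is not None:
            return hit  # served without contacting any node
        merged = sorted(i for node in self.nodes for i in node.search(term))
        self.merged_cache.put(term, merged)
        return merged


nodes = [SearchNode({1: "line fault", 2: "meter"}), SearchNode({3: "cable fault"})]
cluster = Cluster(nodes)
print(cluster.search("fault"))  # [1, 3]
print(cluster.search("fault"))  # [1, 3], served from the merged cache
print([n.scans for n in nodes])  # [1, 1]
```

A repeated query is answered from the merged cache and never reaches the nodes. If the merged entry has been evicted, the per-node caches can still spare each node a rescan, which is how the layering reduces node load under concurrent repeated queries.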
In summary, the invention is not limited to the above embodiments. Those skilled in the art may propose other embodiments within the technical guidance of the invention, and such embodiments fall within the scope of the invention.

Claims (1)

  1. An electric power big data quick-searching engine, characterized in that it includes:
    Enterprise applications (1), which comprise business applications for electric power enterprises, specifically enterprise search, a data mining engine, vertical search, public opinion management, and content management;
    A secondary open interface (2), which connects the enterprise applications (1) to the big data search engine core (3) and provides the mainstream interfaces, specifically an HTTP interface, a C interface, a JAVA interface, and a .NET interface;
    The big data search engine core (3), which includes search nodes (301), Hadoop (302), and HDFS (303), wherein each search node (301) includes a scheduling module and a data dictionary and, through a search engine adapter, connects to a general search engine, a professional retrieval engine, and an image retrieval engine, supporting flexible multi-engine technology; the big data management system uses a multi-engine mechanism and defines a standard engine interface; different engines can serve different application requirements, and users can also build their own engines to extend the system's data processing capability, supporting unified retrieval of heterogeneous data, whether structured, semi-structured, or unstructured; the big data search engine core (3) uses a flat, elastically scalable design in which the nodes are fully peer-to-peer and every node can serve external requests, the system has no single point of failure, and the failure of any one node does not prevent the system from providing service; the flat architecture gives the system good scalability, and adding new nodes online is enough to raise the system's capacity and service capability; Hadoop (302) and HDFS (303) enable an efficient partitioned indexing mechanism that automatically partitions and indexes data according to the query characteristics of the application, takes full advantage of modern multi-core PC servers with large memory, and uses parallel indexing with multi-way merging to turn random reads and writes into sequential ones, achieving high-speed index creation and meeting the demand for centralized and fast indexing of massive data; and
    An operating system (4) and cloud server resources (5), both of which are cloud-based infrastructure service resources that provide the operating system, virtual machines, cloud computing, and cloud storage, and supply the deployment environment for the electric power big data search engine.
CN201610493720.5A 2016-06-26 2016-06-26 Electric power big data quick-searching engine Pending CN107545007A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610493720.5A CN107545007A (en) 2016-06-26 2016-06-26 Electric power big data quick-searching engine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610493720.5A CN107545007A (en) 2016-06-26 2016-06-26 Electric power big data quick-searching engine

Publications (1)

Publication Number Publication Date
CN107545007A true CN107545007A (en) 2018-01-05

Family

ID=60962842

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610493720.5A Pending CN107545007A (en) 2016-06-26 2016-06-26 Electric power big data quick-searching engine

Country Status (1)

Country Link
CN (1) CN107545007A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109189752A (en) * 2018-10-12 2019-01-11 国网山东省电力公司电力科学研究院 Power marketing knowledge base system based on intelligent Search Technique
CN109359087A (en) * 2018-06-15 2019-02-19 深圳市木浪云数据有限公司 Instant file index and searching method, apparatus and system
CN111858796A (en) * 2020-06-22 2020-10-30 北京百度网讯科技有限公司 Geographic information system engine system, implementation method, device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750606A (en) * 2012-05-16 2012-10-24 中国电力科学研究院 Power grid scheduling cloud system
CN103631887A (en) * 2013-11-15 2014-03-12 北京奇虎科技有限公司 Method for network search at browser side and browser
CN104767813A (en) * 2015-04-08 2015-07-08 江苏国盾科技实业有限责任公司 Public bank big data service platform based on openstack
CN105069112A (en) * 2015-08-11 2015-11-18 浪潮软件集团有限公司 Industry vertical search engine system
CN105574643A (en) * 2015-11-23 2016-05-11 江苏瑞中数据股份有限公司 Real-time data center and big data platform fusion method for power grid

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109359087A (en) * 2018-06-15 2019-02-19 深圳市木浪云数据有限公司 Instant file index and searching method, apparatus and system
CN109359087B (en) * 2018-06-15 2020-11-17 深圳市木浪云数据有限公司 Instant file indexing and searching method, device and system
CN109189752A (en) * 2018-10-12 2019-01-11 国网山东省电力公司电力科学研究院 Power marketing knowledge base system based on intelligent Search Technique
CN111858796A (en) * 2020-06-22 2020-10-30 北京百度网讯科技有限公司 Geographic information system engine system, implementation method, device and storage medium
CN111858796B (en) * 2020-06-22 2023-08-18 北京百度网讯科技有限公司 Geographic information system engine system, implementation method and device and storage medium

Similar Documents

Publication Publication Date Title
CN103631870B (en) System and method used for large-scale distributed data processing
Guo et al. Manu: a cloud native vector database management system
US10922316B2 (en) Using computing resources to perform database queries according to a dynamically determined query size
Xie et al. Kraken: memory-efficient continual learning for large-scale real-time recommendations
US11507555B2 (en) Multi-layered key-value storage
Lin et al. A comprehensive survey on distributed training of graph neural networks
CN104410666B (en) The method and system of isomerism storage resources management are realized under cloud computing
Hua et al. Hadoop configuration tuning with ensemble modeling and metaheuristic optimization
CN111404932A (en) Method for accessing medical institution system to smart medical cloud service platform
CN107291539A (en) Cluster program scheduler method based on resource significance level
Cambazoglu et al. Quantifying performance and quality gains in distributed web search engines
Costa et al. A survey on data-driven performance tuning for big data analytics platforms
CN107545007A (en) Electric power big data quick-searching engine
Groppe Emergent models, frameworks, and hardware technologies for Big data analytics
Xu et al. Enhancing HDFS with a full-text search system for massive small files
CN108763323A (en) Meteorological lattice point file application process based on resource set and big data technology
CN106570151A (en) Data collection processing method and system for mass files
Shen et al. Meteorological sensor data storage mechanism based on timescaledb and kafka
Wei et al. Status, challenges and trends of data-intensive supercomputing
US11687513B2 (en) Virtual data source manager of data virtualization-based architecture
Wu et al. Data design and analysis based on cloud computing and improved K-Means algorithm
Polak et al. Organization of quality-oriented data access in modern distributed environments based on semantic interoperability of services and systems
Li et al. An improved distributed query for large-scale RDF data
Quan et al. The implications from benchmarking three big data systems
Luo et al. Superset: a non-uniform replica placement strategy towards high-performance and cost-effective distributed storage service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20180105