CN117390118A - Quick data retrieval method and system based on block chain - Google Patents
- Publication number
- CN117390118A
- Application number
- CN202311386638.9A
- Authority
- CN
- China
- Prior art keywords
- data
- similarity
- block
- database
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/22: Indexing; Data structures therefor; Storage structures
- G06F16/2228: Indexing structures
- G06F16/2471: Distributed queries
- G06F21/6227: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
- Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a blockchain-based fast data retrieval method and system, relating to the technical field of blockchains. The system comprises a blockchain structure design module, a database module, a mechanism setting module, a similarity calculation module and a matching output module. The key technical points are as follows: the different kinds of information in the transaction data are each stored in a corresponding database, which provides better query capability, facilitates encrypted data management, and allows any one database to be expanded without affecting the performance of the others; similarity calculation and verification are performed in sequence using the Pearson correlation coefficient and visual analysis, ensuring the accuracy of the calculation results; on that basis a similarity evaluation value Pgxs is obtained, and comparing it with a similarity threshold accurately identifies the matching results that satisfy the output condition, further improving the accuracy of data retrieval.
Description
Technical Field
The invention relates to the technical field of blockchains, in particular to a method and a system for quickly retrieving data based on a blockchain.
Background
Blockchain technology is a distributed ledger technology that achieves decentralized data storage and transaction verification by chaining transaction records together in the form of blocks. Each block contains a number of transaction records and is protected and verified by cryptographic algorithms. A blockchain network consists of multiple nodes, each of which stores a complete copy of the blockchain, and a consensus algorithm keeps the network consistent.
Chinese patent application publication No. CN113468549A discloses a blockchain-based retrieval method for encrypted credential information, comprising: encrypting the storage location of the credential information using asymmetric encryption; adding the encrypted storage location information to the credential information; calculating the hash value of the credential information using the SHA-256 hash function; recording the credential information and its hash value in an on-chain smart contract; for credentials already recorded on the chain, constructing a hash retrieval tree in the smart contract from the hash values of the credential information, each node of the tree storing one piece of credential information; and obtaining the credential information by its hash value, verifying it, and decrypting the storage location information once verification succeeds.
Chinese invention patent application publication No. CN110232080A discloses a blockchain-based fast retrieval method, comprising: a verification design for a database management model based on blockchain technology, the model being applied at the electronic archive stage, after an electronic file has been archived and entered into the database; and a retrieval design for the same model, in which a primary database is divided into several secondary databases by field, the target field information being the field to which the target data block corresponds; the target secondary database is then searched by keyword to determine the target data block and exclude non-target data blocks.
In the above applications, the retrieved data can be encrypted separately, which prevents arbitrary tampering and provides finer-grained management of data retrieval, and retrieval efficiency is improved by excluding a large number of non-target data blocks.
Disclosure of Invention
(I) Technical problem to be solved
To address the shortcomings of the prior art, the invention provides a blockchain-based fast data retrieval method and system. The different kinds of information in the transaction data are each stored in a corresponding database, which on the one hand provides better query capability and facilitates encrypted data management, and on the other hand allows any one database to be expanded without affecting the performance of the others. Similarity calculation and verification are performed in sequence using the Pearson correlation coefficient and visual analysis, ensuring the accuracy of the calculation results; on that basis a similarity evaluation value Pgxs is obtained, and comparing it with a similarity threshold accurately identifies the matching results that satisfy the output condition, further improving the accuracy of data retrieval and thereby solving the problems described in the Background.
(II) Technical solution
To achieve the above object, the invention is implemented through the following technical solution:
A blockchain-based fast data retrieval system, comprising:
a blockchain structure design module, used to design the blockchain structure, including defining the attributes of a block, determining the data format stored in a block and determining the hash algorithm, wherein each block in the blockchain contains transaction data and the corresponding hash value, the transaction data comprises multiple transactions, and the transactions specifically comprise insurance contract information, claim information and user information;
a database module, used to create a comprehensive information database and store the transaction data in the blockchain, the comprehensive information database being organized and stored by attribute and comprising a user database, an insurance contract database and a claim information database, all created by the same process;
a mechanism setting module, used to accelerate data lookup with an index structure that maps the transaction data and the corresponding hash values to the corresponding blocks according to the attributes specified by a rule engine;
a similarity calculation module, comprising a data extraction unit, an analysis and calculation unit and a threshold comparison unit, wherein the data extraction unit extracts the data attributes of the search content, the analysis and calculation unit generates the similarity evaluation value Pgxs, and the threshold comparison unit sets a similarity threshold against which Pgxs is compared to obtain the corresponding result;
and a matching output module, used to obtain, after Pgxs has been compared with the similarity threshold, the results that satisfy the output condition, mark the corresponding blocks as matching results, and output them.
Further, the blockchain structure design module performs the following steps:
S101, defining the basic attributes of a block: the block index, timestamp, hash value of the previous block, number of transactions, and random number (nonce);
S102, determining the data format stored in the block: the specific data format stored in each block is determined, each block holding insurance contract information, claim information and user information;
S103, determining the hash algorithm: SHA-256 is selected as the hash algorithm, and a hash is calculated over the block to generate a unique hash value;
S104, block linking: each block includes the hash value of the previous block, which links the blocks into a chain.
Through the above steps, a blockchain structure can be designed to support data storage and transactions in the insurance industry. The structure ensures the security, integrity and tamper resistance of the data while providing traceability and shareability; the specific design can be adjusted and optimized for actual requirements and scenarios.
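Steps S101 to S104 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Block` class, the `append_block` helper, the JSON serialization and the all-zero previous hash for the first block are hypothetical choices made only for the example.

```python
import hashlib
import json
import time
from dataclasses import dataclass


@dataclass
class Block:
    # S101: basic attributes of a block (field names are illustrative).
    index: int
    timestamp: float
    previous_hash: str
    transactions: list  # S102: insurance contract, claim and user records
    nonce: int = 0

    @property
    def transaction_count(self) -> int:
        return len(self.transactions)

    def compute_hash(self) -> str:
        # S103: SHA-256 over a deterministic serialization of the block.
        payload = json.dumps(
            {
                "index": self.index,
                "timestamp": self.timestamp,
                "previous_hash": self.previous_hash,
                "transaction_count": self.transaction_count,
                "nonce": self.nonce,
                "transactions": self.transactions,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain: list, transactions: list) -> Block:
    # S104: each new block stores the hash of the block before it.
    # Assumption: the first block uses an all-zero previous hash.
    previous_hash = chain[-1].compute_hash() if chain else "0" * 64
    block = Block(len(chain), time.time(), previous_hash, transactions)
    chain.append(block)
    return block


chain = []
append_block(chain, [{"type": "user", "name": "Alice"}])
append_block(chain, [{"type": "claim", "claim_number": "C-001"}])
```

Serializing with `sort_keys=True` keeps the hash stable regardless of dictionary key order, which matters because the same block must always produce the same hash value.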
Further, the user database is created through the following steps:
S201, defining user data attributes: the attributes of each user information record are determined, including name, ID card number, contact information and insurance authentication information;
S202, data encryption and privacy protection: the user information data of the insurance industry is encrypted and desensitized;
S203, storing and organizing user data: each user information record is treated as a transaction and stored in one or more blocks, and a NoSQL database is used to store the user information data;
S204, hashing and linking user data: a hash is calculated for each user information record, the corresponding hash value is stored in each block, and the hash values of the user information data in the blocks are linked;
S205, data access control and rights management: a permission unit is set up in the database module, through which authorized users access and modify user data.
Through the above steps, the creation of the user database draws on the characteristics of the blockchain, ensures that user data is stored in a tamper-resistant blockchain, provides a high level of data security and confidentiality, and meets the insurance industry's requirements for the confidentiality of user information.
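The desensitization, hashing and permission steps (S202, S204, S205) might look like the sketch below. Everything named here is a hypothetical illustration: the masking rule (keep the last four characters), the `PermissionUnit` class and its action names are assumptions. Encryption itself is omitted because the Python standard library offers no symmetric cipher, so a real system would rely on a dedicated cryptography library.

```python
import hashlib
import json


def mask_id_number(id_number: str, visible: int = 4) -> str:
    """S202 (desensitization): hide all but the last `visible` characters."""
    hidden = max(len(id_number) - visible, 0)
    return "*" * hidden + id_number[hidden:]


def record_hash(record: dict) -> str:
    """S204: SHA-256 hash of one user information record."""
    body = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(body).hexdigest()


class PermissionUnit:
    """S205: hypothetical permission unit mapping users to allowed actions."""

    def __init__(self):
        self._grants = {}

    def grant(self, user: str, action: str) -> None:
        self._grants.setdefault(user, set()).add(action)

    def is_allowed(self, user: str, action: str) -> bool:
        return action in self._grants.get(user, set())


user = {
    "name": "Alice",
    "id_number": mask_id_number("110101199001011234"),
    "contact": "alice@example.com",
}
digest = record_hash(user)

permissions = PermissionUnit()
permissions.grant("underwriter-01", "read")
```

Hashing the already masked record means the stored hash never depends on the raw ID number, although whether hashing happens before or after desensitization is a design decision the text leaves open.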
Further, the indexing mechanism is set up through the following steps:
S301, setting the index structure: the index structure comprises a hash table and a B+ tree; the hash table serves as the primary index structure and the B+ tree as its alternative, and if the system does not respond within 0.5 s, it switches to the B+ tree as the index structure;
S302, selecting attributes: the attributes to be indexed are determined by the rule engine;
S303, data mapping and storage: the transaction data and the corresponding hash values are mapped to the corresponding blocks according to the configured index structure and stored in the blockchain.
Through the above mechanism-setting steps, which combine the index structure with the data attributes, a blockchain-based fast data retrieval system can be realized; this design improves retrieval efficiency and user experience and enhances the usability and functionality of the system.
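The primary-then-alternative lookup of S301 can be sketched as below. Two assumptions are worth flagging: a sorted key list with binary search stands in for a real B+ tree (the standard library has none), and "does not respond within 0.5 s" is interpreted as a timeout on the primary lookup running in a worker thread. The names `SortedIndex` and `lookup` are hypothetical.

```python
import bisect
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class SortedIndex:
    """Stand-in for a B+ tree: sorted keys searched by bisection."""

    def __init__(self, mapping: dict):
        self._keys = sorted(mapping)
        self._values = [mapping[k] for k in self._keys]

    def get(self, key):
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            return self._values[pos]
        return None


def lookup(key, primary, fallback, timeout: float = 0.5):
    """Try the primary index; switch to the fallback if it exceeds `timeout`."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(primary, key)
    try:
        return future.result(timeout=timeout), "hash"
    except FutureTimeout:
        return fallback(key), "b+tree"
    finally:
        # Do not block on a primary lookup that is still running.
        executor.shutdown(wait=False)


blocks_by_policy = {"P-1001": 3, "P-1002": 7}  # policy number -> block index
tree_index = SortedIndex(blocks_by_policy).get


def slow_hash_index(key):
    time.sleep(0.2)  # simulate an unresponsive primary index
    return blocks_by_policy.get(key)


fast_result = lookup("P-1001", blocks_by_policy.get, tree_index)
slow_result = lookup("P-1002", slow_hash_index, tree_index, timeout=0.05)
```

The shortened `timeout=0.05` in the second call only keeps the demonstration quick; the text specifies 0.5 s, which is the default here.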
Further, in the analysis and calculation unit, the data attributes of the search content A1 are first matched to the corresponding database; the similarity between the data attributes is then obtained using the Pearson correlation coefficient, forming a similarity data set $Xsd_i$, i.e. $\{Xsd_1, Xsd_2, \ldots, Xsd_i\}$; finally, a data analysis model is built, which generates the similarity evaluation value Pgxs from the output similarity data set $Xsd_i$. The similarity evaluation value Pgxs is generated according to the following formula:
[formula not reproduced in the source text]
In the formula, i denotes the number of data attributes in the search content A1, and K denotes a constant correction coefficient.
Further, after the similarity between the data attributes has been obtained from the Pearson correlation coefficient, each similarity result in the similarity data set $Xsd_i$ is verified by visual analysis: if the scatter plot shows a trend, the corresponding similarity result is output; if it shows no trend, the similarity calculation is performed again.
Further, comparing the similarity threshold with the similarity evaluation value Pgxs gives the following result: if Pgxs reaches or exceeds the similarity threshold, the output condition is satisfied; if Pgxs does not reach the similarity threshold, the output condition is not satisfied.
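The Pearson correlation step can be illustrated with a short Python sketch. It assumes the data attributes have already been encoded as equal-length numeric vectors, which the text does not specify; the function name `pearson` and the sample vectors are hypothetical. The aggregation of the similarity data set into Pgxs is deliberately left out, because the formula is not reproduced in the text.

```python
import math
from statistics import mean


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have the same length of at least 2")
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


# Hypothetical numeric encodings of the search content and stored records.
query = [1.0, 2.0, 3.0, 4.0]
candidates = [
    [2.0, 4.0, 6.0, 8.0],  # rises with the query
    [8.0, 6.0, 4.0, 2.0],  # falls as the query rises
]

# Similarity data set: one coefficient per candidate.
xsd = [pearson(query, candidate) for candidate in candidates]
```

The coefficient ranges from -1 to 1, so a threshold applied downstream has to be chosen with that range, and whatever scaling the Pgxs formula applies, in mind.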
A blockchain-based fast data retrieval method, comprising the following steps:
Step one, designing the blockchain structure, including defining the attributes of a block, determining the data format stored in a block and determining the hash algorithm, wherein each block in the blockchain contains transaction data and the corresponding hash value, the transaction data comprises multiple transactions, and the transactions specifically comprise insurance contract information, claim information and user information;
Step two, creating a comprehensive information database and storing the transaction data in the blockchain, wherein the comprehensive information database can be organized and stored by attribute and comprises a user database, an insurance contract database and a claim information database, all created by the same process;
Step three, accelerating data lookup with an index structure that maps the transaction data and the corresponding hash values to the corresponding blocks according to the attributes specified by the rule engine;
Step four, extracting the data attributes of the search content, obtaining the similarity between the data attributes using the Pearson correlation coefficient, and forming the similarity data set $Xsd_i$ after verification; building a data analysis model that generates the similarity evaluation value Pgxs from the output similarity data set $Xsd_i$; and setting, in a threshold comparison unit, a similarity threshold that is compared with Pgxs to obtain the corresponding result;
Step five, obtaining, after Pgxs has been compared with the similarity threshold, the results that satisfy the output condition, marking the corresponding blocks as matching results, and outputting the matching results.
(III) Beneficial effects
The invention provides a blockchain-based fast data retrieval method and system with the following beneficial effects:
The different kinds of information in the transaction data are each stored in a corresponding database, which provides better query capability, facilitates encrypted data management, and allows any one database to be expanded without affecting the performance of the others. When similarity is calculated, the data attributes of the search content are compared with the data attributes in the corresponding database; similarity calculation and verification are performed in sequence using the Pearson correlation coefficient and visual analysis, ensuring the accuracy of the calculation results; on that basis the similarity evaluation value Pgxs is obtained, and comparing it with the similarity threshold accurately identifies the matching results that satisfy the output condition, achieving efficient and accurate data retrieval.
Drawings
FIG. 1 is an overall flow chart of a blockchain-based fast data retrieval method of the present invention;
FIG. 2 is a block chain architecture diagram of a block chain based fast data retrieval system according to the present invention.
Detailed Description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.
Example 1: referring to FIG. 2, the invention provides a blockchain-based fast data retrieval system, comprising:
a blockchain structure design module, which designs the blockchain structure, including defining the attributes of a block, determining the data format stored in a block and determining the hash algorithm, wherein each block in the blockchain contains a certain amount of transaction data and the corresponding hash value, the transaction data comprises multiple transactions, and the transactions specifically comprise insurance contract information, claim information and user information;
The blockchain structure design module performs the following steps:
S101, defining the basic attributes of a block: the block index, timestamp, hash value of the previous block, number of transactions, and random number (nonce); these attributes help establish the uniqueness of each block and its links to other blocks;
S102, determining the data format stored in the block: the data format stored in each block is determined by the application scenario and requirements; in the insurance industry addressed by this application, a block holds insurance contract information, claim information and user information, which can be structured as required and stored in the data field of the block;
S103, determining the hash algorithm: a suitable hash algorithm is selected and applied to the block to generate a unique hash value; common hash algorithms include SHA-256 and SHA-3; a suitable hash algorithm ensures the integrity and security of the data, because any tampering with the block data changes the hash value; this application uses SHA-256;
S104, block linking: each block includes the hash value of the previous block; this linking mechanism ensures the integrity and tamper resistance of the blockchain, because modifying any block changes its hash value and therefore invalidates the hash links of all subsequent blocks in the chain; where needed, a concurrency control mechanism can be designed into the blockchain structure and a distributed consensus algorithm adopted so that all nodes agree on operations on the blockchain; the specific design is not detailed here;
Through the above steps, a blockchain structure can be designed to support data storage and transactions in the insurance industry. The structure ensures the security, integrity and tamper resistance of the data while providing traceability and shareability; the specific design can be adjusted and optimized for actual requirements and scenarios.
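The tamper-evidence property described in S104 can be demonstrated with a small Python sketch. It is only an illustration under simplifying assumptions: blocks are plain dictionaries, and the helper names `block_hash`, `build_chain` and `first_broken_link` are hypothetical.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a deterministic serialization of the block."""
    body = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(body).hexdigest()


def build_chain(payloads: list) -> list:
    """Link blocks by storing each predecessor's hash in its successor."""
    chain = []
    previous_hash = "0" * 64  # assumption: all-zero hash before the first block
    for index, payload in enumerate(payloads):
        block = {"index": index, "previous_hash": previous_hash, "data": payload}
        chain.append(block)
        previous_hash = block_hash(block)
    return chain


def first_broken_link(chain: list):
    """Return the index of the first block whose stored link is stale, else None."""
    for i in range(1, len(chain)):
        if chain[i]["previous_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain(["policy P-1", "claim C-1", "user U-1"])
intact = first_broken_link(chain)

chain[0]["data"] = "policy P-1 (edited)"  # tamper with the first block
broken_at = first_broken_link(chain)
```

Editing block 0 changes its hash, so the link stored in block 1 no longer matches; repairing it would require rewriting every later block as well, which is what makes tampering detectable.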
The database module creates a comprehensive information database and stores the transaction data in the blockchain. Each piece of user information, insurance contract information and claim information can be treated as a transaction and stored in one or more blocks. Within the blockchain, the comprehensive information database can be organized and stored by attribute; it comprises a user database, an insurance contract database and a claim information database, all created by the same process. The user database is created through the following steps:
S201, defining user data attributes: the attributes of each user information record are determined, including name, ID card number, contact information and insurance authentication information, and can be defined and designed according to specific insurance business requirements; the attributes of insurance contract data include the insured amount, the insurance period and the policy number; the attributes of claim information include the claim amount, the contact information of the claim handler and the claim number;
S202, data encryption and privacy protection: user information data in the insurance industry involves sensitive information; to protect user privacy and data security, it must be encrypted and desensitized, which is done using cryptographic algorithms and data protection techniques;
S203, storing and organizing user data: each user information record is treated as a transaction and stored in one or more blocks; a NoSQL database, such as MongoDB or Cassandra, is chosen to store the user information data, since it offers more flexible storage and query options and can be adapted to the data model and business requirements;
S204, hashing and linking user data: to ensure the integrity and tamper resistance of the data, a hash is calculated for each user information record and the corresponding hash value is stored in each block; linking the hash values of the user information data in the blocks provides traceability and consistency;
S205, data access control and rights management: data access control and rights management measures are set according to the data privacy and compliance requirements of the insurance industry, and a permission unit is set up in the database module, through which authorized users access and modify user data, ensuring the security and confidentiality of the data.
Through the above steps, the creation of the user database draws on the characteristics of the blockchain, ensures that user data is stored in a tamper-resistant blockchain, provides a high level of data security and confidentiality, and meets the insurance industry's requirements for the confidentiality of user information.
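The three record types listed in S201 could be modelled as follows. The class and field names are hypothetical English renderings of the listed attributes, and the field types (for example, a string for the insurance period) are assumptions made for the sketch.

```python
from dataclasses import asdict, dataclass


@dataclass
class UserInfo:
    name: str
    id_number: str
    contact: str
    insurance_authentication: str


@dataclass
class InsuranceContract:
    policy_number: str
    insured_amount: float
    insurance_period: str


@dataclass
class ClaimInfo:
    claim_number: str
    claim_amount: float
    handler_contact: str


def to_transaction(record) -> dict:
    """Tag a record with its type so it can be stored as a block transaction."""
    return {"type": type(record).__name__, **asdict(record)}


tx = to_transaction(ClaimInfo("C-001", 1200.0, "handler@example.com"))
```

Keeping one class per database mirrors the separation of user, contract and claim data, so each record type can evolve without touching the others.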
The mechanism setting module selects a suitable mechanism for fast data retrieval: an index structure (e.g. a hash table and a B+ tree) is used to accelerate data lookup, mapping the transaction data and the corresponding hash values to the corresponding blocks according to the attributes specified in the rule engine.
After the user database has been built, the indexing mechanism is set up through the following steps:
S301, setting the index structure: an index structure is selected according to specific requirements and the system design in order to accelerate data retrieval; the index structure comprises a hash table and a B+ tree, the hash table being suited to fast key-value lookup and the B+ tree to range queries and ordered traversal; the hash table serves as the primary index structure and the B+ tree as its alternative, and if the system does not respond within 0.5 s, it switches to the B+ tree as the index structure; the appropriate index structure is thus chosen according to actual conditions to optimize retrieval efficiency; rather than relying on a single type of index structure, the system first exploits the speed of the hash-table index and then the broader coverage of the B+ tree index;
S302, selecting attributes: the attributes to be indexed are determined by the rule engine; key attributes are selected as indexes according to the data characteristics and retrieval requirements of the insurance industry so that the relevant blocks can be located quickly; for example, data can be indexed by customer name, policy number or claim number;
S303, data mapping and storage: the transaction data and the corresponding hash values are mapped to the corresponding blocks according to the configured index structure and stored in the blockchain;
Through the above mechanism-setting steps, which combine the index structure with the data attributes, a blockchain-based fast data retrieval system can be realized; this design improves retrieval efficiency and user experience and enhances the usability and functionality of the system.
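Steps S302 and S303 can be sketched as a rule table that names the attributes to index, plus a function that maps attribute values to block positions. The rule table `INDEX_RULES`, the record layout and the `build_indexes` helper are all hypothetical; the text does not describe how the rule engine is implemented.

```python
from collections import defaultdict

# S302: hypothetical rule table, record type -> attributes to index.
INDEX_RULES = {
    "user": ["name"],
    "contract": ["policy_number"],
    "claim": ["claim_number"],
}


def build_indexes(blocks: list, rules: dict) -> dict:
    """S303: map (record type, attribute, value) to the blocks that hold it."""
    indexes = defaultdict(list)
    for block_index, transactions in enumerate(blocks):
        for tx in transactions:
            for attribute in rules.get(tx["type"], []):
                if attribute in tx:
                    key = (tx["type"], attribute, tx[attribute])
                    indexes[key].append(block_index)
    return dict(indexes)


# Each inner list holds the transactions of one block.
blocks = [
    [{"type": "user", "name": "Alice"}],
    [
        {"type": "contract", "policy_number": "P-1001"},
        {"type": "claim", "claim_number": "C-001"},
    ],
    [{"type": "claim", "claim_number": "C-002", "claim_amount": 500}],
]

indexes = build_indexes(blocks, INDEX_RULES)
hit = indexes.get(("claim", "claim_number", "C-002"))
```

Because only the attributes named in the rule table are indexed, `claim_amount` in the last block is stored but not searchable through the index, which keeps the index small at the cost of restricting lookups to the chosen keys.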
The similarity calculation module comprises a data extraction unit, an analysis calculation unit and a threshold comparison unit;
the steps performed in the similarity calculation module are as follows:
S401, extracting the data attributes of the search content A1 through the data extraction unit;
S402, in the analysis and calculation unit, first determining which specific database in the comprehensive information database the data attributes of the search content A1 belong to, and then obtaining the similarity between the data attributes by the Pearson correlation coefficient calculation method to form a similarity data set Xsd_i, i.e. {Xsd_1, Xsd_2, ..., Xsd_i}; verifying each similarity result in the similarity data set Xsd_i by visual analysis, wherein if a trend is present in the scatter diagram, the corresponding similarity result is output, and if no trend is present in the scatter diagram, the similarity calculation is repeated; then constructing a data analysis model and generating the similarity evaluation value Pgxs from the output similarity data set Xsd_i according to the following formula:
in the formula, i represents the number of data attributes in the search content A1, and K represents a constant correction coefficient whose specific value is set according to actual requirements and may be, for example, 1.357.
S403, setting a similarity threshold in the threshold comparison unit, and comparing the similarity threshold with the similarity evaluation value Pgxs, wherein if the similarity evaluation value Pgxs reaches or exceeds the similarity threshold, the output condition is met, and if the similarity evaluation value Pgxs does not reach the similarity threshold, the output condition is not met.
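The Pearson step in S402 and the threshold test in S403 can be sketched as follows. This is a minimal illustration that assumes the data attributes have already been encoded as numeric sequences of equal length, which the text does not specify. The sketch does not compute Pgxs, because the formula is not reproduced here; `meets_output_condition` simply takes a Pgxs value supplied by the caller. Both function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def meets_output_condition(pgxs, threshold):
    """S403: the condition holds when Pgxs reaches or exceeds the threshold."""
    return pgxs >= threshold


# Illustrative numeric encodings of a query attribute and a stored attribute.
query = [1.0, 2.0, 3.0, 4.0]
stored = [2.0, 4.0, 6.0, 8.0]
similarity = pearson(query, stored)  # perfectly linear, so close to 1.0
print(meets_output_condition(0.8, 0.8))  # True: "reaches" includes equality
```

The comparison uses `>=` because the text states that the output condition is met when Pgxs reaches or exceeds the threshold.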
The matching output module obtains the results that meet the output condition after the similarity evaluation value Pgxs has been compared with the similarity threshold, marks the corresponding blocks as matching results, and outputs the matching results.
Specifically, creating a separate database for each kind of information in the transaction data provides better query capability, facilitates encrypted data management, and allows any one database to be expanded without affecting the performance of the others. During similarity calculation, the data attributes of the search content are compared with the data attributes in the corresponding database. Similarity calculation and verification are performed in sequence by the Pearson correlation coefficient calculation method and by visual analysis, which helps ensure the accuracy of the calculation results. The similarity evaluation value Pgxs is obtained on this basis, and comparing it with the similarity threshold yields the matching results that meet the output condition, thereby achieving efficient and accurate data retrieval.
Example 2: referring to FIG. 1, a blockchain-based fast data retrieval method includes the following steps:
step one, designing a blockchain structure, including defining the attributes of a block, determining the data format stored in the block, and determining a hash algorithm, wherein each block in the blockchain comprises transaction data and a corresponding hash value, the transaction data comprises a plurality of transactions, and the specific transactions comprise: insurance contract information, claim information, and user information;
step two, creating a comprehensive information database and storing the transaction data in the blockchain, wherein the comprehensive information database is organized and stored according to different attributes and comprises a user database, an insurance contract database, and a claim information database, the creation process being the same for each database;
step three, using an index structure to accelerate data lookup, wherein the index structure maps the transaction data and the corresponding hash values to the corresponding blocks according to the attributes specified by the rule engine;
step four, extracting the data attributes of the search content, obtaining the similarity between the data attributes by the Pearson correlation coefficient calculation method, and forming a similarity data set Xsd_i after verification; constructing a data analysis model and generating a similarity evaluation value Pgxs from the output similarity data set Xsd_i; and setting, in the threshold comparison unit, a similarity threshold to be compared with the similarity evaluation value Pgxs, a corresponding result being obtained after the comparison;
step five, obtaining the results that meet the output condition after the similarity evaluation value Pgxs has been compared with the similarity threshold, marking the corresponding blocks as matching results, and outputting the matching results.
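Step one can be illustrated with a minimal sketch of a block and its hash link. The field set follows the block attributes named in the claims (index, timestamp, hash of the previous block, transaction number, random number), and SHA-256 is the hash algorithm named there. The canonical JSON encoding, the field names, and the helper functions are assumptions made only for this example.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 over a canonical JSON encoding of the block (encoding is assumed)."""
    payload = json.dumps(block, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_block(index, timestamp, previous_hash, transactions, nonce=0):
    """Build a block dictionary holding the basic attributes and its transactions."""
    return {
        "index": index,
        "timestamp": timestamp,
        "previous_hash": previous_hash,
        "transaction_count": len(transactions),
        "nonce": nonce,
        "transactions": transactions,
    }


def chain_is_linked(chain):
    """Check that every block stores the hash of the block before it."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# Illustrative data only.
genesis = make_block(0, 1698105600, "0" * 64, [{"type": "user", "name": "Li Wei"}])
second = make_block(1, 1698105660, block_hash(genesis),
                    [{"type": "claim", "claim_number": "C-100"}])
print(chain_is_linked([genesis, second]))  # True
```

Because each block stores the hash of its predecessor, changing any field of an earlier block changes that block's hash and breaks the link check for the block that follows it.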
All of the above formulas are dimensionless and operate on numerical values only. Each formula was obtained by software simulation on a large amount of collected data so as to approximate the real situation as closely as possible, and the preset parameters in the formulas are set by those skilled in the art according to the actual situation.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any other combination. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The foregoing describes merely specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed in the present application is intended to be covered by the scope of protection of the present application.
Claims (9)
1. A blockchain-based fast data retrieval system, characterized by comprising:
the block chain structure design module is used for designing a block chain structure and comprises the steps of defining the attribute of a block, determining the data format stored in the block and determining a hash algorithm, wherein each block comprises transaction data and corresponding hash values in the block chain, the transaction data comprises a plurality of transactions, and specific transactions comprise: insurance contract information, claim information, and user information;
the database module is used for creating a comprehensive information database, storing transaction data in a blockchain, organizing and storing the comprehensive information database according to different attributes, wherein the comprehensive information database comprises a corresponding user database, an insurance contract database and a claim information database, and the creation process of each database is the same;
the mechanism setting module is used for accelerating the searching of the data by using an index structure, and the index structure maps the transaction data and the corresponding hash value to the corresponding block according to the attribute appointed by the rule engine;
the similarity calculation module comprises a data extraction unit, an analysis calculation unit and a threshold comparison unit, wherein the data extraction unit is used for extracting data attributes of the search content, the analysis calculation unit is used for generating a similarity evaluation value Pgxs, and the threshold comparison unit is used for setting a similarity threshold value compared with the similarity evaluation value Pgxs, so that a corresponding result is obtained after comparison;
and the matching output module is used for obtaining a result meeting the output condition after the similarity evaluation value Pgxs is compared with the similarity threshold value, marking the corresponding block as a matching result and outputting the matching result.
2. A blockchain-based fast data retrieval system as in claim 1 wherein: the block chain structure design module specifically comprises the following steps:
s101, defining basic attributes of a block: including index of block, timestamp, hash value of previous block, transaction number and random number;
s102, determining a data format stored in the block: determining a specific data format stored in each block, wherein the blocks comprise insurance contract information, claim information and user information;
s103, determining a hash algorithm: selecting SHA-256 as a hash algorithm, and performing encryption calculation on the block to generate a unique hash value;
s104, block linking: the blocks are linked into a chain by including in each block the hash value of the previous block.
3. A blockchain-based fast data retrieval system as in claim 1 wherein: the specific steps in creating the user database are as follows:
s201, defining user data attribute: determining specific attributes of each piece of user information data includes: name, ID card number, contact information and insurance authentication information;
s202, data encryption and privacy protection: encrypting and desensitizing user information data in insurance industry;
s203, storing and organizing user data: each user information data is used as a transaction and stored in one or more blocks, and the user information data is stored by adopting a NoSQL database;
s204, hash and link of user data: carrying out hash calculation on each piece of user information, storing a corresponding hash value in each block, and linking the hash values of the user information data in each block;
s205, data access control and rights management: and setting up a permission unit in the database module, wherein the permission unit is used for an authorized user to access and modify user data.
4. A blockchain-based fast data retrieval system as in claim 1 wherein: the specific steps of the index mechanism setting are as follows:
s301, setting an index structure: the index structure comprises a hash table and a B+ tree, the hash table is used as a primary index structure, the B+ tree is used as an alternative index structure of the hash table, and if the system does not respond within 0.5s, the system is switched to the B+ tree to be used as the index structure;
s302, selecting attributes: determining the attribute needing to establish the index through a rule engine;
s303, data mapping and storage: transaction data and corresponding hash values are mapped into corresponding chunks according to a set index structure and stored in a blockchain.
5. A blockchain-based fast data retrieval system as in claim 1 wherein: in the analysis and calculation unit, the data attributes of the search content A1 are first associated with a database; then the similarity between the data attributes is obtained by the Pearson correlation coefficient calculation method to form a similarity data set Xsd_i, i.e. {Xsd_1, Xsd_2, ..., Xsd_i}; finally, a data analysis model is constructed and a similarity evaluation value Pgxs is generated from the output similarity data set Xsd_i.
6. The blockchain-based fast data retrieval system of claim 5, wherein: the formula according to which the similarity evaluation value Pgxs is generated is as follows:
in the formula, i represents the number of data attributes in the search content A1, and K represents a constant correction coefficient.
7. The blockchain-based fast data retrieval system of claim 6, wherein: after the similarity between the data attributes is obtained by the Pearson correlation coefficient calculation method, each similarity result in the similarity data set Xsd_i is verified by visual analysis; if a trend is present in the scatter diagram, the corresponding similarity result is output, and if no trend is present in the scatter diagram, the similarity calculation is performed again.
8. A blockchain-based fast data retrieval system as in claim 1 wherein: comparing the similarity threshold value with a similarity evaluation value Pgxs to obtain the following result: if the similarity evaluation value Pgxs reaches or exceeds the similarity threshold, the output condition is satisfied, and if the similarity evaluation value Pgxs does not reach the similarity threshold, the output condition is not satisfied.
9. A fast data retrieval method based on a blockchain, using the system of any of claims 1 to 8, characterized in that: the method comprises the following steps:
step one, designing a block chain structure, which comprises defining attributes of blocks, determining data formats stored in the blocks and determining a hash algorithm, wherein each block comprises transaction data and corresponding hash values in the block chain, the transaction data comprises a plurality of transactions, and the specific transactions comprise: insurance contract information, claim information, and user information;
step two, creating a comprehensive information database and storing the transaction data in the blockchain, wherein the comprehensive information database is organized and stored according to different attributes and comprises a user database, an insurance contract database, and a claim information database, the creation process being the same for each database;
step three, using an index structure to accelerate the searching of the data, wherein the index structure maps the transaction data and the corresponding hash value to the corresponding block according to the attribute appointed by the rule engine;
step four, extracting the data attributes of the search content, obtaining the similarity between the data attributes by the Pearson correlation coefficient calculation method, and forming a similarity data set Xsd_i after verification; constructing a data analysis model and generating a similarity evaluation value Pgxs from the output similarity data set Xsd_i; and setting, in the threshold comparison unit, a similarity threshold to be compared with the similarity evaluation value Pgxs, a corresponding result being obtained after the comparison;
step five, obtaining the results that meet the output condition after the similarity evaluation value Pgxs has been compared with the similarity threshold, marking the corresponding blocks as matching results, and outputting the matching results.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311386638.9A CN117390118A (en) | 2023-10-24 | 2023-10-24 | Quick data retrieval method and system based on block chain |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117390118A true CN117390118A (en) | 2024-01-12 |
Family
ID=89466274
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311386638.9A Pending CN117390118A (en) | 2023-10-24 | 2023-10-24 | Quick data retrieval method and system based on block chain |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117390118A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118484840A (en) * | 2024-07-12 | 2024-08-13 | 山东政信大数据科技有限责任公司 | Credit data asset security management and traceability system based on block chain technology |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7519835B2 (en) | Encrypted table indexes and searching encrypted tables | |
US10740474B1 (en) | Systems and methods for generation of secure indexes for cryptographically-secure queries | |
CN105678189B (en) | Data file encryption storage and retrieval system and method | |
Dai et al. | A privacy-preserving multi-keyword ranked search over encrypted data in hybrid clouds | |
Ku et al. | A query integrity assurance scheme for accessing outsourced spatial databases | |
CN117390118A (en) | Quick data retrieval method and system based on block chain | |
Servan-Schreiber et al. | Private approximate nearest neighbor search with sublinear communication | |
CN114579998A (en) | Block chain assisted medical big data search mechanism and privacy protection method | |
Fu et al. | A privacy-preserving fuzzy search scheme supporting logic query over encrypted cloud data | |
Guo et al. | LuxGeo: Efficient and Security-Enhanced Geometric Range Queries | |
Yan et al. | Secure multi-keyword search supporting dynamic update and ranked retrieval | |
Raghavendra et al. | Survey on data storage and retrieval techniques over encrypted cloud data | |
Zhang et al. | A verifiable and dynamic multi-keyword ranked search scheme over encrypted cloud data with accuracy improvement | |
He et al. | FMSM: A fuzzy multi-keyword search scheme for encrypted cloud data based on multi-chain network | |
CN116484399A (en) | Method and system for constructing ciphertext range search result completeness verification data structure | |
Dang | Ensuring correctness, completeness, and freshness for outsourced tree-indexed data | |
Huang et al. | Efficient privacy-preserving content-based image retrieval in the cloud | |
CN113158245A (en) | Method, system, equipment and readable storage medium for searching document | |
Chang | Is distributed ledger technology built for personal data? | |
Dong et al. | Arm: Authenticated approximate record matching for outsourced databases | |
Li et al. | A secure and efficient log storage and query framework based on blockchain | |
Bonomi et al. | A review of privacy preserving mechanisms for record linkage | |
CN115439118B (en) | Digital certificate storage management method based on blockchain | |
Gampala et al. | A study on privacy preserving searching approaches on encrypted data and open challenging issues in cloud computing | |
CN116756760B (en) | Searchable database encryption system and method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20240112 |