CN104765876B - Cloud storage method for massive GNSS small files - Google Patents
- Publication number
- CN104765876B (CN201510204235.7A)
- Authority
- CN
- China
- Prior art keywords
- file
- index
- gnss
- small documents
- observation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a cloud storage method for massive GNSS small files that effectively addresses the efficient storage, management, publication and sharing of such files. The method first merges the massive GNSS small files into big files and builds indexes for the merged big files. It then optimizes the storage strategy for index blocks: the split file blocks and index blocks are stored on the node holding the data block or on the data node nearest to the data block, and the index of GNSS data types is stored on the name node. This reduces both storage consumption and name node memory consumption and improves the performance of writing, accessing and deleting large numbers of small files. The method is simple and easy to operate, saves storage space, reduces memory consumption, and improves write, read and delete efficiency. It is a significant innovation in the management of massive GNSS small files, with substantial economic and social benefits.
Description
Technical field
The present invention relates to the technical field of "Geodesy and Survey Engineering" within the discipline of
"Surveying Science and Technology", and in particular to a cloud storage method for massive GNSS small files.
Background
With the continuous development of science and technology, global, national and regional continuously operating
reference station (CORS) networks are steadily being built, and Global Navigation Satellite Systems (GNSS)
are widely used in many fields. In particular, as independent CORS networks are integrated into
higher-level joint CORS networks with more base stations, and as these networks observe continuously,
the volume of GNSS data keeps growing.
Massive data poses challenges for storage and management, with large volumes of TB-scale data waiting to
be processed. Taking GNSS observation data as an example, one day of continuous observation at a 1-second
sampling rate produces up to 80 MB of data for GPS satellites alone. With some ten thousand stations
worldwide, the daily data volume reaches tens to hundreds of TB. In addition, unlike web logs and remote
sensing images, GNSS data comes in many kinds and formats, and the GNSS data typified by observation files
and solution product files belongs to the category of small files.
Faced with the storage and management challenges of massive GNSS small files, traditional storage area
networks (SAN) and network-attached storage (NAS) hit bottlenecks when scaling capacity and performance.
The File Transfer Protocol (FTP) services and relational databases currently used by GNSS data centers are
limited in many ways when managing massive GNSS data, and centralized storage cannot meet the needs of
large-scale GNSS data storage applications. Research institutions and researchers at home and abroad have
studied mass small-file storage extensively. Published work abroad includes "An Optimized Approach for
Storing and Accessing Small Files on Cloud Storage" (Journal of Network and Computer Applications),
"Metadata-Aware Small Files Storage Architecture on Hadoop" (Web Information Systems and Mining) and
"Hmfs: Efficient Support of Small Files Processing over HDFS" (Algorithms and Architectures for Parallel
Processing). Work in China includes "A scheme for improving small-file storage efficiency in cloud storage"
(Journal of Xi'an Jiaotong University), "A mass small-file storage method combining RDBMS and Hadoop"
(Geomatics and Information Science of Wuhan University) and "A small-file storage strategy for
spatio-temporal data in a cloud environment".
Existing solutions focus on querying metadata models, analyzing correlations among mass small files,
adjusting system structure and user access patterns, but pay little attention to indexing by data type and
characteristics or to the placement strategy of merged files, so they cannot be applied fully to the
management of GNSS small files. Given the storage demands of massive GNSS data typified by small files,
designing a cloud storage method for massive GNSS small files on an underlying open-source cloud platform,
taking GNSS data types and characteristics into account, becomes an effective way to achieve efficient
storage, management, publication and sharing of massive GNSS small files.
Summary of the invention
In view of the above, and to overcome the shortcomings of the prior art, the purpose of the present invention
is to provide a cloud storage method for massive GNSS small files that effectively addresses their efficient
storage, management, publication and sharing.
The technical solution of the present invention addresses the shortcomings and bottlenecks of centralized
storage for massive GNSS small files. Based on an underlying open-source cloud platform (Hadoop), a cloud
storage method for massive GNSS small files is designed and built to store such files efficiently in the
cloud. The massive GNSS small files are first merged into big files, and indexes are built for the merged
big files. The storage strategy for index blocks is then optimized: the split file blocks and index blocks
are stored on the node holding the data block or on the data node (DataNode) nearest to the data block, and
the index of GNSS data types is stored on the name node (NameNode). This reduces storage consumption and
the memory consumption of the name node (NameNode) and improves the performance of writing, accessing and
deleting large numbers of small files. The method comprises the following steps:
(1) Merge the massive GNSS small files into big files to reduce the name node (NameNode) memory occupied by
large numbers of small files. Files of the same observation period or solution epoch and of the same type
are merged first: GNSS observation files are merged in the order of the four letters of the station name,
and solution product files in the order of the three letters of the GNSS analysis center name. A large
number of GNSS observation files is thus merged into one big observation file covering consecutive
observation periods, and the solution product files into one big solution product file ordered by
consecutive solution epochs;
(2) Build indexes for the merged GNSS big files, separately for observation files and solution product
files, with one character per index level. For observation files, a five-level index is built from the file
sequence number, day of year and station name, and the last level stores the position information of the
observation file. For solution product files, a six-level index is built from the GPS week, day of week and
analysis center name, and the last level stores the position information of the solution product file;
(3) Split the indexes by data block size. Because GNSS data processing software can merge the observation
data of one day, the file sequence number can be unified as 0, so the first-level index (file sequence
number) of observation files is also 0. When splitting, the second to fifth index levels of observation
files and the first to sixth index levels of solution products are traversed from the bottom up, the index
size is computed, and the index is split into index blocks of 64 MB;
(4) Place each index block on the node that stores the data block or on the node nearest to the data block,
which speeds up file reads and further reduces the memory consumption of the name node (NameNode);
(5) Store the file-type index of the merged GNSS big files on the name node (NameNode). The file block path
mappings, together with the mappings from the three-character keys that identify observation and solution
product file types to index block paths, are stored on the name node (NameNode), while the file blocks and
index blocks are stored on the data nodes (DataNode), thereby achieving cloud storage of massive GNSS small
files.
The method of the invention is simple and easy to operate. It saves storage space, reduces memory
consumption, and improves write, read and delete efficiency, so that massive GNSS small files can be
stored, managed, published and shared efficiently. It is a significant innovation in the management of
massive GNSS small files, with substantial economic and social benefits.
Brief description of the drawings
Fig. 1 is a functional schematic of the small-file storage platform of the present invention.
Fig. 2 shows the index structure for observation files.
Fig. 3 shows the index structure for solution product files.
Fig. 4 is a schematic of the storage locations of observation files and solution product files.
Detailed description
Embodiments of the present invention are described in detail below with reference to the accompanying drawings.
As shown in Figs. 1-4, a specific implementation of the present invention comprises the following steps:
Step 1: Merge the massive GNSS small files into big files to reduce the name node (NameNode) memory occupied
by large numbers of small files. Massive GNSS small files comprise two types: observation files, typified by
observation data, navigation ephemerides and meteorological files; and solution product files, typified by
coordinate files, precise ephemerides and precise clock corrections. Both types use internationally unified
standard formats. Observation files use the Receiver Independent Exchange Format (RINEX). Solution products
use the Solution (Software/technique) Independent Exchange Format (SINEX), the Ionosphere Exchange Format
(IONEX) and the precise ephemeris format (SP3, NGS Standard GPS Format). Suppose n GNSS small files are
stored in the system. Each GNSS small file carries three parameters (position, time and file type) that
distinguish it from the others, so the GNSS small-file data set D is expressed as:
$D = \{\, d(L_i, T_j, I_k) \mid L_i \in L,\ T_j \in T,\ I_k \in I \,\}, \quad i, j, k \in \mathbb{Z}$    Formula (1)
where L denotes where the file was produced, mainly the station that collected an observation file or the
institution that computed a solution product file; T denotes when the file was produced and, because
stations observe continuously for 24 h and data centers compute and publish solutions on a regular
schedule, T is a continuous time series; I denotes the file type, defined by the standard formats above.
L and T are obtained from the file name and the header section of the file, and I from the file extension
and the header section of the file. D denotes a set; i, j and k are the indices of the position, time and
type parameters; and Z is the set of integers;
When merging small files, files of the same observation period or solution epoch and of the same type are
first merged in the order of the four-character station name or the three-character analysis center name.
The merged GNSS small-file set is expressed as:
Formula (2)
where T_j denotes the j-th observation period or solution epoch, I_k denotes the k-th file type, and Z is
the set of integers;
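The first merging pass can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `SmallFile` record and the `group_first_merge` helper are hypothetical names, the period T is reduced to a plain string, and the station and center names are sample values only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SmallFile:
    """One GNSS small file d(L_i, T_j, I_k); the field names are illustrative."""
    location: str   # L: station name or analysis center name
    period: str     # T: observation period or solution epoch, e.g. "2015-120"
    ftype: str      # I: file type, e.g. "o" (observation) or "sp3"


def group_first_merge(files):
    """Group files that share (T_j, I_k) and order each group by location name."""
    groups = defaultdict(list)
    for f in files:
        groups[(f.period, f.ftype)].append(f)
    return {key: sorted(members, key=lambda f: f.location)
            for key, members in groups.items()}


# Sample input: two observation files and one solution product for the same day.
files = [
    SmallFile("wuhn", "2015-120", "o"),
    SmallFile("bjfs", "2015-120", "o"),
    SmallFile("igs", "2015-120", "sp3"),
]
merged = group_first_merge(files)
```

Each value of `merged` lists the files that would be concatenated, in order, into one merged unit for that period and type.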
Then the small files of each type are merged in order of consecutive observation periods or solution
epochs. Because weekly solutions are in common use in GNSS work, the observation files of 7 consecutive
days and the daily solution files of 7 days are each merged into one big file, which can be expressed as:
Formula (3)
After these two merging steps, the GNSS observation files of 7 consecutive days are merged into one big
observation file covering consecutive observation periods, and the solution product files of 7 days into
one big solution product file ordered by consecutive solution epochs. The name of a big file identifies the
file type, the start and end observation or solution times, and the first and last station names or
analysis center names. The merged file is stored in the cloud storage system in blocks, with the data block
size set to 64 MB. Each data block is a set of multiple small files and occupies 150 B of name node
(NameNode) memory. Since each small file occupied 150 B of memory before merging, the memory consumption of
the name node (NameNode) is greatly reduced;
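The memory saving described above can be checked with a short calculation. The sketch below uses only the two figures given in the text (150 B per entry and 64 MB per data block). The workload of one million 1 MB files is a made-up example, and the calculation ignores the type index that the method also keeps on the name node.

```python
BLOCK_SIZE = 64 * 1024 * 1024   # 64 MB data block, as stated in the text
ENTRY_BYTES = 150               # name node memory per entry, as stated in the text


def namenode_bytes_unmerged(n_files):
    """Before merging, every small file occupies one entry."""
    return n_files * ENTRY_BYTES


def namenode_bytes_merged(n_files, avg_file_bytes):
    """After merging, one entry is needed per 64 MB data block."""
    total = n_files * avg_file_bytes
    n_blocks = (total + BLOCK_SIZE - 1) // BLOCK_SIZE  # integer ceiling
    return n_blocks * ENTRY_BYTES


# Hypothetical workload: one million files of 1 MB each.
before = namenode_bytes_unmerged(1_000_000)
after = namenode_bytes_merged(1_000_000, 1024 * 1024)
```

For this made-up workload the entry count falls from one million to 15,625, so the estimated memory falls by a factor of 64, which is the ratio of block size to file size.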
The massive GNSS small files comprise GNSS observation files and solution product files, all of which
follow internationally unified standard formats. Because GNSS data and product formats are upgraded over
time, upgraded file formats and newly proposed file types can also be brought into the category of GNSS
small files;
Files of the same observation period or solution epoch and of the same type may also first be merged by
identical observation period or solution date, and then by consecutive observation periods or solution
cycles. The name of a big file identifies the file type, the start and end observation or solution times,
and the first and last station names or analysis center names. After merging, the big file is stored in
the cloud storage system in blocks, with the data block size set to 64 MB; each data block is a set of
multiple GNSS small files;
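The second merging pass and the naming of big files can be sketched as below. The text lists the fields that make up a big file name but not their layout, so the underscore-separated layout in `big_file_name` is an assumption, as are both function names and the sample station names.

```python
from datetime import date, timedelta


def weekly_windows(first_day, last_day, days=7):
    """Yield (start, end) date pairs that cover the range in consecutive windows."""
    start = first_day
    while start <= last_day:
        end = min(start + timedelta(days=days - 1), last_day)
        yield start, end
        start = end + timedelta(days=1)


def big_file_name(ftype, start, end, first_name, last_name):
    """Combine the fields named in the text; this exact layout is an assumption."""
    return f"{ftype}_{start:%Y%j}_{end:%Y%j}_{first_name}_{last_name}"


# Two consecutive 7-day windows; dates are formatted as year plus day of year.
windows = list(weekly_windows(date(2015, 4, 26), date(2015, 5, 9)))
name = big_file_name("15o", *windows[0], "bjfs", "wuhn")
```

Formatting the dates as year plus day of year keeps the big file name aligned with the day-of-year field already used in the small file names.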
Step 2: Build indexes for the merged GNSS big files, that is, build indexes by L and T separately for
observation files and solution product files, as follows:
Observation files are saved in RINEX format, which uses the 8.3 naming convention, where 8 denotes an
8-character root name identifying the ownership of the file and 3 denotes a 3-character extension
identifying the file type. The concrete form is ssssdddf.yyt, where ssss is the 4-character station name,
ddd is the day of year, and f is the file sequence number within the day. Using f, ddd and ssss, a
five-level index is built from the top down with one character per index level, and the leaf nodes of the
last level store the position information of the observation file. The first-level index is the file
sequence number; its range consists of the two intervals [0,9] and [a,z], where [0,9] denotes the 10 Arabic
digits and [a,z] the 26 lowercase English letters. The second-level index is the hundreds digit of the day
of year, with range [0,3], that is, 4 Arabic digits. The third-level index is the tens digit of the day of
year, with range [0,9]. The fourth-level index is the units digit of the day of year, with range [0,9].
The fifth-level index is the four-character station name, each character of which falls in the intervals
[0,9] and [a,z];
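The five-level index for observation files can be sketched as a nested dictionary keyed by the characters described above. This is an illustration only: the function names and the stored location string are hypothetical, and the fifth level is treated here as the whole four-character station name, which is one reading of the text.

```python
import re

# RINEX 8.3 short name: ssssdddf.yyt
RINEX_NAME = re.compile(
    r"^([0-9a-z]{4})([0-3][0-9]{2})([0-9a-z])\.([0-9]{2})([a-z])$")


def observation_index_key(filename):
    """Return the five levels: sequence, day-of-year hundreds/tens/units, station."""
    m = RINEX_NAME.match(filename.lower())
    if m is None:
        raise ValueError(f"not an 8.3 RINEX name: {filename!r}")
    station, doy, seq, _year, _ftype = m.groups()
    return [seq, doy[0], doy[1], doy[2], station]


def insert(index, key, location):
    """Descend one dict level per key element and store location at the leaf."""
    node = index
    for level in key[:-1]:
        node = node.setdefault(level, {})
    node[key[-1]] = location


index = {}
# The location string is a placeholder for the stored position information.
insert(index, observation_index_key("wuhn1200.15o"), "bigfile_A:offset=0")
```

A lookup walks the same five keys, so finding a file costs one dictionary access per level regardless of how many files are indexed.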
Solution product files are saved with names of the form sssddddd.ttt, where sss is the three-character
abbreviation of the analysis center, the first four d in ddddd are the GPS week counted from 0 h on
6 January 1980, the last d is the day of week, and ttt is the type of solution product. A six-level index
is built from the GPS week, day of week and analysis center name, and the leaf nodes of the index store the
position information of the solution product file. The first- to fourth-level indexes are the thousands,
hundreds, tens and units digits of the GPS week, each with range [0,9]. The fifth-level index is the day of
the GPS week, with range [0,7], where the 7 integers in [0,6] denote the daily solution files of a week and
7 denotes the weekly solution file. The sixth-level index is the three-character analysis institution
name, each character of which falls in the interval [a,z];
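The six-level key for a solution product file can be derived from its name in the same way. The sketch below assumes the sssddddd.ttt pattern exactly as described; `solution_index_key` is a hypothetical helper and the file name is a sample value.

```python
def solution_index_key(filename):
    """Split sssddddd.ttt into four GPS-week digits, day of week and center name."""
    root, _, ftype = filename.lower().partition(".")
    if len(root) != 8 or len(ftype) != 3 or not root[3:].isdigit():
        raise ValueError(f"not an sssddddd.ttt name: {filename!r}")
    center, week, day = root[:3], root[3:7], root[7]
    if day > "7":
        raise ValueError("day of week must be 0-6, or 7 for a weekly solution")
    return [week[0], week[1], week[2], week[3], day, center]


# Sample name: center "igs", GPS week 1842, day 4, type "sp3".
key = solution_index_key("igs18424.sp3")
```

The resulting list can be inserted into a nested dictionary exactly as for observation files, with the center name as the leaf level.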
Observation files and solution product files are thus given five-level and six-level indexes respectively.
The indexes follow the standard file name formats, and the last index level stores the path information of
the file;
Step 3: Split the indexes by the 64 MB data block size. For observation files, because GNSS data processing
software can merge the observation data of one day, the file sequence number is unified as 0, so the
corresponding first-level index (file sequence number) is also 0. When splitting, the second to fifth index
levels of observation files and the first to sixth index levels of solution products are traversed from the
bottom up and the size of the index block is computed. When the sizes (IBlock) of the first i-1 and the
first i indexes satisfy the following formula
Formula (4)
the first i-1 indexes are saved as one independent index block. All indexes built in step 2 are split in
this way;
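The splitting rule can be sketched as a greedy pass over the index entries: sizes are accumulated until adding the next entry would exceed the block size, at which point the entries collected so far are closed off as one index block. The function below is an illustration under that reading; `split_index` and its list-of-sizes input are hypothetical, and the bottom-up traversal order is assumed to have been applied already when producing the list.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the data block size stated in the text


def split_index(entry_sizes, block_size=BLOCK_SIZE):
    """Pack consecutive index entries into blocks of at most block_size bytes.

    entry_sizes: sizes in bytes of the index entries, in traversal order.
    Returns a list of blocks, each a list of entry positions.
    """
    blocks, current, used = [], [], 0
    for pos, size in enumerate(entry_sizes):
        if size > block_size:
            raise ValueError("a single index entry exceeds the block size")
        if used + size > block_size:
            # Entry `pos` would overflow the block: close it at entry pos - 1.
            blocks.append(current)
            current, used = [], 0
        current.append(pos)
        used += size
    if current:
        blocks.append(current)
    return blocks


# Small numbers stand in for byte counts so the result is easy to follow.
blocks = split_index([30, 30, 30, 10, 60], block_size=64)
```

With a block size of 64, the third entry would overflow the first block, so the first block closes after two entries, and the same rule then closes the second block before the last entry.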
Step 4: Place each index block on the data node (DataNode) that stores the data block or on the data node
(DataNode) nearest to the data block, which increases read speed and further reduces the memory consumption
of the name node (NameNode). The content of each split index block is matched against the names of the data
blocks of the merged GNSS big files, level by level from top to bottom. Where the index branches, the share
of each index character at the branch is counted, the character with the largest share in the index block
is matched against the data blocks on the data nodes (DataNode), and the node with the highest matching
rate becomes the storage node of the index block. Placing the index block on the node that stores the data
block, or on the nearest node, has two effects. First, it reduces communication overhead during reads: once
an index entry is found, the corresponding file content can be found on the local or an adjacent node,
which increases read speed. Second, because the indexes are stored on the data nodes (DataNode) rather than
on the name node (NameNode), the memory consumption of the name node (NameNode) is further reduced;
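The placement decision can be approximated with a simpler rule than the level-by-level character matching described above: count how many of the data blocks referenced by an index block are held by each data node, and choose the node with the highest share. The sketch below implements only this simplified overlap rule, not the matching procedure of the method itself; the function name, node names and block names are all hypothetical.

```python
def choose_node(index_block_targets, node_blocks):
    """Pick the data node holding the largest share of the referenced data blocks.

    index_block_targets: data block names referenced by one index block.
    node_blocks: mapping of data node name -> set of data block names it stores.
    """
    if not index_block_targets:
        raise ValueError("index block references no data blocks")

    def match_rate(node):
        hits = sum(1 for name in index_block_targets if name in node_blocks[node])
        return hits / len(index_block_targets)

    # Iterate over sorted node names so that ties are broken deterministically.
    return max(sorted(node_blocks), key=match_rate)


node_blocks = {
    "dn1": {"obs_w1842_blk0", "obs_w1842_blk1"},
    "dn2": {"obs_w1842_blk1", "sol_w1842_blk0"},
}
best = choose_node(
    ["obs_w1842_blk0", "obs_w1842_blk1", "obs_w1842_blk1"], node_blocks)
```

Here `dn1` holds every referenced block while `dn2` holds only some of them, so the index block would be placed on `dn1`.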
Step 5: Store the file-type index of the merged GNSS big files on the name node (NameNode). For GNSS
observation files, the index stored on the name node (NameNode) contains the one-letter file type and the
last two digits of the observation year. For solution product files, it contains only the three-letter
file type. Thus, in addition to the number of data block replicas and the big file name/path mappings, the
name node (NameNode) also stores the file type/index block path mappings keyed by three digits or letters,
thereby achieving cloud storage of massive GNSS small files.
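The table kept on the name node can be pictured as a small mapping from a three-character type key to index block paths. In the sketch below the key is taken to be the file extension, which for observation files is the two-digit year plus the type letter and for solution products is the three-letter type. The paths and the `type_key` helper are hypothetical.

```python
def type_key(filename):
    """Return the three-character type key, taken from the file extension."""
    _, dot, ext = filename.lower().rpartition(".")
    if not dot or len(ext) != 3:
        raise ValueError(f"expected a three-character extension: {filename!r}")
    return ext


# Hypothetical name-node-side table: type key -> paths of the index blocks.
type_index = {}
type_index.setdefault(type_key("wuhn1200.15o"), []).append("/index/obs/15o/blk_0")
type_index.setdefault(type_key("igs18424.sp3"), []).append("/index/sol/sp3/blk_0")
```

Because the table has one entry per type key rather than one per file, its size is independent of the number of small files stored.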
In a specific implementation, the present invention can also be realized as follows:
As shown in Fig. 1, the invention mainly comprises one name node (NameNode) acting as the master node and
several data nodes (DataNode) acting as storage nodes for file blocks and index blocks. Each data node
(DataNode) is responsible for merging small files and building indexes, and one designated data node
(DataNode) is responsible for merging the indexes and splitting them into index blocks. The specific steps
are:
1) Merge the massive GNSS small files. Massive GNSS small files fall into two classes: GNSS observation
files and solution product files. Observation files are received by receivers of various kinds and
converted by data format conversion software into files in the standard RINEX format, mainly RINEX 2.0 and
3.0. Their file types comprise four classes: multi-system, multi-frequency observation data; navigation
ephemerides of each system; satellite clock corrections; and observation summaries (summary files).
Solution product files include precise ephemerides, precise clock corrections, earth rotation parameters
(ERP), satellite yaw rates and coordinate files. They are computed by the analysis centers of the
International GNSS Service (IGS) using high-precision GNSS data processing software, and their formats
follow the SP3, SINEX and IONEX standards;
An observation file corresponds to an observation period and contains information such as the start time,
end time and sampling interval. Observation files of the same period can therefore first be merged by
station name, and files of different observation periods then merged in order of consecutive observation
time. A solution product corresponds to the period of the data it was computed from and contains the start
and end times of that data. Solution products computed from observation data of the same period can
therefore be merged first, and products of different periods then merged by consecutive solution cycle. The
name of a big file identifies the file type, the start and end observation or solution times, and the first
and last station names or analysis center names;
Each data node (DataNode) is responsible for merging the small files on that node;
2) Build indexes for the merged observation files and solution product files separately. Observation data
generally use the RINEX format, which uses the 8.3 naming convention, where 8 denotes an 8-character root
name identifying the ownership of the file and 3 denotes a 3-character extension identifying the file type.
The concrete form is ssssdddf.yyt. Using the file sequence number within the day (character f), the day of
year (string ddd) and the station name (string ssss), a five-level index is built from the top down with
one character per index level, and the leaf nodes of the last level store the path information of the
observation file. As shown in the observation file index of Fig. 2, the first-level index is the file
sequence number; its range consists of the two intervals [0,9] and [a,z], where [0,9] denotes the 10 Arabic
digits and [a,z] the 26 lowercase English letters. The second-level index is the hundreds digit of the day
of year, with range [0,3], that is, 4 Arabic digits. The third-level index is the tens digit of the day of
year, with range [0,9]. The fourth-level index is the units digit of the day of year, with range [0,9].
The fifth-level index is the four-character station name, each character of which falls in the intervals
[0,9] and [a,z];
Solution product files are saved with names of the form sssddddd.ttt, where sss is the three-character
abbreviation of the analysis center, the first four d in ddddd are the GPS week counted from 0 h on
6 January 1980, the last d is the day of week, and ttt is the type of solution product. A six-level index
is built from the GPS week, day of week and analysis center name, and the leaf nodes of the index store the
position information of the solution product file. As shown in the solution product index of Fig. 3, the
first- to fourth-level indexes are the thousands, hundreds, tens and units digits of the GPS week, each
with range [0,9]. The fifth-level index is the day of the GPS week, with range [0,7], where the 7 integers
in [0,6] denote the daily solution files of a week and 7 denotes the weekly solution file. The sixth-level
index is the three-character analysis institution name, each character of which falls in the interval
[a,z];
Each data node (DataNode) is responsible for building the index of the small files on that node. Once index
construction is complete, the indexes are merged on another designated data node (DataNode);
3) Split the index into index blocks. The indexes built in step 2 are split by the data block size (64 MB).
For observation files, because GNSS data processing software can merge the observation data of one day,
the file sequence number can be unified as 0, so the corresponding first-level index (file sequence number)
is also 0. When splitting, the second to fifth index levels of observation files and the first to sixth
index levels of solution products are traversed from the bottom up and the size of the index block is
computed. As soon as the index size first exceeds the data block size, the procedure steps back one index
and saves the indexes up to that point as one independent index block. All indexes built in step 2 are
split in this way;
The index blocks are split on the same data node (DataNode) that merged the indexes in step 2;
4) Store the index blocks. Each index block produced in step 3 is stored on the data node (DataNode)
holding the corresponding data block or on the data node (DataNode) nearest to the data block. The content
of the index block is matched level by level against the names of the data blocks of the merged GNSS big
files. Where the index block branches, the share of each index character at the branch is counted, the
character with the largest share at that index level is matched against the names of the data blocks on
the data nodes (DataNode), and the node with the highest matching rate becomes the storage node of the
index block;
5) Store the file type index/index block paths on the name node (NameNode), as shown in the schematic of
observation file and solution product file storage locations in Fig. 4. For GNSS observation files, the
index stored on the name node (NameNode) contains the one-letter file type and the last two digits of the
observation year. For solution product files, it contains only the three-letter file type. The one-to-one
address mapping between file type indexes and index blocks is stored on the name node (NameNode), which
completes the mapping of the indexes built above. Thus, in addition to the number of data block replicas
and the big file name/path mappings, the name node (NameNode) also stores the file type/index block path
mappings keyed by three digits or letters, thereby achieving cloud storage of massive GNSS small files.
The foregoing is only a preferred embodiment of the present invention, and the scope of protection is not
limited to it. Any simple change or equivalent replacement of the technical solution that would be obvious
to a person skilled in the art within the technical scope disclosed herein falls within the scope of
protection of the present invention.
As the foregoing shows, the present invention is a new cloud storage method for massive GNSS small files
that supports their efficient storage, management, querying and sharing. In the experiment, a cluster of 9
nodes was built, with 1 node as the name node (NameNode) and the remaining 8 as data nodes (DataNode). The
replication factor was set to 3, and the write, read and delete speeds for massive GNSS small files were
tested. In these tests, compared with the traditional HDFS method, the proposed small-file storage method
greatly saved storage space, reduced memory consumption by 1/2, and improved write speed by about 4 times,
read speed by about 3 times and delete speed by about 2.5 times. Results in practice depend closely on the
scale of the storage system, the performance of each node, the network environment, and the size and type
of the data. Compared with the prior art, the present invention therefore has the following notable
beneficial effects:
(1) Saves storage space
Based on GNSS data types and characteristics, the invention merges the observation data and solution
products of consecutive observation periods into big files. This remedies the situation in the Hadoop
Distributed File System (HDFS) in which each small file occupies a whole data block: after merging, each
block split from a big file occupies the size of one data block, which effectively saves data node
(DataNode) storage space and improves storage utilization.
(2) Reduces memory consumption
The invention builds indexes for the merged big files according to the naming rules of GNSS observation
files and solution product files, and stores the file path in the leaf nodes of the index. On the one hand,
merging small files greatly reduces the number of data blocks in the storage system and so reduces the
memory overhead of the name node (NameNode). On the other hand, once the merged big files have been indexed
and the index split, the index blocks are stored on the data nodes (DataNode). The name node (NameNode)
keeps only the mapping from file type (file extension) to index path and the mapping from big file name to
file path, which further reduces the memory consumption of the name node (NameNode).
(3) Improves write, read and delete efficiency
By merging GNSS small files and indexing the merged files, the proposed method establishes an efficient
storage mechanism. It reduces communication between the client and the name node (NameNode), between the
name node (NameNode) and the data nodes (DataNode), and between the client and the data nodes (DataNode),
and shortens the response time of queries and retrieval, thereby improving write, read and delete
efficiency.
(4) Easy to extend
The proposed method is widely applicable: all kinds of GNSS observation files and solution product files
can be stored efficiently once they have been merged, indexed and split into blocks. Newly added GNSS data
and product formats can be merged according to their data type and characteristics and, after indexing and
block storage, brought into the small-file storage system of the present invention. The method therefore
has broad applicability and strong extensibility. It resolves the bottlenecks and challenges facing
existing GNSS small-file storage, delivers high storage efficiency, and applies effectively to the
technical field of "Geodesy and Survey Engineering" within the discipline of "Surveying Science and
Technology". It achieves efficient storage, management, publication and sharing of massive GNSS small
files, with substantial economic and social benefits.
Claims (5)
- 1. A cloud storage method for massive GNSS small files, characterized in that the massive GNSS small files are first merged into large files and an index is built over the merged large files; the index block storage strategy is optimized so that the split file blocks and index blocks are stored on the node holding the data block or on the DataNode nearest to the data block, while the index of GNSS data types is stored on the NameNode, which reduces storage capacity consumption and NameNode memory consumption and improves the performance of writing, accessing and deleting large numbers of small files; the method specifically comprises the following steps:
  (1) Merging the massive GNSS small files into large files to reduce the NameNode memory occupied by large numbers of small files. Files of the same observation period or solution time and of the same type are merged first; observation files are merged in the alphabetical order of the four-letter station name, and solution result files are merged in the alphabetical order of the three-letter GNSS analysis center name. A large number of GNSS observation files are thereby merged into one large observation file that is continuous over an observation period, and solution result files are merged into one large solution result file that is continuous in solution-time order.
  (2) Building an index over the merged GNSS large files, that is, building separate indexes for observation files and solution results, with a one-to-one correspondence between characters and index levels. For observation files, a five-level index is built from the file sequence number, the day of year and the station name, and the location of the observation file is stored in the last index level. The first-level index is the file sequence number, whose range consists of the two intervals [0,9] and [a,z], where [0,9] denotes the 10 Arabic digits and [a,z] the 26 lowercase English letters. The second-level index is the hundreds digit of the day of year, with range [0,3], i.e. 4 Arabic digits. The third-level index is the tens digit of the day of year, with range [0,9]. The fourth-level index is the units digit of the day of year, with range [0,9]. The fifth-level index is the four-character station name, each character of which falls in [0,9] or [a,z]. For solution result files, a six-level index is built from the GPS week, the day of week and the analysis center name, and the location of the solution result file is stored in the last index level. The first- to fourth-level indexes are the thousands, hundreds, tens and units digits of the GPS week, each with range [0,9]. The fifth-level index is the day within the GPS week, with range [0,7], where the 7 integers in [0,6] denote the daily solution files of one week and the digit 7 denotes the weekly solution file. The sixth-level index is the three-character analysis center name, each character of which falls in [a,z].
  (3) Splitting the index by data block size. Because GNSS data processing software can merge the observation data of one day, the file sequence number can be unified as 0, so the first-level index (file sequence number) of observation files is also 0. When splitting, the size of the index is computed bottom-up over the second- to fifth-level indexes of observation files and the first- to sixth-level indexes of solution results, and the index is cut into index blocks of 64 MB.
  (4) Placing each index block on the node that stores the data block or on the node nearest to the data block, which improves file read speed and further reduces NameNode memory consumption.
  (5) Storing the file-type index of the merged GNSS large files on the NameNode: the file block path mappings, and the mappings from the three characters that identify the observation file or solution result type to index block paths, are stored on the NameNode, while file blocks and index blocks are stored on DataNodes, thereby realizing cloud storage of massive GNSS small files.
- 2. The cloud storage method for massive GNSS small files according to claim 1, characterized by comprising the following steps:
  Step 1: Merge the massive GNSS small files into large files to reduce the NameNode memory occupied by large numbers of small files. The massive GNSS small files comprise two classes: observation files, typified by observation data, navigation ephemeris and meteorological files; and solution result files, typified by coordinate files, precise ephemeris and precise clock offsets. Both classes use internationally unified standard formats: observation files use the receiver-independent exchange format, and solution results use the solution-independent result exchange format, the ionosphere exchange format and the precise ephemeris data format. n GNSS small files are stored in the system; each carries three parameters, namely location, time and file type, by which the data are distinguished. The GNSS small-file data set D is expressed as:
  $D = \{\, d(L_i, T_j, I_k) \mid L_i \in L,\ T_j \in T,\ I_k \in I \,\},\quad i, j, k \in Z$   (1)
  where L denotes the location at which a file is produced, mainly the station that collects an observation file or the institution that produces a solution result file; T denotes the time tag at which a file is produced and, because stations observe continuously for 24 h and data centers solve and publish continuously on schedule, T is a continuous time series; I denotes the file type, defined by the standard formats above. L and T are obtained from the file name and the header section recorded in the file, and I from the file extension and the header section. D denotes the set; i, j and k are the indices of the location, time and type parameters respectively; Z is the set of integers.
  When merging, files of the same observation period or solution time and the same type are first merged in the order of the four characters of the station name or the three characters of the analysis center name. The merged GNSS small-file set is expressed by a formula (not reproduced in this text) in which $T_j$ denotes the j-th observation period or solution epoch, $I_k$ the k-th file type, and Z the set of integers.
  Then the small files of each type are merged in the order of continuous observation periods or solution times. Because weekly solutions are of general significance in GNSS measurement, the observation files of 7 consecutive days and the daily solution files of 7 days are each merged into one large file (the corresponding formula is likewise not reproduced in this text).
  Through these two merging steps, the GNSS observation files of 7 consecutive days are merged into one large observation file that is continuous over the observation period, and the solution result files of 7 days are merged into one large solution result file that is continuous in solution-time order. The large file is named with the file type, the start and end observation or solution time, and the first and last station names or analysis center identifiers. The merged file is stored in the cloud storage system in blocks, with the data block size set to 64 MB. Each data block is a set of multiple small files and occupies 150 B of NameNode memory; compared with each small file occupying 150 B before merging, this substantially reduces NameNode memory consumption.
  Step 2: Build an index over the merged GNSS large files, that is, index observation files and solution results separately by L and T, as follows.
  Observation files are stored in RINEX format, which uses 8.3 naming: 8 denotes the 8-character root name that identifies the ownership of the file, and 3 denotes the 3-character extension that identifies the file type. The specific form is ssssdddf.yyt, where ssss is the 4-character station name, ddd is the day of year and f is the file sequence number within the day. Taking the character f as the file sequence number within a day, the string ddd as the day of year and the string ssss as the station name, a five-level index is built top-down with a one-to-one correspondence between characters and index levels, and the end node of the last index level stores the location of the observation file. The first-level index is the file sequence number, whose range consists of the two intervals [0,9] and [a,z], where [0,9] denotes the 10 Arabic digits and [a,z] the 26 lowercase English letters. The second-level index is the hundreds digit of the day of year, with range [0,3], i.e. 4 Arabic digits. The third-level index is the tens digit of the day of year, with range [0,9]. The fourth-level index is the units digit of the day of year, with range [0,9]. The fifth-level index is the four-character station name, each character of which falls in [0,9] or [a,z].
  Solution result files are stored in the form sssddddd.ttt, where sss is the three-character abbreviation of the analysis center, the first four d of ddddd are the GPS week counted from 0 h on January 6, 1980, the last d is the day of week, and ttt is the type of solution result. A six-level index is built from the GPS week, the day of week and the analysis center name, and the end node of the index stores the location of the solution result file. The first- to fourth-level indexes are the thousands, hundreds, tens and units digits of the GPS week, each with range [0,9]. The fifth-level index is the day within the GPS week, with range [0,7], where the 7 integers in [0,6] denote the daily solution files of one week and the digit 7 denotes the weekly solution file. The sixth-level index is the three-character analysis center name, each character of which falls in [a,z].
  Step 3: Split the index by the 64 MB data block size. For observation files, GNSS data processing software can merge the observation data within a day, so the file sequence number is unified as 0 and the corresponding first-level index is 0. When splitting, the size of the index blocks is computed bottom-up over the second- to fifth-level indexes of observation files and the first- to sixth-level indexes of solution results. When the sizes of the first i-1 and the first i indexes satisfy the splitting condition (the formula is not reproduced in this text), the first i-1 indexes are saved as an independent index block; all indexes built in Step 2 are split in this way.
  Step 4: Place each index block on the DataNode that stores the data block or on the DataNode nearest to the data block, to improve read speed and further reduce NameNode memory consumption. The content of each split index block is matched against the names of the data blocks of the merged GNSS large files, level by level from top to bottom. When the index branches, the proportion taken by each index character at the branch is counted, the character with the largest proportion in the index block is matched against the data blocks on the DataNodes, and the node with the highest matching rate is used as the storage node of the index block. Placing the index block on, or nearest to, the node that stores the data block has two effects. On the one hand, it reduces communication overhead during reads: once an index entry is found, the corresponding file content can be found locally or on an adjacent node, which improves read speed. On the other hand, because the index is stored on DataNodes rather than on the NameNode, NameNode memory consumption is further reduced.
  Step 5: Store the file-type index of the merged GNSS large files on the NameNode. For GNSS observation files, the index stored on the NameNode contains, besides the file type represented by one letter, the last two digits of the observation year. For solution result files, the index stored on the NameNode contains only the file type represented by three letters. Thus, in addition to the number of data block replicas and the large-file name/path mappings, the file type/index block path mappings, in which each file type consists of three digits or letters, are also stored on the NameNode, thereby realizing cloud storage of massive GNSS small files.
- 3. The cloud storage method for massive GNSS small files according to claim 2, characterized in that the massive GNSS small files of Step 1 comprise GNSS observation files and solution result files, all of which follow internationally unified standard formats; because GNSS data and result formats are continually upgraded, upgraded file formats and newly proposed file types can also be brought into the category of GNSS small files.
- 4. The cloud storage method for massive GNSS small files according to claim 2, characterized in that the merging in Step 1 of files of the same observation period or solution time and of the same type may also be performed by first merging by identical observation period or solution date respectively and then merging by continuous observation period or solution cycle; the large file is named with the file type, the start and end observation or solution time, and the first and last station names or analysis center names; after merging, the large file is stored in the cloud storage system in blocks, the data block size is set to 64 MB, and each data block is a set of multiple GNSS small files.
- 5. The cloud storage method for massive GNSS small files according to claim 2, characterized in that in Step 2 the five-level and six-level indexes are built for observation files and solution result files respectively, the index construction follows the standard file formats, and the location of each file is stored in the last index level.
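The index construction and index splitting described in the claims can be illustrated with a short sketch. It is written under stated assumptions rather than taken from the patent: the function names are hypothetical, the file names in the usage note are examples only, and because the splitting formula is not reproduced in the text, `split_index` assumes a greedy rule that closes a block just before the entry that would push it past the block size.

```python
import string

DIGITS = set(string.digits)
LOWER = set(string.ascii_lowercase)

def observation_index_path(name):
    """Five-level key for 'ssssdddf.yyt': sequence, day-of-year digits, station."""
    root, _ext = name.lower().split(".")
    if len(root) != 8:
        raise ValueError(f"expected an 8-character root name: {name}")
    station, doy, seq = root[0:4], root[4:7], root[7]
    if not set(doy) <= DIGITS or not 1 <= int(doy) <= 366:
        raise ValueError(f"bad day of year in {name}")
    if not set(station + seq) <= DIGITS | LOWER:
        raise ValueError(f"bad station name or sequence number in {name}")
    return [seq, doy[0], doy[1], doy[2], station]

def result_index_path(name):
    """Six-level key for 'sssddddd.ttt': GPS-week digits, day of week, center."""
    root, _ext = name.lower().split(".")
    if len(root) != 8:
        raise ValueError(f"expected an 8-character root name: {name}")
    center, week, day = root[0:3], root[3:7], root[7]
    if not set(week) <= DIGITS or day not in "01234567":
        raise ValueError(f"bad GPS week or day in {name}")
    if not set(center) <= LOWER:
        raise ValueError(f"bad analysis center name in {name}")
    return [week[0], week[1], week[2], week[3], day, center]

def split_index(entry_sizes, block_size=64 * 1024 * 1024):
    """Greedy cut (assumed rule): group entry positions into index blocks."""
    blocks, current, used = [], [], 0
    for position, size in enumerate(entry_sizes):
        # Close the current block if adding this entry would exceed the limit.
        if current and used + size > block_size:
            blocks.append(current)
            current, used = [], 0
        current.append(position)
        used += size
    if current:
        blocks.append(current)
    return blocks
```

With these definitions, `observation_index_path("bjfs1140.15o")` gives `["0", "1", "1", "4", "bjfs"]` and `result_index_path("igs18413.sp3")` gives `["1", "8", "4", "1", "3", "igs"]`. Each list is the sequence of keys followed from the top index level down to the end node that holds the file location.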
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510204235.7A CN104765876B (en) | 2015-04-24 | 2015-04-24 | Magnanimity GNSS small documents cloud storage methods |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510204235.7A CN104765876B (en) | 2015-04-24 | 2015-04-24 | Magnanimity GNSS small documents cloud storage methods |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104765876A CN104765876A (en) | 2015-07-08 |
CN104765876B true CN104765876B (en) | 2017-11-10 |
Family
ID=53647703
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510204235.7A Expired - Fee Related CN104765876B (en) | 2015-04-24 | 2015-04-24 | Magnanimity GNSS small documents cloud storage methods |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104765876B (en) |
Families Citing this family (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105608212B (en) * | 2015-12-30 | 2020-02-07 | 成都国腾实业集团有限公司 | Method and system for ensuring that MapReduce data input fragment contains complete record |
CN106970928B (en) * | 2016-01-14 | 2020-12-29 | 平安科技(深圳)有限公司 | File management method and system |
CN105843841A (en) * | 2016-03-07 | 2016-08-10 | 青岛理工大学 | Small file storage method and system |
CN107402924A (en) * | 2016-05-19 | 2017-11-28 | 普天信息技术有限公司 | MR files apply the implementation method and device in HDFS |
CN106528451B (en) * | 2016-11-14 | 2019-09-03 | 哈尔滨工业大学(威海) | The cloud storage frame and construction method prefetched for the L2 cache of small documents |
CN107391423A (en) * | 2017-07-26 | 2017-11-24 | Tcl移动通信科技(宁波)有限公司 | Method, storage medium and the mobile terminal of file are transmitted by OTG functions |
CN109947703A (en) * | 2017-11-09 | 2019-06-28 | 北京京东尚科信息技术有限公司 | File system, file memory method, storage device and computer-readable medium |
CN109947721B (en) * | 2017-12-01 | 2021-08-17 | 北京安天网络安全技术有限公司 | Small file processing method and device |
CN108460121B (en) * | 2018-01-22 | 2022-02-08 | 重庆邮电大学 | Little file merging method for space-time data in smart city |
CN108470577B (en) * | 2018-02-02 | 2021-07-27 | 重庆金山医疗器械有限公司 | Capsule endoscopy system data storage method |
CN109033137B (en) * | 2018-06-06 | 2021-11-05 | 千寻位置网络有限公司 | Dynamic RINEX data storage method and device |
CN109800184B (en) * | 2018-12-12 | 2024-06-25 | 平安科技(深圳)有限公司 | Caching method, system, device and storable medium for small block input |
CN110795391A (en) * | 2019-10-28 | 2020-02-14 | 深圳市元征科技股份有限公司 | Automobile repair data processing method and device, electronic equipment and storage medium |
CN111159120A (en) * | 2019-12-16 | 2020-05-15 | 西门子电力自动化有限公司 | Method, device and system for processing files in power system |
CN111461537A (en) * | 2020-03-31 | 2020-07-28 | 山东胜软科技股份有限公司 | Oil gas production data based classified quantity counting method and control system |
CN111475463B (en) * | 2020-04-01 | 2023-02-24 | 中国人民解放军火箭军工程大学 | GNSS observation data digital relation storage method |
CN111400247B (en) * | 2020-04-13 | 2023-08-01 | 杭州九州方园科技有限公司 | User behavior auditing method and file storage method |
CN112347045B (en) * | 2020-11-30 | 2022-07-26 | 长春工程学院 | Storage method of mass cable tunnel state signal data |
CN113032348A (en) * | 2021-05-25 | 2021-06-25 | 湖南省第二测绘院 | Spatial data management method, system and computer readable storage medium |
CN113420186B (en) * | 2021-06-18 | 2022-10-04 | 自然资源部第三地形测量队 | Data storage method, data storage device, computer readable storage medium and data reading method |
CN114416811A (en) * | 2021-12-07 | 2022-04-29 | 中国科学院国家授时中心 | Distributed storage system for GNSS data |
CN116150113A (en) * | 2023-04-17 | 2023-05-23 | 江苏北斗信创科技发展有限公司 | Data storage method for GNSS |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102332027A (en) * | 2011-10-15 | 2012-01-25 | 西安交通大学 | Mass non-independent small file associated storage method based on Hadoop |
CN102662992A (en) * | 2012-03-14 | 2012-09-12 | 北京搜狐新媒体信息技术有限公司 | Method and device for storing and accessing massive small files |
WO2014000458A1 (en) * | 2012-06-28 | 2014-01-03 | 华为技术有限公司 | Small file processing method and device |
CN103577123A (en) * | 2013-11-12 | 2014-02-12 | 河海大学 | Small file optimization storage method based on HDFS |
CN103856567A (en) * | 2014-03-26 | 2014-06-11 | 西安电子科技大学 | Small file storage method based on Hadoop distributed file system |
CN104346384A (en) * | 2013-07-31 | 2015-02-11 | 上海云端广告有限公司 | Method and device for processing small files |
Non-Patent Citations (2)
Title |
---|
An optimized approach for storing and accessing small files on cloud storage;Bo Dong等;《Journal of Network and Computer Applications》;20120724;第35卷(第6期);第1847-1862页 * |
基于Hadoop的海量小文件存储方法的研究;时倩等;《数字技术与应用》;20140115(第01期);第50,52页 * |
Also Published As
Publication number | Publication date |
---|---|
CN104765876A (en) | 2015-07-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104765876B (en) | Magnanimity GNSS small documents cloud storage methods | |
CN109635068A (en) | Mass remote sensing data high-efficiency tissue and method for quickly retrieving under cloud computing environment | |
CN104820714B (en) | Magnanimity tile small documents memory management method based on hadoop | |
US7533112B2 (en) | Context hierarchies for address searching | |
US20120197900A1 (en) | Systems and methods for search time tree indexes | |
CN110291518A (en) | Merging tree garbage indicators | |
US9223801B2 (en) | Information management method and information management apparatus | |
CN110383261A (en) | Stream selection for multi-stream storage | |
CN106933833B (en) | Method for quickly querying position information based on spatial index technology | |
CN105160039A (en) | Query method based on big data | |
CN109684428A (en) | Spatial data building method, device, equipment and storage medium | |
CN102982103A (en) | On-line analytical processing (OLAP) massive multidimensional data dimension storage method | |
CN105117502A (en) | Search method based on big data | |
CN103678491A (en) | Method based on Hadoop small file optimization and reverse index establishment | |
CN106599040A (en) | Layered indexing method and search method for cloud storage | |
CN108804602A (en) | A kind of distributed spatial data storage computational methods based on SPARK | |
CN103399945A (en) | Data structure based on cloud computing database system | |
CN103678657B (en) | Method for storing and reading altitude data of terrain | |
CN104199860A (en) | Dataset fragmentation method based on two-dimensional geographic position information | |
CN108009265B (en) | Spatial data indexing method in cloud computing environment | |
CN114328779A (en) | Geographic information cloud disk based on cloud computing efficient retrieval and browsing | |
CN103970842A (en) | Water conservancy big data access system and method for field of flood control and disaster reduction | |
CN104021210B (en) | Geographic data reading and writing method of MongoDB cluster of geographic data stored in GeoJSON-format semi-structured mode | |
CN103425653A (en) | Method and system for realizing DICOM (digital imaging and communication in medicine) image quadratic search | |
CN104008209B (en) | Reading-writing method for MongoDB cluster geographic data stored with GeoJSON format structuring method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20171110; Termination date: 20180424 |