CN106844435A - Method and device for incrementally updating geographic information data - Google Patents
Method and device for incrementally updating geographic information data
- Publication number
- CN106844435A, CN201611154814.6A
- Authority
- CN
- China
- Prior art keywords
- data set
- newly
- increased
- union
- original
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Remote Sensing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and device for incrementally updating geographic information data. The method includes: obtaining candidate itemsets of geographic information, the candidate itemsets including an original data set and a newly added data set; extracting the original data set and the newly added data set from the candidate itemsets; performing a calculation on the extracted original data set and newly added data set to obtain an incremental calculation result; and updating the data increment of the geographic information according to the incremental calculation result. The invention solves the technical problem in the related art that geographic information data is updated inefficiently.
Description
Technical field
The present invention relates to the field of data processing, and in particular to a method and device for incrementally updating geographic information data.
Background
Association rule mining is an important area of data mining. It is used to discover associations between itemsets in large volumes of data and plays an irreplaceable role in many fields. With the continued development of information technology, the volume of data accumulated in every sector of the national economy keeps growing, and the era of big data has arrived. In practical big data applications, the object of association rule mining is often a huge centralized or distributed data source. If association rules are mined on a single machine, storage capacity and mining efficiency inevitably become bottlenecks in the mining process, so the demands of big data mining cannot be met. In addition, many practical data mining applications also face the problem of incremental updates: the databases in many application fields are updated continuously, which can invalidate previously mined patterns or give rise to new ones.
No effective solution to the above problems has yet been proposed.
Summary of the invention
Embodiments of the present invention provide a method and device for incrementally updating geographic information data, so as to at least solve the technical problem in the related art that geographic information data is updated inefficiently.
According to one aspect of the embodiments of the present invention, a method for incrementally updating geographic information data is provided, including: obtaining candidate itemsets of geographic information, the candidate itemsets including an original data set and a newly added data set; extracting the original data set and the newly added data set from the candidate itemsets; performing a calculation on the extracted original data set and newly added data set to obtain an incremental calculation result; and updating the data increment of the geographic information according to the incremental calculation result.
Further, obtaining the candidate itemsets of geographic information includes: scanning a geographic information database; and generating the candidate itemsets of the geographic information according to the scan result.
Further, performing a calculation on the extracted original data set and newly added data set to obtain the incremental calculation result includes: dividing the original data set into original frequent itemsets and original infrequent itemsets, and dividing the newly added data set into newly added frequent itemsets and newly added infrequent itemsets, where, within a data set, an itemset whose support count is greater than or equal to the product of the number of transaction records in the data set and the minimum support threshold is a frequent itemset, and an itemset whose support count is less than that product is an infrequent itemset; computing the union of the original frequent itemsets and the newly added frequent itemsets to obtain a first union; computing the union of the original frequent itemsets and the newly added infrequent itemsets to obtain a second union; computing the union of the original infrequent itemsets and the newly added frequent itemsets to obtain a third union; computing the union of the original infrequent itemsets and the newly added infrequent itemsets to obtain a fourth union; and taking the first union, the second union, the third union and the fourth union as the incremental calculation result.
Further, updating the data increment of the geographic information according to the incremental calculation result includes: adding the itemsets in the first union to the newly added data set as the data increment; and deleting the itemsets in the fourth union from the original data set.
Further, updating the data increment of the geographic information according to the incremental calculation result includes: determining whether an itemset in the second union is an infrequent itemset and, if so, deleting the original frequent itemset corresponding to that infrequent itemset from the original data set; and/or determining whether an itemset in the third union is a frequent itemset and, if so, adding the original infrequent itemset corresponding to that frequent itemset to the newly added data set.
According to another aspect of the embodiments of the present invention, a device for incrementally updating geographic information data is also provided, including: an acquiring unit configured to obtain candidate itemsets of geographic information, the candidate itemsets including an original data set and a newly added data set; an extraction unit configured to extract the original data set and the newly added data set from the candidate itemsets; a computing unit configured to perform a calculation on the extracted original data set and newly added data set to obtain an incremental calculation result; and an updating unit configured to update the data increment of the geographic information according to the incremental calculation result.
Further, the acquiring unit includes: a scanning module configured to scan a geographic information database; and a generation module configured to generate the candidate itemsets of the geographic information according to the scan result.
Further, the computing unit includes: a division module configured to divide the original data set into original frequent itemsets and original infrequent itemsets, and to divide the newly added data set into newly added frequent itemsets and newly added infrequent itemsets, where, within a data set, an itemset whose support count is greater than or equal to the product of the number of transaction records in the data set and the minimum support threshold is a frequent itemset, and an itemset whose support count is less than that product is an infrequent itemset; a first computing module configured to compute the union of the original frequent itemsets and the newly added frequent itemsets to obtain a first union; a second computing module configured to compute the union of the original frequent itemsets and the newly added infrequent itemsets to obtain a second union; a third computing module configured to compute the union of the original infrequent itemsets and the newly added frequent itemsets to obtain a third union; a fourth computing module configured to compute the union of the original infrequent itemsets and the newly added infrequent itemsets to obtain a fourth union; and a determining module configured to take the first union, the second union, the third union and the fourth union as the incremental calculation result.
Further, the updating unit includes: a first adding module configured to add the itemsets in the first union to the newly added data set as the data increment; and a first deleting module configured to delete the itemsets in the fourth union from the original data set.
Further, the updating unit includes: a first judging module configured to determine whether an itemset in the second union is an infrequent itemset; a second deleting module configured to, when the itemset in the second union is an infrequent itemset, delete the original frequent itemset corresponding to that infrequent itemset from the original data set; and/or a second judging module configured to determine whether an itemset in the third union is a frequent itemset; and a second adding module configured to, when the itemset in the third union is a frequent itemset, add the original infrequent itemset corresponding to that frequent itemset to the newly added data set.
In the embodiments of the present invention, the data structure is updated dynamically according to whether each itemset is frequent before and after the database is incrementally updated. By obtaining candidate itemsets of geographic information, the candidate itemsets including an original data set and a newly added data set; extracting the original data set and the newly added data set from the candidate itemsets; performing a calculation on the extracted original data set and newly added data set to obtain an incremental calculation result; and updating the data increment of the geographic information according to the incremental calculation result, the technical effect of updating newly added geographic information data quickly and efficiently is achieved, which in turn solves the technical problem in the related art that geographic information data is updated inefficiently.
Brief description of the drawings
The accompanying drawings described here are provided for a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of an optional method for incrementally updating geographic information data according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an optional device for incrementally updating geographic information data according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings of the embodiments. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects and not to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present invention described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
Embodiment 1
According to an embodiment of the present invention, a method embodiment for incrementally updating geographic information data is provided. It should be noted that the steps shown in the flowchart of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order.
Fig. 1 is a flowchart of an optional method for incrementally updating geographic information data according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S102, obtain candidate itemsets of geographic information, the candidate itemsets including an original data set and a newly added data set;
Step S104, extract the original data set and the newly added data set from the candidate itemsets;
Step S106, perform a calculation on the extracted original data set and newly added data set to obtain an incremental calculation result;
Step S108, update the data increment of the geographic information according to the incremental calculation result.
The databases in many application fields are updated continuously. With the technical solution provided by the present invention, mining can be performed again on the basis of the original patterns in combination with the newly added data set, that is, incremental association mining is performed. In this way, even when incremental updates occur in a data mining application, previously mined patterns are kept valid and newly arising patterns are taken into account.
Through the above steps, mining of massive data not only meets the demands of massive data mining but also greatly improves mining efficiency.
Optionally, obtaining the original data set and the newly added data set of the geographic information includes:
S2, scan a geographic information database;
S4, generate the candidate itemsets of the geographic information according to the scan result.
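Steps S2 and S4 can be sketched roughly as follows, assuming the geographic information database can be read as a list of transaction records, each a list of feature tags. The function name `count_candidates`, the `max_size` parameter and the sample records are illustrative assumptions and do not come from the patent.

```python
from collections import Counter
from itertools import combinations


def count_candidates(transactions, max_size=2):
    """Scan the transactions once and count every itemset of up to max_size items."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicate tags within one record
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[frozenset(itemset)] += 1
    return counts


# Hypothetical records: each transaction lists the feature tags of one map tile.
original = [
    ["road", "river"],
    ["road", "bridge", "river"],
    ["road", "park"],
    ["river", "bridge"],
]
added = [
    ["road", "park"],
    ["park", "school"],
]

original_counts = count_candidates(original)
added_counts = count_candidates(added)
print(original_counts[frozenset({"road"})])         # support count of {road} in the original set
print(added_counts[frozenset({"park", "school"})])  # support count of {park, school} in the added set
```

Counting the two data sets separately keeps the support counts of the original data set reusable when further records arrive.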
The geographic information database may be a multi-system distributed database. The original data set includes original frequent itemsets and original infrequent itemsets: in the original data set, an itemset whose support count is greater than or equal to the product of the number of transaction records in the original data set and the minimum support threshold is an original frequent itemset, and an itemset whose support count is less than that product is an original infrequent itemset. Similarly, the newly added data set includes newly added frequent itemsets and newly added infrequent itemsets: in the newly added data set, an itemset whose support count is greater than or equal to the product of the number of transaction records in the newly added data set and the minimum support threshold is a newly added frequent itemset, and an itemset whose support count is less than that product is a newly added infrequent itemset.
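The threshold rule above can be written as a short Python sketch. The function name `split_by_support` and the sample counts are assumptions made for illustration; only the comparison against the number of transaction records times the minimum support threshold comes from the text.

```python
def split_by_support(counts, num_transactions, min_support):
    """Return (frequent, infrequent) dicts mapping itemset -> support count."""
    threshold = num_transactions * min_support
    frequent = {s: c for s, c in counts.items() if c >= threshold}
    infrequent = {s: c for s, c in counts.items() if c < threshold}
    return frequent, infrequent


# Hypothetical support counts taken from a data set of 4 transaction records.
counts = {
    frozenset({"road"}): 3,
    frozenset({"bridge"}): 2,
    frozenset({"park"}): 1,
}
frequent, infrequent = split_by_support(counts, num_transactions=4, min_support=0.5)
print(sorted(sorted(s) for s in frequent))    # [['bridge'], ['road']]
print(sorted(sorted(s) for s in infrequent))  # [['park']]
```

An itemset whose count equals the threshold exactly, such as `{bridge}` here, is treated as frequent, matching the "greater than or equal to" wording.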
Through the above steps, the original data set and the newly added data set can be obtained comprehensively and accurately.
Optionally, performing a calculation on the extracted original data set and newly added data set to obtain the incremental calculation result includes:
S6, divide the original data set into original frequent itemsets and original infrequent itemsets, and divide the newly added data set into newly added frequent itemsets and newly added infrequent itemsets, where, within a data set, an itemset whose support count is greater than or equal to the product of the number of transaction records in the data set and the minimum support threshold is a frequent itemset, and an itemset whose support count is less than that product is an infrequent itemset;
S8, compute the union of the original frequent itemsets and the newly added frequent itemsets to obtain a first union;
S10, compute the union of the original frequent itemsets and the newly added infrequent itemsets to obtain a second union;
S12, compute the union of the original infrequent itemsets and the newly added frequent itemsets to obtain a third union;
S14, compute the union of the original infrequent itemsets and the newly added infrequent itemsets to obtain a fourth union;
S16, take the first union, the second union, the third union and the fourth union as the incremental calculation result.
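Steps S6 to S16 can be sketched in Python as a four-way grouping. The text calls the four groups unions; the sketch follows one interpretation, consistent with the categories C1 to C4 described later, in which each group holds the itemsets with the stated status in the original and newly added data sets. The function name `classify_itemsets`, the group keys and the sample counts are assumptions.

```python
def classify_itemsets(original_counts, n_original, added_counts, n_added, min_support):
    """Group itemsets by whether they are frequent in each of the two data sets."""
    groups = {"first": set(), "second": set(), "third": set(), "fourth": set()}
    for itemset in set(original_counts) | set(added_counts):
        # An itemset that never occurs in a data set has support count 0 there.
        frequent_in_original = original_counts.get(itemset, 0) >= n_original * min_support
        frequent_in_added = added_counts.get(itemset, 0) >= n_added * min_support
        if frequent_in_original and frequent_in_added:
            groups["first"].add(itemset)    # frequent in both
        elif frequent_in_original:
            groups["second"].add(itemset)   # frequent originally, infrequent in the added data
        elif frequent_in_added:
            groups["third"].add(itemset)    # infrequent originally, frequent in the added data
        else:
            groups["fourth"].add(itemset)   # infrequent in both
    return groups


# Hypothetical support counts; each key stands for one single-item itemset.
original_counts = {"road": 3, "river": 3, "bridge": 2, "park": 1, "lake": 1}
added_counts = {"road": 1, "park": 2, "school": 1}
groups = classify_itemsets(original_counts, 4, added_counts, 2, 0.5)
for name in ("first", "second", "third", "fourth"):
    print(name, sorted(groups[name]))
```

With 4 original records, 2 added records and a minimum support of 0.5, the thresholds are 2 and 1 respectively, so `road` lands in the first group and `lake` in the fourth.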
Through this embodiment of the present invention, the data structure is updated dynamically according to whether each itemset is frequent before and after the incremental database update, which improves update efficiency.
Optionally, updating the data increment of the geographic information according to the incremental calculation result includes:
S18, add the itemsets in the first union to the newly added data set as the data increment;
S20, delete the itemsets in the fourth union from the original data set.
Because an itemset in the union of the original frequent itemsets and the newly added frequent itemsets is necessarily frequent, it can be added directly to the newly added data set; and because an itemset in the union of the original infrequent itemsets and the newly added infrequent itemsets is necessarily infrequent, it can be deleted directly from the original data set.
Through this embodiment of the present invention, calculating the different types of itemsets by category allows the calculation result to be updated quickly, which improves update efficiency.
Optionally, updating the data increment of the geographic information according to the incremental calculation result includes:
S22, determine whether an itemset in the second union is an infrequent itemset;
S24, if so, delete the original frequent itemset corresponding to that infrequent itemset from the original data set;
and/or
S26, determine whether an itemset in the third union is a frequent itemset;
S28, if so, add the original infrequent itemset corresponding to that frequent itemset to the newly added data set.
That is, when the union of the original frequent itemsets and the newly added infrequent itemsets is computed, if an original frequent itemset has become infrequent, the original frequent itemset corresponding to that infrequent itemset must be deleted from the original data set; and/or, when the union of the original infrequent itemsets and the newly added frequent itemsets is computed, if an original infrequent itemset has become frequent, the original infrequent itemset corresponding to that frequent itemset must be added to the newly added data set.
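The checks in S22 to S28 can be sketched in Python by re-testing each itemset against the combined data. As a simplifying assumption, the sketch maintains a single set of currently frequent itemsets instead of editing the original and newly added data sets, and it assumes the support counts of both data sets are available. The name `update_frequent` and the sample values are hypothetical.

```python
def update_frequent(frequent, groups, original_counts, n_original,
                    added_counts, n_added, min_support):
    """Re-test the second and third groups against the combined data set."""
    threshold = (n_original + n_added) * min_support

    def combined(itemset):
        return original_counts.get(itemset, 0) + added_counts.get(itemset, 0)

    updated = set(frequent)
    for itemset in groups["second"]:
        if combined(itemset) < threshold:   # S22/S24: has become infrequent
            updated.discard(itemset)
    for itemset in groups["third"]:
        if combined(itemset) >= threshold:  # S26/S28: has become frequent
            updated.add(itemset)
    return updated


# Hypothetical support counts; each key stands for one single-item itemset.
original_counts = {"road": 3, "river": 3, "bridge": 2, "park": 1, "lake": 1}
added_counts = {"road": 1, "park": 2, "school": 1}
groups = {"second": {"river", "bridge"}, "third": {"park", "school"}}
frequent_before = {"road", "river", "bridge"}

frequent_after = update_frequent(frequent_before, groups, original_counts, 4,
                                 added_counts, 2, 0.5)
print(sorted(frequent_after))  # ['park', 'river', 'road']
```

Only the second and third groups are re-tested; itemsets in the first and fourth groups keep their status without any further counting.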
Through this embodiment of the present invention, the data structure is updated dynamically according to whether each itemset is frequent before and after the incremental database update, which improves update efficiency.
The present invention is described in detail below with a specific embodiment:
In the embodiments of the present invention, the FUFP-tree algorithm may be used for incremental mining based on association rules. Specifically, to handle the change in frequent itemsets after the original transaction database is incrementally updated, all itemsets after the update can be divided into four categories, C1, C2, C3 and C4, where D denotes the original data set and d denotes the newly added data set. An itemset in category C1 is frequent in both D and d, so it is certainly frequent in the updated transaction database D ∪ d. An itemset in category C4 is infrequent in both D and d, so it is certainly infrequent in D ∪ d. An itemset in category C2 is frequent in D but infrequent in d, so whether it is frequent in D ∪ d is uncertain; if it becomes infrequent, it must be deleted from the original frequent itemsets. An itemset in category C3 is infrequent in D but frequent in d, so whether it is frequent in D ∪ d is likewise uncertain; if it becomes frequent, it must be added to the frequent itemsets.
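The certainty for C1 and C4 follows from adding the two threshold inequalities. The notation here is assumed for illustration: c_D(X) and c_d(X) are the support counts of an itemset X in D and d, |D| and |d| are the numbers of transaction records, and s is the minimum support threshold. D and d are taken to hold disjoint records, so the support count in D ∪ d is the sum of the two counts.

```latex
\begin{aligned}
c_D(X) \ge s\,|D| \ \text{and}\ c_d(X) \ge s\,|d|
  &\;\Longrightarrow\; c_D(X) + c_d(X) \ge s\,(|D| + |d|)
  && \text{(C1: frequent in } D \cup d\text{)} \\
c_D(X) < s\,|D| \ \text{and}\ c_d(X) < s\,|d|
  &\;\Longrightarrow\; c_D(X) + c_d(X) < s\,(|D| + |d|)
  && \text{(C4: infrequent in } D \cup d\text{)}
\end{aligned}
```

For C2 and C3 one inequality points each way, so the sum can fall on either side of s(|D| + |d|) and the combined count has to be checked explicitly.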
With the minimum support unchanged, the FUFP-tree algorithm uses the frequent itemsets already obtained from the original database together with the updated database. Following the idea of the FUP algorithm, it dynamically updates the FUFP-tree data structure according to whether each itemset is frequent before and after the incremental database update, so that the original transaction database is scanned as little as possible. During construction of the FUFP-tree, the frequent 1-itemsets found by the initial scan of the original database are usually stored in a header table named Header-table, whose entries correspond to nodes in the FUFP-tree. The difference from the FP-tree structure is that the one-way links between parent and child nodes in the FP-tree are replaced with two-way links, so that when the database is incrementally updated, the Header-table can be updated and tree nodes can be added or deleted according to the four categories above, and the FUFP-tree can be updated correctly and quickly. When the FUFP-tree is updated, category C4 obviously need not be considered. Category C2 is considered first: the newly added transaction database is scanned, and the itemsets that change from frequent to infrequent are deleted from the original Header-table and from the FUFP-tree. Categories C1 and C3 are considered next. Both only involve adding itemsets to the Header-table and the FUFP-tree, but they differ. For C1, only the newly added transaction data belonging to C1 needs to be added. For C3, the original transaction database must be rescanned to find the itemsets belonging to C3 and compute their support, which is then combined with the support from the newly added transaction data for C3; the transaction records of the itemsets that turn out to be frequent are then added. Updating for the three cases C1, C2 and C3 yields the FUFP-tree of the new transaction database D ∪ d.
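The role of the two-way links can be illustrated with a much simplified Python sketch. It is not the full FUFP-tree algorithm: it only shows a prefix tree whose nodes keep both a parent link and child links, a header table that lists the nodes of each item, and the deletion of an item that has become infrequent (the C2 case). The names `Node`, `insert`, `merge_into` and `remove_item`, and the use of a plain dict of lists as the header table, are assumptions. Each transaction is assumed to contain no repeated items and to be ordered as in the header table.

```python
class Node:
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent  # upward link
        self.children = {}    # downward links: item -> Node


def insert(root, items, header):
    """Insert one ordered transaction, creating nodes as needed."""
    node = root
    for item in items:
        child = node.children.get(item)
        if child is None:
            child = Node(item, parent=node)
            node.children[item] = child
            header.setdefault(item, []).append(child)
        child.count += 1
        node = child


def merge_into(parent, child, header):
    """Attach child under parent, merging with an existing node for the same item."""
    existing = parent.children.get(child.item)
    if existing is None:
        child.parent = parent
        parent.children[child.item] = child
        return
    existing.count += child.count
    header[child.item].remove(child)  # the merged node leaves the header table
    for grandchild in list(child.children.values()):
        merge_into(existing, grandchild, header)


def remove_item(item, header):
    """Delete all nodes of an item that became infrequent, reconnecting their children."""
    for node in header.pop(item, []):
        parent = node.parent  # reached directly through the upward link
        del parent.children[item]
        for child in list(node.children.values()):
            merge_into(parent, child, header)


root, header = Node(None), {}
for transaction in (["road", "river", "bridge"], ["road", "bridge"], ["road", "river"]):
    insert(root, transaction, header)

remove_item("river", header)
road = root.children["road"]
print(road.count, road.children["bridge"].count)
```

Because every node stores its parent, `remove_item` can splice a node out starting from the header table alone, without walking down from the root to find it.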
The FUFP-tree algorithm effectively combines the FUP and FP-tree algorithms. By updating the frequent pattern tree at low complexity and scanning the original transaction database only once, it completes incremental-update data mining of the original transaction database.
Embodiment 2
According to an embodiment of the present invention, a device embodiment for incrementally updating geographic information data is provided.
Fig. 2 is a schematic diagram of an optional device for incrementally updating geographic information data according to an embodiment of the present invention. As shown in Fig. 2, the device includes: an acquiring unit 202 configured to obtain candidate itemsets of geographic information, the candidate itemsets including an original data set and a newly added data set; an extraction unit 204 configured to extract the original data set and the newly added data set from the candidate itemsets; a computing unit 206 configured to perform a calculation on the extracted original data set and newly added data set to obtain an incremental calculation result; and an updating unit 208 configured to update the data increment of the geographic information according to the incremental calculation result.
The databases in many application fields are updated continuously. With the technical solution provided by the present invention, mining can be performed again on the basis of the original patterns in combination with the newly added data set, that is, incremental association mining is performed. In this way, even when incremental updates occur in a data mining application, previously mined patterns are kept valid and newly arising patterns are taken into account.
Through the above steps, mining of massive data not only meets the demands of massive data mining but also greatly improves mining efficiency.
Optionally, the acquiring unit includes: a scanning module configured to scan a geographic information database; and a generation module configured to generate the candidate itemsets of the geographic information according to the scan result, the candidate itemsets including an original data set and a newly added data set.
The geographic information database may be a multi-system distributed database. The original data set includes original frequent itemsets and original infrequent itemsets: in the original data set, an itemset whose support count is greater than or equal to the product of the number of transaction records in the original data set and the minimum support threshold is an original frequent itemset, and an itemset whose support count is less than that product is an original infrequent itemset. Similarly, the newly added data set includes newly added frequent itemsets and newly added infrequent itemsets: in the newly added data set, an itemset whose support count is greater than or equal to the product of the number of transaction records in the newly added data set and the minimum support threshold is a newly added frequent itemset, and an itemset whose support count is less than that product is a newly added infrequent itemset.
Through the above steps, the original data set and the newly added data set can be obtained comprehensively and accurately.
Optionally, the computing unit includes: a division module configured to divide the original data set into original frequent itemsets and original infrequent itemsets, and to divide the newly added data set into newly added frequent itemsets and newly added infrequent itemsets, where, within a data set, an itemset whose support count is greater than or equal to the product of the number of transaction records in the data set and the minimum support threshold is a frequent itemset, and an itemset whose support count is less than that product is an infrequent itemset; a first computing module configured to compute the union of the original frequent itemsets and the newly added frequent itemsets to obtain a first union; a second computing module configured to compute the union of the original frequent itemsets and the newly added infrequent itemsets to obtain a second union; a third computing module configured to compute the union of the original infrequent itemsets and the newly added frequent itemsets to obtain a third union; a fourth computing module configured to compute the union of the original infrequent itemsets and the newly added infrequent itemsets to obtain a fourth union; and a determining module configured to take the first union, the second union, the third union and the fourth union as the incremental calculation result.
Through this embodiment of the present invention, the data structure is updated dynamically according to whether each itemset is frequent before and after the incremental database update, which improves update efficiency.
Alternatively, the data increment for updating geography information according to incremental computations result includes:First add module, for inciting somebody to action
First and the item collection concentrated be added in newly-increased data set as data increment;First removing module, for that the 4th and will concentrate
Item collection from initial data concentrate delete.
Because the union of the original frequent itemsets and the newly added frequent itemsets is necessarily
frequent, it can be added directly to the newly added data set; and because the union of the original
non-frequent itemsets and the newly added non-frequent itemsets is necessarily non-frequent, it can be
deleted directly from the raw data set.
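The reasoning above can be checked with a short inequality. The notation is introduced here only for illustration and does not appear in the original: $D$ and $d$ denote the raw and newly added data sets, $|D|$ and $|d|$ their numbers of transaction records, $s$ the minimum support threshold, and $c_D(X)$, $c_d(X)$ the support counts of an itemset $X$ in each. It is also assumed that the two data sets share no transaction records, so the support count over the combined data is $c_D(X) + c_d(X)$.

```latex
\begin{align*}
c_D(X) \ge s\,|D| \ \text{and}\ c_d(X) \ge s\,|d|
  &\;\Longrightarrow\; c_D(X) + c_d(X) \ge s\,\bigl(|D| + |d|\bigr), \\
c_D(X) < s\,|D| \ \text{and}\ c_d(X) < s\,|d|
  &\;\Longrightarrow\; c_D(X) + c_d(X) < s\,\bigl(|D| + |d|\bigr).
\end{align*}
```

Under these assumptions, an itemset that is frequent in both data sets stays frequent after the update, and one that is non-frequent in both stays non-frequent, so neither case needs a rescan.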
With this embodiment of the present invention, itemsets of different types are computed separately by
category, so the calculation result can be updated quickly, achieving the technical effect of improving
update efficiency.
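A minimal Python sketch of the two direct updates, assuming itemsets are frozensets and the itemset collections of the raw and newly added data sets are Python sets. The function and parameter names are hypothetical and not taken from the original.

```python
def apply_direct_updates(first_union, fourth_union, raw_itemsets, added_itemsets):
    """Return updated copies of the raw and newly added itemset collections."""
    # Itemsets in the first union remain frequent, so they join the data increment.
    updated_added = set(added_itemsets) | set(first_union)
    # Itemsets in the fourth union remain non-frequent, so they leave the raw data set.
    updated_raw = set(raw_itemsets) - set(fourth_union)
    return updated_raw, updated_added
```

Returning new sets instead of mutating the inputs is a design choice of this sketch; it keeps the pre-update state available for comparison.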
Optionally, the updating unit includes: a first judging module, configured to judge whether an itemset in
the second union is a non-frequent itemset; a second deleting module, configured to delete, when the
itemset in the second union is a non-frequent itemset, the original frequent itemset corresponding to that
non-frequent itemset from the raw data set; and/or a second judging module, configured to judge whether an
itemset in the third union is a frequent itemset; and a second adding module, configured to add, when the
itemset in the third union is a frequent itemset, the original non-frequent itemset corresponding to that
frequent itemset to the newly added data set.
That is, when the union of the original frequent itemsets and the newly added non-frequent itemsets is
computed, if an original frequent itemset has become non-frequent, the original frequent itemset
corresponding to that non-frequent itemset must be deleted from the raw data set; and/or, when the union of
the original non-frequent itemsets and the newly added frequent itemsets is computed, if an original
non-frequent itemset has become frequent, the original non-frequent itemset corresponding to that frequent
itemset must be added to the newly added data set.
With this embodiment of the present invention, the data structure is updated dynamically according to
whether each itemset is frequent before and after the incremental update of the database, which improves
update efficiency.
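The check on the second and third unions can be sketched as follows. The source does not say how the frequent or non-frequent judgment is made after the update; this sketch assumes it is made against the combined data, using the summed support count and the summed number of transaction records. All names are hypothetical.

```python
def support_count(itemset, transactions):
    """Count the transaction records that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_after_update(itemset, original, added, min_support):
    """Assumed criterion: judge the itemset against the combined data."""
    total = support_count(itemset, original) + support_count(itemset, added)
    return total >= (len(original) + len(added)) * min_support


def update_uncertain_cases(second_union, third_union, raw_itemsets,
                           added_itemsets, original, added, min_support):
    """Return updated copies of the raw and newly added itemset collections."""
    updated_raw = set(raw_itemsets)
    updated_added = set(added_itemsets)
    for s in second_union:
        # An originally frequent itemset that became non-frequent is deleted.
        if not frequent_after_update(s, original, added, min_support):
            updated_raw.discard(s)
    for s in third_union:
        # An originally non-frequent itemset that became frequent is added.
        if frequent_after_update(s, original, added, min_support):
            updated_added.add(s)
    return updated_raw, updated_added
```

Only itemsets in these two groups need their counts revisited, which is where the saving over a full recomputation would come from under this reading.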
The above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis;
for parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed
technical content may be implemented in other ways. The device embodiments described above are merely
illustrative; for example, the division into units may be a division by logical function, and other
divisions are possible in an actual implementation: multiple units or components may be combined or
integrated into another system, or some features may be omitted or not performed. In addition, the mutual
coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or
communication connection between units or modules through some interfaces, and may be electrical or take
other forms.
The units described as separate components may or may not be physically separate, and the components shown
as units may or may not be physical units; that is, they may be located in one place or distributed across
multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose
of the solution of this embodiment.
In addition, the functional units in each embodiment of the present invention may be integrated into one
processing unit, each unit may exist physically on its own, or two or more units may be integrated into one
unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent
product, it may be stored in a computer-readable storage medium. On this understanding, the technical
solution of the present invention, in essence, or the part of it that contributes over the prior art, or
all or part of the technical solution, may be embodied as a software product. The computer software product
is stored in a storage medium and includes instructions that cause a computer device (which may be a
personal computer, a server, a network device or the like) to perform all or some of the steps of the
methods described in the embodiments of the present invention. The storage medium includes any medium that
can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a
removable hard disk, a magnetic disk or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that those of
ordinary skill in the art may make improvements and refinements without departing from the principles of
the present invention, and such improvements and refinements shall also be regarded as falling within the
scope of protection of the present invention.
Claims (10)
1. A method for updating a geographic information data increment, characterized by comprising:
obtaining a candidate set of geographic information, the candidate set including a raw data set and a newly added data set;
extracting the raw data set and the newly added data set from the candidate set;
performing a calculation on the extracted raw data set and newly added data set to obtain an incremental calculation result;
updating the data increment of the geographic information according to the incremental calculation result.
2. The method according to claim 1, characterized in that obtaining the candidate set of geographic information comprises:
scanning a geographic information database;
generating the candidate set of the geographic information according to the scanning result.
3. The method according to claim 1, characterized in that performing a calculation on the extracted raw data set and
newly added data set to obtain the incremental calculation result comprises:
dividing the raw data set into original frequent itemsets and original non-frequent itemsets, and dividing the newly
added data set into newly added frequent itemsets and newly added non-frequent itemsets, wherein, within a data set,
an itemset whose support count is greater than or equal to the product of the number of transaction records in the
data set and the minimum support threshold is a frequent itemset, and an itemset whose support count is less than
that product is a non-frequent itemset;
computing the union of the original frequent itemsets and the newly added frequent itemsets to obtain a first union;
computing the union of the original frequent itemsets and the newly added non-frequent itemsets to obtain a second union;
computing the union of the original non-frequent itemsets and the newly added frequent itemsets to obtain a third union;
computing the union of the original non-frequent itemsets and the newly added non-frequent itemsets to obtain a fourth union;
taking the first union, the second union, the third union and the fourth union as the incremental calculation
result.
4. The method according to claim 3, characterized in that updating the data increment of the geographic information
according to the incremental calculation result comprises:
adding the itemsets in the first union to the newly added data set as the data increment;
deleting the itemsets in the fourth union from the raw data set.
5. The method according to claim 3, characterized in that updating the data increment of the geographic information
according to the incremental calculation result comprises:
judging whether an itemset in the second union is a non-frequent itemset;
if so, deleting the original frequent itemset corresponding to the non-frequent itemset from the raw data set;
and/or
judging whether an itemset in the third union is a frequent itemset;
if so, adding the original non-frequent itemset corresponding to the frequent itemset to the newly added data set.
6. A device for updating a geographic information data increment, characterized by comprising:
an acquiring unit, configured to obtain a candidate set of geographic information, the candidate set including a raw
data set and a newly added data set;
an extraction unit, configured to extract the raw data set and the newly added data set from the candidate set;
a computing unit, configured to perform a calculation on the extracted raw data set and newly added data set to obtain
an incremental calculation result;
an updating unit, configured to update the data increment of the geographic information according to the incremental calculation result.
7. The device according to claim 6, characterized in that the acquiring unit includes:
a scanning module, configured to scan a geographic information database;
a generating module, configured to generate the candidate set of the geographic information according to the scanning result.
8. The device according to claim 6, characterized in that the computing unit includes:
a division module, configured to divide the raw data set into original frequent itemsets and original non-frequent
itemsets, and to divide the newly added data set into newly added frequent itemsets and newly added non-frequent
itemsets, wherein, within a data set, an itemset whose support count is greater than or equal to the product of the
number of transaction records in the data set and the minimum support threshold is a frequent itemset, and an itemset
whose support count is less than that product is a non-frequent itemset;
a first computing module, configured to compute the union of the original frequent itemsets and the newly added
frequent itemsets to obtain a first union;
a second computing module, configured to compute the union of the original frequent itemsets and the newly added
non-frequent itemsets to obtain a second union;
a third computing module, configured to compute the union of the original non-frequent itemsets and the newly added
frequent itemsets to obtain a third union;
a fourth computing module, configured to compute the union of the original non-frequent itemsets and the newly added
non-frequent itemsets to obtain a fourth union;
a determining module, configured to take the first union, the second union, the third union and the fourth union as
the incremental calculation result.
9. The device according to claim 8, characterized in that updating the data increment of the geographic information
according to the incremental calculation result is performed by:
a first adding module, configured to add the itemsets in the first union to the newly added data set as the data
increment;
a first deleting module, configured to delete the itemsets in the fourth union from the raw data set.
10. The device according to claim 8, characterized in that the updating unit includes:
a first judging module, configured to judge whether an itemset in the second union is a non-frequent itemset;
a second deleting module, configured to delete, when the itemset in the second union is a non-frequent itemset, the
original frequent itemset corresponding to the non-frequent itemset from the raw data set;
and/or
a second judging module, configured to judge whether an itemset in the third union is a frequent itemset;
a second adding module, configured to add, when the itemset in the third union is a frequent itemset, the original
non-frequent itemset corresponding to the frequent itemset to the newly added data set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611154814.6A CN106844435A (en) | 2016-12-14 | 2016-12-14 | Update the method and device of geographic information data increment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611154814.6A CN106844435A (en) | 2016-12-14 | 2016-12-14 | Update the method and device of geographic information data increment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106844435A (en) | 2017-06-13 |
Family
ID=59140523
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611154814.6A Pending CN106844435A (en) | 2016-12-14 | 2016-12-14 | Update the method and device of geographic information data increment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106844435A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112581252A (en) * | 2020-12-03 | 2021-03-30 | 信用生活(广州)智能科技有限公司 | Address fuzzy matching method and system fusing multidimensional similarity and rule set |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101436211A (en) * | 2008-12-19 | 2009-05-20 | 北京交通发展研究中心 | City road network data increment recognizing method and increment updating method based on buffer zone analysis |
CN103761236A (en) * | 2013-11-20 | 2014-04-30 | 同济大学 | Incremental frequent pattern increase data mining method |
CN103984723A (en) * | 2014-05-15 | 2014-08-13 | 江苏易酒在线电子商务有限公司 | Method used for updating data mining for frequent item by incremental data |
CN104298669A (en) * | 2013-07-16 | 2015-01-21 | 江苏宏联物联网信息技术有限公司 | Person geographic information mining model based on social network |
CN105528391A (en) * | 2015-11-26 | 2016-04-27 | 国网北京市电力公司 | A method and a device for updating a geographic information data increment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Yoo et al. | A partial join approach for mining co-location patterns | |
CN103761236B (en) | Incremental frequent pattern increase data mining method | |
CN105468371B (en) | A kind of business process map merging method based on Subject Clustering | |
Hussein et al. | Using the interestingness measure lift to generate association rules | |
CN103345616B (en) | The system of the fingerprint storage comparison that Behavior-based control is analyzed | |
CN106294715A (en) | A kind of association rule mining method based on attribute reduction and device | |
CN102822822A (en) | Image management device, image management method, program, recording medium, and integrated circuit | |
CN107315822A (en) | A kind of method for digging of Knowledge Relation | |
CN103945238B (en) | A kind of community's detection method based on user behavior | |
CN106844435A (en) | Update the method and device of geographic information data increment | |
CN114880522A (en) | Method and device for realizing ID Mapping based on graph database | |
Lin et al. | Research on maximal frequent pattern outlier factor for online high dimensional time-series outlier detection | |
Vo et al. | Parallel method for mining high utility itemsets from vertically partitioned distributed databases | |
Chu et al. | An efficient k-medoids-based algorithm using previous medoid index, triangular inequality elimination criteria, and partial distance search | |
CN107908776A (en) | Frequent mode Web Mining algorithm and system based on affairs project incidence matrix | |
Islam et al. | DETECTIVE: A decision tree based categorical value clustering and perturbation technique for preserving privacy in data mining | |
Ramaraju et al. | A conditional tree based novel algorithm for high utility itemset mining | |
Singh et al. | Knowledge based retrieval scheme from big data for aviation industry | |
CN116703141A (en) | Audit data processing method, audit data processing device, computer equipment and storage medium | |
Saraswat et al. | Data pre-processing techniques in data mining: A Review | |
CN111160077A (en) | Large-scale dynamic face clustering method | |
CN104765810A (en) | Diagnosis and treating rules mining method based on Boolean matrix | |
CN105528391A (en) | A method and a device for updating a geographic information data increment | |
CN106296537A (en) | Colony in a kind of information in public security organs industry finds method | |
Lin et al. | Mining high-utility sequential patterns in uncertain databases |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20170613 |