
CN106022359A - Fuzzy entropy space clustering analysis method based on orderly information entropy - Google Patents


Info

Publication number
CN106022359A
Authority
CN
China
Prior art keywords
entropy
fuzzy
integer
information entropy
steps
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610315952.1A
Other languages
Chinese (zh)
Inventor
熊盛武
郑文博
段鹏飞
于笑寒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Technology WUT
Priority to CN201610315952.1A
Publication of CN106022359A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a fuzzy entropy space clustering analysis method based on orderly information entropy, comprising the following steps: S1, inputting a standardized matrix, and letting an integer i=1; S2, assigning an ith value corresponding to a relationship set D to a fuzzy entropy space parameter, and initializing intermediate sets A and C; S3, letting an intermediate set B be empty; S4, if R(xj, xk)>=lambda, deciding that B=B union {xj} and A=A\{xj}; S5, if R(xj, xk)>=lambda, deciding that B=B union {xs} and A=A\{xs}, or going to S6; S6, letting C=C union {B}, and letting an integer i+1 assign a value to i; repeating the steps S2 to S6 until the integer i is equal to the size m of the relationship set D; S7, if A is empty, deciding that X(lambda)=C, and calculating the orderly information entropy at the moment; S8, if HP(min)=min{X(lambda)}, deciding that min=lambda and going to S9, or going to S2; and S9, outputting an optimal granularity level.

Description

Fuzzy entropy space cluster analysis method based on ordered information entropy
Technical field
The present invention relates to the field of cluster analysis, and in particular to a fuzzy entropy space cluster analysis method based on ordered information entropy.
Background art
Clustering is an old problem that has deepened along with the emergence and development of human society. Cluster analysis is an important research area in machine learning and an important method for studying regularities among discrete data. Clustering is widely used in many fields, including pattern recognition, data mining, and market research. The goal of clustering is to group data into different classes without any prior knowledge and to find the most essential properties among the sample points, so that the differences between classes are as large as possible and the differences within a class are as small as possible. Clustering has long been studied in many different ways, and traditional clustering algorithms can be roughly divided into the following categories: partitioning methods, hierarchical methods, density-based methods, grid-based methods, and model-based methods. Through cluster analysis, interesting knowledge, rules, or other information can be extracted from data and observed from different perspectives, and the discovered knowledge can be used for decision-making, process control, information management, and so on. The prior art mainly comprises partitional clustering, hierarchical clustering, density-based clustering, and grid-based clustering. However, most fuzzy clustering methods require the number of clusters to be specified manually, and they are sensitive to the initialization of the cluster centers, so a global optimum is difficult to obtain. Current clustering algorithms based on fuzzy quotient spaces produce a hierarchical clustering result, but they cannot tell which granularity level gives the optimal clustering result.
Summary of the invention
The object of the present invention is to provide a fuzzy entropy space cluster analysis method based on ordered information entropy that can select the optimal level under a fuzzy quotient space.
The technical solution adopted by the present invention to solve its technical problem is as follows:
The present invention provides a fuzzy entropy space cluster analysis method based on ordered information entropy, comprising the following steps:
S1: Input the standardized matrix X = {x_1, x_2, x_3, ..., x_n}. Let R be a fuzzy equivalence relation or a similarity relation on X; the relation set is then D = {R(x, y) | x, y ∈ X} = {λ_1, λ_2, λ_3, ..., λ_m}, with 1 = λ_1 > λ_2 > λ_3 > ... > λ_m, where the λ_i (i = 1, 2, 3, ..., m) are the fuzzy entropy space parameters; let the integer i = 1;
S2: Assign the i-th value of the relation set D to the fuzzy entropy space parameter λ, that is, λ = λ_i; initialize the intermediate set A as the set of positive integers of the same size as X, A = {1, 2, 3, ..., n}; initialize the intermediate set C as empty, C = ∅;
S3: Let the intermediate set B be empty, B = ∅; take any integer j ∈ A, and let B = B ∪ {x_j}, A = A \ {x_j};
S4: Take any integer k ∈ A; if R(x_j, x_k) ≥ λ, then B = B ∪ {x_j}, A = A \ {x_j}; take any integer s ∈ A; if R(x_j, x_k) ≥ λ, then B = B ∪ {x_s}, A = A \ {x_s}; otherwise go to S5;
S5: Let C = C ∪ {B}, and assign i + 1 to i; repeat steps S2 to S5 until the integer i equals the size m of the relation set D, at which point the repetition ends;
S6: If A = ∅, then X(λ) = {x_1, x_2, x_3, ..., x_k, ..., x_λ} = C,
and the ordered information entropy HP(X(λ)) at this point is calculated (k = 1, 2, 3, 4, ..., λ);
S7: If HP(min) = min{X(λ)}, then min = λ and go to S8; otherwise go to S2;
S8: Output the optimal granularity level X(min) = C.
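As a rough illustration of steps S1 to S5, the sketch below builds the relation set D from a relation matrix and forms the blocks of one granularity level. It is a minimal Python sketch under stated assumptions, not the patented procedure itself: the names `relation_levels` and `lambda_cut_partition` and the example matrix `R` are hypothetical, indices are 0-based, "take any integer j" is resolved by taking the smallest remaining index, and each block collects the indices k with R(x_j, x_k) ≥ λ relative to the seed j.

```python
def relation_levels(R):
    """Distinct values of the relation matrix, in descending order (the set D)."""
    return sorted({value for row in R for value in row}, reverse=True)


def lambda_cut_partition(R, lam):
    """Partition indices 0..n-1 into blocks at granularity level lam."""
    A = list(range(len(R)))      # indices not yet placed in a block
    C = []                       # collected blocks
    while A:
        j = A.pop(0)             # seed of the next block (assumed: smallest index)
        B = [j] + [k for k in A if R[j][k] >= lam]
        A = [k for k in A if k not in B]
        C.append(B)
    return C


# Hypothetical fuzzy equivalence relation on four samples (illustration only).
R = [
    [1.0, 0.8, 0.4, 0.4],
    [0.8, 1.0, 0.4, 0.4],
    [0.4, 0.4, 1.0, 0.6],
    [0.4, 0.4, 0.6, 1.0],
]

for lam in relation_levels(R):
    print(lam, lambda_cut_partition(R, lam))
```

For a fuzzy equivalence relation the blocks do not depend on which seed is chosen; for a relation that is only a similarity relation they may, so the seed order is part of the assumption here.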
In the method of the present invention, the specific steps for calculating the ordered information entropy of X(λ) = {x_1, x_2, x_3, ..., x_k, ..., x_λ} = C in step S6 are:
Using the delay-coordinate method of phase space reconstruction, perform phase space reconstruction on any element x_i in X(λ): for each sampling point, take m consecutive samples to obtain the m-dimensional reconstruction vector of x_i: x_i = {x(i), x(i+l), ..., x(i+(m-1)*l)},
The phase space matrix of the sequence X is then: [matrix not reproduced in this text], where m and l are the reconstruction dimension and the delay time, respectively;
Arrange the elements of the reconstruction vector x_i of x(i) in ascending order to obtain:
x_i = {x(i+(j_1-1)*l) ≤ x(i+(j_2-1)*l) ≤ … ≤ x(i+(j_m-1)*l)}
The resulting permutation pattern is {j_1, j_2, j_3, ..., j_m},
which is one of the m! full permutations. Count the number of occurrences of each permutation pattern in the sequence X, and take the relative frequency of each pattern as its probability p_1, p_2, ..., p_k, k ≤ m!. The normalized permutation entropy of the sequence is then computed as:
$$H = \left(-\sum_{i=1}^{k} p_i \log_2(p_i)\right) \cdot \log_2(m!).$$
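The entropy computation described above can be sketched in Python as follows. This is a hedged illustration, not the patent's implementation: the function names are hypothetical, indices are 0-based, and ties inside a window keep their original order. One assumption needs flagging: the formula as printed multiplies the entropy sum by log2(m!), whereas the usual normalization of permutation entropy divides by log2(m!) so that the result lies in [0, 1]. The sketch follows the usual division and exposes the choice through the `normalize` flag.

```python
import math
from collections import Counter


def ordinal_pattern(window):
    """Indices that sort the window in ascending order (ties keep input order)."""
    return tuple(sorted(range(len(window)), key=lambda t: window[t]))


def permutation_entropy(x, m, l=1, normalize=True):
    """Entropy of ordinal patterns of x with dimension m and delay l."""
    if m < 2:
        raise ValueError("m must be at least 2")
    n_vectors = len(x) - (m - 1) * l
    if n_vectors <= 0:
        raise ValueError("sequence too short for the chosen m and l")
    # One reconstruction vector per start index i: x(i), x(i+l), ..., x(i+(m-1)l).
    counts = Counter(
        ordinal_pattern([x[i + t * l] for t in range(m)])
        for i in range(n_vectors)
    )
    probs = [c / n_vectors for c in counts.values()]
    h = -sum(p * math.log2(p) for p in probs)
    if normalize:
        # Assumption: divide by log2(m!), the common normalization.
        h /= math.log2(math.factorial(m))
    return h


if __name__ == "__main__":
    sample = [4, 7, 9, 10, 6, 11, 3]   # arbitrary illustrative sequence
    print(permutation_entropy(sample, m=3, l=1))
```

A strictly increasing sequence produces a single pattern and therefore zero entropy, which is a quick sanity check for any implementation.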
The beneficial effects of the present invention are as follows: the invention uses the relation set under the fuzzy quotient space to replace the Euclidean distance in the traditional fuzzy C-means clustering algorithm, and uses the ordered information entropy, a criterion function based on the idea of granularity, to select an optimal level, thereby determining the number of clusters and selecting highly similar samples as the initial cluster centers. The invention also introduces an entropy computation, which has the advantages of a small amount of computation, low memory usage, and a simple method, and is suitable for cluster analysis of large samples. The method of the invention does not require the number of clusters to be set in advance, can discover the hierarchical relationships among classes, and can cluster into other shapes.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments, in which:
Fig. 1 is a flow chart of the fuzzy entropy space cluster analysis method based on ordered information entropy according to an embodiment of the present invention.
Detailed description of the embodiments
To make the object, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
The fuzzy entropy space cluster analysis method based on ordered information entropy of the present invention, as shown in Fig. 1, mainly comprises the following steps:
S1: Input the standardized matrix X = {x_1, x_2, x_3, ..., x_n}. Let R be a fuzzy equivalence relation or a similarity relation on X; the relation set is then D = {R(x, y) | x, y ∈ X} = {λ_1, λ_2, λ_3, ..., λ_m}, with 1 = λ_1 > λ_2 > λ_3 > ... > λ_m, where the λ_i (i = 1, 2, 3, ..., m) are the fuzzy entropy space parameters; let the integer i = 1;
S2: Assign the i-th value of the relation set D to the fuzzy entropy space parameter λ, that is, λ = λ_i; initialize the intermediate set A as the set of positive integers of the same size as X, A = {1, 2, 3, ..., n}; initialize the intermediate set C as empty, C = ∅;
S3: Let the intermediate set B be empty, B = ∅; take any integer j ∈ A, and let B = B ∪ {x_j}, A = A \ {x_j};
S4: Take any integer k ∈ A; if R(x_j, x_k) ≥ λ, then B = B ∪ {x_j}, A = A \ {x_j}; take any integer s ∈ A; if R(x_j, x_k) ≥ λ, then B = B ∪ {x_s}, A = A \ {x_s}; otherwise go to S5;
S5: Let C = C ∪ {B}, and assign i + 1 to i; repeat steps S2 to S5 until the integer i equals the size m of the relation set D, at which point the repetition ends;
S6: If A = ∅, then X(λ) = {x_1, x_2, x_3, ..., x_k, ..., x_λ} = C,
and the ordered information entropy HP(X(λ)) at this point is calculated (k = 1, 2, 3, 4, ..., λ);
S7: If HP(min) = min{X(λ)}, then min = λ and go to S8; otherwise go to S2;
S8: Output the optimal granularity level X(min) = C.
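The outer loop of steps S1 to S8, which sweeps the granularity levels and keeps the one with the smallest criterion value, might be sketched as below. This is an illustrative Python sketch only. The names `partition_at` and `select_level` are hypothetical, indices are 0-based, and the criterion is passed in as a callable because the text does not spell out how HP is evaluated on a partition; the toy criterion in the usage lines is a placeholder, not the ordered information entropy.

```python
def partition_at(R, lam):
    """Blocks of indices at level lam, seeded from the smallest remaining index."""
    A = list(range(len(R)))
    C = []
    while A:
        j = A.pop(0)
        B = [j] + [k for k in A if R[j][k] >= lam]
        A = [k for k in A if k not in B]
        C.append(B)
    return C


def select_level(R, criterion):
    """Return (lam, C, score) for the level with the smallest criterion value."""
    levels = sorted({value for row in R for value in row}, reverse=True)
    best = None
    for lam in levels:                    # lam = lambda_1, ..., lambda_m
        C = partition_at(R, lam)          # X(lam)
        score = criterion(C)              # stands in for HP(X(lam))
        if best is None or score < best[2]:
            best = (lam, C, score)
    return best


# Hypothetical relation matrix and placeholder criterion (illustration only).
R = [
    [1.0, 0.8, 0.4, 0.4],
    [0.8, 1.0, 0.4, 0.4],
    [0.4, 0.4, 1.0, 0.6],
    [0.4, 0.4, 0.6, 1.0],
]
toy_criterion = lambda C: abs(len(C) - 2)   # NOT the patent's HP
print(select_level(R, toy_criterion))
```

Keeping the criterion as a parameter makes it straightforward to swap in an entropy-based function once its exact definition on a partition is fixed.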
In the above embodiment, A, B, and C are intermediate sets. "=" denotes assignment, "∪" denotes the union of two sets, and "\" denotes the difference of two sets.
Step S2 addresses the shortcomings of the traditional fuzzy C-means clustering algorithm, which is sensitive to the initial centers, requires the number of clusters to be specified in advance, and rarely yields a correct clustering result when the class sizes are unbalanced. This patent uses the relation set under the fuzzy quotient space to replace the Euclidean distance in the traditional fuzzy C-means clustering algorithm.
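The text does not say how the relation R is obtained from the data, so the following Python sketch is purely an assumption about one common route: build a similarity relation from standardized data, then take its max-min transitive closure to obtain a fuzzy equivalence relation. The function names and the similarity measure (1 minus the largest per-feature gap, for features scaled to [0, 1]) are hypothetical choices for illustration.

```python
def similarity_from_data(X):
    """Similarity in [0, 1] for rows scaled to [0, 1] (assumed measure)."""
    n = len(X)
    return [
        [1.0 - max(abs(a - b) for a, b in zip(X[i], X[j])) for j in range(n)]
        for i in range(n)
    ]


def maxmin_compose(P, Q):
    """Max-min composition of two square fuzzy relations."""
    n = len(P)
    return [
        [max(min(P[i][k], Q[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(R):
    """Square R under max-min composition until it stops changing."""
    while True:
        R2 = maxmin_compose(R, R)
        if R2 == R:          # entries are only ever existing values, so == is exact
            return R
        R = R2


# Hypothetical similarity relation that is not yet transitive.
S = [
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.5],
    [0.3, 0.5, 1.0],
]
print(transitive_closure(S))
```

The closure only raises entries where a chain of stronger links exists, so the distinct values of the closed matrix can serve directly as the levels in the relation set D.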
Step S7 uses the ordered information entropy, a criterion function based on the idea of granularity, to select an optimal level, thereby determining the number of clusters and selecting highly similar samples as the initial cluster centers. The entropy computation introduced in this step has the advantages of a small amount of computation, low memory usage, and a simple method, and is suitable for cluster analysis of large samples.
Step S8 does not require the number of clusters to be set in advance and can discover the hierarchical relationships among classes; besides cluster-shaped results, the result may also be tree-shaped or of any other shape.
The process of calculating the ordered information entropy is:
Let the sequence be X = {x_i}, i = 1, 2, 3, ..., n. Using the delay-coordinate method of phase space reconstruction, perform phase space reconstruction on any element x_i in X: for each sampling point, take m consecutive samples to obtain the m-dimensional reconstruction vector of x_i: x_i = {x(i), x(i+l), ..., x(i+(m-1)*l)}
The phase space matrix of the sequence X is then: [matrix not reproduced in this text], where m and l are the reconstruction dimension and the delay time, respectively;
Arrange the elements of the reconstruction vector x_i of x(i) in ascending order to obtain:
x_i = {x(i+(j_1-1)*l) ≤ x(i+(j_2-1)*l) ≤ … ≤ x(i+(j_m-1)*l)}
The permutation pattern obtained in this way is {j_1, j_2, j_3, ..., j_m},
which is one of the m! full permutations. Count the number of occurrences of each permutation pattern in the sequence X, and take the relative frequency of each pattern as its probability p_1, p_2, ..., p_k, k ≤ m!. The normalized permutation entropy of the sequence is then computed as:
$$H = \left(-\sum_{i=1}^{k} p_i \log_2(p_i)\right) \cdot \log_2(m!)$$
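A small worked example may make the pattern counting concrete. The sequence, m = 3, and l = 1 below are arbitrary illustrative choices, not values from the patent, and only the entropy sum inside the parentheses is evaluated.

```latex
% Illustrative sequence: X = (4, 7, 9, 10, 6, 11, 3), with m = 3 and l = 1.
% Reconstruction vectors and their ascending-order patterns (j_1, j_2, j_3):
%   (4, 7, 9)   -> (1, 2, 3)
%   (7, 9, 10)  -> (1, 2, 3)
%   (9, 10, 6)  -> (3, 1, 2)
%   (10, 6, 11) -> (2, 1, 3)
%   (6, 11, 3)  -> (3, 1, 2)
\[
p_{(1,2,3)} = \tfrac{2}{5}, \qquad
p_{(3,1,2)} = \tfrac{2}{5}, \qquad
p_{(2,1,3)} = \tfrac{1}{5}, \qquad k = 3 \le 3! = 6
\]
\[
-\sum_{i=1}^{k} p_i \log_2 p_i
  = -\left( 2 \cdot \tfrac{2}{5}\log_2\tfrac{2}{5} + \tfrac{1}{5}\log_2\tfrac{1}{5} \right)
  \approx 1.522,
\qquad
\log_2(3!) \approx 2.585
\]
```

Only three of the six possible patterns occur here, which is why the entropy sum stays well below its maximum of log2(3!).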
It should be understood that those of ordinary skill in the art can make improvements or changes based on the above description, and all such improvements and changes shall fall within the protection scope of the appended claims of the present invention.

Claims (2)

1. A fuzzy entropy space cluster analysis method based on ordered information entropy, characterized in that it comprises the following steps:
S1: Input the standardized matrix X = {x_1, x_2, x_3, ..., x_n}. Let R be a fuzzy equivalence relation or a similarity relation on X; the relation set is then D = {R(x, y) | x, y ∈ X} = {λ_1, λ_2, λ_3, ..., λ_m}, with 1 = λ_1 > λ_2 > λ_3 > ... > λ_m, where the λ_i (i = 1, 2, 3, ..., m) are the fuzzy entropy space parameters; let the integer i = 1;
S2: Assign the i-th value of the relation set D to the fuzzy entropy space parameter λ, that is, λ = λ_i; initialize the intermediate set A as the set of positive integers of the same size as X, A = {1, 2, 3, ..., n}; initialize the intermediate set C as empty, C = ∅;
S3: Let the intermediate set B be empty, B = ∅; take any integer j ∈ A, and let B = B ∪ {x_j}, A = A \ {x_j};
S4: Take any integer k ∈ A; if R(x_j, x_k) ≥ λ, then B = B ∪ {x_j}, A = A \ {x_j}; take any integer s ∈ A; if R(x_j, x_k) ≥ λ, then B = B ∪ {x_s}, A = A \ {x_s}; otherwise go to S5;
S5: Let C = C ∪ {B}, and assign i + 1 to i; repeat steps S2 to S5 until the integer i equals the size m of the relation set D, at which point the repetition ends;
S6: If A = ∅, then X(λ) = {x_1, x_2, x_3, ..., x_k, ..., x_λ} = C,
and the ordered information entropy HP(X(λ)) at this point is calculated (k = 1, 2, 3, 4, ..., λ);
S7: If HP(min) = min{X(λ)}, then min = λ and go to S8; otherwise go to S2;
S8: Output the optimal granularity level X(min) = C.
2. The method according to claim 1, characterized in that the specific steps for calculating the ordered information entropy of X(λ) = {x_1, x_2, x_3, ..., x_k, ..., x_λ} = C in step S6 are:
Using the delay-coordinate method of phase space reconstruction, perform phase space reconstruction on any element x_i in X(λ): for each sampling point, take m consecutive samples to obtain the m-dimensional reconstruction vector of x_i: x_i = {x(i), x(i+l), ..., x(i+(m-1)*l)},
The phase space matrix of the sequence X is then: [matrix not reproduced in this text], where m and l are the reconstruction dimension and the delay time, respectively;
Arrange the elements of the reconstruction vector x_i of x(i) in ascending order to obtain:
x_i = {x(i+(j_1-1)*l) ≤ x(i+(j_2-1)*l) ≤ … ≤ x(i+(j_m-1)*l)}
The resulting permutation pattern is {j_1, j_2, j_3, ..., j_m},
which is one of the m! full permutations. Count the number of occurrences of each permutation pattern in the sequence X, and take the relative frequency of each pattern as its probability p_1, p_2, ..., p_k, k ≤ m!. The normalized permutation entropy of the sequence is then computed as:
$$H = \left(-\sum_{i=1}^{k} p_i \log_2(p_i)\right) \cdot \log_2(m!).$$

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610315952.1A CN106022359A (en) 2016-05-12 2016-05-12 Fuzzy entropy space clustering analysis method based on orderly information entropy

Publications (1)

Publication Number Publication Date
CN106022359A true CN106022359A (en) 2016-10-12

Family

ID=57099762

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610315952.1A Pending CN106022359A (en) 2016-05-12 2016-05-12 Fuzzy entropy space clustering analysis method based on orderly information entropy

Country Status (1)

Country Link
CN (1) CN106022359A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107228766A (en) * 2017-05-22 2017-10-03 上海理工大学 Based on the Fault Diagnosis of Roller Bearings for improving multiple dimensioned fuzzy entropy
CN107228766B (en) * 2017-05-22 2019-03-05 上海理工大学 Based on the Fault Diagnosis of Roller Bearings for improving multiple dimensioned fuzzy entropy
CN109145921A (en) * 2018-08-29 2019-01-04 江南大学 A kind of image partition method based on improved intuitionistic fuzzy C mean cluster
CN109145921B (en) * 2018-08-29 2021-04-09 江南大学 Image segmentation method based on improved intuitive fuzzy C-means clustering
CN109447163A (en) * 2018-11-01 2019-03-08 中南大学 A kind of mobile object detection method towards radar signal data
CN109447163B (en) * 2018-11-01 2022-03-22 中南大学 Radar signal data-oriented moving object detection method
CN109657123A (en) * 2018-12-13 2019-04-19 厦门大学嘉庚学院 A kind of food safety affair clustering method based on comentropy
CN109657123B (en) * 2018-12-13 2022-10-11 厦门大学嘉庚学院 Food safety event cluster analysis method based on information entropy
CN112270203A (en) * 2020-09-18 2021-01-26 河北建投新能源有限公司 Fan characteristic optimization method based on entropy weight method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161012