1 Introduction

Granular computing (GrC), originally proposed by Zadeh (1997), has made significant contributions to the advancement of artificial intelligence. It provides methodologies for knowledge representation and acquisition, and is closely associated with knowledge applications in complex real-world problems (Pedrycz 2015; Pedrycz et al. 2015; Pedrycz 2021; Bello et al. 2021). Rough set theory (Pawlak 1982, 1992), as a fundamental application of GrC, focuses on describing and acquiring knowledge about uncertainties from various perspectives such as approximations, indistinguishable relations, and information theories. The theory relies on information granules derived from indistinguishable relations. Since then, increasing attention has been attracted, particularly from the perspective of attribute reduction in extended models such as classical and neighborhood rough sets (Qian et al. 2010; Chen et al. 2012; Dai et al. 2017; Hu et al. 2021), fuzzy rough sets (Hu et al. 2010; Chen and Yang 2013; Hu et al. 2017; Yuan et al. 2021; Ji et al. 2023; Zhang et al. 2022), multigranulation rough sets (Kang and Dai 2023) and so on.

However, classical rough set models are inadequate when dealing with preference relations. The dominance-based rough set approach (DRSA), as an extension of rough sets, was proposed to reflect the uncertainties behind inconsistencies of preference relations between criteria and decisions. In DRSA, approximation operators are constructed from a particular type of graded granule, namely dominance classes induced by dominance relations. These granules play a crucial role in attribute reduction and decision-making processes based on the DRSA (Xu et al. 2009; Zhang et al. 2017; Pan et al. 2017; Xu 2013; Pan et al. 2023; Gao et al. 2019; Chen et al. 2022; Wang et al. 2023). However, the presence of noise in graded information granules can undermine the consistency of decision rules. To address this issue, variable-precision DRSA has been proposed (Greco et al. 2000; Inuiguchi et al. 2009; Liou 2011; Pan et al. 2023). Hu et al. introduced fuzzy DRSA, which utilizes fuzzy preference relations to formulate fuzzy information granules, and analyzed attribute reduction methods from an information theory perspective (Hu et al. 2010). Chen et al. developed dominance-based neighborhood rough sets by employing dominance-based neighborhood information granules, which involve distance measures to filter redundant object pairs and identify prominent object pairs with dominance relations (Chen et al. 2015, 2016). Building upon this, Sang et al. investigated incremental feature selection approaches by incorporating a new conditional entropy measure with fuzzy knowledge granules (Sang et al. 2021a, b). Pan et al. proposed weighted dominance-based neighborhood rough sets by assigning weight vectors to conditional attributes (Pan et al. 2023). Yang et al. simplified dominance-based neighborhood information granules using fuzzy preference relations and introduced a novel model of quantitative dominance-based neighborhood rough sets (Yang et al. 2021). Additionally, attribute reductions based on lower and upper discernibility matrices have been extensively studied both theoretically and experimentally.

For attribute reductions of DRSA, the key task is to decrease inconsistencies between criteria and decisions by removing redundant or unnecessary attributes which disrupt the order relations between objects (Yang et al. 2019; Yu et al. 2023). Attribute reductions based on lower approximations of DRSA aim to identify sub-attribute sets, known as reducts, that preserve the same lower approximations as the complete set of attributes (Zhu 2007; Yang 2007; Yang et al. 2008). The goal is to obtain representative reducts with the minimum cardinality by considering the significance of attributes in terms of dependence or approximating quality. Various heuristic algorithms have been proposed for attribute reductions based on these principles (Zhu and Wang 2003; Leung et al. 2008; Tsang et al. 2008).

Kotłowski et al. introduced the concept of generalized decisions with lower approximations to study ordinal classifications from a statistical perspective (Kotłowski et al. 2008). Generalized decisions are defined to reveal the dominance principle. Building upon this, several accelerated and dynamic attribute reduction methods for DRSA have been explored (Li et al. 2013; Wang et al. 2016, 2019, 2020). In Li et al. (2013), lower approximations were updated by incorporating changes in generalized decisions when faced with variations in objects. Wang et al. further extended this approach by combining generalized decisions to construct a dominance feature matrix that considers order relations and indiscernible relations. They proposed an accelerated attribute reduction method to handle variations in objects and attributes in ordered information tables (Wang et al. 2016, 2019; Ahmad et al. 2020; Wang et al. 2020).

Objects with generalized decisions accurately represent the optimal decision rules, and collections of objects with the same generalized decisions form graded information granules when combined with order relations. Therefore, the main contribution of this paper is to formulate a new graded information granule in quantitative dominance-based neighborhood rough sets, which is no longer a covering of the universe but a partition based on the consistency between criteria and decisions. On this basis, theoretical properties, including the importance of granules, will be studied in detail. Besides, the relationship between approximating quality (or dependence) and the importance of the formulated graded information granules will be explored. Finally, we design an accelerated algorithm for attribute reductions in quantitative dominance-based neighborhood rough sets by updating generalized decisions.

The remainder of this paper is organized as follows. Section 2 reviews the basic concepts of DRSA and quantitative dominance-based neighborhood rough sets. In Sect. 3, we focus on the construction of graded information granules and their properties, which form the core of this paper. Section 4 analyzes the theory and algorithms for attribute reductions in quantitative dominance-based neighborhood rough sets, including the accelerated process. In Sect. 5, we evaluate the effectiveness of our method on public datasets under various parameters, both in terms of the running time of attribute reductions and the precision of reduct sets, using statistical hypothesis testing with the paired t-test. Finally, Sect. 6 concludes the paper.

2 Preliminaries

In this section, we review the fundamental concepts of the dominance-based rough set approach (DRSA) and quantitative dominance-based neighborhood rough sets (QDNRSs) (Greco et al. 1999, 2002a, b; Yang et al. 2021). The specific concepts are presented as follows:

2.1 Information tables and decision tables

DRSA and its extensions are tools to reveal knowledge with preference orders in information tables and decision tables (DTs). An information table is a quadruple \((U,A,V,f)\), where \(U=\{x_1,\cdots ,x_n\}\) is the universe consisting of finite objects, \(A=\{a_1,a_2,\cdots ,a_m \}\) is a non-empty finite set of attributes, \(V=\bigcup _{a\in A}V_a\) is the union of attribute domains with \(V_a\) the domain of attribute a, and \(f:U\times A\rightarrow V\) is an information function, i.e., \(\forall x\in U,\ a\in A,\ f(x,a)\in V_a\).

In particular, an information table is a DT if \(A=C\cup D\), where C and D denote conditional and decision attributes respectively. For simplicity and without loss of generality, in this paper we focus on DTs with a single decision attribute, i.e., \(D=\{d\}\).

Here we use a sub-part of the data set Teaching from the UCI machine learning repository to present an example of a DT (Asuncion and Newman 2007). The data in Table 1 are selected from the first 15 rows of objects and all the attributes of Teaching.

Example 1

A decision table generated by Teaching.

Table 1 Decision table from Teaching (Asuncion and Newman 2007)

In Table 1, \(U=\{x_1,\cdots ,x_{15}\}\), \(A=\{a_1,\cdots ,a_5\}\) is the set of conditional attributes and \(V_d=\{2,3\}\) is the domain of the decision attribute.

2.2 Quantitative dominance-based neighborhood rough sets and dominance-based rough set approach

Yang et al. proposed QDNRSs with fuzzy preference relations (Yang et al. 2021), in which quantitative dominance-based neighborhood relations are formulated by considering both dominance relations and their extent through a threshold \(\delta\). The specific model of QDNRSs is presented as follows.

Definition 1

Let \((U,A\cup \{d\},V,f)\) be a DT and \(B\subseteq A,\ X\subseteq U\). Lower and upper approximations of X with respect to quantitative dominance-based neighborhood classes are denoted by \(\underline{\delta _B^+}(X)\) and \(\overline{\delta _B^+}(X)\) respectively, and are given by Eq. (1).

$$\begin{aligned} \underline{\delta _B^+}(X)&=\{x\in U |\delta _B^+(x)\subseteq X\},\nonumber \\ \overline{\delta _B^+}(X)&=\{x\in U |\delta _B^+(x)\cap X\ne \emptyset \}. \end{aligned}$$
(1)

where \(\delta _B^+(x)=\{x\}\cup \{y\in U |p_a(y,x)\geqslant \delta ,\forall a\in B\}\ (\delta \geqslant 0.5)\) is the quantitative dominance-based neighborhood class of x with respect to B. \(p_a(y,x)=\frac{1}{1+e^{-k(f(y,a)-f(x,a))}}\) denotes the fuzzy preference relation which quantifies the extent of \(y\succcurlyeq x\) in terms of the order relation \(\succcurlyeq\) under attribute a, in which \(k>0\) controls the extent of dominance relations.

Similarly, the quantitative dominated neighborhood class of x with B is \(\delta _B^-(x)=\{x\}\cup \{y\in U |p_a(y,x)\leqslant 1- \delta ,\forall a\in B\}\).

For any \(\delta _B^+(x)\) and \(\delta _B^-(x)\), the quantitative dominance-based and dominated neighborhood relations are denoted by \(\delta _B^+\) and \(\delta _B^-\), where \(\delta _B^+=\{(x,y)|x\in \delta _B^+(y)\}\) and \(\delta _B^-=\{(x,y)|x\in \delta _B^-(y)\}\). More details of QDNRSs and fuzzy preference relations are presented in Yang et al. (2021).
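To make these notions concrete, the following Python sketch computes the fuzzy preference relation \(p_a\) and the quantitative dominance-based neighborhood class \(\delta _B^+(x)\). It is a minimal illustration under the definitions above; the function names and the toy data are our own, not part of the original model.

```python
import numpy as np

def fuzzy_preference(fa, k=10.0):
    """P[y, x] = p_a(y, x) = 1 / (1 + exp(-k (f(y,a) - f(x,a)))) for one
    attribute column fa of shape (n,)."""
    diff = fa[:, None] - fa[None, :]           # diff[y, x] = f(y,a) - f(x,a)
    return 1.0 / (1.0 + np.exp(-k * diff))

def upward_neighborhood(data, B, x, delta=0.7, k=10.0):
    """delta_B^+(x): x itself plus every y with p_a(y, x) >= delta for all a in B.
    `data` is an (n, m) array of attribute values, B a list of column indices."""
    keep = np.ones(data.shape[0], dtype=bool)
    for a in B:
        P = fuzzy_preference(data[:, a], k)
        keep &= P[:, x] >= delta               # y dominates x to degree >= delta on a
    keep[x] = True                             # x always belongs to its own class
    return np.flatnonzero(keep)

# Toy table: 5 objects, 2 criteria (values are made up for illustration).
data = np.array([[1.0, 2.0], [2.0, 2.5], [3.0, 1.0], [2.5, 3.0], [1.5, 1.5]])
print(upward_neighborhood(data, B=[0, 1], x=0, delta=0.7, k=10.0))
```

The dominated class \(\delta _B^-(x)\) is obtained analogously by testing \(p_a(y,x)\leqslant 1-\delta\).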

Theorem 1

Yang et al. (2021) Let \((U,C\cup \{d\}, V,f)\) be a DT. For \(\forall X,Y\subseteq U\) and \(B\subseteq C\), the approximation operators of X satisfy the following properties:

$$\begin{aligned} &(1L)\ \underline{\delta _B^+}(\sim X)=\sim \overline{\delta _B^+}(X),&&(1U)\ \overline{\delta _B^+}(\sim X)=\sim \underline{\delta _B^+}(X);\\ &(2L)\ \underline{\delta _B^+}(\emptyset )=\emptyset ,&&(2U)\ \overline{\delta _B^+}(\emptyset )=\emptyset ;\\ &(3L)\ \underline{\delta _B^+}(U)=U,&&(3U)\ \overline{\delta _B^+}(U)=U;\\ &(4L)\ \underline{\delta _B^+}(X\cap Y)=\underline{\delta _B^+}(X)\cap \underline{\delta _B^+}(Y),&&(4U)\ \overline{\delta _B^+}(X\cup Y)=\overline{\delta _B^+}(X)\cup \overline{\delta _B^+}(Y);\\ &(5L)\ \underline{\delta _B^+}(X\cup Y)\supseteq \underline{\delta _B^+}(X)\cup \underline{\delta _B^+}(Y),&&(5U)\ \overline{\delta _B^+}(X\cap Y)\subseteq \overline{\delta _B^+}(X)\cap \overline{\delta _B^+}(Y);\\ &(6L)\ X\subseteq Y\Rightarrow \underline{\delta _B^+}(X)\subseteq \underline{\delta _B^+}(Y),&&(6U)\ X\subseteq Y\Rightarrow \overline{\delta _B^+}(X)\subseteq \overline{\delta _B^+}(Y);\\ &(7L)\ \underline{\delta _B^+}(X)\subseteq X,&&(7U)\ X\subseteq \overline{\delta _B^+}(X);\\ &(8L)\ \underline{\delta _B^+}(\underline{\delta _B^+}(X))=\underline{\delta _B^+}(X),&&(8U)\ \overline{\delta _B^+}(\overline{\delta _B^+}(X))=\overline{\delta _B^+}(X). \end{aligned}$$

In particular, DRSA is obtained by setting \(\delta =0.5\) in QDNRSs, and \(\delta _B^+(x)\) is transformed into the dominance class \([x]_B^\geqslant\), i.e., \([x]_B^\geqslant\) describes the dominance class of object x under attribute subset B when \(\delta =0.5\). Approximations of DRSA can then be acquired similarly to those of QDNRSs.

In this section, some preliminaries related to DRSA and extensions are reviewed. In what follows, we will focus on our study from the perspective of information granules.

3 Graded information granules based on the generalized decisions

The lower approximations of DRSA capture the dominance principle, which states that if object x is preferred over object y under certain criteria, the decision class of x should be superior to that of y. Specifically, if \(x\in \underline{R}_A(Cl_i^\geqslant )\), it implies that x belongs to \(\underline{R}_A(Cl_j^\geqslant )\) for any \(j\leqslant i\), where A represents the conditional attribute set and \(Cl_i^\geqslant\) denotes the i-th upward union of decision classes. From the perspective of decision rules, if for any \(k>i\), \(x\notin \underline{R}_A(Cl_k^\geqslant )\), then \(Cl_i^\geqslant\) can be interpreted as the optimal decision class of x with respect to A.

For example, let us consider a student (x) in middle school and subjects such as math, English, and Chinese (A). If \(Cl_i\) represents the performance levels of medium, good, and excellent for \(i=1,2,3\) respectively, the comprehensive performance of the student (x) should correspond to a unique grade of medium, good, or excellent. However, based on the lower approximations of DRSA, if the student (x) performs at the good level, he or she is also covered by the rule for the medium level or better. Hence, the optimal decision class becomes crucial in reflecting the consistency between criteria and decisions.

The main point of this section is to formulate a graded information granule from the perspective of decision rules in DRSA. Decision rules in ordered information tables describe the consistency between criteria and decision attributes. In DRSA, certain decision rules are acquired based on lower approximations (Greco et al. 1999, 2002a, b).

Definition 2

For DT=\(\{U,A\cup \{d\},V,f\}\) and \(B\subseteq A\), certain decision rules based on lower approximations of DRSA are defined as follows:

  1. (1)

    certain \(Cl^{\geqslant }\)-decision rules with \(\underline{R_B}(Cl_i^\geqslant )\):

    if (\(\mathop {\wedge }\limits _{j} f(x,a_j)\geqslant r_{a_j}\)), then \(x\in Cl_i^\geqslant\),

  2. (2)

    certain \(Cl^{\leqslant }\)-decision rules with \(\underline{R_B}(Cl_i^\leqslant )\):

    if (\(\mathop {\wedge }\limits _{j} f(x,a_j)\leqslant r_{a_j}\)), then \(x\in Cl_i^\leqslant\),

    where \(a_j\in B\) and \(r_{a_j}\in V_{a_j}\). \(U/d = \{Cl_1,\cdots ,Cl_r\}\) with \(Cl_j = \{x \in U \mid d(x) = j\}\ (1\leqslant j\leqslant r)\) is induced by the decision attribute d, and \(Cl_i^\geqslant =\bigcup _{j\geqslant i} Cl_j,\ Cl_i^\leqslant =\bigcup _{j\leqslant i}Cl_j\ (1\leqslant i\leqslant r)\) represent the upward and downward unions of decision classes respectively.

By considering the consistency between criteria and decision attributes, Kotłowski et al. formulated a generalized decision of object x with attribute subset B (Kotłowski et al. 2008). The concrete definition is shown as follows.

Definition 3

Let DT=\(\{U,A\cup \{d\},V,f\}\) and \(B\subseteq A\), for \(\forall x\in U\), the generalized decision is constructed as an interval \([l_B(x),u_B(x)]\), where \(l_B(x)\) and \(u_B(x)\) are represented as follows.

$$\begin{aligned} l_B(x)&=\max \{k \mid x\in \underline{R_B}(Cl_k^\geqslant )\},\nonumber \\ u_B(x)&=\min \{k \mid x\in \underline{R_B}(Cl_k^\leqslant )\}. \end{aligned}$$
(2)

To account for the inconsistencies caused by redundant objects in the dominance classes of objects, we formulate a more general notion of generalized decision for each object in QDNRSs. The specific definition is proposed as follows.

Definition 4

Let \(S=(U,A\cup \{d\},V,f)\) be a DT and \(x\in U\), \(B\subseteq A\). We define the generalized decision of x with respect to \(\delta _B^+\) and \(\delta _B^-\) as \([l_{B,\delta }(x), u_{B,\delta }(x)]\), where \(l_{B,\delta }(x)\) and \(u_{B,\delta }(x)\) are given by:

$$\begin{aligned} l_{B,\delta }(x)&=\max \{i |x\in \underline{\delta _B^+} (Cl_i^\geqslant )\},\nonumber \\ u_{B,\delta }(x)&=\min \{i |x\in \underline{\delta _B^-} (Cl_i^\leqslant )\}. \end{aligned}$$
(3)

Hereafter \(l_{B,\delta }(x)\) and \(u_{B,\delta }(x)\) are used to represent the upward and downward generalized decisions. Definition 4 above reveals the dominance principle with a certain confidence degree restricted by quantitative dominance-based neighborhood relations. Obviously, \(l_{B,\delta }(x)\leqslant u_{B,\delta }(x)\), and object x is consistent with respect to the dominance principle when \(l_{B,\delta }(x)=u_{B,\delta }(x)\). From the perspective of decision rules in Definition 2, for any \(x\in U\) and \(B\subseteq A\) in \(\{U,A\cup \{d\},V,f\}\), \(l_{B,\delta }(x)\) and \(u_{B,\delta }(x)\) represent the optimal indices for the upward and downward unions of decision classes in terms of \(\delta _B^+\) and \(\delta _B^-\). The higher the index \(l_{B,\delta }(x)\), the more superior x is in terms of \(\delta _B^+\). Moreover, classes of objects with the same upward or downward generalized decisions exhibit identical consistency between criteria and decisions. Classes with different upward or downward generalized decisions are graded information granules, and the collection of all classes with different upward or downward generalized decisions partitions the universe.
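As a small computational sketch of Definition 4, the following Python function derives the upward generalized decision \(l_{B,\delta }(x)\) from the neighborhood helper introduced in Sect. 2 (assuming decisions are coded 1..r); since \(x\in \underline{\delta _B^+} (Cl_i^\geqslant )\) exactly when every object of \(\delta _B^+(x)\) has decision at least i, \(l_{B,\delta }(x)\) equals the minimum decision inside \(\delta _B^+(x)\). The downward case is symmetric and omitted here.

```python
import numpy as np

def upward_generalized_decisions(data, d, B, delta=0.7, k=10.0):
    """l_{B,delta}(x) = max{ i | x in lower(Cl_i^>=) } (Definition 4).
    `d` is an (n,) array of decision values in {1, ..., r}; reuses
    upward_neighborhood from the earlier sketch."""
    n = data.shape[0]
    l = np.empty(n, dtype=int)
    for x in range(n):
        nb = upward_neighborhood(data, B, x, delta, k)
        l[x] = d[nb].min()     # delta_B^+(x) subseteq Cl_i^>= iff i <= min decision
    return l
```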

Definition 5

For DT\(=(U,A\cup \{d\},V,f)\) and \(B\subseteq A\), \(x\in U\), upward and downward graded information granules with generalized decisions are defined by:

  1. (1)

    the upward graded information granule of x in terms of the \(l_{B,\delta }(x)\) is denoted as \(G_{B,\delta }^+(x)\), which is formulated by:

    $$\begin{aligned} G_{B,\delta }^+(x)=\left\{ y\in U |l_{B,\delta }(x) =l_{B,\delta }(y)\right\} , \end{aligned}$$
    (4)
  2. (2)

    the downward graded information granule of x in terms of \(u_{B,\delta }(x)\) is denoted as \(G_{B,\delta }^-(x)\), which is formulated by:

    $$\begin{aligned} G_{B,\delta }^-(x)=\left\{ y\in U |u_{B,\delta }(x) =u_{B,\delta }(y)\right\} . \end{aligned}$$
    (5)

Clearly, \(\left\{ G_{B,\delta }^+(x) \mid x\in U\right\}\) and \(\left\{ G_{B,\delta }^-(x) \mid x\in U\right\}\) are families of graded information granules in the universe U in terms of the upward and downward unions of decision classes respectively. We write \(\left\{ G_{B,\delta }^+(x) \mid x\in U\right\} =\{X_1,\cdots , X_r\}\) and \(\left\{ G_{B,\delta }^-(x) \mid x\in U\right\} =\{Y_1,\cdots ,Y_r\}\), where \(X_i\) (resp. \(Y_i\)) collects the objects whose upward (resp. downward) generalized decision equals i, \(\bigcup _{i=1}^r X_i=\bigcup _{i=1}^r Y_i=U\), and r is the number of decision classes. For simplicity, upward and downward graded information granules are denoted as \(G_{B,\delta }^+\) and \(G_{B,\delta }^-\).

For any upward and downward graded information granules \(G_{B,\delta }^+\) and \(G_{B,\delta }^-\) under attribute subset B, \(l_{B,\delta }(x)\) and \(u_{B,\delta }(x)\) reflect the level of the dominance principle for x, while \(|G_{B,\delta }^+(x) |\) and \(|G_{B,\delta }^-(x) |\) describe the extent to which x is represented by \(G_{B,\delta }^+(x)\) and \(G_{B,\delta }^-(x)\). By considering the above two factors, we define the importance degrees of \(G_{B,\delta }^+\) and \(G_{B,\delta }^-\), denoted as \(I_{B,\delta }^+\) and \(I_{B,\delta }^-\) respectively. The specific constructions are proposed in the following Definition 6.

Definition 6

Let \(S=\{U,A\cup \{d\},V,f\}\) be a DT, \(B\subseteq A\) and \(V_d=\{1,\cdots ,r\}\), \(G_{B,\delta }^+=\{X_1,\cdots ,X_r\}\) and \(G_{B,\delta }^-=\{Y_1,\cdots ,Y_r\}\) are upward and downward graded information granules derived from \(\delta _B^+\) and \(\delta _B^-\), then the importance degrees \(I_{B,\delta }^+\) and \(I_{B,\delta }^-\) are constructed as follows.

$$\begin{aligned} I_{B,\delta }^+&=\sum \limits _{i=1}^r i \times \frac{|X_i |}{ |U |},\nonumber \\ I_{B,\delta }^-&=\sum \limits _{i=1}^r i \times \frac{|Y_i |}{ |U |}. \end{aligned}$$
(6)
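A minimal sketch of Eq. (6) in Python is given below; the helper name and the sample vector of generalized decisions are our own illustration (the sample values coincide with those that will arise for \(\delta =0.5\) in Examples 2 and 3).

```python
from collections import Counter

def importance_degree(l):
    """I^+ = sum_i i * |X_i| / |U| (Eq. 6), where X_i collects the objects
    whose upward generalized decision equals i."""
    counts = Counter(int(v) for v in l)    # |X_i| for each occurring level i
    return sum(i * c for i, c in counts.items()) / len(l)

# Twelve objects at level 2 and three at level 1 give 27/15 = 1.8.
print(importance_degree([2] * 12 + [1] * 3))
```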

\(I_{B,\delta }^+\) and \(I_{B,\delta }^-\) quantify the importance of upward and downward graded information granules derived from lower approximations of QDNRSs. Besides, the importance of attributes can also be revealed by approximating qualities, presented as follows (Xu 2013).

Definition 7

(Xu 2013) Let \(S=(U,A\cup \{d\},V,f)\) be a DT, \(\forall B\subseteq A\), and let \(\{Cl_i^\geqslant |i=1,\cdots ,r\}\) and \(\{Cl_i^\leqslant |i=1,\cdots ,r\}\) be the upward and downward unions of decision classes derived from the decision attribute d. Then the approximating qualities of B with respect to \(\{Cl_i^\geqslant |i=1,\cdots ,r\}\) and \(\{Cl_i^\leqslant |i=1,\cdots ,r\}\) are defined by \(\gamma _B(Cl^\geqslant )\) and \(\gamma _B(Cl^\leqslant )\), respectively, i.e.,

$$\begin{aligned} \gamma _B(Cl^\geqslant )&=\frac{\sum \limits _{i=1}^r |\underline{R_B}(Cl_i^\geqslant ) |}{\sum \limits _{i=1}^r |Cl_i^\geqslant |},\nonumber \\ \gamma _B(Cl^\leqslant )&=\frac{\sum \limits _{i=1}^r |\underline{R_B}(Cl_i^\leqslant ) |}{\sum \limits _{i=1}^r |Cl_i^\leqslant |}. \end{aligned}$$
(7)

Generally, upward approximating qualities of QDNRSs can be formulated by replacing \(\gamma _B(Cl^\geqslant )\) and \(\underline{R_B}(Cl_i^\geqslant )\) with \(\gamma _{B,\delta }(Cl^\geqslant )\) and \(\underline{\delta _B^+}(Cl_i^\geqslant )\) when considering the dominance principle between dominance classes and the upward union of decisions. Downward approximating qualities \(\gamma _{B,\delta }(Cl^\leqslant )\) in QDNRSs can be defined similarly.

Without loss of generality, in what follows, basic theories related to QDNRSs are discussed only for the upward union of decision classes. To address the uncertainty caused by the graded dominance principle in QDNRSs, we analyze the interrelationship between \(I_{B,\delta }^+\) and \(\gamma _{B,\delta }(Cl^\geqslant )\) from the perspectives of graded information granules and lower approximations by the following theorem.

Theorem 2

Let \(S=(U,A\cup \{d\},V,f)\) be a DT and \(B\subseteq A\); then,

$$\begin{aligned} \gamma _{B,\delta }(Cl^\geqslant )= |U |\times \frac{I_{B,\delta }^+}{\sum \limits _{i=1}^r{ |Cl_i^\geqslant |}}. \end{aligned}$$
(8)

Proof

Assume the graded information granule with respect to the upward union of decision classes is \(G_{B,\delta }^+=\{X_1,\cdots ,X_r\}\). Based on properties of lower approximations, if \(x\in \underline{\delta _B^+}(Cl_i^\geqslant )\), then \(x\in \underline{\delta _B^+}(Cl_j^\geqslant )\) for any \(j\leqslant i\); hence \(x\in \underline{\delta _B^+}(Cl_i^\geqslant )\) if and only if \(i\leqslant l_{B,\delta }(x)\), so each object x is counted exactly \(l_{B,\delta }(x)\) times in the sum below. Thus

$$\begin{aligned} \sum \limits _{i=1}^r |\underline{\delta _B^+}(Cl_i^\geqslant ) |&=\sum \limits _{i=1}^r |\{x\in U \mid l_{B,\delta }(x)\geqslant i\} |\\&=\sum \limits _{x\in U} l_{B,\delta }(x)\\&= |X_1 |\times 1+ |X_2 |\times 2+\cdots + |X_r |\times r\\&= |U |\times I_{B,\delta }^+. \end{aligned}$$

Then \(\gamma _{B,\delta }(Cl^\geqslant )=\frac{\sum \limits _{i=1}^r |\underline{\delta _B^+}(Cl_i^\geqslant )|}{\sum \limits _{i=1}^r{ |Cl_i^\geqslant |}} =\frac{|U |\times I_{B,\delta }^+}{\sum \limits _{i=1}^r{ |Cl_i^\geqslant |}}\). \(\square\)
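The identity just proved can also be checked numerically. The snippet below reuses upward_generalized_decisions from the earlier sketch on randomly generated, purely illustrative data.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((30, 4))                     # 30 objects, 4 criteria
d = rng.integers(1, 4, size=30)                # decision levels in {1, 2, 3}

l = upward_generalized_decisions(data, d, B=[0, 2], delta=0.7, k=10.0)
# sum_i |lower(Cl_i^>=)| counts each x once per i <= l(x), i.e., l(x) times:
lhs = sum(int((l >= i).sum()) for i in range(1, int(d.max()) + 1))
print(lhs == int(l.sum()))                     # equals |U| * I^+; Theorem 2 holds
```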

Approximating qualities of QDNRSs hold the property of monotonicity with respect to conditional attributes, i.e., if \(B\subseteq C\subseteq A\), then \(\gamma _{B,\delta }(Cl^\geqslant )\leqslant \gamma _{C,\delta }(Cl^\geqslant )\leqslant \gamma _{A,\delta }(Cl^\geqslant )\). Thus the following property can be obtained directly.

Theorem 3

For \(S=(U,A\cup \{d\},V,f)\) and \(C\subseteq B\subseteq A\), let \(G_{C,\delta }^+\), \(G_{B,\delta }^+\) be graded information granules and \(l_{B,\delta }(x)\) be the upward generalized decision of \(x\in U\); then

$$\begin{aligned} l_{C,\delta }(x)\leqslant l_{B,\delta }(x). \end{aligned}$$
(9)

Proof

It can be proved directly by combining Theorems 1 and 2. \(\square\)

Theorem 3 tells us that for any object x and attribute subset B in \(S=\{U,A\cup \{d\},V,f\}\), the upward generalized decision \(l_{B,\delta }(x)\) increases monotonically when a new attribute a is added to B.

Here we employ Table 1 in Example 1 to present graded information granules based on generalized decisions. Parameters are set as follows: \(A=\{a_1,a_2,\cdots ,a_5\}\), \(\delta =0.5\) and 0.7, and \(k=10\).

Example 2

Continued Example 1.

  • Based on Definition 4, \(l_{A,\delta }(x)\) for each object x is presented as follows:

    \(l_{A,0.5}(x_1)= l_{A,0.5}(x_2)=l_{A,0.5}(x_3)=l_{A,0.5}(x_5) =l_{A,0.5}(x_6)=l_{A,0.5}(x_7)=l_{A,0.5}(x_8)=l_{A,0.5}(x_9) =l_{A,0.5}(x_{10})=l_{A,0.5}(x_{11})=l_{A,0.5}(x_{12}) =l_{A,0.5}(x_{13})=2\),

    \(l_{A,0.5}(x_4)=l_{A,0.5}(x_{14})=l_{A,0.5}(x_{15})=1\).

    \(l_{A,0.7}(x_i)=2,\ \forall i=1,2, \cdots , 14\) and \(l_{A,0.7}(x_{15})=1\).

  • Based on Definition 5, graded information granules are computed by:

    \(G_{A,0.5}^+=\{X_1,X_2\}\), where \(X_1=\{x_4,x_{14},x_{15}\}\) and \(X_2=\{x_i |i=1,2,3,5,6, 7,8,9,10,11,12,13\}\).

    \(G_{A,0.7}^+=\{Y_1,Y_2\}\), where \(Y_1=\{x_{15}\}\) and \(Y_2=\{x_i |i=1,2, \cdots ,14\}\).

4 An efficient approach to attribute reductions based on graded information granules

Attribute reductions based on lower approximations aim at obtaining representative conditional attributes which keep the dominance principle invariant. Theorem 2 supplies a novel perspective on attribute reductions based on information granules. In this section, theories of attribute reductions based on graded information granules will be analyzed. Furthermore, an accelerated approach will be proposed to speed up the reduction process.

4.1 Theory on attribute reductions based on graded information granules

From the perspective of graded information granules, an attribute a is redundant when the construction of upward graded information granules is unaltered by deleting a. Thus we have the following definition.

Definition 8

Let \(S=\{U,A\cup \{d\},V,f\}\) be a DT; \(a\in A\) is a redundant element with respect to the upward graded information granules when \(G_{A\backslash \{a\},\delta }^+=G_{A,\delta }^+\).

On the basis of Definition 8, a reduct can be obtained. The specific definition is proposed as follows.

Definition 9

Let \(S=\{U,A\cup \{d\},V,f\}\) be a DT. \(B\subseteq A\) is a reduct of S in terms of upward graded information granules if, for any \(x\in U\), either of the following equivalent conditions holds.

  1. (1)

    \(l_{B,\delta }(x)= l_{A,\delta }(x)\) and for any \(B'\subset B\), \(l_{B',\delta }(x)\ne l_{B,\delta }(x)\),

  2. (2)

    \(G_{B,\delta }^+(x)=G_{A,\delta }^+(x)\) and for any \(B'\subset B\), \(G_{B',\delta }^+(x)\ne G_{B,\delta }^+(x)\).

If \(l_{B,\delta }(x)= l_{A,\delta }(x)\), then \(l_{B,\delta }(x)\) is called the optimal upward generalized decision for simplicity. Combined with Theorem 3, importance degrees satisfy monotonicity in terms of conditional attributes. The concrete conclusion is presented by the following theorem.

Theorem 4

Let \(S=(U,A\cup \{d\},V,f)\) be a DT, \(B\subseteq A\), \(G_{B,\delta }^+\) and \(G_{A,\delta }^+\) be upward graded information granules for attributes B and A respectively, then for the importance degrees \(I_{B,\delta }^+\) and \(I_{A,\delta }^+\), we have,

  1. (1)

    \(I_{B,\delta }^+\leqslant I_{A,\delta }^+\),

  2. (2)

    if B is a reduct in terms of the upward graded information granules, then \(I_{B,\delta }^+=I_{A,\delta }^+\) and for any \(B'\subset B\), \(I_{B,\delta }^+>I_{B',\delta }^+\).

Proof

Conclusions can be proved by combining Definition 9 and Theorem 3. \(\square\)

Definition 9 and Theorem 4 supply a way to acquire all the reducts by keeping the upward graded information granules invariant. However, this is time-consuming when facing high-dimensional data sets. Thus a representative reduct is meaningful, which not only possesses smaller cardinality but also keeps the upward graded information granules unaltered. To this end, we define the significance of an attribute as follows.

Definition 10

Let \(S=(U,A\cup \{d\},V,f)\) be a DT and \(B\subseteq A\), for any \(a\in A\backslash B\), the significance of a with respect to B and \(Cl^\geqslant\) can be defined as follows:

$$\begin{aligned} Sig_\delta ^+(a,B) =I_{B\cup \{a\},\delta }^+-I_{B,\delta }^+ . \end{aligned}$$
(10)

Based on Theorem 4 and Definition 9, \(B\subseteq A\) is a reduct of \(S=(U,A\cup \{d\},V,f)\) when \(Sig_\delta ^+(a,B)=0\) for every \(a\in A\backslash B\). Definition 10 provides the significance of attribute a when adding it into the given subset B. This search strategy is called forward attribute reduction. Similarly, the backward strategy can be defined when removing attributes; the corresponding significance of a is then defined as \(Sig_\delta ^+(a,B) =I_{B,\delta }^+-I_{B\backslash \{a\},\delta }^+\). Without loss of generality, we focus on the first search strategy.
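A minimal sketch of this forward strategy is given below, reusing upward_generalized_decisions and importance_degree from the earlier sketches; the greedy loop and stopping rule follow Eq. (10), while the function name and interface are our own.

```python
def forward_reduct(data, d, delta=0.7, k=10.0):
    """Greedy forward search: repeatedly add the attribute with the largest
    significance Sig^+ (Eq. 10) until I^+ reaches its value under all of A."""
    A = list(range(data.shape[1]))
    target = importance_degree(upward_generalized_decisions(data, d, A, delta, k))
    red = []
    current = importance_degree(upward_generalized_decisions(data, d, red, delta, k))
    while current < target:
        sig = {a: importance_degree(
                   upward_generalized_decisions(data, d, red + [a], delta, k)) - current
               for a in set(A) - set(red)}
        best = max(sig, key=sig.get)
        if sig[best] <= 0:                 # no single attribute raises I^+ further
            break
        red.append(best)
        current += sig[best]
    return red
```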

4.2 An efficient algorithm of attribute reductions based on the graded information granules of quantitative dominance-based neighborhood rough sets

In Subsect. 4.1, methods of attribute reduction for QDNRSs were studied based on graded information granules. However, in the forward search, objects need to be traversed again whenever a new attribute is added to the currently chosen ones. This is time-consuming, especially when facing large-scale data sets. Thus, in this subsection, we continue to analyze the accelerated process of attribute reductions in QDNRSs.

By Definition 9, if B is a reduct based on graded information granules, then \(\forall x\in U\), \(l_{B,\delta }(x)=l_{A,\delta }(x)\). That is to say, in the forward search strategy of attribute reduction restricted by Definition 10, objects that have already reached their optimal upward generalized decisions make no contribution to the significance of the selected attributes. Thus only objects whose upward generalized decisions may still vary need to be updated. In what follows, we focus on designing an efficient attribute reduction algorithm with the upward generalized decisions. The concrete procedure is designed as the following Algorithm 1.

Algorithm 1

An efficient approach to attribute reductions based on the upward graded information granules in QDNRSs.
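The pseudocode figure of Algorithm 1 is not reproduced here; the following Python sketch is our reconstruction of its core idea under the definitions above, reusing upward_neighborhood, upward_generalized_decisions and importance_degree from the earlier sketches. The prior information \(l_{A,\delta }\) is computed once, and only the objects collected in `pending` (the \(LOC_{obj}\) of Example 3) are re-traversed after each selection.

```python
import numpy as np

def accelerated_reduct(data, d, delta=0.7, k=10.0):
    """Forward reduction accelerated by the prior information l_{A,delta}:
    objects that already reached their optimal upward generalized decisions
    are skipped in later traversals (they cannot change, by Theorem 3)."""
    A = list(range(data.shape[1]))
    l_opt = upward_generalized_decisions(data, d, A, delta, k)  # prior l_{A,delta}
    red = []
    l_cur = upward_generalized_decisions(data, d, red, delta, k)
    pending = np.flatnonzero(l_cur < l_opt)    # objects not yet optimal
    while pending.size > 0 and len(red) < len(A):
        best_a, best_l = None, None
        for a in set(A) - set(red):
            l_try = l_cur.copy()
            for x in pending:                  # traverse only the pending objects
                nb = upward_neighborhood(data, red + [a], x, delta, k)
                l_try[x] = d[nb].min()         # updated l_{B∪{a},delta}(x)
            # maximizing sum(l_try) maximizes I^+, since I^+ = sum(l) / |U|
            if best_l is None or l_try.sum() > best_l.sum():
                best_a, best_l = a, l_try
        red.append(best_a)
        l_cur = best_l
        pending = np.flatnonzero(l_cur < l_opt)
    return red
```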

The prior information \(l_{A,\delta }\) is added in the fourth line of Algorithm 1; thus, in the forward search strategy, the upward generalized decision of each object x approaches the optimal one as attributes are added to the selected attribute subset. Hence fewer and fewer objects need to be traversed when new attributes are added, and more and more objects reach their optimal upward generalized decisions in the process. This makes Algorithm 1 efficient in acquiring the reduct set. In what follows, we use an example based on Table 1 to illustrate the efficiency of our algorithm.

Example 3

Continued Example 1 and Example 2. Here processes of attribute reductions based on Algorithm 1 are evaluated when \(\delta =0.5\) and 0.7 respectively.

  1. (1)

    Attribute reductions of QDNRSs with \(\delta =0.5\).

    Step 1: \(l_{A,0.5}=(2,2,2,1,2,2,2,2,2,2,2,2,2,1,1)\) and \(I^+_{A,0.5}=\frac{27}{15}\).

    Step 2: \(I^+_{\{a_1\},0.5}=\frac{15}{15},\ I^+_{\{a_2\},0.5}=\frac{27}{15},\ I^+_{\{a_3\},0.5}=\frac{17}{15},\ I^+_{\{a_4\},0.5}=\frac{15}{15},\ I^+_{\{a_5\},0.5}=\frac{18}{15}\),

    \(Red\leftarrow \{a_2\}\),

    \(I^+_{\{a_2\},0.5}=I^+_{A,0.5}\).

    Step 3: Because \(I^+_{A,0.5}=I^+_{\{a_2\},0.5}\), we obtain Red=\(\{a_2\}\).

  2. (2)

    Attribute reductions of QDNRSs with \(\delta =0.7\).

    Step 1: \(l_{A,0.7}=(2,2,2,2, 2, 2,2,2, 2, 2,2,2, 2, 2, 1)\) and \(I^+_{A,0.7}=\frac{29}{15}\).

    Step 2: \(I^+_{\{a_1\},0.7}=\frac{25}{15}, I^+_{\{a_2\},0.7}=\frac{28}{15},I^+_{\{a_3\},0.7}= \frac{18}{15},I^+_{\{a_4\},0.7}= \frac{24}{15},I^+_{\{a_5\},0.7}= \frac{18}{15}\),

    \(Red\leftarrow \{a_2\}\),

    \(l_{\{a_2\},0.7}=(2,2,2,1,2,2,2,2,2,2,2,2,2,2,1)\).

    Step 3: Identify the objects whose upward generalized decisions have not reached those under A, denoted as \(LOC_{obj}=\{x_4\}\).

    Step 4: \(I^+_{\{a_2,a_1\},0.7}=\frac{28}{15}, I^+_{\{a_2,a_3\},0.7}=\frac{28}{15},I^+_{\{a_2,a_4\},0.7} =\frac{29}{15},I^+_{\{a_2,a_5\},0.7}=\frac{28}{15}\).

    Thus \(Red\leftarrow Red\cup \{a_4\}\).

    Step 5: Red=\(\{a_2,a_4\}\).

In Example 3, QDNRSs degenerate to DRSA when \(\delta =0.5\), and from the above results we obtain that QDNRSs improve the approximating quality. Moreover, when \(\delta =0.7\), after the first attribute \(a_2\) is selected into the reduct, only \(x_{4}\) needs to be traversed in the next iteration, which accelerates the process of attribute reduction effectively.

In this section, we focused on the upward graded information granules based on upward generalized decisions. On this basis, methods of attribute reduction for QDNRSs were studied in detail, both in theoretical analysis and in algorithm design. In the next section we use some public data sets to evaluate the performance of our algorithms.

5 Experimental analysis

In this section, experiments are conducted with the aim of evaluating the efficiency of Algorithm 1. We record both the running time of attribute reductions and the structures of upward graded information granules under varying parameters on each data set. The precision of reduct sets is also evaluated from the perspectives of rough set theory and machine learning. The experiments are conducted in Matlab (R2019B) on a computer with 8.0G of memory and an Intel Core i5-8500 CPU (3.0GHz, 8 cores).

5.1 Description on data sets

In this subsection, data sets from the UCI machine learning repository and the Knowledge Extraction based on Evolutionary Learning (KEEL) repository are used to evaluate the efficiency of Algorithm 1. The information of the data sets is presented in Table 2.

Table 2 Description of data information

In Table 2, No. of instances, No. of attributes and No. of classes denote the numbers of instances, attributes and classes respectively. Marketing, Thyroid and Lym come from KEEL and UCI; the other data sets in Table 2 originate from UCI. For the data set Marketing, the total and effective numbers of instances are 8993 and 6876 respectively. Similarly, missing values exist in the data set Cleveland, for which the number of instances without missing values is 297. The attribute category of Marketing is integer; Thyroid and Absenteeism mix real and integer attributes; Cleveland, Winequality-W and Leaf are real. The attribute values of Lym are categorical, those of Australian mix categorical, integer and real, and those of Dermatology are categorical and integer. Regarding the application of the data sets in Table 2, Marketing, Thyroid, Cleveland, Australian, Dermatology and Leaf are used for classification, while the other data sets are used for classification, clustering analysis and regression tasks.

Among these data sets, Absenteeism (Yang et al. 2023), Winequality-W (Yang et al. 2021, 2023; Sang et al. 2021), Australian (Yang et al. 2021, 2023; Sang et al. 2021a, b; Du and Hu 2016) and Dermatology (Yang et al. 2023; Sang et al. 2021) are widely applied in DRSA. Thyroid is used in classification (Ganivada et al. 2011). The other data sets selected in this paper perform well in the experimental analysis.

5.2 Interpretation of accelerations for attribute reductions based on the upward generalized decisions

To reflect the efficiency of attribute reductions based on upward generalized decisions, we record the running time and compare it with that of the original lower approximations of QDNRSs, the fuzzy preference-based rough set (Hu et al. 2010) and the efficient computation of approximations by the dominance feature matrix (Wang et al. 2019, 2020). For simplicity, the method based on upward generalized decisions and the compared ones are abbreviated as UGD, LA-QDNRS, FPRS and DFM respectively. In the experiment, k, which controls the strength of fuzzy preference relations, changes from 2 to 20 in steps of 2, and \(\delta\) varies from 0.5 to 0.9 in steps of 0.1. In particular, QDNRS degenerates to DRSA when \(\delta =0.5\). In the following Table 3, we present the comparison of the running time of the above methods; LA stands for LA-QDNRS for clarity.

Table 3 Time consuming of acquiring reducts (seconds)

Each value in the second to fifth columns of Table 3 is the mean running time of FPRS, UGD, LA-QDNRS and DFM with \(\delta =0.5\) when k changes from 2 to 20 in steps of 2. Values in the sixth to eighth columns of Table 3 are the mean execution times of UGD, LA-QDNRS and DFM when \(\delta\) and k vary from 0.5 to 0.9 and from 2 to 20 respectively.

Obviously, UGD and DFM accelerate the process of attribute reductions compared with FPRS and LA-QDNRS, as shown by the second to fifth columns of Table 3. As concluded from the last three columns of Table 3, UGD performs remarkably well when \(\delta\) and k change. To further illustrate the superiority of UGD, in what follows we display the running time with varying data size by comparing UGD with the other methods when setting \(\delta =0.7\) and \(k=10\). The specific information is shown in the following Fig. 1.

Fig. 1

Running time of attribute reductions based on UGD, LA-QDNRS, FPRS and DFM

From Fig. 1 we observe that attribute reductions of QDNRS improve the time efficiency: for both LA-QDNRS and UGD, the setting \(\delta =0.7,\ k=10\) costs less time than \(\delta =0.5\). When \(\delta =0.5\), UGD accelerates the process of attribute reductions dramatically compared with LA-QDNRS, and a similar tendency is shown when \(\delta =0.7\) and \(k=10\). The acceleration in Fig. 1 further illustrates the superiority of Algorithm 1.

To further explain the accelerated process of UGD, we record the results in the following Fig. 2. In Fig. 2, “Ratios of optimal decision classes” denotes the ratios of objects that obtain their optimal upward generalized decisions under the selected attributes of a reduct.

Fig. 2

Accelerated process of attribute reductions with \(\delta =0.5\) and 0.7

From Fig. 2, we observe that for the data set Absenteeism, after the first attribute is selected into the reduct, the ratios of objects with the optimal upward generalized decisions are about 10% and 60% for \(\delta =0.5\) and 0.7 respectively, which means these objects have no impact on the selection of the next attribute. Meanwhile, the ratios of objects with the optimal upward generalized decisions exceed 50% for the data sets Australian, Marketing, Cleveland, Lym and Thyroid as \(\delta\) varies with \(k=10\).

5.3 Statistical testing on the superiority of the attribute reduction method by upward generalized decisions

The paired t-test is a statistical approach to test whether the mean difference between pairs of measurements is significant. The main point of this paper is the accelerated process of obtaining the reduct set by UGD, which is emphasized by comparing the running time of attribute reductions with other methodologies in DRSA. The compared results are acquired from the same data sets, so the mean execution times can be tested by the paired t-test to stress the superiority of our method. The testing process consists of the following four steps, similar to that in Ahmad et al. (2020).

Step 1. Choosing null and alternative hypothesis.

Based on Table 3, it takes more time to obtain the reduct set by LA-QDNRS and FPRS than by DFM. Thus the difference between UGD and DFM needs to be checked first. Specifically, the null and alternative hypotheses are assumed as follows:

Null hypothesis \(H_0\): \(\mu _0=\mu _1\).

Alternative hypothesis \(H_1\): \(\mu _0<\mu _1\).

where \(\mu _0\) and \(\mu _1\) denote the mean execution times of UGD and DFM respectively. \(H_0\) means that there is no difference between the mean running times of attribute reductions by UGD and DFM. \(H_1\) hypothesizes that UGD takes less time than DFM.

Step 2. Constructing test statistics

In paired t-test, t-statistic t is constructed by the Eq. (11).

$$\begin{aligned} t=\frac{\overline{d}}{SE(d)}. \end{aligned}$$
(11)

In Eq. (11), \(\overline{d}\) and SE(d) are mean difference and standard error of mean difference respectively. Concretely, SE(d) is computed by the following formula:

$$\begin{aligned} SE(d)=\frac{S(d)}{\sqrt{n}}. \end{aligned}$$
(12)

where S(d) and n are the standard deviation and the sample size respectively. The statistic in Eq. (11) follows a t-distribution with \(n-1\) degrees of freedom. In this paper, n denotes the number of data sets.
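For reference, the paired t-test of Eqs. (11) and (12) can be carried out in a few lines of Python; the running-time vectors below are placeholders, not the paper's measurements.

```python
import numpy as np
from scipy import stats

# Hypothetical per-dataset mean running times in seconds (placeholders).
ugd = np.array([1.2, 0.8, 3.5, 0.4, 2.1, 5.0, 0.9, 1.7, 2.6])
dfm = np.array([1.5, 1.1, 3.9, 0.6, 2.2, 6.4, 1.0, 2.3, 3.1])

diff = ugd - dfm                                             # paired differences
t = diff.mean() / (diff.std(ddof=1) / np.sqrt(len(diff)))    # Eqs. (11)-(12)
p = stats.t.cdf(t, df=len(diff) - 1)                         # one-sided H1: mu_0 < mu_1
print(t, p)
# Equivalently: stats.ttest_rel(ugd, dfm, alternative='less')
```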

Step 3. Setting level of significance

The level of significance \(\alpha\) measures the probability of rejecting the null hypothesis when it is true and is commonly set to 0.05 in practice.

Step 4. Analyzing experimental results

\(t_{\frac{\alpha }{2}}(n-1)=t_{0.025}(8)=2.306\) from the t-distribution table. Combining with Table 3, the t-values under DRSA and QDNRS are computed as 1.2168 and 1.6295 respectively, which supports our claim that the mean running time of the proposed method is less than that of DFM. Together with Step 1, the proposed UGD method is more efficient than the other ones, including LA-QDNRS.

5.4 Efficiency of reduct sets from perspectives of rough sets and machine learning

In this subsection, the reduct sets obtained by QDNRS are evaluated from the perspectives of rough sets and machine learning. Specifically, the importance degrees and approximating qualities of reduct sets are compared in Table 4. In Table 4, importance degrees and approximating qualities for QDNRS are computed as the mean values over each k and \(\delta\), and all the values for FPRS are computed as the mean values over k.

Table 4 Importance degrees and approximating qualities on reduct set

Clearly, the higher the values of the importance degree and approximating quality, the more consistent the decision rules will be, which can be elaborated by Definitions 2, 6 and 7. From Table 4 we conclude that QDNRS improves the consistency of ordered information systems dramatically for all the data sets. Particularly notable effectiveness is shown on the data set Marketing, for which the degrees by QDNRS are twice as high as those of the other methods. This conclusion further validates the motivation of our proposed QDNRS, which aims at improving the consistency between criteria and decisions by filtering out pairs of objects without apparent dominance relations.

In the following Table 5, the mean cardinalities of the reduct sets obtained by QDNRS are compared with those of DRSA and FPRS.

Table 5 Mean cardinalities of reduct sets

From Table 5 we observe that fewer attributes are selected by QDNRS than by the other methods, while the majority of attributes are retained by FPRS. In particular, there is no difference between the reduct sets and the original attribute sets for Winequality-W and Marketing. Also, for the data sets Australian, Cleveland, Lym and Thyroid, the difference in the cardinalities of the reduct sets obtained by DRSA and FPRS is negligible.

To reveal the difference of graded information granules with \(\delta =0.5\) and 0.7, we display the distribution of objects with respect to the optimal upward generalized decisions. The specific information is shown by Fig. 3.

Fig. 3

Distributions of upward graded information granules with \(\delta =0.5\) and 0.7. Blue bars represent the cardinalities of upward graded information granules with \(\delta =0.5\) and red ones denote the results when \(\delta =0.7\)

From Fig. 3 we conclude that more objects are partitioned into the upward graded information granules with higher levels when \(\delta =0.7\) than when \(\delta =0.5\). Combined with Theorem 2, the results imply that QDNRSs improve approximating qualities compared with DRSA.

To reflect the effectiveness of QDNRS from the perspective of machine learning, we compare the classification precision by the mean-squared error (MSE) through SVM and KNN. The parameter \(k_0\) is set to 3 in KNN. All the results are obtained by the 10-fold cross-validation method. The comparison of MSE with SVM and KNN is presented in the following Tables 6 and 7.

Table 6 Mean MSE by SVM
Table 7 Mean MSE by KNN (\(k_0\) is set 3)
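A minimal sketch of how such MSE values can be produced is given below (assuming scikit-learn); `X`, `y` and `reduct` are placeholders for the attribute matrix, class labels and the reduct from Algorithm 1, not the paper's actual pipeline.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_predict

def mean_mse(X, y, cols, model):
    """MSE of 10-fold cross-validated class predictions restricted to `cols`."""
    pred = cross_val_predict(model, X[:, cols], y, cv=10)
    return np.mean((pred - y) ** 2)

# mse_svm = mean_mse(X, y, reduct, SVC())
# mse_knn = mean_mse(X, y, reduct, KNeighborsClassifier(n_neighbors=3))  # k_0 = 3
```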

In Tables 6 and 7, the second to fourth columns denote the mean MSE by SVM and KNN on the reduct sets, and the last three columns record the mean MSE on the first selected attribute of each reduct set. The lower the MSE in Tables 6 and 7, the better the model performs on classification. Combining with Table 5, we observe that QDNRS performs worse when comparing the mean MSE with DRSA and FPRS; in particular, there is a big difference in the mean MSE between QDNRS and the other models when the difference in cardinality is large. Take the data set Dermatology as an example: the mean MSE by SVM is 0.3507 for QDNRS, which is much higher than that of DRSA and FPRS, yet the cardinality of the reduct set by QDNRS is less than one sixth of the original attributes. Similar conclusions are revealed by Cleveland and Lym in Table 6. Still, for Australian and Thyroid, the difference in the mean MSE is particularly subtle for both SVM and KNN, while the cardinality of the reduct by QDNRS is small, which further saves much time in obtaining the reduct set. Meanwhile, for the data sets Absenteeism, Winequality-W, Dermatology, Leaf and Lym, the first selected attribute in the reduct set of QDNRS performs well under the mean MSE by SVM compared with FPRS; similar behavior is shown on the data sets Absenteeism, Dermatology, Marketing, Cleveland, Leaf and Thyroid by KNN.

6 Conclusion

In this paper, upward and downward generalized decisions for quantitative dominance-based neighborhood rough sets were constructed from the perspective of decision rules. On this basis, upward and downward graded information granules were formulated, each of which partitions the universe. Moreover, some theoretical properties and importance degrees were studied. Furthermore, an accelerated attribute reduction algorithm was designed based on the upward generalized decisions; in this algorithm, only objects that have not reached their optimal upward generalized decisions are traversed in the subsequent attribute selection process, which saves substantial running time as more objects reach their optimal generalized decisions during the forward search. Finally, numerical experimental results showed that updating the upward generalized decisions accelerates the process of attribute reductions to a large extent, which was evaluated by the paired t-test. Also, the reduct sets obtained by our proposed method performed well in terms of precision from the perspectives of both rough set theory and machine learning.