1 Introduction and preliminaries

An interval number (Moore 1966) \({\mathbb {a}}\) is a pair \(\langle {\dot{a}},\ddot{a}\rangle \in R^2\) with \({\dot{a}}\le \ddot{a}\), where R is the set of all real numbers with the ordinary order. In this paper, we write \({\mathbb {a}}=\langle a,a\rangle \) as \(a^\shortmid \) and identify it with the corresponding real number a (interval numbers can therefore be regarded as an extension of real numbers), and we call the number \(\langle {\dot{a}}, \ddot{a}\rangle ^c=0.5({\dot{a}}+\ddot{a})\) (resp., \(\langle {\dot{a}}, \ddot{a}\rangle ^r=0.5(\ddot{a}-{\dot{a}})\)) the center or midpoint (resp., the radius) of the interval number \(\langle {\dot{a}}, \ddot{a}\rangle \) (thus \(\langle {\dot{a}}, \ddot{a}\rangle \) can also be written as \(\lceil \langle {\dot{a}}, \ddot{a}\rangle ^c, \langle {\dot{a}}, \ddot{a}\rangle ^r\rceil \) or \(\langle {\dot{a}}, \ddot{a}\rangle ^c\pm \langle {\dot{a}}, \ddot{a}\rangle ^r\)). We use \({{\mathbb {R}}}\) to denote the set of all interval numbers. We define \(a\vee b=\max \{a,b\}\), \(a\wedge b=\min \{a,b\}\), and \({\downharpoonleft }a,b{\downharpoonright } ={\langle a\wedge b, a\vee b\rangle }\) for any \(a,b\in R\). The four elementary operations on R have also been extended to \({{\mathbb {R}}}\) (where \({\mathbb {a}}= \langle {\dot{a}},\ddot{a}\rangle ,{\mathbb {b}}=\langle {\dot{b}},\ddot{b}\rangle \in {{\mathbb {R}}}\), see Moore 1966): \(\langle {\dot{a}},\ddot{a}\rangle \oplus \langle {\dot{b}},\ddot{b}\rangle = \langle {\dot{a}}+{\dot{b}},\ddot{a}+\ddot{b}\rangle \), \(\langle {\dot{a}},\ddot{a}\rangle \ominus \langle {\dot{b}},\ddot{b}\rangle ={\downharpoonleft }\ {\dot{a}}-{\dot{b}},\ddot{a}-\ddot{b}\ {\downharpoonright }\), \(\langle {\dot{a}},\ddot{a}\rangle \odot \langle {\dot{b}},\ddot{b}\rangle ={\downharpoonleft }\ {\dot{a}}{\dot{b}},\ddot{a}\ddot{b}\ {\downharpoonright }\), and \(\langle {\dot{a}},\ddot{a}\rangle \oslash \langle {\dot{b}},\ddot{b}\rangle = {\downharpoonleft } \frac{{\dot{a}}}{{\dot{b}}}, \frac{\ddot{a}}{\ddot{b}} {\downharpoonright } \ (0\notin [{\dot{b}},\ddot{b}])\). Moreover, a total order (see Hu and Wang 2006; Xu and Yager 2006) \(\leqslant \) can be defined on \({{\mathbb {R}}}\) by putting

$$\begin{aligned} \langle {\dot{a}},\ddot{a}\rangle \leqslant \langle {\dot{b}},\ddot{b}\rangle \ \Longleftrightarrow \ \langle {\dot{a}},\ddot{a}\rangle ^c<\langle {\dot{b}},\ddot{b}\rangle ^c, \ \text {or} \ \langle {\dot{a}} ,\ddot{a} \rangle ^c=\langle {\dot{b}} ,\ddot{b} \rangle ^c \ \text {but} \ \ddot{a} -{\dot{a}} \ge \ddot{b} -{\dot{b}}. \end{aligned}$$
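For readers who want to experiment with these preliminaries numerically, the following Python sketch (an illustration of our own, not part of the original development; the class name `Interval` is a hypothetical choice) encodes the midpoint, the radius, the four operations \(\oplus ,\ominus ,\odot ,\oslash \) exactly as defined above (the product and quotient use only the endpoint pairs, as in the displayed formulas), and the total order \(\leqslant \).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float  # left endpoint  (a-dot)
    hi: float  # right endpoint (a-double-dot)

    @property
    def center(self) -> float:          # <a>^c = 0.5(lo + hi)
        return 0.5 * (self.lo + self.hi)

    @property
    def radius(self) -> float:          # <a>^r = 0.5(hi - lo)
        return 0.5 * (self.hi - self.lo)

    def __add__(self, other):           # a (+) b: endpoint-wise sum
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):           # a (-) b: endpoint differences, reordered
        p, q = self.lo - other.lo, self.hi - other.hi
        return Interval(min(p, q), max(p, q))

    def __mul__(self, other):           # a (.) b: endpoint products, reordered
        p, q = self.lo * other.lo, self.hi * other.hi
        return Interval(min(p, q), max(p, q))

    def __truediv__(self, other):       # a (/) b, assuming 0 is not in [b.lo, b.hi]
        p, q = self.lo / other.lo, self.hi / other.hi
        return Interval(min(p, q), max(p, q))

    def __le__(self, other):            # the total order defined above
        if self.center != other.center:
            return self.center < other.center
        return (self.hi - self.lo) >= (other.hi - other.lo)


# e.g. Interval(1, 3) * Interval(2, 4) == Interval(2, 12), and Interval(0, 2) <= Interval(1, 3)
```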

Interval data arise in many settings (such as numerical analysis for managing rounding errors, computer-assisted proofs, global optimization, and, in particular, the modeling of uncertainty) because the data involved cannot be expressed exactly as real numbers but can be recorded as interval numbers. For instance, consider the following problems with interval numbers as input and output (briefly, interval input–output) sample sets [for more details, see also Table 4 in Inuiguchi and Mizoshita (2012)]:

  1. (I)

    The pattern recognition problem with interval samples. Astragali Radix is a plant used both as medicine and as food that can regulate the body’s immune function and is of particular value for critically ill patients. Usually, Astragali Radixes are classified (based on test and measurement data) into 5 grades: 1 (the lowest grade), \(\ldots \), 5 (the highest grade). Table 1 (taken from Zhang et al. 2020) lists some useful samples, where \({{\mathbb {x}}}_1\) stands for the length (cm) of Astragali Radix, \({{\mathbb {x}}}_2\) for the head diameter (cm), \({{\mathbb {x}}}_3\) for the tail diameter (cm), and \({{\mathbb {y}}}\) for the grade. For a given Astragali Radix \({\pmb {\mathbb {t}}}_0= (35.8\pm 5.4, 1.7\pm 0.2, 0.8\pm 0.1)\) (whose grade cannot be determined immediately from the Pharmacopoeia of the People’s Republic of China 1992), try to determine its grade.

  2. (II)

    A control problem involving interval samples:

    $$\begin{aligned} \frac{\mathrm{d}{\pmb x}}{\mathrm{d}t}= f({\pmb x})+{\pmb u}(t) \end{aligned}$$
    (1)

where \({\pmb x}(t)=(x_1(t), x_2(t))^\mathrm{T}\in U\) (a compact set in \(R^2\)) represents the state at time t, \(f({\pmb x})\) is a binary continuous function on U which is only observable (i.e., for each \({\pmb x}\) we can obtain an approximate value, in general an interval number), and \({\pmb u}\) is the controller to be designed. Just as a decimal is replaced by an integer (so-called rounding up or down), practitioners can replace each interval number \({\mathbb {a}}\) in a sample set by its center \({\mathbb {a}}^c\) and then process the new sample using known methods. However, this is an unsatisfactory method because of the loss of information. Another way to deal with such a practical problem is to use a real input-interval output model (i.e. real number input-interval number output, also called crisp input-interval output); for more details, see (Hwang et al. 2006; Ishibuchi and Tanaka 1990; Jeng et al. 2003; Lee and Tanaka 1999) and the following Example 1.1 and Remark 1.2.

Table 1 Sample information of Astragali Radix

Example 1.1

Consider a real input–output (i.e. real number input-real number output) sample (an n-element set) \(S=\{(x_{1},y_{1}),\) \((x_{2},y_{2}),\ldots ,\) \( (x_{n},y_{n})\}\) from a continuous function f(x). Without loss of generality we assume \(S=\{(-1,0),\) \((0,1), (1,0)\}\) (a 3-element set). Firstly, we get the linear interpolation \(f_{1,3,0}(x)\) of f based on S

$$\begin{aligned} f_{1,3,0}(x)=\left\{ \begin{array}{ll} 1+x, &{} x\in [-1,0), \\ 1-x, &{} x\in [0,1]. \end{array} \right. \end{aligned}$$

Secondly, we get the first real input-interval output model \(f_{1,3,\varepsilon }(x)=\langle f^-_{1,3,\varepsilon }(x),f^+_{1,3,\varepsilon }(x)\rangle \) \(= f_{1,3,0}(x)\pm \varepsilon \); clearly, \(f^-_{1,3,\varepsilon }(x)\) \(=f_{1,3,0}(x)-\varepsilon \) and \(f^+_{1,3,\varepsilon }(x)=f_{1,3,0}(x)+\varepsilon \) are continuous. Similarly, we get \(f_{1,n,0}(x)\) and \(f_{1,n,\varepsilon }(x)=\langle f^-_{1,n,\varepsilon }(x),f^+_{1,n,\varepsilon }(x)\rangle \) \(= f_{1,n,0}(x)\pm \varepsilon \) \((n>3, \varepsilon >0)\). Noticing that \(f_{1,n,0}(x_i)\in f_{1,n,\varepsilon }(x_i)\) \((i=1,2,\ldots , n)\), we have \(\lim \limits _{\varepsilon \rightarrow 0} f_{1,n,\varepsilon }(x_i) =\langle f_{1,n,0}(x_i),f_{1,n,0}(x_i)\rangle \) \(=f_{1,n,0}(x_i)\) \((i=1,2,\ldots , n)\), and \(\lim \limits _{n\rightarrow +\infty } f_{1,n,\varepsilon }(x)= \langle f_{1,n,0}(x),f_{1,n,0}(x)\rangle \) \(=f_{1,n,0}(x) \ \ (x\in [a,b])\) if \(a=x_1<x_2<\cdots <x_n=b\) is an equal-length partition. Thirdly, we get the second real input-interval output model \(f_{2,3,\varepsilon }(x) =\langle f^-_{2,3,\varepsilon }(x),f^+_{2,3,\varepsilon }(x)\rangle (\varepsilon >0)\), where

$$\begin{aligned} f^-_{2,3,\varepsilon }(x)=\left\{ \begin{array}{ll} 1+x-\varepsilon , &{} x\in [-1,-0.5\varepsilon ), \\ \sqrt{0.5\varepsilon ^2-x^2}+1-2\varepsilon , &{} x\in [-0.5\varepsilon , 0.5\varepsilon ), \\ 1-x-\varepsilon , &{} x\in [0.5\varepsilon ,1] \end{array} \right. \end{aligned}$$

and

$$\begin{aligned} f^+_{2,3,\varepsilon }(x)=\left\{ \begin{array}{ll} 1+x+\varepsilon , &{} x\in [-1,-0.5\varepsilon ), \\ \sqrt{0.5\varepsilon ^2-x^2}+1, &{} x\in [-0.5\varepsilon , 0.5\varepsilon ), \\ 1-x+\varepsilon , &{} x\in [0.5\varepsilon ,1] \end{array} \right. \end{aligned}$$

are clearly differentiable. Similarly, we get \(f_{2,n,\varepsilon }(x) =\langle f^-_{2,n,\varepsilon }(x),f^+_{2,n,\varepsilon }(x)\rangle \ \ (n>3)\). Noticing that \(f_{1,n,0}(x_i)\in f_{2,n,\varepsilon }(x_i)\) \((i=1,2,\ldots , n)\), we have \(\lim \limits _{\varepsilon \rightarrow 0} f_{2,n,\varepsilon }(x_i)\) \( =f_{2,n,0}(x_i)\ \ (i=1,2,\ldots , n)\), and \(\lim \limits _{n\rightarrow +\infty } f_{2,n,\varepsilon }(x)=f_{2,n,0}(x) \ \ (x\in [a,b])\) if \(a=x_1<x_2<\cdots <x_n=b\) is an equal-length partition.
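Numerically, the first model in Example 1.1 amounts to piecewise-linear interpolation of the real sample followed by widening the result by \(\varepsilon \). The sketch below is our own illustration (the helper name `f_interval_band` is hypothetical); `np.interp` plays the role of \(f_{1,n,0}\).

```python
import numpy as np

def f_interval_band(xs, ys, x, eps):
    """Real input-interval output model f_{1,n,eps}(x) = f_{1,n,0}(x) +/- eps,
    where f_{1,n,0} is the piecewise-linear interpolant of the sample (xs, ys)."""
    center = np.interp(x, xs, ys)          # f_{1,n,0}(x)
    return center - eps, center + eps      # the pair <f^-(x), f^+(x)>

# The 3-point sample S = {(-1,0), (0,1), (1,0)} of Example 1.1:
xs, ys = np.array([-1.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0])
print(f_interval_band(xs, ys, 0.5, eps=0.1))   # (0.4, 0.6), since f_{1,3,0}(0.5) = 0.5
```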

Remark 1.2

An interval linear regression model for a real input–output sample can also be written as \(Y({\pmb x})=A_0\oplus (A_1\otimes x_1)\oplus (A_2\otimes x_2)\oplus \cdots \oplus (A_n\otimes x_n)={\pmb A}{\pmb x},\) where \({\pmb x}=(1, x_1, x_2, \ldots , x_n)^\mathrm{T}\) is a real input column vector, \(Y({\pmb x})\) is the corresponding estimated interval, and \({\pmb A}=(A_0, A_1, A_2, \ldots , A_n)\) is an interval coefficient row vector with \(A_i=\lceil a_i, c_i\rceil \ \ (i=0,1,2,\ldots , n)\). Moreover, the optimal interval coefficients form a solution of the following quadratic programming problem [see (Chukhrova and Johannssen 2019; Tanaka and Lee 1998) and the references therein]

$$\begin{aligned}&\min \limits _{{\pmb a}, {\pmb c}}J= {\pmb c}^\mathrm{T}\left( \sum _{j=1}^m|{\pmb x}_j|^\mathrm{T}|{\pmb x}_j|\right) {\pmb c} +\xi {\pmb a}^\mathrm{T}{\pmb a},\\&\quad \text {subject to} \ \ \ {\pmb a}^\mathrm{T}{\pmb x}_j-{\pmb c}^\mathrm{T}|{\pmb x}_j|\le y_j\le {\pmb a}^\mathrm{T}{\pmb x}_j+{\pmb c}^\mathrm{T}|{\pmb x}_j|,\\&\quad c_i\ge 0 \ \ (i=0,1,\ldots , n; j=1,2,\ldots , m), \end{aligned}$$

where \({\pmb a}=(a_0, a_1, \ldots , a_n)^\mathrm{T}\), \({\pmb c}=(c_0, c_1,\ldots , c_n)^\mathrm{T}\), \({\pmb c}^\mathrm{T}\) is the transpose of \({\pmb c}\), \(({\pmb x}^\mathrm{T}_j,y_j)=(x_{j1}, x_{j2}, \ldots , x_{jn}; y_j)\) is the j-th sample, \(|{\pmb x}_j|^\mathrm{T}=(|x_{j1}|, |x_{j2}|, \ldots , |x_{jn}|)\), and \(\xi >0\) is a very small constant.
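For concreteness, the quadratic program of Remark 1.2 can be handed to a general-purpose constrained optimizer. The following sketch is only an illustration under our own conventions (the design matrix is assumed to carry a leading column of ones, and the function name `interval_regression` is hypothetical); a dedicated QP solver would normally be preferred.

```python
import numpy as np
from scipy.optimize import minimize

def interval_regression(X, y, xi=1e-6):
    """Crisp input-interval output linear regression of Remark 1.2 (a sketch).

    X  : (m, n+1) design matrix whose first column is all ones.
    y  : (m,) vector of real outputs.
    Returns (a, c): centres and non-negative half-widths of the interval coefficients.
    """
    m, p = X.shape
    absX = np.abs(X)

    def objective(z):
        a, c = z[:p], z[p:]
        return c @ (absX.T @ absX) @ c + xi * (a @ a)

    cons = []
    for j in range(m):
        xj, axj, yj = X[j], absX[j], y[j]
        # y_j must lie inside the estimated interval a^T x_j +/- c^T |x_j|
        cons.append({"type": "ineq",
                     "fun": lambda z, xj=xj, axj=axj, yj=yj: yj - (z[:p] @ xj - z[p:] @ axj)})
        cons.append({"type": "ineq",
                     "fun": lambda z, xj=xj, axj=axj, yj=yj: (z[:p] @ xj + z[p:] @ axj) - yj})

    bounds = [(None, None)] * p + [(0.0, None)] * p      # c_i >= 0
    res = minimize(objective, np.zeros(2 * p), method="SLSQP",
                   bounds=bounds, constraints=cons)
    return res.x[:p], res.x[p:]
```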

Thus, the main concern of this paper is how practitioners can model interval input–output samples. There is some work available in the literature related to this topic (we refer to Boukezzoula et al. 2011, 2018, 2020; Chuang 2008; Hladík and Černý 2012); we extend the method presented by Hladík and Černý (2012) because readers can see from it not only the motivation behind the proposed method but also the way in which the method is formed [see Polya’s famous book (Polya 1954) for more in this direction]. The present paper is a sequel of these works which exemplifies, in the light of Polya’s idea, how to restore or approximate the true function (in the form of an interval input–output function) from interval input–output samples based on some commonly used methods. For problem (I), we first simplify the original problem by assuming that each interval datum has radius 0, thus obtain a new sample set, and find a solution: determine the grade of \({\pmb {\mathbb {t}}}_0\) using the classical linear regression function obtained from the new sample set. Then we investigate the generalization of this strategy to the case of an interval sample. Analogously, for problem (II), we first use an interval input–output function (which can be formulated) to approximate f (by observing the analogy between the case of an interval input–output sample and the case of a real input–output sample) and then consider the establishment of the new system.

To make our discussion more reasonable, let us recall the three most commonly used metrics on the n-dimensional Euclidean space \(R^n \ \ (n\ge 1)\), defined (where \({\pmb a}=(a_1,\ldots , a_n)\) and \({\pmb b}=(b_1, \ldots , b_n)\)) by \({\underline{\rho }}_n({\pmb a},{\pmb b})=\max \{ |a_1-b_1|,\) \( \ldots , |a_n-b_n|\}\) (called the Chebyshev metric), \({\rho }_n({\pmb a},{\pmb b})\) \(=\sqrt{\sum _{i=1}^{n}|a_i-b_i|^2}\) (called the Euclidean metric), and \({\bar{\rho }}_n({\pmb a},{\pmb b})\) \(=\sum _{i=1}^{n}|a_i-b_i|\) (called the Manhattan metric or the city block metric).
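These three metrics are straightforward to compute; the short sketch below (our own illustration, with hypothetical helper names) matches the definitions just recalled.

```python
import numpy as np

def chebyshev(a, b):   # underline-rho_n: max |a_i - b_i|
    return np.max(np.abs(a - b))

def euclidean(a, b):   # rho_n: sqrt(sum |a_i - b_i|^2)
    return np.sqrt(np.sum((a - b) ** 2))

def manhattan(a, b):   # bar-rho_n: sum |a_i - b_i| (city block)
    return np.sum(np.abs(a - b))

a, b = np.array([1.0, 2.0, 3.0]), np.array([4.0, 0.0, 3.0])
print(chebyshev(a, b), euclidean(a, b), manhattan(a, b))   # 3.0  3.605...  5.0
```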

The rest of the paper consists of four sections. Section 2 exemplifies how to discover, based on the classical linear regression, methods to model interval input–output samples. Section 3 investigates, based on the classical linear interpolation, the same problem as in Sect. 2. Section 4 exemplifies applications of the models proposed. Conclusions, discussion, and strategies for further generalization are given in Sect. 5.

2 Three models stemming from the classical linear regression

In this section, the following interval input–output sample set, consisting of \((n+1)\)-dimensional row vectors of interval numbers (each interval number contains the true datum), will be considered:

$$\begin{aligned} {{\mathbb {S}}}= & {} ({{{\mathbb {x}}}}_{1,1},{{{\mathbb {x}}}}_{2,1},\ldots ,{{{\mathbb {x}}}}_{n,1}; {\mathbb {y}}_{1}), ({{{\mathbb {x}}}}_{1,2},{{{\mathbb {x}}}}_{2,2},\ldots ,{{{\mathbb {x}}}}_{n,2}; {\mathbb {y}}_{2}),\nonumber \\&\ldots , ({{{\mathbb {x}}}}_{1,m},{{{\mathbb {x}}}}_{2,m},\ldots ,{{{\mathbb {x}}}}_{n,m}; {\mathbb {y}}_{m}) \end{aligned}$$
(2)

We first present three linear regression models, each stemming from the corresponding linear regression model with a real input–output sample. Then we prove that each model is a universal approximator (Ying 2015) to the corresponding linear regression model based on the true sample.

Remark 2.1

  1. (1)

    Suppose \(m=n+1=2\), \(({{{\mathbb {x}}}}_{1,1},{\mathbb {y}}_{1})= ({x}_{1},{y}_{1})\), and \(({{{\mathbb {x}}}}_{1,2},{\mathbb {y}}_{2})= ({x}_{2},{y}_{2})\ \ (x_1\not = x_2)\). Then the linear regression function based on \({{\mathbb {S}}}\) is the classical one: \(f_c(x)=\frac{{x}_{1}y_2-x_2y_1}{{x}_{1}-{x}_{2}}+ \frac{y_{1}-y_{2}}{x_{1}-x_{2}}x=a_0+a_1x\).

  2. (2)

    Suppose \(m=n+1=2\), \(({{{\mathbb {x}}}}_{1,1},{\mathbb {y}}_{1})= (\lceil x_1,\delta \rceil , \lceil y_1,\delta \rceil )\), and \(({{{\mathbb {x}}}}_{1,2},{\mathbb {y}}_{2})= (\lceil x_2,\delta \rceil , \lceil y_2,\delta \rceil )\) \((x_1\not = x_2, \delta >0)\). It is most natural to obtain two functions: one is the classical linear regression function

    $$\begin{aligned} {\underline{f}}(t)= \frac{(x_1-\delta )(y_2-\delta )-(x_2-\delta )(y_1-\delta )}{{x}_{1}-{x}_{2}}+ \frac{y_{1}-y_{2}}{x_{1}-x_{2}}{t} \end{aligned}$$

    based on the real input–output sample set \(\{(x_1-\delta ,y_1-\delta ),\) \((x_2-\delta ,y_2-\delta )\}\), another is the classical linear regression function

    $$\begin{aligned} {\bar{f}}(t)= \frac{(x_1+\delta )(y_2+\delta )-(x_2+\delta )(y_1+\delta )}{{x}_{1}-{x}_{2}}+ \frac{y_{1}-y_{2}}{x_{1}-x_{2}}{t} \end{aligned}$$

    based on the real input–output sample set \(\{(x_1+\delta ,y_1+\delta ),(x_2+\delta ,y_2+\delta )\}\). It is also natural for us to take the linear regression function based on the above interval input–output sample set \({{\mathbb {S}}}\) as \({\hat{f}}_\delta ({{{\mathbb {x}}}}) ={\mathbb {a}}_0{\oplus } [{\mathbb {a}}_1\otimes {{{\mathbb {x}}}}]\) or \({\check{f}}_\delta ({{{\mathbb {x}}}})= {\downharpoonleft } {\underline{f}}({\dot{x}}), {\bar{f}}(\ddot{x}){\downharpoonright },\) where

    \({\mathbb {a}}_0= {\downharpoonleft }\ {\underline{f}}(0),{\bar{f}}(0)\ {\downharpoonright }\), \({\mathbb {a}}_1=\frac{y_{1}-y_{2}}{x_{1}-x_{2}}\), and \({{{\mathbb {x}}}}=\langle {\dot{x}},\ddot{x}\rangle \in {{\mathbb {R}}}\). It can be easily seen that \(\lim \limits _{\delta \rightarrow 0}{\hat{f}}_\delta ({{{\mathbb {x}}}})= \lim \limits _{\delta \rightarrow 0}{\check{f}}_\delta ({{{\mathbb {x}}}}) =f_c({{{\mathbb {x}}}}) \ \ (\forall {{{\mathbb {x}}}}\in {{\mathbb {R}}})\), where \(f_c\) is the function in (1) applied to interval arguments. This confirms the rationality of taking the linear regression function based on the above interval input–output sample set \({{\mathbb {S}}}\) to be \({\hat{f}}_\delta \) or \({\check{f}}_\delta \).

  3. (3)

    Notice that, in practical problems, the radii of the interval numbers in a data set are in general not all the same. So we should consider a slightly more general version of (2): \(m=n+1=2\), \(({{{\mathbb {x}}}}_{1,1},{\mathbb {y}}_{1})= (\lceil x_1,\delta _{1,1}\rceil , \lceil y_1,\delta _{1,2}\rceil )\), and \(({{{\mathbb {x}}}}_{1,2},{\mathbb {y}}_{2})= (\lceil x_2,\delta _{2,1}\rceil , \lceil y_2,\delta _{2,2}\rceil )\)\((x_1\not = x_2, \delta _{i,j}>0, i,j=1,2)\). Analogously to (2), we can get the linear regression function \({\hat{f}}_{{\pmb \delta }}({{{\mathbb {x}}}}) =\langle {\dot{a}}_0,\ddot{a}_0\rangle \oplus [\langle {\dot{a}}_1,\ddot{a}_1\rangle \otimes {{{\mathbb {x}}}}]\) or \({\check{f}}_{{\pmb \delta }}({{{\mathbb {x}}}})\) \(= \ \ {\downharpoonleft }\ {\underline{f}}({\dot{x}}),{\bar{f}}(\ddot{x}) {\downharpoonright } \ \ ({{{\mathbb {x}}}}=\langle {\dot{x}},\ddot{x}\rangle \in {{\mathbb {R}}})\) based on this interval input–output sample set \({{\mathbb {S}}}\), where

    $$\begin{aligned} {\dot{a}}_0= & {} \frac{(x_1-\delta _{1,1})(y_2-\delta _{2,2})-(x_2-\delta _{2,1}) (y_1-\delta _{1,2})}{({x}_{1}-{x}_{2})+(\delta _{2,1}-\delta _{1,1})} \wedge \frac{(x_1+\delta _{1,1})(y_2+\delta _{2,2})-(x_2+\delta _{2,1}) (y_1+\delta _{1,2})}{({x}_{1}-{x}_{2})+(\delta _{1,1}-\delta _{2,1})},\\ \ddot{a}_0= & {} \frac{(x_1-\delta _{1,1})(y_2-\delta _{2,2})-(x_2-\delta _{2,1}) (y_1-\delta _{1,2})}{({x}_{1}-{x}_{2})+(\delta _{2,1}-\delta _{1,1})}\vee \frac{(x_1+\delta _{1,1})(y_2+\delta _{2,2})-(x_2+\delta _{2,1}) (y_1+\delta _{1,2})}{({x}_{1}-{x}_{2})+(\delta _{1,1}-\delta _{2,1})},\\ {\dot{a}}_1= & {} \frac{(y_{1}-y_{2})+(\delta _{2,2}-\delta _{1,2})}{({x}_{1}-{x}_{2})+(\delta _{2,1}-\delta _{1,1})} \wedge \frac{(y_{1}-y_{2})+(\delta _{1,2}-\delta _{2,2})}{({x}_{1}-{x}_{2})+(\delta _{1,1}-\delta _{2,1})},\\ \ddot{a}_1= & {} \frac{(y_{1}-y_{2})+(\delta _{2,2}-\delta _{1,2})}{({x}_{1}-{x}_{2})+(\delta _{2,1}-\delta _{1,1})} \vee \frac{(y_{1}-y_{2})+(\delta _{1,2}-\delta _{2,2})}{({x}_{1}-{x}_{2})+(\delta _{1,1}-\delta _{2,1})},\\ {\underline{f}}(t)= & {} \frac{(x_1-\delta _{1,1})(y_2-\delta _{2,2})- (x_2-\delta _{2,1})(y_1-\delta _{1,2})}{({x}_{1}-{x}_{2})+(\delta _{2,1}-\delta _{1,1})} + \frac{(y_{1}-y_{2})+(\delta _{2,2}-\delta _{1,2})}{(x_{1}-x_{2})+(\delta _{2,1}-\delta _{1,1})}t \end{aligned}$$

    is the classical linear regression function based on the real input–output sample set \(\{(x_1-\delta _{1,1},y_1-\delta _{1,2}),\) \((x_2-\delta _{2,1},y_2-\delta _{2,2})\}\), and

    $$\begin{aligned} {\bar{f}}(t)=\frac{(x_1+\delta _{1,1})(y_2+\delta _{2,2})- (x_2+\delta _{2,1})(y_1+\delta _{1,2})}{({x}_{1}-{x}_{2})+(\delta _{1,1}-\delta _{2,1})} +\frac{(y_{1}-y_{2})+(\delta _{1,2}-\delta _{2,2})}{(x_{1}-x_{2})+(\delta _{1,1}-\delta _{2,1})}t \end{aligned}$$

    is the classical linear regression function based on the real input–output sample set \(\{(x_1+\delta _{1,1},y_1+\delta _{1,2}),\) \((x_2+\delta _{2,1},y_2+\delta _{2,2})\}\). In addition, both \({\hat{f}}_{{\pmb \delta }}({{{\mathbb {x}}}})\longrightarrow f_c({{{\mathbb {x}}}})\) and \({\check{f}}_{{\pmb \delta }}({{{\mathbb {x}}}})\longrightarrow f_c({{{\mathbb {x}}}})\) hold for all \({{{\mathbb {x}}}}\in {{\mathbb {R}}}\) as \({\pmb \delta }=(\delta _{1,1},\delta _{1,2},\delta _{2,1},\delta _{2,2}) \longrightarrow (0,0,0,0).\) This further supports taking the linear regression function based on the above interval input–output sample set \({{\mathbb {S}}}\) to be \({\hat{f}}_{{\pmb \delta }}\) or \({\check{f}}_{{\pmb \delta }}\).

  4. (4)

    If \(\delta _{1,1}=\delta _{1,2}=\delta _{2,1}=\delta _{2,2}\) does not hold, then \({\mathbb {a}}_1=\frac{y_{1}-y_{2}}{x_{1}-x_{2}}\) does not hold either (as can be seen from (1)–(3)) and the output should be \(f_c(x)\pm r\), where r should be a continuous function of the variables \(\delta _{1,1}\), \(\delta _{1,2},\) \(\delta _{2,1},\) and \(\delta _{2,2}\) satisfying \(r(0,0,0,0)=0\). Thus we can take r to be the second kind linear regression function (i.e. the linear regression function without constant term, obtained by the classical least squares estimation method) \(r(\delta )= \frac{\delta _{1,1}\delta _{1,2}+\delta _{2,1}\delta _{2,2}}{\delta ^2_{1,1}+\delta ^2_{2,1}} \delta \) based on the real input–output sample set \(\{(\delta _{1,1},\delta _{1,2}), (\delta _{2,1},\delta _{2,2})\}\). Therefore, we can also take the linear regression function based on the interval input–output sample set \({{\mathbb {S}}}\) to be \({\tilde{f}}_{{\pmb \delta }}({{{\mathbb {x}}}})= \lceil f_c({{{\mathbb {x}}}}^c), r({{{\mathbb {x}}}}^r)\rceil = f_c({{{\mathbb {x}}}}^c)\pm r({{{\mathbb {x}}}}^r)\), where \(f_c\) is the linear regression function based on \(\{({{{\mathbb {x}}}}^c_{1,1},{\mathbb {y}}^c_1), ({{{\mathbb {x}}}}^c_{1,2},{\mathbb {y}}^c_2)\}\). (A small numerical illustration of the construction in (2) is sketched below.)
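The following sketch (our own illustration; the helper names are hypothetical) reproduces the construction of Remark 2.1(2) for a two-point sample and shows the limit behaviour as \(\delta \rightarrow 0\).

```python
import numpy as np

def line_through(x1, y1, x2, y2):
    """Intercept and slope of the line through (x1, y1) and (x2, y2); for a
    two-point sample this is exactly the classical regression line."""
    a1 = (y1 - y2) / (x1 - x2)
    a0 = (x1 * y2 - x2 * y1) / (x1 - x2)
    return a0, a1

def check_f(x1, y1, x2, y2, delta, xlo, xhi):
    """The model f-check of Remark 2.1(2): underline-f on the shifted-down sample,
    bar-f on the shifted-up sample, evaluated at the interval <xlo, xhi>."""
    lo0, lo1 = line_through(x1 - delta, y1 - delta, x2 - delta, y2 - delta)   # underline-f
    hi0, hi1 = line_through(x1 + delta, y1 + delta, x2 + delta, y2 + delta)   # bar-f
    vals = (lo0 + lo1 * xlo, hi0 + hi1 * xhi)
    return min(vals), max(vals)           # reorder endpoints as in the down-arrow notation

# For the crisp sample (0,1), (2,5) the classical line gives f_c(1) = 3; as delta -> 0
# the interval prediction at the degenerate interval <1,1> shrinks to that value.
for d in (1.0, 0.1, 0.01):
    print(d, check_f(0.0, 1.0, 2.0, 5.0, d, 1.0, 1.0))
```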

Example 2.2

Consider the interval input–output sample set \({{\mathbb {S}}}=\{({{{\mathbb {x}}}}_{1,i},{{{\mathbb {x}}}}_{2,i},{\mathbb {y}}_i) \mid i=1,2,\) \(\ldots , 10\} =\{(\lceil x_{1,i},\delta _{1,i}\rceil ,\) \( \lceil x_{2,i},\delta _{2,i}\rceil , \lceil y_{i}, r_{i}\rceil ) \mid i=1,2,\) \(\ldots , 10\}\) in Table 2.

Table 2 The interval input–output data

(1) Similar to Remark 2.1(2) and Remark 2.1(3), we have \({\underline{f}}(u, v)= {\underline{a}}_0+{\underline{a}}_1u+{\underline{a}}_2v =95.9639+0.02u-7.3671v\), \({\bar{f}}(u,v)={\bar{a}}_0+{\bar{a}}_1u+{\bar{a}}_2v =126.6484+0.0088u-6.9587v\), and thus we obtain the first model \({\hat{f}}({\pmb {{{\mathbb {x}}}}}) ={\hat{f}}({{{\mathbb {x}}}}_1,{{{\mathbb {x}}}}_2) ={\mathbb {a}}_0{\oplus }[{\mathbb {a}}_1\otimes {{{\mathbb {x}}}}_1] {\oplus }[{\mathbb {a}}_2\otimes {{{\mathbb {x}}}}_2] =\langle 95.9639,126.6484\rangle {\oplus }\) \([\langle 0.0088,0.02\rangle \otimes {{{\mathbb {x}}}}_1] {\ominus } [\langle 6.9587,7.3671\rangle \otimes {\mathbb {x}}_2]\) and the second model \({\check{f}}({\pmb {{\mathbb {x}}}}) ={\check{f}}({{{\mathbb {x}}}}_1,{{{\mathbb {x}}}}_2) ={\downharpoonleft }\ \underline{f} ({\dot{x}}_1,{\dot{x}}_2), {\bar{f}}(\ddot{x}_1,\ddot{x}_2)\ {\downharpoonright } = {\downharpoonleft }\ 95.9639+0.02{\dot{x}}_1-7.3671{\dot{x}}_2, 126.6484+0.0088\ddot{x}_1-6.9587\ddot{x}_2\ {\downharpoonright }.\) Similar to Remark 2.1(4), we have \(f_c({{{\mathbb {x}}}}^c_1,{{{\mathbb {x}}}}^c_2)=f_c(x_1,x_2)\) \( =a_0+a_1x_1+a_2x_2=111.6918+0.0143x_1-7.1882x_2\) (i.e. the classical linear regression function, or the first kind linear regression function, based on the sample set \(\{({{{\mathbb {x}}}}^c_1,{{{\mathbb {x}}}}^c_2,{\mathbb {y}}^c) \mid ({{{\mathbb {x}}}}_1,{{{\mathbb {x}}}}_2,{\mathbb {y}})\in {{\mathbb {S}}}\}\)). We choose, by the classical least squares estimation method, a linear regression function \(r(\delta _1,\delta _2) =0.{\dot{1}}\delta _1+4.{{\dot{4}}}\delta _2\) from the set of all 2-variable linear functions without constant terms (i.e. the second kind linear regression function). As a result, we obtain the third model \({\tilde{f}}({\pmb {{\mathbb {x}}}})= {\tilde{f}}(\lceil x_1,\delta _1\rceil ,\lceil x_2,\delta _2\rceil )= \big \lceil 111.6918+0.0143x_1-7.1882x_2, 0.{\dot{1}}\delta _1+4.{\dot{4}} \delta _2\big \rceil =(111.6918+0.0143x_1-7.1882x_2)\pm (0.{\dot{1}}\delta _1+4.{{\dot{4}}} \delta _2).\)

(2) Assume \({\pmb s}_0= \big (1000, 600, 1200, 500, 300, 400, 1300,\) \(1100, 1300, 300; 5, 7, 6, 6, 8, 7, 5,\) 4, 3, 9; 100, 75, 80, 70, 50, 65,  \(90, 100,110, 60\big ) \in {{{\mathbb {x}}}}_{1,1}\times {{{\mathbb {x}}}}_{1,2} \times \cdots \times {{{\mathbb {x}}}}_{1,10}\times {{{\mathbb {x}}}}_{2,1}\times \) \({{{\mathbb {x}}}}_{2,2}\times \cdots \times {{{\mathbb {x}}}}_{2,10}\times {\mathbb {y}}_1\times {\mathbb {y}}_2\times \cdots \times {\mathbb {y}}_{10}\) is the true data. Then \(f({\pmb x})=f(x_1, x_2)=111.6918+0.0143x_1- 7.1882x_2\) is the linear regression function based on the real input-real output data set \({\pmb s}_0\). For each \({\pmb s}= \big (1000+a, 600+a, 1200+a, 500+a, 300+a, 400+a, 1300+a, 1100+a\), \(1300+a, 300+a; 5+b, 7+b, 6+b, 6+b, 8+b, 7+b, 5+b, 4+b, 3+b, 9+b; 100+c, 75+c, 80+c, 70+c, 50+c, 65+c, 90+c, 100+c, 110+c, 60+c\big )\) \(\in {{{\mathbb {x}}}}_{1,1}\times {{{\mathbb {x}}}}_{1,2} \times \cdots \times {{{\mathbb {x}}}}_{1,10}\times {{{\mathbb {x}}}}_{2,1}\times {{{\mathbb {x}}}}_{2,2}\times \cdots \times {{{\mathbb {x}}}}_{2,10}\times {\mathbb {y}}_1\times {\mathbb {y}}_2\times \cdots \times {\mathbb {y}}_{10}\), let \(f_{abc}({\pmb x})=f_{abc}(x_1, x_2)=\beta _a+{\beta _b}{x_1}+{\beta _c}{x_2}\) be the linear regression function based on the real input–output data set \({\pmb s}\), \({\underline{\rho }}_3^*(f,f_{abc})=\) \({\underline{\rho }}_3((111.6918,0.0143,7.1882), (\beta _a,\beta _b,\beta _c))\), \(\rho _3^*(f,f_{abc})=\rho _3((111.6918,0.0143,7.1882), (\beta _a,\beta _b,\beta _c))\), and \({\bar{\rho }}_3^*(f,f_{abc})={\bar{\rho }}_3((111.6918,0.0143,7.1882), \) \((\beta _a,\beta _b,\beta _c))\). Then we have the computing results as shown in Table 3.

Table 3 Comparison of distances between f and \(f_{abc}\) under three commonly used metrics

Supported by Remark 2.1 and Example 2.2, we have reasons to propose the following Algorithm 2.3:

Algorithm 2.3

The linear regression functions based on the interval input–output sample set \({{\mathbb {S}}}\) in equality (2) can be obtained in the following ways:

  1. (1)

    Step 1 Compute the ordinary linear regression function \({\underline{f}}({\pmb t})={\underline{f}}(t_1,\ldots , t_n)= {\underline{a}}_0+ {\underline{a}}_1t_1+\cdots +{\underline{a}}_nt_n \ \ (\forall {\pmb t}\in R^n)\), based on the real input–output sample set \({\underline{S}} = \{ ({\dot{x}}_{1,1},{\dot{x}}_{2,1},\ldots ,{\dot{x}}_{n,1};{\dot{y}}_{1}), \ ({\dot{x}}_{1,2},{\dot{x}}_{2,2},\ldots ,{\dot{x}}_{n,2};{\dot{y}}_{2}), \ \ldots ,({\dot{x}}_{1,m},{\dot{x}}_{2,m},\ldots ,{\dot{x}}_{n,m};{\dot{y}}_{m}) \}\), using the classical method. Step 2 Compute the ordinary linear regression function \({\bar{f}}({\pmb t})={\bar{f}}(t_1,\ldots , t_n)= {\bar{a}}_0+ {\bar{a}}_1t_1+\cdots +{\bar{a}}_nt_n \ \ (\forall {\pmb t}\in R^n)\), based on the real input–output sample set \({\bar{S}} = \{ (\ddot{x}_{1,1},\ddot{x}_{2,1},\ldots ,\ddot{x}_{n,1};\ddot{y}_{1}), \ (\ddot{x}_{1,2},\ddot{x}_{2,2},\ldots ,\ddot{x}_{n,2};\ddot{y}_{2}), \ \ldots , (\ddot{x}_{1,m},\ddot{x}_{2,m},\ldots ,\ddot{x}_{n,m};\ddot{y}_{m}) \}\), using the classical method. Step 3 Obtain the first linear regression function \({\hat{f}}({\pmb {{\mathbb {x}}}})={\hat{f}}({{{\mathbb {x}}}}_1, {{{\mathbb {x}}}}_2,\ldots , {{{\mathbb {x}}}}_n) ={\mathbb {a}}_0{\oplus }[{\mathbb {a}}_1\otimes {{{\mathbb {x}}}}_1]\) \( {\oplus }[{\mathbb {a}}_2\otimes {{{\mathbb {x}}}}_2]\) \({\oplus }\cdots {\oplus }\) \([{\mathbb {a}}_n\otimes {{{\mathbb {x}}}}_n]\) by computing \({\mathbb {a}}_0= {\downharpoonleft } {\underline{a}}_0,{\bar{a}}_0 {\downharpoonright }\), \({\mathbb {a}}_1= {\downharpoonleft } {\underline{a}}_1,{\bar{a}}_1 {\downharpoonright }\), \(\ldots \), \({\mathbb {a}}_n= {\downharpoonleft } {\underline{a}}_n,{\bar{a}}_n {\downharpoonright }\). Step 4 Obtain the second linear regression function \({\check{f}}({\pmb {{\mathbb {x}}}})= {\check{f}}({{{\mathbb {x}}}}_1,\ldots , {{{\mathbb {x}}}}_n) = {\downharpoonleft }\ {\underline{f}}({\dot{x}}_1, \ldots ,{\dot{x}}_n), {\bar{f}}(\ddot{x}_1, \ldots ,\ddot{x}_n)\ {\downharpoonright }\).

  2. (2)

    Step 1 Compute the ordinary linear regression function \(f_c({\pmb t})=f_c(t_1,\ldots , t_n)=a_0+ a_1t_1+\cdots +a_nt_n\)\((\forall {\pmb t}\in R^n)\), based on the real input–output sample set

    $$\begin{aligned} S^c= & {} \bigg \{ \Big (\frac{\ddot{x}_{1,1}+{\dot{x}}_{1,1}}{2}, \frac{\ddot{x}_{2,1}+{\dot{x}}_{2,1}}{2}, \ldots , \frac{\ddot{x}_{n,1}+{\dot{x}}_{n,1}}{2}; \frac{\ddot{y}_1+{\dot{y}}_1}{2}\Big ), \Big (\frac{\ddot{x}_{1,2}+{\dot{x}}_{1,2}}{2}, \frac{\ddot{x}_{2,2}+{\dot{x}}_{2,2}}{2}, \ldots , \frac{\ddot{x}_{n,2}+{\dot{x}}_{n,2}}{2}; \frac{\ddot{y}_2+{\dot{y}}_2}{2}\Big ),\\&\ldots , \Big (\frac{\ddot{x}_{1,m}+{\dot{x}}_{1,m}}{2}, \frac{\ddot{x}_{2,m}+{\dot{x}}_{2,m}}{2}, \ldots , \frac{\ddot{x}_{n,m}+{\dot{x}}_{n,m}}{2}; \frac{\ddot{y}_m+{\dot{y}}_m}{2}\Big ) \bigg \}, \end{aligned}$$

    using the classical method. Step 2 Compute the ordinary linear regression function \(r({\pmb \delta })=r(\delta _1,\ldots , \delta _n)= b_1\delta _1+\cdots +b_n\delta _n \ \ (\forall {\pmb \delta }\in R^n)\), based on the real input–output sample set

    $$\begin{aligned} S^r= & {} \bigg \{\Big (\frac{\ddot{x}_{1,1}-{\dot{x}}_{1,1}}{2}, \frac{\ddot{x}_{2,1}-{\dot{x}}_{2,1}}{2}, \ldots , \frac{\ddot{x}_{n,1}-{\dot{x}}_{n,1}}{2}; \frac{\ddot{y}_1-{\dot{y}}_1}{2}\Big ), \Big (\frac{\ddot{x}_{1,2}-{\dot{x}}_{1,2}}{2}, \frac{\ddot{x}_{2,2}-{\dot{x}}_{2,2}}{2}, \ldots , \frac{\ddot{x}_{n,2}-{\dot{x}}_{n,2}}{2}; \frac{\ddot{y}_2-{\dot{y}}_2}{2}\Big ),\\&\ldots ,\Big (\frac{\ddot{x}_{1,m}-{\dot{x}}_{1,m}}{2}, \frac{\ddot{x}_{2,m}-{\dot{x}}_{2,m}}{2}, \ldots , \frac{\ddot{x}_{n,m}-{\dot{x}}_{n,m}}{2}; \frac{\ddot{y}_m-{\dot{y}}_m}{2}\Big ) \bigg \}, \end{aligned}$$

    using the least-square method. Then \({\tilde{f}}({\pmb {{\mathbb {x}}}}) ={\tilde{f}}({{{\mathbb {x}}}}_1, \ldots , {{{\mathbb {x}}}}_n)= \lceil f_c({{{\mathbb {x}}}}^c_1, \ldots ,{{{\mathbb {x}}}}^c_n), r({{{\mathbb {x}}}}^r_1, \ldots ,{{{\mathbb {x}}}}^r_n)\rceil \) \( =f_c({{{\mathbb {x}}}}^c_1,{{{\mathbb {x}}}}^c_2,\ldots ,{{{\mathbb {x}}}}^c_n)\pm r({{{\mathbb {x}}}}^r_1,{{{\mathbb {x}}}}^r_2,\ldots ,{{{\mathbb {x}}}}^r_n) \) is the third linear regression function.
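A compact numerical sketch of Algorithm 2.3 (our own illustration; the function name `algorithm_2_3` and the use of ordinary least squares via `numpy.linalg.lstsq` are assumptions, not part of the original text) might look as follows.

```python
import numpy as np

def algorithm_2_3(Xlo, Xhi, ylo, yhi):
    """Sketch of Algorithm 2.3.

    Xlo, Xhi : (m, n) arrays of left / right endpoints of the interval inputs.
    ylo, yhi : (m,)   arrays of left / right endpoints of the interval outputs.
    Returns the coefficients of underline-f and bar-f, the interval coefficients of
    f-hat, and the centre / radius coefficients used by f-tilde.
    """
    m, n = Xlo.shape
    ones = np.ones((m, 1))

    def ols(X, y):                       # classical least squares with intercept
        A = np.hstack([ones, X])
        return np.linalg.lstsq(A, y, rcond=None)[0]      # (a_0, a_1, ..., a_n)

    lo = ols(Xlo, ylo)                   # Step 1: underline-f on the left endpoints
    hi = ols(Xhi, yhi)                   # Step 2: bar-f on the right endpoints

    # Step 3: interval coefficients a_k = <min(lo_k, hi_k), max(lo_k, hi_k)> (model f-hat);
    # Step 4: model f-check evaluates underline-f / bar-f at the input endpoints directly.
    interval_coef = np.column_stack([np.minimum(lo, hi), np.maximum(lo, hi)])

    # Way (2): centre regression (with intercept) plus radius regression (no constant term).
    Xc, yc = 0.5 * (Xlo + Xhi), 0.5 * (ylo + yhi)
    Xr, yr = 0.5 * (Xhi - Xlo), 0.5 * (yhi - ylo)
    center_coef = ols(Xc, yc)                                   # f_c
    radius_coef = np.linalg.lstsq(Xr, yr, rcond=None)[0]        # r, without b_0
    return lo, hi, interval_coef, center_coef, radius_coef
```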

The rationality of Algorithm 2.3 is guaranteed by the following

Theorem 2.4

Let \({\pmb s}={(} ({ x}_{1,1},{ x}_{2,1},\ldots ,{ x}_{n,1}; {y}_{1}), ({x}_{1,2},{x}_{2,2},\ldots ,{x}_{n,2}; {y}_{2}), \ldots ,\) \(({x}_{1,m},{x}_{2,m},\ldots ,{x}_{n,m}; {y}_{m}){)}\) be a real input–output datum from \({\pmb S}=({{{\mathbb {x}}}}_{1,1}\times {{{\mathbb {x}}}}_{2,1}\times \cdots \times {{{\mathbb {x}}}}_{n,1}\times {\mathbb {y}}_{1})\times ({{{\mathbb {x}}}}_{1,2}\times {{{\mathbb {x}}}}_{2,2}\times \cdots \times {{{\mathbb {x}}}}_{n,2}\times {\mathbb {y}}_{2})\times \cdots \times \) \( ({{{\mathbb {x}}}}_{1,m}\times {{{\mathbb {x}}}}_{2,m}\times \cdots \times {{{\mathbb {x}}}}_{n,m}\times {\mathbb {y}}_{m})\) (which is a rearrangement of the sample set in equality (2)) and \(f_{\pmb s}({\pmb x})=\beta _0({\pmb s})+\) \(\beta _1({\pmb s}){x_1}+ \cdots +{\beta _n}({\pmb s}){x_n}\) the linear regression function based on the real input–output sample set \({\pmb s}\). Then

  1. (1)

    \(\beta _0({\pmb s})\), \(\beta _1({\pmb s})\), \(\ldots \), \({\beta _n}({\pmb s})\) are continuous functions from \(\left( R^{2m(n+1)},\rho \right) \) to \((R,\rho _1)\ \ (\rho \in \{{\underline{\rho }}_{2m(n+1)},\) \(\rho _{2m(n+1)}, {\bar{\rho }}_{2m(n+1)}\})\).

  2. (2)

    The mapping \(g:\left( R^{m(n+1)},\rho _{m(n+1)}\right) \longrightarrow \left( R^{n+1},\rho \right) (\rho \in \{{\underline{\rho }}_{n+1}, \rho _{n+1}, {\bar{\rho }}_{n+1}\})\), defined by \(g({\pmb s})= \big (\beta _0({\pmb s}),\beta _1({\pmb s}),\ldots ,\) \({\beta _n}({\pmb s}) \big )\), is continuous.

  3. (3)

    Let \(f({\pmb x})=\beta _0+\beta _1{x_1}+\cdots +{\beta _n}{x_n}\) be the linear regression function based on the true sample \({\pmb t}=\big ( ({ x}^0_{1,1},{ x}^0_{2,1},\ldots ,{ x}^0_{n,1}; { y}^0_{1}), ({x}^0_{1,2},{x}^0_{2,2},\ldots ,{x}^0_{n,2}; {y}^0_{2}), \ldots , ({x}^0_{1,m},{x}^0_{2,m},\ldots ,{x}^0_{n,m}; {y}^0_{m})\big )\) in \({\pmb S}\). Then, for each \(\varepsilon >0\), there exists a \({\pmb \delta }=(\delta _0,\delta _1,\delta _2,\ldots , \delta _n)^\mathrm{T}\pmb >{\pmb 0}\) (i.e. \(\delta _0>0, \delta _1>0, \delta _2>0, \ldots , \delta _n>0)\) and a \(\delta >0\), such that (where \(\rho \in \{{\underline{\rho }}_{n+1}, \rho _{n+1}, {\bar{\rho }}_{n+1}\}\))

    1. i)

      \(\rho \big (g({\pmb s}), (\beta _0,\beta _1,\ldots ,{\beta _n})\big )<\varepsilon \) if \({\pmb S}\) is a \({\pmb \delta }\)-sample set, i.e. it satisfies \(\max \big \{{\mathbb {y}}^r_{1}, {\mathbb {y}}^r_{2}, \ldots , {\mathbb {y}}^r_{m}\big \}\) \(\le \delta _0\), \(\max \big \{{{{\mathbb {x}}}}^r_{1,1}, {{{\mathbb {x}}}}^r_{1,2}, \ldots , {{{\mathbb {x}}}}^r_{1,m}\big \}\) \(\le \delta _1\), \(\max \big \{{{{\mathbb {x}}}}^r_{2,1}, \ldots , {{{\mathbb {x}}}}^r_{2,m}\big \}\) \(\le \delta _2\), \(\ldots \), \(\max \big \{{{{\mathbb {x}}}}^r_{n,1}, {{{\mathbb {x}}}}^r_{n,2}, \ldots , {{{\mathbb {x}}}}^r_{n,m}\big \}\) \(\le \delta _n\).

    2. ii)

      \(\rho \big (g({\pmb s}), (\beta _0,\beta _1,\ldots ,{\beta _n})\big )<\varepsilon \) if \({\pmb S}\) is a \({\delta }\)-sample set, i.e. it satisfies \(\max \{{\mathbb {y}}^r_{1}, {\mathbb {y}}^r_{2}, \ldots , {\mathbb {y}}^r_{m}; {{{\mathbb {x}}}}^r_{1,1}, {{{\mathbb {x}}}}^r_{1,2}, \ldots , {{{\mathbb {x}}}}^r_{1,m};\) \({{{\mathbb {x}}}}^r_{2,1}, {\mathbb {x}}^r_{2,2},\) \(\ldots , {{{\mathbb {x}}}}^r_{2,m}; \ \ldots ,\) \( \ {{{\mathbb {x}}}}^r_{n,1}, {{{\mathbb {x}}}}^r_{n,2}, \ldots , {{{\mathbb {x}}}}^r_{n,m}\}\) \(\le \delta \).

  4. (4)

    \({\hat{f}}\) is a universal approximator to f, i.e. for each compact set \(U\subseteq R^{n+1}\) and each \(\varepsilon >0\), there exists a \(\delta (U,\varepsilon )>0\) such that \(\sup \limits _{{\pmb x}\in U}|(a_0+a_1x_1+a_2x_2+\cdots +a_nx_n)-f({\pmb x})| <\varepsilon \) for each \({\pmb a}=(a_0,a_1,a_2,\ldots ,a_n)\in {\mathbb {a}}_0\times {\mathbb {a}}_1\times {\mathbb {a}}_2 \times \cdots \times {\mathbb {a}}_n\) and each \(\delta (U,\varepsilon )\)-sample set \(\pmb S\) (containing \(\pmb t\)), where \({\hat{f}}({\pmb {{\mathbb {x}}}}) ={\hat{f}}({{{\mathbb {x}}}}_1,{{{\mathbb {x}}}}_2,\ldots , {{{\mathbb {x}}}}_n) ={\mathbb {a}}_0{\oplus }[{\mathbb {a}}_1\otimes {{{\mathbb {x}}}}_1] {\oplus }[{\mathbb {a}}_2\otimes {{{\mathbb {x}}}}_2]{\oplus }\)\( \cdots {\oplus }\) \([{\mathbb {a}}_n\otimes {{{\mathbb {x}}}}_n]\) is the first linear regression function based on \(\pmb S\).

  5. (5)

    \({\check{f}}\) is a universal approximator to f, i.e. for each compact set \(U\subseteq R^{n+1}\) and each \(\varepsilon >0\), there exists a \(\delta (U,\varepsilon )>0\) such that \(\sup \limits _{{\pmb x}\in U}|{\underline{f}}({\pmb x})-f({\pmb x})| <\varepsilon \) and \(\sup \limits _{{\pmb x}\in U}|{\bar{f}}({\pmb x})-f({\pmb x})|<\varepsilon \) for each \(\delta (U,\varepsilon )\)-sample set \(\pmb S\) (containing \(\pmb t\)), where \({\check{f}}({\pmb {{\mathbb {x}}}})={\check{f}}({{{\mathbb {x}}}}_1,{{{\mathbb {x}}}}_2, \ldots , {{{\mathbb {x}}}}_n)\) is the second linear regression function based on \(\pmb S\).

  6. (6)

    \({\tilde{f}}\) is a universal approximator to f, i.e. for each compact set \(U\subseteq R^{n+1}\) and each \(\varepsilon >0\), there exists a \(\delta (U,\varepsilon )>0\) such that \(\sup \{r({\pmb x}) \mid {\pmb x}\in U\}<\varepsilon \) for each \(\delta (U,\varepsilon )\)-sample set \(\pmb S\) (containing \(\pmb t\)), where \({\tilde{f}}({\pmb {{\mathbb {x}}}})= {\tilde{f}} ({{{\mathbb {x}}}}_1,{{{\mathbb {x}}}}_2,\ldots , {{{\mathbb {x}}}}_n) =\lceil f_c({{{\mathbb {x}}}}^c_1,{{{\mathbb {x}}}}^c_2,\ldots ,{{{\mathbb {x}}}}^c_n), r({{{\mathbb {x}}}}^r_1,{{{\mathbb {x}}}}^r_2,\ldots ,{\mathbb {x}}^r_n)\rceil \) is the third linear regression function based on \(\pmb S\).

Proof

Step 1 :

For two k-variable polynomials \(P({\pmb x})\) and \(Q({\pmb x}) \ \ (k\ge 1)\), the function \(\frac{P({\pmb x})}{Q({\pmb x})}\) is continuous on \(R^k-E_Q\), where \(E_Q\) is the zero-point set of \(Q({\pmb x})\) (a closed set). Moreover, the inequalities \({\underline{\rho }}_{k+1}({\pmb x},{\pmb y})\le {\rho }_{k+1}({\pmb x},{\pmb y})\le {\bar{\rho }}_{k+1}({\pmb x},{\pmb y})\) hold for any \(\{{\pmb x},{\pmb y}\}\subseteq R^{k+1}\).

Step 2 :

As \({\underline{\rho }}_{2m(n+1)},\) \(\rho _{2m(n+1)}\), and \({\bar{\rho }}_{2m(n+1)}\) induce the same topology (i.e. the Euclidean topology) on \(R^{2m(n+1)}\), it can be easily seen from the computing formulae of classical linear regression and Step 1 that (1) is true. By (1), \(p_{i}\circ g({\pmb s})=\beta _i({\pmb s}): {\big (R^{m(n+1)},\rho _{m(n+1)}\big )}\longrightarrow (R,\rho _1)\) is a continuous function (where \(p_i: R^{n+1}\longrightarrow R\) is the i-th projection, \(i=1,2, \ldots ,n+1\)). Thus g is a continuous function, i.e. (2) is true (and, equivalently, (3) is true). (4)–(6) follow from (3). \(\square \)

Example 2.5

Consider the interval sample in Table 4.

Table 4 Interval sample from Hladík and Černý (2012)

As \({\underline{f}}({\pmb t})=52.0962+1.3713t_2-0.7425t_3\) and \({\bar{f}}({\pmb t})=56.5474+1.4580t_2-0.5399t_3 \ \ (\forall {\pmb t}= (t_1,t_2,t_3)\in R^3)\), \({\hat{f}}({\pmb {{{\mathbb {x}}}}}) =\langle 52.0962,56.5474\rangle \oplus [\langle 1.3713,1.4580\rangle \otimes {{{\mathbb {x}}}}_2] {\ominus }\) \([\langle 0.5399,0.7425\rangle \otimes {{{\mathbb {x}}}}_3]\) and

$$\begin{aligned}&{\check{f}}({{{\mathbb {x}}}}_1,{{{\mathbb {x}}}}_2,{{{\mathbb {x}}}}_3)= {\downharpoonleft }\ 52.0962+1.3713{\dot{x}}_2-0.7425{\dot{x}}_3, 56.5474+1.4580\ddot{x}_2-0.5399\ddot{x}_3\ {\downharpoonright }\\&\quad (\forall {\pmb {{{\mathbb {x}}}}}= ({{{\mathbb {x}}}}_1,{{{\mathbb {x}}}}_2,{{{\mathbb {x}}}}_3)\in {{\mathbb {R}}}^3). \end{aligned}$$

As \(f_c({{{\mathbb {x}}}}^c_1,{{{\mathbb {x}}}}^c_2,{{{\mathbb {x}}}}^c_3)=53.1765+1.4635{{{\mathbb {x}}}}^c_2 +0.6514{{{\mathbb {x}}}}^c_3\) and \(r({{{\mathbb {x}}}}^r_1,{{{\mathbb {x}}}}^r_2,{{{\mathbb {x}}}}^r_3)= 0.7668{{{\mathbb {x}}}}^r_2+0.1887{{{\mathbb {x}}}}^r_3 \ (\forall ({{{\mathbb {x}}}}_1,{{{\mathbb {x}}}}_2,{{{\mathbb {x}}}}_3)\) \(\in {{\mathbb {R}}}^3)\), \({\tilde{f}}({{{\mathbb {x}}}}_1,{{{\mathbb {x}}}}_2,{{{\mathbb {x}}}}_3) \) \(=\lceil {53.1765}+1.4635{{{\mathbb {x}}}}^c_2+ 0.6514{{{\mathbb {x}}}}^c_3, 0.7668{{{\mathbb {x}}}}^r_2+0.1887{{{\mathbb {x}}}}^r_3\rceil \ \ (\forall {\pmb {{{\mathbb {x}}}}} =({{{\mathbb {x}}}}_1,{{{\mathbb {x}}}}_2,{{{\mathbb {x}}}}_3)\in {{\mathbb {R}}}^3).\) Notice that two models obtained in (Baczyński et al 2008) for above sample are \({\mathbb {y}}=\langle 50.922,55.431\rangle \) \({\oplus }\) \([\langle 1.401,1.526\rangle \otimes {{{\mathbb {x}}}}_2]\) \({\oplus } [\langle 0.624,0.679\rangle \) \(\otimes {{{\mathbb {x}}}}_3]\) and \({\mathbb {y}}=\langle 44.463,61.89\rangle {\oplus } [\langle 1.224,1.703\rangle \otimes {{{\mathbb {x}}}}_2]{\oplus }\) \( [\langle 0.545,0.758\rangle \otimes {{{\mathbb {x}}}}_3]\).

Example 2.6

Consider the special interval input–output sample set \({{\mathbb {S}}}=\{({{\mathbb {x}}}_{1,i},{{{\mathbb {x}}}}_{2,i},{\mathbb {y}}_i)\) \(\mid i=1,2,3,\) \(4,5\}\) in Table 5 (where \(\delta _2\equiv 2\), \(r\equiv 5\), and \(\delta _1\) is almost equal to 10).

Table 5 Special interval sample set

Then \(\underline{f}({\pmb t})= 111.7393+0.0124t_1-11.4787t_2\), \(f_c({\pmb t})=f_c(t_1,t_2)=139.57317+ 0.0124t_1-11.4787t_2\), and \({\bar{f}}({\pmb t})\) \(={\bar{f}}(t_1,t_2)=167.407+0.0124t_1-11.4787t_2\) \((\forall {\pmb t}=(t_1,t_2)\in R^2)\). By Algorithm 2.3, \({\hat{f}}({\pmb {{\mathbb {t}}}})= {\hat{f}}(\lceil t_1,10\rceil ,\lceil t_2,2\rceil )\) \( =\langle 111.7393+0.0124t_1-11.4787t_2, 167.407+ 0.0124t_1\) \(-11.4787t_2\rangle ={\check{f}}({\pmb {{\mathbb {t}}}}) \ \ (\forall {\pmb {{\mathbb {t}}}}=(\lceil t_1,10\rceil , \lceil t_2,2\rceil )\in {{\mathbb {R}}})\). As \(r(\delta _1,\delta _2)=2.5\delta _2=5\) (if \(\delta _2=2\)), \({\tilde{f}}({\pmb {{\mathbb {t}}}})= \lceil 0.0124t_1-11.4787t_2+139.5732, 5\rceil = f_c({\pmb t})\pm 5= f_c(t_1,t_2)\pm 5 \ \ ({\pmb {{\mathbb {t}}}}=(\lceil t_1,10\rceil , \lceil t_2,2\rceil )\in {{\mathbb {R}}})\). This motivates the following easy-to-use Corollary 2.7:

Corollary 2.7

For a special sample (or data set) \({{\mathbb {S}}}=\big \{ (\lceil x_{1,1},\varepsilon _1\rceil ,\) \(\lceil x_{2,1},\) \(\varepsilon _2\rceil ,\) \( \ldots ,\) \( \lceil x_{n,1}, \varepsilon _n\rceil ;\) \(\lceil y_1,\varepsilon _0\rceil )\), \((\lceil x_{1,2},\varepsilon _1\rceil ,\) \(\lceil x_{2,2},\varepsilon _2\rceil ,\) \(\ldots ,\) \( \lceil x_{n,2},\varepsilon _n\rceil ;\) \(\lceil y_2,\varepsilon _0\rceil ),\) \(\ldots ,\) \((\lceil x_{1,m},\varepsilon _1\rceil ,\) \(\lceil x_{2,m},\varepsilon _2\rceil , \ldots , \lceil x_{n,m},\varepsilon _n\rceil ;\lceil y_m,\varepsilon _0\rceil ) \big \}\), the first two linear regression functions in Algorithm 2.3 are the same: \({\hat{f}}({\pmb {{\mathbb {t}}}}) ={\hat{f}}(\lceil t_1,\varepsilon _1\rceil , \lceil t_2,\varepsilon _2\rceil , \ldots , \lceil t_n,\varepsilon _n\rceil )\) \(={\downharpoonleft }\ f_c({\pmb t})-o,f_c({\pmb t})+o\ {\downharpoonright }\) \(={\check{f}}({\pmb {{\mathbb {t}}}})\); this equals exactly \(\lceil f_c(t_1,\ldots , t_n), |o|\rceil = f_c(t_1,\ldots , t_n)\pm |o|\), which can be regarded as \({\tilde{f}}({\pmb {{\mathbb {t}}}})\) (because \(r({\pmb \delta })\) in Algorithm 2.3 admits infinitely many choices, including |o|), where \({\pmb t}=(t_1,t_2, \ldots , t_n)\) and \(o=\varepsilon _0-a_{1}\varepsilon _1-a_{2} \varepsilon _2- \cdots -a_{n}\varepsilon _n\).

Proof

For two column vectors \({\pmb x}=(x_{1},x_{2},\ldots ,x_{m})^\mathrm{T}\) and \({\pmb z}=(z_{1},z_{2},\ldots ,z_{m})^\mathrm{T}\), we write \(\bar{\pmb x}=\frac{1}{m}(x_{1}+x_{2}\) \(+\cdots +x_{m})\) and \({\pmb x}^\mathrm{T}{\pmb z}=x_1z_1+x_2z_2+\cdots +x_mz_m\). Using \({\pmb x}_1\) (resp., \({\pmb x}_2, \ldots , {\pmb x}_n, {\pmb y})\) to denote the column vector \(({x}_{1,1},{x}_{1,2},\ldots ,{x}_{1,m})^\mathrm{T}\) (resp., \(({x}_{2,1},{x}_{2,2},\ldots ,{x}_{2,m})^\mathrm{T},\) \( \ldots ,\) \( ({x}_{n,1},{x}_{n,2},\ldots ,{x}_{n,m})^\mathrm{T}, ({y}_{1},{y}_{2},\ldots ,{y}_{m})^\mathrm{T})\) and \(\varepsilon ^{\shortmid }\) to denote a column vector with the constant coordinates \(\varepsilon \), we get \(({\pmb y}-\varepsilon ^{\shortmid }_0)^\mathrm{T} ({\pmb x}_i-\varepsilon ^{\shortmid }_i)- m\overline{{\pmb y}-\varepsilon ^{\shortmid }_0}\ \overline{{\pmb x}_i-\varepsilon ^{\shortmid }_i}\) \( =\left( {\pmb y}^\mathrm{T}{\pmb x}_i-m\varepsilon _i\bar{\pmb y}- m\varepsilon _0\bar{\pmb x}_i+\varepsilon _0^{\shortmid T} {\varepsilon _i^{\shortmid }}\right) - m\overline{{\pmb y}-\varepsilon ^{\shortmid }_0}\ \overline{{\pmb x}_i-\varepsilon ^{\shortmid }_i} ={\pmb y}^\mathrm{T}{\pmb x}_i-m\bar{\pmb y}\bar{\pmb x}_i \ \ (i=1,2, \ldots , n).\) Analogously, \(({\pmb x}_i-\varepsilon ^{\shortmid }_i)^\mathrm{T} ({\pmb x}_j-\varepsilon ^{\shortmid }_j) -m\overline{{\pmb x}_i-\varepsilon ^{\shortmid }_i}\ \overline{{\pmb x}_j-\varepsilon ^{\shortmid }_j} = {\pmb x}_i^\mathrm{T}{\pmb x}_j-m\bar{\pmb x}_i\bar{\pmb x}_j \ \ (i,j=1,2, \ldots , n)\).

This implies \({\underline{a}}_i=a_i \ \ (i=1,2, \ldots , n)\), and thus \({\underline{a}}_0= \overline{{\pmb y}-\varepsilon ^{\shortmid }_0}- {\underline{a}}_1\overline{{\pmb x}_1-\varepsilon ^{\shortmid }_1}- {\underline{a}}_2\overline{{\pmb x}_2-\varepsilon ^{\shortmid }_2}-\cdots - {\underline{a}}_n\overline{{\pmb x}_n-\varepsilon ^{\shortmid }_n}= \overline{{\pmb y}-\varepsilon ^{\shortmid }_0}- a_1\overline{{\pmb x}_1-\varepsilon ^{\shortmid }_1}- a_2\overline{{\pmb x}_2-\varepsilon ^{\shortmid }_2}-\cdots - a_n\overline{{\pmb x}_n-\varepsilon ^{\shortmid }_n}= (\bar{\pmb y}-a_1\bar{\pmb x}_1-a_2\bar{\pmb x}_2-\cdots - a_n\bar{\pmb x}_n)-(\varepsilon _0-a_{1}\varepsilon _1-a_{2} \varepsilon _2- \cdots -a_{n}\varepsilon _n)=a_0-o\). Therefore, \({\underline{f}}({\pmb t})=f_c({\pmb t})-o\); analogously, \({\bar{f}}({\pmb t})=f_c({\pmb t})+o\). \(\square \)
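Corollary 2.7 is easy to check numerically. The following snippet (our own verification sketch, with hypothetical names and randomly generated data) confirms that, for a constant-radii sample, the left-endpoint regression has the same slopes as the centre regression and its intercept is shifted by exactly \(o=\varepsilon _0-a_1\varepsilon _1-\cdots -a_n\varepsilon _n\).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 12, 3
Xc = rng.normal(size=(m, n))                          # centres of the interval inputs
yc = Xc @ np.array([2.0, -1.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=m)
eps = np.array([0.3, 0.7, 0.2])                       # common input radii eps_1, ..., eps_n
eps0 = 0.5                                            # common output radius eps_0

def ols(X, y):                                        # classical least squares with intercept
    A = np.hstack([np.ones((len(y), 1)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0]

center = ols(Xc, yc)                                  # f_c on the centre sample
lower = ols(Xc - eps, yc - eps0)                      # underline-f on the left endpoints
o = eps0 - center[1:] @ eps                           # o = eps_0 - a_1 eps_1 - ... - a_n eps_n
print(np.allclose(lower[1:], center[1:]))             # True: identical slopes
print(np.isclose(lower[0], center[0] - o))            # True: intercept shifted by o
```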

Remark 2.8

In practice we can ask people to collect samples as in Corollary 2.7 so as to use the simplified models given there.

3 Two models stemming from the classical linear interpolation

In this section, we first present two linear-like interpolation models with interval input–output samples, each stemming from the classical linear interpolation model with a real input–output sample. Then we prove that each model is a universal approximator to the corresponding linear interpolation model.

Example 3.1

  1. (1)

    For a real input–output data set \(S=\{(x_1,y_1), (x_2,y_2),\) \(\ldots , (x_m,y_m)\}\) satisfying \(x_{i+1}-x_i=h_i>0 \ \ (i=1,2, \ldots , m-1, \ m>3)\), the continuous function \({\hat{f}}(x)=y_1l({x_1})(x)+y_2l({x_2})(x)+\cdots +y_ml({x_m})(x)\) (called the 1-variable linear interpolation function) interpolates the given sample, i.e. it satisfies \({\hat{f}}(x_i)=y_i \ \ (i=1,2, \ldots , m)\), where

    $$\begin{aligned} l({x_1})(x)= & {} \left\{ \begin{array}{ll} \frac{x-x_1}{h_1}+1, &{} x\in [x_1-h_1,x_1), \\ \frac{x_2-x}{h_1}, &{} x\in [x_1,x_2), \\ 0, &{} \text {otherwise}, \end{array} \right. \\ l({x_m})(x)= & {} \left\{ \begin{array}{ll} \frac{x-x_{m-1}}{h_{m-1}}, &{} x\in [x_{m-1},x_m), \\ \frac{x_m-x}{h_{m-1}}+1, &{} x\in [x_{m},x_m+h_{m-1}],\\ 0, &{} \text {otherwise}, \end{array} \right. \end{aligned}$$

    and

    $$\begin{aligned} l({x_i})(x)=\left\{ \begin{array}{ll} \frac{x-x_{i-1}}{h_{i-1}}, &{} x\in [x_{i-1},x_{i}), \\ \frac{x_{i+1}-x}{h_i}, &{} x\in [x_i,x_{i+1}), \\ 0, &{} \text {otherwise}, \end{array} \right. \ \ (i=2,3, \ldots , m-1) \end{aligned}$$

    are called basis functions relying on S (a computational sketch is given after this example).

  2. (2)

    Let \(\{\left( {\pmb x}_0,f({\pmb x}_0)\right) \mid {\pmb x}_0 \in S\}\) be a multi-input-one output sample set (from an n-variable function f), where \(S=S_1\times S_2\times \cdots \times S_n\), \(S_1=\Big \{x_1^{(1)}, x_1^{(2)}, \ldots , x_1^{(m_1)}\Big \}\) with \(x_1^{(i+1)}-x_1^{(i)}=h_1^{(i)}>0\) \((i=1,2, \ldots , m_1-1)\), \(S_2=\Big \{x_2^{(1)}, x_2^{(2)}, \ldots , x_2^{(m_2)}\Big \}\) with \(x_2^{(i+1)}-x_2^{(i)}=h_2^{(i)}>0\) \((i=1,2, \ldots , m_2-1)\), \(\ldots \), \(S_n=\Big \{x_n^{(1)}, x_n^{(2)}, \ldots , x_n^{(m_n)}\Big \}\) with \(x_n^{(i+1)}-x_n^{(i)}=h_n^{(i)}>0\) \((i=1,2, \ldots , m_n-1)\), and \(m=\min \{m_1, m_2, \ldots , m_n\}>3\). Then the n-variable linear-like interpolation function \({\hat{f}}({\pmb x})=\sum _{{\pmb x}_0\in S}f({\pmb x}_0)l({{\pmb x}_0})({\pmb x})\) \((\forall {\pmb x}=(x_1, x_2, \ldots , x_n)\in R^n)\) is continuous and satisfies \({\hat{f}}({\pmb x}_0)=f({\pmb x}_0) \ \ (\forall {\pmb x}_0\in S)\), where \(l({{\pmb x}_0})({\pmb x}) =l{(}x_1^{(k_1)}{)}(x_1)\times l{(}x_2^{(k_2)}{)}(x_2)\) \(\times \cdots \times l{(}x_n^{(k_n)}{)}(x_n)\) if \({{\pmb x}_0}=\left( x_1^{(k_1)},x_2^{(k_2)}, \ldots , x_n^{(k_n)}\right) \), \(l{(}x_1^{(k_1)}{)}(x_1)\) is a base function relying on the sample set \(S_1\), \(l{(}x_2^{(k_2)}{)}(x_2)\) is a base function relying on the sample set \(S_2\), \(\ldots \), \(l{(}x_n^{(k_n)}{)}(x_n)\) is a base function relying on the sample set \(S_n\).
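The hat-shaped basis functions of Example 3.1(1) and the resulting interpolant are easy to code; the following sketch (our own illustration, with hypothetical helper names) reproduces the 1-variable case, extending the outermost bases one mesh width beyond the sample range exactly as in the definitions of \(l(x_1)\) and \(l(x_m)\).

```python
import numpy as np

def hat_basis(nodes, i, x):
    """The basis function l(x_i)(x) relying on the node set `nodes` (Example 3.1(1));
    the first and last bases extend one mesh width beyond the sample range."""
    x = np.asarray(x, dtype=float)
    xi = nodes[i]
    left = nodes[i - 1] if i > 0 else nodes[0] - (nodes[1] - nodes[0])
    right = nodes[i + 1] if i < len(nodes) - 1 else nodes[-1] + (nodes[-1] - nodes[-2])
    up = (x - left) / (xi - left)        # rising edge on [left, x_i]
    down = (right - x) / (right - xi)    # falling edge on [x_i, right]
    return np.maximum(np.minimum(up, down), 0.0)

def linear_interp(nodes, values, x):
    """f-hat(x) = sum_i y_i l(x_i)(x); it reproduces the sample at the nodes."""
    return sum(values[i] * hat_basis(nodes, i, x) for i in range(len(nodes)))

nodes = np.array([0.0, 1.0, 2.0, 3.0])
values = np.array([1.0, 3.0, 2.0, 5.0])
print(linear_interp(nodes, values, nodes))    # [1. 3. 2. 5.]
print(linear_interp(nodes, values, 1.5))      # 2.5
```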

Motivated by Example 3.1, we have (as in Sect. 2) the following Algorithm 3.2:

Algorithm 3.2

Consider an interval number sample set (from an n-variable function f, \(n>1\)) \(\{\left( {\pmb {\mathbb {x}}}_0,f_{\ \wr } ({\pmb {\mathbb {x}}}_0)\right) \mid {\pmb {\mathbb {x}}}_0 \in {{\mathbb {S}}}\}\), where \(({\pmb {\mathbb {x}}}_0,f_{\ \wr }({\pmb {\mathbb {x}}}_0))\) is the observe value of the true sample \((x_0,f(x_0))\), \({{\mathbb {S}}}={{\mathbb {S}}}_1\times {{\mathbb {S}}}_2\times \cdots \times {{\mathbb {S}}}_n\) (the set of all interpolation nodes), \({{\mathbb {S}}}_1=\Big \{{{\mathbb {x}}}_1^{(1)}, {{\mathbb {x}}}_1^{(2)}, \ldots , {{\mathbb {x}}}_1^{(m_1)}\Big \}\) with \({{\mathbb {x}}}_1^{(2)}\ominus {{\mathbb {x}}}_1^{(1)}=h_1^{(1)}>0\), \({{\mathbb {x}}}_1^{(3)}\ominus {{\mathbb {x}}}_1^{(2)}=h_1^{(2)}>0\), \(\ldots \), and \({{\mathbb {x}}}_1^{(m_1)}\ominus {{\mathbb {x}}}_1^{(m_1-1)}=h_1^{(m_1-1)}>0\), \({{\mathbb {S}}}_2=\Big \{{{\mathbb {x}}}_2^{(1)}, {{\mathbb {x}}}_2^{(2)}, \ldots , {{\mathbb {x}}}_2^{(m_2)}\Big \}\) with \({{\mathbb {x}}}_2^{(2)}\ominus {{\mathbb {x}}}_2^{(1)}=h_2^{(1)}>0\), \({{\mathbb {x}}}_2^{(3)}\ominus {{\mathbb {x}}}_2^{(2)}=h_2^{(2)}>0\), \(\ldots \), and \({{\mathbb {x}}}_2^{(m_2)}\ominus {{\mathbb {x}}}_2^{(m_2-1)}={h_2}^{(m_2-1)}>0\), \(\ldots \), \({{\mathbb {S}}}_n=\Big \{{{\mathbb {x}}}_n^{(1)}, {\mathbb {x}}_n^{(2)},\) \(\ldots , {{\mathbb {x}}}_n^{(m_n)}\Big \}\) with \({{\mathbb {x}}}_n^{(2)}\ominus {{\mathbb {x}}}_n^{(1)}=h_n^{(1)}>0\), \({{\mathbb {x}}}_n^{(3)}\ominus {{\mathbb {x}}}_n^{(2)}=h_n^{(2)}>0\), \(\ldots \), and \({{\mathbb {x}}}_n^{(m_n)}\ominus {{\mathbb {x}}}_n^{(m_n-1)}=h_n^{(m_n-1)}>0\). Then the n-variable linear-like interpolation function of f, based on the data set \(\{\left( {\pmb {\mathbb {x}}}_0,f_{\wr } ({\pmb {\mathbb {x}}}_0)\right) \mid {\pmb {\mathbb {x}}}_0 \in {{\mathbb {S}}}\}\), can be given by the following ways:

  1. (1)

    Step 1 Compute the n-variable linear-like interpolation function \({\underline{f}}({\pmb t})= \sum _{{\pmb {\mathbb {x}}}_0\in {{\mathbb {S}}}} p_1({f_{\wr }({\pmb {\mathbb {x}}}_0)}) l({\dot{x}}_1, {\dot{x}}_2, \ldots , {\dot{x}}_n)({\pmb t})\) \((\forall {\pmb t}=(t_1, t_2, \ldots , t_n)\in R^n)\) based on the real input–output sample set \(\{\big (({\dot{x}}_1, {\dot{x}}_2, \ldots , {\dot{x}}_n), p_1({f_{\wr }({\pmb {\mathbb {x}}}_0)}) \big ) \mid {\pmb {\mathbb {x}}}_0= ({{\mathbb {x}}}_1, {{\mathbb {x}}}_2, \ldots , {{\mathbb {x}}}_n) \in {{\mathbb {S}}}\}\), using the method given in Example 3.1(2). Step 2 Similarly, compute the n-variable linear-like interpolation function \({\bar{f}}({\pmb t})=\) \(\sum _{{\pmb {\mathbb {x}}}_0\in {{\mathbb {S}}}} p_2({f_{\wr }({\pmb {\mathbb {x}}}_0)}) l(\ddot{x}_1, \ddot{x}_2, \ldots , \ddot{x}_n)({\pmb t})\) \((\forall {\pmb t}=(t_1, t_2, \ldots , t_n)\in R^n) \) based on the real input–output sample set \({\{}\big ((\ddot{x}_1, \ddot{x}_2, \ldots , \ddot{x}_n), p_2({f_{\wr }({\pmb {\mathbb {x}}}_0)}) \big ) \mid {\pmb {\mathbb {x}}}_0= ({{\mathbb {x}}}_1, {{\mathbb {x}}}_2, \ldots , {{\mathbb {x}}}_n)\) \( \in {{\mathbb {S}}}\text {\}}\), using the method given in Example 3.1(2). Step 3 Obtain, based on Steps 1–2, the first n-variable linear-like interpolation function \({\check{f}}({\pmb {\mathbb {x}}})=\) \({\downharpoonleft }\ {\underline{f}}({\dot{x}}_1,{\dot{x}}_2,\ldots ,{\dot{x}}_n), {\bar{f}}(\ddot{x}_1,\ddot{x}_2,\ldots ,\ddot{x}_n)\ {\downharpoonright }\) \((\forall {\pmb {\mathbb {x}}}= ({{\mathbb {x}}}_1, {{\mathbb {x}}}_2, \ldots , {{\mathbb {x}}}_n) \in {{\mathbb {R}}}^n)\).

  2. (2)

    Step 1 Compute the n-variable linear-like interpolation function \(f_c({\pmb t})= \sum _{{\pmb {\mathbb {x}}}_0\in {{\mathbb {S}}}} f_{\wr }({\pmb {\mathbb {x}}}_0)^c l( {{\mathbb {x}}}^c_1, {{\mathbb {x}}}^c_2, \ldots , {{\mathbb {x}}}^c_n)({\pmb t})\) \( (\forall {\pmb t}=(t_1, t_2, \ldots , t_n) \in R^n)\) based on the real input–output sample set \({\{}\big (({{\mathbb {x}}}^c_1, {{\mathbb {x}}}^c_2, \ldots , {{\mathbb {x}}}^c_n), f_{\wr }({\pmb {\mathbb {x}}}_0)^c\big ) \mid {\pmb {\mathbb {x}}}_0= ({{\mathbb {x}}}_1, {{\mathbb {x}}}_2, \ldots , {{\mathbb {x}}}_n)\) \(\in {{\mathbb {S}}} {\}}\), using the method given in Example 3.1(2). Step 2 Compute the ordinary linear regression function (without the term \(b_0\)) \(r({\pmb \delta })=r(\delta _1,\ldots , \delta _n)= b_1\delta _1+\cdots +b_n\delta _n\) \((\forall {\pmb \delta }\in R^n)\), based on the real input–output sample set \({\{\left( ({{\mathbb {x}}}^r_1, {{\mathbb {x}}}^r_2, \ldots , {{\mathbb {x}}}^r_n), f_{\wr }({\pmb {\mathbb {x}}}_0)^r\right) \mid {\pmb {\mathbb {x}}}_0= ({{\mathbb {x}}}_1, {{\mathbb {x}}}_2, \ldots , {{\mathbb {x}}}_n)\in {\mathbb S}\}}\). Then \({\tilde{f}}({\pmb {\mathbb {x}}}) =f_c({{\mathbb {x}}}^c_1,{{\mathbb {x}}}^c_2,\ldots ,{{\mathbb {x}}}^c_n)\) \(\pm r({{\mathbb {x}}}^r_1,{{\mathbb {x}}}^r_2,\ldots ,{{\mathbb {x}}}^r_n) =\left\lceil f_c({{\mathbb {x}}}^c_1,{{\mathbb {x}}}^c_2,\ldots , {{\mathbb {x}}}^c_n),r({{\mathbb {x}}}^r_1,{{\mathbb {x}}}^r_2, \ldots ,{{\mathbb {x}}}^r_n)\right\rceil \) is the second n-variable linear-like interpolation function.
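As a concrete (and deliberately small) illustration of Algorithm 3.2, the sketch below builds the model \({\check{f}}\) with `scipy`'s multilinear grid interpolator standing in for the tensor-product linear-like interpolation of Example 3.1(2); the function name `algorithm_3_2` and the toy data are our own assumptions.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def algorithm_3_2(node_lo, node_hi, F_lo, F_hi):
    """Sketch of the model f-check in Algorithm 3.2.

    node_lo, node_hi : lists of 1-D arrays with the left / right endpoints of the
                       interpolation nodes along each of the n axes.
    F_lo, F_hi       : n-D arrays with the left / right endpoints of the observed
                       interval outputs on the node grid.
    Returns a function mapping the endpoint vectors of an interval point to the
    interval <underline-f(x-dot), bar-f(x-ddot)> (endpoints reordered if needed).
    """
    f_lower = RegularGridInterpolator(node_lo, F_lo, method="linear")   # Step 1
    f_upper = RegularGridInterpolator(node_hi, F_hi, method="linear")   # Step 2

    def f_check(x_lo, x_hi):                                            # Step 3
        lo, hi = float(f_lower(x_lo)), float(f_upper(x_hi))
        return min(lo, hi), max(lo, hi)
    return f_check

# Toy 2-variable grid: crisp (radius-0) nodes, interval outputs of radius 0.5.
t1, t2 = np.array([0.0, 1.0, 2.0]), np.array([0.0, 1.0])
Fc = np.add.outer(t1, t2)                    # centres f(t1, t2) = t1 + t2
model = algorithm_3_2([t1, t2], [t1, t2], Fc - 0.5, Fc + 0.5)
print(model([0.5, 0.5], [0.5, 0.5]))         # (0.5, 1.5): the true value 1.0 +/- 0.5
```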

The rationality of Algorithm 3.2 is guaranteed by the following Theorem 3.3. For a vector \({\pmb {\mathbb {x}}}= ({{\mathbb {x}}}_1,{{\mathbb {x}}}_2,\ldots , {{\mathbb {x}}}_n)\) with interval number coordinates, we write \({\pmb t}\in {\pmb {\mathbb {x}}}\) if \({\pmb t}=(t_1,t_2,\ldots ,t_n)\) and \((t_1,t_2,\ldots ,t_n)\in {{\mathbb {x}}}_1\times {{\mathbb {x}}}_2\times \cdots \times {\mathbb {x}}_n\).

Theorem 3.3

  1. (1)

    For each real input–output sample \({\pmb s}=\big \{ ({\pmb {\mathbb {x}}}_0^{\bullet },f_{\wr }({\pmb {\mathbb {x}}}_0)^{\bullet }) \ \mid \ {\pmb {\mathbb {x}}}_0 \in {{\mathbb {S}}}\big \}\) (where \({\pmb {\mathbb {x}}}_0^\bullet \in {\pmb {\mathbb {x}}}_0\) and \(f_{\wr }({\pmb {\mathbb {x}}}_0)^\bullet \in f_{\wr }({\pmb {\mathbb {x}}}_0)\)), let \(f_{\pmb s}({\pmb t})=f_{\pmb s}(t_1,\ldots ,t_n)\) be the n-variable linear-like interpolation function, based on the real input–output sample set \({\pmb s}\) and obtained by using the method given in Example 3.1(2). Then \(f_{\pmb s}({\pmb t})\) is also continuous on \({\pmb s}\).

  2. (2)

    Suppose \({\big \{( {\pmb {\mathbb {x}}}_0^{\bullet },f_{\wr }({\pmb {\mathbb {x}}}_0)^{\bullet })\ \mid \ {\pmb {\mathbb {x}}}_0 \in {{\mathbb {S}}}\}}\) (where \({\pmb {\mathbb {x}}}_0^{\bullet }\in {\pmb {\mathbb {x}}}_0\) and \(f_{\wr }({\pmb {\mathbb {x}}}_0)^{\bullet } \in f_{\wr }({\pmb {\mathbb {x}}}_0)\)) is the true sample set. Then, for each \(\varepsilon >0\), there exists a \(\delta >0\) such that \(|f_{\pmb s}({\pmb x}_0)- f_{\wr }({\pmb {\mathbb {x}}}_0)^{\bullet }|\) \(<\varepsilon \ \ (\forall {\pmb x}_0\in {\pmb {\mathbb {x}}}_0, \forall {\pmb {\mathbb {x}}}_0 =({{\mathbb {x}}}_1,{{\mathbb {x}}}_2,\ldots , {{\mathbb {x}}}_n)\in {{\mathbb {S}}})\) whenever \(\max \big \{{{\mathbb {x}}}^r_1, {{\mathbb {x}}}^r_{2}, \ldots , {{\mathbb {x}}}^r_{n},f_{\wr }({\pmb {\mathbb {x}}}_0)^r\big \}<\delta \), where \(f_{\pmb s}({\pmb t})=f_{\pmb s}(t_1,\ldots ,t_n)\) is the n-variable linear-like interpolation function, based on the real input–output sample set \({\pmb s}=\big \{ ({\pmb {\mathbb {x}}}_0^{\bullet },f_{\wr }({\pmb {\mathbb {x}}}_0)^{\bullet }) \ \mid \ {\pmb {\mathbb {x}}}_0 \in {{\mathbb {S}}}\big \}\) (where \({\pmb {\mathbb {x}}}_0^\bullet \in {\pmb {\mathbb {x}}}_0\) and \(f_{\wr }({\pmb {\mathbb {x}}}_0)^\bullet \in f_{\wr }({\pmb {\mathbb {x}}}_0)\)) and obtained by using the method given in Example 3.1(2).

Proof

Similar to that of Theorem 2.4. \(\square \)

Remark 3.4

In practical problems, if the set \({{\mathbb {S}}}\) of all interpolation nodes (related to the n-variable function f) is not an n-dimensional cube-like set, then we consider the new set \({{\mathbb {S}}}_-={{\mathbb {S}}}_1\times {{\mathbb {S}}}_2\times \cdots \times {{\mathbb {S}}}_n\) of all interpolation nodes (related to an n-variable function \(f_-\) which is an extension of f) and give the two n-variable linear-like interpolation functions \({\check{f}}_-\) and \({\tilde{f}}_-\) of \(f_-\) using Algorithm 3.2 (and thus Theorem 3.3 holds for \(f_-\), particularly for f), where \({{\mathbb {S}}}_i=\{P_i({\pmb {\mathbb {s}}}) \mid {\pmb {\mathbb {s}}} \in {{\mathbb {S}}}\} \ \ (i=1,2, \ldots , n)\), \(P_i : {{\mathbb {R}}}^n\longrightarrow {{\mathbb {R}}} \ \ (i=1,2, \ldots , n)\) is the i-th projection, and the value of \(f_-({\pmb {\mathbb {s}}})\) will be given by experts using the arithmetic average operator, a weighted average operator, or other aggregation operators (Beliakov et al. 2007; Cubillo et al. 2015; Deschrijver and Kerre 2008).

Example 3.5

  1. (1)

    The 1-variable linear-like interpolation functions are simple. Consider the interval input–output sample set \({{\mathbb {S}}}=\{({{\mathbb {x}}}_{i}, {\mathbb {y}}_{i}) \mid i= 1,2,\ldots , 41\}\) in (Chuang 2008, Table 2). By Algorithm 3.2, \({\check{f}}({\pmb {\mathbb {x}}}_i)= {\mathbb {y}}_{i} \ \ (i=1,2, \ldots , 41)\). By Algorithm 3.2(2), we have \([{\tilde{f}}({\pmb {\mathbb {x}}}_i)]^c= [{\mathbb {y}}_{i}]^c \ \ (i=1,2, \ldots , 41)\) and \(r(\delta )=25.0891\delta \), thus the minimum of the lengths of all intervals \({\tilde{f}}({\pmb {\mathbb {x}}}_i)\cap {\mathbb {y}}_{i} \ \ (i=1,2, \ldots , 41)\) is 0.0302. However, if we take \(r(\delta )\) to be the cubic spline interpolation function, then \({\tilde{f}}({\pmb {\mathbb {x}}}_i)= {\mathbb {y}}_{i} \ \ (i=1,2, \ldots , 41)\), and thus the minimum of the lengths of all intervals \({\tilde{f}}({\pmb {\mathbb {x}}}_i)\cap {\mathbb {y}}_{i} \ \ (i=1,2, \ldots , 41)\) is 0.9123.

  2. (2)

    Consider the special interval input–output sample set \({{\mathbb {S}}}=\{({{\mathbb {x}}}_{1,i},{{\mathbb {x}}}_{2,j}, {\mathbb {y}}_{i,j}) \mid i,j= 1,2,3\}\) as shown in Table 6 (where \(\delta _1\equiv 10\), \(\delta _2\equiv 0.1\), and r almost takes the same value 1).

Table 6 Special interval sample set

Step 1 \(l\big (x_1^{(1)}\big )(t_1) =\max {\big \{1-\frac{1}{300}|t_1-300|, 0\big \}}\), \(l\big (x_1^{(2)}\big )(t_1) =\max {\big \{1-\frac{1}{300}|t_1-600|, 0\big \}}\), and \(l\big (x_1^{(3)}\big )(t_1)=\max \big \{1-\frac{1}{300}|t_1-900|, 0\big \}\). Similarly, \(l\big (x_2^{(1)}\big )(t_2)= \max \{1-|t_2-5|, 0\}\), \(l\big (x_2^{(2)}\big )(t_2)=\max \{1-|t_2-6|, 0\}\), and \(l\big (x_2^{(3)}\big )(t_2)=\max \{1-|t_2-7|, 0\}\). Thus

$$\begin{aligned} f_c({\pmb t})= & {} y_{1,1}l\big (x_1^{(1)}\big )(t_1)l \big (x_2^{(1)}\big )(t_2) +y_{1,2}l\big (x_1^{(1)}\big )(t_1)l \big (x_2^{(2)}\big )(t_2)\\&+ y_{2,1}l\big (x_1^{(2)} \big )(t_1)l\big (x_2^{(1)} \big )(t_2) +y_{2,2}l\big (x_1^{(2)}\big )(t_1)l \big (x_2^{(2)}\big )(t_2)\\= & {} 50\Big (2-\frac{t_1}{300}\Big )(6-t_2)+ 60\Big (2-\frac{t_1}{300}\Big )(t_2-5)\\&+ 60\Big (\frac{t_1}{300}-1\Big )(6-t_2)+ 70\Big (\frac{t_1}{300}-1\Big )(t_2-5)\\= & {} \frac{1}{30}t_1+10t_2-10 \end{aligned}$$

if \({\pmb t}=(t_1,t_2)\in [300,600)\times [5,6)\); \(f_c({\pmb t}) =60\Big (2-\frac{t_1}{300}\Big )(7-t_2)+ 70\Big (2-\frac{t_1}{300}\Big )(t_2-6)+ 70\Big (\frac{t_1}{300}-1\Big )(7-t_2)+ 80\Big (\frac{t_1}{300}-1\Big )(t_2-6)=\frac{1}{30}t_1+10t_2-10\) if \({\pmb t}\in [300,600)\times [6,7)\); \(f_c({\pmb t}) =60\Big (3-\frac{t_1}{300}\Big )(6-t_2)+ 70\Big (3-\frac{t_1}{300}\Big )(t_2-5)+ 70\Big (\frac{t_1}{300}-2\Big )(6-t_2)+ 80\Big (\frac{t_1}{300}-2\Big )(t_2-5) =\frac{1}{30}t_1+10t_2-10\) if \({\pmb t}\in [600,900)\times [5,6)\); \(f_c({\pmb t}) =70\Big (3-\frac{t_1}{300}\Big )(7-t_2)+ 80\Big (3-\frac{t_1}{300}\Big )(t_2-6)+ 80\Big (\frac{t_1}{300}-2\Big )(7-t_2)+ 100\Big (\frac{t_1}{300}-2\Big )(t_2-6) =\frac{1}{30}t_1t_2-\frac{1}{6}t_1-10t_2+110\) if \({\pmb t}\in [600,900)\times [6,7)\); \(f_c({\pmb t})= y_{1,1}l\big (x_1^{(1)}\big )(t_1)l \big (x_2^{(1)}\big )(t_2) =\frac{1}{6}t_1t_2-\frac{2}{3}t_1\) if \({\pmb t}\in [0,300)\times [4,5)\); \(f_c({\pmb t})= y_{1,1}l\big (x_1^{(1)}\big )(t_1)l \big (x_2^{(1)}\big )(t_2)+ y_{1,2}l\big (x_1^{(1)}\big )(t_1)l \big (x_2^{(2)}\big )(t_2) =\frac{1}{30}t_1t_2\) if \({\pmb t}\in [0,300)\times [5,6)\); \(f_c({\pmb t})= y_{1,2}l\big (x_1^{(1)}\big )(t_1)l \big (x_2^{(2)}\big )(t_2)+ y_{1,3}l\big (x_1^{(1)}\big )(t_1)l \big (x_2^{(3)}\big )(t_2)=\frac{1}{30}t_1t_2\) if \({\pmb t}\in [0,300)\times [6,7)\); \(f_c({\pmb t})= y_{1,3}l\big (x_1^{(1)}\big )(t_1)l \big (x_2^{(3)}\big )(t_2) = -\frac{7}{30}t_1t_2+\frac{28}{15}t_1\) if \({\pmb t}\in [0,300)\times [7,8]\); \(f_c({\pmb t})= y_{1,1}l\big (x_1^{(1)}\big )(t_1)l \big (x_2^{(1)}\big )(t_2)+ y_{2,1}l\big (x_1^{(2)}\big )(t_1)l \big (x_2^{(1)}\big )(t_2) = \frac{1}{30}t_1t_2-\frac{2}{15}t_1+40t_2-160\) if \({\pmb t}\in [300,600)\times [4,5)\); \(f_c({\pmb t})= y_{1,3}l\big (x_1^{(1)}\big )(t_1)l \big (x_2^{(3)}\big )(t_2)+ y_{2,3}l\big (x_1^{(2)}\big )(t_1)l \big (x_2^{(3)}\big )(t_2) =-\frac{1}{30}t_1t_2+\frac{4}{15}t_1-60t_2+480\) if \({\pmb t}\in [300,600)\times [7,8]\); \(f_c({\pmb t})= y_{2,1}l\big (x_1^{(2)}\big )(t_1)l \big (x_2^{(1)}\big )(t_2)+ y_{3,1}l\big (x_1^{(3)}\big )(t_1)l \big (x_2^{(1)}\big )(t_2) =\frac{1}{30}t_1t_2-\frac{2}{15}t_1+40t_2-160\) if \({\pmb t}\in [600,900)\times [4,5)\); \(f_c({\pmb t})= y_{2,3}l\big (x_1^{(2)}\big )(t_1)l \big (x_2^{(3)}\big )(t_2)+y_{3,3}l\big (x_1^{(3)}\big )(t_1)l \big (x_2^{(3)}\big )(t_2) =-\frac{1}{15}t_1t_2+\frac{8}{15}t_1-40t_2+320\) if \({\pmb t}\in [600,900)\times [7,8]\); \(f_c({\pmb t})= y_{3,1}l\big (x_1^{(3)}\big )(t_1)l \big (x_2^{(1)}\big )(t_2) =-\frac{7}{30}t_1t_2+\frac{14}{15}t_1+280t_2-1120\) if \({\pmb t}\in [900,1200]\times [4,5)\); \(f_c({\pmb t})= y_{3,1}l\big (x_1^{(3)}\big )(t_1)l \big (x_2^{(1)}\big )(t_2)+ y_{3,2}l\big (x_1^{(3)}\big )(t_1)l \big (x_2^{(2)}\big )(t_2) =-\frac{1}{30}t_1t_2-\frac{1}{15}t_1+40t_2+80\) if \({\pmb t}\in [900,1200]\times [5,6)\); \(f_c({\pmb t})= y_{3,2}l\big (x_1^{(3)}\big )(t_1)l \big (x_2^{(2)}\big )(t_2)+ y_{3,3}l\big (x_1^{(3)}\big )(t_1)l \big (x_2^{(3)}\big )(t_2) =-\frac{1}{15}t_1t_2+\frac{2}{15}t_1+80t_2-160\) if \({\pmb t}\in [900,1200]\times [6,7)\); \(f_c({\pmb t})= y_{3,3}l\big (x_1^{(3)}\big )(t_1)l \big (x_2^{(3)}\big )(t_2) =\frac{1}{3}t_1t_2-\frac{8}{3}t_1-400t_2+3200\) if \({\pmb t}\in [900,1200]\times [7,8]\).
In a word, \(f_c({\pmb t})\)=\(\left\{ \begin{array}{ll} \frac{1}{30}t_1+10t_2-10, &{}\ {\pmb t}\in [300,600)\times [5,7),\\ \frac{1}{30}t_1+10t_2-10, &{}\ {\pmb t}\in [600,900)\times [5,6),\\ \frac{1}{30}t_1t_2-\frac{1}{6}t_1-10t_2+110,&{}\ {\pmb t}\in [600,900) \times [6,7),\\ \frac{1}{6}t_1t_2-\frac{2}{3}t_1,&{}\ {\pmb t}\in [0,300)\times [4,5),\\ \frac{1}{30}t_1t_2,&{}\ {\pmb t}\in [0,300)\times [5,7),\\ -\frac{7}{30}t_1t_2+\frac{28}{15}t_1,&{}\ {\pmb t}\in [0,300)\times [7,8],\\ \frac{1}{30}t_1t_2-\frac{2}{15}t_1+40t_2-160,&{}\ {\pmb t}\in [300,600)\times [4,5),\\ -\frac{1}{30}t_1t_2+\frac{4}{15}t_1-60t_2+480,&{}\ {\pmb t}\in [300,600)\times [7,8],\\ \frac{1}{30}t_1t_2-\frac{2}{15}t_1+40t_2-160,&{}\ {\pmb t}\in [600,900)\times [4,5),\\ -\frac{1}{15}t_1t_2+\frac{8}{15}t_1-40t_2+320,&{}\ {\pmb t}\in [600,900)\times [7,8],\\ -\frac{7}{30}t_1t_2+\frac{14}{15}t_1+280t_2-1120,&{}\ {\pmb t}\in [900,1200)\times [4,5),\\ -\frac{1}{30}t_1t_2-\frac{1}{15}t_1+40t_2+80,&{}\ {\pmb t}\in [900,1200)\times [5,6),\\ -\frac{1}{15}t_1t_2+\frac{2}{15}t_1+80t_2-160,&{}\ {\pmb t}\in [900,1200)\times [6,7),\\ \frac{1}{3}t_1t_2-\frac{8}{3}t_1-400t_2+3200,&{}\ {\pmb t}\in [900,1200)\times [7,8],\\ 0,&{}\ \text {otherwise}. \end{array} \right. \)
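The piecewise expressions above are nothing more than the tensor-product expansion of the hat (linear-like) basis functions from Step 1, written out region by region. The following Python sketch evaluates \(f_c\) directly from the basis functions; the grid nodes and the center outputs \(y_{i,j}\) in it are read off from the Step 1 formulas (Table 6 itself is not reproduced here), so they should be treated as reconstructed assumptions.

```python
# A minimal sketch of Step 1: evaluate the center interpolant f_c as a
# tensor product of hat ("linear-like") basis functions.  The nodes and the
# center outputs y_center[i][j] are read off from the Step 1 formulas, so
# they are reconstructions rather than a verbatim copy of Table 6.
x1_nodes, h1 = [300, 600, 900], 300     # centers of the x1-samples, spacing 300
x2_nodes, h2 = [5, 6, 7], 1             # centers of the x2-samples, spacing 1
y_center = [[50, 60, 70],               # y_center[i][j]: output at (x1_nodes[i], x2_nodes[j])
            [60, 70, 80],
            [70, 80, 100]]

def hat(node, h, t):
    """Basis function l(node)(t) = max(1 - |t - node|/h, 0)."""
    return max(1.0 - abs(t - node) / h, 0.0)

def f_c(t1, t2):
    """f_c(t) = sum_{i,j} y_center[i][j] * l(x1_i)(t1) * l(x2_j)(t2)."""
    return sum(y_center[i][j] * hat(x1_nodes[i], h1, t1) * hat(x2_nodes[j], h2, t2)
               for i in range(3) for j in range(3))

print(f_c(400, 7))   # 73.333..., the value f_c(400, 7) used in Step 4 below
print(f_c(600, 6))   # 70.0, reproduces the node value y_center[1][1]
```

Under these assumptions the sketch returns \(f_c(400,7)=73.{{\dot{3}}}\), which agrees with the value used in Step 4 below.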

Step 2 \(l\big (x_1^{(1)}-\delta _1\big )(t_1) =\max \{1-\frac{1}{300}|t_1-290|,\ 0\}\), \(l\big (x_1^{(2)}-\delta _1\big )(t_1)=\max \{1-\frac{1}{300}|t_1-590|, 0\}\), \(l\big (x_1^{(3)}-\delta _1\big )(t_1) =\max \{1-\frac{1}{300}|t_1-890|, 0\}\), \(l\big (x_2^{(1)}-\delta _2\big )(t_2)=\max \{1-|t_2-4.9|, 0\}\), \(l\big (x_2^{(2)}-\delta _2\big )(t_2)=\max \{1-|t_2-5.9|, 0\}\), \(l\big (x_2^{(3)}-\delta _2\big )(t_2) =\max \{1-|t_2-6.9|, 0\}\). Thus

$$\begin{aligned} {\underline{f}}({\pmb t})=\left\{ \begin{array}{ll} \frac{1}{30}t_1+10t_2-\frac{29}{3} &{}\ {\pmb t}\in [290,590)\times [4.9,6.9),\\ -\frac{1}{3000}t_1t_2+\frac{1049}{30000}t_1+\frac{3059}{300}t_2-\frac{31891}{3000}, &{}\ {\pmb t}\in [590,890)\times [4.9,5.9),\\ \frac{101}{3000}t_1t_2-\frac{4969}{30000}t_1-\frac{2959}{300}t_2+\frac{32317}{3000},&{}\ {\pmb t}\in [590,890) \times [5.9,6.9),\\ \frac{49}{300}t_1t_2-\frac{637}{1000}t_1+\frac{49}{30}t_2-\frac{637}{100},&{}\ {\pmb t}\in [0,290)\times [3.9,4.9),\\ \frac{1}{30}t_1t_2+\frac{1}{3}t_2,&{}\ {\pmb t}\in [0,290)\times [4.9,6.9),\\ -\frac{23}{100}t_1t_2+\frac{1817}{1000}t_1-\frac{23}{10}t_2+\frac{1817}{100},&{}\ {\pmb t}\in [0,290)\times [6.9,7.9],\\ \frac{1}{30}t_1t_2-\frac{13}{100}t_1+\frac{118}{3}t_2-\frac{767}{5},&{}\ {\pmb t}\in [290,590)\times [3.9,4.9),\\ -\frac{1}{30}t_1t_2+\frac{79}{300}t_1-\frac{178}{3}t_2+\frac{7031}{15},&{}\ {\pmb t}\in [290,590)\times [6.9,7.9],\\ \frac{1}{30}t_1t_2-\frac{13}{100}t_1+\frac{118}{3}t_2-\frac{767}{5},&{}\ {\pmb t}\in [590,890)\times [3.9,4.9),\\ -\frac{1}{15}t_1t_2+\frac{79}{150}t_1-\frac{119}{3}t_2+\frac{9401}{30},&{}\ {\pmb t}\in [590,890)\times [6.9,7.9],\\ -\frac{23}{100}t_1t_2+\frac{897}{1000}t_1+\frac{2737}{10}t_2-\frac{106743}{100},&{}\ {\pmb t}\in [890,1190)\times [3.9,4.9),\\ -\frac{33}{1000}t_1t_2-\frac{683}{10000}t_1+\frac{3927}{100}t_2+\frac{81277}{1000},&{}\ {\pmb t}\in [890,1190)\times [4.9,5.9),\\ -\frac{67}{1000}t_1t_2+\frac{1323}{10000}t_1+\frac{7973}{100}t_2-\frac{157437}{1000},&{}\ {\pmb t}\in [890,1190)\times [5.9,6.9),\\ \frac{33}{100}t_1t_2-\frac{2607}{1000}t_1-\frac{3927}{10}t_2+\frac{310233}{100},&{}\ {\pmb t}\in [890,1190)\times [6.9,7.9],\\ 0,&{}\ \text {otherwise}. \end{array} \right. \end{aligned}$$

Step 3 \(l\big (x_1^{(1)}+\delta _1\big )(t_1) =\max \{1-\frac{1}{300}|t_1-310|, 0\}\), \(l\big (x_1^{(2)}+\delta _1\big )(t_1)= \max \{1-\frac{1}{300}|t_1-610|, 0\}\), \(l\big (x_1^{(3)}+\delta _1\big )(t_1) =\max \{1-\frac{1}{300}|t_1-910|, 0\}\), \(l\big (x_2^{(1)}+\delta _2\big )(t_2)=\max \{1-|t_2-5.1|, 0\}\), \(l\big (x_2^{(2)}+\delta _2\big )(t_2)=\max \{1-|t_2-6.1|, 0\}\), \(l\big (x_2^{(3)}+\delta _2\big )(t_2)= \max \{1-|t_2-7.1|, 0\}\). Thus

$$\begin{aligned} {\bar{f}}({\pmb t})=\left\{ \begin{array}{ll} \frac{1}{30}t_1+10t_2-\frac{31}{3}, &{}\ {\pmb t}\in [310,610)\times [5.1,7.1),\\ \frac{1}{3000}t_1t_2+\frac{949}{30000}t_1+\frac{2939}{300}t_2-\frac{27889}{3000}, &{}\ {\pmb t}\in [610,910)\times [5.1,6.1),\\ \frac{33}{1000}t_1t_2-\frac{5029}{30000}t_1-\frac{1013}{100}t_2+\frac{336769}{3000},&{}\ {\pmb t}\in [610,910) \times [6.1,7.1),\\ \frac{17}{100}t_1t_2-\frac{697}{1000}t_1-\frac{17}{10}t_2+\frac{697}{100},&{}\ {\pmb t}\in [0,310)\times [4.1,5.1),\\ \frac{1}{30}t_1t_2-\frac{1}{3}t_2,&{}\ {\pmb t}\in [0,310)\times [5.1,7.1),\\ -\frac{71}{300}t_1t_2+\frac{1917}{1000}t_1+\frac{71}{30}t_2-\frac{1917}{100},&{}\ {\pmb t}\in [0,310)\times [7.1,8.1],\\ \frac{1}{30}t_1t_2-\frac{41}{300}t_1+\frac{122}{3}t_2-\frac{2501}{15},&{}\ {\pmb t}\in [310,910)\times [4.1,5.1),\\ -\frac{1}{30}t_1t_2+\frac{27}{100}t_1-\frac{182}{3}t_2+\frac{2457}{5},&{}\ {\pmb t}\in [310,610)\times [7.1,8.1],\\ \frac{1}{30}t_1t_2-\frac{41}{300}t_1+\frac{122}{3}t_2-\frac{2501}{15},&{}\ {\pmb t}\in [610,910)\times [4.1,5.1),\\ -\frac{1}{15}t_1t_2+\frac{27}{50}t_1-\frac{121}{3}t_2+\frac{3267}{10},&{}\ {\pmb t}\in [610,910)\times [7.1,8.1],\\ -\frac{71}{300}t_1t_2+\frac{2911}{3000}t_1+\frac{8591}{30}t_2-\frac{352231}{300},&{}\ {\pmb t}\in [910,1210)\times [4.1,5.1),\\ -\frac{101}{3000}t_1t_2-\frac{1949}{30000}t_1+\frac{12221}{300}t_2+\frac{235829}{3000},&{}\ {\pmb t}\in [910,1210)\times [5.1,6.1),\\ -\frac{199}{3000}t_1t_2+\frac{1343}{10000}t_1+\frac{24079}{300}t_2-\frac{162503}{1000},&{}\ {\pmb t}\in [910,1210)\times [6.1,7.1),\\ \frac{101}{300}t_1t_2-\frac{2727}{1000}t_1-\frac{12221}{30}t_2+\frac{329967}{100},&{}\ {\pmb t}\in [910,1210)\times [7.1,8.1],\\ 0,&{}\ \text {otherwise}. \end{array} \right. \end{aligned}$$

Step 4 \({\check{f}}({\pmb {\mathbb {t}}})= {\downharpoonleft } {\underline{f}}({\dot{t}}_1,{\dot{t}}_2),{\bar{f}}(\ddot{t}_1,\ddot{t}_2) {\downharpoonright }\ \ (\forall {\pmb {\mathbb {t}}}=({{\mathbb {t}}}_1,{{\mathbb {t}}}_2)\in {{\mathbb {R}}}^2)\). Using the classical least square estimation method, we can obtain a linear regression functionFootnote 5 \(r({\pmb \varepsilon })=r(\varepsilon _1,\varepsilon _2)\) (it has no constant term) based on the sample set

$$\begin{aligned} \big \{\overbrace{(10,0.1,1),\ldots ,(10,0.1,1)}^{7 \text {times}}, (10,0.1,1.1), (10,0.1,1)\big \}. \end{aligned}$$

Calculations show that \(r({\pmb \varepsilon })= r(\varepsilon _1,\varepsilon _2)=0.1011\varepsilon _1\). Therefore, \({\tilde{f}}({\pmb {\mathbb {t}}})= \lceil f_c({{\mathbb {t}}}^c_1,{{\mathbb {t}}}^c_2),0.1011\varepsilon _1\rceil = f_c({{\mathbb {t}}}^c_1,{{\mathbb {t}}}^c_2)\pm 0.1011\varepsilon _1\). For \({\pmb {\mathbb {s}}} =(\lceil 400,10\rceil ,\lceil 7,0.1\rceil )\in {{\mathbb {R}}}^2\), as \({\underline{f}}({\dot{s}}_1,{\dot{s}}_2)={\underline{f}}(390,6.9) =72.{{\dot{3}}} \), \(f_c({{\mathbb {s}}}^c_1,{{\mathbb {s}}}^c_2)=f_c(400,7)=73.{{\dot{3}}}\), and \({\bar{f}}(\ddot{s}_1,\ddot{s}_2)={\bar{f}}(410,7.1)=74.{{\dot{3}}}\), we have \({\check{f}}({\pmb {\mathbb {s}}})= \langle 72.{{\dot{3}}},74.{{\dot{3}}}\rangle \) and \({\tilde{f}}({\pmb {\mathbb {s}}})=\lceil 73.{{\dot{3}}},1.011\rceil = \langle 72.3223,74.3443\rangle \).
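Steps 1–4 can be packaged into a few lines of code. The sketch below is self-contained and assumes the same reconstructed grid as in the previous sketch, input radii \(\delta _1=10\) and \(\delta _2=0.1\), a common output radius 1, and the radius regression \(r({\pmb \varepsilon })=0.1011\varepsilon _1\) obtained above; under these assumptions it reproduces \({\check{f}}({\pmb {\mathbb {s}}})\) and \({\tilde{f}}({\pmb {\mathbb {s}}})\) for \({\pmb {\mathbb {s}}} =(\lceil 400,10\rceil ,\lceil 7,0.1\rceil )\).

```python
# A sketch of Steps 1-4: the two linear-like interpolants for interval inputs.
# Assumptions: grid centers/outputs as in the previous sketch, input radii
# delta1 = 10 and delta2 = 0.1, output radius 1, and r(eps) = 0.1011*eps1.
x1, h1, d1 = [300, 600, 900], 300, 10
x2, h2, d2 = [5, 6, 7], 1, 0.1
yc = [[50, 60, 70], [60, 70, 80], [70, 80, 100]]   # center outputs
ry = 1                                             # common output radius

def hat(node, h, t):
    return max(1.0 - abs(t - node) / h, 0.0)

def interp(t1, t2, shift1, shift2, yshift):
    """Tensor-product hat interpolant on the shifted nodes x_i + shift."""
    return sum((yc[i][j] + yshift) * hat(x1[i] + shift1, h1, t1) * hat(x2[j] + shift2, h2, t2)
               for i in range(3) for j in range(3))

f_lower = lambda t1, t2: interp(t1, t2, -d1, -d2, -ry)   # Step 2: underline f
f_upper = lambda t1, t2: interp(t1, t2, +d1, +d2, +ry)   # Step 3: bar f
f_center = lambda t1, t2: interp(t1, t2, 0.0, 0.0, 0.0)  # Step 1: f_c

def f_check(s1, s2):      # s1 = (lo, hi), s2 = (lo, hi): interval inputs
    lo, hi = f_lower(s1[0], s2[0]), f_upper(s1[1], s2[1])
    return (min(lo, hi), max(lo, hi))

def f_tilde(s1, s2):      # returns (center, radius)
    c = f_center(0.5 * (s1[0] + s1[1]), 0.5 * (s2[0] + s2[1]))
    return (c, 0.1011 * 0.5 * (s1[1] - s1[0]))            # r(eps) = 0.1011*eps1

s1, s2 = (390, 410), (6.9, 7.1)                           # s = (<390,410>, <6.9,7.1>)
print(f_check(s1, s2))    # approx (72.333, 74.333)
print(f_tilde(s1, s2))    # approx (73.333, 1.011)
```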

Step 5 If the true sample set \(S=\{(x_{1,i},x_{2,j},y_{i,j}) \mid i,j=1,2,3\}\) consists of the centers of the interval numbers in \({{\mathbb {S}}}\), then the computed results and the true samples are listed in Table 7, where \(y^c_{i,j}=f_c(x_{1,i},x_{2,j})\), \(y^-_{i,j}= {\underline{f}}(x_{1,i}-10^{-2},x_{2,j}-10^{-5})\), and \(y^+_{i,j}={\bar{f}}(x_{1,i}+10^{-2},x_{2,j}+10^{-5})\).

Table 7 Comparison with true sample

Example 3.6

Consider the special interval input–output sample set \({{\mathbb {S}}}=\big \{(300\pm 10, 5\pm 2,50\pm 5), (300\pm 10, 7\pm 2,70\pm 5), (600\pm 10, 6\pm 2,70\pm 5), (900\pm 10, 5\pm 2,70\pm 5), (900\pm 10, 7\pm 2,90\pm 5) \big \}\) (from a function f) in Table 8 and its completion (from the corresponding function \(f_-\)) in Table 9, where the unknown samples are determined using an arithmetic average operator (of course, we can also use an appropriate aggregation operator, a t-norm, a t-conorm, etc.; a small numerical sketch of this completion is given after Table 9): \(*=0.5(\lceil 50,1\rceil \oplus \lceil 70,1\rceil )=\lceil 60,1\rceil \), \(**=0.5(\lceil 50,1\rceil \oplus \lceil 70,1\rceil )=\lceil 60,1\rceil \), \(\star =0.5(\lceil 70,1\rceil \oplus \lceil 90,1\rceil )=\lceil 80,1\rceil \), \(\star \star =0.5(\lceil 70,1\rceil \oplus \lceil 90,1\rceil )=\lceil 80,1\rceil \). Let \(f_c\), \({\underline{f}}\), \({\bar{f}}\), \({\check{f}}\), and \({\tilde{f}}\) be as in Example 3.5(2), then

By Algorithm 3.2, and .

Table 8 Special interval input–output sample
Table 9 Completion of Table 8
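As mentioned above, the completion of Table 8 reduces to a handful of interval operations. The sketch below reproduces \(*\), \(**\), \(\star \), and \(\star \star \); which two known entries enter each average is an assumption read off from the printed values, and any other aggregation operator could be substituted.

```python
# A sketch of the completion used in Example 3.6: each missing grid output is
# filled with the arithmetic average of two known neighbouring outputs, using
# the interval operations a (+) b and 0.5 * a.
def i_add(a, b):                 # <a1,a2> (+) <b1,b2> = <a1+b1, a2+b2>
    return (a[0] + b[0], a[1] + b[1])

def i_scale(c, a):               # c * <lo, hi> for a scalar c >= 0
    return (c * a[0], c * a[1])

def cr(center, radius):          # center-radius form -> <lo, hi>
    return (center - radius, center + radius)

y_300_5, y_300_7 = cr(50, 1), cr(70, 1)
y_900_5, y_900_7 = cr(70, 1), cr(90, 1)

fill_300_6 = i_scale(0.5, i_add(y_300_5, y_300_7))   # *    : (59, 61), i.e. |-60, 1-|
fill_600_5 = i_scale(0.5, i_add(y_300_5, y_900_5))   # **   : (59, 61), i.e. |-60, 1-|
fill_600_7 = i_scale(0.5, i_add(y_300_7, y_900_7))   # star : (79, 81), i.e. |-80, 1-|
fill_900_6 = i_scale(0.5, i_add(y_900_5, y_900_7))   # star star: (79, 81), i.e. |-80, 1-|
print(fill_300_6, fill_600_5, fill_600_7, fill_900_6)
```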

4 Applications

Example 4.1

Now we give detailed solutions to problem (I) in Sect. 1.

  1. (1)

    To determine the grade by using the linear regression model presented in Sect. 2. Firstly, compute the 3-variable linear regression functions: \({\underline{f}}({\pmb t})= -1.5253-0.0141t_1+1.4017t_2+3.5535t_3\), \(f_c({\pmb t})=-1.514-0.0114t_1+0.1363t_2+4.6991t_3\), \({\bar{f}}({\pmb t})=-0.8411-0.0259t_1+1.893t_2+1.917t_3\), and \(r({\pmb \delta })=0.0075\delta _1+ 0.1585\delta _2 - 0.169\delta _3\). By Algorithm 2.3, \({\hat{f}}({\pmb {\mathbb {t}}})= \langle -1.5253,-0.8411\rangle \oplus \langle -0.0266,-0.0141\rangle {\mathbb {t}}_1 \oplus \langle 1.4017,1.893\rangle {\mathbb {t}}_2\oplus \langle 1.917,3.5535\rangle {\mathbb {t}}_3\), \({\tilde{f}}({\pmb {\mathbb {t}}})= \lceil -1.514-0.0114{\mathbb {t}}^c_1+ 0.1363{\mathbb {t}}^c_2 +4.6991{\mathbb {t}}^c_3, 0.0075\delta _1 +0.1585\delta _2- 0.169\delta _3\rceil \), and \({\check{f}}({\pmb {\mathbb {t}}}) = {\downharpoonleft } {\underline{f}}({\dot{t}}_1,{\dot{t}}_2,{\dot{t}}_3), {\bar{f}}(\ddot{t}_1,\ddot{t}_2,\ddot{t}_3) {\downharpoonright }\ \ (\forall {\pmb {\mathbb {t}}}= ({\mathbb {t}}_1,{\mathbb {t}}_2,{\mathbb {t}}_3)\in {{\mathbb {R}}}^3)\). Secondly, \({\hat{f}}({\pmb {\mathbb {t}}}_0)=\langle 1.4635,4.7158\rangle \), \({\check{f}}({\pmb {\mathbb {t}}}_0)= {\downharpoonleft } 2.983,3.1963 {\downharpoonright }\), and \({\tilde{f}}({\pmb {\mathbb {t}}}_0)=\lceil 2.0694,0.0551\rceil \). Finally, experts determine the grade of \({\pmb {\mathbb {t}}}_0\) based on these computation results (see also the sketch following this example).

  2. (2)

    To determine the grade by using the linear-like interpolation model presented in Sect. 3. Firstly, supplement Table 9 (which we will call Table 10). In Table 10, only \(2^3\) samples are needed because, for the computation of each of \(\{{\check{f}}({\pmb {\mathbb {t}}}_0), {\tilde{f}}({\pmb {\mathbb {t}}}_0)\}\), only the corresponding \(2^3\) basis functions relying on the sample will be used. Secondly, compute \({\check{f}}({\pmb {\mathbb {t}}}_0)\) or \({\tilde{f}}({\pmb {\mathbb {t}}}_0)\) (and thus determine the grade). We omit the computation (including Table 10) here because it is tedious unless a suitable program is available.

  3. (3)

    To determine the grade by using some easy-to-think-of methods (which can also be discovered in a plausible manner). For example, first take three Astragali Radixes \({\pmb {\mathbb {t}}}_1=(\lceil 36.5,9\rceil , \lceil 1.3,0.2\rceil , \lceil 0.9,0.1\rceil )\), \({\pmb {\mathbb {t}}}_2=(\lceil 36.7,10.4\rceil , \lceil 1.7,0.2\rceil , \lceil 1.2,0.2\rceil )\), and \({\pmb {\mathbb {t}}}_3=(\lceil 32.4,5.5\rceil , \lceil 1,0.1\rceil , \lceil 0.8,0.1\rceil )\) (each of which is similar to \({\pmb {\mathbb {t}}}_0\) in one aspect) from Table 9. Then compute

    $$\begin{aligned} {\mathbb {d}}_1= & {} d({\pmb {\mathbb {t}}}_0,{\pmb {\mathbb {t}}}_1) ={\downharpoonleft } |(35.8-5.4)-(36.5-9)|, |(35.8+5.4)-(36.5+9)| {\downharpoonright }\\&\oplus {\downharpoonleft } |(1.7-0.27)-(1.3-0.2)|, |(1.7+0.27)-(1.3+0.2)| {\downharpoonright }\\&\oplus {\downharpoonleft } |(0.8-0.1)-(0.9-0.1)|, |(0.8+0.1)-(0.9+0.1)| {\downharpoonright }\\= & {} \langle 2.9,4.3\rangle \oplus \langle 0.33,0.47\rangle \oplus \langle 0.1,0.1\rangle =\langle 3.33,4.87\rangle ,\\ {\mathbb {d}}_2= & {} d({\pmb {\mathbb {t}}}_0,{\pmb {\mathbb {t}}}_2) ={\downharpoonleft } |(35.8-5.4)-(36.7-10.4)|, |(35.8+5.4)-(36.7+10.4)| {\downharpoonright }\\&\oplus {\downharpoonleft } |(1.7-0.27)-(1.7-0.2)|, |(1.7+0.27)-(1.7+0.2)| {\downharpoonright }\\&\oplus {\downharpoonleft } |(0.8-0.1)-(1.2-0.2)|, |(0.8+0.1)-(1.2+0.2)| {\downharpoonright }\\= & {} \langle 4.1,5.9\rangle \oplus \langle 0.07,0.07\rangle \oplus \langle 0.3,0.5\rangle =\langle 4.47,6.47\rangle ,\\ {\mathbb {d}}_3= & {} d({\pmb {\mathbb {t}}}_0, {\pmb {\mathbb {t}}}_3) ={\downharpoonleft } |(35.8-5.4)-(32.4-5.5)|, |(35.8+5.4)-(32.4+5.5)| {\downharpoonright }\\&\oplus {\downharpoonleft } |(1.7-0.27)-(1-0.1)|, |(1.7+0.27)-(1+0.1)| {\downharpoonright }\\&\oplus {\downharpoonleft } |(0.8-0.1)-(0.8-0.1)|, |(0.8+0.1)-(0.8+0.1)| {\downharpoonright }\\= & {} \langle 3.3,3.5\rangle \oplus \langle 0.53,0.87\rangle \oplus \langle 0.1,0.1\rangle =\langle 3.83,4.37\rangle . \end{aligned}$$

Finally, as \({\mathbb {d}}_1\leqslant {\mathbb {d}}_3\leqslant {\mathbb {d}}_2\), \({\pmb {\mathbb {t}}}_0\) is most similar to \({\pmb {\mathbb {t}}}_1\), and thus the grade of \({\pmb {\mathbb {t}}}_0\) can be judged as 4.
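Both parts (1) and (3) of this example reduce to a few elementary operations on interval numbers. The sketch below first evaluates \({\tilde{f}}({\pmb {\mathbb {t}}}_0)\) from the printed regression coefficients (the radii of \({\pmb {\mathbb {t}}}_0\) used here are an assumption, and small rounding differences against \(\lceil 2.0694,0.0551\rceil \) are expected) and then ranks the three distances with the total order \(\leqslant \).

```python
from fractions import Fraction as F

# A minimal sketch of the two computations in Example 4.1.
# Part (1): evaluate f~ at t0 with the printed regression coefficients;
# the radii (5.4, 0.2, 0.1) are an assumption here.
centers = [35.8, 1.7, 0.8]
radii   = [5.4, 0.2, 0.1]
c_coef  = [-0.0114, 0.1363, 4.6991]              # coefficients of f_c (constant term -1.514)
r_coef  = [0.0075, 0.1585, -0.169]               # coefficients of r(delta)
center  = -1.514 + sum(a * t for a, t in zip(c_coef, centers))
radius  = sum(a * d for a, d in zip(r_coef, radii))
print(round(center, 4), round(radius, 4))        # approx 2.0689 and 0.0553

# Part (3): rank the three interval distances with the total order <=
# (smaller center first; for equal centers the wider interval comes first).
def order_key(iv):
    lo, hi = iv
    return (lo + hi, -(hi - lo))                 # (twice the center, minus the width)

d1, d2, d3 = (F("3.33"), F("4.87")), (F("4.47"), F("6.47")), (F("3.83"), F("4.37"))
ranking = sorted([("d1", d1), ("d2", d2), ("d3", d3)], key=lambda p: order_key(p[1]))
print([name for name, _ in ranking])             # ['d1', 'd3', 'd2']: t0 is closest to t1
```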

Example 4.2

In a practical problem with an interval input–output sample and a given interval input \({\pmb {\mathbb {t}}}\), how should the interval output be obtained? The suggested strategies are as follows:

  1. (1)

    If the sample is managed so carefully that it is as in Examples 2.6 and 3.1 (i.e., the radii are the same within each variable's data), then we can compute the output of \({\pmb {\mathbb {t}}}\) just using one of the five kinds of functions (i.e., the linear regression functions or the linear-like interpolation functions given in this paper; see Corollary 2.7 and Examples 3.5 and 3.6 for details); of course, we can also take the output of \({\pmb {\mathbb {t}}}\) to be an aggregation (for example, a simple weighted average) of these outputs.

  2. (2)

    In other cases, we can compute the output of \({\pmb {\mathbb {t}}}\) using one of the three kinds of linear regression functions given in this paper (see Example 4.1(1) for details); we can also compute the output of \({\pmb {\mathbb {t}}}\) as in the last part of Example 4.1(2). Of course, we can also take the output of \({\pmb {\mathbb {t}}}\) to be an appropriate aggregation of these outputs.

Example 4.3

Consider the control problem \(\frac{\mathrm{d}{\pmb x}}{\mathrm{d}t}= (f_1({\pmb x}), f_2({\pmb x}))^\mathrm{T}+{\pmb u}(t)\) in (II) of Sect. 1. Assume the sample obtained on \(y_1=f_1({\pmb x})\) is as in Table 4 of Example 2.6, and the sample obtained on \(y_2=f_2({\pmb x})\) is as in Table 8 of Example 3.6. Replacing \((f_1({\pmb x}),f_2({\pmb x}))\) by \((f_{1,c}({\pmb x}),f_{2,c}({\pmb x}))\) (resp., \(({\underline{f}}_{1}({\pmb x}),{\underline{f}}_{2}({\pmb x}))\), \(({\bar{f}}_{1}({\pmb x}),{\bar{f}}_{2}({\pmb x}))\); see Example 3.5(2)), we obtain the following three systems:

$$\begin{aligned}&\frac{\mathrm{d}{\pmb x}}{\mathrm{d}t} =\left( \begin{array}{c} x^{\prime }_1(t)\\ x'_2(t) \end{array}\right) = \left( \begin{array}{c} f_{1,c}({\pmb x})\\ f_{2,c}({\pmb x}) \end{array}\right) + \left( \begin{array}{c} u_1(t)\\ u_2(t) \end{array}\right) \end{aligned}$$
(3)
$$\begin{aligned}&\frac{\mathrm{d}{\pmb x}}{\mathrm{d}t}= \left( \begin{array}{c} {\underline{f}}_{1}({\pmb x})\\ {\underline{f}}_{2}({\pmb x}) \end{array}\right) + \left( \begin{array}{c} u_1(t)\\ u_2(t) \end{array}\right) \end{aligned}$$
(4)
$$\begin{aligned}&\frac{\mathrm{d}{\pmb x}}{\mathrm{d}t}= \left( \begin{array}{c} {\bar{f}}_{1}({\pmb x})\\ {\bar{f}}_{2}({\pmb x}) \end{array}\right) + \left( \begin{array}{c} u_1(t)\\ u_2(t) \end{array}\right) \end{aligned}$$
(5)

We claim that system (3) is asymptotically stable if we take \((u_1(t),u_2(t))^\mathrm{T}=\big (-k_1x_1(t)-f_{1,c}(\pmb {x}(t)), -k_2x_2(t)-f_{2,c}(\pmb {x}(t))\big )^\mathrm{T} \ \ (k_1>0, k_2>0)\). Indeed, define the Lyapunov function by \(2V({\pmb x})=x_1^2+x_2^2\). Then \(\frac{\mathrm{d}V}{\mathrm{d}t}=x_1\frac{\mathrm{d}x_1}{\mathrm{d}t}+x_2\frac{\mathrm{d}x_2}{\mathrm{d}t} =x_1[f_{1,c}(\pmb {x})+u_1]+x_2[f_{2,c}(\pmb {x})+u_2]= -k_1x^2_1-k_2x^2_2\le 0\), with equality only at \({\pmb x}={\pmb 0}\), which implies that system (3) is asymptotically stable. Similarly, systems (4) and (5) are also asymptotically stable. The idea and method used here can be applied immediately to control chaotic fractional-order neural networks (Han et al. 2020).
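A quick numerical check of this feedback is easy to set up. The sketch below uses placeholder functions \(f_{1,c}\) and \(f_{2,c}\) (hypothetical stand-ins, since the actual interpolants depend on the sampled tables) and integrates system (3) with the proposed control by the Euler method; the closed loop reduces to \({\dot{x}}_i=-k_ix_i\), so the state and the Lyapunov function V shrink toward 0.

```python
import numpy as np

# A minimal sketch of the feedback in Example 4.3.  f1c and f2c are
# hypothetical placeholders for the center interpolants obtained from the
# samples; their exact form does not matter because the control cancels them.
def f1c(x):
    return 0.5 * x[0] * x[1]

def f2c(x):
    return np.sin(x[0]) + x[1]

k1, k2 = 1.0, 2.0
x = np.array([3.0, -2.0])            # arbitrary initial state
dt, steps = 1e-3, 10_000

for _ in range(steps):
    u = np.array([-k1 * x[0] - f1c(x), -k2 * x[1] - f2c(x)])
    dx = np.array([f1c(x), f2c(x)]) + u    # closed loop: (-k1*x1, -k2*x2)
    x = x + dt * dx

V = 0.5 * (x[0] ** 2 + x[1] ** 2)
print(x, V)                          # both components (and V) shrink toward 0
```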

5 Conclusions

Intelligent doctors, aided by Traditional Chinese Medicine (TCM for short), can treat diarrhea, colds, fevers, and tonsillitis effectively (and even coronavirus disease 2019, COVID-19 for short, to a degree) with smaller side effects. The present paper confirms that interval input–output samples (which have always been major samples or data in TCM) can be handled satisfactorily with the help of models (derived from classical ones) and practitioners' cooperation. Given an n-variable interval input-1-variable interval output sample set \({{\mathbb {S}}}\), we have illustrated in detail how to establish, in Polya's pattern of discovery (mainly using specialization and analogy to explore solutions to problems), three linear regression models and two linear-like interpolation models relying on \({{\mathbb {S}}}\). Each model is proved to be a universal approximator of the corresponding model based on the true samples under some easy-to-satisfy conditions. As the computations only involve centers, left endpoints, or right endpoints of the interval numbers in the sample, off-the-shelf software can be utilized. Practitioners can optimize these models in the following ways:

  1. (1)

    Collect directly, or otherwise obtain, interval samples whose entries have the same radius r, with r as small as possible [this can be realized by practitioners themselves directly or by experts indirectly using three-way decision theory (Yao 2017)].

  2. (2)

    Take \(g_1({\pmb {{\mathbb {x}}}})= {\mathbb {w}}_1{\hat{f}} ({\pmb {{\mathbb {x}}}})\oplus {\mathbb {w}}_2{\check{f}}({\pmb {{\mathbb {x}}}})\oplus {\mathbb {w}}_3{\tilde{f}}({\pmb {{\mathbb {x}}}})\) to be the new model in the case of the linear regression model, where \(\{{\mathbb {w}}^c_1,{\mathbb {w}}^c_2,{\mathbb {w}}^c_3\}\subseteq (0,1)\) satisfying \({\mathbb {w}}^c_1+{\mathbb {w}}^c_2+{\mathbb {w}}^c_3\approx 1\) may be determined by experts; take \(g_2({\pmb {{\mathbb {x}}}})={\mathbb {w}}_1{\check{f}}({\pmb {{\mathbb {x}}}})\oplus {\mathbb {w}}_2{\tilde{f}}({\pmb {{\mathbb {x}}}})\) to be the new model in the case of the linear-like interpolation model, where \(\{{\mathbb {w}}^c_1,{\mathbb {w}}^c_2\}\subseteq (0,1)\) satisfying \({\mathbb {w}}^c_1+{\mathbb {w}}^c_2\approx 1\) may be determined by experts (a small sketch of such an aggregation follows this list).
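As a small illustration of item (2), the sketch below aggregates the two interpolation outputs obtained for \({\pmb {\mathbb {s}}}\) in Example 3.5(2) (rounded here) with hypothetical interval weights of center 0.5, using the operations \(\oplus \) and \(\odot \).

```python
# A sketch of the aggregation g2 = w1 (.) f_check (+) w2 (.) f_tilde with
# hypothetical interval weights (centers 0.5, radius 0.01); the two interval
# outputs below are the (rounded) ones computed for s in Example 3.5(2).
def i_add(a, b):                 # <a1,a2> (+) <b1,b2> = <a1+b1, a2+b2>
    return (a[0] + b[0], a[1] + b[1])

def i_mul(a, b):                 # <a1,a2> (.) <b1,b2>: endpoint products, reordered
    lo, hi = a[0] * b[0], a[1] * b[1]
    return (min(lo, hi), max(lo, hi))

w1 = w2 = (0.49, 0.51)                          # hypothetical weights, centers sum to 1
f_check_s = (72.33, 74.33)
f_tilde_s = (72.32, 74.34)
g2 = i_add(i_mul(w1, f_check_s), i_mul(w2, f_tilde_s))
print(g2)    # approx (70.88, 75.82): a slightly widened consensus interval
```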

Notice that the models for interval samples in this paper stem from the classical linear regression model or the classical linear interpolation model; thus practitioners can similarly work out other new models for interval samples based on other classical models for real-number samples. Theoretically, this work can be generalized further (e.g., by replacing the operations with demand-oriented aggregation-like, t-norm-like, or t-conorm-like operations to be coined by experts), if necessary. Finally, some new universal approximators can also be discovered and used to control chaotic fractional-order neural networks (Han et al. 2020) based on this research (which will be our future work).