
Article

Pointwise Sharp Moderate Deviations for a Kernel Density Estimator

1 School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China
2 CY University, AGM UMR 8088, Saint-Martin, 95000 Cergy-Pontoise, France
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(20), 3161; https://doi.org/10.3390/math12203161
Submission received: 27 July 2024 / Revised: 9 September 2024 / Accepted: 25 September 2024 / Published: 10 October 2024
(This article belongs to the Special Issue New Trends in Stochastic Processes, Probability and Statistics)

Abstract: Let $f_n$ be the non-parametric kernel density estimator built from a kernel function $K$ and a sequence of independent and identically distributed random vectors taking values in $\mathbb{R}^d$. Under mild conditions, we establish sharp moderate deviations for this estimator; that is, we provide an equivalent for its tail probabilities.
MSC:
60F10; 62G07; 60E05; 62E20

1. Introduction

Let $\{X_i;\ i\ge 1\}$ be a sequence of independent and identically distributed (i.i.d.) random vectors taking values in $\mathbb{R}^d$, defined on a probability space $(\Omega,\mathcal{F},\mathbb{P})$, with density function $f$. Let $K:\mathbb{R}^d\to\mathbb{R}$ be a kernel function. The kernel density estimator of $f$ is defined by
$$f_n(x)=\frac{1}{na_n^d}\sum_{i=1}^{n}K\Big(\frac{x-X_i}{a_n}\Big),\qquad x=(x_1,x_2,\ldots,x_d)^T\in\mathbb{R}^d,$$
where $\{a_n,\ n\ge 1\}$ is a bandwidth sequence, that is, a sequence of positive numbers satisfying
$$a_n\to 0,\qquad na_n^d\to+\infty\qquad\text{as } n\to+\infty.$$
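As a concrete illustration (ours, not part of the paper), the estimator above can be sketched in a few lines of Python with a product Gaussian kernel; the function name and the particular bandwidth choice below are our assumptions.

```python
import numpy as np

def kde(x, X, a):
    """Kernel density estimate f_n(x) with a product Gaussian kernel.

    x : (d,) evaluation point; X : (n, d) i.i.d. sample; a : bandwidth a_n > 0.
    """
    n, d = X.shape
    z = (x - X) / a                                   # (n, d) scaled differences
    K = np.exp(-0.5 * np.sum(z * z, axis=1)) / (2 * np.pi) ** (d / 2)
    return K.sum() / (n * a ** d)

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 1))                        # N(0, 1) sample, d = 1
a = 2000 ** (-1 / 5)                                  # a_n -> 0 and n a_n^d -> infinity
est = kde(np.array([0.0]), X, a)                      # f(0) = 1/sqrt(2 pi) ~ 0.399
```

The rate $n^{-1/5}$ is the classical mean-squared-error-optimal choice for $d=1$ and a twice-differentiable $f$; any sequence satisfying the two displayed conditions is admissible for the results of this paper.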
A comprehensive and synthetic reference for such estimators is [1]. Among the huge number of applications of kernel density estimation, let us cite the elegant paper [2], which uses this estimator for an important problem related to green algae; our results may be used to derive a decision rule for this important ecological question.
In this paper, we are interested in pointwise sharp moderate deviations for $\{f_n,\ n\ge1\}$, obtained by the empirical process approach; the volume [3] is a perfect overview of such questions. In order to present our main result, let us first introduce some notations and assumptions. Let $g:\mathbb{R}^d\to\mathbb{R}$ be a real function. As usual, denote by
$$\|g\|_p=\Big(\int_{\mathbb{R}^d}|g(x)|^p\,dx\Big)^{1/p},\quad 1\le p<\infty,\qquad\text{and}\qquad \|g\|_\infty=\sup_{x\in\mathbb{R}^d}|g(x)|$$
the $L^p$-norm of $g$ and the supremum norm, respectively.
The consistency of the kernel density estimator has been studied widely. Let $d=1$ and let $f$ be continuously differentiable on $\mathbb{R}$ such that $\|f\|_\infty<\infty$ and $\|f'\|_\infty<\infty$ ($f'$ is the derivative of $f$). In addition, suppose that $\lim_{n\to\infty}na_n=\infty$ and $\lim_{n\to\infty}na_n^2=c\ge0$, where $c$ is a constant. Under some mild conditions, Joutard [4] proved the following pointwise sharp large deviation: for any $\alpha>0$, as $n\to\infty$,
$$\mathbb{P}\big(f_n(x)-f(x)>\alpha\big)=\frac{\exp\{-na_n\Lambda^*(\alpha)+cH(\tau)\}}{\tau\big(2\pi na_nf(x)I''(\tau)\big)^{1/2}}\,\big(1+o(1)\big),$$
where $\Lambda^*(\alpha)=\tau\big(\alpha+f(x)\big)-f(x)I(\tau)$, the point $\tau\in[0,\alpha]$ being such that $\alpha+f(x)=f(x)I'(\tau)$, and $H(\tau)=f^2(x)I'^2(\tau)/2+f'(x)J(\tau)$, with $J(t)=\int_{\mathbb{R}}z\exp\{tK(z)\}\,dz$. For uniform consistency, under some mild conditions, Gao [5] proved the following moderate deviation principle (MDP) result. Let $\{b_n,\ n\ge1\}$ be a sequence of positive real numbers satisfying
$$\frac{b_n}{\sqrt{na_n^d}}\to+\infty,\qquad \frac{na_n^d\log a_n^{-1}}{b_n^2}\to+\infty\qquad\text{as } n\to+\infty.$$
Gao [5] proved that for any $\lambda>0$,
$$\lim_{n\to\infty}\frac{na_n^d}{b_n^2}\,\ln\mathbb{P}\Big(\frac{na_n^d}{b_n}\,\big\|f_n-\mathbb{E}f_n\big\|_\infty>\lambda\Big)=-I(\lambda),$$
where
$$I(\lambda)=\frac{\lambda^2}{2\,\|f\|_\infty\,\|K\|_2^2}.$$
A pointwise MDP is also established in Gao [5]. A class of refinements of pointwise MDPs is called sharp moderate deviations. Sharp moderate deviations are also known as Cramér moderate deviations, and they have attracted a lot of interest; we refer to Cramér [6], Petrov [7], Beknazaryan et al. [8] and Fan et al. [9] for results of this type. In this paper, we are interested in establishing sharp moderate deviations for the kernel density estimator.
The paper is organized as follows. Our main result is stated and discussed in Section 2. The proof of our theorem is given in Section 3.

2. Main Results

The following assumptions will be used in this paper.
(A)
Assume that the kernel function $K$ satisfies
$$\int_{\mathbb{R}^d}K(x)\,dx=1\qquad\text{and}\qquad \|K\|_\infty<+\infty.$$
(B)
There exist a constant $\beta\in(0,1]$ and a non-negative integer $s$ such that for any $x,y,z\in\mathbb{R}^d$,
$$\big|(z\cdot\nabla)^sf(x)-(z\cdot\nabla)^sf(y)\big|\le A\,\|x-y\|_2^{\beta}\,\|z\|_2^{s},$$
where $A$ is a positive constant, $\|\cdot\|_2$ is the Euclidean norm and
$$z\cdot\nabla=z_1\frac{\partial}{\partial x_1}+z_2\frac{\partial}{\partial x_2}+\cdots+z_d\frac{\partial}{\partial x_d}.$$
(C)
Assume
$$\int_{\mathbb{R}^d}K^2(x)\,\|x\|_2^{s+\beta}\,dx<+\infty\qquad\text{and}\qquad \int_{\mathbb{R}^d}K(x)\prod_{i=1}^{d}x_i^{j_i}\,dx=0,\quad 0<\sum_{i=1}^{d}j_i\le s.$$
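As a quick numerical sanity check (ours, not the paper's), the standard Gaussian kernel on $\mathbb{R}$ (so $d=1$) satisfies (A) and (C) with $s=\beta=1$; the grid and tolerances below are our choices.

```python
import numpy as np

# Riemann-sum checks for the one-dimensional standard Gaussian kernel.
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
K = np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)

total = (K * dx).sum()                      # (A): the integral of K equals 1
first_moment = (K * x * dx).sum()           # (C): vanishing moment for j_1 = 1 <= s
weighted = (K ** 2 * np.abs(x) ** 2 * dx).sum()  # (C): finite with s + beta = 2
```

Here `weighted` approximates $\int_{\mathbb{R}}K^2(x)\,x^2\,dx=1/(4\sqrt{\pi})$, which is indeed finite.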
Remark 1.
Following [1], we recall that the previous assumptions are rather standard, and we restate them in the current multivariate setting:
1.
The first condition in Assumption (A) is necessary to ensure that the estimator remains a function with integral 1; the second one is not necessary, but no striking or useful unbounded kernel has been used in the frame of density estimation. Moreover, this condition makes it unnecessary to assume, for example, that $\int_{\mathbb{R}^d}K^2(x)\,dx<\infty$.
2.
Assumption (B) is a regularity condition on $f$ of order $s+\beta$, with $\beta>0$ and $s\in\mathbb{N}$ as considered before.
3.
The first condition in Assumption (C) is used to prove that the involved expressions are square-integrable. The second part of this condition is more delicate and ensures that the Taylor expansion up to order $s$ provides the relation
$$\int_{\mathbb{R}^d}K(z)\,f(x+za_n)\,dz=f(x)+O\big(a_n^{s+\beta}\big),$$
since all the intermediate terms simply vanish. It is important to also note that such kernels $K$ exist. A very simple and usual case is $s=\beta=1$ (second-order regularity), which holds in case $K$ is symmetric with respect to each of its coordinates; in this case, it is possible to choose $K(z)\ge0$, and then the estimator $f_n$ is still a density (since it is non-negative). For the general case $s\in\mathbb{N}$, a standard procedure to prove the existence of such kernels is to define $K(z)=P(z)\,\delta(z)$ for a fixed bounded density function $\delta$ and $d^{\circ}P=s$, where $d^{\circ}P$ is the degree of the polynomial $P$. Then, it is easy to prove that the system of equations in (C), together with the first part of (A), is linear and invertible, because the matrix with coefficients
$$a_{j,k}=\int_{\mathbb{R}^d}\delta(x)\prod_{i=1}^{d}x_i^{j_i+k_i}\,dx,\qquad 0<\sum_{i=1}^{d}j_i\le s,\quad 0<\sum_{i=1}^{d}k_i\le s,$$
is symmetric non-negative definite; this point is a straightforward extension of Lemma 3.3.1 in [10] to our multidimensional setting.
Assume that $f(x)>0$ for some $x\in\mathbb{R}^d$. Denote
$$D_n(x)=\frac{\sqrt{na_n^d}\,\big(f_n(x)-\mathbb{E}f_n(x)\big)}{\sqrt{f(x)\,\|K\|_2^2+\sum_{t=1}^{s}\dfrac{(-a_n)^t}{t!}\displaystyle\int_{\mathbb{R}^d}(z\cdot\nabla)^tf(x)\,K^2(z)\,dz}}\,.$$
We have the following pointwise sharp moderate deviations for the kernel density estimator.
Theorem 1.
Assume that Conditions (A)–(C) are satisfied and that $f(x)>0$ for some $x\in\mathbb{R}^d$. Then, it holds that
$$\ln\frac{\mathbb{P}\big(D_n(x)\ge t\big)}{1-\Phi(t)}=O\Big(\frac{1+t^3}{\sqrt{na_n^d}}+t^2\,a_n^{(s+\beta)\wedge d}\Big)$$
uniformly for $0\le t=o\big(\sqrt{na_n^d}\,\big)$ as $n\to\infty$. Moreover, the same equality remains valid when $\ln\frac{\mathbb{P}(D_n(x)\ge t)}{1-\Phi(t)}$ is replaced by $\ln\frac{\mathbb{P}(D_n(x)\le-t)}{1-\Phi(t)}$.
For the non-centered case, we have the following pointwise sharp moderate deviations for the kernel density estimator. Denote
$$\widehat{D}_n(x)=\frac{\sqrt{na_n^d}}{\|K\|_2\sqrt{f(x)}}\Big(f_n(x)-f(x)-\sum_{t=1}^{s}\frac{(-a_n)^t}{t!}\int_{\mathbb{R}^d}K(z)\,(z\cdot\nabla)^tf(x)\,dz\Big).$$
Theorem 2.
Assume that Conditions (A)–(C) are satisfied and that $f(x)>0$ for some $x\in\mathbb{R}^d$. Then, it holds that
$$\ln\frac{\mathbb{P}\big(\widehat{D}_n(x)\ge t\big)}{1-\Phi(t)}=O\Big(\frac{1+t^3}{\sqrt{na_n^d}}+t^2\,a_n^{(s+\beta)\wedge d}+(1+t)\sqrt{n}\,a_n^{s+\beta+d/2}\Big)$$
uniformly for $0\le t=o\big(\sqrt{na_n^d}\,\big)$ as $n\to\infty$. Moreover, the same equality remains valid when $\ln\frac{\mathbb{P}(\widehat{D}_n(x)\ge t)}{1-\Phi(t)}$ is replaced by $\ln\frac{\mathbb{P}(\widehat{D}_n(x)\le-t)}{1-\Phi(t)}$.
Remark 2.
Let us comment on Theorem 2.
1.
In the expression of $\widehat{D}_n$, recall that the expansion in Remark 1 entails that $\mathbb{E}f_n(x)-f(x)=O\big(a_n^{s+\beta}\big)$.
2.
This result makes it possible to provide a practitioner with precise confidence intervals that are easy to compute, as well as explicit asymptotic p-values for hypothesis testing. For instance, consider the following hypothesis test:
$$H_0:\ f(x_0)=t_0\qquad\text{versus}\qquad H_1:\ f(x_0)\ne t_0,$$
with $t_0>0$. Denote
$$z_0=\frac{\sqrt{na_n^d}}{\|K\|_2\sqrt{t_0}}\,\big|f_n(x_0)-t_0\big|.$$
Then, by Theorem 2, the p-value is asymptotically equal to $2\big(1-\Phi(z_0)\big)$, provided that $z_0$ satisfies
$$\frac{1+z_0^3}{\sqrt{na_n^d}}+z_0^2\,a_n^{(s+\beta)\wedge d}+(1+z_0)\sqrt{n}\,a_n^{s+\beta+d/2}\longrightarrow 0$$
as $n\to\infty$.
3.
Cases of other non-parametric estimators, such as the Nadaraya–Watson kernel regression estimator (cf. El Machkouri et al. [11], for instance), non-linear regression estimates or conditional expectations for prediction issues, estimates of derivatives, or even quantile regression estimators (see Rosenblatt [1]), will be treated in subsequent papers.
4.
Even if a non-independent version of this result is accessible, we prefer to give a simple result in the current i.i.d. case.
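To make the testing recipe of Remark 2 concrete, here is a small Python sketch (our illustration, with a Gaussian kernel and $d=1$; the function names are ours, and under $H_0$ the unknown $f(x_0)$ is replaced by $t_0$):

```python
import numpy as np
from math import erf, sqrt, pi

def Phi(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def density_value_test(X, x0, t0, a):
    """Two-sided asymptotic test of H0: f(x0) = t0, following Remark 2."""
    n = X.shape[0]
    K = np.exp(-0.5 * ((x0 - X) / a) ** 2) / sqrt(2.0 * pi)
    fn = K.sum() / (n * a)                   # kernel density estimate at x0
    K2 = (4.0 * pi) ** (-0.25)               # ||K||_2 for the Gaussian kernel
    z0 = sqrt(n * a) * abs(fn - t0) / (K2 * sqrt(t0))
    return fn, 2.0 * (1.0 - Phi(z0))         # estimate and asymptotic p-value

rng = np.random.default_rng(1)
X = rng.normal(size=5000)                    # H0 is true at x0 = 0 below
fn, p = density_value_test(X, x0=0.0, t0=1.0 / sqrt(2.0 * pi), a=5000 ** (-1 / 3))
```

Since the sample is drawn under $H_0$, the p-value should typically not be small, although any value in $[0,1]$ can occur for a single sample.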
By Theorem 2, we have the following Berry–Esseen bound for $\widehat{D}_n(x)$:
$$\sup_{t\in\mathbb{R}}\Big|\mathbb{P}\big(\widehat{D}_n(x)\le t\big)-\Phi(t)\Big|=O\Big(\frac{1}{\sqrt{na_n^d}}+a_n^{(s+\beta)\wedge d}+\sqrt{n}\,a_n^{s+\beta+d/2}\Big).$$
In particular, by taking $a_n=n^{-1/(s+\beta+d)}$, we obtain
$$\sup_{t\in\mathbb{R}}\Big|\mathbb{P}\big(\widehat{D}_n(x)\le t\big)-\Phi(t)\Big|=O\big(n^{-(s+\beta)/(2s+2\beta+2d)}\big).$$
Moreover, if $s=0$ and $\beta=d=1$, i.e., $f$ is $1$-Hölder-continuous, then it holds that
$$\sup_{t\in\mathbb{R}}\Big|\mathbb{P}\Big(\frac{n^{1/4}}{\|K\|_2\sqrt{f(x)}}\big(f_n(x)-f(x)\big)\le t\Big)-\Phi(t)\Big|=O\big(n^{-1/4}\big).$$
Conclusions. When $f\in C^1(\mathbb{R}^d)$ and $K(z)$ is symmetric with respect to $0$, which implies that $\int_{\mathbb{R}^d}z_iK^2(z)\,dz=0$ for all $1\le i\le d$, by taking $s=1$ in Assumption (C) we have
$$\int_{\mathbb{R}^d}K(z)\,(z\cdot\nabla)f(x)\,dz=0,$$
which implies
$$D_n(x)=\frac{\sqrt{na_n^d}}{\|K\|_2\sqrt{f(x)}}\big(f_n(x)-\mathbb{E}f_n(x)\big)\qquad\text{and}\qquad \widehat{D}_n(x)=\frac{\sqrt{na_n^d}}{\|K\|_2\sqrt{f(x)}}\big(f_n(x)-f(x)\big).$$
Then, Theorems 1 and 2 hold with $s=1$. Theorems 1 and 2 provide moderate deviations for the statistics $D_n$ and $\widehat{D}_n$, which are related through the bias of $f_n$; Remarks 1 and 2 provide a detailed description of the calculation of this bias, which is essential here.
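The Conclusions can be illustrated by a short Monte Carlo experiment (ours, under assumed standard normal data, a Gaussian kernel and $d=1$): the empirical upper tail of $\widehat{D}_n(x)$ should be close to $1-\Phi(t)$.

```python
import numpy as np
from math import sqrt, pi

rng = np.random.default_rng(2)
n, reps = 2000, 1500
a = n ** (-1 / 3)                          # bandwidth a_n
x, f_x = 0.0, 1.0 / sqrt(2.0 * pi)         # f(0) for the N(0, 1) density
K2 = (4.0 * pi) ** (-0.25)                 # ||K||_2 for the Gaussian kernel

X = rng.normal(size=(reps, n))             # reps independent samples of size n
Kv = np.exp(-0.5 * ((x - X) / a) ** 2) / sqrt(2.0 * pi)
fn = Kv.mean(axis=1) / a                   # one KDE value f_n(0) per sample
D = sqrt(n * a) * (fn - f_x) / (K2 * sqrt(f_x))

frac = float((D >= 1.0).mean())            # compare with 1 - Phi(1) ~ 0.159
```

With these sample sizes the empirical tail frequency `frac` agrees with $1-\Phi(1)$ up to Monte Carlo noise and the (small) bias of $f_n$.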

3. Proof of Theorem 1

For $n\ge1$, let $\{Y_i,\ 1\le i\le n\}$ be i.i.d. centered random variables. Denote $\sigma^2=\mathbb{E}Y_1^2$ and $T_n=\sum_{i=1}^{n}Y_i$. Assume that $\sigma>0$. Fan et al. [12] (see also Cramér [6]) established the following asymptotic expansion of the tail probabilities of moderate deviations for $T_n$.
Lemma 1.
Assume that there exists a sequence of positive numbers $\alpha_n$ such that for all $1\le i\le n$,
$$\mathbb{E}|Y_i|^{k}\le\frac12\,k!\,\Big(\frac{1}{\alpha_n}\Big)^{k-2}\,\mathbb{E}Y_i^2,\qquad k\ge2.$$
Then,
$$\ln\frac{\mathbb{P}\big(T_n\ge t\sigma\sqrt{n}\big)}{1-\Phi(t)}=O\Big(\frac{1+t^3}{\sigma\alpha_n\sqrt{n}}\Big)\qquad\text{as } n\to\infty$$
holds uniformly for $0\le t=o\big(\sigma\alpha_n\sqrt{n}\big)$.
Proof. 
Lemma 1 is a simple consequence of Fan et al. [12]. □
With the preliminary lemma above, we are in a position to begin the proof of Theorem 1. It is easy to see that
$$\sqrt{na_n^d}\,\big(f_n(x)-\mathbb{E}f_n(x)\big)=\frac{1}{\sqrt{na_n^d}}\sum_{i=1}^{n}\Big[K\Big(\frac{x-X_i}{a_n}\Big)-\mathbb{E}K\Big(\frac{x-X_i}{a_n}\Big)\Big].$$
In the sequel, we give an estimate for the right-hand side of the last equality. Notice that
$$\frac{1}{\sqrt{na_n^d}}\sum_{i=1}^{n}\Big[K\Big(\frac{x-X_i}{a_n}\Big)-\mathbb{E}K\Big(\frac{x-X_i}{a_n}\Big)\Big]=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{1}{\sqrt{a_n^d}}\Big[K\Big(\frac{x-X_i}{a_n}\Big)-\mathbb{E}K\Big(\frac{x-X_i}{a_n}\Big)\Big].$$
Denote
$$Y_i=\frac{1}{\sqrt{a_n^d}}\Big[K\Big(\frac{x-X_i}{a_n}\Big)-\mathbb{E}K\Big(\frac{x-X_i}{a_n}\Big)\Big],\qquad 1\le i\le n.$$
We can prove that the $Y_i$ satisfy the Bernstein-type condition of Lemma 1. Indeed, since $\big|K\big(\frac{x-X_i}{a_n}\big)-\mathbb{E}K\big(\frac{x-X_i}{a_n}\big)\big|\le2\|K\|_\infty$, we can deduce that for all $k\ge2$,
$$\begin{aligned}\mathbb{E}|Y_i|^{k}&\le\Big(\frac{1}{\sqrt{a_n^d}}\Big)^{k}\,\mathbb{E}\Big|K\Big(\frac{x-X_i}{a_n}\Big)-\mathbb{E}K\Big(\frac{x-X_i}{a_n}\Big)\Big|^{k}\\
&\le\Big(\frac{1}{\sqrt{a_n^d}}\Big)^{k}\big(2\|K\|_\infty\big)^{k-2}\,\mathbb{E}\Big|K\Big(\frac{x-X_i}{a_n}\Big)-\mathbb{E}K\Big(\frac{x-X_i}{a_n}\Big)\Big|^{2}\\
&=\Big(\frac{2\|K\|_\infty}{\sqrt{a_n^d}}\Big)^{k-2}\mathbb{E}Y_i^2\ \le\ \frac12\,k!\,\Big(\frac{2\|K\|_\infty}{\sqrt{a_n^d}}\Big)^{k-2}\mathbb{E}Y_i^2,\end{aligned}$$
that is, the condition of Lemma 1 holds with $\alpha_n^{-1}=2\|K\|_\infty/\sqrt{a_n^d}$.
For the variance of $Y_i$, we have the following estimate:
$$\mathrm{Var}(Y_i)=\frac{1}{a_n^d}\,\mathrm{Var}\Big(K\Big(\frac{x-X_i}{a_n}\Big)\Big)=\frac{1}{a_n^d}\Big[\mathbb{E}K^2\Big(\frac{x-X_i}{a_n}\Big)-\Big(\mathbb{E}K\Big(\frac{x-X_i}{a_n}\Big)\Big)^2\Big].$$
It is easy to see that
$$\mathbb{E}K^2\Big(\frac{x-X_i}{a_n}\Big)=a_n^d\int_{\mathbb{R}^d}K^2(z)\,f(x-za_n)\,dz=a_n^d\Big[\int_{\mathbb{R}^d}K^2(z)\,f(x)\,dz+\int_{\mathbb{R}^d}K^2(z)\big(f(x-za_n)-f(x)\big)\,dz\Big].$$
By Assumption (B), it is easy to see that
$$\Big|f(x-za_n)-f(x)-\sum_{t=1}^{s}\frac{(-a_n)^t}{t!}(z\cdot\nabla)^tf(x)\Big|=\frac{a_n^s}{s!}\Big|(z\cdot\nabla)^sf(x+\theta za_n)-(z\cdot\nabla)^sf(x)\Big|\le C_dA\,a_n^{s+\beta}\,\|z\|_2^{s+\beta},$$
where $|\theta|\le1$ and $A$ is given by Assumption (B). Again by Condition (C), we can deduce that
$$\begin{aligned}\mathbb{E}K^2\Big(\frac{x-X_i}{a_n}\Big)&=a_n^d\Big[\int_{\mathbb{R}^d}K^2(z)\,f(x)\,dz+\int_{\mathbb{R}^d}K^2(z)\big(f(x-za_n)-f(x)\big)\,dz\Big]\\
&=a_n^d\,f(x)\int_{\mathbb{R}^d}K^2(z)\,dz+\sum_{t=1}^{s}\frac{(-1)^ta_n^{d+t}}{t!}\int_{\mathbb{R}^d}(z\cdot\nabla)^tf(x)\,K^2(z)\,dz+O(1)\,a_n^{d+s+\beta}\int_{\mathbb{R}^d}K^2(z)\,\|z\|_2^{s+\beta}\,dz\\
&=a_n^d\,f(x)\int_{\mathbb{R}^d}K^2(z)\,dz+\sum_{t=1}^{s}\frac{(-1)^ta_n^{d+t}}{t!}\int_{\mathbb{R}^d}(z\cdot\nabla)^tf(x)\,K^2(z)\,dz+O\big(a_n^{d+s+\beta}\big).\end{aligned}$$
By Condition (B), it is easy to see that
$$\mathbb{E}K\Big(\frac{x-X_i}{a_n}\Big)=a_n^d\Big[\int_{\mathbb{R}^d}K(z)\,f(x)\,dz+\int_{\mathbb{R}^d}K(z)\big(f(x-za_n)-f(x)\big)\,dz\Big]=a_n^d\,f(x)\int_{\mathbb{R}^d}K(z)\,dz+o(a_n^d)=a_n^d\,f(x)+o(a_n^d).$$
From the last two expansions, we have
$$\begin{aligned}\mathrm{Var}(Y_i)&=\frac{1}{a_n^d}\Big(a_n^d\,f(x)\int_{\mathbb{R}^d}K^2(z)\,dz+\sum_{t=1}^{s}\frac{(-1)^ta_n^{d+t}}{t!}\int_{\mathbb{R}^d}(z\cdot\nabla)^tf(x)\,K^2(z)\,dz+O\big(a_n^{d+s+\beta}\big)-\big(a_n^d\,f(x)+o(a_n^d)\big)^2\Big)\\
&=f(x)\int_{\mathbb{R}^d}K^2(z)\,dz+\sum_{t=1}^{s}\frac{(-a_n)^t}{t!}\int_{\mathbb{R}^d}(z\cdot\nabla)^tf(x)\,K^2(z)\,dz+O\big(a_n^{(s+\beta)\wedge d}\big).\end{aligned}$$
When $f(x)>0$, we obtain
$$\gamma_n:=\frac{2\|K\|_\infty/\sqrt{a_n^d}}{\sqrt{n\,\mathrm{Var}(Y_1)}}=O\Big(\frac{1}{\sqrt{na_n^d}}\Big)$$
and
$$\frac{\mathrm{Var}(Y_i)}{f(x)\,\|K\|_2^2+\sum_{t=1}^{s}\frac{(-a_n)^t}{t!}\int_{\mathbb{R}^d}(z\cdot\nabla)^tf(x)\,K^2(z)\,dz}=1+O\big(a_n^{(s+\beta)\wedge d}\big).$$
Therefore, by Lemma 1, we can deduce that for all $0\le t=o\big(\sqrt{na_n^d}\,\big)$,
$$\mathbb{P}\Big(\sqrt{na_n^d}\,\big(f_n(x)-\mathbb{E}f_n(x)\big)\ge t\sqrt{\mathrm{Var}(Y_1)}\Big)=\mathbb{P}\Big(\frac{1}{\sqrt{n\,\mathrm{Var}(Y_1)}}\sum_{i=1}^{n}Y_i\ge t\Big)=\big(1-\Phi(t)\big)\exp\Big\{O(1)\,\frac{1+t^3}{\sqrt{na_n^d}}\Big\}.$$
Applying the variance ratio estimate above to the last equality, we deduce that for all $0\le t=o\big(\sqrt{na_n^d}\,\big)$,
$$\begin{aligned}\mathbb{P}\big(D_n(x)\ge t\big)&=\mathbb{P}\bigg(\sqrt{na_n^d}\,\big(f_n(x)-\mathbb{E}f_n(x)\big)\ge t\sqrt{f(x)\,\|K\|_2^2+\sum_{t'=1}^{s}\frac{(-a_n)^{t'}}{t'!}\int_{\mathbb{R}^d}(z\cdot\nabla)^{t'}f(x)\,K^2(z)\,dz}\ \bigg)\\
&=\mathbb{P}\Big(\sqrt{na_n^d}\,\big(f_n(x)-\mathbb{E}f_n(x)\big)\ge t\sqrt{\mathrm{Var}(Y_1)}\,\big(1+O(a_n^{(s+\beta)\wedge d})\big)\Big)\\
&=\Big(1-\Phi\big(t\,\big(1+O(a_n^{(s+\beta)\wedge d})\big)\big)\Big)\exp\Big\{O(1)\,\frac{1+t^3}{\sqrt{na_n^d}}\Big\}.\end{aligned}$$
Because of
$$\frac{1}{\sqrt{2\pi}\,(1+\lambda)}\,e^{-\lambda^2/2}\le1-\Phi(\lambda),\qquad\lambda\ge0,$$
it is easy to see that for all $0\le\lambda\le x$,
$$1\le\frac{\int_{\lambda}^{\infty}e^{-t^2/2}\,dt}{\int_{x}^{\infty}e^{-t^2/2}\,dt}\le1+\frac{\int_{\lambda}^{x}e^{-t^2/2}\,dt}{\int_{x}^{\infty}e^{-t^2/2}\,dt}\le1+c_1\,x(x-\lambda)\,e^{(x^2-\lambda^2)/2}\le\exp\big\{c_2\,x|x-\lambda|\big\}.$$
Hence, we obtain for any $\lambda,x\ge0$,
$$1-\Phi(\lambda)=\big(1-\Phi(x)\big)\exp\big\{O(1)\,(x+\lambda)|x-\lambda|\big\}.$$
By the last equality, it follows that
$$1-\Phi\big(x\big(1+O(a_n^{(s+\beta)\wedge d})\big)\big)=\big(1-\Phi(x)\big)\exp\big\{O(1)\,x^2a_n^{(s+\beta)\wedge d}\big\}.$$
Therefore, we have, for all $0\le t=o\big(\sqrt{na_n^d}\,\big)$,
$$\mathbb{P}\big(D_n(x)\ge t\big)=\big(1-\Phi(t)\big)\exp\Big\{O(1)\Big(\frac{1+t^3}{\sqrt{na_n^d}}+t^2a_n^{(s+\beta)\wedge d}\Big)\Big\}.$$
This completes the proof of Theorem 1.

4. Proof of Theorem 2

It is easy to see that
$$\mathbb{E}f_n(x)-f(x)=\frac{1}{na_n^d}\sum_{i=1}^{n}\int_{\mathbb{R}^d}K\Big(\frac{x-t}{a_n}\Big)f(t)\,dt-f(x)=\int_{\mathbb{R}^d}K(z)\,f(x-za_n)\,dz-f(x)=\int_{\mathbb{R}^d}K(z)\big[f(x-za_n)-f(x)\big]\,dz.$$
By the Hölder-type Taylor estimate used in the proof of Theorem 1, we deduce that
$$\mathbb{E}f_n(x)-f(x)=\sum_{t=1}^{s}\frac{(-a_n)^t}{t!}\int_{\mathbb{R}^d}K(z)\,(z\cdot\nabla)^tf(x)\,dz+O(1)\,C_dA\,a_n^{s+\beta}\int_{\mathbb{R}^d}\big|K(z)\big|\,\|z\|_2^{s+\beta}\,dz=\sum_{t=1}^{s}\frac{(-a_n)^t}{t!}\int_{\mathbb{R}^d}K(z)\,(z\cdot\nabla)^tf(x)\,dz+O\big(a_n^{s+\beta}\big).$$
Applying the last expansion to the conclusion of Theorem 1, we obtain, for all $0\le t=o\big(\sqrt{na_n^d}\,\big)$,
$$\mathbb{P}\big(\widehat{D}_n(x)\ge t\big)=\mathbb{P}\bigg(D_n(x)\ge t-\frac{\sqrt{na_n^d}\,O(a_n^{s+\beta})}{\|K\|_2\sqrt{f(x)}}\bigg)=\big(1-\Phi(t)\big)\exp\Big\{O(1)\Big(\frac{1+t^3}{\sqrt{na_n^d}}+t^2a_n^{(s+\beta)\wedge d}+(1+t)\sqrt{n}\,a_n^{s+\beta+d/2}\Big)\Big\}.$$
This completes the proof of Theorem 2.
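The bias expansion used in this proof can be checked numerically (our sketch; deterministic quadrature with a Gaussian kernel and a smooth density, so the bias is $O(a_n^2)$, corresponding to $s=\beta=1$): halving the bandwidth should divide the bias by about four.

```python
import numpy as np
from math import sqrt, pi

# E f_n(x) - f(x) = int K(z) [f(x - z a) - f(x)] dz, computed by Riemann sums.
z = np.linspace(-10.0, 10.0, 400001)
dz = z[1] - z[0]
K = np.exp(-0.5 * z ** 2) / sqrt(2.0 * pi)            # Gaussian kernel
f = lambda u: np.exp(-0.5 * u ** 2) / sqrt(2.0 * pi)  # N(0, 1) density
x = 0.5

def bias(a):
    """Exact (quadrature) bias of the kernel density estimator at x."""
    return float((K * (f(x - a * z) - f(x)) * dz).sum())

ratio = bias(0.2) / bias(0.1)              # expected to be close to 2^2 = 4
```

The leading term of the bias here is $\tfrac{a^2}{2}f''(x)\int K(z)z^2\,dz$, so the ratio deviates from 4 only through the $O(a^4)$ remainder.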

Author Contributions

Writing—original draft preparation, X.F., P.D., S.L. and H.H.; supervision, X.F.; funding acquisition, S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded in the framework of the chair FIME (https://fime-lab.org/) and of CY-AS ("Investissements d'Avenir" ANR-16-IDEX-0008), project "EcoDep" PSI-AAP2020-0000000013.

Data Availability Statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

Acknowledgments

The authors would like to thank the anonymous referees for their valuable comments and remarks.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Rosenblatt, M. Stochastic Curve Estimation; NSF-CBMS Regional Conference Series in Probability and Statistics; Institute of Mathematical Statistics: Waite Hill, OH, USA, 1991; Volume 3. [Google Scholar]
  2. Tran, D.; Ciret, P.; Ciutat, A.; Durrieu, G.; Massabuau, J.-C. Estimation of potential and limits of bivalve closure response to detect contaminants: Application to cadmium. Environ. Toxicol. Chem. 2003, 22, 914–920. [Google Scholar]
  3. de la Peña, V.H.; Lai, T.L.; Shao, Q.M. Self-Normalized Processes: Limit Theory and Statistical Applications (Probability and Its Applications); Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
  4. Joutard, C. Sharp large deviations in nonparametric estimation. J. Nonparam. Stat. 2006, 18, 293–306. [Google Scholar] [CrossRef]
  5. Gao, F. Moderate deviations and large deviations for kernel density estimators. J. Theoret. Probab. 2003, 16, 401–418. [Google Scholar] [CrossRef]
  6. Cramér, H. Sur un nouveau théorème-limite de la théorie des probabilités. Actual. Sci. Indust. 1938, 736, 5–23. [Google Scholar]
  7. Petrov, V.V. Sums of Independent Random Variables; Springer: Berlin/Heidelberg, Germany, 1975. [Google Scholar]
  8. Beknazaryan, A.; Sang, H.; Xiao, Y. Cramér type moderate deviations for random fields. J. Appl. Probab. 2019, 56, 223–245. [Google Scholar] [CrossRef]
  9. Fan, X.; Hu, H.; Xu, L. Cramér-type moderate deviations for Euler-Maruyama scheme for SDE. Sci. China Math. 2024, 67, 1865–1880. [Google Scholar] [CrossRef]
  10. Doukhan, P. Stochastic Models for Time Series; Series Mathématiques et Applications; Springer: Berlin/Heidelberg, Germany, 2018; Volume 80. [Google Scholar]
  11. El Machkouri, M.; Fan, X.; Reding, L. On the Nadaraya-Watson kernel regression estimator for irregularly spaced spatial data. J. Statist. Plann. Infer. 2020, 205, 92–114. [Google Scholar] [CrossRef]
  12. Fan, X.; Grama, I.; Liu, Q. Cramér large deviation expansions for martingales under Bernstein’s condition. Stoch. Process. Appl. 2013, 123, 3919–3942. [Google Scholar] [CrossRef]

Cite as: Liu, S.; Fan, X.; Hu, H.; Doukhan, P. Pointwise Sharp Moderate Deviations for a Kernel Density Estimator. Mathematics 2024, 12, 3161. https://doi.org/10.3390/math12203161
