Abstract
This paper deals with a class of optimal control problems arising in advertising models in which the product goodwill is represented by a Volterra Ornstein-Uhlenbeck process. Such a choice of model can be regarded as a stochastic modification of the classical Nerlove-Arrow model that incorporates both the presence of uncertainty and empirically observed memory effects such as carryover or distributed forgetting. We present an approach to solving such optimal control problems based on an infinite-dimensional lift, which allows us to recover Markov properties by formulating an equivalent optimization problem in a Hilbert space. This technique, however, requires the Volterra kernel of the forward equation to have a representation of a particular form that may be challenging to obtain in practice. We overcome this issue for Hölder continuous kernels by approximating them with Bernstein polynomials, which turn out to enjoy a simple representation of the required type. We then solve the optimal control problem for the forward process with the approximated kernel in place of the original one and study convergence. The approach is illustrated with simulations.
1 Introduction
The problem of optimizing advertising strategies has always been of paramount importance in the field of marketing. Starting from the pioneering works of Vidale and Wolfe (1957) and Nerlove and Arrow (1962), this topic has evolved into a full-fledged field of research and modeling. Since it is impossible to describe all existing classical approaches and results here, we refer the reader to the review article of Sethi (1977) (which analyzes the literature prior to 1975), the more recent paper by Feichtinger et al. (1994) (covering results up to 1994) and the references therein.
It is worth noting that the Nerlove–Arrow approach, which was the foundation for numerous modern dynamic advertising models, assumed no time lag between spending on advertising and the impact of the latter on the goodwill stock. However, many empirical studies (see, for example, Leone 1995) clearly indicate some kind of a “memory” phenomenon that is often called the “distributed lag” or “carryover” effect: the influence of advertising does not have an immediate impact but is rather spread over a period of time varying from several weeks to several months. This shortcoming of the basic Nerlove–Arrow model gave rise to many modifications of the latter aimed at modeling distributed lags. For a long time, nevertheless, the vast majority of dynamic advertising models with distributed lags had been formulated in a deterministic framework (see e.g. Sethi 1977, §2.6 and Feichtinger et al. 1994, Section 2.3).
In recent years, however, there have been several landmark papers that consider Nerlove-Arrow-type models with memory in a stochastic setting. Here, we refer primarily to the series of papers (Gozzi and Marinelli 2005; Gozzi et al. 2009) (see also the more recent work Li and Zhen 2020), where the goodwill stock is modeled via a Brownian linear diffusion with delay of the form
where \(X^u\) is interpreted as the product’s goodwill stock and u is the spending on advertising. The corresponding optimal control problem in this case was solved using the so-called lift approach: equation (1.1) was rewritten as a stochastic differential equation (without delay) in a suitable Hilbert space, and then infinite-dimensional optimization techniques (either dynamic programming principle or maximum principle) were applied.
In this article, we present an alternative stochastic model that also takes the carryover effect into account. Instead of the delay approach described above, we incorporate the memory into the model by means of the Volterra kernel \(K \in L^2([0,T])\) and consider the controlled Volterra Ornstein-Uhlenbeck process of the form
where \(\alpha ,\beta ,\sigma > 0\) and \(X(0) \in {\mathbb {R}}\) are constants (see e.g. Abi Jaber et al. 2019, Section 5 for more details on affine Volterra processes of this type). Note that such goodwill dynamics can be regarded as a combination of the deterministic lag models described in Feichtinger et al. (1994, Section 2.3) and the stochastic Ornstein-Uhlenbeck-based model presented by Rao (1986). The main difference from (1.1) is that the memory is incorporated into the noise as well as the drift, since the stochastic environment (represented by the noise) tends to form “clusters” over time. Indeed, in reality positive increments are likely to be followed by positive increments (if conditions are favourable for the goodwill during some period of time) and negative increments tend to follow negative increments (under unfavourable conditions). This behaviour of the noise cannot be reflected by a standard Brownian driver but can easily be incorporated into the model (1.2).
Our goal is to solve an optimization problem of the form
where \(a_1,a_2 > 0\) are given constants. The set of admissible controls for the problem (1.3), denoted by \(L^2_a:= L^2_a(\Omega \times [0,T])\), is the space of square integrable real-valued stochastic processes adapted to the filtration generated by W. Note that the process \(X^u\) is well defined for any \(u\in L^2_a\) since, for almost all \(\omega \in \Omega \), Eq. (1.2) treated pathwise can be regarded as a deterministic linear Volterra integral equation of the second kind, which has a unique solution (see e.g. Tricomi 1985).
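For the numerical sketches appearing later in the paper, the controlled dynamics and the performance functional are assumed to take the linear-quadratic form
$$\begin{aligned} X^u(t) = X(0) + \int _0^t K(t-s)\left( \alpha u(s) - \beta X^u(s)\right) ds + \sigma \int _0^t K(t-s) dW(s), \quad t\in [0,T], \end{aligned}$$
$$\begin{aligned} J(u) = {\mathbb {E}}\left[ a_2 X^u(T) - a_1 \int _0^T u^2(t) dt\right] \rightarrow \max _{u\in L^2_a}; \end{aligned}$$
this specification is an assumption consistent with the description of (1.2)-(1.3) above and is stated here only for convenience.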
The optimization problem (1.3) for underlying Volterra dynamics has been studied by several authors (see, e.g. Agram and Øksendal 2015; Yong 2006 and the bibliography therein). Contrary to most of the works in our bibliography, we will not solve this problem by means of a maximum principle approach. Even though this method yields necessary and sufficient conditions for the optimal control of (1.3), we cannot apply it directly since we deal with low regularity conditions on the coefficients of our drift and volatility. Furthermore, this method has another notable drawback in practice: its application often requires the computation of conditional expectations, which is substantially challenging due to the absence of Markovianity. Another possible way to solve the optimal control problem (1.3) is to obtain an explicit solution of the forward equation (1.2), plug it into the performance functional and solve the resulting maximization problem using differential calculus in Hilbert spaces. However, even though this method seems appealing, obtaining the required explicit representation of \(X^u\) in terms of u might be tedious and burdensome. Instead, we will use the approach introduced in Abi Jaber et al. (2021) and Di Nunno and Giordano (2022), which is in the same spirit as the one in Gozzi and Marinelli (2005), Gozzi et al. (2009) and Li and Zhen (2020) mentioned above: we rewrite the original forward stochastic Volterra integral equation as a stochastic differential equation in a suitable Hilbert space and then apply standard optimization techniques in infinite dimensions (see e.g. Fabbri et al. 2017; Fuhrman and Tessitore 2002). Moreover, the shape of the corresponding infinite-dimensional Hamilton-Jacobi-Bellman equation allows us to obtain an explicit solution to the latter by exploiting the “splitting” method from Gozzi et al. (2009, Section 3.3).
We notice that, while the optimization problem (1.3) is closely related to the one presented in Abi Jaber et al. (2021), there are several important differences in comparison to our work. In particular, Abi Jaber et al. (2021) requires the kernel to have the form
where \(\mu \) is a signed measure such that \(\int _{{\mathbb {R}}_+} (1 \wedge \theta ^{-1/2}) |\mu |(d\theta ) < \infty \). Although there are some prominent examples of such kernels, not all kernels K are of this type; furthermore, even if a particular K admits such a representation in theory, it may not be easy to find the explicit shape of \(\mu \). In contrast, our approach works for all Hölder continuous kernels without any restrictions on their shape and allows us to obtain explicit approximations \({\hat{u}}_n\) of the optimal control \({\hat{u}}\). The lift procedure presented here is also different from the one used in Abi Jaber et al. (2021) (although both are specific cases of the technique presented in Cuchiero and Teichmann 2020).
The lift used in the present paper was introduced in Cuchiero and Teichmann (2020) and then generalized in Cuchiero and Teichmann (2019) to the multi-dimensional case, but the approach itself can be traced back to Carmona and Coutin (1998). It should also be emphasised that this method has its own limitations: in order to perform the lift, the kernel K is required to have a specific representation of the form \(K(t) = \langle g, e^{t{\mathcal {A}}} \nu \rangle _{{\mathbb {H}}}\), \(t\in [0,T]\), where g and \(\nu \) are elements of some Hilbert space \({\mathbb {H}}\) and \(\{e^{t{\mathcal {A}}},~t\in [0,T]\}\) is a uniformly continuous semigroup acting on \({\mathbb {H}}\) with \({\mathcal {A}}\in {\mathcal {L}}({\mathbb {H}})\); in general, it may be hard to find feasible \({\mathbb {H}}\), g, \(\nu \) and \({\mathcal {A}}\). Here, we work with Hölder continuous kernels K and overcome this issue by approximating the kernel with Bernstein polynomials (which turn out to enjoy a simple representation of the required type). We then solve the optimal control problem for the forward process with the approximated kernel instead of the original one and study convergence.
The paper is organised as follows. In Sect. 2, we present our approach in the case of a liftable K (i.e. K having a representation in terms of \({\mathbb {H}}\), g, \(\nu \) and \({\mathcal {A}}\) as mentioned above). Namely, we describe the lift procedure, give the necessary results from stochastic optimal control theory in Hilbert spaces and derive an explicit representation of the optimal control \({\hat{u}}\) by solving the associated Hamilton-Jacobi-Bellman equation. In Sect. 3, we introduce a liftable approximation for general Hölder continuous kernels, give convergence results for the solution to the approximated problem and discuss some numerical aspects of the latter. In Sect. 4, we illustrate the application of our technique with examples and simulations.
2 Solution via Hilbert space-valued lift
2.1 Preliminaries
Let us begin with some simple results on the optimization problem (1.3). Namely, we notice that \(X^u\) and the optimization problem (1.3) are well defined for any \(u\in L^2_a\).
Theorem 1
Let \(K\in L^2([0,T])\). Then, for any \(u \in L^2_a\),
1) the forward Volterra Ornstein-Uhlenbeck-type equation (1.2) has a unique solution;
2) there exists a constant \(C>0\) such that
$$\begin{aligned} \sup _{t\in [0,T]} {\mathbb {E}}[|X^u(t)|^2] \le C(1+ \Vert u\Vert ^2_2), \end{aligned}$$
where \(\Vert \cdot \Vert _2\) denotes the standard \(L^2(\Omega \times [0,T])\) norm;
3) \(|J(u)| < \infty \).
Proof
Item 1) is evident since, for almost all \(\omega \in \Omega \), Eq. (1.2) treated pathwise can be regarded as a deterministic linear Volterra integral equation of the second kind, which has a unique solution (see e.g. Tricomi 1985). Next, it is straightforward to deduce that
Now, item 2) follows from Gronwall’s inequality. Finally, \(\mathbb {E}[X^u(t)]\) satisfies the deterministic Volterra equation of the form
and hence can be represented in the form
where \(R_\beta \) is the resolvent of the corresponding Volterra integral equation and the operator \({\mathcal {L}}\) is linear and continuous. Hence J(u) can be rewritten as
which immediately implies that \(|J(u)| < \infty \). \(\square \)
2.2 Construction of Markovian lift and formulation of the lifted problem
As anticipated above, in order to solve the optimization problem (1.3) we will rewrite \(X^u\) in terms of a Markovian Hilbert space-valued process \({\mathcal {Z}}^u\) using the lift presented in Cuchiero and Teichmann (2020) and then apply the dynamic programming principle in Hilbert spaces. We start with a description of the core idea behind Markovian lifts in the case of liftable kernels.
Definition 1
Let \({\mathbb {H}}\) denote a separable Hilbert space with the scalar product \(\langle \cdot , \cdot \rangle \). A kernel \(K\in L^2([0,T])\) is called \({\mathbb {H}}\)-liftable if there exist \(\nu , g\in {\mathbb {H}}\), \(\Vert \nu \Vert _{{\mathbb {H}}} = 1\), and a uniformly continuous semigroup \(\{e^{t{\mathcal {A}}},~t\in [0,T]\}\) acting on \({\mathbb {H}}\), \({\mathcal {A}}\in \mathcal L({\mathbb {H}})\), such that
For examples of liftable kernels, we refer to Sect. 4 and to Cuchiero and Teichmann (2020).
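As a toy illustration of Definition 1 (not one of the examples of Sect. 4), the sketch below verifies numerically that the gamma-type kernel \(K(t) = (1+2t)e^{-t}\) is liftable with \({\mathbb {H}} = {\mathbb {R}}^2\); the kernel, the matrices and all numerical choices are ours.

```python
import numpy as np
from scipy.linalg import expm

# Toy instance of Definition 1 (illustration only, not taken from the paper):
# H = R^2, e^{tA} = exp(-lam*t) * [[1, t], [0, 1]], nu = (0, 1) with ||nu|| = 1,
# so <g, e^{tA} nu> = (g[1] + g[0]*t) * exp(-lam*t), a gamma-type kernel.
lam = 1.0
A = np.array([[-lam, 1.0], [0.0, -lam]])
nu = np.array([0.0, 1.0])
g = np.array([2.0, 1.0])

K = lambda t: (1.0 + 2.0 * t) * np.exp(-t)           # the kernel being lifted
for t in (0.0, 0.5, 1.0, 2.0):
    assert abs(g @ expm(t * A) @ nu - K(t)) < 1e-12  # checks K(t) = <g, e^{tA} nu>
```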
Consider a controlled Volterra Ornstein-Uhlenbeck process of the form (1.2) with a liftable kernel \(K(t) = \langle g, e^{t{\mathcal {A}}}\nu \rangle \), \(\Vert \nu \Vert _{{\mathbb {H}}} = 1\), and denote \(\zeta _0:=\frac{X(0)}{\Vert g\Vert _{{\mathbb {H}}}^2}g\) and
Using the fact that \(X(0)=\langle g, \zeta _0\rangle \), we can now rewrite (1.2) as follows:
where \({\widetilde{{\mathcal {Z}}}}^u_t:=\zeta _0+\int _0^t e^{(t-s){\mathcal {A}}}\nu dV^u(s)\). It is easy to check that \({\widetilde{{\mathcal {Z}}}}^u\) is the unique solution of the infinite dimensional SDE
and thus the process \(\{{\mathcal {Z}}_t^u, t\in [0,T]\}\) defined as \({\mathcal {Z}}^u_t:= {\widetilde{{\mathcal {Z}}}}^u_t-\zeta _0\) satisfies the infinite dimensional SDE of the form
where \({\bar{{\mathcal {A}}}}\) is the linear bounded operator on \({\mathbb {H}}\) such that
These findings are summarized in the following theorem.
Theorem 2
Let \(\{X^u(t), t\in [0,T]\}\) be a Volterra Ornstein-Uhlenbeck process of the form (1.2) with the \({\mathbb {H}}\)-liftable kernel \(K(t) = \langle g, e^{t{\mathcal {A}}}\nu \rangle \), \(g,\nu \in \mathbb H\), \(\Vert \nu \Vert _{{\mathbb {H}}} = 1\), \({\mathcal {A}}\in {\mathcal {L}}(\mathbb H)\). Then, for any \(t\in [0,T]\),
where \(\zeta _0:=\frac{X(0)}{\Vert g\Vert _{{\mathbb {H}}}^2}g\) and \(\{{\mathcal {Z}}^u_t,~t\in [0,T]\}\) is the \({\mathbb {H}}\)-valued stochastic process given by
and \({\bar{{\mathcal {A}}}} \in {\mathcal {L}}({\mathbb {H}})\) is such that
Using Theorem 2, one can rewrite the performance functional J(u) from (1.3) as
where the superscript g in \(J^g\) is used to highlight dependence on the \({\mathbb {H}}\)-valued process \({\mathcal {Z}}^u\). Clearly, maximizing (2.6) is equivalent to maximizing
Finally, for the sake of notation and coherence with the literature, we will sometimes write our maximization problem as a minimization one by simply noticing that the maximization of the performance functional \(J^g (u) - a_2\langle g,\zeta _0\rangle \) can be reformulated as the minimization of
Remark 1
Using arguments similar to those in the proof of Theorem 1, it is straightforward to check that \(J^g\) and \({\bar{J}}^g\) are continuous w.r.t. u.
In other words, in the case of an \({\mathbb {H}}\)-liftable kernel K, the original optimal control problem (1.3) can be replaced by the following one:
Remark 2
The machinery described above can also be generalized for strongly continuous semigroups on Banach spaces, see e.g. Cuchiero and Teichmann (2019, 2020). However, for our purposes, it is sufficient to consider the case when \({\mathcal {A}}\) is a linear bounded operator on a Hilbert space.
2.3 Solution to the lifted problem
In order to solve the optimal control problem (2.8), we intend to use the dynamic programming approach as in Fabbri and Russo (2017). A comprehensive overview of this method for more general optimal control problems can also be found in Fabbri et al. (2017) and Fuhrman and Tessitore (2002).
Denote by \({\widetilde{\sigma }}\) an element of \({\mathcal {L}}({\mathbb {R}}, {\mathbb {H}})\) acting as
and consider the Hamilton-Jacobi-Bellman (HJB) equation associated with the problem (2.8) of the form
where by \(\nabla v\) we denote the partial Gateaux derivative w.r.t. the spatial variable z and the Hamiltonian functional \({\mathcal {H}}: [0,T]\times {\mathbb {H}}\times {\mathbb {H}}\rightarrow {\mathbb {R}}\) is defined as
Proposition 1
The HJB equation (2.9) associated with the lifted problem (2.8) admits a classical solution (in the sense of Fabbri and Russo 2017, Definition 3.4) of the form
where
\({\bar{{\mathcal {A}}}}^* = {\mathcal {A}}^* - \beta \langle \nu , \cdot \rangle g\), and
Proof
Let us solve the HJB equation (2.9) explicitly using the approach presented in Gozzi et al. (2009, Section 3.3). Namely, we will look for the solution in the form (2.10), where w(t) and c(t) are (unknown) functions such that \(\frac{\partial }{\partial t} v\) and \(\nabla v\) are well-defined. In this case,
and, recalling that \(\langle g, \zeta _0 \rangle = X(0)\), we can rewrite the HJB equation (2.9) as
Now it would be sufficient to find w and c that solve the following systems:
Noticing that the first system in (2.13) has to hold for all \(z\in {\mathbb {H}}\), we can solve
instead, which is a simple linear equation and its solution has the form (2.11). Now it is easy to see that c has the form (2.12) and
It remains to note that (2.10)–(2.12) is indeed a classical solution to (2.9) in the sense of Fabbri and Russo (2017, Definition 3.4).
\(\square \)
Let us now identify v in (2.10)–(2.12) with the value function of the lifted optimal control problem (2.8) using the result presented in Fabbri and Russo (2017, Theorem 4.1).
Theorem 3
(Verification theorem) Let v be the solution (2.10)–(2.12) to the HJB equation (2.9) associated with the lifted optimal control problem (2.8). Then
1) \(\inf _{u\in L^2_a} {\bar{J}}^g(u) = v(0,0)\);
2) the optimal control \({\hat{u}}\) minimizing \({\bar{J}}^g\) in (2.8) has the form
$$\begin{aligned} {\hat{u}}(t)=-\frac{\alpha }{2 a_1}\langle w(t),\nu \rangle =\frac{\alpha a_2}{2 a_1 } \left\langle g, e^{(T-t){\bar{{\mathcal {A}}}}}\nu \right\rangle , \end{aligned}$$ (2.14)
where \({\bar{{\mathcal {A}}}} = {\mathcal {A}}-\beta \langle g, \cdot \rangle \nu \).
In particular, \({\hat{u}}\) given by (2.14) solves the original optimal control problem (1.3).
Proof
It is straightforward to check that the coefficients of the forward equation in (2.8) satisfy Fabbri and Russo (2017, Hypothesis 3.1), whereas the cost functional \({\bar{J}}^g(u)\) satisfies the conditions of Fabbri and Russo (2017, Hypothesis 3.3). Moreover, the term \(-\beta \langle g, \zeta _0\rangle \nu \) in (2.8) satisfies condition (i) of Fabbri and Russo (2017, Theorem 3.7) and, since v given by (2.10)–(2.12) is a classical solution to the HJB equation (2.9), condition (ii) of Fabbri and Russo (2017, Theorem 3.7) holds automatically. Finally, it is easy to see that v has sufficient regularity as required in Fabbri and Russo (2017, Theorem 4.1). Therefore, both statements of Theorem 3 immediately follow from Fabbri and Russo (2017, Theorem 4.1). \(\square \)
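To make formula (2.14) concrete, the following sketch (ours) evaluates the optimal control for the toy \({\mathbb {R}}^2\)-lift introduced after Definition 1; all parameter values are illustrative, and the weights \(a_1, a_2\) refer to the assumed linear-quadratic performance functional.

```python
import numpy as np
from scipy.linalg import expm

# Toy R^2-lift of K(t) = (1 + 2t)*exp(-lam*t) (same illustrative choices as before)
lam = 1.0
A = np.array([[-lam, 1.0], [0.0, -lam]])   # e^{tA} = exp(-lam*t) * [[1, t], [0, 1]]
nu = np.array([0.0, 1.0])                  # ||nu|| = 1
g = np.array([2.0, 1.0])                   # K(t) = <g, e^{tA} nu>

alpha, beta, a1, a2, T = 1.0, 0.5, 1.0, 1.0, 2.0   # illustrative parameter values
Abar = A - beta * np.outer(nu, g)          # Abar x = A x - beta*<g, x>*nu

def u_hat(t):
    """Optimal control from (2.14): u(t) = alpha*a2/(2*a1) * <g, e^{(T-t) Abar} nu>."""
    return alpha * a2 / (2 * a1) * g @ expm((T - t) * Abar) @ nu

print([round(u_hat(t), 4) for t in np.linspace(0.0, T, 5)])
```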
Remark 3
The approach described above can be extended by lifting to Banach space-valued stochastic processes. See Di Nunno and Giordano (2022) for more details.
3 Approximate solution for forwards with Hölder kernels
The crucial assumption in Sect. 2 that allowed us to apply the optimization techniques in a Hilbert space was the liftability of the kernel. However, in practice it is often hard to find a representation of the required type for a given kernel, and even if such a representation is available, it is not always convenient from the implementation point of view. For this reason, we provide a liftable approximation for the Volterra Ornstein-Uhlenbeck process (1.2) for a general \(C^h\)-kernel K, where \(C^h([0,T])\) denotes the set of h-Hölder continuous functions on [0, T].
This section is structured as follows: first, we approximate an arbitrary \(C^h\)-kernel by a liftable one in a uniform manner and introduce a new optimization problem in which the forward dynamics is obtained from the original one by replacing the kernel K with its liftable approximation. Afterwards, we prove that the optimal value of the approximated problem converges to the optimal value of the original problem and give an estimate for the rate of convergence. Finally, we discuss some numerical aspects that could be useful from the implementation point of view.
Remark 4
In what follows, by C we denote any positive constant whose particular value is not important and may vary from line to line (and even within one line). By \(\Vert \cdot \Vert _2\) we denote the standard \(L^2(\Omega \times [0,T])\)-norm.
3.1 Liftable approximation for Volterra Ornstein-Uhlenbeck processes with Hölder continuous kernels
Let \(K\in C([0,T])\), \({\mathbb {H}}=L^2({\mathbb {R}})\), and let the operator \({\mathcal {A}}\) be the 1-shift operator acting on \({\mathbb {H}}\), i.e.
and denote by \(K_n\) the Bernstein polynomial approximation of K of order \(n\ge 0\), i.e.
where
Observe that
and hence \(K_n\) is \({\mathbb {H}}\)-liftable as
with \(g_n:= \sum _{k=0}^n k! \kappa _{n,k} \mathbbm {1}_{[-k,-k+1]}\) and \(\nu := \mathbbm {1}_{[0,1]}\).
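For concreteness, here is a minimal sketch (ours) that computes the monomial coefficients \(\kappa _{n,k}\) by expanding the classical Bernstein polynomial of K on [0, T] (this expansion is assumed to coincide with (3.2)) and illustrates both the lift identity \(K_n(t) = \langle g_n, e^{t{\mathcal {A}}}\nu \rangle \) and the uniform approximation error; the helper names are hypothetical.

```python
import math
import numpy as np

def bernstein_monomial_coeffs(K, n, T):
    """Monomial coefficients kappa[k] with K_n(t) = sum_k kappa[k]*t**k, obtained by
    expanding the standard Bernstein polynomial of K on [0, T] (assumed to match (3.2))."""
    kappa = np.zeros(n + 1)
    for j in range(n + 1):
        Kj = K(j * T / n)
        for k in range(j, n + 1):
            kappa[k] += (Kj * math.comb(n, j) * math.comb(n - j, k - j)
                         * (-1.0) ** (k - j) / T ** k)
    return kappa

K, T, n = (lambda t: t ** 0.3), 2.0, 10
kappa = bernstein_monomial_coeffs(K, n, T)
gn = [math.factorial(k) * kappa[k] for k in range(n + 1)]   # <g_n, 1_{[-k,-k+1]}>

# Lift check: e^{tA} nu = sum_m (t^m/m!) 1_{[-m,-m+1]}, hence
# <g_n, e^{tA} nu> = sum_{m<=n} gn[m]*t^m/m!  equals  K_n(t) = sum_k kappa[k]*t^k.
for t in (0.5, 1.0, 1.7):
    lifted = sum(gn[m] * t ** m / math.factorial(m) for m in range(n + 1))
    poly = sum(kappa[k] * t ** k for k in range(n + 1))
    print(t, poly, lifted, abs(K(t) - poly))   # last column: Bernstein error, cf. (3.3)
```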
By the well-known approximation property of Bernstein polynomials, for any \(\varepsilon > 0\), there exists \(n = n(\varepsilon ) \in {\mathbb {N}}_0\) such that
Moreover, if additionally \(K \in C^h([0,T])\) for some \(h\in (0,1)\), Mathe (1999, Theorem 1) guarantees that for all \(t\in [0,T]\)
where \(H>0\) is such that
Now, consider a controlled Volterra Ornstein-Uhlenbeck process \(\{X^u(t),~t\in [0,T]\}\) of the form (1.2) with a kernel \(K \in C^h([0,T])\) satisfying (3.4). For a given admissible u, define also a stochastic process \(\{X^u_n (t),~t\in [0,T]\}\) as the solution to the stochastic Volterra integral equation of the form
where \(K_n (t) = \sum _{k=0}^n \kappa _{n,k} t^k\) with \(\kappa _{n,k}\) defined by (3.2), i.e. the Bernstein polynomial approximation of K of degree n.
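As an illustration, the following sketch (ours) simulates one path of \(X^u\) and the corresponding path of \(X^u_n\) driven by the same Brownian increments; it uses the linear-quadratic specification of (1.2) assumed in the introduction, illustrative parameter values and hypothetical function names.

```python
import math
import numpy as np

def simulate_volterra_ou(K, u, X0, alpha, beta, sigma, T, N, dW):
    """Left-point Euler scheme for the (assumed) controlled Volterra OU dynamics
    X(t) = X0 + int_0^t K(t-s)*(alpha*u(s) - beta*X(s)) ds
              + sigma * int_0^t K(t-s) dW(s)."""
    dt = T / N
    t = np.linspace(0.0, T, N + 1)
    X = np.empty(N + 1); X[0] = X0
    for i in range(1, N + 1):
        w = K(t[i] - t[:i])                          # kernel weights K(t_i - t_j), j < i
        X[i] = (X0 + np.sum(w * (alpha * u(t[:i]) - beta * X[:i])) * dt
                   + sigma * np.sum(w * dW[:i]))
    return t, X

# Compare the path driven by K(t) = t^0.3 with the one driven by its Bernstein
# approximation K_n, using the same Brownian increments (cf. Theorem 4 below).
T, N, n = 2.0, 400, 20
K = lambda s: s ** 0.3
Kn = lambda s: sum(K(j * T / n) * math.comb(n, j) * (s / T) ** j * (1 - s / T) ** (n - j)
                   for j in range(n + 1))
rng = np.random.default_rng(0)
dW = rng.normal(0.0, math.sqrt(T / N), N)
u = lambda s: np.ones_like(s)                        # a constant admissible control
_, X = simulate_volterra_ou(K, u, 0.0, 1.0, 1.0, 1.0, T, N, dW)
_, Xn = simulate_volterra_ou(Kn, u, 0.0, 1.0, 1.0, 1.0, T, N, dW)
print(np.max(np.abs(X - Xn)))                        # discrepancy shrinks as n grows
```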
Remark 5
It follows from Azmoodeh et al. (2014, Corollary 4) that both stochastic processes \(\int _0^t K(t-s)dW(s)\) and \(\int _0^t K_n(t-s)dW(s)\), \(t\in [0,T]\), have modifications that are Hölder continuous at least up to the order \(h \wedge \frac{1}{2}\). From now on, these modifications will be used.
Now we move to the main result of this subsection.
Theorem 4
Let \(K\in C^h([0,T])\) and let \(X^u\), \(X^u_n\) be given by (1.2) and (3.5), respectively, for \(u\in L^2_a\). Then there exists \(C>0\), not depending on n or u, such that for any admissible \(u\in L^2_a\):
Proof
First, by Theorem 1, there exists a constant \(C>0\) such that
Consider an arbitrary \(\tau \in [0,T]\), and denote \(\Delta (\tau ):=\sup _{t\in [0,\tau ]} \mathbb {E}\left[ |X^u(t) -X^u_{n}(t)|^2\right] \). Then
Note that, by (3.3) we have that
Moreover, since the \(K_n\), \(n\ge 1\), are uniformly bounded due to their uniform convergence to K, it is true that
with C not dependent on n, and from (3.3), (3.6) one can deduce that
Lastly, by the Itô isometry and (3.3),
Hence
where C is a positive constant (recall that it may vary from line to line). The final result follows from Gronwall’s inequality. \(\square \)
3.2 Liftable approximation of the optimal control problem
As noted before, our aim is to find an approximate solution to the optimization problem (1.3) by solving the liftable problem of the form
where the maximization is performed over \(u\in L^2_a\). In (3.7), \(K_n\) is the Bernstein polynomial approximation of \(K\in C^h([0,T])\), i.e.
where \({\mathcal {A}}\in {\mathcal {L}} \left( {\mathbb {H}}\right) \) acts as \(({\mathcal {A}}f)(x) = f(x+1)\), \(\nu = \mathbbm {1}_{[0,1]}\) and \( g_n = \sum _{k=0}^n k! \kappa _{n,k} \mathbbm {1}_{[-k,-k+1]}\) with \(\kappa _{n,k}\) defined by (3.2). Due to the liftability of \(K_n\), the problem (3.7) falls within the framework of Sect. 2, so, by Theorem 3, the optimal control \({\hat{u}}_n\) has the form (2.14):
where \({\bar{{\mathcal {A}}}}_n:= {\mathcal {A}}-\beta \langle g_n, \cdot \rangle \nu \). The goal of this subsection is to prove the convergence of the optimal performance in the approximated dynamics to the actual optimal, i.e.
where J is the performance functional from the original optimal control problem (1.3).
Proposition 2
Let the kernel \(K \in C^h([0,T])\). Then
where \(\Vert \cdot \Vert _2\) denotes the standard \(L^2(\Omega \times [0,T])\) norm.
Proof
We prove only (3.9); the proof of (3.10) is the same. Let \(u\in L^2_a\) be fixed. For any \(n\in {\mathbb {N}}\) denote
and notice that for any \(t\in [0,T]\) we have that
where \(C > 0\) is a deterministic constant that does not depend on n, t or u (here we used the fact that \(K_n \rightarrow K\) uniformly on [0, T]). Whence, for any \(n\in {\mathbb {N}}\),
Now, let us prove that there exists a constant \(C>0\) such that
First note that, by Remark 5, for each \(n\in {\mathbb {N}}\) and \(\delta \in \left( 0, \frac{h}{2} \wedge \frac{1}{4}\right) \) there exists a random variable \(\Upsilon _n = \Upsilon _n (\delta )\) such that
and whence
Thus it is sufficient to check that \(\sup _{n\in {\mathbb {N}}}\mathbb {E}\Upsilon _n < \infty \). It is known from Azmoodeh et al. (2014) that one can put
where \(p:=\frac{1}{\delta }\) and \(C_\delta > 0\) is a constant that does not depend on n. Let \(p' > p\). Then the Minkowski integral inequality yields
Note that, by Mathe (1999, Proposition 2), every Bernstein polynomial \(K_n\) that corresponds to K is Hölder continuous of the same order h and with the same constant H, i.e.
whenever
This implies that there exists a constant C which does not depend on n such that
Plugging the bound above into (3.12), we get that
where \(C>0\) denotes, as always, a deterministic constant that does not depend on n, t, u and may vary from line to line.
Therefore, there exists a constant, again denoted by C and not depending on n, t or u, such that
and thus, by (3.11),
By Gronwall’s inequality, there exists \(C>0\) which does not depend on n such that
and so
\(\square \)
Theorem 5
Let \(K \in C^h([0,T])\) and let \(K_n\) be its Bernstein polynomial approximation of order n. Then there exists a constant \(C>0\) such that
Moreover, \({\hat{u}}_n\) is “almost optimal” for J in the sense that there exists a constant \(C>0\) such that
Proof
First, note that for any \(r \ge 0\)
where \(B_r:= \{u\in L^2_a:~\Vert u\Vert _2 \le r\}\). Indeed, by the definitions of J and \(J_n\) and by Theorem 4, for any \(u\in B_r\):
In particular, this implies that there exists \(C>0\) that does not depend on n such that \(J(0) - C < J_n(0)\), so, by Proposition 2, there exists \(r_0>0\) that does not depend on n such that \(\Vert u\Vert _2 > r_0\) implies
In other words, all optimal controls \({\hat{u}}_n\), \(n\in {\mathbb {N}}\), must lie in the ball \(B_{r_0}\), and \(\sup _{u\in L^2_a} J(u) = \sup _{u\in B_{r_0}} J(u)\). This, together with the uniform convergence of \(J_n\) to J over bounded subsets of \(L^2_a\) and estimate (3.14), implies that there exists \(C>0\) not dependent on n such that
Finally, taking into account (3.14) and (3.16) as well as the definition of \(B_{r_0}\),
which ends the proof. \(\square \)
Theorem 6
Let \(K \in C^h([0,T])\) and \({\hat{u}}_n\) be defined by (3.8). Then the optimization problem (1.3) has a unique solution \({\hat{u}} \in L^2_a\) and
in the weak topology of \(L^2(\Omega \times [0,T])\).
Proof
By (2.1), the performance functional J can be represented in a linear-quadratic form as
where \({\mathcal {L}}\): \(L^2(\Omega \times [0,T]) \rightarrow L^2(\Omega \times [0,T])\) is a continuous linear operator. Then, by Allaire (2007, Theorem 9.2.6), there exists a unique \({\hat{u}} \in L^2(\Omega \times [0,T])\) that maximizes J and, moreover, \({\hat{u}}_n \rightarrow {\hat{u}}\) weakly as \(n\rightarrow \infty \). Furthermore, since all \({\hat{u}}_n\) are deterministic, so is \({\hat{u}}\); in particular, it is adapted to the filtration generated by W, which implies that \({\hat{u}} \in L^2_a\). \(\square \)
3.3 Algorithm for computing \({\hat{u}}_n\)
The explicit form of \({\hat{u}}_n\) given by (3.8) is not very convenient from the implementation point of view since one has to compute \(e^{(T-t) {\bar{{\mathcal {A}}}}_n}\nu =e^{(T-t)\bar{\mathcal {A}}_n}\mathbbm {1}_{[0,1]}\), where \({\bar{{\mathcal {A}}}}_n:= {\mathcal {A}}-\beta \langle g_n, \cdot \rangle \mathbbm {1}_{[0,1]}\), \(({\mathcal {A}}f)(x) = f(x+1)\). A natural way to simplify the problem is to truncate the series
for some \(M\in {\mathbb {N}}\). However, even after replacing \(e^{(T-t){\bar{{\mathcal {A}}}}_n}\) in (3.8) with its truncated version, we still need to be able to compute \(\bar{\mathcal {A}}_{n}^k\mathbbm {1}_{[0,1]}\) for the given \(k \in {\mathbb {N}} \). An algorithm to do so is presented in the proposition below.
Proposition 3
For any \(k\in {\mathbb {N}} \cup \{0\}\),
where \(\gamma ({0,0})=1\) and, for all \(k\ge 1\),
Proof
The proof follows an inductive argument. The statement for \(\gamma ({0,0})\) is obvious. Now let
Then
\(\square \)
Finally, consider
where \(\kappa _{n,i}\) are defined by (3.2) and \(\gamma ({i,k})\) are from Proposition 3.
Theorem 7
Let \(n\in {\mathbb {N}}\) be fixed and \(M \ge (T-t)\Vert {\bar{{\mathcal {A}}}}_n\Vert _{{\mathcal {L}}}\), where \(\Vert \cdot \Vert _{\mathcal L}\) denotes the operator norm. Then, for all \(t\in [0,T]\),
Moreover,
Proof
It suffices to prove the first inequality; the second one then follows. It is clear that
and, if \(M \ge (T-t)\Vert {\bar{{\mathcal {A}}}}_n\Vert _{{\mathcal {L}}}\), we have that
where we used a well-known result on the tail probabilities of the Poisson distribution (see e.g. Samuels 1965). \(\square \)
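To complement Proposition 3 and Theorem 7, here is a minimal computational sketch (ours) that evaluates \({\hat{u}}_{n,M}(t)\) by representing \({\bar{{\mathcal {A}}}}_n\) directly on the indicator basis \(\{\mathbbm {1}_{[-j,-j+1]}\}\), which the shift \({\mathcal {A}}\) maps into itself, instead of using the recursion for \(\gamma (i,k)\); the truncated-exponential formula and the role of the weights \(a_1, a_2\) are assumptions consistent with (2.14) and (3.8), and the function names are hypothetical.

```python
import math
import numpy as np

def u_hat_nM(t, T, kappa, alpha, beta, a1, a2, M):
    """Evaluates the truncated optimal control
        u_{n,M}(t) = alpha*a2/(2*a1) * <g_n, sum_{k=0}^{M} ((T-t)^k/k!) Abar_n^k nu>
    by representing Abar_n = A - beta*<g_n, .>*nu on the indicator basis
    e_j = 1_{[-j,-j+1]}: A shifts e_j to e_{j+1}, nu = e_0 and <g_n, e_j> = j!*kappa[j].
    Here kappa are the monomial coefficients of K_n (cf. the sketch in Sect. 3.1),
    and a1, a2 enter through the assumed linear-quadratic performance functional."""
    n = len(kappa) - 1
    gn = np.array([math.factorial(j) * kappa[j] for j in range(n + 1)])
    c = np.zeros(n + M + 2); c[0] = 1.0      # coefficients of Abar_n^k nu, k = 0
    S = np.zeros(n + M + 2)                  # truncated exponential applied to nu
    for k in range(M + 1):
        S += (T - t) ** k / math.factorial(k) * c
        inner = gn @ c[:n + 1]               # <g_n, Abar_n^k nu>
        c = np.concatenate(([0.0], c[:-1]))  # the shift operator A
        c[0] -= beta * inner                 # rank-one part: -beta*<g_n, .>*nu
    return alpha * a2 / (2 * a1) * (gn @ S[:n + 1])

# Example usage with the monomial kernel K(t) = t^2 of Example 1 below, for which
# kappa = (0, 0, 1) exactly; all parameter values are illustrative. For a general
# Hoelder kernel, kappa would come from the Bernstein expansion of Sect. 3.1.
kappa = np.array([0.0, 0.0, 1.0])
print([round(u_hat_nM(t, 2.0, kappa, 1.0, 1.0, 1.0, 1.0, M=50), 4)
       for t in (0.0, 0.5, 1.0, 1.5, 2.0)])
```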
4 Examples and simulations
Example 1
(monomial kernel) Let \(N \in {\mathbb {N}}\) be fixed. Consider an optimization problem of the form
where, as always, we optimize over \(u\in L^2_a\). The kernel \(K(t) = t^{N}\) is \({\mathbb {H}}\)-liftable,
where \(({\mathcal {A}}f)(x) = f(x+1)\), \(f\in {\mathbb {H}}\). By Theorem 3, the optimal control for the problem (4.1) has the form
where \({\bar{{\mathcal {A}}}} ={\mathcal {A}}-N!\langle \mathbbm {1}_{[-N,-N+1]}, \cdot \rangle \mathbbm {1}_{[0,1]} \). In this simple case, we are able to find an explicit expression for \(e^{(T-t){\bar{{\mathcal {A}}}}^*} \mathbbm {1}_{[-i,-i+1]}\). Indeed, it is easy to see that, for any \(i\in {\mathbb {N}} \cup \{0\}\), \(p \in {\mathbb {N}}\cup \{0\}\) and \(q=0,1,...,N\),
and whence
where \(E_{a,b}(z):= \sum _{p=0}^\infty \frac{z^p}{\Gamma (ap+b)}\) is the Mittag-Leffler function. This, in turn, implies that
In Fig. 1, the black curve depicts the optimal \({\hat{u}}\) computed for the problem (4.1) with \(K(t) = t^2\) and \(T=2\) using (4.2); the other curves are the approximated optimal controls \({\hat{u}}_{n,M}\) (as in (3.17)) computed for \(n=1,2,5,10\) and \(M=20\).
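The Mittag-Leffler function \(E_{a,b}\) appearing above can be evaluated by truncating its defining series; a minimal sketch (the truncation rule and tolerance are arbitrary choices):

```python
import math

def mittag_leffler(a, b, z, P=100):
    """Truncated series for E_{a,b}(z) = sum_{p>=0} z^p / Gamma(a*p + b)
    (the definition recalled in Example 1). Summation stops at p = P, when a
    term becomes negligible, or when the Gamma factor overflows."""
    s = 0.0
    for p in range(P + 1):
        try:
            term = z ** p / math.gamma(a * p + b)
        except OverflowError:        # Gamma(a*p + b) too large: remaining terms ~ 0
            break
        s += term
        if abs(term) < 1e-16 * max(1.0, abs(s)):
            break
    return s

print(mittag_leffler(1.0, 1.0, 1.0), math.e)   # E_{1,1}(z) = e^z, so both ~ 2.71828
```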
Remark 6
The solution of the problem (4.1) described in Example 1 should be regarded only as an illustration of the optimization technique via the infinite dimensional lift: in fact, the kernel K in this example is degenerate and thus the forward equation in (4.1) can be solved explicitly. This means that other, finite dimensional, techniques could have been used in this case.
Example 2
(fractional and gamma kernels) Consider three optimization problems of the form
\(u\in L^2_a\), where the kernels are chosen as follows: \(K_1(t):= t^{0.3}\) (fractional kernel), \(K_2(t):= t^{1.1}\) (smooth kernel) and \(K_3(t):= e^{-t}t^{0.3}\) (gamma kernel). In these cases, we apply the machinery presented in Sect. 3 to find \({\hat{u}}_{n,M}\) for each of the optimal control problems described above. In our simulations, we choose \(T=2\), \(n=20\), \(M=50\); the mesh of the partition for simulating sample paths of \(X^u\) is set to 0.05, \(\sigma = 1\), \(X(0) = 0\).
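For reference, a minimal configuration sketch collecting the kernels and the numerical parameters listed above (the container names are ours; \(\alpha \) and \(\beta \) are left unspecified since they vary across the panels of Fig. 2):

```python
import numpy as np

# Simulation setup of Example 2 (alpha and beta are not fixed here because Fig. 2
# reports the approximated controls for several values of these parameters).
kernels = {
    "K1_fractional": lambda t: t ** 0.3,
    "K2_smooth":     lambda t: t ** 1.1,
    "K3_gamma":      lambda t: np.exp(-t) * t ** 0.3,
}
params = {
    "T": 2.0,      # time horizon
    "n": 20,       # order of the Bernstein approximation
    "M": 50,       # truncation level of the operator exponential
    "dt": 0.05,    # mesh for simulating sample paths of X^u
    "sigma": 1.0,
    "X0": 0.0,
}
```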
Figure 2 depicts the approximated optimal controls for different values of \(\alpha \) and \(\beta \). Note that the gamma kernel \(K_3(t)\) (third column) is of particular interest in optimal advertising. This kernel, in fact, captures the peculiarities of the empirical data (see Leone 1995), since the past dependence comes into play after a certain amount of time (like a delayed effect) and its relevance declines as time goes forward.
Remark 7
Note that the stochastic Volterra integral equation from (4.3) can sometimes be solved explicitly for certain kernels (e.g. via the resolvent method). For instance, the solution \(X^u\) which corresponds to the fractional kernel of the type \(K(t) = t^h\), \(h>0\), and \(\beta = 1\) has the form
where \(E_{a,b}\) again denotes the Mittag-Leffler function. Having the explicit solution, one could solve the optimization problem (4.3) by plugging the expression for \(X^u\) into the performance functional and applying standard minimization techniques in Hilbert spaces. However, as mentioned in the introduction, this leads to tedious calculations that are complicated to implement, whereas our approach allows one to obtain the approximated solution in a relatively simple manner.
References
Abi Jaber, E., Larsson, M., Pulido, S.: Affine Volterra processes. Ann. Appl. Probab. 29(5), 3155–3200 (2019)
Abi Jaber, E., Miller, E., Pham, H.: Linear-quadratic control for a class of stochastic Volterra equations: solvability and approximation. Ann. Appl. Probab. (2021) (to appear)
Agram, N., Øksendal, B.: Malliavin calculus and optimal control of stochastic Volterra equations. J. Optim. Theory Appl. 167(3), 1070–1094 (2015)
Allaire, G.: Numerical Analysis and Optimization: An Introduction to Mathematical Modelling and Numerical Simulation. Oxford University Press, London (2007)
Azmoodeh, E., Sottinen, T., Viitasaari, L., Yazigi, A.: Necessary and sufficient conditions for Hölder continuity of Gaussian processes. Stat. Probab. Lett. 94, 230–235 (2014)
Carmona, P., Coutin, L.: Fractional Brownian motion and the Markov property. Electron. Commun. Probab. 3, 95–107 (1998)
Cuchiero, C., Teichmann, J.: Markovian lifts of positive semidefinite affine Volterra-type processes. Decis. Econ. Finance 42(2), 407–448 (2019)
Cuchiero, C., Teichmann, J.: Generalized Feller processes and Markovian lifts of stochastic Volterra processes: the affine case. J. Evol. Equ. 20(4), 1301–1348 (2020)
Di Nunno, G., Giordano, M.: Lifting of Volterra processes: optimal control in UMD Banach spaces. Report (2022)
Fabbri, G., Gozzi, F., Świech, A.: Stochastic Optimal Control in Infinite Dimension: Dynamic Programming and HJB Equations. Springer, Cham (2017)
Fabbri, G., Russo, F.: HJB equations in infinite dimension and optimal control of stochastic evolution equations via generalized Fukushima decomposition. SIAM J. Control Optim. 55(6), 4072–4091 (2017). https://doi.org/10.1137/17M1113801
Feichtinger, G., Hartl, R., Sethi, S.: Dynamic optimal control models in advertising: recent developments. Manag. Sci. 40, 195–226 (1994)
Fuhrman, M., Tessitore, G.: Nonlinear Kolmogorov equations in infinite dimensional spaces: the backward stochastic differential equations approach and applications to optimal control. Ann. Probab. 30, 1397–1465 (2002)
Gozzi, F., Marinelli, C.: Stochastic optimal control of delay equations arising in advertising models. In: Stochastic Partial Differential Equations and Applications, Lecture Notes in Pure and Applied Mathematics, pp. 133–148. Chapman and Hall/CRC, Boca Raton (2005)
Gozzi, F., Marinelli, C., Savin, S.: On controlled linear diffusions with delay in a model of optimal advertising under uncertainty with memory effects. J. Optim. Theory Appl. 142(2), 291–321 (2009)
Leone, R.: Generalizing what is known about temporal aggregation and advertising carryover. Mark. Sci. 14, G141–G150 (1995)
Li, C., Zhen, W.: Stochastic optimal control problem in advertising model with delay. J. Syst. Sci. Complex. 33, 968–987 (2020)
Mathe, P.: Approximation of Hölder continuous functions by Bernstein polynomials. Am. Math. Mon. 106(6), 568 (1999)
Nerlove, M., Arrow, K.: Optimal advertising policy under dynamic conditions. Economica 29(114), 129–142 (1962)
Rao, R.C.: Estimating continuous time advertising-sales models. Mark. Sci. 5(2), 125–142 (1986)
Samuels, S.M.: On the number of successes in independent trials. Ann. Math. Stat. 36(4), 1272–1278 (1965)
Sethi, S.: Dynamic optimal control models in advertising: a survey. SIAM Rev. 19(4), 685–725 (1977)
Tricomi, F.G.: Integral Equations. Dover Publications, Mineola (1985)
Vidale, M.L., Wolfe, H.B.: An operations-research study of sales response to advertising. Oper. Res. 5, 370–381 (1957)
Yong, J.: Backward stochastic Volterra integral equations and some related problems. Stoch. Process Appl. 116, 770–795 (2006)
Acknowledgements
The authors would also like to thank Dennis Schroers for his enlightening help with one of the proofs leading to this paper, as well as Giulia Di Nunno for the proofreading and valuable remarks.
Funding
Open access funding provided by University of Oslo (incl. Oslo University Hospital). The present research is carried out within the frame and support of the ToppForsk project nr. 274410 of the Research Council of Norway with the title STORM: Stochastics for Time-Space Risk Models.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Keywords
- Dynamic programming
- Volterra Ornstein-Uhlenbeck process
- Infinite-dimensional Bellman equations
- Optimal advertising