Abstract
Motivated by the current scheduling needs of complex resource allocation systems, this paper introduces a novel methodology for performance optimization in DES applications that evolve over very large and complex state spaces and the main objective is expressed as the long-run maximization of some reward rate. The proposed methodology leverages the Generalized Stochastic Petri Net (GSPN) modeling framework in order to effect the seamless integration of the logical and performance-oriented control of the aforementioned applications, define a pertinent policy space, and (re-)cast the performance optimization problem into a mathematical programming formulation that is eventually solved through sensitivity analysis of Markov reward processes and stochastic approximation algorithms. An important attribute of the proposed methodology is that it facilitates an explicit control of the existing trade-off between the computational tractability of the employed formulation and the performance of the derived policies. Furthermore, by posing the eventually defined problem as a constrained nonlinear programming formulation, the presented methodology inherits all the analytical tools and insights that are offered by that vast area of optimization theory. In the current manuscript, all these possibilities are demonstrated through the application of the proposed approach to the throughput maximization of a capacitated re-entrant line abstracting the operation of an automated manufacturing cell.
Similar content being viewed by others
Notes
Also, we should notice, for completeness, that according to standard PN notation, the other three elements in the net-defining tuple are the place set P, the flow relation W that defines the connectivity of the net, and the net initial marking m 0.
As remarked in Section 2, the restriction of the GSPN timed dynamics to Markovian behavior does not restrict substantially its modeling power, since (i) more generally distributed firing times can be approximated to any desired degree of accuracy by phase-type distributions, and (ii) the latter can be modeled by appropriately structured GSPN subnets (Cassandras and Lafortune 2008).
For notational economy, in the following we shall use \(\mathcal {N}\) to refer to the controlled net, as well.
Besides its mathematical necessity for the well-posedness of the proposed optimization problem, the randomization in the decision-making process introduced by δ will also be a useful “exploration” mechanism in the solution of the proposed formulation through the sample-path-based techniques that are pursued in this work.
On the other hand, random switches that depend also on the marking m are characterized as dynamic.
The last remarks are further exemplified through the case study that is presented in Section 6.
The reader should also notice that the “unfolding” nature of the construction of \(\mathcal {G}_{v}(\mathcal {N}; m)\) implies that it is possible that \(m\in \bar {M}\). In fact, this will be the case for the “fictitious” transitions in \(\mathcal {U}(\mathcal {N})\) that are introduced by the uniformization.
However, this increase can take place in an arbitrarily slow rate.
Thus, the characterization “capacitated re-entrant line”.
As already mentioned, more general distributions can be effectively approximated by phase-type distributions (Cassandras and Lafortune 2008). Hence, the restriction of this example to exponentially distributed processing times does not compromise its generality. We also notice that the cluster tools that are used in the contemporary semiconductor manufacturing industry are miniaturized versions of the robotic cells that are considered in this example (Singer 1995; Weiss 1996).
More formally, from the standpoint of the PN-based supervisory control theory, P SCP is a “monitor” place that enforces the aforementioned inequality by introducing an appropriate p-semiflow in the net dynamics (Giua et al. 1992). The theory of imposing constraints on the PN behavior that are expressed as linear inequalities in the net marking, is pretty well developed (Moody and Antsaklis 1998; Iordache and Antsaklis 2006), and in the particular context of the liveness-enforcing supervision of complex RAS, it can support the implementation of the maximally permissive LES or of some pretty tight approximations of that policy Reveliotis (2005, 2007).
We should also notice, for completeness, that the optimal scheduling policy for capacitated re-entrant lines might involve deliberate idleness. The possibility of modeling such deliberate idleness in the GSPN framework has been demonstrated in Choi and Reveliotis (2003). In this work, we have opted to ignore this behavioral element in order to maintain the presentational tractability of the considered example. Furthermore, it can be verified that for the considered example, the optimal scheduling policy does not involve any deliberate idleness.
We opted for a fairly large N 1, especially when compared to N 2, since (i) the analysis of Section 5 has revealed the criticality of the size of N 1 for controlling the bias in the observations Y n w.r.t. the actual direction of the gradient \(\nabla _{\bar {\xi }}(\bar {\xi }_{n})\), and furthermore, (ii) the computational cost of the replications that provide the throughput estimates are significantly lower than the computational cost of the replications that provide the Y n -values.
References
Ajmone MM, Balbo G, Conte G, Donatelli S, Franceschinis G (1994) Modeling with generalized stochastic petri nets. Wiley
Ajmone Marsan M, Conte G, Balbo G (1984) A class of generalized stochastic Petri nets for the performance evaluation of multiprocessor systems. ACM Trans Comput Sys 2:93–122
de Alfaro L (1997) Formal verification of probabilistic systems. PhD thesis, Stanford University, Palo Alto, VA
Beccuti M, Franceschinis G, Haddad S (2007) Markov Decision Petri Net and Markov Decision Well-Formed Net formalisms. In: ICATPN 2007 – LNCS 4546. pp 43–62
Bellman R (1957) Applied dynamic programming. Princeton University, Princeton
Bertsekas DP (2005) Dynamic programming and optimal control, 3rd edn. Athena Scientific, Belmont
Bertsekas DP, Tsitsiklis JN (1996) Neuro-dynamic programming. Athena Scientific, Belmont
Cao X-R (1994) Realization probabilities: the dynamics of queueing systems. Springer-Verlag, New York
Cao X-R (2007) Stochastic learning and optimization. Springer, New York
Cassandras CG, Lafortune S (2008) Introduction to discrete event systems, 2nd edn. Springer, New York
Choi JY, Reveliotis SA (2003) A generalized stochastic Petri net model for performance analysis and control of capacitated re-entrant lines. IEEE Trans Robot Autom 19:474–480
Ciardo G, Muppala JK, Trivedi KS (1991) On the solution of GSPN reward models. Perform Eval 12:237–253
Cordone R, Nazeem A, Piroddi L, Reveliotis S (2013) Designing optimal deadlock avoidance policies for sequential resource allocation systems through classification theory: existence results and customized algorithms. IEEE Trans Autom Control 58:2772–2787
Giua A, DiCesare F, Silva M (1992) Generalized mutual exclusion constraints on nets with uncontrollable transitions. In: Proceedings of the 1992 IEEE Intl. conference on systems, man and cybernetics. IEEE, pp 974–979
Gosavi A (2003) Simulation-based optimization: parametric optimization techniques and reinforcement learning. Kluwer Academic Publishers, Norwell
Ho YC, Cao X-R (1991) Perturbation analysis of discrete event systems. Kluwer Academic Publishers, Boston
Iordache MV, Antsaklis PJ (2006) Supervisory control of concurrent systems: a Petri net structural approach. Birkhäuser, Boston
Konda VR, Tsitsiklis JN (2003) On actor-critic algorithms. SIAM J Control Optim 42:1143–1166
Kumar PR (1994) Scheduling manufacturing systems of re-entrant lines. In: Yao, DD (ed) Stochastic modeling and analysis of manufacturing systems. Springer-Verlag, pp 325–360
Kumar PR (1994) Scheduling semiconductor manufacturing plants. IEEE Control Syst Mag 14–6:33–40
Kushner HJ, Yin GG (2003) Stochastic approximation and recursive algorithms and applications. Springer, New York
Li ROptimized scheduling of complex resource allocation systems. PhD thesis, ISyE, Georgia Tech, Atlanta, GA (in preparation)
Li R, Reveliotis S (2013) Performance optimization for a class of generalized stochastic Petri nets. In Proceedings of CDC 2013. IEEE
Li Z, Zhou M, Wu N (2008) A survey and comparison of Petri net-based deadlock prevention policies for flexible manufacturing systems. IEEE Trans Syst Man Cybern Part C Appl Rev 38:173–188
Luenberger DG (1984) Linear and nonlinear programming, 2nd edn. Addison Wesley, Reading
Marbach P, Tsitsiklis JN (2001) Simulation-based optimization of Markov reward processes. IEEE Trans Autom Control 46:191–209
Moazzez E R, Li K, Paschalidis I Ch (2012) A least squares temporal difference actor-critic algorithm with applications to warehouse management. Nav Res Logist 59:197–211
Moody JO, Antsaklis PJ (1998) Supervisory control of discrete event systems using Petri nets. Kluwer Academic Publisher, Boston
Murata T (1989) Petri nets: Properties, Analysis and applications. Proc IEEE 77:541–580
Papadimitriou CH, Tsitsiklis JN (1999) The complexity of optimal queueing network control. Math. OR 24:293–305
Powell WB (2007) Approximate dynamic programming: solving the curses of dimensionality. Wiley, New York
Puterman ML (1994) Markov decision processes: discrete stochastic dynamic programming. Wiley
Ramadge PJG, Wonham WM (1989) The control of discrete event systems. Proc IEEE 77:81–98
Reveliotis S (2007) Algebraic deadlock avoidance policies for sequential resource allocation systems. In: Lahmar M (ed) Facility logistics: approaches and solutions to next generation challenges. Auerbach Publications, pp 235–289
Reveliotis S, Nazeem A (2014) Deadlock avoidance policies for automated manufacturing systems using finite state automata. In: Campos J, Seatzu C, Xie X (eds) Formal Methods in Manufacturing. CRC Press / Taylor & Francis
Reveliotis SA (2000) The destabilizing effect of blocking due to finite buffering capacity in multi-class queueing networks. IEEE Trans Autom Control 45:585–588
Reveliotis SA (2005) Real-time management of resource allocation systems: a discrete event systems approach. Springer, New York
Ross SM (1983) Stochastic processes. Wiley, New York
Rubinstein RY, Shapiro A (1993) Discrete event systems: sensitivity analysis and stochastic optimization by the score function method. Wiley, New York
Singer P (1995) The driving forces in cluster tool development. Semiconductor International. July ’95. pp 113–118
Weiss M (1996) Semiconductor factory automation. Solid State Tech:89–96
Wonham WM (2006) Supervisory control of discrete event systems. Technical Report ECE 1636F / 1637S 2006-07, Electrical & Computer Eng., University of Toronto
Acknowledgment
This work was partially supported by NSF grant CMMI-0928231.
Author information
Authors and Affiliations
Corresponding author
Appendix: The generic SA algorithm that is employed in this work
Appendix: The generic SA algorithm that is employed in this work
Consider the recursion
that is further specified by the following assumptions:
-
I.
{Y n } is a stochastic process such that E n [Y n ]≡E[Y n |𝜃 0, Y i , i< n]=∇ 𝜃 g(𝜃 n ) + β n where g(⋅) is a continuously differentiable real-valued function, and β n → 0 w.p. 1.
-
II.
supnE[|Y n |2] < ∞.
-
III.
\(\epsilon _{n} \geq 0;\ \epsilon _{n} \rightarrow 0;\ \sum _{n=0}^{\infty } \epsilon _{n} = \infty ;\ \sum _{n=0}^{\infty } {\epsilon _{n}^{2}} < \infty\).
-
IV.
H ≡ {𝜃:ψ i (𝜃) ≤ 0, i = 1, …, c}, where ψ i (⋅), i = 1, …, c, are continuously differentiable real-valued functions. It is further assumed that ∀i, ∇ 𝜃 ψ i (𝜃) ≠ 0 if ψ i (𝜃) = 0. Furthermore, the set H is assumed to be connected, compact and non-empty. Finally, the operator π H [⋅] projects any given 𝜃 upon the set H, i.e., it returns a point 𝜃′ ∈ H such that 𝜃′ ∈ argmin{|𝜃−𝜃″|2: 𝜃″ ∈ H} and the considered norm is the Euclidean norm.
It is clear from the above description that the sequence {𝜃 n } generated by the recursion of Eq. 21 is constrained to evolve in H. For any point 𝜃 ∈ H, let A(𝜃) = {i:ψ i (𝜃) = 0}. The set A(𝜃) is the set of the active constraints at 𝜃. Furthermore, consider the set Ω(𝜃) = {∇ 𝜃 ψ i (𝜃):i ∈ A(𝜃)}, i.e., the set containing the outward normals of the active constraints at 𝜃, and let C(𝜃) denote the convex cone that is generated by the elements of Ω(𝜃). For the particular case where A(𝜃) = ∅, C(𝜃) will contain only the zero element. On the other hand, for A(𝜃) ≠ ∅, and under the further assumption that the elements of Ω(𝜃) are linearly independent, the recursion of Eq. 21 can be rewritten as follows:
where Z n is a vector belonging in −C(𝜃 n+1). Finally, let
We shall refer to S H as the (set containing the) “stationary” points of H. S H can be divided into disjoint compact and connected subsets S j , j = 0, … The next theorem characterizes the limiting behavior of the recursion of Eq. 21 under the further structure that is defined in the above discussion.
Theorem 2
(Kushner and Yin 2003 ) Consider the recursion of Eq. 21 under Assumptions I-IV, and further suppose that the underlying function g(⋅) is constant on each subset S i of the set S H . Then, the sequence {𝜃 n } that is generated by this recursion converges to a unique subset S i w.p. 1.
Rights and permissions
About this article
Cite this article
Li, R., Reveliotis, S. Performance optimization for a class of generalized stochastic Petri nets. Discrete Event Dyn Syst 25, 387–417 (2015). https://doi.org/10.1007/s10626-014-0189-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10626-014-0189-3