Application of Multilayer Feedforward
Networks in the Solution of Compressible Euler
Equations
Final Technical Report for NGT-70353
Principal Investigator:
Andrew J. Meade, Jr.
Associate Professor
Mechanical Engineering and
Materials Science, MS-321
William Marsh Rice University
Houston, Texas 77005-1892
Phone: (713) 737-5880
E-mail: meade@rice.edu
February, 1998
Abstract
Recent work has proven that it is possible to actually "program" multilayer feedforward artificial neural networks (ANNs). This new paradigm is not only making it possible to logically and predictably extend the capabilities of ANNs to "hard" computing, such as Computational Fluid Dynamics (CFD), but it is also revealing a different, quantitative way of looking at neural networks that could help advance the level of understanding we possess about these systems.

Accurate modeling of linear ordinary differential equations (ODEs) and linear partial differential equations (PDEs) has already been successfully completed using this paradigm. It is proposed to extend this capability further by attempting to use ANNs to model the solution of the two-dimensional compressible Euler equations. The code thus created should not only have the same accuracy as a more conventional computer code, but should still retain the ability of an ANN to modify itself when exposed to experimental data, thus yielding software that could be specialized with experimental results. To accomplish these objectives, the method of optimal incremental function approximation has been developed for the adaptive solution of differential equations using ANN architecture. Two major attractive features of this approach are that: 1) the developed method is flexible enough to use any of the popular transfer functions and 2) the developed method requires minimal user interaction. The latter is especially advantageous when dealing with complicated physical or computational domains.

The method of optimal incremental function approximation is formulated from the combination of various concepts utilized by computational mechanics and artificial neural networks (e.g. function approximation and error minimization, variational principles and weighted residuals, and adaptive grid optimization). The basis functions and associated coefficients of a series expansion, representing the solution, are optimally selected by a parallel direct search technique at each step of the algorithm according to appropriate criteria; the solution is built sequentially. Complicated data structures and expensive remeshing algorithms and systems solvers, which are common in conventional computational mechanics, are avoided. Computational efficiency is increased by using augmented radial bases and concurrent computing. Variational principles are utilized for the definition of the objective function to be extremized in the associated optimization problems, ensuring that the problem is well-posed. Numerical results and convergence rates are reported for steady-state problems, including the nonlinear compressible Euler equations.
Acknowledgments
The author would like to acknowledge the work of research assistants Alvaro
Fernandez and Michael Kokkolaras. Also, Dr. Robert Shelton of the NASA
Johnson Space Center is gratefully acknowledged for his guidance and assistance.
Special thanks are given to Prof. Bowen Loftin for access to the University of
Houston IBM SP2 parallel computer.
Contents

Abstract
Acknowledgments
List of Illustrations
List of Tables

1 Tikhonov-Based Integration of Theoretical, Experimental and Computational Mechanics
  1.1 Introduction
  1.2 The Tikhonov-Based Integration Method
  1.3 Interpolation Methods for Experimental Data Analysis
  1.4 Artificial Neural Networks
    1.4.1 Artificial Neural Networks As a General Approximation Tool
    1.4.2 Network Modelling As An Ill-Posed Problem
  1.5 A Method for Data Analysis: Regularization by A-Priori Mathematical Models
    1.5.1 Analytical Approximation of Regularized Mechanical Systems
    1.5.2 Numerical Approximation of Regularized Mechanical Systems
  1.6 Example Problem
    1.6.1 Results from the Analytical Approximation
    1.6.2 Results from the Numerical Approximation
  1.7 Nonlinear A-Priori Models
  1.8 Implementation Concerns

2 Optimal Incremental Approximation: Background
  2.1 The Method of Weighted Residuals
  2.2 Variational Principle Methods
  2.3 Adaptive Grid Optimization
  2.4 Artificial Neural Networks

3 Optimal Incremental Approximation: Approach
  3.1 The Proposed Algorithm
  3.2 The Parallel Direct Search Optimization Technique
  3.3 The Iterative Character of the Proposed Algorithm
  3.4 Function Approximation
    3.4.1 Implementation
    3.4.2 Numerical Examples

4 Linear Differential Equations
  4.1 Boundary Conditions
  4.2 Linear Nonself-Adjoint Problems
    4.2.1 Numerical Example: The Convection-Diffusion Equation
    4.2.2 Computational Cost

5 The Euler Equations
  5.1 The Velocity Potential Equation
  5.2 The Variational Integral
  5.3 The Computational Domain
  5.4 Boundary Conditions
  5.5 Numerical Procedure and Results
    5.5.1 Numerical Example: M∞ = 0.4
    5.5.2 Numerical Example: M∞ = 0.43
    5.5.3 Computational Work

6 Conclusions and Future Work
  6.1 Conclusions
    6.1.1 Objectives
    6.1.2 Approach
    6.1.3 Numerical Examples
    6.1.4 Optimization and Bases
  6.2 Future Work
    6.2.1 Generalization of the Proposed Method
    6.2.2 Utilizing the Method of Weighted Residuals
    6.2.3 Increasing Computational Efficiency
    6.2.4 Compact Bases

Appendix A
  A.1 Norms, Inner Products, and Orthogonality
  A.2 B-splines and Gaussian Functions
  A.3 Members of the Method of Weighted Residuals
  A.4 Classification of Differential Operators
  A.5 PDS Computational Issues
    A.5.1 The Formulation of the Optimization Variables
    A.5.2 Scaling
  A.6 Diagonal Dominance and Spectral Radius
  A.7 Numerical Integration

Bibliography
Illustrations

1.1 Extrapolation from insufficient data for λ = 0.01; (solid) the mechanical system F(x), (dash) the regularized approximation f~(x), (dash-dot) the numerical approximation f_a(x).
1.2 Interpolation of data with noise for optimal λ; (solid) F(x), (dash) f~(x), (dash-dot) f_a(x).
3.1 Convergence rates for the function approximation problems: (a) the one-dimensional case and (b) the two-dimensional case.
4.1 One-dimensional convection-diffusion: exact and approximate solutions for (a) = 5 and (b) = 20 using 16 and 9 optimal augmented radial basis functions, respectively.
4.2 One-dimensional convection-diffusion: (a) exact and approximate solutions for different grid spacings and cell Reynolds numbers and (b) exact and approximate solutions in the presence of an upwind scheme; = 20, plots reproduced from reference [26].
4.3 One-dimensional convection-diffusion: convergence rates for (a) = 5 and (b) = 20.
4.4 One-dimensional convection-diffusion: convergence-rate comparison considering the Petrov-Galerkin and Rayleigh-Ritz methods; = 20.
4.5 One-dimensional convection-diffusion: absolute value of the coefficients c_i for = 20.
5.1 Compressible inviscid flow: (top) the physical (x-y-plane) and (bottom) computational (r-θ-plane) domains.
5.2 Compressible inviscid flow: the approximate velocity potential auxiliary function (r, θ) for M∞ = 0.4 using 15 optimal basis functions.
5.3 Compressible inviscid flow: the approximate velocity potential auxiliary function (r, θ) at different radii (on and near the cylinder surface) for M∞ = 0.4 using 15 optimal basis functions.
5.4 Compressible inviscid flow: nondimensionalized (curve 1) fluid velocity and (curve 2) sonic speed on the cylinder surface for M∞ = 0.4.
5.5 Compressible inviscid flow: (top) pressure coefficient distribution and (bottom) nondimensionalized density on the cylinder surface for M∞ = 0.4.
5.6 Compressible inviscid flow, M∞ = 0.4: (top) local Mach number distribution over the entire flow field and (bottom) Mach-isolines in a region close to the cylinder surface.
5.7 Compressible inviscid flow: the approximate velocity potential auxiliary function (r, θ) for M∞ = 0.43 using 15 optimal basis functions.
5.8 Compressible inviscid flow: the approximate velocity potential auxiliary function (r, θ) at different radii (on and near the cylinder surface) for M∞ = 0.43 using 15 optimal basis functions.
5.9 Compressible inviscid flow: nondimensionalized (curve 1) fluid velocity and (curve 2) sonic speed on the cylinder surface for M∞ = 0.43.
5.10 Compressible inviscid flow: (a) pressure coefficient distribution and (b) nondimensionalized density on the cylinder surface for M∞ = 0.43.
5.11 Compressible inviscid flow, M∞ = 0.43: (top) local Mach number distribution over the entire flow field and (bottom) Mach-isolines in a region close to the cylinder surface.
A.1 B-splines: (a) a B1-spline and (b) a B3-spline.
A.2 Radial bases: (a) a Gaussian function with center = 1.5 and radius = 0.5 and (b) an augmented radial basis with both radii = 10^-6 and centers = 2.5 and -2.5.
Tables

4.1 Amount of computational work required by PDS for the linear convection-diffusion problem using B1-splines (= 5).
4.2 Amount of computational work required by PDS for the linear convection-diffusion problem using augmented radial basis functions (= 5, d = 1000, and tol = 10^-4).
4.3 Amount of computational work required by PDS for the linear convection-diffusion problem using augmented radial basis functions (= 20, d = 1000, and tol = 10^-4).
5.1 Amount of computational work required by PDS for the Euler equation problem using augmented radial basis functions for d = 1120 and tol = 10^-4 (M∞ = 0.4).
5.2 Amount of computational work required by PDS for the Euler equation problem using augmented radial basis functions for d = 1120 and tol = 10^-4 (M∞ = 0.43).
Chapter 1

Tikhonov-Based Integration of Theoretical, Experimental and Computational Mechanics

1.1 Introduction
The goal of engineering analysis is to obtain a comprehensive description of a physical system of interest. Three approaches to this goal exist: 1) theoretical analysis of mathematical models, 2) physical experimentation and 3) computational mechanics [18]. None of these approaches is superior and they must be used in combination. For example, ideally, experimental measurements should provide the most adequate description of a physical system. However, from an engineering standpoint, experimental studies lack the needed generality since measurements only describe the system behavior under a specific combination of load and system parameters. Also, errors in experimental measurements can induce a bias in the evaluation of the physical system. This may conceal important features in the system behavior. Most importantly, experimental measurements of complex physical systems may require significant, if not prohibitive, material and labor costs; describing complex engineering systems using sparse experimental data is an important and challenging engineering problem.

Alternatively, utilizing only mathematical models for engineering analysis can yield general results which can then provide the foundation for a parametric study. However, even the most sophisticated mathematical model can only approximate the physical system. In this regard, the fidelity of mathematical models and their applicability for describing system behavior ultimately requires model validation
using experimental measurements. In addition, the majority of sophisticated mathematical models do not have analytical solutions and must be simplified (low-fidelity) or solved by high-speed digital computers combined with accurate numerical algorithms. Unfortunately, even with the use of numerical computation only a limited number of parameter combinations can be investigated and, like the experimental approach, the theoretical approach loses its generality. In fact, computational studies are often referred to as numerical experiments [81].
Though it is known in the engineering community that successful analyses rest upon the proper balance of all three approaches, it is noted that few attempts have been made in the literature to unite experimental, theoretical and numerical methods. It is the overall objective of this report to address this by developing a method which effectively combines all available information, from both experimental data and mathematical models, in the emulation of physical systems. To accomplish this in a practical sense, it is the specific objective of this report to develop a method in which the combined information can be approximated by artificial neural network (ANN) architectures.

Chapter 1 of this report discusses the development of the integration method and includes background material and numerical examples. Section 1.8 discusses the issues that must be addressed if the integration method is to be applied to engineering problems. As a result of this discussion, a novel numerical approximation method (optimal sequential function approximation) is proposed and introduced in Chapters 2 and 3. The novel approximation method is then applied to the variational form of a linear nonself-adjoint differential equation in Chapter 4 and to the compressible Euler equations in Chapter 5. Conclusions are given in Chapter 6. Section 6.2 discusses generalization of the optimal sequential function approximation, when a variational form is not available, utilizing the method of weighted residuals.
1.2 The Tikhonov-Based Integration Method
The proposed integration method can be interpreted as the combination of experimental and computational mechanics by which the theoretical response of a physical system is "fine-tuned" through available experimental data. Note that even mathematical models of low physical fidelity, while failing to provide adequate engineering accuracy, may properly capture the most significant qualitative aspects of system behavior. The proposed method efficiently "corrects" the response of low-fidelity mathematical models by using experimental data. Alternatively, the method can be interpreted as a low computational cost procedure for determining smooth response curves which can be used to interpolate sparse and scattered experimental data by physically motivated criteria. As a result, the problem of predicting (extrapolating) the mechanical system response at unmeasured coordinates can be examined. This chapter also shows that the developed method can be efficiently used to process and filter noisy measurements. The response of the mathematical models characterizes the physical system's target features, which are used to separate the serviceable part of the data from the noisy component.

The technical approach of the developed method utilizes a common tool in neural network applications, the theory of Tikhonov regularization of ill-posed problems [83]. Specifically, the problem of mathematical analysis of experimental data is treated as an ill-posed problem. Its regularization involves the introduction of additional information regarding the physical system. It is proposed to utilize a-priori mathematical models of physical systems at appropriate degrees of fidelity for regularization. The computational environment of ANNs is utilized in this regard.
The numerical examples considered in this chapter demonstrate the usefulness of
the method for engineering applications.
1.3 Interpolation Methods for Experimental Data Analysis
The amount of experimental data is always limited by the cost of material resources, instrumentation and labor. Therefore, engineering analysis of experimental data involves an approximation of the system response surface from a limited
amount of measurements. This leads to the classical problem of interpolating scattered data [28]. Specifically, the exact response of the system F(x) is approximated by the equation

    f_a(x) = Σ_k c_k φ_k(x),                                  (1.1)

where c_k are coefficients corresponding to the basis functions φ_k(x). The function f_a(x) is designed to approximately pass through the experimental data points f_e(x_i) at the coordinates x_i = (x_{1,i}, ..., x_{d,i}), where d denotes the dimensionality of the approximation problem. It is assumed that

    f_e(x_i) = F(x_i) + ε_i,                                  (1.2)

where ε_i denotes the random noise of the measurements at the coordinate x_i. Several approaches have been proposed in the literature for the construction of f_a(x).

The response surface methodology of experiment design utilizes a polynomial basis for f_a(x) which is fitted to the data by the statistical tool of linear regression [18], [7], [67], [66]. This relatively simple method of function interpolation and the statistical framework for data analysis provide for the accurate approximation of
multi-dimensional functions in local regions of interest. Also this approach is quite
useful in forming a strategy (design of experiments) for selecting data for the best
local approximation of the response surface by Eq. (1.1).
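The least-squares fit behind Eq. (1.1) is easy to sketch. The snippet below is illustrative only: the quadratic basis, sample points, noise level, and function names are invented for the example and are not taken from the report. It fits a polynomial response surface to noisy samples, in the spirit of the response surface methodology just described.

```python
import numpy as np

def fit_series(x, f_e, basis):
    """Least-squares coefficients c_k of f_a(x) = sum_k c_k phi_k(x), Eq. (1.1)."""
    A = np.column_stack([phi(x) for phi in basis])   # design matrix A[i, k] = phi_k(x_i)
    c, *_ = np.linalg.lstsq(A, f_e, rcond=None)
    return c

def f_a(x, c, basis):
    """Evaluate the series expansion of Eq. (1.1)."""
    return sum(ck * phi(x) for ck, phi in zip(c, basis))

# Quadratic response surface fitted to noisy samples of F(x) = 1 + 2x + 3x^2,
# with the data model of Eq. (1.2): f_e(x_i) = F(x_i) + noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
f_e = 1.0 + 2.0 * x + 3.0 * x**2 + 0.01 * rng.standard_normal(x.size)
basis = [lambda t: np.ones_like(t), lambda t: t, lambda t: t**2]
c = fit_series(x, f_e, basis)
```

With noise-free data this recovers the exact coefficients; with small noise the recovered c stays close to (1, 2, 3).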
The finite element based method of data interpolation is based on the idea of using continuous, or continuously differentiable, local bases placed on a triangulation of the region of approximation [28].
The radial basis function interpolation of scattered data is an increasingly popular method for data fitting [28], [21], [20]. Though this approach incorporates "global" basis functions, the approximation

    f_a(x) = Σ_k c_k φ( ‖x − θ_k‖ / ω_k )                     (1.3)

can be efficiently computed for large data sets [21]. Here, ‖·‖ represents the Euclidean norm in the space R^d, and c_k, θ_k, and ω_k are parameters to be evaluated. Further, the data interpolation procedure for certain radial bases can be seen, in a Hilbert space setting [87], as the minimization of the functional

    Φ[f_a(x)] = ∫_{R^d} |f̂_a(ξ)|² / Ĝ(‖ξ‖) dξ                 (1.4)

with the constraint f_a(x_k) = f_e(x_k); here ξ is a dummy variable and the hat denotes the Fourier transform of the corresponding quantities. The term Ĝ(‖ξ‖) is a positive function such that

    Ĝ(‖ξ‖) → 0 as ‖ξ‖ → ∞.                                   (1.5)

Note that, in this case, the function Ĝ(‖ξ‖)^{-1} can be interpreted as a high-pass filter which increases the contribution of the high frequency components of the approximation f_a(x) in Eq. (1.4). Therefore, the kernel Ĝ(‖ξ‖) with the conditions of Eq. (1.5) induces a penalty for a non-smooth interpolation of data.
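As an illustration of the radial basis interpolation of Eq. (1.3), the sketch below builds a one-dimensional Gaussian-basis interpolant with one basis function centered at each data point and a shared radius, so that the constraint f_a(x_k) = f_e(x_k) is enforced exactly by a linear solve. The data, radius value, and function names are assumptions made for the example, not the report's.

```python
import numpy as np

def gaussian(r, sigma):
    # Gaussian radial basis, cf. Eq. (1.7): phi(r) = exp(-r^2 / (2 sigma^2))
    return np.exp(-r**2 / (2.0 * sigma**2))

def rbf_interpolate(x_data, f_data, sigma):
    """Solve for the c_k of Eq. (1.3), with one basis centered at each data
    point and a shared radius sigma, so that f_a(x_k) = f_e(x_k) exactly."""
    r = np.abs(x_data[:, None] - x_data[None, :])    # pairwise distances
    c = np.linalg.solve(gaussian(r, sigma), f_data)  # interpolation system
    def f_a(x):
        return gaussian(np.abs(np.asarray(x)[:, None] - x_data[None, :]), sigma) @ c
    return f_a

x_data = np.linspace(0.0, 1.0, 9)
f_data = np.sin(2.0 * np.pi * x_data)
f_a = rbf_interpolate(x_data, f_data, sigma=0.2)
```

The interpolant reproduces the data exactly and varies smoothly between the data points, which is precisely the behavior the smoothness penalty of Eqs. (1.4)-(1.5) rewards.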
The Hilbert space framework of scattered data interpolation was first proposed by Duchon [19] and Meinguet [62], who derived multi-dimensional spline bases. A more general class of semi-spline radial functions [74], [29] can be obtained by using the smoothing functional

    Φ[f_a(x)] = Σ_{m=0}^{∞} a_m ‖O^m f_a(x)‖²_{L₂(R^d)}
              = Σ_{m=0}^{∞} a_m Σ_{i₁,...,i_m} ∫_{R^d} |∂_{i₁,...,i_m} f_a(x)|² dx,      (1.6)

where

    ∂_{i₁,...,i_m} = ∂^m / (∂x_{i₁} ... ∂x_{i_m}).

As a result, data interpolation with the Gaussian function

    φ(x) = e^{−‖x‖²/(2σ²)}                                    (1.7)

is associated with the kernel

    Ĝ(‖ξ‖) = ( Σ_{m=0}^{∞} (σ^{2m} / (m! 2^m)) ‖ξ‖^{2m} )^{-1} = e^{−σ²‖ξ‖²/2};

the Gaussian radial basis is used in the numerical examples of this chapter.
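The closed form of this kernel can be checked numerically: the series in the parentheses is the exponential series in σ²‖ξ‖²/2, so its reciprocal should equal e^{−σ²‖ξ‖²/2}. A small sketch (the parameter values are arbitrary):

```python
import math

def kernel_series(xi_norm, sigma, terms=60):
    """Reciprocal of the partial sum of sum_m sigma^(2m) ||xi||^(2m) / (m! 2^m),
    i.e. the kernel G-hat associated with the Gaussian basis."""
    s = sum((sigma**2 * xi_norm**2 / 2.0)**m / math.factorial(m) for m in range(terms))
    return 1.0 / s

sigma, xi = 1.3, 2.0
g_series = kernel_series(xi, sigma)                     # series form
g_closed = math.exp(-(sigma**2) * xi**2 / 2.0)          # closed form
```

The two agree to machine precision, and both decay as ‖ξ‖ grows, which is the high-frequency penalty condition of Eq. (1.5).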
With regard to the preceding discussion, it is noted that few studies in spline curve-fitting have been devoted to establishing appropriate criteria for the selection of the smoothing functional Φ[f_a(x)]. Often this functional is selected based on considerations of computational efficiency rather than on the study of the engineering system. Several authors have argued that the choice of the smoothness criterion should rely on knowledge of the underlying system that generated the data. A probabilistic foundation for the selection of the interpolating bases of Eq. (1.1) was proposed by Matheron [55], [56], who developed the kriging method of statistical data analysis. In this method the response function F(x) is assumed to be a realization of a homogeneous random field consistent with the experimental measurements. The approximation function f_a(x) is then selected as the best linear unbiased estimate of the random field F(x) based on the observation data f_e(x_i). Note that, in this case, the auto-correlation function describing the second order properties of the random field becomes the basis function of Eq. (1.1). That is,

    f_a(x) = Σ_k c_k R_f(x − x_k),                             (1.8)

where R_f(x) denotes the auto-correlation function specifying the a-priori probability structure of the response function F(x). Similar results were obtained by Kimeldorf and Wahba [48], who showed that generalized splines correspond to a special form of the a-priori information with respect to the spectral density function of the random field F(x). Specifically,

    S_f(ξ) = Ĝ(‖ξ‖),                                           (1.9)

where S_f(ξ) is the spectral density function assigned to the response function F(x).

Obviously this stochastic approach for selecting the smoothness requirements on f_a(x) may be fully justified only in the case when the response function can be realistically treated as a realization of a random field. Nevertheless, Sacks et al. [81] and Welch et al. [88] insisted on the application of statistical procedures, even when the response function is deterministic, as in the case of computer experiments. The statistical framework introduced for the analysis of deterministic data was proven to be useful in yielding an accurate approximation of functions based only on a few generated data points.

The last data interpolation method to be discussed, and the focus of this text, is the use of ANN systems, which have drawn significant interest in engineering applications for data-driven emulation and control of engineering systems. The computational environment of ANN systems will be adapted in this report for integrating experimental data and mathematical models.
1.4 Artificial Neural Networks

1.4.1 Artificial Neural Networks As a General Approximation Tool
Any approximation algorithm that uses a combination of basis functions and can be mapped into a graph-directed representation can be treated as an artificial neural network [74]. In this context, the response of a feedforward neural network (FFANN) can be seen as a general scheme of experimental data approximation and is defined by the following equation [11], [41], [43], [63]:

    f_a(x; θ) = Σ_i θ_{k,i} ψ_{k−1}( ... ψ_1( Σ_{l=1}^{d} θ_{1,l} x_l ) ... ),      (1.10)

where x = (x_1, ..., x_d) is the d-dimensional input of the ANN system, the vector θ represents the set of network parameters, and ψ_i is the non-linear transfer function of the neurons associated with the i-th intermediate layer of the network. In Eq. (1.10), the value k − 1 can be thought of as the number of intermediate neuron layers in a graph-directed representation of the ANN system. A single hidden layer of neurons is commonly selected in network applications unless a problem-related consideration indicates that several intermediate layers can have a beneficial impact on the overall behavior of the network. In this regard, sigmoidal transfer functions have been thoroughly investigated in the connectionist literature because of their resemblance to the response of biological neurons [11], [40], [41], [43], [89]. Another class of networks popular in applications is the radial basis network (RBN), whose response can be expressed by Eq. (1.3) [74], [30], [8], [72], [75]. Note that an RBN system can also be constructed as a two-hidden-layer FFANN with transfer functions ψ_1(η) = η² and ψ_2(η) = φ(√η).
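For concreteness, the single-hidden-layer case of the FFANN response of Eq. (1.10) can be sketched as below. The weight shapes, the sigmoid transfer function, and all names are illustrative choices for the example, not a prescription from the report.

```python
import numpy as np

def sigmoid(z):
    # A common sigmoidal transfer function psi.
    return 1.0 / (1.0 + np.exp(-z))

def ffann_response(x, W1, w2):
    """Single-hidden-layer case of Eq. (1.10): the d inputs are combined
    linearly by the input-to-hidden weights, passed through the transfer
    function, and combined linearly by the output weights."""
    return sigmoid(x @ W1) @ w2

rng = np.random.default_rng(1)
d, hidden = 2, 5
W1 = rng.standard_normal((d, hidden))   # weights theta_{1,l} into the hidden layer
w2 = rng.standard_normal(hidden)        # linearly entering output weights
x = rng.standard_normal((4, d))         # a batch of four d-dimensional inputs
y = ffann_response(x, W1, w2)
```

Note that the response depends linearly on the output weights w2 but non-linearly on W1, a distinction the text returns to below.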
The training of an ANN system can be formulated as the procedure of selecting the parameter vector θ so that the response of the neural network is "close" to the available data. Often, the network parameters are found by minimizing the following error criterion (training objective function):

    ε(θ) = Σ_{i=1}^{N} [ f_e(x_i) − f_a(x_i; θ) ]².            (1.11)

In terms of the connectionist literature, numerical minimization of this error criterion by the steepest descent method constitutes the backpropagation algorithm [79]. The conjugate gradient method, the Levenberg-Marquardt method, and other nonlinear optimization methods have also been used for network training [80].
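A minimal sketch of this training procedure follows, assuming a one-input, single-hidden-layer sigmoid network and plain steepest descent on the criterion of Eq. (1.11); the network size, learning rate, seed, and training data are all invented for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def criterion(f_e, f_a):
    # Training objective of Eq. (1.11): sum of squared deviations at the data.
    return float(np.sum((f_e - f_a) ** 2))

def train(x, f_e, hidden=10, lr=0.001, steps=20000, seed=3):
    """Steepest descent on the criterion of Eq. (1.11); descending this
    criterion by steepest descent is what the connectionist literature
    calls backpropagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((1, hidden))   # input-to-hidden weights
    b1 = rng.standard_normal(hidden)        # hidden-layer biases
    w2 = rng.standard_normal(hidden)        # output weights
    X = x[:, None]
    history = []
    for _ in range(steps):
        h = sigmoid(X @ W1 + b1)            # hidden-layer activations
        f_a = h @ w2                        # network response
        history.append(criterion(f_e, f_a))
        r = f_a - f_e                       # residual at each data point
        grad_pre = 2.0 * r[:, None] * w2 * h * (1.0 - h)  # error backpropagated to hidden layer
        w2 -= lr * 2.0 * h.T @ r
        W1 -= lr * X.T @ grad_pre
        b1 -= lr * grad_pre.sum(axis=0)
    return history

x = np.linspace(-1.0, 1.0, 40)
f_e = x**2                                  # noise-free training data
history = train(x, f_e)
```

The recorded history shows the criterion decreasing substantially from its value at the random initial parameters.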
However, it has been found that even the most advanced optimization algorithms can show poor convergence in training. Often, this phenomenon is called overtraining [37]. Further, sparse data can also cause difficulties in network modelling of complex mechanical systems. These difficulties become pronounced in modelling systems in which experimental data is relatively expensive to obtain [38]. In this regard, artificially augmenting the available data by generating new training examples through the introduction of small random noise has been advocated in the literature [38]. Also, experimental data is often corrupted by measurement noise. Minimization of the training criterion of Eq. (1.11), for a sufficiently large network, yields a response function f_a(x; θ) which can be over-sensitive to the noisy details of the training data, resulting in poor network performance. Often, this phenomenon is called overfitting [71].
1.4.2 Network Modelling As An Ill-Posed Problem

The problems of network overtraining, inadequacy of training examples, and overfitting are in fact closely related. They can be viewed as a result of the network training being an ill-posed problem [83]. The ill-posed features of ANN models can be deduced from a number of difficulties that prevent ANNs from becoming a readily available engineering tool for emulating mechanical systems.

The network modelling of physical systems by minimizing Eq. (1.11) can only be justified when one can show that a network which realizes the minimum of Eq. (1.11) actually exists. This leads to concerns regarding the existence of the best network models [30], [89], [61]. Note that the non-existence of the best network approximation can be used, in some cases, as a rational explanation for the effect of overtraining.
Network modelling from sparse data is an ill-posed problem, since training is not capable of producing a unique solution in regions lacking data [74]. This leads to establishing sensible criteria for network-based extrapolation from examples. Moreover, even for adequate data the non-uniqueness phenomenon can result from the non-linear character of the network approximation. For example, the network response f_a(x; θ) depends linearly on only a few parameters, shown in Eq. (1.10) as θ_{k,i}. All remaining parameters affect the network response in a non-linear manner. In this regard, some aspects of the uniqueness problem in network modelling are discussed in references [30], [89] and [61].

As alluded to previously, overfit of noisy training examples by neural networks is a major concern in network applications. This phenomenon can be associated with the instability of the standard training procedures with respect to the data noise [71].

Finally, Saarinen et al. [80] demonstrated that overtraining is inherent to the internal architecture of multi-layered FFANN systems, which yields ill-conditioned Jacobian and Hessian matrices. As a result of the highly parallel and redundant structure, neural networks with different parameter values are able to produce very similar responses.

Clearly, network modelling can be improved by imposing additional constraints on the network parameters and/or by appropriately changing the training objective function of Eq. (1.11) and improving the conditioning of the Jacobian and Hessian matrices. This also may eliminate local minima in the objective function. The most common method of imposing ANN parameter constraints, suitable for numerical computations, is the theory of Tikhonov regularization of ill-posed problems [83].
1.5 A Method for Data Analysis: Regularization by A-Priori Mathematical Models
The popular Tikhonov regularization method [83] was originally adopted for ANN systems by Poggio and Girosi [74] and has been used extensively in ANN applications [29], [6]. In this case a set of constraints on the network approximation f_a(x; θ) is introduced through the Tikhonov regularization functional Φ[f_a(x; θ)], which specifies the penalty given to the oscillatory behavior of f_a(x; θ); the network parameters can be determined by minimizing the objective function

    ε_λ(f_a) = Σ_{i=1}^{N} [ f_e(x_i) − f_a(x_i; θ) ]² + λ Φ[f_a(x; θ)].      (1.12)
Unfortunately, conventional smoothness based Tikhonov regularization of Poggio
and Girosi may not be consistent with the smoothness required in emulating a
physical system and so would be unable to properly approximate with noisy experimental data. Instead, this report submits that the regularization should be
based on an a-priori mathematical model of some level of delity. This regularized
model can then be approximated by more conventional computational methods.
A large class of mechanics problems can be described by the equations
L [f0 (x)] = g (x) ; x 2
and B [f0(x)] = 0 ; x 2 @
;
(1.13)
where L[\cdot] is a linear differential operator, B[\cdot] is the boundary operator, f_0 is the solution function, and g is some forcing function. Let l\langle\cdot,\cdot\rangle be the bilinear symmetric differential form associated with the operator L, i.e. l\langle\cdot,\cdot\rangle = \langle L[\cdot],\cdot\rangle = \langle\cdot,L[\cdot]\rangle, where \langle\cdot,\cdot\rangle denotes an inner product (Appendix A.1). Typically, this functional is called the energy functional, and the associated norm \sqrt{l\langle\cdot,\cdot\rangle} is called the energy norm [50]. The solution to Eq. (1.13) can be determined by minimizing the energy form
    \Pi[f] = \frac{1}{2} l\langle f, f \rangle - \langle g, f \rangle .    (1.14)
Since for any f that satisfies the boundary operator

    l\langle f_0, f \rangle = \langle g, f \rangle ,    (1.15)
then substituting Eq. (1.15) into Eq. (1.14) yields

    \Pi[f] = \frac{1}{2} l\langle f - f_0, f - f_0 \rangle - \frac{1}{2} l\langle f_0, f_0 \rangle = \frac{1}{2} l\langle f - f_0, f - f_0 \rangle + \text{constant} .

From a physical perspective, the functional \Pi[f] penalizes the mechanical energy of the discrepancy between the function f and the response of a mathematical model f_0.
With the use of \Pi[f] and the energy functional of Eq. (1.14), regularized modelling can be formulated as the minimization of the objective function

    E(f) = \sum_{i=1}^{N} |f_e(x_i) - f(x_i)|^2 + \lambda \left( \frac{1}{2} l\langle f, f \rangle - \langle g, f \rangle \right) ,    (1.16)

which combines information from experimental data and the a-priori mathematical model. Note that \lambda represents the degree of reliance on the a-priori mathematical model of the physical system relative to the reliability of the experimental data.
1.5.1 Analytical Approximation of Regularized Mechanical Systems
The formation of a regularized model of a mechanical system is identical to approximating the hypersurface \tilde{f}(x) that gives the global minimum of Eq. (1.16). In this context, analytical study of \tilde{f}(x) is essential for evaluating the properties of the regularized model.
Because \tilde{f}(x) is the global minimum of Eq. (1.16), it satisfies

    L[\tilde{f}] = g + \frac{2}{\lambda} \sum_{i=1}^{N} \left( f_e(x_i) - \tilde{f}(x_i) \right) \delta(x - x_i) , \; x \in \Omega    (1.17)

and

    B[\tilde{f}] = 0 , \; x \in \partial\Omega .

From Eq. (1.17) it can be deduced that \tilde{f} contains two components

    \tilde{f}(x) = f_0(x) + f_{cor}(x) .    (1.18)
These components can be determined by the equations

    f_0(x) = \int_{\Omega} G(x, \xi) \, g(\xi) \, d\xi

and

    f_{cor}(x) = \frac{2}{\lambda} \sum_{i=1}^{N} \left( f_e(x_i) - \tilde{f}(x_i) \right) G(x, x_i) ,    (1.19)

where the Green's function G(x, \xi) satisfies the boundary value problem

    L[G(x, \xi)] = \delta(x - \xi) , \; x \in \Omega    and    B[G(x, \xi)] = 0 , \; x \in \partial\Omega .

Substituting Eq. (1.19) into Eq. (1.18) yields
    \tilde{f}(x) = f_0(x) + \sum_{i=1}^{N} c_i \, G(x, x_i) ,    (1.20)

where the coefficients are defined as

    c_j = \frac{2}{\lambda} \left( f_e(x_j) - \tilde{f}(x_j) \right) .    (1.21)
The coefficients can be evaluated by substituting Eq. (1.20) into Eq. (1.21), resulting in

    c_j = \frac{2}{\lambda} \left( f_e(x_j) - f_0(x_j) - \sum_{i=1}^{N} c_i \, G(x_j, x_i) \right) , \; j = 1, \ldots, N ,    (1.22)

with which c can be isolated. With the substitution of c into Eq. (1.20), \tilde{f} can be written as
    \tilde{f}(x) = f_0(x) + \sum_{i=1}^{N} G(x, x_i) \left[ \left( \mathbf{G} + \frac{\lambda}{2} \mathbf{I} \right)^{-1} (\mathbf{f}_e - \mathbf{f}_0) \right]_i .    (1.23)
It is noted that the matrix \mathbf{G} of Eq. (1.23) is positive definite, since for an arbitrary function \phi(x) we can write

    \int \int G(x, \xi) \phi(\xi) \phi(x) \, d\xi \, dx = \int \psi(x) \phi(x) \, dx = \int \psi(x) L[\psi(x)] \, dx = l\langle \psi, \psi \rangle ,

where \psi(x) is the solution to Eq. (1.13) for g(x) = \phi(x) and satisfies the boundary condition. Therefore, since the quadratic form l\langle\cdot,\cdot\rangle is strictly elliptic,
    \int \int G(x, \xi) \phi(\xi) \phi(x) \, d\xi \, dx \geq \gamma \, \| \psi \|^2_{H^s(\Omega)} > 0 .    (1.24)

Substituting \phi(x) = \sum_{i=1}^{N} c_i \delta(x - x_i) into Eq. (1.24) yields

    \int \int G(x, \xi) \phi(\xi) \phi(x) \, d\xi \, dx = \mathbf{c}^T \mathbf{G} \mathbf{c} > 0 .
Because \mathbf{G} is a positive definite matrix, the matrix (\mathbf{G} + \lambda \mathbf{I}/2) is also positive definite for any positive value of the parameter \lambda. As a result, the vector \mathbf{c} can be evaluated in a computationally stable manner.
Since the range of the regularizing parameter \lambda is the set of all positive numbers, it is prudent to obtain qualitative results for the limits of \lambda. Equation (1.23) indicates that as \lambda \to \infty then \tilde{f}(x) \to f_0(x), as expected, since with \lambda \to \infty the experimental data become less relevant and \tilde{f} emulates more of the a-priori mathematical model.
For \lambda \to 0, the a-priori model of the mechanical system is assumed to be increasingly unreliable compared to the experimental data. Equation (1.23) gives

    \tilde{f}(x) \to f_0(x) + \sum_{i=1}^{N} G(x, x_i) \left[ \mathbf{G}^{-1} (\mathbf{f}_e - \mathbf{f}_0) \right]_i ,

so that the regularized solution interpolates the data f_e(x_j) exactly, since

    f_0(x_j) + \sum_{i=1}^{N} G(x_j, x_i) \left[ \mathbf{G}^{-1} (\mathbf{f}_e - \mathbf{f}_0) \right]_i = f_0(x_j) + \sum_{i=1}^{N} \delta_{ji} (\mathbf{f}_e - \mathbf{f}_0)_i = f_e(x_j) .

As a result, the a-priori mathematical model of the mechanical system does not contribute to \tilde{f} in the region of adequate data. However, the mathematical model is not eliminated from the modelling procedure; it is used by \tilde{f} to extrapolate the data to the regions lacking data. In an alternate perspective, as \lambda \to 0 the function \tilde{f} can be thought of as a response of the a-priori model "attached" to the data points.
1.5.2 Numerical Approximation of Regularized Mechanical Systems
The radial basis expansion used to approximate \tilde{f} can be described by Eq. (1.3) with

    \varphi(\Delta) = e^{-\Delta^2 / \sigma^2} .    (1.25)
Since the basis expansion of Eq. (1.3) does not satisfy the boundary conditions of the mathematical model for arbitrary \alpha, the objective function of Eq. (1.16) is augmented to incorporate the boundary conditions. Specifically,

    E(f_a) = \sum_{i=1}^{N} \left[ f_e(x_i) - f_a(x_i; \alpha) \right]^2 + \lambda \left( \frac{1}{2} l\langle f_a, f_a \rangle - \langle g, f_a \rangle \right) + b\langle f_a, q \rangle ,    (1.26)

where b\langle\cdot,\cdot\rangle may be defined as b\langle f, q \rangle = \int_{\partial\Omega} q(\xi) B[f(\xi)] \, d\xi.
Preselecting the distribution and localization properties of the radial basis functions, Eq. (1.26) is minimized with respect to the remaining unknowns, c_k, resulting in
    \sum_{j=1}^{M} \left( \lambda \, l\langle \varphi_j(x), \varphi_k(x) \rangle + 2 \sum_{i=1}^{N} \varphi_j(x_i) \varphi_k(x_i) \right) c_j + b\langle \varphi_k(x), q \rangle = \lambda \langle g, \varphi_k(x) \rangle + 2 \sum_{i=1}^{N} f_e(x_i) \varphi_k(x_i) , \quad k = 1, \ldots, M ,    (1.27)

and

    \sum_{j=1}^{M} c_j \, b\langle \varphi_j(x), p \rangle = 0 ,    (1.28)

where p and q belong to an appropriate finite dimensional subspace. Equations (1.27) and (1.28) are known as the Galerkin method [26]. This observation provides a useful link between the proposed method of data analysis and methods in computational mechanics.
1.6 Example Problem
Consider a one-dimensional engineering system governed by the differential equation

    -\frac{d}{dx} \left( a(x) \frac{dF(x)}{dx} \right) = x , \; x \in (0, 1) , \quad F(0) = F(1) = 0 .
For this example problem the value of a = 1 is selected for the exact system. An a-priori mathematical model with a_0 = 1/3 is used along with the regularizing functional

    \Phi[f] = \frac{1}{2} \int_0^1 a_0 \left( \frac{df}{dx} \right)^2 dx - \int_0^1 x f \, dx = \frac{1}{2} \int_0^1 a_0 \left( \frac{df}{dx} - \frac{df_0}{dx} \right)^2 dx + \text{constant} .    (1.29)
Regularization by the functional of Eq. (1.29) suppresses oscillations of the experimental data that are dissimilar to the oscillatory behavior of the a-priori model. Consequently, the approximation of a regularized mechanical system should filter the data noise.
1.6.1 Results from the Analytical Approximation
The experimental data for the extrapolation study are obtained from the physical system response at x = 0.25 and x = 0.75, which are shown as triangles in Fig. 1.1. The optimal curve \tilde{f}, regularized by the a-priori mathematical model, is evaluated from Eq. (1.23) and is shown in Fig. 1.1 for the regularization parameter \lambda = 0.01. By selecting values of \lambda, Eq. (1.23) acts as a penalty method. The optimal curve passes through the data points, while the curvature in the region between data points matches the curvature of the a-priori mathematical model. This can be shown by noting that the function f_{cor}(x) is a piece-wise linear function in the regions between data points.
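The extrapolation curve of Fig. 1.1 can be reproduced directly from Eq. (1.23). The sketch below is a minimal reconstruction, not the report's code; it assumes the Green's function of the a-priori operator -a_0 d^2/dx^2 with Dirichlet conditions, G(x, xi) = x(1 - xi)/a_0 for x <= xi, together with the model response f_0(x) = (x - x^3)/(6 a_0) and the exact physical response F(x) = (x - x^3)/6, all of which follow from the stated example problem.

```python
A0 = 1.0 / 3.0        # a-priori model coefficient a0 = 1/3
LAM = 0.01            # regularization parameter lambda

def G(x, xi):
    """Green's function of -a0 u'' = delta(x - xi), u(0) = u(1) = 0."""
    lo, hi = min(x, xi), max(x, xi)
    return lo * (1.0 - hi) / A0

def f0(x):            # a-priori model response: -a0 f0'' = x
    return (x - x ** 3) / (6.0 * A0)

def F(x):             # exact physical response (a = 1)
    return (x - x ** 3) / 6.0

xs = [0.25, 0.75]                 # measurement locations
fe = [F(x) for x in xs]           # noise-free "experimental" data

# Solve the 2x2 system (G + (lambda/2) I) c = fe - f0 of Eq. (1.23).
g11 = G(xs[0], xs[0]) + LAM / 2
g12 = G(xs[0], xs[1])
g22 = G(xs[1], xs[1]) + LAM / 2
r1, r2 = fe[0] - f0(xs[0]), fe[1] - f0(xs[1])
det = g11 * g22 - g12 * g12
c = [(g22 * r1 - g12 * r2) / det, (g11 * r2 - g12 * r1) / det]

def f_tilde(x):
    """Regularized curve of Eq. (1.23)."""
    return f0(x) + sum(ci * G(x, xi) for ci, xi in zip(c, xs))
```

With lambda = 0.01 the curve nearly interpolates the two data points (the discrepancy at x = 0.25 and x = 0.75 is below 10^-3), while away from the data it relaxes toward the a-priori response, consistent with the penalty interpretation above.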
As previously mentioned, the use of Eq. (1.29) for regularization should filter data noise. Noisy experimental data were generated for this study by adding random noise \epsilon to the response of the physical system; the noise was statistically independent of location and uniformly distributed between \pm d/2, where d = 0.025, with variance Var[\epsilon] = d^2/12. The generated data points are displayed as triangles in Fig. 1.2. Unlike the extrapolation example, the value of the regularization parameter \lambda was determined such that Eq. (1.23) and the constraint

    \sum_{i=1}^{N} |f_e(x_i) - f(x_i)|^2 = E\left[ \sum_{i=1}^{N} \epsilon_i^2 \right] = N \, Var[\epsilon] = \rho    (1.30)

were satisfied for f = \tilde{f}. More specifically, \lambda minimized the magnitude of the equation residual \Delta(\lambda), where
Figure 1.1: Extrapolation from insufficient data for \lambda = 0.01; (solid) the mechanical system F(x), (dash) the regularized approximation \tilde{f}(x), (dash-dot) the numerical approximation f_a(x). [Plot not reproduced; axes f(x) from 0 to 0.07 versus x from 0 to 1.]
    \Delta(\lambda) = \sum_{k=1}^{N} |f_e(x_k) - \tilde{f}(x_k)|^2 - \rho = \sum_{k=1}^{N} \left| f_e(x_k) - f_0(x_k) - \sum_{i=1}^{N} G(x_k, x_i) \left[ \left( \mathbf{G} + \frac{\lambda}{2} \mathbf{I} \right)^{-1} (\mathbf{f}_e - \mathbf{f}_0) \right]_i \right|^2 - \rho .    (1.31)
Use of Eq. (1.31) requires a-priori knowledge regarding the intensity of the random noise in the measurements, which is usually available in engineering applications. The optimal \lambda, which acts as a Lagrange multiplier, was determined through the dichotomous search technique [82]. The regularized curve of Fig. 1.2 (dash) shows that the developed method of data analysis can successfully discriminate between noise and the physical system response.
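The selection of \lambda by the discrepancy condition of Eqs. (1.30)-(1.31) can be sketched as a one-dimensional root search. The code below is an illustration under the same assumed Green's function and model responses as the example problem; the data locations, noise seed, and bracketing interval are invented for the demonstration, and a simple bisection stands in for the dichotomous search of [82].

```python
import random

A0, D = 1.0 / 3.0, 0.025
random.seed(4)

def G(x, xi):
    lo, hi = min(x, xi), max(x, xi)
    return lo * (1.0 - hi) / A0          # Green's function of -a0 u'' with u(0)=u(1)=0

def f0(x): return (x - x ** 3) / (6.0 * A0)   # a-priori model response
def F(x):  return (x - x ** 3) / 6.0          # exact physical response

xs = [0.1 * i for i in range(1, 10)]                      # N = 9 stations (invented)
fe = [F(x) + random.uniform(-D / 2, D / 2) for x in xs]   # noisy data, Var = D^2/12
rho = len(xs) * D ** 2 / 12.0                             # target of Eq. (1.30)

def solve(M, b):
    """Solve M c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(M, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for j in range(k, n + 1):
                M[r][j] -= f * M[k][j]
    c = [0.0] * n
    for k in range(n - 1, -1, -1):
        c[k] = (M[k][n] - sum(M[k][j] * c[j] for j in range(k + 1, n))) / M[k][k]
    return c

def discrepancy(lam):
    """Delta(lambda) of Eq. (1.31): data misfit of f-tilde minus rho."""
    n = len(xs)
    M = [[G(xs[j], xs[i]) + (lam / 2 if i == j else 0.0) for i in range(n)]
         for j in range(n)]
    c = solve(M, [fe[j] - f0(xs[j]) for j in range(n)])
    misfit = sum((fe[j] - f0(xs[j])
                  - sum(c[i] * G(xs[j], xs[i]) for i in range(n))) ** 2
                 for j in range(n))
    return misfit - rho

lo, hi = 1e-8, 10.0           # bracket with Delta(lo) < 0 < Delta(hi) for this data
for _ in range(60):           # bisection on the sign change
    mid = 0.5 * (lo + hi)
    if discrepancy(mid) < 0.0:
        lo = mid
    else:
        hi = mid
lam_opt = 0.5 * (lo + hi)
```

For very small \lambda the regularized curve nearly interpolates the noisy data and the misfit falls below \rho; for large \lambda the curve reverts to the (inaccurate) a-priori response and the misfit exceeds \rho, so a sign change is bracketed and the search converges.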
1.6.2 Results from the Numerical Approximation
The numerical approximation by basis expansion consisted of seven (M = 7) Gaussian radial basis functions, with the transfer function centers uniformly distributed in the segment [0,1] and the centers of the two boundary functions located at x = 0 and x = 1, respectively. The parameters \sigma_k of Eq. (1.3), describing the localization properties of the Gaussian function, were all selected to be equal to 1/6. Figure 1.1 illustrates the acceptable matching between the analytical solution of the regularization problem and the numerical approximation. Figure 1.2 shows that the performance of the basis expansion in modelling with the noisy data of Section 1.6.1 is satisfactory. In the interpolation study, \lambda was determined such that Eqs. (1.27) and (1.28) were satisfied along with Eq. (1.30) for f = f_a.
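The basis set of this study is easy to reconstruct. The sketch below sets up the seven Gaussian radial basis functions of Eq. (1.25) with \sigma_k = 1/6 and centers uniform on [0,1], but, for brevity, fits the exact response F(x) = (x - x^3)/6 by ordinary least squares on a fine grid instead of assembling the full Galerkin system of Eqs. (1.27)-(1.28); the grid size and the fitting target are choices made for this illustration, not taken from the report, and the boundary terms b<.,.> are ignored.

```python
import math

M, SIG = 7, 1.0 / 6.0
centers = [k / (M - 1) for k in range(M)]   # uniform centers; x = 0 and x = 1 included

def phi(x, xc):
    """Gaussian radial basis of Eq. (1.25)."""
    return math.exp(-((x - xc) ** 2) / SIG ** 2)

def F(x):
    return (x - x ** 3) / 6.0

xs = [i / 100.0 for i in range(101)]        # fine fitting grid (illustrative choice)

# Normal equations A c = b for the fit sum_k c_k phi(x, x_k) ~ F(x).
A = [[sum(phi(x, ci) * phi(x, cj) for x in xs) for cj in centers] for ci in centers]
b = [sum(phi(x, ci) * F(x) for x in xs) for ci in centers]

# Gaussian elimination with partial pivoting.
n = M
T = [row[:] + [bi] for row, bi in zip(A, b)]
for k in range(n):
    p = max(range(k, n), key=lambda r: abs(T[r][k]))
    T[k], T[p] = T[p], T[k]
    for r in range(k + 1, n):
        f = T[r][k] / T[k][k]
        for j in range(k, n + 1):
            T[r][j] -= f * T[k][j]
c = [0.0] * n
for k in range(n - 1, -1, -1):
    c[k] = (T[k][n] - sum(T[k][j] * c[j] for j in range(k + 1, n))) / T[k][k]

def fa(x):
    """Radial basis approximation of Eq. (1.3) with the fitted coefficients."""
    return sum(ck * phi(x, xc) for ck, xc in zip(c, centers))

err = max(abs(fa(x) - F(x)) for x in xs)
```

Even this simplified fit reproduces the smooth response closely, which is consistent with the "acceptable matching" reported for the full Galerkin computation.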
An attractive integration method based on Tikhonov regularization has been proposed and evaluated. It has been shown in this chapter that, since physical models of engineering interest can be expressed in the form of empirical, differential and/or integral equations of varying degrees of fidelity, these model equations can be incorporated in the form of a regularizing functional for a robust matching of experimental data. The integration method can be viewed as the adjustment of a low-fidelity, and computationally inexpensive, mathematical model response by experimental data. The developed integration method can also be viewed as an efficient procedure for interpolating experimental data by utilizing physically motivated criteria. Further, it has been shown how the developed method can be used to successfully extrapolate the physical system response at unmeasured coordinates and filter noisy experimental measurements. The radial basis expansion has been examined as the computational apparatus of the developed method. However, the developments of the chapter retain a sufficient level of generality and can be implemented in conjunction with any of the commonly used methods, including finite differences and conventional finite elements.

Figure 1.2: Interpolation of data with noise for optimal \lambda; (solid) F(x), (dash) \tilde{f}(x), (dash-dot) f_a(x). [Plot not reproduced; axes f(x) from -0.01 to 0.08 versus x from 0 to 1.]
1.7 Nonlinear A-Priori Models
It is apparent that, though the mathematical model can be low-fidelity and computationally inexpensive, the accuracy of the interpolation and extrapolation of experimental data is directly dependent on the accuracy of the a-priori mathematical model. The scheme must therefore be able to accommodate moderately to highly nonlinear equations. Fortunately, the integration scheme can be easily extended to physical systems described by nonlinear equations by modifying the regularization functional of Eq. (1.26) so that the objective function becomes
    E(f_a) = \sum_{i=1}^{N} \left[ f_e(x_i) - f_a(x_i; \alpha) \right]^2 + \frac{\lambda}{2} l\langle f_a - f_{num}, f_a - f_{num} \rangle + b\langle f_a, q \rangle ,    (1.32)

with

    l\langle f_a - f_{num}, f_a - f_{num} \rangle = \int_{\Omega} \sum_{m=1}^{S} \left( \frac{d^m f_a}{dx^m} - \frac{d^m f_{num}}{dx^m} \right)^2 dx ,    (1.33)

where S is the highest derivative present in the a-priori model and f_{num} is the numerical approximation of f_0 by some conventional computational mechanics method that can be performed offline. The multidimensional form of Eq. (1.33) is straightforward.
1.8 Implementation Concerns
As attractive as the Tikhonov-based integration method appears, there are problems in its implementation that are shared by all ANN applications. Of primary concern, the objective function given by Eq. (1.26), and constrained by Eq. (1.31), must be globally minimized through selection of optimal values of the coefficients c and parameters \alpha. This is a nonlinear optimization problem. As mentioned previously, there are two major approaches to improving the performance of optimization (ANN training) algorithms: 1) effective selection of the initial values of the network parameters and 2) effective introduction of network parameter constraints through the incorporation of regularizing functionals. Both approaches have been used in this chapter through the use of the integration method and the construction of radial basis functions, respectively. The minimum of Eq. (1.26) can even be controlled through appropriate preselection of the Gaussian radial basis parameters. However, numerical experiments have shown that accuracy is very sensitive to the parameters controlling the basis functions. In addition, the accuracy is also sensitive to the number of bases used. Specifically, after exceeding a certain number of uniformly distributed bases, the approximation becomes more and more inaccurate for optimal basis parameters [60]. Consequently, there is a limit to the number of ANN local bases that can be used in approximating a mechanical system.
Of secondary concern is that constraining the basis function parameters and minimizing Eq. (1.26) requires user interaction. For example, since the number of ANN local bases is limited, they should be judiciously placed. However, placement of bases is dependent on the type of basis and on familiarity with the a-priori model and experimental data, which in turn requires user interaction. Additionally, if Eq. (1.32) is used, then the a-priori model is limited by the accuracy of f_{num}. This is of special concern in complicated computational domains.
To address these concerns, a technique (optimal sequential function approximation) has been developed. This technique minimizes Eq. (1.26) in a stable and reliable fashion through problem-dependent optimal selection of c and \alpha using local minimization criteria. This approach involves relatively little user interaction and is independent of the type of basis selected by the user. It will also be shown that this technique can approximate the solution to a nonlinear partial differential equation directly and exceed the accuracy of f_{num} on a uniform grid. Background information on elements used in developing the optimal sequential function approximation method is given in Chapter 2. Details on the method are discussed in Chapter 3. To demonstrate the method, the limiting case of Eq. (1.26) for \lambda \to \infty has been minimized using the variational form of the nonlinear compressible Euler equations as the a-priori model (Chapter 5). Section 6.2.1 discusses generalization of the optimal sequential function approximation, utilizing the method of weighted residuals, when a variational form is not available.
Chapter 2
Optimal Incremental Approximation: Background
The proposed method of optimal sequential function approximation is based on a variety of techniques borrowed from distinct areas of computational methods. In particular, motivated by similarities observed in adaptive grid optimization techniques, an algorithm originated and employed in artificial neural networks for iterative function approximation has been reformulated with the help of variational principles and the closely related method of weighted residuals.
The objective of this chapter is to review the previously mentioned ideas and illustrate their exact role in, and influence on, the proposed method.
2.1 The Method of Weighted Residuals
One of the most popular methods for the solution of differential equations is the method of weighted residuals. The basic idea is to require the inner product of the residual of a differential equation, with respect to some weighting function, to vanish. Specifically, if an approximate solution function (reproduced by a series expansion with specified functions \varphi_i(x) and unknown expansion coefficients c_i) is introduced into the differential equation, it will yield a nonzero equation residual R. The inner product of the residual with respect to some weighting function w_i is then defined as the integral of their product over the domain of interest (Appendix A.1), and is set to zero. In this manner, a system of equations of the form \langle R, w_i \rangle = 0, i = 1, \ldots, n, is formed and solved for the unknown coefficients. If the set G of functions \{\varphi_1, \varphi_2, \ldots, \varphi_n\} is orthogonal, then the matrix A of the associated system of algebraic equations A c = b is diagonal and the coefficients are calculated with computational efficiency. In general, though, a function \varphi_i \in G is orthogonal with respect to most, but not all, functions \varphi_j \in G, j = 1, \ldots, n, j \neq i. Accordingly, the matrix A will be sparse and banded (in the best case tridiagonal). In spectral methods, the matrix A will be fully, or nearly fully, occupied.
The choice of the weighting functions determines the method of weighted residuals. Among the most popular are the collocation method, the subdomain method, the least-squares method, the method of moments, and the Galerkin methods (Appendix A.3). The method of weighted residuals and its applicability to different classes of problems is extensively discussed in reference [25]. It is important to emphasize that no matter how significant the choice of the weighting (or test) functions is, the importance of the choice of the basis (or trial) functions used in the series expansion should not be underestimated.
As will be demonstrated by means of specific applications, the proposed numerical method for the solution of differential equations via optimal sequential approximation may involve concepts of the method of weighted residuals.
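The orthogonality remark above can be made concrete. In the sketch below, constructed for this report's example problem rather than taken from it, the equation -F'' = x on (0,1) with F(0) = F(1) = 0 is solved by the Galerkin method with the orthogonal trial set sin(k pi x); the matrix A is then diagonal, and each coefficient follows from a single pair of inner products.

```python
import math

N_TERMS, NQ = 10, 400

def inner(u, v):
    """Inner product <u, v> = integral of u v over (0, 1), midpoint quadrature."""
    h = 1.0 / NQ
    return h * sum(u((i + 0.5) * h) * v((i + 0.5) * h) for i in range(NQ))

# Trial set sin(k pi x): orthogonal, satisfies the boundary conditions, and
# -(sin(k pi x))'' = (k pi)^2 sin(k pi x), so the Galerkin matrix is diagonal:
#   A_kk = (k pi)^2 <sin(k pi x), sin(k pi x)>,   b_k = <x, sin(k pi x)>.
c = []
for k in range(1, N_TERMS + 1):
    wk = lambda x, k=k: math.sin(k * math.pi * x)
    akk = (k * math.pi) ** 2 * inner(wk, wk)
    bk = inner(lambda x: x, wk)
    c.append(bk / akk)

def u(x):
    return sum(ck * math.sin((k + 1) * math.pi * x) for k, ck in enumerate(c))

def exact(x):          # closed-form solution of -F'' = x, F(0) = F(1) = 0
    return (x - x ** 3) / 6.0

err = max(abs(u(i / 100.0) - exact(i / 100.0)) for i in range(101))
```

Because the trial set is orthogonal, no linear system has to be assembled at all; with a non-orthogonal basis the same inner products would fill a banded or full matrix, as described above.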
2.2 Variational Principle Methods
Variational methods in mathematics, physics, and engineering have been extensively studied and developed. A selection of the more general and popular texts includes references [64, 51, 78, 68], and [25].
The variational principle method is based on the calculus of variations, which is concerned with variations of functionals, where a functional is defined as some form of correspondence between a function and the set of the real numbers. Such functionals (also called variational principles) may be readily defined for many problems of practical importance in engineering. The approximate solution (given in terms of specified basis functions and their corresponding unknown coefficients) of a given problem is introduced into the associated functional, and the unknown coefficients are determined by making the functional stationary. In this regard, the first variation of some functional F(f) is defined by

    \delta F = \lim_{\epsilon \to 0} \frac{F(f + \epsilon w) - F(f)}{\epsilon} ,    (2.1)

where f and w are some functions and \epsilon \in R. The functional F(f) is typically given by a variational integral, and is made stationary by requiring its first variation to vanish; the associated integrand is then set equal to zero, yielding the so-called Euler-Lagrange equation [65]. As mentioned above, for many physical problems of the general form

    H[f_0] - g = 0 ,    (2.2)
an associated variational principle F(f_0), whose Euler-Lagrange equation is Eq. (2.2), can be defined, and the Rayleigh-Ritz method can be employed for the computation of the unknown solution expansion coefficients. In Eq. (2.2), H[\cdot] is some differential operator, f_0 is the solution function, and g is some forcing function. It is important to note that the vanishing of the first variation of the functional is only a necessary condition for the existence of an extremum. To determine whether a functional is extremized, i.e. minimized or maximized, the examination of its second variation \delta^2 F is necessary. In particular, if \delta^2 F can be said to be positive, then the functional possesses a minimum; if \delta^2 F can be said to be negative, then the functional possesses a maximum; and if the sign of the second variation is indefinite, then the functional merely possesses a saddle point.
From a physical point of view, variational principles manifest the tendency of nature to favor the presence of extrema in physical systems. The mathematician Euler expressed his certainty that "...there is no doubt that all the effects of the world can be derived by the method of maxima and minima from their final causes as well as from their efficient ones" [25](page 335). However, the derivation of "natural" variational principles for all classes of problems is far from being fully understood and successful. This fact makes the use of "contrived" variational principles necessary. In particular, variational principles are defined based on the properties of the previously introduced differential operator H[\cdot].
If H is linear, i.e. H = L, then L is either self-adjoint or not self-adjoint (Appendix A.4). If L is self-adjoint, then a variational principle is readily defined in terms of L, f_0, and g. If L is not self-adjoint and the operator L can be transformed to a self-adjoint operator L' without changing the associated Euler-Lagrange equation, then a variational principle is defined in terms of L', f_0, and g. If the non-self-adjoint operator L cannot be transformed to a self-adjoint operator, then a variational principle is defined in terms of L, f_0, g, L*, v_0, and h, where L*[v_0] - h = 0 is the adjoint problem of L[f_0] - g = 0.
If H is nonlinear, i.e. H = N, then the term "self-adjoint" cannot be used, because the adjoint of a nonlinear operator cannot be identical to the operator, and is substituted by the term "self-sufficient" [5]. If N is self-sufficient, variational principles can be defined without extreme effort. However, if N is not self-sufficient, the use of Frechet derivatives (Appendix A.4) cannot be avoided, and the tediously derived variational principles are quite complicated and not amenable to computational implementations.
It is interesting to note that there always exists a Galerkin method (a member of the method of weighted residuals) that is equivalent to the variational method; in fact, it was Galerkin himself who noted this fact [25](page 223). Due to this significant property, Galerkin methods have been extensively used in finite element methods.
Members of the method of weighted residuals, in particular the Galerkin method, have been primarily developed and used for the solution of differential equations due to their simplicity and computational efficiency. Less frequently, variational methods have been utilized via the Rayleigh-Ritz method. For both of these cases, the Euler-Lagrange equation is solved numerically in order to determine the function which satisfies it and the associated initial and/or boundary conditions.
However, it is argued by Finlayson [25](page 310) and Greenspan [31] that the numerical approximation of the function that extremizes a given variational principle is to be preferred to the numerical approximation of the function that satisfies the associated Euler-Lagrange equation.
The proposed method of optimal sequential function approximation for the solution of differential equations is based on the use of variational principles. In particular, the variational principle is adopted as the objective function(al) in the optimization problem(s) to be solved numerically in order to obtain the extremal function that satisfies the associated Euler-Lagrange equation and the boundary conditions. A major advantage of the use of variational principles is that coupled and/or uncoupled equations with general boundary conditions can be concisely represented by a single functional.
2.3 Adaptive Grid Optimization
Numerical methods for the solution of fluid mechanics problems rely on the discretization of the continuous spatial (and/or temporal) domain into a grid of nodal points. The requisite computational cost and the efficiency of the solution method are strongly affected by the type of discretization. For example, slow convergence and numerical instability, resulting from poor condition numbers of the associated discretization matrices, have been reported in the literature for uniform grids. Therefore, optimization in computational mechanics has focused primarily on the grid selection for a given problem.
Jensen introduced a nodal density function based on local discretization error estimates as a criterion for optimal node distribution [44], while Oliveira showed that optimal node distribution provides minimal total potential energy of the system [70]. In fact, it was proven that such distributed nodes lie along isoenergetic contours.
Felippa solved the direct and computationally expensive grid optimization problem for a given finite element configuration, and then relocated the nodes according to a total energy criterion [23, 24]. This process was repeated iteratively until convergence to an optimal grid was achieved. His approach was limited to problems governed by self-adjoint, positive definite differential operators. Moreover, unavailable exact derivatives and/or inaccurate derivative approximations dramatically decreased the efficiency of the method. Results were reported for extremely coarse two-dimensional grids (8 to 36 nodes), due to the complexity (constraints) and high dimensionality of the associated optimization problem.
Diaz, Kikuchi, and Taylor classified the optimal node distribution as a relocation technique, or r-method, and included it among the well-known h-, p-, and h-p adaptive methods [16]. In h-methods, the grid spacing h is reduced; additional interpolation functions are introduced to improve the quality of the solution. In p-methods, the order of the interpolation functions is increased, while h-p methods combine the introduction of additional functions with the increase of the order of the existing and/or introduced functions. In the r-method, the location of the nodes of the grid, i.e. the location of the functions, is adjusted for higher accuracy. The grid optimization algorithm of Diaz et al. was based on the minimization of an estimated upper bound on the potential energy of the finite element solution. The associated objective function involved a global interpolation estimate (given by a simple summation of local element interpolation error estimates) which serves as an upper bound on the discretization error.
Recently developed adaptive methods of computational mechanics substitute the principle of the system potential energy minimum with the weaker criterion of the homogeneous distribution of the local approximation error over the elements [13, 17, 15].
The common characteristic of adaptive grid techniques is that a grid is generated and a solution is found. Based on the solution, the grid is adapted and regenerated in order to repeat the described process. Multigrid methods share this a-posteriori error estimation philosophy. It is well known that most of the computational time required by codes that utilize adaptive or multigrid techniques is spent on the necessary grid generation [69, 36, 22]. According to Oden [69], the major challenges to adaptive methods in computational fluid dynamics include: 1) the use of unstructured meshes and therefore elaborate and complicated data structures, 2) the necessity of explicit or iterative solution techniques due to the poor performance of direct solvers on dynamically evolving unstructured meshes, 3) stability issues of the associated numerical schemes stemming from the continuous changes in the data structures and polynomial orders, and 4) the computational overhead of the error estimation and the adaptation process.
The method presented in this report features the advantages of adaptive methods without sharing their disadvantages. Specifically, while the method is adaptive in an iterative manner and is based on the optimization of some appropriate criterion, the dimensionality of the associated optimization problem is kept low, derivatives are not required, and, most importantly, a grid is not built in the traditional sense. Unstructured mesh data structures are not required for the realization of the method due to the use of simple tensor products of low-order basis functions. Moreover, computationally expensive matrix-assembling algorithms and solvers of systems of equations are avoided. Variational principles and weighted residuals concepts are utilized to define well-posed optimization problems; numerical stability issues are limited to the nonlinear optimization process. Most significantly, the grid, defined by the location of the optimally selected basis functions, evolves with the solution and is not based on a-posteriori error estimates; the computational overhead of the a-posteriori error estimation and respective regridding is eliminated. Lastly, the parallel processing of the optimization problem, defined by the proposed algorithm, dramatically increases the efficiency of the proposed method.
2.4 Artificial Neural Networks
Function approximation, in the framework of neural network computations, has been primarily based on the results of Cybenko [11] and Hornik, Stinchcombe, and White [39], who showed that any continuous function on a compact subset of R^d can be approximated arbitrarily well by a linear combination of simple one-dimensional functions \varphi_i(\cdot) : R^{d+m} \to R:

    f_0 \approx f_a = u_n^a(x) = c_0 + \sum_{i=1}^{n} c_i \varphi_i(x; \theta_i) ,    (2.3)

where x \in R^d, \theta_i \in R^m, and c_i \in R represent the independent variables, function parameters, and linear combination coefficients, respectively. If the linear and nonlinear network parameters in Eq. (2.3) are selected by minimizing the square
of the L_2 norm,

    \epsilon^2 = \int (f_0 - f_a)^2 \, dx ,    (2.4)

then Barron [3] showed that if the functions \varphi_i are fixed, \epsilon^2 cannot be smaller than O(n^{-2/d}). He referred to this phenomenon as "the curse of dimensionality". Therefore, he proposed, based on the theoretical work of Jones [46, 47], the following iterative method of sequential approximation:

    u_n^a(x) = u_{n-1}^a(x) + c_n \varphi_n(x; \theta_n) ,    (2.5)

where \varphi_n, c_n, and \theta_n are optimally selected at each iteration of the algorithm. As a result, the high-dimensional optimization problem associated with neural network training is reduced to a series of simpler low-dimensional problems. A general principle of statistics was utilized to show that the upper bound of the error is of the order C/\sqrt{n}, where C is a positive constant.
The functions \varphi_i introduced above have traditionally been sigmoidal in neural network computations, i.e. functions with the properties

    \lim_{\Delta \to -\infty} \varphi_i(\Delta) = 0    and    \lim_{\Delta \to \infty} \varphi_i(\Delta) = 1 .

Alternatively, Orr introduced radial basis functions in his forward selection method of sequential network training [71]. Orr's radial basis functions are of the form \varphi(x; x_c, r), where x, x_c \in R^d and r \in R are the independent variables, the center point, and the radius, respectively. Their value decreases (if they are bounded) or increases (if they are not bounded) monotonically with increasing distance from
the central point. Orr's method is in essence a method of incremental function
approximation. At each iteration of the algorithm, an additional basis function,
which produces the largest reduction of error in the previous iteration, is chosen
from a given set of functions and added to the approximation. However, the forward selection training method can be inecient in that it may require signi cant
computational resources when the set of trial functions is large.
A similar principle is utilized in Platt's resource allocating networks (RAN)
[73]. Whenever an unusual pattern is presented to the network in an on- or o -line
network training procedure, a new computational unit (according to the neural
network terminology), is allocated. The computational units respond to local
regions of the input space.
The concept of incremental (or sequential) approximation is one of the major features of the proposed method. Following the main idea of Barron's algorithm, optimal basis functions with local support are added to the approximating series expansion. A broad class of functions can be employed. B-splines are popular and resemble radial basis functions, but Gaussian functions (see Appendix A.2), families of locally defined orthogonal polynomials, and combinations of hyperbolic tangents can also be used.
Chapter 3
Optimal Incremental Approximation: Approach
The proposed method of optimal incremental function approximation for the solution of differential equations is presented in this chapter. The development of this method was motivated by recent advances in parallel optimization techniques and by the numerical concepts reviewed in the previous chapter. The main features of the proposed method are a) the iterative nature of the proposed algorithm for the approximation and b) the solution of a nonlinear optimization problem at each stage of the algorithm. The objective of the development of the method was the employment of the latter in the solution of differential equations.
3.1 The Proposed Algorithm
The developed algorithm can be summarized as follows [59]. Given a problem $H[f_0(x)] - g = 0$, governed by one (or more) algebraic or differential equation(s), the dependent variable(s) is (are) approximated by a basis expansion of the form given in Eq. (2.3). An incremental set of basis functions is built sequentially to improve the expansion-based approximation. At each stage of the algorithm, the parameters of the new basis function and the corresponding expansion coefficient are optimally determined by solving a low-dimensional, nonlinear optimization problem. Assuming that the algorithm has reached step $n$, the approximating function is given by
$$f_0(x) \approx f_a(x) = u_n^a(x) = \sum_{i=1}^{n} c_i \phi_i(x; \beta_i) = u_{n-1}^a(x) + c_n \phi_n(x; \beta_n). \tag{3.1}$$
The coefficient $c_n$ and the parameters $\beta_n$ of the associated basis function $\phi_n(x; \beta_n)$ are optimally computed according to an appropriate criterion, i.e. by solving a nonlinear optimization problem, while the coefficients $c_i$ and the basis functions $\phi_i(x; \beta_i)$, $i = 1, \dots, n-1$, are held fixed.
If the given problem is linear, the nonlinear optimization problem can be decomposed into two lower-dimensional problems: one for the computation of the parameters $\beta_n$ and one for the calculation of the coefficient $c_n$. In this manner, computational efficiency is enhanced.
A broad class of locally defined functions can be used, e.g. low-order polynomials, B-splines, radial basis functions, and combined hyperbolic tangents. It is emphasized that there are no restrictions on the distribution of the interpolation functions over the domain of interest; overlapping is possible.
Orthogonality or orthonormality properties can be attained, if desired or required, by means of the Gram-Schmidt algorithm, either during the basis-building process or at its completion. The latter requires the recalculation of the coefficients and reduces the computational efficiency of the proposed method since a system of equations has to be solved.
The algorithm can be initialized with either an empty set or an arbitrary number of predetermined functions and coefficients; the second option enables the use of coarse-grid solutions obtained, for example, from a previous finite element analysis of the given problem. The sequential algorithm is halted when the relative difference between the previous and current approximations falls below a user-defined tolerance level.
The details of the method will be presented and computational issues will be addressed with the help of examples of increasing complexity. Results for specific numerical examples (with emphasis on fluid dynamics applications) will be presented and discussed, and the overall performance will be evaluated based on parameter studies.
3.2 The Parallel Direct Search Optimization Technique
The computational eciency of the method depends on the cost of the nonlinear
optimization problem solution. The dimensionality of the nonlinear problem is
kept low due to the iterative nature of the algorithm. Moreover, the parallel direct
search (PDS) optimization technique is used to solve the optimization problem
eciently.
PDS is a software package that consists of a collection of Fortran routines for solving both unconstrained and constrained nonlinear optimization problems using direct search methods. Direct search methods have the advantage of requiring information about the objective function only. Derivatives do not need to be calculated, and the condition for convergence is that the objective function be continuously differentiable. The user has to provide objective function and constraint evaluation routines and an input file, which makes PDS quite easy to use. In this manner, the combined user effort and computer time may often be smaller than that required for gradient-based methods, since the latter are theoretically more efficient but practically more complicated to implement. In addition, ingredients essential to gradient-based methods, i.e. derivatives, may be inaccurate, noisy, unavailable, or quite expensive to compute. PDS can achieve relatively high performance rates because of the parallel execution of the necessary objective function evaluations. Exceptional scalability is achieved by fully exploiting additional processors, yielding almost linear speed-up. The PDS optimization technique and its theoretical background are presented in references [14, 84, 52], and [85]. Reference [10] reports the application of PDS to an optimal design problem.
The working scheme of PDS is as follows. Given an initial guess for the optimization variables vector of length $K$, PDS evaluates the objective function at $d$ vertices of a simplex defined on the hyperplane of dimension $K$. The "best" vertex is determined and compared to the previous iterate. If the difference lies within a user-defined tolerance $tol$, the search is terminated; otherwise, a new iteration is performed with the current iterate as the initial vertex. The search size is controlled by the user through the input parameters $d$ and $tol$. For unconstrained problems the restriction $d \geq 2K$ applies, while for constrained problems $d \geq K^2 + K$. However, $d$ should typically be chosen large in order to ensure a fine-grained search.
PDS works best when the number of optimization variables is kept small, i.e. when the ratio of the number of necessary function evaluations to the number of available processors is low. This requirement is satisfied by the proposed method for incremental function approximation. Computational implementation issues will be addressed in the following sections and in Appendix A.5.
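PDS itself is a Fortran package, but the general flavor of a direct search that uses only objective function values can be sketched in serial Python. The following is an illustrative compass-search stand-in, not the PDS algorithm itself; the function names, polling pattern, and step-halving rule are all assumptions made for the sketch:

```python
def pattern_search(f, x0, step=0.5, tol=1e-6, max_iter=1000):
    """Minimize f over R^K using only function values (no derivatives).
    At each iteration the 2K axial neighbors of the current iterate are
    polled; the step is halved whenever no neighbor improves."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        improved = False
        for k in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[k] += delta
                ft = f(trial)
                if ft < fx:          # greedy: accept any improving trial
                    x, fx = trial, ft
                    improved = True
        if not improved:
            step *= 0.5              # refine the search scale
            if step < tol:
                break
    return x, fx

# usage: minimize a convex quadratic whose minimizer is (1, -2)
xmin, fmin = pattern_search(lambda v: (v[0] - 1) ** 2 + (v[1] + 2) ** 2,
                            [0.0, 0.0])
```

As in PDS, the cost per iteration grows with the number of optimization variables, which is why the low-dimensional subproblems of the proposed method suit this class of optimizer.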
3.3 The Iterative Character of the Proposed Algorithm
This section intends to demonstrate how the iterative nature of the proposed algorithm relates the method of optimal incremental approximation to the Gauss-Seidel
method for the solution of a linear system of equations.
There are two types of numerical methods for the solution of a linear system of algebraic equations of the form $Az = b$, where $A \in \mathbb{R}^{N \times N}$ and $b \in \mathbb{R}^N$ are known and $z \in \mathbb{R}^N$ is the unknown solution vector: direct methods yield the exact solution, as far as computer arithmetic accuracy allows, while iterative methods approximate the exact solution up to a user-specified level of accuracy. Iterative methods are computationally more efficient than direct methods for large systems (large $N$), especially when the matrix $A$ is sparse, but their performance, i.e. convergence rate, strongly depends on the structure and properties of the matrix $A$.
In general, iterative methods are more stable, as they will usually dampen errors with an increasing number of iterations [49]. Various methods and algorithms have been developed to enhance and/or exploit the properties of the matrix $A$ [2]. Iterative methods are classified as either stationary or nonstationary. In stationary iterative methods (Jacobi, Gauss-Seidel, SOR, etc.), the iteration matrix remains invariant, while in nonstationary methods (e.g. conjugate gradients (CG), GMRES, etc.) it varies with each iteration. The objective of the latter is to accelerate the convergence to a sufficiently accurate solution.
The Gauss-Seidel method belongs to the class of stationary iterative methods. Diagonal dominance is required for convergence, and a low spectral radius is desired for a satisfactory rate of convergence (Appendix A.6). In the Gauss-Seidel method, the matrix $A$ is decomposed into its lower triangular part $(L + D)$, which contains the diagonal $D$ of the original matrix, and its strictly upper triangular part $U$, i.e. an upper triangular matrix with a zero diagonal,
$$A = (L + D) + U,$$
and the iterative algorithm is
$$(L + D)\,z^{k+1} = b - U z^k. \tag{3.2}$$
If $A$ is diagonally dominant, the Gauss-Seidel method converges for any starting vector $z^0$. Typically, $z^0 = 0$.
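For reference, the sweep implied by Eq. (3.2) can be written out in a few lines of Python; this is a minimal sketch, and the 3x3 diagonally dominant test system is purely illustrative:

```python
def gauss_seidel(A, b, tol=1e-10, max_iter=500):
    """Solve A z = b by the Gauss-Seidel iteration (L + D) z^{k+1} = b - U z^k,
    sweeping through the rows and using updated components immediately."""
    n = len(b)
    z = [0.0] * n                      # starting vector z^0 = 0
    for _ in range(max_iter):
        diff = 0.0
        for i in range(n):
            s = sum(A[i][j] * z[j] for j in range(n) if j != i)
            new = (b[i] - s) / A[i][i]
            diff = max(diff, abs(new - z[i]))
            z[i] = new
        if diff < tol:                 # change per sweep below tolerance
            break
    return z

# usage: a diagonally dominant 3x3 system (exact solution is [1, 1, 1])
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]
z = gauss_seidel(A, b)
```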
Given a system of algebraic equations, e.g. a differential equation discretized by means of the Bubnov-Galerkin method, the use of the Gauss-Seidel method for its solution can be interpreted as follows. The first coefficient $z_1$ of the solution vector is readily calculated by solving the first equation of the system. Notice that the first equation depends only on the first interpolation function $\phi_1$, since the Bubnov-Galerkin scheme implies that the inner product $\langle R, \phi_i \rangle$ (where $R$ is the differential equation residual and the weighting function is identical to the trial function $\phi_i$) vanishes $\forall i$, assuming that $z^0 = 0$. The second equation is then solved for $z_2$ with known $z_1$, employing the functions $\phi_1$ and $\phi_2$. This process is repeated for all equations; it is clear that at step $n$ all components $z_i$ with $i < n$ are known and the employed functions are the functions $\phi_i$ with $i \leq n$. After all equations have been solved, a new outer iteration begins to improve the accuracy of the solution vector.
The proposed algorithm of Section 3.1 resembles the Gauss-Seidel method in that it proceeds in the same manner, with the exception that at each step of the algorithm the $n$-th interpolation function is optimized instead of being fixed and preselected. It can be argued that the analogue of the matrix $(L + D)$ is not predefined and constant, but built in an adaptive way. The algorithm therefore resembles a "nonstationary" Gauss-Seidel method. Consequently, the convergence rate should be improved compared to that of the stationary Gauss-Seidel method. Moreover, the proposed algorithm is not just an improved technique for the solution of linear systems of algebraic equations. It is an adaptive, optimal approximation algorithm
that imitates grid optimization schemes and can also be employed in nonlinear problems, as will be demonstrated in Chapter 5.
3.4 Function Approximation
In this section, the implementation of the proposed method for two simple nonlinear function approximation problems in one and two dimensions will be demonstrated.
3.4.1 Implementation
At the $n$-th stage of the proposed sequential approximation algorithm, some function $f_0(x)$ is approximated by Eq. (3.1). Let $e_n = f_0(x) - u_n^a(x)$. The squared error between the function and its approximation can be defined as
$$\|e_n\|^2 = \langle e_n, e_n \rangle = \int e_n^2\,dx \tag{3.3}$$
and can be rewritten as
$$\int (f_0(x) - u_n^a(x))^2\,dx = \int (f_0(x) - u_{n-1}^a(x) - c_n \phi_n(x; \beta_n))^2\,dx = \int (e_{n-1} - c_n \phi_n(x; \beta_n))^2\,dx = \int e_{n-1}^2\,dx + c_n^2 \int (\phi_n(x; \beta_n))^2\,dx - 2 c_n \int e_{n-1}\,\phi_n(x; \beta_n)\,dx = \|e_{n-1}\|^2 + c_n^2 \|\phi_n(x; \beta_n)\|^2 - 2 c_n \langle e_{n-1}, \phi_n(x; \beta_n)\rangle. \tag{3.4}$$
The linear coefficient $c_n$ can be determined by differentiating with respect to $c_n$ and requiring the derivative to vanish
$$\frac{\partial \|e_n\|^2}{\partial c_n} = 0 = 2 c_n \|\phi_n(x; \beta_n)\|^2 - 2 \langle e_{n-1}, \phi_n(x; \beta_n)\rangle, \tag{3.5}$$
which yields
$$c_n = \frac{\langle e_{n-1}, \phi_n(x; \beta_n)\rangle}{\|\phi_n(x; \beta_n)\|^2}. \tag{3.6}$$
Note that the procedure for evaluating the coefficient $c_n$ is equivalent to the Bubnov-Galerkin scheme since it provides for the orthogonality of the error $e_n$ to the weighting function $\phi_n(x; \beta_n)$. Indeed, Eq. (3.5) shows that
$$\int e_{n-1}\,\phi_n(x; \beta_n)\,dx - c_n \int (\phi_n(x; \beta_n))^2\,dx = \langle e_n, \phi_n(x; \beta_n)\rangle = 0. \tag{3.7}$$
Substituting Eq. (3.6) into Eq. (3.4) yields
$$\|e_n\|^2 = \|e_{n-1}\|^2 - \frac{\langle e_{n-1}, \phi_n(x; \beta_n)\rangle^2}{\|\phi_n(x; \beta_n)\|^2}. \tag{3.8}$$
To minimize the error, the parameters of the $n$-th basis function must be selected to maximize the last term of Eq. (3.8). This nonlinear optimization problem can be solved efficiently by PDS. Clearly, the proposed method resembles some of the conventional adaptive numerical techniques from computational mechanics in that the new basis function is selected to provide the largest projection on the error of the preceding iteration. Specifically, the new basis function is positioned at the location of the largest previous error. Moreover, the basis function can also be viewed as a weighting function; the proposed method resembles the method of weighted residuals.
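The $n$-th step just described can be sketched with a discrete inner product. In the toy sketch below, a brute-force parameter scan stands in for PDS and a Gaussian bump stands in for the basis family; both are illustrative choices, not the report's actual setup:

```python
import math

def incremental_fit(f, xs, n_steps):
    """Greedy incremental approximation: at each step, pick the basis-function
    parameters that maximize <e_{n-1}, phi>^2 / ||phi||^2 (the last term of
    Eq. 3.8), then set the coefficient by the projection of Eq. (3.6)."""
    target = [f(x) for x in xs]
    u = [0.0] * len(xs)
    for _ in range(n_steps):
        err = [t - a for t, a in zip(target, u)]
        best = None
        # brute-force scan over centers and widths stands in for PDS
        for xM in [i / 10 for i in range(11)]:
            for w in (0.1, 0.2, 0.4, 0.8):
                phi = [math.exp(-((x - xM) / w) ** 2) for x in xs]
                pp = sum(p * p for p in phi)
                ep = sum(e * p for e, p in zip(err, phi))
                gain = ep * ep / pp            # error reduction, Eq. (3.8)
                if best is None or gain > best[0]:
                    best = (gain, phi, ep / pp)  # coefficient, Eq. (3.6)
        _, phi, c = best
        u = [a + c * p for a, p in zip(u, phi)]
    rms = math.sqrt(sum((t - a) ** 2 for t, a in zip(target, u)) / len(xs))
    return u, rms

# usage: approximate f0(x) = x^2 on [0, 1]; more steps can only lower the error
xs = [i / 50 for i in range(51)]
_, rms5 = incremental_fit(lambda x: x * x, xs, 5)
_, rms20 = incremental_fit(lambda x: x * x, xs, 20)
```

Because each step subtracts an optimal projection, the discrete squared error is non-increasing in the number of steps, mirroring Eq. (3.8).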
3.4.2 Numerical Examples
The proposed method has been applied to two nonlinear function approximation examples. In particular, the one-dimensional function $f_0(x) = x^2$ and the two-dimensional function $f_0(x, y) = x^2 + y^2$ have been approximated. For the purpose of evaluating the algorithm, piecewise linear transfer functions were used to construct [57] the computationally straightforward piecewise linear "hat" (B1-spline) basis functions. B1-splines are defined as
$$\phi(x) = \begin{cases} \dfrac{x - (x_M - x_l)}{x_l} & \text{if } x_M - x_l \le x \le x_M \\[4pt] \dfrac{(x_M + x_r) - x}{x_r} & \text{if } x_M \le x \le x_M + x_r \\[4pt] 0 & \text{otherwise,} \end{cases} \tag{3.9}$$
where the parameters $x_M$, $x_l$, and $x_r$ denote the location of the center of the function, its width to the left, and its width to the right, respectively. The simple product of two one-dimensional functions, which can be constructed from a sigma-pi network, has been used in the two-dimensional case, i.e. $\phi(x, y) = \phi(x)\,\phi(y)$.
The optimization variables are given by
$$\beta_n = (x_M, x_l, x_r)_n^T \tag{3.10}$$
for the one-dimensional case and by
$$\beta_n = (\beta_x, \beta_y)_n^T = (x_M, x_l, x_r, y_M, y_l, y_r)_n^T \tag{3.11}$$
for the two-dimensional case. The initial values for the optimization variables,
required by PDS, are chosen so that each new basis function is initially centered
in the domain of interest and encompasses it. As an alternative, the initial guess
can be chosen such that the center of the basis function is located at the position
of the maximum error which has been calculated in the previous iteration in order
to improve the efficiency of the PDS technique. However, this does not affect the accuracy of the results since PDS performs a fine-grained search throughout the
domain.
The approximations obtained by the proposed method are satisfactorily accurate. The RMS error, defined as
$$\mathrm{RMS}_M^N = \sqrt{\frac{\sum_{j=1}^{M} (f_0(x_j) - u_N^a(x_j))^2}{M}}, \tag{3.12}$$
where $M$ is some sufficient number of uniformly distributed trial points, is plotted versus the number $N$ of optimized basis functions in Fig. 3.1 for both the one-dimensional and the two-dimensional problems, using $M = 11$ and $M = 101$ uniformly distributed evaluation points, respectively. Linear and quadratic convergence lines are included to illustrate the satisfactory convergence rate. Convergence rates with respect to two different numbers of used basis functions $N_1$ and $N_2$ are defined as linear if
$$\frac{\mathrm{RMS}_M^{N_1}}{\mathrm{RMS}_M^{N_2}} = N_2/N_1, \tag{3.13}$$
46
(a)
−1
(b)
0
10
10
linear
RMS
RMS
linear
−2
10
−1
10
quadratic
quadratic
−3
10
−2
0
1
10
2
10
N
10
10
0
1
10
2
10
N
10
(b)
(a)
Convergence rates for the function approximation problems:
(a) one-dimensional case and (b) two-dimensional case.
Figure 3.1
and quadratic if
$$\frac{\mathrm{RMS}_M^{N_1}}{\mathrm{RMS}_M^{N_2}} = (N_2/N_1)^2, \tag{3.14}$$
in order to be compatible with the convergence terminology used in the finite element method, where the grid spacing $h$ is used as the convergence metric (for example $h^2$-convergence, etc.). If Eqs. (3.13) and (3.14) become inequalities with larger right sides, the convergence rates are defined as superlinear and superquadratic, respectively. When finite element methods are applied on uniform grids, the grid spacing and the number of basis functions used are directly related through the formulas $h = \frac{1}{N-1}$, for one-dimensional problems, and $h_x h_y = \left(\frac{1}{N_x - 1}\right)\left(\frac{1}{N_y - 1}\right)$, for two-dimensional problems.
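Given two $(N, \mathrm{RMS})$ pairs read off such a plot, the observed order in the sense of Eqs. (3.13) and (3.14) follows from a one-line computation. The sample values below are made up for illustration:

```python
import math

def observed_rate(N1, rms1, N2, rms2):
    """Return p such that RMS^{N1} / RMS^{N2} = (N2 / N1)^p;
    p = 1 indicates linear convergence (Eq. 3.13), p = 2 quadratic (Eq. 3.14)."""
    return math.log(rms1 / rms2) / math.log(N2 / N1)

# usage (made-up data): the error falls 100x as N grows 10x, i.e. quadratic
p = observed_rate(10, 1e-1, 100, 1e-3)
```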
Chapter 4
Linear Differential Equations
Consider the general boundary value problem with homogeneous Dirichlet boundary conditions
$$L[f_0(x)] = g(x) \text{ in } \Omega, \qquad f_0(x) = 0 \text{ on } \partial\Omega, \tag{4.1}$$
where $L[\cdot]$ is a linear, self-adjoint differential operator. Homogeneous Dirichlet boundary conditions are assumed for the sake of simplicity.
For linear, self-adjoint problems associated with homogeneous Dirichlet boundary conditions, the variational principle associated with Eq. (4.1) is given by the functional
$$\Pi[f_a(x)] \equiv \frac{1}{2}\, l\langle f_a(x), f_a(x)\rangle - \langle g(x), f_a(x)\rangle, \tag{4.2}$$
where $l\langle \cdot\,,\cdot\rangle$ is the bilinear symmetric differential form associated with the operator $L$ discussed in Section 1.5.
If the functional $\Pi$ is numerically minimized by the function $f_a(x)$, then $f_a(x)$ automatically satisfies the associated Euler-Lagrange equation, Eq. (4.1); moreover, if $L$ is positive definite, the solution is unique [64](pages 74-75). However, a function $f_a(x)$ that satisfies Eq. (4.1) does not necessarily minimize the functional $\Pi$. It is due to this fact that variational principles are employed by the proposed method. In this regard, the approximate function given by Eq. (3.1) is introduced
into the functional to yield
$$\Pi[u_n^a(x)] \equiv \frac{1}{2}\, l\langle u_n^a(x), u_n^a(x)\rangle - \langle u_n^a(x), g(x)\rangle. \tag{4.3}$$
This functional may be used as the objective function to be minimized in an optimization problem in order to determine the optimal parameters $\beta_n$ and the associated coefficient $c_n$. However, in this case the properties of the differential operator can be exploited to simplify the optimization problem. Since the differential operator is self-adjoint, the boundary conditions are homogeneous and of the Dirichlet type, and $\langle u_n^a(x), g(x)\rangle = l\langle u_n^a(x), f_0(x)\rangle$, Eq. (4.3) can be rewritten as
$$\Pi[u_n^a(x)] = \frac{1}{2}\, l\langle f_0(x) - u_n^a(x), f_0(x) - u_n^a(x)\rangle - \frac{1}{2}\, l\langle f_0(x), f_0(x)\rangle. \tag{4.4}$$
At the $n$-th step of the proposed sequential method, the approximate solution is given by Eq. (3.1). Again, let $e_n = f_0(x) - u_n^a(x)$. Then $e_n = e_{n-1} - c_n \phi_n(x; \beta_n)$, and Eq. (4.4) becomes
$$\Pi[u_n^a(x)] = \frac{1}{2}\, l\langle e_{n-1} - c_n \phi_n(x; \beta_n), e_{n-1} - c_n \phi_n(x; \beta_n)\rangle - \frac{1}{2}\, l\langle f_0(x), f_0(x)\rangle = \frac{1}{2}\, l\langle e_{n-1}, e_{n-1}\rangle + \frac{1}{2} c_n^2\, l\langle \phi_n(x; \beta_n), \phi_n(x; \beta_n)\rangle - c_n\, l\langle e_{n-1}, \phi_n(x; \beta_n)\rangle - \frac{1}{2}\, l\langle f_0(x), f_0(x)\rangle. \tag{4.5}$$
As in the function approximation case, the coefficient $c_n$ can be determined by equating the partial derivative of $\Pi$ with respect to $c_n$ to zero
$$\frac{\partial \Pi[u_n^a(x)]}{\partial c_n} = 0 = c_n\, l\langle \phi_n(x; \beta_n), \phi_n(x; \beta_n)\rangle - l\langle e_{n-1}, \phi_n(x; \beta_n)\rangle. \tag{4.6}$$
The coefficient $c_n$ can then be calculated from
$$c_n = \frac{l\langle e_{n-1}, \phi_n(x; \beta_n)\rangle}{l\langle \phi_n(x; \beta_n), \phi_n(x; \beta_n)\rangle} \tag{4.7}$$
and Eq. (4.5) can be rewritten as
$$\Pi[u_n^a(x)] = \frac{1}{2}\, l\langle e_{n-1}, e_{n-1}\rangle - \frac{1}{2}\, l\langle f_0(x), f_0(x)\rangle - \frac{1}{2}\,\frac{(l\langle e_{n-1}, \phi_n(x; \beta_n)\rangle)^2}{l\langle \phi_n(x; \beta_n), \phi_n(x; \beta_n)\rangle}. \tag{4.8}$$
The equivalence of the method to the Bubnov-Galerkin method is, as in Section 3.4.1, readily deduced by examining Eq. (4.6):
$$0 = c_n\, l\langle \phi_n(x; \beta_n), \phi_n(x; \beta_n)\rangle - l\langle e_{n-1}, \phi_n(x; \beta_n)\rangle = \langle L[c_n \phi_n(x; \beta_n)], \phi_n(x; \beta_n)\rangle - \langle L[e_{n-1}], \phi_n(x; \beta_n)\rangle = \langle R_n, \phi_n(x; \beta_n)\rangle. \tag{4.9}$$
To improve the accuracy of the approximation, the last term of Eq. (4.8) must be maximized by appropriately selecting the parameters $\beta_n$ of the basis function $\phi_n(x; \beta_n)$, since the first two terms of Eq. (4.8) are constant with respect to $\beta_n$.
4.1 Boundary Conditions
Consider the linear, self-adjoint problem $L[f_0(x)] = g(x)$ associated with general boundary conditions $B[f_0(x)] = b(x)$. The boundary operator $B$ may represent a mixture of Dirichlet and Neumann conditions.
Assuming that an arbitrarily chosen function $s(x)$ satisfies the non-homogeneous boundary conditions but not necessarily the differential equation, the function $q(x) = f_a(x) - s(x)$ will satisfy the homogeneous boundary conditions, provided that $f_a(x)$ satisfies the non-homogeneous boundary conditions; since $B[\cdot]$ is a general linear boundary operator, $B[q] = B[f_a] - B[s] = 0$. If $p = f_0 - s$, where $f_0$ is the exact solution of the problem, then, since $L$ is linear,
$$L[p] = L[f_0] - L[s] = g - L[s] = \vartheta. \tag{4.10}$$
In this manner, the non-homogeneous boundary value problem has been transformed into a homogeneous boundary value problem. The variational principle is then
$$\Pi[q] \equiv \frac{1}{2}\langle q, L[q]\rangle - \langle q, \vartheta\rangle \tag{4.11}$$
and can be reformulated in terms of the unknown function $f_a$
$$\Pi[f_a] = \int (f_a L[f_a] - 2 f_a g + f_a L[s] - s L[f_a])\,dx + \int (2 s g - s L[s])\,dx, \tag{4.12}$$
where the second integral can be omitted due to its independence with respect to $f_a$. Since the differential operator is self-adjoint, integration by parts would force the last two terms of the first integral to cancel each other, but the presence of
non-homogeneous boundary conditions will result in the appearance of non-zero boundary terms. The same is true for the integration by parts of the first term of the first integral, i.e. the boundary terms will only be zero for homogeneous boundary conditions. It is clear that the trial functions have to satisfy only the Dirichlet boundary conditions; the Neumann boundary conditions are automatically satisfied by the variational principle.
Though our problem of primary interest is nonlinear, the previous analysis hints at how boundary conditions can be addressed, and at problems that popular ANN bases may encounter with the optimal sequential approximation technique. A preselected sum of weighted bases can be used to assemble the function $s(x)$. The remaining weighted bases are then constrained to satisfy homogeneous boundary conditions and approximate $q(x)$. Unfortunately, Gaussian radial bases and local bases formed from hyperbolic transfer functions [60] cannot be used to accurately approximate $q(x)$ by the optimal sequential approximation technique in a straightforward manner, since the value of the basis function $\phi(x)$ and its derivatives approach zero as $x \to \infty$. A basis should be used such that the derivatives of $\phi(x)$ are nonzero as $\phi(x)$ vanishes for $x \to \infty$.
The remedy is relatively straightforward. A radial basis function is modified such that only its value, but not its derivatives, becomes zero at some finite value of $x$ (Fig. A.2 (b)). The augmented basis function to be used in this report is
$$\phi(x) = \frac{e^{\kappa \eta(x)} - 1}{e^{\kappa} - 1}, \qquad \kappa \in \mathbb{R}, \tag{4.13}$$
where
$$\eta(x) = \begin{cases} \dfrac{x - (x_M - x_l)}{x_l} & \text{if } x_M - x_l \le x \le x_M \\[4pt] \dfrac{(x_M + x_r) - x}{x_r} & \text{if } x_M \le x \le x_M + x_r \\[4pt] 0 & \text{otherwise.} \end{cases} \tag{4.14}$$
Note that the use of the augmented radial basis function with the proposed sequential method advances the method from an h-r scheme to an h-r-p adaptive scheme since, by a Taylor series expansion,
$$\phi(x) = \frac{e^{\kappa \eta(x)} - 1}{e^{\kappa} - 1} = \frac{\eta + \kappa \frac{\eta^2}{2!} + \kappa^2 \frac{\eta^3}{3!} + \cdots}{1 + \frac{\kappa}{2!} + \frac{\kappa^2}{3!} + \cdots}. \tag{4.15}$$
As a result, the order of the basis function increases with the magnitude of $\kappa$. In addition, we can recover the $B_1$ basis since Eq. (4.15) shows that $\phi(x) = \eta(x)$ for $\kappa = 0$.
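Equations (4.13)-(4.15) can be checked numerically. The short sketch below uses the symbol names of the reconstruction above (`eta` for the B1 hat, `kappa` for the shape parameter), which are notational assumptions:

```python
import math

def hat(x, xM, xl, xr):
    """B1 'hat' function of Eq. (4.14): center xM, left width xl, right width xr."""
    if xM - xl <= x <= xM:
        return (x - (xM - xl)) / xl
    if xM < x <= xM + xr:
        return ((xM + xr) - x) / xr
    return 0.0

def augmented_basis(x, xM, xl, xr, kappa):
    """Augmented basis of Eq. (4.13): vanishes where the hat vanishes, but
    with a nonzero derivative there for kappa != 0."""
    eta = hat(x, xM, xl, xr)
    if kappa == 0.0:
        return eta                       # Eq. (4.15): phi -> eta as kappa -> 0
    return (math.exp(kappa * eta) - 1.0) / (math.exp(kappa) - 1.0)

# the small-kappa limit recovers the B1 spline, per Eq. (4.15)
x = 0.3
close = abs(augmented_basis(x, 0.5, 0.5, 0.5, 1e-6) - hat(x, 0.5, 0.5, 0.5))
```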
4.2 Linear Nonself-Adjoint Problems
Application of the method is straightforward for a linear, self-adjoint ordinary differential equation with homogeneous Dirichlet boundary conditions. Therefore, as a test, a linear, second-order, nonself-adjoint ordinary differential equation with non-homogeneous Dirichlet boundary conditions will be solved by means of the optimal incremental approximation method. The transformation of the nonself-adjoint operator into a self-adjoint operator will be demonstrated. For the case in which this transformation is not possible, the shortcoming of the method will be discussed and potential alternatives to overcome it will be proposed.
Consider the general, linear, second-order, nonself-adjoint differential equation
$$L[f_0] = \alpha \frac{d^2 f_0}{dx^2} + \beta \frac{d f_0}{dx} + \gamma f_0 = g, \quad x \in [a, b], \tag{4.16}$$
associated with general boundary conditions. The coefficients $\alpha$, $\beta$, and $\gamma$ of the differential equation may also be functions of the independent variable $x$. The operator $L$ may be adjusted for self-adjointness by premultiplication with some
function of the independent variable $x$, say $P(x)$ [90], [25](page 309). In order for the new operator $L' = P L$ to be self-adjoint, it is necessary and sufficient (Appendix A.4) that the following expression for two arbitrary functions $z$ and $w$,
$$\int_a^b w L'[z]\,dx - \int_a^b z L'[w]\,dx = \int_a^b w P\,L[z]\,dx - \int_a^b z P\,L[w]\,dx = \int_a^b w P \left( \alpha \frac{d^2 z}{dx^2} + \beta \frac{dz}{dx} + \gamma z \right) dx - \int_a^b z P \left( \alpha \frac{d^2 w}{dx^2} + \beta \frac{dw}{dx} + \gamma w \right) dx, \tag{4.17}$$
is a function of boundary terms that arise from integration by parts only. Integration by parts yields
$$\int_a^b \left( w P \alpha \frac{d^2 z}{dx^2} + w P \beta \frac{dz}{dx} + w P \gamma z \right) dx - \int_a^b \left( z P \alpha \frac{d^2 w}{dx^2} + z P \beta \frac{dw}{dx} + z P \gamma w \right) dx =$$
$$\int_a^b \left( -\frac{d(w P \alpha)}{dx} \frac{dz}{dx} + w P \beta \frac{dz}{dx} + w P \gamma z \right) dx + (\mathrm{B.T.})_1 - \int_a^b \left( -\frac{d(z P \alpha)}{dx} \frac{dw}{dx} + z P \beta \frac{dw}{dx} + z P \gamma w \right) dx - (\mathrm{B.T.})_2 =$$
$$-\int_a^b P \alpha \frac{dw}{dx} \frac{dz}{dx}\,dx - \int_a^b w \frac{d(P \alpha)}{dx} \frac{dz}{dx}\,dx + \int_a^b w P \beta \frac{dz}{dx}\,dx + \int_a^b w P \gamma z\,dx + (\mathrm{B.T.})_1 + \int_a^b P \alpha \frac{dw}{dx} \frac{dz}{dx}\,dx + \int_a^b z \frac{d(P \alpha)}{dx} \frac{dw}{dx}\,dx - \int_a^b z P \beta \frac{dw}{dx}\,dx - \int_a^b z P \gamma w\,dx - (\mathrm{B.T.})_2 =$$
$$\int_a^b w \frac{dz}{dx} \left( P \beta - \frac{d(P \alpha)}{dx} \right) dx + (\mathrm{B.T.})_1 - \int_a^b z \frac{dw}{dx} \left( P \beta - \frac{d(P \alpha)}{dx} \right) dx - (\mathrm{B.T.})_2, \tag{4.18}$$
where $(\mathrm{B.T.})_1$ and $(\mathrm{B.T.})_2$ denote the boundary terms stemming from the two integrations by parts. By examining Eq. (4.18), it is obvious that for nonzero $\alpha$ and $\beta$, the necessary and sufficient condition for $L'$ to be self-adjoint is
$$P \beta - \frac{d(P \alpha)}{dx} = 0, \tag{4.19}$$
which yields after integration
$$P(x) = e^{\psi}, \tag{4.20}$$
where $\psi = \int \alpha^{-1} \left( \beta - \frac{d\alpha}{dx} \right) dx$. Consequently, the operator $L'$ is self-adjoint, and a variational principle for Eq. (4.16) is readily defined as [12]
$$\Pi[f_a] \equiv \frac{1}{2} \langle f_a, L'[f_a] \rangle - \langle f_a, g \rangle, \tag{4.21}$$
provided that the boundary conditions are homogeneous. If the boundary conditions are non-homogeneous, then the functional is defined, according to the developments of Section 4.1, as
$$\Pi[q] \equiv \frac{1}{2} \langle q, L'[q] \rangle - \langle q, \vartheta \rangle \tag{4.22}$$
and can be rewritten in terms of $f_a$
$$\Pi[f_a] = \int_a^b (f_a L'[f_a] - 2 f_a g + f_a L'[s] - s L'[f_a])\,dx + \int_a^b (2 s g - s L'[s])\,dx. \tag{4.23}$$
Since the second integral is invariant with respect to $f_a$, it can be dropped from the functional.
It is interesting to note that, after integrating by parts all terms of the first integral involving the differential operator, and if the non-homogeneous boundary conditions are of the Dirichlet type, the functional of Eq. (4.23) will be identical to the functional of Eq. (4.21), derived for the homogeneous boundary value problem, after the latter is also integrated by parts. The difference is that in this case the function $f_a(x)$ must satisfy the non-homogeneous Dirichlet boundary condition; hence, Dirichlet boundary conditions are essential boundary conditions. In the case of Neumann boundary conditions, the functional of Eq. (4.23) will include terms that represent the Neumann boundary conditions. In this regard, the function $f_a(x)$ does not have to satisfy the Neumann boundary conditions since they are included in the functional (natural boundary conditions).
It should be noted that the first variation of the functional of Eq. (4.23) yields Eq. (4.16) as the associated Euler-Lagrange equation, since the transformation of the operator can be interpreted as a simple variable transformation [25](page 309). The second variation of $\Pi$ indicates that the functional should be minimized.
Remarks
1. If the coefficients of the differential equation are constant, then even-ordered derivatives are self-adjoint while odd-ordered derivatives are nonself-adjoint.
2. If $\beta = 0$, then the original operator $L$ is self-adjoint and no transformation or adjustment is necessary.
3. If $\alpha = 0$, then a transformation or an adjustment is impossible. The introduction of the adjoint problem $L^*[v] = h$ into the variational principle is required to yield [5]
$$\int_a^b (v (L[f_a] - g) - h f_a)\,dx = \int_a^b (f_a (L^*[v] - h) - g v)\,dx. \tag{4.24}$$
Unfortunately, it is possible that the sign of the second variation of Eq. (4.24) is indefinite. In this case the functional cannot be extremized. An alternative would be to minimize the square of the first variation of the functional, in other words to apply the least-squares optimization method to the associated Euler-Lagrange equation. However, the simplicity and efficiency of the proposed method would decrease dramatically. Alternatives for the generalization of the method due to the lack of variational principles are considered in Section 6.2.1.
4.2.1 Numerical Example: The Convection-Diffusion Equation
The linear ordinary differential equation
$$-\nu \frac{d^2 f_0}{dx^2} + a \frac{d f_0}{dx} = 0 \quad \text{with} \quad \nu, a > 0 \text{ and } 0 \le x \le 1 \tag{4.25}$$
is a mathematical model of adequate fidelity for the steady-state balance between convection and diffusion. Guymon derived in references [34] and [35] variational principles for the unsteady problem in one and two dimensions using a similar approach and utilized them for the solution by means of the finite element method. The coefficients $\nu$ and $a$ represent the viscosity and convection coefficients, respectively, and will be assumed to be constant for the sake of simplicity. Equation (4.25) is nondimensionalized and, in combination with the non-homogeneous Dirichlet boundary conditions $f_0(0) = 0$ and $f_0(1) = 1$, possesses the exact solution
$$f_0(x) = \frac{e^{(a/\nu)x} - 1}{e^{a/\nu} - 1}. \tag{4.26}$$
The solution function, representing velocity, is characterized by the presence of a boundary layer adjacent to the boundary at $x = 1$ [27]. From a physical perspective, dissipation is, for many flow problems, of significance only in the region of the boundary layer. Therefore, numerical solutions are often oscillatory when the dependent variable exhibits large gradients across the boundary layer. Specifically, in finite difference methods the accuracy and oscillations of the numerical solution [27] are highly dependent on the cell Reynolds number, defined as $\mathrm{Re}_{cell} = \frac{a\,\Delta x}{\nu}$. In this regard, Eq. (4.25) poses a challenging problem to be solved by the proposed method.
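Before applying the method, the exact solution of Eq. (4.26) can be verified against Eq. (4.25) by central differences. This is a quick numerical sanity check; the ratio $a/\nu = 5$ matches one of the cases reported below:

```python
import math

nu, a = 1.0, 5.0                 # viscosity and convection coefficients
lam = a / nu

def f0(x):
    """Exact solution, Eq. (4.26)."""
    return (math.exp(lam * x) - 1.0) / (math.exp(lam) - 1.0)

def residual(x, h=1e-4):
    """Central-difference evaluation of -nu f'' + a f' at x (should be ~0)."""
    d1 = (f0(x + h) - f0(x - h)) / (2 * h)
    d2 = (f0(x + h) - 2 * f0(x) + f0(x - h)) / (h * h)
    return -nu * d2 + a * d1

# largest residual of Eq. (4.25) over interior points x = 0.1, ..., 0.9
r = max(abs(residual(x / 10)) for x in range(1, 10))
```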
Given the fact that the differential operator $L[\cdot] = -\nu \frac{d^2}{dx^2} + a \frac{d}{dx}$ is nonself-adjoint, and following the approach presented in the previous section, it can be found that the operator $L$ can be transformed into a self-adjoint operator $L'$ if it is premultiplied by the function $P(x) = e^{-(a/\nu)x}$. The associated variational principle is then defined, according to Eq. (4.23) and for $g = 0$, as
$$\Pi[f_a] = \int_0^1 (f_a L'[f_a] + f_a L'[s] - s L'[f_a])\,dx, \tag{4.27}$$
where $s(x) = x$ is chosen to satisfy the boundary conditions. Integrating by parts the first term of the integral yields
$$\int_0^1 f_a e^{-(a/\nu)x} \left( -\nu \frac{d^2 f_a}{dx^2} + a \frac{d f_a}{dx} \right) dx = a \int_0^1 f_a e^{-(a/\nu)x} \frac{d f_a}{dx}\,dx - \nu \int_0^1 f_a e^{-(a/\nu)x} \frac{d^2 f_a}{dx^2}\,dx =$$
$$a \int_0^1 f_a e^{-(a/\nu)x} \frac{d f_a}{dx}\,dx + \nu \int_0^1 \frac{d(f_a e^{-(a/\nu)x})}{dx} \frac{d f_a}{dx}\,dx - \nu \left[ f_a e^{-(a/\nu)x} \frac{d f_a}{dx} \right]_0^1 =$$
$$a \int_0^1 f_a e^{-(a/\nu)x} \frac{d f_a}{dx}\,dx + \nu \int_0^1 e^{-(a/\nu)x} \left( \frac{d f_a}{dx} \right)^2 dx - a \int_0^1 f_a e^{-(a/\nu)x} \frac{d f_a}{dx}\,dx - \nu \left[ f_a e^{-(a/\nu)x} \frac{d f_a}{dx} \right]_0^1 =$$
$$\nu \int_0^1 e^{-(a/\nu)x} \left( \frac{d f_a}{dx} \right)^2 dx - \nu e^{-a/\nu} \left( \frac{d f_a}{dx} \right)_{x=1}, \tag{4.28}$$
while integrating by parts the second and third terms of the integral in Eq. (4.27) yields
$$\int_0^1 \left( f_a e^{-(a/\nu)x} \left( -\nu \frac{d^2 s}{dx^2} + a \frac{ds}{dx} \right) - s e^{-(a/\nu)x} \left( -\nu \frac{d^2 f_a}{dx^2} + a \frac{d f_a}{dx} \right) \right) dx =$$
$$\nu \left[ s\,e^{-(a/\nu)x} \frac{d f_a}{dx} - f_a\,e^{-(a/\nu)x} \frac{ds}{dx} \right]_0^1 = \nu e^{-a/\nu} \left( \frac{d f_a}{dx} - \frac{ds}{dx} \right)_{x=1}, \tag{4.29}$$
where the boundary values $f_a(0) = s(0) = 0$ and $f_a(1) = s(1) = 1$ have been used.
The final functional is obtained by combining Eqs. (4.28) and (4.29)
$$\Pi[f_a] = \nu \int_0^1 e^{-(a/\nu)x} \left( \frac{d f_a}{dx} \right)^2 dx - \nu e^{-a/\nu} \left( \frac{ds}{dx} \right)_{x=1}. \tag{4.30}$$
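As a numerical sanity check on the variational principle, the $x$-dependent part of the functional in Eq. (4.30) can be evaluated by the trapezoidal rule for the exact solution and for the trial function $s(x) = x$; the exact solution should yield the smaller value. This is a sketch with illustrative parameter values:

```python
import math

nu, a = 1.0, 5.0
lam = a / nu

def functional(dfdx, m=2000):
    """Trapezoidal evaluation of nu * int_0^1 exp(-lam x) (f')^2 dx,
    i.e. the x-dependent part of the functional in Eq. (4.30)."""
    h = 1.0 / m
    total = 0.0
    for i in range(m + 1):
        x = i * h
        w = 0.5 if i in (0, m) else 1.0
        total += w * math.exp(-lam * x) * dfdx(x) ** 2
    return nu * h * total

def exact_slope(x):
    """Derivative of the exact solution, Eq. (4.26)."""
    return lam * math.exp(lam * x) / (math.exp(lam) - 1.0)

def trial_slope(x):
    """Derivative of s(x) = x, which also satisfies the boundary conditions."""
    return 1.0

pi_exact = functional(exact_slope)   # closed form: nu * lam / (e^lam - 1)
pi_trial = functional(trial_slope)
```

The comparison illustrates why minimizing Eq. (4.30) drives the approximation toward the exact solution.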
The second term of the functional can be omitted since it does not depend on $f_a(x)$. This functional can be directly minimized in order to approximate the function $f_0(x)$ that satisfies the differential Eq. (4.16). The expansion of the approximate solution is given by
$$u_n^a(x) = x + \sum_{i=1}^{n} c_i \phi_i(x; \beta_i) = u_{n-1}^a(x) + c_n \phi_n(x; \beta_n), \tag{4.31}$$
where the first term is included in order to satisfy the boundary conditions. By introducing Eq. (4.31) into Eq. (4.30), the augmented optimization variables vector $\gamma_n = (\beta_n, c_n)^T$ can be computed using PDS in one step. However, this nonlinear optimization problem can be decomposed into two lower-order problems. By differentiating with respect to $c_n$, requiring the derivative to vanish, and abbreviating $\phi_n(x; \beta_n)$ as $\phi_n$ for simplicity, the following expression is obtained
$$2 \nu \int_0^1 e^{-(a/\nu)x}\,\frac{d u_n^a}{dx}\,\frac{d \phi_n}{dx}\,dx = 0. \tag{4.32}$$
The derivative of Eq. (4.31) with respect to the independent variable $x$ is given by
$$\frac{d u_n^a(x)}{dx} = 1 + \sum_{i=1}^{n} c_i \frac{d \phi_i}{dx} = \frac{d u_{n-1}^a(x)}{dx} + c_n \frac{d \phi_n}{dx}. \tag{4.33}$$
Substituting Eq. (4.33) into Eq. (4.32) and solving for the coefficient $c_n$ yields
$$c_n = \left( \int_0^1 e^{-(a/\nu)x} \frac{d \phi_n}{dx} \frac{d \phi_n}{dx}\,dx \right)^{-1} \left( -\int_0^1 e^{-(a/\nu)x} \frac{d \phi_n}{dx}\,dx - \sum_{i=1}^{n-1} c_i \int_0^1 e^{-(a/\nu)x} \frac{d \phi_i}{dx} \frac{d \phi_n}{dx}\,dx \right). \tag{4.34}$$
Combining Eq.s (4.34), (4.33), and (4.30), a new objective function is obtained
[uan] =
Z1
0
e
?x
0 n?1
Z 1 dn dn !?1
@1 + X ci di +
e? x
dx
i=1
dx
0
dx dx
60
?
Z1
0
?x
e
!2
!
?1 Z 1 d d
d dx ? X
d
c e?
dx
dx dx dx dx dx : (4.35)
0
=1
n
n
i
x
i
n
n
i
The parameters of the n-th basis function are determined by minimizing the functional in Eq. (4.35), while the associated coefficient c_n is readily calculated from Eq. (4.34).
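The closed-form update of Eq. (4.34) can be sketched as follows. This is an illustrative Python rendering (the report contains no code); the quadrature rule, node count, and the default value of lam, which stands for the parameter ratio α/ε of the premultiplier P(x) = e^{-(α/ε)x}, are assumptions for demonstration only.

```python
import numpy as np

def _trap(y, x):
    """Composite trapezoidal rule on sampled data."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def coefficient_cn(dphi_n, dphi_prev, c_prev, lam=5.0, npts=401):
    """Evaluate Eq. (4.34) on [0, 1].

    dphi_n    : callable, derivative of the candidate basis phi_n
    dphi_prev : list of callables, derivatives of the fixed bases phi_1..phi_{n-1}
    c_prev    : list of their coefficients c_1..c_{n-1}
    """
    x = np.linspace(0.0, 1.0, npts)
    w = np.exp(-lam * x)                         # premultiplier weight P(x)
    dpn = dphi_n(x)
    denom = _trap(w * dpn * dpn, x)              # int w (phi_n')^2 dx
    rhs = -_trap(w * dpn, x)                     # -int w phi_n' dx
    for ci, dpi in zip(c_prev, dphi_prev):
        rhs -= ci * _trap(w * dpi(x) * dpn, x)   # cross terms with fixed bases
    return rhs / denom
```

As a quick sanity check, for a candidate whose derivative is identically one and no previously fixed bases, both inner products reduce to ∫ w dx and Eq. (4.34) gives c_n = −1 exactly.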
Two PDS input parameters control the number of required PDS iterations and function and constraint evaluations: the number of pattern points d for the search directions and the user-selected tolerance tol, which constitutes the convergence criterion of the PDS algorithm. PDS requires d ≥ 2K pattern points for unconstrained problems and d ≥ K² + K pattern points for constrained problems, where K is the number of optimization variables. Numerical experimentation has shown that d should be chosen larger than 2K or K² + K, respectively, in order to ensure a detailed search and, as far as possible, a global extremum. The number of pattern points for the search directions should be relatively small if the objective function is convex, and large if the objective function is suspected to be nearly concave. Typically, if a small d is chosen, a small tolerance should also be chosen, since the search interval is relatively coarse. Similarly, if a large d is selected for the same number of design variables, a larger tolerance should be chosen to ensure computational efficiency.
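The pattern-size rule above can be encoded in a small helper. This is a sketch; the margin factor used to exceed the minimum is an illustrative default, not a value taken from the report.

```python
# Minimum PDS pattern points: 2K for unconstrained problems,
# K^2 + K for constrained ones (K = number of optimization variables).

def min_pattern_points(K, constrained=False):
    return K * K + K if constrained else 2 * K

# The report advises choosing d larger than the minimum to ensure a
# detailed search; 'margin' is a hypothetical oversampling factor.
def suggested_pattern_points(K, constrained=False, margin=2.0):
    return int(margin * min_pattern_points(K, constrained))
```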
The approximate solutions are compared to the exact solution in Fig. 4.1(a) for α/ε = 5 and Fig. 4.1(b) for α/ε = 20 for the augmented radial bases. The relatively low number of necessary basis functions, in both figures, and the excellent agreement between exact and approximate solutions, even in the presence of a boundary layer characterized by a large gradient, confirm the potential of the method. Most interesting of all, note that the accuracy of the augmented radial bases rivals both finite elements on a uniform grid and B1-splines by the proposed method (Fig. 4.3). Note that since the augmented radial bases are of the same form as the solution in Eq. (4.26), the augmentation parameter is limited to 0 ≤ |β| ≤ 3. This is to demonstrate the relatively low number of bases needed by the method when compared to a uniform grid and traditional radial bases.
[Figure: two plots of u(x) versus x on [0, 1], each overlaying the approximate and exact solutions.]
Figure 4.1 One-dimensional convection-diffusion: exact and approximate solutions for (a) α/ε = 5 and (b) α/ε = 20 using 16 and 9 optimal augmented radial basis functions, respectively.
Two plots from Fletcher [27] are reproduced in Fig. 4.2 in order to illustrate the large oscillations in the solutions obtained by traditional numerical methods (in this case by the finite difference method). The problem and the boundary conditions associated with Fig. 4.2 are the same as those discussed in this section, with the parameter ratio α/ε = 20. The symbol T has been chosen by Fletcher to represent the variable for which the symbol u was used in this text, while the symbol u has been used by Fletcher for the parameter introduced here as α.
Figure 4.3 compares the RMS error convergence rates obtained for α/ε = 5 and α/ε = 20 to the finite element method on a uniform grid, the proposed method with B1-splines, and the proposed method using the augmented radial bases. PDS was executed using a tolerance tol of 10^{-4} and 1000 pattern points d. The RMS error was calculated based on 101 uniformly distributed trial points. Though a large difference in the RMS values is apparent, only a small quantitative difference in the convergence rates can be observed. The RMS error convergence rate for α/ε = 20, tol = 10^{-4}, and d = 1000 is plotted in Fig. 4.3(b). It can be concluded that when the solution of the problem has a relatively large gradient in some region of the domain, the proposed method will fully exploit its adaptive nature and perform much better than traditional methods and traditional bases.
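The RMS comparison above can be reproduced for any trial approximation, assuming the standard one-dimensional convection-diffusion boundary-value problem with u(0) = 0, u(1) = 1 and parameter ratio lam = α/ε; this hedged sketch mirrors the text's 101 uniformly distributed trial points.

```python
import math

def u_exact(x, lam=20.0):
    # (exp(lam*x) - 1) / (exp(lam) - 1); expm1 keeps this stable for small lam.
    # The boundary layer near x = 1 steepens as lam grows.
    return math.expm1(lam * x) / math.expm1(lam)

def rms_error(u_approx, lam=20.0, npts=101):
    # RMS error over npts uniformly distributed trial points on [0, 1]
    pts = [i / (npts - 1) for i in range(npts)]
    err2 = sum((u_approx(x) - u_exact(x, lam)) ** 2 for x in pts)
    return math.sqrt(err2 / npts)
```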
It can be argued that the adjustment of the differential operator L by the introduction of the premultiplying function P(x) is equivalent to the application of the Petrov-Galerkin method to the original problem, when the latter is solved by the traditional finite element method. Since the new, equivalent problem is

L′[f₀(x)] = P(x) h(x) = P(x) L[f₀(x)]   (4.36)

or, more specifically,

e^{-(α/ε)x} ( −ε d²f₀/dx² + α df₀/dx ) = 0 ,   (4.37)

the equation residual, after introducing the approximate solution f_a, is

e^{-(α/ε)x} ( −ε d²f_a/dx² + α df_a/dx ) = e^{-(α/ε)x} R = R′ .   (4.38)
Figure 4.2 One-dimensional convection-diffusion: (a) exact and approximate solutions for different grid spacings and cell Reynolds numbers and (b) exact and approximate solutions in the presence of an upwind scheme; α/ε = 20, plots reproduced from reference [26].
[Figure: two log-log plots of RMS error versus number of bases N, comparing the uniform grid, optimal incremental approximation with B1-splines, and optimal incremental approximation with RBFs, with linear and quadratic reference slopes.]
Figure 4.3 One-dimensional convection-diffusion: RMS error convergence rates for (a) α/ε = 5 and (b) α/ε = 20.
Applying the Bubnov-Galerkin method of weighted residuals to the residual R′ is equivalent to applying the Petrov-Galerkin method of weighted residuals to the original residual R of Eq. (4.38), since

⟨R′, φ_k⟩ = ⟨R, e^{-(α/ε)x} φ_k⟩ .   (4.39)
Moreover, the system of equations obtained by applying the Rayleigh-Ritz method to the problem, i.e. the algebraic equations obtained by requiring the derivatives of the variational principle of Eq. (4.30) with respect to the coefficients c_k, k = 1, …, N, to vanish, is equivalent to the system of equations obtained by the Petrov-Galerkin method. It was found that the equivalent Petrov-Galerkin and Rayleigh-Ritz methods eliminated the oscillatory behavior of the solution but did not enhance the RMS error convergence rate when utilizing B1-splines on a uniform grid; the convergence rate is slightly worse, as seen in Fig. 4.4. In addition, it can be seen that the convergence rate of the proposed method is better when the coefficients are computed one by one at each stage of the algorithm than when all the coefficients are recomputed by solving a system of equations at each step.

An additional characteristic of the method is illustrated in Fig. 4.5, using augmented radial bases. Since the method is based on the adaptive improvement of the series expansion, the absolute value of the coefficients should decrease with increasing number of expansion terms. The oscillations observed in Fig. 4.5 can be explained by the fact that the coefficients do not necessarily decrease strictly monotonically, especially when their sign oscillates in the early stages of the proposed algorithm.
4.2.2 Computational Cost

Tables 4.1, 4.2, and 4.3 summarize the amount of computational work required by PDS for α/ε = 5 using B1-splines, α/ε = 5 using augmented radial bases, and α/ε = 20 using augmented radial bases, respectively. Step 1 involves the selection of optimal parameters in φ(x) of Eq. (4.14). Step 2 involves the selection of the remaining basis function parameters β_l and β_r. The number of required iterations and function and constraint evaluations does not fluctuate significantly. A low number of pattern points associated with a tighter tolerance requires more iterations but fewer function and constraint evaluations than a large number of pattern points associated with a relaxed tolerance. It is interesting to note that in all cases the algorithm requires less work in its initial stage than in all the subsequent steps.
Table 4.1 Amount of computational work required by PDS for the linear convection-diffusion problem using B1-splines (α/ε = 5).

Algorithm step | # of PDS iterations | Total # of function eval. | Total # of constraint eval.
d = 100, tol = 10^{-6}:
 1  | 24 |  836 | 2272
 8  | 18 | 1493 | 1536
 16 | 15 | 1307 | 1365
 24 | 16 | 1168 | 1335
d = 1000, tol = 10^{-4}:
 1  |  7 |  916 | 5344
 8  | 11 | 5721 | 6683
 16 | 12 | 6970 | 8719
 24 | 10 | 7401 | 8475
Table 4.2 Amount of computational work required by PDS for the linear convection-diffusion problem using augmented radial basis functions (α/ε = 5, d = 1000, and tol = 10^{-4}).

Algorithm step | # of PDS iterations | Total # of function eval. | Total # of constraint eval.
Step 1:
 1 |  7 |  883 | 5263
 3 |  9 | 6776 | 7643
 6 | 13 | 4419 | 7943
 9 | 10 | 7320 | 7598
Step 2:
 1 | 6 | 2061 | 2061
 3 | 5 | 3620 | 3620
 6 | 5 | 1778 | 1778
 9 | 7 | 3042 | 3042
Table 4.3 Amount of computational work required by PDS for the linear convection-diffusion problem using augmented radial basis functions (α/ε = 20, d = 1000, and tol = 10^{-4}).

Algorithm step | # of PDS iterations | Total # of function eval. | Total # of constraint eval.
Step 1:
 1 | 12 | 2420 | 10245
 3 | 13 | 7819 |  9160
 6 | 13 | 3883 |  4719
 9 |  8 | 4903 |  5080
Step 2:
 1 | 6 | 2596 | 2596
 3 | 6 | 4537 | 4537
 6 | 7 | 1760 | 1760
 9 | 6 | 3469 | 3469
[Figure: log-log plot of RMS error versus N comparing the optimal sequential approximation with coefficients computed one by one, the version solving the full system at each step, the Bubnov-Galerkin method, and the equivalent Rayleigh-Ritz/Petrov-Galerkin method, with linear and quadratic reference slopes.]
Figure 4.4 One-dimensional convection-diffusion: convergence rate comparison considering the Petrov-Galerkin and Rayleigh-Ritz methods; α/ε = 20.
[Figure: semilog plot of |c| versus term index j = 1, …, 11.]
Figure 4.5 One-dimensional convection-diffusion: absolute value of the coefficients c_i for α/ε = 20.
Chapter 5

The Euler Equations

Many kinds of flow of practical interest in fluid dynamics, including compressible flow, can be assumed to be irrotational. When a flow is irrotational, the vorticity of the fluid, which is proportional to its angular velocity, is equal to zero. This is the case for supersonic flows around sharp edges or cones and for two-dimensional or axisymmetric nozzle flows, and it can be assumed to be the case for flows around slender sharp-nosed bodies. However, supersonic flow around blunt bodies and viscous flow inside a boundary layer are certainly rotational [1].

Whenever a flow can be assumed to be irrotational, the velocity vector can be defined as the gradient of the velocity potential, i.e. V = ∇φ. In this manner, the governing equations (continuity and momentum equations) can be combined into a single nonlinear partial differential equation with φ as the single unknown. Once φ is found, all other flow properties can be calculated.
Variational principles for the problem of compressible, inviscid, unsteady flow have been derived for the one- and two-dimensional cases by Prozan [77, 76]. Manwell discusses in reference [54] a variational principle for steady, compressible, inviscid flow with shocks, but does not present numerical results. His approach is similar to that presented earlier by Greenspan and Jain in reference [33], where a variational principle, derived for the problem of steady, irrotational, inviscid, compressible flow, was extremized using finite differences. In fact, the results of an extensive literature search indicated that Greenspan has been one of the few researchers to work directly with the variational principle and attempt to determine the extremal function by direct optimization [31, 33, 32]. In this regard, the proposed method for optimal sequential approximation will be applied to the problem studied by Greenspan and Jain in reference [33].

(Footnote: A flow is assumed to be compressible if the freestream Mach number M∞ exceeds the numerical value of 0.3.)
5.1 The Velocity Potential Equation

The equations governing steady, two-dimensional, homentropic, irrotational, inviscid, compressible flow past a fixed body are given by the continuity and momentum equations, respectively:

∂(ρu)/∂x + ∂(ρv)/∂y = 0   (5.1)

u ∂u/∂x + v ∂u/∂y + (1/ρ) ∂p/∂x = 0   (5.2)

u ∂v/∂x + v ∂v/∂y + (1/ρ) ∂p/∂y = 0 .   (5.3)
In these equations, u and v are the x- and y-Cartesian velocity components, respectively, ρ is density, and p is pressure. The velocity vector is defined as the gradient of the velocity potential φ

V = (u, v)^T = ∇φ = ( ∂φ/∂x, ∂φ/∂y )^T   (5.4)

and has magnitude |V|² = V² = u² + v². The set of equations is completed by the equation of state for an ideal gas

p = a ρ^γ ,   (5.5)
where a is a constant and γ is the ratio of specific heats. Equations (5.1)-(5.3) can be combined to yield the velocity potential equation [1, 33]

( c² − (∂φ/∂x)² ) ∂²φ/∂x² − 2 (∂φ/∂x)(∂φ/∂y) ∂²φ/∂x∂y + ( c² − (∂φ/∂y)² ) ∂²φ/∂y² = 0 ,   (5.6)

where c² = dp/dρ is the square of the local speed of sound. According to the equation of state, for an ideal gas,

c² = a γ ρ^{γ−1} .   (5.7)
The associated boundary conditions are

∂φ/∂n = 0 on the body surface, and   (5.8)

∇φ = (V∞, 0)^T = constant far from the body,   (5.9)

where n is the direction normal to the body surface and V∞ is the freestream velocity, which is recovered at a large distance from the body.
5.2 The Variational Integral

Bateman proposed in reference [4] two variational principles whose variations yield the governing equations for steady, irrotational, isentropic, compressible, inviscid flow. The first is implicitly expressed in terms of the velocity potential and has to be maximized if the stationary flow is subsonic throughout the domain of interest. The second is defined with the help of the stream function ψ, where

V = (u, v)^T = ( ∂ψ/∂y, −∂ψ/∂x )^T ,   (5.10)

and has to be minimized [53]. In the work presented in this report, the first variational principle has been considered. Since pressure is a function of density, and density is in turn a function of the velocity magnitude, the variational integral is given by
Π = ∫_Ω p dΩ = ∫_Ω f(u² + v²) dΩ = ∫_Ω f( (∂φ/∂x)² + (∂φ/∂y)² ) dΩ   (5.11)

and is associated with the following Euler-Lagrange equation:

(∂²f/∂u²) ∂²φ/∂x² + 2 (∂²f/∂u∂v) ∂²φ/∂x∂y + (∂²f/∂v²) ∂²φ/∂y² = 0 .   (5.12)
By comparing Eqs. (5.6) and (5.12) it can be concluded that

∂²f/∂u² = c² − (∂φ/∂x)² ,   (5.13)

∂²f/∂u∂v = −(∂φ/∂x)(∂φ/∂y) , and   (5.14)

∂²f/∂v² = c² − (∂φ/∂y)² .   (5.15)
The functional of Eq. (5.11) is divergent for infinite domains. This drawback has been remedied by Lush and Cherry by means of introducing appropriate terms into Eq. (5.11) that do not affect the associated Euler-Lagrange equation [53]. The modified variational integral is convergent and has to be maximized. It is defined as

Π[φ] = ∫_Ω ( p − p∞ + ρ∞ V∞ ∂(V∞χ)/∂x ) dΩ + ∮_{∂Ω} ρ∞ ( φ/2 − V∞ x ) ∂φ/∂n ds ,   (5.16)
where the subscript ∞ denotes freestream conditions, χ is some unknown auxiliary function, and ∂Ω represents the closed body curve. The potential φ is given by

φ = V∞ x + V∞ χ .   (5.17)

According to the boundary condition, χ has to vanish far from the body. The body used by Greenspan and Jain in reference [33] was a unit cylinder, and in this regard, polar coordinates were utilized to derive the final functional
Π[χ] = p∞ ∫_{θ=0}^{2π} ∫_{r=1}^{∞} ( F − 1 + γ M∞² H ) r dr dθ ,   (5.18)

where

F = ( 1 − ((γ−1)/2) M∞² ( 2 cos(θ) ∂χ/∂r − (2 sin(θ)/r) ∂χ/∂θ + (∂χ/∂r)² + (1/r²)(∂χ/∂θ)² ) )^{γ/(γ−1)}   (5.19)

and

H = ( 1 − 1/r² ) cos(θ) ∂χ/∂r − ( 1 + 1/r² ) (sin(θ)/r) ∂χ/∂θ .   (5.20)
M∞ is the freestream Mach number, given by

M∞ = V∞ / c∞ ,   (5.21)

where c∞ is the freestream speed of sound. Finally, the boundary conditions are

∂χ/∂r = −cos(θ) at r = 1   (5.22)

and

χ = 0 far from the body.   (5.23)
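The integrand of the variational integral of Eq. (5.18), with F and H per Eqs. (5.19) and (5.20), can be sketched as follows. This is an illustrative Python rendering (the report contains no code); chi_r and chi_th stand for the partial derivatives of the auxiliary function χ at the evaluation point, and γ = 1.405 as in the text.

```python
import math

GAMMA = 1.405  # ratio of specific heats used in the report

def F_term(r, th, chi_r, chi_th, M_inf):
    # Eq. (5.19): isentropic pressure ratio as a function of the chi derivatives
    base = 1.0 - 0.5 * (GAMMA - 1.0) * M_inf ** 2 * (
        2.0 * math.cos(th) * chi_r
        - 2.0 * math.sin(th) / r * chi_th
        + chi_r ** 2
        + chi_th ** 2 / r ** 2)
    return base ** (GAMMA / (GAMMA - 1.0))

def H_term(r, th, chi_r, chi_th):
    # Eq. (5.20)
    return ((1.0 - 1.0 / r ** 2) * math.cos(th) * chi_r
            - (1.0 + 1.0 / r ** 2) * math.sin(th) / r * chi_th)

def integrand(r, th, chi_r, chi_th, M_inf):
    # (F - 1 + gamma * M_inf^2 * H) * r, the integrand of Eq. (5.18)
    return (F_term(r, th, chi_r, chi_th, M_inf) - 1.0
            + GAMMA * M_inf ** 2 * H_term(r, th, chi_r, chi_th)) * r
```

A useful check before wiring this into a quadrature-based evaluation: for the trivial candidate χ ≡ 0 (χ_r = χ_θ = 0), F = 1 and H = 0, so the integrand vanishes identically.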
5.3 The Computational Domain

The variational principle of Eq. (5.18) is defined over an infinite domain. However, it is assumed that the radius spans from r = 1 (body surface) to r = 20 (far field). This assumption is more than justified considering the fact that typical numerical grids in airfoil flow analysis extend six to seven times the length of the bodies. In addition, the flow is completely symmetrical about the lines θ = 0°, θ = 180°, θ = 90°, and θ = 270°. In this regard, the computational domain can be restricted to the range 90° ≤ θ ≤ 180°. With respect to the above assumptions, the variational integral to be numerically maximized is given by

Π₂ = ∫_{θ₁=π/2}^{θ₂=π} ∫_{r₁=1}^{r₂=20} ( F − 1 + γ M∞² H ) r dr dθ .   (5.24)
Figure 5.1 depicts the physical (x-y-plane) and computational (r-θ-plane) domains. The points A, B, C, and D, with x-y coordinates (-20,0), (-1,0), (0,1), and (0,20),
[Figure: the physical x-y domain with points A, B, C, D and the computational r-θ domain with the mapped points A*, B*, C*, D* and the extension points a, b, c.]
Figure 5.1 Compressible inviscid flow: (top) the physical (x-y-plane) and (bottom) computational (r-θ-plane) domains.
respectively, are mapped onto the points A*, B*, C*, and D*, with r-θ coordinates (20,180°), (1,180°), (1,90°), and (20,90°), respectively.

The presence of homogeneous Dirichlet boundary conditions on the lines r = 20 and θ = 90° facilitates the use of basis functions that vanish at the domain boundaries. In this regard, and due to the presence of Neumann boundary conditions on the lines r = 1 and θ = 180°, the computational domain has been extended, as shown in Fig. 5.1. The extended computational domain is described by the points a, b, c, and D in the r-θ plane.
5.4 Boundary Conditions

The boundary conditions have to be adjusted for the modified computational domain. Specifically, the appropriate boundary conditions are

χ_r = −cos(θ) at r = 1 ,   (5.25)

χ = 0 at r = 20 and θ = 90° , and   (5.26)

χ_θ = 0 at θ = 180° ,   (5.27)

where the subscripts denote partial differentiation.
It must now be determined whether the Neumann boundary conditions are naturally satisfied by the variational principle or if additional terms and/or modifications of the integral are necessary. Assuming that the Dirichlet boundary conditions are satisfied by the trial functions, the variation of Π[χ] with respect to a variation εη in χ, defined as

χ̄ = χ + εη   (5.28)

with

χ̄_r = χ_r + εη_r and χ̄_θ = χ_θ + εη_θ ,   (5.29)
is given by

δΠ = ( ∂Π̄/∂ε )|_{ε=0} = ∫∫ ( ∂(F̄r)/∂ε )|_{ε=0} dr dθ + ∫∫ ( ∂(γM∞²H̄r)/∂ε )|_{ε=0} dr dθ   (5.30)

and must vanish for Π to be extremized.
In this regard, and by noting that

F̄ = F̄(r, θ, χ̄_r, χ̄_θ) , F̄ ≠ F̄(χ̄) ,
H̄ = H̄(r, θ, χ̄_r, χ̄_θ) , H̄ ≠ H̄(χ̄) ,
r ≠ r(ε) , and θ ≠ θ(ε) ,

it can be found that

( ∂(F̄r)/∂ε )|_{ε=0} = ( ∂(Fr)/∂χ_r ) η_r + ( ∂(Fr)/∂χ_θ ) η_θ   (5.31)

and

( ∂(γM∞²H̄r)/∂ε )|_{ε=0} = ( ∂(γM∞²Hr)/∂χ_r ) η_r + ( ∂(γM∞²Hr)/∂χ_θ ) η_θ ,   (5.32)

with

( ∂χ̄_r/∂ε )|_{ε=0} = η_r and ( ∂χ̄_θ/∂ε )|_{ε=0} = η_θ ,   (5.33)

and

∂(Fr)/∂χ_r = −γ G^{1/(γ−1)} r M∞² ( cos θ + χ_r ) ≡ P ,   (5.34)

∂(Fr)/∂χ_θ = −γ G^{1/(γ−1)} M∞² ( (1/r) χ_θ − sin θ ) ≡ Q ,   (5.35)

∂(γM∞²Hr)/∂χ_r = γ M∞² ( r − 1/r ) cos θ ≡ R ,   (5.36)
and

∂(γM∞²Hr)/∂χ_θ = −γ M∞² ( 1 + 1/r² ) sin θ ≡ S ,   (5.37)

where

G = 1 − ((γ−1)/2) M∞² ( 2 cos θ χ_r − (2 sin θ / r) χ_θ + χ_r² + (1/r²) χ_θ² ) .   (5.38)
Substituting Eqs. (5.31)-(5.37) into Eq. (5.30) yields

δΠ = ∫∫ ( P η_r + Q η_θ ) dr dθ + ∫∫ ( R η_r + S η_θ ) dr dθ .   (5.39)
Integrating by parts the first terms of each integral with respect to r and the second terms of each integral with respect to θ yields

δΠ = −∫∫ (∂P/∂r) η dr dθ − ∫∫ (∂Q/∂θ) η dr dθ + ∫ (P η)|_{r₁}^{r₂} dθ + ∫ (Q η)|_{θ₁}^{θ₂} dr
  − ∫∫ (∂R/∂r) η dr dθ − ∫∫ (∂S/∂θ) η dr dθ + ∫ (R η)|_{r₁}^{r₂} dθ + ∫ (S η)|_{θ₁}^{θ₂} dr .   (5.40)
But

∂R/∂r = −∂S/∂θ   (5.41)

and η is zero on r = r₂ and θ = θ₁ (due to the homogeneous Dirichlet boundary conditions on these boundaries), and so the functional variation reads as
δΠ = −∫∫ ( ∂P/∂r + ∂Q/∂θ ) η dr dθ − ∫ (P η)|_{r₁} dθ + ∫ (Q η)|_{θ₂} dr − ∫ (R η)|_{r₁} dθ + ∫ (S η)|_{θ₂} dr .   (5.42)
In addition, (R η)|_{r₁} = 0 and (S η)|_{θ₂} = 0; the final functional variation is therefore given by

δΠ = −∫∫ ( ∂P/∂r + ∂Q/∂θ ) η dr dθ − ∫ (P η)|_{r₁} dθ + ∫ (Q η)|_{θ₂} dr .   (5.43)
In order for δΠ to vanish for arbitrary η, all the integrands must vanish, i.e.

∂P/∂r + ∂Q/∂θ = 0 ,   (5.44)

(P η)|_{r₁} = 0 , and   (5.45)

(Q η)|_{θ₂} = 0 .   (5.46)
Equation (5.44) is the associated Euler-Lagrange equation of the functional, and Eqs. (5.45) and (5.46) are the Neumann boundary conditions on r = r₁ and θ = θ₂, respectively. Specifically, from Eq. (5.45),

( −γ G^{1/(γ−1)} r M∞² ( cos θ + χ_r ) )|_{r₁} = 0 ⟹ ( G^{1/(γ−1)} )|_{r₁} = 0 or ( cos θ + χ_r )|_{r₁} = 0   (5.47)
and from Eq. (5.46),

( −γ G^{1/(γ−1)} M∞² ( (1/r) χ_θ − sin θ ) )|_{θ₂} = 0 ⟹ ( G^{1/(γ−1)} )|_{θ₂} = 0 or ( (1/r) χ_θ − sin θ )|_{θ₂} = 0 .   (5.48)

Since G^{1/(γ−1)} represents the density of the fluid, and can therefore not be equal to
zero, the Neumann boundary conditions are given by

χ_r = −cos θ on r = r₁   (5.49)

and

χ_θ = 0 on θ = θ₂ ,   (5.50)

which are indeed the (both homogeneous and non-homogeneous) Neumann boundary conditions of the problem and are therefore naturally satisfied by the functional (5.18).
5.5 Numerical Procedure and Results

The proposed algorithm for optimal incremental approximation has been utilized to numerically approximate the extremal function χ(r, θ) that maximizes the variational principle defined in Section 5.2 and adjusted to the computational domain discussed in Sections 5.3 and 5.4. The function χ(r, θ) is approximated by the series expansion [58]

χ_an(r, θ) = Σ_{i=1}^{n} c_i φ_i(r; κ_ri) ψ_i(θ; κ_θi) ,   (5.51)

where the basis functions φ_i and ψ_i are, as previously introduced, augmented one-dimensional radial bases with parameters κ_ri = (r_M, r_l, r_r, β_rl, β_rr)_i^T and κ_θi = (θ_M, θ_l, θ_r, β_θl, β_θr)_i^T. The c_i are the corresponding expansion coefficients. The tensor product φ_i(r; κ_ri) ψ_i(θ; κ_θi) can be produced by a feedforward network of two hidden layers of radial bases [60].
The function χ(r, θ) appears in the functional only implicitly, in the form of its partial derivatives with respect to r and θ. They are accordingly approximated by

∂χ_an/∂r = Σ_{i=1}^{n} c_i (∂φ_i/∂r)(r; κ_ri) ψ_i(θ; κ_θi)   (5.52)

∂χ_an/∂θ = Σ_{i=1}^{n} c_i φ_i(r; κ_ri) (∂ψ_i/∂θ)(θ; κ_θi) .   (5.53)

These approximations are introduced into the variational integral to be maximized at each step of the algorithm, and PDS is employed to determine the optimal parameters and expansion coefficient.
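The tensor-product expansion of Eqs. (5.51)-(5.53) can be sketched with a plain Gaussian bump standing in for the augmented one-dimensional radial bases of the report (whose exact form is fixed by Eq. (4.14)); the centers and widths below play the role of the optimized parameter vectors and are purely illustrative.

```python
import math

def gauss(t, c, w):
    return math.exp(-((t - c) / w) ** 2)

def dgauss(t, c, w):
    # derivative of the Gaussian bump with respect to t
    return -2.0 * (t - c) / w ** 2 * gauss(t, c, w)

def chi_an(r, th, terms):
    """terms: list of (c_i, r_center, r_width, th_center, th_width)."""
    return sum(c * gauss(r, rc, rw) * gauss(th, tc, tw)
               for c, rc, rw, tc, tw in terms)

def chi_an_r(r, th, terms):      # tensor-product form of Eq. (5.52)
    return sum(c * dgauss(r, rc, rw) * gauss(th, tc, tw)
               for c, rc, rw, tc, tw in terms)

def chi_an_th(r, th, terms):     # tensor-product form of Eq. (5.53)
    return sum(c * gauss(r, rc, rw) * dgauss(th, tc, tw)
               for c, rc, rw, tc, tw in terms)
```

Because the functional only sees χ through its partial derivatives, a finite-difference check of chi_an_r and chi_an_th against chi_an is a quick way to validate any replacement basis.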
Due to the high complexity and nonlinearity of the functional, the optimization problem cannot be decomposed into lower-dimensional problems; all eleven optimization variables, represented by the vector Λ_i = (r_M, r_l, r_r, θ_M, θ_l, θ_r, β_rl, β_rr, β_θl, β_θr, c)_i^T, are computed in one step by solving an eleven-dimensional highly nonlinear optimization problem using PDS. To enhance computational efficiency in the absence of simplification and decomposition possibilities, the scaling option of PDS was activated (Appendix A.5.2). The components of the initial guess vector for each step of the algorithm were the following: r_M was positioned close to the cylinder surface, while r_l and r_r were chosen so that φ(r) covers the computational domain in the r-direction; θ_M was positioned in the middle of the domain (with respect to the θ-direction), while θ_l and θ_r were also chosen so that ψ(θ) covers the computational domain in the θ-direction. Finally, the initial guesses for β_rl, β_rr, β_θl, β_θr and the coefficient c were set equal to zero, so that the functional value achieved in the previous step of the algorithm is the initial value in the next step. In this manner, it is guaranteed that the functional value will be strictly monotonically improved with each iteration of the proposed algorithm.
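The initial-guess strategy just described can be written out for one algorithm step. The near-surface offset and the use of the domain ends as spreads are illustrative readings of the text, not values from the report; the eleven entries mirror the eleven-dimensional optimization vector.

```python
# Entries: (r_M, r_l, r_r, th_M, th_l, th_r, beta_rl, beta_rr, beta_tl, beta_tr, c)

def initial_guess(r1=1.0, r2=20.0, th1=90.0, th2=180.0):
    rM = r1 + 0.05 * (r2 - r1)      # center close to the cylinder surface
    rl, rr = r1, r2                 # spread chosen to cover the r-direction
    thM = 0.5 * (th1 + th2)         # center in the middle of the theta-range
    thl, thr = th1, th2             # spread chosen to cover the theta-direction
    # augmentation parameters and coefficient start at zero, so the functional
    # value achieved in the previous step carries over as the starting value
    return [rM, rl, rr, thM, thl, thr, 0.0, 0.0, 0.0, 0.0, 0.0]
```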
Numerical results are presented for transonic flow of air about a unit cylinder and compared to results presented by Greenspan and Jain in reference [33]. The following flow properties were estimated after solving for the function χ: the nondimensionalized speed of the fluid

V²/V∞² = 1 + 2 cos(θ) χ_r − (2 sin(θ)/r) χ_θ + χ_r² + (1/r²) χ_θ² ,   (5.54)

the nondimensionalized local speed of sound

c²/V∞² = 1/M∞² + (1/2)(γ − 1)( 1 − V²/V∞² ) ,   (5.55)

the nondimensionalized pressure

p/p∞ = ( 1 + (1/2)(γ − 1) M∞² ( 1 − V²/V∞² ) )^{γ/(γ−1)} ,   (5.56)

the pressure coefficient

C_p = ( p/p∞ − 1 ) / ( (γ/2) M∞² ) ,   (5.57)

and the nondimensionalized density

ρ/ρ∞ = ( p/p∞ )^{1/γ} ,   (5.58)

where γ = 1.405. Greenspan and Jain used a Newton-like algorithm in their numerical test cases with Δθ = 9° and Δr = 0.1, 0.15, and 0.2, i.e. they used finite difference grids of 11 × 191 = 2101, 11 × 127 = 1397, and 11 × 96 = 1056 nodes, respectively. As will be shown in the illustrations provided below, the results reported by Greenspan and Jain are quite sensitive with respect to the value of Δr. Our numerical examples are presented for the freestream Mach numbers of 0.4 and 0.43. The results obtained by the proposed method are compared to those presented in reference [33]. The latter include results obtained previously by means of other numerical and analytical approximation methods and reported by Lush and Cherry [53] and Imai [42] for M∞ = 0.4 and M∞ = 0.43. Note that for illustrative purposes all variables requiring the differentiation of χ are curve-fit by a cubic polynomial.
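The post-processing chain of Eqs. (5.54)-(5.58) can be sketched directly. This is an illustrative Python rendering (the report contains no code); given the partial derivatives χ_r and χ_θ of the solved auxiliary function at a point (r, θ), it recovers the nondimensionalized flow properties, with γ = 1.405 as in the text.

```python
import math

GAMMA = 1.405  # ratio of specific heats used in the report

def flow_properties(r, th, chi_r, chi_th, M_inf):
    # (V/V_inf)^2, Eq. (5.54)
    V2 = (1.0 + 2.0 * math.cos(th) * chi_r
          - 2.0 * math.sin(th) / r * chi_th
          + chi_r ** 2 + chi_th ** 2 / r ** 2)
    # (c/V_inf)^2, Eq. (5.55)
    c2 = 1.0 / M_inf ** 2 + 0.5 * (GAMMA - 1.0) * (1.0 - V2)
    # p/p_inf, Eq. (5.56)
    p = (1.0 + 0.5 * (GAMMA - 1.0) * M_inf ** 2 * (1.0 - V2)) ** (GAMMA / (GAMMA - 1.0))
    # pressure coefficient, Eq. (5.57)
    Cp = (p - 1.0) / (0.5 * GAMMA * M_inf ** 2)
    # rho/rho_inf, Eq. (5.58)
    rho = p ** (1.0 / GAMMA)
    return {"V2": V2, "c2": c2, "p": p, "Cp": Cp, "rho": rho}
```

At freestream conditions (χ_r = χ_θ = 0) this returns V²/V∞² = 1, p/p∞ = 1, C_p = 0, and ρ/ρ∞ = 1; the local Mach number then follows as sqrt(V2/c2).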
5.5.1 Numerical Example: M∞ = 0.4

The numerical example of M∞ = 0.4 presented in this report simulates near-sonic conditions. However, the flow remains subsonic throughout the domain of interest. As illustrated in Figs. 5.2 and 5.3, the extremal function χ(r, θ) was satisfactorily approximated after 15 two-dimensional augmented radial basis functions were determined and used in the series expansion. The values of the function χ at different radii are compared to results presented in reference [33], which include numerical values obtained by 1) Imai and 2) Greenspan and Jain using a) Δr = 0.1 and b) Δr = 0.2. The agreement is good for r = 1. For higher radii the results obtained by the proposed algorithm deviate, although not significantly, from those reported by Greenspan and Jain. It should be emphasized, however, that these values have been reproduced from scanned plots due to the absence of tabulated data in reference [33]. Consequently, error has been introduced into the presented values. Moreover, the numerical results presented by Greenspan and Jain are over three decades old. Therefore, they merely constitute a comparison basis and not an accurate benchmark.
[Figure: surface plot of χ(r, θ) over the computational domain.]
Figure 5.2 Compressible inviscid flow: the approximate velocity potential auxiliary function χ(r, θ) for M∞ = 0.4 using 15 optimal basis functions.
[Figure: four panels (r = 1, 1.6, 2.2, and 2.8) plotting χ versus θ for the optimal sequential approximation, Imai, and Greenspan with Δr = 0.1 and Δr = 0.2.]
Figure 5.3 Compressible inviscid flow: the approximate velocity potential auxiliary function χ(r, θ) at different radii (on and near the cylinder surface) for M∞ = 0.4 using 15 optimal basis functions.
The nondimensionalized fluid velocity on the surface of the cylinder (r = 1) is plotted in Fig. 5.4 and compared to the results of Greenspan and Jain, Lush and Cherry, and Imai. Figure 5.4 also depicts the nondimensionalized local speed of sound on the cylinder surface. In this, the most sensitive test of the numerical approximation, it can be seen that the fluid velocity barely exceeds the sonic speed at θ = 90°, though the actual flow should be just below sonic speed; the approximation is considered to be satisfactory overall.
[Figure: plot of nondimensionalized velocity and sonic speed versus θ for the optimal sequential approximation, Greenspan with Δr = 0.1, Lush and Cherry, and Imai.]
Figure 5.4 Compressible inviscid flow: nondimensionalized (curve 1) fluid velocity and (curve 2) sonic speed on the cylinder surface for M∞ = 0.4.
The pressure coefficient distribution C_p and the nondimensionalized density on the cylinder surface are drawn in Fig. 5.5.

Figure 5.6 presents the local Mach number distribution over the entire flow field and some representative Mach-isolines (lines of equal Mach number) for the region close to the cylinder surface.
[Figure: plots of C_p and ρ/ρ∞ versus θ on the cylinder surface for the optimal sequential approximation, Greenspan with Δr = 0.1, Lush and Cherry, and Imai.]
Figure 5.5 Compressible inviscid flow: (top) pressure coefficient distribution and (bottom) nondimensionalized density on the cylinder surface for M∞ = 0.4.
[Figure: surface plot of local Mach number over the flow field and a contour plot of Mach-isolines (levels 0.0744 to 0.9652) near the cylinder surface.]
Figure 5.6 Compressible inviscid flow, M∞ = 0.4: (top) local Mach number distribution over the entire flow field and (bottom) Mach-isolines in a region close to the cylinder surface.
5.5.2 Numerical Example: M∞ = 0.43

Transonic conditions are successfully simulated when the freestream Mach number takes the value of 0.43. Specifically, a supersonic bubble appears in the vicinity of θ = 90° on and near the cylinder surface.

The numerically approximated function χ(r, θ) is depicted in Figs. 5.7 and 5.8 using 15 optimal basis functions. It can be concluded by comparison with Figs.
[Figure: surface plot of χ(r, θ) over the computational domain.]
Figure 5.7 Compressible inviscid flow: the approximate velocity potential auxiliary function χ(r, θ) for M∞ = 0.43 using 15 optimal basis functions.
5.2 and 5.3 that the function χ reaches higher absolute values for the present test case. Once again, the values of the function χ at different radii are compared to results presented in reference [33], which include numerical values obtained by 1) Imai and 2) Greenspan and Jain using a) Δr = 0.1 and b) Δr = 0.15. The agreement seems to be better for this test case.

The nondimensionalized fluid velocity and local speed of sound on the surface of the cylinder, presented in Fig. 5.9, illustrate the appearance of the supersonic
[Figure: four panels (r = 1, 1.6, 2.2, and 2.8) plotting χ versus θ for the optimal sequential approximation, Imai, and Greenspan with Δr = 0.1 and Δr = 0.15.]
Figure 5.8 Compressible inviscid flow: the approximate velocity potential auxiliary function χ(r, θ) at different radii (on and near the cylinder surface) for M∞ = 0.43 using 15 optimal basis functions.
bubble. In the vicinity of θ = 90°, the fluid velocity exceeds the local speed of sound and the flow becomes supersonic. The pressure coefficients and the nondimensionalized density, plotted in Fig. 5.10, are in satisfactory agreement with the numerical results of Greenspan and Jain.

Figure 5.11 presents the local Mach number distribution over the flow field and representative Mach-isolines for the region close to the cylinder surface. The presence of supersonic flow regions is clear.
5.5.3 Computational Work
The amount of computational work that PDS required at each step of the proposed
algorithm is presented in Tables 5.1 and 5.2. Computational work is recorded in
terms of the numbers of required PDS iterations and function and constraint eval-
[Figure: plot of nondimensionalized velocity and sonic speed versus θ for the optimal sequential approximation and Greenspan with Δr = 0.1 and Δr = 0.15.]
Figure 5.9 Compressible inviscid flow: nondimensionalized (curve 1) fluid velocity and (curve 2) sonic speed on the cylinder surface for M∞ = 0.43.
[Figure: plots of C_p and ρ/ρ∞ versus θ on the cylinder surface for the optimal sequential approximation and Greenspan with Δr = 0.1 and Δr = 0.15.]
Figure 5.10 Compressible inviscid flow: (a) pressure coefficient distribution and (b) nondimensionalized density on the cylinder surface for M∞ = 0.43.
[Figure: surface plot of local Mach number over the flow field and a contour plot of Mach-isolines (levels 0.0832 to 1.0847) near the cylinder surface.]
Figure 5.11 Compressible inviscid flow, M∞ = 0.43: (top) local Mach number distribution over the entire flow field and (bottom) Mach-isolines in a region close to the cylinder surface.
uations. The strictly monotonically increasing functional value is also included. As can be concluded by inspection of Tables 5.1 and 5.2, the amount of computational work does not exhibit large fluctuations as the algorithm advances. Even when the maximum numerical value of the functional is asymptotically approached with increasing number of optimal bases, the amount of work necessary for the improvement of the variational integral value is of the same order as for the previous iterations.
Table 5.1: Amount of computational work required by PDS for the Euler
equation problem using augmented radial basis functions for d = 1120 and
tol = 10^-4 (M∞ = 0.4).

          Algorithm  # of PDS    Total # of      Total # of        Functional
          step       iterations  function eval.  constraint eval.  value
Step 1    1          21          6318            17198             0.0535
          5          24          23824           24107             0.0960
          13         25          22305           23105             0.0974
          22         17          13823           16253             0.0975
Step 2    1          17          9032            9032              0.0731
          5          75          76254           76254             0.0960
          13         96          56877           56877             0.0974
          22         15          4006            4006              0.0975
Table 5.2: Amount of computational work required by PDS for the Euler
equation problem using augmented radial basis functions for d = 1120 and
tol = 10^-4 (M∞ = 0.43).

          Algorithm  # of PDS    Total # of      Total # of        Functional
          step       iterations  function eval.  constraint eval.  value
Step 1    1          17          5341            14303             0.0694
          6          113         125333          126196            0.1097
          12         20          14432           16834             0.1155
          19         27          20664           28026             0.1156
Step 2    1          10          1383            1383              0.0924
          6          16          3903            3903              0.1097
          12         21          5884            5884              0.1155
          19         5           1208            1208              0.1156
Chapter 6
Conclusions and Future Work
6.1 Conclusions
6.1.1 Objectives
The objective of the work presented in this report is not only to model the
nonlinear compressible Euler equations by feedforward artificial neural networks,
but also to develop a computationally efficient method that requires minimal user
interaction, is general enough to use any of the popular ANN transfer functions
that can form compact bases, can be applied to a complex computational domain,
and approaches the accuracy of conventional computational mechanics techniques.
The benefits of accomplishing this task, besides those previously given, will be a
relatively painless feedforward artificial neural network construction technique that
is understandable and usable by engineers and that addresses the validation and
verification concerns voiced by those applying ANN technology.
6.1.2 Approach
To fulfill our objectives, iterative approximation concepts originating in artificial
neural networks are combined with error minimization ideas from function
approximation theory. The resulting method involves the incremental
determination of the basis functions, and of the associated coefficients, used in the
series expansion that represents the solution. The proposed method requires the
solution of a nonlinear optimization problem at each step. Variational principles,
associated with the given differential equations and the boundary conditions, are
utilized to define the objective function(al). The major advantage in the use of
variational principles is that they succinctly represent the governing equations and
the associated boundary conditions in a single variational integral, which is then
extremized. In this manner, a well-posed formulation of the optimization problem
is guaranteed. Computational efficiency is achieved by utilizing PDS for the
solution of the optimization problem and by using B1-splines and augmented
radial bases.

The proposed method is adaptive in nature, although a grid is neither built nor
adapted in the traditional sense. It can be classified as an h-r adaptive technique
when using the B1-splines and as an h-r-p adaptive technique when using
augmented radial bases. The computational overhead of the a-posteriori error
estimation and adaptation process, shared by traditional adaptive and multigrid
techniques, is avoided; the "grid", defined by the locations of the optimally
selected basis functions, evolves with the solution. Moreover, additional
disadvantages of traditional adaptive unstructured grid techniques are not present
in the proposed method: complicated data structures are not required, since
systems of equations are neither assembled nor solved, and numerical stability
issues are limited to the nonlinear optimization problem.

If the optimization problem can be decomposed (e.g., in linear problems), the
affinity of the proposed method to the method of weighted residuals (in particular
to the Galerkin method) becomes apparent. The proposed method is iterative in
nature and may be viewed as a nonstationary version of the Gauss-Seidel method
for the solution of systems of algebraic equations.
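The incremental scheme described above can be illustrated with a minimal numerical sketch. The fragment below is not the report's code: it solves the model problem -u'' = g on [0, 1] with homogeneous Dirichlet conditions by adding one B1-spline (hat) basis per step, extremizing the Ritz functional I[u] = ∫(½u'² − gu)dx with a crude random derivative-free search standing in for PDS. The grid resolution, search schedule, and number of steps are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 401)                  # quadrature grid on [0, 1]
g = np.pi**2 * np.sin(np.pi * x)                # source of -u'' = g, u(0) = u(1) = 0

def integrate(f):
    """Composite trapezoidal rule on the fixed grid."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

def hat(xM, xl, xr):
    """B1-spline with center xM, left width xl, right width xr."""
    ramp = np.minimum((x - (xM - xl)) / xl, ((xM + xr) - x) / xr)
    return np.maximum(ramp, 0.0)

def functional(u):
    """Ritz functional I[u] = int(0.5 u'^2 - g u) dx, minimized by the solution."""
    return integrate(0.5 * np.gradient(u, x)**2 - g * u)

def pds_like_search(obj, p0, steps=300):
    """Crude derivative-free random search, a stand-in for PDS."""
    p, best = np.array(p0, float), obj(p0)
    for k in range(steps):
        q = p + 0.3 * 0.98**k * rng.standard_normal(p.size)
        q[1] = np.clip(q[1], 0.05, 0.95)        # center stays inside the domain
        q[2] = np.clip(q[2], 0.02, q[1])        # widths keep the support in [0, 1]
        q[3] = np.clip(q[3], 0.02, 1.0 - q[1])  # so the boundary conditions hold
        v = obj(q)
        if v < best:
            p, best = q, v
    return p, best

u, vals = np.zeros_like(x), []
for n in range(6):                              # one "optimal" basis per step
    # optimization variables: coefficient c and basis parameters (xM, xl, xr);
    # starting from c = 0 guarantees the functional never increases
    obj = lambda p: functional(u + p[0] * hat(*p[1:]))
    p, val = pds_like_search(obj, [0.0, 0.5, 0.3, 0.3])
    u = u + p[0] * hat(*p[1:])
    vals.append(val)

print([round(v, 3) for v in vals])              # monotonically non-increasing
```

The monotone decrease of the functional values mirrors the behavior reported for PDS in Tables 5.1 and 5.2, although the search used here is far simpler than PDS itself.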
6.1.3 Numerical Examples
It is shown that the proposed method can be successfully applied to a
one-dimensional linear nonself-adjoint differential equation and to a nonlinear
differential equation, associated with Dirichlet and Neumann types of boundary
conditions. Emphasis is given to a fluid dynamics application described by the
compressible Euler equations. Accurate numerical results and satisfactory
convergence rates are reported. Specifically, the accuracy of the presented
numerical results rivals that of traditional finite elements on a uniform grid,
considering the relatively low number and the simplicity of the optimal basis
functions used in the solution representation. Moreover, the proposed method does
not face the difficulties that traditional methods do in some applications; for
example, it is free of the oscillations observed in the finite difference and finite
element solutions for the linear convection-diffusion problem presented in
Section 4.2.1.
6.1.4 Optimization and Bases
Computational cost is reduced and efficiency is enhanced by solving
low-dimensional optimization problems using the parallel direct search (PDS)
technique. Moreover, the optimization problem may be further decomposed in
linear applications. Parameter studies demonstrate that the computational work
required by PDS for the solution of the nonlinear optimization problems does not
increase as the algorithm advances; the numbers of required PDS iterations and
function and constraint evaluations do not fluctuate dramatically.

Readily parameterized conventional and augmented radial bases, and their tensor
products, are used to ensure simplicity and computational efficiency. However, a
broad class of functions (e.g., hyperbolic tangents) can be utilized if they are
combined to form compact bases, though added computational effort may be
required to satisfy the boundary conditions with each new basis.
6.2 Future Work
6.2.1 Generalization of the Proposed Method
The proposed method is based on the utilization of natural and contrived
variational principles. Such variational formulations exist for many problems of
practical interest in engineering and have been studied extensively. However, there
still exist problems for which variational principles have not yet been defined. In
the following, it is demonstrated how alternative formulations of the optimization
problems can be adopted by the proposed method for the determination of the
optimal basis functions and the associated coefficients.
6.2.2 Utilizing the Method of Weighted Residuals
The method of weighted residuals could act as the necessary alternative. For
example, the minimization of the square of the residual of a differential equation
corresponds to the well-known least-squares weighted residual method: if the
objective functional ⟨R, R⟩ is differentiated with respect to the coefficients c_k,
k = 1, …, N, of a series expansion, a system of algebraic equations is formed for
the solution of the unknown vector c. The functional ⟨R, R⟩ could be used as the
objective function to be minimized in the proposed method of sequential
approximation. In this case, both the coefficients c_k and the parameters of the
basis functions φ_k of the series expansion constitute the optimization variables,
and the proposed method is similar to the least-squares weighted residuals
method. However, the disadvantages include: 1) the necessity of developing
techniques for the accommodation of the boundary conditions, 2) the possible
solution of a multi-objective optimization problem in the event that a given
problem is governed by more than one differential equation, and 3) solution
uniqueness problems, especially in the case of nonlinear differential equations.
An alternative to the least-squares method, also within the class of the method of
weighted residuals, can be deduced by noting that the numerator of the last term
of Eq. (4.8), which is to be maximized, can be rewritten as

  (⟨l[e_{n−1}], φ_n(x)⟩)²
    = (⟨l[f_0(x) − u^a_{n−1}(x)], φ_n(x)⟩)²
    = (⟨l[f_0(x)], φ_n(x)⟩ − ⟨l[u^a_{n−1}(x)], φ_n(x)⟩)²
    = (⟨L[f_0(x)], φ_n(x)⟩ − ⟨L[u^a_{n−1}(x)], φ_n(x)⟩)²
    = (⟨g(x), φ_n(x)⟩ − ⟨L[u^a_{n−1}(x)], φ_n(x)⟩)²
    = ⟨R_{n−1}, φ_n(x)⟩² ,                                          (6.1)

where f_0 denotes the exact solution, so that L[f_0(x)] = g(x), and
e_{n−1} = f_0 − u^a_{n−1}. Maximizing this quantity is equivalent to minimizing

  −⟨R_{n−1}, φ_n(x)⟩² ,                                             (6.2)

where the equation residual R_{n−1} is equal to −(L[u^a_{n−1}(x)] − g(x)). From a
geometric perspective, Eq. (6.1) indicates that by maximizing ⟨R_{n−1}, φ_n⟩², the
n-th basis function will be selected such that φ_n is as parallel as possible to the
equation residual. For the sake of generalization, the objective function
⟨R_{n−1}, w_n⟩² will be used, where w_n is a function of φ_n.

The selection of the associated coefficient c_n according to another geometric
consideration follows: the residual R_n is forced to be as orthogonal as possible to
w_n. In this regard, the objective function to be minimized is ⟨R_n, w_n⟩², with c_n
being the optimization variable. The equivalence of this technique to the
Petrov-Galerkin method of weighted residuals is apparent. In fact, for linear
problems the coefficient can be readily calculated from the Galerkin relation
⟨R_n, w_n⟩ = 0 rather than by minimizing ⟨R_n, w_n⟩².
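For a linear operator, the two geometric criteria above reduce to a simple greedy loop: pick the candidate basis most parallel to the current residual, then compute its coefficient from the Galerkin orthogonality relation. The sketch below is an illustration of that idea, not code from the report; it uses a sine dictionary for -u'' = g on [0, 1] (for which L[sin kπx] = (kπ)² sin kπx), with the grid size and dictionary chosen for convenience.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 801)
dx = float(x[1] - x[0])

def inner(f, h):
    """L2 inner product on [0, 1] by the trapezoidal rule."""
    fh = f * h
    return float(np.sum(0.5 * (fh[1:] + fh[:-1])) * dx)

ks = list(range(1, 8))                             # sine dictionary sin(k pi x)
phi = {k: np.sin(k * np.pi * x) for k in ks}
Lphi = {k: (k * np.pi) ** 2 * phi[k] for k in ks}  # L[u] = -u'' on the dictionary

# manufactured problem: exact solution u = sin(pi x) + 0.5 sin(3 pi x)
g = np.pi**2 * phi[1] + 0.5 * (3 * np.pi) ** 2 * phi[3]

u, R = np.zeros_like(x), g.copy()                  # residual R_0 = g - L[0] = g
for step in range(4):
    # Eq. (6.2): pick the basis most "parallel" to the current residual
    k = max(ks, key=lambda k: inner(R, phi[k]) ** 2 / inner(phi[k], phi[k]))
    # Galerkin relation <R_n, phi_k> = 0 with R_n = R - c L[phi_k]
    c = inner(R, phi[k]) / inner(Lphi[k], phi[k])
    u, R = u + c * phi[k], R - c * Lphi[k]

err = np.max(np.abs(u - (phi[1] + 0.5 * phi[3])))
print(err)   # small: two greedy steps already capture both modes
```

Because the source contains exactly two dictionary modes, the first two iterations recover them with their exact coefficients (up to quadrature error) and the residual collapses, which is the behavior the geometric argument predicts.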
6.2.3 Increasing Computational Efficiency
For practical applications we can take the preceding derivations and minimize
both

  −⟨w_n(x), (R_n − R_{n−1})⟩²   and   ⟨w_n(x), R_n⟩² .

These expressions have the advantage of reducing the required number of
integrations, per PDS search, of the functional that contains most of the
nonlinear optimization parameters. This translates into reduced CPU time and
memory requirements. However, the coefficient c_n is present in both of the
expressions, unlike in Eq. (6.1). Also, with this method of weighted residuals
approach the implementation of the boundary conditions may not be
straightforward, especially if they are of the Neumann type, and the alternative
approach may face solution uniqueness problems. These are relatively minor
disadvantages that are present in most applications of conventional computational
mechanics.

The use of PDS is not required by the proposed method; it is utilized only for
convenience, since it is a derivative-free nonlinear search technique that can be
implemented on parallel and distributed computers. With additional effort from
the user, better convergence can be found with conventional calculus-based search
techniques, such as steepest descent and the conjugate gradient method. For
complex computational domains, PDS can be run on a serial machine to provide
an initial guess to be used by the conventional calculus-based search technique.
Complex computational domains can be accommodated by an equality constraint
through the use of a crude unstructured grid. For example,

  Ψ(x) = P (1 − Σ_{i=1}^{N} ψ_i(x))² ,

where ψ_i are the bases of the crude unstructured grid and P is a large positive
penalty. The function Ψ is either 0, if the geometric parameters are within the
computational domain Ω, or Ψ = P otherwise. The function Ψ can be added to
the objective function during optimization.

If the user has a desire to increase the resolution of the solution approximation
within some subdomain D of the problem domain Ω, a straightforward penalty
function approach can be used within the context of the optimal incremental
approximation method. For example, we can write

  −⟨w_n(x), Γ(x)(R_n − R_{n−1})⟩²   and   ⟨w_n(x), Γ(x) R_n⟩² ,

where Γ(x) = 10^m, with m > 0 if x ∈ D and m = 0 otherwise.
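A one-dimensional sketch of the domain-indicator penalty above (my illustration, not code from the report): hat functions on a crude grid over an assumed domain Ω = [0.2, 0.8] form a partition of unity inside Ω, so Ψ(x) = P(1 − Σψ_i(x))² vanishes for admissible locations and rises to P away from the domain. The grid spacing and the value of P are arbitrary choices.

```python
import numpy as np

nodes = np.linspace(0.2, 0.8, 7)      # crude grid over the domain Omega = [0.2, 0.8]
h = nodes[1] - nodes[0]
P = 1.0e6                             # large positive penalty

def psi(i, x):
    """Hat basis on the crude grid; the full set is a partition of unity on Omega."""
    return np.maximum(0.0, 1.0 - abs(x - nodes[i]) / h)

def penalty(x):
    """Psi(x) = P (1 - sum_i psi_i(x))^2: ~0 inside Omega, P far outside."""
    s = sum(psi(i, x) for i in range(len(nodes)))
    return P * (1.0 - s) ** 2

print(penalty(0.5), penalty(0.05))    # ~0 inside the domain, P outside
```

Adding this term to the variational objective makes basis parameters that drift outside the admissible region prohibitively expensive without changing the optimum inside it.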
6.2.4 Compact Bases
As mentioned previously, traditional transfer functions can be used by the
proposed method when they are combined to form compact bases [60]. For
example, the technique used to augment radial bases can be used with the
hyperbolic tangent,

  Φ_A(x) = tanh(κ ξ(x)) / tanh(κ) ,
  Φ_B(x) = tanh(κ (ξ(x) − 1)) / tanh(κ) + 1 ,                       (6.3)

where ξ is the same as that defined in Eq. (4.14). The bases Φ_A(x) and Φ_B(x)
are similar to the augmented radial basis function for positive and negative values
of κ, respectively. Note that, like the augmented radial basis, both Φ_A(x) → ξ(x)
and Φ_B(x) → ξ(x) as κ → 0.
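The reconstructed pair in Eq. (6.3) can be checked numerically. In the sketch below, ξ(x) is assumed to be a B1-spline (hat) profile for illustration only; the report defines ξ in Eq. (4.14). Both bases keep the endpoint values of ξ and reduce to ξ(x) as κ → 0.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 201)
# assumed stand-in for xi(x): a B1-spline ramp, 0 at the ends, 1 at the center
xi = np.maximum(0.0, 1.0 - np.abs(x - 0.5) / 0.5)

def phi_A(kappa):
    """Phi_A(x) = tanh(kappa xi(x)) / tanh(kappa), Eq. (6.3)."""
    return np.tanh(kappa * xi) / np.tanh(kappa)

def phi_B(kappa):
    """Phi_B(x) = tanh(kappa (xi(x) - 1)) / tanh(kappa) + 1, Eq. (6.3)."""
    return np.tanh(kappa * (xi - 1.0)) / np.tanh(kappa) + 1.0

# kappa -> 0 recovers the underlying basis xi(x)
print(np.max(np.abs(phi_A(1e-6) - xi)), np.max(np.abs(phi_B(1e-6) - xi)))
```

For larger κ the two profiles bow toward the sharpened shapes of the augmented radial basis, while the compact support inherited from ξ keeps the boundary conditions easy to enforce.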
Future research will address the previously mentioned items and will also focus on
the extension of the proposed method to three-dimensional and/or unsteady
problems.
Appendix A
A.1 Norms, Inner Products, and Orthogonality
In order to determine the "best approximation" of a (solution) function, it is
necessary to define a metric which measures the distance between the exact and
the approximate functions [9]. Any linear space of functions (or vectors) Q, i.e., a
set S of elements obeying the addition and multiplication operations according to
the usual rules of arithmetic, is called normed if a real-valued function ||·|| exists
and has the following properties:

(a) ||f|| > 0 if f ≠ 0                       (positivity)
(b) ||αf|| = |α| ||f||                       (homogeneity)
(c) ||f + g|| ≤ ||f|| + ||g||                (triangle inequality),

where f, g ∈ Q and α ∈ ℝ. The metric for two functions f and g of this space is
then given by ||f − g||. Since a large number of different norms can be defined, it
is obvious that the concepts of best approximation and convergence depend on the
norm choice. For example, the definition of the norm

  ||f(x)|| = max_{a ≤ x ≤ b} |f(x)|                                  (A.1)

for a function f(x) defined on an interval [a, b] enables the discussion of uniform
convergence.
However, it is often useful to associate the norm with an operation defined as the
inner product. The inner product between two functions f and g is denoted by
⟨f, g⟩ and has the following properties:

(a) ⟨f, f⟩ > 0 if f ≠ 0                      (positivity)
(b) ⟨f, g⟩ = ⟨g, f⟩                          (symmetry)
(c) ⟨f, αg + βh⟩ = α⟨f, g⟩ + β⟨f, h⟩         (linearity),

where α, β ∈ ℝ. Then, the norm of the inner-product space is automatically
defined as

  ||f|| = +√⟨f, f⟩                                                   (A.2)

and the following properties hold:

(a) |⟨f, g⟩| ≤ ||f|| ||g||                    (Cauchy-Schwarz inequality)
(b) ||f + g|| ≤ ||f|| + ||g||                 (triangle inequality)
(c) ||f + g||² + ||f − g||² = 2(||f||² + ||g||²)   (parallelogram law)
(d) ⟨f, g⟩ = 0 ⇒ ||f + g||² = ||f||² + ||g||²       (Pythagorean law).

Assuming x can be multidimensional, if f(x) is continuous on some domain Ω̄,
where Ω̄ = Ω ∪ ∂Ω with ∂Ω being the boundary of Ω, and square-integrable, i.e.,
∫_Ω f(x)² dx < ∞, the Hilbert space L₂(Ω) is defined, and the associated inner
product and L₂-norm are given by

  ⟨f, f⟩ = ∫_Ω (f(x))² dx = ||f||²_{L₂} .                            (A.3)
This is a quite popular inner product definition, and it is used throughout this
report. Two functions f and g are said to be orthogonal if ⟨f, g⟩ = 0. In terms of
definition (A.3), two functions f(x) and g(x) are said to be orthogonal with
respect to some function w(x) on some domain Ω if

  ⟨f, g⟩_w = ∫_Ω f(x) g(x) w(x) dx = 0 .                             (A.4)

If an orthogonal set of functions has the additional property ||f_i|| = 1, ∀i, it is
said to be orthonormal. Therefore, the functions of the set share the cardinality
property

  ⟨f_i, f_j⟩ = δ_ij = { 1 if i = j ;  0 if i ≠ j },                  (A.5)

where δ_ij is the Kronecker delta.
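As a concrete check of definitions (A.3)-(A.5): the functions f_i(x) = √2 sin(iπx) are orthonormal on Ω = [0, 1] under the L₂ inner product. The snippet below verifies the Kronecker-delta property numerically; the grid size and the number of basis functions are arbitrary choices.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)

def inner(f, g):
    """L2 inner product <f, g> = int_0^1 f g dx (trapezoidal rule, Eq. A.3)."""
    fg = f * g
    return float(np.sum(0.5 * (fg[1:] + fg[:-1]) * np.diff(x)))

basis = [np.sqrt(2.0) * np.sin(i * np.pi * x) for i in range(1, 5)]
gram = np.array([[inner(fi, fj) for fj in basis] for fi in basis])
print(np.round(gram, 6))   # approximately the 4x4 identity: <f_i, f_j> = delta_ij
```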
Finally, note that a complete normed linear space is called a Banach space, while
an inner product space that is complete in the associated norm is called a Hilbert
space. A space S is said to be complete if every Cauchy sequence
{x_n : n = 1, 2, …} with x_i ∈ S converges to some x ∈ S. A sequence is Cauchy if
lim_{n,m→∞} ||x_n − x_m|| = 0. Hilbert spaces are widely used in computational
mechanics, especially in finite element methods [45]. A Hilbert space H^k(Ω)
consists of functions that are, together with their k-th derivatives,
square-integrable on Ω. The associated inner product is then defined as

  ⟨f, f⟩_{H^k(Ω)} = ∫_Ω ((f(x))² + (f′(x))² + ⋯ + (f^{(k)}(x))²) dx
                 = ||f||²_{H^k(Ω)} ,                                 (A.6)

where f^{(k)}(x) is the k-th derivative of f(x). A Hilbert space H₀^k(Ω) consists of
functions that are, together with their k-th derivatives, square-integrable on Ω
and vanish on its boundary ∂Ω. Hilbert spaces of the latter form are very useful
for problems associated with homogeneous Dirichlet boundary conditions.
A.2 B-splines and Gaussian Functions
The widely used B1-splines were introduced in Section 3.4.2, Eq. (3.9). Figure
A.1(a) illustrates a B1-spline with center x_M = 1.0, width to the left x_l = 0.5,
and width to the right x_r = 0.5.

Another popular B-spline family consists of the B3-splines, also called cubic
splines. Cubic splines are defined in terms of five parameters, compared to the
three parameters that are needed for the definition of the B1-splines:

  B₃(x) = (x − x_M + x_l1 + x_l2)³ / (x_l2)³
              if x_M − x_l1 − x_l2 ≤ x ≤ x_M − x_l1 ;
          [(x_l1)³ + 3(x_l1)²(x − x_M + x_l1) + 3 x_l1 (x − x_M + x_l1)²
              − 3(x − x_M + x_l1)³] / (x_l1)³
              if x_M − x_l1 ≤ x ≤ x_M ;
          [(x_r1)³ + 3(x_r1)²(x_M + x_r1 − x) + 3 x_r1 (x_M + x_r1 − x)²
              − 3(x_M + x_r1 − x)³] / (x_r1)³
              if x_M ≤ x ≤ x_M + x_r1 ;
          (x_M + x_r1 + x_r2 − x)³ / (x_r2)³
              if x_M + x_r1 ≤ x ≤ x_M + x_r1 + x_r2 ;
          0 otherwise.                                               (A.7)

A B3-spline with center x_M = 1.0, first width to the left x_l1 = 0.25, second
width to the left x_l2 = 0.25, first width to the right x_r1 = 0.25, and second
width to the right x_r2 = 0.25 is plotted in Fig. A.1(b). The maximum value of a
B3-spline is 4.0, while the maximum value of the B1-spline was 1.0; the latter
value occurs for the B3-spline at the locations of the two new parameters, namely
at x_M − x_l1 and x_M + x_r1. The advantages offered by the use of B3-splines
include the improved adjustment for nonlinear approximations and the
availability of up to third-order derivatives.
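The reconstructed definition (A.7) can be exercised directly. The function below is my transcription of the piecewise formula, evaluated with the parameter values used for Fig. A.1(b); it reproduces the quoted properties: a maximum of 4.0 at x_M and the value 1.0 at x_M − x_l1 and x_M + x_r1.

```python
def b3(x, xM, xl1, xl2, xr1, xr2):
    """Cubic B3-spline of Eq. (A.7), defined by its center and four widths."""
    if xM - xl1 - xl2 <= x <= xM - xl1:            # outer left cubic piece
        return (x - xM + xl1 + xl2) ** 3 / xl2 ** 3
    if xM - xl1 <= x <= xM:                        # inner left piece
        t = x - xM + xl1
        return (xl1**3 + 3 * xl1**2 * t + 3 * xl1 * t**2 - 3 * t**3) / xl1**3
    if xM <= x <= xM + xr1:                        # inner right piece (mirrored)
        t = xM + xr1 - x
        return (xr1**3 + 3 * xr1**2 * t + 3 * xr1 * t**2 - 3 * t**3) / xr1**3
    if xM + xr1 <= x <= xM + xr1 + xr2:            # outer right cubic piece
        return (xM + xr1 + xr2 - x) ** 3 / xr2 ** 3
    return 0.0                                     # compact support

p = (1.0, 0.25, 0.25, 0.25, 0.25)                  # parameters of Fig. A.1(b)
print(b3(1.0, *p), b3(0.75, *p), b3(1.25, *p), b3(0.4, *p))  # -> 4.0 1.0 1.0 0.0
```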
Finally, the most popular of the radial basis functions is the well-known Gaussian
function, defined in terms of two parameters, its center and its radius [71],

  G(x) = e^{−ρ²} ,  where ρ = ||x − x_M|| / r .                      (A.8)

The Gaussian function is symmetric and can be employed for nonlinear
approximations by exploiting its nonlinear nature and its simplicity, due to the
low number of parameters that are to be optimized. However, its symmetry
decreases its flexibility. A Gaussian function with center x_M = 1.5 and radius
r = 0.5 is depicted in Fig. A.2(a). Figure A.2(b) is an example of the augmented
radial basis function used in this report, for the reason given in Section 4.1.

Figure A.1: B-splines: (a) a B1-spline and (b) a B3-spline.

Figure A.2: Radial bases: (a) Gaussian function with center x_M = 1.5 and
radius r = 0.5 and (b) an augmented radial basis with κ_L = κ_R = 10⁻⁶ and
with κ_L = 2.5, κ_R = −2.5.

For the numerical examples considered in this report, B1-splines and augmented
radial bases were employed. This choice was made for the reasons that follow.
Firstly, B1-splines are quite easy to use and to implement computationally. In
addition, they depend on three and six parameters for the one- and
two-dimensional cases, respectively; in this manner, the size of the optimization
problem is kept as low as possible. Secondly, the derivatives of B3-splines can
achieve relatively large values, or "spikes", which are a source of numerical
instabilities. Moreover, they depend on five and ten parameters for the one- and
two-dimensional cases, respectively. Accordingly, not only is the size of the
associated optimization problems increased, but the computational representation
of the optimization variables becomes more difficult, as will be shown in
Section A.5. Finally, Gaussian functions depend on only two and three parameters
to be optimized for the one- and two-dimensional cases, respectively. However,
the domain of influence (or support) of the functions cannot be as readily
controlled as for the B1-splines. Consequently, the necessary constraints for
enforcing homogeneous Dirichlet boundary conditions may not be as
straightforward to implement as for the B1-splines, as will be illustrated in
Section A.5. In this regard, Gaussian functions are more useful in problems
associated with Neumann boundary conditions, where the basis functions do not
have to vanish at certain boundaries.
A.3 Members of the Method of Weighted Residuals
The members of the method of weighted residuals can be classified according to
the choice of the weighting function as follows.

1. The subdomain method, where the domain of interest Ω is divided into n
smaller subdomains ω_i, i = 1, …, n, and the weighting functions w_i(x) are given
by

  w_i(x) = { 1 if x ∈ ω_i ;  0 if x ∉ ω_i }.                         (A.9)

2. The collocation method, where the weighting function is given by the Dirac
delta function, w_i = δ(x − x_i), which possesses the following property:

  ∫_Ω δ(x − x_i) dx = { 1 if x = x_i ;  0 if x ≠ x_i }.              (A.10)

In this manner,

  ⟨R(x), w_i(x)⟩ = ∫_Ω R(x) w_i(x) dx = R|_{x_i} .                   (A.11)

3. The least-squares method, where the weighting function w_i is given by

  w_i = ∂R / ∂c_i ,                                                  (A.12)

where c_i is the i-th coefficient of the series expansion.

4. The method of moments, where the weighting functions are given by the
monomials

  w_i = x^i ,  i = 0, …, n − 1 .                                     (A.13)

5. The Bubnov-Galerkin method, where the weighting functions are given by the
basis (or trial) functions used in the series expansion of the approximate solution,
i.e., w_i = φ_i. A modification of the form w_i = g_i ≠ φ_i forms the
Petrov-Galerkin method.
A.4 Classification of Differential Operators
Differential operators are classified into linear and nonlinear operators. The
adjoint H* of an operator H is found by integrating ∫_Ω v H[f] dx by parts to
obtain ∫_Ω f H*[v] dx + (B.T.), where the term (B.T.) denotes the boundary
terms stemming from the performed integration(s) by parts. Linear operators are
classified into self- and nonself-adjoint operators. A linear operator is said to be
self-adjoint if L[·] = L*[·]. The adjoint of a nonlinear operator N cannot be the
same as the operator N; the term adjoint is therefore inappropriate for nonlinear
operators, and it is substituted by the term self-sufficient. If the operator is
nonlinear, the Fréchet differential must be examined. The Fréchet differential
N′_w[z] of a nonlinear operator N is defined as [25] (page 301)

  N′_w[z] ≡ lim_{ε→0} (N[w + εz] − N[w]) / ε
          = ∂(N[w + εz]) / ∂ε |_{ε=0} ,                              (A.14)

where N′_w is the Fréchet derivative, with respect to the function w, of the
operator N operating on z. If the Fréchet derivative is symmetric, i.e.,

  ∫_Ω v N′_w[z] dx = ∫_Ω z N′_w[v] dx ,                              (A.15)

then the operator N is said to be self-sufficient. Equation (A.15) is the condition
necessary for the existence of a functional that gives the operator N as its
gradient [86]. It is the analog, extended to function spaces, of the condition
required for a vector field to be derivable from a potential [25] (page 300). Note
that the case of linear operators is in fact a special case of nonlinear operators,
since the Fréchet derivative is given by the operator itself:

  L′_w[z] ≡ lim_{ε→0} (L[w + εz] − L[w]) / ε
          = lim_{ε→0} (L[w] + εL[z] − L[w]) / ε = L[z] .             (A.16)

Variational principles are readily available for problems governed by self-adjoint
differential operators.

A differential operator L is said to be positive definite if

  ∫_Ω f L[f] dx > 0 ,  f ≠ 0 .                                       (A.17)

Elliptic operators are self-adjoint and positive definite, in contrast to hyperbolic
or parabolic operators. Steady-state problems typically yield elliptic operators,
while time-dependent problems are typically associated with hyperbolic or
parabolic operators.
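The symmetry condition (A.15) can be checked numerically for a concrete nonlinear operator. For N[u] = -u'' + u³ the Fréchet derivative is N′_w[z] = -z'' + 3w²z, and with test functions vanishing at the boundary the two integrals agree; the operator, grid, and test functions below are illustrative assumptions, not taken from the report.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)

def integrate(f):
    """Trapezoidal rule on [0, 1]."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

def second_derivative(f):
    return np.gradient(np.gradient(f, x), x)

def frechet(w, z):
    """Frechet derivative of N[u] = -u'' + u^3 at w, acting on z (Eq. A.14)."""
    return -second_derivative(z) + 3.0 * w**2 * z

w = np.exp(x)                 # arbitrary state about which we linearize
v = x * (1.0 - x)             # test functions vanishing at x = 0 and x = 1
z = np.sin(np.pi * x)

lhs = integrate(v * frechet(w, z))
rhs = integrate(z * frechet(w, v))
print(lhs, rhs)               # nearly equal: the operator is "self-sufficient"
```

The agreement reflects Eq. (A.15): integrating -z''v by parts twice with vanishing boundary terms returns -v''z, and the multiplicative term 3w²vz is symmetric by inspection.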
A.5 PDS Computational Issues
A.5.1 The Formulation of the Optimization Variables
The computational representation of the parameters to be optimized is neither
unique nor trivial. For the B1-splines, for example, one option for defining the
PDS optimization variables in terms of the parameters to be optimized is given by
Eq. (3.10). For this representation,

  x_M = x(1) ,  x_l = x(2) ,  and  x_r = x(3) ,                      (A.18)

where x(1), x(2), and x(3) are the PDS variables. An alternative formulation
could be the following:

  x_M = x(1) ,  x_L = x(2) ,  and  x_R = x(3) ,                      (A.19)
where x_L = x_M − x_l and x_R = x_M + x_r.

The first formulation requires the additional satisfaction of constraints, since the
functions to be chosen have to satisfy homogeneous boundary conditions, i.e.,
vanish at the boundaries. Specifically, the constraints

  x(1) − x(2) ≥ x_{Ω,L}   and   x(1) + x(3) ≤ x_{Ω,R}                (A.20)

need to be included in the PDS optimization problem formulation, where x_{Ω,L}
and x_{Ω,R} denote the left and right boundaries of the one-dimensional domain,
respectively. This is not necessary for the second representation, where the
homogeneous boundary conditions can be readily satisfied by appropriate bounds,
which PDS requires in all cases for the optimization variables. However, the first
representation was proven, by means of computational experiments, to be more
efficient than the second.

It is obvious that with an increasing number of parameters, as is the case for
B3-splines, the number of choices increases exponentially, and a numerical study
should be performed to determine the optimal representation.

Lastly, note that an adequately large lower bound on the width parameters is
necessary to avoid relatively large values in the evaluation of the derivatives of
the basis functions. A typical value for the lower bound is 1% of the width of the
computational domain.
A.5.2 Scaling
An issue of significance in direct search optimization methods is the correct
scaling among the variables. The objective of scaling is to avoid uneven searches
in the directions of the optimization variables that constitute the dimensionality
of the search hyperplane.

PDS offers the option of enforcing diverse scaling factors for the optimization
variables in a very simple and user-friendly manner. The scaling factor of each
variable is given by the range of values that this variable can take. For example, if
the variable x(1) is likely to take any value between -3 and 6, its scaling factor is
9. Scaling is unnecessary, obviously, if all optimization variables have
approximately the same scaling factor, in which case PDS assigns a scaling factor
of magnitude 1 to each variable. A scaling factor of magnitude 1 is also to be
selected in the absence of information.
A.6 Diagonal Dominance and Spectral Radius
A matrix A ∈ ℝ^{N×N} is said to be diagonally dominant if, for all i = 1, …, N,

  |a_ii| > Σ_{j=1, j≠i}^{N} |a_ij| .                                 (A.21)

The spectral radius of the matrix A is given by

  ρ(A) = max_i |λ_i| ,  i = 1, …, N ,                                (A.22)

where λ_i are the eigenvalues of the matrix A, i.e., the roots of the characteristic
polynomial det(A − λI) = 0, with I being the N × N identity matrix.
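Both definitions are one-liners in practice. The snippet below checks (A.21) row by row and computes (A.22) from the eigenvalues; the example matrix is an arbitrary choice.

```python
import numpy as np

def is_diagonally_dominant(A):
    """Checks |a_ii| > sum_{j != i} |a_ij| for every row i (Eq. A.21)."""
    A = np.asarray(A, dtype=float)
    off = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    return bool(np.all(np.abs(np.diag(A)) > off))

def spectral_radius(A):
    """rho(A) = max_i |lambda_i| (Eq. A.22)."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])
print(is_diagonally_dominant(A), spectral_radius(A))
```

Note that, by the Gershgorin circle theorem, the spectral radius of this matrix is bounded by the largest row sum of absolute values, which the test below also confirms.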
A.7 Numerical Integration
All numerical integrations were performed by means of ESSL library routines.
ESSL is a numerical library provided on the IBM SP2 parallel computer, on
which all codes presented in this report were executed. In particular, the double
precision routines DGLNQ and DGLNQ2, which utilize the Gauss-Legendre quadrature
rule for one- and two-dimensional numerical integration, respectively, were used.

As mentioned in the main text of the report, the optimally chosen basis functions
are allowed to overlap; a structured grid is not built to provide conveniently
determined integration bounds. In addition, B1-splines are not uniformly defined,
but can take three different values according to their parameters and position in
space. In this regard, it is necessary to develop algorithms for the efficient and
accurate numerical integration of the variational integrals.
In particular, two algorithms were developed. The first algorithm can be
employed when the structure of the functional allows the decomposition of the
variational integral into simpler integrals, where the summations from the series
expansions can be taken outside the integrals. For example, when the following
algebraic manipulation can be performed for tensor-product bases
φ_i(x, y) = φ_i(x) φ_i(y),

  ∫∫ u^a_n(x, y) ∂u^a_n(x, y)/∂x dx dy
    = ∫∫ [Σ_{i=1}^{n} c_i φ_i(x) φ_i(y)] [Σ_{j=1}^{n} c_j dφ_j(x)/dx φ_j(y)] dx dy
    = Σ_{i=1}^{n} Σ_{j=1}^{n} c_i c_j ∫∫ φ_i(x) φ_i(y) dφ_j(x)/dx φ_j(y) dx dy
    = Σ_{i=1}^{n} Σ_{j=1}^{n} c_i c_j ∫ φ_i(x) dφ_j(x)/dx dx ∫ φ_i(y) φ_j(y) dy ,  (A.23)

only integrals of products of two (or more) one-dimensional basis functions
and/or their derivatives have to be computed. The algorithm used to determine
the number m of the necessary subintegrations and the associated bounds a_j and
b_j, j = 1, …, m, for n basis functions (B1-splines), assuming that the bounds of
the domain are x_1 and x_2, is the following:
Algorithm 1
  Find the maximum x_{L,max} among the x_{L,i}, i = 1, …, n;
  Find the minimum x_{R,min} among the x_{R,i}, i = 1, …, n;
  If x_{L,max} < x_1, set x_{L,max} = x_1;
  If x_{R,min} > x_2, set x_{R,min} = x_2;
  Set m = 0;
  If x_{L,max} ≥ x_{R,min}, stop;
  For i = 1, …, n
    If x_{L,max} < x_{M,i} < x_{R,min}
      m ← m + 1;
      a_m = x_{L,max}; b_m = x_{M,i};
      Set x_{L,max} = x_{M,i};
    End if;
  End for;
  m ← m + 1;
  a_m = x_{L,max}; b_m = x_{R,min}.
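Algorithm 1 can be transcribed directly. The function below is my sketch of the reconstructed pseudocode, not the report's implementation: given the supports [x_{L,i}, x_{R,i}] and centers x_{M,i} of n B1-splines, it returns the subintegration bounds (a_j, b_j) that cover the region where all the splines overlap, split at each spline center so every piece of the integrand is a single polynomial.

```python
def subintervals(xL, xM, xR, x1, x2):
    """Algorithm 1: integration bounds for a product of n B1-splines on [x1, x2]."""
    xLmax = max(xL)                 # left end of the common support
    xRmin = min(xR)                 # right end of the common support
    xLmax = max(xLmax, x1)          # clamp to the domain boundaries
    xRmin = min(xRmin, x2)
    if xLmax >= xRmin:              # empty overlap: the product vanishes
        return []
    bounds = []
    for c in sorted(xM):            # split at each center inside the overlap
        if xLmax < c < xRmin:       # (centers sorted for generality)
            bounds.append((xLmax, c))
            xLmax = c
    bounds.append((xLmax, xRmin))
    return bounds

print(subintervals([0.0, 0.2], [0.3, 0.5], [0.6, 0.8], 0.0, 1.0))
# -> [(0.2, 0.3), (0.3, 0.5), (0.5, 0.6)]
```

On each returned subinterval the integrand is smooth, so a fixed Gauss-Legendre rule (DGLNQ in the report) integrates it to full accuracy.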
In the case that the variational integral cannot be simplified and/or decomposed
as in Eq. (A.23), the approximate functions have to be numerically evaluated on
an adequate number of Gauss points so that the integral can be calculated
accurately. Both the unstructured and relatively complicated overlapping of the
basis functions and the change of their definition over certain spatial domains
prevent the numerical integration from being efficiently and accurately
performed, even for a relatively high number of Gauss points. A domain
decomposition, according to the basis function parameters and their locations, is
necessary to yield dependable numerical results. An algorithm was designed for
this purpose and utilized for the two-dimensional integration in the compressible,
inviscid, irrotational flow application. Assuming that the optimal sequential
approximation has reached step n and that the variational integral I^n has to be
evaluated within a rectangular domain of interest with boundaries x_1, x_2, y_1,
and y_2, the integral is decomposed into (n_xsub − 1)(n_ysub − 1) subintegrals
I^n_{i,j}, i = 1, …, n_xsub − 1, j = 1, …, n_ysub − 1, and evaluated with the help
of the following algorithm:

Algorithm 2
  Merge the n-dimensional arrays x_L, x_M, and x_R into a 3n-dimensional
    array called xcopy;
  Merge the n-dimensional arrays y_L, y_M, and y_R into a 3n-dimensional
    array called ycopy;
  Sort the 3n-dimensional arrays xcopy and ycopy;
  Set n_xsub = 0;
  If xcopy(1) ≤ x_1
    n_xsub ← n_xsub + 1;
    xsub(n_xsub) = x_1;
  End if;
  For i = 1, …, 3n
    If (x_1 < xcopy(i) < x_2) and (xcopy(i) ≠ xsub(n_xsub))
      n_xsub ← n_xsub + 1;
      xsub(n_xsub) = xcopy(i);
    End if;
  End for;
  If xcopy(3n) ≥ x_2
    n_xsub ← n_xsub + 1;
    xsub(n_xsub) = x_2;
  End if;
  Set n_ysub = 0;
  If ycopy(1) ≤ y_1
    n_ysub ← n_ysub + 1;
    ysub(n_ysub) = y_1;
  End if;
  For i = 1, …, 3n
    If (y_1 < ycopy(i) < y_2) and (ycopy(i) ≠ ysub(n_ysub))
      n_ysub ← n_ysub + 1;
      ysub(n_ysub) = ycopy(i);
    End if;
  End for;
  If ycopy(3n) ≥ y_2
    n_ysub ← n_ysub + 1;
    ysub(n_ysub) = y_2;
  End if;
  Set I^n = 0;
  For i = 1, …, n_xsub − 1
    For j = 1, …, n_ysub − 1
      Compute I^n_{i,j};
      I^n ← I^n + I^n_{i,j};
    End for;
  End for.
Bibliography
[1] J.D. Anderson, Jr. Modern Compressible Flow. McGraw Hill, 1990. Pages
242-250.
[2] R. Barrett, M. Berry, T.F. Chan, J. Demmel, J. Donato, J. Dongarra,
V. Eijkhout, R. Pozo, C. Romine, and H. van der Vorst. Templates for the
Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM
Publications, 1994. Pages 5-7.
[3] A.R. Barron. "Universal approximation bounds for superpositions of a
sigmoidal function". IEEE Transactions on Information Theory, 39(3):930-945,
1993.
[4] H. Bateman. "Irrotational motion of a compressible inviscid fluid". Proc. Nat.
Acad. Sci., 16:816-825, 1930.
[5] M. Becker. The Principles and Applications of Variational Methods. MIT
Press, 1964. Research Monograph No. 27, pages 7-10.
[6] C.M. Bishop. "Curvature-driven smoothing: A learning algorithm for
feedforward networks". IEEE Transactions on Neural Networks, 4(5):882-884,
1993.
[7] G.E.P. Box and N.R. Draper. Empirical Model-Building and Response
Surfaces. Wiley, New York, 1997.
[8] D.S. Broomhead and D. Lowe. "Multi-variable functional interpolation and
adaptive networks". Complex Systems, 2:321-355, 1988.
[9] E.W. Cheney. Approximation Theory. McGraw Hill, 1966. Pages 3-6.
[10] S. Cox and M. Overton. "Designing for optimal energy absorption - III:
Numerical minimization of the spectral abscissa". Structural Optimization,
13(1):17-22, February 1997. Journal of the International Society for Structural
and Multidisciplinary Optimization (ISSMO), Springer Verlag.
[11] G. Cybenko. "Approximation by superposition of a sigmoidal function".
Mathematics of Control, Signals, and Systems, 2:303-314, 1989.
[12] A.J. Davies. The Finite Element Method, a First Approach. Clarendon Press,
1980. Pages 48-58.
[13] L. Demkowicz and J.T. Oden. "On a mesh optimization method based on a
minimization of interpolation error". Int. J. Engng. Sci., 24(1):55-68, 1986.
[14] J.E. Dennis, Jr. and V. Torczon. "Direct search methods on parallel
machines". SIAM Journal of Optimization, 1(4):448-474, November 1991.
[15] R.A. DeVore. "Degree of nonlinear approximation". In C.K. Chui, L.L.
Schumaker, and J.D. Ward, editors, Approximation Theory VI, pages 175-201.
Academic Press, 1989.
[16] A.R. Diaz, N. Kikuchi, and J.E. Taylor. "A method of grid optimization for
finite element methods". Computer Methods in Applied Mechanics and
Engineering, 41:29-45, 1983.
[17] A.R. Diaz, N. Kikuchi, and J.E. Taylor. "Optimal design formulations for
finite element grid adaptation". In V. Komokov, editor, Lecture Notes in
Mathematics: Sensitivity of Functionals with Applications in Engineering Science.
Springer Verlag, Berlin, 1984.
[18] E.O. Doebelin. Engineering Experimentation: Planning, Execution,
Reporting. McGraw Hill, 1995.
[19] J. Duchon. Spline Minimizing Rotation-Invariant Semi-Norms in Sobolev
Spaces, Constructive Theory of Functions of Several Variables, Lecture Notes in
Mathematics, Vol. 571. Springer Verlag, Berlin, 1977.
[20] N. Dyn. "Interpolation and approximation by radial and related functions".
In C.K. Chui, L.L. Schumaker, and J.D. Ward, editors, Approximation
Theory VI, pages 211{234. Vol.1, Academic Press, 1989.
[21] N. Dyn, D. Levin, and S. Rippa. \Numerical procedure for surface tting of
scattered data by radial functions". SIAM J. Sci. Stat. Comput., 7(2):639{
659, 1986.
[22] P.R. Eiseman, Y.K. Choo, and R.E. Smith. \Algebraic grid generation with
control points". In T.J. Chung, editor, Finite Elements in Fluids, Volume 8.
Hemisphere Publ. Corp., 1992.
[23] C.A. Felippa. \Optimization of nite element grids by direct energy search".
Appl. Math. Modelling, 1:93{96, 1976.
[24] C.A. Felippa. \Numerical experiments in nite element grid optimization by
direct energy search". Appl. Math. Modelling, 1:239{244, 1977.
[25] B.A. Finlayson. The Method of Weighted Residuals and Variational Principles. Academic Press, 1972.
[26] C.A.J. Fletcher. Computational Galerkin Methods. Springer, New York, 1984.
[27] C.A.J. Fletcher. Computational Techniques for Fluid Dynamics, volume 1.
Springer Verlag, 1988. Pages 293-299.
122
[28] R. Franke. \Scattered data interpolation: Tests of some methods". Mathematics of Computation, 38:181{200, 1982.
[29] F. Girosi, M. Jones, and T. Poggio. \Regularization theory and neural network
architectures". Neural Computation, 7:219{269, 1995.
[30] F. Girosi and T. Poggio. \Networks and the best approximation property".
Biological Cybernetics, 63:169{176., 1990.
[31] D. Greenspan. \On approximating extremals of functionals - I The method
and examples for boundary value problems". Bull. Int. Comp. Centre,
4(2):99{120, April-June 1965. University of Rome.
[32] D. Greenspan. \On approximating extremals of functionals - II Theory and
generalizations related to boundary value problems for nonlinear di erential
equations". Int. J. Engng Sci., 5:571{588, 1967.
[33] D. Greenspan and P.C. Jain. \Application of a method for approximating
extremals of functionals to compressible ow". Mathematical Analysis and
Applications, 18(1):85{111, April 1967. Also available as technical report of
the Mathematical Research Center of the University of Wisconsin, TR 597,
January 1966.
[34] G.L. Guymon. \A nite element solution of the one-dimensional di usionconvection equation". Water Resources Research, 6(1):204{210, February
1970.
[35] G.L. Guymon, V.H. Scott, and L.R. Herrmann. \A general numerical solution
of the two-dimensional di usion-convection equation by the nite element
method". Water Resources Research, 6(6):1611{1617, December 1970.
123
[36] O. Hassan, K. Morgan, and J. Peraire. \Finite-element solution scheme for
compressible viscous ow". In T.J. Chung, editor, Finite Elements in Fluids,
Volume 8. Hemisphere Publ. Corp., 1992.
[37] R. Hecht-Nielsen. Neurocomputing. Addison-Wesley, Inc., New York, 1990.
[38] L. Holmstrom and P. Koistinen. \Using additive noise in back-propagation
training". IEEE Transaction on Neural Networks, 3:24{38., 1992.
[39] K. Hornik, M. Stinchcombe, and H. White. \Multi-layer feedforward networks
are universal approximators". Neural Networks, 2:359{366, 1989.
[40] K. Hornik, M. Stinchcombe, and H. White. \Multilayer feedforward networks
are universal approximators". Neural Networks, 2:359{366., 1989.
[41] K. Hornik, M. Stinchcombe, and H. White. \Universal approximation of an
unknown mapping and its derivatives using multilayer feedforward networks".
Neural Networks, 3:551{560., 1990.
[42] I. Iami. \On the ow of a compressible uid past a circular cylinder". In Proc.
Phys. - Math. Soc. Japan, pages 180{193, 1941. Ser. 3, volume 23.
[43] Y. Ito. \Representation of functions by superpositions of a step or sigmoidal
function and their applications to neural network theory". Neural Networks,
4:385{394., 1991.
[44] P.S. Jensen. \Advances in computer methods for partial di erential methods".
In R. Vivhnevestjy, editor, AICA Symposium, pages 80{85, Lehigh University,
1975.
[45] C. Johnson. Numerical Solution of Partial Di erential Equations by the Finite
Element Method. Cambridge University Press, 1990. Pages 33-38.
124
[46] L.K. Jones. \Constructive approximations for neural networks by sigmoidal
functions". Proceedings of the IEEE, 78(10):1586{1589, 1990.
[47] L.K. Jones. \A simple lemma on greedy approximation in Hilbert space and
convergence rates for projection pursuit regression and neural network training". The Annals of Statistics, 20(1):608{613, 1992.
[48] G.S. Kimeldorf and G. Wahba. \A correspondence between bayesian estimation on stochastic processes and smoothing by splines". Annals of Mathematical Statistics, 41(2):495{502, 1970.
[49] D. Kincaid and W. Cheney. Numerical Analysis. Brooks and Cole, 1990.
Pages 181-182.
[50] M. Krizek and P. Neittaanmaeki. Finite Element Approximation of Variational Problems and Applications. Longman Scienti c & Technical, 1990.
Page 14.
[51] C. Lanczos. The Variational Principles in Mechanics. University of Toronto
Press, 1952.
[52] R.M. Lewis and V. Torczon. \Pattern search algorithms for bound constrained
minimization". Technical Report 96-20, ICASE, NASA Langley Research
Center, Hampton, Virginia, 1996. Submitted to SIAM Journal of Optimization.
[53] P.E. Lush and T.M. Cherry. \The variational method in hydrodynamics". J.
Mech. Appl. Math., 9:6{21, 1956.
[54] A.R. Manwell. \A variational principle for steady homenergic compressible
ow with nite shocks". Wave motion, 2(1):83{95, 1980.
125
[55] G. Matheron. \Principles of geostatistics". Economic Geology, 58:1246{1266.,
1963.
[56] G. Matheron. \The intrinsic random functions and their applications". Adv.
Appl. Prob., 5:439{468., 1973.
[57] A.J. Meade, Jr. and A.A. Fernandez. \Solution of non-linear ordinary di erential equations by feedforward neural networks". Mathematical and Computer
Modelling, 20(9):19{44, 1994.
[58] A.J. Meade, Jr., M. Kokkolaras, and B.A. Zeldin. \Optimal incremental approximation for the adaptive solution of di erential equations". International
Journal for Numerical Methods in Engineering, 1997. Submitted.
[59] A.J. Meade, Jr., M. Kokkolaras, and B.A. Zeldin. \Sequential function approximation for the solution of di erential equations". Communications in
Numerical Methods in Engineering, 13:977{986, 1997.
[60] A.J. Meade, Jr. and B.A. Zeldin. \Approximation properties of local bases
assembled from neural network transfer functions". Mathematical and Computer Modelling, 1997. To appear.
[61] A.J. Meade, Jr. and B.A. Zeldin. \Establishing criteria to ensure successful feedforward arti cial neural network modelling of mechanical systems".
Mathematical and Computer Modelling, 1997. To appear.
[62] J. Meinguet. \Multivariate interpolation at arbitrary points made simple". J.
Appl. Math. Phys., 30:292{304., 1979.
[63] H.N. Mhaskar. \Neural networks for optimal approximation of smooth and
analytic functions". Neural Computation, 8:164{177., 1996.
126
[64] S.G. Mikhlin. Variational Methods in Mathematical Physics. MacMillan Pergamon, 1964.
[65] P.M. Morse and H. Feshbach. Methods of Theoretical Physics, volume 1.
McGraw-Hill, 1953. Pages 276-280.
[66] R.H. Myers, A.I. Khuri, and W.H. Carter, Jr. \Response surface methodology:
1966-1988". Technometrics, 31(2):137{157, 1989.
[67] R.H. Myers and D.C. Montgomery. Response Surface Methodology: Process
and Product Optimization using Designed Experiments. John Wiley & Sons,
Inc. New York, 1995.
[68] J.T. Oden. Variational Methods in Theoretical Mechanics. Springer Verlag,
1976.
[69] J.T. Oden. \Adaptive methods in computational uid dynamics". In T.J.
Chung, editor, Finite Elements in Fluids, Volume 8. Hemisphere Publ. Corp.,
1992.
[70] E.R. Oliveira. \Optimization of nite element solutions". In Proc. Third
Conf. Matrix Methods in Struct. Mech., Ohio, 1971. Air Force Flight Dynamics
Laboratory, Wright-Patterson AFB. AFFDL TR 71-160.
[71] M.J. Orr. \Regularization in the selection of radial basis function centres".
Neural Computation, 7(3):606{623., 1995.
[72] J. Park and I.W. Sanberg. \Universal approximation using radial-basisfunction networks". Neural Computation, 3:246{257., 1991.
[73] J. Platt. \A resource-allocating network for function interpolation". Neural
Computation, 3:213{225, 1991.
127
[74] T. Poggio and F. Girosi. \Networks for approximation and learning". Proceedings of the IEEE, 78(9):1481{1497., 1990.
[75] T. Poggio and F. Girosi. \Regularization algorithms for learning that are
equivalent to multilayer networks". Science, 247:978{982., 1990.
[76] R.J. Prozan. \A variational principle for compressible uid mechanics - Discussion of the multi-dimensional theory". Technical Report 3614, NASA,
1982.
[77] R.J. Prozan. \A variational principle for compressible uid mechanics - Discussion of the one-dimensional theory". Technical Report 3526, NASA, April
1982.
[78] K. Rektorys. Variational Methods in Mathematics, Science, and Engineering.
Dordrecht, 1980. 2nd edition.
[79] D. Rumelhart, G.E. Hinton, and R.J. Williams. \Learning internal representations by error propagation". In D. Rumelhart, J.L. McClelland, and the
PDP Research Group, editors, Parallel Distributed Processing: Exploration in
the Microstructure of Cognition, pages 318{362. MIT Press, Cambridge, MA,
1986.
[80] S. Saarinen, R. Bramley, and G. Cybenko. \Ill-conditioning in neural network
training problems". SIAM J. Sci. Comput., 14(3):693{714., 1993.
[81] J. Sacks, W.J. Welch, T.J. Mitchell, and J.P. Wynn. \Design and analysis of
computer experiments". Statistical Science, 4(4):409{435, 1989.
[82] W.F. Stoecker. Design of Thermal Systems. McGraw-Hill, New York, 1989.
Pages 189-190.
128
[83] A.N. Tikhonov and V.Y. Arsenin. Solution of Ill-Posed Problems. W.H.
Winston, Washington, D.C,, 1977.
[84] V. Torczon. \PDS: Direct search methods for unconstrained optimization on
either sequential or parallel machines". Technical Report TR 92206, CRPC,
March 1992.
[85] V. Torczon. \On the convergence of pattern search methods". SIAM Journal
of Optimization, February 1997.
[86] M.M. Vainberg. Variational Methods for the Study of Nonlinear Operators.
Holden-Day Inc., 1964. Pages 54-58.
[87] G. Wahba. Spline Models for Observational Data. SIAM, Philadelphia, 1990.
[88] W.J. Welch, R.J. Buck, J. Sacks, H.P. Wynn, T.J. Mitchell, and M.D. Morris.
\Screening, predicting, and computer experiments". Technometrics, 34(1):15{
25., 1992.
[89] R.C. Williamson and U. Helmke. \Existence and uniqueness results for neural
network approximations". IEEE Transactions on Neural Networks, 6(1):2{13.,
1995.
[90] O.C. Zienkiewicz and R.L. Taylor. The Finite Element Method. McGraw-Hill,
1989. Pages 240-242.