Proceedings of the 3rd INFORMS Workshop on Data Mining and Health Informatics (DM-HI 2008)
J. Li, D. Aleman, R. Sikora, eds.

ANALYSIS OF SENSORY TRANSFORMATIONS IN RESPONSE TO COMPLEX SENSORY STIMULI IN THE AUDITORY PATHWAY

Alexander Elman, Nan Kong*, Edward Bartlett, Kevin J. Otto
Weldon School of Biomedical Engineering
Purdue University
West Lafayette, IN
nkong@purdue.edu

Abstract

We develop a computational modeling framework to study information processing in the auditory pathway. Complex sounds consisting of a series of frequency modulated sweeps were used as test stimuli for our system. Frequency modulation is an important factor in sound recognition and speech perception. Human behavioral data demonstrated that perception of the modulation direction (UP or DOWN) could be easily controlled in the stimuli by manipulating a single parameter. To analyze the representations of these stimuli in the peripheral auditory system, a two-stage process was employed. First, the stimuli were decomposed by a well-accepted model of the auditory nerve to produce simulated auditory nerve outputs. Next, a neural network was developed to classify whether a given stimulus was going UP or DOWN. These data were compared to human behavioral data and neural activity in the rat auditory cortex. Outputs from the neural network model demonstrated that neurons could classify frequency change properly for simple stimuli but introduced significant bias errors in their classifications for more difficult stimuli. These results suggest ways by which pattern recognition techniques can be used to disambiguate neural responses that share some common characteristics.

Keywords: auditory nerve, neural networks, medial geniculate, neural coding transformation

1. Introduction

Biological sensory systems represent the physical world with a complex network of neurons that signal to one another via rapid electrical pulses called action potentials, or spikes.
Although the neural spike codes for simple stimuli in peripheral sensory systems are understood relatively well, the neural coding transformations that take place between the auditory nerve and the auditory cortex, where conscious perception of sound originates, are poorly understood. In this research, we use a computational modeling framework to study how information is processed in the auditory pathway, including the auditory nerve, the auditory cortex, and its main sensory input, the auditory thalamus. We integrate psychophysical and electrophysiological techniques to collect data that will inspire computational models (see Figure 1). Our research goal is to develop a neural network computational model that successfully maps auditory stimuli to neural and behavioral responses and that reveals the neural coding transformations that occur between different auditory nuclei. Our first purpose is to develop approaches with which pattern recognition techniques can be used to disambiguate neural responses that share some common characteristics (average spike rate, for example). Furthermore, understanding why the analysis techniques succeed will provide insight into the underlying neural mechanisms.

Figure 1: Schematic description of our research approach

Despite employing very different decision-making processes, animal intelligence and artificial intelligence share the goal of making optimal decisions efficiently in complex and stressful environments. Many machine sensors are advanced pattern recognition devices that will generate consistent decisions for repetitions of the same pattern. Although this is often useful, it does not provide the flexibility to adapt decisions to changing situational context and unpredictable events.
Generally speaking, artificially engineered algorithms are very successful on certain well-crafted problems in which the system and its parameters are relatively well known, but they do poorly at complex decision tasks under uncertainty, such as speech recognition in natural environments. Unlike machines, humans and animals are able to make rapid, reasonably accurate decisions and adapt their behaviors to maximize benefits in uncertain environments. Our vision for this line of research is to develop a theoretical framework for understanding the decision-making algorithms used by neural systems. Ideally, one would like to be able to predict behavior in response to a wide stimulus set in a wide range of behavioral contexts based on the neural activities of a relatively small number of neurons. For reasonably complex stimuli and environments, this becomes quite difficult because the perception of a stimulus may be shaped by factors such as the stimulus history and the consequences of potential behaviors in response to the stimulus.

2. Experimental Design

Stimulus design

Our basic experimental task is to discriminate upward changes in frequency from downward changes in frequency. This flexible paradigm allows for the use of a wide variety of sound stimuli, including FM sweeps, tone sequences, and harmonic complexes. It also allows for excellent control over the stimulus complexity. The basic stimulus used was the “miniFM” stimulus (see Figures 2A and 2B), which is a series of FM sweeps with semi-random starting times that produce a continually shifting set of temporally overlapping FM sweeps (see Figure 2B). In terms of how they stimulate the peripheral receptor organs, complex FM stimuli were designed to be analogous to the moving dot patterns that have been used successfully to study neural correlates of visual perception [1]. FM sweeps were also chosen because proper perception of FM stimuli is critical for speech perception in realistic conditions [2].

Figure 2: MiniFM stimulus and electrophysiology
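As a rough illustration of the stimulus structure described above, the following sketch sums several FM sweeps with random onset times. It is not the authors' stimulus generator: linear sweeps are an assumption, the function names `fm_sweep` and `mini_fm` are hypothetical, and every numeric value (sweep count, durations, start-frequency range, sweep rate) is a placeholder, since the paper does not report them.

```python
import math
import random


def fm_sweep(f_start, rate_hz_per_s, duration, fs):
    """Return samples of one linear FM sweep.

    Instantaneous frequency is f_start + rate_hz_per_s * t, so the phase is
    its integral: 2*pi*(f_start*t + 0.5*rate_hz_per_s*t**2).
    A positive rate sweeps UP, a negative rate sweeps DOWN.
    """
    n = int(round(duration * fs))
    out = []
    for i in range(n):
        t = i / fs
        phase = 2.0 * math.pi * (f_start * t + 0.5 * rate_hz_per_s * t * t)
        out.append(math.sin(phase))
    return out


def mini_fm(n_sweeps, total_dur, sweep_dur, rate_hz_per_s, fs, seed=0):
    """Sum n_sweeps sweeps with random onsets so that they overlap in time."""
    rng = random.Random(seed)
    n_total = int(round(total_dur * fs))
    mix = [0.0] * n_total
    for _ in range(n_sweeps):
        onset = rng.uniform(0.0, total_dur - sweep_dur)
        f0 = rng.uniform(1000.0, 4000.0)  # hypothetical start-frequency range
        start = int(onset * fs)
        for j, s in enumerate(fm_sweep(f0, rate_hz_per_s, sweep_dur, fs)):
            if start + j < n_total:
                mix[start + j] += s
    return mix


# Placeholder parameters: five overlapping upward sweeps in a 100 ms window.
stimulus = mini_fm(n_sweeps=5, total_dur=0.1, sweep_dur=0.02,
                   rate_hz_per_s=50000.0, fs=16000)
```

Flipping the sign of `rate_hz_per_s` for a chosen fraction of the sweeps would be one way to vary the proportion of sweeps that share a common direction.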
Electrophysiological Data

Electrophysiological recordings were performed in rat primary auditory cortex in response to the miniFM stimulus. Extracellular electrophysiological recordings from a single microelectrode site are shown in Figure 2C. The evoked recordings were made 12 days post-surgery in response to the stimulus shown in Figure 2A. From the response to 20 repetitions of the stimulus, spike density functions were estimated from the action potential firing patterns (solid black lines, Figure 2D). A correlation coefficient was calculated between the spike density function and the spectral power of each frequency band of the stimulus. The spectral power for 18,613 Hz is shown as the dotted line. The correlation coefficient between the spike density function and this spectral power function is 0.5, suggesting that acoustic selectivity accounted for some but not all of the response variability at this recording site.

Psychophysical Data

Test paradigm. The testing paradigm utilized four different sets of parameters in order to systematically vary the uncertainty. Once subjects had mastered discriminating Up100 from Down100 stimuli (see Figure 2), stimulus uncertainty was introduced by adding a controlled amount of FM sweep components whose sweep rates differed significantly from the primary sweep rate. As Figure 3 demonstrates, human listeners have difficulty discriminating upward from downward sweeps when stimulus uncertainty is present in the form of interfering stimuli. Task difficulty is easily controlled through the proportion of FM sweeps that share a common rate of frequency change: as the proportion of FM sweeps with the same sweep rate decreases, the task becomes more difficult. To investigate the task difficulty of miniFM and learning capabilities in humans, two subjects were tested for their ability to discriminate upward from downward miniFM stimuli.
Subjects were not given verbal instructions, and the response panel consisted simply of an UP button and a DOWN button. Subjects were informed whether they responded correctly or not on each trial. These conditions were a partial attempt to simulate the limited information available to rats during training. During the first training block (Figure 3, Training Block 1) the subjects did poorly on the miniFM discrimination, with performance only somewhat better than chance (Table 1, top row). A second training block (Figure 3, Training Block 2) significantly improved their performance and decreased their reaction time, indicating that they were able to improve their discrimination abilities with training (Table 1, middle row). A third set of stimuli was a pattern that was repeated 10 times (Figure 3, Pattern Block). Both subjects significantly improved their performance and decreased their reaction times compared to Training Block 2 (Table 1, bottom row). Both subjects were aware that stimuli repeated in a predictable pattern, but neither could explicitly report the pattern. These data suggest that behavior can be substantially improved when stimuli are predictable, that is, when stimulus uncertainty is low.

Figure 3 and Table 1: (Training Block 1) Psychophysical data from two naïve human subjects tested with Up vs. Down sweeps, 10 trials/stimulus. (Training Block 2) Two days later, subjects were retested and significantly improved their responses and shortened their reaction times (p<0.01, χ2 and ranksum tests). (Pattern Block) Subjects were also tested with 10 repeated stimulus blocks that progressed from Up100 to Down100 in 12 steps. Subjects significantly improved their responses and shortened their reaction times (p<0.01, χ2 and ranksum tests).

3. Modeling

3.1. Auditory Nerve Model

Synaptic data were obtained using a feline auditory periphery model developed by Heinz and Bruce et al.
[3] to characterize the response of mammalian auditory-nerve (AN) fibers to high-level stimuli. The model's accuracy rests on its faithful representation of the component 1/component 2 (C1/C2) transition and peak-splitting phenomena [4]. The model consists of eight separate processing blocks that accept an input stimulus as a sound pressure waveform and produce spike trains from a model of the IHC-AN synapse. The first block is the middle-ear (ME) filter, which accepts an instantaneous pressure waveform of the miniFM stimulus. The ME filter is a fifth-order filter discretized using the bilinear transformation at a sampling rate of 500 kHz. The output of the ME filter is fed into a parallel-path tenth-order C2 filter, a signal-path tenth-order C1 chirping filter, and a feed-forward control path. The C1 and C2 filters include transduction functions that model behaviors of the basilar membrane such as throttling frequency selectivity. The outputs of the C1 and C2 transduction functions are summed and passed through the seventh-order low-pass filter of the inner hair cell (IHC) block. The output of the IHC block is passed to the input of the IHC-AN synapse block. The IHC-AN synapse model consists of a nonlinear time-varying three-store diffusion model. The terminating block serves as a discharge generator that outputs spike times by a renewal process driven by the synapse output [3].

3.2. Computational Model Development

A computational model was developed to relate the synaptic output data of the AN model to the sweep direction of the miniFM input stimulus. A neural network (NN) was chosen to characterize the input/output relationship between stimulation and synaptic response because of its ability to quickly discern patterns in nonlinear, temporally organized data [5].
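To make the idea of a discharge generator driven by the synapse output concrete, here is a deliberately simplified sketch. It is not the generator used in the AN model of [3] and [4]: it assumes a Poisson process with a fixed absolute dead time (one simple renewal-type process), and the function name `generate_spikes`, the 0.75 ms dead time, and the constant 200 spikes/s drive are all hypothetical choices made for illustration.

```python
import random


def generate_spikes(rate_hz, dt, dead_time=0.00075, seed=0):
    """Draw spike times from a firing-rate waveform sampled every dt seconds.

    In each time step a spike occurs with probability rate*dt, unless the
    previous spike was less than dead_time seconds ago (absolute refractory
    period). Both dead_time and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    spikes = []
    last = -float("inf")
    for i, r in enumerate(rate_hz):
        t = i * dt
        if t - last < dead_time:
            continue  # still refractory, no spike possible in this step
        if rng.random() < r * dt:
            spikes.append(t)
            last = t
    return spikes


# Placeholder drive: a constant 200 spikes/s for 0.5 s at a 10 microsecond step.
spike_times = generate_spikes([200.0] * 50000, dt=1e-5)
```

Replacing the constant list with the synapse output sampled at the model's rate would give one spike train per characteristic frequency, which could then be converted to firing rates for the classifier.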
Although a NN model does not itself yield insight into how biological coding is performed, it is a useful tool for interpreting the neural coding transformations between auditory nuclei.

3.2.1. Description of the neural network

The NN described in this experiment is a three-layer feed-forward perceptron network trained with a standard supervised error back-propagation algorithm. Backpropagation is an appropriate training choice for this particular model because a well-trained backpropagation network generalizes to inputs that were not part of its training set. The input vector consists of selected characteristic frequency (CF) slices of firing rates (spikes/s) from the synaptic output of the AN model in response to the miniFM sweeps. The training vector consists of binary values indicating whether the input sweep is up (1) or down (0). During the training session, the network undergoes dynamic weight and bias adjustments until the mean squared error is reduced to a chosen goal. Training terminates when the training goal is reached. To prevent noise from degrading the quality of the NN model, the synaptic outputs for each characteristic frequency and sweep count are summed to create ten 0.5 s bins. Besides reducing noise, binning the spike rates decreases the complexity of the system and improves training speed and performance. The Levenberg-Marquardt training algorithm was employed since it approximates the Hessian matrix instead of computing it directly, saving computation time at the expense of memory [5]. The hidden layer consisted of twenty neurons, each with a linear transfer function. The output layer utilized a tan-sigmoid transfer function, allowing output values to converge to a binary value.

Figure 4: NN reported % of “Up” responses as a function of miniFM direction and coherence. Lower CF have biases toward up sweep responses.

3.2.2.
Implementation of the neural network model

The computational work was performed in MATLAB (The Mathworks, Natick, MA) running on a Windows-based (Microsoft, Redmond, WA) PC. Twelve miniFM stimuli were fed into the AN model and an m-file was created for each stimulus. Each m-file held the synaptic firing rate data as a matrix of 57 spike-train vectors, one per characteristic frequency (CF) on an equally spaced continuum from 125 Hz to 16 kHz. From this synaptic output matrix, a representative set of seven different CF fibers was extracted and the values were summed into ten 0.5 s bins. The Neural Network Toolbox™ within MATLAB was used to automate the creation of the NN. Four training sessions were performed, and in each session a NN was trained for each of the 7 CFs. Sweep pairs of firing rate vectors served as the input vectors and the binary values identifying sweep direction were used as the training vector. Training was carried out over 100 trials for each CF. The NN output is stochastic because the toolbox uses pseudo-random values to initialize the network weights and biases before training. Training session 1 consisted of the 100up/100down pair. Each subsequent training session added the next highest sweep pair; training session 4, the final session, included all pairs except the 10up/10down and 5up/5down pairs. Each training session was tested against all twelve miniFM stimuli.

4. Results

The results are reported as the percentage of “Up” responses made to the FM stimuli. The accuracy of the NN varied over the continuum of CF values. The NN classified sweep direction reliably for sweep coherence between 25% and 100%. The data show that the output representing the low frequency CF of 0.648 kHz was biased towards responding with up responses to difficult downward sweeps (Down5, Down10) but was able to discriminate the direction of more coherent stimuli.
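For readers who want to reproduce the preprocessing, the following sketch builds the 57-point CF grid and sums a firing-rate vector into ten bins. The paper says only that the CFs are "equally spaced"; logarithmic spacing is assumed here because it reproduces the CF values quoted in this section (about 0.648 kHz and 8.72 kHz). The function names `cf_grid` and `bin_sum` and the toy input are hypothetical, and the number of samples per 0.5 s bin depends on the AN model's output rate, which is not restated here.

```python
import math


def cf_grid(f_lo=125.0, f_hi=16000.0, n=57):
    """Return n characteristic frequencies evenly spaced on a log axis.

    Log spacing is an assumption; with these limits it gives 8 CFs per octave.
    """
    step = math.log(f_hi / f_lo) / (n - 1)
    return [f_lo * math.exp(k * step) for k in range(n)]


def bin_sum(rates, n_bins=10):
    """Sum a firing-rate vector into n_bins contiguous, equal-width bins."""
    if len(rates) % n_bins != 0:
        raise ValueError("vector length must be a multiple of n_bins")
    width = len(rates) // n_bins
    return [sum(rates[i * width:(i + 1) * width]) for i in range(n_bins)]


cfs = cf_grid()
low_cf = cfs[19]   # close to the 0.648 kHz fiber discussed in the results
high_cf = cfs[49]  # close to the 8.72 kHz fiber discussed in the results

# Toy firing-rate vector (20 samples) reduced to 10 binned inputs for the NN.
features = bin_sum(list(range(20)))
```

Under this assumed spacing, the seven CF fibers used for training would correspond to seven indices into `cfs`; which indices were chosen is not stated in the paper beyond the values reported in the results.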
Interestingly, each individual fiber had an individual bias, as represented by the comparison of two CFs in Figure 4. The CF of 8.72 kHz is biased towards responding down to difficult upward sweeps. The change in bias occurs between 3.6 and 8.7 kHz.

5. Conclusions and Future Work

Our results demonstrate the feasibility of investigating computational models that predict behavioral and neural data. Future work will extend these results to include rat behavioral data and more complete electrophysiological responses, and will compare different computational methods. Our goals are twofold. First, we can use these models to gain insight into the neural mechanisms responsible for representing the sensory and behavioral variables. Second, we can apply these insights to the generation of advanced decision-making and machine learning algorithms that can correctly identify patterns under substantial uncertainty.

References

1. Kajikawa, Y., et al., 2005, “A Comparison of Neuron Response Properties in Areas A1 and CM of the Marmoset Monkey Auditory Cortex: Tones and Broadband Noise,” Journal of Neurophysiology, 93(1), 22-34.
2. Kajikawa, Y., et al., 2008, “Coding of FM Sweep Trains and Twitter Calls in Area CM of Marmoset Auditory Cortex,” Hearing Research, 238(1-2), 107-125.
3. Zhang, X., Heinz, M.G., Bruce, I.C., and Carney, L.H., 2001, “A Phenomenological Model for the Responses of Auditory-nerve Fibers: I. Nonlinear Tuning with Compression and Suppression,” Journal of the Acoustical Society of America, 109(2), 648-670.
4. Zilany, M.S.A. and Bruce, I.C., 2006, “Modeling Auditory-nerve Responses for High Sound Pressure Levels in the Normal and Impaired Auditory Periphery,” Journal of the Acoustical Society of America, 120(3), 1446-1466.
5. Bishop, C. M., 1995, Neural Networks for Pattern Recognition, Oxford University Press, New York.