International Conference on Acoustics, Speech, and Signal Processing, 2011
Inspired by bacterial motility, we propose an algorithm for adaptation over networks with mobile nodes. The nodes have limited abilities and are allowed to cooperate with their neighbors to optimize a common objective function. In contrast to traditional adaptation formulations, an important consideration in this work is that the nodes do not know the form of the …
The matrix pair beamformer (MPB) is a promising blind beamformer that exploits the temporal signature of the signal of interest (SOI) to acquire its spatial statistical information. It needs no knowledge of directional information or training sequences. However, the major problem with existing MPBs is that they suffer from serious threshold effects, and the thresholds grow as the …
2012 IEEE 13th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), 2012
ABSTRACT In this work, we consider a distributed beam coordination problem in which a collection of arrays is interconnected by a certain topology. The beamformers employ an adaptive diffusion strategy to compute the beamforming weight vectors, relying solely on cooperation with their local neighbors. We analyze the mean-square-error (MSE) performance of the proposed strategy, including its transient and steady-state behavior. Simulation results support the finding that the MSE performance improves uniformly across the network relative to non-cooperative designs.
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014
ABSTRACT The recent success of deep neural networks (DNNs) in speech recognition can be attributed largely to their ability to extract a specific form of high-level features from raw acoustic data for subsequent sequence classification or recognition tasks. Among the many possible forms of DNN features, which forms are more useful than others, and how effective these DNN features are in connection with different types of downstream sequence recognizers, remain unexplored and are the focus of this paper. We report our recent work on the construction of a diverse set of DNN features, including the vectors extracted from the output layer and from various hidden layers of the DNN. We then apply these features as inputs to four types of classifiers to carry out the identical sequence classification task of phone recognition. The experimental results show that the features derived from the top hidden layer of the DNN perform best for all four classifiers, especially for the autoregressive-moving-average (ARMA) version of a recurrent neural network. The feature vector derived from the DNN's output layer performs slightly worse than the top hidden layer, but better than any of the other hidden layers.
2011 4th IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2011
We develop an iterative diffusion mechanism to optimize a global cost function in a distributed manner over a network of nodes. The cost function is assumed to consist of a collection of individual components, and the diffusion strategy allows the nodes to cooperate and diffuse information in real time. Compared to incremental methods, diffusion methods do not require the use of a …
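The adapt-then-combine diffusion idea summarized above can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: the network size, ring topology, step-size, and linear-regression data model are all assumptions invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 10, 4                      # number of nodes, parameter length (illustrative)
w_true = rng.standard_normal(M)   # common parameter vector the network estimates
mu = 0.01                         # constant step-size

# Uniform combination weights over a ring topology (illustrative choice);
# each column of A sums to one.
A = np.zeros((N, N))
for k in range(N):
    for l in (k - 1, k, (k + 1) % N):
        A[l % N, k] = 1.0 / 3.0

w = np.zeros((N, M))              # one estimate per node
for i in range(2000):
    # Adaptation step: each node runs one stochastic-gradient (LMS) update
    # on its own streaming data.
    psi = np.empty_like(w)
    for k in range(N):
        u = rng.standard_normal(M)                     # regressor at node k
        d = u @ w_true + 0.1 * rng.standard_normal()   # noisy measurement
        psi[k] = w[k] + mu * (d - u @ w[k]) * u
    # Combination step: each node averages the intermediate estimates
    # of its neighbors.
    w = A.T @ psi

print(np.max(np.abs(w - w_true)))  # residual error, small at every node
```

Note the contrast with incremental methods mentioned in the abstract: no cyclic path through the network is needed; each node only exchanges estimates with its immediate neighbors at every iteration.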
2013 IEEE Global Conference on Signal and Information Processing, 2013
ABSTRACT We study the steady-state probability distribution of diffusion and consensus strategies that employ constant step-sizes to enable continuous adaptation and learning. We show that, in the small step-size regime, the estimation error at each agent approaches a Gaussian distribution. More importantly, the covariance matrix of this distribution is shown to coincide with the error covariance matrix that would result from a centralized stochastic-gradient strategy. The results hold for any connected topology and help clarify the convergence and learning behavior of distributed strategies.
2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2012
ABSTRACT Motivated by recent developments in the context of adaptation over networks, this work establishes useful results about the limiting global behavior of diffusion and consensus strategies for the solution of distributed optimization problems. It is known that the choice of combination policy has a direct bearing on the convergence and performance of distributed solutions. This article reveals which aspects of the combination policies determine the nature of the Pareto-optimal solution and how close the distributed solution gets to it. The results suggest useful constructive procedures to control the convergence behavior of distributed strategies and to design effective combination policies.
2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers, 2010
Abstract—Bacteria forage by moving towards nutrient sources in a process known as chemotaxis. The bacteria follow gradient variations by tumbling or moving in straight lines. Both modes of locomotion are affected by Brownian motion. Bacteria are also capable of …
The matrix pair beamformer (MPB) is a blind beamformer. It exploits the temporal structure of the signal of interest (SOI) and applies a generalized eigen-decomposition to a covariance matrix pair. Unlike other blind algorithms, it uses only second-order statistics. A key assumption in previous work is that the two matrices have the same interference statistics. However, this assumption may …
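The core computational step named above — a generalized eigen-decomposition of a covariance matrix pair — can be sketched as follows. This is not the paper's MPB: the array size, SOI direction, and the idealized pair of covariance matrices (one with the SOI present, one without, sharing identical noise statistics) are assumptions made up for the illustration.

```python
import numpy as np
from scipy.linalg import eigh

M = 6                              # number of array sensors (illustrative)
theta = np.deg2rad(20.0)           # SOI direction (illustrative)
# Steering vector of a half-wavelength-spaced uniform linear array.
a = np.exp(1j * np.pi * np.arange(M) * np.sin(theta))

# Two covariance matrices sharing the same interference-plus-noise part,
# differing only in SOI power (the matrix-pair assumption).
Rn = np.eye(M)                           # common noise/interference covariance
Ra = 10.0 * np.outer(a, a.conj()) + Rn   # SOI present
Rb = Rn                                  # SOI absent

# Generalized eigen-decomposition of the pair (Ra, Rb): eigh returns
# eigenvalues in ascending order, so the last eigenvector is the
# principal one, which serves as the beamforming weight vector.
vals, vecs = eigh(Ra, Rb)
w = vecs[:, -1]

# The weight vector should be (nearly) aligned with the SOI steering vector.
corr = abs(w.conj() @ a) / (np.linalg.norm(w) * np.linalg.norm(a))
print(corr)   # close to 1
```

Since only second-order statistics (the two covariance matrices) enter the computation, this step needs neither training sequences nor directional knowledge, matching the blind character described in the abstract.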
2013 Information Theory and Applications Workshop (ITA), 2013
ABSTRACT We examine the performance of stochastic-gradient learners over connected networks for global optimization problems involving risk functions that are not necessarily quadratic. We consider two well-studied classes of distributed schemes: consensus strategies and diffusion strategies. We quantify how the mean-square error and the convergence rate of the network vary with the combination policy and with the fraction of informed agents. Several combination policies are considered, including doubly stochastic rules, the averaging rule, the Metropolis rule, and the Hastings rule. It is shown that the performance of the network does not necessarily improve with a larger proportion of informed agents; a strategy to counter this degradation in performance is presented.
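Of the combination policies listed above, the Metropolis rule is easy to construct directly from the network topology. A sketch, using an invented 5-node network and the convention that each node's neighborhood includes itself (the per-paper conventions may differ):

```python
import numpy as np

# Adjacency of an illustrative 5-node network.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 3)]
N = 5
neighbors = {k: {k} for k in range(N)}   # node k's neighborhood includes k
for k, l in edges:
    neighbors[k].add(l)
    neighbors[l].add(k)
deg = {k: len(neighbors[k]) for k in range(N)}

# Metropolis rule: a_{lk} = 1/max(n_k, n_l) for neighbors l != k, where
# n_k is the neighborhood size; the self-weight absorbs the remainder so
# that every column sums to one.
A = np.zeros((N, N))
for k in range(N):
    for l in neighbors[k] - {k}:
        A[l, k] = 1.0 / max(deg[k], deg[l])
    A[k, k] = 1.0 - A[:, k].sum()

print(A.sum(axis=0))        # every column sums to one
print(np.allclose(A, A.T))  # symmetric, hence doubly stochastic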
IEEE Journal of Selected Topics in Signal Processing, 2014
ABSTRACT Joint outage identification and state estimation in power systems is studied. A Bayesian framework is employed, and a Gaussian prior distribution on the states is assumed. The joint posterior of the outage hypotheses and the network states is derived in closed form, which can be used to obtain the optimal joint detector and estimator under any given performance criterion. Employing the minimum probability of error as the performance criterion for identifying outages with uncertain states, the optimal detector is obtained. Efficiently computable performance metrics that capture the probability of error of the optimal detector are developed. Under simplified model assumptions, closed-form expressions for these metrics are derived, leading to a mixed-integer convex programming problem for optimizing sensor locations. Using convex relaxations, a branch-and-bound algorithm that finds the globally optimal sensor locations is developed. Simulations show significant performance gains from using the optimal detector with the optimal sensor locations. Furthermore, performance with greedily selected sensor locations is shown to be very close to that with the globally optimal locations.
International Conference on Intelligent Transportation, 2007
Using vehicles as probes is a flexible and low-cost way to obtain real-time traffic information. A key problem in using probe vehicles is determining the vehicles' sampling period and the probe sample size. This paper addresses the sampling issues of using probe vehicles to detect traffic information. An extended Nyquist sampling theorem, based on signal processing theory, is proposed to derive …
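The classical Nyquist criterion underlying the extended theorem mentioned above gives a simple bound on the probe reporting period. A minimal sketch, where the band limit of the traffic-speed signal and the numbers in the example are illustrative assumptions, not values from the paper:

```python
# If the traffic-speed signal on a road link is (approximately) band-limited
# with highest frequency f_max, the classical Nyquist criterion bounds the
# probe sampling period: T <= 1 / (2 * f_max).

def max_sampling_period(f_max_hz: float) -> float:
    """Longest probe sampling period (seconds) that still satisfies Nyquist."""
    if f_max_hz <= 0:
        raise ValueError("f_max must be positive")
    return 1.0 / (2.0 * f_max_hz)

# Example (hypothetical numbers): if traffic speed fluctuates with periods no
# shorter than 10 minutes, then f_max = 1/600 Hz, so probes must report at
# least every 5 minutes.
print(max_sampling_period(1.0 / 600.0))  # about 300 seconds
```

The paper's extension presumably refines this bound with traffic-specific considerations (hence "extended"); the sketch only shows the baseline criterion it builds on.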
ABSTRACT We study the distributed inference task over regression and classification models where the likelihood function is strongly log-concave. We show that diffusion strategies allow the KL divergence between two likelihood functions to converge to zero at the rate 1/(Ni) on average and with high probability, where N is the number of nodes in the network and i is the number of iterations. We derive asymptotic expressions for the expected regularized KL divergence and show that the diffusion strategy can outperform both non-cooperative and conventional centralized strategies, since diffusion implementations can weigh each node's contribution in proportion to its noise level.
ABSTRACT A novel architecture of a recurrent neural network (RNN), integrated with a fully-connected deep neural network (DNN) as its feature extractor, is presented. This deep-RNN is equipped with both causal temporal prediction and non-causal look-ahead, via auto-regression (AR) and moving-average (MA) components, respectively. We describe a primal-dual training method that formulates the learning of RNNs as a formal optimization problem with an inequality constraint that guarantees the stability of the network dynamics. Experimental results demonstrate the effectiveness of this new method, which achieves an 18.86% phone recognition error rate on the core test set of the TIMIT benchmark. The results also show that the ARMA version of the deep-RNN is more effective than the AR version, and that using DNNs to provide a high-level abstraction of the raw filter-bank speech data as the input to the RNN gives a much lower recognition error than not using the DNN.
ABSTRACT In this letter, we focus on the Stansfield localization algorithm, a direction-of-arrival (DoA) fusion algorithm with high accuracy and low complexity. We derive the mean-square error of the Stansfield algorithm with estimated DoA error variance. Our derivation accounts for the statistical variation of the DoA, as well as the impact of received-signal-strength variations and node self-positioning error. In addition, we propose a distributed implementation of the Stansfield algorithm based on diffusion adaptation, which achieves accuracy comparable to its centralized counterpart and reduces total transmit power when the node density is sufficient.
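The geometric idea behind Stansfield-type DoA fusion — intersecting bearing lines from several sensors via least squares — can be sketched as follows. This is the equal-weight, pseudo-linear special case only; the full Stansfield estimator also weights each bearing by its error variance and range, and the sensor and target positions below are invented for the example.

```python
import numpy as np

def bearings_only_fix(sensors, thetas):
    """Pseudo-linear least-squares position fix from DoA bearings.

    sensors: (K, 2) array of sensor positions; thetas: (K,) bearings in
    radians, measured from the x-axis toward the target. Equal-weight
    simplification of a Stansfield-type estimator.
    """
    sensors = np.asarray(sensors, float)
    thetas = np.asarray(thetas, float)
    # Each bearing constrains the target to the line through the sensor
    # with direction (cos t, sin t):  -sin(t)*(x - xs) + cos(t)*(y - ys) = 0.
    A = np.column_stack([-np.sin(thetas), np.cos(thetas)])
    b = A[:, 0] * sensors[:, 0] + A[:, 1] * sensors[:, 1]
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos

# Noise-free check: three sensors observing a target at (3, 4).
sensors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
target = np.array([3.0, 4.0])
thetas = np.arctan2(target[1] - sensors[:, 1], target[0] - sensors[:, 0])
print(bearings_only_fix(sensors, thetas))   # recovers the target position
```

With noisy bearings the same least-squares step applies; the diffusion implementation in the abstract distributes this fusion across nodes instead of collecting all bearings at one center.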
Papers by Jianshu Chen