- Research article, July 2008
Laplace maximum margin Markov networks
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1256–1263. https://doi.org/10.1145/1390156.1390314
We propose Laplace max-margin Markov networks (LapM3N), and a general class of Bayesian M3N (BM3N) of which the LapM3N is a special case with sparse structural bias, for robust structured prediction. BM3N generalizes extant structured prediction rules ...
- Research article, July 2008
Efficient multiclass maximum margin clustering
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1248–1255. https://doi.org/10.1145/1390156.1390313
This paper presents a cutting plane algorithm for multiclass maximum margin clustering (MMC). The proposed algorithm constructs a nested sequence of successively tighter relaxations of the original MMC problem, and each optimization problem in this ...
- Research article, July 2008
Estimating local optimums in EM algorithm over Gaussian mixture model
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1240–1247. https://doi.org/10.1145/1390156.1390312
The EM algorithm is a popular iterative method for estimating the parameters of a Gaussian mixture model from a large observation set. However, in most cases, the EM algorithm is not guaranteed to converge to the global optimum. Instead, it stops at some ...
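The EM iteration this abstract refers to is compact enough to sketch for a one-dimensional Gaussian mixture. This is a generic textbook EM, not the paper's local-optimum analysis; the function and variable names are illustrative:

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=50):
    """Plain EM for a 1-D Gaussian mixture: returns weights, means, variances."""
    n = len(x)
    w = np.full(k, 1.0 / k)                         # mixing weights
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)   # spread initial means over the data
    var = np.full(k, np.var(x))                     # shared initial variance
    for _ in range(iters):
        # E-step: responsibilities r[i, j] = P(component j | x_i), log-sum-exp stabilised
        d = x[:, None] - mu[None, :]
        logp = -0.5 * d**2 / var - 0.5 * np.log(2 * np.pi * var) + np.log(w)
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the soft counts
        nk = r.sum(axis=0)
        w = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu)**2).sum(axis=0) / nk
    return w, mu, var

# Two well-separated components; EM should recover means near 0 and 5.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(5.0, 1.0, 500)])
w, mu, var = em_gmm_1d(x, k=2)
```

Each sweep is guaranteed not to decrease the data log-likelihood, but, as the abstract notes, the fixed point it reaches can be a local rather than global optimum.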
- Research article, July 2008
Improved Nyström low-rank approximation and error analysis
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1232–1239. https://doi.org/10.1145/1390156.1390311
Low-rank matrix approximation is an effective tool for alleviating the memory and computational burdens of kernel methods, and sampling, as the mainstream of such algorithms, has drawn considerable attention in both theory and practice. This paper ...
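The Nyström idea itself is simple to state: sample m landmark columns of the kernel matrix and reconstruct the rest from them. A minimal sketch with uniform sampling and an RBF kernel (illustrative names; this is the baseline scheme, not the paper's improved variant):

```python
import numpy as np

def rbf(X, Y, gamma=0.1):
    """RBF kernel matrix between the rows of X and the rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom(X, m, gamma=0.1, seed=0):
    """Rank-m Nystrom approximation K ~= C W^+ C^T from m sampled landmarks."""
    idx = np.random.default_rng(seed).choice(len(X), size=m, replace=False)
    C = rbf(X, X[idx], gamma)       # n x m cross-kernel block
    W = C[idx]                      # m x m landmark block
    return C @ np.linalg.pinv(W) @ C.T, idx

X = np.random.default_rng(1).normal(size=(200, 5))
K = rbf(X, X)
K_hat, idx = nystrom(X, m=100)
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
```

On the sampled landmark block the approximation is exact (W pinv(W) W = W); the error lives entirely in the unsampled rows and columns, which is what the paper's sampling analysis targets.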
- Research article, July 2008
A quasi-Newton approach to non-smooth convex optimization
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1216–1223. https://doi.org/10.1145/1390156.1390309
We extend the well-known BFGS quasi-Newton method and its limited-memory variant LBFGS to the optimization of non-smooth convex objectives. This is done in a rigorous fashion by generalizing three components of BFGS to subdifferentials: the local ...
- Research article, July 2008
Preconditioned temporal difference learning
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1208–1215. https://doi.org/10.1145/1390156.1390308
This paper extends many of the recent popular policy evaluation algorithms to a generalized framework that includes least-squares temporal difference (LSTD) learning, least-squares policy evaluation (LSPE) and a variant of incremental LSTD (iLSTD). The ...
- Research article, July 2008
Democratic approximation of lexicographic preference models
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1200–1207. https://doi.org/10.1145/1390156.1390307
Previous algorithms for learning lexicographic preference models (LPMs) produce a "best guess" LPM that is consistent with the observations. Our approach is more democratic: we do not commit to a single LPM. Instead, we approximate the target using the ...
- Research article, July 2008
Fully distributed EM for very large datasets
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1184–1191. https://doi.org/10.1145/1390156.1390305
In EM and related algorithms, E-step computations distribute easily, because data items are independent given parameters. For very large data sets, however, even storing all of the parameters in a single node for the M-step can be impractical. We ...
- Research article, July 2008
Efficiently learning linear-linear exponential family predictive representations of state
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1176–1183. https://doi.org/10.1145/1390156.1390304
Exponential Family PSR (EFPSR) models capture stochastic dynamical systems by representing state as the parameters of an exponential family distribution over a short-term window of future observations. They are appealing from a learning perspective ...
- Research article, July 2008
Deep learning via semi-supervised embedding
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1168–1175. https://doi.org/10.1145/1390156.1390303
We show how nonlinear embedding algorithms popular for use with shallow semi-supervised learning techniques such as kernel methods can be applied to deep multilayer architectures, either as a regularizer at the output layer, or on each layer of the ...
- Research article, July 2008
Graph transduction via alternating minimization
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1144–1151. https://doi.org/10.1145/1390156.1390300
Graph transduction methods label input data by learning a classification function that is regularized to exhibit smoothness along a graph over labeled and unlabeled samples. In practice, these algorithms are sensitive to the initial set of labels ...
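The family of methods this abstract builds on can be illustrated with a plain harmonic label-propagation sweep: average neighbour labels over the graph while clamping the labelled nodes. This is the baseline scheme, not the paper's alternating-minimization method, and all names here are illustrative:

```python
import numpy as np

def label_propagation(W, y, labeled, iters=100):
    """Propagate labels over a graph: repeatedly average neighbour labels,
    clamping the labelled nodes each sweep (harmonic-function transduction)."""
    P = W / W.sum(axis=1, keepdims=True)   # row-normalised transition matrix
    f = np.zeros(len(y), dtype=float)
    f[labeled] = y[labeled]
    for _ in range(iters):
        f = P @ f
        f[labeled] = y[labeled]            # keep the known labels fixed
    return np.sign(f)

# Chain graph 0-1-2-3-4-5 with only the two endpoints labelled.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
y = np.array([-1.0, 0.0, 0.0, 0.0, 0.0, 1.0])
pred = label_propagation(W, y, labeled=[0, 5])
```

On the chain, the converged scores interpolate linearly between the two clamped endpoints, so the predicted labels split the chain in the middle. The sensitivity to the initial labels that the abstract mentions is visible here: the clamped values fully determine the solution.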
- Research article, July 2008
Dirichlet component analysis: feature extraction for compositional data
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1128–1135. https://doi.org/10.1145/1390156.1390298
We consider feature extraction (dimensionality reduction) for compositional data, where the data vectors are constrained to be positive and constant-sum. In real-world problems, the data components (variables) usually have complicated "correlations" ...
- Research article, July 2008
Manifold alignment using Procrustes analysis
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1120–1127. https://doi.org/10.1145/1390156.1390297
In this paper we introduce a novel approach to manifold alignment, based on Procrustes analysis. Our approach differs from "semi-supervised alignment" in that it results in a mapping that is defined everywhere; when used with a suitable dimensionality ...
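Classical Procrustes analysis, the building block named in the title, is itself a closed-form computation: after centring, the orthogonal map that best aligns one point set to another comes from an SVD. A minimal sketch (illustrative names; this is the classical step, not the paper's full manifold-alignment pipeline):

```python
import numpy as np

def procrustes_align(X, Y):
    """Orthogonal Procrustes: find rotation Q and scale k minimising
    ||Yc - k * Xc @ Q||_F after centring both point sets."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc.T @ Yc)
    Q = U @ Vt                          # optimal orthogonal map
    k = s.sum() / (Xc ** 2).sum()       # optimal isotropic scale
    return k * Xc @ Q, Q, k

# Recover a known rotation: Y is X rotated by 90 degrees, then shifted.
X = np.random.default_rng(0).normal(size=(50, 2))
R = np.array([[0.0, -1.0], [1.0, 0.0]])
Y = X @ R + 3.0
aligned, Q, k = procrustes_align(X, Y)
```

Because the optimal Q is the orthogonal polar factor of Xc.T @ Yc, the recovered map equals the planted rotation exactly, and the aligned points coincide with the centred targets.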
- Research article, July 2008
Sparse multiscale Gaussian process regression
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1112–1119. https://doi.org/10.1145/1390156.1390296
Most existing sparse Gaussian process (g.p.) models seek computational advantages by basing their computations on a set of m basis functions that are the covariance function of the g.p. with one of its two inputs fixed. We generalise this for the case ...
- Research article, July 2008
Prediction with expert advice for the Brier game
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1104–1111. https://doi.org/10.1145/1390156.1390295
We show that the Brier game of prediction is mixable and find the optimal learning rate and substitution function for it. The resulting prediction algorithm is applied to predict results of football and tennis matches. The theoretical performance ...
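The loss underlying the Brier game is simple to state: the squared distance between a probability forecast and the indicator vector of the realised outcome. A small illustration with hypothetical match probabilities:

```python
def brier_score(forecast, outcome):
    """Brier loss: sum over outcomes o of (forecast[o] - [o == outcome])^2."""
    return sum((p - (o == outcome)) ** 2 for o, p in forecast.items())

# A probability forecast for a football match; the actual result is a home win.
forecast = {"win": 0.5, "draw": 0.3, "loss": 0.2}
loss = brier_score(forecast, "win")   # (0.5-1)^2 + 0.3^2 + 0.2^2 = 0.38
```

Mixability of this game, which the paper establishes, is what lets an aggregating forecaster's cumulative Brier loss track the best expert's loss up to an additive constant.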
- Research article, July 2008
Beam sampling for the infinite hidden Markov model
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1088–1095. https://doi.org/10.1145/1390156.1390293
The infinite hidden Markov model is a non-parametric extension of the widely used hidden Markov model. Our paper introduces a new inference algorithm for the infinite hidden Markov model called beam sampling. Beam sampling combines slice sampling, which ...
- Research article, July 2008
A semiparametric statistical approach to model-free policy evaluation
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1072–1079. https://doi.org/10.1145/1390156.1390291
Reinforcement learning (RL) methods based on least-squares temporal difference (LSTD) have been developed recently and have shown good practical performance. However, the quality of their estimation has not been well elucidated. In this article, we ...
- Research article, July 2008
Training restricted Boltzmann machines using approximations to the likelihood gradient
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1064–1071. https://doi.org/10.1145/1390156.1390290
A new algorithm for training Restricted Boltzmann Machines is introduced. The algorithm, named Persistent Contrastive Divergence, is different from the standard Contrastive Divergence algorithms in that it aims to draw samples from almost exactly the ...
- Research article, July 2008
ν-support vector machine as conditional value-at-risk minimization
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1056–1063. https://doi.org/10.1145/1390156.1390289
The ν-support vector classification (ν-SVC) algorithm was shown to work well and provide intuitive interpretations, e.g., the parameter ν roughly specifies the fraction of support vectors. Although ν corresponds to a fraction, it cannot take the entire ...
- Research article, July 2008
The many faces of optimism: a unifying approach
ICML '08: Proceedings of the 25th international conference on Machine learning, pages 1048–1055. https://doi.org/10.1145/1390156.1390288
The exploration-exploitation dilemma has been an intriguing and unsolved problem within the framework of reinforcement learning. "Optimism in the face of uncertainty" and model building play central roles in advanced exploration methods. Here, we ...