Stochastic Optimization (Tim Hesterberg, organizer)


Tim Hesterberg (Insightful, Inc.)
One-pass Logistic Regression for Large Data Sets

Thursday 2:00-2:20, San Marino

Abstract:

Stochastic approximation (SA) is normally used for optimizing functions measured with error, e.g. for maximizing likelihood when the likelihood is approximated using Monte Carlo simulation. But SA ideas can also be useful in deterministic problems. For example, we describe a one-pass logistic regression approach for large data, using SA as an updating method when processing a large data set in blocks.



Jing Wang (Louisiana State University)
John Monahan (Department of Statistics, North Carolina State University)
Applying Stochastic Approximation to Nonlinear Mixed Effects Models

Thursday 2:20-2:40, San Marino

Abstract:

Nonlinear mixed effects models have become more frequently used for analyzing the data consisting of repeated measurements on subjects at several different times or under different experimental conditions in pharmacokinetics, pharmacodynamics, growth, and other studies. Maximum likelihood estimation in the nonlinear mixed effects models presents the challenge of maximizing an integral that can only be approximated numerically. We consider applying stochastic approximation methods, combining a noisy search with numerical integration using importance sampling. We compare this approach with other methods in a simulation study using models arising in pharmacokinetics.



Jim Garrett (Becton Dickinson)
Optimizing Out-Of-Bag Model Performance with Mixed-Input SPSA

Thursday 2:40-3:00, San Marino

Abstract:

Simultaneous Perturbation Stochastic Approximation (SPSA) is a stochastic optimization algorithm that is efficient and effective for many problems with continuous inputs. This article applies ideas from Design of Experiments (DOE) to SPSA, expanding its applicability to problems with both continuous and ordered discrete inputs. This "mixed-input" SPSA is applied to model-selection in order to optimize an "Out-Of-Bag" prediction performance estimate.



Abraham Flaxman (Department of Mathematical Sciences Carnegie-Mellon University)
Online Convex Optimization in the Bandit Setting: Gradient Descent without a Gradient

Thursday 3:00-3:20, San Marino

Abstract:

We study a general online convex optimization problem. We have a convex set S and an unknown sequence of cost functions c_1,c_2,..., and in each round, we choose a feasible point x_t in S, and learn the cost c_t(x_t). If the function c_t is also revealed after each round then, as Zinkevich has shown, an online gradient descent algorithm is effective even if the functions are chosen adversarially. That is, after n rounds, the total cost incurred will be O(sqrt(n)) more than the cost of the best single feasible point chosen with the benefit of hindsight, min_{x in S} c_1(x)+c_2(x)+...c_n(x). We present this result and then extend it to the "bandit" setting where, in each period, only the cost c_t(x_t) is revealed, and bound the expected regret as O(n^{3/4}).

Our approach uses a simple approximation of the gradient that is computed from evaluating c_t at a single (random) point. We show that this biased estimate is sufficient to approximate gradient descent on the sequence of functions. In other words, it is possible to use gradient descent in the online setting without seeing anything more than the value of the functions at a single point.