Title: Black-box Zeroth Order Optimization (in Theory and in Practice)
Abstract: In this talk, we will consider the problem of maximizing a black-box function via noisy and costly queries from a theoretical perspective (a lot of it) as well as applications (an exciting bit). We first motivate the problem by considering a wide variety of engineering design applications from the heuristic optimization of wireless networks to hardware acceleration to neural network architecture search.
In the second part of the talk, we consider the problem in a Bayesian framework with a Gaussian Process prior. In particular, a new algorithm for this problem is proposed, and high probability bounds on its simple and cumulative regret are established. The query point selection rule in most existing methods involves an exhaustive search over an increasingly fine sequence of uniform discretizations of the input space, hence an exponential (with dimension) complexity . Our proposed algorithm, in contrast, adaptively refines the domain. leading to a lower computational complexity (linear in dimension). In addition to the computational gains, sufficient conditions are identified under which the regret bounds of the new algorithm improve upon the known results. We end this segment of the talk by providing an algorithm independent lower bound and, hence, quantifying possible benefits when noisy gradient information is available.
In the third part of the talk, we build on the intuition provided by our work in the Bayesian setting to consider the problem in a non-Bayesian setting where the objective function is assumed to have a kernel representation as well as being smooth. In particular, we propose a variant of our algorithm –augmenting the Gaussian Process based surrogate with local polynomial estimators— with improved regret over prior work. Most notably, our algorithm closes the gap to the (information theoretic) regret lower bound for the class of widely used and practically relevant Matern family of Kernels.
This is joint work with my PhD student Shekhar Shubhanshu.
Bio: Tara Javidi received her BS in electrical engineering at Sharif University of Technology, Tehran, Iran. She received her MS degrees in electrical engineering (systems) and in applied mathematics (stochastic analysis) from the University of Michigan, Ann Arbor as well as her Ph.D. in electrical engineering and computer science. From 2002 to 2004, Tara Javidi was an assistant professor at the Electrical Engineering Department, University of Washington, Seattle. In 2005, she joined the University of California, San Diego, where she is currently a professor of electrical and computer engineering and a founding co-director of the Center for Machine-Integrated Computing and Security (MICS).
Tara Javidi’s research interests are in the theory of active learning, active hypothesis testing, information theory with feedback, stochastic control theory, and wireless and communication networks.