Publication


Featured research published by Aryan Mokhtari.


IEEE Transactions on Signal Processing | 2014

RES: Regularized Stochastic BFGS Algorithm

Aryan Mokhtari; Alejandro Ribeiro

RES, a regularized stochastic version of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton method, is proposed to solve strongly convex optimization problems with stochastic objectives. The use of stochastic gradient descent algorithms is widespread, but the number of iterations required to approximate optimal arguments can be prohibitive in high dimensional problems. Application of second-order methods, on the other hand, is impracticable because the computation of objective function Hessian inverses incurs excessive computational cost. BFGS modifies gradient descent by introducing a Hessian approximation matrix computed from finite gradient differences. RES utilizes stochastic gradients in lieu of deterministic gradients for both the determination of descent directions and the approximation of the objective function's curvature. Since stochastic gradients can be computed at manageable computational cost, RES is realizable and retains the convergence rate advantages of its deterministic counterparts. Convergence results show that lower and upper bounds on the Hessian eigenvalues of the sample functions are sufficient to guarantee almost sure convergence of a subsequence generated by RES and convergence of the sequence in expectation to optimal arguments. Numerical experiments showcase reductions in convergence time relative to stochastic gradient descent algorithms and non-regularized stochastic versions of BFGS. An application of RES to the implementation of support vector machines is developed.
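The update described above can be sketched in a few lines. The toy regularized least-squares objective, step-size schedule, and regularization constant δ below are illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy strongly convex stochastic objective: regularized least squares,
# f(w) = E_i[(a_i^T w - b_i)^2] / 2 + (lam / 2) ||w||^2, one sample per iteration.
A = rng.normal(size=(200, 5))
b = A @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
lam = 0.1

def stoch_grad(w, i):
    return A[i] * (A[i] @ w - b[i]) + lam * w

delta = 0.05          # regularization constant (keeps curvature bounded away from zero)
Binv = np.eye(5)      # inverse Hessian approximation
w = np.zeros(5)

for t in range(500):
    i = rng.integers(200)
    g = stoch_grad(w, i)
    d = -(Binv + delta * np.eye(5)) @ g       # regularized descent direction
    w_new = w + (2.0 / (t + 10)) * d          # diminishing step size
    # Curvature pair from stochastic gradient differences on the SAME sample.
    s = w_new - w
    y = stoch_grad(w_new, i) - g
    y_reg = y - delta * s                     # modified secant keeps Binv well conditioned
    if s @ y_reg > 1e-12:                     # always holds here since lam > delta
        rho = 1.0 / (s @ y_reg)
        I = np.eye(5)
        Binv = (I - rho * np.outer(s, y_reg)) @ Binv @ (I - rho * np.outer(y_reg, s)) \
               + rho * np.outer(s, s)
    w = w_new
```

The key departures from deterministic BFGS are that both the descent direction and the curvature pair use stochastic gradients, and that the modified secant vector y_reg enforces the well-conditioning the abstract refers to.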


IEEE Transactions on Signal Processing | 2016

A Class of Prediction-Correction Methods for Time-Varying Convex Optimization

Andrea Simonetto; Aryan Mokhtari; Alec Koppel; Geert Leus; Alejandro Ribeiro

This paper considers unconstrained convex optimization problems with time-varying objective functions. We propose algorithms with a discrete time-sampling scheme to find and track the solution trajectory based on prediction and correction steps, while sampling the problem data at a constant rate of 1/h, where h is the sampling period. The prediction step is derived by analyzing the iso-residual dynamics of the optimality conditions. The correction step adjusts for the distance between the current prediction and the optimizer at each time step, and consists of either one or multiple gradient or Newton steps, which respectively correspond to the gradient trajectory tracking (GTT) or Newton trajectory tracking (NTT) algorithms. Under suitable conditions, we establish that the asymptotic error incurred by both proposed methods behaves as O(h²), and in some cases as O(h⁴), which outperforms the state-of-the-art error bound of O(h) for correction-only methods when gradient correction steps are used. Moreover, when the characteristics of the objective function variation are not available, we propose approximate gradient and Newton tracking algorithms (AGT and ANT, respectively) that still attain these asymptotic error bounds. Numerical simulations demonstrate the practical utility of the proposed methods and that they improve upon existing techniques by several orders of magnitude.
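A minimal one-dimensional illustration of the prediction-correction idea: the objective f(x; t) = (x - sin t)²/2, the sampling period, and the correction step size below are hypothetical choices, not the paper's experiments. The prediction compensates the drift of the optimality condition before the correction step runs:

```python
import numpy as np

h = 0.1          # sampling period
gamma = 0.5      # gradient-correction step size
T = 200

def grad(x, t):  # gradient of f(x; t) = (x - sin t)^2 / 2, so x*(t) = sin t
    return x - np.sin(t)

x_c, x_pc = 0.0, 0.0     # correction-only vs prediction-correction iterates
err_c, err_pc = [], []
for k in range(T):
    t, t_next = k * h, (k + 1) * h
    # Correction-only baseline: one gradient step on the newly sampled objective.
    x_c = x_c - gamma * grad(x_c, t_next)
    # Prediction: compensate the drift of the optimality condition,
    # x_pred = x - h (d_xx f)^{-1} d_tx f = x + h cos t   (here d_xx f = 1).
    x_pred = x_pc + h * np.cos(t)
    # Correction: gradient step at the new sampling time.
    x_pc = x_pred - gamma * grad(x_pred, t_next)
    err_c.append(abs(x_c - np.sin(t_next)))
    err_pc.append(abs(x_pc - np.sin(t_next)))
```

For this toy problem the correction-only tracking error settles at O(h) while the prediction-correction error settles at O(h²), mirroring the bounds quoted above.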


IEEE Transactions on Signal Processing | 2016

DQM: Decentralized Quadratically Approximated Alternating Direction Method of Multipliers

Aryan Mokhtari; Wei Shi; Qing Ling; Alejandro Ribeiro

This paper considers decentralized consensus optimization problems where nodes of a network have access to different summands of a global objective function. Nodes cooperate to minimize the global objective by exchanging information with neighbors only. A decentralized version of the alternating direction method of multipliers (DADMM) is a common method for solving this category of problems. DADMM exhibits linear convergence rate to the optimal objective for strongly convex functions but its implementation requires solving a convex optimization problem at each iteration. This can be computationally costly and may result in large overall convergence times. The decentralized quadratically approximated ADMM algorithm (DQM), which minimizes a quadratic approximation of the objective function that DADMM minimizes at each iteration, is proposed here. The consequent reduction in computational time is shown to have minimal effect on convergence properties. Convergence still proceeds at a linear rate with a guaranteed factor that is asymptotically equivalent to the DADMM linear convergence rate factor. Numerical results demonstrate advantages of DQM relative to DADMM and other alternatives in a logistic regression problem.
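The core trade-off, replacing an exact inner minimization with a single minimization of its quadratic approximation, can be illustrated on a centralized scalar analogue. The logistic-plus-quadratic objective and penalty parameter below are hypothetical, and this sketch is a proximal iteration, not the decentralized ADMM algorithm itself:

```python
import numpy as np

# Strongly convex scalar objective and its derivatives.
def f(x):
    return np.log(1 + np.exp(-x)) + 0.5 * x ** 2

def df(x):
    return -1.0 / (1 + np.exp(x)) + x

def d2f(x):
    s = 1.0 / (1 + np.exp(-x))
    return s * (1 - s) + 1.0

c = 1.0                       # proximal penalty parameter
x_exact, x_quad = 3.0, 3.0
for _ in range(30):
    # Exact inner step: argmin_z f(z) + (c/2)(z - x)^2, solved by inner Newton
    # (this is the expensive subproblem an exact method must solve each iteration).
    z = x_exact
    for _ in range(50):
        z = z - (df(z) + c * (z - x_exact)) / (d2f(z) + c)
    x_exact = z
    # Quadratic approximation of the same subproblem: one damped Newton step.
    x_quad = x_quad - df(x_quad) / (d2f(x_quad) + c)
```

Both sequences converge linearly to the same minimizer; the quadratically approximated update does so at a fraction of the per-iteration cost, which is the effect DQM exploits.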


International Conference on Acoustics, Speech, and Signal Processing | 2015

An approximate Newton method for distributed optimization

Aryan Mokhtari; Qing Ling; Alejandro Ribeiro

Agents of a network have access to strongly convex local functions f_i and attempt to minimize the aggregate function f(x) = Σ_{i=1}^{n} f_i(x) while relying on variable exchanges with neighboring nodes. Various methods to solve this distributed optimization problem exist, but they all rely on first-order information. This paper introduces Network Newton, a method that incorporates second-order information via distributed evaluation of approximations to Newton steps. The method is shown to converge linearly and to do so while exhibiting a quadratic phase. Numerical analyses show substantial reductions in convergence times relative to existing (first-order) alternatives.


IEEE Transactions on Signal and Information Processing over Networks | 2016

A Decentralized Second-Order Method with Exact Linear Convergence Rate for Consensus Optimization

Aryan Mokhtari; Wei Shi; Qing Ling; Alejandro Ribeiro

This paper considers decentralized consensus optimization problems where different summands of a global objective function are available at nodes of a network that can communicate with neighbors only. The proximal method of multipliers is considered as a powerful tool that relies on proximal primal descent and dual ascent updates on a suitably defined augmented Lagrangian. The structure of the augmented Lagrangian makes this problem nondecomposable, which precludes distributed implementations. This problem is regularly addressed by the use of the alternating direction method of multipliers. The exact second-order method (ESOM) is introduced here as an alternative that relies on: first, the use of a separable quadratic approximation of the augmented Lagrangian; and second, a truncated Taylor series to estimate the solution of the first-order condition imposed on the minimization of the quadratic approximation of the augmented Lagrangian. The sequences of primal and dual variables generated by ESOM are shown to converge linearly to their optimal arguments when the aggregate cost function is strongly convex and its gradients are Lipschitz continuous. Numerical results demonstrate advantages of ESOM relative to decentralized alternatives in solving least-squares and logistic regression problems.


IEEE Transactions on Signal Processing | 2017

Network Newton Distributed Optimization Methods

Aryan Mokhtari; Qing Ling; Alejandro Ribeiro

We study the problem of minimizing a sum of convex objective functions, where the components of the objective are available at different nodes of a network and nodes are allowed to only communicate with their neighbors. The use of distributed gradient methods is a common approach to solve this problem. Their popularity notwithstanding, these methods exhibit slow convergence and a consequent large number of communications between nodes to approach the optimal argument because they rely on first-order information only. This paper proposes the network Newton (NN) method as a distributed algorithm that incorporates second-order information. This is done via distributed implementation of approximations of a suitably chosen Newton step. The approximations are obtained by truncation of the Newton step's Taylor expansion. This leads to a family of methods defined by the number K of Taylor series terms kept in the approximation. When keeping K terms of the Taylor series, the method is called NN-K and can be implemented through the aggregation of information in K-hop neighborhoods. Convergence to a point close to the optimal argument at a rate that is at least linear is proven, and the existence of a tradeoff between convergence time and the distance to the optimal argument is shown. Numerical experiments corroborate reductions in the number of iterations and the communication cost necessary to achieve convergence relative to first-order alternatives.
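The truncated-series approximation at the heart of NN-K can be sketched numerically. The small synthetic Hessian below is an illustrative assumption; in the actual method, multiplying by the off-diagonal part B corresponds to one communication round with neighbors, which is why keeping K + 1 terms requires K-hop information:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Hypothetical well-conditioned "Hessian" split as H = D - B, with D its
# diagonal (locally available at each node) and B the off-diagonal coupling.
S = rng.uniform(-1, 1, size=(n, n))
S = (S + S.T) / 2
np.fill_diagonal(S, 0.0)
H = np.eye(n) + 0.1 * S          # small off-diagonals keep the series convergent

D = np.diag(np.diag(H))
B = D - H                        # neighbor coupling
Dh = np.diag(1.0 / np.sqrt(np.diag(D)))
E = Dh @ B @ Dh                  # here ||E|| < 1, so the Neumann series converges

Hinv = np.linalg.inv(H)
errs = []
for K in range(5):
    # NN-K style approximation: keep K + 1 terms of H^{-1} = Dh (Σ_k E^k) Dh.
    series = sum(np.linalg.matrix_power(E, k) for k in range(K + 1))
    errs.append(np.linalg.norm(Dh @ series @ Dh - Hinv))
```

The approximation error shrinks geometrically in K, which is the tradeoff the abstract describes: more series terms (more communication hops) buy a more accurate Newton step.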


IEEE Global Conference on Signal and Information Processing | 2013

Regularized stochastic BFGS algorithm

Aryan Mokhtari; Alejandro Ribeiro

A regularized stochastic version of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton method is proposed to solve optimization problems with stochastic objectives that arise in large scale machine learning. Stochastic gradient descent is the currently preferred solution methodology but the number of iterations required to approximate optimal arguments can be prohibitive in high dimensional problems. BFGS modifies gradient descent by introducing a Hessian approximation matrix computed from finite gradient differences. This paper utilizes stochastic gradient differences and introduces a regularization to ensure that the Hessian approximation matrix remains well conditioned. The resulting regularized stochastic BFGS method is shown to converge to optimal arguments almost surely over realizations of the stochastic gradient sequence. Numerical experiments showcase reductions in convergence time relative to stochastic gradient descent algorithms and non-regularized stochastic versions of BFGS.


IEEE Transactions on Signal Processing | 2017

Decentralized Quasi-Newton Methods

Mark Eisen; Aryan Mokhtari; Alejandro Ribeiro

We introduce the decentralized Broyden–Fletcher–Goldfarb–Shanno (D-BFGS) method as a variation of the BFGS quasi-Newton method for solving decentralized optimization problems. Decentralized quasi-Newton methods are of interest in problems that are not well conditioned, making first-order decentralized methods ineffective, and in which second-order information is not readily available, making second-order decentralized methods impossible. D-BFGS is a fully distributed algorithm in which nodes approximate curvature information of themselves and their neighbors through the satisfaction of a secant condition. We additionally provide a formulation of the algorithm in asynchronous settings. Convergence of D-BFGS is established formally in both the synchronous and asynchronous settings and strong performance advantages relative to existing methods are shown numerically.
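The secant condition mentioned above is the property the standard BFGS matrix update is built to satisfy, and it can be checked directly. The dimensions and random variable/gradient variations below are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
B = np.eye(n)                              # current Hessian approximation
s = rng.normal(size=n)                     # variable variation
y = 2.0 * s + 0.1 * rng.normal(size=n)     # gradient variation, chosen so s·y > 0

# Standard BFGS update; by construction B_new satisfies the secant
# condition B_new @ s = y exactly, and stays positive definite when s·y > 0.
Bs = B @ s
B_new = B + np.outer(y, y) / (y @ s) - np.outer(Bs, Bs) / (s @ Bs)
```

In D-BFGS each node applies such an update to curvature information about itself and its neighbors rather than to the full decision variable.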


IEEE Conference on Decision and Control | 2016

Online optimization in dynamic environments: Improved regret rates for strongly convex problems

Aryan Mokhtari; Shahin Shahrampour; Ali Jadbabaie; Alejandro Ribeiro

In this paper, we address tracking of a time-varying parameter with unknown dynamics. We formalize the problem as an instance of online optimization in a dynamic setting. Using online gradient descent, we propose a method that sequentially predicts the value of the parameter and in turn suffers a loss. The objective is to minimize the accumulation of losses over the time horizon, a notion that is termed dynamic regret. While existing methods focus on convex loss functions, we consider strongly convex functions so as to provide better guarantees of performance. We derive a regret bound that captures the path-length of the time-varying parameter, defined in terms of the distance between its consecutive values. In other words, the bound represents the natural connection of tracking quality to the rate of change of the parameter. We provide numerical experiments to complement our theoretical findings.
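A minimal sketch of the dynamic-regret setting, assuming a hypothetical one-dimensional quadratic loss f_t(x) = (x - θ_t)² and a random-walk parameter (not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 1000
theta = np.cumsum(0.01 * rng.normal(size=T))   # unknown, slowly drifting parameter

eta = 0.5
x = 0.0
dyn_regret = 0.0    # Σ_t f_t(x_t) - f_t(θ_t), and f_t(θ_t) = 0 for this loss
path_length = 0.0   # Σ_t |θ_t - θ_{t-1}|, the drift measure in the regret bound
for t in range(T):
    dyn_regret += (x - theta[t]) ** 2          # predict, then suffer the loss
    x = x - eta * 2.0 * (x - theta[t])         # online gradient step on revealed loss
    if t > 0:
        path_length += abs(theta[t] - theta[t - 1])
```

For strongly convex losses the dynamic regret grows with the path length of the comparator rather than with the horizon, which is the improvement the paper establishes; in this toy run the accumulated regret stays well below the path length.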


IEEE Transactions on Automatic Control | 2017

Decentralized Prediction-Correction Methods for Networked Time-Varying Convex Optimization

Andrea Simonetto; Alec Koppel; Aryan Mokhtari; Geert Leus; Alejandro Ribeiro

We study networked unconstrained convex optimization problems where the objective function changes continuously in time. We propose a decentralized algorithm (DePCoT) with a discrete time-sampling scheme to find and track the solution trajectory based on prediction and gradient-based correction steps, while sampling the problem data at a constant sampling period h. Under suitable conditions and for limited sampling periods, we establish that the asymptotic error bound behaves as O(h²), which outperforms the state-of-the-art error bound of O(h) for correction-only methods. The key contributions are the prediction step and a method to approximate the inverse of the Hessian of the cost function in a decentralized way, which yields quantifiable trade-offs between communication and accuracy.

Collaboration


Dive into Aryan Mokhtari's collaborations.

Top Co-Authors

Alejandro Ribeiro, University of Pennsylvania
Alec Koppel, University of Pennsylvania
Qing Ling, University of Science and Technology of China
Mark Eisen, University of Pennsylvania
Geert Leus, Delft University of Technology
Ali Jadbabaie, Massachusetts Institute of Technology