
Preconditioned Gradient Descent for Overparameterized Nonconvex Burer--Monteiro Factorization with Global Optimality Certification

arXiv.org, 2023-04 (preprint)

2023. This work is published under the arXiv.org nonexclusive-distrib license, version 1.0 (http://arxiv.org/licenses/nonexclusive-distrib/1.0/, the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.


  • Title:
    Preconditioned Gradient Descent for Overparameterized Nonconvex Burer--Monteiro Factorization with Global Optimality Certification
  • Author: Zhang, Gavin; Fattahi, Salar; Zhang, Richard Y.
  • Subjects: Computer Science - Learning; Convergence; Cost function; Ill-conditioned problems (mathematics); Mathematics - Optimization and Control; Optimization; Statistics - Machine Learning
  • Is Part Of: arXiv.org, 2023-04
  • Description: We consider using gradient descent to minimize the nonconvex function \(f(X)=\phi(XX^{T})\) over an \(n\times r\) factor matrix \(X\), in which \(\phi\) is an underlying smooth convex cost function defined over \(n\times n\) matrices. While only a second-order stationary point \(X\) can be provably found in reasonable time, if \(X\) is additionally rank deficient, then its rank deficiency certifies it as being globally optimal. This way of certifying global optimality necessarily requires the search rank \(r\) of the current iterate \(X\) to be overparameterized with respect to the rank \(r^{\star}\) of the global minimizer \(X^{\star}\). Unfortunately, overparameterization significantly slows down the convergence of gradient descent, from a linear rate with \(r=r^{\star}\) to a sublinear rate when \(r>r^{\star}\), even when \(\phi\) is strongly convex. In this paper, we propose an inexpensive preconditioner that restores the convergence rate of gradient descent back to linear in the overparameterized case, while also making it agnostic to possible ill-conditioning in the global minimizer \(X^{\star}\). (An illustrative sketch of one such preconditioned update appears after this record.)
  • Publisher: Ithaca: Cornell University Library, arXiv.org
  • Language: English
  • Identifier: EISSN: 2331-8422
    DOI: 10.48550/arxiv.2206.03345
  • Source: arXiv.org
    Free E Journals
    ROAD: Directory of Open Access Scholarly Resources
    ProQuest Central
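
The abstract asserts that a cheap preconditioner restores the linear convergence of gradient descent under overparameterization, but does not spell out the preconditioner's form. The sketch below is illustrative only: it assumes a right preconditioner of the form \((X^{\top}X+\eta I)^{-1}\) with damping \(\eta\) tied to the current objective value, and uses \(\phi(M)=\tfrac{1}{2}\|M-M^{\star}\|_{F}^{2}\) as the smooth, strongly convex cost; all function names, problem sizes, and step sizes are hypothetical choices, not taken from the paper.

```python
import numpy as np

def make_target(n=20, r_star=2, seed=0):
    """Rank-r* PSD target M* = X* X*^T; phi(M) = 0.5 * ||M - M*||_F^2 is
    a smooth, strongly convex cost over n x n matrices, as in the abstract."""
    rng = np.random.default_rng(seed)
    X_star = rng.standard_normal((n, r_star))
    return X_star @ X_star.T

def f_and_grad(X, M_star):
    """f(X) = phi(X X^T) and its gradient 2 * (X X^T - M*) X."""
    E = X @ X.T - M_star
    return 0.5 * np.sum(E ** 2), 2.0 * E @ X

def precgd(M_star, r, steps=500, lr=0.1, seed=1):
    """Gradient descent with an assumed right preconditioner (X^T X + eta*I)^{-1}.

    eta is tied to the current objective value so the damping vanishes as X
    approaches the (rank-deficient) global minimizer; this is one plausible
    rule, not necessarily the paper's exact choice."""
    rng = np.random.default_rng(seed)
    n = M_star.shape[0]
    X = 0.1 * rng.standard_normal((n, r))  # overparameterized init: r > r*
    for _ in range(steps):
        val, grad = f_and_grad(X, M_star)
        eta = np.sqrt(val) + 1e-12         # keeps X^T X + eta*I invertible
        P = np.linalg.inv(X.T @ X + eta * np.eye(r))  # r x r: cheap when r << n
        X = X - lr * grad @ P
    return X, f_and_grad(X, M_star)[0]

M_star = make_target()          # true rank r* = 2
X, val = precgd(M_star, r=5)    # search rank r = 5 > r*
print(f"final objective after 500 preconditioned steps: {val:.3e}")
```

Note that the preconditioner here is an \(r\times r\) matrix, so forming and inverting it costs roughly \(O(nr^{2}+r^{3})\) per step, negligible next to the gradient computation when \(r\ll n\), which is consistent with the abstract's description of the preconditioner as inexpensive.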
