List of works by Zeyuan Allen-Zhu

A Convergence Theory for Deep Learning via Over-Parameterization

scientific article published in 2018

Backward Feature Correction: How Deep Learning Performs Deep (Hierarchical) Learning

scientific article published in 2020

Byzantine Stochastic Gradient Descent

Can SGD Learn Recurrent Neural Networks with Provable Generalization?

scholarly article by Zeyuan Allen-Zhu & Yuanzhi Li published 2019 in Advances in Neural Information Processing Systems 32

Exploiting the Structure: Stochastic Gradient Methods Using Raw Clusters

scientific article published in 2016

How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD

Is Q-Learning Provably Efficient?

LazySVD: Even Faster SVD Decomposition Yet Without Agonizing Pain

scientific article published in 2016

Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers

Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls

scientific article published in 2017

LoRA: Low-Rank Adaptation of Large Language Models

scientific article published in 2021

NEON2: Finding Local Minima via First-Order Oracles

Natasha 2: Faster Non-Convex Optimization Than SGD

On the Convergence Rate of Training Recurrent Neural Networks

scholarly article by Zeyuan Allen-Zhu et al., published in 2019 in Advances in Neural Information Processing Systems 32

Optimal Black-Box Reductions Between Optimization Objectives

scientific article published in 2016

Sparse sign-consistent Johnson-Lindenstrauss matrices: compression with neuroscience-based constraints

scientific article published in 2014

The Lingering of Gradients: How to Reuse Gradients Over Time

What Can ResNet Learn Efficiently, Going Beyond Kernels?