This repository contains an experimental study of optimization algorithms applied to a logistic regression problem on the Mushroom dataset. The project implements several gradient-based and quasi-Newton optimization methods and compares their convergence behavior and computational cost.
The dataset is parsed from sparse format into a dense feature matrix, followed by exploratory analysis of feature correlations. A logistic regression objective with a mean squared error loss is implemented, along with an explicit gradient computation that is validated against PyTorch autograd.
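As an illustration of that validation step, the sketch below defines a sigmoid model with an MSE loss and checks its hand-derived gradient numerically via central finite differences. This is a hedged sketch, not the repository's code: the function names (`sigmoid`, `loss`, `grad`) and the use of finite differences in place of the repo's PyTorch autograd check are assumptions for a self-contained example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y):
    # mean squared error between sigmoid predictions and labels
    p = sigmoid(X @ w)
    return np.mean((p - y) ** 2)

def grad(w, X, y):
    # analytic gradient: chain rule through MSE and sigmoid
    p = sigmoid(X @ w)
    return (2.0 / len(y)) * X.T @ ((p - y) * p * (1.0 - p))

# Validate the analytic gradient numerically (the repo does the
# analogous check with torch.autograd instead).
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
y = rng.integers(0, 2, 20).astype(float)
w = rng.standard_normal(5)

eps = 1e-6
num = np.array([(loss(w + eps * e, X, y) - loss(w - eps * e, X, y)) / (2 * eps)
                for e in np.eye(5)])
assert np.allclose(num, grad(w, X, y), atol=1e-6)
```

The same comparison with autograd would build `w` as a `torch.Tensor` with `requires_grad=True`, call `loss(...).backward()`, and compare `w.grad` against the analytic expression.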
The repository includes implementations of stochastic and batch optimization methods: Gradient Descent (GD), Stochastic Gradient Descent (SGD), Adagrad, BFGS, and Limited-memory BFGS (L-BFGS). Proximal variants with L1 regularization are also implemented to study sparsity effects. Experiments analyze convergence under different step sizes and batch sizes, compare the methods' optimization performance, and measure computational cost in terms of runtime, number of data accesses, and number of vector updates.
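The proximal variants rely on the closed-form proximal operator of the L1 norm, soft-thresholding, which shrinks each coordinate toward zero and produces exact sparsity. A minimal sketch of a proximal gradient step follows; the names `soft_threshold` and `proximal_gd`, and the toy quadratic objective in the demo, are illustrative assumptions, not the repository's implementation.

```python
import numpy as np

def soft_threshold(v, tau):
    # prox of tau * ||.||_1: shrink each coordinate by tau, clip at zero
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_gd(grad, w0, step, lam, iters):
    # ISTA-style iteration: gradient step on the smooth part,
    # then the L1 proximal step with threshold step * lam
    w = w0.copy()
    for _ in range(iters):
        w = soft_threshold(w - step * grad(w), step * lam)
    return w

# Toy demo: minimize 0.5 * ||w - b||^2 + lam * ||w||_1,
# whose exact minimizer is soft_threshold(b, lam).
b = np.array([3.0, -0.5, 1.0])
w_star = proximal_gd(lambda w: w - b, np.zeros(3), step=1.0, lam=1.0, iters=5)
assert np.allclose(w_star, soft_threshold(b, 1.0))
```

Note how coordinates with magnitude below the threshold (here `-0.5` and `1.0` against `lam = 1.0`) are set exactly to zero, which is the sparsity effect the experiments study.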
The results provide a practical comparison between classical first-order methods and quasi-Newton techniques for large-scale optimization.