A Framework for Distributed Large-Scale Sparse Regression

Speaker

Leng Chenlei, University of Warwick

Time

2018.04.17 14:00-15:00

Venue

Middle Lecture Room, Math Building

Abstract

An attractive approach for down-scaling a Big Data problem is to partition the dataset into subsets and fit them separately in a divide-and-conquer fashion. For a dataset with a large number of variables, this is best done by partitioning the features; done naively, however, such a partition ignores correlations between features in different subsets. We propose a framework named DECO that applies a simple decorrelation step before performing sparse regression on each subset. The framework works for elliptically distributed features, heavy-tailed errors and a general class of sparsity penalties. Its performance is illustrated on synthesized and real data. This is joint work with Xiangyu Wang at Google and David Dunson at Duke.
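
As a rough illustration of the idea described in the abstract, the Python sketch below decorrelates the data, splits the features into groups, and fits a lasso on each group separately. It is a minimal sketch under stated assumptions, not the speaker's implementation: the abstract only mentions "a simple decorrelation step", so the specific choice F = sqrt(p) (X X^T + r I)^(-1/2), the ridge parameter r, the random feature partition, the plain coordinate-descent lasso, and all function names are assumptions made for illustration. The columns of X and the response y are assumed to be centered already.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # per-column scale x_j'x_j / n
    resid = y.astype(float).copy()         # residual y - X beta (beta = 0)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def decorrelate(X, y, ridge=1.0):
    """Assumed decorrelation step: premultiply by F = sqrt(p) (XX' + ridge*I)^(-1/2)."""
    n, p = X.shape
    w, V = np.linalg.eigh(X @ X.T + ridge * np.eye(n))
    F = np.sqrt(p) * (V / np.sqrt(w)) @ V.T   # V diag(w^-1/2) V'
    return F @ X, F @ y


def deco_sketch(X, y, n_groups=4, lam=0.1, ridge=1.0, seed=0):
    """Decorrelate, partition the features at random, fit a lasso per group."""
    Xt, yt = decorrelate(X, y, ridge)
    p = X.shape[1]
    rng = np.random.default_rng(seed)
    groups = np.array_split(rng.permutation(p), n_groups)
    beta = np.zeros(p)
    for idx in groups:                     # each group could run on its own worker
        beta[idx] = lasso_cd(Xt[:, idx], yt, lam)
    return beta
```

Calling deco_sketch(X, y) on an n-by-p array X and a length-n vector y returns a length-p coefficient vector assembled from the per-group fits. Because every group is fitted against the same decorrelated response, the loop over groups needs no communication between workers, which is the point of partitioning the features.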