Hogg's Research: integer programming

2008-09-18

integer programming

The approach Roweis and I are taking to constructing archetypes (small subsets of data points that represent all the data points) is one of integer (or, actually, binary integer) programming. You have a large number of data points, and you include a small number of them and exclude the rest, subject to constraints (that each point in the large set be represented) and optimizing some cost function (the total number of archetypes, in the simplest case). In general, these problems are, indeed, NP-hard, as I suspected (below).
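For concreteness, here is one way to write down the simplest version of the problem. The symbols are assumptions made for illustration and do not come from the project itself: $x_j$ is 1 if data point $j$ is chosen as an archetype and 0 otherwise, and $A_{ij}$ is 1 if point $j$ would adequately represent point $i$.

```latex
\begin{aligned}
\text{minimize}\quad   & \sum_{j=1}^{N} x_j \\
\text{subject to}\quad & \sum_{j=1}^{N} A_{ij}\, x_j \ge 1 \quad \text{for every data point } i = 1, \dots, N, \\
                       & x_j \in \{0, 1\} \quad \text{for every candidate } j = 1, \dots, N .
\end{aligned}
```

Written this way, the simplest case is the classic set-cover problem, which is NP-hard.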

Roweis had the good idea of approximating the binary programming problem with a linear programming problem, and then post-processing the result. This is a great idea, and it works pretty well, as I discovered this morning, when everything came together and my code just worked. However, the number of archetypes we were getting after post-processing was significantly larger than the optimal cost of the linear program, which is a lower bound on what any binary solution can achieve.
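A minimal sketch of the relax-then-round idea, assuming a 0/1 matrix `A` in which `A[i, j] = 1` means candidate `j` represents point `i`. The function names, the toy matrix, and the particular greedy rule are all hypothetical; this is a guess at what such a post-processing step could look like, not the code described in this post. It uses SciPy's `linprog` for the relaxation.

```python
import numpy as np
from scipy.optimize import linprog


def lp_relaxation(A):
    """Relaxed problem: minimize sum(x) subject to A x >= 1 and 0 <= x <= 1."""
    n_points, n_candidates = A.shape
    res = linprog(
        c=np.ones(n_candidates),
        A_ub=-A,                    # A x >= 1 rewritten as -A x <= -1
        b_ub=-np.ones(n_points),
        bounds=(0.0, 1.0),          # relaxation: x_j in [0, 1], not {0, 1}
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x, res.fun


def greedy_round(A, x_frac):
    """Hypothetical post-processing: take candidates in order of decreasing
    LP weight, keeping each one that represents a not-yet-represented point."""
    covered = np.zeros(A.shape[0], dtype=bool)
    chosen = []
    for j in np.argsort(-x_frac):
        if covered.all():
            break
        newly = (A[:, j] > 0) & ~covered
        if newly.any():
            chosen.append(int(j))
            covered |= newly
    return chosen


# Toy instance (made up): three points, each candidate represents two of them.
A_toy = np.array([[1, 1, 0],
                  [0, 1, 1],
                  [1, 0, 1]], dtype=float)
x_frac, lp_cost = lp_relaxation(A_toy)
archetypes = greedy_round(A_toy, x_frac)
print("LP optimal cost:", lp_cost)
print("archetypes after rounding:", archetypes)
```

On this toy instance the linear program can put weight one half on every candidate, for a total cost of 1.5, while any valid binary solution needs two archetypes. That is a small version of the gap between the rounded answer and the linear programming cost.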

It turns out that standard linear programming packages (the open-source glpk and the commercial CPLEX, for example) have integer and binary programming capabilities. These also solve the linear program first and then post-process, but they do something extremely clever in the post-processing step and are much better than my greedy algorithm. Both come very close to saturating the linear programming optimal cost for the problem we currently care about (although CPLEX does it much, much faster than glpk, in exchange for infinitely larger licensing fees).
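As an illustration of handing the binary problem directly to a solver, here is a hedged sketch. It uses neither glpk nor CPLEX; it swaps in SciPy's `milp` (backed by the HiGHS solver, available in SciPy 1.9 and later) because it is easy to run. The function name and the toy matrix are assumptions, with `A[i, j] = 1` meaning that candidate `j` represents point `i`.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp


def solve_binary_cover(A):
    """Minimize the number of archetypes subject to every point being
    represented, with each x_j restricted to 0 or 1."""
    n_points, n_candidates = A.shape
    res = milp(
        c=np.ones(n_candidates),
        constraints=LinearConstraint(A, lb=np.ones(n_points), ub=np.inf),
        integrality=np.ones(n_candidates),  # 1 marks a variable as integer
        bounds=Bounds(0, 1),                # integer in [0, 1] means binary
    )
    if not res.success:
        raise RuntimeError(res.message)
    return np.flatnonzero(res.x > 0.5)      # indices of the chosen archetypes


# Toy instance (made up): three points, each candidate represents two of them.
A_toy = np.array([[1, 1, 0],
                  [0, 1, 1],
                  [1, 0, 1]], dtype=float)
print("archetypes:", solve_binary_cover(A_toy))
```

The solver returns a valid binary solution directly, so no separate rounding step is needed. How close it gets to the linear programming cost depends on the problem.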

It was a very satisfying, research-filled day. As time goes on I will let my loyal readers know why we are interested in this.
