UAI 2013 - Tutorials
The tutorials will be held in Grand Ballroom C on Thursday, July 11th. For the tutorials schedule, please visit this link. Here is the list of tutorials this year:
Tutorials (with slides):
1. Polynomial Methods in Learning and Statistics
Ankur Moitra, Institute for Advanced Study
2. Large-Scale Distributed Machine Learning with GraphLab
Carlos Guestrin, University of Washington
3. Statistical Methods in Genomics
Lior Pachter, University of California, Berkeley
4. Computational Advertising & Causality
Leon Bottou, Microsoft Research
Polynomial Methods in Learning and Statistics
I will survey some of the emerging applications of the polynomial method in learning and statistics through three in-depth examples. The first part of the talk will focus on the problem of learning the parameters of a mixture of Gaussians, given random samples from the distribution. The theme here is that the method of moments can be shown to provably recover good estimates to the true parameters precisely because systems of polynomial equations are relatively stable. In fact, I will show that noisy estimates of the first six moments of a mixture of two Gaussians suffice to recover good estimates to the true mixture parameters, thus justifying an approach that dates back to Karl Pearson.
Next, I will consider the problem of learning the parameters of a topic model given random samples (documents) from this generative model. There has been considerable recent progress on this problem, and one line of attack exploits the spectral properties of certain tensors to give an algorithm for learning the parameters of a Latent Dirichlet Allocation model provided that the topic vectors are linearly independent. Another approach introduces a natural condition (separability) under which simple algorithms again based on the method of moments can be shown to work for arbitrary topic models, even ones which allow correlations between pairs of topics.
Lastly, I will introduce the population recovery problem where the goal is to find a representative set of templates given highly incomplete samples. This is a basic problem that arises in a number of settings, and I will present the first polynomial time algorithm (improving on an earlier quasi-polynomial time algorithm). Even though this algorithm is not based on the method of moments, again the polynomial method and tools from complex analysis turn out to be useful for analyzing our algorithm.
This survey will cover in part a number of papers by many authors including Anandkumar, Arora, Belkin, Foster, Ge, Hsu, Kalai, Kakade, Liu, Saks, Sinha, Wigderson, Valiant, Yehudayoff and myself.
Ankur Moitra is an NSF CI Fellow at the Institute for Advanced Study, and also a senior postdoc in the computer science department at Princeton University. He completed his PhD and MS at MIT in 2011 and 2009 respectively, where he was advised by Tom Leighton. He received a George M. Sprowls Award (best thesis) and a William A. Martin Award (best thesis) for his doctoral and master's dissertations. He has worked in numerous areas of algorithms, including approximation algorithms, metric embeddings, combinatorics and smoothed analysis, but lately has been working at the intersection of algorithms and learning.
Large-Scale Distributed Machine Learning with GraphLab
Today, machine learning (ML) methods play a central role in industry and science. The growth of the Web and improvements in sensor data collection technology have been rapidly increasing the magnitude and complexity of the ML tasks we must solve. This growth is driving the need for scalable, parallel ML algorithms that can handle "Big Data."
Unfortunately, translating these parallel algorithms into distributed systems that can handle this big data remains a significant challenge. Thus, the state-of-the-art algorithms in ML are often seen as "academic exercises", which are implemented in the real-world.In this tutorial, I will focus on three steps for addressing this problem:
- Examining common algorithmic patterns in distributed ML methods for tackling big data.
- Qualifying the challenges of implementing these algorithms in real distributed systems.
- Describing computational frameworks for implementing these algorithms at scale.
The goal of this tutorial is to help researchers tune their efforts on algorithms for big data, and enable them to easily translate their algorithms into systems that can be used in practical settings in industry and academia.
In the latter part, we will focus mainly on the GraphLab framework, which naturally expresses asynchronous, dynamic graph computations that are key for state-of-the-art ML algorithms. When these algorithms are expressed in our higher-level abstraction, GraphLab will effectively address many of the underlying parallelism challenges, including data distribution, optimized communication, and guaranteeing sequential consistency, a property that is surprisingly important for many ML algorithms. On a variety of large-scale tasks, GraphLab provides 20-100x performance improvements over Hadoop. In recent months, GraphLab has received many tens of thousands of downloads, and is being actively used by a number of startups, companies, research labs and universities.
Carlos Guestrin is the Amazon Professor of Machine Learning at the Computer Science & Engineering Department of the University of Washington. He is also a co-founder and CEO of GraphLab Inc., focusing large-scale machine learning and graph analytics. His previous positions include Associate Professor at Carnegie Mellon University and senior researcher at the Intel Research Lab in Berkeley. Carlos received his PhD and Master from Stanford University, and a Mechatronics Engineer degree from the University of Sao Paulo, Brazil. Carlos' work has been recognized by awards at a number of conferences and two journals: KDD 2007 and 2010, IPSN 2005 and 2006, VLDB 2004, NIPS 2003 and 2007, UAI 2005, ICML 2005, AISTATS 2010, JAIR in 2007 & 2012, and JWRPM in 2009. He is also a recipient of the ONR Young Investigator Award, NSF Career Award, Alfred P. Sloan Fellowship, IBM Faculty Fellowship, the Siebel Scholarship and the Stanford Centennial Teaching Assistant Award. Carlos was named one of the 2008 'Brilliant 10' by Popular Science Magazine, received the IJCAI Computers and Thought Award and the Presidential Early Career Award for Scientists and Engineers (PECASE). He is a former member of the Information Sciences and Technology (ISAT) advisory group for DARPA.
Statistical Methods in Genomics
"Sequence census methods" couple high-throughput DNA sequencing to biochemical assays and are transforming standard functional genomics assays into powerful genome-wide probes of biochemical activity.
In this tutorial, I'll provide an introduction to such methods with a focus on the crucial role of machine learning in the development, execution and interpretation of experiments. I will first focus on the prototype experiment, called RNA-Seq, and provide a walk-through of the chemistry, biology and computation that comprise the experiment, followed by a survey of machine learning techniques that have been applied to RNA-seq data. I will also highlight new questions in machine learning that arise in this context.
I will then examine the broader scope of the field, with a view towards providing a classification and categorization of sequence census methods.
Lior Pachter was born in Ramat Gan, Israel, and grew up in Pretoria, South Africa where he attended Pretoria Boys High School. After receiving a B.S. in Mathematics from Caltech in 1994, he left for MIT where he was awarded a PhD in applied mathematics in 1999. Lior then moved to the University of California at Berkeley where he was a postdoctoral researcher (1999-2001), assistant professor (2001-2005), associate professor (2005-2009), and is currently the Raymond and Beverly Sackler professor of computational biology at UC Berkeley and professor of mathematics and molecular and cellular biology with a joint appointment in computer science.
Lior's research interests span the mathematical and biological sciences, and he has authored over 100 research articles in the areas of algorithms, combinatorics, comparative genomics, algebraic statistics, molecular biology and evolution. He has been awarded a National Science Foundation CAREER award, a Sloan Research Fellowship, the Miller Professorship, and a Federal Laboratory Consortium award for the successful technology transfer of widely used sequence alignment software developed in his group.
Computational Advertising & Causality
Statistical machine learning technologies in the real world are never without a purpose. Using their predictions, humans or machines make decisions whose circuitous consequences often violate the modeling assumptions that justified the system design in the first place. Such contradictions appear very clearly in computational advertisement systems. The design of the ad placement engine directly influences the occurrence of clicks and the corresponding advertiser payments. It also has important indirect effects : (a) ad placement decisions impact the satisfaction of the users and therefore their willingness to frequent this web site in the future, (b) ad placement decisions impact the return on investment observed by the advertisers and therefore their future bids, and (c) ad placement decisions change the nature of the data collected for training the statistical models in the future.
Popular theoretical approaches, such as auction theory or multi-armed bandits, only address selected aspects of such a system. In contrast, the language and the methods of causal inference provide a full set of tools to answer the vast collection of questions facing the designer of such a system. Is it useful to pass new input signals to the statistical models? Is it worthwhile to collect and label a new training set? What about changing the loss function or the learning algorithm? In order to answer such questions, one needs to unravel how the information produced by the statistical models traverses the web of causes and consequences and produces measurable losses and rewards.
This tutorial provides a real world example demonstrating the value of causal inference for large-scale machine learning. It also illustrates a collection of practical counterfactual analysis techniques applicable to many real-life machine learning systems, including causal inferences techniques applicable to continuously valued variables with meaningful confidence intervals, and quasi-static analysis techniques for estimating how small interventions affect certain causal equilibria. In the context of computational advertisement, this analysis elucidates the connection between auction theory and machine learning.
Léon Bottou received the Diplôme d'Ingénieur de l'École Polytechnique (X84) in 1987, the Magistère de Mathématiques Fondamentales et Appliquées et d'Informatique from École Normale Superieure in 1988, and a Ph.D. in Computer Science from Université de Paris-Sud in 1991. Léon joined AT&T Bell Laboratories in 1991 and went on to AT&T Labs Research and NEC Labs America. He joined the Science team of Microsoft adCenter in 2010 and Microsoft Research in 2012. Léon's primary research interest are machine learning. Léon's secondary research interest is data compression and coding. His best known contributions are his work on large scale learning and on the DjVu document compression technology.