Optimal Learning
for Fun and Profit
Scott Clark, Ph.D.
Yelp Open House
11/20/13

sclark@yelp.com

@DrScottClark
Outline of Talk

● Optimal Learning
○ What is it?
○ Why do we care?
● Multi-armed bandits
○ Definition and motivation
○ Examples
● Bayesian global optimization
○ Optimal experiment design
○ Uses to extend traditional A/B testing
What is optimal learning?

Optimal learning addresses the challenge of
how to collect information as efficiently as
possible, primarily for settings where
collecting information is time consuming
and expensive.
Source: optimallearning.princeton.edu
Part I:
Multi-Armed Bandits
What are multi-armed bandits?

THE SETUP
● Imagine you are in front of K slot machines.
● Each one is set to "free play" (but you can still win $$$)
● Each has a possibly different, unknown payout rate
● You have a fixed amount of time to maximize payout

GO!
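The setup above can be simulated in a few lines. This is a minimal sketch, not code from the talk: the class name `BernoulliBandit`, the seeds, and the uniformly random baseline player are all assumptions made for illustration. The payout rates 0.5, 0.8 and 0.2 are the ones used in the worked example later in the deck.

```python
import random


class BernoulliBandit:
    """K slot machines; arm k pays 1 with hidden probability payout_rates[k]."""

    def __init__(self, payout_rates, seed=None):
        self.payout_rates = list(payout_rates)
        self.rng = random.Random(seed)
        # Observed information: how often each arm was pulled and how often it paid.
        self.pulls = [0] * len(self.payout_rates)
        self.wins = [0] * len(self.payout_rates)

    def pull(self, arm):
        """Pull one arm, record the outcome, and return the reward (0 or 1)."""
        reward = 1 if self.rng.random() < self.payout_rates[arm] else 0
        self.pulls[arm] += 1
        self.wins[arm] += reward
        return reward

    def ratios(self):
        """Observed win ratio per arm (None for arms never pulled)."""
        return [w / n if n else None for w, n in zip(self.wins, self.pulls)]


# Baseline player (hypothetical): pick an arm uniformly at random for T pulls.
T = 1000
bandit = BernoulliBandit([0.5, 0.8, 0.2], seed=0)
chooser = random.Random(1)
payout = sum(bandit.pull(chooser.randrange(3)) for _ in range(T))
print("total payout:", payout, "observed ratios:", bandit.ratios())
```

Any policy discussed later only has to decide which `arm` to pass to `pull`; the random player is just a floor to compare against.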
What are multi-armed bandits?

THE SETUP
(math version)

[Robbins 1952]
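The equations on this slide did not survive extraction. As a stand-in, here is the standard Bernoulli formulation of the problem; the notation ($p_k$, $a_t$, $r_t$, $R_T$) is an assumption and may differ from what the original slide used.

```latex
\begin{align*}
&\text{Arms } k = 1, \dots, K \text{ with unknown payout rates } p_k \in [0, 1] \\
&\text{At each } t = 1, \dots, T:\ \text{choose } a_t \in \{1, \dots, K\},\
  \text{observe } r_t \sim \mathrm{Bernoulli}(p_{a_t}) \\
&\text{Goal: } \max \; \mathbb{E}\Big[\sum_{t=1}^{T} r_t\Big]
  \quad\Longleftrightarrow\quad
  \min \; R_T = T\,p^* - \mathbb{E}\Big[\sum_{t=1}^{T} p_{a_t}\Big],
  \qquad p^* = \max_k p_k
\end{align*}
```

$R_T$ is the expected regret: the payout lost, relative to always playing the best arm, by having to learn which arm that is.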
Modern Bandits

Why do we care?
● Maps well onto Click Through Rate (CTR)
○ Each arm is an ad or search result
○ Each click is a success
○ Want to maximize clicks
● Can be used in experiments (A/B testing)
○ Want to find the best solutions, fast
○ Want to limit how often bad solutions are used
Tradeoffs

Exploration vs. Exploitation
Gaining knowledge about the system
vs.
Getting largest payout with current knowledge
Naive Example

Epsilon First Policy
● Sample sequentially εT < T times
○ only explore
● Pick the best and sample for t = εT+1, ..., T
○ only exploit
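The epsilon-first policy can be sketched as follows. This is an illustrative version under stated assumptions: exploration cycles through the arms in order (as in the example slides that follow), ties are broken toward the lowest arm index, and the function name and signature are made up for this sketch.

```python
import random


def epsilon_first(payout_rates, horizon, epsilon, seed=0):
    """Explore for the first epsilon * horizon pulls, then commit to one arm."""
    rng = random.Random(seed)
    k = len(payout_rates)
    pulls, wins = [0] * k, [0] * k
    explore_steps = round(epsilon * horizon)
    best_arm = None
    for t in range(horizon):
        if t < explore_steps:
            # Only explore: sample the arms sequentially.
            arm = t % k
        else:
            # Only exploit: pick the best observed ratio once, then never revisit.
            if best_arm is None:
                ratios = [wins[a] / pulls[a] if pulls[a] else 0.0 for a in range(k)]
                best_arm = max(range(k), key=lambda a: ratios[a])
            arm = best_arm
        reward = 1 if rng.random() < payout_rates[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
    return pulls, wins, best_arm


# K = 3 arms, T = 100 pulls, 9 of them spent exploring (3 per arm).
pulls, wins, best_arm = epsilon_first([0.5, 0.8, 0.2], horizon=100, epsilon=0.09)
print("pulls:", pulls, "wins:", wins, "committed to arm:", best_arm)
```

Note that the commitment is made from only three observations per arm, which is exactly the weakness the following slides walk through.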
Example (K = 3, t = 0)
Unknown payout rate:    p = 0.5    p = 0.8    p = 0.2
Observed Information
PULLS:                  0          0          0
WINS:                   0          0          0
RATIO:                  -          -          -
Example (K = 3, t = 1)
Unknown payout rate:    p = 0.5    p = 0.8    p = 0.2
Observed Information
PULLS:                  1          0          0
WINS:                   1          0          0
RATIO:                  1          -          -
Example (K = 3, t = 2)
Unknown payout rate:    p = 0.5    p = 0.8    p = 0.2
Observed Information
PULLS:                  1          1          0
WINS:                   1          1          0
RATIO:                  1          1          -
Example (K = 3, t = 3)
Unknown payout rate:    p = 0.5    p = 0.8    p = 0.2
Observed Information
PULLS:                  1          1          1
WINS:                   1          1          0
RATIO:                  1          1          0
Example (K = 3, t = 4)
Unknown payout rate:    p = 0.5    p = 0.8    p = 0.2
Observed Information
PULLS:                  2          1          1
WINS:                   1          1          0
RATIO:                  0.5        1          0
Example (K = 3, t = 5)
Unknown payout rate:    p = 0.5    p = 0.8    p = 0.2
Observed Information
PULLS:                  2          2          1
WINS:                   1          2          0
RATIO:                  0.5        1          0
Example (K = 3, t = 6)
Unknown payout rate:    p = 0.5    p = 0.8    p = 0.2
Observed Information
PULLS:                  2          2          2
WINS:                   1          2          0
RATIO:                  0.5        1          0
Example (K = 3, t = 7)
Unknown payout rate:    p = 0.5    p = 0.8    p = 0.2
Observed Information
PULLS:                  3          2          2
WINS:                   2          2          0
RATIO:                  0.66       1          0
Example (K = 3, t = 8)
Unknown payout rate:    p = 0.5    p = 0.8    p = 0.2
Observed Information
PULLS:                  3          3          2
WINS:                   2          3          0
RATIO:                  0.66       1          0
Example (K = 3, t = 9)
Unknown payout rate:    p = 0.5    p = 0.8    p = 0.2
Observed Information
PULLS:                  3          3          3
WINS:                   2          3          1
RATIO:                  0.66       1          0.33
Example (K = 3, t > 9)

Exploit!
Profit!
Right?
What if our ratio is a poor approx?
Unknown payout rate:    p = 0.5    p = 0.8    p = 0.2
Observed Information
PULLS:                  3          3          3
WINS:                   2          3          1
RATIO:                  0.66       1          0.33
What if our ratio is a poor approx?
Unknown payout rate:    p = 0.9    p = 0.5    p = 0.5
Observed Information
PULLS:                  3          3          3
WINS:                   2          3          1
RATIO:                  0.66       1          0.33
Fixed exploration fails

Regret is unbounded!
Amount of exploration
needs to depend on data
We need better policies!
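A short calculation shows why the regret keeps growing. Suppose the true rates are 0.9, 0.5 and 0.5 but, after three pulls per arm, the observed ratios (0.66, 1, 0.33) point to the middle arm. Committing to it costs 0.9 - 0.5 = 0.4 in expectation on every remaining pull. The sketch below uses the standard definition of expected regret; the function names are hypothetical.

```python
def expected_regret(payout_rates, pulls_per_arm):
    """Expected payout lost versus always pulling the best arm."""
    best = max(payout_rates)
    return sum(n * (best - p) for p, n in zip(payout_rates, pulls_per_arm))


def regret_after_bad_commit(horizon):
    """3 exploration pulls per arm, then every remaining pull on the middle arm."""
    rates = [0.9, 0.5, 0.5]
    pulls = [3, 3 + (horizon - 9), 3]
    return expected_regret(rates, pulls)


for horizon in (100, 1000, 10000):
    print(horizon, round(regret_after_bad_commit(horizon), 1))
```

Every wrong pull costs 0.4, so the regret grows linearly with the horizon: ten times as many pulls means roughly ten times the loss. A fixed exploration budget never gets the chance to correct the early mistake.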
What should we do?

Many different policies
● Weighted random choice (another naive approach)
● Epsilon-greedy
○ Best arm so far with P=1-ε, random otherwise
● Epsilon-decreasing*
○ Best arm so far with P=1-(ε * exp(-rt)), random otherwise
● UCB-exp*
● UCB-tuned*
● BLA*
● SoftMax*
● etc, etc, etc (60+ years of research)
*Regret bounded as t->infinity
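Two of the policies above, sketched in code. Epsilon-greedy and epsilon-decreasing follow the formulas on the slide. The third policy is plain UCB1 (Auer et al.), swapped in here as a simpler relative of the UCB variants listed; it is not UCB-exp or UCB-tuned. The helper names, the rule of trying every unpulled arm first, and the parameter values are assumptions for illustration.

```python
import math
import random


def run_policy(choose_arm, payout_rates, horizon, seed=0):
    """Play `horizon` pulls, asking `choose_arm` which arm to pull each time."""
    rng = random.Random(seed)
    k = len(payout_rates)
    pulls, wins = [0] * k, [0] * k
    for t in range(1, horizon + 1):
        arm = choose_arm(t, pulls, wins, rng)
        reward = 1 if rng.random() < payout_rates[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
    return pulls, wins


def best_so_far(pulls, wins):
    """Arm with the best observed ratio; unpulled arms are tried first."""
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm
    return max(range(len(pulls)), key=lambda a: wins[a] / pulls[a])


def epsilon_greedy(epsilon):
    """Best arm so far with P = 1 - epsilon, random otherwise."""
    def choose(t, pulls, wins, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(pulls))
        return best_so_far(pulls, wins)
    return choose


def epsilon_decreasing(epsilon, rate):
    """Best arm so far with P = 1 - epsilon * exp(-rate * t), random otherwise."""
    def choose(t, pulls, wins, rng):
        if rng.random() < epsilon * math.exp(-rate * t):
            return rng.randrange(len(pulls))
        return best_so_far(pulls, wins)
    return choose


def ucb1(t, pulls, wins, rng):
    """UCB1: observed ratio plus a bonus that shrinks as an arm is pulled more."""
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm
    return max(
        range(len(pulls)),
        key=lambda a: wins[a] / pulls[a] + math.sqrt(2 * math.log(t) / pulls[a]),
    )


rates = [0.5, 0.8, 0.2]
results = {
    "epsilon-greedy": run_policy(epsilon_greedy(0.1), rates, 2000),
    "epsilon-decreasing": run_policy(epsilon_decreasing(1.0, 0.01), rates, 2000),
    "UCB1": run_policy(ucb1, rates, 2000),
}
for name, (pulls, wins) in results.items():
    print(name, "pulls:", pulls, "total wins:", sum(wins))
```

The common thread is that the amount of exploration now depends on what has been observed so far, rather than on a budget fixed in advance.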
Extensions and complications

What if...
● Hardware constraints limit real-time knowledge? (batching)
● Payoff noisy? Non-binary? Changes in time? (dynamic content)
● Parallel sampling? (many concurrent users)
● Arms expire? (events, news stories, etc)
● You have knowledge of the user? (logged in, contextual history)
● The number of arms increases? Continuous? (parameter search)
Every problem is different.
This is an active area of research.
Part II:
Global Optimization
What is global optimization?

THE GOAL
● Optimize some objective function
○ CTR, revenue, delivery time, or some combination thereof
● given some parameters
○ config values, cutoffs, ML parameters
● CTR = f(parameters)
○ Find best parameters

(more mathy version)
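The formula on this slide was lost in extraction. A standard way to write the goal, with notation assumed here rather than taken from the slide, is:

```latex
\[
  x^* = \operatorname*{arg\,max}_{x \in \mathcal{X}} f(x),
  \qquad
  y_i = f(x_i) + \varepsilon_i
\]
```

Here $x$ is the vector of parameters, $\mathcal{X}$ the allowed parameter domain, $f$ the unknown objective (for example CTR), and $y_i$ a noisy measurement of $f$ from one experiment.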
What is MOE?

Metrics Optimization Engine
A global, black box method for parameter optimization

History of how past parameters have performed -> MOE -> New, optimal parameters
What does MOE do?
● MOE optimizes a metric (like CTR) given some
parameters as inputs (like scoring weights)
● Given the past performance of different parameters
MOE suggests new, optimal parameters to test

Results of A/B tests run so far -> MOE -> New, optimal values to A/B test
Example Experiment
Biz details distance in ad
● Setting a different distance cutoff for each category to show "X miles away" text in biz_details ad
● For each category we define a maximum distance

Parameters + Obj Func
distance_cutoffs = {
    'shopping': 20.0,
    'food': 14.0,
    'auto': 15.0,
    …
}
objective_function = {
    'value': 0.012,
    'std': 0.00013
}

MOE (MapReduce, MongoDB, python)

New Parameters
distance_cutoffs = {
    'shopping': 22.1,
    'food': 7.3,
    'auto': 12.6,
    …
}
Why do we need MOE?
● Parameter optimization is hard
○ Finding the perfect set of parameters takes a long time
○ Hope it is well behaved and try to move in the right direction
○ Not possible as number of parameters increases

● Intractable to find best set of parameters in all situations
○ Thousands of combinations of program type, flow, category
○ Finding the best parameters manually is impossible

● Heuristics quickly break down in the real world
○ Dependent parameters (changes to one change all others)
○ Many parameters at once (location, category, map, place, ...)
○ Non-linear (complexity and chaos break assumptions)

MOE solves all of these problems in an optimal way
How does it work?

MOE

1. Build Gaussian Process (GP) with points sampled so far
2. Optimize covariance hyperparameters of GP
3. Find point(s) of highest Expected Improvement within parameter domain
4. Return optimal next best point(s) to sample
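The four steps above can be sketched for a single parameter in pure Python. This is not MOE's implementation: it assumes a zero-mean GP with a unit-variance squared-exponential kernel, picks the length scale from a small hand-chosen grid by log marginal likelihood, and searches for the best Expected Improvement over a fixed list of candidates. All function names and the demo data are hypothetical.

```python
import math


def rbf(a, b, length_scale):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(m):
    """Lower-triangular L with L L^T = m (m symmetric positive definite)."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def forward_sub(low, b):
    """Solve L y = b."""
    y = []
    for i, b_i in enumerate(b):
        y.append((b_i - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y


def backward_sub(low, y):
    """Solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit_gp(xs, ys, length_scale, noise_var=1e-4):
    """Step 1: build the GP. Returns (Cholesky factor, alpha, log marginal likelihood)."""
    n = len(xs)
    cov = [[rbf(xs[i], xs[j], length_scale) + (noise_var if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    alpha = backward_sub(low, forward_sub(low, ys))
    log_ml = (-0.5 * sum(y * a for y, a in zip(ys, alpha))
              - sum(math.log(low[i][i]) for i in range(n))
              - 0.5 * n * math.log(2 * math.pi))
    return low, alpha, log_ml


def predict(x, xs, low, alpha, length_scale):
    """Posterior mean and standard deviation of the GP at x."""
    k_star = [rbf(x, x_i, length_scale) for x_i in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = forward_sub(low, k_star)
    return mean, math.sqrt(max(1.0 - sum(v_i * v_i for v_i in v), 0.0))


def expected_improvement(mean, std, best):
    """Closed-form EI for maximization under a Gaussian prediction."""
    if std == 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf


def suggest_next(xs, ys, candidates, length_scales=(0.1, 0.2, 0.5)):
    """Steps 2-4: choose hyperparameters, maximize EI, return the next point."""
    mean_y = sum(ys) / len(ys)
    centered = [y - mean_y for y in ys]  # the GP here assumes zero mean
    best_ls = max(length_scales, key=lambda ls: fit_gp(xs, centered, ls)[2])
    low, alpha, _ = fit_gp(xs, centered, best_ls)
    best_seen = max(centered)

    def ei(x):
        mean, std = predict(x, xs, low, alpha, best_ls)
        return expected_improvement(mean, std, best_seen)

    return max(candidates, key=ei)


# Made-up history: one parameter scaled to [0, 1] and the metric observed there.
xs = [0.0, 0.3, 0.6, 1.0]
ys = [0.2, 0.9, 0.5, 0.1]
candidates = [i / 100 for i in range(101)]
next_x = suggest_next(xs, ys, candidates)
print("next point to sample:", next_x)
```

A real system would optimize over many parameters at once, use a proper optimizer instead of grids, and suggest several points per round; the sketch only shows how the four steps fit together.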
Gaussian Processes

Rasmussen and Williams, GPML, gaussianprocess.org
Optimizing Covariance Hyperparameters
Finding the GP model that fits best

● All of these GPs are created with the same initial data
○ with different hyperparameters (length scales)
● Need to find the model that is most likely given the data
○ Maximum likelihood, cross validation, priors, etc

Rasmussen and Williams Gaussian Processes for Machine Learning
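For the maximum likelihood option, the quantity being maximized is the log marginal likelihood given in Rasmussen and Williams. The squared-exponential kernel shown with it is an assumption, chosen because the slide mentions length scales; the talk does not say which covariance function MOE uses.

```latex
\begin{align*}
  k_\theta(x, x') &= \sigma_f^2 \exp\!\Big(-\frac{\lVert x - x' \rVert^2}{2\ell^2}\Big),
  \qquad \theta = (\sigma_f, \ell, \sigma_n) \\
  \log p(y \mid X, \theta) &=
    -\tfrac{1}{2}\, y^\top \big(K_\theta + \sigma_n^2 I\big)^{-1} y
    -\tfrac{1}{2} \log \big\lvert K_\theta + \sigma_n^2 I \big\rvert
    -\tfrac{n}{2} \log 2\pi \\
  \theta^* &= \operatorname*{arg\,max}_{\theta}\ \log p(y \mid X, \theta)
\end{align*}
```

The first term rewards fitting the data, the second penalizes overly flexible models, and $n$ is the number of points sampled so far.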
Find point(s) of highest expected improvement
Expected Improvement of sampling two points

We want to find the point(s) that are expected to beat the best point seen so far the most.

[Jones, Schonlau, Welsch 1998]
[Clark, Frazier 2012]
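Written out for a maximization problem, with $f^*$ the best value seen so far and $\mu(x)$, $\sigma(x)$ the GP posterior mean and standard deviation, Expected Improvement takes the following form. Jones, Schonlau and Welch state it for minimization, so the signs here are flipped; the notation is an assumption, not copied from the slide.

```latex
\begin{align*}
  \mathrm{EI}(x) &= \mathbb{E}\big[\max\big(f(x) - f^*,\ 0\big)\big]
    = \big(\mu(x) - f^*\big)\,\Phi(z) + \sigma(x)\,\varphi(z),
    \qquad z = \frac{\mu(x) - f^*}{\sigma(x)} \\
  \mathrm{EI}(x_1, \dots, x_q) &=
    \mathbb{E}\Big[\max\Big(\max_{i = 1, \dots, q} f(x_i) - f^*,\ 0\Big)\Big]
\end{align*}
```

$\Phi$ and $\varphi$ are the standard normal CDF and PDF. The second line is the multi-point version used when several points are sampled in parallel; it generally has no simple closed form and is commonly estimated numerically.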
What is MOE doing right now?

MOE is now live in production
● MOE is informing active experiments
● MOE is successfully optimizing towards all given metrics
● MOE treats the underlying system it is optimizing as a black box,
allowing it to be easily extended to any system

Ongoing:
● Looking into best path towards contributing it back to the
community, if/when we decide to open source.
● MOE + bandits = <3
Questions?

sclark@yelp.com
@DrScottClark

More Related Content

Viewers also liked (15)

Ensuring Consistency in a Replicated World by Yelp Engineering, has 31 slides with 890 views.This document discusses Yelp's approach to ensuring data consistency across replicated databases. Key aspects include: - Using a "dirty session cookie" to route user requests to data centers with recent updates. - A "repl_delay_reporter" tool that measures replication lag between data centers to determine when updates have propagated. - "Heartbeat" checks that insert test data on masters and measure replication time to slaves to estimate replication delay.
Ensuring Consistency in a Replicated WorldEnsuring Consistency in a Replicated World
Ensuring Consistency in a Replicated World
Yelp Engineering
31 slides890 views
Building a World Class Security Team by Yelp Engineering, has 22 slides with 589 views.Michael Stoppelman, SVP of Engineering at Yelp, discussed building a world-class security team over time through hiring and focusing on security basics and getting professional. He described Yelp's early experiences without strong security protections, hiring their first security head in 2011, and implementing two-factor authentication and default cross-site scripting protection. Stoppelman outlined their efforts to strengthen corporate security through malware detection, encryption, and auditing and app security such improving access controls and credential management.
Building a World Class Security TeamBuilding a World Class Security Team
Building a World Class Security Team
Yelp Engineering
22 slides589 views
MySQL At Yelp by Yelp Engineering, has 30 slides with 3598 views.This document discusses MySQL and how it is used at Yelp. It provides an overview of MySQL's history and features. It then describes how Yelp uses over 100 MySQL servers with InnoDB and replication. Yelp utilizes tools like Puppet, Nagios, Ganglia, and Percona Toolkit to manage and monitor their MySQL infrastructure. The document also provides tips for using MySQL for new and existing projects, including suggestions for troubleshooting, backups, and community resources.
MySQL At YelpMySQL At Yelp
MySQL At Yelp
Yelp Engineering
30 slides3.6K views
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen... by Yelp Engineering, has 24 slides with 17896 views.The document discusses using ElasticSearch to enable fast and scalable search of reviews. It describes how ElasticSearch allows for tokenization, stemming, stop words removal and faceting to improve search performance compared to a basic SQL search. An example query and response show how ElasticSearch returns search results and highlights matching text. The document also briefly outlines how data could be indexed in ElasticSearch through a queueing system and how shards and replicas can provide replication and scalability. It closes by noting some potential performance issues to be aware of with ElasticSearch.
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen..."Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
Yelp Engineering
24 slides17.9K views
Scaling Traffic from 0 to 139 Million Unique Visitors by Yelp Engineering, has 12 slides with 2994 views.This document summarizes the traffic history and infrastructure changes at Yelp from 2005 to the present. It outlines the key milestones and technology changes over time as Yelp grew from handling around 200k searches per day with 1 database in 2005-2007 to serving traffic across 29 countries in 2014 with a distributed, scalable infrastructure utilizing technologies like Elasticsearch, Kafka, and Pyleus for real-time processing.
Scaling Traffic from 0 to 139 Million Unique VisitorsScaling Traffic from 0 to 139 Million Unique Visitors
Scaling Traffic from 0 to 139 Million Unique Visitors
Yelp Engineering
12 slides3K views
ETL in Clojure by Dmitriy Morozov, has 49 slides with 3646 views.The talk will compare Cascalog, fully-featured data processing and querying library on top of Hadoop, and Sparkling – A Clojure API for Apache Spark. How both of these compare in terms of performance and code complexity for Big Data processing and why you shouldn’t be writing MapReduce jobs in plain Hadoop API.
ETL in ClojureETL in Clojure
ETL in Clojure
Dmitriy Morozov
49 slides3.6K views
How Yelp does Service Discovery by John Billings, has 35 slides with 1949 views.This is a talk that I gave at the San Francisco DevOps meetup on 9/29/15. I talk about how Yelp performs service discovery using SmartStack and Docker.
How Yelp does Service DiscoveryHow Yelp does Service Discovery
How Yelp does Service Discovery
John Billings
35 slides1.9K views
Hyperparameter optimization with approximate gradient by Fabian Pedregosa, has 18 slides with 14520 views.This document discusses hyperparameter optimization using approximate gradients. It introduces the problem of optimizing hyperparameters along with model parameters. While model parameters can be estimated from data, hyperparameters require methods like cross-validation. The document proposes using approximate gradients to optimize hyperparameters more efficiently than costly methods like grid search. It derives the gradient of the objective with respect to hyperparameters and presents an algorithm called HOAG that approximates this gradient using inexact solutions. The document analyzes HOAG's convergence and provides experimental results comparing it to other hyperparameter optimization methods.
Hyperparameter optimization with approximate gradientHyperparameter optimization with approximate gradient
Hyperparameter optimization with approximate gradient
Fabian Pedregosa
18 slides14.5K views
That's like, so random! Monte Carlo for Data Science by Corey Chivers, has 26 slides with 18975 views.1. Monte Carlo simulation can be used to understand obscure statistics, create your own statistics, avoid difficult math, understand inferences from data, and propagate uncertainty in complex models. 2. It allows running 'what if' scenarios, such as understanding how a surge in patients in one hospital unit would propagate to the rest of the hospital. 3. The talk introduces the concept of 'simudidactic', meaning to understand complex systems using randomization and computation to create models of real-world phenomena.
That's like, so random! Monte Carlo for Data ScienceThat's like, so random! Monte Carlo for Data Science
That's like, so random! Monte Carlo for Data Science
Corey Chivers
26 slides19K views
Yelp Academic Dataset by MandaniKeyur, has 22 slides with 4642 views.This document describes a project analyzing Yelp business data using an HDInsight Hadoop cluster on Azure. The project involves downloading Yelp data, converting it to CSV, loading it onto the cluster, and using HiveQL to query and visualize the data. Key aspects analyzed include business locations, categories, ratings over time, and reviews. Visualizations were created using PowerBI. The document outlines the cluster configuration, tools used, data processing flow, sample queries, and potential extensions like natural language processing.
Yelp Academic DatasetYelp Academic Dataset
Yelp Academic Dataset
MandaniKeyur
22 slides4.6K views
Aggregation for searching complex information spaces by Mounia Lalmas-Roelleke, has 44 slides with 1804 views.The diversity and complexity of contents available on the web have dramatically increased in recent years. Multimedia content such as images, videos, maps, voice recordings has been published more often than before. Document genres have also been diversified, for instance, news, blogs, FAQs, wiki. These diversified information sources are often dealt with in a separated way. For example, in web search, users have to switch between search verticals to access different sources. Recently, there has been a growing interest in finding effective ways to aggregate these information sources so that to hide the complexity of the information spaces to users searching for relevant information. For example, so-called aggregated search investigated by the major search engine companies will provide search results from several sources in a single result page. Aggregation itself is not a new paradigm; for instance, aggregate operators are common in database technology. This talk presents the challenges faced by the like of web search engines and digital libraries in providing the means to aggregate information from several and complex information spaces in a way that helps users in their information seeking tasks. It also discusses how other disciplines including databases, artificial intelligence, and cognitive science can be brought into building effective and efficient aggregated search systems.
Aggregation for searching complex information spacesAggregation for searching complex information spaces
Aggregation for searching complex information spaces
Mounia Lalmas-Roelleke
44 slides1.8K views
DSL in Clojure by Misha Kozik, has 75 slides with 3478 views.Talk about DSL, How to write DSL in Clojure, How to use Instaparse (simplest library for parsing grammars) and how we use Clojure and Instaparse in Zoomdata
DSL in ClojureDSL in Clojure
DSL in Clojure
Misha Kozik
75 slides3.5K views
Building a smarter application Stack by Tomas Doran from Yelp by dotCloud, has 42 slides with 166027 views.This document discusses Smartstack, a solution for service discovery and load balancing in distributed systems like Docker. It addresses problems like dynamically wiring dependent microservices and handling failures gracefully. Smartstack consists of Synapse, which generates HAProxy configurations for discovery, and Nerve, which registers services and checks health. Ambassadors provide simple connections for containers. It aims to reduce complexity compared to alternatives while working on traditional infrastructure, VMs, and Docker.
Building a smarter application Stack by Tomas Doran from YelpBuilding a smarter application Stack by Tomas Doran from Yelp
Building a smarter application Stack by Tomas Doran from Yelp
dotCloud
42 slides166K views
Hybrid MongoDB and RDBMS Applications by Steven Francia, has 68 slides with 27521 views.The document discusses implementing a hybrid database solution using both MongoDB and MySQL. It describes storing less frequently changing and reference data like users and products in MongoDB for flexibility, while storing transactional data like orders and inventory counts in MySQL for ACID compliance. The system keeps the data in sync between the two databases using listeners that update MySQL whenever related data is created or changed in MongoDB.
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS Applications
Steven Francia
68 slides27.5K views
Intro to Machine Learning by Corey Chivers, has 23 slides with 13430 views.This document provides an introduction to machine learning. It discusses that machine learning focuses on learning about processes in the world rather than just memorizing data. It also covers the main types of machine learning: supervised learning which learns mappings between examples and labels; unsupervised learning which learns structure from unlabeled examples; and reinforcement learning which learns to take actions to maximize rewards. The document explains that machine learning requires representing data as feature vectors and using models with optimization techniques to find parameters that generalize to new data rather than overfitting the training data.
Intro to Machine LearningIntro to Machine Learning
Intro to Machine Learning
Corey Chivers
23 slides13.4K views
Building a World Class Security Team
Building a World Class Security TeamBuilding a World Class Security Team
Building a World Class Security Team
Yelp Engineering
 
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen..."Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
Yelp Engineering
 
Scaling Traffic from 0 to 139 Million Unique Visitors
Scaling Traffic from 0 to 139 Million Unique VisitorsScaling Traffic from 0 to 139 Million Unique Visitors
Scaling Traffic from 0 to 139 Million Unique Visitors
Yelp Engineering
 
How Yelp does Service Discovery
How Yelp does Service DiscoveryHow Yelp does Service Discovery
How Yelp does Service Discovery
John Billings
 
Hyperparameter optimization with approximate gradient
Hyperparameter optimization with approximate gradientHyperparameter optimization with approximate gradient
Hyperparameter optimization with approximate gradient
Fabian Pedregosa
 
That's like, so random! Monte Carlo for Data Science
That's like, so random! Monte Carlo for Data ScienceThat's like, so random! Monte Carlo for Data Science
That's like, so random! Monte Carlo for Data Science
Corey Chivers
 
Yelp Academic Dataset
Yelp Academic DatasetYelp Academic Dataset
Yelp Academic Dataset
MandaniKeyur
 
Aggregation for searching complex information spaces
Aggregation for searching complex information spacesAggregation for searching complex information spaces
Aggregation for searching complex information spaces
Mounia Lalmas-Roelleke
 
Building a smarter application Stack by Tomas Doran from Yelp
Building a smarter application Stack by Tomas Doran from YelpBuilding a smarter application Stack by Tomas Doran from Yelp
Building a smarter application Stack by Tomas Doran from Yelp
dotCloud
 
Hybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS Applications
Steven Francia
 
Intro to Machine Learning
Intro to Machine LearningIntro to Machine Learning
Intro to Machine Learning
Corey Chivers
 

Similar to "Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp Engineering Open House 11/20/13) (20)

Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SF
MLconf
 
Correlation, causation and incrementally recommendation problems at netflix ...
Correlation, causation and incrementally  recommendation problems at netflix ...Correlation, causation and incrementally  recommendation problems at netflix ...
Correlation, causation and incrementally recommendation problems at netflix ...
Roelof van Zwol
 
Setting up an A/B-testing framework
Setting up an A/B-testing frameworkSetting up an A/B-testing framework
Setting up an A/B-testing framework
Agnes van Belle
 
Machine learning Investigative Reporting NorthBaySolutions.pdf
Machine learning Investigative Reporting NorthBaySolutions.pdfMachine learning Investigative Reporting NorthBaySolutions.pdf
Machine learning Investigative Reporting NorthBaySolutions.pdf
ssusera5352a2
 
Machine Learning and Deep Learning 4 dummies
Machine Learning and Deep Learning 4 dummies Machine Learning and Deep Learning 4 dummies
Machine Learning and Deep Learning 4 dummies
Dori Waldman
 
Machine learning4dummies
Machine learning4dummiesMachine learning4dummies
Machine learning4dummies
Michael Winer
 
Ad science bid simulator (public ver)
Ad science bid simulator (public ver)Ad science bid simulator (public ver)
Ad science bid simulator (public ver)
Marsan Ma
 
Causal reasoning and Learning Systems
Causal reasoning and Learning SystemsCausal reasoning and Learning Systems
Causal reasoning and Learning Systems
Trieu Nguyen
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
Jaya Kawale
 
Ranking System for travel search (PoC)
Ranking System for travel search (PoC)Ranking System for travel search (PoC)
Ranking System for travel search (PoC)
M Baddar
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
SigOpt
 
User Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-PlayUser Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-Play
Ahmed Hassan
 
Big Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao PauloBig Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao Paulo
OCTO Technology
 
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Mathieu DESPRIEE
 
Feature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditFeature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax audit
Michael BENESTY
 
BKK16-300 Benchmarking 102
BKK16-300 Benchmarking 102BKK16-300 Benchmarking 102
BKK16-300 Benchmarking 102
Linaro
 
Machine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision TreesMachine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision Trees
ananth
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
SigOpt
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
Scott Clark
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fitting
Wush Wu
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SF
MLconf
 
Correlation, causation and incrementally recommendation problems at netflix ...
Correlation, causation and incrementally  recommendation problems at netflix ...Correlation, causation and incrementally  recommendation problems at netflix ...
Correlation, causation and incrementally recommendation problems at netflix ...
Roelof van Zwol
 
Setting up an A/B-testing framework
Setting up an A/B-testing frameworkSetting up an A/B-testing framework
Setting up an A/B-testing framework
Agnes van Belle
 
Machine learning Investigative Reporting NorthBaySolutions.pdf
Machine learning Investigative Reporting NorthBaySolutions.pdfMachine learning Investigative Reporting NorthBaySolutions.pdf
Machine learning Investigative Reporting NorthBaySolutions.pdf
ssusera5352a2
 
Machine Learning and Deep Learning 4 dummies
Machine Learning and Deep Learning 4 dummies Machine Learning and Deep Learning 4 dummies
Machine Learning and Deep Learning 4 dummies
Dori Waldman
 
Machine learning4dummies
Machine learning4dummiesMachine learning4dummies
Machine learning4dummies
Michael Winer
 
Ad science bid simulator (public ver)
Ad science bid simulator (public ver)Ad science bid simulator (public ver)
Ad science bid simulator (public ver)
Marsan Ma
 
Causal reasoning and Learning Systems
Causal reasoning and Learning SystemsCausal reasoning and Learning Systems
Causal reasoning and Learning Systems
Trieu Nguyen
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
Jaya Kawale
 
Ranking System for travel search (PoC)
Ranking System for travel search (PoC)Ranking System for travel search (PoC)
Ranking System for travel search (PoC)
M Baddar
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
SigOpt
 
User Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-PlayUser Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-Play
Ahmed Hassan
 
Big Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao PauloBig Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao Paulo
OCTO Technology
 
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Mathieu DESPRIEE
 
Feature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditFeature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax audit
Michael BENESTY
 
BKK16-300 Benchmarking 102
BKK16-300 Benchmarking 102BKK16-300 Benchmarking 102
BKK16-300 Benchmarking 102
Linaro
 
Machine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision TreesMachine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision Trees
ananth
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
SigOpt
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
Scott Clark
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fitting
Wush Wu
 

More from Yelp Engineering (6)

Human Ops
Human OpsHuman Ops
Human Ops
Yelp Engineering
 
Teeing Up Python - Code Golf
Teeing Up Python - Code GolfTeeing Up Python - Code Golf
Teeing Up Python - Code Golf
Yelp Engineering
 
Fluxx Streaming
Fluxx StreamingFluxx Streaming
Fluxx Streaming
Yelp Engineering
 
Humans by the hundred (DevOps Days Ohio)
Humans by the hundred (DevOps Days Ohio)Humans by the hundred (DevOps Days Ohio)
Humans by the hundred (DevOps Days Ohio)
Yelp Engineering
 
A Beginners Guide To Launching Yelp In Hong Kong
A Beginners Guide To Launching Yelp In Hong KongA Beginners Guide To Launching Yelp In Hong Kong
A Beginners Guide To Launching Yelp In Hong Kong
Yelp Engineering
 
Own Your Career
Own Your CareerOwn Your Career
Own Your Career
Yelp Engineering
 

"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp Engineering Open House 11/20/13)