SlideShare a Scribd company logo
1 of 37
Download to read offline
Optimal Learning
for Fun and Profit
Scott Clark, Ph.D.
Yelp Open House
11/20/13

sclark@yelp.com

@DrScottClark
Outline of Talk

● Optimal Learning
○ What is it?
○ Why do we care?
● Multi-armed bandits
○ Definition and motivation
○ Examples
● Bayesian global optimization
○ Optimal experiment design
○ Uses to extend traditional A/B testing
What is optimal learning?

Optimal learning addresses the challenge of
how to collect information as efficiently as
possible, primarily for settings where
collecting information is time consuming
and expensive.
Source: optimallearning.princeton.edu
Part I:
Multi-Armed Bandits
What are multi-armed bandits?

THE SETUP
●
●
●
●

Imagine you are in front of K slot machines.
Each one is set to "free play" (but you can still win $$$)
Each has a possibly different, unknown payout rate
You have a fixed amount of time to maximize payout

GO!
What are multi-armed bandits?

THE SETUP
(math version)

[Robbins 1952]
Modern Bandits

Why do we care?
● Maps well onto Click Through Rate (CTR)
○ Each arm is an ad or search result
○ Each click is a success
○ Want to maximize clicks
● Can be used in experiments (A/B testing)
○ Want to find the best solutions, fast
○ Want to limit how often bad solutions are used
Tradeoffs

Exploration vs. Exploitation
Gaining knowledge about the system
vs.
Getting largest payout with current knowledge
Naive Example

Epsilon First Policy
● Sample sequentially εT < T times
○ only explore
● Pick the best and sample for t = εT+1, ..., T
○ only exploit
Example (K = 3, t = 1)
Unknown
payout rate

p = 0.5

p = 0.8

p = 0.2

PULLS:

0

0

0

WINS:

0

0

0

RATIO:

-

-

-

Observed Information
Example (K = 3, t = 1)
Unknown
payout rate

p = 0.5

p = 0.8

p = 0.2

PULLS:

1

0

0

WINS:

1

0

0

RATIO:

1

-

-

Observed Information
Example (K = 3, t = 2)
Unknown
payout rate

p = 0.5

p = 0.8

p = 0.2

PULLS:

1

1

0

WINS:

1

1

0

RATIO:

1

1

-

Observed Information
Example (K = 3, t = 3)
Unknown
payout rate

p = 0.5

p = 0.8

p = 0.2

PULLS:

1

1

1

WINS:

1

1

0

RATIO:

1

1

0

Observed Information
Example (K = 3, t = 4)
Unknown
payout rate

p = 0.5

p = 0.8

p = 0.2

PULLS:

2

1

1

WINS:

1

1

0

RATIO:

0.5

1

0

Observed Information
Example (K = 3, t = 5)
Unknown
payout rate

p = 0.5

p = 0.8

p = 0.2

PULLS:

2

2

1

WINS:

1

2

0

RATIO:

0.5

1

0

Observed Information
Example (K = 3, t = 6)
Unknown
payout rate

p = 0.5

p = 0.8

p = 0.2

PULLS:

2

2

2

WINS:

1

2

0

RATIO:

0.5

1

0

Observed Information
Example (K = 3, t = 7)
Unknown
payout rate

p = 0.5

p = 0.8

p = 0.2

PULLS:

3

2

2

WINS:

2

2

2

RATIO:

0.66

1

0

Observed Information
Example (K = 3, t = 8)
Unknown
payout rate

p = 0.5

p = 0.8

p = 0.2

PULLS:

3

3

2

WINS:

2

3

0

RATIO:

0.66

1

0

Observed Information
Example (K = 3, t = 9)
Unknown
payout rate

p = 0.5

p = 0.8

p = 0.2

PULLS:

3

3

3

WINS:

2

3

1

RATIO:

0.66

1

0.33

Observed Information
Example (K = 3, t > 9)

Exploit!
Profit!
Right?
What if our ratio is a poor approx?
Unknown
payout rate

p = 0.5

p = 0.8

p = 0.2

PULLS:

3

3

3

WINS:

2

3

1

RATIO:

0.66

1

0.33

Observed Information
What if our ratio is a poor approx?
Unknown
payout rate

p = 0.9

p = 0.5

p = 0.5

PULLS:

3

3

3

WINS:

2

3

1

RATIO:

0.66

1

0.33

Observed Information
Fixed exploration fails

Regret is unbounded!
Amount of exploration
needs to depend on data
We need better policies!
What should we do?

Many different policies
● Weighted random choice (another naive approach)
● Epsilon-greedy
○ Best arm so far with P=1-ε, random otherwise
● Epsilon-decreasing*
○ Best arm so far with P=1-(ε * exp(-rt)), random otherwise
● UCB-exp*
● UCB-tuned*
● BLA*
● SoftMax*
● etc, etc, etc (60+ years of research)
*Regret bounded as t->infinity
Extensions and complications

What if...
● Hardware constraints limit real-time knowledge? (batching)
● Payoff noisy? Non-binary? Changes in time? (dynamic content)
● Parallel sampling? (many concurrent users)
● Arms expire? (events, news stories, etc)
● You have knowledge of the user? (logged in, contextual history)
● The number of arms increases? Continuous? (parameter search)
Every problem is different.
This is an active area of research.
Part II:
Global Optimization
What is global optimization?

THE GOAL
●
●
●

Optimize some objective function
○ CTR, revenue, delivery time, or some combination thereof
given some parameters
○ config values, cuttoffs, ML parameters
CTR = f(parameters)
○ Find best parameters

(more mathy version)
What is MOE?

Metrics Optimization Engine
A global, black box method for parameter optimization

History of how past parameters have performed

MOE

New, optimal parameters
What does MOE do?
● MOE optimizes a metric (like CTR) given some
parameters as inputs (like scoring weights)
● Given the past performance of different parameters
MOE suggests new, optimal parameters to test

MOE
Results of A/B
tests run so far

New, optimal
values to A/B test
Example Experiment
Biz details distance in ad
●
●

Setting a different distance cutoff for each category
to show “X miles away” text in biz_details ad
For each category we define a maximum distance

Parameters + Obj Func
distance_cutoffs = {
‘shopping’: 20.0,
‘food’: 14.0,
‘auto’: 15.0,
…
}
objective_function = {
‘value’: 0.012,
‘std’: 0.00013
}

MOE

MapReduce, MongoDB, python

New Parameters
distance_cutoffs = {
‘shopping’: 22.1,
‘food’: 7.3,
‘auto’: 12.6,
…
}
Why do we need MOE?
● Parameter optimization is hard
○ Finding the perfect set of parameters takes a long time
○ Hope it is well behaved and try to move in the right direction
○ Not possible as number of parameters increases

● Intractable to find best set of parameters in all situations
○ Thousands of combinations of program type, flow, category
○ Finding the best parameters manually is impossible

● Heuristics quickly break down in the real world
○ Dependent parameters (changes to one change all others)
○ Many parameters at once (location, category, map, place, ...)
○ Non-linear (complexity and chaos break assumptions)

MOE solves all of these problems in an optimal way
How does it work?

MOE

1. Build Gaussian Process (GP)
with points sampled so far
2. Optimize covariance
hyperparameters of GP
3. Find point(s) of highest
Expected Improvement
within parameter domain
4. Return optimal next best
point(s) to sample
Gaussian Processes

Rasmussen and
Williams GPML
gaussianprocess.org
Optimizing Covariance Hyperparameters
Finding the GP model that fits best

●

All of these GPs are created with the same initial data
○ with different hyperparameters (length scales)

●

Need to find the model that is most likely given the data
○ Maximum likelihood, cross validation, priors, etc

Rasmussen and Williams Gaussian Processes for Machine Learning
Find point(s) of highest expected improvement
Expected Improvement of sampling two points

We want to find the point(s) that are expected to beat the best point seen so far the most.

[Jones, Schonlau, Welsch 1998]
[Clark, Frazier 2012]
What is MOE doing right now?

MOE is now live in production
● MOE is informing active experiments
● MOE is successfully optimizing towards all given metrics
● MOE treats the underlying system it is optimizing as a black box,
allowing it to be easily extended to any system

Ongoing:
● Looking into best path towards contributing it back to the
community, if/when we decide to open source.
● MOE + bandits = <3
Questions?

Questions?

sclark@yelp.com
@DrScottClark

More Related Content

Viewers also liked

Ensuring Consistency in a Replicated World
Ensuring Consistency in a Replicated WorldEnsuring Consistency in a Replicated World
Ensuring Consistency in a Replicated WorldYelp Engineering
 
Building a World Class Security Team
Building a World Class Security TeamBuilding a World Class Security Team
Building a World Class Security TeamYelp Engineering
 
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen..."Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...Yelp Engineering
 
Scaling Traffic from 0 to 139 Million Unique Visitors
Scaling Traffic from 0 to 139 Million Unique VisitorsScaling Traffic from 0 to 139 Million Unique Visitors
Scaling Traffic from 0 to 139 Million Unique VisitorsYelp Engineering
 
How Yelp does Service Discovery
How Yelp does Service DiscoveryHow Yelp does Service Discovery
How Yelp does Service DiscoveryJohn Billings
 
Hyperparameter optimization with approximate gradient
Hyperparameter optimization with approximate gradientHyperparameter optimization with approximate gradient
Hyperparameter optimization with approximate gradientFabian Pedregosa
 
That's like, so random! Monte Carlo for Data Science
That's like, so random! Monte Carlo for Data ScienceThat's like, so random! Monte Carlo for Data Science
That's like, so random! Monte Carlo for Data ScienceCorey Chivers
 
Yelp Academic Dataset
Yelp Academic DatasetYelp Academic Dataset
Yelp Academic DatasetMandaniKeyur
 
Aggregation for searching complex information spaces
Aggregation for searching complex information spacesAggregation for searching complex information spaces
Aggregation for searching complex information spacesMounia Lalmas-Roelleke
 
Building a smarter application Stack by Tomas Doran from Yelp
Building a smarter application Stack by Tomas Doran from YelpBuilding a smarter application Stack by Tomas Doran from Yelp
Building a smarter application Stack by Tomas Doran from YelpdotCloud
 
Hybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsSteven Francia
 
Intro to Machine Learning
Intro to Machine LearningIntro to Machine Learning
Intro to Machine LearningCorey Chivers
 

Viewers also liked (15)

Ensuring Consistency in a Replicated World
Ensuring Consistency in a Replicated WorldEnsuring Consistency in a Replicated World
Ensuring Consistency in a Replicated World
 
Building a World Class Security Team
Building a World Class Security TeamBuilding a World Class Security Team
Building a World Class Security Team
 
MySQL At Yelp
MySQL At YelpMySQL At Yelp
MySQL At Yelp
 
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen..."Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
 
Scaling Traffic from 0 to 139 Million Unique Visitors
Scaling Traffic from 0 to 139 Million Unique VisitorsScaling Traffic from 0 to 139 Million Unique Visitors
Scaling Traffic from 0 to 139 Million Unique Visitors
 
ETL in Clojure
ETL in ClojureETL in Clojure
ETL in Clojure
 
How Yelp does Service Discovery
How Yelp does Service DiscoveryHow Yelp does Service Discovery
How Yelp does Service Discovery
 
Hyperparameter optimization with approximate gradient
Hyperparameter optimization with approximate gradientHyperparameter optimization with approximate gradient
Hyperparameter optimization with approximate gradient
 
That's like, so random! Monte Carlo for Data Science
That's like, so random! Monte Carlo for Data ScienceThat's like, so random! Monte Carlo for Data Science
That's like, so random! Monte Carlo for Data Science
 
Yelp Academic Dataset
Yelp Academic DatasetYelp Academic Dataset
Yelp Academic Dataset
 
Aggregation for searching complex information spaces
Aggregation for searching complex information spacesAggregation for searching complex information spaces
Aggregation for searching complex information spaces
 
DSL in Clojure
DSL in ClojureDSL in Clojure
DSL in Clojure
 
Building a smarter application Stack by Tomas Doran from Yelp
Building a smarter application Stack by Tomas Doran from YelpBuilding a smarter application Stack by Tomas Doran from Yelp
Building a smarter application Stack by Tomas Doran from Yelp
 
Hybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS Applications
 
Intro to Machine Learning
Intro to Machine LearningIntro to Machine Learning
Intro to Machine Learning
 

Similar to "Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp Engineering Open House 11/20/13)

Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFMLconf
 
Correlation, causation and incrementally recommendation problems at netflix ...
Correlation, causation and incrementally  recommendation problems at netflix ...Correlation, causation and incrementally  recommendation problems at netflix ...
Correlation, causation and incrementally recommendation problems at netflix ...Roelof van Zwol
 
Setting up an A/B-testing framework
Setting up an A/B-testing frameworkSetting up an A/B-testing framework
Setting up an A/B-testing frameworkAgnes van Belle
 
Machine learning Investigative Reporting NorthBaySolutions.pdf
Machine learning Investigative Reporting NorthBaySolutions.pdfMachine learning Investigative Reporting NorthBaySolutions.pdf
Machine learning Investigative Reporting NorthBaySolutions.pdfssusera5352a2
 
Machine Learning and Deep Learning 4 dummies
Machine Learning and Deep Learning 4 dummies Machine Learning and Deep Learning 4 dummies
Machine Learning and Deep Learning 4 dummies Dori Waldman
 
Machine learning4dummies
Machine learning4dummiesMachine learning4dummies
Machine learning4dummiesMichael Winer
 
Ad science bid simulator (public ver)
Ad science bid simulator (public ver)Ad science bid simulator (public ver)
Ad science bid simulator (public ver)Marsan Ma
 
Causal reasoning and Learning Systems
Causal reasoning and Learning SystemsCausal reasoning and Learning Systems
Causal reasoning and Learning SystemsTrieu Nguyen
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsJaya Kawale
 
Ranking System for travel search (PoC)
Ranking System for travel search (PoC)Ranking System for travel search (PoC)
Ranking System for travel search (PoC)M Baddar
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudSigOpt
 
User Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-PlayUser Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-PlayAhmed Hassan
 
Big Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao PauloBig Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao PauloOCTO Technology
 
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713Mathieu DESPRIEE
 
Feature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditFeature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditMichael BENESTY
 
BKK16-300 Benchmarking 102
BKK16-300 Benchmarking 102BKK16-300 Benchmarking 102
BKK16-300 Benchmarking 102Linaro
 
Machine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision TreesMachine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision Treesananth
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsScott Clark
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsSigOpt
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fittingWush Wu
 

Similar to "Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp Engineering Open House 11/20/13) (20)

Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SF
 
Correlation, causation and incrementally recommendation problems at netflix ...
Correlation, causation and incrementally  recommendation problems at netflix ...Correlation, causation and incrementally  recommendation problems at netflix ...
Correlation, causation and incrementally recommendation problems at netflix ...
 
Setting up an A/B-testing framework
Setting up an A/B-testing frameworkSetting up an A/B-testing framework
Setting up an A/B-testing framework
 
Machine learning Investigative Reporting NorthBaySolutions.pdf
Machine learning Investigative Reporting NorthBaySolutions.pdfMachine learning Investigative Reporting NorthBaySolutions.pdf
Machine learning Investigative Reporting NorthBaySolutions.pdf
 
Machine Learning and Deep Learning 4 dummies
Machine Learning and Deep Learning 4 dummies Machine Learning and Deep Learning 4 dummies
Machine Learning and Deep Learning 4 dummies
 
Machine learning4dummies
Machine learning4dummiesMachine learning4dummies
Machine learning4dummies
 
Ad science bid simulator (public ver)
Ad science bid simulator (public ver)Ad science bid simulator (public ver)
Ad science bid simulator (public ver)
 
Causal reasoning and Learning Systems
Causal reasoning and Learning SystemsCausal reasoning and Learning Systems
Causal reasoning and Learning Systems
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
 
Ranking System for travel search (PoC)
Ranking System for travel search (PoC)Ranking System for travel search (PoC)
Ranking System for travel search (PoC)
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
 
User Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-PlayUser Payment Prediction in Free-to-Play
User Payment Prediction in Free-to-Play
 
Big Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao PauloBig Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao Paulo
 
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
 
Feature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditFeature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax audit
 
BKK16-300 Benchmarking 102
BKK16-300 Benchmarking 102BKK16-300 Benchmarking 102
BKK16-300 Benchmarking 102
 
Machine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision TreesMachine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision Trees
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fitting
 

More from Yelp Engineering

More from Yelp Engineering (6)

Human Ops
Human OpsHuman Ops
Human Ops
 
Teeing Up Python - Code Golf
Teeing Up Python - Code GolfTeeing Up Python - Code Golf
Teeing Up Python - Code Golf
 
Fluxx Streaming
Fluxx StreamingFluxx Streaming
Fluxx Streaming
 
Humans by the hundred (DevOps Days Ohio)
Humans by the hundred (DevOps Days Ohio)Humans by the hundred (DevOps Days Ohio)
Humans by the hundred (DevOps Days Ohio)
 
A Beginners Guide To Launching Yelp In Hong Kong
A Beginners Guide To Launching Yelp In Hong KongA Beginners Guide To Launching Yelp In Hong Kong
A Beginners Guide To Launching Yelp In Hong Kong
 
Own Your Career
Own Your CareerOwn Your Career
Own Your Career
 

Recently uploaded

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 

Recently uploaded (20)

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 

"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp Engineering Open House 11/20/13)

  • 1. Optimal Learning for Fun and Profit Scott Clark, Ph.D. Yelp Open House 11/20/13 sclark@yelp.com @DrScottClark
  • 2. Outline of Talk ● Optimal Learning ○ What is it? ○ Why do we care? ● Multi-armed bandits ○ Definition and motivation ○ Examples ● Bayesian global optimization ○ Optimal experiment design ○ Uses to extend traditional A/B testing
  • 3. What is optimal learning? Optimal learning addresses the challenge of how to collect information as efficiently as possible, primarily for settings where collecting information is time consuming and expensive. Source: optimallearning.princeton.edu
  • 5. What are multi-armed bandits? THE SETUP ● ● ● ● Imagine you are in front of K slot machines. Each one is set to "free play" (but you can still win $$$) Each has a possibly different, unknown payout rate You have a fixed amount of time to maximize payout GO!
  • 6. What are multi-armed bandits? THE SETUP (math version) [Robbins 1952]
  • 7. Modern Bandits Why do we care? ● Maps well onto Click Through Rate (CTR) ○ Each arm is an ad or search result ○ Each click is a success ○ Want to maximize clicks ● Can be used in experiments (A/B testing) ○ Want to find the best solutions, fast ○ Want to limit how often bad solutions are used
  • 8. Tradeoffs Exploration vs. Exploitation Gaining knowledge about the system vs. Getting largest payout with current knowledge
  • 9. Naive Example Epsilon First Policy ● Sample sequentially εT < T times ○ only explore ● Pick the best and sample for t = εT+1, ..., T ○ only exploit
  • 10. Example (K = 3, t = 1) Unknown payout rate p = 0.5 p = 0.8 p = 0.2 PULLS: 0 0 0 WINS: 0 0 0 RATIO: - - - Observed Information
  • 11. Example (K = 3, t = 1) Unknown payout rate p = 0.5 p = 0.8 p = 0.2 PULLS: 1 0 0 WINS: 1 0 0 RATIO: 1 - - Observed Information
  • 12. Example (K = 3, t = 2) Unknown payout rate p = 0.5 p = 0.8 p = 0.2 PULLS: 1 1 0 WINS: 1 1 0 RATIO: 1 1 - Observed Information
  • 13. Example (K = 3, t = 3) Unknown payout rate p = 0.5 p = 0.8 p = 0.2 PULLS: 1 1 1 WINS: 1 1 0 RATIO: 1 1 0 Observed Information
  • 14. Example (K = 3, t = 4) Unknown payout rate p = 0.5 p = 0.8 p = 0.2 PULLS: 2 1 1 WINS: 1 1 0 RATIO: 0.5 1 0 Observed Information
  • 15. Example (K = 3, t = 5) Unknown payout rate p = 0.5 p = 0.8 p = 0.2 PULLS: 2 2 1 WINS: 1 2 0 RATIO: 0.5 1 0 Observed Information
  • 16. Example (K = 3, t = 6) Unknown payout rate p = 0.5 p = 0.8 p = 0.2 PULLS: 2 2 2 WINS: 1 2 0 RATIO: 0.5 1 0 Observed Information
  • 17. Example (K = 3, t = 7) Unknown payout rate p = 0.5 p = 0.8 p = 0.2 PULLS: 3 2 2 WINS: 2 2 2 RATIO: 0.66 1 0 Observed Information
  • 18. Example (K = 3, t = 8) Unknown payout rate p = 0.5 p = 0.8 p = 0.2 PULLS: 3 3 2 WINS: 2 3 0 RATIO: 0.66 1 0 Observed Information
  • 19. Example (K = 3, t = 9) Unknown payout rate p = 0.5 p = 0.8 p = 0.2 PULLS: 3 3 3 WINS: 2 3 1 RATIO: 0.66 1 0.33 Observed Information
  • 20. Example (K = 3, t > 9) Exploit! Profit! Right?
  • 21. What if our ratio is a poor approx? Unknown payout rate p = 0.5 p = 0.8 p = 0.2 PULLS: 3 3 3 WINS: 2 3 1 RATIO: 0.66 1 0.33 Observed Information
  • 22. What if our ratio is a poor approx? Unknown payout rate p = 0.9 p = 0.5 p = 0.5 PULLS: 3 3 3 WINS: 2 3 1 RATIO: 0.66 1 0.33 Observed Information
  • 23. Fixed exploration fails Regret is unbounded! Amount of exploration needs to depend on data We need better policies!
  • 24. What should we do? Many different policies ● Weighted random choice (another naive approach) ● Epsilon-greedy ○ Best arm so far with P=1-ε, random otherwise ● Epsilon-decreasing* ○ Best arm so far with P=1-(ε * exp(-rt)), random otherwise ● UCB-exp* ● UCB-tuned* ● BLA* ● SoftMax* ● etc, etc, etc (60+ years of research) *Regret bounded as t->infinity
  • 25. Extensions and complications What if... ● Hardware constraints limit real-time knowledge? (batching) ● Payoff noisy? Non-binary? Changes in time? (dynamic content) ● Parallel sampling? (many concurrent users) ● Arms expire? (events, news stories, etc) ● You have knowledge of the user? (logged in, contextual history) ● The number of arms increases? Continuous? (parameter search) Every problem is different. This is an active area of research.
  • 27. What is global optimization? THE GOAL ● ● ● Optimize some objective function ○ CTR, revenue, delivery time, or some combination thereof given some parameters ○ config values, cuttoffs, ML parameters CTR = f(parameters) ○ Find best parameters (more mathy version)
  • 28. What is MOE? Metrics Optimization Engine A global, black box method for parameter optimization History of how past parameters have performed MOE New, optimal parameters
  • 29. What does MOE do? ● MOE optimizes a metric (like CTR) given some parameters as inputs (like scoring weights) ● Given the past performance of different parameters MOE suggests new, optimal parameters to test MOE Results of A/B tests run so far New, optimal values to A/B test
  • 30. Example Experiment Biz details distance in ad ● ● Setting a different distance cutoff for each category to show “X miles away” text in biz_details ad For each category we define a maximum distance Parameters + Obj Func distance_cutoffs = { ‘shopping’: 20.0, ‘food’: 14.0, ‘auto’: 15.0, … } objective_function = { ‘value’: 0.012, ‘std’: 0.00013 } MOE MapReduce, MongoDB, python New Parameters distance_cutoffs = { ‘shopping’: 22.1, ‘food’: 7.3, ‘auto’: 12.6, … }
  • 31. Why do we need MOE? ● Parameter optimization is hard ○ Finding the perfect set of parameters takes a long time ○ Hope it is well behaved and try to move in the right direction ○ Not possible as number of parameters increases ● Intractable to find best set of parameters in all situations ○ Thousands of combinations of program type, flow, category ○ Finding the best parameters manually is impossible ● Heuristics quickly break down in the real world ○ Dependent parameters (changes to one change all others) ○ Many parameters at once (location, category, map, place, ...) ○ Non-linear (complexity and chaos break assumptions) MOE solves all of these problems in an optimal way
  • 32. How does it work? MOE 1. Build Gaussian Process (GP) with points sampled so far 2. Optimize covariance hyperparameters of GP 3. Find point(s) of highest Expected Improvement within parameter domain 4. Return optimal next best point(s) to sample
  • 33. Gaussian Processes Rasmussen and Williams GPML gaussianprocess.org
  • 34. Optimizing Covariance Hyperparameters Finding the GP model that fits best ● All of these GPs are created with the same initial data ○ with different hyperparameters (length scales) ● Need to find the model that is most likely given the data ○ Maximum likelihood, cross validation, priors, etc Rasmussen and Williams Gaussian Processes for Machine Learning
  • 35. Find point(s) of highest expected improvement Expected Improvement of sampling two points We want to find the point(s) that are expected to beat the best point seen so far the most. [Jones, Schonlau, Welsch 1998] [Clark, Frazier 2012]
  • 36. What is MOE doing right now? MOE is now live in production ● MOE is informing active experiments ● MOE is successfully optimizing towards all given metrics ● MOE treats the underlying system it is optimizing as a black box, allowing it to be easily extended to any system Ongoing: ● Looking into best path towards contributing it back to the community, if/when we decide to open source. ● MOE + bandits = <3