1) Netflix uses consumer science and A/B testing to improve the user experience and increase customer satisfaction and retention.
2) Scientists form hypotheses about potential product improvements and then test different variations through controlled experiments with real customers.
3) The results of A/B tests help Netflix determine which changes to roll out more broadly or iterate further based on metrics like hours watched and retention rates.
31-34. Start with a Hypothesis...
If we make a huge "play" button, people will watch more.
If we give people $1 every time they press "play", retention will improve.
Showing more movies & TV shows will lead to more streaming and improved retention.
35-36. Determine the Variables
Showing more movies & TV shows will lead to more streaming and improved retention.
78-80. Roll Out! but...
"NO GOOD...it SucKs BIG TIME...plz change back"
"I am hoping that at least one person at Netflix with authority will put down the crack pipe...and go back to the old interface"
"I don't like it, where is the sortable list? and I can't stand the scroll it's just wierd and stupid..."
91-93. When the World is Flat
Unsure of Value - Retest? If there's a specific concern, address it and consider retesting.
Value Add Feature - Roll out? but...
- Ongoing tax
- Likely to constrain future innovation
96-100. Pitfalls
• A/B testing becomes a crutch for decision making
• Not getting a clear signal
• Too many variations
• Local maximum problem
• Declaring victory too soon
• Not knowing when to end a test
110-115. Fostering the Culture / People Matter
Fostering the Culture:
• Universally embraced
• Common vocabulary
• Be disciplined
• Share results broadly
People Matter:
• Humble
• Focused
• Data-driven
• Curious about business
116. “We are what we repeatedly do.
Excellence, then, is not an act but a
habit.”
– Aristotle
117. Questions?
Rochelle King - roking@netflix.com
Matt Marenghi - mmarenghi@netflix.com
PS - Interested in learning more first hand?
We’re hiring designers and engineers!
At Netflix, we strive to delight our customers by making it as easy as possible to find and watch movies and TV shows.
26M global streaming members.
In the past two years we've developed relationships to get built into over 800 partner products. We are on major game consoles (Wii, PS3, Xbox), mobile devices (iPad, Android, iPhone), and also on DVD players, smart TVs, and home theaters.
The streaming service has allowed us to change the way people use our service: much more mobile, more flexible, and instant. Making it as easy as possible to get to is important, which is why a key part of the strategy has been to get on as many devices as possible: internet-connected TVs, gaming consoles, mobile (tablets, phones), PCs & Macs.
Started in the United States. Expanded to Canada in 2010, Latin America in Sept 2011, and the UK in Jan 2012; next territory in Q4 of 2012.
At Netflix, we make most of our product decisions using "consumer science".
When building a product, you need to be clear on what your goal is. Netflix is a consumer product, so customer satisfaction is the primary goal that drives most product innovation. We're also a subscription business, and we believe that satisfied customers will be more likely to renew their subscriptions and retain better.
We're a data-driven organization, so it's important for us to understand and measure whether or not the new product features we're rolling out are making a positive impact on our customers. Understanding how we measure success needs to be shared across the entire organization.
We use the term "Consumer Science" to capture how we measure success. We want to gather as much information as possible, directly from our customers, to understand what is or isn't working for them. Consumer Science can be made up of many components:
- customer surveys
- hard core data (demographics, % hours watched, etc.)
- qualitative feedback directly from consumers via focus groups and usability testing
- A/B testing, or split testing, where you give your customers a few experiences that are slightly different from each other and see which one performs best
The entire team needs to be on the same page about what "performs best" means.
It's important to choose the right metrics. For Netflix, as a subscription business, RETENTION is the core metric that we want to measure on our tests. Anything that we test in our product should be with the intent of improving retention.
However, retention can be hard to measure, or take a long time to measure. Therefore, it's important to develop leading indicators or proxy metrics. Hours watched is one of our proxy metrics. A customer who watches 4 hours a week of Netflix will be more likely to stick around as a customer than someone who is watching only 1 hour a month. Generally speaking, if they're watching more Netflix, then they're getting more value from our service and are more likely to retain.
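The proxy-metric idea can be sketched in a few lines of code. This is a minimal illustration only; the viewing-log shape, the function name, and the choice of "average weekly hours" are assumptions for the example, not Netflix's actual pipeline:

```python
from collections import defaultdict

def avg_weekly_hours(view_log, weeks):
    """Average hours watched per user per week -- a hypothetical proxy
    (leading indicator) for retention, which is slow to measure directly.
    view_log is an iterable of (user_id, week, hours) rows."""
    totals = defaultdict(float)
    for user_id, week, hours in view_log:
        totals[(user_id, week)] += hours
    users = {u for (u, _w) in totals}
    # Weeks with no viewing count as zero hours, dragging the average down.
    return {u: sum(totals.get((u, w), 0.0) for w in weeks) / len(weeks)
            for u in users}
```

Under this sketch, a member averaging 4 hours a week scores far above one averaging an hour a month, matching the intuition in the notes.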
Every test starts with a hypothesis. Why do we think what we're going to do is actually going to make a difference for the business? Some ideas might sound like they'll make a difference to the core metric, but you need to ensure that they will actually help the overall business (e.g. paying $1 per play would lift the metric but is not good for the business).
We'll use this last hypothesis as an example to walk through how A/B testing works.
What variables will you test to determine whether your hypothesis is sound or not? At Netflix, our movies and TV shows are displayed in rows. Each row represents a different genre or category.
If the hypothesis is about "showing more movies & TV shows" up front, then you can either: 1) add more titles per row (provide more depth within each genre/category), or 2) add more rows and categories (provide more breadth in the catalog for customers to browse). Think about dependent & independent variables, control, significance...
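One way to picture test cells in code: each user is deterministically bucketed into the control or one of the variations. A minimal sketch, where the cell names and the hashing scheme are illustrative assumptions rather than Netflix's actual allocation system:

```python
import hashlib

# Hypothetical cells for the "show more titles" hypothesis above.
CELLS = ["control", "more_titles_per_row", "more_rows"]

def assign_cell(user_id: str, test_name: str, cells=CELLS) -> str:
    """Hash user_id + test_name into a stable, roughly uniform bucket,
    so a given user always sees the same experience for a given test."""
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode()).hexdigest()
    return cells[int(digest, 16) % len(cells)]
```

Hashing on both the user and the test name keeps allocations independent across tests, so one experiment's split doesn't correlate with another's.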
Every test starts with a control, usually the experience that is already out there. Then we make several different experiences (test cells) which hopefully give us a better understanding of how much impact the variables that we're testing have on our customers. It's a test/experiment; keep in mind that the ideal execution can be 2X as effective as a prototype, but not 10X. Once the test has run, analyze your data. Users will answer the question for you of which variables make an impact, and we can learn a lot by understanding why our test succeeded or failed. In this test, adding breadth lifted viewing.
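"Analyze your data" can be as simple as comparing mean hours watched in a test cell against the control. A rough standard-library-only sketch; a real analysis would use a proper statistics package and the team's actual metrics:

```python
from math import erf, sqrt
from statistics import mean, stdev

def compare_to_control(control_hours, cell_hours):
    """Approximate two-sample z-test on mean hours watched.
    Returns (lift, two-sided p-value); reasonable for large samples only."""
    n1, n2 = len(control_hours), len(cell_hours)
    lift = mean(cell_hours) - mean(control_hours)
    se = sqrt(stdev(control_hours) ** 2 / n1 + stdev(cell_hours) ** 2 / n2)
    z = abs(lift) / se
    p = 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))  # normal-approximation p-value
    return lift, p
```

In the breadth test described above, a positive lift with a small p-value is what "adding breadth lifted viewing" would look like in the data.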
We like A/B testing because: 1) It levels the playing field - many ideas (from anyone) can be tested, and it helps democratize product development. It eliminates the problem of only building the idea from the person who yells the loudest, or the "highest paid opinion". 2) It gives us data from real customers - the best and most direct way of understanding what will work with our customers. 3) It aligns to core metrics - it keeps the entire team (design, development, product management) on the same page about what we're measuring and thinking about how to move the business forward.
A/B testing can be used to test radically different ideas as well as smaller iterative ones.
This was the original TV UI for the PlayStation 3, before we had the ability to do A/B testing and dynamically update just the UI with server-delivered UI code. It was the launch of our PS3 downloadable application in late 2009, which replaced the original disc-based version, that introduced our use of the open source browser WebKit. Using WebKit as the UI engine in our application meant we could start doing true A/B testing of the UI in the same way that we had been doing for years on our netflix.com website on PC/Mac.
This experience served as our control. It was already available on a small number of Smart TVs. Its main elements included: a) a menu structure for selecting different categories. The menu allowed for introducing navigation hierarchy with sub-lists, allowing for deeper drill-down into niches of the catalog, for example Romantic Comedies. b) This hierarchical browse exposed more of the catalog via browsing. c) More boxshots were available on screen at any one time.
This experience strived for simplicity. No hierarchy, no menus. Simple navigation within a grid of titles. Horizontal rows for individual categories/lists, with a title always receiving focus and a panel along the right side providing a rich amount of metadata to inform one's decision.
This experience strived to separate navigation from the content. A guided experience through a set of menus. Once a category or sub-category was selected, the navigation got out of the way and the focus was on titles, and on surfacing similar titles to whatever title had focus.
This experience focused on the power of playing video to help inform one's decision, the hypothesis being that perhaps customers can more easily choose what to watch through the act of watching. A title is always playing at fullscreen, with a browse experience as an overlay over the video. Selecting a different title would result in that title playing, while allowing the customer to continue browsing for other potential titles, similar to a channel-surfing type of experience.
Which one do you think performed the best?
The Simple Grid UI won compared to our control. It's also worth noting that cells 3 & 4 were negative compared to the control. Internally, a lot of people were excited about the video cell and were confident that it would perform well. If we had simply rolled out that experience without first testing it, we would have done a disservice to our customers.
Once you have a general direction, you can take it and test iterations on it so that you can improve its performance even more.
An example from our website experience illustrates another benefit of A/B testing.
The hypothesis is that removing a lot of the clutter and affordances from the UI will make it easier for customers to discover great movies and TV shows to watch.
The ‘control’ was the default UI at the time, and cell 1 is a cleaned-up version of the UI:
- Box shots made larger to showcase the content, so large that we could remove the titles from above all the box shots
- A number of items put into the hover state (play buttons, stars, etc.)
The “clean” design won and increased streaming hours.
Naturally, we were excited to roll it out. However, the reaction we got from customers posting on our blog was VERY negative. When you get such emotional feedback (even if it's from a vocal minority), it's hard not to question the decision you made and second-guess yourself. But the data can give you confidence that there was something in your design that was working better for customers. Some folks seemed to be asking for a full rollback to the original site. However, some (still negative) gave us useful insight about the specifics of what they didn't like (the scrolling and the missing sortable list).
Because we controlled for different variables, our data combined with the customer feedback allowed us to discern what changes we should make to the features that we rolled out, while maintaining a lot of the positive benefits we saw as well.
The site today retains much of what we originally rolled out - but without A/B testing, there's a chance that we would have rolled back all our changes and not been able to move the product forward in a meaningful way.
Positive result: roll it out, but keep in mind that for existing customers there can be a "change effect" which might have a negative impact.
Negative result: kill the test. Resist the urge to revisit it and polish it, thinking that will turn a negative result into a positive one - it's too costly and unlikely to work.
We often see tests with a “flat” result. It's important to have the discipline to insist that any product change that doesn't change metrics in a positive direction should be reverted. Even if the change is "only neutral" and you really, really, really like it better, force yourself (and your team) to go back to the drawing board and try again.\n
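The decision rule described above (positive: roll out; negative or flat: revert) can be sketched as a small function. This is an illustrative sketch, not Netflix's actual tooling; the lift threshold and significance flag are hypothetical inputs.

```python
# Hypothetical sketch of the roll-out decision rule: "lift" is the measured
# change in the core metric vs. control, and "significant" is whether that
# change cleared statistical significance.
def rollout_decision(lift: float, significant: bool) -> str:
    if significant and lift > 0:
        # Roll out, but watch for a "change effect" among existing customers.
        return "roll out"
    # Negative or merely flat: revert, even if the team prefers the new design.
    return "revert"

assert rollout_decision(0.03, True) == "roll out"
assert rollout_decision(0.00, False) == "revert"   # flat result
assert rollout_decision(-0.02, True) == "revert"   # negative result
```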
We caution ourselves not to let A/B testing become a crutch for making decisions. Not every idea is worth the cost and effort to test. Focus testing on the ideas that will likely move the business forward and be measurable through core metrics. This helps eliminate mediocre ideas, weak hypotheses, and tests that won't move the needle.
Clear Signal - We tend to focus testing on new users because they don't come with preconceived notions of how to use the product, and the largest audience for us is the one that we have yet to get.
Variations - It's costly and time-intensive to test every single variation in isolation. Test cell design should be thoughtful, and each cell or variation should have a hypothesis behind it.
Local Maximum - Free yourself to pursue the big/wild/unpopular bets AND the smaller, incremental, sound-hypothesis ideas. Know when to "pivot" because you've maxed out a specific angle.
Early Victory - Have the discipline to let a test run its expected course before getting too excited by very early results. Likewise, if a test has run its planned course (e.g. two months) with no positive signal, or a negative one, don't stubbornly let it run indefinitely hoping it will magically turn positive. It's unlikely, and having the maturity to know when to call it a day and move on to testing other great ideas is important.
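A minimal sketch of how a "clear signal" might be checked, assuming retention is measured as a simple conversion rate per cell. The numbers and the two-proportion z-test are illustrative assumptions, not a description of Netflix's actual analysis pipeline.

```python
import math

def retention_z_test(control_retained: int, control_n: int,
                     test_retained: int, test_n: int) -> float:
    """Two-proportion z-test: how strong is the retention difference?"""
    p1 = control_retained / control_n
    p2 = test_retained / test_n
    # Pooled retention rate across both cells, under the null hypothesis.
    p = (control_retained + test_retained) / (control_n + test_n)
    se = math.sqrt(p * (1 - p) * (1 / control_n + 1 / test_n))
    return (p2 - p1) / se  # |z| > 1.96 is roughly significant at the 5% level

# Hypothetical test: 40.0% vs. 43.0% retention on 10k members per cell.
z = retention_z_test(4000, 10000, 4300, 10000)
print(abs(z) > 1.96)  # a clear signal at these sample sizes
```

With small lifts or small cells, z stays under the threshold, which is the "flat" case where the disciplined move is to revert.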
Consumer Science is successful at Netflix because it’s part of our DNA, and something that we’ve evolved over many, many years. Everyone who works at Netflix (design, engineering, product management BUT ALSO legal, HR, recruiting, finance) understands what A/B testing is and how it’s leveraged.\n
Two parts make this successful:
1) the day-to-day practice of A/B testing
2) the people that we hire (again, in ALL parts of the company)
FOSTERING THE CULTURE:
Universally Embraced - From the executive team to all the individuals working on execution.

Vocabulary - Words like "hypothesis" and "core metric" are commonly used to explain what we do. This keeps everyone on the same page and makes product discussions and brainstorms more effective and less opinion-driven (e.g. "I'd hypothesize...", "I believe the data might show...", "How are we going to measure X?" vs. "I think users want X").

Discipline - With lots of exciting ideas, it's easy to want to make exceptions and just roll things out (but if we had done that with the video-based TV UI, we would have done the wrong thing). A/B testing is not a selectively used tool, and not viewed as optional. It's the default approach to making decisions when a hypothesis is testable. Decisions in the absence of A/B test results are the exception, because you just don't know whether that decision positively or negatively affected your business. If it CAN be tested, it SHOULD be tested.

Share Results - Habitual sharing, context setting, and broad communication of test results, company-wide. It helps reinforce that many decisions are influenced by test results, and it helps everyone learn from what worked and what didn't. It allows all of us to hone our consumer instincts.

PEOPLE:
Humble - An empirical focus keeps us humble: most of the time you don't know exactly what your customer wants (even if you're an expert in your field). Quick feedback from testing sets us straight and forces us to optimize for the customer. You WILL be wrong at some point in your career, and you need to be able to accept that (no egos).

Focused - Given the experimental nature of testing, you need to know what to focus on and when. Know how much effort to put into something to make it work well enough to get a good signal, and know what polish you can put off until it goes into production (if it goes into production).

Data-driven - EVERYONE needs an appreciation for the data (design, engineering, PM) and a basic understanding of how our data analysis works (statistical significance, etc.).

Curiosity about the business - Business acumen and savvy are important because all of our tests are designed to make an impact on the business. Understanding the fundamental business strategy gives you a more holistic understanding of how your day-to-day work affects the company at large, and lets you participate better in testing by helping you craft better ideas for what to test.
FOSTERING THE CULTURE:\nUniversally Embraced - from the executive team to all the individuals that are working on execution\n\nVocabulary - words like: “hypothesis” and “core metric” are commonly used to explain what we do. keeps everyone on the same page. makes product discussions, brainstorms more effective and less opinion driven with no reference to needed data to back it up (e.g. “I’d hypothesize...”, “I believe the data might show..” vs. “I think users want X”, ‘how are we going to measure X?’)\n\nDiscipline - with lots of exciting ideas, it’s easy to want to make exceptions and just roll things out (but if we had done that the video-based TV UI, we would have done the wrong thing). A/B testing is not a selectively used tool, and not viewed as optional. It’s the default approach to making decisions when a hypothesis is testable. Decisions in absence of A/B test results are the exception, because you just don’t know if that decision positively/negatively affected your business. If it CAN be tested, it SHOULD be tested.\nShare Results - Habitual sharing/context setting/broad communication, company-wide, of test results. It helps reinforce that many decisions are influenced by test results. Also, it helps everyone learn from what worked and what didn’t. Allows all of us to hone our consumer instincts.\n\nPEOPLE:\nHumble - Empirical focus keeps us humble - most of the time you don’t know exactly what your customer wants (even if you’re an expert in your field). Quick feedback from testing set us straight, forces us to optimize for the customer. You WILL be wrong at some point in your career and you need to be able to accept that (no egos).\nFocused - With the experimental nature of testing, you need to be able to know what to focus on and when. 
Know how much effort to put into something to make it work well enough to get a good signal, and know what polish you can put off until it goes into production (if it goes into production)\nData-driven - EVERYONE needs to have an appreciation for the data (design, engineering, PM) and a basic understanding of how our data analysis works (statistical significance, etc.)\nCuriosity about business - Business acumen and business savvy are important because all of our tests are designed to make an impact on the business. Understanding the fundamental business strategy allows you to have a more holistic understanding of how your day to day work is impacting the company at large. (And will allow you to participate better in testing by helping you craft better ideas for what to test).\n\n\n\n\n\n
FOSTERING THE CULTURE:\nUniversally Embraced - from the executive team to all the individuals that are working on execution\n\nVocabulary - words like: “hypothesis” and “core metric” are commonly used to explain what we do. keeps everyone on the same page. makes product discussions, brainstorms more effective and less opinion driven with no reference to needed data to back it up (e.g. “I’d hypothesize...”, “I believe the data might show..” vs. “I think users want X”, ‘how are we going to measure X?’)\n\nDiscipline - with lots of exciting ideas, it’s easy to want to make exceptions and just roll things out (but if we had done that the video-based TV UI, we would have done the wrong thing). A/B testing is not a selectively used tool, and not viewed as optional. It’s the default approach to making decisions when a hypothesis is testable. Decisions in absence of A/B test results are the exception, because you just don’t know if that decision positively/negatively affected your business. If it CAN be tested, it SHOULD be tested.\nShare Results - Habitual sharing/context setting/broad communication, company-wide, of test results. It helps reinforce that many decisions are influenced by test results. Also, it helps everyone learn from what worked and what didn’t. Allows all of us to hone our consumer instincts.\n\nPEOPLE:\nHumble - Empirical focus keeps us humble - most of the time you don’t know exactly what your customer wants (even if you’re an expert in your field). Quick feedback from testing set us straight, forces us to optimize for the customer. You WILL be wrong at some point in your career and you need to be able to accept that (no egos).\nFocused - With the experimental nature of testing, you need to be able to know what to focus on and when. 
Know how much effort to put into something to make it work well enough to get a good signal, and know what polish you can put off until it goes into production (if it goes into production)\nData-driven - EVERYONE needs to have an appreciation for the data (design, engineering, PM) and a basic understanding of how our data analysis works (statistical significance, etc.)\nCuriosity about business - Business acumen and business savvy are important because all of our tests are designed to make an impact on the business. Understanding the fundamental business strategy allows you to have a more holistic understanding of how your day to day work is impacting the company at large. (And will allow you to participate better in testing by helping you craft better ideas for what to test).\n\n\n\n\n\n
FOSTERING THE CULTURE:\nUniversally Embraced - from the executive team to all the individuals that are working on execution\n\nVocabulary - words like: “hypothesis” and “core metric” are commonly used to explain what we do. keeps everyone on the same page. makes product discussions, brainstorms more effective and less opinion driven with no reference to needed data to back it up (e.g. “I’d hypothesize...”, “I believe the data might show..” vs. “I think users want X”, ‘how are we going to measure X?’)\n\nDiscipline - with lots of exciting ideas, it’s easy to want to make exceptions and just roll things out (but if we had done that the video-based TV UI, we would have done the wrong thing). A/B testing is not a selectively used tool, and not viewed as optional. It’s the default approach to making decisions when a hypothesis is testable. Decisions in absence of A/B test results are the exception, because you just don’t know if that decision positively/negatively affected your business. If it CAN be tested, it SHOULD be tested.\nShare Results - Habitual sharing/context setting/broad communication, company-wide, of test results. It helps reinforce that many decisions are influenced by test results. Also, it helps everyone learn from what worked and what didn’t. Allows all of us to hone our consumer instincts.\n\nPEOPLE:\nHumble - Empirical focus keeps us humble - most of the time you don’t know exactly what your customer wants (even if you’re an expert in your field). Quick feedback from testing set us straight, forces us to optimize for the customer. You WILL be wrong at some point in your career and you need to be able to accept that (no egos).\nFocused - With the experimental nature of testing, you need to be able to know what to focus on and when. 
Know how much effort to put into something to make it work well enough to get a good signal, and know what polish you can put off until it goes into production (if it goes into production)\nData-driven - EVERYONE needs to have an appreciation for the data (design, engineering, PM) and a basic understanding of how our data analysis works (statistical significance, etc.)\nCuriosity about business - Business acumen and business savvy are important because all of our tests are designed to make an impact on the business. Understanding the fundamental business strategy allows you to have a more holistic understanding of how your day to day work is impacting the company at large. (And will allow you to participate better in testing by helping you craft better ideas for what to test).\n\n\n\n\n\n
FOSTERING THE CULTURE:\nUniversally Embraced - from the executive team to all the individuals that are working on execution\n\nVocabulary - words like: “hypothesis” and “core metric” are commonly used to explain what we do. keeps everyone on the same page. makes product discussions, brainstorms more effective and less opinion driven with no reference to needed data to back it up (e.g. “I’d hypothesize...”, “I believe the data might show..” vs. “I think users want X”, ‘how are we going to measure X?’)\n\nDiscipline - with lots of exciting ideas, it’s easy to want to make exceptions and just roll things out (but if we had done that the video-based TV UI, we would have done the wrong thing). A/B testing is not a selectively used tool, and not viewed as optional. It’s the default approach to making decisions when a hypothesis is testable. Decisions in absence of A/B test results are the exception, because you just don’t know if that decision positively/negatively affected your business. If it CAN be tested, it SHOULD be tested.\nShare Results - Habitual sharing/context setting/broad communication, company-wide, of test results. It helps reinforce that many decisions are influenced by test results. Also, it helps everyone learn from what worked and what didn’t. Allows all of us to hone our consumer instincts.\n\nPEOPLE:\nHumble - Empirical focus keeps us humble - most of the time you don’t know exactly what your customer wants (even if you’re an expert in your field). Quick feedback from testing set us straight, forces us to optimize for the customer. You WILL be wrong at some point in your career and you need to be able to accept that (no egos).\nFocused - With the experimental nature of testing, you need to be able to know what to focus on and when. 
Know how much effort to put into something to make it work well enough to get a good signal, and know what polish you can put off until it goes into production (if it goes into production)\nData-driven - EVERYONE needs to have an appreciation for the data (design, engineering, PM) and a basic understanding of how our data analysis works (statistical significance, etc.)\nCuriosity about business - Business acumen and business savvy are important because all of our tests are designed to make an impact on the business. Understanding the fundamental business strategy allows you to have a more holistic understanding of how your day to day work is impacting the company at large. (And will allow you to participate better in testing by helping you craft better ideas for what to test).\n\n\n\n\n\n
FOSTERING THE CULTURE:\nUniversally Embraced - from the executive team to all the individuals that are working on execution\n\nVocabulary - words like: “hypothesis” and “core metric” are commonly used to explain what we do. keeps everyone on the same page. makes product discussions, brainstorms more effective and less opinion driven with no reference to needed data to back it up (e.g. “I’d hypothesize...”, “I believe the data might show..” vs. “I think users want X”, ‘how are we going to measure X?’)\n\nDiscipline - with lots of exciting ideas, it’s easy to want to make exceptions and just roll things out (but if we had done that the video-based TV UI, we would have done the wrong thing). A/B testing is not a selectively used tool, and not viewed as optional. It’s the default approach to making decisions when a hypothesis is testable. Decisions in absence of A/B test results are the exception, because you just don’t know if that decision positively/negatively affected your business. If it CAN be tested, it SHOULD be tested.\nShare Results - Habitual sharing/context setting/broad communication, company-wide, of test results. It helps reinforce that many decisions are influenced by test results. Also, it helps everyone learn from what worked and what didn’t. Allows all of us to hone our consumer instincts.\n\nPEOPLE:\nHumble - Empirical focus keeps us humble - most of the time you don’t know exactly what your customer wants (even if you’re an expert in your field). Quick feedback from testing set us straight, forces us to optimize for the customer. You WILL be wrong at some point in your career and you need to be able to accept that (no egos).\nFocused - With the experimental nature of testing, you need to be able to know what to focus on and when. 
Know how much effort to put into something to make it work well enough to get a good signal, and know what polish you can put off until it goes into production (if it goes into production)\nData-driven - EVERYONE needs to have an appreciation for the data (design, engineering, PM) and a basic understanding of how our data analysis works (statistical significance, etc.)\nCuriosity about business - Business acumen and business savvy are important because all of our tests are designed to make an impact on the business. Understanding the fundamental business strategy allows you to have a more holistic understanding of how your day to day work is impacting the company at large. (And will allow you to participate better in testing by helping you craft better ideas for what to test).\n\n\n\n\n\n
FOSTERING THE CULTURE:\nUniversally Embraced - from the executive team to all the individuals that are working on execution\n\nVocabulary - words like: “hypothesis” and “core metric” are commonly used to explain what we do. keeps everyone on the same page. makes product discussions, brainstorms more effective and less opinion driven with no reference to needed data to back it up (e.g. “I’d hypothesize...”, “I believe the data might show..” vs. “I think users want X”, ‘how are we going to measure X?’)\n\nDiscipline - with lots of exciting ideas, it’s easy to want to make exceptions and just roll things out (but if we had done that the video-based TV UI, we would have done the wrong thing). A/B testing is not a selectively used tool, and not viewed as optional. It’s the default approach to making decisions when a hypothesis is testable. Decisions in absence of A/B test results are the exception, because you just don’t know if that decision positively/negatively affected your business. If it CAN be tested, it SHOULD be tested.\nShare Results - Habitual sharing/context setting/broad communication, company-wide, of test results. It helps reinforce that many decisions are influenced by test results. Also, it helps everyone learn from what worked and what didn’t. Allows all of us to hone our consumer instincts.\n\nPEOPLE:\nHumble - Empirical focus keeps us humble - most of the time you don’t know exactly what your customer wants (even if you’re an expert in your field). Quick feedback from testing set us straight, forces us to optimize for the customer. You WILL be wrong at some point in your career and you need to be able to accept that (no egos).\nFocused - With the experimental nature of testing, you need to be able to know what to focus on and when. 
Know how much effort to put into something to make it work well enough to get a good signal, and know what polish you can put off until it goes into production (if it goes into production)\nData-driven - EVERYONE needs to have an appreciation for the data (design, engineering, PM) and a basic understanding of how our data analysis works (statistical significance, etc.)\nCuriosity about business - Business acumen and business savvy are important because all of our tests are designed to make an impact on the business. Understanding the fundamental business strategy allows you to have a more holistic understanding of how your day to day work is impacting the company at large. (And will allow you to participate better in testing by helping you craft better ideas for what to test).\n\n\n\n\n\n
FOSTERING THE CULTURE:\nUniversally Embraced - from the executive team to all the individuals that are working on execution\n\nVocabulary - words like: “hypothesis” and “core metric” are commonly used to explain what we do. keeps everyone on the same page. makes product discussions, brainstorms more effective and less opinion driven with no reference to needed data to back it up (e.g. “I’d hypothesize...”, “I believe the data might show..” vs. “I think users want X”, ‘how are we going to measure X?’)\n\nDiscipline - with lots of exciting ideas, it’s easy to want to make exceptions and just roll things out (but if we had done that the video-based TV UI, we would have done the wrong thing). A/B testing is not a selectively used tool, and not viewed as optional. It’s the default approach to making decisions when a hypothesis is testable. Decisions in absence of A/B test results are the exception, because you just don’t know if that decision positively/negatively affected your business. If it CAN be tested, it SHOULD be tested.\nShare Results - Habitual sharing/context setting/broad communication, company-wide, of test results. It helps reinforce that many decisions are influenced by test results. Also, it helps everyone learn from what worked and what didn’t. Allows all of us to hone our consumer instincts.\n\nPEOPLE:\nHumble - Empirical focus keeps us humble - most of the time you don’t know exactly what your customer wants (even if you’re an expert in your field). Quick feedback from testing set us straight, forces us to optimize for the customer. You WILL be wrong at some point in your career and you need to be able to accept that (no egos).\nFocused - With the experimental nature of testing, you need to be able to know what to focus on and when. 
Know how much effort to put into something to make it work well enough to get a good signal, and know what polish you can put off until it goes into production (if it goes into production)\nData-driven - EVERYONE needs to have an appreciation for the data (design, engineering, PM) and a basic understanding of how our data analysis works (statistical significance, etc.)\nCuriosity about business - Business acumen and business savvy are important because all of our tests are designed to make an impact on the business. Understanding the fundamental business strategy allows you to have a more holistic understanding of how your day to day work is impacting the company at large. (And will allow you to participate better in testing by helping you craft better ideas for what to test).\n\n\n\n\n\n
FOSTERING THE CULTURE:\nUniversally Embraced - from the executive team to all the individuals that are working on execution\n\nVocabulary - words like: “hypothesis” and “core metric” are commonly used to explain what we do. keeps everyone on the same page. makes product discussions, brainstorms more effective and less opinion driven with no reference to needed data to back it up (e.g. “I’d hypothesize...”, “I believe the data might show..” vs. “I think users want X”, ‘how are we going to measure X?’)\n\nDiscipline - with lots of exciting ideas, it’s easy to want to make exceptions and just roll things out (but if we had done that the video-based TV UI, we would have done the wrong thing). A/B testing is not a selectively used tool, and not viewed as optional. It’s the default approach to making decisions when a hypothesis is testable. Decisions in absence of A/B test results are the exception, because you just don’t know if that decision positively/negatively affected your business. If it CAN be tested, it SHOULD be tested.\nShare Results - Habitual sharing/context setting/broad communication, company-wide, of test results. It helps reinforce that many decisions are influenced by test results. Also, it helps everyone learn from what worked and what didn’t. Allows all of us to hone our consumer instincts.\n\nPEOPLE:\nHumble - Empirical focus keeps us humble - most of the time you don’t know exactly what your customer wants (even if you’re an expert in your field). Quick feedback from testing set us straight, forces us to optimize for the customer. You WILL be wrong at some point in your career and you need to be able to accept that (no egos).\nFocused - With the experimental nature of testing, you need to be able to know what to focus on and when. 