SlideShare a Scribd company logo
1 of 27
CONCURRENCY
PATTERNS WITH
MONGODB
Yann Cluchey
CTO @ Cogenta
• Real-time retail intelligence
• Gather products and prices from web
• MongoDB in production
• Millions of updates per day, 3K/s peak
• Data in SQL, Mongo, ElasticSearch
Concurrency Patterns: Why?
• MongoDB: atomic updates, no transactions
• Need to ensure consistency & correctness
• What are my options with Mongo?
• Shortcuts
• Different approaches
Concurrency Control Strategies
• Pessimistic
• Suited for frequent conflicts
• http://bit.ly/two-phase-commits
• Optimistic
• Efficient when conflicts are rare
• http://bit.ly/isolate-sequence
• Multi-version
• All versions stored, client resolves conflict
• e.g. CouchDb
Optimistic Concurrency Control (OCC)
• No locks
• Prevent dirty writes
• Uses timestamp or a revision number
• Client checks & replays transaction
Example
Original
{ _id: 23, comment: “The quick brown fox…” }
Edit 1
{ _id: 23,
comment: “The quick brown fox prefers SQL” }
Edit 2
{ _id: 23,
comment: “The quick brown fox prefers
MongoDB” }
Example
Edit 1
db.comments.update(
{ _id: 23 },
{ _id: 23,
comment: “The quick brown fox prefers SQL” })
Edit 2
db.comments.update(
{ _id: 23 },
{ _id: 23,
comment: “The quick brown fox prefers MongoDB”
})
Outcome: One update is lost, other might be wrong
OCC Example
Original
{ _id: 23, rev: 1,
comment: “The quick brown fox…” }
Update a specific revision (edit 1)
db.comments.update(
{ _id: 23, rev: 1 },
{ _id: 23, rev: 2,
comment: “The quick brown fox prefers SQL”
})
OCC Example
Edit 2
db.comments.update(
{ _id: 23, rev: 1 },
{ _id: 23, rev: 2,
comment: “The quick brown fox prefers
MongoDB” })
..fails
{ updatedExisting: false, n: 0,
err: null, ok: 1 }
• Caveat: Only works if all clients follow convention
Update Operators in Mongo
• Avoid full document replacement by using operators
• Powerful operators such as $inc, $set, $push
• Many operators can be grouped into single atomic update
• More efficient (data over wire, parsing, etc.)
• Use as much as possible
• http://bit.ly/update-operators
Still Need OCC?
A hit counter
{ _id: 1, hits: 5040 }
Edit 1
db.stats.update({ _id: 1 },
{ $set: { hits: 5045 } })
Edit 2
db.stats.update({ _id: 1 },
{ $set: { hits: 5055 } })
Still Need OCC?
Edit 1
db.stats.update({ _id: 1 },
{ $inc: { hits: 5 } })
Edit 2
db.stats.update({ _id: 1 },
{ $inc: { hits: 10 } })
• Sequence of updates might vary
• Outcome always the same
• But what if sequence is important?
Still Need OCC?
• Operators can offset need for concurrency control
• Support for complex atomic manipulation
• Depends on use case
• You’ll need it for
• Opaque changes (e.g. text)
• Complex update logic in app domain
(e.g. changing a value affects some calculated fields)
• Sequence is important and can’t be inferred
Update Commands
• Update
• Specify query to match one or more documents
• Use { multi: true } to update multiple documents
• Must call Find() separately if you want a copy of the doc
• FindAndModify
• Update single document only
• Find + Update in single hit (atomic)
• Returns the doc before or after update
• Whole doc or subset
• Upsert (update or insert)
• Important feature. Works with OCC..?
Consistent Update Example
• Have a customer document
• Want to set the LastOrderValue and return the previous
value
db.customers.findAndModify({
query: { _id: 16, rev: 45 },
update: {
$set: { lastOrderValue: 699 },
$inc: { rev: 1 }
},
new: false
})
Consistent Update Example
• Customer has since been updated, or doesn’t exist
• Client should replay
null
• Intended version of customer successfully updated
• Original version is returned
{ _id: 16, rev: 45, lastOrderValue: 145 }
• Useful if client has got partial information and needs the
full document
• A separate Find() could introduce inconsistency
Independent Update with Upsert
• Keep stats about customers
• Want to increment NumOrders and return new total
• Customer document might not be there
• Independent operation still needs protection
db.customerStats.findAndModify({
query: { _id: 16 },
update: {
$inc: { numOrders: 1, rev: 1 },
$setOnInsert: { name: “Yann” }
},
new: true,
upsert: true
})
Independent Update with Upsert
• First run, document is created
{ _id: 16, numOrders: 1, rev: 1, name: “Yann” }
• Second run, document is updated
{ _id: 16, numOrders: 2, rev: 2, name: “Yann” }
Subdocuments
• Common scenario
• e.g. Customer and Orders in single document
• Clients like having everything
• Powerful operators for matching and updating
subdocuments
• $elemMatch, $, $addToSet, $push
• Alternatives to “Fat” documents;
• Client-side joins
• Aggregation
• MapReduce
Currency Control and Subdocuments
• Patterns described here still work, but might be
impractical
• Docs are large
• More collisions
• Solve with scale?
Subdocument Example
• Customer document contains orders
• Want to independently update orders
• Correct order #471 value to £260
{
_id: 16,
rev: 20,
name: “Yann”,
orders: {
“471”: { id: 471, value: 250, rev: 4 }
}
}
Subdocument Example
db.customers.findAndModify({
query: { “orders.471.rev”: { $lte: 4 } },
update: {
$set: { “orders.471.value”: 260 },
$inc: { rev: 1, “orders.471.rev”: 1 },
$setOnInsert: {
name: “Yann”,
“orders.471.id”: 471 }
},
new: true,
upsert: true
})
Subdocument Example
• First run, order updated successfully
• Could create if not exists
{
_id: 16,
rev: 21,
name: “Yann”,
orders: {
“471”: { id: 471, value: 260, rev: 5 }
}
}
Subdocument Example
• Second conflicting run
• Query didn’t match revision, duplicate document created
{
_id: ObjectId("533bf88a50dbb55a8a9b9128"),
rev: 1,
name: “Yann”,
orders: {
“471”: { id: 471, value: 260, rev: 1 }
}
}
Subdocument Example
• Solve with unique index (good idea anyway)
db.customers.ensureIndex(
{ "orders.id" : 1 },
{
"name" : "orderids",
"unique" : true
})
Subdocument Example
Client can handle findAndModify result accordingly;
• Successful update
{ updatedExisting: true }
• New document created
{ updatedExisting: false, n: 1 }
• Conflict, need to replay
{ errmsg: “exception: E11000 duplicate key
error index: db.customers.$orderids dup key…” }
Final Words
• Don’t forget deletes
• Gotchas about subdocument structure
orders: [ { id: 471 }, … ]
orders: { “471”: { }, … }
orders: { “471”: { id: 471 }, … }
• Coming in 2.6.0 stable
$setOnInsert: { _id: .. }
• Sharding..?

More Related Content

What's hot

[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL Tuning[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL TuningPgDay.Seoul
 
Introducing MongoDB Atlas
Introducing MongoDB AtlasIntroducing MongoDB Atlas
Introducing MongoDB AtlasMongoDB
 
MongoDB sharded cluster. How to design your topology ?
MongoDB sharded cluster. How to design your topology ?MongoDB sharded cluster. How to design your topology ?
MongoDB sharded cluster. How to design your topology ?Mydbops
 
Amazon Redshift의 이해와 활용 (김용우) - AWS DB Day
Amazon Redshift의 이해와 활용 (김용우) - AWS DB DayAmazon Redshift의 이해와 활용 (김용우) - AWS DB Day
Amazon Redshift의 이해와 활용 (김용우) - AWS DB DayAmazon Web Services Korea
 
An Introduction to MongoDB Compass
An Introduction to MongoDB CompassAn Introduction to MongoDB Compass
An Introduction to MongoDB CompassMongoDB
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookThe Hive
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBNodeXperts
 
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Spark Summit
 
MongoDB Atlas
MongoDB AtlasMongoDB Atlas
MongoDB AtlasMongoDB
 
[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기NAVER D2
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsJulien Le Dem
 
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유Hyojun Jeon
 
Redis - Usability and Use Cases
Redis - Usability and Use CasesRedis - Usability and Use Cases
Redis - Usability and Use CasesFabrizio Farinacci
 
Evolution of MongoDB Replicaset and Its Best Practices
Evolution of MongoDB Replicaset and Its Best PracticesEvolution of MongoDB Replicaset and Its Best Practices
Evolution of MongoDB Replicaset and Its Best PracticesMydbops
 
코드로 인프라 관리하기 - 자동화 툴 소개
코드로 인프라 관리하기 - 자동화 툴 소개코드로 인프라 관리하기 - 자동화 툴 소개
코드로 인프라 관리하기 - 자동화 툴 소개태준 문
 
Efficient DBA: Gain Time by Reducing Command-Line Keystrokes
Efficient DBA: Gain Time by Reducing Command-Line KeystrokesEfficient DBA: Gain Time by Reducing Command-Line Keystrokes
Efficient DBA: Gain Time by Reducing Command-Line KeystrokesSeth Miller
 
[215]네이버콘텐츠통계서비스소개 김기영
[215]네이버콘텐츠통계서비스소개 김기영[215]네이버콘텐츠통계서비스소개 김기영
[215]네이버콘텐츠통계서비스소개 김기영NAVER D2
 
스타트업 사례로 본 로그 데이터 분석 : Tajo on AWS
스타트업 사례로 본 로그 데이터 분석 : Tajo on AWS스타트업 사례로 본 로그 데이터 분석 : Tajo on AWS
스타트업 사례로 본 로그 데이터 분석 : Tajo on AWSMatthew (정재화)
 
Parquet Strata/Hadoop World, New York 2013
Parquet Strata/Hadoop World, New York 2013Parquet Strata/Hadoop World, New York 2013
Parquet Strata/Hadoop World, New York 2013Julien Le Dem
 

What's hot (20)

[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL Tuning[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL Tuning
 
Introducing MongoDB Atlas
Introducing MongoDB AtlasIntroducing MongoDB Atlas
Introducing MongoDB Atlas
 
MongoDB sharded cluster. How to design your topology ?
MongoDB sharded cluster. How to design your topology ?MongoDB sharded cluster. How to design your topology ?
MongoDB sharded cluster. How to design your topology ?
 
Amazon Redshift의 이해와 활용 (김용우) - AWS DB Day
Amazon Redshift의 이해와 활용 (김용우) - AWS DB DayAmazon Redshift의 이해와 활용 (김용우) - AWS DB Day
Amazon Redshift의 이해와 활용 (김용우) - AWS DB Day
 
An Introduction to MongoDB Compass
An Introduction to MongoDB CompassAn Introduction to MongoDB Compass
An Introduction to MongoDB Compass
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
 
MongoDB Atlas
MongoDB AtlasMongoDB Atlas
MongoDB Atlas
 
[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analytics
 
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유
 
Redis - Usability and Use Cases
Redis - Usability and Use CasesRedis - Usability and Use Cases
Redis - Usability and Use Cases
 
Evolution of MongoDB Replicaset and Its Best Practices
Evolution of MongoDB Replicaset and Its Best PracticesEvolution of MongoDB Replicaset and Its Best Practices
Evolution of MongoDB Replicaset and Its Best Practices
 
코드로 인프라 관리하기 - 자동화 툴 소개
코드로 인프라 관리하기 - 자동화 툴 소개코드로 인프라 관리하기 - 자동화 툴 소개
코드로 인프라 관리하기 - 자동화 툴 소개
 
Efficient DBA: Gain Time by Reducing Command-Line Keystrokes
Efficient DBA: Gain Time by Reducing Command-Line KeystrokesEfficient DBA: Gain Time by Reducing Command-Line Keystrokes
Efficient DBA: Gain Time by Reducing Command-Line Keystrokes
 
[215]네이버콘텐츠통계서비스소개 김기영
[215]네이버콘텐츠통계서비스소개 김기영[215]네이버콘텐츠통계서비스소개 김기영
[215]네이버콘텐츠통계서비스소개 김기영
 
스타트업 사례로 본 로그 데이터 분석 : Tajo on AWS
스타트업 사례로 본 로그 데이터 분석 : Tajo on AWS스타트업 사례로 본 로그 데이터 분석 : Tajo on AWS
스타트업 사례로 본 로그 데이터 분석 : Tajo on AWS
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
 
Parquet Strata/Hadoop World, New York 2013
Parquet Strata/Hadoop World, New York 2013Parquet Strata/Hadoop World, New York 2013
Parquet Strata/Hadoop World, New York 2013
 

Similar to Concurrency Patterns with MongoDB

Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6MongoDB
 
Novedades de MongoDB 3.6
Novedades de MongoDB 3.6Novedades de MongoDB 3.6
Novedades de MongoDB 3.6MongoDB
 
TechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep DiveTechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep DiveIntergen
 
WinOps Conf 2016 - Michael Greene - Release Pipelines
WinOps Conf 2016 - Michael Greene - Release PipelinesWinOps Conf 2016 - Michael Greene - Release Pipelines
WinOps Conf 2016 - Michael Greene - Release PipelinesWinOps Conf
 
Benchmarking at Parse
Benchmarking at ParseBenchmarking at Parse
Benchmarking at ParseTravis Redman
 
Advanced Benchmarking at Parse
Advanced Benchmarking at ParseAdvanced Benchmarking at Parse
Advanced Benchmarking at ParseMongoDB
 
What's new in MongoDB 3.6?
What's new in MongoDB 3.6?What's new in MongoDB 3.6?
What's new in MongoDB 3.6?MongoDB
 
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward
 
Eventually Elasticsearch: Eventual Consistency in the Real World
Eventually Elasticsearch: Eventual Consistency in the Real WorldEventually Elasticsearch: Eventual Consistency in the Real World
Eventually Elasticsearch: Eventual Consistency in the Real WorldBeyondTrees
 
Webinar: What's new in the .NET Driver
Webinar: What's new in the .NET DriverWebinar: What's new in the .NET Driver
Webinar: What's new in the .NET DriverMongoDB
 
MSFT Dumaguete 061616 - Building High Performance Apps
MSFT Dumaguete 061616 - Building High Performance AppsMSFT Dumaguete 061616 - Building High Performance Apps
MSFT Dumaguete 061616 - Building High Performance AppsMarc Obaldo
 
No REST - Architecting Real-time Bulk Async APIs
No REST - Architecting Real-time Bulk Async APIsNo REST - Architecting Real-time Bulk Async APIs
No REST - Architecting Real-time Bulk Async APIsC4Media
 
Lessons from the Trenches - Building Enterprise Applications with RavenDB
Lessons from the Trenches - Building Enterprise Applications with RavenDBLessons from the Trenches - Building Enterprise Applications with RavenDB
Lessons from the Trenches - Building Enterprise Applications with RavenDBOren Eini
 
Designing your API Server for mobile apps
Designing your API Server for mobile appsDesigning your API Server for mobile apps
Designing your API Server for mobile appsMugunth Kumar
 
Cost-based Query Optimization in Hive
Cost-based Query Optimization in HiveCost-based Query Optimization in Hive
Cost-based Query Optimization in HiveDataWorks Summit
 
Cost-based query optimization in Apache Hive
Cost-based query optimization in Apache HiveCost-based query optimization in Apache Hive
Cost-based query optimization in Apache HiveJulian Hyde
 
Grails patterns and practices
Grails patterns and practicesGrails patterns and practices
Grails patterns and practicespaulbowler
 
MongoDB World 2019: Scaling Real-time Collaboration with MongoDB
MongoDB World 2019: Scaling Real-time Collaboration with MongoDBMongoDB World 2019: Scaling Real-time Collaboration with MongoDB
MongoDB World 2019: Scaling Real-time Collaboration with MongoDBMongoDB
 
Boosting the Performance of your Rails Apps
Boosting the Performance of your Rails AppsBoosting the Performance of your Rails Apps
Boosting the Performance of your Rails AppsMatt Kuklinski
 
Improving the Quality of Existing Software
Improving the Quality of Existing SoftwareImproving the Quality of Existing Software
Improving the Quality of Existing SoftwareSteven Smith
 

Similar to Concurrency Patterns with MongoDB (20)

Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6
 
Novedades de MongoDB 3.6
Novedades de MongoDB 3.6Novedades de MongoDB 3.6
Novedades de MongoDB 3.6
 
TechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep DiveTechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
 
WinOps Conf 2016 - Michael Greene - Release Pipelines
WinOps Conf 2016 - Michael Greene - Release PipelinesWinOps Conf 2016 - Michael Greene - Release Pipelines
WinOps Conf 2016 - Michael Greene - Release Pipelines
 
Benchmarking at Parse
Benchmarking at ParseBenchmarking at Parse
Benchmarking at Parse
 
Advanced Benchmarking at Parse
Advanced Benchmarking at ParseAdvanced Benchmarking at Parse
Advanced Benchmarking at Parse
 
What's new in MongoDB 3.6?
What's new in MongoDB 3.6?What's new in MongoDB 3.6?
What's new in MongoDB 3.6?
 
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
 
Eventually Elasticsearch: Eventual Consistency in the Real World
Eventually Elasticsearch: Eventual Consistency in the Real WorldEventually Elasticsearch: Eventual Consistency in the Real World
Eventually Elasticsearch: Eventual Consistency in the Real World
 
Webinar: What's new in the .NET Driver
Webinar: What's new in the .NET DriverWebinar: What's new in the .NET Driver
Webinar: What's new in the .NET Driver
 
MSFT Dumaguete 061616 - Building High Performance Apps
MSFT Dumaguete 061616 - Building High Performance AppsMSFT Dumaguete 061616 - Building High Performance Apps
MSFT Dumaguete 061616 - Building High Performance Apps
 
No REST - Architecting Real-time Bulk Async APIs
No REST - Architecting Real-time Bulk Async APIsNo REST - Architecting Real-time Bulk Async APIs
No REST - Architecting Real-time Bulk Async APIs
 
Lessons from the Trenches - Building Enterprise Applications with RavenDB
Lessons from the Trenches - Building Enterprise Applications with RavenDBLessons from the Trenches - Building Enterprise Applications with RavenDB
Lessons from the Trenches - Building Enterprise Applications with RavenDB
 
Designing your API Server for mobile apps
Designing your API Server for mobile appsDesigning your API Server for mobile apps
Designing your API Server for mobile apps
 
Cost-based Query Optimization in Hive
Cost-based Query Optimization in HiveCost-based Query Optimization in Hive
Cost-based Query Optimization in Hive
 
Cost-based query optimization in Apache Hive
Cost-based query optimization in Apache HiveCost-based query optimization in Apache Hive
Cost-based query optimization in Apache Hive
 
Grails patterns and practices
Grails patterns and practicesGrails patterns and practices
Grails patterns and practices
 
MongoDB World 2019: Scaling Real-time Collaboration with MongoDB
MongoDB World 2019: Scaling Real-time Collaboration with MongoDBMongoDB World 2019: Scaling Real-time Collaboration with MongoDB
MongoDB World 2019: Scaling Real-time Collaboration with MongoDB
 
Boosting the Performance of your Rails Apps
Boosting the Performance of your Rails AppsBoosting the Performance of your Rails Apps
Boosting the Performance of your Rails Apps
 
Improving the Quality of Existing Software
Improving the Quality of Existing SoftwareImproving the Quality of Existing Software
Improving the Quality of Existing Software
 

More from Yann Cluchey

Implementing Keyword Sort with Elasticsearch
Implementing Keyword Sort with ElasticsearchImplementing Keyword Sort with Elasticsearch
Implementing Keyword Sort with ElasticsearchYann Cluchey
 
Annotated Text feature in Elasticsearch
Annotated Text feature in ElasticsearchAnnotated Text feature in Elasticsearch
Annotated Text feature in ElasticsearchYann Cluchey
 
Machine Learning and the Elastic Stack
Machine Learning and the Elastic StackMachine Learning and the Elastic Stack
Machine Learning and the Elastic StackYann Cluchey
 
Elasticsearch at AffiliateWindow
Elasticsearch at AffiliateWindowElasticsearch at AffiliateWindow
Elasticsearch at AffiliateWindowYann Cluchey
 
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...Yann Cluchey
 
Lightning talk: elasticsearch at Cogenta
Lightning talk: elasticsearch at CogentaLightning talk: elasticsearch at Cogenta
Lightning talk: elasticsearch at CogentaYann Cluchey
 

More from Yann Cluchey (6)

Implementing Keyword Sort with Elasticsearch
Implementing Keyword Sort with ElasticsearchImplementing Keyword Sort with Elasticsearch
Implementing Keyword Sort with Elasticsearch
 
Annotated Text feature in Elasticsearch
Annotated Text feature in ElasticsearchAnnotated Text feature in Elasticsearch
Annotated Text feature in Elasticsearch
 
Machine Learning and the Elastic Stack
Machine Learning and the Elastic StackMachine Learning and the Elastic Stack
Machine Learning and the Elastic Stack
 
Elasticsearch at AffiliateWindow
Elasticsearch at AffiliateWindowElasticsearch at AffiliateWindow
Elasticsearch at AffiliateWindow
 
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
 
Lightning talk: elasticsearch at Cogenta
Lightning talk: elasticsearch at CogentaLightning talk: elasticsearch at Cogenta
Lightning talk: elasticsearch at Cogenta
 

Recently uploaded

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdfChristopherTHyatt
 

Recently uploaded (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 

Concurrency Patterns with MongoDB

  • 2. • Real-time retail intelligence • Gather products and prices from web • MongoDB in production • Millions of updates per day, 3K/s peak • Data in SQL, Mongo, ElasticSearch
  • 3. Concurrency Patterns: Why? • MongoDB: atomic updates, no transactions • Need to ensure consistency & correctness • What are my options with Mongo? • Shortcuts • Different approaches
  • 4. Concurrency Control Strategies • Pessimistic • Suited for frequent conflicts • http://bit.ly/two-phase-commits • Optimistic • Efficient when conflicts are rare • http://bit.ly/isolate-sequence • Multi-version • All versions stored, client resolves conflict • e.g. CouchDb
  • 5. Optimistic Concurrency Control (OCC) • No locks • Prevent dirty writes • Uses timestamp or a revision number • Client checks & replays transaction
  • 6. Example Original { _id: 23, comment: “The quick brown fox…” } Edit 1 { _id: 23, comment: “The quick brown fox prefers SQL” } Edit 2 { _id: 23, comment: “The quick brown fox prefers MongoDB” }
  • 7. Example Edit 1 db.comments.update( { _id: 23 }, { _id: 23, comment: “The quick brown fox prefers SQL” }) Edit 2 db.comments.update( { _id: 23 }, { _id: 23, comment: “The quick brown fox prefers MongoDB” }) Outcome: One update is lost, other might be wrong
  • 8. OCC Example Original { _id: 23, rev: 1, comment: “The quick brown fox…” } Update a specific revision (edit 1) db.comments.update( { _id: 23, rev: 1 }, { _id: 23, rev: 2, comment: “The quick brown fox prefers SQL” })
  • 9. OCC Example Edit 2 db.comments.update( { _id: 23, rev: 1 }, { _id: 23, rev: 2, comment: “The quick brown fox prefers MongoDB” }) ..fails { updatedExisting: false, n: 0, err: null, ok: 1 } • Caveat: Only works if all clients follow convention
  • 10. Update Operators in Mongo • Avoid full document replacement by using operators • Powerful operators such as $inc, $set, $push • Many operators can be grouped into single atomic update • More efficient (data over wire, parsing, etc.) • Use as much as possible • http://bit.ly/update-operators
  • 11. Still Need OCC? A hit counter { _id: 1, hits: 5040 } Edit 1 db.stats.update({ _id: 1 }, { $set: { hits: 5045 } }) Edit 2 db.stats.update({ _id: 1 }, { $set: { hits: 5055 } })
  • 12. Still Need OCC? Edit 1 db.stats.update({ _id: 1 }, { $inc: { hits: 5 } }) Edit 2 db.stats.update({ _id: 1 }, { $inc: { hits: 10 } }) • Sequence of updates might vary • Outcome always the same • But what if sequence is important?
  • 13. Still Need OCC? • Operators can offset need for concurrency control • Support for complex atomic manipulation • Depends on use case • You’ll need it for • Opaque changes (e.g. text) • Complex update logic in app domain (e.g. changing a value affects some calculated fields) • Sequence is important and can’t be inferred
  • 14. Update Commands • Update • Specify query to match one or more documents • Use { multi: true } to update multiple documents • Must call Find() separately if you want a copy of the doc • FindAndModify • Update single document only • Find + Update in single hit (atomic) • Returns the doc before or after update • Whole doc or subset • Upsert (update or insert) • Important feature. Works with OCC..?
  • 15. Consistent Update Example • Have a customer document • Want to set the LastOrderValue and return the previous value db.customers.findAndModify({ query: { _id: 16, rev: 45 }, update: { $set: { lastOrderValue: 699 }, $inc: { rev: 1 } }, new: false })
  • 16. Consistent Update Example • Customer has since been updated, or doesn’t exist • Client should replay null • Intended version of customer successfully updated • Original version is returned { _id: 16, rev: 45, lastOrderValue: 145 } • Useful if client has got partial information and needs the full document • A separate Find() could introduce inconsistency
  • 17. Independent Update with Upsert • Keep stats about customers • Want to increment NumOrders and return new total • Customer document might not be there • Independent operation still needs protection db.customerStats.findAndModify({ query: { _id: 16 }, update: { $inc: { numOrders: 1, rev: 1 }, $setOnInsert: { name: “Yann” } }, new: true, upsert: true })
  • 18. Independent Update with Upsert • First run, document is created { _id: 16, numOrders: 1, rev: 1, name: “Yann” } • Second run, document is updated { _id: 16, numOrders: 2, rev: 2, name: “Yann” }
  • 19. Subdocuments • Common scenario • e.g. Customer and Orders in single document • Clients like having everything • Powerful operators for matching and updating subdocuments • $elemMatch, $, $addToSet, $push • Alternatives to “Fat” documents; • Client-side joins • Aggregation • MapReduce
  • 20. Currency Control and Subdocuments • Patterns described here still work, but might be impractical • Docs are large • More collisions • Solve with scale?
  • 21. Subdocument Example • Customer document contains orders • Want to independently update orders • Correct order #471 value to £260 { _id: 16, rev: 20, name: “Yann”, orders: { “471”: { id: 471, value: 250, rev: 4 } } }
  • 22. Subdocument Example db.customers.findAndModify({ query: { “orders.471.rev”: { $lte: 4 } }, update: { $set: { “orders.471.value”: 260 }, $inc: { rev: 1, “orders.471.rev”: 1 }, $setOnInsert: { name: “Yann”, “orders.471.id”: 471 } }, new: true, upsert: true })
  • 23. Subdocument Example • First run, order updated successfully • Could create if not exists { _id: 16, rev: 21, name: “Yann”, orders: { “471”: { id: 471, value: 260, rev: 5 } } }
  • 24. Subdocument Example • Second conflicting run • Query didn’t match revision, duplicate document created { _id: ObjectId("533bf88a50dbb55a8a9b9128"), rev: 1, name: “Yann”, orders: { “471”: { id: 471, value: 260, rev: 1 } } }
  • 25. Subdocument Example • Solve with unique index (good idea anyway) db.customers.ensureIndex( { "orders.id" : 1 }, { "name" : "orderids", "unique" : true })
  • 26. Subdocument Example Client can handle findAndModify result accordingly; • Successful update { updatedExisting: true } • New document created { updatedExisting: false, n: 1 } • Conflict, need to replay { errmsg: “exception: E11000 duplicate key error index: db.customers.$orderids dup key…” }
  • 27. Final Words • Don’t forget deletes • Gotchas about subdocument structure orders: [ { id: 471 }, … ] orders: { “471”: { }, … } orders: { “471”: { id: 471 }, … } • Coming in 2.6.0 stable $setOnInsert: { _id: .. } • Sharding..?

Editor's Notes

  1. 4xSQL, 1xMongo, 5xElastic
  2. No transactions.Different databases have different features.Cheating is fun. How can we avoid problems entirely.Efficiency is key. Understand what’s achievable in a single update
  3. All doable in MongoDBOptimistic usually the best option for typical mongodb projectHow to roll yo
  4. Two users simultaneously editing
  5. Not excitingly clever, or efficient..?
  6. Not excitingly clever, or efficient..?
  7. Not excitingly clever, or efficient..?
  8. They are your friends, go play wit
  9. Two users simultaneously editing
  10. Bank balance!
  11. Not going to get in to that…
  12. Not going to get in to that…
  13. Data is mastered elsewhere