SlideShare a Scribd company logo
1 of 21
HowStuffWorks version Cassandra SriSatish Ambati engineer, DataStax @srisatish
Bigtable, 2006 Dynamo, 2007 OSS, 2008 Incubator, 2009 TLP, 2010
Digital Reasoning: NLP + entity analytics OpenWave: enterprise messaging OpenX: largest publisher-side ad network in the world  Cloudkick: performance data & aggregation SimpleGEO: location-as-API Ooyala: video analytics and business intelligence ngmoco: massively multiplayer game worlds Cassandra in production
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Tuneable reads
Read Internals  @r39132 - #netflixcloud
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Client Marshal Arts ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Adding Nodes  ,[object Object],[object Object],[object Object],[object Object],[object Object]
Cassandra on EC2 cloud
Cassandra on EC2 cloud *Corey Hulen, EC2
inter-node comm ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Compactions K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > -- -- -- Sorted K2 < Serialized data > K10 < Serialized data > K30 < Serialized data > -- -- -- Sorted K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > -- -- -- Sorted MERGE  SORT Loaded in memory K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > K30 < Serialized data > Sorted K1  Offset K5  Offset K30  Offset Bloom Filter Index File Data File
Compactions K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > -- -- -- Sorted K2 < Serialized data > K10 < Serialized data > K30 < Serialized data > -- -- -- Sorted K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > -- -- -- Sorted MERGE  SORT Loaded in memory K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > K30 < Serialized data > Sorted K1  Offset K5  Offset K30  Offset Bloom Filter Index File Data File D E L E T E D
 
A L T W F P Y Key “C” U Availability in Action
A L T W F P Y Key “C” U X hint Availability in Action
JMX
OpsCenter
OpsCenter
OpsCenter

More Related Content

What's hot

Apache Spark: the next big thing? - StampedeCon 2014
Apache Spark: the next big thing? - StampedeCon 2014Apache Spark: the next big thing? - StampedeCon 2014
Apache Spark: the next big thing? - StampedeCon 2014StampedeCon
 
Flare and TensorFlare: Native Compilation for Spark and TensorFlow Pipelines ...
Flare and TensorFlare: Native Compilation for Spark and TensorFlow Pipelines ...Flare and TensorFlare: Native Compilation for Spark and TensorFlow Pipelines ...
Flare and TensorFlare: Native Compilation for Spark and TensorFlow Pipelines ...Databricks
 
SRv6 Mobile User Plane P4 proto-type
SRv6 Mobile User Plane P4 proto-typeSRv6 Mobile User Plane P4 proto-type
SRv6 Mobile User Plane P4 proto-typeKentaro Ebisawa
 
NeuLion IBC2017 展示概要
NeuLion IBC2017 展示概要NeuLion IBC2017 展示概要
NeuLion IBC2017 展示概要Taro Hagiya
 
[Perforce] Tasks - The Holy Hand Grenade of Branching
[Perforce] Tasks - The Holy Hand Grenade of Branching[Perforce] Tasks - The Holy Hand Grenade of Branching
[Perforce] Tasks - The Holy Hand Grenade of BranchingPerforce
 
Introducing TiDB [Delivered: 09/25/18 at Portland Cloud Native Meetup]
Introducing TiDB [Delivered: 09/25/18 at Portland Cloud Native Meetup]Introducing TiDB [Delivered: 09/25/18 at Portland Cloud Native Meetup]
Introducing TiDB [Delivered: 09/25/18 at Portland Cloud Native Meetup]Kevin Xu
 
Using GTP on Linux with libgtpnl
Using GTP on Linux with libgtpnlUsing GTP on Linux with libgtpnl
Using GTP on Linux with libgtpnlKentaro Ebisawa
 
A Primer on Blockchains and RSK
A Primer on Blockchains and RSKA Primer on Blockchains and RSK
A Primer on Blockchains and RSKGinoOsahon1
 
Proving Security Protocols Correct
Proving Security Protocols CorrectProving Security Protocols Correct
Proving Security Protocols CorrectLawrence Paulson
 
Linking Metrics to Logs using Loki
Linking Metrics to Logs using LokiLinking Metrics to Logs using Loki
Linking Metrics to Logs using LokiKnoldus Inc.
 
HPCC Systems - ECL for Programmers - Big Data - Data Scientist
HPCC Systems - ECL for Programmers - Big Data - Data ScientistHPCC Systems - ECL for Programmers - Big Data - Data Scientist
HPCC Systems - ECL for Programmers - Big Data - Data ScientistFujio Turner
 
How a Particle Accelerator Monitors Scientific Experiments Using InfluxDB
How a Particle Accelerator Monitors Scientific Experiments Using InfluxDBHow a Particle Accelerator Monitors Scientific Experiments Using InfluxDB
How a Particle Accelerator Monitors Scientific Experiments Using InfluxDBInfluxData
 
Summarization and Abstraction using deep learning
Summarization and Abstraction using deep learningSummarization and Abstraction using deep learning
Summarization and Abstraction using deep learningSK Reddy
 
Loki - like prometheus, but for logs
Loki - like prometheus, but for logsLoki - like prometheus, but for logs
Loki - like prometheus, but for logsJuraj Hantak
 
Understanding Hadoop Clusters and the Network
Understanding Hadoop Clusters and the NetworkUnderstanding Hadoop Clusters and the Network
Understanding Hadoop Clusters and the Networkbradhedlund
 
The magic of machine translation 20 july 2017
The magic of machine translation 20 july 2017The magic of machine translation 20 july 2017
The magic of machine translation 20 july 2017SK Reddy
 
Improving Authenticated Dynamic Dictionaries, with Application to Cryptocurre...
Improving Authenticated Dynamic Dictionaries, with Application to Cryptocurre...Improving Authenticated Dynamic Dictionaries, with Application to Cryptocurre...
Improving Authenticated Dynamic Dictionaries, with Application to Cryptocurre...Dmitry Meshkov
 
High performance networking with HTTP/2 and OkHTTP 3
High performance networking with HTTP/2 and OkHTTP 3High performance networking with HTTP/2 and OkHTTP 3
High performance networking with HTTP/2 and OkHTTP 3Siddharth Mathur
 

What's hot (20)

Apache Spark: the next big thing? - StampedeCon 2014
Apache Spark: the next big thing? - StampedeCon 2014Apache Spark: the next big thing? - StampedeCon 2014
Apache Spark: the next big thing? - StampedeCon 2014
 
Flare and TensorFlare: Native Compilation for Spark and TensorFlow Pipelines ...
Flare and TensorFlare: Native Compilation for Spark and TensorFlow Pipelines ...Flare and TensorFlare: Native Compilation for Spark and TensorFlow Pipelines ...
Flare and TensorFlare: Native Compilation for Spark and TensorFlow Pipelines ...
 
SRv6 Mobile User Plane P4 proto-type
SRv6 Mobile User Plane P4 proto-typeSRv6 Mobile User Plane P4 proto-type
SRv6 Mobile User Plane P4 proto-type
 
NeuLion IBC2017 展示概要
NeuLion IBC2017 展示概要NeuLion IBC2017 展示概要
NeuLion IBC2017 展示概要
 
[Perforce] Tasks - The Holy Hand Grenade of Branching
[Perforce] Tasks - The Holy Hand Grenade of Branching[Perforce] Tasks - The Holy Hand Grenade of Branching
[Perforce] Tasks - The Holy Hand Grenade of Branching
 
Introducing TiDB [Delivered: 09/25/18 at Portland Cloud Native Meetup]
Introducing TiDB [Delivered: 09/25/18 at Portland Cloud Native Meetup]Introducing TiDB [Delivered: 09/25/18 at Portland Cloud Native Meetup]
Introducing TiDB [Delivered: 09/25/18 at Portland Cloud Native Meetup]
 
Using GTP on Linux with libgtpnl
Using GTP on Linux with libgtpnlUsing GTP on Linux with libgtpnl
Using GTP on Linux with libgtpnl
 
A Primer on Blockchains and RSK
A Primer on Blockchains and RSKA Primer on Blockchains and RSK
A Primer on Blockchains and RSK
 
Proving Security Protocols Correct
Proving Security Protocols CorrectProving Security Protocols Correct
Proving Security Protocols Correct
 
Linking Metrics to Logs using Loki
Linking Metrics to Logs using LokiLinking Metrics to Logs using Loki
Linking Metrics to Logs using Loki
 
HPCC Systems - ECL for Programmers - Big Data - Data Scientist
HPCC Systems - ECL for Programmers - Big Data - Data ScientistHPCC Systems - ECL for Programmers - Big Data - Data Scientist
HPCC Systems - ECL for Programmers - Big Data - Data Scientist
 
How a Particle Accelerator Monitors Scientific Experiments Using InfluxDB
How a Particle Accelerator Monitors Scientific Experiments Using InfluxDBHow a Particle Accelerator Monitors Scientific Experiments Using InfluxDB
How a Particle Accelerator Monitors Scientific Experiments Using InfluxDB
 
IJEIR_1615
IJEIR_1615IJEIR_1615
IJEIR_1615
 
Summarization and Abstraction using deep learning
Summarization and Abstraction using deep learningSummarization and Abstraction using deep learning
Summarization and Abstraction using deep learning
 
SPARQLstream and Morph-streams
SPARQLstream and Morph-streamsSPARQLstream and Morph-streams
SPARQLstream and Morph-streams
 
Loki - like prometheus, but for logs
Loki - like prometheus, but for logsLoki - like prometheus, but for logs
Loki - like prometheus, but for logs
 
Understanding Hadoop Clusters and the Network
Understanding Hadoop Clusters and the NetworkUnderstanding Hadoop Clusters and the Network
Understanding Hadoop Clusters and the Network
 
The magic of machine translation 20 july 2017
The magic of machine translation 20 july 2017The magic of machine translation 20 july 2017
The magic of machine translation 20 july 2017
 
Improving Authenticated Dynamic Dictionaries, with Application to Cryptocurre...
Improving Authenticated Dynamic Dictionaries, with Application to Cryptocurre...Improving Authenticated Dynamic Dictionaries, with Application to Cryptocurre...
Improving Authenticated Dynamic Dictionaries, with Application to Cryptocurre...
 
High performance networking with HTTP/2 and OkHTTP 3
High performance networking with HTTP/2 and OkHTTP 3High performance networking with HTTP/2 and OkHTTP 3
High performance networking with HTTP/2 and OkHTTP 3
 

Viewers also liked

Orcid charleston presentation 110410
Orcid charleston presentation 110410Orcid charleston presentation 110410
Orcid charleston presentation 110410DKochalko
 
Metrics lightning talk
Metrics lightning talkMetrics lightning talk
Metrics lightning talkChris Lohfink
 
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)srisatish ambati
 
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015gdusbabek
 
Getting to Know the Cassandra Codebase
Getting to Know the Cassandra CodebaseGetting to Know the Cassandra Codebase
Getting to Know the Cassandra Codebasegdusbabek
 

Viewers also liked (6)

Orcid charleston presentation 110410
Orcid charleston presentation 110410Orcid charleston presentation 110410
Orcid charleston presentation 110410
 
Metrics lightning talk
Metrics lightning talkMetrics lightning talk
Metrics lightning talk
 
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)
 
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
 
Cassandra
CassandraCassandra
Cassandra
 
Getting to Know the Cassandra Codebase
Getting to Know the Cassandra CodebaseGetting to Know the Cassandra Codebase
Getting to Know the Cassandra Codebase
 

Similar to Svccg nosql 2011_sri-cassandra

Data Presentations Cassandra Sigmod
Data  Presentations  Cassandra SigmodData  Presentations  Cassandra Sigmod
Data Presentations Cassandra SigmodJeff Hammerbacher
 
Introduction to Programmable Networks by Clarence Anslem, Intel
Introduction to Programmable Networks by Clarence Anslem, IntelIntroduction to Programmable Networks by Clarence Anslem, Intel
Introduction to Programmable Networks by Clarence Anslem, IntelMyNOG
 
Spark Streaming Early Warning Use Case
Spark Streaming Early Warning Use CaseSpark Streaming Early Warning Use Case
Spark Streaming Early Warning Use Caserandom_chance
 
Synapse 2018 Guarding against failure in a hundred step pipeline
Synapse 2018 Guarding against failure in a hundred step pipelineSynapse 2018 Guarding against failure in a hundred step pipeline
Synapse 2018 Guarding against failure in a hundred step pipelineCalvin French-Owen
 
Bogdan Kecman INIT Presentation
Bogdan Kecman INIT PresentationBogdan Kecman INIT Presentation
Bogdan Kecman INIT Presentationarhismece
 
Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftLi Gao
 
Netflix's Transition to High-Availability Storage (QCon SF 2010)
Netflix's Transition to High-Availability Storage (QCon SF 2010)Netflix's Transition to High-Availability Storage (QCon SF 2010)
Netflix's Transition to High-Availability Storage (QCon SF 2010)Sid Anand
 
In-Memory Key Value Store (KVS) in FPGA for Ultra Low Latency and High Throug...
In-Memory Key Value Store (KVS) in FPGA for Ultra Low Latency and High Throug...In-Memory Key Value Store (KVS) in FPGA for Ultra Low Latency and High Throug...
In-Memory Key Value Store (KVS) in FPGA for Ultra Low Latency and High Throug...Tom Diederich
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5Steen Larsen
 
Dissecting Open Source Cloud Evolution: An OpenStack Case Study
Dissecting Open Source Cloud Evolution: An OpenStack Case StudyDissecting Open Source Cloud Evolution: An OpenStack Case Study
Dissecting Open Source Cloud Evolution: An OpenStack Case StudySalman Baset
 
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Databricks
 
HPBigData2015 PSTL kafka spark vertica
HPBigData2015 PSTL kafka spark verticaHPBigData2015 PSTL kafka spark vertica
HPBigData2015 PSTL kafka spark verticaJack Gudenkauf
 
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...arnaudsoullie
 
Scaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at LyftScaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at LyftDatabricks
 
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
DPDK summit 2015: It's kind of fun  to do the impossible with DPDKDPDK summit 2015: It's kind of fun  to do the impossible with DPDK
DPDK summit 2015: It's kind of fun to do the impossible with DPDKLagopus SDN/OpenFlow switch
 
DPDK Summit 2015 - NTT - Yoshihiro Nakajima
DPDK Summit 2015 - NTT - Yoshihiro NakajimaDPDK Summit 2015 - NTT - Yoshihiro Nakajima
DPDK Summit 2015 - NTT - Yoshihiro NakajimaJim St. Leger
 
DevConf 2014 Kernel Networking Walkthrough
DevConf 2014   Kernel Networking WalkthroughDevConf 2014   Kernel Networking Walkthrough
DevConf 2014 Kernel Networking WalkthroughThomas Graf
 

Similar to Svccg nosql 2011_sri-cassandra (20)

Data Presentations Cassandra Sigmod
Data  Presentations  Cassandra SigmodData  Presentations  Cassandra Sigmod
Data Presentations Cassandra Sigmod
 
Introduction to Programmable Networks by Clarence Anslem, Intel
Introduction to Programmable Networks by Clarence Anslem, IntelIntroduction to Programmable Networks by Clarence Anslem, Intel
Introduction to Programmable Networks by Clarence Anslem, Intel
 
Spark Streaming Early Warning Use Case
Spark Streaming Early Warning Use CaseSpark Streaming Early Warning Use Case
Spark Streaming Early Warning Use Case
 
Synapse 2018 Guarding against failure in a hundred step pipeline
Synapse 2018 Guarding against failure in a hundred step pipelineSynapse 2018 Guarding against failure in a hundred step pipeline
Synapse 2018 Guarding against failure in a hundred step pipeline
 
Bogdan Kecman INIT Presentation
Bogdan Kecman INIT PresentationBogdan Kecman INIT Presentation
Bogdan Kecman INIT Presentation
 
Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at Lyft
 
No[1][1]
No[1][1]No[1][1]
No[1][1]
 
Netflix's Transition to High-Availability Storage (QCon SF 2010)
Netflix's Transition to High-Availability Storage (QCon SF 2010)Netflix's Transition to High-Availability Storage (QCon SF 2010)
Netflix's Transition to High-Availability Storage (QCon SF 2010)
 
In-Memory Key Value Store (KVS) in FPGA for Ultra Low Latency and High Throug...
In-Memory Key Value Store (KVS) in FPGA for Ultra Low Latency and High Throug...In-Memory Key Value Store (KVS) in FPGA for Ultra Low Latency and High Throug...
In-Memory Key Value Store (KVS) in FPGA for Ultra Low Latency and High Throug...
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5
 
Dissecting Open Source Cloud Evolution: An OpenStack Case Study
Dissecting Open Source Cloud Evolution: An OpenStack Case StudyDissecting Open Source Cloud Evolution: An OpenStack Case Study
Dissecting Open Source Cloud Evolution: An OpenStack Case Study
 
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
 
HPBigData2015 PSTL kafka spark vertica
HPBigData2015 PSTL kafka spark verticaHPBigData2015 PSTL kafka spark vertica
HPBigData2015 PSTL kafka spark vertica
 
Spark on Yarn @ Netflix
Spark on Yarn @ NetflixSpark on Yarn @ Netflix
Spark on Yarn @ Netflix
 
Producing Spark on YARN for ETL
Producing Spark on YARN for ETLProducing Spark on YARN for ETL
Producing Spark on YARN for ETL
 
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...
 
Scaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at LyftScaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at Lyft
 
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
DPDK summit 2015: It's kind of fun  to do the impossible with DPDKDPDK summit 2015: It's kind of fun  to do the impossible with DPDK
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
 
DPDK Summit 2015 - NTT - Yoshihiro Nakajima
DPDK Summit 2015 - NTT - Yoshihiro NakajimaDPDK Summit 2015 - NTT - Yoshihiro Nakajima
DPDK Summit 2015 - NTT - Yoshihiro Nakajima
 
DevConf 2014 Kernel Networking Walkthrough
DevConf 2014   Kernel Networking WalkthroughDevConf 2014   Kernel Networking Walkthrough
DevConf 2014 Kernel Networking Walkthrough
 

More from srisatish ambati

H2O Open Dallas 2016 keynote for Business Transformation
H2O Open Dallas 2016 keynote for Business TransformationH2O Open Dallas 2016 keynote for Business Transformation
H2O Open Dallas 2016 keynote for Business Transformationsrisatish ambati
 
Digital Transformation with AI and Data - H2O.ai and Open Source
Digital Transformation with AI and Data - H2O.ai and Open SourceDigital Transformation with AI and Data - H2O.ai and Open Source
Digital Transformation with AI and Data - H2O.ai and Open Sourcesrisatish ambati
 
Top 10 Performance Gotchas for scaling in-memory Algorithms.
Top 10 Performance Gotchas for scaling in-memory Algorithms.Top 10 Performance Gotchas for scaling in-memory Algorithms.
Top 10 Performance Gotchas for scaling in-memory Algorithms.srisatish ambati
 
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoopJava one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoopsrisatish ambati
 
High order bits from cassandra & hadoop
High order bits from cassandra & hadoopHigh order bits from cassandra & hadoop
High order bits from cassandra & hadoopsrisatish ambati
 
High order bits from cassandra & hadoop
High order bits from cassandra & hadoopHigh order bits from cassandra & hadoop
High order bits from cassandra & hadoopsrisatish ambati
 
Brisk hadoop june2011_sfjava
Brisk hadoop june2011_sfjavaBrisk hadoop june2011_sfjava
Brisk hadoop june2011_sfjavasrisatish ambati
 
Cacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svccCacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svccsrisatish ambati
 
Cache is King ( Or How To Stop Worrying And Start Caching in Java) at Chicago...
Cache is King ( Or How To Stop Worrying And Start Caching in Java) at Chicago...Cache is King ( Or How To Stop Worrying And Start Caching in Java) at Chicago...
Cache is King ( Or How To Stop Worrying And Start Caching in Java) at Chicago...srisatish ambati
 
How to Stop Worrying and Start Caching in Java
How to Stop Worrying and Start Caching in JavaHow to Stop Worrying and Start Caching in Java
How to Stop Worrying and Start Caching in Javasrisatish ambati
 
JavaOne 2010: Top 10 Causes for Java Issues in Production and What to Do When...
JavaOne 2010: Top 10 Causes for Java Issues in Production and What to Do When...JavaOne 2010: Top 10 Causes for Java Issues in Production and What to Do When...
JavaOne 2010: Top 10 Causes for Java Issues in Production and What to Do When...srisatish ambati
 

More from srisatish ambati (15)

H2O Open Dallas 2016 keynote for Business Transformation
H2O Open Dallas 2016 keynote for Business TransformationH2O Open Dallas 2016 keynote for Business Transformation
H2O Open Dallas 2016 keynote for Business Transformation
 
Digital Transformation with AI and Data - H2O.ai and Open Source
Digital Transformation with AI and Data - H2O.ai and Open SourceDigital Transformation with AI and Data - H2O.ai and Open Source
Digital Transformation with AI and Data - H2O.ai and Open Source
 
Top 10 Performance Gotchas for scaling in-memory Algorithms.
Top 10 Performance Gotchas for scaling in-memory Algorithms.Top 10 Performance Gotchas for scaling in-memory Algorithms.
Top 10 Performance Gotchas for scaling in-memory Algorithms.
 
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoopJava one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop
 
High order bits from cassandra & hadoop
High order bits from cassandra & hadoopHigh order bits from cassandra & hadoop
High order bits from cassandra & hadoop
 
High order bits from cassandra & hadoop
High order bits from cassandra & hadoopHigh order bits from cassandra & hadoop
High order bits from cassandra & hadoop
 
Cassandra at no_sql
Cassandra at no_sqlCassandra at no_sql
Cassandra at no_sql
 
Brisk hadoop june2011_sfjava
Brisk hadoop june2011_sfjavaBrisk hadoop june2011_sfjava
Brisk hadoop june2011_sfjava
 
Brisk hadoop june2011
Brisk hadoop june2011Brisk hadoop june2011
Brisk hadoop june2011
 
Cacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svccCacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svcc
 
Jvm goes big_data_sfjava
Jvm goes big_data_sfjavaJvm goes big_data_sfjava
Jvm goes big_data_sfjava
 
jvm goes to big data
jvm goes to big datajvm goes to big data
jvm goes to big data
 
Cache is King ( Or How To Stop Worrying And Start Caching in Java) at Chicago...
Cache is King ( Or How To Stop Worrying And Start Caching in Java) at Chicago...Cache is King ( Or How To Stop Worrying And Start Caching in Java) at Chicago...
Cache is King ( Or How To Stop Worrying And Start Caching in Java) at Chicago...
 
How to Stop Worrying and Start Caching in Java
How to Stop Worrying and Start Caching in JavaHow to Stop Worrying and Start Caching in Java
How to Stop Worrying and Start Caching in Java
 
JavaOne 2010: Top 10 Causes for Java Issues in Production and What to Do When...
JavaOne 2010: Top 10 Causes for Java Issues in Production and What to Do When...JavaOne 2010: Top 10 Causes for Java Issues in Production and What to Do When...
JavaOne 2010: Top 10 Causes for Java Issues in Production and What to Do When...
 

Recently uploaded

Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 

Recently uploaded (20)

Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 

Svccg nosql 2011_sri-cassandra

  • 1. HowStuffWorks version Cassandra SriSatish Ambati engineer, DataStax @srisatish
  • 2. Bigtable, 2006 Dynamo, 2007 OSS, 2008 Incubator, 2009 TLP, 2010
  • 3. Digital Reasoning: NLP + entity analytics OpenWave: enterprise messaging OpenX: largest publisher-side ad network in the world Cloudkick: performance data & aggregation SimpleGEO: location-as-API Ooyala: video analytics and business intelligence ngmoco: massively multiplayer game worlds Cassandra in production
  • 4.
  • 6. Read Internals @r39132 - #netflixcloud
  • 7.
  • 8.
  • 9.
  • 11. Cassandra on EC2 cloud *Corey Hulen, EC2
  • 12.
  • 13. Compactions K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > -- -- -- Sorted K2 < Serialized data > K10 < Serialized data > K30 < Serialized data > -- -- -- Sorted K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > -- -- -- Sorted MERGE SORT Loaded in memory K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > K30 < Serialized data > Sorted K1 Offset K5 Offset K30 Offset Bloom Filter Index File Data File
  • 14. Compactions K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > -- -- -- Sorted K2 < Serialized data > K10 < Serialized data > K30 < Serialized data > -- -- -- Sorted K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > -- -- -- Sorted MERGE SORT Loaded in memory K1 < Serialized data > K2 < Serialized data > K3 < Serialized data > K4 < Serialized data > K5 < Serialized data > K10 < Serialized data > K30 < Serialized data > Sorted K1 Offset K5 Offset K30 Offset Bloom Filter Index File Data File D E L E T E D
  • 15.  
  • 16. A L T W F P Y Key “C” U Availability in Action
  • 17. A L T W F P Y Key “C” U X hint Availability in Action
  • 18. JMX

Editor's Notes

  1. Typical write operation involves a write into a commit log for durability and recoverability and an update into an in-memory data structure. The write into the in-memory data structure is performed only after a successful write into the commit log. We have a dedicated disk on each machine for the commit log since all writes into the commit log are sequential and so we can maximize disk throughput. When the in-memory data structure crosses a certain threshold, calculated based on data size and number of objects, it dumps itself to disk. This write is performed on one of many commodity disks that machines are equipped with. All writes are sequential to disk and also generate an index for ecient lookup based on row key. These indices are also persisted along with the data le. Over time many such les could exist on disk and a merge process runs in the background to collate the different les into one le. This process is very similar to the compaction process that happens in the Bigtable system
  2. “ A typical read operation rst queries the in-memory data structure before looking into the les on disk. The files are looked at in the order of newest to oldest. When a disk lookup occurs we could be looking up a key in multiple les on disk. In order to prevent lookups into les that do not contain the key, a bloom lter, summarizing the keys in the le, is also stored in each data le and also kept in memory. This bloom lter is rst consulted to check if the key being looked up does indeed exist in the given le. A key in a column family could have many columns. Some special indexing is required to retrieve columns which are further away from the key. In order to prevent scanning of every column on disk we maintain column indices which allow us to jump to the right chunk on disk for column retrieval. As the columns for a given key are being serialized and written out to disk we generate indices at every 256K chunk boundary. This boundary is congurable, but we have found 256K to work well for us in our production workloads.”
  3. ConsistencyLevel