January 20, 2015
Sean Anderson
Manager, Data Services
@seanandersonBD
Making choices:
What kind of relationship are you seeking
with your database?
RACKSPACE® HOSTING | WWW.RACKSPACE.COM
What are we going to talk about today?
• Databases are complicated tools
• There are numerous choices
– How did we get here?
• Understanding some of our choices
– SQL: Relational
– MongoDB: Documents
– Redis: Key-value
– Hadoop: Large distributed files
• How should I think about managing them?
Common advice these days from smart people
Let’s take a step back
Databases are not simple, single purpose tools
The relationship with your database can be complicated
It’s complicated
How did we get here?
App Development is Changing
Traditional apps (CRM, HR, Finance apps) vs. Modern apps (mobile, social, media, games)
• Infrastructure: custom-built for the app vs. programmable by the app
• Data trend: mostly resides on premises vs. mostly resides in the cloud
Applications are becoming systems of engagement
Traditional apps (CRM, HR, Finance apps) are systems of record:
• Highly structured
• Slow to change
• Transactional
• Stable
• Core to the business
• Not very social
• Data trend: mostly resides on premises
Modern apps (mobile, social, media, games) are systems of engagement:
• Loosely structured
• Quick to adapt
• Conversational
• Dynamic and in flux
• Edge of the business
• Fundamentally social
• Data trend: mostly resides in the cloud
MEDIA GAMING M2M MOBILE SOCIAL
SOME UNIQUE SCENARIOS
Cloud scale and fast growth
High speed data retrieval needs
Frequently written, rarely read
Binary files
Short term data
Multi-location access
Zero downtime needs
Dynamic or object oriented models
Trying to avoid RAID / storage limits
Large files
We are building different kinds of applications
Source: “15 Years of Hard Drive History: Capacities outran performance” (November 27, 2006)
http://www.tomshardware.com/reviews/15-years-of-hard-drive-history,1368-6.html
In the 15 year period before 2006, storage density increased 10,000x,
but performance only increased about 100x
As a result, a revolution ensued in the world of Data Services
Polyglot persistence is here to stay: there are more than 150 choices just in the “NoSQL” subset
Two key issues
How do you ensure
best fit for your app?
What is the long term
view of your relationship
with your database?
Get to know your choices well
• Crash course!
Understand the personality of your database
Let’s use these examples
• Relational (SQL): data integrity, SQL
• Documents (MongoDB): flexible schema, scale
• Key-value (Redis): fast retrieval, data structures
• Distributed large sets (Hadoop): distributed processing, Big Data
Relational databases (SQL)
They literally saved the world from running on paper
Strengths
• Data integrity through data types and semantic rules
• AGE >= 0
• Person must have a NAME
• Querying
• Aggregation
• SQL
“Weaknesses”
• Development is more complex because the developer must map the relational model to object-oriented code
• Complexity grows quickly as the relational model grows
• Difficult to scale
• Expensive (hardware, software)
If your operation depends on the integrity
of your business rules, the relational
model rules.
Scaling is a little difficult and
performance is key.
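The two integrity rules named on this slide (AGE >= 0, a person must have a NAME) can be pushed into the schema so the database itself rejects bad rows. What follows is only a minimal sketch using Python's built-in sqlite3 module; the table name `person`, its columns, and the helper names are assumptions made for illustration and do not come from the deck.

```python
import sqlite3


def make_person_table() -> sqlite3.Connection:
    """Create an in-memory table whose schema carries the two slide rules."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE person (
            id   INTEGER PRIMARY KEY,
            name TEXT    NOT NULL,                  -- a person must have a NAME
            age  INTEGER NOT NULL CHECK (age >= 0)  -- AGE >= 0
        )
        """
    )
    return conn


def try_insert(conn: sqlite3.Connection, name, age) -> bool:
    """Return True if the row was accepted, False if a rule rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO person (name, age) VALUES (?, ?)", (name, age)
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_person_table()
accepted = [
    try_insert(conn, "Ada", 36),  # valid row
    try_insert(conn, "Bob", -5),  # violates CHECK (age >= 0)
    try_insert(conn, None, 40),   # violates NOT NULL on name
]
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(accepted, row_count)
```

Only the valid row is stored. The application code never has to re-check the rules, which is the point the slide makes about integrity living in the relational model.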
• Allow new data without a defined schema
• Designed for scale
• Faster, agile development
• Databases in the cloud!
The complexities of relational databases led to NoSQL
Document Databases
vs.
{
  _id: ObjectId("4c4ba5e5e8aabf3"),
  car_make: "Volkswagen",
  model: "Rabbit",
  tires: [
    { type: "driver front", brand: "Michelin" },
    { type: "driver rear", brand: "Michelin" },
    { type: "passenger front", brand: "Michelin" },
    { type: "passenger rear", brand: "Michelin" }
  ]
}
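To make the "vs." concrete, the sketch below stores the same car twice: once as a single nested record, and once split across two relational tables that have to be joined back together. It is an illustration only. The table names `car` and `tire` and the column layout are assumptions, and a plain Python dict stands in for the document.

```python
import sqlite3

# Document form: the car and its tires travel together as one nested record.
car_document = {
    "car_make": "Volkswagen",
    "model": "Rabbit",
    "tires": [
        {"type": "driver front", "brand": "Michelin"},
        {"type": "driver rear", "brand": "Michelin"},
        {"type": "passenger front", "brand": "Michelin"},
        {"type": "passenger rear", "brand": "Michelin"},
    ],
}

# Relational form: the same facts split across two tables linked by a key.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE car  (id INTEGER PRIMARY KEY, car_make TEXT, model TEXT);
    CREATE TABLE tire (id INTEGER PRIMARY KEY,
                       car_id INTEGER REFERENCES car(id),
                       type TEXT, brand TEXT);
    """
)
car_id = conn.execute(
    "INSERT INTO car (car_make, model) VALUES (?, ?)",
    (car_document["car_make"], car_document["model"]),
).lastrowid
conn.executemany(
    "INSERT INTO tire (car_id, type, brand) VALUES (?, ?, ?)",
    [(car_id, t["type"], t["brand"]) for t in car_document["tires"]],
)

# Reading the car back needs a join in the relational form...
joined = conn.execute(
    "SELECT c.model, t.type, t.brand "
    "FROM car c JOIN tire t ON t.car_id = c.id ORDER BY t.id"
).fetchall()

# ...and a plain nested lookup in the document form.
from_document = [
    (car_document["model"], t["type"], t["brand"])
    for t in car_document["tires"]
]
print(joined == from_document)
```

Both forms hold the same information. The difference is where the structure lives: in the schema and the join for the relational form, in the record itself for the document form.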
• Leading NoSQL database
• Open Source
• Agility and flexibility (no set schema)
• Better fit to modern development methodologies
• New types of records (fields) are added easily
• Imagine it like a folder you add pages to
MongoDB has emerged as a leader in Document databases
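// Insert one document into the "friends" collection
db.friends.insert(
  {
    name: "J.R.",
    email: "email@rackspace.com",
    twitter_handle: "jrarredondo",
    teams: [ "Mariners", "Rangers" ],
    group: 1
  }
)
// Index the "group" field (ascending); createIndex replaces the older ensureIndex helper
db.friends.createIndex( { group: 1 } )
// Query with an operator: every friend whose group is greater than 0
var myCursor = db.friends.find( { group: { $gt: 0 } } )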
• Document databases and collections
• Indexes
• Rich query language
• Replication (transparent to the app)
– Writes to primary ensure consistency
– Configurable reads to secondaries to help performance
– Eventual consistency on secondary reads
– Election on failures of primary nodes
– Configurable write concerns for flexible write guarantees
depending on app needs
• Shards for horizontal scaling
– Shard Key used to partition data based on ranges or hashes
– Partition strategy depends on how evenly you want data
distributed, and the nature of your queries (single vs. ranges)
MongoDB
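The trade-off between range-based and hash-based shard keys can be seen in a toy model. The sketch below is not MongoDB's partitioning code. The shard count, the split points, and the use of MD5 as the hash are all assumptions chosen only to show the behavior described in the bullets above.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size


def range_shard(key: int, split_points=(250, 500, 750)) -> int:
    """Range partitioning: neighbouring keys land on the same shard."""
    return bisect_right(split_points, key)


def hash_shard(key: int) -> int:
    """Hash partitioning: a stable hash scatters neighbouring keys."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# A monotonically increasing key (for example an order number), recent values.
recent_keys = range(900, 1000)

range_load = Counter(range_shard(k) for k in recent_keys)
hash_load = Counter(hash_shard(k) for k in recent_keys)

print("range:", dict(range_load))  # every recent key sits on one shard
print("hash: ", dict(hash_load))   # recent keys are spread across shards
```

With range partitioning, a query for keys 900 to 999 touches a single shard, but all new writes also pile onto that shard. With hashing, writes spread out, and the same range query has to visit several shards. That is the sense in which the strategy depends on both the desired distribution and the nature of the queries.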
Flexibility of data model (and its problems) with document databases
Appboy: App marketing automation platform for mobile apps
Courtesy of Jon Hyman, CIO and Co-Founder of Appboy
Sometimes you combine databases...
MySQL and MongoDB together
• Heavily used during weekends and at
night
• Complex SQL queries
• “What are my friends drinking?”
• “Where can I find this beer?”
A social discovery and sharing
network for beer drinkers
MySQL and MongoDB together
What works best for the workflow?
- MySQL worked best for
reference data for us
- Not everything moved to
MongoDB
What stayed in MySQL?
Check-ins
Users
Relationships Data
Primary Datastore
What moved to MongoDB?
Activity Feed (Friend’s Graph)
Recommendation Data
Location-based Check-ins
• Think about it as a single huge hash table
• Simple concepts
– GET / SET / DELETE <data> based on some <key>
• High performance, in memory
• Persistence
– Point-in-time Snapshots
– Append only / Journal
• Partitioning
– Redis Cluster (future)
– Proxy-based solutions such as Twemproxy
Key-value stores: Redis
Key Value
<key> <value>
<key> <value>
<key> <value>
<key> <value>
• Volatile keys: automatic expiration of keys
– SET <key> <value> EX <seconds>
– SETEX <key> <seconds> <value>
• Data structures
– LISTS, SETS / SORTED SETS, HASHES
• Publish / Subscribe
– SUBSCRIBE <channel>
– PUBLISH <channel> <message>
• Transactions (*)
– MULTI
• Commands to be executed as a single, atomic isolated operation
– EXEC / DISCARD
– (*) Warning: VERY different behaviors than in SQL
• Eviction policies
– Useful to implement Least Recently Used caches
Key-value stores: Redis
http://robots.thoughtbot.com/redis-pub-sub-how-does-it-work
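The "single huge hash table" idea and volatile keys can be modeled in a few lines. This is a toy in-memory store written for illustration, not Redis and not a Redis client. The class name `ExpiringStore`, its methods, and the fake clock are all assumptions; only the `SET <key> <value> EX <seconds>` behavior it imitates comes from the slide.

```python
import time


class ExpiringStore:
    """Toy in-memory key-value store with per-key expiry (not Redis itself)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ex=None):
        """Store a value; ex is a time-to-live in seconds, like SET ... EX."""
        expires_at = None if ex is None else self._clock() + ex
        self._data[key] = (value, expires_at)

    def get(self, key):
        """Return the value, or None if the key is missing or has expired."""
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazy expiration on access
            return None
        return value

    def delete(self, key) -> bool:
        """Remove the key; return True if it existed."""
        return self._data.pop(key, None) is not None


# A fake clock makes the example deterministic.
now = [0.0]
store = ExpiringStore(clock=lambda: now[0])
store.set("session:42", "cart=3 items", ex=30)
before = store.get("session:42")  # still inside the 30-second window
now[0] = 31.0
after = store.get("session:42")   # the key has expired
print(before, after)
```

Short-lived data such as sessions fits this pattern well: the application sets a lifetime once and never has to clean up afterwards.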
Cache
Making another application better
Data
Structures
(Example: Leaderboards!)
LISTS
SETS
SORTED SETS
HASHES
Redis Scenarios
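The leaderboard scenario maps naturally onto a sorted set: unique members, each with a score, read back in rank order. The sketch below is a pure-Python model of that idea, not Redis. The class and method names are assumptions, and the comments name the Redis commands each method is loosely comparable to.

```python
class Leaderboard:
    """Toy model of a sorted set: unique members ordered by score."""

    def __init__(self):
        self._scores = {}

    def add(self, member, score):
        """Set a member's score (comparable to ZADD)."""
        self._scores[member] = score

    def increment(self, member, amount):
        """Add to a member's score (comparable to ZINCRBY)."""
        self._scores[member] = self._scores.get(member, 0) + amount
        return self._scores[member]

    def top(self, n):
        """Highest scores first (comparable to ZREVRANGE ... WITHSCORES)."""
        ranked = sorted(self._scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


board = Leaderboard()
board.add("ana", 120)
board.add("raj", 95)
board.add("mei", 150)
board.increment("raj", 40)  # raj now has 135
top_two = board.top(2)
print(top_two)
```

This model sorts on every read, which is fine for a sketch. A real sorted set keeps its members ordered as they are written, so reading the top of the board stays cheap even with many players.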
• Full text search based on Apache Lucene
• Runs alongside MongoDB, Hadoop, MySQL, and many other databases
• Allows for quick full text search of your data set
• Highly available by default
• Optimized hardware
Elasticsearch
• MongoDB Users
• Hadoop Users
• JSON formatted Databases
• Users with Large Data Sets
Any current ObjectRocket MongoDB or Rackspace Cloud Big Data customer will be able to connect to Elasticsearch through a simple, documented tool.
Who might want to use Elasticsearch?
Klout
If you’re one of the many household names using Klout to create campaigns that target social influencers, you’re using Elasticsearch. Elasticsearch made it possible for Klout to provide their forthcoming self-service option to their customers, which Klout predicts will allow them to at least double their current revenues.

XING
XING is the leading business social network in Europe, with half its users located in Germany and the other half throughout the rest of Europe, Asia and Australia. XING has called their relationship with Elasticsearch a strategic partnership, far beyond a simple customer and service provider relationship. We’ve forged these deep ties with our customer by enabling XING to keep their users’ updates flowing in real-time.

GitHub
Elasticsearch empowers GitHub’s 4 million ‘social coders’ through providing search across GitHub’s 8 million+ code repositories. The GitHub team also makes use of Elasticsearch to monitor for abuse using some fairly clever logging hacks.
“Big Data”: generating insights with Hadoop
V3C: Volume, Velocity, Variety, Complexity
• Mining social data for sentiment
• Analyzing web clickstreams
• Analyzing log data for security breaches
• Telemetry from sensors and machines
• eCommerce predictive analytics
Fundamentals of Hadoop v1
Layers: Core Services, Data Services, Operational Services
• HDFS: distributed file system
• MapReduce: data processing framework
• HBase: distributed, scalable, non-relational database
• HCatalog: metadata and table management system
• Pig: data flow scripting language
• Hive: DW analysis layer through HiveQL (SQL-like) queries
• Flume: log data aggregation and movement
• Sqoop: bulk data transfer from and to relational DB
• Ambari: installation, monitoring, administration
• Oozie: workflow and job scheduling
• Zookeeper: configuration, sync and naming registry
• Falcon: data pipeline framework
• Knox: auth and access
MapReduce
[Diagram: an algorithm is sent to large, distributed files; parallel MAP tasks produce partial answers, and a REDUCE step combines them into the answer]
It’s more efficient to send the algorithm to the data than to move the data to the algorithm.
Simple example: how many times does each word appear in all files?
mapper (filename, file-contents):
    for each word in file-contents:
        emit (word, 1)
reducer (word, values):
    sum = 0
    for each value in values:
        sum = sum + value
    emit (word, sum)
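The pseudocode above can be run on a single machine to see the three phases at work. The sketch below is plain Python, not Hadoop: the `shuffle` step stands in for the grouping the framework performs between map and reduce, and the file names and contents are made up for the example.

```python
from collections import defaultdict


def mapper(filename, contents):
    """Emit (word, 1) for every word in one file."""
    for word in contents.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(word, values):
    """Sum the partial counts for one word."""
    return word, sum(values)


# Hypothetical input: two small "files".
files = {
    "a.txt": "the quick brown fox",
    "b.txt": "the lazy dog and the fox",
}

mapped = [pair for name, text in files.items() for pair in mapper(name, text)]
counts = dict(reducer(word, values) for word, values in shuffle(mapped).items())
print(counts)
```

Each call to `mapper` depends on one file only, so on a cluster the map calls can run where the files are stored. Only the small (word, count) pairs travel across the network to the reducers.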
Beyond MapReduce / batch with Hadoop 2.0
Source: Hortonworks
Other ideas
Really understand the personality of your database
First impressions can be deceiving
“Redis is ‘just a cache’”
• SET
• GET
Redis is a server for data structures
• Strings
• Hashes
• Lists
• Sets / Sorted Sets
• Publish / Subscribe
Huge difference!
Focus on the tradeoffs
SQL
• Data integrity
• Business rules
• Consistency
• Transaction isolation
• Atomicity
• ...and rigidity
NoSQL
• Flexibility of schema
• Dynamic data models
• Horizontal scale
• Easier to get started
• ...and inconsistency of data
Understand the personality of your database
Let’s use these examples
• Relational (SQL): customer contact; reference data; order details (ship to, bill to, SKU, quantity, price); billing transactions; inventory; prices; member info (user, pwd); customer relationships
• Documents (MongoDB): notes / social; partitions (shards); promotional materials; dynamic schemas; statements; product catalog, images; product configuration; personalized catalog; member comments; product reviews; product Q&As
• Key-value (Redis): session info; cart; recent orders; home page info; latest comments; recommendations; product “stars”; upsell / cross-sell
• Distributed large sets (Hadoop): customer attributes (non-personally identifiable information, geo); sales history; churn info; price history; social info; comments / “NPS”; recommendations; all kinds of analysis
It’s good to understand the fundamental “theory”
What does your problem really need?
ACID
• Atomicity: A transaction either happens
completely, or not at all
– No partial transactions
• Consistency: Transactions end in a “valid” state
– No violation of rules
• Isolation: Transaction appears as if it is the only
thing happening to the database
– Relaxed most times
– Deals with phantom, dirty reads or non repeatable reads
• Durability: Committed transactions are permanent
– Even after failure
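Atomicity and consistency are easiest to see when a transaction fails halfway. The sketch below uses Python's built-in sqlite3 module as a stand-in for any relational database. The `account` table, the names, and the amounts are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # the "valid state" rule
)
with conn:
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 20)]
    )


def transfer(conn, src, dst, amount) -> bool:
    """Move money in one transaction; both updates apply or neither does."""
    try:
        with conn:  # commit on success, roll back on any exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "alice", "bob", 30)       # fits within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # would make alice negative
final = balances(conn)
print(ok, failed, final)
```

In the failed transfer the credit to bob has already been applied when the debit violates the rule. The rollback undoes it, so no partial transaction is ever visible.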
BASE
• Basically available:
– Supporting partial failures without complete system
failure
– Design as if users would end up in different partitions
• Soft state:
– Things can be in flux for a little bit of time
• Eventual consistency:
– Things right themselves
http://queue.acm.org/detail.cfm?id=1394128
New ways of thinking:
Do customers really need to know the level of
inventory of a product to place an order? Maybe
all they want is to know that it is not zero
Know your CAP, really
Consistency, Availability and Partition Tolerance
You can only have 2 out of 3 in CAP!
• Partitions are not generally common
• Choosing Consistency or Availability is not final
• “It depends”
– Maybe on user
– Maybe on system
– Maybe on type of data
• Just think:
– How am I going to detect a problem in the network? (P)
– How am I going to limit operations once I detect that?
– How am I going to compensate to recover?
Wait! It’s not that simple
Hurst 2010 (http://blog.nahurst.com/visual-guide-to-nosql-systems)
Eric Brewer 2012 (http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed)
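The three "just think" questions (detect, limit, compensate) and the idea that the choice "depends on the type of data" can be sketched as a per-data-type policy. This is a toy model under stated assumptions, not how any particular database behaves: the `Replica` class, the `POLICY` table, and the data kinds are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    CONSISTENT = "refuse writes while partitioned"
    AVAILABLE = "accept writes and reconcile later"


# Hypothetical per-data-type choices: "it depends" on the kind of data.
POLICY = {
    "billing": Mode.CONSISTENT,
    "cart": Mode.AVAILABLE,
}


class Replica:
    def __init__(self):
        self.partitioned = False  # set by whatever detects the network problem
        self.data = {}
        self.pending = []         # writes to compensate for after recovery

    def write(self, kind, key, value) -> bool:
        """Apply a write, limiting operations according to the policy."""
        if self.partitioned and POLICY[kind] is Mode.CONSISTENT:
            return False  # limit: reject rather than risk divergence
        self.data[key] = value
        if self.partitioned:
            self.pending.append((kind, key, value))  # remember for later
        return True

    def heal(self):
        """Partition over: hand back the writes that still need reconciling."""
        self.partitioned = False
        to_reconcile, self.pending = self.pending, []
        return to_reconcile


replica = Replica()
replica.partitioned = True  # detection is assumed to have happened elsewhere
billing_ok = replica.write("billing", "invoice:1", 99)
cart_ok = replica.write("cart", "cart:7", ["sku-1"])
to_reconcile = replica.heal()
print(billing_ok, cart_ok, to_reconcile)
```

During the partition the billing write is refused while the cart write is accepted and queued. The same system therefore leans toward consistency for one kind of data and toward availability for another.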
• Stability
• Fit for core scenarios
• Configurability to different scenarios
• Integration with development languages
• Integration with other databases
• SQL compatibility
• End user vs. Developer skillset
• Conceptual changes
• Platform availability
• Data type and semantic needs
• Security
The “ilities” and their cousins
These are some of the challenges indirectly related to data that we must deal with
• Performance
• Scalability
• Consistency
• Resiliency
• Data model
• Flexibility
• Cost
• Training
• Tools availability
• Development experience
Our vision is Data as a Service
From databases to data as a service
Two key issues
How do you ensure
best fit for your app?
What is the long term
view of your relationship
with your database?
Data-as-a-Service: more time building,
less time managing databases
Four levels of DaaS transparency
Source: “Choosing The Right Cloud Provider” (December 5, 2013)
http://www.rackspace.com/blog/choosing-the-right-cloud-provider-for-your-mongodb-database/
• For some businesses, database or infrastructure
management IS the core of the business
• For most software-based businesses, database or
infrastructure management represents time and
resources not spent building the application
• You must answer for yourself: are you in the
business of managing infrastructure, or in the
business of [your market here]?
More time
spent
building
the app
From Database-as-a-Service to Data-as-a-Service
Focus on building your app, not managing databases
Manage hardware infrastructure
Manage software infrastructure
(i.e. databases)
Build your application
(e.g. game, startup, mobile app, site)
YOU WANT TO BE
FOCUSED HERE
This is the only job that YOU MUST
DO without anybody’s help because
this is your intellectual property
YOU DON’T WANT TO HAVE
TO MANAGE DATABASES
OR SERVERS
It only takes away from time
building your application
Highest value activity for your application
Data
as a service
The next vision for databases: Data-as-a-Service
Applications just access the data as a service, while the database is transparent
The app just
interacts with
THE DATA
The application does not see the
infrastructure
Towards transparent databases
hostname, port number
Build your application
(e.g. game, startup, mobile app, site)
YOU WANT TO BE
FOCUSED HERE
This is the only job that YOU MUST
DO without anybody’s help because
this is your intellectual property
Highest value activity for your application
Public Cloud
Managed
Cloud
Your Private
Cloud on
prem
Private
Cloud
Data has mass and gravity: you need choices for your hybrid app
(Or: “Divorces are expensive”)
Rackspace Offerings for the Data Tier
Infrastructure
For Data
Managed
Offerings of Most
Popular
Big Data, SQL, &
NoSQL Databases
Managed
Database Services
for Production
Apps
Cloud IaaS
Get started fast
Dedicated Hosting
Predictable costs &
performance
OnMetal
Cloud Elasticity & Dedicated
Performance
• Automatic DBA: Sharding, Backup, & HA
• Entire Stack Optimized on Bare Metal
• Supported 24x7x365 by experts
• More than MongoDB
• Architecture & Design
• Tuning & Monitoring
• 24 x 7 x 365 Support
• Cost Effective
DBA Services
RACKSPACE® HOSTING | 5000 WALZEM ROAD | SAN ANTONIO, TX 78218
US SALES: 1-800-961-2888 | US SUPPORT: 1-800-961-4454 | WWW.RACKSPACE.COM
RACKSPACE® HOSTING | © RACKSPACE US, INC. | RACKSPACE® AND FANATICAL SUPPORT® ARE SERVICE MARKS OF RACKSPACE US, INC. REGISTERED IN THE UNITED STATES AND OTHER COUNTRIES. | WWW.RACKSPACE.COM
Let us know how we can help you
@seanandersonBD

 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 

Recently uploaded (20)

Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 

What Kind of Relationship Are You Seeking With Your Database?

  • 1. Making choices: What kind of relationship are you seeking with your database? Sean Anderson, Manager, Data Services (@seanandersonBD). January 20, 2015.
  • 2. RACKSPACE® HOSTING | WWW.RACKSPACE.COM
    What are we going to talk about today?
    • Databases are complicated tools
    • There are numerous choices
      – How did we get here?
    • Understanding some of our choices
      – SQL: Relational
      – MongoDB: Documents
      – Redis: Key-value
      – Hadoop: Large distributed files
    • How should I think about managing them?
  • 3. Common advice these days from smart people
  • 4. Let’s take a step back
  • 5. Databases are not simple, single-purpose tools
  • 6. The relationship with your database can be complicated (“It’s complicated”)
  • 7. How did we get here?
  • 8. App Development is Changing
    Trend: from traditional apps (CRM, HR, finance apps) to modern apps (mobile, social, media, games)
    – Infrastructure: custom-built for the app → programmable by the app
    – Data: mostly resides on premises → mostly resides in the cloud
  • 9. Applications are becoming systems of engagement
    Trend: from traditional apps (CRM, HR, finance apps) to modern apps (mobile, social, media, games)
    – Characteristics of traditional apps (systems of record): highly structured, slow to change, transactional, stable, core to the business, not very social
    – Characteristics of modern apps (systems of engagement): loosely structured, quick to adapt, conversational, dynamic and in flux, edge of the business, fundamentally social
    – Data: mostly resides on premises → mostly resides in the cloud
  • 10. We are building different kinds of applications
    Media, gaming, M2M, mobile, social
    Some unique scenarios:
    • Cloud scale and fast growth
    • High-speed data retrieval needs
    • Frequently written, rarely read
    • Binary files
    • Short-term data
    • Multi-location access
    • Zero-downtime needs
    • Dynamic or object-oriented models
    • Trying to avoid RAID / storage limits
    • Large files
  • 11. In the 15-year period before 2006, storage density increased 10,000x, but performance only increased about 100x. Source: “15 Years of Hard Drive History: Capacities outran performance” (November 27, 2006), http://www.tomshardware.com/reviews/15-years-of-hard-drive-history,1368-6.html
  • 12. As a result, a revolution ensued in the world of Data Services. Polyglot persistence is here to stay: there are 150+ choices just in the “NoSQL” subset.
  • 13. Two key issues: How do you ensure the best fit for your app? What is the long-term view of your relationship with your database?
  • 14. Get to know your choices well • Crash course!
  • 15. Understand the personality of your database. Let’s use these examples:
    • Relational (SQL): data integrity, SQL
    • Documents (MongoDB): flexible schema, scale
    • Key-value (Redis): fast retrieval, data structures
    • Distributed large sets (Hadoop): distributed processing, big data
  • 16. Relational databases (SQL)
    They literally saved the world from running on paper.
    Strengths
    • Data integrity through data types and semantic rules
      – AGE >= 0
      – Person must have a NAME
    • Querying
    • Aggregation
    • SQL
    “Weaknesses”
    • Complex development, as the developer needs to map the relational model to object-oriented code
    • Complexity grows exponentially as the relational model grows
    • Difficult to scale
    • Expensive (hardware, software)
    If your operation depends on the integrity of your business rules, the relational model rules. Scaling is a little difficult and performance is key.
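The two integrity rules on this slide (AGE >= 0, a person must have a NAME) can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not something from the deck: the `person` table and the helper names `make_db` and `try_insert` are assumptions made up for the example.

```python
import sqlite3


def make_db():
    """Create an in-memory database whose schema encodes the two rules."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person ("
        " name TEXT NOT NULL,"               # a person must have a NAME
        " age INTEGER CHECK (age >= 0))"     # AGE >= 0
    )
    return conn


def try_insert(conn, name, age):
    """Return True if the row is accepted, False if a rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO person (name, age) VALUES (?, ?)", (name, age)
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_db()
print(try_insert(conn, "Ada", 36))   # accepted
print(try_insert(conn, "Bob", -1))   # rejected by the CHECK constraint
print(try_insert(conn, None, 20))    # rejected by NOT NULL
```

The point of the sketch is that the database itself, not the application code, refuses the invalid rows.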
  • 17. The complexities of relational databases led to NoSQL
    • Allow new data without a defined schema
    • Designed for scale
    • Faster, agile development
    • Databases in the cloud!
  • 18. Document Databases vs.
      {
        _id: ObjectId("4c4ba5e5e8aabf3"),
        car_make: "Volkswagen",
        model: "Rabbit",
        tires: [
          { type: "driver front", brand: "Michelin" },
          { type: "driver rear", brand: "Michelin" },
          { type: "passenger front", brand: "Michelin" },
          { type: "passenger rear", brand: "Michelin" }
        ]
      }
  • 19. MongoDB has emerged as a leader in document databases
    • Leading NoSQL database
    • Open source
    • Agility and flexibility (no set schema)
    • Better fit to modern development methodologies
    • New types of records (fields) are added easily
    • Imagine it like a folder you add pages to
  • 20. MongoDB
      db.friends.insert({
        name: "J.R.",
        email: "email@rackspace.com",
        twitter_handle: "jrarredondo",
        teams: ["Mariners", "Rangers"],
        group: 1
      })
      db.friends.ensureIndex({ group: 1 })
      var myCursor = db.friends.find({ group: { $gt: 0 } })
    • Document databases and collections
    • Indexes
    • Rich query language
    • Replication (transparent to the app)
      – Writes to primary ensure consistency
      – Configurable reads to secondaries to help performance
      – Eventual consistency on secondary reads
      – Election on failures of primary nodes
      – Configurable write concerns for flexible write guarantees depending on app needs
    • Shards for horizontal scaling
      – Shard key used to partition data based on ranges or hashes
      – Partition strategy depends on how evenly you want data distributed, and the nature of your queries (single vs. ranges)
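The range-versus-hash choice for a shard key can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not MongoDB's actual implementation: the function names, the split points and the use of SHA-256 are all made up for the example.

```python
import hashlib
from bisect import bisect_right


def range_shard(key, boundaries):
    """Range partitioning: `boundaries` is a sorted list of split points.

    Keys below boundaries[0] go to shard 0, keys in
    [boundaries[0], boundaries[1]) go to shard 1, and so on.
    Neighbouring keys land on the same shard, which suits range queries.
    """
    return bisect_right(boundaries, key)


def hash_shard(key, num_shards):
    """Hash partitioning: spread keys evenly, at the cost of range locality."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Hypothetical split points for a numeric shard key such as `group`.
boundaries = [10, 20]
for key in (3, 10, 19, 42):
    print(key, range_shard(key, boundaries), hash_shard(key, 3))
```

With ranges, a query such as `group` between 10 and 19 touches a single shard; with hashes the same query may touch all of them, but writes on a monotonically increasing key no longer pile up on one shard.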
  • 21. Flexibility of data model (and its problems) with document databases. Appboy: app marketing automation platform for mobile apps. Courtesy of Jon Hyman, CIO and Co-Founder of Appboy.
  • 22. Sometimes you combine databases...
  • 23. MySQL and MongoDB together
    A social discovery and sharing network for beer drinkers
    • Heavily used during weekends and at night
    • Complex SQL queries
    • “What are my friends drinking?”
    • “Where can I find this beer?”
  • 24. MySQL and MongoDB together
    What works best for the workflow?
    – MySQL worked best for reference data for us
    – Not everything moved to MongoDB
    What stayed in MySQL?
    • Check-ins
    • Users
    • Relationships data
    • Primary datastore
    What moved to MongoDB?
    • Activity feed (friend’s graph)
    • Recommendation data
    • Location-based check-ins
  • 25. Key-value stores: Redis
    • Think about it as a single huge hash table (<key> → <value>)
    • Simple concepts
      – GET / SET / DELETE <data> based on some <key>
    • High performance, in memory
    • Persistence
      – Point-in-time snapshots
      – Append only / journal
    • Partitioning
      – Redis Cluster (future)
      – Proxy-based solutions such as Twemproxy
  • 26. Key-value stores: Redis
    • Volatile keys: automatic expiration of keys
      – SET <key> <value> EX <seconds>
      – SETEX <key> <seconds> <value>
    • Data structures
      – LISTS, SETS / SORTED SETS, HASHES
    • Publish / Subscribe (see http://robots.thoughtbot.com/redis-pub-sub-how-does-it-work)
      – SUBSCRIBE <channel>
      – PUBLISH <channel> <message>
    • Transactions (*)
      – MULTI: commands to be executed as a single, atomic, isolated operation
      – EXEC / DISCARD
      – (*) Warning: VERY different behaviors than in SQL
    • Eviction policies
      – Useful to implement Least Recently Used caches
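The "single huge hash table" idea and volatile keys can be sketched together in plain Python. This is a toy in-process model of what `SET <key> <value> EX <seconds>`, `GET` and key deletion do conceptually; the class `TinyKV` and its injectable `clock` parameter are assumptions for illustration and are not part of Redis or any Redis client.

```python
import time


class TinyKV:
    """Toy key-value store with optional per-key expiry (lazy expiration)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so expiry can be tested without sleeping
        self._data = {}       # key -> (value, expires_at or None)

    def set(self, key, value, ex=None):
        """Store value; if `ex` is given, the key expires after `ex` seconds."""
        expires_at = None if ex is None else self._clock() + ex
        self._data[key] = (value, expires_at)

    def get(self, key):
        """Return the value, or None if the key is missing or has expired."""
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]   # drop the expired key on access
            return None
        return value

    def delete(self, key):
        """Remove the key; return True if it existed."""
        return self._data.pop(key, None) is not None


kv = TinyKV()
kv.set("session:42", "logged-in", ex=30)   # volatile key
kv.set("motd", "hello")                    # no expiry
print(kv.get("session:42"), kv.get("motd"))
```

This sketch only expires keys when they are read; a real server also has to reclaim memory for keys that are never read again, which is where the eviction policies on the slide come in.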
  • 27. Redis Scenarios
    • Cache: making another application better
    • Data structures (example: leaderboards!): LISTS, SETS, SORTED SETS, HASHES
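The leaderboard scenario maps naturally onto a sorted set: each member carries a score, and the top N are read back in score order. The following is a plain-Python sketch of that idea, not Redis itself. The class name `Leaderboard` is hypothetical, the comments name the roughly corresponding Redis commands, and ties are broken alphabetically here, which is an assumption of this sketch rather than Redis's ordering.

```python
class Leaderboard:
    """Toy model of a sorted set used as a game leaderboard."""

    def __init__(self):
        self._scores = {}   # member -> score

    def add(self, member, score):
        """Set a member's score (roughly what ZADD does)."""
        self._scores[member] = score

    def incr(self, member, amount):
        """Add to a member's score (roughly what ZINCRBY does)."""
        self._scores[member] = self._scores.get(member, 0) + amount
        return self._scores[member]

    def top(self, n):
        """Return the n highest (member, score) pairs, best first."""
        ranked = sorted(self._scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


board = Leaderboard()
board.add("ana", 120)
board.add("raj", 95)
board.incr("raj", 40)
board.add("lee", 60)
print(board.top(2))
```

A real sorted set keeps members ordered as they are written, so reading the top N does not require re-sorting everything the way this sketch does.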
  • 28. Elasticsearch
    • Full-text search based on Apache Lucene
    • Will run alongside MongoDB, Hadoop, MySQL, and many other databases
    • Allows for quick full-text search of your data set
    • Highly available by default
    • Optimized hardware
  • 29. Who might want to use Elasticsearch?
    • MongoDB users
    • Hadoop users
    • JSON-formatted databases
    • Users with large data sets
    Any current ObjectRocket MongoDB or Rackspace Cloud Big Data customer will be able to connect Elasticsearch through a simple, documented tool.
    Klout: If you’re one of the many household names using Klout to create campaigns that target social influencers, you’re using Elasticsearch. Elasticsearch made it possible for Klout to provide their forthcoming self-service option to their customers, which Klout predicts will allow them to at least double their current revenues.
    XING: XING is the leading business social network in Europe, with half its users located in Germany and the other half throughout the rest of Europe, Asia and Australia. XING has called their relationship with Elasticsearch a strategic partnership, far beyond a simple customer and service provider relationship. We’ve forged these deep ties with our customer by enabling XING to keep their users’ updates flowing in real time.
    GitHub: Elasticsearch empowers GitHub’s 4 million “social coders” by providing search across GitHub’s 8 million+ code repositories. The GitHub team also makes use of Elasticsearch to monitor for abuse using some fairly clever logging hacks.
  • 30. “Big Data”: generating insights with Hadoop
    V3C: Volume, Velocity, Variety, Complexity
    • Mining social data for sentiment
    • Analyzing web clickstreams
    • Analyzing log data for security breaches
    • Telemetry from sensors and machines
    • eCommerce predictive analytics
  • 31. Fundamentals of Hadoop v1
    Data Services / Core Services
    – HDFS: distributed file system
    – HBase: distributed, scalable, non-relational database
    – HCatalog: metadata and table management system
    – Pig: data flow scripting language
    – Hive: DW analysis layer through HiveQL (SQL-like) queries
    – MapReduce: data processing framework
    Operational Services
    – Ambari: installation, monitoring, administration
    – Oozie: workflow and job scheduling
    – Zookeeper: configuration, sync and naming registry
    – Falcon: data pipeline framework
    – Knox: auth and access
    – Flume: log data aggregation and movement
    – Sqoop: bulk data transfer from and to relational DB
  • 32. MapReduce
    It’s more efficient to send the algorithm to the data than to move the data to the algorithm.
    Large, distributed files → MAP (one per file) → partial answers → REDUCE → answer
    Simple example: how many times does each word appear in all files?
      mapper (filename, file-contents):
        for each word in file-contents:
          emit (word, 1)
      reducer (word, values):
        sum = 0
        for each value in values:
          sum = sum + value
        emit (word, sum)
  • 33. Beyond MapReduce / batch with Hadoop 2.0 (Source: Hortonworks)
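The word-count pseudocode above can be run on a single machine as a small Python sketch. The `mapper` and `reducer` mirror the slide; the driver `run_word_count`, which stands in for the shuffle step that groups values by key, and the lower-casing of words are assumptions added for this example. On a real cluster the framework does the grouping and runs the mappers where the data lives.

```python
from collections import defaultdict


def mapper(filename, contents):
    """Emit (word, 1) for every word in one file."""
    for word in contents.split():
        yield word.lower(), 1


def reducer(word, values):
    """Sum the partial counts for one word."""
    return word, sum(values)


def run_word_count(files):
    """Single-process stand-in for map, shuffle (group by key) and reduce."""
    grouped = defaultdict(list)
    for filename, contents in files.items():
        for word, count in mapper(filename, contents):
            grouped[word].append(count)
    return dict(reducer(word, values) for word, values in grouped.items())


files = {
    "a.txt": "to be or not to be",
    "b.txt": "be quick",
}
print(run_word_count(files))
```

Each mapper only needs its own file, which is why the map step can run next to the data and only the small (word, count) pairs have to travel to the reducers.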
  • 35. Really understand the personality of your database
    First impressions can be deceiving: “Redis is ‘just a cache’”
    • SET
    • GET
    Redis is a server for data structures
    • Strings
    • Hashes
    • Lists
    • Sets / Sorted Sets
    • Publish / Subscribe
    Huge difference!
  • 36. Focus on the tradeoffs
    • SQL: data integrity, business rules, consistency, transaction isolation, atomicity, and rigidity
    • NoSQL: flexibility of schema, dynamic data models, horizontal scale, easier to get started, and inconsistency of data
  • 37. Understand the personality of your database. Let’s use these examples:
    • Relational (SQL): customer contact; reference data; order details (Ship To, Bill To, SKU, quantity, price); billing transactions; inventory; prices; member info (user, pwd); customer relationships
    • Documents (MongoDB): notes / social; partitions (shards); promotional materials; dynamic schemas; statements; product catalog, images; product configuration; personalized catalog; member comments; product reviews; product Q&As
    • Key-value (Redis): session info; cart; recent orders; home page info; latest comments; recommendations; product “stars”; upsell / cross-sell
    • Distributed large sets (Hadoop): customer attributes (non-personally identifiable information, geo); sales history; churn info; price history; social info; comments; “NPS”; recommendations; all kinds of analysis
  • 38. It’s good to understand the fundamental “theory”: what does your problem really need?
    ACID
    • Atomicity: a transaction either happens completely, or not at all
      – No partial transactions
    • Consistency: transactions end in a “valid” state
      – No violation of rules
    • Isolation: a transaction appears as if it is the only thing happening to the database
      – Relaxed most times
      – Deals with phantom, dirty or non-repeatable reads
    • Durability: committed transactions are permanent
      – Even after failure
    BASE (http://queue.acm.org/detail.cfm?id=1394128)
    • Basically available:
      – Supporting partial failures without complete system failure
      – Design as if users would end up in different partitions
    • Soft state:
      – Things can be in flux for a little bit of time
    • Eventual consistency:
      – Things right themselves
    New ways of thinking: do customers really need to know the level of inventory of a product to place an order? Maybe all they want is to know that it is not zero.
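Atomicity ("no partial transactions") can be seen in a short sketch with Python's built-in `sqlite3` module. The `account` table, the names and the helper functions are hypothetical and exist only for this example. The transfer deliberately credits the receiver first, so that a failing debit leaves a half-finished change for the database to undo.

```python
import sqlite3


def make_accounts():
    """In-memory database with a rule that balances never go negative."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " id TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; True on success."""
    try:
        with conn:  # commit if both updates succeed, roll back otherwise
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))


conn = make_accounts()
print(transfer(conn, "alice", "bob", 150), balances(conn))  # rejected, nothing changes
print(transfer(conn, "alice", "bob", 40), balances(conn))   # applied as a whole
```

In the rejected transfer, the credit to `bob` has already been applied when the debit violates the rule; the rollback removes it again, so no reader ever sees money that came from nowhere.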
  • 39. Know your CAP, really
    Consistency, Availability and Partition Tolerance: you can only have 2 out of 3 in CAP!
    Wait! It’s not that simple
    • Partitions are not generally common
    • Choosing Consistency or Availability is not final
    • “It depends”
      – Maybe on user
      – Maybe on system
      – Maybe on type of data
    • Just think:
      – How am I going to detect a problem in the network? (P)
      – How am I going to limit operations once I detect that?
      – How am I going to compensate to recover?
    References: Hurst 2010 (http://blog.nahurst.com/visual-guide-to-nosql-systems); Eric Brewer 2012 (http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed)
  • 40. The “ilities” and their cousins
    These are some of the challenges indirectly related to data that we must deal with:
    • Performance
    • Scalability
    • Consistency
    • Resiliency
    • Data model
    • Flexibility
    • Cost
    • Training
    • Tools availability
    • Development experience
    • Stability
    • Fit for core scenarios
    • Configurability to different scenarios
    • Integration with development languages
    • Integration with other databases
    • SQL compatibility
    • End user vs. developer skillset
    • Conceptual changes
    • Platform availability
    • Data type and semantic needs
    • Security
  • 41. Our vision is Data as a Service: from databases to data as a service
  • 42. Two key issues: How do you ensure the best fit for your app? What is the long-term view of your relationship with your database?
  • 43. Data-as-a-Service: more time building, less time managing databases
    Four levels of DaaS transparency (more time spent building the app). Source: “Choosing The Right Cloud Provider” (December 5, 2013), http://www.rackspace.com/blog/choosing-the-right-cloud-provider-for-your-mongodb-database/
    • For some businesses, database or infrastructure management IS the core of the business
    • For most software-based businesses, database or infrastructure management represents time and resources not spent building the application
    • You must answer for yourself: are you in the business of managing infrastructure, or in the business of [your market here]?
  • 44. From Database-as-a-Service to Data-as-a-Service
    Focus on building your app, not managing databases
    • Build your application (i.e. game, startup, mobile app, site): YOU WANT TO BE FOCUSED HERE. This is the highest value activity for your application, and the only job that YOU MUST DO without anybody’s help, because this is your intellectual property.
    • Manage software infrastructure (i.e. databases) and manage hardware infrastructure: YOU DON’T WANT TO HAVE TO MANAGE DATABASES OR SERVERS. It only takes away from time building your application.
  • 45. The next vision for databases: Data-as-a-Service
    Towards transparent databases: applications just access the data as a service, while the database is transparent.
    • The app just interacts with THE DATA (hostname, port number)
    • The application does not see the infrastructure
    • Build your application (i.e. game, startup, mobile app, site): YOU WANT TO BE FOCUSED HERE. This is the highest value activity for your application, and the only job that YOU MUST DO without anybody’s help, because this is your intellectual property.
  • 46. Data has mass and gravity: you need choices for your hybrid app (Or: “Divorces are expensive”)
    Public Cloud, Managed Cloud, Private Cloud, your Private Cloud on prem
  • 47. Rackspace Offerings for the Data Tier
    Infrastructure for Data
    • Cloud IaaS: get started fast
    • Dedicated Hosting: predictable costs & performance
    • OnMetal: cloud elasticity & dedicated performance
    Managed Offerings of Most Popular Big Data, SQL, & NoSQL Databases
    • Automatic DBA: sharding, backup, & HA
    • Entire stack optimized on bare metal
    • Supported 24x7x365 by experts
    • More than MongoDB
    Managed Database Services for Production Apps
    • Architecture & design
    • Tuning & monitoring
    • 24x7x365 support
    • Cost-effective DBA services
  • 48. Let us know how we can help you: @seanandersonBD
    RACKSPACE® HOSTING | 5000 WALZEM ROAD | SAN ANTONIO, TX 78218 | US SALES: 1-800-961-2888 | US SUPPORT: 1-800-961-4454 | WWW.RACKSPACE.COM
    RACKSPACE® HOSTING | © RACKSPACE US, INC. | RACKSPACE® AND FANATICAL SUPPORT® ARE SERVICE MARKS OF RACKSPACE US, INC. REGISTERED IN THE UNITED STATES AND OTHER COUNTRIES.