During this recorded webcast, you will hear from Judith Hurwitz, noted analyst and author of Hybrid Cloud for Dummies, and Bill Schmarzo, EMC Consulting’s CTO for EIMA. You will learn what Big Fast Data is and how your organization can benefit from this transformation in data management.
2. What Is Big Fast Data? The Transition in Data Management
Judith Hurwitz
3. What Is Big Fast Data?
Big Fast Data is the ability to manage a huge volume of disparate data at the right velocity and within the right timeframe.
Characteristics of Big Fast Data:
• Must be verified for accuracy and business context
• Must incorporate a variety of data types, including structured and unstructured data
4. Why Is Big Fast Data Important?
• Businesses need to gain insights from massive amounts of stored data
• Businesses need to be able to make decisions faster to impact outcomes
• Businesses need to find answers without knowing the question in advance
5. What Is The Business Looking For?
1. Ability to gain access to vast amounts of available data from multiple sources
2. Ability to identify anomalies
3. Ability to predict the future
4. Ability to react in real time based on analysis
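The pairing of "identify anomalies" with "react in real time" can be sketched as a simple streaming check. The following is a minimal, hypothetical illustration (not from the webcast): it flags any reading that deviates sharply from a rolling window of recent values, which is where a real-time reaction would be triggered.

```python
from collections import deque

def make_anomaly_detector(window=5, threshold=2.0):
    """Flag values deviating from the rolling mean by more than
    `threshold` rolling standard deviations."""
    history = deque(maxlen=window)

    def check(value):
        is_anomaly = False
        if len(history) == window:
            mean = sum(history) / window
            var = sum((x - mean) ** 2 for x in history) / window
            std = var ** 0.5
            if std > 0 and abs(value - mean) > threshold * std:
                is_anomaly = True  # react in real time here
        history.append(value)
        return is_anomaly

    return check

detect = make_anomaly_detector()
stream = [10, 11, 10, 12, 11, 50, 10]
anomalies = [v for v in stream if detect(v)]  # the spike (50) is flagged
```

The window size and threshold are illustrative knobs; in practice they would be tuned to the business context the slides emphasize.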
6. How Did We Get Here?
• Early online commerce sites and search engines began pushing the boundaries of data management
• Successful companies found ways to monetize huge volumes of customer data to upsell
• The massive data had to be managed efficiently and in the right context
7. Waves Of Data In Context With Usage Patterns
Wave: Relational Database
Examples: System of record
Characteristics: Used for structured, transactional data; strict definitional controls.

Wave: Content Management System
Examples: Claims document management system, web content management
Characteristics: Used with unstructured and semi-structured text; value is derived and context driven.

Wave: Data Warehouse
Examples: Customer and account data warehouse
Characteristics: Used for structured data. A subject-oriented system optimized for querying; integrated, with well-defined parameters, optimized for storage, and focused on timely access to corporate data.

Wave: Complex Event Processing/Streaming
Examples: Monitoring sensor data in real time to determine process changes
Characteristics: Large streams of data focused on managing and analyzing business processes.

Wave: In-Memory Databases
Examples: Used in ecommerce engines to reduce latency and speed transaction processing
Characteristics: Uses main memory to cache data to improve speed. Fast analytical processing that can transform decision making in real time or near real time.

Wave: Hadoop Software Framework
Examples: Used to process massive amounts of highly distributed, disparate data, such as fraud processing and image processing
Characteristics: A non-relational software framework based on Google’s MapReduce framework. It includes a distributed file system and allows very large data files (both structured and unstructured) to be distributed across all nodes of a very large grid of servers.

Wave: NoSQL Databases
Examples: Used in ecommerce to process massive amounts of data in a flexible form
Characteristics: Supports various database models, including graph, object, key-value, and document. Document oriented rather than relying on joins; uses a scale-out model for scalability.
8. How Infrastructure Supports The Reality Of Big Fast Data
• Availability of commodity servers
• Horizontal scaling enabled by virtualization
• Emergence of cloud computing
• Advanced data management, including predictive analytics and big data analysis
9. Making Big Fast Data A Reality
• Create a well-defined business and IT strategy
• Focus on the business problem, such as identifying buying opportunities at the point of engagement or reducing fraud through an early warning system
• Understand the characteristics of your own data that you need to leverage for the future
• Identify the bottlenecks in your current data architecture
• Create a strategy so you can use massive data at the right speed and in the right context to anticipate new opportunities
10. The Elements Of A Data Architecture
• Foundational Data Services – support for relational and in-memory databases, and for structured and unstructured data
• Middleware Services – allow for communication and integration between data sources
• Big Data Analytics – ability to analyze huge volumes of data
• Data Warehousing Capabilities – used to apply analytics to huge volumes of complex data
• Management Services – deliver the right performance levels
• Virtualized Infrastructure – ability to optimize the environment
• Runtime Services – support for mobile computing and other user environments
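As a toy illustration of how the middleware-services layer sits between consumers and the foundational data services, the sketch below (hypothetical class names, not from the webcast) fans one query out across heterogeneous stores behind a common interface:

```python
class RelationalSource:
    """Stand-in for a relational store (foundational data services)."""
    def __init__(self, rows):
        self.rows = rows
    def query(self, **filters):
        return [r for r in self.rows
                if all(r.get(k) == v for k, v in filters.items())]

class DocumentSource:
    """Stand-in for a NoSQL document store."""
    def __init__(self, docs):
        self.docs = docs
    def query(self, **filters):
        return [d for d in self.docs
                if all(d.get(k) == v for k, v in filters.items())]

class Middleware:
    """Middleware services: one query, fanned out to every data source."""
    def __init__(self, *sources):
        self.sources = sources
    def query(self, **filters):
        results = []
        for source in self.sources:
            results.extend(source.query(**filters))
        return results

mw = Middleware(
    RelationalSource([{"customer": "acme", "orders": 3}]),
    DocumentSource([{"customer": "acme", "notes": "prefers email"}]),
)
records = mw.query(customer="acme")  # one record from each source
```

Real middleware adds transformation, security, and transport concerns; the point here is only the integration role the slide assigns to this layer.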
11. The Business Initiative For Big Fast Data
• Capture, transform, and manage huge volumes of information in near real time
• Capture data at the point of creation, and then combine data sources to create context to deliver on the business objective
• Leverage data assets to gain a competitive advantage