Analyze This! Best Practices For Big And Fast Data

Judith Hurwitz, President
Hurwitz & Associates

Bill Schmarzo, CTO
EIM&A Practice, EMC Consulting



© Copyright 2012 EMC Corporation. All rights reserved.
What is Big Fast Data?
The Transition in Data Management
Judith Hurwitz
What Is Big Fast Data?

Big Fast Data is the ability to manage a huge volume of disparate data at the right velocity and within the right timeframe.

Characteristics of Big Fast Data
  • Must be verified based on accuracy and business context
  • Must incorporate a variety of data types, including structured and unstructured data
Why Is Big Fast Data Important?


• Businesses need to gain
  insights from massive
  amounts of stored data
• Businesses need to be able
  to make decisions faster to
  impact outcomes
• Need to find answers
  without asking the question

What Is The Business Looking For?


1. Ability to gain access to
   vast amounts of available
   data from multiple sources
2. Ability to identify anomalies
3. Ability to predict the future
4. Ability to react in real time
   based on analysis

How Did We Get Here?

• Early online commerce sites and search
  engines began pushing boundaries of data
  management
• Successful companies found ways to
  monetize huge volumes of customer data
  to upsell
• This massive volume of data had to be managed
  efficiently and in the right context
Waves Of Data In Context With Usage Patterns

Wave: Relational Database
  Examples: System of record
  Characteristics: Used for structured, transactional data; strict definitional controls.

Wave: Content Management System
  Examples: Claims document management system, Web content management
  Characteristics: Used with unstructured/semi-structured text, derived value, context driven.

Wave: Data Warehouse
  Examples: Customer and account data warehouse
  Characteristics: Used for structured data. Subject oriented system optimized for querying. Integrated, well-defined parameters, optimized for storage, focused on timely access to corporate data.

Wave: Complex Event Processing/Streaming Data
  Examples: Monitoring sensor data in real time to determine process changes
  Characteristics: Large streams of data focused on managing and analyzing business processes.

Wave: In-Memory Databases
  Examples: Used in ecommerce engines to reduce latency and speed transaction processing.
  Characteristics: Uses main memory to cache data to improve speed. Fast analytical processing that can transform decision making in real-time or near real-time.

Wave: Hadoop Software Framework
  Examples: Used to process massive amounts of highly distributed disparate data. Examples include fraud processing, image processing.
  Characteristics: A non-relational software framework based on Google’s MapReduce Framework. It includes a distributed file system based software framework. Allows very large data files (both structured and unstructured data) to be distributed across all nodes of a very large grid of servers.

Wave: NoSQL Databases
  Examples: Designed to process massive amounts of data in a flexible form. Used in ecommerce to process massive amounts of data flexibly.
  Characteristics: Supports various database models including graph, object, key value, and document. Document oriented rather than relying on joins, scale out model for scalability.

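The Hadoop entry above refers to the MapReduce model. As a rough illustration of that model only (this is not Hadoop's API), the sketch below runs the map, shuffle, and reduce phases inside a single Python process. The function names and the sample records are hypothetical and were chosen for this example.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(record: str) -> Iterator[tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group emitted values by key, as a framework would between phases."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(records: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce over all records in one process."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical records standing in for blocks of a distributed file.
sample_records = ["fraud alert fraud", "image alert"]
counts = word_count(sample_records)
print(counts)
```

In a real cluster the map and reduce calls would run on different nodes and the shuffle would move data across the network; here everything is local so the data flow is easy to follow.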
How Infrastructure Supports The Reality Of Big Fast Data

 • Availability of commodity
   servers
 • Horizontal scaling because
   of virtualization
 • Emergence of Cloud
   Computing
 • Advanced data
   management including
   predictive analytics and
   big data analysis
Making Big Fast Data A Reality

• Create a well-defined business and IT strategy
• Focus on the business problem, such as identifying
  buying opportunities at point of engagement or reducing
  fraud through an early warning system
• Understand the characteristics of your own data that you
  need to leverage for the future
• Identify the bottlenecks in your current data architecture
• Create a strategy so you can use massive data at the
  right speed and in the right context to anticipate new
  opportunities
The Elements Of A Data Architecture

•   Foundational Data Services – support for relational, in-memory
    databases, structured and unstructured data
•   Middleware Services – allow for communication and integration
    between data sources
•   Big Data Analytics – ability to analyze huge volumes of data
•   Data Warehousing Capabilities – used to apply analytics to huge
    volumes of complex data
•   Management Services – deliver the right performance levels
•   Virtualized Infrastructure – ability to optimize the environment
•   Runtime Services – support for mobile computing and other user
    environments
The Business Initiative For Big Fast Data

•    Capture, transform, and
     manage huge volumes of
     information in near real time
•    Capture data at the point of
     creation and then combine data
     sources to create context to
     deliver on the business
     objective
•    Leverage data assets to gain a
     competitive advantage


The Business Potential Of Big Fast Data

Bill Schmarzo
CTO, EIM&A Practice
EMC Consulting




Big Fast Data Requires An Architecture For High-velocity Data To Accelerate Operational Execution

[Architecture diagram. Labeled components: Mobile-Enabled Web Clients; Application Performance Manager; App Director Installer; Cloud Application Platform; Application Logic; In-memory Database; Fast Ingest; vFabric Data Director; Postgres; Oracle; Greenplum; Hadoop; Cloud Platform]

Key Architecture Capabilities
  • Scale out compute and storage
  • Distribution: real-time WAN
  • Data Diversity: SQL and NoSQL
  • Mobile enabled
  • In-memory computing
  • In-database analytics
  • Cloud friendly architecture

Big Fast Data Use Cases (Real-time and Right-time)

Use Case                    Description
Algorithmic Stock Trading   Identify risk and pricing nuances in stock trading
Ad Serving                  Serve right ad to right person at the right time
Cyber Security              Flag potential security breach behaviors and situations
Fraud Detection             Identify potential fraud situations at purchase time
High-end Product Failure    Predict high-end product failures (planes, trains, power plants)
Next Best Offers            Recommend products based on current shopping occurrence
Churn Detection             Flag customer behaviors that are indicative of attrition
Medical Treatment           Recommend appropriate medical treatments in urgent situations
Money Laundering            Flag suspicious financial transactions
Claims Adjudication         Approve insurance claims at time of filing
Loan/Insurance Approval     Calculate financial scores and risks to approve loan or policy
Oil & Gas Exploration       Track sensor feeds to identify potential drilling problems

Use Case: Financial Trading And Real-time Operational Analytics

• Develop risk and pricing algorithms against historical data in
  Greenplum Database using analytical methods such as linear
  regression, clustering, etc.
• Serve up analytic results and scores to SQLFire for real-time
  execution

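The two steps on this slide (fit a model against historical data in batch, then publish scores for real-time lookup) can be sketched in plain Python. This is a minimal illustration under stated assumptions: a single-feature least-squares fit stands in for the analytical method, a dict stands in for the in-memory store, and no Greenplum or SQLFire API is used. The feature, symbols, and numbers are hypothetical.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Batch step: hypothetical history of (volatility, observed risk premium).
history_x = [1.0, 2.0, 3.0, 4.0]
history_y = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(history_x, history_y)

# Publish step: precompute scores into an in-memory lookup table.
# A plain dict stands in for the in-memory database here.
current_volatility = {"AAA": 1.5, "BBB": 3.5}  # hypothetical instruments
score_store = {
    symbol: slope * vol + intercept for symbol, vol in current_volatility.items()
}


def score_at_trade_time(symbol: str) -> float:
    """Real-time step: a constant-time lookup with no model fitting."""
    return score_store[symbol]


print(score_at_trade_time("AAA"))
```

The point of the split is that the expensive work (fitting) happens offline, while the trading path only reads a precomputed value.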
Use Case: Retail Location-based Marketing And Next Best Offers

• Develop analytic models on detailed customer loyalty and Point of
  Sale (POS) data to create “next best offer” scores for each customer
• Leverage “right-time” feeds based upon customer geo location to
  deliver most appropriate offers

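One way to read this slide is as a join between precomputed offer scores and a live location feed. The sketch below is an assumption-laden illustration, not the method used in the deck: the customer, offers, scores, store coordinates, and the 2 km default radius are all hypothetical, and distance is computed with the standard haversine formula.

```python
import math


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


# Hypothetical batch output: "next best offer" scores per customer.
offer_scores = {"cust-1": {"coffee": 0.9, "shoes": 0.7, "books": 0.4}}

# Hypothetical locations where each offer can be redeemed.
offer_locations = {
    "coffee": (40.7580, -73.9855),
    "shoes": (40.7484, -73.9857),
    "books": (34.0522, -118.2437),
}


def next_best_offer(customer_id: str, lat: float, lon: float, radius_km: float = 2.0):
    """Return the highest-scoring offer redeemable within radius_km, or None."""
    nearby = [
        (score, offer)
        for offer, score in offer_scores[customer_id].items()
        if haversine_km(lat, lon, *offer_locations[offer]) <= radius_km
    ]
    return max(nearby)[1] if nearby else None


# A location event arrives for cust-1 near the hypothetical shoe store.
print(next_best_offer("cust-1", 40.7484, -73.9857))
```

Shrinking the radius changes the answer for the same customer, which is the "right-time" aspect: the scores are fixed by the batch model, but the offer delivered depends on where the customer is at that moment.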
Use Case: Healthcare And Readmission Score At Initial Admission

Out of 1000 patients, 1124 admissions expected within next 12 months

• Admissions increase with the level of cholesterol
• Admissions decrease with the Max Heart Rate
• Cholesterol and Max Heart Rate uncorrelated

• Score patient at point of admission for the probability of
  readmission based upon patient history and current health factors
• Create custom treatment and monitoring programs for high-risk
  patients

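The headline figure works out to 1124 / 1000 = 1.124 expected admissions per patient over 12 months. The slide does not give the model behind it, so the sketch below is only an illustration of how a per-patient score could respect the two stated directions (up with cholesterol, down with max heart rate). It assumes a log-linear rate; the coefficients and reference values are hypothetical and were picked only so the signs match the slide. It returns an expected admission count rather than the readmission probability the slide mentions.

```python
import math

# From the slide: 1124 expected admissions among 1000 patients in 12 months.
BASE_RATE = 1124 / 1000

# Hypothetical coefficients and reference values, NOT from the source.
BETA_CHOLESTEROL = 0.004       # positive: admissions rise with cholesterol
BETA_MAX_HEART_RATE = -0.010   # negative: admissions fall with max heart rate
REF_CHOLESTEROL = 200.0
REF_MAX_HEART_RATE = 150.0


def expected_admissions(cholesterol: float, max_heart_rate: float) -> float:
    """Log-linear rate: base rate scaled by each factor's deviation from its reference."""
    log_rate = (
        math.log(BASE_RATE)
        + BETA_CHOLESTEROL * (cholesterol - REF_CHOLESTEROL)
        + BETA_MAX_HEART_RATE * (max_heart_rate - REF_MAX_HEART_RATE)
    )
    return math.exp(log_rate)


# A patient at the reference values scores exactly the population rate.
print(expected_admissions(200.0, 150.0))
```

Because the slide states the two factors are uncorrelated, treating them as independent additive terms in the log rate is at least consistent with it, though a real model would be fitted to patient history rather than hand-set.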
Greenplum And EMC Consulting Provide Big Fast Data Strategy And Implementation Services

• Vision Workshop: Identify big data analytics business use cases
• Analytics Lab: Deploy analytics sandbox to quantify the business case
• Analytics Operationalization: Identify current state, determine
  required state and conduct gap analysis to develop analytics
  implementation roadmap

Repeat the process for identified business cases

Questions and Answers

To type a question via WebEx, click on the Q&A tab.
Please select “Ask: All Panelists” to ensure your questions reach us. Thank you!

Learn More…
 See us at…
     –    Oct. 16-17 O’Reilly Strata Rx Conference, Santa Clara, CA
              ▪   Oct. 16 9:40 am It’s an Exciting Time in the Industry
              ▪   Oct. 16 3:35 pm Big Fast Data in Health Sciences: A Panel of Experts Discusses What and Why
              ▪   Oct. 17 2:05 pm A Predictive Approach to Real-Time Detection of Fraud, Waste, and Abuse in Healthcare
     –    Oct. 23-25 O’Reilly Strata New York Conference
              ▪   Oct. 23 11:15 am Great Debate: The Old Models are Broken
     –    On-demand webinar: Transform Your BI and Data Warehouse for Big Data
     –    Upcoming webinar, Sept. 18, 11am PT/2pm ET: Using Greenplum to Deliver Big Data Analytics

 Contact Judith Hurwitz
     –    Email: judith.hurwitz@hurwitz.com
     –    LinkedIn: http://www.linkedin.com/pub/judith-hurwitz/0/18/405
     –    Twitter: @jhurwitz

 Contact Bill Schmarzo
     –    Email: william.schmarzo@emc.com
     –    LinkedIn: http://www.linkedin.com/in/schmarzo
     –    Twitter: @schmarzo
     –    Blog: http://infocus.emc.com/author/william_schmarzo/


THANK YOU



The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
 
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
 
New Innovations in Information Management for Big Data - Smarter Business 2013
New Innovations in Information Management for Big Data - Smarter Business 2013New Innovations in Information Management for Big Data - Smarter Business 2013
New Innovations in Information Management for Big Data - Smarter Business 2013
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadh
 
Fast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationFast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow Presentation
 
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...
 
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the CloudFoundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
 
Ibm info sphere datastage and hadoop two best-of-breed solutions together-f...
Ibm info sphere datastage and hadoop   two best-of-breed solutions together-f...Ibm info sphere datastage and hadoop   two best-of-breed solutions together-f...
Ibm info sphere datastage and hadoop two best-of-breed solutions together-f...
 
Analytic Platforms in the Real World with 451Research and Calpont_July 2012
Analytic Platforms in the Real World with 451Research and Calpont_July 2012Analytic Platforms in the Real World with 451Research and Calpont_July 2012
Analytic Platforms in the Real World with 451Research and Calpont_July 2012
 
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
 
Big data and oracle
Big data and oracleBig data and oracle
Big data and oracle
 
Use Big Data Technologies to Modernize Your Enterprise Data Warehouse
Use Big Data Technologies to Modernize Your Enterprise Data Warehouse Use Big Data Technologies to Modernize Your Enterprise Data Warehouse
Use Big Data Technologies to Modernize Your Enterprise Data Warehouse
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the CloudFoundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
 
What is the Point of Hadoop
What is the Point of HadoopWhat is the Point of Hadoop
What is the Point of Hadoop
 
Big Data Meets Social Analytics - IBM Connect 2012 (CN-CC13)
Big Data Meets Social Analytics - IBM Connect 2012 (CN-CC13)Big Data Meets Social Analytics - IBM Connect 2012 (CN-CC13)
Big Data Meets Social Analytics - IBM Connect 2012 (CN-CC13)
 

Analyze This! Best Practices For Big And Fast Data

  • 1. Analyze This! Best Practices For Big And Fast Data
      Judith Hurwitz, President, Hurwitz & Associates
      Bill Schmarzo, CTO, EIMA Practice, EMC Consulting
      © Copyright 2012 EMC Corporation. All rights reserved.
  • 2. What is Big Fast Data? The Transition in Data Management
      Judith Hurwitz
  • 3. What Is Big Fast Data?
      Big Fast Data is the ability to manage a huge volume of disparate data at the right velocity within the right timeframe.
      Characteristics of Big Fast Data
      • Must be verified based on accuracy and business context
      • Must incorporate a variety of data types, including structured and unstructured data
  • 4. Why Is Big Fast Data Important?
      • Businesses need to gain insights from massive amounts of stored data
      • Businesses need to be able to make decisions faster to impact outcomes
      • Need to find answers without asking the question
  • 5. What Is The Business Looking For?
      1. Ability to gain access to vast amounts of available data from multiple sources
      2. Ability to identify anomalies
      3. Ability to predict the future
      4. Ability to react in real time based on analysis
  • 6. How Did We Get Here?
      • Early online commerce sites and search engines began pushing boundaries of data management
      • Successful companies found ways to monetize huge volumes of customer data to upsell
      • The massive data had to be managed efficiently and in the right context
  • 7. Waves Of Data In Context With Usage Patterns
      Wave: Relational Database
        Examples: System of Record
        Characteristics: Used for structured, transactional data, strict definitional controls.
      Wave: Content Management System
        Examples: Claims Document Management System, Web content management
        Characteristics: Used with unstructured/semi-structured text, derived value, context driven.
      Wave: Data Warehouse
        Examples: Customer and account data warehouse
        Characteristics: Used for structured data. Subject oriented system optimized for querying. Integrated, well-defined parameters, optimized for storage, focused on timely access to corporate data.
      Wave: Complex Event Processing/Streaming data
        Examples: Monitoring sensor data in real time to determine process changes
        Characteristics: Large streams of data focused on managing and analyzing business processes.
      Wave: In-Memory Databases
        Examples: Used in ecommerce engines to reduce latency and speed transaction processing.
        Characteristics: Uses main memory to cache data to improve speed. Fast analytical processing that can transform decision making in real-time or near real-time.
      Wave: Hadoop Software Framework
        Examples: Used to process massive amounts of highly distributed disparate data. Examples include fraud processing, image processing
        Characteristics: A non-relational software framework based on Google’s MapReduce Framework. It includes a distributed file system based software framework. Allows very large data files (both structured and unstructured data) to be distributed across all nodes of a very large grid of servers.
      Wave: NoSQL Databases
        Examples: Designed to process massive amounts of data in a flexible form. Used in ecommerce to process massive amounts of data flexibly.
        Characteristics: Supports various database models including graph, object, key value, and document. Document oriented rather than relying on joins, scale out model for scalability.
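
The Hadoop row on slide 7 refers to the MapReduce model. As a rough illustration only, the sketch below runs the three MapReduce stages (map, shuffle, reduce) in a single Python process to count words. It does not use Hadoop or any Hadoop API; the function names and the sample records are assumptions made for this example, and a real cluster would spread the map and reduce calls across many nodes.

```python
from collections import defaultdict


def map_phase(record):
    # Map: emit a (word, 1) pair for every word in one input record.
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    # Shuffle: group all emitted values by key so each key is reduced once.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: collapse the values collected for one key into a single count.
    return key, sum(values)


def word_count(records):
    # Run map over every record, shuffle the pairs, then reduce per key.
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical input records, standing in for lines of a very large file.
counts = word_count(["big fast data", "fast data wins"])
```

Because each map call sees only one record and each reduce call sees only one key, the same logic can in principle be distributed across a grid of servers, which is the property the slide highlights.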
  • 8. How Infrastructure Supports The Reality Of Big Fast Data
      • Availability of commodity servers
      • Horizontal scaling because of virtualization
      • Emergence of Cloud Computing
      • Advanced data management including predictive analytics and big data analysis
  • 9. Making Big Data Fast Data A Reality
      • Create a well-defined business and IT strategy
      • Focus on the business problem, such as identifying buying opportunities at point of engagement or reducing fraud through an early warning system
      • Understand the characteristics of your own data that you need to leverage for the future
      • Identify the bottlenecks in your current data architecture
      • Create a strategy so you can use massive data at the right speed and in the right context to anticipate new opportunities
  • 10. The Elements Of A Data Architecture
      • Foundational Data Services – support for relational, in-memory databases, structured and unstructured data
      • Middleware Services – allow for communication and integration between data sources
      • Big Data Analytics – ability to analyze huge volumes of data
      • Data Warehousing Capabilities – used to apply analytics to huge volumes of complex data
      • Management Services – deliver the right performance levels
      • Virtualized Infrastructure – ability to optimize the environment
      • Runtime Services – support for mobile computing and other user environments
  • 11. The Business Initiative For Big Fast Data
      • Capture, transform, and manage huge volumes of information in near real time
      • Capture data at the point of creation and then combine data sources to create context to deliver on the business objective
      • Leverage data assets to gain a competitive advantage
  • 12. The Business Potential Of Big Fast Data
      Bill Schmarzo, CTO, EIM&A Practice, EMC Consulting
  • 13. Big Fast Data Requires An Architecture For High-velocity Data To Accelerate Operational Execution
      Key Architecture Capabilities
      • Scale out compute and storage
      • Distribution: real-time WAN
      • Data Diversity: SQL and NoSQL
      • Mobile enabled
      • In-memory computing
      • In-database analytics
      • Cloud friendly architecture
      [Architecture diagram; labels: Mobile-Enabled Application, Web Clients, Performance Manager, Cloud Application Platform, App Director, Installer, Application Logic, In-memory Database, Fast Ingest, vFabric Data Director, Greenplum, Postgres, Oracle, Greenplum, Hadoop, Cloud Platform]
  • 14. Big Fast Data Use Cases
      • Algorithmic Stock Trading: Identify risk and pricing nuances in stock trading
      • Real-time Ad Serving: Serve right ad to right person at the right time
      • Cyber Security: Flag potential security breach behaviors and situations
      • Fraud Detection: Identify potential fraud situations at purchase time
      • High-end Product Failure: Predict high-end product failures (planes, trains, power plants)
      • Next Best Offers: Recommend products based on current shopping occurrence
      • Churn Detection: Flag customer behaviors that are indicative of attrition
      • Medical Treatment: Recommend appropriate medical treatments in urgent situations
      • Right-time Money Laundering: Flag suspicious financial transactions
      • Claims Adjudication: Approve insurance claims at time of filing
      • Loan/Insurance Approval: Calculate financial scores and risks to approve loan or policy
      • Oil & Gas Exploration: Track sensor feeds to identify potential drilling problems
  • 15. Use Case: Financial Trading And Real-time Operational Analytics
      • Develop risk and pricing algorithms against historical data in Greenplum Database using analytical methods such as linear regression, clustering, etc.
      • Serve up analytic results and scores to SQLFire for real-time execution
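
Slide 15 describes a two-step pattern: fit a model offline against historical data, then push the resulting scores into an in-memory store for fast lookup. The sketch below illustrates that pattern under stated assumptions. It fits a one-variable least-squares line in plain Python and keeps the scores in a dictionary; the dictionary is only a stand-in for an in-memory database, and the sketch does not use Greenplum or SQLFire. The feature, the symbols, and all numbers are hypothetical.

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = slope * x + intercept (one feature).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Offline step: hypothetical historical data (feature value, observed risk).
history_x = [1.0, 2.0, 3.0, 4.0]
history_y = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(history_x, history_y)

# Publish step: precompute one score per instrument and keep it in memory.
# The symbols and their current feature values are made up for illustration.
current_features = {"AAA": 2.0, "BBB": 3.5}
score_cache = {
    symbol: slope * value + intercept
    for symbol, value in current_features.items()
}


def lookup_score(symbol):
    # Online step: a constant-time read, with no model fitting at request time.
    return score_cache.get(symbol)
```

The design point is the split itself: the expensive fitting runs where the historical data lives, while the request path only reads a precomputed value.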
  • 16. Use Case: Retail Location-based Marketing And Next Best Offers
      • Develop analytic models on detailed customer loyalty and Point of Sale (POS) data to create “next best offer” scores for each customer
      • Leverage “right-time” feeds based upon customer geo location to deliver most appropriate offers
  • 17. Use Case: Healthcare And Readmission Score At Initial Admission
      • Score patient at point of admission for the probability of readmission based upon patient history and current health factors
      • Create custom treatment and monitoring programs for high-risk patients
      [Chart: Out of 1000 patients, 1124 admissions expected within next 12 months]
      • Admissions increase with the level of cholesterol
      • Admissions decrease with the Max Heart Rate
      • Cholesterol and Max Heart Rate uncorrelated
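
Slide 17 states only the direction of two effects: expected admissions rise with cholesterol and fall with max heart rate. One way such a score could be computed is a log-linear rate model, sketched below. This is an assumption, not the model behind the slide: the functional form, the coefficients, the threshold, and the patient records are all hypothetical and were chosen only so that the signs match the slide.

```python
import math

# Hypothetical coefficients; only their signs are taken from the slide.
INTERCEPT = -1.0
COEF_CHOLESTEROL = 0.005       # positive: admissions increase with cholesterol
COEF_MAX_HEART_RATE = -0.004   # negative: admissions decrease with max heart rate


def expected_admissions(cholesterol, max_heart_rate):
    # Log-linear rate: expected admissions over the next 12 months.
    linear = (
        INTERCEPT
        + COEF_CHOLESTEROL * cholesterol
        + COEF_MAX_HEART_RATE * max_heart_rate
    )
    return math.exp(linear)


def flag_high_risk(patients, threshold):
    # Return the ids of patients whose expected admissions exceed the threshold.
    return [
        patient["id"]
        for patient in patients
        if expected_admissions(patient["cholesterol"], patient["max_heart_rate"]) > threshold
    ]


# Made-up patient records scored at the point of admission.
patients = [
    {"id": "a", "cholesterol": 280, "max_heart_rate": 110},
    {"id": "b", "cholesterol": 180, "max_heart_rate": 170},
]
high_risk = flag_high_risk(patients, threshold=0.8)
```

Summing `expected_admissions` over a cohort would give a cohort-level total of the kind shown on the slide, though the figure there comes from the presenters' own model and data.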
  • 18. Greenplum And EMC Consulting Provide Big Fast Data Strategy And Implementation Services
      • Vision Workshop: Identify big data analytics business use cases
      • Analytics Lab: Deploy analytics sandbox to quantify the business case
      • Analytics Operationalization: Identify current state, determine required state and conduct gap analysis to develop analytics implementation roadmap
      • Repeat the process for identified business cases
  • 19. Questions and Answers
      To type a question via WebEx, click on the Q&A tab. Please select “Ask: All Panelists” to ensure your questions reach us. Thank you!
  • 20. Learn More…
      • See us at…
        – Oct. 16-17 O’Reilly Strata Rx Conference, Santa Clara, CA
          ▪ Oct. 16 9:40 am It’s an Exciting Time in the Industry
          ▪ Oct. 16 3:35 pm Big Fast Data in Health Sciences: A Panel of Experts Discusses What and Why
          ▪ Oct. 17 2:05 pm A Predictive Approach to Real-Time Detection of Fraud, Waste, and Abuse in Healthcare
        – Oct. 23-25 O’Reilly Strata New York Conference
          ▪ Oct. 23 11:15 am Great Debate: The Old Models are Broken
        – On-demand webinar: Transform Your BI and Data Warehouse for Big Data
        – Upcoming webinar Sept. 18, 11am PT/2pm ET: Using Greenplum to Deliver Big Data Analytics
      • Contact Judith Hurwitz
        – Email: judith.hurwitz@hurwitz.com
        – LinkedIn: http://www.linkedin.com/pub/judith-hurwitz/0/18/405
        – Twitter: @jhurwitz
      • Contact Bill Schmarzo
        – Email: william.schmarzo@emc.com
        – LinkedIn: http://www.linkedin.com/in/schmarzo
        – Twitter: @schmarzo
        – Blog: http://infocus.emc.com/author/william_schmarzo/
  • 21. THANK YOU