SlideShare a Scribd company logo
1 of 10
Integrating Big Data Smarter
Steven Totman
Director Of Strategy
stotman@syncsort.com
@StevenTotman
Big Data Warehousing
+( )
2
Why Syncsort?
3
• 50% of all mainframes run Syncsort
• 1,500 Mainframe Customers: Most
used & trusted 3rd party mainframe
software
• Speed leader for ETL & Sort
• A history of innovation
• 25+ Issued & Pending Patents
• Large global customer base
• 15,000+ deployments in 68 countries
• First-to-market, fully integrated
approach to Hadoop ETL
For 40 years we have been helping companies solve their big data
issues…even before they knew the name Big Data!
Our customers are achieving the
impossible, every day!Integrating Big Data… Smarter!
Key Technology Partners
4
Mandatory sort steps in MapReduce processing
5
Smart Contributions to Improve Hadoop
6
JIRA
4807 Allow MapOutputBuffer to be pluggable
4808 Allow Reduce-side merge to be pluggable
4809 Make classes required for 2454 public
4812 Create reduce input merger plug-in
Description
…and more!!
4842 Shuffle race can hang reducer
2461 HDFS file name globbing in libhdfs
4482 Backport of 2454 to MapReduce 1 & 1.2
Native Sort:
ᵡNot modular
ᵡLimited capabilities
ᵡDifficult to fine-tune & configure (requires coding &
compilation)
Native
Sort
Hadoop
Node
Native
Sort
Hadoop
Node
Contribution:
 Modular
 Extensible
 Configurable through use of external sorters on
MapReduce nodes
Native
Sort
Hadoop
Node
Native
Sort
Hadoop
Node
+1 Committed Jan 22nd 2013
7
0
50
100
150
200
250
0 1000 2000 3000 4000 5000
ElapsedTime(min)
File Size (GB)
TeraSort Benchmark
Benefits to the Community
JOIN
MERGE
AGGREGRATION
CDC
COMPRESSION
LOOKUP
RANK
MATCH
Native Sort
Syncsort
The Smarter Approach to Hadoop ETL… and Mainframe
8
Connect Process
 Connect – One tool to connect all your data
 Translate - Best in class mainframe data
access with seamless data translation &
COBOL Copybooks support
 Process – Hadoop ETL without coding.
Develop, test & debug locally in Windows;
deploy on Hadoop
PLUS…
 Enterprise-grade security
 Smarter deployment, monitoring &
administration
 Disruptive cost-structure
 Decades of Mainframe expertise
Translate
9
www.syncsort.com/try
+
Running on CDH & HDP
Test drive dmx-h:
Bridge the Gap Between
Big Iron & Big Data!
• Self-contained image
• Use case accelerators for
• mainframe, Hadoop and more!
…and Quite Possibly The Only Approach!
A Smarter Approach…
( )
What Next?
10
Date: Wednesday, September 25, 2013
Time: 9:00 AM PT | 12:00 PM ET
Register & Attend Webinar for a Free T-Shirt

More Related Content

What's hot

State of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor DataState of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor DataMathieu Dumoulin
 
Best Practices for Data Convergence in Healthcare
Best Practices for Data Convergence in HealthcareBest Practices for Data Convergence in Healthcare
Best Practices for Data Convergence in HealthcareMapR Technologies
 
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016Mathieu Dumoulin
 
The Briefcase Cluster – Enabling Big Data Everywhere
The Briefcase Cluster – Enabling Big Data Everywhere The Briefcase Cluster – Enabling Big Data Everywhere
The Briefcase Cluster – Enabling Big Data Everywhere MapR Technologies
 
Bringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale StorageBringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale StorageMapR Technologies
 
Big data processing with PubSub, Dataflow, and BigQuery
Big data processing with PubSub, Dataflow, and BigQueryBig data processing with PubSub, Dataflow, and BigQuery
Big data processing with PubSub, Dataflow, and BigQueryThuyen Ho
 
Big Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS CloudBig Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS CloudAmazon Web Services
 
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Mathieu Dumoulin
 
Geo-Distributed Big Data and Analytics
Geo-Distributed Big Data and AnalyticsGeo-Distributed Big Data and Analytics
Geo-Distributed Big Data and AnalyticsMapR Technologies
 
Enabling Real-Time Business with Change Data Capture
Enabling Real-Time Business with Change Data CaptureEnabling Real-Time Business with Change Data Capture
Enabling Real-Time Business with Change Data CaptureMapR Technologies
 
GPUdb: A Distributed Database for Many-Core Devices
GPUdb: A Distributed Database for Many-Core DevicesGPUdb: A Distributed Database for Many-Core Devices
GPUdb: A Distributed Database for Many-Core Devicesinside-BigData.com
 
3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data Analytics3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data AnalyticsMapR Technologies
 
Machine Learning Success: The Key to Easier Model Management
Machine Learning Success: The Key to Easier Model ManagementMachine Learning Success: The Key to Easier Model Management
Machine Learning Success: The Key to Easier Model ManagementMapR Technologies
 
ML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & EvaluationML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & EvaluationMapR Technologies
 
Self-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your DataSelf-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your DataMapR Technologies
 
Big Data- Automotive Industry Use Case
Big Data- Automotive Industry Use CaseBig Data- Automotive Industry Use Case
Big Data- Automotive Industry Use CaseSophie (C.F.) Tsai
 
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...MapR Technologies
 

What's hot (19)

State of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor DataState of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
 
Best Practices for Data Convergence in Healthcare
Best Practices for Data Convergence in HealthcareBest Practices for Data Convergence in Healthcare
Best Practices for Data Convergence in Healthcare
 
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016
 
The Briefcase Cluster – Enabling Big Data Everywhere
The Briefcase Cluster – Enabling Big Data Everywhere The Briefcase Cluster – Enabling Big Data Everywhere
The Briefcase Cluster – Enabling Big Data Everywhere
 
Bringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale StorageBringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale Storage
 
Big data processing with PubSub, Dataflow, and BigQuery
Big data processing with PubSub, Dataflow, and BigQueryBig data processing with PubSub, Dataflow, and BigQuery
Big data processing with PubSub, Dataflow, and BigQuery
 
Big Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS CloudBig Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS Cloud
 
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
 
HDF Data in the Cloud
HDF Data in the CloudHDF Data in the Cloud
HDF Data in the Cloud
 
Geo-Distributed Big Data and Analytics
Geo-Distributed Big Data and AnalyticsGeo-Distributed Big Data and Analytics
Geo-Distributed Big Data and Analytics
 
Enabling Real-Time Business with Change Data Capture
Enabling Real-Time Business with Change Data CaptureEnabling Real-Time Business with Change Data Capture
Enabling Real-Time Business with Change Data Capture
 
GPUdb: A Distributed Database for Many-Core Devices
GPUdb: A Distributed Database for Many-Core DevicesGPUdb: A Distributed Database for Many-Core Devices
GPUdb: A Distributed Database for Many-Core Devices
 
3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data Analytics3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data Analytics
 
Machine Learning Success: The Key to Easier Model Management
Machine Learning Success: The Key to Easier Model ManagementMachine Learning Success: The Key to Easier Model Management
Machine Learning Success: The Key to Easier Model Management
 
Penny Pinching at Scale
Penny Pinching at ScalePenny Pinching at Scale
Penny Pinching at Scale
 
ML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & EvaluationML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & Evaluation
 
Self-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your DataSelf-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your Data
 
Big Data- Automotive Industry Use Case
Big Data- Automotive Industry Use CaseBig Data- Automotive Industry Use Case
Big Data- Automotive Industry Use Case
 
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
 

Viewers also liked

Big data, Analytics and 4th Generation Data Warehousing
Big data, Analytics and 4th Generation Data WarehousingBig data, Analytics and 4th Generation Data Warehousing
Big data, Analytics and 4th Generation Data WarehousingMartyn Richard Jones
 
New Analytical Architectures for Big Data
New Analytical Architectures for Big DataNew Analytical Architectures for Big Data
New Analytical Architectures for Big DataCasey Kiernan
 
THE DATA WAREHOUSING DEBATE: TRADITIONAL vs. OPEN SOURCE from Structure:Data ...
THE DATA WAREHOUSING DEBATE: TRADITIONAL vs. OPEN SOURCE from Structure:Data ...THE DATA WAREHOUSING DEBATE: TRADITIONAL vs. OPEN SOURCE from Structure:Data ...
THE DATA WAREHOUSING DEBATE: TRADITIONAL vs. OPEN SOURCE from Structure:Data ...Gigaom
 
SAS on Your (Apache) Cluster, Serving your Data (Analysts)
SAS on Your (Apache) Cluster, Serving your Data (Analysts)SAS on Your (Apache) Cluster, Serving your Data (Analysts)
SAS on Your (Apache) Cluster, Serving your Data (Analysts)DataWorks Summit
 
SAS and Cloudera – Analytics at Scale
SAS and Cloudera – Analytics at ScaleSAS and Cloudera – Analytics at Scale
SAS and Cloudera – Analytics at ScaleCloudera, Inc.
 
Moving From SAS to R Webinar Presentation - 07Aug14
Moving From SAS to R Webinar Presentation - 07Aug14Moving From SAS to R Webinar Presentation - 07Aug14
Moving From SAS to R Webinar Presentation - 07Aug14Revolution Analytics
 
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelMoving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelDataWorks Summit
 
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)Ajay Ohri
 
A big-data architecture for real-time analytics
A big-data architecture for real-time analyticsA big-data architecture for real-time analytics
A big-data architecture for real-time analyticsramikaurraminder
 
Introduction to Cognos BI
Introduction to Cognos BIIntroduction to Cognos BI
Introduction to Cognos BIEdureka!
 
Hadoop and Spark for the SAS Developer
Hadoop and Spark for the SAS DeveloperHadoop and Spark for the SAS Developer
Hadoop and Spark for the SAS DeveloperDataWorks Summit
 
SAS Modernization architectures - Big Data Analytics
SAS Modernization architectures - Big Data AnalyticsSAS Modernization architectures - Big Data Analytics
SAS Modernization architectures - Big Data AnalyticsDeepak Ramanathan
 
Getting Started with Salesforce Admin and Developer Foundation
Getting Started with Salesforce Admin and Developer FoundationGetting Started with Salesforce Admin and Developer Foundation
Getting Started with Salesforce Admin and Developer FoundationEdureka!
 
High Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and HadoopHigh Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and HadoopRevolution Analytics
 
Netezza integration with SAS software
Netezza integration with SAS softwareNetezza integration with SAS software
Netezza integration with SAS softwarePavel Zhivulin
 
Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics? Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics? Revolution Analytics
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata Hortonworks
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Rittman Analytics
 
Running Cognos on Hadoop
Running Cognos on HadoopRunning Cognos on Hadoop
Running Cognos on HadoopSenturus
 

Viewers also liked (20)

Big data, Analytics and 4th Generation Data Warehousing
Big data, Analytics and 4th Generation Data WarehousingBig data, Analytics and 4th Generation Data Warehousing
Big data, Analytics and 4th Generation Data Warehousing
 
New Analytical Architectures for Big Data
New Analytical Architectures for Big DataNew Analytical Architectures for Big Data
New Analytical Architectures for Big Data
 
THE DATA WAREHOUSING DEBATE: TRADITIONAL vs. OPEN SOURCE from Structure:Data ...
THE DATA WAREHOUSING DEBATE: TRADITIONAL vs. OPEN SOURCE from Structure:Data ...THE DATA WAREHOUSING DEBATE: TRADITIONAL vs. OPEN SOURCE from Structure:Data ...
THE DATA WAREHOUSING DEBATE: TRADITIONAL vs. OPEN SOURCE from Structure:Data ...
 
SAS on Your (Apache) Cluster, Serving your Data (Analysts)
SAS on Your (Apache) Cluster, Serving your Data (Analysts)SAS on Your (Apache) Cluster, Serving your Data (Analysts)
SAS on Your (Apache) Cluster, Serving your Data (Analysts)
 
SAS and Cloudera – Analytics at Scale
SAS and Cloudera – Analytics at ScaleSAS and Cloudera – Analytics at Scale
SAS and Cloudera – Analytics at Scale
 
Moving From SAS to R Webinar Presentation - 07Aug14
Moving From SAS to R Webinar Presentation - 07Aug14Moving From SAS to R Webinar Presentation - 07Aug14
Moving From SAS to R Webinar Presentation - 07Aug14
 
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelMoving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
 
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
 
A big-data architecture for real-time analytics
A big-data architecture for real-time analyticsA big-data architecture for real-time analytics
A big-data architecture for real-time analytics
 
Introduction to Cognos BI
Introduction to Cognos BIIntroduction to Cognos BI
Introduction to Cognos BI
 
Hadoop and Spark for the SAS Developer
Hadoop and Spark for the SAS DeveloperHadoop and Spark for the SAS Developer
Hadoop and Spark for the SAS Developer
 
SAS Modernization architectures - Big Data Analytics
SAS Modernization architectures - Big Data AnalyticsSAS Modernization architectures - Big Data Analytics
SAS Modernization architectures - Big Data Analytics
 
Getting Started with Salesforce Admin and Developer Foundation
Getting Started with Salesforce Admin and Developer FoundationGetting Started with Salesforce Admin and Developer Foundation
Getting Started with Salesforce Admin and Developer Foundation
 
High Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and HadoopHigh Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and Hadoop
 
Netezza integration with SAS software
Netezza integration with SAS softwareNetezza integration with SAS software
Netezza integration with SAS software
 
Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics? Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics?
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
 
SASとHadoopとの連携 2015
SASとHadoopとの連携 2015SASとHadoopとの連携 2015
SASとHadoopとの連携 2015
 
Running Cognos on Hadoop
Running Cognos on HadoopRunning Cognos on Hadoop
Running Cognos on Hadoop
 

Similar to Steve Totman Syncsort Big Data Warehousing hug 23 sept Final

Hadoop Summit Amsterdam 2013 - Making Hadoop Ready for Prime Time - Syncsort ...
Hadoop Summit Amsterdam 2013 - Making Hadoop Ready for Prime Time - Syncsort ...Hadoop Summit Amsterdam 2013 - Making Hadoop Ready for Prime Time - Syncsort ...
Hadoop Summit Amsterdam 2013 - Making Hadoop Ready for Prime Time - Syncsort ...Steven Totman
 
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...DataWorks Summit
 
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.OW2
 
How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron...
How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron...How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron...
How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron...Precisely
 
Powering Real-Time Big Data Analytics with a Next-Gen GPU Database
Powering Real-Time Big Data Analytics with a Next-Gen GPU DatabasePowering Real-Time Big Data Analytics with a Next-Gen GPU Database
Powering Real-Time Big Data Analytics with a Next-Gen GPU DatabaseKinetica
 
Enterprise Hadoop is Here to Stay: Plan Your Evolution Strategy
Enterprise Hadoop is Here to Stay: Plan Your Evolution StrategyEnterprise Hadoop is Here to Stay: Plan Your Evolution Strategy
Enterprise Hadoop is Here to Stay: Plan Your Evolution StrategyInside Analysis
 
Why Hadoop is important to Syncsort
Why Hadoop is important to SyncsortWhy Hadoop is important to Syncsort
Why Hadoop is important to Syncsorthuguk
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeeling Cheung
 
Webinar: Improving Time to Value for Enterprise Big Data Analytics
Webinar: Improving Time to Value for Enterprise Big Data AnalyticsWebinar: Improving Time to Value for Enterprise Big Data Analytics
Webinar: Improving Time to Value for Enterprise Big Data AnalyticsStorage Switzerland
 
flexpod_hadoop_cloudera
flexpod_hadoop_clouderaflexpod_hadoop_cloudera
flexpod_hadoop_clouderaPrem Jain
 
IBMHadoopofferingTechline-Systems2015
IBMHadoopofferingTechline-Systems2015IBMHadoopofferingTechline-Systems2015
IBMHadoopofferingTechline-Systems2015Daniela Zuppini
 
Game Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise ThinkingGame Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise ThinkingInside Analysis
 
GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017Joshua Patterson
 
Data Con LA 2018 - Populating your Enterprise Data Hub for Next Gen Analytics...
Data Con LA 2018 - Populating your Enterprise Data Hub for Next Gen Analytics...Data Con LA 2018 - Populating your Enterprise Data Hub for Next Gen Analytics...
Data Con LA 2018 - Populating your Enterprise Data Hub for Next Gen Analytics...Data Con LA
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataPentaho
 
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseUsing the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseRizaldy Ignacio
 
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data SolutionBig Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data SolutionEtu Solution
 
Better Total Value of Ownership (TVO) for Complex Analytic Workflows with the...
Better Total Value of Ownership (TVO) for Complex Analytic Workflows with the...Better Total Value of Ownership (TVO) for Complex Analytic Workflows with the...
Better Total Value of Ownership (TVO) for Complex Analytic Workflows with the...ModusOptimum
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointInside Analysis
 
Has Traditional MDM Finally Met its Match?
Has Traditional MDM Finally Met its Match?Has Traditional MDM Finally Met its Match?
Has Traditional MDM Finally Met its Match?Inside Analysis
 

Similar to Steve Totman Syncsort Big Data Warehousing hug 23 sept Final (20)

Hadoop Summit Amsterdam 2013 - Making Hadoop Ready for Prime Time - Syncsort ...
Hadoop Summit Amsterdam 2013 - Making Hadoop Ready for Prime Time - Syncsort ...Hadoop Summit Amsterdam 2013 - Making Hadoop Ready for Prime Time - Syncsort ...
Hadoop Summit Amsterdam 2013 - Making Hadoop Ready for Prime Time - Syncsort ...
 
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
 
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
 
How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron...
How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron...How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron...
How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron...
 
Powering Real-Time Big Data Analytics with a Next-Gen GPU Database
Powering Real-Time Big Data Analytics with a Next-Gen GPU DatabasePowering Real-Time Big Data Analytics with a Next-Gen GPU Database
Powering Real-Time Big Data Analytics with a Next-Gen GPU Database
 
Enterprise Hadoop is Here to Stay: Plan Your Evolution Strategy
Enterprise Hadoop is Here to Stay: Plan Your Evolution StrategyEnterprise Hadoop is Here to Stay: Plan Your Evolution Strategy
Enterprise Hadoop is Here to Stay: Plan Your Evolution Strategy
 
Why Hadoop is important to Syncsort
Why Hadoop is important to SyncsortWhy Hadoop is important to Syncsort
Why Hadoop is important to Syncsort
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
 
Webinar: Improving Time to Value for Enterprise Big Data Analytics
Webinar: Improving Time to Value for Enterprise Big Data AnalyticsWebinar: Improving Time to Value for Enterprise Big Data Analytics
Webinar: Improving Time to Value for Enterprise Big Data Analytics
 
flexpod_hadoop_cloudera
flexpod_hadoop_clouderaflexpod_hadoop_cloudera
flexpod_hadoop_cloudera
 
IBMHadoopofferingTechline-Systems2015
IBMHadoopofferingTechline-Systems2015IBMHadoopofferingTechline-Systems2015
IBMHadoopofferingTechline-Systems2015
 
Game Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise ThinkingGame Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise Thinking
 
GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017
 
Data Con LA 2018 - Populating your Enterprise Data Hub for Next Gen Analytics...
Data Con LA 2018 - Populating your Enterprise Data Hub for Next Gen Analytics...Data Con LA 2018 - Populating your Enterprise Data Hub for Next Gen Analytics...
Data Con LA 2018 - Populating your Enterprise Data Hub for Next Gen Analytics...
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big Data
 
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseUsing the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
 
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data SolutionBig Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
 
Better Total Value of Ownership (TVO) for Complex Analytic Workflows with the...
Better Total Value of Ownership (TVO) for Complex Analytic Workflows with the...Better Total Value of Ownership (TVO) for Complex Analytic Workflows with the...
Better Total Value of Ownership (TVO) for Complex Analytic Workflows with the...
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter Point
 
Has Traditional MDM Finally Met its Match?
Has Traditional MDM Finally Met its Match?Has Traditional MDM Finally Met its Match?
Has Traditional MDM Finally Met its Match?
 

Recently uploaded

(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607dollysharma2066
 
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...lizamodels9
 
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCRashishs7044
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCRashishs7044
 
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCRashishs7044
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesKeppelCorporation
 
Innovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfInnovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfrichard876048
 
Call Girls Miyapur 7001305949 all area service COD available Any Time
Call Girls Miyapur 7001305949 all area service COD available Any TimeCall Girls Miyapur 7001305949 all area service COD available Any Time
Call Girls Miyapur 7001305949 all area service COD available Any Timedelhimodelshub1
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
Buy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail AccountsBuy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail AccountsBuy Verified Accounts
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03DallasHaselhorst
 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfJos Voskuil
 
FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607dollysharma2066
 
The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024christinemoorman
 
International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...ssuserf63bd7
 
Case study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailCase study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailAriel592675
 
Kenya’s Coconut Value Chain by Gatsby Africa
Kenya’s Coconut Value Chain by Gatsby AfricaKenya’s Coconut Value Chain by Gatsby Africa
Kenya’s Coconut Value Chain by Gatsby Africaictsugar
 
Organizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessOrganizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessSeta Wicaksana
 
2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis UsageNeil Kimberley
 

Recently uploaded (20)

(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
 
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
 
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR
 
Corporate Profile 47Billion Information Technology
Corporate Profile 47Billion Information TechnologyCorporate Profile 47Billion Information Technology
Corporate Profile 47Billion Information Technology
 
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation Slides
 
Innovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfInnovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdf
 
Call Girls Miyapur 7001305949 all area service COD available Any Time
Call Girls Miyapur 7001305949 all area service COD available Any TimeCall Girls Miyapur 7001305949 all area service COD available Any Time
Call Girls Miyapur 7001305949 all area service COD available Any Time
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
Buy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail AccountsBuy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail Accounts
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03
 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdf
 
FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607
 
The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024
 
International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...
 
Case study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailCase study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detail
 
Kenya’s Coconut Value Chain by Gatsby Africa
Kenya’s Coconut Value Chain by Gatsby AfricaKenya’s Coconut Value Chain by Gatsby Africa
Kenya’s Coconut Value Chain by Gatsby Africa
 
Organizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessOrganizational Structure Running A Successful Business
Organizational Structure Running A Successful Business
 
2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage
 

Steve Totman Syncsort Big Data Warehousing hug 23 sept Final

  • 1. Integrating Big Data Smarter Steven Totman Director Of Strategy stotman@syncsort.com @StevenTotman Big Data Warehousing +( )
  • 2. 2
  • 3. Why Syncsort? 3 • 50% of all mainframes run Syncsort • 1,500 Mainframe Customers: Most used & trusted 3rd party mainframe software • Speed leader for ETL & Sort • A history of innovation • 25+ Issued & Pending Patents • Large global customer base • 15,000+ deployments in 68 countries • First-to-market, fully integrated approach to Hadoop ETL For 40 years we have been helping companies solve their big data issues…even before they knew the name Big Data! Our customers are achieving the impossible, every day!Integrating Big Data… Smarter! Key Technology Partners
  • 4. 4 Mandatory sort steps in MapReduce processing
  • 5. 5
  • 6. Smart Contributions to Improve Hadoop 6 JIRA 4807 Allow MapOutputBuffer to be pluggable 4808 Allow Reduce-side merge to be pluggable 4809 Make classes required for 2454 public 4812 Create reduce input merger plug-in Description …and more!! 4842 Shuffle race can hang reducer 2461 HDFS file name globbing in libhdfs 4482 Backport of 2454 to MapReduce 1 & 1.2 Native Sort: ᵡNot modular ᵡLimited capabilities ᵡDifficult to fine-tune & configure (requires coding & compilation) Native Sort Hadoop Node Native Sort Hadoop Node Contribution:  Modular  Extensible  Configurable through use of external sorters on MapReduce nodes Native Sort Hadoop Node Native Sort Hadoop Node +1 Committed Jan 22nd 2013
  • 7. 7 0 50 100 150 200 250 0 1000 2000 3000 4000 5000 ElapsedTime(min) File Size (GB) TeraSort Benchmark Benefits to the Community JOIN MERGE AGGREGRATION CDC COMPRESSION LOOKUP RANK MATCH Native Sort Syncsort
  • 8. The Smarter Approach to Hadoop ETL… and Mainframe 8 Connect Process  Connect – One tool to connect all your data  Translate - Best in class mainframe data access with seamless data translation & COBOL Copybooks support  Process – Hadoop ETL without coding. Develop, test & debug locally in Windows; deploy on Hadoop PLUS…  Enterprise-grade security  Smarter deployment, monitoring & administration  Disruptive cost-structure  Decades of Mainframe expertise Translate
  • 9. 9 www.syncsort.com/try + Running on CDH & HDP Test drive dmx-h: Bridge the Gap Between Big Iron & Big Data! • Self-contained image • Use case accelerators for • mainframe, Hadoop and more! …and Quite Possibly The Only Approach! A Smarter Approach… ( )
  • 10. What Next? 10 Date: Wednesday, September 25, 2013 Time: 9:00 AM PT | 12:00 PM ET Register & Attend Webinar for a Free T-Shirt

Editor's Notes

  1. Source image: http://us.fotolia.com/id/53234106