Apache Hadoop Training Series: Hadoop Introduction
10/23/14 
avkash@bigdataperspective.com 
https://www.linkedin.com/in/avkashchauhan 
Let's Start and Define Big Data
Let's Start and Define Big Data
How Hadoop Fits in This Scenario
http://www.packtpub.com/using-cloudera-impala/book 
http://www.amazon.com/Simplifying-Windows-Azure-HDInsight-Service/dp/0735673802 
http://blogs.msdn.com/b/microsoft_press/archive/2014/05/27/free-ebook-introducing-microsoft-azure-hdinsight.aspx 
https://www.linkedin.com/in/avkashchauhan 
Hadoop is an open-source, Java-based, scalable, fault-tolerant platform for storing and processing large amounts of unstructured data, distributed across machines.
Flexibility: A single repository for storing and analyzing any kind of data, not bound by a schema.
Scalability: A scale-out architecture divides the workload across multiple nodes using a flexible distributed file system.
Low Cost: Deployed on commodity hardware and an open-source platform.
Fault Tolerant: Continues working even if node(s) go down.
A system that moves computation to where the data is.
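The ideas of moving computation to the data and surviving node failures can be illustrated with a small sketch. This is a toy simulation, not how Hadoop's scheduler is actually implemented: the block names, node names, replica layout, and the `schedule` function are all hypothetical. It assumes each block is replicated on several nodes and assigns each block's task to the least-loaded live node that already stores that block.

```python
from collections import defaultdict

# Hypothetical cluster layout: each data block is replicated on several nodes.
BLOCK_REPLICAS = {
    "block-1": ["node-a", "node-b"],
    "block-2": ["node-b", "node-c"],
    "block-3": ["node-a", "node-c"],
}

def schedule(block_replicas, live_nodes):
    """Assign each block's task to a live node that already stores the block.

    Picks the least-loaded live replica holder, so computation moves to the
    data and work is spread across nodes. Raises if every replica is down.
    """
    load = defaultdict(int)
    assignment = {}
    for block in sorted(block_replicas):
        candidates = [n for n in block_replicas[block] if n in live_nodes]
        if not candidates:
            raise RuntimeError(f"no live replica for {block}")
        # Ties are broken by node name so the result is deterministic.
        chosen = min(candidates, key=lambda n: (load[n], n))
        assignment[block] = chosen
        load[chosen] += 1
    return assignment
```

Calling `schedule(BLOCK_REPLICAS, {"node-a", "node-c"})` models a cluster where `node-b` is down: every block still finds a live replica, so the work continues on the remaining nodes.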
Let's Start and Define Big Data
How Hadoop Fits in This Scenario
Hadoop Landscape
Let's Start and Define Big Data
How Hadoop Fits in This Scenario
Hadoop Landscape
Hadoop Core Components
Data Storage
Data Processing
HDFS 
MapReduce/YARN
Hadoop Common 
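To make the MapReduce processing model concrete, here is a minimal in-memory sketch of its three steps: map, shuffle, and reduce. It is a single-process illustration only, not Hadoop's Java API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are assumptions chosen for this example, and a real job would run the map and reduce steps in parallel across nodes over data stored in HDFS.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group all values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle, and reduce over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` depends only on its own key, both steps can be split across machines without coordination, which is what lets the model scale out.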
Cloud 
Cloudera Impala, Hortonworks Tez
Impala uses C++-based in-memory processing of HDFS data through SQL-like statements to speed up data processing.
Use cases include user collaborative filtering, user recommendations, clustering, and classification.
Let's Start and Define Big Data
How Hadoop Fits in This Scenario
Hadoop Landscape
Hadoop Core Components
Applying Hadoop to Save $$
Let's Start and Define Big Data
How Hadoop Fits in This Scenario
Hadoop Landscape
Hadoop Core Components
Applying Hadoop to Save $$
Concept of Data Lake
Let's Start and Define Big Data
How Hadoop Fits in This Scenario
Hadoop Landscape
Hadoop Core Components
Applying Hadoop to Save $$
Concept of Data Lake
Hadoop in Cloud
Let's Start and Define Big Data
How Hadoop Fits in This Scenario
Hadoop Landscape
Hadoop Core Components
Applying Hadoop to Save $$
Concept of Data Lake
Hadoop in Cloud
Big Data Analytics
EDW 
OLAP 
ODS 
Let's Start and Define Big Data
How Hadoop Fits in This Scenario
Hadoop Landscape
Hadoop Core Components
Applying Hadoop to Save $$
Concept of Data Lake
Hadoop in Cloud
Big Data Analytics With Hadoop
| | Amazon | HDInsight | Directives |
|---|---|---|---|
| Data Storage | S3 | Azure Blobs | Direct access from the compute machines for very fast data delivery |
| Processing | EC2 | Azure Compute | Dedicated machines ready to run with a specific version of the Hadoop runtime |
| Processing Libraries | Java-based, or any other language supported through Hadoop Streaming | .NET-based code | Users upload their processing code as binaries/libraries |
| Results | S3 | Azure Blobs | Once the job is completed, the results are stored back to the data storage used as the source |
| Visualization | Custom | Custom | Third-party applications can connect to the storage to perform visualization |
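The Processing Libraries row mentions Hadoop Streaming, which lets a job's map and reduce steps be written in languages other than Java. In that model the mapper and reducer exchange plain text lines, conventionally `key<TAB>value`, and the reducer receives its input sorted by key. The sketch below follows that contract in Python. The log-line input, the choice of the first field as the key, and the `run_locally` helper are assumptions for illustration; the functions take iterables of lines so the logic can be tried without a cluster.

```python
from itertools import groupby

def mapper(lines):
    """Emit 'key<TAB>1' for the first whitespace-separated field of each line."""
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            yield f"{fields[0]}\t1"

def reducer(sorted_lines):
    """Sum the counts for each key; input must already be sorted by key."""
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"

def run_locally(lines):
    """Stand-in for the framework: map, sort by key, then reduce."""
    return list(reducer(sorted(mapper(lines))))
```

In an actual streaming job, the mapper and reducer would typically be separate scripts that read from standard input and write to standard output, with the framework doing the sort between them. The input data and results would live in the storage shown in the table (for example S3 or Azure Blobs).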
Let's Start and Define Big Data
How Hadoop Fits in This Scenario
Hadoop Landscape
Hadoop Core Components
Applying Hadoop to Save $$
Concept of Data Lake
Hadoop in Cloud
Big Data Analytics With Hadoop
http://blogs.msdn.com/b/microsoft_press/archive/2014/05/27/free-ebook-introducing-microsoft-azure-hdinsight.aspx 
Data 360 Conference: Introduction to Big Data, Hadoop and Big Data Analytics
