Hadoop Ecosystem

Mind Map: Hadoop Ecosystem

1. Oozie Bundle provides a way to package multiple coordinator and workflow jobs and to manage the lifecycle of those jobs

2. Oozie Workflow jobs are Directed Acyclic Graphs (DAGs), specifying a sequence of actions to execute. The Workflow job has to wait
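To make "a DAG specifying a sequence of actions" concrete, here is a minimal Python sketch that derives a valid execution order from action dependencies. The action names and the `execution_order` helper are hypothetical, chosen only for illustration; real Oozie workflows are defined in XML, not in Python.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each action maps to the set of actions it depends on.
workflow = {
    "import_logs": set(),
    "clean_logs": {"import_logs"},
    "aggregate": {"clean_logs"},
    "export_report": {"aggregate"},
}

def execution_order(dag):
    """Return the actions in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())

print(execution_order(workflow))
```

Because the graph is acyclic, such an order always exists; a cycle would make `TopologicalSorter` raise an error instead.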

3. Hive

3.1. SQL-like querying

3.2. Combiner can pre-aggregate map output to reduce the data sent to reducers

3.3. Structured data warehousing

3.4. Partition columns instead of indexes
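The idea behind partition columns can be sketched in a few lines of Python: data is grouped by partition value, so a filter on that column reads only the matching group rather than the whole table. The table contents and the `scan` helper are hypothetical; the `dt=...` keys mirror the key=value directory naming Hive uses for partitions.

```python
# Hypothetical table laid out as one group of rows per partition value.
partitions = {
    "dt=2024-01-01": [{"user": "a", "amount": 10}],
    "dt=2024-01-02": [{"user": "b", "amount": 20}, {"user": "a", "amount": 5}],
}

def scan(table, dt):
    """Read only the partition matching the filter, not the whole table."""
    return table.get(f"dt={dt}", [])

total = sum(row["amount"] for row in scan(partitions, "2024-01-02"))
print(total)
```

A filter on a non-partition column, by contrast, would still have to read every row in the partitions it touches.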

4. Pig

4.1. Scripting for Hadoop

5. HBase

5.1. Non-relational

5.2. Column store

5.3. Transactional lookups
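As a rough illustration of the data model behind these points, the sketch below treats a table as a map from row key to cells addressed by `family:qualifier`. It is a hypothetical in-memory stand-in, not the HBase client API, and it leaves out versions, timestamps, and distribution across servers.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in: row key -> {"family:qualifier": value}.
table = defaultdict(dict)

def put(row_key, column, value):
    """Write one cell; each row can hold a different set of columns."""
    table[row_key][column] = value

def get(row_key, column):
    """Look up a single cell by row key and column, or None if absent."""
    return table.get(row_key, {}).get(column)

put("user#42", "info:name", "Ada")
put("user#42", "stats:logins", 7)
put("user#43", "info:name", "Linus")  # no "stats" cells for this row

print(get("user#42", "stats:logins"))
print(get("user#43", "stats:logins"))
```

Rows need not share the same columns, which is the main contrast with a relational table.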

6. Flume

6.1. Log collector

6.2. Integrates into Hadoop

7. Oozie

7.1. Workflow processing

7.2. Links jobs

7.3. Coordinator jobs are recurrent Oozie Workflow jobs that are triggered by time and data availability.
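The two triggers named above, time and data availability, can be sketched as follows. All names (`due_runs`, `ready_to_run`, the dataset labels) are hypothetical and this is not Oozie's API; it only shows a recurring schedule gated on whether the input for each run exists.

```python
from datetime import datetime, timedelta

def due_runs(start, end, frequency):
    """Yield the nominal times at which a recurring job should fire."""
    current = start
    while current <= end:
        yield current
        current += frequency

def ready_to_run(nominal_time, available_datasets):
    """A run can start only if its input dataset for that time exists."""
    return nominal_time.strftime("%Y-%m-%d") in available_datasets

start = datetime(2024, 1, 1)
available = {"2024-01-01", "2024-01-03"}  # hypothetical dataset instances
runs = [t for t in due_runs(start, start + timedelta(days=2), timedelta(days=1))
        if ready_to_run(t, available)]
print([t.date().isoformat() for t in runs])
```

The sketch only reports which runs are ready at the moment of the check; a scheduler would keep the remaining run pending until its data arrives.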

8. Avro

8.1. Data parsing

8.2. Binary data serialization

8.3. RPC

8.4. Language-neutral

8.5. Optional codegen

8.6. Schema evolution

8.7. Untagged data

8.8. Dynamic typing
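Schema evolution is the least self-explanatory item in this list, so here is a plain-Python sketch of the idea: a record written with an older schema is read with a newer one, and fields the writer did not know about are filled from the reader's defaults. The schemas and the `resolve` helper are hypothetical and do not use the Avro library; type promotion, aliases, and nested types are ignored.

```python
# Hypothetical schemas, written as plain dicts in the shape Avro uses for records.
writer_schema = {"type": "record", "name": "User",
                 "fields": [{"name": "name", "type": "string"}]}
reader_schema = {"type": "record", "name": "User",
                 "fields": [{"name": "name", "type": "string"},
                            {"name": "country", "type": "string",
                             "default": "unknown"}]}

def resolve(record, writer, reader):
    """Project a record written with `writer` onto the `reader` schema."""
    written = {f["name"] for f in writer["fields"]}
    resolved = {}
    for field in reader["fields"]:
        name = field["name"]
        if name in written:
            resolved[name] = record[name]        # value present in the data
        elif "default" in field:
            resolved[name] = field["default"]    # added later, use the default
        else:
            raise ValueError(f"no value or default for field {name!r}")
    return resolved

print(resolve({"name": "Ada"}, writer_schema, reader_schema))
```

A field added without a default cannot be resolved against old data, which is why the sketch raises an error in that case.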

9. Mahout

9.1. Machine learning

9.2. Applied to MR

10. Sqoop

10.1. Connects non-Hadoop stores (RDBMS)

10.2. Moves data between RDBMS and Hadoop

10.3. Auto-generates Java InputFormat code for data access

11. MapReduce

11.1. Distributed compute

11.2. Maps query onto nodes

11.3. Reduces aggregated results into answers
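The map and reduce steps can be illustrated with the classic word count, run here in a single Python process. The function names and input splits are hypothetical; on a cluster each split would be mapped on a different node, and the optional `combine` step would run next to the mapper to shrink the data sent to reducers.

```python
from collections import Counter, defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def combine(pairs):
    """Pre-aggregate pairs on the mapper's side to shrink shuffle traffic."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return list(local.items())

def reduce_phase(all_pairs):
    """Sum the counts for each word across all mappers."""
    totals = defaultdict(int)
    for word, count in all_pairs:
        totals[word] += count
    return dict(totals)

splits = ["big data big cluster", "data node data"]  # hypothetical input splits
shuffled = [pair for split in splits for pair in combine(map_phase(split))]
word_counts = reduce_phase(shuffled)
print(word_counts)
```

With the combiner, five pairs reach the reduce step instead of the seven emitted by the mappers, while the final counts stay the same.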

12. Ambari

12.1. Cluster deployment and admin

12.2. Driven by Hortonworks

13. ZooKeeper

13.1. Coordinator of shared state between apps

13.2. Naming, configuration, and synchronization services

14. YARN

14.1. Cluster management

14.2. Introduced in Hadoop 2

14.3. Resource manager

14.4. Job scheduler

15. BigTop

15.1. Packages the Hadoop ecosystem

15.2. Tests Hadoop ecosystem packages

16. Related Apache Ecosystems

17. HDFS

17.1. Distributed storage

17.2. Java-based filesystem

18. Spark

19. Impala

19.1. SQL query engine

19.2. Query data stored in HDFS and HBase

19.3. Low-latency (near real-time) queries

20. Cascading

20.1. Higher abstraction from MR

20.2. Creates Flow that assembles Map/Reduce jobs