CSSE 434 - Introduction to the Hadoop Ecosystem
- Credit Hours: 4R-0L-4C
- Term Available: -
- Prerequisites: CSSE 230 - Data Structures and Algorithm Analysis (some experience with SQL recommended)
- Corequisites: None
This advanced course examines emerging Big Data techniques through hands-on introductions to the technologies and tools that make up the Hadoop ecosystem. Topics include the internals of MapReduce and the Hadoop Distributed File System (HDFS); the internals of the YARN distributed operating system; MapReduce for data processing; transformation and analysis tools for data at scale (processing terabytes and petabytes of information quickly); job scheduling with workflow engines; data transfer tools; and real-time engines for data processing.