Data Science Tools and Techniques


SamatriX
Enrollment in this course is by invitation only

About This Course

Apache Hadoop is a framework for distributed storage and processing of large data sets, and it is one of the most popular big data solutions. Hadoop 3 is a high-performance big data processing platform that is more fault-tolerant than earlier releases and focuses on improved scalability and efficiency.

In this course, we focus on advanced concepts of the Hadoop ecosystem tools. You will learn how Hadoop works internally, and you will study HDFS, YARN, MapReduce, Pig, Spark, and workflow management with Python.


Frequently Asked Questions

What web browser should I use?

The Samatrix Learning platform works best with current versions of Chrome, Edge, Firefox, Internet Explorer, or Safari.

See our list of supported browsers for the most up-to-date information.