SMACK Stack for Data Science Training Course

SMACK is a collection of data platform softwares, namely Apache Spark, Apache Mesos, Apache Akka, Apache Cassandra, and Apache Kafka. Using the SMACK stack, users can create and scale data processing platforms.

This instructor-led, live training (online or onsite) is aimed at data scientists who wish to use the SMACK stack to build data processing platforms for big data solutions.

By the end of this training, participants will be able to:

Implement a data pipeline architecture for processing big data.
Develop a cluster infrastructure with Apache Mesos and Docker.
Analyze data with Spark and Scala.
Manage unstructured data with Apache Cassandra.

Format of the Course

Interactive lecture and discussion.
Lots of exercises and practice.
Hands-on implementation in a live-lab environment.

Course Customization Options

To request a customized training for this course, please contact us to arrange.

This course is available as onsite live training in Botswana or online live training.

Thank you for sending your enquiry! One of our team members will contact you shortly.

Thank you for sending your booking! One of our team members will contact you shortly.

Course Outline

Introduction

SMACK Stack Overview

What is Apache Spark? Apache Spark features
What is Apache Mesos? Apache Mesos features
What is Apache Akka? Apache Akka features
What is Apache Cassandra? Apache Cassandra features
What is Apache Kafka? Apache Kafka features

Scala Language

Scala syntax and structure
Scala control flow

Preparing the Development Environment

Installing and configuring the SMACK stack
Installing and configuring Docker

Apache Akka

Using actors

Apache Cassandra

Creating a database for read operations
Working with backups and recovery

Connectors

Creating a stream
Building an Akka application
Storing data with Cassandra
Reviewing connectors

Apache Kafka

Working with clusters
Creating, publishing, and consuming messages

Apache Mesos

Allocating resources
Running clusters
Working with Apache Aurora and Docker
Running services and jobs
Deploying Spark, Cassandra, and Kafka on Mesos

Apache Spark

Managing data flows
Working with RDDs and dataframes
Performing data analysis

Troubleshooting

Handling failure of services and errors

Summary and Conclusion

Requirements

An understanding of data processing systems

Audience

Data Scientists

14 Hours

Need help picking the right course?

Testimonials (1)

very interactive...

SMACK Stack for Data Science Training Course

Course Outline

Requirements

Testimonials (1)

Richard Langford

Course - SMACK Stack for Data Science

Related Categories

This site in other countries/regions

Europe

Asia Pacific

North America

South America

Africa / Middle East

Other sites

SMACK Stack for Data Science Training Course

Course Outline

Requirements

Testimonials (1)

Richard Langford

Course - SMACK Stack for Data Science

Related Courses

Anaconda Ecosystem for Data Scientists

A Practical Introduction to Data Science

Data Science Programme

Audience:

Delivery:

Data Science for Big Data Analytics

Data Science essential for Marketing/Sales professionals

Jupyter for Data Science Teams

Kaggle

MATLAB Fundamentals, Data Science & Report Generation

Machine Learning for Data Science with Python

Accelerating Python Pandas Workflows with Modin

Python Programming for Finance

Python in Data Science

GPU Data Science with NVIDIA RAPIDS

Python and Spark for Big Data (PySpark)

Stratio: Rocket and Intelligence Modules with PySpark

Related Categories

Apache Spark

Apache Kafka

Data Science

This site in other countries/regions

Europe

Asia Pacific

North America

South America

Africa / Middle East

Other sites