100% eligible for funding with an education voucher
100% funding possible.

Data Scientist - Focus Python

Qualification for the job role as Data Scientist
Certificate of completion
Full time/part time
German, English
Course description

The certified online training Data Scientist - Focus Python enables you to derive, verify and interpret predictive models from data and to communicate the model results effectively.

The additional skill building in Machine Learning qualifies you, upon successful completion of the career path, for the role of Data Scientist or another analytical role such as Business Intelligence Analyst or Financial Analyst.

In this training you will learn
Data Analytics
Machine Learning Basics
Supervised Learning
  • Import, clean and filter data independently
  • Analyze data exploratively using descriptive statistics
  • Develop and verify complex forecasting models
  • Deepen Python programming skills
  • Build data models to predict business scenarios
  • Develop machine learning algorithms
  • Apply best practices for effective data visualization
  • Use data storytelling methods

Target group

The advanced training Data Scientist - Focus Python is suitable for anyone who wants to program in Python, analyze data and build predictions from that data in order to make data-driven decisions. You should bring a basic interest in statistics, logical thinking and machine learning. The Data Scientist training is also suitable for career changers.

Requirements for participation

  • Assessment test
  • Basic math, statistics and Python programming skills (incl. Pandas, Matplotlib).



Refreshing knowledge of Python and mathematical basics

Participants perform analysis and data manipulation in Python using the Pandas and Matplotlib packages.

Chapter 1 - Data Analytics with Python:
Participants become familiar with our interactive programming environment - the Data Lab - and brush up on key programming and Python fundamentals for data processing with Pandas, data visualization with Matplotlib and Seaborn, and database querying with SQLAlchemy.

Chapter 2 - Linear Algebra:
Participants become familiar with the mathematical background of data science algorithms and learn the basic concepts of linear algebra. Using the NumPy package, participants calculate with vectors and matrices.

Chapter 3 - Probability Distributions:
Participants learn more about the statistical background of data science algorithms. They work with important statistical concepts and learn about discrete and continuous distributions. Participants also gain insight into versioning code with Git.

Machine Learning Basics

Solving supervised and unsupervised machine learning problems with sklearn

Participants create data science workflows with sklearn, evaluate their model performance using appropriate metrics, and become aware of the problem of overfitting.

Chapter 1 - Supervised Learning (Regression):
Using linear regression, participants learn how to use the Python package sklearn. They also examine the assumptions of the regression model and evaluate the generated forecasts. Along the way, the bias-variance trade-off, concepts of regularization and various measures of model quality are covered.

Chapter 2 - Supervised Learning (Classification):
Participants are introduced to classification algorithms using the k-Nearest Neighbors algorithm and learn to evaluate the algorithm and assess classification performance. They optimize the parameters of their model while taking into account the split of the data into training and evaluation sets.

Chapter 3 - Unsupervised Learning (Clustering):
Participants learn about the k-Means algorithm as an example of an unsupervised learning algorithm. The assumptions and performance metrics of the algorithm are critically examined and a brief outlook on an alternative to k-means clustering is given.

Chapter 4 - Unsupervised Learning (Dimensionality Reduction):
Participants learn how to use Principal Component Analysis (PCA) to reduce the dimension of the data and use PCA to create uncorrelated features from the original data. In this context, the topic of feature engineering is examined in more detail and new features are generated from the old ones.

Chapter 5 - Outlier Detection:
Participants learn different approaches to identify outliers and understand how to deal with these unusual data points. They use robust measures and models to minimize the influence of outliers.
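As an illustration of what a robust measure can look like, here is a minimal sketch using only the Python standard library. It is not taken from the course material: the sample data, the function name `mad_outliers`, and the threshold of 3.5 are assumptions chosen for illustration. It compares the mean with the median and flags points using the modified z-score, which is based on the median absolute deviation (MAD).

```python
import statistics


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), both of which are barely affected by extreme points.
    The threshold of 3.5 is a common convention, assumed here.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        # All deviations collapse to zero; the score is undefined.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical measurements with one unusual data point.
data = [10, 12, 11, 13, 12, 11, 95]

mean = statistics.mean(data)      # pulled upward by the extreme value
median = statistics.median(data)  # stays close to the bulk of the data
outliers = mad_outliers(data)

print(f"mean={mean:.2f}, median={median}, outliers={outliers}")
```

The mean is dragged far away from the typical values by a single point, while the median is not, which is why median-based measures are often preferred when outliers are expected.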

Deep Dive Supervised Learning

Extending your own data science toolkit

Participants deepen their knowledge of data classification models. In doing so, they expand their skills in collecting and preparing data.

Chapter 1 - Data Gathering:
Participants learn to collect data by reading web pages and PDF documents. Using regular expressions, they structure collected text data so that they can use it together with known algorithms.
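To give an idea of how regular expressions can turn collected text into structured records, here is a minimal sketch using Python's built-in `re` module. The sample text, the pattern, and the field names are assumptions made up for illustration and do not come from the course.

```python
import re

# Hypothetical raw text, e.g. extracted from a web page or a PDF.
raw_text = """
Order 1001: 3 x Widget, total 29.97 EUR
Order 1002: 1 x Gadget, total 15.50 EUR
"""

# Named groups label each piece of information in a line.
pattern = re.compile(
    r"Order (?P<order_id>\d+): (?P<quantity>\d+) x (?P<item>\w+), "
    r"total (?P<total>\d+\.\d{2}) EUR"
)

# Convert every match into a dictionary with typed values.
records = [
    {
        "order_id": int(m["order_id"]),
        "quantity": int(m["quantity"]),
        "item": m["item"],
        "total": float(m["total"]),
    }
    for m in pattern.finditer(raw_text)
]

for record in records:
    print(record)
```

A list of dictionaries like `records` can then be loaded into a table structure, for example a Pandas DataFrame, and used with the algorithms from the earlier modules.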

Chapter 2 - Logistic Regression:
Participants learn a second classification algorithm: logistic regression. They use new performance metrics to evaluate the results and learn how to make non-numerical data usable for their models.

Chapter 3 - Decision Trees and Random Forests:
Participants learn about the decision tree as an easy-to-interpret model. They combine several models into an ensemble to improve their predictions. They are also given methods for handling imbalanced classes.

Chapter 4 - Support Vector Machines:
Participants learn about a final classification algorithm, Support Vector Machines (SVM), and examine the behavior of different SVM kernels. They also learn the typical steps of Natural Language Processing (NLP) and work on an NLP scenario using bag-of-words models.
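The idea behind a bag-of-words model can be sketched in a few lines of plain Python. This is a simplified illustration, not the course's implementation: the example sentences, the tokenizer, and the function names are assumptions. Each document becomes a vector of word counts over a shared vocabulary, and word order is ignored.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


# Hypothetical mini corpus.
docs = ["The model fits the data", "The data is clean"]

# The vocabulary is the sorted set of all tokens in the corpus.
vocabulary = sorted({token for doc in docs for token in tokenize(doc)})


def to_vector(text):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


vectors = [to_vector(doc) for doc in docs]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Numeric vectors of this kind are what a classifier such as an SVM receives as input in a text classification scenario.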

Chapter 5 - Neural Networks:
Participants are introduced to artificial neural networks and Deep Learning, create a multilayer artificial neural network, and apply it to a data set.

Advanced Topics in Data Science

Independently apply simple and complex modeling techniques

Participants gain confidence in solving data science problems and learn to communicate results competently.

Chapter 1 - Visualization and Model Interpretation:
Participants learn important methods for interpreting and visualizing machine learning models. By using model-agnostic interpretation methods, they learn to derive and communicate insights into how their models work.

Chapter 2 - Spark:
Participants learn why working with distributed storage systems is relevant. With the Python package PySpark, they learn to read distributed databases, perform Big Data analysis, and use well-known machine learning algorithms on distributed systems.

Chapter 3 - Exercise Project:
Participants work on a prediction problem using a larger data set and independently apply their data science skills from cleaning the data set to interpreting the model. Participants receive feedback on their solution approach in a project meeting with StackFuel's mentoring team.

Chapter 4 - Final Project:
Participants receive another larger data set, which they analyze independently, solving the associated problem with less assistance than in the exercise project. In an individual project meeting with the StackFuel mentoring team, participants receive feedback on their solution approach.

The StackFuel Data Lab offers me real added value. Here you can feel the practical relevance particularly well. The tasks were always clearly described and presented. So I always knew what I had to do. The training itself was a great experience!
Alexander Gross
Data Analyst at AIC Portaltechnik
The greatest added value for me is the practical relevance. Thanks to StackFuel, I can quickly implement what I've learned and adapt it for myself. That is the real learning success behind the online trainings.
Lutz Schneider
Strategic IT Buyer at Axel Springer SE
The content of StackFuel's online training was very practical. There were many good examples and projects. I found that very interesting and instructive. Since the training, my everyday professional life has changed significantly: I am now a data analytics specialist in my department.
Jaroslaw Wojciech Sulak
Specialist for data analysis at IAV GmbH
The user-friendly and flexible Python programming training has completely changed my view of complex data structures. Thanks to the sustainable and well thought-out learning concept as well as the seamless application of the learning content in the development environment, I can now apply what I have learned in greater depth in my everyday job in test automation and process data more easily and efficiently.
Jenny Lindenau
Technical Manager Test Management at Bank Deutsches Kraftfahrzeuggewerbe GmbH

Let's start with a consultation.

Our consultants will be happy to help you and answer all your questions. Free of charge and without obligation. We look forward to meeting you.
(incl. VAT)
0 € with education voucher