Qualification for the job role of data scientist

Data Analytics & Data Science (certification course)

Course description

The Data Analytics & Data Science certification course trains you to become a data scientist. You learn to independently clean, process, and visualize data and to develop and verify complex forecasting models. Alongside Python programming skills, the machine learning concepts taught prepare you for analytical roles, e.g., (junior) data analyst, (junior) business intelligence analyst, or (junior) financial analyst, and for advanced analytical roles, e.g., (junior) data scientist or (junior) machine learning engineer. The Data Scientist certification documents your new knowledge and skills and supports a successful career change.

In this course you will learn:

  • How to independently scan, clean, and filter data
  • How to explore and analyze data using descriptive statistics
  • How to develop and verify complex prediction models
  • How to create scripts in the Python programming language
  • How to build data models to predict business use cases
  • How to develop machine learning algorithms
  • Best practices for effective data visualization
  • Data storytelling methods

    Target Audience

    The Data Analytics & Data Science certification course is aimed at job seekers, unemployed people, and people on furlough or at risk of imminent job loss who would like to have the course funded with an education voucher from the employment agency or the job center.

    Prerequisites for participation 

    No prior programming skills are required. Before the course, you must pass a placement test in mathematics and statistics and demonstrate a B2 level of German and an A2 level of English.
    Full-time (35 hrs./week)
    4 months (518 units)
    5 modules + hands-on projects
    Beginner
    German or English
    Certificate of completion
    0 € with education voucher
    12.09.2022
    What to Expect

    Course Overview

    Python skills

    Python is one of the most widely used programming languages for machine learning and data science and is relatively easy to learn, even for newcomers.

    Interactive tasks and final project

    Apply your knowledge in interactive practical tasks and a final project in which you implement your own data science project with real data sets.

    Job qualification

    With this training, we qualify you directly for the job role as a data scientist and other analytical roles in data teams.

    Modules

    Objective:
    Introduction to programming with Python

    Description:
    Participants become familiar with the interactive learning platform
    – StackFuel’s Data Lab – and the Python programming language.

    Chapter 1 – Python Basics:
    Participants navigate through the Data Lab for the first time and
    become familiar with the basics of programming. They learn to
    store numbers and text as variables in Python and to bundle them
    as groups in lists. They round off these basics by learning how
    to read error messages properly.
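
    The topics of this chapter can be illustrated with a minimal sketch. The variable names and values below are made up for illustration and are not taken from the course material.

```python
# Store a number and a text as variables (example values are made up).
price = 19.99
product = "notebook"

# Bundle several values as a group in a list.
prices = [19.99, 4.50, 7.25]
total = sum(prices)
print(product, "costs", price)
print("Total of all prices:", round(total, 2))

# Reading an error message: accessing an index that does not exist
# raises an IndexError. The message names the error type and the cause.
try:
    prices[5]
except IndexError as error:
    message = f"{type(error).__name__}: {error}"
    print(message)
```

    The last line of a Python error message usually states the error type and a short reason, which is a good place to start reading.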

    Chapter 2 – Programming Basics:
    Participants continue to build on their programming basics. This
    chapter focuses on the use of functions and methods as well as
    control flow using conditions.
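
    As a rough illustration of functions, methods, and conditions, consider the sketch below. The function `classify_rating` and its thresholds are hypothetical and chosen only for this example.

```python
def classify_rating(stars):
    """Return a label for a star rating from 1 to 5 (thresholds are made up)."""
    if stars >= 4:
        return "good"
    elif stars >= 2:
        return "average"
    else:
        return "poor"

# A method is a function that belongs to an object, here a string.
review = "  Great Product  "
cleaned = review.strip().lower()

print(cleaned, "->", classify_rating(5))
```

    The function is called with an argument, while the methods `strip` and `lower` are called on the string itself.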

    Chapter 3 – Loops and Functions:
    The last chapter of the Basics module is dedicated to control flow
    using loops. Participants extend the functionality available to
    them by importing additional Python packages and gain insight
    into code versioning with Git. By the end of the chapter,
    participants know the programming concepts that matter most for
    working as a data analyst.

    Objective:
    Independent collection, analysis and visualization of data
    with Python.

    Description:
    Participants learn how to access, filter and merge new data sources.
    They will practice making company data accessible interactively
    in dynamic dashboards and independently perform classic data
    processing operations (importing, filtering, cleaning, and
    visualizing data).

    Chapter 1 – Data Pipelines (Pandas):
    This chapter teaches participants how to efficiently use Pandas – the
    standard tool of a data analyst in Python. Participants learn to use it
    to read, clean, and aggregate data in CSV files.
    As the second module begins, participants will receive assistance in
    optimizing their online presence as a data analyst.

    Chapter 2 – Data Exploration (Matplotlib):
    Participants practice visualizing different types of data using
    marketing data. Numerical data is represented as histograms and
    scatter plots, while categorical data is represented as column
    and pie charts.

    Chapter 3 – Predictions (Statistics):
    Participants learn statistical concepts such as medians and
    quartiles using product ratings. They identify outliers and create
    simple predictions using linear and logistic regression. In addition,
    participants focus on creating their own data analytics portfolio, and
    they receive practical tips about this.
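
    The median, quartile, and outlier concepts from this chapter can be sketched with Python's standard library. The data below is made up, and the 1.5 × IQR rule is one common convention for flagging outliers; the course may use a different approach.

```python
import statistics

# Hypothetical number of ratings received per product.
rating_counts = [12, 15, 14, 10, 13, 16, 11, 95]

median = statistics.median(rating_counts)

# Quartiles; "inclusive" treats the data as the full population.
q1, q2, q3 = statistics.quantiles(rating_counts, n=4, method="inclusive")
iqr = q3 - q1

# Flag values more than 1.5 × IQR outside the quartiles as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in rating_counts if x < lower_fence or x > upper_fence]

print("Median:", median)
print("Quartiles:", q1, q2, q3)
print("Outliers:", outliers)
```

    Because the median and quartiles depend on rank rather than magnitude, the single extreme value barely affects them, which is why they are often preferred over the mean for skewed data.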

    Chapter 4 – Internal Data (SQL):
    The participants learn to read databases using the example of an
    employee database and to formulate standard SQL queries.
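
    A standard SQL query against an employee table might look like the sketch below. It uses Python's built-in `sqlite3` module with an in-memory database; the table layout and the rows are assumptions for illustration, not the database used in the course.

```python
import sqlite3

# In-memory database with a hypothetical employee table.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE employees (name TEXT, department TEXT, salary REAL)"
)
connection.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ada", "Engineering", 72000),
        ("Ben", "Engineering", 65000),
        ("Cleo", "Sales", 58000),
    ],
)

# A standard query: average salary per department, highest first.
rows = connection.execute(
    """
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
    ORDER BY avg_salary DESC
    """
).fetchall()
connection.close()

for department, avg_salary in rows:
    print(department, avg_salary)
```

    The `?` placeholders pass values separately from the SQL text, which avoids building queries by string concatenation.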

    Chapter 5 – External Data (API):
    Participants use Python to access information on the Internet,
    such as web pages and APIs designed by StackFuel.

    Chapter 6 – Advanced Jupyter:
    Participants learn Jupyter functionalities and solve advanced
    visualization problems such as live updates and interactivity in
    the context of a stock market scenario.

    Chapter 7 – Exercise Project:
    Participants analyze a New York taxi data set with over one million
    trips and use their Python skills as independently as possible to
    answer given questions.

    Chapter 8 – Final Project:
    Participants analyze customer churn for a telecommunications
    company. They work through the entire data pipeline
    independently and answer typical questions. They present their
    project in a 1-on-1 feedback session with StackFuel’s
    mentoring team.

    Objective:
    Solving supervised and unsupervised machine learning
    problems with sklearn.

    Description:
    Participants create data science workflows with sklearn, evaluate
    their model performance using appropriate metrics, and become
    aware of the problem of overfitting.

    Chapter 1 – Supervised Learning: Regression:
    Using linear regression, participants learn how to use the Python
    package sklearn. Furthermore, they deal with the assumptions of the
    regression model and the evaluation of the generated predictions.
    Participants learn about the bias-variance trade-off, regularization,
    and various metrics of model quality.

    Chapter 2 – Supervised Learning: Classification:
    Participants are introduced to classification algorithms using the
    k-Nearest-Neighbors algorithm and learn to evaluate the algorithm
    and assess classification performance. They optimize the parameters
    of their model and pay attention to dividing the data into training
    and evaluation sets.
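
    The idea behind k-Nearest-Neighbors and a held-out evaluation set can be sketched from scratch in plain Python. The course itself works with sklearn; the helper `knn_predict`, the data, and the split below are hypothetical and only illustrate the principle.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, point, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    distances = sorted(
        (math.dist(p, point), label)
        for p, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up 2D data: two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

# Hold out two points for evaluation; train on the rest.
eval_idx = {3, 7}
train_points = [p for i, p in enumerate(points) if i not in eval_idx]
train_labels = [l for i, l in enumerate(labels) if i not in eval_idx]

correct = sum(
    knn_predict(train_points, train_labels, points[i]) == labels[i]
    for i in eval_idx
)
accuracy = correct / len(eval_idx)
print("Accuracy on held-out points:", accuracy)
```

    Evaluating only on points the model has not seen gives a more honest estimate of performance than scoring the training data, and `k` is the kind of parameter that gets tuned.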

    Chapter 3 – Unsupervised Learning: Clustering:
    Participants learn about the k-Means algorithm as an example of an
    unsupervised learning algorithm. The assumptions and performance
    metrics of the algorithm are critically examined and a brief look is
    taken at an alternative to k-Means clustering.

    Chapter 4 – Unsupervised Learning: Dimensionality Reduction:
    Participants learn how to reduce the dimension of the data using
    Principal Component Analysis (PCA) and use PCA to generate
    uncorrelated features from the original data. In this context, the
    topic of feature engineering is explored in more detail and new
    features are generated from the existing ones.

    Chapter 5 – Outlier Detection:
    Participants learn about different approaches to identifying
    outliers and understand how to deal with these unusual data
    points. They use robust measures and models to minimize the
    impact of outliers.

    Objective:
    Expanding the data science toolkit.

    Description:
    Participants deepen their knowledge of data classification models. In
    doing so, they expand their skills in collecting and preparing data.

    Chapter 1 – Data Gathering:
    Participants learn to gather data by mining web pages and PDF
    documents. They structure collected text data using regular
    expressions so that they can use it in conjunction with familiar
    algorithms. As they begin this module, participants will receive help
    in optimizing their online presence as a data scientist.
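
    Structuring collected text with regular expressions could look like the following sketch. The input text and the pattern are invented for illustration; real scraped pages and PDFs are usually messier.

```python
import re

# Made-up text as it might come from a scraped page or PDF.
raw_text = "Order 1042 shipped on 2022-08-22. Order 1043 shipped on 2022-09-12."

# Capture the order number and the ISO date from each sentence.
pattern = re.compile(r"Order (\d+) shipped on (\d{4}-\d{2}-\d{2})")
records = [
    {"order_id": int(order_id), "shipped": date}
    for order_id, date in pattern.findall(raw_text)
]
print(records)
```

    Once the text is turned into records like these, it can be loaded into a table and used with the algorithms from earlier modules.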

    Chapter 2 – Logistic Regression:
    Participants learn a second classification algorithm: logistic
    regression. They use new performance metrics to evaluate results
    and learn how to prepare non-numeric data for their models.

    Chapter 3 – Decision Trees and Random Forests:
    Participants learn about the decision tree as an easy-to-interpret
    model. They combine multiple models in an ensemble to improve
    the predictions of their model. They also learn methods to deal
    with unbalanced categories. In addition, participants will focus on
    creating their own data science portfolio, and receive practical
    tips for this.

    Chapter 4 – Support Vector Machines:
    Participants learn about a final classification algorithm, Support
    Vector Machines (SVM), and examine the behavior of different
    kernels for SVM. They also learn the typical steps of Natural
    Language Processing (NLP) and work through an NLP scenario
    using bag-of-words models.
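
    A bag-of-words model represents each text by its word counts and ignores word order. The sketch below shows the idea with the standard library only; the helper `bag_of_words`, the tokenization rule, and the example sentences are assumptions, not the course's implementation.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Count lowercase word occurrences, ignoring order and punctuation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

documents = ["The product is good.", "Not good, not good at all!"]
bags = [bag_of_words(doc) for doc in documents]

# Shared vocabulary, then one count vector per document.
vocabulary = sorted(set().union(*bags))
vectors = [[bag[word] for word in vocabulary] for bag in bags]
print(vocabulary)
print(vectors)
```

    The resulting count vectors have the same length for every document, so they can serve as numeric input for a classifier such as an SVM.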

    Chapter 5 – Neural Networks:
    Participants are introduced to artificial neural networks and
    deep learning. They create a multilayer artificial neural
    network and apply it to a data set.

    Objective:
    Independent application of simple and complex modeling.

    Description:
    Participants gain confidence in solving data science problems and
    learn to communicate results competently.

    Chapter 1 – Visualization and Model Interpretation:
    Participants learn important methods for interpreting and visualizing
    machine learning models. By using model-agnostic methods for
    interpretation, they learn to derive and communicate insights into
    how their models work.

    Chapter 2 – Spark:
    Participants learn why working with distributed memory systems
    is relevant. Using the Python package PySpark, they learn how to
    read distributed databases, perform big data analysis, and use
    well-known machine learning algorithms on distributed systems.

    Chapter 3 – Exercise Project:
    Participants work on a prediction problem using a larger data set and
    independently apply their data science skills from cleaning the data
    set to interpreting the model. Participants receive feedback on their
    approach to solving the problem in an individual project consultation
    with StackFuel’s mentoring team.

    Chapter 4 – Final Project:
    Participants are given another larger data set to analyze
    independently, with less assistance than they received for the
    exercise project. Participants receive feedback on their solution
    approach in an individual project consultation with the StackFuel
    mentoring team.

    Start Dates

    22.08.2022
    Duration: 4 Months

    LEARNING ENVIRONMENT

    Train online in your browser on our interactive learning platform.

    StackFuel provides an interactive learning environment with real-world exercises for building your data skills. Learn to code in our Data Lab, develop algorithms, and automate tasks using real industry data sets. Our courses consist of 80% practical content.

    Check out our other courses.

    BI Analyst – Focus Power BI
    Qualification for the Job Role of a Business Intelligence Analyst
    Type: Online course
    Duration: 72 hours (4 months)

    Analytics & Reporting – Focus Power BI
    Guide to Effective Dashboards in Power BI
    Type: Online course
    Duration: 32 hours (4 weeks)

    Python – Focus Object-Oriented Programming
    Design Principles for Software Development with Object-Oriented Programming
    Type: Online course
    Duration: 32 hours

    15-minute introductory meeting

    Let us advise you personally!