
Python, Data Science & AI

Develop the skills to execute financial data science workflows in Python, applying machine learning techniques to tasks ranging from forecasting EPS changes to sentiment analysis.

Structure & duration

10–20 hours to complete

Online self-paced

Practical Skills Modules can be completed online at your own pace.

Prerequisites

We recommend candidates have basic familiarity with Python and with the CFA Level II Machine Learning curriculum.

Available to Level II and Level III candidates.

Overview of Python, Data Science & AI

In Python, Data Science & AI, you will develop the skills to use Jupyter Notebooks for developing, presenting, and sharing data science and artificial intelligence (AI) projects. Python is known for its simplicity, scalability, and open-source modules. Data science employs tasks like data cleaning, visualization, and modeling to inform decisions. AI empowers machines to mimic human intelligence, including natural language understanding, decision making, and object recognition.

In this Practical Skills Module, you will follow the data science workflow from financial data ingestion to training artificial neural networks. Rather than studying the deep theoretical math of data science, each PSM unit provides a high-level overview of data science tools and practical, code-based solutions for investment professionals. You will pull financial data and prepare it with standard tools and techniques to deliver insights, work through an example of forecasting the percentage change in EPS, and explore sentiment analysis, a common natural language processing task. This module also includes an overview of Python Programming Fundamentals, which is offered as a separate PSM for Level I and is available for those needing a full review.

Over the course of this module, you will be guided through a series of videos, knowledge check questions, and projects to quickly build up your practical understanding of Python, data science, and AI concepts while applying them to industry-specific examples. After completing this module, you will have the tools to apply what you’ve learned immediately.

Key learning objectives for Python, Data Science & AI

  • Use Jupyter Notebook for developing, presenting, and sharing data science and artificial intelligence projects.
  • Perform text data encoding, tokenization, and feature engineering.
  • Train and evaluate feedforward and recurrent artificial neural networks to solve regression and classification machine learning problems.
  • Explore the underlying theory, intuition, and mathematics behind artificial neural networks and deep learning.
  • Assess the performance of trained machine learning regression and classification models using various key performance indicators (KPIs).
  • Perform hyperparameter optimization using GridSearchCV to improve machine learning model performance.
  • Master feature engineering and data cleaning strategies for machine learning and data science applications.
  • Master the scikit-learn library to build, train, and test machine learning models using real-world datasets.
  • Describe simple and multiple linear regression models and the roles of dependent and independent variables in the model.
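
To make the regression objective above concrete, here is a minimal sketch of a simple linear regression fitted by ordinary least squares in plain Python. The module itself works with scikit-learn; the function name `fit_simple_ols` and the sample numbers are hypothetical and chosen only for illustration. The independent variable is `revenue_growth` and the dependent variable is `eps_change`.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Co-variation of x and y, and variation of x, around their means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical, made-up data: not taken from the module
revenue_growth = [1.0, 2.0, 3.0, 4.0]   # independent variable
eps_change = [3.0, 5.0, 7.0, 9.0]       # dependent variable

intercept, slope = fit_simple_ols(revenue_growth, eps_change)
print(f"eps_change = {intercept:.2f} + {slope:.2f} * revenue_growth")
```

Multiple linear regression extends the same idea to several independent variables, which is where a library such as scikit-learn becomes the practical choice.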

Key learning objectives by unit

    • Learn how to define Python variables, perform math operations, and leverage Python’s print() and input() functions.
    • Produce syntactically correct Python code using list comprehension and “for” loops.
    • Describe the syntax and use cases of user-defined, built-in, and lambda functions in Python and learn how to call these functions, send them arguments, and receive data from them.
    • Obtain companies’ financial data, such as balance sheets, income statements, and cash flow statements, using the SimFin Platform.
    • Master data wrangling and feature engineering strategies for machine learning and data science applications.
    • Perform data merging using Pandas.
    • Locate, count, and handle missing values.
    • Perform one-hot encoding, which works by converting categorical data into numeric variables to be used as inputs to machine learning models.
    • Describe simple and multiple linear regression models and the roles of dependent and independent variables in the model.
    • Describe the least-squares criterion and how it is used to estimate regression coefficients.
    • Master the scikit-learn library to build, train, and test machine learning models using real-world datasets to solve problems in the finance and banking sectors.
    • Discover the underlying theory, intuition, and mathematics behind artificial neural networks and deep learning.
    • Train and evaluate feedforward ANNs to solve regression machine learning problems in finance.
    • Examine the intuition behind the random forest algorithm and use it to solve regression problems using the scikit-learn library.
    • Explore the intuition behind boosting and leverage the XGBoost algorithm to solve regression problems.
    • Perform hyperparameter optimization to improve machine learning regression model performance.
    • Explain the concept of text data encoding and padding.
    • Develop a pipeline to perform text data encoding/tokenization and padding in Python.
    • Perform text data cleaning.
    • Plot a word cloud, which is a powerful visual representation of text data in which the size of each word indicates its importance in the given dataset.
    • Split the data into training and testing sets using the scikit-learn library.
    • Train and evaluate a custom-built long short-term memory (LSTM) network to perform sentiment analysis on news test data.
    • Perform sentiment analysis using off-the-shelf pre-trained language models.
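
The text encoding, tokenization, and padding objectives above can be sketched in plain Python as follows. This is an illustrative sketch only, not the module's own pipeline: the helper names (`tokenize`, `build_vocab`, `encode_and_pad`), the reserved ids 0 (padding) and 1 (unknown word), the choice to pad and truncate at the end of each sequence, and the sample headlines are all assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(texts):
    """Map each distinct token to an integer id (0 = padding, 1 = unknown)."""
    vocab = {}
    for text in texts:
        for token in tokenize(text):
            if token not in vocab:
                vocab[token] = len(vocab) + 2
    return vocab


def encode_and_pad(texts, vocab, max_len):
    """Encode texts as id sequences, truncated or zero-padded to max_len."""
    padded = []
    for text in texts:
        ids = [vocab.get(token, 1) for token in tokenize(text)][:max_len]
        padded.append(ids + [0] * (max_len - len(ids)))
    return padded


# Hypothetical headlines, made up for illustration
headlines = ["Profits beat estimates", "Shares fall as profits miss"]
vocab = build_vocab(headlines)
sequences = encode_and_pad(headlines, vocab, max_len=6)
print(sequences)  # [[2, 3, 4, 0, 0, 0], [5, 6, 7, 2, 8, 0]]
```

Fixed-length integer sequences like these are the usual input format for a recurrent network such as an LSTM, which is why padding follows encoding in the workflow.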
