About this course
Python is a very powerful programming language used for many different applications. Over time, the huge community around this open source language has created quite a few tools to efficiently work with Python. In recent years, a number of tools have been built specifically for data science. As a result, analyzing data with Python has never been easier.
In this practical course, you will start from the very beginning, with basic arithmetic and variables, and learn how to handle data structures such as Python lists, NumPy arrays, and pandas DataFrames. Along the way, you’ll learn about Python functions and control flow. You’ll also explore data visualization with Python and create your own visualizations based on real data.
Programming in Python
Working with Data in Python
Data Modelling using Machine Learning
1. Case Study on online credit card fraud detection
Industry: Banking, Finance and Economics
Description: In this case study, we will focus on a particular form of credit card fraud: purchases from an online store. We assume that for some higher-value transactions, retailers require customers to call in and confirm their credit card details. We then identify the fraudulent merchant from the data provided. To catch the thief, you need to find the merchant at which all the affected customers shopped before the first fraudulent transaction occurred against their credit cards.
Dataset: The dataset consists of data for 1,000 customers and 20 merchants. Over a period of 50 days, customers made over 225 K transactions for a total value of over $57 M.
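The search described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the course's actual solution: the record layout `(customer_id, merchant_id, day, is_fraud)`, the sample rows, and the function name `find_common_merchants` are all assumptions made for the example.

```python
# Hypothetical transaction records: (customer_id, merchant_id, day, is_fraud).
# The real course dataset may use different fields.
transactions = [
    ("c1", "m1", 1, False),
    ("c1", "m3", 2, False),
    ("c1", "m9", 5, True),
    ("c2", "m3", 3, False),
    ("c2", "m2", 4, False),
    ("c2", "m9", 6, True),
    ("c3", "m3", 1, False),
    ("c3", "m1", 2, False),
    ("c3", "m8", 7, True),
    ("c4", "m1", 1, False),  # this customer was never defrauded
]


def find_common_merchants(records):
    """Return merchants visited by every affected customer before their first fraud."""
    # Step 1: find the day of the first fraudulent transaction per customer.
    first_fraud = {}
    for customer, _merchant, day, is_fraud in records:
        if is_fraud:
            first_fraud[customer] = min(day, first_fraud.get(customer, day))

    # Step 2: collect merchants each affected customer used before that day.
    visited = {customer: set() for customer in first_fraud}
    for customer, merchant, day, is_fraud in records:
        if customer in first_fraud and not is_fraud and day < first_fraud[customer]:
            visited[customer].add(merchant)

    # Step 3: the suspect merchant must appear in every affected customer's set.
    if not visited:
        return set()
    return set.intersection(*visited.values())


suspects = find_common_merchants(transactions)
print(suspects)
```

With the sample rows above, only one merchant is shared by all three affected customers. On real data the intersection may contain several merchants or none, in which case further filtering would be needed.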
2. Case Study on classifying the outbound call data of a bank
Industry: Banking, Telemarketing
Description: In this case study, we will classify the outbound calls of a bank to predict whether a call will result in a credit application. We will use popular classification methods: Gradient Boosting, Naïve Bayes, Generalized Linear Model, and Random Forest. We will compare the performance of these methods using various performance and cost metrics, for example precision, recall, F1-score, and the Receiver Operating Characteristic (ROC) curve.
Dataset: We will use data related to the direct marketing campaigns (phone calls) of a Portuguese banking institution. The dataset has 45,211 records across 17 attributes, ordered by date (from May 2008 to November 2010).
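To make the comparison metrics concrete, here is a minimal sketch that computes precision, recall, and F1-score from true and predicted labels using only the standard library. The label lists and the function name `classification_metrics` are made up for illustration; they are not taken from the bank dataset or the course materials.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = call led to a credit application, 0 = it did not.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall, f1 = classification_metrics(y_true, y_pred)
print(precision, recall, f1)
```

Precision answers "of the calls predicted to succeed, how many did?", while recall answers "of the calls that succeeded, how many were predicted?". F1 is their harmonic mean.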
3. Case Study on forecasting river flow using time series models
Industry: Natural resource management
Description: We will explore various techniques for handling, analyzing, and building models for time series data. We will use the autoregressive moving average (ARMA) model and its generalization, the autoregressive integrated moving average (ARIMA) model, to forecast future values from time series data.
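As a rough intuition for how the "integrated" and "autoregressive" parts fit together, the sketch below differences a series once and fits a single autoregressive coefficient by least squares, which corresponds to a simplified ARIMA(1,1,0) with no moving-average term and no intercept. It is a teaching sketch under those assumptions, not the model used in the course; the flow values and function names are hypothetical.

```python
def difference(series):
    """First-order differencing: the 'integrated' step with d = 1."""
    return [current - previous for previous, current in zip(series, series[1:])]


def fit_ar1(values):
    """Least-squares estimate of phi in x_t = phi * x_(t-1) + noise (no intercept)."""
    numerator = sum(prev * cur for prev, cur in zip(values, values[1:]))
    denominator = sum(prev * prev for prev in values[:-1])
    return numerator / denominator if denominator else 0.0


def forecast_next(series):
    """One-step-ahead forecast: predict the next difference, then undo the differencing."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1]


# Hypothetical river flow readings, not taken from the course dataset.
flow = [10.0, 12.0, 13.0, 13.5, 13.75]

next_value = forecast_next(flow)
print(next_value)
```

In practice, a dedicated time series library would also estimate the moving-average terms and select the model orders, which this sketch deliberately leaves out.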