Data Science: Credit Card Fraud Detection – Model Building
Project: A hands-on Data Science project on credit card fraud detection that uses different sampling and model building techniques to identify fraudulent transactions.
What you’ll learn
- Analyzing and understanding the data.
- Techniques for preparing data for use.
- Building models with Logistic Regression, KNN, Decision Tree, Random Forest, XGBoost, and SVM.
- RepeatedKFold and StratifiedKFold cross-validation.
- Three oversampling techniques: Random Oversampler, SMOTE, and ADASYN.
- Metrics for Classification.
- In this case, we’re going to talk about how to
Requirements
- Knowledge of Python
Description
In this course, I will cover how to develop a Credit Card Fraud Detection model that classifies a transaction as Fraud or Legitimate with very high accuracy using different Machine Learning models. This is a hands-on project where I will teach you the step-by-step process of creating and evaluating a machine learning model.
This course will walk you through the initial data exploration and understanding, data analysis, data preparation, model building, and evaluation. We will explore the RepeatedKFold, StratifiedKFold, Random Oversampler, SMOTE, and ADASYN concepts, then use multiple ML algorithms to create our model, and finally focus on the one that performs best on the given dataset.
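To give a feel for why stratified cross-validation matters on fraud data, here is a minimal sketch written with the Python standard library only. It is not the course's code and not scikit-learn's `StratifiedKFold`; the helper name `stratified_folds` and the toy label counts are assumptions made for illustration. The idea is that when fraud cases are rare, each fold should receive its fair share of them.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits=5, seed=0):
    """Split sample indices into n_splits folds that keep the class ratio.

    Hypothetical helper for illustration: the indices of each class are
    shuffled and then dealt round-robin across the folds.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_splits)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


# Toy labels (assumed): 1 = fraud, 0 = legitimate, 10 frauds in 1000 rows.
labels = [1] * 10 + [0] * 990
folds = stratified_folds(labels, n_splits=5)

# Every fold ends up with the same number of fraud cases.
fraud_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(fraud_per_fold)  # [2, 2, 2, 2, 2]
```

A plain, unstratified split could leave some folds with very few or no fraud cases, which makes per-fold scores unreliable. Repeating the split with different seeds is the basic idea behind repeated K-fold schemes.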
In order to make it easier for you to see what will be covered in the course, I have broken it down into the tasks below.
Task 1: Installing the packages.
Task 2: Importing the libraries.
Task 3: Loading the data from the source.
Task 4: Understanding the data.
Task 5: Checking the class distribution of the target variable.
Task 6: Finding correlations and plotting a heat map.
Task 7: Performing feature engineering.
Task 8: Train-test split.
Task 9: Plotting the distribution of a variable.
Task 10: About the Confusion Matrix, Classification Report, and AUC-ROC.
Task 11: Creating a common function to plot the confusion matrix.
Task 12: About Logistic Regression, KNN, Decision Tree, Random Forest, XGBoost, and SVM.
Task 13: Creating a common function to fit and predict with a Logistic Regression model.
Task 14: Creating a common function to fit and predict with a KNN model.
Task 15: Creating a common function to fit and predict with a Decision Tree model.
Task 16: Creating a common function to fit and predict with a Random Forest model.
Task 17: Creating a common function to fit and predict with an XGBoost model.
Task 18: Creating a common function to fit and predict with an SVM model.
Task 19: About RepeatedKFold and StratifiedKFold.
Task 20: Performing cross-validation with RepeatedKFold and evaluating the model.
Task 21: Performing cross-validation with StratifiedKFold and evaluating the model.
Task 22: Proceeding with the model that has shown the best results so far.
Task 23: About Random Oversampler, SMOTE, and ADASYN.
Task 24: Performing oversampling with Random Oversampler, using StratifiedKFold cross-validation, and evaluating the model.
Task 25: Performing oversampling with SMOTE and evaluating the model.
Task 26: Performing oversampling with ADASYN and evaluating the model.
Task 27: Tuning the hyperparameters.
Task 28: Extracting the most important features.
Task 29: Drawing the final conclusion.
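Two ideas from the task list, the confusion matrix (Task 10) and random oversampling (Task 24), can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `confusion_counts` and `random_oversample`, the toy labels, and the "always predict legitimate" baseline are hypothetical and are not taken from the course or from any library.

```python
import random


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the fraud class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def random_oversample(rows, labels, minority=1, seed=0):
    """Duplicate random minority rows until both classes have the same size.

    Hypothetical helper for binary labels; assumes at least one minority row.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_count = len(labels) - len(minority_idx)
    n_extra = max(0, majority_count - len(minority_idx))
    extra = [rng.choice(minority_idx) for _ in range(n_extra)]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data (assumed): 5 frauds among 100 transactions.
y_true = [1] * 5 + [0] * 95
# A useless baseline that labels every transaction as legitimate.
y_pred = [0] * 100

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn) if (tp + fn) else 0.0
print(accuracy, recall)  # 0.95 0.0

# Balance the classes by duplicating fraud rows.
rows = list(range(100))
balanced_rows, balanced_labels = random_oversample(rows, y_true)
print(balanced_labels.count(1), balanced_labels.count(0))  # 95 95
```

The baseline reaches 95% accuracy while catching no fraud at all, which is why recall, precision, and AUC-ROC are usually examined alongside accuracy on imbalanced data. Random oversampling only duplicates existing fraud rows, whereas SMOTE and ADASYN generate new synthetic minority points instead.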
Data analysis and model building are among the most sought-after skills of the 21st century. Take the course now and build your Machine Learning knowledge in just a few hours!
Who this course is for:
- Students and professionals who want to learn Data Analysis, Data Preparation for Model Building, and Evaluation.
- Students and professionals who want to learn RepeatedKFold, StratifiedKFold, Random Oversampler, SMOTE, and ADASYN.