CREDIT CARD FRAUD DETECTION USING MACHINE LEARNING ALGORITHMS
Introduction

Credit card fraud is among the most frequently occurring problems today, driven by the rise of online transactions and e-commerce platforms. It generally happens when a card is stolen and used for unauthorized purposes, or when a fraudster obtains the card information and uses it for personal gain. Credit card fraud detection systems were introduced to detect such fraudulent activity. This project focuses on machine learning algorithms, specifically the Random Forest algorithm and the Adaboost algorithm. The two algorithms are evaluated on accuracy, precision, recall, and F1-score, and the ROC curve is plotted from the confusion matrix results. The algorithm with the highest accuracy, precision, recall, and F1-score is considered the better algorithm for detecting fraud.
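As a minimal sketch of how the four evaluation measures follow from the confusion matrix, the function below computes them from the four cell counts. The function name and the example counts are hypothetical and are not results from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp: fraud predicted as fraud        fp: non-fraud predicted as fraud
    fn: fraud predicted as non-fraud    tn: non-fraud predicted as non-fraud
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 1000 transactions, of which 20 are fraudulent.
metrics = classification_metrics(tp=15, fp=5, fn=5, tn=975)
print(metrics)  # accuracy 0.99, precision 0.75, recall 0.75, f1 0.75
```

With these made-up counts the accuracy is 0.99 while precision and recall are only 0.75, which is why accuracy alone is a weak measure on heavily imbalanced fraud data.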

Objectives

  • Random Forest
  • Adaboost
  • ROC curve
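For the ROC curve objective, the following is a rough sketch of how the curve's points can be obtained: each score threshold gives one confusion matrix, and therefore one (false positive rate, true positive rate) point. The function names, labels, and scores are illustrative assumptions, not this project's data.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) points obtained by sweeping a threshold over the scores.

    labels: 1 for fraud, 0 for non-fraud; scores: higher means more likely fraud.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    # Visit samples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical classifier scores for five transactions.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
curve = roc_points(labels, scores)
print(curve)
print(auc(curve))  # about 0.833 (5 of 6 fraud/non-fraud pairs ranked correctly)
```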

RELATED WORK

Research on credit card fraud detection has produced many methods and techniques, with special interest in neural networks, data mining, and distributed data mining. From a literature survey of these methods, we can conclude that machine learning alone offers many approaches to detecting credit card fraud.

In 2019, Sahayasakila V, D. Kavya Monisha, Aishwarya, and Sikhakolli Venkatavisalakshiswshai Yasaswi explained two important algorithmic techniques [8]: the Whale Optimization Algorithm (WOA) and SMOTE (Synthetic Minority Oversampling Technique). Their main aims were to improve convergence speed and to solve the data imbalance problem. The class imbalance problem is overcome using SMOTE together with WOA: the transactions synthesized by SMOTE are re-sampled to check data accuracy and then optimized using WOA. The approach also improves the convergence speed, reliability, and efficiency of the system.
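To make the oversampling idea concrete, here is a simplified sketch of the core SMOTE step: a synthetic minority sample is placed at a random position on the line between a real minority sample and one of its nearest minority neighbours. It does not cover the WOA stage of [8], and the function name, parameters, and data points are assumptions made for illustration.

```python
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points by interpolating between minority points.

    minority: list of feature tuples from the minority (fraud) class,
    containing at least two points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical fraud samples described by two numeric features.
fraud_points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
for point in smote_sample(fraud_points):
    print(point)
```

Because every synthetic point lies between two real fraud samples, the new points stay inside the region already occupied by the minority class.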

In 2018, Navanushu Khare and Saad Yunus Sait described their work [5] on decision trees, random forest, SVM, and logistic regression, applied to a highly skewed dataset. Performance was evaluated on accuracy, sensitivity, specificity, and precision. The reported accuracy is 97.7% for Logistic Regression, 95.5% for Decision Trees, 98.6% for Random Forest, and 97.5% for the SVM classifier. They concluded that Random Forest has the highest accuracy of the four algorithms and is the best for detecting fraud. They also concluded that the SVM algorithm suffers from the data imbalance problem and does not give better results for detecting credit card fraud.

PROPOSED WORK

[Figure: proposed data-driven approaches for credit card fraud detection]

Architecture Diagram

[Figure: architecture diagram, from "Data Mining Application in Credit Card Fraud Detection System"]

Random Forest Algorithm

Steps for Random Forest Algorithm

1. Take the Kaggle credit card fraud dataset as training data and randomly select samples from it.

2. Use the randomly selected samples to build Decision Trees that classify the cases as fraud or non-fraud.

3. Each Decision Tree is formed by splitting nodes: the feature with the highest information gain becomes the root node, and the tree classifies the fraud and non-fraud cases.

4. A majority vote is then taken over the Decision Trees: an output of 0 indicates a non-fraud case and an output of 1 indicates a fraud case.

5. Finally, compute the accuracy, precision, recall, and F1-score for both the fraud and non-fraud cases.
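The sampling and voting steps above can be sketched as follows. This is a deliberately simplified illustration, not the project's implementation: one-level trees (stumps) stand in for full Decision Trees, each stump is trained on one randomly chosen feature, and the toy transactions (amount, hour of day) are invented rather than taken from the Kaggle dataset.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature):
    """Fit a one-level tree on one feature: keep the threshold with fewest errors."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        for above in (0, 1):  # label predicted when the value exceeds the threshold
            errors = sum(
                (above if row[feature] > threshold else 1 - above) != label
                for row, label in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return feature, threshold, above


def predict_stump(stump, row):
    feature, threshold, above = stump
    return above if row[feature] > threshold else 1 - above


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(len(rows[0]))       # random feature for this tree
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Majority vote: 0 means non-fraud, 1 means fraud."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Invented transactions: (amount, hour of day); label 1 marks fraud.
rows = [(20, 10), (35, 14), (15, 9), (50, 20),
        (900, 3), (1200, 2), (40, 13), (1500, 4)]
labels = [0, 0, 0, 0, 1, 1, 0, 1]
forest = train_forest(rows, labels)
print(predict_forest(forest, (2000, 1)), predict_forest(forest, (5, 12)))
```

Using an odd number of trees avoids ties in the vote for a two-class problem.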

Random Forest algorithm

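Algorithm Random Forest:

To generate c classifiers:

For i = 1 to c do

    Randomly sample the training data D with replacement to produce Di

    Create a root node N containing Di and call BuildTree(N)

End for

Take the majority vote of the c classifiers

BuildTree(N):

    Randomly select x% of all the possible splitting features in N

    Select the feature F that has the highest information gain for further splitting:

        Gain(T, X) = Entropy(T) - Entropy(T, X)

    where Entropy(T, X) is the weighted average entropy of the subsets produced by splitting T on X, and the entropy is calculated as

        Entropy(T) = - ∑ p_i log2(p_i), summed over the classes i, where p_i is the proportion of samples in T that belong to class i

    Create f child nodes N1, ..., Nf

    For i = 1 to f do

        Set the contents of Ni to Di, the samples of N that fall into branch i

        Call BuildTree(Ni)

    End for

End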
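The entropy and information gain used to choose a split can be computed as in the short sketch below. The function names are our own, and the label lists are invented examples in which 1 marks a fraud case.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy(T) = - sum of p_i * log2(p_i) over the classes present in labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(labels, groups):
    """Gain(T, X) = Entropy(T) - weighted average entropy of the split groups."""
    total = len(labels)
    weighted = sum(len(group) / total * entropy(group) for group in groups)
    return entropy(labels) - weighted


parent = [0, 0, 1, 1]
print(entropy(parent))                             # 1.0 for an even two-class mix
print(information_gain(parent, [[0, 0], [1, 1]]))  # 1.0, the split separates the classes
print(information_gain(parent, [[0, 1], [0, 1]]))  # 0.0, the split tells us nothing
```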

[Figure: Random Forest classification]

Adaboost Algorithm 

[Figure: AdaBoost classifier]

Steps for Adaboost Algorithm

1. Take the Kaggle credit card fraud dataset as training data and randomly select samples from it.

2. Use the randomly selected samples to build decision trees sequentially for classifying the fraud and non-fraud cases.

3. Each decision tree is formed by splitting nodes: the feature with the highest information gain becomes the root node, and the tree classifies the fraud and non-fraud cases.

4. Calculate the error rate and the performance of the tree, then update the weights of the fraud and non-fraud transactions that were incorrectly classified.

5. A majority vote is then taken, weighted by each tree's performance: an output of 0 indicates a non-fraud case.

6. An output of 1 indicates a fraud case.

7. Finally, compute the accuracy, precision, recall, and F1-score for both the fraud and non-fraud cases.

Adaboost Algorithm

Algorithm Adaboost:

Input: dataset of n samples

Initialize weights, w1(i) = 1/n for each sample i

Create a decision tree

Select the one that has the lowest entropy

If any samples are incorrectly classified

    Calculate Total Error (TE) = sum of the weights of the incorrectly classified samples

    Calculate Performance = (1/2) ln((1 - TE) / TE)

    For each sample

        If incorrectly classified, increase the weight:

            New weight = old weight * e^(Performance)

        If correctly classified, decrease the weight:

            New weight = old weight * e^(-Performance)

    End for

    Normalize the weight of each sample:

        Normalized weight = updated weight / sum of updated weights

End if
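One round of this weight update can be sketched as follows, assuming the standard AdaBoost formulas for the performance and the exponential re-weighting. The function name, the clipping constant, and the five-sample example are illustrative assumptions.

```python
import math


def adaboost_round(weights, is_correct):
    """Run one AdaBoost weight update.

    weights: current sample weights (summing to 1).
    is_correct: True where the current tree classified the sample correctly.
    Returns the tree's performance and the normalized new weights.
    """
    total_error = sum(w for w, ok in zip(weights, is_correct) if not ok)
    # Clip to avoid division by zero or log(0) when the tree is perfect or useless.
    eps = 1e-10
    total_error = min(max(total_error, eps), 1 - eps)
    performance = 0.5 * math.log((1 - total_error) / total_error)
    # Misclassified samples gain weight, correctly classified samples lose weight.
    updated = [w * math.exp(-performance if ok else performance)
               for w, ok in zip(weights, is_correct)]
    norm = sum(updated)
    return performance, [w / norm for w in updated]


# Five samples with equal weights; the tree misclassifies the last one.
weights = [0.2] * 5
is_correct = [True, True, True, True, False]
performance, new_weights = adaboost_round(weights, is_correct)
print(performance)   # about 0.693, i.e. (1/2) ln(0.8 / 0.2)
print(new_weights)   # about 0.125 for each correct sample, 0.5 for the misclassified one
```

After normalization the misclassified sample carries half of the total weight, so the next tree in the sequence concentrates on it.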

CONCLUSION

Although many fraud detection techniques exist, no single algorithm can be said to detect fraud completely. From our analysis, we conclude that the accuracy is the same for both the Random Forest and the Adaboost algorithms. On precision, recall, and F1-score, however, the Random Forest algorithm scores higher than the Adaboost algorithm. Hence we conclude that the Random Forest algorithm works better than the Adaboost algorithm for detecting credit card fraud.

FUTURE SCOPE

From the above analysis, it is clear that many machine learning algorithms are used to detect fraud, but the results are not yet satisfactory. We would therefore like to implement deep learning algorithms to detect credit card fraud more accurately.
