Advances in Economics, Management and Political Sciences
- The Open Access Proceedings Series for Conferences
Series Vol. 47, 01 December 2023
The growing use of the Internet has driven a surge in online shopping and e-commerce, and with it a corresponding rise in credit card fraud. This research therefore applies machine learning techniques, which offer greater precision and efficiency than manual detection, to identify fraudulent activity. To establish the association between credit card transaction attributes and the presence of fraud, the study first gathers data from Kaggle and normalizes it. The data are severely imbalanced, which raises overfitting concerns. A correlation heatmap is constructed to examine correlations between features. Three models are then selected for analysis, and the performance of each is evaluated using a confusion matrix and the metrics derived from it. The findings show that the decision tree and random forest models perform best, both reaching 100% on all indicators. The most influential factors in determining credit card fraud are the ratio of the purchase price to the median purchase price and the distance between the transaction location and the cardholder's home.
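The evaluation step described above can be illustrated with a minimal sketch. It assumes binary labels in which 1 marks a fraudulent transaction and 0 a legitimate one; the names `Confusion`, `confusion`, and `metrics` are hypothetical helpers introduced only for illustration and are not the authors' code. The sketch also shows why a severely imbalanced dataset makes accuracy alone a poor indicator.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts for a binary confusion matrix (1 = fraud, 0 = legitimate)."""
    tp: int  # fraud correctly flagged
    fp: int  # legitimate transaction flagged as fraud
    tn: int  # legitimate transaction correctly passed
    fn: int  # fraud that was missed


def confusion(y_true, y_pred):
    """Tally the four confusion-matrix cells from paired labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return Confusion(tp, fp, tn, fn)


def metrics(c):
    """Derive accuracy, precision, recall and F1; undefined ratios become 0.0."""
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative (made-up) labels: 95 legitimate and 5 fraudulent transactions,
# scored by a trivial model that never predicts fraud.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
baseline = metrics(confusion(y_true, y_pred))
print(baseline)
```

On these made-up labels the trivial model scores high accuracy while its recall is zero, which is why precision, recall, and F1 are usually reported alongside accuracy when fraud cases are rare.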
machine learning, credit card fraud prediction, business analysis
The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Authors who publish in this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open Access Instruction).