Money Laundering Detection on the Ethereum Blockchain Using the XGBoost Algorithm

Aldo Amrullah, Muhammad Arhami, Umri Erdiansyah

Abstract


Financial crimes involving crypto assets are growing more complex and are difficult to detect early, money laundering in particular. This study detects money laundering activity on the Ethereum network using ten selected transaction features and a pipeline that combines missing-value imputation with SimpleImputer (mean strategy) and the Extreme Gradient Boosting (XGBoost) algorithm with a binary:logistic objective function. The study uses secondary data from DataverseNL covering 4,681 accounts: 2,179 illicit and 2,502 normal. The actual label distribution was 53.5% normal and 46.5% illicit, while the model's predictions were 54.2% normal and 45.8% illicit. Evaluation with a confusion matrix yielded an accuracy of 95%, with precision, recall, and F1-score each averaging 0.95. These results indicate that the model performs accurately and evenly across both classes when classifying Ethereum accounts as normal or illicit. Overall, the XGBoost pipeline proves to be an effective approach for the early detection of money laundering risk in the cryptocurrency ecosystem and could be developed further as a foundation for digital financial monitoring and compliance systems.
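The pipeline and the evaluation arithmetic described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code. The abstract confirms only the mean-strategy SimpleImputer, the binary:logistic objective, and the account counts. The function names, the default hyperparameters, the use of macro averaging, and the example confusion matrix are assumptions made for illustration.

```python
from typing import Dict, List


def build_pipeline():
    """Mean imputation followed by XGBoost with a binary:logistic objective.

    Imports are deferred so that the metric helper below can be used
    without scikit-learn or xgboost installed.
    """
    from sklearn.impute import SimpleImputer
    from sklearn.pipeline import Pipeline
    from xgboost import XGBClassifier

    return Pipeline([
        ("imputer", SimpleImputer(strategy="mean")),
        # Assumption: all other hyperparameters stay at library defaults.
        ("model", XGBClassifier(objective="binary:logistic")),
    ])


def macro_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Accuracy and macro-averaged precision, recall, and F1.

    cm[i][j] counts accounts whose true class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    precision = recall = f1 = 0.0
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        precision += p / n
        recall += r / n
        f1 += (2 * p * r / (p + r) if p + r else 0.0) / n
    accuracy = sum(cm[k][k] for k in range(n)) / total
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Class shares follow directly from the counts stated in the abstract.
normal, illicit = 2502, 2179
share_normal = round(100 * normal / (normal + illicit), 1)    # 53.5
share_illicit = round(100 * illicit / (normal + illicit), 1)  # 46.5

# Hypothetical confusion matrix for illustration only; NOT the paper's results.
example = macro_metrics([[95, 5], [5, 95]])
```

With the two libraries installed, `build_pipeline().fit(X_train, y_train)` would train on a feature matrix holding the ten selected features; `X_train` and `y_train` are placeholder names, since the abstract does not describe the train/test split.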


Creative Commons License
Journal of Informatics Engineering and Software Applications (JIEngS) is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.