DoS Attack Classification on Blockchain Networks Using Proximal Policy Optimization
Blockchain technology is known for decentralization, transparency, and strong cryptographic security, yet it remains vulnerable to cyber threats, in particular Denial of Service (DoS) attacks that degrade network availability. This study proposes a model for classifying DoS attacks on blockchain networks using Proximal Policy Optimization (PPO), a reinforcement learning algorithm known for stable training and efficient policy optimization. The PPO model uses an Actor–Critic architecture that combines reinforcement learning with supervised learning to produce an adaptive detection system capable of recognizing dynamic attack patterns. Experiments use the Blockchain Network Attack Traffic (BNaT) dataset, which contains normal and DoS traffic from the Ethereum network. Preprocessing consists of data cleaning, feature transformation, and mapping the labels to two classes (normal and DoS). The model was trained with the best-performing hyperparameter configuration: a learning rate of 0.0003, a clip epsilon of 0.12, an entropy coefficient of 0.005, a batch size of 256, and a value coefficient of 0.7. The PPO model achieved 99.65% accuracy, 99.65% precision, 99.65% recall, 99.65% F1-score, 99.93% Average Precision, and 99.99% AUC, indicating that it separates normal traffic from attack traffic with high reliability. These findings indicate that PPO is effective and stable for detecting DoS attacks on blockchain networks and that it is a promising basis for adaptive, resilient cybersecurity systems in distributed environments.
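To make the role of the reported hyperparameters concrete, the following is a minimal sketch of the standard PPO clipped loss with the values quoted above plugged in. It is not the authors' implementation. The abstract does not specify the reward design, the advantage estimator, or the network architecture, so the `classification_reward` function (+1 for a correct label, -1 otherwise), the one-step advantage (return minus value estimate), and all function and field names are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PPOConfig:
    # Values quoted in the abstract.
    learning_rate: float = 3e-4
    clip_epsilon: float = 0.12
    entropy_coef: float = 0.005
    value_coef: float = 0.7
    batch_size: int = 256


def classification_reward(action: int, label: int) -> float:
    """Hypothetical reward (an assumption): +1 if the predicted class is correct, else -1."""
    return 1.0 if action == label else -1.0


def clipped_surrogate(ratio: float, advantage: float, eps: float) -> float:
    """Per-sample PPO objective: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def bernoulli_entropy(p_attack: float) -> float:
    """Entropy (in nats) of a two-class policy over {normal, DoS}."""
    return -sum(p * math.log(p) for p in (p_attack, 1.0 - p_attack) if p > 0.0)


def ppo_loss(samples, cfg: PPOConfig = PPOConfig()) -> float:
    """Mean PPO loss over a batch of one-step episodes.

    Each sample is a tuple (new_prob, old_prob, p_attack, value_pred, ret):
      new_prob, old_prob: probability of the chosen class under the current
                          and the data-collecting policy
      p_attack:           current policy's probability of the DoS class
      value_pred:         critic's estimate for this traffic record
      ret:                observed return (here simply the reward)
    """
    total = 0.0
    for new_prob, old_prob, p_attack, value_pred, ret in samples:
        advantage = ret - value_pred          # assumed one-step advantage
        ratio = new_prob / old_prob           # importance ratio r
        policy_term = clipped_surrogate(ratio, advantage, cfg.clip_epsilon)
        value_term = (ret - value_pred) ** 2  # critic regression error
        entropy_term = bernoulli_entropy(p_attack)
        # Maximize the surrogate and the entropy, minimize the value error.
        total += (-policy_term
                  + cfg.value_coef * value_term
                  - cfg.entropy_coef * entropy_term)
    return total / len(samples)
```

With a clip epsilon of 0.12, the probability ratio contributes to the policy gradient only while it stays within [0.88, 1.12] in the direction that improves the objective. This bounded update is the mechanism behind the training stability that the abstract attributes to PPO. In an actual training loop the advantage would be treated as a constant when differentiating the policy term.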