Online buying and selling have become increasingly popular, and TikTok Shop
is one of the platforms contributing to this trend. Although TikTok Shop
offers attractive shopping features for consumers, order cancellations remain
a challenge for sellers trying to optimize sales and profits. This research
uses the Categorical Boosting (CatBoost) algorithm to predict order
cancellations. So far, this algorithm has seen only limited use in online
shops such as TikTok Shop.
To handle the class imbalance in the data, oversampling with the Synthetic
Minority Oversampling Technique (SMOTE) is employed. Furthermore, Principal
Component Analysis (PCA) is used to identify significant factors contributing
to transaction cancellations. The research data comprises customer purchase
histories on the TikTok
Shop platform. The CRISP-DM (Cross-Industry Standard Process for Data Mining)
method is applied, encompassing the business understanding, data
understanding, data preparation, modeling, evaluation, and deployment stages.
The model is
evaluated using Stratified 10-fold Cross-Validation to measure the quality and
effectiveness of the predictive model for the target variable
"Cancelation," based on accuracy, recall, precision, and F1 scores.
In addition, a confusion matrix is used to assess the performance of the
resulting model. The research findings demonstrate that the CatBoost
algorithm achieves high accuracy (99.7%) in classifying transactions as
cancellations or non-cancellations, with precision, recall, and F1-scores
that round to 1.00 for both classes. The most influential
factors contributing to transaction cancellations are Payment Method, Regency
and City, Order Refund Amount, Variation, Province, Product Category, Shipping
Fee After Discount, and SKU Platform Discount.
Keywords: CatBoost, CRISP-DM, Cancellations, PCA,
Imbalanced.
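
The evaluation metrics named in the abstract (accuracy, precision, recall, and F1) all derive from the four cells of a binary confusion matrix. The sketch below shows that derivation in plain Python. The function name `binary_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not the study's data. The example also shows how an accuracy of 0.997 can coexist with precision, recall, and F1 values that print as 1.00 once rounded to two decimals.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 for the positive class
    from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for a balanced 1,000-transaction test set
# (assumption for illustration, NOT results from the study).
metrics = binary_metrics(tp=498, fp=1, fn=2, tn=499)

for name, value in metrics.items():
    # Show each metric at three decimals and at the two-decimal
    # precision typically used in a classification report.
    print(f"{name}: {value:.3f} (two decimals: {value:.2f})")
```

Under these assumed counts, three of the 1,000 transactions are misclassified, so no metric is exactly 1; the two-decimal rounding is what makes the per-class scores appear perfect.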