In analyzing the dynamics of the human immunodeficiency virus (HIV) epidemic,
the biggest problem in planning for the future is uncertainty. Forecasting likely
developments makes the expected outcomes of decisions more realistic and gives
policymakers the opportunity to take precautions against adverse changes.
Machine learning methods that produce accurate forecasts are therefore needed
to plan future policies, mitigate adverse outcomes, and support decision-making
under uncertainty. In this study, seven machine learning
models used for time-series analysis in medicine are explained theoretically:
Linear Regression, RepTree, Alternating Model Trees, M5, k-Nearest Neighbors
(kNN), Autoregressive Integrated Moving Average (ARIMA), and Random Forest.
The dynamics of the HIV epidemic
in Turkey were converted into a stationary time series, with the correlation
structure taken into account. The series was then smoothed using the Moving
Average technique and divided into a training set (2/3) and a test set (1/3).
The machine learning models were trained on the training set, their parameters
were optimized, and they were evaluated on the test set. The models were then
used to forecast the dynamics of the HIV epidemic in Turkey over the three years
from 2019-Q4 to 2022-Q3. The
Random Forest model produced the lowest error, measured by the Mean Absolute
Percentage Error (MAPE), among the seven models. For the Random Forest model,
the coefficient of determination (R²) was 82.16%, the efficiency (E) was 0.6268,
the slope was 2.3362, and the MAPE was 5.4132%. The Random Forest model was
observed to give excellent results for the three-year forecast of the dynamics
of the HIV epidemic in Turkey.
Keywords: Akaike Information Criterion (AIC), Alternating Model Trees,
ARIMA, Autocorrelation Function (ACF), Bayesian Information Criterion (BIC),
Chi-square, Efficiency, HIV Epidemiology, kNN, Ljung-Box Q-statistic (LBQ).
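
The preprocessing and evaluation steps summarized in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function names, the window length, the synthetic series, and the placeholder forecast are all hypothetical. The abstract does not define its metrics, so R² is computed here as the squared Pearson correlation and E is assumed to be the Nash-Sutcliffe efficiency.

```python
from statistics import mean


def moving_average(series, window=3):
    """Simple moving average; returns len(series) - window + 1 smoothed points.
    The window length of 3 is an assumed value."""
    return [mean(series[i:i + window]) for i in range(len(series) - window + 1)]


def chronological_split(series, train_fraction=2 / 3):
    """Leading part for training, trailing part for testing (no shuffling)."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]


def mape(observed, predicted):
    """Mean Absolute Percentage Error in percent; observed values must be nonzero."""
    return 100.0 * mean(abs((o - p) / o) for o, p in zip(observed, predicted))


def r_squared(observed, predicted):
    """Squared Pearson correlation (an assumed definition of R^2).
    Undefined when either series is constant."""
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def efficiency(observed, predicted):
    """Nash-Sutcliffe efficiency (assumed meaning of E): 1 is a perfect fit,
    0 is no better than predicting the observed mean."""
    mo = mean(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mo) ** 2 for o in observed)
    return 1.0 - residual / total


# Synthetic quarterly values for illustration only (not the study's data).
raw = [10.0, 12.0, 11.0, 14.0, 15.0, 17.0, 16.0,
       19.0, 21.0, 20.0, 23.0, 25.0, 24.0, 27.0]
smoothed = moving_average(raw, window=3)
train, test = chronological_split(smoothed)

# Placeholder forecast: extend the average step of the training segment.
# Any of the seven models would take the place of this line in practice.
step = (train[-1] - train[0]) / (len(train) - 1)
forecast = [train[-1] + step * (h + 1) for h in range(len(test))]

print(f"MAPE = {mape(test, forecast):.2f}%")
print(f"R^2  = {r_squared(test, forecast):.4f}")
print(f"E    = {efficiency(test, forecast):.4f}")
```

Under these definitions R² and E measure different things: the squared correlation ignores systematic bias and scale errors in the forecast, while the efficiency penalizes them.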