Regression model | Full form | Main features |
---|---|---|
MLR | Multiple linear regression | - Models the linear relationship between multiple input variables and the target variable (Salah et al., 2022) |
Lasso | Least absolute shrinkage and selection operator | - Assigns an elevated weight to one of the correlated predictors while shrinking the coefficients of the others to almost zero - Imposes a penalty on the sum of the absolute values of the coefficients, known as the L1 penalty (Salah et al., 2022) |
Ridge | Ridge regression | - Assigns similar weights to correlated predictors - Imposes an L2 penalty on the sum of the squared values of the coefficients (Salah et al., 2022) |
Elastic Net | Elastic net regression | - Handles collinear data and prevents overfitting - Combines components of both the Lasso (L1) and Ridge (L2) regularization approaches (Malakouti, 2023) |
KNN | K-nearest neighbors | - Makes predictions by averaging the target values of the k nearest neighbors of a given data point - Applies a distance metric (e.g., Euclidean distance) that defines "nearest" (Tarek et al., 2023) |
DT | Decision tree | - Identifies the best feature and split point at each node using the mean squared error (MSE) - Makes predictions by recursively partitioning the data based on feature values (Talekar, 2020) |
RF | Random forest | - Combines multiple decision trees to improve prediction accuracy as an ensemble learning method - Averages the predictions of several trees, each trained on a random subset of the data (Talekar, 2020) |
GBR | Gradient boosting regression | - Minimizes the loss function using gradient descent optimization - Uses shallow decision trees of limited depth as weak learners (Tarek et al., 2023) |
AdaBoost | Adaptive boosting | - Builds a sequence of weak learners, usually shallow decision trees, and evaluates their performance using an exponential loss function (Jasman et al., 2022) |
XGBoost | Extreme gradient boosting | - Known for its speed and efficiency as a powerful gradient boosting algorithm - Builds a strong regression model from a sequence of decision trees, each one correcting the errors of the previous one (“POWER \| Data Access Viewer”, 2023) |
LightGBM | Light gradient boosting machine | - Well recognized for its exceptional performance as a gradient boosting framework - Especially effective on large datasets, providing faster training times without sacrificing accuracy in regression tasks (Malakouti, 2023) |
CatBoost | Categorical boosting | - Combines dynamic learning rates, ordered boosting, and oblivious trees as an advanced gradient boosting technique (Jasman et al., 2022) - Has gained popularity, especially for complex datasets that contain categorical features |
LSTM | Long short-term memory | - Learns relationships in sequential data as a type of recurrent neural network (RNN) - Excels at modeling sequences where long-term context is crucial because of its ability to store and propagate information over extended periods of time (Elsaraiti & Merabet, 2021) |
GRU | Gated recurrent unit | - Learns relationships in sequential data as a type of recurrent neural network (RNN) - Captures long-term dependencies in sequential data while mitigating the vanishing-gradient problem that conventional RNNs encounter (Tao et al., 2022) |
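
To make the table concrete, the sketches below show how each family of models could be instantiated. They are minimal illustrations under stated assumptions, not the implementations or hyperparameter settings of the cited studies. The first sketch covers the linear and regularized models (MLR, Lasso, Ridge, Elastic Net) together with KNN, assuming scikit-learn and a synthetic dataset; all hyperparameter values are illustrative.

```python
# Minimal sketch (assumptions: scikit-learn is installed; the synthetic
# dataset and all hyperparameters are illustrative).
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge, ElasticNet
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))                                   # 8 input variables
y = X @ rng.normal(size=8) + rng.normal(scale=0.1, size=500)    # linear target + noise

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "MLR": LinearRegression(),
    "Lasso (L1 penalty)": Lasso(alpha=0.1),
    "Ridge (L2 penalty)": Ridge(alpha=1.0),
    "Elastic Net (L1 + L2)": ElasticNet(alpha=0.1, l1_ratio=0.5),
    "KNN (Euclidean distance)": KNeighborsRegressor(n_neighbors=5),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    mse = mean_squared_error(y_te, model.predict(X_te))
    print(f"{name}: test MSE = {mse:.4f}")
```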
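The second sketch covers the tree-based and boosting models (DT, RF, GBR, AdaBoost, XGBoost, LightGBM, CatBoost). It assumes the `xgboost`, `lightgbm`, and `catboost` packages are installed alongside scikit-learn; the synthetic data and all hyperparameters are again illustrative.

```python
# Minimal sketch (assumptions: scikit-learn, xgboost, lightgbm, and catboost
# are installed; the synthetic data and all hyperparameters are illustrative,
# not the settings reported in the cited studies).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from catboost import CatBoostRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)  # nonlinear target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "DT": DecisionTreeRegressor(max_depth=5),                  # splits minimize MSE by default
    "RF": RandomForestRegressor(n_estimators=200),             # averages many bagged trees
    "GBR": GradientBoostingRegressor(max_depth=3),             # shallow trees as weak learners
    "AdaBoost": AdaBoostRegressor(n_estimators=100),           # sequence of reweighted weak learners
    "XGBoost": XGBRegressor(n_estimators=300, max_depth=6),
    "LightGBM": LGBMRegressor(n_estimators=300),
    "CatBoost": CatBoostRegressor(iterations=300, verbose=0),  # native categorical-feature support
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name}: test MSE = {mean_squared_error(y_te, model.predict(X_te)):.4f}")
```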
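The last sketch covers the recurrent models (LSTM, GRU) for one-step-ahead forecasting, assuming TensorFlow/Keras is available; the synthetic series, window length, and layer sizes are illustrative assumptions.

```python
# Minimal sketch (assumptions: TensorFlow/Keras is installed; the synthetic
# series, window length, and layer sizes are illustrative).
import numpy as np
import tensorflow as tf

# Build supervised windows from a univariate series: each sample is `window`
# past steps, and the target is the next value.
series = np.sin(np.linspace(0, 100, 2000)).astype("float32")
window = 24
X = np.stack([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X[..., None]  # shape (samples, timesteps, features)

def make_model(cell):
    # `cell` is tf.keras.layers.LSTM or tf.keras.layers.GRU.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(window, 1)),
        cell(32),                  # recurrent layer captures temporal context
        tf.keras.layers.Dense(1),  # one-step-ahead forecast
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

for cell in (tf.keras.layers.LSTM, tf.keras.layers.GRU):
    model = make_model(cell)
    model.fit(X, y, epochs=2, batch_size=64, verbose=0)
    print(f"{cell.__name__}: train MSE = {model.evaluate(X, y, verbose=0):.4f}")
```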