Grey Wolf Optimizer (GWO): From Theory to Practical Application in Deep Learning

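The Grey Wolf Optimizer (GWO) is an optimization algorithm that mimics the hunting behavior of a grey wolf pack. It searches for an optimum by modeling the roles within the pack: the leader (alpha), the followers (beta and delta), and the ordinary wolves (omega).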

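1. The GWO procedure

1.1 Initialization

First, initialize the positions of a pack of grey wolves, where each position represents one candidate solution. With a population of $n$ wolves and solutions of dimension $d$, the initial population matrix is:

$$\mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix} \in \mathbb{R}^{n \times d}$$

where $X_i$ is the position of the $i$-th wolf.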

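1.2 Fitness evaluation

Evaluate the fitness of every wolf's position with the fitness function, then identify the positions of the alpha, beta and delta wolves. Let $f$ be the fitness function; then:

$X_\alpha$: the position with the best fitness (alpha wolf)
$X_\beta$: the position with the second-best fitness (beta wolf)
$X_\delta$: the position with the third-best fitness (delta wolf)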
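
The ranking step can be sketched with `np.argsort`. This is only an illustration under assumptions, not necessarily how any particular implementation does it: the helper name `rank_leaders`, the toy objective `sum(x^2)` and the bounds are all made up for the example.

```python
import numpy as np

def rank_leaders(positions, fitness_fn):
    """Return (positions, scores) of the three fittest wolves, best first."""
    scores = np.array([fitness_fn(x) for x in positions])
    order = np.argsort(scores)[:3]          # indices of the 3 smallest scores
    return positions[order], scores[order]

# Hypothetical setup: 6 wolves in 2 dimensions, minimizing sum(x^2)
rng = np.random.default_rng(0)
wolves = rng.uniform(-10, 10, size=(6, 2))
leaders, leader_scores = rank_leaders(wolves, lambda x: np.sum(x ** 2))
alpha_pos, beta_pos, delta_pos = leaders
```

Sorting the whole population once per iteration guarantees that the alpha, beta and delta wolves really are the three best of the current pack, at the cost of an extra sort.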

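1.3 Position update

The positions of the remaining wolves are updated from the positions of the alpha, beta and delta wolves, using the formulas below. All products are taken component-wise.

1.3.1 Distance vectors

$$D_\alpha = |C_1 \cdot X_\alpha - X_i(t)|, \quad D_\beta = |C_2 \cdot X_\beta - X_i(t)|, \quad D_\delta = |C_3 \cdot X_\delta - X_i(t)|$$

where $X_i(t)$ is the current position of the wolf being updated, and $C_1$, $C_2$, $C_3$ are random vectors whose components lie in $[0, 2]$:

$$C_k = 2 r_2, \quad k = 1, 2, 3$$

1.3.2 Candidate positions

$$X_1 = X_\alpha - A_1 \cdot D_\alpha, \quad X_2 = X_\beta - A_2 \cdot D_\beta, \quad X_3 = X_\delta - A_3 \cdot D_\delta$$

where $A_1$, $A_2$, $A_3$ are coefficient vectors whose components lie in $[-a, a]$, and $a$ is a parameter that decreases over time:

$$A_k = 2 a \, r_1 - a, \quad k = 1, 2, 3$$

Here $a$ is the convergence factor, which decreases linearly from 2 to 0 as the iterations proceed, and the components of $r_1$ and $r_2$ are random numbers in $[0, 1]$, drawn afresh for each $k$.

1.3.3 Updating the wolf's position

$$X_i(t+1) = \frac{X_1 + X_2 + X_3}{3}$$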
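
The three sub-steps of the position update can be written as one vectorized function. The sketch below is an assumption-laden illustration: the name `gwo_step`, the array shapes and the placeholder leaders are chosen for the example and are not taken from the article's code.

```python
import numpy as np

def gwo_step(wolves, leaders, a, rng):
    """One GWO position update for all wolves at once.

    wolves:  (n, d) current positions
    leaders: (3, d) positions of the alpha, beta and delta wolves
    a:       convergence factor in [0, 2]
    """
    candidates = []
    for leader in leaders:
        r1 = rng.random(wolves.shape)       # r1, r2 ~ U[0, 1)
        r2 = rng.random(wolves.shape)
        A = 2 * a * r1 - a                  # coefficient in [-a, a)
        C = 2 * r2                          # random weight in [0, 2)
        D = np.abs(C * leader - wolves)     # distance to this leader
        candidates.append(leader - A * D)   # X_1, X_2 or X_3
    return np.mean(candidates, axis=0)      # (X_1 + X_2 + X_3) / 3

rng = np.random.default_rng(0)
wolves = rng.uniform(-10, 10, size=(5, 2))
leaders = wolves[:3].copy()                 # placeholder leaders for the demo
new_wolves = gwo_step(wolves, leaders, a=2.0, rng=rng)
```

With `a = 0` every coefficient `A` is zero, so each wolf lands exactly on the mean of the three leaders. Larger values of `a` allow steps that overshoot the leaders, which is what lets the pack explore early in the run.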

1.4 Flowchart

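[Figures: GWO flowchart; grey wolf hierarchy]

The two figures show the GWO flowchart and the grey wolf social hierarchy (dominance decreases from top to bottom). GWO mimics how the leading wolves and the rest of the pack behave during a hunt: the leaders' positions guide the other wolves step by step towards the prey (the global optimum). The core of the algorithm is the position update formula, which is applied repeatedly to approach the optimum.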

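2. Advantages and disadvantages of GWO

Advantage: compared with other optimization algorithms, GWO tends to optimize quickly. In every iteration it first produces a set of candidate answers, then compares and ranks them, and uses that ranking to output the best solution.

Disadvantage: GWO is a heuristic optimization algorithm. The solution it produces is only close to the true optimum and is not guaranteed to be the exact optimum of the problem.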

3. Implementing GWO

3.1 Finding the minimum of a simple function with GWO

import numpy as np
import matplotlib.pyplot as plt

# Objective function to minimize: f(x) = sum(x_i^2 + x_i)
def objective_function(x):
    return np.sum(x ** 2 + x)

# GWO parameters
dim = 2            # dimension of a solution
num_wolves = 300   # population size
max_iter = 100     # maximum number of iterations
lower_bound, upper_bound = -10, 10  # search space

# Initialize the wolf positions uniformly inside the search space
wolves = np.random.uniform(lower_bound, upper_bound, (num_wolves, dim))

# Initialize the positions and scores of the alpha, beta and delta wolves
alpha_pos = np.zeros(dim)
beta_pos = np.zeros(dim)
delta_pos = np.zeros(dim)
alpha_score = float("inf")
beta_score = float("inf")
delta_score = float("inf")

convergence_curve = []

for t in range(max_iter):
    # Evaluate every wolf and keep the three best solutions found so far
    for i in range(num_wolves):
        fitness = objective_function(wolves[i])

        if fitness < alpha_score:
            # New best: the previous alpha and beta each move down one rank
            delta_score, delta_pos = beta_score, beta_pos
            beta_score, beta_pos = alpha_score, alpha_pos
            alpha_score, alpha_pos = fitness, wolves[i].copy()
        elif fitness < beta_score:
            delta_score, delta_pos = beta_score, beta_pos
            beta_score, beta_pos = fitness, wolves[i].copy()
        elif fitness < delta_score:
            delta_score, delta_pos = fitness, wolves[i].copy()

    a = 2 - t * (2 / max_iter)  # control parameter, decreases linearly from 2

    # Move every wolf towards the alpha, beta and delta wolves
    for i in range(num_wolves):
        for j in range(dim):
            candidates = []
            for leader_pos in (alpha_pos, beta_pos, delta_pos):
                r1, r2 = np.random.rand(), np.random.rand()
                A = 2 * a * r1 - a
                C = 2 * r2
                D = abs(C * leader_pos[j] - wolves[i][j])
                candidates.append(leader_pos[j] - A * D)
            # Average the three candidates and keep the wolf inside the bounds
            wolves[i][j] = np.clip(np.mean(candidates), lower_bound, upper_bound)

    convergence_curve.append(alpha_score)

print(f"Best solution: {alpha_pos}")
print(f"Best value: {alpha_score}")

# Plot the convergence curve
plt.figure(figsize=(15, 5))
plt.plot(convergence_curve)
plt.title("Convergence curve")
plt.xlabel("Iteration")
plt.ylabel("Fitness")
plt.show()

# Visualize the final wolf positions
if dim == 2:
    plt.figure(figsize=(15, 5))
    plt.scatter(wolves[:, 0], wolves[:, 1], c='blue', label='Wolves')
    plt.scatter(alpha_pos[0], alpha_pos[1], c='red', label='Alpha', marker='x')
    plt.scatter(beta_pos[0], beta_pos[1], c='green', label='Beta', marker='x')
    plt.scatter(delta_pos[0], delta_pos[1], c='purple', label='Delta', marker='x')
    plt.title("Wolf positions")
    plt.legend()
    plt.show()

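[Figure: convergence curve and final wolf positions]

In this example the objective function is $f(x) = \sum_{i=1}^{d} (x_i^2 + x_i)$, the search space is $[-10, 10]$ and the solution dimension is 2. GWO therefore searches within $[-10, 10]$ for the $x_1$ and $x_2$ that give the smallest objective value (this implementation minimizes the objective by default). See the table below for how to modify the individual parameters.

[Image: parameter reference table]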
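
As a sanity check on whatever GWO returns for this objective, the exact minimum can be worked out by hand: each term $x_i^2 + x_i$ has derivative $2x_i + 1$, which vanishes at $x_i = -0.5$, where the term equals $-0.25$. For two dimensions the minimum is therefore $-0.5$ at $(-0.5, -0.5)$. The short check below is only an illustrative sketch; the variable names are assumptions.

```python
import numpy as np

def objective_function(x):
    return np.sum(x ** 2 + x)

x_star = np.full(2, -0.5)                 # analytic minimizer for dim = 2
f_star = objective_function(x_star)       # -0.5

# Random points around x_star should never give a lower value
rng = np.random.default_rng(0)
nearby = x_star + rng.normal(scale=0.1, size=(1000, 2))
nearby_values = np.sum(nearby ** 2 + nearby, axis=1)   # same objective, row-wise
```

A GWO run on this function should report a best value slightly above or equal to `f_star`, with a best solution close to `x_star`.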

3.2 Finding the minimum of the Rastrigin benchmark function with GWO

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older Matplotlib

# Objective function to minimize (Rastrigin function)
def objective_function(x):
    return 10 * len(x) + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))

# GWO parameters
dim = 2            # dimension of a solution
num_wolves = 300   # population size
max_iter = 100     # maximum number of iterations
lower_bound, upper_bound = -5.12, 5.12  # search space

# Initialize the wolf positions uniformly inside the search space
wolves = np.random.uniform(lower_bound, upper_bound, (num_wolves, dim))

# Initialize the positions and scores of the alpha, beta and delta wolves
alpha_pos = np.zeros(dim)
beta_pos = np.zeros(dim)
delta_pos = np.zeros(dim)
alpha_score = float("inf")
beta_score = float("inf")
delta_score = float("inf")

convergence_curve = []

for t in range(max_iter):
    # Evaluate every wolf and keep the three best solutions found so far
    for i in range(num_wolves):
        fitness = objective_function(wolves[i])

        if fitness < alpha_score:
            # New best: the previous alpha and beta each move down one rank
            delta_score, delta_pos = beta_score, beta_pos
            beta_score, beta_pos = alpha_score, alpha_pos
            alpha_score, alpha_pos = fitness, wolves[i].copy()
        elif fitness < beta_score:
            delta_score, delta_pos = beta_score, beta_pos
            beta_score, beta_pos = fitness, wolves[i].copy()
        elif fitness < delta_score:
            delta_score, delta_pos = fitness, wolves[i].copy()

    a = 2 - t * (2 / max_iter)  # control parameter, decreases linearly from 2

    # Move every wolf towards the alpha, beta and delta wolves
    for i in range(num_wolves):
        for j in range(dim):
            candidates = []
            for leader_pos in (alpha_pos, beta_pos, delta_pos):
                r1, r2 = np.random.rand(), np.random.rand()
                A = 2 * a * r1 - a
                C = 2 * r2
                D = abs(C * leader_pos[j] - wolves[i][j])
                candidates.append(leader_pos[j] - A * D)
            # Average the three candidates and keep the wolf inside the bounds
            wolves[i][j] = np.clip(np.mean(candidates), lower_bound, upper_bound)

    convergence_curve.append(alpha_score)

print(f"Best solution: {alpha_pos}")
print(f"Best value: {alpha_score}")

# Plot the convergence curve
plt.figure(figsize=(15, 5))
plt.plot(convergence_curve)
plt.title("Convergence curve")
plt.xlabel("Iteration")
plt.ylabel("Fitness")
plt.show()

# Plot the Rastrigin surface together with the three leading wolves
if dim == 2:
    x = np.linspace(-5.12, 5.12, 400)
    y = np.linspace(-5.12, 5.12, 400)
    X, Y = np.meshgrid(x, y)
    Z = 10 * 2 + (X ** 2 - 10 * np.cos(2 * np.pi * X)) + (Y ** 2 - 10 * np.cos(2 * np.pi * Y))
    fig = plt.figure(figsize=(8, 8))
    ax = fig.add_subplot(111, projection='3d')
    ax.plot_surface(X, Y, Z, cmap='viridis', alpha=0.2)
    ax.scatter(alpha_pos[0], alpha_pos[1], alpha_score, c='red', label='Alpha', marker='x')
    ax.scatter(beta_pos[0], beta_pos[1], beta_score, c='green', label='Beta', marker='x')
    ax.scatter(delta_pos[0], delta_pos[1], delta_score, c='purple', label='Delta', marker='x')
    ax.set_title("GWO result on the Rastrigin function")
    ax.legend()
    plt.show()

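[Figure: convergence curve and GWO result on the Rastrigin surface]

The Rastrigin function is a multimodal function commonly used to test the performance of optimization algorithms. Its mathematical expression is:

$$f(x) = 10n + \sum_{i=1}^{n} \left[ x_i^2 - 10 \cos(2 \pi x_i) \right]$$

where $n$ is the number of variables. Here the solution dimension is dim = 2, so the problem has two variables, $x_1$ and $x_2$. The Rastrigin function attains its minimum value of 0 at $x = (0, \dots, 0)$, which can be seen as follows.

When every $x_i = 0$, the function value is:

$$f(0) = 10n + \sum_{i=1}^{n} \left[ 0 - 10 \cos(0) \right] = 10n - 10n = 0$$

The best solution found by GWO is close to $(0, 0)$, with a best value close to 0. Note that $x_1$ and $x_2$ are not exactly 0 but only approximately 0. This illustrates the disadvantage mentioned earlier: the solution GWO produces is only close to the true optimum, not the exact optimum. Even so, the result shows that the algorithm found a solution whose objective value is nearly minimal, that is, close to zero. In this run GWO effectively located the global optimum and performed well.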
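
To make "multimodal" concrete, the sketch below evaluates the Rastrigin function at three hand-picked points. It is an illustration only; the function name `rastrigin` and the chosen points are assumptions for this example. Points with integer coordinates sit close to local minima, and the half-integer points between them lie on high ridges, which is why a search can get stuck away from the origin.

```python
import numpy as np

def rastrigin(x):
    x = np.asarray(x, dtype=float)
    return 10 * len(x) + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))

f_origin = rastrigin([0.0, 0.0])       # global minimum, value 0
f_near_local = rastrigin([1.0, 1.0])   # close to a local minimum, value about 2
f_ridge = rastrigin([0.5, 0.5])        # ridge between the two, value about 40.5
```

Any path from the basin around $(1, 1)$ to the origin has to cross values far above both, so a purely local search started near $(1, 1)$ would stay there.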

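4. Using GWO to optimize a model

4.1 Basic data preprocessing

import pandas as pd
import numpy as np

# Load the data: the first column is the index, '数据时间' is parsed as datetime
df = pd.read_excel('数据.xlsx', index_col=0, parse_dates=['数据时间'])

# Preprocessing: min-max normalize the total active power column to [0, 1]
df_max = np.max(df['总有功功率（kw）'])
df_min = np.min(df['总有功功率（kw）'])
df_bz = (df['总有功功率（kw）'] - df_min) / (df_max - df_min)

def prepare_data(data, win_size):
    # Slide a window over the series: win_size values in, the next value out
    X = []
    y = []
    for i in range(len(data) - win_size):
        temp_x = data[i:i + win_size]
        temp_y = data[i + win_size]
        X.append(temp_x)
        y.append(temp_y)
    X = np.asarray(X)
    y = np.asarray(y)
    X = np.expand_dims(X, axis=-1)  # shape (samples, win_size, 1)

    return X, y

win_size = 12
X, y = prepare_data(df_bz.values, win_size)

# Chronological split: first 70% for training, the rest for testing
train_size = int(len(X) * 0.7)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]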
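
To see what the sliding window produces, here is a small self-contained sketch that runs a condensed copy of `prepare_data` on a toy series. The toy series and the names `toy`, `X_toy` and `y_toy` are made up for illustration.

```python
import numpy as np

def prepare_data(data, win_size):
    X, y = [], []
    for i in range(len(data) - win_size):
        X.append(data[i:i + win_size])   # win_size consecutive values as input
        y.append(data[i + win_size])     # the value right after the window
    X = np.expand_dims(np.asarray(X), axis=-1)   # (samples, win_size, 1)
    return X, np.asarray(y)

toy = np.arange(6, dtype=float)          # hypothetical series 0, 1, ..., 5
X_toy, y_toy = prepare_data(toy, win_size=3)
# X_toy[:, :, 0] is [[0, 1, 2], [1, 2, 3], [2, 3, 4]] and y_toy is [3, 4, 5]
```

A series of length 6 with a window of 3 yields 3 samples, and the trailing axis of size 1 is the single input feature expected by the sequence model.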

4.2 Searching for the best hyperparameters with GWO

import numpy as np
import tensorflow.compat.v1 as tf
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.models import Sequential
import matplotlib.pyplot as plt
from tcn import TCN

# Disable TensorFlow 2.x eager execution
tf.disable_eager_execution()

# Objective function: build and train a model, return its best validation loss
def objective_function(params):
    # Wolf positions are continuous, so truncate them to integer layer sizes
    dense_1, dense_2, filters1 = params
    dense_1, dense_2, filters1 = int(dense_1), int(dense_2), int(filters1)

    model = Sequential()
    model.add(TCN(nb_filters=filters1, kernel_size=6, activation='relu', input_shape=(win_size, 1), dilations=[1, 2, 4, 8, 16]))
    model.add(Flatten())
    model.add(Dense(dense_1, activation='relu'))
    model.add(Dense(dense_2, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))

    model.compile(optimizer='adam', loss='mse')
    history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test), verbose=0)
    val_loss = min(history.history['val_loss'])

    return val_loss

# GWO parameters
dim = 3          # number of hyperparameters
num_wolves = 5   # population size
max_iter = 20    # maximum number of iterations
lower_bound = [32, 64, 32]     # lower bounds of the hyperparameters
upper_bound = [128, 256, 128]  # upper bounds of the hyperparameters

# Initialize the wolf positions uniformly inside the bounds
wolves = np.random.uniform(lower_bound, upper_bound, (num_wolves, dim))

# Initialize the positions and scores of the alpha, beta and delta wolves
alpha_pos = np.zeros(dim)
beta_pos = np.zeros(dim)
delta_pos = np.zeros(dim)
alpha_score = float("inf")
beta_score = float("inf")
delta_score = float("inf")

convergence_curve = []

for t in range(max_iter):
    # Train one model per wolf and keep the three best configurations so far
    for i in range(num_wolves):
        fitness = objective_function(wolves[i])

        if fitness < alpha_score:
            # New best: the previous alpha and beta each move down one rank
            delta_score, delta_pos = beta_score, beta_pos
            beta_score, beta_pos = alpha_score, alpha_pos
            alpha_score, alpha_pos = fitness, wolves[i].copy()
        elif fitness < beta_score:
            delta_score, delta_pos = beta_score, beta_pos
            beta_score, beta_pos = fitness, wolves[i].copy()
        elif fitness < delta_score:
            delta_score, delta_pos = fitness, wolves[i].copy()

    a = 2 - t * (2 / max_iter)  # control parameter, decreases linearly from 2

    # Move every wolf towards the alpha, beta and delta wolves
    for i in range(num_wolves):
        for j in range(dim):
            candidates = []
            for leader_pos in (alpha_pos, beta_pos, delta_pos):
                r1, r2 = np.random.rand(), np.random.rand()
                A = 2 * a * r1 - a
                C = 2 * r2
                D = abs(C * leader_pos[j] - wolves[i][j])
                candidates.append(leader_pos[j] - A * D)
            # Average the three candidates and clip to this hyperparameter's bounds
            wolves[i][j] = np.clip(np.mean(candidates), lower_bound[j], upper_bound[j])

    convergence_curve.append(alpha_score)

print(f"Best solution: {alpha_pos}")
print(f"Best value: {alpha_score}")

plt.plot(convergence_curve)
plt.title("Convergence curve")
plt.xlabel("Iteration")
plt.ylabel("Validation loss")
plt.show()

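[Figure: convergence curve of the validation loss]

Here the objective function objective_function(params) builds and trains the model and returns the lowest loss reached on the validation set (the validation loss). Its input params is a list of hyperparameters. GWO is initialized with the following values:

dim = 3          # number of hyperparameters
num_wolves = 5   # population size
max_iter = 20    # maximum number of iterations
lower_bound = [32, 64, 32]     # lower bounds of the hyperparameters
upper_bound = [128, 256, 128]  # upper bounds of the hyperparameters

These values are deliberately small and should be adjusted to the complexity of the task. A larger population and more iterations usually give a better result, at a higher computational cost. The goal here is only to demonstrate how GWO can be used for hyperparameter search. The algorithm outputs the best hyperparameters for the current search range and iteration budget, and these are used next to train the model.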
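
One practical detail of this setup: wolf positions are continuous, but `int()` truncates them, so many nearby positions decode to the same network. A possible refinement, sketched below under assumptions, is to decode each position first and cache the score per integer configuration, so that identical networks are not retrained. `score_config` is a hypothetical stand-in for the model training in `objective_function`, and `decode` is a helper invented for this example; neither is part of the original code.

```python
from functools import lru_cache
import numpy as np

def decode(position, lower, upper):
    """Clip a continuous position to the bounds and truncate to integers."""
    clipped = np.clip(position, lower, upper)
    return tuple(int(v) for v in clipped)

@lru_cache(maxsize=None)
def score_config(config):
    """Hypothetical stand-in for training a model and returning val_loss."""
    return sum((c - 100) ** 2 for c in config) / 1e4

lower, upper = [32, 64, 32], [128, 256, 128]
s1 = score_config(decode(np.array([64.2, 128.9, 64.7]), lower, upper))
s2 = score_config(decode(np.array([64.8, 128.1, 64.3]), lower, upper))  # same config
```

Both positions decode to `(64, 128, 64)`, so the second call is served from the cache. Because real training is stochastic, a cached score is a single noisy sample; whether that trade-off is acceptable depends on how expensive one training run is.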

4.3 Training the model with the best hyperparameters

# Retrain the model with the best hyperparameters found by GWO
dense_1, dense_2, filters1 = map(int, alpha_pos)

model = Sequential()
model.add(TCN(nb_filters=filters1, kernel_size=6, activation='relu', input_shape=(win_size, 1), dilations=[1, 2, 4, 8, 16]))
model.add(Flatten())
model.add(Dense(dense_1, activation='relu'))
model.add(Dense(dense_2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer='adam', loss='mse')

history = model.fit(X_train, y_train, epochs=100, batch_size=32, validation_data=(X_test, y_test), verbose=0)

# Plot the training and validation loss
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()

[Figure: training and validation loss per epoch]
