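A futures product has several contracts with different expiry dates trading at the same time. Prices of the different expiries move largely in sync, but the spread between them changes constantly. Today we run a spread analysis on two stock index futures contracts: the April current-month contract (IC2304) and the current-quarter contract (IC2306).
First, load the raw tick data, aggregate it into 10-second bars, and compute the spread.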
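# Trading days in the sample period
date_list = get_trade_date_list(start_date, end_date)
# Resample each contract's ticks into bars of length freq (10 seconds here)
bar1 = get_future_tick_resample(code1, start_date, end_date, freq)
bar2 = get_future_tick_resample(code2, start_date, end_date, freq)
# Pair the two price series row by row on bar1's index
cp_df = pd.DataFrame(zip(bar1['price'], bar2['price']), columns=[code1, code2], index=bar1.index)
# Spread = current-month price minus current-quarter price
cp_df['spread'] = cp_df[code1] - cp_df[code2]
# 60-bar moving average of the spread (10 minutes of 10-second bars)
cp_df['spread_ma'] = cp_df['spread'].rolling(window=60).mean()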
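As a rough illustration of this step, the sketch below resamples synthetic one-second ticks into 10-second bars with plain pandas, then computes the spread and a moving average. It is not the `get_future_tick_resample` helper used above, whose implementation is not shown here. The function `resample_last_price`, the random-walk prices, the last-price-per-bar rule, and the shorter 6-bar window are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def resample_last_price(ticks: pd.DataFrame, freq: str = "10s") -> pd.DataFrame:
    """Hypothetical helper: keep the last traded price in each bar."""
    bars = ticks["price"].resample(freq).last()
    # Bars with no trades inherit the previous bar's price.
    return bars.ffill().to_frame("price")


# Synthetic ticks (assumption): one tick per second for 10 minutes.
rng = np.random.default_rng(0)
idx = pd.date_range("2023-04-03 09:30:00", periods=600, freq="s")
code1, code2 = "IC2304", "IC2306"
ticks1 = pd.DataFrame({"price": 6300 + rng.normal(0, 0.5, 600).cumsum()}, index=idx)
ticks2 = pd.DataFrame({"price": 6280 + rng.normal(0, 0.5, 600).cumsum()}, index=idx)

bar1 = resample_last_price(ticks1)
bar2 = resample_last_price(ticks2)

# Building the frame from a dict aligns the two series on their timestamps.
cp_df = pd.DataFrame({code1: bar1["price"], code2: bar2["price"]})
cp_df["spread"] = cp_df[code1] - cp_df[code2]
# A 6-bar window (one minute) keeps the toy example short.
cp_df["spread_ma"] = cp_df["spread"].rolling(window=6).mean()
print(cp_df.tail())
```

Building the frame from a dict rather than `zip` is a design choice for the sketch: it matches rows by timestamp, so a bar missing from one contract shows up as NaN instead of silently shifting the pairing.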
Sample rows of the two contracts' raw prices and the spread:
Price and spread chart
Spread distribution histogram
Next, we use these 10-second bars to train an LSTM neural network.
Building features and labels
tick_df = pd.DataFrame(zip(bar1['price'], bar2['price']), columns=[code1, code2], index=bar1.index)
tick_df['spread'] = tick_df[code1] - tick_df[code2]
# Features: rolling sums of the spread over 5, 20 and 40 bars (incomplete leading windows are filled with 0)
tick_df['spread_1'] = tick_df['spread'].rolling(window=5).sum().fillna(0)
tick_df['spread_2'] = tick_df['spread'].rolling(window=20).sum().fillna(0)
tick_df['spread_3'] = tick_df['spread'].rolling(window=40).sum().fillna(0)
tick_df.dropna(inplace=True)
# Label: the spread fut_num bars ahead (shift avoids chained assignment through .iloc)
tick_df['label'] = tick_df['spread'].shift(-fut_num)
# Drop the last fut_num rows, which have no future value to predict
tick_df = tick_df.iloc[:-fut_num, :]
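To make the label construction concrete, here is a minimal sketch on a six-row toy series. The helper name `add_future_label` and the toy values are assumptions for illustration only; the idea is simply that each row's label is the spread observed `fut_num` rows later.

```python
import pandas as pd


def add_future_label(df: pd.DataFrame, fut_num: int) -> pd.DataFrame:
    """Hypothetical helper: label each row with the spread fut_num rows ahead."""
    out = df.copy()
    # shift(-fut_num) moves future values up to the current row.
    out["label"] = out["spread"].shift(-fut_num)
    # The last fut_num rows have no future value, so drop them.
    return out.iloc[:-fut_num]


toy = pd.DataFrame({"spread": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})
labeled = add_future_label(toy, fut_num=3)
print(labeled)
```

Row 0 (spread 1.0) is paired with label 4.0, the spread three rows later, and the frame shrinks from six rows to three.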
Data preprocessing
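# Select the label column and the feature columns
label = tick_df.loc[:, 'label']
data = tick_df.loc[:, features]
# Normalize, cut into windows of length seq_length, then split into training and test sets (ratio 0.8)
data, label, mm_y = normalization(data, label, normal_flag)
x, y = split_windows(data, label, seq_length)
x_data, y_data, x_train, y_train, x_test, y_test = split_data(x, y, 0.8)
# Wrap the splits in batched loaders
train_loader, test_loader, num_epochs = data_generator(x_train, y_train, x_test, y_test, n_iters, batch_size)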
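The helpers above come from the accompanying source code and are not shown here. As a guess at what the windowing and splitting steps might look like, the NumPy sketch below cuts a feature matrix into overlapping windows and splits them chronologically. The names `make_windows` and `chronological_split`, and the choice to pair each window with the label of its last row, are assumptions and not the author's implementation.

```python
import numpy as np


def make_windows(data: np.ndarray, label: np.ndarray, seq_length: int):
    """Hypothetical helper: build overlapping windows of seq_length rows."""
    xs, ys = [], []
    for start in range(len(data) - seq_length + 1):
        end = start + seq_length
        xs.append(data[start:end])
        # Assumption: the target is the label attached to the window's last row.
        ys.append(label[end - 1])
    return np.array(xs), np.array(ys)


def chronological_split(x: np.ndarray, y: np.ndarray, ratio: float):
    """Hypothetical helper: first `ratio` of the windows for training, rest for testing."""
    cut = int(len(x) * ratio)
    return x[:cut], y[:cut], x[cut:], y[cut:]


# Toy input: 10 rows, 2 features.
data = np.arange(20, dtype=float).reshape(10, 2)
label = np.arange(10, dtype=float)

x, y = make_windows(data, label, seq_length=4)
x_train, y_train, x_test, y_test = chronological_split(x, y, 0.8)
print(x.shape, y.shape, x_train.shape, x_test.shape)
```

Splitting by position rather than shuffling keeps every test window later in time than the training windows, which matters for time-series data where shuffled splits leak future information.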
Model training and prediction
# Train the model
model, criterion, loss_list = train_script_LSTM(
    train_loader, num_epochs,
    features=features, seq_length=seq_length,
    batch_size=batch_size,
    hidden_size=hidden_size,
    num_layers=num_layers,
    LR=LR,
)
# Predict on x_data and compute the loss
data_predict = model(x_data)
loss = criterion(data_predict, y_data)
# Convert the tensors to NumPy arrays for evaluation and plotting
predict_np = data_predict.detach().numpy()
real_np = y_data.detach().numpy()
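Training results
The residuals fluctuate within [-3, 4] overall, and on the test set the predictions are lower than the actual values overall.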
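A residual range and a bias direction like these can be read off with a few lines of NumPy. The sketch below is illustrative only: `residual_summary` is a hypothetical helper, the arrays are toy values rather than the model's output, and the residual is assumed to be defined as actual minus predicted, so a positive mean indicates predictions that are too low.

```python
import numpy as np


def residual_summary(real: np.ndarray, pred: np.ndarray) -> dict:
    """Hypothetical helper: range and mean of residuals (actual minus predicted)."""
    residual = real - pred
    return {
        "min": float(residual.min()),
        "max": float(residual.max()),
        "mean": float(residual.mean()),
    }


# Toy values (assumption), not results from the trained model.
real_toy = np.array([10.0, 12.0, 11.0])
pred_toy = np.array([9.0, 12.5, 10.0])

summary = residual_summary(real_toy, pred_toy)
print(summary)
```

With the arrays from the prediction step, the call would be `residual_summary(real_np.ravel(), predict_np.ravel())`, applied separately to the training and test portions to compare their bias.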
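We have packaged all the source data, source code, and Python environment used in this article so that it works out of the box and runs with one click. If you are interested, download it and try it yourself; hands-on practice is the best way to learn.
Follow my official account of the same name and reply "源码" ("source code") in the chat to get it.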