
Stock Trend Prediction Based on Text Sentiment Analysis and LSTM


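This article presents an analysis framework built on historical stock data, technical indicators, and text data from stock forums (Guba). After feature selection over the numeric indicators, an LSTM-based sentiment analysis model scores the forum posts, and the resulting market sentiment is used together with the other features to predict the stock trend. The article also covers data preprocessing, model construction, and training, and links to the full code on GitHub and to the published paper.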

I. Overall workflow

image.png

Historical stock data

image.png

Stock technical indicators

image.png
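The article does not list which technical indicators were used, so the following is only a minimal sketch of how two common ones, a simple moving average and a price momentum, could be derived from daily closing prices. The function names, window lengths, and sample prices are illustrative assumptions and are not taken from the article's repository.

```python
from typing import List, Optional


def simple_moving_average(closes: List[float], window: int) -> List[Optional[float]]:
    """Return the moving average per day; None until a full window is available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, price in enumerate(closes):
        running += price
        if i >= window:
            # Drop the price that just left the window.
            running -= closes[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def momentum(closes: List[float], lag: int) -> List[Optional[float]]:
    """Return the price change over `lag` days; None for the first `lag` days."""
    return [None if i < lag else closes[i] - closes[i - lag]
            for i in range(len(closes))]


# Made-up closing prices, for illustration only.
closes = [10.0, 11.0, 12.0, 11.0, 13.0]
sma3 = simple_moving_average(closes, window=3)
mom2 = momentum(closes, lag=2)
print(sma3)
print(mom2)
```

Rows where an indicator is still None would have to be dropped or filled before the data is passed to a model.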

Stock macro indicators

image.png

Stock forum (Guba) text data

image.png

1. Feature selection

2. Sentiment analysis of stock forum text

###### 2.1 Text data preprocessing

image.png
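The preprocessing steps themselves are only shown in the figure, so the sketch below is an assumption about what such a pipeline typically involves before an LSTM: cleaning each post, mapping tokens to integer ids, and padding to a fixed length. To stay dependency-free it tokenizes by character; a real pipeline for Chinese text would more likely use a word segmenter. All names (`clean`, `build_vocab`, `encode`) and the sample posts are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List

PAD, UNK = 0, 1  # reserved ids: padding and out-of-vocabulary tokens


def clean(post: str) -> str:
    """Remove URLs and all whitespace from a raw forum post."""
    post = re.sub(r"https?://\S+", "", post)
    return re.sub(r"\s+", "", post)


def build_vocab(posts: List[str], min_count: int = 1) -> Dict[str, int]:
    """Assign ids (starting at 2) to characters, most frequent first."""
    counts = Counter(ch for p in posts for ch in clean(p))
    vocab: Dict[str, int] = {}
    for ch, n in counts.most_common():
        if n >= min_count:
            vocab[ch] = len(vocab) + 2
    return vocab


def encode(post: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Convert a post to ids, truncated or right-padded to max_len."""
    ids = [vocab.get(ch, UNK) for ch in clean(post)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Made-up posts, for illustration only.
posts = ["大涨 http://t.cn/abc", "大跌"]
vocab = build_vocab(posts)
padded = encode("大涨了", vocab, max_len=5)
truncated = encode("大跌大跌", vocab, max_len=3)
print(vocab, padded, truncated)
```

Fixed-length integer sequences of this kind are what an embedding layer followed by an LSTM normally takes as input.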

###### 2.2 LSTM-based sentiment analysis model

image.png

###### 2.3 Results

image.png

Reference code:
https://github.com/Edward1Chou/SentimentAnalysis
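Because the stock data is daily while the forum yields many posts per day, the per-post sentiment presumably has to be reduced to one value per trading day before it can serve as a feature. The article does not say how this is done; the sketch below assumes a plain daily mean of the predicted positive probability. The function name and the sample scores are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def daily_sentiment(scored_posts: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average the per-post positive probability for each date."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for date, prob_positive in scored_posts:
        totals[date] += prob_positive
        counts[date] += 1
    return {date: totals[date] / counts[date] for date in totals}


# Made-up (date, probability) pairs standing in for the sentiment model's output.
scored = [("2018-01-02", 0.75), ("2018-01-02", 0.25), ("2018-01-03", 0.5)]
features = daily_sentiment(scored)
print(features)
```

Other aggregations, such as the share of positive posts or a post-count-weighted score, would fit the same interface.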

3. Trend prediction

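    import time

    import matplotlib.pyplot as plt
    import numpy as np
    from keras.layers import LSTM, Activation, Dense, Dropout, Input
    from keras.models import Sequential
    from sklearn.preprocessing import MinMaxScaler


    #################### load data
    def load_data(df, sequence_length=10, split=0.8):
        """Build sliding windows from df; the last column is the prediction target."""
        # Convert to float
        data_all = np.array(df.astype(float))
        # Scale every feature column (all but the last) to [0, 1].
        # Note: the scaler is fitted on the full series, test portion included.
        x_scaler = MinMaxScaler()
        data_all[:, :-1] = x_scaler.fit_transform(data_all[:, :-1])
        # Each sample holds sequence_length input rows plus one target row
        data = []
        for i in range(len(data_all) - sequence_length - 1):
            data.append(data_all[i: i + sequence_length + 1])
        reshaped_data = np.array(data).astype('float64')
        # x keeps every column of the input rows, including the unscaled target column
        x = reshaped_data[:, :-1]
        # y is the last column of the final row in each window
        y = reshaped_data[:, -1][:, -1]
        # Scale the target separately so predictions can be inverse-transformed later
        scaler = MinMaxScaler()
        y = scaler.fit_transform(y.reshape(-1, 1))
        # Chronological split, no shuffling
        split_boundary = int(reshaped_data.shape[0] * split)
        train_x = x[: split_boundary]
        test_x = x[split_boundary:]

        train_y = y[: split_boundary]
        test_y = y[split_boundary:]

        return train_x, train_y, test_x, test_y, scaler


    def build_model(layers):
        """layers = [n_features, units of LSTM 1, units of LSTM 2, output size]."""
        model = Sequential()
        # Variable-length sequences with layers[0] features per time step
        model.add(Input(shape=(None, layers[0])))

        # By setting return_sequences to True we are able to stack another LSTM layer
        model.add(LSTM(layers[1], return_sequences=True))
        model.add(Dropout(0.5))

        model.add(LSTM(layers[2], return_sequences=False))
        model.add(Dropout(0.4))

        model.add(Dense(layers[3]))
        model.add(Activation("linear"))

        start = time.time()
        # This is a regression task, so track MAE instead of classification accuracy
        model.compile(loss="mse", optimizer="rmsprop", metrics=["mae"])
        print("Compilation Time : ", time.time() - start)
        return model


    def train_model(train_x, train_y, test_x, test_y):
        model = build_model([train_x.shape[2], train_x.shape[1], 200, 1])

        try:
            model.fit(train_x, train_y, batch_size=40, epochs=200, validation_split=0.1)
        except KeyboardInterrupt:
            # Allow manual early stopping and keep the weights learned so far
            print("Training interrupted; predicting with the current weights.")
        predict = model.predict(test_x)
        predict = np.reshape(predict, (predict.size, ))
        try:
            plt.figure(1)
            plt.plot(predict, 'r:')
            plt.plot(test_y, 'g-')
            plt.legend(['predict', 'true'])
        except Exception as e:
            print(e)
        return predict, test_y, model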
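Since the goal is trend prediction, the regression output can also be judged by whether it gets the direction of the next move right. The article does not define such a metric, so the following is a hedged sketch: it assumes the target is a price-like series and compares each prediction with the previous true value. The function name and the sample numbers are hypothetical.

```python
from typing import Sequence


def direction_accuracy(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of steps where the predicted move (relative to the previous
    true value) has the same up/not-up direction as the actual move."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) < 2:
        raise ValueError("need at least two points to define a direction")
    hits = 0
    for t in range(1, len(y_true)):
        true_up = y_true[t] > y_true[t - 1]
        pred_up = y_pred[t] > y_true[t - 1]
        hits += int(true_up == pred_up)
    return hits / (len(y_true) - 1)


# Made-up values, for illustration only.
y_true = [1.0, 2.0, 1.5, 1.8]
y_pred = [1.1, 1.7, 1.9, 1.4]
acc = direction_accuracy(y_true, y_pred)
print(acc)
```

With the functions above, `predict` and `test_y` are both on the scaled target axis, so they can be compared directly once `test_y` is flattened; `scaler.inverse_transform(predict.reshape(-1, 1))` maps predictions back to the original units if needed.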

The code for this article is available on GitHub:
https://github.com/tutan123/StockTrendPrediction

Published paper:
