Open in Colab: https://colab.research.google.com/github/ryanraba/stocksml/blob/master/docs/modeling.ipynb


Defining Models

Models can be created through a simple structure that defines each hidden layer. Keras and TensorFlow are used under the covers, so many of the common layer types available in Keras are passed through, including:

- Dense Neural Network
- Recurrent Neural Network
- Long Short-Term Memory Network
- Convolutional Neural Network
- Dropout

The desired output size of each layer must also be defined. Activations and other settings are fixed. StocksML will attempt to fit the layers together correctly and align them with the training data, but some care must be taken to define a structure that makes sense.

StocksML uses an unsupervised adversarial algorithm for learning new trading strategies. This requires at least two models to learn from each other. Additional models (specified by the count parameter) are created by copying the first model and re-initializing its weights. The BuildModel function returns a list of Keras models and a numpy array of training data appropriately shaped for the model set.

First let's create a dense neural network with three hidden layers. Dropout layers are typically inserted to help the model generalize and prevent overfitting.

[1]:
!pip install stocksml >/dev/null
from stocksml import LoadData, BuildData, BuildModel

sdf, symbols = LoadData(symbols=['SPY','BND', 'VNQI', 'VIXM'])
fdf = BuildData(sdf)
building BND data...
building SPY data...
building VIXM data...
building VNQI data...
[2]:
models, dx = BuildModel(fdf, len(symbols), count=2, layers=[('dnn',128),
                                                            ('drop', 0.25),
                                                            ('dnn',64),
                                                            ('drop', 0.25),
                                                            ('dnn',32)])
print('training data shape', dx.shape)
models[0].summary()
training data shape (1036, 20)
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input (InputLayer)              [(None, 20)]         0
__________________________________________________________________________________________________
dnn_0 (Dense)                   (None, 128)          2688        input[0][0]
__________________________________________________________________________________________________
drop_1 (Dropout)                (None, 128)          0           dnn_0[0][0]
__________________________________________________________________________________________________
dnn_2 (Dense)                   (None, 64)           8256        drop_1[0][0]
__________________________________________________________________________________________________
drop_3 (Dropout)                (None, 64)           0           dnn_2[0][0]
__________________________________________________________________________________________________
dnn_4 (Dense)                   (None, 32)           2080        drop_3[0][0]
__________________________________________________________________________________________________
action (Dense)                  (None, 5)            165         dnn_4[0][0]
__________________________________________________________________________________________________
symbol (Dense)                  (None, 4)            132         dnn_4[0][0]
__________________________________________________________________________________________________
limit (Dense)                   (None, 1)            33          dnn_4[0][0]
==================================================================================================
Total params: 13,354
Trainable params: 13,354
Non-trainable params: 0
__________________________________________________________________________________________________

The dense and dropout layers we specified are created in the middle of the model (the ‘hidden’ portion) with the output sizes we provided. An input layer is added at the start and shaped to fit our provided feature dataframe (fdf). The 2-D numpy array dx is built from the feature dataframe and returned for use in training later on.

Every model must end with three output layers: action, symbol, and limit. These output layers represent the “trading strategy” that is learned: what action to take in the market (e.g. buy, sell, hold), what ticker symbol to use, and what limit price to set.
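For reference, this skeleton can be written directly in Keras roughly as follows. This is a minimal sketch only; the hidden stack, sizes, and default activations are illustrative assumptions, not BuildModel’s exact internals.

[ ]:
from tensorflow.keras import layers, Model

# minimal sketch of the required model shape; sizes and activations are assumptions
n_features, n_symbols = 20, 4                       # shapes from the example above

inp = layers.Input(shape=(n_features,), name='input')
x = layers.Dense(32, name='dnn_0')(inp)             # any hidden stack goes here

# every StocksML model must end with these three output heads
action = layers.Dense(5, name='action')(x)          # market action (e.g. buy, sell, hold)
symbol = layers.Dense(n_symbols, name='symbol')(x)  # which ticker symbol to trade
limit = layers.Dense(1, name='limit')(x)            # limit price

model = Model(inputs=inp, outputs=[action, symbol, limit])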

Recurrent Neural Networks

When using a recurrent neural network (rnn or lstm), a third dimension is needed in the training data. This third dimension represents time and is created by stacking previous days of data. Use the depth parameter to control the size of the time stacking.

The recurrent layers can pass through the third dimension to each other, but this must be dropped when passing to a dense layer or the final output layers. This is handled automatically by StocksML.
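Conceptually, the time stacking controlled by depth is a sliding window over the 2-D feature array. The numpy sketch below is a rough illustration, not StocksML's actual implementation (note that StocksML retains all 1036 rows in its output, so it evidently treats the first few days differently than this naive window does).

[ ]:
import numpy as np

# rough sketch of depth-based time stacking, not StocksML's actual code
features = np.random.rand(1036, 20)       # (days, features), like fdf
depth = 5

# each sample becomes the current day plus the previous depth-1 days
stacked = np.stack([features[i - depth + 1:i + 1]
                    for i in range(depth - 1, len(features))])
print(stacked.shape)                      # (1032, 5, 20)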

[3]:
models, dx = BuildModel(fdf, len(symbols), count=2,
                        depth=5, layers=[('rnn',64),
                                         ('drop',0.25),
                                         ('rnn',32),
                                         ('drop',0.25),
                                         ('dnn',32)])
print('training data shape', dx.shape)
models[0].summary()
training data shape (1036, 5, 20)
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input (InputLayer)              [(None, 5, 20)]      0
__________________________________________________________________________________________________
rnn_0 (SimpleRNN)               (None, 5, 64)        5440        input[0][0]
__________________________________________________________________________________________________
drop_1 (Dropout)                (None, 5, 64)        0           rnn_0[0][0]
__________________________________________________________________________________________________
rnn_2 (SimpleRNN)               (None, 32)           3104        drop_1[0][0]
__________________________________________________________________________________________________
drop_3 (Dropout)                (None, 32)           0           rnn_2[0][0]
__________________________________________________________________________________________________
dnn_4 (Dense)                   (None, 32)           1056        drop_3[0][0]
__________________________________________________________________________________________________
action (Dense)                  (None, 5)            165         dnn_4[0][0]
__________________________________________________________________________________________________
symbol (Dense)                  (None, 4)            132         dnn_4[0][0]
__________________________________________________________________________________________________
limit (Dense)                   (None, 1)            33          dnn_4[0][0]
==================================================================================================
Total params: 9,930
Trainable params: 9,930
Non-trainable params: 0
__________________________________________________________________________________________________

We see that the input and rnn_0 layers have an extra dimension in their output shapes. This dimension is gone from the output of rnn_2 that is passed to dnn_4. The shape of the training data returned in dx is now 3-dimensional.

Convolutional Neural Networks

As with recurrent neural networks, convolutional neural networks also need a third time dimension. When using a CNN, however, the third dimension is collapsed by an extra Flatten layer that is automatically inserted after the last convolutional layer.

[4]:
models, dx = BuildModel(fdf, len(symbols), count=2,
                        depth=5, layers=[('cnn',32),
                                         ('drop',0.25),
                                         ('cnn',16),
                                         ('drop',0.25),
                                         ('dnn',32)])
print('training data shape', dx.shape)
models[0].summary()
training data shape (1036, 5, 20)
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input (InputLayer)              [(None, 5, 20)]      0
__________________________________________________________________________________________________
cnn_0 (Conv1D)                  (None, 3, 32)        1952        input[0][0]
__________________________________________________________________________________________________
drop_1 (Dropout)                (None, 3, 32)        0           cnn_0[0][0]
__________________________________________________________________________________________________
cnn_2 (Conv1D)                  (None, 1, 16)        1552        drop_1[0][0]
__________________________________________________________________________________________________
flatten (Flatten)               (None, 16)           0           cnn_2[0][0]
__________________________________________________________________________________________________
drop_3 (Dropout)                (None, 16)           0           flatten[0][0]
__________________________________________________________________________________________________
dnn_4 (Dense)                   (None, 32)           544         drop_3[0][0]
__________________________________________________________________________________________________
action (Dense)                  (None, 5)            165         dnn_4[0][0]
__________________________________________________________________________________________________
symbol (Dense)                  (None, 4)            132         dnn_4[0][0]
__________________________________________________________________________________________________
limit (Dense)                   (None, 1)            33          dnn_4[0][0]
==================================================================================================
Total params: 4,378
Trainable params: 4,378
Non-trainable params: 0
__________________________________________________________________________________________________

Here we see that the cnn_0 layer passes 3-D data to the next cnn_2 layer, but then a flatten layer is automatically inserted before the data reaches the dense layers. As with the recurrent models, the training data in dx is now 3-D.

Limiting Symbol Choices

One of the three output layers (symbol) decides which ticker symbol to trade for the corresponding action and limit. The symbol must be present in the feature dataframe (fdf), but the models themselves don’t actually care about that. They simply need to know the maximum number of symbols they will be choosing from.

Sometimes it is desirable to restrict the ticker symbols used for actual trading to a subset of what is in the training data. In this case, the choices parameter can be reduced to the desired value. This restriction must be remembered and preserved later during training for accurate strategy learning.

[5]:
models, dx = BuildModel(fdf, 2, count=2, layers=[('dnn',128),('dnn',64),('dnn',32)])
models[0].summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input (InputLayer)              [(None, 20)]         0
__________________________________________________________________________________________________
dnn_0 (Dense)                   (None, 128)          2688        input[0][0]
__________________________________________________________________________________________________
dnn_1 (Dense)                   (None, 64)           8256        dnn_0[0][0]
__________________________________________________________________________________________________
dnn_2 (Dense)                   (None, 32)           2080        dnn_1[0][0]
__________________________________________________________________________________________________
action (Dense)                  (None, 5)            165         dnn_2[0][0]
__________________________________________________________________________________________________
symbol (Dense)                  (None, 2)            66          dnn_2[0][0]
__________________________________________________________________________________________________
limit (Dense)                   (None, 1)            33          dnn_2[0][0]
==================================================================================================
Total params: 13,288
Trainable params: 13,288
Non-trainable params: 0
__________________________________________________________________________________________________

The size of the symbol output layer matches the value passed to the choices parameter.
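To see why the restriction matters later, consider decoding a prediction: the argmax of the symbol head indexes into whatever ticker list is actually tradeable, so that list must line up with the choices value used at build time. The decoding below is purely illustrative and is not StocksML's training loop.

[ ]:
import numpy as np

tradeable = ['SPY', 'BND']    # must match choices=2 passed to BuildModel above

# a multi-output Keras model returns one array per output head
action_out, symbol_out, limit_out = models[0].predict(dx[:1])
print(tradeable[int(np.argmax(symbol_out[0]))])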

Advanced Models

If you are comfortable using Keras directly, you can certainly build your own models with whatever advanced features you desire. The only constraint is that they must have one input layer and three output layers corresponding to action, symbol, and limit as demonstrated above. It is likely easiest to keep using the BuildModel function to construct the training data array dx, even if you ignore the model list it returns. Alternatively, you can augment the returned model list with additional advanced models of your own; they need not all be the same.
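For example, a hand-built model using a feature BuildModel does not expose might look like the sketch below. The architecture and activations are illustrative assumptions; only the input shape and the three named outputs are required.

[ ]:
from tensorflow.keras import layers, Model

# custom architecture: anything goes between the input and the three heads
inp = layers.Input(shape=dx.shape[1:], name='input')
x = layers.Dense(64, activation='relu')(inp)
x = layers.BatchNormalization()(x)        # a feature not exposed by BuildModel
x = layers.Dense(32, activation='relu')(x)

action = layers.Dense(5, name='action')(x)
symbol = layers.Dense(len(symbols), name='symbol')(x)
limit = layers.Dense(1, name='limit')(x)

models.append(Model(inputs=inp, outputs=[action, symbol, limit]))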