When stateful LSTM layers are used, you must also define the batch size as part of the input shape in the definition of the network by setting the batch input shape argument and the batch size must be a factor of the number of samples in the training dataset. The batch input shape argument requires a 3-dimensional tuple defined as batch size, time steps, and features. For example, we can define a stateful LSTM to be trained on a training dataset with 100 samples, a batch size of 10, and 5 time steps for 1 feature, as follows