Create LSTM/CNN network to predict middle part of a sequence — create_model_lstm_cnn_target_middle

Creates a network consisting of an arbitrary number of CNN, LSTM and dense layers. The function creates two subnetworks, each consisting of (optional) CNN layers followed by an arbitrary number of LSTM layers. The outputs of the last LSTM layers are then concatenated and followed by one or more dense layers; the final layer is always a dense layer. The network tries to predict the target in the middle of a sequence. For example, if the input is AACCTAAGG, the input tensors should correspond to x1 = AACC and x2 = GGAA (the part after the target, in reverse order), and the target to y = T.
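The split in the AACCTAAGG example can be sketched in base R. The helper name split_target_middle is hypothetical and not part of the package; it only illustrates how x1, x2 and y relate to the input string, assuming an odd-length sequence with the target at its centre.

```r
# Hypothetical helper (not part of the package): split a sequence into the
# part before the target (x1), the reversed part after the target (x2)
# and the middle character (y). Assumes an odd number of characters (>= 3).
split_target_middle <- function(sequence) {
  chars <- strsplit(sequence, "")[[1]]
  n <- length(chars)
  stopifnot(n >= 3, n %% 2 == 1)
  mid <- (n + 1) / 2
  list(
    x1 = paste(chars[seq_len(mid - 1)], collapse = ""),
    x2 = paste(rev(chars[(mid + 1):n]), collapse = ""),
    y  = chars[mid]
  )
}

parts <- split_target_middle("AACCTAAGG")
str(parts)
```

In practice the inputs are encoded tensors rather than strings; the string form is used here only to make the ordering visible.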

Usage

create_model_lstm_cnn_target_middle(
  maxlen = 50,
  dropout_lstm = 0,
  recurrent_dropout_lstm = 0,
  layer_lstm = 128,
  solver = "adam",
  learning_rate = 0.001,
  vocabulary_size = 4,
  bidirectional = FALSE,
  stateful = FALSE,
  batch_size = NULL,
  padding = "same",
  compile = TRUE,
  layer_dense = NULL,
  kernel_size = NULL,
  filters = NULL,
  pool_size = NULL,
  strides = NULL,
  label_input = NULL,
  zero_mask = FALSE,
  label_smoothing = 0,
  label_noise_matrix = NULL,
  last_layer_activation = "softmax",
  loss_fn = "categorical_crossentropy",
  num_output_layers = 1,
  f1_metric = FALSE,
  auc_metric = FALSE,
  bal_acc = FALSE,
  verbose = TRUE,
  batch_norm_momentum = 0.99,
  model_seed = NULL,
  mixed_precision = FALSE,
  mirrored_strategy = NULL
)

Arguments

maxlen

Length of predictor sequence.

dropout_lstm

Fraction of the units to drop for inputs.

recurrent_dropout_lstm

Fraction of the units to drop for recurrent state.

layer_lstm

Number of cells per network layer. Can be a scalar or vector.

solver

Optimization method, options are "adam", "adagrad", "rmsprop" or "sgd".

learning_rate

Learning rate for optimizer.

vocabulary_size

Number of unique characters in vocabulary.

bidirectional

Whether to use a bidirectional wrapper for LSTM layers.

stateful

Boolean. Whether to use stateful LSTM layer.

batch_size

Number of samples that are used for one network update. Only used if stateful = TRUE.

padding

Padding of CNN layers, e.g. "same", "valid" or "causal".

compile

Whether to compile the model.

layer_dense

Vector specifying number of neurons per dense layer after last LSTM or CNN layer (if no LSTM used).

kernel_size

Kernel size of 1D convolutional layers. For multiple layers, assign a vector (e.g., rep(3, 2) for two layers with kernel size 3).

filters

Number of filters. For multiple layers, assign a vector.

pool_size

Integer, size of the max pooling windows. For multiple layers, assign a vector.

strides

Stride values. For multiple layers, assign a vector.
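Since kernel_size, filters, pool_size and strides each take one entry per CNN layer, a quick length check before building the model can catch mismatched vectors. The helper check_cnn_args below is hypothetical and not part of the package; whether the package itself validates or recycles these vectors is not stated here, so treat this as a sketch under that assumption.

```r
# Hypothetical pre-flight check (not part of the package): every CNN
# argument that is supplied should have one entry per CNN layer.
check_cnn_args <- function(kernel_size, filters, pool_size = NULL, strides = NULL) {
  args <- list(kernel_size = kernel_size, filters = filters,
               pool_size = pool_size, strides = strides)
  args <- Filter(Negate(is.null), args)  # drop arguments left at NULL
  lens <- lengths(args)
  if (length(unique(lens)) > 1) {
    stop("CNN arguments differ in length: ",
         paste(names(lens), lens, sep = " = ", collapse = ", "))
  }
  invisible(unname(lens[1]))             # number of CNN layers
}

n_layers <- check_cnn_args(
  kernel_size = c(8, 8, 8),
  filters = c(16, 32, 64),
  pool_size = c(3, 3, 3)
)
n_layers
```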

label_input

Integer or NULL. If not NULL, adds additional input layer of label_input size.

zero_mask

Boolean, whether to apply zero masking before LSTM layer. Only used if model does not use any CNN layers.

label_smoothing

Float in [0, 1]. If 0, no smoothing is applied. If > 0, the loss is computed between the predicted labels and a smoothed version of the true labels, where the smoothing squeezes the labels towards 0.5. The closer the argument is to 1, the more the labels get smoothed.
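As a worked illustration of "squeezing towards 0.5", the sketch below applies the smoothing formula that Keras documents for binary cross-entropy, y * (1 - label_smoothing) + 0.5 * label_smoothing. The function smooth_labels is hypothetical and only mirrors that formula; for categorical cross-entropy Keras instead moves the labels towards 1 / (number of classes), so the exact targets depend on loss_fn.

```r
# Hypothetical illustration (not part of the package): binary label
# smoothing, which pulls 0/1 labels towards 0.5.
smooth_labels <- function(y, label_smoothing) {
  stopifnot(label_smoothing >= 0, label_smoothing <= 1)
  y * (1 - label_smoothing) + 0.5 * label_smoothing
}

y_true <- c(0, 1)
smooth_labels(y_true, 0)    # labels unchanged
smooth_labels(y_true, 0.2)  # labels moved 20% of the way towards 0.5
smooth_labels(y_true, 1)    # both labels collapse to 0.5
```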

label_noise_matrix

Matrix of label noises. Every row stands for one class and columns for percentage of labels in that class. If first label contains 5 percent wrong labels and second label no noise, then

label_noise_matrix <- matrix(c(0.95, 0.05, 0, 1), nrow = 2, byrow = TRUE )
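A small sanity check on such a matrix can be sketched in base R. It assumes, following the example above, that each row holds the label proportions for one class, so each row should sum to 1 and the diagonal gives the share of correct labels per class.

```r
label_noise_matrix <- matrix(c(0.95, 0.05, 0, 1), nrow = 2, byrow = TRUE)

# Each row describes one class, so its entries should sum to 1.
row_totals <- rowSums(label_noise_matrix)
stopifnot(isTRUE(all.equal(row_totals, c(1, 1))))

# Under this reading, the diagonal is the share of correct labels per class.
correct_share <- diag(label_noise_matrix)
correct_share
```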

last_layer_activation

Activation function of output layer(s). For example "sigmoid" or "softmax".

loss_fn

Either "categorical_crossentropy" or "binary_crossentropy". If label_noise_matrix is given, a custom "noisy_loss" is used instead.

num_output_layers

Number of output layers.

f1_metric

Whether to add F1 metric.

auc_metric

Whether to add AUC metric.

bal_acc

Whether to add balanced accuracy.

verbose

Boolean.

batch_norm_momentum

Momentum for the moving mean and the moving variance.

model_seed

Set seed for model parameters in tensorflow if not NULL.

mixed_precision

Whether to use mixed precision (https://www.tensorflow.org/guide/mixed_precision).

mirrored_strategy

Whether to use distributed mirrored strategy. If NULL, will use distributed mirrored strategy only if >1 GPU available.

Value

A keras model with two input layers. Consists of LSTM, CNN and dense layers.

Examples

if (FALSE) { # reticulate::py_module_available("tensorflow")
create_model_lstm_cnn_target_middle(
  maxlen = 500,
  vocabulary_size = 4,
  kernel_size = c(8, 8, 8),
  filters = c(16, 32, 64),
  pool_size = c(3, 3, 3),
  layer_lstm = c(32, 64),
  layer_dense = c(128, 4),
  learning_rate = 0.001
)
}