A twin network can be trained to maximize the distance between the embeddings of its inputs. Implements the approach described here.
Usage
create_model_twin_network(
  maxlen = 50,
  dropout_lstm = 0,
  recurrent_dropout_lstm = 0,
  layer_lstm = NULL,
  layer_dense = c(4),
  dropout_dense = NULL,
  kernel_size = NULL,
  filters = NULL,
  strides = NULL,
  pool_size = NULL,
  solver = "adam",
  learning_rate = 0.001,
  vocabulary_size = 4,
  bidirectional = FALSE,
  compile = TRUE,
  padding = "same",
  dilation_rate = NULL,
  gap_inputs = NULL,
  use_bias = TRUE,
  residual_block = FALSE,
  residual_block_length = 1,
  size_reduction_1Dconv = FALSE,
  zero_mask = FALSE,
  verbose = TRUE,
  batch_norm_momentum = 0.99,
  distance_method = "euclidean",
  last_layer_activation = "sigmoid",
  loss_fn = loss_cl(margin = 1),
  metrics = "acc",
  model_seed = NULL,
  mixed_precision = FALSE,
  mirrored_strategy = NULL
)
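For example, a twin network with two convolutional layers followed by an LSTM layer could be requested as shown below (a minimal sketch, assuming the package providing create_model_twin_network is attached; the argument values are illustrative placeholders, and further CNN arguments such as strides may be needed depending on the version):

# Twin network with two 1D convolutional layers, one LSTM layer and two
# dense layers before the distance layer (illustrative values only).
create_model_twin_network(
  maxlen = 200,
  kernel_size = c(3, 3),
  filters = c(16, 32),
  pool_size = c(2, 2),
  layer_lstm = 64,
  layer_dense = c(8, 4),
  distance_method = "euclidean",
  loss_fn = loss_cl(margin = 1)
)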
Arguments
- maxlen
Length of predictor sequence.
- dropout_lstm
Fraction of the units to drop for inputs.
- recurrent_dropout_lstm
Fraction of the units to drop for recurrent state.
- layer_lstm
Number of cells per network layer. Can be a scalar or vector.
- layer_dense
Vector containing the number of neurons per dense layer, applied before the distance layer.
- dropout_dense
Dropout rates between dense layers. No dropout if NULL.
- kernel_size
Size of 1d convolutional layers. For multiple layers, assign a vector (e.g., rep(3, 2) for two layers and kernel size 3).
- filters
Number of filters. For multiple layers, assign a vector.
- strides
Stride values. For multiple layers, assign a vector.
- pool_size
Integer, size of the max pooling windows. For multiple layers, assign a vector.
- solver
Optimization method, options are "adam", "adagrad", "rmsprop" or "sgd".
- learning_rate
Learning rate for optimizer.
- vocabulary_size
Number of unique characters in the vocabulary.
- bidirectional
Use a bidirectional wrapper for LSTM layers.
- compile
Whether to compile the model.
- padding
Padding of CNN layers, e.g. "same", "valid" or "causal".
- dilation_rate
Integer, the dilation rate to use for dilated convolution.
- gap_inputs
Global pooling method to apply. Same options as for the flatten_method argument of the create_model_transformer function.
- use_bias
Boolean. Usage of bias for CNN layers.
- residual_block
Boolean. If TRUE, residual connections are used in the CNN. They are not used in the first convolutional layer.
- residual_block_length
Integer. Determines how many convolutional layers (or triplets when size_reduction_1Dconv is TRUE) exist in each residual block.
- size_reduction_1Dconv
Boolean. When TRUE, the number of filters in the convolutional layers is reduced to 1/4 of the original number of filters.
- zero_mask
Boolean, whether to apply zero masking before the LSTM layer. Only used if the model does not use any CNN layers.
- verbose
Boolean.
- batch_norm_momentum
Momentum for the moving mean and the moving variance.
- distance_method
Either "euclidean" or "cosine".
- last_layer_activation
Activation function of output layer(s). For example "sigmoid" or "softmax".
- loss_fn
Either "categorical_crossentropy" or "binary_crossentropy". If label_noise_matrix is given, will use a custom "noisy_loss". For the default loss_cl(margin = 1), see the contrastive-loss sketch after this list.
- metrics
Vector or list of metrics.
- model_seed
Set seed for model parameters in TensorFlow if not NULL.
- mixed_precision
Whether to use mixed precision (https://www.tensorflow.org/guide/mixed_precision).
- mirrored_strategy
Whether to use a distributed mirrored strategy. If NULL, a mirrored strategy is used only if more than one GPU is available.
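The default loss_fn, loss_cl(margin = 1), is a margin-based contrastive loss applied to the distance computed according to distance_method. As a rough illustration only (a minimal sketch using the keras R backend functions; the name contrastive_loss and the label convention y_true = 1 for similar pairs are assumptions, not necessarily how loss_cl is implemented), such a loss penalizes similar pairs by their squared distance and dissimilar pairs only when their distance falls below the margin:

library(keras)

# Sketch of a margin-based contrastive loss.
# Assumed convention: y_true = 1 for similar pairs, y_true = 0 for dissimilar
# pairs; y_pred is the distance produced by the twin network's distance layer.
contrastive_loss <- function(margin = 1) {
  function(y_true, y_pred) {
    similar_term <- y_true * k_square(y_pred)
    dissimilar_term <- (1 - y_true) * k_square(k_maximum(margin - y_pred, 0))
    k_mean(similar_term + dissimilar_term)
  }
}

A function of this form could, in principle, be passed via the loss_fn argument in place of the default.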