Creates a twin network, which can be trained to maximize the distance between the embeddings of its inputs. Implements the approach described here.

Usage

create_model_twin_network(
  maxlen = 50,
  dropout_lstm = 0,
  recurrent_dropout_lstm = 0,
  layer_lstm = NULL,
  layer_dense = c(4),
  dropout_dense = NULL,
  kernel_size = NULL,
  filters = NULL,
  strides = NULL,
  pool_size = NULL,
  solver = "adam",
  learning_rate = 0.001,
  vocabulary_size = 4,
  bidirectional = FALSE,
  compile = TRUE,
  padding = "same",
  dilation_rate = NULL,
  gap_inputs = NULL,
  use_bias = TRUE,
  residual_block = FALSE,
  residual_block_length = 1,
  size_reduction_1Dconv = FALSE,
  zero_mask = FALSE,
  verbose = TRUE,
  batch_norm_momentum = 0.99,
  distance_method = "euclidean",
  last_layer_activation = "sigmoid",
  loss_fn = loss_cl(margin = 1),
  metrics = "acc",
  model_seed = NULL,
  mixed_precision = FALSE,
  mirrored_strategy = NULL
)

Arguments

maxlen

Length of predictor sequence.

dropout_lstm

Fraction of the units to drop for inputs.

recurrent_dropout_lstm

Fraction of the units to drop for recurrent state.

layer_lstm

Number of cells per network layer. Can be a scalar or vector.

layer_dense

Vector containing the number of neurons per dense layer, before the distance layer (see distance_method).

dropout_dense

Dropout rates between dense layers. No dropout if NULL.

kernel_size

Size of 1D convolutional layers. For multiple layers, assign a vector (e.g., rep(3, 2) for two layers with kernel size 3).

filters

Number of filters. For multiple layers, assign a vector.

strides

Stride values. For multiple layers, assign a vector.

pool_size

Integer, size of the max pooling windows. For multiple layers, assign a vector.

solver

Optimization method, options are "adam", "adagrad", "rmsprop" or "sgd".

learning_rate

Learning rate for optimizer.

vocabulary_size

Number of unique characters in the vocabulary.
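As an illustration of how maxlen and vocabulary_size relate, the base R sketch below one-hot encodes a sequence into a matrix with one row per position and one column per vocabulary character. It assumes one-hot encoded inputs of shape maxlen x vocabulary_size, which this page does not state explicitly; one_hot_encode is a hypothetical helper, not a package function.

```r
# Hypothetical helper (not part of the package): one-hot encode a string
# over a fixed vocabulary. Characters outside the vocabulary become
# all-zero rows.
one_hot_encode <- function(seq_string, vocabulary = c("A", "C", "G", "T")) {
  chars <- strsplit(seq_string, "")[[1]]
  idx <- match(chars, vocabulary)
  encoded <- matrix(0, nrow = length(chars), ncol = length(vocabulary),
                    dimnames = list(NULL, vocabulary))
  known <- !is.na(idx)
  # Matrix indexing: set entry (position, vocabulary index) to 1.
  encoded[cbind(which(known), idx[known])] <- 1
  encoded
}

x <- one_hot_encode("ACGTN")
dim(x)  # rows = sequence length, columns = vocabulary size
```

Under this assumption, each of the two inputs of the twin network would be such a matrix, with the number of rows equal to maxlen.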

bidirectional

Use bidirectional wrapper for LSTM layers.

compile

Whether to compile the model.

padding

Padding of CNN layers, e.g. "same", "valid" or "causal".

dilation_rate

Integer, the dilation rate to use for dilated convolution.
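The arguments kernel_size, strides, padding and dilation_rate together determine how long the output of a 1D convolutional layer is. The base R sketch below reproduces the standard output-length rule used by Keras for 1D convolutions; conv1d_output_length is a hypothetical helper for illustration and is not part of the package.

```r
# Hypothetical helper: output length of a single 1D convolution,
# following the usual Keras rule for "same", "valid" and "causal" padding.
conv1d_output_length <- function(input_length, kernel_size, stride = 1,
                                 dilation_rate = 1, padding = "same") {
  padding <- match.arg(padding, c("same", "valid", "causal"))
  # Effective kernel width once gaps from dilation are included.
  dilated_kernel <- (kernel_size - 1) * dilation_rate + 1
  unstrided <- if (padding == "valid") {
    input_length - dilated_kernel + 1
  } else {
    input_length  # "same" and "causal" preserve the length before striding
  }
  ceiling(unstrided / stride)
}

conv1d_output_length(50, kernel_size = 12)                     # "same" padding
conv1d_output_length(50, kernel_size = 12, padding = "valid")
conv1d_output_length(50, kernel_size = 12, padding = "valid", dilation_rate = 2)
```

With padding = "same" and stride 1 the sequence length is unchanged, while "valid" shortens it by the (dilated) kernel width minus one.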

gap_inputs

Global pooling method to apply. Same options as for the flatten_method argument of create_model_transformer.

use_bias

Boolean. Usage of bias for CNN layers.

residual_block

Boolean. If TRUE, residual connections are used in the CNN. They are not used in the first convolutional layer.

residual_block_length

Integer. Determines how many convolutional layers (or triplets when size_reduction_1Dconv is TRUE) exist

size_reduction_1Dconv

Boolean. When TRUE, the number of filters in the convolutional layers is reduced to 1/4 of the number of filters of

zero_mask

Boolean, whether to apply zero masking before LSTM layer. Only used if model does not use any CNN layers.

verbose

Boolean.

batch_norm_momentum

Momentum for the moving mean and the moving variance.

distance_method

Either "euclidean" or "cosine".
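To make the two options concrete, the base R sketch below computes both distances between two embedding vectors. The exact definitions used inside the model are not given on this page (for example whether cosine distance is taken as one minus cosine similarity, or whether a small epsilon is added), so the functions here are illustrative assumptions.

```r
# Illustrative definitions (assumptions, not the package's internals).
euclidean_distance <- function(a, b) sqrt(sum((a - b)^2))
cosine_distance <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

emb_1 <- c(1, 0, 0, 0)
emb_2 <- c(0, 2, 0, 0)

euclidean_distance(emb_1, emb_2)  # depends on vector lengths
cosine_distance(emb_1, emb_2)     # depends only on the angle between vectors
cosine_distance(emb_1, 3 * emb_1) # rescaling a vector leaves cosine distance at 0
```

The design difference is that Euclidean distance is sensitive to the magnitude of the embeddings, whereas cosine distance compares only their direction.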

last_layer_activation

Activation function of output layer(s). For example "sigmoid" or "softmax".

loss_fn

Loss function used when compiling the model. The default is loss_cl(margin = 1); a loss name such as "categorical_crossentropy" or "binary_crossentropy" can be given instead.
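The default in the usage section is loss_cl(margin = 1). Assuming this is a contrastive loss, the base R sketch below shows one common formulation with a margin. The label convention (0 for similar pairs, 1 for dissimilar pairs) and the function contrastive_loss are assumptions for illustration and may differ from what loss_cl does.

```r
# One common contrastive loss formulation (assumed label convention:
# y_true = 0 for similar pairs, y_true = 1 for dissimilar pairs).
contrastive_loss <- function(y_true, distance, margin = 1) {
  # Similar pairs are penalized for being far apart.
  similar_term <- (1 - y_true) * distance^2
  # Dissimilar pairs are penalized only while closer than the margin.
  dissimilar_term <- y_true * pmax(margin - distance, 0)^2
  mean(similar_term + dissimilar_term)
}

y_true   <- c(0, 0, 1, 1)
distance <- c(0.1, 0.9, 0.2, 1.5)
contrastive_loss(y_true, distance, margin = 1)
```

In this formulation a dissimilar pair whose distance already exceeds the margin contributes nothing, so training pushes such pairs apart only up to the margin.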

metrics

Vector or list of metrics.

model_seed

Set seed for model parameters in tensorflow if not NULL.

mixed_precision

Whether to use mixed precision (https://www.tensorflow.org/guide/mixed_precision).

mirrored_strategy

Whether to use distributed mirrored strategy. If NULL, will use distributed mirrored strategy only if >1 GPU available.

Value

A keras model implementing a twin network architecture.

Examples

# Only run when the TensorFlow Python module is available
if (reticulate::py_module_available("tensorflow")) {
  model <- create_model_twin_network(
    maxlen = 50,
    layer_dense = 16,
    kernel_size = 12,
    filters = 4,
    pool_size = 3,
    learning_rate = 0.001
  )
}