A twin network can be trained to maximize the distance between the embeddings of its inputs. Implements the approach described here.
Usage
create_model_twin_network(
  maxlen = 50,
  dropout_lstm = 0,
  recurrent_dropout_lstm = 0,
  layer_lstm = NULL,
  layer_dense = c(4),
  dropout_dense = NULL,
  kernel_size = NULL,
  filters = NULL,
  strides = NULL,
  pool_size = NULL,
  solver = "adam",
  learning_rate = 0.001,
  vocabulary_size = 4,
  bidirectional = FALSE,
  compile = TRUE,
  padding = "same",
  dilation_rate = NULL,
  gap_inputs = NULL,
  use_bias = TRUE,
  residual_block = FALSE,
  residual_block_length = 1,
  size_reduction_1Dconv = FALSE,
  zero_mask = FALSE,
  verbose = TRUE,
  batch_norm_momentum = 0.99,
  distance_method = "euclidean",
  last_layer_activation = "sigmoid",
  loss_fn = loss_cl(margin = 1),
  metrics = "acc",
  model_seed = NULL,
  mixed_precision = FALSE,
  mirrored_strategy = NULL
)
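For example, a twin network with two convolutional layers followed by an LSTM layer could be requested as shown below (a minimal sketch, assuming the package providing create_model_twin_network is attached; the argument values are illustrative placeholders, and further CNN arguments such as strides may be needed depending on the version):

# Twin network with two 1D convolutional layers, one LSTM layer and two
# dense layers before the distance layer (illustrative values only).
create_model_twin_network(
  maxlen = 200,
  kernel_size = c(3, 3),
  filters = c(16, 32),
  pool_size = c(2, 2),
  layer_lstm = 64,
  layer_dense = c(8, 4),
  distance_method = "euclidean",
  loss_fn = loss_cl(margin = 1)
)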
Arguments
- maxlen
Length of predictor sequence.
- dropout_lstm
Fraction of the units to drop for inputs.
- recurrent_dropout_lstm
Fraction of the units to drop for recurrent state.
- layer_lstm
Number of cells per network layer. Can be a scalar or vector.
- layer_dense
Vector containing the number of neurons per dense layer, applied before the distance layer.
- dropout_dense
Dropout rates between dense layers. No dropout if NULL.
- kernel_size
Size of 1d convolutional layers. For multiple layers, assign a vector (e.g., rep(3, 2) for two layers and kernel size 3).
- filters
Number of filters. For multiple layers, assign a vector.
- strides
Stride values. For multiple layers, assign a vector.
- pool_size
Integer, size of the max pooling windows. For multiple layers, assign a vector.
- solver
Optimization method, options are "adam", "adagrad", "rmsprop" or "sgd".
- learning_rate
Learning rate for optimizer.
- vocabulary_size
Number of unique characters in the vocabulary.
- bidirectional
Use a bidirectional wrapper for LSTM layers.
- compile
Whether to compile the model.
- padding
Padding of CNN layers, e.g. "same", "valid" or "causal".
- dilation_rate
Integer, the dilation rate to use for dilated convolution.
- gap_inputs
Global pooling method to apply. Same options as for the flatten_method argument of the create_model_transformer function.
- use_bias
Boolean. Usage of bias for CNN layers.
- residual_block
Boolean. If TRUE, residual connections are used in the CNN. They are not used in the first convolutional layer.
- residual_block_length
Integer. Determines how many convolutional layers (or triplets when size_reduction_1Dconv is TRUE) exist in each residual block.
- size_reduction_1Dconv
Boolean. When TRUE, the number of filters in the convolutional layers is reduced to 1/4 of the original number of filters.
- zero_mask
Boolean, whether to apply zero masking before the LSTM layer. Only used if the model does not use any CNN layers.
- verbose
Boolean.
- batch_norm_momentum
Momentum for the moving mean and the moving variance.
- distance_method
Either "euclidean" or "cosine".
- last_layer_activation
Activation function of output layer(s). For example "sigmoid" or "softmax".
- loss_fn
Either "categorical_crossentropy" or "binary_crossentropy". If label_noise_matrix is given, will use a custom "noisy_loss". For the default loss_cl(margin = 1), see the contrastive-loss sketch after this list.
- metrics
Vector or list of metrics.
- model_seed
Set seed for model parameters in TensorFlow if not NULL.
- mixed_precision
Whether to use mixed precision (https://www.tensorflow.org/guide/mixed_precision).
- mirrored_strategy
Whether to use a distributed mirrored strategy. If NULL, a mirrored strategy is used only if more than one GPU is available.
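The default loss_fn, loss_cl(margin = 1), is a margin-based contrastive loss applied to the distance computed according to distance_method. As a rough illustration only (a minimal sketch using the keras R backend functions; the name contrastive_loss and the label convention y_true = 1 for similar pairs are assumptions, not necessarily how loss_cl is implemented), such a loss penalizes similar pairs by their squared distance and dissimilar pairs only when their distance falls below the margin:

library(keras)

# Sketch of a margin-based contrastive loss.
# Assumed convention: y_true = 1 for similar pairs, y_true = 0 for dissimilar
# pairs; y_pred is the distance produced by the twin network's distance layer.
contrastive_loss <- function(margin = 1) {
  function(y_true, y_pred) {
    similar_term <- y_true * k_square(y_pred)
    dissimilar_term <- (1 - y_true) * k_square(k_maximum(margin - y_pred, 0))
    k_mean(similar_term + dissimilar_term)
  }
}

A function of this form could, in principle, be passed via the loss_fn argument in place of the default.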