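Encodes a sequence of integers, drawn from a vocabulary of size voc_size, as n-grams. Each run of n consecutive integers is mapped to a single integer, and the function returns the vector of these encodings. N-grams containing an integer greater than voc_size are encoded as voc_size^n + 1.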

Usage

int_to_n_gram(int_seq, n, voc_size = 4)

Arguments

int_seq

Integer sequence

n

Length of each n-gram, i.e. the number of consecutive integers aggregated into one encoded value.

voc_size

Size of vocabulary.

Value

A numeric vector of n-gram encodings, one entry per n-gram.

Examples

int_to_n_gram(int_seq = c(1,1,2,4,4), n = 2, voc_size = 4)
#> [1]  1  2  8 16
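
The arithmetic behind this output can be sketched in base R. The helper below, n_gram_sketch, is hypothetical and not part of the package; it is a minimal illustration that assumes each window of n consecutive integers is read as a base-voc_size number (after shifting values to start at 0) and then shifted back to start at 1. It also assumes that a single out-of-vocabulary value sends the whole window to voc_size^n + 1. The package's actual implementation may differ.

```r
# Hypothetical helper, not part of the package: a base-R sketch of the
# documented n-gram encoding.
n_gram_sketch <- function(int_seq, n, voc_size = 4) {
  num_windows <- length(int_seq) - n + 1
  if (num_windows < 1) return(numeric(0))
  # Positional weights: the first element of each window is most significant.
  weights <- voc_size^((n - 1):0)
  vapply(seq_len(num_windows), function(i) {
    window <- int_seq[i:(i + n - 1)]
    # Assumption: any value above voc_size maps the whole window to the
    # extra index voc_size^n + 1.
    if (any(window > voc_size)) return(voc_size^n + 1)
    # Shift to 0-based digits, weight them, then shift back to 1-based.
    sum((window - 1) * weights) + 1
  }, numeric(1))
}

n_gram_sketch(c(1, 1, 2, 4, 4), n = 2, voc_size = 4)
#> [1]  1  2  8 16
```

For the pair (2, 4), for instance, the sketch computes (2 - 1) * 4 + (4 - 1) * 1 + 1 = 8, which matches the third entry of the output above.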