Write random sequences to fasta file — create_dummy

Create random sequences from a predefined vocabulary and write them to fasta files.

Usage

create_dummy_data(
  file_path,
  num_files,
  header = "header",
  seq_length,
  num_seq,
  fasta_name_start = "file",
  write_to_file_path = FALSE,
  prob = NULL,
  vocabulary = c("a", "c", "g", "t")
)

Arguments

file_path: Output directory. Can also be a file name, but only if write_to_file_path = TRUE and num_files = 1.
num_files: Number of files to create.
header: Fasta header name.
seq_length: Length of one sequence. If a vector with more than one element is given, sequence lengths are sampled randomly from that vector.
num_seq: Number of sequences per file.
fasta_name_start: Beginning of the file name. Output files are named fasta_name_start + _i.fasta, where i is an integer index.
write_to_file_path: Whether to write output directly to file_path, i.e. file_path is treated as a file name rather than a directory.
prob: Probability of sampling each character in the vocabulary. If NULL, every character has the same probability.
vocabulary: Set of characters to sample sequences from.
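
The interplay of seq_length, prob, vocabulary and fasta_name_start can be illustrated with a minimal base R sketch. This is not the function's actual implementation; the helper sketch_random_seq and the variable names are hypothetical and only mirror the argument descriptions above.

```r
# Hypothetical helper, not part of the documented API: draws one random
# sequence in the way the argument descriptions suggest.
sketch_random_seq <- function(seq_length,
                              vocabulary = c("a", "c", "g", "t"),
                              prob = NULL) {
  # Pick one length; indexing avoids sample()'s special case for a single number.
  len <- seq_length[sample.int(length(seq_length), 1)]
  # prob = NULL makes sample() draw every character with equal probability.
  chars <- sample(vocabulary, size = len, replace = TRUE, prob = prob)
  paste(chars, collapse = "")
}

set.seed(1)
# Fixed length, uniform character probabilities.
uniform_seq <- sketch_random_seq(seq_length = 11)
# Length drawn from c(5, 10); "a" is sampled with probability 0.7.
skewed_seq <- sketch_random_seq(seq_length = c(5, 10),
                                prob = c(0.7, 0.1, 0.1, 0.1))

# File names following the documented pattern fasta_name_start + _i.fasta.
sketch_file_names <- paste0("file", "_", 1:3, ".fasta")
```

In this sketch prob must have one entry per element of vocabulary, in the same order.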

Value

None. Writes data to files.

Examples

path_output <- tempfile()
dir.create(path_output)
create_dummy_data(file_path = path_output,
                  num_files = 3,
                  seq_length = 11,
                  num_seq = 5,
                  vocabulary = c("a", "c", "g", "t"))
list.files(path_output)
#> [1] "file_1.fasta" "file_2.fasta" "file_3.fasta"
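
Because the function returns nothing, one way to check the result is to read the written files back. The following base R sketch counts records and sequence lengths in a fasta file. The helper summarise_fasta is hypothetical, and the demo file is written by hand so the block runs on its own; its header names are an assumption and need not match what create_dummy_data writes.

```r
# Hypothetical helper: summarise a fasta file using base R only.
summarise_fasta <- function(path) {
  lines <- readLines(path)
  is_header <- startsWith(lines, ">")
  # Each header line starts a new record; cumsum() assigns every
  # following sequence line to that record.
  record_id <- cumsum(is_header)
  seq_lines <- !is_header & record_id > 0 & nzchar(lines)
  # Sum characters per record so sequences wrapped over several lines
  # are counted as one sequence.
  seq_chars <- tapply(nchar(lines[seq_lines]), record_id[seq_lines], sum)
  list(num_seq = sum(is_header), seq_length = as.integer(seq_chars))
}

# Hand-written demo file with two records, the second wrapped over two lines.
demo_file <- tempfile(fileext = ".fasta")
writeLines(c(">header_1", "acgtacgtacg", ">header_2", "ttga", "cc"), demo_file)
demo_summary <- summarise_fasta(demo_file)
```

Applied to one of the files from the example, e.g. summarise_fasta(file.path(path_output, "file_1.fasta")), the helper should allow a comparison with the num_seq and seq_length values passed to create_dummy_data, assuming the files follow the usual layout of a ">" header line followed by sequence lines.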