Apply the clustering method used in Dunham & Beltrao (2020) to generate a new deep mutational landscape dataset from a standardised deep_mutational_scan dataset.

recluster(
  x,
  keep_clustering = FALSE,
  deep_split = NULL,
  permissive = 0.4,
  add_combined = TRUE,
  cols = NULL,
  method = "average",
  ...
)

Arguments

x

deep_mutational_scan object to recluster

keep_clustering

Logical. Keep the outputs of hclust and cutreeHybrid for downstream analysis.

deep_split

deepSplit parameters to pass to cutreeHybrid. Either a numeric vector with a named entry for each amino acid, or a single integer that is applied to all amino acids.

permissive

Absolute value threshold for considering a substitution permissive. Positions where every substitution score has an absolute value below this threshold are assigned to the permissive subtype.

add_combined

Combine the supplied data with the deep_landscape dataset.

cols

Columns to cluster on, defaults to PC2:20

method

hclust linkage method.

...

Additional arguments passed on to cutreeHybrid. deepSplit should not be included here as it is specified per AA using deep_split.
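
The two accepted forms of deep_split and the permissive rule can be sketched in base R. This is only an illustration under stated assumptions: the objects `amino_acids`, `deep_split_named`, and `scores` are hypothetical, the deepSplit values are arbitrary, and the permissive check reflects one reading of the argument description rather than the package's internal code.

```r
# Hypothetical illustration: none of these objects come from the package.
amino_acids <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]

# Form 1: a single integer applied to every amino acid.
deep_split_single <- 1L

# Form 2: a numeric vector with one named entry per amino acid.
deep_split_named <- setNames(rep(1, length(amino_acids)), amino_acids)
deep_split_named[c("G", "P")] <- 0  # arbitrary example values

# Assumed reading of `permissive`: a position is permissive when every
# substitution score has an absolute value below the threshold.
permissive <- 0.4
scores <- rbind(
  pos1 = c(A = 0.1, C = -0.2, D = 0.3),
  pos2 = c(A = 0.1, C = -0.9, D = 0.2)
)
is_permissive <- apply(abs(scores), 1, function(row) all(row < permissive))
```

Here `pos1` would be flagged as permissive, while `pos2` would not because one score exceeds the threshold in absolute value.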

Value

A tibble containing the clustered data.

Details

This is most valuable when x is a large multi_study dataset that includes many new scans. Otherwise, the results will either be similar to the original dataset (if add_combined = TRUE) or be based on too few positions to be meaningful (if add_combined = FALSE).

The default parameters apply the procedure used in the paper, and a few parameters are provided to allow some adjustment or experimentation. Larger or more novel changes will likely require adapting the code itself.
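
For readers who want to adapt the procedure, the general shape of per-amino-acid hierarchical clustering can be sketched in base R. This is a rough sketch, not the package's implementation: the toy data, the `wt` column, the helper `cluster_one`, and the use of Euclidean distance are all assumptions, and `cutree()` with a fixed `k` stands in for cutreeHybrid, which selects clusters dynamically.

```r
set.seed(1)

# Toy stand-in for a standardised dataset: hypothetical columns `wt`, PC1..PC4.
toy <- data.frame(
  wt = rep(c("A", "G"), each = 10),
  matrix(rnorm(20 * 4), ncol = 4, dimnames = list(NULL, paste0("PC", 1:4)))
)

# Analogous to the PC2:20 default, shortened to fit the toy data.
cols <- paste0("PC", 2:4)

# Cluster the positions of one wild-type amino acid.
cluster_one <- function(d, method = "average", k = 2) {
  tree <- hclust(dist(d[, cols]), method = method)
  # cutree() with a fixed k is a simple stand-in for cutreeHybrid().
  cutree(tree, k = k)
}

# Cluster each amino acid separately, then restore the original row order.
toy$cluster <- unsplit(lapply(split(toy, toy$wt), cluster_one), toy$wt)
```

Cluster labels produced this way are only meaningful within each amino acid, which is why any tree-cutting parameter would naturally be set per amino acid.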

Examples

new_studies <- bind_scans(deep_scans)
reclust <- recluster(new_studies)