Recluster DMS data with a combined dataset — recluster • DeepScanScape

Apply the clustering method used in Dunham & Beltrao (2020) to a new deep mutational landscape dataset based on a standardised deep_mutational_scan dataset.

recluster(
  x,
  keep_clustering = FALSE,
  deep_split = NULL,
  permissive = 0.4,
  add_combined = TRUE,
  cols = NULL,
  method = "average",
  ...
)

Arguments

x	`deep_mutational_scan` object to recluster
keep_clustering	logical. Keep the outputs of hclust and cutreeHybrid for downstream analysis
deep_split	deepSplit parameters to pass to cutreeHybrid. Either a named numeric vector with one entry per amino acid, or a single integer applied to all amino acids.
permissive	Absolute value threshold for considering a substitution permissive. Positions where every substitution score has an absolute value below this threshold are assigned to the permissive subtype.
add_combined	Combine the supplied data with the deep_landscape dataset.
cols	Columns to cluster on, defaults to PC2:20
method	`hclust` linkage method.
...	Additional arguments passed on to `cutreeHybrid`. deepSplit should not be included here as it is specified per AA using deep_split.
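
As a minimal sketch of the per-amino-acid form of deep_split, the base R snippet below builds a named numeric vector. It assumes the expected names are the one-letter codes of the 20 standard amino acids, which this page does not state, and the values 1 and 0 are arbitrary illustrations rather than recommended settings.

```r
# One-letter codes for the 20 standard amino acids.
# Assumption: recluster() expects these as the vector names.
amino_acids <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]

# Start from a common deepSplit value for every amino acid...
deep_split <- setNames(rep(1, length(amino_acids)), amino_acids)

# ...then override selected residues (illustrative choice only).
deep_split[c("G", "P")] <- 0

deep_split
```

A vector built this way could then be supplied as the deep_split argument; a single integer remains the simpler option when the same setting should apply to every amino acid.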

Value

A tibble containing the clustered data.

Details

Reclustering is most valuable when x is a large multi_study dataset that includes many new scans. Otherwise, results will either be similar to the original dataset (if add_combined = TRUE) or be based on too few positions to be meaningful (if add_combined = FALSE).

The default parameters apply the procedure used in the paper, and a few parameters are provided to allow some adjustment or experimentation. Larger or more novel changes will likely require adapting the code itself.
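
To give a feel for what the permissive and method parameters control, the toy example below flags permissive positions and clusters the rest using base R only. It is a simplified sketch, not the package's implementation: the score matrix is made up, Euclidean distance is an assumption, plain cutree stands in for cutreeHybrid, and the order of the two steps is assumed rather than taken from this page.

```r
# Toy score matrix: 6 positions (rows) x 4 substitutions (columns).
# All values are invented for illustration.
scores <- rbind(
  p1 = c( 0.1, -0.2,  0.3,  0.0),
  p2 = c(-1.5, -2.0, -1.8, -1.2),
  p3 = c( 0.2,  0.1, -0.1,  0.39),
  p4 = c(-1.4, -1.9, -2.1, -1.0),
  p5 = c( 0.5, -0.1,  0.2,  0.1),
  p6 = c( 1.0,  1.2,  0.9,  1.1)
)

# A position is treated as permissive when every substitution score
# has an absolute value below the threshold.
permissive <- 0.4
is_permissive <- apply(abs(scores) < permissive, 1, all)

# Cluster the remaining positions with average linkage, matching the
# default method = "average". Euclidean distance is an assumption here.
rest <- scores[!is_permissive, , drop = FALSE]
tree <- hclust(dist(rest), method = "average")

# cutree with a fixed k is a simple stand-in for cutreeHybrid.
clusters <- cutree(tree, k = 2)

is_permissive
clusters
```

Raising permissive moves more positions into the permissive group before any clustering happens, while changing method alters how the remaining positions are merged into a tree.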

Examples

# Combine the scans in deep_scans into a single dataset
new_studies <- bind_scans(deep_scans)
# Recluster using the default parameters
reclust <- recluster(new_studies)