Impute NA values in a deep_mutational_scan object's ER scores. Values can be imputed as the median scores from the combined landscape dataset or set to custom values.

impute(x, na_value = "impute")

Arguments

x

deep_mutational_scan object whose NA scores should be imputed

na_value

Value to set missense NA values to. Can be "impute", "average", a single value or a matrix of substitution scores (see details).

Value

An imputed deep_mutational_scan

Details

If na_value == "impute", missense NA scores are imputed as the median value of that substitution (e.g. A -> C) from the deep_landscape dataset. If na_value == "average", missense NA scores are set to the average missense score for that position. If na_value is a matrix, it should have row and column names corresponding to single-letter amino acid codes, with cell i,j giving the imputed score for substitutions from i to j. Any other value of na_value is interpreted as the score to impute all NA values to.
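The matrix form of na_value can be sketched in base R as follows. This is a minimal illustration, not the package's own code: the name sub_scores, the placeholder score of -0.5 and the choice to put 0 on the diagonal are assumptions made for the example. The final impute() call is left commented out because it needs the deepscanscape package and the dms object built in the Examples.

```r
# Single-letter codes for the 20 standard amino acids
amino_acids <- c("A", "C", "D", "E", "F", "G", "H", "I", "K", "L",
                 "M", "N", "P", "Q", "R", "S", "T", "V", "W", "Y")

# Hypothetical substitution matrix: cell [i, j] holds the score to impute
# for a substitution from i to j. The value -0.5 is an arbitrary placeholder.
sub_scores <- matrix(-0.5, nrow = 20, ncol = 20,
                     dimnames = list(amino_acids, amino_acids))

# Assumption for this sketch: synonymous cells are given a score of 0
diag(sub_scores) <- 0

# Look up the score that would be used for a missing A -> C substitution
sub_scores["A", "C"]

# With deepscanscape loaded and `dms` created as in the Examples:
# imputed <- impute(dms, na_value = sub_scores)
```

Naming both dimensions with the same amino acid vector keeps the row (from) and column (to) lookups consistent.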

An impute mask is also generated and added to the data tibble of the deep_mutational_scan. This consists of one column for each amino acid (impute_X), which contains '0' if the corresponding score in that row was not imputed, '1' for synonymous substitutions imputed as 0 and '2' for non-synonymous substitutions that have undergone imputation.
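One way to read the mask is to tabulate its codes. The sketch below uses a toy data frame in place of the real data tibble, since how the tibble is extracted from the object is not described here. The values and the names toy, mask_counts and nonsyn_imputed are made up for illustration, and only two of the impute_X columns are shown.

```r
# Toy stand-in for the data tibble; all values are invented for illustration
toy <- data.frame(
  position = 1:3,
  wt       = c("A", "C", "A"),
  impute_A = c(1, 0, 0),
  impute_C = c(2, 0, 2)
)

# Select the mask columns by their impute_ prefix
mask_cols <- grep("^impute_", names(toy), value = TRUE)
mask <- as.matrix(toy[, mask_cols])

# Tabulate mask codes:
# 0 = not imputed, 1 = synonymous set to 0, 2 = imputed non-synonymous
mask_counts <- table(factor(mask, levels = 0:2))

# Number of imputed non-synonymous scores at each position
nonsyn_imputed <- rowSums(mask == 2)
```

Fixing the factor levels to 0:2 keeps all three codes in the table even when one of them does not occur.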

Examples

# Load an unimputed DMS object
path <- system.file("extdata", "urn_mavedb_00000011_a_1_scores.csv", package = "deepscanscape")
csv <- read.csv(path, skip = 4)
dms <- deep_mutational_scan(csv, name = "Hietpas Hsp90", scheme = "mave", trans = NULL,
                            na_value = NULL, annotate = FALSE)
#> Warning: Duplicate scores present - averaging scores for each variant using 'mean'
#> Warning: No imputation applied but NA values present. Data may not be suitable for downstream analysis until NA values are removed.
# Set NA to a constant
impute(dms, na_value = 1)
#> # Deep mutational scanning data
#> # Name: Hietpas Hsp90
#> NA
#> NA
#> # 9 positions
#> # Positional data:
#>   position wt           A        C      D       E        F      G        H
#>      <dbl> <chr>    <dbl>    <dbl>  <dbl>   <dbl>    <dbl>  <dbl>    <dbl>
#> 1        1 Q     -0.00345 -0.0175  -0.267 -0.0209 -0.00234 -0.534  0.00303
#> 2        2 F     -0.247   -0.184   -0.429 -0.769   0       -0.161 -0.0464
#> 3        3 G     -0.151   -0.137   -0.715 -0.813  -0.0383   0     -0.632
#> 4        4 W     -0.954   -0.857   -1.32  -1.14   -0.0259  -1.02  -0.941
#> 5        5 S     -0.983   -0.982   -0.941 -0.845  -0.881   -0.252 -0.888
#> 6        6 A      0       -0.638   -0.907 -1.03   -0.944   -0.191 -0.902
#> 7        7 N     -0.181   -0.245   -0.922 -0.855  -0.0423  -0.256 -0.0403
#> 8        8 M     -0.173   -0.301   -0.840 -0.225  -0.00593 -0.928 -0.641
#> 9        9 E     -0.0412  -0.00649 -0.139  0      -0.0605  -0.750 -0.0521
#> # … with 34 more variables: I <dbl>, K <dbl>, L <dbl>, M <dbl>, N <dbl>,
#> #   P <dbl>, Q <dbl>, R <dbl>, S <dbl>, T <dbl>, V <dbl>, W <dbl>, Y <dbl>,
#> #   name <chr>, impute_A <dbl>, impute_C <dbl>, impute_D <dbl>, impute_E <dbl>,
#> #   impute_F <dbl>, impute_G <dbl>, impute_H <dbl>, impute_I <dbl>,
#> #   impute_K <dbl>, impute_L <dbl>, impute_M <dbl>, impute_N <dbl>,
#> #   impute_P <dbl>, impute_Q <dbl>, impute_R <dbl>, impute_S <dbl>,
#> #   impute_T <dbl>, impute_V <dbl>, impute_W <dbl>, impute_Y <dbl>
# Use the built in imputed values
impute(dms, na_value = "impute")
#> # Deep mutational scanning data
#> # Name: Hietpas Hsp90
#> NA
#> NA
#> # 9 positions
#> # Positional data:
#>   position wt           A        C      D       E        F      G        H
#>      <dbl> <chr>    <dbl>    <dbl>  <dbl>   <dbl>    <dbl>  <dbl>    <dbl>
#> 1        1 Q     -0.00345 -0.0175  -0.267 -0.0209 -0.00234 -0.534  0.00303
#> 2        2 F     -0.247   -0.184   -0.429 -0.769   0       -0.161 -0.0464
#> 3        3 G     -0.151   -0.137   -0.715 -0.813  -0.0383   0     -0.632
#> 4        4 W     -0.954   -0.857   -1.32  -1.14   -0.0259  -1.02  -0.941
#> 5        5 S     -0.983   -0.982   -0.941 -0.845  -0.881   -0.252 -0.888
#> 6        6 A      0       -0.638   -0.907 -1.03   -0.944   -0.191 -0.902
#> 7        7 N     -0.181   -0.245   -0.922 -0.855  -0.0423  -0.256 -0.0403
#> 8        8 M     -0.173   -0.301   -0.840 -0.225  -0.00593 -0.928 -0.641
#> 9        9 E     -0.0412  -0.00649 -0.139  0      -0.0605  -0.750 -0.0521
#> # … with 34 more variables: I <dbl>, K <dbl>, L <dbl>, M <dbl>, N <dbl>,
#> #   P <dbl>, Q <dbl>, R <dbl>, S <dbl>, T <dbl>, V <dbl>, W <dbl>, Y <dbl>,
#> #   name <chr>, impute_A <dbl>, impute_C <dbl>, impute_D <dbl>, impute_E <dbl>,
#> #   impute_F <dbl>, impute_G <dbl>, impute_H <dbl>, impute_I <dbl>,
#> #   impute_K <dbl>, impute_L <dbl>, impute_M <dbl>, impute_N <dbl>,
#> #   impute_P <dbl>, impute_Q <dbl>, impute_R <dbl>, impute_S <dbl>,
#> #   impute_T <dbl>, impute_V <dbl>, impute_W <dbl>, impute_Y <dbl>