Parse data downloaded from MaveDB, including averaging across multiply mutated sequences.
Usually called internally by parse_deep_scan,
but exposed to help users get their input into the
correct format.
parse_mavedb(x, score_col = "score", average_multi = FALSE, ...)
Argument | Description |
---|---|
x | A data frame with a column 'hgvs_pro' giving the HGVS protein mutation string describing the variant(s) and a fitness score column ('score' by default). |
score_col | String. Name of the column containing fitness scores. This makes it convenient to use an additional data column as the score when the data include multiple measurements. |
average_multi | Logical. Average scores for variants that appear only in multiply mutated sequences and have not been measured individually. Check that the type of multiple mutation in the data makes this averaging appropriate. |
... | Ignored. |
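
To make the idea behind average_multi concrete, the following base R sketch shows one plausible way such averaging could work. It is an illustration under stated assumptions, not the implementation used by parse_mavedb: the toy data, the string handling and the rule "average a variant over every multiply mutated sequence containing it, unless it was also measured on its own" are all assumptions made for this example.

```r
# Toy MaveDB-style input (hypothetical values): one single variant
# and two double mutants written in bracketed HGVS form.
x <- data.frame(
  hgvs_pro = c("p.Ala2Gly", "p.[Ala2Gly;Leu5Pro]", "p.[Leu5Pro;Lys7Arg]"),
  score = c(0.9, 0.2, 0.4),
  stringsAsFactors = FALSE
)

# Strip the "p." prefix and brackets, then split on ";" to list the
# component variants of each sequence.
components <- strsplit(gsub("^p\\.|\\[|\\]", "", x$hgvs_pro), ";", fixed = TRUE)
n_variants <- lengths(components)

# One row per component variant, carrying the score of its parent sequence.
long <- data.frame(
  variant = unlist(components),
  score = rep(x$score, n_variants),
  multi = rep(n_variants > 1, n_variants),
  stringsAsFactors = FALSE
)

# Variants measured on their own keep their individual score.
single <- long[!long$multi, c("variant", "score")]

# Assumed rule: variants seen only in multiply mutated sequences get the
# mean score of all sequences that contain them.
multi_only <- long[long$multi & !(long$variant %in% single$variant), ]
averaged <- aggregate(score ~ variant, data = multi_only, FUN = mean)

result <- rbind(single, averaged)
print(result)
```

In this toy case Ala2Gly keeps its individually measured score, while Leu5Pro is averaged over the two double mutants that contain it. Whether that is biologically meaningful depends on how the multiple mutants in a given dataset were generated, which is why the argument description urges caution.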
A long-format tibble with columns 'position', 'wt', 'mut' and 'score'.
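
A minimal usage sketch, assuming parse_mavedb is already available in the session (for example because the package providing it is attached). The input values and the extra column name replicate_2 are hypothetical and only illustrate the expected input shape and the score_col argument.

```r
# Hypothetical input in the documented shape: an 'hgvs_pro' column plus
# a default 'score' column and one additional measurement column.
x <- data.frame(
  hgvs_pro = c("p.Ala2Gly", "p.Ala2Val", "p.Leu5Pro"),
  score = c(0.9, 0.1, 0.5),
  replicate_2 = c(0.8, 0.2, 0.4),  # hypothetical extra score column
  stringsAsFactors = FALSE
)

# Only call the function if it exists in the current session, so the
# sketch does not fail when the package is not loaded.
if (exists("parse_mavedb", mode = "function")) {
  # Default call: uses the 'score' column.
  parsed <- parse_mavedb(x)

  # Use the additional column as the fitness score instead.
  parsed_rep2 <- parse_mavedb(x, score_col = "replicate_2")

  print(parsed)
}
```

Per the description above, each returned tibble should have one row per variant with 'position', 'wt', 'mut' and 'score' columns.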