count_features_per_protein.Rd
For differential testing with DEqMS (https://www.bioconductor.org/packages/release/bioc/html/DEqMS.html), one needs to know the number of features (e.g. PSMs or peptides) per protein. This function returns the number of features per protein per sample.
count_features_per_protein(obj, master_prot_col = "Master.Protein.Accessions")
obj: MSnSet. Contains PSMs or peptides.
master_prot_col: character. Column name for master protein ID.
Returns a tibble with feature counts per sample per protein.
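To make "feature counts per sample per protein" concrete, here is a minimal base R sketch on toy data. It is not the package's implementation: the objects `toy_exprs` and `toy_protein` are hypothetical stand-ins for the expression matrix and the master protein column of an MSnSet, and the sketch assumes that a feature only counts towards a sample when its abundance in that sample is finite.

```r
# Toy stand-in for the expression matrix: 4 PSMs (rows), 2 samples (columns).
toy_exprs <- matrix(
  c(10, NA,
    12, 15,
     9, 11,
    NA, NA),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("psm", 1:4), c("126", "127N"))
)

# Toy stand-in for the master protein column (one entry per PSM).
toy_protein <- c("P1", "P1", "P2", "P2")

# Turn the matrix into a 0/1 indicator of "quantified in this sample"
# (assumption: NA or infinite values are not counted), then sum the
# indicator within each protein to get counts per protein per sample.
toy_counts <- rowsum(is.finite(toy_exprs) * 1L, group = toy_protein)

# Reshape to one row per sample/protein pair, similar in spirit to the
# long table described above.
toy_long <- as.data.frame(as.table(toy_counts), stringsAsFactors = FALSE)
names(toy_long) <- c("protein", "sample", "n")
toy_long
```

In this toy case protein P1 has two quantified PSMs in sample 126 but only one in sample 127N, which is why the count is reported per sample rather than once per protein.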
# Use a small example TMT dataset included with the camprotR package
df <- psm_tmt_total
# Make an MSnSet
df_exprs <- as.matrix(df[, grep("Abundance", colnames(df))])
colnames(df_exprs) <- gsub("Abundance\\.", "", colnames(df_exprs))
df_fData <- df[, grep("Abundance", colnames(df), invert = TRUE)]
psm <- MSnbase::MSnSet(exprs = df_exprs, fData = df_fData)
# Count the number of PSMs per protein
count_features_per_protein(psm, master_prot_col = "Master.Protein.Accessions")
#> # A tibble: 23,542 × 3
#> # Groups: sample [10]
#> sample Master.Protein.Accessions n
#> <chr> <chr> <int>
#> 1 126 "" 2
#> 2 126 "A0AVT1" 1
#> 3 126 "A0PJZ3" 1
#> 4 126 "A0PK00" 1
#> 5 126 "A1L020" 1
#> 6 126 "A1L0T0" 4
#> 7 126 "A1L390" 1
#> 8 126 "A1X283" 1
#> 9 126 "A3KMH1" 1
#> 10 126 "A3KN83" 1
#> # ℹ 23,532 more rows
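The table above has one row per sample and protein, whereas a DEqMS analysis typically works with a single count per protein. One common choice is the minimum count across samples. The following is a sketch only, using a hypothetical toy data frame `toy_counts` shaped like the output above and a hypothetical column name `min_count`; whether the minimum is the right summary depends on the analysis.

```r
# Toy data shaped like the output of count_features_per_protein():
# one row per sample/protein pair, with the feature count in `n`.
toy_counts <- data.frame(
  sample = c("126", "126", "127N", "127N"),
  Master.Protein.Accessions = c("P1", "P2", "P1", "P2"),
  n = c(4L, 1L, 3L, 2L)
)

# Collapse to one row per protein, keeping the smallest count seen
# in any sample (assumption: the minimum is the desired summary).
min_counts <- aggregate(
  n ~ Master.Protein.Accessions,
  data = toy_counts,
  FUN = min
)
names(min_counts)[2] <- "min_count"
min_counts
```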