parse_features.Rd
This function parses the output .txt files (peptide groups or PSMs) from Proteome Discoverer and then filters out features based on various criteria.
The function performs the following steps:
Remove features without a master protein
(Optional) Remove features without a unique master protein (i.e. keep only features where Number.of.Protein.Groups == 1)
(Optional) Remove features matching a cRAP protein
(Optional) Remove features matching any protein associated with a cRAP protein (see below)
Remove features without quantification values (only if TMT or SILAC is TRUE and level = "peptide").
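The first two steps can be illustrated with a minimal base R sketch. This is not the function's actual implementation: the toy data are invented, and treating both NA and the empty string as "no master protein" is an assumption made for illustration only.

```r
# Toy data standing in for a Proteome Discoverer export (all values invented).
features <- data.frame(
  feature = c("f1", "f2", "f3", "f4"),
  Master.Protein.Accessions = c("P1", "", "P2; P3", NA),
  Number.of.Protein.Groups = c(1, 0, 2, 0),
  stringsAsFactors = FALSE
)

# Step 1: drop features with no master protein.
# Assumption: a missing master protein appears as NA or as an empty string.
has_master <- !is.na(features$Master.Protein.Accessions) &
  features$Master.Protein.Accessions != ""
features <- features[has_master, ]

# Step 2 (optional): keep only features with a unique master protein.
unique_master <- TRUE
if (unique_master) {
  features <- features[features$Number.of.Protein.Groups == 1, ]
}

print(features$feature)
```

In this toy input only f1 has a master protein and exactly one protein group, so it is the only feature retained.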
parse_features(
  data,
  master_protein_col = "Master.Protein.Accessions",
  protein_col = "Protein.Accessions",
  unique_master = TRUE,
  silac = FALSE,
  TMT = FALSE,
  level = "peptide",
  filter_crap = TRUE,
  crap_proteins = NULL,
  filter_associated_crap = TRUE
)
data: data.frame generated from the .txt file output of Proteome Discoverer.
master_protein_col: string. Name of the column containing master proteins.
protein_col: string. Name of the column containing all protein matches.
unique_master: logical. Filter out features without a unique master protein.
silac: logical. Is the experiment a SILAC experiment?
TMT: logical. Is the experiment a TMT experiment?
level: string. Type of input file; must be either "peptide" or "PSM".
filter_crap: logical. Filter out features which match a cRAP protein.
crap_proteins: character vector. Contains the cRAP accessions, for example c("P02768"), which is serum albumin.
filter_associated_crap: logical. Filter out features which match a cRAP associated protein.
Returns a data.frame with the filtered Proteome Discoverer output.
Associated cRAP proteins are proteins that share at least one feature with a cRAP protein. It has been observed that the cRAP database does not contain all possible cRAP proteins; for example, some features can be assigned to a keratin that is not in the provided cRAP database.
In the example below, using filter_associated_crap = TRUE will filter out f2 and f3 in addition to f1, regardless of the value in the Master.Protein.Accessions column.
feature   Protein.Accessions          Master.Protein.Accessions
f1        protein1, protein2, cRAP    protein1
f2        protein1, protein3          protein3
f3        protein2                    protein2
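The logic behind this example can be sketched in base R. This is a hedged illustration rather than the function's actual code: the comma separator follows the table above and may differ from the separator used in real Proteome Discoverer output, and the variable names are hypothetical.

```r
# The three example features, with accessions written as in the table above.
features <- data.frame(
  feature = c("f1", "f2", "f3"),
  Protein.Accessions = c("protein1, protein2, cRAP",
                         "protein1, protein3",
                         "protein2"),
  Master.Protein.Accessions = c("protein1", "protein3", "protein2"),
  stringsAsFactors = FALSE
)
crap_proteins <- "cRAP"

# Split each accession string into a character vector.
# Assumption: accessions are separated by a comma and optional whitespace.
accessions <- strsplit(features$Protein.Accessions, ",\\s*")

# Features that directly match a cRAP protein (only f1 here).
is_crap <- vapply(accessions, function(x) any(x %in% crap_proteins), logical(1))

# Every protein that shares at least one feature with a cRAP protein.
associated <- unique(unlist(accessions[is_crap]))

# Features matching any associated protein, whatever their master protein is.
is_associated <- vapply(accessions, function(x) any(x %in% associated), logical(1))

filtered <- features[!is_associated, ]
print(features$feature[is_associated])
```

Because f1 matches cRAP, protein1 and protein2 become associated cRAP proteins, which in turn removes f2 (via protein1) and f3 (via protein2). No features remain in `filtered`.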