estimate_go_overrep.Rd
This is a crude function to estimate the effect size of GO
over-representation: we know a term is over-represented, but we want to
estimate how over-represented it is. This function should be run after
get_enriched_go.
estimate_go_overrep(obj, pwf, gene2cat)
obj: data.frame containing goseq results, as generated by get_enriched_go or goseq.
pwf: data.frame as used in get_enriched_go or goseq.
gene2cat: data.frame as used in get_enriched_go or goseq.
Returns obj with an extra column added called adj_overrep. This column is
calculated for each GO term by:
numDEInCat / numInCat / (avgTermWeight / avgNonTermWeight) / (totalDEFeatures / totalFeatures)
where:
numDEInCat is the number of differentially expressed genes (a.k.a. proteins) assigned to that GO term.
numInCat is the total number of genes (a.k.a. proteins) annotated to that GO term.
avgTermWeight is the average pwf$pwf value for all the differentially expressed genes that were assigned to that GO term.
avgNonTermWeight is the average pwf$pwf value for all the other genes supplied in pwf.
totalDEFeatures is the total number of differentially expressed genes indicated in pwf.
totalFeatures is the total number of genes indicated in pwf, i.e. the number of rows.
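
The calculation can be sketched on toy data as follows. This is a minimal illustration and not the package's implementation. The helper adj_overrep_for_term is hypothetical, the gene2cat column names (gene, category) are assumed for the example, and "all the other genes" is read here as every row of pwf that is not a differentially expressed member of the term. The pwf table follows the shape returned by goseq's nullp (columns DEgenes, bias.data and pwf, with gene IDs as row names).

```r
# Toy pwf table: one row per gene, DEgenes is 1 for differentially expressed genes.
pwf <- data.frame(
  DEgenes   = c(1, 1, 0, 1, 0, 0, 0, 0),
  bias.data = c(10, 12, 8, 20, 5, 6, 7, 9),
  pwf       = c(0.30, 0.30, 0.20, 0.40, 0.10, 0.10, 0.20, 0.20),
  row.names = paste0("g", 1:8)
)

# Toy gene-to-category mapping (column names are assumptions for this sketch).
gene2cat <- data.frame(
  gene     = c("g1", "g2", "g3", "g5"),
  category = c("GO:0000001", "GO:0000001", "GO:0000001", "GO:0000002"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: applies the documented formula to a single GO term.
adj_overrep_for_term <- function(term, pwf, gene2cat) {
  term_genes <- unique(gene2cat$gene[gene2cat$category == term])
  in_term <- rownames(pwf) %in% term_genes
  is_de   <- pwf$DEgenes == 1
  de_in_term <- is_de & in_term

  num_de_in_cat <- sum(de_in_term)   # numDEInCat
  num_in_cat    <- sum(in_term)      # numInCat

  # avgTermWeight: mean pwf of the DE genes assigned to the term.
  avg_term_weight <- mean(pwf$pwf[de_in_term])
  # avgNonTermWeight: mean pwf of every other row (assumed interpretation).
  avg_non_term_weight <- mean(pwf$pwf[!de_in_term])

  total_de_features <- sum(is_de)    # totalDEFeatures
  total_features    <- nrow(pwf)     # totalFeatures

  (num_de_in_cat / num_in_cat) /
    (avg_term_weight / avg_non_term_weight) /
    (total_de_features / total_features)
}

adj_overrep <- adj_overrep_for_term("GO:0000001", pwf, gene2cat)
```

For this toy input the term has 2 of its 3 genes differentially expressed, the weight ratio is 0.3 / 0.2 = 1.5, and 3 of 8 genes are differentially expressed overall, so the value is (2/3) / 1.5 / (3/8) = 32/27, roughly 1.19. Values above 1 indicate more differentially expressed genes in the term than the weighted background would suggest.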