Last updated: 2020-10-15
Checks: 7 0
Knit directory: Blancetal/analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20200217)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 2c4fb3f. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: data/.DS_Store
Ignored: data/quaint-results.rda
Ignored: output/.DS_Store
Untracked files:
Untracked: data/Frey_Cold.txt
Untracked: data/Frey_Cold.txt~
Untracked: data/schaefer_clusters.csv~
Untracked: figures/Supplement_Ve.png
Untracked: figures/phist.png
Untracked: output/Genenames_0.05.txt
Unstaged changes:
Modified: analysis/Environmental_variance.Rmd
Modified: analysis/Identifying_quaint.Rmd
Modified: analysis/Selection_on_Expression_of_Env_Rsponse_Genes.Rmd
Modified: output/all_sigenes_annotate.csv
Deleted: output/names_0.05_all.txt
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/Selection_on_expression_of_coexpreession_clusters.Rmd
) and HTML (docs/Selection_on_expression_of_coexpreession_clusters.html
) files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 2c4fb3f | jgblanc | 2020-10-15 | Added GO terms |
html | b371b77 | jgblanc | 2020-10-15 | Build site. |
Rmd | 6797660 | jgblanc | 2020-10-15 | Added GO terms |
html | 1982147 | jgblanc | 2020-10-15 | Build site. |
Rmd | 0e39940 | jgblanc | 2020-10-15 | Added GO terms |
html | 9cc4fa0 | jgblanc | 2020-09-30 | Build site. |
Rmd | 3294a81 | jgblanc | 2020-09-30 | new clusters |
html | 9ede1d4 | jgblanc | 2020-04-24 | Build site. |
Rmd | 69c5a29 | jgblanc | 2020-04-24 | Finished clusters |
html | e26e754 | jgblanc | 2020-04-24 | Build site. |
Rmd | 3a6666a | jgblanc | 2020-04-24 | Finished clusters |
Rmd | 8298d4d | GitHub | 2020-04-16 | Merge branch ‘master’ into master |
Rmd | 6b00f47 | em | 2020-03-27 | stuff |
Rmd | a63fa86 | em | 2020-03-26 | stuff |
html | 7591f0d | jgblanc | 2020-03-16 | Build site. |
Rmd | df4fe40 | jgblanc | 2020-03-16 | added more pvals |
Rmd | 2cb89cd | jgblanc | 2020-03-13 | adding coexpression stuff |
Here is the code to test for expression on coeexpression clusters from Walleey et al.
For this analysis we want to test for selection within specific coexpression modules. We used coexpression modules from Walley et al. (2016) who used weight gene coexpression network analysis (WGCNA) to to group together genes that were similarly expressed in at least 4 tissues in one maize inbred line. Their analysis resulted in 66 co-expression networks. Below we will load their co-expression networks and select all the clusters that have at least 100 genes in them.
modules <- read.delim("../data/Modules.txt",na.strings=c("","NA"), stringsAsFactors = F)
num_genes <- apply(modules, 2, function(x) length(which(!is.na(x))))
num_genes_100 <- which(num_genes >= 100)
modules <- modules[,num_genes_100]
We now have 51 modules that have at least 100 genes. We will now get the median expression of all the genes that are present in our RNAseq dataset for each module and then test for selection on the median values in each cluster. The code below is similar code used to identify selected genes, we are simply conducting the test on median expression values in each line instead of a single gene expression in each line. Again we test for selection on the first 5 PCs and use the last half of PCs to estimate \(V_a\). Th function will return the p-values from the \(Q_{pc}\) test for all tissues.
cluster_analysis_func <- function(modules) {
alltissues = c('GRoot',"Kern","LMAD26","LMAN26", "L3Tip","GShoot","L3Base")
alltissuemodules = lapply(alltissues, function(mytissue){
##get the names of genes we have expression data for in each tissue
exp <- read.table(paste("../data/Raw_expression/",mytissue,".txt", sep="")) # read in expression level
geneNames = names(exp)
#get the median expression level for each module in this tissue
moduleExpression = apply(modules, 2, function(x){
olap = x[x %in% geneNames] #get genes in the module tha we have expression data for
moduleExp = dplyr::select(exp, all_of(olap)) #pull out the expression data for these genes
medianExp = apply(moduleExp, 1, median)
return(medianExp)
})
####identify selection in each of these modules
# Read in tissue specific kinship matrix
myF <- read.table(paste('../data/Kinship_matrices/F_',mytissue,'.txt',sep=""))
## Get Eigen Values and Vectors
myE <- eigen(myF)
E_vectors <- myE$vectors
E_values <- myE$values
## Testing for selection on first 5 PCs
myM = 1:5
## Using the last 1/2 of PCs to estimate Va
myL = 6:dim(myF)[1]
### test for selection on all modules
moduleSelection <- apply(moduleExpression, 2, function(x){
meanCenteredX = x[-length(x)] - mean(x)
myQpc = calcQpc(myZ = meanCenteredX, myU = E_vectors, myLambdas = E_values, myL = myL, myM = myM)
return(myQpc$pvals)
})
##make a dataframe with info to return
mydf = data.frame(t(moduleSelection))
names(mydf) = c('PC1','PC2','PC3','PC4','PC5')
mydf$module = row.names(mydf)
row.names(mydf) <- NULL
mydflong = tidyr::gather(mydf, 'PC','pval', PC1:PC5)
mydflong$tissue = mytissue
return(mydflong)})
return(alltissuemodules)
}
We have the p-values, let’s calculate the FDR and look for significant p-values.
alltissuemodules = cluster_analysis_func(modules)
#combine all into one list
bigdf <- do.call('rbind', alltissuemodules)
bigdf$fdr <- p.adjust(bigdf$pval, method='fdr') ##calculate an FDR
summary(bigdf$fdr) ##nothing is significant
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.4299 0.9996 0.9996 0.9990 0.9996 0.9996
kable(bigdf[which.min(bigdf[,3]),])
module | PC | pval | tissue | fdr | |
---|---|---|---|---|---|
732 | RNA_Root_Meristem | PC5 | 0.0002409 | LMAD26 | 0.4299294 |
There is no significant selection on coexpression clusters. The lowest FDR is for the “RNA Root Meristem” cluster in adult leaf tissue on PC 5. We can also look at the clusters with with lowest bonferroni corrected p-values
t <- bigdf %>% group_by(PC, tissue) %>% mutate(bonferroni = p.adjust(pval, method='bonferroni')) %>% arrange(bonferroni)
kable(head(t))
module | PC | pval | tissue | fdr | bonferroni |
---|---|---|---|---|---|
RNA_Root_Meristem | PC5 | 0.0002409 | LMAD26 | 0.4299294 | 0.0122837 |
RNA_FemaleSpikIt_._VegMeristem | PC1 | 0.0012924 | Kern | 0.7985099 | 0.0659101 |
Protein_Germinating_Kernals | PC1 | 0.0017164 | Kern | 0.7985099 | 0.0875356 |
RNA_Root_Cortex_2 | PC1 | 0.0017894 | Kern | 0.7985099 | 0.0912583 |
Protein_EndoCrown.PericarpAl | PC1 | 0.0039731 | Kern | 0.9996316 | 0.2026281 |
RNA_Root_Cortex | PC1 | 0.0053132 | Kern | 0.9996316 | 0.2709719 |
ZmRoot Clusters - get clustes with >= 100 genes
clusters <- fread("../data/schaefer_clusters.csv")
clusters <- clusters[,1:4]
dat <- clusters %>% group_by(ZmRoot) %>% mutate(num_genes = n()) %>% filter(num_genes >= 100)
names <- unique(dat$ZmRoot)
gene_list <- list()
max <- nrow(dat[dat$ZmRoot == 1,])
for(i in 1:length(names)) {
df <- clusters %>% filter(ZmRoot == names[i])
genes <- c(df$V1, rep(NA, max-nrow(df)))
gene_list[[i]] <- genes
}
modules <- do.call(cbind, gene_list)
Run analysis and FDR correction
alltissuemodules = cluster_analysis_func(modules)
#combine all into one list
bigdf <- do.call('rbind', alltissuemodules)
bigdf$fdr <- p.adjust(bigdf$pval, method='fdr') ##calculate an FDR
summary(bigdf$fdr) ##nothing is significant
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.4377 0.9951 0.9951 0.9889 0.9951 0.9951
kable(bigdf[which.min(bigdf[,3]),])
module | PC | pval | tissue | fdr | |
---|---|---|---|---|---|
158 | 4 | PC5 | 0.0011368 | LMAD26 | 0.4376829 |
Look at lowest p-value clusters
t <- bigdf %>% group_by(PC, tissue) %>% mutate(bonferroni = p.adjust(pval, method='bonferroni')) %>% arrange(bonferroni)
kable(head(t))
module | PC | pval | tissue | fdr | bonferroni |
---|---|---|---|---|---|
4 | PC5 | 0.0011368 | LMAD26 | 0.4376829 | 0.0125052 |
3 | PC5 | 0.0023537 | LMAD26 | 0.4530808 | 0.0258903 |
9 | PC1 | 0.0053389 | Kern | 0.5741055 | 0.0587274 |
8 | PC2 | 0.0063947 | LMAD26 | 0.5741055 | 0.0703414 |
9 | PC3 | 0.0074559 | LMAN26 | 0.5741055 | 0.0820151 |
11 | PC1 | 0.0186208 | Kern | 0.9950821 | 0.2048286 |
Let’s look at the GO enrichment terms for the most significant clusters (bonferroni p < 0.05)
info <- fread("~/Downloads/GO_clusters.csv")
zmroot <- info %>% filter(Network == "ZmRoot") %>% select(Network, `MCL Cluster`, `GO Term`, `Num Genes common`, `Num GO genes`, `Num MCL Genes`, pval, `GO Term Name`)
# Module 4
kable(zmroot %>% filter(`MCL Cluster` == "MCL4"))
Network | MCL Cluster | GO Term | Num Genes common | Num GO genes | Num MCL Genes | pval | GO Term Name |
---|---|---|---|---|---|---|---|
ZmRoot | MCL4 | GO:0006558 | 4 | 18 | 745 | 0.00181 | L-phenylalanine metabolic process |
ZmRoot | MCL4 | GO:1902221 | 4 | 18 | 745 | 0.00181 | erythrose 4-phosphate/phosphoenolpyruvate family amino acid metabolic process |
ZmRoot | MCL4 | GO:0006559 | 3 | 10 | 745 | 0.00282 | L-phenylalanine catabolic process |
ZmRoot | MCL4 | GO:0009074 | 3 | 10 | 745 | 0.00282 | aromatic amino acid family catabolic process |
ZmRoot | MCL4 | GO:1902222 | 3 | 10 | 745 | 0.00282 | erythrose 4-phosphate/phosphoenolpyruvate family amino acid catabolic process |
ZmRoot | MCL4 | GO:0016841 | 3 | 11 | 745 | 0.00379 | ammonia-lyase activity |
# Module 3
kable(zmroot %>% filter(`MCL Cluster` == "MCL3"))
Network | MCL Cluster | GO Term | Num Genes common | Num GO genes | Num MCL Genes | pval | GO Term Name |
---|---|---|---|---|---|---|---|
ZmRoot | MCL3 | GO:0006260 | 24 | 56 | 996 | 0.00e+00 | DNA replication |
ZmRoot | MCL3 | GO:0000786 | 37 | 151 | 996 | 0.00e+00 | nucleosome |
ZmRoot | MCL3 | GO:0032993 | 37 | 151 | 996 | 0.00e+00 | protein-DNA complex |
ZmRoot | MCL3 | GO:0044815 | 37 | 151 | 996 | 0.00e+00 | DNA packaging complex |
ZmRoot | MCL3 | GO:0006334 | 37 | 154 | 996 | 0.00e+00 | nucleosome assembly |
ZmRoot | MCL3 | GO:0034728 | 37 | 154 | 996 | 0.00e+00 | nucleosome organization |
ZmRoot | MCL3 | GO:0065004 | 37 | 154 | 996 | 0.00e+00 | protein-DNA complex assembly |
ZmRoot | MCL3 | GO:0071824 | 37 | 154 | 996 | 0.00e+00 | protein-DNA complex subunit organization |
ZmRoot | MCL3 | GO:0044427 | 42 | 210 | 996 | 0.00e+00 | chromosomal part |
ZmRoot | MCL3 | GO:0003777 | 25 | 71 | 996 | 0.00e+00 | microtubule motor activity |
ZmRoot | MCL3 | GO:0005875 | 25 | 71 | 996 | 0.00e+00 | microtubule associated complex |
ZmRoot | MCL3 | GO:0006325 | 39 | 194 | 996 | 0.00e+00 | chromatin organization |
ZmRoot | MCL3 | GO:0007018 | 26 | 85 | 996 | 0.00e+00 | microtubule-based movement |
ZmRoot | MCL3 | GO:0007017 | 30 | 120 | 996 | 0.00e+00 | microtubule-based process |
ZmRoot | MCL3 | GO:0044430 | 32 | 145 | 996 | 0.00e+00 | cytoskeletal part |
ZmRoot | MCL3 | GO:0034622 | 41 | 245 | 996 | 0.00e+00 | cellular macromolecular complex assembly |
ZmRoot | MCL3 | GO:0006928 | 26 | 99 | 996 | 0.00e+00 | movement of cell or subcellular component |
ZmRoot | MCL3 | GO:0003774 | 26 | 101 | 996 | 0.00e+00 | motor activity |
ZmRoot | MCL3 | GO:0006259 | 41 | 282 | 996 | 0.00e+00 | DNA metabolic process |
ZmRoot | MCL3 | GO:0016986 | 12 | 20 | 996 | 0.00e+00 | obsolete transcription initiation factor activity |
ZmRoot | MCL3 | GO:0006270 | 11 | 22 | 996 | 0.00e+00 | DNA replication initiation |
ZmRoot | MCL3 | GO:0034061 | 9 | 25 | 996 | 3.00e-07 | DNA polymerase activity |
ZmRoot | MCL3 | GO:0003887 | 8 | 23 | 996 | 2.00e-06 | DNA-directed DNA polymerase activity |
ZmRoot | MCL3 | GO:0006352 | 14 | 83 | 996 | 5.60e-06 | DNA-templated transcription, initiation |
ZmRoot | MCL3 | GO:0000280 | 5 | 9 | 996 | 1.18e-05 | nuclear division |
ZmRoot | MCL3 | GO:0007067 | 5 | 9 | 996 | 1.18e-05 | mitotic nuclear division |
ZmRoot | MCL3 | GO:1903047 | 5 | 9 | 996 | 1.18e-05 | mitotic cell cycle process |
ZmRoot | MCL3 | GO:0009360 | 4 | 7 | 996 | 8.44e-05 | DNA polymerase III complex |
ZmRoot | MCL3 | GO:0042575 | 4 | 7 | 996 | 8.44e-05 | DNA polymerase complex |
ZmRoot | MCL3 | GO:0048285 | 5 | 13 | 996 | 1.05e-04 | organelle fission |
ZmRoot | MCL3 | GO:0032774 | 14 | 108 | 996 | 1.16e-04 | RNA biosynthetic process |
ZmRoot | MCL3 | GO:0071554 | 14 | 108 | 996 | 1.16e-04 | cell wall organization or biogenesis |
ZmRoot | MCL3 | GO:0022402 | 6 | 21 | 996 | 1.39e-04 | cell cycle process |
ZmRoot | MCL3 | GO:0016779 | 16 | 138 | 996 | 1.50e-04 | nucleotidyltransferase activity |
ZmRoot | MCL3 | GO:0008107 | 4 | 14 | 996 | 1.92e-03 | galactoside 2-alpha-L-fucosyltransferase activity |
ZmRoot | MCL3 | GO:0008417 | 4 | 14 | 996 | 1.92e-03 | fucosyltransferase activity |
ZmRoot | MCL3 | GO:0031127 | 4 | 14 | 996 | 1.92e-03 | alpha-(1,2)-fucosyltransferase activity |
ZmRoot | MCL3 | GO:0042546 | 4 | 15 | 996 | 2.54e-03 | cell wall biogenesis |
ZmRoot | MCL3 | GO:0004990 | 3 | 8 | 996 | 3.17e-03 | oxytocin receptor activity |
# Module 9
kable(zmroot %>% filter(`MCL Cluster` == "MCL9"))
Network | MCL Cluster | GO Term | Num Genes common | Num GO genes | Num MCL Genes | pval | GO Term Name |
---|---|---|---|---|---|---|---|
ZmRoot | MCL9 | GO:0016706 | 3 | 9 | 204 | 4.53e-05 | oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, 2-oxoglutarate as one donor, and incorporation of one atom each of oxygen into both donors |
ZmRoot | MCL9 | GO:0000003 | 5 | 64 | 204 | 1.90e-04 | reproduction |
ZmRoot | MCL9 | GO:0019953 | 5 | 64 | 204 | 1.90e-04 | sexual reproduction |
ZmRoot | MCL9 | GO:0044703 | 5 | 64 | 204 | 1.90e-04 | multi-organism reproductive process |
ZmRoot | MCL9 | GO:0022414 | 5 | 66 | 204 | 2.20e-04 | reproductive process |
ZmRoot | MCL9 | GO:0051213 | 3 | 28 | 204 | 1.57e-03 | dioxygenase activity |
ZmRoot | MCL9 | GO:0004470 | 2 | 11 | 204 | 3.57e-03 | malic enzyme activity |
# Module 8
kable(zmroot %>% filter(`MCL Cluster` == "MCL8"))
Network | MCL Cluster | GO Term | Num Genes common | Num GO genes | Num MCL Genes | pval | GO Term Name |
---|---|---|---|---|---|---|---|
ZmRoot | MCL8 | GO:0006869 | 7 | 60 | 206 | 7.00e-07 | lipid transport |
ZmRoot | MCL8 | GO:0008289 | 7 | 102 | 206 | 2.44e-05 | lipid binding |
ZmRoot | MCL8 | GO:0019439 | 4 | 35 | 206 | 2.03e-04 | aromatic compound catabolic process |
ZmRoot | MCL8 | GO:1901361 | 4 | 43 | 206 | 4.54e-04 | organic cyclic compound catabolic process |
ZmRoot | MCL8 | GO:0004367 | 2 | 6 | 206 | 1.02e-03 | glycerol-3-phosphate dehydrogenase [NAD+] activity |
ZmRoot | MCL8 | GO:0046434 | 2 | 7 | 206 | 1.42e-03 | organophosphate catabolic process |
ZmRoot | MCL8 | GO:0004966 | 2 | 9 | 206 | 2.41e-03 | galanin receptor activity |
ZmRoot | MCL8 | GO:0006559 | 2 | 10 | 206 | 3.00e-03 | L-phenylalanine catabolic process |
ZmRoot | MCL8 | GO:0009074 | 2 | 10 | 206 | 3.00e-03 | aromatic amino acid family catabolic process |
ZmRoot | MCL8 | GO:1902222 | 2 | 10 | 206 | 3.00e-03 | erythrose 4-phosphate/phosphoenolpyruvate family amino acid catabolic process |
ZmRoot | MCL8 | GO:0016841 | 2 | 11 | 206 | 3.64e-03 | ammonia-lyase activity |
ZmSAM Clusters - get clustes with >= 100 genes
dat <- clusters %>% group_by(ZmSAM) %>% mutate(num_genes = n()) %>% filter(num_genes >= 100)
names <- unique(dat$ZmSAM)
gene_list <- list()
max <- nrow(dat[dat$ZmSAM == 0,])
for(i in 1:length(names)) {
df <- clusters %>% filter(ZmSAM == names[i])
genes <- c(df$V1, rep(NA, max-nrow(df)))
gene_list[[i]] <- genes
}
modules <- do.call(cbind, gene_list)
Run analysis and FDR correction
alltissuemodules = cluster_analysis_func(modules)
#combine all into one list
bigdf <- do.call('rbind', alltissuemodules)
bigdf$fdr <- p.adjust(bigdf$pval, method='fdr') ##calculate an FDR
summary(bigdf$fdr) ##nothing is significant
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
kable(bigdf[which.min(bigdf[,3]),])
module | PC | pval | tissue | fdr | |
---|---|---|---|---|---|
184 | 2 | PC5 | 0.0033344 | LMAD26 | 0.9999109 |
Look at lowest p-value clusters
t <- bigdf %>% group_by(PC, tissue) %>% mutate(bonferroni = p.adjust(pval, method='bonferroni')) %>% arrange(bonferroni)
kable(head(t))
module | PC | pval | tissue | fdr | bonferroni |
---|---|---|---|---|---|
2 | PC5 | 0.0033344 | LMAD26 | 0.9999109 | 0.0433477 |
7 | PC3 | 0.0315067 | LMAN26 | 0.9999109 | 0.4095866 |
10 | PC1 | 0.0320505 | Kern | 0.9999109 | 0.4166559 |
4 | PC5 | 0.0436062 | LMAD26 | 0.9999109 | 0.5668812 |
2 | PC1 | 0.0536048 | Kern | 0.9999109 | 0.6968619 |
5 | PC5 | 0.0560459 | LMAN26 | 0.9999109 | 0.7285967 |
Let’s look at the GO enrichment terms for the most significant clusters (bonferroni p < 0.05)
zmsam <- info %>% filter(Network == "ZmSAM") %>% select(Network, `MCL Cluster`, `GO Term`, `Num Genes common`, `Num GO genes`, `Num MCL Genes`, pval, `GO Term Name`)
# Module 2
kable(zmsam %>% filter(`MCL Cluster` == "MCL2"))
Network | MCL Cluster | GO Term | Num Genes common | Num GO genes | Num MCL Genes | pval | GO Term Name |
---|---|---|---|---|---|---|---|
ZmSAM | MCL2 | GO:0033180 | 8 | 10 | 942 | 0.00000 | proton-transporting V-type ATPase, V1 domain |
ZmSAM | MCL2 | GO:0004347 | 3 | 5 | 942 | 0.00172 | glucose-6-phosphate isomerase activity |
ZmSAM | MCL2 | GO:0030120 | 6 | 24 | 942 | 0.00193 | vesicle coat |
ZmSAM | MCL2 | GO:0000502 | 3 | 7 | 942 | 0.00553 | proteasome complex |
ZmSAM | MCL2 | GO:0006094 | 3 | 7 | 942 | 0.00553 | gluconeogenesis |
ZmPAN Clusters - get clustes with >= 100 genes
dat <- clusters %>% group_by(ZmPAN) %>% mutate(num_genes = n()) %>% filter(num_genes >= 100)
names <- unique(dat$ZmPAN)
gene_list <- list()
max <- nrow(dat[dat$ZmPAN == 0,])
for(i in 1:length(names)) {
df <- clusters %>% filter(ZmPAN == names[i])
genes <- c(df$V1, rep(NA, max-nrow(df)))
gene_list[[i]] <- genes
}
modules <- do.call(cbind, gene_list)
Run analysis and FDR correction
alltissuemodules = cluster_analysis_func(modules)
#combine all into one list
bigdf <- do.call('rbind', alltissuemodules)
bigdf$fdr <- p.adjust(bigdf$pval, method='fdr') ##calculate an FDR
summary(bigdf$fdr) ##nothing is significant
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.9992 0.9992 0.9992 0.9992 0.9992 0.9992
kable(bigdf[which.min(bigdf[,3]),])
module | PC | pval | tissue | fdr | |
---|---|---|---|---|---|
241 | 3 | PC5 | 0.0045783 | LMAD26 | 0.9991676 |
Look at lowest p-value clusters
t <- bigdf %>% group_by(PC, tissue) %>% mutate(bonferroni = p.adjust(pval, method='bonferroni')) %>% arrange(bonferroni)
kable(head(t))
module | PC | pval | tissue | fdr | bonferroni |
---|---|---|---|---|---|
3 | PC5 | 0.0045783 | LMAD26 | 0.9991676 | 0.0778318 |
2 | PC5 | 0.0090431 | GShoot | 0.9991676 | 0.1537334 |
1 | PC5 | 0.0092036 | LMAN26 | 0.9991676 | 0.1564620 |
13 | PC1 | 0.0129806 | Kern | 0.9991676 | 0.2206705 |
13 | PC5 | 0.0142503 | LMAD26 | 0.9991676 | 0.2422548 |
10 | PC2 | 0.0157494 | LMAN26 | 0.9991676 | 0.2677405 |
Let’s look at the GO enrichment terms for the most significant clusters (bonferroni p < 0.05)
zmpan <- info %>% filter(Network == "ZmPAN") %>% select(Network, `MCL Cluster`, `GO Term`, `Num Genes common`, `Num GO genes`, `Num MCL Genes`, pval, `GO Term Name`)
# Module 3
kable(zmpan %>% filter(`MCL Cluster` == "MCL3"))
Network | MCL Cluster | GO Term | Num Genes common | Num GO genes | Num MCL Genes | pval | GO Term Name |
---|---|---|---|---|---|---|---|
ZmPAN | MCL3 | GO:0006414 | 20 | 37 | 793 | 0.00e+00 | translational elongation |
ZmPAN | MCL3 | GO:0044391 | 27 | 84 | 793 | 0.00e+00 | ribosomal subunit |
ZmPAN | MCL3 | GO:0015935 | 17 | 49 | 793 | 0.00e+00 | small ribosomal subunit |
ZmPAN | MCL3 | GO:0022613 | 9 | 12 | 793 | 0.00e+00 | ribonucleoprotein complex biogenesis |
ZmPAN | MCL3 | GO:0042254 | 9 | 12 | 793 | 0.00e+00 | ribosome biogenesis |
ZmPAN | MCL3 | GO:0003746 | 12 | 25 | 793 | 0.00e+00 | translation elongation factor activity |
ZmPAN | MCL3 | GO:0004298 | 12 | 27 | 793 | 0.00e+00 | threonine-type endopeptidase activity |
ZmPAN | MCL3 | GO:0070003 | 12 | 27 | 793 | 0.00e+00 | threonine-type peptidase activity |
ZmPAN | MCL3 | GO:0008135 | 20 | 90 | 793 | 0.00e+00 | translation factor activity, RNA binding |
ZmPAN | MCL3 | GO:0005839 | 12 | 29 | 793 | 0.00e+00 | proteasome core complex |
ZmPAN | MCL3 | GO:0044085 | 9 | 27 | 793 | 7.00e-07 | cellular component biogenesis |
ZmPAN | MCL3 | GO:0015934 | 10 | 36 | 793 | 1.10e-06 | large ribosomal subunit |
ZmPAN | MCL3 | GO:0006334 | 20 | 154 | 793 | 4.10e-06 | nucleosome assembly |
ZmPAN | MCL3 | GO:0034728 | 20 | 154 | 793 | 4.10e-06 | nucleosome organization |
ZmPAN | MCL3 | GO:0065004 | 20 | 154 | 793 | 4.10e-06 | protein-DNA complex assembly |
ZmPAN | MCL3 | GO:0071824 | 20 | 154 | 793 | 4.10e-06 | protein-DNA complex subunit organization |
ZmPAN | MCL3 | GO:0044429 | 18 | 141 | 793 | 1.57e-05 | mitochondrial part |
ZmPAN | MCL3 | GO:0019843 | 5 | 10 | 793 | 2.27e-05 | rRNA binding |
ZmPAN | MCL3 | GO:0006626 | 4 | 6 | 793 | 3.72e-05 | protein targeting to mitochondrion |
ZmPAN | MCL3 | GO:0007007 | 4 | 6 | 793 | 3.72e-05 | inner mitochondrial membrane organization |
ZmPAN | MCL3 | GO:0042719 | 4 | 6 | 793 | 3.72e-05 | mitochondrial intermembrane space protein transporter complex |
ZmPAN | MCL3 | GO:0045039 | 4 | 6 | 793 | 3.72e-05 | protein import into mitochondrial inner membrane |
ZmPAN | MCL3 | GO:0007005 | 4 | 7 | 793 | 8.41e-05 | mitochondrion organization |
ZmPAN | MCL3 | GO:0007006 | 4 | 7 | 793 | 8.41e-05 | mitochondrial membrane organization |
ZmPAN | MCL3 | GO:0044743 | 4 | 7 | 793 | 8.41e-05 | intracellular protein transmembrane import |
ZmPAN | MCL3 | GO:0065002 | 4 | 7 | 793 | 8.41e-05 | intracellular protein transmembrane transport |
ZmPAN | MCL3 | GO:0070585 | 4 | 7 | 793 | 8.41e-05 | protein localization to mitochondrion |
ZmPAN | MCL3 | GO:0071806 | 4 | 7 | 793 | 8.41e-05 | protein transmembrane transport |
ZmPAN | MCL3 | GO:0072655 | 4 | 7 | 793 | 8.41e-05 | establishment of protein localization to mitochondrion |
ZmPAN | MCL3 | GO:0090151 | 4 | 7 | 793 | 8.41e-05 | establishment of protein localization to mitochondrial membrane |
ZmPAN | MCL3 | GO:1990542 | 4 | 7 | 793 | 8.41e-05 | mitochondrial transmembrane transport |
ZmPAN | MCL3 | GO:0098798 | 6 | 20 | 793 | 1.02e-04 | mitochondrial protein complex |
ZmPAN | MCL3 | GO:0006325 | 20 | 194 | 793 | 1.21e-04 | chromatin organization |
ZmPAN | MCL3 | GO:0000786 | 17 | 151 | 793 | 1.32e-04 | nucleosome |
ZmPAN | MCL3 | GO:0032993 | 17 | 151 | 793 | 1.32e-04 | protein-DNA complex |
ZmPAN | MCL3 | GO:0044815 | 17 | 151 | 793 | 1.32e-04 | DNA packaging complex |
ZmPAN | MCL3 | GO:0034622 | 23 | 245 | 793 | 1.61e-04 | cellular macromolecular complex assembly |
ZmPAN | MCL3 | GO:0005853 | 4 | 8 | 793 | 1.63e-04 | eukaryotic translation elongation factor 1 complex |
ZmPAN | MCL3 | GO:0016272 | 4 | 10 | 793 | 4.58e-04 | prefoldin complex |
ZmPAN | MCL3 | GO:0006413 | 8 | 48 | 793 | 6.18e-04 | translational initiation |
ZmPAN | MCL3 | GO:0003743 | 8 | 50 | 793 | 8.19e-04 | translation initiation factor activity |
ZmPAN | MCL3 | GO:0016986 | 5 | 20 | 793 | 9.95e-04 | obsolete transcription initiation factor activity |
ZmPAN | MCL3 | GO:0017038 | 4 | 13 | 793 | 1.41e-03 | protein import |
ZmPAN | MCL3 | GO:0015929 | 3 | 7 | 793 | 2.04e-03 | hexosaminidase activity |
ZmPAN | MCL3 | GO:0006465 | 3 | 8 | 793 | 3.16e-03 | signal peptide processing |
sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.29 data.table_1.12.8 dplyr_1.0.0 quaint_0.0.0.9000
[5] ggpubr_0.3.0 ggplot2_3.3.2 reshape2_1.4.4 workflowr_1.6.2
loaded via a namespace (and not attached):
[1] tidyselect_1.1.0 xfun_0.15 purrr_0.3.4 haven_2.3.1
[5] lattice_0.20-41 carData_3.0-4 colorspace_1.4-1 vctrs_0.3.1
[9] generics_0.0.2 htmltools_0.5.0 yaml_2.2.1 rlang_0.4.6
[13] later_1.1.0.1 pillar_1.4.4 foreign_0.8-72 glue_1.4.1
[17] withr_2.2.0 readxl_1.3.1 lifecycle_0.2.0 plyr_1.8.6
[21] stringr_1.4.0 cellranger_1.1.0 munsell_0.5.0 ggsignif_0.6.0
[25] gtable_0.3.0 zip_2.0.4 evaluate_0.14 rio_0.5.16
[29] forcats_0.5.0 httpuv_1.5.4 curl_4.3 highr_0.8
[33] broom_0.5.6 Rcpp_1.0.4.6 promises_1.1.1 scales_1.1.1
[37] backports_1.1.8 abind_1.4-5 fs_1.4.1 hms_0.5.3
[41] digest_0.6.25 stringi_1.4.6 openxlsx_4.1.5 rstatix_0.6.0
[45] grid_3.6.2 rprojroot_1.3-2 tools_3.6.2 magrittr_1.5
[49] tibble_3.0.1 crayon_1.3.4 whisker_0.4 tidyr_1.1.0
[53] car_3.0-8 pkgconfig_2.0.3 ellipsis_0.3.1 rmarkdown_2.3
[57] R6_2.4.1 nlme_3.1-148 git2r_0.27.1 compiler_3.6.2