Last updated: 2020-05-08

Checks: 7 0

Knit directory: Blancetal/analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.6.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20200217) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/

Untracked files:
    Untracked:  data/cold-contingency-table.txt
    Untracked:  data/quaint-results.rda

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

File Version Author Date Message
Rmd 8242d18 jgblanc 2020-05-08 fixed drought table
html 0f6b00c jgblanc 2020-05-07 Build site.
Rmd 5141f30 jgblanc 2020-05-07 added saving contingency table
html 1dfde4e jgblanc 2020-05-07 Build site.
Rmd 7cd9c70 jgblanc 2020-05-07 added saving contingency table
html feb6b89 jgblanc 2020-05-07 Build site.
Rmd 6478bf2 jgblanc 2020-05-07 change to fisher test
html 409888b jgblanc 2020-04-24 Build site.
Rmd 29bfe0e jgblanc 2020-04-24 prepping to publish env
Rmd 9c21090 em 2020-04-16 tsuff
Rmd 2b8f085 em 2020-04-07 stuff
Rmd dbf34fa em 2020-04-03 stuff
Rmd a128c45 em 2020-04-03 stuff
Rmd b2881a5 em 2020-04-02 stuff
Rmd d6a523b em 2020-04-02 stuff
Rmd cf43de2 em 2020-04-01 stuff
Rmd e0d8afc em 2020-03-25 more stuff
Rmd 640b45a em 2020-03-24 drought genes
Rmd 02b87d2 em 2020-03-23 stuf
Rmd 682f50e em 2020-03-23 stuff

Intro

The goal of this analysis is to determine if genes that have been shown to be differentially expressed in response to cold (Avila et al., 2018) or drought treatment (Forestan et al., 2020) are enriched for genes whose expression is under selection.

Cold Response Genes

First, read in the Avila et al. differential expression data and get the names of the genes used in their analysis.

cold <- read.csv("../data/Cold.csv", header = F, stringsAsFactors = F)
cold <-cold[,2:19] 
colnames(cold) <- c("V3_Gene_Name","V4_Gene_Name","Arabidopsis_Ortholog","Rice_Ortholog","Cold_1D_CG60_Means","Cold_1D_CG102_Means","Control_1D_CG60_Means","Control_1D_CG102_Means","FDR_1D_Genotype","FDR_1D_Treatment","FDR_\
1D_Interaction","Cold_4D_CG60_Means","Cold_4D_CG102_Means","Control_4D_CG60_Means","Control_4D_CG102_Means","FDR_4D_Genotype","FDR_4D_Treatment","FDR_4D_Interaction")

cold_names <- cold$V3_Gene_Name

Let’s now load our results from using QUAINT to identify selected genes. See the “Detecting selection on expression of individual genes” page for details.

alltissues = c('GRoot',"Kern","LMAD26","LMAN26", "L3Tip","GShoot","L3Base")
load('../data/quaint-results.rda')

The next step is to determine if differentially expressed genes are more likely to be under selection in every PC/Tissue combination. To do this we will use a chi-squared test.

First, separate genes that are differentially expressed in response to cold treatment and those that aren’t.

coldsum <- cold %>% select(V3_Gene_Name, V4_Gene_Name, FDR_1D_Treatment, FDR_4D_Treatment) ## pull out important columns

coldsum$sig = apply(coldsum[,3:4], 1, min, na.rm=T)
coldsigGenes = dplyr::filter(coldsum, sig < 0.1)$V3_Gene_Name
length(coldsigGenes)
[1] 12239
coldnotsigGenes = dplyr::filter(coldsum, sig > 0.1)$V3_Gene_Name
length(coldnotsigGenes)
[1] 11379

Before moving onto Fisher’s exact test, we will first check that the overlaps between cold response genes and selected genes were not due to both datasets being biased towards high expression genes.

### are coldsig genes higher expression (in LMAD26) than not sig genes?
mytissue = 'LMAD26'

exp = read.table(paste('../data/Raw_expression/',mytissue,'.txt', sep=''), stringsAsFactors = F)
exp_means = data.frame(gene = colnames(exp), expr = colMeans(exp), stringsAsFactors = F, row.names=NULL)

quaint_results = alltissueresults[[mytissue]]
quaint_ps = data.frame(gene = unlist(quaint_results[1,]),matrix(unlist(quaint_results[5,]), ncol=5, byrow=T), stringsAsFactors = F)
names(quaint_ps) =c('gene','PC1','PC2','PC3','PC4','PC5')
quaint_mer = dplyr::inner_join(exp_means, quaint_ps, by = 'gene') #merge with expression data

mydflong = tidyr::gather(quaint_mer, 'PC','pval', PC1:PC5)

##compare expression levels of genes that are sig and not sig for selection
mysig = dplyr::filter(mydflong, pval < 0.05)
mynot = dplyr::filter(mydflong, pval >= 0.05)


## compare expression levels of diff expressed genes
coldmer = dplyr::inner_join(coldsum, exp_means, by=c('V3_Gene_Name' = 'gene'))
csig = dplyr::filter(coldmer, sig < 0.1)
cnot = dplyr::filter(coldmer, sig >=0.1)

par(mar=c(8,5,2,2))
plot(c(1,2,4,5), c(median(mysig$expr), median(mynot$expr), median(csig$expr), median(cnot$expr)),
     bty="n", ylim=c(0,180), xaxt="n", yaxt="n",ylab = "median expression in leaves", xlab="", xlim = c(0,6))
axis(1, at=c(1,2,4,5), lab = c('selection','not selection','cold response','no response'), las=2)
axis(2,las=2)

Version Author Date
409888b jgblanc 2020-04-24

Gene expression of cold-response genes is not higher than expression of genes that don’t respond to cold. Now we will conduct Fisher’s exact test to compare the proportion of genes that show evidence of selection (un-adjusted p value less than 0.05) in cold-response genes compared with other genes.

##function for running the chisq test on a specific pc/tissue combination
run_chi_sq <- function(mytissue, myPC,sigGenes, notsigGenes, myp = 0.05, returntable=F){
  
  quaint_results = alltissueresults[[mytissue]] ##pull out tissue specific genes
  quaint_results_pc = data.frame(mygene = unlist(quaint_results[1,]), pval = unlist(lapply(quaint_results[5,], function(x){x[myPC]})), stringsAsFactors = F)  ## just need a table for the specific PC's pvalue
  sigOver = dplyr::filter(quaint_results_pc, mygene %in% sigGenes)
  notsigOver = dplyr::filter(quaint_results_pc, mygene %in% notsigGenes)
  
  ##make the chisq table
  bothsig = nrow(dplyr::filter(sigOver, pval < myp))
  diffexponly = nrow(dplyr::filter(sigOver, pval >= myp))
  selonly = nrow(dplyr::filter(notsigOver, pval < myp))
  nothingsig = nrow(dplyr::filter(notsigOver, pval > myp))
  chitbl = matrix(c(bothsig, selonly, diffexponly, nothingsig), nrow=2) ##make the chisq table
  if(returntable==T){return(chitbl)
    }else{
  chitest = fisher.test(chitbl) ##run the test
  return(chitest$p.value)}
}


##run on all PCs and all tissues for the cold genes
myresults = sapply(alltissues, function(mytissue){
  sapply(1:5,function(myPC){
    run_chi_sq(mytissue, myPC, coldsigGenes, coldnotsigGenes, 0.05)
  })
})
### note that we don't know what direction this goes in.

##we really just care about combos with significant evidence of selection in at least one gene
newresults = myresults
newresults[t(sigTable[,1:5]) ==0] <- NA

### make a table of the data for the supplement
coldTable = lapply(alltissues, function(mytissue){
  sapply(1:5,function(myPC){
    mytable = run_chi_sq(mytissue, myPC, coldsigGenes, coldnotsigGenes, 0.05, returntable=T)
    return(data.frame(tissue = mytissue, pc = myPC, bothsig = mytable[1,1], selonly = mytable[2,1], diffexponly = mytable[1,2], nothingsig = mytable[2,2], stringsAsFactors = F))
    
    })
})

coldTableOut = t(do.call(cbind, coldTable))
write.table(coldTableOut, row.names=F, col.names=T, file = "../data/cold-contingency-table.txt", quote=F)

kable(newresults)
GRoot Kern LMAD26 LMAN26 L3Tip GShoot L3Base
NA NA 0.7777273 0.6994290 NA NA NA
NA 0.7761694 0.7563795 NA 0.7709694 0.8824207 0.7851742
NA 0.3622327 1.0000000 0.0313280 NA NA 0.8169160
NA 1.0000000 NA NA 0.8071866 NA NA
NA NA 0.0043836 0.0832287 NA NA NA

Make bonferroni correction

kable(apply(newresults, 2, method="bonferroni",p.adjust)) 
GRoot Kern LMAD26 LMAN26 L3Tip GShoot L3Base
NA NA 1.0000000 1.0000000 NA NA NA
NA 1 1.0000000 NA 1 0.8824207 1
NA 1 1.0000000 0.0939840 NA NA 1
NA 1 NA NA 1 NA NA
NA NA 0.0175345 0.2496862 NA NA NA

The strongest signal for enrichment is for daytime expression in adult leaf tissue along PC5.

Finally, we will make a Figure S2 for showing the proportion of cold-response genes under selection along PC 5 in adult leaf tissue.

make_enrichment_fig <- function(mytissue, myPC, myp, sigGenes, notsigGenes){
quaint_results = alltissueresults[[mytissue]] ##pull out tissue specific genes
quaint_results_pc = data.frame(mygene = unlist(quaint_results[1,]), pval = unlist(lapply(quaint_results[5,], function(x){x[myPC]})), stringsAsFactors = F)  ## just need a table for the specific PC's pvalue
sigOver = dplyr::filter(quaint_results_pc, mygene %in% sigGenes)
notsigOver = dplyr::filter(quaint_results_pc, mygene %in% notsigGenes)
##make the chisq table
bothsig = nrow(dplyr::filter(sigOver, pval < myp))
diffexponly = nrow(dplyr::filter(sigOver, pval >= myp))
selonly = nrow(dplyr::filter(notsigOver, pval < myp))
nothingsig = nrow(dplyr::filter(notsigOver, pval > myp))
  
propDiffSel = bothsig/(bothsig+diffexponly)
propNotDiffSel = selonly/(selonly+nothingsig)

return(c(propDiffSel, propNotDiffSel))
}

test = make_enrichment_fig('LMAD26', 5, 0.05, coldsigGenes, coldnotsigGenes)
coldplot = function(){
bp = barplot(test, ylim=c(0,0.1), col = c("#0072B2","#CC79A7"), border="white", yaxt = "n", ylab = "Proportion under selection", cex.lab=1.5)
axis(2, las=2)
axis(1, at = bp, lab = c('Cold Response','Other'), cex.axis=1.5)}
coldplot()

Version Author Date
1dfde4e jgblanc 2020-05-07
feb6b89 jgblanc 2020-05-07
409888b jgblanc 2020-04-24

Drought response genes

We use drought response genes from Forestan et al. (https://onlinelibrary-wiley-com.proxy2.cl.msu.edu/doi/full/10.1111/pce.13660) Only looking at NST0 vs WST0 because that shows comparison of expression changes in drought and control. NSTO is control, WSTO is stressed

First, we will separated the drought response genes from the non drought response genes.

droughtgenes = read.csv('../data/pce13660-supp-0004-supplementalfile3_cuffdiff.csv', stringsAsFactors = F)
droughtgenes$v4_gene_model = sapply(droughtgenes$gene_id, function(x){strsplit(x, split=':')[[1]][2]}) #make a column for gene names

#convert gene names from v4 to v3
genemodel = read.csv('../data/gene_model_xref_v3.csv', stringsAsFactors = F)[,1:2]
dgmerge = dplyr::inner_join(droughtgenes, genemodel, by = "v4_gene_model")

droughtsig = dplyr::filter(dgmerge, q_value < 0.1)$v3_gene_model ##filter out sig genes
droughtnot = dplyr::filter(dgmerge, q_value > 0.1)$v3_gene_model ##filter out sig genes

Next, we want to check the overall expression differences between different categories of genes in adult leaf tissue.

##comparing overall expression level in the Kremling data between categories (look at LMAD26 for now)

mytissue = 'LMAD26'

exp = read.table(paste('../data/Raw_expression/',mytissue,'.txt', sep=''), stringsAsFactors = F)
exp_means = data.frame(gene = colnames(exp), expr = colMeans(exp), stringsAsFactors = F, row.names=NULL)

quaint_results = alltissueresults[[mytissue]]
quaint_ps = data.frame(gene = unlist(quaint_results[1,]),matrix(unlist(quaint_results[5,]), ncol=5, byrow=T), stringsAsFactors = F)
names(quaint_ps) =c('gene','PC1','PC2','PC3','PC4','PC5')
quaint_mer = dplyr::inner_join(exp_means, quaint_ps, by = 'gene') #merge with expression data

mydflong = tidyr::gather(quaint_mer, 'PC','pval', PC1:PC5)

##compare expression levels
mysig = dplyr::filter(mydflong, pval < 0.05)
mynot = dplyr::filter(mydflong, pval >= 0.05)

## compare expression levels of diff expressed genes
droughtmer = dplyr::inner_join(dgmerge, exp_means, by=c('v3_gene_model' = 'gene'))
dsig = dplyr::filter(droughtmer, q_value < 0.1)
dnot = dplyr::filter(droughtmer, q_value >=0.1)

par(mar=c(8,5,2,2))
plot(c(1,2,4,5), c(median(mysig$expr), median(mynot$expr), median(dsig$expr), median(dnot$expr)),
     bty="n", ylim=c(0,180), xaxt="n", yaxt="n",ylab = "median expression in leaves", xlab="", xlim = c(0,6))
axis(1, at=c(1,2,4,5), lab = c('selection','not selection','drought response','no response'), las=2)
axis(2,las=2)

Version Author Date
1dfde4e jgblanc 2020-05-07
feb6b89 jgblanc 2020-05-07
409888b jgblanc 2020-04-24

The drought response genes have higher expression compared to non-drought response genes. In order to do the chi-squared test we will use the 3500 most highly expressed non-drought genes as our non-drought to the proportion of selected genes to the proportion in drought genes. This will ensure that overlaps between drought response genes and selected genes were not due to both datasets being biased towards high expression genes.

## pick the top  expressed genes in dnot 
dnot_high = dnot[order(-dnot$expr),][1:3500,]

#down regulated genes
dsig_down = dplyr::filter(dsig, value_1 > value_2)
dsig_up = dplyr::filter(dsig, value_2 > value_1)

par(mar=c(8,5,2,2))
plot(c(1,2,4,5,6,7), c(median(mysig$expr), median(mynot$expr), median(dsig$expr), median(dsig_down$expr), median(dnot$expr), median(dnot_high$expr)),
     bty="n", ylim=c(0,180), xaxt="n", yaxt="n",ylab = "median expression in leaves", xlab="", 
     xlim = c(0,8)) 
axis(1, at=c(1,2,4,5,6,7), 
    lab = c('selection','not selection','drought response','drought down','no response','no, high'), las=2)
axis(2,las=2)

Version Author Date
1dfde4e jgblanc 2020-05-07
feb6b89 jgblanc 2020-05-07
409888b jgblanc 2020-04-24

Above we can see that the no response high expression genes now match the median expression of the drought response genes. Below we will make Figure S1 which shows boxplots of expression values across response and non response genes for both drought and cold response genes.

Version Author Date
1dfde4e jgblanc 2020-05-07
feb6b89 jgblanc 2020-05-07
409888b jgblanc 2020-04-24

Finally we will do the chi-squared test for drought response. We will run 3 separate tests. The first test is using the high expression subsample. The next two are just looking at the drought response genes that are upregulated or downregulated in response to drought separately.

###now test for enrichment compared to the high expression sample
dresults_high = sapply(alltissues, function(mytissue){
  sapply(1:5,function(myPC){
    run_chi_sq(mytissue, myPC, dsig$v3_gene_model, dnot_high$v3_gene_model, 0.05)
  })
})
dresults_high[t(sigTable[,1:5]) ==0] <- NA #remove thiings with no significant signal of selection

##just down reg responses
dresults_high_down = sapply(alltissues, function(mytissue){
  sapply(1:5,function(myPC){
    run_chi_sq(mytissue, myPC, dsig_down$v3_gene_model, dnot_high$v3_gene_model, 0.05)
  })
})
dresults_high_down[t(sigTable[,1:5]) ==0] <- NA #remove thiings with no significant signal of selection

##up responses
dresults_high_up = sapply(alltissues, function(mytissue){
  sapply(1:5,function(myPC){
    run_chi_sq(mytissue, myPC, dsig_up$v3_gene_model, dnot_high$v3_gene_model, 0.05)
  })
})
dresults_high_up[t(sigTable[,1:5]) ==0] <- NA #remove thiings with no significant signal of selection

### make a table of the data for the supplement for drought genes
droughtTable = lapply(alltissues, function(mytissue){
  sapply(1:5,function(myPC){
    mytable = run_chi_sq(mytissue, myPC, dsig_down$v3_gene_model, dnot_high$v3_gene_model, 0.05, returntable=T)
    return(data.frame(tissue = mytissue, pc = myPC, bothsig = mytable[1,1], selonly = mytable[2,1], diffexponly = mytable[1,2], nothingsig = mytable[2,2], stringsAsFactors = F))
    
    })
})

droughtTableOut = t(do.call(cbind, droughtTable))

droughtTableOut = t(do.call(cbind, coldTable))
write.table(droughtTableOut, row.names=F, col.names=T, file = "../data/drought-contingency-table.txt", quote=F)

Here are the results that we reported in the results section of the paper. Specifically we report the bonferroni corrected p-values for PC 5 for the genes down-regulated in response to drought and the absence of significant enrichment in genes upregulated in response to drought on PC 5.

kable(apply(dresults_high_down, 2, method="bonferroni",p.adjust)) 
GRoot Kern LMAD26 LMAN26 L3Tip GShoot L3Base
NA NA 0.7933238 0.2495644 NA NA NA
NA 1.0000000 0.1264809 NA 1.0000000 0.6962722 0.0676011
NA 1.0000000 1.0000000 1.0000000 NA NA 1.0000000
NA 0.0683568 NA NA 0.9012942 NA NA
NA NA 0.0009667 0.0000326 NA NA NA
kable(apply(dresults_high_up, 2, method="bonferroni",p.adjust)) 
GRoot Kern LMAD26 LMAN26 L3Tip GShoot L3Base
NA NA 1.0000000 1.0000000 NA NA NA
NA 1.0000000 1.0000000 NA 0.8676325 0.1813478 0.970032
NA 0.1540548 0.4154579 1.0000000 NA NA 1.000000
NA 0.3733326 NA NA 0.6107358 NA NA
NA NA 1.0000000 0.2312074 NA NA NA

Finally we report that of the 69 genes that are selected and down regulated in response to drought, for 43 of them the slope of the relationship between PC 5 and gene expression was positive.

##pull out qm for genes that show selection in LMAD26 and LMAN26 along PC 5.
mytissue = 'LMAD26'
quaint_results = alltissueresults[[mytissue]]
quaint_cms = data.frame(gene = unlist(quaint_results[1,]),matrix(unlist(quaint_results[2,]), ncol=5, byrow=T), stringsAsFactors = F)
names(quaint_cms) = c('gene','PC1','PC2','PC3','PC4','PC5')
mycms_long = tidyr::gather(quaint_cms, 'PC','Cm', PC1:PC5) ##gather into a long data frame
mypcm_long = dplyr::inner_join(mycms_long, mydflong)##merge with pvalues
Joining, by = c("gene", "PC")
mysig05 = dplyr::filter(mypcm_long, pval < 0.05 & PC == 'PC5') ##look at sig along PC5

## look at cms in the overlap with drought down regulation genes
mysigdown = dplyr::filter(mysig05, gene %in% dsig_down$v3_gene_model)


## number with positive slopes 
nrow(dplyr::filter(mysigdown, Cm > 0))
[1] 43

As a sanity check let’s plot PC 5 vs expression for some of the 43 genes and make sure they have a postive slope.

###make a plot of what's on PC5 (and to sanity check)
mygene = 'GRMZM2G351977'
mytissue = 'LMAD26'
#get gene index for kernel
lmad26genes = unlist(alltissueresults[[3]][1,])
myindex = match(mygene, lmad26genes)

#make a plot
explot = makeGenePlot('LMAD26',myindex, 5)
explot

Version Author Date
1dfde4e jgblanc 2020-05-07
feb6b89 jgblanc 2020-05-07
409888b jgblanc 2020-04-24

Now let’s make Figure 2 showing the proportion of genes under selection for down and upregulated drought genes.

get_enrichment <- function(mytissue, myPC, myp, sigGenes, notsigGenes){
  
quaint_results = alltissueresults[[mytissue]] ##pull out tissue specific genes
quaint_results_pc = data.frame(mygene = unlist(quaint_results[1,]), pval = unlist(lapply(quaint_results[5,], function(x){x[myPC]})), stringsAsFactors = F)  ## just need a table for the specific PC's pvalue
sigOver = dplyr::filter(quaint_results_pc, mygene %in% sigGenes)
notsigOver = dplyr::filter(quaint_results_pc, mygene %in% notsigGenes)
##make the chisq table
bothsig = nrow(dplyr::filter(sigOver, pval < myp))
diffexponly = nrow(dplyr::filter(sigOver, pval >= myp))
selonly = nrow(dplyr::filter(notsigOver, pval < myp))
nothingsig = nrow(dplyr::filter(notsigOver, pval > myp))
return(c(bothsig, bothsig/diffexponly, selonly/nothingsig))
}

sigcounts_down = sapply(alltissues, function(mytissue){
  lapply(1:5,function(myPC){
    get_enrichment(mytissue, myPC, 0.05,dsig_down$v3_gene_model, dnot_high$v3_gene_model)
  })
})

sigcounts_up = sapply(alltissues, function(mytissue){
  lapply(1:5,function(myPC){
    get_enrichment(mytissue, myPC, 0.05,dsig_up$v3_gene_model, dnot_high$v3_gene_model)
  })
})

makeplot = function(x){myplot = c(sigcounts_down[5,3][[1]][2], sigcounts_down[5,4][[1]][2], sigcounts_up[5,3][[1]][2], sigcounts_up[5,4][[1]][2], sigcounts_down[5,3][[1]][3], sigcounts_down[5,4][[1]][3])
bp = barplot(myplot, ylim=c(0,0.2), col = c("#E69F00", "#0072B2"), border="white", yaxt = "n", ylab = "Proportion under selection", cex.lab=1.5)
axis(2, las=2)
axis(1, at = bp[c(1,3,5),]+0.6, lab = c('Down','Up','No change'), cex.axis=1.5)
legend('topright', c('Day','Night'), bty="n", fill = c("#E69F00", "#0072B2"), border="white", cex=1.5)}

makeplot()

Version Author Date
1dfde4e jgblanc 2020-05-07
feb6b89 jgblanc 2020-05-07
409888b jgblanc 2020-04-24

We can also generate a table showing the number of significant drought response, the percent down-regulated and the percent up-regulated for both day and night.

df <- rbind(unlist(sigcounts_down[5,3]), unlist(sigcounts_down[5,4]))
colnames(df) <- c("Number Sel Genes", "Proportion Down-regulated", "Proportion Up-regulated")
row.names(df) <- c("Day", "Night")
kable(df)
N umber Sel Genes P roportion Down-regulated P roportion Up-regulated
Day 69 0.1405295 0.0812732
Night 58 0.1428571 0.0680041
sig_ind <- fread("../data/siggenes.txt")
sig_cold <- left_join(sig_ind, coldsum, by = c("V1" = "V3_Gene_Name")) %>% mutate("Cold_Sig" = sig < 0.1) %>% select("V1", "V2", "V3", "V4","V5", "Cold_Sig")
sig_drought <- left_join(sig_cold, dgmerge, by = c("V1" = "v3_gene_model")) %>% mutate("Drought_down-regulated" = value_1 > value_2 & significant == "yes") %>% mutate("Drought_up-regulated" = value_2 > value_1 & significant == "yes") %>% select("V1", "V2", "V3", "V4", "V5","Cold_Sig", "significant", "Drought_down-regulated", "Drought_up-regulated")
colnames(sig_drought) <- c("Gene_name", "PC", "P-value", "FDR", "Tissue", "Cold_response", "Drought_response", "Drought_down-regulated","Drought_up-regulated")
write.csv(sig_drought, "../output/all_sigenes_annotate.csv")

sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.12.8 quaint_0.0.0.9000 reshape2_1.4.3    knitr_1.28       
[5] dplyr_0.8.4       ggplot2_3.2.1     workflowr_1.6.0  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3       highr_0.8        plyr_1.8.5       compiler_3.6.2  
 [5] pillar_1.4.3     later_1.0.0      git2r_0.26.1     tools_3.6.2     
 [9] digest_0.6.25    evaluate_0.14    lifecycle_0.1.0  tibble_2.1.3    
[13] gtable_0.3.0     pkgconfig_2.0.3  rlang_0.4.4      yaml_2.2.1      
[17] xfun_0.12        withr_2.1.2      stringr_1.4.0    vctrs_0.2.3     
[21] fs_1.3.1         rprojroot_1.3-2  grid_3.6.2       tidyselect_1.0.0
[25] glue_1.3.1       R6_2.4.1         rmarkdown_2.1    farver_2.0.3    
[29] tidyr_1.0.2      purrr_0.3.3      magrittr_1.5     whisker_0.4     
[33] ellipsis_0.3.0   backports_1.1.5  scales_1.1.0     promises_1.1.0  
[37] htmltools_0.4.0  assertthat_0.2.1 colorspace_1.4-1 httpuv_1.5.2    
[41] labeling_0.3     stringi_1.4.6    lazyeval_0.2.2   munsell_0.5.0   
[45] crayon_1.3.4