Abstract: CRISPR-Cas ribonucleoproteins are important tools for gene editing in pre-implantation embryos. However, the inefficient production of biallelic deletions in cattle zygotes has hindered mechanistic studies of gene function. In addition, the presence of maternal RNAs that support embryo development until embryonic genome activation may cause confounding phenotypes. Here, we aimed to improve the efficiency of biallelic deletions and deplete specific maternal RNAs in cattle zygotes using CRISPR-Cas editing technology. Two electroporation sessions with Cas9D10A ribonucleoproteins targeting exon 1 and the promoter of OCT4 produced biallelic deletions in 91% of the embryos tested. In most cases, the deletions were longer than 1000 nucleotides long. Electroporation of Cas13a ribonucleoproteins prevents the production of the corresponding proteins. We electroporated Cas9D10A ribonucleoproteins targeting exon 1, including the promoter region, of OCT4 in two sessions with inclusion of Cas13a ribonucleoproteins targeting OCT4 mRNAs in the second session to ablate OCT4 function in cattle embryos. A lack of OCT4 resulted in embryos arresting development prior to blastocyst formation at a greater proportion (7.9%) than controls (30.7%, P<0.001). The few embryos that developed past the morula stage did not form a normal inner cell mass. Transcriptome analysis of single blastocysts, confirmed to lack exon 1 and promoter region of OCT4, revealed a significant (FDR<0.1) reduction in transcript abundance of many genes functionally connected to stemness, including markers of pluripotency (CADHD1, DPPA4, GNL3, RRM2). The results confirm that OCT4 is key regulator of genes that modulate pluripotency and is required to form a functional blastocyst in cattle.

Overview

Code produced by Jada Nix, Gustavo Schettini and Fernando Biase. We created this file to permit reproducibility of the findings described in the paper. Please direct questions to Fernando Biase: fbiase at vt.edu

The raw transcriptome data is deposited on GEO repository under the following access GSE236474.

Please contact Fernando Biase: fbiase at vt.edu if you’d like access to other files used in this code. Updated information may be obtained at www.biaselaboatory.com

library('tidyverse', quietly = TRUE,lib.loc="/usr/lib/R/site-library")
library('stringr', quietly = TRUE,lib.loc="/usr/lib/R/site-library")
library('edgeR',quietly = TRUE, lib.loc="/usr/lib/R/site-library")
library('ggplot2',quietly = TRUE, lib.loc="/usr/lib/R/site-library")
library("DESeq2",quietly = TRUE,lib.loc="/usr/lib/R/site-library")
library("goseq" ,quietly = TRUE, lib.loc="/usr/lib/R/site-library")
library("cowplot",quietly = TRUE, lib.loc="/usr/lib/R/site-library")
library("ComplexHeatmap", quietly = TRUE,lib.loc="/usr/lib/R/site-library")
library("circlize", quietly = TRUE,lib.loc="/usr/lib/R/site-library")
library("data.table",quietly = TRUE , lib.loc="/usr/lib/R/site-library")
library("kableExtra",quietly = TRUE , lib.loc="/usr/lib/R/site-library")
library("multcomp", quietly = TRUE,lib.loc="/usr/lib/R/site-library")
library("aod", quietly = TRUE,lib.loc="/usr/lib/R/site-library")
library("car", quietly = TRUE,lib.loc="/usr/lib/R/site-library")
library("emmeans", quietly = TRUE,lib.loc="/usr/lib/R/site-library")
library("lmtest", quietly = TRUE,lib.loc="/usr/lib/R/site-library")

Procedures

Use the tabs below to access each section of our code.

Analysis of embryo development

Load development one electroporation vs two

day7blast <-readxl::read_excel(paste(path_excel,"2xCas10_day7_development_data.xlsx", sep=""))
day7blast <- day7blast[c(2,3,4)]
day7blast$totalPZ<-day7blast$B + day7blast$NB
day7blast$Group<-as.factor(day7blast$Group)

day8blast <-readxl::read_excel(paste(path_excel,"2xCas10_day8_development_data.xlsx", sep=""))
day8blast <- day8blast[c(2,3,4)]
day8blast$totalPZ<-day8blast$B + day8blast$NB
day8blast$Group<-as.factor(day8blast$Group)

GLM groups day 7

model.log7<-glm(cbind(B,NB) ~ Group, family = binomial(link = "logit"), data = day7blast)
emmeans_day7<-emmeans(model.log7,  ~ Group,regrid = "log", type='response')
#summary(emmeans_day7)$response

GLM groups day 8

model.log8<-glm(cbind(B,NB) ~ Group, family = binomial(link = "logit"), data = day8blast)
emmeans_day8<-emmeans(model.log8,  ~ Group,regrid = "log", type='response')
#summary(emmeans_day8)$response

Table S1 summary GLM

Table_S1<- as.data.table(data.frame( time= c(rep("164-166 hpf",3),rep("188-190 hpf",3)), 
                          group=c("Cas9D10A + targeting sgRNAs", "Cas9D10A + scramble sgRNAs", "Controls not electroporated", "Cas9D10A + targeting sgRNAs", "Cas9D10A + scramble sgRNAs","Controls not electroporated"),
                          PZ_7= c( day7blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Cas9D10A 1x") , 
                          day7blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Scramble 1x"), 
                          day7blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Control") ,
                          day8blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Cas9D10A 1x") , 
                          day8blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Scramble 1x"), 
                          day8blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Control")  ),
                          percent_7= c(round(summary(emmeans_day7)$response[1]*100,1), 
                          round(summary(emmeans_day7)$response[4]*100,1) , 
                          round(summary(emmeans_day7)$response[3]*100,1) ,
                          round(summary(emmeans_day8)$response[1]*100,1) ,
                          round(summary(emmeans_day8)$response[4]*100,1) ,
                          round(summary(emmeans_day8)$response[3]*100,1)  ) ,
                          se_7= c(round(summary(emmeans_day7)$SE[1]*100,1), 
                          round(summary(emmeans_day7)$SE[4]*100,1) , 
                          round(summary(emmeans_day7)$SE[3]*100,1) ,
                          round(summary(emmeans_day8)$SE[1]*100,1) ,
                          round(summary(emmeans_day8)$SE[4]*100,1) ,
                          round(summary(emmeans_day8)$SE[3]*100,1)  ) ,
                          PZ_8= c( day7blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Cas9D10A 2x") , 
                          day7blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Scramble 2x"), 
                          day7blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Control") ,
                          day8blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Cas9D10A 2x") , 
                          day8blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Scramble 2x"), 
                          day8blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Control")  ),
                          percent_8= c(round(summary(emmeans_day7)$response[2]*100,1), 
                          round(summary(emmeans_day7)$response[5]*100,1) , 
                          round(summary(emmeans_day7)$response[3]*100,1) ,
                          round(summary(emmeans_day8)$response[2]*100,1) ,
                          round(summary(emmeans_day8)$response[5]*100,1) ,
                          round(summary(emmeans_day8)$response[3]*100,1)) ,
                          se_8= c(round(summary(emmeans_day7)$SE[2]*100,1), 
                          round(summary(emmeans_day7)$SE[5]*100,1) , 
                          round(summary(emmeans_day7)$SE[3]*100,1) ,
                          round(summary(emmeans_day8)$SE[2]*100,1) ,
                          round(summary(emmeans_day8)$SE[5]*100,1) ,
                          round(summary(emmeans_day8)$SE[3]*100,1)  )
                          ))
                        
kbl(Table_S1, col.names=c("",   "", "N PZ", "%", "SE","N PZ", "%", "SE"), caption = "Table S1. Blastocyst developmental rates for the experiments comparing one versus two electroporation sessions with Cas9D10A and sgRNAs." ) %>% kable_classic() %>% add_header_above(c(" "=2, "single electroporation"=3, "double electroporation"=3 ))

Table S1. Blastocyst developmental rates for the experiments comparing one versus two electroporation sessions with Cas9D10A and sgRNAs.
		single electroporation			double electroporation
		N PZ	%	SE	N PZ	%	SE
164-166 hpf	Cas9D10A + targeting sgRNAs	337	6.8	1.4	655	3.2	0.7
164-166 hpf	Cas9D10A + scramble sgRNAs	146	17.1	3.1	209	17.7	2.6
164-166 hpf	Controls not electroporated	182	25.3	3.2	182	25.3	3.2
188-190 hpf	Cas9D10A + targeting sgRNAs	337	11.6	1.7	655	7.9	1.1
188-190 hpf	Cas9D10A + scramble sgRNAs	146	31.5	3.8	209	28.2	3.1
188-190 hpf	Controls not electroporated	182	30.8	3.4	182	30.8	3.4

Table S2 Contrasts groups day 7

Table_S2<-
as.data.table(
 rbind( data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Cas9D10A 1x / Scramble 1x"=c(1,0,0,-1,0)),df=15)) ,
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Cas9D10A 1x / Control"=c(1,0,-1,0,0)),df=16) ) ,
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Scramble 1x / Control"=c(0,0,-1,1,0)),df=10) ),
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Cas9D10A 1x / Cas9D10A 2x"=c(1,-1,0,0,0)),df=31)),
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Cas9D10A 2x / Scramble 2x"=c(0,1,0,0,-1)),df=26) ),
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Cas9D10A 2x / Control"=c(0,1,-1,0,0)),df=26) ),
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Scramble 2x / Control"=c(0,0,1,0,-1)),df=11) )
      ) ) 

Table_S2<-cbind(Table_S2,BH=round(p.adjust(c(data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Cas9D10A 1x / Scramble 1x"=c(1,0,0,-1,0)),df=15))$p.value ,
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Cas9D10A 1x / Control"=c(1,0,-1,0,0)),df=16))$p.value ,
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Scramble 1x / Control"=c(0,0,-1,1,0)),df=10))$p.value,
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Cas9D10A 1x / Cas9D10A 2x"=c(1,-1,0,0,0)),df=31))$p.value,
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Cas9D10A 2x / Scramble 2x"=c(0,1,0,0,-1)),df=26))$p.value,
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Cas9D10A 2x / Control"=c(0,1,-1,0,0)),df=26))$p.value,
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("Scramble 2x / Control"=c(0,0,1,0,-1)),df=11))$p.value), method = "BH"),4))


kableExtra::kbl(Table_S2,  caption = "Table S2. Contrasts between experimental groups for blastocyst yield at 164-166 hpf." , digits=4) %>% kable_classic()

Table S2. Contrasts between experimental groups for blastocyst yield at 164-166 hpf.
contrast	odds.ratio	SE	df	null	t.ratio	p.value	BH
Cas9D10A 1x / Scramble 1x	0.3545	0.1092	15	1	-3.3657	0.0042	0.0074
Cas9D10A 1x / Control	0.2166	0.0596	16	1	-5.5584	0.0000	0.0001
Scramble 1x / Control	0.6109	0.1699	10	1	-1.7722	0.1068	0.1068
Cas9D10A 1x / Cas9D10A 2x	2.2114	0.6847	31	1	2.5633	0.0154	0.0216
Cas9D10A 2x / Scramble 2x	0.1540	0.0441	26	1	-6.5322	0.0000	0.0000
Cas9D10A 2x / Control	0.0979	0.0274	26	1	-8.3042	0.0000	0.0000
Scramble 2x / Control	1.5723	0.3913	11	1	1.8185	0.0963	0.1068

Table S3 Contrasts groups day 8

Table_S3<-
as.data.table(
 rbind( data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Cas9D10A 1x / Scramble 1x"=c(1,0,0,-1,0)),df=15)) ,
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Cas9D10A 1x / Control"=c(1,0,-1,0,0)),df=16) ) ,
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Scramble 1x / Control"=c(0,0,-1,1,0)),df=10) ),
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Cas9D10A 1x / Cas9D10A 2x"=c(1,-1,0,0,0)),df=31)),
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Cas9D10A 2x / Scramble 2x"=c(0,1,0,0,-1)),df=26) ),
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Cas9D10A 2x / Control"=c(0,1,-1,0,0)),df=26) ),
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Scramble 2x / Control"=c(0,0,1,0,-1)),df=11) )
      ) ) 

Table_S3<-cbind(Table_S3,BH=round(p.adjust(c(data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Cas9D10A 1x / Scramble 1x"=c(1,0,0,-1,0)),df=15))$p.value ,
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Cas9D10A 1x / Control"=c(1,0,-1,0,0)),df=16))$p.value ,
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Scramble 1x / Control"=c(0,0,-1,1,0)),df=10))$p.value,
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Cas9D10A 1x / Cas9D10A 2x"=c(1,-1,0,0,0)),df=31))$p.value,
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Cas9D10A 2x / Scramble 2x"=c(0,1,0,0,-1)),df=26))$p.value,
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Cas9D10A 2x / Control"=c(0,1,-1,0,0)),df=29))$p.value,
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("Scramble 2x / Control"=c(0,0,1,0,-1)),df=11))$p.value), method = "BH"),4))

kableExtra::kbl(Table_S3,  caption = "Table S3. Contrasts between experimental groups for blastocyst yield at 188-190  hpf." , digits=4) %>% kable_classic()

Table S3. Contrasts between experimental groups for blastocyst yield at 188-190 hpf.
contrast	odds.ratio	SE	df	null	t.ratio	p.value	BH
Cas9D10A 1x / Scramble 1x	0.2845	0.0701	15	1	-5.1005	0.0001	0.0002
Cas9D10A 1x / Control	0.2945	0.0689	16	1	-5.2232	0.0001	0.0002
Scramble 1x / Control	1.0350	0.2483	10	1	0.1434	0.8888	0.8888
Cas9D10A 1x / Cas9D10A 2x	1.5176	0.3390	31	1	1.8676	0.0713	0.0998
Cas9D10A 2x / Scramble 2x	0.2192	0.0463	26	1	-7.1936	0.0000	0.0000
Cas9D10A 2x / Control	0.1940	0.0419	26	1	-7.5892	0.0000	0.0000
Scramble 2x / Control	1.1299	0.2512	11	1	0.5496	0.5936	0.6925

Load development DART vs controls

day7blast <-readxl::read_excel(paste(path_excel,"DART_day7_development_data_updated.xlsx", sep=""))
day7blast <- day7blast[c(2,3,4)]
day7blast$totalPZ<-day7blast$B + day7blast$NB
day7blast$Group<-as.factor(day7blast$Group)

day8blast <-readxl::read_excel(paste(path_excel,"DART_day8_development_data_updated.xlsx", sep=""))
day8blast <- day8blast[c(2,3,4)]
day8blast$totalPZ<-day8blast$B + day8blast$NB
day8blast$Group<-as.factor(day8blast$Group)

GLM groups day 7

model.log7<-glm(cbind(B,NB) ~ Group, family = binomial(link = "logit"), data = day7blast)
emmeans_day7<-emmeans(model.log7,  ~ Group,regrid = "log", type='response')
#summary(emmeans_day7)$response

GLM groups day 8

model.log8<-glm(cbind(B,NB) ~ Group, family = binomial(link = "logit"), data = day8blast)
emmeans_day8<-emmeans(model.log8,  ~ Group,regrid = "log", type='response')
#summary(emmeans_day8)$response

Table S4 summary GLM

Table_S4<- as.data.table(data.frame( time= c(rep("164-166 hpf",3),rep("188-190 hpf",3)), 
                          group=c("DART + targeting sgRNAs", "DART + scramble gRNAs", "Controls not electroporated", "DART + targeting sgRNAs", "DART + scramble gRNAs","Controls not electroporated"),
                          PZ= c( day7blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("DART") , 
                          day7blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Scramble DART"), 
                          day7blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Control") ,
                          day8blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("DART") , 
                          day8blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Scramble DART"), 
                          day8blast %>% group_by(Group) %>% summarize(total_PZ=sum(totalPZ)) %>%  deframe %>% getElement("Control")  ),
                          percent= c(round(summary(emmeans_day7)$response[2]*100,1), 
                          round(summary(emmeans_day7)$response[3]*100,1) , 
                          round(summary(emmeans_day7)$response[1]*100,1) ,
                          round(summary(emmeans_day8)$response[2]*100,1) ,
                          round(summary(emmeans_day8)$response[3]*100,1) ,
                          round(summary(emmeans_day8)$response[1]*100,1)  ) ,
                          se= c(round(summary(emmeans_day7)$SE[2]*100,1), 
                          round(summary(emmeans_day7)$SE[3]*100,1) , 
                          round(summary(emmeans_day7)$SE[1]*100,1) ,
                          round(summary(emmeans_day8)$SE[2]*100,1) ,
                          round(summary(emmeans_day8)$SE[3]*100,1) ,
                          round(summary(emmeans_day8)$SE[1]*100,1)  ) 
                          ))
                        
kbl(Table_S4, col.names=c("",   "", "N PZ", "%", "SE"), caption = "Table S4. Blastocyst developmental rates for experiments with CRISPR-DART targeting OCT4." ) %>% kable_classic()

Table S4. Blastocyst developmental rates for experiments with CRISPR-DART targeting OCT4.
		N PZ	%	SE
164-166 hpf	DART + targeting sgRNAs	830	6.1	0.8
164-166 hpf	DART + scramble gRNAs	312	17.0	2.1
164-166 hpf	Controls not electroporated	348	23.0	2.3
188-190 hpf	DART + targeting sgRNAs	832	13.0	1.2
188-190 hpf	DART + scramble gRNAs	312	30.1	2.6
188-190 hpf	Controls not electroporated	348	31.6	2.5

Table S5 Contrasts CRISPR DART

Table_S5<-
as.data.table(
 rbind( data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("DART + targeting sgRNAs / DART + scramble gRNAs "=c(0,1,-1)),df=35)) ,
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("DART + targeting sgRNAs / Control"=c(-1,1,0)),df=37) ) ,
       data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("DART + scramble gRNAs / Control"=c(-1,0,1)),df=21)),
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("DART + targeting sgRNAs / DART + scramble gRNAs"=c(0,1,-1)),df=35)) ,
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("DART + targeting sgRNAs / Control"=c(-1,1,0)),df=37) ) ,
       data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("DART + scramble gRNAs / Control"=c(0-1,0,1)),df=21))
       ) )

Table_S5<-cbind(time= c(rep("164-166 hpf",3),rep("188-190 hpf",3)), 
                Table_S5,
                BH=round(p.adjust(c(data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("DART / DART Scramble "=c(0,1,-1)),df=35))$p.value ,
                                    data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list("DART / Control"=c(-1,1,0)),df=37))$p.value ,
                                    data.frame(contrast(emmeans(model.log7,  ~ Group, type='response') , list(" DART Scramble / Control"=c(-1,0,1)),df=21))$p.value,
                                    data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("DART / DART Scramble "=c(0,1,-1)),df=35))$p.value ,
                                    data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list("DART / Control"=c(-1,1,0)),df=37))$p.value ,
                                    data.frame(contrast(emmeans(model.log8,  ~ Group, type='response') , list(" DART Scramble / Control"=c(0-1,0,1)),df=21))$p.value), method = "BH"),4))

kableExtra::kbl(Table_S5,  caption = "Table S5. Contrasts between experimental groups for blastocyst yield in experiments with CRISPR-DART targeting OCT4." , digits=4) %>% kable_classic()

Table S5. Contrasts between experimental groups for blastocyst yield in experiments with CRISPR-DART targeting OCT4.
time	contrast	odds.ratio	SE	df	null	t.ratio	p.value	BH
164-166 hpf	DART + targeting sgRNAs / DART + scramble gRNAs	0.3199	0.0668	35	1	-5.4566	0.0000	0.0000
164-166 hpf	DART + targeting sgRNAs / Control	0.2193	0.0423	37	1	-7.8746	0.0000	0.0000
164-166 hpf	DART + scramble gRNAs / Control	0.6855	0.1353	21	1	-1.9129	0.0695	0.0834
188-190 hpf	DART + targeting sgRNAs / DART + scramble gRNAs	0.3460	0.0556	35	1	-6.6000	0.0000	0.0000
188-190 hpf	DART + targeting sgRNAs / Control	0.3228	0.0499	37	1	-7.3100	0.0000	0.0000
188-190 hpf	DART + scramble gRNAs / Control	0.9329	0.1575	21	1	-0.4110	0.6852	0.6852

Analysis from the time lapse culture

tab <- matrix(c(56, 7   , 29    ,20 ,28 ,13,    10,5), ncol=4, byrow=TRUE)
colnames(tab) <- c("total_n",   "blastocyst",   "morulae",  "8_cell")
rownames(tab) <- c('dart','control')
tab

##         total_n blastocyst morulae 8_cell
## dart         56          7      29     20
## control      28         13      10      5

binom.test(7,56, p=13/28)

## 
##  Exact binomial test
## 
## data:  7 and 56
## number of successes = 7, number of trials = 56, p-value = 1.065e-07
## alternative hypothesis: true probability of success is not equal to 0.4642857
## 95 percent confidence interval:
##  0.0517645 0.2407326
## sample estimates:
## probability of success 
##                  0.125

binom.test(29,56, p=10/28)

## 
##  Exact binomial test
## 
## data:  29 and 56
## number of successes = 29, number of trials = 56, p-value = 0.01692
## alternative hypothesis: true probability of success is not equal to 0.3571429
## 95 percent confidence interval:
##  0.3803158 0.6534451
## sample estimates:
## probability of success 
##              0.5178571

binom.test(20,56, p=5/28)

## 
##  Exact binomial test
## 
## data:  20 and 56
## number of successes = 20, number of trials = 56, p-value = 0.001336
## alternative hypothesis: true probability of success is not equal to 0.1785714
## 95 percent confidence interval:
##  0.2335548 0.4964069
## sample estimates:
## probability of success 
##              0.3571429

RNA-seq analysis

Scripts to process raw files

Alignment, filtering and counting

#!/bin/bash

path_to_index=/genome/hisat2/Bos_taurus.ARS-UCD1.2.105
splice_sites=/genome/annotation/hisat_splice.txt
fastq=/mnt/storage/lab_folder/crispr_sequencing/june_19_2023/fastq
output_alignment=/mnt/storage/lab_folder/crispr_sequencing/june_19_2023/alignment

#Alignment

for sample in $(ls $fastq/*.gz | cut -c 62-82 | sort | uniq); do
read1=$(ls $fastq/*.gz | grep $sample | grep "R1")
read2=$(ls $fastq/*.gz | grep $sample | grep "R2")
echo $sample
echo $read1
echo $read2
cd $output_alignment
/bioinfo/hisat2-2.2.1/hisat2  -p 30 -k 1 -x $path_to_index -1 $read1 -2 $read2 --bowtie2-dp 2 --score-min L,0,-1 --dta --known-splicesite-infile $splice_sites  -S sample_$(echo $read1 | cut -c 71-74)_alignment.sam --summary-file sample_$(echo $read1 | cut -c 71-74)_alignment_summary.txt

done;

#Filtering
samtools sort -@ 30 -O BAM -l 9 -o $(echo $sample | cut -c 66-76)_alignment.bam  $(echo $sample | cut -c 66-76)_alignment.sam
samtools index $(echo $sample | cut -c 66-76)_alignment.bam
samtools view --threads 30 -b -h -F 1796 -o $(echo $sample | cut -c 66-76)_alignment.filtered.bam $(echo $sample | cut -c 66-76)_alignment.bam
samtools sort -@ 30 -O BAM -l 9 -o $(echo $sample | cut -c 66-76)_alignment.filtered.sorted.bam  $(echo $sample | cut -c 66-76)_alignment.filtered.bam

alignment=$(echo $sample | cut -c 66-76)_alignment.filtered.sorted.bam
undup_output=$(echo $sample | cut -c 66-76)_alignment.filtered.sorted.undup.bam

bammarkduplicates  I=$alignment O=$undup_output  markthreads=30 level=9  rmdup=1 index=1 dupindex=0  verbose=0

/bioinfo/samtools-1.17/samtools view -h $(echo $sample | cut -c 66-76)_alignment.filtered.sorted.undup.bam | awk ' $6 ~ /[1-9][3-9][0-9][M]/ || $1 ~ /^@/' | samtools view -bS -o a_$(echo $sample | cut -c 66-76)_alignment.filtered.sorted.undup.bam
/bioinfo/samtools-1.17/samtools index a_$(echo $sample | cut -c 66-76)_alignment.filtered.sorted.undup.bam
/bioinfo/samtools-1.17/samtools view -e '[NM]<=10' -O BAM -o b_$(echo $sample | cut -c 66-76)_alignment.filtered.sorted.undup.bam a_$(echo $sample | cut -c 66-76)_alignment.filtered.sorted.undup.bam
/bioinfo/samtools-1.17/samtools index b_$(echo $sample | cut -c 66-76)_alignment.filtered.sorted.undup.bam

#Counting

path_to_gtf=/genome/annotation/Bos_taurus.ARS-UCD1.2.105.gtf

/bioinfo/subread-2.0.1-source/bin/featureCounts -s 0 -a $path_to_gtf -o /mnt/storage/lab_folder/crispr_sequencing/june_19_2023/counts/b_$(echo $sample | cut -c 66-76).count -F 'GTF' -t 'exon' -g 'gene_id' --ignoreDup -T 30 -p  /mnt/storage/lab_folder/crispr_sequencing/june_19_2023/alignment/b_$(echo $sample | cut -c 66-76)_alignment.filtered.sorted.undup.bam
done;

Create matrix with raw counts

files<-list.files(path_counts, recursive=T, pattern="count", full.names = TRUE)
files<-files[grep("summary", files, invert = TRUE)]
files<-files[grep("/b_sample", files, invert = FALSE)]
#length(files)
count_data<-data.frame(matrix(nrow=27607))
for (n in 1:length(files)) {
  count<-read.delim(files[n], header=TRUE, sep= "\t", stringsAsFactors = FALSE, comment.char= "#")
  count<-count[,c(1,7)]
  count_data<-cbind(count_data,count)
}
rownames(count_data)<-count_data[,2]
count_data<-count_data[,seq(from = 3, to = 49, by = 2)]
colnames(count_data)<- str_remove(str_remove(stringr::str_split_fixed(colnames(count_data), "[.]", 8)[,8], "b_"), "_alignment.filtered.sorted.undup.bam")

Load annotation

annotation.ensembl.symbol<-read.delim(paste(path_resources,"/2023_06_24_annotation.ensembl.symbol.txt.bz2", sep=""), header=TRUE, sep= "\t",row.names=1, stringsAsFactors = FALSE)
gene.length<-read.delim(paste(path_resources,"/2023_06_24_gene.length.txt.bz2", sep=""), header=TRUE, sep= "\t",row.names=1, stringsAsFactors = FALSE)
annotation.GO.biomart<-read.delim(paste(path_resources,"/2023_06_24_annotation.GO.biomart.txt.bz2", sep=""), header=TRUE, sep= "\t",row.names=1, stringsAsFactors = FALSE)
annotation.ensembl.transcript<-read.delim(paste(path_resources,"/2023_06_24_annotation.ensembl.transcript.txt.bz2", sep=""), header=TRUE, sep= "\t",row.names=1, stringsAsFactors = FALSE)

Filter matrix to retain ‘protein_coding’ and ‘lncRNA’

count_data_annotated<-merge(count_data_a,annotation.ensembl.symbol, by.x="row.names", by.y="ensembl_gene_id", all.x=TRUE, all.y=FALSE)
count_data_annotated<-count_data_annotated[count_data_annotated$gene_biotype %in% c('protein_coding', 'lncRNA'),]
count_data_annotated_length<-gene.length[gene.length$ensembl_gene_id %in% count_data_annotated$Row.names, 2]

calculate cpm and filter for lowly expressed genes

rownames(count_data_annotated)<-count_data_annotated$Row.names

cpm<-edgeR::cpm(count_data_annotated[,c(2:13)],normalized.lib.sizes = TRUE,log = FALSE)

keep<-rowSums(cpm>1) >3
cpm_filtered<-cpm[keep,]
genes_expressed<-rownames(cpm_filtered)

cpm_filtered<-cpm_filtered[rownames(cpm_filtered) %in% genes_expressed,]
count_filtered<-count_data_a[rownames(count_data_a) %in% genes_expressed,]

Calculate TPM

gene_length<-gene.length[gene.length$ensembl_gene_id %in% rownames(count_filtered), ]
x <- count_filtered / gene_length$transcript_length
tpm_all_data<- t( t(x) * 1e6 / colSums(x) )

Differential transcript abundance

edgeR

group<-rep(c("dart", "control"),c(5,7))

design<-model.matrix(~ group)

dart_vs_control<-DGEList(count=count_data_annotated[,c(2:13)])
dart_vs_control<-estimateDisp(dart_vs_control,design, robust=TRUE)

dart_vs_control_QLFit <- glmQLFit(dart_vs_control, design,robust=TRUE)
dart_vs_control_QLFit <- glmQLFTest(dart_vs_control_QLFit , coef="groupdart")
dart_vs_control_edgeR_results_QLF<- topTags(dart_vs_control_QLFit, adjust.method = "fdr", n=Inf)$table

DESeq2

group<-data.frame(colnames(count_data_annotated[,c(2:13)]), groups=as.factor(rep(c("dart", "control"),c(5,7))))

dart_vs_control<-DESeqDataSetFromMatrix(countData=count_data_annotated[,c(2:13)],colData=group, design= ~groups)

# Wald method
dart_vs_control_Wald<-DESeq(dart_vs_control, test="Wald")
dart_vs_control_Wald<-results(dart_vs_control_Wald, contrast=c("groups",  "dart", "control"), pAdjustMethod="fdr", tidy=TRUE)
dart_vs_control_Wald<-dart_vs_control_Wald[with(dart_vs_control_Wald, order(pvalue)),]

Merge both results

dart_vs_control_combined_results<-merge(dart_vs_control_edgeR_results_QLF, by.x='row.names', dart_vs_control_Wald, by.y='row')
dart_vs_control_combined_results<-dart_vs_control_combined_results[complete.cases(dart_vs_control_combined_results),]
dim(dart_vs_control_combined_results[(dart_vs_control_combined_results$FDR<0.1) & (dart_vs_control_combined_results$padj<0.1),])

## [1] 125  12

Plots

Fig 1A

dart_vs_control_combined_results_annotated<-merge(dart_vs_control_combined_results,annotation.ensembl.symbol, by.x="Row.names", by.y="ensembl_gene_id", all.x=TRUE, all.y=FALSE)
tpm_all_data_subset<-tpm_all_data[rownames(tpm_all_data) %in% dart_vs_control_combined_results_annotated[(dart_vs_control_combined_results_annotated$FDR<0.1 & dart_vs_control_combined_results_annotated$padj<0.1),]$Row.names , ]

mat_scaled = t(apply(log(tpm_all_data_subset+1), 1, scale))

colnames(mat_scaled)<-colnames(tpm_all_data_subset)

ha = HeatmapAnnotation(type = type, annotation_name_side = "left")

htmp <-Heatmap(mat_scaled,name = "Log2(TPM)\n scaled",
col = colorRamp2(c(-3, 0, 3), c("blue2", "white", "orange2")), 
show_row_names = FALSE, show_column_names = FALSE, 
top_annotation =  HeatmapAnnotation(group = rep(c("OCT4-KO", "control"), c(5,7)) , col=list(group= c( "OCT4-KO"="red", "control"="blue")) ) )

draw(htmp, legend_grouping = "original",merge_legend = TRUE)

Fig 1B

tpm_all_data_subset<-tpm_all_data[rownames(tpm_all_data) %in%  c("ENSBTAG00000025246", "ENSBTAG00000000060", "ENSBTAG00000047871", "ENSBTAG00000046092", "ENSBTAG00000021523", "ENSBTAG00000014549", "ENSBTAG00000016455", "ENSBTAG00000038103","ENSBTAG00000025441", "ENSBTAG00000051907", "ENSBTAG00000008501", "ENSBTAG00000001595"),]


data.graph<-data.frame()
for(i in 1:12){
ID<-rownames(tpm_all_data_subset)[i]
group<-rep(c("OCT4-KO", "control"), c(5,7))
TPM<-tpm_all_data_subset[i,]
data.graph<-rbind(data.frame(ID,group,TPM),data.graph)
}


data.graph$ID <- factor(data.graph$ID, levels = c(  "ENSBTAG00000025246", "ENSBTAG00000008501","ENSBTAG00000000060", "ENSBTAG00000047871", "ENSBTAG00000046092", "ENSBTAG00000021523", "ENSBTAG00000014549", "ENSBTAG00000016455", "ENSBTAG00000038103","ENSBTAG00000025441", "ENSBTAG00000051907",  "ENSBTAG00000001595"), labels = c("ZIC2","FOXD3", "PRDM14", "KLF17" ,"HDAC8" ,"STAT3" ,"GNL3" ,"CACHD1" ,"DPPA4" ,"HSPA1A" ,"CYBA" ,  "MT1E"))


ggplot(data=data.graph, aes(x=group, y=TPM , color=group))+
geom_jitter()+
facet_wrap(~ID, scale="free", nrow=2)+
scale_color_manual(values=c("blue","red"))+
scale_x_discrete(NULL)+
theme_bw()+
theme(
axis.text=element_text(color="black"),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
strip.text.x = element_text(face="italic"),
strip.background =element_rect(fill="white"),
legend.position="bottom",
legend.margin=margin(unit(-1,"cm")),
plot.margin=margin(0,0,0,0)
)

sessionInfo()

sessionInfo()

## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.6 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] grid      stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] lmtest_0.9-40               zoo_1.8-12                 
##  [3] emmeans_1.8.7               car_3.1-2                  
##  [5] carData_3.0-5               aod_1.3.2                  
##  [7] multcomp_1.4-25             TH.data_1.1-2              
##  [9] MASS_7.3-60                 survival_3.5-5             
## [11] mvtnorm_1.2-2               kableExtra_1.3.4           
## [13] data.table_1.14.8           circlize_0.4.15            
## [15] ComplexHeatmap_2.16.0       cowplot_1.1.1              
## [17] goseq_1.52.0                geneLenDataBase_1.36.0     
## [19] BiasedUrn_2.0.10            DESeq2_1.40.2              
## [21] SummarizedExperiment_1.30.2 Biobase_2.60.0             
## [23] MatrixGenerics_1.12.2       matrixStats_1.0.0          
## [25] GenomicRanges_1.52.0        GenomeInfoDb_1.36.1        
## [27] IRanges_2.34.1              S4Vectors_0.38.1           
## [29] BiocGenerics_0.46.0         edgeR_3.42.4               
## [31] limma_3.56.2                lubridate_1.9.2            
## [33] forcats_1.0.0               stringr_1.5.0              
## [35] dplyr_1.1.2                 purrr_1.0.1                
## [37] readr_2.1.4                 tidyr_1.3.0                
## [39] tibble_3.2.1                ggplot2_3.4.2              
## [41] tidyverse_2.0.0            
## 
## loaded via a namespace (and not attached):
##   [1] RColorBrewer_1.1-3       rstudioapi_0.14          jsonlite_1.8.7          
##   [4] shape_1.4.6              magrittr_2.0.3           magick_2.7.4            
##   [7] estimability_1.4.1       GenomicFeatures_1.52.1   farver_2.1.1            
##  [10] rmarkdown_2.23           GlobalOptions_0.1.2      BiocIO_1.10.0           
##  [13] zlibbioc_1.46.0          vctrs_0.6.3              Cairo_1.6-0             
##  [16] memoise_2.0.1            Rsamtools_2.16.0         RCurl_1.98-1.12         
##  [19] webshot_0.5.5            htmltools_0.5.5          S4Arrays_1.0.4          
##  [22] progress_1.2.2           curl_5.0.1               cellranger_1.1.0        
##  [25] sass_0.4.6               bslib_0.5.0              sandwich_3.0-2          
##  [28] cachem_1.0.8             GenomicAlignments_1.36.0 mime_0.12               
##  [31] lifecycle_1.0.3          iterators_1.0.14         pkgconfig_2.0.3         
##  [34] Matrix_1.5-4.1           R6_2.5.1                 fastmap_1.1.1           
##  [37] GenomeInfoDbData_1.2.10  clue_0.3-64              digest_0.6.32           
##  [40] colorspace_2.1-0         AnnotationDbi_1.62.2     RSQLite_2.3.1           
##  [43] labeling_0.4.2           filelock_1.0.2           fansi_1.0.4             
##  [46] timechange_0.2.0         abind_1.4-5              httr_1.4.6              
##  [49] mgcv_1.8-42              compiler_4.3.1           bit64_4.0.5             
##  [52] withr_2.5.0              doParallel_1.0.17        BiocParallel_1.34.2     
##  [55] DBI_1.1.3                highr_0.10               biomaRt_2.56.1          
##  [58] rappdirs_0.3.3           DelayedArray_0.26.6      rjson_0.2.21            
##  [61] tools_4.3.1              glue_1.6.2               restfulr_0.0.15         
##  [64] nlme_3.1-162             cluster_2.1.4            generics_0.1.3          
##  [67] gtable_0.3.3             tzdb_0.4.0               hms_1.1.3               
##  [70] xml2_1.3.5               utf8_1.2.3               XVector_0.40.0          
##  [73] foreach_1.5.2            pillar_1.9.0             splines_4.3.1           
##  [76] BiocFileCache_2.8.0      lattice_0.21-8           rtracklayer_1.60.0      
##  [79] bit_4.0.5                tidyselect_1.2.0         GO.db_3.17.0            
##  [82] locfit_1.5-9.8           Biostrings_2.68.1        knitr_1.43              
##  [85] svglite_2.1.1            xfun_0.39                statmod_1.5.0           
##  [88] stringi_1.7.12           yaml_2.3.7               evaluate_0.21           
##  [91] codetools_0.2-19         cli_3.6.1                xtable_1.8-4            
##  [94] systemfonts_1.0.4        munsell_0.5.0            jquerylib_0.1.4         
##  [97] readxl_1.4.3             Rcpp_1.0.11              dbplyr_2.3.2            
## [100] coda_0.19-4              png_0.1-8                XML_3.99-0.14           
## [103] parallel_4.3.1           blob_1.2.4               prettyunits_1.1.1       
## [106] bitops_1.0-7             viridisLite_0.4.2        scales_1.2.1            
## [109] crayon_1.5.2             GetoptLong_1.0.5         rlang_1.1.1             
## [112] rvest_1.0.3              KEGGREST_1.40.0

Supplementary code to
Ablation of OCT4 function in cattle embryos by double electroporation of CRISPR-Cas for DNA and RNA targeting (CRISPR-DART)

Jada Nix,Gustavo Schettini and Fernando Biase

2023-07-07

Overview

Procedures

Analysis of embryo development

Load development one electroporation vs two

GLM groups day 7

GLM groups day 8

Table S1 summary GLM

Table S2 Contrasts groups day 7

Table S3 Contrasts groups day 8

Load development DART vs controls

GLM groups day 7

GLM groups day 8

Table S4 summary GLM

Table S5 Contrasts CRISPR DART

Analysis from the time lapse culture

RNA-seq analysis

Scripts to process raw files

Alignment, filtering and counting

Create matrix with raw counts

Load annotation

Filter matrix to retain ‘protein_coding’ and ‘lncRNA’

calculate cpm and filter for lowly expressed genes

Calculate TPM

Differential transcript abundance

edgeR

DESeq2

Merge both results

Plots

Fig 1A

Fig 1B

sessionInfo()

Supplementary code to Ablation of OCT4 function in cattle embryos by double electroporation of CRISPR-Cas for DNA and RNA targeting (CRISPR-DART)

Jada Nix,Gustavo Schettini and Fernando Biase

2023-07-07

Overview

Procedures

Analysis of embryo development

Load development one electroporation vs two

GLM groups day 7

GLM groups day 8

Table S1 summary GLM

Table S2 Contrasts groups day 7

Table S3 Contrasts groups day 8

Load development DART vs controls

GLM groups day 7

GLM groups day 8

Table S4 summary GLM

Table S5 Contrasts CRISPR DART

Analysis from the time lapse culture

RNA-seq analysis

Scripts to process raw files

Alignment, filtering and counting

Create matrix with raw counts

Load annotation

Filter matrix to retain ‘protein_coding’ and ‘lncRNA’

calculate cpm and filter for lowly expressed genes

Calculate TPM

Differential transcript abundance

edgeR

DESeq2

Merge both results

Plots

Fig 1A

Fig 1B

sessionInfo()

Supplementary code to
Ablation of OCT4 function in cattle embryos by double electroporation of CRISPR-Cas for DNA and RNA targeting (CRISPR-DART)