Water buffalo (Bubalus bubalis L.) is an important livestock species worldwide. Like many other livestock species, water buffalo lacks a high-quality, contiguous reference genome assembly, which is required for fine-scale comparative genomics studies. In this work, we present a dataset that characterizes genomic differences between the water buffalo genome and the extensively studied cattle (Bos taurus taurus) reference genome. The dataset was obtained by aligning 14 river buffalo whole genome sequencing datasets to the cattle reference. It consists of 13,444 deletion CNV regions and 11,050 merged mobile element insertion (MEI) events, 568 of which lie within the upstream regions of annotated cattle genes. Gene expression data from cattle and buffalo are also presented for genes impacted by these regions.
This study sought to characterize differences in gene content, regulation and structure between taurine cattle and river buffalo (2n=50), one extant type of water buffalo, using the extensively annotated UMD3.1 cattle reference genome as the basis for comparison. Using 14 WGS datasets from river buffalo, we identified 13,444 deletion CNV regions (Supplemental Table 1) that are present in river buffalo but were not identified in cattle. We also present 11,050 merged mobile element insertion (MEI) events (Supplemental Table 2) in river buffalo, 568 of which are within the upstream regions of annotated cattle genes. Furthermore, our tissue transcriptomics analysis provides expression profiles of genes impacted by the MEI (Supplemental Tables 3–6) and CNV (Supplemental Table 7) events identified in this study. The data provide the genomic coordinates of the identified CNV deletions and MEI events. Additionally, normalized read counts of impacted genes, along with the adjusted p-values from statistical analysis, are presented (Supplemental Tables 3–6).
Genomic coordinates of identified CNV-deletion and MEI events, and Ensembl gene names of impacted genes (Supplemental Tables 1 and 2)
Gene expression profiles and statistical significance (adjusted p-values) of genes impacted by MEI in liver (Supplemental Tables 3 and 4)
Gene expression profiles and statistical significance (adjusted p-values) of genes impacted by MEI in muscle (Supplemental Tables 5 and 6)
Gene expression profiles and statistical significance (adjusted p-values) of genes impacted by CNV deletions in river buffalo (Supplemental Table 7)
Public availability of this dataset will allow further analyses and functional annotation of genes that are potentially associated with phenotypic differences between cattle and water buffalo. Raw read data from whole genome and transcriptome sequencing were deposited in NCBI BioProjects.
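For readers who wish to reuse the coordinate tables, the selection of MEI events that fall within upstream regions of annotated genes can be illustrated with a minimal Python sketch. This is not the pipeline used to produce the dataset: the 2,000 bp window size, the record layouts (`Gene`, `MEI`), the function names and the toy coordinates are all assumptions introduced here for illustration, and the actual upstream definition should be taken from the original methods.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    gene_id: str   # hypothetical identifier, e.g. an Ensembl gene ID
    chrom: str
    start: int     # 1-based, inclusive
    end: int       # 1-based, inclusive
    strand: str    # "+" or "-"


class MEI(NamedTuple):
    chrom: str
    start: int     # 1-based, inclusive
    end: int       # 1-based, inclusive


def upstream_window(gene, size=2000):
    """Return (lo, hi) of the strand-aware upstream window.

    The 2,000 bp default is an assumed value, not one taken from the dataset.
    """
    if gene.strand == "+":
        return max(1, gene.start - size), gene.start - 1
    return gene.end + 1, gene.end + size


def genes_with_upstream_mei(genes, meis, size=2000):
    """Map each gene ID to the MEI events overlapping its upstream window."""
    hits = {}
    for gene in genes:
        lo, hi = upstream_window(gene, size)
        for mei in meis:
            # Two closed intervals overlap when each starts before the other ends.
            if mei.chrom == gene.chrom and mei.start <= hi and mei.end >= lo:
                hits.setdefault(gene.gene_id, []).append(mei)
    return hits


# Toy coordinates for illustration only (not taken from the supplemental tables).
genes = [
    Gene("geneA", "chr1", 10000, 12000, "+"),
    Gene("geneB", "chr1", 20000, 22000, "-"),
]
meis = [
    MEI("chr1", 9500, 9800),    # upstream of geneA (plus strand)
    MEI("chr1", 22500, 22600),  # upstream of geneB (minus strand)
    MEI("chr1", 15000, 15100),  # intergenic, outside both windows
    MEI("chr2", 9500, 9800),    # different chromosome
]
hits = genes_with_upstream_mei(genes, meis)
```

The nested loop is adequate for a sketch; with roughly 11,050 events and a full gene annotation, an interval tree or a sorted sweep would be the more practical choice.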