Source: AGRICULTURAL RESEARCH SERVICE submitted to NRP
US-UK COLLABORATIVE: REASSEMBLY OF CATTLE IMMUNE GENE CLUSTERS FOR QUANTITATIVE ANALYSIS
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
1005517
Grant No.
2015-67015-22970
Cumulative Award Amt.
$477,903.00
Proposal No.
2014-05983
Multistate No.
(N/A)
Project Start Date
Feb 1, 2015
Project End Date
Jan 31, 2020
Grant Year
2015
Program Code
[A1224]- Animal Health and Production and Animal Products: US-UK Collaborative Projects: Animal Health and Disease and Veterinary Immune Reagents
Recipient Organization
AGRICULTURAL RESEARCH SERVICE
RM 331, BLDG 003, BARC-W
BELTSVILLE,MD 20705-2351
Performing Department
USDA
Non Technical Summary
1. Current problem: The genetic potential of cattle is assessed with commercial genotyping arrays, which drive a large proportion of the billion-dollar cattle industry. These genotyping arrays are based on a cattle reference genome that has several large errors within regions that are likely to be involved in cattle health and immunity. Given that diseases such as bovine tuberculosis and mastitis cause significant loss of productivity for cattle owned by farmers in the US and UK, the ability to select for cattle that show better adaptive resistance to disease will reduce costs associated with management.
2. Basic methods and approaches: A US-UK collaborative research team is funded by USDA-NIFA and the UK-BBSRC to study a topic of high priority to both countries. This funding partnership leverages the expertise in animal health of the three collaborating institutions, USDA-ARS (US), The Pirbright Institute (UK) and the Roslin Institute (UK), as well as available facilities and resources such as the USDA-ARS-MARC's PacBio sequencer, the Roslin Institute's bovine tuberculosis resistance panel, and the Pirbright Institute's expertise in immunology. Using these combined resources and expertise, the team will use the latest PacBio sequencing technology to sequence poorly characterized immune genes in the cattle genome. Newly sequenced genes will be assessed by experts on the team to ensure that the generated data are of the highest quality. Next, naturally occurring variants will be identified by comparing sequence data from 40 individual Holstein bulls to the corrected genes. These variants will be ranked based on their suitability as genetic markers and a subset will be selected for further use. Finally, a herd of 1,200 Holstein cattle will be tested for these genetic markers and the results will be used in an association study designed to identify which of the markers are associated with bovine tuberculosis resistance and susceptibility.
3. Ultimate goals: This project will provide resources to the cattle research community so that cattle can be selected for their resistance/susceptibility towards disease. In achieving this goal, we will improve the existing cattle reference genome and we will develop new genotyping assays, both of which will be publicly available resources. The successful conclusion of this project will impact the global beef and dairy cattle industries and will subsequently improve the food security of both the US and UK by enabling the breeding of cattle that have improved adaptive immunity to disease.
Animal Health Component
(N/A)
Research Effort Categories
Basic
100%
Applied
(N/A)
Developmental
0%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
304                    3410                              1080                      50%
311                    3410                              1080                      50%
Knowledge Area
311 - Animal Diseases; 304 - Animal Genome;

Subject Of Investigation
3410 - Dairy cattle, live animal;

Field Of Science
1080 - Genetics;
Goals / Objectives
The general goal of this project will be to enable future research into genetic evaluation of immune gene alleles on the overall health and disease resistance of dairy cattle. In order to accomplish this goal, this project will work towards three objectives: 1) Improve the assembly of several IGC regions by performing assembly of PacBio reads derived from Hereford and Holstein bacterial artificial chromosomes and fosmid clones; 2) Identify sequence variants within newly assembled IGC regions by aligning high throughput whole genome sequence data to the assembled IGC; 3) Select informative SNP markers within each reassembled IGC region and perform a proof of principle GWAS experiment for bovine tuberculosis susceptibility with 1,200 differentially susceptible Holstein-Friesian cattle.
Project Methods
Several cutting-edge methods will be employed by US and UK teams in order to achieve the goals of this project. Both groups will collect bacterial artificial chromosome (BAC) clones that specifically contain regions of the cattle genome that harbor immune gene clusters (IGC). In order to capture a wide array of the diversity of IGCs in the species, we will initially select clones from BAC libraries derived from four individuals (three Holstein cattle and one Hereford bull). The BAC clones will then be sequenced using the latest Pacific Biosciences (PacBio) sequencing platform and chemistry in order to achieve contiguous read lengths greater than 20 kilobases (kb). The use of PacBio sequencing is a major improvement over the prior use of Sanger or short-read sequencing techniques, whose shorter read lengths were insufficient for spanning the large repetitive regions found within the IGCs of mammalian genomes.

The US team will then assemble the PacBio reads into larger contiguous sequences and compare the new assemblies to the existing reference genome. This will provide a crucial evaluation point for the project, as the new assemblies will be tested against the following criteria: 1) the proportion of reads from other sequenced individuals mapping to the new assemblies compared to the old reference genome positions; 2) the homology between sequence from the new assemblies and the old reference assembly; 3) the count and lengths of the new assemblies for each BAC clone, as the goal is one single, contiguous block of sequence for each clone. If any of the new assemblies does not pass the evaluation stages listed above, finer control over the sequencing process and assembly criteria will be used to improve the new assembly until it is satisfactory.

Both the US and UK teams will then align the sequence data from 40 Holstein bulls to the new assemblies and identify sequence variants within them. The primary focus will be on single nucleotide polymorphisms (SNP), insertions/deletions (INDEL) and structural variants (SV) that are identified from the individually sequenced animals. Using an adaptation of a previously developed SNP selection algorithm, the US team will select candidate genetic markers from the SNP data derived from the 40 Holstein bulls. The UK team will then commission large-scale genotyping assays for use on a panel of 1,200 Holstein cattle that show differential susceptibility and resistance to bovine tuberculosis. At this stage, the performance of selected SNP variants will be assessed based on their call rate in the panel. If the call rate is poor, new markers will be selected from the previous data, and new genotyping assays will be commissioned. After genotyping is concluded, the UK team will perform a proof of principle genome-wide association study (GWAS) for susceptibility/resistance to bovine tuberculosis, incorporating the newly assayed markers with existing genotyping data on the panel of Holsteins. Candidate variants that appear to be associated with tuberculosis susceptibility/resistance will be highlighted and released to the public for eventual incorporation in new genotyping assays.
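The call-rate check described in the methods can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the project's pipeline: genotypes are assumed to be coded as 0/1/2 copies of the alternate allele with None for a failed call, and the function names and the 0.95 default threshold are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding: 0, 1 or 2 copies of the alternate allele; None = no call.
Genotype = Optional[int]


def call_rate(genotypes: List[Genotype]) -> float:
    """Fraction of samples with a successful genotype call for one marker."""
    if not genotypes:
        return 0.0
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def split_by_call_rate(
    panel: Dict[str, List[Genotype]], min_rate: float = 0.95
) -> Tuple[List[str], List[str]]:
    """Partition marker names into (passed, failed) by call-rate threshold.

    The 0.95 default is an assumed value chosen for illustration only.
    """
    passed: List[str] = []
    failed: List[str] = []
    for marker, genotypes in panel.items():
        if call_rate(genotypes) >= min_rate:
            passed.append(marker)
        else:
            failed.append(marker)
    return passed, failed
```

Markers in the failed list would be the ones replaced by new candidates from the earlier variant data before a new assay is commissioned.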

Progress 02/01/15 to 01/31/20

Outputs
Target Audience: Farmers, scientists and livestock industry personnel who need methods/mechanisms for genomic selection of health traits in cattle. Additionally, bioinformaticians and geneticists who use cattle genomics tools. Additional efforts were made to reach out to existing Research Cattle Geneticists to distribute developed markers for use in association studies related to disease susceptibility and resistance.
Changes/Problems: The contingency plan to develop fosmid libraries was not needed due to our successes in assembling IGC haplotypes. Instead, this funding was used to purchase additional computational capacity on the USDA's Ceres cluster (a part of the SCINet initiative). This computational capacity currently benefits all Ceres users by enabling them to submit short-running jobs while the nodes are idle.
What opportunities for training and professional development has the project provided? At the beginning of the project, PI Bickhart was considered an early career investigator. Due in part to this project's funding, he was able to establish a laboratory and publish several high-profile manuscripts related to the project objectives. Additionally, two US postdoctoral fellows were trained throughout the course of the project. On the UK collaborative side, PI John A. Hammond trained two postdoctoral fellows as well. The early career professionals trained as postdoctoral fellows throughout the course of this project are listed below:
Dr. Kiranmayee Bakshy, US Postdoctoral Fellow.
Dr. Juliana Young, US Postdoctoral Fellow.
Dr. John C. Schwartz, UK Postdoctoral Fellow.
Dr. Dorothea Heimeier, UK Postdoctoral Fellow.
How have the results been disseminated to communities of interest? All datasets have been uploaded to the appropriate NCBI data repository sites. Additionally, analyses have been disseminated to the scientific community through the publication of manuscripts in peer-reviewed journals. Finally, marker sequences have been distributed to independent researchers to test in ongoing trials to estimate their effectiveness.
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? The successful completion of this project's objectives enabled the creation of two reference genomes for livestock species and the generation of alternative haplotypes for immune gene clusters in taurine cattle. Reference genome assemblies are the fundamental "maps" for genetic analysis and often require substantial investment of capital, time and expertise in order to generate high quality resources for future studies. This project enabled the participation of two groups of experts from the US and UK, who contributed the requisite skills in genome sequencing and gene annotation, respectively, for reference genome assembly. This partnership resulted in the publication of the ARS1 goat (Capra hircus) and the ARS-UCD1.2 cattle (Bos taurus) reference genomes in addition to other project goals. These references are used by international research groups, but primarily benefit US and UK stakeholders by being most representative of western breeds of livestock (e.g. Hereford cattle and San Clemente goats). Furthermore, alternative haplotypes for the cattle genome assembly were assembled and used in a subsequent genome-wide association study (GWAS) to identify associations between genetic markers and bovine tuberculosis (bTB) resistance. The complexity of bTB infection and incidence meant that a single marker that explained the largest proportion of genetic variance was not identified in this study; however, new markers developed from our haplotypes show promise in improving the detection of bTB resistance in future breeding value calculations.
Objective 1. The US project team sequenced and assembled 40 BAC clones on the PacBio RSII to generate 19 unique haplotypes of three immune gene regions of the cattle genome. These haplotypes were polished and released to the public to facilitate further research into these highly polymorphic regions of the genome. During the course of this phase of the project, both the US and UK teams collaborated on two high-profile genome assembly projects involving the domestic goat (Capra hircus) and domestic cow (Bos taurus). Collaboration on these projects resulted in two highly representative reference genomes, one for each species. In both cases, the US team led the sequencing efforts while the UK team led the annotation efforts.
Objective 2. The US team aligned the sequence data of 125 Holstein bulls to the unique haplotypes assembled in Objective 1. These alignments were used in a subsequent variant-calling exercise to identify polymorphic sites that were unique to each haplotype region. The UK team then assisted in the selection of haplotype-specific genetic markers for custom genotype array design. Candidate markers were selected on the basis of their equal distribution across haplotype regions and their frequency in the population. A total of 124 SNP variants were selected for custom array design in the subsequent objective.
Objective 3. A DNA panel of 1,797 Holstein cows with bTB exposure phenotypes was genotyped with the custom panel of 124 SNP markers. Genotype data from this cohort were used to filter the custom markers to a set of 72 sites that appeared to accurately track immune gene haplotypes in the population. Using skin-test and lesion detection phenotypes from this cohort, a genome-wide association analysis was performed. Three custom markers were found to explain a relatively large proportion of the variance of phenotypes; however, these markers did not achieve genome-wide significance for this effect. All markers and analyses were published so as to contribute to ongoing efforts to selectively breed for bTB resistance in dairy cattle.
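The marker selection criteria named under Objective 2 (even distribution across a haplotype region and frequency in the population) can be sketched as below. This is a hypothetical illustration and not the project's selection algorithm: the `Variant` type, the equal-width windowing, and the 0.05 minor allele frequency cutoff are all assumptions made for the example.

```python
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    name: str
    pos: int    # 0-based position on the haplotype contig (assumed convention)
    maf: float  # minor allele frequency among the sequenced animals


def pick_evenly_spaced(
    variants: List[Variant],
    region_len: int,
    n_windows: int,
    min_maf: float = 0.05,
) -> List[Variant]:
    """Keep the highest-MAF variant in each equal-width window of the region.

    Variants below min_maf or outside the region are ignored; windows with
    no eligible variant contribute nothing to the result.
    """
    if region_len <= 0 or n_windows <= 0:
        raise ValueError("region_len and n_windows must be positive")
    best: Dict[int, Variant] = {}
    for v in variants:
        if v.maf < min_maf or not 0 <= v.pos < region_len:
            continue
        window = v.pos * n_windows // region_len
        if window not in best or v.maf > best[window].maf:
            best[window] = v
    return [best[w] for w in sorted(best)]
```

Choosing one marker per window trades some high-frequency variants for coverage of the whole region, which is the point of an "equal distribution" criterion.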

Publications

  • Type: Journal Articles Status: Published Year Published: 2019 Citation: John C Schwartz, Nicholas D Sanderson, Derek M Bickhart, Timothy PL Smith, John Anthony Hammond. 2019. The structure, evolution, and gene expression within the caprine leukocyte receptor complex. Frontiers in Immunology. 10:2302.
  • Type: Journal Articles Status: Published Year Published: 2018 Citation: John C Schwartz, Rebecca L Philp, Derek M Bickhart, Timothy PL Smith, John A Hammond. 2018. The antibody loci of the domestic goat (Capra hircus). Immunogenetics. 70:5. 317-326.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2018 Citation: Derek Bickhart, John C Schwartz, John A Hammond, Benjamin D Rosen, Juan F Medrano, Adam M Phillippy, Curtis P VanTassell, Timothy PL Smith. 2018. Long-Read Sequencing Enables the Accurate Annotation of Immune Gene Clusters in Goat. Annual Plant and Animal Genome Conference. San Diego, CA.
  • Type: Journal Articles Status: Published Year Published: 2017 Citation: John C Schwartz, Mark S Gibson, Dorothea Heimeier, Sergey Koren, Adam M Phillippy, Derek M Bickhart, Timothy PL Smith, Juan F Medrano, John A Hammond. 2017. The evolution of the natural killer complex; a comparison between mammals using new high-quality genome assemblies and targeted annotation. Immunogenetics. 69:4. 255-269.
  • Type: Journal Articles Status: Accepted Year Published: 2020 Citation: Benjamin D Rosen; Derek M Bickhart; Robert D Schnabel; Sergey Koren; Christine G Elsik; Elizabeth Tseng; Troy N Rowan; Wai Y Low; Aleksey Zimin; Christine Couldrey; Richard Hall; Wenli Li; Arang Rhie; Jay Ghurye; Stephanie D McKay; Francoise Thibaud-Nissen; Jinna Hoffman; Brenda M Murdoch; Warren M Snelling; Tara G McDaneld; John A Hammond; John C Schwartz; Wilson Nandolo; Darren E Hagen; Christian Dreischer; Sebastian J Schultheiss; Steven G Schroeder; Adam M Phillippy; John B Cole; Curtis P Van Tassell; George Liu; Timothy P.L. Smith; Juan F Medrano. 2020. De novo assembly of the cattle reference genome with single-molecule sequencing. Gigascience. Accepted.
  • Type: Journal Articles Status: Submitted Year Published: 2020 Citation: Kiranmayee Bakshy, Dorothea Harrison, John C. Schwartz, Elizabeth J. Glass, Juliana Y. Dias, Jennifer C. McClure, John B. Cole, Daniel J. Null, John A. Hammond, Timothy P. L. Smith, Derek M. Bickhart. 2020. Short Communication: Reassembly of key immune gene clusters of cattle reference genome to identify genetic markers linked to bovine Tuberculosis susceptibility and resistance. Journal of Dairy Science. Submitted.
  • Type: Journal Articles Status: Published Year Published: 2017 Citation: Derek M Bickhart, Benjamin D Rosen, Sergey Koren, Brian L Sayre, Alex R Hastie, Saki Chan, Joyce Lee, Ernest T Lam, Ivan Liachko, Shawn T Sullivan, Joshua N Burton, Heather J Huson, John C Nystrom, Christy M Kelley, Jana L Hutchison, Yang Zhou, Jiajie Sun, Alessandra Crisà, F Abel Ponce de León, John C Schwartz, John A Hammond, Geoffrey C Waldbieser, Steven G Schroeder, George E Liu, Maitreya J Dunham, Jay Shendure, Tad S Sonstegard, Adam M Phillippy, Curtis P Van Tassell, Timothy PL Smith. 2017. Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome. Nature Genetics. 49:4. 643-650.


Progress 02/01/17 to 01/31/18

Outputs
Target Audience: Farmers, scientists and livestock industry personnel who need methods/mechanisms for genomic selection of health traits in cattle. Additionally, bioinformaticians, computational biologists, and geneticists interested in further classifying genetic diversity within key immune gene regions. Specifically, efforts were taken to disseminate assembled immune gene haplotypes to ongoing cattle reference genome assembly efforts, including the ARS-UCD Dominette assembly project. Additionally, information regarding assembled haplotypes was sent to the Council on Dairy Cattle Breeding (CDCB) and the Cooperative Dairy DNA Repository (CDDR) boards.
Changes/Problems: The marker selection algorithm performed better than we had hoped, so we have no major problems/changes to report at this time.
What opportunities for training and professional development has the project provided? The US and UK teams are currently supervising the professional development of three postdoctoral fellows who are being directly funded by this project or the US-UK joint initiative:
1. Kiranmayee Bakshy, US Postdoc. Currently being trained to assist with the genome-wide association of marker states to resistance phenotypes.
2. John Schwartz, UK PostDoc, contributed to BAC library creation and selection. Published a manuscript on assembled immune gene haplotypes.
3. Dorothea Harrison, UK PostDoc, aligned assembled BAC contigs to reference genome for assessment. Assisted with SNP marker selection for the custom array design.
How have the results been disseminated to communities of interest? Specialized software and analysis notes have been submitted to a public GitHub repository (https://github.com/njdbickhart). The US and UK teams have been participating in the ARS-UCD Dominette reference genome assembly project and have contributed unique haplotype sequence to that effort in order to improve a resource that will be used by the cattle research community.
What do you plan to do during the next reporting period to accomplish the goals? The US team will use SNP marker genotypes derived from our submitted custom array in a genome-wide association study of bovine tuberculosis resistance and susceptibility. Marker significance for gathered phenotypes will be reported, and all marker probe sequences/sites will be reported to collaborators interested in applying these custom markers to their own disease resistance surveys. A peer-reviewed manuscript will be prepared and submitted to further disseminate results to the public.

Impacts
What was accomplished under these goals? The US team has aligned whole genome shotgun sequence data from 172 Holstein bulls to the previously described assembled immune gene haplotypes and a version of the ARS-UCD Dominette reference genome assembly. Using a scoring algorithm and advice from our UK colleagues, over 67,000 identified SNP and INDEL variants were filtered down to 243 high priority variants for genotyping. These high priority variants were used as templates for the generation of a 60-marker, custom genotyping array that is scheduled for testing in Spring of 2018 on a cohort of 1,200+ Holstein cattle. After testing, markers will be assessed for their association with bovine tuberculosis susceptibility/resistance phenotypes, and a decision will be made to include more markers from the high priority list based on the number of markers that passed the test phase.

Publications


    Progress 02/01/16 to 01/31/17

    Outputs
    Target Audience: Farmers, scientists and livestock industry personnel who need methods/mechanisms for genomic selection of health traits in cattle. Additionally, bioinformaticians, computational biologists, and geneticists interested in further classifying genetic diversity within key immune gene regions.
    Changes/Problems: The US team is currently accomplishing all project milestones without any major changes or issues to report. The UK team has reported some difficulty in selecting BAC clones from additional libraries. To address this issue, the US team will assist the UK team in pursuing an alternative fosmid-based screening strategy being developed by a research group in Seattle. This method promises to simplify the selection of BAC clones containing highly polymorphic immune gene segments by using PacBio reads to select larger segments of immune gene clusters. This will also present an opportunity to send a postdoctoral fellow to Seattle from the UK to engage in further training as part of this grant. If this effort does not work to increase the number of assembled haplotype segments for immune gene regions in cattle, we will rely on animal pedigree and whole-genome-shotgun Illumina reads aligned to our existing assembled haplotype segments to select for differentially inherited SNP markers within these regions. Even at this stage of project completion, we will have a large set of candidate SNP and INDEL markers to test and associate with bovine tuberculosis resistance in the selected cattle population.
    What opportunities for training and professional development has the project provided? The US team is in the process of hiring a Postdoctoral fellow, and members of the USDA MARC team include two technicians who are learning new analysis techniques. The UK team is currently training two Postdoctoral fellows in Immunology, gene annotation and de novo assembly.
    1. John Schwartz, UK PostDoc, contributed to BAC library creation and selection. Drafted a manuscript and submitted the manuscript for peer review.
    2. Dorothea Harrison, UK PostDoc, aligned assembled BAC contigs to reference genome for assessment. Drafted a manuscript and is preparing to submit for peer review.
    How have the results been disseminated to communities of interest? Specialized software and analysis notes have been submitted to a public GitHub repository (https://github.com/njdbickhart). Interested groups can reproduce all analysis methods using these note files. The UK team has written a manuscript that details the expansion of the NKC cluster in cattle and has deposited BAC clone assemblies to GenBank. The US team has contributed PacBio sequencing and clones to the forthcoming cattle reference genome assembly led by Benjamin Rosen, Timothy P.L. Smith and Juan Medrano.
    What do you plan to do during the next reporting period to accomplish the goals? The UK team will continue to pursue BAC library enrichment techniques to identify clones that represent distinct haplotypes for reassembly. The US team will align WGS data from 245 sequenced Holstein bulls to assembled haplotypes in order to identify unique SNP and INDEL marker candidates. Sequenom testing will be commissioned using the top 50 marker candidates on a 1,200 sample DNA panel present in the UK to ascertain marker frequency. Markers that pass initial tests will be published and distributed to the public and interested research groups.

    Impacts
    What was accomplished under these goals? The US team has sequenced and assembled an additional 12 BAC clones from the CHORI-240 and RPCI-42 libraries to increase the coverage of haplotypes represented in these libraries. Additionally, alignment of Illumina WGS data held by the US team to the BAC clones has begun. Discovered SNPs and INDELs are currently being analyzed and tested to ensure that suitable markers are generated for downstream genotyping.

    Publications

    • Type: Journal Articles Status: Under Review Year Published: 2016 Citation: John C Schwartz, Mark S Gibson, Dorothea Heimeier, Sergey Koren, Adam M Phillippy, Derek M Bickhart, Timothy PL Smith, Juan F Medrano, John A Hammond. 2016. The evolution of the natural killer complex; a comparison between mammals using new high-quality genome assemblies and targeted annotation. bioRxiv 069922; doi: https://doi.org/10.1101/069922
    • Type: Journal Articles Status: Under Review Year Published: 2016 Citation: Derek M Bickhart, Benjamin D Rosen, Sergey Koren, Brian L Sayre, Alex R Hastie, Saki Chan, Joyce Lee, Ernest T Lam, Ivan Liachko, Shawn T Sullivan, Joshua N Burton, Heather J Huson, Christy M Kelley, Jana L Hutchison, Yang Zhou, Jiajie Sun, Alessandra Crisa, F. Abel Ponce de Leon, John C Schwartz, John A Hammond, Geoffrey C Waldbieser, Steven G Schroeder, George E Liu, Maitreya J Dunham, Jay Shendure, Tad S Sonstegard, Adam M Phillippy, Curtis P Van Tassell, Timothy P.L. Smith. 2016. Single-molecule sequencing and conformational capture enable de novo mammalian reference genomes. bioRxiv 064352; doi: https://doi.org/10.1101/064352


    Progress 02/01/15 to 01/31/16

    Outputs
    Target Audience: Farmers, scientists and livestock industry personnel who need methods/mechanisms for genomic selection of health traits in cattle. Additionally, bioinformaticians and geneticists who use cattle genomics tools.
    Changes/Problems: Difficulties have arisen with the initial selection of BAC clones for assembly. The UK team has developed probes for BAC library screening that promise to provide an unbiased means of selecting individual clones that directly overlap IGC regions. In addition, the UK team is developing BAC libraries that allow in silico screening to identify positive clones, enabling the isolation of more IGCs. Other approaches for specific genomic enrichment and sequencing are being explored. The US team will work closely with their UK collaborators to implement these techniques on the US-held BAC libraries to increase the coverage of IGC regions. Several important clones have already been assembled regardless, but these methods promise to retrieve the full diversity of haplotypes that are present in the US BAC libraries.
    What opportunities for training and professional development has the project provided? The US team is in the process of hiring a Postdoctoral fellow, and members of the USDA MARC team include two technicians who are learning new analysis techniques. The UK team is currently training two Postdoctoral fellows in Immunology, gene annotation and de novo assembly.
    1. John Schwartz, UK PostDoc, contributed to BAC library creation and selection.
    2. Dorothea Harrison, UK PostDoc, aligned assembled BAC contigs to reference genome for assessment.
    How have the results been disseminated to communities of interest? Specialized software and analysis notes have been submitted to a public GitHub repository (https://github.com/njdbickhart). Interested groups can reproduce all analysis methods using these note files. Additionally, the UK team is drafting a manuscript detailing the general structure of the NKC gene cluster, which is a critical IGC that is being targeted for reassembly. Details on the gene structure of this region will greatly assist the project by providing context for determining different haplotypes of the cluster.
    What do you plan to do during the next reporting period to accomplish the goals? In order to accomplish objective 1, the US team will use the oligo probes developed by the UK team to select specific BAC clones that directly overlap misassembled IGC regions. The UK team will also use this technique to select BAC clones from libraries created from the DNA of two UK Holstein cattle. Both teams will isolate BAC DNA and send samples to the USDA MARC for PacBio sequencing. In order to assess the quality of the assembled DNA sequence, the US team will develop new scripts that test Illumina whole genome sequence data alignments. The UK team will assess assembled contig gene annotation to determine the coverage of key IGC regions. In order to address objective 2, the US team will scaffold the contigs, and both the US and UK teams will align whole genome shotgun sequence data from 56 individual Holstein bulls to the scaffolds. Sequence read mappability profiles will be developed for each scaffold, and key marker SNPs will be identified from the derived variant calling data.

    Impacts
    What was accomplished under these goals? Combined, the US and UK teams have selected and sequenced 46 BAC clones that flank, or overlap, targeted IGC regions. The US team has focused on the sequencing of clones using PacBio technologies, and the assembly of these clones into contiguous stretches of DNA sequence (contigs). To this end, the US team has developed novel scripts and analysis pipelines to assess the quality and locations of the assembled BAC clones. The UK team has assisted with the initial discovery of overlapping BAC clones, and provided essential expertise in the annotation and assessment of IGC gene segments. UK group alignments have enabled the US team to assess the coverage of IGC regions, and to redirect focus towards additional BAC clones for sequencing. Both groups have selected a wide grouping of BAC clones for sequencing, and the resultant sequencing and assembly has produced three contigs that represent two haplotypes of the Natural Killer Cell (NKC) gene complex that are misassembled on the cattle UMD3.1 reference genome. The US team has written two assembly quality assessment scripts and submitted them to public programming repositories. The UK team has also developed oligo probes for selecting BAC clones that span critical IGC regions, as bioinformatics approaches to select BACs for sequencing have been difficult.

    Publications