Bibliography on: Pangenome

The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.


ESP: PubMed Auto Bibliography. Created: 31 Aug 2024 at 01:30

Pangenome

Although the enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all, members of the "species". Together these make up the species' pan-genome.
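
The three definitions above map directly onto set operations. The following is a minimal sketch, assuming each genome has already been reduced to a set of gene names; the function name `partition_pangenome`, the strain labels, and the gene lists are hypothetical and chosen only for illustration.

```python
from collections import Counter

def partition_pangenome(genomes):
    """Split gene names into core, flexible, and pan sets.

    genomes: dict mapping a strain name to the set of gene names it carries.
    """
    if not genomes:
        return set(), set(), set()
    # Count in how many genomes each gene occurs.
    counts = Counter(gene for genes in genomes.values() for gene in set(genes))
    pan = set(counts)                                            # found in at least one genome
    core = {g for g, c in counts.items() if c == len(genomes)}   # found in every genome
    flex = pan - core                                            # found in some, but not all
    return core, flex, pan

# Hypothetical toy data, not taken from any cited study.
strains = {
    "strain_A": {"dnaA", "gyrB", "recA", "blaTEM"},
    "strain_B": {"dnaA", "gyrB", "recA", "tetM"},
    "strain_C": {"dnaA", "gyrB", "recA", "blaTEM", "vanA"},
}
core, flex, pan = partition_pangenome(strains)
print(sorted(core))  # ['dnaA', 'gyrB', 'recA']
print(sorted(flex))  # ['blaTEM', 'tetM', 'vanA']
print(len(pan))      # 6
```

Real pangenome tools first cluster genes into orthologous families before counting presence and absence; this sketch skips that step and treats identical names as the same gene.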

Created with PubMed® Query: ( pangenome OR "pan-genome" OR "pan genome" ) NOT pmcbook NOT ispreviousversion
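
For readers who want to rerun the query, one option is the NCBI E-utilities `esearch` endpoint. The sketch below only builds the request URL and does not contact the network. How this bibliography was actually generated is not stated here, so the use of E-utilities, the helper name `build_esearch_url`, and the `retmax` value are assumptions.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Public NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The query string exactly as shown above.
QUERY = '( pangenome OR "pan-genome" OR "pan genome" ) NOT pmcbook NOT ispreviousversion'

def build_esearch_url(term, retmax=100):
    """Return an esearch URL for a PubMed query (hypothetical helper)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)

url = build_esearch_url(QUERY)
print(url)
```

The returned URL can be opened in a browser or fetched with any HTTP client; NCBI's usage guidelines on request rates apply.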

Citations: The Papers (from PubMed®)


RevDate: 2024-08-28
CmpDate: 2024-08-28

King AC, Kumar N, Mellor KC, et al (2024)

Comparison of gene-by-gene and genome-wide short nucleotide sequence-based approaches to define the global population structure of Streptococcus pneumoniae.

Microbial genomics, 10(8):.

Defining the population structure of a pathogen is a key part of epidemiology, as genomically related isolates are likely to share key clinical features such as antimicrobial resistance profiles and invasiveness. Multiple different methods are currently used to cluster together closely related genomes, potentially leading to inconsistency between studies. Here, we use a global dataset of 26 306 Streptococcus pneumoniae genomes to compare four clustering methods: gene-by-gene seven-locus MLST, core genome MLST (cgMLST)-based hierarchical clustering (HierCC) assignments, life identification number (LIN) barcoding and k-mer-based PopPUNK clustering (known as GPSCs in this species). We compare the clustering results with phylogenetic and pan-genome analyses to assess their relationship with genome diversity and evolution, as we would expect a good clustering method to form a single monophyletic cluster that has high within-cluster similarity of genomic content. We show that the four methods are generally able to accurately reflect the population structure based on these metrics and that the methods were broadly consistent with each other. We investigated further to study the discrepancies in clusters. The greatest concordance was seen between LIN barcoding and HierCC (adjusted mutual information score=0.950), which was expected given that both methods utilize cgMLST, but have different methods for defining an individual cluster and different core genome schema. However, the existence of differences between the two methods shows that the selection of a core genome schema can introduce inconsistencies between studies. GPSC and HierCC assignments were also highly concordant (AMI=0.946), showing that k-mer-based methods which use the whole genome and do not require the careful selection of a core genome schema are just as effective at representing the population structure. 
Additionally, where there were differences in clustering between these methods, this could be explained by differences in the accessory genome that were not identified in cgMLST. We conclude that for S. pneumoniae, standardized and stable nomenclature is important as the number of genomes available expands. Furthermore, the research community should transition away from seven-locus MLST, whilst cgMLST, GPSC and LIN assignments should be used more widely. However, to allow for easy comparison between studies and to make previous literature relevant, the reporting of multiple clustering names should be standardized within the research.

RevDate: 2024-08-28

Chen G, Shi G, Dai Y, et al (2024)

Graph-Based Pan-Genome Reveals the Pattern of Deleterious Mutations during the Domestication of Saccharomyces cerevisiae.

Journal of fungi (Basel, Switzerland), 10(8): pii:jof10080575.

The "cost of domestication" hypothesis suggests that the domestication of wild species increases the number, frequency, and/or proportion of deleterious genetic variants, potentially reducing their fitness in the wild. While extensively studied in domesticated species, this phenomenon remains understudied in fungi. Here, we used Saccharomyces cerevisiae, the world's oldest domesticated fungus, as a model to investigate the genomic characteristics of deleterious variants arising from fungal domestication. Employing a graph-based pan-genome approach, we identified 1,297,761 single nucleotide polymorphisms (SNPs), 278,147 insertion/deletion events (indels; <30 bp), and 19,967 non-redundant structural variants (SVs; ≥30 bp) across 687 S. cerevisiae isolates. Comparing these variants with synonymous SNPs (sSNPs) as neutral controls, we found that the majority of the derived nonsynonymous SNPs (nSNPs), indels, and SVs were deleterious. Heterozygosity was positively correlated with the impact of deleterious SNPs, suggesting a role of genetic diversity in mitigating their effects. The domesticated isolates exhibited a higher additive burden of deleterious SNPs (dSNPs) than the wild isolates, but a lower burden of indels and SVs. Moreover, the domesticated S. cerevisiae showed reduced rates of adaptive evolution relative to the wild S. cerevisiae. In summary, deleterious variants tend to be heterozygous, which may mitigate their harmful effects, but they also constrain breeding potential. Addressing deleterious alleles and minimizing the genetic load are crucial considerations for future S. cerevisiae breeding efforts.

RevDate: 2024-08-28

Gagie T (2024)

How to Find Long Maximal Exact Matches and Ignore Short Ones.

Developments in language theory. Conference on Developments in Language Theory, 14791:131-140.

Finding maximal exact matches (MEMs) between strings is an important task in bioinformatics, but it is becoming increasingly challenging as geneticists switch to pangenomic references. Fortunately, we are usually interested only in the relatively few MEMs that are longer than we would expect by chance. In this paper we show that under reasonable assumptions we can find all MEMs of length at least L between a pattern of length m and a text of length n in O(m) time plus extra O(log n) time only for each MEM of length at least nearly L using a compact index for the text, suitable for pangenomics.
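
To make the problem statement concrete, here is a naive reference sketch of what "all MEMs of length at least L" means. It is a brute-force scan, not the compact-index method of the paper, and it runs in far worse than O(m) time; the function name `find_mems` and the example strings are hypothetical.

```python
def find_mems(pattern, text, min_len):
    """Return (pattern_start, text_start, length) for every MEM of length >= min_len.

    A match is maximal when it cannot be extended to the left or to the right.
    """
    mems = []
    m, n = len(pattern), len(text)
    for i in range(m):
        for j in range(n):
            if pattern[i] != text[j]:
                continue
            # Skip starts that could be extended to the left: not left-maximal.
            if i > 0 and j > 0 and pattern[i - 1] == text[j - 1]:
                continue
            # Extend to the right as far as the characters agree.
            k = 0
            while i + k < m and j + k < n and pattern[i + k] == text[j + k]:
                k += 1
            if k >= min_len:
                mems.append((i, j, k))
    return mems

print(find_mems("GATTACA", "TTGATTAGACA", 3))  # [(0, 2, 5), (4, 8, 3)]
```

In the example, "GATTA" (pattern position 0, text position 2) and "ACA" (pattern position 4, text position 8) are the only matches of length 3 or more that cannot be extended in either direction.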

RevDate: 2024-08-27

Aoun N, Georgoulis SJ, Avalos JK, et al (2024)

A pangenomic atlas reveals eco-evolutionary dynamics that shape type VI secretion systems in plant-pathogenic Ralstonia.

mBio [Epub ahead of print].

Soilborne Ralstonia solanacearum species complex (RSSC) pathogens disrupt microbial communities as they invade roots and fatally wilt plants. RSSC pathogens secrete antimicrobial toxins using a type VI secretion system (T6SS). To investigate how evolution and ecology have shaped the T6SS of these bacterial pathogens, we analyzed the T6SS gene content and architecture across the RSSC and their evolutionary relatives. Our analysis reveals that two ecologically similar Burkholderiaceae taxa, xylem-pathogenic RSSC and Paracidovorax, have convergently evolved to wield large arsenals of T6SS toxins. To understand the mechanisms underlying genomic enrichment of T6SS toxins, we compiled an atlas of 1,066 auxiliary T6SS toxin clusters ("aux" clusters) across 99 high-quality RSSC genomes. We classified 25 types of aux clusters with toxins that predominantly target lipids, nucleic acids, or unknown cellular substrates. The aux clusters were located in diverse genetic neighborhoods and had complex phylogenetic distributions, suggesting frequent horizontal gene flow. Phages and other mobile genetic elements account for most of the aux cluster acquisition on the chromosome but very little on the megaplasmid. Nevertheless, RSSC genomes were more enriched in aux clusters on the megaplasmid. Although the single, ancestral T6SS was broadly conserved in the RSSC, the T6SS has been convergently lost in atypical, non-soilborne lineages. Overall, our data suggest dynamic interplay between the lifestyle of RSSC lineages and the evolution of T6SSes with robust arsenals of toxins. This pangenomic atlas poises the RSSC as an emerging, tractable model to understand the role of the T6SS in shaping pathogen populations.

IMPORTANCE: We explored the eco-evolutionary dynamics that shape the inter-microbial warfare mechanisms of a globally significant plant pathogen, the Ralstonia solanacearum species complex.
We discovered that most Ralstonia wilt pathogens have evolved extensive and diverse repertoires of type VI secretion system-associated antimicrobial toxins. These expansive toxin arsenals potentially enhance the ability of Ralstonia pathogens to invade plant microbiomes, enabling them to rapidly colonize and kill their host plants. We devised a classification system to categorize the Ralstonia toxins. Interestingly, many of the toxin gene clusters are encoded on mobile genetic elements, including prophages, which may be mutualistic symbionts that enhance the inter-microbial competitiveness of Ralstonia wilt pathogens. Moreover, our findings suggest that the convergent loss of this multi-gene trait contributes to genome reduction in two vector-transmitted lineages of Ralstonia pathogens. Our findings demonstrate that the interplay between microbial ecology and pathogen lifestyle shapes the evolution of a genetically complex antimicrobial weapon.

RevDate: 2024-08-27

Huang B, Fan C, Chen K, et al (2024)

VCAT: an integrated variant function annotation tools.

Human genetics [Epub ahead of print].

The development of sequencing technology has promoted discovery of variants in the human genome. Identifying functions of these variants is important for us to link genotype to phenotype, and to diagnose diseases. However, it usually requires researchers to visit multiple databases. Here, we presented a one-stop webserver for variant function annotation tools (VCAT, https://biomed.nscc-gz.cn/zhaolab/VCAT/) that is the first one connecting variant to functions via the epigenome, protein, drug and RNA. VCAT is also the first one to make all annotations visualized in interactive charts or molecular structures. VCAT allows users to upload data in VCF format, and download results via a URL. Moreover, VCAT has annotated a huge number (1,262,041,068) of variants collected from dbSNP, 1000 Genomes projects, gnomAD, ICGC, TCGA, and HPRC Pangenome project. For these variants, users are able to search for their functions, related diseases and drugs from VCAT. In summary, VCAT provides a one-stop webserver to explore the potential functions of human genomic variants including their relationship with diseases and drugs.

RevDate: 2024-08-27

Tiwari VK, Saripalli G, Sharma PK, et al (2024)

Wheat genomics: genomes, pangenomes, and beyond.

Trends in genetics : TIG pii:S0168-9525(24)00170-7 [Epub ahead of print].

There is an urgent need to improve wheat for upcoming challenges, including biotic and abiotic stresses. Sustainable wheat improvement requires the introduction of new genes and alleles in high-yielding wheat cultivars. Using new approaches, tools, and technologies to identify and introduce new genes in wheat cultivars is critical. High-quality genomes, transcriptomes, and pangenomes provide essential resources and tools to examine wheat closely to identify and manipulate new and targeted genes and alleles. Wheat genomics has improved excellently in the past 5 years, generating multiple genomes, pangenomes, and transcriptomes. Leveraging these resources allows us to accelerate our crop improvement pipelines. This review summarizes the progress made in wheat genomics and trait discovery in the past 5 years.

RevDate: 2024-08-27

Olawoye IB, Waglechner N, McIntosh F, et al (2024)

Genomic Epidemiology of Mycobacterium abscessus on the Island of Montréal Not Suggestive of Healthcare-associated Person-to-Person Transmission.

The Journal of infectious diseases pii:7737256 [Epub ahead of print].

BACKGROUND: Mycobacterium abscessus complex (MABC), an opportunistic nontuberculous mycobacteria (NTM), can lead to poor clinical outcomes in pulmonary infections. Conflicting data exist on person-to-person transmission of MABC within and across healthcare facilities. To investigate further, a comprehensive retrospective study across five healthcare institutions on the Island of Montréal was undertaken.

METHODS: We analyzed the genomes of 221 MABC isolates obtained from 115 individuals (2010-2018) to identify possible links. Genetic similarity, defined as ≤25 single-nucleotide polymorphisms (SNPs), was investigated through a blinded epidemiological inquiry.

RESULTS: Bioinformatics analyses identified 28 sequence types (STs), including globally observed dominant circulating clones (DCCs). Further analysis revealed 210 isolate pairs within the SNP threshold. Among these pairs, there was one possible lab contamination where isolates from different patients processed in the same lab differed by only 2 SNPs. There were 37 isolate pairs from patients who had provided specimens from the same hospital; however, epidemiological analysis found no evidence of healthcare-associated person-to-person transmission between these patients. Additionally, pan-genome analysis showed higher discriminatory power than core genome analysis for examining genomic similarity.

CONCLUSIONS: Genomics alone is insufficient to establish MABC transmission, particularly considering the genetic similarity and wide distribution of DCCs, although pan-genome analysis has the potential to add further insight. Our findings indicate that MABC infections in Montréal are unlikely attributable to healthcare-associated person-to-person transmission.
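
The pairwise screening step described in METHODS can be sketched as follows, assuming the isolates are already available as equal-length aligned sequences. The helper names, the toy sequences, and the toy threshold of 2 are hypothetical; the study itself used a threshold of ≤25 SNPs, and real pipelines typically also handle gaps and ambiguous bases, which this sketch does not.

```python
from itertools import combinations

def snp_distance(seq_a, seq_b):
    """Count differing positions between two equal-length aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)

def pairs_within_threshold(isolates, max_snps):
    """Return (name_a, name_b, distance) for every pair with distance <= max_snps."""
    hits = []
    # Sort by isolate name so the output order is deterministic.
    for (name_a, seq_a), (name_b, seq_b) in combinations(sorted(isolates.items()), 2):
        distance = snp_distance(seq_a, seq_b)
        if distance <= max_snps:
            hits.append((name_a, name_b, distance))
    return hits

# Hypothetical toy alignment, not data from the study.
toy = {"iso1": "ACGTACGTAC", "iso2": "ACGTACGTAA", "iso3": "TCGAACGTTT"}
print(pairs_within_threshold(toy, max_snps=2))  # [('iso1', 'iso2', 1)]
```

As the CONCLUSIONS note, a pair falling within the threshold is only a candidate link; it still needs epidemiological follow-up.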

RevDate: 2024-08-26

Švara A, Sun H, Fei Z, et al (2024)

Advancing apple genetics research: Malus coronaria and Malus ioensis genomes and a gene family-based pangenome of native North American apples.

DNA research : an international journal for rapid publication of reports on genes and genomes pii:7741556 [Epub ahead of print].

Wild Malus species flourished in North America long before Europeans introduced domesticated apples. Malus coronaria and M. ioensis are native to the mid-western and eastern USA, while M. angustifolia and M. fusca grow in the southeast and west, respectively. They offer disease resistance, climate and soil adaptability, and horticultural traits for apple breeding. However, their utilization remains limited due to insufficient genomic resources and specific genetics. We report high-quality phased chromosome-scale assemblies of M. coronaria and M. ioensis, generated using long-read and conformation capture sequencing. Phylogenetic and synteny analysis indicated high relatedness between these two genomes and previously-published genome of M. angustifolia, and lower relatedness with M. fusca. Gene family-based pangenome of North American Malus identified 60,211 orthogroups containing 340,087 genes. Genes involved in basic cellular and metabolic processes, growth, and development were core to the existence of these species, whereas genes involved in secondary metabolism, stress response, and interactions with other organisms were accessory and are likely associated with adaptation to specific environments. Structural variation hotspots were mostly overlapping with high gene density. This study offers novel native North American Malus genome resources that can be used to identify genes for apple breeding and understand their evolution and adaptation.

RevDate: 2024-08-25

Arjun OK, Sethi M, Parida D, et al (2024)

Comprehensive physiological and genomic characterization of a potential probiotic strain, Lactiplantibacillus plantarum ILSF15, isolated from the gut of tribes of Odisha, India.

Gene pii:S0378-1119(24)00763-7 [Epub ahead of print].

Characterizing probiotic features of organisms isolated from diverse environments can lead to the discovery of novel strains with promising functional features and health attributes. The present study attempts to characterize a novel probiotic strain isolated from the gut of the tribal population of Odisha, India. Based on 16S rRNA-based phylogeny, the strain was identified as a species of the Lactiplantibacillus genus and was named Lactiplantibacillus plantarum strain ILSF15. The current investigation focuses on elucidating this strain's genetic and physiological properties associated with probiotic attributes such as biosafety risk, host adaptation/survival traits, and beneficial functional features. The novel strain was observed, in vitro, exhibiting features such as acid/bile tolerance, adhesion to the host enteric epithelial cells, cholesterol assimilation, and pathogen exclusion, indicating its ability to survive the harsh environment of the human GIT and resist the growth of harmful microorganisms. Additionally, the L. plantarum ILSF15 strain was found to harbor genes associated with the metabolism and synthesis of various bioactive molecules, including amino acids, carbohydrates, lipids, and vitamins, highlighting the organism's ability to efficiently utilize diverse resources and contribute to the host's nutrition and health. Several genes involved in host adaptation/survival strategies and host-microbe interactions were also identified from the ILSF15 genome. Moreover, L. plantarum strains, in general, were found to have an open pangenome characterized by high genetic diversity and the absence of specific lineages associated with particular habitats, signifying its versatile nature and potential applications in probiotic and functional food industries.

RevDate: 2024-08-24
CmpDate: 2024-08-24

He H, Leng Y, Cao X, et al (2024)

The pan-tandem repeat map highlights multiallelic variants underlying gene expression and agronomic traits in rice.

Nature communications, 15(1):7291.

Tandem repeats (TRs) are genomic regions that tandemly change in repeat number, which are often multiallelic. Their characteristics and contributions to gene expression and quantitative traits in rice are largely unknown. Here, we survey rice TR variations based on 231 genome assemblies and the rice pan-genome graph. We identify 227,391 multiallelic TR loci, including 54,416 TR variations that are absent from the Nipponbare reference genome. Only 1/3 TR variations show strong linkage with nearby bi-allelic variants (SNPs, Indels and PAVs). Using 193 panicle and 202 leaf transcriptomic data, we reveal 485 and 511 TRs act as QTLs independently of other bi-allelic variations to nearby gene expression, respectively. Using plant height and grain width as examples, we identify and validate TRs contributions to rice agronomic trait variations. These findings would enhance our understanding of the functions of multiallelic variants and facilitate rice molecular breeding.

RevDate: 2024-08-23
CmpDate: 2024-08-23

Wang S, Sun S, Wang Q, et al (2024)

PathoTracker: an online analytical metagenomic platform for Klebsiella pneumoniae feature identification and outbreak alerting.

Communications biology, 7(1):1038.

Clinical metagenomics (CMg) Nanopore sequencing can facilitate infectious disease diagnosis. In China, sub-lineages ST11-KL64 and ST11-KL47 Carbapenem-resistant Klebsiella pneumoniae (CRKP) are widely prevalent. We propose PathoTracker, a specially compiled database and arranged method for strain feature identification in CMg samples and CRKP traceability. A database targeting high-prevalence horizontal gene transfer in CRKP strains and a ST11-only database for distinguishing two sub-lineages in China were created. To make the database user-friendly, facilitate immediate downstream strain feature identification from raw Nanopore metagenomic data, and avoid the need for phylogenetic analysis from scratch, we developed data analysis methods. The methods included pre-performed phylogenetic analysis, gene-isolate-cluster index and multilevel pan-genome database and reduced storage space by 10-fold and random-access memory by 52-fold compared with normal methods. PathoTracker can provide accurate and fast strain-level analysis for CMg data after 1 h Nanopore sequencing, allowing early warning of outbreaks. A user-friendly page (http://PathoTracker.pku.edu.cn/) was developed to facilitate online analysis, including strain-level feature, species identifications and phylogenetic analyses. PathoTracker proposed in this study will aid in the downstream analysis of CMg.

RevDate: 2024-08-22
CmpDate: 2024-08-22

Fang Y, Xiao X, Lin J, et al (2024)

Pan-genome and phylogenomic analyses highlight Hevea species delineation and rubber trait evolution.

Nature communications, 15(1):7232.

The para rubber tree (Hevea brasiliensis) is the world's sole commercial source of natural rubber, a vital industrial raw material. However, the narrow genetic diversity of this crop poses challenges for rubber breeding. Here, we generate high-quality de novo genome assemblies for three H. brasiliensis cultivars, two H. brasiliensis wild accessions, and three other Hevea species (H. nitida, H. pauciflora, and H. benthamiana). Through analyzing genomes of 94 Hevea accessions, we identify five distinct lineages that do not align with their previous species delineations. We discover multiple accessions with hybrid origins between these lineages, indicating incomplete reproductive isolation between them. Only two out of four wild lineages have been introduced to commercial rubber cultivars. Furthermore, we reveal that the rubber production traits emerged following the development of a large REF/SRPP gene cluster and its functional specialization in rubber-producing laticifers within this genus. These findings would enhance rubber breeding and benefit research communities.

RevDate: 2024-08-22

Mangal V, Verma LK, Singh SK, et al (2024)

Triumphs of genomic-assisted breeding in crop improvement.

Heliyon, 10(15):e35513.

Conventional breeding approaches have played a significant role in meeting the food demand remarkably well until now. However, the increasing population, yield plateaus in certain crops, and limited recombination necessitate using genomic resources for genomics-assisted crop improvement programs. As a result of advancements in the next-generation sequence technology, GABs have developed dramatically to characterize allelic variants and facilitate their rapid and efficient incorporation in crop improvement programs. Genomics-assisted breeding (GAB) has played an important role in harnessing the potential of modern genomic tools, exploiting allelic variation from genetic resources and developing cultivars over the past decade. The availability of pangenomes for major crops has been a significant development, albeit with varying degrees of completeness. Even though adopting these technologies is essentially determined on economic grounds and cost-effective assays, which create a wealth of information that can be successfully used to exploit the latent potential of crops. GAB has been instrumental in harnessing the potential of modern genomic resources and exploiting allelic variation for genetic enhancement and cultivar development. GAB strategies will be indispensable for designing future crops and are expected to play a crucial role in breeding climate-smart crop cultivars with higher nutritional value.

RevDate: 2024-08-21

Chan DTC, Bernstein HC (2024)

Pangenomic landscapes shape performances of a synthetic genetic circuit across Stutzerimonas species.

mSystems [Epub ahead of print].

Engineering identical genetic circuits into different species typically results in large differences in performance due to the unique cellular environmental context of each host, a phenomenon known as the "chassis-effect" or "context-dependency". A better understanding of how genomic and physiological contexts underpin the chassis-effect will improve biodesign strategies across diverse microorganisms. Here, we combined a pangenomic-based gene expression analysis with quantitative measurements of performance from an engineered genetic inverter device to uncover how genome structure and function relate to the observed chassis-effect across six closely related Stutzerimonas hosts. Our results reveal that genome architecture underpins divergent responses between our chosen non-model bacterial hosts to the engineered device. Specifically, differential expression of the core genome, gene clusters shared between all hosts, was found to be the main source of significant concordance to the observed differential genetic device performance, whereas specialty genes from respective accessory genomes were not significant. A data-driven investigation revealed that genes involved in denitrification and components of trans-membrane transporter proteins were among the most differentially expressed gene clusters between hosts in response to the genetic device. Our results show that the chassis-effect can be traced along differences among the most conserved genome-encoded functions and that these differences create a unique biodesign space among closely related species.

IMPORTANCE: Contemporary synthetic biology endeavors often default to a handful of model organisms to host their engineered systems. Model organisms such as Escherichia coli serve as attractive hosts due to their tractability but do not necessarily provide the ideal environment to optimize performance.
As more novel microbes are domesticated for use as biotechnology platforms, synthetic biologists are urged to explore the chassis-design space to optimize their systems and deliver on the promises of synthetic biology. The consequences of the chassis-effect will therefore only become more relevant as the field of biodesign grows. In our work, we demonstrate that the performance of a genetic device is highly dependent on the host environment it operates within, promoting the notion that the chassis can be considered a design variable to tune circuit function. Importantly, our results unveil that the chassis-effect can be traced along similarities in genome architecture, specifically the shared core genome. Our study advocates for the exploration of the chassis-design space and is a step forward to empowering synthetic biologists with knowledge for more efficient exploration of the chassis-design space to enable the next generation of broad-host-range synthetic biology.

RevDate: 2024-08-21

Wang L, Cheng X, Guo Y, et al (2024)

Novel isolates of hydrogen-oxidizing chemolithoautotrophic Sulfurospirillum provide insight to the functions and adaptation mechanisms of Campylobacteria in shallow-water hydrothermal vents.

mSystems [Epub ahead of print].

Enhancing the availability of representative isolates from hydrothermal vents (HTVs) is imperative for comprehending the microbial processes that propel the vent ecosystem. In recent years, Campylobacteria have emerged as the predominant and ubiquitous taxon across both shallow and deep-sea vent systems. Nevertheless, only a few isolates have been cultured, primarily originating from deep-sea HTVs. Presently, no cultivable isolates of Campylobacteria are accessible in shallow water vent systems (<200 m), which exhibit markedly distinct environmental conditions from their deep-sea counterparts. In this study, we enriched a novel isolate (genus Sulfurospirillum, Campylobacteria) from shallow-water HTVs of Kueishan Island. Genomic and physiological analysis revealed that this novel Campylobacteria species grows on a variety of substrate and carbon/energy sources. The pan-genome and phenotypic comparisons with 12 previously isolated Sulfurospirillum species from different environments supported the identification of functional features in Sulfurospirillum genomes crucial for adaptation to vent environments, such as sulfur oxidation, carbon fixation, biofilm formation, and benzoate/toluene degradation, as well as diverse genes related with signal transportation. To conclude, the metabolic characteristics of this novel Campylobacteria augment our understanding of Campylobacteria spanning from deep-sea to shallow-water vent systems.

IMPORTANCE: Campylobacteria emerge as the dominant and ubiquitous taxa within vent systems, playing important roles in the vent ecosystems. However, isolated representatives of Campylobacteria have been mainly from the deep-sea hydrothermal fields, leaving a significant knowledge gap regarding the functions, activities, and adaptation strategies of the vent microorganisms in shallow-water hydrothermal vents (HTVs).
This study bridges this gap by providing insights into the phenomics and genomic diversity of genus Sulfurospirillum (order Campylobacterales, class Campylobacteria) based on data derived from a novel isolate obtained from shallow-water HTVs. Our mesophilic isolate of Sulfurospirillum not only augments the genus diversity of Campylobacteria pure cultures derived from vent systems but also serves as the inaugural reference isolate for Campylobacteria in shallow-water environments.

RevDate: 2024-08-21

Trost K, Knopp MR, Wimmer JLE, et al (2024)

A universal and constant rate of gene content change traces pangenome flux to LUCA.

FEMS microbiology letters pii:7737773 [Epub ahead of print].

Prokaryotic genomes constantly undergo gene flux via lateral gene transfer, generating a pangenome structure consisting of a conserved core genome surrounded by a more variable accessory genome shell. Over time, flux generates change in genome content. Here we measure and compare the rate of genome flux for 5 655 prokaryotic genomes as a function of amino acid sequence divergence in 36 universally distributed proteins of the informational core (IC). We find a clock of gene content change. The long-term average rate of gene content flux is remarkably constant across all higher prokaryotic taxa sampled, whereby the size of the accessory genome (the proportion of the genome harboring gene content difference for genome pairs) varies across taxa. The proportion of species-level accessory genes per genome varies from 0% (Chlamydia) to 30-33% (Alphaproteobacteria, Gammaproteobacteria, Clostridia). A clock-like rate of gene content change across all prokaryotic taxa sampled suggests that pangenome structure is a general feature of prokaryotic genomes and that it has been in existence since the divergence of bacteria and archaea.

RevDate: 2024-08-20

Wang Z, Hülpüsch C, Foesel B, et al (2024)

Genomic and functional divergence of Staphylococcus aureus strains from atopic dermatitis patients and healthy individuals: insights from global and local scales.

Microbiology spectrum [Epub ahead of print].

Atopic dermatitis (AD) is the most common chronic inflammatory skin disease worldwide and is characterized by a complex interplay with skin microbiota, with Staphylococcus aureus often abnormally more abundant in AD patients than in healthy individuals (HE). S. aureus harbors diverse strains with varied genetic compositions and functionalities, which exhibit differential connections with the severity of AD. However, the differences in S. aureus strains between AD and HE remain unclear, with most variations seen at a specific geographic level, implying spontaneous adaptations rather than systematic distinctions. This study presents genomic and functional differences between these S. aureus strains from AD and HE on both global and local levels. We observed reduced gene content diversity but increased functional variation in the global AD-associated strains. Two additional AD-dominant clusters emerged, with Cluster 1 enriched in transposases and Cluster 2 showcasing genes linked to adaptability and antibiotic resistance. Particularly, robust evidence illustrates that the lantibiotic operon of S. aureus, involved in the biosynthesis of lantibiotics, was acquired via horizontal gene transfer from environmental bacteria. Comparisons of the gene abundance profiles in functional categories also indicate limited zoonotic potential between human and animal isolates. Local analysis mirrored global gene diversity but showed distinct functional variations between AD and HE strains. Overall, this research provides foundational insights into the genomic evolution, adaptability, and antibiotic resistance of S. aureus, with significant implications for clinical microbiology.

IMPORTANCE: Our study uncovers significant genomic variations in Staphylococcus aureus strains associated with atopic dermatitis. We observed adaptive evolution tailored to the disease microenvironment, characterized by a smaller pan-genome than strains from healthy skin both on the global and local levels.
Key functional categories driving strain diversification include "replication and repair" and "transporters," with transposases being pivotal. Interestingly, the local strains predominantly featured metal-related genes, whereas global ones emphasized antimicrobial resistances, signifying scale-dependent diversification nuances. We also pinpointed horizontal gene transfer events, indicating interactions between human-associated and environmental bacteria. These insights expand our comprehension of S. aureus's genetic adaptation in atopic dermatitis, yielding valuable implications for clinical approaches.

RevDate: 2024-08-19

Song Y, Long C, Wang Y, et al (2024)

Advancements in multi-omics for nutraceutical enhancement and traits improvement in buckwheat.

Critical reviews in biotechnology [Epub ahead of print].

Buckwheat (Fagopyrum spp.) is a typical pseudocereal, valued for its extensive nutraceutical potential as well as its centuries-old cultivation. Tartary buckwheat and common buckwheat have been used globally and have become well-known nutritious foods due to their high quantities of proteins, flavonoids, and minerals. Moreover, increasing demand makes it critical to improve buckwheat's nutraceutical traits and yield. In this review, bioactive compounds accumulated in buckwheat were comprehensively evaluated according to their chemical structure, properties, and physiological function. Biosynthetic pathways of flavonoids, phenolic acids, and fagopyrin were methodically summarized, with the regulation of flavonoid biosynthesis. Although classic synthesis pathways are presented in previous research, the metabolic flow by which these compounds are synthesized in buckwheat remains to be uncovered. The functional genes involved in the biosynthesis of flavonols, stress response, and plant development were identified based on multi-omics research. Furthermore, the review delves into the applications of multi-omics in improving buckwheat's agronomic traits, including yield, nutritional content, stress resilience, and the biosynthesis of bioactive compounds. Pangenomics combined with other omics to mine elite genes, the regulatory networks and mechanisms of specific agronomic traits and of the biosynthesis of bioactive components, and the development of a more efficient genetic transformation system for genetic engineering all require further investigation for the execution of breeding designs aimed at enhancing desirable traits in buckwheat. This critical review will provide a comprehensive understanding of multi-omics for nutraceutical enhancement and traits improvement in buckwheat.

RevDate: 2024-08-19

Klingström T, Zonabend König E, AA Zwane (2024)

Beyond the hype: using AI, big data, wearable devices, and the internet of things for high-throughput livestock phenotyping.

Briefings in functional genomics pii:7735403 [Epub ahead of print].

Phenotyping of animals is a routine task in agriculture which can provide large datasets for the functional annotation of genomes. Using the livestock farming sector to study complex traits enables genetics researchers to fully benefit from the digital transformation of society as economies of scale substantially reduce the cost of phenotyping animals on farms. In the agricultural sector genomics has transitioned towards a model of 'Genomics without the genes' as a large proportion of the genetic variation in animals can be modelled using the infinitesimal model for genomic breeding valuations. Combined with third generation sequencing creating pan-genomes for livestock, the digital infrastructure for trait collection and precision farming provides a unique opportunity for high-throughput phenotyping and the study of complex traits in a controlled environment. The emphasis on cost-efficient data collection means that mobile phones and computers have become ubiquitous for large-scale data collection but that the majority of the recorded traits can still be recorded manually with limited training or tools. This is especially valuable in low- and middle-income countries and in settings where indigenous breeds are kept at farms preserving more traditional farming methods. Digitalization is therefore an important enabler for high-throughput phenotyping for smaller livestock herds with limited technology investments as well as large-scale commercial operations. It is demanding and challenging for individual researchers to keep up with the opportunities created by the rapid advances in digitalization for livestock farming and how it can be used by researchers with or without a specialization in livestock. This review provides an overview of the current status of key enabling technologies for precision livestock farming applicable for the functional annotation of genomes.

RevDate: 2024-08-19

Rana R, Nayak PK, Madhavan VN, et al (2024)

Comparative genomics-based insights into Xanthomonas indica, a non-pathogenic species of healthy rice microbiome with bioprotection function.

Applied and environmental microbiology [Epub ahead of print].

Xanthomonas species are major pathogens of plants and have been studied extensively. There is increasing recognition of the importance of non-pathogenic species within the same genus. With this came the need to understand the genomic and functional diversity of non-pathogenic Xanthomonas (NPX) at the species and strain level. This study reports isolation and investigation into the genomic diversity and variation in NPX isolates, chiefly Xanthomonas indica, a newly discovered NPX species from rice. The study establishes the relationship of X. indica strains within clade I of Xanthomonads with another NPX species, X. sontii, also associated with rice seeds. Identification of highly diverse strains, open-pan genome, and systematic hyper-variation at the lipopolysaccharide biosynthetic locus when compared to pathogenic Xanthomonas indicates the acquisition of new functions for adaptation. Furthermore, comparative genomics studies established the absence of major virulence genes such as type III secretion system and effectors, which are present in the pathogens, and the presence of a known bacterial-killing type IV secretion system (X-T4SS). The diverse non-pathogenic strains of X. indica and X. sontii were found to protect rice from bacterial leaf blight pathogen, X. oryzae pv. oryzae (Xoo). The absence of phenotype of an X-T4SS mutant suggests redundancy in the genetic basis of the mechanisms involved in the bioprotection function, which may include multiple genetic loci, such as putative bacteriocin-encoding gene clusters and involvement of other factors such as nutrient and niche competition apart from induction of innate immunity through shared microbial-associated molecular patterns. The rice-NPX community and its pathogenic counterpart can be a promising model for understanding plant-microbe-microbiome interaction studies.

IMPORTANCE: The Xanthomonas group of bacteria is known for its characteristic lifestyle as a phytopathogen. However, the discovery of non-pathogenic Xanthomonas (NPX) species is a major shift in understanding this group of bacteria. Multi-strain, in-depth genomic, evolutionary and functional studies on each of these NPX species are still lacking. This study on diverse non-pathogenic strains provides novel insights into genome diversity, dynamics, and evolutionary trends of NPX species from rice microbiome apart from its relationship with other relatives that form a sub-clade. Interestingly, we also uncovered that NPX species protect rice from pathogenic Xanthomonas species. The plant protection property shows their importance as a part of a healthy plant microbiome. Furthermore, finding an open pan-genome and large-scale variation at lipopolysaccharide biosynthetic locus indicates a significant role of the NPX community in host adaptation. The findings and high-quality genomic resources of NPX species and the strains will allow further systematic molecular and host-associated microbial community studies for plant health.

RevDate: 2024-08-19

Mederos MA, Court CM, Dipardo BJ, et al (2024)

Oncogenic pathway signatures predict the risk of progression and recurrence in well-differentiated pancreatic neuroendocrine tumors.

Journal of surgical oncology [Epub ahead of print].

BACKGROUND: Pancreatic neuroendocrine tumors (pNETs) are genomically diverse tumors. The management of newly diagnosed well-differentiated pNETs is limited by a lack of sensitivity of existing biomarkers for prognostication. Our goal was to investigate the potential utility of genetic markers as a predictor of progression-free survival (PFS) and recurrence-free survival (RFS).

METHODS: Whole-exome sequencing of resected well-differentiated, low and intermediate-grade (G1 and G2) pNETs and normal adjacent tissue from patients who underwent resection from 2005 to 2015 was performed. Genetic alterations were classified using pan-genomic and oncogenic pathway classifications. Additional samples with genetic and clinicopathologic data available were obtained from the publicly available International Cancer Genome Consortium (ICGC) database and included in the analysis. The prognostic relevance of these genomic signatures on PFS and RFS was analyzed.

RESULTS: Thirty-one patients who underwent resection for pNET were identified. Genomic analysis of mutational, copy number, cytogenetic, and complex phenomena revealed similar patterns to prior studies of pNETs with relatively few somatic gene mutations but numerous instances of copy number changes. Analysis of genomic and clinicopathologic outcomes using the combined data from our study as well as the ICGC pNET cohort (n = 124 patients) revealed that the recurrent pattern of whole chromosome loss (RPCL) and metastatic disease were independently associated with disease progression. When evaluating patients with local disease at the time of resection, RPCL and alterations in the TGFβ oncogenic pathway were independently associated with the risk of recurrence.

CONCLUSIONS: Well-differentiated pNETs are genomically diverse tumors. Pathway signatures may be prognostic for predicting disease progression and recurrence.

RevDate: 2024-08-16

Woodhouse MR, Cannon EK, Portwood JL, et al (2024)

Tools and Resources at the Maize Genetics and Genomics Database (MaizeGDB).

Cold Spring Harbor protocols pii:pdb.over108430 [Epub ahead of print].

The Maize Genetics and Genomics Database (MaizeGDB) is the community resource for maize researchers, offering a suite of tools, informatics resources, and curated data sets to support maize genetics, genomics, and breeding research. Here, we provide an overview of the key resources available at MaizeGDB, including maize genomes, comparative genomics, and pan-genomics tools. This review aims to familiarize users with the range of options available for maize research and highlights the importance of MaizeGDB as a central hub for the maize research community. By providing a detailed snapshot of the database's capabilities, we hope to enable researchers to make use of MaizeGDB's resources, ultimately assisting them to better study the evolution and diversity of maize.

RevDate: 2024-08-16

Ma W, MJ Chaisson (2024)

High-resolution global diversity copy number variation maps and association with ctyper.

bioRxiv : the preprint server for biology pii:2024.08.11.607269.

Genetic analysis of copy number variations (CNVs), especially in complex regions, is challenging due to reference bias and ambiguous alignment of Next-Generation Sequencing (NGS) reads to repetitive DNA. Consequently, aggregate copy numbers are typically analyzed, overlooking variation between gene copies. Pangenomes contain diverse sequences of gene copies and enable the study of sequence-resolved CNVs. We developed a method, ctyper, to discover sequence-resolved CNVs in NGS data by leveraging CNV genes from pangenomes. From 118 public assemblies, we constructed a database of 3,351 CNV genes, distinguishing each gene copy as a resolved allele. We used phylogenetic trees to organize alleles into highly similar allele-types that revealed events of linked small variants due to stratification, structural variation, conversion, and duplication. Saturation analysis showed that new samples share an average of 97.8% CNV alleles with the database. The ctyper method traces individual gene copies in NGS data to their nearest alleles in the database and identifies allele-specific copy numbers using multivariate linear regression on k-mer counts and phylogenetic clustering. Applying ctyper to 1000 Genomes Project (1kgp) samples showed Hardy-Weinberg Equilibrium on 99.3% of alleles and a 97.6% F1 score on genotypes based on 641 1kgp trios. Leave-one-out analysis on 39 assemblies matched to 1kgp samples showed that 96.5% of variants in query sequences match the genotyped allele. Genotyping 1kgp data revealed 226 population-specific CNVs, including a conversion on SMN2 to SMN1, potentially impacting Spinal Muscular Atrophy diagnosis in Africans. Our results revealed two models of CNV: recent CNVs due to ongoing duplications and polymorphic CNVs from ancient paralogs missing from the reference. To measure the functional impact of CNVs, after merging allele-types, we conducted genome-wide Quantitative Trait Locus analysis on 451 1kgp samples with Geuvadis rRNA-seqs. Using a linear mixed model, our genotyping enables the inference of relative expression levels of paralogs within a gene family. In a global evolutionary context, 150 out of 1,890 paralogs (7.94%) and 546 out of 16,628 orthologs (3.28%) had significantly different expression levels, suggesting divergent expression from original genes. Specific examples include lower expression on the converted SMN and increased expression on translocated AMY2B (GTEx pancreas data). Our method enables large cohort studies on complex CNVs to uncover hidden health impacts and overcome reference bias.

RevDate: 2024-08-15
CmpDate: 2024-08-15

Jara-Servin A, Mejia G, Romero MF, et al (2024)

Unravelling the genomic and environmental diversity of the ubiquitous Solirubrobacter.

Environmental microbiology, 26(8):e16685.

Solirubrobacter, though widespread in soils and rhizospheres, has been relatively unexplored despite its ubiquity. Previously acknowledged as a common soil bacterium, our research explores its phylogenomics, pangenomics, environmental diversity, and interactions within bacterial communities. By analysing seven genomic sequences, we have identified a pangenome consisting of 19,645 protein families, of which 2644 are shared across all studied genomes, forming the core genome. Interestingly, despite the non-motility of reported isolates, we discovered genes for flagellin and a partial flagellum assembly pathway. Examining the 16S ribosomal RNA genes of Solirubrobacter revealed substantial diversity, with 3166 operational taxonomic units identified in Mexican soils. Co-occurrence network analysis further demonstrated its significant integration within bacterial communities. Through phylogenomic scrutiny, we conclusively excluded the NCBI's GCA_009993245.1 genome from being classified as a Solirubrobacter. Our research into the metagenomic diversity of Solirubrobacter across various environments confirmed its presence in rhizospheres and certain soils, underscoring its adaptability. The geographical ubiquity of Solirubrobacter in rhizospheres raises intriguing questions regarding its potential interactions with plant hosts and the biotic and abiotic factors influencing its presence in soil. Given its ecological significance and genetic diversity, Solirubrobacter warrants further investigation as a potentially crucial yet underappreciated keystone species.
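
The pangenome and core-genome counts reported above can be illustrated with a minimal sketch. It assumes proteins have already been clustered into families, so each genome is just a set of family identifiers. The genome names and family IDs below are hypothetical toy data, not values from the study.

```python
from functools import reduce


def pan_and_core(families_by_genome):
    """Return (pangenome, core genome) as sets of protein-family IDs.

    The pangenome is the union of all families seen in any genome;
    the core genome is the intersection, i.e. families present in every genome.
    """
    family_sets = [set(families) for families in families_by_genome.values()]
    if not family_sets:
        return set(), set()
    pangenome = set().union(*family_sets)
    core = reduce(set.intersection, family_sets)
    return pangenome, core


# Hypothetical toy input: three genomes with a handful of protein families each.
toy_genomes = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam2", "fam4"},
    "genome_C": {"fam1", "fam2", "fam5", "fam6"},
}

pan, core = pan_and_core(toy_genomes)
accessory = pan - core  # families found in some, but not all, genomes

print(len(pan), len(core), len(accessory))
```

In this toy case the pangenome holds six families, two of which are core. The same union and intersection logic would apply to the seven genomes analysed in the study, given their family assignments.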

RevDate: 2024-08-15

Gtari M, Maaoui R, Ghodhbane-Gtari F, et al (2024)

MAGs-centric crack: how long will, spore-positive Frankia and most Protofrankia, microsymbionts remain recalcitrant to axenic growth?

Frontiers in microbiology, 15:1367490.

Nearly 50 years after the ground-breaking isolation of the primary Comptonia peregrina microsymbiont under axenic conditions, efforts to isolate a substantial number of Protofrankia and Frankia strains continue with enduring challenges and complexities. This study aimed to streamline genomic insights through comparative and predictive tools to extract traits crucial for isolating specific Frankia in axenic conditions. Pangenome analysis unveiled significant genetic diversity, suggesting untapped potential for cultivation strategies. Shared metabolic strategies in cellular components, central metabolic pathways, and resource acquisition traits offered promising avenues for cultivation. Ecological trait extraction indicated that most uncultured strains exhibit no apparent barriers to axenic growth. Despite ongoing challenges, potential caveats, and errors that could bias predictive analyses, this study provides a nuanced perspective. It highlights potential breakthroughs and guides refined cultivation strategies for these yet-uncultured strains. We advocate for tailored media formulations enriched with simple carbon sources in aerobic environments, with atmospheric nitrogen optionally sufficient to minimize contamination risks. Temperature adjustments should align with strain preferences (28-29°C for Frankia and 32-35°C for Protofrankia) while maintaining an alkaline pH. Given potential extended incubation periods (predicted doubling times ranging from 3.26 to 9.60 days, possibly up to 21.98 days), patience and rigorous contamination monitoring are crucial for optimizing cultivation conditions.

RevDate: 2024-08-14

Fortin SG, Sun X, Jayakumar A, et al (2024)

Nitrite-oxidizing bacteria adapted to low oxygen conditions dominate nitrite oxidation in marine oxygen minimum zones.

The ISME journal pii:7733695 [Epub ahead of print].

Nitrite is a central molecule in the nitrogen cycle because nitrite oxidation to nitrate (an aerobic process) retains fixed nitrogen in a system and its reduction to dinitrogen gas (anaerobic) reduces the fixed nitrogen inventory. Despite its acknowledged requirement for oxygen, nitrite oxidation is observed in oxygen-depleted layers of the ocean's oxygen minimum zones (OMZs), challenging the current understanding of OMZ nitrogen cycling. Previous attempts to determine whether nitrite-oxidizing bacteria in the anoxic layer differ from known nitrite oxidizers in the open ocean were limited by cultivation difficulties and sequencing depth. Here, we construct 31 draft genomes of nitrite-oxidizing bacteria from global OMZs. The distribution of nitrite oxidation rates, abundance and expression of nitrite oxidoreductase genes, and relative abundance of nitrite-oxidizing bacterial draft genomes from the same samples all show peaks in the core of the oxygen-depleted zone (ODZ) and are all highly correlated in depth profiles within the major ocean oxygen minimum zones. The ODZ nitrite oxidizers are not found in the Tara Oceans global dataset (the most complete oxic ocean dataset), and the major nitrite oxidizers found in the oxygenated ocean do not occur in ODZ waters. A pangenomic analysis shows the ODZ nitrite oxidizers have distinct gene clusters compared to oxic nitrite oxidizers and are microaerophilic. These findings all indicate the existence of nitrite oxidizers whose niche is oxygen-deficient seawater. Thus, specialist nitrite-oxidizing bacteria are responsible for fixed nitrogen retention in marine oxygen minimum zones, with implications for control of the ocean's fixed nitrogen inventory.

RevDate: 2024-08-14
CmpDate: 2024-08-14

Jouffe C, Dyar KA, NH Uhlenhaut (2024)

Chromatin Immunoprecipitation in Adipose Tissue and Adipocytes: How to Proceed and Optimize the Protocol for Transcription Factor DNA Binding.

Methods in molecular biology (Clifton, N.J.), 2846:35-45.

Chromatin immunoprecipitation (ChIP) coupled to qPCR or sequencing is a crucial experiment to determine direct transcriptional regulation under the control of specific transcriptional factors or co-regulators at loci-specific or pan-genomic levels. Here we provide a reliable method for processing ChIP from adipocytes or frozen adipose tissue collection, isolation of nuclei, cross-linking of protein-DNA complexes, chromatin shearing, immunoprecipitation, and DNA purification. We also discuss critical steps for optimizing the experiment to perform a successful ChIP in lipid-rich cells/tissues.

RevDate: 2024-08-14
CmpDate: 2024-08-14

Lemieux JE (2024)

Analysis of the Borreliaceae Pangenome Reveals a Distinct Genomic Architecture Conserved Across Phylogenetic Scales.

The Journal of infectious diseases, 230(Supplement_1):S51-S61.

The family Borreliaceae contains arthropod-borne spirochetes that cause two widespread human diseases, Lyme disease and relapsing fever. Lyme disease is a subacute, progressive illness with variable stage and tissue manifestations. Relapsing fever is an acute febrile illness with prominent bacteremia that may recur and disseminate, particularly to the nervous system. Clinical heterogeneity is a hallmark of both diseases. While human clinical manifestations are influenced by a wide variety of factors, including immune status and host genetic susceptibility, there is evidence that Borreliaceae microbial factors influence the clinical manifestations of human disease caused by this family of spirochetes. Despite these associations, the spirochete genes that influence the severity and manifestations of human disease are, for the most part, unknown. Recent work has identified lineage-specific expansions of lipoproteome-rich accessory genome elements in virulent clones of Borrelia burgdorferi. Using publicly available genome assemblies, it is shown that all Borreliaceae lineages for which sufficient sequence data are available harbor a similar pattern of strongly structured, lineage-specific expansions in their accessory genomes, particularly among lipoproteins, and that this pattern holds across phylogenetic scales including genera, species, and genotypes. The relationships among pangenome elements suggest that infrequent episodes of marked genomic change followed by clonal expansion in geographically and enzootically structured populations may account for the unique lineage structure of Borreliaceae. This analysis informs future genotype-phenotype studies among Borreliaceae and lays a foundation for studies of individual gene function guided by phylogenetic patterns of conservation, diversification, gain, and/or loss.

RevDate: 2024-08-14

Perrin C, Coutts M, B Dadone-Montaudié (2024)

Subungual melanoma: molecular analysis of 31 cases from early stage to invasive melanoma.

Histopathology [Epub ahead of print].

AIMS: The distinction between the benign subungual melanocytic lesions and an early lesion of subungual melanoma (SUM) remains a diagnostic challenge. We evaluated the routine diagnostic utility of array Comparative Genomic Hybridization (aCGH) to detect whole-genome copy number variations (CNV) as well as targeted next-generation sequencing (NGS) in SUM.

METHODS AND RESULTS: This retrospective study included 20 cases of in situ SUM and 11 cases of invasive SUM. Analysis by aCGH detected common oncogene amplifications in all but one case of invasive SUM (n = 10) and in all cases of in situ SUM with a melanocyte count (MC) >45/mm (n = 4 true positive) and the average number of CNV was 8.5. Thirteen remaining cases of in situ SUM gave false negative results (n = 13), owing to a lack of sufficient melanocytes to analyse (median MC of 35.35; range: 10.16-39.5). Molecular analysis failed in four cases (three in situ SUM and one invasive SUM) due to insufficient amounts of DNA. Across the whole cohort, the sensitivity of aCGH was 52%, but when adjusting the cutoff to MC >45/mm, the sensitivity was 93%. Targeted NGS was less informative than aCGH analyses in our series of SUM.

CONCLUSION: To distinguish malignant from benign lesions, especially in situ SUM versus atypical lentiginous melanocytic proliferations, aCGH analysis should be performed when the MC is above 45 melanocytes per linear millimetre. This pangenomic method can detect oncogene amplifications, as well as a number of CNV >3, which strongly support the diagnosis of malignancy.
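
The whole-cohort sensitivity quoted above can be checked with a short worked calculation. This sketch assumes that the counts given in the abstract (10 positive invasive cases, 4 true-positive in situ cases, 13 false-negative in situ cases) make up the full set of evaluable cases, i.e. the 31 cases minus the 4 in which molecular analysis failed. The counts behind the 93% figure are not itemized in the abstract, so that value is not recomputed here.

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity = TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


# Counts read from the abstract (interpretation is an assumption, see lead-in).
tp_invasive = 10   # invasive SUM with oncogene amplifications detected
tp_in_situ = 4     # in situ SUM with melanocyte count > 45/mm, detected
fn_in_situ = 13    # in situ SUM with too few melanocytes, missed

overall = sensitivity(tp_invasive + tp_in_situ, fn_in_situ)
print(f"{overall:.0%}")  # 14 / 27, which rounds to 52%
```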

RevDate: 2024-08-13

Feng Y, Yang Y, Hu Y, et al (2024)

Population genomics uncovers global distribution, antimicrobial resistance, and virulence genes of the opportunistic pathogen Klebsiella aerogenes.

Cell reports, 43(8):114602 pii:S2211-1247(24)00941-0 [Epub ahead of print].

Klebsiella aerogenes is an understudied and clinically important pathogen. We therefore investigate its population structure by genome analysis aligned with metadata. We sequence 130 non-duplicated K. aerogenes clinical isolates and identify two inter-patient transmission events. We then retrieve all publicly available K. aerogenes genomes (n = 1,026, accessed by January 1, 2023) and analyze them with our 130 genomes. We develop a core-genome multi-locus sequence-typing scheme. We find that K. aerogenes is a species complex comprising four phylogroups undergoing evolutionary divergence, likely forming three species. We delineate remarkable clonal diversity and identify three worldwide-distributed carbapenemase-encoding clonal clusters, representing high-risk lineages. We uncover that K. aerogenes has an open genome equipped by a large arsenal of antimicrobial resistance genes. We identify two genetic regions specific for K. aerogenes, encoding a type VI secretion system and flagella/chemotaxis for motility, respectively, both contributing to the virulence. These results provide much-needed insights into the population structure and pan-genomes of K. aerogenes.

RevDate: 2024-08-12

Kuronen J, Horsfield ST, Pöntinen AK, et al (2024)

Pangenome-spanning epistasis and coselection analysis via de Bruijn graphs.

Genome research pii:gr.278485.123 [Epub ahead of print].

Studies of bacterial adaptation and evolution are hampered by the difficulty of measuring traits such as virulence, drug resistance, and transmissibility in large populations. In contrast, it is now feasible to obtain high-quality complete assemblies of many bacterial genomes thanks to scalable high-accuracy long-read sequencing technologies. To exploit this opportunity, we introduce a phenotype- and alignment-free method for discovering coselected and epistatically interacting genomic variation from genome assemblies covering both core and accessory parts of genomes. Our approach uses a compact colored de Bruijn graph to approximate the intragenome distances between pairs of loci for a collection of bacterial genomes to account for the impacts of linkage disequilibrium (LD). We demonstrate the versatility of our approach to efficiently identify associations between loci linked with drug resistance and adaptation to the hospital niche in the major human bacterial pathogens Streptococcus pneumoniae and Enterococcus faecalis.
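
To make the idea of a colored de Bruijn graph concrete, here is a minimal sketch, not the authors' implementation. Each k-mer is treated as an edge and is labeled with the set of genomes (its "colors") in which it occurs. The genome names and sequences are hypothetical toy data. Real tools also compact unbranched paths and merge reverse-complement k-mers, both of which are omitted here.

```python
from collections import defaultdict


def colored_kmers(genomes, k):
    """Map every k-mer to the set of genome names (colors) that contain it."""
    colors = defaultdict(set)
    for name, sequence in genomes.items():
        for start in range(len(sequence) - k + 1):
            colors[sequence[start:start + k]].add(name)
    return colors


def successors(colors, kmer):
    """K-mers in the graph that overlap `kmer` by k-1 bases (its outgoing neighbors)."""
    suffix = kmer[1:]
    return sorted(suffix + base for base in "ACGT" if suffix + base in colors)


# Hypothetical toy genomes that differ at a single position.
toy_genomes = {"g1": "ACGTAC", "g2": "ACGTTC"}
colors = colored_kmers(toy_genomes, k=3)

# K-mers carrying every color approximate shared (core) sequence;
# k-mers with fewer colors mark variable (accessory) sequence.
shared = sorted(kmer for kmer, names in colors.items() if len(names) == len(toy_genomes))
print(shared)
print(successors(colors, "CGT"))
```

The branch after `CGT` (towards `GTA` in one genome and `GTT` in the other) is the kind of local graph structure from which distances between variable loci can be approximated.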

RevDate: 2024-08-12
CmpDate: 2024-08-12

Yin J, He M, Liu XX, et al (2024)

Peteryoungia algae sp. nov. isolated from seaweeds of Gouqi Island, China, and its unique genetic features among Peteryoungia strains.

Antonie van Leeuwenhoek, 117(1):112.

A Gram-stain-negative, light khaki, strictly aerobic, rod-shaped, motile via multiple flagella, and catalase- and oxidase-positive bacterium, designated as SSM4.3ᵀ, was isolated from the seaweed of Gouqi Island in the East China Sea. The novel isolate grows at 0-5.0% NaCl concentrations (w/v) (optimum 1%), pH 5.0-9.0 (optimum pH 7.0), and 15-37 °C (optimum 30 °C). The 16S rRNA gene sequences-based phylogeny indicates that the novel marine isolate belongs to the family Rhizobiaceae and that it shared the greatest sequence similarity (98.9%) with Peteryoungia rhizophila CGMCC 1.15691ᵀ. This classification was also supported by phylogenetic analysis using core genes. The predominant fatty acids (≥ 10%) of the strain were identified as C18:1 ω7c/C18:1 ω6c. Q-10 was identified as the major isoprenoid quinone, with trace levels of Q-9 present. The major polar lipids were identified as diphosphatidylglycerol, phosphatidylethanolamine and phosphatidylglycerol. The complete genome size of strain SSM4.3ᵀ is 4.39 Mb with a DNA G+C content of 61.3%. The average nucleotide identity, digital DNA-DNA hybridization, and average amino acid identity values between the genomes of strain SSM4.3ᵀ and its closely related representatives were 74.80-86.93%, 20.00-32.30%, and 70.30-91.52%, respectively. Phylogenetic analysis, grounded on the core genes, reveals the evolutionary relationship between SSM4.3ᵀ and other Peteryoungia strains. Pan-genomics analysis of 8 previously classified Peteryoungia species and SSM4.3ᵀ revealed their unique genetic features and functions. Overall, strain SSM4.3ᵀ was considered to be a new species of the Peteryoungia genus; the name Peteryoungia algae sp. nov. has been proposed, with type strain SSM4.3ᵀ (= LMG 32561 = MCCC 1K07170).

RevDate: 2024-08-12
CmpDate: 2024-08-12

Vale FF, Roberts RJ, Kobayashi I, et al (2024)

Gene content, phage cycle regulation model and prophage inactivation disclosed by prophage genomics in the Helicobacter pylori Genome Project.

Gut microbes, 16(1):2379440.

Prophages can have major clinical implications through their ability to change pathogenic bacterial traits. There is limited understanding of the prophage role in ecological, evolutionary, adaptive processes and pathogenicity of Helicobacter pylori, a widespread bacterium causally associated with gastric cancer. Inferring the exact prophage genomic location and completeness requires complete genomes. The international Helicobacter pylori Genome Project (HpGP) dataset comprises 1011 H. pylori complete clinical genomes enriched with epigenetic data. We thoroughly evaluated the H. pylori prophage genomic content in the HpGP dataset. We investigated population evolutionary dynamics through phylogenetic and pangenome analyses. Additionally, we identified genome rearrangements and assessed the impact of prophage presence on bacterial gene disruption and methylome. We found that 29.5% (298) of the HpGP genomes contain prophages, of which only 32.2% (96) were complete, minimizing the burden of prophage carriage. The prevalence of H. pylori prophage sequences was variable by geography and ancestry, but not by disease status of the human host. Prophage insertion occasionally results in gene disruption that can change the global bacterial epigenome. Gene function prediction allowed the development of the first model for lysogenic-lytic cycle regulation in H. pylori. We have disclosed new prophage inactivation mechanisms that appear to occur by genome rearrangement, merger with other mobile elements, and pseudogene accumulation. Our analysis provides a comprehensive framework for H. pylori prophage biology and genomics, offering insights into lysogeny regulation and bacterial adaptation to prophages.

RevDate: 2024-08-12

Hasnat S, Hoque MN, Mahbub MM, et al (2024)

Pantothenate kinase: A promising therapeutic target against pathogenic Clostridium species.

Heliyon, 10(14):e34544.

Current treatment of clostridial infections includes broad-spectrum antibiotics and antitoxins, yet antitoxins are ineffective against all Clostridium species. Moreover, rising antimicrobial resistance (AMR) threatens treatment effectiveness and public health. This study therefore aimed to discover a common drug target for four pathogenic clostridial species, Clostridium botulinum, C. difficile, C. tetani, and C. perfringens through an in-silico core genomic approach. Using four reference genomes of C. botulinum, C. difficile, C. tetani, and C. perfringens, we identified 1484 core genomic proteins (371/genome) and screened them for potential drug targets. Through a subtractive approach, four core proteins were finally identified as drug targets, represented by type III pantothenate kinase (CoaX), and selected for further analyses. Interestingly, the CoaX is involved in the phosphorylation of pantothenate (vitamin B5), which is a critical precursor for coenzyme A (CoA) biosynthesis. Investigation of druggability analysis on the identified drug target reinforces CoaX as a promising novel drug target for the selected Clostridium species. During the molecular screening of 1201 compounds, a known agonist drug compound (Vibegron) showed strong inhibitory activity against targeted clostridial CoaX. Additionally, we identified tazobactam, a beta-lactamase inhibitor, as effective against the newly proposed target, CoaX. Therefore, identifying CoaX as a single drug target effective against all four clostridial pathogens presents a valuable opportunity to develop a cost-effective treatment for multispecies clostridial infections.

RevDate: 2024-08-09
CmpDate: 2024-08-09

Krisna MA, Jolley KA, Monteith W, et al (2024)

Development and implementation of a core genome multilocus sequence typing scheme for Haemophilus influenzae.

Microbial genomics, 10(8):.

Haemophilus influenzae is part of the human nasopharyngeal microbiota and a pathogen causing invasive disease. The extensive genetic diversity observed in H. influenzae necessitates discriminatory analytical approaches to evaluate its population structure. This study developed a core genome multilocus sequence typing (cgMLST) scheme for H. influenzae using pangenome analysis tools and validated the cgMLST scheme using datasets consisting of complete reference genomes (N = 14) and high-quality draft H. influenzae genomes (N = 2297). The draft genome dataset was divided into a development dataset (N = 921) and a validation dataset (N = 1376). The development dataset was used to identify potential core genes, and the validation dataset was used to refine the final core gene list to ensure the reliability of the proposed cgMLST scheme. Functional classifications were made for all the resulting core genes. Phylogenetic analyses were performed using both allelic profiles and nucleotide sequence alignments of the core genome to test congruence, as assessed by Spearman's correlation and ordinary least squares linear regression tests. Preliminary analyses using the development dataset identified 1067 core genes, which were refined to 1037 with the validation dataset. More than 70% of core genes were predicted to encode proteins essential for metabolism or genetic information processing. Phylogenetic and statistical analyses indicated that the core genome allelic profile accurately represented phylogenetic relatedness among the isolates (R² = 0.945). We used this cgMLST scheme to define a high-resolution population structure for H. influenzae, which enhances the genomic analysis of this clinically relevant human pathogen.

RevDate: 2024-08-08

Khan MAS, Chaity SC, Hosen A, et al (2024)

Genomic epidemiology of multidrug-resistant clinical Acinetobacter baumannii in Bangladesh.

Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases pii:S1567-1348(24)00107-2 [Epub ahead of print].

The rising frequency of multidrug-resistant (MDR) Acinetobacter baumannii infections represents a significant public health challenge in Bangladesh. Genomic analysis of bacterial pathogens enhances surveillance and control efforts by providing insights into genetic diversity, antimicrobial resistance (AMR) profiles, and transmission dynamics. In this study, we conducted a comprehensive bioinformatic analysis of 82 whole-genome sequences (WGS) of A. baumannii from Bangladesh to understand their genomic epidemiological characteristics. WGS of the MDR and biofilm-forming A. baumannii strain S1C revealed the presence of 28 AMR genes, predicting its pathogenicity and classification within sequence type ST2. Multi-locus sequence typing (MLST) genotyping suggested heterogeneity in the distribution of clinical A. baumannii strains in Bangladesh, with a predominance of ST575. The resistome diversity was evident from the detection of 82 different AMR genes, with antibiotic inactivation being the most prevalent resistance mechanism. All strains were predicted to be multidrug-resistant. The observed virulence genes were associated with immune evasion, biofilm formation, adherence, nutrient acquisition, effector delivery, and other mechanisms. Mobile genetic elements carrying AMR genes were predicted in 68.29% (N = 56) of the genomes. The "open" state of the pan-genome and a high proportion of accessory genes highlighted the genome plasticity and diversity of A. baumannii in Bangladesh. Additionally, phylogenomic analysis indicated clustering of A. baumannii strains into three separate clades according to sequence type. In summary, our findings offer detailed insights into the genomic landscape of A. baumannii in Bangladesh, contributing to our understanding of its epidemiology and pathogenicity and informing strategies to combat this pathogen.

RevDate: 2024-08-08

Singhvi N, Talwar C, Nagar S, et al (2024)

Insights into the radiation and oxidative stress mechanisms in genus Deinococcus.

Computational biology and chemistry, 112:108161 pii:S1476-9271(24)00149-X [Epub ahead of print].

Deinococcus species, noted for their exceptional resistance to DNA-damaging environmental stresses, have piqued scientists' interest for decades. This study dives into the complex mechanisms underpinning radiation resistance in the Deinococcus genus. We have examined the genomes of 82 Deinococcus species and classified radiation-resistance proteins manually into five unique curated categories: DNA repair, oxidative stress defense, Ddr and Ppr proteins, regulatory proteins, and miscellaneous resistance components. This classification reveals important information about the various molecular mechanisms used by these extremophiles, which have been less explored so far. We also investigated the presence or absence of these proteins in the context of phylogenetic relationships, core, and pan-genomes, which shed light on the evolutionary dynamics of radiation resistance. This comprehensive study provides a deeper understanding of the genetic underpinnings of radiation resistance in the Deinococcus genus, with potential implications for understanding similar mechanisms in other organisms using an interactomics approach. Finally, this study reveals the complexities of radiation resistance mechanisms, providing a comprehensive understanding of the genetic components that allow Deinococcus species to flourish under harsh environments. The findings add to our understanding of the larger spectrum of stress adaptation techniques in bacteria and may have applications in sectors ranging from biotechnology to environmental research.

RevDate: 2024-08-08
CmpDate: 2024-08-08

Rojas-Vargas J, Rebollar EA, Sanchez-Flores A, et al (2024)

A comparative genomic study of a hydrocarbon-degrading marine bacterial consortium.

PloS one, 19(8):e0303363 pii:PONE-D-23-38809.

Ocean oil pollution has a large impact on the environment and the health of living organisms. Bioremediation cleaning strategies are promising eco-friendly alternatives for tackling this problem. Previously, we designed and reported a hydrocarbon (HC) degrading microbial consortium of four marine strains belonging to the species Alloalcanivorax xenomutans, Halopseudomonas aestusnigri, Paenarthrobacter sp., and Pseudomonas aeruginosa. However, the metabolic potential of this bacterial consortium for HC bioremediation is not yet well understood. Here, we analyzed the complete genomes of these marine bacterial strains accompanied by a phylogenetic reconstruction along with 138 bacterial strains. Synteny between complete genomes of the same species or genus revealed high conservation among strains of the same species, covering over 91% of their genomic sequences. Functional predictions highlighted a high abundance of genes related to HC degradation, which may result in functional redundancy within the consortium; however, unique and complete gene clusters linked to aromatic degradation were found in the four genomes, suggesting substrate specialization. Pangenome gain and loss analysis of genes involved in HC degradation provided insights into the evolutionary history of these capabilities, shedding light on the acquisition and loss of relevant genes related to alkane and aromatic degradation. Our work, including comparative genomic analyses, identification of secondary metabolites, and prediction of HC-degrading genes, enhances our understanding of the functional diversity and ecological roles of these marine bacteria in crude oil-contaminated marine environments and contributes to the applied knowledge of bioremediation.

RevDate: 2024-08-08

Saikat TA, Sayem Khan MA, Islam MS, et al (2024)

Characterization and genome mining of Bacillus subtilis BDSA1 isolated from river water in Bangladesh: A promising bacterium with diverse biotechnological applications.

Heliyon, 10(14):e34369.

The metabolic versatility of Bacillus subtilis makes it useful for a wide range of applications in biotechnology, from bioremediation to industrially important metabolite production. Understanding the molecular attributes of the biocontrol characteristics of B. subtilis is necessary for its tailored use in the environment and industry. Therefore, the present study aimed to conduct phenotypic characterization and whole genome analysis of B. subtilis BDSA1 isolated from polluted river water from Dhaka, Bangladesh to explore its biotechnological potential. The chromium reduction capacity at 100 ppm Cr (VI) showed that B. subtilis BDSA1 reduced 40 % of Cr (VI) within 24 h at 37 °C. Exposure of this bacterium to 200 ppm cadmium resulted in 43 % adsorption following one week of incubation at 37 °C. Molecular detection of the chrA and czcC genes confirmed the chromium and cadmium resistance characteristics of BDSA1. The size of the genome of B. subtilis BDSA1 was 4.2 Mb with 43.4 % GC content. Genome annotation detected the presence of numerous genes involved in the degradation of xenobiotics, resistance to abiotic stress, production of lytic enzymes, siderophore formation, and plant growth promotion. The assembled genome also carried chromium, cadmium, copper, and arsenic resistance-related genes, notably cadA, czcD, czrA, arsB etc. Genome mining revealed six biosynthetic gene clusters for bacillaene, bacillibactin, bacilysin, subtilosin, fengycin and surfactin. Importantly, BDSA1 was predicted to be non-pathogenic to humans and had only two acquired antimicrobial resistance genes. The pan-genome analysis showed the openness of the B. subtilis pan-genome. Our findings suggested that B. subtilis BDSA1 might be a promising candidate for diverse biotechnological uses.

RevDate: 2024-08-08

Chen M, Trotter VV, Walian PJ, et al (2024)

Molecular mechanisms and environmental adaptations of flagellar loss and biofilm growth of Rhodanobacter under environmental stress.

The ISME journal pii:7729366 [Epub ahead of print].

Biofilms aid bacterial adhesion to surfaces via direct and indirect mechanisms, and formation of biofilms is considered an important strategy for adaptation and survival in sub-optimal environmental conditions. However, the molecular underpinnings of biofilm formation in subsurface sediment/groundwater ecosystems, where microorganisms often experience fluctuations in nutrient input, pH, nitrate or metal concentrations, are underexplored. We examined biofilm formation under different nutrient, pH, metal, and nitrate regimes of 16 Rhodanobacter strains isolated from subsurface groundwater wells spanning diverse pH (3.5 to 5) and nitrate levels (13.7 to 146 mM). Eight Rhodanobacter strains demonstrated significant biofilm growth under low pH, suggesting adaptation to survive and grow at low pH. Biofilms intensified under aluminum stress, particularly in strains possessing fewer genetic traits associated with biofilm formation, warranting further investigation. Through RB-TnSeq, proteomics, use of specific mutants and transmission electron microscopy analysis, we discovered flagellar loss under aluminum stress, indicating a potential relationship between motility, metal tolerance, and biofilm growth. Comparative genomic analyses revealed absence of flagella and chemotaxis genes, and presence of a putative Type VI secretion system in the high biofilm-forming strain FW021-MT20. This study identifies genetic determinants associated with biofilm growth in a predominant environmental genus, Rhodanobacter, under metal stress and identifies traits aiding survival and adaptation to contaminated subsurface environments.

RevDate: 2024-08-07

Wang Z, Wang M, L Du (2024)

Public perceptions of international genetic information sharing for biomedical research in China: a case study of the social media debate on the article "A Pangenome Reference of 36 Chinese Populations" published in Nature.

Human genomics, 18(1):86.

BACKGROUND: The international disclosure of Chinese human genetic data continues to be a contentious issue in China, generating public debates in both traditional and social media channels. Concerns have intensified after Chinese scientists' research on pangenome data was published in the prestigious journal Nature.

METHODS: This study scrutinized microblogs posted on Weibo, a popular Chinese social media site, in the two months immediately following the publication (June 14, 2023-August 21, 2023). Content analysis was conducted to assess the nature of public responses, justifications for positive or negative attitudes, and the users' overall knowledge of how Chinese human genetic information is regulated and managed in China.

RESULTS: Weibo users displayed contrasting attitudes towards the article's public disclosure of pangenome research data, with 18% positive, 64% negative, and 18% neutral. Positive attitudes came primarily from verified government and media accounts, which praised the publication. In contrast, negative attitudes originated from individual users who were concerned about national security and health risks and often believed that the researchers had betrayed China. The benefits of data sharing highlighted in the commentaries included advancements in disease research and scientific progress. Approximately 16% of the microblogs indicated that Weibo users had misunderstood existing regulations and laws governing data sharing and stewardship.

CONCLUSIONS: Based on the predominantly negative public attitudes toward scientific data sharing established by our study, we recommend enhanced outreach by scientists and scientific institutions to increase the public understanding of developments in genetic research, international data sharing, and associated regulations. Additionally, governmental agencies can alleviate public fears and concerns by being more transparent about their security reviews of international collaborative research involving Chinese human genetic data and its cross-border transfer.

RevDate: 2024-08-07

Kuo WH, Wright SJ, Small LL, et al (2024)

De novo genome assembly of white clover (Trifolium repens L.) reveals the role of copy number variation in rapid environmental adaptation.

BMC biology, 22(1):165.

BACKGROUND: White clover (Trifolium repens) is a globally important perennial forage legume. This species also serves as an eco-evolutionary model system for studying within-species chemical defense variation; it features a well-studied polymorphism for cyanogenesis (HCN release following tissue damage), with higher frequencies of cyanogenic plants favored in warmer locations worldwide. Using a newly generated haplotype-resolved genome and two other long-read assemblies, we tested the hypothesis that copy number variants (CNVs) at cyanogenesis genes play a role in the ability of white clover to rapidly adapt to local environments. We also examined questions on subgenome evolution in this recently evolved allotetraploid species and on chromosomal rearrangements in the broader IRLC legume clade.

RESULTS: Integration of PacBio HiFi, Omni-C, Illumina, and linkage map data yielded a completely de novo genome assembly for white clover (created without a priori sequence assignment to subgenomes). We find that white clover has undergone extensive transposon diversification since its origin but otherwise shows highly conserved genome organization and composition with its diploid progenitors. Unlike some other clover species, its chromosomal structure is conserved with other IRLC legumes. We further find extensive evidence of CNVs at the major cyanogenesis loci; these contribute to quantitative variation in the cyanogenic phenotype and to local adaptation across wild North American populations.

CONCLUSIONS: This work provides a case study documenting the role of CNVs in local adaptation in a plant species, and it highlights the value of pan-genome data for identifying contributions of structural variants to adaptation in nature.

RevDate: 2024-08-07

Joishy TK, Bhattacharya A, Singh CT, et al (2024)

Probiotic and anti-inflammatory properties of Lactiplantibacillus plantarum MKTJ24 isolated from an artisanal fermented fish of North-east India.

New biotechnology pii:S1871-6784(24)00034-7 [Epub ahead of print].

The study aimed to isolate and characterize lactic acid bacteria from various traditional fermented fish products from North East India, including Xindol, Hentak, and Ngari, which hold significant dietary importance for the indigenous tribes. Additionally, the study sought to examine their untargeted metabolomic profiles. A total of 43 strains of Bacillus, Priestia, Staphylococcus, Pediococcus, and Lactiplantibacillus were isolated, characterized by 16S rRNA gene and tested for probiotic properties. Five strains passed pH and bile salt tests with strain-dependent antimicrobial activity and exhibited moderate autoaggregation and hydrophobicity properties. Lactiplantibacillus plantarum MKTJ24 exhibited the highest hydrophobicity (42%), which was further confirmed by adhesion assay in HT-29 cell lines (100%). Lactiplantibacillus plantarum MKTJ24 treatment in LPS-stimulated HT-29 cells up-regulated expression of mucin genes compared to LPS-treated cells. Treatment of RAW 264.7 cells with Lactiplantibacillus plantarum MKTJ24 decreased LPS-induced reactive oxygen species (ROS) and nitric oxide (NO) production. Further, genome analysis of Lactiplantibacillus plantarum MKTJ24 revealed the presence of several probiotic markers and immunomodulatory genes. The genome was found to harbour the plantaricin operon involved in bacteriocin production. A pangenome analysis using all the publicly available L. plantarum genomes specifically isolated from fermented fish products identified 120 unique genes in Lactiplantibacillus plantarum MKTJ24. Metabolomic analysis indicated dominance of ascorbic acids, pentafluoropropionate, cyclopropaneacetic acid, fluorobenzylamine, and furanone in Xindol. This study suggests that Lactiplantibacillus plantarum MKTJ24 has potential probiotic and immunomodulatory properties that could be used in processing traditional fermented fish products on an industrial scale to improve their quality and enhance functional properties.

RevDate: 2024-08-06
CmpDate: 2024-08-06

Cortinovis G, Vincenzi L, Anderson R, et al (2024)

Adaptive gene loss in the common bean pan-genome during range expansion and domestication.

Nature communications, 15(1):6698.

The common bean (Phaseolus vulgaris L.) is a crucial legume crop and an ideal evolutionary model to study adaptive diversity in wild and domesticated populations. Here, we present a common bean pan-genome based on five high-quality genomes and whole-genome reads representing 339 genotypes. It reveals ~234 Mb of additional sequences containing 6,905 protein-coding genes missing from the reference, constituting 49% of all presence/absence variants (PAVs). More non-synonymous mutations are found in PAVs than core genes, probably reflecting the lower effective population size of PAVs and fitness advantages due to the purging effect of gene loss. Our results suggest pan-genome shrinkage occurred during wild range expansion. Selection signatures provide evidence that partial or complete gene loss was a key adaptive genetic change in common bean populations with major implications for plant adaptation. The pan-genome is a valuable resource for food legume research and breeding for climate change mitigation and sustainable agriculture.

RevDate: 2024-08-06

Gasparini K, Figueiredo YG, Araújo WL, et al (2024)

De novo domestication in the Solanaceae: advances and challenges.

Current opinion in biotechnology, 89:103177 pii:S0958-1669(24)00113-7 [Epub ahead of print].

The advent of highly efficient genome editing (GE) tools, coupled with high-throughput genome sequencing, has paved the way for the accelerated domestication of crop wild relatives. New crops could thus be rapidly created that are well adapted to cope with drought, flooding, soil salinity, or insect damage. De novo domestication avoids the complexity of transferring polygenic stress resistance from wild species to crops. Instead, new crops can be created by manipulating major genes in stress-resistant wild species. However, the genetic basis of certain relevant domestication-related traits often involves epistasis and pleiotropy. Furthermore, pan-genome analyses show that structural variation driving gene expression changes has been selected during domestication. A growing body of work suggests that the Solanaceae family, which includes crop species such as tomatoes, potatoes, eggplants, peppers, and tobacco, is a suitable model group to dissect these phenomena and operate changes in wild relatives to improve agronomic traits rapidly with GE. We briefly discuss the prospects of this exciting novel field in the interface between fundamental and applied plant biology and its potential impact in the coming years.

RevDate: 2024-08-06

Le DQ, Nguyen TA, Nguyen SH, et al (2024)

Efficient inference of large prokaryotic pangenomes with PanTA.

Genome biology, 25(1):209.

Pangenome inference is an indispensable step in bacterial genomics, yet its scalability poses a challenge due to the rapid growth of genomic collections. This paper presents PanTA, a software package designed for constructing pangenomes of large bacterial datasets, showing unprecedented efficiency levels multiple times higher than existing tools. PanTA introduces a novel mechanism to construct the pangenome progressively without rebuilding the accumulated collection from scratch. The progressive mode is shown to consume orders of magnitude less computational resources than existing solutions in managing growing datasets. The software is open source and is publicly available at https://github.com/amromics/panta and at 10.6084/m9.figshare.23724705.

RevDate: 2024-08-05
CmpDate: 2024-08-05

Olanrewaju OS, Molale-Tom LG, CC Bezuidenhout (2024)

Genomic diversity, antibiotic resistance, and virulence in South African Enterococcus faecalis and Enterococcus lactis isolates.

World journal of microbiology & biotechnology, 40(10):289.

This study presents the empirical findings of an in-depth genomic analysis of Enterococcus faecalis and Enterococcus lactis isolates from South Africa. It offers valuable insights into their genetic characteristics and their significant implications for public health. The study uncovers nuanced variations in the gene content of these isolates, despite their similar GC contents, providing a comprehensive view of the evolutionary diversity within the species. Genomic islands are identified, particularly in E. faecalis, emphasizing its propensity for horizontal gene transfer and genetic diversity, especially in terms of antibiotic resistance genes. Pangenome analysis reveals the existence of a core genome, accounting for a modest proportion of the total genes, with 2157 core genes, 1164 shell genes, and 4638 cloud genes out of 7959 genes in 52 South African E. faecalis genomes (2 from this study, 49 South African genomes downloaded from NCBI, and the E. faecalis reference genome). Detecting large-scale genomic rearrangements, including chromosomal inversions, underscores the dynamic nature of bacterial genomes and their role in generating genetic diversity. The study uncovers an array of antibiotic resistance genes, with trimethoprim, tetracycline, glycopeptide, and multidrug resistance genes prevalent, raising concerns about the effectiveness of antibiotic treatment. Virulence gene profiling unveils a diverse repertoire of factors contributing to pathogenicity, encompassing adhesion, biofilm formation, stress resistance, and tissue damage. These empirical findings provide indispensable insights into these bacteria's genomic dynamics, antibiotic resistance mechanisms, and virulence potential, underlining the pressing need to address antibiotic resistance and implement robust control measures.
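
Editorial illustration: the core/shell/cloud split reported in this abstract can be sketched as a simple prevalence classification. The cutoffs below (core: present in at least 99% of genomes; cloud: present in fewer than 15%; shell: everything in between) are an assumption modelled on conventions used by common pangenome tools, and the abstract does not say which cutoffs the authors applied. The function name, gene names, and toy genomes are hypothetical.

```python
from collections import Counter


def classify_genes(presence, n_genomes, core_min=0.99, cloud_max=0.15):
    """Label each gene family as core, shell, or cloud by its prevalence.

    presence: dict mapping gene family -> set of genome IDs carrying it.
    core_min and cloud_max are assumed cutoffs, not values from the paper.
    """
    labels = {}
    for gene, genomes in presence.items():
        frac = len(genomes) / n_genomes
        if frac >= core_min:
            labels[gene] = "core"
        elif frac < cloud_max:
            labels[gene] = "cloud"
        else:
            labels[gene] = "shell"
    return labels


# Toy data: ten hypothetical genomes g0..g9 and three invented gene families.
genomes = [f"g{i}" for i in range(10)]
presence = {
    "geneA": set(genomes),      # present in 10/10 genomes
    "geneB": set(genomes[:5]),  # present in 5/10 genomes
    "geneC": {"g0"},            # present in 1/10 genomes
}

labels = classify_genes(presence, len(genomes))
counts = Counter(labels.values())
print(labels)
print(counts)

# The three categories reported in the abstract add up to the stated total.
assert 2157 + 1164 + 4638 == 7959
```

With these assumed cutoffs, the reported figures correspond to roughly 27% core, 15% shell, and 58% cloud genes out of 7959.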

RevDate: 2024-08-05

Kileeg Z, Wang P, GA Mott (2024)

Chromosome-scale assembly and annotation of eight Arabidopsis thaliana ecotypes.

Genome biology and evolution pii:7727391 [Epub ahead of print].

The plant Arabidopsis thaliana is a model system used by researchers throughout much of plant research. Recent efforts have focused on discovering the genomic variation found in naturally occurring ecotypes isolated from around the world. These ecotypes have come from diverse climates and therefore have faced and adapted to a variety of abiotic and biotic stressors. The sequencing and comparative analysis of these genomes can offer insight into the adaptive strategies of plants. While there are a large number of ecotype genome sequences available, the majority were created using short-read technology. Mapping of short reads containing structural variation to a reference genome bereft of that variation leads to incorrect mapping of those reads, resulting in a loss of genetic information and introduction of false heterozygosity. For this reason, long-read de novo sequencing of genomes is required to resolve structural variation events. In this paper, we sequenced the genomes of eight natural variants of A. thaliana using nanopore sequencing. This resulted in highly contiguous assemblies with >95% of the genome contained within 5 contigs. The sequencing results from this study include 5 ecotypes from relict and African populations, an area of untapped genetic diversity. With this study, we increase the knowledge of diversity we have across A. thaliana ecotypes and contribute to ongoing production of an A. thaliana pan-genome.

RevDate: 2024-08-03

She H, Liu Z, Xu Z, et al (2024)

Pan-genome analysis of 13 Spinacia accessions reveals structural variations associated with sex chromosome evolution and domestication traits in spinach.

Plant biotechnology journal [Epub ahead of print].

Structural variations (SVs) are major genetic variants that can be involved in the origin, adaptation and domestication of species. However, the identification and characterization of SVs in Spinacia species are rare due to the lack of a pan-genome. Here, we report eight chromosome-scale assemblies of cultivated spinach and its two wild species. After integration with five existing assemblies, we constructed a comprehensive Spinacia pan-genome and identified 193 661 pan-SVs, which were genotyped in 452 Spinacia accessions. Our pan-SV-enabled genome-wide association study identified signals associated with sex and clarified the evolutionary direction of spinach. Most sex-linked SVs (86%) were biased to occur on the Y chromosome during the evolution of the sex-linked region, resulting in reduced Y-linked gene expression. The frequency of pan-SVs among Spinacia accessions further illustrated the contribution of these SVs to domestication traits, such as bolting time and seed dormancy. Furthermore, compared with SNPs, pan-SVs act as efficient variants in genomic selection (GS) because of their ability to capture missing heritability information and higher prediction accuracy. Overall, this study provides a valuable resource for spinach genomics and highlights the potential utility of pan-SVs in crop improvement and breeding programmes.

RevDate: 2024-08-02

Lin MJ, Langmead B, Y Safonova (2024)

IGLoo: Profiling the Immunoglobulin Heavy chain locus in Lymphoblastoid Cell Lines with PacBio High-Fidelity Sequencing reads.

bioRxiv : the preprint server for biology pii:2024.07.20.604421.

New high-quality human genome assemblies derived from lymphoblastoid cell lines (LCLs) provide reference genomes and pangenomes for genomics studies. However, the characteristics of LCLs pose technical challenges to profiling immunoglobulin (IG) genes. IG loci in LCLs contain a mixture of germline and somatically recombined haplotypes, making them difficult to genotype or assemble accurately. To address these challenges, we introduce IGLoo, a software tool that implements novel methods for analyzing sequence data and genome assemblies derived from LCLs. IGLoo characterizes somatic V(D)J recombination events in the sequence data and identifies the breakpoints and missing IG genes in the LCL-based assemblies. Furthermore, IGLoo implements a novel reassembly framework to improve germline assembly quality by integrating information about somatic events and population structural variations in the IG loci. We applied IGLoo to study the assemblies from the Human Pangenome Reference Consortium, providing new insights into the mechanisms, gene usage, and patterns of V(D)J recombination, causes of assembly fragmentation in the IG heavy chain (IGH) locus, and improved representation of the IGH assemblies.

RevDate: 2024-08-01

Nguyen AK, Schall PZ, JM Kidd (2024)

A map of canine sequence variation relative to a Greenland wolf outgroup.

Mammalian genome : official journal of the International Mammalian Genome Society [Epub ahead of print].

For over 15 years, canine genetics research relied on a reference assembly from a Boxer breed dog named Tasha (i.e., canFam3.1). Recent advances in long-read sequencing and genome assembly have led to the development of numerous high-quality assemblies from diverse canines. These assemblies represent notable improvements in completeness, contiguity, and the representation of gene promoters and gene models. Although genome graph and pan-genome approaches have promise, most genetic analyses in canines rely upon the mapping of Illumina sequencing reads to a single reference. The Dog10K consortium, and others, have generated deep catalogs of genetic variation through an alignment of Illumina sequencing reads to a reference genome obtained from a German Shepherd Dog named Mischka (i.e., canFam4, UU_Cfam_GSD_1.0). However, alignment to a breed-derived genome may introduce bias in genotype calling across samples. Since the use of an outgroup reference genome may remove this effect, we have reprocessed 1929 samples analyzed by the Dog10K consortium using a Greenland wolf (mCanLor1.2) as the reference. We efficiently performed remapping and variant calling using a GPU implementation of common analysis tools. The resulting call set removes the variability in genetic differences seen across samples, and breed relationships revealed by principal component analysis are not affected by the choice of reference genome. Using this sequence data, we inferred the history of population sizes and found that village dog populations experienced a 9-13 fold reduction in historic effective population size relative to wolves.

RevDate: 2024-08-01

Cong J, Zhang S, Zhang Q, et al (2024)

Conserved features and diversity attributes of chimeric RNAs across accessions in four plants.

Plant biotechnology journal [Epub ahead of print].

As a non-collinear expression form of genetic information, chimeric RNAs increase the complexity of the transcriptome in diverse organisms. Although chimeric RNAs have been identified in plants, few common features have been revealed. Here, we systematically explored the landscape of chimeric RNAs across multiple accessions and tissues using pan-genome and transcriptome data of four plants: rice, maize, soybean, and Arabidopsis. Among the four species, conserved characteristics of breakpoints and parental genes were discovered. In each species, chimeric RNAs displayed a high level of diversity among accessions, and the clustering of accessions using chimeric events was generally concordant with clustering based on genomic variants, implying a general relationship between genetic variations and chimeric RNAs. Through mass spectrometry, we confirmed a fusion protein OsNDC1-OsGID1L2 and observed its subcellular localization, which differed from the original proteins. Phenotypic cues in transgenic rice suggest the potential functions of OsNDC1-OsGID1L2. Moreover, an intriguing chimeric event Os01g0216500-Os01g0216900, generated by a large deletion in basmati rice, also exists in another accession without the deletion, demonstrating its convergence in evolution. Our results illuminate the characteristics and hint at the evolutionary implications of plant chimeric RNAs, which serve as a supplement to genetic variations, thus expanding our understanding of genetic diversity.

RevDate: 2024-07-31
CmpDate: 2024-07-31

Rose SA, Robicheau BM, Tolman J, et al (2024)

Nitrogen fixation in the widely distributed marine γ-proteobacterial diazotroph Candidatus Thalassolituus haligoni.

Science advances, 10(31):eadn1476.

The high diversity and global distribution of heterotrophic bacterial diazotrophs (HBDs) in the ocean has recently become apparent. However, understanding the role these largely uncultured microorganisms play in marine N2 fixation poses a challenge due to their undefined growth requirements and the complex regulation of the nitrogenase enzyme. We isolated and characterized Candidatus Thalassolituus haligoni, a member of a widely distributed clade of HBD belonging to the Oceanospirillales. Analysis of its nifH gene via amplicon sequencing revealed the extensive distribution of Cand. T. haligoni across the Pacific, Atlantic, and Arctic Oceans. Pangenome analysis indicates that the isolate shares >99% identity with an uncultured metagenome-assembled genome called Arc-Gamma-03, recently recovered from the Arctic Ocean. Through combined genomic, proteomic, and physiological approaches, we confirmed that the isolate fixes N2 gas. However, the mechanisms governing nitrogenase regulation in Cand. T. haligoni remain unclear. We propose Cand. T. haligoni as a globally distributed, cultured HBD model species within this understudied clade of Oceanospirillales.

RevDate: 2024-07-30

Kim JI, Manuele A, Maguire F, et al (2024)

Identification of key drivers of antimicrobial resistance in Enterococcus using machine learning.

Canadian journal of microbiology [Epub ahead of print].

With antimicrobial resistance (AMR) rapidly evolving in pathogens, quick and accurate identification of genetic determinants of phenotypic resistance is essential for improving surveillance, stewardship, and clinical mitigation. Machine learning (ML) models show promise for AMR prediction in diagnostics but require a deep understanding of internal processes to use effectively. Our study utilized AMR gene, pangenomic, and predicted plasmid features from 647 Enterococcus faecium and Enterococcus faecalis genomes across the One Health continuum, along with corresponding resistance phenotypes, to develop interpretive ML classifiers. Vancomycin resistance could be predicted with 99% accuracy with AMR gene features, 98% with pangenome features, and 96% with plasmid clusters. Top pangenome features overlapped with the resistance genes of the vanA operon, which are often laterally transmitted via plasmids. Doxycycline resistance prediction achieved approximately 92% accuracy with pangenome features, with the top feature being elements of Tn916 conjugative transposon, a tet(M) carrier. Erythromycin resistance prediction models achieved about 90% accuracy, but top features were negatively correlated with resistance due to the confounding effect of population structure. This work demonstrates the importance of reviewing ML models' features to discern biological relevance even when achieving high-performance metrics. Our workflow offers the potential to propose hypotheses for experimental testing, enhancing the understanding of AMR mechanisms, which are crucial for combating the AMR crisis.

RevDate: 2024-07-29
CmpDate: 2024-07-29

Gan S, Ruan L, Xu X, et al (2024)

Whole genome sequencing and analysis of Bacillus sp. TTMP2, a tetramethylpyrazine-producing bacterium.

Molecular biology reports, 51(1):863.

BACKGROUND: Tetramethylpyrazine has been extensively studied as an anticancer substance and a flavor substance in the fields of medicine and food industry. A strain with high tetramethylpyrazine production was screened from the fermented grains of Danquan winery. Genome sequencing can reveal the potential roles of bacteria by thoroughly examining the connection between genes and phenotypes from a genomic perspective.

METHODS AND RESULTS: In this study, the whole genome of this strain was sequenced and analyzed. This paper summarized the genomic characteristics of strain TTMP2 and analyzed genes related to the synthesis of tetramethylpyrazine. Bacillus sp. TTMP2 has a complete metabolic pathway for acetoin and tetramethylpyrazine metabolism. Gene function was analyzed by COG annotation, GO annotation, KEGG annotation and functional annotations for lipoproteins, carbohydrate-active enzymes, and pathogen-host interactions. Phylogenetic analysis indicated that Bacillus velezensis had high homology with Bacillus sp. TTMP2. Genomes of 16 Bacillus species cover all genes of Bacillus, suggesting that genus Bacillus has an open pan-genome and can survive in diverse environments.

CONCLUSION: The analysis of genome sequencing data from Bacillus sp. TTMP2 showed that its metabolic characteristics could be deeply understood, indicating that this bacterium had a particular role in tetramethylpyrazine synthesis.

RevDate: 2024-07-29

Kim J, Varki R, Oliva M, et al (2024)

Re [2] Pair: Increasing the Scalability of RePair by Decreasing Memory Usage.

bioRxiv : the preprint server for biology pii:2024.07.11.603142.

The RePair compression algorithm produces a context-free grammar by iteratively substituting the most frequently occurring pair of consecutive symbols with a new symbol until all consecutive pairs of symbols appear only once in the compressed text. It is widely used in the settings of bioinformatics, machine learning, and information retrieval where random access to the original input text is needed. For example, in pangenomics, RePair is used for random access to a population of genomes. BigRePair improves the scalability of the original RePair algorithm by using Prefix-Free Parsing (PFP) to preprocess the text prior to building the RePair grammar. Despite the efficiency of PFP on repetitive text, there is a scalability issue with the size of the parse which causes a memory bottleneck in BigRePair. In this paper, we design and implement recursive RePair (denoted as Re [2] Pair), which builds the RePair grammar using recursive PFP. Our novel algorithm faces the challenge of constructing the RePair grammar without direct access to the parse of text, relying solely on the dictionary of the text and the parse and dictionary of the parse of the text. We compare Re [2] Pair to BigRePair using SARS-CoV-2 haplotypes and haplotypes from the 1000 Genomes Project. We show that our method Re [2] Pair achieves over a 40% peak memory reduction and a speedup ranging from 12% to 79% compared to BigRePair when compressing the largest input texts in all experiments. Re [2] Pair is made publicly available under the GNU public license here: https://github.com/jkim210/Recursive-RePair.

Theory of computation → Formal languages and automata theory.
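
The pair-substitution loop described in the abstract above can be illustrated with a minimal Python sketch. This is a naive, quadratic-time version of the basic RePair idea only; it is not BigRePair, not the recursive PFP method of the paper, and not the code in the linked repository. The function names `repair` and `expand` and the rule labels `R0`, `R1`, ... are hypothetical, chosen for illustration.

```python
from collections import Counter


def repair(text):
    """Naive RePair sketch: return (compressed sequence, grammar rules).

    Assumption: input symbols are single characters, so the generated
    multi-character rule names (R0, R1, ...) cannot collide with them.
    Overlapping occurrences (e.g. in "aaa") are counted naively.
    """
    seq = list(text)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # every adjacent pair now occurs only once
        symbol = f"R{len(rules)}"
        rules[symbol] = pair
        # Replace non-overlapping occurrences of the pair, left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules):
    """Invert repair(): expand rule symbols back into the original text."""
    out = []
    stack = list(reversed(seq))
    while stack:
        s = stack.pop()
        if s in rules:
            left, right = rules[s]
            stack.append(right)
            stack.append(left)
        else:
            out.append(s)
    return "".join(out)
```

For a repetitive input such as `"ACGTACGTACGT"`, `repair` returns a short sequence plus a handful of rules, and `expand` recovers the input exactly. Practical implementations avoid recounting all pairs on every pass, which is where the scalability work described in the abstract comes in.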

RevDate: 2024-07-27

Romanenko L, Bystritskaya E, Savicheva Y, et al (2024)

Description and Whole-Genome Sequencing of Mariniflexile litorale sp. nov., Isolated from the Shallow Sediments of the Sea of Japan.

Microorganisms, 12(7): pii:microorganisms12071413.

A Gram-negative, aerobic, rod-shaped, non-motile, yellow-pigmented bacterium, KMM 9835[T], was isolated from the sediment sample obtained from the Amur Bay of the Sea of Japan seashore, Russia. Phylogenetic analyses based on the 16S rRNA gene and whole genome sequences positioned the novel strain KMM 9835[T] in the genus Mariniflexile as a separate line sharing the highest 16S rRNA gene sequence similarities of 96.6% and 96.2% with Mariniflexile soesokkakense RSSK-9[T] and Mariniflexile fucanivorans SW5[T], respectively, and similarity values of <96% to other recognized Mariniflexile species. The average nucleotide identity and digital DNA-DNA hybridization values between strain KMM 9835[T] and M. soesokkakense KCTC 32427[T], Mariniflexile gromovii KCTC 12570[T], M. fucanivorans DSM 18792[T], and M. maritimum M5A1M[T] were 83.0%, 82.5%, 83.4%, and 78.3% and 30.7%, 29.6%, 29.5%, and 24.4%, respectively. The genomic DNA GC content of strain KMM 9835[T] was 32.5 mol%. The dominant menaquinone was MK-6, and the major fatty acids were iso-C15:0, iso-C15:1ω10c, and C15:0. The polar lipids of strain KMM 9835[T] consisted of phosphatidylethanolamine, two unidentified aminolipids, an unidentified phospholipid, and six unidentified lipids. A pan-genome analysis showed that the KMM 9835[T] genome encoded 753 singletons. The annotated singletons were more often related to transport protein systems (SusC), transcriptional regulators (AraC, LytTR, LacI), and enzymes (glycosylases). The KMM 9835[T] genome was highly enriched in CAZyme-encoding genes, the proportion of which reached 7.3%. Moreover, the KMM 9835[T] genome was characterized by a high abundance of CAZyme gene families (GH43, GH28, PL1, PL10, CE8, and CE12), indicating its potential to catabolize pectin. This may represent part of an adaptation strategy facilitating microbial consumption of plant polymeric substrates in aquatic environments near shorelines and freshwater sources. 
Based on the combination of phylogenetic and phenotypic characterization, the marine sediment strain KMM 9835[T] (=KCTC 92792[T]) represents a novel species of the genus Mariniflexile, for which the name Mariniflexile litorale sp. nov. is proposed.

RevDate: 2024-07-27

Tatarenkov A, Muñoz-Gutiérrez I, Vargas I, et al (2024)

Pangenome Analysis Reveals Novel Contact-Dependent Growth Inhibition System and Phenazine Biosynthesis Operons in Proteus mirabilis BL95 That Are Located in An Integrative and Conjugative Element.

Microorganisms, 12(7): pii:microorganisms12071321.

Proteus mirabilis is a leading cause of urinary tract infections and a common commensal of the gastrointestinal tract. Our recent study (JB) showed that P. mirabilis strain BL95 employs a novel contact-dependent killing system against enteric bacteria in the mouse gut and in vitro. To uncover the genetic determinants of this system, we performed whole-genome sequencing of BL95 and compared it with 98 complete genomes of P. mirabilis. BL95 carries 56 coding sequences (CDSs) not found in other P. mirabilis. Over half of these unique genes are located on a novel integrative conjugative element (ICE) named ICEPm2, inserted in tRNA-Phe and exclusive to BL95. ICEPm2 has integration, conjugation, and DNA replication modules nearly identical to ICEPm1 (common in P. mirabilis), but ICEPm2 of BL95 carries two operons unique for P. mirabilis: a phenazine biosynthesis operon and a contact-dependent growth inhibition (CDI) system. ICEPm2 is absent in the P. mirabilis (AR_0156) closest to BL95 and it is present in the genomes of several Escherichia coli from mouse intestines, indicating its recent horizontal mobilization. BL95 shares over 100 genes of five different secretion systems with other P. mirabilis, mostly poorly studied, making a large pool of candidate genes for the contact-dependent growth inhibition.

RevDate: 2024-07-26

Peng J, Xiao R, Wu C, et al (2024)

Characterization of the prevalence of Salmonella in different retail chicken supply modes using genome-wide and machine-learning analyses.

Food research international (Ottawa, Ont.), 191:114654.

Salmonella is a foodborne pathogen that causes salmonellosis, of which retail chicken meat is a major source. However, the prevalence of Salmonella in different retail chicken supply modes and the threat posed to consumers remains unclear. The prevalence, serotype distribution, antibiotic resistance, and genomic characteristics of Salmonella in three supply modes of retail chicken (live poultry, frozen, and chilled) were investigated using whole-genome sequencing (WGS) and machine learning (ML). In this study, 480 retail chicken samples from live poultry, frozen, and chilled supply modes in Guangzhou from 2020 to 2021, as well as 253 Salmonella isolates (total isolation rate = 53.1 %), were collected. The prevalence of isolates in the live poultry mode (67.5 %, 81/120) was statistically higher than in the frozen (50.0 %, 120/240) and chilled (43.3 %, 52/120) (P < 0.05) modes. Serotype identification showed significant differences in the serotype distribution of Salmonella in different supply modes. S. Enteritidis (46.7 %) and S. Indiana (14.2 %) were predominant in the frozen mode. S. Agona (23.5 %) and S. Saintpaul (13.6 %) were predominant in live poultry, while S. Enteritidis (40.4 %) and S. Kentucky (17.3 %) were predominant in chilled mode. Antibiotic testing showed that frozen mode isolates were more resistant; the multidrug-resistant (MDR) rate of isolates in the frozen mode reached 91.8 %, significantly higher than in the chilled (86.5 %) and live (74.1 %) (P < 0.05) modes. WGS was performed on 155 top serotypes (S. Enteritidis, S. Kentucky, S. Indiana, and S. Agona). The antibiotic resistance gene analysis showed that the abundance and carrying rate of antibiotic resistance genes of Salmonella in the frozen mode (54 types, 16.1 %) were significantly higher than in other modes (live poultry: 36 types, 9.4 %, P < 0.05; chilled: 31 types, 11.6 %). 
The blaNDM-1 and blaNDM-9 genes encoding carbapenem resistance were found in frozen mode isolates on a complex transposon consisting of TnAS3-IS26. Virulence factors and plasmid replicons were abundant in the studied frozen mode isolates. In addition, single nucleotide polymorphism (SNP) phylogenetic tree results showed that in the frozen supply mode, the S. Enteritidis clonal clade continued to contaminate retail chicken meat and was homologous to S. Enteritidis strains found in farm chicken embryos, slaughterhouse chicken carcasses, and patients from hospitals in China (SNP 0 = 10). Notably, the pan-genome-based ML model showed that characteristic genes in frozen and live poultry isolates differed. The narZ gene was a key characteristic gene in frozen isolates, encoding nitrate reductase, relating to anaerobic bacterial growth. The ydgJ gene is a key characteristic gene in the live mode and encodes an oxidoreductase related to oxidative function in bacteria. The high prevalence of live poultry mode Salmonella and the transmission of frozen mode MDR Salmonella in this study pose serious risks to food safety and public health, emphasizing the importance of improving disinfection and cold storage measures to reduce Salmonella contamination and transmission. In conclusion, the continued surveillance of Salmonella across different supply models and the development of an epidemiological surveillance system based on WGS is necessary.

RevDate: 2024-07-26

Radford EJ, Whitworth DE (2024)

The genetic basis of predation by myxobacteria.

Advances in microbial physiology, 85:1-55.

Myxobacteria (phylum Myxococcota) are abundant and virtually ubiquitous microbial predators. Facultatively multicellular organisms, they are able to form multicellular fruiting bodies and swarm across surfaces, cooperatively hunting for prey. Myxobacterial communities are able to kill a wide range of prey microbes, assimilating their biomass to fuel population growth. Their mechanism of predation is exobiotic - hydrolytic enzymes and toxic metabolites are secreted into the extracellular environment, killing and digesting prey cells from without. However, recent observations of single-cell predation and contact-dependent prey killing challenge the dogma of myxobacterial predation being obligately cooperative. Regardless of their predatory mechanisms, myxobacteria have a broad prey range, which includes Gram-negative bacteria, Gram-positive bacteria and fungi. Pangenome analyses have shown that their extremely large genomes are mainly composed of accessory genes, which are not shared by all members of their species. It seems that the diversity of accessory genes in different strains provides the breadth of activity required to prey upon such a smorgasbord of microbes, and also explains the considerable strain-to-strain variation in predatory efficiency against specific prey. After providing a short introduction to general features of myxobacterial biology which are relevant to predation, this review brings together a rapidly growing body of work into the molecular mechanisms and genetic basis of predation, presenting a summary of current knowledge, highlighting trends in research and suggesting strategies by which we can potentially exploit myxobacterial predation in the future.

RevDate: 2024-07-26

Magome TG, Ochai SO, Hassim A, et al (2024)

A genome-based investigation of the Priestia species isolated from anthrax endemic regions in Kruger National Park.

Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases pii:S1567-1348(24)00100-X [Epub ahead of print].

Priestia is a genus that was renamed from the genus Bacillus based on the conserved signature indels (CSIs) in protein sequences that separate Priestia species from Bacillus, with the latter only including closely related species to B. subtilis and B. cereus. Diagnosis of anthrax, a zoonotic disease, is implicated by tripartite anthrax virulence genes (lef, pagA, and cya) and poly-γ-D-glutamic acid capsular genes (capABCDE) of Bacillus anthracis. Due to the amplification of anthrax virulence genes in Priestia isolates, the search for homologous anthrax virulence genes within the Priestia spp. (n = 9) isolated from animal blood smears was embarked upon through whole genome sequencing. In silico taxonomic identification of the isolates was conducted using genome taxonomy database (GTDB), average nucleotide identity (ANI), and multi-locus sequence typing (MLST), which identified the genomes as P. aryabhattai (n = 5), P. endophytica (n = 2) and P. megaterium (n = 2). A pan-genome analysis was further employed on the Priestia genomes, including the screening of virulence, antibiotic resistance genes and mobile genetic elements on the sequenced genomes. The oligoribonuclease NrnB protein sequences showed that Priestia spp. possess a unique CSI that is absent in other Bacillus species. Furthermore, the CSI in P. endophytica is unique from other Priestia spp. Pan-genomic analysis indicates that P. endophytica clusters separately from P. aryabhattai and P. megaterium. In silico BLASTn genome analysis using the SYBR primers, Taqman probes and primers that target the chromosomal marker (Ba-1), protective antigen (pagA), and lethal factor (lef) on B. anthracis, showed partial binding to Priestia regions encoding for hypothetical proteins, pyridoxine biosynthesis, hydrolase, and inhibitory proteins. The antibiotic resistance genes (ARG) profile of Priestia spp. showed that the genomes contained no more than two ARGs. 
This included genes conferring resistance to rifamycin and fosfomycin (P. endophytica) as well as clindamycin (P. aryabhattai and P. megaterium). Priestia genomes lacked B. anthracis plasmids and consisted of plasmid replicon types with unknown functions. Furthermore, the amplification of Priestia strains may result in false positives when qPCR is used to detect the virulence genes of B. anthracis in soil, blood smears, and/or environmental samples.

RevDate: 2024-07-26

Peñil-Celis A, Tagg KA, Webb HE, et al (2024)

Mobile genetic elements define the non-random structure of the Salmonella enterica serovar Typhi pangenome.

mSystems [Epub ahead of print].

Bacterial relatedness measured using select chromosomal loci forms the basis of public health genomic surveillance. While approximating vertical evolution through this approach has proven exceptionally valuable for understanding pathogen dynamics, it excludes a fundamental dimension of bacterial evolution: horizontal gene transfer. Incorporating the accessory genome is the logical remediation and has recently shown promise in expanding epidemiological resolution for enteric pathogens. Employing k-mer-based Jaccard index analysis, and a novel genome length distance metric, we computed pangenome (i.e., core and accessory) relatedness for the globally important pathogen Salmonella enterica serotype Typhi (Typhi), and graphically expressed both vertical (homology-by-descent) and horizontal (homology-by-admixture) evolutionary relationships in a reticulate network of over 2,200 U.S. Typhi genomes. This analysis revealed non-random structure in the Typhi pangenome that is driven predominantly by the gain and loss of mobile genetic elements, confirming and expanding upon known epidemiological patterns, revealing novel plasmid dynamics, and identifying avenues for further genomic epidemiological exploration. With an eye to public health application, this work adds important biological context to the rapidly improving ways of analyzing bacterial genetic data and demonstrates the value of the accessory genome to infer pathogen epidemiology and evolution.

IMPORTANCE: Given bacterial evolution occurs in both vertical and horizontal dimensions, inclusion of both core and accessory genetic material (i.e., the pangenome) is a logical step toward a more thorough understanding of pathogen dynamics. 
With an eye to public, and indeed, global health relevance, we couple contemporary tools for genomic analysis with decades of research on mobile genetic elements to demonstrate the value of the pangenome, known and unknown, annotated, and hypothetical, for stratification of Salmonella enterica serovar Typhi (Typhi) populations. We confirm and expand upon what is known about Typhi epidemiology, plasmids, and antimicrobial resistance dynamics, and offer new avenues of exploration to further deduce Typhi ecology and evolution, and ultimately to reduce the incidence of human disease.
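
As a rough illustration of the k-mer-based Jaccard index mentioned in the abstract above, the following Python sketch compares two sequences by the overlap of their k-mer sets. It is a toy version under stated assumptions: the function names `kmers` and `jaccard` and the default `k=4` are hypothetical, reverse complements are ignored, and the authors' genome length distance metric is not reproduced here.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(seq_a, seq_b, k=4):
    """Jaccard index of two sequences' k-mer sets: |A & B| / |A | B|.

    Returns 0.0 when neither sequence is long enough to yield a k-mer,
    which is an arbitrary convention chosen for this sketch.
    """
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)
```

Identical sequences score 1.0 and sequences sharing no k-mers score 0.0; because the k-mer sets cover both core and accessory material, gain or loss of a mobile element lowers the score even when the shared chromosome is unchanged. Tools built for thousands of genomes typically estimate this quantity from sketches of the k-mer sets and do not compare the full sets.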

RevDate: 2024-07-26

Mortimer TD (2024)

mSphere of Influence: Predicting the evolution of pathogen populations.

mSphere [Epub ahead of print].

Tatum D. Mortimer works in the field of pathogen population genomics and evolution. In this mSphere of Influence article, she reflects on how "Frequency-dependent selection can forecast evolution in Streptococcus pneumoniae" by Azarian et al. and "Contingency, repeatability, and predictability in the evolution of a prokaryotic pangenome" by Beavan et al. made an impact on her by highlighting the ways in which genomic data can be used to predict pathogen evolution.

RevDate: 2024-07-26
CmpDate: 2024-07-26

Cui H, Fan S, Ding W, et al (2024)

Genomic Analysis of Novel Sulfitobacter Bacterial Strains Isolated from Marine Biofilms.

Marine drugs, 22(7): pii:md22070289.

Bacteria from the genus Sulfitobacter are distributed across various marine habitats and play a significant role in sulfur cycling. However, the metabolic features of Sulfitobacter inhabiting marine biofilms are still not well understood. Here, complete genomes and paired metatranscriptomes of eight Sulfitobacter strains, isolated from biofilms on subtidal stones, have been analyzed to explore their central energy metabolism and potential of secondary metabolite biosynthesis. Based on average nucleotide identity and phylogenetic analysis, the eight strains were classified into six novel species and two novel strains. The reconstruction of the metabolic pathways indicated that all strains had a complete Entner-Doudoroff pathway, pentose phosphate pathway, and diverse pathways for amino acid metabolism, suggesting the presence of an optimized central carbon metabolism. Pangenome analysis further revealed the differences between the gene cluster distribution patterns among the eight strains, suggesting significant functional variation. Moreover, a total of 47 biosynthetic gene clusters were discovered, which were further classified into 37 gene cluster families that showed low similarity with previously documented clusters. Furthermore, metatranscriptomic analysis revealed the expressions of key functional genes involved in the biosynthesis of ribosomal peptides in in situ marine biofilms. Overall, this study sheds new light on the metabolic features, adaptive strategies, and value of genome mining in this group of biofilm-associated Sulfitobacter bacteria.

RevDate: 2024-07-26

Abid A, Alzahrani B, Naz S, et al (2024)

Reverse Vaccinology Approach to Identify Novel and Immunogenic Targets against Streptococcus gordonii.

Biology, 13(7): pii:biology13070510.

Streptococcus gordonii is a gram-positive, mutualistic bacterium found in the human body. It is found in the oral cavity, upper respiratory tract, and intestines, and presents a serious clinical problem because it can lead to opportunistic infections in individuals with weakened immune systems. Streptococci are the most prevalent inhabitants of oral microbial communities, and are typical oral commensals found in the human oral cavity. These streptococci, along with many other oral microbes, produce multispecies biofilms that can attach to salivary pellicle components and other oral bacteria via adhesin proteins expressed on the cell surface. Antibiotics are effective against this bacterium, but resistance against antibiotics is increasing. Therefore, a more effective treatment is needed. Vaccines offer a promising method for preventing this issue. This study generated a multi-epitope vaccine against Streptococcus gordonii by targeting the completely sequenced proteomes of five strains. The vaccine targets are identified using a pangenome and subtractive proteomic approach. In the present study, 13 complete strains out of 91 strains of S. gordonii are selected. The pangenomics results revealed that out of 2835 pan genes, 1225 are core genes. Out of these 1225 core genes, 643 are identified as non-homologous proteins by subtractive proteomics. A total of 20 essential proteins are predicted from non-homologous proteins. Among these 20 essential proteins, only five are identified as surface proteins. The vaccine construct is designed based on selected B- and T-cell epitopes of the antigenic proteins with the help of linkers and adjuvants. The designed vaccine is docked against TLR2. The expression of the protein is determined using in silico gene cloning. 
Findings concluded that Vaccine I with adjuvant shows higher interactions with TLR2, suggesting that the vaccine has the ability to induce a humoral and cell-mediated response to treat and prevent infection; this makes it promising as a vaccine against infectious diseases caused by S. gordonii. Furthermore, validation of the vaccine construct is required by in vitro and in vivo trials to check its actual potency and safety for use to prevent infectious diseases caused by S. gordonii.
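
The split of a pan-genome into core and non-core genes, as reported in the abstract above, can be sketched from a gene presence/absence table. The Python below is a minimal illustration on made-up strain and gene names; `partition_pangenome` is a hypothetical helper, and real pipelines first cluster genes into families by sequence similarity, a step this sketch assumes has already been done.

```python
def partition_pangenome(genomes):
    """Split gene families into core, accessory, and unique sets.

    genomes: dict mapping strain name -> iterable of gene-family ids.
    core      = families present in every strain
    unique    = families present in exactly one strain (and not core)
    accessory = everything else (present in some but not all strains)
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = [set(genes) for genes in genomes.values()]
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    unique = {g for g in pan if sum(g in s for s in gene_sets) == 1} - core
    accessory = pan - core - unique
    return {"pan": pan, "core": core, "accessory": accessory, "unique": unique}


# Toy example with hypothetical strains and gene families.
toy = {
    "strainA": ["g1", "g2", "g3"],
    "strainB": ["g1", "g2", "g4"],
    "strainC": ["g1", "g3", "g5"],
}
parts = partition_pangenome(toy)
```

In the toy data only `g1` is shared by all three strains, so it is the sole core gene family; the pan-genome is the union of all five families.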

RevDate: 2024-07-25
CmpDate: 2024-07-25

Mol M, de Maayer P (2024)

Elucidating the biotechnological potential of the genera Parageobacillus and Saccharococcus through comparative genomic and pan-genome analysis.

BMC genomics, 25(1):723.

BACKGROUND: The genus Geobacillus and its associated taxa have been the focal point of numerous thermophilic biotechnological investigations, both at the whole cell and enzyme level. By contrast, comparatively little research has been done on its recently delineated sister genus, Parageobacillus. Here we performed pan-genomic analyses on a subset of publicly available Parageobacillus and Saccharococcus genomes to elucidate their biotechnological potential.

RESULTS: Phylogenomic analysis delineated the compared taxa into two distinct genera, Parageobacillus and Saccharococcus, with P. caldoxylosilyticus isolates clustering with S. thermophilus in the latter genus. Both genera present open pan-genomes, with the species P. toebii being characterized with the highest novel gene accrual. Diversification of the two genera is driven through the variable presence of plasmids, bacteriophages and transposable elements. Both genera present a range of potentially biotechnologically relevant features, including a source of novel antimicrobials, thermostable enzymes including DNA-active enzymes, carbohydrate active enzymes, proteases, lipases and carboxylesterases. Furthermore, they present a number of metabolic pathways pertinent to degradation of complex hydrocarbons and xenobiotics and for green energy production.

CONCLUSIONS: Comparative genomic analyses of Parageobacillus and Saccharococcus suggest that taxa in both of these genera can serve as a rich source of biotechnologically and industrially relevant secondary metabolites, thermostable enzymes and metabolic pathways that warrant further investigation.

RevDate: 2024-07-25
CmpDate: 2024-07-25

Chekesa B, Singh H, Gonzalez-Juarbe N, et al (2024)

Pangenome and genomic signatures linked to the dominance of the lineage-4 of Mycobacterium tuberculosis isolated from extrapulmonary tuberculosis patients in western Ethiopia.

PloS one, 19(7):e0304060 pii:PONE-D-24-08245.

BACKGROUND: The lineage 4 (L4) of Mycobacterium tuberculosis (MTB) is not only globally prevalent but also locally dominant, surpassing other lineages, with lineage 2 (L2) following in prevalence. Despite its widespread occurrence, factors influencing the expansion of L4 and its sub-lineages remain poorly understood both at local and global levels. Therefore, this study aimed to conduct a pan-genome analysis and identify genomic signatures linked to the elevated prevalence of L4 sublineages among extrapulmonary TB (EPTB) patients in western Ethiopia.

METHODS: A cross-sectional study was conducted at an institutional level involving confirmed cases of extrapulmonary tuberculosis (EPTB) patients from August 5, 2018, to December 30, 2019. A total of 75 MTB genomes, classified under lineage 4 (L4), were used for conducting pan-genome and genome-wide association study (GWAS) analyses. After a quality check, variants were identified using MTBseq, and genomes were de novo assembled using SPAdes. Gene prediction and annotation were performed using Prokka. The pan-genome was constructed using GET_HOMOLOGUES, and its functional analysis was carried out with the Bacterial Pan-Genome Analysis tool (BPGA). For GWAS analysis, Scoary was employed with Benjamini-Hochberg correction, with a significance threshold set at p-value ≤ 0.05.

RESULTS: The analysis revealed a total of 3,270 core genes, predominantly associated with clusters of orthologous groups (COG) functions, notably in the categories of '[R] General function prediction only' and '[I] Lipid transport and metabolism'. Conversely, functions related to '[N] Cell motility' and '[Q] Secondary metabolites biosynthesis, transport, and catabolism' were primarily linked to unique and accessory genes. The pan-genome of MTB L4 was found to be open. Furthermore, the GWAS study identified genomic signatures linked to the prevalence of sublineages L4.6.3 and L4.2.2.2.

CONCLUSIONS: Apart from host and environmental factors, the sublineage of L4 employs distinct virulence factors for successful dissemination in western Ethiopia. Given that the functions of these newly identified genes are not well understood, it is advisable to experimentally validate their roles, particularly in the successful transmission of specific L4 sublineages over others.
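
The Benjamini-Hochberg correction named in the methods above can be sketched in a few lines of Python. This is a generic textbook version of the procedure, not Scoary's implementation; the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), then
    a running minimum taken from the largest rank downward keeps the
    adjusted values monotone and capped at 1.0.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be reported as associated when its adjusted p-value is at or below the chosen threshold, for example 0.05 as in the study above.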

RevDate: 2024-07-25
CmpDate: 2024-07-25

Dost I, Abdel-Glil M, Persson S, et al (2024)

Genomic study of European Clostridioides difficile ribotype 002/sequence type 8.

Microbial genomics, 10(7):.

Clostridioides difficile has significant clinical importance as a leading cause of healthcare-associated infections, with symptoms ranging from mild diarrhoea to severe colitis, and possible life-threatening complications. C. difficile ribotype (RT) 002, mainly associated with MLST sequence type (ST) 8, is one of the most common RTs found in humans. This study aimed at investigating the genetic characteristics of 537 C. difficile genomes of ST8/RT002. To this end, we sequenced 298 C. difficile strains representing a new European genome collection, with strains from Germany, Denmark, France and Portugal. These sequences were analysed against a global dataset consisting of 1,437 ST8 genomes available through Enterobase. Our results showed close genetic relatedness among the studied ST8 genomes, a diverse array of antimicrobial resistance (AMR) genes and the presence of multiple mobile elements. Notably, the pangenome analysis revealed an open genomic structure. ST8 shows relatively low overall variation. Thus, clonal isolates were found across different One Health sectors (humans, animals, environment and food), time periods, and geographical locations, suggesting the lineage's stability and a universal environmental source. Importantly, this stability did not hinder the acquisition of AMR genes, emphasizing the adaptability of this bacterium to different selective pressures. Although only 2.4 % (41/1,735) of the studied genomes originated from non-human sources, such as animals, food, or the environment, we identified 9 cross-sectoral core genome multilocus sequence typing (cgMLST) clusters. Our study highlights the importance of ST8 as a prominent lineage of C. difficile with critical implications in the context of One Health. In addition, these findings strongly support the need for continued surveillance and investigation of non-human samples to gain a more comprehensive understanding of the epidemiology of C. difficile.

RevDate: 2024-07-24

Garg V, Bohra A, Mascher M, et al (2024)

Unlocking plant genetics with telomere-to-telomere genome assemblies.

Nature genetics [Epub ahead of print].

Contiguous genome sequence assemblies will help us to realize the full potential of crop translational genomics. Recent advances in sequencing technologies, especially long-read sequencing strategies, have made it possible to construct gapless telomere-to-telomere (T2T) assemblies, thus offering novel insights into genome organization and function. Plant genomes pose unique challenges, such as a continuum of ancient to recent polyploidy and abundant highly similar and long repetitive elements. Owing to progress in sequencing approaches, for most crop plants, chromosome-scale reference genome assemblies are available, but T2T assembly construction remains challenging. Here we describe methods for haplotype-resolved, gapless T2T assembly construction in plants, including various crop species. We outline the impact of T2T assemblies in elucidating the roles of repetitive elements in gene regulation, as well as in pangenomics, functional genomics, genome-assisted breeding and targeted genome manipulation. In conjunction with sequence-enriched germplasm repositories, T2T assemblies thus hold great promise for basic and applied plant sciences.

RevDate: 2024-07-24

Hong UVT, Tamiru-Oli M, Hurgobin B, et al (2024)

Genomic and Cell-Specific Regulation of Benzylisoquinoline Alkaloid Biosynthesis in Opium Poppy.

Journal of experimental botany pii:7719224 [Epub ahead of print].

Opium poppy is a crop of great commercial value as a source of several opium alkaloids for the pharmaceutical industries including morphine, codeine, thebaine, noscapine and papaverine. Most enzymes involved in benzylisoquinoline alkaloids (BIAs) biosynthesis in opium poppy have been functionally characterized, and opium poppy currently serves as a model system to study BIA metabolism in plants. BIA biosynthesis in opium poppy involves two biosynthetic gene clusters associated respectively with the morphine and noscapine branches. Recent reports have shown that genes in the same cluster are co-expressed, suggesting they might also be co-regulated. However, the transcriptional regulation of opium poppy BIA biosynthesis is not well studied. Opium poppy BIA biosynthesis involves three cell types associated with the phloem system: companion cells, sieve elements and laticifers. The transcripts and enzymes associated with BIA biosynthesis are distributed across cell types, requiring the translocation of key enzymes and pathway intermediates between cell types. Together, these suggest that the regulation of BIA biosynthesis in opium poppy is multilayered and complex, involving biochemical, genomic, and physiological mechanisms. In this review, we highlight recent advances in genome sequencing and single cell and spatial transcriptomics with a focus on how these efforts can improve our understanding of the genomic and cell-specific regulation of BIA biosynthesis. Such knowledge is vital for opium poppy genetic improvement and metabolic engineering efforts targeting the modulation of alkaloid yield and composition.

RevDate: 2024-07-24

Chintakovid N, Singkhamanan K, Yaikhan T, et al (2024)

Probiogenomic analysis of Lactiplantibacillus plantarum SPS109: A potential GABA-producing and cholesterol-lowering probiotic strain.

Heliyon, 10(13):e33823.

Lactiplantibacillus plantarum SPS109, a strain of lactic acid bacteria (LAB) isolated from fermented foods, showed remarkable potential as a probiotic with dual capabilities in γ-aminobutyric acid (GABA) production and cholesterol reduction. This study employs genomic and comparative analyses to investigate the strain's genetic profile, safety features, and probiotic attributes. The safety assessment reveals the absence of virulence factors and antimicrobial resistance genes, while the genome contains bacteriocin-related elements, including sactipeptides and a cluster for putative plantaricins, strengthening its ability to combat diverse pathogens. Pangenome analysis revealed unique bacteriocin-related genes, specifically lcnD and bcrA, distinguishing SPS109 from four other L. plantarum strains producing GABA. In addition, the genomic study highlights distinctive features of strain SPS109: two GABA-related genes responsible for GABA production and a bile tolerance gene (cbh) crucial for cholesterol reduction. The analysis also highlights several genes linked to potential probiotic properties, including stress tolerance, vitamin production, and antioxidant activity. In summary, L. plantarum SPS109 emerges as a promising probiotic candidate with versatile applications in the food and beverage industries, supported by its unique genomic features and safety profile.

RevDate: 2024-07-23

L Rocha J, Lou RN, PH Sudmant (2024)

Structural variation in humans and our primate kin in the era of telomere-to-telomere genomes and pangenomics.

Current opinion in genetics & development, 87:102233 pii:S0959-437X(24)00082-0 [Epub ahead of print].

Structural variants (SVs) account for the majority of base pair differences both within and between primate species. However, our understanding of inter- and intra-species SV has been historically hampered by the quality of draft primate genomes and the absence of genome resources for key taxa. Recently, advances in long-read sequencing and genome assembly have begun to radically reshape our understanding of SVs. Two landmark achievements include the publication of a human telomere-to-telomere (T2T) genome as well as the development of the first human pangenome reference. In this review, we first look back to the major works laying the foundation for these projects. We then examine the ways in which T2T genome assemblies and pangenomes are transforming our understanding of and approach to primate SV. Finally, we discuss what the future of primate SV research may look like in the era of T2T genomes and pangenomics.

RevDate: 2024-07-23

Li H, Marin M, MR Farhat (2024)

Exploring gene content with pangene graphs.

Bioinformatics (Oxford, England) pii:7718494 [Epub ahead of print].

MOTIVATION: The gene content regulates the biology of an organism. It varies between species and between individuals of the same species. Although tools have been developed to identify gene content changes in bacterial genomes, none is applicable to collections of large eukaryotic genomes such as the human pangenome.

RESULTS: We developed pangene, a computational tool to identify gene orientation, gene order and gene copy-number changes in a collection of genomes. Pangene aligns a set of input protein sequences to the genomes, resolves redundancies between protein sequences and constructs a gene graph with each genome represented as a walk in the graph. It additionally finds subgraphs, which we call bibubbles, that capture gene content changes. Applied to the human pangenome, pangene identifies known gene-level variations and reveals complex haplotypes that are not well studied before. Pangene also works with high-quality bacterial pangenome and reports similar numbers of core and accessory genes in comparison to existing tools.

Source code at https://github.com/lh3/pangene; pre-built pangene graphs can be downloaded from https://zenodo.org/records/8118576 and visualized at https://pangene.bioinweb.org.
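
As a rough illustration of the gene-graph idea described in the RESULTS paragraph above, the following Python sketch represents each genome as a walk over oriented genes and counts how many walks support each adjacency. It is not pangene's implementation: the function names, the input format (lists of (gene, strand) tuples) and the toy data are assumptions made for this example, and it omits protein-to-genome alignment, redundancy resolution, reverse-complement equivalence of edges and bibubble detection.

```python
from collections import Counter


def build_gene_graph(walks):
    """Count adjacencies between consecutive oriented genes.

    walks: dict mapping a genome name to a list of (gene, strand)
    tuples in genomic order (hypothetical input format).
    Returns a Counter keyed by ((gene, strand), (gene, strand)).
    """
    edges = Counter()
    for walk in walks.values():
        for left, right in zip(walk, walk[1:]):
            edges[(left, right)] += 1
    return edges


def core_and_accessory(walks):
    """Split genes into those present in every genome and the rest."""
    if not walks:
        return set(), set()
    gene_sets = [{gene for gene, _ in walk} for walk in walks.values()]
    core = set.intersection(*gene_sets)
    accessory = set.union(*gene_sets) - core
    return core, accessory


# Toy data: genome g2 lacks gene B, so the A-C adjacency appears only there.
toy_walks = {
    "g1": [("A", "+"), ("B", "+"), ("C", "+")],
    "g2": [("A", "+"), ("C", "+")],
}
toy_edges = build_gene_graph(toy_walks)
toy_core, toy_accessory = core_and_accessory(toy_walks)
```

In this toy case the graph branches after gene A (one walk goes through B, the other skips it), which is the kind of local gene-content difference that a subgraph search would need to capture.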

RevDate: 2024-07-22

Mazwi KD, Lekota KE, Glover BA, et al (2024)

Whole Genome Sequence Analysis of Brucella spp. from Human, Livestock, and Wildlife in South Africa.

Journal of microbiology (Seoul, Korea) [Epub ahead of print].

Brucellosis is an economically important zoonotic disease affecting humans, livestock, and wildlife health globally and especially in Africa. Brucella abortus and B. melitensis have been isolated from human, livestock (cattle and goat), and wildlife (sable) in South Africa (SA) but with little knowledge of the population genomic structure of this pathogen in SA. As whole genome sequencing can help to differentiate and trace the origin of outbreaks of Brucella spp. strains, the whole genomes of retrospective isolates (n = 19) from previous studies were sequenced. Sequences were analysed using average nucleotide identity (ANI), pangenomics, and whole genome single nucleotide polymorphism (wgSNP) to trace the geographical origin of cases of brucellosis circulating in human, cattle, goats, and sable from different provinces in SA. Pangenomics analysis of B. melitensis (n = 69) and B. abortus (n = 56) was conducted with 19 strains that included B. abortus from cattle (n = 3) and B. melitensis from a human (n = 1), cattle (n = 1), goat (n = 1), Rev1 vaccine strain (n = 1), and sable (n = 12). Pangenomics analysis of B. melitensis genomes highlighted genes shared amongst the sable genomes, including 10 hypothetical proteins and genes that encode acetyl-coenzyme A synthetase (acs) and acylamidase (aam). The wgSNP analysis confirmed the B. melitensis isolated from human was more closely related to the goat from the Western Cape Province from the same outbreak than the B. melitensis cattle sample from different cases in the Gauteng Province. The B. melitensis sable strains could be distinguished from the African lineage, constituting their own African sub-clade. The sequenced B. abortus strains clustered in the C2 lineage that is closely related to the isolates from Mozambique and Zimbabwe. This study identified genetically diverse Brucella spp. among various hosts in SA. This study expands the limited knowledge regarding the presence of B. melitensis in livestock and humans in SA, further building a foundation for future research on the distribution of the Brucella spp. worldwide and its evolutionary background.

RevDate: 2024-07-22

Gheorghe-Barbu I, Surleac M, Barbu IC, et al (2024)

Decoding the resistome, virulome and mobilome of clinical versus aquatic Acinetobacter baumannii in southern Romania.

Heliyon, 10(13):e33372.

Acinetobacter baumannii, a notorious opportunistic pathogen, presents a formidable challenge in both clinical and environmental fields due to its resilience and ability to acquire resistance. This study undertook a comprehensive analysis of 183 A. baumannii isolates collected between 2019 and 2022 from intra-hospital infections (IHI), hospital sewages (Hs), wastewater treatment plants (WWTP), and adjacent river waters from two Southern cities, focusing on their resistome, virulome, and mobilome through phenotypic characterization (isolation on chromogenic media, identification by MALDI-TOF-MS and antibiotic susceptibility testing by disk diffusion) followed by genotypic characterization [Whole Genome Sequencing (WGS), 3rd generation sequencing through the MinION (ONT) platform, pangenome description, and horizontal gene transfer through conjugation assays]. Our findings reveal significant genomic plasticity and the prevalence of high-risk international clones, underlining the potential of these isolates to act as reservoirs for antibiotic resistance genes (ARGs) that could be dynamically exchanged between clinical and environmental settings through mobile genetic elements (MGEs) such as the pMAL1 plasmids and the critical role of WWTPs in the persistence and spread of A. baumannii. Moreover, our study presents the first report of the co-occurrence of bla OXA-23 and bla OXA-72 in A. baumannii ST2 clone. Thus, our research underscores the necessity for integrated surveillance and targeted interventions across healthcare and environmental sectors to mitigate the risk posed by this adaptable pathogen.

RevDate: 2024-07-20
CmpDate: 2024-07-20

Xiang Y, Zhu K, Min K, et al (2024)

Characterization of a Salmonella enterica serovar Typhimurium lineage with rough colony morphology and multidrug resistance.

Nature communications, 15(1):6123.

Salmonella enterica serovar Typhimurium (S. Typhimurium) is a major cause of salmonellosis, and the emergence of multidrug-resistant pathovariants has become a growing concern. Here, we investigate a distinct rough colony variant exhibiting a strong biofilm-forming ability isolated in China. Whole-genome sequencing on 2,212 Chinese isolates and 1,739 publicly available genomes reveals the population structure and evolutionary history of the rough colony variants. Characterized by macro, red, dry, and rough (mrdar) colonies, these variants demonstrate enhanced biofilm formation at 28 °C and 37 °C compared to typical rdar colonies. The mrdar variants exhibit extensive multidrug resistance, with significantly higher resistance to at least five classes of antimicrobial agents compared to non-mrdar variants. This resistance is primarily conferred by an IncHI2 plasmid harboring 19 antimicrobial resistance genes. Phylogenomic analysis divides the global collections into six lineages. The majority of mrdar variants belong to sublineage L6.5, which originated from Chinese smooth colony strains and possibly emerged circa 1977. Among the mrdar variants, upregulation of the csgDEFG operons is observed, probably due to a distinct point mutation (-44G > T) in the csgD gene promoter. Pangenome and genome-wide association analyses identify 87 specific accessory genes and 72 distinct single nucleotide polymorphisms associated with the mrdar morphotype.

RevDate: 2024-07-19

Fan X, Kong L, Wang J, et al (2024)

Surface-programmed microbiome assembly in phycosphere to microplastics contamination.

Water research, 262:122064 pii:S0043-1354(24)00964-3 [Epub ahead of print].

Recalcitrance in microplastics accounts for ubiquitous white pollution. Of special interest are the capabilities of microorganisms to accelerate their degradation sustainably. Compared to the well-studied pure cultures in degrading natural polymers, the algal-bacterial symbiotic system is considered as a promising candidate for microplastics removal, cascading bottom-up impacts on ecosystem-scale processes. This study selected and enriched the algae-associated microbial communities hosted by the indigenous isolation Desmodesmus sp. in wastewater treatment plants with micro-polyvinyl chloride, polyethylene terephthalate, polyethylene, and polystyrene contamination. Results elaborated that multiple settled and specific affiliates were recruited by the uniform algae protagonist from the biosphere under manifold microplastic stress. Alteration of distinct chemical functionalities and deformation of polymers provide direct evidence of degradation in phycosphere under illumination. Microplastic-induced phycosphere-derived DOM created spatial gradients of aromatic protein, fulvic and humic acid-like and tryptophan components to expanded niche-width. Surface thermodynamic analysis was conducted to simulate the reciprocal and reversible interaction on algal-bacterial and phycosphere-microplastic interface, revealing the enhancement of transition to stable and irreversible aggregation for functional microbiota colonization and microplastics capture. Furthermore, pangenomic analysis disclosed the genes related to the chemotaxis and the proposed microplastics biodegradation pathway in enriched algal-bacterial microbiome, orchestrating the evidence for common synthetic polymer particles and ultimately to confirm the effectiveness and potential. The present study emphasizes the necessity for future endeavors aimed at fully leveraging the potential of algal-bacterial mutualistic systems within sustainable bioremediation strategies targeting the eradication of microplastic waste.

RevDate: 2024-07-19

Hatmaker EA, Barber AE, Drott MT, et al (2024)

Pathogenicity is associated with population structure in a fungal pathogen of humans.

bioRxiv : the preprint server for biology pii:2024.07.05.602241.

Aspergillus flavus is a clinically and agriculturally important saprotrophic fungus responsible for severe human infections and extensive crop losses. We analyzed genomic data from 250 (95 clinical and 155 environmental) A. flavus isolates from 9 countries, including 70 newly sequenced clinical isolates, to examine population and pan-genome structure and their relationship to pathogenicity. We identified five A. flavus populations, including a new population, D, corresponding to distinct clades in the genome-wide phylogeny. Strikingly, >75% of clinical isolates were from population D. Accessory genes, including genes within biosynthetic gene clusters, were significantly more common in some populations but rare in others. Population D was enriched for genes associated with zinc ion binding, lipid metabolism, and certain types of hydrolase activity. In contrast to the major human pathogen Aspergillus fumigatus, A. flavus pathogenicity in humans is strongly associated with population structure, making it a great system for investigating how population-specific genes contribute to pathogenicity.

RevDate: 2024-07-17
CmpDate: 2024-07-17

Kusza S, Badaoui B, G Wanjala (2024)

Insights into the genomic homogeneity of Moroccan indigenous sheep breeds though the lens of runs of homozygosity.

Scientific reports, 14(1):16515.

Numerous studies have indicated that Morocco's indigenous sheep breeds are genetically homogenous, posing a risk to their survival in the harsh climatic conditions they predominantly inhabit. To understand the genetic basis of this homogeneity through the lens of runs of homozygosity (ROH), we analyzed the whole genome sequences of five indigenous sheep breeds (Beni Guil, Ouled Djellal, D'man, Sardi, Timahdite and Admixed). The results from principal component, admixture, Fst, and neighbour joining tree analyses consistently showed a homogenous genetic structure. This structure was characterized by an average length of 1.83 Mb for ROH segments, with a limited number of long ROH segments (24-48 Mb and > 48 Mb). The most common ROH segments were those ranging from 1-6 Mb. The most significant regions of homozygosity (ROH Islands) were mostly observed in two chromosomes, namely Chr1 and Chr5. Specifically, ROH Islands were exclusively discovered in the Ouled Djellal breed on Chr1, whereas Chr5 exhibited ROH Islands in all breeds. ROH Island analysis and the iHS technique were employed to detect signatures of selection on Chr1 and Chr5. The results indicate that Chr5 had a high level of homogeneity, with the same genes being discovered across all breeds. In contrast, Chr1 displays some genetic variances between breeds. Genes identified on Chr5 included SLC39A1, IL23A, CAST, IL5, IL13, and IL4, which are responsible for immune response, while genes identified on Chr1 include SOD1, SLAMF9, RTP4, CLDN1, and PRKAA2. The ROH segment profile and effective population size patterns suggest that the genetic uniformity of the studied breeds is the outcome of events that transpired between 250 and 300 generations ago. This research not only contributes to the understanding of ROH distribution across breeds but also helps design and implement native sheep breeding and conservation strategies in Morocco. Future research, incorporating a broader sample size and utilizing the pangenome for reference, is recommended to further elucidate these breeds' genomic landscapes and adaptive mechanisms.

RevDate: 2024-07-17
CmpDate: 2024-07-17

Machado E, Vasconcellos S, Gomes L, et al (2024)

Phylogenomic and genomic analysis reveals unique and shared genetic signatures of Mycobacterium kansasii complex species.

Microbial genomics, 10(7):.

Species belonging to the Mycobacterium kansasii complex (MKC) are frequently isolated from humans and the environment and can cause serious diseases. The most common MKC infections are caused by the species M. kansasii (sensu stricto), leading to tuberculosis-like disease. However, a broad spectrum of virulence, antimicrobial resistance and pathogenicity of these non-tuberculous mycobacteria (NTM) are observed across the MKC. Many genomic aspects of the MKC that relate to these broad phenotypes are not well elucidated. Here, we performed genomic analyses from a collection of 665 MKC strains, isolated from environmental, animal and human sources. We inferred the MKC pangenome, mobilome, resistome, virulome and defence systems and show that the MKC species harbours unique and shared genomic signatures. High frequency of presence of prophages and different types of defence systems were observed. We found that the M. kansasii species splits into four lineages, of which three are lowly represented and mainly in Brazil, while one lineage is dominant and globally spread. Moreover, we show that four sub-lineages of this most distributed M. kansasii lineage emerged during the twentieth century. Further analysis of the M. kansasii genomes revealed almost 300 regions of difference contributing to genomic diversity, as well as fixed mutations that may explain the M. kansasii's increased virulence and drug resistance.

RevDate: 2024-07-16

Prigozhin DM, Sutherland CA, Rangavajjhala S, et al (2024)

Majority of the highly variable NLRs in maize share genomic location and contain additional target-binding domains.

Molecular plant-microbe interactions : MPMI [Epub ahead of print].

Nucleotide-binding, Leucine Rich Repeat proteins (NLRs) are a major class of immune receptors in plants. NLRs include both conserved and rapidly evolving members; however, their evolutionary trajectory in crops remains understudied. Availability of crop pan-genomes enables analysis of the recent events in the evolution of this highly complex gene family within domesticated species. Here, we investigated the NLR complement of 26 nested association mapping (NAM) founder lines of maize. We found that maize has just four main subfamilies containing rapidly evolving highly variable NLR (hvNLR) receptors. Curiously, three of these phylogenetically distinct hvNLR lineages are located in adjacent clusters on chromosome 10. Members of the same hvNLR clade show variable expression and methylation across lines and tissues, consistent with their rapid evolution. By combining sequence diversity analysis and AlphaFold2 computational structure prediction we predicted ligand binding sites in the hvNLRs. We also observed novel insertion domains in the LRR regions of two hvNLR subfamilies that likely contribute to target recognition. To make this analysis accessible, we created NLRCladeFinder, a Google Colaboratory notebook, that accepts any newly identified NLR sequence, places it in the evolutionary context of the maize pan-NLRome, and provides an updated clade alignment, phylogenetic tree, and sequence diversity information for the gene of interest.

RevDate: 2024-07-16

Chandra G, Gibney D, C Jain (2024)

Haplotype-aware sequence alignment to pangenome graphs.

Genome research pii:gr.279143.124 [Epub ahead of print].

Modern pangenome graphs are built using haplotype-resolved genome assemblies. When mapping reads to a pangenome graph, prioritizing alignments that are consistent with the known haplotypes improves genotyping accuracy. However, the existing rigorous formulations for co-linear chaining and alignment problems do not consider the haplotype paths in a pangenome graph. This often leads to spurious read alignments to those paths that are unlikely recombinations of the known haplotypes. In this paper, we develop novel formulations and algorithms for sequence-to-graph alignment and chaining problems. Inspired by the genotype imputation models, we assume that a query sequence is an imperfect mosaic of reference haplotypes. Accordingly, we introduce a recombination penalty in the scoring functions for each haplotype switch. First, we solve haplotype-aware sequence-to-graph alignment in O(|Q||E||H|) time, where Q is the query sequence, E is the set of edges, and H is the set of haplotypes represented in the graph. To complement our solution, we prove that an algorithm significantly faster than O(|Q||E||H|) is impossible under the Strong Exponential Time Hypothesis (SETH). Second, we propose a haplotype-aware chaining algorithm that runs in O(|H|N log|H|N) time after graph preprocessing, where N is the count of input anchors. We then establish that a chaining algorithm significantly faster than O(|H|N) is impossible under SETH. As a proof-of-concept, we implemented our chaining algorithm in the Minichain aligner. By aligning sequences sampled from the human major histocompatibility complex (MHC) to a pangenome graph of 60 MHC haplotypes, we demonstrate that our algorithm achieves better consistency with ground-truth recombinations when compared to a haplotype-agnostic algorithm.
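
The recombination penalty mentioned in the abstract above can be illustrated with a much simpler setting than the paper's. The following Python sketch is an assumption-laden toy, not the authors' graph algorithm or the Minichain implementation: it takes linear, pre-aligned haplotypes of the same length as the query (no graph, no indels), and finds the cheapest way to explain the query as a mosaic of haplotypes, charging one unit per mismatch and a fixed `switch_penalty` per haplotype switch. The function name and parameters are hypothetical.

```python
def mosaic_cost(query, haplotypes, switch_penalty):
    """Minimum of (mismatches + switch_penalty * haplotype switches).

    Simplifying assumptions: every haplotype is a string with the same
    length as the query, and position i of the query is compared only
    with position i of each haplotype.
    """
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    if any(len(h) != len(query) for h in haplotypes):
        raise ValueError("haplotypes must have the same length as the query")

    # cost[h] = best cost of explaining the prefix read so far,
    # ending on haplotype h.
    cost = [0] * len(haplotypes)
    for i, base in enumerate(query):
        best_prev = min(cost)  # cheapest haplotype to switch away from
        cost = [
            min(cost[h], best_prev + switch_penalty) + (hap[i] != base)
            for h, hap in enumerate(haplotypes)
        ]
    return min(cost)


# With a cheap switch the query is explained by one recombination;
# with an expensive switch it is cheaper to accept two mismatches.
cheap_switch = mosaic_cost("AACC", ["AAAA", "CCCC"], switch_penalty=1)
costly_switch = mosaic_cost("AACC", ["AAAA", "CCCC"], switch_penalty=5)
```

Taking the minimum over the previous column once per position keeps this toy at O(|Q||H|) time; the bounds quoted in the abstract concern the harder graph problems, where edges and anchors enter the running time.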

RevDate: 2024-07-16

Uzuner H, Paschen A, Schadendorf D, et al (2024)

Orthanq: transparent and uncertainty-aware haplotype quantification with application in HLA-typing.

BMC bioinformatics, 25(1):240.

BACKGROUND: Identification of human leukocyte antigen (HLA) types from DNA-sequenced human samples is important in organ transplantation and cancer immunotherapy and remains a challenging task considering sequence homology and extreme polymorphism of HLA genes.

RESULTS: We present Orthanq, a novel statistical model and corresponding application for transparent and uncertainty-aware quantification of haplotypes. We utilize our approach to perform HLA typing while, for the first time, reporting uncertainty of predictions and transparently observing mutations beyond reported HLA types. Using 99 gold standard samples from 1000 Genomes, Illumina Platinum Genomes and Genome In a Bottle projects, we show that Orthanq can provide overall superior accuracy and shorter runtimes than state-of-the-art HLA typers.

CONCLUSIONS: Orthanq is the first approach that can directly utilize existing pangenome alignments and type all HLA loci. Moreover, it can be generalized for usages beyond HLA typing, e.g. for virus lineage quantification. Orthanq is available under https://orthanq.github.io.

RevDate: 2024-07-16

Ceres K, Zehr JD, Murrell C, et al (2024)

Evolutionary genomic analyses of canine E. coli infections identify a relic capsular locus associated with resistance to multiple classes of antimicrobials.

Applied and environmental microbiology [Epub ahead of print].

UNLABELLED: Infections caused by antimicrobial-resistant Escherichia coli are the leading cause of death attributed to antimicrobial resistance (AMR) worldwide, and the known AMR mechanisms involve a range of functional proteins. Here, we employed a pan-genome wide association study (GWAS) approach on over 1,000 E. coli isolates from sick dogs collected across the US and Canada and identified a strong statistical association (empirical P < 0.01) of AMR, involving a range of antibiotics to a group 1 capsular (CPS) gene cluster. This cluster included genes under relaxed selection pressure, had several loci missing, and had pseudogenes for other key loci. Furthermore, this cluster is widespread in E. coli and Klebsiella clinical isolates across multiple host species. Earlier studies demonstrated that the octameric CPS polysaccharide export protein Wza can transmit macrolide antibiotics into the E. coli periplasm. We suggest that the CPS in question, and its highly divergent Wza, functions as an antibiotic trap, preventing antimicrobial penetration. We also highlight the high diversity of lineages circulating in dogs across all regions studied, the overlap with human lineages, and regional prevalence of resistance to multiple antimicrobial classes.

IMPORTANCE: Much of the human genomic epidemiology data available for E. coli mechanism discovery studies has been heavily biased toward shiga-toxin producing strains from humans and livestock. E. coli occupies many niches and produces a wide variety of other significant pathotypes, including some implicated in chronic disease. We hypothesized that since dogs tend to share similar strains with their owners and are treated with similar antibiotics, their pathogenic isolates will harbor unexplored AMR mechanisms of importance to humans as well as animals. By comparing over 1,000 genomes with in vitro antimicrobial susceptibility data from sick dogs across the US and Canada, we identified a strong multidrug resistance association with an operon that appears to have once conferred a type 1 capsule production system.

RevDate: 2024-07-16

Jespersen MG, Hayes AJ, Tong SYC, et al (2024)

Pangenome evaluation of gene essentiality in Streptococcus pyogenes.

Microbiology spectrum [Epub ahead of print].

Bacterial species often consist of strains with variable gene content, collectively referred to as the pangenome. Variations in the genetic makeup of strains can alter bacterial physiology and fitness. To define biologically relevant genes of a genome, genome-wide transposon mutant libraries have been used to identify genes essential for survival or virulence in a given strain. Such phenotypic studies have been conducted in four different genotypes of the human pathogen Streptococcus pyogenes, yet challenges exist in comparing results across studies conducted in different genetic backgrounds and conditions. To advance genotype to phenotype inferences across different S. pyogenes strains, we built a pangenome database of 249 S. pyogenes reference genomes. We systematically re-analyzed publicly available transposon sequencing datasets from S. pyogenes using a transposon sequencing-specific analysis pipeline, Transit. Across four genetic backgrounds and nine phenotypic conditions, 355 genes were essential for survival, corresponding to ~24% of the core genome. Clusters of Orthologous Genes (COG) categories related to coenzyme and lipid transport and growth functions were overrepresented as essential. Finally, essential operons across S. pyogenes genotypes were defined, with an increased number of essential operons detected under in vivo conditions. This study provides an extendible database to which new studies can be added, and a searchable html-based resource to direct future investigations into S. pyogenes biology.

IMPORTANCE: Streptococcus pyogenes is a human-adapted pathogen occupying restricted ecological niches. Understanding the essentiality of genes across different strains and experimental conditions is important to direct research questions and efforts to prevent the large burden of disease caused by S. pyogenes. To this end we systematically reanalyzed transposon sequencing studies in S. pyogenes using transposon sequencing-specific methods, integrating them into an extendible meta-analysis framework. This provides a repository of gene essentiality in S. pyogenes which was used to highlight specific genes of interest and for the community to guide future phenotypic studies.

RevDate: 2024-07-16

Brejová B, Gagie T, Herencsárová E, et al (2024)

Maximum-scoring path sets on pangenome graphs of constant treewidth.

Frontiers in bioinformatics, 4:1391086.

We generalize a problem of finding maximum-scoring segment sets, previously studied by Csűrös (IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2004, 1, 139-150), from sequences to graphs. Namely, given a vertex-weighted graph G and a non-negative startup penalty c, we can find a set of vertex-disjoint paths in G with maximum total score when each path's score is its vertices' total weight minus c. We call this new problem maximum-scoring path sets (MSPS). We present an algorithm that has a linear-time complexity for graphs with a constant treewidth. Generalization from sequences to graphs allows the algorithm to be used on pangenome graphs representing several related genomes and can be seen as a common abstraction for several biological problems on pangenomes, including searching for CpG islands, ChIP-seq data analysis, analysis of region enrichment for functional elements, or simple chaining problems.
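
For orientation, the sequence version of the problem that the abstract above generalizes can be solved with a short dynamic program. The Python sketch below is an illustrative assumption, not the authors' algorithm for graphs of constant treewidth: it treats the input as a plain list of weights (a path graph), and the function name is hypothetical. It returns the best total score over all sets of disjoint segments, where each segment scores the sum of its weights minus the startup penalty c, and the empty set scores 0.

```python
def max_scoring_segments(weights, c):
    """Best total score of disjoint segments, each paying penalty c.

    closed: best score so far when the current position is outside
            every chosen segment.
    opened: best score so far when the current position is the last
            element of a chosen segment.
    """
    if c < 0:
        raise ValueError("startup penalty c must be non-negative")
    closed, opened = 0, float("-inf")
    for w in weights:
        # Either extend the running segment or start a new one (pay c).
        new_opened = max(opened, closed - c) + w
        # Either stay outside, or end the segment at the previous position.
        new_closed = max(closed, opened)
        closed, opened = new_closed, new_opened
    return max(closed, opened)


# Whether the dip at -1 is worth bridging depends on the penalty:
# with c = 1 one or two segments both score 2; with c = 0.5 two
# separate segments score 1.5 + 1.5 = 3.
one_or_two = max_scoring_segments([2, -1, 2], 1)
two_segments = max_scoring_segments([2, -1, 2], 0.5)
```

The loop touches each weight once, so the sketch runs in linear time on a sequence, which is the property the paper reports preserving for graphs of constant treewidth.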

RevDate: 2024-07-16

Montecillo JAV (2024)

Genomics of the Thermophilic Bacterium Thermosulfidibacter takaii Reveals Novel Lineage of Deep-Branching Bacterial Phylum.

Indian journal of microbiology, 64(2):762-772.

The thermophilic bacterium Thermosulfidibacter takaii is affiliated with the deep-branching bacterial lineage in the phylum Aquificota. However, the recent taxonomic study of the phylum Aquificota revealed that T. takaii has no specific association with the phylum. Because T. takaii is considered an important model organism for studying the evolution and kinetics of ancestral carbon metabolism pathways, its proper classification is of significant interest. In this work, phylogenomics and comparative genomic analyses were employed to ascertain the taxonomic placement of T. takaii. Results from the phylogenetic analyses based on 16S rRNA gene and core genome sequences confirmed the exclusion of T. takaii from the phylum Aquificota and further revealed a phylum-level lineage for T. takaii. The analysis of conserved signature indels (CSIs) specific for the phylum Aquificota also supported the exclusion of T. takaii from the phylum. Pan-genome analysis of T. takaii along with the members of the closely related clade from the phylum Thermodesulfobacteriota revealed that T. takaii was indeed distinct, supporting its phylum-level placement. Furthermore, the presence of CSIs specific to T. takaii, and the results from the average nucleotide identity and average amino acid identity analyses, together with the unique characteristics of T. takaii, also provided evidence supporting its assignment to a novel phylum. Based on these results, T. takaii is proposed to be transferred to a novel family, Thermosulfidibacteraceae fam. nov., of a novel order, Thermosulfidibacterales ord. nov., and a novel class, Thermosulfidibacteria classis nov., within a novel phylum Thermosulfidibacterota phyl. nov.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s12088-024-01214-9.

RevDate: 2024-07-16

Rodenburg SYA, de Ridder D, Govers F, et al (2024)

Oomycete Metabolism Is Highly Dynamic and Reflects Lifestyle Adaptations.

Molecular plant-microbe interactions : MPMI [Epub ahead of print].

The selective pressure of pathogen-host symbiosis drives adaptations. How these interactions shape the metabolism of pathogens is largely unknown. Here, we use comparative genomics to systematically analyze the metabolic networks of oomycetes, a diverse group of eukaryotes that includes saprotrophs as well as animal and plant pathogens, with the latter causing devastating diseases with significant economic and/or ecological impacts. In our analyses of 44 oomycete species, we uncover considerable variation in metabolism that can be linked to lifestyle differences. Comparisons of metabolic gene content reveal that plant pathogenic oomycetes have a bipartite metabolism consisting of a conserved core and an accessory set. The accessory set can be associated with the degradation of defense compounds produced by plants when challenged by pathogens. Obligate biotrophic oomycetes have smaller metabolic networks, and taxonomically distantly related biotrophic lineages display convergent evolution by repeated gene losses in both the conserved as well as the accessory set of metabolisms. When investigating to what extent the metabolic networks in obligate biotrophs differ from those in hemibiotrophic plant pathogens, we observe that the losses of metabolic enzymes in obligate biotrophs are not random and that gene losses predominantly influence the terminal branches of the metabolic networks. Our analyses represent the first metabolism-focused comparison of oomycetes at this scale and will contribute to a better understanding of the evolution of oomycete metabolism in relation to lifestyle adaptation.

Numerous oomycete species are devastating plant pathogens that cause major damage in crops and natural ecosystems. Their interactions with hosts are shaped by strong selection, but how selection affects adaptation of the primary metabolism to a pathogenic lifestyle is not yet well established. By pan-genome and metabolic network analyses of distantly related oomycete pathogens and their nonpathogenic relatives, we reveal considerable lifestyle- and lineage-specific adaptations. This study contributes to a better understanding of metabolic adaptations in pathogenic oomycetes in relation to lifestyle, host, and environment, and the findings will help in pinpointing potential targets for disease control. Copyright © 2024 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.

RevDate: 2024-07-15

Chang T, Gavelis GS, Brown JM, et al (2024)

Genomic representativeness and chimerism in large collections of SAGs and MAGs of marine prokaryoplankton.

Microbiome, 12(1):126.

BACKGROUND: Single amplified genomes (SAGs) and metagenome-assembled genomes (MAGs) are the predominant sources of information about the coding potential of uncultured microbial lineages, but their strengths and limitations remain poorly understood. Here, we performed a direct comparison of two previously published collections of thousands of SAGs and MAGs obtained from the same, global environment.

RESULTS: We found that SAGs were less prone to chimerism and more accurately reflected the relative abundance and the pangenome content of microbial lineages inhabiting the epipelagic of the tropical and subtropical ocean, as compared to MAGs. SAGs were also better suited to link genome information with taxa discovered through 16S rRNA amplicon analyses. Meanwhile, MAGs had the advantage of more readily recovering genomes of rare lineages.

CONCLUSIONS: Our analyses revealed the relative strengths and weaknesses of the two most commonly used genome recovery approaches in environmental microbiology. These considerations, as well as the need for better tools for genome quality assessment, should be taken into account when designing studies and interpreting data that involve SAGs or MAGs.

RevDate: 2024-07-15
CmpDate: 2024-07-15

Bosi E, Taviani E, Avesani A, et al (2024)

Pan-Genome Provides Insights into Vibrio Evolution and Adaptation to Deep-Sea Hydrothermal Vents.

Genome biology and evolution, 16(7):.

This study delves into the genomic features of 10 Vibrio strains collected from deep-sea hydrothermal vents in the Pacific Ocean, providing insights into their evolutionary history and ecological adaptations. Through sequencing and pan-genome analysis involving 141 Vibrio species, we found that deep-sea strains exhibit larger genomes with unique gene distributions, suggesting adaptation to the vent environment. The phylogenomic reconstruction of the investigated isolates revealed the presence of 2 main clades: The first is monophyletic, consisting exclusively of Vibrio alginolyticus, while the second forms a monophyletic clade comprising both Vibrio antiquarius and Vibrio diabolicus species, which were previously isolated from deep-sea vents. All strains carry virulence and antibiotic resistance genes related to those found in human pathogenic Vibrio species which may play a wider ecological role other than host infection in these environments. In addition, functional genomic analysis identified genes potentially related to deep-sea survival and stress response, alongside candidate genes encoding for novel antimicrobial agents. Ultimately, the pan-genome we generated represents a valuable resource for future studies investigating the taxonomy, evolution, and ecology of Vibrio species.

RevDate: 2024-07-14
CmpDate: 2024-07-14

Seru LV, Forde TL, Roberto-Charron A, et al (2024)

Genomic characterization and virulence gene profiling of Erysipelothrix rhusiopathiae isolated from widespread muskox mortalities in the Canadian Arctic Archipelago.

BMC genomics, 25(1):691.

BACKGROUND: Muskoxen are important ecosystem components and provide food, economic opportunities, and cultural well-being for Indigenous communities in the Canadian Arctic. Between 2010 and 2021, Erysipelothrix rhusiopathiae was isolated from carcasses of muskoxen, caribou, a seal, and an Arctic fox during multiple large-scale mortality events in the Canadian Arctic Archipelago. A single strain ('Arctic clone') of E. rhusiopathiae was associated with the mortalities on Banks, Victoria and Prince Patrick Islands, Northwest Territories and Nunavut, Canada (2010-2017). The objectives of this study were to (i) characterize the genomes of E. rhusiopathiae isolates obtained from more recent muskox mortalities in the Canadian Arctic in 2019 and 2021; (ii) identify and compare common virulence traits associated with the core genome and mobile genetic elements (i.e. pathogenicity islands and prophages) among Arctic clone versus other E. rhusiopathiae genomes; and (iii) use pan-genome wide association studies (GWAS) to determine unique genetic contents of the Arctic clone that may encode virulence traits and that could be used for diagnostic purposes.

RESULTS: Phylogenetic analyses revealed that the newly sequenced E. rhusiopathiae isolates from Ellesmere Island, Nunavut (2021) also belong to the Arctic clone. Of 17 virulence genes analysed among 28 Arctic clone isolates, four genes - adhesin, rhusiopathiae surface protein-A (rspA), choline binding protein-B (cbpB) and CDP-glycerol glycerophosphotransferase (tagF) - had amino acid sequence variants unique to this clone when compared to 31 other E. rhusiopathiae genomes. These genes encode proteins that facilitate E. rhusiopathiae to attach to the host endothelial cells and form biofilms. GWAS analyses using Scoary found several unique genes to be overrepresented in the Arctic clone.

CONCLUSIONS: The Arctic clone of E. rhusiopathiae was associated with multiple muskox mortalities spanning over a decade and multiple Arctic islands with distances over 1000 km, highlighting the extent of its spatiotemporal spread. This clone possesses unique gene content, as well as amino acid variants in multiple virulence genes that are distinct from the other closely related E. rhusiopathiae isolates. This study establishes an essential foundation on which to investigate whether these differences are correlated with the apparent virulence of this specific clone through in vitro and in vivo studies.

RevDate: 2024-07-14

Biswas R, Swetha RG, Basu S, et al (2024)

Designing multi-epitope vaccine against human cytomegalovirus integrating pan-genome and reverse vaccinology pipelines.

Biologicals : journal of the International Association of Biological Standardization, 87:101782 pii:S1045-1056(24)00039-3 [Epub ahead of print].

Human cytomegalovirus (HCMV) is accountable for high morbidity in neonates and immunosuppressed individuals. Due to the high genetic variability of HCMV, current prophylactic measures are insufficient. In this study, we employed a pan-genome and reverse vaccinology approach to screen the target for efficient vaccine candidates. Four proteins, envelope glycoprotein M, UL41A, US23, and US28, were shortlisted based on cellular localization, high solubility, antigenicity, and immunogenicity. A total of 29 B-cell and 44 T-cell highly immunogenic and antigenic epitopes with high global population coverage were finalized using immunoinformatics tools and algorithms. Further, the epitopes that were overlapping among the finalized B-cell and T-cell epitopes were linked with suitable linkers to form various combinations of multi-epitopic vaccine constructs. Among 16 vaccine constructs, Vc12 was selected based on physicochemical and structural properties. The docking and molecular simulations of Vc12 were performed, which showed its high binding affinity (-23.35 kcal/mol) towards TLR4 due to intermolecular hydrogen bonds, salt bridges, and hydrophobic interactions, and there were only minimal fluctuations. Furthermore, Vc12 eliciting a good response was checked for its expression in Escherichia coli through in silico cloning and codon optimization, suggesting it to be a potent vaccine candidate.

RevDate: 2024-07-13
CmpDate: 2024-07-13

Egor G, Artem K, Maksim B, et al (2024)

Enhancing SNV identification in whole-genome sequencing data through the incorporation of known genetic variants into the minimap2 index.

BMC bioinformatics, 25(1):238.

MOTIVATION: Alignment of reads to a reference genome sequence is one of the key steps in the analysis of human whole-genome sequencing data obtained through Next-generation sequencing (NGS) technologies. The quality of the subsequent steps of the analysis, such as the results of clinical interpretation of genetic variants or the results of a genome-wide association study, depends on the correct identification of the position of the read as a result of its alignment. The amount of human NGS whole-genome sequencing data is constantly growing. There are a number of human genome sequencing projects worldwide that have resulted in the creation of large-scale databases of genetic variants of sequenced human genomes. Such information about known genetic variants can be used to improve the quality of alignment at the read alignment stage when analysing sequencing data obtained for a new individual, for example, by creating a genomic graph. While existing methods for aligning reads to a linear reference genome have high alignment speed, methods for aligning reads to a genomic graph have greater accuracy in variable regions of the genome. The development of a read alignment method that takes into account known genetic variants in the linear reference sequence index allows combining the advantages of both sets of methods.

RESULTS: In this paper, we present the minimap2_index_modifier tool, which enables the construction of a modified index of a reference genome using known single nucleotide variants and insertions/deletions (indels) specific to a given human population. The use of the modified minimap2 index improves variant calling quality without modifying the bioinformatics pipeline and without significant additional computational overhead. Using the PrecisionFDA Truth Challenge V2 benchmark data (for HG002 short-read data aligned to the GRCh38 linear reference (GCA_000001405.15) with parameters k = 27 and w = 14) it was demonstrated that the number of false negative genetic variants decreased by more than 9500, and the number of false positives decreased by more than 7000 when modifying the index with genetic variants from the Human Pangenome Reference Consortium.

RevDate: 2024-07-13
CmpDate: 2024-07-13

Yang X, Luo S, Yang S, et al (2024)

Chromosome-level genome assembly of Hippophae rhamnoides variety.

Scientific data, 11(1):776.

Fructus hippophae (Hippophae rhamnoides spp. mongolica×Hippophae rhamnoides sinensis), a hybrid variety of sea buckthorn in which Hippophae rhamnoides spp. mongolica serves as the female parent and Hippophae rhamnoides sinensis serves as the male parent, is a traditional plant with great economic and medical potential. Herein, we obtained a chromosome-level genome of Fructus hippophae of about 918.59 Mb, with a scaffold N50 of 83.65 Mb. Then, we anchored 440 contigs, comprising 97.17% of the total genome sequence, onto 12 pseudochromosomes. Next, de novo, homology, and transcriptome assembly strategies were adopted for gene structure prediction. This predicted 36475 protein-coding genes, of which 36226 genes could be functionally annotated. Various strategies were also used for quality assessment; both the complete BUSCO value (98.80%) and the mapping rate indicated high assembly quality. Repetitive elements, which occupied 63.68% of the genome, and 1483600 bp of non-coding RNA were annotated. Here, we provide genomic information on female plants of a popular variety, which can provide data for pan-genomic construction of sea buckthorn and for the resolution of the mechanism of sex differentiation.
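The scaffold N50 quoted in this abstract is a standard assembly contiguity statistic: the length L such that sequences of length L or longer together cover at least half of the total assembly. A minimal sketch of that calculation follows; the function name and the toy lengths are assumptions for illustration and are unrelated to the sea buckthorn assembly itself.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running sum to the total to stay in integer arithmetic.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Toy lengths (arbitrary units), not real assembly data.
    print(n50([80, 70, 50, 40, 30, 20]))
```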

RevDate: 2024-07-13
CmpDate: 2024-07-13

Gao Z, Lu Y, Chong Y, et al (2024)

Beef Cattle Genome Project: Advances in Genome Sequencing, Assembly, and Functional Genes Discovery.

International journal of molecular sciences, 25(13): pii:ijms25137147.

Beef is a major global source of protein, playing an essential role in the human diet. The worldwide production and consumption of beef continue to rise, reflecting a significant trend. However, despite the critical importance of beef cattle resources in agriculture, the diversity of cattle breeds faces severe challenges, with many breeds at risk of extinction. The initiation of the Beef Cattle Genome Project is crucial. By constructing a high-precision functional annotation map of their genome, it becomes possible to analyze the genetic mechanisms underlying important traits in beef cattle, laying a solid foundation for breeding more efficient and productive cattle breeds. This review details advances in genome sequencing and assembly technologies, iterative upgrades of the beef cattle reference genome, and its application in pan-genome research. Additionally, it summarizes relevant studies on the discovery of functional genes associated with key traits in beef cattle, such as growth, meat quality, reproduction, polled traits, disease resistance, and environmental adaptability. Finally, the review explores the potential of telomere-to-telomere (T2T) genome assembly, structural variations (SVs), and multi-omics techniques in future beef cattle genetic breeding. These advancements collectively offer promising avenues for enhancing beef cattle breeding and improving genetic traits.

RevDate: 2024-07-13

Huang A, Feng S, Ye Z, et al (2024)

Genome Assembly and Structural Variation Analysis of Luffa acutangula Provide Insights on Flowering Time and Ridge Development.

Plants (Basel, Switzerland), 13(13): pii:plants13131828.

Luffa spp. is an important vegetable and medicinal plant from the Cucurbitaceae family that is cultivated worldwide. In this study, we report a high-quality chromosome-level genome of the high-generation inbred line SG261 of Luffa acutangula. The genomic sequence was determined by PacBio long reads, Hi-C sequencing reads, and 10× Genomics sequencing, with an assembly size of 739.82 Mb, contig N50 of 18.38 Mb, and scaffold N50 of 56.08 Mb. The genome of L. acutangula SG261 was predicted to contain 27,312 protein-coding genes and 72.56% repetitive sequences, of which long terminal repeats (LTRs) were an important form of repetitive sequences, accounting for 67.84% of the genome. Phylogenetic analysis reveals that L. acutangula evolved later than Luffa cylindrica, and Luffa is closely related to Momordica charantia. Comparing the genome of L. acutangula SG261 and L. cylindrica with PacBio data, 67,128 high-quality structural variations (SVs) and 55,978 presence-absence variations (PAVs) were identified in SG261, resulting in 2424 and 1094 genes with variation in the CDS region, respectively, and there are 287 identical genes affected by two different structural variation analyses. In addition, we found that the transcription factor FY (FLOWERING LOCUS Y) families had a large expansion in L. acutangula SG261 (flowering in the morning) compared to L. cylindrica (flowering in the afternoon), which may result in the early flowering time in L. acutangula SG261. This study provides valuable reference for the breeding of and pan-genome research into Luffa species.

RevDate: 2024-07-12
CmpDate: 2024-07-12

Miga KH (2024)

From complete genomes to pangenomes.

American journal of human genetics, 111(7):1265-1268.

Highlighting the Distinguished Speakers Symposium on "The Future of Human Genetics and Genomics," this collection of articles is based on presentations at the ASHG 2023 Annual Meeting in Washington, DC, in celebration of all our field has accomplished in the past 75 years, since the founding of ASHG in 1948.

RevDate: 2024-07-12

Barcia-Cruz R, Balboa S, Lema A, et al (2024)

Comparative genomics of Vibrio toranzoniae strains.

International microbiology : the official journal of the Spanish Society for Microbiology [Epub ahead of print].

Vibrio toranzoniae is a marine bacterium belonging to the Splendidus clade that was originally isolated from healthy clams in Galicia (NW Spain). Its isolation from different hosts and seawater indicated two lifestyles and wide geographical distribution. The aim of the present study was to determine the differences at the genomic level among six strains (4 isolated from clam and 2 from seawater) and to determine their phylogeny. For this purpose, whole genomes of the six strains were sequenced by different technologies including Illumina and PacBio, and the resulting sequences were corrected. Genomes were annotated and compared using different online tools. Furthermore, the core- and pan-genomes were examined, and the phylogeny was inferred. The content of the core genome ranged from 2953 to 2766 genes and that of the pangenome ranged from 6278 to 6132, depending on the tool used. Although the strains shared certain homology, with DDH values ranging from 77.10 to 82.30 and OrthoANI values higher than 97%, some differences were found related to motility, capsule synthesis, iron acquisition systems or mobile genetic elements. Phylogenetic analysis of the core genome did not reveal a differentiation of the strains according to their lifestyle (commensal or free-living), but that of the pangenome indicated certain geographical isolation in the same growing area. This study led to the reclassification of some isolates formerly described as V. toranzoniae and demonstrated the importance of curated deposited sequences to proper phylogenetic assignment.
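Core- and pan-genome sizes such as those reported above come down to set operations once genes have been grouped into families: the core genome is the intersection of the strains' gene families and the pangenome is their union. A minimal sketch follows, assuming the ortholog clustering has already been done by some external tool; the strain and gene-family names are hypothetical and are not taken from the study.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    gene_content: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split gene families into (core, accessory, pan) sets.

    gene_content maps each strain name to the set of gene families it carries.
    """
    if not gene_content:
        raise ValueError("at least one strain is required")
    family_sets = list(gene_content.values())
    pan = set().union(*family_sets)            # present in at least one strain
    core = set.intersection(*family_sets)      # present in every strain
    accessory = pan - core                     # present in some, but not all
    return core, accessory, pan


if __name__ == "__main__":
    # Hypothetical strains and gene families, for illustration only.
    strains = {
        "strain_A": {"famA", "famB", "famC", "famD"},
        "strain_B": {"famA", "famB", "famC", "famE"},
        "strain_C": {"famA", "famB", "famF"},
    }
    core, accessory, pan = partition_pangenome(strains)
    print(len(core), len(accessory), len(pan))
```

Because the family definitions depend on the clustering tool and its thresholds, different tools can yield different core and pangenome counts for the same strains, which is consistent with the ranges reported in the abstract.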

RevDate: 2024-07-12
CmpDate: 2024-07-12

Li XY, Fang XM, Jia HT, et al (2024)

Noviherbaspirillum album sp. nov., an airborne bacteria isolated from an urban area of Beijing, China.

International journal of systematic and evolutionary microbiology, 74(7):.

A Gram-negative, ellipsoidal to short-rod-shaped, motile bacterium was isolated from Beijing's urban air. The isolate exhibited the closest kinship with Noviherbaspirillum aerium 122213-3[T], exhibiting 98.4 % 16S rRNA gene sequence similarity. Phylogenetic analyses based on 16S rRNA gene sequences and genomes showed that it clustered closely with N. aerium 122213-3[T], thus forming a distinct phylogenetic lineage within the genus Noviherbaspirillum. The average nucleotide identity and digital DNA-DNA hybridization values between strain I16B-00201[T] and N. aerium 122213-3[T] were 84.6 and 29.4 %, respectively. The respiratory ubiquinone was ubiquinone 8. The major fatty acids (>10 %) were summed feature 3 (C16:1ω6c/C16:1ω7c, 43.3 %), summed feature 8 (C18:1ω7c/C18:1ω6c, 15.9 %) and C12:0 (11.0 %). The polyamine profile showed putrescine as the predominant compound. The polar lipid profile consisted of diphosphatidylglycerol, phosphatidylglycerol, phosphatidylethanolamine, phosphatidylcholine, unknown lipids and unknown phosphatidylaminolipids. The phenotypic, phylogenetic and chemotaxonomic results consistently supported that strain I16B-00201[T] represented a novel species of the genus Noviherbaspirillum, for which the name Noviherbaspirillum album sp. nov. is proposed, with I16B-00201[T] (=CPCC 100848[T]=KCTC 52095[T]) designated as the type strain. Its DNA G+C content is 59.4 mol%. Pan-genome analysis indicated that some Noviherbaspirillum species possess diverse nitrogen and aromatic compound metabolism pathways, suggesting their potential value in pollutant treatment.

RevDate: 2024-07-12

Schüler MA, Riedel T, Overmann J, et al (2024)

Comparative genome analyses of clinical and non-clinical Clostridioides difficile strains.

Frontiers in microbiology, 15:1404491.

The pathogenic bacterium Clostridioides difficile is a worldwide health burden with increasing morbidity, mortality and antibiotic resistance. Therefore, extensive research efforts are made to unravel its virulence and dissemination. One crucial aspect for C. difficile is its mobilome, which for instance allows the spread of antibiotic resistance genes (ARGs) or influences strain virulence. As C. difficile is a nosocomial pathogen, the majority of strains analyzed originated from clinical environments and infected individuals. Nevertheless, C. difficile can also be present in human intestines without disease development or occur in diverse environmental habitats such as puddle water and soil, from which several strains have already been isolated. We therefore performed comprehensive genome comparisons of closely related clinical and non-clinical strains to identify the effects of the clinical background. Analyses included the prediction of virulence factors, ARGs, mobile genetic elements (MGEs), and detailed examinations of the pan genome. Clinical-related trends were thereby observed. While no significant differences were identified in fundamental C. difficile virulence factors, the clinical strains carried more ARGs and MGEs, and possessed a larger accessory genome. Detailed inspection of accessory genes revealed higher abundance of genes with unknown function, transcription-associated, or recombination-related activity. Accessory genes of these functions were already highlighted in other studies in association with higher strain virulence. This specific trend might allow the strains to react more efficiently to changing environmental conditions in the human host such as emerging stress factors, and potentially increase strain survival, colonization, and strain virulence. These findings indicated an adaptation of the strains to the clinical environment. Further, implementation of the analysis results in pairwise genome comparisons revealed that the majority of these accessory genes were encoded on predicted MGEs, shedding further light on the mobile genome of C. difficile. We therefore encourage the inclusion of non-clinical strains in comparative analyses.

ESP Quick Facts

ESP Origins

In the early 1990s, Robert Robbins was a faculty member at Johns Hopkins, where he directed the informatics core of GDB, the human gene-mapping database of the international human genome project. To share papers with colleagues around the world, he set up a small paper-sharing section on his personal web page. This small project evolved into The Electronic Scholarly Publishing Project.

ESP Support

In 1995, Robbins became the VP/IT of the Fred Hutchinson Cancer Research Center in Seattle, WA. Soon after arriving in Seattle, Robbins secured funding, through the ELSI component of the US Human Genome Project, to create the original ESP.ORG web site, with the formal goal of providing free, world-wide access to the literature of classical genetics.

ESP Rationale

Although the methods of molecular biology can seem almost magical to the uninitiated, the original techniques of classical genetics are readily appreciated by one and all: cross individuals that differ in some inherited trait, collect all of the progeny, score their attributes, and propose mechanisms to explain the patterns of inheritance observed.

ESP Goal

In reading the early works of classical genetics, one is drawn, almost inexorably, into ever more complex models, until molecular explanations begin to seem both necessary and natural. At that point, the tools for understanding genome research are at hand. Helping readers reach this point was the original goal of The Electronic Scholarly Publishing Project.

ESP Usage

Usage of the site grew rapidly and has remained high. Faculty began to use the site for their assigned readings. Other on-line publishers, ranging from The New York Times to Nature referenced ESP materials in their own publications. Nobel laureates (e.g., Joshua Lederberg) regularly used the site and even wrote to suggest changes and improvements.

ESP Content

When the site began, no journals were making their early content available in digital format. As a result, ESP was obliged to digitize classic literature before it could be made available. For many important papers — such as Mendel's original paper or the first genetic map — ESP had to produce entirely new typeset versions of the works, if they were to be available in a high-quality format.

ESP Help

Early support from the DOE component of the Human Genome Project was critically important for getting the ESP project on a firm foundation. Since that funding ended (nearly 20 years ago), the project has been operated as a purely volunteer effort. Anyone wishing to assist in these efforts should send an email to Robbins.

ESP Plans

With the development of methods for adding typeset side notes to PDF files, the ESP project now plans to add annotated versions of some classical papers to its holdings. We also plan to add new reference and pedagogical material. We have already started providing regularly updated, comprehensive bibliographies to the ESP.ORG site.

Electronic Scholarly Publishing
961 Red Tail Lane
Bellingham, WA 98226

E-mail: RJR8222 @ gmail.com

Papers in Classical Genetics

The ESP began as an effort to share a handful of key papers from the early days of classical genetics. Now the collection has grown to include hundreds of papers, in full-text format.

Digital Books

Along with papers on classical genetics, ESP offers a collection of full-text digital books, including many works by Darwin and even a collection of poetry — Chicago Poems by Carl Sandburg.

Timelines

ESP now offers a large collection of user-selected side-by-side timelines (e.g., all science vs. all other categories, or arts and culture vs. world history), designed to provide a comparative context for appreciating world events.

Biographies

Biographical information about many key scientists (e.g., Walter Sutton).

Selected Bibliographies

Bibliographies on several topics of potential interest to the ESP community are automatically maintained and generated on the ESP site.

ESP Picks from Around the Web (updated 28 JUL 2024)