The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.
ESP: PubMed Auto Bibliography. Created: 30 Dec 2025 at 06:58
Pangenome
Although the enforced stability of genomic content is ubiquitous among MCEs, the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier — entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all members of the "species". Together these make up the species' pan-genome.
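The three definitions above reduce to set operations on per-strain gene content. The following is a minimal sketch, assuming each strain is represented as a set of gene-family identifiers; the strain names, gene names, and the function `partition_pangenome` are hypothetical and chosen only for illustration.

```python
def partition_pangenome(genomes: dict[str, set[str]]) -> dict[str, set[str]]:
    """Split gene content into pan-, core-, and flex-genome sets.

    `genomes` maps a strain name to the set of gene families it carries.
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)        # genes found in any strain
    core = set.intersection(*gene_sets)  # genes found in every strain
    flex = pan - core                    # genes found in some, but not all
    return {"pan": pan, "core": core, "flex": flex}


# Hypothetical toy data: three strains of one "species".
strains = {
    "strain_A": {"gene1", "gene2", "gene3"},
    "strain_B": {"gene1", "gene2", "gene4"},
    "strain_C": {"gene1", "gene3", "gene5"},
}

result = partition_pangenome(strains)
for name in ("pan", "core", "flex"):
    print(name, sorted(result[name]))
```

In this toy case only `gene1` is shared by all strains, so it alone forms the core-genome. Real analyses first have to cluster sequences into gene families, which this sketch takes as given.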
Created with PubMed® Query: ( pangenome[TIAB] OR "pan-genome"[TIAB] OR "pan genome"[TIAB] ) NOT pmcbook NOT ispreviousversion
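The query above could be submitted programmatically through the NCBI E-utilities `esearch` endpoint. The sketch below only builds the request URL and makes no network call. Whether this page was generated this way is not stated, so the use of E-utilities, the `retmax` value, and the helper name `build_esearch_url` are assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Public NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The query string exactly as shown on this page.
QUERY = (
    '( pangenome[TIAB] OR "pan-genome"[TIAB] OR "pan genome"[TIAB] ) '
    "NOT pmcbook NOT ispreviousversion"
)


def build_esearch_url(term: str, retmax: int = 20) -> str:
    """Return an esearch URL for a PubMed query (hypothetical helper)."""
    params = {
        "db": "pubmed",     # search the PubMed database
        "term": term,       # the query, URL-encoded by urlencode
        "retmax": retmax,   # assumed page size, adjust as needed
        "retmode": "json",  # ask for JSON instead of the default XML
    }
    return f"{ESEARCH}?{urlencode(params)}"


url = build_esearch_url(QUERY)
# Round-trip check: decoding the URL recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["term"][0]
print(url)
print(decoded == QUERY)
```

Fetching the URL would return a list of PMIDs matching the query. NCBI asks automated clients to respect its rate limits, so a real script should throttle requests.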
Citations The Papers (from PubMed®)
RevDate: 2025-12-30
Comparative genomics of Brucella species reveals key determinants of secondary metabolism, antimicrobial resistance, and virulence.
Scientific reports pii:10.1038/s41598-025-33824-8 [Epub ahead of print].
The genomes of 40 Brucella strains were retrieved from the NCBI database to investigate brucellosis at the genomic level, focusing on secondary metabolites, resistance genes, and virulence factors. Genome analysis software, secondary metabolite mining tools, and relevant gene databases were employed for detailed analysis. The genome sizes of these strains range from 4.88 to 6.00 Mb, with G+C content between 53.5 and 60.5%. Phylogenetic analysis classified the strains into three distinct clades: Brucella anthropi CCUG 34461, Brucella sp. NBRC 13694, and Brucella anthropi MAG47. Pan-genome analysis revealed 21,800 gene families, 198 core genes, and 10,371 unique genes, indicating an open pan-genome. The secondary metabolite mining software identified 18 categories and 350 gene clusters, predicting a total of 298 secondary metabolites, primarily arylpolyene, acyl-amino acids, betalactone, terpene, hydrogen cyanide, and NAGGN. Genome sequences were uploaded in FASTA format to the CARD resistance gene database, identifying seven resistance genes: rpsE, rpsL, rosA, golS, fabG, fabI, and uL3. B. anthropi SBA01 and B. media Q1108 were found to harbor the highest number of drug resistance genes. Likewise, the sequences were compared to the VFDB virulence gene database, revealing eight virulence genes: lpxC, acpXL, fliY, bspJ, lpxA, fliI, fliQ, and bvrR. The B. cytisi IPA7.2 strain exhibited the highest number of virulence genes, with lpxC and acpXL potentially being unique to Brucella compared to other species. This study provides comprehensive genomic data, elucidating the relationship between the pan-genome, core genome, and genome size, while predicting the types of secondary metabolites, resistance, and virulence genes. These findings provide a basis for comprehensively understanding Brucella and lay a solid foundation for its prevention and treatment.
Additional Links: PMID-41466046
@article {pmid41466046,
year = {2025},
author = {Liu, X and Zhang, R and Sun, M and Qiao, J and Liang, M},
title = {Comparative genomics of Brucella species reveals key determinants of secondary metabolism, antimicrobial resistance, and virulence.},
journal = {Scientific reports},
volume = {},
number = {},
pages = {},
doi = {10.1038/s41598-025-33824-8},
pmid = {41466046},
issn = {2045-2322},
abstract = {The genomes of 40 Brucella strains were retrieved from the NCBI database to investigate brucellosis at the genomic level, focusing on secondary metabolites, resistance genes, and virulence factors. Genome analysis software, secondary metabolite mining tools, and relevant gene databases were employed for detailed analysis. The genome sizes of these strains range from 4.88 to 6.00 Mb, with G+C content between 53.5 and 60.5%. Phylogenetic analysis classified the strains into three distinct clades: Brucella anthropi CCUG 34461, Brucella sp. NBRC 13694, and Brucella anthropi MAG47. Pan-genome analysis revealed 21,800 gene families, 198 core genes, and 10,371 unique genes, indicating an open pan-genome. The secondary metabolite mining software identified 18 categories and 350 gene clusters, predicting a total of 298 secondary metabolites, primarily arylpolyene, acyl-amino acids, betalactone, terpene, hydrogen cyanide, and NAGGN. Genome sequences were uploaded in FASTA format to the CARD resistance gene database, identifying seven resistance genes: rpsE, rpsL, rosA, golS, fabG, fabI, and uL3. B. anthropi SBA01 and B. media Q1108 were found to harbor the highest number of drug resistance genes. Likewise, the sequences were compared to the VFDB virulence gene database, revealing eight virulence genes: lpxC, acpXL, fliY, bspJ, lpxA, fliI, fliQ, and bvrR. The B. cytisi IPA7.2 strain exhibited the highest number of virulence genes, with lpxC and acpXL potentially being unique to Brucella compared to other species. This study provides comprehensive genomic data, elucidating the relationship between the pan-genome, core genome, and genome size, while predicting the types of secondary metabolites, resistance, and virulence genes. These findings provide a basis for comprehensively understanding Brucella and lay a solid foundation for its prevention and treatment.},
}
RevDate: 2025-12-30
CmpDate: 2025-12-30
Phenotypic Characterization and Genomic Mining of Uric Acid Catabolism Genes in Lactiplantibacillus plantarum YC.
Foods (Basel, Switzerland), 14(24): pii:foods14244343.
This study presents the phenotypic characterization and genomic mining of uric acid catabolism genes in Lactiplantibacillus plantarum YC, a novel food-grade lactic acid bacterium isolated from traditional fermented vegetables with potent uric acid-lowering activity. YC is non-hemolytic, catalase- and gelatinase-negative, exhibits strong adhesion and broad antibacterial activity, and degrades 29.22% of uric acid in vitro, along with complete (100%) degradation of inosine and guanosine. Whole-genome sequencing revealed a 3,214,448 bp chromosome encoding 3026 protein-coding genes. Comparative genomics-based functional annotation highlighted abundant CAZy-related genes and antimicrobial factors, including lysozyme and monooxygenase. Crucially, genomic mining identified a complete uric acid degradation gene cluster, comprising pucK (uric acid permease), hpxO (uric acid hydroxylase), eight copies of hiuH (5-hydroxyisourate hydrolase), allB (allantoinase), and purine nucleoside transport/metabolism genes (rihA, rihB, rihC, pbuG). This work provides the first comparative genomic insight into the genetic architecture and distribution of uric acid metabolism in L. plantarum, elucidating YC's dual urate-lowering mechanism and delivering key molecular markers for developing enzyme-based functional foods and microbial therapeutics against hyperuricemia.
Additional Links: PMID-41465049
@article {pmid41465049,
year = {2025},
author = {Zhao, Y and Yang, S and He, M and Chai, P and Sun, Z and Zhu, Q and Li, Z and Guo, Q and Liu, H},
title = {Phenotypic Characterization and Genomic Mining of Uric Acid Catabolism Genes in Lactiplantibacillus plantarum YC.},
journal = {Foods (Basel, Switzerland)},
volume = {14},
number = {24},
pages = {},
doi = {10.3390/foods14244343},
pmid = {41465049},
issn = {2304-8158},
support = {32272270//National Natural Science Foundation of China/ ; 32402076//National Natural Science Foundation of China/ ; 25YDTPJC00750//Tianjin Science and Technology Plan Project/ ; 25JCQNJC00840//Tianjin Science and Technology Plan Project/ ; },
abstract = {This study presents the phenotypic characterization and genomic mining of uric acid catabolism genes in Lactiplantibacillus plantarum YC, a novel food-grade lactic acid bacterium isolated from traditional fermented vegetables with potent uric acid-lowering activity. YC is non-hemolytic, catalase- and gelatinase-negative, exhibits strong adhesion and broad antibacterial activity, and degrades 29.22% of uric acid in vitro, along with complete (100%) degradation of inosine and guanosine. Whole-genome sequencing revealed a 3,214,448 bp chromosome encoding 3026 protein-coding genes. Comparative genomics-based functional annotation highlighted abundant CAZy-related genes and antimicrobial factors, including lysozyme and monooxygenase. Crucially, genomic mining identified a complete uric acid degradation gene cluster, comprising pucK (uric acid permease), hpxO (uric acid hydroxylase), eight copies of hiuH (5-hydroxyisourate hydrolase), allB (allantoinase), and purine nucleoside transport/metabolism genes (rihA, rihB, rihC, pbuG). This work provides the first comparative genomic insight into the genetic architecture and distribution of uric acid metabolism in L. plantarum, elucidating YC's dual urate-lowering mechanism and delivering key molecular markers for developing enzyme-based functional foods and microbial therapeutics against hyperuricemia.},
}
RevDate: 2025-12-29
Complete genomes of Rickettsia typhi reveal a clonal population.
PLoS neglected tropical diseases, 19(12):e0013828 pii:PNTD-D-25-02004 [Epub ahead of print].
Murine typhus, caused by infection with Rickettsia typhi, is a neglected disease contributing to infectious disease burden in south- and southeast Asia. Despite its importance, we have minimal knowledge of the genomics of R. typhi, with only four complete genomes being sequenced prior to this work. We sequenced a further 25 genomes including historical strains collected before 1976 from both human and rat hosts, and recent genomes isolated from patients at a single hospital in Laos. Whole genome SNP analysis reveals extremely low levels of genetic diversity across the 29 genomes, with overall nucleotide diversity (π) of 1.27e-05 and evidence of purifying selection, and a minimal pan-genome. Phylogenetic analysis shows clustering of the genome by historic or modern origin, with the exception of one modern strain which is most closely related to historic strains from Thailand, and no clustering by host origin. The highly conserved genome of R. typhi suggests strong constraints on genome evolution in this obligate intracellular parasite, and has implications for the design of future murine typhus diagnostic tools and vaccines.
Additional Links: PMID-41460881
@article {pmid41460881,
year = {2025},
author = {Keeratipusana, C and Phuklia, W and Phommadeechack, V and Thaipadungpanit, J and Chansamouth, V and Phommasone, K and Rattanavong, S and Moore, CE and Robinson, MT and Richards, AL and Newton, PN and Batty, EM},
title = {Complete genomes of Rickettsia typhi reveal a clonal population.},
journal = {PLoS neglected tropical diseases},
volume = {19},
number = {12},
pages = {e0013828},
doi = {10.1371/journal.pntd.0013828},
pmid = {41460881},
issn = {1935-2735},
abstract = {Murine typhus, caused by infection with Rickettsia typhi, is a neglected disease contributing to infectious disease burden in south- and southeast Asia. Despite its importance, we have minimal knowledge of the genomics of R. typhi, with only four complete genomes being sequenced prior to this work. We sequenced a further 25 genomes including historical strains collected before 1976 from both human and rat hosts, and recent genomes isolated from patients at a single hospital in Laos. Whole genome SNP analysis reveals extremely low levels of genetic diversity across the 29 genomes, with overall nucleotide diversity (π) of 1.27e-05 and evidence of purifying selection, and a minimal pan-genome. Phylogenetic analysis shows clustering of the genome by historic or modern origin, with the exception of one modern strain which is most closely related to historic strains from Thailand, and no clustering by host origin. The highly conserved genome of R. typhi suggests strong constraints on genome evolution in this obligate intracellular parasite, and has implications for the design of future murine typhus diagnostic tools and vaccines.},
}
RevDate: 2025-12-29
CmpDate: 2025-12-29
Genomic diversity and functional adaptation of Limosilactobacillus reuteri isolated from diverse ecological niches.
Frontiers in microbiology, 16:1732127.
Limosilactobacillus reuteri is a widely utilized probiotic; however, the genomic diversity and evolutionary mechanisms underlying its adaptation to various hosts and environments remain incompletely understood. This study employed comparative genomics to analyze 176 L. reuteri genomes from animal (rodents, mammals, ruminants, and birds), human intestinal, and food sources (dairy products, fermented foods; 89 newly sequenced and 92 retrieved, 5 excluded by ANI < 95%). We assessed genomic features, average nucleotide identity, pan/core genomes, carbohydrate-active enzymes, bacteriocin production, CRISPR-Cas systems, and antibiotic resistance genes. The pan-genome consisted of 16,814 genes, while the core genome contained 553 genes. Core-gene phylogeny revealed seven clades, with rodent isolates positioned closer to the root. The clustering trend of fermented-food isolates in the phylogenetic tree may indicate that these strains have undergone convergent evolution or adaptive evolution in a specific environment. CAZymes varied across sources, and the predicted bacteriocin clusters were enriched in animal-derived isolates, particularly rodent isolates. CAZy functional composition in L. reuteri is shaped by the ecological niche and host environment, reflecting a pattern of host-driven evolutionary adaptation. CRISPR-Cas systems were present in 23.3% of genomes, predominantly in rodent isolates, indicating strong anti-phage capabilities. The heterogeneity of CRISPR-Cas systems among sources suggests that subpopulations of L. reuteri have been subjected to different evolutionary pressures. The predominance of Type II systems agrees with their widespread occurrence in lactobacilli. The presence of multiple probiotic function-related genes across all isolation sources confirms the robust probiotic potential of L. reuteri. Antibiotic resistance genes, including tet, ermB, and vatE, were most prevalent among animal-derived isolates, with the highest numbers occurring in mammals and the lowest in rodents. Therefore, strain-specific safety assessments are necessary prior to clinical or food applications. The findings underscore the significance of host-specific adaptations in shaping the genetic and functional profiles of L. reuteri, offering valuable implications for its food, human, animal, and therapeutic applications.
Additional Links: PMID-41459216
@article {pmid41459216,
year = {2025},
author = {Sun, Y and Zhao, Q and Li, W and Kwok, LY and Zhang, H},
title = {Genomic diversity and functional adaptation of Limosilactobacillus reuteri isolated from diverse ecological niches.},
journal = {Frontiers in microbiology},
volume = {16},
number = {},
pages = {1732127},
pmid = {41459216},
issn = {1664-302X},
abstract = {Limosilactobacillus reuteri is a widely utilized probiotic; however, the genomic diversity and evolutionary mechanisms underlying its adaptation to various hosts and environments remain incompletely understood. This study employed comparative genomics to analyze 176 L. reuteri genomes from animal (rodents, mammals, ruminants, and birds), human intestinal, and food sources (dairy products, fermented foods; 89 newly sequenced and 92 retrieved, 5 excluded by ANI < 95%). We assessed genomic features, average nucleotide identity, pan/core genomes, carbohydrate-active enzymes, bacteriocin production, CRISPR-Cas systems, and antibiotic resistance genes. The pan-genome consisted of 16,814 genes, while the core genome contained 553 genes. Core-gene phylogeny revealed seven clades, with rodent isolates positioned closer to the root. The clustering trend of fermented-food isolates in the phylogenetic tree may indicate that these strains have undergone convergent evolution or adaptive evolution in a specific environment. CAZymes varied across sources, and the predicted bacteriocin clusters were enriched in animal-derived isolates, particularly rodent isolates. CAZy functional composition in L. reuteri is shaped by the ecological niche and host environment, reflecting a pattern of host-driven evolutionary adaptation. CRISPR-Cas systems were present in 23.3% of genomes, predominantly in rodent isolates, indicating strong anti-phage capabilities. The heterogeneity of CRISPR-Cas systems among sources suggests that subpopulations of L. reuteri have been subjected to different evolutionary pressures. The predominance of Type II systems agrees with their widespread occurrence in lactobacilli. The presence of multiple probiotic function-related genes across all isolation sources confirms the robust probiotic potential of L. reuteri. Antibiotic resistance genes, including tet, ermB, and vatE, were most prevalent among animal-derived isolates, with the highest numbers occurring in mammals and the lowest in rodents. Therefore, strain-specific safety assessments are necessary prior to clinical or food applications. The findings underscore the significance of host-specific adaptations in shaping the genetic and functional profiles of L. reuteri, offering valuable implications for its food, human, animal, and therapeutic applications.},
}
RevDate: 2025-12-29
CmpDate: 2025-12-29
White paper: standards for handling and analyzing plant pan-genomes.
F1000Research, 14:739.
Plant pan-genomes, which aggregate genomic sequences and annotations from multiple individuals of a species, have emerged as transformative tools for understanding genetic diversity, adaptation, and evolutionary dynamics. Super-pan-genomes, extending across species boundaries, further enable comparative analyses of clades or genera, bridging breeding applications with evolutionary insights (Shang et al., 2022; Li et al., 2023a). However, the absence of standardized practices for data generation, analysis, and sharing hinders reproducibility and interoperability. This white paper presents a harmonized framework developed by the ELIXIR E-PAN consortium, addressing nomenclature, quality control (QC), data formats, visualization, and community practices. By adopting these guidelines, researchers can enhance FAIR (Findable, Accessible, Interoperable, Reusable) compliance, foster collaboration, and accelerate translational applications in crop improvement and evolutionary biology.
Additional Links: PMID-41458396
@article {pmid41458396,
year = {2025},
author = {Heuermann, MC and Barros, P and Beier, S and Gundlach, H and Alvarez-Jarreta, J and Hassani-Pak, K and König, P and Fiebig, A and Godec, T and Gruden, K and Nolte, N and Petek, M and Scholz, U and Zagorščak, M and Vandepoele, K and Van Bel, M},
title = {White paper: standards for handling and analyzing plant pan-genomes.},
journal = {F1000Research},
volume = {14},
number = {},
pages = {739},
doi = {10.12688/f1000research.166538.2},
pmid = {41458396},
issn = {2046-1402},
mesh = {*Genome, Plant ; *Genomics/standards ; Quality Control ; },
abstract = {Plant pan-genomes, which aggregate genomic sequences and annotations from multiple individuals of a species, have emerged as transformative tools for understanding genetic diversity, adaptation, and evolutionary dynamics. Super-pan-genomes, extending across species boundaries, further enable comparative analyses of clades or genera, bridging breeding applications with evolutionary insights (Shang et al., 2022; Li et al., 2023a). However, the absence of standardized practices for data generation, analysis, and sharing hinders reproducibility and interoperability. This white paper presents a harmonized framework developed by the ELIXIR E-PAN consortium, addressing nomenclature, quality control (QC), data formats, visualization, and community practices. By adopting these guidelines, researchers can enhance FAIR (Findable, Accessible, Interoperable, Reusable) compliance, foster collaboration, and accelerate translational applications in crop improvement and evolutionary biology.},
}
MeSH Terms:
*Genome, Plant
*Genomics/standards
Quality Control
RevDate: 2025-12-27
CmpDate: 2025-12-28
Genomic characterization and comparative analysis of multidrug-resistant Escherichia coli ST361 from urban wetland.
BMC microbiology, 25(1):800.
BACKGROUND: Wetlands are severely affected by human activities like untreated sewage, overuse of antibiotics and agricultural runoff, leading to the accumulation of antibiotic-resistant bacteria (ARB) and antibiotic resistance genes (ARGs). Migratory birds visiting these contaminated wetlands further spread the ARB and ARGs globally. Urban lakes, with high anthropogenic impact, are increasingly showing MDR E. coli, which are responsible for global infections and AMR gene dissemination. ST361 is an emerging global threat, showing an upward trend in prevalence and resistance to major antibiotic classes. The presence of this ST in aquatic environments suggests a robust transmission cycle between animals, humans, and the environment, often via agricultural runoff and contaminated wastewater. This study reports the first instance of ST361 from wetlands in India. The aim of this study is to identify the resistance and virulence genes present in this ST361 isolate and perform comparative genomic analysis of ST361 isolates from both clinical and environmental sources globally.
METHODS: A freshwater sample (KOTW5) was collected from a lake in Chennai during the monsoon season. The sample was streaked on selective agar media and maintained at 37 °C for 24 h. Antimicrobial susceptibility tests were carried out in accordance with CLSI guidelines, and the sample was confirmed as E. coli using MALDI-TOF. Whole genome sequencing (WGS) was performed on an Illumina NovaSeq 6000 instrument after DNA extraction and library preparation. After processing the sequencing reads, various software tools were employed for the identification of clonal subtype, serotype, antibiotic resistance genes, virulence genes and other genomic characteristics of the isolate. The phylogenetic tree was visualized using iTOL v6.
RESULTS: KOTW5 exhibited resistance to multiple antibiotics including ampicillin, tetracycline, and colistin. KOTW5 was identified as the emerging pandemic clone ST361, with a genome size of 4,481,772 bp and 50.8% GC content. It harbors resistance genes like pmr, TolC, acr, and sul2. The serotype is O9/H30. KOTW5 also possesses 50 virulence genes, including fim, esp and gsp genes. Pangenome analysis of KOTW5 and other ST361 isolates (clinical and environmental) indicated its close relationship with UEMH01, a clinical E. coli isolate from Tanzania, and DUDV01, a clinical E. coli isolate from Germany, suggesting its potential to cause human infections in the future. They were found to share the genes ampC, pmrF, acr, emr, evg, tolC, esp, fim, gsp, and ompA. In the Indian context, KOTW5 belongs to a clade containing the clinical isolates JAKMBW01 and DAZBLI01, and these isolates share the following genes: ampC, H-NS, pmrF, tolC, csg, ent, espL1, espX5, fim, gsp, and ompA.
CONCLUSION: The study highlights the presence of E. coli ST361, a multidrug-resistant, high-risk clone, in wetlands. Wetlands may act as "hotspots" for the persistence and spread of MDR E. coli, which is concerning as antibiotics are becoming less effective as last-resort treatments. The finding of ST361 in lakes underscores the "One Health" concept, emphasizing that antibiotic resistance impacts human, animal, and environmental health. To combat rising antimicrobial resistance (AMR) in aquatic environments, comprehensive strategies such as advanced wastewater treatment, stricter regulations on antibiotic use, public awareness campaigns, and routine environmental monitoring are crucial.
Additional Links: PMID-41455885
@article {pmid41455885,
year = {2025},
author = {Srijith, L and Sivakumar, V and Sivanandham, G and Sridharan, KS and Paul, A and Warrier, S and Perumal, K},
title = {Genomic characterization and comparative analysis of multidrug-resistant Escherichia coli ST361 from urban wetland.},
journal = {BMC microbiology},
volume = {25},
number = {1},
pages = {800},
pmid = {41455885},
issn = {1471-2180},
mesh = {*Escherichia coli/genetics/drug effects/isolation & purification/classification/pathogenicity ; *Drug Resistance, Multiple, Bacterial/genetics ; *Wetlands ; Anti-Bacterial Agents/pharmacology ; India ; *Genome, Bacterial ; Microbial Sensitivity Tests ; Humans ; Phylogeny ; Whole Genome Sequencing ; Lakes/microbiology ; Genomics ; Virulence Factors/genetics ; Escherichia coli Infections/microbiology ; },
abstract = {BACKGROUND: Wetlands are severely affected by human activities like untreated sewage, overuse of antibiotics and agricultural runoff, leading to the accumulation of antibiotic-resistant bacteria (ARB) and antibiotic resistance genes (ARGs). Migratory birds visiting these contaminated wetlands further spread the ARB and ARGs globally. Urban lakes, with high anthropogenic impact, are increasingly showing MDR E. coli, which are responsible for global infections and AMR gene dissemination. ST361 is an emerging global threat, showing an upward trend in prevalence and resistance to major antibiotic classes. The presence of this ST in aquatic environments suggests a robust transmission cycle between animals, humans, and the environment, often via agricultural runoff and contaminated wastewater. This study reports the first instance of ST361 from wetlands in India. The aim of this study is to identify the resistance and virulence genes present in this ST361 isolate and perform comparative genomic analysis of ST361 isolates from both clinical and environmental sources globally.
METHODS: A freshwater sample (KOTW5) was collected from a lake in Chennai during the monsoon season. The sample was streaked on selective agar media and maintained at 37 °C for 24 h. Antimicrobial susceptibility tests were carried out in accordance with CLSI guidelines, and the sample was confirmed as E. coli using MALDI-TOF. Whole genome sequencing (WGS) was performed on an Illumina NovaSeq 6000 instrument after DNA extraction and library preparation. After processing the sequencing reads, various software tools were employed for the identification of clonal subtype, serotype, antibiotic resistance genes, virulence genes and other genomic characteristics of the isolate. The phylogenetic tree was visualized using iTOL v6.
RESULTS: KOTW5 exhibited resistance to multiple antibiotics including ampicillin, tetracycline, and colistin. KOTW5 was identified as the emerging pandemic clone ST361, with a genome size of 4,481,772 bp and 50.8% GC content. It harbors resistance genes like pmr, TolC, acr, and sul2. The serotype is O9/H30. KOTW5 also possesses 50 virulence genes, including fim, esp and gsp genes. Pangenome analysis of KOTW5 and other ST361 isolates (clinical and environmental) indicated its close relationship with UEMH01, a clinical E. coli isolate from Tanzania, and DUDV01, a clinical E. coli isolate from Germany, suggesting its potential to cause human infections in the future. They were found to share the genes ampC, pmrF, acr, emr, evg, tolC, esp, fim, gsp, and ompA. In the Indian context, KOTW5 belongs to a clade containing the clinical isolates JAKMBW01 and DAZBLI01, and these isolates share the following genes: ampC, H-NS, pmrF, tolC, csg, ent, espL1, espX5, fim, gsp, and ompA.
CONCLUSION: The study highlights the presence of E. coli ST361, a multidrug-resistant, high-risk clone, in wetlands. Wetlands may act as "hotspots" for the persistence and spread of MDR E. coli, which is concerning as antibiotics are becoming less effective as last-resort treatments. The finding of ST361 in lakes underscores the "One Health" concept, emphasizing that antibiotic resistance impacts human, animal, and environmental health. To combat rising antimicrobial resistance (AMR) in aquatic environments, comprehensive strategies such as advanced wastewater treatment, stricter regulations on antibiotic use, public awareness campaigns, and routine environmental monitoring are crucial.},
}
MeSH Terms:
*Escherichia coli/genetics/drug effects/isolation & purification/classification/pathogenicity
*Drug Resistance, Multiple, Bacterial/genetics
*Wetlands
Anti-Bacterial Agents/pharmacology
India
*Genome, Bacterial
Microbial Sensitivity Tests
Humans
Phylogeny
Whole Genome Sequencing
Lakes/microbiology
Genomics
Virulence Factors/genetics
Escherichia coli Infections/microbiology
RevDate: 2025-12-27
Pangenomes: new tools for ecological and evolutionary genomics.
Trends in ecology & evolution pii:S0169-5347(25)00344-1 [Epub ahead of print].
Genomic structural variation is an important component of genetic variation in natural populations. By assembling and analyzing multiple high-quality genomes within a species or clade, pangenomes capture variation that can be missed by reference-based genomics at both the sequence and the genic levels. Although pangenomes are nascent tools for animals compared with other taxa, they have already unveiled novel insights into genome evolution, adaptation, the genomic basis of organismal traits, and conservation genomics. We highlight the rapid progress and unique ecological and evolutionary discoveries emerging from applying pangenome tools to diverse natural populations. We conclude that pangenomes are fundamentally shifting the field by revealing structural variants as a key source of adaptive potential and genomic diversity previously missed by single-reference methods.
Additional Links: PMID-41455665
@article {pmid41455665,
year = {2025},
author = {Fang, B and Edwards, SV},
title = {Pangenomes: new tools for ecological and evolutionary genomics.},
journal = {Trends in ecology & evolution},
volume = {},
number = {},
pages = {},
doi = {10.1016/j.tree.2025.11.010},
pmid = {41455665},
issn = {1872-8383},
abstract = {Genomic structural variation is an important component of genetic variation in natural populations. By assembling and analyzing multiple high-quality genomes within a species or clade, pangenomes capture variation that can be missed by reference-based genomics at both the sequence and the genic levels. Although pangenomes are nascent tools for animals compared with other taxa, they have already unveiled novel insights into genome evolution, adaptation, the genomic basis of organismal traits, and conservation genomics. We highlight the rapid progress and unique ecological and evolutionary discoveries emerging from applying pangenome tools to diverse natural populations. We conclude that pangenomes are fundamentally shifting the field by revealing structural variants as a key source of adaptive potential and genomic diversity previously missed by single-reference methods.},
}
RevDate: 2025-12-27
Streptococcus vaginalis affects cellular dynamics of cervical cancer cells via oxidative stress-induced activation of endoplasmic reticulum unfolded protein response.
Microbiological research, 305:128433 pii:S0944-5013(25)00392-1 [Epub ahead of print].
The vaginal microbiome plays an important role in the development of cervical cancer, highlighting the potential influence of specific members on disease susceptibility, progression, and suppression. In this study, we characterized a recently identified species of vaginal viridans group streptococci, Streptococcus vaginalis. By examining its prevalence, genomic features, and interactions with model cervical cancer cells, we aim to deepen the understanding of its biological significance and broader implications for vaginal health. Microbiome profiling detected S. vaginalis in 27% of a cohort of Korean women, where it was the second most abundant species of Streptococcus. Pan-genome analysis and comparative genomics of S. vaginalis strains revealed their reduced pathogenic potential. In vitro bioassays using cervical cancer cell models (HeLa, SiHa, and CaSki) demonstrated significant effects of S. vaginalis, influencing morphology, proliferation, migration, colony formation, and the induction of apoptosis. Mechanistic investigation identified the involvement of endoplasmic reticulum (ER) stress and the activation of the unfolded protein response (UPR). Hydrogen peroxide produced by S. vaginalis was found to induce oxidative stress, triggering ER stress-mediated cellular stress responses in cervical cancer cells. Our study revealed the influence of S. vaginalis on the dynamics of cervical cancer cells via oxidative stress-induced activation of the ER UPR pathway. These mechanistic insights emphasize a potential avenue for therapeutic interventions aimed at modulating oxidative and ER stress responses in cervical cancer treatment strategies. Overall, our findings provide new perspectives on the biological significance of S. vaginalis, expanding our understanding of its potential role beyond simple commensalism.
Additional Links: PMID-41455276
Citation:
@article {pmid41455276,
year = {2025},
author = {Montecillo, JAV and Yoo, HJ and Lee, YY and Park, CM and Cho, A and Lee, H and Kim, JM and Lee, NY and Park, SH and Park, NJ and Han, HS and Chong, GO and Seo, I},
title = {Streptococcus vaginalis affects cellular dynamics of cervical cancer cells via oxidative stress-induced activation of endoplasmic reticulum unfolded protein response.},
journal = {Microbiological research},
volume = {305},
number = {},
pages = {128433},
doi = {10.1016/j.micres.2025.128433},
pmid = {41455276},
issn = {1618-0623},
abstract = {The vaginal microbiome plays an important role in the development of cervical cancer, highlighting the potential influence of specific members on disease susceptibility, progression, and suppression. In this study, we characterized a recently identified species of vaginal viridans group streptococci, Streptococcus vaginalis. By examining its prevalence, genomic features, and interactions with model cervical cancer cells, we aim to deepen the understanding of its biological significance and broader implications for vaginal health. Microbiome profiling detected S. vaginalis in 27% of a cohort of Korean women, where it was the second most abundant species of Streptococcus. Pan-genome analysis and comparative genomics of S. vaginalis strains revealed their reduced pathogenic potential. In vitro bioassays using cervical cancer cell models (HeLa, SiHa, and CaSki) demonstrated significant effects of S. vaginalis, influencing morphology, proliferation, migration, colony formation, and the induction of apoptosis. Mechanistic investigation identified the involvement of endoplasmic reticulum (ER) stress and the activation of the unfolded protein response (UPR). Hydrogen peroxide produced by S. vaginalis was found to induce oxidative stress, triggering the ER stress-mediated cellular stress responses in cervical cancer cells. Our study revealed the influence of S. vaginalis on the dynamics of cervical cancer cells via oxidative stress-induced activation of the ER UPR pathway. These mechanistic insights emphasize a potential avenue for therapeutic interventions aimed at modulating oxidative and ER stress responses in cervical cancer treatment strategies. Overall, our findings provide new perspectives on the biological significance of S. vaginalis, expanding our understanding of its potential role beyond simple commensalism.},
}
RevDate: 2025-12-26
Reference-Guided Chromosome-by-Chromosome de novo Assembly at Scale Using Low-Coverage High-Fidelity Long-Reads with HiFiCCL.
Advanced science (Weinheim, Baden-Wurttemberg, Germany) [Epub ahead of print].
Population genomics using short-read resequencing captures single-nucleotide polymorphisms and small insertions and deletions but struggles with structural variants, leading to a loss of heritability in genome-wide association studies. In recent years, long-read sequencing has improved pangenome construction for diverse eukaryotic species, including humans, crops, and other organisms of ecological and economic importance, addressing this issue to some extent. Sufficient-coverage high-fidelity data for population genomics is often prohibitively expensive, limiting its use in large-scale populations and broader eukaryotic species and creating an urgent need for robust low-coverage assemblies. However, current assemblers underperform in such conditions. To address this, HiFiCCL is proposed, the first assembly framework specifically designed for low-coverage high-fidelity reads, using a reference-guided, chromosome-by-chromosome assembly approach. This study demonstrates that HiFiCCL improves low-coverage assembly performance of existing assemblers and outperforms the state-of-the-art assemblers on human and plant datasets. Tested on 45 human datasets (∼5× coverage), HiFiCCL combined with hifiasm reduces the length of misassembled contigs relative to hifiasm by an average of 21.19% and up to 38.58%. These improved assemblies excel in detecting large germline structural variants, minimize inter-chromosome mis-scaffolding, and improve the detection of specific germline and tumor somatic structural variants based on the pangenome graph.
Additional Links: PMID-41449706
Citation:
@article {pmid41449706,
year = {2025},
author = {Jiang, Z and Pan, W and Gao, R and Hu, H and Gao, W and Zhou, M and Yin, YH and Qian, Z and Jin, S and Wang, G},
title = {Reference-Guided Chromosome-by-Chromosome de novo Assembly at Scale Using Low-Coverage High-Fidelity Long-Reads with HiFiCCL.},
journal = {Advanced science (Weinheim, Baden-Wurttemberg, Germany)},
volume = {},
number = {},
pages = {e15308},
doi = {10.1002/advs.202515308},
pmid = {41449706},
issn = {2198-3844},
support = {2022YFF1202101//National Key R&D Program of China/ ; 62225109//National Natural Science Foundation of China/ ; 2572024AW35//Fundamental Research Funds for the Central Universities/ ; },
abstract = {Population genomics using short-read resequencing captures single-nucleotide polymorphisms and small insertions and deletions but struggles with structural variants, leading to a loss of heritability in genome-wide association studies. In recent years, long-read sequencing has improved pangenome construction for diverse eukaryotic species, including humans, crops, and other organisms of ecological and economic importance, addressing this issue to some extent. Sufficient-coverage high-fidelity data for population genomics is often prohibitively expensive, limiting its use in large-scale populations and broader eukaryotic species and creating an urgent need for robust low-coverage assemblies. However, current assemblers underperform in such conditions. To address this, HiFiCCL is proposed, the first assembly framework specifically designed for low-coverage high-fidelity reads, using a reference-guided, chromosome-by-chromosome assembly approach. This study demonstrates that HiFiCCL improves low-coverage assembly performance of existing assemblers and outperforms the state-of-the-art assemblers on human and plant datasets. Tested on 45 human datasets (∼5× coverage), HiFiCCL combined with hifiasm reduces the length of misassembled contigs relative to hifiasm by an average of 21.19% and up to 38.58%. These improved assemblies excel in detecting large germline structural variants, minimize inter-chromosome mis-scaffolding, and improve the detection of specific germline and tumor somatic structural variants based on the pangenome graph.},
}
RevDate: 2025-12-25
CmpDate: 2025-12-25
Genomic characterization and comparative analysis of multidrug-resistant uropathogenic Escherichia coli strain U1 from Pakistan revealed ST131 as dominant clade.
Antonie van Leeuwenhoek, 119(1):22.
Uropathogenic Escherichia coli (UPEC) is a primary etiological agent of urinary tract infections (UTIs) worldwide. The emergence of multidrug-resistant (MDR) UPEC strains, especially the globally disseminated ST131 clone, poses a critical health threat in regions like Pakistan, where comprehensive genomic data is limited. This study performed an in-depth genomic characterization of a newly isolated MDR UPEC strain (U1) and conducted a comparative pangenome analysis of 73 UPEC genomes from Pakistan. The overall cohort exhibited an average genome size of 5.2 Mb, an average GC content of 50.6%, and an average of 5180 coding sequences. In silico genomic analysis identified U1 as a high-risk ST131 lineage member (O25:H4, phylogroup B2). The strain exhibited an MDR profile, supported by the prediction of key antibiotic resistance genes (ARGs), including bla[CTX-M-15] and dfrA17, as well as several putative virulence factors (VFs) and four plasmid replicon types. The comparative analysis revealed a highly diverse and open pangenome (3280 core and 10,977 unique genes). The U1 genome's total coding sequences (5273 genes) contribute ~ 30% share of the total pangenome gene families, indicating its status as a well-equipped strain with essential genes (core) and a substantial number of fitness and adaptability genes (accessory/unique). Core-genome phylogeny confirmed the prevalence of the ST131 lineage, with U1 clustering closely with other local isolates. Widespread VFs and ARGs highlight their critical role in UPEC adaptability. These findings demand urgent antimicrobial stewardship and enhanced genomic surveillance to control the spread of MDR UPEC, particularly the ST131 clone, in Pakistan.
Additional Links: PMID-41447408
Citation:
@article {pmid41447408,
year = {2025},
author = {Ishaq, Z and Khalil, K and Faiz, F and Haider, U and Nasir, S and Qayyum, H and Ali, A},
title = {Genomic characterization and comparative analysis of multidrug-resistant uropathogenic Escherichia coli strain U1 from Pakistan revealed ST131 as dominant clade.},
journal = {Antonie van Leeuwenhoek},
volume = {119},
number = {1},
pages = {22},
pmid = {41447408},
issn = {1572-9699},
mesh = {Pakistan/epidemiology ; *Uropathogenic Escherichia coli/genetics/drug effects/classification/isolation & purification ; *Drug Resistance, Multiple, Bacterial/genetics ; Humans ; *Genome, Bacterial ; *Escherichia coli Infections/microbiology/epidemiology ; Phylogeny ; Urinary Tract Infections/microbiology ; Genomics ; Virulence Factors/genetics ; Anti-Bacterial Agents/pharmacology ; },
abstract = {Uropathogenic Escherichia coli (UPEC) is a primary etiological agent of urinary tract infections (UTIs) worldwide. The emergence of multidrug-resistant (MDR) UPEC strains, especially the globally disseminated ST131 clone, poses a critical health threat in regions like Pakistan, where comprehensive genomic data is limited. This study performed an in-depth genomic characterization of a newly isolated MDR UPEC strain (U1) and conducted a comparative pangenome analysis of 73 UPEC genomes from Pakistan. The overall cohort exhibited an average genome size of 5.2 Mb, an average GC content of 50.6%, and an average of 5180 coding sequences. In silico genomic analysis identified U1 as a high-risk ST131 lineage member (O25:H4, phylogroup B2). The strain exhibited an MDR profile, supported by the prediction of key antibiotic resistance genes (ARGs), including bla[CTX-M-15] and dfrA17, as well as several putative virulence factors (VFs) and four plasmid replicon types. The comparative analysis revealed a highly diverse and open pangenome (3280 core and 10,977 unique genes). The U1 genome's total coding sequences (5273 genes) contribute ~ 30% share of the total pangenome gene families, indicating its status as a well-equipped strain with essential genes (core) and a substantial number of fitness and adaptability genes (accessory/unique). Core-genome phylogeny confirmed the prevalence of the ST131 lineage, with U1 clustering closely with other local isolates. Widespread VFs and ARGs highlight their critical role in UPEC adaptability. These findings demand urgent antimicrobial stewardship and enhanced genomic surveillance to control the spread of MDR UPEC, particularly the ST131 clone, in Pakistan.},
}
MeSH Terms:
Pakistan/epidemiology
*Uropathogenic Escherichia coli/genetics/drug effects/classification/isolation & purification
*Drug Resistance, Multiple, Bacterial/genetics
Humans
*Genome, Bacterial
*Escherichia coli Infections/microbiology/epidemiology
Phylogeny
Urinary Tract Infections/microbiology
Genomics
Virulence Factors/genetics
Anti-Bacterial Agents/pharmacology
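The core/unique split reported in the Escherichia coli abstract above can be illustrated with a minimal sketch. It assumes gene families have already been clustered and are available as a set of family IDs per genome. The function name `classify_gene_families`, the genome labels, and the family IDs are hypothetical toy values, not data from the study, and the strict "present in every genome" definition of core is one common convention that the abstract does not specify.

```python
from collections import Counter

def classify_gene_families(presence):
    """Split gene families into core, accessory, and unique sets.

    presence: dict mapping genome name -> set of gene-family IDs
    (hypothetical input format, assumed for this sketch).
    """
    n_genomes = len(presence)
    # Count in how many genomes each gene family occurs.
    counts = Counter(fam for families in presence.values() for fam in families)
    core = {fam for fam, c in counts.items() if c == n_genomes}
    unique = {fam for fam, c in counts.items() if c == 1 and fam not in core}
    accessory = set(counts) - core - unique
    return core, accessory, unique

# Toy presence/absence data (illustrative only).
genomes = {
    "U1": {"famA", "famB", "famC", "famD"},
    "iso2": {"famA", "famB", "famC"},
    "iso3": {"famA", "famB", "famE"},
}

core, accessory, unique = classify_gene_families(genomes)
pangenome_size = len(core | accessory | unique)
# Share of the pangenome's gene families carried by one genome.
share_u1 = len(genomes["U1"]) / pangenome_size
print(sorted(core), sorted(accessory), sorted(unique), share_u1)
```

The last ratio is the same kind of figure as the "~ 30% share" quoted for strain U1, computed here on toy data. Real pipelines usually relax the core definition (for example, presence in 95% or 99% of genomes) to tolerate assembly gaps.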
RevDate: 2025-12-27
CmpDate: 2025-12-25
Complexity welcome: Pangenome graphs for comprehensive population genomics.
Quantitative plant biology, 6:e43.
Pangenome graphs are revolutionising evolutionary and population genomics by moving beyond linear reference genomes to represent the full spectrum of sequence diversity within and across species. This review traces the field's progression from reference-augmented graphs to assembly-based, alignment-first approaches that capture complex structural variation with reduced bias. We examine key strategies for graph construction, genotyping and implementing graph-aware tools in functional genomics, including transcriptomics and epigenomics. While much of the work to date has focused on humans, diverse and structurally complex plant genomes pose unique challenges that require further methodological innovation. Key bottlenecks - including visualisation, scalability and integration with multi-omic data - persist. By outlining trade-offs among current tools and emphasising the need for rigorous evaluation frameworks, we argue that progress will depend on community-driven efforts to unify graph construction, genotyping and interpretation. Despite technical hurdles, pangenome graphs offer a powerful foundation for more inclusive evolutionary and population genomics.
Additional Links: PMID-41445923
Citation:
@article {pmid41445923,
year = {2025},
author = {Bao, Z and Weigel, D},
title = {Complexity welcome: Pangenome graphs for comprehensive population genomics.},
journal = {Quantitative plant biology},
volume = {6},
number = {},
pages = {e43},
pmid = {41445923},
issn = {2632-8828},
abstract = {Pangenome graphs are revolutionising evolutionary and population genomics by moving beyond linear reference genomes to represent the full spectrum of sequence diversity within and across species. This review traces the field's progression from reference-augmented graphs to assembly-based, alignment-first approaches that capture complex structural variation with reduced bias. We examine key strategies for graph construction, genotyping and implementing graph-aware tools in functional genomics, including transcriptomics and epigenomics. While much of the work to date has focused on humans, diverse and structurally complex plant genomes pose unique challenges that require further methodological innovation. Key bottlenecks - including visualisation, scalability and integration with multi-omic data - persist. By outlining trade-offs among current tools and emphasising the need for rigorous evaluation frameworks, we argue that progress will depend on community-driven efforts to unify graph construction, genotyping and interpretation. Despite technical hurdles, pangenome graphs offer a powerful foundation for more inclusive evolutionary and population genomics.},
}
RevDate: 2025-12-24
Comparative genomics reveals avian Acinetobacter baumannii as antimicrobial resistance reservoirs and bovine strains with elevated pathogenicity.
Scientific reports pii:10.1038/s41598-025-33117-0 [Epub ahead of print].
Multidrug-resistant Acinetobacter baumannii is primarily known as an opportunistic nosocomial pathogen but is increasingly recognized as an emerging contaminant in foods of animal origin. In the present study, we characterized the A. baumannii strains from raw meat samples and undertook a comparative genomic analysis to explore the resistome, virulome, mobilome (integrons and transposons) and pathogenic potential of the A. baumannii strains from food animals and animal origin foods. Avian strains of A. baumannii were found to carry a significantly higher number of antimicrobial resistance genes (P < 0.05) and also possess mobile genetic elements (transposons, integrons), while bovine strains were found to possess significantly higher pathogenic potential (P < 0.05) than A. baumannii strains from other sources. Pan-genome and pan-resistome analyses indicated an open nature for both across all sources. Phylogenetic analysis revealed spatial clustering of the isolates, and isolates sequenced in the present study clustered closely with Chinese isolates. Taken together, being the first comparative genomic study on A. baumannii strains from food animals and foods of animal origin, the present study provided insights into the genomic features of A. baumannii strains while highlighting the potential role of avian strains of A. baumannii in the dissemination of antimicrobial resistance through animal origin foods.
Additional Links: PMID-41444811
Citation:
@article {pmid41444811,
year = {2025},
author = {Bhargavi, D and Ghatak, S and Srinivas, K and Angappan, M and Pyngrope, DA and Lamare, JP and Das, S and Milton, AAP and Prasad, CB},
title = {Comparative genomics reveals avian Acinetobacter baumannii as antimicrobial resistance reservoirs and bovine strains with elevated pathogenicity.},
journal = {Scientific reports},
volume = {},
number = {},
pages = {},
doi = {10.1038/s41598-025-33117-0},
pmid = {41444811},
issn = {2045-2322},
abstract = {Multidrug-resistant Acinetobacter baumannii is primarily known as an opportunistic nosocomial pathogen but is increasingly recognized as an emerging contaminant in foods of animal origin. In the present study, we characterized the A. baumannii strains from raw meat samples and undertook a comparative genomic analysis to explore the resistome, virulome, mobilome (integrons and transposons) and pathogenic potential of the A. baumannii strains from food animals and animal origin foods. Avian strains of A. baumannii were found to carry a significantly higher number of antimicrobial resistance genes (P < 0.05) and also possess mobile genetic elements (transposons, integrons), while bovine strains were found to possess significantly higher pathogenic potential (P < 0.05) than A. baumannii strains from other sources. Pan-genome and pan-resistome analyses indicated an open nature for both across all sources. Phylogenetic analysis revealed spatial clustering of the isolates, and isolates sequenced in the present study clustered closely with Chinese isolates. Taken together, being the first comparative genomic study on A. baumannii strains from food animals and foods of animal origin, the present study provided insights into the genomic features of A. baumannii strains while highlighting the potential role of avian strains of A. baumannii in the dissemination of antimicrobial resistance through animal origin foods.},
}
RevDate: 2025-12-24
Pangenome Analysis Reveals Structural Variations Associated With Citric Acid Accumulation in Prunus mume.
Plant biotechnology journal [Epub ahead of print].
A pangenome can reveal a large number of variations, providing a more comprehensive view of a species' genetic diversity than a single reference genome can offer. Here, we assembled the haplotype telomere-to-telomere genome and 10 chromosome-level genomes, integrated with two previously reported genomes, and constructed a graph pangenome for P. mume with the 'PmNH_Hap1' T2T genome as the reference. Core gene families accounted for 64.47%, while non-essential gene families comprised 35.53%. The total length of the graph pangenome reaches 412.41 Mb, which is a significant increase of 179.60 Mb compared with the 'PmNH_Hap1' reference genome, and these unique insertion sequences contain 5918 genes. Additionally, by combining with five Prunus armeniaca genomes for pangenome variation detection, we identified a total of 51 461 non-redundant SVs, including 15 217 insertions, 21 683 deletions, 13 509 translocations and 1025 inversions. Among these, 60.50% of presence/absence variations were formed by transposons. Pangenome analysis revealed that a 376 bp SV insertion was universally present in the promoter of the PmPH4 gene in P. mume; PmAGL30 can bind to this SV to enhance PmPH4 gene expression and promote citric acid accumulation in P. mume fruits, leading to extremely significant differences in citric acid content compared with other stone fruits. Additionally, we developed SV molecular markers for the early screening of germplasm with high citric acid content in P. mume fruits. In summary, we constructed a high-quality graph pangenome that reveals abundant genetic variations, providing valuable insights for variety improvement and molecular breeding in P. mume.
Additional Links: PMID-41440171
Citation:
@article {pmid41440171,
year = {2025},
author = {Huang, X and Lin, X and Zhou, P and Tan, W and Gao, F and Ni, Z and Shi, T and Bai, Y and Ma, C and Ma, Y and Li, M and Hayat, F and Zhang, Q and Tao, R and Wan, J and Gao, Z},
title = {Pangenome Analysis Reveals Structural Variations Associated With Citric Acid Accumulation in Prunus mume.},
journal = {Plant biotechnology journal},
volume = {},
number = {},
pages = {},
doi = {10.1111/pbi.70518},
pmid = {41440171},
issn = {1467-7652},
support = {JBGS [2021] 019//the "JBGS" Project of Seed Industry Revitalization in Jiangsu Province/ ; KYZZ2025006//Fundamental Research Funds for the Central Universities/ ; 2023M731734//China Postdoctoral Science Foundation/ ; 2023ZB729//Jiangsu Excellence Postdoctoral Program/ ; //the Priority Academic Program Development of Jiangsu Higher Education Institutions PAPD/ ; 32502637//the National Science Foundation of China/ ; },
abstract = {A pangenome can reveal a large number of variations, providing a more comprehensive view of a species' genetic diversity than a single reference genome can offer. Here, we assembled the haplotype telomere-to-telomere genome and 10 chromosome-level genomes, integrated with two previously reported genomes, and constructed a graph pangenome for P. mume with the 'PmNH_Hap1' T2T genome as the reference. Core gene families accounted for 64.47%, while non-essential gene families comprised 35.53%. The total length of the graph pangenome reaches 412.41 Mb, which is a significant increase of 179.60 Mb compared with the 'PmNH_Hap1' reference genome, and these unique insertion sequences contain 5918 genes. Additionally, by combining with five Prunus armeniaca genomes for pangenome variation detection, we identified a total of 51 461 non-redundant SVs, including 15 217 insertions, 21 683 deletions, 13 509 translocations and 1025 inversions. Among these, 60.50% of presence/absence variations were formed by transposons. Pangenome analysis revealed that a 376 bp SV insertion was universally present in the promoter of the PmPH4 gene in P. mume; PmAGL30 can bind to this SV to enhance PmPH4 gene expression and promote citric acid accumulation in P. mume fruits, leading to extremely significant differences in citric acid content compared with other stone fruits. Additionally, we developed SV molecular markers for the early screening of germplasm with high citric acid content in P. mume fruits. In summary, we constructed a high-quality graph pangenome that reveals abundant genetic variations, providing valuable insights for variety improvement and molecular breeding in P. mume.},
}
RevDate: 2025-12-26
CmpDate: 2025-12-24
Genome-based reassessment of Sphingobacterium reveals the limitation of the fixed dDDH threshold for prokaryotic species delineation.
Current research in microbial sciences, 10:100524.
Accurate species delimitation in prokaryotes increasingly relies on genome-scale comparisons, yet fixed genomic thresholds can be unreliable in lineages shaped by extensive gene flux. In this study, we revisited the taxonomy of genus Sphingobacterium using phylogenomic reconstruction and comprehensive whole-genome comparisons. The genus displays a highly open pan-genome, with only 22 universally conserved genes and nearly 20,000 cloud genes, indicating pronounced genomic plasticity. Within this complex evolutionary context, we detected a rare non-transitive paradox in digital DNA-DNA hybridization (dDDH) within the Sphingobacterium siyangense group. All strains share average nucleotide identity (ANI) values above the accepted species boundary (95%), yet some strain pairs exhibit dDDH values below the species threshold (70%), resulting in a conflict restricted to this metric. Phylogenomic analyses, core genome variation, average amino acid identity (AAI) patterns, and functional gene profiles consistently support the monophyly and genomic cohesion of these strains, showing that dependence on dDDH alone may lead to ambiguous species boundaries. Based on the combined evidence, we treat all members of the S. siyangense cluster as a single species and propose additional taxonomic revisions. Sphingobacterium ginsenosidimutans is recognized as a heterotypic synonym of Sphingobacterium detergens. The species Rhinopithecimicrobium faecis is proposed for reclassification as Sphingobacterium faecis comb. nov. These findings demonstrate that rigid dDDH cutoffs cannot fully capture evolutionary relationships and highlight the value of integrating phylogenomic and pan-genomic evidence for resolving complex species level classifications in prokaryotes.
Additional Links: PMID-41439236
Citation:
@article {pmid41439236,
year = {2026},
author = {Li, S and Wang, XR and Zhang, W and Li, WJ},
title = {Genome-based reassessment of Sphingobacterium reveals the limitation of the fixed dDDH threshold for prokaryotic species delineation.},
journal = {Current research in microbial sciences},
volume = {10},
number = {},
pages = {100524},
pmid = {41439236},
issn = {2666-5174},
abstract = {Accurate species delimitation in prokaryotes increasingly relies on genome-scale comparisons, yet fixed genomic thresholds can be unreliable in lineages shaped by extensive gene flux. In this study, we revisited the taxonomy of genus Sphingobacterium using phylogenomic reconstruction and comprehensive whole-genome comparisons. The genus displays a highly open pan-genome, with only 22 universally conserved genes and nearly 20,000 cloud genes, indicating pronounced genomic plasticity. Within this complex evolutionary context, we detected a rare non-transitive paradox in digital DNA-DNA hybridization (dDDH) within the Sphingobacterium siyangense group. All strains share average nucleotide identity (ANI) values above the accepted species boundary (95%), yet some strain pairs exhibit dDDH values below the species threshold (70%), resulting in a conflict restricted to this metric. Phylogenomic analyses, core genome variation, average amino acid identity (AAI) patterns, and functional gene profiles consistently support the monophyly and genomic cohesion of these strains, showing that dependence on dDDH alone may lead to ambiguous species boundaries. Based on the combined evidence, we treat all members of the S. siyangense cluster as a single species and propose additional taxonomic revisions. Sphingobacterium ginsenosidimutans is recognized as a heterotypic synonym of Sphingobacterium detergens. The species Rhinopithecimicrobium faecis is proposed for reclassification as Sphingobacterium faecis comb. nov. These findings demonstrate that rigid dDDH cutoffs cannot fully capture evolutionary relationships and highlight the value of integrating phylogenomic and pan-genomic evidence for resolving complex species level classifications in prokaryotes.},
}
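The non-transitive conflict described in the Sphingobacterium abstract above can be made concrete with a small sketch. It assumes pairwise similarity values are stored per unordered strain pair and applies the two cutoffs the abstract cites (95% ANI, 70% dDDH). The strain labels and all similarity values are invented for illustration and are not taken from the study; `non_transitive_triples` is a hypothetical helper name.

```python
from itertools import combinations

ANI_THRESHOLD = 95.0   # species boundary cited in the abstract (percent)
DDDH_THRESHOLD = 70.0  # species threshold cited in the abstract (percent)

def non_transitive_triples(strains, pairwise, threshold):
    """Find triples (x, mid, y) where x~mid and mid~y pass the cutoff
    but x~y fails, i.e. 'same species' is not transitive."""
    def passes(a, b):
        return pairwise[frozenset((a, b))] >= threshold

    triples = []
    for a, b, c in combinations(strains, 3):
        # Try each of the three strains as the middle (bridging) strain.
        for x, mid, y in ((a, b, c), (b, a, c), (a, c, b)):
            if passes(x, mid) and passes(mid, y) and not passes(x, y):
                triples.append((x, mid, y))
    return triples

strains = ["S1", "S2", "S3"]  # hypothetical strain labels

# Invented values: every pair exceeds the ANI cutoff ...
ani = {
    frozenset(("S1", "S2")): 96.5,
    frozenset(("S2", "S3")): 96.2,
    frozenset(("S1", "S3")): 95.8,
}
# ... but one pair falls below the dDDH cutoff.
dddh = {
    frozenset(("S1", "S2")): 72.0,
    frozenset(("S2", "S3")): 71.0,
    frozenset(("S1", "S3")): 68.5,
}

ani_conflicts = non_transitive_triples(strains, ani, ANI_THRESHOLD)
dddh_conflicts = non_transitive_triples(strains, dddh, DDDH_THRESHOLD)
print("ANI conflicts:", ani_conflicts)
print("dDDH conflicts:", dddh_conflicts)
```

With these toy values the ANI cutoff groups all three strains consistently, while the dDDH cutoff links S1 to S2 and S2 to S3 but separates S1 from S3, which is the kind of metric-specific conflict a fixed threshold cannot resolve on its own.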
RevDate: 2025-12-23
MLDeCNV: A machine learning approach for predicting copy number variation types in plant genomes.
Computers in biology and medicine, 201:111394 pii:S0010-4825(25)01748-2 [Epub ahead of print].
Copy number variations (CNVs) play a crucial role in shaping genetic diversity and influencing various plant traits. However, existing methods for CNV characterization often face challenges due to the complexity and repetitive nature of plant genomes. Here, we present MLDeCNV (Machine Learning for Decoding Copy Number Variation), a novel open-source machine-learning-based tool optimized for predicting CNV types (deletions, duplications, and non-CNVs) in plant genomes. Built on the XGBoost model, MLDeCNV utilizes 32 selected CNV-related features derived from coverage metrics, nucleotide composition, and sequencing statistics. The model was trained on a high-confidence CNV dataset comprising experimentally validated and computationally predicted CNVs. It exhibits strong performance across various CNV size ranges and training set sizes, achieving an accuracy of 89.27%, with precision, recall, and F1-score all at 89.3%, and an Area Under Curve of 0.9783, underscoring its robustness and reliability. Extensive comparisons with traditional machine learning models reveal that XGBoost outperforms other methods, particularly in handling complex, nonlinear interactions within the CNV data. Additionally, while MLDeCNV does not perform de novo CNV detection, it evaluates CNV type classification from pre-identified genomic regions, making it a post-detection classification tool. This tool, accessible at http://46.202.167.198:5004/, can be integrated downstream of CNV detection pipelines, enhancing the accuracy of CNV type categorization. The precise classification of CNV types from pre-identified genomic regions will streamline downstream genomic analyses, facilitating enhanced understanding and utilization of genetic variation in plants.
Additional Links: PMID-41435499
Citation:
@article {pmid41435499,
year = {2025},
author = {Das, P and Saha, B and Sharma, NK and Iquebal, MA and Papanicolaou, A and Angadi, UB and Kumar, D and Jaiswal, S},
title = {MLDeCNV: A machine learning approach for predicting copy number variation types in plant genomes.},
journal = {Computers in biology and medicine},
volume = {201},
number = {},
pages = {111394},
doi = {10.1016/j.compbiomed.2025.111394},
pmid = {41435499},
issn = {1879-0534},
abstract = {Copy number variations (CNVs) play a crucial role in shaping genetic diversity and influencing various plant traits. However, existing methods for CNV characterization often face challenges due to the complexity and repetitive nature of plant genomes. Here, we present MLDeCNV (Machine Learning for Decoding Copy Number Variation), a novel open-source machine-learning-based tool optimized for predicting CNV types (deletions, duplications, and non-CNVs) in plant genomes. Built on the XGBoost model, MLDeCNV utilizes 32 selected CNV-related features derived from coverage metrics, nucleotide composition, and sequencing statistics. The model was trained on a high-confidence CNV dataset comprising experimentally validated and computationally predicted CNVs. It exhibits strong performance across various CNV size ranges and training set sizes, achieving an accuracy of 89.27%, with precision, recall, and F1-score all at 89.3%, and an Area Under Curve of 0.9783, underscoring its robustness and reliability. Extensive comparisons with traditional machine learning models reveal that XGBoost outperforms other methods, particularly in handling complex, nonlinear interactions within the CNV data. Additionally, while MLDeCNV does not perform de novo CNV detection, it evaluates CNV type classification from pre-identified genomic regions, making it a post-detection classification tool. This tool, accessible at http://46.202.167.198:5004/, can be integrated downstream of CNV detection pipelines, enhancing the accuracy of CNV type categorization. The precise classification of CNV types from pre-identified genomic regions will streamline downstream genomic analyses, facilitating enhanced understanding and utilization of genetic variation in plants.},
}
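The accuracy, precision, recall, and F1-score quoted in the MLDeCNV abstract above are standard multi-class metrics. The sketch below shows how they could be derived from a three-class confusion matrix (deletion, duplication, non-CNV). The matrix counts are invented, the function name `classification_metrics` is hypothetical, and macro-averaging is an assumption, since the abstract does not state which averaging scheme was used. This is not the tool's own evaluation code.

```python
def classification_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] = number of items of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(n)) / total

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,  # macro average (assumed)
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Invented counts; rows and columns are deletion, duplication, non-CNV.
toy_confusion = [
    [90, 5, 5],
    [4, 88, 8],
    [6, 7, 87],
]

metrics = classification_metrics(toy_confusion)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

When classes are balanced, as in this toy matrix, accuracy and macro-averaged recall coincide, which is one way the near-identical figures reported in the abstract could arise.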
RevDate: 2025-12-23
Metagenomic characterization of the metabolism, evolution, and global distribution of Candidatus Accumulibacter members in wastewater treatment plants.
The ISME journal pii:8402456 [Epub ahead of print].
Deciphering the genomic basis of ecological diversification in activated sludge microbiomes is essential for optimizing treatment technology and advancing microbial ecology. Here, we present a global genome-resolved investigation of Candidatus Accumulibacter, the primary functional agent of enhanced biological phosphorus removal, based on 828 metagenomes from wastewater treatment plants across six continents. We recovered 104 high-quality Candidatus Accumulibacter metagenome-assembled genomes, discovering a new clade (Clade IV), substantially expanding the known phylogenetic diversity and revealing a ubiquitous yet geographically heterogeneous global distribution. Phylogenomic and pangenome analyses uncovered extensive clade-specific gene gain and loss, particularly in nitrogen metabolism, suggesting divergent evolutionary trajectories shaped by relaxed selection and niche adaptation. Genome-wide patterns of convergent streamlining and enriched antiviral defense systems indicate selective pressures from strong competition and viral predation. Constraint-based metabolic modeling revealed pervasive amino acid autotrophies and metabolic complementarity, coupled with distinct carbon utilization strategies that support ecological specialization across operational settings. Experimental validation reconciled model-phenotype discrepancies, highlighting the importance of transporter promiscuity and gene regulation in carbon substrate assimilation. Collectively, our findings redefine Candidatus Accumulibacter as a dynamic model of microbial genome plasticity, metabolic adaptation, and ecological resilience, providing an insight for understanding how microbial communities adapt and respond under engineered environmental conditions.
Additional Links: PMID-41432242
@article {pmid41432242,
year = {2025},
author = {Xie, X and Chen, L and Yuan, J and Zheng, H and Zhang, L and Yu, X and Liu, X and Wei, C and Qiu, G},
title = {Metagenomic characterization of the metabolism, evolution, and global distribution of Candidatus Accumulibacter members in wastewater treatment plants.},
journal = {The ISME journal},
volume = {},
number = {},
pages = {},
doi = {10.1093/ismejo/wraf278},
pmid = {41432242},
issn = {1751-7370},
abstract = {Deciphering the genomic basis of ecological diversification in activated sludge microbiomes is essential for optimizing treatment technology and advancing microbial ecology. Here, we present a global genome-resolved investigation of Candidatus Accumulibacter, the primary functional agent of enhanced biological phosphorus removal, based on 828 metagenomes from wastewater treatment plants across six continents. We recovered 104 high-quality Candidatus Accumulibacter metagenome-assembled genomes, discovering a new clade (Clade IV), substantially expanding the known phylogenetic diversity and revealing a ubiquitous yet geographically heterogeneous global distribution. Phylogenomic and pangenome analyses uncovered extensive clade-specific gene gain and loss, particularly in nitrogen metabolism, suggesting divergent evolutionary trajectories shaped by relaxed selection and niche adaptation. Genome-wide patterns of convergent streamlining and enriched antiviral defense systems indicate selective pressures from strong competition and viral predation. Constraint-based metabolic modeling revealed pervasive amino acid auxotrophies and metabolic complementarity, coupled with distinct carbon utilization strategies that support ecological specialization across operational settings. Experimental validation reconciled model-phenotype discrepancies, highlighting the importance of transporter promiscuity and gene regulation in carbon substrate assimilation. Collectively, our findings redefine Candidatus Accumulibacter as a dynamic model of microbial genome plasticity, metabolic adaptation, and ecological resilience, providing insight into how microbial communities adapt and respond under engineered environmental conditions.},
}
RevDate: 2025-12-22
Population-specific pangenome unveils a third FAD2 gene and solves the peanut mid-oleic fatty acid mystery.
Nature communications pii:10.1038/s41467-025-67371-7 [Epub ahead of print].
Accumulating evidence has shown that the mid-oleic fatty acid phenotype in peanuts cannot be explained by the traditional two-gene model involving AhFAD2A and AhFAD2B, which are genes encoding fatty-acid desaturase 2, but the underlying genetic mechanism remains unclear. Here, we present a population-specific pangenome using the eight founder genomes of the PeanutMAGIC population. This graph-based pangenome serves as a comprehensive reference, capturing all segregating haplotypes within the population. We conduct whole genome sequencing for the MAGIC Core, a subset of 310 RILs, for genotyping. Using pangenome-based genotypes, we trace recombination for detailed genomic analysis and phenotypic association. This investigation identifies a unique third gene, named AhFAD2C, near AhFAD2B. When recombination occurs, AhFAD2C segregates from AhFAD2B. We reveal the genotype that determines the mid-oleic fatty acid phenotype. Our findings underscore the limitations of a single-reference genome, which leads to false association and marker discovery. In contrast, a population-specific pangenome provides a more reliable framework for genomic studies. This study reveals insights into the genetic mechanism of peanut oil quality and demonstrates the advantages of population-specific pangenomes.
Additional Links: PMID-41429781
@article {pmid41429781,
year = {2025},
author = {Thompson, E and Korani, W and Wu, D and Garg, V and Tonnis, B and Wang, M and Corley Holbrook, C and Ozias-Akins, P and Culbreath, AK and Varshney, RK and Guo, B and Clevenger, JP},
title = {Population-specific pangenome unveils a third FAD2 gene and solves the peanut mid-oleic fatty acid mystery.},
journal = {Nature communications},
volume = {},
number = {},
pages = {},
doi = {10.1038/s41467-025-67371-7},
pmid = {41429781},
issn = {2041-1723},
abstract = {Accumulating evidence has shown that the mid-oleic fatty acid phenotype in peanuts cannot be explained by the traditional two-gene model involving AhFAD2A and AhFAD2B, which are genes encoding fatty-acid desaturase 2, but the underlying genetic mechanism remains unclear. Here, we present a population-specific pangenome using the eight founder genomes of the PeanutMAGIC population. This graph-based pangenome serves as a comprehensive reference, capturing all segregating haplotypes within the population. We conduct whole genome sequencing for the MAGIC Core, a subset of 310 RILs, for genotyping. Using pangenome-based genotypes, we trace recombination for detailed genomic analysis and phenotypic association. This investigation identifies a unique third gene, named AhFAD2C, near AhFAD2B. When recombination occurs, AhFAD2C segregates from AhFAD2B. We reveal the genotype that determines the mid-oleic fatty acid phenotype. Our findings underscore the limitations of a single-reference genome, which leads to false association and marker discovery. In contrast, a population-specific pangenome provides a more reliable framework for genomic studies. This study reveals insights into the genetic mechanism of peanut oil quality and demonstrates the advantages of population-specific pangenomes.},
}
RevDate: 2025-12-21
Genomic insights into Mycobacterium orygis in wild ungulates in Chennai, India.
Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases pii:S1567-1348(25)00158-3 [Epub ahead of print].
Bovine tuberculosis (bTB) represents a significant global threat to livestock, wildlife, and human health, yet wildlife tuberculosis (wTB) remains underreported in India. This study aimed to investigate the presence of Mycobacterium tuberculosis complex (MTBC) species in wild ungulates at Guindy National Park, Chennai. Postmortem tissue samples and fecal pellets were collected from spotted deer, sambar deer, and blackbuck, and analyzed using culture, molecular diagnostics, histopathology, and whole genome sequencing (WGS). Mycobacterium orygis was isolated from tissue samples of four animals, while molecular assays detected MTBC DNA in nine fecal samples. Drug resistance was identified in three fecal samples. Histopathology revealed characteristic granulomatous lesions, and WGS confirmed M. orygis in all tissue-derived isolates. Pangenome analysis identified 4222 genes with an estimated γ value of 0.0086, suggesting an open pangenome with novel genes accumulating at a slow rate. Comparative genomic analysis, including insertion sequence (IS) profiling and SNP analysis, revealed limited overall diversity but unique SNPs in Chennai isolates, suggesting local genomic differentiation. These findings highlight the dynamic nature of M. orygis within MTBC, the importance of functional studies to understand host specificity and adaptability, and the urgent need for systematic wTB surveillance and One Health-based interventions in India.
Additional Links: PMID-41423185
@article {pmid41423185,
year = {2025},
author = {Ramanujam, H and Ramalingam, M and Refaya, AK and Rajendran, P and Baskar, M and Palanivel, N and Devarajan, S and Balasubramaniam, A and Nithya, R and Shanmugam, S and Palaniyandi, K},
title = {Genomic insights into Mycobacterium orygis in wild ungulates in Chennai, India.},
journal = {Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases},
volume = {},
number = {},
pages = {105869},
doi = {10.1016/j.meegid.2025.105869},
pmid = {41423185},
issn = {1567-7257},
abstract = {Bovine tuberculosis (bTB) represents a significant global threat to livestock, wildlife, and human health, yet wildlife tuberculosis (wTB) remains underreported in India. This study aimed to investigate the presence of Mycobacterium tuberculosis complex (MTBC) species in wild ungulates at Guindy National Park, Chennai. Postmortem tissue samples and fecal pellets were collected from spotted deer, sambar deer, and blackbuck, and analyzed using culture, molecular diagnostics, histopathology, and whole genome sequencing (WGS). Mycobacterium orygis was isolated from tissue samples of four animals, while molecular assays detected MTBC DNA in nine fecal samples. Drug resistance was identified in three fecal samples. Histopathology revealed characteristic granulomatous lesions, and WGS confirmed M. orygis in all tissue-derived isolates. Pangenome analysis identified 4222 genes with an estimated γ value of 0.0086, suggesting an open pangenome with novel genes accumulating at a slow rate. Comparative genomic analysis, including insertion sequence (IS) profiling and SNP analysis, revealed limited overall diversity but unique SNPs in Chennai isolates, suggesting local genomic differentiation. These findings highlight the dynamic nature of M. orygis within MTBC, the importance of functional studies to understand host specificity and adaptability, and the urgent need for systematic wTB surveillance and One Health-based interventions in India.},
}
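Note on the γ value quoted in the entry above: pangenome openness is often summarised with a Heaps' law power-law fit, n = κN^γ, where n is the number of gene families observed after N genomes. Under that convention (conventions differ between tools), γ > 0 is read as an open pangenome and a small γ as slow accumulation of new genes. The sketch below is illustrative only: it assumes that power-law form, uses synthetic numbers rather than any data from the paper, and `fit_heaps` is a hypothetical helper name, not part of the authors' pipeline.

```python
import math


def fit_heaps(genome_counts, pan_sizes):
    """Fit n = kappa * N**gamma by least squares on log-transformed values.

    genome_counts: number of genomes sampled (N), all > 0
    pan_sizes: pangenome size observed at each N (n), all > 0
    Returns (kappa, gamma).
    """
    xs = [math.log(n_genomes) for n_genomes in genome_counts]
    ys = [math.log(size) for size in pan_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx                      # slope of the log-log line
    kappa = math.exp(mean_y - gamma * mean_x)  # intercept, back-transformed
    return kappa, gamma


# Synthetic accumulation curve generated from kappa=4000, gamma=0.01
# (made-up values chosen only to exercise the fit).
counts = [1, 2, 4, 8, 16]
sizes = [4000 * n ** 0.01 for n in counts]
kappa, gamma = fit_heaps(counts, sizes)
print(f"kappa={kappa:.1f}, gamma={gamma:.4f}")
```

In practice the accumulation curve is usually averaged over many random genome orderings before fitting; the single-curve fit here keeps the example short.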
RevDate: 2025-12-22
Genomic characteristics of multidrug-resistant Streptococcus pluranimalium isolated from a cow with mastitis in Xinjiang, China.
Veterinary journal (London, England : 1997) pii:S1090-0233(25)00244-8 [Epub ahead of print].
Streptococcus pluranimalium has emerged as a significant zoonotic pathogen, associated with bovine mastitis and a range of infections in multiple animal species and humans. In this study, a multidrug-resistant S. pluranimalium strain, designated XJSP, was isolated from milk samples collected during a clinical mastitis outbreak on a dairy farm in Xinjiang, China. Whole-genome sequencing revealed its genome comprised 2,066,720 bp with a GC content of 38.78%, harboring 1,933 coding sequences, 44 tRNAs, 5 rRNA operons, 5 genomic islands, 3 prophage regions, and a 5,007 bp plasmid pXJSP-1 carrying four resistance genes: LRA-7, salB, ermK, and lnuA. Phylogenetic analysis positioned S. pluranimalium within a distinct clade alongside S. ictaluri, S. cuniculipharyngis, S. hillyeri, and S. entericus. Pan-genome analysis of 25 global S. pluranimalium strains identified 880 core genes and extensive accessory gene content, reflecting high genetic diversity. The XJSP strain exhibited resistance to ceftriaxone, cefotaxime, erythromycin, ofloxacin, chloramphenicol, and tetracycline, and carried 66 antimicrobial resistance genes and 212 virulence factor genes. Comparative plasmid analysis indicated potential horizontal transfer of resistance genes among Streptococci. These findings enhance our understanding of the genomic architecture, antimicrobial resistance mechanisms, and evolutionary relationships of S. pluranimalium, highlighting its potential risk for interspecies transmission and its implications for public health.
Additional Links: PMID-41420886
@article {pmid41420886,
year = {2025},
author = {Jie, Z and Wang, L and Jia, W and Liao, Z and Dong, Q and Tian, Y},
title = {Genomic characteristics of multidrug-resistant Streptococcus pluranimalium isolated from a cow with mastitis in Xinjiang, China.},
journal = {Veterinary journal (London, England : 1997)},
volume = {},
number = {},
pages = {106540},
doi = {10.1016/j.tvjl.2025.106540},
pmid = {41420886},
issn = {1532-2971},
abstract = {Streptococcus pluranimalium has emerged as a significant zoonotic pathogen, associated with bovine mastitis and a range of infections in multiple animal species and humans. In this study, a multidrug-resistant S. pluranimalium strain, designated XJSP, was isolated from milk samples collected during a clinical mastitis outbreak on a dairy farm in Xinjiang, China. Whole-genome sequencing revealed its genome comprised 2,066,720 bp with a GC content of 38.78%, harboring 1,933 coding sequences, 44 tRNAs, 5 rRNA operons, 5 genomic islands, 3 prophage regions, and a 5,007 bp plasmid pXJSP-1 carrying four resistance genes: LRA-7, salB, ermK, and lnuA. Phylogenetic analysis positioned S. pluranimalium within a distinct clade alongside S. ictaluri, S. cuniculipharyngis, S. hillyeri, and S. entericus. Pan-genome analysis of 25 global S. pluranimalium strains identified 880 core genes and extensive accessory gene content, reflecting high genetic diversity. The XJSP strain exhibited resistance to ceftriaxone, cefotaxime, erythromycin, ofloxacin, chloramphenicol, and tetracycline, and carried 66 antimicrobial resistance genes and 212 virulence factor genes. Comparative plasmid analysis indicated potential horizontal transfer of resistance genes among Streptococci. These findings enhance our understanding of the genomic architecture, antimicrobial resistance mechanisms, and evolutionary relationships of S. pluranimalium, highlighting its potential risk for interspecies transmission and its implications for public health.},
}
RevDate: 2025-12-20
Genomic epidemiology of antimicrobial resistance in Proteus mirabilis: core genome and plasmid-mediated drivers.
BMC microbiology pii:10.1186/s12866-025-04651-8 [Epub ahead of print].
Proteus mirabilis has emerged as a clinically significant multidrug-resistant (MDR) pathogen, yet the genomic drivers and dissemination mechanisms of its antimicrobial resistance (AMR) remain poorly characterized. To address this gap, we conducted a pangenome analysis of 2,013 P. mirabilis genomes, including 1,990 publicly available strains and 23 newly sequenced clinical isolates, to delineate the species-wide AMR landscape. Our study identified 197 AMR gene subtypes spanning 12 antibiotic classes, with seven resistance determinants embedded in the core genome. Clinically critical resistance phenotypes, notably to third-generation cephalosporins and carbapenems, were strongly associated with the proliferation of β-lactamase genes (blaTEM-1, blaCTX-M-15, blaNDM-1) across lineages. Strikingly, 36.4% of AMR genes resided on mobile genetic elements, with conjugative IncC plasmids acting as primary vectors for high-risk resistance cassettes, including β-lactamase (blaTEM-1, blaNDM-1) and aminoglycoside-modifying [aac(6')-Ib-cr, aph(3'')-Ib] genes. Phylogenetic reconstruction revealed that blaNDM-1-carrying IncC plasmids formed interspecies clusters with homologs from Klebsiella pneumoniae, Escherichia coli, and Salmonella enterica, demonstrating active cross-genera transmission within Enterobacteriaceae. These findings highlight two parallel evolutionary strategies in P. mirabilis: the conservation of core genome resistance mechanisms and horizontal acquisition of plasmid-borne MDR traits, which together provide a comprehensive understanding of AMR transmission in the bacterium.
Additional Links: PMID-41420203
@article {pmid41420203,
year = {2025},
author = {Zhang, P and Cheng, Z and Cao, Y and Liu, S and Zhao, M},
title = {Genomic epidemiology of antimicrobial resistance in Proteus mirabilis: core genome and plasmid-mediated drivers.},
journal = {BMC microbiology},
volume = {},
number = {},
pages = {},
doi = {10.1186/s12866-025-04651-8},
pmid = {41420203},
issn = {1471-2180},
support = {32201393//National Natural Science Foundation of China/ ; },
abstract = {Proteus mirabilis has emerged as a clinically significant multidrug-resistant (MDR) pathogen, yet the genomic drivers and dissemination mechanisms of its antimicrobial resistance (AMR) remain poorly characterized. To address this gap, we conducted a pangenome analysis of 2,013 P. mirabilis genomes, including 1,990 publicly available strains and 23 newly sequenced clinical isolates, to delineate the species-wide AMR landscape. Our study identified 197 AMR gene subtypes spanning 12 antibiotic classes, with seven resistance determinants embedded in the core genome. Clinically critical resistance phenotypes, notably to third-generation cephalosporins and carbapenems, were strongly associated with the proliferation of β-lactamase genes (blaTEM-1, blaCTX-M-15, blaNDM-1) across lineages. Strikingly, 36.4% of AMR genes resided on mobile genetic elements, with conjugative IncC plasmids acting as primary vectors for high-risk resistance cassettes, including β-lactamase (blaTEM-1, blaNDM-1) and aminoglycoside-modifying [aac(6')-Ib-cr, aph(3'')-Ib] genes. Phylogenetic reconstruction revealed that blaNDM-1-carrying IncC plasmids formed interspecies clusters with homologs from Klebsiella pneumoniae, Escherichia coli, and Salmonella enterica, demonstrating active cross-genera transmission within Enterobacteriaceae. These findings highlight two parallel evolutionary strategies in P. mirabilis: the conservation of core genome resistance mechanisms and horizontal acquisition of plasmid-borne MDR traits, which together provide a comprehensive understanding of AMR transmission in the bacterium.},
}
RevDate: 2025-12-19
Gaining insights into novel pathogen hosts: Characterisation of Actinomyces weissii strains isolated from companion animals.
Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases pii:S1567-1348(25)00156-X [Epub ahead of print].
Actinomyces weissii was first isolated from the oral cavity of dogs in 2012. This study characterised the phenotypic and genotypic features of 11 additional strains obtained from diverse pathological lesions in companion animals. A multi-faceted approach was employed, combining culture techniques, biochemical profiling, MALDI-TOF MS, real-time PCR, sequencing of genetic markers, and whole-genome analysis. Despite minor phenotypic variation, MALDI-TOF MS and genotypic analyses consistently confirmed species identity and revealed intraspecies diversity. Whole-genome sequencing revealed four haemolysin family protein genes, and all isolates exhibited complete β-haemolysis. Pan-genome analysis defined a conserved core genome of 1559 genes, shared by all isolates and the reference strain, including haemolysin-related genes. Phylogenetic comparisons placed A. urogenitalis and A. trachealis as closest relatives. These findings broaden the host spectrum of A. weissii, with the first isolation from a cat, suggesting that this species may also occur in felines and merits consideration in veterinary diagnostics. Its presence in companion animals raises the possibility of zoonotic risk, and together with the detection of haemolysin family protein genes, underscores the need for further investigation to clarify its pathogenic significance and potential impact on veterinary practice and public health.
Additional Links: PMID-41419072
@article {pmid41419072,
year = {2025},
author = {Cirak, B and Kreitlow, A and Borowiak, M and Hoffmann, C and Lämmler, C and Heydel, C and Ewers, C and Malorny, B and Plötz, M and Abdulmawjood, A},
title = {Gaining insights into novel pathogen hosts: Characterisation of Actinomyces weissii strains isolated from companion animals.},
journal = {Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases},
volume = {},
number = {},
pages = {105867},
doi = {10.1016/j.meegid.2025.105867},
pmid = {41419072},
issn = {1567-7257},
abstract = {Actinomyces weissii was first isolated from the oral cavity of dogs in 2012. This study characterised the phenotypic and genotypic features of 11 additional strains obtained from diverse pathological lesions in companion animals. A multi-faceted approach was employed, combining culture techniques, biochemical profiling, MALDI-TOF MS, real-time PCR, sequencing of genetic markers, and whole-genome analysis. Despite minor phenotypic variation, MALDI-TOF MS and genotypic analyses consistently confirmed species identity and revealed intraspecies diversity. Whole-genome sequencing revealed four haemolysin family protein genes, and all isolates exhibited complete β-haemolysis. Pan-genome analysis defined a conserved core genome of 1559 genes, shared by all isolates and the reference strain, including haemolysin-related genes. Phylogenetic comparisons placed A. urogenitalis and A. trachealis as closest relatives. These findings broaden the host spectrum of A. weissii, with the first isolation from a cat, suggesting that this species may also occur in felines and merits consideration in veterinary diagnostics. Its presence in companion animals raises the possibility of zoonotic risk, and together with the detection of haemolysin family protein genes, underscores the need for further investigation to clarify its pathogenic significance and potential impact on veterinary practice and public health.},
}
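Note on the terms used in the entries above: the core genome is the set of gene families present in every genome analysed, the accessory (flexible) genome is everything else in the pangenome, and "unique" genes occur in exactly one genome. The sketch below shows that partition on a toy presence/absence table. It is a minimal illustration under stated assumptions: the genome names, gene family identifiers, and the helper `partition_pangenome` are all hypothetical, and real pipelines first cluster genes into families and often use a relaxed core threshold (for example, presence in 95 % or 99 % of genomes) rather than the strict 100 % used here.

```python
def partition_pangenome(presence):
    """Split gene families into pan, core, accessory, and unique sets.

    presence: dict mapping genome name -> set of gene family ids.
    Requires at least one genome.
    """
    gene_sets = list(presence.values())
    pan = set().union(*gene_sets)          # families seen in any genome
    core = set.intersection(*gene_sets)    # families seen in every genome

    # Count how many genomes carry each family.
    carriers = {}
    for genes in gene_sets:
        for gene in genes:
            carriers[gene] = carriers.get(gene, 0) + 1

    unique = {gene for gene, count in carriers.items() if count == 1}
    accessory = pan - core                 # includes the unique families
    return {"pan": pan, "core": core, "accessory": accessory, "unique": unique}


# Toy presence/absence data (hypothetical genomes and gene families).
toy = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g2", "g3", "g5"},
}
result = partition_pangenome(toy)
for name in ("pan", "core", "accessory", "unique"):
    print(name, sorted(result[name]))
```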
RevDate: 2025-12-19
Host-adapted probiotic potential of Ligilactobacillus agilis 2-2 revealed by comparative genomic and phenotypic analyses.
Poultry science, 105(2):106274 pii:S0032-5791(25)01514-7 [Epub ahead of print].
Host-specific adaptation shapes the evolution of safe and effective probiotics. In this study, we performed the first pan-genome analysis of Ligilactobacillus agilis using 40 genomes from poultry and mammalian sources. The species exhibits an open, highly plastic genome with host-driven divergence in carbohydrate metabolism. Glycosyltransferase GT2, bglF_2 and tcyB/C were enriched in mammalian strains, whereas capsule biosynthesis gene epsH and asp2 were predominant in poultry strains. The poultry-derived strain L. agilis 2-2 harbors gene clusters associated with acid and bile tolerance (atpA-atpH), adhesion (mapA), short-chain fatty acid biosynthesis (ldhA, ackA-pta), and antioxidant defense (trxA/B, msrA/B), collectively supporting its intestinal adaptation and probiotic fitness. Phenotypically, L. agilis 2-2 exhibited strong acid (86.9 %) and bile (84.1 %) tolerance, high aggregation (75 %) and hydrophobicity (55.3 %), and potent antimicrobial activity, facilitating gut colonization. Its cell-free supernatant displayed strong antioxidant capacity, scavenging DPPH (79.5 %), hydroxyl (66.2 %), and ABTS[+] (83.2 %) radicals, and produced abundant lactic acid (7.7 mg/mL), butyrate (1767.2 μg/mL), and propionate (1097.0 μg/mL). Collectively, these findings establish L. agilis 2-2 as a metabolically versatile, host-adapted, and genomically safe probiotic, highlighting its potential for targeted poultry applications and providing mechanistic insights into host-specific adaptation in Ligilactobacillus.
Additional Links: PMID-41418533
@article {pmid41418533,
year = {2025},
author = {Zhang, Z and Lv, Y and Guo, Z and Liu, L and Chen, X and Han, W and Wei, J and Guo, S and Sun, Y and Wang, S},
title = {Host-adapted probiotic potential of Ligilactobacillus agilis 2-2 revealed by comparative genomic and phenotypic analyses.},
journal = {Poultry science},
volume = {105},
number = {2},
pages = {106274},
doi = {10.1016/j.psj.2025.106274},
pmid = {41418533},
issn = {1525-3171},
abstract = {Host-specific adaptation shapes the evolution of safe and effective probiotics. In this study, we performed the first pan-genome analysis of Ligilactobacillus agilis using 40 genomes from poultry and mammalian sources. The species exhibits an open, highly plastic genome with host-driven divergence in carbohydrate metabolism. Glycosyltransferase GT2, bglF_2 and tcyB/C were enriched in mammalian strains, whereas capsule biosynthesis gene epsH and asp2 were predominant in poultry strains. The poultry-derived strain L. agilis 2-2 harbors gene clusters associated with acid and bile tolerance (atpA-atpH), adhesion (mapA), short-chain fatty acid biosynthesis (ldhA, ackA-pta), and antioxidant defense (trxA/B, msrA/B), collectively supporting its intestinal adaptation and probiotic fitness. Phenotypically, L. agilis 2-2 exhibited strong acid (86.9 %) and bile (84.1 %) tolerance, high aggregation (75 %) and hydrophobicity (55.3 %), and potent antimicrobial activity, facilitating gut colonization. Its cell-free supernatant displayed strong antioxidant capacity, scavenging DPPH (79.5 %), hydroxyl (66.2 %), and ABTS[+] (83.2 %) radicals, and produced abundant lactic acid (7.7 mg/mL), butyrate (1767.2 μg/mL), and propionate (1097.0 μg/mL). Collectively, these findings establish L. agilis 2-2 as a metabolically versatile, host-adapted, and genomically safe probiotic, highlighting its potential for targeted poultry applications and providing mechanistic insights into host-specific adaptation in Ligilactobacillus.},
}
RevDate: 2025-12-18
Comparative pangenome analysis of Yersinia enterocolitica in a one health approach.
BMC genomics pii:10.1186/s12864-025-12420-0 [Epub ahead of print].
Additional Links: PMID-41413462
@article {pmid41413462,
year = {2025},
author = {Martins, BTF and Rodrigues, RDS and Nero, LA},
title = {Comparative pangenome analysis of Yersinia enterocolitica in a one health approach.},
journal = {BMC genomics},
volume = {},
number = {},
pages = {},
doi = {10.1186/s12864-025-12420-0},
pmid = {41413462},
issn = {1471-2164},
support = {finance code 001//Coordenação de Aperfeiçoamento de Pessoal de Nível Superior/ ; finance code 001//Coordenação de Aperfeiçoamento de Pessoal de Nível Superior/ ; finance code 001//Coordenação de Aperfeiçoamento de Pessoal de Nível Superior/ ; },
}
RevDate: 2025-12-25
CmpDate: 2025-12-23
A simple method to efficiently generate structural variation in plants.
PLoS genetics, 21(12):e1011977.
Phenotypic variation is essential for the selection of new traits of interest. Structural variants, consisting of deletions, duplications, inversions, and translocations, have greater potential for phenotypic consequences than single nucleotide variants. Pan-genome studies have highlighted the importance of structural variation in the evolution and selection of novel traits. Here, we describe a simple method to induce structural variation in plants. We demonstrate that a short period of growth on the topoisomerase II inhibitor etoposide induces heritable structural variation and altered phenotypes in Arabidopsis thaliana at high frequency. Using long-read sequencing and genetic analyses, we identified deletions and inversions underlying semi-dominant and recessive phenotypes. This method requires minimal resources, is potentially applicable to any plant species, and can replace irradiation as a source of induced large-effect structural variation.
Additional Links: PMID-41411371
@article {pmid41411371,
year = {2025},
author = {Bechen, LL and Ahsan, N and Bahrainwala, A and Gehring, M and Satyaki, PRV},
title = {A simple method to efficiently generate structural variation in plants.},
journal = {PLoS genetics},
volume = {21},
number = {12},
pages = {e1011977},
pmid = {41411371},
issn = {1553-7404},
mesh = {*Arabidopsis/genetics/drug effects/growth & development ; Phenotype ; Etoposide/pharmacology ; Genome, Plant ; Topoisomerase II Inhibitors/pharmacology ; *Genomic Structural Variation/genetics ; Genetic Variation ; Chromosome Inversion/genetics ; },
abstract = {Phenotypic variation is essential for the selection of new traits of interest. Structural variants, consisting of deletions, duplications, inversions, and translocations, have greater potential for phenotypic consequences than single nucleotide variants. Pan-genome studies have highlighted the importance of structural variation in the evolution and selection of novel traits. Here, we describe a simple method to induce structural variation in plants. We demonstrate that a short period of growth on the topoisomerase II inhibitor etoposide induces heritable structural variation and altered phenotypes in Arabidopsis thaliana at high frequency. Using long-read sequencing and genetic analyses, we identified deletions and inversions underlying semi-dominant and recessive phenotypes. This method requires minimal resources, is potentially applicable to any plant species, and can replace irradiation as a source of induced large-effect structural variation.},
}
MeSH Terms:
*Arabidopsis/genetics/drug effects/growth & development
Phenotype
Etoposide/pharmacology
Genome, Plant
Topoisomerase II Inhibitors/pharmacology
*Genomic Structural Variation/genetics
Genetic Variation
Chromosome Inversion/genetics
RevDate: 2025-12-18
Everything is everywhere but Escherichia coli adapts to different niches.
The ISME journal pii:8382876 [Epub ahead of print].
Pathogens that are harmless in one environment can cause serious disease in another. Among host-associated bacteria, transition between hosts can have serious consequences for animal and human health. However, much remains unknown about how adaptation shapes bacterial distribution in the wild. Here, investigating the ecological genomics of Escherichia coli from diverse hosts and environments, we address the idea that bacteria disperse freely, and challenge the "everything is everywhere" paradigm. Using comparative genomics and parallelised high-throughput pangenome-wide association studies (900 experiments), we investigate lineage distribution and identify adaptive genomic signatures associated with host species, physiology and ecology. Our findings provide insights into bacterial niche adaptation, emphasize the impact of agriculture on microbial evolution, and inform One Health frameworks by linking genomics, host ecology, and the emergence of antimicrobial resistance.
Additional Links: PMID-41408707
@article {pmid41408707,
year = {2025},
author = {Monteith, W and Krisna, MA and Puigserver, BG and Cummins, EA and Kelly, DJ and Taylor, AJ and Sheppard, SK},
title = {Everything is everywhere but Escherichia coli adapts to different niches.},
journal = {The ISME journal},
volume = {},
number = {},
pages = {},
doi = {10.1093/ismejo/wraf267},
pmid = {41408707},
issn = {1751-7370},
abstract = {Pathogens that are harmless in one environment can cause serious disease in another. Among host-associated bacteria, transition between hosts can have serious consequences for animal and human health. However, much remains unknown about how adaptation shapes bacterial distribution in the wild. Here, investigating the ecological genomics of Escherichia coli from diverse hosts and environments, we address the idea that bacteria disperse freely, and challenge the "everything is everywhere" paradigm. Using comparative genomics and parallelised high-throughput pangenome-wide association studies (900 experiments), we investigate lineage distribution and identify adaptive genomic signatures associated with host species, physiology and ecology. Our findings provide insights into bacterial niche adaptation, emphasize the impact of agriculture on microbial evolution, and inform One Health frameworks by linking genomics, host ecology, and the emergence of antimicrobial resistance.},
}
RevDate: 2025-12-20
Conserved accessory genes link a phylogenetically distinct Bacillus subtilis strain from Indian bekang to the Japanese natto clade.
Scientific reports, 15(1):43097.
Bacillus subtilis is central to Asian fermented soybean foods, including Japanese natto. To explore the genomic boundaries of B. subtilis var. natto, we conducted a comparative pangenome analysis of 42 strains, including the core natto clade (n = 26) and its closest relatives. Our analysis revealed a striking evolutionary paradox centered on a single strain isolated from Indian bekang. Core-genome phylogenetic analysis places this bekang strain clearly outside the tight natto clade, with a Nepalese kinema strain being its closest systematic neighbor. In stark contrast, quantitative analysis of accessory gene profiles revealed this single bekang strain is the functional nearest neighbor to the natto clade, sharing a highly conserved accessory gene repertoire. This shared profile defines a "natto-type" adaptive strategy (the "broad-sense natto group," n = 27), separating it from other related strains. Analysis of this group-specific repertoire revealed an enrichment of transcriptional regulators and metabolic enzymes. This finding provides a compelling case study (n = 1) of polygenic adaptation, suggesting complex evolutionary pathways, such as horizontal gene transfer or selective retention, can drive rapid adaptation across disparate lineages.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-025-29683-y.
Additional Links: PMID-41407778
Citation:
@article {pmid41407778,
year = {2025},
author = {Seki, K and Nagano, Y},
title = {Conserved accessory genes link a phylogenetically distinct Bacillus subtilis strain from Indian bekang to the Japanese natto clade.},
journal = {Scientific reports},
volume = {15},
number = {1},
pages = {43097},
pmid = {41407778},
issn = {2045-2322},
abstract = {UNLABELLED: Bacillus subtilis is central to Asian fermented soybean foods, including Japanese natto. To explore the genomic boundaries of B. subtilis var. natto, we conducted a comparative pangenome analysis of 42 strains, including the core natto clade (n = 26) and its closest relatives. Our analysis revealed a striking evolutionary paradox centered on a single strain isolated from Indian bekang. Core-genome phylogenetic analysis places this bekang strain clearly outside the tight natto clade, with a Nepalese kinema strain being its closest systematic neighbor. In stark contrast, quantitative analysis of accessory gene profiles revealed this single bekang strain is the functional nearest neighbor to the natto clade, sharing a highly conserved accessory gene repertoire. This shared profile defines a "natto-type" adaptive strategy (the "broad-sense natto group," n = 27), separating it from other related strains. Analysis of this group-specific repertoire revealed an enrichment of transcriptional regulators and metabolic enzymes. This finding provides a compelling case study (n = 1) of polygenic adaptation, suggesting complex evolutionary pathways, such as horizontal gene transfer or selective retention, can drive rapid adaptation across disparate lineages.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-025-29683-y.},
}
RevDate: 2025-12-17
Biogeographical and phylogenetic constraints on horizontal gene transfer and genome evolution in Streptomyces.
Microbiology spectrum [Epub ahead of print].
UNLABELLED: The role of horizontal gene transfer (HGT) in shaping bacterial genomes is well recognized, but constraints on gene exchange and the degree to which these constraints shape genome evolution remain poorly described. In this study, we sought to determine whether geographic and phylogenetic distance constrains HGT within and between bacterial species. To address this question, we isolated strains (n = 17) of two closely related bacterial species, Streptomyces griseus and Streptomyces pratensis, from two ecologically similar sites. We identified homologous recombination events within the core genomes of these species (557 recent and 457 ancient) and determined that patterns of recombination were constrained primarily by phylogeny rather than geography. Notably, shell accessory genes were over three times more likely to be shared between the same species than with non-related geographical neighbors. The richness of secondary metabolite gene clusters is highly variable with an average of 35-55 clusters per genome, depending on clade membership. The majority of secondary metabolite gene clusters (60%) are found in all strains, indicating that they were present in the most recent common ancestor of S. griseus and S. pratensis. We conclude that most HGT in the core and accessory genome is phylogenetically constrained, while HGT of shell genes is more likely influenced by geography. This outcome indicates that the predominant mechanisms of HGT favor high phylogenetic relatedness, and that rapid gene acquisition and loss in the accessory genome could aid with adaptation to contemporary environmental conditions.
IMPORTANCE: Horizontal gene transfer (HGT) is a vital ecological and evolutionary force in microbiology, but we still lack a precise understanding of how HGT acts on the gene pool of a species or genus. While HGT can complicate phylogenetic analyses and assumptions of homology, its role in adaptation and acquiring secondary metabolites should not be overlooked. Microbial ecologists agree that the pangenome is a shifting collection of genes that can be influenced by both vertical inheritance and ecological factors. This study examines how the Streptomyces pangenome is shaped by these two forces and offers an important quantitative insight into how HGT shapes bacterial genome dynamics.
Additional Links: PMID-41404872
Citation:
@article {pmid41404872,
year = {2025},
author = {Hariharan, J and Andam, CP and Buckley, DH},
title = {Biogeographical and phylogenetic constraints on horizontal gene transfer and genome evolution in Streptomyces.},
journal = {Microbiology spectrum},
volume = {},
number = {},
pages = {e0295825},
doi = {10.1128/spectrum.02958-25},
pmid = {41404872},
issn = {2165-0497},
abstract = {UNLABELLED: The role of horizontal gene transfer (HGT) in shaping bacterial genomes is well recognized, but constraints on gene exchange and the degree to which these constraints shape genome evolution remain poorly described. In this study, we sought to determine whether geographic and phylogenetic distance constrains HGT within and between bacterial species. To address this question, we isolated strains (n = 17) of two closely related bacterial species, Streptomyces griseus and Streptomyces pratensis, from two ecologically similar sites. We identified homologous recombination events within the core genomes of these species (557 recent and 457 ancient) and determined that patterns of recombination were constrained primarily by phylogeny rather than geography. Notably, shell accessory genes were over three times more likely to be shared between the same species than with non-related geographical neighbors. The richness of secondary metabolite gene clusters is highly variable with an average of 35-55 clusters per genome, depending on clade membership. The majority of secondary metabolite gene clusters (60%) are found in all strains, indicating that they were present in the most recent common ancestor of S. griseus and S. pratensis. We conclude that most HGT in the core and accessory genome is phylogenetically constrained, while HGT of shell genes is more likely influenced by geography. This outcome indicates that the predominant mechanisms of HGT favor high phylogenetic relatedness, and that rapid gene acquisition and loss in the accessory genome could aid with adaptation to contemporary environmental conditions.
IMPORTANCE: Horizontal gene transfer (HGT) is a vital ecological and evolutionary force in microbiology, but we still lack a precise understanding of how HGT acts on the gene pool of a species or genus. While HGT can complicate phylogenetic analyses and assumptions of homology, its role in adaptation and acquiring secondary metabolites should not be overlooked. Microbial ecologists agree that the pangenome is a shifting collection of genes that can be influenced by both vertical inheritance and ecological factors. This study examines how the Streptomyces pangenome is shaped by these two forces and offers an important quantitative insight into how HGT shapes bacterial genome dynamics.},
}
RevDate: 2025-12-21
CmpDate: 2025-12-17
DNA shape and epigenomics distinguish the mechanistic origin of human genomic structural variations.
Nucleic acids research, 53(22):.
The recent advent of long-read whole genome sequencing has enabled us to create an accurate telomere-to-telomere reference genome, construct pangenome graphs, and compile precise catalogs of genomic structural variations (SVs). These comprehensive SV repositories provide an excellent opportunity to explore the role of SVs in genotype-phenotype associations and examine the mechanisms by which SVs are introduced through double-strand break (DSB) repair. Here, we employed comprehensive SV catalogs identified through various short- and long-read whole genome sequencing efforts to infer the underlying mechanisms of SV introduction based on their genomic and epigenomic profiles. Our findings indicate that high local DNA methylation and DNA shape-related features, such as low variations in propeller twist, support the origins of homology-driven SVs. Subsequently, we utilized an active-learning-based unsupervised clustering approach, revealing that homology-dependent SVs show greater evidence of retaining ancestral recombination patterns compared to their homology-independent counterparts. Finally, our comparison of inherited and de novo SVs from healthy populations and rare disease cohorts showed distinct upstream H3K27me3 levels in de novo SVs from individuals with ultra-rare disorders. These findings highlight genome-wide characteristics that may influence the choice of repair mechanisms linked to heritable SV origins.
Additional Links: PMID-41404800
Citation:
@article {pmid41404800,
year = {2025},
author = {Boev, NB and Gerstein, MB and Kumar, S},
title = {DNA shape and epigenomics distinguish the mechanistic origin of human genomic structural variations.},
journal = {Nucleic acids research},
volume = {53},
number = {22},
pages = {},
pmid = {41404800},
issn = {1362-4962},
support = {//Canada Research Chair Program/ ; //Williams Professorship funds/ ; //Canadian Institute of Health Research/ ; //Princess Margaret Cancer Foundation/ ; /CRI/Cancer Research Institute/United States ; U24 HG007497/HG/NHGRI NIH HHS/United States ; //Terry Fox Research Institute/ ; U24HG007497//National Institute of Health/ ; /WT_/Wellcome Trust/United Kingdom ; },
mesh = {Humans ; *Epigenomics/methods ; *Genome, Human ; DNA Methylation ; *Genomic Structural Variation ; Whole Genome Sequencing ; *DNA/genetics/chemistry ; Epigenesis, Genetic ; },
abstract = {The recent advent of long-read whole genome sequencing has enabled us to create an accurate telomere-to-telomere reference genome, construct pangenome graphs, and compile precise catalogs of genomic structural variations (SVs). These comprehensive SV repositories provide an excellent opportunity to explore the role of SVs in genotype-phenotype associations and examine the mechanisms by which SVs are introduced through double-strand break (DSB) repair. Here, we employed comprehensive SV catalogs identified through various short- and long-read whole genome sequencing efforts to infer the underlying mechanisms of SV introduction based on their genomic and epigenomic profiles. Our findings indicate that high local DNA methylation and DNA shape-related features, such as low variations in propeller twist, support the origins of homology-driven SVs. Subsequently, we utilized an active-learning-based unsupervised clustering approach, revealing that homology-dependent SVs show greater evidence of retaining ancestral recombination patterns compared to their homology-independent counterparts. Finally, our comparison of inherited and de novo SVs from healthy populations and rare disease cohorts showed distinct upstream H3K27me3 levels in de novo SVs from individuals with ultra-rare disorders. These findings highlight genome-wide characteristics that may influence the choice of repair mechanisms linked to heritable SV origins.},
}
MeSH Terms:
Humans
*Epigenomics/methods
*Genome, Human
DNA Methylation
*Genomic Structural Variation
Whole Genome Sequencing
*DNA/genetics/chemistry
Epigenesis, Genetic
RevDate: 2025-12-19
CmpDate: 2025-12-17
PanBGC: a pangenome-inspired framework for comparative analysis of biosynthetic gene clusters.
ISME communications, 5(1):ycaf225.
Bacterial secondary metabolites are a major source of therapeutics and play key roles in microbial ecology. These compounds are encoded by biosynthetic gene clusters (BGCs), which show extensive genetic diversity across microbial genomes. While recent advances have enabled clustering of BGCs into gene cluster families (GCFs), there is still a lack of frameworks for systematically analysing their internal diversity at a population scale. Here, we introduce "PanBGC", a pangenome-inspired framework that treats each GCF as a population of related BGCs. This enables classification of biosynthetic genes into core, accessory, and unique categories and provides openness metrics to quantify compositional diversity. Applied to over 250 000 BGCs from more than 35 000 genomes, PanBGC maps biosynthetic diversity of more than 80 000 GCFs. Our analysis reveals that gene composition reshuffling, rather than acquisition of new genes, is the dominant driver of diversity within GCFs, with most families exhibiting closed gene repertoires but high compositional variability. Additionally, transporter-related domains were commonly identified among core genes, reflecting the fundamental importance of compound export in BGC function. To facilitate exploration, we present PanBGC-DB (https://panbgc-db.cs.uni-tuebingen.de), an interactive web platform for comparative BGC analysis. PanBGC-DB offers gene- and domain-level visualizations, phylogenetic tools, openness metrics, and custom query integration. Together, PanBGC and PanBGC-DB provide a scalable framework for exploring BGCs at population resolution and for contextualizing newly discovered BGCs within the global landscape of secondary metabolism.
Additional Links: PMID-41403705
Citation:
@article {pmid41403705,
year = {2025},
author = {Paccagnella, D and Bağcı, C and Gavriilidou, A and Ziemert, N},
title = {PanBGC: a pangenome-inspired framework for comparative analysis of biosynthetic gene clusters.},
journal = {ISME communications},
volume = {5},
number = {1},
pages = {ycaf225},
pmid = {41403705},
issn = {2730-6151},
abstract = {Bacterial secondary metabolites are a major source of therapeutics and play key roles in microbial ecology. These compounds are encoded by biosynthetic gene clusters (BGCs), which show extensive genetic diversity across microbial genomes. While recent advances have enabled clustering of BGCs into gene cluster families (GCFs), there is still a lack of frameworks for systematically analysing their internal diversity at a population scale. Here, we introduce "PanBGC", a pangenome-inspired framework that treats each GCF as a population of related BGCs. This enables classification of biosynthetic genes into core, accessory, and unique categories and provides openness metrics to quantify compositional diversity. Applied to over 250 000 BGCs from more than 35 000 genomes, PanBGC maps biosynthetic diversity of more than 80 000 GCFs. Our analysis reveals that gene composition reshuffling, rather than acquisition of new genes, is the dominant driver of diversity within GCFs, with most families exhibiting closed gene repertoires but high compositional variability. Additionally, transporter-related domains were commonly identified among core genes, reflecting the fundamental importance of compound export in BGC function. To facilitate exploration, we present PanBGC-DB (https://panbgc-db.cs.uni-tuebingen.de), an interactive web platform for comparative BGC analysis. PanBGC-DB offers gene- and domain-level visualizations, phylogenetic tools, openness metrics, and custom query integration. Together, PanBGC and PanBGC-DB provide a scalable framework for exploring BGCs at population resolution and for contextualizing newly discovered BGCs within the global landscape of secondary metabolism.},
}
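The core, accessory, and unique categories mentioned in the PanBGC abstract above can be illustrated with a small presence/absence sketch. This is not the PanBGC implementation: the function name `classify_genes`, the toy gene family names, and the cut-offs (present in every cluster = core, present in exactly one = unique, anything in between = accessory) are assumptions chosen for illustration, and the published tool may define the categories differently.

```python
from collections import Counter


def classify_genes(clusters):
    """Label each gene family as core, accessory, or unique.

    clusters: dict mapping a cluster id to the set of gene family
    names found in that cluster (hypothetical input format).
    """
    n_clusters = len(clusters)
    # Count in how many clusters each gene family occurs.
    counts = Counter(gene for genes in clusters.values() for gene in set(genes))
    categories = {}
    for gene, k in counts.items():
        if k == n_clusters:
            categories[gene] = "core"        # assumed: present in all members
        elif k == 1:
            categories[gene] = "unique"      # assumed: present in exactly one
        else:
            categories[gene] = "accessory"   # present in some, not all
    return categories


# Toy family of three related clusters (made-up gene names).
example = {
    "bgc1": {"pks", "transporter", "regulator"},
    "bgc2": {"pks", "transporter", "halogenase"},
    "bgc3": {"pks", "transporter", "regulator"},
}
result = classify_genes(example)
```

In this toy input, `pks` and `transporter` occur in all three clusters, `regulator` in two, and `halogenase` in one, so they fall into the core, accessory, and unique categories respectively under the assumed cut-offs.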
RevDate: 2025-12-16
Genomic resources for Australian alfalfa (Medicago sativa L.) genomics: reformatted reference genome, annotated variants, gene presence-absence and diversity analysis from genome re-sequencing.
BMC plant biology pii:10.1186/s12870-025-07941-5 [Epub ahead of print].
BACKGROUND: Alfalfa (Medicago sativa L.), a globally significant forage crop, has faced limited breeding progress in recent decades. Several challenges hinder genetic gain in alfalfa, including its status as an outbreeding tetraploid species with pronounced inbreeding depression, high parent numbers in synthetic crosses resulting in limited genetic differentiation between cultivars and a lack of genomic resources to advance genomic breeding techniques in the species.
RESULTS: We aim to address some of these limitations by generating genomic resources for alfalfa improvement, including reformatting an allele-aware reference genome to remove duplicate haplotypes while retaining presence absence variation, genome annotation identifying genes and functional elements, SNP discovery and SNP variant effect prediction. The predicted gene set was expanded by the inclusion of RNA sequencing from multiple tissue types and stress treatments. Genetic diversity of 316 samples from seven commercially available cultivars relevant to Australian grazing systems was examined, including a population level analysis of gene presence-absence variation. There is little genetic differentiation between cultivars, with higher diversity within than between cultivars. Several genes were found to display presence-absence at the population level.
CONCLUSIONS: These findings provide insights for alfalfa breeding programs and underscore the need for continued efforts in developing genomic tools to unlock the crop's full potential.
Additional Links: PMID-41402760
Citation:
@article {pmid41402760,
year = {2025},
author = {Malmberg, MM and Suraweera, DD and Baillie, RC and Smith, KF and Cogan, NOI},
title = {Genomic resources for Australian alfalfa (Medicago sativa L.) genomics: reformatted reference genome, annotated variants, gene presence-absence and diversity analysis from genome re-sequencing.},
journal = {BMC plant biology},
volume = {},
number = {},
pages = {},
doi = {10.1186/s12870-025-07941-5},
pmid = {41402760},
issn = {1471-2229},
abstract = {BACKGROUND: Alfalfa (Medicago sativa L.), a globally significant forage crop, has faced limited breeding progress in recent decades. Several challenges hinder genetic gain in alfalfa, including its status as an outbreeding tetraploid species with pronounced inbreeding depression, high parent numbers in synthetic crosses resulting in limited genetic differentiation between cultivars and a lack of genomic resources to advance genomic breeding techniques in the species.
RESULTS: We aim to address some of these limitations by generating genomic resources for alfalfa improvement, including reformatting an allele-aware reference genome to remove duplicate haplotypes while retaining presence absence variation, genome annotation identifying genes and functional elements, SNP discovery and SNP variant effect prediction. The predicted gene set was expanded by the inclusion of RNA sequencing from multiple tissue types and stress treatments. Genetic diversity of 316 samples from seven commercially available cultivars relevant to Australian grazing systems was examined, including a population level analysis of gene presence-absence variation. There is little genetic differentiation between cultivars, with higher diversity within than between cultivars. Several genes were found to display presence-absence at the population level.
CONCLUSIONS: These findings provide insights for alfalfa breeding programs and underscore the need for continued efforts in developing genomic tools to unlock the crop's full potential.},
}
RevDate: 2025-12-16
Whole-genome sequencing to explore genomic profile, pathogenicity, virulence motifs, and pangenome of clinical MRSA isolates.
Scientific reports pii:10.1038/s41598-025-32897-9 [Epub ahead of print].
Multidrug-resistant (MDR) and Methicillin-resistant Staphylococcus aureus (MRSA) are major opportunistic pathogens, particularly in healthcare settings. Despite the importance of such strains as a priority nosocomial pathogen, genomic data on Bangladeshi isolates remain limited. This study presents whole genome sequence (WGS) analysis of three MDR-MRSA strains (SA-18, SA-36, and SA-41) isolated from wound-infected patients in a tertiary care hospital in Bangladesh. Multi-locus sequence typing (MLST) identified SA-18 and SA-36 as sequence type ST6, and SA-41 as ST1930. All isolates carried the mecA gene and SCCmec type IVa and were classified as MRSA. A total of 22 antibiotic resistance genes (ARGs) were identified, including blaZ, erm(C), norA, and sdrM. Pathogenicity prediction classified all isolates as human pathogens, with SA-41 showing the highest pathogenic potential (98%). Virulence analysis revealed 75 virulence factor genes (VFGs), such as hla, spa, icaA-D, and PVL. Mobile genetic elements (MGEs), plasmids, and an intact 50.9 kb prophage region were detected, with the ebp gene identified within the prophage in SA-41. Pangenome analysis revealed 3764 gene clusters, indicating an open and diverse pangenome. The findings of the present study highlight the importance of WGS in antimicrobial stewardship programs by rapidly identifying resistance genes and the spread of multidrug-resistant MRSA strains in Bangladesh, which will allow better infection control, rational use of antibiotics, and the development of effective public health policies.
Additional Links: PMID-41402559
Citation:
@article {pmid41402559,
year = {2025},
author = {Kuddus, A and Islam, MR and Hossain, A and Bongulto, KA and Watanabe, K and Muhit, MA},
title = {Whole-genome sequencing to explore genomic profile, pathogenicity, virulence motifs, and pangenome of clinical MRSA isolates.},
journal = {Scientific reports},
volume = {},
number = {},
pages = {},
doi = {10.1038/s41598-025-32897-9},
pmid = {41402559},
issn = {2045-2322},
abstract = {Multidrug-resistant (MDR) and Methicillin-resistant Staphylococcus aureus (MRSA) are major opportunistic pathogens, particularly in healthcare settings. Despite the importance of such strains as a priority nosocomial pathogen, genomic data on Bangladeshi isolates remain limited. This study presents whole genome sequence (WGS) analysis of three MDR-MRSA strains (SA-18, SA-36, and SA-41) isolated from wound-infected patients in a tertiary care hospital in Bangladesh. Multi-locus sequence typing (MLST) identified SA-18 and SA-36 as sequence type ST6, and SA-41 as ST1930. All isolates carried the mecA gene and SCCmec type IVa and were classified as MRSA. A total of 22 antibiotic resistance genes (ARGs) were identified, including blaZ, erm(C), norA, and sdrM. Pathogenicity prediction classified all isolates as human pathogens, with SA-41 showing the highest pathogenic potential (98%). Virulence analysis revealed 75 virulence factor genes (VFGs), such as hla, spa, icaA-D, and PVL. Mobile genetic elements (MGEs), plasmids, and an intact 50.9 kb prophage region were detected, with the ebp gene identified within the prophage in SA-41. Pangenome analysis revealed 3764 gene clusters, indicating an open and diverse pangenome. The findings of the present study highlight the importance of WGS in antimicrobial stewardship programs by rapidly identifying resistance genes and the spread of multidrug-resistant MRSA strains in Bangladesh, which will allow better infection control, rational use of antibiotics, and the development of effective public health policies.},
}
RevDate: 2025-12-19
CmpDate: 2025-12-16
Gene copy-number features generalize better than SNPs for antimicrobial resistance prediction in Staphylococcus aureus.
npj antimicrobials and resistance, 3(1):100.
Rapid prediction of antimicrobial resistance (AMR) from genome sequences is essential for timely therapy, yet models based on curated marker panels or core-genome Single Nucleotide Polymorphisms (SNPs) often fail to generalize to novel bacterial lineages. We evaluate AMR prediction in Staphylococcus aureus using pan-genome features that encode homologous gene copy number (including absence) and compare them to SNP-based models across six antibiotics and 4255 isolates. Gradient-boosted decision tree ensembles (XGBoost) trained on gene copy number achieve macro-averaged F1-scores of 0.925-0.988, surpassing SNP-based models (0.838-0.935). Under lineage-held-out evaluation, which withholds entire clades to mimic previously unseen lineages, gene-content models retain markedly higher performance (F1 = 0.875 and 0.904 across two split schemes), whereas SNP-based models degrade substantially (F1 = 0.557 and 0.638). Feature ablation indicates that predictive signal is distributed across many homologous gene families rather than dominated by a few markers, a structure consistent with stronger cross-lineage generalization. Because gene-content features can be robustly obtained even from low-coverage sequencing, this approach extends genome-based AMR prediction to real-world clinical and epidemiological datasets. Together, these results show that copy-number-based pan-genome representations provide a robust alternative to SNP-only approaches, particularly when models must generalize to lineages not represented in training data.
Additional Links: PMID-41402414
Citation:
@article {pmid41402414,
year = {2025},
author = {Fistarol, BF and Gervasio, JD and Szöllősi, GJ},
title = {Gene copy-number features generalize better than SNPs for antimicrobial resistance prediction in Staphylococcus aureus.},
journal = {npj antimicrobials and resistance},
volume = {3},
number = {1},
pages = {100},
pmid = {41402414},
issn = {2731-8745},
abstract = {Rapid prediction of antimicrobial resistance (AMR) from genome sequences is essential for timely therapy, yet models based on curated marker panels or core-genome Single Nucleotide Polymorphisms (SNPs) often fail to generalize to novel bacterial lineages. We evaluate AMR prediction in Staphylococcus aureus using pan-genome features that encode homologous gene copy number (including absence) and compare them to SNP-based models across six antibiotics and 4255 isolates. Gradient-boosted decision tree ensembles (XGBoost) trained on gene copy number achieve macro-averaged F1-scores of 0.925-0.988, surpassing SNP-based models (0.838-0.935). Under lineage-held-out evaluation, which withholds entire clades to mimic previously unseen lineages, gene-content models retain markedly higher performance (F1 = 0.875 and 0.904 across two split schemes), whereas SNP-based models degrade substantially (F1 = 0.557 and 0.638). Feature ablation indicates that predictive signal is distributed across many homologous gene families rather than dominated by a few markers, a structure consistent with stronger cross-lineage generalization. Because gene-content features can be robustly obtained even from low-coverage sequencing, this approach extends genome-based AMR prediction to real-world clinical and epidemiological datasets. Together, these results show that copy-number-based pan-genome representations provide a robust alternative to SNP-only approaches, particularly when models must generalize to lineages not represented in training data.},
}
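Two evaluation ideas named in the abstract above, macro-averaged F1 and lineage-held-out splitting, can be sketched in plain Python. This is an illustrative sketch only, not the authors' pipeline: the function names `macro_f1` and `lineage_held_out_split`, the "R"/"S" labels, and the sequence-type strings are assumptions, and the paper's actual split schemes are not described in the abstract.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined here as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def lineage_held_out_split(lineages, held_out):
    """Return (train_idx, test_idx) so that every isolate from a held-out
    lineage lands in the test set and none of them in training."""
    train_idx = [i for i, lin in enumerate(lineages) if lin not in held_out]
    test_idx = [i for i, lin in enumerate(lineages) if lin in held_out]
    return train_idx, test_idx


# Toy example: resistant ("R") vs susceptible ("S") calls for four isolates.
y_true = ["R", "R", "S", "S"]
y_pred = ["R", "S", "S", "S"]
score = macro_f1(y_true, y_pred)

# Hypothetical lineage labels; ST30 is withheld entirely from training.
lineages = ["ST5", "ST8", "ST5", "ST30"]
train_idx, test_idx = lineage_held_out_split(lineages, held_out={"ST30"})
```

Splitting by lineage rather than by random isolate keeps close relatives of the test isolates out of the training set, which is the property the abstract relies on when it compares gene-content and SNP-based models on previously unseen lineages.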
RevDate: 2025-12-30
CmpDate: 2025-12-30
Primulina pan-genome reveals differential gene retention following whole-genome duplications and provides insights into edaphic specialization.
Cell reports, 44(12):116763.
Primulina, a genus of >200 species specialized to extreme soils, provides a model for edaphic adaptation. We assemble seven genomes and construct a pan-genome spanning nine species from karst, Danxia, and acidic soils. Comparative analyses reveal that karst-adapted species have smaller genomes. Two lineage-specific whole-genome duplications (WGDs) exhibit biased duplicate loss in large gene families but preferential retention of transcription factors, indicating combined adaptive and nonadaptive forces. Pan-genome analyses identify ion channel and transporter genes enriched in variant hotspots and under positive selection in karst lineages. Candidate genes for drought and salt stress tolerance include ABC transporters and ion channels. Notably, an ABC transporter shows positive selection in karst species and unique structural variation in non-karst species. Together, our findings show that genome downsizing, biased post-WGD retention, and evolution of ion-transport pathways shape adaptation to extreme soils. The Primulina pan-genome provides a resource for dissecting mechanisms underlying edaphic specialization.
Additional Links: PMID-41401072
Citation:
@article {pmid41401072,
year = {2025},
author = {Feng, C and Yang, L and Wang, J and Shi, X and Kang, M},
title = {Primulina pan-genome reveals differential gene retention following whole-genome duplications and provides insights into edaphic specialization.},
journal = {Cell reports},
volume = {44},
number = {12},
pages = {116763},
doi = {10.1016/j.celrep.2025.116763},
pmid = {41401072},
issn = {2211-1247},
mesh = {*Gene Duplication ; *Soil/chemistry ; Phylogeny ; Adaptation, Physiological/genetics ; Evolution, Molecular ; },
abstract = {Primulina, a genus of >200 species specialized to extreme soils, provides a model for edaphic adaptation. We assemble seven genomes and construct a pan-genome spanning nine species from karst, Danxia, and acidic soils. Comparative analyses reveal that karst-adapted species have smaller genomes. Two lineage-specific whole-genome duplications (WGDs) exhibit biased duplicate loss in large gene families but preferential retention of transcription factors, indicating combined adaptive and nonadaptive forces. Pan-genome analyses identify ion channel and transporter genes enriched in variant hotspots and under positive selection in karst lineages. Candidate genes for drought and salt stress tolerance include ABC transporters and ion channels. Notably, an ABC transporter shows positive selection in karst species and unique structural variation in non-karst species. Together, our findings show that genome downsizing, biased post-WGD retention, and evolution of ion-transport pathways shape adaptation to extreme soils. The Primulina pan-genome provides a resource for dissecting mechanisms underlying edaphic specialization.},
}
MeSH Terms:
*Gene Duplication
*Soil/chemistry
Phylogeny
Adaptation, Physiological/genetics
Evolution, Molecular
RevDate: 2025-12-18
CmpDate: 2025-12-16
Whole-genome sequence data of the Listeria monocytogenes 64_37 isolated from a trout fillet.
Data in brief, 63:112301.
Listeria monocytogenes is a major foodborne pathogen causing listeriosis disease in humans. Fish is a product that can serve as a transmission source. Here we present the whole-genome sequence (WGS) of L. monocytogenes strain 64_37 isolated from a trout fillet. Initial de novo assembly of Illumina raw reads produced 40 contigs of 3,016,736 bp and a GC % content of 37.85 %. Further reference-based assembly resulted in a high-quality genome consisting of one scaffold with a total length 2,957,126 bp and a completeness of 99.99 %. A rep26-type replicon plasmid was detected. MLST analysis assigned strain 64_37 to serogroup IIa and sequence type (ST) 204 of clonal complex (CC) 204. Functional annotation using the Rapid Annotation using Subsystem Technology (RAST) server revealed 2,949 coding sequences and 41 RNA sequences. Pan-genome analysis identified 88 unique genes in strain 64_37. Investigation of the pathogenic potential of L. monocytogenes 64_37 revealed the presence of multiple virulence genes. Investigation of the presence of antimicrobial resistance (AMR) genes suggested resistance to lincosamide, cationic peptides, fosfomycin and fluoroquinolone. This data provides an overview of pathogenic potential of L. monocytogenes 64_37 and can be used to further shed light on its presence in fish products.
Additional Links: PMID-41399444
Citation:
@article {pmid41399444,
year = {2025},
author = {Slavko, A and Pemaj, V and Drosinos, EH and Skandamis, PN and Rychli, K and Papadimitriou, K},
title = {Whole-genome sequence data of the Listeria monocytogenes 64_37 isolated from a trout fillet.},
journal = {Data in brief},
volume = {63},
number = {},
pages = {112301},
pmid = {41399444},
issn = {2352-3409},
abstract = {Listeria monocytogenes is a major foodborne pathogen causing listeriosis in humans. Fish is a product that can serve as a transmission source. Here we present the whole-genome sequence (WGS) of L. monocytogenes strain 64_37 isolated from a trout fillet. Initial de novo assembly of Illumina raw reads produced 40 contigs totaling 3,016,736 bp with a GC content of 37.85%. Further reference-based assembly resulted in a high-quality genome consisting of one scaffold with a total length of 2,957,126 bp and a completeness of 99.99%. A rep26-type replicon plasmid was detected. MLST analysis assigned strain 64_37 to serogroup IIa and sequence type (ST) 204 of clonal complex (CC) 204. Functional annotation using the Rapid Annotation using Subsystem Technology (RAST) server revealed 2,949 coding sequences and 41 RNA sequences. Pan-genome analysis identified 88 unique genes in strain 64_37. Investigation of the pathogenic potential of L. monocytogenes 64_37 revealed the presence of multiple virulence genes. Investigation of the presence of antimicrobial resistance (AMR) genes suggested resistance to lincosamide, cationic peptides, fosfomycin and fluoroquinolone. These data provide an overview of the pathogenic potential of L. monocytogenes 64_37 and can be used to further shed light on its presence in fish products.},
}
RevDate: 2025-12-18
CmpDate: 2025-12-15
Population-scale gene expression analysis reveals the contribution of expression diversity to the modern wheat improvement.
Nature communications, 16(1):11133.
Gene expression diversity is crucial for crop breeding, yet population genomics has focused primarily on sequence polymorphisms. A single reference genome for RNA-seq cannot handle introgression bias. Here, we conduct RNA-seq for 328 wheat lines, including landraces and elite cultivars from China and the United States, to investigate the expression variation underlying agronomic traits. Leveraging pan-genome resources, we identify 23,296 more transcripts than using the Chinese Spring reference. We construct a pan-gene regulatory atlas through eQTL analysis, revealing the tight regulation of introgressed genes. We identify 299 high-confidence candidate genes for 34 agronomic traits and resistance to 8 Blumeria graminis f. sp. tritici isolates, more than one-fifth of which were absent from the Chinese Spring. Utilizing the Kenong 9204 mutant library, 73.7% of the candidates show significant phenotypic effects. Our work mitigates the reference bias and highlights the impact of breeding-driven directional expression changes on wheat adaptation and improvement.
Additional Links: PMID-41398157
PubMed:
Citation:
@article {pmid41398157,
year = {2025},
author = {Zhang, Z and Ma, S and Yin, M and Zhao, C and Zhao, X and Yu, Y and Wang, H and Li, X and Si, Y and Niu, J and Xie, J and Wang, L and Wu, J and Zhang, Y and Zheng, Q and Zheng, S and Jiang, N and Liu, X and Ling, HQ and He, F},
title = {Population-scale gene expression analysis reveals the contribution of expression diversity to the modern wheat improvement.},
journal = {Nature communications},
volume = {16},
number = {1},
pages = {11133},
pmid = {41398157},
issn = {2041-1723},
support = {31971877//National Natural Science Foundation of China (National Science Foundation of China)/ ; 31921005//National Natural Science Foundation of China (National Science Foundation of China)/ ; },
mesh = {*Triticum/genetics/microbiology ; *Gene Expression Regulation, Plant ; Quantitative Trait Loci/genetics ; Plant Breeding ; China ; Disease Resistance/genetics ; Genetic Variation ; Genome, Plant ; Plant Diseases/microbiology/genetics ; Gene Expression Profiling ; Phenotype ; Ascomycota ; },
abstract = {Gene expression diversity is crucial for crop breeding, yet population genomics has focused primarily on sequence polymorphisms. A single reference genome for RNA-seq cannot handle introgression bias. Here, we conduct RNA-seq for 328 wheat lines, including landraces and elite cultivars from China and the United States, to investigate the expression variation underlying agronomic traits. Leveraging pan-genome resources, we identify 23,296 more transcripts than using the Chinese Spring reference. We construct a pan-gene regulatory atlas through eQTL analysis, revealing the tight regulation of introgressed genes. We identify 299 high-confidence candidate genes for 34 agronomic traits and resistance to 8 Blumeria graminis f. sp. tritici isolates, more than one-fifth of which were absent from the Chinese Spring. Utilizing the Kenong 9204 mutant library, 73.7% of the candidates show significant phenotypic effects. Our work mitigates the reference bias and highlights the impact of breeding-driven directional expression changes on wheat adaptation and improvement.},
}
MeSH Terms:
*Triticum/genetics/microbiology
*Gene Expression Regulation, Plant
Quantitative Trait Loci/genetics
Plant Breeding
China
Disease Resistance/genetics
Genetic Variation
Genome, Plant
Plant Diseases/microbiology/genetics
Gene Expression Profiling
Phenotype
Ascomycota
RevDate: 2025-12-15
Medicago super-pangenome reveals adaptive advantages and evolutionary constraints in autotetraploid alfalfa.
Nature communications pii:10.1038/s41467-025-67280-9 [Epub ahead of print].
The genetic basis for the adaptive advantages of polyploids over their diploid relatives remains poorly understood. To address this knowledge gap, we generate a haplotype-resolved autotetraploid alfalfa (Medicago sativa subsp. sativa) genome and construct a super-pangenome from 13 genomes across seven Medicago taxa. We discover substantial gene content variation in alfalfa, with only 20.1% of genes present on all four haplotypes. Within this group, 53.3% are core genes conserved across the Medicago genus, which we term 'tetra-copy core genes'. We find these genes are significantly enriched in climate-adaptation-associated genes (1.60-fold) and stress-responsive differentially expressed genes (1.61-fold). Paradoxically, they also carry a high genetic burden, with 80.1% of deleterious variants located in coding regions. Indeed, overexpressing a representative tetra-copy core gene, the glycine decarboxylase (MsGDC), improves both biomass and nitrogen use efficiency, despite its high genetic burden. Our study reveals the trade-off between adaptation and evolutionary constraints mediated by tetra-copy core genes, facilitating polyploid genetics and alfalfa breeding.
Additional Links: PMID-41398149
Publisher:
PubMed:
Citation:
@article {pmid41398149,
year = {2025},
author = {Zhang, F and Wei, C and Shi, X and Cao, S and Xu, X and Ma, Z and Peng, Y and Arshad, R and Xue, H and Zhang, Z and Zhang, W and Xu, Y and Dong, Y and Zhou, L and Cao, X and Du, M and Wang, X and Zhang, Z and Long, R and Kang, J and Zhou, Y and Yang, Q},
title = {Medicago super-pangenome reveals adaptive advantages and evolutionary constraints in autotetraploid alfalfa.},
journal = {Nature communications},
volume = {},
number = {},
pages = {},
doi = {10.1038/s41467-025-67280-9},
pmid = {41398149},
issn = {2041-1723},
abstract = {The genetic basis for the adaptive advantages of polyploids over their diploid relatives remains poorly understood. To address this knowledge gap, we generate a haplotype-resolved autotetraploid alfalfa (Medicago sativa subsp. sativa) genome and construct a super-pangenome from 13 genomes across seven Medicago taxa. We discover substantial gene content variation in alfalfa, with only 20.1% of genes present on all four haplotypes. Within this group, 53.3% are core genes conserved across the Medicago genus, which we term 'tetra-copy core genes'. We find these genes are significantly enriched in climate-adaptation-associated genes (1.60-fold) and stress-responsive differentially expressed genes (1.61-fold). Paradoxically, they also carry a high genetic burden, with 80.1% of deleterious variants located in coding regions. Indeed, overexpressing a representative tetra-copy core gene, the glycine decarboxylase (MsGDC), improves both biomass and nitrogen use efficiency, despite its high genetic burden. Our study reveals the trade-off between adaptation and evolutionary constraints mediated by tetra-copy core genes, facilitating polyploid genetics and alfalfa breeding.},
}
RevDate: 2025-12-15
Structure of the Enterobacter pan-genome is revealed using machine learning.
Microbiology spectrum [Epub ahead of print].
The growing availability of publicly accessible Enterobacter genomes offers an opportunity to reveal the structure of its pangenome, uncovering the catalog of genes across the genus and their distribution across the different species and subspecies of the genus. In this study, we analyze 777 high-quality complete Enterobacter genomes using a pangenome matrix. The accessory genome, consisting of the genes found in many, but not all strains, was decomposed using non-negative matrix factorization (NMF) to identify groups of genes, called Phylons, that are found to be present across the subgroups of the genomes analyzed. The Phylons are representative of major modes of inheritance, both lineage-associated and horizontal, found across the pangenome. Using NMF, we defined 31 Phylons, representative of 21 lineage-associated gene sets, and 10 Phylons containing genes associated with mobile genetic elements. Six mobile Phylons were extrachromosomal, representing plasmids, and four associated with chromosomal DNA. These 31 Phylons define the structure of the Enterobacter pangenome. This structure is consistent with the classification of an additional 2,291 fragmented genome sequences. This structure enables the pangenome-wide mapping of genetic traits, such as motility genes, biosynthetic gene clusters, antimicrobial resistance genes, and virulence factors. NMF thus enabled phylogenetic and functional classification of genomes based on the pangenome-scale assessment of a genome's gene portfolio. A robust classification of Enterobacter spp. enhances the understanding of the evolution of this clinically significant pathogen.
IMPORTANCE: Enterobacter spp. represent a vital member of the Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, Enterobacter species, and Escherichia coli pathogens relevant for their nosocomial pathogenicity and antimicrobial resistance. Understanding the genomic diversity of the genus is vital for further study of its evolution and resistance potential. We constructed a pangenome of 777 Enterobacter complete genomes. Machine learning techniques were used to mathematically define major subpopulations of Enterobacter based on their accessory gene content, which for the first time defined dominant modes of lineage-associated and horizontal inheritance. This analysis provides insights into the distribution of traits related to antimicrobial resistance, biosynthetic gene clusters, and virulence factors. This study provides robust classification of Enterobacter isolates identifying differential genetic traits across the species and subspecies of the genus, overcoming some of the ambiguity in its taxonomy.
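As an illustration of the decomposition step this abstract describes, the following is a minimal sketch of non-negative matrix factorization applied to a toy gene presence/absence matrix. It is not the authors' pipeline: the function names (`nmf`, `reconstruction_error`, `dominant_component`), the toy matrix, the rank of 2, and the choice of the classic multiplicative update rule are all assumptions made for illustration only.

```python
import random


def nmf(V, rank, iterations=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (genes x genomes) as W x H.

    W is genes x rank and H is rank x genomes. Uses multiplicative
    updates for the squared-error objective (an assumed, generic choice).
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start so no entry is stuck at zero.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), elementwise
        WtV = [[sum(W[i][k] * V[i][j] for i in range(n)) for j in range(m)]
               for k in range(rank)]
        WtW = [[sum(W[i][a] * W[i][b] for i in range(n)) for b in range(rank)]
               for a in range(rank)]
        WtWH = [[sum(WtW[k][l] * H[l][j] for l in range(rank)) for j in range(m)]
                for k in range(rank)]
        H = [[H[k][j] * WtV[k][j] / (WtWH[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        VHt = [[sum(V[i][j] * H[k][j] for j in range(m)) for k in range(rank)]
               for i in range(n)]
        HHt = [[sum(H[a][j] * H[b][j] for j in range(m)) for b in range(rank)]
               for a in range(rank)]
        WHHt = [[sum(W[i][l] * HHt[l][k] for l in range(rank)) for k in range(rank)]
                for i in range(n)]
        W = [[W[i][k] * VHt[i][k] / (WHHt[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H


def reconstruction_error(V, W, H):
    """Sum of squared differences between V and the product W x H."""
    rank = len(H)
    return sum(
        (V[i][j] - sum(W[i][k] * H[k][j] for k in range(rank))) ** 2
        for i in range(len(V))
        for j in range(len(V[0]))
    )


def dominant_component(H, genome_index):
    """Index of the component with the largest weight for one genome."""
    return max(range(len(H)), key=lambda k: H[k][genome_index])


# Hypothetical toy data: rows are accessory genes, columns are genomes.
# Genes 0-2 occur only in genomes 0-1; genes 3-5 only in genomes 2-3.
TOY_MATRIX = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

if __name__ == "__main__":
    W, H = nmf(TOY_MATRIX, rank=2)
    print([dominant_component(H, j) for j in range(4)])
```

In this sketch each column of `W` plays the role of a gene group and each row of `H` records how strongly each genome carries that group. How the rank is chosen and how the factors are thresholded into discrete gene sets are not specified in the abstract and are left out here.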
Additional Links: PMID-41395970
Publisher:
PubMed:
Citation:
@article {pmid41395970,
year = {2025},
author = {Burrows, JT and Li, G and Monk, JM and Chauhan, SM and Palsson, BO},
title = {Structure of the Enterobacter pan-genome is revealed using machine learning.},
journal = {Microbiology spectrum},
volume = {},
number = {},
pages = {e0192225},
doi = {10.1128/spectrum.01922-25},
pmid = {41395970},
issn = {2165-0497},
abstract = {The growing availability of publicly accessible Enterobacter genomes offers an opportunity to reveal the structure of its pangenome, uncovering the catalog of genes across the genus and their distribution across the different species and subspecies of the genus. In this study, we analyze 777 high-quality complete Enterobacter genomes using a pangenome matrix. The accessory genome, consisting of the genes found in many, but not all strains, was decomposed using non-negative matrix factorization (NMF) to identify groups of genes, called Phylons, that are found to be present across the subgroups of the genomes analyzed. The Phylons are representative of major modes of inheritance, both lineage-associated and horizontal, found across the pangenome. Using NMF, we defined 31 Phylons, representative of 21 lineage-associated gene sets, and 10 Phylons containing genes associated with mobile genetic elements. Six mobile Phylons were extrachromosomal, representing plasmids, and four associated with chromosomal DNA. These 31 Phylons define the structure of the Enterobacter pangenome. This structure is consistent with the classification of an additional 2,291 fragmented genome sequences. This structure enables the pangenome-wide mapping of genetic traits, such as motility genes, biosynthetic gene clusters, antimicrobial resistance genes, and virulence factors. NMF thus enabled phylogenetic and functional classification of genomes based on the pangenome-scale assessment of a genome's gene portfolio. A robust classification of Enterobacter spp. enhances the understanding of the evolution of this clinically significant pathogen.
IMPORTANCE: Enterobacter spp. represent a vital member of the Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, Enterobacter species, and Escherichia coli pathogens relevant for their nosocomial pathogenicity and antimicrobial resistance. Understanding the genomic diversity of the genus is vital for further study of its evolution and resistance potential. We constructed a pangenome of 777 Enterobacter complete genomes. Machine learning techniques were used to mathematically define major subpopulations of Enterobacter based on their accessory gene content, which for the first time defined dominant modes of lineage-associated and horizontal inheritance. This analysis provides insights into the distribution of traits related to antimicrobial resistance, biosynthetic gene clusters, and virulence factors. This study provides robust classification of Enterobacter isolates identifying differential genetic traits across the species and subspecies of the genus, overcoming some of the ambiguity in its taxonomy.},
}
RevDate: 2025-12-17
CmpDate: 2025-12-15
PanARGMiner (Pan-Genomic Antimicrobial Resistance Gene Miner): An advanced feature selection framework for extracting key resistance genes from pan-genomic datasets.
Computational and structural biotechnology journal, 27:5363-5374.
Identifying antimicrobial resistance (AMR)-related biomarkers from large-scale genomic datasets is often akin to finding a needle in a haystack. With pan-genomic data containing more than 100,000 gene sequences, isolating features that truly drive resistance remains a major challenge in computational biology. Here we present PanARGMiner, a machine learning-based feature selection framework designed to robustly extract highly relevant and informative biomarkers from high-dimensional biological data. PanARGMiner uses an ensemble-based feature selection strategy to select highly informative and compact feature subsets. It then utilizes repeated iterations to ensure the stability and reliability of the proposed framework, enabling PanARGMiner to generate significantly reduced features with comparable prediction performance compared to those obtained with other feature selection algorithms. Applying PanARGMiner to bacterial pan-genomic antimicrobial resistance datasets successfully extracted as few as one to ten candidate AMR biomarkers from datasets with more than 100,000 genes for five common pathogens. Although many of the extracted candidate AMR biomarkers are well-known resistance genes, proteins not known to be associated with AMR mechanisms, including functionally uncharacterized hypothetical proteins, were also extracted. This indicates the potential of PanARGMiner in revealing both established and novel mechanisms of antibiotic resistance, thus providing actionable insights for biomarker discovery, functional genomics, and precision medicine based on complex data. Its ability to uncover both known and uncharacterized resistance-related features offers new opportunities for research and clinical applications in combating AMR.
Additional Links: PMID-41395110
PubMed:
Citation:
@article {pmid41395110,
year = {2025},
author = {Chen, YC and Yang, MR and Wu, YW},
title = {PanARGMiner (Pan-Genomic Antimicrobial Resistance Gene Miner): An advanced feature selection framework for extracting key resistance genes from pan-genomic datasets.},
journal = {Computational and structural biotechnology journal},
volume = {27},
number = {},
pages = {5363-5374},
pmid = {41395110},
issn = {2001-0370},
abstract = {Identifying antimicrobial resistance (AMR)-related biomarkers from large-scale genomic datasets is often akin to finding a needle in a haystack. With pan-genomic data containing more than 100,000 gene sequences, isolating features that truly drive resistance remains a major challenge in computational biology. Here we present PanARGMiner, a machine learning-based feature selection framework designed to robustly extract highly relevant and informative biomarkers from high-dimensional biological data. PanARGMiner uses an ensemble-based feature selection strategy to select highly informative and compact feature subsets. It then utilizes repeated iterations to ensure the stability and reliability of the proposed framework, enabling PanARGMiner to generate significantly reduced features with comparable prediction performance compared to those obtained with other feature selection algorithms. Applying PanARGMiner to bacterial pan-genomic antimicrobial resistance datasets successfully extracted as few as one to ten candidate AMR biomarkers from datasets with more than 100,000 genes for five common pathogens. Although many of the extracted candidate AMR biomarkers are well-known resistance genes, proteins not known to be associated with AMR mechanisms, including functionally uncharacterized hypothetical proteins, were also extracted. This indicates the potential of PanARGMiner in revealing both established and novel mechanisms of antibiotic resistance, thus providing actionable insights for biomarker discovery, functional genomics, and precision medicine based on complex data. Its ability to uncover both known and uncharacterized resistance-related features offers new opportunities for research and clinical applications in combating AMR.},
}
RevDate: 2025-12-17
CmpDate: 2025-12-15
Geographic diversity of the Streptococcus equi subsp. equi accessory genome: implications for vaccines and global surveillance.
Frontiers in veterinary science, 12:1721958.
Strangles, caused by the host-adapted Streptococcus equi subsp. equi (S. equi), imposes significant welfare and economic losses on the equine industry worldwide. Understanding its genomic features, virulence-associated genes (VAGs), antimicrobial resistance (AMR) and mobile genetic elements (MGEs) is essential for disease control and vaccine development. This study aimed to characterize the accessory genome composition, geographic distribution of VAGs and MGEs, and AMR profiles of S. equi by a large-scale genomic analysis of global publicly available S. equi sequences. All publicly available S. equi sequences in the Sequence Read Archive (SRA) database were retrieved and assembled. A total of 552 high-quality assemblies were obtained for further analysis. The strains originated from five continents (North/South America, Europe, Asia and Oceania). The geographical distribution of VAGs (analyzed using an in-house Streptococcus equi virulence factor database), antibiotic resistance gene (ARG) profiles, and the contribution of MGEs to S. equi VAGs were analyzed in this study. The results revealed that S. equi exhibited a closed pangenome with 1,661 core and 982 accessory genes. Among 71 identified VAGs, 40 were core VAGs, while accessory VAGs showed significant geographic variations, especially in nutritional/metabolic factor genes and exotoxin genes. No acquired ARGs were detected except a single qacG gene encoding resistance to quaternary ammonium compounds. This study revealed a functional specialization of MGEs, where prophages carry superantigen genes (speH, speI) and the hyaluronidase gene hylP; genomic islands (GIs) harbor iron acquisition genes (eqb cluster) and the virD4 gene encoding the T4SS coupling protein; and integrative conjugative elements (ICEs) carry the heme metabolism cluster (htsA, shp) and streptolysin S-associated genes (sagA, sagD). The geographic variation of VAGs suggests regional adaptive pressures and supports genome streamlining in S. equi. In conclusion, S. equi exhibits a closed and streamlined genome, characteristic of host-adapted bacteria. There is a minimal acquisition of ARGs while key VAGs are retained. Prophages, GIs, and ICEs play specialized roles in VAG distribution. These findings provide insights into prioritizing VAGs for strangles vaccine development and surveillance of antigenic variation to mitigate vaccine escape.
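The core and accessory counts reported in this abstract come from partitioning gene families by how many genomes carry them. A minimal sketch of that partition follows, assuming the strict definition (core means present in every genome); the abstract does not state which threshold or software the authors used, and the function name `partition_pangenome` and the toy genomes are hypothetical.

```python
def partition_pangenome(genomes):
    """Split gene families into core, accessory, and unique sets.

    `genomes` maps a genome name to the set of gene family identifiers
    it carries. Core: present in every genome (strict definition, an
    assumption). Accessory: present in at least one genome but not all.
    Unique: the subset of accessory families found in exactly one genome.
    """
    total = len(genomes)
    counts = {}
    for families in genomes.values():
        for family in families:
            counts[family] = counts.get(family, 0) + 1
    core = {f for f, c in counts.items() if c == total}
    accessory = set(counts) - core
    unique = {f for f in accessory if counts[f] == 1}
    return core, accessory, unique


# Hypothetical toy input: three genomes with five gene families in total.
TOY_GENOMES = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g3", "g5"},
}

if __name__ == "__main__":
    core, accessory, unique = partition_pangenome(TOY_GENOMES)
    print(sorted(core), sorted(accessory), sorted(unique))
```

Whether a pangenome is called open or closed depends on how the total family count grows as genomes are added, which this sketch does not model.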
Additional Links: PMID-41394906
PubMed:
Citation:
@article {pmid41394906,
year = {2025},
author = {He, L and Khine, NO and Song, J and Loubière, C and Butaye, P},
title = {Geographic diversity of the Streptococcus equi subsp. equi accessory genome: implications for vaccines and global surveillance.},
journal = {Frontiers in veterinary science},
volume = {12},
number = {},
pages = {1721958},
pmid = {41394906},
issn = {2297-1769},
abstract = {Strangles, caused by the host-adapted Streptococcus equi subsp. equi (S. equi), imposes significant welfare and economic losses on the equine industry worldwide. Understanding its genomic features, virulence-associated genes (VAGs), antimicrobial resistance (AMR) and mobile genetic elements (MGEs) is essential for disease control and vaccine development. This study aimed to characterize the accessory genome composition, geographic distribution of VAGs and MGEs, and AMR profiles of S. equi by a large-scale genomic analysis of global publicly available S. equi sequences. All publicly available S. equi sequences in the Sequence Read Archive (SRA) database were retrieved and assembled. A total of 552 high-quality assemblies were obtained for further analysis. The strains originated from five continents (North/South America, Europe, Asia and Oceania). The geographical distribution of VAGs (analyzed using an in-house Streptococcus equi virulence factor database), antibiotic resistance gene (ARG) profiles, and the contribution of MGEs to S. equi VAGs were analyzed in this study. The results revealed that S. equi exhibited a closed pangenome with 1,661 core and 982 accessory genes. Among 71 identified VAGs, 40 were core VAGs, while accessory VAGs showed significant geographic variations, especially in nutritional/metabolic factor genes and exotoxin genes. No acquired ARGs were detected except a single qacG gene encoding resistance to quaternary ammonium compounds. This study revealed a functional specialization of MGEs, where prophages carry superantigen genes (speH, speI) and the hyaluronidase gene hylP; genomic islands (GIs) harbor iron acquisition genes (eqb cluster) and the virD4 gene encoding the T4SS coupling protein; and integrative conjugative elements (ICEs) carry the heme metabolism cluster (htsA, shp) and streptolysin S-associated genes (sagA, sagD). The geographic variation of VAGs suggests regional adaptive pressures and supports genome streamlining in S. equi. In conclusion, S. equi exhibits a closed and streamlined genome, characteristic of host-adapted bacteria. There is a minimal acquisition of ARGs while key VAGs are retained. Prophages, GIs, and ICEs play specialized roles in VAG distribution. These findings provide insights into prioritizing VAGs for strangles vaccine development and surveillance of antigenic variation to mitigate vaccine escape.},
}
RevDate: 2025-12-15
CmpDate: 2025-12-15
Pangenome Analysis of Proteus mirabilis Reveals Lineage-Specific Antimicrobial Resistance Profiles and Discordant Genotype-Phenotype Correlations.
bioRxiv : the preprint server for biology pii:2025.11.21.689858.
UNLABELLED: Urinary tract infections (UTIs) impose a substantial health care burden with increasing antimicrobial resistance and treatment failure rates. Proteus mirabilis is a challenging UTI pathogen due to intrinsic resistances coupled with formation of crystalline biofilms. We combined pangenome analysis, antimicrobial resistance (AMR) gene prediction, and antimicrobial susceptibility testing (AST) to determine whether highly multidrug-resistant (MDR) isolates of P. mirabilis arise from distinct lineages and explored the clinical utility of multilocus sequence typing (MLST). The pangenome of 1,001 P. mirabilis genomes from human urine revealed an open conformation driven by strain diversity and the accessory genome. A total of 213 sequence types (STs) were identified and only 7% had ≥11 genomes, highlighting strain diversity. 93% of the P. mirabilis genomes harbored resistance genes for ≥2 antibiotic subclasses, and 25% were predicted to be resistant to >6 subclasses, confirming a high MDR burden. By focusing on the 15 most prevalent STs, we observed that AMR subclasses were largely lineage-specific. However, isolates with very high resistance gene counts (>20) were distributed across multiple STs, indicating that extreme resistance gene accumulation is not restricted to ST. Comprehensive AST of 27 P. mirabilis clinical isolates further revealed variable genotype-phenotype concordance, indicating unresolved mechanisms of resistance not captured in current AMR prediction databases. In summary, our study emphasizes the critical need to expand genomic surveillance of P. mirabilis to fully understand the complex landscape of AMR in this species and translate these insights into improved diagnostic and therapeutic strategies.
IMPORTANCE: Proteus mirabilis is a clinically challenging cause of urinary tract infections due to multidrug resistance and its ability to form crystalline biofilms that provide further antibiotic protection. In this study, we sought to determine how well sequence typing and antimicrobial resistance gene prediction correlate with laboratory-based antimicrobial susceptibility testing. By analyzing more than 1,000 P. mirabilis genomes from human urine samples, we found that some resistance patterns were sequence type-specific. However, the genome structure of this species suggests frequent horizontal gene transfer, and the most highly resistant strains did not cluster by lineage. Importantly, many isolates that appeared "susceptible" based on their genomes were in fact resistant upon laboratory testing, revealing hidden or uncharacterized resistance mechanisms. These findings show that current gene-based prediction tools can miss clinically relevant resistance, underscoring the need for further study to guide effective treatment of P. mirabilis infections.
Additional Links: PMID-41394595
Full Text:
Publisher:
PubMed:
Citation:
@article {pmid41394595,
year = {2025},
author = {Deka, N and Brauer, AL and Connerton, K and Hanson, B and Walker, JN and Armbruster, CE},
title = {Pangenome Analysis of Proteus mirabilis Reveals Lineage-Specific Antimicrobial Resistance Profiles and Discordant Genotype-Phenotype Correlations.},
journal = {bioRxiv : the preprint server for biology},
volume = {},
number = {},
pages = {},
doi = {10.1101/2025.11.21.689858},
pmid = {41394595},
issn = {2692-8205},
abstract = {UNLABELLED: Urinary tract infections (UTIs) impose a substantial health care burden with increasing antimicrobial resistance and treatment failure rates. Proteus mirabilis is a challenging UTI pathogen due to intrinsic resistances coupled with formation of crystalline biofilms. We combined pangenome analysis, antimicrobial resistance (AMR) gene prediction, and antimicrobial susceptibility testing (AST) to determine whether highly multidrug-resistant (MDR) isolates of P. mirabilis arise from distinct lineages and explored the clinical utility of multilocus sequence typing (MLST). The pangenome of 1,001 P. mirabilis genomes from human urine revealed an open conformation driven by strain diversity and the accessory genome. A total of 213 sequence types (STs) were identified and only 7% had ≥11 genomes, highlighting strain diversity. 93% of the P. mirabilis genomes harbored resistance genes for ≥2 antibiotic subclasses, and 25% were predicted to be resistant to >6 subclasses, confirming a high MDR burden. By focusing on the 15 most prevalent STs, we observed that AMR subclasses were largely lineage-specific. However, isolates with very high resistance gene counts (>20) were distributed across multiple STs, indicating that extreme resistance gene accumulation is not restricted to ST. Comprehensive AST of 27 P. mirabilis clinical isolates further revealed variable genotype-phenotype concordance, indicating unresolved mechanisms of resistance not captured in current AMR prediction databases. In summary, our study emphasizes the critical need to expand genomic surveillance of P. mirabilis to fully understand the complex landscape of AMR in this species and translate these insights into improved diagnostic and therapeutic strategies.
IMPORTANCE: Proteus mirabilis is a clinically challenging cause of urinary tract infections due to multidrug resistance and its ability to form crystalline biofilms that provide further antibiotic protection. In this study, we sought to determine how well sequence typing and antimicrobial resistance gene prediction correlate with laboratory-based antimicrobial susceptibility testing. By analyzing more than 1,000 P. mirabilis genomes from human urine samples, we found that some resistance patterns were sequence type-specific. However, the genome structure of this species suggests frequent horizontal gene transfer, and the most highly resistant strains did not cluster by lineage. Importantly, many isolates that appeared "susceptible" based on their genomes were in fact resistant upon laboratory testing, revealing hidden or uncharacterized resistance mechanisms. These findings show that current gene-based prediction tools can miss clinically relevant resistance, underscoring the need for further study to guide effective treatment of P. mirabilis infections.},
}
RevDate: 2025-12-17
CmpDate: 2025-12-15
Higher frequency of prokaryotic low complexity regions in core and orthologous genes.
Frontiers in bioinformatics, 5:1673480.
Prokaryotic genome evolution is shaped by mutation, gene duplication, and horizontal gene transfer, yet the interaction of these mechanisms, particularly in relation to low complexity regions (LCRs), remains poorly understood. LCRs are known to be mutation-prone and have been proposed to promote genetic innovation. However, the interaction between LCR-mediated and paralogy-mediated genetic innovation is still unclear. To clarify the interplay between these two evolutionary forces, we analyzed the distribution of LCRs in protein-coding genes from three closely related enterobacteria (Escherichia coli, Salmonella enterica, and Klebsiella pneumoniae) at both species and population levels. Using pangenomic and orthology-based approaches, we categorized genes by duplication history and conservation status and assessed LCR frequencies across these groups. We found that LCRs were consistently enriched in core and orthologous genes rather than in accessory or paralogous ones. This pattern was stable across evolutionary timescales and particularly pronounced in genes involved in cell cycle control and defense. These results suggest that, contrary to prior assumptions, LCRs may serve conserved functional roles rather than acting primarily as agents of evolutionary plasticity even at population-level timescales.
Additional Links: PMID-41393790
Citation:
@article {pmid41393790,
year = {2025},
author = {Saravanan, V and Kravetz, A and Battistuzzi, FU},
title = {Higher frequency of prokaryotic low complexity regions in core and orthologous genes.},
journal = {Frontiers in bioinformatics},
volume = {5},
number = {},
pages = {1673480},
pmid = {41393790},
issn = {2673-7647},
abstract = {Prokaryotic genome evolution is shaped by mutation, gene duplication, and horizontal gene transfer, yet the interaction of these mechanisms, particularly in relation to low complexity regions (LCRs), remains poorly understood. LCRs are known to be mutation-prone and have been proposed to promote genetic innovation. However, the interaction between LCR-mediated and paralogy-mediated genetic innovation is still unclear. To clarify the interplay between these two evolutionary forces, we analyzed the distribution of LCRs in protein-coding genes from three closely related enterobacteria (Escherichia coli, Salmonella enterica, and Klebsiella pneumoniae) at both species and population levels. Using pangenomic and orthology-based approaches, we categorized genes by duplication history and conservation status and assessed LCR frequencies across these groups. We found that LCRs were consistently enriched in core and orthologous genes rather than in accessory or paralogous ones. This pattern was stable across evolutionary timescales and particularly pronounced in genes involved in cell cycle control and defense. These results suggest that, contrary to prior assumptions, LCRs may serve conserved functional roles rather than acting primarily as agents of evolutionary plasticity even at population-level timescales.},
}
RevDate: 2025-12-13
Exploring the molecular basis of serotyping and antibiotic resistance differences in Riemerella anatipestifer based on pan-genomics and machine learning.
Veterinary microbiology, 312:110828 pii:S0378-1135(25)00464-X [Epub ahead of print].
Riemerella anatipestifer (R. anatipestifer) is a major pathogen in poultry worldwide, with multidrug resistance (MDR) and diverse serotypes complicating disease control. This study systematically investigated the molecular basis of serotyping and MDR characteristics in 92 R. anatipestifer strains isolated from 13 provinces in China between 2008 and 2023 by integrating pan-genome analysis with machine learning approaches. In this study, serotypes 2 (29.4 %), 7 (25.0 %), and 1 (21.7 %) were predominant. All isolates exhibited MDR and strains of serotypes 2 and 7 had significantly higher MICs for tigecycline, doxycycline, polymyxin B, and florfenicol than serotype 1 (P < 0.05). Eleven resistance genes were identified, and some (e.g., floR, tetX) displayed serotype-specific distribution patterns. Phylogenetic analysis indicated close relatedness between serotypes 2 and 7, while serotype 1 strains were genetically diverse. The accessory genome was enriched in mobile elements and O-antigen-related genes, and multiple serotype-specific marker genes (e.g., pgIA, wbpI) were identified, revealing the molecular basis of serotype classification. Additionally, insertion sequences (IS1595) flanking CPS gene clusters implied that horizontal gene transfer and recombination events may play pivotal roles in serotype variation. This study provides the first integrative framework combining pan-genomics and machine learning to elucidate the genomic basis of serotype diversity and antimicrobial resistance in R. anatipestifer, offering new insights into its adaptive evolution and informing precision vaccine and diagnostic development.
Additional Links: PMID-41389591
Citation:
@article {pmid41389591,
year = {2025},
author = {Chen, J and Liu, Y and Fu, L and Song, X and Zhang, D and Duan, X and Li, Y and Zhao, Y and Guo, L},
title = {Exploring the molecular basis of serotyping and antibiotic resistance differences in Riemerella anatipestifer based on pan-genomics and machine learning.},
journal = {Veterinary microbiology},
volume = {312},
number = {},
pages = {110828},
doi = {10.1016/j.vetmic.2025.110828},
pmid = {41389591},
issn = {1873-2542},
abstract = {Riemerella anatipestifer (R. anatipestifer) is a major pathogen in poultry worldwide, with multidrug resistance (MDR) and diverse serotypes complicating disease control. This study systematically investigated the molecular basis of serotyping and MDR characteristics in 92 R. anatipestifer strains isolated from 13 provinces in China between 2008 and 2023 by integrating pan-genome analysis with machine learning approaches. In this study, serotypes 2 (29.4 %), 7 (25.0 %), and 1 (21.7 %) were predominant. All isolates exhibited MDR and strains of serotypes 2 and 7 had significantly higher MICs for tigecycline, doxycycline, polymyxin B, and florfenicol than serotype 1 (P < 0.05). Eleven resistance genes were identified, and some (e.g., floR, tetX) displayed serotype-specific distribution patterns. Phylogenetic analysis indicated close relatedness between serotypes 2 and 7, while serotype 1 strains were genetically diverse. The accessory genome was enriched in mobile elements and O-antigen-related genes, and multiple serotype-specific marker genes (e.g., pgIA, wbpI) were identified, revealing the molecular basis of serotype classification. Additionally, insertion sequences (IS1595) flanking CPS gene clusters implied that horizontal gene transfer and recombination events may play pivotal roles in serotype variation. This study provides the first integrative framework combining pan-genomics and machine learning to elucidate the genomic basis of serotype diversity and antimicrobial resistance in R. anatipestifer, offering new insights into its adaptive evolution and informing precision vaccine and diagnostic development.},
}
RevDate: 2025-12-15
CmpDate: 2025-12-12
A metabolic atlas of the Klebsiella pneumoniae species complex reveals lineage-specific metabolism and capacity for intra-species co-operation.
PLoS biology, 23(12):e3003559.
The Klebsiella pneumoniae species complex inhabits a wide variety of hosts and environments, and is a major cause of antimicrobial resistant infections. Genomics has revealed the population comprises multiple species/sub-species and hundreds of distinct co-circulating sub-lineage (SLs) that are associated with distinct gene complements. A substantial fraction of the pan-genome is predicted to be involved in metabolic functions and hence these data are consistent with metabolic differentiation at the SL level. However, this has so far remained unsubstantiated because in the past it was not possible to explore metabolic variation at scale. Here, we used a combination of comparative genomics and high-throughput genome-scale metabolic modeling to systematically explore metabolic diversity across the K. pneumoniae species complex (n = 7,835 genomes). We simulated growth outcomes for each isolate using carbon, nitrogen, phosphorus, and sulfur sources under aerobic and anaerobic conditions (n = 1,278 conditions per isolate). We showed that the distributions of metabolic genes and growth capabilities are structured in the population, and confirmed that SLs exhibit unique metabolic profiles. In vitro co-culture experiments demonstrated reciprocal commensalistic cross-feeding between SLs, effectively extending the range of conditions supporting individual growth. We propose that these substrate specializations may promote the existence and persistence of co-circulating SLs by reducing nutrient competition and facilitating commensal interactions. Our findings have implications for understanding the eco-evolutionary dynamics of K. pneumoniae and for the design of novel strategies to prevent opportunistic infections caused by this World Health Organization priority antimicrobial resistant pathogen.
Additional Links: PMID-41385522
Citation:
@article {pmid41385522,
year = {2025},
author = {Vezina, B and Cooper, HB and Barlow, CK and Rethoret-Pasty, M and Brisse, S and Monk, JM and Holt, KE and Wyres, KL},
title = {A metabolic atlas of the Klebsiella pneumoniae species complex reveals lineage-specific metabolism and capacity for intra-species co-operation.},
journal = {PLoS biology},
volume = {23},
number = {12},
pages = {e3003559},
pmid = {41385522},
issn = {1545-7885},
mesh = {*Klebsiella pneumoniae/metabolism/genetics/growth & development/classification ; Genome, Bacterial ; Metabolic Networks and Pathways/genetics ; Genomics/methods ; Species Specificity ; Phylogeny ; Klebsiella Infections/microbiology ; Coculture Techniques ; },
abstract = {The Klebsiella pneumoniae species complex inhabits a wide variety of hosts and environments, and is a major cause of antimicrobial resistant infections. Genomics has revealed the population comprises multiple species/sub-species and hundreds of distinct co-circulating sub-lineage (SLs) that are associated with distinct gene complements. A substantial fraction of the pan-genome is predicted to be involved in metabolic functions and hence these data are consistent with metabolic differentiation at the SL level. However, this has so far remained unsubstantiated because in the past it was not possible to explore metabolic variation at scale. Here, we used a combination of comparative genomics and high-throughput genome-scale metabolic modeling to systematically explore metabolic diversity across the K. pneumoniae species complex (n = 7,835 genomes). We simulated growth outcomes for each isolate using carbon, nitrogen, phosphorus, and sulfur sources under aerobic and anaerobic conditions (n = 1,278 conditions per isolate). We showed that the distributions of metabolic genes and growth capabilities are structured in the population, and confirmed that SLs exhibit unique metabolic profiles. In vitro co-culture experiments demonstrated reciprocal commensalistic cross-feeding between SLs, effectively extending the range of conditions supporting individual growth. We propose that these substrate specializations may promote the existence and persistence of co-circulating SLs by reducing nutrient competition and facilitating commensal interactions. Our findings have implications for understanding the eco-evolutionary dynamics of K. pneumoniae and for the design of novel strategies to prevent opportunistic infections caused by this World Health Organization priority antimicrobial resistant pathogen.},
}
MeSH Terms:
*Klebsiella pneumoniae/metabolism/genetics/growth & development/classification
Genome, Bacterial
Metabolic Networks and Pathways/genetics
Genomics/methods
Species Specificity
Phylogeny
Klebsiella Infections/microbiology
Coculture Techniques
RevDate: 2025-12-15
CmpDate: 2025-12-12
Whole-genome analysis of Lysinibacillus boronitolerans MSR1: A dairy-isolated multidrug-resistant and non-pathogenic strain.
PloS one, 20(12):e0333844.
Lysinibacillus boronitolerans MSR1, a bacterium isolated from yogurt, was characterized through a detailed genomic and phylogenetic analysis. The strain demonstrated the ability to grow in MRS broth at pH 5-8 and exhibited positive activity in citrate utilization, catalase, oxidase, Methyl Red-Voges Proskauer (MR-VP), and Triple Sugar Iron (TSI) tests while being negative for amylase degradation and sulfide production. Whole-genome sequencing and Average Nucleotide Identity (ANI) analysis revealed a close genetic relationship between MSR1 and previously reported L. boronitolerans strains, with ANI values ranging from 95.25% to 98.30%, particularly for L. boronitolerans strain NBRC103108. Phylogenetic analyses based on whole-genome and 16S rRNA sequences confirmed the taxonomic placement of MSR1 within the L. boronitolerans species. A circular genome comparison highlighted the presence of unique genomic regions in MSR1, notably around the 3500 kbp mark, indicating the acquisition of novel genes that may contribute to its distinct phenotypic traits. Antibiotic susceptibility testing revealed a high level of resistance in MSR1 to glycopeptides and aminoglycosides, while the strain remained susceptible to imipenem, with in silico analysis identifying key antimicrobial resistance (AMR) genes, including qacJ, vanW, vanT, and FosBx1, which confer resistance to disinfectants, vancomycin, and fosfomycin through efflux pumps and target modification mechanisms. Five distinct biosynthetic gene cluster (BGC) regions were identified in the MSR1 genome, encoding genes for lanthipeptide-class-iii, RiPP-like, T3PKS, beta-lactone, terpene, and NRPS-like clusters. Pan-genome analysis suggested that L. boronitolerans possesses an open pan-genome, with a substantial proportion of accessory and unique genes. Functional annotation of core, accessory, and unique genes revealed that core genes are predominantly associated with metabolic processes, while accessory and unique genes are involved in information processing, storage, and defence mechanisms. These findings enhance our understanding of the genomic diversity, evolutionary dynamics, and potential adaptive strategies of L. boronitolerans MSR1, providing new insights into its ecological and functional roles.
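The core / accessory / unique split mentioned in this abstract can be illustrated with a small sketch. It assumes a common convention (core = present in every genome, unique = present in exactly one, accessory = everything in between); the abstract does not state which tool or thresholds the authors used, and the strain and gene-family names below are hypothetical placeholders, not data from the study.

```python
from typing import Dict, Set


def partition_pangenome(presence: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split gene families into core, accessory and unique sets.

    `presence` maps a genome name to the set of gene families found in it.
    Assumed convention: core = in all genomes, unique = in exactly one genome,
    accessory = in at least two genomes but not all of them.
    """
    n_genomes = len(presence)
    counts: Dict[str, int] = {}
    for families in presence.values():
        for family in families:
            counts[family] = counts.get(family, 0) + 1

    core = {f for f, c in counts.items() if c == n_genomes}
    # With a single genome every family is trivially core, so nothing is "unique".
    unique = {f for f, c in counts.items() if c == 1 and n_genomes > 1}
    accessory = set(counts) - core - unique
    return {"core": core, "accessory": accessory, "unique": unique}


# Hypothetical toy input: three strains and five gene families.
toy = {
    "strain_1": {"famA", "famB", "famC"},
    "strain_2": {"famA", "famB", "famC", "famD"},
    "strain_3": {"famA", "famB", "famE"},
}
parts = partition_pangenome(toy)
for name in ("core", "accessory", "unique"):
    print(name, sorted(parts[name]))
```

In this toy input famA and famB are core, famC is accessory, and famD and famE are unique. Real pipelines usually relax the core threshold (for example to 95% or 99% of genomes) to tolerate assembly gaps, which this sketch does not do.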
Additional Links: PMID-41385521
Citation:
@article {pmid41385521,
year = {2025},
author = {Rahman, MS and Islam, MT and Hasan, M and Tariquzzaman, M and Samia, NSN and Rahman, MM and Rahman, M},
title = {Whole-genome analysis of Lysinibacillus boronitolerans MSR1: A dairy-isolated multidrug-resistant and non-pathogenic strain.},
journal = {PloS one},
volume = {20},
number = {12},
pages = {e0333844},
pmid = {41385521},
issn = {1932-6203},
mesh = {Phylogeny ; *Genome, Bacterial ; Whole Genome Sequencing ; *Bacillaceae/genetics/drug effects/isolation & purification/classification ; *Drug Resistance, Multiple, Bacterial/genetics ; RNA, Ribosomal, 16S/genetics ; Anti-Bacterial Agents/pharmacology ; Microbial Sensitivity Tests ; },
abstract = {Lysinibacillus boronitolerans MSR1, a bacterium isolated from yogurt, was characterized through a detailed genomic and phylogenetic analysis. The strain demonstrated the ability to grow in MRS broth at pH 5-8 and exhibited positive activity in citrate utilization, catalase, oxidase, Methyl Red-Voges Proskauer (MR-VP), and Triple Sugar Iron (TSI) tests while being negative for amylase degradation and sulfide production. Whole-genome sequencing and Average Nucleotide Identity (ANI) analysis revealed a close genetic relationship between MSR1 and previously reported L. boronitolerans strains, with ANI values ranging from 95.25% to 98.30%, particularly for L. boronitolerans strain NBRC103108. Phylogenetic analyses based on whole-genome and 16S rRNA sequences confirmed the taxonomic placement of MSR1 within the L. boronitolerans species. A circular genome comparison highlighted the presence of unique genomic regions in MSR1, notably around the 3500 kbp mark, indicating the acquisition of novel genes that may contribute to its distinct phenotypic traits. Antibiotic susceptibility testing revealed a high level of resistance in MSR1 to glycopeptides and aminoglycosides, while the strain remained susceptible to imipenem, with in silico analysis identifying key antimicrobial resistance (AMR) genes, including qacJ, vanW, vanT, and FosBx1, which confer resistance to disinfectants, vancomycin, and fosfomycin through efflux pumps and target modification mechanisms. Five distinct biosynthetic gene cluster (BGC) regions were identified in the MSR1 genome, encoding genes for lanthipeptide-class-iii, RiPP-like, T3PKS, beta-lactone, terpene, and NRPS-like clusters. Pan-genome analysis suggested that L. boronitolerans possesses an open pan-genome, with a substantial proportion of accessory and unique genes. Functional annotation of core, accessory, and unique genes revealed that core genes are predominantly associated with metabolic processes, while accessory and unique genes are involved in information processing, storage, and defence mechanisms. These findings enhance our understanding of the genomic diversity, evolutionary dynamics, and potential adaptive strategies of L. boronitolerans MSR1, providing new insights into its ecological and functional roles.},
}
MeSH Terms:
Phylogeny
*Genome, Bacterial
Whole Genome Sequencing
*Bacillaceae/genetics/drug effects/isolation & purification/classification
*Drug Resistance, Multiple, Bacterial/genetics
RNA, Ribosomal, 16S/genetics
Anti-Bacterial Agents/pharmacology
Microbial Sensitivity Tests
RevDate: 2025-12-17
CmpDate: 2025-12-12
SAFARI: pangenome alignment of ancient DNA using purine/pyrimidine encodings.
NAR genomics and bioinformatics, 7(4):lqaf170.
Aligning DNA sequences retrieved from fossils or other paleontological artifacts, referred to as ancient DNA (aDNA), is particularly challenging due to the short sequence length and chemical damage which creates a specific pattern of substitution (C→T and G→A) in addition to the heightened divergence between the sample and the reference genome thus exacerbating reference bias. This bias can be mitigated by aligning to pangenome graphs to incorporate documented organismic variation, but this approach still suffers from substitution patterns due to chemical damage. We introduce a novel methodology introducing the RYmer index, a variant of the commonly used minimizer index which represents purines (A,G) and pyrimidines (C,T) as R and Y, respectively. This creates an indexing scheme robust to the aforementioned chemical damage. We implemented SAFARI (Sensitive Alignments From A RYmer Index), an aDNA damage-aware version of the pangenome aligner vg giraffe, which uses RYmers to rescue alignments containing deaminated seeds. For highly damaged samples, the recovery rate could be upwards of 10%, an amount which could well affect downstream results. We show that our approach produces more correct alignments from aDNA sequences than current approaches while maintaining a tolerable rate of spurious alignments. In addition, we demonstrate that our algorithm improves the estimate of the rate of aDNA damage, especially for highly damaged samples. Crucially, we show that this improved alignment can directly translate into better insights gained from the data by showcasing its integration with a number of extant pangenome tools.
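The purine/pyrimidine encoding described in this abstract can be illustrated with a minimal sketch. It is not SAFARI's implementation or API: the function names, the toy sequences and the use of plain k-mers instead of minimizers are all assumptions made for illustration. It only shows why seeds taken from an R/Y-encoded sequence survive the deamination substitutions (C to T, G to A) that break ordinary nucleotide seeds.

```python
# Map purines (A, G) to R and pyrimidines (C, T) to Y; other symbols pass through.
RY_TABLE = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y"})


def ry_encode(seq: str) -> str:
    """Return the purine/pyrimidine (R/Y) encoding of a DNA sequence."""
    return seq.upper().translate(RY_TABLE)


def kmers(seq: str, k: int) -> list:
    """Return all overlapping substrings of length k (a stand-in for index seeds)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Hypothetical toy sequences: the read carries one C-to-T change at position 1.
reference = "ACGTTGCA"
damaged_read = "ATGTTGCA"

k = 4
shared_plain = set(kmers(reference, k)) & set(kmers(damaged_read, k))
shared_ry = set(kmers(ry_encode(reference), k)) & set(kmers(ry_encode(damaged_read), k))

print("plain seeds shared:", len(shared_plain), "of", len(set(kmers(reference, k))))
print("R/Y seeds shared:  ", len(shared_ry), "of", len(set(kmers(ry_encode(reference), k))))
print("identical under R/Y encoding:", ry_encode(reference) == ry_encode(damaged_read))
```

Because C and T both map to Y (and G and A both map to R), the damaged read and the reference become identical after encoding, so every R/Y seed still matches. The trade-off is that the two-letter alphabet also hides genuine transition variants, so R/Y seeds are less specific than nucleotide seeds of the same length.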
Additional Links: PMID-41383830
Citation:
@article {pmid41383830,
year = {2025},
author = {Rubin, J and van Waaij, J and Kraft, L and Sirén, J and Sackett, PW and Renaud, G},
title = {SAFARI: pangenome alignment of ancient DNA using purine/pyrimidine encodings.},
journal = {NAR genomics and bioinformatics},
volume = {7},
number = {4},
pages = {lqaf170},
pmid = {41383830},
issn = {2631-9268},
support = {R01 HG010485/HG/NHGRI NIH HHS/United States ; U01 HG010961/HG/NHGRI NIH HHS/United States ; U41 HG010972/HG/NHGRI NIH HHS/United States ; },
mesh = {*DNA, Ancient/analysis ; *Purines/chemistry ; *Pyrimidines/chemistry ; Humans ; *Sequence Alignment/methods ; *Software ; Sequence Analysis, DNA/methods ; Algorithms ; },
abstract = {Aligning DNA sequences retrieved from fossils or other paleontological artifacts, referred to as ancient DNA (aDNA), is particularly challenging due to the short sequence length and chemical damage which creates a specific pattern of substitution (C→T and G→A) in addition to the heightened divergence between the sample and the reference genome thus exacerbating reference bias. This bias can be mitigated by aligning to pangenome graphs to incorporate documented organismic variation, but this approach still suffers from substitution patterns due to chemical damage. We introduce a novel methodology introducing the RYmer index, a variant of the commonly used minimizer index which represents purines (A,G) and pyrimidines (C,T) as R and Y, respectively. This creates an indexing scheme robust to the aforementioned chemical damage. We implemented SAFARI (Sensitive Alignments From A RYmer Index), an aDNA damage-aware version of the pangenome aligner vg giraffe, which uses RYmers to rescue alignments containing deaminated seeds. For highly damaged samples, the recovery rate could be upwards of 10%, an amount which could well affect downstream results. We show that our approach produces more correct alignments from aDNA sequences than current approaches while maintaining a tolerable rate of spurious alignments. In addition, we demonstrate that our algorithm improves the estimate of the rate of aDNA damage, especially for highly damaged samples. Crucially, we show that this improved alignment can directly translate into better insights gained from the data by showcasing its integration with a number of extant pangenome tools.},
}
MeSH Terms:
*DNA, Ancient/analysis
*Purines/chemistry
*Pyrimidines/chemistry
Humans
*Sequence Alignment/methods
*Software
Sequence Analysis, DNA/methods
Algorithms
RevDate: 2025-12-14
CmpDate: 2025-12-12
Bacillus velezensis T971 genome informs starch degradation in tobacco.
Frontiers in microbiology, 16:1689015.
Starch is an important reserve polysaccharide in tobacco leaves. An endophytic bacterium, Bacillus velezensis T971, was isolated from the leaves of Nicotiana tabacum L. Yunyan 97 and showed strong starch-degrading activity on the agar plate containing 1% starch (w/v). The complete genome of T971 was determined using PacBio long-read sequencing technology, revealing a single circular chromosome of 3,930,941 bp encoding 3,692 proteins. The T971 genome was compared with 242 other complete genomes of B. velezensis. Pan-genome analysis revealed a sporadic distribution of the plantazolicin (PZN) biosynthetic gene cluster (BGC) and mobile genetic elements (MGEs) (e.g., genomic islands (GIs) and prophages), which have contributed to the variability of B. velezensis genomes. Carbohydrate-active enzyme (CAZyme) analysis found 113 CAZymes in the T971 genome, including 41 glycoside hydrolases (GHs), 40 glycosyltransferases (GTs), and 14 carbohydrate esterases (CEs). One (GH13_28 family) of the four α-amylases is the most promising candidate for starch degradation in tobacco leaves, possessing a signal peptide and two carbohydrate-binding modules (CBMs). This study identifies B. velezensis T971 as a potential strain for industrial amylase production.
Additional Links: PMID-41383732
Citation:
@article {pmid41383732,
year = {2025},
author = {Hu, L and Zhang, X and Gao, Q and Yang, M and Dong, X and Cheng, T and Liang, T and Yang, B and Zhang, Y and Yin, Y},
title = {Bacillus velezensis T971 genome informs starch degradation in tobacco.},
journal = {Frontiers in microbiology},
volume = {16},
number = {},
pages = {1689015},
pmid = {41383732},
issn = {1664-302X},
abstract = {Starch is an important reserve polysaccharide in tobacco leaves. An endophytic bacterium, Bacillus velezensis T971, was isolated from the leaves of Nicotiana tabacum L. Yunyan 97 and showed strong starch-degrading activity on the agar plate containing 1% starch (w/v). The complete genome of T971 was determined using PacBio long-read sequencing technology, revealing a single circular chromosome of 3,930,941 bp encoding 3,692 proteins. The T971 genome was compared with 242 other complete genomes of B. velezensis. Pan-genome analysis revealed a sporadic distribution of the plantazolicin (PZN) biosynthetic gene cluster (BGC) and mobile genetic elements (MGEs) (e.g., genomic islands (GIs) and prophages), which have contributed to the variability of B. velezensis genomes. Carbohydrate-active enzyme (CAZyme) analysis found 113 CAZymes in the T971 genome, including 41 glycoside hydrolases (GHs), 40 glycosyltransferases (GTs), and 14 carbohydrate esterases (CEs). One (GH13_28 family) of the four α-amylases is the most promising candidate for starch degradation in tobacco leaves, possessing a signal peptide and two carbohydrate-binding modules (CBMs). This study identifies B. velezensis T971 as a potential strain for industrial amylase production.},
}
RevDate: 2025-12-11
CmpDate: 2025-12-11
Multispecies pangenomes reveal a pervasive influence of population size on structural variation.
Science (New York, N.Y.), 390(6778):eadw1931.
Structural variants (SVs) are widespread in vertebrate genomes, yet their evolutionary dynamics remain poorly understood. Using 45 long-read de novo genome assemblies and pangenome tools, we analyze SVs among three closely related species of North American jays (Aphelocoma, scrub-jays) displaying a 55-fold range in effective population size. We find rapid evolution of genome architecture, including ~100-megabase decreases in genome size driven by shifts in complex satellite landscapes. SVs exhibit slightly deleterious dynamics modulated by variant length and population size, with consistent evidence of adaptive fixation only in the largest population. Gene copy number variants exhibit an inverse relationship with population size, indicating strongly deleterious dynamics, with consequences for gene expression. Our long-read dataset and pangenome analysis demonstrate how population size shapes genome complexity.
Additional Links: PMID-41379974
Citation:
@article {pmid41379974,
year = {2025},
author = {Edwards, SV and Fang, B and Khost, D and Kolyfetis, GE and Cheek, RG and DeRaad, DA and Chen, N and Fitzpatrick, JW and McCormack, JE and Funk, WC and Ghalambor, CK and Garrison, E and Guarracino, A and Li, H and Sackton, TB},
title = {Multispecies pangenomes reveal a pervasive influence of population size on structural variation.},
journal = {Science (New York, N.Y.)},
volume = {390},
number = {6778},
pages = {eadw1931},
doi = {10.1126/science.adw1931},
pmid = {41379974},
issn = {1095-9203},
mesh = {*Population Density ; Animals ; *Genomic Structural Variation ; DNA Copy Number Variations ; *Evolution, Molecular ; Genome Size ; *Genome ; },
abstract = {Structural variants (SVs) are widespread in vertebrate genomes, yet their evolutionary dynamics remain poorly understood. Using 45 long-read de novo genome assemblies and pangenome tools, we analyze SVs among three closely related species of North American jays (Aphelocoma, scrub-jays) displaying a 55-fold range in effective population size. We find rapid evolution of genome architecture, including ~100-megabase decreases in genome size driven by shifts in complex satellite landscapes. SVs exhibit slightly deleterious dynamics modulated by variant length and population size, with consistent evidence of adaptive fixation only in the largest population. Gene copy number variants exhibit an inverse relationship with population size, indicating strongly deleterious dynamics, with consequences for gene expression. Our long-read dataset and pangenome analysis demonstrate how population size shapes genome complexity.},
}
MeSH Terms:
*Population Density
Animals
*Genomic Structural Variation
DNA Copy Number Variations
*Evolution, Molecular
Genome Size
*Genome
RevDate: 2025-12-11
Improved reference assembly and core collection re-sequencing to facilitate exploration of important agronomical traits for the improvement of oilseed crop, Carthamus tinctorius L.
GigaScience pii:8377273 [Epub ahead of print].
BACKGROUND: Safflower (Carthamus tinctorius L.) is a drought-resilient oilseed crop. Besides producing edible oil rich in oleic and linoleic acid, it is also used in biofuels, cosmetics, colouring dyes, pharmaceuticals and nutraceuticals. Despite its significant economic uses, availability of genetic and genomic resources in safflower are limited.
RESULTS: We report an improved de novo genome assembly of safflower (Safflower_A2). A chromosome-level assembly of 1.15 Gb with telomeres and centromeric repeats, was constructed using PacBio HiFi reads, optical maps, Illumina short reads, and Hi-C sequencing. Safflower_A2 shows better contiguity, completeness, and high-quality annotation than previous assemblies. The assembly was further validated with the help of a single nucleotide polymorphism (SNP)-based linkage map. A genome-wide survey identified genes for comprehensive exploration of disease resistance in the safflower. Employing the de novo genome assembly as a reference, we used resequencing data of a global core-collection of 123 accessions to carry out a SNP-based genome-wide association study, which identified significant associations for several traits, their haplotypes of agronomic value, including seed oil content. Resequencing data was also applied for a pan-genome analysis which provided critical insights into genome diversity identifying an additional ∼11000 genes and their functional enrichment that will be useful for region-specific breeding lines.
CONCLUSION: Our study provides insights into the genomic architecture of safflower by leveraging an improved genome assembly and annotation. Additionally, resources including high-density linkage map, marker-trait associations, and pan-genome developed in this study provide valuable resources for use in breeding and crop improvement programs by the global research community.
Additional Links: PMID-41378979
Citation:
@article {pmid41378979,
year = {2025},
author = {Sharma, M and Bhardwaj, V and Oraon, PK and Choudhary, S and Ambreen, H and Shukla, RN and Jamedar, HR and Vijjeswarapu, A and Jaiswal, V and Kadirvel, P and Jagannath, A and Goel, S},
title = {Improved reference assembly and core collection re-sequencing to facilitate exploration of important agronomical traits for the improvement of oilseed crop, Carthamus tinctorius L.},
journal = {GigaScience},
volume = {},
number = {},
pages = {},
doi = {10.1093/gigascience/giaf151},
pmid = {41378979},
issn = {2047-217X},
abstract = {BACKGROUND: Safflower (Carthamus tinctorius L.) is a drought-resilient oilseed crop. Besides producing edible oil rich in oleic and linoleic acid, it is also used in biofuels, cosmetics, colouring dyes, pharmaceuticals and nutraceuticals. Despite its significant economic uses, availability of genetic and genomic resources in safflower are limited.
RESULTS: We report an improved de novo genome assembly of safflower (Safflower_A2). A chromosome-level assembly of 1.15 Gb with telomeres and centromeric repeats, was constructed using PacBio HiFi reads, optical maps, Illumina short reads, and Hi-C sequencing. Safflower_A2 shows better contiguity, completeness, and high-quality annotation than previous assemblies. The assembly was further validated with the help of a single nucleotide polymorphism (SNP)-based linkage map. A genome-wide survey identified genes for comprehensive exploration of disease resistance in the safflower. Employing the de novo genome assembly as a reference, we used resequencing data of a global core-collection of 123 accessions to carry out a SNP-based genome-wide association study, which identified significant associations for several traits, their haplotypes of agronomic value, including seed oil content. Resequencing data was also applied for a pan-genome analysis which provided critical insights into genome diversity identifying an additional ∼11000 genes and their functional enrichment that will be useful for region-specific breeding lines.
CONCLUSION: Our study provides insights into the genomic architecture of safflower by leveraging an improved genome assembly and annotation. Additionally, resources including high-density linkage map, marker-trait associations, and pan-genome developed in this study provide valuable resources for use in breeding and crop improvement programs by the global research community.},
}
RevDate: 2025-12-16
Establishing conserved biosynthetic gene clusters of the phylum Myxococcota.
Applied and environmental microbiology [Epub ahead of print].
UNLABELLED: A surge in sequenced myxobacteria catalyzed by advancements in long-read genome and metagenome sequencing has provided sufficient data to scrutinize the conserved biosynthetic gene clusters (BGCs) within the phylum Myxococcota. Given the utility of myxobacteria in environmental nutrient cycles and discovery of novel therapeutic leads, we sought to determine any conserved specialized metabolism in the phylum. Using a pan-genome approach to analyze 11 genera and 195 sequenced genomes, including 10 newly reported myxobacterial isolates, we observed five conserved BGCs. All five clusters encode characterized metabolites with established ecological roles for four of the metabolites, and none of the metabolites are known toxins. Validation of our approach was done by analyzing Myxococcota genera without sufficient sequenced representatives for pan-genome analysis to observe the presence/absence of these five clusters. This approach enabled observation of genus-level conservation of BGCs with varying degrees of confidence due to the diversity of sequenced species within each genus. The indigoidine BGC typically found in Streptomyces spp. was notably conserved in Melittangium; heterologous expression of the core biosynthetic gene bspA in Escherichia coli and subsequent detection of indigoidine confirmed the identity of the indigoidine cluster. Conserved BGCs in myxobacteria reveal maintenance of biosynthetic pathways and cognate metabolites with ecological roles as chemical signals and stress response; these observations suggest competitive specialization of secondary metabolism and toxin production in myxobacteria.
IMPORTANCE: Critical contributions to soil nutrient cycles by predatory bacteria, including the Myxococcota, and utility as a resource for the discovery of novel enzymology and metabolism motivate continued isolation and characterization of myxobacteria from the environment. Each of these motivating factors involves specialized metabolites produced by myxobacteria and the biosynthetic gene clusters (BGCs) responsible for their assembly. Primarily associated with the predatory lifestyles of myxobacteria, myxobacterial specialized metabolites have been pursued as therapeutic leads for novel antibacterials, antifungals, anthelmintics, and cancer therapies. Despite these efforts and the observation that nearly all genera within the Myxococcota have an extraordinary number of BGCs, there is no consensus view of the conserved BGCs in the phylum. Our study revealed the core BGCs consistently present throughout the phylum. By reporting these core specialized metabolites and their ecological roles, we hope to streamline the discovery and investigation of specialized metabolism in myxobacteria.
Additional Links: PMID-41378891
@article {pmid41378891,
year = {2025},
author = {Khanal Pokharel, S and Shehata, N and Ahearne, A and Knehans, T and Bailey, CB and Boudreau, PD and Stevens, DC},
title = {Establishing conserved biosynthetic gene clusters of the phylum Myxococcota.},
journal = {Applied and environmental microbiology},
volume = {},
number = {},
pages = {e0215125},
doi = {10.1128/aem.02151-25},
pmid = {41378891},
issn = {1098-5336},
abstract = {UNLABELLED: A surge in sequenced myxobacteria catalyzed by advancements in long-read genome and metagenome sequencing has provided sufficient data to scrutinize the conserved biosynthetic gene clusters (BGCs) within the phylum Myxococcota. Given the utility of myxobacteria in environmental nutrient cycles and discovery of novel therapeutic leads, we sought to determine any conserved specialized metabolism in the phylum. Using a pan-genome approach to analyze 11 genera and 195 sequenced genomes, including 10 newly reported myxobacterial isolates, we observed five conserved BGCs. All five clusters encode characterized metabolites with established ecological roles for four of the metabolites, and none of the metabolites are known toxins. Validation of our approach was done by analyzing Myxococcota genera without sufficient sequenced representatives for pan-genome analysis to observe the presence/absence of these five clusters. This approach enabled observation of genus-level conservation of BGCs with varying degrees of confidence due to the diversity of sequenced species within each genus. The indigoidine BGC typically found in Streptomyces spp. was notably conserved in Melittangium; heterologous expression of the core biosynthetic gene bspA in Escherichia coli and subsequent detection of indigoidine confirmed the identity of the indigoidine cluster. Conserved BGCs in myxobacteria reveal maintenance of biosynthetic pathways and cognate metabolites with ecological roles as chemical signals and stress response; these observations suggest competitive specialization of secondary metabolism and toxin production in myxobacteria.
IMPORTANCE: Critical contributions to soil nutrient cycles by predatory bacteria, including the Myxococcota, and utility as a resource for the discovery of novel enzymology and metabolism motivate continued isolation and characterization of myxobacteria from the environment. Each of these motivating factors involves specialized metabolites produced by myxobacteria and the biosynthetic gene clusters (BGCs) responsible for their assembly. Primarily associated with the predatory lifestyles of myxobacteria, myxobacterial specialized metabolites have been pursued as therapeutic leads for novel antibacterials, antifungals, anthelmintics, and cancer therapies. Despite these efforts and the observation that nearly all genera within the Myxococcota have an extraordinary number of BGCs, there is no consensus view of the conserved BGCs in the phylum. Our study revealed the core BGCs consistently present throughout the phylum. By reporting these core specialized metabolites and their ecological roles, we hope to streamline the discovery and investigation of specialized metabolism in myxobacteria.},
}
RevDate: 2025-12-13
CmpDate: 2025-12-10
Poultry pathogenicity, antimicrobial resistance, and one health genomic characterization of ST83 Escherichia coli.
Brazilian journal of microbiology : [publication of the Brazilian Society for Microbiology], 57(1):4.
Escherichia coli is a clinically relevant zoonotic pathogen with significant impacts on both food safety and animal/human health. This study characterized the pathogenicity and genomic features of sequence type (ST) 83 E. coli isolated from liver samples of deceased chickens in China, along with its potential for zoonotic transmission. A total of eight ST83 E. coli isolates were obtained from liver samples. Pathogenicity tests demonstrated the isolates' virulence in chickens, inducing significant pathological lesions indicative of poultry health risks. Genomic analysis indicated an alarming antimicrobial resistance gene (ARG) profile, including tet(X4) and blaNDM-5, among ST83 E. coli strains in China. The ARGs were strongly associated with mobile genetic elements (MGEs) such as IncQ1 plasmids and the insertion sequence (IS) IS903, highlighting their co-transmission potential through horizontal gene transfer. In addition, core genome phylogeny and pangenome analysis identified minimal genetic divergence between human- and animal-derived strains. This close relationship, evidenced by shared accessory genes and small SNP differences, suggests potential zoonotic transmission. Collectively, these findings indicate that ST83 E. coli is a dual-threat pathogen, being not only a virulent pathogen in poultry but also a One Health concern due to its potential for zoonotic transmission and antimicrobial resistance dissemination.
Additional Links: PMID-41370031
@article {pmid41370031,
year = {2025},
author = {Cai, X and Kang, C and Li, G and Zhang, M and Chen, X and Li, B and Li, B and Zhu, L and Wu, K and Chen, W},
title = {Poultry pathogenicity, antimicrobial resistance, and one health genomic characterization of ST83 Escherichia coli.},
journal = {Brazilian journal of microbiology : [publication of the Brazilian Society for Microbiology]},
volume = {57},
number = {1},
pages = {4},
pmid = {41370031},
issn = {1678-4405},
support = {CARS-40-S02//China Agriculture Research System of MOF and MARA/ ; },
mesh = {Animals ; *Escherichia coli/genetics/pathogenicity/drug effects/isolation & purification/classification ; *Escherichia coli Infections/veterinary/microbiology ; *Poultry Diseases/microbiology ; Chickens/microbiology ; Anti-Bacterial Agents/pharmacology ; Phylogeny ; Genome, Bacterial ; Virulence ; China ; Humans ; One Health ; *Drug Resistance, Bacterial ; Plasmids/genetics ; Escherichia coli Proteins/genetics ; Genomics ; Microbial Sensitivity Tests ; },
abstract = {Escherichia coli is a clinically relevant zoonotic pathogen with significant impacts on both food safety and animal/human health. This study characterized the pathogenicity and genomic features of sequence type (ST) 83 E. coli isolated from liver samples of deceased chickens in China, along with its potential for zoonotic transmission. A total of eight ST83 E. coli isolates were obtained from liver samples. Pathogenicity tests demonstrated the isolates' virulence in chickens, inducing significant pathological lesions indicative of poultry health risks. Genomic analysis indicated an alarming antimicrobial resistance gene (ARG) profile, including tet(X4) and blaNDM-5, among ST83 E. coli strains in China. The ARGs were strongly associated with mobile genetic elements (MGEs) such as IncQ1 plasmids and the insertion sequence (IS) IS903, highlighting their co-transmission potential through horizontal gene transfer. In addition, core genome phylogeny and pangenome analysis identified minimal genetic divergence between human- and animal-derived strains. This close relationship, evidenced by shared accessory genes and small SNP differences, suggests potential zoonotic transmission. Collectively, these findings indicate that ST83 E. coli is a dual-threat pathogen, being not only a virulent pathogen in poultry but also a One Health concern due to its potential for zoonotic transmission and antimicrobial resistance dissemination.},
}
MeSH Terms:
Animals
*Escherichia coli/genetics/pathogenicity/drug effects/isolation & purification/classification
*Escherichia coli Infections/veterinary/microbiology
*Poultry Diseases/microbiology
Chickens/microbiology
Anti-Bacterial Agents/pharmacology
Phylogeny
Genome, Bacterial
Virulence
China
Humans
One Health
*Drug Resistance, Bacterial
Plasmids/genetics
Escherichia coli Proteins/genetics
Genomics
Microbial Sensitivity Tests
RevDate: 2025-12-12
CmpDate: 2025-12-10
Three Novel Genomes Broaden the Wild Side of the Capsicum Pangenome.
Plant direct, 9(12):e70121.
This study presents three genome assemblies within the Capsicum genus, enabling comprehensive comparative analyses for the Annuum and Baccatum complexes within the genus. We produced highly continuous assemblies of the nuclear genomes and complete chloroplast assemblies. Subsequent genome annotation identified 34,580 genes in nonpungent C. annuum cv. ECW, and 32,704 and 33,994 genes in pungent C. chacoense and C. galapagoense, respectively. These assemblies, including the first complete genomes for C. chacoense and C. galapagoense, provide additional genomic resolution within the Capsicum genus. The novel genomes were analyzed within a pangenomic framework, integrating 16 Capsicum genomes across the Annuum, Baccatum, and Pubescens complexes. Homology grouping was used to identify core, accessory and unique genes and showed a wide spectrum of genetic diversity, particularly in homology groups exclusive to C. chacoense and C. galapagoense. Out of 79,267 homology groups identified, 13% were core groups, present in all accessions, corresponding to approximately 30% of core genes per genome. Comparative analyses revealed distinct species and genus-specific genomic characteristics. Additionally, we used the graph pangenome to illustrate locus-level exploration by examining the Pun1 locus associated with capsaicinoid biosynthesis, identifying multiple Pun1-like genes including their genomic position and homology information. The integration of these new resources into a dynamic Capsicum pangenome framework provides a versatile platform for extracting genetic information relevant to both fundamental research and breeding applications.
Additional Links: PMID-41368547
@article {pmid41368547,
year = {2025},
author = {Papastolopoulou, C and Nieuwenhuis, R and Warris, S and Bakker, LV and van Haarst, J and Cordewener, J and Hesselink, T and van den Broeck, H and van Dooijeweert, W and de Jong, H and Chunwongse, J and Trivino, SD and Schijlen, E and de Ridder, D and Smit, S and Peters, SA},
title = {Three Novel Genomes Broaden the Wild Side of the Capsicum Pangenome.},
journal = {Plant direct},
volume = {9},
number = {12},
pages = {e70121},
pmid = {41368547},
issn = {2475-4455},
abstract = {This study presents three genome assemblies within the Capsicum genus, enabling comprehensive comparative analyses for the Annuum and Baccatum complexes within the genus. We produced highly continuous assemblies of the nuclear genomes and complete chloroplast assemblies. Subsequent genome annotation identified 34,580 genes in nonpungent C. annuum cv. ECW, and 32,704 and 33,994 genes in pungent C. chacoense and C. galapagoense, respectively. These assemblies, including the first complete genomes for C. chacoense and C. galapagoense, provide additional genomic resolution within the Capsicum genus. The novel genomes were analyzed within a pangenomic framework, integrating 16 Capsicum genomes across the Annuum, Baccatum, and Pubescens complexes. Homology grouping was used to identify core, accessory and unique genes and showed a wide spectrum of genetic diversity, particularly in homology groups exclusive to C. chacoense and C. galapagoense. Out of 79,267 homology groups identified, 13% were core groups, present in all accessions, corresponding to approximately 30% of core genes per genome. Comparative analyses revealed distinct species and genus-specific genomic characteristics. Additionally, we used the graph pangenome to illustrate locus-level exploration by examining the Pun1 locus associated with capsaicinoid biosynthesis, identifying multiple Pun1-like genes including their genomic position and homology information. The integration of these new resources into a dynamic Capsicum pangenome framework provides a versatile platform for extracting genetic information relevant to both fundamental research and breeding applications.},
}
RevDate: 2025-12-09
Asynchronous Viral Spread of Two Unrelated Viruses Determines Lettuce Big Vein Disease Symptom Development.
Phytopathology [Epub ahead of print].
Lettuce big-vein disease (LBVD) is a major disease affecting lettuce cultivation worldwide. LBVD is caused by two unrelated negative-stranded RNA viruses, that is, Mirafiori lettuce big-vein virus (MiLBVV; Ophiovirus mirafioriense; Aspiviridae) and Lettuce big-vein associated virus (LBVaV; Varicosavirus lactucae; Rhabdoviridae), both vectored by the soilborne fungus Olpidium virulentus. Despite extensive research, a synergistic effect between the two viruses has not been observed, while both viruses individually have been suggested to be the causal agent for the disease. By performing lettuce infections using a large soil sample collection carrying LBVD-infested O. virulentus spores, the presence of LBVaV was consistently established in diseased lettuce heads, while MiLBVV infections were apparently less prevalent. Yet, aboveground infections with MiLBVV corresponded with strong disease symptoms. Strikingly, the spread of LBVaV from the root to shoot always preceded that of MiLBVV. The LBVaV systemic spread was highly synchronized between the plants, while MiLBVV spread was always delayed and less synchronized. A pan-genome analysis revealed independent segment reassortments for both viruses indicative of mixed field infections over the sampled period. Yet, RNA segment abundance was highly conserved for both viruses between all re-infections, suggesting that segment abundance has a regulatory role for the two individual viruses, but that segment abundance is not impacted by the presence of the other virus. The pan-genome analysis also revealed different evolutionary rates of the viral ORFs, suggesting that mutagenesis of certain ORFs compromises viral fitness and thus revealing a potential weak spot for both viruses.
Additional Links: PMID-41364654
@article {pmid41364654,
year = {2025},
author = {Schravesande, WEW and de Heer, PM and Heilijgers, M and Carrillo-Reche, J and Verhage, A and van den Burg, HA},
title = {Asynchronous Viral Spread of Two Unrelated Viruses Determines Lettuce Big Vein Disease Symptom Development.},
journal = {Phytopathology},
volume = {},
number = {},
pages = {},
doi = {10.1094/PHYTO-04-25-0155-R},
pmid = {41364654},
issn = {0031-949X},
abstract = {Lettuce big-vein disease (LBVD) is a major disease affecting lettuce cultivation worldwide. LBVD is caused by two unrelated negative-stranded RNA viruses, that is, Mirafiori lettuce big-vein virus (MiLBVV; Ophiovirus mirafioriense; Aspiviridae) and Lettuce big-vein associated virus (LBVaV; Varicosavirus lactucae; Rhabdoviridae), both vectored by the soilborne fungus Olpidium virulentus. Despite extensive research, a synergistic effect between the two viruses has not been observed, while both viruses individually have been suggested to be the causal agent for the disease. By performing lettuce infections using a large soil sample collection carrying LBVD-infested O. virulentus spores, the presence of LBVaV was consistently established in diseased lettuce heads, while MiLBVV infections were apparently less prevalent. Yet, aboveground infections with MiLBVV corresponded with strong disease symptoms. Strikingly, the spread of LBVaV from the root to shoot always preceded that of MiLBVV. The LBVaV systemic spread was highly synchronized between the plants, while MiLBVV spread was always delayed and less synchronized. A pan-genome analysis revealed independent segment reassortments for both viruses indicative of mixed field infections over the sampled period. Yet, RNA segment abundance was highly conserved for both viruses between all re-infections, suggesting that segment abundance has a regulatory role for the two individual viruses, but that segment abundance is not impacted by the presence of the other virus. The pan-genome analysis also revealed different evolutionary rates of the viral ORFs, suggesting that mutagenesis of certain ORFs compromises viral fitness and thus revealing a potential weak spot for both viruses.},
}
RevDate: 2025-12-08
Characterization of Enterobacter sp. C2 and its impact on ginger metabolism and defense.
Plant physiology and biochemistry : PPB, 230:110850 pii:S0981-9428(25)01378-6 [Epub ahead of print].
Ginger (Zingiber officinale Roscoe), a globally significant medicinal and culinary crop, is increasingly threatened by soil-borne diseases, leading to substantial yield losses and economic damage. Although multiple pathogens contribute to these diseases, the involvement of Enterobacter species remains poorly characterized. In this study, we isolated and characterized a novel Enterobacter sp., strain C2, from infected ginger rhizomes. Pathogenicity assays revealed that C2 predominantly colonizes stem tissue, followed by root and leaf tissues. Genomic analysis showed that the C2 genome comprises 4.58 Mb, encoding 4222 predicted proteins, with pan-genome analysis indicating that 65.73 % of these genes are core genes, including known virulence determinants. Integration of multiomic data demonstrated that C2 pathogenicity is mediated by adhesion mechanisms, regulatory gene networks, secretion systems, toxin biosynthesis, and host invasion strategies. In response to C2 infection, ginger activates a robust antioxidant defense system and undergoes significant metabolic reprogramming, particularly in nucleotide and tryptophan metabolism, ABC transporters, and photosynthesis (p < 0.01). Notably, C2 flagellin triggers the upregulation of defense-related genes in ginger, including the activation of the phenylpropanoid pathway and induction of a hypersensitive response, which collectively restrict bacterial proliferation. This study elucidates the molecular mechanisms underlying Enterobacter-mediated pathogenesis in ginger and reveals key host metabolic and defense perturbations caused by C2 infection. Our findings provide a foundation for developing targeted strategies to mitigate soil-borne diseases in ginger cultivation.
Additional Links: PMID-41360046
@article {pmid41360046,
year = {2025},
author = {Huang, K and Xu, P and Sun, X and Tu, Y and Huang, M and Li, W and Huang, X and Sui, Y},
title = {Characterization of Enterobacter sp. C2 and its impact on ginger metabolism and defense.},
journal = {Plant physiology and biochemistry : PPB},
volume = {230},
number = {},
pages = {110850},
doi = {10.1016/j.plaphy.2025.110850},
pmid = {41360046},
issn = {1873-2690},
abstract = {Ginger (Zingiber officinale Roscoe), a globally significant medicinal and culinary crop, is increasingly threatened by soil-borne diseases, leading to substantial yield losses and economic damage. Although multiple pathogens contribute to these diseases, the involvement of Enterobacter species remains poorly characterized. In this study, we isolated and characterized a novel Enterobacter sp., strain C2, from infected ginger rhizomes. Pathogenicity assays revealed that C2 predominantly colonizes stem tissue, followed by root and leaf tissues. Genomic analysis showed that the C2 genome comprises 4.58 Mb, encoding 4222 predicted proteins, with pan-genome analysis indicating that 65.73 % of these genes are core genes, including known virulence determinants. Integration of multiomic data demonstrated that C2 pathogenicity is mediated by adhesion mechanisms, regulatory gene networks, secretion systems, toxin biosynthesis, and host invasion strategies. In response to C2 infection, ginger activates a robust antioxidant defense system and undergoes significant metabolic reprogramming, particularly in nucleotide and tryptophan metabolism, ABC transporters, and photosynthesis (p < 0.01). Notably, C2 flagellin triggers the upregulation of defense-related genes in ginger, including the activation of the phenylpropanoid pathway and induction of a hypersensitive response, which collectively restrict bacterial proliferation. This study elucidates the molecular mechanisms underlying Enterobacter-mediated pathogenesis in ginger and reveals key host metabolic and defense perturbations caused by C2 infection. Our findings provide a foundation for developing targeted strategies to mitigate soil-borne diseases in ginger cultivation.},
}
RevDate: 2025-12-17
CmpDate: 2025-12-15
Rare metabolic gene essentiality is a determinant of microniche adaptation in Escherichia coli.
PLoS pathogens, 21(12):e1013775.
Rare genes in bacterial pangenomes have historically been considered non-essential, dispensable, or even costly, and largely excluded from in-depth analyses due to their perceived redundancy, high variability, and presumed neutral evolutionary origin. However, whether rare genes contribute to metabolic robustness when core genes are lost remains an open question. In this study, we systematically investigate the role of rare metabolic genes in Escherichia coli, revealing their essentiality in maintaining metabolic functions under core gene loss. Through a pangenome-scale reconstruction of 15311 strain-specific genome-scale models (panGEM) and over 22.4 million gene knockout simulations, we demonstrate that: (i) 9.4% of rare metabolic genes are essential in at least one of three key host environments (feces, serum, and urine); (ii) 41% of strains rely on at least one rare essential metabolic gene for survival; (iii) rare metabolic genes emerge as a result of microniche adaptation; and (iv) panGEM allows for the prediction of a subset of highly conserved metabolic reactions with minimal genetic diversity as stable drug targets. These findings challenge the common view that rare genes primarily serve as evolutionary byproducts of genome fluidity and reveal their critical role in metabolic resilience.
Additional Links: PMID-41359686
@article {pmid41359686,
year = {2025},
author = {Ardalani, O and Phaneuf, PV and Krishnan, J and Chauhan, SM and Pride, D and Zielinski, DC and Monk, JM and Nielsen, LK and Palsson, BO},
title = {Rare metabolic gene essentiality is a determinant of microniche adaptation in Escherichia coli.},
journal = {PLoS pathogens},
volume = {21},
number = {12},
pages = {e1013775},
pmid = {41359686},
issn = {1553-7374},
mesh = {*Escherichia coli/genetics/metabolism ; *Adaptation, Physiological/genetics ; *Genes, Essential ; Genome, Bacterial ; Escherichia coli Proteins/genetics/metabolism ; },
abstract = {Rare genes in bacterial pangenomes have historically been considered non-essential, dispensable, or even costly, and largely excluded from in-depth analyses due to their perceived redundancy, high variability, and presumed neutral evolutionary origin. However, whether rare genes contribute to metabolic robustness when core genes are lost remains an open question. In this study, we systematically investigate the role of rare metabolic genes in Escherichia coli, revealing their essentiality in maintaining metabolic functions under core gene loss. Through a pangenome-scale reconstruction of 15311 strain-specific genome-scale models (panGEM) and over 22.4 million gene knockout simulations, we demonstrate that: (i) 9.4% of rare metabolic genes are essential in at least one of three key host environments (feces, serum, and urine); (ii) 41% of strains rely on at least one rare essential metabolic gene for survival; (iii) rare metabolic genes emerge as a result of microniche adaptation; and (iv) panGEM allows for the prediction of a subset of highly conserved metabolic reactions with minimal genetic diversity as stable drug targets. These findings challenge the common view that rare genes primarily serve as evolutionary byproducts of genome fluidity and reveal their critical role in metabolic resilience.},
}
MeSH Terms:
*Escherichia coli/genetics/metabolism
*Adaptation, Physiological/genetics
*Genes, Essential
Genome, Bacterial
Escherichia coli Proteins/genetics/metabolism
RevDate: 2025-12-06
Pangenome analyses of tea plants reveal structural variations driving gene expression alterations and agronomic trait diversification.
Nature communications pii:10.1038/s41467-025-67060-5 [Epub ahead of print].
Tea plants, which are among the world's most economically important beverage crops, exhibit extensive genetic diversity and are rich in secondary metabolites. While structural variations (SVs) drive phenotypic diversification, their regulatory roles in transcriptional networks and agronomic traits remain underexplored in this perennial crop. Here, we construct a pangenome from 22 representative tea accessions and their wild relatives. Genomic SV analysis reveals that 22% of the gene promoters contain variants influencing flavonoid, amino acid, and terpenoid biosynthesis. Population SV analysis of 275 tea accessions reveals three haplotypes in the ANS3 promoter, with Hap1, containing a 192 bp insertion, predominantly found in wild relatives but largely lost in modern cultivars. This insertion increases CtANS3 expression and anthocyanin content in wild relatives. Additionally, a 159 bp insertion in the CtLRR1 promoter reduces resistance to Colletotrichum gloeosporioides in wild relatives. Our findings underscore SVs as pivotal regulators of flavor differentiation and adaptive evolution during tea plant domestication.
Additional Links: PMID-41353212
@article {pmid41353212,
year = {2025},
author = {Tao, L and Zhu, J and Hu, J and Xu, Q and Wu, J and Chen, C and Li, Y and Li, F and Chen, H and Huang, S and Zhou, Q and Zhao, Y and Hu, Y and Liu, S and Dong, K and Chen, L and Liu, B and Wan, X and Xia, E and Zhou, Y and Wei, C},
title = {Pangenome analyses of tea plants reveal structural variations driving gene expression alterations and agronomic trait diversification.},
journal = {Nature communications},
volume = {},
number = {},
pages = {},
doi = {10.1038/s41467-025-67060-5},
pmid = {41353212},
issn = {2041-1723},
support = {U20A2045//National Natural Science Foundation of China (National Science Foundation of China)/ ; },
abstract = {Tea plants, which are among the world's most economically important beverage crops, exhibit extensive genetic diversity and are rich in secondary metabolites. While structural variations (SVs) drive phenotypic diversification, their regulatory roles in transcriptional networks and agronomic traits remain underexplored in this perennial crop. Here, we construct a pangenome from 22 representative tea accessions and their wild relatives. Genomic SV analysis reveals that 22% of the gene promoters contain variants influencing flavonoid, amino acid, and terpenoid biosynthesis. Population SV analysis of 275 tea accessions reveals three haplotypes in the ANS3 promoter, with Hap1, containing a 192 bp insertion, predominantly found in wild relatives but largely lost in modern cultivars. This insertion increases CtANS3 expression and anthocyanin content in wild relatives. Additionally, a 159 bp insertion in the CtLRR1 promoter reduces resistance to Colletotrichum gloeosporioides in wild relatives. Our findings underscore SVs as pivotal regulators of flavor differentiation and adaptive evolution during tea plant domestication.},
}
RevDate: 2025-12-06
Exploration of functional genes in Brucella anthropi for hexavalent chromium bioremoval through gene mining: A whole genome sequencing based approach.
Ecotoxicology and environmental safety, 309:119472 pii:S0147-6513(25)01817-2 [Epub ahead of print].
The development of whole genome sequencing (WGS) has revolutionized the understanding of the functional genes and molecular perspectives involved in resistance to heavy metals. In the current study, the bacterial isolate VITPLJ18, obtained from the tannery industry, showed a maximum tolerable concentration of 1300 ppm against hexavalent chromium [Cr(VI)], and the specific growth rate (µmax) of the cells at 650 ppm of chromium was found to be 0.088 h⁻¹. Whole genome analysis confirmed VITPLJ18 to be Brucella anthropi, which belongs to the family Brucellaceae. Mining of the VITPLJ18 genome revealed the total number of genes to be 4468, among which key resistance genes such as chrA, chrB, chrC and chrF were further analysed through comparative gene cluster analysis across various bacteria. Pangenome analysis showed both core and accessory genes contributing to metal resistance, while phylogenetic reconstruction positioned the isolate within a distinct lineage, suggesting unique adaptive evolution. Further, upregulation of chromium resistance genes such as chrA, chrB, chrC and chrF was observed in cells treated with chromium, with gyrA serving as the housekeeping gene. SEM-EDX confirmed structural resilience and accumulation of Cr(VI) by cells under chromium stress, which was corroborated by the ICPMS study showing 40 mg/g of Cr(VI) accumulation. Collectively, these findings provide substantial insights into the resistance mechanism and bioaccumulation of Cr(VI) through the upregulation of transporter and resistance genes. Therefore, VITPLJ18 can be considered a suitable strain for the bioremoval of Cr(VI).
Additional Links: PMID-41352260
@article {pmid41352260,
year = {2025},
author = {Dey, P and Osborne, JW and Lincy Kirubhadharsini, B},
title = {Exploration of functional genes in Brucella anthropi for hexavalent chromium bioremoval through gene mining: A whole genome sequencing based approach.},
journal = {Ecotoxicology and environmental safety},
volume = {309},
number = {},
pages = {119472},
doi = {10.1016/j.ecoenv.2025.119472},
pmid = {41352260},
issn = {1090-2414},
abstract = {Development of whole genome sequencing (WGS) has revolutionized in understanding the functional genes and molecular perspectives involved in the resistance of heavy metals. In the current study, VITPLJ18 bacterial isolate obtained from tannery industry showed a maximum tolerable concentration of 1300 ppm against hexavalent chromium [Cr(VI)] and specific growth rate (µmax) for the cells at 650 ppm of chromium was found to be 0.088 h⁻¹. Whole genome analysis confirmed VITPLJ18 to be Brucella anthropi which belonged to the family Brucellaceae. Mining of VITPLJ18 genome revealed the total number of genes to be 4468, among which key resistance genes such as chrA, chrB, chrC and chrF were further analysed through comparative gene cluster analysis across various bacteria. Pangenome analysis showed both core and accessory genes contributing to metal resistance, while phylogenetic reconstruction positioned the isolate within a distinct lineage, suggesting unique adaptive evolution. Further, upregulation of chromium resistant genes such as chrA, chrB, chrC and chrF was observed in cells treated with chromium with gyrA serving as housekeeping gene. SEM-EDX confirmed structural resilience and accumulation of Cr(VI) by cells under chromium stress which corroborated with the ICPMS study showing 40 mg/g of Cr(VI) accumulation. Collectively, these findings provide substantial insights into the resistance mechanism and bioaccumulation of Cr(VI) through the upregulation of transporter and resistant genes. Therefore, VITPLJ18 can be considered as a suitable strain for the bioremoval of Cr(VI).},
}
RevDate: 2025-12-25
CmpDate: 2025-12-23
Columba: fast approximate pattern matching with optimized search schemes.
Bioinformatics (Oxford, England), 41(12):.
MOTIVATION: Aligning sequencing reads to reference genomes is a fundamental task in bioinformatics. Aligners can be classified as lossy or lossless: lossy aligners prioritize speed by reporting only one or a few high-scoring alignments, whereas lossless aligners output all optimal alignments, ensuring completeness and sensitivity.
RESULTS: This paper introduces Columba, a high-performance lossless aligner tailored for Illumina sequencing data. Columba processes single or paired-end reads in FASTQ format and outputs alignments in SAM format. By utilizing advanced search schemes and bit-parallel alignment techniques, Columba achieves exceptional speed. Columba is available in two variants. The first, based on the bidirectional FM-index, prioritizes speed. The second, Columba RLC, uses run-length compression using a bidirectional move structure, significantly reducing memory usage for large, repetitive datasets like pan-genomes. Benchmarks on the human genome, as well as bacterial and human pan-genome datasets, demonstrate that Columba is much faster than existing lossless aligners and even competitive with lossy tools. We integrated Columba into the OptiType HLA genotyping pipeline, where it substantially reduced computational time while maintaining accuracy. These results position Columba as a versatile, state-of-the-art tool for high-sensitivity genomic analyses.
The source code of Columba is available at https://github.com/biointec/columba under AGPL license. Scripts to reproduce the benchmarks and analyses are available at https://doi.org/10.5281/zenodo.15849246.
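For readers unfamiliar with the FM-index mentioned in the abstract above, the following is a minimal sketch of the classic backward search over a plain (unidirectional) FM-index. It handles exact matches only and builds the index naively. It is not Columba's implementation, which according to the abstract relies on a bidirectional index, search schemes and bit-parallel alignment. All function names are illustrative, and the text is assumed not to contain the `$` sentinel.

```python
def build_fm_index(text: str):
    """Build a naive FM-index (suffix array, C table, occurrence table)."""
    text += "$"  # sentinel; assumed absent from the input and smaller than A/C/G/T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # i == 0 wraps around to the sentinel

    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1

    # c_table[c]: number of characters in the text strictly smaller than c.
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, c_table, occ


def find_exact(pattern: str, sa, c_table, occ):
    """Return sorted start positions of exact occurrences of pattern."""
    lo, hi = 0, len(sa)  # half-open interval over the suffix array
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


index = build_fm_index("ACGTACGTTACG")
print(find_exact("ACG", *index))
```

Each pattern character costs a constant number of table lookups, which is why FM-index search time depends on the pattern length and not on the size of the reference.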
Additional Links: PMID-41349000
@article {pmid41349000,
year = {2025},
author = {Renders, L and Depuydt, L and Gagie, T and Fostier, J},
title = {Columba: fast approximate pattern matching with optimized search schemes.},
journal = {Bioinformatics (Oxford, England)},
volume = {41},
number = {12},
pages = {},
pmid = {41349000},
issn = {1367-4811},
support = {1SE7822N//Research Foundation-Flanders (FWO)/ ; 1117322N//Research Foundation-Flanders (FWO)/ ; },
mesh = {*Software ; Humans ; *Sequence Alignment/methods ; Genome, Human ; Algorithms ; High-Throughput Nucleotide Sequencing/methods ; *Sequence Analysis, DNA/methods ; Computational Biology/methods ; },
abstract = {MOTIVATION: Aligning sequencing reads to reference genomes is a fundamental task in bioinformatics. Aligners can be classified as lossy or lossless: lossy aligners prioritize speed by reporting only one or a few high-scoring alignments, whereas lossless aligners output all optimal alignments, ensuring completeness and sensitivity.
RESULTS: This paper introduces Columba, a high-performance lossless aligner tailored for Illumina sequencing data. Columba processes single or paired-end reads in FASTQ format and outputs alignments in SAM format. By utilizing advanced search schemes and bit-parallel alignment techniques, Columba achieves exceptional speed. Columba is available in two variants. The first, based on the bidirectional FM-index, prioritizes speed. The second, Columba RLC, uses run-length compression using a bidirectional move structure, significantly reducing memory usage for large, repetitive datasets like pan-genomes. Benchmarks on the human genome, as well as bacterial and human pan-genome datasets, demonstrate that Columba is much faster than existing lossless aligners and even competitive with lossy tools. We integrated Columba into the OptiType HLA genotyping pipeline, where it substantially reduced computational time while maintaining accuracy. These results position Columba as a versatile, state-of-the-art tool for high-sensitivity genomic analyses.
The source code of Columba is available at https://github.com/biointec/columba under AGPL license. Scripts to reproduce the benchmarks and analyses are available at https://doi.org/10.5281/zenodo.15849246.},
}
MeSH Terms:
*Software
Humans
*Sequence Alignment/methods
Genome, Human
Algorithms
High-Throughput Nucleotide Sequencing/methods
*Sequence Analysis, DNA/methods
Computational Biology/methods
RevDate: 2025-12-05
CmpDate: 2025-12-05
Genome-based reclassification of Virgibacillus kapii Daroonpunt et al. 2016 and Virgibacillus massiliensis Khelaifia et al. 2023 as later heterotypic synonyms of Virgibacillus salexigens (Garabito et al. 1997) Heyrman et al. 2003.
International journal of systematic and evolutionary microbiology, 75(12):.
In the present study, genome-based metrics, phylogenetic and pangenomic analyses were used to investigate the relationship between three closely related species within the genus Virgibacillus (Virgibacillus salexigens, Virgibacillus kapii and Virgibacillus massiliensis). While 16S rRNA gene comparison revealed high similarity, overall genomic relatedness indices, including digital DNA-DNA hybridization (dDDH) and average nucleotide identity (ANI), exceeded the species delineation thresholds (ANI: 95-96 %, dDDH: 70%), indicating an even closer genetic relationship. Additionally, the phylogenomic analysis, along with pan-genome analysis, further supported the close relationship, placing the three species in a well-supported branch with a significant proportion of 3,809 shared core genes. Based on the combined evidence from genomic and phenotypic comparison, we propose reclassifying V. kapii Daroonpunt et al. 2016 and V. massiliensis Khelaifia et al. 2023 as later heterotypic synonyms of V. salexigens (Garabito et al. 1997) Heyrman et al. 2003. This study underscores the effectiveness of genome-based methods for accurate and reliable bacterial classification.
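The species-delineation rule cited in the abstract above can be sketched as a simple threshold check. The cutoffs are taken from the abstract (ANI 95-96 %, dDDH 70 %). Using the lower ANI bound, treating the comparison as "greater than or equal", and the example values are all assumptions made for illustration; none of them are results from the study.

```python
ANI_THRESHOLD = 95.0   # percent; lower bound of the 95-96 % range quoted above
DDDH_THRESHOLD = 70.0  # percent


def same_species(ani: float, dddh: float) -> bool:
    """True if both relatedness indices reach the delineation thresholds."""
    return ani >= ANI_THRESHOLD and dddh >= DDDH_THRESHOLD


# Hypothetical pairwise values, NOT taken from the paper.
print(same_species(ani=98.2, dddh=84.0))  # both indices above threshold
print(same_species(ani=91.5, dddh=45.0))  # both indices below threshold
```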
Additional Links: PMID-41348451
@article {pmid41348451,
year = {2025},
author = {Menasria, T and Zaatout, N},
title = {Genome-based reclassification of Virgibacillus kapii Daroonpunt et al. 2016 and Virgibacillus massiliensis Khelaifia et al. 2023 as later heterotypic synonyms of Virgibacillus salexigens (Garabito et al. 1997) Heyrman et al. 2003.},
journal = {International journal of systematic and evolutionary microbiology},
volume = {75},
number = {12},
pages = {},
doi = {10.1099/ijsem.0.006990},
pmid = {41348451},
issn = {1466-5034},
mesh = {*Genome, Bacterial ; RNA, Ribosomal, 16S/genetics ; *Phylogeny ; DNA, Bacterial/genetics ; Nucleic Acid Hybridization ; *Virgibacillus/classification/genetics ; Sequence Analysis, DNA ; Bacterial Typing Techniques ; Genomics ; },
abstract = {In the present study, genome-based metrics, phylogenetic and pangenomic analyses were used to investigate the relationship between three closely related species within the genus Virgibacillus (Virgibacillus salexigens, Virgibacillus kapii and Virgibacillus massiliensis). While 16S rRNA gene comparison revealed high similarity, overall genomic relatedness indices, including digital DNA-DNA hybridization (dDDH) and average nucleotide identity (ANI), exceeded the species delineation thresholds (ANI: 95-96 %, dDDH: 70%), indicating an even closer genetic relationship. Additionally, the phylogenomic analysis, along with pan-genome analysis, further supported the close relationship, placing the three species in a well-supported branch with a significant proportion of 3,809 shared core genes. Based on the combined evidence from genomic and phenotypic comparison, we propose reclassifying V. kapii Daroonpunt et al. 2016 and V. massiliensis Khelaifia et al. 2023 as later heterotypic synonyms of V. salexigens (Garabito et al. 1997) Heyrman et al. 2003. This study underscores the effectiveness of genome-based methods for accurate and reliable bacterial classification.},
}
MeSH Terms:
*Genome, Bacterial
RNA, Ribosomal, 16S/genetics
*Phylogeny
DNA, Bacterial/genetics
Nucleic Acid Hybridization
*Virgibacillus/classification/genetics
Sequence Analysis, DNA
Bacterial Typing Techniques
Genomics
RevDate: 2025-12-05
Col-BWT: Pangenomic Seed Chaining with Maximal Matches Improves Read Classification.
Journal of computational biology : a journal of computational molecular cell biology [Epub ahead of print].
Given a set of exact matches, chaining consists of finding those which are colinear with respect to the reference. However, this requires superlinear time in the number of matches and is not suitable for approaches such as read classification using a pangenome due to the increased multiplicity of matches. While compressed full-text indexes enable efficient read classification against a pangenome or tree-of-life index, past work on compressed index classification captures only fine-grained colinearity of exact matches. This fails to capture whether seeds appear colinearly in the reference, which we denote as "coarse-grained colinearity," as traditionally described by chaining. We present a novel data structure, col-BWT (for colinear Burrows-Wheeler Transform [BWT]), that additionally obtains coarse-grained colinearity ("chain") statistics. We start with a collection of strings, avoiding the multiple-alignment step required by graph approaches. We rapidly compute multi-maximal unique matches (multi-MUMs) and identify BWT sub-runs that correspond to these multi-MUMs. From these, we select those that can be "tunneled," in that they satisfy the concept of a tunnel in BWT compression, and mark them with their corresponding multi-MUM identifier. We call this approach the col-BWT due to its ability to recognize colinearity with respect to the reference. This yields an O(r + n/d)-space index for a collection of d sequences having a length-n BWT consisting of r maximal equal-character runs. Using col-BWT, we simultaneously compute both fine-grained statistics and coarse-grained chain statistics in linear time with respect to query length, faster than traditional superlinear chaining approaches. We found that this substantially improves classification accuracy compared with past compressed-indexing approaches and reaches the same level of accuracy as less efficient alignment-based methods.
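The space bound in the abstract above is stated in terms of n, the BWT length, and r, the number of maximal equal-character runs in the BWT. The sketch below computes both quantities for a tiny collection with a naive sorted-rotations construction. It is only meant to make the quantities concrete and is not the col-BWT data structure. The function names and the separator choice (`$` between sequences, a NUL terminator) are assumptions.

```python
def bwt(text: str) -> str:
    """Burrows-Wheeler Transform via sorted rotations (tiny inputs only)."""
    if "\0" in text:
        raise ValueError("input must not contain the NUL terminator")
    s = text + "\0"  # terminator sorts before every other character
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def count_runs(s: str) -> int:
    """Number of maximal runs of equal characters (the r in the abstract)."""
    return sum(1 for i, ch in enumerate(s) if i == 0 or ch != s[i - 1])


def collection_bwt(sequences: list[str]) -> str:
    """BWT of a collection, joined with '$' separators (an assumption)."""
    return bwt("$".join(sequences))


# Four identical toy sequences stand in for a highly repetitive pangenome.
b = collection_bwt(["ACGTACGT"] * 4)
print(f"n = {len(b)}, r = {count_runs(b)}")
```

On repetitive collections, similar contexts sort next to each other, so the BWT has far fewer runs than characters. Run-length-compressed indexes exploit exactly this property.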
Additional Links: PMID-41346014
@article {pmid41346014,
year = {2025},
author = {Brown, NK and Shivakumar, VS and Langmead, B},
title = {Col-BWT: Pangenomic Seed Chaining with Maximal Matches Improves Read Classification.},
journal = {Journal of computational biology : a journal of computational molecular cell biology},
volume = {},
number = {},
pages = {},
doi = {10.1177/15578666251400784},
pmid = {41346014},
issn = {1557-8666},
abstract = {Given a set of exact matches, chaining consists of finding those which are colinear with respect to the reference. However, this requires superlinear time in the number of matches and is not suitable for approaches such as read classification using a pangenome due to the increased multiplicity of matches. While compressed full-text indexes enable efficient read classification against a pangenome or tree-of-life index, past work on compressed index classification captures only fine-grained colinearity of exact matches. This fails to capture whether seeds appear colinearly in the reference, which we denote as "coarse-grained colinearity," as traditionally described by chaining. We present a novel data structure, col-BWT (for colinear Burrows-Wheeler Transform [BWT]), that additionally obtains coarse-grained colinearity ("chain") statistics. We start with a collection of strings, avoiding the multiple-alignment step required by graph approaches. We rapidly compute multi-maximal unique matches (multi-MUMs) and identify BWT sub-runs that correspond to these multi-MUMs. From these, we select those that can be "tunneled," in that they satisfy the concept of a tunnel in BWT compression, and mark them with their corresponding multi-MUM identifier. We call this approach the col-BWT due to its ability to recognize colinearity with respect to the reference. This yields an O(r + n/d)-space index for a collection of d sequences having a length-n BWT consisting of r maximal equal-character runs. Using col-BWT, we simultaneously compute both fine-grained statistics and coarse-grained chain statistics in linear time with respect to query length, faster than traditional superlinear chaining approaches. We found that this substantially improves classification accuracy compared with past compressed-indexing approaches and reaches the same level of accuracy as less efficient alignment-based methods.},
}
RevDate: 2025-12-20
CmpDate: 2025-12-18
Pangenomes as a framework for adaptive radiation, speciation, and adaptation.
American journal of botany, 112(12):e70130.
Understanding the genomic basis of diversification is a central goal in evolutionary biology. In recent years, the development and use of pangenomes, a genomic representation of multiple individuals within a lineage (a set of related populations, subspecies, ecotypes, or species), has enabled researchers to differentiate between DNA sequences shared by all individuals of a given lineage (core regions) from those present only in some individuals (accessory or variable regions). Differentiating between core and accessory regions has highlighted a key limitation of relying on a single reference genome: It captures the genetic code of only one individual and this biases genomic analyses and our understanding of diversification. Here, we propose that by identifying genes associated with both core and accessory regions, we can deepen our understanding of the processes underlying diversification. We suggest that analyzing pangenomes and accessory regions will provide deeper insights into diversification, hybridization, and the genetic basis of adaptation and speciation.
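The core/accessory distinction described in the abstract above can be sketched with plain set operations on a presence/absence listing. The individuals and gene names below are hypothetical, and the sketch works at the level of gene identifiers, whereas real pangenome analyses first have to cluster or align sequences.

```python
def partition_pangenome(gene_sets: dict[str, set[str]]):
    """Split the pangenome into core (in all) and accessory (in some) genes."""
    if not gene_sets:
        return set(), set()
    all_sets = list(gene_sets.values())
    pan = set().union(*all_sets)         # every gene seen in any individual
    core = set.intersection(*all_sets)   # genes shared by all individuals
    accessory = pan - core               # genes missing from at least one
    return core, accessory


# Hypothetical presence/absence data for three individuals of one lineage.
individuals = {
    "ind1": {"geneA", "geneB", "geneC"},
    "ind2": {"geneA", "geneB", "geneD"},
    "ind3": {"geneA", "geneB"},
}
core, accessory = partition_pangenome(individuals)
print(sorted(core), sorted(accessory))
```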
Additional Links: PMID-41345855
@article {pmid41345855,
year = {2025},
author = {Tellini, N and Tørresen, OK and Edwards, D and Rieseberg, LH and Jakobsen, KS and Cerca, J},
title = {Pangenomes as a framework for adaptive radiation, speciation, and adaptation.},
journal = {American journal of botany},
volume = {112},
number = {12},
pages = {e70130},
pmid = {41345855},
issn = {1537-2197},
mesh = {*Genetic Speciation ; *Genome, Plant/genetics ; *Adaptation, Physiological/genetics ; *Adaptation, Biological/genetics ; *Plants/genetics ; },
abstract = {Understanding the genomic basis of diversification is a central goal in evolutionary biology. In recent years, the development and use of pangenomes, a genomic representation of multiple individuals within a lineage (a set of related populations, subspecies, ecotypes, or species), has enabled researchers to differentiate between DNA sequences shared by all individuals of a given lineage (core regions) from those present only in some individuals (accessory or variable regions). Differentiating between core and accessory regions has highlighted a key limitation of relying on a single reference genome: It captures the genetic code of only one individual and this biases genomic analyses and our understanding of diversification. Here, we propose that by identifying genes associated with both core and accessory regions, we can deepen our understanding of the processes underlying diversification. We suggest that analyzing pangenomes and accessory regions will provide deeper insights into diversification, hybridization, and the genetic basis of adaptation and speciation.},
}
MeSH Terms:
*Genetic Speciation
*Genome, Plant/genetics
*Adaptation, Physiological/genetics
*Adaptation, Biological/genetics
*Plants/genetics
RevDate: 2025-12-04
Hybrid genome sequence of Cryptococcus neoformans of Indian origin and comparative genome analysis.
Scientific reports pii:10.1038/s41598-025-25672-3 [Epub ahead of print].
Human opportunistic fungi C. neoformans are a global health concern as they cause severe pneumonia and cryptococcal meningitis in tropical and subtropical regions. Though the occurrence of infection has reduced from 2014 to 2022, the mortality rate remains high, and it is considered a neglected tropical disease. More studies are needed to elucidate the genomic architecture and functional traits required to underpin the disease management of Cryptococcus neoformans. This study showcases the first complete genome sequence of an Indian isolate of Cryptococcus, identified as subsp. grubii containing 4.8% of virulence genes in a total genome. A comparative virulome analysis revealed that study genomes Cn, H99 and JEC21 harbor crucial virulence genes associated with infection severity. While all study strains possess genes promoting antifungal resistance (AFR), most lack specific SNPs defining AFR. The whole genome Single Nucleotide Polymorphism (SNP) analysis categorized the strains as hyper- and non-hypermutators. The phylogenomic and pan-genome analysis revealed that the study strains share a core genome, highlighting evolutionary relatedness. The study shows gene expansion and contraction between the study strains. This pioneering study presents Cryptococcus's first comprehensive genomic analysis from India, integrating genetic variations, virulome, resistome, and pan-genome dynamics. The study highlights the necessity of bioinformatic tools for efficient genome analysis and large-scale comparative genome research.
Additional Links: PMID-41345134
@article {pmid41345134,
year = {2025},
author = {Sathiyamoorthy, J and Ramakrishnan, J},
title = {Hybrid genome sequence of Cryptococcus neoformans of Indian origin and comparative genome analysis.},
journal = {Scientific reports},
volume = {},
number = {},
pages = {},
doi = {10.1038/s41598-025-25672-3},
pmid = {41345134},
issn = {2045-2322},
abstract = {Human opportunistic fungi C. neoformans are a global health concern as they cause severe pneumonia and cryptococcal meningitis in tropical and subtropical regions. Though the occurrence of infection has reduced from 2014 to 2022, the mortality rate remains high, and it is considered a neglected tropical disease. More studies are needed to elucidate the genomic architecture and functional traits required to underpin the disease management of Cryptococcus neoformans. This study showcases the first complete genome sequence of an Indian isolate of Cryptococcus, identified as subsp. grubii containing 4.8% of virulence genes in a total genome. A comparative virulome analysis revealed that study genomes Cn, H99 and JEC21 harbor crucial virulence genes associated with infection severity. While all study strains possess genes promoting antifungal resistance (AFR), most lack specific SNPs defining AFR. The whole genome Single Nucleotide Polymorphism (SNP) analysis categorized the strains as hyper- and non-hypermutators. The phylogenomic and pan-genome analysis revealed that the study strains share a core genome, highlighting evolutionary relatedness. The study shows gene expansion and contraction between the study strains. This pioneering study presents Cryptococcus's first comprehensive genomic analysis from India, integrating genetic variations, virulome, resistome, and pan-genome dynamics. The study highlights the necessity of bioinformatic tools for efficient genome analysis and large-scale comparative genome research.},
}
RevDate: 2025-12-07
CmpDate: 2025-12-04
Toward a standardized framework for pangenome graph evaluation: assessing crop plant pangenome variation graph construction from multiple assemblies.
GigaScience, 14:.
BACKGROUND: Pangenomes are crucial for understanding species-wide genetic diversity, delineating core and variable genes. This study compares 3 key pangenome graph assembly pipelines: Minigraph, PGGB, and Minigraph-Cactus, using publicly available Sorghum data. We introduce tailored metrics for comprehensive pangenome graph evaluation, including completeness, duplication levels, and fidelity of structural variants.
RESULTS: By assessing the tools on Sorghum datasets, we gauge their efficacy in handling diverse genomic features. The analysis provides detailed insights into the strengths and limitations of Minigraph, PGGB, and Minigraph-Cactus, aiding researchers in informed tool selection. The metrics developed contribute to standardizing pangenome graph assessments, enabling robust and objective tool comparisons. We further demonstrate the utility of the metrics by applying them to pangenome graphs of 3 crops: soybean, barley, and oilseed rape.
CONCLUSIONS: This benchmarking study advances our understanding of pangenome assembly tools and establishes a foundation for standardized evaluation metrics. We plan to further use these insights to optimize tool selection for specific applications, such as genome-wide association studies, improving the accuracy of downstream analyses.
Additional Links: PMID-41342577
@article {pmid41342577,
year = {2025},
author = {Kopalli, V and Arslan, K and Morales-Díaz, N and Zanini, SF and Golicz, AA},
title = {Toward a standardized framework for pangenome graph evaluation: assessing crop plant pangenome variation graph construction from multiple assemblies.},
journal = {GigaScience},
volume = {14},
number = {},
pages = {},
pmid = {41342577},
issn = {2047-217X},
support = {MCIU/AEI/10.13039/501100011033//German Research Foundation/ ; 497667402//German Research Foundation/ ; PRE2020-095111//ESF/ ; },
mesh = {*Genome, Plant ; *Crops, Agricultural/genetics ; *Genomics/methods/standards ; Sorghum/genetics ; *Genetic Variation ; Software ; Hordeum/genetics ; Glycine max/genetics ; },
abstract = {BACKGROUND: Pangenomes are crucial for understanding species-wide genetic diversity, delineating core and variable genes. This study compares 3 key pangenome graph assembly pipelines: Minigraph, PGGB, and Minigraph-Cactus, using publicly available Sorghum data. We introduce tailored metrics for comprehensive pangenome graph evaluation, including completeness, duplication levels, and fidelity of structural variants.
RESULTS: By assessing the tools on Sorghum datasets, we gauge their efficacy in handling diverse genomic features. The analysis provides detailed insights into the strengths and limitations of Minigraph, PGGB, and Minigraph-Cactus, aiding researchers in informed tool selection. The metrics developed contribute to standardizing pangenome graph assessments, enabling robust and objective tool comparisons. We further demonstrate the utility of the metrics by applying them to pangenome graphs of 3 crops: soybean, barley, and oilseed rape.
CONCLUSIONS: This benchmarking study advances our understanding of pangenome assembly tools and establishes a foundation for standardized evaluation metrics. We plan to further use these insights to optimize tool selection for specific applications, such as genome-wide association studies, improving the accuracy of downstream analyses.},
}
MeSH Terms:
*Genome, Plant
*Crops, Agricultural/genetics
*Genomics/methods/standards
Sorghum/genetics
*Genetic Variation
Software
Hordeum/genetics
Glycine max/genetics
RevDate: 2025-12-24
CmpDate: 2025-12-17
Discovery and Evolutionary Analysis of a Novel Genus of Endogenous Pararetroviruses with Long Terminal Repeats.
Genome biology and evolution, 17(12):.
Despite being widespread in plants, endogenous pararetroviruses (EPRVs) are still poorly understood in barley and many other cereal crops. In this study, the barley reference genome was examined and a new EPRV was identified and named Hvu-EPRV. In contrast to all EPRVs identified thus far, Hvu-EPRV contains long terminal repeats (LTRs) which are similar to LTR retrotransposons. Homologous sequences of Hvu-EPRV were found in a wide range of plants, however, only those in 17 grasses belonging to the six tribes contain LTRs. The insertion times of nested LTR retrotransposons indicated that Hvu-EPRVs inserted into barley more than 2.37 million years ago, but the invasion and endogenization of Hvu-EPRV related elements in the grass family may be ancient, and horizontal transfers may have occurred between grasses. Phylogenetic analysis revealed that Hvu-EPRV and its homologs in grasses were grouped apart from all 13 reported genera of exogenous and EPRVs, thus the EPRVs in grasses represent a novel genus of the Caulimoviridae family named Idahovirus. Genome-wide comparisons of Hvu-EPRVs were conducted between the reference genome and other 84 genomes of cultivated and wild barley, three independent integration events were observed and suggested that the integrations likely occurred after the divergence between barley and its wild progenitor. This is the first time to identify EPRVs with LTRs and to detect their recent integrations, and this research provides new insights into the evolution of plant EPRVs and their invasion history in the grass family.
Additional Links: PMID-41340553
@article {pmid41340553,
year = {2025},
author = {Gao, D and Tressel, LG},
title = {Discovery and Evolutionary Analysis of a Novel Genus of Endogenous Pararetroviruses with Long Terminal Repeats.},
journal = {Genome biology and evolution},
volume = {17},
number = {12},
pages = {},
pmid = {41340553},
issn = {1759-6653},
support = {//U.S. Department of Agriculture, Agricultural Research Service/ ; },
mesh = {*Terminal Repeat Sequences ; Phylogeny ; *Evolution, Molecular ; *Hordeum/virology/genetics ; Retroelements ; *Endogenous Retroviruses/genetics/classification ; *Caulimoviridae/genetics/classification ; Genome, Plant ; },
abstract = {Despite being widespread in plants, endogenous pararetroviruses (EPRVs) are still poorly understood in barley and many other cereal crops. In this study, the barley reference genome was examined and a new EPRV was identified and named Hvu-EPRV. In contrast to all EPRVs identified thus far, Hvu-EPRV contains long terminal repeats (LTRs) which are similar to LTR retrotransposons. Homologous sequences of Hvu-EPRV were found in a wide range of plants, however, only those in 17 grasses belonging to the six tribes contain LTRs. The insertion times of nested LTR retrotransposons indicated that Hvu-EPRVs inserted into barley more than 2.37 million years ago, but the invasion and endogenization of Hvu-EPRV related elements in the grass family may be ancient, and horizontal transfers may have occurred between grasses. Phylogenetic analysis revealed that Hvu-EPRV and its homologs in grasses were grouped apart from all 13 reported genera of exogenous and EPRVs, thus the EPRVs in grasses represent a novel genus of the Caulimoviridae family named Idahovirus. Genome-wide comparisons of Hvu-EPRVs were conducted between the reference genome and other 84 genomes of cultivated and wild barley, three independent integration events were observed and suggested that the integrations likely occurred after the divergence between barley and its wild progenitor. This is the first time to identify EPRVs with LTRs and to detect their recent integrations, and this research provides new insights into the evolution of plant EPRVs and their invasion history in the grass family.},
}
MeSH Terms:
*Terminal Repeat Sequences
Phylogeny
*Evolution, Molecular
*Hordeum/virology/genetics
Retroelements
*Endogenous Retroviruses/genetics/classification
*Caulimoviridae/genetics/classification
Genome, Plant
RevDate: 2025-12-03
Gramene 2025: expanded comparative genomics and pathway resources, integrated search, and pan-genome portals for crop research.
Nucleic acids research pii:8363844 [Epub ahead of print].
Gramene (gramene.org) is a comprehensive reference database for comparative plant genomics and pathway analysis, integrating functional annotations, evidence-based curated pathways and their projections, and multi-omics datasets. Since our last report, Gramene has added crop-specific pan-genome portals for maize, sorghum, rice, and grapevine. These pan-genome portals host population-scale datasets and multiple assembled genomes per species, all anchored by shared reference genomes. Importantly, these portals now adopt standardized rsIDs for genetic variants, advancing FAIR data principles and enabling cross-database interoperability. The main site is now Gramene Plants, emphasizing its broad genome coverage. Release 69 features 233 reference genomes, curated pathways for 139 species, expression data from 1026 studies across 27 species, and genetic variation data mapped to 27 genomes from 19 species. Key updates to the integrated search functionality include embedded expression viewers from the Bio-Analytic Resource for Plant Biology and EMBL-EBI Expression Atlas, a literature-curated catalog of gene functions, and a new Germplasm tab linking accessions with loss-of-function alleles to seed repositories. These advances reinforce Gramene as a comprehensive platform for exploring plant genomic diversity, gene function, and evolutionary conservation across the Green Tree of Life and within key agricultural species.
Additional Links: PMID-41335101
@article {pmid41335101,
year = {2025},
author = {Olson, A and Kumari, S and Wei, X and Chougule, K and Lu, Z and Tello-Ruiz, MK and Kumar, V and Van Buren, P and Olson, A and Kim, C and Braynen, J and Zhang, L and Dyer, S and Alvarez-Jarreta, J and Saraf, S and Contreras-Moreira, B and Naamati, G and Ernst, C and Papatheodorou, I and George, N and Jaiswal, P and Naithani, S and Gupta, P and Elser, J and D'Eustachio, P and Assmann, SM and Ferrero-Serrano, Á and Pasha, A and Provart, N and Gladman, N and Ware, D},
title = {Gramene 2025: expanded comparative genomics and pathway resources, integrated search, and pan-genome portals for crop research.},
journal = {Nucleic acids research},
volume = {},
number = {},
pages = {},
doi = {10.1093/nar/gkaf1260},
pmid = {41335101},
issn = {1362-4962},
support = {//United States Department of Agriculture/ ; 8062-21000-051-000D//USDA/ ; 0201-88888-003-000D//USDA/ ; 0201-88888-002-000D//USDA/ ; 2122358//National Science Foundation/ ; 2122357//National Science Foundation/ ; 2029854//National Science Foundation/ ; 80NSSC22K0891/NASA/NASA/United States ; 80NSSC22K0855/NASA/NASA/United States ; HR0011-23-9-0054//Defense Advanced Research Projects Agency/ ; S10OD028632-01/NH/NIH HHS/United States ; U24 HG012198/HG/NHGRI NIH HHS/United States ; //Natural Sciences and Engineering Research Council of Canada/ ; WT222155/Z/20/Z/WT_/Wellcome Trust/United Kingdom ; 221401/Z/20/Z/WT_/Wellcome Trust/United Kingdom ; //European Molecular Biology Laboratory/ ; //Oregon State University/ ; },
abstract = {Gramene (gramene.org) is a comprehensive reference database for comparative plant genomics and pathway analysis, integrating functional annotations, evidence-based curated pathways and their projections, and multi-omics datasets. Since our last report, Gramene has added crop-specific pan-genome portals for maize, sorghum, rice, and grapevine. These pan-genome portals host population-scale datasets and multiple assembled genomes per species, all anchored by shared reference genomes. Importantly, these portals now adopt standardized rsIDs for genetic variants, advancing FAIR data principles and enabling cross-database interoperability. The main site is now Gramene Plants, emphasizing its broad genome coverage. Release 69 features 233 reference genomes, curated pathways for 139 species, expression data from 1026 studies across 27 species, and genetic variation data mapped to 27 genomes from 19 species. Key updates to the integrated search functionality include embedded expression viewers from the Bio-Analytic Resource for Plant Biology and EMBL-EBI Expression Atlas, a literature-curated catalog of gene functions, and a new Germplasm tab linking accessions with loss-of-function alleles to seed repositories. These advances reinforce Gramene as a comprehensive platform for exploring plant genomic diversity, gene function, and evolutionary conservation across the Green Tree of Life and within key agricultural species.},
}
RevDate: 2025-12-22
CmpDate: 2025-12-22
Complete de novo assembly and re-annotation of the zebrafish genome.
bioRxiv : the preprint server for biology.
The zebrafish (Danio rerio) is widely used in vertebrate research, but its reference genome assembly has contained extensive unresolved regions across both euchromatic and heterochromatic compartments. The previous reference genome assembly, GRCz11, consisted of 19,725 contigs assembled into 1,917 scaffolds. Recent advances in both long-read sequencing technologies and genome assembly algorithms have made "complete" genome assemblies possible for the first time. We used homozygous fish from two lab strains, "Tübingen" and "AB," for de novo genome assemblies. The new assemblies incorporated 7% more genomic sequence than GRCz11 and an additional 130 million bases of previously unassembled sequence. RefSeq annotation incorporating newly generated Iso-Seq cDNA sequences has yielded notable increases in mRNAs (68%), lncRNAs (47%), and misc_RNAs (1099%). Two assemblies have been elevated to reference genome status (GRCz12tu and GRCz12ab). We generated an additional 40 draft haplotypes to create a zebrafish pangenome resource and demonstrate its utility for variant analysis.
Additional Links: PMID-41332591
@article {pmid41332591,
year = {2025},
author = {Okendo, J and Koren, S and Rhie, A and Torrado-Tapias, A and Pickett, BD and Brooks, SY and Bouffard, GG and Crawford, JK and Sison, C and Joardar, VS and Murphy, TD and Tierney, JAS and Haggerty, L and Martin, FJ and Wilson, C and Amores, A and Postlethwait, JH and Murphy, J and Sakai, N and Varga, ZM and Phillippy, AM and Burgess, SM},
title = {Complete de novo assembly and re-annotation of the zebrafish genome.},
journal = {bioRxiv : the preprint server for biology},
volume = {},
number = {},
pages = {},
pmid = {41332591},
issn = {2692-8205},
support = {/WT_/Wellcome Trust/United Kingdom ; ZIA HG000183/ImNIH/Intramural NIH HHS/United States ; },
abstract = {The zebrafish (Danio rerio) is widely used in vertebrate research, but its reference genome assembly has contained extensive unresolved regions across both euchromatic and heterochromatic compartments. The previous reference genome assembly, GRCz11, consisted of 19,725 contigs assembled into 1,917 scaffolds. Recent advances in both long-read sequencing technologies and genome assembly algorithms have made "complete" genome assemblies possible for the first time. We used homozygous fish from two lab strains, "Tübingen" and "AB," for de novo genome assemblies. The new assemblies incorporated 7% more genomic sequence than GRCz11 and an additional 130 million bases of previously unassembled sequence. RefSeq annotation incorporating newly generated Iso-Seq cDNA sequences has yielded notable increases in mRNAs (68%), lncRNAs (47%), and misc_RNAs (1099%). Two assemblies have been elevated to reference genome status (GRCz12tu and GRCz12ab). We generated an additional 40 draft haplotypes to create a zebrafish pangenome resource and demonstrate its utility for variant analysis.},
}
RevDate: 2025-12-02
Phased telomere-to-telomere super-pangenome: definitive reference genome in plants.
Trends in plant science pii:S1360-1385(25)00324-3 [Epub ahead of print].
With falling sequencing costs and the rise of computational methods, plant genomics is undergoing a paradigm shift. The combination of phased telomere-to-telomere assemblies and super-pangenomes is emerging as the ultimate reference needed in plants. Together they provide a gold standard for genetic dissection, molecular-design breeding, and resource conservation.
Additional Links: PMID-41330793
@article {pmid41330793,
year = {2025},
author = {Zhang, J and Zheng, Y and Chen, F},
title = {Phased telomere-to-telomere super-pangenome: definitive reference genome in plants.},
journal = {Trends in plant science},
volume = {},
number = {},
pages = {},
doi = {10.1016/j.tplants.2025.11.002},
pmid = {41330793},
issn = {1878-4372},
abstract = {With falling sequencing costs and the rise of computational methods, plant genomics is undergoing a paradigm shift. The combination of phased telomere-to-telomere assemblies and super-pangenomes is emerging as the ultimate reference needed in plants. Together they provide a gold standard for genetic dissection, molecular-design breeding, and resource conservation.},
}
RevDate: 2025-12-02
High-quality mouse reference genomes reveal the structural complexity of the murine protein-coding landscape.
Cell genomics pii:S2666-979X(25)00330-1 [Epub ahead of print].
We present a collection of 17 high-quality long-read inbred mouse strain genomes with complete annotation (contig N50s of 0.8-33.9 Mbp). This collection includes 12 widely used classical laboratory strains and 5 wild-derived strains. We have resolved previously incomplete genomic regions, including the major histocompatibility complex (MHC), defensin cluster, T cell receptor, and Ly49 complexes. Hundreds of non-reference genes from previous publications not found in GRCm39, such as Defa1, Raet1a, and Klra20 (Ly49T), were localized in the new reference genomes. We conducted a genome-wide scan of variable number tandem repeats (VNTRs) within the coding regions, identifying over 400 genes with VNTR polymorphisms with up to 600 repeat copies and repeat units reaching 990 nucleotides. Our strain-specific annotations enhance RNA sequencing (RNA-seq) analyses, as demonstrated in PWK/PhJ, where we observed a 5.1% improvement in read mapping and expression-level differences in 2.1% of coding genes compared to using GRCm39.
Additional Links: PMID-41330379
@article {pmid41330379,
year = {2025},
author = {Helmy, M and Li, JU and Yan, XF and Meade, RK and Anderson, E and Chen, PB and Czechanski, AM and Di Domenico, T and Flint, J and Garrison, E and Gontijo, MTP and Guarracino, A and Haggerty, L and Heard, E and Howe, K and Meena, N and Martin, FJ and Miska, EA and Rall, I and Ramakrishna, NB and Sapetschnig, A and Sinha, S and Sun, D and Tricomi, FF and Qu, R and Wood, JMD and Wu, T and Zhou, DJ and Reinholdt, L and Adams, DJ and Smith, CM and Lilue, J and Keane, TM},
title = {High-quality mouse reference genomes reveal the structural complexity of the murine protein-coding landscape.},
journal = {Cell genomics},
volume = {},
number = {},
pages = {101074},
doi = {10.1016/j.xgen.2025.101074},
pmid = {41330379},
issn = {2666-979X},
abstract = {We present a collection of 17 high-quality long-read inbred mouse strain genomes with complete annotation (contig N50s of 0.8-33.9 Mbp). This collection includes 12 widely used classical laboratory strains and 5 wild-derived strains. We have resolved previously incomplete genomic regions, including the major histocompatibility complex (MHC), defensin cluster, T cell receptor, and Ly49 complexes. Hundreds of non-reference genes from previous publications not found in GRCm39, such as Defa1, Raet1a, and Klra20 (Ly49T), were localized in the new reference genomes. We conducted a genome-wide scan of variable number tandem repeats (VNTRs) within the coding regions, identifying over 400 genes with VNTR polymorphisms with up to 600 repeat copies and repeat units reaching 990 nucleotides. Our strain-specific annotations enhance RNA sequencing (RNA-seq) analyses, as demonstrated in PWK/PhJ, where we observed a 5.1% improvement in read mapping and expression-level differences in 2.1% of coding genes compared to using GRCm39.},
}
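The contig N50 quoted in the abstract above is a standard contiguity statistic: the length L such that contigs of length at least L together cover at least half of the assembly. The following is a minimal Python sketch, assuming the contig lengths are already available as a list of integers. The function name `n50` and the toy lengths are illustrative only and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until
    the running total reaches half of the summed assembly length; the
    length of the contig that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, arbitrary units): total 20, half 10.
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))
```

Real pipelines usually read the lengths from a FASTA index or an assembler report; that input step is omitted here.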
RevDate: 2025-12-02
Graph-based and molecular modeling approaches to identify TpiA as a noval therapeutic target in Neisseria meningitidis.
Naunyn-Schmiedeberg's archives of pharmacology [Epub ahead of print].
Neisseria meningitidis continues to be a major cause of invasive meningococcal disease, and the increasing frequency of antimicrobial resistance necessitates the discovery of new therapeutic strategies. This study aimed to identify and prioritize novel, pathogen-specific drug targets and potential natural inhibitors using an integrative computational approach combining graph-based pangenome analysis, subtractive genomics, and network topology. Pangenome construction and subsequent subtractive genomics of the core genes, followed by network analysis, shortlisted five essential hub proteins (guaA, pykA, tktA, eno, and tpiA); tpiA, a central glycolytic enzyme with little prior therapeutic exploration, was selected as a promising drug target for subsequent phytochemical screening. Phytochemicals from Cinnamomum verum were then screened against tpiA to identify potential inhibitors. The top three candidate compounds underwent 100 ns molecular dynamics and MM-GBSA analyses, exhibiting lower RMSD and reduced residue fluctuations compared to the apo form, while maintaining stable radius of gyration and SASA. Further in silico validation through MM-GBSA confirmed favorable binding energies for beta-sitosterol (-21.28 kcal·mol⁻¹), stigmasterol (-17.01 kcal·mol⁻¹), and riboflavin (-16.09 kcal·mol⁻¹), with van der Waals interactions predominating for the sterols. Collectively, these in silico findings validate TpiA as a promising antibacterial target and highlight Cinnamomum verum phytosterols, particularly beta-sitosterol and stigmasterol, as lead scaffolds warranting further experimental exploration.
Additional Links: PMID-41329216
@article {pmid41329216,
year = {2025},
author = {Ghaffar, SA and Muneer, I and Manzoor, I and Attar, RMS and Alzunaidy, NA and Alharbi, BF and Almatroudi, A and Alrumaihi, F and Allemailem, KS and Qamar, MTU},
title = {Graph-based and molecular modeling approaches to identify TpiA as a noval therapeutic target in Neisseria meningitidis.},
journal = {Naunyn-Schmiedeberg's archives of pharmacology},
volume = {},
number = {},
pages = {},
pmid = {41329216},
issn = {1432-1912},
abstract = {Neisseria meningitidis continues to be a major cause of invasive meningococcal disease, and the increasing frequency of antimicrobial resistance necessitates the discovery of new therapeutic strategies. This study aimed to identify and prioritize novel, pathogen-specific drug targets and potential natural inhibitors using an integrative computational approach combining graph-based pangenome analysis, subtractive genomics, and network topology. Pangenome construction and subsequent subtractive genomics of the core genes, followed by network analysis, shortlisted five essential hub proteins (guaA, pykA, tktA, eno, and tpiA); tpiA, a central glycolytic enzyme with little prior therapeutic exploration, was selected as a promising drug target for subsequent phytochemical screening. Phytochemicals from Cinnamomum verum were then screened against tpiA to identify potential inhibitors. The top three candidate compounds underwent 100 ns molecular dynamics and MM-GBSA analyses, exhibiting lower RMSD and reduced residue fluctuations compared to the apo form, while maintaining stable radius of gyration and SASA. Further in silico validation through MM-GBSA confirmed favorable binding energies for beta-sitosterol (-21.28 kcal·mol⁻¹), stigmasterol (-17.01 kcal·mol⁻¹), and riboflavin (-16.09 kcal·mol⁻¹), with van der Waals interactions predominating for the sterols. Collectively, these in silico findings validate TpiA as a promising antibacterial target and highlight Cinnamomum verum phytosterols, particularly beta-sitosterol and stigmasterol, as lead scaffolds warranting further experimental exploration.},
}
RevDate: 2025-12-21
CmpDate: 2025-12-17
Evolutionary Dynamics and Functional Diversification of bZIP Transcription Factors in Cucurbitaceae: A Pan-Genome Approach.
Journal of agricultural and food chemistry, 73(50):32363-32378.
The basic leucine zipper (bZIP) transcription factors are central regulators of plant growth and stress responses. This study presents a pan-genome analysis of bZIP genes across nine Cucurbitaceae species: cucumber, watermelon, melon, bitter gourd, bottle gourd, wax gourd, monk fruit, snake gourd, and Herpetospermum pedunculosum. We cataloged 502 bZIPs (350 core, 151 dispensable) in 12 subfamilies. Intron variation was the primary source of intrasubfamily diversity; core-only subfamilies showed lower polymorphism and stronger purifying selection. Core bZIPs mainly originated from whole-genome/segmental duplication, while dispensable bZIPs expanded via small-scale duplication. Integrating 393 transcriptomes revealed that both core and dispensable bZIPs contribute to development and abiotic-stress responses, with dispensable members more specifically hormone-induced and core bZIPs engaging broader gene-interaction networks. This pan-genome survey delineates bZIP genomic architecture, evolutionary dynamics, and functional specialization in Cucurbitaceae, providing a robust resource for downstream mechanistic studies and molecular breeding.
Additional Links: PMID-41329132
@article {pmid41329132,
year = {2025},
author = {Sun, M and Jiang, Q and Zhang, E and Zhu, Z and Yang, Y and Tan, CL and Wang, Z and Li, R and Tao, Y and Zhao, Q},
title = {Evolutionary Dynamics and Functional Diversification of bZIP Transcription Factors in Cucurbitaceae: A Pan-Genome Approach.},
journal = {Journal of agricultural and food chemistry},
volume = {73},
number = {50},
pages = {32363-32378},
pmid = {41329132},
issn = {1520-5118},
mesh = {*Cucurbitaceae/genetics/metabolism/classification ; *Plant Proteins/genetics/metabolism ; Evolution, Molecular ; Genome, Plant ; Phylogeny ; *Basic-Leucine Zipper Transcription Factors/genetics/metabolism ; Gene Expression Regulation, Plant ; },
abstract = {The basic leucine zipper (bZIP) transcription factors are central regulators of plant growth and stress responses. This study presents a pan-genome analysis of bZIP genes across nine Cucurbitaceae species: cucumber, watermelon, melon, bitter gourd, bottle gourd, wax gourd, monk fruit, snake gourd, and Herpetospermum pedunculosum. We cataloged 502 bZIPs (350 core, 151 dispensable) in 12 subfamilies. Intron variation was the primary source of intrasubfamily diversity; core-only subfamilies showed lower polymorphism and stronger purifying selection. Core bZIPs mainly originated from whole-genome/segmental duplication, while dispensable bZIPs expanded via small-scale duplication. Integrating 393 transcriptomes revealed that both core and dispensable bZIPs contribute to development and abiotic-stress responses, with dispensable members more specifically hormone-induced and core bZIPs engaging broader gene-interaction networks. This pan-genome survey delineates bZIP genomic architecture, evolutionary dynamics, and functional specialization in Cucurbitaceae, providing a robust resource for downstream mechanistic studies and molecular breeding.},
}
MeSH Terms:
*Cucurbitaceae/genetics/metabolism/classification
*Plant Proteins/genetics/metabolism
Evolution, Molecular
Genome, Plant
Phylogeny
*Basic-Leucine Zipper Transcription Factors/genetics/metabolism
Gene Expression Regulation, Plant
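The core/dispensable split reported for the Cucurbitaceae bZIP genes rests on presence/absence across genomes: a gene family found in every genome is core, and one missing from at least one genome is dispensable. The sketch below illustrates that rule in Python. It assumes a precomputed mapping from gene family to the set of genomes carrying it. The function name, family labels, and three-species toy data are hypothetical, and the study's actual orthology and classification criteria may differ.

```python
def classify_families(presence, genomes):
    """Label each gene family as 'core' or 'dispensable'.

    presence: dict mapping family name -> iterable of genomes that carry it
    genomes:  iterable of all genomes in the pan-genome
    """
    all_genomes = set(genomes)
    labels = {}
    for family, present_in in presence.items():
        # Core means the family is present in every genome of the set.
        if all_genomes <= set(present_in):
            labels[family] = "core"
        else:
            labels[family] = "dispensable"
    return labels


# Hypothetical toy data: three species, three gene families.
genomes = ["cucumber", "melon", "watermelon"]
presence = {
    "bZIP_A": {"cucumber", "melon", "watermelon"},
    "bZIP_B": {"cucumber", "melon"},
    "bZIP_C": {"watermelon"},
}
labels = classify_families(presence, genomes)
print(labels)
```

Some studies subdivide the dispensable class further (for example into near-core and private genes); the two-way split here mirrors only the terms used in the abstract.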
RevDate: 2025-12-01
Pangenome identification and functional characterization of AHL genes in wheat (Triticum aestivum L.) reveal the role of TaAHL67 in grain weight regulation.
BMC plant biology pii:10.1186/s12870-025-07817-8 [Epub ahead of print].
Additional Links: PMID-41327029
@article {pmid41327029,
year = {2025},
author = {Guo, L and Song, Q and Zhang, X and Tian, T and Zhang, Y and Wu, Y and Zhang, P and Ma, J and Chen, T and Yang, D},
title = {Pangenome identification and functional characterization of AHL genes in wheat (Triticum aestivum L.) reveal the role of TaAHL67 in grain weight regulation.},
journal = {BMC plant biology},
volume = {},
number = {},
pages = {},
doi = {10.1186/s12870-025-07817-8},
pmid = {41327029},
issn = {1471-2229},
support = {32401920//National Natural Science Foundation of China/ ; 25JRRA403//Gansu Province Youth Science and Technology Fund program/ ; 24JRRA633//the Innovative Research Group Project of Gansu of Gansu Province/ ; 24ZD13NA019//the Key Sci & Tech Special Project of Gansu Province/ ; GYGG-2024-2//the Breakthrough Project in Seed Industry of Gansu Province/ ; GSARS07//the Wheat Crop Industry Technology System of Gansu Province/ ; },
}
RevDate: 2025-12-05
Genomic insights into multidrug resistant Escherichia coli from bovine mastitis in Bangladesh.
BMC microbiology, 25(1):788.
BACKGROUND: Mastitis poses a significant threat to the dairy industry and public health due to the emergence of multidrug-resistant (MDR) Escherichia coli. This study provides a genomic characterization of two MDR E. coli strains, MBBL4 and MBBL5, from bovine mastitis in Bangladesh, highlighting their evolutionary relationships, resistome, and virulome.
METHODS: Species-level identification of MBBL4 and MBBL5 was confirmed using biochemical assays, VITEK-2 system, and 16S rRNA gene sequencing. Antimicrobial susceptibility profiling was conducted to determine their resistance patterns. Whole genome sequencing (WGS) and comprehensive genomic analysis were performed for phylogenetic, comparative genomics, mobile genetic elements (MGEs), antimicrobial resistance genes (ARGs), and virulence factor genes (VFGs) analyses.
RESULTS: Both isolates exhibited extensive MDR patterns, showing resistance to ten antibiotics. Phylogenetic and ANI analyses showed that MBBL4 clustered with mastitis-associated and human bacteremia strains of E. coli, while MBBL5 was closely related to wildlife-associated strains, reflecting divergent evolutionary lineages. Pangenome analysis revealed an open pangenome structure, indicating high genetic diversity, with MBBL4 harboring 21 unique genes and MBBL5 possessing nine unique genes. Both genomes harbored numerous ARGs spanning over 11 antibiotic classes, and VFGs, predominantly associated with adherence and secretion systems, underscoring their extensive resistome, virulome, and adaptive potentials. Abundant MGEs (plasmids, prophages, insertion sequence elements and genomic islands) further underscored the role of horizontal gene transfer in driving resistance and virulence in these strains.
CONCLUSION: This study highlights the zoonotic potential and adaptive capacity of MDR E. coli from bovine mastitis in Bangladesh driven by resistome, virulome, and mobile genetic elements. These findings highlight the urgent need for One Health-based genomic surveillance to mitigate MDR E. coli transmission from dairy farms to humans and the environment.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12866-025-04514-2.
Additional Links: PMID-41326987
@article {pmid41326987,
year = {2025},
author = {Siddique, N and Arafat, KY and Gilman, MAA and Rahman, MM and Das, ZC and Islam, T and Hoque, MN},
title = {Genomic insights into multidrug resistant Escherichia coli from bovine mastitis in Bangladesh.},
journal = {BMC microbiology},
volume = {25},
number = {1},
pages = {788},
pmid = {41326987},
issn = {1471-2180},
support = {LS20221764, duration 2023-2025//Ministry of Education (BANBEIS), Government of the People's Republic of Bangladesh/ ; },
abstract = {BACKGROUND: Mastitis poses a significant threat to the dairy industry and public health due to the emergence of multidrug-resistant (MDR) Escherichia coli. This study provides a genomic characterization of two MDR E. coli strains, MBBL4 and MBBL5, from bovine mastitis in Bangladesh, highlighting their evolutionary relationships, resistome, and virulome.
METHODS: Species-level identification of MBBL4 and MBBL5 was confirmed using biochemical assays, VITEK-2 system, and 16S rRNA gene sequencing. Antimicrobial susceptibility profiling was conducted to determine their resistance patterns. Whole genome sequencing (WGS) and comprehensive genomic analysis were performed for phylogenetic, comparative genomics, mobile genetic elements (MGEs), antimicrobial resistance genes (ARGs), and virulence factor genes (VFGs) analyses.
RESULTS: Both isolates exhibited extensive MDR patterns, showing resistance to ten antibiotics. Phylogenetic and ANI analyses showed that MBBL4 clustered with mastitis-associated and human bacteremia strains of E. coli, while MBBL5 was closely related to wildlife-associated strains, reflecting divergent evolutionary lineages. Pangenome analysis revealed an open pangenome structure, indicating high genetic diversity, with MBBL4 harboring 21 unique genes and MBBL5 possessing nine unique genes. Both genomes harbored numerous ARGs spanning over 11 antibiotic classes, and VFGs, predominantly associated with adherence and secretion systems, underscoring their extensive resistome, virulome, and adaptive potentials. Abundant MGEs (plasmids, prophages, insertion sequence elements and genomic islands) further underscored the role of horizontal gene transfer in driving resistance and virulence in these strains.
CONCLUSION: This study highlights the zoonotic potential and adaptive capacity of MDR E. coli from bovine mastitis in Bangladesh driven by resistome, virulome, and mobile genetic elements. These findings highlight the urgent need for One Health-based genomic surveillance to mitigate MDR E. coli transmission from dairy farms to humans and the environment.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12866-025-04514-2.},
}
RevDate: 2025-12-03
CmpDate: 2025-12-01
Integrating ATAC-Seq and Pan-Genomics Identifies Stress-Memory AP2/ERF Hubs in Foxtail Millet.
Food science & nutrition, 13(12):e71109.
Foxtail millet (Setaria italica), a key cereal crop, has developed robust mechanisms to adapt to drought conditions. Investigating genes associated with drought memory is crucial for improving the plant's resilience against recurring drought events. Our study leveraged ATAC-seq and RNA-seq to reveal significant changes in chromatin accessibility and gene expression in response to secondary drought stress. The AP2/ERF transcription factor motifs were the most enriched within different chromatin regions in both leaf and root tissues. Transcriptome analysis identified 80 significantly upregulated AP2/ERF genes in foxtail millet. Pan-genome analysis of 111 accessions revealed 16,778 AP2/ERF genes, categorized into 17 groups, highlighting gene number variations across populations. Codon usage analysis showed biased preferences across groups and populations. Natural selection studies indicated that most AP2/ERF genes are under strong purifying selection (Ka/Ks < 1), while dispensable genes exhibit greater evolutionary flexibility. Collinearity analysis validated the conservation of 107 AP2/ERF genes across four Poaceae species, underscoring their ancient and critical functions in drought response. Of particular interest, 57 of these conserved genes were significantly upregulated in the leaves or roots under secondary drought stress, with SiERF-51 showing the highest expression levels in both tissues, a finding verified by qRT-PCR analysis. This study not only demonstrates the pivotal role of the AP2/ERF superfamily in drought memory but also provides a wealth of genetic resources for further investigations into the molecular mechanisms of drought adaptation in foxtail millet.
Additional Links: PMID-41323812
@article {pmid41323812,
year = {2025},
author = {Li, T and Liao, W and Wang, H and Wang, Z and Li, J and Zhou, X and Cai, Y and Zhang, J and Feng, F and Wang, Y and Wang, W and Hu, J and Sun, Y},
title = {Integrating ATAC-Seq and Pan-Genomics Identifies Stress-Memory AP2/ERF Hubs in Foxtail Millet.},
journal = {Food science & nutrition},
volume = {13},
number = {12},
pages = {e71109},
pmid = {41323812},
issn = {2048-7177},
abstract = {Foxtail millet (Setaria italica), a key cereal crop, has developed robust mechanisms to adapt to drought conditions. Investigating genes associated with drought memory is crucial for improving the plant's resilience against recurring drought events. Our study leveraged ATAC-seq and RNA-seq to reveal significant changes in chromatin accessibility and gene expression in response to secondary drought stress. The AP2/ERF transcription factor motifs were the most enriched within different chromatin regions in both leaf and root tissues. Transcriptome analysis identified 80 significantly upregulated AP2/ERF genes in foxtail millet. Pan-genome analysis of 111 accessions revealed 16,778 AP2/ERF genes, categorized into 17 groups, highlighting gene number variations across populations. Codon usage analysis showed biased preferences across groups and populations. Natural selection studies indicated that most AP2/ERF genes are under strong purifying selection (Ka/Ks < 1), while dispensable genes exhibit greater evolutionary flexibility. Collinearity analysis validated the conservation of 107 AP2/ERF genes across four Poaceae species, underscoring their ancient and critical functions in drought response. Of particular interest, 57 of these conserved genes were significantly upregulated in the leaves or roots under secondary drought stress, with SiERF-51 showing the highest expression levels in both tissues, a finding verified by qRT-PCR analysis. This study not only demonstrates the pivotal role of the AP2/ERF superfamily in drought memory but also provides a wealth of genetic resources for further investigations into the molecular mechanisms of drought adaptation in foxtail millet.},
}
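The Ka/Ks criterion cited in the foxtail millet abstract compares the nonsynonymous substitution rate (Ka) with the synonymous rate (Ks): a ratio below 1 is conventionally read as purifying selection, a ratio near 1 as neutral evolution, and a ratio above 1 as positive selection. The Python sketch below applies that reading to precomputed Ka and Ks values. The function name and the tolerance band `tol` around 1 are assumptions made for illustration and are not parameters reported by the study.

```python
def selection_regime(ka, ks, tol=0.05):
    """Interpret a Ka/Ks ratio using the conventional thresholds.

    ka:  nonsynonymous substitutions per nonsynonymous site
    ks:  synonymous substitutions per synonymous site (must be > 0)
    tol: assumed half-width of the band around 1 treated as near-neutral
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio < 1 - tol:
        return "purifying"
    if ratio > 1 + tol:
        return "positive"
    return "near-neutral"


# Hypothetical values, not taken from the paper.
print(selection_regime(0.02, 0.2))
```

Estimating Ka and Ks themselves requires codon-aware alignment and a substitution model, which is outside the scope of this sketch.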
RevDate: 2025-12-24
CmpDate: 2025-12-11
Comparative Genomics Reveals Multipartite Genomes Undergoing Loss in the Fungal Endosymbiotic Genus Mycetohabitans.
Genome biology and evolution, 17(12):.
Endosymbiotic bacteria extensively impact phenotypes of their eukaryotic hosts, while experiencing dramatic changes to their own genome as they become more host-restricted in lifestyle. Understanding the trajectory of such a genome has largely been done through the study of animal-associated bacteria, especially insect endosymbionts. Yet, endofungal bacteria provide another natural experimental model for investigating how microbial genomes change when living inside of a host cell. Mycetohabitans spp. are culturable bacterial endosymbionts of the Mucoromycota fungus Rhizopus microsporus. To investigate the genome dynamics resulting from the endohyphal nature of this emerging model genus, we long-read sequenced and assembled new complete genomes to combine with previous assemblies, creating a global dataset of 28 complete Mycetohabitans genomes. All genomes were between 3.3 and 3.9 Mbp in size and were multipartite, structured into two conserved replicons with some strains having an additional plasmid. Based on evolutionary rate and gene content analysis of the different replicons, we termed the two major ones a chromosome and chromid. The differential presence of a third, mobilome-rich plasmid in some strains and the proliferation of transposable elements provide putative mechanisms for recombination or gene loss. The conservation of intact prophage and putative toxin-antitoxin systems and extensive enrichment of secondary metabolite clusters in the Mycetohabitans genomes highlight the dynamic nature of this reducing genome. With fungal-bacterial symbioses becoming increasingly apparent phenomena, lessons learned from this symbiosis will inform our understanding of bacterial adaptation to novel hosts and the process of microbe-microbe coevolution.
Additional Links: PMID-41320831
@article {pmid41320831,
year = {2025},
author = {Abbot, B and Field, S and Carneal, L and White, RA and Buchan, A and West, C and Lee, L and Carter, ME},
title = {Comparative Genomics Reveals Multipartite Genomes Undergoing Loss in the Fungal Endosymbiotic Genus Mycetohabitans.},
journal = {Genome biology and evolution},
volume = {17},
number = {12},
pages = {},
pmid = {41320831},
issn = {1759-6653},
mesh = {*Symbiosis/genetics ; *Genome, Fungal ; *Rhizopus/genetics ; Genomics ; Phylogeny ; Evolution, Molecular ; Genome, Bacterial ; },
abstract = {Endosymbiotic bacteria extensively impact phenotypes of their eukaryotic hosts, while experiencing dramatic changes to their own genome as they become more host-restricted in lifestyle. Understanding the trajectory of such a genome has largely been done through the study of animal-associated bacteria, especially insect endosymbionts. Yet, endofungal bacteria provide another natural experimental model for investigating how microbial genomes change when living inside of a host cell. Mycetohabitans spp. are culturable bacterial endosymbionts of the Mucoromycota fungus Rhizopus microsporus. To investigate the genome dynamics resulting from the endohyphal nature of this emerging model genus, we long-read sequenced and assembled new complete genomes to combine with previous assemblies, creating a global dataset of 28 complete Mycetohabitans genomes. All genomes were between 3.3 and 3.9 Mbp in size and were multipartite, structured into two conserved replicons with some strains having an additional plasmid. Based on evolutionary rate and gene content analysis of the different replicons, we termed the two major ones a chromosome and chromid. The differential presence of a third, mobilome-rich plasmid in some strains and the proliferation of transposable elements provide putative mechanisms for recombination or gene loss. The conservation of intact prophage and putative toxin-antitoxin systems and extensive enrichment of secondary metabolite clusters in the Mycetohabitans genomes highlight the dynamic nature of this reducing genome. With fungal-bacterial symbioses becoming increasingly apparent phenomena, lessons learned from this symbiosis will inform our understanding of bacterial adaptation to novel hosts and the process of microbe-microbe coevolution.},
}
MeSH Terms:
*Symbiosis/genetics
*Genome, Fungal
*Rhizopus/genetics
Genomics
Phylogeny
Evolution, Molecular
Genome, Bacterial
RevDate: 2025-12-03
CmpDate: 2025-11-30
Unveiling the Potential of Lentilactobacillus hilgardii in Malolactic Fermentation: Comparative Genomics and Fermentation Dynamics.
Microbial biotechnology, 18(12):e70259.
This study aimed to assess the potential of Lentilactobacillus hilgardii as a novel candidate for malolactic fermentation (MLF) in winemaking, through comparative genomics and experimental validation, in direct comparison with Oenococcus oeni. We performed a pangenome analysis on 16 L. hilgardii and 7 O. oeni strains to explore their genetic diversity, focusing on wine-related traits. Functional predictions were generated using genome-scale metabolic models (ModelSEED/KBase), including in silico co-inoculation with Saccharomyces cerevisiae EC1118 and post-alcoholic fermentation simulations. The reference strains L. hilgardii DSM 20176 and O. oeni DSM 20252 were experimentally tested for MLF performance in a synthetic wine-like medium at 25°C and 10°C. Core-genome comparison revealed that 67.9% of the malolactic enzyme sequence is conserved between the two species, with comparable docking affinity to L-malic acid. L. hilgardii harboured unique enzymes with potential oenological interest (phenolic acid decarboxylase, mannitol dehydrogenase, glucosidase) and distinctive stress-related proteins (YaaA, HrcA, ASP23), suggesting improved tolerance to oxidative, temperature, and alkaline stresses. Notably, L. hilgardii showed genomic potential to degrade putrescine, arginine, and ornithine, precursors of ethyl carbamate. Experimentally, L. hilgardii reduced L-malic acid from 2.5 g/L to < 0.1 g/L within 12 days at 10°C, while O. oeni showed no MLF activity at this temperature. At 25°C, both strains completed MLF within 6-7 days. L. hilgardii also consumed > 80% of residual fructose at 10°C, whereas O. oeni showed minimal utilisation. Our results demonstrate that L. hilgardii combines a favourable genomic repertoire for wine adaptation with superior MLF performance at low temperature, suggesting its potential as an alternative to O. oeni in cool-climate winemaking. This work provides the first genome-scale comparative and functional evaluation of L. hilgardii in the winemaking context, highlighting its technological promise to improve fermentation reliability, reduce spoilage risk, and expand the biodiversity of malolactic starters.
Additional Links: PMID-41319147
Citation:
@article {pmid41319147,
year = {2025},
author = {Mantegazza, G and Mangieri, N and Yazdi, EV and Russo, P and Mora, D and Gargari, G},
title = {Unveiling the Potential of Lentilactobacillus hilgardii in Malolactic Fermentation: Comparative Genomics and Fermentation Dynamics.},
journal = {Microbial biotechnology},
volume = {18},
number = {12},
pages = {e70259},
pmid = {41319147},
issn = {1751-7915},
support = {//Università degli Studi di Milano/ ; },
mesh = {*Fermentation ; *Wine/microbiology ; Oenococcus/genetics/metabolism ; Genomics ; *Malates/metabolism ; Saccharomyces cerevisiae/metabolism ; Genome, Bacterial ; Malate Dehydrogenase ; },
abstract = {This study aimed to assess the potential of Lentilactobacillus hilgardii as a novel candidate for malolactic fermentation (MLF) in winemaking, through comparative genomics and experimental validation, in direct comparison with Oenococcus oeni. We performed a pangenome analysis on 16 L. hilgardii and 7 O. oeni strains to explore their genetic diversity, focusing on wine-related traits. Functional predictions were generated using genome-scale metabolic models (ModelSEED/KBase), including in silico co-inoculation with Saccharomyces cerevisiae EC1118 and post-alcoholic fermentation simulations. The reference strains L. hilgardii DSM 20176 and O. oeni DSM 20252 were experimentally tested for MLF performance in a synthetic wine-like medium at 25°C and 10°C. Core-genome comparison revealed that 67.9% of the malolactic enzyme sequence is conserved between the two species, with comparable docking affinity to L-malic acid. L. hilgardii harboured unique enzymes with potential oenological interest (phenolic acid decarboxylase, mannitol dehydrogenase, glucosidase) and distinctive stress-related proteins (YaaA, HrcA, ASP23), suggesting improved tolerance to oxidative, temperature, and alkaline stresses. Notably, L. hilgardii showed genomic potential to degrade putrescine, arginine, and ornithine, precursors of ethyl carbamate. Experimentally, L. hilgardii reduced L-malic acid from 2.5 g/L to < 0.1 g/L within 12 days at 10°C, while O. oeni showed no MLF activity at this temperature. At 25°C, both strains completed MLF within 6-7 days. L. hilgardii also consumed > 80% of residual fructose at 10°C, whereas O. oeni showed minimal utilisation. Our results demonstrate that L. hilgardii combines a favourable genomic repertoire for wine adaptation with superior MLF performance at low temperature, suggesting its potential as an alternative to O. oeni in cool-climate winemaking. This work provides the first genome-scale comparative and functional evaluation of L. hilgardii in the winemaking context, highlighting its technological promise to improve fermentation reliability, reduce spoilage risk, and expand the biodiversity of malolactic starters.},
}
MeSH Terms:
*Fermentation
*Wine/microbiology
Oenococcus/genetics/metabolism
Genomics
*Malates/metabolism
Saccharomyces cerevisiae/metabolism
Genome, Bacterial
Malate Dehydrogenase
RevDate: 2025-11-30
Integrating Pan-genome, GWAS, and Interpretable Machine Learning to Prioritize Trait-Associated Structural Variations in Setaria italica.
Plant communications pii:S2590-3462(25)00388-8 [Epub ahead of print].
Structural variations (SVs), especially presence-absence variations (PAVs), play a crucial role in crop domestication and trait improvement. While pan-genome analysis provides an exhaustive view of PAVs, it is often limited by high costs and restricted sample sizes. Conversely, genome-wide association studies (GWAS) can effectively identify trait-marker associations in large populations but typically overlook PAVs and face challenges in distinguishing causal variants due to linkage disequilibrium. In this study, we performed de novo assembly of eight reference-quality foxtail millet (Setaria italica) genomes and constructed a graph-based pan-genome to systematically explore PAVs. We subsequently conducted GWAS with 344 millet accessions, targeting genomic regions associated with the color of the leaf, leaf sheath, and leaf pulvinus. Through the application of interpretable machine-learning models, we pinpointed large-effect variants within the 26.84-26.94 Mb region on chromosome 7, including a 5002-bp Copia element insertion among other key variants correlated with phenotypic variations in leaf color traits. This integrative approach combines the detailed variant detection capabilities of pan-genome analysis with the large-scale mapping potential of GWAS and enhances variant prioritization using interpretable machine learning, providing a cost-efficient yet effective framework for dissecting agronomic traits in crops.
Additional Links: PMID-41319002
Citation:
@article {pmid41319002,
year = {2025},
author = {Wang, W and Wu, T and Fan, G and Zhang, S and Liu, S and Jiang, S and Cheng, Q and Shang, M and Xu, Y and Zhang, W and Zhang, J and Wang, X and Zhao, Z and Yan, J},
title = {Integrating Pan-genome, GWAS, and Interpretable Machine Learning to Prioritize Trait-Associated Structural Variations in Setaria italica.},
journal = {Plant communications},
volume = {},
number = {},
pages = {101626},
doi = {10.1016/j.xplc.2025.101626},
pmid = {41319002},
issn = {2590-3462},
abstract = {Structural variations (SVs), especially presence-absence variations (PAVs), play a crucial role in crop domestication and trait improvement. While pan-genome analysis provides an exhaustive view of PAVs, it is often limited by high costs and restricted sample sizes. Conversely, genome-wide association studies (GWAS) can effectively identify trait-marker associations in large populations but typically overlook PAVs and face challenges in distinguishing causal variants due to linkage disequilibrium. In this study, we performed de novo assembly of eight reference-quality foxtail millet (Setaria italica) genomes and constructed a graph-based pan-genome to systematically explore PAVs. We subsequently conducted GWAS with 344 millet accessions, targeting genomic regions associated with the color of the leaf, leaf sheath, and leaf pulvinus. Through the application of interpretable machine-learning models, we pinpointed large-effect variants within the 26.84-26.94 Mb region on chromosome 7, including a 5002-bp Copia element insertion among other key variants correlated with phenotypic variations in leaf color traits. This integrative approach combines the detailed variant detection capabilities of pan-genome analysis with the large-scale mapping potential of GWAS and enhances variant prioritization using interpretable machine learning, providing a cost-efficient yet effective framework for dissecting agronomic traits in crops.},
}
RevDate: 2025-12-02
CmpDate: 2025-11-29
Genome graphs reveal the importance of structural variation in Mycobacterium tuberculosis evolution and drug resistance.
Nature communications, 16(1):10746.
Structural variants (SVs) are increasingly recognized as key drivers of bacterial evolution, yet their role has not been explored thoroughly. This is due to limitations in traditional short-read sequencing and linear reference-based analyses, which can miss complex structural changes. Tuberculosis (TB), a disease caused by Mycobacterium tuberculosis (Mtb), remains a major global health concern. In this study, we harness long-read sequencing technologies and genome graph tools to construct a Mtb pangenome reference graph (PRG) from 859 high-quality, diverse, long-read assemblies. To enable accurate genotyping of SVs leveraging the PRG, we developed miniwalk, a tool that outperforms a traditional linear genome-based approach in precision for SV detection. We characterize patterns of structural variation genome-wide, revealing a virulence-associated ESX-5 deletion to be recurrent across the phylogeny, and fixed in a sub-lineage of L4. Systematic screens for additional genes that are recurrently affected by SVs implicated those related to metal homeostasis, including a copper exporter fixed in the widely distributed L1.2.1 sub-lineage. Lastly, we genotyped 41,134 isolates and found SVs putatively associated with resistance to various first and second-line drugs. These findings underscore the broader role of SVs in shaping Mtb diversity, highlighting their importance in both understanding evolution and designing strategies to combat drug-resistant TB.
Additional Links: PMID-41315227
Citation:
@article {pmid41315227,
year = {2025},
author = {Canalda-Baltrons, A and Silcocks, M and Hall, MB and Theys, D and Chang 常戌灵, X and Viberg, LT and Sherry, NL and Coin, L and Dunstan, SJ},
title = {Genome graphs reveal the importance of structural variation in Mycobacterium tuberculosis evolution and drug resistance.},
journal = {Nature communications},
volume = {16},
number = {1},
pages = {10746},
pmid = {41315227},
issn = {2041-1723},
support = {U19 AI162583/AI/NIAID NIH HHS/United States ; },
mesh = {*Mycobacterium tuberculosis/genetics/drug effects/pathogenicity ; *Genome, Bacterial/genetics ; Phylogeny ; *Evolution, Molecular ; *Drug Resistance, Bacterial/genetics ; *Genomic Structural Variation/genetics ; Humans ; Antitubercular Agents/pharmacology ; Tuberculosis/microbiology ; Genotype ; },
abstract = {Structural variants (SVs) are increasingly recognized as key drivers of bacterial evolution, yet their role has not been explored thoroughly. This is due to limitations in traditional short-read sequencing and linear reference-based analyses, which can miss complex structural changes. Tuberculosis (TB), a disease caused by Mycobacterium tuberculosis (Mtb), remains a major global health concern. In this study, we harness long-read sequencing technologies and genome graph tools to construct a Mtb pangenome reference graph (PRG) from 859 high-quality, diverse, long-read assemblies. To enable accurate genotyping of SVs leveraging the PRG, we developed miniwalk, a tool that outperforms a traditional linear genome-based approach in precision for SV detection. We characterize patterns of structural variation genome-wide, revealing a virulence-associated ESX-5 deletion to be recurrent across the phylogeny, and fixed in a sub-lineage of L4. Systematic screens for additional genes that are recurrently affected by SVs implicated those related to metal homeostasis, including a copper exporter fixed in the widely distributed L1.2.1 sub-lineage. Lastly, we genotyped 41,134 isolates and found SVs putatively associated with resistance to various first and second-line drugs. These findings underscore the broader role of SVs in shaping Mtb diversity, highlighting their importance in both understanding evolution and designing strategies to combat drug-resistant TB.},
}
MeSH Terms:
*Mycobacterium tuberculosis/genetics/drug effects/pathogenicity
*Genome, Bacterial/genetics
Phylogeny
*Evolution, Molecular
*Drug Resistance, Bacterial/genetics
*Genomic Structural Variation/genetics
Humans
Antitubercular Agents/pharmacology
Tuberculosis/microbiology
Genotype
RevDate: 2025-12-01
CmpDate: 2025-11-28
Origin and development of two Escherichia coli clones vertically transferred in broiler production.
Microbial genomics, 11(11):.
Investigation of clonal development of dominant persistent clones of avian pathogenic Escherichia coli (APEC) is important to understand their evolution and to gain knowledge to improve their control in poultry production. Whole-genomic sequencing, including hybrid assembled genomes of short and long reads, was used to analyse clonal persistence and evolution of APEC. Two vertically transferred E. coli clones, represented by ten isolates from sequence type (ST) 95-PFGE type 65 and eight isolates from ST131-PFGE type 47, were selected to identify genomic variations. The isolates had been sampled in broiler production during a period of 9 months in a previous study. The main differences among strains within each clone were related to plasmids, transposases, incomplete phage elements and amino acid substitutions which by far exceeded the genetic variation related to core-genome SNPs (cgSNPs). Fourier-transform infrared spectroscopy was, for the most part, only able to trace clones within the same ST. The genome-wide mutation rate was equivalent to 1.48 mutations per genome per year for ST95-PFGE65 and 2.86 for ST131-PFGE47, respectively. The most recent common ancestors were estimated back to 2009 for ST95-PFGE65 and to 2011 for ST131-PFGE47, with further divergence occurring in years until sampling in 2014-2015. The methodology introduced is able to trace the temporal origin of APEC clones. The conventional threshold of ten or fewer cgSNPs to include strains in the same clone did not consider any gain or loss of plasmids for the strains compared. On average, one plasmid transfer event was predicted every second year. For strains expected to be vertically transferred during the long production periods of great-grandparents over grandparents and parents to broilers, one to two plasmid transfers are therefore predicted, and several cgSNPs may be introduced, whereas up to one cgSNP is expected to be manifested during a broiler production cycle and rarely involving plasmid transfer.
Additional Links: PMID-41313693
Citation:
@article {pmid41313693,
year = {2025},
author = {Zhao, Y and Heuvelink, A and Olsen, JE and Poulsen, L and Christensen, H},
title = {Origin and development of two Escherichia coli clones vertically transferred in broiler production.},
journal = {Microbial genomics},
volume = {11},
number = {11},
pages = {},
pmid = {41313693},
issn = {2057-5858},
mesh = {Animals ; *Escherichia coli/genetics/classification/isolation & purification/pathogenicity ; *Chickens/microbiology ; *Escherichia coli Infections/microbiology/veterinary ; Plasmids/genetics ; *Poultry Diseases/microbiology ; Whole Genome Sequencing ; Genome, Bacterial ; Polymorphism, Single Nucleotide ; Phylogeny ; },
abstract = {Investigation of clonal development of dominant persistent clones of avian pathogenic Escherichia coli (APEC) is important to understand their evolution and to gain knowledge to improve their control in poultry production. Whole-genomic sequencing, including hybrid assembled genomes of short and long reads, was used to analyse clonal persistence and evolution of APEC. Two vertically transferred E. coli clones, represented by ten isolates from sequence type (ST) 95-PFGE type 65 and eight isolates from ST131-PFGE type 47, were selected to identify genomic variations. The isolates had been sampled in broiler production during a period of 9 months in a previous study. The main differences among strains within each clone were related to plasmids, transposases, incomplete phage elements and amino acid substitutions which by far exceeded the genetic variation related to core-genome SNPs (cgSNPs). Fourier-transform infrared spectroscopy was, for the most part, only able to trace clones within the same ST. The genome-wide mutation rate was equivalent to 1.48 mutations per genome per year for ST95-PFGE65 and 2.86 for ST131-PFGE47, respectively. The most recent common ancestors were estimated back to 2009 for ST95-PFGE65 and to 2011 for ST131-PFGE47, with further divergence occurring in years until sampling in 2014-2015. The methodology introduced is able to trace the temporal origin of APEC clones. The conventional threshold of ten or fewer cgSNPs to include strains in the same clone did not consider any gain or loss of plasmids for the strains compared. On average, one plasmid transfer event was predicted every second year. For strains expected to be vertically transferred during the long production periods of great-grandparents over grandparents and parents to broilers, one to two plasmid transfers are therefore predicted, and several cgSNPs may be introduced, whereas up to one cgSNP is expected to be manifested during a broiler production cycle and rarely involving plasmid transfer.},
}
MeSH Terms:
Animals
*Escherichia coli/genetics/classification/isolation & purification/pathogenicity
*Chickens/microbiology
*Escherichia coli Infections/microbiology/veterinary
Plasmids/genetics
*Poultry Diseases/microbiology
Whole Genome Sequencing
Genome, Bacterial
Polymorphism, Single Nucleotide
Phylogeny
RevDate: 2025-11-28
CmpDate: 2025-11-28
Pan-genome analysis of Morganella morganii reveals niche-specific selection of functional traits: friend or foe?.
Archives of microbiology, 208(1):40.
Morganella morganii exemplifies a typical case of an open pangenome, where genes move intra- and interspecies via horizontal gene transfer. Through pangenome analysis, the study maps three agriculture isolates; M. morganii with strong plant growth promoting (PGP) activity, along with 78 publicly available genomes from clinical, food, wastewater, and animal sources. The analysis showed 20,860 gene clusters with only 9.99% core genes and a discriminating distribution of 75.20% cloud genes across different niches. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed 33, 36, and 38 genes related to nutrient solubilization in M. morganii isolates HM01, HM02, and HM03, respectively. Chemotaxis genes, crucial for stress response, were most abundant in HM03 (30), followed by HM01 (17) and HM02 (27). Additionally, numerous biosynthetic gene clusters encoding antibacterial and antifungal metabolites were identified. Clinical and wastewater isolates harboured a higher number of mobile genetic element (MGE) linked antimicrobial resistance (AMR) genes that confer resistance to 15 antibiotic classes. These AMR genes were predominantly plasmid-borne and found to transfer in M. morganii from clinical pathogens such as E. coli and A. baumannii. This study indicates that habitat pressure creates the scenario for selection of functional traits which enables the ecosystem specific survival of M. morganii. Together, the present investigation provides important insight into the genomic diversity and remarkable PGP potential of M. morganii strains for sustainable agriculture. The pangenome analysis proposes that detailed investigation is needed to confirm their efficacy as PGP bacteria and to distinguish them from pathogenic strains.
Additional Links: PMID-41313384
Citation:
@article {pmid41313384,
year = {2025},
author = {Pal, R and Poddar, BJ and D Pandit, P and Purohit, HJ and Warke, R and Warke, GM},
title = {Pan-genome analysis of Morganella morganii reveals niche-specific selection of functional traits: friend or foe?.},
journal = {Archives of microbiology},
volume = {208},
number = {1},
pages = {40},
pmid = {41313384},
issn = {1432-072X},
mesh = {*Genome, Bacterial ; *Morganella morganii/genetics/isolation & purification/drug effects/classification ; Gene Transfer, Horizontal ; Anti-Bacterial Agents/pharmacology ; Multigene Family ; Drug Resistance, Bacterial/genetics ; Phylogeny ; Wastewater/microbiology ; Animals ; },
abstract = {Morganella morganii exemplifies a typical case of an open pangenome, where genes move intra- and interspecies via horizontal gene transfer. Through pangenome analysis, the study maps three agriculture isolates; M. morganii with strong plant growth promoting (PGP) activity, along with 78 publicly available genomes from clinical, food, wastewater, and animal sources. The analysis showed 20,860 gene clusters with only 9.99% core genes and a discriminating distribution of 75.20% cloud genes across different niches. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed 33, 36, and 38 genes related to nutrient solubilization in M. morganii isolates HM01, HM02, and HM03, respectively. Chemotaxis genes, crucial for stress response, were most abundant in HM03 (30), followed by HM01 (17) and HM02 (27). Additionally, numerous biosynthetic gene clusters encoding antibacterial and antifungal metabolites were identified. Clinical and wastewater isolates harboured a higher number of mobile genetic element (MGE) linked antimicrobial resistance (AMR) genes that confer resistance to 15 antibiotic classes. These AMR genes were predominantly plasmid-borne and found to transfer in M. morganii from clinical pathogens such as E. coli and A. baumannii. This study indicates that habitat pressure creates the scenario for selection of functional traits which enables the ecosystem specific survival of M. morganii. Together, the present investigation provides important insight into the genomic diversity and remarkable PGP potential of M. morganii strains for sustainable agriculture. The pangenome analysis proposes that detailed investigation is needed to confirm their efficacy as PGP bacteria and to distinguish them from pathogenic strains.},
}
MeSH Terms:
*Genome, Bacterial
*Morganella morganii/genetics/isolation & purification/drug effects/classification
Gene Transfer, Horizontal
Anti-Bacterial Agents/pharmacology
Multigene Family
Drug Resistance, Bacterial/genetics
Phylogeny
Wastewater/microbiology
Animals
RevDate: 2025-11-28
Comprehensive genomic analysis of a novel Bacillus cereus decomposing toluene potentially applicable in bioremediation.
Microbiology spectrum [Epub ahead of print].
UNLABELLED: Environmental pollution treatment is a worldwide challenge. Biological treatments are preferred to remedy the pollutants owing to higher efficiency and biocompatibility. Here, 18 indigenous bacterial strains with the usability of toluene as the sole carbon and energy source were screened, resulting in the selection of one isolate that was most capable of removing toluene. This isolate was characterized as a rod-shaped, gram-positive, spore-forming, halotolerant, catalase-positive, motile bacterium. The optimal temperature (40°C), pH (8), and NaCl concentration (50 g/L) for toluene removal were determined. Then, the experiment with four variables (petrochemical effluent dilution, carbon source, minerals, and amount of municipal wastewater) was designed using a Taguchi L16 orthogonal array. Ultimately, a 38.15% chemical oxygen demand decrease, as the best removal efficiency, was achieved. Gas chromatography-mass spectrometry analysis of bacterial metabolites assisted in recognizing the toluene degradation products, such as acetate and 2-propanone, indicating a meta-pathway of toluene decomposition. Next-generation sequencing strategy followed by core/pan-genome analysis indicated that the genome of the strain encompasses an assembly size of 5,154,622 bp with a content of 35.34% comprising 3,437 core and 2,258 accessory genes. Phylogenetic and multi-locus sequence typing studies confirmed that the strain belongs to the genus Bacillus and the species cereus. All the annotated and anticipated coding sequences were allocated to Kyoto Encyclopedia of Genes and Genomes routes. Accordingly, genes contributing to the toluene degradation pathway, including genes coding catechol 2, 3-dioxygenase, 4-oxalocrotonate isomerase, 2-hydroxymuconate-semialdehyde hydrolase, 2-keto-4-pentenoate hydratase, acetaldehyde dehydrogenase, alcohol dehydrogenase, flavocytochrome P450, anthranilate 3-monooxygenase (flavin adenine dinucleotide [FAD]), and flavin-dependent oxidoreductase, were ascertained. Herein, a general view of the GYRND102 strain is presented, which led to providing further insight into the decomposition mechanisms of aromatic pollutants by the strain for its effective application at the industrial level. Therefore, using this strain for pollutant removal will be cost-effective due to the simplicity of the method and the fact that it requires no special facilities or equipment.
IMPORTANCE: The purpose of the present study was to develop a feasible process for the treatment of heavily contaminated Mahshahr petrochemical effluent. Since regular treatment of this effluent was inefficient, isolation of an indigenous bacterium with the ability to decompose toluene (the main pollutant of the effluent) and use it as a biological material to treat wastewater containing aromatic compounds on an industrial scale was very important.
Additional Links: PMID-41313188
Citation:
@article {pmid41313188,
year = {2025},
author = {Safari, M and Ghahroodian, S and Abyarazimi, Mv and Rahmaniyan, S and Heydaryan, F and Rezaei, M and Yakhchali, B},
title = {Comprehensive genomic analysis of a novel Bacillus cereus decomposing toluene potentially applicable in bioremediation.},
journal = {Microbiology spectrum},
volume = {},
number = {},
pages = {e0289224},
doi = {10.1128/spectrum.02892-24},
pmid = {41313188},
issn = {2165-0497},
abstract = {UNLABELLED: Environmental pollution treatment is a worldwide challenge. Biological treatments are preferred to remedy the pollutants owing to higher efficiency and biocompatibility. Here, 18 indigenous bacterial strains with the usability of toluene as the sole carbon and energy source were screened, resulting in the selection of one isolate that was most capable of removing toluene. This isolate was characterized as a rod-shaped, gram-positive, spore-forming, halotolerant, catalase-positive, motile bacterium. The optimal temperature (40°C), pH (8), and NaCl concentration (50 g/L) for toluene removal were determined. Then, the experiment with four variables (petrochemical effluent dilution, carbon source, minerals, and amount of municipal wastewater) was designed using a Taguchi L16 orthogonal array. Ultimately, a 38.15% chemical oxygen demand decrease, as the best removal efficiency, was achieved. Gas chromatography-mass spectrometry analysis of bacterial metabolites assisted in recognizing the toluene degradation products, such as acetate and 2-propanone, indicating a meta-pathway of toluene decomposition. Next-generation sequencing strategy followed by core/pan-genome analysis indicated that the genome of the strain encompasses an assembly size of 5,154,622 bp with a content of 35.34% comprising 3,437 core and 2,258 accessory genes. Phylogenetic and multi-locus sequence typing studies confirmed that the strain belongs to the genus Bacillus and the species cereus. All the annotated and anticipated coding sequences were allocated to Kyoto Encyclopedia of Genes and Genomes routes. Accordingly, genes contributing to the toluene degradation pathway, including genes coding catechol 2, 3-dioxygenase, 4-oxalocrotonate isomerase, 2-hydroxymuconate-semialdehyde hydrolase, 2-keto-4-pentenoate hydratase, acetaldehyde dehydrogenase, alcohol dehydrogenase, flavocytochrome P450, anthranilate 3-monooxygenase (flavin adenine dinucleotide [FAD]), and flavin-dependent oxidoreductase, were ascertained. Herein, a general view of the GYRND102 strain is presented, which led to providing further insight into the decomposition mechanisms of aromatic pollutants by the strain for its effective application at the industrial level. Therefore, using this strain for pollutant removal will be cost-effective due to the simplicity of the method and the fact that it requires no special facilities or equipment.
IMPORTANCE: The purpose of the present study was to develop a feasible process for the treatment of heavily contaminated Mahshahr petrochemical effluent. Since regular treatment of this effluent was inefficient, isolation of an indigenous bacterium with the ability to decompose toluene (the main pollutant of the effluent) and use it as a biological material to treat wastewater containing aromatic compounds on an industrial scale was very important.},
}
RevDate: 2025-11-28
STEC in the natural environment of Uruguay: genomic surveillance and environmental circulation in the framework of One Health.
Microbiology spectrum [Epub ahead of print].
UNLABELLED: Shiga toxin-producing Escherichia coli (STEC) are zoonotic pathogens with significant public health implications. The interplay between humans, domestic animals, wildlife, and environmental reservoirs such as water bodies influences their circulation. This study aimed to evaluate the role of the environment in the dissemination of STEC. Fourteen STEC isolates were recovered from stream water and feces of wild animals collected between 2022 and 2024 in a tourist and wild village-like region in Uruguay. All isolates carried the stx2 gene, and one also harbored stx1 and eae (E. coli O157:H7). The remaining isolates were LEE-negative. Most isolates were susceptible to tested antibiotics. Whole genome sequencing (WGS) revealed a wide diversity of serotypes, sequence types, and virulence gene profiles. Pangenome analysis revealed genomic variability among the analyzed isolates, suggesting the presence of adaptive mechanisms and potential ecological plasticity. The Locus of Adhesion and Autoaggregation (LAA), a pathogenicity island described in some LEE-negative strains, was detected complete in three isolates and partially in three others. Core genome SNP-based phylogenetics and cgMLST analysis (chewBBACA) indicated that the local isolates clustered with strains previously reported from food, livestock, and wildlife across the Americas and Europe, suggesting genetic relatedness. These findings underscore the STEC's environmental diversity and potential public health risk of STEC circulating in a One Health context. The seasonal detection of the isolates, genomic features, and phylogenetic relationships suggests ongoing interactions between human activity, wildlife, and water sources as key factors in STEC ecology.
IMPORTANCE: Understanding the presence of Shiga toxin-producing Escherichia coli (STEC) in the environment is essential to assess potential exposure pathways for animals and humans. In this study, we analyzed STEC isolates recovered from wildlife and environmental samples within a single watershed under a One Health framework. Comparative genomic analyses revealed considerable diversity and the presence of shared virulence determinants, including the Locus of Adhesion and Autoaggregation (LAA) pathogenicity island. The detection of related lineages across ecological compartments may suggest possible exchange between environmental and animal reservoirs. These findings provide novel evidence of the environmental presence and dissemination of STEC in this watershed and highlight the importance of integrating genomic surveillance across environmental, animal, and human sectors to better understand the ecological dynamics and public health implications of this pathogen.
Additional Links: PMID-41313028
@article {pmid41313028,
year = {2025},
author = {Figueroa, Y and Stoletniy, C and Michelacci, V and Montero, D and Silvera, J and Martínez de la Escalera, G and Bertoglio, F and Croci, C and Lepillanca, F and Coitiño, H and Zunino, P and Vidal, RM and Morabito, S and Piccini, C and Umpiérrez, A},
title = {STEC in the natural environment of Uruguay: genomic surveillance and environmental circulation in the framework of One Health.},
journal = {Microbiology spectrum},
volume = {},
number = {},
pages = {e0315325},
doi = {10.1128/spectrum.03153-25},
pmid = {41313028},
issn = {2165-0497},
abstract = {UNLABELLED: Shiga toxin-producing Escherichia coli (STEC) are zoonotic pathogens with significant public health implications. The interplay between humans, domestic animals, wildlife, and environmental reservoirs such as water bodies influences their circulation. This study aimed to evaluate the role of the environment in the dissemination of STEC. Fourteen STEC isolates were recovered from stream water and feces of wild animals collected between 2022 and 2024 in a tourist and wild village-like region in Uruguay. All isolates carried the stx2 gene, and one also harbored stx1 and eae (E. coli O157:H7). The remaining isolates were LEE-negative. Most isolates were susceptible to tested antibiotics. Whole genome sequencing (WGS) revealed a wide diversity of serotypes, sequence types, and virulence gene profiles. Pangenome analysis revealed genomic variability among the analyzed isolates, suggesting the presence of adaptive mechanisms and potential ecological plasticity. The Locus of Adhesion and Autoaggregation (LAA), a pathogenicity island described in some LEE-negative strains, was detected in complete form in three isolates and in partial form in three others. Core genome SNP-based phylogenetics and cgMLST analysis (chewBBACA) indicated that the local isolates clustered with strains previously reported from food, livestock, and wildlife across the Americas and Europe, suggesting genetic relatedness. These findings underscore the environmental diversity and potential public health risk of STEC circulating in a One Health context. The seasonal detection of the isolates, their genomic features, and their phylogenetic relationships suggest ongoing interactions between human activity, wildlife, and water sources as key factors in STEC ecology.
IMPORTANCE: Understanding the presence of Shiga toxin-producing Escherichia coli (STEC) in the environment is essential to assess potential exposure pathways for animals and humans. In this study, we analyzed STEC isolates recovered from wildlife and environmental samples within a single watershed under a One Health framework. Comparative genomic analyses revealed considerable diversity and the presence of shared virulence determinants, including the Locus of Adhesion and Autoaggregation (LAA) pathogenicity island. The detection of related lineages across ecological compartments may suggest possible exchange between environmental and animal reservoirs. These findings provide novel evidence of the environmental presence and dissemination of STEC in this watershed and highlight the importance of integrating genomic surveillance across environmental, animal, and human sectors to better understand the ecological dynamics and public health implications of this pathogen.},
}
RevDate: 2025-11-30
CmpDate: 2025-11-28
Comparative Analysis of Antibiotic Resistance and Genomic Characteristics of Enterococcus faecium and Enterococcus lactis Along the Food Chain - 5 PLADs, China, 2015-2024.
China CDC weekly, 7(46):1434-1440.
INTRODUCTION: Foodborne antibiotic-resistant enterococci pose significant risks to One Health and clinical antimicrobial efficacy through food chain transmission. Following the taxonomic reclassification of Enterococcus faecium (E. faecium), comprehensive long-term surveillance data on antibiotic resistance (ABR) patterns and genomic characteristics of E. faecium and Enterococcus lactis (E. lactis) across food animals, environmental sources, and human populations remain limited.
METHODS: A total of 2,233 samples were collected from multiple nodes along the food chain across 5 Chinese provincial-level administrative divisions (PLADs) during 2015-2019 and 2023-2024. E. faecium (87 isolates) and E. lactis (153 isolates) were identified through whole-genome sequencing and average nucleotide identity analysis. Antimicrobial susceptibility testing, comprehensive genomic content analysis, and pan-genome-wide association studies were performed.
RESULTS: E. faecium demonstrated significantly higher resistance rates to 12 antimicrobials compared with E. lactis (P<0.05). Conversely, E. lactis exhibited a higher resistance rate to erythromycin than E. faecium (P<0.01). The multidrug-resistant (MDR) rate of E. faecium (43/87, 49.4%) substantially exceeded that of E. lactis (16/153, 10.5%) (P<0.001). Genomic analysis revealed that E. faecium harbors significantly more antibiotic resistance genes, mobile genetic elements, and plasmid replicons than E. lactis. No significant interspecies differences were observed in virulence gene profiles associated with adhesion, immune modulation, biofilm formation, and exotoxin production.
CONCLUSIONS: E. faecium presents substantially greater ABR risks than E. lactis within the Chinese food chain, necessitating enhanced species-specific surveillance programs. Future interventions should prioritize targeted control strategies tailored to each species to effectively mitigate One Health threats.
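The MDR comparison reported above (43/87 versus 16/153) can be checked with a short calculation. The sketch below recomputes the two rates and runs a two-sided Fisher exact test using only the Python standard library. The abstract does not say which statistical test the authors used, so the choice of Fisher's test here is an assumption made for illustration, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Illustrative helper (not from the cited study): sums the hypergeometric
    probabilities of every table, with the same margins, that is no more
    likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x "successes" in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Counts taken from the abstract: MDR isolates / total isolates per species.
mdr_faecium, n_faecium = 43, 87
mdr_lactis, n_lactis = 16, 153

rate_faecium = round(100 * mdr_faecium / n_faecium, 1)  # 49.4
rate_lactis = round(100 * mdr_lactis / n_lactis, 1)     # 10.5

p_value = fisher_exact_two_sided(
    mdr_faecium, n_faecium - mdr_faecium,
    mdr_lactis, n_lactis - mdr_lactis,
)
print(rate_faecium, rate_lactis, p_value)
```

Under these assumptions the recomputed rates match the reported 49.4% and 10.5%, and the p-value falls below the reported threshold of 0.001.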
Additional Links: PMID-41312526
@article {pmid41312526,
year = {2025},
author = {Chen, W and Zheng, X and Wu, H and Jing, Y and Ye, Z and Peng, Z and Qiu, S},
title = {Comparative Analysis of Antibiotic Resistance and Genomic Characteristics of Enterococcus faecium and Enterococcus lactis Along the Food Chain - 5 PLADs, China, 2015-2024.},
journal = {China CDC weekly},
volume = {7},
number = {46},
pages = {1434-1440},
pmid = {41312526},
issn = {2096-7071},
abstract = {INTRODUCTION: Foodborne antibiotic-resistant enterococci pose significant risks to One Health and clinical antimicrobial efficacy through food chain transmission. Following the taxonomic reclassification of Enterococcus faecium (E. faecium), comprehensive long-term surveillance data on antibiotic resistance (ABR) patterns and genomic characteristics of E. faecium and Enterococcus lactis (E. lactis) across food animals, environmental sources, and human populations remain limited.
METHODS: A total of 2,233 samples were collected from multiple nodes along the food chain across 5 Chinese provincial-level administrative divisions (PLADs) during 2015-2019 and 2023-2024. E. faecium (87 isolates) and E. lactis (153 isolates) were identified through whole-genome sequencing and average nucleotide identity analysis. Antimicrobial susceptibility testing, comprehensive genomic content analysis, and pan-genome-wide association studies were performed.
RESULTS: E. faecium demonstrated significantly higher resistance rates to 12 antimicrobials compared with E. lactis (P<0.05). Conversely, E. lactis exhibited a higher resistance rate to erythromycin than E. faecium (P<0.01). The multidrug-resistant (MDR) rate of E. faecium (43/87, 49.4%) substantially exceeded that of E. lactis (16/153, 10.5%) (P<0.001). Genomic analysis revealed that E. faecium harbors significantly more antibiotic resistance genes, mobile genetic elements, and plasmid replicons than E. lactis. No significant interspecies differences were observed in virulence gene profiles associated with adhesion, immune modulation, biofilm formation, and exotoxin production.
CONCLUSIONS: E. faecium presents substantially greater ABR risks than E. lactis within the Chinese food chain, necessitating enhanced species-specific surveillance programs. Future interventions should prioritize targeted control strategies tailored to each species to effectively mitigate One Health threats.},
}
RevDate: 2025-11-28
The chromosome-level genome of Chinese indicine cattle breed provides insights into bovine adaptation and immunity.
DNA research : an international journal for rapid publication of reports on genes and genomes pii:8348289 [Epub ahead of print].
Genomic research is currently undergoing a paradigm shift from reliance on a single reference sequence to the use of breed-specific genomes. Chinese indicine cattle (Bos taurus indicus), characterized by their notable tick resistance and heat tolerance, display more extensive genetic diversity than taurine cattle. Here, we generated a chromosome-level genome assembly of Chinese indicine cattle, achieving a contiguity N50 of 90.92 Mb and an overall size of 2.91 Gb, using PacBio HiFi sequencing complemented by Hi-C sequencing technology. The assembly is characterized by near-complete chromosomes, telomeres, and fewer gaps. Using this high-quality assembly, we explored the phylogenetic relationship and speciation time. The gene family and selection signature analyses identified candidate genes and biosynthetic pathways potentially contributing to disease immunity and thermotolerance of indicine cattle. Altogether, this study enriches the bovine pangenome repository and advances our understanding of the complex evolutionary patterns and distinctive adaptation traits of Chinese indicine cattle.
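For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch shows the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The contig lengths are invented toy values, not data from the study, and the function name is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50() requires at least one sequence length")


# Toy contig lengths in Mb (hypothetical); total = 290, half = 145.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))  # 70, because 80 + 70 = 150 >= 145
```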
Additional Links: PMID-41311030
@article {pmid41311030,
year = {2025},
author = {Ge, F and Guo, Y and Xu, L and Low, WY and Ma, H and Li, Q and Wang, Z and Zhu, B and Xu, L and Gao, X and Zhang, L and Gao, H and Li, J and Chen, Y},
title = {The chromosome-level genome of Chinese indicine cattle breed provides insights into bovine adaptation and immunity.},
journal = {DNA research : an international journal for rapid publication of reports on genes and genomes},
volume = {},
number = {},
pages = {},
doi = {10.1093/dnares/dsaf034},
pmid = {41311030},
issn = {1756-1663},
abstract = {Genomic research is currently undergoing a paradigm shift from reliance on a single reference sequence to the use of breed-specific genomes. Chinese indicine cattle (Bos taurus indicus), characterized by their notable tick resistance and heat tolerance, display more extensive genetic diversity than taurine cattle. Here, we generated a chromosome-level genome assembly of Chinese indicine cattle, achieving a contiguity N50 of 90.92 Mb and an overall size of 2.91 Gb, using PacBio HiFi sequencing complemented by Hi-C sequencing technology. The assembly is characterized by near-complete chromosomes, telomeres, and fewer gaps. Using this high-quality assembly, we explored the phylogenetic relationship and speciation time. The gene family and selection signature analyses identified candidate genes and biosynthetic pathways potentially contributing to disease immunity and thermotolerance of indicine cattle. Altogether, this study enriches the bovine pangenome repository and advances our understanding of the complex evolutionary patterns and distinctive adaptation traits of Chinese indicine cattle.},
}
RevDate: 2025-12-01
Wheat chloroplast pangenome reveals frequent intramolecular recombination in the inverted repeat regions.
BMC plant biology, 25(1):1654.
BACKGROUND: In eukaryotes, the mutation rate of the chloroplasts is lower than that of the nuclear genomes. Advances in Next-Generation Sequencing (NGS) and Third-Generation Sequencing (TGS) technologies, together with improvements in genome assembly algorithms, have substantially propelled research in chloroplast genomics. Although nearly 9,000 chloroplast genomes have been released, chloroplast population genetics for specific species remains unexplored. The chloroplast genome possesses a quadripartite structure consisting of a large single-copy (LSC) region, a small single-copy (SSC) region, and two inverted repeats (IRs). A longstanding question is why the maternally inherited chloroplast genome does not appear to suffer from the Muller’s ratchet effect. It has been hypothesized that intramolecular recombination within the chloroplast genome may counteract this effect; however, direct evidence for such recombination remains lacking.
RESULTS: We conducted chloroplast population data analysis in hexaploid wheat and its ancestral relatives. On the basis of the constructed pan-genome and a phylogenetic analysis of the chloroplast genomes of all samples, we calculated that the chloroplast diversity of hexaploid wheat (π = 0.0001) is the lowest among the three ploidy types tested. Additionally, we found that during the formation of hexaploid wheat, only the chloroplasts from tetraploid wheat were inherited. Moreover, Aegilops tauschii contributed solely as the paternal provider of nuclear genome material. In the chloroplast genome assembly, we assembled IRa (inverted repeat A) and IRb (inverted repeat B), revealing multiple insertion/deletion sequence differences between them. Importantly, we discovered that recombination occurs between the IR regions of the chloroplast genome. Frequent recombination results in two structural configurations existing in nearly equal proportions within a single sample. This phenomenon has led to an increase in the nucleotide diversity of the chloroplast IR region within the wheat population, which was originally low among species in Poaceae.
CONCLUSION: This study demonstrates the feasibility of assembling chloroplast genomes using low-depth whole-genome sequencing (WGS) and confirms that the chloroplast genome of hexaploid wheat originates from tetraploid wheat rather than Aegilops tauschii. Furthermore, we provide evidence of frequent intra-molecular recombination in the chloroplast IR regions, leading to the coexistence of two equimolar inversion isomers. Despite strong purifying selection, recombination increases genetic diversity within the IR regions, facilitating adaptation and maintaining the functional stability of essential genes. Our findings highlight the role of recombination in balancing genetic stability and flexibility in chloroplast genome evolution, offering new insights into nuclear-cytoplasmic interactions and polyploid adaptation.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12870-025-07577-5.
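The nucleotide diversity value (π) quoted in the results above is, by its usual definition, the average number of pairwise differences per site across a set of aligned sequences. The sketch below illustrates that definition on invented toy sequences. It assumes equal-length, gap-free alignments and is not the authors' pipeline; the function name is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Average pairwise differences per site for equal-length sequences."""
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(aligned_seqs, 2))
    # Count mismatching sites over every unordered pair of sequences.
    total_diffs = sum(
        sum(x != y for x, y in zip(a, b)) for a, b in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment (hypothetical): 4 sequences, 10 sites, 6 pairwise differences.
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAC",
]
pi = nucleotide_diversity(toy_alignment)
print(pi)  # 6 differences / (6 pairs * 10 sites) = 0.1
```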
Additional Links: PMID-41310440
@article {pmid41310440,
year = {2025},
author = {Zhang, H and Qiu, X and Zhang, Z and Zhang, J and Qing, Y and Guo, Y and Song, X and Liang, C and Sun, Y and Zhao, Y and Yin, C and Wang, J and Lu, F and Chen, M},
title = {Wheat chloroplast pangenome reveals frequent intramolecular recombination in the inverted repeat regions.},
journal = {BMC plant biology},
volume = {25},
number = {1},
pages = {1654},
pmid = {41310440},
issn = {1471-2229},
support = {2023YFF1000604//National Key Research and Development Program of China/ ; },
abstract = {BACKGROUND: In eukaryotes, the mutation rate of the chloroplasts is lower than that of the nuclear genomes. Advances in Next-Generation Sequencing (NGS) and Third-Generation Sequencing (TGS) technologies, together with improvements in genome assembly algorithms, have substantially propelled research in chloroplast genomics. Although nearly 9,000 chloroplast genomes have been released, chloroplast population genetics for specific species remains unexplored. The chloroplast genome possesses a quadripartite structure consisting of a large single-copy (LSC) region, a small single-copy (SSC) region, and two inverted repeats (IRs). A longstanding question is why the maternally inherited chloroplast genome does not appear to suffer from the Muller’s ratchet effect. It has been hypothesized that intramolecular recombination within the chloroplast genome may counteract this effect; however, direct evidence for such recombination remains lacking.
RESULTS: We conducted chloroplast population data analysis in hexaploid wheat and its ancestral relatives. On the basis of the constructed pan-genome and a phylogenetic analysis of the chloroplast genomes of all samples, we calculated that the chloroplast diversity of hexaploid wheat (π = 0.0001) is the lowest among the three ploidy types tested. Additionally, we found that during the formation of hexaploid wheat, only the chloroplasts from tetraploid wheat were inherited. Moreover, Aegilops tauschii contributed solely as the paternal provider of nuclear genome material. In the chloroplast genome assembly, we assembled IRa (inverted repeat A) and IRb (inverted repeat B), revealing multiple insertion/deletion sequence differences between them. Importantly, we discovered that recombination occurs between the IR regions of the chloroplast genome. Frequent recombination results in two structural configurations existing in nearly equal proportions within a single sample. This phenomenon has led to an increase in the nucleotide diversity of the chloroplast IR region within the wheat population, which was originally low among species in Poaceae.
CONCLUSION: This study demonstrates the feasibility of assembling chloroplast genomes using low-depth whole-genome sequencing (WGS) and confirms that the chloroplast genome of hexaploid wheat originates from tetraploid wheat rather than Aegilops tauschii. Furthermore, we provide evidence of frequent intra-molecular recombination in the chloroplast IR regions, leading to the coexistence of two equimolar inversion isomers. Despite strong purifying selection, recombination increases genetic diversity within the IR regions, facilitating adaptation and maintaining the functional stability of essential genes. Our findings highlight the role of recombination in balancing genetic stability and flexibility in chloroplast genome evolution, offering new insights into nuclear-cytoplasmic interactions and polyploid adaptation.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12870-025-07577-5.},
}
RevDate: 2025-11-30
CmpDate: 2025-11-27
Bovine Clinical E. coli Mastitis in Italian Dairy Herds Is Not Associated with a Specific Pathotype.
Pathogens (Basel, Switzerland), 14(11):.
BACKGROUND: Escherichia coli is a cause of severe clinical bovine mastitis; however, it is not yet fully understood what makes mastitis-associated bacteria different from commensal strains at the genetic level. The goal of this study was to compare the genomic features, sequence types, virulence, and antibiotic resistance profiles of E. coli isolated from healthy cows and cows with clinical mastitis in Northern Italy.
METHODS: Between 2023 and 2024, 46 E. coli isolates were recovered, 23 from healthy animals and 23 from mastitis cases. Standard phenotypic approaches and Oxford Nanopore sequencing were used to investigate the genomic landscape of the strains.
RESULTS: Phylogroups A and B1 were the most common in both groups. MLST showed several types, with ST10 (19.6%), ST58 (13.0%), and ST69 (8.7%) being the most common. There was no lineage that was uniquely able to describe the isolates as Mammary Pathogenic Escherichia coli (MPEC); indeed, the ST distribution and phylogeny were the same in both groups. A total of 47.8% of isolates had antimicrobial resistance determinants, with β-lactamases (21.7%) and tetA (15.2%) being the most common. No significant differences in resistance rates were observed between mastitis and healthy isolates. Pangenome investigation found a large pool of accessory genes, but no genomic signature that distinguished mastitis from commensal isolates across the MPEC.
CONCLUSIONS: Bovine E. coli isolated from milk of both healthy and mastitic cows share sequence types, resistance rates, and accessory genome content, supporting the absence of a unique MPEC pathotype and highlighting the ecological versatility of these bacteria.
Additional Links: PMID-41305417
@article {pmid41305417,
year = {2025},
author = {Laterza, G and Meroni, G and Soggiu, A and Martino, PA and Sora, VM and Zaghen, F and Bonizzi, L and Colombo, L and Zecconi, A},
title = {Bovine Clinical E. coli Mastitis in Italian Dairy Herds Is Not Associated with a Specific Pathotype.},
journal = {Pathogens (Basel, Switzerland)},
volume = {14},
number = {11},
pages = {},
pmid = {41305417},
issn = {2076-0817},
mesh = {Animals ; Cattle ; *Escherichia coli/genetics/pathogenicity/classification/isolation & purification/drug effects ; *Mastitis, Bovine/microbiology/epidemiology ; Italy/epidemiology ; Female ; *Escherichia coli Infections/veterinary/microbiology ; Multilocus Sequence Typing ; Phylogeny ; Dairying ; Anti-Bacterial Agents/pharmacology ; Drug Resistance, Bacterial ; Virulence ; Virulence Factors/genetics ; Microbial Sensitivity Tests ; Genotype ; },
abstract = {BACKGROUND: Escherichia coli is a cause of severe clinical bovine mastitis; however, it is not yet fully understood what makes mastitis-associated bacteria different from commensal strains at the genetic level. The goal of this study was to compare the genomic features, sequence types, virulence, and antibiotic resistance profiles of E. coli isolated from healthy cows and cows with clinical mastitis in Northern Italy.
METHODS: Between 2023 and 2024, 46 E. coli isolates were recovered, 23 from healthy animals and 23 from mastitis cases. Standard phenotypic approaches and Oxford Nanopore sequencing were used to investigate the genomic landscape of the strains.
RESULTS: Phylogroups A and B1 were the most common in both groups. MLST showed several types, with ST10 (19.6%), ST58 (13.0%), and ST69 (8.7%) being the most common. There was no lineage that was uniquely able to describe the isolates as Mammary Pathogenic Escherichia coli (MPEC); indeed, the ST distribution and phylogeny were the same in both groups. A total of 47.8% of isolates had antimicrobial resistance determinants, with β-lactamases (21.7%) and tetA (15.2%) being the most common. No significant differences in resistance rates were observed between mastitis and healthy isolates. Pangenome investigation found a large pool of accessory genes, but no genomic signature that distinguished mastitis from commensal isolates across the MPEC.
CONCLUSIONS: Bovine E. coli isolated from milk of both healthy and mastitic cows share sequence types, resistance rates, and accessory genome content, supporting the absence of a unique MPEC pathotype and highlighting the ecological versatility of these bacteria.},
}
MeSH Terms:
Animals
Cattle
*Escherichia coli/genetics/pathogenicity/classification/isolation & purification/drug effects
*Mastitis, Bovine/microbiology/epidemiology
Italy/epidemiology
Female
*Escherichia coli Infections/veterinary/microbiology
Multilocus Sequence Typing
Phylogeny
Dairying
Anti-Bacterial Agents/pharmacology
Drug Resistance, Bacterial
Virulence
Virulence Factors/genetics
Microbial Sensitivity Tests
Genotype
RevDate: 2025-11-30
CmpDate: 2025-11-27
Establishment of Specific Multiplex PCR Detection Methods for the Predominant tet(X)-Positive Acinetobacter Species.
Microorganisms, 13(11):.
The increasing prevalence of the mobile tigecycline resistance gene tet(X) poses a severe global health threat, and the genus Acinetobacter is a major reservoir. This study aimed to develop a rapid and specific multiplex PCR assay for detecting the predominant tet(X)-positive Acinetobacter species. Through pan-genome analyses of 390 tet(X)-positive Acinetobacter genomes, a total of 20 tet(X) variants were identified in 24 Acinetobacter species, including 17 published lineages and seven taxonomically unresolved taxa. Acinetobacter indicus (30.8%), Acinetobacter amyesii (17.2%), and Acinetobacter towneri (16.1%) were the top three hosts of diverse tet(X) variants. Species-specific signature genes were identified and used for primer design, yielding amplicons of 267 bp (tet(X)), 424 bp (A. indicus), 690 bp (A. amyesii), and 990 bp (A. towneri). The assay was optimized to an annealing temperature of 52.8 °C and a primer ratio of 1:1:1:1, and it demonstrated high sensitivity, with a detection limit of 0.3 ng/μL DNA, and excellent stability under -20 °C, 4 °C, and 20 °C storage conditions. Validation experiments on 151 bacterial strains showed high accuracy for DNA templates (≥97.8%) and bacterial suspensions (≥93.5%) within two hours. This cost-effective and highly accurate multiplex PCR provides a powerful tool for proactive surveillance and control of the critical Acinetobacter sp. pathogens.
Additional Links: PMID-41304269
@article {pmid41304269,
year = {2025},
author = {Chen, C and Liu, J and Gao, J and Wu, T and Huang, J},
title = {Establishment of Specific Multiplex PCR Detection Methods for the Predominant tet(X)-Positive Acinetobacter Species.},
journal = {Microorganisms},
volume = {13},
number = {11},
pages = {},
pmid = {41304269},
issn = {2076-2607},
support = {2024YFC2310300//National Key Research and Development Program of China/ ; 32402890//National Natural Science Foundation of China/ ; 2023M732993//China Postdoctoral Science Foundation/ ; },
abstract = {The increasing prevalence of the mobile tigecycline resistance gene tet(X) poses a severe global health threat, and the genus Acinetobacter is a major reservoir. This study aimed to develop a rapid and specific multiplex PCR assay for detecting the predominant tet(X)-positive Acinetobacter species. Through pan-genome analyses of 390 tet(X)-positive Acinetobacter genomes, a total of 20 tet(X) variants were identified in 24 Acinetobacter species, including 17 published lineages and seven taxonomically unresolved taxa. Acinetobacter indicus (30.8%), Acinetobacter amyesii (17.2%), and Acinetobacter towneri (16.1%) were the top three hosts of diverse tet(X) variants. Species-specific signature genes were identified and used for primer design, yielding amplicons of 267 bp (tet(X)), 424 bp (A. indicus), 690 bp (A. amyesii), and 990 bp (A. towneri). The assay was optimized to an annealing temperature of 52.8 °C and a primer ratio of 1:1:1:1, and it demonstrated high sensitivity, with a detection limit of 0.3 ng/μL DNA, and excellent stability under -20 °C, 4 °C, and 20 °C storage conditions. Validation experiments on 151 bacterial strains showed high accuracy for DNA templates (≥97.8%) and bacterial suspensions (≥93.5%) within two hours. This cost-effective and highly accurate multiplex PCR provides a powerful tool for proactive surveillance and control of the critical Acinetobacter sp. pathogens.},
}
RevDate: 2025-11-30
CmpDate: 2025-11-27
Paraburkholderia suaedae sp. nov., a Potential Plant Growth-Promoting Bacterium Isolated from the Halophyte Suaeda japonica.
Microorganisms, 13(11):.
A novel bacterial strain was isolated from the roots of Suaeda japonica, a halophytic plant inhabiting tidal zones. Phylogenetic, genomic, and phenotypic analyses identified the isolate as a novel species within the genus Paraburkholderia, for which the name Paraburkholderia suaedae sp. nov. is proposed. The strain exhibits multiple plant growth-promoting traits, including the production of 1-aminocyclopropane-1-carboxylic acid, indole-3-acetic acid, and siderophore, along with the ability to fix nitrogen and solubilize phosphate. Genomic analysis revealed genes associated with enhanced root surface adhesion and rhizosphere survival, such as those involved in thiamine biosynthesis and transport, and biofilm formation via poly-β-1,6-N-acetyl-D-glucosamine (PGA) synthesis. These features suggest the strain's potential for persistent colonization and beneficial interaction with host plants. Although its direct impact on plant growth has not yet been experimentally validated, the genetic and biochemical evidence supports its potential application in agriculture. The objective of this study was to conduct a polyphasic taxonomic characterization of a novel strain DGU8[T] isolated from the roots of the halophyte Suaeda japonica, and to assess its potential as a plant growth-promoting agent, particularly its tolerance to drought-related osmotic stress.
Additional Links: PMID-41304184
@article {pmid41304184,
year = {2025},
author = {Park, S and Lee, H and Yook, S and Baek, C and Kim, J and Kwak, S and Na, T and Seo, T},
title = {Paraburkholderia suaedae sp. nov., a Potential Plant Growth-Promoting Bacterium Isolated from the Halophyte Suaeda japonica.},
journal = {Microorganisms},
volume = {13},
number = {11},
pages = {},
pmid = {41304184},
issn = {2076-2607},
support = {202402203//National Institute of Biological Resources (KR)/ ; 2022R1F1A1070108//National Research Foundation of Korea/ ; },
abstract = {A novel bacterial strain was isolated from the roots of Suaeda japonica, a halophytic plant inhabiting tidal zones. Phylogenetic, genomic, and phenotypic analyses identified the isolate as a novel species within the genus Paraburkholderia, for which the name Paraburkholderia suaedae sp. nov. is proposed. The strain exhibits multiple plant growth-promoting traits, including the production of 1-aminocyclopropane-1-carboxylic acid, indole-3-acetic acid, and siderophore, along with the ability to fix nitrogen and solubilize phosphate. Genomic analysis revealed genes associated with enhanced root surface adhesion and rhizosphere survival, such as those involved in thiamine biosynthesis and transport, and biofilm formation via poly-β-1,6-N-acetyl-D-glucosamine (PGA) synthesis. These features suggest the strain's potential for persistent colonization and beneficial interaction with host plants. Although its direct impact on plant growth has not yet been experimentally validated, the genetic and biochemical evidence supports its potential application in agriculture. The objective of this study was to conduct a polyphasic taxonomic characterization of a novel strain DGU8[T] isolated from the roots of the halophyte Suaeda japonica, and to assess its potential as a plant growth-promoting agent, particularly its tolerance to drought-related osmotic stress.},
}
RevDate: 2025-11-30
CmpDate: 2025-11-27
Comparative Genomic Analysis of Brevibacillus brevis: Insights into Pan-Genome Diversity and Biocontrol Potential.
Microorganisms, 13(11):.
The promising biocontrol agent Brevibacillus brevis is a broadly dispersed bacterium exhibiting significant antibacterial properties against plant diseases. This study conducted a comprehensive comparative genomic analysis of 25 B. brevis strains to examine their taxonomic classification, genetic diversity, and biocontrol potential. The genome sizes, excluding strain NEB573, varied from 5.95 to 6.73 Mb, with GC content between 47.0% and 47.5%. Notably, strain NEB573 exhibited distinct genomic characteristics based on Average Nucleotide Identity (ANI), digital DNA-DNA hybridisation (dDDH), and phylogenetic analyses, suggesting it may represent a novel Brevibacillus species pending additional phenotypic confirmation. The remaining 24 strains were grouped into six phylogenetic clades. The pan-genome analysis revealed significant genomic flexibility and an open architecture, with 2855 core gene families (33.08%) and 1699 distinct genes. Functional annotations indicated that unique genes were enriched in functions related to DNA repair and environmental adaptation, while core genes predominantly participated in amino acid metabolism and transcription. The examination of biosynthetic gene clusters (BGCs) identified multiple antimicrobial compounds, such as gramicidin and tyrocidine, which have been reported to exhibit both antibacterial and antifungal activities, thereby underscoring the broad-spectrum biocontrol potential of B. brevis. These findings support the use of B. brevis for biocontrol in sustainable plant disease management and offer novel perspectives on the genetic basis of its biocontrol activity. Future investigations of its metabolic repertoire may unveil novel agro-biotechnological applications.
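The core, accessory, and unique gene categories used in pan-genome summaries such as the one above can be illustrated with a small presence/absence table. The sketch below is a simplified illustration on invented toy data. The strain names, family names, and the strict "present in every strain" core threshold are assumptions, and real pan-genome tools apply their own clustering and thresholds.

```python
def partition_pangenome(presence, strains):
    """Split gene families into core, accessory, and unique sets.

    presence: dict mapping gene family name -> set of strains carrying it.
    strains:  collection of all strain names in the comparison.
    """
    n_strains = len(set(strains))
    core, accessory, unique = [], [], []
    for family, carriers in sorted(presence.items()):
        if len(carriers) == n_strains:
            core.append(family)        # found in every strain
        elif len(carriers) == 1:
            unique.append(family)      # found in exactly one strain
        else:
            accessory.append(family)   # found in some, but not all
    return core, accessory, unique


# Hypothetical toy data: four strains and six gene families.
toy_strains = ["s1", "s2", "s3", "s4"]
toy_presence = {
    "famA": {"s1", "s2", "s3", "s4"},
    "famB": {"s1", "s2"},
    "famC": {"s3"},
    "famD": {"s1", "s2", "s3", "s4"},
    "famE": {"s2", "s3", "s4"},
    "famF": {"s4"},
}

core, accessory, unique = partition_pangenome(toy_presence, toy_strains)
core_fraction = len(core) / len(toy_presence)
print(core, accessory, unique, round(100 * core_fraction, 2))
```

In this toy case two of six families are core, so the core fraction plays the same role as the 33.08% figure reported in the abstract, which is computed over the study's real gene families.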
Additional Links: PMID-41304142
PubMed:
Citation:
@article {pmid41304142,
year = {2025},
author = {Yang, W and Bao, Q and Wang, Y and Xiao, L and Zeng, Z and Zhou, L and Yang, H},
title = {Comparative Genomic Analysis of Brevibacillus brevis: Insights into Pan-Genome Diversity and Biocontrol Potential.},
journal = {Microorganisms},
volume = {13},
number = {11},
pages = {},
pmid = {41304142},
issn = {2076-2607},
support = {32072625//National Natural Science Foundation of China/ ; 32372772//National Natural Science Foundation of China/ ; 2024CX84//Agricultural Science and Technology Innovation Fund Project of Hunan Province/ ; 2024RC2078//Yuelushan Laboratory Talent Program/ ; 2023NK2013//Key Research and Development Program of Hunan Province/ ; },
abstract = {The promising biocontrol agent Brevibacillus brevis is a broadly dispersed bacterium exhibiting significant antibacterial properties against plant diseases. This study conducted a comprehensive comparative genomic analysis of 25 B. brevis strains to examine their taxonomic classification, genetic diversity, and biocontrol potential. The genome sizes, excluding strain NEB573, varied from 5.95 to 6.73 Mb, with GC content between 47.0% and 47.5%. Notably, strain NEB573 exhibited distinct genomic characteristics based on Average Nucleotide Identity (ANI), digital DNA-DNA hybridisation (dDDH), and phylogenetic analyses, suggesting it may represent a novel Brevibacillus species pending additional phenotypic confirmation. The remaining 24 strains were grouped into six phylogenetic clades. The pan-genome study demonstrated significant genomic flexibility, demonstrating an open architecture with 2855 core gene families (33.08%) and 1699 distinct genes. Functional annotations indicated that unique genes were enriched in tasks related to DNA repair and environmental adaptation, while core genes predominantly participated in amino acid metabolism and transcription. The examination of biosynthetic gene clusters (BGCs) identified multiple antimicrobial compounds, such as gramicidin and tyrocidine, which have been reported to exhibit both antibacterial and antifungal activities, thereby underscoring the broad-spectrum biocontrol potential of B. brevis. These findings endorse the application of biocontrol in sustainable plant disease management and offer novel perspectives on its genetic basis in B. brevis. Future investigations of its metabolic repertoire may unveil novel agro-biotechnological applications.},
}
RevDate: 2025-11-30
CmpDate: 2025-11-27
Genomic Surveillance and Resistance Profiling of Multidrug-Resistant Acinetobacter baumannii Clinical Isolates: Clonal Diversity and Virulence Insights.
Microorganisms, 13(11).
Acinetobacter baumannii is a multidrug-resistant opportunistic pathogen that poses critical challenges in hospital settings due to its environmental resilience and high resistance to antibiotics. Genomic surveillance has become essential for identifying transmission patterns, guiding antimicrobial stewardship, and informing infection control policies. We conducted whole-genome sequencing on 44 A. baumannii isolates collected between 2022 and 2023 from diverse wards in an Italian hospital. Illumina-based sequencing was followed by a comprehensive bioinformatics pipeline, including genome assembly, taxonomic validation, MLST, SNP-based phylogeny, pan-genome analysis, antimicrobial resistance (AMR) gene profiling, and virulence factor prediction. Most isolates were classified as ST2; SAMPLE-34 was ST1 and genetically distinct. Phylogenetic analysis revealed four clonal clusters with cluster-specific AMR and accessory gene content. The pan-genome included 5050 genes, with notable variation linked to hospital ward origin. ICU and internal medicine strains carried higher loads of AMR genes, especially against aminoglycosides, β-lactams, and quinolones. Virulence profiling highlighted widespread immune evasion mechanisms; "Acenovactin" was predominant, while some isolates lacked key adhesion or toxin factors. Our findings underscore the clinical relevance of integrating genomic epidemiology into routine hospital surveillance. Identifying clonal clusters and resistance signatures supports real-time outbreak detection, risk stratification, and targeted infection prevention strategies.
Additional Links: PMID-41304116
PubMed:
Citation:
@article {pmid41304116,
year = {2025},
author = {Ristori, MV and Pirona, I and De Florio, L and Aita, SE and Macari, G and Spoto, S and Antonelli Incalzi, R and Angeletti, S},
title = {Genomic Surveillance and Resistance Profiling of Multidrug-Resistant Acinetobacter baumannii Clinical Isolates: Clonal Diversity and Virulence Insights.},
journal = {Microorganisms},
volume = {13},
number = {11},
pages = {},
pmid = {41304116},
issn = {2076-2607},
abstract = {Acinetobacter baumannii is a multidrug-resistant opportunistic pathogen that poses critical challenges in hospital settings due to its environmental resilience and high resistance to antibiotics. Genomic surveillance has become essential for identifying transmission patterns, guiding antimicrobial stewardship, and informing infection control policies. We conducted whole-genome sequencing on 44 A. baumannii isolates collected between 2022 and 2023 from diverse wards in an Italian hospital. Illumina-based sequencing was followed by a comprehensive bioinformatics pipeline, including genome assembly, taxonomic validation, MLST, SNP-based phylogeny, pan-genome analysis, antimicrobial resistance (AMR) gene profiling, and virulence factor prediction. Most isolates were classified as ST2; SAMPLE-34 was ST1 and genetically distinct. Phylogenetic analysis revealed four clonal clusters with cluster-specific AMR and accessory gene content. The pan-genome included 5050 genes, with notable variation linked to hospital ward origin. ICU and internal medicine strains carried higher loads of AMR genes, especially against aminoglycosides, β-lactams, and quinolones. Virulence profiling highlighted widespread immune evasion mechanisms; "Acenovactin" was predominant, while some isolates lacked key adhesion or toxin factors. Our findings underscore the clinical relevance of integrating genomic epidemiology into routine hospital surveillance. Identifying clonal clusters and resistance signatures supports real-time outbreak detection, risk stratification, and targeted infection prevention strategies.},
}
RevDate: 2025-11-30
CmpDate: 2025-11-27
Whole-Genome Analysis of Escherichia coli from One Health Sources: Evaluating Genetic Relatedness and Antimicrobial Resistance Carriage.
Antibiotics (Basel, Switzerland), 14(11).
Background/Objectives: Due to the numerical dominance of environmental and commensal strains, understanding antimicrobial resistance (AMR) transmission in Escherichia coli requires consideration of non-clinical as well as pathogenic isolates. In this cross-sectional study, associations between the genetic context of non-clinical E. coli and AMR carriage are examined in isolates sampled from different niches within a One Health continuum. Methods: Two hundred eighty-eight E. coli isolates collected in Alberta, Canada (2018-2019) from wastewater, well water, feces of broiler chickens and feedlot cattle, and retail beef and chicken meat were selected from existing surveillance collections using a stratified random sampling structure. Using short-read whole genome assemblies, phylogenetic relationships were inferred from pan-genome multiple sequence alignments. Principal coordinate analysis and permutational analysis of variance (PERMANOVA) of a Jaccard dissimilarity matrix derived from gene presence/absence data were used to investigate contributions of source and AMR strata to observed genetic dissimilarity. Population clustering and gene under- or over-representation by source and cluster were also explored. Results: Minimal phylogenetic segregation of isolates was noted based on source or AMR strata, and both contributed significant but small proportions of observed genetic dissimilarity, with the largest proportion attributed to phylogroup. There was notable diversity of E. coli within and between sources; however, in some larger clusters, differential gene presence/absence was potentially linked to ecological niche rather than source of isolation. Conclusions: This study highlights the ecological complexity of AMR in E. coli in non-clinical contexts, offering a novel lens on how niche-specific factors can influence population structure and AMR carriage. It also provides insight into apparent discrepancies in the literature regarding clustering of E. coli by source. These findings support a more integrative One Health approach to AMR surveillance, emphasizing the need to account for microbial diversity and niche-specific adaptation across interconnected systems.
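Editorial illustration: the abstract above refers to a Jaccard dissimilarity matrix derived from gene presence/absence data. The following is a minimal Python sketch of that calculation on toy data. The isolate names, gene names, and both function names are hypothetical and are not taken from the study, whose actual software is not named in the abstract.

```python
from itertools import combinations


def jaccard_dissimilarity(genes_a, genes_b):
    """Return 1 - |intersection| / |union| for two sets of gene names."""
    union = genes_a | genes_b
    if not union:
        return 0.0  # two empty gene sets are treated as identical
    return 1.0 - len(genes_a & genes_b) / len(union)


def dissimilarity_matrix(presence):
    """Build a symmetric matrix, stored as a dict keyed by isolate pairs."""
    names = sorted(presence)
    matrix = {(name, name): 0.0 for name in names}
    for a, b in combinations(names, 2):
        d = jaccard_dissimilarity(presence[a], presence[b])
        matrix[(a, b)] = d
        matrix[(b, a)] = d
    return matrix


# Hypothetical presence/absence data: isolate -> set of genes present.
presence = {
    "isolate_1": {"geneA", "geneB", "geneC"},
    "isolate_2": {"geneA", "geneB", "geneD"},
    "isolate_3": {"geneA", "geneE"},
}

matrix = dissimilarity_matrix(presence)
for (a, b), d in sorted(matrix.items()):
    print(a, b, round(d, 2))
```

A matrix of this kind is the usual input to ordination and permutation tests such as the principal coordinate analysis and PERMANOVA mentioned in the abstract; those steps are not sketched here.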
Additional Links: PMID-41301646
PubMed:
Citation:
@article {pmid41301646,
year = {2025},
author = {Butters, A and Jovel, J and Gow, S and Waldner, C and Checkley, SL},
title = {Whole-Genome Analysis of Escherichia coli from One Health Sources: Evaluating Genetic Relatedness and Antimicrobial Resistance Carriage.},
journal = {Antibiotics (Basel, Switzerland)},
volume = {14},
number = {11},
pages = {},
pmid = {41301646},
issn = {2079-6382},
support = {RGPIN/05307-2019//The National Sciences and Engineering Research Council of Canada/ ; 2021R098RC//Results Driven Agricultural Research/ ; ANH.29.20//Canadian Cattle Association's Beef Cattle Research Council/ ; 2021R098R//Alberta Beef Producers/ ; },
abstract = {Background/Objectives: Due to the numerical dominance of environmental and commensal strains, understanding antimicrobial resistance (AMR) transmission in Escherichia coli requires consideration of non-clinical as well as pathogenic isolates. In this cross-sectional study, associations between the genetic context of non-clinical E. coli and AMR carriage are examined in isolates sampled from different niches within a One Health continuum. Methods: Two hundred eighty-eight E. coli isolates collected in Alberta, Canada (2018-2019) from wastewater, well water, feces of broiler chickens and feedlot cattle, and retail beef and chicken meat were selected from existing surveillance collections using a stratified random sampling structure. Using short-read whole genome assemblies, phylogenetic relationships were inferred from pan-genome multiple sequence alignments. Principal coordinate analysis and permutational analysis of variance (PERMANOVA) of a Jaccard dissimilarity matrix derived from gene presence/absence data were used to investigate contributions of source and AMR strata to observed genetic dissimilarity. Population clustering and gene under- or over-representation by source and cluster were also explored. Results: Minimal phylogenetic segregation of isolates was noted based on source or AMR strata, and both contributed significant but small proportions of observed genetic dissimilarity, with the largest proportion attributed to phylogroup. There was notable diversity of E. coli within and between sources; however, in some larger clusters, differential gene presence/absence was potentially linked to ecological niche rather than source of isolation. Conclusions: This study highlights the ecological complexity of AMR in E. coli in non-clinical contexts, offering a novel lens on how niche-specific factors can influence population structure and AMR carriage. It also provides insight into apparent discrepancies in the literature regarding clustering of E. coli by source. These findings support a more integrative One Health approach to AMR surveillance, emphasizing the need to account for microbial diversity and niche-specific adaptation across interconnected systems.},
}
RevDate: 2025-11-30
CmpDate: 2025-11-27
Analysis of the Dirigent Pan-Gene Family in 26 Diverse Inbred Lines Reveals Genomic Diversity in Maize.
Genes, 16(11).
BACKGROUND: Dirigent genes play crucial roles in regulating plant architecture development and responses to environmental stress. However, the pan-genomic attributes of these genes remain poorly characterized.
METHOD: The dirigent pan-gene family was reconstructed using the public genome assemblies from the 26 maize Nested Association Mapping project founder lines. Orthogroup classification based on multiple sequence alignment revealed both core and variable family members. Evolutionary pressures were assessed through Ka/Ks ratio analysis, and promoter regions were examined for cis-acting regulatory elements. Haplotype, transcriptomic and genome-wide association study (GWAS) analyses were integrated to explore genetic diversity and functional relevance.
RESULTS: Most dirigent members were under purifying selection, whereas a subset may have undergone positive selection. Promoter analysis demonstrated enrichment of stress- and phytohormone-responsive cis-acting regulatory elements, suggesting that regulatory divergence was associated with environmental adaptation. Haplotype analysis revealed allelic diversity among heterotic clusters, potentially contributing to heterosis. Integration with public genome-wide association study datasets identified candidate genes significantly associated with plant architecture and kernel-quality-related traits. Transcriptome profiles indicated that several dirigent genes were preferentially expressed in the roots, suggesting their involvement in root development and nutrient uptake. In addition, public gene expression data showed that certain dirigent genes are induced in response to salt stress, supporting their putative roles in abiotic stress tolerance.
CONCLUSIONS: These findings provide insights into the molecular mechanisms underlying dirigent gene functions and reveal candidate genes with potential utility for improving maize performance and stress resilience through molecular breeding.
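Editorial illustration: the abstract above assesses evolutionary pressure through Ka/Ks ratios. By the standard reading, a ratio below 1 suggests purifying selection, a ratio near 1 suggests neutral evolution, and a ratio above 1 suggests positive selection. The Python sketch below only applies that reading to ratios that have already been computed. The function name, the `tolerance` band around 1, and the example values are hypothetical and do not come from the study.

```python
def classify_selection(ka, ks, tolerance=0.05):
    """Label a gene pair by its Ka/Ks ratio.

    ka: nonsynonymous substitution rate; ks: synonymous substitution rate.
    tolerance is an illustrative band around 1.0, not a published cutoff.
    """
    if ks <= 0:
        return "undefined"  # the ratio cannot be formed without synonymous changes
    ratio = ka / ks
    if ratio < 1 - tolerance:
        return "purifying"
    if ratio > 1 + tolerance:
        return "positive"
    return "neutral"


# Hypothetical (Ka, Ks) values for three gene pairs.
examples = {
    "pair_1": (0.02, 0.10),
    "pair_2": (0.30, 0.10),
    "pair_3": (0.10, 0.10),
}
labels = {name: classify_selection(ka, ks) for name, (ka, ks) in examples.items()}
print(labels)
```

Estimating Ka and Ks themselves requires codon-aware alignment and a substitution model, which this sketch deliberately leaves out.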
Additional Links: PMID-41300738
PubMed:
Citation:
@article {pmid41300738,
year = {2025},
author = {Liu, Z and Xue, Y and Xie, Y and Zhao, Y and Yang, W and Yang, W and Wang, F and Ren, X},
title = {Analysis of the Dirigent Pan-Gene Family in 26 Diverse Inbred Lines Reveals Genomic Diversity in Maize.},
journal = {Genes},
volume = {16},
number = {11},
pages = {},
pmid = {41300738},
issn = {2073-4425},
support = {2022ZD04017//Biological Breeding-National Science and Technology Major Project/ ; 20240601081RC//Outstanding Talents (Team) Project for Middle-aged and Young People in Science, Technology Innovation and Entrepreneurship/ ; },
mesh = {*Zea mays/genetics/growth & development ; Genome-Wide Association Study ; *Genetic Variation ; Haplotypes ; Gene Expression Regulation, Plant ; *Multigene Family ; *Genome, Plant ; *Plant Proteins/genetics ; Transcriptome ; },
abstract = {BACKGROUND: Dirigent genes play crucial roles in regulating plant architecture development and responses to environmental stress. However, the pan-genomic attributes of these genes remain poorly characterized.
METHOD: The dirigent pan-gene family was reconstructed using the public genome assemblies from the 26 maize Nested Association Mapping project founder lines. Orthogroup classification based on multiple sequence alignment revealed both core and variable family members. Evolutionary pressures were assessed through Ka/Ks ratio analysis, and promoter regions were examined for cis-acting regulatory elements. Haplotype, transcriptomic and genome-wide association study (GWAS) analyses were integrated to explore genetic diversity and functional relevance.
RESULTS: Most dirigent members were under purifying selection, whereas a subset may have undergone positive selection. Promoter analysis demonstrated enrichment of stress- and phytohormone-responsive cis-acting regulatory elements, suggesting that regulatory divergence was associated with environmental adaptation. Haplotype analysis revealed allelic diversity among heterotic clusters, potentially contributing to heterosis. Integration with public genome-wide association study datasets identified candidate genes significantly associated with plant architecture and kernel-quality-related traits. Transcriptome profiles indicated that several dirigent genes were preferentially expressed in the roots, suggesting their involvement in root development and nutrient uptake. In addition, public gene expression data showed that certain dirigent genes are induced in response to salt stress, supporting their putative roles in abiotic stress tolerance.
CONCLUSIONS: These findings provide insights into the molecular mechanisms underlying dirigent gene functions and reveal candidate genes with potential utility for improving maize performance and stress resilience through molecular breeding.},
}
MeSH Terms:
*Zea mays/genetics/growth & development
Genome-Wide Association Study
*Genetic Variation
Haplotypes
Gene Expression Regulation, Plant
*Multigene Family
*Genome, Plant
*Plant Proteins/genetics
Transcriptome
RevDate: 2025-12-22
CmpDate: 2025-12-12
Molecular characterization of two phenotypes of Mixta calida isolated from post-liver transplant drainage fluid.
Microbial pathogenesis, 210:108203.
BACKGROUND: This study investigated the molecular basis of phenotypic variation between mucoid (mixA) and dry (mixB) Mixta calida strains from a liver transplant recipient.
METHODS: Genomic analysis (comparative and pangenome), virulence/antibiotic resistance profiling, transcriptomic analysis, and quantitative biofilm assays were conducted.
RESULTS: MixA and mixB showed >99.99 % sequence identity and extensive collinearity. Pangenome analysis identified mixA's unique CLUSTER3688 (O-antigen/teichoic acid export protein), suggesting altered cell surface modification. Virulence/resistance profiles and MICs were largely conserved. Significant differential gene expression occurred: mixA upregulated stress response and carbohydrate utilization genes, while downregulating sulfur metabolism, ABC transporters, ribosome, and peptidoglycan biosynthesis genes, suggesting lower metabolic activity. Biofilm assays showed no significant biomass difference, but mixA's unique surface export gene may influence structure.
CONCLUSIONS: The mucoid/dry phenotypic variation in Mixta calida appears driven by combined subtle genomic differences, including unique surface component export genes, and significant transcriptional reprogramming affecting key metabolic, stress response, and cell surface pathways. These findings are crucial for understanding this emerging pathogen's adaptation and pathogenesis.
Additional Links: PMID-41297773
Publisher:
PubMed:
Citation:
@article {pmid41297773,
year = {2026},
author = {Jiang, B and Zhu, X and Lv, Y and Wang, Y and Yang, H and Yang, Y and Xu, Y and Shan, B and Song, G},
title = {Molecular characterization of two phenotypes of Mixta calida isolated from post-liver transplant drainage fluid.},
journal = {Microbial pathogenesis},
volume = {210},
number = {},
pages = {108203},
doi = {10.1016/j.micpath.2025.108203},
pmid = {41297773},
issn = {1096-1208},
mesh = {Phenotype ; *Liver Transplantation/adverse effects ; Biofilms/growth & development ; Gene Expression Profiling ; Humans ; Microbial Sensitivity Tests ; Virulence/genetics ; Genome, Bacterial ; Anti-Bacterial Agents/pharmacology ; Phylogeny ; Virulence Factors/genetics ; Genomics ; Gene Expression Regulation, Bacterial ; Drainage ; },
abstract = {BACKGROUND: This study investigated the molecular basis of phenotypic variation between mucoid (mixA) and dry (mixB) Mixta calida strains from a liver transplant recipient.
METHODS: Genomic analysis (comparative and pangenome), virulence/antibiotic resistance profiling, transcriptomic analysis, and quantitative biofilm assays were conducted.
RESULTS: MixA and mixB showed >99.99 % sequence identity and extensive collinearity. Pangenome analysis identified mixA's unique CLUSTER3688 (O-antigen/teichoic acid export protein), suggesting altered cell surface modification. Virulence/resistance profiles and MICs were largely conserved. Significant differential gene expression occurred: mixA upregulated stress response and carbohydrate utilization genes, while downregulating sulfur metabolism, ABC transporters, ribosome, and peptidoglycan biosynthesis genes, suggesting lower metabolic activity. Biofilm assays showed no significant biomass difference, but mixA's unique surface export gene may influence structure.
CONCLUSIONS: The mucoid/dry phenotypic variation in Mixta calida appears driven by combined subtle genomic differences, including unique surface component export genes, and significant transcriptional reprogramming affecting key metabolic, stress response, and cell surface pathways. These findings are crucial for understanding this emerging pathogen's adaptation and pathogenesis.},
}
MeSH Terms:
Phenotype
*Liver Transplantation/adverse effects
Biofilms/growth & development
Gene Expression Profiling
Humans
Microbial Sensitivity Tests
Virulence/genetics
Genome, Bacterial
Anti-Bacterial Agents/pharmacology
Phylogeny
Virulence Factors/genetics
Genomics
Gene Expression Regulation, Bacterial
Drainage
RevDate: 2025-12-13
CmpDate: 2025-12-11
Enrichment of root-associated Streptomyces strains in response to drought is driven by diverse functional traits and does not predict beneficial effects on plant growth.
PLoS biology, 23(11):e3003526.
The genus Streptomyces has consistently been found enriched in drought-stressed plant root microbiomes, yet the ecological basis and functional variation underlying this enrichment at the strain and isolate level remain unclear. Using two 16S rRNA sequencing methods with different levels of taxonomic resolution, we confirmed drought-associated enrichment (DE) of Streptomyces in field-grown sorghum roots and identified five closely related but distinct amplicon sequence variants (ASVs) belonging to the genus with variable drought enrichment patterns. From a culture collection of sorghum root endophytes, we selected 12 Streptomyces isolates representing these ASVs for phenotypic and genomic characterization. Whole-genome sequencing revealed substantial variation in gene content, even among closely related isolates, and exometabolomic profiling showed distinct metabolic responses to media supplemented with drought- versus well-watered root tissue. Traits linked to drought survival, including osmotic stress tolerance, siderophore production, and carbon utilization, varied widely among isolates and were not phylogenetically conserved. Using a broader panel of 48 Streptomyces, we demonstrate that DE scores, determined through mono-association experiments in gnotobiotic sorghum systems, showed high variability and lacked correlation with plant growth promotion. Pangenome-wide association identified orthogroups involved in osmolyte transport (e.g., proP) and membrane biosynthesis (e.g., fabG) as positively associated with DE, though most associations lacked phylogenetic signal. Collectively, these results demonstrate that Streptomyces DE is not a conserved genus-level trait but is instead strain-specific and functionally heterogeneous. Furthermore, DE in the root microbiome was shown not to predict beneficial effects on plant growth. 
This work underscores the need to resolve functional traits at the strain level and highlights the complexity of microbe-host-environment interactions under abiotic stress.
Additional Links: PMID-41296785
PubMed:
Citation:
@article {pmid41296785,
year = {2025},
author = {Fonseca-Garcia, C and Pettinga, D and Caddell, D and Ploemacher, H and Louie, K and Bowen, BP and Park, J and Sanchez, J and Zimic-Sheen, A and Traxler, MF and Northen, TR and Coleman-Derr, D},
title = {Enrichment of root-associated Streptomyces strains in response to drought is driven by diverse functional traits and does not predict beneficial effects on plant growth.},
journal = {PLoS biology},
volume = {23},
number = {11},
pages = {e3003526},
pmid = {41296785},
issn = {1545-7885},
mesh = {*Streptomyces/genetics/physiology/isolation & purification/classification/metabolism ; *Plant Roots/microbiology/growth & development ; *Droughts ; *Sorghum/microbiology/growth & development ; RNA, Ribosomal, 16S/genetics ; Phylogeny ; *Plant Development ; Stress, Physiological ; Microbiota ; },
abstract = {The genus Streptomyces has consistently been found enriched in drought-stressed plant root microbiomes, yet the ecological basis and functional variation underlying this enrichment at the strain and isolate level remain unclear. Using two 16S rRNA sequencing methods with different levels of taxonomic resolution, we confirmed drought-associated enrichment (DE) of Streptomyces in field-grown sorghum roots and identified five closely related but distinct amplicon sequence variants (ASVs) belonging to the genus with variable drought enrichment patterns. From a culture collection of sorghum root endophytes, we selected 12 Streptomyces isolates representing these ASVs for phenotypic and genomic characterization. Whole-genome sequencing revealed substantial variation in gene content, even among closely related isolates, and exometabolomic profiling showed distinct metabolic responses to media supplemented with drought- versus well-watered root tissue. Traits linked to drought survival, including osmotic stress tolerance, siderophore production, and carbon utilization, varied widely among isolates and were not phylogenetically conserved. Using a broader panel of 48 Streptomyces, we demonstrate that DE scores, determined through mono-association experiments in gnotobiotic sorghum systems, showed high variability and lacked correlation with plant growth promotion. Pangenome-wide association identified orthogroups involved in osmolyte transport (e.g., proP) and membrane biosynthesis (e.g., fabG) as positively associated with DE, though most associations lacked phylogenetic signal. Collectively, these results demonstrate that Streptomyces DE is not a conserved genus-level trait but is instead strain-specific and functionally heterogeneous. Furthermore, DE in the root microbiome was shown not to predict beneficial effects on plant growth. 
This work underscores the need to resolve functional traits at the strain level and highlights the complexity of microbe-host-environment interactions under abiotic stress.},
}
MeSH Terms:
*Streptomyces/genetics/physiology/isolation & purification/classification/metabolism
*Plant Roots/microbiology/growth & development
*Droughts
*Sorghum/microbiology/growth & development
RNA, Ribosomal, 16S/genetics
Phylogeny
*Plant Development
Stress, Physiological
Microbiota
RevDate: 2025-11-29
Pan-genome analysis of Ralstonia pseudosolanacearum associated with tobacco bacterial wilt in China.
BMC microbiology, 25(1):775.
UNLABELLED: Tobacco bacterial wilt, caused by Ralstonia solanacearum species complex (RSSC), poses a significant threat to tobacco production in most major tobacco-growing regions of China. Here, 204 strains of Ralstonia were isolated from infected tobacco plants across 12 provinces and autonomous regions in China. Molecular identification and biovar typing of the RSSC strains revealed that all the isolates were found in phylotype I (classified as R. pseudosolanacearum) and biovar III. Seven sequevars were identified and sequevar 15 was the most prevalent. Pathogenicity tests indicated that 29% of the strains showing high virulence were found in the southwest tobacco-growing region of China. Notably, significant pathogenic variation was observed within strains of the same sequevar, and no clear correlation was found between the sequevar type and pathogenicity. Using Oxford Nanopore sequencing, we analyzed 103 strains of tobacco R. pseudosolanacearum from different geographical origins and pathotypes. The pangenome of R. pseudosolanacearum is comprised of 9008 non-redundant genes, divided into a core genome (36%), accessory genome (48.7%) and isolate-specific genes (15%). Core genome functions were related to oxidation–reduction reaction process and DNA transcription regulation, while the accessory genome was linked to DNA recombination, integration, and transposition. SNP analysis revealed an average of 36,740 SNP loci per strain, indicating evolutionary purification with ka/ks value below 1. Phylogenetic analysis divided 99 strains into 4 main groups, with sequevars correlated to specific branches, reflecting evolutionary relationships. We identified 2226 genomic islands across the strains, with each strain containing 18–25 islands, primarily related to translation regulation, transposition, and transposase activity. 
Analysis of virulence factors using the virulence factors database (VFDB) highlighted 1252 virulence genes within these islands, with a significant portion (36.5%) linked to effector transport systems, predominantly the type III secretion system (66.7%). Each strain averaged 60 type III effector proteins, with RipBK and RipAZ2 unique to highly pathogenic strains. This study provides a comprehensive understanding of pan-genome of R. pseudosolanacearum causing tobacco bacterial wilt in China, providing valuable insights into virulence variation and environmental adaptation of the pathogen.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12866-025-04485-4.
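Editorial illustration: the abstract above divides the pangenome into a core genome, an accessory genome, and isolate-specific genes. The Python sketch below shows one common way to make such a split from a gene presence/absence table, assuming that core means present in every strain and isolate-specific means present in exactly one. The study's own thresholds and tools are not stated in the abstract, and the strain names, gene names, and function name here are hypothetical.

```python
from collections import Counter


def partition_pangenome(presence):
    """Split genes into core, accessory, and isolate-specific sets.

    presence maps each strain name to the set of genes it carries.
    """
    n_strains = len(presence)
    # Number of strains carrying each gene.
    counts = Counter(gene for genes in presence.values() for gene in genes)
    partition = {"core": set(), "accessory": set(), "isolate_specific": set()}
    for gene, n in counts.items():
        if n == n_strains:
            partition["core"].add(gene)
        elif n == 1:
            partition["isolate_specific"].add(gene)
        else:
            partition["accessory"].add(gene)
    return partition


# Hypothetical presence/absence data: strain -> set of genes present.
presence = {
    "strain_1": {"geneA", "geneB", "geneC"},
    "strain_2": {"geneA", "geneB", "geneD"},
    "strain_3": {"geneA", "geneE"},
}

partition = partition_pangenome(presence)
total = sum(len(genes) for genes in partition.values())
for category, genes in partition.items():
    print(category, sorted(genes), f"{100 * len(genes) / total:.1f}%")
```

Published analyses often relax the core definition (for example, presence in nearly all strains) to tolerate assembly gaps, so the category sizes depend on the threshold chosen.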
Additional Links: PMID-41291432
PubMed:
Citation:
@article {pmid41291432,
year = {2025},
author = {Kang, J and Sohail, MA and Zhang, Y and Wang, Z and Xu, R and Hsiang, T and Zheng, L and Huang, J and Fang, Y and Li, Y},
title = {Pan-genome analysis of Ralstonia pseudosolanacearum associated with tobacco bacterial wilt in China.},
journal = {BMC microbiology},
volume = {25},
number = {1},
pages = {775},
pmid = {41291432},
issn = {1471-2180},
support = {110202101045, LS-05//Pests and Diseases Green Prevention and Control Major Special Project/ ; },
abstract = {UNLABELLED: Tobacco bacterial wilt, caused by Ralstonia solanacearum species complex (RSSC), poses a significant threat to tobacco production in most major tobacco-growing regions of China. Here, 204 strains of Ralstonia were isolated from infected tobacco plants across 12 provinces and autonomous regions in China. Molecular identification and biovar typing of the RSSC strains revealed that all the isolates were found in phylotype I (classified as R. pseudosolanacearum) and biovar III. Seven sequevars were identified and sequevar 15 was the most prevalent. Pathogenicity tests indicated that 29% of the strains showing high virulence were found in the southwest tobacco-growing region of China. Notably, significant pathogenic variation was observed within strains of the same sequevar, and no clear correlation was found between the sequevar type and pathogenicity. Using Oxford Nanopore sequencing, we analyzed 103 strains of tobacco R. pseudosolanacearum from different geographical origins and pathotypes. The pangenome of R. pseudosolanacearum is comprised of 9008 non-redundant genes, divided into a core genome (36%), accessory genome (48.7%) and isolate-specific genes (15%). Core genome functions were related to oxidation–reduction reaction process and DNA transcription regulation, while the accessory genome was linked to DNA recombination, integration, and transposition. SNP analysis revealed an average of 36,740 SNP loci per strain, indicating evolutionary purification with ka/ks value below 1. Phylogenetic analysis divided 99 strains into 4 main groups, with sequevars correlated to specific branches, reflecting evolutionary relationships. We identified 2226 genomic islands across the strains, with each strain containing 18–25 islands, primarily related to translation regulation, transposition, and transposase activity. 
Analysis of virulence factors using the virulence factors database (VFDB) highlighted 1252 virulence genes within these islands, with a significant portion (36.5%) linked to effector transport systems, predominantly the type III secretion system (66.7%). Each strain averaged 60 type III effector proteins, with RipBK and RipAZ2 unique to highly pathogenic strains. This study provides a comprehensive understanding of pan-genome of R. pseudosolanacearum causing tobacco bacterial wilt in China, providing valuable insights into virulence variation and environmental adaptation of the pathogen.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12866-025-04485-4.},
}
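The core, accessory, and isolate-specific split reported in the abstract above can be illustrated with a minimal sketch. It assumes a simple presence/absence rule (core = present in every genome, isolate-specific = present in exactly one, accessory = everything in between); the study's actual thresholds and software are not stated here, and the function name, gene names, and strain names below are hypothetical toy data, not results from the paper.

```python
from collections import Counter


def partition_pangenome(presence):
    """Label each gene family by how many genomes carry it.

    presence: dict mapping a gene family name to the set of genome
    names in which it was detected (hypothetical toy input).
    """
    # The genome collection is taken to be every genome seen in the input.
    genomes = set().union(*presence.values())
    n_genomes = len(genomes)

    labels = {}
    for gene, carriers in presence.items():
        if len(carriers) == n_genomes:
            labels[gene] = "core"              # found in all genomes
        elif len(carriers) == 1:
            labels[gene] = "isolate-specific"  # found in exactly one genome
        else:
            labels[gene] = "accessory"         # found in some, not all
    return labels


# Toy presence/absence data for three made-up strains.
toy = {
    "geneA": {"s1", "s2", "s3"},
    "geneB": {"s1", "s2"},
    "geneC": {"s3"},
    "geneD": {"s1", "s2", "s3"},
}

labels = partition_pangenome(toy)
counts = Counter(labels.values())
for category, count in sorted(counts.items()):
    print(f"{category}: {count / len(toy):.1%}")
```

Real pangenome tools often relax the core definition (for example, presence in nearly all genomes) to tolerate assembly gaps; the strict rule here is chosen only to keep the sketch short.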
RevDate: 2025-11-28
CmpDate: 2025-11-25
Exploring cattle structural variation in the era of long reads, pangenome graphs, and near-complete assemblies.
Journal of animal science and biotechnology, 16(1):158.
Structural variations (SVs ≥ 50 bp) are a critical but underexplored source of genetic diversity in cattle, shaping traits vital for productivity, adaptability, and health. Advances in long-read sequencing, pangenome graph construction, and near-complete genome assemblies now allow accurate SV detection and genotyping. These innovations overcome the limitations of single-reference genomes, enabling the discovery of complex SVs, including nested and overlapping variants, and providing access to previously inaccessible genomic regions such as centromeres and telomeres. This review highlights the current landscape of cattle SV research, with emphasis on integrating long-read sequencing and pangenome frameworks to uncover breed-specific and population-level variation. While many SVs are linked to economically important traits such as feed efficiency and disease resistance, their broader regulatory impacts remain an active area of investigation. Emerging functional genomics approaches, including transcriptomics, epigenomics, and genome editing, will clarify how SVs influence gene regulation and phenotype. Looking forward, the integration of SV catalogs with multi-omics data, imputation resources, and artificial intelligence-driven models will be essential for translating discoveries into breeding and conservation applications. Integrating structural variants into breeding pipelines promises to revolutionize livestock genomics, enabling precision selection and sustainable agriculture despite challenges in cost, data sharing, and functional validation.
Additional Links: PMID-41287010
@article {pmid41287010,
year = {2025},
author = {Liu, GE},
title = {Exploring cattle structural variation in the era of long reads, pangenome graphs, and near-complete assemblies.},
journal = {Journal of animal science and biotechnology},
volume = {16},
number = {1},
pages = {158},
pmid = {41287010},
issn = {1674-9782},
support = {2019-67015-29321//National Institute of Food and Agriculture/ ; 2021-67015-33409//National Institute of Food and Agriculture/ ; 0500-00093-001-00-D//Agricultural Research Service/ ; },
abstract = {Structural variations (SVs ≥ 50 bp) are a critical but underexplored source of genetic diversity in cattle, shaping traits vital for productivity, adaptability, and health. Advances in long-read sequencing, pangenome graph construction, and near-complete genome assemblies now allow accurate SV detection and genotyping. These innovations overcome the limitations of single-reference genomes, enabling the discovery of complex SVs, including nested and overlapping variants, and providing access to previously inaccessible genomic regions such as centromeres and telomeres. This review highlights the current landscape of cattle SV research, with emphasis on integrating long-read sequencing and pangenome frameworks to uncover breed-specific and population-level variation. While many SVs are linked to economically important traits such as feed efficiency and disease resistance, their broader regulatory impacts remain an active area of investigation. Emerging functional genomics approaches, including transcriptomics, epigenomics, and genome editing, will clarify how SVs influence gene regulation and phenotype. Looking forward, the integration of SV catalogs with multi-omics data, imputation resources, and artificial intelligence-driven models will be essential for translating discoveries into breeding and conservation applications. Integrating structural variants into breeding pipelines promises to revolutionize livestock genomics, enabling precision selection and sustainable agriculture despite challenges in cost, data sharing, and functional validation.},
}
RevDate: 2025-11-24
The value of structural variants to conservation genomics in the pangenome era.
The Journal of heredity pii:8341471 [Epub ahead of print].
Structural variants (SVs) comprise an axis of genetic diversity with strong consequences for phenotype and fitness, making them a potentially important target for conservation genomics. Here we review how and why SVs can play a role in conservation genomics; the different types of SVs and how they can affect phenotype; and how pangenomes and long-read sequencing are illuminating their evolution in populations, including small populations and those of conservation concern. SVs comprise multinucleotide mutations including insertions, deletions, transpositions, inversions, and other multinucleotide mutations, often overlapping genes and other functional genome regions. As a result, SVs often play important roles in phenotypic evolution and local adaptation and can contribute substantially to genetic load in inbred populations. However, our understanding of the factors influencing SV diversity in populations is still in its infancy and is complicated by the vast range of sizes, effects, and mechanisms of formation of these mutations. We argue that SVs are an important axis of genetic diversity that should be characterized alongside more traditional metrics of genetic diversity in conservation contexts. There are a number of analytical challenges to detecting and studying SVs, but analyses aimed at understanding the role of SVs in inbreeding load and population health are rapidly becoming realizable goals, accelerated by new technologies and analytical approaches. New tools, including population-scale long-read sequencing and pangenome approaches, are beginning to make SVs accessible in ways that can be readily applied in conservation settings.
Additional Links: PMID-41283848
@article {pmid41283848,
year = {2025},
author = {Pegan, TM and Sivakumar, A and Burns, C and Lopez, KA and Fang, B and Melendez, NM and Bartel, L and Edwards, SV},
title = {The value of structural variants to conservation genomics in the pangenome era.},
journal = {The Journal of heredity},
volume = {},
number = {},
pages = {},
doi = {10.1093/jhered/esaf098},
pmid = {41283848},
issn = {1465-7333},
abstract = {Structural variants (SVs) comprise an axis of genetic diversity with strong consequences for phenotype and fitness, making them a potentially important target for conservation genomics. Here we review how and why SVs can play a role in conservation genomics; the different types of SVs and how they can affect phenotype; and how pangenomes and long-read sequencing are illuminating their evolution in populations, including small populations and those of conservation concern. SVs comprise multinucleotide mutations including insertions, deletions, transpositions, inversions, and other multinucleotide mutations, often overlapping genes and other functional genome regions. As a result, SVs often play important roles in phenotypic evolution and local adaptation and can contribute substantially to genetic load in inbred populations. However, our understanding of the factors influencing SV diversity in populations is still in its infancy and is complicated by the vast range of sizes, effects, and mechanisms of formation of these mutations. We argue that SVs are an important axis of genetic diversity that should be characterized alongside more traditional metrics of genetic diversity in conservation contexts. There are a number of analytical challenges to detecting and studying SVs, but analyses aimed at understanding the role of SVs in inbreeding load and population health are rapidly becoming realizable goals, accelerated by new technologies and analytical approaches. New tools, including population-scale long-read sequencing and pangenome approaches, are beginning to make SVs accessible in ways that can be readily applied in conservation settings.},
}
RevDate: 2025-12-09
CmpDate: 2025-12-08
Rapid, accurate long- and short-read mapping to large pangenome graphs with vg Giraffe.
Research square.
We previously introduced Giraffe, a short-read-to-pangenome graph mapper available in the vg pangenomics toolkit. Giraffe was fast and accurate for mapping short reads to human-scale pangenomes, but struggled with long reads. Long reads present a unique challenge to pangenome mapping algorithms due to their length and error profile, which allow them to take more topologically complex paths through the pangenome graph and increase the possible search space for the algorithm. We present updates to Giraffe that allow it to quickly and accurately map long reads to pangenome graphs. For both short and long reads, Giraffe mapping to a pangenome containing data from more than 450 human haplotypes, generated by the Human Pangenome Reference Consortium, is comparable in speed to linear mappers to human reference genomes; Giraffe is also over an order of magnitude faster than GraphAligner, the current state-of-the-art long-read-to-pangenome mapper. Its alignments produce similar or improved small and structural variant calling results, compared to those from commonly used graph-based and linear mappers. We additionally demonstrate using Giraffe's long read alignments in a pangenome-guided assembly workflow, which is capable of producing more contiguous local assemblies than Hifiasm in our test regions.
Additional Links: PMID-41282249
@article {pmid41282249,
year = {2025},
author = {Chang, X and Novak, AM and Eizenga, JM and Sirén, J and Monlong, J and Negi, S and Andreace, F and Nag, S and Kyriakidis, K and Hickey, G and Hwang, S and Délot, EC and Carroll, A and Shafin, K and Chang, PC and Okamoto, F and Paten, B},
title = {Rapid, accurate long- and short-read mapping to large pangenome graphs with vg Giraffe.},
journal = {Research square},
volume = {},
number = {},
pages = {},
pmid = {41282249},
issn = {2693-5015},
support = {U01 HG013748/HG/NHGRI NIH HHS/United States ; U24 HG011853/HG/NHGRI NIH HHS/United States ; U41 HG010972/HG/NHGRI NIH HHS/United States ; },
abstract = {We previously introduced Giraffe, a short-read-to-pangenome graph mapper available in the vg pangenomics toolkit. Giraffe was fast and accurate for mapping short reads to human-scale pangenomes, but struggled with long reads. Long reads present a unique challenge to pangenome mapping algorithms due to their length and error profile, which allow them to take more topologically complex paths through the pangenome graph and increase the possible search space for the algorithm. We present updates to Giraffe that allow it to quickly and accurately map long reads to pangenome graphs. For both short and long reads, Giraffe mapping to a pangenome containing data from more than 450 human haplotypes, generated by the Human Pangenome Reference Consortium, is comparable in speed to linear mappers to human reference genomes; Giraffe is also over an order of magnitude faster than GraphAligner, the current state-of-the-art long-read-to-pangenome mapper. Its alignments produce similar or improved small and structural variant calling results, compared to those from commonly used graph-based and linear mappers. We additionally demonstrate using Giraffe's long read alignments in a pangenome-guided assembly workflow, which is capable of producing more contiguous local assemblies than Hifiasm in our test regions.},
}
RevDate: 2025-11-26
CmpDate: 2025-11-24
Whole-genome sequencing reveals genetic diversity, population structure, and core collection construction in Korean peach (Prunus persica) germplasm.
Frontiers in plant science, 16:1702527.
Peach (Prunus persica) is an important temperate fruit crop and a model species for genomic research due to its diploid genome, short juvenile period, and relatively small genome size. Despite advances in next-generation sequencing (NGS), most peach genome-wide studies focused on a limited number of elite cultivars, and thus, the diversity of conserved germplasm is underrepresented. In Korea, a large number of peach genetic resources are maintained at the National Institute of Horticultural and Herbal Science (NIHHS), a branch of the Rural Development Administration (RDA), but no genome-scale core collection has been developed to date. This study aimed to perform whole-genome sequencing (WGS) on 445 peach accessions conserved in Korea between 2020 and 2025 using the Illumina NovaSeq 6000 platform, with the primary objective of constructing a representative genome-scale core collection and secondary objectives of identifying genome-wide single-nucleotide polymorphisms (SNPs) and assessing genetic diversity, population structure, and phylogenetic relationships. A total of 944,670 high-confidence SNPs were identified, with chromosomes 2 (G2) and 4 (G4) showing the highest variant density. Analyses using fastSTRUCTURE, principal component analysis (PCA), and phylogenetic reconstruction revealed a complex population structure and substantial genetic variation. From this data, a representative core collection was established, effectively capturing the majority of the genetic diversity present in the Korean peach germplasm. These results offer valuable genomic resources for peach improvement, marker development, pan-genome construction, and comparative genomics within the Rosaceae family.
Additional Links: PMID-41281343
@article {pmid41281343,
year = {2025},
author = {Bae, SH and Jeong, N and Kwon, JH and Lee, JH and Hwang, K and Hur, YY and Lee, SJ},
title = {Whole-genome sequencing reveals genetic diversity, population structure, and core collection construction in Korean peach (Prunus persica) germplasm.},
journal = {Frontiers in plant science},
volume = {16},
number = {},
pages = {1702527},
pmid = {41281343},
issn = {1664-462X},
abstract = {Peach (Prunus persica) is an important temperate fruit crop and a model species for genomic research due to its diploid genome, short juvenile period, and relatively small genome size. Despite advances in next-generation sequencing (NGS), most peach genome-wide studies focused on a limited number of elite cultivars, and thus, the diversity of conserved germplasm is underrepresented. In Korea, a large number of peach genetic resources are maintained at the National Institute of Horticultural and Herbal Science (NIHHS), a branch of the Rural Development Administration (RDA), but no genome-scale core collection has been developed to date. This study aimed to perform whole-genome sequencing (WGS) on 445 peach accessions conserved in Korea between 2020 and 2025 using the Illumina NovaSeq 6000 platform, with the primary objective of constructing a representative genome-scale core collection and secondary objectives of identifying genome-wide single-nucleotide polymorphisms (SNPs) and assessing genetic diversity, population structure, and phylogenetic relationships. A total of 944,670 high-confidence SNPs were identified, with chromosomes 2 (G2) and 4 (G4) showing the highest variant density. Analyses using fastSTRUCTURE, principal component analysis (PCA), and phylogenetic reconstruction revealed a complex population structure and substantial genetic variation. From this data, a representative core collection was established, effectively capturing the majority of the genetic diversity present in the Korean peach germplasm. These results offer valuable genomic resources for peach improvement, marker development, pan-genome construction, and comparative genomics within the Rosaceae family.},
}
RevDate: 2025-12-06
CmpDate: 2025-12-05
Seqwin: Ultrafast identification of signature sequences in microbial genomes.
bioRxiv : the preprint server for biology.
MOTIVATION: Polymerase chain reaction (PCR) enables rapid, cost-effective diagnostics but requires prior identification of genomic regions that allow sensitive and specific identification of target microbial groups, herein referred to as microbial signature sequences. We introduce Seqwin, an open-source framework designed to automate microbial genome signature discovery. Tens of thousands of microbial genomes are now available, limiting the application of existing manual and automated approaches for identifying signatures. Modern approaches that are capable of leveraging all available microbial genomes will ensure sensitive and accurate DNA signature identification and enable robust pathogen detection for clinical, environmental, and public health applications.
RESULTS: Seqwin builds weighted pan-genome minimizer graphs and uses a traversal algorithm to identify signature sequences that occur frequently in target genomes but remain rare in non-targets. Unlike earlier tools that depend on strict presence or absence of sequences, Seqwin accommodates natural sequence variation and scales to very large genome collections. When applied to genomes from C. difficile, M. tuberculosis and S. enterica, Seqwin recovered more high-quality signatures than alternative methods with lower computational burden. Seqwin analysis of nearly 15,000 S. enterica genomes yielded over 200 candidate signatures in less than 10 minutes. Seqwin provides an open-source solution for the long-standing need for scalable microbial signature discovery and diagnostic assay design.
AVAILABILITY: Seqwin is freely available for academic use (https://github.com/treangenlab/Seqwin) and can be installed via Bioconda.
Additional Links: PMID-41279887
@article {pmid41279887,
year = {2025},
author = {Wang, MX and Kille, B and Nute, MG and Zhou, S and Stadler, LB and Treangen, TJ},
title = {Seqwin: Ultrafast identification of signature sequences in microbial genomes.},
journal = {bioRxiv : the preprint server for biology},
volume = {},
number = {},
pages = {},
pmid = {41279887},
issn = {2692-8205},
support = {P01 AI152999/AI/NIAID NIH HHS/United States ; R21 AI190938/AI/NIAID NIH HHS/United States ; T15 LM007093/LM/NLM NIH HHS/United States ; },
abstract = {MOTIVATION: Polymerase chain reaction (PCR) enables rapid, cost-effective diagnostics but requires prior identification of genomic regions that allow sensitive and specific identification of target microbial groups, herein referred to as microbial signature sequences. We introduce Seqwin, an open-source framework designed to automate microbial genome signature discovery. Tens of thousands of microbial genomes are now available, limiting the application of existing manual and automated approaches for identifying signatures. Modern approaches that are capable of leveraging all available microbial genomes will ensure sensitive and accurate DNA signature identification and enable robust pathogen detection for clinical, environmental, and public health applications.
RESULTS: Seqwin builds weighted pan-genome minimizer graphs and uses a traversal algorithm to identify signature sequences that occur frequently in target genomes but remain rare in non-targets. Unlike earlier tools that depend on strict presence or absence of sequences, Seqwin accommodates natural sequence variation and scales to very large genome collections. When applied to genomes from C. difficile, M. tuberculosis and S. enterica, Seqwin recovered more high-quality signatures than alternative methods with lower computational burden. Seqwin analysis of nearly 15,000 S. enterica genomes yielded over 200 candidate signatures in less than 10 minutes. Seqwin provides an open-source solution for the long-standing need for scalable microbial signature discovery and diagnostic assay design.
AVAILABILITY: Seqwin is freely available for academic use (https://github.com/treangenlab/Seqwin) and can be installed via Bioconda.},
}
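The Seqwin abstract above mentions minimizer graphs. As background only, the generic (w, k)-minimizer idea can be sketched as follows: in every window of w consecutive k-mers, keep the smallest one. This is not Seqwin's implementation; the lexicographic ordering, the function name, and the toy sequence are assumptions made for illustration (production tools typically order k-mers by a hash instead).

```python
def minimizers(seq, k, w):
    """Return the sorted (position, k-mer) minimizers of seq.

    For each window of w consecutive k-mers, the lexicographically
    smallest k-mer is selected; ties go to the leftmost position.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        # Index of the smallest k-mer within this window.
        offset = min(range(w), key=lambda j: (window[j], j))
        picked.add((start + offset, window[offset]))
    return sorted(picked)


# Toy sequence; adjacent windows often share a minimizer, so far
# fewer k-mers are kept than the sequence contains.
result = minimizers("ACGTACGT", k=3, w=3)
print(result)  # [(0, 'ACG'), (1, 'CGT'), (4, 'ACG')]
```

Because similar sequences tend to select the same minimizers, a reduced set like this can serve as graph nodes shared across many genomes, which is the general motivation for minimizer-based pangenome graphs.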
RevDate: 2025-11-26
CmpDate: 2025-11-25
Polygenic basis of incipient reproductive isolation in hybridizing populations is revealed by pangenomic and epigenetic divergence.
bioRxiv : the preprint server for biology.
Incipient reproductive isolation in the presence of gene flow has traditionally been attributed to a small number of major-effect loci under strong selection, representing only a minor fraction of the genome. Using the Heliconius erato adaptive radiation, a butterfly species with populations at varying levels of genetic differentiation, we employed a pan-genome approach to investigate genome-wide mutational divergence, epigenetic changes, and structural variation contributing to divergence. In contrast to earlier studies that identified only a few highly divergent loci, our high-resolution analysis reveals widespread divergence across the genome, suggesting polygenic barriers to gene flow. Epigenetic divergence, assessed using ATAC-seq, highlights population-specific differences in chromatin accessibility, which may reflect SNP distribution or precede genetic differentiation by reshaping regulatory landscapes. We also identified new structural variants, including polymorphic indels in open chromatin, that further associate with genomic divergence. Together, these findings show that the genomic landscape of H. erato is shaped by a dynamic interplay of mutational changes, epigenetic modifications, and structural variation. We emphasize the role of developmental, behavioral, and ecological adaptations and provide a broader perspective on the functional genetic basis of genome-wide divergence in Heliconius. This emerging polygenic framework offers a more detailed understanding of how reproductive barriers evolve under ongoing gene flow.
Additional Links: PMID-41279830
@article {pmid41279830,
year = {2025},
author = {Ruggieri, AA and Cicconardi, F and Bellin, N and Montgomery, SH and Mallet, J and Van Belleghem, SM and McMillan, WO and Counterman, BA and Papa, R},
title = {Polygenic basis of incipient reproductive isolation in hybridizing populations is revealed by pangenomic and epigenetic divergence.},
journal = {bioRxiv : the preprint server for biology},
volume = {},
number = {},
pages = {},
pmid = {41279830},
issn = {2692-8205},
abstract = {Incipient reproductive isolation in the presence of gene flow has traditionally been attributed to a small number of major-effect loci under strong selection, representing only a minor fraction of the genome. Using the Heliconius erato adaptive radiation, a butterfly species with populations at varying levels of genetic differentiation, we employed a pan-genome approach to investigate genome-wide mutational divergence, epigenetic changes, and structural variation contributing to divergence. In contrast to earlier studies that identified only a few highly divergent loci, our high-resolution analysis reveals widespread divergence across the genome, suggesting polygenic barriers to gene flow. Epigenetic divergence, assessed using ATAC-seq, highlights population-specific differences in chromatin accessibility, which may reflect SNP distribution or precede genetic differentiation by reshaping regulatory landscapes. We also identified new structural variants, including polymorphic indels in open chromatin, that further associate with genomic divergence. Together, these findings show that the genomic landscape of H. erato is shaped by a dynamic interplay of mutational changes, epigenetic modifications, and structural variation. We emphasize the role of developmental, behavioral, and ecological adaptations and provide a broader perspective on the functional genetic basis of genome-wide divergence in Heliconius. This emerging polygenic framework offers a more detailed understanding of how reproductive barriers evolve under ongoing gene flow.},
}
RevDate: 2025-12-11
CmpDate: 2025-12-10
Improving long-read somatic structural variant calling with pangenome and de novo personal genome assembly.
bioRxiv : the preprint server for biology.
Accurate detection of mosaic and somatic structural variants (SVs) provides early diagnostic and therapeutic evidence for cancers. While long-read whole-genome sequencing leads to more accurate SV detection than short read sequencing, existing long-read SV callers only look at alignment against a single reference genome and are susceptible to systematic false discovery caused by germline differences between the individual genome and the reference genome. Here we develop a new SV calling method that jointly considers the alignment against a pangenome and the de novo assembly of the germline genome. It dramatically reduces false positive mosaic SVs in normal samples and somatic SVs in cancer cell lines with little loss in sensitivity. Our study highlights the essential need for pangenome or personal genome assembly to integrate SV calls for both SV discoveries and clinical diagnostics.
Additional Links: PMID-41279411
@article {pmid41279411,
year = {2025},
author = {Qin, Q and Heinz, J and Li, H},
title = {Improving long-read somatic structural variant calling with pangenome and de novo personal genome assembly.},
journal = {bioRxiv : the preprint server for biology},
volume = {},
number = {},
pages = {},
pmid = {41279411},
issn = {2692-8205},
support = {R01 HG010040/HG/NHGRI NIH HHS/United States ; U01 HG013748/HG/NHGRI NIH HHS/United States ; U24 CA294203/CA/NCI NIH HHS/United States ; },
abstract = {Accurate detection of mosaic and somatic structural variants (SVs) provides early diagnostic and therapeutic evidence for cancers. While long-read whole-genome sequencing leads to more accurate SV detection than short read sequencing, existing long-read SV callers only look at alignment against a single reference genome and are susceptible to systematic false discovery caused by germline differences between the individual genome and the reference genome. Here we develop a new SV calling method that jointly considers the alignment against a pangenome and the de novo assembly of the germline genome. It dramatically reduces false positive mosaic SVs in normal samples and somatic SVs in cancer cell lines with little loss in sensitivity. Our study highlights the essential need for pangenome or personal genome assembly to integrate SV calls for both SV discoveries and clinical diagnostics.},
}
ESP Quick Facts
ESP Origins
In the early 1990s, Robert Robbins was a faculty member at Johns Hopkins, where he directed the informatics core of GDB, the human gene-mapping database of the international human genome project. To share papers with colleagues around the world, he set up a small paper-sharing section on his personal web page. This small project evolved into The Electronic Scholarly Publishing Project.
ESP Support
In 1995, Robbins became the VP/IT of the Fred Hutchinson Cancer Research Center in Seattle, WA. Soon after arriving in Seattle, Robbins secured funding, through the ELSI component of the US Human Genome Project, to create the original ESP.ORG web site, with the formal goal of providing free, world-wide access to the literature of classical genetics.
ESP Rationale
Although the methods of molecular biology can seem almost magical to the uninitiated, the original techniques of classical genetics are readily appreciated by one and all: cross individuals that differ in some inherited trait, collect all of the progeny, score their attributes, and propose mechanisms to explain the patterns of inheritance observed.
ESP Goal
In reading the early works of classical genetics, one is drawn, almost inexorably, into ever more complex models, until molecular explanations begin to seem both necessary and natural. At that point, the tools for understanding genome research are at hand. Helping readers reach this point was the original goal of The Electronic Scholarly Publishing Project.
ESP Usage
Usage of the site grew rapidly and has remained high. Faculty began to use the site for their assigned readings. Other on-line publishers, ranging from The New York Times to Nature, referenced ESP materials in their own publications. Nobel laureates (e.g., Joshua Lederberg) regularly used the site and even wrote to suggest changes and improvements.
ESP Content
When the site began, no journals were making their early content available in digital format. As a result, ESP was obliged to digitize classic literature before it could be made available. For many important papers — such as Mendel's original paper or the first genetic map — ESP had to produce entirely new typeset versions of the works, if they were to be available in a high-quality format.
ESP Help
Early support from the DOE component of the Human Genome Project was critically important for getting the ESP project on a firm foundation. Since that funding ended (nearly 20 years ago), the project has been operated as a purely volunteer effort. Anyone wishing to assist in these efforts should send an email to Robbins.
ESP Plans
With the development of methods for adding typeset side notes to PDF files, the ESP project now plans to add annotated versions of some classical papers to its holdings. We also plan to add new reference and pedagogical material. We have already started providing regularly updated, comprehensive bibliographies to the ESP.ORG site.
ESP Picks from Around the Web (updated 28 JUL 2024)
Old Science
Weird Science
Treating Disease with Fecal Transplantation
Fossils of miniature humans (hobbits) discovered in Indonesia
Paleontology
Dinosaur tail, complete with feathers, found preserved in amber.
Astronomy
Mysterious fast radio burst (FRB) detected in the distant universe.
Big Data & Informatics
Big Data: Buzzword or Big Deal?
Hacking the genome: Identifying anonymized human subjects using publicly available data.