Bifidobacterium adolescentis is an obligately anaerobic, non-motile, non-spore-forming Gram-positive rod with a characteristic bifurcated morphology, classified in the family Bifidobacteriaceae within the phylum Actinomycetota. It is one of the dominant bifidobacterial species in the adult human gut, where it ferments dietary oligosaccharides and starch via the bifid shunt pathway, producing acetate and lactate that contribute to gut health. B. adolescentis has been shown to promote Th17 cell differentiation and immune homeostasis, and its genome encodes a large repertoire of glycoside hydrolases, reflecting its specialisation in complex carbohydrate utilisation.
See Methods for details of how these thresholds were calculated. The suggested thresholds are listed in the table below. They are based on 18 genomes from RefSeq and 2061 genomes from other sources. Applying the thresholds to the whole dataset removed 83 genomes and retained 1996. The list of retained (i.e. high-quality) genomes and the list of rejected (filtered) genomes can be downloaded below. Both files are in .xz format, and the rejected-genomes file also gives the reason each genome was rejected.
These tables summarise the distribution of each metric, including the standard deviation, mean, median, and percentiles.
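As a rough illustration of how such a summary could be computed for one metric, the sketch below uses only the Python standard library. The choice of percentiles (5th, 25th, 75th, 95th) and the example GC values are assumptions made for illustration; they are not taken from the dataset or from the Methods.

```python
import statistics

def summarise(values):
    """Return the standard deviation, mean, median and a few percentiles."""
    ordered = sorted(values)
    # quantiles(n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "std": statistics.stdev(ordered),   # sample standard deviation
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p5": cuts[4],
        "p25": cuts[24],
        "p75": cuts[74],
        "p95": cuts[94],
    }

# Made-up GC content values (%), for illustration only.
gc_values = [58.9, 59.2, 59.4, 59.5, 59.8, 60.1]
summary = summarise(gc_values)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```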
| Metric | Lower bound | Upper bound |
|---|---|---|
| N50 | 62,000 | - |
| no_of_contigs | - | 90 |
| GC_Content | 58.00 | 61.00 |
| Completeness | 93.00 | - |
| Contamination | - | 6.00 |
| Total_Coding_Sequences | 1,600 | 2,200 |
| Genome_Size | 2,000,000 | 2,500,000 |
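The following is a minimal sketch of how the bounds in the table could be applied to a single genome record. It assumes that the bounds are inclusive and that each genome is represented as a dictionary keyed by the metric names in the table; neither detail is stated here, and the helper `failed_metrics` and both example records are hypothetical.

```python
# Bounds copied from the table above; None means no bound is set.
THRESHOLDS = {
    "N50": (62_000, None),
    "no_of_contigs": (None, 90),
    "GC_Content": (58.0, 61.0),
    "Completeness": (93.0, None),
    "Contamination": (None, 6.0),
    "Total_Coding_Sequences": (1_600, 2_200),
    "Genome_Size": (2_000_000, 2_500_000),
}

def failed_metrics(genome):
    """Return a list of reasons why a genome falls outside the bounds.

    Assumption: bounds are inclusive, so a value equal to a bound passes.
    """
    failures = []
    for metric, (lower, upper) in THRESHOLDS.items():
        value = genome[metric]
        if lower is not None and value < lower:
            failures.append(f"{metric} below {lower}")
        elif upper is not None and value > upper:
            failures.append(f"{metric} above {upper}")
    return failures

# Hypothetical records, for illustration only.
good = {
    "N50": 250_000, "no_of_contigs": 35, "GC_Content": 59.4,
    "Completeness": 99.1, "Contamination": 0.4,
    "Total_Coding_Sequences": 1_850, "Genome_Size": 2_200_000,
}
bad = dict(good, no_of_contigs=120, Contamination=7.5)

print(failed_metrics(good))
print(failed_metrics(bad))
```

A genome is retained when the returned list is empty; otherwise the list plays the role of the rejection reason recorded in the rejected-genomes file.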
This plot shows the relationship between the number of coding sequences (CDS) and genome size. The relationship should be roughly linear: as genome size increases, the number of coding sequences should increase with it. Secondary trend lines or non-linear behaviour indicate either genuinely separate populations within the retained genomes or residual contamination.
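One way to check the expected linear trend numerically is to fit a least-squares line and flag genomes that sit far from it. The sketch below is an illustration only: the 10% tolerance, the function names, and the example values are assumptions, not part of the pipeline described in Methods.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def flag_off_trend(sizes, cds_counts, tolerance=0.10):
    """Return indices of genomes whose CDS count differs from the fitted
    line by more than `tolerance` (as a fraction of the predicted count)."""
    slope, intercept = fit_line(sizes, cds_counts)
    flagged = []
    for i, (size, cds) in enumerate(zip(sizes, cds_counts)):
        predicted = slope * size + intercept
        if abs(cds - predicted) > tolerance * predicted:
            flagged.append(i)
    return flagged

# Made-up genome sizes (bp) and CDS counts; the last genome has far
# fewer CDS than its size would suggest.
sizes = [2.0e6, 2.1e6, 2.2e6, 2.3e6, 2.4e6, 2.5e6, 2.25e6]
cds = [1700, 1785, 1870, 1955, 2040, 2125, 1500]

print(flag_off_trend(sizes, cds))
```

Because a single least-squares line is pulled towards outliers, a robust fit may be preferable when many genomes lie off the main trend.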
Histogram comparing SRA and RefSeq genomes. Each bar shows the density of genomes within a range of values, which highlights shifts, peaks, or outliers.
QQ (quantile-quantile) plot comparing SRA and RefSeq. Points along the diagonal follow the expected distribution; deviations indicate skew, outliers, or other systematic differences.
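The points of a two-sample QQ plot can be sketched by pairing matching quantiles from the two groups. The example below assumes that the plot compares the SRA and RefSeq values of a single metric in this way; the number of quantiles and the sample values are illustrative only.

```python
import statistics

def qq_points(sample_a, sample_b, n_quantiles=10):
    """Return (quantile_a, quantile_b) pairs for a two-sample QQ plot.

    Pairs lying on the diagonal (a == b) mean the two samples share the
    same value at that quantile.
    """
    qa = statistics.quantiles(sample_a, n=n_quantiles, method="inclusive")
    qb = statistics.quantiles(sample_b, n=n_quantiles, method="inclusive")
    return list(zip(qa, qb))

# Made-up GC content values (%); the second sample is the first shifted by 0.5.
refseq_gc = [59.0, 59.2, 59.3, 59.4, 59.5, 59.7, 59.9]
sra_gc = [value + 0.5 for value in refseq_gc]

for a, b in qq_points(refseq_gc, sra_gc):
    print(f"{a:.2f}\t{b:.2f}\t{b - a:+.2f}")
```

A constant offset from the diagonal, as in this example, points to a systematic shift between the two groups, whereas deviations only at the ends point to outliers or differences in the tails.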
A table of complete RefSeq genomes for Bifidobacterium adolescentis used to calibrate this scheme. The file includes accessions, some sample information, genome size, GC content, and other key metrics.
These plots show genomes before and after filtering to highlight the outliers removed. Left: heatmap of all genomes in the dataset. Middle: a representative sample of genomes, with anomalies highlighted (purple). Right: the distribution after filtering. Additional adjustments and rounding may have been applied, so the distribution shown here may not entirely match the final suggested thresholds.