Serratia sarumannii QC

Serratia sarumannii is a Gram-negative, facultatively anaerobic rod of the family Yersiniaceae, recently proposed as a novel species within the genus Serratia. It has been isolated from environmental and clinical contexts, and its taxonomic position was established through polyphasic analysis including whole-genome average nucleotide identity. Like other serratiae, S. sarumannii has a moderately sized genome (about 5.2 Mb in the reference set summarised below) and may carry intrinsic antimicrobial resistance determinants common to the genus.

Preferred scheme: REFSEQ-QC-v1

Engine quality flag: error

The reference dataset for this species has substantial quality issues. Thresholds should be treated as indicative only.

Low genome count (error): Only 19 genomes survived Isolation-Forest outlier filtering. The engine flags reference pools below 500 genomes as warn and below 100 as error, because percentile-based bounds become unreliable at low n.
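The pool-size rule above can be sketched as a small function. This is only an illustration of the stated cut-offs (500 and 100), not the engine's actual code; the function name and the "ok" label for pools of 500 or more are assumptions.

```python
def pool_size_flag(n_genomes: int, warn_below: int = 500, error_below: int = 100) -> str:
    """Classify a reference pool by the number of genomes retained after filtering.

    The cut-offs come from the text above; the "ok" label is an assumed name
    for pools that trigger neither flag.
    """
    if n_genomes < error_below:
        return "error"
    if n_genomes < warn_below:
        return "warn"
    return "ok"


# The 19 genomes retained for this species fall below the error cut-off.
print(pool_size_flag(19))  # error
```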

Derived from 38 genomes: 19 from RefSeq and 19 from other sources. For the derivation pipeline and the PASS / WARN / FAIL verdict model, see the methods page for REFSEQ-QC-v1.

Summary table

This table summarises the distribution of each metric: genome count (n), mean, standard deviation, minimum, quartiles (Q1, median, Q3), and maximum.

A combined summary table across all species is available on the summary page.

Metric	Distribution	n	Mean	SD	Min	Q1	Median	Q3	Max
N50	non-normal	19	814,221	397,571	191,170	529,214	602,238	1,265,072	1,372,682
no_of_contigs	non-normal	19	29.37	7.02	22	24	27	32	48
longest	normal	19	1,323,161	378,555	534,075	1,123,997	1,383,898	1,485,437	2,122,422
GC_Content	normal	19	59.88	0.11	59.64	59.8	59.91	59.94	60.07
Completeness_Specific	non-normal	19	99.78	0.92	95.87	100	100	100	100
Contamination	non-normal	19	0.11	0.34	0.01	0.02	0.03	0.04	1.54
Total_Coding_Sequences	non-normal	19	4,783	143.91	4,609	4,710	4,714	4,808	5,263
Genome_Size	normal	19	5,198,831	78,993	5,055,011	5,150,132	5,182,852	5,257,573	5,339,266

Full statistics including KS test vs RefSeq and Wasserstein distance are in the downloadable summary.csv.
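For readers unfamiliar with the two comparison statistics named above, here is a minimal pure-Python sketch of the two-sample Kolmogorov-Smirnov statistic and the one-dimensional Wasserstein distance, both computed from empirical distribution functions. How the engine itself computes them is not stated on this page, so the function names and this formulation are assumptions.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a_sorted, x) / len(a) - bisect_right(b_sorted, x) / len(b))
        for x in a_sorted + b_sorted
    )


def wasserstein_1d(a, b):
    """Area between the two empirical CDFs (1-D Wasserstein-1 distance)."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    xs = sorted(a_sorted + b_sorted)
    total = 0.0
    # Both CDFs are constant between consecutive pooled values,
    # so the integral is a sum of rectangle areas.
    for left, right in zip(xs, xs[1:]):
        fa = bisect_right(a_sorted, left) / len(a)
        fb = bisect_right(b_sorted, left) / len(b)
        total += abs(fa - fb) * (right - left)
    return total
```

A KS statistic near 0 means the two samples have nearly the same empirical distribution; the Wasserstein distance is expressed in the units of the metric itself (for example, base pairs for Genome_Size).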

Suggested thresholds for Serratia sarumannii (REFSEQ-QC-v1)

Derived from 38 genomes including 19 RefSeq references

Both Fail and Warn bands are shown as the published rounded values, which are easier to cite and consistent across the species page, CSV downloads, and downstream QC tools.

Metric	Fail below	Warn below	Warn above	Fail above
Genome_Size	5,000,000	5,000,000	5,400,000	5,400,000
GC_Content	59.6	59.6	60.1	60.1
Total_Coding_Sequences	4,600	4,600	5,200	5,300
Completeness_Specific	95	97	-	-
Contamination	-	-	1	2
N50	191,000	263,000	-	-
no_of_contigs	-	-	50	50
longest	-	-	-	-

How to read this: a value between the two warn columns is typical for this species and passes QC. A value between a warn column and the corresponding fail column is borderline — worth a manual look but not an outright failure. A value outside the fail columns is unusual enough to fail QC.
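The reading rule above can be sketched as a lookup against the threshold table. The numbers are copied from the table; the function name, the dictionary layout, and the treatment of values that sit exactly on a threshold (counted as inside the band) are assumptions rather than the documented behaviour of REFSEQ-QC-v1.

```python
# (fail_below, warn_below, warn_above, fail_above); None stands for "-" in the table.
THRESHOLDS = {
    "Genome_Size": (5_000_000, 5_000_000, 5_400_000, 5_400_000),
    "GC_Content": (59.6, 59.6, 60.1, 60.1),
    "Total_Coding_Sequences": (4_600, 4_600, 5_200, 5_300),
    "Completeness_Specific": (95, 97, None, None),
    "Contamination": (None, None, 1, 2),
    "N50": (191_000, 263_000, None, None),
    "no_of_contigs": (None, None, 50, 50),
    "longest": (None, None, None, None),
}


def verdict(metric: str, value: float) -> str:
    """Return PASS, WARN, or FAIL for one metric value."""
    fail_below, warn_below, warn_above, fail_above = THRESHOLDS[metric]
    # Check the outer (fail) bounds first, then the inner (warn) bounds.
    if fail_below is not None and value < fail_below:
        return "FAIL"
    if fail_above is not None and value > fail_above:
        return "FAIL"
    if warn_below is not None and value < warn_below:
        return "WARN"
    if warn_above is not None and value > warn_above:
        return "WARN"
    return "PASS"


print(verdict("Contamination", 1.5))          # WARN
print(verdict("Completeness_Specific", 94))   # FAIL
print(verdict("Genome_Size", 5_198_831))      # PASS
```

Where the warn and fail values coincide (Genome_Size, GC_Content, no_of_contigs), there is no borderline band, so a value goes straight from PASS to FAIL.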

CDS vs Genome Size

This plot shows the relationship between the number of coding sequences (CDS) and genome size — how the number of genes scales with assembly length. The relationship should be roughly linear: as genome size increases, the number of coding sequences should rise proportionally. A secondary trend line or non-linear behaviour can indicate either bona fide sub-populations within the retained genomes (e.g. distinct sub-clades) or residual contamination that survived filtering.
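As a rough illustration of the proportional scaling described above, the sketch below derives a coding density from the mean values in the summary table and measures how far an assembly's CDS count departs from it. Treating the relationship as a straight line through the origin is a simplifying assumption, and the function names are hypothetical; the plot itself is not reproduced here.

```python
# Means taken from the summary table on this page.
MEAN_CDS = 4_783
MEAN_GENOME_SIZE_BP = 5_198_831

# Assumed model: CDS count is proportional to genome size.
CDS_PER_BP = MEAN_CDS / MEAN_GENOME_SIZE_BP


def expected_cds(genome_size_bp: float) -> float:
    """CDS count predicted from genome size under the proportional model."""
    return CDS_PER_BP * genome_size_bp


def relative_deviation(genome_size_bp: float, observed_cds: float) -> float:
    """Signed fractional departure of the observed CDS count from the prediction."""
    predicted = expected_cds(genome_size_bp)
    return (observed_cds - predicted) / predicted


print(round(CDS_PER_BP * 1_000_000))  # 920 (CDS per Mb)
```

An assembly with a large positive or negative deviation would sit off the main trend line; what counts as "large" is left open here because the page does not give a cut-off.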

RefSeq distributions

GC Content (RefSeq)

Histogram (SRA vs RefSeq)

Histogram comparing SRA to RefSeq; each bar shows genome density across value ranges to highlight shifts, peaks, or outliers.

QQ plot (SRA vs RefSeq)

QQ (quantile-quantile) plot comparing SRA and RefSeq. Points along the diagonal follow the expected distribution; deviations indicate skew, outliers, or other systematic differences.
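The points of a QQ plot are simply matched quantiles of the two samples. A minimal sketch using Python's standard library follows; the function name and the choice of ten quantile groups are assumptions, and the page's own plotting code is not shown.

```python
import statistics


def qq_points(sample_a, sample_b, n_quantiles=10):
    """Pair up matching quantiles of two samples.

    Each returned (a_q, b_q) pair is one point on the QQ plot; points with
    a_q == b_q lie on the diagonal.
    """
    qa = statistics.quantiles(sample_a, n=n_quantiles, method="inclusive")
    qb = statistics.quantiles(sample_b, n=n_quantiles, method="inclusive")
    return list(zip(qa, qb))
```

If one sample is a shifted copy of the other, every point sits the same distance off the diagonal; a curve that bends away at one end points to a difference in the tails.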

Table of included RefSeq complete genomes

A table of complete RefSeq genomes for Serratia sarumannii used to calibrate this scheme. The file includes accessions, some sample information, genome size, GC content, and other key metrics.

Included assembly inputs

Per-assembly inputs the engine used to derive the Serratia sarumannii reference distribution for this scheme: sample, sylph species call, N50, contig count, longest contig, total length, completeness, contamination, total coding sequences, genome size, GC content. Gzipped CSV.
