NRPS/PKS biosynthesis pathway encyclopedia (NPdia)

NPdia biosynthesis pathway scheme

About

The NRPS/PKS biosynthesis pathway encyclopedia (NPdia) is a manually curated database of type I polyketide synthase (T1PKS) and nonribosomal peptide synthetase (NRPS) biosynthetic pathways from actinomycetes. It provides step-by-step SMILES representations of every biosynthetic intermediate, from starter unit loading to final scaffold release.

Actinomycetota contribute 51% of all bacterial biosynthetic gene clusters (BGCs), and T1PKS and NRPS clusters account for 42% of Actinomycetota BGCs, yet the biochemical intermediates they produce have never been systematically represented. NPdia fills this gap by linking each nucleic acid-encoded enzymatic domain to its chemical outcome, making genotype–phenotype relationships explicit at domain resolution.

All entries are manually curated from primary literature and validated using an AI-assisted pipeline integrated with BGC GenBank files. The full dataset is freely downloadable for use in pathway engineering, machine learning, and drug discovery.

Key Features

  • Step-by-step SMILES for every biosynthetic intermediate
  • Gene-to-reaction mapping with domain-level annotation (including inactive, missing, transAT, and iterative states)
  • Search and filter by class, organism, or compound
  • Full dataset download for machine learning and pathway engineering
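
To make the idea of domain-level, step-by-step annotation concrete, here is a minimal Python sketch of how one such pathway could be held in memory. Everything in it is an assumption for illustration: the `Step` and `DomainState` names, the field layout, the `geneA` label, and the use of `[*]` to mark the carrier-protein attachment are hypothetical and do not describe NPdia's actual download format. The example module is generic textbook polyketide chemistry, not an NPdia entry.

```python
from dataclasses import dataclass
from enum import Enum


class DomainState(Enum):
    """Domain states named in the feature list, plus a plain active state (assumed)."""
    ACTIVE = "active"
    INACTIVE = "inactive"
    MISSING = "missing"
    TRANS_AT = "transAT"
    ITERATIVE = "iterative"


@dataclass(frozen=True)
class Step:
    gene: str            # hypothetical gene label
    domain: str          # domain abbreviation, e.g. AT, KS, KR, DH
    state: DomainState
    product_smiles: str  # intermediate after this step; [*] = carrier-protein attachment


# Illustrative module only (hypothetical, not taken from the database).
EXAMPLE = [
    # AT: acetyl starter unit loaded onto the carrier protein
    Step("geneA", "AT", DomainState.ACTIVE, "CC(=O)S[*]"),
    # KS: decarboxylative condensation with a malonyl extender gives the beta-ketothioester
    Step("geneA", "KS", DomainState.ACTIVE, "CC(=O)CC(=O)S[*]"),
    # KR: beta-keto group reduced to a hydroxyl
    Step("geneA", "KR", DomainState.ACTIVE, "CC(O)CC(=O)S[*]"),
    # DH: annotated as inactive, so the intermediate is carried over unchanged
    Step("geneA", "DH", DomainState.INACTIVE, "CC(O)CC(=O)S[*]"),
]


def effective_steps(steps):
    """Return only the steps whose product differs from the preceding intermediate."""
    kept = []
    previous = None
    for step in steps:
        if step.product_smiles != previous:
            kept.append(step)
        previous = step.product_smiles
    return kept


print([s.domain for s in effective_steps(EXAMPLE)])  # ['AT', 'KS', 'KR']
```

Keeping the state next to the SMILES string is one way to see at a glance why an intermediate is unchanged across a step, for example when a domain is present in the gene but inactive.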

Data Summary

  • BGC entries: 450
  • T1PKS clusters: 211
  • NRPS clusters: 154
  • PKS-NRPS hybrid clusters: 85
  • Biosynthetic reactions: 7,347
  • Producing organisms: 349
  • Biosynthetic genes: 11,493
  • Domain annotations: 18,119
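
As a quick sanity check on these figures, the short Python sketch below confirms that the three cluster classes add up to the total number of BGC entries and derives a few per-BGC averages. The dictionary keys are names chosen here for illustration; only the counts come from the summary.

```python
# Counts copied from the Data Summary; key names are illustrative.
summary = {
    "bgc_entries": 450,
    "t1pks_clusters": 211,
    "nrps_clusters": 154,
    "hybrid_clusters": 85,
    "reactions": 7347,
    "organisms": 349,
    "genes": 11493,
    "domains": 18119,
}

# The three cluster classes should partition the BGC entries.
class_total = (
    summary["t1pks_clusters"]
    + summary["nrps_clusters"]
    + summary["hybrid_clusters"]
)
assert class_total == summary["bgc_entries"]  # 211 + 154 + 85 = 450

# Average number of reactions, genes, and domain annotations per BGC entry.
per_bgc = {
    key: round(summary[key] / summary["bgc_entries"], 1)
    for key in ("reactions", "genes", "domains")
}
print(per_bgc)  # {'reactions': 16.3, 'genes': 25.5, 'domains': 40.3}
```

These averages are simple ratios of the published totals and say nothing about how the counts are distributed across individual clusters.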