What it does
The DWS IQ SyntheticDataAgent generates structured, verifiable synthetic datasets for conditions that don't yet exist in your real-world data: fault scenarios your machines haven't failed through, tolerance deviations your QC team hasn't measured, edge-case building profiles your portfolio doesn't contain, and rare regulatory events (grid anomalies, supply-chain ruptures, compliance boundary cases) that break standard reasoning models.
Use cases
CSRD / ESRS pre-training
Synthetic disclosure datasets across all 82 ESRS disclosure requirements for pre-training compliance reasoning models before a live customer rollout.
CBAM carbon declaration
Schema-grounded synthetic supplier datasets (Reg. 2023/1773 defaults + embedded emissions) for training importer compliance workflows.
Additive manufacturing QC
Fault scenarios, tolerance deviations, and process variations for training quality-control models on AM production lines (NX AM, HP MJF, EOS, Trumpf).
Digital twin training
Edge-case operational profiles for building, factory, fleet, and grid digital twins where real-world observation coverage is thin.
Predictive maintenance
Synthetic failure-mode datasets for CNC, injection moulding, rolling stock, HVAC — generating the rare failures real data doesn't capture.
Privacy-preserving ML
Synthetic substitutes for customer-confidential datasets that preserve statistical properties without leaking personal or commercial data.
How it works
1. Schema grounding
Every generation run is grounded in a typed schema sourced from EU regulation or an industrial standard (CBAM, ESRS, ISO 50001, EN 15978). Schemas live in src/synthetic-data/ and are versioned with the regulation they reference.
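As a rough illustration of what a typed, regulation-versioned schema could look like, here is a minimal Python sketch. The class name `CbamSupplierRecord`, its fields, and the `validate` helper are hypothetical and are not taken from the schemas in src/synthetic-data/; only the regulation reference comes from this page.

```python
from dataclasses import dataclass

# Hypothetical schema sketch: field names are illustrative assumptions,
# not the actual schema shipped in src/synthetic-data/.
REGULATION = "CBAM Implementing Reg. 2023/1773"  # regulation the schema is versioned with


@dataclass(frozen=True)
class CbamSupplierRecord:
    supplier_id: str
    cn_code: str                     # Combined Nomenclature goods code (8 digits)
    quantity_tonnes: float
    embedded_emissions_tco2e: float  # embedded emissions for the consignment
    uses_default_values: bool        # True if default values replace measured ones


def validate(record: CbamSupplierRecord) -> list[str]:
    """Return a list of schema violations; an empty list means the record conforms."""
    errors = []
    if not record.supplier_id:
        errors.append("supplier_id must not be empty")
    if not (record.cn_code.isdigit() and len(record.cn_code) == 8):
        errors.append("cn_code must be an 8-digit code")
    if record.quantity_tonnes <= 0:
        errors.append("quantity_tonnes must be positive")
    if record.embedded_emissions_tco2e < 0:
        errors.append("embedded_emissions_tco2e must be non-negative")
    return errors
```

A generator grounded this way can reject malformed records before they reach any downstream scoring step.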
2. NeMo DataDesigner core
Generation uses NVIDIA NeMo DataDesigner as the synthesis engine, with fine-tuned EU regulatory context injected via the Lifetime World Model.
3. LLM-judge validation
Every synthetic record passes through an LLM-judge that scores schema compliance, statistical plausibility, and regulatory relevance. Failed records are discarded; aggregate quality scores are logged.
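The filter-and-log step described above could be sketched roughly as follows. This is an assumption-laden outline, not the product's implementation: `JudgeScore`, `filter_records`, `stub_judge`, and the 0.7 threshold are hypothetical, and the stub stands in for a real LLM call.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass
class JudgeScore:
    schema_compliance: float         # 0.0 to 1.0
    statistical_plausibility: float  # 0.0 to 1.0
    regulatory_relevance: float      # 0.0 to 1.0

    def passed(self, threshold: float) -> bool:
        # A record passes only if every criterion meets the threshold.
        return min(self.schema_compliance,
                   self.statistical_plausibility,
                   self.regulatory_relevance) >= threshold


def filter_records(records, judge: Callable[[dict], JudgeScore], threshold: float = 0.7):
    """Discard records that fail the judge; summarise scores over all judged records."""
    kept, scores = [], []
    for record in records:
        score = judge(record)
        scores.append(score)
        if score.passed(threshold):
            kept.append(record)
    summary = {
        "judged": len(scores),
        "kept": len(kept),
        "mean_schema_compliance":
            mean(s.schema_compliance for s in scores) if scores else None,
        "mean_statistical_plausibility":
            mean(s.statistical_plausibility for s in scores) if scores else None,
        "mean_regulatory_relevance":
            mean(s.regulatory_relevance for s in scores) if scores else None,
    }
    return kept, summary


def stub_judge(record: dict) -> JudgeScore:
    # Placeholder for an LLM call: scores schema compliance by a trivial rule.
    ok = record.get("embedded_emissions_tco2e", -1) >= 0
    return JudgeScore(1.0 if ok else 0.0, 0.9, 0.8)
```

Requiring every criterion to clear the threshold, rather than averaging them, is one possible design choice; it keeps a record with a high average but a failed schema check out of the dataset.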
4. Lineage & signing
Every dataset ships with a provenance chain: source schema, generation config, NeMo version, judge scores, and the model hashes used — signed with the DWS IQ Aegis provenance chain. Suitable for assurance providers and EU AI Act high-risk model documentation.
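To make the idea of a signed provenance record concrete, here is a minimal sketch using only the Python standard library. It is not the Aegis provenance chain, whose format and signing scheme are not described here: the manifest keys and function names are hypothetical, and HMAC-SHA256 with a shared key is a stand-in for whatever signature scheme is actually used.

```python
import hashlib
import hmac
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_manifest(dataset: bytes, schema: bytes, config: dict,
                   engine_version: str, judge_scores: dict,
                   model_hashes: list[str]) -> dict:
    # Hypothetical manifest layout covering the items listed in the text.
    return {
        "dataset_sha256": sha256_hex(dataset),
        "schema_sha256": sha256_hex(schema),
        "generation_config": config,
        "engine_version": engine_version,
        "judge_scores": judge_scores,
        "model_hashes": model_hashes,
    }


def canonical(manifest: dict) -> bytes:
    # Sorted keys and fixed separators give a stable byte representation to sign.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(manifest: dict, key: bytes) -> str:
    # Stand-in signature: HMAC-SHA256 over the canonical manifest bytes.
    return hmac.new(key, canonical(manifest), hashlib.sha256).hexdigest()


def verify(manifest: dict, signature: str, key: bytes) -> bool:
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(sign(manifest, key), signature)
```

Any change to the manifest after signing, such as an edited judge score, makes `verify` return `False`. A third-party assurance workflow would more plausibly rely on asymmetric signatures so that verifiers never hold the signing key.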
Compliance coverage
| Regulation / Standard | Schema available | Typical use |
|---|---|---|
| CSRD ESRS 2023 final set (82 disclosures) | Yes | Pre-training disclosure reasoning, gap-analysis simulation |
| CBAM (Reg. 2023/956, Implementing Reg. 2023/1773) | Yes | Importer declaration simulation, supplier emission factor training |
| EPBD 2024 recast (Directive (EU) 2024/1275) | Yes | Synthetic building-performance profiles for renovation-passport models |
| EU Taxonomy (Reg. 2020/852 + Delegated Acts) | Yes | Substantial contribution / DNSH boundary cases, real-estate activities 7.1–7.7 |
| ETS Phase 4 (Reg. 2021/447 benchmarks) | Yes | Installation-level allocation scenarios |
| EN 15978 (embodied carbon, LCA A1–D) | Yes | Material-level embodied carbon synthetic datasets for construction |
| EU AI Act high-risk (Annex III) | Scaffolded | Training data documentation, bias-testing synthetic cohorts |
Additional schemas (ISO 50001, NIS 2 incident taxonomies, REACH, FuelEU Maritime) are on the roadmap.
Partnership context
NVIDIA Connect ISV
Lifetime Oy is a confirmed NVIDIA Connect ISV Partner. SyntheticDataAgent runs on NVIDIA NeMo DataDesigner and accelerates on L40S GPU nodes (cloud) and DGX Spark / Jetson Orin Nano (edge).
Siemens wedge
SyntheticDataAgent is the wedge we bring to the broader Siemens partnership conversation, covering Digital Industries (Xcelerator, NX, MindSphere), Smart Infrastructure (Building X, Desigo CC), Mobility, and Energy. Buildings is one reference unit; the platform value sits across the Siemens stack.
Firehorse hardware
DWS IQ's own Firehorse hardware line — Aegis edge appliances, Control Room chassis, Jetson / DGX Spark enclosures — uses synthetic-data-driven QC models during small / mid-series production.
NVIDIA, NeMo DataDesigner, Jetson, DGX, and L40S are trademarks of NVIDIA Corporation. Siemens Building X, Desigo CC, Xcelerator, MindSphere, and NX are registered trademarks of Siemens AG. HP Multi Jet Fusion is a trademark of HP Inc. Mentioned for factual compatibility only; no endorsement implied.
Delivery model
- Pilot (4–6 weeks): scope one regulation + one customer use case, generate the first signed synthetic dataset, validate with assurance provider.
- Managed service: ongoing synthetic dataset generation under customer-controlled schema, delivered via EU-sovereign cloud or on-prem edge appliance.
- Work-for-hire: custom schema + generator developed against customer IP, deployed to customer's own NVIDIA infrastructure.
Co-develop a synthetic data pilot
Regulated enterprise? Industrial platform? ERP, BMS, or ESG tool builder? Let's scope a 4–6 week pilot on a schema that matters to your roadmap.
See also: Siemens Integration · All Enterprise Integrations · Aegis Partners · All Products
Contact: risto@onelifetime.world