Product · NVIDIA Connect ISV

DWS IQ Synthetic Data Solutions

EU-sovereign synthetic data pipelines built on NVIDIA NeMo DataDesigner. For compliance datasets, quality-control training, and digital-twin simulation, where real data is scarce, confidential, or regulated.

What it does

The DWS IQ SyntheticDataAgent generates structured, verifiable synthetic datasets for situations that do not yet exist in your real-world data: fault scenarios your machines haven't failed through, tolerance deviations your QC team hasn't measured, edge-case building profiles your portfolio doesn't contain, and rare regulatory events (grid anomalies, supply-chain ruptures, compliance boundary cases) that break standard reasoning models.

Built on NVIDIA NeMo DataDesigner · EU-sovereign compute (Scaleway / Aiven / UpCloud) · LLM-judge validation · CBAM / CSRD / ESRS schema-ready
Why synthetic data matters now. CSRD, CBAM, and EU AI Act high-risk models all demand training data that (a) covers regulated edge cases, (b) can be shared with assurance providers, and (c) doesn't leak customer-confidential information. Purely real data rarely satisfies all three. Purely simulated data rarely matches the statistical realism regulators accept. DWS IQ's approach is schema-grounded generation with LLM-judge validation and lineage that traces back to the source EU regulation.

Use cases

CSRD / ESRS pre-training

Synthetic disclosure datasets across all 82 ESRS disclosure requirements for pre-training compliance reasoning models before a live customer rollout.

CBAM carbon declaration

Schema-grounded synthetic supplier datasets (Implementing Reg. 2023/1773 default values and embedded emissions) for training importer compliance workflows.

Additive manufacturing QC

Fault scenarios, tolerance deviations, and process variations for training quality-control models on AM production lines (NX AM, HP MJF, EOS, Trumpf).

Digital twin training

Edge-case operational profiles for building, factory, fleet, and grid digital twins where real-world observation coverage is thin.

Predictive maintenance

Synthetic failure-mode datasets for CNC, injection moulding, rolling stock, and HVAC, covering the rare failures that real data doesn't capture.

Privacy-preserving ML

Synthetic substitutes for customer-confidential datasets that preserve statistical properties without leaking personal or commercial data.

How it works

1. Schema grounding

Every generation run is grounded in a typed schema sourced from EU regulation or an industrial standard (CBAM, ESRS, ISO 50001, EN 15978). Schemas live in src/synthetic-data/ and are versioned with the regulation they reference.
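As an illustration of what a typed, regulation-versioned schema could look like, here is a minimal Python sketch. The class names, field names, version tag, and validation rules are all hypothetical; they are not taken from the actual contents of src/synthetic-data/ and only show the general idea of pairing a record type with a reference to the regulation it follows.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaRef:
    """Hypothetical pointer from a schema to the regulation it is grounded in."""
    regulation: str   # e.g. "CBAM"
    legal_basis: str  # e.g. "Reg. (EU) 2023/956"
    version: str      # assumed schema version tag, bumped when the regulation changes


@dataclass(frozen=True)
class CbamSupplierRecord:
    """Hypothetical record shape for a synthetic CBAM supplier dataset."""
    cn_code: str                               # 8-digit Combined Nomenclature code
    country_of_origin: str                     # two-letter country code
    quantity_tonnes: float
    embedded_emissions_tco2e_per_tonne: float
    uses_default_value: bool


CBAM_SCHEMA = SchemaRef("CBAM", "Reg. (EU) 2023/956", "0.1.0")


def validate(record: CbamSupplierRecord) -> list[str]:
    """Return a list of human-readable violations (empty means the record is valid)."""
    errors: list[str] = []
    if not (record.cn_code.isdigit() and len(record.cn_code) == 8):
        errors.append("cn_code must be exactly 8 digits")
    country = record.country_of_origin
    if not (len(country) == 2 and country.isalpha() and country.isupper()):
        errors.append("country_of_origin must be two uppercase letters")
    if record.quantity_tonnes <= 0:
        errors.append("quantity_tonnes must be positive")
    if record.embedded_emissions_tco2e_per_tonne < 0:
        errors.append("embedded emissions must be non-negative")
    return errors
```

A generated record would be checked with `validate(record)` before it moves further down the pipeline, and `CBAM_SCHEMA` would travel with the dataset so that a reviewer can see which version of the regulation the data was grounded in.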

2. NeMo DataDesigner core

Generation uses NVIDIA NeMo DataDesigner as the synthesis engine, with fine-tuned EU regulatory context injected via the Lifetime World Model.

3. LLM-judge validation

Every synthetic record passes through an LLM-judge that scores schema compliance, statistical plausibility, and regulatory relevance. Failed records are discarded; aggregate quality scores are logged.
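The filtering step described above can be sketched roughly as follows. This is an assumption-laden illustration: `JudgeScore`, `filter_records`, `stub_judge`, and the 0.7 threshold are hypothetical names and values, the judge here is a plain function standing in for an LLM call, and computing the aggregate over all scored records (not just the kept ones) is one possible reading of "aggregate quality scores".

```python
import logging
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Iterable, List, Tuple

logger = logging.getLogger(__name__)


@dataclass
class JudgeScore:
    """The three criteria named in the text, each assumed to lie in [0, 1]."""
    schema_compliance: float
    statistical_plausibility: float
    regulatory_relevance: float

    def passes(self, threshold: float) -> bool:
        # A record passes only if every criterion meets the threshold.
        return min(
            self.schema_compliance,
            self.statistical_plausibility,
            self.regulatory_relevance,
        ) >= threshold


def filter_records(
    records: Iterable[dict],
    judge: Callable[[dict], JudgeScore],
    threshold: float = 0.7,  # hypothetical cut-off
) -> Tuple[List[dict], dict]:
    """Score every record, discard failures, and log an aggregate report."""
    kept: List[dict] = []
    scores: List[JudgeScore] = []
    for record in records:
        score = judge(record)
        scores.append(score)
        if score.passes(threshold):
            kept.append(record)
    report = {
        "total": len(scores),
        "kept": len(kept),
        "mean_schema_compliance":
            mean(s.schema_compliance for s in scores) if scores else None,
        "mean_statistical_plausibility":
            mean(s.statistical_plausibility for s in scores) if scores else None,
        "mean_regulatory_relevance":
            mean(s.regulatory_relevance for s in scores) if scores else None,
    }
    logger.info("judge report: %s", report)
    return kept, report


def stub_judge(record: dict) -> JudgeScore:
    """Stand-in for an LLM judge: only checks that the quantity is positive."""
    ok = record.get("quantity_tonnes", 0) > 0
    return JudgeScore(1.0 if ok else 0.0, 0.9, 0.8)
```

Calling `filter_records(records, stub_judge)` returns the surviving records together with the report dictionary; in a real pipeline the `judge` argument would wrap a model call that returns the same three scores.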

4. Lineage & signing

Every dataset ships with a provenance record: source schema, generation config, NeMo version, judge scores, and the hashes of the models used, all signed with the DWS IQ Aegis provenance chain. The record is suitable for assurance providers and for EU AI Act high-risk model documentation.
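To make the lineage idea concrete, here is a small sketch of a signed manifest built with the Python standard library only. The manifest layout and all function names are hypothetical, and HMAC-SHA256 with a shared key is used purely as a stand-in: the source does not describe how the Aegis provenance chain actually signs data, and a production setup would more plausibly use asymmetric signatures.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, List


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def canonical(obj: Any) -> bytes:
    """Serialize deterministically so the same content always hashes the same."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_manifest(
    dataset_bytes: bytes,
    source_schema: str,
    generation_config: Dict[str, Any],
    engine_version: str,
    judge_scores: Dict[str, float],
    model_hashes: List[str],
) -> Dict[str, Any]:
    """Collect the lineage items listed in the text into one record."""
    return {
        "dataset_sha256": sha256_hex(dataset_bytes),
        "source_schema": source_schema,
        "generation_config_sha256": sha256_hex(canonical(generation_config)),
        "engine_version": engine_version,
        "judge_scores": judge_scores,
        "model_hashes": sorted(model_hashes),
    }


def sign_manifest(manifest: Dict[str, Any], key: bytes) -> str:
    # Stand-in signature: HMAC-SHA256 over the canonical manifest bytes.
    return hmac.new(key, canonical(manifest), hashlib.sha256).hexdigest()


def verify_manifest(manifest: Dict[str, Any], signature: str, key: bytes) -> bool:
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sign_manifest(manifest, key), signature)
```

An assurance provider holding the key could recompute the signature over the manifest they received and call `verify_manifest`; any change to the config hash, judge scores, or model hashes would make verification fail.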

Compliance coverage

| Regulation / Standard | Schema available | Typical use |
| --- | --- | --- |
| CSRD ESRS 2023 final set (82 disclosures) | Yes | Pre-training disclosure reasoning, gap-analysis simulation |
| CBAM (Reg. 2023/956, Implementing Reg. 2023/1773) | Yes | Importer declaration simulation, supplier emission factor training |
| EPBD 2024 recast (Directive (EU) 2024/1275) | Yes | Synthetic building-performance profiles for renovation-passport models |
| EU Taxonomy (Reg. 2020/852 + Delegated Acts) | Yes | Substantial contribution / DNSH boundary cases, real-estate activities 7.1–7.7 |
| ETS Phase 4 (Reg. 2021/447 benchmarks) | Yes | Installation-level allocation scenarios |
| EN 15978 (embodied carbon, LCA A1–D) | Yes | Material-level embodied carbon synthetic datasets for construction |
| EU AI Act high-risk (Annex III) | Scaffolded | Training data documentation, bias-testing synthetic cohorts |

Additional schemas (ISO 50001, NIS 2 incident taxonomies, REACH, FuelEU Maritime) are on the roadmap.

Partnership context

NVIDIA Connect ISV

Lifetime Oy is a confirmed NVIDIA Connect ISV Partner. SyntheticDataAgent runs on NVIDIA NeMo DataDesigner and accelerates on L40S GPU nodes (cloud) and DGX Spark / Jetson Orin Nano (edge).

Siemens wedge

SyntheticDataAgent is the wedge we bring to the broader Siemens partnership conversation, covering Digital Industries (Xcelerator, NX, MindSphere), Smart Infrastructure (Building X, Desigo CC), Mobility, and Energy. Buildings is one reference unit; the platform value sits across the Siemens stack.

Firehorse hardware

DWS IQ's own Firehorse hardware line — Aegis edge appliances, Control Room chassis, Jetson / DGX Spark enclosures — uses synthetic-data-driven QC models during small / mid-series production.

NVIDIA, NeMo DataDesigner, Jetson, DGX, and L40S are trademarks of NVIDIA Corporation. Siemens Building X, Desigo CC, Xcelerator, MindSphere, and NX are registered trademarks of Siemens AG. HP Multi Jet Fusion is a trademark of HP Inc. Mentioned for factual compatibility only; no endorsement implied.

Delivery model

Co-develop a synthetic data pilot

Regulated enterprise? Industrial platform? ERP, BMS, or ESG tool builder? Let's scope a 4–6 week pilot on a schema that matters to your roadmap.


See also: Siemens Integration · All Enterprise Integrations · Aegis Partners · All Products · Contact: risto@onelifetime.world