Two-sided radiology data marketplace · DACH

Radiology data,
certified anonymous,
ready to train.

We install a certified pipeline inside radiology practices, turn their dormant archives into de-identified, report-paired imaging data, and license it to AI and pharma teams. Practices earn recurring revenue. Builders get data they can actually use.

GDPR · German GDNG · EHDS-aligned — anonymisation independently audited

De-identified images in the founding corpus
Radiology practices live and onboarding across DACH
Years of clinical radiology experience behind it

Compliance & provenance
  • GDPR
  • German GDNG
  • EU EHDS-aligned
  • DICOM PS3.15
  • Independent Datenschutz-Gutachten
  • Secure clean-room delivery
  • Built with practicing radiologists

One pipeline · two winners

A marketplace where practices and builders both profit

Radiology practices supply the raw archives and get paid. AI and pharma teams get certified, ready-to-train data. Lumen Datasets is the trusted, certified layer in between.

For data producers

Radiology practices & clinics

Your archive is a dormant asset. Turn 10+ years of studies into a new, compliant revenue line — without changing how you work.

  • New recurring revenue: Earn ~30% of net revenue (illustrative) every time your data is licensed, as passive income from archives that sit idle today.
  • Zero workflow disruption: Our edge appliance reads from your PACS/RIS in the background. Nothing changes for your radiologists or staff.
  • Compliance handled for you: We carry the anonymisation, auditing and data-governance work, designed around §203 StGB, GDPR and consent-first intake.
  • You stay in control: Non-exclusive by default, opt-out anytime, full transparency on what is licensed and to whom.
Become a data partner

For data consumers

AI, foundation-model & pharma teams

Stop fighting for scraps of public imaging data. License a growing, certified corpus built for training and validation.

  • Report-paired & multimodal: Every record is a DICOM study matched to its radiology report plus structured metadata, not orphan images.
  • Certified anonymous: DICOM scrubbing, pixel-PHI removal, head-CT/MRI defacing and German report de-identification, independently audited.
  • DACH-representative: Real European clinical distributions and German-language reports that US-sourced datasets structurally lack.
  • Licensed your way: Non-exclusive data feed / API, or premium custom and semi-exclusive cohorts, delivered to a secure clean room.
Request data access

The operating model

From dormant archive to training-ready data

Six steps, one continuous pipeline. It starts at the practice and ends at the buyer, and value flows in both directions.

  1. Connect: A certified edge appliance plugs into the practice PACS/RIS. No workflow change, and data never leaves the building unprotected.

  2. Extract: It pulls historic and incoming studies: DICOM images paired with their radiology reports and metadata.

  3. Anonymise: On-site, the certified pipeline scrubs DICOM tags, removes burned-in pixel PHI, defaces head scans and de-identifies German reports.

  4. Curate: Clean records become structured, searchable, longitudinal entries in the Lumen platform, version-controlled and lineage-tracked.

  5. License: Buyers access the corpus via secure API / clean room: non-exclusive feeds plus premium custom and semi-exclusive cohorts.

  6. Reward: Revenue flows back to contributing practices. Every licensed study pays the practice that produced it. The loop compounds.
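
The Reward step can be illustrated with a small calculation. This is a minimal sketch, assuming a pro-rata split by licensed study count and the illustrative ~30% practice share quoted on this page. The function name, the practice identifiers and the split rule are assumptions for illustration, not contract terms.

```python
from decimal import Decimal

# Illustrative share quoted on this page; an assumption, not a contract term.
PRACTICE_SHARE = Decimal("0.30")


def practice_payouts(net_revenue: Decimal, licensed_studies: dict[str, int]) -> dict[str, Decimal]:
    """Split the practice pool pro rata by licensed study count per practice."""
    total = sum(licensed_studies.values())
    if total == 0:
        return {name: Decimal("0.00") for name in licensed_studies}
    pool = net_revenue * PRACTICE_SHARE
    return {
        name: (pool * count / total).quantize(Decimal("0.01"))
        for name, count in licensed_studies.items()
    }


# Hypothetical licence: 10,000 net revenue, 1,000 studies from two practices.
payouts = practice_payouts(Decimal("10000"), {"practice_a": 600, "practice_b": 400})
print(payouts)
```

Under these assumptions the practice pool is 3,000, of which practice_a receives 1,800 and practice_b receives 1,200.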

The flywheel: more practices make the corpus richer and more representative → which attracts more buyers → which pays practices more → which attracts more practices. Every new practice raises the value of the data for everyone already in.

The certified moat

Anonymisation we’ll put our name on

The whole model rests on one thing: data that is genuinely, defensibly anonymous — so it sits outside GDPR and can be licensed freely. For multimodal radiology, header-scrubbing alone is not enough. This is the part we obsess over.

DICOM tag scrubbing

Full PS3.15 confidentiality profiles — identifiers, private tags and UIDs removed or replaced with keyed pseudonyms.
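
One common way to replace UIDs with keyed pseudonyms is a keyed hash. The sketch below is an assumption about how such a step could look, not a description of the Lumen pipeline: it uses HMAC-SHA256 from the Python standard library and emits the result under the `2.25` UID root, so the same input always maps to the same replacement and study/series links survive. The key and the sample UID are hypothetical.

```python
import hashlib
import hmac


def pseudonymise_uid(uid: str, key: bytes) -> str:
    """Map a DICOM UID to a stable replacement UID using HMAC-SHA256."""
    digest = hmac.new(key, uid.encode("ascii"), hashlib.sha256).digest()
    # Take 128 bits of the digest as a decimal integer under the "2.25" root,
    # which keeps the result within the 64-character UID length limit.
    value = int.from_bytes(digest[:16], "big")
    return f"2.25.{value}"


# Hypothetical key; in practice it would come from a key store on the appliance.
key = b"example-key-held-only-on-the-appliance"
original = "1.2.840.113619.2.55.3.604688119.971.1449854623.123"
replacement = pseudonymise_uid(original, key)
print(replacement)
```

Without the key, the mapping cannot be recomputed from a known UID, which is the point of keying the hash rather than using a plain digest.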

Pixel-PHI removal

OCR sweeps every frame to find and redact text burned into the image itself — the failure mode header-scrubbing misses.
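
The redaction half of this step can be sketched independently of the OCR engine. The example below assumes that some OCR pass has already produced bounding boxes for burned-in text. The `Box` layout, the function name and the fill value are illustrative assumptions, and the frame is a plain nested list rather than a real pixel array.

```python
# (top row, left column, height, width) of a detected text region; hypothetical layout.
Box = tuple[int, int, int, int]


def redact_boxes(frame: list[list[int]], boxes: list[Box], fill: int = 0) -> list[list[int]]:
    """Return a copy of the frame with every box overwritten by the fill value."""
    redacted = [row[:] for row in frame]
    n_rows = len(redacted)
    n_cols = len(redacted[0]) if redacted else 0
    for top, left, height, width in boxes:
        # Clamp each box to the frame so out-of-range boxes cannot raise.
        for r in range(max(top, 0), min(top + height, n_rows)):
            for c in range(max(left, 0), min(left + width, n_cols)):
                redacted[r][c] = fill
    return redacted


frame = [[9] * 6 for _ in range(4)]          # toy 4x6 frame
clean = redact_boxes(frame, [(0, 1, 2, 3)])  # one box from a hypothetical OCR pass
print(clean)
```

The original frame is left untouched, which makes it easy to log what was redacted for audit.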

Head-scan defacing

Faces can be reconstructed from 3-D head CT/MRI. We deface and skull-strip cranial studies so that facial features cannot be rendered from the volume.

German report de-ID

Clinical NER tuned for German-language reports strips names, dates, places and identifiers while preserving the clinical signal.
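
To make the idea concrete, here is a deliberately simplified stand-in. It uses two regular expressions instead of a clinical NER model, so it only catches numeric German dates and names that follow a title. The patterns, placeholders and sample sentence are assumptions for illustration, and a real de-identifier would need far broader coverage.

```python
import re

# Numeric German dates such as 12.03.2021 or 1.3.21.
DATE = re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{2,4}\b")
# A capitalised surname directly after a title; titles are kept, names replaced.
TITLED_NAME = re.compile(r"\b(Herr|Frau|Dr\.)\s+[A-ZÄÖÜ][a-zäöüß]+")


def deidentify(report: str) -> str:
    """Replace dates and titled names with placeholders, leaving findings intact."""
    text = DATE.sub("[DATE]", report)
    return TITLED_NAME.sub(lambda m: f"{m.group(1)} [NAME]", text)


# "Ms Müller, examined on 12.03.2021: no evidence of a fracture."
report = "Frau Müller, untersucht am 12.03.2021: kein Nachweis einer Fraktur."
clean = deidentify(report)
print(clean)  # Frau [NAME], untersucht am [DATE]: kein Nachweis einer Fraktur.
```

The clinical finding passes through unchanged, which is the property the NER-based approach is meant to preserve at scale.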

Re-identification, measured

Motivated-intruder testing and k-anonymity checks on every release quantify residual risk — we do not ship on faith.
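
A k-anonymity check asks how small the smallest group of records sharing the same quasi-identifiers is. The sketch below assumes that records are plain dictionaries and that age band, modality and region are the quasi-identifiers. Both the field names and the sample records are hypothetical.

```python
from collections import Counter


def k_anonymity(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


records = [
    {"age_band": "40-49", "modality": "CT", "region": "chest"},
    {"age_band": "40-49", "modality": "CT", "region": "chest"},
    {"age_band": "60-69", "modality": "MRI", "region": "head"},
]
k = k_anonymity(records, ["age_band", "modality", "region"])
print(k)  # 1: the single MRI head record is unique on these attributes
```

A release gate could then refuse to ship, or generalise attributes further, whenever `k` falls below a chosen threshold.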

Audited & reproducible

Every transform is versioned, logged and independently auditable — backed by a formal Datenschutz-Gutachten.

Inside the corpus

Multimodal by construction

Most imaging datasets are orphan pixels. Ours aren’t. A Lumen record binds the image, the report and structured metadata into a single de-identified unit — the thing builders actually need to train and validate.

The image: DICOM study · scrubbed, defaced, pixel-clean
The report: Matched findings · de-identified German text
The metadata: Modality · region · age band · device · labels
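
The three parts above can be pictured as a single typed record. This is a sketch under assumptions: the class names, field names, storage reference and ten-year age bands are illustrative and are not the actual Lumen schema.

```python
from dataclasses import dataclass


def to_age_band(age_years: int, width: int = 10) -> str:
    """Generalise an exact age into a band such as '40-49'."""
    low = (age_years // width) * width
    return f"{low}-{low + width - 1}"


@dataclass(frozen=True)
class Metadata:
    modality: str
    region: str
    age_band: str
    device: str
    labels: tuple[str, ...] = ()


@dataclass(frozen=True)
class Record:
    study_uid: str    # pseudonymised study UID
    image_ref: str    # pointer to the de-identified DICOM study
    report_text: str  # de-identified German report
    metadata: Metadata


record = Record(
    study_uid="2.25.123456789",
    image_ref="studies/2.25.123456789",  # hypothetical storage path
    report_text="Kein Nachweis einer Fraktur.",
    metadata=Metadata("CT", "chest", to_age_band(47), "scanner-model-x", ("normal",)),
)
print(record.metadata.age_band)  # 40-49
```

Freezing the dataclasses keeps a record immutable once built, which suits version-controlled, lineage-tracked entries.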

Trust by design

Privacy-first, or it doesn’t ship

Patient data is a responsibility before it is a product. Governance isn’t a footnote here — it’s the foundation that makes the whole marketplace possible.

Consent-first intake

We design data sourcing around explicit patient consent and German §203 physician-secrecy rules — validated by qualified counsel.

Aligned to the new rules

Built for GDPR, the German GDNG and the European Health Data Space (EHDS). We work with the regulatory direction of travel, not against it.

No-re-identification covenant

Every buyer is contractually barred from re-identifying patients or linking records to outside data — with audit rights and kill-switches.

Practices stay in control

Non-exclusive by default, transparent reporting on every license, and the right to opt out at any time. Your data, your call.

From the inside

“I’ve spent 10+ years building a radiology practice. We were sitting on 2 million studies that helped no one once the report was signed. This turns that archive into research that matters — and a fair return for the practice.”
— Founding clinical partner · radiology practice, Germany
  • 10+ years in radiology
  • ≈2M studies contributed
  • Fair revenue share

Questions, answered

Frequently asked questions

Is selling this data even legal?

Properly anonymised data falls outside the GDPR, which is the basis on which it can be licensed. We design intake around explicit patient consent and German §203 physician-secrecy rules, and we validate the whole approach with qualified counsel and an independent Datenschutz-Gutachten before data ever moves. We treat this as the central question, not an afterthought.

How is the data anonymised?

On-site, before anything leaves the practice: DICOM tags are scrubbed to PS3.15 profiles, burned-in pixel text is removed by OCR, head CT/MRI scans are defaced so faces can’t be reconstructed, and German-language reports are de-identified with clinical NER. Every release is risk-tested and independently auditable.

How do radiology practices get paid?

Practices earn a revenue share — illustratively around 30% of the net revenue attributable to their data — every time it’s licensed. It’s non-exclusive, transparent, and you can opt out at any time. The archive you already own becomes recurring income.

What’s actually in the data?

Multimodal, report-paired records: a de-identified DICOM study plus its matched radiology report and structured metadata (modality, body region, age band, device, labels). Modalities span CT, MRI, X-ray, ultrasound, mammography and angiography, with native German reports and longitudinal links where available.

Can we get exclusive access?

The base corpus is licensed non-exclusively so it stays affordable and the flywheel keeps turning. For specific needs we also build premium custom and semi-exclusive cohorts — talk to us about scope, modality mix and validation requirements.

How do we get started?

Practices: book a short data audit and we’ll scope your archive and revenue potential. Buyers: tell us your use case and we’ll arrange a sample and a design-partner conversation. Either way, it starts with a 20-minute call.

Two ways in

Let’s turn radiology data into value

Whether you produce the data or build on it, the conversation starts the same way.