Saudi Data Protection Compliant

Anonymize sensitive data.
Automatically.

Upload your dataset and watch it pass through classification, NLP detection, masking, and validation — all in one pipeline. Download the result knowing it meets privacy thresholds.

How it works

Six stages, from upload to approval.

Step 1
Upload
Drop your file
CSV · JSON · Parquet
Example: patients_200.csv (CSV, 20.3 KB · 200 rows)

Drop your CSV, JSON, or Parquet file

Step 2
Classify
name: SENSITIVE
age: QID
country: PUBLIC
nat_id: SENSITIVE

Columns auto-classified by sensitivity

Step 3
NLP Detect
Free-text column
Patient [Ahmed Ali: NAME] from [Riyadh: CITY] with ID [1234···: ID]

Named entities found in free text

Step 4
Mask
name: Ahmed Al-Saud → SUPPRESSED
age: 34 → 30–40

One of 5 techniques applied, chosen by column type

Step 5
Validate
k-anon: 29 (Pass)
l-div: 5 (Pass)
t-close: 0.046 (Pass)

k-anonymity, l-diversity, t-closeness
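
As an illustration of the first check, here is a minimal sketch of how a k-anonymity value could be computed. The function name, the sample rows, and the column names are hypothetical and are not taken from the project's code; k is simply the size of the smallest group of rows that share the same quasi-identifier (QID) values.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest equivalence class over the QID columns.

    rows: list of dicts (one per record); quasi_identifiers: list of column names.
    Returns 0 for an empty dataset.
    """
    classes = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(classes.values()) if classes else 0


# Hypothetical, already-generalized sample data (not from the real pipeline).
rows = [
    {"age": "30-40", "country": "SA", "diagnosis": "flu"},
    {"age": "30-40", "country": "SA", "diagnosis": "asthma"},
    {"age": "40-50", "country": "SA", "diagnosis": "flu"},
    {"age": "40-50", "country": "SA", "diagnosis": "diabetes"},
    {"age": "40-50", "country": "SA", "diagnosis": "flu"},
]

k = k_anonymity(rows, ["age", "country"])
print(k)  # 2: the smallest QID group has two rows
```

A dataset would pass this check when k is at least the configured threshold; the threshold itself is a policy choice and is not specified here.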

Step 6
Approve
92 / 100
Approve · Reject

Review metrics, then release

Built for real privacy

Not a toy. Anonymization measured against formally defined privacy models: k-anonymity, l-diversity, and t-closeness.

5 Masking Techniques

Suppression, generalization, pseudonymization, date shifting, and NLP redaction — automatically selected per column.
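
To make four of these techniques concrete, here is a small sketch using only the Python standard library. Everything in it is an assumption made for illustration: the function names, the demo key, the 10-year age bands, and the 30-day shift window are hypothetical, and the project may implement each technique differently. NLP redaction is not covered here.

```python
import hashlib
import hmac
from datetime import date, timedelta

SECRET_KEY = b"demo-key-change-me"  # hypothetical key; keep a real one out of source code


def suppress(_value):
    """Suppression: drop the value entirely and leave a placeholder."""
    return "*"


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a band, e.g. 34 -> '30-40'."""
    low = (age // width) * width
    return f"{low}-{low + width}"


def pseudonymize(value, key=SECRET_KEY):
    """Pseudonymization: keyed hash, so equal inputs map to equal tokens."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]


def shift_date(d, record_id, key=SECRET_KEY, max_days=30):
    """Date shifting: move a date by a per-record offset in [-max_days, max_days].

    The offset is derived from the record ID, so all dates belonging to the
    same record shift by the same amount and intervals are preserved.
    """
    digest = hmac.new(key, record_id.encode("utf-8"), hashlib.sha256).digest()
    offset = int.from_bytes(digest[:4], "big") % (2 * max_days + 1) - max_days
    return d + timedelta(days=offset)


print(suppress("Ahmed Al-Saud"))   # *
print(generalize_age(34))          # 30-40
print(pseudonymize("Ahmed Al-Saud"))
print(shift_date(date(2024, 1, 15), "patient-001"))
```

The keyed hash (HMAC) is used instead of a plain hash so that tokens cannot be reproduced by someone who guesses the input but does not hold the key.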

Privacy Metrics

k-anonymity, entropy l-diversity, t-closeness via earth mover's distance. Prosecutor, journalist, and marketer risk models.
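
The sketch below shows one way entropy l-diversity and t-closeness could be computed for a categorical sensitive column. It rests on stated assumptions: the function and column names are hypothetical, the dataset is non-empty, and the earth mover's distance uses an equal ground distance between categories, in which case it reduces to half the L1 distance between the two distributions. A numeric attribute would need an ordered-distance variant instead.

```python
import math
from collections import Counter, defaultdict


def _distribution(values):
    """Relative frequency of each value in a non-empty list."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def _group_sensitive(rows, qids, sensitive):
    """Map each QID combination to the list of sensitive values in that group."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[q] for q in qids)].append(row[sensitive])
    return groups


def entropy_l_diversity(rows, qids, sensitive):
    """Largest l such that every group has entropy >= log(l), i.e. exp(min entropy)."""
    groups = _group_sensitive(rows, qids, sensitive)
    worst_entropy = min(
        -sum(p * math.log(p) for p in _distribution(vals).values())
        for vals in groups.values()
    )
    return math.exp(worst_entropy)


def t_closeness(rows, qids, sensitive):
    """Largest distance between any group's distribution and the overall one.

    Assumes equal ground distance between categories, so EMD = 0.5 * L1.
    """
    overall = _distribution([row[sensitive] for row in rows])
    worst = 0.0
    for vals in _group_sensitive(rows, qids, sensitive).values():
        local = _distribution(vals)
        emd = 0.5 * sum(abs(local.get(v, 0.0) - q) for v, q in overall.items())
        worst = max(worst, emd)
    return worst


# Hypothetical sample data, already generalized.
rows = [
    {"age": "30-40", "country": "SA", "diagnosis": "flu"},
    {"age": "30-40", "country": "SA", "diagnosis": "asthma"},
    {"age": "40-50", "country": "SA", "diagnosis": "flu"},
    {"age": "40-50", "country": "SA", "diagnosis": "diabetes"},
    {"age": "40-50", "country": "SA", "diagnosis": "flu"},
]

print(round(entropy_l_diversity(rows, ["age", "country"], "diagnosis"), 2))
print(round(t_closeness(rows, ["age", "country"], "diagnosis"), 3))
```

Higher l and lower t are better: l reflects how varied the sensitive values are inside the least varied group, and t reflects how far the most skewed group drifts from the dataset as a whole.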

NLP Entity Detection

Regex + dictionary layers detect Saudi IDs, IBANs, names, hospitals, and locations in free-text columns.
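
A regex layer plus a dictionary layer can be sketched in a few lines. The patterns and word lists below are assumptions for illustration only: the ID pattern assumes a 10-digit number beginning with 1 or 2, the IBAN pattern assumes the 24-character Saudi format (SA, two check digits, 20 alphanumerics), and the city list is a hypothetical stand-in for a real dictionary. Name and hospital detection is left out, and overlapping matches are not handled.

```python
import re

# Regex layer (assumed formats, see the note above).
PATTERNS = {
    "ID": re.compile(r"\b[12]\d{9}\b"),
    "IBAN": re.compile(r"\bSA\d{2}[A-Z0-9]{20}\b"),
}

# Dictionary layer (hypothetical, tiny word list).
DICTIONARIES = {
    "CITY": {"Riyadh", "Jeddah", "Dammam"},
}


def detect_entities(text):
    """Return a sorted list of (start, end, label) spans found in text."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), match.end(), label))
    for label, terms in DICTIONARIES.items():
        for term in terms:
            for match in re.finditer(rf"\b{re.escape(term)}\b", text):
                found.append((match.start(), match.end(), label))
    return sorted(found)


def redact(text):
    """Replace each detected span with its label, working right to left
    so earlier offsets stay valid. Assumes spans do not overlap."""
    out = text
    for start, end, label in reversed(detect_entities(text)):
        out = out[:start] + f"[{label}]" + out[end:]
    return out


sample = "Patient from Riyadh with ID 1234567890 and IBAN SA4420000001234567891234"
print(redact(sample))
# Patient from [CITY] with ID [ID] and IBAN [IBAN]
```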

Approval Workflow

Data owners review privacy scores and approve or quarantine datasets before release. Full audit trail.

Ready to protect your data?

Upload a dataset and see the anonymization pipeline in action. Full privacy metrics, approval workflow, and audit trail included.

Get Started
SADN · CS 511 Capstone Project
University project — not for production use