KVKK-compliant CDP architecture: Eight layered decisions
Selling from Istanbul to Frankfurt, with KVKK and GDPR on the same backlog? Here are eight layered architectural decisions, distilled from four CDP builds over 14 months.
If your company sells from Istanbul to Frankfurt and the question of “two separate systems or one CDP” is on the table, this post is aimed at you. Meeting KVKK and GDPR in the same stack is simple in theory and not in practice. The two regulations point in the same direction — the user has a say over their own data — but the operational details (storage region, consent format, deletion deadline, audit records) clash in places. The “let’s run two separate systems” answer reads well on paper and falls apart in production.
Across the last 14 months we built KVKK + GDPR-compliant Customer Data Platforms (CDPs) for four clients. The sectors differed — fintech, B2B SaaS, edtech, logistics — yet the underlying architectural decisions were strikingly repetitive. This post walks through eight layered decisions: which layer answers which regulatory clause, which tools work under which conditions, and which trade-offs are real trade-offs. The aim is not a “KVKK compliance template.” The aim is to put the calls in front of your team in the right order, before you sign the first contract. Some decisions cannot be undone after going to production; seeing them up front saves both money and reputation.
Decision 1: Collection and explicit consent
No CDP starts before the consent layer is solved, period. KVKK Article 5 and GDPR Article 6 both require a legal basis for processing; “explicit consent” is the strongest basis, but the operational meaning of “explicit” lives in the details. A single “I agree” button on a page passes neither.
The practical setup has three parts. First, a cookie banner with separated granularity: marketing consent, analytics consent, and personalization consent are selectable independently. OneTrust, Cookiebot, and Iubenda are enterprise options; Klaro is a credible open-source alternative for smaller teams. The “Reject all” button must be as visible as “Accept all.” Anything less undermines the GDPR requirement that consent be “freely given.” Second, we treat double opt-in as mandatory for email subscriptions, because it is the cleanest way to prove that the owner of the address actually consented. The user receives a confirmation link on first signup and is not added to the list until that link is clicked. Resend, Postmark, and Mailgun support this out of the box. Third, the consent record must be auditable: which user gave which consent, under which version of the text, from which IP, with which timestamp? The record is kept at least 5 years; KVKK auditors apply a “consent you cannot prove does not exist” principle.
A common mistake is burying the consent layer inside the CDP. Segment’s consent management is fine but is not a regulatory defense on its own. Consent should sit in a separate append-only log that the CDP merely reads from. In three of our engagements that log lives on Cloudflare D1 or Postgres, independent of the CDP.
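A minimal sketch of such an append-only consent log, using Python's built-in sqlite3 module as a stand-in for D1 or Postgres. The table name, column names, and helper functions are assumptions made for illustration, not the schema from our engagements. Triggers reject UPDATE and DELETE, so a withdrawal is recorded as a new row rather than an edit.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per consent event, never modified afterwards.
SCHEMA = """
CREATE TABLE consent_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id TEXT NOT NULL,
    purpose TEXT NOT NULL,        -- marketing, analytics, personalization
    granted INTEGER NOT NULL,     -- 1 = granted, 0 = withdrawn
    text_version TEXT NOT NULL,   -- version of the consent text shown
    ip TEXT NOT NULL,
    recorded_at TEXT NOT NULL
);
CREATE TRIGGER consent_no_update BEFORE UPDATE ON consent_log
BEGIN SELECT RAISE(ABORT, 'consent_log is append-only'); END;
CREATE TRIGGER consent_no_delete BEFORE DELETE ON consent_log
BEGIN SELECT RAISE(ABORT, 'consent_log is append-only'); END;
"""


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open the log and create the table plus its append-only triggers."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_consent(conn, user_id, purpose, granted, text_version, ip):
    """Append one consent event; withdrawals are new rows, not updates."""
    conn.execute(
        "INSERT INTO consent_log "
        "(user_id, purpose, granted, text_version, ip, recorded_at) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (user_id, purpose, int(granted), text_version, ip,
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def current_consent(conn, user_id, purpose) -> bool:
    """The latest row wins; no row at all means no consent."""
    row = conn.execute(
        "SELECT granted FROM consent_log "
        "WHERE user_id = ? AND purpose = ? ORDER BY id DESC LIMIT 1",
        (user_id, purpose),
    ).fetchone()
    return bool(row[0]) if row else False
```

The CDP would only ever call something like current_consent; writes stay with the banner and signup flows.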
Decision 2: Storage and data residency
Storage region is where KVKK and GDPR collide hardest. KVKK Article 9 ties cross-border transfers to “adequate protection.” GDPR Chapter V (Articles 44 to 49) ties transfers out of the EU to an adequacy decision, SCCs, or another recognized safeguard. In practice this means the question “where is the data” demands a convincing answer.
Our default setup keeps primary data inside the EU. If a Turkey-resident replica is required (typical in finance or healthcare), a hybrid-region setup is added on top. Practical tool picks: eu.posthog.com (Frankfurt) instead of the US PostHog, Resend with region: 'eu-west-1', Cloudflare R2 with EU jurisdiction tagging instead of S3, Qdrant Cloud EU or Pinecone’s EU region for vector storage. Cloudflare Workers’ global edge is a discussion point under KVKK; the resolution is to use Workers as edge cache only, with persistent storage delegated to R2.
If a Turkey replica is needed, the operational load is real. Either a Turkish cloud provider (Türk Telekom Cloud, A101 Cloud) is added, or a subset of the EU primary data is replicated into Turkey for Turkish users. The second option is operationally simpler; replication is done via CDC (Change Data Capture) with 2-5 minute lag.
Important note: data residency is not just the database. Logging, monitoring, and error tracking matter too. Sentry has an EU instance (de.sentry.io); Datadog has its EU site. If the team shares production data samples on Slack, that is also a transfer — Slack Enterprise Grid’s EU data residency add-on is required.
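One way to keep this honest is a small check in CI that fails when a configured endpoint is not on an EU allow-list. The sketch below is an assumption about how such a check could look; the allow-list, the function names, and the example URLs in the usage note are hypothetical and would need to match your own vendors.

```python
from urllib.parse import urlparse

# Hypothetical allow-list of host suffixes treated as EU-resident.
EU_HOST_SUFFIXES = ("eu.posthog.com", "de.sentry.io")


def is_eu_host(url: str) -> bool:
    """True if the URL's host equals, or is a subdomain of, an allowed suffix."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == s or host.endswith("." + s) for s in EU_HOST_SUFFIXES)


def non_eu_services(endpoints: dict[str, str]) -> list[str]:
    """Return the names of configured services that fall outside the allow-list."""
    return sorted(name for name, url in endpoints.items() if not is_eu_host(url))
```

Called with a mapping such as {"analytics": "https://eu.posthog.com/capture", "legacy": "https://us.example.com/track"}, it returns the names that need attention, here the hypothetical "legacy" entry. A hostname check says nothing about where a vendor actually stores the data, so it complements the contract review rather than replacing it.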
Decision 3: PII pseudonymization
Removing personal data from the system entirely is rarely possible. The “the fewer places it travels, the better” rule still pays off in every audit. Pseudonymization is the key here: PII is transformed via hash + salt, and downstream analytics and activation layers use the deterministic surrogate instead of cleartext.
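Practical setup. As fields like email, phone, national_id, and ip enter the CDP, they are hashed via HMAC-SHA256 with a fixed secret key (the “salt” in this post; strictly speaking a key or pepper, since it is secret and shared across records). Storing that key in a separate KMS (AWS KMS, GCP KMS, or an object behind restricted Cloudflare R2 tokens) matters. An HMAC cannot be reversed directly, but if the key leaks, low-entropy inputs such as phone numbers and national IDs can be recovered by brute force. Identity resolution operates on the hashed value; downstream systems (Customer.io, HubSpot) see the hashed identifier rather than cleartext email.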
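The hashing step can be sketched with Python's standard hmac module. The function names and the normalization rule (trim and lowercase emails before hashing) are assumptions for illustration; the secret, called the salt above, is passed in as the HMAC key and would be fetched from the KMS at runtime rather than hard-coded.

```python
import hashlib
import hmac


def normalize_email(email: str) -> str:
    """Assumed normalization so the same address always maps to the same surrogate."""
    return email.strip().lower()


def pseudonymize(value: str, key: bytes) -> str:
    """Deterministic surrogate: HMAC-SHA256 of the value under a secret key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize_profile(profile: dict[str, str], key: bytes) -> dict[str, str]:
    """Replace the PII fields named in the text with surrogates; keep the rest."""
    pii_fields = {"email", "phone", "national_id", "ip"}
    out = {}
    for field, value in profile.items():
        if field == "email":
            value = normalize_email(value)
        out[field] = pseudonymize(value, key) if field in pii_fields else value
    return out
```

Because the output is deterministic for a given key, identity resolution can join on the surrogate. The flip side is that rotating the key changes every surrogate, so rotation needs a planned re-hash.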
The discipline of keeping PII out of vector embeddings is equally important. When transforming user content into embeddings for an AI feature, names, phone numbers, and emails are scrubbed automatically by Microsoft Presidio or a similar PII detector. Recovering PII from an embedding is hard in practice, though inversion attacks make it unwise to treat embeddings as anonymous, and anything that enters the database in cleartext exposes you in a later breach. The “daily model evaluation” habit we discussed in our post on discipline in AI product development applies here too: the cases your PII filter misses get caught by manual review.
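To show where the scrubbing step sits, here is a deliberately simple regex-based stand-in. It is not Presidio and not a substitute for it: it only catches emails and phone-like digit runs, misses names entirely, and the patterns and placeholder tokens are assumptions chosen for the example.

```python
import re

# Rough patterns for illustration only; a real detector covers far more cases.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def scrub(text: str) -> str:
    """Replace emails and phone-like sequences before the text is embedded."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    return PHONE_RE.sub("<PHONE>", text)


def prepare_for_embedding(documents: list[str]) -> list[str]:
    """Scrub every document; the embedding call itself is out of scope here."""
    return [scrub(doc) for doc in documents]
```

The phone pattern will also swallow other long digit runs such as order numbers, which is usually an acceptable false positive on this path. Whatever the detector misses is what the manual review is for.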
For activation triggers headed to the MAP, the typical architecture splits into two: hashed user inside the CDP, cleartext email inside the MAP. A “lookup table” on the MAP side bridges the two; the hash-to-email match resolves only at activation time and is not written to logs.
Decision 4: Deletion request workflows
KVKK Article 11 and GDPR Article 17 both grant the right to erasure. KVKK gives the controller 30 days to respond; GDPR says “without undue delay” and, under Article 12(3), within one month at the latest. Hitting a 30-day SLA is operationally enough; setting a tighter target (15 days) gives audit headroom.
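The two clocks are easy to compute at ticket creation so that every request carries both dates. This is a small sketch; the function and field names are assumptions, and counting calendar days from the day of receipt is a simplification your legal team should confirm.

```python
from datetime import date, timedelta

LEGAL_DAYS = 30    # outer limit discussed above
TARGET_DAYS = 15   # tighter internal target for audit headroom


def deletion_deadlines(received: date) -> dict[str, date]:
    """Return the internal target and the legal deadline for one request."""
    return {
        "internal_target": received + timedelta(days=TARGET_DAYS),
        "legal_deadline": received + timedelta(days=LEGAL_DAYS),
    }


def days_remaining(received: date, today: date) -> int:
    """Days left before the legal deadline; negative means it was missed."""
    return (deletion_deadlines(received)["legal_deadline"] - today).days
```

For a request received on 1 March 2025, the internal target is 16 March and the legal deadline is 31 March.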
Deletion is hard because data has been copied everywhere. A user’s data typically sits in: the primary CDP, the warehouse, the MAP (Customer.io/HubSpot), the CRM, the support tool (Zendesk/Intercom), analytics (PostHog/Mixpanel), backups, cold storage, vector store, and log files. Manual deletion across 10-12 surfaces almost always leaves a hidden copy.
The practical solution is to route deletion requests through a single “deletion orchestrator.” A user submits a request via a form or a regulated channel such as a dedicated privacy mailbox, the ticket lands in the system, and an automation deletes from each downstream system in sequence with an API call plus a receipt log. n8n or Temporal is the right tool here; every downstream has a different API and retry logic is non-negotiable. At the end the user receives a report: “deleted from these systems, retained in these for legal reasons (e.g. tax records under VUK's 5-year retention, or commercial books under TTK's 10-year retention).” That report is itself an audit instrument.
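Stripped of the workflow engine, the orchestrator's core loop looks roughly like the sketch below. It is plain Python rather than n8n or Temporal, and every name in it (the function, the receipt fields, the retry defaults) is an assumption for illustration. Each downstream system is represented by a callable that raises on failure.

```python
import time
from typing import Callable


def delete_everywhere(
    user_id: str,
    targets: dict[str, Callable[[str], None]],
    retained: dict[str, str],
    max_attempts: int = 3,
    backoff_seconds: float = 1.0,
) -> list[dict]:
    """Call each system's delete function in order and collect receipts."""
    receipts = []
    for system, delete in targets.items():
        if system in retained:
            # Legal retention: record the reason instead of deleting.
            receipts.append({"system": system, "status": "retained",
                             "reason": retained[system]})
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                delete(user_id)
                receipts.append({"system": system, "status": "deleted",
                                 "attempts": attempt})
                break
            except Exception as exc:
                if attempt == max_attempts:
                    # Out of retries: surface the failure in the receipt log.
                    receipts.append({"system": system, "status": "failed",
                                     "attempts": attempt, "error": str(exc)})
                else:
                    time.sleep(backoff_seconds * attempt)  # linear backoff
    return receipts
```

The returned receipts are the raw material for the user-facing report. A "failed" entry should keep the ticket open rather than letting the request close silently.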
Backups are a separate discussion. Saying “delete from backup” is unhelpful because a backup is a recovery instrument by definition. The fix: backups have a fixed rotation period (e.g. 35 days), and any backup older than that gets deleted automatically, so a deleted user cannot resurface from a restore once the window has passed. If a restore happens inside the window, the deletions recorded since that backup need to be re-applied before the system goes back into service. Setting that policy adds work; in audits, the answer “I have a backup, restoring it would bring the data back” lands very poorly.
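The rotation rule itself is a few lines of date arithmetic. This is a sketch that assumes backups are identified by their creation date; the function names are hypothetical, and the 35-day default mirrors the example above.

```python
from datetime import date, timedelta


def expired_backups(backup_dates: list[date], today: date,
                    retention_days: int = 35) -> list[date]:
    """Backups strictly older than the retention window, oldest first."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in backup_dates if d < cutoff)


def gone_from_backups_after(deleted_on: date, retention_days: int = 35) -> date:
    """Last day on which a surviving backup could still hold the deleted user."""
    return deleted_on + timedelta(days=retention_days)
```

The second function gives a date you can put in the deletion report: the point after which no surviving backup can contain the user.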
Decision 5: Data portability and export API
GDPR Article 20 (right to data portability) and KVKK Article 11 require the user’s data to be portable in a machine-readable format. In practice the user must be able to download their own data in JSON, CSV, or similar, and import it into another system if needed.
Our typical export endpoint works as follows: the user requests an export through the portal, a backend job collects the user’s data across all systems (CDP profile, events, support tickets, subscription history), packages it as JSON-LD, and emails a signed URL with a 7-day expiry. JSON-LD’s advantage is its alignment with schema.org types, which simplifies import on the receiving side. CSV is an option but loses fidelity for nested relations like event attributes.
The export file itself contains PII, so it should be short-lived: the signed URL stops working after 7 days, and the file behind it is deleted after 14 days. That detail comes up in audits often.
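In production the signed URL would normally come from the object store's own presigning feature. To make the mechanism concrete, here is a hand-rolled HMAC version as a sketch; the query parameter names, the function names, and the example host in the usage note are all assumptions.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

URL_TTL_SECONDS = 7 * 24 * 3600  # 7-day link lifetime from the text


def _signature(key: bytes, user_ref: str, expires: int) -> str:
    """HMAC over the user reference and expiry, so neither can be altered."""
    message = f"{user_ref}:{expires}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def sign_export_url(base_url: str, user_ref: str, key: bytes, now: int) -> str:
    """Build a download URL that carries its own expiry and signature."""
    expires = now + URL_TTL_SECONDS
    query = urlencode({"u": user_ref, "exp": expires,
                       "sig": _signature(key, user_ref, expires)})
    return f"{base_url}?{query}"


def verify_export_url(url: str, key: bytes, now: int) -> bool:
    """Accept only untampered URLs that have not yet expired."""
    params = parse_qs(urlparse(url).query)
    try:
        user_ref = params["u"][0]
        expires = int(params["exp"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(key, user_ref, expires)
    untampered = hmac.compare_digest(expected.encode("utf-8"), sig.encode("utf-8"))
    return untampered and now < expires
```

A call such as sign_export_url("https://exports.example.test/file.jsonld", user_ref, key, now) yields the link that goes into the email. The now argument is a Unix timestamp passed in explicitly, which keeps the expiry logic easy to test.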
Decision 6: Audit log and breach notification
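KVKK Article 12 and GDPR Article 33 both require notifying the regulator of a breach (in Turkey, the KVKK Board; VERBİS is the controller registry, not the breach-reporting channel). GDPR sets the regulator deadline at 72 hours from becoming aware; KVKK says “as soon as possible,” which the Board has interpreted as 72 hours. Affected users must be told as well: GDPR Article 34 requires this “without undue delay” when the risk to them is high, and KVKK likewise expects notification as soon as possible. Planning for 72 hours on both fronts is the safe operational target. Two preconditions are required: detecting the breach, and producing the affected-user list quickly.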
The audit log layer captures every read, change, and export structurally. The typical schema: timestamp, user_id, action, resource, ip, user_agent, result. The log is append-only, retained at least 5 years, write-only for system services, read-only for audit and security teams. The log itself processes PII, so any PII it contains should be hashed.
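The schema above maps naturally onto an immutable record. In this sketch the field names follow the text, while the helper names and the choice to hash user_id and ip with HMAC-SHA256 are assumptions; leaving user_agent in cleartext is also an assumption you may want to revisit.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)  # frozen: an entry cannot be modified after creation
class AuditEntry:
    timestamp: str
    user_id: str      # stored as a keyed hash, not cleartext
    action: str       # read, change, export
    resource: str
    ip: str           # stored as a keyed hash, not cleartext
    user_agent: str
    result: str


def make_entry(key: bytes, timestamp: str, user_id: str, action: str,
               resource: str, ip: str, user_agent: str, result: str) -> AuditEntry:
    """Build an entry, hashing the PII fields before they reach the log."""
    def hashed(value: str) -> str:
        return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return AuditEntry(timestamp, hashed(user_id), action, resource,
                      hashed(ip), user_agent, result)


def to_json_line(entry: AuditEntry) -> str:
    """One JSON object per line, ready to append to the log store."""
    return json.dumps(asdict(entry), sort_keys=True)
```

Using the same key as the pseudonymization layer lets you join audit entries to CDP profiles during a breach investigation; using a different key isolates the two. Either choice is defensible if it is written down.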
Detection-wise: a SIEM (Wazuh, Elastic Security, Datadog Security) flags abnormal access patterns. If one user reads 10,000 records in an hour, an alert fires. To meet the 72-hour clock, a runbook must be ready: who files with the KVKK Board, who emails affected users, who coordinates with legal. Practicing the runbook with a tabletop exercise once or twice a year is invaluable in audits — the answer “we never had this happen” lands badly. The answer “we drill twice a year, here are the results” lands well.
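The 10,000-reads-in-an-hour rule can be expressed as a simple count over a fixed window. A real SIEM rule would use sliding windows and per-user baselines; this sketch only shows the shape of the check, and the event tuple layout and function name are assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta

# Each event is (timestamp, user_id, action); layout assumed for the example.
Event = tuple[datetime, str, str]


def heavy_readers(events: list[Event], window_start: datetime,
                  window: timedelta = timedelta(hours=1),
                  threshold: int = 10_000) -> list[str]:
    """Users whose read count inside the window reaches the threshold."""
    window_end = window_start + window
    counts = Counter(
        user for ts, user, action in events
        if action == "read" and window_start <= ts < window_end
    )
    return sorted(user for user, n in counts.items() if n >= threshold)
```

The returned user list doubles as the starting point for the affected-user query in the runbook: the records those users touched are the first candidates for the breach scope.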
Decision 7: A practical architecture diagram
The eight layers come together in a typical flow: user browser → Cloudflare edge (consent check) → Hono or Astro server endpoint (PII pseudonymization) → CDP (Segment or a self-hosted RudderStack instance) → Warehouse (BigQuery EU or Snowflake EU) → reverse-ETL (Hightouch/Census) → MAP (Customer.io EU) and Analytics (eu.posthog.com). The audit log is written in parallel from every layer to an append-only Postgres. The deletion orchestrator (n8n) walks this chain in reverse.
Which tools we tend to use under which conditions is covered at length in our martech stack architecture post. The single difference: under KVKK + GDPR conditions, the warehouse-first approach becomes the default, because the warehouse resolves data residency in the simplest way.
Decision 8: The rule of regulatory interpretation
Regulatory text is often open to interpretation. What is “explicit consent”? At what level is “adequate protection”? What “appropriate technical measures” are required? Answering those is legal’s job. Engineering’s job is producing a defensible decision. “Defensible” means: in an audit, the answer to “why did you make this call” is backed by a document.
The practical discipline is attaching a DPIA (Data Protection Impact Assessment) note to every architectural decision. Which regulatory clause does this control map to, what was the alternative, why did we choose this option? These notes are the only way to answer “we made this call for these reasons” 12 months later when an audit arrives. Notion, Confluence, or a plain markdown repo is enough; the format does not matter, the discipline does.
Closing
A KVKK + GDPR-compliant CDP is not dramatically more complex than a “regular” CDP; it just demands a deliberate decision at each of eight layers. Most of the calls made in the first setup are expensive to reverse — changing the storage region, rewriting the audit log schema, or bolting on a deletion orchestrator after the fact are all operationally costly and risk-bearing. Putting the eight layers on the table in week one creates a visible difference when an audit arrives in month twelve.
Which of these decisions are open in your stack, and which sit in a “we’ll figure that out later” list? For a context review you can reach us through the martech and AI operations page, or write to us via the contact page.