How Indian businesses should design secure data filtering for B2B and B2C operations in 2026 to meet DPDP, RBI and SEBI cybersecurity expectations.
Secure B2B and B2C Data Filtering in India: A Practical 2026 Framework
Indian businesses must now filter, classify and govern data as a hard regulatory obligation. The Digital Personal Data Protection Act 2023 (DPDP Act), RBI's Master Direction on IT Governance (2023), SEBI's revised Cyber Security and Cyber Resilience Framework (CSCRF, August 2024) and CERT-In's 6-hour breach-reporting direction together create overlapping obligations governing how data enters your systems, how it is handled inside them and what is allowed to leave. Get any layer wrong and the penalty exposure is measured in crores — not lakhs.
What Data Filtering Actually Means in a Compliance Context
Most finance and IT teams still treat data filtering as a data-quality task: remove blanks, fix formats, deduplicate. In FY 2026-27, the term carries a different and heavier regulatory meaning.
Data filtering is the disciplined process of inspecting every record against a policy set — classification tier, consent status, contractual limits and regulatory obligations — and then deciding whether to allow, transform, mask, tokenise, redact or block that record. It applies at three distinct points: on the way in (inbound validation), while the data lives inside your systems (in-platform controls) and on the way out to vendors, partners, ad platforms and analytics tools (outbound governance).
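The decision described above can be sketched as a small policy function. This is a minimal illustration, not a reference implementation: the `Record` fields, the reduced set of actions and the rule ordering are all assumptions made for the example, and the tier numbers follow the five-tier classification table later in this article.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    MASK = "mask"
    TOKENISE = "tokenise"
    BLOCK = "block"


@dataclass(frozen=True)
class Record:
    tier: int                   # classification tier, 1 (Public) to 5 (Critical Financial)
    consent_valid: bool         # hypothetical flag resolved from a consent register
    destination_external: bool  # True when the record is leaving your systems


def decide(record: Record) -> Action:
    """Return the filtering action for one record under an illustrative policy."""
    if record.tier >= 5:
        return Action.TOKENISE      # e.g. card data is never held in raw form
    if record.tier == 4:
        if not record.consent_valid:
            return Action.BLOCK     # no valid consent, no processing
        # Personal data leaving the platform is masked; internal use is allowed.
        return Action.MASK if record.destination_external else Action.ALLOW
    return Action.ALLOW             # tiers 1 to 3 carry no personal-data rule here
```

A real policy set would add contractual limits and the transform and redact actions; the point of the sketch is that the outcome is computed from record attributes rather than decided case by case.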
The legal reason this matters: Section 8(3) of the DPDP Act 2023 requires every Data Fiduciary to ensure that personal data likely to be used to make a decision affecting a Data Principal is complete, accurate and consistent. A broken inbound filter that allows stale mobile numbers or incorrect PAN mappings into a lending or insurance workflow is not just a data-quality failure; it is a statutory violation the Data Protection Board can cite.
The Regulatory Stack Governing Your Data in FY 2026-27
Understanding which rule maps to which data flow prevents you from building three separate compliance programmes. Here is the consolidated picture.
DPDP Act 2023
Applies to any entity that processes digital personal data within India, or processes the data of Indian residents outside India. For filtering purposes, the critical obligations are:
- Section 6 — Consent: Processing for any purpose requires free, specific, informed and unambiguous consent. Pre-ticked boxes and bundled consent clauses are invalid. Consent status must be a machine-readable filter condition in your data pipelines, not a human-readable footnote in your privacy policy.
- Section 8(5) — Reasonable security safeguards: Penalty under the Schedule: up to Rs. 250 crore for breach of this obligation.
- Breach notification: Notify the Data Protection Board of India and affected Data Principals within timelines to be prescribed in Rules (Rules were pending as of May 2026). Until notified, align to the CERT-In 6-hour standard as a conservative interim baseline.
RBI IT Governance and Tokenisation
RBI's Master Direction on Information Technology Governance, Risk, Controls and Assurance Practices (2023) governs banks, NBFCs and payment system operators. On tokenisation, RBI's circular CO.DPSS.POLC.No.S-516/02-14-003/2021-22 and successor circulars prohibit merchants, payment aggregators and acquirers from storing raw card Primary Account Numbers (PANs), expiry dates or CVV after the October 2022 deadline. Tokens issued by card networks (Visa, Mastercard, RuPay) are the only permissible stored representation.
CERT-In Direction (IT Act 2000, Section 70B)
Cybersecurity incidents — including data breaches, unauthorised access and ransomware — must be reported to CERT-In within 6 hours of detection, not 6 hours of investigation completion. The clock starts when any team member becomes aware something has gone wrong. Approximately 20 categories of incidents are covered. Failure to report is a criminal offence under the IT Act.
SEBI CSCRF (August 2024 Revision)
The revised framework applies a tiered architecture:
- Market Infrastructure Institutions (MIIs) — exchanges, clearing corporations, depositories: real-time monitoring, quarterly board review, strictest controls.
- Qualified Regulated Entities (REs): annual third-party security audit, mandatory DLP, vulnerability scans.
- Mid-size and Small REs: lighter but substantive requirements including access controls, data masking and incident reporting.
Step 1: Classify Every Dataset Before Writing a Single Filter Rule
You cannot filter what you have not classified. A data classification policy is the non-negotiable foundation.
Practical five-tier classification for Indian businesses:
| Tier | Label | Examples | Regulatory sensitivity |
|---|---|---|---|
| 1 | Public | Published pricing, stock filings, blog posts | None |
| 2 | Internal | Operational KPIs, internal memos | Low |
| 3 | Confidential | Vendor price lists, customer contracts | Contractual |
| 4 | Sensitive Personal | PAN, Aadhaar, mobile, email, location, health, financial history | DPDP + IT Act |
| 5 | Critical Financial | CVV, card PANs, bank account credentials, SWIFT codes | RBI + PCI DSS |
How to execute this in practice — a seven-step sequence:
- Pull a complete inventory: databases, object stores (S3, Azure Blob, GCS), SaaS platforms (Salesforce, Zoho, Tally), API endpoints and email marketing tools.
- For each data element, assign the tier above. Start with any system that touches customer or employee records.
- Tag fields in a data catalog — Apache Atlas, Collibra and Microsoft Purview work at enterprise scale; a maintained spreadsheet works for SMEs.
- Record for each sensitive field: the lawful purpose under DPDP, the consent reference, the retention period and every downstream system that consumes the field.
- Run an automated data-discovery scan using tools such as BigID or Privacera (or open-source alternatives) to surface Aadhaar patterns (`\d{4}\s\d{4}\s\d{4}`), PAN formats (`[A-Z]{5}[0-9]{4}[A-Z]{1}`) and card numbers (16-digit strings passing the Luhn algorithm) lurking in unexpected places.
- Document the data flow map: source → processing system → downstream consumers → deletion/archival endpoint.
- Schedule an annual classification review, and trigger an ad-hoc review whenever a new data source or use case is introduced.
Without this inventory, every downstream filtering rule is guesswork — and regulators will treat it as such.
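For teams without a commercial discovery tool, the pattern scan in the sequence above can be approximated in a few lines. The sketch below uses the Aadhaar and PAN patterns quoted in the steps; the function name `scan_text` and the sample string are invented for illustration, and a pattern match is only a candidate that still needs human or checksum confirmation.

```python
import re

# Patterns taken from the text; production scanners add checksums and context rules.
AADHAAR_RE = re.compile(r"\b\d{4}\s\d{4}\s\d{4}\b")
PAN_RE = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")


def scan_text(text: str) -> dict[str, list[str]]:
    """Return candidate Aadhaar and PAN strings found in free text."""
    return {
        "aadhaar": AADHAAR_RE.findall(text),
        "pan": PAN_RE.findall(text),
    }


# Made-up sample values, not real identifiers.
sample = "Ticket note: customer PAN ABCDE1234F, Aadhaar 1234 5678 9012 shared on call."
print(scan_text(sample))
```

Running this over exported ticket notes, log files or free-text CRM fields is a cheap way to find sensitive identifiers that the formal inventory missed.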
Filtering for B2B Data Flows
B2B filtering is transactional and contractual in character, but regulatory obligations apply equally.
Inbound Vendor and Partner Data
When vendors send you master-data files — supplier lists, invoice feeds, employee-secondment records — validate at the point of ingestion before any record enters your ERP or data warehouse:
- GSTIN format check: 15-character alphanumeric. Characters 1–2 = state code, characters 3–12 = PAN of the entity, character 13 = entity number, character 14 = `Z` (default), character 15 = checksum digit. Any record failing this pattern should be quarantined, not rejected silently.
- PAN validation: 10-character format check (`AAAAA9999A` pattern), cross-referenced against your existing TDS master to prevent duplicates that create reconciliation problems on Form 26Q / 27Q uploaded to TRACES (TDS Reconciliation Analysis and Correction Enabling System).
- Bank account verification via penny drop: Every new beneficiary account must be verified through a penny-drop API before it is marked active for payment. This is both a fraud-prevention control and a practical requirement before any NEFT/RTGS instruction.
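The format checks above might look like the following at the ingestion point. This is a sketch under stated assumptions: the row keys `pan` and `gstin` are hypothetical, the GSTIN check covers structure only (it does not verify the checksum character), and the penny-drop call is left out because it depends on your banking provider.

```python
import re

PAN_RE = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")
# 2-digit state code + 10-character PAN + entity character + 'Z' + checksum character
GSTIN_RE = re.compile(r"\d{2}[A-Z]{5}[0-9]{4}[A-Z][0-9A-Z]Z[0-9A-Z]")


def validate_vendor(row: dict) -> list[str]:
    """Return a list of format problems; an empty list means the row passes."""
    problems = []
    pan = row.get("pan", "")
    gstin = row.get("gstin", "")
    if not PAN_RE.fullmatch(pan):
        problems.append("pan_format")
    if not GSTIN_RE.fullmatch(gstin):
        problems.append("gstin_format")
    elif gstin[2:12] != pan:
        # Characters 3 to 12 of a GSTIN embed the entity's PAN.
        problems.append("gstin_pan_mismatch")
    return problems


def ingest(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split rows into (accepted, quarantined); nothing is dropped silently."""
    accepted, quarantined = [], []
    for row in rows:
        problems = validate_vendor(row)
        if problems:
            quarantined.append({**row, "problems": problems})
        else:
            accepted.append(row)
    return accepted, quarantined
```

Keeping the failure reasons on each quarantined row gives the vendor-master team something actionable to send back to the supplier.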
Why this matters for tax compliance: Under Sections 194C, 194J and 206AA of the Income-tax Act 1961, if a vendor's PAN on file is invalid or missing, TDS must be deducted at 20% instead of the standard rate (typically 1–10% depending on the section). On a vendor invoice of Rs. 50 lakh, that is a TDS differential of Rs. 5–9.5 lakh (Rs. 10 lakh withheld at 20%, against Rs. 0.5–5 lakh at the standard rate). All of it is recoverable in theory, but recovery is slow and contested in practice when TRACES reconciliation fails.
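The differential is simple arithmetic and can be checked for any invoice value. In the sketch below the function name and the sample rates are illustrative; the 20% figure is the higher no-PAN rate mentioned above.

```python
def tds_differential(invoice_rs: int, standard_pct: float, no_pan_pct: float = 20) -> float:
    """Extra TDS (in rupees) withheld when the higher no-PAN rate applies."""
    return invoice_rs * (no_pan_pct - standard_pct) / 100


invoice = 50_00_000  # Rs. 50 lakh
for pct in (1, 2, 10):
    extra = tds_differential(invoice, pct)
    print(f"standard rate {pct}%: extra TDS Rs. {extra / 1_00_000:.1f} lakh")
```

For a Rs. 50 lakh invoice this gives Rs. 9.5 lakh at a 1% standard rate, Rs. 9.0 lakh at 2% and Rs. 5.0 lakh at 10%.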
Outbound Data to Vendors and Partners
Every data-sharing arrangement must clear a vendor-risk gate before the first byte leaves your system:
- What data classification tier does this vendor receive?
- Does the vendor hold ISO 27001, SOC 2 Type II or an equivalent current certification?
- Is a Data Processing Agreement (DPA) — mandatory for processors under DPDP — signed and on file?
- What is the breach-notification SLA in the DPA? It should be no longer than 72 hours from the vendor to you, and ideally a matter of hours, because your own CERT-In 6-hour clock starts as soon as the incident comes to your notice.
- Does the vendor use sub-processors? If so, are sub-processors contractually bound to the same security standards, and will the vendor notify you of any sub-processor change?
Filtering outbound TDS files: The quarterly TDS statement uploaded under Form 26Q (non-salary domestic), Form 27Q (non-resident payments), Form 24Q (salary) and Form 27EQ (TCS) must contain only: deductee PAN, deductee name, amount paid, TDS deducted and challan details. Strip every other field — full address, mobile number, email, bank account details — before the file is generated. This is simultaneously a DPDP data-minimisation obligation and a sensible security control.
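One way to enforce this is an allow-list applied at export time, so that a new column added to the vendor master can never leak into the file by default. The field names below are hypothetical stand-ins for whatever your ERP uses.

```python
# Illustrative field names; map them to the columns in your own vendor master.
ALLOWED_TDS_FIELDS = (
    "deductee_pan",
    "deductee_name",
    "amount_paid",
    "tds_deducted",
    "challan_ref",
)


def minimise(row: dict, allowed: tuple[str, ...] = ALLOWED_TDS_FIELDS) -> dict:
    """Return a copy of the row containing only the allowed fields."""
    missing = [field for field in allowed if field not in row]
    if missing:
        # Fail loudly rather than ship an incomplete statement row.
        raise KeyError(f"missing required fields: {missing}")
    return {field: row[field] for field in allowed}
```

An allow-list is preferable to a block-list here: a block-list has to be updated every time someone adds a sensitive column upstream, while an allow-list fails safe.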
Filtering for B2C Data Flows
B2C filtering is where DPDP bites hardest, because Data Principals have enforceable rights that your systems must be technically capable of honouring.
Consent as a Hard Filter Gate
Before any personal data enters a processing pipeline for marketing, analytics or behavioural modelling:
- Query the consent register in your CRM or Customer Data Platform (CDP).
- If consent is absent, withdrawn or has expired, route the record to a quarantine bucket. Do not delete — deletion may destroy evidence of a consent dispute. Do not process — processing without consent is a Section 6 violation.
- Store every consent record with: timestamp, channel (web, IVR, mobile app), version of the consent notice displayed, IP address (for web) and a cryptographic hash of the notice text so you can prove what the person actually agreed to.
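The consent gate and the consent record described above could be sketched as follows. The register is modelled as a plain dictionary keyed by (person, purpose), and every field name is an assumption for illustration; a real CDP or consent platform will have its own schema.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional


def notice_hash(notice_text: str) -> str:
    """SHA-256 of the exact notice text shown to the person."""
    return hashlib.sha256(notice_text.encode("utf-8")).hexdigest()


def make_consent_record(principal_id: str, purpose: str, channel: str,
                        notice_version: str, notice_text: str,
                        ip_address: Optional[str] = None) -> dict:
    """Build one consent-register entry (field names are illustrative)."""
    return {
        "principal_id": principal_id,
        "purpose": purpose,
        "channel": channel,
        "notice_version": notice_version,
        "notice_hash": notice_hash(notice_text),
        "ip_address": ip_address,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "withdrawn": False,
        "expires_at": None,
    }


def route(record: dict, register: dict, purpose: str, now: datetime) -> str:
    """Return 'process' only when a live consent exists for this purpose."""
    consent = register.get((record["principal_id"], purpose))
    if consent is None or consent["withdrawn"]:
        return "quarantine"
    expires_at = consent["expires_at"]
    if expires_at is not None and expires_at <= now:
        return "quarantine"
    return "process"
```

Keying the register by purpose as well as by person is what makes the gate purpose-specific: consent given for marketing does not unlock analytics.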
RBI Tokenisation: The Non-Negotiable for Card Data
If your business accepts card payments as a merchant, marketplace or payment aggregator:
- Audit immediately: Search every database and data store for columns named `card_number`, `pan`, `cvv`, `card_no`, `cc_num`, `expiry`, `card_expiry` or any variant. Run a regex across unstructured data stores for 13–19 digit numeric strings that pass the Luhn algorithm.
- Any match in a table that is not your payment gateway's token vault is a live RBI non-compliance.
- Tokens are issued by card networks through acquiring banks and are specific to the merchant-device-network combination — a token cannot be used at a different merchant even if intercepted.
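The unstructured-data scan from the audit step can be sketched with a regex for 13 to 19 digit runs followed by a Luhn check. The function names are illustrative, and the scan only handles unbroken digit strings; card numbers written with spaces or hyphens would need normalising first.

```python
import re

# 13 to 19 consecutive digits, not embedded in a longer digit run.
CANDIDATE_RE = re.compile(r"(?<!\d)\d{13,19}(?!\d)")


def luhn_valid(number: str) -> bool:
    """Standard Luhn check: double every second digit from the right."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit strings in the text that look like card numbers."""
    return [match for match in CANDIDATE_RE.findall(text) if luhn_valid(match)]
```

The Luhn filter matters because long digit strings (order IDs, tracking numbers) are common; only about one in ten random candidates passes the checksum, which keeps the review queue manageable.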
Masking for Internal Analysis
Even inside your systems, analysts, CRM users and BI developers should not see raw sensitive personal data unless the role explicitly requires it.
- Static data masking: Before any data reaches a development or test environment, mask it. A real PAN `ABCDE1234F` becomes `XXXXX1234F`. A real email address becomes `r*****@g*****.com`. Treat this as a CI/CD pipeline gate: no real personal data in dev or staging, ever.
- Dynamic data masking: In production query environments, a data analyst sees `XXXXXXXX0091` for a mobile number while the application layer retains the full value. Implement this at the database view or API gateway layer.
- Access logging: Every time a full sensitive field is accessed (not just modified), the access must be logged with user identity, timestamp and client IP. This log is your evidence in a Data Protection Board inquiry.
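The masking formats shown above can be produced by small pure functions that run as part of an export or CI step. This is a sketch: the function names are invented, the input is assumed to be already validated, and the email example uses a reserved documentation domain.

```python
def mask_pan(pan: str) -> str:
    """Hide the five leading letters, keep the rest: ABCDE1234F -> XXXXX1234F."""
    return "XXXXX" + pan[5:]


def mask_mobile(number: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with X."""
    if len(number) <= visible:
        return "X" * len(number)  # too short to reveal anything safely
    return "X" * (len(number) - visible) + number[-visible:]


def mask_email(email: str) -> str:
    """Keep the first character of the local part and domain, plus the TLD."""
    local, _, domain = email.partition("@")
    name, _, tld = domain.rpartition(".")
    if not local or not name or not tld:
        raise ValueError("not a maskable email address")
    return f"{local[0]}*****@{name[0]}*****.{tld}"


print(mask_pan("ABCDE1234F"))
print(mask_mobile("919876540091"))
print(mask_email("user@example.com"))
```

Note that partial masking of this kind reduces exposure but is not anonymisation: the remaining characters can still help re-identify a person when combined with other fields.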
Honouring Data Principal Rights
DPDP creates rights that are effectively filter and workflow triggers:
- Right of access: On request, your system must be able to extract every piece of personal data held about one individual across CRM, warehouse, email platform, analytics tool and backup copies. Build this capability before a request arrives. Provisional target: respond within 30 days (Rules may prescribe a shorter period).
- Right to erasure: Purge or irreversibly anonymise data across all systems — including backup copies, which the Act explicitly addresses. Anonymisation means the individual can no longer be re-identified even with additional data.
- Grievance redressal: Designate a point of contact (a Data Protection Officer if you are a Significant Data Fiduciary). Aim for 48-hour acknowledgement and 15 business day resolution as an internal standard until Rules specify a timeline.
Three-Layer Architecture: Inbound, In-Platform, Outbound
The most resilient filtering architecture places controls at all three stages. A failure at one layer is caught by the next.
Layer 1 — Inbound Controls
- Schema validation: field types, mandatory fields, format patterns (GSTIN, PAN, IBAN, IFSC)
- Consent token check: does this inbound record carry a valid, purpose-matched consent reference?
- Duplicate detection: especially for vendor master and customer master feeds
- Regulatory routing: records containing PAN, Aadhaar or card data are flagged immediately for handling under the appropriate policy
Layer 2 — In-Platform Controls
- Role-based access control (RBAC) enforced at database and API layer, not just at the application UI
- Dynamic data masking for analyst-facing query environments
- Card tokenisation vault with network-issued tokens as the only stored representation
- Data lineage tracking: every downstream table or BI report that consumes a sensitive field is mapped
- Immutable audit logs retained for a minimum of 180 days (CERT-In direction minimum) and ideally 12 months for audit-readiness
Layer 3 — Outbound Controls
- Pre-flight consent check before any export to advertising platforms (Meta CAPI, Google Enhanced Conversions, DV360, trade desks)
- Data minimisation: programmatically strip every field the recipient does not need before the export runs
- Encryption in transit: TLS 1.2 minimum, TLS 1.3 preferred; never transmit sensitive data over plain HTTP
- Digital watermarking or canary tokens for high-risk data shares, so a leaked copy can be traced back to which recipient received it
- API rate-limit anomaly detection: a partner pulling 10× their normal data volume in an off-hours window should trigger an alert and temporary rate cap
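The volume-anomaly rule in the last bullet can be sketched as a comparison against a rolling baseline. The median baseline, the function name and the default factor of 10 are assumptions for illustration; the off-hours condition is left out and would be an additional check on the request timestamp.

```python
from statistics import median


def is_volume_anomaly(history: list[int], current: int, factor: float = 10.0) -> bool:
    """Flag a pull that is at least `factor` times the partner's median volume."""
    if not history:
        return False  # no baseline yet; handle new partners with a fixed cap instead
    baseline = median(history)
    return baseline > 0 and current >= factor * baseline


# Records pulled per day by one partner over recent days (made-up numbers).
recent = [100, 120, 90, 110]
print(is_volume_anomaly(recent, 1100))
print(is_volume_anomaly(recent, 500))
```

A median is used rather than a mean so that one earlier spike does not inflate the baseline and hide the next one.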
Worked Example: What the Cost of Getting This Wrong Actually Looks Like
Scenario A — The misconfigured outbound API
A mid-size e-commerce company with 2 lakh customer records has a misconfigured analytics API connector that has been pulling full name, email, mobile number and city — all Tier 4 Sensitive Personal data — into a third-party BI tool for 18 months. A security researcher discovers the API is unauthenticated and files a public disclosure.
Regulatory exposure:
- DPDP Act Schedule, Item 1: Failure to implement reasonable security safeguards → up to Rs. 250 crore penalty.
- CERT-In direction: Breach detected Day 1, notified Day 3. Two-day gap on the 6-hour clock → potential criminal liability under Section 70B of the IT Act.
- Direct incident costs (Indian mid-market benchmarks): forensics Rs. 15–25 lakh, legal Rs. 10–20 lakh, customer notification Rs. 5–10 lakh, PR management Rs. 5–15 lakh. Total direct cost before any regulatory penalty: Rs. 35–70 lakh.
- Revenue impact: Cart abandonment typically spikes 15–25% in the 90 days following a publicised breach in Indian e-commerce.
The prevention cost: a dynamic data masking layer on the analytics API (Rs. 2–5 lakh per year for a SaaS DLP tool) and a consent management platform (Rs. 1–3 lakh per year). The economics are unambiguous.
Scenario B — The over-populated TDS outbound file
A finance team exports a vendor TDS file containing not only PAN, name and TDS amount but also credit limit, outstanding balance and pricing tier, because the report was built from the full vendor master view rather than a purpose-limited extract. The file lands in the vendor's inbox. One vendor employee is also a consultant to a competitor. The pricing data leaks.
The contractual and legal dispute costs Rs. 5–20 lakh in legal fees. The regulatory exposure under DPDP (sharing confidential data without a lawful basis) and IT Act Section 43A (failure of reasonable security practices) creates additional civil liability. The fix — an outbound filter that strips every field except PAN, name, amount and challan reference from the TDS extract — takes one developer half a day to implement.
Common Mistakes and How to Fix Them
Mistake 1: Treating Consent as a One-Time Checkbox at Signup
What goes wrong: Terms-and-conditions acceptance is treated as blanket consent for all processing purposes, including sharing data with advertising partners and behavioural analytics vendors. Under DPDP Section 6, bundled consent is invalid. When challenged, you have no lawful basis for most of your data-processing activity.

Fix: Granular, purpose-specific consent notices. Separate consent per purpose (order fulfilment, marketing, analytics, third-party sharing). Store each consent record with its own timestamp, notice version hash and withdrawal mechanism.
Mistake 2: Real Personal Data in Development and Test Environments
What goes wrong: Developers use production exports as test data because it "makes debugging easier." A developer's laptop contains 50,000 real PANs. The laptop is lost or stolen.

Fix: Static data masking as a mandatory CI/CD gate. No real personal data enters any non-production environment. Automate the masking step; do not rely on developer discipline.
Mistake 3: Ignoring the Vendor's Sub-Processors
What goes wrong: Your primary cloud analytics vendor is ISO 27001 certified and has signed a DPA. But they use a sub-processor in a jurisdiction to which the Central Government has restricted transfers under DPDP Section 16. Any cross-border transfer through the primary vendor to that sub-processor is a potential violation.

Fix: Request a sub-processor list from every data vendor. Require your vendors to contractually bind sub-processors to identical security standards and to notify you of any sub-processor change with 30 days' notice.
Mistake 4: Log Retention Set Below Regulatory Minimum
What goes wrong: A breach is discovered 45 days after the intrusion. CERT-In demands logs from the incident window. Logs are configured to auto-delete after 30 days. You cannot reconstruct the incident or demonstrate containment, which is itself a separate compliance failure.

Fix: Immutable (write-once) log retention of 180 days minimum across all systems. For payment systems and critical financial applications, retain 12 months. Store logs in a separate, write-protected storage tier with independent access controls.
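A retention setting is easy to check automatically once the configured values are collected in one place. The sketch below assumes a hypothetical mapping of system name to retention days, however you gather it from your log platforms.

```python
MIN_RETENTION_DAYS = 180  # the minimum discussed above


def retention_gaps(configured: dict[str, int],
                   minimum: int = MIN_RETENTION_DAYS) -> dict[str, int]:
    """Return the systems whose log retention is below the minimum."""
    return {name: days for name, days in configured.items() if days < minimum}


# Made-up inventory of systems and their configured retention in days.
inventory = {"api-gateway": 30, "payments-db": 365, "crm": 180}
print(retention_gaps(inventory))
```

Running a check like this in a scheduled job turns a silent misconfiguration into a visible failure well before an incident forces the question.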
Mistake 5: API Endpoints Returning More Fields Than the Consumer Needs
What goes wrong: A third-party integration was built three years ago and pulls a broad customer object including fields the vendor never actually uses, because narrowing the response required a ticket and no one prioritised it. Over time, the vendor becomes a data store of personal data you never intended to share.

Fix: Enforce field-level API policies at the API gateway (AWS API Gateway, Apigee, Kong). Every endpoint returns only the minimum necessary fields, enforced at the infrastructure layer and not left to the discretion of the consuming application.
Incident Response: The Timelines You Cannot Afford to Miss
Build your incident-response runbook around these hard deadlines. The clock on each starts at the moment of detection, not when investigation is complete.
| Event | Deadline | Authority | Consequence of missing |
|---|---|---|---|
| Cybersecurity incident (any of 20 categories) | 6 hours from detection | CERT-In (MeitY) | Criminal liability under IT Act Section 70B |
| Personal data breach — Data Principals | As per DPDP Rules (use 72-hour standard until notified) | Data Protection Board | Up to Rs. 200 crore penalty per Schedule |
| Personal data breach — Data Protection Board | As per DPDP Rules | Data Protection Board | Up to Rs. 200 crore penalty |
| RBI-regulated entity breach | Immediate to RBI CISO; formal within 2–6 hours | RBI | Regulatory action under Banking Regulation Act / PSS Act |
| SEBI MII/RE breach | Report to SEBI within 6 hours; board intimation within 24 hours | SEBI | Show-cause, penalties under SEBI Act 1992 |
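Because the 6-hour window runs from detection, it helps to compute the filing deadline the moment an alert is raised instead of working it out under pressure. A minimal sketch, with hypothetical function names and timezone-aware timestamps assumed:

```python
from datetime import datetime, timedelta, timezone

CERT_IN_WINDOW = timedelta(hours=6)


def cert_in_deadline(detected_at: datetime) -> datetime:
    """Latest time by which the initial CERT-In report should be filed."""
    return detected_at + CERT_IN_WINDOW


def hours_remaining(detected_at: datetime, now: datetime) -> float:
    """Hours left on the clock; a negative value means the deadline has passed."""
    return (cert_in_deadline(detected_at) - now).total_seconds() / 3600


detected = datetime(2026, 5, 1, 22, 30, tzinfo=timezone.utc)
print(cert_in_deadline(detected).isoformat())
```

Wiring this into the alerting tool, so that every incident ticket shows its deadline, removes any ambiguity about when the clock started.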
Tabletop exercise requirement: Run at least one annual exercise with your CFO, CISO (or IT head), legal counsel and communications lead. Simulate three distinct scenarios: (a) a ransomware attack that encrypts the customer database, (b) a vendor-side breach where a sub-processor exposes data, and (c) an accidental public link that exposes an internal financial model. Each scenario requires a different response, and testing that your plan handles all three is the only way to know it works.
Key Takeaways
- Classify first, filter second. Assign every data element a tier (Public through Critical Financial) before writing a single filter rule. Without an enforced classification, every downstream control is guesswork — and regulators treat it that way.
- Consent is a machine-readable filter gate, not a policy document clause. Under DPDP Section 6, bundled consent is invalid. Wire consent status — per purpose, per channel — into your data pipeline as a hard block on processing.
- RBI tokenisation is a live obligation, not a future roadmap item. Scan your databases for raw card PANs, expiry dates and CVVs today. Any match outside a card-network token vault is a standing compliance violation with direct RBI consequences.
- Your vendor's perimeter is your perimeter. Many data breaches trace back to a vendor or sub-processor rather than the enterprise itself. Every data-sharing arrangement requires a DPA, a security attestation and a breach-notification SLA that fits inside your own regulatory clocks.
- Three-layer filtering is more resilient than a single perimeter control. Inbound validation, in-platform masking and tokenisation, and outbound field-level controls in combination catch failures that any single layer alone would miss.
- The CERT-In 6-hour clock starts on detection, not on investigation completion. Designate a CERT-In reporting officer. That person must be able to file an initial report within 6 hours of any team member raising a security alert, even if the full scope of the incident is not yet known.
- Immutable logs are your evidence, not just your audit trail. Set 180-day minimum retention across all systems, stored in a write-protected tier. A breach you cannot reconstruct from logs is a breach you cannot defend before any of the four regulatory authorities above.




