Scaling Controls

Controls are put in place to reduce risk. However, far too often we fail to consider scalability in our control approaches. There is a tendency to rush towards remediating an issue by producing endless documentation (policies, procedures) and manual processes that begin to fall apart as soon as they are implemented. This is particularly rampant in smaller, growing companies, where controls are often the responsibility of less technical (though still knowledgeable) IT risk and compliance staff. You cannot escape this truth: entropy rules any control that is based on documentation and manual procedures.

As a control professional, I scratch my head at this approach. In a world with no shortage of software developers, it seems that many companies fail to deploy such skills towards the problem of control automation.

The consequences? Simple: controls are incomplete in their adoption. Even worse, nobody knows the extent of adoption, because little, if anything, is measured. Then your auditors come and beat you over the head. Or worse, the regulators. And then you throw more money at it in the form of people and manual processes.

The solution? Standardize controls, and automate them end to end. By end to end, I mean automation of their adoption, measurement, and correction. It requires security and control engineering. It requires people and skills to think about, specify, and engineer controls that are easy to adopt (if not fully automated). It requires the mindset and discipline to engineer controls the same way a company engineers its customer-facing technologies (hopefully you have some discipline there!).
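
To make "adoption, measurement, and correction" concrete, here is a minimal Python sketch of the measurement step. Everything in it is an assumption for illustration: the `Asset` type, the `measure_adoption` function, and the sample inventory are hypothetical, and in practice the inventory would come from your asset database or cloud APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    control_applied: bool  # e.g. "central logging is enabled on this asset"


def measure_adoption(assets):
    """Return (adoption rate, names of assets still missing the control)."""
    gaps = [a.name for a in assets if not a.control_applied]
    # An empty inventory is treated as fully adopted (nothing to fix).
    rate = 1.0 if not assets else (len(assets) - len(gaps)) / len(assets)
    return rate, gaps


# Hypothetical inventory, hard-coded here only to keep the sketch runnable.
inventory = [
    Asset("payments-api", True),
    Asset("ledger-db", False),
    Asset("batch-worker", True),
    Asset("reporting", False),
]

rate, gaps = measure_adoption(inventory)
print(f"adoption: {rate:.0%}, to correct: {gaps}")
```

The list of gaps is what feeds correction: each entry can become a ticket or, better, an automated fix.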

Below I share some key examples of controls that would (and should) commonly exist at a financial institution, with insights on how you might automate them. By no means is this a perfect list. What you do will depend on the context of your environment. However, it is a starting point for thinking about this problem. Copy it into a spreadsheet and work through it for your own control environment.

The key takeaway here is this: Think twice about how you want to design your controls. Manual or semi-automatic controls may work in certain circumstances, but they will eventually succumb to entropy, especially as the environment grows in scale and complexity. Invest in automation using common software analysis and engineering processes. Federate the problem to your existing engineers if necessary, or invest in a software development team that supports the implementation of control automation.

Good Luck!

Each section below lists, in order: Common Anti-Patterns; Adverse Impact of Anti-Patterns; Key Automation Capabilities (with Sample Vendors/Solutions); Scaling Challenges Addressed; Automating Adoption of the Control.
Identity and Access Management (IAM)
Common Anti-Patterns:
  1. Manual user provisioning and deprovisioning
  2. Over-provisioning of access
  3. Lack of periodic access reviews
Adverse Impact of Anti-Patterns: Increases the risk of orphaned accounts, unauthorized access, and compliance violations as user base grows.
Key Automation Capabilities (with Sample Vendors/Solutions):
  1. Centralized Identity Provider (IdP) (e.g., Okta, Azure AD)
  2. Automated Provisioning/Deprovisioning (e.g., SailPoint, OneLogin)
  3. Access Analytics and Reporting (e.g., BeyondTrust)
  4. Just-in-Time (JIT) Access (e.g., Thycotic)
Scaling Challenges Addressed: Reduces the overhead of managing multiple user accounts and ensures secure access control as the number of users grows.
Automating Adoption of the Control:
  1. Use integration with HR systems (e.g., Workday) to trigger automatic provisioning/deprovisioning
  2. Role-based access control templates to automate policy setup
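
The HR-driven provisioning idea above boils down to reconciling two lists. Here is a minimal Python sketch of that reconciliation, assuming you can export the set of active employees from HR and the set of accounts from your IdP. The function name and the sample usernames are hypothetical, and no real Workday or Okta API is called.

```python
def plan_access_changes(hr_active, idp_accounts):
    """Compare HR's active staff with IdP accounts and return the actions needed."""
    hr_active = set(hr_active)
    idp_accounts = set(idp_accounts)
    return {
        # Active in HR but no account yet: joiners to provision.
        "provision": sorted(hr_active - idp_accounts),
        # Account exists but not active in HR: leavers or orphaned accounts.
        "deprovision": sorted(idp_accounts - hr_active),
    }


plan = plan_access_changes(
    hr_active={"alice", "bob", "dana"},
    idp_accounts={"alice", "bob", "carol"},
)
print(plan)
```

Run on a schedule, a check like this also doubles as the measurement for orphaned accounts.
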
Data Protection and Privacy
Common Anti-Patterns:
  1. Manual data classification
  2. Using weak encryption standards
  3. Not regularly reviewing data retention policies
Adverse Impact of Anti-Patterns: Increases the likelihood of misclassified or unprotected sensitive data, leading to data breaches and non-compliance with privacy regulations.
Key Automation Capabilities (with Sample Vendors/Solutions):
  1. Automated Data Classification (e.g., Varonis, BigID)
  2. Automated Encryption Management (e.g., AWS KMS, Azure Key Vault)
  3. Data Masking and Tokenization (e.g., Protegrity, Thales CipherTrust)
  4. Data Retention Policies (e.g., Druva)
Scaling Challenges Addressed: Automates data security and handling processes, making it easier to manage large volumes of sensitive data while maintaining compliance.
Automating Adoption of the Control:
  1. Integrate automated classification tools to scan all data sources continuously
  2. Use encryption management tools with built-in compliance templates
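
As a rough illustration of what automated classification does under the hood, here is a small pattern-based classifier in Python. It is a sketch under stated assumptions: the two regular expressions, the `classify` function, and the "sensitive"/"unclassified" labels are my own simplifications, and commercial tools use far richer detection than this.

```python
import re

# Hypothetical detection patterns; real tools ship many more, with validation.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify(text):
    """Return (label, matched pattern names) for a piece of text."""
    found = sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))
    label = "sensitive" if found else "unclassified"
    return label, found


print(classify("Contact jane@example.com, SSN 123-45-6789"))
print(classify("Quarterly roadmap draft"))
```
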
Third-Party Risk Management
Common Anti-Patterns:
  1. One-time vendor risk assessments
  2. Lack of ongoing vendor monitoring
  3. Over-reliance on contracts without enforcing security measures
Adverse Impact of Anti-Patterns: Vendors can introduce significant risks as the organization grows, leading to supply chain attacks, data breaches, and compliance failures.
Key Automation Capabilities (with Sample Vendors/Solutions):
  1. Vendor Risk Scoring Platforms (e.g., BitSight, SecurityScorecard)
  2. API-Driven Vendor Due Diligence (e.g., Prevalent, OneTrust)
  3. Automated Contract Management (e.g., Icertis, DocuSign CLM)
Scaling Challenges Addressed: Automation reduces the manual workload of vendor risk monitoring and contract compliance as the number of vendors grows.
Automating Adoption of the Control:
  1. Implement continuous monitoring platforms with automated risk scoring based on vendor behavior and contract status
  2. Automate renewal and compliance check reminders
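
Renewal reminders are one of the easiest pieces to automate yourself. The sketch below assumes you keep a simple mapping of vendor name to contract renewal date; the function name, the 30-day window, and the sample vendors are all hypothetical.

```python
from datetime import date, timedelta


def vendors_needing_review(renewals, today, window_days=30):
    """Return vendors whose renewal is overdue or falls within the window."""
    cutoff = today + timedelta(days=window_days)
    return sorted(name for name, renewal in renewals.items() if renewal <= cutoff)


# Hypothetical contract register.
renewals = {
    "acme-cloud": date(2025, 1, 20),
    "payroll-co": date(2025, 6, 1),
    "old-crm": date(2024, 12, 15),  # already overdue
}

due = vendors_needing_review(renewals, today=date(2025, 1, 1))
print(due)
```
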
Cybersecurity Threat Detection and Response
Common Anti-Patterns:
  1. Relying on manual incident response
  2. Not aggregating threat data across systems
  3. Lack of defined response playbooks
Adverse Impact of Anti-Patterns: Increases the time to detect and respond to security incidents, leading to prolonged exposure to attacks and significant operational disruptions.
Key Automation Capabilities (with Sample Vendors/Solutions):
  1. Security Information and Event Management (SIEM) (e.g., Splunk, Azure Sentinel)
  2. Automated Incident Response (SOAR) (e.g., Palo Alto Cortex XSOAR, IBM Resilient)
  3. Endpoint Detection and Response (EDR) (e.g., CrowdStrike, SentinelOne)
Scaling Challenges Addressed: Automated threat detection and response reduce manual intervention, allowing for efficient scaling of cybersecurity operations.
Automating Adoption of the Control:
  1. Use SOAR platforms to automate incident response workflows
  2. Integrate threat feeds for automated detection
  3. Regularly update playbooks through version control
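
Keeping playbooks in version control works best when they are plain data. Here is a minimal Python sketch of that idea, assuming alerts arrive as dictionaries with a `type` field. The playbook names, step names, and the `plan_response` function are hypothetical and do not reflect any SOAR product's format.

```python
# Playbooks as data: a file like this can be reviewed and versioned like code.
PLAYBOOKS = {
    "phishing": ["quarantine_email", "reset_credentials", "notify_user"],
    "malware": ["isolate_host", "collect_forensics", "open_ticket"],
}


def plan_response(alert):
    """Return the ordered response steps for an alert, escalating unknown types."""
    return PLAYBOOKS.get(alert["type"], ["escalate_to_analyst"])


print(plan_response({"type": "phishing", "source": "mail-gateway"}))
print(plan_response({"type": "unknown-beacon"}))
```
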
System Resilience and Availability
Common Anti-Patterns:
  1. Manual failover processes
  2. Over-provisioning of resources without autoscaling
  3. No regular disaster recovery testing
Adverse Impact of Anti-Patterns: Leads to downtime, inefficient resource usage, and longer recovery times in the event of failures or increased load.
Key Automation Capabilities (with Sample Vendors/Solutions):
  1. Automated Failover and Recovery (e.g., AWS Elastic Disaster Recovery, Azure Site Recovery)
  2. Autoscaling Infrastructure (e.g., AWS Auto Scaling, Azure Virtual Machine Scale Sets)
  3. Chaos Engineering (e.g., Gremlin, AWS Fault Injection Simulator)
Scaling Challenges Addressed: Ensures system resilience and availability through automated failover, scaling, and testing in large-scale environments.
Automating Adoption of the Control:
  1. Implement autoscaling groups that adjust resource allocation in real-time
  2. Set up automated disaster recovery failover tests on a regular schedule
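
For intuition on what an autoscaler decides, here is a generic proportional scaling rule in Python: scale capacity by the ratio of the observed metric to its target, then clamp to configured bounds. This is a sketch, not the algorithm of any vendor named above (the Kubernetes Horizontal Pod Autoscaler documents a formula of the same shape). The function name and default bounds are assumptions.

```python
import math


def desired_capacity(current, metric, target, min_size=1, max_size=10):
    """Scale instance count in proportion to metric/target, within bounds."""
    raw = math.ceil(current * metric / target)
    return max(min_size, min(max_size, raw))


# 4 instances at 90% CPU against a 60% target: scale out.
print(desired_capacity(current=4, metric=90, target=60))
# 4 instances at 15% CPU against a 60% target: scale in.
print(desired_capacity(current=4, metric=15, target=60))
```
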
Regulatory Compliance Automation
Common Anti-Patterns:
  1. Manual control assessments
  2. Reactively handling compliance issues
  3. Lack of centralized compliance reporting
Adverse Impact of Anti-Patterns: Increases the likelihood of missed compliance deadlines, inaccurate reporting, and exposure to penalties as compliance obligations grow with scaling infrastructure.
Key Automation Capabilities (with Sample Vendors/Solutions):
  1. Compliance-as-Code (e.g., HashiCorp Sentinel, Open Policy Agent)
  2. Automated Compliance Audits (e.g., Drata, Vanta)
  3. Automated Reporting and Alerts (e.g., OneTrust, AuditBoard)
Scaling Challenges Addressed: Automation ensures compliance across growing infrastructures without the need for manual oversight.
Automating Adoption of the Control:
  1. Use compliance-as-code templates that automatically validate infrastructure configurations
  2. Enable real-time compliance monitoring dashboards
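
The core of compliance-as-code is small: policies are functions, resources are data, and the output is a list of violations. The Python sketch below is a plain stand-in for that idea, not Sentinel or Open Policy Agent syntax. The policy names, resource fields, and `evaluate` function are hypothetical.

```python
# Each policy returns True when the resource complies.
POLICIES = {
    "encryption-at-rest": lambda r: r.get("encrypted") is True,
    "no-public-access": lambda r: r.get("public") is not True,
}


def evaluate(resources):
    """Return (resource id, policy name) pairs for every failed check."""
    violations = []
    for resource in resources:
        for name, check in POLICIES.items():
            if not check(resource):
                violations.append((resource["id"], name))
    return violations


# Hypothetical configuration export.
resources = [
    {"id": "bucket-a", "encrypted": True, "public": False},
    {"id": "bucket-b", "encrypted": False, "public": True},
]

violations = evaluate(resources)
print(violations)
```

Wired into a deployment pipeline, a non-empty result blocks the change; run on a schedule, it feeds the dashboard.
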
Audit and Logging
Common Anti-Patterns:
  1. Inconsistent log collection
  2. Storing logs across siloed systems
  3. Lack of real-time log analysis
Adverse Impact of Anti-Patterns: Results in incomplete audit trails and missed detection of anomalies, making it difficult to meet compliance requirements or identify security incidents at scale.
Key Automation Capabilities (with Sample Vendors/Solutions):
  1. Centralized Log Aggregation (e.g., AWS CloudWatch, Azure Monitor, ELK Stack)
  2. Log Analysis with AI (e.g., Datadog, Splunk)
  3. Automated Audit Trails (e.g., AWS CloudTrail, Azure Activity Logs)
Scaling Challenges Addressed: AI-driven log analysis and automated audit trails reduce the burden of manual log management and provide compliance at scale.
Automating Adoption of the Control:
  1. Use centralized logging systems that integrate with all key infrastructure
  2. Implement AI-driven log analysis tools for anomaly detection and audit trail generation
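
Anomaly detection does not have to start with AI. As a baseline, here is a simple z-score check in Python over hourly event counts. It is a plain statistical stand-in for the AI-driven tools mentioned above, and the function name, threshold, and sample counts are assumptions.

```python
from statistics import mean, pstdev


def anomalous_hours(counts, threshold=2.0):
    """Return indexes whose count is more than `threshold` std devs from the mean."""
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        return []  # perfectly flat series: nothing stands out
    return [i for i, c in enumerate(counts) if abs(c - mu) / sigma > threshold]


# Hypothetical failed-login counts per hour; the last hour spikes.
counts = [100, 104, 98, 101, 97, 102, 99, 400]
spikes = anomalous_hours(counts)
print(spikes)
```
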
Payment Rails Automation
Common Anti-Patterns:
  1. Manual routing of payments
  2. Not using AI for fraud detection
  3. Lack of real-time reconciliation
Adverse Impact of Anti-Patterns: Leads to delayed payments, higher fraud rates, and manual reconciliation errors, which can impact operational efficiency and customer trust as transaction volume scales.
Key Automation Capabilities (with Sample Vendors/Solutions):
  1. Automated Payment Routing (e.g., Stripe, Rapyd)
  2. Fraud Detection and Prevention (e.g., Feedzai, Forter)
  3. Automated Reconciliation (e.g., ReconArt, BlackLine)
  4. API-Driven Settlement Systems (e.g., Plaid, Dwolla)
Scaling Challenges Addressed: Automated routing, fraud detection, and reconciliation streamline operations for managing large volumes of payments across networks.
Automating Adoption of the Control:
  1. Implement real-time fraud detection and prevention tools integrated with payment gateways
  2. Automate reconciliation between different payment rails and accounts
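
Reconciliation is a good candidate for a first automation because the logic is a three-way set comparison. The Python sketch below assumes both sides can be exported as a mapping of transaction id to amount; the function name and sample transactions are hypothetical.

```python
from decimal import Decimal


def reconcile(ledger, processor):
    """Compare two {transaction id: amount} maps and report the breaks."""
    shared = ledger.keys() & processor.keys()
    return {
        "missing_in_processor": sorted(ledger.keys() - processor.keys()),
        "missing_in_ledger": sorted(processor.keys() - ledger.keys()),
        "amount_mismatch": sorted(t for t in shared if ledger[t] != processor[t]),
    }


# Decimal avoids floating-point rounding surprises with money.
ledger = {"tx1": Decimal("10.00"), "tx2": Decimal("25.50"), "tx3": Decimal("7.25")}
processor = {"tx1": Decimal("10.00"), "tx2": Decimal("25.00"), "tx4": Decimal("3.00")}

breaks = reconcile(ledger, processor)
print(breaks)
```
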
Credit Card Networks Automation
Common Anti-Patterns:
  1. Storing credit card information without tokenization
  2. Manual chargeback handling
  3. Lack of multi-factor authentication (MFA) for transactions
Adverse Impact of Anti-Patterns: Increases the risk of credit card data breaches, higher chargeback costs, and compliance failures (PCI-DSS), impacting scalability and security of card transactions.
Key Automation Capabilities (with Sample Vendors/Solutions):
  1. Tokenization for Card Transactions (e.g., Visa Token Service, Marqeta)
  2. Automated Chargeback Management (e.g., Chargehound, Verifi)
  3. Compliance with Card Network Rules (e.g., Alloy, ComplyAdvantage)
  4. Dynamic Card Authentication (e.g., Adyen)
Scaling Challenges Addressed: Automation manages complex tokenization, compliance, and fraud prevention needs for growing credit card transaction volumes.
Automating Adoption of the Control:
  1. Set up automated chargeback management integrated with card networks
  2. Use tokenization services to protect card data and integrate MFA for transaction authentication
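
To show what tokenization means mechanically, here is a toy in-memory token vault in Python. It only illustrates the concept: downstream systems hold a random token while the card number lives in one place. The `TokenVault` class is hypothetical, it is not how any named service works internally, and an in-memory dictionary is nowhere near what PCI-DSS expects of a real vault.

```python
import secrets


class TokenVault:
    """Toy vault: swaps a card number (PAN) for a random, meaningless token."""

    def __init__(self):
        self._pan_by_token = {}
        self._token_by_pan = {}

    def tokenize(self, pan):
        # Reuse the existing token so one card always maps to one token.
        if pan in self._token_by_pan:
            return self._token_by_pan[pan]
        token = "tok_" + secrets.token_hex(8)
        self._pan_by_token[token] = pan
        self._token_by_pan[pan] = token
        return token

    def detokenize(self, token):
        return self._pan_by_token[token]


vault = TokenVault()
token = vault.tokenize("4111111111111111")  # well-known test card number
print(token)
```
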
Marketplace Lending Rules and Regulations
Common Anti-Patterns:
  1. Manual underwriting processes
  2. Inconsistent compliance with lending regulations
  3. Not performing regular borrower risk assessments
Adverse Impact of Anti-Patterns: Leads to slower loan processing, higher risk of non-compliance with lending laws, and increased exposure to financial loss as loan volumes grow.
Key Automation Capabilities (with Sample Vendors/Solutions):
  1. Automated Loan Underwriting (e.g., Zest AI, Upstart)
  2. Regulatory Compliance Engines (e.g., Ascent, Compliance.ai)
  3. Automated Loan Servicing (e.g., LoanPro, Fiserv)
  4. KYC/AML Automation (e.g., Onfido, Jumio)
Scaling Challenges Addressed: Automation ensures efficient handling of increasing loan volumes and complex regulatory requirements for lending platforms.
Automating Adoption of the Control:
  1. Integrate AI-driven underwriting systems to automate borrower risk assessment
  2. Use compliance engines that automatically check loan programs against regulations
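
A compliance engine, at its simplest, checks each loan against a rule table before it is booked. The Python sketch below shows that shape only. The state names and APR caps are placeholder values I made up, not real legal limits, and the `check_loan` function and loan fields are hypothetical.

```python
# Placeholder rule table: NOT real regulatory limits.
HYPOTHETICAL_APR_CAPS = {"STATE_A": 36.0, "STATE_B": 24.0}


def check_loan(loan):
    """Return a list of compliance issues; an empty list means the loan passes."""
    issues = []
    cap = HYPOTHETICAL_APR_CAPS.get(loan["state"])
    if cap is None:
        issues.append("no rule on file for state")
    elif loan["apr"] > cap:
        issues.append(f"APR {loan['apr']} exceeds cap {cap}")
    if not loan.get("kyc_passed", False):
        issues.append("KYC not completed")
    return issues


print(check_loan({"state": "STATE_B", "apr": 29.9, "kyc_passed": True}))
print(check_loan({"state": "STATE_A", "apr": 18.0, "kyc_passed": True}))
```
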
Data Governance Automation
Common Anti-Patterns:
  1. Manual data discovery and classification
  2. Lack of access control policies for sensitive data
  3. Not tracking data lineage
Adverse Impact of Anti-Patterns: Increases the risk of data sprawl, unauthorized data access, and non-compliance with data privacy regulations like GDPR or CCPA, especially as data volumes increase.
Key Automation Capabilities (with Sample Vendors/Solutions):
  1. Automated Data Discovery and Classification (e.g., Collibra, BigID)
  2. Policy Automation for Data Access (e.g., Apache Ranger, Open Policy Agent)
  3. Data Lineage Tracking (e.g., Informatica, Atlan)
  4. Automated Data Quality Management (e.g., Talend)
Scaling Challenges Addressed: Automation helps manage data quality, access control, and compliance in large and complex data environments, ensuring scaling without added complexity.
Automating Adoption of the Control:
  1. Use automated discovery tools that scan data sources for sensitive information continuously
  2. Set up automated access policy enforcement and data lineage tracking for end-to-end visibility
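
Automated access policy enforcement can be sketched as a default-deny lookup from role to the data classifications it may read. The Python below is an illustration only, not Apache Ranger or Open Policy Agent syntax. The role names, classification labels, and `is_allowed` function are hypothetical.

```python
# Which data classifications each role may read.
ROLE_CLEARANCE = {
    "analyst": {"public", "internal"},
    "privacy-officer": {"public", "internal", "restricted"},
}


def is_allowed(role, classification):
    """Default deny: unknown roles and unlisted classifications get no access."""
    return classification in ROLE_CLEARANCE.get(role, set())


print(is_allowed("analyst", "restricted"))
print(is_allowed("privacy-officer", "restricted"))
```
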








