Scaling Controls

Sep 7

Written By MCEO

Controls are put in place to reduce risk. However, far too often we fail to consider scalability in our control approaches. There is a tendency to rush towards remediating an issue through production of endless documentation (policies, procedures) and manual processes that begin to fall apart as soon as they are implemented. This is particularly rampant in smaller, growing companies where controls are often the responsibility of less technical (though still knowledgeable) IT risk and compliance staff. You cannot escape this truth: entropy is the ruler of controls based on documentation and manual procedures.

As a control professional, I scratch my head at this approach. In a world with no shortage of software developers, it seems that many companies fail to deploy such skills towards the problem of control automation.

The consequences? Simple: controls are incomplete in their adoption. Even worse, nobody knows the extent of adoption because little, if anything, is measured. Then your auditors come and beat you over the head. Or worse, the regulators. And then you throw more money at it in the form of people and manual processes.

The solution? Standardize controls, and automate them end to end. By end to end, I mean automation of their adoption, measurement, and correction. It requires security and control engineering. It requires people and skills to think about, specify, and engineer controls that are easy to adopt (if not fully automated). It requires a mindset and discipline to engineer controls similar to how a company thinks about engineering its customer-facing technologies (hopefully you have some discipline there!).
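To make "end to end" a little more concrete, here is a minimal sketch in Python of a control expressed as code: a check (measurement), a fix (correction), and a loop over every in-scope asset (adoption). The `Control` class, the `mfa_enabled` field, and the sample accounts are all hypothetical and only there for illustration.

```python
from dataclasses import dataclass
from typing import Callable

Asset = dict  # hypothetical: one in-scope account, system, or resource


@dataclass
class Control:
    name: str
    check: Callable[[Asset], bool]      # measurement: is the asset compliant?
    remediate: Callable[[Asset], None]  # correction: bring the asset into line


def run_control(control: Control, assets: list) -> dict:
    """Apply the control to every asset and report adoption before and after."""
    failing = [a for a in assets if not control.check(a)]
    before = 1 - len(failing) / len(assets) if assets else 1.0
    for asset in failing:
        control.remediate(asset)
    # Re-check after remediation so the report reflects the real end state.
    unresolved = [a for a in failing if not control.check(a)]
    after = 1 - len(unresolved) / len(assets) if assets else 1.0
    return {
        "control": control.name,
        "adoption_before": before,
        "adoption_after": after,
        "unresolved": unresolved,
    }


# Hypothetical control: every account must have MFA enabled.
mfa = Control(
    name="mfa-required",
    check=lambda a: a.get("mfa_enabled", False),
    remediate=lambda a: a.update(mfa_enabled=True),
)

accounts = [
    {"id": "alice", "mfa_enabled": True},
    {"id": "bob", "mfa_enabled": False},
    {"id": "carol"},
    {"id": "dave", "mfa_enabled": True},
]
report = run_control(mfa, accounts)
print(report)
```

The point of the shape is that adoption becomes a number you can report, rather than a statement in a policy document.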

Below I share some key examples of controls that are (and should be) common at a financial institution, with insights on how you might automate them. By no means is this a perfect list. What you do will depend on the context of your environment. However, it is a starting point you can use to think about this problem. Copy and paste it into a spreadsheet and adapt it to your control environment.

The key takeaway here is this: think twice about how you want to design your controls. Manual or semi-automatic controls may work in certain circumstances, but will eventually succumb to entropy, especially as the environment grows in scale and complexity. Invest in automation using common software analysis and engineering processes. Federate the problem to your existing engineers if necessary, or invest in a software development team that supports implementation of control automation.

Good Luck!

  
Each control area below is described under five headings: Common Anti-Patterns, Adverse Impact of Anti-Patterns, Key Automation Capabilities (with Sample Vendors/Solutions), Scaling Challenges Addressed, and Automating Adoption of the Control.
    
Identity and Access Management (IAM)

Common Anti-Patterns:
  Manual user provisioning and deprovisioning
  Over-provisioning of access
  Lack of periodic access reviews

Adverse Impact of Anti-Patterns:
  Increases the risk of orphaned accounts, unauthorized access, and compliance violations as user base grows.

Key Automation Capabilities (with Sample Vendors/Solutions):
  Centralized Identity Provider (IdP) (e.g., Okta, Azure AD)
  Automated Provisioning/Deprovisioning (e.g., SailPoint, OneLogin)
  Access Analytics and Reporting (e.g., BeyondTrust)
  Just-in-Time (JIT) Access (e.g., Thycotic)

Scaling Challenges Addressed:
  Reduces the overhead of managing multiple user accounts and ensures secure access control as the number of users grows.

Automating Adoption of the Control:
  Use integration with HR systems (e.g., Workday) to trigger automatic provisioning/deprovisioning
  Role-based access control templates to automate policy setup
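As a rough illustration of HR-triggered provisioning with role templates, the sketch below compares an HR roster against the accounts that actually exist and produces a change plan. The data shapes, role names, and entitlements are hypothetical; a real integration would pull them from your HR system and identity provider rather than from dictionaries.

```python
# Hypothetical role templates: role name -> entitlements the role should hold.
ROLE_TEMPLATES = {
    "engineer": {"git", "ci"},
    "analyst": {"bi-dashboard"},
}


def plan_access_changes(hr_roster, accounts):
    """Return (to_provision, to_deprovision, to_adjust) from HR and directory data.

    hr_roster: employee id -> {"active": bool, "role": str}
    accounts:  employee id -> set of entitlements currently held
    """
    active = {emp for emp, rec in hr_roster.items() if rec["active"]}
    existing = set(accounts)

    to_provision = sorted(active - existing)
    # Accounts with no active employee behind them are the orphaned accounts.
    to_deprovision = sorted(existing - active)

    to_adjust = {}
    for emp in sorted(active & existing):
        expected = ROLE_TEMPLATES[hr_roster[emp]["role"]]
        extra = accounts[emp] - expected
        missing = expected - accounts[emp]
        if extra or missing:
            to_adjust[emp] = {"revoke": sorted(extra), "grant": sorted(missing)}
    return to_provision, to_deprovision, to_adjust


hr_roster = {
    "e1": {"active": True, "role": "engineer"},
    "e2": {"active": True, "role": "analyst"},
    "e3": {"active": False, "role": "engineer"},  # left the company
    "e4": {"active": True, "role": "engineer"},   # new joiner, no account yet
}
accounts = {
    "e1": {"git", "ci"},
    "e2": {"bi-dashboard", "git"},  # over-provisioned
    "e3": {"git", "ci"},
    "e9": {"git"},                  # no HR record at all
}

to_provision, to_deprovision, to_adjust = plan_access_changes(hr_roster, accounts)
print(to_provision, to_deprovision, to_adjust)
```

Run on a schedule or on each HR event, a plan like this also doubles as the periodic access review: anything in the deprovision or adjust lists is a finding.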

Data Protection and Privacy

Common Anti-Patterns:
  Manual data classification
  Using weak encryption standards
  Not regularly reviewing data retention policies

Adverse Impact of Anti-Patterns:
  Increases the likelihood of misclassified or unprotected sensitive data, leading to data breaches and non-compliance with privacy regulations.

Key Automation Capabilities (with Sample Vendors/Solutions):
  Automated Data Classification (e.g., Varonis, BigID)
  Automated Encryption Management (e.g., AWS KMS, Azure Key Vault)
  Data Masking and Tokenization (e.g., Protegrity, Thales CipherTrust)
  Data Retention Policies (e.g., Druva)

Scaling Challenges Addressed:
  Automates data security and handling processes, making it easier to manage large volumes of sensitive data while maintaining compliance.

Automating Adoption of the Control:
  Integrate automated classification tools to scan all data sources continuously
  Use encryption management tools with built-in compliance templates
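To give a feel for what automated classification does under the hood, here is a deliberately simple pattern-based sketch. The labels and regular expressions are my own assumptions and far cruder than what a commercial classification tool applies; the card check uses the standard Luhn checksum to cut down on false positives.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_LIKE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_ok(digits: str) -> bool:
    """Standard Luhn checksum used by payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text: str) -> list:
    """Return the sorted list of sensitive-data labels found in the text."""
    labels = set()
    if EMAIL.search(text):
        labels.add("email")
    if SSN_LIKE.search(text):
        labels.add("ssn-like")
    for match in CARD_LIKE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):
            labels.add("card-number")
    return sorted(labels)


print(classify("Contact jane@example.com about the invoice"))
print(classify("Card 4111 1111 1111 1111 on file"))
print(classify("Order reference 1234 5678 9012 3456"))
```

The same function can be pointed at every new file, table, or message as it lands, which is what "scan continuously" amounts to in practice.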

Third-Party Risk Management

Common Anti-Patterns:
  One-time vendor risk assessments
  Lack of ongoing vendor monitoring
  Over-reliance on contracts without enforcing security measures

Adverse Impact of Anti-Patterns:
  Vendors can introduce significant risks as the organization grows, leading to supply chain attacks, data breaches, and compliance failures.

Key Automation Capabilities (with Sample Vendors/Solutions):
  Vendor Risk Scoring Platforms (e.g., BitSight, SecurityScorecard)
  API-Driven Vendor Due Diligence (e.g., Prevalent, OneTrust)
  Automated Contract Management (e.g., Icertis, DocuSign CLM)

Scaling Challenges Addressed:
  Automation reduces the manual workload of vendor risk monitoring and contract compliance as the number of vendors grows.

Automating Adoption of the Control:
  Implement continuous monitoring platforms with automated risk scoring based on vendor behavior and contract status
  Automate renewal and compliance check reminders
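A small sketch of what a recurring vendor check might look like is shown below. It assumes a risk score where higher is better, a hypothetical floor of 700, and a 60-day renewal window; none of these numbers come from a specific scoring vendor, so treat them as placeholders.

```python
from datetime import date


def vendor_actions(vendors, today, score_floor=700, renewal_window_days=60):
    """Return (vendor name, action) pairs for vendors needing attention."""
    actions = []
    for v in vendors:
        # Assumption: higher score means lower risk.
        if v["risk_score"] < score_floor:
            actions.append((v["name"], "reassess"))
        days_left = (v["contract_end"] - today).days
        if days_left < 0:
            actions.append((v["name"], "expired"))
        elif days_left <= renewal_window_days:
            actions.append((v["name"], "renew"))
    return actions


vendors = [
    {"name": "Vendor A", "risk_score": 820, "contract_end": date(2030, 12, 31)},
    {"name": "Vendor B", "risk_score": 640, "contract_end": date(2030, 2, 14)},
    {"name": "Vendor C", "risk_score": 750, "contract_end": date(2030, 1, 1)},
]

actions = vendor_actions(vendors, today=date(2030, 1, 15))
for name, action in actions:
    print(name, action)
```

Run daily, this replaces the one-time assessment with a standing queue of vendors that need a decision.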

Cybersecurity Threat Detection and Response

Common Anti-Patterns:
  Relying on manual incident response
  Not aggregating threat data across systems
  Lack of defined response playbooks

Adverse Impact of Anti-Patterns:
  Increases the time to detect and respond to security incidents, leading to prolonged exposure to attacks and significant operational disruptions.

Key Automation Capabilities (with Sample Vendors/Solutions):
  Security Information and Event Management (SIEM) (e.g., Splunk, Azure Sentinel)
  Automated Incident Response (SOAR) (e.g., Palo Alto Cortex XSOAR, IBM Resilient)
  Endpoint Detection and Response (EDR) (e.g., CrowdStrike, SentinelOne)

Scaling Challenges Addressed:
  Automated threat detection and response reduce manual intervention, allowing for efficient scaling of cybersecurity operations.

Automating Adoption of the Control:
  Use SOAR platforms to automate incident response workflows
  Integrate threat feeds for automated detection
  Regularly update playbooks through version control

System Resilience and Availability

Common Anti-Patterns:
  Manual failover processes
  Over-provisioning of resources without autoscaling
  No regular disaster recovery testing

Adverse Impact of Anti-Patterns:
  Leads to downtime, inefficient resource usage, and longer recovery times in the event of failures or increased load.

Key Automation Capabilities (with Sample Vendors/Solutions):
  Automated Failover and Recovery (e.g., AWS Elastic Disaster Recovery, Azure Site Recovery)
  Autoscaling Infrastructure (e.g., AWS Auto Scaling, Azure Virtual Machine Scale Sets)
  Chaos Engineering (e.g., Gremlin, AWS Fault Injection Simulator)

Scaling Challenges Addressed:
  Ensures system resilience and availability through automated failover, scaling, and testing in large-scale environments.

Automating Adoption of the Control:
  Implement autoscaling groups that adjust resource allocation in real-time
  Set up automated disaster recovery failover tests on a regular schedule
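The arithmetic behind an autoscaling decision is simple enough to sketch. The function below uses a generic proportional rule (scale the instance count by the ratio of observed load to target load, then clamp to the group's limits). It is an illustration of the idea, not the actual algorithm of any cloud provider, and the parameter names are my own.

```python
import math


def desired_capacity(current, metric, target, min_size, max_size):
    """Proportional scaling: grow or shrink so the per-instance metric nears target."""
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    proposed = math.ceil(current * metric / target)
    # Never go below the floor or above the ceiling set for the group.
    return max(min_size, min(max_size, proposed))


# 4 instances at 90% average CPU against a 60% target.
print(desired_capacity(current=4, metric=90, target=60, min_size=2, max_size=10))
```

The floor and ceiling matter as controls in their own right: the floor protects availability, and the ceiling caps cost when a metric misbehaves.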

Regulatory Compliance Automation

Common Anti-Patterns:
  Manual control assessments
  Reactively handling compliance issues
  Lack of centralized compliance reporting

Adverse Impact of Anti-Patterns:
  Increases the likelihood of missed compliance deadlines, inaccurate reporting, and exposure to penalties as compliance obligations grow with scaling infrastructure.

Key Automation Capabilities (with Sample Vendors/Solutions):
  Compliance-as-Code (e.g., HashiCorp Sentinel, Open Policy Agent)
  Automated Compliance Audits (e.g., Drata, Vanta)
  Automated Reporting and Alerts (e.g., OneTrust, AuditBoard)

Scaling Challenges Addressed:
  Automation ensures compliance across growing infrastructures without the need for manual oversight.

Automating Adoption of the Control:
  Use compliance-as-code templates that automatically validate infrastructure configurations
  Enable real-time compliance monitoring dashboards
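Compliance-as-code boils down to writing each requirement as a rule that a machine can evaluate against a configuration. In practice you would likely use a dedicated policy language such as those mentioned above; the plain Python below is only a stand-in to show the shape, and the rule names and resource fields are hypothetical.

```python
# Each rule takes a resource configuration and returns True when compliant.
RULES = {
    "encryption-at-rest": lambda r: r.get("encrypted") is True,
    "no-public-access": lambda r: not r.get("public", False),
    "owner-tag-present": lambda r: bool(r.get("tags", {}).get("owner")),
}


def evaluate(resources):
    """Return one finding per (resource, failed rule) pair."""
    findings = []
    for res in resources:
        for rule_id, rule in RULES.items():
            if not rule(res):
                findings.append({"resource": res["id"], "rule": rule_id})
    return findings


resources = [
    {"id": "bucket-a", "encrypted": True, "public": False, "tags": {"owner": "risk"}},
    {"id": "bucket-b", "encrypted": False, "public": True, "tags": {}},
    {"id": "db-1", "encrypted": True, "tags": {"owner": "payments"}},
]

findings = evaluate(resources)
for f in findings:
    print(f)
```

Wired into a deployment pipeline, a non-empty findings list would block the change, so the control is adopted by default rather than by reminder email.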

Audit and Logging

Common Anti-Patterns:
  Inconsistent log collection
  Storing logs across siloed systems
  Lack of real-time log analysis

Adverse Impact of Anti-Patterns:
  Results in incomplete audit trails and missed detection of anomalies, making it difficult to meet compliance requirements or identify security incidents at scale.

Key Automation Capabilities (with Sample Vendors/Solutions):
  Centralized Log Aggregation (e.g., AWS CloudWatch, Azure Monitor, ELK Stack)
  Log Analysis with AI (e.g., Datadog, Splunk)
  Automated Audit Trails (e.g., AWS CloudTrail, Azure Activity Logs)

Scaling Challenges Addressed:
  AI-driven log analysis and automated audit trails reduce the burden of manual log management and provide compliance at scale.

Automating Adoption of the Control:
  Use centralized logging systems that integrate with all key infrastructure
  Implement AI-driven log analysis tools for anomaly detection and audit trail generation
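Anomaly detection on logs does not have to start with anything exotic. The sketch below uses a plain trailing-window z-score, which is a simple statistical stand-in rather than the models commercial tools ship; the window size, threshold, and sample counts are assumptions for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(counts, window=5, threshold=3.0):
    """Flag indexes whose value is far from the mean of the preceding window."""
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            # Flat history: any change at all is worth a look.
            if counts[i] != mu:
                flagged.append(i)
            continue
        if abs(counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical failed-login counts per hour from a centralized log store.
failed_logins = [12, 15, 11, 14, 13, 12, 96, 14, 13]
anomalies = flag_anomalies(failed_logins)
print(anomalies)
```

This only works if the counts come from one place, which is why centralized collection comes before any analysis.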

Payment Rails Automation

Common Anti-Patterns:
  Manual routing of payments
  Not using AI for fraud detection
  Lack of real-time reconciliation

Adverse Impact of Anti-Patterns:
  Leads to delayed payments, higher fraud rates, and manual reconciliation errors, which can impact operational efficiency and customer trust as transaction volume scales.

Key Automation Capabilities (with Sample Vendors/Solutions):
  Automated Payment Routing (e.g., Stripe, Rapyd)
  Fraud Detection and Prevention (e.g., Feedzai, Forter)
  Automated Reconciliation (e.g., ReconArt, BlackLine)
  API-Driven Settlement Systems (e.g., Plaid, Dwolla)

Scaling Challenges Addressed:
  Automated routing, fraud detection, and reconciliation streamline operations for managing large volumes of payments across networks.

Automating Adoption of the Control:
  Implement real-time fraud detection and prevention tools integrated with payment gateways
  Automate reconciliation between different payment rails and accounts
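Reconciliation is a good candidate for automation because the core logic is a comparison of two record sets. Here is a minimal sketch that matches an internal ledger against a processor report by transaction id. The data shapes and ids are hypothetical, and amounts use `Decimal` to avoid floating-point rounding surprises.

```python
from decimal import Decimal


def reconcile(ledger, processor):
    """Compare two txn_id -> amount mappings and bucket the differences."""
    result = {
        "matched": [],
        "mismatched": [],
        "missing_in_processor": sorted(ledger.keys() - processor.keys()),
        "missing_in_ledger": sorted(processor.keys() - ledger.keys()),
    }
    for txn_id in sorted(ledger.keys() & processor.keys()):
        if ledger[txn_id] == processor[txn_id]:
            result["matched"].append(txn_id)
        else:
            result["mismatched"].append((txn_id, ledger[txn_id], processor[txn_id]))
    return result


ledger = {
    "t1": Decimal("100.00"),
    "t2": Decimal("25.50"),
    "t3": Decimal("9.99"),
}
processor = {
    "t1": Decimal("100.00"),
    "t2": Decimal("25.05"),  # amount differs from the ledger
    "t4": Decimal("40.00"),  # never recorded internally
}

recon = reconcile(ledger, processor)
print(recon)
```

Everything outside the matched bucket is an exception for a person to review, which keeps human effort proportional to the error rate rather than to transaction volume.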

Credit Card Networks Automation

Common Anti-Patterns:
  Storing credit card information without tokenization
  Manual chargeback handling
  Lack of multi-factor authentication (MFA) for transactions

Adverse Impact of Anti-Patterns:
  Increases the risk of credit card data breaches, higher chargeback costs, and compliance failures (PCI-DSS), impacting scalability and security of card transactions.

Key Automation Capabilities (with Sample Vendors/Solutions):
  Tokenization for Card Transactions (e.g., Visa Token Service, Marqeta)
  Automated Chargeback Management (e.g., Chargehound, Verifi)
  Compliance with Card Network Rules (e.g., Alloy, ComplyAdvantage)
  Dynamic Card Authentication (e.g., Adyen)

Scaling Challenges Addressed:
  Automation manages complex tokenization, compliance, and fraud prevention needs for growing credit card transaction volumes.

Automating Adoption of the Control:
  Set up automated chargeback management integrated with card networks
  Use tokenization services to protect card data and integrate MFA for transaction authentication
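For readers less familiar with tokenization, the idea is that downstream systems only ever see a random stand-in for the card number. The toy vault below shows the concept in memory. It is purely illustrative and not a compliant implementation: a real deployment would rely on a tokenization service with hardened storage, and the `TokenVault` class and `tok_` prefix are my own inventions.

```python
import secrets


class TokenVault:
    """Toy in-memory vault mapping card numbers (PANs) to random tokens."""

    def __init__(self):
        self._by_token = {}
        self._by_pan = {}

    def tokenize(self, pan: str) -> str:
        # Reuse the existing token so one card always maps to one token.
        if pan in self._by_pan:
            return self._by_pan[pan]
        token = "tok_" + secrets.token_hex(16)  # random, not derived from the PAN
        self._by_token[token] = pan
        self._by_pan[pan] = token
        return token

    def detokenize(self, token: str) -> str:
        return self._by_token[token]

    @staticmethod
    def masked(pan: str) -> str:
        """Show only the last four digits, e.g. for receipts and support screens."""
        return "*" * (len(pan) - 4) + pan[-4:]


vault = TokenVault()
pan = "4111111111111111"  # well-known test card number
token = vault.tokenize(pan)
print(token, TokenVault.masked(pan))
```

Because the token is random, a system that stores only tokens holds nothing an attacker can turn back into a card number without access to the vault.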

Marketplace Lending Rules and Regulations

Common Anti-Patterns:
  Manual underwriting processes
  Inconsistent compliance with lending regulations
  Not performing regular borrower risk assessments

Adverse Impact of Anti-Patterns:
  Leads to slower loan processing, higher risk of non-compliance with lending laws, and increased exposure to financial loss as loan volumes grow.

Key Automation Capabilities (with Sample Vendors/Solutions):
  Automated Loan Underwriting (e.g., Zest AI, Upstart)
  Regulatory Compliance Engines (e.g., Ascent, Compliance.ai)
  Automated Loan Servicing (e.g., LoanPro, Fiserv)
  KYC/AML Automation (e.g., Onfido, Jumio)

Scaling Challenges Addressed:
  Automation ensures efficient handling of increasing loan volumes and complex regulatory requirements for lending platforms.

Automating Adoption of the Control:
  Integrate AI-driven underwriting systems to automate borrower risk assessment
  Use compliance engines that automatically check loan programs against regulations

Data Governance Automation

Common Anti-Patterns:
  Manual data discovery and classification
  Lack of access control policies for sensitive data
  Not tracking data lineage

Adverse Impact of Anti-Patterns:
  Increases the risk of data sprawl, unauthorized data access, and non-compliance with data privacy regulations like GDPR or CCPA, especially as data volumes increase.

Key Automation Capabilities (with Sample Vendors/Solutions):
  Automated Data Discovery and Classification (e.g., Collibra, BigID)
  Policy Automation for Data Access (e.g., Apache Ranger, Open Policy Agent)
  Data Lineage Tracking (e.g., Informatica, Atlan)
  Automated Data Quality Management (e.g., Talend)

Scaling Challenges Addressed:
  Automation helps manage data quality, access control, and compliance in large and complex data environments, ensuring scaling without added complexity.

Automating Adoption of the Control:
  Use automated discovery tools that scan data sources for sensitive information continuously
  Set up automated access policy enforcement and data lineage tracking for end-to-end visibility

