Why On-Premise AI Is the Only Way to Actually Audit Hiring Algorithms (Legal Teams Explain)
Jan 24, 2026
Your AI hiring vendor sends you their annual bias audit report. It's 47 pages long, produced by a third-party auditor, and claims the system is compliant with NYC Local Law 144.
Your Chief Compliance Officer reads it and asks one question: "Can we inspect the actual model that's making these hiring decisions?"
The vendor's response: "The model is proprietary. The third-party audit report is your verification."
That's when legal kills the project.
This isn't an edge case. It's the standard procurement conversation at every enterprise that takes bias auditing seriously. New York City Local Law 144, Colorado's AI Act, Illinois's video-interview rules, and a growing list of state-level requirements all impose bias audit, impact assessment, or disclosure obligations on automated employment decision tools. The question isn't whether you need to audit AI hiring systems. The question is whether you can actually do it.
Most SaaS AI hiring vendors can't give you what bias audits actually require: direct access to the model, the training data, and the decision logic. They can give you reports about their system. Reports are not audits. Reports are what vendors provide when they can't provide access.
For Chief Compliance Officers, this creates an unsolvable problem: you're legally responsible for demonstrating that your AI hiring system doesn't discriminate, but you're dependent on vendor-controlled audits of vendor-controlled models using vendor-selected auditors.
There's exactly one way to solve this: deploy the AI system inside your infrastructure where your audit team can inspect it directly.
What Bias Audit Laws Actually Require
NYC Local Law 144, which took effect July 5, 2023, established the template that other jurisdictions are following. The law requires employers using automated employment decision tools in New York City to obtain annual independent bias audits and publish the results.
A bias audit begins by examining whether the tool's results differ for protected groups at each stage of the process: resume scores, rankings, who receives interviews, who passes assessments, and who ultimately gets hired.
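That stage-by-stage comparison is easy to script once you have candidate-level data. A minimal sketch, assuming a table with one row per applicant, a protected-group column, and one boolean column per funnel stage (all column names here are hypothetical):

```python
import pandas as pd

# Hypothetical candidate-level data: one row per applicant, one boolean
# column per funnel stage. Replace with your real schema and categories.
candidates = pd.DataFrame({
    "group":       ["A", "A", "A", "A", "B", "B", "B", "B"],
    "screened_in": [True, True, True, False, True, True, False, False],
    "interviewed": [True, True, False, False, True, False, False, False],
    "hired":       [True, False, False, False, False, False, False, False],
})

stages = ["screened_in", "interviewed", "hired"]

# Advancement rate at each stage, broken out by protected group.
print(candidates.groupby("group")[stages].mean())
```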
The law appears straightforward until you try to operationalize it. What does "independent audit" mean when:
The vendor controls access to the model
The auditor is paid by the vendor
The employer can't verify the audit methodology
The model changes between audit cycles
The training data is inaccessible
Under the law, employers using these tools must engage an independent third-party auditor to conduct an annual bias audit and post a summary of the results on their website.
Colorado's AI Act, effective February 2026, goes further. It requires not just bias audits but ongoing impact assessments, and explicitly holds deployers (not just developers) responsible for algorithmic discrimination. Illinois mandates disclosure when AI evaluates video interviews. California has proposed requirements for impact assessments of automated decision tools.
The regulatory landscape is fragmenting, with each jurisdiction imposing slightly different audit requirements, disclosure obligations, and compliance timelines. What they share: employers are legally responsible for AI hiring outcomes regardless of whether they built the tool or bought it from a vendor.
This creates a fundamental problem. You're liable for the decisions an AI system makes, but if you're using a SaaS vendor, you don't control the system making those decisions.
The Third-Party Auditor Problem
NYC Local Law 144 requires "independent, impartial" bias audits. In practice, here's how it works:
The AI vendor hires an auditing firm
The auditor analyzes the model and data the vendor provides
The auditor produces a report
The employer receives the report and publishes a summary
Notice what's missing: the employer never sees the actual model. The employer's compliance team cannot independently validate the audit findings. The employer is trusting a vendor-paid auditor's analysis of a vendor-controlled black box.
Even the best-intentioned audits exist in a system of incentives. Consultants are still vendors. And vendors, as a rule, don't bite the hand that signs the check.
This isn't a theoretical concern. It's documented reality. Consider Citizens Bank's published bias audit results: 57 of 640 comparisons (8.9%) produced an impact ratio below 0.80, concentrated mainly in the top score tier (40 of the 57) across all groups.
The EEOC's four-fifths rule treats impact ratios below 0.80 as evidence of potential adverse impact. Citizens Bank's audit found 57 instances below that threshold. Yet for compliance purposes the audit was complete the moment the report was published. What happened next? We don't know, because the employer can't inspect the model to verify whether fixes were implemented.
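The four-fifths check itself is simple arithmetic: divide each group's selection rate by the most-selected group's rate and flag anything under 0.80. A minimal sketch with made-up numbers, not Citizens Bank's data:

```python
# Selection rates by group at one funnel stage (illustrative numbers only).
selection_rates = {"group_a": 0.42, "group_b": 0.31, "group_c": 0.40}

benchmark = max(selection_rates.values())

for group, rate in selection_rates.items():
    impact_ratio = rate / benchmark
    status = "below four-fifths threshold" if impact_ratio < 0.80 else "ok"
    print(f"{group}: selection rate {rate:.2f}, impact ratio {impact_ratio:.2f} ({status})")
```

The arithmetic is trivial. The hard part is getting trustworthy selection counts and verifying them against the system actually making the decisions, which is exactly what a report-only audit can't offer.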
The "Independent Auditor" Illusion
The New York City Department of Consumer and Worker Protection published final rules clarifying who qualifies as an independent auditor, how bias audits must be conducted, and what must be included in the published summary.
But "independent" has a structural limit when the vendor controls access. The auditor can only examine what the vendor makes available. If the vendor's training data is incomplete, mislabeled, or unavailable, the auditor works with what they're given. If the model architecture includes proprietary components the vendor won't disclose, the auditor works around them.
For all the attention that AI audits have received, their ability to actually detect and protect against bias remains unproven. The term "AI audit" can mean many different things, which makes it hard to trust the results of audits in general.
Consider the incentive structure:
The vendor wants audit results showing compliance. Failed audits mean lost customers.
The auditor wants to maintain the vendor relationship. Vendors who receive bad audits hire different auditors.
The employer is the only party whose incentive is actually finding bias, but the employer has the least access to information.
This isn't a flaw in individual auditors' integrity. It's a structural problem when the audited party controls access to the thing being audited.
What Third-Party Auditors Can't Tell You
Even when third-party audits are conducted in perfect good faith, they have inherent limitations when the employer doesn't control the underlying system:
Model drift: The audit evaluates the model at a point in time. SaaS vendors continuously update their models. The model you're using today isn't the model that was audited six months ago. You have no way to verify it hasn't drifted toward discriminatory patterns.
Training data opacity: A real audit examines the training and reference data, features that might act as stand-ins or proxies for protected traits, how those features are constructed, the score cutoffs applied, any location- or role-specific settings, and how recruiters and managers actually use the tool's output. If you can't access the training data, you can't verify any of it.
Intersectional bias: Neither the four-fifths rule nor standard audits address intersectionality. The rule compares men with women and one racial group with another to see whether they pass at the same rates, but it doesn't compare intersectional groups, such as Black women with white men. Without model access, you can't test for bias patterns the initial audit didn't cover.
Feature engineering details: AI models don't just use raw data. They engineer features: derived variables that can proxy for protected characteristics even when those characteristics aren't explicitly included. When the training data overrepresents or underrepresents certain groups, those derived features can encode the imbalance. Without access to the feature engineering logic, you can't verify whether proxy discrimination is occurring.
Threshold adjustments: Small changes to decision thresholds can dramatically shift impact ratios across protected groups. If you can't access and modify thresholds, you can't test whether alternative cutoffs would reduce adverse impact while maintaining job-relatedness (a minimal version of this test is sketched below).
Third-party audits tell you how the model performed on a specific dataset at a specific time using the auditor's methodology. They don't tell you whether the model you're currently using is compliant. They don't give you the ability to investigate anomalies. They don't enable you to test alternative configurations.
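The threshold point above illustrates the gap most concretely. With direct access to scores, a cutoff sweep is a few lines of analysis; without it, it's a vendor feature request. A minimal sketch, assuming a scored candidate table with hypothetical column names and synthetic data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical scores for two groups; substitute your real scores and labels.
df = pd.DataFrame({
    "score": np.concatenate([rng.normal(0.62, 0.12, 500),
                             rng.normal(0.58, 0.12, 500)]),
    "group": ["A"] * 500 + ["B"] * 500,
})

# For each candidate cutoff, compute per-group selection rates and the impact ratio.
for cutoff in np.arange(0.50, 0.76, 0.05):
    selected = (df["score"] >= cutoff).groupby(df["group"]).mean()
    ratio = selected.min() / selected.max()
    print(f"cutoff {cutoff:.2f}: rates {selected.round(2).to_dict()}, impact ratio {ratio:.2f}")
```

A real analysis would pair each cutoff with a validity check, since lowering adverse impact only helps if the tool still predicts job performance.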
The Black Box Problem at Scale
The term "black box AI" has become cliché, but in the context of bias auditing, it's a precise description of the actual problem Chief Compliance Officers face.
Many hiring platforms are "black boxes." HR teams can't inspect model logic or weightings. Without explainability, biases stay hidden until flagged by an audit or complaint.
Here's what happens in practice:
Scenario 1: Anomalous Results
Your recruiting team notices that qualified female candidates for technical roles are being scored lower than similarly qualified male candidates. You ask your AI vendor to investigate. They run a query, send you aggregate statistics, and assure you the system is working as designed.
Can you verify this? No. You don't have access to the model. You can't inspect the feature weights, examine the training data, or test alternative scoring approaches. You're dependent on the vendor's analysis of their own system.
Scenario 2: Regulatory Investigation
The EEOC investigates and requests documentation of your AI hiring system, including how it was validated for job-relatedness and tested for adverse impact. As legal scrutiny intensifies, employers cannot treat AI tools as black boxes.
You provide the third-party audit report. The EEOC requests additional information: specific feature weights, training data composition, validation methodology for job-relatedness. You ask your vendor. They provide summaries but cite intellectual property protection for actual model details.
The EEOC's question: How can you demonstrate your hiring system is job-related and consistent with business necessity if you can't explain how it actually works?
Scenario 3: Remediation Requirements
A bias audit identifies adverse impact against candidates over 40. The auditor recommends adjusting decision thresholds. You ask your vendor to implement the changes. They explain that threshold adjustments would require retraining the model, which affects all their customers, not just you. They'll consider it for a future release.
You can't implement the fix because you don't control the model. Your compliance posture is now dependent on your vendor's product roadmap.
Why Vendors Can't Give You Model Access
This isn't vendors being difficult. It's a structural consequence of the SaaS business model.
Multi-tenant architecture: SaaS vendors serve hundreds or thousands of customers from shared infrastructure. One model processes candidates from many employers. If the vendor gives you full model access, they're potentially exposing other customers' patterns, which violates their obligations to those customers.
Intellectual property protection: The model architecture, training methodology, and feature engineering represent the vendor's competitive advantage. Providing full model access means competitors could reverse-engineer their intellectual property.
Continuous updates: SaaS vendors update models continuously. If they gave customers direct model access, every update would require individual customer approval, destroying the operational advantage of centralized infrastructure.
Data commingling: Many SaaS AI vendors train models on aggregated data from multiple customers to improve performance. Your candidates' patterns improve the model for other customers, and vice versa. This creates accuracy benefits, but it means the model making decisions about your candidates was partially trained on other companies' data. The legal implication: you can't audit training data you don't control.
The SaaS model is fundamentally incompatible with employer-controlled bias auditing. This isn't a technical limitation that vendors could fix with better architecture. It's inherent to the shared-infrastructure approach.
What On-Premise Deployment Changes
Virtual Private Cloud deployment means the AI hiring system runs entirely inside your infrastructure. Not "connected to" your infrastructure. Inside it. The model lives in your VPC. The training data is yours. The decision logic operates on your hardware.
For bias auditing, this changes everything.
Direct Model Access
Your data science team can inspect the actual production model anytime. Not a report about the model. The model itself. This means:
Feature inspection: Examine which features the model uses, how they're weighted, and whether any could proxy for protected characteristics.
Training data access: Review the actual data the model learned from. Verify that it's representative, properly labeled, and doesn't encode historical discrimination patterns.
Decision logic transparency: Trace how the model processes a specific candidate through the evaluation pipeline. Understand exactly why candidate A scored 0.73 and candidate B scored 0.81.
Threshold testing: Modify decision thresholds and immediately see the impact on selection rates across protected groups. Test whether alternative cutoffs reduce adverse impact while maintaining predictive validity.
Continuous monitoring: Set up automated bias monitoring that runs on every batch of candidates, not annually. Detect distributional shifts before they compound into systemic discrimination.
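A minimal sketch of what that batch-level check might look like. The function, column names, and thresholds are illustrative assumptions, not any particular product's API:

```python
import numpy as np
import pandas as pd

def batch_bias_check(batch: pd.DataFrame, reference_scores: np.ndarray,
                     ratio_floor: float = 0.80, drift_limit: float = 0.25) -> list[str]:
    """Flag adverse-impact and drift signals in one batch of scored candidates.

    Expects columns 'group', 'score', and boolean 'selected'. Thresholds are
    policy choices; tune them to your compliance requirements.
    """
    alerts = []

    # Four-fifths-style check on this batch's selection rates.
    rates = batch.groupby("group")["selected"].mean()
    if rates.max() > 0 and rates.min() / rates.max() < ratio_floor:
        alerts.append(f"impact ratio {rates.min() / rates.max():.2f} below {ratio_floor}")

    # Crude drift check: has the batch's mean score moved away from the reference window?
    shift = abs(batch["score"].mean() - reference_scores.mean())
    if shift > drift_limit * reference_scores.std():
        alerts.append(f"score distribution shifted by {shift:.3f} vs. reference")

    return alerts

# Usage: alerts = batch_bias_check(this_weeks_candidates, last_quarter_scores)
```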
Independent Validation
Legally privileged bias audits, channeled through legal counsel, can anchor an AI governance program that also maintains an inventory and classification of tools, sets clear policies and vendor obligations, and conducts ongoing monitoring and remediation.
When you control the model, you control the audit process:
Choose your own auditors: Hire auditors with no relationship to the vendor. Their only obligation is accurate assessment, not maintaining vendor relationships.
Verify audit methodology: Your technical team can validate that the auditor used appropriate statistical tests, properly segmented protected groups, and tested for intersectional bias.
Re-run audits anytime: If new bias concerns emerge or regulations change, you don't wait for the vendor's annual audit cycle. You initiate testing immediately.
Test remediation effectiveness: When bias is detected and fixes are implemented, you can verify the fix worked. Not "vendor says they fixed it." Actual verification through re-testing.
Regulatory Defense
When regulators investigate, you provide direct evidence from your own infrastructure:
Model documentation: Complete technical specifications of the model architecture, training process, and decision logic.
Training data access: Ability to demonstrate that training data was representative and didn't encode discriminatory patterns.
Validation evidence: Results from your own validation studies showing the model is job-related and predicts actual performance.
Audit trails: Complete logs of every hiring decision, the features that influenced it, and the scores across all candidates, not just those who advanced (a minimal record format is sketched below).
Remediation documentation: When bias is detected, evidence that fixes were implemented and retested, not just promised.
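None of this evidence requires exotic tooling; the audit trail in particular is mostly about writing down enough to reconstruct each decision later. A minimal sketch of one append-only decision record, with illustrative field names:

```python
import json
from datetime import datetime, timezone

# One record per scoring decision, appended to an immutable log. The exact
# fields are an assumption; the point is capturing model version, inputs,
# feature attributions, score, threshold, and outcome together.
record = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "model_version": "resume-screen-2026.01",   # tag or hash of the deployed model
    "candidate_id": "cand-000123",
    "requisition_id": "req-0456",
    "stage": "resume_screen",
    "score": 0.73,
    "threshold": 0.65,
    "advanced": True,
    "top_features": [
        {"feature": "years_relevant_experience", "contribution": 0.21},
        {"feature": "prior_role_similarity", "contribution": 0.14},
    ],
}

with open("decision_audit.jsonl", "a") as log:
    log.write(json.dumps(record) + "\n")
```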
The difference between vendor-dependent compliance and direct-access compliance is the difference between "our vendor assures us" and "here is the evidence from our system."
How CNO Financial Actually Audits Their AI
CNO Financial is a Fortune 500 insurance company that deployed NODES across all 215 locations. Unlike enterprises using SaaS AI vendors, CNO Financial can actually audit their hiring AI.
Here's what that looks like in practice:
Quarterly Bias Testing
CNO Financial doesn't wait for annual audits. Their compliance team runs bias analyses quarterly, examining:
Selection rates by protected group: How many candidates from each demographic group are advancing at each stage of the hiring funnel.
Score distributions: Whether candidate scores cluster differently for different protected groups, which could indicate feature bias.
Intersectional analysis: Selection rates for intersectional categories (e.g., Black women, Asian men over 40) that standard four-fifths tests miss (a minimal version is sketched after this list).
Outcome validation: Whether candidates flagged as high-performers actually became high-performers, broken down by demographic group. If the model is better at predicting success for one group than another, that's evidence of differential validity.
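The intersectional analysis above is the same four-fifths arithmetic applied to combined categories rather than single ones. A minimal sketch, assuming demographic columns exist on the candidate table (names are hypothetical, and real analyses need enough candidates per cell to be meaningful):

```python
import pandas as pd

# Hypothetical candidate outcomes with demographic columns, for illustration only.
df = pd.DataFrame({
    "race":     ["Black", "White", "Asian", "White", "Black", "Asian", "White", "Black"],
    "gender":   ["F", "M", "M", "F", "F", "M", "M", "F"],
    "over_40":  [True, False, True, False, True, True, False, False],
    "selected": [False, True, True, True, False, True, True, False],
})

# Selection rate and impact ratio for every intersectional cell,
# not just single-axis comparisons like men vs. women.
cells = df.groupby(["race", "gender", "over_40"])["selected"].agg(rate="mean", n="count")
cells["impact_ratio"] = cells["rate"] / cells["rate"].max()

# Small cells are statistically noisy; in practice require a minimum n per cell.
print(cells.sort_values("impact_ratio"))
```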
They can run these analyses because they have direct access to:
The model generating candidate scores
The complete candidate dataset
The downstream performance data validating predictions
The feature weights and decision logic
No vendor approval required. No waiting for annual audit cycles. No dependence on third-party auditors who can't access proprietary model details.
Model Transparency for Legal
When CNO Financial's legal team needs to verify compliance, they don't request documentation from a vendor. They examine the actual system:
Feature audit: Legal worked with the data science team to verify that no features in the model could plausibly proxy for protected characteristics. They examined correlations between features, outcomes, and protected attributes, and validated that high-signal features (those with strong predictive power) are job-related (a minimal version of this check is sketched after this list).
Training data validation: Legal reviewed the performance data used to train the model. They verified it represented a diverse set of successful employees across protected groups, preventing the model from learning patterns where success looks like "white male with college degree" because that's who was promoted historically.
Adverse impact testing: Legal doesn't trust vendor assurances. They ran their own tests using the EEOC's Uniform Guidelines on Employee Selection Procedures, examining selection rates at each stage for each protected group.
Job-relatedness evidence: Because the model trains on CNO Financial's actual performance data, legal can demonstrate that the features the AI uses actually predict job performance at their company. Not generic job performance. CNO-specific performance outcomes.
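A minimal sketch of the proxy screen behind that feature audit: correlate each model feature with the protected attribute and send strong associations for human review. The feature names, the synthetic data, and the 0.30 cutoff are assumptions for illustration:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 1000
protected = rng.integers(0, 2, n)   # 0/1 encoding of one protected attribute

# Hypothetical feature matrix; one feature is deliberately built to correlate
# with the protected attribute so the screen has something to flag.
data = pd.DataFrame({
    "years_experience": rng.normal(8, 3, n),
    "zip_code_income_index": rng.normal(0, 1, n) + 0.8 * protected,
    "assessment_score": rng.normal(70, 10, n),
})

# Absolute correlation of each feature with the protected attribute. High values
# suggest a possible proxy and trigger a job-relatedness review, not automatic removal.
corr = data.corrwith(pd.Series(protected)).abs().sort_values(ascending=False)
print(corr)
print("flag for review:", list(corr[corr > 0.30].index))   # cutoff is a policy choice
```

Correlation is only a first pass; a stronger check tries to predict the protected attribute from the full feature set and reviews any feature combination that predicts it well.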
This is what bias auditing looks like when you control the system being audited.
Remediation in Days, Not Months
When CNO Financial's quarterly bias testing identified a potential issue with interview scoring for candidates over 50, here's what happened:
Day 1: Compliance team flagged the anomaly. Data science team began investigation.
Day 3: Root cause identified. Interview scoring was over-weighting "technical terminology currency"—a feature that inadvertently favored candidates who used newer jargon over candidates with deep domain expertise who used established terminology.
Day 5: Feature weight adjusted. Model retrained on CNO's performance data to optimize for predictive accuracy without the problematic feature weighting.
Day 7: Bias tests re-run. Impact ratio now above 0.80 across all age groups. Fix validated.
Day 10: Documentation provided to legal showing problem identification, root cause, fix implementation, and validation testing.
Total time from detection to validated fix: 10 days.
Compare this to the SaaS vendor scenario: You identify the issue, report it to your vendor, vendor investigates (2-4 weeks), vendor proposes a fix affecting all customers (requires product planning discussion), fix scheduled for next quarterly release (8-12 weeks), you test the updated version, hope it actually fixed your specific issue.
Typical time for SaaS vendor remediation: 4-6 months. If it happens at all.
Production Metrics That Validate the Approach
CNO Financial's deployment demonstrates what's possible when you can actually audit and optimize the AI you're using:
660,000+ candidates processed with complete bias monitoring at every stage.
80% accuracy predicting top performers (validated against actual performance reviews), broken down by demographic group to verify the model works equally well across protected classes.
Zero adverse impact violations in quarterly bias testing since deployment.
17 days from contract to legal approval because legal could inspect the actual system before deployment, not trust vendor assurances.
$1.58M documented savings while maintaining fair hiring practices—proof that bias auditing and business outcomes aren't in tension.
The model gets better every quarter because it continuously learns from CNO's own performance data. Legal can audit anytime. Compliance isn't dependent on vendor cooperation.
This is what AI hiring looks like when the employer controls the infrastructure.
The Questions Legal Teams Should Ask Vendors
When evaluating AI hiring vendors on bias auditability, these questions separate vendors whose architecture enables real auditing from vendors whose audits are vendor-controlled theater:
Model Access Questions
"Can our data science team inspect the actual production model being used to evaluate our candidates?"
Red flag answer: "We provide detailed audit reports and model cards documenting our system's behavior."
Reports about the model aren't the model. If you can't inspect the actual model, you can't independently verify bias claims.
Green flag answer: "Yes. The model runs in your VPC. Your team can inspect it, validate it, and audit it anytime."
"Can we examine the training data used to build the model evaluating our candidates?"
Red flag answer: "Our model is trained on industry-leading datasets from multiple sources to ensure broad applicability."
Multi-source training data means your candidates are being evaluated by a model trained on other companies' hiring patterns. You can't verify whether those patterns are job-related for your roles.
Green flag answer: "The model is trained exclusively on your performance data. You control the training data, you can audit it, and you can validate that it's representative."
Audit Control Questions
"Who selects the third-party auditor: us or you?"
Red flag answer: "We work with certified independent auditors who specialize in AI bias assessment."
Vendor-selected auditors create the conflict of interest problem. The auditor's continued relationship with the vendor depends on producing acceptable audit results.
Green flag answer: "You select the auditor. The system runs in your VPC, so any qualified auditor can examine it without vendor permission."
"Can we run our own bias tests using our own methodology whenever we want?"
Red flag answer: "Our annual bias audits are conducted by independent third parties using industry-standard methodologies."
Annual audits mean you're monitoring for discrimination once a year. AI models can drift toward discriminatory patterns between audit cycles.
Green flag answer: "Yes. You control the model and data, so you can run any tests you want, anytime you want."
Remediation Questions
"If a bias audit identifies adverse impact, can we implement fixes immediately without waiting for vendor approval or product releases?"
Red flag answer: "We take bias audit findings very seriously and prioritize remediation in our product roadmap."
"Prioritize in our product roadmap" means your compliance timeline is controlled by the vendor's development schedule. Fixes might take months, if they happen at all.
Green flag answer: "Yes. You control the model, so you can implement and test fixes immediately."
"Can we verify that bias fixes were actually implemented and effective?"
Red flag answer: "We'll provide updated audit documentation after fixes are released."
Vendor-provided documentation about fixes isn't verification of fixes. You need to be able to test the model yourself.
Green flag answer: "Yes. After implementing fixes, you can re-run bias tests on your model to verify the remediation worked."
Ongoing Monitoring Questions
"How do we monitor for model drift and emerging bias between annual audits?"
Red flag answer: "Our models are continuously monitored by our AI governance team."
The vendor monitors the vendor's model for the vendor's interests. Your compliance needs might not align with their monitoring priorities.
Green flag answer: "You set up continuous bias monitoring on your deployment. You define the metrics, thresholds, and alerting that match your compliance requirements."
"What happens when regulations change and we need to test for new bias metrics?"
Red flag answer: "We stay current with regulatory changes and update our audit methodologies accordingly."
Regulatory changes happen faster than vendor product cycles. Waiting for your vendor to update their audit methodology means operating out of compliance until they catch up.
Green flag answer: "You control the model and audit process, so you can implement new bias tests immediately when regulations change."
Why This Matters Right Now
The regulatory environment for AI hiring bias is accelerating, not stabilizing.
NYC Local Law 144 has been in force since July 5, 2023. It set the precedent for bias audits of automated employment decision tools, along with rules for notifying candidates and publishing a summary of the audit results.
Colorado's AI Act takes effect February 2026. California has multiple bills pending. Illinois expanded its requirements. The EU AI Act classifies AI hiring tools as high-risk systems requiring conformity assessments.
Every new jurisdiction adds slightly different requirements: different protected categories to test, different statistical thresholds for adverse impact, different disclosure obligations, different remediation timelines.
If you're using a SaaS vendor, you're dependent on that vendor updating their audit methodology to match every new jurisdiction's requirements. If you operate in ten states with different AI bias laws, you need your vendor to support ten different audit frameworks. And you need to trust they'll do it before you're out of compliance.
If you control the model, you update your bias testing to match new requirements immediately. Your compliance timeline isn't hostage to vendor product development.
The Litigation Risk Acceleration
The Mobley v. Workday litigation shows that AI risk in hiring is not theoretical. Employers adopting these tools should move quickly to ensure their systems are explainable, monitored, and statistically tested.
In Mobley, the court allowed discrimination claims to proceed directly against the AI vendor as an agent of the employers using its tools. But vendor liability doesn't eliminate employer liability. It adds to it.
When employment discrimination litigation involves AI, plaintiffs' attorneys will request:
The complete model architecture and decision logic
Training data composition and validation
All bias audit results and remediation documentation
Feature weights and their job-relatedness justification
Evidence that the system was validated for the specific roles at issue
If you're using a SaaS vendor, you'll request this information from the vendor. The vendor will provide what they're contractually obligated to provide or legally compelled to disclose. Discovery disputes will delay proceedings. Incomplete information will weaken your defense.
If you control the model, you produce this information from your own infrastructure. Your legal team can conduct their own analysis. You can demonstrate job-relatedness using your actual performance data.
The difference matters in litigation outcomes.
The Evidence Standard Is Changing
Table stakes for AI hiring compliance now include maintaining an inventory and classification of tools, setting clear policies and vendor obligations, conducting ongoing monitoring and remediation, and preserving records that support job-relatedness, business necessity, and less-discriminatory-alternatives analyses.
"Our vendor assured us their system is compliant" was never a sufficient defense. It's becoming actively harmful as courts and regulators expect employers to demonstrate direct knowledge of how their AI systems actually work.
The organizations that survive regulatory scrutiny will be the ones that can show:
We inspected the model ourselves
We tested it on our candidates
We validated job-relatedness using our performance outcomes
We monitored continuously for bias
When we found problems, we fixed them immediately and verified the fixes worked
None of this is possible with vendor-controlled black box systems.
What Chief Compliance Officers Can Require Now
If you're evaluating AI hiring vendors and take bias auditing seriously, here's what you should require:
Complete model transparency: The vendor must provide full access to model architecture, feature engineering logic, and decision processes. If they cite intellectual property protection, they're telling you they can't give you what bias audits require.
Employer-controlled audit process: Your team selects the auditors, defines the methodology, and runs tests whenever needed. Vendor-controlled annual audits are insufficient for ongoing compliance.
Direct training data access: You must be able to inspect and validate the data used to train models evaluating your candidates. If the training data includes other companies' patterns, you can't verify job-relatedness for your roles.
Immediate remediation capability: When bias is detected, you must be able to implement and test fixes without waiting for vendor product releases. "We'll fix it in Q3" means operating out of compliance until Q3.
Continuous bias monitoring: Annual audits are obsolete. You need continuous monitoring for adverse impact, model drift, and emerging bias patterns. This requires direct access to the model and candidate data.
Regulatory adaptability: When new bias audit requirements emerge, you must be able to implement them immediately, not wait for vendor compliance updates.
These aren't unreasonable demands. They're basic requirements for demonstrating that your AI hiring system is actually auditable—which is what bias audit laws require.
SaaS vendors can't meet these requirements because their architecture prevents it. On-premise deployment meets them by default because you control the infrastructure.
The procurement question isn't "which vendor has the most impressive third-party audit reports." It's "which architecture actually lets us audit the system ourselves."
NODES deploys AI hiring infrastructure inside your VPC, giving your compliance team direct access to models, training data, and decision logic.
CNO Financial proved it works: quarterly bias testing, zero adverse impact violations, 10-day remediation cycles, legal can audit anytime.
Stop outsourcing bias audits to vendor-paid third parties. Audit the system yourself.
Visit nodes.inc to see how on-premise deployment eliminates the black box problem that makes SaaS AI hiring tools unauditable.