How Behavioral Science Can Improve the Return on AI Investments
- Jonathan H. Westover, PhD
Abstract: Artificial intelligence adoption consistently underdelivers on organizational expectations, with failure rates approaching 95% in some estimates. This article examines why AI investments fail when leaders treat implementation as purely a technical exercise rather than a behavioral change challenge. Drawing on behavioral science research and organizational change management principles, we introduce the Behavioral Human-Centered AI framework—an evidence-based approach that addresses human biases, cognitive shortcuts, and resistance across design, adoption, and management phases. Organizations that ignore fundamental psychological patterns—including loss aversion, algorithm aversion, and escalation of commitment—waste millions on sophisticated systems employees resist or abandon. By contrast, those applying behavioral insights across the full change cycle build AI capabilities that align with how people actually think and work, dramatically improving return on investment and long-term competitive advantage.
The gap between AI's promise and its actual performance in organizations has become impossible to ignore. Despite years of breathless predictions about AI revolutionizing business operations, empirical evidence paints a sobering picture. MIT's NANDA initiative estimated that 95% of AI initiatives fail to deliver intended value, while Boston Consulting Group found that only 26% of companies report tangible ROI from AI investments (De Cremer et al., 2025). For context, these failure rates exceed those of most organizational change initiatives, which already struggle with 60-70% failure rates (Hiatt & Creasey, 2012).
Gartner estimated global AI software revenue would reach $62 billion in 2022, with enterprise spending on AI systems projected to exceed $300 billion annually by 2026 (Columbus, 2021). Yet this massive capital deployment frequently produces minimal organizational value, leaving executives questioning fundamental assumptions about AI implementation.
The central problem is not technological inadequacy. Today's AI systems demonstrate remarkable capabilities across domains from natural language processing to predictive analytics. Rather, the disconnect emerges from a pervasive technosolutionist mindset—the belief that technological sophistication alone will overcome organizational challenges (Morozov, 2013). When leaders view AI adoption primarily as an engineering exercise, they systematically underestimate the behavioral complexities that determine whether sophisticated systems create actual value.
This article argues that successful AI implementation requires applying behavioral science principles across the entire change management cycle. We examine why human cognitive biases derail even well-designed systems, how organizations can design for actual human behavior rather than idealized rationality, and what evidence-based interventions help leaders navigate the psychological terrain of AI adoption. The framework we present—Behavioral Human-Centered AI—offers a systematic approach to aligning AI capabilities with human needs, limitations, and decision-making patterns.
The AI Implementation Landscape
Defining Behavioral Challenges in AI Adoption
AI implementation failures rarely stem from technical deficiencies in isolation. Instead, they emerge from systematic misalignment between how AI systems operate and how humans actually process information, make decisions, and respond to change. Behavioral Human-Centered AI represents an approach that places human cognitive patterns, biases, and needs at the center of AI design, deployment, and management decisions (De Cremer et al., 2025).
This perspective contrasts sharply with conventional implementation models that prioritize technical specifications, computational efficiency, or algorithmic accuracy without adequately considering the psychological factors that shape user engagement. A technically superior AI tool that employees resist, misunderstand, or work around creates zero organizational value, regardless of its theoretical capabilities.
The behavioral lens reveals three critical psychological phenomena that influence AI adoption. First, loss aversion—the well-established finding that losses loom approximately twice as large as equivalent gains in human decision-making—means employees fixate on workflow disruptions, autonomy reductions, or skill obsolescence rather than potential productivity improvements (Kahneman & Tversky, 1979). Second, algorithm aversion describes people's tendency to abandon algorithmic decision support after observing even minor errors, despite algorithms often outperforming human judgment over time (Dietvorst et al., 2015). Third, the availability heuristic causes individuals to overweight vivid, memorable AI failures when evaluating system reliability, even when such failures represent rare outliers (Tversky & Kahneman, 1974).
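The "twice as large" claim can be stated formally. Prospect theory evaluates outcomes with a value function that is steeper for losses than for gains; the parameter values shown below are typical later empirical estimates included only as an illustrative assumption, not figures from the sources cited here:

$$
v(x) =
\begin{cases}
x^{\alpha} & \text{if } x \ge 0 \\
-\lambda\,(-x)^{\beta} & \text{if } x < 0
\end{cases}
\qquad \lambda \approx 2.25,\quad \alpha \approx \beta \approx 0.88
$$

With a loss-aversion coefficient near 2, an anticipated loss (of autonomy, workflow familiarity, or skill relevance) is weighted roughly twice as heavily as an equal-sized anticipated gain, which is why rollouts framed around what employees stand to lose meet outsized resistance.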
These patterns are not cognitive defects to be overcome through training alone. They represent fundamental features of human information processing that have evolved over millennia and resist simple correction through rational persuasion.
State of AI Implementation Practice
Current organizational practices reveal a systematic gap between rhetoric about "people-centered" AI and actual implementation approaches. The Foundry 2023 State of the CIO survey found that while 71% of Chief Information Officers identified themselves as responsible for accelerating AI-driven innovation, only 32% believed they bore responsibility for the broader organizational transformation required for successful adoption (De Cremer et al., 2025). This disconnect reflects the persistent technosolutionist assumption that delivering sophisticated technology constitutes the primary leadership challenge.
Research on healthcare AI adoption illustrates these dynamics. Clinical decision-support tools embedded in electronic health records demonstrate clear benefits for patient outcomes, yet physicians routinely underutilize or circumvent them when alerts disrupt established workflows or add perceived verification time (Chaudhry et al., 2006). The technical superiority of these systems proves insufficient to overcome the behavioral friction created by workflow disruption and additional cognitive demands.
Similarly, studies of algorithmic hiring tools reveal that recruiters often override or selectively apply AI recommendations in ways that undermine system effectiveness, particularly when algorithms challenge existing mental models about candidate quality (Cowgill & Tucker, 2020). The issue extends beyond initial resistance—even after adoption, sustained engagement requires ongoing behavioral support that most organizations fail to provide.
Industry surveys consistently identify "cultural resistance" and "lack of employee buy-in" as primary barriers to AI value realization, yet organizations continue to invest disproportionately in technical capabilities rather than change management infrastructure (Ransbotham et al., 2020). This resource allocation pattern reflects deep-seated assumptions about where implementation challenges truly reside.
Organizational and Individual Consequences of AI Implementation Failures
Organizational Performance Impacts
Failed AI implementations impose substantial direct and opportunity costs on organizations. The direct costs include wasted capital expenditures on unused or abandoned systems, consulting fees for implementation support, and internal labor costs for integration efforts. A 2020 Gartner study estimated that organizations waste an average of $8.5 million annually on AI initiatives that fail to progress beyond pilot phases (Davenport & Mittal, 2022).
Beyond direct costs, failed implementations create opportunity costs through delayed competitive responses and diminished organizational learning. When AI initiatives fail, organizations often become more risk-averse about subsequent AI investments, creating a learned helplessness dynamic that prevents beneficial future adoption (Seligman, 1972). This learned caution can prove particularly costly in industries where AI capabilities increasingly differentiate market leaders from followers.
The reputational consequences extend beyond individual organizations. High-profile AI failures contribute to broader skepticism about AI value propositions, making subsequent change initiatives more difficult across industries. When Amazon abandoned its AI recruiting tool after discovering gender bias, the incident not only cost Amazon directly but heightened scrutiny of algorithmic hiring tools across the technology sector (Dastin, 2018).
Employee productivity suffers during extended, poorly managed AI implementations. Research on technology-induced stress demonstrates that prolonged uncertainty about job roles, repeated system changes, and forced adoption of tools perceived as unhelpful significantly reduce employee engagement and performance (Tarafdar et al., 2007). Organizations implementing AI without adequate behavioral support risk creating exactly these conditions.
Individual Wellbeing and Stakeholder Impacts
For individual employees, poorly managed AI adoption creates psychological strain through multiple mechanisms. Concerns about job displacement, even when unfounded, generate anxiety that impairs cognitive performance and decision quality (Brougham & Haar, 2018). The perceived loss of autonomy when AI systems constrain decision latitude triggers psychological reactance—an oppositional response to perceived freedom restriction (Brehm, 1966).
Studies of AI adoption in professional services reveal that forcing practitioners to use AI tools they distrust or don't understand creates moral distress, particularly when professionals feel pressured to follow AI recommendations that conflict with their own judgment (Grote & Keeling, 2022). This distress is especially acute in healthcare, where physicians report ethical discomfort when pressured to override clinical judgment based on algorithmic suggestions they cannot fully evaluate.
The competence threat posed by AI systems affects employee wellbeing beyond simple job security concerns. When AI performs tasks previously considered skilled human work, it can undermine individuals' sense of professional identity and self-efficacy (Meijerink et al., 2021). Research on automation in accounting and legal services shows that even when jobs remain secure, the redefinition of "valuable" skills creates identity threats that reduce job satisfaction and organizational commitment.
For external stakeholders—customers, patients, citizens—poorly implemented AI can erode trust in organizations and institutions. When AI systems produce discriminatory outcomes, privacy violations, or opaque decisions affecting people's lives, the resulting harm extends far beyond organizational boundaries. The algorithmic bias identified in criminal justice risk assessment tools, for instance, has contributed to wrongful detention and perpetuated racial disparities, imposing profound costs on affected individuals and communities (Angwin et al., 2016).
Evidence-Based Organizational Responses
Table 1: Behavioral Science Bridging the Gap in AI Implementation Return
Organization or Study | AI Application Domain | Behavioral Strategy or Principle Applied | Key Design/Implementation Elements | Reported Outcomes or ROI Metrics | Primary Behavioral Challenge Addressed | Target Audience/Users | Success Factors (Inferred)
--- | --- | --- | --- | --- | --- | --- | ---
Procter & Gamble (P&G) | Consumer Insights Analysis | Inclusive Co-Design and the Endowment Effect | Iterative testing using actual job tasks and prototypes; involved managers throughout development. | 89% adoption rates within six months (vs. 40% historical average for analytics tools). | Cultural resistance and lack of employee buy-in. | Brand Managers | High involvement in the design process created psychological ownership, ensuring the tool addressed actual user pain points like speed over complexity. |
Deloitte | Audit Processes | Framing as Augmentation and Job Redesign | AI handled tedious sampling and anomaly detection; auditors redeployed to high-value advisory work and complex judgment. | 85% auditor adoption within the first year; improved audit quality and engagement scores. | Loss aversion and fear of job displacement or autonomy reduction. | Auditors | Concrete task reallocation made the augmentation promise tangible, reducing the threat to professional identity by focusing humans on strategic work. |
JPMorgan Chase (COiN) | Legal Contract Analysis | Continuous Learning and Multi-stakeholder Feedback | Dedicated feedback channels for lawyers; weekly cross-functional reviews; quarterly performance audits. | Replaced 360,000 hours of manual legal labor; sustained lawyer engagement and high accuracy. | Automation complacency and performance drift. | Lawyers | Rapid refinement based on user feedback maintained trust and system relevance while preventing disengagement or over-reliance on drifting algorithms. |
Walmart | Inventory Management | Gradual Rollout and Small Wins | Initial deployment in high-visibility categories in select stores; focused on clear pain points like stock-outs. | 30% reduction in out-of-stock situations; 10% reduction in excess inventory in pilot stores. | Skepticism and uncertainty about AI value. | Store Managers | Targeting recognizable workflow frustrations created early, visible successes that generated social proof and internal champions for scaling. |
IBM (Watson for Oncology) | Healthcare / Oncology Treatment | Explainability and Perceived Control | Redesigned interface to show supporting evidence and confidence levels for recommendations. | Improved physician acceptance (previously faced resistance due to lack of reasoning). | Algorithm aversion and psychological need for decision rationale. | Physicians and Oncologists | Providing transparent explanations reduced anxiety and met the need for understanding, allowing experts to feel in control rather than dictated to by a "black box". |
Capital One | Fraud Detection | Transparent Communication and Error Normalization | Detailed materials explaining algorithm logic, patterns, and limitations; staff training for customer explanations. | High employee adoption rates and customer acceptance of false-positive friction. | Algorithm aversion and erosion of trust due to opaque decisions. | Customer Service Representatives and Customers | Proactive disclosure of limitations allowed users to calibrate trust realistically, preventing disillusionment from oversold system perfection. |
Unilever | Executive Decision Making / Forecasting | Behavioral Leadership and Experiential Learning | Executive immersion program requiring hands-on exercises with actual company data and AI tools. | Improved AI investment decisions and more sophisticated leadership questioning post-program. | Overconfidence bias and technosolutionist mindset in leadership. | Senior Executives | Direct experience with system capabilities and frustrations reduced overconfidence and shifted focus from technical specs to behavioral adoption plans. |
Microsoft (AETHER Committee) | AI Ethics and Governance | Ethical Oversight and Impact Assessment | Multi-disciplinary committee with authority to block or modify deployments based on bias or privacy risks. | Prevented multiple deployments with unacceptable risks; built employee confidence in ethical standards. | Ethical fading and algorithmic bias. | Technical Teams and Stakeholders | Structural safeguards kept ethical dimensions salient, counteracting the organizational drift toward narrow efficiency metrics under pressure. |
Design for Human Cognitive Architecture
Successful AI implementation begins with design decisions that account for how humans actually process information rather than how idealized rational actors might behave. Explainability represents a critical design element that addresses humans' psychological need to understand decision rationale. Research demonstrates that providing transparent explanations for AI recommendations significantly increases user trust and appropriate reliance, even when those explanations reveal system limitations (Ribeiro et al., 2016).
The key insight is that explainability serves psychological functions beyond mere information provision. When people understand why an AI system reached a particular conclusion, they experience greater perceived control and reduced anxiety about delegating decisions to algorithms (Binns et al., 2018). This psychological benefit persists even when users lack technical expertise to fully evaluate algorithmic logic.
Effective approaches to designing for cognitive architecture include:
Selective friction introduction: Deliberately adding minor obstacles at critical decision points to promote more careful human evaluation. Research on AI transcription tools found that harder-to-read fonts actually improved error detection by slowing down processing and increasing scrutiny (Alter et al., 2007).
Confidence calibration displays: Showing AI uncertainty levels helps users develop appropriate trust by signaling when algorithmic recommendations warrant greater caution versus high confidence (Zhang et al., 2020).
Default option architecture: Leveraging humans' tendency to follow the path of least resistance by making beneficial AI usage the default, while preserving override capability, addresses both adoption and autonomy concerns (Thaler & Sunstein, 2008).
Progressive disclosure of complexity: Revealing system capabilities gradually rather than overwhelming users with comprehensive functionality exploits humans' limited working memory capacity (Miller, 1956).
IBM's Watson for Oncology illustrates both the promise and pitfalls of design choices. The system provides treatment recommendations for cancer patients based on medical literature and clinical data. However, early implementations that presented recommendations without clear reasoning faced physician resistance. When IBM redesigned the interface to show supporting evidence and confidence levels for recommendations, physician acceptance improved substantially (Somashekhar et al., 2018). The technical capability remained constant, but design changes that addressed physicians' psychological needs for understanding and control proved decisive.
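To make the confidence-calibration and evidence-display ideas concrete, here is a minimal Python sketch of how an interface layer might pair a recommendation with its supporting evidence and a calibrated confidence band. The class, field names, and 0.75 review threshold are hypothetical illustrations under stated assumptions, not IBM's actual design.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Recommendation:
    """An AI recommendation plus the context a reviewer needs to calibrate trust."""
    action: str            # e.g., a suggested treatment or decision (hypothetical field)
    confidence: float      # model-reported probability, 0.0-1.0
    evidence: List[str]    # citations or features supporting the suggestion

def render_recommendation(rec: Recommendation, review_threshold: float = 0.75) -> str:
    """Format a recommendation with its confidence band and supporting evidence.

    Low-confidence items are explicitly routed for closer human review,
    keeping the final judgment with the expert user.
    """
    band = "HIGH" if rec.confidence >= review_threshold else "LOW - verify before acting"
    lines = [
        f"Suggested action : {rec.action}",
        f"Model confidence : {rec.confidence:.0%} ({band})",
        "Supporting evidence:",
    ]
    lines += [f"  - {item}" for item in rec.evidence]
    if rec.confidence < review_threshold:
        lines.append("NOTE: confidence below review threshold; independent human judgment required.")
    return "\n".join(lines)

if __name__ == "__main__":
    rec = Recommendation(
        action="Order follow-up imaging",
        confidence=0.62,
        evidence=["Guideline reference (illustrative)", "Similar historical cases: 143"],
    )
    print(render_recommendation(rec))
```

The design choice worth noting is that uncertainty is surfaced rather than hidden: the same algorithmic output reads differently to a physician when it arrives with its evidence and a calibrated confidence band attached.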
Inclusive Co-Design and Iteration Processes
Behavioral research consistently demonstrates that involving end-users in design decisions increases both system quality and user commitment. The endowment effect—people's tendency to value items more highly when they partially own them—extends to technologies that users help create (Kahneman et al., 1990). When employees contribute to AI tool development, they develop psychological ownership that translates into higher engagement and more constructive feedback about system limitations.
Participatory design approaches also surface use cases and constraints that technical teams often miss. A 2020 study examining automated speech recognition systems from Amazon, Apple, Google, IBM, and Microsoft revealed that all five systems made approximately twice as many errors for Black speakers compared to white speakers (Koenecke et al., 2020). This systematic bias reflected training data limitations and testing protocols that failed to incorporate sufficient linguistic diversity. Including more diverse beta-testers during development would have identified these disparities before public deployment.
The challenge lies in structuring co-design processes to genuinely incorporate user input rather than conducting pro forma consultation. Effective strategies include:
Early and sustained engagement: Involving end-users before technology selection decisions are finalized, not just after systems are purchased, prevents locking in solutions misaligned with actual needs.
Diverse user representation: Ensuring beta-testing pools reflect the full range of eventual system users, including edge cases and non-typical usage patterns.
Behavioral expertise integration: Including behavioral scientists in design teams to translate user feedback into specific design modifications that address underlying psychological patterns.
Iterative testing cycles: Conducting multiple rounds of user testing with incremental improvements rather than single large-scale pilots that delay course correction.
Procter & Gamble applied these principles when developing AI tools for consumer insights analysis. Rather than building a system based solely on data scientists' assumptions, P&G involved brand managers throughout development, conducting iterative testing sessions where managers performed actual job tasks using prototypes. This process revealed that managers needed different visualization approaches than data scientists anticipated and valued speed of insight generation over analytical comprehensiveness. The resulting system achieved 89% adoption rates within six months, compared to the company's historical average of 40% for analytics tools (Davenport & Ronanki, 2018).
Transparent Communication About Capabilities and Limitations
Addressing algorithm aversion requires proactive transparency about both AI capabilities and limitations. Research demonstrates that when organizations honestly communicate system constraints upfront, users develop more realistic expectations and display greater tolerance for occasional errors (Dietvorst et al., 2018). Conversely, overselling AI capabilities sets unrealistic expectations, so that any system failure dramatically undermines trust.
The paradox is that highlighting limitations can actually increase adoption when done appropriately. Studies of healthcare AI show that when physicians received detailed information about an AI system's error patterns, limitations, and the safeguards in place, they reported higher trust and willingness to use the tool compared to minimal disclosure conditions (Jacobs et al., 2021). This counterintuitive finding reflects humans' appreciation for honesty and their sophisticated ability to calibrate trust when given adequate information.
Effective transparency strategies include:
Proactive limitation disclosure: Explicitly describing circumstances where AI performs poorly or where human judgment should override algorithmic recommendations, rather than waiting for users to discover limitations through negative experience.
Error normalization: Framing AI mistakes as comparable to human errors rather than suggesting algorithms should achieve perfection, which helps users develop appropriate expectations (Madhavan & Wiegmann, 2007).
Safeguard communication: Clearly explaining oversight mechanisms, human review processes, and other protections against AI errors to reduce perceived risk.
Use case specificity: Defining precisely which tasks AI handles well versus where human judgment remains superior, helping users understand appropriate delegation boundaries.
Capital One applied these principles when deploying AI-powered fraud detection systems. Rather than simply implementing the technology, the company created detailed communication materials explaining how the algorithms worked, what patterns triggered alerts, known limitation areas, and the human review processes for flagged transactions. Customer service representatives received training on explaining system logic to concerned customers. This transparency approach contributed to both high employee adoption rates and customer acceptance of occasional false-positive friction in the fraud detection process (Davenport & Ronanki, 2018).
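One lightweight way to operationalize the proactive-limitation-disclosure and safeguard-communication strategies above is a structured disclosure record that also generates the plain-language script a service representative reads. The sketch below is purely illustrative; the fields and figures are hypothetical assumptions, not Capital One's actual materials.

```python
# A minimal sketch of a "limitation disclosure" record for an AI fraud-alert model.
# All fields and figures are hypothetical illustrations, not any vendor's documentation.
DISCLOSURE = {
    "system": "fraud-alert-model",
    "handles_well": ["card-present transactions", "known merchant categories"],
    "known_limitations": [
        "Higher false-positive rate on first-time international purchases",
        "Limited history for newly opened accounts",
    ],
    "expected_false_positive_rate": 0.03,   # illustrative figure
    "human_review": "All declined transactions are queued for analyst review within 24 hours.",
}

def explain_alert_to_customer(disclosure: dict) -> str:
    """Turn the disclosure record into a plain-language script a service rep can read."""
    limits = "; ".join(disclosure["known_limitations"])
    return (
        "Our automated system flags unusual activity to protect your account. "
        f"It is not perfect (known limitations: {limits.lower()}), "
        f"so {disclosure['human_review'].lower()} "
        "You can always ask for a human review of any decision."
    )

print(explain_alert_to_customer(DISCLOSURE))
```

Keeping the disclosure in one machine-readable place means the same record can feed training materials, customer scripts, and audit documentation, so the stated limitations cannot quietly drift apart across channels.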
Framing AI as Augmentation Rather Than Replacement
How organizations frame AI's organizational role profoundly influences employee receptivity. When AI is presented primarily as an automation technology that will replace human workers, loss aversion and threat responses dominate employee reactions (Huang & Rust, 2018). By contrast, framing AI as augmentation that enhances human capabilities by handling routine tasks and freeing capacity for higher-value work triggers opportunity-oriented responses.
This framing difference is not mere semantic manipulation. Research demonstrates that AI creates more organizational value when deployed to augment human judgment rather than fully automate decisions, particularly for complex, context-dependent tasks (Raisch & Krakowski, 2021). The augmentation frame therefore reflects both psychological wisdom and operational reality.
The critical implementation challenge involves making augmentation concrete through job redesign rather than leaving it as abstract rhetoric. When organizations articulate specifically which routine tasks AI will handle and how liberated time will be redirected toward valued activities, employees perceive genuine opportunity rather than euphemistic job elimination (Kolbjørnsrud et al., 2016).
Effective augmentation framing approaches include:
Concrete task reallocation: Specifying which specific activities AI will assume and documenting how this creates capacity for work employees find more meaningful or strategic.
Skill evolution pathways: Defining how employee roles will develop as AI handles certain tasks, providing clear career progression that incorporates AI collaboration rather than displacement.
Human-in-the-loop emphasis: Structuring AI systems to support human decision-making rather than bypassing human involvement entirely, preserving meaningful human agency.
Outcome accountability clarity: Maintaining human accountability for final decisions even when AI provides recommendations, which addresses autonomy concerns while leveraging algorithmic support.
Deloitte restructured audit processes by deploying AI to handle transaction sampling, pattern detection, and anomaly flagging—tasks auditors found tedious—while redeploying auditor time toward client advisory work and complex judgment calls that required contextual understanding. The firm explicitly communicated this reallocation, showed auditors how AI freed capacity for more engaging work, and measured both efficiency gains and auditor satisfaction. The result was 85% auditor adoption within the first year and measurable improvements in both audit quality and employee engagement scores (Kokina & Davenport, 2017).
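A minimal sketch of the triage pattern this kind of redesign implies: the model does the tedious screening, anything it flags is routed to a human, and the final decision is always recorded against a named person. The class names, the 0.8 threshold, and the sample data are assumptions for illustration, not Deloitte's actual system.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Transaction:
    txn_id: str
    amount: float
    anomaly_score: float        # produced by the AI model, 0.0-1.0

@dataclass
class AuditItem:
    txn: Transaction
    ai_flagged: bool
    reviewer: Optional[str] = None        # human who owns the final call
    final_decision: Optional[str] = None

def triage(transactions: List[Transaction], flag_threshold: float = 0.8) -> List[AuditItem]:
    """AI handles the tedious screening; anything flagged goes to a human for judgment."""
    return [AuditItem(txn=t, ai_flagged=t.anomaly_score >= flag_threshold) for t in transactions]

def sign_off(item: AuditItem, reviewer: str, decision: str) -> AuditItem:
    """Record the human reviewer and decision so accountability stays with a person."""
    item.reviewer = reviewer
    item.final_decision = decision
    return item

if __name__ == "__main__":
    items = triage([Transaction("T-1001", 25_000.0, 0.93), Transaction("T-1002", 120.0, 0.10)])
    flagged = [i for i in items if i.ai_flagged]
    sign_off(flagged[0], reviewer="A. Auditor", decision="escalate to client advisory")
```

The point of the sign-off step is behavioral as much as procedural: it makes the human-in-the-loop and outcome-accountability commitments visible in the workflow rather than leaving them as policy statements.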
Gradual Rollout and Demonstrated Quick Wins
Behavioral science research on change management emphasizes the value of early, visible successes in building momentum for larger transformations. The small wins approach—targeting initial efforts toward achievable improvements that demonstrate value—creates psychological and political capital for sustained change efforts (Weick, 1984). This principle applies directly to AI implementation, where gradual rollouts with demonstrated benefits overcome resistance more effectively than comprehensive, high-risk deployments.
Quick wins serve multiple psychological functions. They reduce uncertainty about AI value, provide concrete evidence against skepticism, and create positive associations that counteract algorithm aversion. When employees witness peers achieving tangible benefits from AI tools, social proof dynamics encourage broader adoption (Cialdini, 2009).
The challenge lies in selecting initial use cases that balance achievability with meaningfulness. Trivial applications that succeed technically but don't address genuine pain points fail to generate enthusiasm, while overly ambitious pilots that stumble reinforce skepticism.
Strategic approaches to gradual rollout include:
Pain point prioritization: Targeting initial AI applications toward widely recognized workflow frustrations where solutions will be immediately appreciated.
Visible beneficiary selection: Beginning with departments or teams whose success will be noticed across the organization, creating social proof effects.
Success metric clarity: Defining specific, measurable improvements from initial deployments that can be communicated broadly.
Lessons-learned integration: Treating early deployments as learning opportunities that inform refinements before scaling, which both improves systems and demonstrates organizational responsiveness to feedback.
Walmart applied this approach when implementing AI for inventory management. Rather than deploying across all stores simultaneously, the company began with a limited set of high-visibility product categories in select stores. The initial deployment focused on items where stock-outs created clear customer frustration and lost sales. After demonstrating 30% reduction in out-of-stock situations and 10% reduction in excess inventory in pilot stores, Walmart gradually expanded to additional categories and locations (Davenport & Ronanki, 2018). This measured approach allowed refinement based on store manager feedback and created internal champions who advocated for broader adoption based on direct experience.
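Success metric clarity can be as simple as agreeing, before the pilot starts, on exactly how the headline numbers will be computed. The sketch below shows one such calculation; the input figures are illustrative placeholders chosen to reproduce the reported 30% and 10% reductions, not Walmart's actual data.

```python
def pct_reduction(baseline: float, pilot: float) -> float:
    """Percentage reduction relative to the baseline rate."""
    return (baseline - pilot) / baseline * 100

# Illustrative pilot metrics (not Walmart's actual data): rates before and during the pilot.
baseline = {"out_of_stock_rate": 0.10, "excess_inventory_share": 0.20}
pilot    = {"out_of_stock_rate": 0.07, "excess_inventory_share": 0.18}

for metric in baseline:
    change = pct_reduction(baseline[metric], pilot[metric])
    print(f"{metric}: {change:.0f}% reduction")   # 30% and 10% with these illustrative inputs
```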
Building Long-Term AI Governance and Capability
Developing AI-Savvy Leadership
Long-term AI success requires leadership capable of recognizing their own cognitive biases and limitations regarding technology. Research on overconfidence bias shows that leaders frequently overestimate their understanding of technologies they don't directly use, leading to poor resource allocation and inadequate oversight (Moore & Healy, 2008). For AI specifically, this manifests as underestimating implementation complexity and assuming technical sophistication alone will overcome organizational resistance.
The executive education challenge is substantial. Most senior leaders developed expertise in business domains when AI capabilities were limited, creating knowledge gaps about current possibilities and constraints. Yet overcoming these gaps requires more than technical training. Leaders need to understand AI's behavioral and organizational implications, not just its computational mechanisms.
Developing AI-savvy leadership involves:
Behavioral change management competency: Training leaders to recognize and address resistance, communicate transparently about AI implications, and model appropriate AI usage rather than simply mandate adoption.
Bias awareness and mitigation: Helping leaders identify their own overconfidence, escalation of commitment tendencies, and other biases that impair AI investment decisions.
Expert network cultivation: Encouraging leaders to build relationships with both internal AI experts who understand specific business applications and external consultants who provide independent perspectives on implementation approaches.
Hands-on AI experience: Requiring leaders to directly use AI tools relevant to their functions rather than merely receiving reports about AI initiatives, which builds intuitive understanding of both capabilities and limitations.
Unilever implemented an executive AI immersion program requiring all senior leaders to complete hands-on exercises using AI tools for tasks like consumer sentiment analysis and demand forecasting. Rather than abstract lectures, executives worked with actual company data to experience both AI capabilities and frustrations firsthand. This experiential approach proved more effective than traditional training at building realistic expectations and appreciation for implementation challenges. Post-program assessments showed significantly improved AI investment decisions, with leaders asking more sophisticated questions about behavioral adoption plans rather than focusing exclusively on technical specifications (Ransbotham et al., 2020).
Creating Continuous Learning Systems
AI technologies evolve rapidly, creating performance drift risks where initially well-calibrated systems become less accurate over time as conditions change. Sustaining AI value therefore requires continuous learning systems that monitor performance, gather user feedback, and implement refinements (Sculley et al., 2015). From a behavioral perspective, these systems must address humans' tendency toward complacency once initial adoption occurs.
The automation complacency phenomenon—people's tendency to over-rely on automated systems without adequate monitoring—poses particular risks for AI applications (Parasuraman & Manzey, 2010). When users become overly confident in AI recommendations, they may fail to detect when system performance degrades or when edge cases arise that algorithms handle poorly.
Effective continuous learning systems must therefore balance encouraging appropriate AI usage with maintaining healthy skepticism and ongoing oversight. This requires deliberate design of feedback mechanisms, performance monitoring, and refinement processes.
Key components of continuous learning systems include:
Multi-stakeholder feedback channels: Creating accessible mechanisms for end-users, managers, and affected stakeholders to report concerns, suggest improvements, and flag unexpected AI behaviors.
Regular performance auditing: Systematically evaluating AI system accuracy across user subgroups to detect performance drift or emerging biases before they create significant harm.
Rapid refinement cycles: Establishing processes to quickly implement improvements based on user feedback rather than allowing concerns to accumulate until scheduled major updates.
Transparent change communication: Clearly explaining system modifications to users, which maintains trust by demonstrating organizational responsiveness and helps users recalibrate expectations.
JPMorgan Chase developed a continuous learning system for its AI-powered contract analysis tool, COiN (Contract Intelligence). The bank created dedicated feedback channels where lawyers could flag unusual contracts or question AI interpretations. A cross-functional team reviewed this feedback weekly, implementing refinements and communicating changes back to users. The bank also conducted quarterly audits examining COiN performance across different contract types and client categories. This systematic learning approach allowed JPMorgan to maintain high accuracy rates even as contract language evolved and new client needs emerged, while simultaneously sustaining lawyer engagement with the tool (Son, 2017).
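Below is a minimal sketch of the kind of subgroup performance audit described above: tally accuracy per user subgroup for each audit period and flag any group whose accuracy drops by more than a tolerance between periods. The record shape, function names, and five-point tolerance are assumptions for illustration, not JPMorgan's actual pipeline.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each record: (period, subgroup, prediction_correct) — an illustrative data shape only.
Record = Tuple[str, str, bool]

def accuracy_by_group(records: List[Record]) -> Dict[Tuple[str, str], float]:
    """Accuracy for each (period, subgroup) cell."""
    hits, totals = defaultdict(int), defaultdict(int)
    for period, group, correct in records:
        totals[(period, group)] += 1
        hits[(period, group)] += int(correct)
    return {key: hits[key] / totals[key] for key in totals}

def flag_drift(acc: Dict[Tuple[str, str], float], prev: str, curr: str,
               tolerance: float = 0.05) -> List[str]:
    """Flag subgroups whose accuracy dropped by more than `tolerance` between two audit periods."""
    groups = {g for (_, g) in acc}
    return [
        g for g in groups
        if (prev, g) in acc and (curr, g) in acc and acc[(prev, g)] - acc[(curr, g)] > tolerance
    ]
```

Running a check like this quarterly, as the COiN example describes, converts "watch for performance drift" from an intention into a recurring, reportable number that counteracts automation complacency.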
Embedding Ethical AI Principles and Oversight
Sustaining AI value over time requires governance frameworks that address algorithmic bias, fairness concerns, and unintended consequences that undermine stakeholder trust. The organizational challenge extends beyond establishing policies to creating cultures where ethical considerations genuinely influence AI deployment decisions rather than being treated as compliance exercises.
Behavioral research on ethical fading—the psychological process by which ethical dimensions of decisions become less salient under time pressure or competing priorities—demonstrates why strong governance structures matter (Tenbrunsel & Messick, 2004). Without deliberate mechanisms to keep ethical considerations visible, organizations drift toward optimizing narrow technical or financial metrics while neglecting broader stakeholder impacts.
Effective AI ethics governance requires both structural safeguards and cultural norms that legitimize raising concerns about algorithmic fairness, privacy implications, or discriminatory impacts. Research shows that employees hesitate to voice ethical concerns when they perceive leaders as primarily focused on efficiency or revenue outcomes (Morrison, 2014).
Components of robust AI ethics governance include:
Diverse ethics review boards: Establishing oversight groups with varied expertise and stakeholder representation to evaluate AI systems for fairness, bias, and unintended consequences before deployment.
Impact assessment protocols: Requiring systematic analysis of how AI systems might affect different user groups, with particular attention to potential disparate impacts on vulnerable populations.
Whistleblower protections: Creating safe channels for employees to raise concerns about AI systems producing biased outcomes or being deployed inappropriately without fear of retaliation.
Regular ethics audits: Periodically examining deployed AI systems for emerging bias patterns, privacy concerns, or stakeholder impacts that weren't apparent during initial development.
Microsoft established an AI Ethics Committee (AETHER) that reviews significant AI deployments for ethical implications before launch. The committee includes technical experts, ethicists, social scientists, and business leaders who evaluate systems for potential bias, privacy concerns, and societal impact. Importantly, the committee has authority to require modifications or even prevent deployment of AI systems that raise significant ethical concerns. This governance structure prevented several deployments that posed unacceptable fairness or privacy risks while building employee confidence that ethical considerations meaningfully influence AI decisions (Hao, 2019).
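One concrete check an impact-assessment protocol might include is a selection-rate comparison across affected groups. The sketch below applies the common "four-fifths" heuristic from US employment-selection guidance as a simple screening rule; the function names and counts are illustrative assumptions, and real assessments pair a screen like this with statistical testing and qualitative review.

```python
def selection_rates(outcomes: dict) -> dict:
    """outcomes maps group -> (selected_count, total_count); returns selection rate per group."""
    return {group: selected / total for group, (selected, total) in outcomes.items()}

def disparate_impact_flags(outcomes: dict, threshold: float = 0.8) -> list:
    """Flag groups whose selection rate falls below `threshold` times the best group's rate.

    The 0.8 threshold mirrors the common "four-fifths" heuristic; it is a screen,
    not a verdict, and should trigger deeper review rather than an automatic pass/fail.
    """
    rates = selection_rates(outcomes)
    best = max(rates.values())
    return [group for group, rate in rates.items() if rate < threshold * best]

# Illustrative counts only.
print(disparate_impact_flags({"group_a": (50, 100), "group_b": (30, 100)}))  # ['group_b']
```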
Conclusion
The persistent failure of AI implementations to deliver promised value stems fundamentally from treating adoption as a technical challenge rather than a behavioral one. Organizations continue to invest disproportionately in algorithmic sophistication while systematically underestimating the psychological and organizational factors that determine whether sophisticated systems create actual value. This misallocation reflects deep technosolutionist assumptions that better technology alone will overcome human resistance, bias, and inertia.
The evidence reviewed here demonstrates that successful AI adoption requires systematic application of behavioral science principles across design, implementation, and management phases. Designing for human cognitive architecture rather than idealized rationality, transparently communicating both capabilities and limitations, framing AI as augmentation rather than replacement, and developing governance systems that address ethical concerns all represent evidence-based interventions that dramatically improve adoption outcomes.
The practical implications are clear. Leaders must recognize their own biases toward technological optimism and escalation of commitment, invest in behavioral expertise alongside technical capabilities, and measure adoption success through employee trust and appropriate usage metrics rather than system deployment alone. Organizations that embrace AI as fundamentally a change management challenge will achieve the productivity gains and competitive advantages that continue to elude those pursuing purely technical solutions.
The path forward requires humility about AI limitations, transparency about implementation challenges, and sustained attention to the human factors that determine whether sophisticated algorithms create organizational value or expensive shelfware. For organizations willing to make this behavioral shift, the returns on AI investments can finally match the technology's considerable promise.
References
Alter, A. L., Oppenheimer, D. M., Epley, N., & Eyre, R. N. (2007). Overcoming intuition: Metacognitive difficulty activates analytic reasoning. Journal of Experimental Psychology: General, 136(4), 569-576.
Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). Machine bias: There's software used across the country to predict future criminals. And it's biased against blacks. ProPublica, 23(2016), 77-91.
Binns, R., Van Kleek, M., Veale, M., Lyngs, U., Zhao, J., & Shadbolt, N. (2018). 'It's reducing a human being to a percentage': Perceptions of justice in algorithmic decisions. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 1-14.
Brehm, J. W. (1966). A theory of psychological reactance. Academic Press.
Brougham, D., & Haar, J. (2018). Smart technology, artificial intelligence, robotics, and algorithms (STARA): Employees' perceptions of our future workplace. Journal of Management & Organization, 24(2), 239-257.
Chaudhry, B., Wang, J., Wu, S., Maglione, M., Mojica, W., Roth, E., Morton, S. C., & Shekelle, P. G. (2006). Systematic review: Impact of health information technology on quality, efficiency, and costs of medical care. Annals of Internal Medicine, 144(10), 742-752.
Cialdini, R. B. (2009). Influence: Science and practice (5th ed.). Pearson Education.
Columbus, L. (2021). Gartner predicts AI software revenue to reach $62B in 2022. Forbes.
Cowgill, B., & Tucker, C. E. (2020). Algorithmic fairness and economics. Columbia Business School Research Paper.
Dastin, J. (2018). Amazon scraps secret AI recruiting tool that showed bias against women. Reuters.
Davenport, T. H., & Mittal, N. (2022). All in on AI: How smart companies win big with artificial intelligence. Harvard Business Press.
Davenport, T. H., & Ronanki, R. (2018). Artificial intelligence for the real world. Harvard Business Review, 96(1), 108-116.
De Cremer, D., Schweitzer, S., McGuire, J. J., & Narayanan, D. (2025). How behavioral science can improve the return on AI investments. Harvard Business Review.
Dietvorst, B. J., Simmons, J. P., & Massey, C. (2015). Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General, 144(1), 114-126.
Dietvorst, B. J., Simmons, J. P., & Massey, C. (2018). Overcoming algorithm aversion: People will use imperfect algorithms if they can (even slightly) modify them. Management Science, 64(3), 1155-1170.
Grote, T., & Keeling, G. (2022). On algorithmic fairness in medical practice. Cambridge Quarterly of Healthcare Ethics, 31(1), 83-94.
Hao, K. (2019). This is how AI bias really happens—and why it's so hard to fix. MIT Technology Review.
Hiatt, J., & Creasey, T. J. (2012). Change management: The people side of change (2nd ed.). Prosci.
Huang, M. H., & Rust, R. T. (2018). Artificial intelligence in service. Journal of Service Research, 21(2), 155-172.
Jacobs, M., Pradier, M. F., McCoy, T. H., Perlis, R. H., Doshi-Velez, F., & Gajos, K. Z. (2021). How machine-learning recommendations influence clinician treatment selections: The example of antidepressant selection. Translational Psychiatry, 11(1), 1-9.
Kahneman, D., Knetsch, J. L., & Thaler, R. H. (1990). Experimental tests of the endowment effect and the Coase theorem. Journal of Political Economy, 98(6), 1325-1348.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263-291.
Koenecke, A., Nam, A., Lake, E., Nudell, J., Quartey, M., Mengesha, Z., Toups, C., Rickford, J. R., Jurafsky, D., & Goel, S. (2020). Racial disparities in automated speech recognition. Proceedings of the National Academy of Sciences, 117(14), 7684-7689.
Kokina, J., & Davenport, T. H. (2017). The emergence of artificial intelligence: How automation is changing auditing. Journal of Emerging Technologies in Accounting, 14(1), 115-122.
Kolbjørnsrud, V., Amico, R., & Thomas, R. J. (2016). How artificial intelligence will redefine management. Harvard Business Review, 2(November), 1-6.
Madhavan, P., & Wiegmann, D. A. (2007). Similarities and differences between human–human and human–automation trust: An integrative review. Theoretical Issues in Ergonomics Science, 8(4), 277-301.
Meijerink, J., Boons, M., Keegan, A., & Marler, J. (2021). Algorithmic human resource management: Synthesizing developments and cross-disciplinary insights on digital HRM. The International Journal of Human Resource Management, 32(12), 2545-2562.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81-97.
Moore, D. A., & Healy, P. J. (2008). The trouble with overconfidence. Psychological Review, 115(2), 502-517.
Morozov, E. (2013). To save everything, click here: The folly of technological solutionism. PublicAffairs.
Morrison, E. W. (2014). Employee voice and silence. Annual Review of Organizational Psychology and Organizational Behavior, 1(1), 173-197.
Parasuraman, R., & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Human Factors, 52(3), 381-410.
Raisch, S., & Krakowski, S. (2021). Artificial intelligence and management: The automation–augmentation paradox. Academy of Management Review, 46(1), 192-210.
Ransbotham, S., Kiron, D., Gerbert, P., & Reeves, M. (2020). Expanding AI's impact with organizational learning. MIT Sloan Management Review.
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). "Why should I trust you?" Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135-1144.
Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner, D., Chaudhary, V., Young, M., Crespo, J. F., & Dennison, D. (2015). Hidden technical debt in machine learning systems. Advances in Neural Information Processing Systems, 28, 2503-2511.
Seligman, M. E. (1972). Learned helplessness. Annual Review of Medicine, 23(1), 407-412.
Somashekhar, S. P., Sepúlveda, M. J., Puglielli, S., Norden, A. D., Shortliffe, E. H., Rohit Kumar, C., Rauthan, A., Arun Kumar, N., Patil, P., Rhee, K., & Ramya, Y. (2018). Watson for oncology and breast cancer treatment recommendations: Agreement with an expert multidisciplinary tumor board. Annals of Oncology, 29(2), 418-423.
Son, H. (2017). JPMorgan software does in seconds what took lawyers 360,000 hours. Bloomberg.
Tarafdar, M., Tu, Q., Ragu-Nathan, B. S., & Ragu-Nathan, T. S. (2007). The impact of technostress on role stress and productivity. Journal of Management Information Systems, 24(1), 301-328.
Tenbrunsel, A. E., & Messick, D. M. (2004). Ethical fading: The role of self-deception in unethical behavior. Social Justice Research, 17(2), 223-236.
Thaler, R. H., & Sunstein, C. R. (2008). Nudge: Improving decisions about health, wealth, and happiness. Yale University Press.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124-1131.
Weick, K. E. (1984). Small wins: Redefining the scale of social problems. American Psychologist, 39(1), 40-49.
Zhang, Y., Liao, Q. V., & Bellamy, R. K. (2020). Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making. Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 295-305.

Jonathan H. Westover, PhD is Chief Academic & Learning Officer (HCI Academy); Associate Dean and Director of HR Programs (WGU); Professor, Organizational Leadership (UVU); OD/HR/Leadership Consultant (Human Capital Innovations).
Suggested Citation: Westover, J. H. (2025). How Behavioral Science Can Improve the Return on AI Investments. Human Capital Leadership Review, 30(2). doi.org/10.70175/hclreview.2020.30.2.3