
Clio: Privacy-Preserving Insights into Real-World AI Use

Abstract: This paper presents Clio (Claude insights and observations), a privacy-preserving platform that uses AI assistants to analyze and surface aggregated usage patterns across millions of conversations without requiring human reviewers to read raw user data. The system addresses a critical gap in understanding how AI assistants are used in practice while maintaining robust privacy protections through multiple layers of safeguards. We validate Clio's accuracy through extensive evaluations, demonstrating 94% accuracy in reconstructing ground-truth topic distributions and achieving undetectable levels of private information in final outputs through empirical privacy auditing. Applied to one million Claude.ai conversations, Clio reveals that coding, writing, and research tasks dominate usage, with significant cross-language variations—for example, Japanese conversations discuss elder care at higher rates than other languages. We demonstrate Clio's utility for safety purposes by identifying coordinated abuse attempts, monitoring for unknown risks during high-stakes periods like capability launches and elections, and improving existing safety classifiers. By enabling scalable analysis of real-world AI usage while preserving privacy, Clio provides an empirical foundation for AI safety and governance.

Despite widespread interest in AI's societal impact, remarkably little public data exists about how AI assistants are actually used in practice. Which capabilities see real adoption? How does usage vary across cultures? Which anticipated benefits and risks materialize in concrete data? This knowledge gap persists despite model providers having access to usage data that could answer these questions, primarily due to four fundamental challenges.


First, users share sensitive personal and business information with AI systems, creating tension between privacy protection and the provider's need to understand system usage. Second, having humans review conversations raises ethical concerns due to the repetitive nature of the task and potentially distressing content reviewers might encounter. Third, competitive pressures discourage providers from releasing usage data that could reveal information about their user base to competitors, even when such disclosure would serve the public interest. Finally, the sheer scale—millions of daily messages—makes manual review impractical.


Clio addresses these challenges by using AI assistants themselves to surface aggregated insights across millions of model interactions while preserving user privacy. Similar to how Google Trends provides aggregate insights about web search behavior without exposing individual queries, Clio reveals patterns about how AI assistants are used in the real world. The system transforms raw conversations through a multi-stage pipeline: extracting key facets (like conversation topic or language), clustering semantically similar conversations, generating privacy-preserving cluster descriptions, and organizing clusters into a navigable hierarchy. This approach enables discovering both specific patterns of interest and unknown unknowns through an interactive visualization interface.
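
To make the pipeline concrete, the sketch below outlines a Clio-style flow in Python: extract a privacy-conscious facet summary for each conversation, embed and cluster the summaries, then describe each cluster in aggregate. It is an illustrative approximation rather than Anthropic's implementation; `extract_facet` and `describe_clusters` are hypothetical stand-ins for LLM calls, and the embedding and clustering choices (sentence-transformers, k-means) are assumptions.

```python
from dataclasses import dataclass

from sentence_transformers import SentenceTransformer  # assumed embedding model
from sklearn.cluster import KMeans


@dataclass
class Conversation:
    id: str
    messages: list[str]


def extract_facet(conv: Conversation) -> str:
    """Hypothetical LLM call: return a short task summary of the conversation
    with names, contact details, and other identifying information omitted."""
    raise NotImplementedError("replace with a call to an LLM of your choice")


def cluster_facets(summaries: list[str], n_clusters: int = 50) -> list[int]:
    """Embed facet summaries and group semantically similar conversations."""
    embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(summaries)
    return KMeans(n_clusters=n_clusters, random_state=0).fit_predict(embeddings).tolist()


def describe_clusters(summaries: list[str], labels: list[int]) -> dict[int, list[str]]:
    """Group summaries by cluster; a real system would pass each group to an
    LLM instructed to produce a name and description free of private details."""
    clusters: dict[int, list[str]] = {}
    for summary, label in zip(summaries, labels):
        clusters.setdefault(label, []).append(summary)
    return clusters
```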


Why Clio Matters Now

As AI systems become more capable and integrated into society, the need for empirical understanding of their real-world use intensifies. Pre-deployment testing—including red-teaming and benchmark evaluations—remains crucial but cannot capture all real-world usage patterns and emergent risks. Post-deployment monitoring provides an essential complement by surfacing patterns that predetermined scenarios might miss. These insights can inform future pre-deployment tests, creating a virtuous cycle between empirical observation and proactive safeguards.


The timing is particularly critical. AI assistants now handle increasingly sensitive tasks, from medical advice to financial planning to legal research. Without systematic understanding of actual usage, we risk developing governance frameworks disconnected from reality, safety measures that address hypothetical rather than actual risks, and product improvements that miss what users genuinely need. Clio represents one approach to privacy-preserving insight at scale, contributing to an emerging culture of empirical transparency in AI development.


The AI Usage Understanding Landscape

Defining Privacy-Preserving Analytics in the AI Context


Traditional approaches to understanding technology usage face unique challenges when applied to AI assistants. Unlike analyzing web search queries or application usage logs, AI conversations often contain extended, context-rich exchanges that may include deeply personal information, proprietary business data, creative works, and sensitive decision-making processes. This richness makes conversations valuable for understanding impact but also heightens privacy concerns.


Privacy-preserving analytics must therefore balance two competing objectives: generating actionable insights about system usage while protecting individual privacy. Formal privacy frameworks like differential privacy and k-anonymity provide strong theoretical guarantees but prove difficult to apply to rich textual outputs. Differential privacy adds noise to query results to prevent inference about individual records, but determining appropriate noise levels for natural language descriptions remains an open challenge. K-anonymity ensures each record is indistinguishable from at least k-1 others, but defining "indistinguishability" for complex conversational data lacks clear metrics.
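
As a point of reference for how these formal frameworks operate on simple aggregates, the snippet below adds Laplace noise to a count query, the standard mechanism for epsilon-differential privacy on counting queries with sensitivity one. It is illustrative only: Clio itself relies on layered empirical safeguards rather than formal differential privacy guarantees, and extending this kind of noise calibration to free-text cluster descriptions is precisely the open challenge described above.

```python
import numpy as np


def dp_count(true_count: int, epsilon: float = 1.0) -> float:
    """Epsilon-differentially-private count: a counting query has sensitivity 1,
    so Laplace noise with scale 1/epsilon masks any single record's contribution."""
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)


# Example: report how many conversations fall in a topic, with privacy noise added.
noisy_topic_count = dp_count(true_count=1423, epsilon=0.5)
```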


Clio takes a defense-in-depth approach, implementing multiple privacy layers that collectively reduce private information exposure to undetectable levels in empirical evaluations. This statistical validation approach complements formal guarantees by measuring actual privacy preservation in practice rather than relying solely on theoretical bounds. The system explicitly defines private information broadly—encompassing not just individual identifiers but also information that could identify small groups or specific organizations, recognizing that group privacy violations can be as concerning as individual ones.


State of Practice: Current Approaches to AI Usage Analysis


Existing approaches to understanding AI usage fall into several categories, each with limitations that motivate Clio's development. Public datasets like WildChat and LMSYS-Chat-1M provide valuable windows into AI usage but suffer from selection bias—they capture users willing to interact with AI through specific platforms (often for free access) rather than representative samples of mainstream usage. These datasets reveal important patterns, such as coding dominating 15-25% of conversations across platforms, but cannot answer questions about how paid users behave differently or how usage evolves over time on production systems.


Academic research on crowdworker-generated datasets (like the Anthropic Red Team dataset and Stanford Human Preferences Dataset) offers controlled insights into specific scenarios but lacks the ecological validity of real-world usage. Crowdworkers following instructions to explore model capabilities produce different interaction patterns than users genuinely trying to accomplish tasks. This gap between research scenarios and actual usage limits the applicability of findings to real-world safety and product decisions.


Some providers share high-level usage statistics or case studies, but these typically offer limited granularity and lack systematic methodology for identifying patterns. Anecdotal evidence from user forums and social media provides qualitative insights but cannot quantify prevalence or identify unknown patterns at scale. The field has lacked a systematic, privacy-preserving approach to analyzing production usage data—the gap Clio aims to fill.


Privacy, Ethics, and Competitive Dynamics


The reluctance to analyze and share usage data reflects genuine tensions rather than mere unwillingness. Privacy concerns are paramount: users reasonably expect their conversations to remain confidential, and analyzing them—even in aggregate—requires careful consideration. Ethical issues extend beyond privacy to worker wellbeing: human reviewers who manually examine potentially disturbing content can experience psychological harm, raising questions about the ethics of such review processes.


Competitive dynamics create additional complexity. Usage patterns reveal valuable information about which features drive engagement, which user segments find value, and which competitors might be gaining traction. Sharing this intelligence could advantage competitors and, by weakening a provider's market position, reduce its capacity to invest in AI safety research. However, these competitive concerns must be weighed against the public interest in understanding AI's societal impact.


Clio attempts to navigate these tensions by prioritizing privacy through technical safeguards, reducing human exposure to potentially disturbing content through AI-mediated analysis, and sharing insights that serve the public interest even when they might reveal competitively sensitive information. This approach recognizes that model providers have both capabilities and responsibilities that extend beyond immediate commercial interests.


Organizational and Individual Consequences of AI Assistant Usage

Understanding Organizational Adoption Patterns


Organizations adopt AI assistants across a remarkably diverse range of use cases, from automating routine tasks to augmenting complex decision-making. Clio's analysis reveals that coding-related tasks dominate usage, with "Web and mobile application development" representing over 10% of all conversations in our Claude.ai sample. This finding suggests that AI assistants have achieved significant penetration in software development workflows, potentially accelerating development cycles and lowering barriers to entry for programming.


Writing and communication tasks comprise another major category, including professional email drafting, document creation, and content editing. This usage pattern indicates that AI assistants serve as cognitive tools for knowledge workers, potentially increasing productivity in communication-intensive roles. Research and educational uses—representing 6-10% of usage—suggest that AI assistants function as learning aids and research accelerators, raising important questions about their impact on educational outcomes and research quality.


The prevalence of specific use cases varies across linguistic communities, revealing cultural and contextual factors that shape AI adoption. Japanese and Chinese conversations show elevated rates of elder care discussions compared to other languages, potentially reflecting demographic challenges these societies face. Spanish conversations show higher prevalence of economic theory discussions, while anime and manga content creation features prominently in Japanese conversations. These cross-cultural patterns suggest that AI assistant adoption reflects and potentially amplifies existing social priorities and cultural practices.


Individual and Stakeholder Impacts


Beyond organizational efficiency, AI assistant usage affects individual users in complex ways. The diversity of personal use cases—from dream interpretation to Dungeons & Dragons game mastering to hairstyle advice—demonstrates that AI assistants increasingly mediate intimate and creative aspects of human life. This mediation raises important questions about autonomy, authenticity, and the evolution of human capabilities.


Educational uses present particularly significant implications. When students use AI assistants for homework help or exam preparation, the line between legitimate learning support and academic dishonesty becomes ambiguous. Clio's ability to identify clusters of conversations about "academic cheating and avoiding detection" highlights this tension, suggesting that meaningful numbers of users attempt to use AI assistants in ways that undermine educational integrity. The prevalence of such behavior, and how it varies across educational levels and subjects, remains an important area for ongoing monitoring and research.


Creative and professional tasks increasingly involve AI collaboration, raising questions about attribution, skill development, and the nature of expertise. When users rely on AI assistants for code debugging, legal research, or medical information synthesis, they gain access to capabilities that would otherwise require extensive training or expensive professional services. This democratization of expertise brings both benefits—reduced barriers to accomplishment—and risks—potential erosion of professional standards and increased likelihood of errors when users lack domain knowledge to evaluate AI outputs critically.


The emotional and psychological dimensions of AI usage warrant attention as well. Conversations about dreams, consciousness, and philosophical questions suggest that some users engage AI assistants as interlocutors for existential reflection. While potentially valuable for self-exploration, such usage raises questions about the appropriate boundaries of AI involvement in human meaning-making and whether AI assistants might substitute for human connection in concerning ways.


Evidence-Based Organizational Responses

Transparent Communication About AI Capabilities and Limitations


Organizations deploying AI assistants must communicate clearly about system capabilities, limitations, and appropriate use cases. Anthropic's Usage Policy explicitly prohibits certain uses—including political campaigning, election interference, and generating sexually explicit content—providing clear boundaries for acceptable usage. However, policy alone proves insufficient; organizations must also educate users about why certain uses pose risks and how to evaluate whether their intended use case falls within acceptable bounds.


Effective communication strategies include:


  • Contextual guidance: Providing just-in-time explanations when users attempt potentially problematic tasks, explaining why certain requests might be refused or flagged

  • Capability transparency: Clearly documenting what the system can and cannot reliably do, including known failure modes and areas of uncertainty

  • Privacy education: Helping users understand what data is collected, how it's used, and what privacy protections exist

  • Evolving limitations: Regularly updating users as model capabilities change, ensuring they don't rely on outdated mental models of system behavior


Anthropic's approach to election monitoring exemplifies transparent communication in action. During the 2024 US general elections, the company used Clio to identify election-related conversations and flag clusters that might indicate policy violations. Rather than silently removing violating content, Anthropic explains to users why certain election-related uses (like generating campaign materials) violate policy while others (like learning about voting procedures) are acceptable. This transparency helps users develop accurate mental models of appropriate use.


Procedural Justice in Content Moderation


When AI systems identify potentially violating behavior, organizations must respond in ways that users perceive as fair and legitimate. Procedural justice—the fairness of the processes used to make decisions—proves as important as the substantive outcomes themselves. Users who understand why their account was restricted and have opportunities to appeal are more likely to accept enforcement actions as legitimate.


Key elements of procedural justice include:


  • Explanation and transparency: Providing clear reasons for enforcement actions rather than opaque "policy violations"

  • Appeal mechanisms: Enabling users to contest decisions they believe were made in error

  • Consistency: Applying rules uniformly across similar cases to avoid perceptions of arbitrary enforcement

  • Human oversight: Ensuring that high-stakes decisions (like account termination) involve human review rather than fully automated processes


OpenAI's approach to research access demonstrates procedural justice principles. When researchers request access to usage data for safety research, OpenAI maintains documented criteria for approval, explains decisions, and provides appeal paths. This procedural clarity helps researchers understand requirements and builds trust even when requests are denied.


Clio's design reflects procedural justice considerations by not automating enforcement based solely on cluster membership. Instead, clusters flagged as concerning trigger manual review by authorized Trust and Safety team members who examine individual conversations and make contextualized judgments. This human-in-the-loop approach reduces false positive rates while maintaining legitimacy.


Capability Building for Responsible AI Usage


Organizations benefit from investing in user education about responsible AI usage rather than relying solely on technical controls. Users who understand AI capabilities, limitations, and risks can make better decisions about when and how to use these systems.


Effective capability building approaches include:


  • Domain-specific guidance: Providing tailored advice for specific use cases (e.g., medical advice, legal research, financial planning) about appropriate AI involvement and necessary verification steps

  • Critical evaluation training: Teaching users to assess AI outputs critically, recognize hallucinations and errors, and know when to seek expert verification

  • Ethical frameworks: Helping users think through questions of attribution, privacy, and appropriate delegation of judgment to AI systems

  • Best practice sharing: Facilitating communities of practice where users share effective patterns for AI collaboration


Microsoft's approach in enterprise deployments exemplifies capability building. When deploying Copilot for Microsoft 365, the company provides extensive training materials helping employees understand when AI assistance adds value versus when it might introduce risks. Domain-specific guidance addresses common pitfalls—like relying on AI for final legal language without attorney review—while celebrating productive use patterns.


Duolingo's integration of AI tutoring demonstrates capability building in educational contexts. The language learning platform uses AI assistants to provide personalized practice while clearly communicating to learners that AI interactions complement rather than substitute for comprehensive language instruction. This framing helps users develop realistic expectations about what AI tutoring can accomplish.


Operating Model and Technical Controls


Technical architecture and operating procedures play crucial roles in enabling responsible AI usage. Organizations must design systems that make safe behaviors easy and unsafe behaviors difficult while preserving flexibility for legitimate edge cases.


Effective technical controls include:


  • Rate limiting: Preventing automated abuse by limiting requests per user or account

  • Input filtering: Blocking or flagging clearly prohibited content before it reaches the model

  • Output filtering: Preventing models from generating prohibited content even when users attempt to elicit it

  • Behavioral monitoring: Tracking patterns across multiple conversations to identify coordinated abuse that individual conversations might not reveal
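
A minimal token-bucket limiter, sketched below, illustrates the first control on this list. The class, thresholds, and per-account scoping are illustrative assumptions, not any provider's actual enforcement code.

```python
import time


class TokenBucket:
    """Allow short bursts while capping sustained request rates per account."""

    def __init__(self, rate_per_sec: float, capacity: int):
        self.rate = rate_per_sec          # tokens refilled per second
        self.capacity = capacity          # maximum burst size
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow_request(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Example: at most 5 requests per second per account, with bursts of up to 10.
limiter = TokenBucket(rate_per_sec=5, capacity=10)
if not limiter.allow_request():
    print("429: rate limit exceeded")
```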


Clio itself represents an operating model innovation: using AI assistants to identify patterns of violative behavior that would be invisible at the individual conversation level. When Clio identifies clusters suggesting coordinated abuse—like accounts systematically generating SEO spam across many conversations—it enables enforcement against sophisticated attacks that evade simpler detection methods.


Anthropic's multi-layered safety approach combines technical controls at multiple levels. Models receive training and instructions to refuse harmful requests. Classifiers detect and flag problematic conversations even when models initially respond. Rate limits prevent automated abuse. Usage policies provide clear boundaries. Trust and Safety teams review flagged content under strict privacy controls. This defense-in-depth approach recognizes that no single control provides perfect protection.


Google's approach to commercial AI deployment demonstrates the importance of technical controls at scale. When offering AI capabilities through Google Cloud, the company implements quotas, abuse detection systems, and access controls that prevent individual customers from monopolizing resources or using systems for prohibited purposes. These controls balance openness with responsibility, enabling innovation while preventing misuse.


Financial and Benefit Supports for Positive Use Cases


Organizations can actively promote beneficial AI usage through strategic pricing, access policies, and partnership programs that make AI assistants available for high-social-value applications.

Strategies for supporting positive use cases include:


  • Educational access programs: Providing free or subsidized access to students, educators, and educational institutions

  • Research partnerships: Enabling academic researchers to access usage data or computational resources for safety and social impact research

  • Nonprofit support: Offering preferential pricing or capabilities to organizations working on social challenges

  • Safety research funding: Investing in external research on AI safety, fairness, and beneficial applications


Anthropic's approach to research access exemplifies this support model. The company provides researchers studying AI safety with access to models and usage data under strict privacy controls, enabling independent analysis that informs both Anthropic's development and broader community understanding. This investment in external scrutiny demonstrates commitment to safety beyond immediate commercial interests.


OpenAI's ChatGPT Edu program shows how targeted access can support beneficial use. By offering educational institutions specialized access designed for learning applications, OpenAI enables exploration of AI's educational potential while building safeguards against academic dishonesty. The program includes features like activity dashboards that help educators understand how students use AI.


Google's AI for Social Good initiatives demonstrate financial support for positive applications. By providing computational resources, expertise, and funding to organizations addressing social and environmental challenges, Google enables AI application to high-value problems that might otherwise lack resources for sophisticated AI deployment.


Building Long-Term Capabilities for Responsible AI Development

Empirical Observation and Continuous Learning Systems


The rapid evolution of AI capabilities means that static governance frameworks quickly become outdated. Organizations must build capabilities for continuous empirical observation, learning, and adaptation to maintain effective governance as models and usage patterns evolve.


Clio exemplifies this approach by enabling ongoing analysis of real-world usage without requiring predetermined hypotheses about what patterns might emerge. The system's bottom-up design—clustering conversations based on semantic similarity rather than predefined categories—allows discovery of usage patterns that developers and safety teams might not anticipate. This capability proves especially valuable during periods of uncertainty: new capability launches, major world events, or rapid changes in model behavior.


Building sustainable empirical observation requires:


  • Scalable analysis infrastructure: Systems that can process millions of conversations efficiently enough to provide timely insights

  • Multilingual capabilities: Analysis methods that work across languages to understand global usage patterns

  • Temporal tracking: Monitoring how usage patterns evolve over time to identify emerging trends before they become widespread

  • Cross-domain integration: Combining usage analysis with other data sources (like user surveys, external research, and safety incident reports) to develop holistic understanding
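
As a small illustration of the temporal-tracking item above, the sketch below compares each cluster's share of conversations between two time windows and flags unusually fast growth. The growth factor, minimum count, and data shapes are assumptions made for the example, not Clio's actual monitoring logic.

```python
from collections import Counter


def flag_emerging_clusters(last_week: list[str], this_week: list[str],
                           growth_factor: float = 3.0, min_count: int = 50) -> list[str]:
    """last_week / this_week: one cluster id per conversation in each window.
    Flag clusters whose share of traffic grew sharply (or appeared from nothing)."""
    prev, curr = Counter(last_week), Counter(this_week)
    prev_total = max(sum(prev.values()), 1)
    curr_total = max(sum(curr.values()), 1)
    flagged = []
    for cluster, count in curr.items():
        if count < min_count:
            continue  # ignore clusters too small to be meaningful
        prev_share = prev.get(cluster, 0) / prev_total
        curr_share = count / curr_total
        if prev_share == 0 or curr_share / prev_share >= growth_factor:
            flagged.append(cluster)
    return flagged
```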


Organizations should view empirical observation not as a one-time audit but as continuous monitoring, analogous to how technology companies monitor system performance and reliability. Just as engineering teams use observability platforms to detect performance degradation or service failures, safety and governance teams need observability into usage patterns and risks. This requires investment in infrastructure, dedicated teams with appropriate expertise, and organizational processes that translate insights into action.


The feedback loop between observation and action proves crucial. When Clio identifies concerning patterns—like coordinated abuse attempts or classifier false positives—those insights should trigger concrete responses: updating safety classifiers, refining usage policies, improving user education, or adjusting model training. Without this action-oriented approach, observation generates data without impact.


Data Stewardship and Privacy Infrastructure


Long-term responsible AI development requires robust data stewardship—policies, processes, and technologies that ensure data is collected, stored, analyzed, and shared in ways that respect privacy while enabling necessary analysis.


Clio's privacy architecture demonstrates key principles of effective data stewardship. Multiple privacy layers work in concert: conversation summaries extract key information while excluding private details, cluster aggregation thresholds ensure clusters represent many users rather than individuals, cluster summaries are generated with explicit privacy instructions, and automated auditing removes clusters containing private information. This defense-in-depth approach recognizes that no single protection provides perfect privacy.
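
One of these layers, the aggregation threshold, is easy to make concrete: discard any cluster drawn from fewer than a minimum number of unique users, so that surfaced patterns always describe groups rather than individuals. The sketch below is a simplified illustration; the threshold value and record format are assumptions, not Anthropic's actual parameters.

```python
from collections import defaultdict

MIN_UNIQUE_USERS = 25  # assumed example threshold, not Clio's actual setting


def clusters_meeting_threshold(records: list[tuple[str, str]],
                               min_users: int = MIN_UNIQUE_USERS) -> set[str]:
    """records: (cluster_id, user_id) pairs, one per conversation.
    Return only the cluster ids backed by enough distinct users to report."""
    users_per_cluster: dict[str, set[str]] = defaultdict(set)
    for cluster_id, user_id in records:
        users_per_cluster[cluster_id].add(user_id)
    return {c for c, users in users_per_cluster.items() if len(users) >= min_users}
```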


Essential elements of privacy infrastructure include:


  • Privacy by design: Building privacy protections into systems from inception rather than adding them as afterthoughts

  • Access controls: Limiting who can view sensitive data to authorized personnel with legitimate business needs

  • Audit capabilities: Maintaining logs of data access and analysis to ensure accountability

  • Retention policies: Deleting data when it no longer serves necessary purposes rather than retaining indefinitely

  • User transparency: Clearly communicating to users what data is collected and how it's used


Organizations must also grapple with tensions between privacy and other values. Identifying coordinated abuse requires linking behavior across accounts, potentially conflicting with strong anonymization. Improving model safety through analyzing failure cases requires examining specific problematic conversations, creating tension with policies against human review. Navigating these tensions requires explicit value judgments about acceptable trade-offs rather than pretending conflicts don't exist.


The role of differential privacy and formal guarantees in AI usage analysis remains an important area for research and development. While Clio currently relies on empirical privacy validation rather than formal guarantees, future systems might incorporate differential privacy techniques that provide mathematical bounds on privacy loss. However, the richness of natural language outputs makes direct application challenging—research into privacy-preserving text generation could enable stronger formal guarantees while maintaining analytical utility.


Distributed Leadership and Cross-Functional Collaboration


Effective AI governance requires expertise spanning multiple domains: machine learning, software engineering, policy and legal analysis, ethics, social science, and domain-specific knowledge for particular applications. No single team or individual possesses all necessary expertise, making cross-functional collaboration essential.


Clio's development reflects distributed leadership: AI researchers developed core clustering algorithms, engineers built scalable infrastructure, safety specialists designed privacy protections, policy experts defined acceptable use cases, social scientists contributed analysis frameworks, and ethics experts helped navigate tensions between competing values. This collaboration produced a system more robust than any single discipline could have created.


Organizational structures that enable distributed leadership include:


  • Cross-functional teams: Bringing together diverse expertise for major initiatives rather than siloing work by discipline

  • Embedded specialists: Placing experts in ethics, safety, or policy directly within product and research teams

  • Structured consultation processes: Creating mechanisms for seeking input from relevant stakeholders before major decisions

  • Transparent decision-making: Documenting key decisions and rationales to enable learning and accountability


Organizations should resist the temptation to centralize all AI governance decisions in a single team, as this creates bottlenecks and fails to leverage domain expertise. Instead, governance should be distributed across the organization with clear accountability for different types of decisions. Product teams might make day-to-day choices about feature design within guardrails established by safety teams, who in turn operate within boundaries set by executive leadership and informed by external feedback.


Purpose, Values, and Organizational Culture


Technical systems and processes ultimately rest on organizational culture—shared values, assumptions, and priorities that shape how people make decisions when formal rules don't provide clear guidance. Building culture that prioritizes responsible AI development requires explicit attention to values and mechanisms that reinforce them.


Key cultural elements include:


  • Safety as a core value: Treating safety as fundamental rather than as a constraint on innovation

  • Transparency as norm: Defaulting to sharing information about capabilities, limitations, and usage patterns unless specific harm would result

  • Epistemic humility: Acknowledging uncertainty about long-term impacts and potential risks we haven't imagined

  • External accountability: Welcoming scrutiny from researchers, civil society, and the public rather than defending against it


Anthropic's decision to publish this paper exemplifies these values in practice. Sharing detailed information about Clio—including its capabilities, limitations, and actual usage insights—serves the public interest even though it reveals competitively sensitive information about user behavior and potentially advantages competitors. This transparency reflects organizational commitment to empirical grounding of AI governance rather than relying solely on internal judgment.


Building and maintaining purpose-driven culture requires ongoing investment beyond one-time statements of values. Organizations must:


  • Hire for values alignment: Selecting team members who demonstrate genuine commitment to responsible development

  • Reward values-aligned behavior: Ensuring that promotion and recognition systems reward long-term safety thinking rather than only short-term metrics

  • Create space for dissent: Enabling team members to raise concerns without fear of retaliation

  • Learn from failures: Treating safety incidents as learning opportunities rather than occasions for blame

  • Engage with critics: Seeking out and taking seriously feedback from those skeptical of AI development


The field of AI development faces profound challenges in aligning rapid capability growth with robust safety and governance. No single organization will solve these challenges alone—progress requires collaboration, knowledge sharing, and willingness to prioritize collective benefit over individual competitive advantage. By sharing Clio and its insights openly, we hope to contribute to this collaborative effort.


Conclusion

Clio demonstrates that privacy-preserving analysis of real-world AI usage is both technically feasible and practically valuable. Through multi-layered privacy protections, the system surfaces meaningful insights—from coding dominating usage patterns to cross-cultural variations in application focus to coordinated abuse attempts—while maintaining user privacy at empirically validated levels. The platform enables discovery of unknown unknowns that predetermined tests might miss, complementing proactive safety measures with reactive learning from actual deployment.


The findings shared in this paper—top use cases, multilingual patterns, safety classifier performance—represent early explorations of questions that will only grow more important as AI systems become more capable and widespread. Understanding that Japanese users discuss elder care at elevated rates, that coding tasks dominate across languages and platforms, or that certain clusters of conversations systematically attempt to evade safety measures provides concrete grounding for governance decisions that might otherwise rely on speculation.


Looking ahead, several research directions warrant attention. Extending Clio's analysis to long-term usage trajectories could reveal how individuals' relationships with AI assistants evolve over time. Developing more sophisticated multilingual analysis could uncover cross-cultural patterns invisible in English-only studies. Improving formal privacy guarantees while maintaining analytical utility remains an important technical challenge. And expanding beyond conversational data to include outcomes—how AI-assisted work products differ from unassisted ones—would provide crucial insight into real-world impacts.


We share Clio not as a complete solution but as one approach to an urgent challenge: grounding AI safety and governance in empirical reality rather than hypothetical speculation. The platform's effectiveness depends on continued refinement based on new insights, evolving capabilities, and feedback from the broader research community. We invite that engagement, recognizing that responsible AI development requires sustained collaboration across organizations, disciplines, and perspectives.


By making AI usage patterns visible while preserving privacy, systems like Clio can help ensure that governance frameworks, safety measures, and product improvements respond to actual usage rather than imagined scenarios. This empirical grounding—combined with proactive safety research, robust governance structures, and genuine commitment to public benefit—can help navigate the profound challenges and opportunities that increasingly capable AI systems present.


References

  1. Ahlberg, C., Williamson, C., & Shneiderman, B. (1992). Dynamic queries for information exploration: An implementation and evaluation. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 619-626.

  2. Alexander, E., Kohlmann, J., Valenza, R., Witmore, M., & Gleicher, M. (2014). Serendip: Topic model-driven visual exploration of text corpora. IEEE Conference on Visual Analytics Science and Technology (VAST), 173-182.

  3. Anthropic. (2024). The Claude 3 Model Family: Opus, Sonnet, Haiku. Anthropic Technical Report.

  4. Bates, M. J. (1989). The design of browsing and berrypicking techniques for the online search interface. Online Review, 13(5), 407-424.

  5. Brehmer, M., Ng, K., Tate, K., & Munzner, T. (2014). Matches, mismatches, and methods: Multiple-view workflows for energy portfolio analysis. IEEE Transactions on Visualization and Computer Graphics, 20(12), 1795-1804.

  6. Brown, H., Lee, K., Mireshghallah, F., Shokri, R., & Tramèr, F. (2022). What does it mean for a language model to preserve privacy? ACM Conference on Fairness, Accountability, and Transparency, 2280-2292.

  7. Chan, A., Salganik, R., Markelius, A., Pang, C., Rajkumar, N., Krasheninnikov, D., ... & Krueger, D. (2023). Harms from increasingly agentic algorithmic systems. ACM Conference on Fairness, Accountability, and Transparency, 651-666.

  8. Cockburn, A., Karlson, A., & Bederson, B. B. (2009). A review of overview+detail, zooming, and focus+context interfaces. ACM Computing Surveys, 41(1), 1-31.

  9. Collins, C., Carpendale, S., & Penn, G. (2009). DocuBurst: Visualizing document content using language structure. Computer Graphics Forum, 28(3), 1039-1046.

  10. Das, S., Dunn, S., Kim, S., & Papernot, N. (2024). Privacy auditing with one (1) training run. Advances in Neural Information Processing Systems, 37.

  11. Dwork, C., McSherry, F., Nissim, K., & Smith, A. (2006). Calibrating noise to sensitivity in private data analysis. Theory of Cryptography Conference, 265-284.

  12. Eder, E., Krieg-Holz, U., & Hahn, U. (2020). De-identification of emails: Pseudonymizing privacy-sensitive data in a German email corpus. Language Resources and Evaluation Conference, 259-269.

  13. Eloundou, T., Ouyang, L., Agarwal, S., Krueger, D., Brundage, M., Lam, G., ... & Zhang, M. (2024). GPTs are GPTs: An early look at the labor market impact potential of large language models. Science, 383(6679), 1-10.

  14. Ethayarajh, K., Choi, Y., & Swayamdipta, S. (2022). Understanding dataset difficulty with V-usable information. International Conference on Machine Learning, 5988-6008.

  15. Gabriel, I., Ghazavi, A., Hendrycks, D., Kirk, H. R., Rieser, V., Everitt, T., ... & Weidinger, L. (2024). The ethics of advanced AI assistants. Anthropic Technical Report.

  16. Ganguli, D., Lovitt, L., Kernion, J., Askell, A., Bai, Y., Kadavath, S., ... & Clark, J. (2022). Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned. arXiv preprint arXiv:2209.07858.

  17. Hagendorff, T. (2020). The ethics of AI ethics: An evaluation of guidelines. Minds and Machines, 30(1), 99-120.

  18. Irving, G., & Askell, A. (2019). AI safety needs social scientists. Distill, 4(2), e14.

  19. Jobin, A., Ienca, M., & Vayena, E. (2019). The global landscape of AI ethics guidelines. Nature Machine Intelligence, 1(9), 389-399.

  20. Lam, M. S., Sharma, A., Mitchell, M., Ma, J., Ré, C., & Bernstein, M. S. (2024). Concept induction: Analyzing unstructured text with high-level concepts using LLooM. arXiv preprint arXiv:2405.18216.

  21. Lloyd, S. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2), 129-137.

  22. Lyu, L., Yu, H., & Yang, Q. (2020). Threats to federated learning: A survey. arXiv preprint arXiv:2003.02133.

  23. Marchionini, G. (1995). Information seeking in electronic environments. Cambridge University Press.

  24. Marchionini, G. (2006). Exploratory search: From finding to understanding. Communications of the ACM, 49(4), 41-46.

  25. McInnes, L., Healy, J., & Astels, S. (2017). hdbscan: Hierarchical density based clustering. Journal of Open Source Software, 2(11), 205.

  26. McInnes, L., Healy, J., & Melville, J. (2020). UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426.

  27. McMahan, B., Moore, E., Ramage, D., Hampson, S., & y Arcas, B. A. (2017). Communication-efficient learning of deep networks from decentralized data. Artificial Intelligence and Statistics, 1273-1282.

  28. Mireshghallah, F., Taram, M., Vepakomma, P., Singh, A., Raskar, R., & Esmaeilzadeh, H. (2020). Privacy in deep learning: A survey. arXiv preprint arXiv:2004.12254.

  29. Mireshghallah, F., Uniyal, A., Wang, T., Evans, D., & Berg-Kirkpatrick, T. (2024). An empirical analysis of memorization in fine-tuned autoregressive language models. Empirical Methods in Natural Language Processing.

  30. Neel, S., & Chang, R. (2023). Privacy from first principles: Definitions and proofs for differential privacy. arXiv preprint arXiv:2311.08153.

  31. Nicholas, G. (2024). AI companies must be transparent. Center for Democracy & Technology Policy Blog.

  32. Nimmo, B. (2024). Threat intelligence update: Covert influence operations. OpenAI Blog.

  33. Nomic. (2024). Atlas: Organize the world's unstructured data. Nomic Technical Documentation.

  34. Ouyang, S., Zhang, J., Levy, S., Raj, A., & Wallace, E. (2023). Measuring the gap between natural language understanding benchmarks and real-world usage. Findings of the Association for Computational Linguistics: ACL, 4392-4408.

  35. Pan, X., Zhang, M., Ji, S., & Yang, M. (2020). Privacy risks of general-purpose language models. IEEE Symposium on Security and Privacy, 1314-1331.

  36. Peris, C., Deng, C., & Choi, Y. (2023). TIFA: Accurate and interpretable text-to-image faithfulness evaluation with question answering. International Conference on Computer Vision, 20406-20417.

  37. Pilán, I., Lison, P., Øvrelid, L., Papadopoulou, A., Sánchez, D., & Batet, M. (2022). The text anonymization benchmark (TAB): A dedicated corpus and evaluation framework for text anonymization. Computational Linguistics, 48(4), 1053-1101.

  38. Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using Siamese BERT-networks. Empirical Methods in Natural Language Processing, 3982-3992.

  39. Reimers, N., & Gurevych, I. (2022). Making monolingual sentence embeddings multilingual using knowledge distillation. Empirical Methods in Natural Language Processing, 4512-4525.

  40. Russell, D. M., Stefik, M. J., Pirolli, P., & Card, S. K. (1993). The cost structure of sensemaking. INTERACT and CHI Conference on Human Factors in Computing Systems, 269-276.

  41. Shneiderman, B. (2003). The eyes have it: A task by data type taxonomy for information visualizations. IEEE Symposium on Visual Languages, 336-343.

  42. Speer, R. (2024a). langcodes: A Python library for working with language codes. GitHub Repository.

  43. Speer, R. (2024b). language-data: Supplementary data about languages used by the langcodes module. GitHub Repository.

  44. Stasko, J., Görg, C., & Liu, Z. (2007). Jigsaw: Supporting investigative analysis through interactive visualization. IEEE Symposium on Visual Analytics Science and Technology, 131-138.

  45. Stein, D., Dahl, M., Kang, D., Grasemann, U., & Eloundou, T. (2024). Deploying AI responsibly: Lessons from frontier model developers. arXiv preprint arXiv:2406.15003.

  46. Suri, G., McKee, K., Huang, S., Hashimoto, T., & Liang, P. (2024). Do users write more insecure code with AI assistants? Computer and Communications Security Conference, 2785-2799.

  47. Sweeney, L. (2002). k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5), 557-570.

  48. Taylor, L., Floridi, L., & van der Sloot, B. (Eds.). (2016). Group privacy: New challenges of data technologies. Springer.

  49. Taylor, R. S. (1968). Question-negotiation and information seeking in libraries. College & Research Libraries, 29(3), 178-194.

  50. Wan, A., Wallace, E., Shen, S., & Klein, D. (2024). Analyzing lottery tickets in vision transformers: Similarity, transferability, and a recipe. arXiv preprint arXiv:2406.03146.

  51. Wang, T., Qian, Y., Qiu, L., & Kummerfeld, J. K. (2023). Goal-driven explainable clustering via language descriptions. Association for Computational Linguistics, 1875-1889.

  52. Weidinger, L., Mellor, J., Rauh, M., Griffin, C., Uesato, J., Huang, P. S., ... & Gabriel, I. (2021). Ethical and social risks of harm from language models. arXiv preprint arXiv:2112.04359.

  53. Weidinger, L., Uesato, J., Rauh, M., Griffin, C., Huang, P. S., Mellor, J., ... & Gabriel, I. (2022). Taxonomy of risks posed by language models. ACM Conference on Fairness, Accountability, and Transparency, 214-229.

  54. Weidinger, L., Lukman, L., Bhatt, U., Wiles, J., Mateos-Garcia, J., & Gabriel, I. (2023). Sociotechnical safety evaluation of generative AI systems. arXiv preprint arXiv:2310.11986.

  55. Weick, K. E., Sutcliffe, K. M., & Obstfeld, D. (2005). Organizing and the process of sensemaking. Organization Science, 16(4), 409-421.

  56. White, R. W., & Roth, R. A. (2009). Exploratory search: Beyond the query-response paradigm. Morgan & Claypool.

  57. Yao, Y., Duan, J., Xu, K., Cai, Y., Sun, Z., & Zhang, Y. (2024). A survey on large language model (LLM) security and privacy: The good, the bad, and the ugly. High-Confidence Computing, 4(2), 100211.

  58. Zhao, L., Rao, J., Wu, S., Yuan, L., Wang, W., Zhang, R., ... & Bras, R. L. (2024). WildChat: 1M ChatGPT interaction logs in the wild. International Conference on Learning Representations.

  59. Zheng, L., Chiang, W. L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., ... & Stoica, I. (2023). Judging LLM-as-a-judge with MT-bench and Chatbot Arena. Advances in Neural Information Processing Systems, 36.

Jonathan H. Westover, PhD is Chief Academic & Learning Officer (HCI Academy); Associate Dean and Director of HR Programs (WGU); Professor, Organizational Leadership (UVU); OD/HR/Leadership Consultant (Human Capital Innovations). Read Jonathan Westover's executive profile here.

Suggested Citation: Westover, J. H. (2026). Clio: Privacy-Preserving Insights into Real-World AI Use. Human Capital Leadership Review, 14(1). doi.org/10.70175/hclreview.2020.14.1.8

Human Capital Leadership Review

eISSN 2693-9452 (online)
