Episodes
Tuesday Nov 19, 2024
Summary of https://www.dhs.gov/sites/default/files/2024-11/24_1114_dhs_ai-roles-and-responsibilities-framework-508.pdf
The document is a framework developed by the U.S. Department of Homeland Security (DHS), in collaboration with its AI Safety and Security Board, to address the risks of artificial intelligence (AI) within U.S. critical infrastructure.
The framework outlines roles and responsibilities for five key groups: cloud and compute infrastructure providers, AI developers, critical infrastructure owners and operators, civil society, and the public sector.
The goal is to encourage the safe and secure development and deployment of AI within critical infrastructure sectors, such as energy, transportation, and healthcare, while protecting individual rights and fostering innovation.
The framework addresses five areas of responsibility: securing environments, responsible model and system design, data governance, safe and secure deployment, and performance and impact monitoring.
Friday Nov 15, 2024
Summary of https://www.developer.tech.gov.sg/products/collections/data-science-and-artificial-intelligence/playbooks/public-sector-ai-playbook.pdf
This resource is a playbook for public sector workers in Singapore, designed to guide them in adopting Artificial Intelligence (AI) in their work.
It presents a comprehensive overview of AI, including its applications across different areas of the public sector. The playbook outlines the key steps for starting an AI project, from identifying a suitable problem to collecting data and choosing a solution provider.
It also addresses the importance of developing AI capabilities within the workforce and establishing a robust framework for deploying, operating, and maintaining AI models.
The document highlights a variety of successful AI projects implemented in Singapore, providing real-world examples for public officers.
Friday Nov 15, 2024
Summary of https://www.langchain.com/stateofaiagents
The "LangChain State of AI Agents Report" explores the current state of AI agent adoption across different industries and company sizes. The report highlights the increasing use of AI agents in production, with nearly half of survey respondents currently using agents, and a large majority planning to implement them in the near future.
The report also examines leading use cases for AI agents, including research and summarization, personal productivity, and customer service. Importantly, the report emphasizes the need for control mechanisms like tracing and observability tools to ensure reliable and safe agent behavior.
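The report doesn't prescribe an implementation here, but the core idea behind tracing is simple: record every tool call an agent makes so its behavior can be audited after the fact. A minimal, standard-library sketch of that idea (the `traced` decorator and the `search_docs` tool are hypothetical illustrations, not part of any agent framework's API):

```python
import functools
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("agent-trace")

def traced(tool_fn):
    """Log each tool invocation with its arguments, result, and latency."""
    @functools.wraps(tool_fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = tool_fn(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info(json.dumps({
            "tool": tool_fn.__name__,
            "args": repr(args),
            "kwargs": repr(kwargs),
            "result": repr(result)[:200],  # truncate large outputs
            "latency_ms": round(elapsed_ms, 1),
        }))
        return result
    return wrapper

@traced
def search_docs(query: str) -> list[str]:
    """Hypothetical agent tool; stands in for a real retrieval call."""
    return [f"doc matching {query!r}"]

search_docs("quarterly revenue")
```

In practice, dedicated observability tooling adds features a logging decorator can't (nested traces, UIs, evals), but the underlying control mechanism is this same record-every-step pattern.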
The report further explores challenges faced by organizations deploying agents, such as performance quality and knowledge gaps, along with emerging themes, such as the growing importance of open-source agents and the anticipation of more powerful models.
Wednesday Nov 13, 2024
Summary of https://studentsupportaccelerator.org/sites/default/files/Tutor_CoPilot.pdf
This paper describes the development and evaluation of "Tutor CoPilot," a human-AI system designed to improve the quality of tutoring sessions, particularly for novice tutors working with K-12 students from historically underserved communities.
The system leverages large language models (LLMs) trained on expert thinking to generate real-time, expert-like suggestions for tutors, providing them with guidance on how to respond to student questions and mistakes.
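The paper's prompts aren't reproduced in this summary, but the basic mechanism, sending the live tutoring context to an LLM and asking for an expert-style move, can be sketched as follows (the model choice, prompt wording, and `suggest_response` helper are illustrative assumptions, not the authors' implementation):

```python
from openai import OpenAI  # assumes the `openai` package is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def suggest_response(transcript: str, student_turn: str) -> str:
    """Ask an LLM for an expert-like tutoring move; illustrative only."""
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {"role": "system", "content": (
                "You coach a novice math tutor. Given the session so far and "
                "the student's latest turn, suggest one response that prompts "
                "the student to explain their thinking rather than giving "
                "away the answer."
            )},
            {"role": "user", "content": (
                f"Session:\n{transcript}\n\nStudent: {student_turn}"
            )},
        ],
    )
    return completion.choices[0].message.content

print(suggest_response("Tutor: What is 3/4 + 1/8?", "Is it 4/12?"))
```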
The research utilizes a randomized controlled trial with over 900 tutors and 1,800 students, demonstrating that Tutor CoPilot significantly improves student learning outcomes, particularly for lower-rated tutors.
Additionally, the study finds that tutors using Tutor CoPilot are more likely to use high-quality pedagogical strategies that foster deeper understanding.
This approach offers a scalable and cost-effective alternative to traditional training programs, making high-quality education more accessible to all students.
Wednesday Nov 06, 2024
Summary of https://conference.nber.org/conf_papers/f210475.pdf
This research paper examines the impact of an artificial intelligence (AI) tool on materials discovery in the R&D lab of a large U.S. firm.
The tool, which leverages deep learning to partially automate the materials discovery process, was rolled out to scientists in three waves, allowing the researchers to estimate its causal effects. The study found that the AI tool significantly accelerated materials discovery, increasing patent filings and product prototypes, particularly for scientists with strong initial productivity.
However, the tool's effectiveness depended on the scientist's ability to evaluate the AI-generated compounds, highlighting the importance of human judgment in the scientific discovery process.
The paper concludes by exploring the AI tool's impact on scientist job satisfaction and beliefs about artificial intelligence, revealing that while the tool enhances productivity, it also leads to changes in the types of tasks scientists perform, potentially affecting job satisfaction and prompting a need for reskilling.
Tuesday Nov 05, 2024
Summary of https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5007084
This working paper investigates the impact of generative AI on the nature of work, specifically focusing on software development. The authors study the introduction of GitHub Copilot, an AI-powered code completion tool, within the open-source software development ecosystem.
They use a natural experiment based on GitHub's "top developer" program, which provides free Copilot access to developers of the most popular repositories. Through a regression discontinuity design, they find that access to Copilot induces developers to allocate more time towards coding activities and less towards project management.
This shift is driven by two mechanisms: an increase in autonomous work and an increase in exploration activities. The authors also find that the effects are greater for developers with lower ability, suggesting that generative AI has the potential to flatten organizational hierarchies and reduce inequality in the knowledge economy.
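The summary doesn't reproduce the estimation details, but a regression discontinuity design of this kind can be sketched: compare developers just above and just below the popularity cutoff for the "top developer" program, fitting a local linear model on each side of the threshold. A toy illustration with simulated data (the variable names, bandwidth, and cutoff are assumptions for exposition, not the authors' specification):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated running variable: repository popularity rank relative to a
# hypothetical top-developer cutoff at 0; rank >= 0 means free Copilot access.
rank = rng.uniform(-50, 50, size=2000)
treated = (rank >= 0).astype(float)

# Simulated outcome: share of time spent coding, with a jump at the cutoff.
coding_share = 0.5 + 0.002 * rank + 0.05 * treated + rng.normal(0, 0.05, 2000)

# Local linear RD: restrict to a bandwidth around the cutoff and let the
# slope differ on each side; the `treated` coefficient estimates the jump.
bw = 20
mask = np.abs(rank) <= bw
X = sm.add_constant(np.column_stack([
    treated[mask],
    rank[mask],
    treated[mask] * rank[mask],
]))
fit = sm.OLS(coding_share[mask], X).fit()
print(fit.params[1])  # estimated discontinuity (~0.05 by construction)
```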
Friday Nov 01, 2024
Summary of https://www.hbs.edu/ris/Publication%20Files/24-074_bee7fd2f-882e-4e8c-adfe-150f8439dff6.pdf
This working paper examines the challenges of senior professionals learning to use generative AI from junior professionals. The authors argue that while the existing literature on communities of practice suggests that juniors are well-suited to teach seniors about new technologies, this is not the case with generative AI.
The paper highlights the risks of this emerging technology that concern senior professionals, including inaccurate output, lack of explainability, and the possibility of user complacency. The authors suggest that juniors may recommend ineffective risk mitigation tactics because of their limited understanding of the technology's capabilities and their tendency to focus on changing human routines rather than on system design.
The paper concludes by recommending that corporate leaders should focus on educating both junior and senior employees about AI risks and mitigating these risks through system-level changes and interventions at the ecosystem level.
Friday Nov 01, 2024
Summary of https://arxiv.org/pdf/2410.05229
This research paper investigates the mathematical reasoning capabilities of large language models (LLMs) and finds that their performance is not as robust as previously thought.
The authors introduce a new benchmark called GSM-Symbolic, which generates variations of math problems to assess the models' ability to generalize and handle changes in question structure.
The results show that LLMs struggle to perform true logical reasoning, often exhibiting a high degree of sensitivity to minor changes in input.
The authors also find that LLMs often blindly follow irrelevant information in the questions, suggesting that their reasoning process is more like pattern matching than true conceptual understanding.
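The benchmark's core move is to turn a fixed benchmark question into a symbolic template whose names and numbers can be resampled, so a model is tested on many equivalent variants rather than one memorizable instance. A minimal sketch of that templating idea (the template and value ranges are illustrative, not taken from the actual benchmark):

```python
import random

# A GSM8K-style question turned into a template: names and numbers become
# symbolic slots that can be resampled to create many equivalent variants.
TEMPLATE = (
    "{name} picks {x} apples on Monday and {y} apples on Tuesday. "
    "How many apples does {name} have in total?"
)

def make_variant(rng: random.Random) -> tuple[str, int]:
    """Instantiate one variant and its ground-truth answer."""
    name = rng.choice(["Sophia", "Liam", "Ava"])
    x, y = rng.randint(2, 20), rng.randint(2, 20)
    question = TEMPLATE.format(name=name, x=x, y=y)
    return question, x + y

rng = random.Random(42)
for _ in range(3):
    question, answer = make_variant(rng)
    print(question, "->", answer)
```

Because the ground-truth answer is computed alongside each variant, a model's accuracy can be measured across the whole distribution of instances, which is how sensitivity to superficial changes becomes visible.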
Thursday Oct 31, 2024
Summary of https://arxiv.org/pdf/2410.03703
This research paper investigates the impact of large language models (LLMs) on human creativity, specifically focusing on divergent and convergent thinking with over 1,100 participants.
The results show that while LLMs can provide short-term boosts in performance, they may ultimately hinder independent creative abilities.
The paper concludes that the long-term effects of using LLMs for creativity need to be carefully considered, and AI systems should be designed to enhance, rather than diminish, human cognitive skills.
Thursday Oct 31, 2024
Summary of https://github.com/ibm-granite/granite-3.0-language-models/blob/main/paper.pdf
This document details the development and release of Granite 3.0, a new family of open-source, lightweight foundation language models from IBM. The paper provides a thorough overview of the models' design, including their architecture, training data, and post-training techniques.
It also explores the models' performance across various benchmarks, focusing on their capabilities in general knowledge, instruction following, function calling, retrieval augmented generation, and cybersecurity.
The paper concludes by discussing the socio-technical harms and risks associated with LLMs and outlines IBM's efforts to mitigate these concerns through responsible AI practices.