Exploring AI Coding Agents: Boosting Unit Tests with Guru

Join me as I explore coding agents and how collaboration between humans and AI is changing software development workflows, with a close look at how one team uses generative AI to write and maintain unit tests and solve real-life problems.

1. The presentation is about coding agents and their role in future software development workflows.
2. Generative AI is expected to handle many routine tasks, with humans focusing on creative aspects like product design and complex problem-solving.
3. Collaboration between humans and AI agents can be synchronous or asynchronous.
4. Synchronous collaboration:
a. Example: GitHub Copilot or Cursor (an AI working live inside your IDE)
b. Works simultaneously with you while typing
c. Has been around since 2020 and has rapidly grown in popularity
5. Asynchronous collaboration:
a. Example: GitHub bot (a bot that can be triggered manually or automatically)
b. Completes tasks without human attention, fully autonomously
c. Submits a deliverable once done
d. A new concept, starting in 2024
6. Both synchronous and asynchronous agents are essential for solving real-life problems.
7. In the future, workflows will have numerous small AI agents addressing various issues: unit testing, bug fixing, documentation, code reviews, and releases.
8. Humans can focus on more creative aspects of development with these agents handling routine tasks.
9. The talk then introduces a concrete coding agent, Guru, which helps developers write and manage unit tests.
10. Guru detects code changes and determines if new or modified unit tests are needed.
11. Guru writes the test code, runs it, and prepares a pull request with all relevant information.
12. Human reviewers determine if the unit test is good before merging it into the repository.
13. In production, over 50% of pull requests generated by Guru are merged and accepted.
14. Guru handles around 80% of the team's unit tests on its own, making it the team's top contributor.
15. To build an agent like Guru, follow these steps:
a. Define a clear, concrete, and doable problem (e.g., unit testing)
b. Build datasets for evaluation
c. Create an evaluation harness
d. Work on the LLMs (large language models)
e. Evaluate models on different scenarios to find the best fit
f. Use different LLMs for different stages
16. Guru uses two GPT-4o models, together with human-labeled unit test code, to improve test code generation.
17. Building context is essential for agents, including specific tasks and languages/frameworks.
18. Environments like GitHub issues, code reviews, commits, pull requests, and code itself contribute to the context.
19. The vision for agents goes beyond unit testing, targeting various software engineering tasks (refactoring, e2e testing, etc.)
20. Building multiple agents for different tasks is challenging; therefore, an agent operating system (agent OS) is being developed.
21. Agents can share runtime, tools, and context using the agent OS, enabling faster development in specific domains.
22. The agent era is coming; businesses should embrace agents in their workflows.
23. Using agents leads to more efficient development processes and better outcomes.
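Items 10 to 12 describe Guru's loop: detect code changes, decide whether tests need to be created or modified, then hand a pull request to a human reviewer. The talk summary does not show how Guru does this, so the following is only a rough sketch of the decision step. The `src/` to `tests/test_*.py` naming convention, the `UnitTestTask` type, and both function names are assumptions made for illustration, and the generation and execution of the test code itself is left out.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class UnitTestTask:
    source_file: str
    test_file: str
    action: str  # "create" or "update"


def expected_test_path(source_file: str) -> str:
    # Hypothetical convention: src/billing.py -> tests/test_billing.py
    return f"tests/test_{PurePosixPath(source_file).name}"


def plan_test_tasks(changed_files, existing_tests):
    """Decide, per changed source file, whether a test must be created or updated."""
    changed = set(changed_files)
    existing = set(existing_tests)
    tasks = []
    for path in sorted(changed):
        # Ignore non-Python files and the tests themselves.
        if not path.endswith(".py") or path.startswith("tests/"):
            continue
        test_path = expected_test_path(path)
        if test_path in changed:
            # The matching test was already touched in this change.
            continue
        action = "update" if test_path in existing else "create"
        tasks.append(UnitTestTask(path, test_path, action))
    return tasks


def pull_request_body(tasks, test_run_passed):
    """Summarize the work so a human reviewer can accept or reject it."""
    lines = ["Automated unit test update", ""]
    for task in tasks:
        lines.append(f"- {task.action} {task.test_file} for {task.source_file}")
    lines.append("")
    lines.append(f"Test run: {'passed' if test_run_passed else 'failed'}")
    return "\n".join(lines)


tasks = plan_test_tasks(
    changed_files=["src/billing.py", "src/auth.py", "README.md"],
    existing_tests=["tests/test_auth.py"],
)
body = pull_request_body(tasks, test_run_passed=True)
print(body)
```

In a real agent the `test_run_passed` flag would come from actually running the generated tests, and the final merge decision stays with the human reviewer, as item 12 notes.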
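Steps 15b to 15f (build datasets, create an evaluation harness, compare models per scenario) can be sketched as a small scoring loop. This is a minimal sketch under stated assumptions, not Guru's actual harness: the models here are stub functions standing in for LLM calls, each dataset case is a prompt paired with a checker function, and the metric is a plain pass rate. All names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A case pairs a prompt (for example, a code change) with a checker that
# accepts or rejects the generated test code. Both are placeholders.
Case = Tuple[str, Callable[[str], bool]]
Model = Callable[[str], str]


def pass_rate(model: Model, cases: List[Case]) -> float:
    """Fraction of cases whose checker accepts the model's output."""
    if not cases:
        raise ValueError("evaluation dataset is empty")
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)


def evaluate(models: Dict[str, Model],
             scenarios: Dict[str, List[Case]]) -> Dict[str, Dict[str, float]]:
    """Score every model on every scenario."""
    return {
        scenario: {name: pass_rate(model, cases) for name, model in models.items()}
        for scenario, cases in scenarios.items()
    }


def best_per_scenario(scores: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Pick the highest-scoring model for each scenario (first one wins ties)."""
    return {s: max(by_model, key=by_model.get) for s, by_model in scores.items()}


# Stub "models" standing in for real LLM calls.
def stub_a(prompt: str) -> str:
    return "def test_x():\n    assert True"


def stub_b(prompt: str) -> str:
    return "assert" if "edge" in prompt else ""


scenarios = {
    "new_function": [
        ("add function parse()", lambda out: out.startswith("def test_")),
    ],
    "edge_cases": [
        ("cover edge case", lambda out: "assert" in out),
        ("cover edge input", lambda out: out == "assert"),
    ],
}

scores = evaluate({"stub_a": stub_a, "stub_b": stub_b}, scenarios)
winners = best_per_scenario(scores)
print(winners)
```

Keeping scores per scenario rather than as one global number is what makes step 15f possible: a different model can be chosen for each stage where it performs best.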
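Items 20 and 21 say that an agent OS lets agents share a runtime, tools, and context. The talk summary gives no design details, so the sketch below is purely illustrative: `AgentRuntime`, its methods, and the two toy agents are hypothetical names, and a real system would add isolation, persistence, and access to the sources listed in item 18.

```python
from typing import Any, Callable, Dict


class AgentRuntime:
    """Hypothetical shared runtime: one tool registry and one context store."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Any]] = {}
        self.context: Dict[str, Any] = {}
        self.agents: Dict[str, Callable[["AgentRuntime"], Any]] = {}

    def register_tool(self, name: str, fn: Callable[..., Any]) -> None:
        self.tools[name] = fn

    def register_agent(self, name: str, agent: Callable[["AgentRuntime"], Any]) -> None:
        self.agents[name] = agent

    def run(self, name: str) -> Any:
        # Every agent receives the same runtime, so tools and context are shared.
        return self.agents[name](self)


def unit_test_agent(rt: AgentRuntime) -> str:
    files = rt.tools["changed_files"]()
    rt.context["tested"] = files  # Publish results for other agents.
    return f"wrote tests for {len(files)} files"


def docs_agent(rt: AgentRuntime) -> str:
    # Reuses context produced by another agent instead of recomputing it.
    return "documented " + ", ".join(rt.context.get("tested", []))


runtime = AgentRuntime()
runtime.register_tool("changed_files", lambda: ["src/auth.py", "src/billing.py"])
runtime.register_agent("unit_test", unit_test_agent)
runtime.register_agent("docs", docs_agent)

first = runtime.run("unit_test")
second = runtime.run("docs")
print(first)
print(second)
```

The point of the design is that a new agent only needs its own task logic; the tools and context it depends on are already registered once in the shared runtime.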

Source: AI Engineer via YouTube

❓ What do you think? What are your thoughts on the ideas shared in this video? Feel free to share your thoughts in the comments!