Building Finn Voice: Our Journey into AI-Powered Phone Support

Join me as I share our journey in building Finn Voice, a voice AI agent that answers customer questions and escalates to human support when needed, and explore the key decisions and challenges we faced along the way.

  • 1. The talk covers Finn Voice, a voice agent for phone support designed as a frontline teammate for Intercom calls.
  • 2. Intercom is a customer service platform that offers a messenger product and has evolved to include robust tooling for various channels like email, WhatsApp, and phone.
  • 3. Finn, an AI agent launched by Intercom two years ago, has shown significant growth in handling customer interactions without human intervention.
  • 4. The speaker highlights voice as the next big frontier and a cost-effective solution for customer service, with over 80% of support teams using phone support and one-third of global customer service
  • 5. Voice AI agents can provide 24/7 support, eliminate wait times, bypass IVR menus, and support multiple languages, while also offering major cost savings and scalability for businesses.
  • 6. The speaker discusses how Finn Voice was built in approximately 100 days, focusing on seven main areas that had the most significant impact on its development.
  • 7. The use case for Finn Voice started with a broad, flexible, knowledge-based agent capable of answering various customer questions instead of a narrow problem space like scheduling appointments.
  • 8. The initial wedge use case for voice agents was out-of-office hours support, allowing companies to test the technology gradually and build confidence before deploying it during regular office hours.
  • 9. The MVP (Minimum Viable Product) for Finn Voice focused on quickly shipping a meaningful product by concentrating on testing, deploying, and monitoring agent behavior.
  • 10. Technical foundations for Finn Voice included leveraging Intercom's existing chat agent setup and native phone support product when possible.
  • 11. The speaker highlights key differences in designing conversations for voice compared to chat, such as addressing latency, answer length, and user mindset.
  • 12. Integrating Finn Voice into support workflows was crucial, with the majority of feedback focusing on how it works with team workflows rather than model performance or latency.
  • 13. Smooth escalation paths and context handoff were essential integration points for Finn Voice, providing summaries to human agents after AI agent calls.
  • 14. The speaker discusses various evaluation methods, such as manual and automated evaluations, internal tooling for troubleshooting, resolution rate, and using language models as judges for call analysis.
  • 15. Voice AI agents typically cost between 3 and 20 cents per minute, with usage-based and outcome-based pricing being the dominant pricing models on the market.
  • 16. The speaker believes that the market will converge toward outcome-based pricing, as it aligns incentives better between providers and customers.
  • 17. Key takeaways from building Finn Voice include addressing performance issues as both a model and product problem, designing for the realities of phone conversations, building internal and external
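To make the pricing figures in point 15 concrete, here is a minimal Python sketch comparing the two pricing models. Only the 3 to 20 cents per minute range comes from the talk summary. The call volume, average call length, resolution rate, and price per resolution are hypothetical values chosen for illustration, and the function names are my own.

```python
def usage_based_cost(minutes: float, cents_per_minute: float) -> float:
    """Cost in dollars when billed per minute of call time."""
    return minutes * cents_per_minute / 100


def outcome_based_cost(resolved_calls: int, dollars_per_resolution: float) -> float:
    """Cost in dollars when billed only for calls the agent resolves."""
    return resolved_calls * dollars_per_resolution


# Hypothetical monthly workload (assumed values, not from the talk).
calls = 1000
avg_minutes_per_call = 4.0
resolution_rate = 0.5          # share of calls resolved without a human
dollars_per_resolution = 1.00  # illustrative price, not a quoted figure

total_minutes = calls * avg_minutes_per_call
low = usage_based_cost(total_minutes, 3)    # low end of the quoted range
high = usage_based_cost(total_minutes, 20)  # high end of the quoted range
outcome = outcome_based_cost(round(calls * resolution_rate), dollars_per_resolution)

print(f"usage-based: ${low:.2f} to ${high:.2f}")
print(f"outcome-based: ${outcome:.2f}")
```

Under these assumptions, the usage-based bill depends only on talk time, while the outcome-based bill scales with how many calls the agent actually resolves. That is the incentive alignment point 16 refers to.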

Source: AI Engineer via YouTube
