Build Cross-Platform Applications with Local AI: Why and How with Microsoft Foundry
Hi, I'm Ima, Program Manager at Microsoft, and today I'm excited to talk about Foundry Local, an end-to-end AI inference solution that enables developers to easily build cross-platform applications powered by local AI.
- 1. Ima, a program manager at Microsoft, is discussing Foundry Local, which enables developers to build cross-platform applications with local AI.
- 2. Four key reasons for using local AI instead of cloud AI:
- a. Low network bandwidth or offline access
- b. Privacy and security concerns
- d. Real-time latency requirements
- 3. Local AI is now a reality due to powerful computing hardware and optimization techniques in recent decades.
- 4. Microsoft has several assets that can be used for Foundry Local:
- a. Azure AI Foundry with 70,000+ organizations and 1,900+ models
- b. Onyx runtime, a cross-platform high-performance on-device inference engine, with over 10 million downloads per month
- c. The scale and reach of Windows on client devices
- 5. Foundry Local includes:
- a. Onyx runtime for performance acceleration across various hardware
- b. A new Foundry Local management service for hosting and managing models on client devices
- c. Connecting to Azure AI Foundry for downloading open-source models on demand
- d. Foundry Local CLI for exploring models on the device
- e. SDKs for easily integrating Foundry Local into applications
- 6. Foundry Local was announced at the Microsoft Builder Conference and is available on Windows and Mac OS.
- 7. Over 100 customers have joined the private preview of Foundry Local, providing positive feedback on its ease-of-use and performance.
- 8. Local AI enables offline first AI applications, which are essential for sensitive data processing in restricted environments.
- 9. Foundry Local supports popular generative AI models with variants optimized for different hardware, such as CPUs, CUDA, and integrated GPUs.
- 10. The 2.5 billion model can process around 90 tokens per second with robos mode enabled.
- 11. The 54 mini model provides more detailed information than the Cuban model but has a larger model size.
- 12. Foundry Local is useful for building cross-platform AI applications that run directly on devices, providing high-level summaries of internal projects.
- 13. Foundry Local offers Python and JavaScript SDKs for integration into applications.
- 14. The agent feature in Foundry Local is still in private preview but allows users to create, build, and run local agents using local models and MCP servers.
- 15. An agent consists of one model and one or more MCP servers based on user needs.
- 16. Example agents include an OCR agent that extracts text from images and a speech-to-text service running locally.
- 17. Foundry Local can provide tools related to file system management and OCR in an agent.
- 18. Users can run queries with agents to perform specific tasks, such as finding and processing a receipt to get the total amount.
- 19. Foundry Local has unlocked significant potential for local AI applications, but users should not expect it to perform at the same level as cloud models or agents in every aspect.
- 20. More information on Foundry Local can be found via the provided link, and interested users can sign up for the private preview of the agent feature.
Source: AI Engineer via YouTube
❓ What do you think? What are your thoughts on the ideas shared in this video? Feel free to share your thoughts in the comments!