Build Cross-Platform Applications with Local AI: Why and How with Microsoft Foundry

Hi, I'm Ima, Program Manager at Microsoft, and today I'm excited to talk about Foundry Local, an end-to-end AI inference solution that enables developers to easily build cross-platform applications powered by local AI.

  • 1. Ima, a program manager at Microsoft, is discussing Foundry Local, which enables developers to build cross-platform applications with local AI.
  • 2. Key reasons for using local AI instead of cloud AI:
  • a. Low network bandwidth or offline access
  • b. Privacy and security concerns
  • c. Real-time latency requirements
  • 3. Local AI has become a reality thanks to increasingly powerful computing hardware and the model optimization techniques developed over recent decades.
  • 4. Microsoft has several assets that Foundry Local builds on:
  • a. Azure AI Foundry with 70,000+ organizations and 1,900+ models
  • b. ONNX Runtime, a cross-platform, high-performance on-device inference engine with over 10 million downloads per month
  • c. The scale and reach of Windows on client devices
  • 5. Foundry Local includes:
  • a. ONNX Runtime for performance acceleration across various hardware
  • b. A new Foundry Local management service for hosting and managing models on client devices
  • c. A connection to Azure AI Foundry for downloading open-source models on demand
  • d. Foundry Local CLI for exploring models on the device
  • e. SDKs for easily integrating Foundry Local into applications
  • 6. Foundry Local was announced at the Microsoft Build conference and is available on Windows and macOS.
  • 7. Over 100 customers have joined the private preview of Foundry Local, providing positive feedback on its ease of use and performance.
  • 8. Local AI enables offline-first AI applications, which are essential for processing sensitive data in restricted environments.
  • 9. Foundry Local supports popular generative AI models with variants optimized for different hardware, such as CPUs, CUDA GPUs, and integrated GPUs.
  • 10. The Qwen 2.5 model can process around 90 tokens per second with hardware acceleration enabled.
  • 11. The Phi-4 mini model provides more detailed responses than the Qwen model but has a larger model size.
  • 12. Foundry Local is useful for building cross-platform AI applications that run directly on devices, for example generating high-level summaries of internal projects.
  • 13. Foundry Local offers Python and JavaScript SDKs for integration into applications.
  • 14. The agent feature in Foundry Local is still in private preview but allows users to create, build, and run local agents using local models and MCP servers.
  • 15. An agent consists of one model and one or more MCP servers based on user needs.
  • 16. Example agents include an OCR agent that extracts text from images and a speech-to-text service running locally.
  • 17. Within an agent, Foundry Local can expose tools such as file-system management and OCR.
  • 18. Users can run queries with agents to perform specific tasks, such as finding and processing a receipt to get the total amount.
  • 19. Foundry Local has unlocked significant potential for local AI applications, but users should not expect it to match cloud-hosted models or agents in every aspect.
  • 20. More information on Foundry Local can be found via the provided link, and interested users can sign up for the private preview of the agent feature.
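As point 13 notes, Foundry Local ships Python and JavaScript SDKs, and the local management service speaks an OpenAI-compatible REST API. The sketch below builds an OpenAI-style chat-completions payload for that local endpoint; the base URL, port, and model alias are assumptions for illustration (discover the real values with the SDK or CLI on your machine), and the actual HTTP call is shown commented out since it requires Foundry Local to be running.

```python
import json

# Hypothetical local endpoint; Foundry Local assigns the real port at startup,
# so query it via the SDK or CLI rather than hard-coding it.
BASE_URL = "http://localhost:5273/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload for a local model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

payload = build_chat_request(
    "phi-4-mini",  # assumed model alias
    "Give a high-level summary of this internal project in two sentences.",
)
print(json.dumps(payload, indent=2))

# To actually send it (requires Foundry Local running locally):
# import urllib.request
# req = urllib.request.Request(
#     f"{BASE_URL}/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the wire format is OpenAI-compatible, existing OpenAI client libraries can usually be pointed at the local endpoint by swapping the base URL.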
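Points 15–18 describe an agent as one local model plus one or more MCP servers. The agent feature is still in private preview and its API is not public, so the following is purely a conceptual sketch with invented class and field names, illustrating that composition with the OCR agent from the talk as the example.

```python
from dataclasses import dataclass, field

@dataclass
class McpServer:
    """A tool provider the agent can call (names/endpoints are illustrative)."""
    name: str
    endpoint: str

@dataclass
class LocalAgent:
    """Conceptual agent: exactly one local model, one or more MCP servers."""
    model: str
    mcp_servers: list = field(default_factory=list)

    def describe(self) -> str:
        tools = ", ".join(s.name for s in self.mcp_servers)
        return f"agent(model={self.model}, tools=[{tools}])"

# An OCR-style agent like the demo: a local model with file-system and OCR tools,
# able to answer queries such as "find the receipt and return the total amount".
ocr_agent = LocalAgent(
    model="phi-4-mini",  # assumed model alias
    mcp_servers=[
        McpServer("filesystem", "stdio://filesystem-mcp"),  # invented endpoint
        McpServer("ocr", "stdio://ocr-mcp"),                # invented endpoint
    ],
)
print(ocr_agent.describe())
```

The point of the sketch is the shape, not the API: the model supplies reasoning while each MCP server contributes a narrow capability, so agents are assembled per task rather than built into the model.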

Source: AI Engineer via YouTube

❓ What do you think? What are your thoughts on the ideas shared in this video? Feel free to share your thoughts in the comments!