Improving Pinterest Search Relevance with Language Models: A Global Approach
Join us as we explore how Pinterest leverages large language models (LLMs) to improve search relevance, achieving significant performance gains across multiple languages and markets.
- 1. Title: Integrating Language Models (LMs) into Pinterest Search
- 2. Presenters: Khan and Mukunda, machine learning engineers from Pinterest's search relevance team
- 3. Pinterest is a visual discovery platform that helps users find inspiration for their lives
- 4. Three main discovery surfaces on Pinterest: home feed, related pins, and search
- 5. Monthly, Pinterest handles over six billion searches covering topics like recipes, home decor, travel, fashion, and more
- 6. Pinterest search supports over 45 languages and reaches users in more than 100 countries
- 7. The Pinterest search backend comprises query understanding, retrieval, ranking, and blending stages that together produce relevant and engaging search feeds
- 8. Focus on semantic relevance modeling at the ranking stage and how LMs improve search relevance
- 9. Presenting four key learnings from using LMs in Pinterest search relevance
- 10. Lesson 1: LMs are good at relevance prediction, with performance improving as model size increases
- 11. The baseline is SearchSage, an in-house content and query embedding model; an 8B Llama 3 model provides a 12% improvement over a multilingual BERT-based model and 20% over the SearchSage embeddings (see the first code sketch after this list)
- 12. Lesson 2: Captions generated by multilingual LMs, together with user actions, serve as useful content annotations
- 13. The text representation for each pin includes the title, description, a synthetic image caption generated by a visual language model (VLM), board titles, and the queries that lead to the highest engagement (see the second sketch after this list)
- 14. Ablation studies indicate that enriching features helps relevance prediction; notably, user-action-based features improve performance
- 15. Lesson 3: Knowledge distillation is used to productionize the LM
- 16. Fine-tune multilingual language models using human-labeled data from specific segments and generic features that scale across domains
- 17. Train the student model with semi-supervised learning, using the teacher model's soft scores over the five-point relevance scale as training labels (see the third sketch after this list)
- 18. The online student model serves relevant search results while remaining scalable and cost-effective
- 19. Lesson 4: Relevance fine-tuning produces rich semantic representations in the embeddings from large language models (LLMs)
- 20. Pin and query embeddings can be reused across Pinterest for content representation, benefiting related pins, the home feed, and other surfaces (see the fourth sketch after this list)
- 21. Question: How were open-source LLMs chosen for fine-tuning?
- Answer: Experiments with different language models determined which performed best; the 8B Llama 3 model provided the best results
- 22. Question: Role of LLMs in distillation and cross-encoder architecture
- Answer: LLMs are used to distill a student model for predicting search relevance; the cross-encoder structure better captures the interaction between query and pin
- 23. Question: Evolution from previous search systems to the current LM architecture
- Answer: The new system improves upon existing features; visual language models in particular help expand beyond the limited markets where relevance data was available
- 24. Question: Handling multimodality in the embedding model
- Answer: Visual captions effectively capture image content; enriching features improves performance, and the same multilingual LM is used for all languages.
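
The sketches below illustrate the techniques discussed in the talk; they are minimal, hedged reconstructions, not Pinterest's actual code. First, the cross-encoder relevance prediction from Lesson 1: query and pin text are fed jointly into an LLM topped with a five-way classification head, so attention can model query-pin interactions directly. The model name, prompt template, and classification head here are illustrative assumptions.

```python
# Minimal cross-encoder relevance sketch. The backbone, prompt format, and
# 5-way head are assumptions, not Pinterest's production setup.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "meta-llama/Meta-Llama-3-8B"  # illustrative choice of backbone

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=5,                # 5-point relevance scale
    torch_dtype=torch.bfloat16,
)

def relevance_probs(query: str, pin_text: str) -> torch.Tensor:
    """Score one (query, pin) pair; both texts are encoded together so the
    model's attention captures their interaction (cross-encoder)."""
    inputs = tokenizer(
        f"query: {query}\npin: {pin_text}",
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    with torch.no_grad():
        logits = model(**inputs).logits              # shape (1, 5)
    return torch.softmax(logits, dim=-1).squeeze(0)  # probability per level
```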
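Second, a hypothetical assembly of the pin text representation from point 13. The field names and template are assumptions; only the list of ingredients (title, description, VLM caption, board titles, top-engaged queries) comes from the talk.

```python
# Illustrative pin-text builder; dict keys and the template are hypothetical.
def build_pin_text(pin: dict) -> str:
    parts = [
        f"title: {pin.get('title', '')}",
        f"description: {pin.get('description', '')}",
        f"caption: {pin.get('synthetic_caption', '')}",  # VLM-generated
        f"boards: {', '.join(pin.get('board_titles', []))}",
        # user-action signal: queries with the highest engagement on this pin
        f"engaged queries: {', '.join(pin.get('top_engaged_queries', []))}",
    ]
    # Drop fields that ended up empty so the LM only sees real signal.
    return "\n".join(p for p in parts if not p.endswith(": "))

example_pin = {
    "title": "Mid-century living room",
    "description": "Warm wood tones and a low-profile sofa.",
    "synthetic_caption": "A living room with a teak credenza and green sofa.",
    "board_titles": ["Home decor", "Living rooms"],
    "top_engaged_queries": ["mid century modern living room"],
}
print(build_pin_text(example_pin))
```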
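Third, the knowledge-distillation step from point 17: the student is trained against the teacher LLM's soft scores over the five-point relevance scale. The soft-label KL objective sketched here is a standard choice and an assumption about the exact loss used.

```python
# Distillation loss sketch: student learns the teacher's soft relevance
# distribution. Shapes and the objective are assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_probs: torch.Tensor) -> torch.Tensor:
    """KL divergence between teacher soft labels and student predictions.

    student_logits: (batch, 5) raw scores from the lightweight student
    teacher_probs:  (batch, 5) teacher's softmax over the 5 relevance levels
    """
    log_p_student = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(log_p_student, teacher_probs, reduction="batchmean")

# Toy check: a student that matches the teacher gets (near-)zero loss.
teacher = torch.tensor([[0.05, 0.05, 0.10, 0.30, 0.50]])
student_logits = torch.log(teacher)  # logits whose softmax equals teacher
print(distillation_loss(student_logits, teacher))  # ~0.0
```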
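Fourth, for points 19-20: after relevance fine-tuning, pooled hidden states from the LLM can serve as general-purpose query and pin embeddings for other surfaces. Mean pooling and the model name below are assumptions about how the embeddings are extracted.

```python
# Embedding extraction sketch; pooling strategy and model name are assumed.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "meta-llama/Meta-Llama-3-8B"  # stand-in for the fine-tuned model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)

def embed(text: str) -> torch.Tensor:
    """Mean-pool the final hidden states into a single dense vector that can
    be reused for retrieval or content representation on other surfaces."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)     # (1, seq, 1)
    return (hidden * mask).sum(1) / mask.sum(1)       # (1, dim)
```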
Source: AI Engineer via YouTube
❓ What do you think? Share your thoughts on the ideas from this video in the comments!