Exciting advancement in recommendation systems from the team at Meituan and Renmin University of China! Introducing the Dual-Flow Generative Ranking Network (DFGR), a two-stream architecture that rethinks recommendation by modeling raw user behavior directly. Traditional recommendation models depend heavily on manual feature engineering, which loses information and caps performance. DFGR instead consumes raw user behavior sequences plus minimal attribute data, significantly streamlining the pipeline.

Under the hood, DFGR splits user interaction sequences into two flows: a real flow and a fake flow. Both flows share parameters within a decoder-only Transformer network and use specialized self-attention mechanisms. The real flow carries genuine action identifiers (e.g., clicks, purchases), while the fake flow uses placeholder tokens. During training, this setup gives the model full contextual understanding without revealing action labels prematurely, which greatly improves training stability and efficiency compared to Meta's generative recommendation approach (MetaGR).

At inference, DFGR adopts a single-flow strategy: candidate items are concatenated onto the user's history and scored in parallel through a single forward pass, achieving roughly 4x better inference efficiency than previous methods.

Evaluations on open-source datasets (RecFlow, KuaiSAR) and an extensive industrial dataset (TRec) show DFGR clearly outperforming established baselines, including DIN, DCN, DIEN, DeepFM, and MetaGR. Beyond the accuracy gains, the work also offers valuable insights into optimal parameter allocation and the scaling potential of generative ranking models.
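For intuition, here is a minimal PyTorch-style sketch of the dual-flow idea as described above: both flows share one decoder-only backbone, the fake flow replaces action ids with a placeholder, and inference appends candidates to the history for one-pass scoring. The class name, dimensions, and masking details are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class DualFlowRanker(nn.Module):
    """Toy sketch of a dual-flow ranker; names and sizes are illustrative."""
    def __init__(self, num_items, num_actions, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, d_model)
        # the action vocabulary gets one extra id used as the "fake" flow placeholder
        self.action_emb = nn.Embedding(num_actions + 1, d_model)
        self.placeholder_id = num_actions
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)  # shared by both flows
        self.head = nn.Linear(d_model, 1)

    def encode(self, items, actions):
        x = self.item_emb(items) + self.action_emb(actions)
        # plain causal mask; a faithful implementation would also keep
        # candidate positions from attending to each other
        mask = nn.Transformer.generate_square_subsequent_mask(items.size(1))
        return self.backbone(x, mask=mask)

    def forward(self, items, actions):
        # training: the real flow sees true action ids, the fake flow sees only
        # placeholders, so the positions being scored never see their own labels
        real = self.encode(items, actions)
        fake = self.encode(items, torch.full_like(actions, self.placeholder_id))
        return self.head(fake).squeeze(-1), real

    @torch.no_grad()
    def score_candidates(self, hist_items, hist_actions, cand_items):
        # single-flow inference: append candidates to the history and score
        # them all in one forward pass
        b, k = cand_items.shape
        items = torch.cat([hist_items, cand_items], dim=1)
        pads = torch.full((b, k), self.placeholder_id, dtype=torch.long)
        actions = torch.cat([hist_actions, pads], dim=1)
        h = self.encode(items, actions)
        return self.head(h[:, -k:]).squeeze(-1)

model = DualFlowRanker(num_items=1000, num_actions=4)
history = torch.randint(0, 1000, (1, 20))
hist_actions = torch.randint(0, 4, (1, 20))
candidates = torch.randint(0, 1000, (1, 5))
print(model.score_candidates(history, hist_actions, candidates))  # one score per candidate
```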
Adaptive Recommendation Engines
Explore top LinkedIn content from expert professionals.
Summary
Adaptive recommendation engines are advanced systems that personalize suggestions based on a user's unique preferences and behaviors, often learning and adjusting in real time. These engines combine data from user activities, item features, and context to create smarter, more relevant recommendations that evolve as needs change.
- Refine user understanding: Build profiles that capture both long-term interests and recent actions so recommendations match what people want right now.
- Fuse multiple data sources: Integrate information from item features, user history, and contextual signals to deliver suggestions that feel intuitive and meaningful.
- Prioritize continuous adaptation: Update models often using fresh data and feedback, allowing the system to respond to shifting preferences and new trends.
-
Traditional RAG systems are great at pulling in relevant chunks, but they hit a wall when it comes to understanding people. They retrieve based on surface-level similarity, but they don’t reason about who you are, what you care about right now, and how that might differ from your long-term patterns. That’s where Agentic RAG (ARAG) comes in: instead of relying on one giant model to do everything, ARAG takes a multi-agent approach where each agent has a job, just like a real team.

First up is the User Understanding Agent. Think of this as your personalized memory engine. It looks at your long-term preferences and recent actions, then pieces together a nuanced profile of your current intent. Not just "you like shoes," but more like "you’ve been exploring minimal white sneakers in the last 48 hours."

Next is the Context Summary Agent. This agent zooms in on the items themselves (product titles, tags, descriptions) and summarizes their key traits in a format other agents can reason over. It’s like having a friend who reads every label for you and tells you what matters.

Then comes the NLI Agent, the real semantic muscle. This agent doesn’t just look at whether an item is “related”; it asks: does this actually match what the user wants? It uses entailment-style logic to score how well each item aligns with your inferred intent.

The Item Ranker Agent takes everything (user profile, item context, semantic alignment) and delivers a final ranked list. What’s really cool is that all the agents share a common “blackboard memory,” where every agent writes to and reads from the same space. That creates explainability, coordination, and adaptability.

So my takeaway is that Agentic RAG reframes recommendations as a reasoning task, not a retrieval shortcut. It opens the door to more robust feedback loops, reinforcement learning strategies, and even interactive user dialogue. In short, it’s where retrieval meets cognition, and the next chapter of personalization begins.
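Here is a minimal sketch of how such an agent pipeline with a shared blackboard could be wired up. The agent logic is stubbed out (in practice each step would call an LLM or an NLI model), and every name here is illustrative rather than taken from the ARAG paper.

```python
from dataclasses import dataclass, field

@dataclass
class Blackboard:
    """Shared memory that every agent reads from and writes to."""
    notes: dict = field(default_factory=dict)

def user_understanding_agent(bb, long_term, recent):
    # infer current intent from long-term preferences plus recent actions
    bb.notes["intent"] = f"recently exploring: {recent[-1]}" if recent else str(long_term)

def context_summary_agent(bb, items):
    # summarize each candidate item's key traits for downstream agents
    bb.notes["item_summaries"] = {
        i["id"]: f'{i["title"]} | {", ".join(i["tags"])}' for i in items
    }

def nli_agent(bb):
    # crude stand-in for entailment: does the item summary match the inferred intent?
    intent = bb.notes["intent"].lower().split()
    bb.notes["alignment"] = {
        iid: sum(tok in summary.lower() for tok in intent)
        for iid, summary in bb.notes["item_summaries"].items()
    }

def item_ranker_agent(bb):
    # final ranking built from everything already on the blackboard
    bb.notes["ranking"] = sorted(bb.notes["alignment"],
                                 key=bb.notes["alignment"].get, reverse=True)

bb = Blackboard()
user_understanding_agent(bb, long_term=["running shoes"], recent=["minimal white sneakers"])
context_summary_agent(bb, items=[
    {"id": "a", "title": "Minimal white sneakers", "tags": ["white", "minimal"]},
    {"id": "b", "title": "Trail running shoes", "tags": ["trail", "grip"]},
])
nli_agent(bb)
item_ranker_agent(bb)
print(bb.notes["ranking"])  # items ordered by alignment with the inferred intent
```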
-
Knowledge Graphs (KGs) have long been the unsung heroes behind technologies like search engines and recommendation systems. They store structured relationships between entities, helping us connect the dots in vast amounts of data. But with the rise of LLMs, KGs are evolving from static repositories into dynamic engines that enhance reasoning and contextual understanding. This transformation is gaining significant traction in the research community. Many studies are exploring how integrating KGs with LLMs can unlock new possibilities that neither could achieve alone. Here are a couple of notable examples:

• 𝐏𝐞𝐫𝐬𝐨𝐧𝐚𝐥𝐢𝐳𝐞𝐝 𝐑𝐞𝐜𝐨𝐦𝐦𝐞𝐧𝐝𝐚𝐭𝐢𝐨𝐧𝐬 𝐰𝐢𝐭𝐡 𝐃𝐞𝐞𝐩𝐞𝐫 𝐈𝐧𝐬𝐢𝐠𝐡𝐭𝐬: Researchers introduced a framework called 𝐊𝐧𝐨𝐰𝐥𝐞𝐝𝐠𝐞 𝐆𝐫𝐚𝐩𝐡 𝐄𝐧𝐡𝐚𝐧𝐜𝐞𝐝 𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞 𝐀𝐠𝐞𝐧𝐭 (𝐊𝐆𝐋𝐀). By integrating knowledge graphs into language agents, KGLA significantly improved the relevance of recommendations. It does this by understanding the relationships between different entities in the knowledge graph, which allows it to capture subtle user preferences that traditional models might miss. For example, if a user has shown interest in Italian cooking recipes, KGLA can navigate the knowledge graph to find connections between Italian cuisine, regional ingredients, famous chefs, and cooking techniques. It then uses this information to recommend content that aligns closely with the user’s deeper interests, such as recipes from a specific region in Italy or cooking classes by renowned Italian chefs. This leads to more personalized and meaningful suggestions, enhancing user engagement and satisfaction. (See here: https://lnkd.in/e96EtwKA)

• 𝐑𝐞𝐚𝐥-𝐓𝐢𝐦𝐞 𝐂𝐨𝐧𝐭𝐞𝐱𝐭 𝐔𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝𝐢𝐧𝐠: Another study introduced the 𝐊𝐆-𝐈𝐂𝐋 𝐦𝐨𝐝𝐞𝐥, which enhances real-time reasoning in language models by leveraging knowledge graphs. The model creates “prompt graphs” centered around user queries, providing context by mapping relationships between entities related to the query. Imagine a customer support scenario where a user asks about “troubleshooting connectivity issues on my device.” The KG-ICL model uses the knowledge graph to understand that “connectivity issues” could involve Wi-Fi, Bluetooth, or cellular data, and “device” could refer to various models of phones or tablets. By accessing related information in the knowledge graph, the model can ask clarifying questions or provide precise solutions tailored to the specific device and issue. This results in more accurate and relevant responses in real time, improving the customer experience. (See here: https://lnkd.in/ethKNm92)

By combining structured knowledge with advanced language understanding, we’re moving toward AI systems that can reason in a more sophisticated way and handle complex, dynamic tasks across various domains. How do you think the combination of KGs and LLMs is going to influence your business?
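To make the prompt-graph idea more tangible, here is a toy Python sketch: start from entities mentioned in a query, expand a couple of hops through a small triple store, and fold the discovered relations into an LLM prompt. The graph, hop limit, and prompt format are illustrative assumptions, not the KGLA or KG-ICL implementations.

```python
from collections import deque

# tiny knowledge graph as (head, relation, tail) triples
KG = [
    ("italian cuisine", "uses", "regional ingredients"),
    ("italian cuisine", "associated_with", "famous chefs"),
    ("famous chefs", "teach", "cooking classes"),
    ("regional ingredients", "featured_in", "sicilian recipes"),
]

def neighbors(entity):
    return [(r, t) for h, r, t in KG if h == entity]

def prompt_graph(seed_entities, max_hops=2):
    """Breadth-first expansion of the seed entities into a small context subgraph."""
    triples, seen = [], set(seed_entities)
    frontier = deque((e, 0) for e in seed_entities)
    while frontier:
        entity, hops = frontier.popleft()
        if hops >= max_hops:
            continue
        for rel, tail in neighbors(entity):
            triples.append((entity, rel, tail))
            if tail not in seen:
                seen.add(tail)
                frontier.append((tail, hops + 1))
    return triples

# assemble the retrieved relations into context for the language model
context = prompt_graph(["italian cuisine"])
prompt = ("Known facts:\n"
          + "\n".join(f"- {h} {r} {t}" for h, r, t in context)
          + "\nUser interest: Italian cooking recipes. Recommend relevant content.")
print(prompt)
```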
-
LLM-based recommendations: papers from Amazon, Walmart, Spotify, Bytedance, Ikea and Meta. The key to good recommendations is personalization and understanding a user's context or jobs. LLMs are ideal for making recommendations and generating "personal narratives - stories that resonate with users". From a simple list of items to watch, listen to or buy next, we are moving to a world where automatic recommendations deeply resonate with the user. Spotify reports that users are 4x more likely to click on recommendations accompanied by explanations, and adding commentary alongside recommendations has a similar positive effect.

Before LLMs, the best practice for building recommendation engines was to combine a traditional inverted index with embedding-based neural retrieval. In implementation terms, that means products such as Solr plus BERT-based models: BERT was used to build custom embedding models that measure similarity between a search request and products in the catalogue. This is the main difference between web search and product search - products carry much less information than a web page.

LLMs can make end-to-end product recommendations. The paper from Ikea shows how to help an LLM remember product IDs: the user's request is sent as a prompt directly to the LLM, which returns a product ID without additional steps. The method is based on fine-tuning and extending the vocabulary with product IDs.

The most advanced methods include user preferences, behaviour and multi-modality. Meta coined a new term, Preference Discerning, where user preferences are explicitly included in the recommendation system. The first step is to collect preferences from a user; product reviews, posts, or anything else can reveal them. This process is called Preference Approximation in the paper. The next step is preference conditioning, which selects the most relevant preferences to find the best recommendations. The 'Triple Modality Fusion' paper treats images, texts, and user behaviour as separate modalities, builds adapters for the image and behaviour data, and uses these adapters plus attention mechanisms to fuse all three modalities into the same embedding space for the LLM.

We can divide recommendation systems (RS) with LLMs into three types: LLMs as RS (fine-tuning), LLMs as RS enhancers, and AI agents as RS controllers. The last type requires a custom agent that uses tools such as a rank or retrieve tool; the main idea is to use reasoning and simulation abilities to find the best match. The 'Let Me Do It For You' paper shows how to use surrogate users and attribute-oriented tools for recommendation systems.

The Monolith paper from Bytedance covers some of the operational challenges: first, online training, or in other words how to deliver product and recommendation updates to users; second, optimizing speed and resources.
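As an illustration of the vocabulary-extension idea attributed to the Ikea paper, here is a hedged sketch using Hugging Face APIs. The base model, the product-ID token format, and the example training pair are placeholders, not the paper's actual setup.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "gpt2"  # stand-in base model for the sketch
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# treat each catalogue item as a single new token so the LLM can emit it directly
product_ids = ["<PID_00143>", "<PID_02917>", "<PID_10584>"]  # hypothetical IDs
tokenizer.add_tokens(product_ids)
model.resize_token_embeddings(len(tokenizer))

# fine-tuning pairs would then map user requests straight to product-ID tokens, e.g.
# "I need a compact desk for a small bedroom" -> "<PID_02917>"
prompt = "I need a compact desk for a small bedroom. Recommended product:"
inputs = tokenizer(prompt, return_tensors="pt")
# after fine-tuning, the model would ideally generate one of the product-ID tokens
print(model.generate(**inputs, max_new_tokens=1))
```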
-
For consumer-facing platforms, delivering relevant and personalized recommendations isn’t just about convenience—it’s key to enhancing the traveler experience. In a recent blog post, Expedia Group's Data Science team shared how they’ve refined their property search ranking algorithm to better match user intent and provide more meaningful results. Expedia’s recommendation system is traditionally designed for destination searches, where travelers enter a location and filter to find suitable lodging. In this case, the algorithm ranks properties based on their overall relevance. However, another common scenario is property searches, where users arrive on the platform looking for a specific hotel—often through external channels like search engines. If that property is unavailable, simply displaying top-ranked hotels in the area isn’t the best solution. Instead, the system needs to recommend accommodations that closely match the traveler’s original intent. To tackle this, the Data Science team enhanced their machine learning models by incorporating property similarity into the ranking process. They improved data preprocessing by focusing on past property searches that led to bookings, ensuring the model learns from real traveler behavior. Additionally, they introduced new similarity-based features that compare properties based on key factors like location, amenities, and brand affiliation. These improvements allow the system to suggest highly relevant alternatives when a traveler’s first choice isn’t available, making recommendations feel more intuitive and personalized. While broad recommendation systems lay the foundation for personalization, adapting them to specific user behaviors can greatly improve satisfaction. Expedia’s approach highlights the power of fine-tuning machine learning models to better address evolving business needs. #MachineLearning #DataScience #Algorithm #Recommendation #Customization #SnacksWeeklyonDataScience – – – Check out the "Snacks Weekly on Data Science" podcast and subscribe, where I explain in more detail the concepts discussed in this and future posts: -- Spotify: https://lnkd.in/gKgaMvbh -- Apple Podcast: https://lnkd.in/gj6aPBBY -- Youtube: https://lnkd.in/gcwPeBmR https://lnkd.in/gFZSXpMQ
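To make the similarity idea concrete, here is a rough sketch of the kind of similarity-based features the post describes, comparing a searched property with a candidate on location, amenities, and brand. The feature names, the haversine distance, and the Jaccard overlap are illustrative choices, not Expedia's actual feature definitions.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    # great-circle distance between two points, in kilometres
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def similarity_features(searched, candidate):
    """Features describing how close a candidate is to the originally searched property."""
    amen_s, amen_c = set(searched["amenities"]), set(candidate["amenities"])
    union = amen_s | amen_c
    return {
        "distance_km": haversine_km(*searched["latlon"], *candidate["latlon"]),
        "amenity_jaccard": len(amen_s & amen_c) / len(union) if union else 0.0,
        "same_brand": float(searched["brand"] == candidate["brand"]),
    }

searched = {"latlon": (48.8566, 2.3522), "amenities": {"wifi", "pool", "spa"}, "brand": "X"}
candidate = {"latlon": (48.8600, 2.3400), "amenities": {"wifi", "spa"}, "brand": "X"}
print(similarity_features(searched, candidate))  # features fed into the ranking model
```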
-
Intriguing insights from the Netflix Tech Blog (https://lnkd.in/gmd576ft) on their push towards a Foundation Model for personalized recommendations! Moving beyond numerous specialized models ("model first") to a unified, LLM-inspired system leveraging comprehensive user history ("data first") is a bold, potentially game-changing approach in the TMT space. Here are a few points that stood out for me:
• Centralized learning: Share insights across every “Continue Watching” and “Today’s Top Picks.”
• Smart tokenization: Merge clicks, plays, and scrolls into meaningful tokens.
• Long-term objectives: Predict multiple next interactions and genres, not just the next click.
• Cold-start: New shows get their moment, even before anyone’s watched them (highly intrigued by this).
• Scale-driven gains: Bigger data + bigger models = better recommendations.
I see a few hurdles while scaling at the enterprise level, beyond the initial promise:
𝗧𝗵𝗲 𝗖𝗼𝗹𝗱 𝗦𝘁𝗮𝗿𝘁 𝗖𝗼𝗻𝘂𝗻𝗱𝗿𝘂𝗺: How effectively can a massive FM handle brand-new content recommendations before interaction data exists?
𝗘𝘃𝗼𝗹𝘃𝗶𝗻𝗴 𝗪𝗼𝗿𝗹𝗱𝘀: Unlike a static text corpus, media catalogs and user tastes change constantly. Can incremental training keep pace without breaking the bank or sacrificing relevance?
𝗦𝗰𝗮𝗹𝗲 & 𝗟𝗮𝘁𝗲𝗻𝗰𝘆: The sheer cost and complexity of training and serving inferences at Netflix scale, while maintaining low latency, sounds challenging.
While consolidating models is appealing, the practicalities of cost, accuracy across diverse tasks, real-time adaptation, and managing ever-changing content libraries are critical considerations for any enterprise exploring this path. What are your thoughts? Is this the future of personalization, or are specialized models still king for specific use cases in the enterprise? #AI #FoundationModels #RecommendationSystems #Personalization #Netflix #EnterpriseAI #MachineLearning #TMT #Scalability #TechTrends #TMTatFractal #Fractal
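As a small illustration of the "predict multiple next interactions" objective mentioned above, here is a sketch of a weighted loss over the next few interactions instead of a single next-token loss. The horizon and weights are made-up values for illustration, not Netflix's actual configuration.

```python
import torch
import torch.nn.functional as F

def multi_step_loss(logits, targets, weights=(1.0, 0.5, 0.25)):
    """logits: (batch, n_steps, vocab); targets: (batch, n_steps) item ids.
    Each future step contributes a cross-entropy term with a decaying weight."""
    losses = [F.cross_entropy(logits[:, k], targets[:, k]) for k in range(len(weights))]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)

logits = torch.randn(8, 3, 1000)          # model predicts the next 3 interactions
targets = torch.randint(0, 1000, (8, 3))  # observed next 3 item ids
print(multi_step_loss(logits, targets))
```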
-
Netflix just released a fascinating blog post detailing their journey adopting foundation models for recommendation systems - a significant departure from their traditional approach. While many focus on performance gains with foundation models, Netflix highlights an equally compelling benefit: unification and simplicity. In #NLP, we've witnessed a paradigm shift from numerous specialized models with extensive feature engineering to single, versatile foundation models trained on massive datasets. Netflix is now bringing this transformation to recommendation systems. Even when performance improvements are modest, this unified approach delivers:
✅ Shorter development cycles
✅ Easier model evolution
✅ Simplified deployment architecture
What makes this particularly interesting are the creative solutions Netflix developed to adapt foundation model concepts to recommendation systems:
🔹 Interaction Tokenization - Processing data from 300M+ users by merging adjacent interactions into meaningful tokens
🔹 Heterogeneous Embeddings - Handling diverse data types from location and time to content metadata
🔹 Multi-Token Prediction - Moving beyond standard next-token prediction to weight different user interactions appropriately
🔹 Performance Optimization - Implementing sparse attention and KV caches to deliver recommendations in milliseconds
🔹 Cold Start Solutions - Using warm-start embeddings and metadata-heavy representations for new content
This demonstrates how foundation model principles can transform domains well beyond language processing. It's a brilliant example of adapting core AI concepts to solve domain-specific challenges. Have you adapted foundation models to non-standard domains? What challenges did you face? Let me know in the comments below! Also the original blog is listed in the comments below! #AI #MachineLearning #Recommendation #System #Netflix #Foundation #Models #id #ml #llm #largelanguagemodel #embedding
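Here is a toy illustration of what the interaction-tokenization step could look like: contiguous events on the same title are merged into a single token carrying aggregated signals. The event schema and merge rule are assumptions for illustration, not Netflix's actual pipeline.

```python
from itertools import groupby

# raw interaction events in time order (schema is a made-up example)
events = [
    {"title": "show_a", "action": "play",   "seconds": 40},
    {"title": "show_a", "action": "play",   "seconds": 1200},
    {"title": "show_b", "action": "scroll", "seconds": 2},
    {"title": "show_b", "action": "click",  "seconds": 5},
    {"title": "show_c", "action": "play",   "seconds": 2600},
]

def tokenize(events):
    """Merge contiguous events on the same title into one aggregated token."""
    tokens = []
    for title, group in groupby(events, key=lambda e: e["title"]):
        group = list(group)
        tokens.append({
            "title": title,
            "actions": [e["action"] for e in group],
            "total_seconds": sum(e["seconds"] for e in group),
        })
    return tokens

for tok in tokenize(events):
    print(tok)  # one token per contiguous run of interactions with the same title
```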
-
A longstanding approach in large-scale recommendation systems has been model-centric: manually engineered features, hand-tuned pipelines, and multiple specialized models optimized for different business KPIs. In this paradigm, domain expertise is encoded in the feature design, and model complexity often grows with business complexity. Netflix’s new foundation model for personalized recommendations (https://lnkd.in/grBhzECy) reflects a shift to a data-centric, end-to-end learning philosophy. Instead of relying on bespoke features, the system learns directly from raw sequences of user-item interactions, complemented by metadata and context. This enables a “one model to rule them all” architecture: a single model that dynamically adapts to evolving business goals and personalization objectives without requiring structural changes or retraining separate models. The underlying model remains stable even as the use cases around it evolve. This is the natural evolution of recommender systems in the LLM era: self-supervised learning with minimal human priors. It aligns with the bitter lesson—that general methods that scale with data ultimately outperform hand-crafted solutions, even in domains like recsys that historically relied heavily on feature engineering.