December 22, 2024

Salesforce Announces the World’s First LLM Benchmark for CRM

New benchmark and leaderboard from Salesforce give businesses the guidance they need to make smart decisions when evaluating generative AI models for their CRM systems

Salesforce announced the world’s first LLM benchmark for CRM to help businesses evaluate the rapidly growing number of large language models (LLMs) for customer relationship management (CRM) systems. 

According to industry sources, the new benchmark is a comprehensive evaluation framework that measures the performance of LLMs against four key measures: accuracy, cost, speed, and trust and safety. It has been designed specifically to evaluate common sales and service use cases, including prospecting, lead nurturing, sales opportunity, and service case summaries.

The benchmark also comprises a public leaderboard to help professionals decide which LLM is best for their CRM needs. Salesforce will continue to incorporate new use case scenarios into the benchmark and enhance its evaluation of LLMs, which will soon include fine-tuned LLMs. 

From left to right: Silvio Savarese, EVP & Chief Scientist, Salesforce AI Research, and Clara Shih, CEO of Salesforce AI

Silvio Savarese, EVP & Chief Scientist, Salesforce AI Research, stated: “As AI continues to evolve, enterprise leaders are saying it’s important to find the right mix of performance, accuracy, responsibility, and cost to unlock the full potential of generative AI to drive business growth. Salesforce’s new LLM Benchmark for CRM is a significant step forward in the way businesses assess their AI strategy within the industry. It not only provides clarity on next-generation AI deployment but also can accelerate time to value for CRM-specific use cases. Our commitment is to continuously evolve this benchmark to keep pace with technological advancements, ensuring it remains relevant and valuable.”

Clara Shih, CEO of Salesforce AI, stated: “Business organizations are looking to utilize AI to drive growth, cut costs, and deliver personalized customer experiences, not to plan a kid’s birthday party or summarize Othello. Our customers have been asking for a purpose-built way to evaluate and select from among the proliferation of new AI models, and we are thrilled to introduce the world’s first LLM benchmark for CRM to help them navigate the complex landscape of models. This benchmark is not just a measure; it’s a comprehensive, dynamically evolving framework that empowers companies to make informed decisions, balancing accuracy, cost, speed, and trust.”

Why it matters: Existing LLM benchmarks have been limited to academic and consumer use cases, with very little business relevance. They also lack adequate expert human evaluations and fail to address accuracy, speed, cost, and trust considerations. These deficiencies have left CRM customers lacking a reliable way to gauge the effectiveness of generative AI-powered CRM solutions. Without a clear sense of how LLMs perform across those metrics for specific use cases, businesses are left to make decisions in the dark. 

Dive deeper: Developed by Salesforce AI Research, the benchmark uniquely draws on both real-world CRM data and expert human evaluations by practitioners. This lets businesses make more strategic decisions about how to incorporate generative AI into their CRM systems, with specific attention to:

  1. Accuracy: This metric comprises four subcategories: factuality, completeness, conciseness, and instruction-following. The more accurate the predictions or recommendations, the more valuable the results are to teams across the organization. And the more valuable the results, the better the actions they can take to improve customer experience. If a model is accurate enough for a use case, it’s also important to consider the other metrics. Even if the model isn’t accurate enough, techniques like prompt engineering and fine-tuning can improve it. 
  2. Cost: The cost metric is the estimated operational cost, which varies by CRM use case, and is categorized as high, medium, or low based on percentiles. Customers can evaluate the cost-effectiveness of different LLMs to ensure they align with their budget and resource allocation strategies.
  3. Speed: This metric assesses the LLM’s responsiveness and efficiency in processing and delivering information. Faster response times enhance the user experience, reduce wait times for customers, and enable sales and service teams to address inquiries and issues promptly.
  4. Trust and Safety: This metric measures the LLM’s capability to shield sensitive customer data, adhere to data privacy regulations, secure information, and refrain from bias and toxicity for CRM use cases. By assessing the reliability of LLMs for CRM, this benchmark gives organizations a sense of transparency regarding trust and safety.
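
To make the percentile-based cost tiers concrete, here is a minimal sketch of how such labels could be assigned. The article only says the tiers are "based on percentiles"; the tercile cut points, the function name `cost_tiers`, and the sample figures below are all assumptions for illustration, not the benchmark's actual method or data.

```python
from statistics import quantiles


def cost_tiers(costs: dict[str, float]) -> dict[str, str]:
    """Label each model's estimated cost as low, medium, or high.

    Assumption: tiers are split at the 1/3 and 2/3 percentiles (terciles).
    The article does not state which percentiles the benchmark uses.
    """
    # n=3 yields the two cut points that divide the sorted costs into thirds.
    low_cut, high_cut = quantiles(list(costs.values()), n=3, method="inclusive")
    tiers = {}
    for model, cost in costs.items():
        if cost <= low_cut:
            tiers[model] = "low"
        elif cost <= high_cut:
            tiers[model] = "medium"
        else:
            tiers[model] = "high"
    return tiers


# Hypothetical estimated costs per use case (arbitrary units, not real results).
example = {
    "model_a": 0.2,
    "model_b": 1.0,
    "model_c": 3.5,
    "model_d": 0.6,
    "model_e": 2.1,
    "model_f": 5.0,
}

for model, tier in cost_tiers(example).items():
    print(f"{model}: {tier}")
```

Because the tiers are relative, a model's label depends on which other models are in the comparison set: adding or removing models can move a model between tiers even if its own cost is unchanged.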
