Snowflake Cortex: LLMs and RAG Power Next-Gen Insights

As we enter 2024, Snowflake is set to make significant strides in artificial intelligence (AI). This year, AI is poised to become a fundamental business strategy rather than just a trend, and Snowflake is leading the way with Snowflake Cortex, its intelligent, fully managed service. The platform is designed to be easy to access and lets users quickly analyze data and develop AI applications.

At BUILD 2023, Snowflake's announcements gave us insight into the latest AI trends and advancements, which we expect to stay at the forefront in 2024. Among the most noteworthy topics were retrieval-augmented generation (RAG) and custom large language model (LLM) applications. Snowflake Cortex will give businesses new ways to capitalize on cutting-edge AI frameworks and technologies.

Snowflake Cortex

Snowflake Cortex is an advanced AI-powered data exploration and analysis tool that enriches the capabilities of the Snowflake platform. This tool offers industry-leading AI models, LLMs and vector search functionality, alongside complete LLM-powered experiences. With Snowflake Cortex, users can securely leverage the power of generative AI and unlock dynamic insights with their enterprise data, using languages they already know, such as SQL or Python.

Serverless functions in Cortex

With Cortex, users have access to serverless functions backed by specialized ML models and LLMs, each tailored to a specific task. The functions of Cortex can be divided into three categories:

Task-specific LLMs

Cortex has four task-specific models:

Answer Extraction - Extracts information from your unstructured data more efficiently

Use Case - Answer Extraction can be employed in healthcare systems to sift through medical literature, patient records and research papers, extracting specific answers or relevant information for medical professionals and researchers.

Sentiment Detection - Detects the sentiment of text directly in your Snowflake tables

Use Case - Sentiment Detection can assess the success of marketing or advertising campaigns. Understanding how users react to campaigns can guide future strategies.

Text Summarization - Summarizes long documents for faster consumption

Use Case - Text Summarization can be used to generate executive summaries or abstracts for research papers, reports or legal documents, aiding professionals in quickly understanding the main points.

Translation - Translates text at scale

Use Case - Translation can enable e-commerce businesses to localize and translate product descriptions, reviews and other content, enhancing the user experience for customers in various regions.
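
As a rough sketch of how the four task-specific models might be called from Python, the helper below only builds the SQL text and does not connect to Snowflake. The function names (SENTIMENT, SUMMARIZE, TRANSLATE, EXTRACT_ANSWER under SNOWFLAKE.CORTEX) follow Snowflake's public documentation for Cortex LLM functions rather than this article, so treat them and their availability in your account as assumptions. The table product_reviews and column review_text are hypothetical.

```python
def cortex_query(function: str, table: str, text_column: str, *literal_args: str) -> str:
    """Build a SELECT that applies one Cortex LLM function to a text column."""
    # Escape single quotes so literal arguments stay valid SQL strings.
    literals = ["'" + arg.replace("'", "''") + "'" for arg in literal_args]
    call_args = ", ".join([text_column, *literals])
    return (
        f"SELECT {text_column}, SNOWFLAKE.CORTEX.{function}({call_args}) AS result "
        f"FROM {table}"
    )


# Hypothetical table and column names, used only for illustration.
sentiment_sql = cortex_query("SENTIMENT", "product_reviews", "review_text")
summary_sql = cortex_query("SUMMARIZE", "product_reviews", "review_text")
translate_sql = cortex_query("TRANSLATE", "product_reviews", "review_text", "en", "de")
answer_sql = cortex_query(
    "EXTRACT_ANSWER", "product_reviews", "review_text", "What product is mentioned?"
)

for statement in (sentiment_sql, summary_sql, translate_sql, answer_sql):
    print(statement)
```

The generated statements could be run from a worksheet or passed to whichever Snowflake client you already use. Identifiers are interpolated as-is, so this sketch is only suitable for trusted table and column names.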

ML-powered functions

In addition to LLMs, Snowflake Cortex includes ML-powered functions to jumpstart advanced analytics:

Forecasting is powered by a gradient boosting machine (GBM) that automatically handles seasonality and scaling.

Use Case - Analyze historical production data to predict future production levels, aiding manufacturers in optimizing production planning.

Anomaly Detection is a technique that helps you identify any unusual data points in your time series data.

Use Case - Review network traffic, user behavior and system logs to identify abnormal patterns indicative of potential cyber threats or attacks.

Contribution Explorer streamlines and improves root cause analysis of changes in metrics.

Use Case - Analyzing contributions and their associated comments can provide insights into the quality and effectiveness of the code review process, helping teams identify areas for improvement.

Classification assigns predefined classes or labels to data for better pattern recognition and recommendations.

Use Case - Analyze customer behavior, usage patterns and other relevant data to identify indicators of potential churn.
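
To make the idea of "unusual data points" in a time series concrete, here is a deliberately simple illustration using a z-score rule. This is not the algorithm Cortex uses, which the article does not describe. It is a swapped-in textbook technique, and the threshold and sample data are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.5):
    """Return indexes of points more than `threshold` standard deviations from the mean."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []  # A constant series has no outliers under this rule.
    return [i for i, v in enumerate(values) if abs(v - center) / spread > threshold]


# Hypothetical hourly request counts with one obvious spike at index 6.
requests_per_hour = [10, 11, 9, 10, 12, 10, 50, 11, 10, 9]
print(find_anomalies(requests_per_hour))
```

A managed function would typically also account for trend and seasonality, which a plain z-score rule ignores.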

General purpose models

Snowflake introduced two general-purpose functions: one for text completion, powered by Llama 2, and one that generates SQL code from natural language instructions, powered by Text2SQL. Both work out of the box for analysis and app development in Snowflake, and they can be incorporated into LLM applications such as chatbots.

Native LLM experiences built on Cortex

Snowflake has implemented a range of advanced features that utilize the capabilities of Snowflake Cortex to improve user experience and facilitate efficient business analysis and team collaboration across organizations. These features encompass a set of pre-built user interfaces, high-performance LLMs, and fully hosted and managed search capabilities.

Snowflake Copilot

Copilot is an AI-powered assistant that uses natural language to generate and refine SQL queries. Analysts can simply ask Snowflake Copilot a question, and it will generate a SQL query using relevant tables. Users can refine queries by conversing with the assistant to filter down to the most relevant insights for their task. The best part is that no setup is required. Text-to-code functionality will soon be available programmatically via a general-purpose function called Text2SQL with Snowflake Cortex.

Document AI

Document AI is an LLM-powered experience for data extraction use cases. Using an intuitive interface and a pre-trained model, customers can extract information from any document, including PDFs, Word documents, plain text files and screenshots. This technology can be used to create a pipeline for document extraction, saving time and manual labor.

Universal Search

Universal Search is an LLM-powered search tool that helps you quickly discover and access data and apps. It’s based on search engine technology acquired from Neeva and allows you to find database objects within your Snowflake account, data products and Snowflake Native Apps from the Snowflake Marketplace. Snowflake Copilot uses Universal Search in the background to identify relevant tables and columns for SQL generation.

Highlights from Snowflake BUILD

LLM app development with Cortex

Snowflake Cortex has made it possible for enterprises to create custom LLM applications that can effectively comprehend their data. The platform offers a growing collection of serverless functions that support inference on leading generative LLMs such as Meta AI's Llama 2 model. These functions also include task-specific models that help accelerate analytics, as well as advanced vector search functionality.

With the combination of Cortex and Snowpark Container Services, developers now have an easy way to build LLM applications without needing to move data outside of Snowflake's controlled boundary. This allows them to quickly deploy, manage and scale custom containerized workloads and models. Additionally, they can fine-tune open-source LLMs using secure Snowflake-managed infrastructure with GPU instances.

The benefit of this platform is that developers can create customized LLM applications that can learn the unique nuances of their business and data in a matter of minutes. They don't have to worry about manual LLM deployment, GPU-based infrastructure management or integrations. Retrieval-augmented generation is available natively inside Snowflake, enabling developers to build LLM apps that are tailored to their data.

Retrieval-augmented generation

Retrieval-augmented generation (RAG) is an advanced approach in natural language processing that combines retrieval and generation models. It retrieves data from a pre-existing knowledge base and generates new content based on that retrieved information. This approach facilitates the incorporation of domain knowledge into generative AI without requiring further language model training or fine-tuning.

To implement RAG, it’s necessary to store embeddings that represent the semantic meaning of texts and perform a semantic search to find similar vectors. This capability enables the discovery of similar content or content that answers specific questions.
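
The retrieval step can be sketched in a few lines of plain Python. This is a minimal illustration, not Snowflake's vector search: the three-dimensional vectors are hand-written stand-ins for real embeddings (which an embedding model would produce with hundreds of dimensions), and the document names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, store, k=2):
    """Return the names of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), name) for name, vec in store.items()]
    scored.sort(reverse=True)  # Highest similarity first.
    return [name for _, name in scored[:k]]


# Toy "embeddings" keyed by a hypothetical document name.
embedding_store = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "api rate limits": [0.0, 0.2, 0.9],
}

# Stand-in for the embedding of a question such as "How do I get my money back?"
query_embedding = [0.8, 0.2, 0.1]
print(top_k(query_embedding, embedding_store))
```

Cosine similarity is a common choice here because it compares the direction of two vectors and ignores their length, so longer and shorter texts can still match on meaning.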

It’s vital to have semantic search integrated within Snowflake, without relying on external services. This allows for precise retrieval of relevant information from the company's knowledge base. The retrieved information can then be added to the LLM prompt as additional context. By integrating these systems, AI can generate responses that draw on the full knowledge base, resulting in more informed and precise answers.
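
Supplying retrieved information to the LLM usually means placing it in the prompt. The sketch below shows one possible way to assemble such a prompt. The wording of the instructions, the numbering scheme, and the sample passages are all assumptions for illustration, and the resulting string would be sent to whichever completion function you use.

```python
def build_prompt(question, passages):
    """Combine retrieved passages and a user question into a single prompt string."""
    # Number each passage so the model (and a reader) can refer back to it.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical passages, as if returned by a semantic search over a knowledge base.
retrieved = [
    "Refunds are issued within 14 days of a return being received.",
    "Returns must include the original packaging.",
]
prompt = build_prompt("How long do refunds take?", retrieved)
print(prompt)
```

Because the domain knowledge travels in the prompt, the underlying model needs no further training or fine-tuning to use it.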

Snowflake has announced support for vector data within the database. Certain functionalities that enable RAG, such as generating embeddings, storing vectors and searching for similar vectors, are already available in private preview.

With the pace of AI development continuing to accelerate, we expect these advancements and tools to be the foundation of even greater changes further into 2024. Snowflake will continue to expand its AI capabilities, leading the charge with Snowflake Cortex. If you’re ready to dip your toe in the AI waters, check out some of Ollion’s AI solution offerings and contact us to set up time to talk.
