DeepSeek is the talk of the tech world right now, and rightfully so!
If you’re implementing the DeepSeek Large Language Model (or any LLM for that matter) in your Retrieval-Augmented Generation (RAG) pipeline, you have to ensure that the LLM accesses only the data it’s authorized to.
This guide will walk you through the nuts and bolts of securing your RAG pipelines with Fine-Grained Authorization while also keeping your queries secure and super efficient! There’s also a notebook linked at the end if you want to look at some code.
Note: This example uses DeepSeek R1 but works with any LLM. Using authorization for RAG pipelines is a best practice regardless of which LLM and embedding model you are using.
How is this image relevant? It’s relevant to our RAG Pipeline and you’ll find out how at the end of this guide. 🤭
Software used in this guide:
- DeepSeek R1 LLM (through OpenRouter)
- OpenAI for Embeddings
- SpiceDB for permissions
- Pinecone as our Vector Database
- LangChain for language model integration
Why is this important?
Because we now need to think about Day 2 AI Ops.
Enterprises are working extra hard to keep sensitive info (like personal details and company secrets) from leaking out. The go-to solution? Setting up some solid guardrails around RAG to keep data safe while making sure everything runs smoothly and efficiently.
To get these guardrails just right, you need to set up some smart permission systems that can keep track of who can see what and which resources they can access.
How It Works
Let me break down how a typical RAG pipeline works – it’s pretty straightforward with two main parts:
1. Ingestion
Think of this as preparing your knowledge base. We grab all sorts of data, clean it up a bit, turn it into embeddings (vectors that represent real-world objects), and store them in a vector database. It’s like organizing your digital library, where each book (or document) gets a special tag – like “document123” – so we can keep track of where everything came from.
2. Query & Response
Here’s where it gets fun! When someone asks the chatbot a question, it transforms their question into the same kind of embedding format and goes hunting through the vector database for relevant matches. It’s like having a super-smart librarian who knows exactly where to look! Once it finds the answer, the chatbot feeds this information to the LLM, which crafts a nice, helpful response based on what it found.
But here’s the catch – and it’s a big one – this setup is missing something crucial: authorization checks! 🚨
For example, if someone who shouldn’t have access to sensitive financial data asks “What was our Q4 revenue?”, they might get an answer they’re not supposed to see. Not ideal, right?
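To make the gap concrete, here’s a minimal sketch of the two stages with no authorization anywhere. It’s a toy, not the pipeline from the notebook: the keyword-counting `embed` function stands in for a real embedding model, the `index` list stands in for a vector database, and the document IDs are made up for illustration.

```python
import math

# Toy stand-in for an embedding model: counts a few keywords (assumption).
VOCAB = ["revenue", "q4", "holiday", "policy"]

def embed(text):
    words = text.lower().replace("?", "").split()
    return [float(words.count(term)) for term in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# 1. Ingestion: store each chunk's vector together with its source document ID.
chunks = [
    ("finance-report", "Q4 revenue grew strongly"),
    ("hr-handbook", "Holiday policy for all staff"),
]
index = [
    {"doc_id": doc_id, "text": text, "vector": embed(text)}
    for doc_id, text in chunks
]

# 2. Query: embed the question and rank stored chunks by similarity.
def retrieve(question, top_k=1):
    query_vector = embed(question)
    ranked = sorted(
        index,
        key=lambda row: cosine(query_vector, row["vector"]),
        reverse=True,
    )
    return ranked[:top_k]

# Note that retrieve() never asks who is asking the question.
context = retrieve("What was our Q4 revenue?")
```

Whoever sends that question gets the finance chunk back as context for the LLM, because nothing in `retrieve` knows or cares who the user is.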
Authorization, ReBAC & SpiceDB
In case you’re new to the world of AuthZ, here’s a quick primer:
Authorization determines whether you have permission to access a resource. Traditional models like Role-Based Access Control (RBAC) work well for simple setups, but as systems grow more complex, defining permissions based on roles alone can get messy. That’s where Relationship-Based Access Control (ReBAC) comes in. Instead of just assigning roles, ReBAC uses relationships—like “Alice is a manager of Project X” or “Bob is a friend of Charlie”—to determine access dynamically. This makes it ideal when it comes to securing your RAG pipelines.
This guide uses SpiceDB, a powerful, open-source database designed to handle ReBAC at scale. Inspired by Google’s Zanzibar (which powers Google’s Authorization systems across Docs, YouTube and more), SpiceDB lets you define and enforce complex access rules efficiently. With it, you can model relationships between users and resources, then perform lightning-fast permission checks.
Three things about SpiceDB
Here’s a quick TL;DR of how SpiceDB works:
- Schema: This defines the types of objects found, how those objects relate to one another, and the permissions that can be computed off of those relations. Developers can read and write a schema based on their use-case and then store & query data.
- Relationships: Relationships are what binds together a Subject and a Resource via a Relation. A functioning Permissions System that uses ReBAC is the combination of Schema and Relationships.
- Checks & Lookups: Now that we have a schema and relationships in the database, we can issue checks on whether a subject has a permission on a specific resource, or what resources a subject can access whether via a computed permission or relation membership.
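Here’s a rough sketch of how those three pieces fit together. The schema string is an illustrative example written in SpiceDB’s schema language, not the schema from this guide, and the Python around it is an in-memory toy that mimics the idea of checks and lookups. It does not call SpiceDB, and the names `check_view` and `lookup_viewable` are made up for this example.

```python
# Schema: object types, relations, and permissions (illustrative assumption).
# Shown for reference only; the toy functions below hard-code the same rule.
SCHEMA = """
definition user {}

definition document {
    relation viewer: user
    permission view = viewer
}
"""

# Relationships: each tuple binds a resource to a subject via a relation.
relationships = {
    ("document:hr-handbook", "viewer", "user:alice"),
    ("document:hr-handbook", "viewer", "user:bob"),
    ("document:finance-report", "viewer", "user:bob"),
}

def check_view(subject, resource):
    """Check: does this subject have 'view' on this specific resource?"""
    return (resource, "viewer", subject) in relationships

def lookup_viewable(subject):
    """Lookup: which resources can this subject view?"""
    return sorted(
        resource
        for resource, relation, member in relationships
        if relation == "viewer" and member == subject
    )
```

In this toy, `check_view("user:alice", "document:finance-report")` is false, while `lookup_viewable("user:bob")` lists both documents. A real deployment would ask SpiceDB these questions instead of scanning a Python set.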
Adding Authorization to your RAG Pipeline
Now there are two approaches to adding AuthZ to your RAG Pipeline.
- Post-filter Authorization
So here’s the deal: each embedding can carry metadata showing which document it came from (like document123). We use this to check if you’re actually allowed to see that content.
The process? After the vector search returns the relevant embeddings, we perform a check for each one to see if the user has permission to view the document that the embedding originated from. You can specify how much context you require, for example: “I need 5 pieces of additional context before I make the prompt to the LLM” or “exhaust all the embeddings returned”.
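As a rough sketch, post-filtering could look something like this. Everything here is hypothetical: `can_view` stands in for a permission check against your permissions system, `candidates` stands in for search results already sorted by relevance, and `needed` is the “how many pieces of context do I want” knob.

```python
# Hypothetical permission data; a real system would ask the permissions service.
viewers = {
    "hr-handbook": {"alice", "bob"},
    "finance-report": {"bob"},
    "eng-wiki": {"alice"},
}

def can_view(user, doc_id):
    return user in viewers.get(doc_id, set())

def post_filter(candidates, user, needed=5):
    """Walk the ranked results and keep only chunks the user may see.

    Stops early once `needed` authorized chunks are collected; pass
    needed=None to exhaust all the embeddings returned.
    """
    allowed = []
    for chunk in candidates:
        if can_view(user, chunk["doc_id"]):
            allowed.append(chunk)
            if needed is not None and len(allowed) == needed:
                break
    return allowed

# Search results, most relevant first (made-up example data).
candidates = [
    {"doc_id": "finance-report", "text": "Q4 revenue grew strongly"},
    {"doc_id": "hr-handbook", "text": "Holiday policy for all staff"},
    {"doc_id": "finance-report", "text": "Q4 costs were flat"},
    {"doc_id": "eng-wiki", "text": "Deploys happen on Tuesdays"},
]

alice_context = post_filter(candidates, "alice", needed=2)
bob_context = post_filter(candidates, "bob", needed=2)
```

The trade-off to keep in mind: if a user can see very little, you may run many checks and throw away most of the search results before collecting enough context.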
- Pre-filter Authorization
Here we make a query and embed it. But before diving in, we check with our permissions system to see what stuff we’re actually allowed to peek at. It gives us back a list of all the documents we can access.
Then we just use that list as our filter, grab all the relevant embeddings we’re allowed to see, and boom – we’re good to go! That’s what we’ll be playing with in this guide.
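Sketched in code, the pre-filter flow might look like this. It’s a toy under stated assumptions: `lookup_viewable_docs` stands in for a lookup against the permissions system, `filtered_search` stands in for a vector database query that honors a metadata filter, and the similarity scores are hard-coded. The filter uses an `$in` operator on a `doc_id` metadata field, which is the general shape of a metadata filter in vector databases such as Pinecone, but the field name is an assumption.

```python
# Hypothetical permission data; a real system would ask the permissions service.
viewers = {
    "hr-handbook": {"alice", "bob"},
    "finance-report": {"bob"},
}

def lookup_viewable_docs(user):
    """Step 1: ask the permissions system which documents the user can view."""
    return sorted(doc_id for doc_id, users in viewers.items() if user in users)

def build_metadata_filter(doc_ids):
    """Step 2: turn that list into a metadata filter for the vector search."""
    return {"doc_id": {"$in": list(doc_ids)}}

def filtered_search(rows, metadata_filter, top_k=3):
    """Step 3: toy vector search that only considers rows matching the filter."""
    allowed = set(metadata_filter["doc_id"]["$in"])
    visible = [row for row in rows if row["doc_id"] in allowed]
    return sorted(visible, key=lambda row: row["score"], reverse=True)[:top_k]

# Stored chunks with made-up similarity scores for a "Q4 revenue" question.
rows = [
    {"doc_id": "finance-report", "text": "Q4 revenue grew strongly", "score": 0.92},
    {"doc_id": "hr-handbook", "text": "Holiday policy for all staff", "score": 0.31},
]

alice_filter = build_metadata_filter(lookup_viewable_docs("alice"))
alice_results = filtered_search(rows, alice_filter)
```

Here the finance chunk never reaches Alice’s results, even though it is the best match, because the search only ever looks at documents on her allowed list.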
Step-by-step guide
Where’s the code, you ask? Well, that’s in Part II of this guide. Now that you’ve understood the concepts, here’s the step-by-step guide to securing your RAG pipelines.