In this episode, I sat down with Shawkat Kabbara, Founder & CEO of Papr AI, to discuss why traditional RAG systems fundamentally break at scale.
Key insights:
• Why retrieval gets WORSE with more data (backed by Google research)
• The 400-500ms latency requirement that kills personalized voice AI
• How multi-agent systems create hallucination cascades
• Brain-inspired predictive memory architecture as the solution
• Why you need to predict context, not just retrieve it
We dive into retrieval loss metrics, the working memory approach inspired by the prefrontal cortex, and why Papr built prediction layers instead of another search-based system.
Perfect for developers building production AI systems who are hitting the limits of RAG.
On pricing: good feedback. We do have a usage-based subscription for our Pro and Team plans. It makes sense to add this to the website so it's clear.
We've fixed the issue with Google account sign-up, so please give it another try. We don't access any sensitive information from your Google account; we only use Google for sign-up and login.
Today we are launching https://Papr.ai – the first AI-native workspace with infinite memory, powered by state-of-the-art retrieval accuracy. Teams work better, 3x faster. Your AI never forgets.
Ever wondered why even the smartest AI feels like meeting for the first time, every time? It's like hiring a genius who can't remember what you discussed five minutes ago! It's the same problem plaguing our workplaces - and here's the kicker: a staggering 80% of knowledge workers are losing about 2.5 hours every single day just trying to find stuff scattered all over the place. We're talking about a $1.8 trillion productivity black hole annually, globally. Ouch.
Having been in the trenches leading product at Microsoft, Facebook, and Apple, I've seen this chaos up close. During a major product launch at Apple, our team hit a critical roadblock when dependency teams made API changes without proper documentation visibility. Our developers discovered compatibility issues late in the development cycle, causing weeks of delays. It was beyond frustrating! Our teams were essentially drowning in a sea of information, spread across what felt like a million different tools. And the latest AI solutions? Incredibly powerful, sure, but missing that critical "memory" component.
It's like equipping your team with Formula 1 race cars but forgetting to build the racetrack. You wouldn't hire a brilliant executive who forgets every conversation they've ever had, would you? Yet that's precisely how we're using AI today. Each interaction starts from scratch, forcing us to constantly rebuild context. It's exhausting, inefficient, and, frankly, absurd.
That's why we created Papr.ai – the first AI-native workspace that you can trust, giving your AI a super-powered, photographic memory that actually works.
Here’s what makes Papr different:
- State-of-the-art accuracy: We've achieved 86% accuracy on Stanford's STARK benchmark, securing the number one spot on the leaderboard. STARK represents real-world queries rather than simple needle-in-the-haystack benchmarks. Link here: https://huggingface.co/spaces/snap-stanford/stark-leaderboar...
- Connected to your data: Your AI instantly recalls context from Slack, docs, and meetings—creating a unified knowledge base that answers complex questions and grows with your team.
- Document-first workspace: Unlike chat-first platforms, we're building an AI-native document experience with Apple-inspired design. Create and edit documents while AI maintains perfect context—no agent configuration needed. Think Cursor.ai but for collaborative documents with infinite memory.
Users are experiencing response times in mere seconds for memory retrieval, compared to the minutes or even hours it used to take to sift through company knowledge. That's a 99% reduction in search time. Think of what you can do with an extra 2.5 hours a day.
The future of work isn't just about AI – it's about AI that remembers, learns, and grows alongside your team. While others are busy building faster calculators, we're building a true digital brain for your organization. Forget the noise, focus on what makes a difference.
Ready to transform how your team works and finally put an end to the information scavenger hunt? Visit papr.ai to join the forward-thinking companies already leveraging Papr.
Thanks to everyone supporting us on this journey. Please don’t forget to try it out and share your feedback.
Right now I've focused on the following more general integrations:
VectorDBs - These include established providers like qdrant / pgvector / weaviate / pinecone / chroma. I’m also happy to try out lantern.dev if anyone is interested since their offering appears quite good.
Embeddings - I will embed your data with Jina-V2-base as part of onboarding. Alternatively, happy to use OpenAI if you supply keys (or I could take a crack at figuring out Cohere’s API).
LLMs - I can integrate with Anthropic / OpenAI / Hugging Face / Mistral / SciPhi, and I am open to exploring any other providers that are relevant.
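To make the pluggability described above concrete, here is a minimal sketch of how an embedding layer could dispatch across providers via a registry. This is an illustrative pattern only: the provider names, vector sizes, and function signatures are assumptions for the example, and the embed functions return placeholder vectors instead of making real model or API calls.

```python
# Sketch of a pluggable embedding layer using a simple registry pattern.
# Provider names and signatures are illustrative, not a real API.
from typing import Callable, Dict, List

EmbedFn = Callable[[List[str]], List[List[float]]]
_REGISTRY: Dict[str, EmbedFn] = {}

def register(name: str):
    """Decorator that records an embedding function under a provider name."""
    def deco(fn: EmbedFn) -> EmbedFn:
        _REGISTRY[name] = fn
        return fn
    return deco

@register("jina-v2-base")
def embed_jina(texts: List[str]) -> List[List[float]]:
    # In production this would run the Jina-V2-base model;
    # here we return fixed-size placeholder vectors (768 dims is assumed).
    return [[0.0] * 768 for _ in texts]

@register("openai")
def embed_openai(texts: List[str]) -> List[List[float]]:
    # Would call OpenAI's embeddings endpoint with a user-supplied key;
    # 1536 dims is assumed for illustration.
    return [[0.0] * 1536 for _ in texts]

def embed(texts: List[str], provider: str = "jina-v2-base") -> List[List[float]]:
    """Embed texts with the selected provider, defaulting to Jina."""
    if provider not in _REGISTRY:
        raise ValueError(f"unknown embedding provider: {provider}")
    return _REGISTRY[provider](texts)
```

The same registry shape extends naturally to the vector DB and LLM layers: each backend registers itself under a name, and onboarding just picks a key.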
Why build your own RAG pipeline? Just use Papr Memory instead to add personalization to your GPTs.
Papr Memory enables GPTs to perform actions like saving, retrieving, updating, and deleting personal memories for end-users. When a conversation begins, the GPT assesses whether to store or recall parts of the dialogue. The initial step involves authenticating the user, so only the authenticated user has access to their memories.
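The save/retrieve/update/delete flow above, scoped to an authenticated user, can be sketched with a tiny in-memory store. Everything here is hypothetical: the class, method names, and the substring-match retrieval are stand-ins for illustration, not Papr Memory's actual API (which would use semantic retrieval and real authentication).

```python
# Illustrative in-memory sketch of per-user memory CRUD.
# All names are hypothetical; not the real Papr Memory API.
import uuid

class MemoryStore:
    def __init__(self):
        # user_id -> {memory_id: text}; each user sees only their own memories.
        self._memories = {}

    def save(self, user_id: str, text: str) -> str:
        mem_id = str(uuid.uuid4())
        self._memories.setdefault(user_id, {})[mem_id] = text
        return mem_id

    def retrieve(self, user_id: str, query: str) -> list:
        # Naive substring match stands in for semantic retrieval.
        return [t for t in self._memories.get(user_id, {}).values()
                if query.lower() in t.lower()]

    def update(self, user_id: str, mem_id: str, text: str) -> None:
        if mem_id not in self._memories.get(user_id, {}):
            raise KeyError("memory not found for this user")
        self._memories[user_id][mem_id] = text

    def delete(self, user_id: str, mem_id: str) -> None:
        self._memories.get(user_id, {}).pop(mem_id, None)
```

Note that keying the store by `user_id` is what enforces the isolation property described above: a retrieval for one user can never surface another user's memories.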