Last month, the LLMs in Prod community had the pleasure of hosting Rohit Chatter, Chief Software Architect at Walmart Global Tech, for a fireside chat on Gen AI and semantic caching in retail. The conversation spanned a wide range of topics, from Rohit's personal journey in the tech industry to…
In two days (i.e., Jan 4), OpenAI will retire 33 models, including GPT-3 (text-davinci-003). This is OpenAI's biggest model deprecation so far. Here's what you need to know. GPT-3 model retirement: the text-davinci-003 model (commonly known as GPT-3) will be unavailable from Jan 4. → You must…
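For anyone still on the legacy completions endpoint, the change is mostly a model-name swap. The sketch below assumes the v1 openai Python SDK and uses gpt-3.5-turbo-instruct, the replacement OpenAI has pointed text-davinci-003 users toward; the prompt and parameters are illustrative.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Same legacy /completions call shape; only the model name changes.
resp = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # was: "text-davinci-003"
    prompt="Summarize semantic caching in one sentence.",
    max_tokens=64,
)
print(resp.choices[0].text.strip())
```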
💡This is Portkey's first collaboration with the Hasura team. Hasura helps you build robust RAG data pipelines by unifying multiple private data sources (relational DBs, vector DBs, etc.) and letting you query the data securely with production-grade controls. LLMs have been around for some time now and have shown that…
Over the past few months, we've been closely tracking latencies for both GPT-3.5 and GPT-4, and the emerging patterns have been intriguing. The standout observation? GPT-4 is catching up in speed, steadily closing the latency gap with GPT-3.5. Our findings reveal a consistent decline in GPT-4 latency. While your…
Implementing a semantic cache from scratch for production use cases.
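The core idea, in a minimal sketch: embed each prompt, and if a previously answered prompt is similar enough, serve the stored answer instead of calling the model again. The names, models, and 0.9 threshold below are illustrative assumptions, not Portkey's implementation; a production cache would also need eviction, TTLs, and a real vector store.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()
cache: list[tuple[np.ndarray, str]] = []  # (prompt embedding, cached answer)
SIMILARITY_THRESHOLD = 0.9  # illustrative; tune against your own traffic

def embed(text: str) -> np.ndarray:
    out = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(out.data[0].embedding)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def cached_completion(prompt: str) -> str:
    q = embed(prompt)
    # Cache hit: a semantically similar prompt was answered before.
    for emb, answer in cache:
        if cosine(q, emb) >= SIMILARITY_THRESHOLD:
            return answer
    # Cache miss: call the model and remember the answer.
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    answer = resp.choices[0].message.content
    cache.append((q, answer))
    return answer
```

Hits skip the LLM call entirely, which is where the latency and cost savings come from; misses pay only one extra embedding call.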
Portkey CEO Rohit Agarwal shares practical tips from his own experience on crafting production-grade & reliable LLM systems. Read more LLM reliability tips here.
Rohit from Portkey is joined by Weaviate Research Scientist Connor for a deep dive into the differences between MLOps and LLMOps, building RAG systems, and what lies ahead for building production-grade LLM-based apps. This and much more in this podcast!