Author: Shubham Ashok Gandhi

  • CatBoost- Inner Workings and Optimizations

    Gradient boosting is a cornerstone technique for modeling tabular data due to its speed and simplicity. It delivers great results without any fuss. When you look around, you’ll see multiple options like LightGBM, XGBoost, etc. CatBoost is one such variant. In this post, we will take a detailed look at this model, explore its inner…

  • Statistical Testing 101

    In the era of LLMs, AI agents, and other groundbreaking technologies, one essential concept holds its ground: statistical testing. It serves as the foundation for many companies when evaluating whether a new feature or model resonates with their users. In this post, we’ll dive into Statistical Testing 101 to explore some of the key tests in…

  • SGD Classifier vs Logistic Regression

    Among all the classifiers provided by Sklearn, two stand out for their similarities: SGDClassifier and LogisticRegression. So, what differentiates the two? In this post, we will explore the key differences between SGDClassifier and LogisticRegression, starting with the most important ones. Optimization Difference: Logistic Regression uses solvers like lbfgs, saga, newton-cg…

  • What is KV-Cache?

    If you have been reading articles on LLMs, you have probably come across the term KV-Cache and the many tricks developers use to speed up LLMs. That is what we are going to do today: talk about KV-Cache and understand it in detail! Before we talk…

  • Fine-Tune Embeddings for RAG to Improve Retrieval

    Does your RAG application throw up bizarre, irrelevant content? The issue might not be your LLM but your retrieval. Fine-tuning embeddings for RAG can significantly improve retrieval quality. When your RAG application doesn’t work, you don’t need to throw a bigger LLM at it for better response generation. What you need to do is improve…

  • How I Optimize Memory Usage in Pandas

    This is not a blog post about code optimizations, data structures, etc. to reduce memory usage. Although that would be pretty useful too, this is about something more fundamental: how to optimize memory usage in Pandas! So let’s see how we can optimize memory usage in our ML workflows. While training…