Author: Shubham Ashok Gandhi
-
CatBoost: Inner Workings and Optimizations
Gradient boosting is a cornerstone technique for modeling tabular data thanks to its speed and simplicity. It delivers great results without any fuss. Looking around, you'll see multiple options such as LightGBM and XGBoost; CatBoost is one such variant. In this post, we will take a detailed look at this model, explore its inner…
-
Statistical Testing 101
In the era of LLMs, AI agents, and other groundbreaking technologies, one essential concept holds its ground: statistical testing. It serves as the foundation for many companies when evaluating whether a new feature or model resonates with their users. In this post, we'll dive into Statistical Testing 101 to explore some of the key tests in…
-
SGD Classifier vs Logistic Regression
Among all the classifiers provided by Sklearn, two stand out for their similarities: SGDClassifier and LogisticRegression. So, what differentiates the two? In this post, we will explore the key differences and compare SGD Classifier vs Logistic Regression. Let's start with the most important one, the optimization difference: Logistic Regression uses solvers like lbfgs, saga, newton-cg…
-
What is KV-Cache?
If you have been reading articles on LLMs, you have probably come across an interesting term called KV-Cache, and seen how developers are trying all sorts of trickery to speed up LLMs. And that is what we are going to do today: talk about KV-Cache and understand it in detail! Before we talk…
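As a rough preview of the idea, here is a toy, pure-Python sketch of autoregressive decoding with a KV-cache. The `kv` and `attend` helpers are simplified stand-ins I made up for this sketch, not real model code: the point is only the caching pattern, where each step computes keys/values for the newest token and appends them, instead of recomputing them for the whole prefix.

```python
import math
import random

random.seed(0)
D = 4  # embedding size of our toy model

def kv(x):
    # Stand-in for the model's key/value projections (identity here).
    return x, x

def attend(q, keys, values):
    # Scaled dot-product attention of one query over all cached keys/values.
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(D) for k in keys]
    m = max(scores)
    w = [math.exp(s - m) for s in scores]  # softmax weights
    z = sum(w)
    return [sum(wi * v[j] for wi, v in zip(w, values)) / z for j in range(D)]

# Autoregressive decoding loop with a KV-cache.
cache_k, cache_v = [], []
tokens = [[random.random() for _ in range(D)] for _ in range(5)]
for x in tokens:
    k, v = kv(x)          # compute K, V only for the newest token
    cache_k.append(k)
    cache_v.append(v)
    out = attend(x, cache_k, cache_v)

print(len(cache_k))  # one cached K (and V) per generated token
```

Without the cache, every decoding step would recompute K and V for the entire prefix, turning each step from O(1) projection work into O(n).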
-
How I Optimize Memory Usage in Pandas
This is not a blog post about code optimizations, data structures, and the like to reduce memory usage. Although that would be pretty useful too, this is something more fundamental: How to Optimize Memory Usage in Pandas! So let's see how we can optimize memory usage in our ML workflows. While training…
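The excerpt is truncated here, but one fundamental lever for Pandas memory is dtype choice. The sketch below assumes pandas is installed, and the column names are illustrative, not taken from the post:

```python
# Shrink a DataFrame by downcasting numeric columns and converting
# low-cardinality strings to the category dtype.
import pandas as pd

df = pd.DataFrame({
    "user_id": range(100_000),          # defaults to int64
    "rating": [3.0] * 100_000,          # defaults to float64
    "country": ["IN", "US"] * 50_000,   # defaults to object (strings)
})
before = df.memory_usage(deep=True).sum()

df["user_id"] = pd.to_numeric(df["user_id"], downcast="unsigned")
df["rating"] = pd.to_numeric(df["rating"], downcast="float")
df["country"] = df["country"].astype("category")  # only 2 unique values

after = df.memory_usage(deep=True).sum()
print(f"{before:,} -> {after:,} bytes")
```

The object column usually dominates: each Python string is a separate heap object, while `category` stores the unique values once plus a small integer code per row.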