Tabular data—structured information stored in rows and columns—is at the heart of most real-world machine learning problems, from healthcare records to financial transactions. Over the years, models based on decision trees, such as Random Forest, XGBoost, and CatBoost, have become the default choice for these tasks. Their strength lies in handling mixed data types, capturing complex feature interactions, and delivering strong performance without heavy preprocessing. While deep learning has transformed areas like computer vision and natural language processing, it has historically struggled to consistently outperform these tree-based approaches on tabular datasets.
That long-standing trend is now being questioned. A newer approach, TabPFN, introduces a different way of tackling tabular problems—one that avoids traditional dataset-specific training altogether. Instead of learning from scratch each time, it relies on a pretrained model to make predictions directly, effectively shifting much of the learning process to inference time. In this article, we take a closer look at this idea and put it to the test by comparing TabPFN with established tree-based models like Random Forest and CatBoost on a sample dataset, evaluating their performance in terms of accuracy, training time, and inference speed.
What is TabPFN?
TabPFN is a tabular foundation model designed to handle structured data in a completely different way from traditional machine learning. Instead of training a new model for every dataset, TabPFN is pretrained on millions of synthetic tabular tasks generated from causal processes. This allows it to learn a general strategy for solving supervised learning problems. When you give it your dataset, it doesn’t go through iterative training like tree-based models—instead, it performs predictions directly by leveraging what it has already learned. In essence, it applies a form of in-context learning to tabular data, similar to how large language models work for text.
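The "learning happens at prediction time" idea is easier to see with a toy estimator whose `fit()` merely stores the training data while all computation happens in `predict()`. This nearest-neighbor sketch only illustrates the fit-fast/predict-heavy pattern; it says nothing about TabPFN's actual transformer internals:

```python
import numpy as np

class InContextToyClassifier:
    """Toy estimator: fit() only stores the context; predict() does all
    the work, mirroring the computational profile of in-context learners."""

    def fit(self, X, y):
        # No iterative training -- just remember the training data.
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # All computation happens here: compare each test row against the
        # stored training rows and copy the nearest neighbor's label.
        dists = ((X[:, None, :] - self.X_[None, :, :]) ** 2).sum(axis=-1)
        return self.y_[dists.argmin(axis=1)]

clf = InContextToyClassifier().fit([[0, 0], [1, 1]], [0, 1])
print(clf.predict([[0.1, 0.2], [0.9, 0.8]]))  # -> [0 1]
```

Like TabPFN, this toy is instant to "fit" but pays its cost at inference, since every prediction touches the full training set.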
The latest version, TabPFN-2.5, significantly expands this idea by supporting larger and more complex datasets, while also improving performance. It has been shown to outperform tuned tree-based models like XGBoost and CatBoost on standard benchmarks and even match strong ensemble systems like AutoGluon. At the same time, it reduces the need for hyperparameter tuning and manual effort. To make it practical for real-world deployment, TabPFN also introduces a distillation approach, where its predictions can be converted into smaller models like neural networks or tree ensembles—retaining most of the accuracy while enabling much faster inference.
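The distillation idea itself is simple: train a small, fast student model to imitate the big model's predictions. The sketch below uses scikit-learn only, with a gradient-boosting model standing in for TabPFN as the teacher (an assumption for illustration; TabPFN's own distillation engine works differently under the hood):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Teacher: stands in for TabPFN here (illustrative assumption).
teacher = GradientBoostingClassifier(random_state=0).fit(X, y)

# Student: a small, fast model trained on the teacher's predicted labels
# rather than the ground truth, so it learns to imitate the teacher.
student = DecisionTreeClassifier(max_depth=6, random_state=0)
student.fit(X, teacher.predict(X))

# The student approximates the teacher at a fraction of the inference cost.
agreement = accuracy_score(teacher.predict(X), student.predict(X))
print(f"student-teacher agreement: {agreement:.3f}")
```

In practice the student would be evaluated on held-out data, but this captures the trade: give up a little accuracy for much cheaper inference.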
Comparing TabPFN with Tree-Based Models
Setting up the dependencies
pip install tabpfn-client scikit-learn catboost
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Models
from sklearn.ensemble import RandomForestClassifier
from catboost import CatBoostClassifier
from tabpfn_client import TabPFNClassifier
To run the model, you need a TabPFN API key, which you can obtain from https://ux.priorlabs.ai/home
import os
from getpass import getpass
os.environ["TABPFN_TOKEN"] = getpass("Enter TABPFN Token: ")
Creating the dataset
For our experiment, we generate a synthetic binary classification dataset using make_classification from scikit-learn. The dataset contains 5,000 samples and 20 features, out of which 10 are informative (actually contribute to predicting the target) and 5 are redundant (derived from the informative ones). This setup helps simulate a realistic tabular scenario where not all features are equally useful, and some introduce noise or correlation.
We then split the data into training (80%) and testing (20%) sets to evaluate model performance on unseen data. Using a synthetic dataset allows us to have full control over the data characteristics while ensuring a fair and reproducible comparison between TabPFN and traditional tree-based models.
X, y = make_classification(
    n_samples=5000,
    n_features=20,
    n_informative=10,  # features that actually drive the target
    n_redundant=5,     # linear combinations of informative features
    random_state=42
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
Testing Random Forest
We start with a Random Forest classifier as a baseline, using 200 trees. Random Forest is a robust ensemble method that builds multiple decision trees and aggregates their predictions, making it a strong and reliable choice for tabular data without requiring heavy tuning.
After training on the dataset, the model achieves an accuracy of 95.5%, which is a solid performance given the synthetic nature of the data. However, this comes with a training time of 9.56 seconds, reflecting the cost of building hundreds of trees. On the positive side, inference is relatively fast at 0.0627 seconds, since predictions only involve passing data through the already constructed trees. This result serves as a strong baseline to compare against more advanced methods like CatBoost and TabPFN.
rf = RandomForestClassifier(n_estimators=200)

# Time the training step
start = time.time()
rf.fit(X_train, y_train)
rf_train_time = time.time() - start

# Time the inference step
start = time.time()
rf_preds = rf.predict(X_test)
rf_infer_time = time.time() - start

rf_acc = accuracy_score(y_test, rf_preds)
print(f"RandomForest → Acc: {rf_acc:.4f}, Train: {rf_train_time:.2f}s, Infer: {rf_infer_time:.4f}s")
Testing CatBoost
Next, we train a CatBoost classifier, a gradient boosting model specifically designed for tabular data. It builds trees sequentially, where each new tree corrects the errors of the previous ones. Compared to Random Forest, CatBoost is typically more accurate because of this boosting approach and its ability to model complex patterns more effectively.
On our dataset, CatBoost achieves an accuracy of 96.7%, outperforming Random Forest and demonstrating its strength as a state-of-the-art tree-based method. It also trains slightly faster, taking 8.15 seconds, despite using 500 boosting iterations. One of its biggest advantages is inference speed—predictions are extremely fast at just 0.0119 seconds, making it well-suited for production scenarios where low latency is critical. This makes CatBoost a strong benchmark before comparing against newer approaches like TabPFN.
cat = CatBoostClassifier(
    iterations=500,
    depth=6,
    learning_rate=0.1,
    verbose=0
)

# Time the training step
start = time.time()
cat.fit(X_train, y_train)
cat_train_time = time.time() - start

# Time the inference step
start = time.time()
cat_preds = cat.predict(X_test)
cat_infer_time = time.time() - start

cat_acc = accuracy_score(y_test, cat_preds)
print(f"CatBoost → Acc: {cat_acc:.4f}, Train: {cat_train_time:.2f}s, Infer: {cat_infer_time:.4f}s")
Testing TabPFN
Finally, we evaluate TabPFN, which takes a fundamentally different approach compared to traditional models. Instead of learning from scratch on the dataset, it leverages a pretrained model and simply conditions on the training data during inference. The .fit() step mainly involves loading the pretrained weights, which is why it is extremely fast.
On our dataset, TabPFN achieves the highest accuracy of 98.8%, outperforming both Random Forest and CatBoost. The fit time is just 0.47 seconds, significantly faster than the tree-based models since no actual training is performed. However, this shift comes with a trade-off—inference takes 2.21 seconds, which is much slower than CatBoost and Random Forest. This is because TabPFN processes both the training and test data together during prediction, effectively performing the “learning” step at inference time.
Overall, TabPFN demonstrates a strong advantage in accuracy and setup speed, while highlighting a different computational trade-off compared to traditional tabular models.
tabpfn = TabPFNClassifier()

# "Fitting" mainly loads the pretrained model and conditions on the
# training data -- no iterative training takes place
start = time.time()
tabpfn.fit(X_train, y_train)
tabpfn_train_time = time.time() - start

# Prediction is where the in-context computation actually happens
start = time.time()
tabpfn_preds = tabpfn.predict(X_test)
tabpfn_infer_time = time.time() - start

tabpfn_acc = accuracy_score(y_test, tabpfn_preds)
print(f"TabPFN → Acc: {tabpfn_acc:.4f}, Fit: {tabpfn_train_time:.2f}s, Infer: {tabpfn_infer_time:.4f}s")
Results
Across our experiments, TabPFN delivers the strongest overall performance, achieving the highest accuracy (98.8%) while requiring virtually no training time (0.47s) compared to Random Forest (9.56s) and CatBoost (8.15s). This highlights its key advantage: eliminating dataset-specific training and hyperparameter tuning while still outperforming well-established tree-based methods. However, this benefit comes with a trade-off—inference latency is significantly higher (2.21s), as the model processes both training and test data together during prediction. In contrast, CatBoost and Random Forest offer much faster inference, making them more suitable for real-time applications.
From a practical standpoint, TabPFN is highly effective for small-to-medium tabular tasks, rapid experimentation, and scenarios where minimizing development time is critical. For production environments, especially those requiring low-latency predictions or handling very large datasets, newer advancements such as TabPFN’s distillation engine help bridge this gap by converting the model into compact neural networks or tree ensembles, retaining most of its accuracy while drastically improving inference speed. Additionally, support for scaling to millions of rows makes it increasingly viable for enterprise use cases. Overall, TabPFN represents a shift in tabular machine learning—trading traditional training effort for a more flexible, inference-driven approach.
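For quick reference, the metrics quoted above can be collected into a single summary table (these are the numbers reported in this article; exact values will vary from run to run):

```python
results = [
    # (model, accuracy, train/fit time in s, inference time in s)
    ("RandomForest", 0.955, 9.56, 0.0627),
    ("CatBoost",     0.967, 8.15, 0.0119),
    ("TabPFN",       0.988, 0.47, 2.21),
]

print(f"{'Model':<14}{'Acc':>8}{'Train(s)':>10}{'Infer(s)':>10}")
for name, acc, train_t, infer_t in results:
    print(f"{name:<14}{acc:>8.3f}{train_t:>10.2f}{infer_t:>10.4f}")
```

The table makes the trade-off explicit: TabPFN wins on accuracy and fit time, while CatBoost remains the clear choice for inference latency.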
The post How TabPFN Leverages In-Context Learning to Achieve Superior Accuracy on Tabular Datasets Compared to Random Forest and CatBoost appeared first on MarkTechPost.