Fig. 1: Overview of the proposed method. The 9 input features X pass through a gating MLP (feature attention): Linear (9 → 32), ReLU, Linear (32 → 9), Sigmoid. The gate outputs g ∈ (0, 1), one per feature, rescale the inputs as Xg = X ⊙ (1 + g), so each feature is multiplied by a scale factor between 1 and 2. The gated inputs feed the main MLP (three Dense + Activation blocks), whose output layer produces the prediction Ŷ (S11).
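
The gating stage in Fig. 1 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the gating operation is read as an elementwise product (suggested by the per-feature scale factor), the weights are random rather than trained, and the names `init_linear`, `gate`, and `gated_inputs` are hypothetical. The main MLP is not sketched because the figure does not give its layer widths or activation.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_linear(n_in, n_out):
    # Illustrative initialization only: small random weights, zero bias.
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes taken from the figure: Linear (9 -> 32), ReLU, Linear (32 -> 9), Sigmoid.
W1, b1 = init_linear(9, 32)
W2, b2 = init_linear(32, 9)

def gate(X):
    """Return one gate value in (0, 1) per feature, for each row of X."""
    h = np.maximum(X @ W1 + b1, 0.0)   # ReLU
    return sigmoid(h @ W2 + b2)

def gated_inputs(X):
    """Rescale each feature by (1 + g); assumes an elementwise product."""
    g = gate(X)
    return X * (1.0 + g), g

# Usage: a batch of 4 samples with 9 features each (synthetic values).
X = rng.normal(size=(4, 9))
Xg, g = gated_inputs(X)
```

Because the scale factor 1 + g never drops below 1, the gate in this form can only amplify a feature relative to the others; it cannot zero one out.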

[Figure: TabPFN overview. (a) TabPFN is trained on synthetic data to take entire datasets as inputs and predict in a single forward pass: given Xtrain, ytrain, and Xtest, the TabPFN neural network parameterized by θ outputs a prediction for ytest. The training loss -log qθ(ytest|...) is optimized across millions of datasets. (b) Architecture: the table cells are processed by 12 stacked 2D TabPFN layers, each consisting of 1D feature attention, 1D sample attention, and an MLP, which yield the predictions ŷtest.]
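
The alternating attention pattern in panel b can be illustrated with a simplified NumPy sketch. It is not the TabPFN implementation: the attention here is single-head with no learned projections, the residual connections and normalization are assumptions, any train/test masking is omitted, and the names `self_attention`, `layer_norm`, and `tabpfn_like_layer` are hypothetical. It shows only how attention is applied first across the features of each sample and then across the samples of each feature.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # Scaled dot-product self-attention over the second-to-last axis.
    # x has shape (batch, length, d); no learned projections in this sketch.
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ x

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def tabpfn_like_layer(E, W1, W2):
    # E: (n_samples, n_features, d), one embedding vector per table cell.
    # 1D feature attention: within each sample, cells attend across features.
    E = layer_norm(E + self_attention(E))
    # 1D sample attention: within each feature, cells attend across samples.
    Et = np.swapaxes(E, 0, 1)                 # (n_features, n_samples, d)
    Et = layer_norm(Et + self_attention(Et))
    E = np.swapaxes(Et, 0, 1)
    # Position-wise MLP applied to every cell embedding.
    return layer_norm(E + np.maximum(E @ W1, 0.0) @ W2)

# Usage: 5 samples, 3 features, embedding size 8, 12 stacked layers.
rng = np.random.default_rng(0)
E = rng.normal(size=(5, 3, 8))
for _ in range(12):
    W1 = rng.normal(0.0, 0.1, size=(8, 16))
    W2 = rng.normal(0.0, 0.1, size=(16, 8))
    E = tabpfn_like_layer(E, W1, W2)
```

Swapping the first two axes lets the same attention routine serve both directions, which is why the layer is described as two 1D attentions rather than one attention over all cells.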
