Atomic-level protein–ligand recognition with PBCNet2.0 for probe discovery

Jie Yu et al. · Paper · June 14, 2026

The model takes in a pair of 3D protein-ligand complexes: (Protein X, Ligand A) and (Protein X, Ligand B), where the ligands are docked in the same pocket of Protein X.
Mutation prediction is also supported: the input is then (Wild Type Pocket, Ligand A) and (Mutated Pocket, Ligand A).
- The mutation sensitivity portion is definitely the most interesting result. This is a well-known problem for existing biology models (e.g. AlphaFold is bad at capturing the effects of point mutations), so getting signal here is really cool (see Figure 4b in the paper).
  - I wonder how much of this performance is from a cherry-picked dataset.

Siamese neural setup: both complexes are passed through the same embedding layer.
- An MLP then takes the difference of the two embeddings as input. This is a very intentional choice; the alternative of feeding in both individual embeddings concatenated with their difference was also possible.
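To make the Siamese idea concrete for myself, here is a minimal pure-Python sketch. Everything in it is an assumption for illustration: the flat feature vectors, the random weights, the layer sizes, and the names `encode` and `predict_delta` are all hypothetical, and the real model embeds 3D graphs rather than four-number vectors. The only points it shows are weight sharing and a head that sees nothing but the embedding difference.

```python
import math
import random

def linear(W, b, x):
    """Affine map: one output per row of W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def random_layer(n_in, n_out, rng):
    """Random weights and biases standing in for learned parameters."""
    W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    b = [rng.uniform(-1, 1) for _ in range(n_out)]
    return W, b

def encode(enc, features):
    """Shared encoder: the same parameters embed both complexes."""
    W, b = enc
    return [math.tanh(v) for v in linear(W, b, features)]

def predict_delta(enc, head, complex_a, complex_b):
    """Relative-affinity head that only sees the embedding difference."""
    h_a = encode(enc, complex_a)
    h_b = encode(enc, complex_b)
    diff = [a - b for a, b in zip(h_a, h_b)]
    (W1, b1), (W2, b2) = head
    hidden = [max(0.0, v) for v in linear(W1, b1, diff)]  # ReLU
    return linear(W2, b2, hidden)[0]

rng = random.Random(0)
enc = random_layer(4, 8, rng)
head = (random_layer(8, 8, rng), random_layer(8, 1, rng))

complex_a = [0.2, -1.0, 0.5, 0.3]  # toy features for (Protein X, Ligand A)
complex_b = [0.1, -0.8, 0.7, 0.0]  # toy features for (Protein X, Ligand B)

delta_ab = predict_delta(enc, head, complex_a, complex_b)
delta_aa = predict_delta(enc, head, complex_a, complex_a)
delta_bb = predict_delta(enc, head, complex_b, complex_b)
```

Because the head only ever sees the difference, `delta_aa` and `delta_bb` are identical: any pair with equal embeddings collapses to the same input. A concatenation-based head would not have that property, which is presumably part of why the difference is the deliberate choice.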
The graphs used for message passing are protein-ligand interaction-aware.
The inductive bias via the 3x3 matrix decomposition is really nice; cool that it seems to work.

BindingDB 2023.12 is used (~3 million affinity measurements)
Ligand series are identified by Tanimoto similarity (0.4 threshold), and each series is then docked against a crystallized protein-ligand structure whose ligand is similar to the series.
- The labels are therefore experimental binding affinity differences, but the paired structures come from docking rather than from experiment.
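A rough sketch of how the series grouping and pairwise labels could look, assuming fingerprints are represented as sets of on-bit indices and affinities are in pIC50-like log units. The greedy "compare to the first member" grouping, the toy ligands, and the names `tanimoto`, `group_into_series`, and `pair_labels` are my own placeholders; the notes above only say that a 0.4 Tanimoto threshold is used, not how the clustering is actually done.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def group_into_series(fingerprints, threshold=0.4):
    """Greedy grouping (an assumption): join the first series whose
    first member is at least `threshold` similar, else start a new series."""
    series = []
    for name, fp in fingerprints.items():
        for members in series:
            if tanimoto(fp, fingerprints[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            series.append([name])
    return series

def pair_labels(series, affinity):
    """Ordered pairs within each series, labeled with the affinity difference."""
    pairs = []
    for members in series:
        for a in members:
            for b in members:
                if a != b:
                    pairs.append((a, b, affinity[a] - affinity[b]))
    return pairs

# Toy data: on-bit sets and made-up affinities in log units.
fingerprints = {
    "lig1": {1, 2, 3, 4},
    "lig2": {1, 2, 3, 5},
    "lig3": {10, 11, 12},
    "lig4": {10, 11, 13},
}
affinity = {"lig1": 7.2, "lig2": 6.5, "lig3": 5.0, "lig4": 5.9}

series = group_into_series(fingerprints)
pairs = pair_labels(series, affinity)
```

Here `lig1` and `lig2` share 3 of 5 bits (similarity 0.6) and land in one series, while `lig3` and `lig4` form a second one (similarity 0.5). Pairs are only formed within a series, so every label is a difference between two measured affinities for related ligands.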

The “almost FEP” claims are bad imo. This model almost surely suffers from the same data contamination issues that plagued Boltz-2. So saying things like
“Notably, PBCNet2.0 approaches the performance of the gold standard Schrödinger FEP+ (ρ = 0.70 and r = 0.70, Δρ = 0.03 and Δr = 0.04), with statistically indistinguishable ranking accuracy (P = 0.39).”

feels tacky to me.
The authors mention enforcing prediction antisymmetry through training; ideally, the model should respect \(\text{prediction}(A, B) = - \text{prediction}(B, A)\). It’s not clear to me why this shouldn’t be baked into the architecture. Probably would have been an interesting ablation, but maybe not worth the time.
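For what it's worth, here is one way antisymmetry could be baked in, sketched in pure Python. This is not the authors' method: it is a standard trick of taking the odd part of a function, \(g(d) = \tfrac{1}{2}\,(f(d) - f(-d))\), applied to a head \(f\) that sees the embedding difference \(d = h_A - h_B\). The toy MLP, its random weights, and the names `mlp_head` and `antisymmetric_head` are hypothetical.

```python
import math
import random

def mlp_head(params, d):
    """Tiny one-hidden-layer MLP with biases; not antisymmetric on its own."""
    W1, b1, w2, b2 = params
    hidden = [max(0.0, sum(w * x for w, x in zip(row, d)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def antisymmetric_head(params, d):
    """Odd part of the head: g(d) = (f(d) - f(-d)) / 2, so g(-d) = -g(d)."""
    neg_d = [-x for x in d]
    return 0.5 * (mlp_head(params, d) - mlp_head(params, neg_d))

rng = random.Random(0)
dim, n_hidden = 6, 5
params = (
    [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_hidden)],
    [rng.uniform(-1, 1) for _ in range(n_hidden)],
    [rng.uniform(-1, 1) for _ in range(n_hidden)],
    rng.uniform(-1, 1),
)

# Stand-ins for the two complex embeddings.
h_a = [rng.uniform(-1, 1) for _ in range(dim)]
h_b = [rng.uniform(-1, 1) for _ in range(dim)]
d_ab = [a - b for a, b in zip(h_a, h_b)]
d_ba = [b - a for a, b in zip(h_a, h_b)]

pred_ab = antisymmetric_head(params, d_ab)
pred_ba = antisymmetric_head(params, d_ba)
pred_same = antisymmetric_head(params, [0.0] * dim)
```

With this wrapper, `pred_ab` equals `-pred_ba` and a ligand compared with itself gets exactly zero, regardless of the weights. The cost is two forward passes through the head, which is cheap next to the encoder, so it is still not obvious to me why a training-time constraint would be preferred.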