Atomic-level protein–ligand recognition with PBCNet2.0 for probe discovery
Jie Yu et al. · Paper · June 14, 2026
General
- The model takes in a pair of 3D protein-ligand complexes: (Protein X, Ligand A) and (Protein X, Ligand B), where the ligands are docked in the same pocket of Protein X.
- Mutation prediction is also supported: the input is then (Wild Type Pocket, Ligand A) and (Mutated Pocket, Ligand A).
- The mutation sensitivity portion is definitely the most interesting result. Point mutations are a well-known weak spot for existing biology models (e.g. AlphaFold is bad at capturing their effects), so getting signal here is really cool (see Figure 4b in the paper).
- I wonder how much of this performance is from a cherry-picked dataset.
Architecture
- Siamese setup: both complexes are passed through the same embedding layer.
- An MLP then takes the difference of the two embeddings as input. This is a very intentional choice; feeding it both individual embeddings concatenated with their difference would also have been possible.
- The graphs used for message passing are protein-ligand interaction-aware.
- The inductive bias via the 3x3 matrix decomposition is really nice; cool that it seems to work.
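To make the Siamese-plus-difference idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the one-layer `embed` encoder stands in for the paper's graph network, the feature vectors stand in for 3D complexes, and the names `embed`, `mlp_head`, and `predict_delta` are hypothetical, not from the PBCNet2.0 code.

```python
import math
import random

def embed(features, enc_w):
    """Shared encoder: one tanh layer standing in for the real graph network."""
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in enc_w]

def mlp_head(diff, w_hidden, b_hidden, w_out):
    """Small MLP that sees only the embedding difference."""
    hidden = [
        math.tanh(sum(w * d for w, d in zip(row, diff)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden))

def predict_delta(complex_a, complex_b, enc_w, w_hidden, b_hidden, w_out):
    # Both complexes go through the same encoder with the same weights.
    emb_a = embed(complex_a, enc_w)
    emb_b = embed(complex_b, enc_w)
    # Only the difference reaches the head, not the individual embeddings.
    diff = [a - b for a, b in zip(emb_a, emb_b)]
    return mlp_head(diff, w_hidden, b_hidden, w_out)

# Toy parameters (random, untrained) and toy inputs (made-up feature vectors).
rng = random.Random(0)
enc_w = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]     # 4 features -> 3-dim embedding
w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)]  # 3 -> 5 hidden units
b_hidden = [rng.uniform(-1, 1) for _ in range(5)]
w_out = [rng.uniform(-1, 1) for _ in range(5)]

complex_a = [0.2, -0.5, 1.0, 0.3]
complex_b = [0.1, 0.4, -0.7, 0.9]

delta_ab = predict_delta(complex_a, complex_b, enc_w, w_hidden, b_hidden, w_out)
print(delta_ab)
```

Because the head in this sketch has biases, nothing forces the output to flip sign when the two complexes are swapped, which fits with antisymmetry being something the authors enforce through training.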
Data Curation
- BindingDB 2023.12 is used (~3 million affinity measurements)
- Ligand series are identified by Tanimoto similarity (0.4 threshold), and each series is then docked into a crystal structure of the protein bound to a ligand similar to that series.
- The labels are therefore experimental binding affinity differences between pairs of docked structures, i.e. the affinities are measured but the structures come from docking, not experiment.
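A rough sketch of the series-then-pairs idea, in plain Python. The assumptions are mine, not the paper's: fingerprints are represented as sets of on-bit indices, series are formed by a simple greedy pass, the threshold is applied as "greater than or equal to 0.4", and the ligand names and affinity values are made up. The function names are hypothetical.

```python
from itertools import combinations

def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def group_into_series(fingerprints, threshold=0.4):
    """Greedy grouping: a ligand joins the first series whose reference it resembles."""
    series = []  # each entry: (reference fingerprint, [ligand names])
    for name, fp in fingerprints.items():
        for ref_fp, members in series:
            if tanimoto(fp, ref_fp) >= threshold:
                members.append(name)
                break
        else:
            # No existing series is similar enough, so start a new one.
            series.append((fp, [name]))
    return [members for _, members in series]

def pair_labels(members, affinity):
    """Affinity-difference label for every unordered pair within one series."""
    return {(a, b): affinity[a] - affinity[b] for a, b in combinations(members, 2)}

# Made-up fingerprints and pIC50-like affinities, for illustration only.
fingerprints = {
    "lig1": {1, 2, 3, 4},
    "lig2": {1, 2, 3, 5},
    "lig3": {10, 11, 12},
}
affinity = {"lig1": 7.5, "lig2": 6.0, "lig3": 8.0}

series = group_into_series(fingerprints)
labels = pair_labels(series[0], affinity)
print(series)  # [['lig1', 'lig2'], ['lig3']]
print(labels)  # {('lig1', 'lig2'): 1.5}
```

Here lig1 and lig2 share 3 of 5 total bits (Tanimoto 0.6), so they land in one series and yield a single difference label, while lig3 shares nothing with them and ends up alone with no pairs.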
Criticisms
- The “almost FEP” claims are bad imo. This model almost surely suffers from the same data contamination issues that plagued Boltz-2. So saying things like
  “Notably, PBCNet2.0 approaches the performance of the gold standard Schrödinger FEP+ (ρ = 0.70 and r = 0.70, Δρ = 0.03 and Δr = 0.04), with statistically indistinguishable ranking accuracy (P = 0.39).”
  feels tacky to me.
- The authors mention enforcing prediction antisymmetry through training; ideally, the model should respect \(\text{prediction}(A, B) = - \text{prediction}(B, A)\). It’s not clear to me why this shouldn’t be baked into the architecture. Probably would have been an interesting ablation, but maybe not worth the time.
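For what it's worth, one generic way to bake antisymmetry in is to antisymmetrize any pairwise predictor by construction. The sketch below is not the authors' method; `raw_model` is a hypothetical toy function on scalars standing in for a full network on two complexes.

```python
import math

def raw_model(x_a, x_b):
    """Hypothetical pairwise predictor with no built-in symmetry."""
    return math.tanh(0.8 * x_a - 0.3 * x_b + 0.1) + 0.05 * x_a * x_b

def antisymmetric(model):
    """Wrap a pairwise model so that swapping the inputs flips the sign exactly."""
    def wrapped(x_a, x_b):
        return 0.5 * (model(x_a, x_b) - model(x_b, x_a))
    return wrapped

predict = antisymmetric(raw_model)

forward = predict(1.2, -0.4)
backward = predict(-0.4, 1.2)
print(forward, backward)  # equal magnitude, opposite sign
```

The wrapper costs two forward passes per prediction. A cheaper alternative, given that the head only sees the embedding difference, would be to make the head an odd function (for example, bias-free layers with odd activations such as tanh), though I haven't checked whether that hurts accuracy.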