A Generalization of Transformer Networks to Graphs
Do Transformers Really Perform Bad for Graph Representation?
Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA
TRANSFORMER-XH: MULTI-EVIDENCE REASONING WITH EXTRA HOP ATTENTION
Minimum Number in a Rotated Array & Search in a Rotated Sorted Array
GRAPH-BERT: Only Attention is Needed for Learning Graph Representations
Do Multi-Hop Question Answering Systems Know How to Answer the Single-Hop Sub-Questions?
PyTorch RNN: pack_padded_sequence() and pad_packed_sequence()
Multi-hop Attention Graph Neural Networks