// the find
bojone/kg-2019
Source code from the "科学空间" (Scientific Spaces) team for Baidu's 2019 triple extraction competition
Competition code from the 2019 Baidu knowledge graph triple extraction contest, where the entry finished in 7th place with an F1 of 0.8807. It implements a custom pointer-tagging annotation structure for relation extraction on top of a CNN + Attention model, which the author claims was novel relative to the literature at the time. Primarily of interest to NLP researchers studying Chinese relation extraction approaches from that era.
The annotation scheme is genuinely novel: a hierarchical pointer-tagging hybrid that sidesteps the entity-overlap problem (one entity taking part in several triples) common in pipeline relation extraction systems. The F1 of 0.8807 on a competitive public benchmark is a credible result, not a toy demo. The accompanying blog post (kexue.fm/archives/6671) gives a real technical explanation of the design decisions, which is more than most competition repos offer.
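To make the pointer-tagging idea concrete, here is a minimal sketch of how such a two-level scheme can encode overlapping triples. It is not taken from the repo: the function names (`encode`, `decode_spans`, `decode`), the toy English sentence, and the predicate labels are all hypothetical, the nearest-end pairing rule is a simple heuristic assumed for illustration, and the CNN + Attention model that would predict these tags is left out entirely. It is written for Python 3 rather than the repo's Python 2.7.

```python
def encode(tokens, triples, predicates):
    """Turn (subject_span, predicate, object_span) triples into pointer tags.

    Spans are (start, end) token indices, inclusive. All names here are
    illustrative and do not come from the repository.
    """
    n = len(tokens)
    pred_id = {p: i for i, p in enumerate(predicates)}
    # Level 1: two binary sequences marking where any subject starts / ends.
    subj_start = [0] * n
    subj_end = [0] * n
    # Level 2: for each subject, a start and an end pointer matrix of shape
    # (tokens x predicates) marking that subject's objects per predicate.
    obj_tags = {}
    for s_span, pred, o_span in triples:
        subj_start[s_span[0]] = 1
        subj_end[s_span[1]] = 1
        if s_span not in obj_tags:
            obj_tags[s_span] = {
                "start": [[0] * len(predicates) for _ in range(n)],
                "end": [[0] * len(predicates) for _ in range(n)],
            }
        k = pred_id[pred]
        obj_tags[s_span]["start"][o_span[0]][k] = 1
        obj_tags[s_span]["end"][o_span[1]][k] = 1
    return subj_start, subj_end, obj_tags


def decode_spans(start, end):
    """Pair each start pointer with the nearest end pointer at or after it
    (an assumed heuristic, not necessarily the author's decoding rule)."""
    spans = []
    for i, is_start in enumerate(start):
        if is_start:
            for j in range(i, len(end)):
                if end[j]:
                    spans.append((i, j))
                    break
    return spans


def decode(subj_start, subj_end, obj_tags, predicates):
    """Recover the set of triples from the two tag levels."""
    triples = set()
    for s_span in decode_spans(subj_start, subj_end):
        tags = obj_tags.get(s_span)
        if tags is None:
            continue
        for k, pred in enumerate(predicates):
            starts = [row[k] for row in tags["start"]]
            ends = [row[k] for row in tags["end"]]
            for o_span in decode_spans(starts, ends):
                triples.add((s_span, pred, o_span))
    return triples


# Toy example: one subject ("Acme Corp") shares an object ("Alice") across
# two predicates, which a single flat tag sequence could not represent.
tokens = ["Alice", "founded", "and", "runs", "Acme", "Corp", "in", "Paris"]
predicates = ["founder", "ceo", "location"]
triples = [
    ((4, 5), "founder", (0, 0)),
    ((4, 5), "ceo", (0, 0)),
    ((4, 5), "location", (7, 7)),
]
recovered = decode(*encode(tokens, triples, predicates), predicates)
```

Because each predicate gets its own start and end pointer column, the same object span can be tagged under several relations at once, and the round trip above recovers all three triples. In a trained system the tags would be model predictions thresholded to 0/1 rather than gold labels.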
Requires Python 2.7 and Keras 2.2.4 with TensorFlow 1.8, none of which has received security patches in years, and the author explicitly refuses to help with Python 3 porting. The repo is five files in total, with no tests, no training pipeline documentation beyond 'run data_trans.py first', and no pretrained weights. The dataset itself requires a separate download from the competition organizers, who may or may not still be hosting it.