finds.dev

// the find

bojone/kg-2019

★ 767 · Python · updated May 2020

Source code from the "科学空间队" (Scientific Spaces) team for Baidu's 2019 triple extraction competition

Competition code from the 2019 Baidu knowledge graph triple extraction contest, where it finished in 7th place with an F1 of 0.8807. It implements a custom pointer-tagging annotation structure for relation extraction on top of a CNN + Attention model, which the author claims was novel relative to the literature at the time. Primarily of interest to NLP researchers studying Chinese relation extraction approaches from that era.

The annotation scheme is genuinely novel: a hierarchical pointer-tagging hybrid that sidesteps the entity-overlap problem common in pipeline relation extraction (RE) systems. The F1 of 0.8807 on a competitive public benchmark is a credible result, not a toy demo. The accompanying blog post (kexue.fm/archives/6671) gives a real technical explanation of the design decisions, which is more than most competition repos offer.
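To make the overlap point concrete, here is a minimal sketch of a pointer-tagging label layout. It is an illustration of the general idea only, not the repository's code: the function names, the toy sentence, and the relation names (`singer_of`, `composer_of`) are all hypothetical. Subjects are marked with one pair of 0/1 start/end vectors, and for a chosen subject each relation gets its own pair of start/end vectors for objects, so one entity pair can carry several relations without the labels colliding.

```python
def encode_spans(n_tokens, spans):
    """Return (starts, ends): 0/1 lists marking each span's first and last token."""
    starts = [0] * n_tokens
    ends = [0] * n_tokens
    for first, last in spans:
        starts[first] = 1
        ends[last] = 1
    return starts, ends


def decode_spans(starts, ends):
    """Pair every start position with the nearest end at or after it."""
    spans = []
    for i, is_start in enumerate(starts):
        if not is_start:
            continue
        for j in range(i, len(ends)):
            if ends[j]:
                spans.append((i, j))
                break
    return spans


def encode_objects(n_tokens, relations, triples, subject):
    """For one subject span, build per-relation start/end tags for its objects."""
    tags = {rel: encode_spans(n_tokens, []) for rel in relations}
    for subj, rel, obj in triples:
        if subj == subject:
            starts, ends = tags[rel]
            starts[obj[0]] = 1
            ends[obj[1]] = 1
    return tags


# Hypothetical toy example: one subject, one object, two relations between them.
tokens = ["Jay", "Chou", "sang", "and", "composed", "Qilixiang"]
relations = ["singer_of", "composer_of"]
triples = [
    ((0, 1), "singer_of", (5, 5)),
    ((0, 1), "composer_of", (5, 5)),
]

# Step 1: tag subjects across the whole sentence.
subject_tags = encode_spans(len(tokens), [(0, 1)])
subjects = decode_spans(*subject_tags)

# Step 2: for each subject, tag objects separately per relation.
recovered = []
for subject in subjects:
    object_tags = encode_objects(len(tokens), relations, triples, subject)
    for rel in relations:
        for obj in decode_spans(*object_tags[rel]):
            recovered.append((subject, rel, obj))
```

In a trained model the 0/1 vectors would be predicted probabilities thresholded per position; here they are built from gold triples simply to show that the layout round-trips overlapping triples.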

Requires Python 2.7 and Keras 2.2.4 with TensorFlow 1.8, none of which have received security patches in years, and the author explicitly refuses to help with Python 3 porting. Five files total: no tests, no training pipeline documentation beyond 'run data_trans.py first', and no pretrained weights. The dataset itself requires a separate download from the competition organizers, who may or may not still be hosting it.

