Name | Size | Keywords | Download link | Reference
Course Prerequisite Relation | 3.99 MB | Prerequisite Relation, Relation Learning | data&code | ACL'17

Course Prerequisite Relation

This is the dataset for the paper "Prerequisite Relation Learning for Concepts in MOOCs" (ACL 2017).

DSA and ML are the datasets for the course categories "Data Structure and Algorithm" and "Machine Learning", respectively. Each dataset contains three kinds of files: Captions, Core_concepts, and Labeled.

1. Captions file
Video captions of the MOOC courses in the dataset. Each line represents one video. Captions are in standard JSON format, and each domain contains at least two courses. The text has been tokenized and POS-tagged with the tagger implemented by the Stanford NLP group (http://nlp.stanford.edu/software/tagger.shtml).
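A minimal Python sketch for reading a Captions file is shown here. It assumes that each non-empty line holds one complete JSON value (one video). The function name iter_captions and the field names in the sample record are hypothetical; the actual record layout should be checked against the data files.

```python
import json
from typing import Any, Iterable, Iterator


def iter_captions(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line (one line = one video)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# In-memory sample; the "tokens"/"tags" fields are made up for illustration
# and are NOT the documented schema of the dataset.
sample = ['{"tokens": ["binary", "tree"], "tags": ["JJ", "NN"]}', ""]
videos = list(iter_captions(sample))
```

To read a real file, pass an open file handle instead of the sample list, for example iter_captions(open(path, encoding="utf-8")), where path points to a Captions file.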

2. Core_concepts file
Course core concepts extracted from the dataset. Each line represents a core concept of the courses' domain.
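A short Python sketch for loading a Core_concepts file follows. It assumes plain text with one concept per line; the function name load_core_concepts and the sample concepts are hypothetical.

```python
from typing import Iterable, List


def load_core_concepts(lines: Iterable[str]) -> List[str]:
    """Return concepts in file order, skipping blank lines and duplicates."""
    seen = set()
    concepts = []
    for line in lines:
        concept = line.strip()
        if concept and concept not in seen:
            seen.add(concept)
            concepts.append(concept)
    return concepts


# In-memory sample; these concept names are made up for illustration.
concepts = load_core_concepts(["binary tree\n", "heap\n", "\n", "heap\n"])
```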

3. Labeled file
The prerequisite relations extracted from the dataset. Each line contains three items: head concept, tail concept, and label. The label is the human-annotated label for the relation. "1" means that the head and tail concepts have a prerequisite relation, and "w-label" indicates that both A-B and B-A are prerequisite relations.
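The three-item line format can be parsed with a sketch like the following. The field separator is not stated in this description, so a tab is assumed and exposed as the sep parameter. The names LabeledPair, parse_labeled, and positive_pairs are hypothetical, and labels are kept as strings because only "1" is documented here.

```python
from typing import Iterable, List, NamedTuple, Tuple


class LabeledPair(NamedTuple):
    head: str
    tail: str
    label: str  # kept as text; only "1" is documented in this description


def parse_labeled(lines: Iterable[str], sep: str = "\t") -> List[LabeledPair]:
    """Parse 'head<sep>tail<sep>label' lines; the separator is an assumption."""
    pairs = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        parts = [p.strip() for p in line.split(sep)]
        if len(parts) != 3:
            raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
        pairs.append(LabeledPair(*parts))
    return pairs


def positive_pairs(pairs: Iterable[LabeledPair]) -> List[Tuple[str, str]]:
    """Keep only the (head, tail) pairs annotated with label "1"."""
    return [(p.head, p.tail) for p in pairs if p.label == "1"]


# In-memory sample; the concepts and the "0" label are made up for illustration.
sample = ["binary tree\tred-black tree\t1\n", "heap\tqueue\t0\n"]
pairs = parse_labeled(sample)
positives = positive_pairs(pairs)
```

If the files use a different delimiter, pass it through sep rather than changing the parser.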

You may use the dataset to test your prerequisite relation extraction model or for other related work.