Data Model
Overview
This page describes the metadata structure of the Text+ Tutorial dataset. The goal is to uniformly document the fields, value ranges, and definitions used in data collection and curation processes.
The metadata fields serve to systematically capture and classify learning resources and to make them discoverable. They support the interoperability, reusability, and quality assurance of the dataset and enable consistent integration into existing research and education infrastructures.
Each tutorial in the dataset is identified by a unique local identifier and supplemented with information about content, target audience, difficulty level, language, license, provenance, and other properties. Some fields accept free text (e.g., Description, Keywords), while others are restricted to predefined value ranges or boolean values to ensure consistency and comparability.
Tutorial Metadata Description
- LocalID: An internal local ID that identifies a tutorial entry. Follows the pattern ttp00001 (Tutorials Text Plus).
- Title: Title of the tutorial. During initial data collection, the title was adopted as given in the respective source.
- Contributor: Identifies the contributors to the tutorial. These can be authors, instructors, editors, etc.
- mediaType: Field with predefined value range that identifies the type of learning resource according to the format of content presentation. Possible values: Article, Book, Case Study, Code Notebook, Diagram, Drill and Practice, Lecture, Poster, Report, Tutorial, Webpage, Workshop, Online course.
- targetGroup: Field with predefined value range that describes the target audience of the learning resource, i.e., for which learning or experience level it is designed. Possible values: Bachelor Student, Master’s Student, PhD Student, Data Steward.
- Difficulty: Field with predefined value range that indicates the difficulty or complexity level of the resource. Possible values: Low, Intermediate, Advanced.
- license: Field that indicates the resource’s license, i.e., the legal terms of use according to a standardized license form (e.g., CC BY 4.0).
- dateVersion: Field that indicates the publication date of the resource. If not available, the date of the last update or publication of the current version is used.
- language: Field that designates the language in which the content of the resource is written or presented (e.g., German, English).
- relatedTo: Field that documents relationships between different tutorials within the dataset (e.g., Part 1 and Part 2 of a series). The value contains a list of internal tutorial IDs that are linked together.
- Description: Description of the tutorial and its contents.
- keywords: Field that contains a list of keywords describing the content or thematic focus of the resource and improving discoverability.
- learningObjective: Field with predefined value range that describes the learning objective or content focus of the resource. Possible values: Programming skills, Use cases, Tool, Service, Method.
- URL: Field that indicates the web address of the learning resource or its main page through which it is directly accessible.
- Provenance: Field that describes the origin of the resource, i.e., from which source or platform the tutorial was adopted during initial data collection (e.g., specific websites or tutorial directories).
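The constraints described in the list above can be checked mechanically. The following is a minimal sketch, not part of the dataset tooling: the function name validate_record, the dictionary representation of a record, and the reading of the LocalID pattern as "ttp" followed by exactly five digits (inferred from the example ttp00001) are assumptions. The sketch also assumes that a restricted field may hold either a single value or a list of values.

```python
import re

# Controlled vocabularies, copied from the field descriptions above.
CONTROLLED_FIELDS = {
    "mediaType": {
        "Article", "Book", "Case Study", "Code Notebook", "Diagram",
        "Drill and Practice", "Lecture", "Poster", "Report", "Tutorial",
        "Webpage", "Workshop", "Online course",
    },
    "targetGroup": {
        "Bachelor Student", "Master’s Student", "PhD Student", "Data Steward",
    },
    "Difficulty": {"Low", "Intermediate", "Advanced"},
    "learningObjective": {
        "Programming skills", "Use cases", "Tool", "Service", "Method",
    },
}

# Assumption: "ttp" plus exactly five digits, inferred from "ttp00001".
LOCAL_ID_PATTERN = re.compile(r"ttp\d{5}")


def validate_record(record: dict) -> list[str]:
    """Return a list of human-readable problems; empty if none were found."""
    errors = []

    # LocalID must be present and follow the assumed pattern.
    local_id = str(record.get("LocalID", ""))
    if not LOCAL_ID_PATTERN.fullmatch(local_id):
        errors.append(f"LocalID {local_id!r} does not match the pattern ttp00001")

    # Restricted fields: every supplied value must be in the vocabulary.
    for field, allowed in CONTROLLED_FIELDS.items():
        value = record.get(field)
        if value is None:
            continue  # this sketch does not enforce which fields are mandatory
        values = value if isinstance(value, list) else [value]
        for item in values:
            if item not in allowed:
                errors.append(f"{field}: {item!r} is not a permitted value")

    # relatedTo holds a list of internal tutorial IDs.
    for related in record.get("relatedTo", []):
        if not LOCAL_ID_PATTERN.fullmatch(str(related)):
            errors.append(f"relatedTo: {related!r} is not a valid LocalID")

    return errors


# Invented example record for illustration only; not taken from the dataset.
example = {
    "LocalID": "ttp00001",
    "Title": "Example tutorial",
    "mediaType": "Tutorial",
    "Difficulty": "Low",
    "relatedTo": ["ttp00002"],
}
problems = validate_record(example)
```

Returning a list of problems rather than raising on the first one lets a curator see every issue in an entry at once. Checks for the free-text fields, the license, or the date format are left out because the description above does not fix their syntax.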