Title:
An Assistant to Populate Repositories: Gathering Educational Digital Objects and Metadata Extraction
Authors:
Casali, Ana and Deco, Claudia and Beltramone, Santiago
Index Terms:
Data mining;Crawlers;Metadata;Systems architecture;Electronic mail;Web pages;Indexes;Information Gathering;Educational Digital Objects;Repositories;Metadata Extraction
Abstract:
This paper presents an assistant to populate institutional repositories. The tool detects educational digital objects in text format that are already published on institutional websites and can be uploaded to a repository. Collecting these objects is a tedious task that is usually performed manually. We propose a system architecture that automates the collection of text documents within a restricted domain in order to detect documents that are plausible candidates for loading into a repository. In addition, their metadata, such as language, category, title, authors, and the authors' contact data, are automatically extracted. A prototype of this system was developed, and case studies in two different domains are analyzed.
DOI:
10.1109/RITA.2016.2554018
How to cite:
Casali, Ana and Deco, Claudia and Beltramone, Santiago, "An Assistant to Populate Repositories: Gathering Educational Digital Objects and Metadata Extraction," in IEEE Revista Iberoamericana de Tecnologias del Aprendizaje, pp. 87-94, May 2016. doi: 10.1109/RITA.2016.2554018