Research Data Management in Data Intensive Computing
The summer school introduces Data Intensive Computing and the related challenges of managing large research data sets. The modules focus on management aspects from a producer/consumer perspective. The school is divided into two emphases:
- Infrastructure (EmInf): aimed at computer scientists and related fields
- Data Users (EmUse): aimed at large data users and/or creators
Who can apply
- EmInf: Graduate students (Master's and PhD) in Computer Science or related fields from any university
- EmUse: Graduate students (Master's and PhD) from any field at UFPR
How to Subscribe
- UFPR Graduate students should subscribe to the PPGInf course “INFO7070 – Tópicos Especiais I” using the SIGA platform.
- External students should request enrollment in an isolated course from PPGInf.
- UFPR Graduate students should subscribe to the PRPPG course “PRPPG7007” using the SIGA platform. For more information, see: http://www.prppg.ufpr.br/site/research-data-management-in-data-intensive-computing-2019-1/
When and Where
11 to 29 March 2019
- EmInf: Monday to Friday, 9:00 to 12:30
- Auditorium at the Departamento de Informática, Centro Politécnico, UFPR
- EmUse: Monday, Tuesday and Thursday, 16:00 to 19:30
- CIFLOMA Auditorium – Engenharia Florestal, Campus Jardim Botânico (Avenida Prefeito Lothário Meissner, 632)
Summer School Modules
- Practical aspects and implementation of (Research) Data Management (EmInf): Since no monolithic turnkey solution for Research Data Management exists, knowledge of various technology modules is needed to build a coherent RDM strategy. This module introduces a selection of these building blocks in a hands-on fashion. The topics explored in practice are object stores based on S3 and iRODS and their application to Research Data Management.
- Lecturers: Jonathan Bauer, Bernd Wiebelt
- Resource Abstraction (EmInf): The module covers the concepts available in modern (Cloud) Computing, from Infrastructure as a Service to Containers, and highlights practical challenges and open issues in the current state of the art. Storage, an important resource for handling state while still using modern elastic approaches, is analysed further by looking at distributed database systems.
- Lecturers: Christopher Hauser, Daniel Seybold
- Large Scale Research Infrastructures (EmInf): Data Intensive Computing focuses on the infrastructure required to store and process large amounts of research data. Starting with the planning of Cloud and HPC infrastructures, this module proceeds to the processing and analysis of large continuous data sets. Cloud computing as an operating model for processing research data and encapsulating execution environments is considered in practice.
- Lecturers: Bernd Wiebelt, Jonathan Bauer
- Research Data Management – An Introduction (EmUse): The core concepts of research data management cover the basics of digital data and metadata, as well as persistent identifiers for the long-term addressing of research data. This module introduces the problem statements and defines the terminology and basic concepts.
- Lecturers: Dirk von Suchodoletz
- Long Term Perspective of Access, Abstraction of Computer Systems (EmUse): The long-term perspective of access introduces the problem of reproducibility of research data and sketches solutions. The ongoing development of execution environments has historically led to incompatibilities that hinder or even block the long-term use of research data. Research data intended for long-term archiving therefore requires metadata about its execution environment. With this information, virtualisation and emulation can be used to enable use and even reproducibility on differing execution environments.
- Lecturers: Klaus Rechert, Gerhard Schneider
- RDM Organizational Perspectives (EmUse): This institutional perspective deals with planning and realising strategies for research data management. Besides the strategic orientation of universities or research environments, this module covers organisational aspects such as certification, licensing, access plans, and the resources needed to provide research data. In particular, it addresses providing access to external research data, and providing access to (and hence publishing) produced research data, with respect to government and financial planning.
- Lecturers: Gerhard Schneider, Ligia E. Setenareski, Marcos Didonet del Fabro
- Field Specific aspects and Data Intensive Computing (EmInf): This module explains problem statements and solutions of research data management using practical examples from specific fields. The main focus is on “digital humanities” and research data management in machine learning, as well as Open Government Data.
- Lecturers: Stefan Wesner, Daniel Weingaertner, Paulo Soethe, Luiz Eduardo Soares de Oliveira