AI Training Series - High Performance Data Analytics Using R at LRZ

This course is part of the "LRZ AI Training Series", a series of courses aimed at the needs and expectations of data analytics, big data and AI users at LRZ. While it focuses on these users and their use cases, this session, like all other courses in the AI Training Series, is open to all interested parties.

This course for academic participants from Germany will be organised as an online event.

Contents

R is a highly popular and powerful programming language for data analysis and graphics, used in many research domains. The Leibniz Supercomputing Centre (LRZ) is addressing the needs of R users by facilitating various ways of working with R on LRZ systems.

For one, it hosts an RStudio Server web application as a frontend to the LRZ AI Systems. This is an easy-to-use, powerful, interactive platform for data analytics, machine learning and AI projects. Additionally, R can be used on the high performance computing (HPC) systems operated by LRZ: the Linux Cluster and SuperMUC-NG.

In this course, the different ways of using R for data analytics, machine learning and AI projects at LRZ will be demonstrated and practised in hands-on sessions. Guidelines and best-practice examples for running R applications efficiently and productively on the various systems will be provided. Special attention will be paid to different ways of parallelizing R code in order to make use of the various LRZ cluster systems. There will be breaks during the session.

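There will be three content blocks of roughly one and a half hours each. The letters after each topic indicate its content level: (B) beginner, (I) intermediate, (A) advanced.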

  • The LRZ AI Systems and RStudio Server (B) / (I)
  • RStudio Server for AI projects (B) / (I)
  • R and the LRZ AI Systems: R package management and containerization (B) / (I)

 

  • R on the LRZ Linux Cluster: environment modules, R package management (B) / (I)
  • Slurm Workload Manager, interactive session, job processing (B) / (I)
  • Parallelization Using R: Overview and resources (I)
  • Pleasingly parallel workloads (B) / (I)

 

  • Introduction to worker queue scenario/weak coupling (incl. batchtools, clustermq) (I) / (A)
  • Shared memory parallelization (parallel/doParallel, foreach) (I) / (A)
  • Message passing (Rmpi/doMPI) (I) / (A)
  • Futures/Promises (parallel, future, doFuture) (A)
  • Workflow management (targets, crew) (A)

Prerequisites

  • Basic knowledge of R
  • AI Training Series: Orientation Session (or comparable previous knowledge)
  • AI Training Series: Introduction to Container Technology & Application to AI at LRZ (or comparable previous knowledge)
  • AI Training Series: Introduction to the LRZ AI Systems (or comparable previous knowledge)

Content Level

The content level of the course is broken down as:

Beginner's content: 1.0 h (22%)
Intermediate content: 2.0 h (45%)
Advanced content: 1.5 h (33%)
Community-targeted content: 0.0 h (0%)

Language

English

Lecturers

Dr. Johannes Albert-von der Gönna (LRZ)

Prices and Eligibility

The course is open and free of charge for academic participants from Germany.

Registration

Please register with your official e-mail address to prove your affiliation.

Withdrawal Policy

See Withdrawal

Legal Notices

For registration for LRZ courses and workshops we use the service edoobox from Etzensperger Informatik AG (www.edoobox.com). Etzensperger Informatik AG acts as processor and we have concluded a Data Processing Agreement with them.

See Legal Notices

Online Course: AI Training Series - High Performance Data Analytics Using R at LRZ
Number: hdta5s24
Available places: 92
Date: 10.07.2024
Price: EUR 0.00
Location: ONLINE
Registration deadline: 05.07.2024 23:59
E-mail: education@lrz.de

Schedule

Session 1 (Lecture)
Date: 10.07.2024
Time: 10:00 – 16:00
Leader: Johannes Albert-von der Gönna
Location: ONLINE