Polyglot Workshop – Developing LLMs for Low-Resource Languages

Europe/Berlin
2. OG (IMPULSE-Haus)

Adenauerallee 131, 53113 Bonn
Aniket Sen, Nicholas Kluge, Shiza Fatimah, Sophia Falk
Description

“Polyglot Workshop – Developing LLMs for Low-Resource Languages” is a four-day, hands-on program for sharing expertise in building large language models, with a special emphasis on underrepresented languages. Our goal is to empower a new generation of practitioners to advance LLM development beyond high-resource settings. The workshop covers every key stage of an LLM training pipeline and aims to make this knowledge accessible to a broad and diverse audience.

 

Registration
Pre-Registration for the Polyglot Workshop
    • Setting up your Workstation (optional!) (2. OG)

      An introduction to the prerequisite concepts attendees need to take full advantage of the workshop (e.g., SSH basics and an introduction to Linux).

      Conveners: Alexander Ermakov, Aniket Sen, Nicholas Kluge, Shiza Fatimah
    • Introduction to LLMs (2. OG)

      Participants are introduced to the fundamentals of LLMs. What are they? How do they work? What is cool about them? What will you be learning over the coming days?

      Conveners: Aniket Sen, Nicholas Kluge
    • 12:00
      Lunch Break (2. OG)

    • Success Stories (2. OG)

      Case studies from teams and researchers who have developed LLMs or NLP tools for low-resource languages, highlighting both obstacles and breakthroughs.
      Speaker 1: Marcellus Amadeus
      Speaker 2: TBC.

      Convener: Marcellus Amadeus
    • Marvin Tour

      A guided tour of the Marvin Cluster. How does it operate? How was it built? Who are the brave souls who maintain it?

      Convener: Jan Steiner
    • Introduction to HPC (2. OG)

      Participants are introduced to the fundamentals of High-Performance Computing (HPC), including how HPC infrastructures can support LLM training, with hands-on demos.

      Convener: Jan Steiner
    • 12:00
      Lunch Break (2. OG)

    • Working with Datasets (2. OG)

      Participants are introduced to the fundamentals of working with text datasets (e.g., downloading, documenting, deduplicating, filtering, and creating synthetic data). Extra: participants will also receive an introduction to the basics of tokenization and text encoding.

      Conveners: Nicholas Kluge, Shiza Fatimah
    • LLM Pretraining 101 (2. OG)

      Participants are introduced to the fundamentals of pretraining LLMs (e.g., model architecture, parallel training, optimization, monitoring, evaluation).

      Conveners: Aniket Sen, Nicholas Kluge
    • 12:00
      Lunch Break (2. OG)

    • Post-Training and Alignment (2. OG)

      Participants are introduced to the fundamentals of fine-tuning and aligning LLMs (e.g., SFT, DPO, Constitutional AI).

      Conveners: Florian Mai, Nicholas Kluge
    • Environmental Practices to LLM Development (2. OG)

      An exposé on the ecological impact of training LLMs, with emphasis on sustainable practices.

      Conveners: Nicholas Kluge, Sophia Falk
    • 12:00
      Lunch Break (2. OG)

    • LLMs for Humanities (2. OG)

      A lecture on possible applications of LLMs in humanities research.

      Conveners: Alexander Ermakov, Nicholas Kluge