Sharing computational training material at larger scale: a French multi-tenant attempt#


Nicolas M. Thiéry, Professor, Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Université Paris-Saclay

Joint work with Pierre Augier, Éléonore Barthelian, Françoise Conil, Loïc Grobol, Chiara Marmo, Olha Nahorna, Pierre Poulain, N. T., Jeremy Laforet, …

October 1st of 2025, PyData Paris 2025

Abstract#

With the rise of computation and data as pillars of science, institutions are struggling to provide large-scale training to their students and staff. Often, this leads to redundant, fragmented efforts, with each organization producing its own bespoke training material. In this talk, we report on a collaborative multi-tenant initiative to produce a shared corpus of interactive training resources in the Python language, designed as a digital common that can be adapted to diverse contexts and formats in French higher education and beyond.

Despite continuous efforts like Unisciel or FUN MOOC, training material reuse remains very limited in French higher education. To some extent, this is cultural: curricula are not standardized across universities, and there is no textbook tradition. Beyond intellectual property, language, and cultural barriers, instructors need or want to adapt the training material to the split into teaching units, the audience, the format, and their pedagogical choices. Computational training material poses unique challenges, as it must be adapted to various technological choices or constraints, including programming language, computational libraries, computing environments, and infrastructure. It also needs to be continuously maintained to keep up with evolving technology, which is incompatible with reuse patterns such as “copy-and-forget”.

We describe the team’s use cases (from undergraduate to lifelong teaching, from computer science students to non-specialists, from intensive week-long workshops to unsupervised learning), the sources of inspiration and reuse (MOOCs, Software Carpentry, …), the current status and content (introductory programming, …, development tools, and best practices), and the computational environment and authoring tools (Jupyter, MyST, Jupyter-Book, version control, software forge, and CI). We then explore some levers to facilitate sharing and reuse (modularity, gamification and decontextualisation, portability, adaptive learning, machine-assisted multilingual authoring).

This talk is intended for instructors, students, potential contributors, and anyone interested in computational and scientific software engineering education.

Yet another Python course? Really? Why?#

Observations#

Rise of Computation and Data

as pillars of science, and beyond …

Major training needs

  • Computing, data processing, machine learning, …

  • Programming, software engineering, open science, …

Major efforts

  • MOOCs: Python, scikit-learn, FIDDLE, …

  • Online platforms: France IOI, …

  • Libraries of teaching resources: Unisciel, …

  • Software Carpentry, …

  • A flurry of courses delivered by universities, SMEs, …

Yet, in practice

Very little reuse

Example at Université Paris-Saclay

  • aim to deliver some basic computational training to most students (and staff)

  • 10+ independently crafted teaching units

    • covering about the same scope:
      «Computing 101»: basic programming, computing and visualization

    • using about the same technology:
      Python, Jupyter, numpy, pandas, matplotlib, …

Barriers to reuse of computational training material in higher education#

Cultural barriers

  • no standardized modular curricula

  • no textbook tradition

  • barely emerging open science tradition in education

  • language: French? English?

  • personal touch on education

Technological barriers

  • programming language, computational libraries, …

  • computing environment, infrastructure, …

  • quickly evolving technology, paradigms, and even science

  • personal taste

Diversity of public

  • complete beginners to experts (possibly in the same room)

  • from math, physics, computer science, chemistry, biology, geosciences, sports sciences, economics, humanities, …

  • bachelor, master, PhD, engineers, researchers, …

How to grab their interest? Fit their constraints?

Diversity of formats

  • online courses

  • small to large scale physical courses (10-300 students, one semester)

  • intensive training sessions and summer schools (3-5 days)

  • lectures? recitations? projects?

Time pressure

  • high quality, reusable and reused: a high value long term investment

  • quick and dirty: oh well, good enough for tomorrow’s class

Py-edu-fr in a nutshell#

An emerging cross-institution, cross-profession community

From Pierre Augier, calcul@listes.math.cnrs.fr, 15/01/2025:
«… I feel that working only at the scale of our small group in Grenoble is a bit of a pity, and that a national (or even French-speaking) level would be reasonable. …»

  • Pierre Augier, Researcher in Fluid Mechanics, CNRS, Université Grenoble Alpes

  • Eleonore Barthenlian, Data scientist

  • Françoise Conil, CNRS Software Engineer at LIRIS laboratory in Lyon

  • Loïc Grobol, Associate Professor in Computational Linguistics at Université Paris Nanterre

  • Chiara Marmo, Research Software Engineer in Astronomy, Geosciences and Computer Science, Université Paris-Saclay

  • Olha Nahorna, Research Engineer in Data Analysis, CNRS, Bordeaux Sciences Économiques (BSE)

  • Pierre Poulain, Associate Professor in bioinformatics, Université Paris Cité

  • N. T., Professor in Computer Science, Université Paris-Saclay

  • Jeremy Laforet, Research Engineer in Biomedical modeling, CNRS

  • … and you?

trying to share open educational material

  • Python based?

  • for Higher Education and Research?

  • for France? French speaking countries?

  • FAIR principles: Findable, Accessible, Interoperable, Reusable

XKCD about n+1 standards

Current status#

Content

  • Introduction à la programmation avec Python et Jupyter (“Programming and Computing 101”)

    • In French

    • About 80 Jupyter worksheets / 14h of course

    • Building on previous work in Paris-Saclay and elsewhere

    • Available online and beta tested

    • In planning: larger adoption in Paris-Saclay

  • Initiation to Python

    • In English

    • A separate course? Or a translation of the above?

  • Advanced Python for sciences

    • Plenty of material to be imported

Infrastructure

Institutional support and funding

  • Python working group of the CNRS professional networks “Calcul” and “DevLog”

  • Funding by CMA SaclAI-School

Design#

Engaging the student#

Desirable take home messages for beginners

  1. You can do it!

  2. It’s fun!

  3. It’s power!
    At your fingertips. In your own world.

  4. It’s science, not alchemy

You can do it! And it’s fun!#

Gamification

Can you program the ant out of the maze?

from laby.global_fr import *
Laby(niveau="2a")
avance()
avance()
avance()
avance()
avance()

Engaging, with (mostly) no prerequisites.

A good old effective idea

  • Mindstorms: Children, Computers, and Powerful Ideas, S. Papert, 1980

  • original version of Laby by Gimenez et al.

  • similar to, e.g., France IOI’s robots
    Could we share that widget?

It’s power!#

Do interesting stuff ASAP

  • The Python ecosystem rocks here!

  • Potential: image, sound, 3D geometry, you name it

Domain Context or not?

Solving problems in mathematics, biology, humanities, …

  • 💡Makes things concrete
    “Oh that’s what it means, in my world”

  • 👍Engages
    “Oh, that would be useful, in my world”

  • 🫨Adds cognitive load, distracts

  • Adds prerequisites (👎 Reuse)

Tentative resolution

  • Most of the material without domain context

  • Select material rooted in context
    With conclusion to abstract away

  • Mini projects rooted in context

It’s science#

Main learning objective

  • analyze programs and reason about them

  • predict, and control their behavior

Strategy

  • Introducing concepts

  • Introducing models (for the memory, …)
    As simple as possible, but no simpler; and iterate
    Example: at first, you don’t need to know how integers are stored in memory

  • Defining the syntax and semantics of constructs in these models

  • Learning to analyze step by step (syllabic method first; then global)

A key learning tool: the step-by-step debugger

  • simplify the JupyterLab interface

  • support in JupyterLite

Fostering reusability and reuse#

Producing and reusing open content

Modularity (👍 Reusable)#

A collection of learning activities

Learning activity (aka learning nugget):

  • A narrative

  • Possibly with interactivity, self assessment, …

  • With explicit prerequisites and learning objectives (ongoing)

Examples: mini-course, exercise, mini-project, …

From which courses can be composed

  • Write a narrative referencing the chosen activities

  • Or just steal the activities you like
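
To make the composition idea concrete, here is a minimal sketch of how explicit prerequisites and learning objectives could be used to order a selection of activities. The catalogue, the activity names, and the `order_activities` helper are all hypothetical illustrations, not part of the py-edu-fr tooling; the sketch assumes each activity lists the concepts it requires and the concepts it teaches.

```python
from graphlib import TopologicalSorter

# Hypothetical catalogue: names, prerequisites and objectives are illustrative only.
ACTIVITIES = {
    "variables": {"prerequisites": [], "objectives": ["variable"]},
    "for-loops": {"prerequisites": ["variable"], "objectives": ["for loop"]},
    "functions": {"prerequisites": ["variable"], "objectives": ["function"]},
    "exponential-lab": {
        "prerequisites": ["for loop", "function"],
        "objectives": ["series"],
    },
}


def order_activities(chosen, catalogue):
    """Order the chosen activities so that prerequisites come first.

    Returns the ordered activity names, plus the sorted list of prerequisite
    concepts that no chosen activity teaches.
    """
    # Map each concept to the chosen activities that teach it.
    providers = {}
    for name in chosen:
        for concept in catalogue[name]["objectives"]:
            providers.setdefault(concept, []).append(name)

    # An activity depends on every chosen activity teaching one of its prerequisites.
    graph = {}
    missing = set()
    for name in chosen:
        deps = set()
        for concept in catalogue[name]["prerequisites"]:
            if concept in providers:
                deps.update(providers[concept])
            else:
                missing.add(concept)
        graph[name] = deps

    return list(TopologicalSorter(graph).static_order()), sorted(missing)


order, missing = order_activities(
    ["exponential-lab", "functions", "for-loops"], ACTIVITIES
)
print(order)    # the lab comes after the two activities it depends on
print(missing)  # concepts the instructor must cover elsewhere
```

Reporting the uncovered prerequisites, rather than failing, fits the "steal the activities you like" use case: the instructor decides how to fill the gap.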

Adaptive learning?

  1. Empower the learner: own pace, own helpers, …

  2. Offer a personalized experience to the learner

Challenges

  • Granularity?

  • Decontextualize the content

  • Where to host transitions?

Format#

Learning unit = Markdown (+ MyST) file with metadata

  • Simple and standard (👍 interoperable, reusable, sustainable)

  • Can include learning metadata (👍 findable, adaptive)
    Prerequisites, learning objectives, difficulty

  • Can include solutions, instructor notes, … (👍 adaptive)

  • Can be interactive (👍 engaging)
    Markdown based Jupyter notebooks

  • Can include self assessment (👍 adaptive, engaging)
    nbgrader, jupylates, …

  • Can be randomized (👍 adaptive)

  • Easy to version control (👍 accessible)

  • Easy to export: pdf, web, … (👍 accessible)
    Jupyter-Book, MySTmd, Quarto, …

  • Easy to transform (👍 reuse)
    grammar check, automated formatting, solution stripping, …
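
As an illustration of why a plain-text format is easy to transform, here is a minimal sketch of solution stripping. It relies on the `### BEGIN SOLUTION` / `### END SOLUTION` markers that appear in the worksheet excerpt in this document. The `strip_solutions` helper and the placeholder lines are assumptions made for the example, not the project's actual tooling.

```python
def strip_solutions(
    source,
    placeholder=("# YOUR CODE HERE", "raise NotImplementedError()"),
):
    """Replace each BEGIN/END SOLUTION region with placeholder lines."""
    output = []
    inside = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == "### BEGIN SOLUTION":
            inside = True
            # Reuse the marker's indentation so the result stays valid Python.
            indent = line[: len(line) - len(line.lstrip())]
            output.extend(indent + text for text in placeholder)
        elif stripped == "### END SOLUTION":
            inside = False
        elif not inside:
            output.append(line)
    return "\n".join(output)


worksheet = """def factorielle(n):
    ### BEGIN SOLUTION
    r = 1.0
    for i in range(1, n+1):
        r *= i
    return r
    ### END SOLUTION
"""

print(strip_solutions(worksheet))
```

The same line-by-line approach would work on a whole Markdown worksheet, since lines outside the marked regions pass through unchanged.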

Authoring conventions

---
jupytext:
  ...
learning:
  objectives:
    apply: [fonction]
  prerequisites:
    apply: [boucle for]
---

# Lab: implementing the exponential function (1/5)

**Imagine that you are developing ...**

To do so, we use the definition of $e^x$ as a *series* (infinite sum):

$$e^x = \sum_{n=0}^{+\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} +\cdots+\frac{x^n}{n!}+\cdots$$

...

```{code-cell} ipython3
:tags: [answer]
def factorielle(n):
    ### BEGIN SOLUTION
    r = 1.0
    for i in range(1, n+1):
        r *= i    # Reminder: this is equivalent to r = r * i
    return r
    ### END SOLUTION
```
Working on the previous worksheet in Jupyter, with help from AI
A typical work environment, with Jupyter, Travo, Laby and Jupylates
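
The series in the worksheet excerpt above can be checked numerically. The sketch below reuses the excerpt's `factorielle` function and adds a hypothetical `exp_partial_sum` helper that sums the first terms of the series. It shows where the exercise is heading and is not necessarily how the remaining parts of the worksheet proceed.

```python
import math


def factorielle(n):
    """Factorial of n as a float, as in the worksheet excerpt."""
    r = 1.0
    for i in range(1, n + 1):
        r *= i
    return r


def exp_partial_sum(x, terms):
    """Sum the terms x**n / n! of the series for n = 0 .. terms - 1."""
    return sum(x**n / factorielle(n) for n in range(terms))


# The approximation error shrinks quickly as more terms are added.
for terms in (2, 5, 10, 20):
    approx = exp_partial_sum(1.0, terms)
    print(terms, approx, abs(approx - math.e))
```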

Desirable tooling improvements

  • standardization of markdown-based format for Jupyter

  • support for macros in JupyterLab-MyST

  • easy export to slides on the web

Adaptive learning (👍 autonomy, engaging)#

Tooling (work in progress)

  • Learning records: track the student activity

  • Learner model: estimate the student abilities

    • from learning records

    • from learning metadata

  • Traffic lights: ready to engage in that activity?

  • Student dashboard: display progress, recommend activities
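
The pipeline above, from learning records to a traffic light, could be sketched as follows. Everything here is an assumption made for illustration: the record format (concept, success), the success-rate learner model, the thresholds, and the function names. The actual tooling is described only as work in progress.

```python
def estimate_mastery(records):
    """Toy learner model: mastery of a concept = success rate over its attempts."""
    attempts, successes = {}, {}
    for concept, success in records:
        attempts[concept] = attempts.get(concept, 0) + 1
        successes[concept] = successes.get(concept, 0) + (1 if success else 0)
    return {concept: successes[concept] / attempts[concept] for concept in attempts}


def traffic_light(prerequisites, mastery, ready=0.8, risky=0.5):
    """Return 'green', 'orange' or 'red' based on the weakest prerequisite."""
    if not prerequisites:
        return "green"
    # Concepts never seen in the records count as not mastered at all.
    weakest = min(mastery.get(concept, 0.0) for concept in prerequisites)
    if weakest >= ready:
        return "green"
    if weakest >= risky:
        return "orange"
    return "red"


# Hypothetical learning records: (concept, was the attempt successful?)
records = [
    ("for loop", True),
    ("for loop", True),
    ("function", True),
    ("function", False),
]
mastery = estimate_mastery(records)
print(traffic_light(["for loop"], mastery))
print(traffic_light(["for loop", "function"], mastery))
print(traffic_light(["series"], mastery))
```

Taking the weakest prerequisite rather than an average is a deliberate choice in this sketch: a single missing concept is enough to make an activity hard to start.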

Multilingual?#

Aim: introductory courses in French and English

  • A maintenance nightmare?

Use Machine Translation assistance (work in progress) (👍 reuse)

Use Translate dir (beta): DobbiKov/translate-dir-cli

By Yehor Kotorenko (and T.)

  • Incremental translation

  • Preserves syntax and structure

  • Preserves terminology

  • Preserves post-edits and style

  • Uses your favorite LLM

  • Integrates in your favorite git workflow
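
To illustrate what incremental translation that preserves post-edits can mean, here is a minimal sketch built on a cache keyed by a hash of each source paragraph. It is not a description of how translate-dir-cli works. The `translate_incrementally` function, the cache layout, and the stand-in `fake_translate` function (used in place of an LLM call) are all hypothetical.

```python
import hashlib


def translate_incrementally(paragraphs, cache, translate):
    """Translate paragraphs, calling `translate` only for text not seen before.

    `cache` maps a hash of the source paragraph to its translation, so
    unchanged paragraphs keep their earlier (possibly hand-edited) translation.
    """
    result = []
    for text in paragraphs:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in cache:
            cache[key] = translate(text)
        result.append(cache[key])
    return result


calls = []


def fake_translate(text):
    """Stand-in for a real translation backend; records what it was asked."""
    calls.append(text)
    return f"[en] {text}"


cache = {}
translate_incrementally(["Bonjour.", "Une boucle for."], cache, fake_translate)
# Only the second paragraph changed, so only that one is translated again.
updated = translate_incrementally(
    ["Bonjour.", "Une boucle while."], cache, fake_translate
)
print(updated)
print(len(calls))
```

Because the cache stores the final text, a translator's manual correction to a cached entry survives later runs as long as the source paragraph is unchanged.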

Ease deployment#

The learner can work on the courses:

  • Online, with JupyterLite

  • Online, with your favorite virtual environment (JupyterHub, mydocker, …)

  • Locally, on a laptop, in a computer lab, …

Discussion#

  • Notebooks?

  • Feedback from users

Thank you for your attention!#

Upcoming jobs at Paris-Saclay: project ATLAS - AI for Teaching and Learning (AI) at Scale

  • Post-doc to conduct research in education and human-centric design and computing

  • Research Software Engineer: JavaScript, Jupyter, …