Summer Internship Projects 2026

Below are the available projects for our Summer Internship Program. Applicants indicate their preferred projects – up to three – in the application form.

Digital Sovereignty in the Public Sector

Research topic

To analyze and map digital supply chain dependencies in Austria’s public sector to inform policy strategies for strengthening digital sovereignty.

Quantitative analysis of public data, such as procurement records and DNS-based mapping of web and email service infrastructure used by government entities.
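As a rough illustration of the DNS-based mapping step, the sketch below groups domains by the provider that hosts their email, given MX host names that have already been looked up. The provider suffix table, the function names, and the example domains are hypothetical placeholders and not project data.

```python
from collections import Counter

# Hypothetical suffix-to-provider table; a real study would curate this list.
PROVIDER_SUFFIXES = {
    "mail.protection.outlook.com": "Microsoft",
    "google.com": "Google",
    "googlemail.com": "Google",
}


def classify_mx(mx_host: str) -> str:
    """Map an MX host name to a provider label, or 'other' if unknown."""
    host = mx_host.rstrip(".").lower()
    for suffix, provider in PROVIDER_SUFFIXES.items():
        if host == suffix or host.endswith("." + suffix):
            return provider
    return "other"


def provider_shares(mx_records: dict[str, list[str]]) -> dict[str, float]:
    """Share of domains whose first listed MX host belongs to each provider."""
    counts = Counter(
        classify_mx(hosts[0]) if hosts else "none"
        for hosts in mx_records.values()
    )
    total = sum(counts.values())
    return {provider: n / total for provider, n in counts.items()}


# Made-up input: domain -> MX hosts, assumed to be ordered by preference.
example = {
    "agency-a.example": ["agency-a-example.mail.protection.outlook.com."],
    "agency-b.example": ["aspmx.l.google.com."],
    "agency-c.example": ["mx1.agency-c.example."],
    "agency-d.example": ["agency-d-example.mail.protection.outlook.com."],
}
shares = provider_shares(example)
```

Matching on host-name suffixes keeps the classification transparent and easy to audit, at the cost of missing providers that are not in the table.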

  • computer science / informatics
  • physics
  • Python
  • R

Large Language Models in Digital Forensics

Research topic

Bernhard Haslhofer

To explore how Large Language Models can enhance or automate key tasks in digital forensic investigations, such as evidence analysis, reporting, and decision support.

Hands-on experimentation using LLM frameworks to prototype and evaluate forensic applications on synthetic or anonymized case data; literature review.

  • computer science / informatics
  • physics
  • mathematics
  • statistics
  • Python

Cancer Diagnoses and Mental Disorders

Research topic

Cancer patients with mental disorders may face systematic disadvantages in both diagnosis and treatment. We hypothesize that these patients experience delays in cancer diagnoses and encounter greater challenges compared to those without mental disorders, as well as those who develop mental disorders only after their cancer diagnosis. To test this hypothesis, we will analyze two nationwide hospital claims datasets from Austria. The first dataset covers 13 million hospital stays of four million patients between 2015 and 2019, with diagnoses coded at the four-digit ICD-10 level. The second dataset spans from 1997 to 2014 and includes 45 million hospital stays of 8.9 million patients, recorded at the three-digit ICD-10 level. Both datasets provide detailed demographic and clinical information, including sex, age, admission and release dates, hospital departments, and both primary and secondary diagnoses. We will investigate diagnostic delays and transitions across hospital departments to identify patterns in patient pathways. We expect to observe delayed treatments and less targeted therapeutic choices within these longitudinal trajectories. Our findings will provide a quantitative basis for understanding how mental disorders influence cancer care and will inform strategies to reduce disparities in timely diagnosis and treatment.

Analyze patients’ longitudinal hospital pathways by examining transitions between hospital departments across the three patient groups.

1) First, we will examine how patients move between hospital departments in three groups: (i) cancer patients with a prior mental disorder, (ii) cancer patients without mental disorders, and (iii) cancer patients who develop a mental disorder after their cancer diagnosis.

2) These groups will be matched by age and sex to ensure comparability. For each group, we will compare the length of hospital pathways, the number of unique departments visited, and the number of transitions between departments.

3) Based on the results of this first step, we will further explore similarities between trajectories, based on the assumption that departments sharing more patients are closer in the hospital care space.
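The pathway measures named in these steps can be sketched as follows. The example assumes each patient's pathway is a chronologically ordered list of department labels, and it uses Jaccard overlap of patient sets as one possible reading of "departments sharing more patients are closer"; the labels, function names, and data are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def pathway_metrics(departments: list[str]) -> dict[str, int]:
    """Summarize one patient's chronologically ordered department sequence."""
    transitions = sum(
        1 for prev, curr in zip(departments, departments[1:]) if prev != curr
    )
    return {
        "length": len(departments),
        "unique_departments": len(set(departments)),
        "transitions": transitions,
    }


def department_similarity(
    pathways: dict[str, list[str]],
) -> dict[tuple[str, str], float]:
    """Jaccard overlap of patient sets for every pair of departments."""
    patients_by_dept = defaultdict(set)
    for patient, departments in pathways.items():
        for dept in departments:
            patients_by_dept[dept].add(patient)
    similarity = {}
    for a, b in combinations(sorted(patients_by_dept), 2):
        shared = patients_by_dept[a] & patients_by_dept[b]
        union = patients_by_dept[a] | patients_by_dept[b]
        similarity[(a, b)] = len(shared) / len(union)
    return similarity


# Made-up pathways: patient id -> ordered list of departments per stay.
pathways = {
    "p1": ["internal", "oncology", "oncology", "surgery"],
    "p2": ["psychiatry", "internal", "oncology"],
    "p3": ["internal", "surgery"],
}
metrics = {pid: pathway_metrics(seq) for pid, seq in pathways.items()}
similarity = department_similarity(pathways)
```

Consecutive stays in the same department are counted toward pathway length but not as transitions; whether that is the right convention depends on how stays are recorded in the claims data.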

Statistics, network science, clustering algorithms, dynamic time warping and similar methods.

  • data science
  • statistics
  • applied math
  • physics
  • Python
  • R

The Science of Science at CSH

Research topic

Data Visualization • Liuhuaying Yang

This project explores how the Complexity Science Hub produces, connects, and spreads knowledge. You will work closely with the CSH Visual Team to design interactive, artistic, and data-driven visualizations (not limited to websites) that reveal patterns in collaboration, knowledge flows, innovation, and inequality within CSH. Exceptional outputs may be showcased in outreach events or exhibitions.

  • Turn CSH research data into engaging experiences that uncover how our community produces and spreads knowledge.
  • Visualize scientists and their communities through data-driven portraits and dynamic visualizations.
  • Experiment with different media formats for data communication, including print, animation, projection, sonification, or hybrid approaches.

Not limited to websites — the focus is on creative and engaging ways to make CSH science visible. Open to physical installations, animations, videos, posters, zines, interactive touchscreens, or hybrid formats.

  • design
  • art
  • communication
  • data science
  • Basic data analysis skills, for example, using Excel/spreadsheets, Python, or R.

We encourage candidates to share the range of their ideas or motivations for this project. This flexibility allows you to express your interests and aspirations within the scope of our work. However, it’s perfectly fine if you don’t have any particular concepts in mind at this stage; we welcome all applicants regardless of their current thoughts, and we can shape the directions together.

How We Perceive Polarization

Research topic

Perceived Polarization with Identity-Driven Perception Distortions. We previously developed a framework about how people perceive polarization. The framework describes how people perceive opinion bundles through a subjective and dynamic lens. These lenses are distorted – in quite different ways – because a person’s lens is tied to the (changing) variance of opinions in their political in-group. If the in-group is very homogeneous about a political question, the person might perceive deviations from this opinion as more pronounced than someone whose political in-group holds heterogeneous opinions. As such, some groups can perceive much higher levels of opinion polarization, even if we do not see opinion divergence in survey data, and vice versa.
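A toy numerical sketch may help make the intuition concrete. It assumes, purely for illustration, that a person perceives the distance to another opinion in units of their in-group's opinion spread (standard deviation). This scaling rule, the function name `perceived_distance`, and all numbers are hypothetical and are not the framework's actual definition.

```python
from statistics import mean, pstdev


def perceived_distance(opinion: float, in_group: list[float]) -> float:
    """Distance from the in-group mean, in units of the in-group spread.

    Illustrative assumption only: the perceptual lens is modelled as a
    rescaling by the in-group standard deviation.
    """
    spread = pstdev(in_group)
    if spread == 0:
        raise ValueError("in-group opinions must not all be identical")
    return abs(opinion - mean(in_group)) / spread


# Two made-up in-groups with the same mean opinion but different spread.
homogeneous_group = [4.9, 5.0, 5.1, 5.0]
heterogeneous_group = [3.0, 5.0, 7.0, 5.0]
outside_opinion = 7.0

seen_by_homogeneous = perceived_distance(outside_opinion, homogeneous_group)
seen_by_heterogeneous = perceived_distance(outside_opinion, heterogeneous_group)
```

Under this assumed rule, the same outside opinion looks far more distant to a member of the homogeneous group than to a member of the heterogeneous one, even though both groups share the same mean.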

We hope to find a student who is interested in exploring the phenomenon of perceived polarization:

  • apply the framework to new datasets, other countries, and more political issues (e.g., immigration, COVID-19, social inequality)
  • test how different definitions of identity groups affect the results
  • extend the framework in any direction of interest, e.g. implementing it on a network rather than group descriptions and considering out-group distinctiveness

Processing of survey data, data analysis, mathematical modelling.

Any; some experience with coding/data analysis is required.

  • Python is preferred, but other languages can be considered

Tracing the Origins of Political Contributions

Research topic

Foundations of Complex Systems [Principles of Emergent Things] • Eddie Lee

The US presidential election in 2024 attracted billions of dollars from individuals and companies. Other federal and state elections also attract many millions of dollars in contributions. Where does this money come from? How are donations related to one another, and how do donors decide when to donate?

  • We will analyze the structure of donation networks that are recovered from publicly available data from the Federal Election Commission.
  • We will explore the patterns behind donations in a few key historical elections to map the networks of donors and the strategic dynamics behind donations.
  • We will use such information to develop mathematical models of collective decisions.
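One simple way to picture a donation network is as a bipartite graph of donors and recipient committees, projected onto donors. The sketch below assumes a simplified record layout of (donor, committee, amount); this layout, the identifiers, and the amounts are hypothetical and do not reflect the actual Federal Election Commission file format.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: (donor_id, committee_id, amount_usd).
donations = [
    ("donor_1", "committee_A", 500.0),
    ("donor_1", "committee_B", 250.0),
    ("donor_2", "committee_A", 1000.0),
    ("donor_3", "committee_B", 100.0),
    ("donor_3", "committee_A", 50.0),
]


def committees_by_donor(records):
    """Bipartite view: donor -> set of committees they gave to."""
    bipartite = defaultdict(set)
    for donor, committee, _amount in records:
        bipartite[donor].add(committee)
    return bipartite


def co_donation_weights(records):
    """Project onto donors: weight = number of committees two donors share."""
    bipartite = committees_by_donor(records)
    weights = {}
    for a, b in combinations(sorted(bipartite), 2):
        shared = len(bipartite[a] & bipartite[b])
        if shared:
            weights[(a, b)] = shared
    return weights


weights = co_donation_weights(donations)
```

The all-pairs projection is quadratic in the number of donors, so for data at the scale of real election filings it would more plausibly be computed inside a database or with sparse matrix products.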

Scraping large data sets, network science, statistical physics.

  • applied math / stats
  • data science
  • physics
  • social science / computational social science
  • Python
  • PostgreSQL/DuckDB
  • GitHub
  • bash

The intern will ideally be familiar with large datasets (including PostgreSQL and Python), network science and statistical physics (or have a strong mathematical background), and should be interested in taking the initiative to drive a project forward.

Counterfactual Time Series for Graph Dynamics

Research topic

Counterfactual modelling is well established for classical time series and causal inference, but for dynamic graphs it remains an unsolved challenge. Existing approaches either work by local perturbation of observed networks or rely on generative models that struggle to capture both local dependencies and global structural consistency. As a result, there is no general framework for constructing plausible counterfactual graph trajectories, especially in the case of discrete time dynamic graphs. This project aims to address this gap by developing methods to generate counterfactual time series of dynamic graphs, enabling the systematic study of how alternative scenarios affect both local interactions and global network structures.

  • Implement our generalised degrees method for dynamic graph generation to construct counterfactual trajectories.
  • Devise quantitative measures to capture both local (node- or edge-level) and global (structural) changes in graph evolution.
  • Apply the framework to real datasets (e.g., trade networks, transaction networks) to demonstrate its effectiveness.
  • Our own recently developed method for learning a representation of node dynamics for discrete-time graph generation.
  • Design of measures for quantifying structural changes (degree distributions, clustering, modularity, centrality dynamics).
  • Application to available datasets of dynamic networks.
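To give a flavour of what local and global change measures could look like, the sketch below compares two snapshots of an undirected graph: edge-set Jaccard overlap as a global measure and per-node degree differences as a local one. These two measures and all names are illustrative choices, not the measures the project will necessarily adopt.

```python
from collections import Counter


def edge_set(pairs):
    """Turn (u, v) pairs into a set of undirected edges."""
    return {frozenset(pair) for pair in pairs}


def edge_jaccard(edges_a, edges_b) -> float:
    """Global change measure: overlap of two snapshots' edge sets."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 1.0


def degree_changes(edges_a, edges_b) -> dict:
    """Local change measure: per-node degree difference (b minus a)."""
    deg_a = Counter(node for edge in edges_a for node in edge)
    deg_b = Counter(node for edge in edges_b for node in edge)
    nodes = sorted(deg_a.keys() | deg_b.keys())
    return {node: deg_b[node] - deg_a[node] for node in nodes}


# Made-up snapshots at the same time step: observed vs. counterfactual.
observed = edge_set([("a", "b"), ("b", "c"), ("c", "d")])
counterfactual = edge_set([("a", "b"), ("b", "d"), ("c", "d")])

overlap = edge_jaccard(observed, counterfactual)
changes = degree_changes(observed, counterfactual)
```

Applied step by step along an observed and a counterfactual trajectory, such measures yield time series of divergence that can themselves be analyzed.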

Strong statistics and computational background, preferably in ML or AI, or any field with a good grounding in statistical inference.

  • Python

Building Blocks of Production Networks

Research topic

Production networks exhibit a surprising variety of observed production functions when measured from firm-level network data. However, this heterogeneity may not always be fundamental: some observed functions may in fact be nested compositions of simpler processes, or the result of incorrect or incomplete product information. For example, a large firm may either employ a genuinely different production process, or simply operate as a multi-product firm where the observed function is an averaging of its distinct underlying processes. This project aims to develop a statistical framework to identify which observed production functions are irreducible (base functions) and which can be decomposed into nested ones. The ultimate goal is to construct a dictionary of base reactions that serve as the building blocks of production networks.

  • Develop a statistical method to test whether observed production functions are reducible into nested compositions.
  • Identify irreducible (base) production functions and assemble them into a dictionary of elementary reactions.
  • Demonstrate the method on empirical production network data, providing insight into the true sources of heterogeneity.
  • Representation of production functions as bipartite input–output mappings in production networks.
  • Use of random network models for statistical validation and benchmarking.
  • Data and coding skills for applications to empirical datasets.

Quantitative scientific field (physics, maths, engineering, computer science, etc.).

  • Python

Synthetic Graph Generation

Research topic

An important challenge for complex networks is the generation of synthetic dynamic graphs that capture the statistical properties of real systems while remaining fully artificial. Synthetic data are crucial for benchmarking, controlled experimentation, and privacy-preserving sharing. While many graph models are well established, they tend to rest on strong assumptions about sufficient statistics and are not learned from the data. Furthermore, extending them to dynamic settings with realistic temporal dependencies remains difficult. This project will develop methods for generating synthetic dynamic graphs using our generalised degrees framework, with two complementary directions: (i) producing synthetic trajectories that are statistically equivalent to observed networks but not initialised from real data, and (ii) extending the method to generate ensembles with strict structural constraints, such as dynamic versions of the configuration model that reproduce degree sequences exactly.

  • Develop a scalable framework for synthetic dynamic graph generation.
  • Explore the two directions: (i) generating synthetic trajectories that are statistically equivalent to real networks, and (ii) improving constraint satisfaction in link matching.
  • Benchmark the realism of generated graphs against real-world datasets.
  • Node-dynamics–based generative modelling of discrete-time graphs.
  • Statistical validation using structural measures (degree distributions, clustering, modularity, centrality) and temporal measures (persistence, turnover, stability).
  • Comparative benchmarking against standard static and dynamic generative models.
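For readers unfamiliar with degree-preserving ensembles, the sketch below shows the standard double-edge swap on a single static snapshot, a common way to sample graphs with an exactly fixed degree sequence. It is a baseline illustration only and is not the generalised degrees framework mentioned above; the function names and the example graph are made up.

```python
import random
from collections import Counter


def degree_sequence(edges):
    """Degree of every node in an undirected edge list."""
    return Counter(node for edge in edges for node in edge)


def double_edge_swap(edges, n_swaps, seed=0, max_tries=10_000):
    """Randomize a simple undirected graph while keeping all degrees fixed.

    Repeatedly picks edges (a, b) and (c, d) and rewires them to (a, d) and
    (c, b), rejecting swaps that would create self-loops or duplicate edges.
    """
    rng = random.Random(seed)
    edge_list = [tuple(edge) for edge in edges]
    present = {frozenset(edge) for edge in edge_list}
    done = tries = 0
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if len({a, b, c, d}) < 4:
            continue  # shared endpoint: rewiring would be a self-loop or no-op
        new_1, new_2 = frozenset((a, d)), frozenset((c, b))
        if new_1 in present or new_2 in present:
            continue  # rewiring would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new_1, new_2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


# Made-up snapshot: a six-node ring in which every node has degree 2.
snapshot = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
randomized = double_edge_swap(snapshot, n_swaps=10)
```

Each accepted swap leaves every node's degree unchanged, so the degree sequence of `randomized` matches that of `snapshot` exactly; a dynamic version would additionally need to control how edges persist between time steps.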

Good statistics and inference background.

  • Python

Science Communication: Design a Workshop for Kids

Research topic

Several research topics to choose from among CSH research themes. The specific selection will be made at the beginning of the internship.

Anja Böck + the respective research group leader(s)

At CSH, we strive to make cutting-edge research accessible to a broad audience – including, and especially, children and school students. Inspiring young people at an early age not only nurtures curiosity and critical thinking, but also counters science skepticism and can spark interest in pursuing a scientific career. At the same time, researchers themselves gain valuable skills by learning how to communicate complex ideas in clear and engaging ways.

With this in mind, the CSH has launched CSH Goes School, a program that provides free, ready-to-use classroom materials based on CSH research and visualizations, as well as interactive workshops for students. This internship will contribute directly to this initiative.

The intern(s) will design a workshop on a CSH research topic for students aged 10–14. The workshop should be suitable across these school levels and accompanied by teaching materials similar to those already offered by CSH. It should foster network thinking, illustrate how interdependencies arise, and demonstrate the broader impact of science on society.

The workshop will be tested internally – and possibly with children in a vacation program. Interns will gain experience in science communication, educational design, and translating complex concepts into accessible materials. 

The intern will read and synthesize relevant publications, translate complex research into simpler, engaging narratives, and connect scientific concepts to students’ everyday experiences. Creative approaches and thinking beyond disciplinary boundaries are encouraged.

Any. Curiosity, creativity, and an interest in science communication or education are most important, along with quantitative skills and an interest in complexity science.

German language skills are an advantage.
