Summer Internship Projects 2026
Below are the available projects for our Summer Internship Program. Applicants indicate their preferred projects – up to three – in the application form.
Digital Sovereignty in the Public Sector
Research topic
Supervisor(s)
Project description
Goal for the internship
To analyze and map digital supply chain dependencies in Austria’s public sector to inform policy strategies for strengthening digital sovereignty.
Methods to be used
Quantitative analysis of public data, such as procurement records, combined with DNS-based mapping of the web and email service infrastructure used by government entities.
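As a rough illustration of the DNS-based mapping step, the Python sketch below labels a domain's email provider from its MX hostnames and computes provider shares across domains. The suffix table, function names, and example hostnames are illustrative assumptions, not project code; in a real analysis the MX records would first be retrieved with a DNS resolver library.

```python
from collections import Counter

# Hypothetical suffix-to-provider table; a real study would curate a much longer list.
PROVIDER_SUFFIXES = {
    "mail.protection.outlook.com": "Microsoft",
    "google.com": "Google",
    "googlemail.com": "Google",
}


def classify_mx(mx_host: str) -> str:
    """Map one MX hostname to a provider label, or 'other/self-hosted'."""
    host = mx_host.rstrip(".").lower()
    for suffix, provider in PROVIDER_SUFFIXES.items():
        if host == suffix or host.endswith("." + suffix):
            return provider
    return "other/self-hosted"


def provider_shares(mx_by_domain: dict[str, list[str]]) -> dict[str, float]:
    """Share of domains whose first listed MX host belongs to each provider."""
    counts = Counter(
        classify_mx(hosts[0]) for hosts in mx_by_domain.values() if hosts
    )
    total = sum(counts.values())
    return {provider: n / total for provider, n in counts.items()}


# Made-up input: domain -> MX hostnames, most preferred first.
example = {
    "agency-a.example": ["agency-a-example.mail.protection.outlook.com."],
    "agency-b.example": ["aspmx.l.google.com."],
    "agency-c.example": ["mx1.agency-c.example."],
    "agency-d.example": ["agency-d-example.mail.protection.outlook.com."],
}
shares = provider_shares(example)
```

Matching on hostname suffixes keeps the sketch simple; it would miss providers hidden behind custom hostnames, which is one reason a real mapping would likely combine several DNS record types.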
Preferred academic background
- computer science / informatics
- physics
Preferred coding languages, packages, etc.
- Python
- R
Large Language Models in Digital Forensics
Research topic
Supervisor(s)
Bernhard Haslhofer
Project description
Goal for the internship
To explore how Large Language Models can enhance or automate key tasks in digital forensic investigations, such as evidence analysis, reporting, and decision support.
Methods to be used
Hands-on experimentation using LLM frameworks to prototype and evaluate forensic applications on synthetic or anonymized case data; literature review.
Preferred academic background
- computer science / informatics
- physics
- mathematics
- statistics
Preferred coding languages, packages, etc.
- Python
Cancer Diagnoses and Mental Disorders
Research topic
Supervisor(s)
Project description
Cancer patients with mental disorders may face systematic disadvantages in both diagnosis and treatment. We hypothesize that these patients experience delays in cancer diagnoses and encounter greater challenges compared to those without mental disorders, as well as those who develop mental disorders only after their cancer diagnosis. To test this hypothesis, we will analyze two nationwide hospital claims datasets from Austria. The first dataset covers 13 million hospital stays of four million patients between 2015 and 2019, with diagnoses coded at the four-digit ICD-10 level. The second dataset spans from 1997 to 2014 and includes 45 million hospital stays of 8.9 million patients, recorded at the three-digit ICD-10 level. Both datasets provide detailed demographic and clinical information, including sex, age, admission and discharge dates, hospital departments, and both primary and secondary diagnoses. We will investigate diagnostic delays and transitions across hospital departments to identify patterns in patient pathways. We expect to observe delayed treatments and less targeted therapeutic choices within these longitudinal trajectories. Our findings will provide a quantitative basis for understanding how mental disorders influence cancer care and will inform strategies to reduce disparities in timely diagnosis and treatment.
Goal for the internship
Analyze patients’ longitudinal hospital pathways by examining transitions between hospital departments across the three patient groups.
1) First, we will examine how patients move between hospital departments in three groups: (i) cancer patients with a prior mental disorder, (ii) cancer patients without mental disorders, and (iii) cancer patients who develop a mental disorder after their cancer diagnosis.
2) These groups will be matched by age and sex to ensure comparability. For each group, we will compare the length of hospital pathways, the number of unique departments visited, and the number of transitions between departments.
3) Based on the results of this first step, we will further explore similarities between trajectories, based on the assumption that departments sharing more patients are closer in the hospital care space.
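The pathway measures named in the steps above could be computed along the following lines. This is a minimal Python sketch under stated assumptions: a pathway is a list of (admission date, department) pairs, "length" is taken to be the number of stays, and the department similarity is a simple Jaccard overlap of patient sets. The function names and the example data are hypothetical.

```python
from datetime import date


def pathway_metrics(stays: list[tuple[date, str]]) -> dict[str, int]:
    """Summarize one patient's pathway from (admission_date, department) pairs."""
    # Order the stays chronologically before reading off the department sequence.
    departments = [dept for _, dept in sorted(stays)]
    # A transition is counted whenever consecutive stays are in different departments.
    transitions = sum(1 for a, b in zip(departments, departments[1:]) if a != b)
    return {
        "length": len(departments),
        "unique_departments": len(set(departments)),
        "transitions": transitions,
    }


def shared_patient_similarity(patients_a: set[str], patients_b: set[str]) -> float:
    """Jaccard overlap of two departments' patient sets (0 = disjoint, 1 = identical)."""
    union = patients_a | patients_b
    return len(patients_a & patients_b) / len(union) if union else 0.0


# Made-up pathway of a single patient.
stays = [
    (date(2016, 3, 1), "psychiatry"),
    (date(2016, 1, 10), "internal medicine"),
    (date(2016, 5, 20), "oncology"),
    (date(2016, 6, 2), "oncology"),
]
metrics = pathway_metrics(stays)
similarity = shared_patient_similarity({"p1", "p2", "p3"}, {"p2", "p3", "p4"})
```

Comparing the distributions of these per-patient numbers across the three matched groups would be one straightforward way to start the analysis.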
Methods to be used
Statistics, network science, clustering algorithms, dynamic time warping and similar methods.
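For orientation, dynamic time warping compares two sequences of possibly different lengths by finding the cheapest monotone alignment between them. The sketch below is the textbook dynamic-programming version in Python with an absolute-difference cost; representing a pathway as a numeric sequence (for example, stays per month) is an assumption made only for illustration.

```python
def dtw_distance(a: list[float], b: list[float]) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i items of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the best of: advance in a only, advance in b only, advance in both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Two made-up sequences with the same shape but different pacing.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted_level = dtw_distance([0, 0], [1, 1])
```

Because the alignment may repeat elements, sequences that follow the same course at different speeds end up close to each other, which is why the method is a candidate for comparing patient trajectories.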
Preferred academic background
- data science
- statistics
- applied math
- physics
Preferred coding languages, packages, etc.
- Python
- R
The Science of Science at CSH
Research topic
Data Visualization • Liuhuaying Yang
Supervisor(s)
Project description
This project explores how the Complexity Science Hub produces, connects, and spreads knowledge. You will work closely with the CSH Visual Team to design interactive, artistic, and data-driven visualizations (not limited to websites) that reveal patterns in collaboration, knowledge flows, innovation, and inequality within CSH. Exceptional outputs may be showcased in outreach events or exhibitions.
Goal for the internship
- Turn CSH research data into engaging experiences that uncover how our community produces and spreads knowledge.
- Visualize scientists and their communities through data-driven portraits and dynamic visualizations.
- Experiment with different media formats for data communication, including print, animation, projection, sonification, or hybrid approaches.
Methods to be used
Not limited to websites — the focus is on creative and engaging ways to make CSH science visible. Open to physical installations, animations, videos, posters, zines, interactive touchscreens, or hybrid formats.
Preferred academic background
- design
- art
- communication
- data science
Preferred coding languages, packages, etc.
- Basic data analysis skills, for example, using Excel/spreadsheets, Python, or R.
Additional information
We encourage candidates to share the range of their ideas or motivations for this project. This flexibility allows you to express your interests and aspirations within the scope of our work. However, it’s perfectly fine if you don’t have any particular concepts in mind at this stage; we welcome all applicants regardless of their current thoughts, and we can shape the directions together.
How We Perceive Polarization
Research topic
Supervisor(s)
Project description
Perceived Polarization with Identity-Driven Perception Distortions. We previously developed a framework about how people perceive polarization. The framework describes how people perceive opinion bundles through a subjective and dynamic lens. These lenses are distorted – in quite different ways – because a person’s lens is tied to the (changing) variance of opinions in their political in-group. If the in-group is very homogeneous about a political question, the person might perceive deviations from this opinion as more pronounced than someone whose political in-group holds heterogeneous opinions. As such, some groups can perceive much higher levels of opinion polarization, even if we do not see opinion divergence in survey data, and vice versa.
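The core intuition can be illustrated with a deliberately simplified toy in Python: the same opinion gap looks larger when it is measured in units of a homogeneous in-group's opinion spread. The scaling rule, function name, and numbers below are assumptions made for illustration and are not the equations of the published framework.

```python
from statistics import pstdev


def perceived_distance(own: float, other: float, ingroup_opinions: list[float]) -> float:
    """Toy rule: opinion gap expressed in units of the in-group's opinion spread."""
    spread = pstdev(ingroup_opinions)
    if spread == 0:
        raise ValueError("in-group opinions must not all be identical")
    return abs(own - other) / spread


# Made-up opinions on a 0-10 scale for two in-groups with the same mean.
homogeneous_ingroup = [4.9, 5.0, 5.1, 5.0]
heterogeneous_ingroup = [3.0, 5.0, 7.0, 5.0]

# The same objective gap of one scale point, seen through the two lenses.
seen_by_homogeneous = perceived_distance(5.0, 6.0, homogeneous_ingroup)
seen_by_heterogeneous = perceived_distance(5.0, 6.0, heterogeneous_ingroup)
```

In this toy, the member of the homogeneous group perceives the one-point gap as far larger than the member of the heterogeneous group does, even though the underlying opinions are identical.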
Goal for the internship
We hope to find a student who is interested in exploring the phenomenon of perceived polarization:
- apply the framework to new datasets, other countries, and more political issues (e.g., immigration, COVID-19, social inequality)
- test how different definitions of identity groups affect the results
- extend the framework in any direction of interest, e.g., by implementing it on a network rather than on group descriptions, or by considering out-group distinctiveness
Methods to be used
Processing of survey data, data analysis, mathematical modelling.
Preferred academic background
Any; some experience with coding/data analysis is required.
Preferred coding languages, packages, etc.
- Python is preferred but others can be considered
Further reading
- Dalege, J., Galesic, M., & Olsson, H. (2025). Networks of beliefs: An integrative theory of individual- and social-level belief dynamics. Psychological Review, 132(2), 253–290.
- Galesic, M., Barkoczi, D., Berdahl, A., Biro, D., Carbone, G., Giannoccaro, I., Goldstone, R., Gonzalez, C., Kandler, A., Kao, A., Kendal, R., Kline, M., Lee, R., Massari, G. F., Mesoudi, A., Olsson, H., Pescetelli, N., Sloman, S., Smaldino, P. E., & Stein, D. L. (2023). Beyond collective intelligence: Collective adaptation. Journal of The Royal Society Interface, 20, 20220736.
- Steiglechner, P., Smaldino, P. E., & Merico, A. (2025). How opinion variation among in-groups can skew perceptions of ideological polarization. PNAS Nexus, 4(7), pgaf184.
Tracing the Origins of Political Contributions
Research topic
Foundations of Complex Systems [Principles of Emergent Things] • Eddie Lee
Supervisor(s)
Project description
The US presidential election in 2024 attracted billions of dollars from individuals and companies. Other federal and state elections also attract many millions of dollars in contributions. Where does this money come from? How are donations related to one another, and how do donors decide when to donate?
Goal for the internship
- We will analyze the structure of donation networks that are recovered from publicly available data from the Federal Election Commission.
- We will explore the patterns behind donations in a few key historical elections to map the networks of donors and the strategic dynamics behind donations.
- We will use such information to develop mathematical models of collective decisions.
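One possible starting point for the network analysis is sketched below in Python: contribution records are grouped by recipient committee, and donors are linked whenever they gave to the same committee. The record layout (donor, committee, amount), the function name, and the example rows are hypothetical and do not reflect the actual Federal Election Commission file format.

```python
from collections import defaultdict
from itertools import combinations


def donor_projection(
    records: list[tuple[str, str, float]],
) -> dict[tuple[str, str], int]:
    """Link donors by the number of committees they have both contributed to."""
    donors_by_committee: dict[str, set[str]] = defaultdict(set)
    for donor, committee, _amount in records:
        donors_by_committee[committee].add(donor)

    weights: dict[tuple[str, str], int] = defaultdict(int)
    for donors in donors_by_committee.values():
        # Sorting makes each undirected donor pair appear under one canonical key.
        for a, b in combinations(sorted(donors), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Made-up contribution records: (donor, committee, amount in USD).
records = [
    ("donor_1", "committee_A", 500.0),
    ("donor_2", "committee_A", 250.0),
    ("donor_1", "committee_B", 100.0),
    ("donor_2", "committee_B", 100.0),
    ("donor_3", "committee_B", 2000.0),
]
co_donations = donor_projection(records)
```

This projection ignores amounts and timing; both would matter for the strategic dynamics mentioned above and could be added as edge attributes.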
Methods to be used
Scraping large data sets, network science, statistical physics.
Preferred academic background
- applied math / stats
- data science
- physics
- social science / computational social science
Preferred coding languages, packages, etc.
- Python
- PostgreSQL/DuckDB
- GitHub
- Bash
Additional information
The intern will ideally be familiar with large datasets (including PostgreSQL and Python), network science and statistical physics (or have a strong mathematical background), and should be interested in taking the initiative to drive a project forward.
Counterfactual Time Series for Graph Dynamics
Research topic
Supervisor(s)
Project description
Counterfactual modelling is well established for classical time series and causal inference, but for dynamic graphs it remains an unsolved challenge. Existing approaches either work by local perturbation of observed networks or rely on generative models that struggle to capture both local dependencies and global structural consistency. As a result, there is no general framework for constructing plausible counterfactual graph trajectories, especially in the case of discrete time dynamic graphs. This project aims to address this gap by developing methods to generate counterfactual time series of dynamic graphs, enabling the systematic study of how alternative scenarios affect both local interactions and global network structures.
Goal for the internship
- Implement our generalised degrees method for dynamic graph generation to construct counterfactual trajectories.
- Devise quantitative measures to capture both local (node- or edge-level) and global (structural) changes in graph evolution.
- Apply the framework to real datasets (e.g., trade networks, transaction networks) to demonstrate its effectiveness.
Methods to be used
- Our own recently developed method for learning a representation of node dynamics for discrete-time graph generation.
- Design of measures for quantifying structural changes (degree distributions, clustering, modularity, centrality dynamics).
- Application to available datasets of dynamic networks.
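To make the structural measures concrete, the following Python sketch computes two of them, the degree histogram and the global clustering coefficient (transitivity), for individual snapshots of an undirected graph. It is a plain-Python illustration on made-up edge lists and is unrelated to the group's own method; the function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def adjacency(edges: list[tuple[int, int]]) -> dict[int, set[int]]:
    """Undirected adjacency sets; self-loops are ignored."""
    adj: dict[int, set[int]] = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree_histogram(edges: list[tuple[int, int]]) -> Counter:
    """Number of nodes observed at each degree."""
    return Counter(len(neighbors) for neighbors in adjacency(edges).values())


def transitivity(edges: list[tuple[int, int]]) -> float:
    """Fraction of connected triples that are closed into triangles."""
    adj = adjacency(edges)
    triples = closed = 0
    for neighbors in adj.values():
        # Every pair of neighbors forms a connected triple centered on this node.
        for a, b in combinations(neighbors, 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


# Two made-up snapshots: a path that closes into a triangle.
snapshot_t0 = [(1, 2), (2, 3)]
snapshot_t1 = [(1, 2), (2, 3), (1, 3)]
clustering_change = transitivity(snapshot_t1) - transitivity(snapshot_t0)
```

Tracking such quantities along an observed trajectory and along a counterfactual one gives a simple first comparison of global structure.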
Preferred academic background
Strong statistics and computational background, preferably in ML or AI, or any field with a good grounding in statistical inference.
Preferred coding languages, packages, etc.
- Python
Building Blocks of Production Networks
Research topic
Supervisor(s)
Project description
Production networks exhibit a surprising variety of observed production functions when measured from firm-level network data. However, this heterogeneity may not always be fundamental: some observed functions may in fact be nested compositions of simpler processes, or the result of incorrect or incomplete product information. For example, a large firm may either employ a genuinely different production process, or simply operate as a multi-product firm where the observed function is an averaging of its distinct underlying processes. This project aims to develop a statistical framework to identify which observed production functions are irreducible (base functions) and which can be decomposed into nested ones. The ultimate goal is to construct a dictionary of base reactions that serve as the building blocks of production networks.
Goal for the internship
- Develop a statistical method to test whether observed production functions are reducible into nested compositions.
- Identify irreducible (base) production functions and assemble them into a dictionary of elementary reactions.
- Demonstrate the method on empirical production network data, providing insight into the true sources of heterogeneity.
Methods to be used
- Representation of production functions as bipartite input–output mappings in production networks.
- Use of random network models for statistical validation and benchmarking.
- Data and coding skills for applications to empirical datasets.
Preferred academic background
Quantitative scientific field (physics, maths, engineering, computer science, etc.).
Preferred coding languages, packages, etc.
- Python
Synthetic Graph Generation
Research topic
Supervisor(s)
Project description
An important challenge for complex networks is the generation of synthetic dynamic graphs that capture the statistical properties of real systems while remaining fully artificial. Synthetic data are crucial for benchmarking, controlled experimentation, and privacy-preserving sharing. While many graph models are well established, they tend to rest on strong assumptions about sufficient statistics and are not learned from the data. Furthermore, extending them to dynamic settings with realistic temporal dependencies remains difficult. This project will develop methods for generating synthetic dynamic graphs using our generalised degrees framework, with two complementary directions: (i) producing synthetic trajectories that are statistically equivalent to observed networks but not initialised from real data, and (ii) extending the method to generate ensembles with strict structural constraints, such as dynamic versions of the configuration model that reproduce degree sequences exactly.
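For readers unfamiliar with the configuration model, the Python sketch below shows the standard static stub-matching construction, which reproduces a prescribed degree sequence exactly but may create self-loops and repeated edges. It is background only: the dynamic extension is the open question of the project, and the function names and target degrees here are made up.

```python
import random
from collections import Counter


def configuration_model(degrees: dict[str, int], seed: int = 0) -> list[tuple[str, str]]:
    """Static configuration model by stub matching (multigraph, self-loops allowed)."""
    if sum(degrees.values()) % 2:
        raise ValueError("the degree sum must be even")
    # Each node contributes one stub (half-edge) per unit of degree.
    stubs = [node for node, k in degrees.items() for _ in range(k)]
    random.Random(seed).shuffle(stubs)
    # Pair consecutive stubs in the shuffled list to form edges.
    return list(zip(stubs[::2], stubs[1::2]))


def realized_degrees(edges: list[tuple[str, str]]) -> dict[str, int]:
    """Degree of every node that appears in the edge list."""
    counts: Counter = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1  # a self-loop therefore adds 2 to its node
    return dict(counts)


# Made-up target degree sequence.
target = {"a": 3, "b": 2, "c": 2, "d": 1}
edges = configuration_model(target, seed=42)
```

Obtaining simple graphs would require rejecting or rewiring self-loops and repeated edges, which is one of the places where exact constraint satisfaction becomes harder.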
Goal for the internship
- Develop a scalable framework for synthetic dynamic graph generation.
- Explore the two options: (i) generating synthetic trajectories that are statistically equivalent to real networks, and (ii) improving constraint satisfaction in link matching.
- Benchmark the realism of generated graphs against real-world datasets.
Methods to be used
- Node-dynamics–based generative modelling of discrete-time graphs.
- Statistical validation using structural measures (degree distributions, clustering, modularity, centrality) and temporal measures (persistence, turnover, stability).
- Comparative benchmarking against standard static and dynamic generative models.
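The temporal measures can be illustrated with a small Python sketch on consecutive snapshots of an undirected graph. The definitions used here are assumptions chosen for simplicity: persistence is the fraction of edges at one step that survive to the next, and turnover is one minus the Jaccard overlap of the two edge sets. Other definitions are possible, and the function names are hypothetical.

```python
def edge_set(edges: list[tuple[int, int]]) -> set[frozenset[int]]:
    """Undirected edge set, so that (1, 2) and (2, 1) count as the same edge."""
    return {frozenset(edge) for edge in edges}


def persistence(previous: list[tuple[int, int]], current: list[tuple[int, int]]) -> float:
    """Fraction of the previous snapshot's edges still present in the current one."""
    prev_set, curr_set = edge_set(previous), edge_set(current)
    return len(prev_set & curr_set) / len(prev_set) if prev_set else 0.0


def turnover(previous: list[tuple[int, int]], current: list[tuple[int, int]]) -> float:
    """One minus the Jaccard overlap of the two edge sets."""
    prev_set, curr_set = edge_set(previous), edge_set(current)
    union = prev_set | curr_set
    return 1 - len(prev_set & curr_set) / len(union) if union else 0.0


# Two made-up consecutive snapshots.
snapshot_t0 = [(1, 2), (2, 3), (3, 4)]
snapshot_t1 = [(2, 1), (3, 4), (4, 5)]
kept = persistence(snapshot_t0, snapshot_t1)
changed = turnover(snapshot_t0, snapshot_t1)
```

Computing the same quantities on real and generated trajectories gives a direct, if coarse, check of temporal realism.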
Preferred academic background
Good statistics and inference background.
Preferred coding languages, packages, etc.
- Python
Science Communication: Design a Workshop for Kids
Research topic
Several research topics to choose from among CSH research themes. The specific selection will be made at the beginning of the internship.
Supervisor(s)
Anja Böck + the respective research group leader(s)
Project description
At CSH, we strive to make cutting-edge research accessible to a broad audience – including, and especially, children and school students. Inspiring young people at an early age not only nurtures curiosity and critical thinking, but also counters science skepticism and can spark interest in pursuing a scientific career. At the same time, researchers themselves gain valuable skills by learning how to communicate complex ideas in clear and engaging ways.
With this in mind, the CSH has launched CSH Goes School, a program that provides free, ready-to-use classroom materials based on CSH research and visualizations, as well as interactive workshops for students. This internship will contribute directly to this initiative.
Goal for the internship
The intern(s) will design a workshop on a CSH research topic for students aged 10–14. The workshop should be suitable across these school levels and accompanied by teaching materials similar to those already offered by CSH. It should foster network thinking, illustrate how interdependencies arise, and demonstrate the broader impact of science on society.
The workshop will be tested internally – and possibly with children in a vacation program. Interns will gain experience in science communication, educational design, and translating complex concepts into accessible materials.
Methods to be used
The intern will read and synthesize relevant publications, translate complex research into simpler, engaging narratives, and connect scientific concepts to students’ everyday experiences. Creative approaches and thinking beyond disciplinary boundaries are encouraged.
Preferred academic background
Any. Curiosity, creativity, and an interest in science communication or education are most important, along with quantitative skills and an interest in complexity science.
German language skills are an advantage.