Thomas Sokolowski joined FIAS in April 2020 as part of the LOEWE research focus CMMS. In this interview he explains how he uses stochastic methods and fundamental physical laws to describe complicated biochemical processes at the cell and tissue level efficiently and as accurately as possible.
In your research you use stochastic methods to describe living systems. Why is this kind of mathematical approach interesting for biological processes?
Nowadays the idea that living systems have a "perfect design" and are therefore superior to comparable technical solutions created by humans is very widespread. Quite often this is indeed the case. However, this view neglects the fact that biological mechanisms are based on molecular transport and conversion processes. These are fundamentally stochastic, i.e. random, and can therefore reach specific "goals" only with a certain, possibly low, probability. For example, a molecule that is supposed to signal a change at the cell membrane to the interior of the cell can bring this signal to its destination only by random movement (diffusion). And even when the molecule finally reaches its destination, the signal can only be transmitted by a chemical transformation, which is again a random process. The influence of such microscopic random processes on the large-scale behavior of an entire system is significant not only in biological cells, but also in populations of organisms. We are currently experiencing this very tangibly through the spread of the coronavirus.
How does that work?
To describe living systems accurately, one therefore needs to employ stochastic models. Here it is important that both the microscopic random processes mentioned above and their biophysical boundary conditions are modeled realistically. This is because the effects of these elementary stochastic processes can accumulate, so that even small discrepancies in their description can lead to very different results at the cell or tissue level. In practice this means that the probability distributions used in biological models, and the descriptions of the effects influencing them, must be consistent with the basic principles of statistical physics and chemistry.
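As a concrete illustration of such a physically consistent distribution (a generic textbook example, not a model specific to this research), the Gillespie algorithm simulates chemical reactions in a well-mixed system by drawing exponentially distributed waiting times between reaction events, which is exactly what statistical chemistry prescribes. The rate constant and molecule numbers below are arbitrary illustration values:

```python
import random

def gillespie_decay(n0, k, t_end, seed=0):
    """Minimal Gillespie simulation of the single reaction A -> 0
    with rate constant k. The waiting time until the next reaction
    event is exponentially distributed with rate k * n, as required
    for a memoryless (Markovian) chemical process."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    while n > 0:
        propensity = k * n                # total reaction rate
        t += rng.expovariate(propensity)  # exponential waiting time
        if t > t_end:
            break                         # next event lies beyond t_end
        n -= 1                            # one molecule reacts away
    return n

# Illustration: 1000 molecules decaying with k = 0.5 for 2 time units.
# The expected number of survivors is n0 * exp(-k * t_end), about 368.
remaining = gillespie_decay(n0=1000, k=0.5, t_end=2.0)
print(remaining)
```

A single stochastic run scatters around the deterministic mean; averaging many runs with different seeds recovers the classical exponential decay curve.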
Which processes or systems are particularly fascinating to you?
First of all, the fact that living organisms can reliably process information using fundamentally noisy random processes for decision-making and initiating specific actions, with a precision and efficiency that is often not inferior, and sometimes even superior, to our modern technology. The most prominent example of this is certainly the human brain, which, due to its complexity, now defines a whole scientific discipline of its own. More surprisingly, however, we find examples of efficient information processing even in the simplest organisms, such as bacteria. With the help of improved experimental methods, the amazingly high precision of many such processes has been quantitatively determined in recent decades. But our mechanistic understanding of how this precision is achieved is still in its infancy, especially for more complex, multicellular systems.
To what extent is information processing relevant for living systems?
Efficient information processing is not only relevant during the lifetime of an organism, but for its development in the first place. Most people intuitively associate the term "life" with metabolism or self-preservation, with reproduction or preservation of the species, and perhaps with the ability for purposeful movement. However, the phenomenon of life can also be regarded as a spatially spreading information-processing sequence extending over many generations of organisms, refined by continuous evolution to pass on information about successful concepts efficiently. Humans tend to take it for granted that they can recognize themselves in their children; yet this is anything but a matter of course. The more one learns about the underlying processes of embryo development, the more astonishing it becomes that the "blueprint" of an organism encoded in its genetic information is so highly reproducible, and the more fascinating the question of which natural mechanisms make this possible. What I find even more exciting, though, is the fact that evolution, despite its brutal pressure towards optimization, has not only found one or a few good solutions to the problem of efficient information processing, but continues to surprise us with previously unknown yet equally efficient mechanisms. This is in contrast to the obvious convergence of technical solutions for "purely physical" problems, such as carrier rockets or passenger planes. It is possible that a fundamental principle is hidden behind this: namely that, starting from a certain level of system complexity and under suitable boundary conditions, the number of "sufficiently optimal" solutions explodes. We observe something similar, for example, with deep learning algorithms. I believe that a better understanding of this interplay between complexity and "optimizability" will be an important key to elucidating the phenomenon of life and its evolutionary diversity.
In order to describe biochemical reactions you probably need a lot of computing power. How do you manage to create algorithms that are efficient but do not simplify reality so much that they become inaccurate?
A biological cell consists of trillions of atoms whose behavior is governed by random processes. Of course, we cannot model all of these processes microscopically in order to simulate and understand biochemical processes. And even if we could do so in the distant future, it would not make sense, because smart algorithms can simulate many random processes with considerably fewer computation steps. The "trick" here is to separate the processes whose outcome we can predict from analytically derivable probability distributions from those which we actually have to simulate microscopically, step by step. For example, if the above-mentioned signal molecule spends most of its time freely diffusing inside the cell and only rarely encounters reaction partners, then it is unnecessary to simulate each diffusion step individually. Instead, in this case, we can calculate the probability distribution for the diffusion process exactly and use it to draw a statistically consistent later position directly. Under favorable circumstances, this can save millions of computation steps. Even when the molecule gets close to potential reaction partners, in many cases the probability distribution for the reaction events can be derived exactly. This requires many analytical derivations, which in fact are often tractable only under simplifying assumptions. For example, a common simplification is to represent molecules as perfect spheres when their internal structure is less relevant to the process under investigation than their spatial distribution in the cell. However, it is then extremely important to define the resulting model such that the simplified "sphere molecules" effectively behave as if they were not simplified. Often this is ensured not only via theoretical considerations, but also by simulations on a spatially finer scale. That is one of the reasons why it is so important to link approaches on different spatial scales, as now pursued by the CMMS research focus.
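The idea of skipping individual diffusion steps can be sketched in a few lines. This is a toy 1D illustration with made-up parameters, not the actual algorithm described above: instead of simulating many small Brownian steps, one draws the displacement after a time t directly from the exact Gaussian solution of the diffusion equation, whose variance is 2·D·t.

```python
import math
import random

def brute_force_position(x0, D, t_total, dt):
    """Free 1D diffusion simulated naively: many small Brownian steps,
    each Gaussian with standard deviation sqrt(2 * D * dt)."""
    x = x0
    sigma_step = math.sqrt(2.0 * D * dt)
    for _ in range(int(t_total / dt)):
        x += random.gauss(0.0, sigma_step)
    return x

def event_driven_position(x0, D, t_total):
    """The same displacement drawn in ONE step from the exact Gaussian
    solution of the diffusion equation (std = sqrt(2 * D * t_total))."""
    return x0 + random.gauss(0.0, math.sqrt(2.0 * D * t_total))

# Both approaches agree in distribution; check via the sample variance.
random.seed(1)
D, t = 1.0, 10.0          # hypothetical diffusion constant and time span
n = 20000
var_exact = 2.0 * D * t   # theoretical variance of the displacement
samples = [event_driven_position(0.0, D, t) for _ in range(n)]
var_est = sum(x * x for x in samples) / n
print(var_est)            # statistically close to var_exact = 20.0
```

Replacing, say, 10 000 small steps of size dt by a single exact draw is precisely where the "millions of saved computation steps" come from; the real challenge, handled analytically in methods like those described here, is doing the same in the presence of boundaries and reaction partners.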
Will this also influence your work at FIAS?
Definitely, because the compromise between the realism and level of detail of models on the one hand and computational efficiency on the other is a fundamental one. Even if we had "unlimited" computing capacity at our disposal, we would not want to waste it. After all, computer calculations not only incur financial costs but, above all, consume significant amounts of energy. Therefore there is a permanent need, not only in the life sciences, to make realistic models more efficiently computable, or, conversely, to extend efficient algorithms to more detailed model scenarios.
With my work at FIAS, I actually plan to pursue both of these paths. In the past, I have developed models and algorithms that can compute biochemical processes at the cell and tissue level very efficiently, but only in relatively simplified geometries. Here we will work on extensions towards more realistic representations of biological structures and their dynamics while maintaining high algorithmic efficiency. On the other hand, we want to start from deterministic models with a higher degree of detail and develop efficient and biophysically faithful stochastic algorithms for their simulation. Naturally, the derivation of mathematical predictions for the underlying microscopic stochastic processes will play a central role in this.
The LOEWE focus CMMS brings together biologists, computer scientists, physicists and mathematicians with the aim of gaining a comprehensive understanding of living systems, from elementary molecular biological processes, such as the mode of action of an enzyme, up to the complex behaviour of whole organisms. What do you expect from this major interdisciplinary project?
The genuinely interdisciplinary approach of the CMMS focus offers all participants great opportunities for integrating and synergizing their research efforts, precisely because of their different scientific backgrounds. This mainly takes place on three levels. Firstly, researchers from different disciplines bring their specific perspectives and approaches to the project, which illuminates the same research topic from different angles and thus promotes an understanding of all relevant details. Secondly, the project encompasses research on a variety of different living systems; despite all the differences, these systems have one feature in common, namely that nature has found efficient solutions for performing targeted operations based on fundamentally stochastic elementary processes. By comparing the corresponding mechanisms between systems, new hypotheses can be formulated on the one hand, and on the other hand the natural diversity of different efficient solutions to similar tasks can be explored. Thirdly, working on different adjacent scales makes it possible to validate the correctness of approaches and models against the adjacent finer scale, and to understand how mechanisms interact at the next-higher scale.
Personally, I hope that the interactions within the project will above all broaden and deepen my understanding of the various mechanisms that "keep randomness in check" in living systems. In particular, I seek to better understand how elementary mechanisms of cellular information processing are orchestrated, and thereby further enhanced, on the superordinate scale of cell populations and tissues, but also how these elementary mechanisms can be implemented efficiently, chemically and physically, on the subordinate molecular scale.
Are there any collaborations or synergies that you find particularly exciting or are especially looking forward to?
My work spans scales ranging from groups of single molecules to communicating groups of cells. Therefore, I am particularly looking forward to collaborating with researchers who focus on the adjacent scales, in line with the primary goal of the CMMS project. Here it will be particularly exciting and challenging to develop new models and algorithms that efficiently bridge the different scales, and will thus be both more realistic and more versatile.
At the same time, I am looking forward to the thematic exchanges with the groups dealing with mechanisms of information processing in living systems, especially in the field of tissue development. The variety of different systems that are being studied within the CMMS project will allow us to share relevant insights and move towards a comprehensive picture of successful biological information processing strategies. Furthermore, I am convinced that methods and findings from stochastic modeling of biochemical processes could also contribute to the understanding of stochastically interacting populations, not only in a biological but also in a social setting. In that context I find it very exciting to explore possible synergies with the fields of epidemiology and econophysics, both represented here at FIAS.