# Aljoscha Palinkas:

## Integrated modelling of metabolic and regulatory networks

### Abstract

Two cellular subsystems are the metabolic network and the gene regulatory network. In systems biology they have mostly been modelled in isolation, either with ordinary differential equations (ODEs) or with tailored formalisms such as constraint-based methods for metabolism and logical networks for gene regulation. In reality the two systems are strongly interdependent. Integrating them is a challenge for mathematical modelling, and a variety of approaches have been proposed.

Long-term alterations in metabolism result from changes in gene expression, which determines the production of enzymes. This transcriptional control can adjust the metabolic network to changes in the environment or in the requirements of the cell. In fact, the cell cycle is connected to cyclic changes in metabolism, so-called metabolic cycling, but alterations are also observed in non-proliferating cells in a constant environment. A mathematical model to describe and explain alterations in metabolism is proposed here.

First, a resource allocation model for the enzymes in a metabolic network is developed and integrated into a constraint-based model of metabolism in Chap. 3. The reaction rates are bounded depending on the availability of enzymes, which in turn is determined by the overall distribution of the limited resources. In Chap. 4, this model is used to test the hypothesis that metabolic alterations are a means for the cell to achieve the required production of metabolic output most efficiently. First a toy model is analysed, and then the method is applied to a core metabolic network of central carbon metabolism. The tasks of this metabolic network are the production of biomass precursors as well as the constant provision of a minimum of energy and antioxidants. The mathematical model gives a mixed-integer linear optimisation problem with a few quadratic constraints and a quadratic objective function. 
Instead of searching for a single flux distribution, a feasible solution here corresponds to a sequence of several flux distributions together with the time spent in each of them. Using these flux distributions consecutively for the associated time spans yields the required output. The objective is to minimise the total time needed. The computations demonstrate that switching between several flux distributions produces the output in a significantly shorter time span than an optimal single flux distribution. In the toy model we could identify the relationship between the model parameters and the efficiency of static versus sequential flux distributions. Such a comprehensive analysis is not possible for the large number of parameters in our core metabolic network. To make sure that the confirmation of the hypothesis is not restricted to a minor region of the parameter space of the resource allocation model, we perturbed the parameters randomly and repeated all computations. This empirical analysis showed that the significant gain in performance is a robust feature of the model.

From the mathematical point of view, the proposed resource allocation model defines for each gene expression state a flux space from which a flux distribution can be chosen. This flux space is in general neither linear nor convex, a property that turns out to depend on the space of all possible gene expression states. In our model the genes regulate the enzyme concentrations in an on-off manner, determining only the active and inactive parts of metabolism. Furthermore, certain groups of genes are regulated together as functional units. As a consequence, the enzyme concentrations cannot be perfectly adjusted to a given flux distribution in this model, and it is for this reason that switching can increase the efficiency. A simpler model of resource allocation, based solely on molecular crowding, has been proposed before in the literature. 
It allows the resources to be distributed so that they perfectly match any given flux distribution, and switching is then not necessary to obtain the minimal production time. In contrast to such a resource allocation model, our modelling assumptions and computational results suggest a design principle in which the optimal adjustment to given conditions and requirements is achieved not by fine-tuning enzyme concentrations, but by switching between different flux distributions, which are only roughly determined by transcriptional control and which do not perfectly match any one condition or requirement. In terms of geometry, the difference lies in the convexity of the flux space. If it is convex, minimal production time can always be achieved with a single flux distribution.

To characterise a set of flux distributions sufficient to constitute an optimal sequence, the flux space of the network without the resource allocation model is considered in Chap. 3. The corresponding polytope allows a finite subset of the flux space to be characterised in terms of decomposability, a notion closely related to elementary modes. For any output requirements, an optimal sequence can be constituted from this finite set of flux distributions. In practice, neither solving the optimisation problem derived from the modelling approach nor computing the sufficient finite subset is tractable for large networks. Divide-and-conquer strategies are also not promising for obtaining optimal solutions in general; a counterexample is given in Chap. 6. Alternative computational methods to obtain optimal or approximately optimal solutions are then presented.

The gene regulatory network behind the metabolic genes is not fully considered in the resource allocation model of Chap. 3. Only some constraints are added in the application to the core metabolic network in order to exclude unrealistic patterns of gene expression. 
Incorporating more information about gene regulation into the computational model in fact improves tractability, because the search space is reduced. A sufficiently small search space of gene expression sequences makes it possible to perform a more precise and extensive analysis using an alternative computational approach.

In Chap. 5, the perturbations of model parameters, as applied to the core metabolic network to verify robustness, are considered in general. From the mathematical point of view, the linear constraints that bound the flux space are perturbed. The consequences for the geometry of the flux space and for the objective value of an optimisation problem over this flux space are analysed, and an effect is discovered that is surprising at first sight. If the bounds on the reaction rates are perturbed individually, without a bias towards increase or decrease, the objective value of a given linear optimisation problem decreases in expectation. This effect emerges from the representation of the flux space. In particular, redundancy of the constraints plays a crucial role.

Modelling and analysing the dynamics of gene regulatory networks with so-called logical networks is a common discrete approach. Logical networks are often represented by logical functions, which have the advantage of being mathematical objects that can be given in a natural and easily understandable format, namely Boolean expressions. In Chap. 7, a method is presented to obtain a short and readable representation of a given logical function. It is based on the minimisation of Boolean expressions, but is designed for multi-valued logical functions in particular. All possible dynamics of a logical network can be represented in the so-called state transition graph. 
Simply by assigning rates to all edges, which represent the transitions between different states, this directed graph becomes a continuous-time Markov chain (CTMC), which we call a stochastic logical network. This modelling approach opens new possibilities for the analysis of quantitative dynamical properties, as shown in Chap. 8. In contrast to this abstract model, detailed mechanistic and stochastic models of biochemical reaction systems can be formulated with the chemical master equation, which also defines a CTMC. In fact, these two formalisms can be combined, so that distinct components of the biological system are modelled in great detail by the master equation and other parts on a higher abstraction level as a stochastic logical network. The combined model can focus on certain aspects, capturing related quantitative and stochastic effects, while keeping the overall complexity to a minimum.

Finally, Chap. 9 discusses the feedback regulation from metabolism to gene regulation. In an integrated dynamic model of gene regulation and metabolism, this aspect should not be missing. Since constraint-based models neglect the concentrations of metabolites, it is difficult to determine the regulatory feedback to the genes. This problem can be circumvented by inferring only metabolically mediated interactions between genes, in the sense that a switch in gene expression leads to an alteration in the metabolic network, which in turn gives a new regulatory input to the gene network. To this end, a constraint-based approach is proposed and compared to a method from the literature that is based on metabolic sensitivity analysis. Furthermore, a strategy to derive concentration changes from changes in flux rates and enzyme activities is briefly presented.
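
The stochastic logical network construction summarised for Chap. 8 can be illustrated with a minimal sketch: build the state transition graph of a logical network, attach a rate to every edge, and simulate the resulting CTMC. Everything specific here is an assumption made for illustration and is not taken from the thesis: the two-gene mutual-inhibition network, the asynchronous update scheme, the choice of one rate per gene, and all function names.

```python
import random
from itertools import product


def update(state):
    """Hypothetical Boolean rules: each gene is active iff the other is inactive."""
    a, b = state
    return (int(not b), int(not a))


def transition_graph(n_genes, update_fn):
    """Asynchronous state transition graph.

    For every state, one outgoing edge per gene whose current value
    differs from its target value; the edge flips only that gene.
    """
    edges = {}
    for state in product((0, 1), repeat=n_genes):
        target = update_fn(state)
        successors = []
        for i in range(n_genes):
            if target[i] != state[i]:
                nxt = list(state)
                nxt[i] = target[i]
                successors.append((tuple(nxt), i))
        edges[state] = successors
    return edges


def simulate(edges, rates, start, t_max, rng):
    """Gillespie-style simulation of the CTMC on the graph.

    Assumption: every edge that flips gene i has rate rates[i].
    Returns the list of (jump time, state) pairs visited up to t_max.
    """
    t, state = 0.0, start
    path = [(t, state)]
    while edges[state]:  # stop in states without outgoing edges
        weights = [rates[i] for _, i in edges[state]]
        t += rng.expovariate(sum(weights))  # exponential waiting time
        if t > t_max:
            break
        targets = [s for s, _ in edges[state]]
        state = rng.choices(targets, weights=weights)[0]
        path.append((t, state))
    return path


edges = transition_graph(2, update)
rng = random.Random(0)
path = simulate(edges, rates=[1.0, 3.0], start=(0, 0), t_max=100.0, rng=rng)
print(path)
```

In this toy network the states `(1, 0)` and `(0, 1)` are fixed points. With the assumed rates, a run started in `(0, 0)` is absorbed in `(0, 1)` with probability 3/(1+3) = 0.75, a quantitative statement that the purely logical state transition graph cannot express.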