# Sayed-Amir Marashi:

## Constraint-based analysis of substructures of metabolic networks

### Abstract

Constraint-based methods (CBMs) are promising tools for the analysis of metabolic networks, as they do not require detailed knowledge of the biochemical reactions. Some of these methods only need information about the stoichiometric coefficients of the reactions and their reversibility types, i.e., constraints for steady-state conditions. Nevertheless, CBMs have their own limitations. For example, these methods may be sensitive to missing information in the models. Additionally, they may be slow for the analysis of genome-scale metabolic models. As a result, some studies prefer to consider substructures of networks instead of complete models, while others have focused on better implementations of the CBMs.

In Chapter 2, the sensitivity of flux coupling analysis (FCA) to missing reactions is studied. Genome-scale metabolic reconstructions are comprehensive, yet incomplete, models of real-world metabolic networks. While FCA has proved an appropriate method for analyzing metabolic relationships and for detecting functionally related reactions in such models, little is known about the impact of missing reactions on the accuracy of FCA. Note that having missing reactions is equivalent to deleting reactions, or to deleting columns from the stoichiometric matrix. Based on an alternative characterization of flux coupling relations using elementary flux modes, we study the changes that flux coupling relations may undergo due to missing reactions. In particular, we show that two uncoupled reactions in a metabolic network may be detected as directionally, partially or fully coupled in an incomplete version of the same network. Even a single missing reaction can cause significant changes in flux coupling relations. In the case of two consecutive E. coli genome-scale networks, many fully coupled reaction pairs in the incomplete network become directionally coupled or even uncoupled in the more complete reconstruction.
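The effect of missing reactions on flux coupling can be illustrated on a toy network. The following is a minimal sketch, not the analysis performed in the thesis: it assumes a hypothetical one-metabolite network with four irreversible reactions (`R1`, `R3` producing metabolite A; `R2`, `R4` consuming it), enumerates elementary modes by brute force over supports, and classifies coupling with the elementary-mode-based characterization, under the assumption that no reaction is blocked. All function and variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def nullspace(rows, ncols):
    """Exact basis of {x : rows * x = 0}, via Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(ncols):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: it is a free variable
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for row, pc in zip(m, pivots):
            v[pc] = -row[free]
        basis.append(v)
    return basis


def elementary_modes(S, names, keep):
    """EMs of the network restricted to the reactions in `keep`.

    Assumes every reaction is irreversible. A support T carries an EM iff
    the null space of the columns in T is one-dimensional and spanned by a
    strictly positive vector.
    """
    cols = [names.index(n) for n in keep]
    ems = []
    for k in range(1, len(cols) + 1):
        for T in combinations(cols, k):
            basis = nullspace([[row[j] for j in T] for row in S], k)
            if len(basis) == 1 and all(x > 0 for x in basis[0]):
                ems.append({names[j]: x for j, x in zip(T, basis[0])})
    return ems


def coupling(ems, i, j):
    """Classify the coupling of reactions i and j from a list of EMs."""
    i_needs_j = all(j in em for em in ems if i in em)
    j_needs_i = all(i in em for em in ems if j in em)
    if i_needs_j and j_needs_i:
        ratios = {em[i] / em[j] for em in ems if i in em}
        return "fully coupled" if len(ratios) == 1 else "partially coupled"
    if i_needs_j:
        return f"directional: {i} -> {j}"
    if j_needs_i:
        return f"directional: {j} -> {i}"
    return "uncoupled"


# Hypothetical toy network: one row (metabolite A), four columns (reactions).
NAMES = ["R1", "R2", "R3", "R4"]
S = [[1, -1, 1, -1]]

# Complete network, then R4 missing, then R3 and R4 missing.
for keep in (NAMES, NAMES[:3], NAMES[:2]):
    print(", ".join(keep), "=>", coupling(elementary_modes(S, NAMES, keep), "R1", "R2"))
```

In this toy network, `R1` and `R2` are uncoupled in the complete network, appear directionally coupled once `R4` is missing, and appear fully coupled once both `R3` and `R4` are missing. The brute-force enumeration is exponential in the number of reactions and is only meant for very small examples.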
In this context, we found that gene expression correlation values were significantly higher for the pairs that remained fully coupled than for the uncoupled or directionally coupled pairs. Our study clearly suggests that FCA results are indeed sensitive to missing reactions. Since the currently available genome-scale metabolic models are incomplete, we advise using FCA results with care.

In Chapter 3, a different but related problem is considered. Due to the large size of genome-scale metabolic networks, some studies suggest analyzing subsystems instead of the original genome-scale models. Note that analysis of a subsystem is equivalent to deleting some rows from the stoichiometric matrix or, equivalently, to treating some internal metabolites as external. We show mathematically that analyzing a subsystem instead of the original model can change the flux coupling relations. In particular, a pair of (fully, partially or directionally) coupled reactions may be detected as uncoupled in the chosen subsystem. Interestingly, this behavior is the opposite of the flux coupling changes that may occur due to missing reactions, or equivalently, deletion of reactions. We also show that analysis of organelle subsystems has relatively little influence on the results of FCA, and therefore many of these subsystems may be studied independently of the rest of the network.

In Chapter 4, we introduce a rapid FCA method that is appropriate for genome-scale networks. Several approaches for FCA have previously been proposed in the literature, namely the flux coupling finder algorithm, FCA based on minimal metabolic behaviors, and FCA based on elementary flux patterns. To the best of our knowledge, none of these methods is freely available as software. Here, we introduce a new FCA algorithm, FFCA (Feasibility-based Flux Coupling Analysis). This method is based on checking the feasibility of a system of linear inequalities.
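As an illustration of what a feasibility test for flux coupling can look like (a sketch under assumed notation, not necessarily the exact system solved by FFCA), let $S$ denote the stoichiometric matrix, $v \in \mathbb{R}^n$ a flux vector, and $\mathrm{Irr}$ the set of irreversible reactions. The steady-state flux cone is

$$
C = \{\, v \in \mathbb{R}^n : S v = 0,\ v_k \ge 0 \ \text{for all } k \in \mathrm{Irr} \,\}.
$$

Reaction $i$ is directionally coupled to reaction $j$ if $v_i \neq 0$ implies $v_j \neq 0$ for every $v \in C$. Because $C$ is closed under positive scaling, for an irreversible reaction $i$ this holds exactly when the linear system

$$
S v = 0, \qquad v_k \ge 0 \ \ (k \in \mathrm{Irr}), \qquad v_i = 1, \qquad v_j = 0
$$

has no solution. For a reversible reaction $i$, the same test would also have to be run with $v_i = -1$.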
We show on a set of benchmarks that, for genome-scale networks, FFCA is faster than other existing FCA methods. Using FFCA, flux coupling analysis of genome-scale networks of S. cerevisiae and E. coli can be performed in a few hours on a normal PC. A corresponding software tool is freely available for non-commercial use.

In Chapter 5, we introduce a new concept that can be useful in the analysis of fluxes in network substructures. Analysis of elementary modes (EMs) has proven to be a powerful CBM in the study of metabolic networks. However, enumeration of EMs is a hard computational task. Additionally, due to their large number, one cannot simply use them as input for subsequent analyses. One possibility is to restrict the analysis to a subset of interesting reactions, rather than the whole network. However, analysis of an isolated subnetwork can result in finding incorrect EMs, i.e., ones that are not part of any steady-state flux distribution in the original network. The ideal set of vectors to describe the usage of reactions in a subnetwork would be the set of all EMs projected onto the subset of interesting reactions. Recently, the concept of "elementary flux patterns" (EFPs) has been proposed. Each EFP is a subset of the support (i.e., the non-zero elements) of at least one EM. In the present work, we introduce the concept of ProCEMs (Projected Cone Elementary Modes). The ProCEM set can be computed by projecting the flux cone onto the lower-dimensional subspace and enumerating the extreme rays of the projected cone. In contrast to EFPs, ProCEMs are not merely sets of reactions; from the mathematical point of view, they are projected EMs. We additionally prove that the set of EFPs is included in the set of ProCEM supports. Finally, ProCEMs and EFPs are compared in the study of substructures in biological networks.
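The projection underlying ProCEMs can be written down compactly. The notation below is assumed for illustration and is not taken from the thesis: split each flux vector as $v = (x, y)$, where $x \in \mathbb{R}^p$ collects the fluxes through the reactions of interest and $y \in \mathbb{R}^{n-p}$ the remaining fluxes, and let $C$ be the steady-state flux cone of the full network. The projected cone is then

$$
P(C) = \{\, x \in \mathbb{R}^p : \exists\, y \in \mathbb{R}^{n-p} \ \text{such that} \ (x, y) \in C \,\},
$$

and, following the description above, the ProCEMs are the extreme rays of $P(C)$. If all reactions of interest are irreversible, $P(C)$ lies in the non-negative orthant and is therefore pointed, so its extreme rays are well defined. Note that $P(C)$ is in general different from the flux cone of the isolated subnetwork, which ignores the steady-state constraints imposed by the rest of the network.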