Randomness is a natural part of biological systems and can make comparisons between biological entities difficult. The challenge lies in separating truly different phenomena from random perturbations. The aim of this study was to compare, with statistical accuracy, the growth of 13 Escherichia coli strains subjected to varying concentrations of the growth inhibitor lactoferrin.
E. coli is a complex group of bacteria comprising mostly harmless commensals that are normal inhabitants of the intestinal tract of all warm-blooded animals, including humans. Some E. coli strains have been proposed as candidates for probiotic treatment of enteric diseases, while other subsets have acquired different sets of virulence factors that may cause intestinal and extra-intestinal disease. Most pathogenic E. coli follow a common strategy for infection based on adhesion to and colonization of epithelial cells in the host, evasion of host defenses, multiplication and host damage. Diarrhoeagenic E. coli consist of six pathogroups based on different virulence factors, clinical symptoms and serotypes: enteropathogenic E. coli (EPEC), enterotoxigenic E. coli (ETEC), enteroinvasive E. coli (EIEC), enteroaggregative E. coli (EAEC), Shiga toxin-producing E. coli (STEC) and diffusely adherent E. coli (DAEC).
Lactoferrin is an iron-binding protein found in secretions such as milk, tears and saliva, and has an antibacterial effect based on two distinct mechanisms [3, 4]. Lactoferrin inhibits growth through its ability to bind ferric iron, and limiting this essential nutrient results in bacteriostasis [4, 5]. In addition, lactoferrin may prevent bacterial adhesion to and invasion of mammalian cells by interfering with surface-expressed virulence factors, thereby decreasing virulence. Decreased adhesion to epithelial cells caused by lactoferrin has been reported for several E. coli pathogroups such as EPEC, STEC, EAEC and ETEC [6–8], and decreased invasion has been identified for EIEC.
Antibacterial effect is often studied as the minimal inhibitory concentration (MIC) of the tested sample at a single time point, usually after overnight bacterial growth. Studying growth continuously over time is important for observing how and when bacteria respond to an antibacterial substance. However, associating explanatory variables with continuous bacterial growth curves, as is done in longitudinal studies, is difficult with traditional statistical methods such as ordinary least squares (OLS) regression because the measured points depend on one another. In addition, bacterial growth, as measured using optical density (OD), can produce non-linear curves that are difficult to handle with linear OLS-based methods.
Modeling of bacterial growth is commonly divided into two branches, namely predictive and descriptive models (See  and references therein). Examples of the former include mathematical models based on various types of differential equations. These models are usually supplied with a list of initial conditions that are instrumental in predicting the progress of bacterial growth. Predictive models therefore assume a predetermined rule or causation for how bacterial cultures evolve with time. Descriptive models, on the other hand, are based on statistical inference, i.e. they are descriptive in the sense that no predetermined rule is assumed. Hence, descriptive models can be used to examine the association between a set of relevant external factors and the observed growth trends. Such methods can also be used to analyze and compare differences within and between bacterial growth curves.
Due to the nature of growth curve data, standard statistical models will easily find differences between growth curves; the challenge, however, lies in separating the truly different growth curves from similar curves that differ only as a result of random fluctuations in growth. In the approach taken here, the slopes, or growth rates, are calculated for each bacterial growth curve, which in this study consists of OD measurements with respect to time. Since bacterial growth varies with time, each curve is divided into three intervals of equal duration, and a slope is calculated for each interval. The focus here is on comparing the growth of the different strains and not on modeling the growth per se. For the comparisons to be as accurate as possible, each interval must contain the same number of points so that a slope can be computed from line segments of identical length (i.e. duration in our case, since the data are time dependent). Since the lengths of the lag, exponential and stationary phases of the growth curves vary considerably among the different strains, modeling and comparing these phases exactly using the procedure applied here would make the analyses inaccurate and cumbersome. We have therefore opted for a more unusual way of comparing growth curves that focuses on growth change over three equal intervals for each curve. This is referred to as growth rate in the present study, but since growth is measured using absorbance it should be understood as OD rate. We assert that the growth rates of a set of bacterial strains are similar if the calculated slopes are within a 95% confidence interval found using, for instance, a t-distribution. Conversely, we consider slopes, or growth rates, outside the 95% confidence interval to be significantly different from the ones inside. We claim that this approach makes it fast and easy to statistically compare many bacterial growth curves.
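The interval-slope comparison described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names (`ols_slope`, `interval_slopes`, `flag_outside_ci`) are hypothetical, the confidence interval is assumed to be the usual t-based interval around the mean slope (mean ± t × s/√n), and the critical value `t_crit` is assumed to be supplied by the user (for example, approximately 2.179 for 13 strains, i.e. 12 degrees of freedom at the two-sided 95% level).

```python
from math import sqrt
from statistics import mean, stdev


def ols_slope(times, ods):
    """Least-squares slope of OD against time for one line segment."""
    t_bar, y_bar = mean(times), mean(ods)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, ods))
    sxx = sum((t - t_bar) ** 2 for t in times)
    return sxy / sxx


def interval_slopes(times, ods, n_intervals=3):
    """Split one growth curve into intervals with the same number of
    points and return the slope (OD rate) of each interval.
    Trailing points that do not fill a whole interval are ignored."""
    size = len(times) // n_intervals
    return [
        ols_slope(times[k * size:(k + 1) * size], ods[k * size:(k + 1) * size])
        for k in range(n_intervals)
    ]


def flag_outside_ci(slopes, t_crit):
    """Assumed interval: mean +/- t_crit * s / sqrt(n) over the strains'
    slopes for one interval. Returns the interval and, per strain,
    True when its slope falls outside the interval."""
    centre = mean(slopes)
    half_width = t_crit * stdev(slopes) / sqrt(len(slopes))
    low, high = centre - half_width, centre + half_width
    return (low, high), [not (low <= s <= high) for s in slopes]


# Toy, made-up curve: nine OD readings taken at equal time steps.
toy_times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
toy_ods = [0.0, 0.1, 0.2, 0.5, 1.0, 1.5, 1.6, 1.6, 1.6]
toy_slopes = interval_slopes(toy_times, toy_ods)
```

In this sketch, `interval_slopes` would be applied to every strain, and `flag_outside_ci` would then be called once per interval on the slopes collected across strains, so that each of the three intervals is compared separately.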
There exist several extensions to OLS-based methods, such as generalized linear regression models (GLM), Gompertz-based models and others, that can be used to analyze and compare bacterial growth and similar types of studies. A recent approach used non-parametric mixed-effects regression with random effects to handle the dependence between consecutive points in growth curve data. The regression model was further improved by bootstrapping the regression coefficients so that a prediction band could be obtained for the modeled growth curve. We also demonstrate the use of generalized additive model (GAM) regression, which is an extension of spline regression, to model bacterial growth in one of the E. coli strains. GAMs are almost as easy to set up as standard linear regression models, but the use of splines makes GAMs better at modeling longitudinal data. Not only can a confidence interval be obtained for the whole curve, but we also show that derivatives of the fitted curves, with confidence intervals, can be easily computed, revealing more information about how growth changes at each time point.