Big data in biopharmaceutical process development – vice or virtue?


*Corresponding author
1. Institute of Chemical and Bioengineering, Department of Chemistry and Applied Biosciences, ETH Zürich, Switzerland
2. DataHow AG, Zurich, Switzerland


Digitalization is one of the most pronounced trends in our society. 

In manufacturing, this trend is closely linked to so-called Industry 4.0, which aims to create synergetic connections among all relevant data sources, devices and stakeholders. While industries such as automotive have adopted and implemented many of the associated prerequisites in their production sites and products, the chemical and, in particular, the biotechnological and pharmaceutical industries are clearly lagging behind. These conservative industries produce a variety of molecules through complex processes based on long-standing technological expertise. However, the innovative potential offered by digitalization remains largely unexplored, with the degree of digitalization recently estimated at 27%¹. This article critically addresses the role of big data and big data analytics in the bioprocessing domain, where, for the moment, many data sources remain unexplored, data analysis is often performed in a basic and unreflective way, and data-related decisions are made through inefficient and economically suboptimal procedures.


The worldwide biopharmaceuticals market is projected to exceed USD 390 billion by 2020; with more than two hundred products approved for clinical use, it accounts for 28% of the pharmaceuticals market (1). Among these products, monoclonal antibodies (mAbs) are a prominent and steadily growing class approved for a variety of diseases, ranging from orphan indications through some cancer variants and multiple sclerosis to asthma and rheumatoid arthritis (2).

Because antibodies are generally well tolerated and highly specific, the risk of unexpected safety issues in human clinical trials is expected to be significantly reduced. On the other hand, a much more complex product quality pattern must be understood and ensured.

The clinical and commercial success of these biopharmaceuticals has encouraged large-scale production at volumes of up to 20,000 liters (3). The corresponding multistage process development cycle starts in the upstream process with the selection of the cell line, medium, reactor system and operation mode. Optimal ranges for the process parameters have to be determined (building the process design space) a ...