T-shaped partial least squares regression (T-PLSR) is a valuable machine learning technique for developing the formulation and manufacturing process of new drug products. An accurate T-PLSR model requires experimental data covering multiple formulations and process conditions. However, collecting comprehensive experimental data on large-scale manufacturing equipment is usually challenging because of the cost, time, and large consumption of raw materials involved. This study proposes a hybrid modeling approach that combines T-PLSR with transfer learning (TL) to enhance the prediction performance of a T-PLSR model for large-scale manufacturing data by exploiting a large amount of small-scale manufacturing data for model building. The proposed method, T-PLSR+TL, was applied to a practical case study on scaling up the tableting process from an experienced compaction simulator to a less-experienced rotary tablet press. The T-PLSR+TL models achieved significantly better prediction performance for tablet quality attributes of new drug products than T-PLSR models without using large-scale manufacturing data with new drug products. The results demonstrated that T-PLSR+TL handles new drug products better than T-PLSR because it uses small-scale manufacturing data to compensate for the scarcity of large-scale manufacturing data. Furthermore, T-PLSR+TL has the potential to streamline formulation and manufacturing process development for new drug products by drawing on an extensive database.
Keywords: Machine learning; Scale-up; T-shaped partial least squares regression; Tableting; Transfer learning.
Copyright © 2024 Elsevier B.V. All rights reserved.