Study

The application of machine learning as a powerful approach for delineating disease pathogenesis and potential therapeutic interventions

Clinical outcomes for patients with COVID-19 are heterogeneous, and there is interest in defining subgroups for prognostic modelling and the development of treatment algorithms. We obtained 28 demographic and laboratory variables in patients admitted to hospital with COVID-19. These comprised a training cohort (n = 6099) and two validation cohorts from the first and second waves of the pandemic (n = 996; n = 1011). Uniform manifold approximation and projection (UMAP) dimension reduction and Gaussian mixture model (GMM) analysis were used to define patient clusters. Twenty-nine clusters were defined in the training cohort and were associated with markedly different mortality rates, which were predictive within confirmation datasets. Deconvolution of clinical features within clusters identified unexpected relationships between variables. Integration of large datasets using UMAP-assisted clustering can therefore identify patient subgroups with prognostic information and uncover unexpected interactions between clinical variables. This application of machine learning represents a powerful approach for delineating disease pathogenesis and potential therapeutic interventions.
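As a rough illustration of what "mortality rates which were predictive within confirmation datasets" means in practice, the sketch below computes the observed mortality rate per cluster in a training cohort and compares it with the rate seen in the same cluster in a validation cohort. It is not the study's code: the cluster labels are assumed to have been assigned already (in the study, by UMAP and GMM), the function name `cluster_mortality` is hypothetical, and the patient data are small synthetic values invented for the example.

```python
from collections import defaultdict


def cluster_mortality(cluster_ids, died):
    """Return {cluster_id: observed mortality rate} for one cohort.

    cluster_ids: cluster label for each patient (assumed precomputed).
    died: 1 if the patient died, 0 otherwise, in the same order.
    """
    deaths = defaultdict(int)
    totals = defaultdict(int)
    for cid, outcome in zip(cluster_ids, died):
        totals[cid] += 1
        deaths[cid] += int(outcome)
    return {cid: deaths[cid] / totals[cid] for cid in totals}


# Synthetic toy data, NOT taken from the study.
train_clusters = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
train_died = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
valid_clusters = [0, 0, 1, 1, 1, 2, 2, 2]
valid_died = [0, 0, 1, 1, 0, 0, 0, 1]

train_rates = cluster_mortality(train_clusters, train_died)
valid_rates = cluster_mortality(valid_clusters, valid_died)

# Side-by-side comparison: a cluster is prognostically useful if its
# training-cohort rate is a good guide to its validation-cohort rate.
for cid in sorted(train_rates):
    valid = valid_rates.get(cid, float("nan"))
    print(f"cluster {cid}: training {train_rates[cid]:.2f}, validation {valid:.2f}")
```

With real cohorts, each cluster would contain far more patients, and the agreement between training and validation rates would be assessed statistically rather than by eye.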

Link for more information: Machine learning of COVID-19 clinical data identifies population structures with therapeutic potential (researchgate.net)

If you want to request access to the data, find out how to do this here:

Accessing Health Data