The dataset was derived from experiments led by Recursion in collaboration with Utah State University


Recursion releases open-source SARS-CoV-2 morphological imaging dataset. (Credit: Pixabay/fernando zhiminaicela)

US-based digital biology company Recursion has unveiled RxRx19, an open-source dataset and the first human cellular morphological dataset for SARS-CoV-2, the virus that causes COVID-19.

The company said the release of RxRx19 is intended to quickly provide human cellular morphological data, covering more than 1,600 small molecules, to researchers across the world who are working to fight the COVID-19 pandemic.

Recursion chief technology officer Ben Mabey said: “At Recursion we have repeatedly seen how artificial intelligence coupled with target-agnostic drug discovery can rapidly uncover insights that are obscured through traditional approaches.

“The release of RxRx19 creates an unprecedented opportunity for the machine learning community to uncover those hidden insights that will be most valuable in the fight against a global pandemic. Beyond the immediate purpose, this open-source dataset will help researchers advance in their abilities to use high content imaging for compound efficacy screening.”

Recursion derived the dataset from experiments conducted with Utah State University

Recursion said that the dataset was derived from experiments it conducted in collaboration with Utah State University to investigate the potential of 1,672 approved or clinical-stage compounds to modulate the effects of SARS-CoV-2 in human renal cortical epithelial (HRCE) cells.

The experiments ran for four weeks. They were conducted at the USU Biosafety Level 3 facility, and the results were analyzed by Recursion’s team of data scientists, engineers and machine learning scientists, who are currently working remotely.

To process the images, the company used its deep learning neural networks to generate a high-dimensional featurisation of each image, which helps identify distinct phenotypic profiles. These featurisations are also being shared publicly.

The RxRx19 dataset will give researchers access to 305,520 5-channel fluorescent microscopy images and the corresponding deep learning embeddings, which they can analyze or apply to their own experiments.

Combined with Recursion’s RxRx1 dataset, released last year, RxRx19 is said to let machine learning researchers apply modern deep learning techniques to bridge the two related datasets.

Recursion co-founder and CEO Chris Gibson said: “I am so humbled by and proud of the work of our team who worked long hours, transporting equipment and reagents nearly 60 miles each way to our collaborator every day for weeks during the height of the current pandemic crisis.

“The generation of more than 300,000 5-channel images, a preliminary pre-print manuscript and more in just four weeks is incredible under these circumstances. This speaks both to the scrappiness of the amazing team at Recursion, as well as the flexibility of our platform to adapt rapidly to explore broad areas of biology. This is just the start, there is more to come.”