Algorithm can detect biomarker in aggressive breast cancer

Jeppe Thagaard has developed a mathematical model for use in automated image analysis of tissue samples. The model opens the possibility of better and more consistent cancer prognosis and treatment.

When pathologists examine tissue samples from cancer patients, they estimate the number of specific biomarkers in the tissue to see how strongly the patient's immune system is fighting the tumor. The estimate is based on digital microscopic images of stained tissue samples, so-called histopathological sections. On this basis, the doctors give a prognosis for relapse and/or survival and put together the best treatment for the individual patient.

Today, the work is done manually and takes time, and many countries have a shortage of pathologists. But in the near future, machine learning will be able to help analyze histological images.

Industrial PhD Fellow at DTU Compute Jeppe Thagaard has developed a very promising algorithm for image analysis of tissue samples. Like pathologists, the method will be able to estimate the risk of dying from a certain type of breast cancer within a given number of years.

In many places, including Denmark, images of tissue samples are still not stored digitally, but that development is ongoing and necessary, and algorithms such as Jeppe Thagaard's will play an important role:

"Everyone talks about personalized medicine, where you find the right treatment based on individual biomarkers, and therefore we have to fundamentally think in a different way. Our research shows that it is possible to make a fully automatic setup with machine learning, where the biopsy is automatically analyzed so that hospitals save time.”

“At the same time, our AI system will be objective and consistent in its assessment and, therefore, be a valuable tool for pathologists when making their manual estimates, which also depend on the pathologists' experience. The algorithm can thus help to create more equality in cancer treatment, no matter where in the world the patients are,” says Jeppe Thagaard.

The algorithm is targeted at aggressive breast cancer

Jeppe Thagaard's algorithm has been developed together with Herlev and Gentofte Hospital, colleagues at the company Visiopharm in Hørsholm, and the research sections Visual Computing and Cognitive Systems at DTU Compute.

The model is based on the advice of the expert group International Immuno-Oncology Biomarker Working Group, which works to improve diagnosis and treatment for patients with an aggressive form of breast cancer, triple-negative breast cancer (TNBC). The expert group has recently pointed out the need to develop algorithms that can help in this work.

About 15 percent of breast cancer patients have TNBC, and they have poorer 5-year survival rates than patients with other types of breast cancer (77 percent versus 93 percent) because the cancer cells do not respond to medical treatments such as hormone therapy.

Among these patients, some do better thanks to a stronger immune system. This can be predicted from the number of the biomarker ‘stromal tumor-infiltrating lymphocytes’ (sTIL), where a high number is associated with better survival of TNBC patients.

When patients are at low risk of dying, they do not have to undergo very harsh treatment with chemotherapy and radiation. Similarly, the doctors can intensify the treatment of those patients whose tumor shuts itself off so that the immune system cannot fight it.

The algorithm works in several layers

"At the same time, our AI system will be objective and consistent in its assessment and, therefore, be a valuable tool for pathologists when making their manual estimates."
Jeppe Thagaard, PhD student at the research section Visual Computing at DTU Compute

The algorithm is built in several parts, in which a dedicated immune cell detector performs different tasks. Among other things, the model can count the cells per square millimeter and check that the cells are in close contact with the tumor. The cells must not be inside the tumor or in dead tissue; this ensures that they are responding to the tumor and not just to an inflammatory condition.
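The counting rule described above can be illustrated with a minimal Python sketch. It is not the published implementation: the `Cell` class, the region labels, and the function `stil_density` are hypothetical names chosen for illustration, and the sketch assumes that an upstream detector has already marked each cell as a lymphocyte or not and assigned it to a tissue region.

```python
from dataclasses import dataclass

# Hypothetical region labels; the article does not name them.
STROMA, TUMOR, NECROSIS = "stroma", "tumor", "necrosis"


@dataclass
class Cell:
    is_lymphocyte: bool  # output of an assumed immune cell detector
    region: str          # tissue region that the cell centre falls in


def stil_density(cells, stroma_area_mm2):
    """Lymphocytes in tumor-associated stroma per square millimeter."""
    if stroma_area_mm2 <= 0:
        raise ValueError("stroma area must be positive")
    # Only lymphocytes located in stroma are counted; cells inside the
    # tumor or in dead (necrotic) tissue are left out.
    counted = sum(1 for c in cells if c.is_lymphocyte and c.region == STROMA)
    return counted / stroma_area_mm2


cells = [
    Cell(True, STROMA),
    Cell(True, STROMA),
    Cell(True, TUMOR),     # inside the tumor: excluded
    Cell(True, NECROSIS),  # in dead tissue: excluded
    Cell(False, STROMA),   # not a lymphocyte: excluded
]
print(stil_density(cells, stroma_area_mm2=0.5))  # 4.0
```

Two of the five example cells qualify, so with an assumed stroma area of 0.5 square millimeters the density is 4 cells per square millimeter.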

“There are so many exceptions to the rules that it is difficult to make an algorithm, and it is difficult to take the rules from pathologists and implement them into the formula. But we have succeeded in this together with the international expert group,” says Jeppe Thagaard.

That the result is extremely promising and important is also underlined by the fact that the scientific article 'Automated Quantification of sTIL Density with H&E-Based Digital Image Analysis Has Prognostic Potential in Triple-Negative Breast Cancers' is included in the first special issue of the journal ‘Cancers’ that the expert group has published in the research area.

At DTU Compute, one of Jeppe Thagaard's supervisors, Professor Søren Hauberg, also highlights the strength of the method:

“The potential of the algorithm is great, as it is the first time that an AI system is being developed that actually follows the work process that pathologists demand. If we are to give pathologists a tool with actual value, it is incredibly important that we develop it in close collaboration with the field, here via the expert group.”

The development work continues

The model has been validated on a data set with 257 patients from 2004, where the algorithm's prognostic biomarker was compared with the known outcomes of the patients. However, the algorithm still requires some development before it can be built into a tool in the software systems in use today.

"E.g. we need to deal with the disadvantage of AI systems, like how do we make sure that the AI algorithm works? What does it do if something comes along that it has not seen before? We are still working on that. We will train the model on more pictures,” says Jeppe Thagaard.

He will submit his PhD thesis at the end of August and continue working in DTU Science Park at Visiopharm A/S, which was established as a start-up from DTU and is celebrating its 20th anniversary this year.

“I am very aware that in the long run, my research can have a great impact on patients' survival. This is also why I work with a company because it is necessary to get the method commercialized. When universities themselves develop something, it may be used by the partners. If this solution is to go out to the whole world - even low-income countries where it could be very useful - then it must be commercialized and wrapped in a solution that can be bought."

Figure from the article in Cancers, 18 June 2021: Automated Quantification of sTIL Density with H&E-Based Digital Image Analysis Has Prognostic Potential in Triple-Negative Breast Cancers. Figure 1: Overview of the fully automated image analysis pipeline. The input data are the scanned WSI of a TNBC patient, which is then analyzed by multiple steps. First, the tissue (dark red) is recognized from the glass to limit the analysis to only the relevant part of the scanned slide. Secondly, the tissue-level model classifies slide regions into tumor tissue (blue), non-invasive epithelium (yellow), and necrotic regions (red). In the third step, the macro-outline of the tumor is approximated, and then tumor-associated stroma and margin (turquoise) are defined. Cells across the entire sample in the tumor-associated stroma are classified as TILs (green) or not, and finally, the sTIL density and heatmap can be outputted for review.
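The third and fourth steps in the figure caption can be made concrete with a small Python sketch. It rests on loud assumptions: the slide is reduced to a coarse grid of already classified tiles, and "tumor-associated stroma" is approximated as stroma tiles within a fixed number of tiles from any tumor tile. The article does not say how the margin is actually computed, and all names (`tumor_associated_stroma`, `density_in_region`, the tile labels) are hypothetical.

```python
# Hypothetical one-character labels for a coarse, pre-classified tile map.
GLASS, STROMA, TUMOR, NECROSIS = ".", "s", "T", "n"


def tumor_associated_stroma(grid, margin=1):
    """Return the (row, col) stroma tiles within `margin` tiles of a tumor tile."""
    rows, cols = len(grid), len(grid[0])
    tumor = [(r, c) for r in range(rows) for c in range(cols)
             if grid[r][c] == TUMOR]
    selected = set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != STROMA:
                continue  # glass, tumor, and necrosis are never selected
            # Chebyshev distance: a tile counts as near if it lies in the
            # square neighbourhood of size `margin` around a tumor tile.
            if any(max(abs(r - tr), abs(c - tc)) <= margin for tr, tc in tumor):
                selected.add((r, c))
    return selected


def density_in_region(til_counts, region, tile_area_mm2):
    """TILs per square millimeter, counted only inside `region`."""
    if not region:
        raise ValueError("region is empty")
    total = sum(til_counts.get(tile, 0) for tile in region)
    return total / (len(region) * tile_area_mm2)


grid = [
    "....s",
    ".sTs.",
    ".sns.",
    "s....",
]
# Assumed per-tile TIL counts from a cell classifier.
til_counts = {(1, 1): 3, (1, 3): 1, (2, 3): 4, (0, 4): 10, (2, 2): 5}

region = tumor_associated_stroma(grid, margin=1)
print(sorted(region))  # [(1, 1), (1, 3), (2, 1), (2, 3)]
print(density_in_region(til_counts, region, tile_area_mm2=0.25))  # 8.0
```

The ten cells in the distant stroma tile and the five in the necrotic tile do not contribute, which mirrors the caption's restriction of the count to tumor-associated stroma.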

The algorithm works on cheap staining of cell samples

  • As part of the research, Jeppe Thagaard has also developed a new way to train the algorithm in the image analysis tool so that pathologists can stain the tissue samples with hematoxylin and eosin (H&E), which has been used by pathologists for many years and gives tissue structures pink and purple colors.
  • H&E staining is much cheaper than immunohistochemical biomarkers, which color specific structures in tissue yellow, green, and blue in relation to specific cancers.
  • This can be of great importance in low-income countries that cannot afford DNA tests and molecular tests of tissue samples.
  • In the future, hospital staff will be able to stain cells locally, take pictures, and upload the tissue images to a cloud solution that automatically performs image analysis, helping the hospital give patients the most appropriate treatment.