Algorithms for Medicine

Modern medicine produces massive amounts of images. Samples are X-rayed, scanned, prepared and stained, and then all of these images have to be assessed by trained medical staff, “by hand”, so to speak. For tasks like these, AI-based image recognition has long been in use in other areas of life. So why is development in medicine so slow, when elsewhere the use of this technology is practically exploding?

Prof. Dr. Christian Herta, data scientist at HTW Berlin and head of the deepHEALTH project, can name a number of reasons for this. For one thing, there are still very practical problems to be solved in medicine, for example with images of histological tissue sections: they have roughly eighty times the resolution of ordinary photos and are therefore too large to be processed in one piece by neural networks. “Histology images can only be analyzed piece by piece. The algorithm can lose information that is crucial for the overall diagnosis because it lacks the context,” says Herta.
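To make the tiling problem concrete, here is a minimal Python sketch, not the project's code; the tile size and image dimensions are made-up example values. It shows how an oversized scan can be cut into fixed-size patches that a network can digest one at a time, at the price of each patch losing its surroundings:

```python
# Illustrative sketch (not deepHEALTH code): splitting an oversized histology
# image into fixed-size tiles so a neural network can process it piece by piece.
# Tile size, stride and image dimensions are arbitrary example values.
import numpy as np

def tile_image(image: np.ndarray, tile_size: int = 512, stride: int = 512):
    """Yield square tiles cut from a large (H, W, C) image array."""
    height, width = image.shape[:2]
    for top in range(0, height - tile_size + 1, stride):
        for left in range(0, width - tile_size + 1, stride):
            yield image[top:top + tile_size, left:left + tile_size]

# A stand-in for a whole-slide scan (real slides are far larger than this).
slide = np.zeros((4096, 4096, 3), dtype=np.uint8)
tiles = list(tile_image(slide))
print(f"{len(tiles)} tiles of 512x512 pixels")  # 64 tiles
```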

Another problem is the availability of data, because an AI needs large amounts of it to learn. In addition, the data must first be annotated, that is, labeled with comments on the image content. For cancer diagnostics, for example, this means that doctors have to mark tumor areas on tissue sections (see article image) before the images can be used to train an algorithm. And while millions upon millions of ordinary photos of houses, cars or animals sit in freely accessible databases, medical images are strictly protected, for good reason. “Seen in this way, data protection is a real challenge for AI development, because there is simply less training data in well-protected areas,” says Dr. Christian Krumnow, a research associate at the institute.

Images, protein sequences, sleep data - a case for AI

The Berlin scientists nevertheless want to bring a little more artificial intelligence into medicine, taking into account, of course, the special conditions that come with the field. Image recognition plays an important role in the deepHEALTH project because there is a lot of potential for it: the most common diseases, including cancer, naturally also produce the largest volumes of images, and thus most of the work for pathologists, oncologists and radiologists.

But AI has other strengths as well. Suitably trained neural networks, for example, are also suited to language recognition. And language, in turn, shows clear parallels to biochemistry, where sequences play a central role: “The sequence of bases on a DNA strand, or of amino acids in a protein, is structurally very similar to the sequence of letters, words and sentences,” explains Herta. The team is therefore also developing algorithms for recognizing and correctly assigning biomolecule sequences, so that the machines can, for example, identify the type of pathogen in a sample based solely on genome fragments. The project team has also already worked on algorithms for sleep medicine, where complex data is generated and pattern recognition is required.
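One common way to make the parallel between biomolecule sequences and language concrete, mentioned here only as an illustration and not necessarily the approach used in deepHEALTH, is to cut a DNA fragment into overlapping “k-mers” that play the role of words:

```python
# Illustrative sketch: treating a DNA fragment like text by cutting it into
# overlapping k-mers ("words") that a language-style model could be trained on.
# The fragment and the value of k are arbitrary examples.
def kmers(sequence: str, k: int = 3) -> list[str]:
    """Return all overlapping substrings of length k."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

fragment = "ATGCGTAC"
print(kmers(fragment))  # ['ATG', 'TGC', 'GCG', 'CGT', 'GTA', 'TAC']
```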

Algorithms should support the doctor, not take over his job

In addition, the Berlin researchers are tackling a particular problem that is not quite as relevant in many other areas where AI is applied, but very much is in medicine: algorithms are a kind of black box. “Even the programmers of an algorithm often cannot say in detail how it arrived at its decision,” explains Christian Krumnow. What counts is the result: you feed the program with input (“Here is a stack of pictures”) and get the required output (“Find all the pictures with cats for me”). However, the concrete path that led to this result often remains completely unclear.

Deep learning and artificial neural networks (ANNs)

Deep learning is a form of machine learning (ML) that works mainly with raw data. Take language analysis: in earlier ML approaches, attempts were made to teach the machines the rules of grammar before letting them loose on texts. In deep learning, by contrast, an algorithm is simply fed vast amounts of text and then has the task of deriving the valid grammar rules from this data by itself. Such approaches only became feasible with growing computing capacity. Deep learning can be implemented in different ways; one of them is neural networks. These are loosely modeled on the human brain: numerous nodes that carry out simple calculations are linked into a network in which information is passed on from node to node.
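As a rough illustration of such a network, here is a minimal sketch in PyTorch with arbitrary layer sizes; it is not a model from the project, only a picture of nodes doing simple calculations and passing the result on:

```python
# Minimal sketch of an artificial neural network in PyTorch (layer sizes and
# class labels are arbitrary): each node computes a simple weighted sum plus a
# nonlinearity, and the result is passed on from layer to layer.
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(100, 64),  # input layer -> 64 hidden nodes
    nn.ReLU(),           # simple nonlinear calculation at each node
    nn.Linear(64, 2),    # 2 output nodes, e.g. "healthy" vs. "suspicious"
)

x = torch.randn(1, 100)  # one example with 100 input features
print(model(x))          # two raw scores, one per class
```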

When it comes to automatically identifying cat pictures, this opacity may still be tolerable. In medicine, however, life and death can be at stake in the worst case, and doctor and patient naturally want to know exactly why the AI claims to have discovered a tumor in a tissue sample, for example. On top of that, algorithms do not always work flawlessly, quite the opposite: sometimes they are far too sure of what they are doing, which is called being “overconfident”. Krumnow gives a vivid example of the effect: “If you train a standard algorithm to distinguish dog pictures from cat pictures and then show it pictures of cars, it will not recognize the cars but will classify them as dogs or cats, and at the same time signal that it is very sure of its work.” An algorithm for medical applications, on the other hand, has to reliably detect when it is facing something other than what it has been trained on and report that it cannot provide a classification at this point. In such cases, the doctor can then step in and make up his own mind.
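What such “refuse to answer” behavior might look like at the interface can be sketched as follows. The threshold and scores are arbitrary, and plain softmax confidence is itself prone to exactly the overconfidence described above, so this only illustrates the idea of abstaining, not a solution to the research problem:

```python
# Sketch of an abstaining classifier interface: a label is only reported when
# the softmax confidence exceeds a threshold; otherwise the case is handed back
# to the doctor. Threshold, classes and scores are arbitrary example values.
import torch

def predict_or_abstain(logits: torch.Tensor, classes: list[str], threshold: float = 0.9):
    probs = torch.softmax(logits, dim=-1)
    confidence, index = probs.max(dim=-1)
    if confidence.item() < threshold:
        return "no classification - please review manually"
    return f"{classes[index.item()]} ({confidence.item():.0%} confidence)"

classes = ["dog", "cat"]
print(predict_or_abstain(torch.tensor([4.0, 0.5]), classes))  # confident -> "dog"
print(predict_or_abstain(torch.tensor([0.2, 0.1]), classes))  # unsure -> abstain
```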

Avoiding overconfidence and illuminating the decision-making paths inside the “black box”: these are among the most important goals of deepHEALTH, because a lot of basic research still needs to be done here. Christian Herta and his colleagues also want to reduce reservations among future users: doctors often react skeptically to programs that seem to do their work for them, Herta reports. The goal, however, is a system that supports doctors in their work instead of taking it off their hands. “The system must therefore not operate in a non-transparent way, nor rob the doctor of the motivation to make his own decisions.”
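One widely used way to shed at least some light into the box, mentioned here purely as an illustration and not necessarily what deepHEALTH implements, is gradient-based saliency: asking which parts of the input most influenced a particular score:

```python
# Hedged sketch of gradient-based saliency: compute the gradient of one class
# score with respect to the input and report the most influential features.
# The model is a stand-in with arbitrary sizes, not a trained medical network.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 2))

x = torch.randn(1, 100, requires_grad=True)  # one input example
score = model(x)[0, 1]                       # score of class 1 ("suspicious")
score.backward()                             # gradient of that score w.r.t. the input
saliency = x.grad.abs().squeeze()            # how strongly each feature influenced it
print(saliency.topk(5).indices)              # the 5 most influential input features
```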

Right in the middle of a rapid development

For Herta and his colleagues, being fully in step with the trend has not only had advantages. One of the sub-projects was overtaken by current developments during the application phase: a research team from the USA published results on one of the deepHEALTH research questions before work could even begin in Berlin. “We were able to adapt the project plan accordingly,” says Herta. And in the meantime there have been successes: the first doctoral candidate in the project has already completed his work, and his successor is now focusing on the interpretability of the algorithms and the question of whether and how an explanatory function can be programmed into the system. Meanwhile, Christian Herta is already formulating visions for the future: at some point he would like a program with voice output that tells the doctor directly what it has discovered and why it interprets the data one way and not another.

In any case, with deepHEALTH, HTW Berlin is right at the heart of one of the most exciting topics of the future. Even though some AI systems for medicine are already being tested at university hospitals, many of them have not yet made it into commercial use. Sooner or later, however, artificial intelligence will become part of everyday medicine too, supporting its users and perhaps even finding things here and there that even a doctor's trained eye has overlooked. If the algorithms can really help doctors and patients, then Christian Herta and his colleagues will be satisfied, for the time being.