Already better than a human: BioGPT from Microsoft is making its way into medicine

I study neural networks and actively write about them. Right now we are witnessing an explosion in their popularity, particularly with ChatGPT. It has existed for only a couple of months, and it’s already getting on people’s nerves because every second person is bragging about the slogans, texts, or resumes it wrote for them.

Okay, with many professions everything is clear. A conclusion can already be drawn from the existing capabilities of neural networks: they help humans but cannot replace them, because there are too many nuances. Almost nothing has been heard about neural networks in medicine, yet a big breakthrough has already happened there. Let’s take a look.

How GPT Neural Networks Work

In my blog, I try to write simply about complex and not entirely clear things, so now, without any complications, we’ll try to understand the essence of GPT. This will make it easier to explain Microsoft’s “breakthrough.” If I make a mistake, knowledgeable people can correct me in the comments.

GPT (Generative Pre-trained Transformer) is a machine learning model capable of generating text. It is trained on huge collections of text, and once trained, it can work in human language. It responds to a specific query with a newly composed answer, rather than returning ready-made information matched by keywords, as search engines do.

Imagine you go to a search engine and want to find something. A hypothetical Google reacts to keywords and returns existing information from the internet, ranked by relevance. Unlike the GPT model, it cannot generate an answer.

GPT, in turn, is able to compose answers based on what you need instead of offering pre-made material. But in order to compose an answer, it needs a source (a database of text to learn from). And this is where something truly interesting begins.
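To make the difference between "finding" and "composing" more tangible, here is a minimal sketch in Python. It is a toy, not how GPT actually works: the three-sentence corpus is made up, and the "generator" is a simple word-pair (bigram) model that I swapped in because it fits in a few lines. The function names are my own, for illustration only.

```python
import random
from collections import defaultdict

# Tiny made-up corpus, purely for illustration (not real PubMed data).
CORPUS = [
    "insulin lowers blood sugar",
    "exercise lowers blood pressure",
    "insulin treats diabetes",
]

def keyword_search(query, docs):
    """Search-engine style: return stored documents that share a word with the query."""
    terms = set(query.lower().split())
    return [d for d in docs if terms & set(d.split())]

def train_bigrams(docs):
    """Record, for every word, which words followed it somewhere in the corpus."""
    table = defaultdict(list)
    for d in docs:
        words = d.split()
        for prev, nxt in zip(words, words[1:]):
            table[prev].append(nxt)
    return table

def generate(table, start, max_words=5, seed=0):
    """Generative style: build a sentence one word at a time from learned word pairs."""
    rng = random.Random(seed)
    out = [start]
    # Stop at the length limit or when the last word has no known follower.
    while len(out) < max_words and table.get(out[-1]):
        out.append(rng.choice(table[out[-1]]))
    return " ".join(out)

print(keyword_search("insulin", CORPUS))           # existing documents, returned unchanged
print(generate(train_bigrams(CORPUS), "insulin"))  # a sentence assembled word by word
```

Notice that the generator can assemble a sentence that appears nowhere in the corpus. That is the whole point of a generative model, and also the reason its answers need checking.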

Microsoft is entering the medical field with its BioGPT. BioGPT is a language model developed by Microsoft that has been trained to generate text and search for information in medical literature. The source for this model is PubMed, a massive database of abstracts, studies, and other medical documents.

The key is that the model is designed to work through “natural language” instead of keywords. This greatly simplifies the process of obtaining the necessary data from the database. According to Microsoft, the accuracy of BioGPT’s results is higher not only than that of other AI systems, but also than human performance.

Now you may ask, how was this verified? I will explain.

Source: https://analyticsindiamag.com/microsoft-launches-biogpt-the-chatgpt-of-lifescience/

Built on top of the medical “search engine” PubMed, there is a dataset called PubMedQA. PubMedQA is used to evaluate the quality of machine learning models, such as BioGPT. This is done through a “question answering” task.

When AI answers questions, its responses are compared to those provided by experts in the field of medicine. This comparison is used to evaluate the model’s quality against overall human performance (see Human Performance in the graph).

For example, if the question “How to treat diabetes?” is asked, the model should find articles from the PubMed database containing information on diabetes treatment and provide a reliable answer to the question.
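The comparison against expert answers described above can be sketched in a few lines of Python. This is only my illustration of the idea, not Microsoft's evaluation code: the labels below are invented, and the `accuracy` helper is a hypothetical name. As far as I know, PubMedQA questions are answered with yes, no, or maybe, so I use those as labels.

```python
def accuracy(predictions, expert_labels):
    """Share of answers that match the expert-provided label."""
    if len(predictions) != len(expert_labels):
        raise ValueError("predictions and expert_labels must have the same length")
    if not expert_labels:
        raise ValueError("need at least one labeled question")
    correct = sum(p == e for p, e in zip(predictions, expert_labels))
    return correct / len(expert_labels)

# Invented data for five questions (assumption: yes/no/maybe labels).
expert_labels = ["yes", "no", "maybe", "yes", "no"]
model_answers = ["yes", "no", "yes", "yes", "no"]     # hypothetical model output
human_answers = ["yes", "no", "maybe", "no", "yes"]   # hypothetical single annotator

print(f"model accuracy: {accuracy(model_answers, expert_labels):.2f}")
print(f"human accuracy: {accuracy(human_answers, expert_labels):.2f}")
```

The same score is computed for the model and for a human answering the same questions, which is what makes a "better than a human" comparison possible on the benchmark.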

Overall, after “charging” BioGPT with this knowledge, Microsoft conducted a study and found that its neural network has already surpassed human performance on this benchmark.

So what?

This is a good example of how a neural network could help doctors make accurate diagnoses and choose more appropriate treatment based on individual factors.

Is it good or bad? On the one hand, it’s great. People can be treated more effectively, and everything will be fine for everyone soon. On the other hand, it may not be so straightforward.

We can imagine many variations of the future and not all of them may seem bright. If a person makes a mistake in treatment or diagnosis, it’s a human factor. But if a neural network makes a mistake, how should people respond to it?
