Precision Neuroscience Reimagined: A Perspective on AI

Everywhere we turn, we hear about AI: the benefits, the pitfalls, and how we can manage its use effectively without it damaging our society. To help us understand more about AI and what it has to offer the healthcare research community, we caught up with Yuxing Fang, who found himself drawn into the world of AI science.

Tina Marshall: Hello, and welcome to Precision Neuroscience Reimagined. My name is Tina Marshall, and today I’m joined by Yuxing Fang, who is actually heading up an AI Lab at Akrivia. So today we’re going to be spending a little bit of time digging into the work that Yuxing does in supporting mental health and dementia. So hello, Yuxing. Thank you very much for joining me today. I really appreciate it.

Could you just take a moment to tell me a little bit about yourself?

Yuxing Fang: I’m Yuxing Fang. I did my psychology Ph.D. at Beijing University, and then two postdocs. The first was rather dry, all about mathematics and neuroscience. The second was at the University of Cambridge, where we tried to understand how humans process language, how AI models process language, and whether we can tell the difference between AI models and the human brain.

Tina Marshall: So, understanding how humans process language, and then how AI models process language, and then tying them together. And that’s the work that you did?

Yuxing Fang: So, the original plan was to find some overlap between these AI models and the human brain. But after four years, unfortunately, we didn’t see much overlap. So, this raises a very good question.

Why are these AI models so powerful? Is it because they are like the human brain?

I have to say we don’t have much evidence for that. Basically, the AI models work in a very different way, yet they achieve very good performance.

Tina Marshall: Okay, that’s interesting. No overlap between the human brain and the AI models, but the AI models can still mimic the language of the human brain.

Yuxing Fang: Yes. And they do it based on something very different, and we are not very clear why.

Tina Marshall: Okay, that is really interesting, because AI is such a huge thing at the moment, and you’re constantly seeing lots of new models and AI tools being spun out and brought to market.

For you, when we’re talking about the human brain and how it processes language, what’s the connection between that and NLP research?

Yuxing Fang: Honestly, not much. Surprisingly, that’s how I became an AI scientist. The original plan, and I know many people like to pursue this, was what we call brain-like models: borrow ideas from human brain research, put them into AI models, and hope the model becomes more human-like and performs better. But having spent a long time on this, as I said, AI models and the human brain look like separate lines of work, and we don’t have-

Tina Marshall: They don’t correlate.

Yuxing Fang: Not much. Maybe in the future people will find ways, for example combining approaches, or, at a very fundamental level, using AI models to mimic human neurons, to make the models better. But currently, at this point, I have to say there’s not much interaction.

How does that work, if there’s no correlation between the human brain and the language it uses, and the AI model and the language it uses? Could you explain more about that? How can the AI model replicate what the human brain would say?

Yuxing Fang: This is a very good question, so let me talk about how we train a model. I think training a model is a lot like teaching a baby to speak, for example, Chinese or English. How do you teach a baby to speak? You give them examples, some sentences, and tell the baby whether what they say is correct or not. The baby also gives you feedback: they say something, you tell them, “No, this is not right,” and then the baby knows it’s incorrect and has to change something in their brain. Training a model follows much the same procedure. We give the model lots of examples, the model learns the rules from those examples, and it also receives feedback on whether its answers are correct or not. Every time the model gives a correct label, we say, “This is very good,” and it keeps its parameters. But if we tell the model it’s wrong, then it-

Tina Marshall: It will change the parameters.

Yuxing Fang: It will change the parameters, and we do this many, many times, millions of times, for example. The model corrects itself and learns to generate the right sentences. And the interesting thing is, we are still not very clear about how the model manages this. We are certain about the mathematics, about how the parameters change; humans define the set of rules for updating them. But what do these parameters mean? What pattern has the model actually learned? We don’t know. So, back to your question: although I can explain how to train a model, we are still not very clear what it learns.
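To make the example-and-feedback loop Yuxing describes concrete, here is a minimal sketch in Python using PyTorch. The toy data, tiny linear model, and labels are all invented for illustration; this shows the general shape of supervised training, not Akrivia’s actual models.

```python
# A toy version of the training loop described above: show the model
# examples, compare its guesses to the feedback, and adjust the parameters.
import torch
import torch.nn as nn

# Invented "examples": tiny bag-of-words vectors with correct/incorrect labels.
X = torch.tensor([[1., 0., 1.], [0., 1., 0.], [1., 1., 0.], [0., 0., 1.]])
y = torch.tensor([1., 0., 1., 0.])  # the human feedback: 1 = correct label

model = nn.Linear(3, 1)                        # the parameters to be learned
loss_fn = nn.BCEWithLogitsLoss()               # how wrong was the guess?
optimiser = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(1000):                       # "many, many times"
    optimiser.zero_grad()
    prediction = model(X).squeeze(1)           # the model makes its guess
    loss = loss_fn(prediction, y)              # compare with the feedback
    loss.backward()                            # work out how to change things
    optimiser.step()                           # change the parameters
```

What the loop never tells us, as Yuxing says, is what the learned parameter values mean; only the update rule is known.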

Tina Marshall: It’s about how it takes information in and how it’s trained. So we understand how we train the model, but we don’t understand how the model learns. And with so many different AI models being spun out, it’s striking to realise that we don’t actually know how they learn. So, do we know how long they take to learn?

Yuxing Fang: It depends on the training time. Of course, you can train a model for a long time, hours and hours, but that doesn’t always give you better performance. It really depends on the training set and on what we call the scale of the model, basically how many parameters it has. A very big model, for example OpenAI’s ChatGPT, with over a hundred billion parameters, of course takes a very long time to train. A small model can be quick. But I have to say, training the model is just one part of building a good NLP model. We actually spend more time preparing the examples, as I said: finding good examples and giving feedback on whether the output is correct or not. That takes up a lot of the time.

Tina Marshall: So, if we’re looking at training an AI model for mental health and dementia, typically how long does it take, how do you judge whether it’s good enough, and how do you decide which one to pursue first? Because when we talk about lots of parameters, in mental health and dementia we know there are many variables; it’s not as fixed as other conditions or therapy areas might be. We’re constantly looking at the progression of the disease, signs and symptoms, how the patient is feeling, and everything around them.

How are we able to, or how long does it take to, train a model to help with these conditions?

Yuxing Fang: So, typically, at Akrivia for example, we spend about six months, and we can divide those six months into stages. The first question is definitional: what do we mean by dementia? What does this or that term mean? What’s the definition? Clinical doctors and researchers have very different opinions, so the first thing is to make the question more precise, which takes a lot of discussion between clinical researchers and AI scientists. Once we’ve settled that, we need to prepare examples, the training data for the model, which also takes a lot of time; you have to clean the data too. This step needs a lot of human effort, I have to say: humans have to do what we call annotation manually, giving each record a label, whether it indicates dementia or not, whether a sentence contains the clinical features you are interested in.

Once the training data is prepared, we can feed it to the model. Then it’s my job as a scientist to use the training data, tune what we call the hyperparameters, and get the model running. The last step is to validate the model. As I said, it’s like training a baby, and we don’t know how much it has learned, so we have to set it an exam: we let the model generate a lot of output and, again, have humans check whether it’s correct or not, which gives you an accuracy figure. Put all these steps together and it takes several months.
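As a rough illustration of the annotate, train, and validate stages Yuxing outlines, here is a minimal sketch using scikit-learn. The clinical sentences, labels, and model choice are invented for illustration only and are not Akrivia’s pipeline.

```python
# Toy version of the pipeline: annotated examples -> training -> validation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Manually annotated examples (label 1 = mentions dementia). Invented data.
sentences = [
    "Patient shows progressive memory loss consistent with dementia.",
    "Attended appointment; no cognitive concerns reported.",
    "Family report confusion and disorientation worsening over months.",
    "Routine medication review, no changes made.",
]
labels = [1, 0, 1, 0]

# Hold back some annotated data as the "exam" used for validation.
X_train, X_test, y_train, y_test = train_test_split(
    sentences, labels, test_size=0.5, random_state=0, stratify=labels)

# Train: hyperparameters such as C would be tuned at this stage.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(C=1.0))
model.fit(X_train, y_train)

# Validate: compare the model's answers with the human labels.
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```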

Is there any way to speed up that process?

Yuxing Fang: So, I think we spend most of that time preparing the training data. For example, if training the model needs 10,000 sentences, we have to manually check each one and give it a label.

Tina Marshall: Sorry. You need 10,000 sentences every time you train a model?

Yuxing Fang: It depends on how complex the concept is.

Can you tell me how NLP relates to AI?

Yuxing Fang: So, NLP, as the name says, means natural language processing: any model that can process real language, like English or Chinese, can be seen as an NLP model. AI is harder to define; many people have different definitions, but there is overlap between the two. At Akrivia, when we say AI and NLP models, we mean advanced NLP models built on neural networks. These models are not just keyword-searching, rule-based models; they can capture context.

Tina Marshall: And it’s understanding the context?

Yuxing Fang: Understanding the meaning, perhaps even the way humans do.

Tina Marshall: And so, this is where the difference from a keyword search comes in. One of the questions I’m asked a lot is:

How is this different to a keyword search?

Yuxing Fang: So, for example, a patient says, “Okay, I’ve taken these medications many times, and I’ve had to spend half an hour talking to you.” Then he suddenly says, “Oh, that’s not true, I was just lying” or something like that. A keyword model just picks up keywords, like a medication name, from the text. But by the end of the paragraph you know it isn’t true. Only when you read it in context do you know that the first sentence is-

Tina Marshall: Understanding the situation behind what’s been written down.

Yuxing Fang: Yeah, maybe I can give a simpler sentence. For example, “I haven’t taken this medication.” The patient has not actually used the medication, which means we shouldn’t extract the medication name, but a keyword search would still match it.
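A minimal sketch of the difference being described here: a bare keyword search fires on a negated mention of a medication, while even a crude context check does not. The negation list and matching logic below are a toy stand-in for the neural NLP models Yuxing describes, invented purely for illustration.

```python
# Keyword search vs. a toy context-aware check for negated medication mentions.
import re

NEGATION_CUES = ("haven't", "have not", "didn't", "did not", "never")

def keyword_match(text: str, term: str) -> bool:
    """Naive keyword search: fires on any mention of the term."""
    return re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE) is not None

def context_aware_match(text: str, term: str) -> bool:
    """Toy context check: ignore mentions preceded by a negation cue."""
    if not keyword_match(text, term):
        return False
    preceding_text = text.lower().split(term.lower())[0]
    return not any(cue in preceding_text for cue in NEGATION_CUES)

note = "I haven't taken this medication (sertraline)."
print(keyword_match(note, "sertraline"))        # True: keyword search fires
print(context_aware_match(note, "sertraline"))  # False: the negation is seen
```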

Tina Marshall: It will pull that out. But with the context of “I haven’t taken the medication,” it can be excluded, so it’s not counted in what everybody’s looking for. So, you’ve joined the organisation, and we’ve launched an AI Lab function. Can you tell me about that? Because this is very exciting, I think, for everybody listening or watching who has an interest in AI, which I think pretty much everybody in the world does at the moment.

How can the AI Lab really help accelerate the use of real-world data to help with conditions such as mental health and dementia?

Yuxing Fang: So, the AI Lab was my idea. I could see this is a very good company at responding to customers’ requirements. One day I asked Ben, my line manager and head of research, “What more can we do? Can we create products that customers have never seen before?” In this AI field, I know many people build NLP models to extract concepts and so on. But can we do something beyond that, based on the structured data? So, basically, that’s the idea: the AI Lab will focus on innovative applications. For example, in one application we try to understand what’s behind these models when they give an output, say, when a model predicts whether a patient has dementia or not.

Tina Marshall: I’m always asked about bias with regard to AI and the NLP that we use to extract data.

How can we reduce bias? How can we ensure that our AI models reduce bias?

Yuxing Fang: Yes. So, the first and, I think, most important thing, as I said, is the training data. The AI model mirrors, simply reflects, what you put into it. If the training data about a certain disease is skewed, the model will learn that problem. So the most important thing is to prepare unbiased training data and remove the bias at the very beginning. Another way is to use human feedback to retrain or fine-tune the model. I’ll give an example, like OpenAI: they hire lots of people to rank the outputs of the model, to say whether an output is biased or not and give it a score. The model can then learn from these scores: if an output is rated as biased, the parameters are adjusted. After many rounds, the model will give you relatively good output.
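As a loose sketch of the ranking idea Yuxing attributes to OpenAI, here is a toy reward model that learns to predict human scores; real human-feedback fine-tuning is far more involved, and everything here (the feature vectors, the scores, the model) is invented for illustration.

```python
# Toy reward model: learn to predict human bias scores for model outputs.
import torch
import torch.nn as nn

torch.manual_seed(0)
outputs = torch.randn(8, 16)     # pretend embeddings of model outputs
human_scores = torch.rand(8)     # pretend human ratings (higher = less biased)

reward_model = nn.Linear(16, 1)  # learns to imitate the human raters
optimiser = torch.optim.Adam(reward_model.parameters(), lr=1e-2)

for _ in range(200):
    optimiser.zero_grad()
    predicted = reward_model(outputs).squeeze(1)
    loss = nn.functional.mse_loss(predicted, human_scores)
    loss.backward()
    optimiser.step()

# The learned reward model can then score new outputs, and the language
# model's parameters are nudged towards outputs that score well.
```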

Tina Marshall: Okay. Thank you.

Why is it important to explain how the AI model makes decisions? I mean, even just for the AI model to understand whether it has got something right or wrong, how does that happen?

Yuxing Fang: I care about this because I don’t want a doctor who uses a model to make diagnostic decisions to just tell the patient, “The probability of you becoming an AD patient is 19%,” without explaining why. That would be very strange. But the problem is that deep neural networks are what we call black boxes, which means we don’t know what happens inside. This goes back to the very first question: we don’t know what these models have learned. To understand what happens within these models, we need additional tools, so we’re trying to develop techniques to make the models more transparent. This is also very important for clinical researchers or pharmacists, because we want to know the critical points within these models: for example, which clinical features are important, and whether we can find markers, or something we can attack. To understand all of this, we need to understand what happens within the model.
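One widely used transparency technique that matches the goal Yuxing describes, finding out which clinical features matter, is permutation importance: shuffle one input feature at a time and measure how much the model’s performance drops. The feature names and data below are invented for illustration.

```python
# Permutation importance: which (hypothetical) clinical features drive the model?
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
feature_names = ["memory_score", "age", "medication_count"]  # hypothetical
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.2 * rng.normal(size=200) > 0).astype(int)   # driven by feature 0

model = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

for name, importance in zip(feature_names, result.importances_mean):
    print(f"{name}: {importance:.3f}")  # memory_score should dominate
```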

Tina Marshall: This is really interesting, Yuxing.

How do we dispel any negative connotations around AI?

Yuxing Fang: Yeah, I’m very negative about the future of AI.

Tina Marshall: Oh yeah? Explain that.

Yuxing Fang: The first thing is, I think AI will definitely, in the future, become a competitor to humans, and maybe take over the world or something. I have to say this AI stuff is very dangerous, and as an AI scientist maybe I’m not supposed to say this, especially after this ChatGPT business. ChatGPT is still a language model; it just tries to predict the next sentence. But we can see that this model can do generic tasks. That’s why it’s quite dangerous: we didn’t teach the model how to calculate things, for example, yet the model can do it.

So I would say the critical point, the decision point, is when the model can train itself, can self-improve, and we are very close to that point. If we reach it, the model can become more and more powerful and finally go out of control. And I’d say no one can stop that process once the model gains these abilities. For example, OpenAI and Microsoft always say, “At some point we can slow down the training, we can slow down the models,” but if the model can train itself using the networks and training facilities humans have built, how do you stop that?

Tina Marshall: Do you think that could be part of the technology that we’ll learn, though? I mean, if you think back 10 years, could we have imagined that AI could do everything it does now? So could it be that, as you develop the models, you also develop a fail-safe within them?

Yuxing Fang: I don’t know. I have to say this kind of research lags far behind the performance of the models themselves. For example, OpenAI released GPT-3, the base model behind ChatGPT, two or three years ago, and even within this past month they have already iterated on the model two or three times. It moves so fast, and not many people are focusing on how to slow these AI models down or how to contain them; far more people are working to increase the power of AI.

Tina Marshall: Negative is one way of looking at it. The way I would look at it is that there’s a difference between negativity and responsibility, and you are taking responsibility for this, for how the AI model that you are developing evolves.

Yuxing Fang: The models I develop are just... okay, let me first set out some definitions of AI. People usually distinguish strong AI and weak AI. Weak AI handles specific tasks, like the NLP models we develop that just extract concepts from sentences. We don’t think that’s dangerous, not very dangerous; maybe some issues with information governance, but that’s all. We don’t believe such a model can become a robot or something. No way. But strong AI tries to mimic humans, everything about human beings. The most dangerous aspect is something like consciousness. When people talk about consciousness in AI, we’re not at that point yet, but in future-

Tina Marshall: Sorry, when they talk about what?

Yuxing Fang: The model having consciousness. You can see it becomes very interesting.

And the problem is, previously these were very different tracks and we didn’t see any crossover. Weak AI was just things like NLP models that extract concepts, as I said, while the people working on general AI, the more powerful AI, used very different techniques. But the problem is ChatGPT. OpenAI released its model, and ChatGPT can be seen as a kind of weak AI, because it’s just a language model that predicts sentences. But the way it forms those sentences isn’t just prediction; it’s something really like what a human does. It shows a kind of thinking, it can do decision tasks, even something like ToM, theory of mind. That’s a very high-level human ability, understanding what another person is feeling, and the model can even do this. So we can see the crossover; the two have mixed together. And then it’s quite difficult to stop this dangerous research, because I can say, “Okay, I’m just developing a language model, yes?”

I can pass all the governance checks or whatever: it’s just a language model, just predicting sentences. But as you can see, this kind of model can be dangerous.

Tina Marshall: Yuxing, could you help me understand a little bit more about digital twins? They’re something that comes up in the conversations I have, and I think it’d be great if we could understand more about exactly what they are and what they mean.

Yuxing Fang: On my first day at this company, I asked a very simple question: can we sell data to other companies? And the answer is definitely no, because it contains a lot of sensitive information and the law doesn’t allow us to do that. Of course, it would be a disaster. So the next question is: can we generate synthetic data, fake data, that still contains the things clinical researchers are interested in within the text? It’s synthetic, still fake data, with no sensitive material. It’s purely fake.

Tina Marshall: So, help me understand again the difference between the synthetic data and, I guess, normal patient data. We’re saying we definitely cannot give normal patient data to anybody, we can’t do that because it’s identifiable, but the synthesised data, what will it contain? Obviously it won’t include any personally identifiable information, but will it include everything else?

Yuxing Fang: Yeah, maybe I can give you an example of how we generate this synthetic data. Basically, we have real data, and the model learns the principles within it: it builds a kind of matrix and learns the relations between medications and symptoms and so on. Then it uses those relations to regenerate synthetic data in the form of free text. And you can see that, because what is extracted from the real patient data is just the relations between medications and clinical features like symptoms and experiences, there is no identifiable information. The generation is based on those extractions, so you will not see anything about addresses or names, because they are not included in this intermediate state.
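A minimal sketch of the two-stage idea just described: extract only the medication-symptom relations from the real notes into an intermediate state, then regenerate free text from that state alone. All the notes, terms, and templates here are invented; a real system would use far richer generative models.

```python
# Stage 1: reduce real notes to a relation table (no names or addresses
# ever enter this intermediate state). Stage 2: regenerate synthetic text.
import random
from collections import Counter

# Invented medication-symptom pairs standing in for extractions from real data.
extracted_relations = Counter([
    ("donepezil", "memory loss"),
    ("donepezil", "confusion"),
    ("sertraline", "low mood"),
])

def synthetic_note() -> str:
    """Generate a fake free-text note from the relation table alone."""
    (medication, symptom), _ = random.choice(list(extracted_relations.items()))
    return f"Patient reports {symptom}; currently prescribed {medication}."

random.seed(0)
print(synthetic_note())  # e.g. "Patient reports confusion; currently prescribed donepezil."
```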

Tina Marshall: And would the synthetic data include information such as... so it won’t include patient-identifiable information, but could it include information on the patient journey?

Yuxing Fang: It depends on how you train the models. We can, for example, control which kinds of information the models extract, so what ends up in the final synthetic data is something you can adjust.

Tina Marshall: Okay, that sounds very interesting, and I actually already know a couple of people who’d be interested in that.

Yuxing Fang: Yeah, I think this is a very good opportunity. We can send this synthetic data to any company or researcher. If they find something interesting, for example a relation between a medication and certain symptoms or adverse effects, they can send those keywords or terms, or the model they want to test, back to us. Then we run it against the real data and send back just the answers. And we don’t have information governance issues, because they only sent us models, or even just hypotheses.

Tina Marshall: So, Yuxing, thank you. Thank you very much for speaking to me today. I find all of the work you do fascinating, and I’m really keen to see how it goes and where it’s going. I can just see the AI Lab function growing and being able to benefit patients hugely. So, thank you very much for everything that you’re doing there.

If you enjoyed this episode, then please do Like, Follow, and Tag. We’re on LinkedIn and Twitter. We have very exciting episodes coming up for Precision Neuroscience Reimagined, so I hope you can join us.

You can find the latest episode and more here: https://spotifyanchor-web.app.link/e/2kd9iy8XRzb or you can find all of the episodes on YouTube: https://youtu.be/FmD1anm11YE
