Emily (E): Welcome to another episode of our Innovation Sandbox series, from the Data Analysis Bureau Podcast. The Innovation Sandbox is a collaborative initiative at the confluence of industry and academia, allowing both academic research on real-world datasets and the transfer of cutting-edge machine learning research into industry application.
In this episode, we’re catching up with Kohdai and Hamid, whom we heard from earlier in the series.
Kohdai’s project was about identifying how to predict UTI risks in care homes. He was supervised by Hamid and co-supervised by Eric, who joins us this time as well.
Kohdai’s research followed five main stages.
- Data extraction
- Data preprocessing and analysis
- Baseline modelling
- Main method modelling – and
- Evaluation of models.
First, we’ll hear from Kohdai who’ll talk about the type of data they had access to.
Kohdai (K): So I think when we last spoke we were kind of working with small amounts of data – what ultimately happened was that we needed to kind of use a form of restored database where we would get back all the data from the past and then use that to work with the project and the model – that was the start of March that was finally set in stone and from there onwards we went on with testing what we had already done – you know, with the small database, with our baseline model – would the results be drastically different?– it really wasn’t – and then we were just getting on with the main methods from there onwards – and yeah, that’s how the project went concerning data – acquiring the data and kind of analysing what data to use.
E: So to recap, Kohdai was working with three types of data. First, there were physiological measurements like temperature, blood sugar and pulse rates. Second, there were the infection records themselves. And third, was a rich mixture of behaviours recorded by the care home workers.
K: We call it behavioural actions: so you know simple things like how much food they had, or the most important ones are the amount of urine output, or how much sleep did they get, or did they, you know, cause any trouble – do they have any irregular behaviours – things like that. So that’s the data we used throughout the project.
E: Here’s Eric now to tell us more about why this data is so special.
ERIC: In all of that data the bit that is particularly interesting and unique is this behavioural actions data – the fact that the carers are collecting between 100 and 150 actions about an individual resident per day – these are quite granular actions and also the amplitude implied by the action – so not only the fact that you had a cup of tea but how much of it did you finish? It’s that data that can then be combined with some of our physiological data that is perhaps better understood, you know, clinically, and other projects have tried to apply clinical and physiological data to diagnostic effect.
E: So, on to the data analysis.
K: I think the main thing that we got out of it is that it’s very specific and unique to the kind of care homes environment – a lot of the physiological measurements that you see in normal kind of healthy humans – it’s things like blood pressure levels or I’ve alluded to blood sugar levels – some are normal across all residents in care homes but for some measurements we saw they were, like, high, mostly…
E: Take systolic blood pressure, for example –
K: It should be in general about from 80 to 120 mmHg – what we saw in kind of the data that we had across all the residents was a bit higher than that in the range 100-160 or something like that. We found that as we enlarged the dataset we were able to find more features of the data that showed kind of uniqueness to the care homes domain – and what we got out of this research as a result – for now, it applies only to the care homes domain but obviously you know as we get more and more data from not just care homes but maybe from across other environments maybe we can do an even better job of kind of applying this research and data domain stuff.
E: The data may have been unique, but it was also complex.
ERIC: A feature of the dataset was that it was large in the sense that there were a lot of individual care residents that were reported in it and it is also large because you have the time dimension stretched through quite a long period and you have some people that are still living in it and some people have since passed away that historically are in the dataset.
E: Oh, and this data wasn’t exactly continuous.
ERIC: So you might have a lot of instances where there’s nothing recorded on a particular day for a particular action for a particular person so that poses a challenge in how do you fill that time where there is no recorded observation? How do you deal with that? Do you assume nothing happened? Do you assume the fact that if you have two points where there was an observation and those two things were the same do you assume that in between they are also in that state? So there are issues about how you deal with a dataset that is inherently very large but also very sparse.
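The three options Eric lists for a day with no recorded observation can be made concrete with a small sketch. The function names and the example series are hypothetical, and the episode does not say which strategy the project finally adopted.

```python
def fill_zero(series):
    """Assume 'nothing happened' on days with no record (value 0)."""
    return [0 if v is None else v for v in series]


def fill_forward(series):
    """Carry the last recorded value forward until a new one appears."""
    filled, last = [], None
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def fill_if_bracketed(series):
    """Fill a gap only when the observations on both sides of it agree."""
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        if series[left] == series[right]:
            for i in range(left + 1, right):
                filled[i] = series[left]
    return filled


# Hypothetical daily score for one action and one resident; None = nothing logged.
days = [2, None, None, 2, None, 3]
print(fill_zero(days))          # [2, 0, 0, 2, 0, 3]
print(fill_forward(days))       # [2, 2, 2, 2, 2, 3]
print(fill_if_bracketed(days))  # [2, 2, 2, 2, None, 3]
```

Each choice encodes a different assumption about what silence in the record means, which is why the sparsity Eric describes is a modelling decision and not just a cleaning step.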
E: So let’s get into some figures
K: So we have around 43,000 residents that we have data for and then out of which about 5.5K have a UTI record so that amounts to about 12% – or 12-15% – ish of all residents having a record of a UTI.
E: This means there is a large class imbalance between those who do have a UTI – the positive class – and those who don’t. This could be problematic when it comes to modelling the data.
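As a rough illustration of that imbalance, the arithmetic can be worked through with the approximate figures quoted above. The "balanced" class-weight formula used here is a common heuristic and an assumption on our part; the episode does not say how the project handled the imbalance.

```python
# Approximate figures quoted in the episode.
n_residents = 43_000
n_with_uti = 5_500
n_without_uti = n_residents - n_with_uti

# Share of residents with at least one UTI record.
prevalence = n_with_uti / n_residents

# A model that always answers "no UTI" would already look accurate.
majority_accuracy = n_without_uti / n_residents

# Common "balanced" heuristic: n_samples / (n_classes * n_in_class).
weight_positive = n_residents / (2 * n_with_uti)
weight_negative = n_residents / (2 * n_without_uti)

print(f"prevalence:        {prevalence:.1%}")
print(f"majority accuracy: {majority_accuracy:.1%}")
print(f"class weights:     positive={weight_positive:.2f}, negative={weight_negative:.2f}")
```

The prevalence comes out at a little under 13%, in line with the 12-15% Kohdai mentions, and the majority-class accuracy of roughly 87% shows why plain accuracy is a misleading yardstick here.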
Let’s move on now to the third and fourth stages of the project – establishing the baseline model and the main model.
ERIC: The baseline method – what we are doing here is we have data that is collected through time and we have observations that are gathered through time – the machine learning models that we were using do not inherently deal with the time dimension – they are taking in static, time-static observations about the present if you like.
E: The baseline model works with data that compress a series of observations made over time into a single observation data point for that series – so just one value for blood oxygenation, for instance, even if there had been daily measurements taken –
K: Yes, so ultimately we planned to have this baseline method and then compare it with this main method called the TIRP space method. The main difference between those two is do we account for temporal relations tying the time axis, the time dimension, between different features and different measurements that are logged? So the baseline method just basically removes all of that and then simply looks at all the measurements that were taken in the past and then predicts whether someone is likely to have a UTI or not.
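A minimal sketch of what such a baseline does to the data, assuming each measurement history is collapsed into a few order-independent summary values. The feature names, the choice of summaries and the readings are all hypothetical; the episode does not list the features the baseline actually used.

```python
from statistics import mean


def baseline_features(history):
    """Flatten {feature: [values over time]} into one static feature row."""
    row = {}
    for name, values in history.items():
        observed = [v for v in values if v is not None]
        row[f"{name}_mean"] = mean(observed) if observed else None
        row[f"{name}_min"] = min(observed) if observed else None
        row[f"{name}_max"] = max(observed) if observed else None
    return row


# Two hypothetical residents with opposite trends in systolic blood pressure.
rising = {"systolic_bp": [100, 120, 140, 160]}
falling = {"systolic_bp": [160, 140, 120, 100]}

# Both histories collapse to the same row, so the direction of travel is lost.
print(baseline_features(rising) == baseline_features(falling))  # True
```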
E: But the main model was trying to capture something much more useful, which was the relationship between these variables through time. –
K: Instead of just looking at each measurement and then predicting the outcomes from those, we kind of assess and extract out the links between, you know, the connections, the temporal relations between each of these features, each of these pairs of measurements. So things like did we see this resident A having a high, a very high value of blood pressure and then all of a sudden this dips to a very low value? Did we see that pattern and that kind of led to a diagnosis with a UTI? So those kinds of patterns with regards to the time axis is what we wanted to extract out of the measurements and then that’s what got fed into the models, whereas with the baseline method we were simply just saying ‘OK – this person had a value of 90% blood O₂ at this time.’
E: Why is this so important? Well, ideally you’d be able to spot the signs of an impending UTI in advance. So the main model was being trained to identify these in the period leading up to a positive test for those residents who did end up with a UTI.
ERIC: So what we’re doing is creating like an artificial time horizon and trying to predict that several days in advance on the basis that if we can do that, that’s going to give more foresight – more lead time to making an early diagnosis which means that you can treat somebody more preventatively.
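One way to picture this artificial time horizon is as a gap between the last day the model is allowed to see and the day of the recorded diagnosis. The sketch below assumes a 3-day horizon and a 14-day observation window purely for illustration; Eric only says "several days", and the dates are invented.

```python
from datetime import date, timedelta


def observation_window(diagnosis_day, horizon_days, window_days):
    """Return (start, end) of the data the model may see; end is exclusive."""
    end = diagnosis_day - timedelta(days=horizon_days)
    start = end - timedelta(days=window_days)
    return start, end


def visible_records(records, start, end):
    """Keep only (day, value) records that fall inside the observation window."""
    return [(day, value) for day, value in records if start <= day < end]


# Hypothetical resident diagnosed on 15 March 2021.
start, end = observation_window(date(2021, 3, 15), horizon_days=3, window_days=14)
print(start, end)  # 2021-02-26 2021-03-12

records = [(date(2021, 3, 1), 98), (date(2021, 3, 13), 91)]
print(visible_records(records, start, end))  # only the 1 March record remains
```

Anything logged inside the horizon is hidden from the model, which is what forces it to predict ahead of the diagnosis instead of recognising it after the fact.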
E: But how do you make sense of this longitudinal data? The answer is TIRPs – which stands for –
K: Time Interval Related Patterns. Time Intervals – it’s basically you know from time A to time B that we saw this observation and we saw another measurement from time C to D – this period, this time interval is what we want to detect different patterns from. So if we saw a certain period of time of high pressure and then we saw a certain period of time it could be overlapping, it could be after, of low blood pressure – this relation between different periods of time is what we want to assess and extract.
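The relation between two such periods can be named with Allen-style interval relations, a common vocabulary for this kind of pattern. The sketch below is a simplified classifier under that assumption; the episode does not say which set of relations the project used, and the day numbers are hypothetical.

```python
def relation(a, b):
    """Name the relation of interval a to interval b.

    Intervals are (start, end) pairs with start < end, and a must start
    no later than b (swap the arguments otherwise).
    """
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start > b_start:
        raise ValueError("pass the earlier-starting interval first")
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if a == b:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finished-by"
    if a_end > b_end:
        return "contains"
    return "overlaps"


high_bp = (1, 5)  # hypothetical: high blood pressure from day 1 to day 5
print(relation(high_bp, (7, 9)))  # before
print(relation(high_bp, (5, 8)))  # meets
print(relation(high_bp, (3, 8)))  # overlaps
print(relation(high_bp, (2, 4)))  # contains
```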
ERIC: Those temporal patterns are then encoded in a symbolic form which can then be ingested by the model, and which inherently contains information about the time pattern as well as the observation. That is why it has this great advantage over simply, you know, taking a period of time and, for instance, averaging the observations over it – the point is that when you average you lose the information about the time component within the data, and it’s actually the time component which is very powerful in understanding how a person behaves and how their state changes through time, and that’s what’s important in correlating to the target you’re trying to detect.
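A sketch of what a symbolic encoding might look like, assuming raw readings are first mapped to named states and consecutive days in the same state are merged into one interval. The thresholds and readings are hypothetical and are not clinical guidance, and the project's actual encoding is not described in the episode.

```python
def to_state(value, low, high):
    """Map a raw reading to a symbolic state using two cut-offs."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


def state_intervals(readings, low, high):
    """Merge consecutive days in the same state into (state, first_day, last_day)."""
    intervals = []
    for day, value in enumerate(readings):
        state = to_state(value, low, high)
        if intervals and intervals[-1][0] == state:
            previous_state, first_day, _ = intervals[-1]
            intervals[-1] = (previous_state, first_day, day)
        else:
            intervals.append((state, day, day))
    return intervals


# Hypothetical daily systolic readings with a sudden dip on day 3.
readings = [150, 155, 152, 95, 92, 120]
print(state_intervals(readings, low=100, high=140))
# [('high', 0, 2), ('low', 3, 4), ('normal', 5, 5)]
```

Unlike an average, this representation keeps the order of events: the "high, then low" sequence Kohdai describes is still visible after encoding.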
E: Even when you find a way to represent longitudinal data in time interval related patterns, though, the problem is further complicated by the data itself being pretty noisy. But this is precisely the problem that experienced carers are able to solve – and which the model is trying to replicate and generalize.
ERIC: So if you already have a bad cough, it’s difficult to know whether the fact you’re coughing is because you normally cough or whether there is something that’s making the coughing worse. So we know anecdotally that the way that the experienced carers are able to figure out that there is something wrong with the care home resident is that they know the care resident very well and they perceive changes in behaviour and changes in symptoms – if you like, physiological states. So the idea here was that we had an analytical method for replicating that detection of changes in state and patterns of behaviour that we could build into a machine learning profile.
E: So let’s go into some more detail.
ERIC: I think a simple way of thinking about it is that what the machine learning model is doing is learning to associate – think of it as a set of barcodes that are associated with having a UTI – they’re characteristic of having a UTI – these temporal patterns you can think of as black stripes on a white background and, depending on what the exact pattern of those black stripes is, some of those can be characteristic of having a UTI and then there will be a whole bunch of other ones that are unspecific – they’re patterns, they exist and they correlate to other things but they’re not associated with having a UTI – and that’s what the machine learning model is learning to do: to differentiate between those patterns that are characteristic of having a UTI and those that are not.
E: Moving onto the results. Probably the biggest concerns with the results were the number of false positives that the model predicted. Here’s Kohdai with the figures.
K: We have about 8,500 data points, of which about 8,300 have a true label of negative and the remaining 200 or so are positively labelled. So for the 8,300 negatively labelled windows, we had about 6,000 false positives – so that’s a little less than 75%-ish – that’s a very high false-positive rate –
ERIC: – So the bad side of a false positive is that effectively you’re crying wolf – like you’re saying somebody’s got an issue when they don’t, and therefore people no longer trust the system and suchlike, so you need to bring that false positive rate down – you might not want to be treating people unnecessarily and so on. But on balance it’s better to make false positives than false negatives, but for the moment the false positive rate is too large for us to be comfortable with it, and so it needs to have some investigation into why those false positives are occurring – are they you know linked to particular genders, by age, by certain TIRPs and so on –
K: – at the same time, though, as we mentioned out of the 300 or so positively labelled windows we were able to predict about 250 or so which are truly positive, so that’s about 83-ish% – so that’s also a good sign, but as Eric mentioned we need to find the reason behind this high false-positive rate.
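The rates Kohdai quotes can be reproduced from the approximate counts. The positive count is given loosely in the episode (both "200 or so" and "300 or so"); 300 is used here because it matches the 83% figure. Precision is not quoted in the episode and is derived below only as an illustration of what these counts imply.

```python
def false_positive_rate(false_positives, negatives_total):
    """Share of truly negative windows that were flagged as positive."""
    return false_positives / negatives_total


def recall(true_positives, positives_total):
    """Share of truly positive windows that the model caught."""
    return true_positives / positives_total


def precision(true_positives, false_positives):
    """Share of flagged windows that were truly positive."""
    return true_positives / (true_positives + false_positives)


# Approximate counts as quoted in the episode.
negatives_total = 8_300
false_positives = 6_000
positives_total = 300
true_positives = 250

print(f"false positive rate: {false_positive_rate(false_positives, negatives_total):.1%}")
print(f"recall:              {recall(true_positives, positives_total):.1%}")
print(f"precision:           {precision(true_positives, false_positives):.1%}")
```

On these numbers the false positive rate is about 72% and recall about 83%, while only around 4% of the alerts would correspond to a real UTI, which is the "crying wolf" problem Eric describes.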
E: So the good news is that they barely miss anyone with a UTI, but the downside is that the model is overenthusiastic with predicting the presence of one.
So now that the project is over, we wanted to find out how Kohdai, Hamid and Eric felt it went – and what will happen to this work next.
K: Overall it’s given me a kind of insight into working in the medical field – working with medical data, healthcare data – and I feel that I was able to establish a base for future research to come for UTI diagnosis or other infections in care homes. I wanted to do a lot more with regards to the project – it was unfortunate – because just deep-diving into the formalities of my degree – mine’s incorporating a joint degree with maths and much less time is allocated to the project and I wish I’d had more time.
H: It was great working with Kohdai. I was also working with the data in a slightly different way so I had familiarity with the data so I could help Kohdai especially there – but I learned a lot from him as well, I think – we did some great work and it’s set up quite well to, you know, pick up what he’s done and apply it elsewhere – it’s a very valuable project.
ERIC: So Kohdai should be very proud of what he’s done – he’s done an enormous amount of work – and as with all research, you know, you’re always slightly frustrated in that you could always have done some more and that’s the nature of research, right – there’s always yet another question. This piece of research will go forwards but like all research, there will be a short period of assessment and reflection so that we can then prepare the next steps as carefully as possible. And as for Hamid, I think he’s done a fantastic job supervising Kohdai – I know that he’s come from Oxford, of course, so he’s familiar with that kind of mode of supervision and tutoring. He’s done a great job and together they’ve achieved a huge amount. And Hamid will be the guardian of this until its next iteration so I think it’s an exciting piece of work, it aligns very well with, you know, what we know from the academic literature. It also has some parallels with some other work that was done using similar methods for predicting falls in elderly residents, so as a body of work, you know, it contributes more widely and that’s been exciting.
E: That was Kohdai, Hamid and Eric from T-DAB.AI speaking about their project to identify the means of early detection of UTI in care homes.