Misuse or missed use of data?

Considerations for a Public Health Approach

Lewis Prescott-Mayling
Thames Valley Violence Reduction Unit
Thames Valley Police

I remember my early days as a police officer. Before each shift, an officer would come in early to set the printer spitting out reams of paper. Once the team were settled at the briefing room table, they would work through the printed stack: every single incident in the area since we were last on duty. It was a manageable volume of data. We would use the information to target our efforts where we believed we could have the greatest impact. Ultimately, we were using our clinical judgement of the data to forecast the best places to patrol. With the increase in calls, crimes and data, the idea of running this style of “briefing” today is laughable.

Those early days now seem crude, but today there is an unmanageably large volume of data, and that creates risks as well as opportunities. When people come to harm, questions are legitimately asked. Were opportunities missed? Could this have been prevented? The single most cited issue in serious case reviews is that information was not shared between agencies. But there is a converse problem: one study found that notifications of vulnerabilities from one agency to another have increased dramatically, yet nearly 90% result in no action, despite over a third being repeat notifications. There is therefore potentially a large group in need of support who are not receiving assistance. That is a lot of opportunities missed. Or are these referrals false positives, creating noise and making it difficult to see the real threat? Statutory partners need to work together to make sense of the collective data they hold, and here is why.

People are affected by heterogeneous harm events (e.g. crimes, neglect), and statutory agencies are required by law to reduce the likelihood of these events occurring. Often the identification of the individuals or groups most likely to come to harm is done in isolation by each agency. However, harmful events interact, as do the agencies’ responses, and in trying to maximise isolated parameters are we paying enough attention to the potential side effects of our actions, particularly on another agency’s goals? This is seen in the ever-growing evidence regarding Adverse Childhood Experiences, or ACEs (stressful events occurring before a person is 18 years old). For example, parental drug abuse, domestic abuse and parental incarceration affect the parent themselves and are recognised ACEs affecting the child. They can have a significant negative impact on a child’s life, including early mortality. But fully understanding who is at risk may require greater data sharing. One agency may be aware that an individual has been incarcerated, another of the domestic abuse, another that they are a parent, and another that their partner has lost their job. No single agency can see the potential risk unless data is shared. The issue is that agencies may be unwilling to share via the established pathways unless they believe the risk is already great enough, based on the data they alone hold, to justify it. Had they had the whole picture, they may have made a different decision. This is a catch-22.

Finite resources are being deployed where harm has already occurred or when individuals are already at the point of crisis. By then the problem is obvious to everyone, so data is shared, but the problems are also entrenched. Early intervention provides support when issues are easier to resolve, and many studies have shown this can make a substantial difference. There is a further issue with measuring what works in tackling acute problems: what looks like a successful intervention may actually be regression to the mean after a crisis. It means we often confuse correlation with causation and mistake outputs (e.g. reduced crime) for outcomes (e.g. an individual’s needs being met).
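The regression-to-the-mean trap can be illustrated with a short simulation (all numbers here are hypothetical, not drawn from any real service data): if we select people at their worst observed moment, their scores tend to fall back towards the average afterwards even with no intervention at all.

```python
import random

random.seed(42)

# Hypothetical model: each person has a stable underlying level of risk,
# and each measurement of that risk carries independent noise.
N = 10_000
true_risk = [random.gauss(50, 10) for _ in range(N)]
t1 = [r + random.gauss(0, 10) for r in true_risk]  # score when selected "in crisis"
t2 = [r + random.gauss(0, 10) for r in true_risk]  # score measured again later

# Select the "crisis" group: the top 5% of observed scores at time 1.
cutoff = sorted(t1)[int(0.95 * N)]
crisis = [i for i in range(N) if t1[i] >= cutoff]

mean_t1 = sum(t1[i] for i in crisis) / len(crisis)
mean_t2 = sum(t2[i] for i in crisis) / len(crisis)
print(f"crisis group at selection: {mean_t1:.1f}")
print(f"same group later, with no intervention: {mean_t2:.1f}")
# The later average falls back towards the population mean of 50,
# which can masquerade as the effect of whatever we did in between.
```

The point is not that interventions never work, but that evaluating them only on people selected at a crisis peak will flatter almost any response.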

Another way to think about this problem is to consider using data to identify risk factors for future harm. Risk factors such as ACEs are not deterministic. But by understanding the data we can gain insights into where, and to whom, harm is more likely to occur; then we can use data for social good, to support not stigmatise, for growth not control. The aim is not to predict binary outcomes but to make probabilistic forecasts. Only by taking such an approach can we hope to prevent harm. This is a public health approach to data: like a disease, crime is more rationally treated by prevention than by cure. In health care we are accustomed to looking for risk factors for a disease and acting with individuals or groups to prevent the harm ever occurring.

However, human behaviour is complex. When looking for risk factors for criminal harm, heart disease is a useful analogy. Whilst an ideal aim would be to trace a group of symptoms to a single cause, this is not possible: heart disease, like crime, has many pathways. And just as there are different sub-types of heart disease, there are different types of crime, which may have different predictors. Many of these predictors will also be common in those who do not commit crime. This is where data analysis, mirroring techniques used in health, is required. We could not only identify those at risk and support them but also have measures of population health: how much domestic abuse are children in a city witnessing, how many are living in areas of multiple deprivation, with bullying, or with high pollution? Given that many of these factors correlate with poor health as well as with poor social outcomes such as crime, we could problem-solve the causes of the causes and create cross-partnership joint strategic aims.

Ultimately, of course, this is what we were doing in the early days of my career: forecasting, using our own clinical judgement of where to patrol. We’ve come a long way, but we can do more. There are justifiable concerns about such an approach: about biased data being used to design systems, about ‘black boxes’ (where decisions cannot be explained), about feedback loops (which could automate inequality and perpetuate or amplify bias) and, of course, about privacy. These concerns are real and justified, and there are many more considerations than we have covered here. Rigorous safeguards will be crucial, but without making better use of data we will miss opportunities and continue to see the same patterns in serious case reviews. The needle of useful information is buried in too large a digital haystack.


This blog aims to reflect the opinions, thoughts, and concerns of academics and researchers related to COVID-19. All views belong to the authors and do not represent the views of any organisation.