When I first took on the role of CEO at The Alan Turing Institute, the strapline beneath the title was ‘The National Institute for Data Science’. A year or so later, this became ‘The National Institute for Data Science and AI’ – at a time when there was a mini debate about whether there should be a separate ‘national institute for AI’. It has always seemed to me that ‘AI’ was included in ‘data science’ – or maybe vice versa. In the early ‘data science’ days, for example, there were plenty of researchers in Turing focused on machine learning. Nonetheless, we acquired the new title – ‘for avoidance of doubt’ one might say – and it now seems worthwhile to unpick the meanings of these terms. However we define them, there will be overlaps, but by making the attempt we can gain some new insights.
AI has a long history, with well-known ‘summers’ and ‘winters’. Data science is newer and has grown out of the increases in data that have become available (partly generated by the Internet of Things), closely linked with continuing increases in computing power. For example, in my own field of urban modelling, where we need location data and flow data for model calibration, the advent of mobile phones means that there is now a data source that locates most of us at any time – even when phones are switched off. In principle, this means that we could have data that would facilitate real-time model calibration. New data, ‘big data’, is certainly transforming virtually all disciplines, industries and public services.
Not surprisingly, most universities now have data science (or data analytics) centres or institutes – real or virtual. Data science has certainly been the fashion, but it may now be overtaken by ‘AI’ in that respect. In Turing, our ‘Data science for science’ theme has now transmogrified into ‘AI for science’ as the more all-embracing label. So there may now be some more renaming!
Let’s start the unpicking. ‘Big data’ has certainly invigorated statistics. And indeed, the importance of machine learning within data science is a crucial dimension – particularly in the form of clustering algorithms, with obvious implications for targeted marketing (and electioneering!). Machine learning is sometimes called ‘statistics reinvented’! The best guide to AI and its relationship to data science that I have found is Michael Jordan’s blog piece ‘Artificial intelligence – the revolution hasn’t happened yet’ – googling the title takes you straight there. He notes that historically AI stems from what he calls ‘human-imitative’ AI, whereas now it mostly refers to the applications of machine learning – ‘engineering’ rather than mimicking human thinking. As this has had huge successes in the business world and beyond, ‘it has come to be called data science’ – which is closer to my own interpretation of data science, but which, as noted, fashion now relabels as AI.
We are a long way from machines that think and reason like humans. But what we have is very powerful. Much of this augments human intelligence, and thus, following Jordan, we can reverse the acronym: ‘IA’ is ‘intelligence augmentation’ – which is exactly where the Turing Institute’s work on rapid and precise machine-learning-led medical diagnosis sits, with the researchers working hand in hand with clinicians. Jordan also adds another acronym: ‘II’ – ‘intelligent infrastructure’. ‘Such infrastructure is beginning to make its appearance in domains such as transportation, medicine, commerce and finance, with vast implications for individual humans and societies.’ This is a bigger-scale concept than my own notion that the design of (real-time) information systems is an under-developed field of research.
This framework, for me, provides a good articulation of what AI means now – IA and II. However, fashion and common usage will demand that we stick to AI! And it will be a matter of personal choice whether we continue to distinguish data science within this!