Bulletin October 2021 Number 196

An innovative project, led by The University of Manchester in collaboration with the National School of Healthcare Science, seeks to create a new educational programme in clinical data science with the health and social care workforce.

The prevalence and utility of health data have never been as evident or as critical as over the last 18 months. Building a healthcare workforce with a strong understanding of, and skill set in, data science will enable the full potential of this data to be harnessed for the best possible clinical decision-making and outcomes for patients.

Health Education England (HEE), through its National School of Healthcare Science (NSHCS), is working with The University of Manchester (UoM) to develop a flexible programme of continuing professional development in clinical data science. This innovative initiative, funded by HEE, is being developed in collaboration with clinical partners at The Christie Hospital and the wider healthcare workforce to support the NHS cancer programme, NHS long-term workforce development plans, the People Plan and the Richards Report.

This educational programme will support the development of data science, statistics, machine learning and programming capabilities. It includes introductory and advanced genomics courses, aimed at the healthcare science workforce and, beyond it, at many other healthcare professions in the NHS, including medicine, nursing, pharmacy and the allied health professions.

Healthcare professionals will be able to take individual modules as well as, if desired, combined modules that could lead to a 60-credit postgraduate qualification in clinical data science. There is a plethora of health data science courses offered by other institutions, but this is the first programme that, to our knowledge, will be co-created with, and specifically designed for, health and social care professionals.

The main objectives of this programme of work are:

  • to develop a nationally agreed curriculum for clinical data science and introductory genomics
  • to develop course content, with flexible delivery methods, which will be tested and approved
  • to develop a secure cloud-based teaching and training environment for national delivery of the curriculum at scale
  • to develop an evaluation framework for determining the impact of the provision on healthcare scientists, the healthcare workforce, and on the wider healthcare system.

Demonstrating the importance of clinical data science

The current COVID-19 pandemic clearly illustrates the importance of developing widespread health data science capability across the workforce. During the pandemic, digital technologies have been harnessed to support symptom reporting, surveillance monitoring and contact tracing. This data is being reported widely and publicly via dashboards and visualisations to scientists, medical professionals and the general public alike.1,2 Terms such as PCR and R-rate are now known and being used by the public, indicating the explosion of interest in both genomics and health data.

The scaling up of data initiatives, aggregation and visualisation of health data, both in the UK and globally, has been critical in supporting successful research and public health surveillance throughout the pandemic. Aggregated data sets and visualisations provided by the organisation Our World in Data allow the monitoring of cases, deaths and vaccinations, not only in the UK but across the world. They propel research by sharing data openly under full Creative Commons licences and via open-source data repositories such as GitHub.3

Linkage of data has also been of real importance during the pandemic, allowing health and social care providers to understand pre-existing health conditions that might cause vulnerability to COVID-19. Innovative projects such as the Greater Manchester Care Record, led by Health Innovation Manchester and the GM Health and Social Care Partnership, have been accelerated by the pandemic. They bring together complete patient records from ten localities across Greater Manchester, enabling the best possible decision-making and outcomes for patients.4 Patient information has been brought together from:

  • primary care (e.g. GP practices)
  • community services
  • mental health services
  • social care
  • secondary care (e.g. hospitals)
  • specialist services (e.g. North West Ambulance Service).

The record also indicates if a patient has COVID-19, as this might affect their ongoing treatment, monitoring or medication.

How do we prepare the workforce to best utilise all of this data?

Time is a key factor in upskilling existing staff. New training programmes will need to rapidly develop data science specialists from existing areas of the healthcare science workforce to support growing areas, such as cancer sciences and precision medicine delivery. Protected time will be required to give these digital champions the space to develop additional knowledge and expertise that complements their clinical work and enables them to contribute effectively to data-led digital transformation.
 

Figure 1: Approach to the co-creation of the clinical data science curriculum.

Through the development of this flexible programme of clinical data science, we will offer continuing professional development in data science, statistics, genomics and programming, and other areas as defined by this programme of work. Early adopters and pioneers in clinical data science will require support in their journey. This might be achieved through peer support or through the development of national communities of practice, exemplified by the pioneering Topol Fellowship programme.5

Figure 2: Core concepts and skills to cover in clinical data science programme.

Co-creation of a curriculum

The educational development team comprises academics (Professor Ang Davies, Dr Alan Davies, Fran Hooley, UoM), a learning technologist, a project manager and an information systems programme manager (Dr Phil Couch, UoM). This team is working closely with Professor Berne Ferry and colleagues at the NSHCS, and other key stakeholders within the healthcare workforce to develop the curriculum and infrastructure to deliver this cutting-edge educational programme.

Phil Couch and other research software engineers at UoM are developing Manchester’s virtual laboratory, E-lab, which will host a set of virtual images that can be used anywhere with web access, currently including Jupyter Notebooks and R programming. This supports both programming and statistics teaching. The virtual laboratory will be developed further to allow learners to work in real-world, cutting-edge data science environments with anonymised patient data. Patient data can be used for research and service development purposes (unless patients have opted out of sharing their data under the national data opt-out, introduced in 2018);6 however, permissions do not allow it to be used in anonymised form for teaching. In developing this programme, we therefore face interesting challenges in governance and public involvement to enable the use of authentic patient data.

Working with a world-leading specialist cancer centre

To incorporate world-leading expertise in real-world clinical data science, the development team includes colleagues from The Christie Hospital, a specialist cancer hospital, with which UoM has existing collaborations. The Christie Hospital is pioneering a team science approach within clinical practice, often encompassing multidisciplinary teams including data scientists, clinicians and research software engineers, to utilise approaches such as machine learning to support clinical decision-making. Therefore, involving them in curriculum design and content development will ensure that teaching materials and case studies are developed from real-world clinical data examples captured from a specialist cancer hospital.

A three-phase approach

The creation of the curriculum for the new programme is being informed by a rigorous three-phase approach: a systematic literature review (covering published literature, grey literature and relevant job postings); a series of 20 semi-structured interviews with diverse representation across healthcare professionals; and finally a digital survey, which will be widely distributed via relevant stakeholders and networks (Figure 1). To date, key themes emerging from the interviews and survey responses include database management, statistics and machine learning methodologies, skills in R and Python, and data interpretation and interrogation (Figure 2).

Once the areas of focus for the curriculum are decided, the next steps will be to write detailed descriptions of each module to be taught, define learning outcomes and then create the learning materials themselves. It is envisaged that the course is likely to be fully online to accommodate the working schedules and study requirements of busy healthcare professionals. Together, the team has significant expertise in designing fully online courses in the areas of data science and bioinformatics. These courses have embedded rich, authentic clinical case studies and have used Jupyter Notebooks to deliver topics such as programming and machine learning.7

Next steps

A clinical advisory board and a number of key stakeholders have been involved to date. We envisage that in late 2021 a number of smaller curriculum development groups will be established to focus on specific parts of the curriculum and content creation. The academic team is delivering a fully online workshop for the second cohort of Topol Digital Fellows from September to December 2021. This will provide an opportunity to test some of the new Jupyter Notebooks, which will contain teaching content related to data science and machine learning algorithms, and to test the virtual laboratory infrastructure. The team aims to launch the first module of the new clinical data science programme, in data engineering, in September 2022. If you’re interested in finding out more about this programme, you can contact Angela Davies ([email protected]). We would welcome your contributions in shaping this new clinical data science programme; you can participate in the digital survey here.

References available on our website.