Core data science skills: filling the gaps with community developed workshops – Times Higher Education (THE)

Posted: February 7, 2022 at 6:34 am

Core data science skills are needed for all kinds of scientific research. While many excellent resources are available, putting together a skills training programme suitable for your research institute is a challenge. Interdisciplinary research programmes attract students and staff with a wide range of background knowledge and skills. Graduate students are funded by a hodgepodge of schemes with different training requirements and support. Training of postdocs and early career researchers can be neglected, and many struggle to build the skills they need to progress their research and careers. Students and staff alike can start any time of year, though there is often a cohort of new students each autumn.

We are leading a UKRI-funded programme called Ed-DaSH, developing new training workshop materials available to the whole research community. We are working with The Carpentries, an inclusive community teaching coding and data skills, whose pedagogical model of collaborative hands-on learning we have adopted in our workshops. The workshop topics include statistics, FAIR (findability, accessibility, interoperability and reuse) principles of data management, and workflow management systems. Starting in autumn 2021, our institutes began using these new materials in core data science training programmes, focused initially on the PhD student intake but available to all staff and students.

What does your audience need to learn to fulfil their potential as researchers? Surveys are a good start, especially if these are short and easy to complete. For example, a survey of the MRC Human Genetics Unit's parent Institute of Genetics and Cancer regarding statistical training needs and support found high demand for both one-to-one training and workshops.

However, surveying can only capture what you ask about, and what people know they need right now. Future needs must also be looked after, especially for early career researchers. Observations from experts need to be factored in: what is the bleeding edge doing? We could observe a sea change towards workflow management systems in health and bioscience research and a lack of training to incorporate these into everyday usage.

What is your institute offering? And how is that related to the training needs you've identified? Your survey can tell you what training has helped in the past, and it is also helpful to collate existing post-training surveys. If training is offered via your institute's graduate research programme, is it open only to the students on that programme, or could it be made more widely available? We had a number of locally developed workshops, such as an introduction to our university computer cluster and an introduction to genome browsers, that were well received and in high demand.

What training and resources have other people developed that you can use? Don't waste time reinventing the wheel. Training material developed by others in the research community is often freely available, adaptable and high quality. Even better, academic research communities will generally welcome contributions and feedback.

We believe that to foster a living curriculum, it is worth letting go of some control over its content. Data science is in the fortunate position of having access to the open source Carpentries workshop material. We use lessons from the Software Carpentry and Data Carpentry suites, covering the basics of the Unix shell, Python and R.

Don't be afraid to adapt: we previously offered an internally developed genomics workshop, but we have replaced this with a more up-to-date Carpentries lesson. Lessons developed with the input of the wider research community are tested and updated by hundreds of instructors worldwide, making them easier to share across institutes. Our local Edinburgh Carpentries community facilitates collaboration.

Looking back at your training needs, what is missing? In our case, it was statistics, data management and workflow management systems that we felt most needed new material. If you have the capacity to begin right away and are interested in making your new material a community effort, talk to the Carpentries about their Incubator scheme, and take a look at their Curriculum Development Handbook. For funding, as well as UKRI, the Software Sustainability Institute generally, and Elixir specifically for biosciences, may have relevant schemes.

When, where and how do you want to deliver your new training programme?

Data science is here for the long term, and your programme will need to evolve with changing needs. Collecting feedback and, more importantly, acting on it will keep your programme relevant and effective. Community-developed materials help to spread the burden of keeping your lessons current, and you can pay it forward by contributing fixes and updates based on your experiences.

Alison Meynert is a senior research fellow and bioinformatics analysis core manager in the MRC Human Genetics Unit, and Edward Wallace is a Sir Henry Dale fellow in the School of Biological Sciences, both at the University of Edinburgh.

