Computational text data analysis using R, quanteda and tidytext

Science Coding conference 2019

Wednesday, 4 September 2019 from 9:00 am to 12:00 pm (NZST)

2019 Science Coding Conference - Pre-conference Workshop

Speaker / Session Lead:
Arindam Basu, Senior Lecturer, College of Education, Health & Human Development
University of Canterbury

Computational text data analysis is an exciting field of research, not only for the digital humanities but also for social and health scientists who want to unravel and articulate the meanings embedded in written text, whether in native text format or transcribed from audio and video interviews.
The free and open-source software R and its associated packages tidytext and quanteda make it possible to carry out this complex task on a corpus of text; however, many researchers find it daunting to work out how to analyse a corpus and generate insights by applying computational thinking to the process.
The goal of this workshop is to develop computational thinking and to enable researchers to use R with the tidytext and quanteda packages to conduct computational text analysis.

We will work with a specified corpus drawn from Project Gutenberg texts and analyse it with the quanteda and tidytext packages. We will follow a data carpentry approach to enable practitioners (students, social scientists, and anyone interested in the process) to conduct text data analysis.
We will use a hosted Jupyter notebook instance and cover the following steps:
(1) reading a corpus of text
(2) tokenising the text corpus
(3) applying dictionaries to conduct sentiment analysis on the text
(4) identifying hidden constructs using topic modelling.
In the workshop, we will follow a data carpentry approach: designing the lessons as modules, sharing resources over the web, live coding each step in the session, and gathering frequent feedback from participants through formative assessments.
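
For readers who want a feel for the first three steps before the session, here is a minimal, language-neutral sketch in plain Python. It is not the workshop's R code and does not use quanteda or tidytext; the two-sentence corpus, the toy sentiment dictionary, and the helper names `tokenise` and `score_sentiment` are all hypothetical and chosen only for illustration. Topic modelling (step 4) is left out because it needs a dedicated library.

```python
import re
from collections import Counter

# Hypothetical two-document corpus (the workshop itself uses Project Gutenberg texts).
corpus = {
    "doc1": "It was a happy and wonderful day.",
    "doc2": "The news was sad, terrible and sad again.",
}

# Hypothetical toy sentiment dictionary, not a published lexicon.
sentiment_dictionary = {
    "positive": {"happy", "wonderful", "good"},
    "negative": {"sad", "terrible", "bad"},
}


def tokenise(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score_sentiment(tokens, dictionary):
    """Count how many tokens fall under each dictionary category."""
    counts = Counter()
    for token in tokens:
        for category, words in dictionary.items():
            if token in words:
                counts[category] += 1
    # Return every category, including those with a zero count.
    return {category: counts[category] for category in dictionary}


# Step 1 (read) is the in-memory dict above; step 2 tokenises each document.
tokens_by_doc = {name: tokenise(text) for name, text in corpus.items()}

# Step 3 applies the dictionary to each tokenised document.
scores_by_doc = {
    name: score_sentiment(tokens, sentiment_dictionary)
    for name, tokens in tokens_by_doc.items()
}

for name, scores in scores_by_doc.items():
    print(name, scores)
```

The same shape (corpus, tokens, dictionary lookup, per-document counts) is what the R packages named above provide in a more complete form.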

In order to get the most out of this training, it is strongly recommended that you:

  • Bring your own laptop to use during the session; it should be able to connect to the Internet and have a modern web browser installed (e.g. Chrome, Firefox or Safari).

This is a free pre-conference workshop and we encourage you to register your interest and attend. If you are not sure if this workshop is for you or have any further questions, feel free to email Nooriyah at  

Bentley room at UCSA Events Centre
90 Ilam Road, Ilam, Christchurch, 8041
New Zealand

The Science Coding Conference is where scientific programmers, software engineers, developers, and coding enthusiasts from universities and research institutes can gather in one place to share how they’re supporting New Zealand’s research ecosystem.

