Teaching with live coding in R and RStudio
The Data Lab teaches data science courses targeted toward pediatric cancer researchers that introduce topics such as analysis of gene expression in bulk and single-cell data and principles of reproducible research. I wrote previously about how we use RStudio Server for our remote courses to simplify setup, and I wanted to write a bit more about some of the instructional practices we use so that our participants get the best experience we can provide. In particular, I wanted to talk about our use of live coding to facilitate active learning, and one of the tools we developed to make our course development just a bit easier.
Active learning with live coding
While we often use slides to introduce conceptual information, we spend the vast majority of our time in courses working through R notebooks that contain instructional content, example code, and blank chunks where code will be filled in during the lesson. The notebook that the instructor presents on screen is exactly the same as the one each participant has open, and we fill them out together, adding the code that is needed and running everything in parallel. This is the “live coding” part of the course.
Others have written about the practice of live coding and active learning in general, but here are some of the advantages of this approach that we see:
- Coding live provides a natural pace to the lecture. After years of experience, we are relatively fast at typing, but not as fast as slides would be! The extra time it takes to type (and fix typos) gives us room to explain the meaning of each step as we go.
- We make mistakes! Everyone does, all the time, so it is critical to normalize mistakes, especially for people new to programming. We also get to demonstrate reading and interpreting error messages (even the more inscrutable ones).
- We can respond to questions fluidly, changing code and options on the fly to provide context or show alternative approaches.
- Most importantly, since participants are coding alongside instructors and filling out their own copy of the notebook, they get immediate and continuous practice with the process of coding.
Our overall goal is that this hands-on approach gives participants the experience and confidence they need to take what they learn back to their own labs and start applying their new skills to their own research.
There are of course potential pitfalls to live coding as well:
- If the lesson is slowed down too much by the time it takes to type in the code, more advanced participants may become bored and/or frustrated waiting to continue. We try to mitigate this by prefilling some parts of the notebook in areas where the code is repetitive, or where minor typos can lead to frustrating errors, such as when entering file paths.
- When (not if) participants make mistakes in transcribing code and encounter errors, they may fall behind the lecture, get stuck, and/or require help debugging. This is where having other instructors available is critical. If just a couple of people are stuck, we can help them with a quiet word if we are in person, or in a breakout room on Zoom. If there is a pervasive problem, we may have to pause or re-explain a point.
Another class of potential pitfalls with live coding is the dreaded brain freeze. It is not uncommon to come upon a blank code section while teaching and simply forget what you meant to put in it! To avoid this, we take two approaches:
- The first is just good practice: comments! We try to make sure that what is expected in the code chunk is described either in the surrounding text or in code comments.
- Second, we always have a completed version of the notebook alongside to remind us what should go in each block. We also provide this completed, rendered notebook to the participants; they can use it for reference, to catch up on something they missed, or just to refer back to after the course.
One weird trick: the {exrcise} package
Maintaining two versions of the workshop notebooks, one partially blank and one with all code completed, quickly became a source of annoyance for us, as they would tend to get out of sync. To solve this, we wrote a little R package called {exrcise} that takes a notebook and removes all the code from flagged code chunks, but leaves all line comments.
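To make the idea concrete, here is a minimal sketch in base R of what blanking flagged chunks while keeping comments could look like. It is not the {exrcise} implementation: the function name `strip_flagged_chunks()` is hypothetical, and the `live = TRUE` chunk option used as the flag is an assumption made for illustration.

```r
# Build the chunk fence programmatically so this example can itself
# sit inside a fenced code block.
fence <- strrep("`", 3)

# Hypothetical helper, NOT the {exrcise} implementation.
# `lines` is a character vector holding the lines of an R Markdown file.
# `flag` is an assumed marker in the chunk header for chunks to blank.
strip_flagged_chunks <- function(lines, flag = "live = TRUE") {
  out <- character(0)
  in_chunk <- FALSE  # are we inside any code chunk?
  flagged <- FALSE   # is the current chunk flagged for blanking?
  for (line in lines) {
    if (!in_chunk && startsWith(line, paste0(fence, "{r"))) {
      # Chunk header: always kept; note whether it carries the flag
      in_chunk <- TRUE
      flagged <- grepl(flag, line, fixed = TRUE)
      out <- c(out, line)
    } else if (in_chunk && trimws(line) == fence) {
      # Closing fence: always kept
      in_chunk <- FALSE
      flagged <- FALSE
      out <- c(out, line)
    } else if (in_chunk && flagged) {
      # Inside a flagged chunk: keep only comment lines and blank lines
      if (grepl("^[[:space:]]*(#|$)", line)) out <- c(out, line)
    } else {
      # Prose and unflagged chunks pass through untouched
      out <- c(out, line)
    }
  }
  out
}

# A tiny made-up notebook with one flagged and one unflagged chunk
notebook <- c(
  "Some introductory text.",
  paste0(fence, "{r filter, live = TRUE}"),
  "# Keep genes with nonzero total counts",
  "filtered <- counts[rowSums(counts) > 0, ]",
  fence,
  paste0(fence, "{r setup}"),
  "library(ggplot2)",
  fence
)
blank <- strip_flagged_chunks(notebook)
writeLines(blank)
```

In this sketch the flagged chunk keeps its header and comment but loses the `filtered <- ...` line, while the unflagged `setup` chunk is left intact. Working line by line keeps the example short; it does not handle code with trailing comments or other edge cases.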
Now, whenever we make a change to a notebook, we do it in a fully solved version (ensuring that all of our solution code works properly). We then run `exrcise::exrcise()` on that file to generate a new copy with blank code chunks, ready for live coding or for use as an exercise notebook that participants can practice with on their own. For even more convenience, we actually do the `exrcise()` step in a GitHub Action that pulls the repository, renders all of the notebooks to HTML, creates copies ready for live coding with blanked-out code chunks, then submits a pull request with all of the updates. This allows us to separate the work of making and reviewing the changes to the notebooks themselves from generating the “exrcised” versions.
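The batch part of that workflow might be sketched roughly as follows. This is an illustration under stated assumptions, not our actual automation script: `make_live_copies()` is a hypothetical helper, the `-live` file suffix is an assumed naming convention, and the blanking step is passed in as a generic function (`blank_fn`) rather than a real call to `exrcise::exrcise()`, whose arguments are not shown here. Rendering to HTML and opening the pull request are left out.

```r
# Hypothetical batch step: for every solved notebook in `dir`, write a
# blanked copy next to it. `blank_fn` takes and returns a character
# vector of lines. The "-live" suffix is an assumed naming convention.
make_live_copies <- function(dir, blank_fn, suffix = "-live") {
  solved <- list.files(dir, pattern = "\\.Rmd$", full.names = TRUE)
  # Skip files that are themselves previously generated live copies
  solved <- solved[!endsWith(tools::file_path_sans_ext(solved), suffix)]
  if (length(solved) == 0) {
    return(invisible(character(0)))
  }
  live <- paste0(tools::file_path_sans_ext(solved), suffix, ".Rmd")
  for (i in seq_along(solved)) {
    writeLines(blank_fn(readLines(solved[i])), live[i])
  }
  invisible(live)
}

# Usage with a throwaway directory and a toy blanking function
nb_dir <- tempfile("notebooks")
dir.create(nb_dir)
writeLines(c("# comment", "x <- 1"), file.path(nb_dir, "01-intro.Rmd"))

keep_comments <- function(lines) lines[startsWith(lines, "#")]
live_files <- make_live_copies(nb_dir, keep_comments)
print(basename(live_files))
```

Because the solved notebook is the only file edited by hand, rerunning a step like this after every change regenerates the live copies and keeps the two versions in sync.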
If you want to play around with {exrcise}, you can install it from GitHub with `remotes::install_github("AlexsLemonade/exrcise")`. (I’d love to say that this package is coming to CRAN, but unfortunately that isn’t likely at the moment, as the code relies on some internal functions in the {knitr} package.) If you give it a try, we’d love to hear about your experiences. You can get involved by filing an issue on GitHub, or even a pull request if you are ambitious!
Join us!
If you are a childhood cancer researcher and this kind of learning sounds like it would appeal to you, keep an eye out for our next training sessions! Subscribe to our blog, follow us on Twitter, and join our workshop mailing list for updates.