The Childhood Cancer Data Lab's not-so-secret sauce for efficient workflows — aka Philadelphia’s third most famous process
‘Work smarter not harder’ is useless advice if you don’t know how to ‘work smarter’. But the Childhood Cancer Data Lab's work and processes may be the smartest I’ve ever had the pleasure of learning and adopting. Their processes are just as valuable to me now that I’ve moved on to the next stage of my career (and no longer work for the Data Lab).
The Data Lab's processes allow them to maximize their efficiency as well as the efficiency of anyone who works alongside them. This outcome is no accident: it’s heavily implied in their mission statement to help others accelerate the race to cures.
Here’s their not-so-secret sauce for you (tips are in no particular order):
Sharing is caring!
In other words, when it comes to research and code, open source helps everyone! Redundancy is the enemy of efficiency. And in childhood cancer research every finding or new method can be priceless! Obviously there are limits to this — some data cannot be shared because it is Personally Identifiable Information (PII) or Protected Health Information (PHI). But the Data Lab has made it a habit to share their methods and findings whenever possible!
Tips and sources for applying this to your workflow:
- Open source on GitHub - GitHub is an online service for hosting version-controlled code and making it publicly available!
- Figshare is a handy way to get your data and figures associated with a manuscript out there.
- The Data Lab has been a strong initiator of open source projects like OpenPBTA.
- Note: Email is not a code or data sharing system.
Take more time now to do a thing that will save you time long-term.
Plenty of things are time-consuming now and come with upfront costs, but pay off in the long term. The Data Lab opts for the long-term payoff.
Here are some examples of this long-term mindset in action.
Write it down!
You think you will remember something until you have to remember it. The Data Lab has a very strong mantra of “write it down.” Writing things down is not only helpful for “future you,” who will forget what “today you” was thinking, but it’s also helpful for your collaborators and teammates. (Shout out to Stephanie Spielman on code comments.)
This has become especially important in the world of remote work. It’s critical to know what your teammates are working on or what they are planning so you don’t do redundant work. Don’t be redundant - write it down!
Writing things down with enough detail for others to understand also shows them that you respect their time. Don’t be afraid to spend time drafting an important GitHub comment.
Tips and sources for applying this to your workflow:
- GitHub issues are a great way to make sure information is recorded in a place where others can access it.
- GitHub issue templates are a great way to give yourself and your collaborators cues to make sure that all information makes it from the brain to the issue! A well-crafted issue enables future work to be completed more quickly.
- Documentation is a long-term time saver! I drew heavily on the lessons I learned at the Data Lab to write this course about documentation for cancer informatics for the ITCR training network.
Everybody benefits from code review.
The reviewer, the reviewee and whoever comes to look at the reviewed code later! To quote Parker 2017:
Code review will not guarantee an accurate analysis, but it’s one of the most reliable ways of establishing one that is more accurate than before.
Code review does take more time upfront – this is true – but the resulting code will stand the test of time much longer than unreviewed code.
Most wouldn’t dare send out a manuscript without having their collaborators review it, but sometimes people don’t think the same way about code. I can’t speak highly enough of code review. In the same spirit, try not to merge your changes straight to your main branch; get that code looked over by another pair of eyes first.
Personally, my ability to write readable code has been drastically enhanced by the code review I participated in at the Data Lab.
Tips and sources for applying this to your workflow:
- On Empathy and Pull Requests
- Code Review Guidelines for Humans
- Code Reviews Done Right: Your Missing Guideline
- Best practices for Code Review
- Writing Great Scientific Code
- Six Commandments for Writing Good Code
Have robots do your work for you where possible.
Why do something yourself if a robot can do it for you instead?! (At least until the robot uprising happens after they gain sentience.) In many cases, robots do a far better job than humans.
Automating tasks is another upfront cost, but a long-term time saver. Do you find yourself repeating a monotonous task? Delegate that task to the robots! This will ensure that it gets done while freeing up your brain glucose for tasks the robots are not so good at (like writing a blog).
There are many GitHub Actions out there that are already written to have robots do work for you.
Tips and sources for applying this to your workflow:
- An introduction to GitHub Actions by Daniel Weibel.
- A handy reference guide for GitHub Actions.
Some particularly handy GitHub actions:
- One to check URLs
- One to sync files between repositories
- A list of broadly applicable GitHub actions put together by Julien Danjo
- An even longer list of cool GitHub actions put together by Sarah Drasner
Some GitHub actions examples from the Data Lab's repositories:
- Code styling and spell checker
- Docker image testing and pushing
- Re-rendering RMarkdowns
Zapier can be handy for connecting apps to each other. It can integrate with GitHub, Slack, email, and tons of other apps. You are allowed a certain number of Zaps for free before you are required to pay.
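To make the "robots" idea concrete, here is a minimal Python sketch of the kind of chore a URL-checking action takes off your hands: pull the links out of a document and report the ones that no longer respond. This is not the code behind any of the actions linked above. The function names (`find_urls`, `fetch_status`, `broken_links`) are made up for illustration, and the URL pattern is deliberately simple, so treat it as a starting point rather than a finished tool.

```python
import re
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, List

# Deliberately simple pattern (an assumption, not a full URL grammar): an
# http(s) URL runs until whitespace or a character that usually closes a
# link in Markdown or prose.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]\"']+")


def find_urls(text: str) -> List[str]:
    """Return the unique URLs in `text`, in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for `url`, or 0 if no response arrived."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except (OSError, ValueError):
        return 0  # DNS failure, refused connection, timeout, or malformed URL


def broken_links(
    urls: Iterable[str],
    get_status: Callable[[str], int] = fetch_status,
) -> Dict[str, int]:
    """Map each URL whose status is not 2xx/3xx to the status it returned."""
    results = {}
    for url in urls:
        status = get_status(url)
        if not 200 <= status < 400:
            results[url] = status
    return results
```

You could run something like this on a schedule against your README or documentation and fail the job whenever `broken_links(find_urls(text))` comes back non-empty. The `get_status` parameter is there so the checking logic can be exercised without touching the network. Note that some servers reject `HEAD` requests, so a real checker would likely need a fallback to `GET`.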
Context switching is bad.
Brains take some time to get up to speed. Sometimes if you have too much on your plate you may feel the urge to multitask. Don’t! Do one thing and do it well.
Context switching often happens when we aren’t intentional enough with our to-do list. Figuring out a task management system that works for you is well worth the time! (But you have to stick with it.)
Tips and sources for applying this to your workflow:
- Scrum - what it is, how it works, and why it's awesome - The Data Lab uses scrum to be intentional about the tasks they need to get done each day and to keep their fellow teammates informed about what they intend to accomplish (or how their teammates may be blocking their productivity).
- ZenHub is an app for use with GitHub. When you have many repositories with projects all moving forward at once it can be difficult to juggle them and see what’s happening. ZenHub is how the Data Lab juggles these projects.
- Figure out a task management system that works for you. The greater your workload, the more important it is that your task management system is working. Email is not a task manager but rather a black hole that consumes project ideas and tasks that need to get done.
- Don’t be afraid to turn on “do not disturb” for some tasks, or for part of the day. (This might mean “do not disturb” for Slack, email, or your computer’s notifications.)
Let it simmer.
This may feel like the opposite of the previous point, but the two can balance each other. Do you feel like you just wrote amazing code? Let it sit, reread it tomorrow, and see if you still feel the same way. Do you feel like you can’t solve a problem? Let it simmer and come back tomorrow to find a solution with fresh eyes. Sometimes resting will be a more efficient use of your time than hitting your head against the wall.
Tips and sources for applying this to your workflow:
- How to take effective breaks (and be more productive)
- The Best Coding Advice I Ever Got: Take a Break
Incremental/iterative improvements are good and encouraged!
“Perfect is the enemy of good,” so they say. Small chunks are easier to accomplish. There is no such thing as a finished scientific project.
Tips and sources for applying this to your workflow:
- Smaller and more frequent, focused pull requests >> one massive pull request
- Learn how to do Stacked PRs
- More tips for splitting up PRs: How to Split Big Pull Requests – Good Practices and 4 Git Strategies
No-fault autopsies.
Take time out each week to evaluate your own work, how you spent your time, and how you communicated with your teammates.
Did things go as well as you hoped? Even if they went well, are there areas for improvement? Rules, routines, and processes are meant to be helpful, not a burden. So if a particular process is feeling more burdensome than helpful – maybe it’s time to switch! But most importantly, check in with your teammates. How did they think that last pull request review went? Was there a time when you feel you miscommunicated with each other? Time for a no-fault autopsy.
Tips and sources for applying this to your workflow:
- Empathy is a useful skill: On Empathy and Pull Requests
You are not your user!
(Or your collaborator.) Try to see others’ perspectives when you are writing or creating something. Realize others may not have the same background knowledge as you. You may need to fill in some gaps!
Tips and sources for applying this to your workflow:
- Obtaining user feedback - a chapter I wrote based heavily on what I learned from the Data Lab and from Deepa, the Data Lab’s UX Designer.
- Google Forms is a useful (and free) tool for obtaining feedback.
Trust the processes you’ve implemented, but re-evaluate when necessary!
Things take time to work and habits take time to form. Sometimes human brains get tired and want to look for a shortcut. Don’t do it! Stick to your processes! It’s helpful if your teammates stick to them too. Positive peer pressure is a powerful tool.
P.S. About the reference in the title if you are confused.