Three ways we’ve enhanced the Single-cell Pediatric Cancer Atlas (ScPCA) Portal in 2024!

December 2, 2024

When the Data Lab launched the Single-cell Pediatric Cancer Atlas (ScPCA) Portal in 2022, we knew it was only the beginning! We started by making data easily available for the research community and received an overwhelmingly positive response. But we know firsthand from training hundreds of pediatric cancer researchers in analysis that making data available is just the first step. We’re increasing the impact of the Portal by listening to the growing ScPCA community. Now more researchers can contribute datasets, we’re continuously developing new features, and we’ve started an open, collaborative project to further explore the available data!

Here’s a look back at how we’ve enhanced the ScPCA Portal in 2024.

1. More data to discover

The ScPCA Portal was originally created to hold data funded by Alex’s Lemonade Stand Foundation (ALSF). Enabling access to ALSF-funded data was a good start, but we knew that expanding the collection of available data would require the help of the greater pediatric cancer research community. We opened two calls for community contributions, and three labs have since become eligible to share data!

Dr. Joae Wu from the University of Massachusetts Chan Medical School submitted two single-nuclei RNA-seq datasets. These datasets added 4 neuroblastoma cell lines and 2 neuroblastoma xenografts to the Portal.

Dr. Irina Pushel from Children's Mercy Research Institute submitted a single-cell RNA-seq dataset comprising 25 leukemia samples collected from multiple patients.

Dr. Jo Lynne Rokita, formerly from the Center for Data-Driven Discovery in Biomedicine (D3b) at Children’s Hospital of Philadelphia, submitted a single-nuclei RNA-seq dataset comprising 15 high-grade glioma and glioblastoma samples obtained from multiple patients.

2. More choices for a growing community

We’re ensuring that a broad community of researchers can more easily download, work with, and get the most out of these data! Here are just some of the features we’ve recently added for downloading and analyzing data from the Portal.

Cell type annotations

Users can now download samples with added cell type annotations. We used two different methods for annotating cell types: [.inline-snippet]SingleR[.inline-snippet], a reference-based cell type annotation method, and [.inline-snippet]CellAssign[.inline-snippet], a marker-gene-based cell type annotation method. A cell type report is available with each downloaded sample and includes a comparison of the annotations obtained from both tools. 
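To give a feel for what comparing two sets of annotations can look like, here is a minimal sketch that counts how often each pair of labels co-occurs across cells. It is not how the Portal's cell type report is generated; the [.inline-snippet]cross_tabulate[.inline-snippet] helper and the per-cell labels are invented for illustration.

```python
from collections import Counter


def cross_tabulate(singler_labels, cellassign_labels):
    """Count how often each (SingleR label, CellAssign label) pair occurs.

    Both inputs are assumed to hold one label per cell, in the same cell order.
    """
    if len(singler_labels) != len(cellassign_labels):
        raise ValueError("Both label lists must describe the same cells")
    return Counter(zip(singler_labels, cellassign_labels))


# Hypothetical per-cell labels, made up for this example only.
singler = ["T cell", "T cell", "B cell", "monocyte", "B cell"]
cellassign = ["T cell", "NK cell", "B cell", "monocyte", "B cell"]

table = cross_tabulate(singler, cellassign)

# Print one row per label pair: SingleR label, CellAssign label, cell count.
for (singler_label, cellassign_label), n_cells in sorted(table.items()):
    print(f"{singler_label}\t{cellassign_label}\t{n_cells}")
```

Pairs where the two labels differ are a natural place to start a closer look. Keep in mind that two methods may use different names for the same cell type, so a mismatch in text is not always a disagreement in biology.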

Python-compatible data

Python users can now avoid the time-consuming process of converting ScPCA data to another format before using it! The ScPCA Portal provides Python-compatible data, enabling users to perform downstream analyses with their preferred tools. Previously, all data available for download were provided as [.inline-snippet]SingleCellExperiment[.inline-snippet] objects in an RDS file. We’ve added another option! You can choose to receive your download packaged as an [.inline-snippet]AnnData[.inline-snippet] object in an HDF5 file instead. You can even change your default settings for future downloads.

Options for downloading data and metadata

  • Metadata can now be downloaded separately from gene expression data. Each project page offers the option to download the metadata for all of its samples as a tab-separated values (TSV) file.
  • A single TSV file containing the metadata for all samples from all projects on the Portal is also available. You can download this file by using the [.inline-snippet]Get All Sample Metadata[.inline-snippet] button in the global navigation.
  • Users who prefer to download data directly to a server or cluster can copy a download link. This feature lets you copy the URL for a project and download it with a command line tool. (It doesn’t trigger a download in your web browser, so you’ll need another tool to fetch the data!) All project pages now have a button to generate and copy a download link.
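If you would rather script these steps than click through them, the sketch below shows one way to do it using only the Python standard library. It assumes you have already copied a download link from a project page and that the metadata file is a TSV with a header row. The function names and the [.inline-snippet]diagnosis[.inline-snippet] column used in the usage note are assumptions for illustration, so check the header of your own file before filtering.

```python
import csv
import shutil
import urllib.request
from pathlib import Path


def download_file(url, destination):
    """Stream the contents of `url` to `destination` on disk.

    The response is copied in chunks, so large files are not held in memory.
    """
    destination = Path(destination)
    with urllib.request.urlopen(url) as response, destination.open("wb") as out:
        shutil.copyfileobj(response, out)
    return destination


def read_metadata(tsv_path):
    """Read a tab-separated metadata file into a list of dicts, one per row."""
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def filter_samples(rows, column, value):
    """Keep only the rows whose `column` exactly matches `value`."""
    return [row for row in rows if row.get(column) == value]
```

For example, after pasting your copied link into a variable such as [.inline-snippet]copied_url[.inline-snippet], you could call [.inline-snippet]download_file(copied_url, "project_download.zip")[.inline-snippet] on your server. Once you have a metadata file, [.inline-snippet]filter_samples(read_metadata("metadata.tsv"), "diagnosis", "Neuroblastoma")[.inline-snippet] would return the matching rows, assuming the file has a column with that name. The file names here are placeholders.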

Visit the Portal to try out some of these features and to download data from 700 samples representing 55 pediatric cancer types!

3. The Open Single-cell Pediatric Cancer Atlas (OpenScPCA) project 

Have you heard about the Open Single-cell Pediatric Cancer Atlas (OpenScPCA) yet? Earlier this year, we started an open science initiative to analyze the available ScPCA data more deeply! External contributors from different institutions joined the project and proposed a variety of analyses focused on improving the quality of cell type annotations for pediatric cancer samples on the Portal. So far, contributors have analyzed data from Wilms tumor, acute lymphoblastic leukemia (ALL), and desmoplastic small-round-cell tumors. In November, three researchers became eligible for grants after submitting cell type labels for four ScPCA projects! 

How it works: The Data Lab and contributors work together to develop analysis ideas on GitHub Discussions. All contributors must participate in analytical code review before results are made openly available in the [.inline-snippet]OpenScPCA-analysis[.inline-snippet] GitHub repository.

Help us grow and improve!

  • Contribute data - Interested in submitting your pediatric single-cell data to the ScPCA Portal? Read more about contributing and sign up to be notified when we open our next call for contributions!
  • Participate in user research - You can help us improve and further develop the Portal and other Data Lab tools by providing your feedback. Sign up here if you’d like to participate in future user research.
  • Join OpenScPCA - Want to get involved with this project? Fill out the interest form to receive more information. 
