What is HeartBioPortal?
HeartBioPortal is a computational infrastructure platform that provides intuitive visualization, analysis, and downloads of large-scale publicly available cardiovascular disease (CVD) datasets focused on gene expression, genetic association, and ancestry information. To learn more, please read the HeartBioPortal publication.
What is HeartBioPortal written in?
Mostly R, Python 3.8, JavaScript, and Rust. Frontend: Vue.js. Backend: Django, Aerospike, Elasticsearch, and PostgreSQL.
What biological databases does HeartBioPortal source and sync its data from?
Ensembl, ClinVar, NHGRI-EBI GWAS catalog, gwasATLAS, OMIM, dbGaP, GTEx, CREEDS, HapMap, 1000Genomes, gnomAD, GEO, ArrayExpress, and OmicsDI.
What GWAS consortium efforts does HeartBioPortal’s genetic association data come from?
CARDIoGRAMplusC4D, TOPMed, AFGen, MESA, MEGASTROKE, UK Biobank, CHARGE, Biobank Japan, and MyCode, among others (including individual publications, e.g., Watanabe et al., 2019, Nature Genetics). This includes highly specialized GWAS studies for cardiovascular disease phenotypes such as tetralogy of Fallot, which can be found under "congenital heart disease" in HeartBioPortal.
How does HeartBioPortal present genetic association (GWAS) results?
GWAS summary statistics for cardiovascular-related traits were downloaded and filtered to markers with a reported p-value of 0.05 or lower (p<=0.05). We did not attempt to further harmonize the p-values from different studies; we present them as downloaded from each source. Our p-value filter is intended to remove most markers with no phenotype-association signal. It is not intended to indicate genome-wide significance, which typically uses a much more stringent threshold. Users who want to see only the most significant association results (e.g., p<1E-5 or p<1E-8) should use the p-value slider on the HeartBioPortal gene pages. To annotate the filtered genetic markers, we queried myvariant.info via POST requests and parsed the returned status and JSON using a custom script. Where possible, we used rsIDs from the original GWAS study to annotate each marker of interest. Where this was not possible, we queried using the chromosomal coordinates and Ref/Alt alleles (genomic HGVS). Annotated results returned for multi-allelic markers were split into their child bi-allelic markers, each keeping the original p-value for convenience. We chose to rely primarily on SnpEff annotations because they covered the largest number of variants for the most relevant annotation fields.
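The filtering and marker-identification steps described above can be sketched roughly as follows. This is a minimal illustration and not HeartBioPortal's custom script: the `Marker` record, the function names, and the comma-separated representation of multi-allelic alternates are all assumptions made for the example, the HGVS formatting covers single-nucleotide substitutions only, and the actual POST request to myvariant.info is left out. In the description above the split is applied to the annotated results returned by the query; here it is applied to the input record simply to keep the example short.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

# Inclusion cutoff from the text above; this is NOT a genome-wide significance threshold.
P_VALUE_CUTOFF = 0.05


@dataclass(frozen=True)
class Marker:
    """Hypothetical record for one row of a GWAS summary table."""
    chrom: str
    pos: int
    ref: str
    alt: str                      # assumed: several alleles are comma-separated, e.g. "A,T"
    p_value: float
    rsid: Optional[str] = None    # rsID reported by the original study, if any


def keep_marker(marker: Marker) -> bool:
    """Keep markers whose reported p-value is at or below the cutoff."""
    return marker.p_value <= P_VALUE_CUTOFF


def split_multiallelic(marker: Marker) -> List[Marker]:
    """Split a multi-allelic marker into bi-allelic children.

    Each child keeps the parent's original p-value.
    """
    return [replace(marker, alt=allele) for allele in marker.alt.split(",")]


def query_id(marker: Marker) -> str:
    """Prefer the study's rsID; otherwise fall back to a genomic HGVS id.

    The fallback handles single-nucleotide substitutions only.
    """
    if marker.rsid:
        return marker.rsid
    return f"chr{marker.chrom}:g.{marker.pos}{marker.ref}>{marker.alt}"
```

For example, a marker at chromosome 1, position 100, with reference G, alternates "A,T", and p = 0.01 passes the filter, splits into two children, and the first child gets the query id `chr1:g.100G>A`.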
What FinnGen data is used and how is it cited?
We want to acknowledge the participants and investigators of the FinnGen study. We used FinnGen public release 3 GWAS summary tables. Currently, not all FinnGen data are citable in a publication apart from the public releases. See https://finngen.gitbook.io/documentation/how-to-cite. Where we require a traditional citation, we have chosen internally to use Mars et al. 2020 (PMID: 32273609, DOI: 10.1038/s41591-020-0800-0) to represent the currently available FinnGen cardiovascular data.
What do you mean by ‘quantitative GWAS traits’?
In contrast to disease states, which have a yes or no value for each patient, these traits involve a continuous variable for each patient. For example, hypertension is a disease state, while systolic blood pressure is a quantitative trait. Instead of the p-value indicating the association of a variant with risk of disease, quantitative GWAS studies report the association of a variant with a raised or lowered clinical value such as systolic blood pressure. We have processed quantitative trait GWAS summary statistics in the same way as disease state summary statistics.
How does HeartBioPortal identify differentially expressed genes?
HeartBioPortal uses limma to identify genes that are differentially expressed across experimental conditions (e.g., case/control) and applies a Benjamini-Hochberg (false discovery rate) adjustment to the p-values, reporting only those differentially expressed genes whose adjusted p-values are less than 0.05. The expression results presented in HeartBioPortal are therefore of high stringency. Get in touch with us if you would like to explore other (e.g., less stringent) parameters.
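To make the adjustment step concrete, here is a small Python sketch of the Benjamini-Hochberg procedure followed by the 0.05 cutoff. It is an illustration of the general method only: HeartBioPortal performs this step with limma in R, and the function name and example p-values below are made up for the example.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # taking a running minimum so adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
# Indices of genes that would be reported at adjusted p < 0.05.
reported = [i for i, p in enumerate(adj) if p < 0.05]
```

With these four values only the first gene is reported: the raw p-values 0.03 and 0.04 are both below 0.05, but their adjusted values are about 0.053.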
Can I use HeartBioPortal visualizations in my paper?
Yes, all portal figures are free to use in publications.
Why are there often multiple ENSP IDs in the dropdown menu of the variant viewer plot (e.g., for the gene DSP) in HeartBioPortal?
The dropdown lists multiple ENSP IDs so that the plot can display variants found in both the canonical transcript of the gene and its alternative splice isoforms. The variant viewer plot therefore depicts variants found in all transcripts that have a known protein product associated with the respective CVD phenotype.
How do I properly cite/attribute HeartBioPortal?
We request that any use of data obtained from HeartBioPortal cite the HeartBioPortal paper. There's no need to include us as authors on your manuscript, unless we contributed specific advice or analysis for your work according to ICMJE authorship criteria.
Why do I need to register/login to download data in HeartBioPortal?
The NIH Genomic Data Sharing policy encourages minimal barriers to accessing genomic summary results. Therefore, investigators are able to see and query data at the summary level within HeartBioPortal without an approved data access request. However, logging in via an authorized account is required before a user can download the data. As with other resources such as dbGaP or DataSTAGE, this enables logging and monitoring of users and provides an opportunity to remind users of the three principles for responsible research use and compliance (no attempt to re-identify, use only for research or health purposes, and review of the responsible genomic data use information materials).
Does HeartBioPortal have any restrictions on data usage?
The data provided in HeartBioPortal (HBP) are available under the ODC Open Database License (ODbL). You are free to share and modify the HBP data so long as you attribute any public use of the database, or works produced from the database; keep the resulting datasets open; and offer your shared or adapted version of the dataset under the same ODbL license.
What are HeartBioPortal's terms of use?
All data in HeartBioPortal (HBP) are released openly and publicly for the benefit of the wider cardiovascular disease research community. You can freely download and search the data, and we encourage the use and publication of results generated from these data. There are absolutely no restrictions or embargos on the publication of results derived from HBP data.