Introduction

GA4GH Phenoboard is a Tauri app designed to help curate cohorts of individuals with rare genetic diseases using the GA4GH Phenopacket Schema.

Download

Installers for Mac, Windows, and Linux are available from the Releases page of the GitHub repository.

Background

Phenoboard is designed to curate cohorts of individuals with heritable diseases for the HPO project.

The Application

Phenoboard is a Tauri application that can be installed on Mac, Windows, and Debian Linux systems. Most users will want to download prebuilt installers from the Releases page of the project's GitHub repository.

Users can curate individuals with a certain disease, represented using an OMIM identifier. One or many individuals can be curated. The help section provides tutorials for the major functionalities of the app.

Phenoboard
GA4GH Phenoboard. Phenoboard is an application for biocuration of case and cohort reports in genetic medicine.

Background

GA4GH Phenopackets

The Global Alliance for Genomics and Health (GA4GH) is developing a suite of coordinated standards for genomics for healthcare. The Phenopacket is a GA4GH standard for sharing disease and phenotype information that characterizes an individual person, linking that individual to detailed phenotypic descriptions, genetic information, diagnoses, and treatments.

Phenopacket Schema
Phenopacket Schema overview. The GA4GH Phenopacket Schema is a hierarchical structure that consists of two required fields, id and MetaData, as well as eight optional fields: Individual, Disease, Interpretation, Biosample, PhenotypicFeature, Measurement, MedicalAction, and files ([GA4GH Phenopackets: A Practical Introduction](https://pubmed.ncbi.nlm.nih.gov/36910590/)).
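As a rough illustration of this structure, the following Python sketch assembles a minimal phenopacket as a plain dictionary and serializes it to JSON. The key names follow the JSON form of Phenopacket Schema version 2 (for example `subject`, `phenotypicFeatures`, and `metaData`). The helper name `minimal_phenopacket` is hypothetical, and all identifiers, the timestamp, the creator, and the HPO version are made-up placeholder values rather than output of Phenoboard.

```python
import json


def minimal_phenopacket(packet_id: str, created_by: str, hpo_version: str) -> dict:
    """Assemble a minimal phenopacket-like dictionary (illustrative values only)."""
    return {
        # Required: arbitrary identifier of this phenopacket
        "id": packet_id,
        # Optional: the individual being described
        "subject": {"id": "individual-1", "sex": "MALE"},
        # Optional: phenotypic features, each typed with an HPO term
        "phenotypicFeatures": [
            {"type": {"id": "HP:0001166", "label": "Arachnodactyly"}},
        ],
        # Required: provenance and the ontologies referenced in the record
        "metaData": {
            "created": "2024-01-01T00:00:00Z",  # placeholder timestamp
            "createdBy": created_by,
            "resources": [
                {
                    "id": "hp",
                    "name": "human phenotype ontology",
                    "url": "http://purl.obolibrary.org/obo/hp.owl",
                    "version": hpo_version,  # placeholder HPO release
                    "namespacePrefix": "HP",
                    "iriPrefix": "http://purl.obolibrary.org/obo/HP_",
                }
            ],
            "phenopacketSchemaVersion": "2.0",
        },
    }


packet = minimal_phenopacket("example-1", "ORCID:0000-0002-0736-9199", "2024-01-01")
print(json.dumps(packet, indent=2))
```

Only `id` and `metaData` are required; the other top-level keys can be omitted when the corresponding information is not available.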

Human Phenotype Ontology (HPO)

Ontologies are systematic representations of knowledge that can be used to capture medical phenotype data by providing concepts (terms) from a knowledge domain and additionally specifying formal semantic relations between the concepts. Ontologies enable precise patient classification by supporting the integration and analysis of large amounts of heterogeneous data. The HPO is widely used in human genetics and other fields that care for individuals with rare diseases (RDs) and is also increasingly being used in other settings, such as electronic health records (EHRs). HPO terms are used in the Phenopacket Schema to represent phenotypic features such as signs, symptoms, and laboratory and imaging findings.

Additionally, the HPO project is developing a corpus of phenopackets derived from the published literature that with time will form the backbone of the HPO annotation project (A corpus of GA4GH phenopackets: Case-level phenotyping for genomic diagnostics and discovery).

HPO
HPO. Screenshot of internationalized HPO Web application. For each term, users can choose from the available languages (seven, in this example) in addition to English.

Applications

Phenoboard is designed to help rapidly and accurately curate case and cohort reports about Mendelian disease from the medical literature. A growing number of software packages is available for analyzing cohorts of phenopackets, some of which we describe here.

GPSEA (Genotype-Phenotype Statistical Evaluation of Associations)

There are a huge number of clinical manifestations of human disease, and even individuals with the same clinical diagnosis may present with different combinations of phenotypic abnormalities, ages of onset of these abnormalities, and degrees of clinical severity. A key question for genomic precision medicine is how specific genetic variants influence clinical phenotype. The correlation between genotype (the type of variant or variants present at a given location) and phenotype (presence or absence of medically relevant observable traits) is defined as an above-chance probability of an association between the two, an association termed genotype-phenotype correlation (GPC). GPSEA leverages case-level phenopackets, characterizing an individual person or biosample and linking the individual to detailed phenotypic descriptions, genetic information, diagnoses, and treatments. GPSEA automates the process of visualizing and performing GPC analysis (GA4GH Phenopacket-Driven Characterization of Genotype-Phenotype Correlations in Mendelian Disorders).

gpsea
Schematic overview of GPSEA workflow. a) Overview. GPSEA is a Python package designed to work well in Jupyter notebooks. GPSEA takes a collection of GA4GH phenopackets as input, performs quality assessment, and visualizes the salient characteristics of the cohort; genotype classes are defined (Figure 2); and one of four classes of statistical test is performed for each hypothesis the user decides to test. b) Visualize data and formulate hypotheses. GPSEA displays tables with the distribution of phenotypic abnormalities, disease diagnoses, variants, and other information, and presents a cartoon with the distribution of variants across the protein. This information is intended to help users formulate hypotheses about genotype-phenotype correlations (GPCs). c) Statistical testing. GPSEA offers four main ways of testing phenotypes.
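To make the idea of an "above-chance association" concrete, here is a generic two-sided Fisher exact test on a 2x2 table of genotype group versus presence or absence of a phenotypic feature. This is a plain standard-library sketch and not the GPSEA API; the function name `fisher_exact_two_sided` and the counts are made up for illustration.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Rows are genotype groups; columns are phenotype present / absent.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_probability(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the observed one
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Made-up counts: 8 of 10 individuals in genotype group A show the feature,
# compared with 2 of 10 individuals in genotype group B.
print(fisher_exact_two_sided(8, 2, 2, 8))
```

A small p-value suggests that the feature is not distributed independently of the genotype group. When many phenotypic features are tested, a multiple-testing correction would also be needed.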

Validation of bioinformatics software

Exomiser and many other software packages for genomic diagnostics use phenopackets as input files. Developers of such software can use phenopackets from Phenopacket Store to test new algorithms. PheVal is designed to take phenopackets as input to test software for diagnostic genomics.

For special use cases, Phenoboard can be used to rapidly create specific phenopackets needed for testing or validation of software.

Phenoboard Help

Phenoboard offers several ways of curating clinical data.

Workflow

In general, Phenoboard supports curation of cohorts (which can consist of one or multiple individuals with a specified disease). Users create a new cohort file or open an existing one. Following this, Phenoboard provides tools to support curation of individual cases or of external Excel files that contain information about a cohort. Each step is quality-controlled, and text mining, autocompletion, and retrieval of information about variants using VariantValidator are provided. After finishing curation, the user can store or update the cohort file, export a collection of phenopackets representing all of the individuals in the cohort file, or export an aggregate tab-separated file representing a summary of the phenotypic features of the cohort (HPOA format).

Installation

Phenoboard is available as prepackaged installers for macOS, Windows, and Linux. Download the latest version from the Releases page.

Installing on macOS

File to download: phenoboard_0.5.10_aarch64.dmg
This is the macOS installer for Apple Silicon (M1/M2/M3/M4 Macs)

Because this application is open-source and distributed for free, it is not signed or notarized by Apple. macOS will warn you the first time you try to open it. Here's how to install:

  1. Download the .dmg file from the Releases page
  2. Open the DMG and drag the app into your Applications folder
  3. When you try to open it, macOS may show one of the following error messages:
    "App can't be opened because it is from an unidentified developer" or "'phenoboard' is damaged and can't be opened. You should move it to the Trash."

This error happens because macOS applies strict security checks to programs downloaded from the Web that are not signed with a paid Apple Developer account. There are at least two ways of dealing with this. (Of course, do not move the app to the Trash!)

1) xattr

  • Run in Terminal: xattr -cr /Applications/phenoboard.app (to open the Terminal, search for Terminal in Spotlight, then paste the above command into it and press Enter)

2) System Settings

Depending on your macOS version, you may also be able to do the following:

  • Go to System Settings → Privacy & Security → click "Open Anyway"

Installing on Windows

File to download: phenoboard_0.5.10_x64_en-US.msi
Windows installer (MSI format)

  1. Download the .msi installer from the Releases page
  2. Double-click to start the installer
  3. If Windows shows a blue SmartScreen dialog saying "Windows protected your PC", click "More info" → "Run anyway"

Note: Windows shows this for unsigned apps from new developers. Once you install and run it, the warning will not reappear.

Installing on Linux

File to download: phenoboard_0.5.10_amd64.deb
Debian/Ubuntu package

  1. Download the .deb package from the Releases page
  2. Install using:
sudo apt install ./phenoboard_0.5.10_amd64.deb

Or using dpkg:

sudo dpkg -i phenoboard_0.5.10_amd64.deb

Other Linux Distributions

File to download: phenoboard_0.5.10_amd64.AppImage
Universal Linux application (no installation needed)

  1. Download the .AppImage file from the Releases page
  2. Make it executable:
chmod +x phenoboard_0.5.10_amd64.AppImage
  3. Run it:
./phenoboard_0.5.10_amd64.AppImage

Building from Source

This will work on any OS.

git clone https://github.com/P2GX/phenoboard.git
cd phenoboard
npm install
npm run tauri build

The built installers will appear under:

src-tauri/target/release/bundle/

Prerequisites

Node.js (at least version 18) and npm (at least version 9).

You can check if you have them installed via

node -v
npm -v

If necessary, go to https://nodejs.org to install these programs.
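The minimum versions can also be checked programmatically. The sketch below compares the major version reported by `node -v` (which prints something like `v18.17.0`) or `npm -v` (which prints something like `9.6.7`) against a required minimum. The helper name `meets_minimum` is hypothetical and not part of the Phenoboard code base.

```python
import re


def meets_minimum(version_output: str, minimum_major: int) -> bool:
    """Return True if a version string such as 'v18.17.0' or '9.6.7'
    has a major version of at least `minimum_major`."""
    match = re.match(r"v?([0-9]+)", version_output.strip())
    if match is None:
        raise ValueError(f"Unrecognized version string: {version_output!r}")
    return int(match.group(1)) >= minimum_major


# Example with made-up version strings
print(meets_minimum("v18.17.0", 18))
print(meets_minimum("8.19.4", 9))
```

In practice, the strings passed to the function would be the output of `node -v` and `npm -v` as shown above.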

Rust and Cargo

See https://rustup.rs if needed.

Git

If you do not have git installed, replace the cloning step with a download of the archive.

Platform-specific code

Please report any dependencies not listed above.

Running the app

  1. Clone from GitHub
git clone https://github.com/P2GX/phenoboard.git
cd phenoboard
  2. Install npm dependencies. From within the phenoboard directory, enter
npm install
  3. Run the app
npm run tauri dev

This will run the application.

Start page

phenoboard
Phenoboard start page.

ORCID

Before using Phenoboard for the first time, the user needs to enter an ORCID research identifier. Enter just the number (e.g., enter 0000-0002-0736-9199 and not https://orcid.org/0000-0002-0736-9199). Phenoboard stores the ORCID in its settings directory (which is automatically created as a hidden directory in the user's home directory upon the first use of the app). From this point on, the ORCID will be automatically loaded upon program start.
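The expected input format can be illustrated with a small validator. The sketch below accepts only the bare identifier (four hyphen-separated blocks of four characters) and verifies the final check character using the ISO 7064 MOD 11-2 scheme that ORCID identifiers use. Whether Phenoboard itself performs this check is not stated here; the function names are hypothetical.

```python
import re

# Bare identifier only: no URL prefix, last character may be a digit or 'X'
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(text: str) -> bool:
    """Return True for a well-formed bare ORCID identifier with a correct checksum."""
    candidate = text.strip()
    if not ORCID_PATTERN.match(candidate):
        return False
    compact = candidate.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-0736-9199"))
print(is_valid_orcid("https://orcid.org/0000-0002-0736-9199"))
```

The first call uses the bare form recommended above; the second uses the URL form, which this sketch rejects.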

Load the HPO

Before curation, the user needs to load the hp.json file. We recommend always using the latest version, which can be found in the Download section of the HPO website. The path to this file is always stored in the settings directory, and the ontology will be loaded automatically upon program start. Users should check if an update is available and if so, download the new hp.json file and load it in Phenoboard.

Select phetools (legacy) template file

This option is only of use to the HPO maintainers. The first version of Phenopacket Store was developed using a standardized Excel template, which we are currently updating to use the Phenoboard JSON format. Note that the phenopackets generated from both sources are identical. This option will disappear once the HPO maintainers have finished the migration to the new format.

Select phetools JSON file

This option selects a Phenoboard JSON file that is used to store data about a cohort and which Phenoboard uses to create a collection of phenopackets representing the cohort. TODO

Create a new template

TODO

Open external table

This option is used to add data from an external table (such as the Supplemental Table representing data about a cohort).

New cohort

Phenoboard creates cohort files (JSON files) that represent one or more individuals diagnosed with a disease. There are three supported disease categories:

  • Mendelian
  • Melded phenotypes
  • Digenic disease
New Cohort Creation
Choose one of three cohort types to curate.

Case reports

Phenoboard allows individual case reports to be curated using text mining. This functionality is useful for publications with narrative descriptions of a case. In this example, we will curate an individual with Loeys-Dietz syndrome type 1 from PMID: 35003478. First, make sure that information about a cohort has been loaded by either using the new cohort page or by loading an existing cohort JSON file. Then go to the Add Case page.

Phenoboard
Add case. Users should enter information in each of the sections on this page, after which the submit case button will be activated, allowing the case information to be added to the cohort.

Lookup PubMed

Phenoboard is currently set up to curate published literature with a PubMed identifier (contact us if you would like to use the app for in-house cohorts). The user will first need to enter a PMID, either as PMID: 35003478 or with just the number (35003478). The app will reach out to PubMed and retrieve the title, which is also stored. The app will warn users if they attempt to enter a previously used PMID (perhaps because the same article is being mistakenly entered a second time). If you are entering multiple individuals from the same article, the warning can be ignored.
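The two accepted input forms can be normalized to a single representation. The following sketch is an illustration only and does not reproduce Phenoboard's actual implementation; the function names and the canonical output form `PMID:35003478` are assumptions. It also shows how a set of previously seen identifiers could drive the duplicate warning.

```python
import re

# Accepts "PMID: 35003478", "PMID:35003478", or "35003478"
PMID_PATTERN = re.compile(r"^(?:PMID:\s*)?([0-9]+)$", re.IGNORECASE)


def normalize_pmid(text: str) -> str:
    """Return the identifier in the form 'PMID:<digits>' or raise ValueError."""
    match = PMID_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"Not a valid PMID: {text!r}")
    return f"PMID:{match.group(1)}"


def is_previously_used(pmid: str, seen: set) -> bool:
    """Return True if the normalized PMID was already recorded; record it otherwise."""
    normalized = normalize_pmid(pmid)
    if normalized in seen:
        return True
    seen.add(normalized)
    return False


seen_pmids = set()
print(is_previously_used("PMID: 35003478", seen_pmids))
print(is_previously_used("35003478", seen_pmids))
```

The second call refers to the same article in a different input form, which is the situation in which a warning would be shown.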

Add/Edit Age entries

Enter the age entries needed to curate the case. See GA4GH Phenopackets: A Practical Introduction and the Phenopacket Schema documentation for information about how to represent ages. In brief, one can use one of three options:

Add/Edit Demographics

Enter the identifier of the patient as used within the publication (it must be unique within the publication if multiple individuals are curated), as well as the age of onset, the age when the individual was last medically examined, the sex, the deceased status, and, if desired, an optional comment. Only the individual identifier is required. In our example, the onset and the last encounter age are both 14 years (P14Y), and the affected individual is a boy who is not deceased. We entered the individual ID as Case report because that is the title used for the clinical description. It would also be acceptable to use 14 year old male because the individual is described in this way by the authors.
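The string P14Y used above is an ISO 8601 duration, which is how the Phenopacket Schema represents ages. As a minimal sketch, the following parser handles the year, month, and day components (for example P14Y, P3Y2M, or P10D). The function name is hypothetical, and the set of age formats that Phenoboard itself accepts may be broader than what is covered here.

```python
import re

# Year, month, and day components only; week and time components are not handled
AGE_PATTERN = re.compile(r"^P(?:([0-9]+)Y)?(?:([0-9]+)M)?(?:([0-9]+)D)?$")


def parse_iso8601_age(text: str) -> tuple:
    """Parse an ISO 8601 age such as 'P14Y' into (years, months, days)."""
    match = AGE_PATTERN.match(text.strip())
    # A bare 'P' matches the pattern but carries no component, so reject it
    if match is None or not any(match.groups()):
        raise ValueError(f"Not an ISO 8601 age: {text!r}")
    years, months, days = (int(part) if part else 0 for part in match.groups())
    return years, months, days


print(parse_iso8601_age("P14Y"))
print(parse_iso8601_age("P3Y2M"))
```

Free-text ages such as "14 years" are rejected by this sketch and would need to be converted to the duration form first.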

Add HPO annotations

This widget performs text mining on text that is pasted into the window.

Text mining
Text mining. Users should paste clinical descriptions into the window and perform text mining. They should read the text and correct the results as necessary. The toggle button switches "observed" to "excluded" status and vice versa. If possible, specify the onset of each feature. By clicking on the terms, it is possible to replace a term by a more specific child or less specific parent, if the text mining result needs to be modified. Finally, additional terms can be added using the autocomplete window.

When you are finished adjusting the text mining results, click "Finish".

Alleles

You can add small variants (in HGVS notation) or structural variants (SV) using the provided widgets. Make sure the HGVS notation is based on the reference transcript. If a variant is noted in the publication to be homozygous, click the "biallelic" checkbox.

Add variant
Add variant/alleles. Enter a valid HGVS string or symbol representation of a structural variant (e.g., DEL exon 5). For structural variants, enter the category (e.g., deletion, duplication, inversion, etc.).

Submit case

When all of the above information has been added, the case can be added to the cohort with the Submit case button. You will be taken to the cohort editor screen.

Cohort editor
Cohort editor. Functions are provided to edit individual cells. Users may choose to compare the annotations for the case they entered (which will be on the last line), with the annotations for other cases, and if there are important pieces of information that appear to be missing, they can go back to the curated publication and search for them. It is important to curate not only observed features but also explicitly excluded features.

Cohort editor

This screen allows users to visualize and edit the entire cohort.

Saving cohort

To save a cohort, first click the Validate button to check for errors. The Sanitize button can automatically correct some kinds of errors. If this does not work, the offending table cell(s) will need to be revised. You should add your ORCID iD to the files with the Record biocuration button (this needs to be done before saving). The Save button saves the JSON cohort file. The Export Phenopackets button exports each row of the table as one phenopacket. The Export HPOA button exports HPO annotations in aggregated tabular format.

Cohort editor
Saving work.

Table editor

Some articles present information about groups (cohorts) of individuals in tables that are either placed within the main article or provided as a supplemental table. There is no accepted format for such tables, and we have observed a great deal of heterogeneity. However, it can save a lot of time to curate an entire table at once. Phenoboard provides the External Table Editor functionality for these cases. Users need to save the table they would like to transform as an Excel file and then open it with one of the Load Excel buttons (one button each is provided for tables with row-based or column-based structure). Note that some external Excel files spread information about a single entity over multiple rows or columns. In this case, users will need to manually edit the files to put all information about a given entity into one cell.

Then, each column is processed by right-clicking on the column header or, as needed, on individual cells.

Cohort editor
External table editor. Here, the user has right-clicked on the table header and is transforming the contents to Age entries.

Functionality

The functions of Phenoboard can be explored by right-clicking on column headers or cells.

Saving

When all columns have been processed, the user can add all rows to the current cohort (which must have been created or loaded beforehand).

Developers

GA4GH Phenoboard is a Tauri application with a Rust backend and an Angular front end. It is designed to curate cohorts of individuals diagnosed with genetic disease using the Human Phenotype Ontology and the Global Alliance for Genomics and Health Phenopacket Schema.

The application makes major use of the following Rust crates.

This page summarizes some of the Angular and Rust/Tauri commands that have been useful to create the application.

Running in development mode

Most users should use the provided installation programs. Developers can start the program in development mode as follows

npm run tauri dev

Creating installation program

To generate an installation program (for the current OS), run the following

npm run tauri build

This will create an installer in the following location

src-tauri/target/release/bundle/dmg/phenoboard_0.3.1_aarch64.dmg

This can be attached to a release. Double-clicking the file will open a typical macOS installation window.

Port issues

If one gets the error message: Port 1420 is already in use, then use the following command to obtain the process ID:

lsof -i :1420
COMMAND   PID  USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
node    32315 <user>   49u  IPv4 0xd9cc1bb0104a525f      0t0  TCP localhost:timbuktu-srv4 (LISTEN)

then end the process with

kill -9 <PID>

A stale process that occupies this port may also prevent the TypeScript part of the app from being updated when we run npm run tauri dev.
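As a cross-platform alternative to lsof for the first step, the following standard-library sketch checks whether something is already listening on a local TCP port. It only reports whether the port is occupied; it does not identify or stop the process. The function name `port_in_use` is hypothetical.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds,
    which means that some process is listening on that port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 1420 is the development server port mentioned in the error message above
    print(port_in_use(1420))
```

If the function returns True before the development server has been started, a leftover process is probably still holding the port.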

Run in browser

Running the front end in a browser can be useful together with the DevTools panel.

npm run start

Documentation

We create documentation using the mdbook package. A local server can be started as follows.

cd book
mdbook serve --open

Angular tips

Some useful tips for working with Angular.

Reset cache

Sometimes stale build artifacts or a stale module cache may lead to errors. We can clean the cache as follows.

# Clean Angular/Nx cache
npx nx reset
# Clean node_modules and dist
rm -rf node_modules dist .angular .output .vite
# Clear package manager cache (optional but helpful)
npm cache clean --force
# Reinstall
npm install

Incompatibilities

Avoid BrowserAnimationsModule in standalone components. Importing it seems to lead to the following error:

NG05100: Providers from the BrowserModule have already been loaded.

Practical tips

Clearing the Mac cache

Sometimes the Spotlight search function will include links to local versions of the Phenoboard app when we want to test a version that was downloaded from the Releases page.

In this case,

  1. Open the Activity Monitor (Applications → Utilities → Activity Monitor).
  2. Find phenoboard, select it, and click the "i" (info) button in the toolbar.
  3. Go to the "Open Files and Ports" tab.
  4. This will reveal the path of the executable that Spotlight is finding.
  5. Enter open -R <path from above>/Phenoboard.app to open the folder in which this executable is located.
  6. Delete the executable file.

Release

This page explains the release process whereby installers are added to a Release on the project GitHub page.

Following an important update, increment the application version (we use the same version number in Cargo.toml, package.json, and tauri.conf.json). The version number will be something like 0.5.12. Adjust the tag accordingly and enter the following commands.

git add .
git commit -m "<whatever>"
git push
git tag v0.5.12
git push origin v0.5.12

If all goes well, this will add a new release with Mac, Windows, and Debian/Ubuntu installers.
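Because the same version number has to appear in three files and in the tag, a small pre-release check can catch mismatches. The sketch below is not part of the project: the function names are hypothetical, and the file locations (package.json in the project root, Cargo.toml and tauri.conf.json under src-tauri/) are assumed from the usual Tauri project layout. The version in tauri.conf.json is looked up both at the top level and under a "package" key, since the location differs between Tauri major versions.

```python
import json
import re
from pathlib import Path


def read_versions(project_root) -> dict:
    """Collect the version strings from the three files (assumed locations)."""
    root = Path(project_root)
    cargo_text = (root / "src-tauri" / "Cargo.toml").read_text(encoding="utf-8")
    # Restrict the search to the [package] table so dependency versions are ignored
    section = re.search(r"^\[package\](.*?)(?=^\[|\Z)", cargo_text, re.S | re.M)
    cargo_match = re.search(
        r'^version\s*=\s*"([^"]+)"', section.group(1) if section else "", re.M
    )
    package_json = json.loads((root / "package.json").read_text(encoding="utf-8"))
    tauri_conf = json.loads(
        (root / "src-tauri" / "tauri.conf.json").read_text(encoding="utf-8")
    )
    tauri_version = tauri_conf.get("version") or tauri_conf.get("package", {}).get("version")
    return {
        "Cargo.toml": cargo_match.group(1) if cargo_match else None,
        "package.json": package_json.get("version"),
        "tauri.conf.json": tauri_version,
    }


def versions_consistent(versions: dict, tag: str) -> bool:
    """Return True if every file reports the version named by the tag (e.g. 'v0.5.12')."""
    expected = tag[1:] if tag.startswith("v") else tag
    return all(version == expected for version in versions.values())
```

For example, `versions_consistent(read_versions("."), "v0.5.12")` could be run from the project root before creating and pushing the tag.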

Manual release

To create an installer locally (for the current OS), enter the following command

npm run tauri build

This will create an installer under src-tauri/target/release/bundle/.