cycIFAAP

cycIFAAP (Cyclic ImmunoFluorescence Automatic Analysis Pipeline) is an automatic pipeline developed in Java 11 and Python 3.8 to help analyze cyclic immunofluorescence datasets.

This pipeline performs the following operations:
- Dapi based registration using OpenCV. If your images are already registered, there is a workaround to skip this part.
- Nuclei segmentation using deep learning: Mask R-CNN or CellPose
- Cell segmentation based on Dapi segmentation
- Compute each marker's exclusiveness with all other markers (useful for Restore).
- Run Restore automatic gating (optional, see Young Hwan Chang paper)
- Extract features
- Compute cell/nuclei positiveness to each marker
- Compute mean intensities for each cell/nucleus and for each marker
- Compute cell types
- Quality control (Restore based and tissue loss)

1 - Installation

cycIFAAP is pip installable.

$ pip install cycIFAAP

If cycIFAAP is already installed and you wish to upgrade to the latest version, use the following command:

$ pip install --upgrade cycIFAAP
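
To confirm which version ended up installed, a small check like the one below can help. This is only a sketch: it uses the standard library module importlib.metadata (available from Python 3.8) and assumes the distribution is registered under the name 'cycIFAAP', as in the pip command above. The helper name installed_version is made up for this example.

```python
from importlib import metadata


def installed_version(package="cycIFAAP"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print(installed_version() or "cycIFAAP is not installed in this environment")
```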

2 - Input Data

The input data are images contained in a single directory, each image being an uncompressed TIF or PNG representing a single channel/cycle.
The naming convention is standard (to the extent of my knowledge) and is the following:
- starts with the round number followed by '_' like 'R0_'
- then all the markers used in the current round, with names separated by dots, for example 'R0_aSMA.CD44.CK5.PARP_'; there is no need to specify Dapi
- somewhere in the name, the channel is given as '_c1_', with channel numbers ranging from 1 to 5, and dapi ALWAYS being c1
- multiple scenes/ROIs can be placed together in the same directory, they just have to be labeled with 'Scene-XXX_'

For example:
R1_aSMA.Tryp.Ki67.CD68_38592-6_2020_08_14__8688-3-Scene-001_c1_ORG.tif
R1_aSMA.Tryp.Ki67.CD68_38592-6_2020_08_14__8688-3-Scene-001_c2_ORG.tif
R1_aSMA.Tryp.Ki67.CD68_38592-6_2020_08_14__8688-3-Scene-001_c3_ORG.tif
R1_aSMA.Tryp.Ki67.CD68_38592-6_2020_08_14__8688-3-Scene-001_c4_ORG.tif
R1_aSMA.Tryp.Ki67.CD68_38592-6_2020_08_14__8688-3-Scene-001_c5_ORG.tif
R2_EPCAM.AR.CD20.ChromA_38592-6_2020_08_15__8700-Scene-001_c1_ORG.tif
R2_EPCAM.AR.CD20.ChromA_38592-6_2020_08_15__8700-Scene-001_c2_ORG.tif
R2_EPCAM.AR.CD20.ChromA_38592-6_2020_08_15__8700-Scene-001_c3_ORG.tif
R2_EPCAM.AR.CD20.ChromA_38592-6_2020_08_15__8700-Scene-001_c4_ORG.tif
R2_EPCAM.AR.CD20.ChromA_38592-6_2020_08_15__8700-Scene-001_c5_ORG.tif
R3_CK5.CD2L_38592-6_2020_08_17__8762-Scene-001_c1_ORG.tif
R3_CK5.CD2L_38592-6_2020_08_17__8762-Scene-001_c2_ORG.tif
R3_CK5.CD2L_38592-6_2020_08_17__8762-Scene-001_c3_ORG.tif
R4_CD4_38592-6_2020_08_20__8811-Scene-001_c1_ORG.tif
R4_CD4_38592-6_2020_08_20__8811-Scene-001_c2_ORG.tif
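
To check that your own file names follow this convention before launching the pipeline, something like the following sketch can be used. It is not the parser used by cycIFAAP: the function name parse_image_name and the regular expressions are written for this example only. It also ASSUMES that channels c2, c3, ... follow the order of the markers in the name (c1 being Dapi), which matches the examples above but is not stated explicitly.

```python
import re


def parse_image_name(name):
    """Split an image file name into round, markers, scene and channel."""
    head = re.match(r"(R[^_]*)_([^_]+)_", name)   # e.g. 'R1_aSMA.Tryp.Ki67.CD68_'
    channel = re.search(r"_c([1-5])_", name)      # '_c1_' to '_c5_'
    scene = re.search(r"Scene-([^_]+)_", name)    # 'Scene-001_' (may be absent)
    if head is None or channel is None:
        raise ValueError(f"'{name}' does not follow the naming convention")

    markers = head.group(2).split(".")
    index = int(channel.group(1))

    # ASSUMPTION: c1 is Dapi, then c2, c3, ... follow the marker order.
    if index == 1:
        stain = "Dapi"
    elif index - 2 < len(markers):
        stain = markers[index - 2]
    else:
        stain = None  # more channels than markers listed in the name

    return {
        "round": head.group(1),
        "markers": markers,
        "scene": scene.group(1) if scene else None,
        "channel": index,
        "stain": stain,
    }


info = parse_image_name(
    "R3_CK5.CD2L_38592-6_2020_08_17__8762-Scene-001_c3_ORG.tif"
)
print(info)
```

Running it over every file of a directory and catching ValueError is a quick way to spot misnamed images.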

3 - Example 1

The easiest way to start learning how to use this pipeline is simply to run an example.

Start by downloading example 1, which contains a full end-to-end example. The downloaded directory contains a python file, a directory with the expected results, and a small dataset with images of dimensions 2048x2048.

Then open a terminal and run the following commands:
$ cd /where/the/example1/was/downloaded/
$ tar -zxvf cycIFAAP_Example1.tgz
$ cd cycIFAAP_Example1
$ python Example1.py

If you run it in an environment without a GPU, it will take quite some time, around 20 to 25 minutes. Otherwise the GPU(s) will be automatically detected and used.
cycIFAAP will generate three directories:
- '2048x2048 - Registration' which contains the original images after registration.
- '2048x2048 - Segmentation', which contains ALL the segmented images, a directory 'Scene 001 - Visual Check' that you want to look at to confirm that the images were properly segmented, and a file 'Scene 001 - Napari.py' that you can run to visualize ALL the images.
- '2048x2048 - Features', which contains all the extracted features for each marker, a directory 'Scene 001 - Markers Excelusiveness' (yes there is a typo… so far) that contains each nucleus/cell mean intensity plotted against all other markers (useful later for Restore), a file 'Scene 001 - Mean Intensities.csv' which is a concatenation of all the other features files, a file 'Scene 001 - Normalized Mean Intensities.csv' which contains the same information as the previous file but after normalization, and two files showing Restore results.

Here you are, done with the first example, and you can compare the results with the results you'll find inside the directory "Expected Results".

4 - cycIFAAP Keywords/Options

In both examples, you'll find a python file with a dictionary containing all cycIFAAP modifiable parameters. Here is an explanation for each parameter. Most of them do not need to be set up and so can stay commented.
- 'nbCPU' (mandatory): the number of CPUs / cores you wish to use for the processing.
- 'Experiment' (mandatory): The experiment's type, it can be 'CROPS' or 'TMA'. The only difference is that for TMA, the pipeline will find the central core (core of interest) and will ignore all cells close to the borders.
- 'Python_Path': If necessary (depending on your settings/environment), give the entire / absolute path to your python executable.
- 'Java_Xms': How much memory to allocate at the beginning of the processing.
- 'Java_Xmx': What is the maximum memory to allocate for the processing.
- 'Registration_Technique': the type of registration technique to use. So far, only "OpenCV" is supported. Another technique (java/stackreg based) is available upon request.
- 'Registration_Reverse': If True the dapi image from the last round will be used as reference for the registration, else by default it's the dapi image of the first round.
- 'SegmentNuclei_RoundToSegment': Which round(s) to use for nuclei segmentation. It can be 'Z-Projection', 'All', 'Consensus', or 'Everything' (recommended) if you wish to use all the information contained in all the dapi images for nuclei segmentation, otherwise you can give a list of rounds to use, like for example 'R1,R3,R7' or 'R9/R10;R11Q' (multiple separators available for more flexibility).
- 'SegmentNuclei_Model' (mandatory): Which deep learning model to use for nuclei segmentation?
- - - - - - 'MaskRCNN_512x512_Norm=B - 9686_831_8976 - 20201009.pt' (recommended)
- - - - - - 'MaskRCNN_512x512_Norm=CR - 967_821_884 - 20200624.pt'
- - - - - - 'MaskRCNN_256x256_Norm=CR - 964_812_873 - 20200619.pt'
- - - - - - 'CellPose_cpu_R=30_F=0.4' In this example CellPose will use CPUs, a diameter of 40 and a flow of 0.4.
- - - - - - 'Mesmer_mpp=0.325' In this example, Mesmer will be used and the micron per pixel is 0.325.
- 'SegmentNuclei_BorderEffectSize': Mask R-CNN nuclei segmentation parameter.
- 'SegmentNuclei_BatchSize': Mask R-CNN nuclei segmentation parameter which is the number of crops segmented in parallel.
- 'SegmentNuclei_CheckOverlap': Mask R-CNN nuclei segmentation parameter which is the number of overlapping pixels required for two bounding boxes to be considered identical.
- 'SegmentNuclei_SaveNuclei': Mask R-CNN nuclei segmentation parameter. Save the individual nuclei?
- 'SegmentNuclei_Threshold': Mask R-CNN nuclei segmentation parameter. The minimum probability for a bounding box candidate to be considered as a nucleus.
- 'SegmentCells_DilationRatio': The scale/dilation coefficient applied for the cell segmentation when there is no marker and the nuclei are inflated instead.
- 'Background_Subtraction': If True (recommended) the background subtraction will be performed for each marker and used for features extraction. It stabilizes the feature values across experiments.
- 'EarlyExitAfterSegmentation': If True, the processing immediately stops after the segmentations, nothing more.
- 'Images_Directory' (mandatory): Path to the directory containing the images to process. Always make a backup!!!
- 'FE_SaveImages': If True, all the resulting / check images will be saved.
- 'FE_BiasedFeaturesOnly': If True, only the biased (intensity based) features will be extracted.
- 'FE_DistanceFromBorder': If True, the distance from the sample border will be computed.
- 'FE_Rim_Size': The rim size/dimensions/width, so the number of pixels to take into account on each side of the cell edge.
- 'Segmentation': If True, the nuclei and cells segmentation will be performed, but not the registration.
- 'Registration_And_Segmentation': If True, the registration and the segmentations (nuclei and cells) will be performed.
- 'FeaturesExtraction': If True, this starts the features extraction.
- 'QualityControl': If True, this starts the quality control.
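
As an illustration of the round list accepted by 'SegmentNuclei_RoundToSegment', here is a rough sketch of how such a value could be split. It is not the code used by cycIFAAP: the function name rounds_to_segment is invented for this example, and the separator set (comma, slash, semicolon, colon) is an ASSUMPTION taken only from the values shown in this document.

```python
import re

# Keywords listed in the parameter description; they are passed through as-is.
KEYWORDS = {"Z-Projection", "All", "Consensus", "Everything"}


def rounds_to_segment(value):
    """Return the keyword alone, or the list of round names found in value."""
    if value in KEYWORDS:
        return [value]
    # ASSUMPTION: ',', '/', ';' and ':' are the accepted separators.
    return [part.strip() for part in re.split(r"[,/;:]", value) if part.strip()]


print(rounds_to_segment("R9/R10;R11Q"))
print(rounds_to_segment("Everything"))
```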

Here are examples:

Parameters1 = {
'nbCPU': 6,
'Experiment': "CROPS",
'SegmentNuclei_Model': 'MaskRCNN_512x512_Norm=B - 9686_831_8976 - 20201009.pt',
'Images_Directory': "/Path/To/The/Images/",
'Registration_And_Segmentation': True,
'FeaturesExtraction': True,
}

Parameters2 = {
'nbCPU': 6,
'Experiment': "TMA",
'SegmentNuclei_RoundToSegment': 'R0,R1/R2;R3:R4',
'SegmentNuclei_Model': 'CellPose_gpu_40_0.5',
'Background_Subtraction': False,
'Images_Directory': "/Path/To/The/Images/",
'FE_SaveImages': False,
'Segmentation': True
}
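
Before launching a long run, it can be worth checking the dictionary against the list above. The sketch below only relies on what this section states (the four mandatory keys, the two experiment types, and the four step switches). The function name check_parameters is invented for this example, and treating "no step switched on" as a problem is my ASSUMPTION, not a documented cycIFAAP behavior.

```python
MANDATORY = ("nbCPU", "Experiment", "SegmentNuclei_Model", "Images_Directory")
EXPERIMENTS = ("CROPS", "TMA")
SWITCHES = ("Segmentation", "Registration_And_Segmentation",
            "FeaturesExtraction", "QualityControl")


def check_parameters(parameters):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for key in MANDATORY:
        if key not in parameters:
            problems.append(f"missing mandatory parameter '{key}'")

    experiment = parameters.get("Experiment")
    if experiment is not None and experiment not in EXPERIMENTS:
        problems.append(
            f"'Experiment' must be one of {EXPERIMENTS}, got {experiment!r}"
        )

    # ASSUMPTION: a run with every step switched off is probably a mistake.
    if not any(parameters.get(switch) for switch in SWITCHES):
        problems.append("no processing step is switched on")
    return problems


example = {
    'nbCPU': 6,
    'Experiment': "CROPS",
    'SegmentNuclei_Model': 'MaskRCNN_512x512_Norm=B - 9686_831_8976 - 20201009.pt',
    'Images_Directory': "/Path/To/The/Images/",
    'Registration_And_Segmentation': True,
}
print(check_parameters(example))
```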

5 - How to use it from scratch?

After installing cycIFAAP and running at least the first example, you will obviously want to test it on your own data.
The simplest way to start is to take the python file from example 1 and modify it for your needs.
So first, let's rename it myTest.py.
It will look like this:

import sys
from cycIFAAP import cycIFAAP

Parameters = {
'nbCPU': 6,
'Experiment': "CROPS",
'SegmentNuclei_Model': 'MaskRCNN_512x512_Norm=B - 9686_831_8976 - 20201009.pt',
'Images_Directory': "./Test - 2048x2048/",
'Registration_And_Segmentation': True,
'FeaturesExtraction': True,
}

cycIFAAP.Run(Parameters)
print("Done!")
sys.exit(0)

The ONLY things you need to modify are the parameters, NOTHING else.

- Comment out or set to False the parameter 'FeaturesExtraction' as this step should be performed separately.
- Modify the number of cores you want to use for this test by modifying the parameter 'nbCPU'.
- Modify the parameter 'Experiment' if your data are TMAs (see Parameters2), else don't touch it.
- Don't touch the parameter 'SegmentNuclei_Model' as it is the best nuclei segmentation model (so far).
- Update the parameter 'Images_Directory' to give the relative or absolute path to your images to process.

So the 'Parameters' should look like this now:
Parameters = {
'nbCPU': 6,
'Experiment': "CROPS",
'SegmentNuclei_Model': 'MaskRCNN_512x512_Norm=B - 9686_831_8976 - 20201009.pt',
'Images_Directory': "/Path/To/Your/Data/",
'Registration_And_Segmentation': True,
}

Then run the first part of the pipeline:
$ cd /Path/To/myTest.py/Directory/
$ python myTest.py

It will generate:
- A directory "/Path/To/Your/Data - Registration/" containing the original images registered.
- A directory "/Path/To/Your/Data - Segmentation/" containing all the generated images:
- - - - - - A directory "Scene XXX - Visual Check" containing all the images to check/evaluate the segmentation quality.
- - - - - - Images "Scene XXX - ZProjectionDapi" containing the z-projection (also known as extended depth of field) of all (or part of) the dapi images.
- - - - - - Two images "Scene XXX - Marker SBnuclei.png" and "Scene XXX - Marker SBcells.png" per marker, which are the marker images after background subtraction based on nuclei or cell segmentation results respectively.
- - - - - - Two images "Scene XXX - Nuclei Labels.png" and "Scene XXX - Cells Labels.png" that contain the labels associated with each cell/nucleus.
- - - - - - An image "Scene XXX - Nuclei Forground.png" which contains the nuclei segmentation represented as foreground vs background.
- - - - - - An image "Scene XXX - NucleiToSegment.png" which is the image used for nuclei segmentation.
- A directory "/Path/To/Your/Data - Features/" containing:
- - - - - - A single directory (so far) which contains the scatter plots of all the markers plotted against each other. Highly useful for Restore.
- - - - - - A file RoundsCyclesTable.txt containing the list of all the markers found, their associated round and cycle, along with the gating values (min and max) and the location where to extract features (All, Rim, Ring, Nucleus, or Cell), by default 'All'.
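
The three output directories listed above sit next to the images directory and reuse its name. A small sketch to compute them (for instance to check which steps have already run) could look like this. The helper name expected_output_dirs is made up for this example, and the naming rule is inferred from the paths shown in this section, not read from the cycIFAAP code.

```python
from pathlib import Path

STEPS = ("Registration", "Segmentation", "Features")


def expected_output_dirs(images_directory):
    """Map each step to the sibling directory '<images dir name> - <step>'."""
    base = Path(images_directory)
    return {step: base.with_name(f"{base.name} - {step}") for step in STEPS}


for step, folder in expected_output_dirs("/Path/To/Your/Data/").items():
    status = "found" if folder.is_dir() else "not generated yet"
    print(f"{step}: {folder} ({status})")
```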

At this point:
- Look at the exclusive markers to define the pairs that should be used in Restore.
- Update the file 'RoundsCyclesTable.txt' (see example 2):
- - - - - - Update the manual gating values (based on the images "Scene XXX - Marker SB….png" because the option 'Background_Subtraction' is set to True by default).
- - - - - - Remove or add the markers that should or should not be used by Restore.
- Create a file CellTypesTable.txt if you want the cell types to be computed (see example 2).
- Update your parameters to change the operation to perform.

So the parameters should now be:
Parameters = {
'nbCPU': 6,
'Experiment': "CROPS",
'SegmentNuclei_Model': 'MaskRCNN_512x512_Norm=B - 9686_831_8976 - 20201009.pt',
'Images_Directory': "/Path/To/Your/Data/",
'FeaturesExtraction': True,
}

And then run the command:
$ python myTest.py

It will generate:
- A file "/Path/To/Your/Data - Features/Scene XXX - Marker - Location.csv" per marker and per location, which contains all the features for this marker and each required location.
- A file "/Path/To/Your/Data - Features/Scene XXX - Mean Intensities.csv" which contains the concatenation of all the markers mean intensities.
- A file "/Path/To/Your/Data - Features/Scene XXX - Normalized Mean Intensities.csv" which contains the normalized mean intensities for all markers.
- A file "/Path/To/Your/Data - Features/Scene XXX - Positiveness.csv" which contains each cell positiveness for each marker based on manual gating or Restore.
- A file "/Path/To/Your/Data - Features/Scene XXX - Cell Types.csv" which contains each cell type if the file CellTypesTable.txt was defined.
- Two files "/Path/To/Your/Data - Features/Scene XXX - restore." which contain Restore results (values and exclusiveness graphics).
- A file "/Path/To/Your/Data - Segmentation/Scene XXX - Napari.py" to visualize the results using Napari.
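
Since this section edits the same dictionary twice, one option is to keep a shared base and switch on a single step per run. The sketch below only builds the two dictionaries shown above. The names BASE_PARAMETERS, STEP_SWITCH and parameters_for are invented for this example, and the call to cycIFAAP.Run is left commented so the snippet runs without the package installed.

```python
BASE_PARAMETERS = {
    'nbCPU': 6,
    'Experiment': "CROPS",
    'SegmentNuclei_Model': 'MaskRCNN_512x512_Norm=B - 9686_831_8976 - 20201009.pt',
    'Images_Directory': "/Path/To/Your/Data/",
}

# One switch per pass, as in the two dictionaries of this section.
STEP_SWITCH = {
    "segmentation": 'Registration_And_Segmentation',
    "features": 'FeaturesExtraction',
}


def parameters_for(step):
    """Return a copy of the base parameters with only the given step enabled."""
    if step not in STEP_SWITCH:
        raise ValueError(f"unknown step {step!r}, expected one of {list(STEP_SWITCH)}")
    parameters = dict(BASE_PARAMETERS)
    parameters[STEP_SWITCH[step]] = True
    return parameters


print(parameters_for("segmentation"))

# First pass, then (after updating RoundsCyclesTable.txt) second pass:
# from cycIFAAP import cycIFAAP
# cycIFAAP.Run(parameters_for("segmentation"))
# cycIFAAP.Run(parameters_for("features"))
```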

6 - Example 2

After successfully running example 1 and carefully reading the previous section (5), it's time to run a more advanced example, so here is example 2.

Then open a terminal and run the following commands:
$ cd /where/the/example2/was/downloaded/
$ tar -zxvf cycIFAAP_Example2.tgz
$ cd cycIFAAP_Example2
$ python Example2_Part1.py

As explained in the previous section, it will perform:
- The registration.
- The nuclei/cells segmentation.
- The background subtraction.
- Plot the markers exclusiveness.

What now:
- Copy the two txt files contained in "Test - 4096x4096/Files/" into the directory "Test - 4096x4096 - Features".
- Run the second part (features extraction).
$ python Example2_Part2.py

The difference with example 1 is that this example is split into two parts (as it should always be): registration/segmentations, then features extraction. I pre-filled the files RoundsCyclesTable.txt and CellTypesTable.txt for you. When processing real data, you'll have to do it yourself based on the exclusive markers plots, the markers present, and the images "Scene XXX - Marker SB…png" if the background subtraction was performed.
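
The copy step between the two parts can also be scripted. Here is a minimal sketch using only the standard library. The function name copy_tables is invented for this example, and it simply copies every .txt file it finds, which ASSUMES the source directory holds nothing but the two table files.

```python
import shutil
from pathlib import Path


def copy_tables(source_dir, features_dir):
    """Copy every .txt file from source_dir into features_dir."""
    source, target = Path(source_dir), Path(features_dir)
    if not target.is_dir():
        # The Features directory is created by the first part of the pipeline.
        raise FileNotFoundError(f"'{target}' not found, run part 1 first")
    copied = []
    for table in sorted(source.glob("*.txt")):
        shutil.copy2(table, target / table.name)
        copied.append(table.name)
    return copied


# Usage, with the directories of example 2:
# copy_tables("Test - 4096x4096/Files/", "Test - 4096x4096 - Features")
```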

Here you are!!!

7 - Troubleshooting

In case of an error or any question about an unexpected behavior of the pipeline, please:
- Send me the pipeline outputs
- Send me the pipeline errors if generated into a separate file
- Send ALL the log files from ALL the directories.
It's the only way I'll be able to understand what happened.
If necessary, I'll ask you for a few images to reproduce the results.
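
To gather the log files in one go, a sketch like the following can be used. It walks the three output directories and zips what it finds. The function name bundle_logs and the archive name are made up, and the '*.log' pattern is an ASSUMPTION: adjust it to whatever extension your log files actually have.

```python
import zipfile
from pathlib import Path

STEPS = ("Registration", "Segmentation", "Features")


def bundle_logs(images_directory, archive="cycIFAAP_logs.zip", pattern="*.log"):
    """Zip the files matching pattern found in the three output directories."""
    base = Path(images_directory)
    added = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as bundle:
        for step in STEPS:
            folder = base.with_name(f"{base.name} - {step}")
            if not folder.is_dir():
                continue  # this step has not been run
            for log in sorted(folder.rglob(pattern)):
                # Keep the output directory name so files stay distinguishable.
                arcname = f"{folder.name}/{log.relative_to(folder).as_posix()}"
                bundle.write(log, arcname)
                added.append(arcname)
    return added


# Usage:
# bundle_logs("/Path/To/Your/Data/")
```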

8 - Questions and suggestions

Do NOT hesitate to email me with any questions, suggestions, or improvement ideas you may have.
If you do not get an answer within 24 hours, please send me a second email, I always answer quickly.
