Getting started using CellProfiler from the command line

Beth Cimini

Cross-posted from our GitHub wiki; check there for any future updates!

Running CellProfiler from the command line has a major advantage — you don't need to spend computational power or memory creating the graphical user interface (GUI) that you're used to using in CellProfiler!

It also has a major potential disadvantage: while CellProfiler running graphically on your desktop will automatically use as many cores on your computer as you permit, CellProfiler running headlessly from the command line will only run one job at a time.

If you're willing to script some ways to scatter the CellProfiler jobs, though, or don't mind running one job at a time, headless processing can be incredibly helpful!

A general CellProfiler headless command will have a few common pieces, and a few that will change depending on your exact setup.

Components
 

Required components
 

cellprofiler -c -r -p /path/to/pipeline/file.cppipe -o /path/where/the/output/goes

Let's break each one of these down:

cellprofiler : If you've installed CellProfiler from source, rather than downloading it prebuilt from our website, just typing cellprofiler is how you'll call it from the terminal. If not, this needs to be the path to your existing CellProfiler executable, which for a PC will look something like C:\Users\UserName\ProgramFiles\CellProfiler\CellProfiler.exe , and for a Mac will look something like /Applications/CellProfiler.app/Contents/MacOS/cp . You can drag and drop the executable into your terminal window, which is typically easier than typing the path out.

-c -r : The flags you need to run CellProfiler headless: -c starts CellProfiler without the GUI, and -r runs the pipeline as soon as it is loaded. See Adapting CellProfiler to a LIMS environment for more information.

-p : The pipeline file (or the batch file created with the CreateBatchFiles module) that you want to execute.

-o : Where your output should go.
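As a rough illustration of how these required pieces fit together in a script, here is a minimal sketch. The variable names (CP_BIN, PIPELINE, OUTPUT_DIR, RUN) are made up for this example and the paths are the same placeholders used above; it only prints the command unless you explicitly ask it to run.

```shell
#!/bin/sh
# Hypothetical variable names and placeholder paths; replace with your own.
CP_BIN="${CP_BIN:-cellprofiler}"   # or the full path to your CellProfiler executable
PIPELINE="/path/to/pipeline/file.cppipe"
OUTPUT_DIR="/path/where/the/output/goes"

# Keep every piece as its own argument so paths containing spaces survive intact.
set -- "$CP_BIN" -c -r -p "$PIPELINE" -o "$OUTPUT_DIR"

# Dry run by default: print the command. Set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
    "$@"
else
    echo "$@"
fi
```

Quoting each variable matters most on machines where the executable or data lives in a folder with spaces in its name.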

Input component
 

Your input component can be one of a few different things — or might be omitted entirely, if you're using a batch file.

-i : Specifies the Default Input Directory (needed if any modules in your pipeline use files you've told them will be in the Default Input Directory). Unless you're using a batch file, CellProfiler will also scan all the files in this folder and try to make them into image sets for your pipeline. That's great if you don't want to specify input sets with one of the methods below, but be warned that it can be very slow if you point it at a huge directory!

--data-file : A CSV designed to work with the LoadData module (see the module help for more details).

--file-list : A list of files, one per line, that should be loaded by the Images module and then analyzed by CellProfiler.
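If you go the --file-list route, one way to produce the list is with find. The sketch below is only an illustration: the helper name make_file_list, the variable names, and the *.tif / *.tiff patterns are assumptions, so adjust them to your own image format and folders. The final command is printed rather than run.

```shell
#!/bin/sh
# Write one path per line for every matching image under a directory.
# $1: directory to search, $2: text file to write.
# The *.tif / *.tiff patterns are assumptions; match your own image format.
make_file_list() {
    find "$1" -type f \( -name '*.tif' -o -name '*.tiff' \) | sort > "$2"
}

# Hypothetical locations; replace with your own.
INPUT_DIR="${INPUT_DIR:-/path/with/input/files}"
FILE_LIST="${FILE_LIST:-filelist.txt}"

# Only build the list if the input directory actually exists.
if [ -d "$INPUT_DIR" ]; then
    make_file_list "$INPUT_DIR" "$FILE_LIST"
fi

# Dry run: print the command that would consume the list (remove 'echo' to run it).
echo cellprofiler -c -r -p /path/to/pipeline/file.cppipe \
    -o /path/where/the/output/goes --file-list "$FILE_LIST"
```

Passing an absolute directory to find yields absolute paths in the list, which avoids surprises if CellProfiler is launched from a different working directory.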

Optional grouping component
 

If you want to break your CellProfiler job up (to run in stages, or across many machines, or many terminal windows on the same machine), you may want to add a grouping component to your command. There are two major ways this can work:

-g : If the "Groups" module of your pipeline is turned on, you can use the grouping variables specified there to set which group this copy of CellProfiler should process. For example, if your pipeline is set up to group by Well, you might use -g Well=A01

-f -l : If you aren't grouping, or simply don't want to break your data that way, you can tell CellProfiler the first and last image set number to process. For example, to run image sets 1-5 you'd add -f 1 -l 5 to your command
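To scatter a run with -f and -l, you need a first and last image set number for each job. The following sketch computes those ranges; the helper name chunk_ranges and the numbers (12 image sets, 5 per job) are hypothetical, and the commands are printed instead of executed so you can inspect them first.

```shell
#!/bin/sh
# Print "first last" pairs covering image sets 1..total in chunks of a given size.
# $1: total number of image sets, $2: image sets per job.
chunk_ranges() {
    total=$1
    size=$2
    first=1
    while [ "$first" -le "$total" ]; do
        last=$((first + size - 1))
        # The final chunk may be shorter than the others.
        if [ "$last" -gt "$total" ]; then
            last=$total
        fi
        echo "$first $last"
        first=$((last + 1))
    done
}

# Hypothetical numbers: 12 image sets, 5 per job.
chunk_ranges 12 5 | while read -r first last; do
    # Dry run: print one command per chunk (remove 'echo' to run them one after another).
    echo cellprofiler -c -r -p /path/to/pipeline/file.cppipe \
        -o /path/where/the/output/goes -i /path/with/input/files \
        -f "$first" -l "$last"
done
```

Each printed line can then be pasted into its own terminal window or handed to whatever job scheduler you use.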

Put it all together:
 

cellprofiler -c -r -p /path/to/pipeline/file.cppipe -o /path/where/the/output/goes -i /path/with/input/files -g Well=A01
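If you want one such command per group, a small loop can generate them. This is a sketch under stated assumptions: the well names, the function name print_well_commands, and the per-well output subfolder are all invented for illustration, and the group values must match what your own Groups module produces.

```shell
#!/bin/sh
# Hypothetical well names; use the values your Groups module actually produces.
WELLS="A01 A02 A03"

# Print one headless command per well (dry run).
print_well_commands() {
    for well in $WELLS; do
        # Assumption: each well writes to its own subfolder of the output path.
        echo cellprofiler -c -r -p /path/to/pipeline/file.cppipe \
            -o "/path/where/the/output/goes/$well" \
            -i /path/with/input/files -g "Well=$well"
    done
}

print_well_commands
```

To run the jobs instead of printing them, remove the echo inside the loop.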

How can I learn more?
 

For a deeper exploration of all CellProfiler command line flags, please check out our much more detailed page on CellProfiler’s GitHub wiki: Adapting CellProfiler to a LIMS environment.

For a video demonstration, check out our YouTube video on the topic.

For running CellProfiler remotely on Amazon Web Services, check out our Distributed-CellProfiler repository as well as the video link above.