Has anyone tried running the ML code on a cluster?

Hello there,
I’m trying to get this software to run on my university’s cluster (https://hpcf.umbc.edu/), and I was wondering if anyone has tried the same. If so, how did you get things like Jupyter Notebook working, or get the curator windows to pop up on your own local machine? Right now I have tried to run the curator on a little mini test (just to see if it would work), and after working through some issues, I got it to say the following:

I think somewhere in there a window is supposed to pop up for the whole good/bad selection, but uh…I get nothing. :sweat_smile:

Dear @mramsahoye ,

Indeed, it is not straightforward to forward pop-up windows from a remote cluster to your local machine. Depending on how your storage system is set up, it might be possible to mount the remote storage directly on your local machine; then you can run the curator locally (the curator does not require a GPU, so most basic PCs should work).

For your specific issue, it looks like the curator did not find any valid data. I cannot see the full command line in your screenshot; can you double-check how you specified --data_type? Here, you want to specify the file type of the data you are working on, e.g. .tiff or .tif or .ome.tif, etc. A common mistake I make myself is mistyping .tif as .tiff.
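One quick way to avoid the .tif/.tiff mix-up is to tally the extensions actually present in your data folder before setting --data_type. A minimal sketch (the throwaway folder here just stands in for your raw-image directory; `tally_extensions` is a hypothetical helper, not part of the package):

```python
import tempfile
from collections import Counter
from pathlib import Path

def tally_extensions(folder):
    """Count file extensions in a folder so --data_type matches reality."""
    return Counter(p.suffix for p in Path(folder).iterdir() if p.is_file())

# Demo with a scratch folder standing in for your raw-image directory
tmp = Path(tempfile.mkdtemp())
for name in ["a.tiff", "b.tiff", "c.tif"]:
    (tmp / name).touch()

print(tally_extensions(tmp))  # e.g. Counter({'.tiff': 2, '.tif': 1})
```

Whatever extension dominates the tally is what you should pass to --data_type.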

(More details here: aics-ml-segmentation/bb2.md at main · AllenCell/aics-ml-segmentation (github.com))

Let me know if you have more questions.


Hi there! Thanks for the response; sorry, I was in the middle of a few midterms, so I had to step back from what I was doing for a bit.
Yeaaaa, I actually did make that mistake with .tif/.tiff, so I will try it again. I will also look into how to mount remote storage; I hadn’t heard of that, so thanks for the suggestion. :slight_smile: I also wanted to ask: you said the curator doesn’t require a GPU, so it’s fine to run on a normal PC. The main bulk of the GPU usage then comes from the trainer, right?

Yes, that is correct. Curation can be done on any machine, while training and testing need to be done on GPU machines.
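If you want to confirm a given machine can handle the training/testing steps, a quick check is whether PyTorch (which the trainer is built on) can see a CUDA device. A small sketch that degrades gracefully when PyTorch is not installed:

```python
def gpu_status():
    """Report whether this machine can run the GPU-bound training steps.

    Assumes the trainer uses PyTorch; falls back gracefully if torch
    is not installed (e.g. on a curation-only machine).
    """
    try:
        import torch
        return f"CUDA available: {torch.cuda.is_available()}"
    except ImportError:
        return "PyTorch not installed on this machine"

print(gpu_status())
```

Run this on the cluster's GPU node before launching a long training job; "CUDA available: False" there usually means you landed on a CPU-only node or the CUDA toolkit module isn't loaded.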

Hello again!

So I am trying to run the demo on the cluster (Demo 2: aics-ml-segmentation/demo_2.md at main · AllenCell/aics-ml-segmentation · GitHub) and am trying to implement the “batch processing” portion with the following command:

--workflow_name /Workflow/lmnb1_interphase
--output_dir LMNB1_fluorescent_classic_seg
--struct_ch 0
--input_dir /LMNB1_fluorescent/e4245ede_LMNB1_fluorescent
--data_type .tiff

However, it gives me the following batch_processing error:

argument mode: invalid choice: '/LMNB1_fluorescent/e4245ede_LMNB1_fluorescent' (choose from 'per_img', 'per_dir', 'per_csv')

I was reading the raw batch_processing.py file and there are changes (understandable, since it has been a while and Demo 2 is a bit outdated), but I’m still confused about the syntax; where does the “per_dir” go?

Dear @mramsahoye ,

Thanks for your interest and sorry for the late reply. You can find the documentation here: aics-ml-segmentation/bb1.md at main · AllenCell/aics-ml-segmentation (github.com)

In short, you need to add either per_img or per_dir (as the first argument) before you specify the image filepath or folder filepath.
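The error message's "choose from ('per_img', 'per_dir', 'per_csv')" suggests the script defines a positional mode argument, which must come before the flags; your folder path was being read as the mode. A tiny argparse replica (not the project's actual code, just an illustration of the required argument order, with flag names copied from the thread):

```python
import argparse

# Toy replica of the CLI surface implied by the error message.
# This is NOT the real batch_processing script, only a sketch of
# why a positional "mode" must appear before the path flags.
parser = argparse.ArgumentParser()
parser.add_argument("mode", choices=["per_img", "per_dir", "per_csv"])
parser.add_argument("--workflow_name")
parser.add_argument("--input_dir")
parser.add_argument("--output_dir")
parser.add_argument("--struct_ch", type=int, default=0)
parser.add_argument("--data_type", default=".tiff")

# With per_dir supplied first, parsing succeeds
args = parser.parse_args([
    "per_dir",
    "--workflow_name", "/Workflow/lmnb1_interphase",
    "--input_dir", "/LMNB1_fluorescent/e4245ede_LMNB1_fluorescent",
    "--data_type", ".tiff",
])
print(args.mode)  # -> per_dir
```

So in your command, simply put per_dir first, then all your --flags as before.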