Outputs

The tool creates several files and folders in the specified output directory. It generates a separate directory for each fold (e.g. 2.image_embedding_fold1, 2.image_embedding_fold2). The outputs are listed and described below.

  • image_emd.tsv:

    A tab-separated file containing the generated embedding for each image. Each row corresponds to an image: the first column holds the row identifier and the remaining columns hold the embedding vector.

                1               2               3               4
    BPTF        -0.037030112    -0.139459819    0.417184144     0.386600941
    KAT2B       0.02969132      -0.139459819    -0.038685802    0.136547908
    PARP1       -0.037030112    -0.139459819    0.540370524     0.119614214
    MSL1        0.18169874      -0.139459819    -0.038685802    0.152157351
    KAT6B       -0.037030112    -0.139459819    0.308141887     0.257056117

  • labels_prob.tsv:

    This tab-separated file contains, for each image, a probability score for each of the 28 possible protein labels (e.g., Nucleoplasm, N. membrane).

                Nucleoplasm     N. membrane     Nucleoli        N. fibrillar c.
    BPTF        0.740698278     0.270941526     0.147179633     0.149313971
    KAT2B       0.38626197      0.092356719     0.36738047      0.238842875
    PARP1       0.596435964     0.100168504     0.382214785     0.179471999
    MSL1        0.195862561     0.01370267      0.101418771     0.038516384
    KAT6B       0.606423676     0.101763181     0.337655455     0.201311186

  • model.pth:

    The pre-trained DenseNet model used for image embedding.

  • blue_resize:

    This directory contains images that are processed in the blue channel.

  • green_resize:

    This directory contains images that are processed in the green channel.

  • red_resize:

    This directory contains images that are processed in the red channel.

  • yellow_resize:

    This directory contains images that are processed in the yellow channel.
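
The two TSV files described above (image_emd.tsv and labels_prob.tsv) can be read with Python's standard csv module. The sketch below is illustrative only and is not part of the tool: the function names read_tsv_matrix and top_label are hypothetical, and it assumes that the first cell of every data row is the row identifier and that the header row names the value columns (with or without a leading cell for the identifier column).

```python
import csv


def read_tsv_matrix(path):
    """Return (column_names, {row_id: [float, ...]}) from a tab-separated file.

    Assumption: the first cell of each data row is the row identifier and
    every other cell parses as a float.
    """
    with open(path, newline="") as handle:
        rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    header, data = rows[0], rows[1:]
    table = {row[0]: [float(value) for value in row[1:]] for row in data}
    width = len(data[0]) - 1 if data else len(header)
    # Keep only the trailing header cells so that a header with or without
    # a leading identifier cell is handled the same way.
    columns = header[-width:] if width else []
    return columns, table


def top_label(columns, probabilities):
    """Return the column name with the highest score in one row."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return columns[best]
```

For example, calling read_tsv_matrix on the labels_prob.tsv file inside one of the fold directories and then top_label(columns, table["BPTF"]) would return the label with the highest probability for that row. The same reader works for image_emd.tsv, where each row's list of floats is the embedding vector.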

Logs and Metadata

  • output.log:

    A log file detailing the activities and potential issues encountered during the image embedding process.

  • error.log:

    If any errors occur during the execution of the script, they will be recorded in this log file.

  • ro-crate-metadata.json:

    Metadata in RO-Crate format, a community effort to establish a lightweight approach to packaging research data with their metadata.

    The main object contains an identifier (@id), a type (@type), a name, a description, keywords, and isPartOf, which describes the hierarchical relationship (organization and project).

    Graph: The @graph key contains an array of objects that detail other entities related to the main dataset:

      a. Metadata, Datasets, Software
      b. Output Files: details of the output files generated by the tool.
      c. Images: details about specific image files, including keywords, descriptions, formats, and content URLs.
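
Because ro-crate-metadata.json is plain JSON, the entities under @graph can be inspected with the standard json module. The following is a minimal sketch, not the tool's own code: the function name entities_by_type is hypothetical, and it assumes only what is described above, namely a top-level @graph array whose objects carry @id and @type keys (where @type may be a single string or a list of strings).

```python
import json
from collections import defaultdict


def entities_by_type(crate_path):
    """Group the @id of every @graph entity by its @type."""
    with open(crate_path) as handle:
        crate = json.load(handle)
    groups = defaultdict(list)
    for entity in crate.get("@graph", []):
        types = entity.get("@type", "Unknown")
        # @type may be a single string or a list of strings.
        if isinstance(types, str):
            types = [types]
        for type_name in types:
            groups[type_name].append(entity.get("@id"))
    return dict(groups)
```

The returned dictionary maps each type name to the identifiers of the entities of that type, which gives a quick overview of what the crate records about datasets, software, output files, and images.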