Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read .ndjson CryoET Files #493

Draft
wants to merge 13 commits into
base: main
Choose a base branch
from
Draft

Read .ndjson CryoET Files #493

wants to merge 13 commits into from

Conversation

BradyAJohnston
Copy link
Owner

@BradyAJohnston BradyAJohnston commented May 9, 2024

For instancing of particles, reads .ndjson files and creates points for instancing.

Currently just the initial barebones importer which can parse the positions and the rotations for the points. Doesn't currently handle particle picking or anything similar.

Probably requires some refactoring of naming for the parsing classes. The importer panel should really be "CryoET" or something similar, with the possibility to open .starfile or .ndjson point instancing files.

Copy link

codecov bot commented May 9, 2024

Codecov Report

Attention: Patch coverage is 43.39623% with 30 lines in your changes are missing coverage. Please review.

Project coverage is 74.80%. Comparing base (50c5cbd) to head (09dbd99).
Report is 27 commits behind head on main.

Current head 09dbd99 differs from pull request most recent head 788c0ea

Please upload reports for the commit 788c0ea to get more accurate results.

Files Patch % Lines
molecularnodes/io/parse/star.py 36.17% 30 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #493      +/-   ##
==========================================
- Coverage   75.28%   74.80%   -0.49%     
==========================================
  Files          41       41              
  Lines        3731     3770      +39     
==========================================
+ Hits         2809     2820      +11     
- Misses        922      950      +28     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jojoelfe
Copy link
Contributor

I tested this using https://cryoetdataportal.czscience.com/runs/643 . The import works fine, but there are some rough edges:

  1. By default the starfile nodegroup (which probably should be similarly renamed, maybe AnnotationEnsemble ) does not read the rotation:

image

but this is just a simple fix in the nodetree:

image

  1. The scaling of the coordinates seem off, I will look into this.

The biggest shame is that the display of the tomograms will be nicest when we have the "Sample Volume" node in 4.2. I used this in a recent preprint and it came out really nice:

https://www.youtube.com/watch?v=uKoxomgbAzY

@jojoelfe
Copy link
Contributor

OK, looks like the location coordinates are in pixels of the tomogram, but all densities get scaled to be in physical coordinates (as it should be). Unfortunately, the ndjson has no info about the pixel size, so for now we have to create a way for the user to manually enter the pixelsize or connect it with the volume, so the pixelsize can be read from there.

@BradyAJohnston
Copy link
Owner Author

I think we can add a 'scale pixel size' to the node group which defaults to 1.0. Don't need to use it if the coordinates have been input but can use if needed for these datasets.

@BradyAJohnston
Copy link
Owner Author

And yes I think we can rename the ensemble node group as well to be consistent.

@jojoelfe
Copy link
Contributor

Actually, I am pretty sure the rotations are wrong. I did some tests with this: https://cryoetdataportal.czscience.com/runs/506?metadata=run&tab=metadata and the Z-axes (blue arrow) should point out of the viral particles. (Inverting the rotation doesn't help either)

image

I just saw you are working on the matrix attribute, so maybe the thing to do is to write the matrix directly in there and go from there.

@BradyAJohnston
Copy link
Owner Author

BradyAJohnston commented May 23, 2024

The matrix data format won't be until Blender 4.2 in early July. Currently I am taking the xyz_rotation_matrix and turning it into a 4x4 matrix. I believe it's currently filling the matrix row-wise first, but it might be defined column-wise? Do you know if there is a spec somewhere for the .ndjson? That way we can support more potential column names etc as well.

if has_rotation:
matrix[:3, :3] = data['xyz_rotation_matrix']

@jojoelfe
Copy link
Contributor

I haven't found a spec. I just asked in their repo how the matrix is defined: chanzuckerberg/cryoet-data-portal#749

@BradyAJohnston
Copy link
Owner Author

Great - thanks @jojoelfe !

@jojoelfe
Copy link
Contributor

After some more tinkering:

if has_rotation:
        matrix[:3, :3] = data['xyz_rotation_matrix']
        matrix[:3, :3] = np.flip(matrix[:3, :3])

makes things work:

2024-05-23.11-45-00.mp4

@BradyAJohnston
Copy link
Owner Author

Thanks for figuring that out @jojoelfe - I've fixed the rotation and added a test for some basic import. Will be interested what they say about spec for the data formats.

@BradyAJohnston
Copy link
Owner Author

BradyAJohnston commented Jan 10, 2025

I'm revisiting this. Since working on this a bunch of internal stuff has changed, including recently #703 which changed handling of cisTEM / RELION starfiles.

I've just updated all of the code for this, and the importer now works properly, but I am still not 100% on the rotations.

EDIT: remembered we already discussed scaling of positions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants