Unusual multi-FCS files #24

photocyte · 2020-09-10T04:40:05Z

Hi there,

I've come across FCS files (from the Luminex Muse) that implement multi-FCS by simply concatenating single FCS files together. This was my solution to split them:

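import glob

# Some FCS files are just literal concatenations of single FCS files.
# Split them at each "FCS3.0" marker and write every piece to its own file.
for f in glob.glob("ADM_*.VIA.FCS"):
    with open(f, "rb") as handle:
        data = handle.read()
    split_data = data.split(b"FCS3.0")
    # split_data[0] is whatever precedes the first marker (normally empty), so start at 1.
    for s in range(1, len(split_data)):
        with open(f + "_" + str(s) + ".FCS", "wb") as out:
            # split() drops the marker, so put it back at the start of each piece.
            out.write(b"FCS3.0" + split_data[s])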

Once these multi-FCS files are split, fcsparser works perfectly, as far as I can tell. But it might be nice for the library to be able to detect these files by default! See attached for an example FCS:
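As a rough idea of what such detection could look like, here is a minimal sketch that finds the start of every concatenated file in memory. It assumes that each piece begins with a standard FCS HEADER, i.e. a version string such as FCS3.0 followed by four spaces. The names FCS_HEADER_START and split_concatenated_fcs are made up for this example and are not part of fcsparser. Matching the trailing spaces only reduces the chance of a false hit inside binary event data; it does not rule it out.

```python
import re
from typing import List

# Version string (e.g. b"FCS3.0") followed by the four spaces that pad
# bytes 6-9 of an FCS HEADER. Hypothetical helper pattern for this sketch.
FCS_HEADER_START = re.compile(rb"FCS\d\.\d {4}")


def split_concatenated_fcs(data: bytes) -> List[bytes]:
    """Split raw bytes into one chunk per apparent FCS header.

    Returns an empty list if no header marker is found at all.
    """
    starts = [m.start() for m in FCS_HEADER_START.finditer(data)]
    if not starts:
        return []
    # Each chunk runs from its own marker to the next marker (or end of data).
    ends = starts[1:] + [len(data)]
    return [data[s:e] for s, e in zip(starts, ends)]
```

A file could then be flagged as concatenated when len(split_concatenated_fcs(data)) is greater than 1.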
ADM_09SEP2020_181310.VIA.FCS.zip


maaikesangster · 2021-10-07T09:50:54Z

Hello,

I had the same issue, thank you for this solution! I am using the cytoflow package for my parsing and analysis of FC data and I wanted to raise the issue with them as well. Do you mind if I use your example?

photocyte · 2021-10-07T13:56:30Z

@maaikesangster Please feel free

bpteague · 2022-01-12T04:02:05Z

Do note that fcsparser supports choosing which dataset in the file to parse out. You can use the data_set keyword argument to the FCSParser constructor. It's 0-indexed -- so data_set = 0 is the first data set, data_set = 1 is the second, etc.
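For illustration, a small sketch of looping over the data sets with that keyword. It assumes you already know how many data sets the file holds (n_sets) and that fcsparser.parse returns a (meta, data) pair, as in the call shown elsewhere in this thread. parse_data_sets and its parse argument are hypothetical names; the argument only exists so the loop can be tried without a real file.

```python
def parse_data_sets(path, n_sets, parse=None):
    """Parse data sets 0 .. n_sets-1 from one file and return the results.

    `parse` defaults to fcsparser.parse; pass another callable with the same
    (path, data_set=...) interface to exercise the loop without fcsparser.
    """
    if parse is None:
        import fcsparser  # imported lazily so the sketch loads without it
        parse = fcsparser.parse
    # data_set is 0-indexed: 0 is the first data set, 1 the second, etc.
    return [parse(path, data_set=i) for i in range(n_sets)]
```

For a file with four data sets, parse_data_sets(path, 4) would return four (meta, data) pairs, provided each data set can be located correctly.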

bpteague · 2022-01-12T04:02:24Z

(And @maaikesangster , cytoflow exposes the same functionality in ImportOp)

photocyte · 2022-01-12T20:54:35Z

For me, for a file with 4x concatenated FCS files, this works for data_set=0 and data_set=1, but it fails for data_set=2 and data_set=3:

meta, data = fcsparser.parse(f, data_set=2)

Encountered an illegal utf-8 byte in the header.
 Illegal utf-8 characters will be ignored.
'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte
20220112_files/ADM_12JAN2022_112816.VIA.FCS

All 4 files can be opened successfully when first separated via this approach (#24 (comment)). Happy to share all 5 files (original + 4 split) if desired.

edit: here is the full error message

~/miniconda3/lib/python3.9/site-packages/fcsparser/api.py in parse(path, meta_data_only, compensate, channel_naming, reformat_meta, data_set, dtype)
    538     read_data = not meta_data_only
    539 
--> 540     fcs_parser = FCSParser(path, read_data=read_data, channel_naming=channel_naming,
    541                            data_set=data_set)
    542 

~/miniconda3/lib/python3.9/site-packages/fcsparser/api.py in __init__(self, path, read_data, channel_naming, data_set)
    105         if path:
    106             with open(path, 'rb') as f:
--> 107                 self.load_file(f, data_set=data_set, read_data=read_data)
    108 
    109     def load_file(self, file_handle, data_set=0, read_data=True):

~/miniconda3/lib/python3.9/site-packages/fcsparser/api.py in load_file(self, file_handle, data_set, read_data)
    117         while data_segments <= data_set:
    118             self.read_header(file_handle, nextdata_offset)
--> 119             self.read_text(file_handle)
    120             if '$NEXTDATA' in self.annotation:
    121                 data_segments += 1

~/miniconda3/lib/python3.9/site-packages/fcsparser/api.py in read_text(self, file_handle)
    215         #####
    216         # Parse the TEXT segment of the FCS file into a python dictionary
--> 217         delimiter = raw_text[0]
    218 
    219         if raw_text[-1] != delimiter:

IndexError: string index out of range

It seems data_set looks for the $NEXTDATA keyword to find the next data set, whereas the example FCS files I've uploaded are just whole, separate files concatenated together, so their data sets are instead separated by the FCS start bytes FCS3.0.
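One way to check whether a given byte offset really is the start of a data set is to read the fixed-width HEADER there. The sketch below assumes the standard FCS 3.0 HEADER layout: a 6-byte version string, 4 spaces, then six 8-byte ASCII fields for the TEXT, DATA and ANALYSIS segment offsets. read_header_at is a hypothetical helper, not an fcsparser function, and it returns the offsets exactly as stored.

```python
HEADER_FIELDS = [
    "text_start", "text_end",
    "data_start", "data_end",
    "analysis_start", "analysis_end",
]


def read_header_at(data: bytes, start: int) -> dict:
    """Decode the 58-byte FCS HEADER that begins at byte `start` of `data`."""
    header = data[start:start + 58]
    result = {"version": header[0:6].decode("ascii", errors="replace")}
    for i, name in enumerate(HEADER_FIELDS):
        # Each field is 8 ASCII characters, right-justified and space-padded.
        raw = header[10 + 8 * i:18 + 8 * i].strip()
        result[name] = int(raw) if raw.isdigit() else None
    return result
```

If the version is not an FCS string or the offsets come back as None, the parser is probably not positioned at the start of a data set, which would be consistent with the decode warning and the empty TEXT segment in the traceback above.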

bpteague · 2022-01-12T21:44:19Z

@photocyte I'd love to add it to my collection of weird FCS files (: And if I can figure out the fix, I'll submit a pull request to @eyurtsev .

photocyte · 2022-01-12T21:50:55Z

Thanks @bpteague! See the linked zip file below. It contains the split-off FCS files (_1, _2, _3, _4), plus the original FCS file ADM_12JAN2022_112816.VIA.FCS.

20220112_files.zip

I also realized that I previously uploaded a file here (#24 (comment)) that should show the same phenomenon, but that one may not already be split out.

bpteague · 2022-01-13T02:23:06Z

@photocyte Thanks for the file. I found the problem, and the fix is easy. In fcsparser.api, on line 125, replace

nextdata_offset = self.annotation['$NEXTDATA']

with

nextdata_offset += self.annotation['$NEXTDATA']

@eyurtsev, I'll put together a test case and a PR.
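To spell out why the += matters, here is a minimal sketch, assuming (as the fix implies) that each $NEXTDATA value is counted from the start of its own data set rather than from the start of the file. absolute_starts is a hypothetical helper written for this comment, not code from fcsparser, and the byte counts in the usage note are invented.

```python
def absolute_starts(relative_nextdata):
    """Return the absolute start offset of every data set in a file.

    `relative_nextdata` holds each data set's $NEXTDATA value in file order,
    measured from the start of that data set; 0 means "no further data set".
    """
    starts = []
    offset = 0
    for nextdata in relative_nextdata:
        starts.append(offset)
        if nextdata == 0:
            break
        # Accumulate: the next data set starts `nextdata` bytes after this one.
        offset += nextdata
    return starts
```

With $NEXTDATA values of 1000, 1200, 900 and 0, this gives starts of 0, 1000, 2200 and 3100. Plain assignment would only be right for the second data set, where the relative and absolute offsets happen to coincide, which matches data_set=0 and data_set=1 working while data_set=2 and data_set=3 fail.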

maaikesangster mentioned this issue Oct 7, 2021

Opening an fcs file with multiple tubes cytoflow/cytoflow#322
