
Sampling from episodic loader gives error - "Key image doesn't exist (select from [])!" #19

Open
patricks-lab opened this issue Nov 28, 2022 · 2 comments

Comments

@patricks-lab

patricks-lab commented Nov 28, 2022

When sampling from the episodic loader, everything usually works fine until I hit the following error:

Traceback (most recent call last):
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 145, in get_next
    sample_dic = next(self.class_datasets[class_id])
TypeError: 'TFRecordDataset' object is not an iterator
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 219, in get_next
    dataset = next(self.dataset_list[source_id])
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 121, in __iter__
    sample_dic = self.get_next(class_id)
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 148, in get_next
    sample_dic = next(self.class_datasets[class_id])
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/utils.py", line 23, in cycle_
    yield next(iterator)
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/tfrecord/reader.py", line 222, in example_loader
    feature_dic = extract_feature_dict(example.features, description, typename_mapping)
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/tfrecord/reader.py", line 162, in extract_feature_dict
    raise KeyError(f"Key {key} doesn't exist (select from {all_keys})!")
KeyError: "Key image doesn't exist (select from [])!"
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 201, in __iter__
    next_e = self.get_next(rand_source)
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 222, in get_next
    dataset = next(self.dataset_list[source_id])
StopIteration

For reference, I am using an older version of your repo (https://github.com/mboudiaf/pytorch-meta-dataset/tree/c6d6922003380342ab2e3509425d96307aa925c5) and sampling from the episodic loader, which I build as follows:

episodic_dataset = pipeline.make_episode_pipeline(dataset_spec_list=all_dataset_specs,
                                                  split=split,
                                                  data_config=data_config,
                                                  episode_descr_config=episod_config)
episodic_loader = DataLoader(dataset=episodic_dataset,
                             batch_size=meta_batch_size,
                             num_workers=data_config.num_workers,
                             worker_init_fn=seeded_worker_fn)
# Sample a batch of size [B, N*K, C, H, W] from the episodic loader via next(iter(episodic_loader)),
# where B = meta_batch_size, N*K = n_ways*k_shots, C = channels, H = image height, W = image width

Do you know what may be causing the KeyError: "Key image doesn't exist (select from [])!" and the StopIteration that follows it? For the run above, I use 5-way 15-shot episodes for train and 5-way 5-shot episodes for test/validation, with meta_batch_size 2 for train and 4 for test/val.
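For what it's worth, the trailing StopIteration may just be a side effect of the KeyError rather than a separate problem. Judging only from the traceback, get_next seems to call next() on the same generator again (line 222) while handling the first failure (line 219); that retry logic is my assumption, I have not confirmed it in the source. In Python, once an exception escapes a generator, the generator is finished and every later next() raises StopIteration. A minimal sketch with a hypothetical stand-in generator:

```python
def samples():
    # Hypothetical stand-in for the per-source dataset generator:
    # it yields one good item, then fails like extract_feature_dict did.
    yield "ok"
    raise KeyError("Key image doesn't exist (select from [])!")


gen = samples()
first = next(gen)  # the first item comes out normally

try:
    next(gen)  # the KeyError escapes the generator here
except KeyError as err:
    first_error = err  # this is the real problem

# The generator is now finished, so a retry can only raise StopIteration.
try:
    next(gen)
    retry_outcome = "value"
except StopIteration:
    retry_outcome = "StopIteration"
```

If that reading is right, the KeyError is the error to chase and the StopIteration only hides it.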

EDIT: sometimes sampling from the episodic loader also gets stuck in an infinite loop.

Thanks a lot in advance!

@patricks-lab
Author

patricks-lab commented Nov 29, 2022

I also found out that the error seems to originate from utils.py (https://github.com/mboudiaf/pytorch-meta-dataset/blob/master/src/datasets/utils.py): sometimes my program hangs in an infinite loop inside cycle_, and sometimes it crashes with the error above.

So basically this is what happens:

First, in https://github.com/mboudiaf/pytorch-meta-dataset/blob/master/src/datasets/pipeline.py the code attempts to fetch a sample from a class, using sample_dic = next(self.class_datasets[class_id]). This raises TypeError: 'TFRecordDataset' object is not an iterator, so the except clause wraps that class's dataset in cycle_() and tries again:

def get_next(self, class_id):
    try:
        sample_dic = next(self.class_datasets[class_id])
    except (StopIteration, TypeError) as e:
        self.class_datasets[class_id] = cycle_(self.class_datasets[class_id])
        sample_dic = next(self.class_datasets[class_id])
    return sample_dic
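If I read get_next correctly, the initial TypeError is probably expected and harmless: the class datasets start out as plain iterables (they define __iter__ but not __next__), and the except branch is what turns them into iterators on first use. A small sketch of that behaviour, where FakeDataset is a hypothetical stand-in and not the real TFRecordDataset:

```python
class FakeDataset:
    """Hypothetical stand-in: iterable (has __iter__) but not an iterator."""

    def __init__(self, items):
        self.items = items

    def __iter__(self):
        return iter(self.items)


ds = FakeDataset([1, 2])

# next() needs an iterator, so calling it on the dataset itself fails.
try:
    next(ds)
    raised = None
except TypeError as err:
    raised = str(err)

# Wrapping with iter() (as cycle_ does internally) makes it work.
it = iter(ds)
values = [next(it), next(it)]
```

So the first traceback in the chain is likely just the lazy initialisation path, not the root cause.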

In cycle_ (utils.py), the code yields the next sample from the iterator and, once the iterator is exhausted (StopIteration), resets it by calling iter(iterable) again:

def cycle_(iterable):
    # Creating custom cycle since itertools.cycle attempts to save all outputs in order to
    # re-cycle through them, creating amazing memory leak
    iterator = iter(iterable)
    while True:
        try:
            yield next(iterator)
        except StopIteration:
            iterator = iter(iterable)

The issue, I think, is that for some class the iterable is empty (something like iter([])). Could this be due to a malformed or corrupted tfrecord (I did specify a fixed 5-way, 10-shot task)? I suspect the class is empty because the message KeyError: "Key image doesn't exist (select from [])!" lists no available keys at all, which suggests that the class we are trying to iterate over has no usable data.
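To illustrate the hang: with an empty iterable, cycle_ gets StopIteration from next(), resets the iterator, gets StopIteration again, and never yields, so the caller blocks forever. Below is a minimal sketch of a guarded variant that fails loudly instead. The name cycle_guarded and the choice of ValueError are mine, not part of the repo.

```python
import itertools


def cycle_guarded(iterable):
    """Like cycle_, but raises instead of spinning when a pass yields nothing."""
    while True:
        produced = False
        for item in iterable:  # calls iter(iterable) afresh on every pass
            produced = True
            yield item
        if not produced:
            # A full pass yielded no items: cycling again would loop forever.
            raise ValueError("iterable produced no items; cannot cycle")


# Non-empty input cycles as usual.
first_five = list(itertools.islice(cycle_guarded([1, 2]), 5))

# Empty input raises immediately instead of hanging.
try:
    next(cycle_guarded([]))
    empty_result = "yielded"
except ValueError:
    empty_result = "ValueError"
```

Note that this would only explain the infinite loop. The KeyError is not a StopIteration, so cycle_ does not catch it; it may be a second symptom (a record that exists but carries no features) rather than the same one.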

So I am wondering whether I need to reinstall the Meta-Dataset files, or whether there is an issue with the code I'm running.
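One way to check the files before reinstalling anything would be to walk the TFRecord framing directly, without TensorFlow. The sketch below assumes the standard TFRecord layout (little-endian uint64 payload length, 4-byte length CRC, payload, 4-byte payload CRC); scan_tfrecord is a hypothetical helper of mine, and it skips the CRCs rather than verifying them.

```python
import os
import struct


def scan_tfrecord(path):
    """Return (num_records, num_empty_payloads) for one TFRecord file.

    Raises ValueError if the file ends in the middle of a record.
    """
    size = os.path.getsize(path)
    num_records = 0
    num_empty = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8 or len(f.read(4)) < 4:
                raise ValueError(f"truncated record header in {path}")
            (length,) = struct.unpack("<Q", header)
            # Payload plus its 4-byte CRC must fit in what is left of the file.
            if f.tell() + length + 4 > size:
                raise ValueError(f"truncated record payload in {path}")
            f.seek(length + 4, os.SEEK_CUR)  # skip payload and payload CRC
            num_records += 1
            if length == 0:
                num_empty += 1
    return num_records, num_empty
```

Running this over every class file of a dataset should show whether any file has zero records (which would match the hang) or zero-length payloads. A zero-length payload parses as a protobuf message with no fields, so it could plausibly produce the empty key list in the KeyError, though I have not verified that this is what the reader does.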

@brando90
Contributor

brando90 commented Dec 1, 2022

@mboudiaf I'm also interested in this
