Quicker dataset experimentation #9

AmitMY · 2025-02-14T10:20:55Z

To catch errors early, I recommend swapping the order of the splits here:

multimodalhugs/multimodalhugs/data/datasets/pose2text.py

Lines 50 to 72 in adcc3ea

    
    return [
        datasets.SplitGenerator(
            name=datasets.Split.TRAIN,
            gen_kwargs={
                "metafile_path": self.config.train_metadata_dir,
                "split": f"{datasets.Split.TRAIN}"
            }
        ),
        datasets.SplitGenerator(
            name=datasets.Split.VALIDATION,
            gen_kwargs={
                "metafile_path": self.config.validation_metadata_dir,
                "split": "val"
            }
        ),
        datasets.SplitGenerator(
            name=datasets.Split.TEST,
            gen_kwargs={
                "metafile_path": self.config.test_metadata_dir,
                "split": f"{datasets.Split.TEST}"
            }
        ),
    ]

Proposed order: validation, test, then train.

Since the validation and test sets are smaller, I think it is most useful to run them first, so that if they fail, we get a signal right away.
For me, the train split takes about 8 hours, and it might fail only at the very end.
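
To make the proposed ordering concrete, here is a minimal sketch. It assumes that splits are prepared in the order they are returned, so the smaller ones come first. `SplitPaths` and `ordered_split_specs` are hypothetical names used only for illustration, and plain dicts stand in for `datasets.SplitGenerator` so the snippet runs without the library; the `"split"` labels are copied from the snippet above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SplitPaths:
    """Hypothetical stand-in for the builder config's three metadata paths."""
    train_metadata_dir: str
    validation_metadata_dir: str
    test_metadata_dir: str


def ordered_split_specs(config: SplitPaths) -> List[Dict[str, str]]:
    """Return split specs smallest-first: validation, test, then train."""
    return [
        {"name": "validation", "metafile_path": config.validation_metadata_dir, "split": "val"},
        {"name": "test", "metafile_path": config.test_metadata_dir, "split": "test"},
        # The long-running train split goes last, so cheap failures surface first.
        {"name": "train", "metafile_path": config.train_metadata_dir, "split": "train"},
    ]


# Placeholder file names, for illustration only.
specs = ordered_split_specs(SplitPaths("train.tsv", "val.tsv", "test.tsv"))
print([spec["name"] for spec in specs])
```

In the real builder, each spec would presumably become a `datasets.SplitGenerator(name=..., gen_kwargs={...})` entry in the returned list, in this same order.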

GerrySant closed this as completed in 236a450 Feb 14, 2025
