Skip to content

repository-like tool based on MongoDB, Celery, and Click

License

Notifications You must be signed in to change notification settings

jbp4444/mccRepository

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

mccRepository

repository-like tool based on MongoDB, Celery, and Click

reposadmin.py:

  • aimed at basic manipulation of the data in the database ** not for "normal" operations
  • file ops: create, read, update, delete, query, count, ckquery
  • filesys ops: fsyscreate, fsysread, fsysupdate, fsysdelete, fsysquery
  • dir ops: dircreate, dirread, dirupdate, dirdelete, dirquery ** directories are currently an odd mix of related-file-groups and directory/folders in file-system ** this is really to allow users to "claim" groups of files that they are responsible for
  • user ops: usercreate, userread, userupdate, userdelete, userquery
  • wipe: wipe out/delete ALL info in database (all file/fsys/dir/user data)
  • logpath: show logical path for a given file

nightly.py:

  • "normal" nightly/periodic operations
  • scan - walk a file-system and ingest new files
  • sync - push any new/missing files from primary to secondary ** submits copy_file tasks to back-end
  • verify - check fixity info on files ** submits verify_existing_file tasks to back-end
  • checkpolicy - checks files for compliance with "policies" ** file size >= 0; checksum is not blank; N copies of each file
  • attachdirs - scan all files and attach them to known directories

reposops.py:

  • "normal" operations
  • dedupe - list duplicate checksums
  • pending - query for files that have 'pending' info in their tags
  • cleantags - remove file-objs from db that match a tag-query
  • tagall - tag all files that match a given filename pattern
  • filestats - calculate some basic stats on all files in a subdir

tagops.py:

  • used for more complex tag-queries
  • tagfullquery - walk through every item in database (could be slow)
  • tagquery - Qfilter/Qcombination (faster) tag-query
  • count - count objects matching the (simple) tag-query
  • tagimages - simple scan for image-file metadata (all files in a subdir) ** submits tag_image_data tasks to the back-end ** png/jpeg/gif/tiff files; uses PIL to detect sizes and EXIF data ** stored into tag-list with keys 'image' (T/F), image_type/_mode/_width/_height/_device/_datetime
  • walktag - walk a subdir and tag all files in it

repos_worker.py:

  • main worker-bee code that processes background tasks
  • connects to celery-db (in mongo) and waits for tasks
  • includes repos_worksubs.py and repos_imagesubs.py (which includes repos_imagesubs_exif.py)

reposrepair.py:

  • more of an example on how to write aux modules to work with the system ** works with repos_filerepair.py on the backend
  • calcrepair - submits calc_repair tasks for all files in a subdir ** stores repair info into tag-list (key is 'filerepair')
  • verifyrepair - submits verify_repair tasks for all files in a subdir
  • wiperepair - submits wipe_repair tasks for all files in a subdir ** removes the 'filerepair' key from the tag-list

tapecheck.py:

  • simulator for tape-based file-systems
  • tapescan - process simulated data on a per-tape basis ** assumes a way to get all the files that live on a given tape

About

repository-like tool based on MongoDB, Celery, and Click

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published