explore filehash package as a means to use large PVs in a memory-efficient manner #23
We might want to change it from [...]
Hmm, could be... The write part is easy, the load part might be a bit trickier...
I thought a bit more about a usable piecewise serialisation and it's actually not too difficult to make this happen. The trick is to "instantiate" actual data inside functions only, such that one can let it go out of scope, truly freeing the associated memory. I fear, however, that it doesn't scale particularly well... The main problem occurs when left-folding the [...] After as many calls of [...] If this object was created during a [...] The [...] The return value of the [...] This all means that unless [...] I'm honestly not sure if I even want to attempt this given the foreseeable issues...
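A minimal sketch of the "instantiate data inside functions only" idea, using base R's `saveRDS()` and `readRDS()`. The helper names `write_piece` and `summarise_piece`, the file naming, and the random stand-in data are all hypothetical and not taken from the package discussed here.

```r
# Hypothetical helper: compute one piece and write it to disk.
# Only the file path is returned, so the piece itself can be
# garbage-collected once the function exits.
write_piece <- function(i, out_dir) {
  piece <- rnorm(1e5)  # stand-in for an expensive result
  path <- file.path(out_dir, sprintf("piece_%03d.rds", i))
  saveRDS(piece, path)
  path
}

# Hypothetical helper: load one piece, reduce it to a small summary,
# and let the loaded data go out of scope on return.
summarise_piece <- function(path) {
  piece <- readRDS(path)
  mean(piece)
}

out_dir <- tempfile("pieces_")
dir.create(out_dir)

paths <- vapply(1:10, write_piece, character(1), out_dir = out_dir)
means <- vapply(paths, summarise_piece, numeric(1), USE.NAMES = FALSE)
```

At no point does the calling scope hold more than the paths and the summaries. This only illustrates the scoping trick; it does not address the scaling concerns raised above.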
Finally, [...]
Maybe all of this can be realised without too much hassle using filehash, I'm not quite sure yet...
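For orientation, a rough sketch of what basic key-value storage with filehash looks like. It uses `dbCreate()`, `dbInit()`, `dbInsert()`, `dbList()` and `dbFetch()` from the filehash package; the key names and the random data are made up for illustration, and the block is skipped when the package is not installed.

```r
if (requireNamespace("filehash", quietly = TRUE)) {
  # Create an on-disk database in a temporary location and open it.
  db_path <- tempfile("pv_db_")
  filehash::dbCreate(db_path)
  db <- filehash::dbInit(db_path)

  # Store each (stand-in) result under its own key; nothing has to
  # stay in memory once it has been written.
  for (i in 1:5) {
    filehash::dbInsert(db, sprintf("fit_%d", i), rnorm(1e5))
  }

  # List the stored keys and load a single value on demand.
  keys <- filehash::dbList(db)
  one <- filehash::dbFetch(db, "fit_3")
}
```

Whether this maps cleanly onto the pv objects is exactly the open question here; the sketch only shows the storage primitives.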
Alright, the 22 GB object above, that was storing 72x24 = 1728 return values of [...] After extracting just [...]
Oh, perhaps because the bootstrap fit contains a closure at some point? Perhaps that pulls in the full environment because R does not know which part of the environment is needed for the closure to function. In principle the closure could make up variables and do [...]
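The effect can be reproduced in isolation. In this sketch (the function `make_fit` and its contents are hypothetical, not the actual bootstrap fit), a closure that only needs one number drags a large temporary along when serialized, because the whole enclosing environment is written out with it.

```r
make_fit <- function() {
  samples <- rnorm(1e6)  # large temporary, about 8 MB of doubles
  est <- mean(samples)
  # The closure only uses `est`, but its enclosing environment is the
  # frame of make_fit(), which also still holds `samples`.
  list(value = est, predict = function(x) est + x)
}

fit <- make_fit()

# Serialized size in bytes: includes `samples` via the closure's environment.
size_full <- length(serialize(fit, NULL))
```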
Yup, see HISKP-LQCD/hadron#187
Sorry, I don't fully understand the problem with the function and the scope yet. Where is the whole environment stored and why? Concerning the memory problem: can one decide in [...]
I mean, assuming we store only the [...]
I think I understand the scope problem now... What about putting the function in its own new environment?
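One way this could look, as a sketch under the assumption that the closure needs only a few known variables (the names `make_fit_lean`, `est` and `predict` are hypothetical): give the function a fresh environment whose parent is the global environment and copy in just what it uses.

```r
make_fit_lean <- function() {
  samples <- rnorm(1e6)  # large temporary that should not be saved
  est <- mean(samples)
  predict <- function(x) est + x

  # Replace the enclosing environment with a fresh one that holds
  # only the variable the closure actually needs.
  env <- new.env(parent = globalenv())
  env$est <- est
  environment(predict) <- env

  list(value = est, predict = predict)
}

fit <- make_fit_lean()

# Serialized size in bytes: `samples` is no longer reachable from the closure.
size_lean <- length(serialize(fit, NULL))
```

The catch is that one has to know which variables the function uses, which is the same limitation mentioned earlier in the thread.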
The way that I tried the value serialization was that instead of the actual value an S3-object [...]

    for (i in 1:nrow(param)) {
      result$value[[i]] <- func(param[i, ], value[[i]])
    }

With the value serialization it would check whether [...] The problem is that even when calling [...]
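A sketch of how such a placeholder could work, under the assumption that the S3 object stores a file path and the loop checks for the class before calling the function. The class name `stored_value`, the helper `resolve`, and the toy `param`, `value` and `func` are all hypothetical.

```r
# Hypothetical placeholder: write the value to disk, keep only the path.
stored_value <- function(x, path = tempfile(fileext = ".rds")) {
  saveRDS(x, path)
  structure(list(path = path), class = "stored_value")
}

# Load the value if it is a placeholder, otherwise pass it through.
resolve <- function(v) {
  if (inherits(v, "stored_value")) readRDS(v$path) else v
}

# Toy inputs standing in for the real parameter table and values.
param <- data.frame(a = 1:3)
value <- lapply(1:3, function(i) stored_value(rnorm(1e4)))
func <- function(p, v) p$a * mean(v)

result <- list(value = vector("list", nrow(param)))
for (i in seq_len(nrow(param))) {
  # The loaded data exists only as an argument of this call.
  result$value[[i]] <- func(param[i, , drop = FALSE], resolve(value[[i]]))
}
```

This does not by itself solve the problem described above; it only makes the check-and-load step explicit.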
I would like to play around with the filehash package to see if we can use it to lower memory consumption of pv objects. I've hit a point where I've performed some simple one-parameter fits and end up with a 22 GB object and that's just for the test ensemble (to be fair, there are many variations, but I'm still a bit perplexed about the actual size...).
Side remark: There also seems to be something fishy going on in that the size of the saved object seems to depend on current memory load, as if the entire environment were being saved to disk rather than just the object in question.