I occasionally have Python programs that take a long time to run, and that I want to be able to save the state of and resume later. Does anyone have a clever way of saving the state either every x seconds, or when the program is exiting?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
Put all of your “state” data in one place and use a pickle.
The pickle module implements a fundamental, but powerful algorithm for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream is converted back into an object hierarchy. Pickling (and unpickling) is alternatively known as “serialization”, “marshalling,” 1 or “flattening”, however, to avoid confusion, the terms used here are “pickling” and “unpickling”.
Method 2
If you want to save everything, including the entire namespace and the line of code currently executing to be restarted at any time, there is not a standard library module to do that.
As another poster said, the pickle module can save pretty much everything into a file and then load it again, but you would have to specifically design your program around the pickle module (i.e. saving your “state” — including variables, etc — in a class).
Method 3
If you ok with OOP, consider creating a method for each class that output a serialised version ( using pickle ) to file. Then add a second method to load in the instance the data, and if the pickled file is there you call the load method instead of the processing one.
I use this approach for ML and it really seed up my workflow.
Method 4
In the traditional programming approach the obvious way to save a state of variables or objects after some point of execution is serialization.
So if you want to execute the program after some heavy already computed state we need to start only from the deserialization part.
These steps will be mostly needed mostly in data science modelling where we load a huge CSV or some other data and compute it and we do not want to recompute every time we run a program.
http://jupyter.org/ – A tool which does these serialization/deserialization automatically without you doing any manual work.
All you need to do is do execute a selected portion of your python code let us say line 10-15, which are dependent on pervious lines 1-9. Jupyter saves the state of 1-9 for you. Explore a tutorial it and give it a try.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0