More examples for certain topics? Especially saving/loading data


Mattias Borwald

Mar 24, 2014, 11:51:47 AM
to deap-...@googlegroups.com
You did really good work with DEAP and also with the documentation!

Unfortunately, some parts of the documentation are a bit vague in my opinion. I managed to set up a nice STGP and it works like a charm. The problem is that I don't know how to save and load the individuals. I tried to use checkpoints/pickle as in the documentation, but it just outputs an error stating that pickle needs a string, not a bytes object.
The second possibility I saw was the "seeding" approach with JSON. I have zero experience with JSON, and I'm not sure what the input file should look like or whether it is even possible to seed a population with STGP individuals.
Thirdly, I stumbled upon the gp.PrimitiveTree.from_string function. I managed to get it running somehow (although I found a bug, which I just added to the issue list), but I don't think I really understood how to use from_string.

It would be great if you could offer a little example of how to save and load individuals!
Thank you!
Mattias

Félix-Antoine Fortin

Mar 25, 2014, 2:26:00 PM
to deap-...@googlegroups.com
Hi Mattias,

> You did really good work with DEAP and also with the documentation!

Thanks, it is always appreciated.

> Unfortunately, some parts of the documentation are a bit vague in my opinion. I managed to set up a nice STGP and it works like a charm. The problem is that I don't know how to save and load the individuals. I tried to use checkpoints/pickle as in the documentation, but it just outputs an error stating that pickle needs a string, not a bytes object.

It is hard to know exactly what went wrong without the source code, the exact error and the Python interpreter version.

One clarification should be added to the checkpointing tutorial regarding GP and the pickling protocol: to pickle a gp.PrimitiveTree, you must pass the protocol argument to dump and set it to 2 or higher. However, this does not seem to be your problem.
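
As a rough sketch of passing the protocol explicitly (standard library only; the nested lists are placeholder data standing in for gp.PrimitiveTree individuals, and the in-memory buffer stands in for a checkpoint file opened in binary mode):

```python
import io
import pickle

# Placeholder data: nested lists stand in for gp.PrimitiveTree individuals.
population = [["add", "x", "1"], ["mul", "x", "x"]]
cp = dict(population=population)

# In-memory stand-in for a checkpoint file opened in binary mode.
buffer = io.BytesIO()

# Pass the protocol explicitly, at level 2 or higher.
pickle.dump(cp, buffer, protocol=2)

# Loading needs no protocol argument; pickle detects it from the data.
buffer.seek(0)
restored = pickle.load(buffer)
```

Only dump takes the protocol argument; load reads it from the pickled data itself.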

If you find that some parts of the documentation are vague, you are more than welcome to point out the sections in question and how we could improve them.

> The second possibility I saw was the "seeding" approach with JSON. I have zero experience with JSON, and I'm not sure what the input file should look like or whether it is even possible to seed a population with STGP individuals.

JSON has a limited set of types it can serialize, and gp.PrimitiveTree is not in it. You should stick with pickle.

> Thirdly, I stumbled upon the gp.PrimitiveTree.from_string function. I managed to get it running somehow (although I found a bug, which I just added to the issue list), but I don't think I really understood how to use from_string.

We are currently working on your bug report. That said, from_string was not originally implemented as a way to checkpoint a PrimitiveTree. It was meant to make creating trees from scratch easier, since working directly with primitives and terminals can be cumbersome and counterintuitive.

> It would be great if you could offer a little example of how to save and load individuals!

If you could share the code you used to pickle and unpickle GP trees, we could start from that to show you an example of how to do it.

> Thank you!
> Mattias

Thank you for your interest in DEAP,
Félix-Antoine

Mattias Borwald

Mar 26, 2014, 8:21:19 AM
to deap-...@googlegroups.com
Hi Félix-Antoine,

Thank you for your fast and thorough answer!
I copied the pickle section at the end of the checkpointing tutorial and changed it a little bit. At the end of my main() I added:

cp = dict(population=population)
with open("checkpoint_name.pkl", "w") as f:
    pickle.dump(cp, f)

I'm using gp.PrimitiveTree with gp.PrimitiveSetTyped. I get the following error:
TypeError: must be str, not bytes

I must say that I have no experience with pickle at all. Maybe it might be a good idea to add checkpointing to the spambase tutorial :)

However, I realized that pickle is not what I wanted anyway, because I want to save the individuals (at least the best ones), be able to edit them by hand (or with some other script), and then load them. The idea is that I want to evolve programs/individuals on condition A and save them. Then, in a new program run, I want to evolve programs on condition B and save them, and so on. After some runs with different conditions, I want to load all the best individuals again to test them together on condition X. In the meantime, I want to be able to edit the programs and also simply look at them in plain text to understand what they do (and hopefully understand why they were best in conditions A, B, ...). Hence, I think the from_string() method might be best for me as soon as it works :)

Again I must thank you for the great framework! Good luck with your PhDs :)
Mattias

Félix-Antoine Fortin

Mar 26, 2014, 8:52:22 AM
to deap-...@googlegroups.com
> Hi Félix-Antoine,

Hi Mattias,

> Thank you for your fast and thorough answer!
> I copied the pickle section at the end of the checkpointing tutorial and changed it a little bit. At the end of my main() I added:
>
> cp = dict(population=population)
> with open("checkpoint_name.pkl", "w") as f:
>     pickle.dump(cp, f)
>
> I'm using gp.PrimitiveTree with gp.PrimitiveSetTyped. I get the following error:
> TypeError: must be str, not bytes
>
> I must say that I have no experience with pickle at all. Maybe it might be a good idea to add checkpointing to the spambase tutorial :)

Now it is easier to help you ;). The problem is that the documentation was built and tested with Python 2, and the requirements on the file object passed to pickle.dump changed between Python 2 and 3.

Python 2: "file must have a write() method that accepts a single string argument."
Python 3: "file argument must have a write() method that accepts a single bytes argument."

So in Python 3, you have to open the file in binary mode instead of text mode. Simply specify "wb" instead of "w" as the mode argument and the exception will go away. Since Python 2 can read and write pickled data in both text and binary files, we will modify the documentation to always use binary mode with pickle. Thanks for your feedback.
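
A minimal sketch of the difference on Python 3, assuming a placeholder population of plain lists and a temporary directory for the checkpoint file (neither comes from your code):

```python
import os
import pickle
import tempfile

# Placeholder population; a real run would store the evolved individuals.
cp = dict(population=[[1, 0, 1], [0, 1, 1]])
path = os.path.join(tempfile.mkdtemp(), "checkpoint_name.pkl")

# Text mode ("w"): on Python 3 pickle writes bytes, which a text file rejects.
text_mode_error = None
try:
    with open(path, "w") as f:
        pickle.dump(cp, f)
except TypeError as exc:
    text_mode_error = exc

# Binary mode ("wb") for writing and ("rb") for reading works.
with open(path, "wb") as f:
    pickle.dump(cp, f)

with open(path, "rb") as f:
    restored = pickle.load(f)
```

The same rule applies when loading: open the checkpoint with "rb" rather than "r".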

> However, I realized that pickle is not what I wanted anyway, because I want to save the individuals (at least the best ones), be able to edit them by hand (or with some other script), and then load them. The idea is that I want to evolve programs/individuals on condition A and save them. Then, in a new program run, I want to evolve programs on condition B and save them, and so on. After some runs with different conditions, I want to load all the best individuals again to test them together on condition X. In the meantime, I want to be able to edit the programs and also simply look at them in plain text to understand what they do (and hopefully understand why they were best in conditions A, B, ...). Hence, I think the from_string() method might be best for me as soon as it works :)

Indeed, if you want to edit your individuals by hand, from_string is the better choice.
We are working on a fix, so stay tuned.
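
In the meantime, the plain-text side of that workflow could look roughly like the sketch below. It uses only the standard library; the helper names, the file name, and the expression strings are made up for illustration. It also assumes that str(individual) yields an expression that gp.PrimitiveTree.from_string can parse against the same primitive set, which depends on the pending fix.

```python
import os
import tempfile


def save_expressions(path, expressions):
    """Write one expression string per line (hypothetical helper)."""
    with open(path, "w") as f:
        for expr in expressions:
            f.write(expr + "\n")


def load_expressions(path):
    """Read the expression strings back, skipping blank lines (hypothetical helper)."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


# With DEAP, these strings would come from str(individual) for each saved tree.
best = ["add(x, 1)", "mul(x, add(x, 2))"]  # made-up expressions

path = os.path.join(tempfile.mkdtemp(), "best_condition_A.txt")
save_expressions(path, best)
loaded = load_expressions(path)

# With DEAP, each string would then be parsed back with the same primitive set:
# trees = [gp.PrimitiveTree.from_string(expr, pset) for expr in loaded]
```

One file per condition keeps the individuals readable and editable by hand between runs.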

> Again I must thank you for the great framework! Good luck with your PhDs :)
> Mattias

Thanks!
Félix-Antoine

