By using a simple test pattern and pushing it through my latest attempt at a GAN, I was able to confirm some suspicions I had about what was going on. That in turn gave me the right words for my Google searches, which turned up some posts and references to papers that help explain it.
What I saw in my tests is that my GAN was indeed, at various times, doing a respectable job of recreating the image class. And then it would seemingly lose its mind, or lose the thread, and go off into its own zone, from which it would occasionally return.
The process has been productive and I have learned a lot, but it is also suboptimal. It's hard to tell at first glance whether many of the posts the Internet throws in your face were written by people who actually understand what is going on. My opinion at this point is that they do not.
In particular, what 98% of these posts about GANs did not tell me was that this very promising technique is not at all cut and dried. There are open questions about pretty much every phase of the process: what type of loss function to use, what type of optimizer, how to avoid "mode collapse", and many other critical issues.
So GANs are an exciting work in progress, a true research project, and contrary to what someone coming from Computer Graphics might expect, nothing is cut and dried here. It may work in one case but not another.
In my case, it truly makes progress towards the goal, but then it wanders off.
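Because the output quality comes and goes like this, one practical response is to score saved checkpoints and keep the best one instead of trusting whatever the last training step produced. The following is only a minimal sketch of that bookkeeping. It assumes you already have some positive, higher-is-better quality score per checkpoint (how to compute such a score is not covered in this post), and the names `find_drift` and `tolerance` are hypothetical.

```python
def find_drift(scores, tolerance=0.2):
    """Return (best_step, drifted_steps) for a list of checkpoint scores.

    Assumptions (not from the original post):
      - scores[i] is a positive quality score for checkpoint i,
        where higher means closer to the target images;
      - a checkpoint has "wandered off" if it falls more than
        `tolerance` (as a fraction) below the best score seen so far.
    """
    best_step, best = 0, scores[0]
    drifted = []
    for step, score in enumerate(scores):
        if score > best:
            # New best checkpoint: this is the one worth keeping.
            best_step, best = step, score
        elif score < best * (1 - tolerance):
            # Noticeably worse than the best so far: flag it.
            drifted.append(step)
    return best_step, drifted


# Made-up scores that improve, wander off, and then come back.
example_scores = [0.1, 0.4, 0.7, 0.3, 0.2, 0.65, 0.8, 0.5]
best_step, drifted = find_drift(example_scores)
print("best checkpoint:", best_step)
print("checkpoints that wandered off:", drifted)
```

The scores in the example are invented purely to show the pattern of progress, wandering off, and occasional return described above.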
Here is a link to the best post I have found so far on this topic, by Jonathan Hui. He has written a dozen or so posts on GANs, so you might want to poke around and see more of what he has written.
https://medium.com/@jonathan_hui/gan-why-it-is-so-hard-to-train-generative-advisory-networks-819a86b3750b
Here is one of the input images as a reminder of what I am aiming for, and then a few selected images of results that show it is on the right track. Remember, our GAN does not know a circle from a politician, so when you see something like a circle, that is significant.
And here are some selected outputs from the GAN.
And then here are some images from when it has apparently lost its mind.