Global Wahrman: September 2023

Monday, September 25, 2023

Picture of Me from Symbolics

Courtesy of Tom McMahon

Sunday, September 3, 2023

Fundamentals for LLMs and Image Diffusion

I am enjoying Midjourney very much. I presume I would be as impressed with any image diffusion system driven by an LLM (large language model), but this is the only one I have played with. I feel completely awed and blindsided by the range of capabilities I have seen here. Where did this come from? How does it work? To save you the effort of finding out, I am endeavoring to collect a list of topics you will need at least a passing familiarity with in order to understand this modern miracle. It probably won't be necessary for you to be a genius at any of these topics, but it may be necessary to know something about all of them if you want to know what is going on beneath the hood and maybe predict where this kind of technology is going.

To give you an idea about why I find this so compelling and disturbing, here is the result of "/imagine vintage photograph of king kong climbing the Chrysler building".

A partial list of the technologies involved includes:

Latent Variable Models

Variational Autoencoders (VAEs)

Generative Adversarial Networks (GANs)

Semantic Image Synthesis

and more to come.