How AI “Sees” the World

It’s our first winter in rural Summit County, Utah.

I’m certainly used to scenes of snow after spending the last quarter of a century living in Park City. But I’m not used to the serene but rustic sights, sounds, and scents unique to the mountain-valley countryside of Summit County.

The setting was so inspiring today that I decided to capture it in a picture (left side).

I then wondered what an AI based on a large language model (LLM) would be able to “see” (digitally, that is) in the picture. Even as a long-time AI practitioner with a formal computer science background, I was inspired by how effectively it captured the picture in words:

“The picture shows a tranquil winter scene, likely taken in a rural or semi-rural area. There is a thick layer of snow covering the ground and a rustic wooden fence running horizontally across the image, which appears to be bordering a frozen body of water, possibly a pond or small lake. Leafless trees with bare branches suggest that the season is winter. The absence of green foliage indicates the dormancy of vegetation during the cold months. The overcast sky suggests a typical chilly, grey winter day. There are no visible animal tracks or human footprints in the immediate foreground, which adds to the stillness of the scene. The overall atmosphere is serene and somewhat stark, reflecting the quiet beauty of winter.”

I found the concise description incredibly impressive as I could think of few people who’d be able to devise a more apt characterization; I certainly couldn’t.

What’s equally impressive is what another generative AI model was able to conceive purely from the LLM description (right side). World affairs are so devastatingly sad at the moment that I hope this was the emotional relief and diversion for you that it was for me.
