A Reddit user, u/Tadpole5050, asked GPT-4 the same questions that appear in Google’s Gemini demo video, and surprisingly, most of the answers were detailed and correct.
Here is the list of questions and GPT-4’s answers from the Reddit post:

The text in the image reads:
Which way should the duck go?
GPT-4 Response:
The image shows a drawing of a decision-making scenario often referred to as the ‘two path’ problem, depicted with a duck at a fork in the road, having to choose between two directions indicated by arrows. On the left, there is a drawing of a calm duck on water, and on the right, an angry bear. This suggests that the duck should choose the path leading towards the image of the duck on water, which implies safety and tranquility, over the path leading towards the angry bear, which implies danger.
CORRECT ANSWER!

The text in the image is a response from GPT-4 discussing the order of celestial bodies. It explains that the drawings represent the Sun, Saturn, and Earth, and notes that the correct order from closest to farthest from the Sun should be Sun, Earth, and then Saturn. It corrects the order shown, stating that Earth should be placed before Saturn as it is the third planet from the Sun, while Saturn is the sixth.
CORRECT ANSWER!

The text in the image shows GPT-4’s response to two drawings of roller coaster tracks, explaining the subjective nature of what may be considered “fun.” It suggests that some people might find the track with steeper drops and loops (right side) more thrilling, while others might prefer the gentler hills (left side) for a less intense experience. The text acknowledges that fun is subjective and varies from person to person.
CORRECT ANSWER!

In this image, GPT-4 compares the health benefits of an orange and a chocolate chip cookie, indicating that the orange is generally considered the healthier choice. It mentions that oranges are rich in vitamins, fiber, and antioxidants, with no added sugars or fats, making them a natural and balanced dietary option. Conversely, the chocolate chip cookie is described as high in sugars, refined flours, and fats, and as a treat rather than a healthy snack, to be enjoyed in moderation due to its higher calorie density and lower nutritional value compared to fruits like oranges.
CORRECT ANSWER!

Here GPT-4 describes the image as showing a pair of hands with palms forward and fingers splayed, thumbs touching at the base, creating a mirror-image effect. The pose resembles the wings of a butterfly or a Rorschach inkblot test and finds use in dance, mime, yoga as a mudra, and in creating shadow puppets.
CORRECT ANSWER!

Here GPT-4 describes the hand gesture shown in the image as the “three-finger salute” or “Scout’s honor,” commonly used by Scout organizations to symbolize the three parts of the Scout Oath.
This is absolutely wrong!

Here GPT-4 has a two-part conversation. The first part describes a hand-drawn picture of a blue bird on a sticky note, floating on water, and mentions its simplistic style and cartoonish appearance. The second part speculates about the species of the bird depicted in the drawing, possibly a waterfowl such as a duck or a swan, inferred from its round body and long neck. However, the drawing’s simplicity makes it challenging to determine the species with certainty.
CORRECT ANSWER!

The text contains a discussion about two drawings of trucks on slopes, addressing which truck would go faster based on the slope’s angle and the trucks’ design. It explains that if the slope angle is the same for both, then aerodynamics and other design factors, such as weight and tire grip, would affect their speed. The truck with a more streamlined shape could potentially go faster due to reduced air resistance.
CORRECT ANSWER!

The text suggests creative project ideas for using a ball of green yarn and a ball of pink yarn. Ideas include knitting or crocheting clothing like hats and scarves, creating amigurumi creatures, making home decor items like coasters, crafting blankets or throws, designing accessories, making holiday decorations, creating wall art, toys for kids, and pet toys.
This is somewhat right but mostly wrong!

Here GPT-4 describes a connect-the-dots puzzle that appears to form a cartoon-like image of a smiling face upon completion. The description indicates that connecting the dots, numbered 1 to 69, will reveal a simplistic and friendly character’s face, characteristic of children’s activity books.
This is almost completely wrong!

So, to answer the question: is Google Gemini better than ChatGPT?
Yes, and if OpenAI wants to remain the leader in state-of-the-art LLMs, it will have to adapt quickly.
As we can see, Gemini Ultra appears to be truly state of the art at recognizing images. Note that the demo video showed Gemini Ultra, while right now we can only access Gemini Pro at https://bard.google.com
But GPT-4 has surprised us with these answers: with 7.5 out of 10 correct, it is almost on par with Gemini Ultra, and I guess that’s what makes a great AI model.
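For anyone who wants to check the 7.5 out of 10 figure, here is a minimal Python sketch of the tally. The verdict labels come from the list above; the short question names and the numeric weights (in particular, half credit for the “somewhat right but mostly wrong” yarn answer) are assumptions made for illustration, not something stated in the Reddit post.

```python
# Verdicts taken from the list above. The question names are shorthand,
# and the numeric weights below are an assumption (half credit for the
# "somewhat right but mostly wrong" yarn answer).
WEIGHTS = {"correct": 1.0, "partial": 0.5, "wrong": 0.0}

verdicts = [
    ("duck at the fork", "correct"),
    ("Sun, Saturn, Earth order", "correct"),
    ("roller coasters", "correct"),
    ("orange vs. cookie", "correct"),
    ("butterfly hands", "correct"),
    ("three-finger gesture", "wrong"),
    ("blue bird drawing", "correct"),
    ("trucks on slopes", "correct"),
    ("green and pink yarn", "partial"),
    ("connect-the-dots", "wrong"),
]


def tally(items):
    """Sum the weight of each verdict."""
    return sum(WEIGHTS[verdict] for _, verdict in items)


score = tally(verdicts)
print(f"{score} out of {len(verdicts)}")
```

With a stricter weighting that gives the yarn answer no credit, the same tally would come out to 7 out of 10.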
One user in the comments wrote:
“Well what we saw in that Gemini video was done exactly as you have just done it with GPT-4, they didn’t actually stream a video to it and have it answer questions (here is a blog post on how they did it https://developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html?m=1 ). I do not think we have seen the video capabilities of Gemini really demo’ed yet, mainly its text and vision capabilities.”