kiroi.org

KIROI - Artificial Intelligence Return on Invest
The AI strategy for decision-makers and managers

Business excellence for decision-makers & managers by and with Sanjay Sauldie

19 July 2024

Vision-Language Models (Glossary)

Vision-language models belong to the fields of artificial intelligence, digital transformation, and big data and smart data. They combine image recognition with the understanding and processing of language. This means that computers using these models can both see and speak, and can link the two together.

Imagine you upload a photo of a dog and the system automatically describes it as: "A brown dog is running across a meadow." This is possible thanks to vision-language models. They analyse the image, recognise objects, and translate what they see into understandable words.

Companies can use this technology in many ways. For example, online shops can use it to describe product images automatically, which makes product search more accessible for customers with visual impairments. In big data analysis, vision-language models help to evaluate large volumes of image and text data together and to uncover new correlations.
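
For readers who want to see what the online shop example could look like in practice, here is a minimal Python sketch. It only illustrates the workflow of filling in missing image descriptions for a product catalogue. The names `Product`, `stub_captioner`, and `add_alt_texts` are hypothetical, and `stub_captioner` is a placeholder that returns canned text. A real system would replace it with a call to an actual vision-language model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A captioner takes an image path and returns a text description.
# In practice this would wrap a real vision-language model; here it is a
# hypothetical stand-in so the sketch runs without any model.
Captioner = Callable[[str], str]


@dataclass
class Product:
    name: str
    image_path: str
    alt_text: str = ""


def stub_captioner(image_path: str) -> str:
    """Placeholder for a vision-language model (an assumption, not a real API)."""
    canned: Dict[str, str] = {
        "images/dog.jpg": "A brown dog is running across a meadow.",
    }
    return canned.get(image_path, "No description available.")


def add_alt_texts(products: List[Product], captioner: Captioner) -> List[Product]:
    """Fill in missing descriptions; existing ones are left untouched."""
    for product in products:
        if not product.alt_text:
            product.alt_text = captioner(product.image_path)
    return products


catalog = [
    Product(name="Dog poster", image_path="images/dog.jpg"),
    Product(name="Red mug", image_path="images/mug.jpg", alt_text="A red ceramic mug."),
]
add_alt_texts(catalog, stub_captioner)
for product in catalog:
    print(f"{product.name}: {product.alt_text}")
```

Passing the captioner in as a function keeps the catalogue logic separate from the model, so the placeholder could later be swapped for a real model without changing the rest of the code.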

In short, vision-language models enable computers not only to see our world, but also to understand and describe it.
