The term "Interactive Multimodality" is particularly relevant in the fields of Artificial Intelligence, Digital Transformation, and Industry 4.0. It refers to the simultaneous use and active linking of different types of data and information channels, such as speech, images, and text. "Multimodal" means that multiple channels of communication or information intake are used concurrently. "Interactive" means that these channels influence and complement one another.
A simple example is a modern customer service robot: it can understand speech, respond to text, and even analyse images, for example when a customer sends a photo of a faulty product. The robot combines these inputs to give the best possible answer. If the customer first speaks, then uploads a photo, and later continues in writing, the system collates and analyses all of this information together, which can significantly improve the quality of service.
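The collation step in this example can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular product works: the names `ModalInput`, `Session`, `add`, and `collate` are hypothetical, and speech and images are assumed to have already been converted to text (a transcript and an image description) by separate components that are not shown here.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ModalInput:
    modality: str  # "speech", "image", or "text"
    content: str   # assumed: transcript, image description, or typed text


@dataclass
class Session:
    """Hypothetical container for everything one customer has sent so far."""
    inputs: List[ModalInput] = field(default_factory=list)

    def add(self, modality: str, content: str) -> None:
        # Keep inputs in arrival order, regardless of channel.
        self.inputs.append(ModalInput(modality, content))

    def modalities(self) -> Set[str]:
        # Which channels the customer has used in this session.
        return {item.modality for item in self.inputs}

    def collate(self) -> str:
        # Merge all inputs into one context, each line tagged with its channel,
        # so that a downstream component can evaluate them together.
        return "\n".join(f"[{item.modality}] {item.content}" for item in self.inputs)


session = Session()
session.add("speech", "My coffee machine stopped heating.")
session.add("image", "photo shows a cracked water tank")
session.add("text", "I bought it three months ago.")

context = session.collate()
print(context)
```

The point of the sketch is that the spoken complaint, the photo, and the later written message end up in one shared context instead of being handled as three unrelated requests.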
Interactive multimodality is therefore an important step towards making digital systems appear more "human" and enabling them to solve more complex tasks automatically, wherever different information channels interlock and need to be evaluated together.













