nowrap.ai/ the glossary · AI in plain English

◼ Concepts · also: multi-modal, vision model, vision models

Multimodal

An AI that works across more than just text — reading images, audio, and video, not only words.

Updated Jun 6, 2026

Multimodal describes an AI that handles more than one kind of input or output: text alongside images, audio, or video. An earlier generation of models could only read and write words. A multimodal model can look at a screenshot, listen to a recording, watch a clip, and respond in kind, treating a picture or a sound as just another thing it can reason about.

This is less a feature than a change in what counts as a question. You no longer have to describe the chart, the X-ray, or the mockup in words — you show it.
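For readers who build with these models, the idea of "showing" a question can be sketched as a request made of typed parts rather than a single string. This is only an illustration: the `TextPart`, `ImagePart`, and `AudioPart` classes and both helper functions are hypothetical names invented for this sketch, not any vendor's API, and the check only looks at input, whereas the definition above also covers output.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical part types for illustration only; real APIs differ in naming and shape.
@dataclass
class TextPart:
    text: str


@dataclass
class ImagePart:
    data: bytes      # raw image bytes, e.g. a PNG screenshot
    mime_type: str   # e.g. "image/png"


@dataclass
class AudioPart:
    data: bytes      # raw audio bytes, e.g. a WAV recording
    mime_type: str   # e.g. "audio/wav"


Part = Union[TextPart, ImagePart, AudioPart]


def modalities(parts: List[Part]) -> List[str]:
    """Return the distinct kinds of input in a request, in first-seen order."""
    names = {TextPart: "text", ImagePart: "image", AudioPart: "audio"}
    seen: List[str] = []
    for part in parts:
        kind = names[type(part)]
        if kind not in seen:
            seen.append(kind)
    return seen


def is_multimodal(parts: List[Part]) -> bool:
    """Treat a request as multimodal when it mixes more than one kind of input."""
    return len(modalities(parts)) > 1


# Instead of describing the chart in words, the chart itself is part of the question.
question = [
    TextPart("What trend does this chart show?"),
    ImagePart(data=b"\x89PNG...", mime_type="image/png"),  # placeholder bytes, not a real image
]

print(modalities(question))
print(is_multimodal(question))
```

The point of the sketch is the shape of the question: the image sits next to the text as an equal part, which is what lets a model treat a picture as something to reason about rather than something you must first translate into words.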

Why it matters at your desk. For a designer, multimodal is what lets Figma Make turn a visual idea into a working layout, and for anyone working in audio and video, Descript treats a recording as editable material rather than an opaque file. The frontier is moving fast toward real-time: Gemini's live audio preview points at assistants you can simply talk to and show things to, in the moment.

What to watch for: a model reading an image is not the same as a model understanding it correctly — multimodal output hallucinates too, and a confident misread of a medical scan or a contract screenshot carries the same risk as a confident wrong sentence. Treat what it sees with the same verification you give what it says.

▲ Tools that use this

№ 01 · Freemium

Marketers · Writers

Descript.

Edit audio and video like a doc.

We think Descript is still one of the most practical tools for transcript-first audio and video editing. If you think of it as "edit media like a document," the product makes immediate sense, especially for podcasts and straightforward creator workflows.

The biggest strength is convenience. You can cut sections by editing text, clean up filler words, and move from rough recording to something publishable much faster than with a classic timeline editor. That is the core reason people keep using it.

The weakness is that it can feel slower and more fragile than traditional editors once the project gets serious. Public user feedback regularly mentions performance and reliability complaints, and it is not the first tool we would choose for high-end production work.

**Strengths**: Great transcript editing, fast for podcasts and simple videos, useful AI cleanup features, easy to learn.

**Weaknesses**: Can be slow or buggy, less suitable for advanced pro editing, may feel server-dependent.

**Final verdict**: We see Descript as a strong tool for creators who care more about speed and simplicity than deep pro editing control. It is best for transcript-first workflows, not high-end finishing.

Podcast editing
Video editing
Transcription

Privacy policy on file · Reviewed Apr 28, 2026 by Nowrap

№ 02 · Freemium

Designers · Marketers

Figma Make.

Prompt to working prototype.

We think Figma Make is promising for fast prototyping, but the public feedback is clearly more mixed than the marketing suggests. It is best understood as a design-to-prototype shortcut, not a reliable way to skip product development.

The strength is speed. For early ideas, it can turn a prompt into something visual and clickable fast, which is useful for designers and marketers who want to test a direction before spending time on a real build.

The weakness is trustworthiness. Reddit users repeatedly complain that the generated code is rough, hard to clean up, and not easy to move into a real production app. It also seems much less compelling once the project becomes data-heavy, complex, or tightly coupled to the rest of Figma.

**Strengths**: Fast for prototyping, good for early idea exploration, useful when you want a visual draft quickly.

**Weaknesses**: Generated code can be poor, not production-ready, weak fit for complex apps or serious handoff work.

**Final verdict**: We see Figma Make as useful for rough prototypes and design exploration. If you need a real app, expect to rebuild most of it yourself.

Prototyping
Web app generation

Privacy policy on file · Reviewed Apr 28, 2026 by Nowrap

§ In the dispatch

◼ release · Google · 1mo ago

Google rolls out Gemini 3.1 Flash Live Preview for voice-first AI

May 7, 2026 · 4 min read