The Release
Meta’s Llama 4 ships in two main flavors: Scout (109B total parameters, 17B active, 10M-token context window) and Maverick (400B total, 17B active, 1M-token context, the quality leader of the pair). Both use a mixture-of-experts architecture: only a fraction of the parameters fire per token, so inference cost per call is far lower than a dense model of the same total size.
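A rough back-of-envelope sketch of why active parameters, not total parameters, drive per-call cost. The ~2-FLOPs-per-active-parameter heuristic is a common decode-time approximation, not an official figure:

```python
def flops_per_token(active_params: float) -> float:
    """Rough decode-time estimate: ~2 FLOPs per active parameter per token."""
    return 2 * active_params

scout_active = 17e9   # Scout: 17B parameters active per token
scout_total = 109e9   # ...out of 109B total

moe_cost = flops_per_token(scout_active)
dense_cost = flops_per_token(scout_total)  # hypothetical dense 109B model
print(f"MoE / dense FLOP ratio: {moe_cost / dense_cost:.2f}")  # ≈ 0.16
```

So per-token compute looks like a 17B model even though quality draws on the full 109B of capacity; memory footprint, of course, still scales with total parameters.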
Multimodal
Both Scout and Maverick are natively multimodal, handling image and text inputs in a single prompt. For CRM use cases involving customer-submitted photos (claims, support, product feedback), that matters.
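A minimal sketch of what a mixed photo-plus-text request might look like, assuming the model is served behind an OpenAI-compatible chat API (the common convention for open-model serving stacks; the function name and claims scenario are hypothetical):

```python
import base64
import json

def build_claim_message(photo_bytes: bytes, note: str) -> dict:
    """Build one OpenAI-style multimodal chat message combining a
    customer photo (as a base64 data URL) with their text note.
    Exact payload shape depends on your serving stack."""
    b64 = base64.b64encode(photo_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": note},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }

msg = build_claim_message(b"\xff\xd8fake-jpeg-bytes", "Cracked screen on arrival.")
print(json.dumps(msg, indent=2)[:80])
```

The message would then be posted to the server's chat-completions endpoint like any text-only request; no separate vision pipeline is needed.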
Context Window
Scout’s 10M-token context is remarkable. Full account histories, year-long conversation transcripts, complete product catalogs: all fit in one prompt. That opens use cases that previously required elaborate RAG engineering.
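To make "all fit in one prompt" concrete, here is a rough budget check, assuming a ~4-characters-per-token heuristic for English text (use the real tokenizer in production):

```python
CONTEXT_TOKENS = 10_000_000  # Scout's advertised window
CHARS_PER_TOKEN = 4          # rough heuristic for English text

def fits_in_context(documents: list[str], reserve_tokens: int = 8_192) -> bool:
    """Rough check: can all documents go straight into one prompt,
    leaving room for instructions and the model's reply?"""
    est_tokens = sum(len(d) for d in documents) // CHARS_PER_TOKEN
    return est_tokens + reserve_tokens <= CONTEXT_TOKENS

# A year of daily transcripts at ~4 KB each is only ~365k tokens,
# a small fraction of the 10M window.
year_of_transcripts = ["x" * 4_000] * 365
print(fits_in_context(year_of_transcripts))  # True
```

When this check passes, the retrieval layer collapses to "concatenate and send"; RAG remains useful mainly for corpora that genuinely exceed the window or for latency and cost control.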
Where It Fits
High-volume agentic workflows where open-source inference is cheaper than proprietary APIs. Long-context tasks (full-history customer support). Multimodal CRM use cases. Self-hosted deployments for regulated data.