Gemma 3 supports vision-language inputs and text outputs, handles context windows up to 128k tokens, and understands more ...