Multimodal AI Generation Workflow

The notebook demonstrates GPT-2 text generation and Stable Diffusion image generation from the same concept, showing how language and visual models can work together.

Category: AI / Multimodal Systems
Year: 2026
Role: AI student / group presentation contributor

This Advanced AI class project explores a compact multimodal generation workflow. A shared prompt is passed through GPT-2 for text generation and Stable Diffusion for image generation.
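The shared-prompt flow can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes the Hugging Face `transformers` and `diffusers` libraries, the `gpt2` checkpoint, and the `CompVis/stable-diffusion-v1-4` checkpoint. The names `MultimodalResult`, `make_gpt2_generator`, `make_sd_generator`, and `run_workflow` are hypothetical helpers introduced only for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class MultimodalResult:
    """Outputs produced by both models for a single shared prompt."""
    prompt: str
    text: str
    image: Any  # a PIL image when the Stable Diffusion backend below is used


def make_gpt2_generator(max_new_tokens: int = 40) -> Callable[[str], str]:
    """Return a function that continues a prompt with GPT-2.

    Assumption: the `transformers` library is installed; weights are
    downloaded on first use.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    generator = pipeline("text-generation", model="gpt2")

    def generate(prompt: str) -> str:
        outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
        return outputs[0]["generated_text"]

    return generate


def make_sd_generator(
    model_id: str = "CompVis/stable-diffusion-v1-4",  # assumed checkpoint
    steps: int = 25,
) -> Callable[[str], Any]:
    """Return a function that renders a prompt with Stable Diffusion.

    Assumption: `torch` and `diffusers` are installed. A GPU is used when
    available; CPU works but is slow.
    """
    import torch
    from diffusers import StableDiffusionPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype).to(device)

    def generate(prompt: str) -> Any:
        return pipe(prompt, num_inference_steps=steps).images[0]

    return generate


def run_workflow(
    prompt: str,
    text_fn: Callable[[str], str],
    image_fn: Callable[[str], Any],
) -> MultimodalResult:
    """Send the same prompt to a text model and an image model."""
    return MultimodalResult(prompt=prompt, text=text_fn(prompt), image=image_fn(prompt))
```

In a notebook cell, a call such as `run_workflow("a lighthouse in a storm", make_gpt2_generator(), make_sd_generator())` would return both outputs for side-by-side display. Passing the generators in as plain functions keeps the shared-prompt step separate from model loading, so either model can be swapped without changing the flow.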

The class goal was to make multimodality understandable: one idea, multiple model outputs, and a clear demonstration of how different AI systems can respond to the same concept.

The notebook connects a text generation model and an image generation model in one presentation-ready flow, making the relationship between language prompts and generated media easier to explain.

The work is packaged as a Jupyter notebook for classroom presentation, emphasizing model behavior, prompt flow, and the practical link between text and image modalities.

Jupyter Notebook, GPT-2, Stable Diffusion, Python, Prompting

The project is an early, concrete AI artifact in this portfolio: not a finished product, but evidence of hands-on experimentation with pretrained generative models.

Multimodal systems become easier to reason about when the workflow starts with a single intent and compares how different model types transform that intent.

GitHub repo: Shaadi619/Advanced-AI_Class