Generative-Media-Skills

MCP Server

Multi-modal Generative Media Skills for AI Agents (Claude Code, Cursor, Gemini CLI). High-quality image, video, and a...

3,139
Why now: Moving now

Fresh repo activity plus visible builder pull. This is the kind of tool people test before it turns obvious.

Decision: High-conviction move

Copy the install, test the workflow, then decide if it earns a permanent slot.

Trial cost: Fast eval

You can test this quickly and remove it cleanly if it misses.

Risk: 35/100

GitHub health 50/100. No security policy. Zero open issues make this testable, but not something to trust blindly.

What You Are Adopting

AI Agent: Multiple

Model: Multiple

Build Time: Minutes

Test This In Your Stack

One command in | Clean rollback | Low commitment
Registry: Adds a named entry to the Claude config. One command to remove.

Fastest way to find out if Generative-Media-Skills belongs in your setup.

Copy the install command, run a real test, and back it out cleanly if it slows you down.

Try now
claude mcp add generative-media-skills -- npx generative-media-skills

Run this first. You will know quickly if the workflow earns a permanent slot.

Back out
claude mcp remove generative-media-skills

No messy cleanup loop. If it misses, remove it and keep moving.

Install Location

~/
└─ .claude.json
   └─ mcp_servers/
      └─ generative-media-skills ← registers here

About

Multi-modal Generative Media Skills for AI Agents (Claude Code, Cursor, Gemini CLI). High-quality image, video, and audio generation powered by muapi.ai. An open-source MCP server for the AI coding ecosystem.

README

🎭 Generative Media Skills for AI Agents

The Ultimate Multimodal Toolset for Claude Code, Cursor, and Gemini CLI.
A high-performance, schema-driven architecture for AI agents to generate, edit, and display professional-grade images, videos, and audio.

🚀 Get Started | 🎨 Expert Library | ⚙️ Core Primitives | 📖 Reference


✨ Key Features

  • 🤖 Agent-Native Design — Standardized terminal scripts with clean JSON outputs for seamless integration into agentic workflows.
  • 🧠 Expert Knowledge Layer — Domain-specific skills that bake in professional cinematography, atomic design, and branding logic.
  • ⚡ Dynamic Schema-Driven — Powered by schema_data.json, scripts automatically resolve the latest models, endpoints, and valid parameters.
  • 🖼️ Direct Media Display — Use the --view flag to automatically download and open generated media in your system viewer.
  • 📁 Local File Support — Auto-upload images, videos, faces, and audio from your local machine to the CDN for processing.
  • 🌈 100+ AI Models — One-click access to Midjourney v7, Flux Pro, Kling 3.0, Veo3, Suno V5, and more.

🏗️ Scalable Architecture

This repository uses a Core/Library split to ensure efficiency and high-signal discovery for LLMs:

⚙️ Core Primitives (/core)

The raw infrastructure for interacting with the muapi.ai engine.

  • core/media/ — High-fidelity Generation (Image, Video, Audio)
  • core/edit/ — Advanced Editing (Lipsync, Upscale, Effects)
  • core/platform/ — Setup & Polling Utilities
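The polling utilities under core/platform/ are not shown in this README. As a rough sketch of the general pattern such a utility might follow, the loop below retries a status check until it reports completion or the attempt budget runs out. The names poll_until_done and fake_status, the status strings, and the interval are all hypothetical, and the status check is a local stub, not a real muapi.ai call.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a poll-until-done loop. Nothing here calls muapi.ai;
# fake_status is a local stub standing in for a real status request.

# State file for the stub: counts how many times the status was checked.
COUNTER_FILE="$(mktemp)"
echo 0 > "$COUNTER_FILE"

# Stub status check: reports "processing" twice, then "completed".
fake_status() {
  local n
  n=$(( $(cat "$COUNTER_FILE") + 1 ))
  echo "$n" > "$COUNTER_FILE"
  if [ "$n" -ge 3 ]; then echo "completed"; else echo "processing"; fi
}

# poll_until_done STATUS_CMD MAX_ATTEMPTS INTERVAL_SECONDS
# Returns 0 once STATUS_CMD prints "completed", 1 on "failed" or timeout.
poll_until_done() {
  local status_cmd="$1" max_attempts="$2" interval="$3"
  local attempt=1 status
  while [ "$attempt" -le "$max_attempts" ]; do
    status="$("$status_cmd")"
    case "$status" in
      completed) return 0 ;;
      failed)    return 1 ;;
    esac
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  return 1
}

# Up to 5 checks with no delay between them (a real job would wait longer).
poll_until_done fake_status 5 0 && echo "job finished"
```

Capping the number of attempts keeps an agent from waiting indefinitely on a job that never completes.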

📚 Expert Library (/library)

High-value skills that translate creative intent into technical directives.

  • Cinema Director (/library/motion/cinema-director/) — Technical film direction & cinematography.
  • Nano-Banana (/library/visual/nano-banana/) — Reasoning-driven image generation (Gemini 3 Style).
  • UI Designer (/library/visual/ui-design/) — High-fidelity mobile/web mockups (Atomic Design).
  • Logo Creator (/library/visual/logo-creator/) — Minimalist vector branding (Geometric Primitives).

🧠 Self-Optimizing Skills

Every expert skill in the Library includes a Prompt Optimization Protocol. This allows LLMs (like Claude or Gemini) to use their own reasoning to expand simple user requests into high-fidelity technical briefs before calling the generation scripts.


🚀 Quick Start

1. Install the Skills

# Install all skills to your AI agent
npx skills add SamurAIGPT/Generative-Media-Skills --all

# Or install a specific skill
npx skills add SamurAIGPT/Generative-Media-Skills --skill muapi-media-generation

# List available skills
npx skills add SamurAIGPT/Generative-Media-Skills --list

# Install to specific agents
npx skills add SamurAIGPT/Generative-Media-Skills --all -a claude-code -a cursor

2. Configure Your API Key

# Get your key at https://muapi.ai/dashboard
bash core/platform/setup.sh --add-key "YOUR_MUAPI_KEY"

3. Run an Expert Skill with Direct Display

Generate a high-fidelity image and open it immediately using the --view flag.

# Use Nano-Banana reasoning to generate a 2K masterpiece from a local image
bash library/visual/nano-banana/scripts/generate-nano-art.sh \
  --file ./my-source-image.jpg \
  --subject "a glass hummingbird" \
  --style "macro photography" \
  --resolution "2k" \
  --view

4. Direct a Cinematic Scene

cd library/motion/cinema-director
# Create a 10-second 'epic' reveal without audio
bash scripts/generate-film.sh \
  --subject "a cybernetic dragon over Tokyo" \
  --intent "epic" \
  --model "kling-v3.0-pro" \
  --duration 10 \
  --no-audio \
  --view
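When an agent assembles this call from structured input, building the arguments as a bash array keeps multi-word values intact. The helper below is a hypothetical sketch (build_film_args and FILM_ARGS are not part of the repository) that uses only the flags shown in the example above and prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assembles arguments for scripts/generate-film.sh using
# only the flags shown in the example above. It builds the array but does not
# run the script, so the command can be reviewed first.

# build_film_args SUBJECT INTENT MODEL DURATION WITH_AUDIO(yes|no)
# Fills the global array FILM_ARGS.
build_film_args() {
  FILM_ARGS=(
    --subject "$1"
    --intent "$2"
    --model "$3"
    --duration "$4"
  )
  if [ "$5" = "no" ]; then
    FILM_ARGS+=(--no-audio)
  fi
  FILM_ARGS+=(--view)
}

build_film_args "a cybernetic dragon over Tokyo" "epic" "kling-v3.0-pro" 10 no

# Print the command for review; %q shell-quotes each argument.
printf 'bash scripts/generate-film.sh'
printf ' %q' "${FILM_ARGS[@]}"
printf '\n'
```

To execute rather than print, the array can be passed directly from inside library/motion/cinema-director: bash scripts/generate-film.sh "${FILM_ARGS[@]}".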

📖 Schema Reference

This repository includes a streamlined schema_data.json that core scripts use at runtime to:

  • Validate Model IDs: Ensures the requested model exists.
  • Resolve Endpoints: Automatically maps model names to API endpoints.
  • Check Parameters: Validates supported aspect_ratio, resolution, and duration values.
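The three checks above can be sketched roughly as follows. The layout of schema_data.json is not shown in this README, so this sketch replaces it with a small inline lookup table. The endpoint path, the allowed durations, and the function names schema_lookup and validate_request are invented for illustration and should not be read as the repository's actual values or API.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of schema-driven validation. The real scripts read
# schema_data.json; this stand-in table, the endpoint path, and the allowed
# durations are assumptions made for illustration only.

# schema_lookup MODEL FIELD -> prints the field value, returns 1 if unknown.
schema_lookup() {
  case "$1:$2" in
    kling-v3.0-pro:endpoint)  echo "/example/kling-v3-pro" ;;
    kling-v3.0-pro:durations) echo "5 10" ;;
    *) return 1 ;;
  esac
}

# validate_request MODEL DURATION
# Prints the resolved endpoint on success; prints an error to stderr and
# returns 1 otherwise.
validate_request() {
  local model="$1" duration="$2" endpoint allowed d
  # Steps 1 and 2: validate the model ID and resolve its endpoint.
  if ! endpoint="$(schema_lookup "$model" endpoint)"; then
    echo "unknown model: $model" >&2
    return 1
  fi
  # Step 3: check the duration against the model's allowed values.
  allowed="$(schema_lookup "$model" durations)"
  for d in $allowed; do
    if [ "$d" = "$duration" ]; then
      echo "$endpoint"
      return 0
    fi
  done
  echo "unsupported duration for $model: $duration (allowed: $allowed)" >&2
  return 1
}
```

Failing before any request is sent gives the calling agent a clear error message to act on instead of an opaque API failure.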

🔧 Compatibility

Optimized for the next generation of AI development environments:

  • Claude Code: Direct terminal execution via tools.
  • Gemini CLI / Cursor / Windsurf: Seamless integration as local scripts.
  • MCP: Each skill is Model Context Protocol-ready for universal agent usage.

📄 License

MIT © 2026

Tech Stack

GPT, Claude
Active: Last commit 2d ago
Submitted May 25, 2023
