Stable Diffusion Inpainting
| Property | Value |
|---|---|
| License | CreativeML OpenRAIL-M |
| Training Data | LAION-aesthetics v2 5+ |
| Paper | High-Resolution Image Synthesis With Latent Diffusion Models (CVPR 2022) |
| Downloads | 4.2M+ |
What is stable-diffusion-inpainting?
Stable Diffusion Inpainting is a latent text-to-image diffusion model adapted for image editing and completion. Built on Stable Diffusion v1.5, it takes an image, a mask, and a text prompt, and regenerates only the masked region so that the result stays coherent with the rest of the original image.
Implementation Details
The model was trained in two phases: 595k steps of regular text-to-image training, followed by 440k steps of inpainting training at 512x512 resolution. Its UNet takes 5 additional input channels on top of the usual 4 latent channels: 4 for the VAE-encoded masked image and 1 for the mask itself, for 9 input channels in total. During inpainting training, masks were generated synthetically, and in 25% of cases the mask covered the entire image, so the model also saw the full-generation case.
- Built on the Stable Diffusion v1.5 architecture
- Trained on the LAION-aesthetics v2 5+ dataset
- Uses the CLIP ViT-L/14 text encoder
- Supports both masked-region inpainting and, with a mask that covers the whole image, complete image generation
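The channel layout and the 25% full-image masking described above can be made concrete with a short sketch. It is only an illustration, not the original training code: the channel counts come from the text, the 8x VAE downscaling is the standard Stable Diffusion v1 factor, and the random rectangle used for partial masks is an assumption, since the actual synthetic mask shapes are not described here. The names `unet_input_shape` and `sample_training_mask` are hypothetical.

```python
import random
from typing import List, Tuple

# Channel counts taken from the description above.
LATENT_CHANNELS = 4        # noisy image latents, as in the base UNet
MASK_CHANNELS = 1          # the mask itself
MASKED_IMAGE_CHANNELS = 4  # VAE-encoded masked image
VAE_DOWNSCALE = 8          # Stable Diffusion v1 latents are 8x smaller per side


def unet_input_shape(height: int, width: int) -> Tuple[int, int, int]:
    """Return (channels, latent_height, latent_width) for one inpainting sample."""
    channels = LATENT_CHANNELS + MASK_CHANNELS + MASKED_IMAGE_CHANNELS
    return channels, height // VAE_DOWNSCALE, width // VAE_DOWNSCALE


def sample_training_mask(
    height: int,
    width: int,
    rng: random.Random,
    full_mask_prob: float = 0.25,
) -> List[List[int]]:
    """Return a height x width mask where 1 marks pixels to be inpainted."""
    # With probability full_mask_prob, mask everything (the 25% case in the text).
    if rng.random() < full_mask_prob:
        return [[1] * width for _ in range(height)]

    # Assumption: partial masks are a single random rectangle. The real
    # training masks may have had different shapes.
    top = rng.randrange(height)
    left = rng.randrange(width)
    bottom = rng.randrange(top + 1, height + 1)  # exclusive, at least one row
    right = rng.randrange(left + 1, width + 1)   # exclusive, at least one column
    return [
        [1 if top <= y < bottom and left <= x < right else 0 for x in range(width)]
        for y in range(height)
    ]


print(unet_input_shape(512, 512))  # (9, 64, 64)
```

At 512x512 the UNet therefore sees a 9-channel tensor at 64x64 latent resolution, instead of the 4-channel input of the base text-to-image model.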
Core Capabilities
- High-quality image inpainting with text guidance
- Works with popular frameworks and front ends (Diffusers, AUTOMATIC1111)
- Accepts masks of arbitrary shape and size
- Maintains consistency with surrounding image context
- Reports an FID score of 1.00 in its inpainting evaluation (lower is better)
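As a rough illustration of the Diffusers integration listed above, the following sketch loads the model through `StableDiffusionInpaintPipeline` and repaints the white area of a mask. It assumes `diffusers`, `torch`, and `Pillow` are installed. The function names `snap_to_multiple_of_8` and `inpaint` are hypothetical, and `model_id` is left as a parameter because the exact Hub repository id is not given here.

```python
from typing import Tuple


def snap_to_multiple_of_8(width: int, height: int) -> Tuple[int, int]:
    """Round a size down to a multiple of 8, which the pipeline requires."""
    return max(8, width - width % 8), max(8, height - height % 8)


def inpaint(model_id: str, image_path: str, mask_path: str, prompt: str,
            device: str = "cuda"):
    """Repaint the white region of the mask, guided by the text prompt."""
    # Heavy imports stay local so the helper above works without them.
    import torch
    from diffusers import StableDiffusionInpaintPipeline
    from PIL import Image

    image = Image.open(image_path).convert("RGB")
    # White pixels are regenerated; black pixels are kept from the original.
    mask = Image.open(mask_path).convert("L")

    # The model was trained at 512x512, so sizes near that tend to work best.
    width, height = snap_to_multiple_of_8(*image.size)
    image = image.resize((width, height))
    mask = mask.resize((width, height))

    # Half precision is a common choice on GPU; use float32 on CPU.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionInpaintPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    result = pipe(prompt=prompt, image=image, mask_image=mask,
                  height=height, width=width)
    return result.images[0]
```

A call would look like `inpaint(model_id, "photo.png", "mask.png", "a wooden bench in a park").save("out.png")`, where `model_id` is the repository id or local path of the checkpoint and the file names are placeholders.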
Frequently Asked Questions
Q: What makes this model unique?
This model pairs Stable Diffusion's text-to-image generation with a UNet that is conditioned directly on the masked image and the mask through additional input channels. Standard image generation models lack these inputs, and they are what let this model edit a selected region precisely while keeping the output quality high.
Q: What are the recommended use cases?
The model is well suited to creative applications such as image restoration, object removal, background completion, and artistic modifications. It is particularly useful for digital artists, photographers, and content creators who need to change selected parts of an image while keeping the result natural-looking.