This Python script processes a folder of images, renames them, and generates captions using a language model. It's designed to work with various image formats and can be customized to focus on specific subjects or artistic styles.
- Processes multiple image formats (png, jpg, jpeg, gif, bmp, tiff, webp)
- Renames images with a custom prefix and sequential numbering
- Generates captions using a language model (default: gpt-4o-mini)
- Supports two caption modes: subject-focused and style-focused
- Processes images concurrently for improved performance
- Python 3.6+
- Required Python packages are listed in requirements.txt
- Clone this repository or download the script.
- Install the required packages:
pip install -r requirements.txt
- Create a .env file in the same directory as the script and add your OpenAI API key:
OPENAI_API_KEY=your_api_key_here
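To illustrate the .env step, here is a minimal standard-library sketch of how a KEY=VALUE file of that shape could be read. The helper name load_env_file is hypothetical, and the script itself may well load the key differently (for example through a package listed in requirements.txt), so treat this as an assumption rather than the script's actual code.

```python
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file into a dict.

    Hypothetical helper for illustration; not taken from captioner.py.
    """
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an '=' sign.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

Calling load_env_file(".env").get("OPENAI_API_KEY") would then return the key string, or None if the entry is missing.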
Run the script from the command line with the following syntax:
python captioner.py <folder_path> [--prefix <prefix>] --prefix_type <subject|style>
Arguments:
- folder_path: Path to the folder containing the images (required)
- --prefix: Prefix to prepend to the renamed image and text files (optional)
- --prefix_type: Type of prefix, either "subject" or "style" (required)
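For reference, the arguments described above could be declared with argparse roughly as follows. This is a sketch under assumptions, not the script's actual parser: the function name build_parser and the empty-string default for --prefix are illustrative choices.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command-line arguments."""
    parser = argparse.ArgumentParser(
        description="Rename images in a folder and generate captions."
    )
    parser.add_argument(
        "folder_path",
        help="Path to the folder containing the images",
    )
    parser.add_argument(
        "--prefix",
        default="",  # assumed default; the README only says it is optional
        help="Prefix to prepend to the renamed image and text files",
    )
    parser.add_argument(
        "--prefix_type",
        required=True,
        choices=["subject", "style"],
        help="Whether captions focus on the subject or the style",
    )
    return parser
```

Using choices lets argparse reject any --prefix_type value other than "subject" or "style" before any image is touched.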
Example:
python captioner.py ./my_images --prefix "cat_" --prefix_type subject
This command will process all images in the ./my_images folder, rename them with the prefix "cat_", and generate captions focusing on the subject (in this case, cats).
The script will:
- Rename all images in the specified folder to <prefix><number>.jpg
- Generate a caption for each image
- Save each caption in a text file named <prefix><number>.txt
- Display progress and results in the console
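The steps above can be sketched as follows. This is an illustration under assumptions rather than the script's real implementation: process_folder and placeholder_caption are hypothetical names, the caption function is a stub standing in for the language-model call, numbering is assumed to start at 1, and the two-phase rename through temporary names is a design choice of this sketch to avoid clobbering files mid-run.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff", ".webp"}


def placeholder_caption(image_path):
    # Hypothetical stand-in for the language-model call.
    return f"caption for {image_path.name}"


def process_folder(folder, prefix, caption_fn=placeholder_caption, max_workers=4):
    folder = Path(folder)
    images = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )

    # Phase 1: move every image to a unique temporary name so that a
    # final name never overwrites an image that is still waiting its turn.
    token = uuid.uuid4().hex
    staged = []
    for index, source in enumerate(images):
        temp = folder / f"__staging_{token}_{index}{source.suffix}"
        source.rename(temp)
        staged.append(temp)

    # Phase 2: assign the final <prefix><number>.jpg names in sorted order.
    renamed = []
    for number, temp in enumerate(staged, start=1):
        target = folder / f"{prefix}{number}.jpg"
        temp.rename(target)
        renamed.append(target)

    # Caption concurrently; pool.map() returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        captions = list(pool.map(caption_fn, renamed))

    # Save each caption next to its image and report progress.
    for image_path, caption in zip(renamed, captions):
        image_path.with_suffix(".txt").write_text(caption, encoding="utf-8")
        print(f"{image_path.name}: {caption}")
    return renamed
```

Threads suit this kind of work because each caption request mostly waits on the network. Note that the sketch only changes file names, as the list above describes; it does not convert image data to JPEG.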
You can modify the make_langchain_call function to adjust the prompts or change the language model used for captioning.
Ensure you have the necessary permissions to read from and write to the specified folder. The script will overwrite existing files with the same names, so use caution when specifying the prefix.
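Because existing files can be overwritten, it may help to check for name clashes before running the script. The sketch below is not part of captioner.py: find_existing_targets is a hypothetical helper, and it assumes numbering starts at 1 and that count is the number of images you expect to process.

```python
from pathlib import Path


def find_existing_targets(folder, prefix, count):
    """Return target paths that already exist and could be overwritten."""
    folder = Path(folder)
    existing = []
    for number in range(1, count + 1):
        # Each image yields a .jpg and a matching .txt caption file.
        for suffix in (".jpg", ".txt"):
            candidate = folder / f"{prefix}{number}{suffix}"
            if candidate.exists():
                existing.append(candidate)
    return existing
```

An empty result suggests the chosen prefix is safe for that folder. The check is conservative: images already in the folder under a target name are themselves renamed during processing, so not every reported path necessarily means lost data.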