Flickr Image Tagger

I have a Flickr account with over 20,000 public photos. Many are still untagged and only partially described, and I wanted to find a way to improve that situation. That's why I wrote the `imagetagger` command-line application a couple of weeks ago, in a lengthy dialogue with ChatGPT. "Wrote" is probably the wrong term: I kept correcting ChatGPT about the code it gave me until the code was right.

imagetagger is a Python script (GitHub) that can improve the tags, title, and description of a Flickr photo by fetching data from Wikipedia or by generating a description for the photo with the help of Azure Computer Vision and OpenAI GPT-3. The program takes a number of arguments, including a Flickr API key and secret, an Azure Vision API key and endpoint, and an OpenAI API key. All those secrets can be given once and then stored for subsequent calls with the "--store" parameter.

The first function we'll look at is get_flickr_image_info, which uses the Flickr API to fetch information about the specified photo, including its title, description, tags, and a URL for an image that is at least 400 but less than 600 pixels wide.

import flickrapi

def get_flickr_image_info(api_key, api_secret, photo_id):
    flickr = flickrapi.FlickrAPI(api_key, api_secret, format='parsed-json')
    info = flickr.photos.getInfo(photo_id=photo_id)
    title = info['photo']['title']['_content']
    description = info['photo']['description']['_content']
    # Each tag is a dict; keep only the tag text so callers can join the tags
    flickr_tags = [tag['raw'] for tag in info['photo']['tags']['tag']]
    sizes = flickr.photos.getSizes(photo_id=photo_id)
    image_url = None  # stays None if no size falls in the wanted range
    for size in sizes['sizes']['size']:
        # int() guards against widths that arrive as strings
        if 400 <= int(size['width']) < 600:
            image_url = size['source']
            break
    return title, description, flickr_tags, image_url
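
The size-selection step can be tried in isolation, without calling Flickr. The sketch below assumes a hypothetical helper, `pick_image_url`, and a hand-made `sample_sizes` list that only mimics the shape of a `flickr.photos.getSizes` response; the URLs are placeholders.

```python
def pick_image_url(sizes, min_width=400, max_width=600):
    """Return the source URL of the first size with min_width <= width < max_width."""
    for size in sizes:
        # Widths may arrive as strings or integers, so normalize first.
        if min_width <= int(size["width"]) < max_width:
            return size["source"]
    return None  # no size in the wanted range


# Hand-made sample data (an assumption); real responses carry more fields.
sample_sizes = [
    {"label": "Thumbnail", "width": 100, "source": "https://example.invalid/t.jpg"},
    {"label": "Medium", "width": "500", "source": "https://example.invalid/m.jpg"},
    {"label": "Large", "width": 1024, "source": "https://example.invalid/l.jpg"},
]

print(pick_image_url(sample_sizes))
```

Returning `None` when nothing matches lets the caller decide what to do, for example skipping the Azure tagging step for that photo.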

Next, the generate_title function uses OpenAI's Completion API to generate a title for the photo based on the tags for the photo. This function is invoked when no title is present for the photo on Flickr.

import openai

def generate_title(objects, tags, engine):
    # `objects` is not used in this function
    prompt = f'Create a short title for an image with these tags {", ".join(tags)}'
    print(f"Title prompt: {prompt}")
    # Legacy Completion API (openai package before 1.0)
    response = openai.Completion.create(
        engine=engine,
        prompt=prompt,
        max_tokens=1024,
        temperature=0.5,
        top_p=1,
        frequency_penalty=0,
        presence_penalty=0
    )
    title = response['choices'][0]['text']
    if title.startswith('rera\n\n'):
        title = title[6:]
    # Trim the surrounding whitespace the completion may carry
    return title.strip()
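
To see what prompt the model receives and how a raw completion might be tidied, here is a small sketch that needs no API key. `build_title_prompt` and `clean_completion` are hypothetical helper names, and the sample tags and completion text are made up; only the prompt wording is taken from generate_title.

```python
def build_title_prompt(tags):
    """Build the same prompt string that generate_title sends."""
    return f'Create a short title for an image with these tags {", ".join(tags)}'


def clean_completion(text):
    """Trim surrounding whitespace and quotes that a completion may carry."""
    return text.strip().strip('"')


# Made-up inputs, for illustration only.
print(build_title_prompt(["harbor", "sunset", "boats"]))
print(clean_completion('\n\n"Sunset over the Harbor"\n'))
```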

The generate_description function is responsible for generating a description for the photo. If a Wikipedia page is found for the title of the photo, the page's summary is used as the description. If no Wikipedia page is found, the function uses OpenAI's Completion API to generate a description based on the title and tags of the photo. This function is invoked when no description is present for the photo on Flickr.

def generate_description(objects, title, tags, engine):
    # get_wikipedia_link is defined elsewhere in the script; the code below
    # expects a (link, summary) pair with falsy values when no page is found
    wikipedia_link, description = get_wikipedia_link(title)
    if not description:
        prompt = f'Create an encyclopedic photo description for a photo titled "{title}". The photo has these tags: {", ".join(tags)}.'
        print(f"Description prompt: {prompt}")
        response = openai.Completion.create(
            engine=engine,
            prompt=prompt,
            max_tokens=1024,
            temperature=0.8,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0
        )
        description = response['choices'][0]['text']
        if description.startswith('rera\n\n'):
            description = description[6:]
        # Trim the surrounding whitespace the completion may carry
        description = description.strip()
    if wikipedia_link:
        description = description + "\n\n\nWikipedia: " + wikipedia_link
    return description
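
get_wikipedia_link is not shown in this post. One way such a helper could work, purely as an assumption, is to query the Wikipedia REST summary endpoint and read the `extract` and `content_urls` fields from the JSON it returns. The function names below, the User-Agent string, and the choice of endpoint are illustrative and may differ from what imagetagger actually does.

```python
import json
import urllib.parse
import urllib.request

SUMMARY_URL = "https://en.wikipedia.org/api/rest_v1/page/summary/"


def parse_summary(payload):
    """Pull (link, summary) out of a summary payload; missing fields become None."""
    link = payload.get("content_urls", {}).get("desktop", {}).get("page")
    summary = payload.get("extract") or None
    return link, summary


def lookup_wikipedia(title):
    """Return (link, summary) for a title, or (None, None) if the lookup fails."""
    url = SUMMARY_URL + urllib.parse.quote(title.replace(" ", "_"), safe="")
    request = urllib.request.Request(url, headers={"User-Agent": "imagetagger-sketch/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return parse_summary(json.load(response))
    except (OSError, ValueError):
        # Network errors, HTTP errors such as 404, and malformed JSON all
        # count as "no page found" in this sketch.
        return None, None


# Offline check with a hand-made payload shaped like the endpoint's JSON.
sample = {
    "extract": "Seattle is a seaport city.",
    "content_urls": {"desktop": {"page": "https://en.wikipedia.org/wiki/Seattle"}},
}
print(parse_summary(sample))
```

Keeping the parsing separate from the network call makes the fallback easy to test: when `lookup_wikipedia` returns `(None, None)`, generate_description would take the OpenAI path.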

The generate_tags function uses Azure Computer Vision to generate a list of tags for the photo based on its content. This function is invoked when no tags are present for the photo on Flickr.

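from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

def generate_tags(image_url, engine):
    # tag_image analyzes the image at the given URL, so it needs the image URL
    client = ComputerVisionClient(engine.azure_endpoint, CognitiveServicesCredentials(engine.azure_key))
    image_tags = client.tag_image(image_url).tags
    return [tag.name for tag in image_tags]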

Finally, the main function ties everything together by parsing the command-line arguments, fetching information about the photo from Flickr, generating a title, description, and tags if needed, and writing the title, description, and tags back to Flickr if the `--store` argument is provided.

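def main():
    # create_parser and Engine are helpers defined elsewhere in the script
    parser = create_parser()
    args = parser.parse_args()
    title, description, flickr_tags, image_url = get_flickr_image_info(args.api_key, args.api_secret, args.photo_id)
    engine = Engine(args.gpt3_engine, args.openai_api_key)
    objects = []  # placeholder: the generators shown here do not use it
    if not flickr_tags:
        # Generate tags first, from the image itself, so the prompts can use them
        flickr_tags = generate_tags(image_url, engine)
    if not title:
        title = generate_title(objects, flickr_tags, engine)
    if not description:
        description = generate_description(objects, title, flickr_tags, engine)
    if args.store:
        # Write calls need a client that is authorized with write permission
        flickr = flickrapi.FlickrAPI(args.api_key, args.api_secret, format='parsed-json')
        flickr.authenticate_via_browser(perms='write')
        flickr.photos.setMeta(photo_id=args.photo_id, title=title, description=description)
        # Quote each tag so multi-word tags survive the space-delimited format
        flickr.photos.setTags(photo_id=args.photo_id, tags=' '.join(f'"{tag}"' for tag in flickr_tags))
    print(title)
    print(description)
    print(flickr_tags)

if __name__ == '__main__':
    main()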
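
main relies on a create_parser helper that is not shown here. A minimal sketch with argparse follows, assuming only the attribute names that main reads (`api_key`, `api_secret`, `photo_id`, `gpt3_engine`, `openai_api_key`, `store`). The flag spellings, the default engine name, and the sample values are guesses for illustration, so check the README for the real options.

```python
import argparse


def create_parser():
    """Build a parser whose attribute names match what main() reads."""
    parser = argparse.ArgumentParser(prog="imagetagger")
    parser.add_argument("photo_id", help="numeric id of the Flickr photo")
    parser.add_argument("--api-key", help="Flickr API key")
    parser.add_argument("--api-secret", help="Flickr API secret")
    parser.add_argument("--openai-api-key", help="OpenAI API key")
    # The default engine name is an assumption, not taken from the script.
    parser.add_argument("--gpt3-engine", default="text-davinci-003",
                        help="OpenAI completion engine to use")
    parser.add_argument("--store", action="store_true")
    return parser


# Parse a made-up command line instead of sys.argv to show the result.
args = create_parser().parse_args(
    ["52353331842", "--api-key", "KEY", "--api-secret", "SECRET", "--store"]
)
print(args.photo_id, args.api_key, args.store)
```

argparse turns `--api-key` into the attribute `args.api_key`, which is why the dashed flags line up with the names used in main.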

The "photo_id" argument is the number that follows the "photos/<user>/" segment in a Flickr URL. In the URL below, it is 52353331842:

https://www.flickr.com/photos/clemensv/52353331842/ 
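
If you have a photo's URL rather than its id, the id can be pulled out with a regular expression. This is a small sketch; `photo_id_from_url` is a hypothetical helper and not part of imagetagger.

```python
import re

# Matches ".../photos/<user>/<digits>" and captures the digits.
_PHOTO_URL = re.compile(r"flickr\.com/photos/[^/]+/(\d+)")


def photo_id_from_url(url):
    """Return the photo id from a Flickr photo page URL, or None if absent."""
    match = _PHOTO_URL.search(url)
    return match.group(1) if match else None


print(photo_id_from_url("https://www.flickr.com/photos/clemensv/52353331842/"))
```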



The README of the project shows how you can install the script right from GitHub using pip, assuming you have Python installed.


