The OpenAI GPT-4 Vision API is now available to try: GPT4V, gpt-4-vision-preview, chatgpt

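The long-awaited GPT-4V has finally been released and is currently in preview. Users who have tried the OpenAI vision API have been impressed by what it can do.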

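People have already come up with all kinds of creative uses, such as having the AI narrate a video, and the results are remarkably smooth.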

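The whole process can be broken into 7 steps:

  • Extract image frames from the video;
  • Write a prompt asking for a description of the frames;
  • Send the request to GPT;
  • Design a prompt template suited to voice narration;
  • Generate a text script suitable for narration;
  • Convert the text script to audio;
  • Merge the audio with the video file.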
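
Steps 1 to 3 can be sketched roughly as follows. This is a minimal sketch rather than the method used in any particular demo: it assumes the opencv-python package for decoding the video, and the function names (`extract_jpeg_frames`, `build_messages`) and the sampling interval `every_n` are made up for illustration. Frames are passed to the model as base64 data URLs inside `image_url` entries.

```python
import base64


def extract_jpeg_frames(video_path, every_n=50):
    """Step 1: read the video and keep every `every_n`-th frame as JPEG bytes."""
    import cv2  # assumption: opencv-python is installed; imported lazily

    cap = cv2.VideoCapture(video_path)
    frames = []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of the video (or it could not be read)
            break
        if index % every_n == 0:
            encoded, buf = cv2.imencode(".jpg", frame)
            if encoded:
                frames.append(buf.tobytes())
        index += 1
    cap.release()
    return frames


def build_messages(prompt, jpeg_frames):
    """Steps 2 and 3: one text prompt followed by each frame as a base64 data URL."""
    content = [{"type": "text", "text": prompt}]
    for jpeg in jpeg_frames:
        b64 = base64.b64encode(jpeg).decode("ascii")
        content.append(
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
            }
        )
    return [{"role": "user", "content": content}]
```

The returned list can be passed as `messages` to `client.chat.completions.create(...)`. Sampling only every N-th frame keeps the request small, since each image adds to the token cost.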

Feel free to try this yourself.
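
For the later steps (narration prompt, audio, and merging), here is one possible sketch. Several things in it are assumptions and not from the original demo: the speaking rate of 2 words per second, the `tts-1` model and `alloy` voice of the OpenAI text-to-speech endpoint, whether the endpoint is reachable through the same base URL, and the use of the `ffmpeg` command-line tool for merging. All function names are hypothetical.

```python
import subprocess

WORDS_PER_SECOND = 2  # assumed speaking rate; tune it for the chosen voice


def narration_prompt(duration_seconds, style="a nature documentary"):
    """Step 4: a prompt template that keeps the script short enough to be spoken."""
    max_words = int(duration_seconds * WORDS_PER_SECOND)
    return (
        f"These are frames from a video that lasts about {duration_seconds} seconds. "
        f"Write a voice-over script in the style of {style}. "
        f"Use at most {max_words} words and output only the narration text."
    )


def synthesize_speech(client, script, out_path="narration.mp3"):
    """Step 6: convert the script to audio with the text-to-speech endpoint."""
    response = client.audio.speech.create(model="tts-1", voice="alloy", input=script)
    response.write_to_file(out_path)
    return out_path


def ffmpeg_merge_command(video_path, audio_path, out_path):
    """Step 7: build an ffmpeg command that keeps the video and swaps in the audio."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-i", audio_path,
        "-map", "0:v:0",   # video stream from the first input
        "-map", "1:a:0",   # audio stream from the second input
        "-c:v", "copy",    # do not re-encode the video
        "-shortest",       # stop at the shorter of the two streams
        out_path,
    ]


def merge(video_path, audio_path, out_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(ffmpeg_merge_command(video_path, audio_path, out_path), check=True)
```

Step 5 is then a normal chat completion call that uses `narration_prompt(...)` as the text part of the message, with the frames attached as images.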

Let's start with a basic example:

First, get a key from here: https://github.com/xing61/xiaoyi-robot

    from openai import OpenAI

    API_SECRET_KEY = "your 智增增 key"
    BASE_URL = "https://flag.smarttrot.com/v1/"  # 智增增 base_url


    # gpt4v: send a text prompt plus one image and print the reply
    def gpt4v(query):
        client = OpenAI(api_key=API_SECRET_KEY, base_url=BASE_URL)
        resp = client.chat.completions.create(
            model="gpt-4-vision-preview",
            messages=[
                {
                    "role": "user",
                    "content": [
                        {"type": "text", "text": query},
                        {
                            "type": "image_url",
                            # image_url is an object with a "url" field
                            "image_url": {
                                "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                            },
                        },
                    ],
                }
            ],
            max_tokens=300,
        )
        print(resp)  # the full response object
        print(resp.choices[0].message.content)  # just the model's text


    if __name__ == '__main__':
        gpt4v("What are in these images? Is there any difference between them?")
![](https://ad.itadn.com/c/weblog/blog-img/images/2025-08-16/F2jZDALq8WOkKGh5wBVdQfTU40ub.png)

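This code sends one image to OpenAI and asks the model to identify or analyze what is in it. Note one quirk, though: the prompt asks about differences between images, yet only a single image is actually uploaded.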

The image is the nature boardwalk photo at the URL in the code.

The response we got back:

    I'm sorry, but I can only view one image at a time. The image you've provided is a beautiful landscape scene. It features a wooden boardwalk or path leading through a lush green meadow with tall grass on both sides. The sky is partly cloudy with a rich blue color and some gentle white clouds, suggesting a pleasant day. The scenery is tranquil and might be a nature reserve or park. There are no people or animals visible in the image. If you have another image for comparison, please provide it separately.

The result is pretty good.
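
Since the prompt asks about differences but only one image is sent, a natural next experiment is to attach two images to the same message. The following is a minimal sketch under that assumption. The function names are hypothetical, the example.com URLs in the usage note are placeholders, and `client` is assumed to be an `OpenAI` client created as in the basic example.

```python
def comparison_messages(query, image_urls):
    """Build one user message with the question followed by every image."""
    if len(image_urls) < 2:
        raise ValueError("need at least two images to compare")
    content = [{"type": "text", "text": query}]
    for url in image_urls:
        # each image is its own image_url entry in the same content list
        content.append({"type": "image_url", "image_url": {"url": url}})
    return [{"role": "user", "content": content}]


def compare_images(client, query, image_urls):
    """Send the question and all images in a single request; return the reply text."""
    resp = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=comparison_messages(query, image_urls),
        max_tokens=300,
    )
    return resp.choices[0].message.content
```

For example, `compare_images(client, "Is there any difference between these images?", ["https://example.com/a.jpg", "https://example.com/b.jpg"])` would send both placeholder images in one request.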
