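Today we take a huge number of photos with our smartphones and share many of them on social networks or messaging apps. Sometimes, though, images alone are not enough to convey the precious moments we capture in daily life, in time spent with family, or on memorable trips.
Imagine if we could use Generative AI to put into words what our photos mean and let AI tell the story of those moments. You could publish the text online, share it with friends and family, or keep it as a personal journal.
Because this is a tool I personally wanted to use, I decided to build it as a creative developer rather than as a researcher, ML engineer, or data scientist. I was interested in using and integrating a set of powerful Google APIs to get the job done.
This article comes with a Jupyter/Colab notebook containing the detailed steps of the whole solution. It covers the full pipeline: extracting EXIF photo metadata, using Google Maps APIs to get information about where each photo was taken, and calling Generative AI APIs (Vertex Imagen for image captioning and the Vertex PaLM API for blog post generation).
The output of the pipeline is a generated blog post describing the whole photo album. You can upload your own album to the Colab notebook and see how Generative AI describes in words the moments your camera recorded.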
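This project relies on Google Cloud Platform (GCP) to access the APIs. To run it on Colab, you can use an existing GCP account or sign up for a new one and receive $300 in free credits.
If you want to run the notebook on Colab with the provided photos or your own, the notebook's setup guide walks you through installing the required libraries, authenticating to GCP with your Google identity, obtaining a Google Maps Platform API key, and enabling the APIs the notebook uses.
The last setup step is downloading the sample photos I provide from my trips to Los Angeles and San Francisco.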
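In this part of the notebook, you configure the path to the folder that contains the album photos. The notebook then uses the Pillow imaging library to process the photos.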
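As a rough illustration of the metadata step, the sketch below reads the capture time and GPS position of a single photo with Pillow. It is not the notebook's code: the function names `dms_to_decimal` and `read_photo_metadata` are hypothetical, and it assumes a recent Pillow version that provides `Exif.get_ifd` and a JPEG that actually carries EXIF data.

```python
from datetime import datetime


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere letter to decimal degrees."""
    degrees, minutes, seconds = (float(v) for v in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and West are negative in decimal notation.
    return -decimal if ref in ("S", "W") else decimal


def read_photo_metadata(path):
    """Return (taken_at, latitude, longitude); each value is None when the tag is absent."""
    # Pillow is imported here so dms_to_decimal can be used without it (illustrative choice).
    from PIL import Image

    with Image.open(path) as img:
        exif = img.getexif()
        exif_ifd = exif.get_ifd(0x8769)  # Exif sub-IFD: timestamps and capture settings
        gps_ifd = exif.get_ifd(0x8825)   # GPS sub-IFD: coordinates

    taken_at = None
    raw_date = exif_ifd.get(0x9003)      # DateTimeOriginal, "YYYY:MM:DD HH:MM:SS"
    if raw_date:
        taken_at = datetime.strptime(raw_date, "%Y:%m:%d %H:%M:%S")

    lat = lng = None
    if 2 in gps_ifd and 4 in gps_ifd:    # GPSLatitude and GPSLongitude
        lat = dms_to_decimal(gps_ifd[2], gps_ifd.get(1, "N"))
        lng = dms_to_decimal(gps_ifd[4], gps_ifd.get(3, "E"))
    return taken_at, lat, lng
```

Sorting the album by `taken_at` gives the chronological order that the later prompt relies on, and `(lat, lng)` is what the Maps calls expect.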
Google Maps offers many specialized APIs for different tasks. Here we use two of them: the Geocoding API and the Places API.
Once the Maps Platform API key is set, calling the Geocoding API and the Places API is straightforward.
import googlemaps
gmaps = googlemaps.Client(key=MAPS_API_KEY)
locations = gmaps.reverse_geocode(latlng=(lat,lng))
nearby_places = gmaps.places_nearby(location=(lat,lng), radius=radius)
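The two calls return raw JSON-like structures, so some post-processing is needed before the results can go into a prompt. The helper below is a minimal sketch of that step, not the notebook's implementation: `summarize_location` is a hypothetical name, and it assumes the documented response shapes, where `reverse_geocode` returns a list of results with a `formatted_address` field and `places_nearby` returns a dictionary whose `results` entries have a `name` field.

```python
def summarize_location(geocode_results, places_response, max_places=5):
    """Pick one readable address and a few nearby place names from the raw responses."""
    # The first reverse-geocoding result is usually the most specific match.
    address = geocode_results[0]["formatted_address"] if geocode_results else ""
    # Keep only the names of the first few nearby places.
    places = places_response.get("results", [])[:max_places]
    nearby = [place["name"] for place in places if "name" in place]
    return address, nearby


# Usage with hand-written stand-ins for the API responses:
fake_geocode = [{"formatted_address": "Santa Monica, CA, USA"}]
fake_places = {"results": [{"name": "Santa Monica Pier"}, {"name": "Pier Burger"}]}
print(summarize_location(fake_geocode, fake_places))
```

Capping the number of nearby places keeps each photo's entry short, which matters once many photos are concatenated into a single prompt.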
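In this part of the notebook we start using Generative AI. Vertex Imagen provides an API for image captioning, that is, describing the content of a picture as text.
To do that, we first need to initialize the Vertex AI SDK with your GCP project.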
import vertexai
from vertexai.vision_models import ImageTextModel, Image
vertexai.init(project=PROJECT_ID)
model = ImageTextModel.from_pretrained("imagetext")
Getting a caption from an image is then simple.
source_image = Image.load_from_file(location=path)
captions = model.get_captions(
    image=source_image,
    number_of_results=1,
    language="en",
)
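In the context of large language models (LLMs), a prompt is the input or query given to the model to steer it toward a response. The quality and specificity of the prompt are critical in shaping the model's output.
LLMs are usually fine-tuned to follow the instructions in a prompt, which lets them perform tasks they were not explicitly trained for. Designing a good prompt usually takes some trial and error: you interact with the LLM and check whether the output is close to (or better than) what you expected.
This project needs a prompt that guides the LLM to generate a post describing the moments captured in a set of photos. That means writing specific instructions for the LLM: the input format (a list with the photo metadata), the rules to follow (for example, referencing <Photo id> when describing a photo), and the expected output format, which is text interleaved with photo placeholders.
You can refer to the prompt template I provide, wrapped in the function below. Note that the template has placeholders for the photo descriptions and for a context paragraph, where you may want to give the LLM more information about the circumstances in which the photos were taken.
Prompt engineering techniques are prompt design patterns discovered and proposed by researchers or the community to help LLMs produce better output. Few-shot prompting is one of them: we provide a few examples of inputs and expected outputs, as in the prompt template below. In my tests with the Vertex PaLM API, this technique helped get the desired output in most cases.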
def generate_prompt(context, pictures_infos):
    # Build the few-shot prompt: instructions, one worked example,
    # then the album context and the per-photo metadata lines.
    prompt = f"""
You are a copywriter and journalist.
Can you help me to write a photo tour that describes the moments
registered in a photo album from a context and some information
I provide about the photos?
The items were already sorted by the date and time the photos were taken.
Pay attention to the dates and time to infer how many days were
covered by these photos and at which time of the day they were taken.
Please include descriptions of all the photos taken.
Only report places or experiences that are described by the
photo informations.
The photos information has the following structure:
- <Photo id> | Date the photo was taken | Time the photo was taken |
Photo Description generated by an LLM |
Approximate Locations where the photo was taken |
Approximate Nearby locations where photo was taken
Here is an example of photo information and how it should be generated
in plain text, interleaving photo descriptions and the <Photo id>.
Example photo information:
- <Photo 0> | Date: 08/04/2023 | Time: 07:53:13 |
Photo Description: a man stands in front of a sign that says
welcome to the united states |
Possible Photo Locations: BURBERRY LAX TERMINAL B, Los Angeles
International Airport, Terminal B, Los Angeles, Los Angeles County |
Possible Photo Nearby locations: Los Angeles, Star Alliance Lounge,
ICE International Currency Exchange, Relay, Bank of America.
Expected output:
I was happy finally arriving to my destination, Los Angeles.
While I went into US Customs my heart was filled of anxiety
to leave the airport and get to visit the city.
<Photo 0>
```
Photos album context: {context}
Photos description:
{pictures_infos}
```
"""
    return prompt
In this example, we provide the photos metadata and a short context paragraph to generate a prompt from the template above.
album_context = """I flew to Los Angeles for a short trip,
and the album contains the photos
from the day I arrived there.
The man in those photos is myself.
"""
blog_prompt = generate_prompt(album_context, photos_info_concat)
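The `photos_info_concat` string passed to `generate_prompt` holds one line per photo. The sketch below shows one way such a line could be assembled so that it matches the layout seen in the generated prompt; `format_photo_info` and its parameters are hypothetical names, and the weekday and AM/PM output assume the default English locale.

```python
from datetime import datetime


def format_photo_info(photo_id, taken_at, caption, locations, nearby):
    """Format one photo's metadata as a single prompt line."""
    # Example: 08/04/2023 (Friday) 07:53 AM
    when = taken_at.strftime("%m/%d/%Y (%A) %I:%M %p")
    return (
        f"- <Photo {photo_id}> | Date and time: {when} | "
        f"Photo Description: {caption} | "
        f"Locations: {', '.join(locations)} | "
        f"Possible Nearby locations: {', '.join(nearby)}"
    )


# Usage: join one formatted line per photo, in chronological order.
lines = [
    format_photo_info(
        0,
        datetime(2023, 8, 4, 7, 53),
        "a man stands in front of a sign",
        ["Los Angeles International Airport", "Los Angeles"],
        ["Star Alliance Lounge", "Relay"],
    )
]
example_photos_info = "\n".join(lines)
print(example_photos_info)
```

Including the weekday and a 12-hour clock gives the model explicit cues about the time of day, which the prompt asks it to pay attention to.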
Now try it yourself. Just copy the following prompt, generated by this process, and paste it into a consumer LLM chat system (for example, Bard).
You may be as impressed with the results as I was!
You are a copywriter and journalist.
Can you help me write a photo tour that describes the moments registered in a
photo album from a context and some information I provide about the photos?
The items are already sorted by the time the photos was taken.
Pay attention to the dates and time to infer how many days were
covered by these photos and in which time of the day they were taken.
Please include descriptions of all the photos taken.
Do not report any place or experience that is not described by the
photo informations.
The photos information has following structure:
- <Photo id> | Date the photo was taken | Time the photo was taken |
Photo Description generated by an LLM |
Approximate Locations where the photo was taken |
Approximate Nearby locations where photo was taken
Here is an example of photo information and how it should be generated in plain
text,
interleaving photo descriptions and the <Photo id>.
Example photo information:
- <Photo 0> | Date: 08/04/2023 | Time: 07:53:13 |
Photo Description: a man stands in front of a sign that says welcome to the
united states |
Possible Photo Locations: BURBERRY LAX TERMINAL B, Los Angeles International
Airport, Terminal B, Los Angeles, Los Angeles County |
Possible Photo Nearby locations: Los Angeles, Star Alliance Lounge, ICE
International Currency Exchange, Relay, Bank of America
Expected output:
I was happy finally arriving to my destination, Los Angeles.
While I went into US Customs my heart was filled of anxiety to leave the airport
and get to visit the city.
<Photo 0>
```
Photos album context: I flew to Los Angeles for a short trip, and the album
contains the photos from the day I arrived there. The man in those photos is myself.
Photos description:
- <Photo 0> | Date and time: 08/04/2023 (Friday) 07:53 AM | Photo Description: a
man stands in front of a sign that says welcome to the united states | Locations: BURBERRY
LAX TERMINAL B, Los Angeles International Airport, Los Angeles, Los Angeles County,
California | Possible Nearby locations: Los Angeles, Star Alliance Lounge, ICE
International Currency Exchange, Relay, Bank of America
- <Photo 1> | Date and time: 08/04/2023 (Friday) 09:32 AM | Photo Description: a man in a
nasa shirt is sitting in a white car | Locations: Los Angeles International Airport, Los
Angeles, Los Angeles County, California, United States | Possible Nearby locations: Los
Angeles
- <Photo 2> | Date and time: 08/04/2023 (Friday) 09:59 AM | Photo Description: a man in a
white shirt is driving a mustang | Locations: Westchester, Los Angeles, Los Angeles
County, California, United States | Possible Nearby locations: Plaza Towers OBGYN:
Lawrence Bruksch, MD, LA Fitness, Dr. Jitsen Chang, Obstetrician-gynecologist, Kinecta
Federal Credit Union - Westchester, Clarity Retirement
- <Photo 3> | Date and time: 08/04/2023 (Friday) 10:29 AM | Photo Description: a man
wearing a nasa shirt stands on a beach | Locations: Los Angeles, Los Angeles County,
California, United States | Possible Nearby locations: Los Angeles, Venice
- <Photo 4> | Date and time: 08/04/2023 (Friday) 11:29 AM | Photo Description: a man sits
on a bench in front of a subba gump shrimp restaurant | Locations: Santa Monica, Los
Angeles County, California, United States | Possible Nearby locations: Bubba Gump Shrimp
Co., Santa Monica Pier Rock Shop, Pier Burger, Santa Monica Police Pier Substation, 66-To-
Cali
- <Photo 5> | Date and time: 08/04/2023 (Friday) 11:43 AM | Photo Description: a man
stands on a pier with a ferris wheel in the background | Locations: Santa Monica, Los
Angeles County, California, United States | Possible Nearby locations: Santa Monica Pier,
The eCenter, Character Drawings, Santa Monica Pier, ビーチ・サインズ&モア
- <Photo 6> | Date and time: 08/04/2023 (Friday) 11:46 AM | Photo Description: a man
stands on a pier with a seagull sitting on the railing | Locations: Santa Monica, Los
Angeles County, California, United States | Possible Nearby locations: Santa Monica,
Pacific Plunge, Inkie’s Scrambler, Fun 'N' Games, Pacific Wheel
- <Photo 7> | Date and time: 08/04/2023 (Friday) 11:52 AM | Photo Description: a man with
a backpack that says o'neill on it | Locations: Santa Monica, Los Angeles County,
California, United States | Possible Nearby locations: Coffee Bean & Tea Leaf, Japadog (at
Santa Monica Pier), Santa Monica Trapeze School, Pacific Park on the Santa Monica Pier,
Funnel Cakes
- <Photo 8> | Date and time: 08/04/2023 (Friday) 12:10 PM | Photo Description: a man poses
in front of the cheesecake factory | Locations: Downtown, Santa Monica, Los Angeles
County, California, United States | Possible Nearby locations: Forever 21, Tiffany & Co.,
Louis Vuitton Santa Monica Place, Pandora Jewelry, Johnny Was
- <Photo 9> | Date and time: 08/04/2023 (Friday) 12:32 PM | Photo Description: a plate of
food with a napkin that says the cheesecake factory | Locations: Downtown, Santa Monica,
Los Angeles County, California, United States | Possible Nearby locations: Forever 21,
Tesla, Nike Santa Monica, Louis Vuitton Santa Monica Place, Pandora Jewelry
- <Photo 10> | Date and time: 08/04/2023 (Friday) 01:15 PM | Photo Description: a man
stands in front of a blue tesla model x | Locations: Downtown, Santa Monica, Los Angeles
County, California, United States | Possible Nearby locations: Forever 21, Tiffany & Co.,
Louis Vuitton Santa Monica Place, Pandora Jewelry, Johnny Was
- <Photo 11> | Date and time: 08/04/2023 (Friday) 05:03 PM | Photo Description: a green
trolley is parked in front of a gap store | Locations: La Brea, Central LA, Los Angeles,
Los Angeles County, California | Possible Nearby locations: Haagen-Dazs Ice Cream Shops,
Wetzel's Pretzels, Nike The Grove, Gap, Bar Verde
- <Photo 12> | Date and time: 08/04/2023 (Friday) 05:44 PM | Photo Description: a variety
of caramel apples are displayed in a store | Locations: La Brea, Central LA, Los Angeles,
Los Angeles County, California | Possible Nearby locations: Los Angeles, The Original
Farmers Market, The Dog Bakery - Fresh Baked Treats & Dog Birthday Cakes, Marconda's,
Littlejohn's English Toffee House & Fine Candies
- <Photo 13> | Date and time: 08/04/2023 (Friday) 06:01 PM | Photo Description: a man is
holding a scoop of ice cream in front of a sign that says " drinks " | Locations: Farmers
Market, La Brea, Central LA, Los Angeles, Los Angeles County | Possible Nearby locations:
Los Angeles, The Original Farmers Market, Littlejohn's English Toffee House & Fine
Candies, Hutchco Technologies, Marconda's
- <Photo 14> | Date and time: 08/04/2023 (Friday) 06:06 PM | Photo Description: cars are
parked in front of a ross store | Locations: 3rd / Ogden, La Brea, Central LA, Los
Angeles, Los Angeles County | Possible Nearby locations: A1 Locksmith & Keys, GapBody, 3rd
/ Ogden, 3rd & Ogden (Eastbound), Karsaz & Associates
- <Photo 15> | Date and time: 08/04/2023 (Friday) 09:33 PM | Photo Description: a hotel
room with a blue blanket on the bed | Locations: Eagle Rock, Northeast Los Angeles, Los
Angeles, Los Angeles County, California | Possible Nearby locations: Welcome Inn, North
East Los Angeles Hotel Owners Association, Kandoo Kitchen, Inland Faculty Medical Group
Inc, Pathway Healthcare
- <Photo 16> | Date and time: 08/04/2023 (Friday) 09:58 PM | Photo Description: two boxes
of food on a table with a fork | Locations: Eagle Rock, Northeast Los Angeles, Los
Angeles, Los Angeles County, California | Possible Nearby locations: Welcome Inn, North
East Los Angeles Hotel Owners Association, MV, Inland Faculty Medical Group Inc, Pathway
Healthcare
```
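Now we use the TextGenerationModel from the Vertex PaLM API to submit the prompt designed above and get the generated post. You can tune the randomness or creativity of the generated text with parameters such as temperature, top_k, and top_p, as described in the code comments and in the API documentation.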
from vertexai.language_models import TextGenerationModel
generation_model = TextGenerationModel.from_pretrained("text-bison")
def generate_text(prompt, temperature=1.0,
                  top_p=0.4, top_k=40, max_output_tokens=1024):
    parameters = {
        # Temperature controls the degree of randomness in token selection.
        "temperature": temperature,
        # Tokens are selected from most probable to least until the sum
        # of their probabilities equals the top_p value.
        "top_p": top_p,
        # A top_k of 1 means the selected token is the most probable
        # among all tokens.
        "top_k": top_k,
        # Token limit determines the maximum amount of text output.
        "max_output_tokens": max_output_tokens,
    }
    generated_text = generation_model.predict(prompt=prompt, **parameters).text
    return generated_text
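To build some intuition for how top_k and top_p narrow the choice of the next token, here is a toy sketch in plain Python. It is only an illustration under stated assumptions: the hypothetical `filter_candidates` applies top-k first and then top-p to a made-up probability table, and it does not reproduce the service's internal sampling code.

```python
def filter_candidates(probs, top_k, top_p):
    """Return the tokens that remain eligible after top-k, then top-p filtering."""
    # Rank tokens from most to least probable and keep the first top_k.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token, probability in ranked:
        kept.append(token)
        cumulative += probability
        # Stop once the kept tokens cover at least top_p of the probability mass.
        if cumulative >= top_p:
            break
    return kept


# Made-up next-token distribution, for illustration only.
toy_probs = {"beach": 0.5, "pier": 0.3, "hotel": 0.15, "car": 0.05}
print(filter_candidates(toy_probs, top_k=40, top_p=0.4))  # only the top token survives
print(filter_candidates(toy_probs, top_k=3, top_p=0.7))   # the two most likely tokens
```

Temperature then decides how evenly the model samples among the tokens that survive; a low top_p such as 0.4 already restricts the choice considerably whenever one token dominates.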
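The output of the PaLM API is a generated post in which <Photo id> placeholders are interleaved with the descriptions. The LLM decides where in the text to place the photos; an example follows. I then use regular expressions to find these placeholders and replace them with the actual photos.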
I was happy finally arriving to my destination, Los Angeles.
While I went into US Customs my heart was filled of anxiety
to leave the airport and get to visit the city.
<Photo 0>
I rented a car and drove to my hotel in Eagle Rock.
The hotel was nice and comfortable.
<Photo 14>
The next morning I went to Santa Monica Pier.
I had lunch at Bubba Gump Shrimp Co. and then walked around the pier.
<Photo 4>, <Photo 5>, <Photo 6>, <Photo 7>
In the afternoon I went to the Cheesecake Factory.
I had a delicious meal and then went shopping at the mall.
<Photo 8>, <Photo 9>, <Photo 10>
In the evening I went to Farmers Market.
I bought some caramel apples and ice cream.
<Photo 11>, <Photo 12>, <Photo 13>
It was a long day but I had a lot of fun.
I can't wait to explore more of Los Angeles tomorrow.
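The placeholder replacement can be sketched with Python's `re` module as follows. This is a minimal sketch, assuming the photos are referenced by file path and rendered as HTML `<img>` tags; `embed_photos` and `photo_paths` are hypothetical names, and the notebook may render the photos differently.

```python
import re

# Matches placeholders such as <Photo 0> or <Photo 12> and captures the number.
PLACEHOLDER = re.compile(r"<Photo (\d+)>")


def embed_photos(post, photo_paths):
    """Replace each <Photo N> placeholder with an <img> tag for photo N."""
    def to_markup(match):
        index = int(match.group(1))
        # Leave the placeholder untouched if the model invented an id.
        if index >= len(photo_paths):
            return match.group(0)
        return f'<img src="{photo_paths[index]}">'

    return PLACEHOLDER.sub(to_markup, post)


# Usage:
sample = "I arrived in Los Angeles.\n<Photo 0>\nThen the pier.\n<Photo 1>, <Photo 7>"
print(embed_photos(sample, ["lax.jpg", "pier.jpg"]))
```

Passing a function to `sub` keeps the lookup logic in one place and makes it easy to skip ids that do not correspond to any photo.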
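Below you will see examples of posts that "Photo StoryTelling" generated for my two trips. LLM output is non-deterministic and varies in quality and in fidelity to the facts described in the prompt. To get different responses, you may want to try different settings for temperature, top_p, and top_k, or simply send a new request to the TextGenerationModel.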
album_context = """I flew to Los Angeles for a short trip,
and the album contains the photos
from the day I arrived there.
The man in those photos is myself.
"""
blog_prompt = generate_prompt(album_context, photos_info_concat)
generated_post = generate_text(prompt=blog_prompt)
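I was happy to finally arrive at my destination, Los Angeles.
As I went through US Customs, I was anxious to leave the airport and explore the city.
I took a taxi to my hotel in Eagle Rock. The room was small but comfortable.
After dropping off my luggage, I went out to explore the city. I first drove to the Santa Monica Pier.
I walked along the pier, taking in the sights and sounds of the ocean. I even rode the Ferris wheel!
After spending some time at the pier, I went to The Cheesecake Factory for lunch. The food was delicious and the service was great.
In the afternoon, I went to the Farmers Market in La Brea. I bought some fresh produce and flowers. I also had some ice cream.
I went back to my hotel room to relax and wrap up the day.
All the exploring left me tired, but I was also excited to see what the next day would bring.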
album_context_sf = """Me and my wife travelled to San Francisco.
We spent a single day there. We rented a car in SF and visited
many places during that day.
The man in the pictures is myself and the woman is my wife.
"""
blog_prompt_sf = generate_prompt(album_context_sf, photos_info_concat_sf)
generated_post_sf = generate_text(prompt=blog_prompt_sf)
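My wife and I went to San Francisco. We spent a day there. We rented a car in San Francisco and visited many places that day.
We started our day at San Francisco International Airport. We were excited to finally be in San Francisco and ready to explore the city.
We drove to Russian Hill and found a parking spot. We walked around the neighborhood, taking in the sights and sounds.
We walked to the Golden Gate Bridge and took some photos. It was a sunny day and the bridge was stunning.
We sat on a bench and watched the boats go by. It was so peaceful and relaxing.
We walked back to the car and drove to the Marina District. We strolled along the waterfront, enjoying the view.
We stopped at a restaurant in the Marina District for dinner. The food was delicious and the atmosphere was lively.
After dinner we walked around the Marina District and took some more photos.
We drove to Fisherman's Wharf and walked around the shops and restaurants. We had some delicious seafood for dinner.
After dinner, we walked around Fisherman's Wharf and took some more photos. We really enjoyed our time in this neighborhood.
We drove to Fort Mason and walked around Ghirardelli Square. We had some delicious chocolate and ice cream.
After dinner we walked around Fort Mason and took some more photos.
We had a great time in San Francisco and can't wait to come back soon.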
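Because the generated posts vary in how faithfully they follow the prompt, a quick automated check can help decide whether to keep a response or request a new one. The sketch below is one possible check, not part of the notebook: the hypothetical `check_photo_references` inspects the raw model output (before placeholders are replaced) and reports photos that were skipped, ids that do not exist, and references that appear out of chronological order.

```python
import re


def check_photo_references(post, photo_count):
    """Report missing, unknown, and out-of-order <Photo N> references in a post."""
    referenced = [int(n) for n in re.findall(r"<Photo (\d+)>", post)]
    # Photos the prompt supplied but the post never mentions.
    missing = sorted(set(range(photo_count)) - set(referenced))
    # Ids the model produced that do not exist in the album.
    unknown = sorted({n for n in referenced if n >= photo_count})
    # The album is sorted by time, so references should be non-decreasing.
    out_of_order = referenced != sorted(referenced)
    return {"missing": missing, "unknown": unknown, "out_of_order": out_of_order}


# Usage on a short made-up post for a 4-photo album:
report = check_photo_references("Day one <Photo 0> ... <Photo 2> ... <Photo 1>", 4)
print(report)
```

A response with an empty `missing` list and `out_of_order` set to False is a reasonable candidate to keep; otherwise, sending a new request with the same prompt is cheap.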
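If you have read this far, you can appreciate how powerful it is to combine data extraction (for example, EXIF metadata from images), data enrichment (for example, using Google Maps APIs to resolve locations from geographic coordinates), prompt engineering (such as few-shot prompting), and Generative AI (such as Vertex Imagen and the PaLM API). In this case, these techniques together produced interesting blog posts describing photo albums.
I hope you enjoy this project and feel like trying it, perhaps with your own photos, to see what kind of blog post it generates about your own special moments!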