
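At the starting point of this diagram, the API server acts as an intermediary. It receives a GET request, processes it, and determines the appropriate response based on the request's parameters.

A GET request is a query from a client (such as a website or an application) asking the API server for specific data. After the request, the diagram shows the server's response. First, a response code is issued, for example 200 for success or 404 for not found. Then the response data is returned, containing the information the client asked for.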
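The response codes mentioned above can be looked up with Python's standard library. The following is a minimal sketch rather than part of the original walkthrough; the helper name `describe_status` is hypothetical, and it treats only the 2xx range as success.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short, human-readable description of an HTTP status code."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # The code is not one of the statuses known to the standard library.
        return f"{code} (unknown status code)"
    outcome = "success" if 200 <= code < 300 else "not successful"
    return f"{code} {status.phrase}: {outcome}"


print(describe_status(200))
print(describe_status(404))
```

In a real client the same check would be applied to the `status_code` attribute of the response object before reading the response data.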

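From this we can see that the main difference between an API and web scraping lies in how they access data:

An API is the official channel for accessing data. It is like holding a VIP pass to a concert, where you are handed certain information directly.
Web scraping, on the other hand, is like sitting in the audience and writing down the lyrics of the songs being played. It is a way to extract data from a website without using an official API.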

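Let's return to the case mentioned at the beginning.

City information can be obtained in several ways. One is to download CSV files from websites such as official statistics portals. Keep in mind, though, that city information can change frequently, and there is no guarantee of how often those sites are updated.

Another approach is to use data from an online encyclopedia. A large number of users update this information regularly, so we only need to focus on selecting the right data.

Next, we use web scraping with BeautifulSoup as our example. The goal? To extract key details such as name, latitude, longitude, and population for two vibrant cities: AAA and XXX.

The author uses the Jupyter Notebook development environment here, which works very well for interactive programming and data visualization. Other tools such as Atom, Visual Studio Code, or IntelliJ IDEA have their own strengths too.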

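Step-by-Step Python Guide: Scraping Data in Practice

First, let's look at the code used to retrieve the data for AAA and XXX. This section introduces the Python libraries that form the backbone of the project.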

import requests

Our first tool is the requests library. It is our key to the internet: it lets us send HTTP requests to websites.

from bs4 import BeautifulSoup

Next, we bring in BeautifulSoup from the bs4 package. Once we have the target web page, BeautifulSoup parses its HTML content.

import pandas as pd

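Then comes pandas, an indispensable library in data science. It lets us turn the scraped data into a readable table that is well suited to analysis and visualization.

Another commonly used Python module is re, the standard library module for working with regular expressions.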

import re

headers = {'Accept-Language': 'en-US,en;q=0.8'}

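The first step is to prepare the Python environment to receive data from the web. We use the requests library for this, and we set the "Accept-Language" header to English so that the server knows we prefer the English version of the page.

Next, set the URL for the first city, AAA. This URL is our gateway to a wealth of information: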

url_aaa = "https://en.wikipedia.org/wiki/aaa"

aaa = requests.get(url_aaa, headers=headers)

After sending the request, it is essential to check whether it succeeded. A status code of 200 means the request was successful.

aaa.status_code # Should return 200

Now use BeautifulSoup to parse AAA's web page, converting the HTML content into a format we can work with.

soup_aaa = BeautifulSoup(aaa.content, "html.parser")

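Extracting specific pieces of data gives us the results we want:

The city name and country identify the subject of our study
The latitude and longitude give us the geographic coordinates
The population shows the size of the city

Here is how these details are retrieved: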

A_city = soup_aaa.select(".mw-page-title-main")[0].get_text()

A_country = soup_aaa.select('a[href="/wiki/CCC"]')[0].get_text()

A_latitude = soup_aaa.select(".latitude")[0].get_text()

A_longitude = soup_aaa.select(".longitude")[0].get_text()

A_population = soup_aaa.select('td.infobox-data')[10].get_text()
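Indexing with `[0]` or `[10]` raises an IndexError as soon as a selector matches fewer elements than expected, which can happen whenever the page layout changes. As a hedged sketch (not the author's code), a small helper can return None instead. The name `first_text` and the HTML fragment below are made up for illustration; the fragment only mimics the class names used above.

```python
from bs4 import BeautifulSoup

# Made-up HTML fragment that only mimics the class names used in the article.
SAMPLE_HTML = """
<h1><span class="mw-page-title-main">Exampleville</span></h1>
<span class="latitude">52°31′12″N</span>
<span class="longitude">13°24′18″E</span>
"""


def first_text(soup, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    element = soup.select_one(selector)
    return element.get_text(strip=True) if element is not None else None


sample_soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
city = first_text(sample_soup, ".mw-page-title-main")
latitude = first_text(sample_soup, ".latitude")
missing = first_text(sample_soup, "td.infobox-data")  # no such element here

print(city, latitude, missing)
```

Returning None keeps the scraping step running and leaves the decision about missing values to the later cleaning step.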

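After successfully scraping AAA's data, we turn our attention to XXX and use the same technique to extract its city name, population, latitude, and longitude.

As before, use BeautifulSoup to parse XXX's encyclopedia page, collect the necessary data, and create a DataFrame.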

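# A_* values come from the AAA steps above; X_* values are assumed to come

# from repeating the same steps on the XXX page

data = {

    "City": [A_city, X_city],

    "Population": [A_population, X_population],

    "Latitude": [A_latitude, X_latitude],

    "Longitude": [A_longitude, X_longitude],

    "Country": [A_country, X_country]

}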



df = pd.DataFrame(data)

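Next, we fine-tune the DataFrame for readability and accuracy, so that the data is clean and easy to understand.

# Remove thousands separators first; otherwise values like "3,755,251" are coerced to NaN

df['Population'] = pd.to_numeric(df['Population'].astype(str).str.replace(',', '', regex=False), errors='coerce')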

df['Latitude'] = pd.to_numeric(df['Latitude'], errors='coerce')

df['Longitude'] = pd.to_numeric(df['Longitude'], errors='coerce')

df['City'] = df['City'].astype(str)

# Display the first rows of the DataFrame

print(df.head())
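One caveat about the numeric conversion: `pd.to_numeric` with `errors='coerce'` turns every string it cannot parse into NaN. If the scraped coordinates are in degree-minute-second notation such as 52°31′12″N, they need to be converted to decimal degrees first. The helper below is a minimal sketch under that assumption; the name `dms_to_decimal` is hypothetical, and it expects the prime characters ′ and ″ as separators.

```python
import re

# Degrees are required; minutes and seconds are optional; hemisphere is N, S, E or W.
DMS_PATTERN = re.compile(r"\s*(\d+)°(?:(\d+)′)?(?:(\d+(?:\.\d+)?)″)?\s*([NSEW])\s*")


def dms_to_decimal(text):
    """Convert a coordinate such as 52°31′12″N to signed decimal degrees."""
    match = DMS_PATTERN.fullmatch(text)
    if match is None:
        return None  # not in the expected notation
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes or 0) / 60 + float(seconds or 0) / 3600
    # Southern and western hemispheres are negative by convention.
    return -value if hemisphere in "SW" else value


print(dms_to_decimal("52°31′12″N"))
print(dms_to_decimal("23°33′S"))
```

Applied to the table, this could look like `df['Latitude'] = df['Latitude'].map(dms_to_decimal)`, after which the column is already numeric.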

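If you want your code to be more convenient and more accurate, and you are interested in refining the approach, here is a Python version that wraps the steps in a function. This not only simplifies the process but also makes the code easier to read and reuse.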

def scrape_city_data(url):

    response = requests.get(url)



    if response.status_code == 200:

        soup = BeautifulSoup(response.content, "html.parser")

        city = soup.title.get_text().split(' - ')[0]

        country = soup.select('td.infobox-data a')[0].get_text()

        latitude = soup.select('span.latitude')[0].get_text()

        longitude = soup.select('span.longitude')[0].get_text()



        # Find the population data using provided code

        population_element = soup.select_one('th.infobox-header:-soup-contains("Population")')

        if population_element:

            population = population_element.parent.find_next_sibling().find(string=re.compile(r'\d+'))

            if population:

                # Drop thousands separators such as "3,755,251" before converting

                population = int(re.sub(r'\D', '', population))

        else:

            population = None



        data = {

            'City': [city],

            'Country': [country],

            'Latitude': [latitude],

            'Longitude': [longitude],

            'Population': [population],

        }



        city_df = pd.DataFrame(data)

        return city_df



    else:

        print("Error:", response.status_code)

        return None



# List of German cities (you can add more cities here)

german_cities = ['Berlin', 'Frankfurt']



# Create an empty DataFrame with specified columns

german_cities_df = pd.DataFrame(columns=['City', 'Country', 'Latitude', 'Longitude', 'Population'])



# Iterate and scrape data for German cities

for city_name in german_cities:

    wiki_link = f"https://en.wikipedia.org/wiki/{city_name}"

    city_data = scrape_city_data(wiki_link)



    # Append the data to the table

    if city_data is not None:

        german_cities_df = pd.concat([german_cities_df, city_data], ignore_index=True)



# Display the DataFrame

print(german_cities_df)
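The loop above builds each URL by dropping the city name straight into an f-string, which works for single-word names like Berlin. For names that contain spaces or non-ASCII characters, a small helper can prepare the title first. This is a sketch, not part of the original code; `build_wiki_url` is a hypothetical name, and it relies on the Wikipedia convention of writing spaces in article titles as underscores.

```python
from urllib.parse import quote

WIKI_BASE = "https://en.wikipedia.org/wiki/"


def build_wiki_url(city_name):
    """Build an article URL: spaces become underscores, other characters are percent-encoded."""
    title = city_name.strip().replace(" ", "_")
    # Keep underscores and parentheses readable; encode everything else that needs it.
    return WIKI_BASE + quote(title, safe="_()")


for name in ["Berlin", "New York City", "Köln"]:
    print(build_wiki_url(name))
```

Inside the loop, `wiki_link = build_wiki_url(city_name)` would replace the f-string.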

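Getting Data from a Professional Weather Forecast API

With the locations in hand, let's look at another factor that affects bike sharing: the weather. For this part we fetch the data by calling a weather forecast API.

Below is the Python function we prepared. It requests the forecast for one city and returns the first forecast entry as a one-row DataFrame, or an empty DataFrame if the request fails.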

import requests

import pandas as pd

from datetime import datetime

from keys import weather_key  # keys module that holds the API key; also used below the function



def fetch_weather_data(API_key, city):

    url = f"http://api.openweathermap.org/data/2.5/forecast?q={city}&appid={API_key}&units=metric"

    response = requests.get(url)



    if response.status_code == 200:

        weather_json = response.json()



        if "list" in weather_json:

            temperature = weather_json["list"][0]["main"]["temp"]

            description = weather_json["list"][0]['weather'][0]['description']

            feels_like = weather_json["list"][0]["main"].get("feels_like")

            wind_speed = weather_json["list"][0]["wind"].get("speed")



            return pd.DataFrame({

                "city": [city],

                "forecast_time": [datetime.now()],

                "outlook": [description],

                "temperature": [temperature],

                "feels_like": [feels_like],

                "wind_speed": [wind_speed]

            })

        else:

            print("Unexpected response format: 'list' key not found.")

    else:

        print(f"Failed to fetch data for {city}. Status Code: {response.status_code}")



    return pd.DataFrame()



cities = ["Berlin", "Frankfurt"]

API_key = weather_key  # Replace with your actual API key

weather_df = pd.DataFrame()



for city in cities:

    city_weather_df = fetch_weather_data(API_key, city)

    if not city_weather_df.empty:

        # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement

        weather_df = pd.concat([weather_df, city_weather_df], ignore_index=True)
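Because the function above mixes the HTTP call with the JSON handling, it is hard to try out without a valid API key. One option, sketched here under stated assumptions, is to move the parsing into its own function that works on a plain dictionary. The name `parse_first_forecast` is hypothetical, the sample payload is made up and only imitates the keys read by the code above, and the `dt_txt` field is assumed to carry the forecast timestamp.

```python
def parse_first_forecast(weather_json, city):
    """Pull a few fields out of the first forecast entry, or return None if there is none."""
    entries = weather_json.get("list") or []
    if not entries:
        return None
    first = entries[0]
    main = first.get("main", {})
    weather = first.get("weather") or [{}]
    return {
        "city": city,
        "forecast_time": first.get("dt_txt"),  # assumed timestamp field
        "outlook": weather[0].get("description"),
        "temperature": main.get("temp"),
        "feels_like": main.get("feels_like"),
        "wind_speed": first.get("wind", {}).get("speed"),
    }


# Made-up payload for illustration only; not real forecast data.
SAMPLE_RESPONSE = {
    "list": [
        {
            "dt_txt": "2024-01-01 12:00:00",
            "main": {"temp": 3.5, "feels_like": 1.0},
            "weather": [{"description": "light rain"}],
            "wind": {"speed": 4.2},
        }
    ]
}

row = parse_first_forecast(SAMPLE_RESPONSE, "Berlin")
print(row)
```

The resulting dictionaries can be collected in a list and passed to `pd.DataFrame` once at the end, which avoids growing a DataFrame inside the loop.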