Python - Syntax & Runtime

Python Syntax

Mutable Default Arguments

Keyword: Function Definition Time vs. Runtime

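Python functions are objects.
Default arguments are created once, when the function is defined (when the def statement runs, for example at import time), not on every call.

If a default argument is a mutable object (list/dict/set), every call that omits that argument shares the same object, so data leaks from one call into the next.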

# ❌ Wrong
def append_log(msg, logs=[]):  # logs is created at definition time and lives as long as the function object
    logs.append(msg)
    return logs

print(append_log("Error 1"))  # Output: ['Error 1']
print(append_log("Error 2"))  # Output: ['Error 1', 'Error 2'] <- the previous entry is still there
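
One way to see where the shared list lives is to inspect the function object itself. The following is a minimal sketch that relies on the standard `__defaults__` attribute of function objects; `append_item` and `bucket` are hypothetical names chosen so the example does not interfere with `append_log` above.

```python
def append_item(item, bucket=[]):  # hypothetical example function
    bucket.append(item)
    return bucket

first = append_item("a")
second = append_item("b")

# Both calls returned the very same list, and that list is the one
# stored on the function object when `def` was executed.
same_object = first is second is append_item.__defaults__[0]
print(same_object)               # True
print(append_item.__defaults__)  # (['a', 'b'],)
```

Because the default is stored on the function object, it survives for as long as the function does, which is why the entries accumulate.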

The same applies to methods defined in a class.

# ❌ Wrong
class Logger:
    def append(self, msg, logs=[]):
        logs.append(msg)
        print(id(logs), logs)  # the id is identical on every call

l1 = Logger()
l2 = Logger()

l1.append("Error 1")  # prints <id> ['Error 1']
l2.append("Error 2")  # prints the same <id> and ['Error 1', 'Error 2'] <- the previous entry is still there

# ✅ Correct
def append_log_safe(msg, logs=None):
    if logs is None:
        logs = []  # a new list is created at runtime, on each call
    logs.append(msg)
    return logs

print(append_log_safe("Error 1"))  # Output: ['Error 1']
print(append_log_safe("Error 2"))  # Output: ['Error 2']

Pitfall: mutable attributes in a class

# ❌ Wrong
class User:
    logs = []
    def foo(self, msg):
        self.logs.append(msg)
        return self.logs

A = User()
B = User()

print(A.foo("this is A")) # Output: ['this is A']
print(B.foo("this is B")) # Output: ['this is A', 'this is B']

print(A.logs) # Output: ['this is A', 'this is B']
print(B.logs) # Output: ['this is A', 'this is B']

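Here self.logs.append(msg) ends up being equivalent to User.logs.append(msg), because attribute lookup works like this:

Python first looks in A.__dict__ → no logs
Then it looks in User.__dict__ → finds logs
It gets the same list object and appends to it
id() confirms that it is the same object.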

A.logs is B.logs          # True
id(A.logs) == id(B.logs)  # True

Attributes that belong to each instance should be initialized in the constructor (__init__).

# ✅ Correct
class User:
    def __init__(self):
        self.logs = []  # ✅ instance variable

    def foo(self, msg):
        self.logs.append(msg)
        return self.logs

A = User()
B = User()

print(A.foo("this is A")) # Output: ['this is A']
print(B.foo("this is B")) # Output: ['this is B']

print(A.logs) # Output: ['this is A']
print(B.logs) # Output: ['this is B']

Generator vs Iterator

Keyword: Lazy Evaluation vs. Memory Efficiency

An Iterator is an object that implements __next__ (and __iter__).
A Generator is a special kind of Iterator written with the yield syntax.

# Plain list (eager): the whole list is built in memory at once
def get_squares_list(n):
    return [i ** 2 for i in range(n)]

# Generator (lazy): very low memory use, each value is computed only when requested
def get_squares_gen(n):
    for i in range(n):
        yield i ** 2

# Generator expression (like a list comprehension, but with parentheses)
gen = (i ** 2 for i in range(1000000))  # uses almost no memory
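
To make the difference concrete, the sketch below compares the size of the two containers with `sys.getsizeof` and steps through a generator by hand with `next()`. The variable names and the value of `n` are illustrative assumptions, and `sys.getsizeof` only measures the container object itself, not the integers it refers to.

```python
import sys

n = 100_000
squares_list = [i ** 2 for i in range(n)]  # every element exists now
squares_gen = (i ** 2 for i in range(n))   # nothing has been computed yet

# The list container grows with n; the generator object stays small.
list_is_bigger = sys.getsizeof(squares_list) > sys.getsizeof(squares_gen)
print(list_is_bigger)  # True

# A generator is an iterator: iter() returns the object itself,
# and each next() call computes exactly one more value.
print(iter(squares_gen) is squares_gen)  # True
first_three = [next(squares_gen) for _ in range(3)]
print(first_three)  # [0, 1, 4]
```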

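Info

You need to analyze a 10 GB log file but only have 8 GB of RAM. What do you do?
Use a generator to yield and process the file one line at a time; do not call readlines(), which loads the whole file into RAM.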
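
The line-by-line approach above could look roughly like this. It is only a sketch: `read_lines` and `count_errors` are hypothetical helper names, the "ERROR" marker is an assumed log format, and a small temporary file stands in for the 10 GB log.

```python
import os
import tempfile

def read_lines(path):
    """Yield one line at a time; the file object itself iterates lazily."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")

def count_errors(lines):
    """Consume any iterable of lines and count those containing 'ERROR'."""
    return sum(1 for line in lines if "ERROR" in line)

# Tiny stand-in for the large log file (hypothetical content).
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")
    log_path = tmp.name

error_count = count_errors(read_lines(log_path))
print(error_count)  # 2
os.remove(log_path)
```

At no point does more than one line need to be held in memory, so the same code works regardless of file size.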

Decorator

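Keyword: the wrapper function inside a decorator should be decorated with @functools.wraps
Without it, the decorated function's __name__ and docstring are replaced by those of the inner wrapper function, which breaks debugging and automatic documentation tools.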

A decorator is essentially a higher-order function: it takes a function and returns a new function.

import functools
import time

def timer(func):
    # ⚠️ Key point: always add this line, otherwise the original function's metadata (such as __name__) is lost
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print(f"{func.__name__} took {end - start:.4f}s")
        return result
    return wrapper

@timer
def heavy_process():
    """This is the original docstring."""
    time.sleep(0.5)

print(heavy_process.__name__)  # with wraps -> 'heavy_process'; without -> 'wrapper'
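
For a side-by-side view of what `functools.wraps` changes, here is a small sketch with two otherwise identical decorators. The names `plain`, `wrapped`, `job_a`, and `job_b` are hypothetical and exist only for this comparison.

```python
import functools

def plain(func):
    def wrapper(*args, **kwargs):   # no functools.wraps
        return func(*args, **kwargs)
    return wrapper

def wrapped(func):
    @functools.wraps(func)          # copies __name__, __doc__, and more
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain
def job_a():
    """Docstring of job_a."""

@wrapped
def job_b():
    """Docstring of job_b."""

print(job_a.__name__, job_a.__doc__)  # wrapper None
print(job_b.__name__, job_b.__doc__)  # job_b Docstring of job_b.
```

`functools.wraps` also sets `job_b.__wrapped__`, which gives tools a way back to the undecorated function.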

Context Manager (with)

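Keyword: Resource Management & Exception Safety

Any object that implements __enter__ and __exit__ can be used with the with statement.
Typical uses are opening files, managing DB transactions (commit on success, automatic rollback on failure), acquiring and releasing locks, and temporarily changing environment variables in tests.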

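import os  # imported at module level so the with block below can use it too
from contextlib import contextmanager

# Use the decorator to write a context manager quickly (shorter than writing a class)
@contextmanager
def temp_env_var(key, value):
    old_value = os.environ.get(key)
    os.environ[key] = value
    try:
        yield  # the body of the with block runs here
    finally:
        # Runs even if the block raises, so the environment is always restored
        if old_value is None:
            del os.environ[key]
        else:
            os.environ[key] = old_value

# Use case: temporarily override an environment variable in a test
with temp_env_var("DB_HOST", "localhost"):
    print("In context:", os.environ["DB_HOST"])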
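
For comparison, the same protocol can be written as a class with explicit `__enter__` and `__exit__` methods. The sketch below mimics the "commit on success, rollback on failure" pattern mentioned earlier; `Transaction` is a hypothetical stand-in that only records events and does not talk to any real database.

```python
class Transaction:
    """Hypothetical stand-in for a DB transaction; it only records events."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("begin")
        return self  # this value is bound to the name after `as`

    def __exit__(self, exc_type, exc, tb):
        # exc_type is None when the with block finished without an exception.
        self.events.append("commit" if exc_type is None else "rollback")
        return False  # do not swallow the exception

ok = Transaction()
with ok:
    pass
print(ok.events)  # ['begin', 'commit']

failed = Transaction()
try:
    with failed:
        raise ValueError("boom")
except ValueError:
    pass
print(failed.events)  # ['begin', 'rollback']
```

Returning False from `__exit__` lets the exception propagate to the caller after the cleanup has run, which is usually what you want for a rollback.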

Python Runtime

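First, a quick operating-system refresher:

Program: a file on disk that is not running yet; just a static file
Process: a program while it is running; one program can be started as many processes at the same time
Thread: the unit inside a process that the CPU actually executes. A process has at least 1 thread

Computer
 └── Operating system (Windows / macOS / Linux)
        └── Process (a running instance of a program)
                └── Thread

What does an "8-core" CPU mean when you are buying a computer?
❌ It can only run 8 processes
✅ It can execute 8 threads truly simultaneously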

Global Interpreter Lock (GIL)

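Keyword: GIL & Thread-safety

To stay thread-safe, the CPython interpreter lets only one thread execute Python bytecode at a time.
Even with 8 threads, only 1 of them is actually running Python code at any given instant.
The GIL only guards the execution of bytecode; it is released while a thread waits on blocking I/O.
Python 3.13+ adds experimental support for a free-threaded (no-GIL) build (PEP 703), but mainstream production environments still run with the GIL.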

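Info

If there is a GIL, is multi-threading in Python useless?
No, but it depends on the type of task.
For CPU-bound pure-Python work (math, file conversion, image processing) it gives no speedup.
For I/O-bound work (web scraping, DB reads, network requests) it is very useful, because the GIL is released while a thread waits on I/O, which lets other threads run.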
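
The I/O-bound case can be illustrated with a small timing sketch. It assumes `time.sleep` as a stand-in for a blocking network or DB call (it releases the GIL while waiting), and `fake_io` is a hypothetical function name.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id):
    time.sleep(0.2)  # blocking wait; the GIL is released while sleeping
    return task_id

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_io, range(4)))
elapsed = time.perf_counter() - start

print(results)  # [0, 1, 2, 3]
# Expect roughly 0.2 s here, versus about 0.8 s for a serial loop.
print(f"{elapsed:.2f}s")
```

Replacing the sleep with a tight pure-Python computation would remove most of the benefit, because the threads would then compete for the GIL instead of waiting.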

Threading vs. Event Loop vs. Asyncio

Keyword: Concurrency is not Parallelism.

Threading
- Many threads
- The OS decides when to switch between them (preemptive)
- Needs a lock/mutex to avoid race conditions
Single Thread + Event Loop
- Only one thread
- An event loop switches between tasks
Asyncio
- Only one thread
- The programmer decides where tasks switch: control is yielded only at an await (cooperative)
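
The lock requirement from the Threading list can be sketched as follows. `Counter` is a hypothetical class, and the thread and iteration counts are arbitrary; the point is that the read-modify-write in `increment` is protected so that preemptive switches cannot interleave it.

```python
import threading

class Counter:
    """Hypothetical shared counter protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one thread at a time runs this block
            self.value += 1

def add_many(counter, n):
    for _ in range(n):
        counter.increment()

counter = Counter()
threads = [
    threading.Thread(target=add_many, args=(counter, 10_000))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 40000
```

With asyncio the same counter would not need a lock as long as the update contains no await, since a task cannot be interrupted between awaits.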

Calling time.sleep() or running a very long for loop inside an async function blocks the entire event loop, so every request stalls (because there is only one thread).

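import asyncio

async def fetch_data(id):
    print(f"Start {id}")
    # ✅ Correct: hand control back to the event loop so it can run other tasks
    await asyncio.sleep(1)
    print(f"End {id}")
    return id

async def main():
    # Use gather to run several tasks concurrently (the most common pattern)
    results = await asyncio.gather(fetch_data(1), fetch_data(2), fetch_data(3))
    print(results)  # [1, 2, 3]

asyncio.run(main())  # start the event loop and run main() to completion

# ❌ Never put blocking code inside an async function
# time.sleep(1)  <- this stalls the whole event loop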
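
When a blocking call cannot be avoided (for example a library with no async API), one common option is to push it onto a worker thread. The sketch below uses `asyncio.to_thread`, which is available from Python 3.9; `blocking_work` is a hypothetical function and `time.sleep` stands in for the blocking call.

```python
import asyncio
import time

def blocking_work(task_id):
    time.sleep(0.2)  # stands in for a blocking library call
    return task_id

async def main():
    start = time.perf_counter()
    # Each call runs in a worker thread, so the event loop stays responsive.
    results = await asyncio.gather(
        asyncio.to_thread(blocking_work, 1),
        asyncio.to_thread(blocking_work, 2),
        asyncio.to_thread(blocking_work, 3),
    )
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(results)  # [1, 2, 3]
```

This helps for blocking I/O; a long CPU-bound loop would still be limited by the GIL and is usually better moved to a separate process.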
