Python KingbaseES Database Operations Guide: SQL Execution, Bulk Inserts, COPY, and Dynamic SQL with ksycopg2

This article focuses on the most common scenarios for using Python with ksycopg2 to interact with KingbaseES. It addresses common pain points such as inefficient SQL execution, poor bulk-write performance, and unsafe dynamic SQL. The core topics include querying, bulk inserts, COPY-based import/export, and exception handling. Keywords: KingbaseES, ksycopg2, bulk operations.

The technical specification snapshot highlights the core environment

Parameter | Description
Language | Python
Database | KingbaseES
Driver Protocol | DB-API 2.0-style interface
Example Port | 54321
Core Dependencies | ksycopg2, ksycopg2.extras, ksycopg2.sql
Article Focus | SQL execution, bulk writes, COPY, dynamic SQL, exception handling

This article fully covers the execution-layer capabilities of Python for KingbaseES

Most introductory guides answer the question, “How do I connect to the database?” But what actually shapes day-to-day development efficiency is the execution layer: how to query, how to write, how to import data in bulk, and how to roll back when something goes wrong.

This article uses ksycopg2 as the foundation, distills the nine most practical capabilities, and provides selection guidance suitable for production environments.

Basic queries are the starting point for all data access logic

import ksycopg2

# Example connection parameters; adjust for your environment
conn = ksycopg2.connect(
    database="TEST",
    user="SYSTEM",
    password="123456",
    host="127.0.0.1",
    port="54321"
)
cur = conn.cursor()

# Use a parameterized query to avoid string-concatenated SQL
cur.execute("SELECT id, name FROM test_user WHERE age > %s", (18,))
rows = cur.fetchall()  # Fetch the full result at once only for small result sets

for row in rows:
    print(f"id: {row[0]}, name: {row[1]}")

cur.close()
conn.close()

This code demonstrates the standard lifecycle: connect, create a cursor, execute a query, read results, and close resources.

Result set retrieval methods must match the data volume

fetchone() works well for streaming processing, fetchmany() balances memory usage and throughput, and fetchall() is suitable only for small result sets. Choose the wrong method and you may get poor performance—or exhaust memory entirely.

Batched fetching is usually the safest default

cur.execute("SELECT id, name FROM test_user")

while True:
    rows = cur.fetchmany(100)  # Fetch 100 rows per batch to control memory usage
    if not rows:
        break
    for row in rows:
        print(row)

The main value of this pattern is that it turns a large query into a controlled batch-processing workflow.

If you need column names, type codes, or lengths, use cursor.description. This is especially useful when building generic export tools or analyzing metadata.
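As a sketch of how cursor.description can drive a generic export tool, the helper below turns a DB-API 2.0 description plus result rows into dicts keyed by column name. The function name rows_to_dicts and the sample description tuples are illustrative, not real ksycopg2 output:

```python
def rows_to_dicts(description, rows):
    """Map DB-API 2.0 result rows to dicts keyed by column name.

    description: cursor.description, a sequence of 7-item column tuples
    whose first element is the column name (per DB-API 2.0).
    """
    column_names = [col[0] for col in description]
    return [dict(zip(column_names, row)) for row in rows]

# Illustrative description tuples in DB-API 2.0 shape:
# (name, type_code, display_size, internal_size, precision, scale, null_ok)
sample_description = [
    ("id", 23, None, 4, None, None, None),
    ("name", 1043, None, -1, None, None, None),
]
sample_rows = [(1, "张三"), (2, "李四")]

print(rows_to_dicts(sample_description, sample_rows))
# In real use: rows_to_dicts(cur.description, cur.fetchall())
```

In real code the same helper works unchanged for any query, which is exactly why cursor.description is the foundation of generic export tooling.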

Non-query statements require an explicit transaction commit

For INSERT, UPDATE, and DELETE, execute() only sends the SQL statement. The actual persistence depends on commit(). Forgetting to commit is one of the most common hidden errors in database development.

cur = conn.cursor()

# INSERT, UPDATE, and DELETE all use execute
cur.execute("INSERT INTO test_user (id, name) VALUES (%s, %s)", (1, "张三"))
cur.execute("UPDATE test_user SET name = %s WHERE id = %s", ("李四", 1))
cur.execute("DELETE FROM test_user WHERE id = %s", (1,))

conn.commit()  # Explicitly commit the transaction
cur.close()

This code shows that the critical step in DML operations is not execution itself, but transaction commit.

Parameter binding is the first rule for preventing SQL injection

Do not build SQL manually with string concatenation. ksycopg2 uses %s placeholders for parameters and automatically handles escaping and type adaptation.

# Correct: pass parameters separately
cur.execute(
    "SELECT * FROM test_user WHERE age > %s AND name LIKE %s",
    (18, "%张%")
)

# Named parameters improve readability
cur.execute(
    "SELECT * FROM test_user WHERE age > %(min_age)s AND name LIKE %(kw)s",
    {"min_age": 18, "kw": "%张%"}
)

This code demonstrates two safe approaches: positional parameters and named parameters.

The core of bulk write optimization is reducing network round trips

executemany() is often mistaken for a high-performance interface, but in many cases it still executes the statement multiple times. For medium and large bulk workloads, execute_batch() or execute_values() is usually a better choice.

execute_values is often the best default for bulk inserts

from ksycopg2 import extras

data = [(1, "张三"), (2, "李四"), (3, "王五")]

extras.execute_values(
    cur,
    "INSERT INTO test_user (id, name) VALUES %s",
    data,
    page_size=100  # Control how many rows are assembled per batch
)
conn.commit()

This code significantly reduces round trips through a single multi-value INSERT, which is often the fastest option for bulk inserts.

As a rule of thumb, use executemany() for small data sets, execute_batch() for medium workloads, and prefer execute_values() for large-scale ingestion.
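The round-trip arithmetic behind this rule of thumb can be illustrated without a database: execute_batch() and execute_values() assemble many rows per network round trip, so the number of round trips drops from len(data) to roughly len(data) / page_size. A minimal pure-Python sketch of that paging step (the chunked helper is mine, not part of ksycopg2):

```python
def chunked(data, page_size):
    """Split a row list into consecutive pages of at most page_size rows,
    mirroring how execute_batch/execute_values send one round trip per
    page instead of one per row."""
    return [data[i:i + page_size] for i in range(0, len(data), page_size)]

data = [(i, f"user_{i}") for i in range(250)]
pages = chunked(data, 100)

print(len(data))   # 250 statements with row-by-row execution
print(len(pages))  # 3 round trips with a page_size of 100
```

This is why tuning page_size matters: too small and you pay for extra round trips, too large and each request grows unwieldy.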

Stored procedures and functions are useful for encapsulating database-side business logic

You can call functions through callproc(), while stored procedures are typically executed with the CALL statement. Both approaches are useful when you want to move rules, validation, or batch-processing logic into the database.

# Call a stored procedure
cur.execute("CALL update_user_name(%s, %s)", (1, "新名字"))
conn.commit()

This code shows the shortest path for invoking a database procedure from Python.

COPY is the performance tool for massive data import and export

Once your data volume grows from “thousands of rows” to “hundreds of thousands,” row-by-row INSERT operations are no longer economical. At that point, prioritize copy_from, copy_to, or copy_expert.

with open("data.csv", "r") as f:
    cur.copy_from(
        file=f,
        table="test_user",
        columns=("id", "name"),
        sep=","  # Specify a comma delimiter for CSV input
    )
conn.commit()

This code shows how to import data efficiently from a file.

For export scenarios, use copy_to(). If you need headers, custom delimiters, or a full COPY statement, copy_expert() gives you more flexibility.

Large object handling requires attention to type mapping and storage boundaries

KingbaseES supports objects such as BLOB, CLOB, and BYTEA. On the Python side, the key detail is that bytea often comes back as a memoryview in Python 3, so you need to convert it before reading.

cur.execute("SELECT id, ba FROM test_lob")
rows = cur.fetchall()

for row in rows:
    cell = row[1]
    if isinstance(cell, memoryview):
        print(cell.tobytes().decode("UTF8"))  # Convert memoryview to bytes before decoding

This code solves one of the most common type compatibility issues when reading binary fields.

For very large files, storing the full content in the database over the long term is usually not the best design. In practice, a better pattern is to store file content in the file system and keep only the path or metadata in the database.

Safe dynamic SQL composition must distinguish between values and identifiers

Parameter placeholders can only safely handle values. They cannot safely handle identifiers such as table names or column names. For that, use the ksycopg2.sql module.

Use Identifier for safe identifier composition

from ksycopg2 import sql

table_name = "test_user"
id_column = "user_id"

query = sql.SQL("SELECT {col} FROM {table} WHERE {col} = %s").format(
    col=sql.Identifier(id_column),
    table=sql.Identifier(table_name)
)

cur.execute(query, (100,))

The key point in this example is that dynamic table names and column names remain safe, properly escaped, and maintainable.

If you need a dynamic IN query, first generate placeholders based on the list length, then pass the value array as parameters. Do not embed user input directly into SQL text.
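That placeholder-generation step can be sketched as follows. The SQL text is assembled only from fixed text and %s markers, while the user-supplied values stay in the parameter tuple; the table and column names here are reused from earlier examples, and no database call is made in this sketch:

```python
ids = [3, 7, 42]  # user-supplied values; never pasted into the SQL text

# Build one %s per element, then bind the values separately.
placeholders = ", ".join(["%s"] * len(ids))
query = f"SELECT id, name FROM test_user WHERE id IN ({placeholders})"

print(query)  # SELECT id, name FROM test_user WHERE id IN (%s, %s, %s)
# In real use: cur.execute(query, tuple(ids))
```

Because only the count of placeholders depends on the input, the values themselves still go through the driver's normal escaping and type adaptation.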

Complete exception handling must be designed together with transaction control

Database error handling cannot stop at printing the exception, because the transaction may remain in a dirty state. The correct approach is to catch exceptions by type and roll back on failure.

import ksycopg2
from ksycopg2 import IntegrityError, OperationalError

try:
    cur.execute("INSERT INTO test_user (id, name) VALUES (%s, %s)", (1, "张三"))
    conn.commit()  # Commit on success
except IntegrityError as e:
    print(f"Constraint violation: {e}")
    conn.rollback()  # Roll back after a constraint failure
except OperationalError as e:
    print(f"Connection or execution error: {e}")
    conn.rollback()

This code provides a minimal safety template for transactional database access.

Practical guidance can be reduced to one execution priority order

Prioritize parameterized queries for reads. Prefer fetchmany() for large result sets. Prefer execute_values() for bulk inserts. Prefer COPY for large file-based exchange. Prefer sql.Identifier for dynamic identifiers. Pair every write operation with either a commit or a rollback.

If you design around this order, you will already cover the vast majority of KingbaseES development scenarios.

FAQ provides structured answers to common implementation questions

1. Why is executemany() not the recommended default for bulk inserts?

Because it usually still sends SQL multiple times, which creates high request and round-trip overhead. Once the data volume increases, its performance often falls behind execute_batch() and execute_values().

2. Why did my INSERT run successfully but no data appears in the database?

The most common reason is that you did not call conn.commit(). In non-autocommit mode, the SQL statement has executed, but the transaction has not been committed, so the data is not actually persisted.

3. Why can’t I pass a dynamic table name directly with %s?

%s is only for value parameters, not for identifiers such as table names or column names. Dynamic identifiers should be built with ksycopg2.sql.Identifier to ensure both syntactic correctness and safety.

AI Readability Summary

This article systematically explains the core capabilities of using Python with ksycopg2 to operate on KingbaseES. It covers query execution, parameter binding, SQL injection prevention, bulk writes, stored procedures, COPY-based import/export, large object handling, dynamic SQL, and transactional rollback, making it a practical quick-reference guide for real-world KingbaseES development.