The core value of KingbaseES lies in using standardized design to reduce the risk of performance collapse later in the lifecycle. It covers end-to-end practices for table design, indexing, partitioning, SQL, and PL/SQL, with a focus on solving slow queries, technical debt, and concurrency bottlenecks. Keywords: KingbaseES, SQL Optimization, Database Design.
Technical Specifications Snapshot
| Parameter | Details |
|---|---|
| Database | KingbaseES |
| Language | SQL, PL/SQL |
| Protocols / Paradigms | ANSI SQL, Transaction Processing, Partition Design |
| Core Dependencies | Sequence, BTree/GIN/GiST/BRIN Indexes, Partitioned Tables, Connection Pool |
This set of standards moves performance problems into the design phase
Most database failures do not start during peak traffic. They are introduced on day one, when teams create tables and write SQL. Inconsistent character sets, mixed data types, and arbitrary index stacking are all amplified into performance and operational incidents as data volume grows.
KingbaseES best practices emphasize three things: schema isolation, precise data types, and consistent object naming. Do not place all business tables under the public schema. Instead, split schemas by system or domain, and revoke public table creation privileges.
```sql
-- Create separate schemas for different business domains to isolate permissions and namespaces
CREATE SCHEMA order_system AUTHORIZATION app_admin;
CREATE SCHEMA user_system AUTHORIZATION app_admin;

REVOKE CREATE ON SCHEMA public FROM PUBLIC; -- Revoke public table creation privileges
```
This SQL establishes isolation boundaries between business domains and helps prevent object conflicts and uncontrolled permissions.
Table design must prioritize data types, primary keys, and extensibility
Every business table should have a primary key. In OLTP systems, keep a single table under roughly 80 columns whenever possible. Related fields must use exactly the same data type. Otherwise, JOIN operations may trigger implicit type conversion and invalidate index usage.
Field design should follow the principle of semantic matching: use DATE for dates, DECIMAL for monetary values, CHAR for fixed-length values, and VARCHAR for variable-length text. Do not degrade every field into a string just for convenience.
```sql
CREATE TABLE TB_HR_EMPLOYEE (
    emp_id      INTEGER PRIMARY KEY,
    emp_name    VARCHAR(50) NOT NULL,
    gender      CHAR(1) CHECK (gender IN ('M', 'F')), -- Use a constraint to restrict valid values
    birth_date  DATE NOT NULL,                        -- Dates must use a date type
    salary      DECIMAL(10,2) NOT NULL,
    dept_id     INTEGER NOT NULL,
    create_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```
This CREATE TABLE statement reflects a design approach centered on strong typing, early constraint enforcement, and maintainability.
Primary key, LOB, and temporary table strategies directly affect long-term maintenance cost
Prefer Sequence + Auto-Increment ID for primary keys. Avoid using mutable business fields such as phone numbers or national ID numbers as primary keys. Do not mix large LOB fields with high-frequency business columns. A better pattern is to store metadata in the main table and CLOB / BLOB content in an extension table.
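The metadata/LOB split described above can be sketched as follows. The table and column names here are illustrative, not from the original article; CLOB assumes KingbaseES's Oracle-compatible type (in PostgreSQL mode, TEXT would play the same role):

```sql
-- Main table: hot metadata that high-frequency queries actually touch
CREATE TABLE TB_DOC_META (
    doc_id     BIGINT PRIMARY KEY,
    doc_title  VARCHAR(200) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Extension table: cold LOB content, joined only when the body is needed
CREATE TABLE TB_DOC_CONTENT (
    doc_id  BIGINT PRIMARY KEY REFERENCES TB_DOC_META(doc_id),
    content CLOB
);
```

Keeping the LOB in a 1:1 extension table means routine scans of TB_DOC_META never pay the I/O cost of the large column.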
Choose temporary tables based on lifecycle: use ON COMMIT DELETE ROWS for transaction-level data and ON COMMIT PRESERVE ROWS for session-level data. For high-update tables, consider setting fillfactor=80 to reserve page space for HOT updates and reduce index maintenance.
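A minimal sketch of both choices, using PostgreSQL-style syntax that KingbaseES accepts; the table names are hypothetical:

```sql
-- Transaction-level scratch data: rows are cleared at COMMIT
CREATE TEMPORARY TABLE tmp_tx_calc (
    order_id BIGINT,
    subtotal DECIMAL(12,2)
) ON COMMIT DELETE ROWS;

-- Session-level data: rows survive until the session ends
CREATE TEMPORARY TABLE tmp_session_cache (
    user_id BIGINT,
    payload TEXT
) ON COMMIT PRESERVE ROWS;

-- High-update table: leave 20% free space per page for HOT updates
CREATE TABLE TB_ACCOUNT_BALANCE (
    account_id BIGINT PRIMARY KEY,
    balance    DECIMAL(14,2) NOT NULL
) WITH (fillfactor = 80);
```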
Index design should serve query patterns rather than maximize index count
More indexes do not automatically mean better performance. Every additional index adds maintenance cost to the write path. In practice, business tables should keep index counts under control and build indexes only around high-frequency filtering, sorting, and join columns.
BTree is the default choice for equality, range queries, and sorting. Hash is suitable only for exact matching on long fields. GIN works well for arrays, JSONB, and full-text search. GiST and SP-GiST fit spatial and range data. BRIN is well suited to large log tables with natural time ordering.
```sql
CREATE TABLE TB_SYS_LOG (
    id         BIGSERIAL PRIMARY KEY,
    log_time   TIMESTAMP NOT NULL,
    log_detail TEXT
);

CREATE INDEX IDX_SYS_LOG_TIME
    ON TB_SYS_LOG USING BRIN(log_time); -- Prefer BRIN for large time-series tables
```
This example shows how a large log table can use a low-maintenance index to support time-based filtering.
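For the GIN case mentioned above, a complementary sketch on a hypothetical table with a JSONB column (table and column names are illustrative):

```sql
CREATE TABLE TB_USER_PROFILE (
    user_id BIGINT PRIMARY KEY,
    attrs   JSONB
);

CREATE INDEX IDX_USER_PROFILE_ATTRS
    ON TB_USER_PROFILE USING GIN (attrs);

-- Containment queries like this one can use the GIN index:
-- SELECT user_id FROM TB_USER_PROFILE WHERE attrs @> '{"vip": true}';
```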
Composite indexes and partial indexes define the upper bound of high-frequency queries
Composite indexes follow the leftmost prefix rule. Place the most frequently filtered and most selective columns on the left. If a query skips the leading column, the index usually cannot be used effectively.
Partial indexes work well for hot subsets, such as indexing only active orders. Covering indexes can reduce table lookups through INCLUDE, which makes them effective for high-frequency OLTP queries.
```sql
CREATE INDEX IDX_ORDER_ACTIVE
    ON TB_SHOP_ORDER(create_time)
    WHERE status != 'CANCELLED'; -- Index only active orders
```
This index reduces index size and improves hit rates for hot queries by narrowing the indexed data set.
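The leftmost-prefix and covering-index points can be sketched against the same TB_SHOP_ORDER table; the customer_id and amount columns are assumed here for illustration:

```sql
-- Composite index: the most frequently filtered, most selective column first.
-- Queries filtering on customer_id alone, or customer_id + create_time, can use it;
-- a query filtering only on create_time generally cannot.
CREATE INDEX IDX_ORDER_CUST_TIME
    ON TB_SHOP_ORDER (customer_id, create_time);

-- Covering index: INCLUDE carries extra columns so the query
-- can be answered from the index without a table lookup
CREATE INDEX IDX_ORDER_CUST_COVER
    ON TB_SHOP_ORDER (customer_id)
    INCLUDE (status, create_time);
```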
Partitioned tables are the primary path for managing very large tables
When a single table reaches tens of millions of rows or approaches hundreds of GB, partitioning should be introduced early. Partitioning is not simply table splitting. Its core value is partition pruning, which reduces scan range and lowers the cost of archiving and cleanup.
For time-oriented workloads, prefer Range partitioning. For discrete values, prefer List partitioning. For evenly distributed hashed access, use Hash partitioning. Range partitioning should always include a MAXVALUE fallback partition, and List partitioning should include a DEFAULT partition.
```sql
CREATE TABLE TB_ORDER_LOG (
    order_id    BIGINT NOT NULL,
    order_date  DATE NOT NULL,
    customer_id INTEGER,
    amount      DECIMAL(12,2)
) PARTITION BY RANGE (order_date);

CREATE TABLE TB_ORDER_LOG_202401 PARTITION OF TB_ORDER_LOG
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
```
This example illustrates a monthly partitioning pattern that fits time-driven data such as orders, transaction records, and audit logs.
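The fallback partitions recommended above can be sketched in PostgreSQL-style declarative syntax (the list-partitioned table and its region values are hypothetical):

```sql
-- Fallback partition catches rows that fall outside every defined range
CREATE TABLE TB_ORDER_LOG_DEFAULT PARTITION OF TB_ORDER_LOG DEFAULT;

-- List partitioning for discrete values, with its own DEFAULT partition
CREATE TABLE TB_ORDER_BY_REGION (
    order_id BIGINT NOT NULL,
    region   VARCHAR(20) NOT NULL
) PARTITION BY LIST (region);

CREATE TABLE TB_ORDER_REGION_NORTH PARTITION OF TB_ORDER_BY_REGION
    FOR VALUES IN ('NORTH');
CREATE TABLE TB_ORDER_REGION_OTHER PARTITION OF TB_ORDER_BY_REGION DEFAULT;
```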
The value of partitioning is not just faster queries, but simpler operations
When archiving or deleting old data, DROP PARTITION is far less expensive than large-scale DELETE operations. That allows historical data management to move from row-level operations to object-level operations, significantly reducing lock contention and logging overhead.
Prefer local indexes on partitions and use global indexes cautiously. Also, avoid creating too many partitions, because metadata and optimizer overhead can eventually outweigh the benefits.
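The object-level archiving described above looks like this in PostgreSQL-style syntax, reusing the monthly partition from the earlier example:

```sql
-- Detach first if the old data should survive as a standalone archive table
ALTER TABLE TB_ORDER_LOG DETACH PARTITION TB_ORDER_LOG_202401;

-- Or drop the partition outright when the data is no longer needed;
-- this is a metadata operation, far cheaper than a row-by-row DELETE
DROP TABLE TB_ORDER_LOG_202401;
```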
SQL quality directly creates order-of-magnitude performance differences
The first rule of high-quality SQL is to retrieve only the columns you need. Avoid SELECT * without exception. The second rule is to avoid unnecessary computation, especially wrapping indexed columns in functions or expressions, because that can easily trigger full table scans.
In multi-table joins, try to keep the number of joined tables within five. IN subqueries can often be rewritten as EXISTS. High-frequency template SQL should consistently use bind variables to reduce hard parsing and execution plan instability.
```sql
PREPARE query_order(TEXT) AS
    SELECT order_id, order_date
    FROM TB_SHOP_ORDER
    WHERE status = $1; -- Use bind variables to reuse execution plans

EXECUTE query_order('PAID');
```
This SQL shows how bind variables can reduce parse cost and stabilize OLTP performance.
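The IN-to-EXISTS rewrite mentioned above can be sketched as follows; TB_VIP_CUSTOMER is a hypothetical table introduced for the example:

```sql
-- Instead of:
--   WHERE customer_id IN (SELECT customer_id FROM TB_VIP_CUSTOMER)
SELECT o.order_id, o.order_date
FROM TB_SHOP_ORDER o
WHERE EXISTS (
    SELECT 1
    FROM TB_VIP_CUSTOMER v
    WHERE v.customer_id = o.customer_id  -- correlated: stops at the first match
);
```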
DML, transactions, and lock management are also part of performance design
For bulk imports, you can drop indexes first, rebuild them after import, and then run ANALYZE. To clear an entire table, prefer TRUNCATE instead of row-by-row DELETE. Split large transactions into smaller ones and enforce a consistent resource access order to reduce deadlock risk.
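A sketch of the bulk-import sequence, reusing the TB_SYS_LOG table from earlier; the CSV path is a hypothetical placeholder:

```sql
-- 1. Drop the index so the load does not pay per-row index maintenance
DROP INDEX IF EXISTS IDX_SYS_LOG_TIME;

-- 2. Bulk load (path is illustrative)
COPY TB_SYS_LOG (log_time, log_detail)
    FROM '/data/import/logs.csv' WITH (FORMAT csv);

-- 3. Rebuild the index once, then refresh optimizer statistics
CREATE INDEX IDX_SYS_LOG_TIME ON TB_SYS_LOG USING BRIN (log_time);
ANALYZE TB_SYS_LOG;

-- To clear a whole table, prefer TRUNCATE over row-by-row DELETE
TRUNCATE TABLE TB_SYS_LOG;
```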
For paginated queries, fetch the total count first and then retrieve the result set. As a practical guideline, keep each page within 500 rows to avoid putting simultaneous pressure on the database, middleware, and application.
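The two-step pagination pattern can be sketched against TB_SHOP_ORDER; the 'PAID' status value is illustrative:

```sql
-- Step 1: total count, used by the application to render the pager
SELECT COUNT(*) FROM TB_SHOP_ORDER WHERE status = 'PAID';

-- Step 2: fetch one page, with a deterministic ORDER BY and a
-- page size well under the 500-row guideline
SELECT order_id, order_date
FROM TB_SHOP_ORDER
WHERE status = 'PAID'
ORDER BY order_id
LIMIT 100 OFFSET 200;
```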
PL/SQL development must balance batching, exception handling, and security
Typical PL/SQL problems are not about syntax. They come from row-by-row processing and missing exception handling. When checking whether data exists, do not overuse COUNT(*). For batch DML, prefer FORALL and BULK COLLECT to reduce SQL engine context switching.
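A minimal BULK COLLECT / FORALL sketch, assuming KingbaseES's Oracle-compatible PL/SQL mode and reusing the TB_HR_EMPLOYEE table from earlier (the dept_id filter and raise percentage are illustrative):

```sql
DECLARE
    TYPE t_emp_ids IS TABLE OF TB_HR_EMPLOYEE.emp_id%TYPE;
    v_ids t_emp_ids;
BEGIN
    -- One fetch instead of a row-by-row cursor loop
    SELECT emp_id BULK COLLECT INTO v_ids
    FROM TB_HR_EMPLOYEE
    WHERE dept_id = 10;

    -- One round trip to the SQL engine for the whole batch
    FORALL i IN 1 .. v_ids.COUNT
        UPDATE TB_HR_EMPLOYEE
        SET salary = salary * 1.05
        WHERE emp_id = v_ids(i);
END;
```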
Exception handling should follow a standard template: roll back on business failure, write an error log, and then re-raise the exception. In dynamic SQL, bind all values and validate object names against a whitelist to prevent SQL injection.
```sql
CREATE OR REPLACE PROCEDURE delete_employee(p_emp_id INTEGER) AS
BEGIN
    EXECUTE IMMEDIATE 'DELETE FROM emp WHERE empno = :num'
        USING p_emp_id; -- Parameters must be bound to prevent injection through string concatenation
END;
```
This procedure demonstrates the security boundary for dynamic SQL: bind parameters and never concatenate user input directly.
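The exception-handling template described above (roll back, log, re-raise) can be sketched as follows. TB_ERR_LOG and the procedure are hypothetical, and note that in practice the log INSERT usually needs an autonomous transaction so the subsequent RAISE does not undo it:

```sql
CREATE OR REPLACE PROCEDURE adjust_salary(p_emp_id INTEGER, p_delta DECIMAL) AS
BEGIN
    UPDATE TB_HR_EMPLOYEE
    SET salary = salary + p_delta
    WHERE emp_id = p_emp_id;
EXCEPTION
    WHEN OTHERS THEN
        ROLLBACK;  -- roll back the failed business change
        INSERT INTO TB_ERR_LOG (err_time, err_msg)
        VALUES (CURRENT_TIMESTAMP, SQLERRM);  -- write an error log
        RAISE;     -- re-raise so the caller sees the failure
END;
```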
Connection pooling and security strategy cannot be postponed at the system level
OLTP and OLAP require completely different optimization approaches. OLTP depends on memory hit rates, bind variables, and static connection pools. OLAP depends on partitioning, parallelism, and large-scan capacity. Connection pools should avoid unbounded expansion. As a practical rule, no more than 10 connections per CPU core is often a safer baseline.
From a security perspective, enforce default password changes, least privilege, sensitive data encryption, and auditable activity logs. Permission granularity should be refined down to schema, table, and operation level instead of relying on broad GRANT ALL usage.
FAQ
Q1: What design mistakes most commonly cause slow queries in KingbaseES?
A1: The most common problems are mismatched field types, applying functions to indexed columns, and using SELECT *. These three issues most often invalidate indexes and amplify I/O cost.
Q2: When should you start considering partitioned tables?
A2: You should evaluate Range, List, or Hash partitioning as soon as a single table approaches 50 million rows, 100 GB, or already shows slow archiving, slow cleanup, or degraded time-range queries.
Q3: How should you balance foreign keys and indexes in OLTP workloads?
A3: In high-write OLTP systems, teams often relax foreign key enforcement and strengthen consistency validation at the application layer instead. Keep indexes only on high-frequency filtering and join columns, and avoid over-indexing for hypothetical queries.
Core Summary: This article reconstructs the core KingbaseES standards for table design, indexes, partitioning, SQL writing, PL/SQL, security, and connection pooling. It distills the most practical performance strategies and anti-pattern checklists for high-concurrency OLTP systems and large-scale data workloads.