This article breaks down the real-time filter architecture used in a lightweight HarmonyOS 6 camera app. It uses OpenGL ES as the rendering core, converts OES textures to 2D textures, applies FBO-based offscreen processing and a GLSL filter chain to sustain a 60fps preview, and connects ArkTS with native code through NAPI. The main challenges are real-time performance, extensibility, and efficient cross-layer invocation. Keywords: HarmonyOS, OpenGL ES, real-time filters.
Technical specifications at a glance
| Parameter | Description |
|---|---|
| Target platform | HarmonyOS 6 |
| Primary languages | C++, GLSL, ArkTS |
| Graphics API | OpenGL ES 3.0 / EGL |
| Texture types | GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_2D |
| Rendering model | FBO offscreen rendering, multi-pass filter chain |
| Cross-layer communication | NAPI |
| Target performance | Capture, convert, filter, and display within one ~16.7 ms frame budget (60 fps) |
| Core dependencies | NativeImage, XComponent, EGL, OpenGL ES, NAPI |
This filter system delivers its core value by decoupling real-time preview from complex visual effects.
In camera scenarios, filters are not simple image post-processing. They are part of a continuously running real-time image pipeline. The system must finish capture, texture conversion, shader computation, and final presentation within the per-frame budget. Otherwise, the preview will stutter.
After the original camera stream enters the GPU, it typically arrives as GL_TEXTURE_EXTERNAL_OES. This texture type works well for camera input, but it is not ideal for complex convolution, multi-sampling, or multi-stage post-processing. That is why the first step must decouple the texture format.
OES-to-2D conversion is the prerequisite for a real-time filter chain.
The implementation uses a three-stage rendering pipeline: first, draw the OES texture into an offscreen FBO to produce a standard GL_TEXTURE_2D; second, chain multiple filters on top of the 2D texture; finally, draw the result onto the Surface bound to XComponent.
```cpp
void GPUImageRenderer::RequestRender() {
    // Step 1: Draw the camera OES texture into the FBO and convert it to a standard 2D texture
    DrawOESToFBO();
    // Step 2: Apply the current filter on the 2D texture
    if (currentFilter_) {
        currentFilter_->onDraw(fboTextureId_); // Core filter processing
    }
    // Step 3: Swap buffers and present the result on screen
    eglSwapBuffers(eglDisplay_, eglSurface_);
}
```
This code defines the smallest complete loop for real-time filter rendering.
GLSL fragment shaders perform most of the visual effect computation.
The filter core lives in the fragment shader for a straightforward reason: the fragment stage is inherently pixel-oriented, which makes it ideal for color mapping, neighborhood sampling, edge detection, and convolution. The real difference between filters is essentially the sampling strategy and the math.
A mosaic filter achieves block sampling through coordinate quantization.
A mosaic effect is not a blur. It maps continuous UV coordinates onto a coarser grid, then fills the entire block with a single sampled color. This approach produces a stable visual effect with almost no additional texture dependencies.
```glsl
#version 300 es
precision mediump float;
in vec2 textureCoordinate;
uniform float imageWidthFactor;  // 1.0 / width
uniform float imageHeightFactor; // 1.0 / height
uniform sampler2D inputImageTexture;
uniform float pixel; // Mosaic block size in pixels
out vec4 fragColor;

void main() {
    vec2 uv = textureCoordinate;
    float dx = pixel * imageWidthFactor;
    float dy = pixel * imageHeightFactor;
    // Quantize continuous coordinates to the upper-left corner of each block for pixel-block sampling
    vec2 coord = vec2(dx * floor(uv.x / dx), dy * floor(uv.y / dy));
    fragColor = texture(inputImageTexture, coord); // Sample a uniform color for the entire block
}
```
This shader replaces the original coordinates with quantized coordinates to create a stable pixelation effect.
A sketch filter relies on the Sobel operator for edge enhancement.
A sketch effect usually starts by converting the image to luminance, then computes horizontal and vertical gradients over a 3×3 neighborhood. This approach is sensitive to edges, which makes it well suited for emphasizing contours while suppressing texture details and producing a hand-drawn line-art look.
```glsl
#version 300 es
precision mediump float;
uniform sampler2D inputImageTexture;
in vec2 topLeftTextureCoordinate;
in vec2 topTextureCoordinate;
in vec2 topRightTextureCoordinate;
in vec2 leftTextureCoordinate;
in vec2 rightTextureCoordinate;
in vec2 bottomLeftTextureCoordinate;
in vec2 bottomTextureCoordinate;
in vec2 bottomRightTextureCoordinate;
out vec4 fragColor;

void main() {
    float topLeft = texture(inputImageTexture, topLeftTextureCoordinate).r;
    float top = texture(inputImageTexture, topTextureCoordinate).r;
    float topRight = texture(inputImageTexture, topRightTextureCoordinate).r;
    float left = texture(inputImageTexture, leftTextureCoordinate).r;
    float right = texture(inputImageTexture, rightTextureCoordinate).r;
    float bottomLeft = texture(inputImageTexture, bottomLeftTextureCoordinate).r;
    float bottom = texture(inputImageTexture, bottomTextureCoordinate).r;
    float bottomRight = texture(inputImageTexture, bottomRightTextureCoordinate).r;
    // Sobel gradient across the horizontal axis (bottom row minus top row)
    float h = -topLeft - 2.0 * top - topRight + bottomLeft + 2.0 * bottom + bottomRight;
    // Sobel gradient across the vertical axis (right column minus left column)
    float v = -bottomLeft - 2.0 * left - topLeft + bottomRight + 2.0 * right + topRight;
    float mag = 1.0 - length(vec2(h, v)); // Stronger edges produce darker output
    fragColor = vec4(vec3(mag), 1.0);
}
```
This shader extracts edges with a 3×3 convolution and outputs a grayscale contour that resembles a pencil sketch.
The C++ filter engine determines the system’s extensibility ceiling.
Once the number of filters grows from 2 to more than 20, scattered shaders and ad hoc branching quickly become unmanageable. That is why the original design uses layered modeling: shared logic is pushed into base classes, while differences stay inside each filter implementation.
GPUImageFilter handles shader compilation, program linking, uniform binding, and single-pass drawing. GPUImageFilterGroup handles multi-pass chaining. GPUImage3x3TextureSamplingFilter provides a unified abstraction for sampling offsets used by convolution-based filters.
This inheritance model reduces filter switching to a single pointer update.
The render loop depends only on currentFilter_. The UI does not need to care about OpenGL state transitions, and the native layer does not need to rebuild the entire render chain. It only has to replace the current filter instance to switch the preview immediately.
```cpp
class GPUImageFilter {
public:
    virtual void onDraw(GLuint textureId) = 0; // Unified filter draw entry point
    virtual ~GPUImageFilter() = default;
};

class GPUImageFilterGroup : public GPUImageFilter {
public:
    void onDraw(GLuint textureId) override {
        // Execute multiple passes in sequence, suitable for composite filters
    }
};
```
This abstraction completely separates what a filter is from how the system schedules it.
ArkTS and the NDK establish a low-overhead control channel through NAPI.
Filter preview is a native rendering problem, while filter selection is a UI interaction problem. If the two sides are coupled too tightly, the system will lose balance between maintainability and responsiveness. NAPI is a highly practical bridge for this pattern in HarmonyOS native development.
```cpp
static napi_value SetFilter(napi_env env, napi_callback_info info) {
    size_t argc = 1;
    napi_value args[1] = {nullptr};
    napi_get_cb_info(env, info, &argc, args, nullptr, nullptr);
    int32_t filterIndex = 0;
    napi_get_value_int32(env, args[0], &filterIndex);
    // Synchronize the filter index from ArkTS to the native renderer
    if (ndkCamera_) {
        ndkCamera_->SetFilter(filterIndex);
    }
    return nullptr;
}
```
This interface converts a UI action into a filter index that the native layer can consume.
```typescript
List() {
  ForEach(this.filterList, (item, index) => {
    ListItem() {
      FilterItem({ name: item.name })
        .onClick(() => {
          this.currentFilterIndex = index; // Update the current selection
          gpuimagelib.setFilter(index);    // Call NAPI to switch the native filter
        })
    }
  })
}
```
This ArkTS code completes the full control loop from a UI click to the underlying filter switch.
Real-world results show that this architecture balances performance and extensibility.
The original image shows a standard camera preview. The filtered image demonstrates clear color stylization and texture reconstruction, which indicates that the rendering chain already provides stable real-time processing.
This architecture leaves a clean integration path for future AI camera features.
Once the OES-to-2D conversion and filter-chain abstraction are in place, the existing rendering mainline can support future additions such as beauty filters, segmentation, detection, or MindSpore Lite inference. The render thread continues to guarantee display, while the AI path handles semantic tasks independently.
That is where this lightweight camera architecture becomes truly valuable. It is not just a filter demo. It is an extensible, composable, and continuously evolvable real-time vision foundation.
FAQ
1. Why not apply all filters directly to GL_TEXTURE_EXTERNAL_OES?
Because OES textures impose more restrictions in complex sampling, convolution, and multi-pass processing. Converting them to GL_TEXTURE_2D first simplifies shader design significantly and improves filter-chain reusability.
2. What matters most if you want real-time filters to hold a stable 60fps?
The key is to minimize CPU involvement, avoid frequent texture re-creation, keep computation on the GPU, and control shader complexity per frame. Multi-pass filters must be designed carefully, with FBO and program reuse as a priority.
3. Can ArkTS implement filters directly, and why use the NDK at all?
ArkTS works well for lightweight UI interaction, but high-frequency image processing is better suited to NDK + OpenGL ES. A native rendering pipeline provides lower latency, higher throughput, and more stable frame rates, especially for camera preview scenarios.