[AI Readability Summary] AI Academic Polisher is an AIGC-powered academic writing refinement system. Its core strength is a unified architecture that supports both Linux Server and Windows Desktop deployment modes. It improves the long-document experience through asynchronous queues, SSE streaming, and ordered concurrent processing. It addresses key pain points such as privacy concerns, deployment complexity, and poor task responsiveness. Keywords: environment-adaptive architecture, SSE, RQ.
The technical specification snapshot captures the project at a glance
| Parameter | Details |
|---|---|
| Project Name | AI Academic Polisher |
| Core Languages | Python, JavaScript |
| Backend Framework | Flask |
| Frontend Framework | Vue 3 + Pinia + Vite |
| Communication Protocols | HTTP, SSE, Redis Pub/Sub |
| Task Queue | RQ / MemoryQueue |
| Database | MySQL / SQLite |
| Core Dependencies | Redis, SQLAlchemy, ThreadPoolExecutor, PyInstaller |
| Open Source License | MIT |
| Repository | https://github.com/math89423-star/AI-Academic-Polisher |
| Stars | Not provided in the source |
One codebase spans two deployment realities
This project is not just a simple “paper rewriting tool.” It is an engineered system designed around the academic polishing workflow. It must simultaneously satisfy four goals: server deployment, multi-user sharing, desktop offline usage, and a low technical barrier to entry.
The real value is not the prompt itself, but the infrastructure abstraction behind it. The author encapsulates Redis, queues, databases, and real-time streaming as replaceable dependencies, so the business layer does not need to care about runtime environment differences.
The refinement results validate the product positioning
The original data provides before-and-after comparisons across multiple detection platforms: PaperPass drops from 75.24% to 0.41%, Weipu drops from 42.79% to 3.34%, and both the Chinese and English samples on Zhuque drop to 0%. These results should be read primarily as evidence of improved rewriting quality for academic expression, not as a mechanism for bypassing detection.
AI Visual Insight: The image shows the PaperPass detection results panel. The key takeaway is that the AI-recognition ratio drops significantly after polishing, which suggests that the system changes sentence structure, wording density, and textual rhythm while preserving meaning.
AI Visual Insight: The image presents the Weipu AIGC detection report interface. It highlights the probability change before and after refinement, indirectly showing that the model output more closely resembles human academic writing after stylistic restructuring.
AI Visual Insight: The image shows results from the Zhuque detection platform. Both Chinese and English samples demonstrate extremely low recognition rates, which indicates that the prompt strategy and post-processing pipeline have some degree of generalization across multilingual academic text.
```python
results = {
    "PaperPass": (75.24, 0.41),
    "Weipu": (42.79, 3.34),
    "Zhuque English": (100.0, 0.0),
    "Zhuque Chinese": (100.0, 0.0),
}
# Core logic: record before-and-after detection changes with structured data
for name, (before, after) in results.items():
    print(name, before, after)
```
This code demonstrates a structured way to record the test results, making later aggregation and presentation easier.
The dual-mode architecture pushes deployment differences down to the infrastructure layer
The project determines its runtime mode through DEPLOY_MODE=server|desktop|auto. In auto mode, it detects the operating system automatically: Windows defaults to desktop mode, while Linux defaults to server mode.
The benefit is direct and practical: end users can launch the application by double-clicking an EXE, while developers still retain Docker-based service deployment. The business logic does not need separate platform-specific branches.
Mode detection and factory switching form the first critical abstraction layer
```python
import os
import platform

def resolve_deploy_mode():
    mode = os.environ.get("DEPLOY_MODE", "auto")
    if mode == "auto":
        # Core logic: infer the deployment form from the operating system
        return "desktop" if platform.system() == "Windows" else "server"
    return mode
```
This code handles automatic deployment mode selection and serves as the unified entry point for the dual-mode architecture.
Once the mode is resolved, the project uses a factory pattern to switch between Redis and queue implementations. Server mode uses Redis + RQ, while Desktop mode falls back to MemoryRedis + MemoryQueue, with the same interface preserved across both.
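A minimal sketch of that switch might look like the following; build_infrastructure is a hypothetical name, and the project's actual factory may be shaped differently:

```python
def build_infrastructure(mode, app=None):
    # Hypothetical factory: server mode wires real Redis + RQ,
    # desktop mode injects the in-memory stand-ins introduced below
    if mode == "server":
        import redis
        from rq import Queue
        conn = redis.Redis()
        return conn, Queue(connection=conn)
    return MemoryRedis(), MemoryQueue(app)
```

Because both branches return objects with the same interface, callers never need to know which mode they are running in.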
MemoryRedis enables a transparent replacement for real Redis
The design goal of MemoryRedis is not to simulate all of Redis, but to implement only the subset of methods required by the current business logic, including KV, Hash, Set, and Pub/Sub. It uses in-memory dictionaries plus locking to form a minimal viable runtime.
```python
from collections import defaultdict
import queue
import threading

class MemoryRedis:
    def __init__(self):
        self._kv = {}
        self._channels = defaultdict(list)
        self._lock = threading.Lock()

    def subscribe(self, channel):
        # Each subscriber registers its own queue under the channel
        q = queue.Queue()
        with self._lock:
            self._channels[channel].append(q)
        return q

    def publish(self, channel, message):
        with self._lock:
            # Core logic: broadcast the message to all subscribers of this channel
            for q in self._channels[channel]:
                q.put(message)
```
This code shows a minimal in-memory Pub/Sub implementation: each subscriber holds its own queue, and publish fans messages out to all of them.
Choosing RQ reflects disciplined control over complexity
The project does not choose Celery. Instead, it adopts the lighter RQ. The reasoning is pragmatic: academic polishing tasks are primarily asynchronous I/O workloads and do not require Celery’s heavier broker, backend, and task-decorator ecosystem.
RQ offers clear advantages: fewer dependencies, fast integration, and straightforward debugging. For small to medium asynchronous systems, this “sufficient and transparent” capability is often more valuable than feature completeness.
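For reference, task submission with RQ's public API is only a few lines (a generic sketch, not code from this repository; polish_document and doc_id are placeholders):

```python
from redis import Redis
from rq import Queue

q = Queue(connection=Redis())

# polish_document and doc_id stand in for the project's actual task function
job = q.enqueue(polish_document, doc_id)
print(job.get_status())
```

A separate rq worker process then consumes the queue, which is exactly the piece that desktop mode replaces with a daemon thread.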
MemoryQueue restores asynchronous capability on the desktop side with a daemon thread
In desktop mode, there is no separate worker process. The author uses a daemon thread to consume the in-memory queue and manually injects Flask app_context. This is the key mechanism that allows desktop mode to reuse the same business logic as the server side.
```python
import queue, threading

class MemoryQueue:
    def __init__(self, app):
        self._q = queue.Queue()
        self._app = app
        # Core logic: a daemon thread drains the queue in the background
        threading.Thread(target=self._worker, daemon=True).start()

    def enqueue(self, func, *args):
        # Core logic: put the task function and arguments into the local queue
        self._q.put((func, args))

    def _worker(self):
        while True:
            func, args = self._q.get()
            with self._app.app_context():
                func(*args)
```
This code shows how the desktop side simulates RQ-style task submission and execution: an in-memory queue drained by a daemon thread running inside the Flask app context.
SSE is a better fit than WebSocket for one-way progress streaming
The frontend only needs to receive polishing progress, incremental text, and completion events; it does not require a bidirectional real-time session. For this use case, SSE is the lighter choice: it runs over plain HTTP and gets the browser's built-in automatic reconnection for free.
On the backend, the application subscribes to the progress:{task_id} channel and emits an event stream. On the frontend, EventSource listens for streamed results. This design keeps progress updates real time while maintaining a low implementation cost.
The SSE design depends on disabling buffering correctly in Nginx
```python
from flask import Response

def stream(pubsub):
    def generate():
        for msg in pubsub.listen():
            # Skip subscribe confirmations; forward only real messages
            if msg["type"] != "message":
                continue
            # Core logic: wrap each message as an SSE data frame
            yield f"data: {msg['data']}\n\n"
    return Response(generate(), mimetype="text/event-stream")
```
This code demonstrates how the backend emits a standard SSE response stream.
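Wiring a task's channel into that stream could look like this sketch; the route path and function name are assumptions, while the progress:{task_id} channel name comes from the design described above:

```python
import redis
from flask import Flask

app = Flask(__name__)

@app.route("/api/tasks/<task_id>/stream")  # route path is illustrative
def progress_stream(task_id):
    pubsub = redis.Redis().pubsub()
    # Subscribe to this task's channel, then reuse the stream() helper above
    pubsub.subscribe(f"progress:{task_id}")
    return stream(pubsub)
```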
If the reverse proxy buffers responses by default, the client will receive delayed chunks instead of immediate updates. In production, you must disable proxy_buffering and pair it with thread-based workers to handle long-lived connections correctly.
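Beyond an nginx-side proxy_buffering off directive, a common per-response technique is the X-Accel-Buffering header, which nginx honors for individual responses (shown here as a general pattern, not necessarily what this project does):

```python
def stream_unbuffered(pubsub):
    # Reuse the SSE response built above, then opt this response
    # out of nginx buffering so each frame reaches the client immediately
    resp = stream(pubsub)
    resp.headers["X-Accel-Buffering"] = "no"
    resp.headers["Cache-Control"] = "no-cache"
    return resp
```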
Ordered concurrent processing improves the long-document experience
DOCX academic papers often contain many paragraphs. If the model processes them one by one in sequence, total latency becomes high. The project uses ThreadPoolExecutor for concurrent execution and maintains a future -> index mapping so results can be written back in the original order.
This is a very typical and practical pattern: execute concurrently, commit in order. It works well for paragraph-level polishing, chunked summarization, and batch translation tasks.
Ordered concurrency is the core mechanism behind the long-text UX improvement
```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def process_paragraphs(paragraphs, worker):
    results = [None] * len(paragraphs)
    with ThreadPoolExecutor(max_workers=5) as pool:
        future_map = {
            pool.submit(worker, p): i
            for i, p in enumerate(paragraphs)
        }
        for future in as_completed(future_map):
            idx = future_map[future]
            # Core logic: write back using the original index to preserve stable ordering
            results[idx] = future.result()
    return results
```
This code shows how to process a long document concurrently without scrambling paragraph order.
The project also implements a cancellation mechanism: once the user triggers cancellation, the system writes a Redis flag, and the worker polls that flag between paragraphs to stop further API consumption promptly. This granularity balances responsiveness with implementation simplicity.
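A hedged sketch of that flag-polling pattern, with key names and function shapes that are illustrative rather than taken from the repository:

```python
import redis

r = redis.Redis()

def request_cancel(task_id):
    # User-triggered: set a cancellation flag with a safety expiry
    r.set(f"cancel:{task_id}", 1, ex=3600)

def polish_all(task_id, paragraphs, worker):
    results = []
    for p in paragraphs:
        # Core logic: check the flag between paragraphs to stop API spend promptly
        if r.get(f"cancel:{task_id}"):
            break
        results.append(worker(p))
    return results
```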
Hot-swappable prompts and centralized key management improve maintainability
Prompt files live in the prompts/ directory and are loaded on demand for each task rather than being cached at startup. That means you can modify prompt strategy files without restarting the service, which is especially useful for high-frequency prompt tuning.
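On-demand loading can be as small as re-reading the file per task; this sketch assumes plain-text prompt files named after their strategy:

```python
from pathlib import Path

PROMPT_DIR = Path("prompts")

def load_prompt(name: str) -> str:
    # Core logic: re-read the file on every task instead of caching at startup,
    # so prompt edits take effect without a service restart
    return (PROMPT_DIR / f"{name}.txt").read_text(encoding="utf-8")
```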
Another reusable design choice is RedisKeyManager. When distributed key names are scattered across multiple files, renaming, troubleshooting, and future evolution all become hidden technical debt. Centralized key management significantly reduces that maintenance burden.
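A centralized key manager can stay tiny; in this sketch, progress:{task_id} matches the SSE design above, while the other key name is illustrative:

```python
class RedisKeyManager:
    # Core logic: every Redis key name is derived here, never hard-coded elsewhere,
    # so renames and audits touch exactly one file
    @staticmethod
    def progress_channel(task_id: str) -> str:
        return f"progress:{task_id}"

    @staticmethod
    def cancel_flag(task_id: str) -> str:
        return f"cancel:{task_id}"
```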
FAQ uses a structured question-and-answer format
1. Why doesn’t this project use Celery directly?
Because this scenario benefits more from lower complexity and better maintainability. RQ has fewer dependencies and is easier to debug, while still fully covering asynchronous I/O tasks such as academic paper polishing.
2. Why does the project use SSE instead of WebSocket for real-time updates?
Because the requirement is a one-way progress stream. SSE runs over HTTP, has native browser auto-reconnect support, and keeps both deployment and implementation lighter.
3. What is the most reusable lesson from the dual-mode architecture?
It is not simply “support desktop and server deployment.” The key lesson is to abstract infrastructure interfaces first, then inject implementations by environment. As long as the interface remains stable, extending runtime modes becomes natural.
Core Summary: This article reconstructs the core engineering design of AI Academic Polisher: an environment-adaptive architecture that unifies Linux server and Windows desktop deployment, combining RQ/MemoryQueue, Redis/MemoryRedis, SSE real-time streaming, and ordered concurrent processing to deliver paper polishing, asynchronous long-document handling, and low-coupling extensibility.