ScanForge: Generating PLC Projects Without an IDE
I built a development toolkit that generates complete Allen-Bradley PLC projects from markdown specifications. 39 MCP tools, 59 hardware-verified templates, 5,838 tests. You write a spec, run the pipeline, and get a valid .ccwsln project with ladder logic, tag databases, FAT documents, and wiring diagrams. The Rockwell compiler accepts it. The PLC runs it. Nothing else like this exists.
There is no open-source tool, no commercial product, and no Rockwell-provided utility that generates PLC projects programmatically. The entire industry works inside proprietary IDEs - point and click, one rung at a time. If you want to create a project from code, you are reverse-engineering undocumented file formats yourself. That is exactly what ScanForge does.
The Problem
PLC programming is stuck in the 1990s. You open Rockwell's Connected Components Workbench, drag contacts and coils onto a ladder rung, click through property dialogs for every single tag, and repeat for every rung in every routine. A simple motor start/stop circuit takes 15 minutes of clicking. A six-channel CEMS analyzer takes hours.
The tooling is closed-source, Windows-only, and built around manual point-and-click workflows. There is no scripting layer. No CLI. No API. No way to generate a project programmatically.
The irony is that PLC programs are structurally simple. A motor start/stop is seven tags and three rungs. The logic fits in a markdown table. But the distance between "I know what this program should do" and "it's downloaded to the PLC and running" is filled with repetitive manual steps that have nothing to do with the actual control logic.
I wanted to close that gap.
What ScanForge Does
ScanForge is a Python toolkit that generates complete Allen-Bradley PLC projects from markdown specifications. It targets the Micro800 series (Micro820, Micro850) and ControlLogix platforms. The primary interface is an MCP server with 39 tools that any MCP-compatible client can call. You write a spec (or let the client draft one from a description), run the pipeline, and get a project.
A typical pipeline looks like this:
# You type this into Claude Code:
"Build me a motor start/stop with E-stop for a Micro850"
# Claude runs a 14-step pipeline:
1. plc_pipeline_preflight - search templates, pull docs
2. recommend_template - match motor_start_stop.md
3. save_spec - write the spec to disk
4. validate_spec - check for safety issues
5. plc_generate - produce .ccwsln project
6. plc_validate - 9 automated checks
7. plc_simulate - run scan cycles
8. plc_generate_fat - Factory Acceptance Test doc
9. plc_generate_loop_check - analog loop check sheets
10. plc_generate_hmi_export - PanelView HMI tag file
11. plc_render_pdf - visual ladder diagram
12. plc_render_block_diagram - system block diagram
13. plc_render_wiring_diagram - I/O wiring diagram
14. open_in_ccw - launch Rockwell IDE
The output is a real .ftdwsln or .ccwsln project that opens directly in FactoryTalk Design Workbench (FTDW, now the primary output format) or Connected Components Workbench (CCW). All rungs, branches, timers, latch/unlatch logic, tag databases, and project scaffolding are generated. The compiler accepts it. The PLC runs it.
The Spec Format
Every PLC program starts as a markdown file. Here is a fragment from the motor start/stop template:
# Motor Start/Stop with Seal-In
## Tag Database
| Tag Name | Type | Scope | I/O Address | Description |
|--------------|------|------------|---------------|------------------------------------|
| StartPB | BOOL | Controller | _IO_EM_DI_00 | Start pushbutton (NO, momentary) |
| StopPB | BOOL | Controller | _IO_EM_DI_01 | Stop pushbutton (NC, for safety) |
| EStop | BOOL | Controller | _IO_EM_DI_02 | Emergency stop (NC, for safety) |
| MotorOL | BOOL | Controller | _IO_EM_DI_03 | Motor overload relay contact (NC) |
| MotorRun | BOOL | Controller | _IO_EM_DO_00 | Motor contactor output |
### Rung 0001 - Motor Start with Seal-In
**Instructions:**
```
XIC(StopCircuitOk) -> [XIC(StartPB), XIC(MotorRunning)] -> OTE(MotorRunning)
```
The arrow notation (->) means series. Brackets ([A, B]) mean parallel branches. Function blocks use the standard IEC 61131-3 mnemonics: TON for on-delay timers, CTU for up-counters, GEQ for greater-than-or-equal comparisons. The spec parser converts this to a structured intermediate representation. The STF generator converts that IR to ISaGRAF ladder logic files. The scaffold generator wraps everything into a valid CCW project directory.
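To make the notation concrete, here is a minimal sketch of how a rung string could be turned into a nested structure. It is not ScanForge's actual parser or IR: the Instr and Parallel classes and the parse_series function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
import re


@dataclass
class Instr:
    mnemonic: str          # e.g. "XIC", "OTE"
    operands: list         # e.g. ["StartPB"]


@dataclass
class Parallel:
    branches: list         # each branch is a list of series elements


_INSTR = re.compile(r"^(\w+)\((.*)\)$")


def _split_top_level(text, sep):
    """Split on sep, ignoring separators nested inside () or []."""
    parts, depth, start, i = [], 0, 0, 0
    while i < len(text):
        ch = text[i]
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif depth == 0 and text.startswith(sep, i):
            parts.append(text[start:i].strip())
            i += len(sep)
            start = i
            continue
        i += 1
    parts.append(text[start:].strip())
    return parts


def _parse_element(text):
    """Parse one series element: a bracketed parallel group or an instruction."""
    if text.startswith("[") and text.endswith("]"):
        branches = _split_top_level(text[1:-1], ",")
        return Parallel([parse_series(branch) for branch in branches])
    match = _INSTR.match(text)
    if not match:
        raise ValueError(f"cannot parse instruction: {text!r}")
    operands = [op.strip() for op in match.group(2).split(",") if op.strip()]
    return Instr(match.group(1), operands)


def parse_series(text):
    """Parse 'A -> [B, C] -> D' into a list of series elements."""
    return [_parse_element(part) for part in _split_top_level(text, "->")]


rung = parse_series(
    "XIC(StopCircuitOk) -> [XIC(StartPB), XIC(MotorRunning)] -> OTE(MotorRunning)"
)
for element in rung:
    print(element)
```

Keeping series as plain lists and parallel groups as a separate node type means a branch can itself contain a series, which is what a seal-in with extra conditions in one leg would need.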
This is the core insight: PLC programs are highly structured. The "creative" part is choosing what logic to implement. The mechanical part is producing the dozen files that Rockwell's IDE expects. That mechanical part is what ScanForge automates.
Reverse-Engineering Rockwell's File Formats
CCW projects use ISaGRAF, a runtime from the early 2000s. The ladder logic lives in .stf text files with a grid coordinate system. Each instruction has a row, column, and type code. Branches use BST/BND markers. Function blocks span three columns. The tag database lives in an Access .accdb file. Project metadata is MSBuild XML.
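The grid idea can be illustrated with a small sketch of the layout arithmetic only. Nothing below reproduces the real .stf syntax or its type codes, which this post does not show. The Cell class, the layout_series function, and the contents of FUNCTION_BLOCKS are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical subset of function-block mnemonics, for illustration only.
FUNCTION_BLOCKS = {"TON", "TOF", "CTU", "CTD"}


@dataclass
class Cell:
    row: int
    col: int
    width: int
    mnemonic: str
    operand: str


def layout_series(instructions, row=0):
    """Assign grid cells left to right; function blocks take three columns."""
    cells, col = [], 0
    for mnemonic, operand in instructions:
        width = 3 if mnemonic in FUNCTION_BLOCKS else 1
        cells.append(Cell(row, col, width, mnemonic, operand))
        col += width
    return cells


cells = layout_series(
    [("XIC", "StartPB"), ("TON", "StartDelay"), ("OTE", "MotorRun")]
)
for cell in cells:
    print(cell)
```

In this toy layout the coil lands in column 4 because the timer block occupies columns 1 through 3.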
None of this is documented. I reverse-engineered every format by creating projects in CCW, exporting them, diffing the files, and building generators that produce byte-identical output. The stf_generator.py file alone handles 69 instruction types across contacts, coils, timers, counters, comparisons, math operations, type conversions, string manipulation, Modbus TCP communication, PID control, and subroutine calls.
When Rockwell released FactoryTalk Design Workbench as the CCW successor, I reverse-engineered that format too. FTDW uses SQLite instead of Access, JSON instead of MSBuild XML, and a different cache format for the ladder editor. FTDW is now the primary output. ScanForge also includes an L5X converter that can downgrade CompactLogix projects to Micro850 - tiered instruction mapping with a conversion report showing what translates directly, what needs workarounds, and what can't be converted.
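A tiered mapping with a conversion report might look roughly like the sketch below. The tier names, the INSTRUCTION_MAP table, and the convert function are assumptions for illustration and do not reflect the converter's real tables. Only the LIM and CPT workarounds come from the platform constraints listed in this post. UNKNOWN_INSTR is a made-up placeholder, not a real instruction.

```python
from collections import Counter

# Illustrative mapping: (tier, Micro850 replacement). Not the real converter table.
INSTRUCTION_MAP = {
    "XIC": ("direct", "XIC"),
    "OTE": ("direct", "OTE"),
    "TON": ("direct", "TON"),
    "LIM": ("workaround", "GEQ + LEQ in series"),
    "CPT": ("workaround", "individual ADD/SUB/MUL/DIV blocks"),
}


def convert(instructions):
    """Return (tier, source, target) rows plus a per-tier summary count."""
    rows = []
    for name in instructions:
        # Anything missing from this toy map is reported as unsupported.
        tier, target = INSTRUCTION_MAP.get(name, ("unsupported", None))
        rows.append((tier, name, target))
    summary = Counter(tier for tier, _, _ in rows)
    return rows, summary


rows, summary = convert(["XIC", "LIM", "CPT", "UNKNOWN_INSTR", "OTE"])
for tier, source, target in rows:
    print(f"{tier:<12} {source:<14} {target or '-'}")
print(dict(summary))
```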
The Template Library
59 templates. 46 in Ladder Diagram, 12 in Structured Text, plus a UDFB (User-Defined Function Block). Every template has been hardware-verified on a physical Micro850 PLC. That means: generate the project, open it in CCW, compile it, download it to the controller, confirm it runs without faults.
The CEMS (Continuous Emissions Monitoring System) templates are the most complex. The full system template has six analyzer channels with analog scaling, range checking, calibration offsets, a state machine for zero/span calibration modes, data valid flags per channel, out-of-range alarms, sensor fault detection, and O2 correction factors. That is a real production CEMS in a markdown file.
Searchable Rockwell Documentation (11,900 Chunks)
Rockwell's documentation is scattered across hundreds of PDFs, HTML pages, and Excel files. ScanForge includes a RAG corpus of 11,900+ embedded document chunks built from Rockwell technical publications, hardware specs, and programming manuals using a six-format processing pipeline: HTML, PDF, DOCX, L5X ladder logic files, CCW project files, and Excel/CSV.
The pipeline (ragprep_production.py) processes each document, chunks it with overlap, embeds with Ollama's BGE-M3 model, and uploads to Qdrant. Corpus routing directs queries to the right collection based on the query type. A query like "how do I wire a 4-20mA sensor to a Micro850 analog expansion module" finds the relevant Rockwell documentation, not a random forum post.
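Chunking with overlap is the one step here that is easy to show in isolation. This is a minimal character-based sketch, not the code in ragprep_production.py. The function name and the default sizes are assumptions, and a real pipeline may well split on tokens or sentence boundaries instead.

```python
def chunk_with_overlap(text, size=800, overlap=100):
    """Split text into windows of `size` characters that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


print(chunk_with_overlap("abcdefghij", size=4, overlap=1))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, as long as it is shorter than the overlap.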
The Simulator
ScanForge includes a PLC scan cycle simulator (plc_sandbox). It executes ladder logic instructions tick by tick, tracks tag state changes, and produces trace tables showing exactly what happens at each scan. You can define input scenarios with timed events and watch the logic respond.
The simulator supports contacts, coils, timers (TON, TOF, RTO, TP), counters (CTU, CTD), math, comparisons, MOV, SCALER, COP, branching, subroutine calls, and string operations. It catches issues that static validation misses: a seal-in that never latches, a timer that never expires, a counter that overflows a 32-bit DINT.
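As a rough illustration of scan-by-scan execution, the sketch below hard-codes the seal-in rung from the spec example and replays a timed input scenario. It is not plc_sandbox. The scan and simulate functions and the scenario format are hypothetical.

```python
def scan(tags):
    """One scan of: XIC(StopCircuitOk) -> [XIC(StartPB), XIC(MotorRunning)] -> OTE(MotorRunning)."""
    tags["MotorRunning"] = tags["StopCircuitOk"] and (
        tags["StartPB"] or tags["MotorRunning"]
    )


def simulate(events, scans):
    """Apply scheduled input changes, run one scan, and record the tag state."""
    tags = {"StopCircuitOk": True, "StartPB": False, "MotorRunning": False}
    trace = []
    for n in range(scans):
        tags.update(events.get(n, {}))  # input changes scheduled for this scan
        scan(tags)
        trace.append((n, dict(tags)))
    return trace


events = {
    1: {"StartPB": True},         # operator presses Start
    2: {"StartPB": False},        # and releases it; the seal-in must hold
    4: {"StopCircuitOk": False},  # stop circuit opens
}
trace = simulate(events, scans=6)
for n, tags in trace:
    print(n, tags)
```

A seal-in that never latches would show up in the trace as MotorRunning dropping back to False on scan 2, the moment the pushbutton is released.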
CIP-to-OPC-UA Bridge
The bridge/ package connects to a running Micro850 over EtherNet/IP CIP (using pycomm3), discovers all tags, and exposes them as OPC-UA nodes. This lets AVEVA Edge and other OPC-UA clients read and write PLC tags without configuring individual communication channels.
The bridge runs a periodic scan loop with change-of-value filtering. Only tags that actually changed get written to OPC-UA nodes. It uses exponential backoff on connection loss (1s initial, 30s cap) and transitions to an error state after five consecutive failures. ScanForge can also generate the AVEVA Edge tag database CSV and OPC-UA communication config from the project's tag list.
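The reconnect policy and the change-of-value filter can be sketched without any PLC attached. The ReconnectPolicy class and changed_tags function are hypothetical names, and the doubling factor is an assumption: the post gives only the 1s start, the 30s cap, and the five-failure threshold.

```python
_MISSING = object()  # sentinel so a tag seen for the first time counts as changed


class ReconnectPolicy:
    """Exponential backoff with a cap and an error state after repeated failures."""

    def __init__(self, initial=1.0, cap=30.0, max_failures=5, factor=2.0):
        self.initial = initial
        self.cap = cap
        self.max_failures = max_failures
        self.factor = factor  # assumed multiplier between attempts
        self.failures = 0

    def record_failure(self):
        """Count a failed attempt and return the seconds to wait before the next."""
        self.failures += 1
        return min(self.initial * self.factor ** (self.failures - 1), self.cap)

    def record_success(self):
        self.failures = 0

    @property
    def state(self):
        if self.failures >= self.max_failures:
            return "error"
        return "reconnecting" if self.failures else "connected"


def changed_tags(previous, current):
    """Return only the tags whose value differs from the previous scan."""
    return {
        name: value
        for name, value in current.items()
        if previous.get(name, _MISSING) != value
    }


policy = ReconnectPolicy()
delays = [policy.record_failure() for _ in range(7)]
print(delays, policy.state)
print(changed_tags({"A": 1, "B": 2}, {"A": 1, "B": 3, "C": 0}))
```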
The Validation Stack
Nine automated checks run against every generated project:
- Missing seal-in detection (latched outputs without a seal-in rung)
- E-stop verification (safety-critical programs must have E-stop logic)
- Unused tag detection (tags declared but never referenced)
- Tag cross-reference analysis (read/write status per tag)
- Safety audit (NFPA 79 compliance checks)
- Spec validation (type mismatches, invalid I/O addresses, MOV-to-BOOL errors)
- Workaround audit (verifying ISaGRAF-specific workarounds are applied)
- Batch validation (run all checks across every project in a directory)
- Build error parsing (reads CCW/FTDW compiler output and diagnoses issues)
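Two of the checks above, unused tag detection and tag cross-reference, reduce to the same bookkeeping. The sketch below shows the idea on rung strings in the spec notation. It is not the validator's code: cross_reference is a hypothetical function, and treating only OTE as a write is a deliberate simplification.

```python
import re

# Simplification for the example: only OTE is treated as writing its operand.
WRITE_OPS = {"OTE"}


def cross_reference(declared, rungs):
    """Map each declared tag to the set of ways the rungs use it."""
    usage = {tag: set() for tag in declared}
    for rung in rungs:
        for mnemonic, operand in re.findall(r"(\w+)\((\w+)\)", rung):
            if operand in usage:
                usage[operand].add("write" if mnemonic in WRITE_OPS else "read")
    return usage


declared = ["StartPB", "StopPB", "MotorRun", "SpareInput"]
rungs = ["XIC(StopPB) -> [XIC(StartPB), XIC(MotorRun)] -> OTE(MotorRun)"]

usage = cross_reference(declared, rungs)
unused = [tag for tag, uses in usage.items() if not uses]
print(usage)
print("unused:", unused)
```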
The spec validator catches Micro850-specific pitfalls before the project is even generated. For example: the SCALER function block requires all parameters to be REAL type. Pass an integer literal like 0 instead of 0.0 and CCW throws a type mismatch. The validator catches this at spec time. The generator's _ensure_real_literal() function fixes it at generation time. Two layers of defense.
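The generation-time fix can be pictured as a small normalizer. This is a guess at the behavior, not the body of _ensure_real_literal(): the standalone ensure_real_literal below is a hypothetical stand-in that only handles plain decimal integers.

```python
import re

_INT_LITERAL = re.compile(r"^[+-]?\d+$")


def ensure_real_literal(operand):
    """Append '.0' to bare integer literals; leave tag names and REAL literals alone."""
    operand = operand.strip()
    return operand + ".0" if _INT_LITERAL.match(operand) else operand


for value in ["0", "100", "-5", "0.0", "RawInput"]:
    print(f"{value!r} -> {ensure_real_literal(value)!r}")
```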
Platform Constraints
The Micro850 runs ISaGRAF v5, an IEC 61131-3 subset from the early 2000s. It has quirks. Many quirks. I documented 20+ platform constraints in docs/micro850-constraints.md, each discovered by a failed compile or a runtime fault on real hardware.
Some examples:
- Timer .ET is TIME type, not DINT. You cannot compare it with GEQ. Workaround: heartbeat timer with a DINT counter.
- MOV does not auto-convert types. MOV(0, BoolTag) fails. Workaround: use OTR(BoolTag) to reset a BOOL.
- Function blocks must be registered in the Access database with specific RefDefType IDs. Miss one and CCW shows "unknown instruction."
- The CompactLogix LIM instruction does not exist on Micro850. Workaround: decompose to GEQ + LEQ in series.
- CompactLogix CPT (Compute) does not exist either. Workaround: decompose math expressions into individual ADD/SUB/MUL/DIV blocks using Python's ast module.
Every one of these has a corresponding workaround in the generator, a validator check to catch it, and a test to prevent regression. The workaround audit tool scans all templates to verify coverage.
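The CPT workaround is the easiest of these to show in a few lines. The sketch below uses Python's ast module to flatten an arithmetic expression into one block per operator. It illustrates the technique and is not the generator's implementation. The decompose function, the _tmpN naming, and the tuple layout are assumptions.

```python
import ast

_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}


def decompose(expression, dest):
    """Flatten an arithmetic expression into (block, in1, in2, out) steps."""
    steps = []

    def visit(node, out=None):
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return repr(float(node.value))  # emit numeric literals as REAL
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            left = visit(node.left)
            right = visit(node.right)
            target = out or f"_tmp{len(steps)}"  # intermediate result tag
            steps.append((_OPS[type(node.op)], left, right, target))
            return target
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    visit(ast.parse(expression, mode="eval").body, out=dest)
    return steps


for step in decompose("(A + B) * C / 2", dest="Result"):
    print(step)
```

Letting the Python parser handle precedence and parentheses means the decomposition only has to walk the tree in post-order. Each operator becomes one block, and only the outermost one writes to the destination tag.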
Architecture
The MCP server is the integration layer. Any MCP-compatible client spawns it at session start via .mcp.json. No manual startup. Every tool call returns structured data, not text, so the client can chain 14 steps without parsing CLI output. The CLI tools also work standalone without any MCP client. Hot reload watches for file changes and purges stale modules between tool calls, so I can edit generator code without restarting the session.
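The benefit of structured results is easier to see from the client's side. The sketch below chains stub tools that return dictionaries and stops at the first failure. The tool names are borrowed from the pipeline listing, but the stub bodies, the result fields, and the run_pipeline helper are all invented for illustration and say nothing about the real return schemas.

```python
def run_pipeline(steps, context):
    """Run tools in order; each returns a dict with 'ok' plus data for later steps."""
    for name, tool in steps:
        result = tool(context)
        context[name] = result
        if not result.get("ok"):
            return {"ok": False, "failed_step": name, "context": context}
    return {"ok": True, "failed_step": None, "context": context}


# Hypothetical stand-ins for the real tools.
def save_spec(ctx):
    return {"ok": True, "path": "motor_start_stop.md"}


def validate_spec(ctx):
    issues = [] if ctx.get("has_estop") else ["missing E-stop logic"]
    return {"ok": not issues, "issues": issues}


def plc_generate(ctx):
    return {"ok": True, "spec": ctx["save_spec"]["path"]}


steps = [
    ("save_spec", save_spec),
    ("validate_spec", validate_spec),
    ("plc_generate", plc_generate),
]

failed = run_pipeline(steps, {"has_estop": False})
print(failed["failed_step"], failed["context"]["validate_spec"]["issues"])

passed = run_pipeline(steps, {"has_estop": True})
print(passed["ok"], passed["context"]["plc_generate"])
```

Because the failure comes back as data rather than as text to be parsed, the caller can read the issues list, fix the spec, and re-run from the step that failed.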
Test Coverage
5,838 tests across 72 test files. Property-based tests with Hypothesis for the spec parser. Golden file tests comparing generated output against known-good CCW projects. Round-trip tests that generate, parse, and re-generate to verify stability. Template round-trip tests that run every template through the full pipeline. Simulation scenario tests that verify timer behavior, counter overflow, and state machine transitions.
The test suite is the safety net for a codebase that generates files for safety-critical industrial equipment. Every ISaGRAF quirk I've discovered gets a regression test. Every new instruction gets golden file assertions. Every template gets a round-trip validation that confirms CCW will accept the output.
What I Learned
MCP tools are the right abstraction for this kind of work. The alternative was a monolithic CLI that takes 30 flags or a web UI that tries to be everything. MCP lets the client pick the right tool for each step, chain them, and recover from failures mid-pipeline. When the validator finds a missing seal-in, the client can fix the spec and re-run the generator without starting over.
Hardware verification matters more than unit tests. I had a test suite passing at 100% before I ever downloaded a project to a real Micro850. The first attempt failed with eight compiler errors. Every one was a platform-specific type system constraint that no amount of unit testing would catch. Now the workflow is: generate, compile in CCW, download to hardware, confirm it runs. The test suite encodes the results of that hardware testing, but it does not replace it.
Industrial automation has no equivalent to what web developers take for granted. There is no "create-react-app" for PLC programs. No scaffolding tool, no project generator, no programmatic access to the file formats at all. Every other engineering discipline has moved toward code-first workflows. PLCs are still stuck in point-and-click IDEs from the 2000s. ScanForge is the only tool I'm aware of that bridges that gap for Allen-Bradley platforms.
The domain knowledge is deep but finite. There are only so many ways to wire a motor start/stop circuit. The hard part is not the logic. The hard part is producing files in the exact format that a proprietary IDE expects, working around platform limitations that are not documented anywhere, and doing it reliably enough that someone would trust the output near physical machinery.
ScanForge runs on Windows (full features, including CCW project generation) and on macOS and Linux (everything except CCW generation, which requires the Microsoft Access ODBC driver). The Docker image includes the REST API, MCP server, and all tools.