What looks like a folder is actually a factory.

ItemBridge is an automated pipeline that turns Excel item-request files into live items in our ERP, with no manual data entry. It watches a folder, cleans and validates incoming data, calls the ERP, drives the browser for follow-up tasks the API doesn't expose, and cross-checks every item with an independent AI classifier.

From the outside, it's a single instruction: drop an Excel file in a folder, items appear in the ERP. Underneath, every drop kicks off ~25 coordinated steps across seven external systems, seven concurrent background workers, and three layers of failure recovery. This page walks through the layers, simple to complex.

flowchart LR
    A[File dropped<br/>in watch folder] --> B[Processed]
    B --> C[Items live<br/>in the ERP]
    classDef simple fill:#1e1b4b,stroke:#a78bfa,stroke-width:2px,color:#f1f5f9;
    class A,B,C simple;
The user's mental model. Everything that follows is what "Processed" actually means.

02 What's actually happening

The headline workflow. A single Excel item-request file moves through detection, cleaning, matching, template generation, ERP import, browser-driven follow-up, and a background AI cross-check before being archived. Eight phases, dozens of decisions, all invisible to the operator.

flowchart TB
    Start([User drops itemrequest.xlsx])
    subgraph Detect["1 - Detect & Lock"]
        direction TB
        D1[Wait for file to stop changing]
        D2[Classify file type]
        D3[Backup original]
        D1 --> D2 --> D3
    end
    subgraph Clean["2 - Clean & Validate"]
        direction TB
        C1[Strip HTML & problem characters<br/>preserve inch and foot marks]
        C2[Normalize Product IDs<br/>strip CUT REEL suffixes]
        C3[Enforce 38-char Product ID limit]
        C4[Format-validate vendor codes<br/>+ duplicate detection]
        C5[Build RFI cable descriptions]
        C1 --> C2 --> C3 --> C4 --> C5
    end
    subgraph Match["3 - Match"]
        direction TB
        M1[Manufacturer fuzzy match<br/>cache then ERP then auto-create]
        M2[Item-group resolution]
        M1 --> M2
    end
    subgraph Generate["4 - Generate"]
        G1[Append rows to ERP template<br/>preserve formulas & formatting]
    end
    subgraph API["5 - Import"]
        direction TB
        A1[Acquire concurrency permit]
        A2[Create item in ERP]
        A3{Success?}
        A4[Increment success window]
        A5[Honor rate-limit cool-off<br/>halve permit ceiling]
        A1 --> A2 --> A3
        A3 -- yes --> A4
        A3 -- 429 / 5xx --> A5
        A5 -.feedback.-> A1
    end
    subgraph Browser["6 - Automated Follow-up"]
        direction TB
        B1[Cost calc on ERP web UI]
        B2[Mark request complete<br/>in SharePoint]
        B1 --> B2
    end
    subgraph Async["7 - AI Cross-Check"]
        direction TB
        L1[Pure extractor]
        L2[Blind AI classifier<br/>per item, no requestor hint]
        L3[Audit log + flag mismatches]
        L1 --> L2 --> L3
    end
    subgraph Done["8 - Archive"]
        Z1[Move to backup<br/>auto-cleanup after 24h]
    end
    Start --> Detect --> Clean --> Match --> Generate --> API
    A4 --> Browser
    A4 -.async.-> Async
    Browser --> Done
    classDef extSys fill:#2a1a3d,stroke:#f472b6,stroke-width:1.5px,color:#fce7f3;
    classDef async stroke-dasharray: 5 3,fill:#0e2a3d,stroke:#22d3ee,color:#cffafe;
    classDef decision fill:#1e1b4b,stroke:#a78bfa,stroke-width:2px,color:#ede9fe;
    class A2,B1,B2,L2 extSys;
    class Async,L1,L2,L3 async;
    class A3 decision;
The full lifecycle of one Excel drop. Pink nodes are external systems (ERP, web UI, SharePoint, AI). The dashed cyan AI cross-check is asynchronous: it runs in the background after the operator has already seen success.
Cleaning
~35 distinct rules normalize HTML, special characters, RFI cable specs, duplicates, and Product ID format. Bad rows surface for review instead of corrupting the import.
Manufacturer matching
Cascades through local cache, ERP lookup, and fuzzy similarity. Unknown vendors get auto-generated IDs and are remembered for next time.
Adaptive parallelism
Self-tuning: widens API throughput on success, halves it the moment the server pushes back, then probes upward periodically.
Circuit breaker
Trips after repeated failures, pauses traffic for a cool-off, then probes recovery before resuming rather than hammering a degraded service.
Browser follow-up
Cost-calc and SharePoint mark-complete run automatically after each batch, serialized so they happen in order with no operator intervention.
AI cross-check
An independent classifier sees only the product details, never the requestor's pick. Disagreements are flagged for human review.
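Most of these behaviors reduce to small, testable functions. As one sketch, here is the Product ID normalization described in phase 2; the function name and regexes are illustrative, and only the 38-character cap comes from the pipeline description:

```python
import re

PRODUCT_ID_LIMIT = 38  # hard cap enforced by the import phase


def clean_product_id(raw: str) -> str:
    """Normalize a Product ID: strip HTML remnants, drop
    CUT/REEL suffixes, and enforce the 38-character limit.
    (Illustrative sketch; rule details are assumptions.)"""
    text = re.sub(r"<[^>]+>", "", raw).strip()            # strip HTML tags
    text = re.sub(r"\s*\b(CUT|REEL)\b\s*$", "", text, flags=re.I)
    if len(text) > PRODUCT_ID_LIMIT:
        raise ValueError(f"Product ID exceeds {PRODUCT_ID_LIMIT} chars: {text!r}")
    return text
```

The point of keeping each rule this small is that bad rows can be caught and surfaced individually instead of failing the whole import.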

03 In the background

While the headline flow runs, six more workers operate alongside the file watcher that feeds it. Each was split out for a specific reason: keep the watcher fast, keep the UI responsive, keep conversations with external systems tunable.

flowchart LR
    Main([System core])
    Main --> W1[File watcher<br/>routing hub]
    Main --> W2[Browser orchestrator<br/>async event loop]
    Main --> W3[AI matching worker<br/>per-batch classifier]
    Main --> W4[OneDrive tracker<br/>polls sync folder]
    Main --> W5[OneDrive extractor<br/>caches xlsx data]
    Main --> W6[Adaptive parallelism<br/>state sync + probe]
    Main --> W7[Stats sync<br/>local to cloud]
    classDef worker fill:#0e2a3d,stroke:#22d3ee,stroke-width:1.5px,color:#cffafe;
    classDef ent fill:#1e1b4b,stroke:#a78bfa,stroke-width:2px,color:#ede9fe;
    class Main ent;
    class W1,W2,W3,W4,W5,W6,W7 worker;
Each worker runs independently. The file watcher publishes events; everyone else listens or polls.
File watcher
Tails the watch folder, waits for files to stop changing, classifies them, routes them to the right pipeline. The single entry point for new work.
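The "wait for files to stop changing" step is a classic stability check. A minimal sketch, assuming simple polling on size and mtime (the quiet period, poll interval, and timeout are illustrative choices, not the system's actual values):

```python
import os
import time


def wait_until_stable(path: str, quiet_seconds: float = 2.0,
                      poll: float = 0.5, timeout: float = 60.0) -> bool:
    """Return True once the file's size and mtime have been unchanged
    for `quiet_seconds`; False if the timeout expires first."""
    deadline = time.monotonic() + timeout
    last = None
    stable_since = time.monotonic()
    while time.monotonic() < deadline:
        st = os.stat(path)
        sig = (st.st_size, st.st_mtime)
        if sig != last:                 # file still growing: restart the clock
            last = sig
            stable_since = time.monotonic()
        elif time.monotonic() - stable_since >= quiet_seconds:
            return True                 # quiet long enough: safe to open
        time.sleep(poll)
    return False
```

This matters because Excel and OneDrive both write files incrementally; opening too early reads a half-written workbook.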
Browser orchestrator
Drives a real browser to handle the steps the API doesn't expose. Serializes cost-calc, manufacturer-import, and SharePoint tasks so they happen in order, hands-off after the operator's drop.
AI matching worker
After each successful batch, sends every item to a blind classifier (description + manufacturer + product ID, no requestor pick). Writes an audit log. Disagreements surface for review.
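The blindness is the whole trick: the classifier must never see the requestor's pick. A sketch of how the payload might be assembled (field names here are hypothetical, not the system's real schema):

```python
def build_blind_payload(item: dict) -> dict:
    """Send only neutral product facts to the classifier.
    The requestor's chosen item group is deliberately omitted
    so the AI's answer is independent."""
    return {
        "description": item["description"],
        "manufacturer": item["manufacturer"],
        "product_id": item["product_id"],
        # item["requested_group"] intentionally NOT included
    }


def flag_mismatch(item: dict, ai_group: str) -> bool:
    """True when the blind classifier disagrees with the requestor."""
    return ai_group != item["requested_group"]
```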
OneDrive tracker
Passively monitors a OneDrive sync folder for new requests. Doesn't auto-process. The operator reviews in SharePoint first, then batch-selects what to run.
OneDrive extractor
Pre-opens detected files and caches the items. By the time the operator hits "process," the work has already been parsed.
Adaptive parallelism
Periodically probes the safe ceiling for parallel API calls. Syncs the agreed limit through cloud storage so multiple operator machines converge.
Stats sync
Persists per-machine and aggregate processing stats so the live dashboard, the web dashboard, and other operators all see the same totals.

04 Four lanes, one folder

The headline flow is one of four distinct pipelines. Each new file is inspected using multiple signals (named ranges, embedded session IDs, filename patterns) to decide where it goes, and each lane has different downstream consequences.

flowchart LR
    F[New file<br/>in watch folder] --> R{Router}
    R -->|item request| P1[Item Request]
    R -->|item master export| P2[Item Master]
    R -->|manufacturer master export| P3[Manufacturer Master]
    R -->|manufacturer lookup| P4[Manufacturer Lookup]
    P1 --> O1[Full pipeline<br/>+ Cost calc<br/>+ SharePoint mark<br/>+ AI check]
    P2 --> O2[Status check<br/>+ auto-cleanup]
    P3 --> O3[Build template<br/>+ Browser import]
    P4 --> O4[Convert to YAML<br/>+ Upload to cloud cache]
    classDef route fill:#1e1b4b,stroke:#a78bfa,stroke-width:2px,color:#ede9fe;
    classDef out fill:#0e2a3d,stroke:#22d3ee,color:#cffafe;
    class R route;
    class O1,O2,O3,O4 out;
One folder, four distinct outcomes. Adding a new file type is a contained change. The watcher itself doesn't move.
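The router itself can stay tiny. A sketch of the dispatch, with a plain dict standing in for the signals the real router extracts from the workbook; the key names and filename patterns are illustrative assumptions:

```python
def route(filename: str, signals: dict) -> str:
    """Pick one of the four lanes from inspection signals.
    `signals` stands in for what the real router derives from
    named ranges, embedded session IDs, and filename patterns."""
    if signals.get("has_request_session_id"):
        return "item_request"          # full pipeline + follow-up + AI check
    if "item_master" in filename.lower():
        return "item_master"           # status check + auto-cleanup
    if "manufacturer_master" in filename.lower():
        return "manufacturer_master"   # build template + browser import
    if signals.get("has_lookup_named_range"):
        return "manufacturer_lookup"   # convert to YAML + cloud cache
    return "unrecognized"              # surfaced, never silently dropped
```

Because every lane is just another return value here, adding a fifth file type means one new branch, not a new watcher.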

A fifth path, the review-then-process flow, layers on top: files appearing in the OneDrive folder are passively cached, the operator reviews them in SharePoint, then batch-selects in the live dashboard to run them through the Item Request pipeline.

05 When things go wrong

The ERP has bad days. Rather than fail the operator's batch, three resilience layers absorb transient pain: a circuit breaker, an offline queue, and an adaptive parallelism loop. Together they let the system ride through outages without operator intervention.

Circuit breaker

stateDiagram-v2
    direction LR
    [*] --> CLOSED
    CLOSED --> OPEN: 5 consecutive failures
    OPEN --> HALF_OPEN: 60s timeout
    HALF_OPEN --> CLOSED: 3 successes
    HALF_OPEN --> OPEN: any failure
When the ERP starts failing repeatedly, the breaker stops sending traffic for 60 seconds rather than hammering a degraded service. After the cool-off, three test calls decide whether to resume.
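The state machine translates almost line-for-line into code. A minimal sketch using the thresholds from the diagram (5 failures to trip, 60 s cool-off, 3 probe successes to close); the class and method names are illustrative:

```python
import time


class CircuitBreaker:
    FAILURES_TO_TRIP = 5      # CLOSED -> OPEN
    COOL_OFF_SECONDS = 60     # OPEN -> HALF_OPEN
    SUCCESSES_TO_CLOSE = 3    # HALF_OPEN -> CLOSED

    def __init__(self):
        self.state = "CLOSED"
        self.failures = 0
        self.probe_successes = 0
        self.opened_at = 0.0

    def allow(self) -> bool:
        """May a call go out right now?"""
        if self.state == "OPEN":
            if time.monotonic() - self.opened_at >= self.COOL_OFF_SECONDS:
                self.state = "HALF_OPEN"   # cool-off over: let probe calls through
                self.probe_successes = 0
            else:
                return False               # still cooling off: shed the call
        return True

    def record_success(self):
        if self.state == "HALF_OPEN":
            self.probe_successes += 1
            if self.probe_successes >= self.SUCCESSES_TO_CLOSE:
                self.state = "CLOSED"      # service recovered
                self.failures = 0
        else:
            self.failures = 0              # any success resets the streak

    def record_failure(self):
        if self.state == "HALF_OPEN":
            self._trip()                   # any failure while probing re-opens
            return
        self.failures += 1
        if self.failures >= self.FAILURES_TO_TRIP:
            self._trip()

    def _trip(self):
        self.state = "OPEN"
        self.failures = 0
        self.opened_at = time.monotonic()
```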

Offline queue

flowchart LR
    A[Outbound call] -->|success| OK[Done]
    A -->|fails<br/>e.g. network down| Q[Disk-backed queue]
    Q --> R[Periodic retry<br/>up to 3x with backoff]
    R -->|success| OK
    R -->|still failing| Q
    classDef done fill:#052e29,stroke:#34d399,color:#d1fae5;
    classDef bad fill:#3b0a0a,stroke:#f87171,color:#fee2e2;
    class OK done;
    class Q bad;
Operations that can't reach external systems are persisted to disk and replayed when connectivity returns. No work is lost to a flaky network.
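A minimal version of such a disk-backed queue, using a JSON-lines file as the persistence layer. The 3-attempt cap is from the diagram; the file format and function names are illustrative, and the backoff between replay passes is left to the caller's timer:

```python
import json
import os

MAX_ATTEMPTS = 3  # give up after three tries (per the diagram)


def enqueue(queue_path: str, operation: dict) -> None:
    """Persist a failed operation so it survives a process restart."""
    with open(queue_path, "a", encoding="utf-8") as f:
        f.write(json.dumps({**operation, "attempts": 0}) + "\n")


def replay(queue_path: str, send) -> int:
    """Retry every queued operation once; rewrite the queue with survivors.
    Returns how many operations succeeded this pass."""
    if not os.path.exists(queue_path):
        return 0
    with open(queue_path, encoding="utf-8") as f:
        pending = [json.loads(line) for line in f if line.strip()]
    succeeded, survivors = 0, []
    for op in pending:
        try:
            send(op)
            succeeded += 1
        except Exception:
            op["attempts"] += 1
            if op["attempts"] < MAX_ATTEMPTS:
                survivors.append(op)       # keep for the next pass
    with open(queue_path, "w", encoding="utf-8") as f:
        for op in survivors:
            f.write(json.dumps(op) + "\n")
    return succeeded
```

Because each entry lives on disk between passes, a crash or reboot mid-outage loses nothing.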

Adaptive parallelism

flowchart LR
    Win[20 successes<br/>in window] -->|+1 permit| Limit[Permit limit]
    Limit --> Calls[Parallel calls]
    Calls -->|429 or reset| MD[halve it<br/>+ cool-off]
    MD --> Limit
    Probe[Periodic upward probe] --> Limit
    Limit <-->|merge across operators| S3[(Shared state)]
    classDef cloud fill:#2a1a3d,stroke:#f472b6,color:#fce7f3;
    class S3 cloud;
Walks parallelism up gently on success, halves it the moment the server pushes back. Operators on different machines converge on the same safe ceiling through shared state.
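This is additive-increase / multiplicative-decrease, the same shape TCP congestion control uses. A sketch with the diagram's numbers (20-success window, halving on pushback); the starting value, floor, and ceiling are illustrative assumptions:

```python
class AdaptiveLimit:
    WINDOW = 20       # clean calls needed to earn one more permit
    FLOOR = 1         # never drop below one permit (assumption)
    CEILING = 100     # illustrative upper bound (assumption)

    def __init__(self, start: int = 4):
        self.limit = start
        self.successes = 0

    def on_success(self) -> None:
        self.successes += 1
        if self.successes >= self.WINDOW:
            self.successes = 0
            self.limit = min(self.limit + 1, self.CEILING)  # additive increase

    def on_pushback(self) -> None:
        """Server said 429 or reset the connection: back off hard."""
        self.successes = 0
        self.limit = max(self.limit // 2, self.FLOOR)       # multiplicative decrease
```

The diagram's periodic upward probe is just an optimistic increment on a timer, driven by the same `on_success` bookkeeping.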

06 The systems we touch

Seven external systems. Each is a separate authentication, a separate failure mode, and a separate vocabulary the system translates between in real time.

flowchart LR
    Center((ItemBridge))
    Center --- I1[ERP API<br/>item / manufacturer create + lookup]
    Center --- I2[ERP web UI<br/>cost calc, manufacturer import]
    Center --- I4[SharePoint<br/>mark-complete on requests]
    Center --- I5[Cloud storage<br/>cache, stats, shared state]
    Center --- I6[AI provider<br/>blind item-group classifier]
    Center --- I7[OneDrive<br/>review-then-process source]
    Center --- I8[Version control<br/>update banner]
    classDef center fill:#1e1b4b,stroke:#a78bfa,color:#fff,stroke-width:2.5px;
    classDef ext fill:#2a1a3d,stroke:#f472b6,color:#fce7f3;
    class Center center;
    class I1,I2,I4,I5,I6,I7,I8 ext;
Every external system has its own client, its own retry behavior, and its own data model. The system is essentially a translator fluent in seven dialects.

07 Across operators

Multiple operators on different machines run the system at the same time. Rather than conflict, they share state through cloud storage so decisions made on one machine carry over to the next.

flowchart TB
    A[Operator A<br/>laptop] <-->|read / merge| S3[(Shared state)]
    B[Operator B<br/>laptop] <-->|read / merge| S3
    C[Operator C<br/>laptop] <-->|read / merge| S3
    S3 --- D1[Manufacturer cache]
    S3 --- D2[Per-machine + aggregate stats]
    S3 --- D3[Adaptive parallelism limit]
    classDef op fill:#1e1b4b,stroke:#a78bfa,color:#ede9fe;
    classDef cloud fill:#2a1a3d,stroke:#f472b6,color:#fce7f3;
    class A,B,C op;
    class S3 cloud;
If Operator A learns the API's safe parallelism is 80, Operator B doesn't have to discover it the hard way. The limit is merged through shared state.
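One simple way such a merge can work, sketched here as an assumption (the record shape and the last-write-wins policy are hypothetical, not confirmed by the source): each machine writes timestamped observations, and readers keep the freshest value per key.

```python
def merge_shared_state(entries: list[dict]) -> dict:
    """Last-write-wins merge of shared-state records.
    Each entry is assumed to look like
    {"key": str, "value": any, "ts": float} (hypothetical shape)."""
    merged: dict = {}
    latest_ts: dict = {}
    for e in entries:
        k = e["key"]
        if k not in merged or e["ts"] > latest_ts[k]:
            merged[k] = e["value"]     # newer observation wins
            latest_ts[k] = e["ts"]
    return merged
```

Under this policy, Operator A's freshly discovered limit of 80 is simply the newest record, so Operator B reads it on the next sync.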

08 What the operator sees

Eight live monitoring screens plus a separate web dashboard. Each one is a focused tool for one kind of decision, not a generic log viewer.

Dashboard (live)
Live KPIs with today / 7-day / 30-day toggle.
Action log (live)
Live filtered log output.
API failures (live)
Per-call failure detail with retry context.
Files history (live)
Session-scoped processed-file list.
OneDrive Pending (live)
Review-then-process queue with batch selection.
OneDrive Setup (live)
Folder picker / first-run setup.
AI Mismatches (live)
Independent classifier disagreements for human review.
SharePoint failures (live)
Mark-complete failures with retry / dismiss.
Web Dashboard
Org-wide stats with charts, trends, and drill-downs.

09 By the numbers

A quantitative read on the system. All counts taken directly from the codebase.

7 external systems integrated
7 concurrent background workers
4 distinct file-type pipelines
8 operator screens, plus 1 web dashboard
~35 cleaning & validation rules
3 resilience layers: circuit breaker, offline queue, adaptive parallelism
Automated tests across 32 test files

10 From drop to done

A single Excel file lands in a folder. By the time the operator sees the green checkmark, the system has classified it, cleaned ~35 kinds of bad data, fuzzy-matched manufacturers against a local cache and the ERP, called the ERP under a self-tuning concurrency permit, driven a real browser through the steps the API doesn't expose, marked the request complete in SharePoint, and asked an AI to second-guess the classification, all while writing stats to shared storage so the next operator's work picks up where this one left off.