Designing a community health AI system for rural Bangladesh: a technical architecture study

Context: a design engagement with a health NGO

In 2024 I ran a technical architecture and feasibility study for a health NGO working across six rural districts in Bangladesh. The organization had a network of roughly 1,200 Community Health Workers (CHWs) doing maternal health, childhood illness, and nutrition monitoring across a combined population of about 1.8 million people.

The challenge they brought me was specific. They'd run a paper-based data collection and decision support system for 15 years. It worked. CHWs were trained on it, supervisors trusted it, and the data it produced fed district health planning. But the paper system was slow, couldn't support real-time analytics, and was producing more data quality problems as their geographic footprint grew.

They wanted to know whether an AI-assisted digital system could replace or augment the paper system without breaking the CHW workflows that had taken years to settle.

This article documents the architecture I designed, the tradeoffs I made, and the lessons I took away.

The problem framing

Before designing anything, I spent two weeks in the field: shadowing CHW home visits, sitting in on supervisor meetings, reading the paper data collection forms. This wasn't optional. A system designed without understanding the real workflow fails at deployment no matter how good the technology is.

Here is what I saw.

CHW literacy and device proficiency vary a lot. Senior CHWs with 10-plus years of experience were fluent with the paper system but had little smartphone exposure. Newer CHWs were comfortable with smartphones but less confident with the clinical protocols. The system had to work for both.

Connectivity is genuinely unreliable, not just occasionally poor. In three of the six districts, connectivity dropped to zero for days at a stretch during monsoon season. Any digital system that needed connectivity to do its core job would fail at exactly the moments health risks peaked, because floods and cyclones create acute health burdens.

The paper system had safety nets that digital replacements tend to strip out. The CHW's paper register did several jobs at once: patient record, visit prompt ("I haven't been to this household in 6 weeks"), supervisor audit trail, and CHW accountability mechanism. A digital replacement had to keep all of those, not just the data collection one.

The district health office needed aggregate analytics, not just digitized records. The NGO's district health officers had been doing manual aggregation, weekly tallies from paper forms, and it ate 8 to 10 hours a week per officer. That was the sharpest pain for the management layer, even though it wasn't the CHW's main pain point.

The architecture design

Design principle 1: offline-first, always

Every core CHW workflow runs entirely on-device. The application carries a local SQLite database with the CHW's full household register, the clinical protocol decision trees, and a 90-day history of visit records.

Connectivity is used for four things and nothing else: syncing new visit records to the central database when a connection is available, pulling updated clinical protocols as a background download, escalating complex cases to a clinician reviewer, and generating supervisor reports.

If connectivity is gone for days, the CHW keeps working normally. When it comes back, sync runs automatically with conflict resolution.
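A minimal sketch of what the local store and sync step could look like, using Python's built-in sqlite3 module. The table layout, the function names, and the last-write-wins rule are my illustrative assumptions; the study does not fix a conflict resolution strategy, and clinical records may well need something stricter than a timestamp comparison. The `server` dict is a stand-in for the central database.

```python
import sqlite3
import time
import uuid


def open_local_store(path=":memory:"):
    """Open the on-device store (hypothetical schema, for illustration only)."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS visits (
               visit_id     TEXT PRIMARY KEY,
               household_id TEXT NOT NULL,
               payload      TEXT NOT NULL,
               updated_at   REAL NOT NULL,
               synced       INTEGER NOT NULL DEFAULT 0)"""
    )
    return db


def save_visit(db, household_id, payload, visit_id=None, now=None):
    """Write a visit locally. No network access is needed at this point."""
    visit_id = visit_id or str(uuid.uuid4())
    now = time.time() if now is None else now
    db.execute(
        "INSERT OR REPLACE INTO visits VALUES (?, ?, ?, ?, 0)",
        (visit_id, household_id, payload, now),
    )
    db.commit()
    return visit_id


def sync_pending(db, server):
    """Push unsynced rows to `server`, a dict of visit_id -> (payload, updated_at).

    Conflict rule (an assumption): the copy with the newer timestamp wins.
    Returns the number of rows pushed to the server.
    """
    rows = db.execute(
        "SELECT visit_id, payload, updated_at FROM visits WHERE synced = 0"
    ).fetchall()
    pushed = 0
    for visit_id, payload, updated_at in rows:
        remote = server.get(visit_id)
        if remote is None or remote[1] < updated_at:
            server[visit_id] = (payload, updated_at)
            pushed += 1
        else:
            # The server copy is newer, so adopt it locally.
            db.execute(
                "UPDATE visits SET payload = ?, updated_at = ? WHERE visit_id = ?",
                (remote[0], remote[1], visit_id),
            )
        db.execute("UPDATE visits SET synced = 1 WHERE visit_id = ?", (visit_id,))
    db.commit()
    return pushed
```

The point of the shape is that `save_visit` never touches the network, so a multi-day outage only lengthens the queue that `sync_pending` drains later.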

Design principle 2: voice input for low-literacy users

The input interface is built around voice-first interaction in Bengali, with touch as a fallback. CHWs speak their observations into the device, and the app uses a Bengali ASR model to transcribe and structure the input against the relevant protocol fields.

For feature phone users, roughly 15% of the CHW population in this deployment, a USSD interface handles basic data submission and emergency escalation with no smartphone at all.

CHW Visit Workflow:
                                    
  1. Open household record          
     [Touch or voice: "Open Fatema's record"]
           |                        
           v                        
  2. Select visit type              
     [ANC / Under-5 / Nutrition / Follow-up]
           |                        
           v                        
  3. Voice-guided data entry        
     "What is the child's temperature?"
     CHW speaks answer -> ASR -> structured field
           |                        
           v                        
  4. On-device protocol check       
     [Lightweight decision tree - offline]
           |                        
           v                        
  5. Risk classification            
     [Low / Moderate / High / Emergency]
           |                        
     High/Emergency: alert + escalation prompt
     Low/Moderate: standard recommendation
           |                        
           v                        
  6. Record saved locally           
     [Sync when connectivity available]
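
Step 3 of the workflow turns a spoken answer into a structured field. A rough sketch of that mapping for the temperature question follows. It assumes the ASR model returns a text transcript that may contain Bengali or ASCII digits, and that temperatures are recorded in Celsius; the function name and the plausibility bounds are hypothetical and only there to catch obvious transcription errors.

```python
import re

# Map Bengali digits to ASCII so one numeric pattern covers both scripts.
BENGALI_DIGITS = str.maketrans("০১২৩৪৫৬৭৮৯", "0123456789")

# Hypothetical sanity bounds in degrees Celsius, not clinical thresholds.
PLAUSIBLE_TEMP_C = (30.0, 45.0)


def parse_temperature(transcript):
    """Return the first plausible temperature in the transcript, or None.

    None means the app should re-prompt or fall back to touch input.
    """
    text = transcript.translate(BENGALI_DIGITS)
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        return None
    value = float(match.group())
    low, high = PLAUSIBLE_TEMP_C
    if not (low <= value <= high):
        return None
    return value
```

Returning None instead of guessing keeps a misheard value out of the record and hands control back to the touch fallback.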

Design principle 3: two-tier AI architecture

Not every clinical decision needs the same inference complexity or cost.

Tier 1 is an on-device lightweight model. It handles routine triage, things like fever classification, diarrhea severity scoring, and antenatal care (ANC) risk screening, using a quantized decision tree that runs on a 2GB RAM Android phone with no connection. It's calibrated against IMCI (Integrated Management of Childhood Illness) and Bangladesh ANC protocols. It's explainable by design, so a CHW can see and check the logic.
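
To make "explainable by design" concrete, here is a small sketch of a hand-written decision tree that returns the path it took along with the risk class. The thresholds, the input fields, and the function name are placeholders I made up for illustration. They are not the IMCI or Bangladesh protocol values, which would have to come from the validated protocols themselves.

```python
from dataclasses import dataclass, field

# Placeholder thresholds for illustration only, NOT clinical guidance.
FEVER_THRESHOLD_C = 37.5
HIGH_FEVER_C = 39.0
PROLONGED_DAYS = 7


@dataclass
class Assessment:
    risk: str                                  # Low / Moderate / High / Emergency
    trace: list = field(default_factory=list)  # human-readable decision path


def classify_fever(temp_c, has_danger_sign, fever_days):
    """Walk a tiny rule tree and record every branch taken."""
    trace = []
    if has_danger_sign:
        trace.append("danger sign present -> Emergency")
        return Assessment("Emergency", trace)
    trace.append("no danger sign")

    if temp_c < FEVER_THRESHOLD_C:
        trace.append(f"temperature {temp_c} below {FEVER_THRESHOLD_C} -> Low")
        return Assessment("Low", trace)
    trace.append(f"temperature {temp_c} at or above {FEVER_THRESHOLD_C}")

    if temp_c >= HIGH_FEVER_C or fever_days >= PROLONGED_DAYS:
        trace.append("high temperature or prolonged fever -> High")
        return Assessment("High", trace)

    trace.append("fever without high-risk features -> Moderate")
    return Assessment("Moderate", trace)
```

The `trace` list is what the CHW or a supervisor would see on screen: every branch that led to the classification, in order.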

Tier 2 is cloud-based clinical reasoning. It handles complex cases escalated from Tier 1, or anything that trips a high-risk flag. This tier uses a larger language model with clinical knowledge fine-tuning, and it's reachable only when there's connectivity. It produces structured recommendations with citations to the relevant protocol, which a clinician supervisor can review.

The design deliberately keeps Tier 2 away from routine decisions, for cost and for trust. A system that gives a CHW an unexplainable recommendation on a routine case breeds uncertainty and chips away at confidence.
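
The routing rule between the two tiers can be sketched in a few lines. The function name and return labels are hypothetical, and the queueing behavior when the device is offline is my assumption; the design above only says that Tier 2 is reachable when there is connectivity.

```python
HIGH_RISK = {"High", "Emergency"}


def route_case(tier1_risk, tier1_escalated, online):
    """Decide where a case goes after the on-device Tier 1 check.

    Routine cases never reach Tier 2. High-risk or escalated cases go to
    Tier 2 when a connection exists and are queued otherwise (an assumption).
    """
    needs_tier2 = tier1_escalated or tier1_risk in HIGH_RISK
    if not needs_tier2:
        return "tier1_recommendation"
    if online:
        return "tier2_review"
    return "queue_for_tier2"
```

In every branch the CHW still gets the on-device alert and escalation prompt from the visit workflow, so a dropped connection delays the Tier 2 review without silencing the warning.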

Design principle 4: district analytics pipeline

graph LR
    subgraph CHW_LAYER["CHW Devices - 1,200 workers"]
        D1[Device 1]
        D2[Device 2]
        D3[Device N]
    end

    subgraph SYNC["Sync Layer"]
        S[Regional Sync Server]
        D1 -->|Encrypted batch sync| S
        D2 -->|Encrypted batch sync| S
        D3 -->|Encrypted batch sync| S
    end

    subgraph ANALYTICS["District Analytics"]
        S --> ANON[Anonymization Layer - PII removed]
        ANON --> DW[District Data Warehouse]
        DW --> DASH[District Health Dashboard]
        DW --> OUTBREAK[Outbreak Detection Engine]
        DW --> REPORT[Automated Weekly Reports]
        OUTBREAK --> ALERT[District Health Officer Alert]
    end

    subgraph QUALITY["Quality Assurance"]
        DW --> SUPER[Supervisor Visit Audit]
        SUPER --> CHW_PERF[CHW Performance Reports]
        CHW_PERF --> TRAINING[Training Need Identification]
    end

The analytics pipeline is designed to eliminate the 8 to 10 hours a week of manual aggregation each district health officer currently does. Reports that now need weekend work to assemble before Monday's planning meeting would instead be generated automatically every Sunday evening.
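
As an illustration of the aggregation that replaces the manual tallies, here is a sketch that strips identifying fields and then counts visits per district, ISO week, and visit type. The record layout and the list of PII fields are assumptions for the example, not the NGO's actual schema.

```python
from collections import Counter
from datetime import date

# Hypothetical identifying fields removed before records reach the warehouse.
PII_FIELDS = {"patient_name", "phone"}


def anonymize(record):
    """Return a copy of the record without identifying fields."""
    return {key: value for key, value in record.items() if key not in PII_FIELDS}


def weekly_tally(records):
    """Count visits per (district, ISO year, ISO week, visit type)."""
    tally = Counter()
    for record in map(anonymize, records):
        iso = record["visit_date"].isocalendar()
        tally[(record["district"], iso[0], iso[1], record["visit_type"])] += 1
    return tally


if __name__ == "__main__":
    sample = [
        {"patient_name": "A", "phone": "000", "district": "district-A",
         "visit_type": "ANC", "visit_date": date(2024, 7, 1)},
        {"patient_name": "B", "phone": "111", "district": "district-A",
         "visit_type": "ANC", "visit_date": date(2024, 7, 3)},
    ]
    for key, count in sorted(weekly_tally(sample).items()):
        print(key, count)
```

A scheduled job running this kind of tally over the synced records is what would replace the hand-built weekly report.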

The tradeoffs I navigated

Accuracy versus explainability. A black-box deep learning model might beat a decision tree on clinical triage accuracy. But a black-box model a CHW can't explain to themselves or to a supervisor won't be trusted, and an untrusted tool won't be used no matter how well it scores on a benchmark. I chose explainable models for Tier 1 and accepted a small accuracy cost in exchange for something that would actually get deployed.

Feature richness versus simplicity. The digital system could collect far more data than the paper one. I recommended against it. Adding collection burden without adding matching value to the CHW's workflow is the fastest path to non-compliance. The system collects the same core fields as the paper system plus three high-value additions: GPS location for outbreak mapping, timestamp for visit cadence auditing, and voice notes for complex cases. Nothing else.
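
A sketch of how the three additions might sit alongside the paper-form fields, and how the timestamp alone supports a visit cadence audit. The class and field names are hypothetical, the paper form's core fields are left as an opaque dict because they are not enumerated here, and the six-week gap is only an example value.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass
class VisitRecord:
    household_id: str
    core_fields: dict                             # same fields as the paper form
    recorded_at: datetime                         # addition 1: timestamp
    gps: Optional[Tuple[float, float]] = None     # addition 2: (lat, lon)
    voice_note_path: Optional[str] = None         # addition 3: complex cases only


def overdue_households(records, today, max_gap=timedelta(weeks=6)):
    """Return household ids whose most recent visit is older than max_gap."""
    last_seen = {}
    for record in records:
        previous = last_seen.get(record.household_id)
        if previous is None or record.recorded_at > previous:
            last_seen[record.household_id] = record.recorded_at
    return sorted(h for h, seen in last_seen.items() if today - seen > max_gap)
```

This is the digital counterpart of the register's visit prompt: the same timestamp that feeds the audit also tells the CHW which households are due.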

Central cloud versus regional deployment. Given data sovereignty concerns and connectivity constraints, I recommended a regional pattern: data is processed in an Azure region in South Asia, not routed to European or North American data centers. That reduces latency and data sovereignty risk, and is much more cost-effective for a high-volume deployment in a lower-income country.

What this system achieves

Based on the feasibility study projections, still to be validated in a pilot:

CHW visit documentation time drops from 12 to 15 minutes per household on paper to 6 to 8 minutes with digital voice assistance
ANC coverage visibility moves from monthly aggregate reports to real-time tracking by household
Outbreak detection latency drops from 2 to 3 weeks on monthly paper tallies to 48 to 72 hours on daily sync data
District health officer reporting drops from 8 to 10 hours a week of manual aggregation to automated weekly reports
Supervisor audit moves from spot-check sampling to a complete visit trail review

What comes next

The feasibility study is done. The pilot plan covers three of the six districts, with a 6-month evaluation window before any decision on full rollout.

The biggest lesson from the engagement: the technology is the easier half of this problem. The governance, training, change management, and clinical validation work, the things that make the technology trustworthy and usable, is where the real investment goes.