Enterprise EdTech Platform Engineering for Global English Language Learning

For over five years Azati has been a continuous engineering partner on a production-grade digital learning platform serving 1M+ registered learners and 75K+ teachers across multiple countries. The work spans backend service development, LMS integrations, assignment and assessment engines, high-volume data migrations, and reliability engineering on a system that genuinely cannot afford downtime.

1M+

registered learners across global markets

75K+

active teachers managing classes across multiple countries

1B+

database records processed and maintained without downtime

Technologies used

Ruby on Rails
React
JavaScript
PostgreSQL
Redis
AWS
Docker
Sidekiq
Webpack
CloudFront

Motivation

The client is an EdTech software company serving a major international educational publisher, delivering English language learning content built around National Geographic and TED materials to schools, universities, language centers, and corporate training programs across the US, Asia, Latin America, and Europe. The platform operates 24/7 across time zones, runs on a database growing by millions of records every week, and serves active classes where downtime is not an option.

Azati joined the engineering team as a backend specialist and has been an active contributor for over five years, covering everything from feature development and LMS integrations to production incidents and large-scale data migrations.

Business challenges

Challenge 01

A platform that runs 24/7 across every time zone with no room for downtime

The platform serves users across the US, Asia, Latin America, and Europe simultaneously. There is no quiet window to take things offline, and every release has to be designed from the start to run without interrupting active users:

  • Zero-downtime deployment required
  • Non-blocking schema changes
  • Multi-environment pre-release testing
  • Go/no-go release coordination
  • Fast hotfix execution under live traffic
Challenge 02

Billions of records and changes that can't lock the database

Student responses, gradebook entries, progress records: it adds up fast at this scale. Making structural changes to that data without locking tables or degrading queries is a real engineering constraint:

  • Multi-step migrations on live tables
  • ETL pipelines between datasources
  • Retroactive backfills at billion-record scale
  • PERF environment load testing before production
Challenge 03

A distributed architecture with many moving parts

The system is a set of bounded-context services plus integrations with external LMS platforms via LTI. When something goes wrong, finding the source quickly is as important as the fix itself:

  • LTI integration across multiple LMS platforms
  • SQS-based async messaging between services
  • Structured logging and distributed tracing
  • API contract stability across web and mobile
Challenge 04

Shipping features without pausing stability work

The product roadmap doesn't stop for reliability improvements. New course series, new assignment types, new teacher tools: all shipped while the platform serves active classes around the clock:

  • Parallel feature and reliability tracks
  • Full QA cycle per release
  • Mobile/web API compatibility maintenance
  • Customer success escalation handling

The client's requirements

The client needed a backend engineer who could join an existing, complex codebase and contribute meaningfully without a long onboarding tail. The platform was already live and serving real users, so the bar for production-readiness was set from day one. Over five years the scope has been broad and ongoing:

  • Develop and maintain backend services across the platform service architecture
  • Build and extend the assignment and assessment engine including test banks and Online Placement Tests
  • Implement and maintain LTI/LMS integrations with Canvas, Moodle, Blackboard, and other platforms
  • Design and execute large-scale data migrations without downtime or performance degradation
  • Maintain observability infrastructure including logging, tracing, and monitoring
  • Participate in release management including go/no-go reports, multi-environment testing, and incident response
  • Contribute to gradebook, analytics, and student progress reporting features
  • Support the customer success team on technical issues from schools and universities

Why Azati?

Backend engineering depth in a complex, mature codebase

Joining a five-year-old production platform with billions of records, multiple services, and a live user base is a different challenge from greenfield development. The Azati engineer passed a technical interview with the client's CTO and was selected for the ability to work independently in a complex Ruby on Rails environment. No extended ramp-up, no hand-holding through the architecture.

Long-term embedded presence, not a rotation

The same Azati engineer has been on this project for 5+ years. Continuity matters enormously on a platform of this complexity. Accumulated knowledge of how the system behaves under load, where the historical edge cases are, which integrations have quirks: that kind of context can't be onboarded quickly. Azati's staff augmentation model made it possible to maintain that continuity over the full engagement.

Direct collaboration with stakeholders at all levels

The Azati engineer participated in backlog refinement sessions, stakeholder demos, and coordination with the customer success team handling issues from real schools and universities. This isn't a contractor relationship where tickets get picked up in isolation: it's genuine integration into the product development process, from planning through to post-release support.

Reliability engineering under real production pressure

When incidents happen on a platform serving over 1 million registered learners, there's no time for process overhead. The Azati engineer worked through production incidents, analyzed telemetry in New Relic and Sentry, prepared incident reports, and contributed to retrospectives. That kind of operational discipline is built through experience, not process documents.

Solution

01

Assignment and Assessment Engine

The assignment and assessment system is the part of the platform teachers use most directly. This covers the full lifecycle: creating and distributing assignments, collecting and grading responses, building test banks, running Online Placement Tests, and surfacing results in the gradebook. Auto-graded and manually reviewed responses are both supported, and grading pipelines run asynchronously via Sidekiq and SQS so they don't block the main application.

Key capabilities:
  • Assignment creation, distribution, and submission tracking
  • Auto-grading and manual teacher review workflows
  • Test bank management and Online Placement Test delivery
  • Async grading pipelines via Sidekiq and Shoryuken/SQS
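
The split between auto-graded and manually reviewed responses can be sketched in plain Ruby. This is an illustration under stated assumptions, not the platform's code: `Response`, `GradingWorker`, `submit_all`, and the response kinds are hypothetical names, and an in-memory `Queue` stands in for the Sidekiq and Shoryuken/SQS transport.

```ruby
# Hypothetical sketch: an in-memory Queue stands in for the Sidekiq/SQS transport.
Response = Struct.new(:id, :kind, :answer, :expected, :score, :status, keyword_init: true)

class GradingWorker
  # Assumed set of response kinds that can be scored without a teacher.
  AUTO_GRADABLE = %i[multiple_choice fill_in_blank].freeze

  def initialize(queue)
    @queue = queue
  end

  # Processes responses on a background thread until the queue is closed and drained.
  def start
    Thread.new do
      while (response = @queue.pop) # pop returns nil once the queue is closed and empty
        grade(response)
      end
    end
  end

  private

  def grade(response)
    if AUTO_GRADABLE.include?(response.kind)
      response.score  = response.answer == response.expected ? 1.0 : 0.0
      response.status = :auto_graded
    else
      # Open-ended work is left unscored and routed to the teacher's review queue.
      response.status = :needs_teacher_review
    end
  end
end

# Enqueueing returns immediately; grading happens off the caller's thread.
def submit_all(responses)
  queue = Queue.new
  worker = GradingWorker.new(queue).start
  responses.each { |r| queue << r }
  queue.close
  worker.join # only so this sketch can inspect the results deterministically
  responses
end
```

The point of the shape is that the submitting request only enqueues work; scoring and routing to manual review happen on the worker side, so a slow grading step never holds up the main application.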
02

LMS and LTI Integration Layer

Schools using Canvas, Moodle, or Blackboard don't want to manage a separate login or manually sync grades. The integration layer handles LTI launch flows, SSO, grade passback, enrollment sync via OneRoster, and content linking across multiple LMS platforms, each with their own LTI implementation quirks. API contracts stay stable across web and mobile client versions throughout.

Key capabilities:
  • LTI integration with Canvas, Moodle, Blackboard, and others
  • SSO and enrollment management across institutional accounts
  • Grade passback and progress sync to external LMS platforms
  • OneRoster support for roster and enrollment data
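
As an illustration of what grade passback involves, the sketch below builds a score payload in the shape defined by the LTI 1.3 Assignment and Grade Services (AGS) specification. This is an assumption: the case study does not say which LTI version or passback mechanism the platform uses, and `build_ags_score` and its parameter names are hypothetical.

```ruby
require "json"
require "time"

# Media type the AGS specification defines for score posts.
AGS_SCORE_CONTENT_TYPE = "application/vnd.ims.lis.v1.score+json"

# Builds the body for a score POST to "<line item URL>/scores".
# `lms_user_id` is the user id the LMS supplied at launch (hypothetical parameter name).
def build_ags_score(lms_user_id:, score:, max_score:, graded_at: Time.now, needs_review: false)
  raise ArgumentError, "max_score must be positive" unless max_score.positive?
  raise ArgumentError, "score must not be negative" if score.negative?

  {
    userId: lms_user_id,
    scoreGiven: score,
    scoreMaximum: max_score,
    activityProgress: "Completed",
    # "PendingManual" tells the LMS a teacher still has to review the work.
    gradingProgress: needs_review ? "PendingManual" : "FullyGraded",
    timestamp: graded_at.getutc.iso8601(3)
  }
end

payload = build_ags_score(lms_user_id: "lms-user-42", score: 8, max_score: 10)
body = JSON.generate(payload) # sent with Content-Type: AGS_SCORE_CONTENT_TYPE
```

Each LMS interprets these fields slightly differently, which is where the per-platform quirks mentioned above tend to surface.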
03

High-Volume Data Engineering and Migrations

Structural changes to a database with billions of records require a different approach than typical migrations. Multi-step patterns add new columns alongside old ones, backfill data in batches, switch over, and clean up: all without the application going offline. A dedicated PERF environment, a copy of production with obfuscated data, is used for load testing migrations via Apache JMeter before they touch real user data.

Key capabilities:
  • Multi-step zero-downtime schema migrations on billion-record tables
  • ETL pipelines for moving data between datasources
  • Retroactive backfill of historical records at scale
  • PERF environment load testing before production changes
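
The backfill step of that multi-step pattern can be sketched as follows. This is a minimal in-memory illustration, not the project's migration code: the column names `legacy_score` and `score_v2`, the function name, and the array of hashes standing in for a table are all hypothetical.

```ruby
# Hypothetical in-memory stand-in for a large table; each hash is one row.
# Copies legacy_score into the new score_v2 column one id-range batch at a time,
# skipping rows that are already filled so the job can be stopped and re-run safely.
def backfill_in_batches(rows, batch_size:, pause: 0)
  return 0 if rows.empty?

  updated = 0
  min_id, max_id = rows.map { |r| r[:id] }.minmax

  (min_id..max_id).step(batch_size) do |start_id|
    end_id = start_id + batch_size - 1
    # A real database would resolve this range through the primary key index.
    batch = rows.select { |r| r[:id].between?(start_id, end_id) && r[:score_v2].nil? }
    batch.each { |r| r[:score_v2] = r[:legacy_score] }
    updated += batch.size
    sleep(pause) if pause.positive? # leave headroom for live traffic between batches
  end

  updated
end

rows = (1..10).map { |i| { id: i, legacy_score: i * 10, score_v2: nil } }
backfill_in_batches(rows, batch_size: 3) # touches 4 small batches instead of one large write
```

In a Rails codebase the same shape is commonly written with Active Record's `in_batches` and `update_all`. The properties that matter are the same either way: small bounded writes, and idempotence so an interrupted run can simply be restarted.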
04

Observability and Incident Response

In a distributed service architecture, knowing where a problem is matters as much as knowing how to fix it. Centralized structured logging across CloudWatch and Sentry, distributed tracing with request ID propagation, and monitoring alerts surface problems early. When incidents happen, the process covers telemetry analysis, root cause identification, hotfix deployment, and a structured retrospective that feeds findings back into future release controls.

Key capabilities:
  • Centralized logging via CloudWatch and Sentry with structured format
  • Distributed tracing with request/trace ID propagation across all services
  • Incident analysis, hotfix deployment, and retrospective process
  • ALB rule tuning and rate limiting improvements post-incident
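
Request ID propagation with structured logging can be sketched as a small Rack-style middleware. This is an illustrative sketch rather than the platform's implementation: the class name `RequestIdLogger` and the log field names are assumptions, and the block runs without the Rack gem because it relies only on the `call(env)` convention.

```ruby
require "json"
require "securerandom"
require "time"

class RequestIdLogger
  HEADER = "HTTP_X_REQUEST_ID" # Rack's env key for an incoming X-Request-Id header

  def initialize(app, output: $stdout)
    @app = app
    @output = output
  end

  def call(env)
    # Reuse the upstream ID so one request can be followed across services;
    # mint a new one only when this service is the first hop.
    incoming = env[HEADER].to_s
    request_id = incoming.empty? ? SecureRandom.uuid : incoming
    env[HEADER] = request_id

    status, headers, body = @app.call(env)

    entry = {
      time: Time.now.utc.iso8601(3),
      request_id: request_id,
      method: env["REQUEST_METHOD"],
      path: env["PATH_INFO"],
      status: status
    }
    @output.puts(JSON.generate(entry)) # one JSON object per line, easy to index and search

    [status, headers.merge("x-request-id" => request_id), body]
  end
end
```

Rails ships a comparable middleware, `ActionDispatch::RequestId`. Whichever implementation is used, the ID also has to be forwarded on outgoing service calls and queue messages for cross-service tracing to work.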
05

Teacher Dashboard and Analytics

The platform supports a full teacher workflow across assignments, grading, analytics, and progress tracking, as well as hybrid classroom and self-study scenarios. Content covers the full CEFR spectrum across multiple course series, targeting all four language skills through interactive digital textbooks, audio and video materials, and mobile learning. The gradebook and analytics features provide progress visualization and performance breakdowns built on top of a large and continuously updating dataset.

Key capabilities:
  • Gradebook with per-student and per-assignment performance views
  • Progress tracking and analytics dashboards for teachers
  • Manual grading and score override workflows
  • Blended learning and hybrid classroom scenario support

Major achievements

Metric / area | State at engagement start | Current state
Active students on platform | Growing early-stage user base | 1M+ registered learners globally
Active teachers | Limited institutional adoption | 75K+ active teachers across multiple countries
Data migrations | Ad-hoc, high-risk operations | Zero-downtime multi-step patterns via PERF environment
Release process | Basic deployment pipeline | Go/no-go reports, multi-environment QA, incident retrospectives
LMS integrations | Limited external connectivity | Canvas, Moodle, Blackboard, and others via LTI/OneRoster
Platform engagement | Single content series | Multiple course series across ages and CEFR levels

Security

The platform operates under ISO/IEC 27001:2022 and SOC 2 Type II certifications, covering information security management across the full platform lifecycle. Infrastructure is monitored 24/7 by a dedicated support partner under a permanent support contract. Quarterly penetration tests validate the platform's security posture. All data handling complies with the security requirements of institutional clients including schools and universities across the US, Europe, and Asia.

Engagement & delivery

T&M embedded in the engineering team

The Azati engineer works on a Time & Material basis, embedded directly in the client team rather than operating as an external contractor. This means participation in team ceremonies, direct access to product stakeholders, and shared ownership of the platform's technical quality.

Agile delivery with full-cycle engineering involvement

The team works in Agile sprints with backlog refinement, sprint planning, and demos to stakeholders. The Azati engineer's involvement spans the full delivery cycle:

  • Backlog refinement and technical scoping with product and engineering leads
  • Feature development, code review, and release preparation
  • Multi-environment QA including unit, manual, and automated test coverage
  • Go/no-go release coordination and post-release monitoring
  • Incident response, hotfix deployment, and retrospective participation

Results & business impact

Platform Scaled to 1M+ Registered Learners

Over five years of continuous engineering, the platform grew to serve over 1 million registered learners and 75,000 active teachers across schools and universities in the US, Asia, Latin America, and Europe.

Zero-Downtime Operations at Billion-Record Scale

Migration patterns and deployment practices developed during the engagement allow structural changes to a multi-billion-record database without taking the platform offline or degrading performance for active users.

Reliable LMS Connectivity for Institutional Clients

LTI integrations with Canvas, Moodle, Blackboard, and others give institutional clients the ability to deliver content within their existing LMS, with grade passback and enrollment sync handled automatically. This is a direct factor in adoption by schools and universities with standardized LMS infrastructure.

Mature Release and Incident Management Process

The release process that evolved over the engagement (multi-environment testing, go/no-go reports, PERF load testing, and structured incident retrospectives) means the team can ship confidently on a platform where failures affect real students in real classrooms.

Deep EdTech Domain Expertise Retained

Five years on a platform of this complexity leaves real expertise inside Azati: LMS/LTI/LRS/OneRoster standards, blended learning architecture, high-volume educational data patterns, and the operational realities of running a global 24/7 SaaS for educational institutions.

Strategic wins

Some of what was built here goes beyond individual features. It is infrastructure and process that the platform now depends on:

PERF environment as a production risk filter

Setting up a production-mirroring PERF environment with Apache JMeter load simulation changed how risky changes get evaluated. Instead of finding out in production that a migration takes six hours under real traffic, the team finds out in PERF first. That capability has probably prevented more incidents than any other single investment in the engagement.
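
The arithmetic behind that kind of estimate is simple once a PERF run has produced a per-batch timing. The sketch below is illustrative only: the function name is hypothetical and the figures are made up, not measurements from the platform.

```ruby
# Estimates wall-clock hours for a batched backfill from numbers measured in a test run.
def estimated_backfill_hours(total_rows:, batch_size:, seconds_per_batch:, pause_seconds: 0)
  raise ArgumentError, "batch_size must be positive" unless batch_size.positive?

  batches = (total_rows.to_f / batch_size).ceil
  (batches * (seconds_per_batch + pause_seconds)) / 3600.0
end

# Illustrative figures: 1 billion rows, 10,000 rows per batch, 0.2 s per batch.
# 100,000 batches * 0.2 s = 20,000 s, or roughly 5.6 hours.
estimated_backfill_hours(total_rows: 1_000_000_000, batch_size: 10_000, seconds_per_batch: 0.2)
```

Measuring `seconds_per_batch` under simulated load rather than on an idle database is what makes the estimate meaningful, since contention with live traffic is usually what stretches a migration.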

Multi-step migration patterns for zero-downtime schema changes

The patterns developed for changing large production tables without locking the app (multi-step column changes, concurrent index builds, and batched retroactive backfills) are now the standard approach. They are not a workaround but how the engineering team thinks about database changes by default.

Observability as a first-class engineering concern

Centralized structured logging, distributed tracing, and request ID propagation across services: the real payoff shows up during incidents. The ability to trace a user-reported problem through multiple services to the exact query or service call that failed is the difference between a 20-minute resolution and a 4-hour one.

Institutional LTI integration as a growth lever

LMS integration isn't just a technical feature; it's what makes the platform adoptable by universities and large school districts that have standardized on Canvas or Moodle. The LTI integration layer built and maintained over the engagement is directly tied to the platform's ability to serve institutional customers at scale.

Team composition

The engagement has run lean on the Azati side by design: a single embedded engineer with deep platform knowledge, working as an integral part of the client team rather than as external support.

  • Backend Engineer (ongoing, 5+ years): the primary Azati contributor to the platform. Responsible for service development across the bounded-context architecture, LMS and LTI integrations, assignment and assessment engine features, data migrations, observability infrastructure, release management, and incident response. Works embedded in the engineering team with direct access to product stakeholders and the customer success function.

The described expertise is relevant for

  • EdTech platforms with LMS and LTI integration requirements
  • Online assessment and gradebook systems at scale
  • Blended and digital learning platform development
  • SaaS platforms requiring zero-downtime reliability engineering
  • High-volume data migration and schema change at billion-record scale
  • Backend engineering for global multi-region education products
