Case Study: Hybrid Infrastructure

Hybrid Render Farm Implementation

How we created a seamless hybrid rendering environment that balances on-premise and AWS cloud resources, optimizing AWS Deadline for 10x capacity at 64% lower cost.

10x: Peak rendering capacity
30%: Shorter project timelines
€0: Capital expenses

Hybrid Rendering Solution

Learn how we helped a VFX studio expand their rendering capacity with a seamless hybrid cloud solution without requiring capital investments.

Client Challenge

A VFX studio with an existing on-premise render farm was struggling with two critical issues:

  • Missed deadlines during peak production — Their existing infrastructure couldn't scale to meet demand
  • Capital investment concerns — They needed additional capacity but wanted to avoid large hardware investments that would sit idle during slower periods

Technical Inefficiencies Identified

  • Resource Allocation Mismatch

    The studio's on-premise render nodes ran at only 42% average utilization during normal periods, yet were completely overwhelmed during production peaks. The studio used AWS EC2 instances as overflow, but launched them manually, with no integration into its render management system.

  • Data Transfer Bottlenecks

    Each time a cloud render node was launched, it required a full synchronization of project assets (20-50 GB per project), resulting in substantial egress costs and a 30-45 minute delay before rendering could begin.

  • Inefficient Job Distribution

    Their default Deadline configuration sent jobs randomly to available nodes with no consideration for hardware capabilities or connectivity speed. GPU-intensive lighting passes were often rendered on CPU-only instances, while simulation tasks were ineffectively distributed.

  • Unpredictable Cloud Costs

    Monthly cloud expenses ranged from €500 to over €12,000, with no budgeting controls in place. On several occasions, instances were left running after jobs completed, incurring unnecessary costs.

Our Solution

AWS Deadline Hybrid Mode Configuration

We designed a hybrid render farm architecture that maintained their existing on-premise hardware while seamlessly integrating cloud resources. Our solution provided a unified job submission system that intelligently routed renders to the most appropriate resource pool.

Technical Implementation

  • Configured custom Deadline Groups & Pools with workload-specific routing rules: lighting_pool, sim_pool, and comp_pool
  • Implemented Deadline AWS Portal with custom Python event plugins for intelligent workload routing
  • Created job submission presets that automatically tagged frames with resource requirements for optimal hardware matching
  • Prioritized on-premise render nodes for all jobs, launching cloud instances only when queue depth exceeded local capacity
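
The routing rules above can be illustrated with a minimal sketch. It is not the studio's Deadline event plugin and does not call the Deadline API. The pool names come from the list above, while the `Job` class, the requirement tags ("gpu", "sim", "cpu"), and the capacity check are hypothetical stand-ins for whatever the submission presets and plugins actually used.

```python
from dataclasses import dataclass

# Pool names are taken from the case study; the requirement tags are hypothetical.
POOL_BY_REQUIREMENT = {
    "gpu": "lighting_pool",
    "sim": "sim_pool",
    "cpu": "comp_pool",
}


@dataclass
class Job:
    name: str
    requirement: str  # hypothetical tag attached by a submission preset
    frames: int


def route(job: Job, queued_frames: int, local_capacity: int) -> tuple[str, str]:
    """Return (pool, location) for a job.

    On-premise nodes are preferred. The cloud is chosen only when the
    frames already queued plus this job's frames exceed local capacity.
    """
    pool = POOL_BY_REQUIREMENT[job.requirement]
    overflow = queued_frames + job.frames > local_capacity
    return pool, ("cloud" if overflow else "on_prem")


if __name__ == "__main__":
    print(route(Job("lighting_pass", "gpu", 40), queued_frames=10, local_capacity=100))
    print(route(Job("fluid_sim", "sim", 40), queued_frames=80, local_capacity=100))
```

Keeping the pool lookup separate from the location decision means the hardware match stays the same whether a frame ends up on a local node or a cloud instance.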

Dynamic Auto-Scaling Infrastructure

We implemented a multi-tier auto-scaling cloud render farm that automatically provisioned and deprovisioned instances based on queue composition and depth. This ensured resources were precisely matched to job requirements, controlling costs while providing virtually unlimited scaling capacity.

Technical Implementation

Auto-Scaling Group | Instance Type              | Scaling Trigger                        | Target Pool
GPU-Rendering      | g4dn.2xlarge (Spot Fleet)  | GPU queue > 10 frames for > 5 minutes  | lighting_pool
CPU-Rendering      | c5.12xlarge (Spot Fleet)   | CPU queue > 25 frames for > 5 minutes  | comp_pool
Simulation         | r5.8xlarge (Spot Fleet)    | SIM queue > 5 frames for > 5 minutes   | sim_pool

  • Developed custom CloudWatch metrics tracking Deadline queue length by job type
  • Implemented auto-shutdown policies with 10-minute idle detection to prevent wasted compute time
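
As a rough illustration of the scaling triggers and idle shutdown described above, the sketch below evaluates a series of queue-depth samples in plain Python. The thresholds, the 5-minute sustain window, and the 10-minute idle limit come from this section. The `QueueSample` type and both function names are hypothetical, and a real deployment would more likely express this as CloudWatch alarms than as application code.

```python
from dataclasses import dataclass

SUSTAIN_SECONDS = 5 * 60         # queue must stay above the threshold this long
IDLE_SHUTDOWN_SECONDS = 10 * 60  # idle time before an instance is stopped

# Queue-depth thresholds from the scaling table, keyed by target pool.
QUEUE_THRESHOLDS = {"lighting_pool": 10, "comp_pool": 25, "sim_pool": 5}


@dataclass
class QueueSample:
    timestamp: float  # seconds
    depth: int        # queued frames at that time


def should_scale_out(pool: str, samples: list[QueueSample], now: float) -> bool:
    """True if the queue has been above the pool threshold for more than 5 minutes."""
    threshold = QUEUE_THRESHOLDS[pool]
    breach_start = None
    for sample in sorted(samples, key=lambda s: s.timestamp):
        if sample.depth > threshold:
            if breach_start is None:
                breach_start = sample.timestamp
        else:
            breach_start = None  # any dip to or below the threshold resets the timer
    return breach_start is not None and now - breach_start > SUSTAIN_SECONDS


def should_shut_down(last_busy_at: float, now: float) -> bool:
    """True once an instance has been idle for at least 10 minutes."""
    return now - last_busy_at >= IDLE_SHUTDOWN_SECONDS
```

Resetting the timer on any dip keeps a short burst of submissions from launching Spot capacity that would sit idle a few minutes later.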

Optimized Storage Architecture

    We developed a multi-tiered storage solution with intelligent synchronization that only transferred the specific assets needed for each job, minimizing data transfer costs and reducing render startup times from 30+ minutes to under 5 minutes.

    Technical Implementation

    • S3-backed Asset Storage

      We deployed an S3 bucket with a CloudFront distribution for fast global access to common textures and models. Assets were organized in a content-addressable layout to eliminate redundancy.

    • EFS for Dynamic Workflow Data

      Amazon EFS was configured in performance mode for simulations and project files, mounted to both on-premise and cloud render nodes via Direct Connect and Transit Gateway.

    • Caching Mechanism

      We deployed a custom Python-based asset dependency analyzer that pre-cached required textures to instance-store volumes before render start, eliminating on-demand downloading.

    • Incremental Sync Logic

      We created a differential transfer system using file hashing and manifest comparison, reducing typical data transfer by 85% compared to the previous full-sync approach.
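
The hashing and manifest comparison behind the incremental sync can be sketched as follows. This is a minimal illustration under assumptions, not the studio's tool: SHA-256 is an assumed hash choice, and the function names and the manifest shape (relative path mapped to digest) are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large textures are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path relative to root onto its content hash."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def files_to_transfer(local: dict[str, str], remote: dict[str, str]) -> list[str]:
    """Return paths that are missing remotely or whose content has changed."""
    return sorted(
        path for path, digest in local.items() if remote.get(path) != digest
    )
```

Only the paths returned by `files_to_transfer` need to be uploaded. Files whose hashes already match the remote manifest are skipped, which is where the reduction relative to a full sync comes from.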

    Cost Monitoring & Governance

    We implemented strict budget controls with real-time monitoring and alerts to prevent unexpected cloud spending. A custom dashboard provided visibility into render farm performance, costs, and utilization across both on-premise and cloud resources.

    Technical Implementation

    • Deployed custom Grafana dashboards with per-project tracking and accurate cost forecasting
    • Implemented AWS Budgets with multi-level alerts (80%, 90%, 100%) and automated EC2 throttling
    • Created tagging policies that automatically labeled all resources by project, department, and shot
    • Developed a scheduling system allowing supervisors to allocate daily cloud budgets by project
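
The multi-level alerts can be illustrated with a short sketch. The 80%, 90%, and 100% levels come from the list above. The function names are hypothetical, and treating "throttling" as blocking new instance launches once the budget is spent is an assumption, since the exact throttling action is not described here. The sketch does not call the AWS Budgets API.

```python
# Alert levels from the case study, as whole percentages of the budget.
ALERT_LEVELS = (80, 90, 100)


def alerts_crossed(spend: float, budget: float) -> list[int]:
    """Return every alert level that current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return [level for level in ALERT_LEVELS if spend * 100 >= level * budget]


def launch_allowed(spend: float, budget: float) -> bool:
    """Assumed throttling rule: no new cloud instances once the budget is used up."""
    return spend < budget


if __name__ == "__main__":
    daily_budget = 400.0  # hypothetical daily cloud budget for one project, in euros
    for spend in (250.0, 330.0, 365.0, 400.0):
        print(spend, alerts_crossed(spend, daily_budget), launch_allowed(spend, daily_budget))
```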

    Results & Impact

    10x: Peak rendering capacity
    30%: Shorter project timelines
    €0: Capital expenses

    Detailed Cost Breakdown

    Before vs. After Cost Comparison

    Cost Category              | Before (Monthly) | After (Monthly) | Savings
    EC2 Compute (Peak Period)  | €15,400          | €5,320          | -65%
    Data Transfer Costs        | €3,200           | €560            | -83%
    Storage (S3 & EFS)         | €2,100           | €940            | -55%
    On-Premise Power & Cooling | €1,800           | €1,260          | -30%
    Total Monthly Costs        | €22,500          | €8,080          | -64%
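
The totals in the comparison can be rechecked with a few lines of arithmetic. The figures below are copied from the table. The variable and function names are illustrative only.

```python
# Monthly costs in euros, copied from the table: (before, after).
COSTS = {
    "EC2 Compute (Peak Period)": (15_400, 5_320),
    "Data Transfer Costs": (3_200, 560),
    "Storage (S3 & EFS)": (2_100, 940),
    "On-Premise Power & Cooling": (1_800, 1_260),
}


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent (negative means cheaper)."""
    return (after - before) / before * 100


total_before = sum(before for before, _ in COSTS.values())
total_after = sum(after for _, after in COSTS.values())
monthly_saving = total_before - total_after
annual_saving = monthly_saving * 12

if __name__ == "__main__":
    for category, (before, after) in COSTS.items():
        print(f"{category}: {percent_change(before, after):.1f}%")
    print(f"Total: {total_before} -> {total_after} "
          f"({percent_change(total_before, total_after):.1f}%)")
    print(f"Monthly saving: {monthly_saving}, annual: {annual_saving}")
```

The four categories sum to the stated totals, and the monthly difference multiplied by twelve matches the annual figure quoted under Key Financial Impact.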

    Compute Cost Optimization

    Our hybrid optimization approach resulted in significant EC2 cost reductions:

    Compute Savings Formula
    Savings = (On-Demand Hourly Rate − Optimized Hourly Rate) × Instance Hours per Month
            = (€0.34/hr − €0.11/hr) × 48,000 hrs = €11,040/month

    • Spot Instance adoption reduced hourly costs by 68% for interruptible workloads
    • Instance right-sizing reduced average instance costs by 32%
    • Auto-shutdown policies eliminated 240+ hours of idle compute time per week
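
The compute savings formula can be worked through directly. The rates and instance hours are the ones quoted above. `Decimal` is used only to keep the euro arithmetic exact, and the function names are illustrative.

```python
from decimal import Decimal


def compute_savings(on_demand_rate: Decimal, optimized_rate: Decimal,
                    instance_hours: int) -> Decimal:
    """Savings = (on-demand hourly rate - optimized hourly rate) x instance hours."""
    return (on_demand_rate - optimized_rate) * instance_hours


def rate_reduction_percent(on_demand_rate: Decimal, optimized_rate: Decimal) -> int:
    """Hourly rate reduction, rounded to a whole percent."""
    return round((on_demand_rate - optimized_rate) / on_demand_rate * 100)


ON_DEMAND = Decimal("0.34")  # euros per instance hour, from the formula above
OPTIMIZED = Decimal("0.11")  # euros per instance hour, from the formula above
HOURS = 48_000               # instance hours per month, from the formula above

if __name__ == "__main__":
    print(compute_savings(ON_DEMAND, OPTIMIZED, HOURS))
    print(rate_reduction_percent(ON_DEMAND, OPTIMIZED))
```

The same two rates also reproduce the 68% hourly reduction cited for Spot Instances.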

    Storage & Data Transfer Optimization

    Our tiered storage strategy and intelligent synchronization dramatically reduced costs:

    Storage Optimization Formula
    Savings = (S3 Standard Price per GB − Glacier Price per GB) × GB Moved
            = (€0.023/GB − €0.004/GB) × 45,000 GB = €855/month

    • S3 Intelligent Tiering automatically moved 45TB of archival assets to lower-cost storage
    • Reduced data transfer volume by 83% through differential synchronization
    • Content-based deduplication eliminated 22TB of redundant texture and model storage
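
The storage formula follows the same pattern, with one extra step: converting the 45 TB of archived assets to gigabytes. The prices are the ones quoted above. The decimal conversion (1 TB = 1,000 GB) is an assumption that matches the 45,000 GB used in the formula, and the function name is illustrative.

```python
from decimal import Decimal

GB_PER_TB = 1_000  # decimal terabytes, matching the 45,000 GB in the formula


def tiering_savings(standard_price: Decimal, archive_price: Decimal,
                    terabytes_moved: int) -> Decimal:
    """Monthly savings from moving data to a cheaper storage tier."""
    gigabytes_moved = terabytes_moved * GB_PER_TB
    return (standard_price - archive_price) * gigabytes_moved


if __name__ == "__main__":
    # Prices in euros per GB per month, from the formula above.
    print(tiering_savings(Decimal("0.023"), Decimal("0.004"), 45))
```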

    Hybrid Load Balancing Benefits

    Resource Utilization Improvement

    Before: 42% average utilization
    After: 86% average utilization

    Hybrid Cost Efficiency Formula
    Savings = Cloud Cost Shifted to On-Prem = €7,800/month in reduced cloud spending

    Cloud vs. On-Premise Workload Distribution

    Before Implementation: On-Premise 30%, AWS Cloud 70%
    After Implementation: On-Premise 65%, AWS Cloud 35%

    Key Financial Impact

    • €14,420 Monthly Cost Reduction

      Total savings across compute, storage, and operational costs, representing a 64% overall reduction in rendering infrastructure expenses.

    • 2.5 Month ROI Period

      Complete return on implementation investment achieved in under 3 months through direct cost savings.

    • €173,040 Annual Cost Avoidance

      Projected yearly savings without any compromise in rendering capacity or quality.

    "The hybrid solution from TraynMe gave us the best of both worlds - reliable on-premise rendering for baseline needs and limitless cloud capacity for crunch times. We've been able to take on larger projects with confidence in our ability to deliver on schedule."

    — Technical Director, VFX Studio

    Key Benefits

    • Zero Capital Investment — Expanded rendering capacity without any upfront hardware costs.
    • Elastic Scaling — Automatically adapts to workload changes, from baseline to peak production.
    • Unified Management — Single interface for managing both on-premise and cloud resources.

    Ready to optimize your hybrid rendering infrastructure?

    Our AWS Deadline Hybrid Mode optimization can help you achieve the perfect balance between on-premise and cloud resources.
