Case Study: DeFi Infrastructure Hardening & Observability

Overview

A DeFi platform running on Ethereum and sidechains was experiencing intermittent outages, stuck transactions, and unreliable indexers. Unreliable RPC providers, gas spikes, and fragile node setups made it hard to trust on-chain data.

Webomage worked with the team to harden its infrastructure and improve observability, so it could operate the protocol with more confidence.

Challenges

Unreliable RPC and node providers – intermittent errors and rate limits under load.
Nonce and gas issues – stuck, duplicated, or excessively expensive transactions.
Fragile indexing stack – shared IPFS and lightly configured indexers made data incomplete or late.
Limited observability – hard to see where failures occurred in on-chain/off-chain flows.
High expectations from users and partners – outages directly affected TVL, trading, and user trust.

Approach

End-to-end flow mapping
- Mapped how transactions flowed from the app through signers, nodes, and smart contracts, into indexers and analytics.
- Identified failure points around RPC, indexing, and third-party dependencies.
Node & provider strategy
- Introduced a mix of dedicated nodes and high-quality providers for redundancy.
- Tuned connection settings, batching, and retry behaviour to handle spikes.
- Clarified which flows required low latency vs. eventual consistency.
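
To illustrate the redundancy and retry behaviour described above, here is a minimal sketch of provider failover with exponential backoff. It is not the client's actual implementation: the function name `call_with_failover`, the `providers` callables, and the default retry and delay values are all assumptions made for this example. Each provider is modelled as a plain callable so the logic stays independent of any particular RPC library.

```python
import random
import time
from typing import Any, Callable, Optional, Sequence


class AllProvidersFailed(Exception):
    """Raised when every provider has exhausted its retries."""


def call_with_failover(
    providers: Sequence[Callable[[str, list], Any]],
    method: str,
    params: list,
    retries_per_provider: int = 2,   # hypothetical default
    base_delay: float = 0.2,         # hypothetical default, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Try each provider in order, retrying with jittered exponential backoff."""
    last_error: Optional[Exception] = None
    for provider in providers:
        for attempt in range(retries_per_provider + 1):
            try:
                return provider(method, params)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = exc
                if attempt < retries_per_provider:
                    # Back off 1x, 2x, 4x ... the base delay, plus random jitter
                    # so many clients do not retry in lockstep during a spike.
                    delay = base_delay * (2 ** attempt)
                    sleep(delay + random.uniform(0, delay))
        # Retries for this provider are exhausted; fall through to the next one.
    raise AllProvidersFailed(f"{method} failed on all providers") from last_error
```

In a setup like the one described, the first entries in `providers` would typically wrap the dedicated nodes and the later entries the third-party providers, so external capacity is only used when the primary path is failing.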
Indexing & data layer improvements
- Moved from shared/community IPFS to dedicated storage where appropriate.
- Improved The Graph subgraphs and related indexers for critical contracts.
- Ensured key metrics and balances were derived from trustworthy data paths.
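
One way to decide whether an indexer is a trustworthy data path at a given moment is to compare the block it has indexed with the chain head. The sketch below assumes both numbers are already available (for example, the head from a node's `eth_blockNumber` call and the indexed block from the indexer's own status metadata). The names `LagStatus` and `check_indexer_lag` and the thresholds are hypothetical and would need tuning per chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LagStatus:
    lag_blocks: int
    level: str  # "ok", "warn", or "critical"


def check_indexer_lag(
    chain_head: int,
    indexed_block: int,
    warn_after: int = 5,        # hypothetical threshold, in blocks
    critical_after: int = 50,   # hypothetical threshold, in blocks
) -> LagStatus:
    """Classify how far an indexer trails the chain head."""
    # Clamp at zero: the indexer can briefly appear ahead if the node
    # reporting the chain head is itself behind.
    lag = max(0, chain_head - indexed_block)
    if lag >= critical_after:
        level = "critical"
    elif lag >= warn_after:
        level = "warn"
    else:
        level = "ok"
    return LagStatus(lag_blocks=lag, level=level)
```

A backend could use the returned level to fall back to direct contract reads for balances while the indexer is in the "critical" state.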
Observability & alerts
- Added monitoring for node health, RPC error rates, queue backlogs, and key contract events.
- Defined alerts around conditions that mattered to users (e.g. withdrawal/claim failures).
- Built dashboards for protocol operators and support teams.
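
As a rough illustration of alerting on RPC error rates, the sketch below tracks call outcomes in a sliding time window and signals when the failure ratio crosses a threshold. The class name `ErrorRateMonitor` and its default window, threshold, and minimum sample count are assumptions for this example. A production setup would more likely express the same rule in its monitoring system than in application code.

```python
from collections import deque
from typing import Deque, Tuple


class ErrorRateMonitor:
    """Sliding-window error rate tracker for RPC calls (illustrative only)."""

    def __init__(
        self,
        window_seconds: float = 60.0,  # hypothetical default
        threshold: float = 0.05,       # hypothetical default: alert above 5% errors
        min_samples: int = 20,         # avoid alerting on a handful of calls
    ) -> None:
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.min_samples = min_samples
        self._events: Deque[Tuple[float, bool]] = deque()

    def record(self, timestamp: float, ok: bool) -> None:
        """Record one call outcome and drop events older than the window."""
        self._events.append((timestamp, ok))
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()

    def error_rate(self) -> float:
        """Fraction of failed calls among the events currently in the window."""
        if not self._events:
            return 0.0
        failures = sum(1 for _, ok in self._events if not ok)
        return failures / len(self._events)

    def should_alert(self) -> bool:
        """True when there are enough samples and the error rate exceeds the threshold."""
        return (
            len(self._events) >= self.min_samples
            and self.error_rate() > self.threshold
        )
```

The minimum sample count is a deliberate design choice: without it, a single failed call during a quiet period would produce a 100% error rate and a noisy alert.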
Operational playbooks
- Created runbooks for common incident types (RPC provider issues, indexer lag, chain events).
- Documented safe procedures for rolling node upgrades and configuration changes.

Outcomes

Higher reliability – fewer protocol incidents tied to infra or indexing problems.
Faster incident response – clearer signals and runbooks when things did go wrong.
Better partner and user confidence – infra issues no longer dominated the risk story.
Stronger foundation – the team could keep iterating on product and protocol design without fear of infra collapse.

Relevant capabilities

Ethereum and sidechain infrastructure operations.
DeFi application backends and indexers (The Graph and related tooling).
Monitoring, logging, and alerting for blockchain workloads.
Secure integrations with price feeds and analytics APIs.

➡️ If your DeFi or Web3 product is limited by infrastructure issues, start a conversation.
