Gas Optimization in Solidity — Techniques That Saved Real ETH

Gas optimization in Solidity is the intersection of computer architecture, Ethereum economics, and compiler behavior. A poorly optimized contract can cost users 3-5x more per interaction than it needs to. At scale — on a DEX processing millions of swaps — this difference is millions of dollars.

This post covers the techniques that delivered real savings when I was optimizing a staking contract for a DeFi project. All gas numbers are from Foundry’s gas reporter on a local fork.

Table of contents

The Cost Model
Technique 1: Storage Variable Packing
Technique 2: Cache Storage Reads in Local Variables
Technique 3: calldata Instead of memory for External Functions
Technique 4: Custom Errors vs require Strings
Technique 5: unchecked Arithmetic
Technique 6: Mapping vs Array
Technique 7: Struct Layout for Maps
Real Results: Staking Contract Optimization
Tools for Gas Analysis
Related posts

The Cost Model

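Every EVM opcode has a gas cost, originally defined in the Ethereum Yellow Paper and revised since by EIPs such as EIP-2929 (cold/warm access, Berlin) and EIP-3529 (reduced refunds, London). The ones that matter most: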

Operation	Gas cost	Notes
SLOAD (cold)	2,100	First read of a storage slot per transaction
SLOAD (warm)	100	Subsequent reads of same slot
SSTORE (zero → nonzero)	22,100	Writing new storage value
SSTORE (nonzero → nonzero)	5,000	Updating existing value
SSTORE (nonzero → zero)	5,000 (- 4,800 refund = ~200 net)	Clearing storage (refund reduced by EIP-3529, London)
MLOAD/MSTORE	3	Memory is cheap
CALLDATALOAD	3	Reading from calldata
ADD/SUB/MUL	3-5	Arithmetic
SHA3 / KECCAK256	30 + 6 per 32-byte word	Hashing
CALL (external)	2,600+	Calling another contract

Storage is expensive. Memory is cheap. Calldata is cheapest. Design accordingly.
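
The cold/warm gap in the table can be observed directly. The following is a minimal sketch, not code from the staking contract: SloadProbe and measure are hypothetical names, and each reading includes a few gas of overhead from the surrounding stack operations and the GAS opcode itself, so expect values slightly above 2,100 and 100 rather than exact matches.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical probe contract, for illustration only.
contract SloadProbe {
    uint256 private value = 1; // occupies slot 0

    /// Reads slot 0 twice and reports the gas consumed by each read.
    /// The first read is cold only if nothing touched slot 0 earlier
    /// in the same transaction.
    function measure() external view returns (uint256 firstRead, uint256 secondRead) {
        uint256 a;
        uint256 b;

        uint256 g0 = gasleft();
        assembly { a := sload(value.slot) } // cold access if slot 0 is untouched
        uint256 g1 = gasleft();
        assembly { b := sload(value.slot) } // warm access
        uint256 g2 = gasleft();

        // Use both results so the optimizer cannot discard the reads.
        assert(a == b);

        firstRead = g0 - g1;
        secondRead = g1 - g2;
    }
}
```

Calling measure() as the first access to the contract's storage in a transaction should show the first read costing roughly twenty times the second, which is the whole argument for touching as few distinct slots as possible.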

Technique 1: Storage Variable Packing

The EVM stores state variables in 32-byte slots. If multiple variables fit in one slot, they share it — saving cold SLOAD costs.

// BAD: 4 slots used (declaration order and an oversized type prevent packing)
contract UnpackedStorage {
    address stakingToken;    // Slot 0 (20 bytes, wastes 12)
    uint256 totalStaked;     // Slot 1 (32 bytes, cannot share slot 0)
    bool paused;             // Slot 2 (1 byte, wastes 31)
    uint256 rewardRate;      // Slot 3 (32 bytes, wider than the value needs)
}

// GOOD: 2 slots used
contract PackedStorage {
    uint256 totalStaked;     // Slot 0 (32 bytes, alone)
    address stakingToken;    // Slot 1, bytes 0-19  (20 bytes)
    bool paused;             // Slot 1, byte 20     (1 byte)
    uint88 rewardRate;       // Slot 1, bytes 21-31 (11 bytes) — total: 32 bytes ✓
}

Gas savings: reading stakingToken, paused, and rewardRate together = 2,100 gas (one cold SLOAD) instead of 6,300 (three cold SLOADs).

Solidity packs from right to left within a slot. Declare smaller variables adjacent to each other to enable packing. The order matters:

// Variables are packed in declaration order
// These are NOT packed (uint256 between them breaks packing):
uint128 a;  // Slot 0, right 16 bytes
uint256 b;  // Slot 1, entire slot
uint128 c;  // Slot 2, right 16 bytes — a and c can't share a slot

// These ARE packed:
uint128 a;  // Slot 0, right 16 bytes
uint128 c;  // Slot 0, left 16 bytes — shares slot with a!
uint256 b;  // Slot 1, entire slot
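
One way to check a layout rather than trust the comments is to ask the compiler where it placed each variable. This is a minimal sketch under the assumption that the declaration order matches the PackedStorage example; PackedLayoutProbe and layout are hypothetical names. Inline assembly exposes .slot and .offset for state variables, where .offset is the byte position counted from the low-order end of the slot.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical probe: same declaration order as the packed example.
contract PackedLayoutProbe {
    uint256 totalStaked;
    address stakingToken;
    bool paused;
    uint88 rewardRate;

    /// Returns (slot, byte offset) for each of the three packed variables.
    function layout()
        external
        view
        returns (
            uint256 tokenSlot, uint256 tokenOffset,
            uint256 pausedSlot, uint256 pausedOffset,
            uint256 rateSlot, uint256 rateOffset
        )
    {
        assembly {
            tokenSlot := stakingToken.slot
            tokenOffset := stakingToken.offset
            pausedSlot := paused.slot
            pausedOffset := paused.offset
            rateSlot := rewardRate.slot
            rateOffset := rewardRate.offset
        }
    }
}
```

If the three variables are packed as intended, all three slot numbers are equal and the offsets are 0, 20, and 21. The same information is available at compile time from solc --storage-layout.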

Technique 2: Cache Storage Reads in Local Variables

Every SLOAD costs 100-2,100 gas. If you read the same storage variable multiple times in a function, cache it in a local variable (value-type locals live on the stack, so re-reading them costs about 3 gas):

// BAD: 3 SLOADs of totalStaked
function badCalculation() external view returns (uint256) {
    if (totalStaked > 0) {                           // SLOAD 1
        return (totalStaked * rewardRate) / totalStaked;  // SLOAD 2, 3
    }
    return 0;
}

// GOOD: 1 SLOAD of totalStaked
function goodCalculation() external view returns (uint256) {
    uint256 _totalStaked = totalStaked;  // SLOAD 1, cached in a local (stack) variable
    if (_totalStaked > 0) {
        return (_totalStaked * rewardRate) / _totalStaked;  // stack reads, no SLOAD
    }
    return 0;
}

Gas saved: roughly 200 gas (2 warm SLOADs at 100 each replaced with stack operations that cost about 3 each).

For loops, this is critical:

// BAD: totalStakers is read from storage every iteration
for (uint256 i = 0; i < totalStakers; i++) { ... }

// GOOD: read once
uint256 len = totalStakers;
for (uint256 i = 0; i < len; i++) { ... }
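
The same idea applies to writes: accumulate in a local variable and store the result once. The contract below is a hedged sketch with hypothetical names (DepositBatch, addAllUncached, addAllCached), not the staking contract's code. Both functions produce the same final state and both revert on overflow; they differ only in how often they touch storage.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical example contract, for illustration only.
contract DepositBatch {
    uint256 public totalStaked;

    // Uncached: every iteration performs an SLOAD and an SSTORE on totalStaked.
    function addAllUncached(uint256[] calldata amounts) external {
        for (uint256 i = 0; i < amounts.length; i++) {
            totalStaked += amounts[i];
        }
    }

    // Cached: one SLOAD before the loop, one SSTORE after it.
    function addAllCached(uint256[] calldata amounts) external {
        uint256 total = totalStaked; // single storage read
        uint256 len = amounts.length;
        for (uint256 i = 0; i < len; i++) {
            total += amounts[i];     // stack arithmetic only
        }
        totalStaked = total;         // single storage write
    }
}
```

In the uncached version each iteration after the first still pays for a warm SLOAD and a warm SSTORE, so the gap between the two functions grows linearly with the array length.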

Technique 3: calldata Instead of memory for External Functions

Function parameters can be calldata (read-only, points to raw input data) or memory (copied to memory). For external functions with array/struct parameters, calldata is almost always cheaper:

// BAD: copies the array to memory
function processList(uint256[] memory values) external {
    for (uint256 i = 0; i < values.length; i++) {
        _process(values[i]);
    }
}

// GOOD: reads directly from calldata (no copy)
function processList(uint256[] calldata values) external {
    for (uint256 i = 0; i < values.length; i++) {
        _process(values[i]);
    }
}

For a 100-element array: calldata saves ~3,000 gas (copy cost eliminated).

Use memory only when you need to modify the data. calldata is read-only.

Technique 4: Custom Errors vs require Strings

Before Solidity 0.8.4, errors used require(condition, "error string"). The string is stored in the contract bytecode and returned in revert data — wasteful.

// OLD: string stored in bytecode, passed in revert data
require(msg.sender == owner, "Ownable: caller is not the owner");

// NEW: custom error, encoded as a 4-byte selector plus any ABI-encoded arguments
error NotOwner(address caller);

if (msg.sender != owner) revert NotOwner(msg.sender);

Gas savings:

Deployment: ~200-500 gas less per error (smaller bytecode)
Runtime revert: ~50-150 gas less (no string encoding)
Bonus: Custom errors can include parameters (like NotOwner(msg.sender)) for better debugging at very little extra gas cost (each static argument adds 32 bytes of revert data)

Technique 5: unchecked Arithmetic

Solidity 0.8+ adds overflow/underflow checks to every arithmetic operation. These add ~20-25 gas per operation. When you’ve verified safety, use unchecked:

// Count iterations: i will never overflow uint256
// BAD: overflow check added by compiler for i++
for (uint256 i = 0; i < length; i++) { ... }

// GOOD: safe to skip check (i < length prevents overflow)
for (uint256 i = 0; i < length; ) {
    // ... loop body
    unchecked { ++i; }  // ++i is slightly cheaper than i++ (no temp variable)
}

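For a 100-iteration loop: unchecked { ++i } saves ~2,500 gas vs checked i++. Since Solidity 0.8.22 the compiler omits this overflow check on its own for simple counter loops like the one above, so the manual pattern matters mostly on older compiler versions.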
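
Deciding what may go inside unchecked is the part that needs care. The sketch below (BatchSum and sum are hypothetical names) leaves the addition checked because the inputs are untrusted, and unchecks only the counter, whose bound is enforced by the loop condition.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical example contract, for illustration only.
contract BatchSum {
    function sum(uint256[] calldata values) external pure returns (uint256 total) {
        uint256 len = values.length; // read the length once
        for (uint256 i = 0; i < len; ) {
            total += values[i];      // stays checked: caller-supplied values can overflow
            unchecked { ++i; }       // safe: i < len <= type(uint256).max
        }
    }
}
```

Passing two values whose sum exceeds type(uint256).max still reverts here, which is the behavior you want to keep when removing checks elsewhere.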

Technique 6: Mapping vs Array

Mappings and arrays have different gas profiles:

// Mapping: O(1) lookup, no length tracking, can't iterate
mapping(address => uint256) public balances;
balances[user] = amount; // Direct SSTORE

// Array: O(n) lookup, length tracked, iterable
address[] public users;
uint256[] public amounts;
// Must search to find a user's balance

For lookups by key (e.g., “what is user X’s balance?”): mapping. For iteration over all entries: array (but be careful of gas limits in loops).

The storage cost is identical per entry. The gas difference is in access patterns:

// Mapping: SSTORE + SLOAD = O(1)
balances[user] += amount;

// Array push: SLOAD(length) + SSTORE(length) + SSTORE(new element) = more expensive
amounts.push(amount);

For the staking contract, we stored user data in a mapping(address => StakerInfo) and separately tracked an array of staker addresses only when iteration was needed. Best of both worlds.
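
A common way to combine the two structures is to store each staker's array position in the mapping so that removal stays O(1). The following is a bookkeeping-only sketch under that assumption, not the project's actual contract: StakerRegistry, the index field, and both functions are hypothetical, and token transfers and rewards are left out.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical bookkeeping-only sketch: no token transfers, no rewards.
contract StakerRegistry {
    struct StakerInfo {
        uint256 staked;
        uint256 index; // 1-based position in `stakers`; 0 means not listed
    }

    mapping(address => StakerInfo) public info; // O(1) lookup by address
    address[] public stakers;                   // used only when iterating

    function stake(uint256 amount) external {
        StakerInfo storage s = info[msg.sender];
        if (s.index == 0) {
            stakers.push(msg.sender);
            s.index = stakers.length; // 1-based position of the new last element
        }
        s.staked += amount;
    }

    function unstakeAll() external {
        uint256 idx = info[msg.sender].index;
        if (idx == 0) return; // caller is not a staker

        uint256 last = stakers.length;
        if (idx != last) {
            // Swap and pop: move the last address into the vacated position.
            address moved = stakers[last - 1];
            stakers[idx - 1] = moved;
            info[moved].index = idx;
        }
        stakers.pop();
        delete info[msg.sender];
    }

    function stakerCount() external view returns (uint256) {
        return stakers.length;
    }
}
```

The trade-off is one extra storage field per staker in exchange for never scanning the array; array order is not preserved across removals.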

Technique 7: Struct Layout for Maps

When using structs in mappings, struct field layout matters for packing:

// BAD: 3 storage slots per user
struct StakerInfoBad {
    uint256 staked;     // Slot 0
    address token;      // Slot 1
    uint256 rewards;    // Slot 2 (breaks packing with address above)
}

// BETTER: still 3 storage slots, but the address is last so a small field can join it
struct StakerInfoGood {
    uint256 staked;     // Slot 0
    uint256 rewards;    // Slot 1
    address token;      // Slot 2 (12 bytes free: a uint96 declared next would share this slot)
}

// BEST: 2 slots, with careful packing
struct StakerInfoBest {
    uint256 staked;         // Slot 0 (standalone uint256)
    address token;          // Slot 1, bytes 0-19  (20 bytes)
    uint64 rewards;         // Slot 1, bytes 20-27 (8 bytes; max ~1.8e19, i.e. ~18.4 tokens at 18 decimals, so only for small or scaled amounts)
    uint32 lastClaimTime;   // Slot 1, bytes 28-31 (4 bytes, unix timestamp until year 2106)
}   //                         Total Slot 1: 20 + 8 + 4 = 32 bytes ✓

For 10,000 stakers: saving 1 slot per staker = saving 10,000 × 2,100 gas on the first cold read = 21,000,000 gas ≈ 0.042 ETH at 2 gwei. Meaningful at scale.
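
Narrow fields need a guard on the way in, because an explicit conversion such as uint64(x) truncates silently even under Solidity 0.8's checked arithmetic. The sketch below assumes a struct shaped like the two-slot layout with a uint64 and a uint32 field; PackedStaker, ValueTooLarge, and recordRewards are hypothetical names, and access control is omitted for brevity.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical sketch of writing into a packed struct with bounds checks.
contract PackedStaker {
    struct StakerInfo {
        uint256 staked;       // slot 0
        address token;        // slot 1, 20 bytes
        uint64 rewards;       // slot 1, 8 bytes
        uint32 lastClaimTime; // slot 1, 4 bytes
    }

    error ValueTooLarge(uint256 value, uint256 max);

    mapping(address => StakerInfo) public stakers;

    // Reverts instead of truncating when v does not fit in 64 bits.
    function _toUint64(uint256 v) internal pure returns (uint64) {
        if (v > type(uint64).max) revert ValueTooLarge(v, type(uint64).max);
        return uint64(v);
    }

    // Reverts instead of truncating when v does not fit in 32 bits.
    function _toUint32(uint256 v) internal pure returns (uint32) {
        if (v > type(uint32).max) revert ValueTooLarge(v, type(uint32).max);
        return uint32(v);
    }

    // No access control here: a real contract would restrict the caller.
    function recordRewards(address user, uint256 newRewards) external {
        StakerInfo storage s = stakers[user];
        s.rewards = _toUint64(newRewards);
        s.lastClaimTime = _toUint32(block.timestamp);
    }
}
```

Since type(uint64).max is about 1.8e19, a reward of 20 tokens at 18 decimals would revert here rather than wrap around, which makes the size limit of the packed field visible during testing.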

Real Results: Staking Contract Optimization

Before and after gas measurements from Foundry (forge test --gas-report):

Function	Before	After	Savings
`stake(amount)`	68,432 gas	51,204 gas	25%
`unstake(amount)`	72,108 gas	54,832 gas	24%
`claimRewards()`	45,231 gas	31,904 gas	29%
`getRewards(user)` view	8,420 gas	2,104 gas	75%
Deploy cost	1,842,000 gas	1,621,000 gas	12%

Key changes that drove results:

Storage packing in StakerInfo struct: -12,000 gas per stake/unstake
Caching storage reads: -8,000 gas per claim
Custom errors: -3,000 gas deployment, -100 gas per revert
unchecked loop counters: -2,500 gas per batch operation
calldata for batch functions: -4,000 gas per call

Per-user savings for a stake + 12 monthly claims + unstake: 17,228 + 12 × 13,327 + 17,276 ≈ 194,000 gas saved = ~0.004 ETH at 20 gwei. Across 10,000 active stakers, that’s ~40 ETH ($80K at $2,000/ETH). At scale, gas optimization is real money.

Tools for Gas Analysis

Foundry gas reporter: forge test --gas-report gives per-function gas statistics across all tests.

Hardhat gas reporter: Plugin that adds gas cost to test output.

eth-gas-reporter: Works with Mocha/Chai tests, supports CI gas tracking.

Tenderly: Simulate transactions with exact gas trace, identify expensive opcodes.

Solidity compiler output: solc --gas shows estimated gas for each function — useful before writing tests.

The mindset shift: treat every SLOAD as costing money (because it does, from your users’ perspective). Profile first, optimize second, measure after. The biggest wins usually come from restructuring data access patterns, not micro-optimizations.

Related posts

ERC-20 Standard — Building and Auditing a Token from Scratch — the canonical contract type where these storage and calldata optimizations deliver the most impact per transaction