Circuit Breaking and Rate Limiting

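Since Spring Cloud 2021, Resilience4j is the recommended replacement for Hystrix, which is no longer maintained. Resilience4j is a lightweight fault-tolerance library that provides five core patterns: circuit breaking, rate limiting, retry, bulkhead isolation, and time limiting.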

Core Patterns

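Pattern          Annotation        Purpose
Circuit Breaker  @CircuitBreaker   Trips on failures to stop them from cascading
Rate Limiter     @RateLimiter      Throttles the rate of calls
Retry            @Retry            Automatically retries failed calls
Bulkhead         @Bulkhead         Isolates resources by limiting concurrent calls
Time Limiter     @TimeLimiter      Enforces a timeout on calls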

Dependencies

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-circuitbreaker-resilience4j</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-aop</artifactId>
</dependency>

Circuit Breaker

Three States

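CLOSED (normal) ─── failure rate ≥ threshold ──► OPEN (tripped)
    ▲                                              │
    │                                   cool-down wait elapses
    │                                              ▼
    └─── probe calls succeed ───────────────── HALF_OPEN (probing)

If the probe calls fail, the breaker goes from HALF_OPEN back to OPEN and waits again.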
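The transitions above can be sketched as a small state machine. This is only an illustration in plain Java, not Resilience4j's implementation: the class `MiniCircuitBreaker`, its constructor parameters, and the injected millisecond clock are all hypothetical, and for brevity it evaluates calls in fixed batches instead of a true sliding window and does not cap the number of probe calls admitted in HALF_OPEN.

```java
import java.util.function.LongSupplier;

/** Illustrative state machine only; this is not Resilience4j's implementation. */
final class MiniCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;              // calls evaluated per batch while CLOSED
    private final double failureRateThreshold; // percent
    private final long waitMillis;             // time to stay OPEN before probing
    private final int halfOpenCalls;           // probe calls evaluated in HALF_OPEN
    private final LongSupplier clockMillis;    // injected clock (assumption, eases testing)

    private State state = State.CLOSED;
    private int calls;
    private int failures;
    private long openedAt;

    MiniCircuitBreaker(int windowSize, double failureRateThreshold, long waitMillis,
                       int halfOpenCalls, LongSupplier clockMillis) {
        this.windowSize = windowSize;
        this.failureRateThreshold = failureRateThreshold;
        this.waitMillis = waitMillis;
        this.halfOpenCalls = halfOpenCalls;
        this.clockMillis = clockMillis;
    }

    /** Returns the current state, moving OPEN -> HALF_OPEN once the wait has elapsed. */
    synchronized State state() {
        if (state == State.OPEN && clockMillis.getAsLong() - openedAt >= waitMillis) {
            transitionTo(State.HALF_OPEN);
        }
        return state;
    }

    /** Calls are rejected only while the breaker is OPEN. */
    synchronized boolean tryAcquirePermission() {
        return state() != State.OPEN;
    }

    /** Records one call outcome and re-evaluates the state once enough calls were seen. */
    synchronized void record(boolean success) {
        if (state() == State.OPEN) return;
        calls++;
        if (!success) failures++;
        int needed = (state == State.HALF_OPEN) ? halfOpenCalls : windowSize;
        if (calls < needed) return;
        boolean tooManyFailures = failures * 100.0 / calls >= failureRateThreshold;
        transitionTo(tooManyFailures ? State.OPEN : State.CLOSED);
    }

    private void transitionTo(State next) {
        state = next;
        calls = 0;
        failures = 0;
        if (next == State.OPEN) openedAt = clockMillis.getAsLong();
    }
}
```

A caller would check `tryAcquirePermission()` before the remote call and report the outcome through `record(...)` afterwards.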

Configuration

resilience4j:
  circuitbreaker:
    instances:
      order-service:
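        sliding-window-type: COUNT_BASED      # COUNT_BASED or TIME_BASED
        sliding-window-size: 10               # calls in the window (seconds if TIME_BASED)
        failure-rate-threshold: 50            # failure rate threshold (%)
        slow-call-rate-threshold: 80          # slow-call rate threshold (%)
        slow-call-duration-threshold: 2s      # calls slower than this count as slow
        wait-duration-in-open-state: 30s      # time to stay OPEN before probing
        permitted-number-of-calls-in-half-open-state: 5   # probe calls allowed in HALF_OPEN
        minimum-number-of-calls: 5            # calls required before the rates are evaluated
        automatic-transition-from-open-to-half-open-enabled: true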
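To make the interplay of `sliding-window-size`, `minimum-number-of-calls`, and `failure-rate-threshold` concrete, here is a minimal count-based window in plain Java. It is a sketch under assumptions, not the library's code: `CountBasedWindow` and its methods are hypothetical names, and returning -1 while too few calls have been recorded is modeled on how Resilience4j's metrics report a rate that is not yet available.

```java
/** Illustrative count-based sliding window; names are hypothetical. */
final class CountBasedWindow {
    private final boolean[] failed;          // ring buffer holding the last N outcomes
    private final int minimumNumberOfCalls;
    private int next;                        // index of the slot to overwrite next
    private int size;                        // number of outcomes currently stored
    private int failures;                    // failures among the stored outcomes

    CountBasedWindow(int slidingWindowSize, int minimumNumberOfCalls) {
        this.failed = new boolean[slidingWindowSize];
        this.minimumNumberOfCalls = minimumNumberOfCalls;
    }

    /** Stores one outcome, evicting the oldest one when the window is full. */
    void record(boolean failure) {
        if (size == failed.length) {
            if (failed[next]) failures--;    // the evicted outcome no longer counts
        } else {
            size++;
        }
        failed[next] = failure;
        if (failure) failures++;
        next = (next + 1) % failed.length;
    }

    /** Failure rate in percent, or -1 while fewer than minimumNumberOfCalls were recorded. */
    float failureRate() {
        if (size < minimumNumberOfCalls) return -1f;
        return failures * 100f / size;
    }

    /** True when the rate is known and has reached the threshold. */
    boolean shouldOpen(float thresholdPercent) {
        float rate = failureRate();
        return rate >= 0 && rate >= thresholdPercent;
    }
}
```

With the values from the configuration above (window 10, minimum 5, threshold 50), four failures in a row do not trip the breaker yet; the rate is first evaluated on the fifth recorded call.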

Annotation Usage

@CircuitBreaker(name = "order-service", fallbackMethod = "getOrderFallback")
public Order getOrder(Long id) {
    return orderClient.getOrder(id);
}

// Fallback method: same parameters as the original method plus a trailing Throwable, and the same return type
public Order getOrderFallback(Long id, Throwable t) {
    log.warn("Circuit breaker fallback, id={}, cause={}", id, t.getMessage());
    return Order.defaultOrder();
}

Programmatic Usage

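// Look up the instance configured under resilience4j.circuitbreaker.instances
CircuitBreaker cb = circuitBreakerRegistry.circuitBreaker("order-service");
Supplier<Order> decorated =
    CircuitBreaker.decorateSupplier(cb, () -> orderClient.getOrder(id));

// Try comes from Vavr (io.vavr.control.Try); recover with a default order on any failure
Order result = Try.ofSupplier(decorated)
    .recover(ex -> Order.defaultOrder())
    .get();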

Rate Limiter

resilience4j:
  ratelimiter:
    instances:
      order-api:
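        limit-for-period: 100         # calls permitted per refresh period
        limit-refresh-period: 1s      # length of one period
        timeout-duration: 0s          # how long to wait for a permit (0 = fail immediately)

@RateLimiter(name = "order-api", fallbackMethod = "rateLimitFallback")
public Order createOrder(CreateOrderRequest req) { ... }

// The fallback rethrows a more descriptive exception instead of returning a default value
public Order rateLimitFallback(CreateOrderRequest req, Throwable t) {
    throw new TooManyRequestsException("Too many requests, please try again later");
}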
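The three rate limiter settings can be read as "at most `limit-for-period` calls in each `limit-refresh-period`, with no waiting when `timeout-duration` is 0". A minimal sketch of that idea in plain Java follows. It is not Resilience4j's algorithm: `FixedPeriodLimiter` and the injected nanosecond clock are hypothetical, and only the non-blocking case is modeled.

```java
import java.util.function.LongSupplier;

/** Illustrative fixed-period limiter; not Resilience4j's implementation. */
final class FixedPeriodLimiter {
    private final int limitForPeriod;
    private final long periodNanos;
    private final LongSupplier nanoClock;   // injected clock (assumption, eases testing)
    private long periodStart;
    private int used;                       // permits consumed in the current period

    FixedPeriodLimiter(int limitForPeriod, long periodNanos, LongSupplier nanoClock) {
        this.limitForPeriod = limitForPeriod;
        this.periodNanos = periodNanos;
        this.nanoClock = nanoClock;
        this.periodStart = nanoClock.getAsLong();
    }

    /** Non-blocking acquire, i.e. the behavior of timeout-duration: 0s. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        long elapsedPeriods = (now - periodStart) / periodNanos;
        if (elapsedPeriods > 0) {
            // A new period has begun: move the start forward and refresh the permits.
            periodStart += elapsedPeriods * periodNanos;
            used = 0;
        }
        if (used < limitForPeriod) {
            used++;
            return true;
        }
        return false;
    }
}
```

In production code the clock would typically be `System::nanoTime`; a rejected call corresponds to the case where the annotated method's fallback is invoked.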

Retry

resilience4j:
  retry:
    instances:
      order-service:
        max-attempts: 3               # total attempts, including the first call
        wait-duration: 500ms          # fixed wait between attempts
        retry-exceptions:
          - java.io.IOException
          - feign.RetryableException
        ignore-exceptions:
          - com.example.exception.BusinessException

@Retry(name = "order-service", fallbackMethod = "retryFallback")
public Order getOrder(Long id) { ... }
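The retry settings above roughly describe the loop sketched below in plain Java. This is a hedged illustration, not the library's code: `SimpleRetry.call` and its parameters are hypothetical, the ignore list is assumed to take precedence over the retry list, and an exception that appears in neither list is not retried here.

```java
import java.util.List;
import java.util.concurrent.Callable;

/** Illustrative retry loop; not Resilience4j's implementation. */
final class SimpleRetry {
    /**
     * Runs the task up to maxAttempts times (the first call counts as attempt 1),
     * sleeping waitMillis between attempts.
     */
    static <T> T call(Callable<T> task, int maxAttempts, long waitMillis,
                      List<Class<? extends Throwable>> retryOn,
                      List<Class<? extends Throwable>> ignore) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                // Ignored exceptions are never retried, even if they also match retryOn.
                boolean retryable = !matches(ignore, e) && matches(retryOn, e);
                if (!retryable || attempt >= maxAttempts) throw e;
                Thread.sleep(waitMillis);
            }
        }
    }

    private static boolean matches(List<Class<? extends Throwable>> types, Throwable t) {
        for (Class<? extends Throwable> type : types) {
            if (type.isInstance(t)) return true;
        }
        return false;
    }
}
```

With `max-attempts: 3`, a call that fails twice with a retryable exception and then succeeds returns normally; a third failure is propagated to the caller or to the fallback method.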

Bulkhead

resilience4j:
  bulkhead:
    instances:
      order-service:
        max-concurrent-calls: 20
        max-wait-duration: 100ms      # how long a call waits for a permit at the limit (0 = reject immediately)
  thread-pool-bulkhead:
    instances:
      order-service-tp:
        max-thread-pool-size: 10
        core-thread-pool-size: 5
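        queue-capacity: 100

// SEMAPHORE is the default type; Bulkhead.Type.THREADPOOL selects a thread-pool-bulkhead instance instead
@Bulkhead(name = "order-service", type = Bulkhead.Type.SEMAPHORE)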
public Order getOrder(Long id) { ... }
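A semaphore bulkhead boils down to acquiring a permit before the call and releasing it afterwards. The sketch below shows that idea with `java.util.concurrent.Semaphore`. It is an illustration under assumptions, not Resilience4j's code: `SemaphoreBulkhead` is a hypothetical class, and it signals rejection with `IllegalStateException` where the library uses its own exception type.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Illustrative semaphore bulkhead; not Resilience4j's implementation. */
final class SemaphoreBulkhead {
    private final Semaphore permits;
    private final long maxWaitMillis;

    SemaphoreBulkhead(int maxConcurrentCalls, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrentCalls, true);
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Runs the call if a permit becomes available within maxWaitMillis, else rejects it. */
    <T> T execute(Supplier<T> call) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a bulkhead permit", e);
        }
        if (!acquired) {
            throw new IllegalStateException("bulkhead is full");
        }
        try {
            return call.get();
        } finally {
            permits.release();   // always return the permit, even if the call throws
        }
    }

    int availableCalls() {
        return permits.availablePermits();
    }
}
```

With `max-concurrent-calls: 20` and `max-wait-duration: 100ms`, the twenty-first concurrent caller would wait up to 100 ms for a permit before being rejected.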

Combining Annotations (Execution Order)

// Aspect order, outermost first, regardless of the order the annotations are written in:
// Retry → CircuitBreaker → RateLimiter → TimeLimiter → Bulkhead → method
@CircuitBreaker(name = "order-service", fallbackMethod = "fallback")
@RateLimiter(name = "order-api")
@Retry(name = "order-service")
public Order getOrder(Long id) { ... }
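One way to picture the ordering for the three annotations in this example is as nested wrappers: the outermost layer is entered first and left last. The plain-Java sketch below records that sequence. `AspectOrderDemo` and `layer` are hypothetical helpers that only log entry and exit; they do not perform any retrying, circuit breaking, or rate limiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Illustrative nesting of decorators; the layers only record when they run. */
final class AspectOrderDemo {
    /** Wraps a call and records when the named layer is entered and left. */
    static <T> Supplier<T> layer(String name, List<String> trace, Supplier<T> inner) {
        return () -> {
            trace.add(name + " in");
            try {
                return inner.get();
            } finally {
                trace.add(name + " out");
            }
        };
    }

    /** Builds Retry(CircuitBreaker(RateLimiter(getOrder))) and returns the recorded trace. */
    static List<String> run() {
        List<String> trace = new ArrayList<>();
        Supplier<String> method = () -> {
            trace.add("getOrder");
            return "order";
        };
        Supplier<String> decorated =
            layer("Retry", trace,
                layer("CircuitBreaker", trace,
                    layer("RateLimiter", trace, method)));
        decorated.get();
        return trace;
    }
}
```

Because Retry sits outermost, each retry attempt passes through the circuit breaker and the rate limiter again, so every attempt is counted by both.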

Monitoring Endpoints

management:
  health:
    circuitbreakers:
      enabled: true
  endpoints:
    web:
      exposure:
        include: health,circuitbreakers,ratelimiters

curl http://localhost:8080/actuator/health/circuitBreakers
