Resilience4j
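Resilience4j is a lightweight fault-tolerance library built around Java 8 functional interfaces. The Hystrix project, now in maintenance mode, points to it as a successor, and Spring Cloud Circuit Breaker ships a Resilience4j implementation. The 2.x line used in this article requires Java 17. The library is modular: you add only the modules you need, and each one is applied as a decorator around any function.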
Core modules
| Module | artifactId | Purpose |
|---|---|---|
| CircuitBreaker | resilience4j-circuitbreaker | Circuit breaking |
| RateLimiter | resilience4j-ratelimiter | Rate limiting |
| Bulkhead | resilience4j-bulkhead | Bulkhead isolation (concurrency limiting) |
| Retry | resilience4j-retry | Automatic retries |
| TimeLimiter | resilience4j-timelimiter | Timeout control |
| Cache | resilience4j-cache | Result caching |
Spring Boot integration

    <dependency>
      <groupId>io.github.resilience4j</groupId>
      <artifactId>resilience4j-spring-boot3</artifactId>
      <version>2.2.0</version>
    </dependency>
    <!-- The annotations rely on Spring AOP -->
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-aop</artifactId>
    </dependency>

Circuit breaker (CircuitBreaker)
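State machine:

    CLOSED    ──(failure rate or slow-call rate reaches its threshold)──▶ OPEN
    OPEN      ──(waitDurationInOpenState has elapsed)──▶ HALF_OPEN
    HALF_OPEN ──(trial calls pass)──▶ CLOSED
    HALF_OPEN ──(trial calls fail)──▶ OPEN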
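The state machine above can be sketched in plain Java to make the transitions concrete. This is a simplified illustration, not Resilience4j's implementation: `MiniCircuitBreaker` and its methods are hypothetical names, only the failure rate is tracked (no slow-call rate), the number of trial calls in HALF_OPEN is not capped, the class is not thread-safe, and time is passed in explicitly to keep the example deterministic. The constructor parameters mirror the configuration keys used in this section.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified count-based circuit breaker. Illustration only, not Resilience4j's code. */
final class MiniCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;          // plays the role of slidingWindowSize
    private final int minimumCalls;        // minimumNumberOfCalls
    private final double failureThreshold; // failureRateThreshold, in percent
    private final long waitMillis;         // waitDurationInOpenState
    private final int halfOpenCalls;       // permittedNumberOfCallsInHalfOpenState

    private final Deque<Boolean> outcomes = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAtMillis;

    MiniCircuitBreaker(int windowSize, int minimumCalls, double failureThreshold,
                       long waitMillis, int halfOpenCalls) {
        this.windowSize = windowSize;
        this.minimumCalls = minimumCalls;
        this.failureThreshold = failureThreshold;
        this.waitMillis = waitMillis;
        this.halfOpenCalls = halfOpenCalls;
    }

    /** Returns true if a call may proceed at the given time. */
    boolean tryAcquire(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAtMillis >= waitMillis) {
            state = State.HALF_OPEN;   // wait time elapsed: start probing
            outcomes.clear();
        }
        return state != State.OPEN;
    }

    /** Records the outcome of a permitted call and updates the state. */
    void record(boolean failed, long nowMillis) {
        boolean probing = state == State.HALF_OPEN;
        outcomes.addLast(failed);
        if (outcomes.size() > (probing ? halfOpenCalls : windowSize)) {
            outcomes.removeFirst();    // keep only the most recent calls
        }
        if (outcomes.size() < (probing ? halfOpenCalls : minimumCalls)) {
            return;                    // not enough data to compute a rate yet
        }
        long failures = outcomes.stream().filter(f -> f).count();
        double failureRate = 100.0 * failures / outcomes.size();
        if (failureRate >= failureThreshold) {
            state = State.OPEN;        // threshold reached: reject calls
            openedAtMillis = nowMillis;
            outcomes.clear();
        } else if (probing) {
            state = State.CLOSED;      // trial calls passed: resume normal operation
            outcomes.clear();
        }
    }

    State state() {
        return state;
    }
}
```

With a window of 10, a minimum of 5 calls and a 50% threshold, three failures among the first five calls open the breaker; after the wait time, three successful trial calls close it again.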
Configuration:

    resilience4j:
      circuitbreaker:
        instances:
          orderService:
            # Sliding window: COUNT_BASED (last N calls) or TIME_BASED (last N seconds)
            slidingWindowType: COUNT_BASED
            slidingWindowSize: 10            # evaluate the last 10 calls
            failureRateThreshold: 50         # open when the failure rate is >= 50%
            slowCallRateThreshold: 80        # open when the slow-call rate is >= 80%
            slowCallDurationThreshold: 2s    # calls longer than 2s count as slow
            waitDurationInOpenState: 10s     # stay OPEN for 10s before moving to HALF_OPEN
            permittedNumberOfCallsInHalfOpenState: 3   # 3 trial calls in HALF_OPEN
            minimumNumberOfCalls: 5          # rates are computed only after at least 5 calls

Usage:

    @CircuitBreaker(name = "orderService", fallbackMethod = "getOrderFallback")
    public OrderDTO getOrder(Long id) {
        return orderClient.getById(id);
    }

    public OrderDTO getOrderFallback(Long id, CallNotPermittedException ex) {
        return OrderDTO.empty(); // fallback when the breaker is open and rejects the call
    }

    public OrderDTO getOrderFallback(Long id, Exception ex) {
        log.error("Order service error", ex);
        return OrderDTO.empty(); // fallback when the call itself fails
    }

Rate limiter (RateLimiter)
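Controls how many calls are allowed per unit of time. The approach is token-bucket style: a fixed number of permits is made available for each refresh period, and a call that finds no permit waits or is rejected.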
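As a rough illustration of per-period permits, the sketch below resets a permit counter whenever a new period begins. It is an assumption-laden simplification, not the library's algorithm: `MiniRateLimiter` and `tryAcquire` are hypothetical names, the caller never waits for a permit (so there is no equivalent of a wait timeout), and time is passed in explicitly so the behavior is deterministic.

```java
/** Simplified per-period permit counter. Illustration only, not Resilience4j's code. */
final class MiniRateLimiter {
    private final int limitForPeriod;  // permits granted in each period
    private final long periodNanos;    // length of one refresh period
    private long currentCycle = -1;    // index of the period seen most recently
    private int available;             // permits left in the current period

    MiniRateLimiter(int limitForPeriod, long periodNanos) {
        this.limitForPeriod = limitForPeriod;
        this.periodNanos = periodNanos;
    }

    /** Returns true if a permit is available at the given time; never waits. */
    synchronized boolean tryAcquire(long nowNanos) {
        long cycle = nowNanos / periodNanos;
        if (cycle != currentCycle) {   // a new period started: reset the permits
            currentCycle = cycle;
            available = limitForPeriod;
        }
        if (available == 0) {
            return false;              // budget for this period is used up
        }
        available--;
        return true;
    }
}
```

With a limit of 2 per second, the first two calls inside a second succeed, a third is rejected, and the next second starts with a fresh budget of 2.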
Configuration:

    resilience4j:
      ratelimiter:
        instances:
          orderApi:
            limitForPeriod: 100      # 100 calls allowed per period
            limitRefreshPeriod: 1s   # permits refresh every 1s (about 100 QPS)
            timeoutDuration: 500ms   # maximum time a call waits for a permit

Usage:

    @RateLimiter(name = "orderApi", fallbackMethod = "rateLimitFallback")
    public OrderDTO createOrder(OrderDTO dto) {
        return orderService.create(dto);
    }

    public OrderDTO rateLimitFallback(OrderDTO dto, RequestNotPermitted ex) {
        throw new TooManyRequestsException("Service is busy, please try again later");
    }

Bulkhead
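Limits the number of concurrent calls so that a single slow dependency cannot exhaust the caller's threads. Two variants are available.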
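The idea of capping concurrency can be sketched with a plain `java.util.concurrent.Semaphore`. This is a minimal illustration under stated assumptions, not the library's implementation: `MiniBulkhead`, `execute` and `availableSlots` are hypothetical names, and a rejected call simply throws `IllegalStateException`.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Simplified semaphore bulkhead. Illustration only, not Resilience4j's code. */
final class MiniBulkhead {
    private final Semaphore slots;
    private final long maxWaitMillis;

    MiniBulkhead(int maxConcurrentCalls, long maxWaitMillis) {
        this.slots = new Semaphore(maxConcurrentCalls, true);
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Runs the call on the caller's thread if a slot frees up within the wait time. */
    <T> T execute(Supplier<T> call) {
        boolean acquired;
        try {
            acquired = slots.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a slot", e);
        }
        if (!acquired) {
            throw new IllegalStateException("Bulkhead is full");
        }
        try {
            return call.get();
        } finally {
            slots.release();   // always give the slot back, even if the call throws
        }
    }

    int availableSlots() {
        return slots.availablePermits();
    }
}
```

The call runs on the caller's own thread; the semaphore only decides whether it may start, which is why this style limits concurrency without isolating threads.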
Semaphore bulkhead (runs on the calling thread and only caps concurrency):

    resilience4j:
      bulkhead:
        instances:
          orderService:
            maxConcurrentCalls: 20   # at most 20 concurrent calls
            maxWaitDuration: 100ms   # wait up to 100ms for a free slot when full

Thread-pool bulkhead (runs on a dedicated thread pool for full isolation):

    resilience4j:
      thread-pool-bulkhead:
        instances:
          orderService:
            maxThreadPoolSize: 10
            coreThreadPoolSize: 5
            queueCapacity: 20

Usage:

    @Bulkhead(name = "orderService", type = Bulkhead.Type.SEMAPHORE)
    public OrderDTO getOrder(Long id) { ... }

    @Bulkhead(name = "orderService", type = Bulkhead.Type.THREADPOOL)
    public CompletableFuture<OrderDTO> getOrderAsync(Long id) { ... }

Retry
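Retry re-invokes a failed call, optionally spacing the attempts with exponential backoff. To illustrate the spacing, the sketch below assumes that the first wait equals an initial value and that every further wait is the previous one multiplied by a fixed factor. `BackoffSchedule` and `waits` are hypothetical names, and the real library may round or randomize the waits differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Computes an exponential backoff schedule. Illustration only. */
final class BackoffSchedule {
    /**
     * Returns the waits, in milliseconds, before attempts 2..maxAttempts.
     * The first attempt runs immediately, so maxAttempts = 3 gives two waits.
     */
    static List<Long> waits(int maxAttempts, long initialWaitMillis, double multiplier) {
        List<Long> waits = new ArrayList<>();
        double wait = initialWaitMillis;
        for (int attempt = 2; attempt <= maxAttempts; attempt++) {
            waits.add((long) wait);
            wait *= multiplier;   // each wait grows by the multiplier
        }
        return waits;
    }
}
```

With 3 attempts, an initial wait of 500 ms and a multiplier of 2, `BackoffSchedule.waits(3, 500, 2.0)` returns waits of 500 ms and 1000 ms, so the last attempt starts at least 1.5 s after the first failure.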
Configuration:

    resilience4j:
      retry:
        instances:
          orderService:
            maxAttempts: 3                     # at most 3 attempts, including the first call
            waitDuration: 500ms                # wait between attempts (initial value)
            enableExponentialBackoff: true
            exponentialBackoffMultiplier: 2    # backoff multiplier
            retryExceptions:
              - java.io.IOException
              - feign.RetryableException
            ignoreExceptions:
              - com.example.BusinessException  # business errors are not retried

Usage:

    @Retry(name = "orderService", fallbackMethod = "retryFallback")
    public OrderDTO getOrder(Long id) { ... }

Decorator composition (programmatic)
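Several resilience patterns can be stacked on the same call. They run from the outside in: the outermost decorator is entered first and the innermost one wraps the business call directly.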
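The outside-in order is easy to verify with plain `java.util.function.Supplier` wrappers. The sketch below uses no Resilience4j classes; `DecoratorOrder`, `layer` and `trace` are hypothetical names, and each layer only records when it is entered and left.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Shows the execution order of nested decorators. Illustration only. */
final class DecoratorOrder {
    /** Wraps inner so that entering and leaving this layer is written to the log. */
    static <T> Supplier<T> layer(String name, List<String> log, Supplier<T> inner) {
        return () -> {
            log.add("enter " + name);
            try {
                return inner.get();
            } finally {
                log.add("exit " + name);
            }
        };
    }

    /** Builds circuitBreaker(rateLimiter(retry(business))) and returns the call trace. */
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Supplier<String> business = () -> {
            log.add("business call");
            return "order";
        };
        Supplier<String> call =
            layer("circuitBreaker", log,
                layer("rateLimiter", log,
                    layer("retry", log, business)));
        call.get();
        return log;
    }
}
```

`DecoratorOrder.trace()` records the circuit breaker first, then the rate limiter, then retry, then the business call, and the layers are left in reverse order. One consequence of this particular nesting is that retries happen inside the circuit breaker, so the breaker sees one outcome per decorated call rather than one per attempt.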

    // Order, outermost first: CircuitBreaker → RateLimiter → Retry → business call.
    // The TimeLimiter is looked up but not applied here: it decorates Future or
    // CompletionStage suppliers, not a plain synchronous Supplier.
    CircuitBreaker cb = circuitBreakerRegistry.circuitBreaker("order");
    RateLimiter rl = rateLimiterRegistry.rateLimiter("order");
    Retry retry = retryRegistry.retry("order");
    TimeLimiter tl = timeLimiterRegistry.timeLimiter("order");

    Supplier<OrderDTO> supplier = CircuitBreaker.decorateSupplier(cb,
        RateLimiter.decorateSupplier(rl,
            Retry.decorateSupplier(retry,
                () -> orderClient.getById(id))));

    // Try comes from Vavr (io.vavr.control.Try)
    Try<OrderDTO> result = Try.ofSupplier(supplier)
        .recover(CallNotPermittedException.class, e -> OrderDTO.empty())
        .recover(RequestNotPermitted.class, e -> OrderDTO.empty());

Monitoring (Micrometer)
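With resilience4j-micrometer on the classpath, together with Spring Boot Actuator and a Prometheus registry, the metrics are registered automatically and can be scraped by Prometheus. Exact metric names can vary slightly with the Micrometer version:

    resilience4j_circuitbreaker_state{name="orderService",state="open"}   # 1 while the breaker is in that state, otherwise 0
    resilience4j_circuitbreaker_calls_seconds_count{kind="successful"}
    resilience4j_ratelimiter_available_permissions{name="orderApi"}
    resilience4j_retry_calls_total{kind="successful_without_retry"}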
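Resilience4j vs Sentinel
| Feature | Resilience4j | Sentinel |
|---|---|---|
| Ecosystem | Spring / Netflix | Spring Cloud Alibaba |
| Circuit breaking | Full state machine | Full state machine |
| Rate limiting | Token bucket | QPS, concurrent threads, hot-spot parameters |
| Hot-spot parameter limiting | Not supported | Supported |
| Console | None (relies on Grafana) | Standalone dashboard |
| Rule distribution | Config files / code | Dynamic push via Nacos / Apollo |
| Programming model | Functional decorators | Annotations / try-with-resources |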