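Envoy Proxy Configuration in Depth: Building a High-Performance API Gateway

An in-depth look at Envoy's core concepts and advanced configuration, covering the Listener, Cluster, Route, and Filter components, with production-grade configuration examples for traffic management, fault injection, access logging, and gRPC transcoding.

Introduction

Envoy is a graduated Cloud Native Computing Foundation (CNCF) project. As an L7 proxy and communication bus, it is widely used in service mesh and API gateway deployments. This article walks through Envoy's core configuration and advanced usage.

Core Envoy Concepts

Envoy configuration hierarchy: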

┌─────────────────────────────────────┐
│           Admin Interface           │  Administration endpoint
├─────────────────────────────────────┤
│              Listeners              │  Listeners (accept connections)
├─────────────────────────────────────┤
│          Filter Chains              │  Filter chains (process requests)
├─────────────────────────────────────┤
│               Routes                │  Routing rules
├─────────────────────────────────────┤
│              Clusters               │  Clusters (backend services)
├─────────────────────────────────────┤
│             Endpoints               │  Endpoints (individual instances)
└─────────────────────────────────────┘

Basic Configuration

Minimal Configuration Example

# envoy.yaml
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: backend
                      domains: ["*"]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: service_backend
                
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  
  clusters:
    - name: service_backend
      connect_timeout: 0.25s
      type: STRICT_DNS
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: service_backend
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: backend-service
                      port_value: 8080

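# The admin interface is unauthenticated. Outside of local testing, bind it
# to 127.0.0.1 or restrict network access to this port.
admin:
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901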

Advanced Routing Configuration

Header-Based Routing

route_config:
  name: advanced_routing
  virtual_hosts:
    - name: api_service
      domains: ["api.example.com"]
      routes:
        # Route to the canary version based on a header
        - match:
            prefix: "/api/v1/users"
            headers:
              - name: x-canary
                string_match:
                  exact: "true"
          route:
            cluster: user_service_canary
            timeout: 30s
        
        # Route to a specific API version based on a header
        - match:
            prefix: "/api/"
            headers:
              - name: x-api-version
                string_match:
                  exact: "v2"
          route:
            cluster: api_service_v2
        
        # Default route (routes are evaluated in order; the first match wins)
        - match:
            prefix: "/api/"
          route:
            cluster: api_service_v1
            retry_policy:
              retry_on: "5xx,connect-failure,reset"
              num_retries: 3
              per_try_timeout: 10s

Weighted Routing (Traffic Splitting)

route_config:
  virtual_hosts:
    - name: payment_service
      domains: ["payment.example.com"]
      routes:
        - match:
            prefix: "/process"
          route:
            weighted_clusters:
              clusters:
                - name: payment_v1
                  weight: 90
                - name: payment_v2
                  weight: 10
              # total_weight is deprecated; Envoy uses the sum of the cluster weights
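
Weighted clusters are one way to split traffic. A rough alternative, assuming the same payment_v1 and payment_v2 clusters, is to match a fraction of requests with runtime_fraction, which lets the percentage be overridden through Envoy's runtime layer without editing the route table. The runtime key below is a hypothetical name chosen for illustration.

```yaml
routes:
  # Roughly 10% of matching requests take this route.
  - match:
      prefix: "/process"
      runtime_fraction:
        default_value:
          numerator: 10
          denominator: HUNDRED
        # Hypothetical runtime key; overriding it changes the fraction.
        runtime_key: routing.traffic_shift.payment
    route:
      cluster: payment_v2
  # Requests not selected above fall through to the stable version.
  - match:
      prefix: "/process"
    route:
      cluster: payment_v1
```

Because routes are evaluated in order, the fractional route has to come before the catch-all route for the same prefix.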

Route Rewrites and Redirects

routes:
  # URL prefix rewrite
  - match:
      prefix: "/old-api/"
    route:
      cluster: new_api_service
      prefix_rewrite: "/api/v2/"
  
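  # Regex rewrite
  # (the v3 API has no plain `regex` match field; use safe_regex.
  # Older Envoy releases also require `google_re2: {}` inside safe_regex.)
  - match:
      safe_regex:
        regex: "^/users/([0-9]+)/profile$"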
    route:
      cluster: user_service
      regex_rewrite:
        pattern:
          regex: "^/users/([0-9]+)/profile$"
        substitution: "/api/users/\\1"
  
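  # Redirect
  # (path_redirect replaces the entire path, so every URL under /legacy/
  # is redirected to exactly /modern/)
  - match:
      prefix: "/legacy/"
    redirect:
      path_redirect: "/modern/"
      response_code: MOVED_PERMANENTLY  # 301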
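
Two other redirect variants are often useful alongside path_redirect. The sketch below assumes the same /legacy/ and /modern/ paths as above and a listener that receives plain HTTP; treat it as illustrative rather than a drop-in configuration.

```yaml
routes:
  # Redirect while keeping the rest of the path:
  # /legacy/orders/42 -> /modern/orders/42
  - match:
      prefix: "/legacy/"
    redirect:
      prefix_rewrite: "/modern/"
      response_code: MOVED_PERMANENTLY

  # Redirect plain-HTTP requests to HTTPS, keeping host and path.
  - match:
      prefix: "/"
    redirect:
      https_redirect: true
      response_code: MOVED_PERMANENTLY
```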

Advanced Cluster Configuration

Circuit Breakers

clusters:
  - name: order_service
    connect_timeout: 0.5s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    circuit_breakers:
      thresholds:
        - priority: DEFAULT
          max_connections: 1024
          max_pending_requests: 1024
          max_requests: 1024
          max_retries: 3
          track_remaining: true
        - priority: HIGH
          max_connections: 2048
          max_pending_requests: 2048
          max_requests: 2048
    
    # Outlier detection (automatically ejects failing hosts)
    outlier_detection:
      consecutive_5xx: 5
      interval: 10s
      base_ejection_time: 30s
      max_ejection_percent: 50
      enforcing_consecutive_5xx: 100
      enforcing_success_rate: 100
      success_rate_minimum_hosts: 3
      success_rate_request_volume: 100
      success_rate_stdev_factor: 1900
    
    load_assignment:
      cluster_name: order_service
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: order-1
                    port_value: 8080
              load_balancing_weight: 100
            - endpoint:
                address:
                  socket_address:
                    address: order-2
                    port_value: 8080
              load_balancing_weight: 100
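
The max_retries threshold above caps concurrent retries at a fixed number. As a sketch of an alternative, a retry budget expresses the cap as a percentage of active requests. The values below are illustrative assumptions, not tuned recommendations.

```yaml
circuit_breakers:
  thresholds:
    - priority: DEFAULT
      max_connections: 1024
      max_pending_requests: 1024
      max_requests: 1024
      # When retry_budget is set, it is used instead of max_retries.
      retry_budget:
        budget_percent:
          value: 20.0               # retries may use up to 20% of active requests
        min_retry_concurrency: 3    # always allow at least 3 concurrent retries
```

A percentage-based budget scales with load, which can limit retry storms under heavy traffic while still allowing a few retries when traffic is light.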

Health Checks

clusters:
  - name: backend_service
    health_checks:
      - timeout: 5s
        interval: 10s
        unhealthy_threshold: 3
        healthy_threshold: 2
        http_health_check:
          path: "/health"
          expected_statuses:
            - start: 200
              end: 201  # end is exclusive, so this range matches only 200
    
    # Load balancing policy
    lb_policy: LEAST_REQUEST
    least_request_lb_config:
      choice_count: 2  # Power of two choices (P2C)
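
For gRPC upstreams, the HTTP probe can be swapped for a gRPC health check, which calls the standard grpc.health.v1.Health service. This is a minimal sketch: it assumes the backend implements that service, and the service_name value is a hypothetical example.

```yaml
clusters:
  - name: grpc_service
    health_checks:
      - timeout: 5s
        interval: 10s
        unhealthy_threshold: 3
        healthy_threshold: 2
        grpc_health_check:
          # Optional. Assumed name that the backend registers with its
          # health service; omit it to check overall server health.
          service_name: "bookstore.BookstoreService"
```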

Filter Configuration

Rate Limiting

http_filters:
  - name: envoy.filters.http.ratelimit
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
      domain: api_gateway
      stage: 0
      timeout: 0.5s
      failure_mode_deny: false
      rate_limit_service:
        grpc_service:
          envoy_grpc:
            cluster_name: ratelimit_service
          timeout: 0.25s
        transport_api_version: V3
  
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

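The filter above only forwards descriptors to the rate limit service; routes or virtual hosts still have to say which descriptors to emit, and the ratelimit_service cluster must speak HTTP/2 for gRPC. The sketch below shows one possible shape. The x-api-key header, the api_key descriptor key, and the ratelimit hostname and port are assumptions made for illustration.

```yaml
routes:
  - match:
      prefix: "/api/"
    route:
      cluster: api_service_v1
      rate_limits:
        - stage: 0                      # matches the filter's stage
          actions:
            # Emits the descriptor ("api_key", <header value>).
            - request_headers:
                header_name: x-api-key  # hypothetical client header
                descriptor_key: api_key

clusters:
  - name: ratelimit_service
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    # gRPC requires HTTP/2 to the upstream.
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}
    load_assignment:
      cluster_name: ratelimit_service
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: ratelimit  # hypothetical hostname
                    port_value: 8081    # hypothetical port
```

The actual limits (for example, requests per minute per api_key) are defined in the rate limit service's own configuration under the api_gateway domain, not in Envoy.
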
CORS Configuration

http_filters:
  - name: envoy.filters.http.cors
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors

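# Configure the CORS policy on the virtual host
virtual_hosts:
  - name: api
    domains: ["api.example.com"]
    cors:
      allow_origin_string_match:
        - exact: "https://www.example.com"
        # A StringMatcher entry accepts only one match type, so prefix and
        # suffix cannot be combined; a regex expresses the same intent.
        - safe_regex:
            regex: "https://[a-zA-Z0-9.-]+\\.example\\.com"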
      allow_methods: "GET, POST, PUT, DELETE, OPTIONS"
      allow_headers: "Content-Type, Authorization, X-Request-ID"
      expose_headers: "X-Request-ID"
      max_age: "3600"
      allow_credentials: true
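
Recent Envoy releases deprecate the cors field on virtual hosts and routes in favor of per-filter configuration. Assuming a release that ships the CorsPolicy message in the CORS filter extension, the same policy might be written roughly as follows; check the API reference for the Envoy version in use before relying on it.

```yaml
virtual_hosts:
  - name: api
    domains: ["api.example.com"]
    typed_per_filter_config:
      # The key must match the name of the CORS filter in http_filters.
      envoy.filters.http.cors:
        "@type": type.googleapis.com/envoy.extensions.filters.http.cors.v3.CorsPolicy
        allow_origin_string_match:
          - exact: "https://www.example.com"
        allow_methods: "GET, POST, PUT, DELETE, OPTIONS"
        allow_headers: "Content-Type, Authorization, X-Request-ID"
        expose_headers: "X-Request-ID"
        max_age: "3600"
        allow_credentials: true
```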

JWT Authentication

http_filters:
  - name: envoy.filters.http.jwt_authn
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
      providers:
        auth0_provider:
          issuer: "https://your-domain.auth0.com/"
          audiences:
            - "https://api.example.com"
          remote_jwks:
            http_uri:
              uri: "https://your-domain.auth0.com/.well-known/jwks.json"
              cluster: auth0_cluster
              timeout: 1s
            cache_duration:
              seconds: 300
          forward: true
          forward_payload_header: "x-jwt-payload"
      
      rules:
        - match:
            prefix: "/api/public"
        - match:
            prefix: "/api/"
          requires:
            provider_name: auth0_provider
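
The remote_jwks block refers to a cluster named auth0_cluster, which has to be defined separately and must use TLS because the JWKS URI is HTTPS. A minimal sketch follows; the CA bundle path is an assumption (it is the usual location on Debian-based images) and should be adjusted to the runtime image.

```yaml
clusters:
  - name: auth0_cluster
    connect_timeout: 1s
    type: LOGICAL_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: auth0_cluster
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: your-domain.auth0.com
                    port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        sni: your-domain.auth0.com
        common_tls_context:
          validation_context:
            # Assumed CA bundle path; without a validation context the
            # server certificate is not verified.
            trusted_ca:
              filename: /etc/ssl/certs/ca-certificates.crt
```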

gRPC Transcoding

gRPC-Web Support

http_filters:
  - name: envoy.filters.http.grpc_web
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
  
  - name: envoy.filters.http.cors
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors
  
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

clusters:
  - name: grpc_service
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
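    # Enable HTTP/2 to the upstream (this replaces the deprecated
    # cluster-level http2_protocol_options field)
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}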
    load_assignment:
      cluster_name: grpc_service
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: grpc-backend
                    port_value: 50051

REST-to-gRPC Transcoding

http_filters:
  - name: envoy.filters.http.grpc_json_transcoder
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_json_transcoder.v3.GrpcJsonTranscoder
      proto_descriptor: "/etc/envoy/proto_descriptor.pb"
      services:
        - "bookstore.BookstoreService"
      print_options:
        add_whitespace: true
        always_print_primitive_fields: true
      convert_grpc_status: true

// bookstore.proto
syntax = "proto3";

package bookstore;

import "google/api/annotations.proto";

service BookstoreService {
  rpc GetBook(GetBookRequest) returns (Book) {
    option (google.api.http) = {
      get: "/v1/shelves/{shelf}/books/{book}"
    };
  }
  
  rpc CreateBook(CreateBookRequest) returns (Book) {
    option (google.api.http) = {
      post: "/v1/shelves/{shelf}/books"
      body: "book"
    };
  }
  
  rpc ListBooks(ListBooksRequest) returns (ListBooksResponse) {
    option (google.api.http) = {
      get: "/v1/shelves/{shelf}/books"
    };
  }
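}

// The message definitions are not part of the original listing. The fields
// below are assumptions inferred from the URL templates and the "book" body.
message Book {
  int64 id = 1;
  string author = 2;
  string title = 3;
}

message GetBookRequest {
  int64 shelf = 1;
  int64 book = 2;
}

message CreateBookRequest {
  int64 shelf = 1;
  Book book = 2;
}

message ListBooksRequest {
  int64 shelf = 1;
}

message ListBooksResponse {
  repeated Book books = 1;
}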
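
One detail that is easy to miss: after transcoding, the request carries the gRPC path /<package>.<service>/<method> and the POST method, so the route table should match the gRPC path rather than the incoming REST path. A sketch of such a route, assuming the upstream is the grpc_service cluster defined in the gRPC-Web example:

```yaml
routes:
  # Matches both native gRPC calls and transcoded REST calls such as
  # GET /v1/shelves/1/books/2, which becomes
  # POST /bookstore.BookstoreService/GetBook.
  - match:
      prefix: "/bookstore.BookstoreService"
    route:
      cluster: grpc_service   # assumed upstream; must speak HTTP/2
      timeout: 60s
```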

Access Logging and Tracing

Structured Access Logs

http_connection_manager:
  access_log:
    - name: envoy.access_loggers.file
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
        path: "/dev/stdout"
        log_format:
          json_format:
            timestamp: "%START_TIME%"
            method: "%REQ(:METHOD)%"
            path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
            protocol: "%PROTOCOL%"
            response_code: "%RESPONSE_CODE%"
            response_flags: "%RESPONSE_FLAGS%"
            bytes_received: "%BYTES_RECEIVED%"
            bytes_sent: "%BYTES_SENT%"
            duration: "%DURATION%"
            upstream_service_time: "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%"
            upstream_host: "%UPSTREAM_HOST%"
            request_id: "%REQ(X-REQUEST-ID)%"
            user_agent: "%REQ(USER-AGENT)%"
            trace_id: "%REQ(X-B3-TRACEID)%"
  
  # Request tracing
  tracing:
    provider:
      name: envoy.tracers.zipkin
      typed_config:
        "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
        collector_cluster: zipkin
        collector_endpoint: "/api/v2/spans"
        collector_endpoint_version: HTTP_JSON
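
Logging every request can be noisy on a busy gateway. An access logger can carry a filter so that only selected requests are written. The sketch below logs only responses with status 500 or higher to a separate file; the file path and the runtime key are hypothetical names chosen for illustration.

```yaml
access_log:
  - name: envoy.access_loggers.file
    # Only log responses whose status code is >= 500.
    filter:
      status_code_filter:
        comparison:
          op: GE
          value:
            default_value: 500
            # Hypothetical runtime key; allows changing the threshold at runtime.
            runtime_key: access_log.errors.min_status
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
      path: "/var/log/envoy/errors.log"   # hypothetical path
```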

Fault Injection (for Testing)

http_filters:
  - name: envoy.filters.http.fault
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault
      abort:
        http_status: 503
        percentage:
          numerator: 10
          denominator: HUNDRED
      delay:
        fixed_delay: 2s
        percentage:
          numerator: 20
          denominator: HUNDRED
      # Faults are injected only for requests that carry this header
      headers:
        - name: x-envoy-fault
          string_match:
            exact: "true"
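
Instead of fixing the delay and status code in the configuration, the fault filter can also take them from request headers, which is convenient for ad hoc tests. A minimal sketch, assuming the filter is placed before the router filter as in the examples above:

```yaml
http_filters:
  - name: envoy.filters.http.fault
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault
      delay:
        # Delay in milliseconds is read from x-envoy-fault-delay-request.
        header_delay: {}
        percentage:
          numerator: 100
          denominator: HUNDRED
      abort:
        # HTTP status is read from x-envoy-fault-abort-request.
        header_abort: {}
        percentage:
          numerator: 100
          denominator: HUNDRED
```

With this in place, a request sent with x-envoy-fault-delay-request: 2000 would be delayed by about two seconds, and one sent with x-envoy-fault-abort-request: 503 would be aborted with a 503. Requests without these headers pass through unaffected. Because any client can set these headers, this setup is best kept to test environments.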

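Summary

Envoy configuration best practices:

  1. Modular configuration: manage configuration dynamically through the xDS APIs
  2. Circuit breaking: configure circuit breakers for every cluster
  3. Health checking: enable both active health checks and passive ones (outlier detection)
  4. Observability: configure structured logs, metrics, and tracing
  5. Security hardening: enable mTLS, JWT authentication, and CORS policies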
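
All examples in this article use static_resources. For the first practice, a bootstrap that fetches listeners and clusters from a management server over ADS might look roughly like the sketch below. The node identifiers and the xds-server hostname and port are assumptions, and a control plane implementing the xDS APIs is required on the other end.

```yaml
node:
  id: gateway-1          # hypothetical node ID
  cluster: api-gateway   # hypothetical service cluster name

dynamic_resources:
  ads_config:
    api_type: GRPC
    transport_api_version: V3
    grpc_services:
      - envoy_grpc:
          cluster_name: xds_cluster
  # Listeners and clusters are both delivered over the ADS stream.
  lds_config:
    resource_api_version: V3
    ads: {}
  cds_config:
    resource_api_version: V3
    ads: {}

static_resources:
  clusters:
    # The management server itself still has to be defined statically.
    - name: xds_cluster
      connect_timeout: 1s
      type: STRICT_DNS
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_config:
            http2_protocol_options: {}
      load_assignment:
        cluster_name: xds_cluster
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: xds-server   # hypothetical hostname
                      port_value: 18000     # hypothetical port
```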

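Key advantages:

  • High performance (C++ implementation that avoids unnecessary buffer copies)
  • Rich L7 feature set
  • Cloud-native design (xDS APIs)
  • Active community and ecosystem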
