
Handling a Kafka cluster failure on Huawei Cloud: "120000 ms has passed since batch creation"


Symptoms:

    The producer logs contained a large number of timeouts:

    2022-02-17 09:29:41,692 [kafka-producer-network-thread | monolith-rule-engine-xm2m-IOT-0003] WARN  o.t.s.q.k.TbKafkaProducerTemplate - Producer template failure: Expiring 2 record(s) for tb_rule_engine.main.0-0:120000 ms has passed since batch creation
org.apache.kafka.common.errors.TimeoutException: Expiring 2 record(s) for tb_rule_engine.main.0-0:120000 ms has passed since batch creation
2022-02-17 09:29:41,692 [kafka-producer-network-thread | monolith-rule-engine-xm2m-IOT-0003] WARN  o.t.s.q.k.TbKafkaProducerTemplate - Producer template failure: Expiring 2 record(s) for tb_rule_engine.main.0-0:120000 ms has passed since batch creation
org.apache.kafka.common.errors.TimeoutException: Expiring 2 record(s) for tb_rule_engine.main.0-0:120000 ms has passed since batch creation
2022-02-17 09:29:42,167 [tb-rule-engine-consumer-29-thread-3] INFO  o.a.k.clients.FetchSessionHandler - [Consumer clientId=re-Main-consumer-xm2m-IOT-0003, groupId=re-Main-consumer-xm2m-IOT-0003] Error sending fetch request (sessionId=1512270209, epoch=INITIAL) to node 2: org.apache.kafka.common.errors.DisconnectException.
2022-02-17 09:29:51,395 [kafka-producer-network-thread | monolith-transport-api-producer-xm2m-IOT-0003] WARN  o.t.s.q.k.TbKafkaProducerTemplate - Producer template failure: Expiring 4 record(s) for tb_transport.api.responses.xm2m_transport_01-0:120000 ms has passed since batch creation
org.apache.kafka.common.errors.TimeoutException: Expiring 4 record(s) for tb_transport.api.responses.xm2m_transport_01-0:120000 ms has passed since batch creation
2022-02-17 09:29:51,395 [kafka-producer-network-thread | monolith-transport-api-producer-xm2m-IOT-0003] WARN  o.t.s.q.k.TbKafkaProducerTemplate - Producer template failure: Expiring 4 record(s) for tb_transport.api.responses.xm2m_transport_01-0:120000 ms has passed since batch creation
org.apache.kafka.common.errors.TimeoutException: Expiring 4 record(s) for tb_transport.api.responses.xm2m_transport_01-0:120000 ms has passed since batch creation

  There was also this log line:

  [2022-02-17 09:20:18,494] ERROR Error while creating ephemeral at /brokers/ids/0, node already exists and owner '179866866520031379' does not match current session '251925893726535682' (kafka.zk.KafkaZkClient$CheckedEphemeral)

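Analysis:

  1. Running kafka-topics.sh --list showed nothing wrong.

  2. I suspected that the Kafka service on one of the nodes had gone down, but checking the processes showed nothing wrong.

  3. That left the configuration file, where I found: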

    # The address the socket server listens on. It will get the value returned from
    # java.net.InetAddress.getCanonicalHostName() if not configured.
    #   FORMAT:
    #     listeners = listener_name://host_name:port
    #   EXAMPLE:
    #     listeners = PLAINTEXT://your.host.name:9092
    listeners=PLAINTEXT://192.168.0.227:9092
    # Hostname and port the broker will advertise to producers and consumers. If not set
    # it uses the value for "listeners" if configured.  Otherwise, it will use the value
    # returned from java.net.InetAddress.getCanonicalHostName().
    advertised.listeners=PLAINTEXT://120.13.124.213:9092

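listeners and advertised.listeners do not match.

One is a private (intranet) address, the other a public address. Clients and the other brokers connect to a broker through the address it advertises, so that traffic was going over the public network.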
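The mismatch can also be spotted mechanically. The following is a minimal Python sketch, not part of the original troubleshooting: it parses the text of a server.properties file and reports every advertised host that does not appear in listeners, together with whether it is a private or public IP. The function names (read_props, listener_hosts, address_kind, advertised_mismatches) are made up for illustration.

```python
import ipaddress


def read_props(text):
    """Parse key=value lines of a properties file, skipping comments and blanks."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def listener_hosts(value):
    """Return the host part of each comma-separated entry like NAME://host:port."""
    hosts = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        _, _, addr = item.partition("://")
        host, _, _ = addr.rpartition(":")
        hosts.append(host)
    return hosts


def address_kind(host):
    """Classify a host as 'private' or 'public' IP, or 'hostname' if it is not an IP."""
    try:
        return "private" if ipaddress.ip_address(host).is_private else "public"
    except ValueError:
        return "hostname"


def advertised_mismatches(text):
    """List (host, kind) for advertised hosts that are absent from listeners."""
    props = read_props(text)
    bound = set(listener_hosts(props.get("listeners", "")))
    advertised = listener_hosts(props.get("advertised.listeners", ""))
    return [(h, address_kind(h)) for h in advertised if h not in bound]
```

Applied to the two settings shown above, advertised_mismatches reports the host 120.13.124.213 as a public address. A mismatch is not an error in itself (brokers that serve external clients often advertise a different address), so the result is only a hint about where to look.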

Pinging between the nodes over the public address configured in advertised.listeners showed a high packet loss rate.
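ping only exercises ICMP, while Kafka traffic is TCP on the broker port. As a complementary check, here is a small Python sketch (a substitute technique, not what was run in this incident) that attempts repeated TCP connections to a broker address and returns the fraction that fail. The function name connect_failure_rate and its default parameters are assumptions chosen for illustration.

```python
import socket


def connect_failure_rate(host, port, attempts=10, timeout=1.0):
    """Try `attempts` TCP connections to host:port and return the failed fraction."""
    failures = 0
    for _ in range(attempts):
        try:
            # create_connection raises OSError on timeout, refusal or no route
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            failures += 1
    return failures / attempts


# Example (run from one broker node against another node's advertised address):
#   connect_failure_rate("120.13.124.213", 9092)
```

A result well above 0.0 on the advertised address, compared with 0.0 on the private address, would point at the same network path problem that ping revealed.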

So I changed advertised.listeners to the private address.
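With several nodes to edit, the change can be scripted. The sketch below is one hypothetical way to do it in Python: it rewrites the host of every advertised.listeners entry in the text of a server.properties file while keeping the listener name and port. The function name set_advertised_host is invented here, and each node needs its own private address passed in.

```python
def set_advertised_host(text, new_host):
    """Return `text` with the host of each advertised.listeners entry replaced."""

    def fix_line(line):
        key, sep, value = line.partition("=")
        # Leave comments and every other property untouched
        if not sep or key.strip() != "advertised.listeners":
            return line
        entries = []
        for item in value.split(","):
            name, _, addr = item.strip().partition("://")
            _, _, port = addr.rpartition(":")
            entries.append(f"{name}://{new_host}:{port}")
        return "advertised.listeners=" + ",".join(entries)

    return "\n".join(fix_line(line) for line in text.splitlines())
```

For the node shown above, set_advertised_host(config_text, "192.168.0.227") yields the line advertised.listeners=PLAINTEXT://192.168.0.227:9092. Note that the sketch drops a trailing newline, so add one back when writing the file.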

I then restarted Kafka on each node.

That resolved the problem.
