
Flink Custom Kafka Source


Add the dependency

<dependency>
	<groupId>org.apache.flink</groupId>
	<artifactId>flink-connector-kafka_2.12</artifactId>
	<version>1.13.2</version>
	<scope>provided</scope>
</dependency>

Dependency JARs required when submitting and running the job on a Flink cluster

Before submitting a job to the Flink server, first upload the dependency JARs to Flink's lib directory and then restart the Flink service so the JARs are loaded; otherwise a ClassNotFoundException will be thrown.

  • flink-connector-kafka_2.12-1.13.2.jar
  • kafka-clients-2.4.1.jar

Notes before starting

Make sure the topic actually exists in Kafka; otherwise the following runtime exception will occur:

  • Execution logic: the consumer first fetches the full topic list from Kafka, then applies the regex pattern to obtain the matching topics. Debugging showed that when the topic does not exist, the fetched topic list is null, which produces the exception below. Create the corresponding topic and the job will run normally after the next restart (one way to create it is sketched after the stack trace).
java.lang.RuntimeException: Unable to retrieve any partitions with KafkaTopicsDescriptor: Topic Regex Pattern (WYSXT_47_(.+)_47_other_47_property_47_post)
	at org.apache.flink.streaming.connectors.kafka.internals.AbstractPartitionDiscoverer.discoverPartitions(AbstractPartitionDiscoverer.java:156)
	at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.open(FlinkKafkaConsumerBase.java:577)
	at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34)
	at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
	at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:442)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:582)
	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:562)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:537)
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
	at java.lang.Thread.run(Thread.java:748)
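If the job fails with this exception, create the topic before restarting. Below is a minimal, hypothetical sketch using the AdminClient from kafka-clients; the broker address, topic name, partition count, and replication factor are placeholders to adjust for your cluster.

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class TopicCreator {
	public static void main(String[] args) throws Exception {
		Properties props = new Properties();
		// Placeholder broker address
		props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
		try (AdminClient admin = AdminClient.create(props)) {
			// Hypothetical topic name matching the regex above; 1 partition, replication factor 1
			NewTopic topic = new NewTopic("WYSXT_47_demo_47_other_47_property_47_post", 1, (short) 1);
			// Block until the topic is actually created
			admin.createTopics(Collections.singleton(topic)).all().get();
		}
	}
}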

Build the KafkaSource parameter instance

import java.io.Serializable;

import com.alibaba.fastjson.JSONObject; // JSON parsing (fastjson, inferred from the parseObject/getString usage)

/**
 * Kafka source parameter instance
 * @author yinlilan
 *
 */
public class KafkaSource implements Serializable {

	private static final long serialVersionUID = 6060562931782343343L;

	private String bootStrapServers;
	
	private String groupId;
	
	private String topic;
	
	public String getBootStrapServers() {
		return bootStrapServers;
	}

	public String getGroupId() {
		return groupId;
	}
	
	public String getTopic() {
		return topic;
	}

	public KafkaSource(Object obj) {
		final JSONObject json = JSONObject.parseObject(obj.toString());
		this.bootStrapServers = json.getString("bootStrapServers");
		this.groupId = json.getString("groupId");
		this.topic = json.getString("topic");
	}
	
}
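As a quick sanity check of the constructor, a hypothetical configuration string can be parsed like this (the values are made up; the keys must match those read in the constructor):

// Hypothetical JSON configuration; keys must match those parsed in KafkaSource
String config = "{\"bootStrapServers\":\"localhost:9092\","
		+ "\"groupId\":\"demo-group\","
		+ "\"topic\":\"WYSXT_47_(.+)_47_other_47_property_47_post\"}";
KafkaSource source = new KafkaSource(config);
System.out.println(source.getTopic()); // prints the topic regex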

Build the custom KafkaMQSource

The Kafka source is implemented on top of the FlinkKafkaConsumer&lt;T&gt; class. The KafkaDeserializationSchema&lt;T&gt; handles data deserialization, so you can assemble each record into whatever shape you want before passing it downstream.

import java.io.Serializable;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.consumer.ConsumerRecord;

/**
 * Kafka source initialization
 * @author yinlilan
 *
 */
public class KafkaMessageSource implements Serializable {
	
	private static final long serialVersionUID = -1128615689349479275L;
	
	private FlinkKafkaConsumer<Map<String, String>> consumer;
	
	public KafkaMessageSource(final String bootStrapServers, final String groupId, final String topic){
    	Properties properties = new Properties();
    	properties.setProperty(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, bootStrapServers);
    	// The Flink Kafka consumer can discover dynamically created Kafka partitions and consumes them with exactly-once guarantees
    	properties.setProperty("flink.partition-discovery.interval-millis", "10000");
    	properties.setProperty("group.id", groupId);
    	
    	
		// Custom deserialization schema: extract the topic and value from each record
		final KafkaDeserializationSchema<Map<String, String>> deserializer = new KafkaDeserializationSchema<Map<String, String>>(){
			
			private static final long serialVersionUID = 1574406844851249992L;
			
			private String encoding = "UTF-8";
    		
			@Override
			public TypeInformation<Map<String, String>> getProducedType() {
				return TypeInformation.of(new TypeHint<Map<String, String>>(){});
			}

			@Override
			public boolean isEndOfStream(Map<String, String> nextElement) {
				return false;
			}

			@Override
			public Map<String, String> deserialize(ConsumerRecord<byte[], byte[]> record) throws Exception {
				final Map<String, String> result = new ConcurrentHashMap<>();
				result.put("topic", record.topic());
				result.put("value", new String(record.value(), encoding));
				return result;
			}
    	};
    			
    	// Build the source, subscribing to all topics that match the regex pattern
    	Pattern pattern = Pattern.compile(topic);
    	consumer = new FlinkKafkaConsumer<>(pattern, deserializer, properties);
	}

	public FlinkKafkaConsumer<Map<String, String>> getConsumer() {
		return consumer;
	}
}
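To wire the consumer into a streaming job, something like the following sketch should work; the connection parameters and job name are placeholders, and the print sink is only for demonstration.

import java.util.Map;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaSourceJob {
	public static void main(String[] args) throws Exception {
		StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

		// Placeholder connection parameters; the topic argument is a regex, as in KafkaMessageSource
		KafkaMessageSource source = new KafkaMessageSource(
				"localhost:9092", "demo-group", "WYSXT_47_(.+)_47_other_47_property_47_post");

		// Each element is a Map with "topic" and "value" keys, per the deserializer above
		DataStream<Map<String, String>> stream = env.addSource(source.getConsumer());
		stream.print();

		env.execute("kafka-regex-source-demo");
	}
}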