3. Flink Cluster Setup

Flink can be deployed in several ways:

Local, Standalone (low resource utilization), Yarn, Mesos, Docker, Kubernetes, and AWS.

This section focuses on deploying a Flink cluster in Standalone mode and in Yarn mode.

3.1 Standalone Mode Installation

1. Software requirements

· Java 1.8.x or a later version,

· ssh (sshd must be running so that the Flink scripts that manage remote components can work)

Cluster deployment plan

2. Extract the archive

tar -zxvf flink-1.6.1-bin-hadoop28-scala_2.11.tgz -C /opt/module/

3. Edit the configuration files

Edit flink/conf/masters, slaves, and flink-conf.yaml:

[root@bigdata11 conf]$ sudo vi masters

bigdata11:8081

[root@bigdata11 conf]$ sudo vi slaves

bigdata12

bigdata13

[root@bigdata11 conf]$ sudo vi flink-conf.yaml

taskmanager.numberOfTaskSlots: 2    # around line 52 of the file

jobmanager.rpc.address: bigdata11   # around line 33 of the file

Optional settings (a sample snippet follows this list):

· the amount of memory available to each JobManager (jobmanager.heap.mb),

· the amount of memory available to each TaskManager (taskmanager.heap.mb),

· the number of available CPUs per machine (taskmanager.numberOfTaskSlots),

· the total number of CPUs in the cluster (parallelism.default), and

· the temporary directories (taskmanager.tmp.dirs).
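A minimal sketch of how these optional entries might look in flink-conf.yaml (the values below are illustrative only):

jobmanager.heap.mb: 1024            # memory available to each JobManager (MB)
taskmanager.heap.mb: 2048           # memory available to each TaskManager (MB)
taskmanager.numberOfTaskSlots: 2    # slots per machine, usually one per CPU core
parallelism.default: 4              # default parallelism for jobs
taskmanager.tmp.dirs: /tmp          # temporary working directories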

4. Copy the installation to the other nodes

[root@bigdata11 module]$ scp -r flink-1.6.1/ itstar@bigdata12:`pwd`

[root@bigdata11 module]$ scp -r flink-1.6.1/ itstar@bigdata13:`pwd`

5. Configure environment variables

Configure the Flink environment variables on every node:

[root@bigdata11 flink-1.6.1]$ vi /etc/profile

export FLINK_HOME=/opt/module/flink-1.6.1

export PATH=$PATH:$FLINK_HOME/bin

[root@bigdata11 flink-1.6.1]$ source /etc/profile

6. Start Flink

[itstar@bigdata11 flink-1.6.1]$ ./bin/start-cluster.sh

Starting cluster.

Starting standalonesession daemon on host bigdata11.

Starting taskexecutor daemon on host bigdata12.

Starting taskexecutor daemon on host bigdata13.

Check the processes with jps.
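A rough sketch of what jps should report (the PIDs are placeholders; the process names are those used by Flink 1.6's standalone scripts):

[itstar@bigdata11 flink-1.6.1]$ jps
2884 StandaloneSessionClusterEntrypoint    # the JobManager on bigdata11

[itstar@bigdata12 flink-1.6.1]$ jps
2991 TaskManagerRunner                     # a TaskManager on bigdata12 (and likewise on bigdata13)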

7. Check the WebUI

http://bigdata11:8081

8. Run a test job

[itstar@bigdata11 flink-1.6.1]$ bin/flink run -m bigdata11:8081 ./examples/batch/WordCount.jar --input /opt/module/datas/word.txt

[itstar@bigdata11 flink-1.6.1]$ bin/flink run -m bigdata11:8081 ./examples/batch/WordCount.jar --input hdfs:///LICENSE.txt --output hdfs:///out
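To inspect the result of the second job, read the output back from HDFS (a sketch; depending on the job parallelism, /out may be a single file or a directory of part files):

hdfs dfs -cat /out
hdfs dfs -cat /out/*    # if /out is a directory of part files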

9. Flink HA

First, recall that Flink has two deployment modes: Standalone and Yarn Cluster. In Standalone mode, Flink must rely on ZooKeeper for JobManager HA (ZooKeeper has become an indispensable building block for HA in most open-source frameworks). With ZooKeeper's help, a Standalone Flink cluster runs several JobManagers at the same time, only one of which is active while the others stand by. When the active JobManager is lost (for example it crashes or the machine goes down), ZooKeeper elects a new leader from the standbys to take over the cluster.

In Yarn Cluster mode, Flink relies on Yarn itself for JobManager HA; the mechanism is entirely Yarn's. In this mode both the JobManager and the TaskManagers are started by Yarn inside Yarn containers, and the JobManager should really be called the Flink Application Master, so its failure recovery depends entirely on Yarn's ResourceManager (just like the MapReduce AppMaster). Because this is fully delegated to Yarn, different Yarn versions may behave slightly differently; we do not go into the details here.
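As a sketch of the knobs involved (values are only examples): Yarn caps how many times it will restart a failed ApplicationMaster via yarn.resourcemanager.am.max-attempts in yarn-site.xml, and Flink's own retry count is set with yarn.application-attempts in flink-conf.yaml, which should not exceed the Yarn limit.

<!-- yarn-site.xml -->
<property>
  <name>yarn.resourcemanager.am.max-attempts</name>
  <value>4</value>
</property>

# flink-conf.yaml
yarn.application-attempts: 4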

1) Edit the configuration files

Edit flink-conf.yaml. In HA mode the JobManager address does not need to be set here; the JobManagers are listed in the masters file, and ZooKeeper elects the leader and the standbys.

#jobmanager.rpc.address: bigdata11

high-availability: zookeeper
#selects the high-availability mode (required; around line 73 of the file)

high-availability.zookeeper.quorum: bigdata11:2181,bigdata12:2181,bigdata13:2181
#the ZooKeeper quorum is a replicated group of ZooKeeper servers that provides the distributed coordination service (required; around line 88)

high-availability.storageDir: hdfs:///flink/ha/
#JobManager metadata is persisted in the file system under storageDir; only a pointer to this state is stored in ZooKeeper (required; around line 82)

high-availability.zookeeper.path.root: /flink
#the root ZooKeeper node under which all cluster nodes are placed (recommended; not in the default file, add it)

high-availability.cluster-id: /flinkCluster
#a custom cluster id (recommended; not in the default file, add it)

state.backend: filesystem

state.checkpoints.dir: hdfs:///flink/checkpoints

state.savepoints.dir: hdfs:///flink/checkpoints

Edit conf/zoo.cfg:

server.1=bigdata11:2888:3888

server.2=bigdata12:2888:3888

server.3=bigdata13:2888:3888

Edit conf/masters:

bigdata11:8081

bigdata12:8081

Edit conf/slaves:

bigdata12

bigdata13

Sync the conf directory to every node.

2) Start the HA cluster

Start the ZooKeeper nodes first (in a test environment you can also use Flink's bundled start-zookeeper-quorum.sh), then start HDFS, and finally start Flink.
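A sketch of the startup order (script locations depend on your ZooKeeper and Hadoop installations):

# on each ZooKeeper node
zkServer.sh start              # or, for a test setup, bin/start-zookeeper-quorum.sh from $FLINK_HOME
# on the HDFS NameNode
start-dfs.sh
# then start Flink as shown below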

[itstar@bigdata11 flink-1.6.1]$ bin/start-cluster.sh

Check the WebUI: one of the JobManagers has automatically been elected as the active master.

3) Verify HA

Manually kill the active master on bigdata12; the standby master on bigdata11 then takes over as the active master.
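One way to do this (the PID is whatever jps reports for the JobManager process on the active node):

[itstar@bigdata12 flink-1.6.1]$ jps            # find the StandaloneSessionClusterEntrypoint PID
[itstar@bigdata12 flink-1.6.1]$ kill -9 <pid>
# refresh the WebUI: the standby on bigdata11 should now be the leader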

4) Manually add JobManager / TaskManager instances to the cluster

You can use the bin/jobmanager.sh and bin/taskmanager.sh scripts to add JobManager and TaskManager instances to a running cluster.

Add a JobManager:

bin/jobmanager.sh ((start|start-foreground) [host] [webui-port])|stop|stop-all

Add a TaskManager:

bin/taskmanager.sh start|start-foreground|stop|stop-all

[itstar@bigdata12 flink-1.6.1]$ jobmanager.sh start bigdata12

The newly added JobManager joins as a standby master.
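Similarly, an extra TaskManager can be attached to the running cluster (a sketch; bigdata13 is just an example host):

[itstar@bigdata13 flink-1.6.1]$ taskmanager.sh start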

3.2 Yarn Mode Installation

Download Flink 1.6.1 from the official archive (https://archive.apache.org/dist/flink/flink-1.6.1/).

Upload the archive to the node that will run the JobManager (bigdata11).

Extract the archive on that node (same as above).

Edit conf/flink-conf.yaml in the installation directory to specify the JobManager (same as above).

Edit the conf/slaves file in the installation directory to specify the TaskManagers (same as above).

Distribute the configured Flink directory to the other two nodes (same as above).

Make sure the HADOOP_HOME environment variable is already set on the machine.
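For example (the path below is only an assumption, matching the Hadoop 2.8.4 installation used elsewhere in this guide):

export HADOOP_HOME=/opt/module/hadoop-2.8.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin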

Start the Hadoop cluster (HDFS and Yarn).

Submit a Yarn session from bigdata11 using the yarn-session.sh script in the bin directory of the installation.

First add the following property to yarn-site.xml:

<property>

<name>yarn.nodemanager.resource.cpu-vcores</name>

<value>5</value>

</property>

/opt/module/flink-1.6.1/bin/yarn-session.sh -n 2 -s 4 -jm 1024 -tm 1024 -nm test -d

Where (a usage sketch follows this list):

-n (--container): the number of TaskManagers.

-s (--slots): the number of slots per TaskManager. By default one slot corresponds to one core and each TaskManager gets a single slot; it can be useful to run a few extra TaskManagers for redundancy.

-jm: JobManager memory (in MB).

-tm: memory per TaskManager (in MB).

-nm: the Yarn application name (what appears in the Yarn UI).

-d: run the session detached in the background.
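Once the detached session is running, jobs can be submitted to it and the session can be shut down roughly as follows (a sketch; flink run is assumed to pick the session up from the properties file that yarn-session.sh writes, and <applicationId> is the id shown in the Yarn UI):

bin/flink run ./examples/batch/WordCount.jar --input hdfs:///LICENSE.txt
yarn application -kill <applicationId>    # stop the detached session when it is no longer needed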

After the session starts, open the Yarn web UI; the session just submitted should be listed.

Check the processes on the node that submitted the session.

Submit a jar to the cluster:

/opt/module/flink-1.6.1/bin/flink run -m yarn-cluster examples/batch/WordCount.jar
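If the job needs specific resources, the per-job submission also accepts y-prefixed options mirroring the yarn-session flags (a sketch with illustrative values; exact option names depend on the Flink version):

/opt/module/flink-1.6.1/bin/flink run -m yarn-cluster -yn 2 -ys 4 -yjm 1024 -ytm 1024 examples/batch/WordCount.jar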

After submitting, monitor the job on the Yarn web UI.

When the job finishes, the result is printed to the console.

3.3 Flink WordCount

3.3.1 Streaming Data over a Socket

[root@bigdata13 flink-1.6.1]# bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9999

# open another Xshell (terminal) session

[root@bigdata13 flink-1.6.1]# nc -l 9999

# check the log output

[root@bigdata13 flink-1.6.1]# vi log/flink-root-taskexecutor-1-bigdata13.out

3.3.2 Running WordCount from Java

# open port 9999 on bigdata13

nc -l 9999

# run the following code, then type data into the port opened above

import org.apache.flink.api.common.functions.FlatMapFunction;

import org.apache.flink.api.java.utils.ParameterTool;

import org.apache.flink.streaming.api.datastream.DataStream;

import org.apache.flink.streaming.api.datastream.DataStreamSource;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import org.apache.flink.streaming.api.windowing.time.Time;

import org.apache.flink.util.Collector;

public class WordCount {

public static void main(String[] args) throws Exception {

// the socket port to read from

int port;

try{

ParameterTool parameterTool = ParameterTool.fromArgs(args);

port = parameterTool.getInt("port");

}catch (Exception e){

System.err.println("No port parameter specified, using the default 9000");

port = 9000;

}

// get the execution environment

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// connect to the socket and read the input data

DataStreamSource<String> text = env.socketTextStream("192.168.1.53", port, "\n");

// process the data

DataStream<WordWithCount> windowCount = text.flatMap(new FlatMapFunction<String, WordWithCount>() {

public void flatMap(String value, Collector<WordWithCount> out) throws Exception {

String[] splits = value.split("\\s");

for (String word:splits) {

out.collect(new WordWithCount(word,1L));

}

}

})// flatten: turn each line into <word, count> records

.keyBy("word")// group records that share the same word

.timeWindow(Time.seconds(2),Time.seconds(1))// window size and slide interval

.sum("count");

// print the results to the console

windowCount.print()

.setParallelism(1);// use a parallelism of one

// note: Flink builds the job lazily, so execute() must be called for the code above to run

env.execute("streaming word count");

}

/**

* Stores a word together with the number of times it occurs.

*/

public static class WordWithCount{

public String word;

public long count;

public WordWithCount(){}

public WordWithCount(String word, long count) {

this.word = word;

this.count = count;

}

@Override

public String toString() {

return "WordWithCount{" +

"word='" + word + '\'' +

", count=" + count +

'}';

}

}

}

3.3.3 Running WordCount from Scala

import org.apache.flink.streaming.api.scala._

import org.apache.flink.streaming.api.windowing.time.Time

object ScalaWordCount {

def main(args: Array[String]): Unit = {

// get the execution environment

val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment

// get input data by connecting to the socket

val text = env.socketTextStream("bigdata13", 9999, '\n')

// parse the data, group it, window it, and aggregate the counts

val windowCounts = text

.flatMap { w => w.split("\\s") }

.map { w => WordWithCount(w, 1) }

.keyBy("word")

.timeWindow(Time.seconds(5), Time.seconds(1))

.sum("count")

// print the results with a single thread, rather than in parallel

windowCounts.print().setParallelism(1)

env.execute("Socket Window WordCount")

}

// Data type for words with count

case class WordWithCount(word: String, count: Long)

}

Note: use import org.apache.flink.streaming.api.scala._ for the imports; without it compilation fails because the implicit TypeInformation conversions are missing.

3.3.4 Monitoring Wikipedia Edits with Flink

Pom.xml

<properties>

<maven.compiler.source>1.8</maven.compiler.source>

<maven.compiler.target>1.8</maven.compiler.target>

<encoding>UTF-8</encoding>

<scala.version>2.11.12</scala.version>

<scala.binary.version>2.11</scala.binary.version>

<hadoop.version>2.8.4</hadoop.version>

<flink.version>1.6.1</flink.version>

</properties>

<dependencies>

<dependency>

<groupId>org.scala-lang</groupId>

<artifactId>scala-library</artifactId>

<version>${scala.version}</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-java</artifactId>

<version>${flink.version}</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-streaming-java_${scala.binary.version}</artifactId>

<version>${flink.version}</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-scala_${scala.binary.version}</artifactId>

<version>${flink.version}</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-streaming-scala_${scala.binary.version}</artifactId>

<version>${flink.version}</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-table_${scala.binary.version}</artifactId>

<version>${flink.version}</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-clients_${scala.binary.version}</artifactId>

<version>${flink.version}</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-connector-kafka-0.10_${scala.binary.version}</artifactId>

<version>${flink.version}</version>

</dependency>

<dependency>

<groupId>org.apache.hadoop</groupId>

<artifactId>hadoop-client</artifactId>

<version>${hadoop.version}</version>

</dependency>

<dependency>

<groupId>mysql</groupId>

<artifactId>mysql-connector-java</artifactId>

<version>5.1.38</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-connector-wikiedits_2.11</artifactId>

<version>1.6.1</version>

</dependency>

</dependencies>

Code

import org.apache.flink.api.common.functions.FoldFunction;

import org.apache.flink.api.java.functions.KeySelector;

import org.apache.flink.api.java.tuple.Tuple2;

import org.apache.flink.streaming.api.datastream.DataStream;

import org.apache.flink.streaming.api.datastream.KeyedStream;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import org.apache.flink.streaming.api.windowing.time.Time;

import org.apache.flink.streaming.connectors.wikiedits.WikipediaEditEvent;

import org.apache.flink.streaming.connectors.wikiedits.WikipediaEditsSource;

public class WikipediaAnalysis {

public static void main(String[] args) throws Exception {

// create the streaming execution environment

StreamExecutionEnvironment see = StreamExecutionEnvironment.getExecutionEnvironment();

// source: where the data comes from

DataStream<WikipediaEditEvent> edits = see.addSource(new WikipediaEditsSource());

// key the stream by the user who made the edit

KeyedStream<WikipediaEditEvent, String> keyedEdits = edits

.keyBy(new KeySelector<WikipediaEditEvent, String>() {

@Override

public String getKey(WikipediaEditEvent event) {

return event.getUser();

}

});

// aggregate the byte diff per user within each window

DataStream<Tuple2<String, Long>> result = keyedEdits

.timeWindow(Time.seconds(5))

.fold(new Tuple2<>("", 0L), new FoldFunction<WikipediaEditEvent, Tuple2<String, Long>>() {

@Override

public Tuple2<String, Long> fold(Tuple2<String, Long> acc, WikipediaEditEvent event) {

acc.f0 = event.getUser();

acc.f1 += event.getByteDiff();

return acc;

}

});

result.print();

see.execute();

}

}

Then run it directly in IDEA; results start appearing after roughly 20 seconds.

3.3.5 Wiki To Kafka

Create the Kafka topic

# create the topic wiki-result on bigdata11 (the name must match the topic used by the producer and consumer below)

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic wiki-result

Create a child module in the Flink project; its pom is as follows:

<parent>

<artifactId>Flink</artifactId>

<groupId>com.itstar</groupId>

<version>1.0-SNAPSHOT</version>

</parent>

<modelVersion>4.0.0</modelVersion>

<artifactId>wiki</artifactId>

<dependencies>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-java</artifactId>

<version>${flink.version}</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-streaming-java_2.11</artifactId>

<version>${flink.version}</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-clients_2.11</artifactId>

<version>${flink.version}</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-connector-wikiedits_2.11</artifactId>

<version>${flink.version}</version>

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-connector-kafka-0.11_2.11</artifactId>

<version>1.6.1</version>

</dependency>

</dependencies>

The code is as follows:

package wikiedits;

import org.apache.flink.api.common.functions.FoldFunction;

import org.apache.flink.api.common.functions.MapFunction;

import org.apache.flink.api.common.serialization.SimpleStringSchema;

import org.apache.flink.api.java.functions.KeySelector;

import org.apache.flink.api.java.tuple.Tuple2;

import org.apache.flink.streaming.api.datastream.DataStream;

import org.apache.flink.streaming.api.datastream.KeyedStream;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import org.apache.flink.streaming.api.windowing.time.Time;

import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;

import org.apache.flink.streaming.connectors.wikiedits.WikipediaEditEvent;

import org.apache.flink.streaming.connectors.wikiedits.WikipediaEditsSource;

public class WikipediaAnalysis {

public static void main(String[] args) throws Exception {

StreamExecutionEnvironment see = StreamExecutionEnvironment.getExecutionEnvironment();

DataStream<WikipediaEditEvent> edits = see.addSource(new WikipediaEditsSource());

KeyedStream<WikipediaEditEvent, String> keyedEdits = edits

.keyBy(new KeySelector<WikipediaEditEvent, String>() {

@Override

public String getKey(WikipediaEditEvent event) {

return event.getUser();

}

});

DataStream<Tuple2<String, Long>> result = keyedEdits

.timeWindow(Time.seconds(5))

.fold(new Tuple2<>("", 0L), new FoldFunction<WikipediaEditEvent, Tuple2<String, Long>>() {

@Override

public Tuple2<String, Long> fold(Tuple2<String, Long> acc, WikipediaEditEvent event) {

acc.f0 = event.getUser();

acc.f1 += event.getByteDiff();

return acc;

}

});

result.print();

result

.map(new MapFunction<Tuple2<String,Long>, String>() {

@Override

public String map(Tuple2<String, Long> tuple) {

return tuple.toString();

}

})

.addSink(new FlinkKafkaProducer011<>("bigdata11:9092", "wiki-result", new SimpleStringSchema()));

see.execute();

}

}

Tip: make sure the following imports are the ones used:

import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;

import org.apache.flink.api.common.serialization.SimpleStringSchema;

import org.apache.flink.api.common.functions.MapFunction;

Start a Kafka console consumer:

bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic wiki-result

3.3.6 Flink Source in Practice:

Kafka + Flink Stream + MySQL

Create the student table

DROP TABLE IF EXISTS `student`;

CREATE TABLE `student` (

`id` int(11) unsigned NOT NULL AUTO_INCREMENT,

`name` varchar(25) COLLATE utf8_bin DEFAULT NULL,

`password` varchar(25) COLLATE utf8_bin DEFAULT NULL,

`age` int(10) DEFAULT NULL,

PRIMARY KEY (`id`)

) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

Insert some data

INSERT INTO `student` VALUES ('1', 'Andy', '123456', '18'), ('2', 'Bndy', '000000', '17'), ('3', 'Cndy', '012345', '18'), ('4', 'Dndy', '123456', '16');

COMMIT;

Pom

<dependencies>

<!--flink java-->

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-java</artifactId>

<version>${flink.version}</version>

<!--<scope>provided</scope>-->

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-streaming-java_${scala.binary.version}</artifactId>

<version>${flink.version}</version>

<!--<scope>provided</scope>-->

</dependency>

<dependency>

<groupId>org.slf4j</groupId>

<artifactId>slf4j-log4j12</artifactId>

<version>1.7.7</version>

<scope>runtime</scope>

</dependency>

<dependency>

<groupId>log4j</groupId>

<artifactId>log4j</artifactId>

<version>1.2.17</version>

<scope>runtime</scope>

</dependency>

<!--flink kafka connector-->

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-connector-kafka-0.11_${scala.binary.version}</artifactId>

<version>${flink.version}</version>

</dependency>

<!--alibaba fastjson-->

<dependency>

<groupId>com.alibaba</groupId>

<artifactId>fastjson</artifactId>

<version>1.2.51</version>

</dependency>


<!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->

<dependency>

<groupId>mysql</groupId>

<artifactId>mysql-connector-java</artifactId>

<version>5.1.27</version>

</dependency>

</dependencies>

Student Bean

package FlinkToMySQL;

public class Student {

public int id;

public String name;

public String password;

public int age;

public Student() {

}

public Student(int id, String name, String password, int age) {

this.id = id;

this.name = name;

this.password = password;

this.age = age;

}

@Override

public String toString() {

return "Student{" +

"id=" + id +

", name='" + name + '\'' +

", password='" + password + '\'' +

", age=" + age +

'}';

}

public int getId() {

return id;

}

public void setId(int id) {

this.id = id;

}

public String getName() {

return name;

}

public void setName(String name) {

this.name = name;

}

public String getPassword() {

return password;

}

public void setPassword(String password) {

this.password = password;

}

public int getAge() {

return age;

}

public void setAge(int age) {

this.age = age;

}

}

Note: using Lombok here may lead to other errors.

SourceFromMySQL

package FlinkToMySQL;

import org.apache.flink.configuration.Configuration;

import org.apache.flink.streaming.api.functions.source.RichSourceFunction;

import java.sql.Connection;

import java.sql.DriverManager;

import java.sql.PreparedStatement;

import java.sql.ResultSet;

public class SourceFromMySQL extends RichSourceFunction<Student> {

PreparedStatement ps;

private Connection connection;

/**

* The connection is set up in open(), so it does not have to be created and released on every invoke.

*

* @param parameters

* @throws Exception

*/

@Override

public void open(Configuration parameters) throws Exception {

connection = getConnection();

String sql = "select * from student;";

ps = this.connection.prepareStatement(sql);

}

/**

* When the job has finished, the connection can be closed and the resources released.

*

* @throws Exception

*/

@Override

public void close() throws Exception {

if (connection != null) { // close the connection and release resources

connection.close();

}

if (ps != null) {

ps.close();

}

}

/**

* run() is called once by the DataStream source to produce the data.

*

* @param ctx

* @throws Exception

*/

@Override

public void run(SourceContext<Student> ctx) throws Exception {

ResultSet resultSet = ps.executeQuery();

while (resultSet.next()) {

Student student = new Student(

resultSet.getInt("id"),

resultSet.getString("name").trim(),

resultSet.getString("password").trim(),

resultSet.getInt("age"));

ctx.collect(student);

}

}

@Override

public void cancel() {

}

private static Connection getConnection() {

Connection con = null;

try {

Class.forName("com.mysql.jdbc.Driver");

con = DriverManager.getConnection("jdbc:mysql://bigdata11:3306/Andy?useUnicode=true&characterEncoding=UTF-8", "root", "000000");

} catch (Exception e) {

}

return con;

}

}

The main method for the custom source

package FlinkToMySQL;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class customSource {

public static void main(String[] args) throws Exception {

final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

env.addSource(new SourceFromMySQL()).print();

env.execute("Flink add data source");

}

}

Flink Stream + Kafka

Pom

<dependencies>

<!--flink java-->

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-java</artifactId>

<version>${flink.version}</version>

<!--<scope>provided</scope>-->

</dependency>

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-streaming-java_${scala.binary.version}</artifactId>

<version>${flink.version}</version>

<!--<scope>provided</scope>-->

</dependency>

<dependency>

<groupId>org.slf4j</groupId>

<artifactId>slf4j-log4j12</artifactId>

<version>1.7.7</version>

<scope>runtime</scope>

</dependency>

<dependency>

<groupId>log4j</groupId>

<artifactId>log4j</artifactId>

<version>1.2.17</version>

<scope>runtime</scope>

</dependency>

<!--flink kafka connector-->

<dependency>

<groupId>org.apache.flink</groupId>

<artifactId>flink-connector-kafka-0.11_${scala.binary.version}</artifactId>

<version>${flink.version}</version>

</dependency>

<!--alibaba fastjson-->

<dependency>

<groupId>com.alibaba</groupId>

<artifactId>fastjson</artifactId>

<version>1.2.51</version>

</dependency>


<!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->

<dependency>

<groupId>mysql</groupId>

<artifactId>mysql-connector-java</artifactId>

<version>5.1.27</version>

</dependency>

</dependencies>

<build>

<plugins>

<plugin>

<groupId>org.apache.maven.plugins</groupId>

<artifactId>maven-compiler-plugin</artifactId>

<version>3.6.0</version>

<configuration>

<source>1.8</source>

<target>1.8</target>

</configuration>

</plugin>

</plugins>

</build>

Bean

package KafkaToFlink;


import java.util.Map;

public class Metric {

private String name;

private long timestamp;

private Map<String, Object> fields;

private Map<String, String> tags;

public Metric() {

}

public Metric(String name, long timestamp, Map<String, Object> fields, Map<String, String> tags) {

this.name = name;

this.timestamp = timestamp;

this.fields = fields;

this.tags = tags;

}

public String getName() {

return name;

}

public void setName(String name) {

this.name = name;

}

public long getTimestamp() {

return timestamp;

}

public void setTimestamp(long timestamp) {

this.timestamp = timestamp;

}

public Map<String, Object> getFields() {

return fields;

}

public void setFields(Map<String, Object> fields) {

this.fields = fields;

}

public Map<String, String> getTags() {

return tags;

}

public void setTags(Map<String, String> tags) {

this.tags = tags;

}

}

KafkaUtils

package KafkaToFlink;

import com.alibaba.fastjson.JSON;

import org.apache.kafka.clients.producer.KafkaProducer;

import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.HashMap;

import java.util.Map;

import java.util.Properties;

public class KafkaUtils {

public static final String broker_list = "bigdata11:9092";

// kafka topic

public static final String topic = "metric";

//key serializer class

public static final String KEY = "org.apache.kafka.common.serialization.StringSerializer";

//value serializer class

public static final String VALUE = "org.apache.kafka.common.serialization.StringSerializer";

public static void writeToKafka() throws InterruptedException {

Properties props = new Properties();

props.put("bootstrap.servers", broker_list);

props.put("key.serializer", KEY);

props.put("value.serializer", VALUE);

KafkaProducer producer = new KafkaProducer<String, String>(props);

Metric metric = new Metric();

metric.setName("mem");

long timestamp = System.currentTimeMillis();

metric.setTimestamp(timestamp);

Map<String, Object> fields = new HashMap<>();

fields.put("used_percent", 90d);

fields.put("max", 27244873d);

fields.put("used", 17244873d);

fields.put("init", 27244873d);

Map<String, String> tags = new HashMap<>();

tags.put("cluster", "Andy");

tags.put("host_ip", "192.168.1.51");

metric.setFields(fields);

metric.setTags(tags);

ProducerRecord record = new ProducerRecord<String, String>(topic, null, null, JSON.toJSONString(metric));

producer.send(record);

System.out.println("Sent record: " + JSON.toJSONString(metric));

producer.flush();

}

public static void main(String[] args) throws InterruptedException {

while (true) {

Thread.sleep(300);

writeToKafka();

}

}

}

Main (the Flink job that consumes from Kafka)

package KafkaToFlink;

import org.apache.flink.api.common.serialization.SimpleStringSchema;

import org.apache.flink.streaming.api.datastream.DataStreamSource;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

import java.util.Properties;

public class Main {

public static void main(String[] args) throws Exception {

final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

Properties props = new Properties();

props.put("bootstrap.servers", "bigdata11:9092");

props.put("zookeeper.connect", "bigdata11:2181");

props.put("group.id", "metric-group");

props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer"); //key deserializer

props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer"); //value deserializer

props.put("auto.offset.reset", "earliest"); //start consuming from the earliest offset

DataStreamSource<String> dataStreamSource = env.addSource(new FlinkKafkaConsumer011<>(

"metric", //kafka topic

new SimpleStringSchema(), //String deserialization schema

props)).setParallelism(1);

dataStreamSource.print(); //print the records read from Kafka to the console

env.execute("Flink add data source");

}

}

Note: the Kafka topic is created automatically when the producer first writes to it; there is no need to create it by hand.
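This relies on the broker-side setting auto.create.topics.enable, which defaults to true. A quick way to confirm the topic appeared after the first record was produced (a sketch):

# server.properties (broker side, default value)
auto.create.topics.enable=true

# list topics to confirm that "metric" now exists
bin/kafka-topics.sh --list --zookeeper bigdata11:2181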
