Open-Source Big Data ETL Tools

BigData ETL Tools

DataTorrent (Apex)

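Run ./datatorrent-rts-community-3.7.0.bin --help to print the installer's help options. The run below installs as the admin user: -B sets the installation base directory, -g sets the dtGateway port, and each -E passes an environment variable (here the log and run directories).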

[qihuang.zheng@dp0653 install]$ sudo -u admin ./datatorrent-rts-community-3.7.0.bin \
-B /usr/install/datatorrent-rts -g 9094 \
-E DT_LOG_DIR=/home/admin/datatorrent \
-E DT_RUN_DIR=/home/admin/run/datatorrent

Verifying archive integrity... All good.
Uncompressing DataTorrent Distribution 100%

DataTorrent Platform 3.7.0 will be installed under /usr/install/datatorrent-rts/releases/3.7.0

dtGateway can be managed with: /usr/install/datatorrent-rts/releases/3.7.0/bin/dtgateway [start|stop|status]
DTGateway is running as pid 24571 and listening on 0.0.0.0:9094

Please finish the remaining installation steps via DataTorrent Console at: http://dp0653:9094/
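Before opening the console, it can help to confirm that the gateway is still running. The following is a minimal sketch that reuses the dtgateway path printed by the installer above. Running it as the admin user is an assumption based on how the installer was invoked, and the DT_HOME and gateway variable names are only illustrative.

```shell
# Release directory reported by the installer output above.
DT_HOME=/usr/install/datatorrent-rts/releases/3.7.0
gateway="$DT_HOME/bin/dtgateway"

if [ -x "$gateway" ]; then
  # Assumption: the gateway is managed by the same user that ran the installer.
  sudo -u admin "$gateway" status
else
  echo "dtgateway not found at $gateway" >&2
fi
```

The same script accepts start and stop in place of status, as listed in the installer output.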

Create an Apex project and package it

name=salesapp
version=3.5.0

mvn -B archetype:generate \
-DarchetypeGroupId=org.apache.apex \
-DarchetypeArtifactId=apex-app-archetype \
-DarchetypeVersion=$version \
-DgroupId=com.example \
-Dpackage=com.example.$name \
-DartifactId=$name \
-Dversion=1.0-SNAPSHOT
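The command above only generates the project skeleton. A minimal sketch of the packaging step follows, assuming the generated project keeps the archetype's default build, which normally writes an .apa application package under target/. The app_version and apa variables are illustrative names, and the exact package file name is an assumption to check against the build output.

```shell
name=salesapp               # artifactId used when generating the project
app_version=1.0-SNAPSHOT    # value passed as -Dversion to the archetype
apa="$name/target/$name-$app_version.apa"   # assumed package location

if [ -f "$name/pom.xml" ] && command -v mvn >/dev/null 2>&1; then
  # Build the package; skipping tests is optional and only shortens the build.
  (cd "$name" && mvn -B clean package -DskipTests) && ls -l "$apa"
else
  echo "Generate the project first and make sure mvn is on PATH" >&2
fi
```

If the build succeeds, the listed .apa file is the artifact to upload.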

Upload the package to the DataTorrent platform

StreamSets(https://github.com/streamsets/datacollector)

StreamFlow(https://github.com/lmco/streamflow)

CDAP(https://github.com/caskdata/cdap)

