Kafka Streams Blog

Translations of English blog posts on Kafka Streams stream processing

Kafka Process API

http://codingjunkie.net/kafka-processor-part1/

The aim of the Processor API is to provide a client that enables processing data consumed from Kafka and writing the results back into Kafka. The processor client has two components:

  1. A “lower-level” processor that provides APIs for data processing, composable processing, and local state storage.
  2. A “higher-level” stream DSL that would cover most processor implementation needs.
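As a concrete illustration of the "lower-level" component, here is a minimal sketch of a custom processor. Note that this uses the typed `Processor` interface from newer Kafka Streams releases (2.7+); the original post predates it and used the older `Processor<K, V>` interface, so signatures differ by version. The class name and alerting logic are hypothetical.

```java
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;

// Hypothetical example: forwards an alert for each flagged record as it
// arrives, with no windowing, joins, or state lookups involved.
public class FraudAlertProcessor implements Processor<String, String, String, String> {

    private ProcessorContext<String, String> context;

    @Override
    public void init(final ProcessorContext<String, String> context) {
        // Keep a reference to the context so we can forward records downstream.
        this.context = context;
    }

    @Override
    public void process(final Record<String, String> record) {
        // Inspect each record in isolation; emit an alert if it is flagged.
        if (record.value().contains("FRAUD")) {
            context.forward(record.withValue("ALERT: " + record.value()));
        }
    }
}
```

A processor like this is wired into a topology with `Topology.addProcessor`, between a source and a sink node.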

Potential Use Cases For the Processor API

  1. There is a need for notifications/alerts on singular values as they are processed. In other words, the business requirements are such that you don’t need to establish patterns or examine the value(s) in context with other data being processed. For example, you want immediate notification that a fraudulent credit card has been used.
  2. You filter your data before running analytics. After filtering out a medium to large percentage of the data, the stream should ideally be re-partitioned to avoid data-skew issues. Re-partitioning is an expensive operation, so by filtering the data before it is delivered to your analytics cluster, you can skip the filter-then-repartition step.
  3. You want to run analytics on only a portion of your source data, while delivering the entirety of your data to another store.
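Use cases 2 and 3 above can also be sketched with the higher-level DSL: write the full stream to one sink and only a filtered subset to the analytics topic. This is a minimal sketch; the topic names, the `Purchase` type, and the threshold are all assumptions for illustration.

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class FilterBeforeAnalytics {

    public static void main(final String[] args) {
        final StreamsBuilder builder = new StreamsBuilder();

        // Hypothetical source topic of raw purchase events (String key/value
        // here to keep the sketch self-contained; a real app would use a
        // Purchase type with its own Serde).
        final KStream<String, String> purchases = builder.stream("purchases");

        // Use case 3: deliver the entirety of the data to another store.
        purchases.to("all-purchases");

        // Use case 2: filter before the analytics cluster ever sees the data,
        // so the filter-then-repartition step is avoided downstream.
        purchases.filter((key, value) -> value.contains("HIGH_VALUE"))
                 .to("analytics-purchases");

        // builder.build() would then be passed to a KafkaStreams instance.
    }
}
```

The key point is that the filter runs inside the streams application, so only the reduced subset crosses into the analytics topic.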

KStream API

Distributed real-time joins and aggregations on user activity events using Kafka Streams

https://www.confluent.io/blog/distributed-real-time-joins-and-aggregations-on-user-activity-events-using-kafka-streams/


Table of Contents
  1. Kafka Process API
  2. KStream API
  3. Distributed real-time joins and aggregations on user activity events using Kafka Streams