Reading Avro files using Apache Flink

In this blog, we will see how to read the Avro files using Flink.

Before reading the files, let’s get an overview of Flink.

  • Real-time Processing: Processing based on immediate data for an instant result.
  • Support for scala and java
  • Low-latency
  • Fault-tolerance
  • Scalability
name := "flink-demo"version := "0.1"scalaVersion := "2.12.8"libraryDependencies ++= Seq("org.apache.flink" %% "flink-scala" % "1.10.0","org.apache.flink" % "flink-avro" % "1.10.0","org.apache.flink" %% "flink-streaming-scala" % "1.10.0")
val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
env.setParallelism(1)
public AvroInputFormat(Path filePath, Class type)
val avroInputFormat = new AvroInputFormat[GenericRecord](new org.apache.flink.core.fs.Path("path to avro file"), classOf[GenericRecord])
def createInput[T: TypeInformation](inputFormat: InputFormat[T, _]): DataStream[T]
implicit val typeInfo: TypeInformation[GenericRecord] = TypeInformation.of(classOf[GenericRecord])val recordStream: scala.DataStream[GenericRecord] = env.createInput(avroInputFormat)
recordStream.print()
env.execute(jobName = "flink-avro-demo")

Data Engineer