Things you should know about Apache NiFi

Tarun Manrai
3 min read · Feb 26, 2020

Apache NiFi is a real-time data ingestion platform used to transfer data, and to manage that transfer, between different source and destination systems. It is powerful yet easy to use, and it processes and distributes data reliably among varied systems. Apache NiFi is based on the NiagaraFiles technology developed by the National Security Agency (NSA), which later donated it to the Apache Software Foundation. The version referred to here is 1.7.1, and it is distributed under the Apache License, Version 2.0.

Apache NiFi also supports a wide range of data formats, such as logs and geolocation data, and many protocols and systems, such as Kafka, HDFS and SFTP.

Features of Apache NiFi

The general features of Apache NiFi are as follows:

· Apache NiFi offers a web-based user interface that gives a seamless experience across design, feedback, monitoring and control.

· It also offers a data provenance module that helps track and monitor data from beginning to end of the flow.

· It is highly configurable. Users can tune it for guaranteed delivery, high throughput, back pressure and dynamic prioritization.

· It also supports secure protocols such as SSH, HTTPS, SSL and others.

· It supports user and role management and can be configured for authorization with LDAP.

· With Apache NiFi, developers can build their own custom processors and reporting tasks to suit their requirements.
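
The back pressure and dynamic prioritization features listed above can be pictured with a small toy model. This is only a sketch in plain Python, not NiFi code: the `Connection` class, its `back_pressure_threshold` parameter and the file names are hypothetical names chosen for illustration. It also simplifies things, because real NiFi stops scheduling the upstream component when a queue is full rather than rejecting items.

```python
import heapq
import itertools


class Connection:
    """Toy stand-in for the queue between two processors (not NiFi's API)."""

    def __init__(self, back_pressure_threshold):
        self.back_pressure_threshold = back_pressure_threshold
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order

    def is_full(self):
        # Back pressure: the upstream side should pause once this is True.
        return len(self._heap) >= self.back_pressure_threshold

    def offer(self, item, priority):
        """Enqueue an item; return False when back pressure applies."""
        if self.is_full():
            return False
        heapq.heappush(self._heap, (priority, next(self._order), item))
        return True

    def poll(self):
        """Dequeue the item with the lowest priority number, or None."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


conn = Connection(back_pressure_threshold=3)
incoming = [("bulk.csv", 5), ("alert.json", 1), ("audit.log", 3), ("late.csv", 5)]

# The fourth item is refused because the queue already holds three items.
accepted = [conn.offer(name, priority) for name, priority in incoming]

# Items leave the queue by priority, not by arrival order.
drained = [conn.poll() for _ in range(3)]

print(accepted)  # [True, True, True, False]
print(drained)   # ['alert.json', 'audit.log', 'bulk.csv']
```

In NiFi itself these settings are configured on a connection in the UI; the sketch only shows the behaviour they produce.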

Concepts of Apache NiFi

Process Group: A group of NiFi flows that helps users manage their flows and keep them organized in a hierarchical manner.

Processor: A Java module responsible for either fetching data from a sourcing system or storing it in a destination system. Other processors add attributes to a FlowFile or change its content.

Flow: A flow is created by connecting different processors so that data is transferred from one data source to another and modified along the way when required.

Event: Events represent the changes made to a FlowFile as it passes through a NiFi flow.

Data provenance: A repository with its own UI. It lets users check information about a FlowFile and helps with troubleshooting in case of any issues.

FlowFile: The basic unit that NiFi works on. It represents a single data object picked up from the source system. NiFi processors make changes to the FlowFile while transferring the data.
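
To see how these concepts fit together, here is a minimal sketch in plain Python. It is not the NiFi API: the `FlowFile` dataclass, the `fetch`, `uppercase` and `tag` functions and the `provenance` list are hypothetical stand-ins. The event type strings are borrowed from the names NiFi uses for provenance events, purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy FlowFile: one data object with content and attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


# Toy provenance repository: one entry per event, in order.
provenance = []


def record_event(event_type, processor_name, flowfile):
    provenance.append({
        "type": event_type,
        "processor": processor_name,
        "filename": flowfile.attributes.get("filename"),
    })


def fetch(filename, content):
    """Source processor: brings a new object into the flow."""
    flowfile = FlowFile(content=content, attributes={"filename": filename})
    record_event("RECEIVE", "fetch", flowfile)
    return flowfile


def uppercase(flowfile):
    """Processor that changes the content."""
    out = FlowFile(flowfile.content.upper(), dict(flowfile.attributes))
    record_event("CONTENT_MODIFIED", "uppercase", out)
    return out


def tag(flowfile):
    """Processor that adds an attribute."""
    attributes = {**flowfile.attributes, "size": str(len(flowfile.content))}
    out = FlowFile(flowfile.content, attributes)
    record_event("ATTRIBUTES_MODIFIED", "tag", out)
    return out


def run_flow(flowfile, processors):
    """A flow is just processors connected one after another."""
    for processor in processors:
        flowfile = processor(flowfile)
    return flowfile


result = run_flow(fetch("greeting.txt", b"hello nifi"), [uppercase, tag])
print(result.content)                   # b'HELLO NIFI'
print(result.attributes)                # {'filename': 'greeting.txt', 'size': '10'}
print([e["type"] for e in provenance])  # ['RECEIVE', 'CONTENT_MODIFIED', 'ATTRIBUTES_MODIFIED']
```

Reading the `provenance` list from top to bottom gives the history of the FlowFile, which is the kind of information the data provenance UI exposes for troubleshooting.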

Pros of Apache NiFi

· It enables data fetching from remote machines over SFTP and also guarantees data lineage.

· It offers security policies on user level, process group level and other modules.

· Because the NiFi UI can also be served over HTTPS, interaction between users and NiFi can be secured.

· It supports clustering, so the same flow can run on multiple nodes, with each node processing different data.

· It supports around 188 processors, and users can also create custom plugins to work with different data systems.
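
The clustering point above, where every node runs the same flow on different data, can be sketched as follows. This is a toy illustration in plain Python, not how NiFi distributes work: the round-robin `distribute` helper, the `flow` function and the node names are assumptions made up for the example.

```python
def flow(record):
    """The same flow runs on every node: here it just normalizes text."""
    return record.strip().lower()


def distribute(records, nodes):
    """Assign records to nodes round-robin (an assumed strategy)."""
    assignments = {node: [] for node in nodes}
    for index, record in enumerate(records):
        assignments[nodes[index % len(nodes)]].append(record)
    return assignments


nodes = ["node-1", "node-2"]
records = [" A ", " B ", " C "]

assignments = distribute(records, nodes)

# Each node applies the identical flow to its own slice of the data.
results = {node: [flow(r) for r in batch] for node, batch in assignments.items()}

print(results)  # {'node-1': ['a', 'c'], 'node-2': ['b']}
```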

Cons of Apache NiFi

· It has state persistence issues when the primary node switches. This can leave processors unable to fetch data from sourcing systems.

· If a node gets disconnected from the NiFi cluster while a user is making changes, the flow.xml on that node can become invalid. The node then has to be reconnected manually by the user or an admin.

For any query you can visit http://www.entradasoft.com
