Part 4: Airflow 2.0 UI tour

Jyoti Sachdeva
7 min read · Jun 3, 2021

Hi,

Now that we have discussed what Airflow is and covered its basic terms, we are ready for the UI tour.

We created a user with the username admin and the password admin in the previous part.

We started Airflow at http://localhost:8088. Let’s log in first with incorrect details and then with the correct ones:

After logging in, we will see:

Let’s look at what is in the top-right corner. The default time zone is UTC, and the clock keeps running. We can add any other time zone to the UI: click on UTC to see the time in UTC, or click on another time zone to see the time there.
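
To make the clock switch concrete, here is a minimal Python sketch of the same kind of conversion. It is not Airflow code. The fixed +05:30 offset (India Standard Time) and the sample timestamp are assumptions chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: India Standard Time as a fixed UTC+05:30 offset, for illustration only.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def to_zone(utc_time: datetime, zone: timezone) -> datetime:
    """Return the same instant expressed in another time zone."""
    if utc_time.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return utc_time.astimezone(zone)


# Hypothetical instant: 3 June 2021, 12:00 UTC.
utc_now = datetime(2021, 6, 3, 12, 0, tzinfo=timezone.utc)
local = to_zone(utc_now, IST)
print(utc_now.isoformat())  # 2021-06-03T12:00:00+00:00
print(local.isoformat())    # 2021-06-03T17:30:00+05:30
```

Both values describe the same instant; only the displayed wall-clock time changes.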

Now, when we click on our name initials (JS in my case):

we will see:

When you click on your profile, you will see your user info and the login count. If you have the required access, you can also edit the user and change the password. There is a log out option as well.

Now, let’s move to the Docs section.

Documentation takes you to the Airflow docs, and Airflow Website takes you to the official Airflow site. There is also a link to the GitHub repository, the API documentation (Redoc), and the Swagger link, where you can actually try calling the API.

For the Swagger API, all you need to do is authorize and enable CORS. Click on REST API Reference (Swagger UI) and scroll down; there will be a green ‘Authorize’ button. Enter your details, log in, and also enable CORS.

Then you can try out any API: click on an endpoint, choose to try it out, and execute it.
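
The same calls can be made from code. Below is a minimal sketch that only builds the request, using the Python standard library. It assumes the local URL and the admin/admin user from this series, and it assumes that basic authentication is enabled for the API in your Airflow configuration, which may not be the case in your setup. `build_request` is a helper name made up for this example.

```python
import base64
import urllib.request

# Assumptions: the URL and credentials come from this series' local setup, and
# basic auth is enabled for the API in your Airflow configuration.
BASE_URL = "http://localhost:8088/api/v1"


def build_request(path: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request with an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(f"{BASE_URL}{path}", method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


request = build_request("/dags", "admin", "admin")
print(request.full_url)  # http://localhost:8088/api/v1/dags

# To actually send it (requires a running Airflow instance):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```

The sending part is left commented out so that the sketch runs even without Airflow up.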

Next to it is Admin.

Variables: We can create a variable in three ways:

  1. From our code: we did this using the .env file.
AIRFLOW_VAR_SAMPLE=SampleVar

from airflow.models import Variable

sample = Variable.get("sample")

2. Using the UI:

Click on Variables, then the ‘+’ button, and add the variable’s name and value.

3. Import a JSON file:

sample.json:

{
  "fruit": "Apple",
  "size": "Large",
  "color": "Red"
}

Click on Choose File, select sample.json, and then click on Import Variables.
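
Two of these routes are easy to sketch in plain Python: the naming rule that links an environment variable to a variable key, and the JSON file used for the import. The helper names `env_name_for` and `write_variables_file` are made up for this example, and the file is written to a temporary folder rather than to any real Airflow path.

```python
import json
import tempfile
from pathlib import Path


def env_name_for(variable_key: str) -> str:
    """Name of the environment variable that backs an Airflow variable key."""
    return "AIRFLOW_VAR_" + variable_key.upper()


def write_variables_file(path: Path, variables: dict) -> None:
    """Write a flat key/value mapping as a JSON file for the import screen."""
    path.write_text(json.dumps(variables, indent=2))


variables = {"fruit": "Apple", "size": "Large", "color": "Red"}
target = Path(tempfile.mkdtemp()) / "sample.json"
write_variables_file(target, variables)

print(env_name_for("sample"))          # AIRFLOW_VAR_SAMPLE
print(json.loads(target.read_text()))  # the same three key/value pairs
```

This is why AIRFLOW_VAR_SAMPLE in the .env file is read back with the lowercase key "sample".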

Configuration: Shows the running configuration from your airflow.cfg.

Connections: You can create a connection by choosing the connection type and entering its details. Using the connection id, you can then connect to it from your code.

Plugins, Pools and XCom: Plugins are custom built. For pools, we have a default_pool, and we can create any other pool, give it slots to control parallelism, and use it in our DAG through the default arguments.

"pool": "custom_pool"

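To get a feel for what slots do, here is a plain-Python analogy, not Airflow's implementation: a semaphore with two slots caps how many fake tasks run at once, even though more workers are available. The slot count and the task function are assumptions made up for this sketch.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SLOTS = 2  # hypothetical slot count for "custom_pool"
slots = threading.Semaphore(POOL_SLOTS)
lock = threading.Lock()
state = {"running": 0, "peak": 0}


def run_task(task_number: int) -> int:
    """Run one fake task, holding a pool slot for its whole duration."""
    with slots:  # wait here until a slot is free
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.01)  # stand-in for real work
        with lock:
            state["running"] -= 1
    return task_number


# Six tasks compete for two slots, even though six worker threads are available.
with ThreadPoolExecutor(max_workers=6) as executor:
    finished = list(executor.map(run_task, range(6)))

print(f"finished={finished}, peak concurrency={state['peak']}")
```

The peak never goes above the number of slots, which is the effect a pool has on the tasks assigned to it.
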
XComs are key-value pairs that pass data between tasks. By default, anything you return from a Python function is shown in XCom.

import logging
logger = logging.getLogger(__name__)

def invite_friends(**kwargs):
    logger.info("Invite Friends")
    return "xcom reply"  # the returned value is what shows up in XCom
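
To show the idea without a running Airflow, here is a toy stand-in for the XCom store. It is a dictionary, not Airflow's API. The helper names `run_and_push` and `xcom_pull` are made up for this sketch; the only detail borrowed from Airflow is that a task's return value is stored under the key return_value.

```python
# Toy stand-in for the XCom table: (task_id, key) -> value. Not Airflow code.
xcom_store = {}

RETURN_KEY = "return_value"  # the key Airflow uses for a task's return value


def run_and_push(task_id, python_callable):
    """Call the task function and store what it returns, if anything."""
    result = python_callable()
    if result is not None:
        xcom_store[(task_id, RETURN_KEY)] = result
    return result


def xcom_pull(task_id, key=RETURN_KEY):
    """Fetch a value pushed by an earlier task, or None if there is none."""
    return xcom_store.get((task_id, key))


def invite_friends(**kwargs):
    return "xcom reply"


run_and_push("invite_friends", invite_friends)
print(xcom_pull("invite_friends"))  # xcom reply
```

A later task would read the value by the id of the task that produced it, which is what the pull helper mimics.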

Next to it is Browse:

DAG Runs: Shows the state of each DAG run, its id, and its execution date.

The run id starts with scheduled__ or manual__: if the DAG was triggered externally it is manual__, otherwise scheduled__. The view also shows the run type (Scheduled/Manual), the end date if the DAG run has finished, and whether it was an external trigger.
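
As a small illustration, the run type can be read off the run id by splitting on the separator. In Airflow 2.0 the prefix and the timestamp are joined by a double underscore. The function name and the sample run ids below are made up for this sketch, and a manually triggered run with a custom run id would not follow this shape.

```python
def run_type_from_run_id(run_id: str) -> str:
    """Return the part of a run id before the double underscore."""
    prefix, separator, _ = run_id.partition("__")
    if not separator:
        raise ValueError(f"run id has no '__' separator: {run_id!r}")
    return prefix


# Hypothetical run ids in the shape shown in the DAG Runs view.
print(run_type_from_run_id("scheduled__2021-06-02T00:00:00+00:00"))  # scheduled
print(run_type_from_run_id("manual__2021-06-03T10:15:00+00:00"))     # manual
```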

Jobs:

The hostname shows whether a job ran on the webserver or the scheduler.

With Docker, the hostname is the container id, which we can look up with docker ps.

Audit Logs: These contain an entry whenever a user is created or removed, or when any sensitive command is run.

Task Instance:

When a task runs, we will be able to see details such as how long it took to run and how many retries happened. If the task fails and still has retries left, it goes to the up for retry state. Tasks that get rescheduled can be seen in Task Reschedules.

SLA Miss:

We can have service level agreements, for example that a task should finish within 2 hours. If it takes longer than that, it is a miss, and we will be able to see it here.

We can set the SLA in our code as:

"sla": timedelta(minutes=120)

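The comparison behind an SLA miss can be sketched with plain datetime arithmetic. The function name, the reference time, and the timestamps are assumptions made up for this example; which point in time Airflow measures the SLA from is not covered here, so check the Airflow docs before relying on it.

```python
from datetime import datetime, timedelta

sla = timedelta(minutes=120)


def is_sla_miss(reference_time: datetime, finished_at: datetime, sla: timedelta) -> bool:
    """True when the task finished later than reference_time + sla."""
    return finished_at - reference_time > sla


# Hypothetical timestamps for two task runs measured from the same reference point.
reference = datetime(2021, 6, 3, 0, 0)
print(is_sla_miss(reference, datetime(2021, 6, 3, 1, 45), sla))  # False
print(is_sla_miss(reference, datetime(2021, 6, 3, 2, 30), sla))  # True
```
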
DAG Dependencies: There can be cases where we want one DAG to run only after another DAG finishes. We can see such dependencies here.

Next to it is Security.

The first is List Users, which shows all the users, their status, and their roles.

The roles themselves are defined under List Roles.

User Statistics shows who has logged in and when, along with the incorrect logins. We did the incorrect login at the beginning so that we could see an entry here.

Base Permissions: These are the CRUD operations.

The objects to which these CRUD operations apply are the Views/Menus.

Take the cartesian product of Base Permissions and Views/Menus, and you get Permissions on Views/Menus.
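
The cartesian product is easy to picture with a few lines of Python. The permission and view names below are a small illustrative sample, not the full lists from the Security menu, and in practice not every combination has to exist.

```python
from itertools import product

# Illustrative samples only; the real lists are in the Security menu.
base_permissions = ["can_create", "can_read", "can_edit", "can_delete"]
views_menus = ["DAGs", "Variables", "Connections"]

permissions_on_views = [
    f"{permission} on {view}"
    for permission, view in product(base_permissions, views_menus)
]

print(len(permissions_on_views))  # 12
print(permissions_on_views[0])    # can_create on DAGs
```

Four permissions times three views gives twelve pairs, and a role is then a chosen subset of such pairs.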

Now let’s look at our main entity: DAG.

We have the option to filter between Active and Paused DAGs.

Upon hovering over any task instance, we will see:

The task id, run id, start time and end time.

Now the interesting part is that the date in the run id is always a day earlier than the start date. We will discuss why in detail in later parts.
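
Until then, a short sketch of the arithmetic, assuming a DAG that is scheduled once a day: a run that covers a given day can only start after that day is over. The function name and the dates are made up for this example.

```python
from datetime import datetime, timedelta


def earliest_start(execution_date: datetime, schedule_interval: timedelta) -> datetime:
    """A run for an interval can start only once that interval has ended."""
    return execution_date + schedule_interval


# Assumption: a DAG scheduled once a day.
execution_date = datetime(2021, 6, 2)
start = earliest_start(execution_date, timedelta(days=1))
print(f"scheduled__{execution_date.isoformat()}")  # scheduled__2021-06-02T00:00:00
print(start.isoformat())                           # 2021-06-03T00:00:00
```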

When you click on any task:

You can mark the task as success. Upstream means the tasks before the current task, and downstream means the tasks after it. We can click on Downstream and then mark success, which marks the current task and all the tasks after it as success.

The Past and Future options work in a similar manner for the earlier and later runs of the task. When we clear the task, it starts running again.
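
Which tasks count as downstream can be worked out by walking the graph. Here is a minimal sketch on a made-up DAG; only invite_friends and party are task names from this post, the other names and all the edges are hypothetical.

```python
from collections import deque

# Hypothetical DAG: task -> tasks that run directly after it.
# Only invite_friends and party appear in this post; the rest is made up.
downstream_of = {
    "invite_friends": ["buy_snacks", "book_venue"],
    "buy_snacks": ["party"],
    "book_venue": ["party"],
    "party": [],
}


def with_downstream(task_id: str) -> list:
    """Return the task plus every task reachable after it, breadth first."""
    seen = [task_id]
    queue = deque([task_id])
    while queue:
        current = queue.popleft()
        for child in downstream_of[current]:
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


print(with_downstream("buy_snacks"))      # ['buy_snacks', 'party']
print(with_downstream("invite_friends"))  # ['invite_friends', 'buy_snacks', 'book_venue', 'party']
```

Marking success with Downstream selected would affect exactly the tasks in the returned list.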

This is the graph view, so party would run only after all the other tasks are successful. If you still want to run party regardless, click on Ignore task dependency and then run. However, this option is only available with the Celery or Kubernetes executor.

You can change the layout of the view and turn on Auto refresh.

When a new DAG comes up, its tasks have no status.

When the scheduled time arrives, a task moves to the scheduled state, then to queued, and then to running. From there it can succeed or fail, or go to up for retry or up for reschedule before it finally ends in failed or success. A task whose upstream task failed is marked as upstream failed.
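
The state flow can be written down as a small transition table. This is simplified from the description above and is not Airflow's full state machine; in particular, the edges leading out of up_for_retry and up_for_reschedule back to scheduled are an assumption of this sketch.

```python
# Simplified from the description above; not Airflow's full state machine.
ALLOWED = {
    "none": {"scheduled", "upstream_failed"},
    "scheduled": {"queued"},
    "queued": {"running"},
    "running": {"success", "failed", "up_for_retry", "up_for_reschedule"},
    "up_for_retry": {"scheduled"},       # assumption: retried via the scheduler
    "up_for_reschedule": {"scheduled"},  # assumption: same path as a retry
    "success": set(),
    "failed": set(),
    "upstream_failed": set(),
}


def is_valid_path(states):
    """Check that each consecutive pair of states is an allowed transition."""
    return all(nxt in ALLOWED[cur] for cur, nxt in zip(states, states[1:]))


print(is_valid_path(["none", "scheduled", "queued", "running", "success"]))  # True
print(is_valid_path(["none", "running"]))                                   # False
```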

We can also change the layout and select a particular DAG and date.

Everything that we see in the web server UI is read from the metadata database.

The UI queries its tables.

We can see the DAG table:

And the user table:

All the other tables can be explored similarly.
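
To illustrate the kind of query involved, here is a sketch against an in-memory SQLite database that stands in for the metadata database. The table is cut down to two columns, the rows are made up, and the function name is hypothetical; the real dag table has many more columns.

```python
import sqlite3

# Stand-in for the metadata database, with a cut-down "dag" table.
# Assumption: only two of the real table's columns, and made-up rows.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY, is_paused INTEGER)")
connection.executemany(
    "INSERT INTO dag (dag_id, is_paused) VALUES (?, ?)",
    [("party_dag", 0), ("cleanup_dag", 1)],  # hypothetical DAG ids
)


def dag_ids(paused: bool) -> list:
    """Return DAG ids filtered the way an Active/Paused filter would."""
    rows = connection.execute(
        "SELECT dag_id FROM dag WHERE is_paused = ? ORDER BY dag_id",
        (int(paused),),
    )
    return [row[0] for row in rows]


print(dag_ids(paused=False))  # ['party_dag']
print(dag_ids(paused=True))   # ['cleanup_dag']
```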

Thanks for reading!
