Limitations of the Hive Notification Mechanism

Note the following limitations of Hive notification:

  • Availability of hive notification

    Arcadia Instant connects to the MQ server, and uses the Hive notification system when it is available. Otherwise, it proceeds with its normal operation without notifications.

  • Connection to MQ

    ArcViz attempts to connect to the MQ server port every 60 seconds if it is not already connected. Because it cannot listen to notifications while disconnected, its metadata may be stale by the time it reconnects to the MQ server. This situation may require a manual invalidate metadata command.

  • Direct data load

    When files are loaded directly into an existing HDFS directory that corresponds to an existing table or table partition, there is no notification. You must issue a manual invalidate metadata command on that table to see the changes.

  • Default behavior

    Hive Notification has the following default behaviors:

Queueing Notifications when Hive is Unavailable

Because Hive does not change any metadata while it is down, no notifications are missed during that time. However, a problem can arise when ArcEngine or Impala cannot be reached, so that we queue messages, and Hive is then restarted. In such cases, the user must manually refresh or invalidate the metadata on ArcEngine or Impala.

Queueing Notifications when ArcEngine or Impala are Unavailable

If Impala or ArcEngine is down, we queue notifications so that they can be processed as refresh table_name or invalidate table_name after the service comes up again. This ensures that the service has the most recent copy of the metadata.
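
The queue-and-replay behavior described above can be sketched roughly as follows. This is only an illustration, not the product's implementation: the NotificationQueue class, its methods, and the execute callback are hypothetical names, and the statement strings simply follow the refresh table_name / invalidate table_name shorthand used in the text.

```python
from collections import deque


class NotificationQueue:
    """Hypothetical buffer for metadata notifications (not the real ArcEngine code)."""

    def __init__(self):
        # Statements waiting for the engine to come back, oldest first.
        self.pending = deque()

    def handle(self, command, table_name, engine_up, execute):
        """Run the statement now if the engine is reachable; otherwise queue it."""
        statement = f"{command} {table_name}"
        if engine_up:
            execute(statement)
        else:
            self.pending.append(statement)

    def drain(self, execute):
        """Replay queued statements in arrival order; return how many were run."""
        count = 0
        while self.pending:
            execute(self.pending.popleft())
            count += 1
        return count
```

In this sketch, a caller would invoke drain once the service is reachable again, passing whatever function submits a statement to ArcEngine or Impala. Replaying in arrival order is an assumption made here so that later changes are applied after earlier ones.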

Invalidating Analytical Views

In order to implement the alter table rename and alter table change column SQL calls, ArcEngine invalidates all analytical views associated with the altered table.

We use the following new command to cycle through the relevant analytical views and mark them invalid:
invalidate analytical view table table_name;
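
Conceptually, the command walks the analytical views that depend on the altered table and flags each one. The following is a minimal sketch of that idea under stated assumptions: the AnalyticalView record, its base_tables and valid fields, and the invalidate_views_for_table function are hypothetical and do not reflect ArcEngine's internal data structures.

```python
from dataclasses import dataclass


@dataclass
class AnalyticalView:
    """Hypothetical record of an analytical view and the tables it is built on."""
    name: str
    base_tables: set
    valid: bool = True


def invalidate_views_for_table(views, table_name):
    """Mark every still-valid view built on table_name as invalid.

    Returns the names of the views that were newly invalidated.
    """
    invalidated = []
    for view in views:
        if table_name in view.base_tables and view.valid:
            view.valid = False
            invalidated.append(view.name)
    return invalidated
```

Views that do not reference the altered table are left untouched, and running the function a second time for the same table changes nothing.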