Change some default values for Rabbit driver
It was observed that with rabbitmq-server version 3.8.2 the default
values cause rapid connection re-creation during failover of one of the
RabbitMQ nodes. In some cases this leads to the creation of broken
exchanges and hangs of OpenStack operations. Setting
rabbit_retry_interval to 5, rabbit_retry_backoff to 10 and
kombu_reconnect_delay to 5.0 fixes the issue.
This change is Pike-only; for Queens the same change is implemented in
the oslo-templates formula.
Related-Issue: PROD-34332
Change-Id: Iad7058142631c92c5665b5e432ee90141a1a16b4
diff --git a/ironic/files/pike/ironic.conf b/ironic/files/pike/ironic.conf
index 3f3ff42..e8bceb2 100644
--- a/ironic/files/pike/ironic.conf
+++ b/ironic/files/pike/ironic.conf
@@ -3395,10 +3395,15 @@
# Deprecated group/name - [oslo_messaging_rabbit]/kombu_ssl_ca_certs
#ssl_ca_file =
+# NOTE(pas-ha) the default value of the option below is problematic with RMQ 3.8,
+# see PROD-34322
+# recreating queues on a secondary broker immediately after primary broker
+# has gone down leads to these queues being non-functional.
# How long to wait before reconnecting in response to an AMQP
# consumer cancel notification. (floating point value)
# Deprecated group/name - [DEFAULT]/kombu_reconnect_delay
#kombu_reconnect_delay = 1.0
+kombu_reconnect_delay = 5.0
# EXPERIMENTAL: Possible values are: gzip, bz2. If not set
# compression will not be used. This option may not be
@@ -3470,14 +3475,24 @@
# Reason: Replaced by [DEFAULT]/transport_url
#rabbit_virtual_host = /
+# NOTE(pas-ha) the default value of the option below is problematic with RMQ 3.8,
+# see PROD-34322
+# recreating queues on a secondary broker immediately after primary broker
+# has gone down leads to these queues being non-functional.
# How frequently to retry connecting with RabbitMQ. (integer
# value)
#rabbit_retry_interval = 1
+rabbit_retry_interval = 5
+# NOTE(pas-ha) the default value of the option below is problematic with RMQ 3.8,
+# see PROD-34322
+# recreating queues on a secondary broker immediately after primary broker
+# has gone down leads to these queues being non-functional.
# How long to backoff for between retries when connecting to
# RabbitMQ. (integer value)
# Deprecated group/name - [DEFAULT]/rabbit_retry_backoff
#rabbit_retry_backoff = 2
+rabbit_retry_backoff = 10
# Maximum interval of RabbitMQ connection retries. Default is
# 30 seconds. (integer value)