Snom desk phones support SIP failover and load balancing via DNS (NAPTR and SRV records).
- When the phone needs to send a SIP request, it gets a list of servers from DNS.
- The phone then chooses a server from the list according to the NAPTR and SRV priority and weight (for a deeper explanation of NAPTR and SRV configuration, see https://en.wikipedia.org/wiki/NAPTR_record and https://en.wikipedia.org/wiki/SRV_record).
- If the chosen server refuses the connection or does not reply within 32 seconds, the request is sent to the next server, and so on.
- The 32-second timeout can be shortened by lowering the sip_retry_t1 setting.
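The selection step above can be illustrated with a minimal sketch of SRV ordering in the style of RFC 2782: lower priority values are tried first, and among records with the same priority the weight sets the relative chance of being picked. This is not the phone's actual implementation; the names `SrvRecord` and `order_servers` and the example targets are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower value is tried first
    weight: int    # relative share among records of equal priority
    target: str    # hypothetical host name
    port: int = 5060


def order_servers(records, rng=random):
    """Return the records in the order a client would try them."""
    ordered = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            weights = [r.weight for r in group]
            if sum(weights) == 0:
                # All weights are zero: every record is equally likely.
                pick = rng.choice(group)
            else:
                # Weighted random pick among the remaining records.
                pick = rng.choices(group, weights=weights, k=1)[0]
            ordered.append(pick)
            group.remove(pick)
    return ordered


if __name__ == "__main__":
    records = [
        SrvRecord(priority=10, weight=50, target="sipserver52"),
        SrvRecord(priority=10, weight=50, target="sipserver114"),
    ]
    print([r.target for r in order_servers(records)])
```

With equal priority and weight, as in the records above, either server may come first on any given request, which is what spreads the load.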
Below is a simplified example of how this works.
For our registrar snom.rocks, we would like to use two SIP servers, sipserver52 and sipserver114, for load balancing.
The identity on the Snom phone is configured with Registrar: snom.rocks.
The DNS (bind9) configuration for snom.rocks contains one NAPTR entry and two SRV entries, one per server, with identical priority and weight.
Here is a summary of a potential failover scenario:
- Initially, the phone sends the REGISTER request to a randomly chosen server, in this case sipserver52. The phone is now registered.
- Suppose that after 5 minutes sipserver52 crashes or stops answering.
- Later, the user wants to make a call. The phone sends the INVITE request to either sipserver52 or sipserver114, chosen at random.
If the INVITE is sent to sipserver114, the call succeeds and there is no problem. If sipserver52 is chosen instead, the phone gets no answer. For this example, let's assume sipserver52 is chosen.
- The phone does not receive an answer, so it stays in the calling state for 32 seconds.
- After 32 seconds, the phone automatically sends the INVITE to sipserver114. That server is working, and the call is successful.
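The scenario above can be traced with a small simulation. It is a sketch under stated assumptions, not the phone's firmware logic: it assumes the 32 seconds correspond to the RFC 3261 transaction timeout of 64 × T1, and that sip_retry_t1 is given in milliseconds with a value of 500. The function names and the `is_alive` callback are hypothetical.

```python
SIP_T1_MS = 500  # assumed value of sip_retry_t1, in milliseconds


def transaction_timeout_s(t1_ms):
    """Transaction timeout per RFC 3261 (Timer B / Timer F): 64 * T1."""
    return 64 * t1_ms / 1000


def send_with_failover(servers, is_alive, t1_ms=SIP_T1_MS):
    """Try servers in order.

    Returns (server that answered, or None; seconds spent waiting).
    Simplification: a dead server always costs one full timeout, although
    an actively refused connection would fail over sooner.
    """
    waited = 0.0
    for server in servers:
        if is_alive(server):
            return server, waited
        waited += transaction_timeout_s(t1_ms)
    return None, waited


if __name__ == "__main__":
    alive = {"sipserver114"}  # sipserver52 has crashed
    order = ["sipserver52", "sipserver114"]
    print(send_with_failover(order, lambda s: s in alive))
    # -> ('sipserver114', 32.0)
    print(send_with_failover(order, lambda s: s in alive, t1_ms=250))
    # -> ('sipserver114', 16.0)
```

Under these assumptions, halving T1 halves the time the user spends in the calling state before the second server is tried.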
A few important aspects:
- The server selected first for an INVITE is not necessarily the server that was originally selected for the REGISTER. The choice depends on the NAPTR and SRV priority and weight, which you configure on the DNS server. In the example above there was only one NAPTR entry, and both priority and weight were the same for the two SRV entries, so neither server is preferred and the phone alternates between them.
- If 32 seconds is too long, you can shorten the timeout by lowering sip_retry_t1.
- It is also recommended to enable the dirty host mechanism so that sipserver52 is not tried again for a while. It is controlled by the dirty_host_ttl setting and is not active by default.
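The idea behind the dirty host mechanism can be sketched as a small time-limited blocklist: a server that failed is skipped until its entry expires. This is only an illustration of the concept. The class `DirtyHosts`, its `ttl_s` parameter, and the fallback to the full list when every host is marked are assumptions, not the documented behavior of dirty_host_ttl.

```python
import time


class DirtyHosts:
    """Remember failed hosts for a limited time (hypothetical helper)."""

    def __init__(self, ttl_s, clock=time.monotonic):
        self.ttl_s = ttl_s
        self.clock = clock
        self._dirty_until = {}

    def mark(self, host):
        """Flag a host as failed, starting now."""
        self._dirty_until[host] = self.clock() + self.ttl_s

    def is_dirty(self, host):
        expiry = self._dirty_until.get(host)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            # The entry has expired: the host may be tried again.
            del self._dirty_until[host]
            return False
        return True

    def usable(self, hosts):
        """Return hosts that are not flagged, keeping their order."""
        clean = [h for h in hosts if not self.is_dirty(h)]
        # Assumption: if every host is flagged, try them all anyway.
        return clean or list(hosts)


if __name__ == "__main__":
    dirty = DirtyHosts(ttl_s=60)
    dirty.mark("sipserver52")  # the INVITE to sipserver52 timed out
    print(dirty.usable(["sipserver52", "sipserver114"]))
    # -> ['sipserver114']
```

With such a list in place, only the first request after the crash pays the full timeout; later requests go straight to sipserver114 until the entry expires.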