SIP Session Recovery: How to recover the SIP session after an Asterisk crash

Background

VoIPBin is a CPaaS (Communication Platform as a Service) that provides real-time call services built on top of Asterisk.
When an Asterisk instance unexpectedly crashes during an ongoing call, all SIP sessions on that instance disappear immediately. From the client’s perspective, the session is terminated without a BYE, causing issues such as:

  • Interrupted TTS playback
  • Unexpected conference exit
  • Media channel failure

To address this, I implemented SIP session recovery, making it appear to the client as if the session is still active — even after the Asterisk process handling it has crashed.


Overview of the Recovery Process

After an Asterisk crash, VoIPBin performs a SIP session recovery using the following steps:

  1. Crash detection
    VoIPBin’s sentinel-manager detects that an Asterisk instance has crashed.
  2. Session lookup from DB
    We query our internal database for all call sessions that were being handled by the failed instance.
  3. Collect recoverable SIP fields via HOMER
    Using the HOMER API, we retrieve SIP header fields for each session:
    • From/To URI and Display
    • Tags (From/To Tag), Call-ID, CSeq
    • Route and Record-Route headers
    • Request URI
  4. Create new SIP channel on another Asterisk
    We select a healthy Asterisk instance and create new SIP channels to recover the affected sessions.
  5. Set recovery-related channel variables
    The following channel variables are set to ensure that the INVITE message reuses the original session’s identity:

      channelVariableRecoveryFromDisplay  = "PJSIP_RECOVERY_FROM_DISPLAY"
      channelVariableRecoveryFromURI      = "PJSIP_RECOVERY_FROM_URI"
      channelVariableRecoveryFromTag      = "PJSIP_RECOVERY_FROM_TAG"
      channelVariableRecoveryToDisplay    = "PJSIP_RECOVERY_TO_DISPLAY"
      channelVariableRecoveryToURI        = "PJSIP_RECOVERY_TO_URI"
      channelVariableRecoveryToTag        = "PJSIP_RECOVERY_TO_TAG"
      channelVariableRecoveryCallID       = "PJSIP_RECOVERY_CALL-ID"
      channelVariableRecoveryCSeq         = "PJSIP_RECOVERY_CSEQ"
      channelVariableRecoveryRoutes       = "PJSIP_RECOVERY_ROUTES"
      channelVariableRecoveryRecordRoutes = "PJSIP_RECOVERY_RECORD-ROUTES"
      channelVariableRecoveryRequestURI   = "PJSIP_RECOVERY_REQUEST_URI"
  6. Send INVITE → client treats it as session continuation
    Because the INVITE reuses the original Call-ID, tags, and headers, the client interprets it as a re-INVITE and resumes the session.
  7. RTP and SIP session fully restored
    Media flow and signaling are successfully re-established. The client resumes communication as if nothing happened.
  8. Resume Flow execution
    Once recovered, the call resumes its Flow execution from just before the crash.
    For example, if the user was in a conference, they are reconnected to the same conference bridge; if TTS was being played, it resumes based on the Flow definition.
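
Steps 3 to 5 can be sketched in Go roughly as follows. This is only an illustration, not VoIPBin’s actual code: the RecoveryDetail type, its field names, and the comma-joined encoding of the Route and Record-Route lists are assumptions made for the example. Only the channel variable names come from the list in step 5.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Channel variable names, as listed in step 5.
const (
	channelVariableRecoveryFromDisplay  = "PJSIP_RECOVERY_FROM_DISPLAY"
	channelVariableRecoveryFromURI      = "PJSIP_RECOVERY_FROM_URI"
	channelVariableRecoveryFromTag      = "PJSIP_RECOVERY_FROM_TAG"
	channelVariableRecoveryToDisplay    = "PJSIP_RECOVERY_TO_DISPLAY"
	channelVariableRecoveryToURI        = "PJSIP_RECOVERY_TO_URI"
	channelVariableRecoveryToTag        = "PJSIP_RECOVERY_TO_TAG"
	channelVariableRecoveryCallID       = "PJSIP_RECOVERY_CALL-ID"
	channelVariableRecoveryCSeq         = "PJSIP_RECOVERY_CSEQ"
	channelVariableRecoveryRoutes       = "PJSIP_RECOVERY_ROUTES"
	channelVariableRecoveryRecordRoutes = "PJSIP_RECOVERY_RECORD-ROUTES"
	channelVariableRecoveryRequestURI   = "PJSIP_RECOVERY_REQUEST_URI"
)

// RecoveryDetail is a hypothetical container for the SIP fields
// collected from HOMER in step 3.
type RecoveryDetail struct {
	FromDisplay, FromURI, FromTag string
	ToDisplay, ToURI, ToTag       string
	CallID                        string
	CSeq                          int
	Routes                        []string
	RecordRoutes                  []string
	RequestURI                    string
}

// ChannelVariables maps the recovered fields to the channel variables
// that are set on the new channel in step 5.
// Assumption: header lists are passed as one comma-separated string.
func (d RecoveryDetail) ChannelVariables() map[string]string {
	return map[string]string{
		channelVariableRecoveryFromDisplay:  d.FromDisplay,
		channelVariableRecoveryFromURI:      d.FromURI,
		channelVariableRecoveryFromTag:      d.FromTag,
		channelVariableRecoveryToDisplay:    d.ToDisplay,
		channelVariableRecoveryToURI:        d.ToURI,
		channelVariableRecoveryToTag:        d.ToTag,
		channelVariableRecoveryCallID:       d.CallID,
		channelVariableRecoveryCSeq:         strconv.Itoa(d.CSeq),
		channelVariableRecoveryRoutes:       strings.Join(d.Routes, ","),
		channelVariableRecoveryRecordRoutes: strings.Join(d.RecordRoutes, ","),
		channelVariableRecoveryRequestURI:   d.RequestURI,
	}
}

func main() {
	// Example values only; a real session would use the values found in HOMER.
	d := RecoveryDetail{
		FromDisplay: "Service", FromURI: "sip:service@example.com", FromTag: "a1b2",
		ToDisplay: "Alice", ToURI: "sip:alice@example.com", ToTag: "c3d4",
		CallID: "call-1234@example.com", CSeq: 3,
		Routes:     []string{"<sip:proxy.example.com;lr>"},
		RequestURI: "sip:alice@192.0.2.10:5060",
	}

	vars := d.ChannelVariables()
	keys := make([]string, 0, len(vars))
	for k := range vars {
		keys = append(keys, k)
	}
	sort.Strings(keys) // print in a stable order
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, vars[k])
	}
}
```

Keeping the mapping in one function means every recovered session goes through the same field-to-variable translation, whichever Asterisk instance ends up hosting the new channel.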

Asterisk Patch for Recovery

To support this functionality, I patched Asterisk’s PJSIP stack to allow overriding SIP header fields based on channel variables:

val_from_display_c_str = pbx_builtin_getvar_helper(session->channel, "PJSIP_RECOVERY_FROM_DISPLAY");
val_from_uri_c_str     = pbx_builtin_getvar_helper(session->channel, "PJSIP_RECOVERY_FROM_URI");
val_from_tag_c_str     = pbx_builtin_getvar_helper(session->channel, "PJSIP_RECOVERY_FROM_TAG");

val_to_display_c_str   = pbx_builtin_getvar_helper(session->channel, "PJSIP_RECOVERY_TO_DISPLAY");
val_to_uri_c_str       = pbx_builtin_getvar_helper(session->channel, "PJSIP_RECOVERY_TO_URI");
val_to_tag_c_str       = pbx_builtin_getvar_helper(session->channel, "PJSIP_RECOVERY_TO_TAG");

// Call-ID, CSeq, Routes, and others are handled similarly

With this patch, a newly created SIP channel can impersonate the original one — making the recovery INVITE look like a legitimate continuation of the previous session.
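
To make “looks like a continuation” more concrete, here is a small Go sketch that renders the dialog-identifying header lines of such an INVITE. It is not part of the Asterisk patch: the dialogIdentity type and helper names are hypothetical, and the display name is written without any escaping. The key point is that an INVITE carrying the existing Call-ID and both tags (including the To tag) is treated by the peer as an in-dialog request, i.e. a re-INVITE.

```go
package main

import (
	"fmt"
	"strings"
)

// dialogIdentity is a hypothetical holder for the values that identify
// an existing SIP dialog.
type dialogIdentity struct {
	FromDisplay, FromURI, FromTag string
	ToDisplay, ToURI, ToTag       string
	CallID                        string
	CSeq                          int
}

// nameAddr formats a From/To header value: an optional quoted display
// name, the URI in angle brackets, and an optional tag parameter.
// No escaping is applied to the display name in this sketch.
func nameAddr(display, uri, tag string) string {
	var b strings.Builder
	if display != "" {
		b.WriteString(`"` + display + `" `)
	}
	b.WriteString("<" + uri + ">")
	if tag != "" {
		b.WriteString(";tag=" + tag)
	}
	return b.String()
}

// headerLines returns the header lines that tie the INVITE to the dialog.
func (d dialogIdentity) headerLines() []string {
	return []string{
		"From: " + nameAddr(d.FromDisplay, d.FromURI, d.FromTag),
		"To: " + nameAddr(d.ToDisplay, d.ToURI, d.ToTag),
		"Call-ID: " + d.CallID,
		fmt.Sprintf("CSeq: %d INVITE", d.CSeq),
	}
}

func main() {
	// Example values only.
	d := dialogIdentity{
		FromDisplay: "Service", FromURI: "sip:service@example.com", FromTag: "a1b2",
		ToDisplay: "Alice", ToURI: "sip:alice@example.com", ToTag: "c3d4",
		CallID: "call-1234@example.com", CSeq: 3,
	}
	for _, line := range d.headerLines() {
		fmt.Println(line)
	}
}
```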


Final Thoughts

SIP session recovery is a critical capability for handling unexpected Asterisk failures in production.
By combining fast crash detection with smart session restoration logic, VoIPBin ensures that users experience minimal service disruption.

I’m planning to expand this functionality to include broader support for mid-call state recovery.

If you’re building real-time telephony platforms or CPaaS services, I hope this gives you some insight into how SIP session recovery can be handled at a deep level, even in the face of infrastructure-level crashes.

New feature: Admin page

Project voipbin: Admin page

I wanted to share some news with you: I’ve added a new feature to the voipbin project, called the admin page.

I thought it would be nice to give you a heads-up on how it works.

The admin page relies exclusively on voipbin’s APIs (https://lnkd.in/eK_zpWp), bringing you the best of what voipbin has to offer. It’s like having a one-stop shop for all things voipbin!

I must admit, there’s still some fine-tuning and development happening in certain areas, but I couldn’t wait to show you what’s in store.

For now it is open only to administrators, but it will be opened to regular users as well.
Curious to check it out? Just follow the link: http://admin.voipbin.net

Exciting times lie ahead! Can’t wait to hear what you think and what’s next on your mind. Let’s keep the conversation going!

What’s next? 🙂

http://admin.voipbin.net

New feature: WebRTC call and agent permission

Project voipbin: Enhanced with WebRTC Calling and agent permission.

I added two new features to the voipbin project: Agent Permissions and WebRTC Calling.

Agent Permissions: Now, each agent is equipped with specific permissions tailored to their role. Depending on the assigned permission level, agents will have access to a set of API operations, with the administrator’s menu adapting accordingly.

When an agent logs in, the permissions granted to that agent determine which administrative options are visible.
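
As a rough illustration of how a permission level can drive menu visibility, here is a small Go sketch. The permission names, the bitmask representation, and the menu items are all assumptions made for the example; they are not taken from voipbin’s actual API.

```go
package main

import "fmt"

// Permission is a hypothetical bitmask of agent permissions.
type Permission uint16

const (
	PermissionAgent   Permission = 1 << iota // regular agent
	PermissionManager                        // team manager
	PermissionAdmin                          // administrator
)

// HasAny reports whether p contains at least one of the required bits.
func (p Permission) HasAny(required Permission) bool {
	return p&required != 0
}

// menuItem is a hypothetical admin menu entry and the permissions
// that are allowed to see it.
type menuItem struct {
	Name     string
	Required Permission
}

// visibleMenu returns the names of the items the agent may see.
func visibleMenu(p Permission, items []menuItem) []string {
	var out []string
	for _, it := range items {
		if p.HasAny(it.Required) {
			out = append(out, it.Name)
		}
	}
	return out
}

func main() {
	menu := []menuItem{
		{Name: "Calls", Required: PermissionAgent | PermissionManager | PermissionAdmin},
		{Name: "Queues", Required: PermissionManager | PermissionAdmin},
		{Name: "Billing", Required: PermissionAdmin},
	}
	fmt.Println(visibleMenu(PermissionAgent, menu))
	fmt.Println(visibleMenu(PermissionAdmin, menu))
}
```

The same check can guard API operations on the server side, so the menu and the API stay consistent with each other.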

WebRTC Calling Integration: We have seamlessly integrated WebRTC calling functionality into the system. When an agent logs into the admin interface, the webphone is automatically registered, setting the stage for smooth initiation and receipt of WebRTC calls.

This combination of agent permissions and WebRTC calling not only enhances the user experience but also ensures a more streamlined and efficient communication process within Project VoIPBin.

What’s next? 🙂

http://admin.voipbin.net

New feature: groupcall

I added a new feature in VoIPBin called groupcall. With groupcall, you can easily make calls to multiple destinations at once.

It offers two ring methods: ringall and linear.

In the ringall method, all destinations are called simultaneously. The first destination that answers the call executes the given call flow, while the calls to the remaining destinations are hung up immediately. This is ideal for blasting calls and quickly reaching out to a group of people.

Alternatively, the linear method dials the destinations one by one in a specified order. If a destination doesn’t answer, the system moves on to the next one until a call is answered. This is useful for implementing a huntgroup call strategy, ensuring calls are routed sequentially based on priority.

But there’s more! VoIPBin also supports nested groupcalls. You can include a groupcall as a destination within another groupcall. This allows for even greater flexibility and customization in call routing. Each nested groupcall has its own ring method, and the calls follow the specified strategy within the nested groupcall.
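
The two ring methods and the nesting can be modeled with a small Go simulation. This is a sketch under assumptions, not VoIPBin’s implementation: the type names, the per-destination ring timeout, and the idea of describing each destination by “answers after N seconds” are all invented for the example.

```go
package main

import "fmt"

type RingMethod string

const (
	RingAll RingMethod = "ringall"
	Linear  RingMethod = "linear"
)

// Destination is either a plain target or a nested groupcall.
type Destination struct {
	Target string
	Group  *Groupcall
}

type Groupcall struct {
	Method       RingMethod
	Destinations []Destination
}

// dial simulates ringing one destination. answerAfter holds the number
// of seconds after which a target answers; missing targets never answer.
func dial(d Destination, answerAfter map[string]int, ringTimeout int) (string, int, bool) {
	if d.Group != nil {
		return d.Group.Resolve(answerAfter, ringTimeout)
	}
	if delay, ok := answerAfter[d.Target]; ok && delay <= ringTimeout {
		return d.Target, delay, true
	}
	return "", ringTimeout, false
}

// Resolve returns the target that ends up answering, the elapsed
// seconds, and whether anyone answered at all.
func (g *Groupcall) Resolve(answerAfter map[string]int, ringTimeout int) (string, int, bool) {
	winner, elapsed, answered := "", 0, false
	for _, d := range g.Destinations {
		target, t, ok := dial(d, answerAfter, ringTimeout)
		switch g.Method {
		case Linear:
			// One by one: time adds up, the first answer wins.
			elapsed += t
			if ok {
				return target, elapsed, true
			}
		case RingAll:
			// All at once: the earliest answer wins; with no answer,
			// the group has rung for as long as its slowest member.
			if ok && (!answered || t < elapsed) {
				winner, elapsed, answered = target, t, true
			} else if !ok && !answered && t > elapsed {
				elapsed = t
			}
		}
	}
	return winner, elapsed, answered
}

func main() {
	sales := &Groupcall{Method: Linear, Destinations: []Destination{
		{Target: "bob"}, {Target: "carol"},
	}}
	top := &Groupcall{Method: RingAll, Destinations: []Destination{
		{Target: "alice"}, {Group: sales}, {Target: "dave"},
	}}
	answerAfter := map[string]int{"carol": 2, "dave": 8}

	target, elapsed, ok := top.Resolve(answerAfter, 10)
	fmt.Println(target, elapsed, ok)
}
```

In this example the nested linear group only reaches carol after bob’s 10-second timeout, so dave, who is rung in parallel by the outer ringall group, answers first.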

With groupcall in VoIPBin, you can efficiently manage calls, control call flows, and reach multiple destinations simultaneously or sequentially. It’s a powerful tool for optimizing communication and streamlining your calling processes.

What’s next? 🙂

https://api.voipbin.net/docs/call.html#groupcall

CPaaS: Flow control

Traditional CTI services use a basic flow execution model to control the flow of calls. When a call comes in, the flow starts with call control actions like answering, transferring to a queue, and other simple actions.

The flow execution continues until the call ends, or the user initiates a hang-up command to stop the flow execution. However, this model is limited to call channels only, and other communication channels are not supported.

This was fine for traditional CTI services. But how can we provide the same thing with CPaaS technology? How should we control the flow if the channel is not a call? CPaaS supports many more communication channels, including voice, video, SMS, email, and others. To control the flow of all these channels, CPaaS providers should offer a unified interface that works for every channel type.

I implemented this in VoIPBin as a feature called “Activeflow”.

Activeflow is a powerful concept that provides a unified flow control interface for all channel types. It is created when flow execution starts and allows users to control the flow for any channel type. In VoIPBin’s implementation of Activeflow, customers can use the hang-up command for call channels and Activeflow’s stop command for all other channel types.
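
The idea can be sketched in Go like this. It is a simplified model, not VoIPBin’s real data structure: the Activeflow fields, the reference types, and the action names are assumptions made for the example. It shows a flow that runs until its actions are exhausted or it is stopped, with hang-up acting as the call-specific way to stop it.

```go
package main

import (
	"errors"
	"fmt"
)

// ReferenceType names the kind of channel the flow is attached to.
type ReferenceType string

const (
	ReferenceCall    ReferenceType = "call"
	ReferenceMessage ReferenceType = "message"
)

// Activeflow is a hypothetical, simplified running instance of a flow.
type Activeflow struct {
	ReferenceType ReferenceType
	Actions       []string
	cursor        int
	stopped       bool
}

// Next returns the next action to execute, or false once the flow
// has finished or has been stopped.
func (a *Activeflow) Next() (string, bool) {
	if a.stopped || a.cursor >= len(a.Actions) {
		return "", false
	}
	action := a.Actions[a.cursor]
	a.cursor++
	return action, true
}

// Stop halts the flow, whatever the channel type is.
func (a *Activeflow) Stop() {
	a.stopped = true
}

// Hangup models the call-specific command: it is only valid for call
// channels, and ending the call also stops the flow.
func (a *Activeflow) Hangup() error {
	if a.ReferenceType != ReferenceCall {
		return errors.New("hangup is only valid for call channels")
	}
	a.Stop()
	return nil
}

func main() {
	af := &Activeflow{
		ReferenceType: ReferenceMessage,
		Actions:       []string{"message_send", "sleep", "message_send"},
	}
	for {
		action, ok := af.Next()
		if !ok {
			break
		}
		fmt.Println("executing", action)
		if action == "sleep" {
			af.Stop() // e.g. the customer calls the stop command here
		}
	}
}
```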

Overall, Activeflow is a critical feature that CPaaS providers should offer, as it provides a simple and unified way to control the flow of all communication channels. This helps users save time and increase productivity while managing their communication channels in a more flexible and customizable way.

More details about Activeflow are available here:

https://api.voipbin.net/docs/activeflow.html