{"id":4044,"date":"2025-06-19T14:59:09","date_gmt":"2025-06-19T05:59:09","guid":{"rendered":"https:\/\/pchero21.com\/?p=4044"},"modified":"2025-06-19T14:59:09","modified_gmt":"2025-06-19T05:59:09","slug":"sip-session-recovery-how-to-recover-the-sip-session-after-an-asterisk-crash","status":"publish","type":"post","link":"http:\/\/pchero21.com\/?p=4044","title":{"rendered":"SIP Session Recovery: How to recover the SIP session after an Asterisk crash"},"content":{"rendered":"\n<h2>Background<\/h2>\n\n\n\n<p><strong>VoIPBin<\/strong> is a <strong>CPaaS (Communication Platform as a Service)<\/strong> that provides real-time call services built on top of Asterisk.<br>When an Asterisk instance unexpectedly crashes during an ongoing call, all SIP sessions on that instance disappear immediately. From the client\u2019s perspective, the session is terminated <strong>without a BYE<\/strong>, causing issues such as:<\/p>\n\n\n\n<ul><li>Interrupted TTS playback<\/li><li>Unexpected conference exit<\/li><li>Media channel failure<\/li><\/ul>\n\n\n\n<p>To address this, I implemented <strong>SIP session recovery<\/strong>, making it appear to the client as if the session is still active \u2014 even after the Asterisk process handling it has crashed.<\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<h2>Overview of the Recovery Process<\/h2>\n\n\n\n<p>After an Asterisk crash, VoIPBin performs a SIP session recovery using the following steps:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/pchero21.com\/wp-content\/uploads\/2025\/06\/voipbin-session-recovery.drawio.png\"><img loading=\"lazy\" width=\"1361\" height=\"819\" src=\"https:\/\/pchero21.com\/wp-content\/uploads\/2025\/06\/voipbin-session-recovery.drawio.png\" alt=\"\" class=\"wp-image-4050\"\/><\/a><\/figure>\n\n\n\n<ol><li><strong>Crash detection<\/strong><br>VoIPBin\u2019s <code>sentinel-manager<\/code> detects that an Asterisk instance has crashed.<\/li><li><strong>Session lookup from 
DB<\/strong><br>We query our internal database for all call sessions that were being handled by the failed instance.<\/li><li><strong>Collect recoverable SIP fields via HOMER<\/strong><br>Using the HOMER API, we retrieve SIP header fields for each session:<ul><li>From\/To URI and display name<\/li><li>Tags (From\/To Tag), Call-ID, CSeq<\/li><li>Route and Record-Route headers<\/li><li>Request URI<\/li><\/ul><\/li><li><strong>Create new SIP channels on another Asterisk instance<\/strong><br>We select a healthy Asterisk instance and create new SIP channels to recover the affected sessions.<\/li><li><strong>Set recovery-related channel variables<\/strong><br>The following channel variables are set to ensure that the INVITE message reuses the original session\u2019s identity:<br><code>channelVariableRecoveryFromDisplay = \"PJSIP_RECOVERY_FROM_DISPLAY\"<br>channelVariableRecoveryFromURI = \"PJSIP_RECOVERY_FROM_URI\"<br>channelVariableRecoveryFromTag = \"PJSIP_RECOVERY_FROM_TAG\"<br>channelVariableRecoveryToDisplay = \"PJSIP_RECOVERY_TO_DISPLAY\"<br>channelVariableRecoveryToURI = \"PJSIP_RECOVERY_TO_URI\"<br>channelVariableRecoveryToTag = \"PJSIP_RECOVERY_TO_TAG\"<br>channelVariableRecoveryCallID = \"PJSIP_RECOVERY_CALL-ID\"<br>channelVariableRecoveryCSeq = \"PJSIP_RECOVERY_CSEQ\"<br>channelVariableRecoveryRoutes = \"PJSIP_RECOVERY_ROUTES\"<br>channelVariableRecoveryRecordRoutes = \"PJSIP_RECOVERY_RECORD-ROUTES\"<br>channelVariableRecoveryRequestURI = \"PJSIP_RECOVERY_REQUEST_URI\"<\/code><\/li><li><strong>Send INVITE \u2192 client treats it as session continuation<\/strong><br>Because the INVITE reuses the original Call-ID, tags, and headers, the client matches it to the existing dialog, interprets it as a <strong>re-INVITE<\/strong>, and resumes the session.<\/li><li><strong>RTP and SIP session fully restored<\/strong><br>Media flow and signaling are successfully re-established. 
The client resumes communication as if nothing had happened.<\/li><li><strong>Resume Flow execution<\/strong><br>Once recovered, the call resumes its Flow execution from just before the crash.<br>For example, if the user was in a conference, they are reconnected to the same conference bridge; if TTS was being played, it resumes based on the Flow definition.<\/li><\/ol>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/pchero21.com\/wp-content\/uploads\/2025\/06\/voipbin-project-diagram-sip-session-recovery.drawio.png\"><img loading=\"lazy\" width=\"1771\" height=\"1684\" src=\"https:\/\/pchero21.com\/wp-content\/uploads\/2025\/06\/voipbin-project-diagram-sip-session-recovery.drawio.png\" alt=\"\" class=\"wp-image-4047\"\/><\/a><\/figure>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<h2>Asterisk Patch for Recovery<\/h2>\n\n\n\n<p>To support this functionality, I patched Asterisk\u2019s PJSIP stack to allow overriding SIP header fields based on channel variables:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>\/\/ Read the recovery overrides from channel variables; each is NULL when unset\nval_from_display_c_str = pbx_builtin_getvar_helper(session->channel, \"PJSIP_RECOVERY_FROM_DISPLAY\");\nval_from_uri_c_str     = pbx_builtin_getvar_helper(session->channel, \"PJSIP_RECOVERY_FROM_URI\");\nval_from_tag_c_str     = pbx_builtin_getvar_helper(session->channel, \"PJSIP_RECOVERY_FROM_TAG\");\n\nval_to_display_c_str   = pbx_builtin_getvar_helper(session->channel, \"PJSIP_RECOVERY_TO_DISPLAY\");\nval_to_uri_c_str       = pbx_builtin_getvar_helper(session->channel, \"PJSIP_RECOVERY_TO_URI\");\nval_to_tag_c_str       = pbx_builtin_getvar_helper(session->channel, \"PJSIP_RECOVERY_TO_TAG\");\n\n\/\/ Call-ID, CSeq, Routes, and others are handled similarly<\/code><\/pre>\n\n\n\n<p>With this patch, a newly created SIP channel can impersonate the original one, making the recovery INVITE look like a legitimate continuation of the previous session.<\/p>\n\n\n\n<ul><li><a 
href=\"https:\/\/github.com\/voipbin\/etc\/blob\/main\/asterisk\/add_pjsip_recovery.patch\">https:\/\/github.com\/voipbin\/etc\/blob\/main\/asterisk\/add_pjsip_recovery.patch<\/a><\/li><\/ul>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<h2>Final Thoughts<\/h2>\n\n\n\n<p>SIP session recovery is a critical capability for handling unexpected Asterisk failures in production.<br>By combining fast crash detection with session restoration that reuses the original dialog\u2019s identity, VoIPBin ensures that users experience <strong>minimal service disruption<\/strong>.<\/p>\n\n\n\n<p>I&#8217;m planning to expand this functionality to include:<\/p>\n\n\n\n<ul><li>Broader support for mid-call state recovery<\/li><\/ul>\n\n\n\n<p>If you&#8217;re building real-time telephony platforms or CPaaS services, I hope this gives some insight into how SIP session recovery can be handled, even in the face of infrastructure-level crashes.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Background VoIPBin is a CPaaS (Communication Platform as a Service) that provides real-time call services built on top of Asterisk. When an Asterisk instance unexpectedly crashes during an ongoing call, all SIP sessions on that instance disappear immediately. 
From the client\u2019s &hellip; <a href=\"http:\/\/pchero21.com\/?p=4044\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1054],"tags":[],"_links":{"self":[{"href":"http:\/\/pchero21.com\/index.php?rest_route=\/wp\/v2\/posts\/4044"}],"collection":[{"href":"http:\/\/pchero21.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/pchero21.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/pchero21.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/pchero21.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4044"}],"version-history":[{"count":4,"href":"http:\/\/pchero21.com\/index.php?rest_route=\/wp\/v2\/posts\/4044\/revisions"}],"predecessor-version":[{"id":4051,"href":"http:\/\/pchero21.com\/index.php?rest_route=\/wp\/v2\/posts\/4044\/revisions\/4051"}],"wp:attachment":[{"href":"http:\/\/pchero21.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4044"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/pchero21.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4044"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/pchero21.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4044"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}