
Troubleshooting Improvements in P7FP - B


P7FP

Troubleshooting
Improvements
Edward Casey
DuFV
Contents

1. Selective UE Tracing in RBS in RNC
2. UeCtxt Event History Buffer
3. PLM Diagnostic Improvements

1/221 09-20/FCP 103 6503 Uen Rev B Ericsson Confidential 2 P7FP Troubleshooting Improvements 2009-04-29
Selective UE Tracing
in RBS in RNC
Introduction

 Selective tracing is implemented in both the RNC and the RBS
 Difficult to track UE movement to determine where to
enable tracing in the RBS
 Inefficient use of troubleshooters' time

Selective UE Trace in RBS in RNC
Solution
 Selective UE Trace in RBS in RNC allows propagation
of tracing information from RNC to RBS

How it works

 RNC determines whether a particular UE is selected for trace
(ueidtrace or uerandtrace)
 If the UE is selected for trace, RNC appends an optional
protocol extension “Ue-Selective-Tracing-Information” to
the relevant NBAP RL Setup / Addition / Reconfiguration
messages to indicate ueTracingStatus
 RBS listens for NBAP messages containing the
Ue-Selective-Tracing-Information block if the troubleshooter has set
the uetrace service type to NBAP
 RBS checks the NBAP messages for this optional
protocol extension and uses it as a trace indicator
 RBS enables the relevant tracing based on the propagated
information
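The flow above can be sketched as follows. This is only an illustration of the described behaviour, not the RNC or RBS implementation: the classes `TraceInfo` and `NbapMessage` and both functions are hypothetical names, and the message is reduced to the one field that matters here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceInfo:
    """Hypothetical stand-in for the Ue-Selective-Tracing-Information extension."""
    ue_tracing_status: str  # "active" or "inactive"
    label: str


@dataclass
class NbapMessage:
    """Hypothetical, heavily simplified NBAP RL Setup / Addition / Reconfiguration message."""
    procedure: str
    trace_info: Optional[TraceInfo] = None  # the optional protocol extension


def rnc_build_message(procedure: str, ue_selected: bool, label: str) -> NbapMessage:
    """RNC side: append the optional extension only for UEs selected for trace."""
    msg = NbapMessage(procedure)
    if ue_selected:
        msg.trace_info = TraceInfo("active", label)
    return msg


def rbs_should_trace(msg: NbapMessage, uetrace_type: Optional[str]) -> bool:
    """RBS side: act on the extension only if uetrace is on with service type nbap."""
    if uetrace_type != "nbap":
        return False
    return msg.trace_info is not None and msg.trace_info.ue_tracing_status == "active"
```

With this model, a message built for a selected UE triggers tracing in the RBS only while `uetrace on -type nbap` is in effect; messages without the extension never do.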

NBAP Protocol Extension

protocolExtensions
{
  {
    id 64535,
    criticality ignore,
    extensionValue Selective-UE-Tracing-Information :
    {
      uETracingStatus active,
      cRNC-ID 301,
      u-RNTI
      {
        rNC-ID 301,
        s-RNTI 16404
      },
      uEConnectionLabel
      {
        rNCModuleId 0,
        uEContextId 20,
        uESelectionLabel '4455465600'H
      }
    }
  }
}
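The uESelectionLabel value in the example above is a hex string. Assuming it carries the troubleshooter's label as null-terminated ASCII (an assumption based on the label description in this deck, not on the NBAP definition), it can be decoded with a few lines. The function name is hypothetical.

```python
def decode_selection_label(hex_string: str) -> str:
    """Decode a hex-encoded, null-terminated ASCII label."""
    raw = bytes.fromhex(hex_string)
    # Keep only the bytes before the first NUL terminator.
    return raw.split(b"\x00", 1)[0].decode("ascii")


label = decode_selection_label("4455465600")
print(label)  # prints DUFV
```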

Commands

 RNC
– ueidtrace -ue imsi <imsi> -label <string>
 Will enable selective tracing in RNC with label
– ueidtrace status
 Prints status of ue
– uerandtrace -cell <cell id> -label <string>
– uerandtrace status

 RBS
– uetrace on -type nbap
– uetrace status
– uetrace off

Label Option

 COLI command in RNC allows the troubleshooter to specify a
label, a 32-byte null-terminated ASCII string
 This label is also propagated to the RBS
 Label can be retrieved in the RBS using the uetrace
status command.
 Can be used to identify troubleshooter who is currently
tracing on this node
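A minimal sketch of the label constraint, assuming the 32 bytes include the terminating NUL (so at most 31 visible characters). The function name and the constant are hypothetical and only illustrate the check.

```python
LABEL_BUFFER_SIZE = 32  # bytes, assumed to include the terminating NUL


def encode_label(label: str) -> bytes:
    """Return the label as null-terminated ASCII, or raise if it does not fit."""
    data = label.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    if b"\x00" in data:
        raise ValueError("label must not contain NUL characters")
    if len(data) > LABEL_BUFFER_SIZE - 1:
        raise ValueError("label is longer than 31 characters")
    return data + b"\x00"
```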

uetrace status in RBS

Tracing Priority

 ueidtrace has higher priority than uerandtrace for
selective trace propagation in RNC

 For example:
1. ueidtrace is enabled on an IMSI with label X
2. uerandtrace is enabled on cell 30102 with label Y
3. A PS Interactive RAB is established in cell 30101
4. A leg is added in cell 30102
5. Label X is propagated in both cells, as ueidtrace has a
higher priority in RNC
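The priority rule can be sketched as a simple lookup order. This is an illustration under simplifying assumptions: the function and parameter names are hypothetical, the IMSI value is a placeholder, and uerandtrace is modelled as selecting every UE in the cell.

```python
from typing import Dict, Optional


def label_to_propagate(
    imsi: str,
    cell_id: int,
    ueid_traces: Dict[str, str],    # IMSI -> label set with ueidtrace
    uerand_traces: Dict[int, str],  # cell id -> label set with uerandtrace
) -> Optional[str]:
    """Return the label to propagate for a radio link in cell_id, if any."""
    if imsi in ueid_traces:  # ueidtrace wins regardless of the cell
        return ueid_traces[imsi]
    return uerand_traces.get(cell_id)


ueid = {"imsi-A": "X"}   # step 1 (placeholder IMSI)
uerand = {30102: "Y"}    # step 2
print(label_to_propagate("imsi-A", 30101, ueid, uerand))  # prints X
print(label_to_propagate("imsi-A", 30102, ueid, uerand))  # prints X
```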

References

 NDS: Selective UE tracing in RBS: 4/102 68-FCP 103 6502 Uen
 Selective UE Tracing in RBS in RNC: 29/190 24-10/FCP 103 6503 Uen
 Selective UE Tracing in RBS: 3/190 24-10/FCP 103 6505 Uen

UeCtxt
Event History Buffer
Introduction

 When a UEH_EXCEPTION occurs, it can be difficult, if not
impossible, to determine what events led to the exception
 Tracing on all UEs all the time is inefficient
 Difficult for the troubleshooter / designer to diagnose the
root cause of the problem

UeCtxt Event History Buffer
Solution
 Record simple history information, such as RAB
transitions and handover events, that will provide
troubleshooters with sufficient information to reproduce
the conditions that led to a UEH_EXCEPTION
 History will be printed in the Trace and Error log when
a call is released or a UEH_EXCEPTION occurs

How it works

 RNC is initialized with X number of event buffers based on
hardware
 When a call is established, an Event History Buffer is allocated to the
UE if one is available
 Events are created by gathering information from:
– trace4 UeCtxt, but with import and deport procedures combined
(i.e. RrcConn, RabHandling, SoHo…)
– Extra useful information is added where necessary
 If a UEH_EXCEPTION occurs that results in the call being released,
the current buffer is printed on bus_send UEH_EXCEPTION and
the buffer is cleared
 If the call is released normally, the buffer is printed on bus_send of
the UE_UEHISTORY and UEHISTORY trace objects

Buffer allocation and Encapsulation

 Event History Buffers are assigned to UE / UeCtxts on
a first-come-first-served basis
 Number of buffers available is hardware dependent
and controlled by an RLIB system parameter (command:
syspar get)
– GPB3 can allocate up to 100 buffers
– GPB4/5/6 can allocate up to 500 buffers

[Figure: Event Buffer layout. A Context Header is followed by a sequence of events; each event consists of an Event Header and Data.]
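First-come-first-served allocation from a fixed pool can be sketched as below. The capacities come from the bullets above; the class, its methods and the board-name lookup are hypothetical and do not reflect how the RNC reads the system parameter.

```python
# Maximum number of buffers per board type, as listed above.
MAX_BUFFERS = {"GPB3": 100, "GPB4": 500, "GPB5": 500, "GPB6": 500}


class EventBufferPool:
    """Hypothetical fixed-size pool handing out one buffer per UE context."""

    def __init__(self, board: str) -> None:
        self._capacity = MAX_BUFFERS[board]
        self._owners = set()  # ueRefs that currently hold a buffer

    def allocate(self, ue_ref: int) -> bool:
        """Give ue_ref a buffer if one is free; later UEs simply get none."""
        if ue_ref in self._owners:
            return True
        if len(self._owners) >= self._capacity:
            return False
        self._owners.add(ue_ref)
        return True

    def release(self, ue_ref: int) -> None:
        """Return the buffer to the pool when the call ends."""
        self._owners.discard(ue_ref)
```

A UE that finds the pool empty at call setup has no history recorded for that call in this model, which matches the "if available" wording in the description.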

Circular Buffer

 Circular buffer that can hold a maximum of 25 events
 Each event can contain up to 25 bytes of information
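A minimal sketch of such a circular buffer, using the two limits stated above. The class and method names are hypothetical; `flush` stands in for "print the buffer and clear it".

```python
from collections import deque
from typing import List

MAX_EVENTS = 25       # events kept per buffer
MAX_EVENT_BYTES = 25  # bytes per event


class EventHistoryBuffer:
    """Hypothetical per-UE history: the oldest event is dropped when full."""

    def __init__(self) -> None:
        self._events = deque(maxlen=MAX_EVENTS)

    def record(self, event: bytes) -> None:
        if len(event) > MAX_EVENT_BYTES:
            raise ValueError("event exceeds 25 bytes")
        self._events.append(event)  # deque discards the oldest entry itself

    def flush(self) -> List[bytes]:
        """Return the events oldest first and clear the buffer."""
        events = list(self._events)
        self._events.clear()
        return events
```

After 30 recorded events, only the latest 25 remain, so the history always shows the run-up to the release or exception rather than the start of the call.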

Context Header

Bit offset: 0 2 4 6 8 10 12 14 16
Row 1: Project | File Format Version | S
Row 2: Ue Context Ref
Row 3: Buffer Length
Row 4: Events

Event Header
Bit offset: 0 2 4 6 8 10 12 14 16
Row 1: Timestamp …
Row 2: … Timestamp
Row 3: Timestamp (milliseconds) | Event ID
Row 4: Source UeRc | Source GCP …
Row 5: … | Target UeRc | Target GCP …
Row 6: … | PC | Continuation | Exception Class | Exception Cause Value …
Row 7: … | Exception Extended Cause | E
Row 8: Extra Data

Tracing

 Buffer is output on the following MP trace objects:
– Selective tracing:
 bus_send UE_UEHISTORY
– Non-selective tracing:
 bus_send UEH_EXCEPTION
 bus_send UEHISTORY

 Buffer is traced on bus_send UEH_EXCEPTION when an exception
occurs (the buffer is then cleared)
 Buffer is traced on bus_send UEHISTORY and
UE_UEHISTORY when the call is released

Command: printuehistory

 The printuehistory command can be used to identify which
UEs have buffers associated and to print a buffer while the call
is in progress
 Usage Syntax
– printuehistory -all | -listUEs | -ueRef <number>
 Results from the -all and -ueRef arguments are traced in
the T&E log
 Results of -listUEs are printed in the console

Trace & Error Output
[2009-04-20 13:53:17.416] 001400/RncLmUePT(UE_UEHISTORY) ../src/UehEventHistoryBufferD.cpp:328 BUS
SEND:CONTEXT HISTORY BUFFER, ueRef = 0, events = 7, total length (bytes) = 181, event specification version
(project = 7, increment = 7)
0000 1C 0F 00 00 00 AF 49 EC 7C E4 55 C1 00 00 00 00 '......I.|.U.....'
0010 22 00 00 00 00 0E B2 A0 00 00 20 00 00 00 00 49 '"......... ....I'
0020 EC 7C E4 AC 02 02 00 01 00 22 00 00 00 00 00 00 '.|......."......'
0030 00 00 00 00 00 00 00 00 49 EC 7C E4 C7 8A 02 00 '........I.|.....'
0040 01 00 22 00 00 00 00 00 00 00 00 00 00 00 00 00 '..".............'
0050 00 49 EC 7C E5 00 1A 02 00 01 00 22 00 00 00 00 '.I.|......."....'
0060 00 00 00 00 00 00 00 00 00 00 49 EC 7C E5 00 96 '..........I.|...'
0070 02 00 02 10 22 00 00 0F 40 00 08 04 02 00 00 00 '...."...@.......'
0080 00 00 00 49 EC 7C EE AE 09 04 20 02 10 22 00 00 '...I.|.... .."..'
0090 00 00 00 00 00 00 00 00 00 00 00 00 49 EC 7E 4C '............I.~L'
00A0 B1 08 04 20 02 10 22 00 00 00 00 45 30 00 00 00 '... .."....E0...'
00B0 00 00 00 00 00 '.....'
DocNo: 34/1551-CRA 403 38/1, DocRev: PB5, Project: 7, FFV: 7, DATE: 24-Mar-09
context_history_header:
{
FORMAT_VERSION_PROJECT: 7
FORMAT_VERSION_INCREMENT: 7
IsServingUeRef: 1
UE_REF: 0
BUFFER_CONTENTS_LENGTH: 175

Note on FORMAT_VERSION_INCREMENT: Decoder.pl will not decode the event buffer if the format version increments do not match; a Decoder.pl error-based trace will be printed instead. May need to contact Martin Aldrin to update Decoder.pl.

Trace & Error Output…
event_header:
{
timestamp : Mon Apr 20 13:47:16 2009
timestampMilliSec : 343
EVENT_ID : 1, rrcConnSetup
SOURCE_UERC_ID : 0
SOURCE_GCP : 0
TARGET_UERC_ID : 0
TARGET_GCP : 0
procedureCompleted : 1
continuation : 1
exceptionClass : 0
exceptionCauseValue : 0
exceptionExtendedCause : 0
encodingErrorsEncountered : 0
event: rrcConnSetup
{
cellId : 30101
cellFroId : 0
propagationDelay : 0
standAloneSrbSelector : 1
}
}

Note on encodingErrorsEncountered: this flag is set if the RNC had a problem encoding the event. This can happen if, for example, the GCP field length changes without the XML format version increment being updated. Contact the person responsible for the UEH UeContext block.
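The hex dump and the decoded fields in the Trace & Error output can be cross-checked with a short script. The layout used here is inferred from this one sample only, not taken from the event specification: a 6-byte context header (6-bit project, 9-bit format version increment, 1-bit serving flag, 16-bit ueRef, 16-bit contents length), followed by fixed 25-byte events whose first six bytes hold a 32-bit timestamp in seconds, a 10-bit millisecond value and a 6-bit event ID. Decoder.pl remains the proper tool; all names below are hypothetical.

```python
import struct
from datetime import datetime, timezone

# Bytes copied from the hex dump in the Trace & Error output above.
HEX_DUMP = """
1C 0F 00 00 00 AF 49 EC 7C E4 55 C1 00 00 00 00
22 00 00 00 00 0E B2 A0 00 00 20 00 00 00 00 49
EC 7C E4 AC 02 02 00 01 00 22 00 00 00 00 00 00
00 00 00 00 00 00 00 00 49 EC 7C E4 C7 8A 02 00
01 00 22 00 00 00 00 00 00 00 00 00 00 00 00 00
00 49 EC 7C E5 00 1A 02 00 01 00 22 00 00 00 00
00 00 00 00 00 00 00 00 00 00 49 EC 7C E5 00 96
02 00 02 10 22 00 00 0F 40 00 08 04 02 00 00 00
00 00 00 49 EC 7C EE AE 09 04 20 02 10 22 00 00
00 00 00 00 00 00 00 00 00 00 00 00 49 EC 7E 4C
B1 08 04 20 02 10 22 00 00 00 00 45 30 00 00 00
00 00 00 00 00
"""

CONTEXT_HEADER_LEN = 6  # inferred: 181 total bytes - 175 content bytes
EVENT_LEN = 25          # inferred: 175 content bytes / 7 events


def decode_buffer(hex_dump: str):
    """Decode the context header and the leading fields of each event."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    version, ue_ref, contents_len = struct.unpack(">HHH", raw[:CONTEXT_HEADER_LEN])
    header = {
        "project": version >> 10,             # inferred: top 6 bits
        "increment": (version >> 1) & 0x1FF,  # inferred: next 9 bits
        "is_serving_ue_ref": version & 1,     # inferred: last bit
        "ue_ref": ue_ref,
        "contents_len": contents_len,
    }
    events = []
    end = CONTEXT_HEADER_LEN + contents_len
    for offset in range(CONTEXT_HEADER_LEN, end, EVENT_LEN):
        seconds, ms_and_id = struct.unpack(">IH", raw[offset:offset + 6])
        when = datetime.fromtimestamp(seconds, tz=timezone.utc)
        events.append((when, ms_and_id >> 6, ms_and_id & 0x3F))
    return header, events


header, events = decode_buffer(HEX_DUMP)
print(header)
for when, millis, event_id in events:
    print(when.strftime("%Y-%m-%d %H:%M:%S"), millis, event_id)
```

For the first event this yields 13:47:16 on 2009-04-20, 343 ms and event ID 1, which agrees with the decoded event_header shown above (rrcConnSetup).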

References

 Node FDI: 90/190 24-10/FCP 103 6503 Uen
 Event Specification: 34/1551-CRA 403 38/1 Uen

PLM
Diagnostic
Improvements
Introduction

 PLM, a diagnostic improvement project, was introduced
to improve troubleshooting techniques.
 TRs were written on various suggested improvements
to help diagnose various problems

 P7FP TR References
– WRNae30395 - Iu Signalling Overload Indication
– WRNae30389 - UE Collisions not properly detected as UeRegister
stores one Ue Identity
– WRNae22511 - UeRef should be contained in PM_OBS_IND
traces
– WRNae30390 - RNC should be adapted to allow tracing on IMSI
northbound
– WRNae31018 - Difficult to map UeCtxt's between DRNC and
SRNC

Iu Signalling Overload Indication

 Problem
– If there is an error / fault in the core network causing a major
traffic degradation, this may not be clearly indicated by an
alarm in the WRAN network
– It may take up to 60 minutes before the customer is alerted by PM
supervision
– Main issues seen in the case of signalling overload:
– No clear indication of Iu signalling traffic disturbance
– The only way to see that there was a problem was the default
trace on the SCCP MP pointing to the congestion

Iu Signalling Overload Indication…

 Solution
– New COLI command added for SCCP Link Congestion History
to print the history of SCCP congestion (overload) information for
a specific RANAP RO

ranap cong -froid <id> [-count <n>]

– Records: start time, duration and level of congestion are printed for
each Iu link signalling traffic disturbance
– The command with only "froid" specified prints all available congestion
history for that froId, up to a maximum of 1000 disturbances
– The command with both "froid" and "count" specified prints the history
of the n latest occurrences of congestion for the specific froId

==================== SCCP Link Congestion History (froId 1) =====================


The congestion history list for this instance is empty
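The behaviour of the history (bounded at 1000 records, optional "n latest" query) can be modelled as below. This is a sketch of the described semantics only; the class, field names and units are hypothetical and the real storage in the RNC is not documented here.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional

MAX_DISTURBANCES = 1000  # maximum history size stated above


@dataclass
class Disturbance:
    start_time: str    # hypothetical field names and units
    duration_s: float
    level: int


class CongestionHistory:
    """Hypothetical per-froId history of SCCP link congestion."""

    def __init__(self) -> None:
        self._records = deque(maxlen=MAX_DISTURBANCES)

    def add(self, record: Disturbance) -> None:
        self._records.append(record)  # the oldest record is dropped when full

    def latest(self, count: Optional[int] = None) -> List[Disturbance]:
        """All records when count is None, otherwise the count newest ones."""
        records = list(self._records)
        if count is None:
            return records
        return records[max(len(records) - count, 0):]
```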

UE Collisions not properly detected as
UeRegister stores one Ue Identity
 Problem
– Inability to properly detect UE "Collision" when the same
UE makes a subsequent access attempt before the RNC
has detected that the ongoing connection has been
released or dropped.
– Some UE collisions were not detected in the RNC but
were registered in the CN
– The root cause of the problem is that there is only one UE
Identity stored in RNH

UE Collisions not properly detected as
UeRegister stores one Ue Identity…
 Example:
– rrcConnectionRequest is received containing a TMSI
– RRC connection is established (registerServingUeCtxtReq sent to RNH containing
the TMSI received in the request)
– RANAP Common Id message is received containing the IMSI (registerImsiInd sent to
RNH containing the IMSI)
– The IMSI received in the registerImsiInd signal is used to overwrite the TMSI which
was stored at reception of the registerServingUeCtxtReq signal.
– If the UE drops abnormally, for whatever reason, and makes a new access attempt
(with TMSI) before the RNC has released the currently seized resources, then the
RNC should be able to detect this UE "Collision" and immediately release all
resources corresponding to the dropped connection.
– This collision detection capability has been designed into RNH and is reflected by the
following trace:

– 001400/RncLmUePT(UEH_EXCEPTION) ../src/UehUeCtxtC.cpp:27992
TRACE1:Exception Code 72; RRCConRel; UeRef = 2367; IMSI = UNDEF; TMSI =
10101caf; cellId = UNDEF; cellFroId = UNDEF; RLs in DRNC = 0; Best RL in DRNC:
No; causecode = 3; connType = uehPacket64Hs; Ue Identity Collision received,
running

UE Collisions not properly detected as
UeRegister stores one Ue Identity…
 However
– This capability does not work correctly, as many collision situations
are not detected by RNH. An example case is as follows:
– rrcConnectionRequest is received containing a TMSI
– RRC connection is established (registerServingUeCtxtReq sent to RNH containing
the TMSI received in the request)
– RANAP Common Id message is received containing the IMSI (registerImsiInd sent to
RNH containing the IMSI)
– Signalling Connection is established
– An EUL/HS Interactive Rab is established

– During an attempt to chSw from EUL/HS --> FACH, the UE drops (for example due to
an application fault or a coverage problem)
– The same UE makes a new attempt to establish an RRC Connection almost
immediately (before the old connection resources are cleaned up)
 rrcConnectionRequest is received containing the SAME TMSI
 RRC connection is established (registerServingUeCtxtReq sent to RNH
containing the TMSI received in the request) no collision is detected in
RNH
 A NAS Service Request is transmitted to CN, which detects the UE
Collision and sends an Iu Release Command (for the original connection)
with the cause "release-due-to-Ue-generated-signalling-connection-
release" - Ongoing connection proceeds as above .....

UE Collisions not properly detected as
UeRegister stores one Ue Identity…
 Problem
– RNH does not detect a UE Collision when it receives the
registerServingUeCtxtReq containing the TMSI for the
second attempt
– Since only one UE Identity is stored for each UeContext,
RNH has no way of knowing whether the second attempt
concerns a UE that is still registered, because the first
attempt has the IMSI registered and the second attempt uses
the TMSI.

UE Collisions not properly detected as
UeRegister stores one Ue Identity…
 Solution
– RNH now stores more than one UE Identity for the same
UE: TMSI, IMSI, IMEI. If a collision is detected, RNH releases
the call.

– KPIs may actually degrade with this fix, since all (RNC
detected) Identity collisions will now be properly counted as
abnormal releases, instead of RNC Identity collisions
counted as abnormal and CN collisions counted as normal
releases.
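Why storing several identities per UE context fixes the detection can be shown with a small model. This is a sketch only: the `UeRegister` class below is hypothetical and is not the RNH code, and the identity strings are illustrative values.

```python
from typing import Dict, Optional, Set


class UeRegister:
    """Hypothetical register keeping every known identity per UE context."""

    def __init__(self) -> None:
        self._identities: Dict[int, Set[str]] = {}

    def register(self, ue_ref: int, identity: str) -> Optional[int]:
        """Add identity to ue_ref; return the ueRef of a colliding context, if any."""
        collided = next(
            (ref for ref, ids in self._identities.items()
             if ref != ue_ref and identity in ids),
            None,
        )
        if collided is not None:
            self.release(collided)  # drop the stale connection's entry
        self._identities.setdefault(ue_ref, set()).add(identity)
        return collided

    def release(self, ue_ref: int) -> None:
        self._identities.pop(ue_ref, None)


reg = UeRegister()
reg.register(1, "TMSI:10101caf")         # first rrcConnectionRequest
reg.register(1, "IMSI:235915000002412")  # RANAP Common Id: TMSI is kept, not overwritten
print(reg.register(2, "TMSI:10101caf"))  # second attempt, same TMSI: prints 1
```

With a single stored identity, the IMSI would have replaced the TMSI at the second step, and the lookup for the second attempt would have found nothing.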

UeRef should be contained in
PM_OBS_IND traces
 Problem
– Investigating faults that result in small variations in KPI
performance
– Trace macro PM_OBS_IND is used in parallel with the coli
command Ue(Cell)_pm_counter_enable COUNTERNAME. These
traces and the UEH_EXCEPTIONs should indicate which
Exceptions are resulting in the KPI difference.
– On heavily loaded nodes the RLIB trace indicating the counter
increment only references the observedInstanceId (ie. cellFroId)
and not the UeRef
021200/RncLmUePT(PM_OBS_IND)
../src/RlibPmPegCounterD.cpp:40 TRACE6:Ue:
incrementCounter
ROAM_UTRANCELL_PMNOCELLFACHDISCONNECTABNORM
(counterId 4134, observedInstanceId 26, noOfSteps 1)

– Difficult to determine which exceptions resulted in the KPI
degradation

UeRef should be contained in
PM_OBS_IND traces
 Solution
– Add UeRef to PM_OBS_IND traces during stepping of
counters

RNC should be adapted to allow
tracing on IMSI northbound
 Problem
– Enabling traces on RANAP/SCCP processes is impossible on a live
node without losing traces due to overflow, especially when
monitoring the processes rnhRanapRouterC and
Scc_server_proc.

 Solution
– ueidtrace has been updated to enable and disable tracing on
RnhRanap
– Updated RNH interface to allow traces to be enabled and disabled

– [2009-01-19 09:49:34.626] [UE_ASN_RANAP]
../src/RnhRanapRouterC.cpp:7907 BUS SEND:[rncModId 0 ueRef
10] DirectTransferInd -> CN (connId=0 froId=2 modId=0 ueRef=10
iuSignId=20 CS cnId=-1 sccpState=CONN ueCtxtState=CONN)

Difficult to map UeCtxt's between
DRNC and SRNC
 Problem
– Inability to easily map a drift UeCtxt to a serving UeCtxt.
– The majority of hanging drift UeCtxt's did not contain IMSI
information, so it was impossible to map these back to
UeCtxts in the SRNC

Difficult to map UeCtxt's between
DRNC and SRNC…
 Solution
– Added S-RNTI to UEH_EXCEPTION traces in the SRNC. ueregprint in the
DRNC can be used to match s-Rnti with SRNC traces.

– 001400/RncLmUePT(UEH_EXCEPTION) ../src/UehUeCtxtC.cpp:29313
TRACE1:ExceptionCode = 563; TIMER rcsAllRlLostTimerId has expired;
InterfaceTimout (Internal); Ue; RRCConRel; Dropped Call Release;
UehUeCtxtC; UeRef = 1857; IMSI = 235915000002412; cellId = 14822;
cellFroId = 1; S-RNTI = 18241; connType = uehSpeech GCP:01000000;
targetConnType = uehSpeech GCP:01000000; non running state

– uep 090204-14:01:18 172.31.98.177 7.1g RNC_NODE_MODEL_K_9_81_COMPLETE stopfile=/tmp/13781
$ lhsh 001700 ueregprint all
rncModId ueCtxt (initial Ue identity) (imsi) rnti state (uraId drxCyc) (sRnc sRnti) (shortSrnti) age
6 12089 235915000002412 126777 con 148 18241 00:00:02
1 registered UEs found, (1 printed), 1 conn, 0 ura-pch. Conn. lic. limit 413028.
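Matching the two outputs by hand can be tedious on a busy node. The sketch below pulls the key = value fields out of the UEH_EXCEPTION trace shown above and checks whether the S-RNTI appears in a ueregprint row. The helper names are hypothetical, and the row check is a plain token match because the exact column layout of ueregprint is not assumed here.

```python
import re

# The SRNC trace from above, joined into one line.
TRACE = (
    "001400/RncLmUePT(UEH_EXCEPTION) ../src/UehUeCtxtC.cpp:29313 "
    "TRACE1:ExceptionCode = 563; TIMER rcsAllRlLostTimerId has expired; "
    "InterfaceTimout (Internal); Ue; RRCConRel; Dropped Call Release; "
    "UehUeCtxtC; UeRef = 1857; IMSI = 235915000002412; cellId = 14822; "
    "cellFroId = 1; S-RNTI = 18241; connType = uehSpeech GCP:01000000; "
    "targetConnType = uehSpeech GCP:01000000; non running state"
)

# The data row of the ueregprint output from the DRNC.
UEREG_ROW = "6 12089 235915000002412 126777 con 148 18241 00:00:02"


def parse_exception_fields(trace: str) -> dict:
    """Collect every 'name = value' pair of a UEH_EXCEPTION trace."""
    pairs = re.findall(r"([\w-]+) = ([^;]+)", trace)
    return {name: value.strip() for name, value in pairs}


def row_matches_srnti(fields: dict, ueregprint_row: str) -> bool:
    """True if the trace's S-RNTI occurs as a whole token in the row."""
    return fields.get("S-RNTI") in ueregprint_row.split()


fields = parse_exception_fields(TRACE)
print(fields["UeRef"], fields["S-RNTI"], row_matches_srnti(fields, UEREG_ROW))
```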

Difficult to map UeCtxt's between
DRNC and SRNC
 Improvement
– UehRnsap sends an SCCP Disconnect Request if it
receives an RNSAP message for an unknown ueRef and
SCCP connection Id pair
– Will help clear hanging connections in the DRNC

Questions?
Thank you for listening