ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-11192025-151036


Thesis type
Master's thesis (tesi di laurea magistrale)
Author
DI ROCCO, DAVIDE
URN
etd-11192025-151036
Title
RAN-SKIMT: A Framework for Bridging the Gap Between 5G Security Standards and Commercial Network Deployments
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
CYBERSECURITY
Supervisors
Supervisor: Prof. Garroppo, Rosario Giuseppe
Supervisor: Prof. Borgaonkar, Ravishankar
Keywords
  • 5G
  • framework
  • mobile networks
  • ran
  • RAN-SKIMT
  • reti mobili
  • security
  • sicurezza
  • standards
Defence session start date
5 December 2025
Availability
Not available for consultation
Release date
5 December 2028
Abstract
RAN-SKIMT is a lightweight tool built to answer a practical question that standards documents cannot: given a real 5G trace, what privacy and security protections did the network actually enforce, and how traceable was the subscriber in that session? The thesis starts from a simple observation: while 5G introduces strong mechanisms on paper—concealing permanent identities with SUCI, negotiating modern algorithms, protecting access-stratum signalling—the “last mile” of deployment choices and vendor defaults decides whether those mechanisms help users in practice. Manually reading NAS/RRC logs is slow and error-prone, and existing studies, though insightful, usually require significant expertise and do not provide a fast, repeatable way for operators or testers to audit their own networks. RAN-SKIMT fills that gap with an evidence-driven pipeline that ingests standard packet captures and emits an interpretable report with a 0–100 score, a privacy badge, and an attack-exposure overview tied directly to the packets that triggered each verdict.

The system is a Flask web application that accepts PCAP/PCAPNG files and asks the user to choose between Standalone (SA) and Non-Standalone (NSA) analysis, reflecting the different protocol stacks in play. Internally, SA traces are processed by NAS-5GS and NR RRC extractors; NSA traces use NAS-EPS and LTE RRC extractors and optionally reuse the NR RRC checks if NR control traffic is present. Extraction follows a string-driven strategy: field names from Wireshark are normalized and matched by exact names or safe prefixes, and values are taken "as is" from the decoder output to avoid unintended reinterpretation. Instead of reconstructing a full state machine, the extractor functions emit compact, auditable facts: which identifier appeared, which algorithms were selected, whether paging used temporary IDs, whether the IMEI surfaced before NAS protection, whether measurement reports preceded RRC security, and when UE capabilities were disclosed. These packet-local observations are accumulated into a structured results dictionary that becomes the single source of truth for both the scoring engine and the HTML report.
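The string-driven matching described above can be illustrated with a minimal sketch. It is not the thesis code: the field names, the normalization rule, and the helper names (`normalize`, `is_relevant`, `extract_facts`) are assumptions made for illustration, and the field names have not been checked against any particular Wireshark release.

```python
import re

# Illustrative field names only (assumptions, not verified Wireshark names).
EXACT_NAMES = {"nas_5gs_mm_type_id"}
SAFE_PREFIXES = ("nas_5gs_mm_suci_",)


def normalize(name: str) -> str:
    """Lower-case a decoder field name and collapse separators to '_'."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def is_relevant(name: str) -> bool:
    """Match by exact normalized name or by an allow-listed prefix."""
    n = normalize(name)
    return n in EXACT_NAMES or n.startswith(SAFE_PREFIXES)


def extract_facts(frame_no: int, fields: dict) -> list:
    """Emit packet-local facts; values are copied unchanged from the decoder."""
    return [
        {"frame": frame_no, "field": normalize(name), "value": value}
        for name, value in fields.items()
        if is_relevant(name)
    ]


decoded = {
    "NAS-5GS.mm.type_id": "1",
    "nas-5gs.mm.suci.scheme_id": "1",
    "frame.len": "120",
}
facts = extract_facts(7, decoded)
```

Each fact keeps its frame number, which is what would let a report link a verdict back to the packet that triggered it.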

Scoring translates those results into a composite 0–100 index designed for comparability across traces and networks, not as a formal conformance metric. In SA, the model weighs: SUCI usage and protection; IMEI confinement to protected NAS; paging posture (5G-S-TMSI vs. permanent IDs); advertised vs. selected NAS algorithms; NR RRC security selections; timing of UE capabilities relative to security activation; and TMSI dynamics (entropy and rotation frequency). NSA adapts the same philosophy to the LTE-anchored attach, prioritising integrity on the LTE NAS Security Mode Command, S-TMSI paging, EEA/EIA strength, measurement ordering on LTE (and NR when present), and M-TMSI entropy. Unknown evidence receives partial credit so that missing messages reduce confidence without collapsing the score. The privacy badge distils traceability into a three-level label—Not trackable, Potentially trackable, Trackable—accompanied by explicit reasons so readers can jump from the top-line verdict to the exact field and frame. The attack-exposure overview lists well-known scenarios (IMSI catching, paging without TMSI, IMEI catching, measurement reports before security, capability bidding-down, etc.) and, for each, states the observation and a mitigation aligned with operational practice.
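A weighted index with partial credit for unknown evidence could look like the sketch below. The abstract does not state the actual weights or the size of the partial credit, so the values here (weights summing to 100, credit of 0.5 for unknown) and the check names are hypothetical.

```python
# Credit per check status. The 0.5 for "unknown" is an assumed value.
CREDIT = {"pass": 1.0, "fail": 0.0, "unknown": 0.5}

# Hypothetical weights, chosen only so that the example sums to 100.
EXAMPLE_WEIGHTS = {
    "suci": 30.0,
    "imei": 20.0,
    "paging": 20.0,
    "nas_algorithms": 30.0,
}


def composite_score(statuses: dict, weights: dict) -> int:
    """Return a 0-100 index; checks with no recorded status count as unknown."""
    total = sum(weights.values())
    earned = sum(
        weight * CREDIT[statuses.get(name, "unknown")]
        for name, weight in weights.items()
    )
    return round(100 * earned / total)


# Paging was never observed in this hypothetical trace, so it earns half credit.
observed = {"suci": "pass", "imei": "fail", "nas_algorithms": "pass"}
score = composite_score(observed, EXAMPLE_WEIGHTS)  # 30 + 0 + 10 + 30 = 70
```

With this shape, a trace that contains no evidence at all lands at 50 rather than 0, which matches the stated intent that missing messages reduce confidence without collapsing the score.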

The report itself mirrors this structure: a hero card with the score, badge, and file metadata; matrices for NAS and RRC algorithms showing what the device supported and what the network actually enforced; identifier tables that list SUCI or GUTI values with frame numbers and a rotation overview that includes Shannon entropy over TMSI values (including suffix-only entropy to catch “randomize only the tail” patterns); paging and measurement status; and a score breakdown that exposes the weights behind the 0–100 index. The design principle is transparency: every warning and number is traceable to concrete packet evidence.
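The abstract does not say how the TMSI entropy is computed. One plausible reading, sketched here as an assumption, pools the hex characters of all observed TMSI values and computes Shannon entropy over that pool, once for the full values and once for the last few characters only. The function names and the `suffix_len` parameter are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(symbols: str) -> float:
    """Shannon entropy in bits per symbol of the empirical distribution."""
    if not symbols:
        return 0.0
    counts = Counter(symbols)
    n = len(symbols)
    return sum((c / n) * log2(n / c) for c in counts.values())


def tmsi_entropy(tmsis: list, suffix_len: int = 0) -> float:
    """Entropy over pooled hex characters (assumed method).

    suffix_len > 0 restricts each value to its last suffix_len characters.
    """
    pool = "".join(t[-suffix_len:] if suffix_len else t for t in tmsis)
    return shannon_entropy(pool.lower())


# Synthetic values with a fixed prefix and a varying tail.
values = ["aaaa0123", "aaaa4567"]
full = tmsi_entropy(values)                # 2.5 bits per hex character
tail = tmsi_entropy(values, suffix_len=4)  # 3.0 bits per hex character
```

Under this reading, a tail entropy clearly above the full-value entropy would suggest that only the trailing digits are being randomized, which is the pattern the report's suffix-only figure is meant to expose.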

Evaluation uses a UE-side capture approach in which the handset/modem decodes control-plane signalling and exports selected messages as PCAP/PCAPNG for Wireshark. Traces were collected with SCAT and Wireshark using a Samsung Galaxy S23 for NSA and a Quectel RM530-series modem for SA (controlled via picocom/AT commands). Three commercial SIMs from different operators were exercised with a stimulus protocol intended to produce complete, comparable captures. For NSA, each SIM underwent 5× device on/off, 5× airplane mode toggling, and 10× incoming/outgoing SMS and voice calls; for SA, airplane mode toggling was repeated 5× (voice/SMS over SA were not feasible on this testbed). This mix reliably triggers registration, paging, security mode procedures, identity exchanges, capability reporting, and, in some cases, measurement reports.

The observed SA patterns are encouraging but nuanced. SUCI consistently appeared with a non-null protection scheme, and paging, when present, used 5G-S-TMSI rather than a permanent identifier—both aligned with 5G guidance. Null NAS algorithms were not selected, even if devices advertised support for them. However, IMEI (PEI) often surfaced in NAS before integrity/ciphering was active, which increases traceability and enables rogue-BS probing, and TMSI values, while exhibiting good entropy, were sometimes reused for long periods, making long-term correlation easier than necessary. In practice, the most actionable SA findings are “keep IMEI inside protected NAS” and “rotate temporary identifiers more aggressively.”

NSA results, across three operators, showed recurring privacy weaknesses tied to the LTE anchor. IMEI frequently appeared in clear during NAS-EPS procedures. In some scenarios—e.g., airplane-mode cycles—LTE measurement reports were observed before RRC security completed, enabling fine-grained location inference from radio measurements. In SMS and voice exercises, one operator exhibited low M-TMSI entropy, signalling weak randomization and easier longitudinal correlation; for another operator, the combination of IMEI in clear, early measurements, and low entropy compounded risk. These findings echo the broader literature: SA can deliver stronger guarantees if operators avoid permissive defaults, whereas NSA inherits EPS constraints unless paging hygiene, integrity on security mode, and identifier dynamics are actively enforced.
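Findings such as "IMEI before NAS protection" or "measurement reports before RRC security" reduce to an ordering check over packet-local events. The sketch below is a simplified illustration under stated assumptions: the event labels are hypothetical, and the whole trace is treated as a single session, which would not hold for captures that contain several attach cycles.

```python
def check_ordering(events: list, sensitive_kind: str, security_kind: str) -> tuple:
    """Classify whether sensitive events precede the first security event.

    events is a list of (frame_number, kind) pairs. Returns (status, frames):
    "unknown" if the trace has no security event, "fail" with the offending
    frame numbers if sensitive events come first, otherwise "pass".
    """
    ordered = sorted(events)
    security_frames = [f for f, kind in ordered if kind == security_kind]
    if not security_frames:
        # No evidence of security activation: do not guess either way.
        return "unknown", []
    first_security = security_frames[0]
    early = [
        f for f, kind in ordered
        if kind == sensitive_kind and f < first_security
    ]
    return ("fail", early) if early else ("pass", [])


# Hypothetical event labels; frame 10 precedes security activation at frame 12.
trace = [
    (10, "measurement_report"),
    (12, "security_mode_complete"),
    (15, "measurement_report"),
]
status, frames = check_ordering(trace, "measurement_report", "security_mode_complete")
```

Returning the offending frame numbers alongside the status keeps the verdict tied to concrete packet evidence.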

The thesis is explicit about limitations. Results depend on trace completeness: if a PCAP lacks security mode or paging messages, the tool cannot invent evidence and marks the affected items as unknown (with partial credit to soften the impact). Reasoning is static and rule-based for transparency, not probabilistic or learned, so subtle downgrade patterns that require temporal inference across sessions may be missed. Scope centres on control-plane NAS/RRC for SA and NSA; user-plane encryption, IMS privacy, and full core signalling (NGAP/S1AP) are out of scope. Decoding relies on Wireshark, so vendor-specific IEs or version drift can hide or rename fields. Weights and thresholds (e.g., entropy cut-offs and algorithm tiers) are heuristic and meant for relative comparisons, not absolute certification. In addition, the analysis is single-trace, performance can degrade on very large captures, the system does not sniff live traffic, and operator policy context is not embedded, so final judgement remains with the analyst.

Finally, the thesis outlines a pragmatic roadmap. Replacing “parse-on-the-fly” with a unified decoding layer (e.g., pycrate or a custom intermediate model) would speed up large captures and unlock multi-interface inputs (NGAP/S1AP, Xn). Correlating radio and core logs would expose mismatches that span hops. Lightweight temporal reasoning and explicit gap detection would make uncertainty handling more principled. Comparative dashboards would let users rank traces and track fixes over time. Exposing weights/thresholds as configuration would tailor the score to operator priorities without touching code. Wrapping analyzers for near-real-time ingestion would support lab testing and continuous monitoring. A richer, annotated dataset would harden heuristics and serve as regression tests.
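As one illustration of the roadmap item on configurable weights, the sketch below merges user-supplied overrides into a set of defaults. It describes proposed future work, not existing behaviour: the JSON format, the default values, and the `load_weights` helper are all assumptions.

```python
import json

# Hypothetical defaults; the thesis does not publish its actual weights here.
DEFAULT_WEIGHTS = {
    "suci": 30.0,
    "imei": 20.0,
    "paging": 20.0,
    "nas_algorithms": 30.0,
}


def load_weights(config_text: str, defaults: dict = DEFAULT_WEIGHTS) -> dict:
    """Merge JSON overrides into the defaults, rejecting unknown or bad entries."""
    overrides = json.loads(config_text) if config_text.strip() else {}
    if not isinstance(overrides, dict):
        raise ValueError("weight configuration must be a JSON object")
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError(f"unknown checks in configuration: {sorted(unknown)}")
    weights = {**defaults, **{k: float(v) for k, v in overrides.items()}}
    if any(w < 0 for w in weights.values()) or sum(weights.values()) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    return weights


# An operator that cares more about IMEI exposure raises that weight.
custom = load_weights('{"imei": 40}')
```

Rejecting unknown keys means a typo in the configuration fails loudly instead of silently leaving a default in place.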
File