I Simposio de Postgrado 2023. Ingeniería, ciencias e innovación

MODULE 03: Computing and Data Science

NOT ALL DGAS ARE BORN THE SAME - IMPROVING LEXICOGRAPHIC BASED DETECTION OF DGA DOMAINS THROUGH AI/ML

Lucas Torrealba-Aravena 1*, Pedro Casas 2, Javier Bustos-Jiménez 1, Germán Capdehourat 3, Mislav Findrik 4

1 NIC Chile Research Labs, University of Chile.
2 AIT Austrian Institute of Technology.
3 Plan Ceibal & Universidad de la República.
4 Cyan Security Group.
*Email: lucas@niclabs.cl

ABSTRACT

Timely identification of DNS queries to Domain Generation Algorithm (DGA) domains is critical to limiting the rapid propagation of malware and its potentially devastating impact, especially for thwarting the coordinated activity of widespread botnets. To detect DGA-generated domains quickly and accurately, we analyze lexicographic features derived exclusively from the domain name as observed in a DNS query. We introduce a reputation-based scoring system for domain names, based on the co-occurrence frequency of n-grams with respect to a whitelist of well-known benign domains. To further improve detection, we add two features computed directly from the domain name: its length and its Shannon entropy. These features are combined with machine learning models to perform the detection. Because the approach relies only on features derived from the domain name itself, it is computationally efficient and suitable for large-scale monitoring. Our evaluation covers 25 DGA domain families from an openly available dataset and shows the benefit of combining reputation scores with these two basic lexicographic features. The improved detection performance is obtained with a Random Forest model alone, which markedly outperforms current state-of-the-art approaches.
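The three features named in the abstract (an n-gram reputation score, the domain length, and the Shannon entropy) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact scoring formula, so the score below (the mean of log-scaled whitelist counts over character trigrams) is an assumption, and the function names and the choice n = 3 are hypothetical.

```python
import math
from collections import Counter


def ngrams(name, n=3):
    """Return the overlapping character n-grams of a domain name."""
    return [name[i:i + n] for i in range(len(name) - n + 1)]


def build_ngram_counts(whitelist, n=3):
    """Count how often each n-gram occurs across the benign whitelist."""
    counts = Counter()
    for domain in whitelist:
        counts.update(ngrams(domain.lower(), n))
    return counts


def reputation_score(name, counts, n=3):
    """Assumed scoring form: mean log-scaled whitelist count of the
    n-grams in `name`. Unseen n-grams contribute zero."""
    grams = ngrams(name.lower(), n)
    if not grams:
        return 0.0
    return sum(math.log1p(counts[g]) for g in grams) / len(grams)


def shannon_entropy(name):
    """Shannon entropy, in bits, of the character distribution of `name`."""
    if not name:
        return 0.0
    total = len(name)
    freq = Counter(name)
    return -sum((c / total) * math.log2(c / total) for c in freq.values())


def features(name, counts, n=3):
    """Feature vector: [reputation score, domain length, Shannon entropy]."""
    return [reputation_score(name, counts, n), len(name), shannon_entropy(name)]


if __name__ == "__main__":
    # Tiny illustrative whitelist; a real one would hold many benign domains.
    counts = build_ngram_counts(["google.com", "wikipedia.org", "github.com"])
    for domain in ["google.com", "xkqzvwpt.com"]:
        print(domain, features(domain, counts))
```

Vectors like these could then be passed to a Random Forest classifier, for example `RandomForestClassifier` from scikit-learn, trained on labeled benign and DGA domains. Because every feature is computed from the domain string alone, no additional DNS or network context is needed at query time.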
