Mayana Pereira

Data Scientist


Customer Security and Trust


I am a data scientist at Microsoft. I apply machine learning to problems at the intersection of natural language processing, security, and software engineering. I have applied machine learning techniques to a wide range of practical problems in network and software security, health care, and social networks. My work has resulted in deployed products, awards, patents, and peer-reviewed publications.

Selected Projects

Natural Language Processing Applied to Software Security.

Identifying security bug reports (SBRs) is a vital step in the software development life cycle. Supervised machine learning approaches usually assume that entire bug reports are available for training and that their labels are noise-free. In this project, we achieved accurate label prediction for SBRs by analyzing only the titles of bug reports, even in the presence of label noise.

A Word Graph Approach for Dictionary Detection and Extraction in DGA Domain Names.

Research on machine learning for network security. Developed and implemented a novel technique for detecting dictionary-based algorithmically generated domains (DDGA). The technique is currently deployed to detect DDGA domains in DNS traffic as a feature of the Infoblox DNS firewall. The solution (U.S. Patent Application No. 62/561,590) uses graph-based analytics and supervised machine learning. The research results appeared as a contributed talk at the MLCS workshop, co-located with NIPS 2018, and as a case study in a talk about graph-based analytics at RSA 2018. The full paper was published at RAID 2018 (International Symposium on Research in Attacks, Intrusions and Defenses).

Talks & Webinars

RSA 2020

DefendCon 2019

RSA Conference 2018

Webinar: Detecting Dictionary DGAs Using Machine Learning