Research

A.   Cyber Threat Intelligence - Cyber threat intelligence is extracted from heterogeneous sources and comes in both structured and unstructured formats. Security analysts, despite strong expertise, cannot always distinguish trustworthy threat vectors and intelligence from the rest, and researchers and analysts therefore expend considerable time on downstream triage. My research group is doing fundamental work in Artificial Intelligence that targets the problem of automatically assigning trustworthiness to threat intelligence by addressing the following specific aims:
  • Aim 1: Assigning trust and confidence scores to attack-related knowledge at varying levels of granularity, based on provenance and deep probabilistic graphical modeling approaches.
  • Aim 2: Creating deep learning models that predict malware behavior, help investigate past and future attacks, and fill gaps or identify anomalies in collected information through dynamic graph anomaly detection, link prediction, and other novel approaches.
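To give a flavor of the link-prediction task in Aim 2, the sketch below scores missing edges in a toy threat-knowledge graph by counting common neighbors, a classic link-prediction baseline. This is a minimal illustration only; the entity names are hypothetical, and the group's actual models are deep-learning based rather than this heuristic.

```python
from itertools import combinations

def common_neighbor_scores(edges):
    """Score each non-adjacent node pair by its number of shared
    neighbors -- a classic link-prediction baseline."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:  # only score edges missing from the graph
            scores[(u, v)] = len(adj[u] & adj[v])
    return scores

# Hypothetical threat-knowledge graph linking malware to indicators.
edges = [("malwareX", "ip1"), ("malwareX", "domain1"), ("malwareX", "ip2"),
         ("malwareY", "ip1"), ("malwareY", "domain1"), ("malwareY", "ip2")]
scores = common_neighbor_scores(edges)
best = max(scores, key=scores.get)
# malwareX and malwareY share three indicators, so a link between
# them (e.g., a shared campaign) is the top prediction.
```

A scored candidate edge like this can then be handed to an analyst for triage, or used to fill gaps in collected intelligence.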
Role - Principal Investigator, Funding Agency (2019-2021) - IBM, Dollar Amount - $350,000
Investigators:
Dr. Mohammed Zaki (RPI), Dr. Alex Gittens (RPI), Dr. Charu Aggarwal (IBM PI).
Ph.D. Student - Sharmishtha Dutta.
Undergrads - Destin Lee, Qicheng Ma, Chuqiao Gu (Qiao), Sean Hale


B.   Autonomous Vehicles (S&P) - Autonomous vehicles (and existing human-driven vehicles) contain sensors that collect data about the vehicle’s operation and its surroundings. For example, sensors in a self-driving car include cameras, radar, thermal imaging devices, and light detection and ranging (LIDAR) devices that collect data about the environment outside the vehicle. This data helps an autonomous vehicle identify the objects it encounters, make predictions about its environment, and act on those predictions. As data privacy and security represent growing critical concerns, federated learning (FL) enables digital devices to collaboratively learn a shared prediction model while keeping all the training data on the device, decoupling the ability to do machine learning from the need to store the data in the cloud. However, FL is not a magic bullet for privacy issues: even holding an “anonymized” data set in the cloud can still put users’ privacy at risk via linkage to other data sets. The research agenda is to address such privacy concerns when training machine learning models.
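The FL training loop described above can be sketched as follows: each client ("vehicle") takes a gradient step on its private data, and the server only ever sees model weights, which it combines by weighted averaging (the FedAvg scheme). This is a minimal single-parameter sketch under simplified assumptions (a 1-D linear model, synthetic data), not the project's actual system.

```python
def local_update(w, data, lr=0.1):
    """One local training step for a 1-D linear model y = w*x using the
    mean-squared-error gradient. Raw data never leaves the client."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def federated_average(client_weights, client_sizes):
    """Server-side aggregation: average client models, weighted by how
    many samples each client holds (FedAvg)."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total

# Two "vehicles", each holding private samples drawn from y = 2x.
clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w = 0.0
for _ in range(50):  # communication rounds
    local = [local_update(w, d) for d in clients]
    w = federated_average(local, [len(d) for d in clients])
# w converges toward the true slope 2.0 without pooling any raw data
```

Note that even though only weights are shared, gradients and weights can still leak information about the training data, which is exactly the kind of residual privacy risk this research agenda targets.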
Role - Principal Investigator, Funding Agency (2022) - Toyota Infotech, Dollar Amount - $70,000
Collaborators - Dr. Michael Clifford (Toyota Infotech Lab, CA), Dr. Sara Sampazzi (University of Florida), Dr. Matt Bishop (UC Davis), Dr. Karl Levitt (UC Davis), Dr. Miriam Heller.


C.   Healthcare Analytics (HEALS) - The primary goal of the HEALS (Health Empowerment by Analytics, Learning, and Semantics) project is to apply advanced cognitive computing capabilities to help people understand and improve their own health conditions. In particular, we are exploring areas including personalized and mobile medical care, improved healthcare analytics, and new data-based approaches to driving down the cost of medical care. The HEALS project is a joint IBM-RPI effort with close collaboration and transition.
I'm also interested in the following challenges tied to healthcare data: (1) data resides in different locations (e.g., hospitals, physicians’ offices, home-based devices, patients’ smartphones); (2) there is a growing availability of data, which makes scalable frameworks important; and (3) aggregating data in a single database is often infeasible or undesirable due to scale and/or data privacy concerns.
Role - Researcher
Lead Investigator - Dr. Mohammed Zaki (RPI), Principal Investigator (IBM) - Dr. Ching-Hua Chen, Researchers (RPI) - Dr. Oshani Seneviratne, Dr. Dan Gruen.
Undergrads - Ruisi Jian. Alumni - Megan Goulet, Lydia Zhou, Aaron Hill.


D.   Exploratory Research -
Privacy and Security of User Platforms - This exploratory project examines different user platforms, such as chat applications and cloud platforms, and evaluates user data privacy concerns. Some of the applications I've evaluated include WhatsApp, Covid-19 apps, and cloud platforms.
Insider Threat - Insider threat is one of the most pernicious threat vectors facing organizations across the world, due to the elevated level of trust and access that an insider is afforded. This type of threat can stem from both malicious and negligent users. In this research, we propose a novel approach that uses system logs to detect insider behavior with deep learning models. System logs are modeled as natural language sequences, and patterns are extracted from these sequences. We create workflows of action sequences that follow a natural language logic and control flow.
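The "logs as language" idea can be illustrated with a much simpler stand-in for the deep learning models: learn which consecutive event pairs (bigrams) occur in normal sessions, then flag sessions dominated by never-seen transitions. The event names below are invented for illustration; the actual research uses learned sequence models, not raw bigram counts.

```python
from collections import Counter

def train_bigram_counts(sessions):
    """Count which consecutive event pairs (bigrams) occur in normal
    user sessions, treating logs as a 'language' of actions."""
    counts = Counter()
    for events in sessions:
        counts.update(zip(events, events[1:]))
    return counts

def anomaly_score(counts, events):
    """Fraction of a session's bigrams never seen during training;
    a high score suggests an abnormal workflow."""
    bigrams = list(zip(events, events[1:]))
    unseen = sum(1 for b in bigrams if counts[b] == 0)
    return unseen / len(bigrams)

# Hypothetical benign sessions used as training data.
normal = [["login", "open_file", "edit", "save", "logout"],
          ["login", "open_file", "save", "logout"]]
counts = train_bigram_counts(normal)

benign = anomaly_score(counts, ["login", "open_file", "edit", "save", "logout"])
suspect = anomaly_score(counts, ["login", "dump_db", "exfiltrate", "logout"])
# benign scores 0.0 (all transitions seen); suspect scores 1.0
```

A sequence model such as an LSTM plays the same role as the bigram table here, but can capture longer-range workflow structure and assign graded probabilities rather than seen/unseen counts.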

Role - Researcher
Collaborators - Dr. Kristine Gloria (Aspen).
Undergrads - Qicheng Ma, Daniel Steven.


E.   Ph.D. Thesis - Threat and attack detection in large networks by identifying systemic anomalous behavior. Identified anomalous data from a set of “important” nodes (instead of the entire system) by leveraging graph analytics and machine learning models. Used simulated (NS2) and real attack data (the Conficker dataset) to test various hypotheses about systemic cyber attacks.
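One simple way to pick such a set of "important" nodes is by graph centrality; the sketch below uses degree centrality as an illustrative stand-in, with a made-up toy network. The thesis's actual node-selection criteria may differ.

```python
def degree_centrality(edges):
    """Count each node's degree in an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def important_nodes(edges, k):
    """Select the k highest-degree nodes to monitor for anomalies,
    instead of instrumenting every node in the network."""
    deg = degree_centrality(edges)
    return sorted(deg, key=lambda n: (-deg[n], n))[:k]

# Toy network: node A is a hub, so monitoring it covers most traffic.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("E", "A")]
top = important_nodes(edges, 2)  # ["A", "B"]
```

Restricting anomaly detection to high-centrality nodes keeps monitoring tractable on large networks while still observing most systemic behavior.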
Advisor - Dr. Jim A. Hendler