
Data Science Practicum

The Data Science Practicum (DATA-793) is the capstone experience for the MS in Data Science and provides assistance to faculty and staff across the university. 

Students entering the practicum have completed coursework in statistics, regression, and R for data science, and are enrolled in or have completed statistical machine learning. They have learned to visualize, analyze, and model datasets and are ready to put their skills to use on live projects.

Call for Faculty & Staff Projects

Let our advanced students help with your data:

  1. Provide your project title, description, required skills, and email address on the Faculty Project form. The only requirement is that the project use the students' data science skills.
  2. Under the guidance of our faculty, students in the Data Science Practicum (DATA-793) or other advanced research courses review the available projects for the best fit and contact you.
  3. You and the student(s) agree on a plan of work for the project. Work can begin as early as January 2020.

Prediction of molecular properties using machine learning techniques

Because of its computational speed and accuracy relative to ab initio quantum chemistry and force-field modeling, the prediction of molecular properties using machine learning has received great attention in materials design and drug discovery. In this project, students will use a data fusion framework based on Independent Vector Analysis to exploit the complementary information contained in different molecular featurization methods. This information will then be used to improve the predictive ability of a regression model and to discover relationships between molecular structures and properties. Students do not need any background in chemistry.
Prerequisites: Regression, Machine Learning, knowledge of R, Python, or Matlab.
Contact Dr. Zois Boukouvalas, boukouva@american.edu.

Data collection, pre-processing, and visualization for understanding the spread of misinformation in social media

Because online media are so widely used, false information can spread rapidly, affecting decision making, cooperation, communications, and markets. Modern social technologies can disseminate massive amounts of information quickly, which enables the spread of misinformation (inaccurate or misleading information). A crucial question is therefore how true and false information diffuse and how they correlate with each other. In this project, students are expected to collect and pre-process data from social media, news sites, and RSS feeds and to apply different data visualization techniques to identify how false information diffuses and how it correlates with true information.
Prerequisites: Regression, Machine Learning, knowledge of R, Python, or Matlab.
Contact Dr. Zois Boukouvalas, boukouva@american.edu.


Extracting chemical insights from energetic materials using Natural Language Processing (NLP) techniques

The number of scientific journal articles and reports being published about energetic materials every year is growing exponentially, and therefore extracting relevant information and actionable insights from the latest research is becoming a considerable challenge. In this project, students will explore how techniques from natural language processing and machine learning can be used to automatically extract chemical insights from large collections of documents. Students do not need to have any background in chemistry.
Prerequisites: Regression, Machine Learning, knowledge of R, Python, or Matlab.
Contact Dr. Zois Boukouvalas, boukouva@american.edu.

Knowledge discovery and detection of misinformation on social media during high impact events

With the evolution of social media technologies, there has been a fundamental change in how information propagates and is shared on the Internet and microblogs. During a high impact event, such as a hurricane, terror attack, or stock market crash, social media users can be thought of as generative functions that output network posts. These posts then propagate through the social network, enabling the rapid spread of misinformation, which can affect decision making, communications, and markets. Students on this project will use a data-driven approach based on latent variable analysis to extract information from data, so that early detection of misinformation and knowledge discovery during a high impact event are achieved jointly.
Prerequisites: Regression, Machine Learning, knowledge of R, Python, or Matlab.
Contact Dr. Zois Boukouvalas, boukouva@american.edu.