

Master of Science in Data Science
Dina Dell'Aringa
Administrative Assistant


Third Semester

This course introduces students to data warehousing architectures and big data processing pipelines. These data management systems often hold the source data needed for analytics. The class will provide an overview of conventional data warehousing architectures, but will primarily focus on introducing students to "big data" processing pipeline technologies that enable the management of both SQL and NoSQL data. Students will learn how to design systems to manage large volumes of poly-structured data, including temporal, spatial, spatio-temporal, and multidimensional data.

This course introduces students to in-memory analytic techniques as an alternative to traditional warehouse approaches. With the declining cost of memory, fast in-memory analytics is becoming feasible for many businesses. The class will provide an overview of the benefits of in-memory analytics, with a focus on cloud computing and cluster computing architectures and their associated modern toolsets. Students will also be introduced to cloud-based architectures and modern cloud-based analytic platforms and services. Students will learn how to design in-memory systems for iterative graph processing, complex multi-stage applications, and fault-tolerant solutions, and how to use modern cloud-based analytic platform services. Prerequisites: Big Data Engineering 1

This course builds on the introductory course Machine Learning I by exposing students to more supervised learning techniques, such as affinity analysis, to ensemble methods for combining techniques, and to unsupervised learning methods. Unsupervised learning is a class of machine learning for uncovering patterns and relationships in data without labeling the data or establishing a preconceived set of classes or results. Students will learn through hands-on programming projects. This class will examine the benefits and drawbacks of unsupervised learning methods. Prerequisites: Machine Learning 1

This course is a culmination of the second semester in the MSc Analytics program. It provides an experiential learning opportunity that ties together statistical, computational analytics, and database concepts in a series of case studies in the finance, manufacturing, telecommunications, and retail sectors. Students will examine four separate case studies of the use of data analytics. Students will work in teams to dissect these case studies and evaluate the business opportunity, the analysis methodology, the raw data, the data preparation and feature engineering, and the analytical outcomes. Students will present their evaluation and make recommendations for improvements in the analysis and related opportunities.

This course consists of a set of weekly presentations and discussions around key analytic issues and current case studies. These hot topics will be presented by a combination of guest speakers (industry luminaries in the area of analytics) and University of the Pacific faculty members, including the MS analytics program director. Many of these topics will be drawn from relevant real-world contemporary analytic stories that reinforce specific elements of the academic content being taught, so the specific topics cannot be predicted in advance.


This course introduces the use of analytics to detect fraud in a variety of contexts. The Association of Certified Fraud Examiners' 2012 report estimates that the typical company loses 5% of annual revenue to fraud. This class shows how to use machine learning techniques to detect fraudulent patterns in historical data, and how to predict future occurrences of fraud. Students will learn how to use supervised learning, unsupervised learning, and social network learning for these types of analyses. Students will be introduced to these techniques in the domains of credit card fraud, healthcare fraud, insurance fraud, employee fraud, telecommunications fraud, web click fraud, and others. The course is experiential and will apply the concepts taught in prior data wrangling and machine learning courses using real-world data sets and fraud scenarios. Prerequisites: Machine Learning 1

This course introduces the techniques, algorithms, and uses for recommender systems: systems that recommend products, information, and actions based on an analysis of the individual behavior of the user. Recommender systems are used by merchants like Netflix, eBay, and many others. Students in this class will learn how to design and develop effective recommender systems, and how to recognize when a recommender system offers a suitable solution. Prerequisites: Machine Learning 1
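
As an illustration only, one common family of recommender techniques is user-based collaborative filtering. The sketch below is not taken from the course materials; the ratings, user names, and the functions `cosine` and `recommend` are all hypothetical. It ranks items a user has not yet rated by the ratings of other users, weighted by how similar those users are.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}. Invented for illustration only.
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "D": 4},
    "cat": {"C": 5, "D": 2, "E": 4},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    shared = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(user, ratings, top_n=2):
    """Rank items the user has not rated by similarity-weighted ratings of others."""
    scores = {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their)
        for item, rating in their.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("ann", ratings))
```

With this toy data, "bob" is far more similar to "ann" than "cat" is, so the item that "bob" rated highly comes first in the ranking.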

This course introduces the essential elements of text mining, or the extension of standard predictive methods to unstructured text. The class will explore the use of text mining in domains such as digital security, bioinformatics, law, marketing, and social media. Students will be exposed to information retrieval, lexical analysis, pattern recognition, metadata tagging, and natural language processing (NLP). A large portion of this class is devoted to the data preparation and wrangling methods needed to transform unstructured text into a suitable structure for analysis. Prerequisites: Machine Learning 1
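
To make the idea of transforming unstructured text into a structure suitable for analysis concrete, here is a minimal bag-of-words sketch. It assumes a very simple tokenizer (lowercase, alphabetic tokens only); the function names `tokenize` and `term_matrix` and the sample documents are hypothetical, and real pipelines typically add steps such as stop-word removal and stemming.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_matrix(docs):
    """Return (vocabulary, rows) where rows[i][j] counts vocabulary[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts)) if counts else []
    rows = [[c[term] for term in vocab] for c in counts]
    return vocab, rows


# Hypothetical documents, invented for illustration only.
docs = ["The contract was signed.", "Signed, sealed, and the contract delivered!"]
vocab, rows = term_matrix(docs)
print(vocab)
for row in rows:
    print(row)
```

Each document becomes a fixed-length row of term counts, which is a form that standard predictive methods can consume.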

This course introduces the techniques used to analyze consumer shopping and buying behavior using transactional data in industries like retail, grocery, e-commerce, and others. Students will learn how to conduct item affinity (market basket) analysis, trip classification analysis, RFM (recency, frequency, monetary) analysis, churn analysis, and others. This class will teach students how to prepare data for these types of analyses, as well as how to use machine learning and statistical methods to build the models. The class is an experiential learning opportunity that utilizes real-world data sets and scenarios. Prerequisites: Machine Learning 1
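
As a rough sketch of the RFM analysis mentioned above, the following computes recency (days since the last purchase), frequency (number of purchases), and monetary value (total spend) per customer. The transaction records, the `rfm` function, and the `as_of` reference date are hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2024, 1, 5), 40.0),
    ("c1", date(2024, 3, 1), 60.0),
    ("c2", date(2023, 11, 20), 15.0),
]


def rfm(transactions, as_of):
    """Return {customer: (recency_days, frequency, monetary_total)}."""
    summary = {}
    for customer, day, amount in transactions:
        # Default to this purchase's date with zero count and zero spend.
        last, freq, total = summary.get(customer, (day, 0, 0.0))
        summary[customer] = (max(last, day), freq + 1, total + amount)
    return {
        customer: ((as_of - last).days, freq, total)
        for customer, (last, freq, total) in summary.items()
    }


table = rfm(transactions, as_of=date(2024, 3, 31))
for customer, values in table.items():
    print(customer, values)
```

In practice the three values are often binned into scores before customers are segmented; that step is omitted here.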

This course introduces the theory and application of statistical methods for the analysis of data that have been observed over time. Students will learn techniques for working with time series data and how to account for the correlation that may exist between measurements that are separated by time. The class will concentrate on both univariate and multivariate time series analysis, with a balance between theory and applications. Students will complete a time series analysis project using a real-world scenario and data set. Prerequisites: Machine Learning 1
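
The correlation between measurements separated by time can be illustrated with the sample autocorrelation at a given lag. The sketch below is a minimal plain-Python version using the usual estimator (lagged covariance divided by the overall sum of squared deviations); the function name `autocorrelation` and the two example series are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        return 0.0  # A constant series has no variation to correlate.
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance / variance


# Hypothetical series, invented for illustration only.
trend = [1, 2, 3, 4, 5, 6, 7, 8]
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
print(autocorrelation(trend, 1))
print(autocorrelation(alternating, 1))
```

A steadily trending series yields a positive lag-1 value, while a series that flips sign at every step yields a negative one.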

This course introduces the algorithms and methods used to analyze the subjective opinions and sentiments of the author of a free text document such as a tweet, blog post, or article. The class will examine the applications of this type of analysis as well as its benefits and limitations. Sentiment analysis is closely tied to text mining and uses techniques such as natural language processing, text analysis, and computational linguistics for feature extraction and preprocessing of the data. Students will explore the current state of usage of sentiment analysis, as well as future implications and opportunities. Prerequisites: Machine Learning 1

This course builds upon Introduction to Data Visualization by introducing students to techniques for combining traditional storytelling with data visualization to create compelling ways to communicate analytical findings to laypersons and business stakeholders. Students will learn traditional storytelling structures and how to overlay these structures on the visual presentation of data and analytical models. This experiential course is centered around one or more team projects using business scenarios and data sets. Prerequisites: Introduction to Data Visualization