Quiz 1

These are centralized data containers in a purpose-built space that support business intelligence and
reporting but restrict robust analyses.
Data marts
Data warehouses
Analytic Sandbox
None of the Above
Which of the following are problems encountered in traditional data architecture?
High-value data is hard to reach and leverage, and predictive analytics and data mining activities
are last in line for data.
Data scientists are limited to performing in-memory analytics which will restrict the size of the
datasets they can use.
Data Science projects will remain isolated and ad hoc, rather than centrally managed.
All of the Above

Which of the following is always TRUE about Big Data?

I. Due to its size or structure, Big Data cannot be efficiently analyzed using only traditional databases
or methods.
II. Although the variety of Big Data tends to attract the most attention, generally the volume and
velocity of the data provide a more apt definition of Big Data.
I only

II only

both I and II

neither I nor II

Which of the following is TRUE about the differences between Business Intelligence (BI) and Data Science?
I. Where Data Science problems tend to require highly structured data organized in rows and columns
for accurate reporting, BI projects tend to use many types of data sources, including large or
unconventional datasets.
II. Data Science tends to be more exploratory in nature and may use scenario optimization to deal with
more open-ended questions.
I only

II only

both I and II

neither I nor II
Among the business drivers that push businesses to become more analytical and data driven, this one
involves customer churn, fraud and default
Optimize Business Operations
Identify Business Risk
Predict New Business Opportunities
Comply with Regulatory Requirements
Which of the following is true about the current analytical architecture?
I. Data sources are first loaded into the data warehouse where data needs to be well understood,
structured, and normalized with the appropriate data type definitions. This kind of centralization enables
security, backup, and failover of highly critical data.
II. Once in the data warehouse, data is read by additional applications across the enterprise for BI and
reporting purposes. These are high-priority operational processes getting critical data feeds from the
data warehouses and repositories.
I only

II only

both I and II

neither I nor II

Which of these attributes stand out as defining Big Data characteristics?

Huge volume of data
Complexity of data types and structures
Speed of new data creation and growth
All of the Above
This type of data has no inherent structure, which may include text documents, PDFs, images, and video.
Quasi-structured Data
Unstructured Data
Semi-structured Data
Structured Data

Quiz 2
Examples that fall under this group include financial analysts, market research analysts, life scientists,
operations managers, and business and functional managers.
Data Savvy Professionals
Deep Analytical Talent
Technology and Data Enablers
None of the Above
Which of the following describes the decade beyond 2010 with regard to Big Data?
I. In this era, everyone and everything is leaving a digital footprint.
II. Data volumes in this decade are measured in terms of petabytes.
I only

II only

both I and II

neither I nor II

The following are recurring sets of activities that data scientists perform EXCEPT
Reframe business challenges as analytics challenges.
Design, implement, and deploy statistical models and data mining techniques on Big Data.
Provide technical expertise to support analytical projects such as provisioning and administrating
analytical sandboxes.
Develop insights that lead to actionable recommendations.
Which of the following groups of players in the data value chain makes sense of the data collected from
various entities?
Data Devices
Data Collectors
Data Aggregators
Data Users and Buyers
The data now is said to come from many sources including
Photos and video footage uploaded to the World Wide Web
Nontraditional IT devices, including the use of radio-frequency identification (RFID) readers, GPS
navigation systems, and seismic processing
Medical information, such as genomic sequencing and diagnostic imaging
All of the Above

Which of the following key roles in the new big data ecosystem has members who possess a combination
of skills to handle raw, unstructured data and to apply complex analytical techniques at massive scales?
Data Savvy Professionals
Deep Analytical Talent
Technology and Data Enablers
None of the Above

The following are the skillsets and behavioral characteristics a data scientist must possess EXCEPT
Qualitative skill
Curious and creative
Skeptical mindset and critical thinking
Communicative and collaborative

Quiz 3

This refers to the process of cleaning data, normalizing datasets, and performing transformations on the
Data Preparation
Data Transformation
Data Conditioning
Data Visualizing
In this phase of the data analytics life cycle, the team assesses the resources available to support the
project in terms of people, technology, time, and data.
Data Preparation
Model Building
Model Planning
The following activities are part of the discovery phase EXCEPT
The team determines how much business or domain knowledge the data scientist needs to
develop models.
The team catalogs the data sources that it has access to and identifies additional data
sources that it can leverage.
The team identifies the main objectives of the project, identifies what needs to be achieved in
business terms, and identifies what needs to be done to meet the needs.
The team identifies the key stakeholders and their interests in the project.
Which of the following describes the key role of the Data Engineer?
provides access to key databases or tables and ensures the appropriate security levels are in place
related to the data repositories.
executes the actual data extractions and performs substantial data manipulation to facilitate the
provides subject matter expertise for analytical techniques, data modeling, and applying valid
analytical techniques to given business problems.
gives business domain expertise based on a deep understanding of the data, key performance
indicators (KPIs), key metrics, and business intelligence from a reporting perspective.
Which of the following activities is NOT involved in identifying potential data sources?
Capture aggregate data sources
Evaluate the data structures and tools needed
Perform extract, transform, load processes to data
Scope the sort of data infrastructure needed
In this phase of the data analytics life cycle, the team delivers final reports, briefings, code, and technical
Model Building

Model Planning

Communicate Results


Which of the following is TRUE about the data analytics life cycle?

I. A common mistake made in data science projects is rushing into data collection and analysis, which
precludes spending sufficient time to plan and scope the amount of work involved, understanding
requirements, or even framing the business problem properly.
II. Having a good data analytics process ensures a comprehensive and repeatable method for
conducting analysis and helps focus time and energy.

I only

II only

both I and II

neither I nor II

The following are part of the data preparation phase EXCEPT

Performing ETLT
Survey and Visualize
Developing Initial Hypothesis
Preparing the Analytic Sandbox
Which of the following key questions are helpful to ask during the discovery phase when interviewing
the project sponsor?
What is the desired outcome of the project?
What data sources are available?
What industry issues may impact the analysis?
All of the Above
Which of the following persons provides the funding and gauges the degree of value from the final
outputs of the working team in a data analytics project?
Project Manager
Project Sponsor
Business Intelligence Analyst
Business User

Quiz 4

Which of the following is TRUE about model building?

I. The phases of model planning and model building can overlap quite a bit, and in practice one can
iterate back and forth between the two phases for a while before settling on a final model.
II. Although the modeling techniques and logic required to develop models can be highly complex, the
actual duration of this phase can be short compared to the time spent preparing the data and defining
the approaches.
I only

II only

both I and II

neither I nor II

Which of the following are free or open source tools available for data analytics practitioners?
SAS Enterprise Miner
SPSS Modeler
Alpine Miner
Which of the following is a deliverable under the operationalize phase?
Presentation for project sponsors
Presentation for analysts
Technical specifications of implementing the code
All of the Above
The following activities are involved under the model planning phase EXCEPT
Assess the structure of the datasets.
Ensure that the analytical techniques enable the team to meet the business objectives and accept
or reject the working hypotheses.
Evaluate whether similar, existing approaches are available or if the team will need to create
something new.
Assess the validity of the model and its results.
Which of the following is TRUE about model planning?
I. Under this phase, the team develops datasets for training, testing, and production purposes.
II. Data Exploration, Variable and Model selection characterize this phase.
I only

II only
both I and II

neither I nor II

Which of the following is TRUE about the final phase of the data analytics life cycle?
I. In the final phase, the team communicates the benefits of the project more broadly and sets up a
pilot project to deploy the work in a controlled way before broadening the work to a full enterprise or
ecosystem of users.
II. Under this phase, the team reflects on the project and considers what obstacles were in the project
and what can be improved in the future, and makes recommendations for future work or
improvements to existing processes.
I only

II only

both I and II

neither I nor II

In creating robust models, the following questions need to be considered EXCEPT

Does the model avoid intolerable mistakes?
How consistent are the contents and files?
Do any of the inputs need to be transformed or eliminated?
Will the kind of model chosen support the runtime requirements?
Which of the following are activities done under phase 5 of the data analytics life cycle?
The team determines if it succeeded or failed in its objectives.
The team reflects on the implications of these findings and measures the business value.
The team records all the findings and then selects the three most significant ones that can be shared
with the stakeholders.
All of the Above

Quiz 5

Prior to any regression modelling, the data should always be inspected for the following EXCEPT
Data-entry errors
Expected pattern
Missing values
Which of the following statements is/are ALWAYS TRUE?
I. Inferential statistics consists of Estimation and Hypothesis Testing
II. The link between inferential and descriptive statistics is probability
I only

II only

both I and II

neither I nor II

In predicting Sales Revenue using Newspaper Ads Expenses, we have the following regression results

Estimate the predicted sales if newspaper ad expenses are 60 units.
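
As a minimal sketch of the arithmetic behind this kind of question: a simple linear regression predicts the outcome as the intercept plus the slope times the input value. The regression output for this question is not reproduced in this text, so the intercept and slope below are hypothetical placeholders rather than the quiz's values, and `predict_simple` is an illustrative name only.

```python
def predict_simple(intercept: float, slope: float, x: float) -> float:
    """Return the fitted value intercept + slope * x of a simple linear regression."""
    return intercept + slope * x


# Hypothetical coefficients for illustration only; substitute the values
# reported in the actual regression output.
HYPOTHETICAL_INTERCEPT = 10.0
HYPOTHETICAL_SLOPE = 0.5

# Newspaper ad expenses of 60 units, as in the question.
predicted_sales = predict_simple(HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_SLOPE, 60)
print(predicted_sales)
```

With the real coefficients substituted, the same one-line formula gives the estimate the question asks for.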





The following characterizes inferential statistics EXCEPT

Draw conclusions for a larger group/data
Determine relationships
Present data
Make prediction
Which of the following is/are ALWAYS TRUE about simple regression?
I. Simple regression attempts to predict the dependent variable using more than one independent
II. Simple regression consists of one regression coefficient for each explanatory variable.
I only

II only

both I and II

neither I nor II

Which of the following is/are ALWAYS TRUE about regression analysis?

I. It’s the technique used most frequently to analyze the relationship between two or more variables.
II. Predictor variables could either be discrete or continuous.
I only

II only

both I and II

neither I nor II

In predicting Sales Revenue using TV and Radio Ads Expenses, we have the
following regression results

Estimate the predicted sales if TV and radio ad expenses are 200 and 50, respectively.
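
With two predictors, the estimate is the intercept plus each coefficient multiplied by its own input. The sketch below assumes hypothetical coefficient values, since the regression table for this question is not reproduced in this text; the function name `predict_linear` and the dictionary keys are likewise illustrative.

```python
def predict_linear(intercept: float, coefficients: dict, inputs: dict) -> float:
    """Return intercept + sum of coefficient * input over every predictor."""
    return intercept + sum(coefficients[name] * inputs[name] for name in coefficients)


# Hypothetical coefficients for illustration only; replace them with the
# values from the actual regression output.
hypothetical_intercept = 3.0
hypothetical_coefficients = {"TV": 0.05, "Radio": 0.2}

# Expense levels stated in the question.
expenses = {"TV": 200, "Radio": 50}

predicted_sales = predict_linear(hypothetical_intercept, hypothetical_coefficients, expenses)
print(predicted_sales)
```

Keying both dictionaries by predictor name keeps each coefficient paired with the right input, which is an easy place to slip when working the sum by hand.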




Quiz 6
Based on the following results of logistic regression, which of the following statements is/are TRUE?
I. For every 1-unit increase in Age, the value of the logistic function increases by 0.16.
II. The regression coefficient for the Married variable is not significant.

I only

II only

both I and II

neither I nor II

Based on the following results of logistic regression, what is the likelihood of churning when Age = 40
and Churned_contacts = 5? (Note: Round coefficients up to 2 decimal places)




Which of the following is TRUE about the logistic function?

I. As the value of y increases, the likelihood of the event f(y) also increases.
II. The values of y are not directly observed but rather, only the value of f(y) in terms of success or
failure is observed.
I only

II only

both I and II

neither I nor II
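
For reference, the standard logistic function is f(y) = 1 / (1 + e^(-y)), where y is a linear combination of the input variables. The sketch below evaluates it for a churn-style example with Age = 40 and Churned_contacts = 5. The coefficient values are hypothetical, because the logistic regression output is not reproduced in this text, and the function names are illustrative.

```python
import math


def logistic(y: float) -> float:
    """Standard logistic function 1 / (1 + e^(-y)), written to avoid overflow for large |y|."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)


def linear_score(intercept: float, coefficients: dict, inputs: dict) -> float:
    """Return y = intercept + sum of coefficient * input over every predictor."""
    return intercept + sum(coefficients[name] * inputs[name] for name in coefficients)


# Hypothetical coefficients for illustration only; use the rounded values
# from the actual logistic regression output instead.
hypothetical_intercept = 3.5
hypothetical_coefficients = {"Age": -0.16, "Churned_contacts": 0.38}

y = linear_score(hypothetical_intercept, hypothetical_coefficients,
                 {"Age": 40, "Churned_contacts": 5})
churn_probability = logistic(y)
print(churn_probability)

# f(y) rises as y rises: a larger y always maps to a larger probability.
print(logistic(-2.0) < logistic(0.0) < logistic(2.0))
```

The output of `logistic` always lies between 0 and 1, which is why it can be read as the likelihood of the event.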

Which of the following is TRUE about logistic regression?

I. When the outcome variable is categorical in nature, logistic regression can be used to
predict the likelihood of an outcome based on the input variables.
II. Logistic regression can only be applied to an outcome variable with two values such as
true/false, pass/fail, or yes/no.
I only

II only

both I and II

neither I nor II

The following are examples of applications for logistic regression EXCEPT

A model of a patient’s successful response to a specific medical treatment with variables including
age, weight, blood pressure, and cholesterol levels.
A churn model for a customer switching to a new network given age and number of contacts who
A model to determine the relationship of amount of income given age, education, number of years
working and gender.
A model to determine the likelihood of a person buying a new automobile given age, income and
