Virtual Position Forum
Please register to watch content in detail
Thanks
Admin virtual position



CS614 - Data Warehousing Final Term Paper


Post by winnersgroup Mon Feb 14, 2011 2:46 pm

Question No: 1 ( Marks: 1 ) - Please choose one
A data warehouse may include
► Legacy systems
► Only internal data sources
► Privacy restrictions
► Small data mart
Question No: 2 ( Marks: 1 ) - Please choose one
De-Normalization normally speeds up
► Data Retrieval
► Data Modification
► Development Cycle
► Data Replication

Question No: 3 ( Marks: 1 ) - Please choose one
In horizontal splitting, we split a relation into multiple tables on the basis of
► Common Column Values
► Common Row Values
► Different Index Values
► Value resulted by ad-hoc query
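For anyone revising Question 3, here is a minimal sketch of horizontal splitting: rows of one relation are distributed into separate tables according to the value they share in a chosen column. The `rows` data and the `campus` column are hypothetical, made up only for illustration.

```python
from collections import defaultdict

def horizontal_split(rows, column):
    """Group rows into separate 'tables' keyed by the value found in `column`."""
    tables = defaultdict(list)
    for row in rows:
        tables[row[column]].append(row)
    return dict(tables)

# Hypothetical student relation; every split table keeps all of the columns.
rows = [
    {"id": 1, "campus": "Lahore"},
    {"id": 2, "campus": "Karachi"},
    {"id": 3, "campus": "Lahore"},
]
parts = horizontal_split(rows, "campus")
```

Each resulting table has the same columns as the original; only the rows are divided.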
Question No: 4 ( Marks: 1 ) - Please choose one
Multidimensional databases typically use proprietary __________ format to store pre-summarized cube structures.
► File
► Application
► Aggregate
► Database

Question No: 5 ( Marks: 1 ) - Please choose one
A dense index, if fits into memory, costs only ______ disk I/O access to locate a record by given key.
► One
► Two
► lg (n)
► n
Question No: 6 ( Marks: 1 ) - Please choose one
All data is ______________ of something real.

I An Abstraction
II A Representation

Which of the following options is true?
► I Only
► II Only
► Both I & II
► None of I & II
Question No: 7 ( Marks: 1 ) - Please choose one
The key idea behind ___________ is to take a big task and break it into subtasks that can be processed concurrently on a stream of data inputs in multiple, overlapping stages of execution.
► Pipeline Parallelism
► Overlapped Parallelism
► Massive Parallelism
► Distributed Parallelism
Question No: 8 ( Marks: 1 ) - Please choose one
Non uniform distribution, when the data is distributed across the processors, is called ______.
► Skew in Partition
► Pipeline Distribution
► Distributed Distribution
► Uncontrolled Distribution
Question No: 9 ( Marks: 1 ) - Please choose one
The goal of ideal parallel execution is to completely parallelize those parts of a computation that are not constrained by data dependencies. The smaller the portion of the program that must be executed __________, the greater the scalability of the computation.
► None of these
► Sequentially
► In Parallel
► Distributed
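The relationship described in Question 9 is commonly quantified with Amdahl's law. The question itself does not name it, so treat this as an assumption for illustration. The sketch below computes the upper bound on speedup for a given fraction of work that cannot be parallelized; the function name is hypothetical.

```python
def amdahl_speedup(sequential_fraction, processors):
    """Upper bound on speedup when `sequential_fraction` of the work cannot be parallelized."""
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)

# The smaller the non-parallel portion, the closer the speedup gets to the processor count.
for f in (0.5, 0.1, 0.01):
    print(f"non-parallel fraction={f:.2f}  speedup on 100 processors={amdahl_speedup(f, 100):.1f}")
```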
Question No: 10 ( Marks: 1 ) - Please choose one
If ‘M’ rows from table-A match the conditions in the query, then table-B is accessed ‘M’ times. Suppose table-B has an index on the join column. If ‘a’ I/Os are required to read the index blocks for each scan and ‘b’ I/Os for each data block, then the total cost of accessing table-B is approximately _____________ logical I/Os.
► (a + b)M
► (a - b)M
► (a + b + M)
► (a * b * M)
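To see how the cost in Question 10 adds up, here is a rough sketch. It assumes each of the M probes into table-B pays 'a' I/Os for the scan plus 'b' I/Os for one data block, so the per-probe cost is simply multiplied by M. The function name is hypothetical.

```python
def inner_table_cost(m, a, b):
    """Approximate logical I/Os to access the inner table m times.

    a: I/Os per scan, b: I/Os per data block (one block per probe is assumed).
    """
    return (a + b) * m

# Example with made-up numbers: 1000 matching rows, 3 I/Os per scan, 1 per data block.
cost = inner_table_cost(1000, 3, 1)
```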
Question No: 11 ( Marks: 1 ) - Please choose one
Data mining is a/an __________ approach, where browsing through data using data mining techniques may reveal something that might be of interest to the user as information that was unknown previously.
► Exploratory
► Non-Exploratory
► Computer Science

Question No: 12 ( Marks: 1 ) - Please choose one
Data mining evolved as a mechanism to address the limitations of ________ systems in dealing with massive data sets with high dimensionality, new data types, multiple heterogeneous data resources, etc.
► OLTP
► OLAP
► DSS
► DWH
Question No: 13 ( Marks: 1 ) - Please choose one
________ is the technique in which existing heterogeneous segments are reshuffled and relocated into homogeneous segments.
► Clustering
► Aggregation
► Segmentation
► Partitioning
Question No: 14 ( Marks: 1 ) - Please choose one
Different techniques are available to measure or quantify similarity or dissimilarity. Which of the following options names the available techniques?
► Pearson correlation is the only technique
► Euclidean distance is the only technique
► Both Pearson correlation and Euclidean distance
► None of these
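For reference, both measures mentioned in Question 14 are easy to compute by hand. The sketch below is plain Python with hypothetical function names; it assumes the two vectors have equal length and, for the correlation, that neither is constant.

```python
import math

def euclidean_distance(x, y):
    """Straight-line distance between two equal-length vectors (0 means identical)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_correlation(x, y):
    """Linear correlation in [-1, 1]; assumes neither vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Note the difference in what they capture: Euclidean distance is sensitive to magnitude, while Pearson correlation only looks at whether the two vectors rise and fall together.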

Question No: 15 ( Marks: 1 ) - Please choose one
For a given data set, to get a global view in unsupervised learning we use
► One-way Clustering
► Bi-clustering
► Pearson correlation
► Euclidean distance
Question No: 16 ( Marks: 1 ) - Please choose one
In a DWH project, it is ensured that the ___________ environment is similar to the production environment.

► Designing
► Development
► Analysis
► Implementation

Question No: 17 ( Marks: 1 ) - Please choose one
For a DWH project, the key requirements are ________ and product experience.
► Tools
► Industry
► Software
► None of these
Question No: 18 ( Marks: 1 ) - Please choose one
Pipeline parallelism focuses on increasing throughput of task execution, NOT on __________ sub-task execution time.
► Increasing
► Decreasing
► Maintaining
► None of these
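A small back-of-the-envelope model may help with the pipeline questions (7 and 18). It assumes every stage takes the same time and there are no stalls, which is an idealization of my own and not something stated in the paper. The function names are hypothetical.

```python
def sequential_time(tasks, stages, stage_time):
    """Every task runs through all stages before the next task starts."""
    return tasks * stages * stage_time

def pipelined_time(tasks, stages, stage_time):
    """Stages overlap: the first task fills the pipeline, then one task completes per step."""
    return (stages + tasks - 1) * stage_time

# A single task is no faster in a pipeline; the gain shows up only as throughput.
single = pipelined_time(1, 4, 1)
batch_sequential = sequential_time(10, 4, 1)
batch_pipelined = pipelined_time(10, 4, 1)
```

Under these assumptions the time for one task is unchanged, but a batch of tasks finishes much sooner, which is the throughput point the questions are making.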
Question No: 19 ( Marks: 1 ) - Please choose one
Many data warehouse project teams waste enormous amounts of time searching in vain for a ___________________.
► Silver Bullet
► Golden Bullet
► Suitable Hardware
► Compatible Product

Question No: 20 ( Marks: 1 ) - Please choose one
Focusing on data warehouse delivery only often ends up in _________.
► Rebuilding
► Success
► Good Stable Product
► None of these
Question No: 21 ( Marks: 1 ) - Please choose one
Pakistan is one of the five major ________ countries in the world.
► Cotton-growing
► Rice-growing
► Weapon Producing

Question No: 22 ( Marks: 1 ) - Please choose one
_____________ is a process that involves gathering information about columns by executing certain queries, with the intention of identifying erroneous records.
► Data profiling
► Data Anomaly Detection
► Record Duplicate Detection
► None of these
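As a rough idea of what "gathering information about columns" in Question 22 can look like, the sketch below summarizes one column's values. In practice this would be done with queries against the source tables; the plain-Python version, the function name, and the sample values are assumptions for illustration only.

```python
from collections import Counter

def profile_column(values):
    """Summarize a column: row count, nulls, distinct values, and the most frequent values."""
    non_null = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "most_common": Counter(non_null).most_common(3),
    }

# Hypothetical gender column: the stray "X" and the missing value stand out in the profile.
profile = profile_column(["M", "F", "F", None, "X", "F"])
```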
Question No: 23 ( Marks: 1 ) - Please choose one
Relational databases allow you to navigate the data in ____________ that is appropriate using the primary, foreign key structure within the data model.
► Only One Direction
► Any Direction
► Two Direction
► None of these
Question No: 24 ( Marks: 1 ) - Please choose one
DSS queries do not involve a primary key
► True
► False
Question No: 25 ( Marks: 1 ) - Please choose one
__________________ contributes to an under-utilization of valuable and expensive historical data, and inevitably results in a limited capability to provide decision support and analysis.
► The lack of data integration and standardization
► Missing Data
► Data Stored in Heterogeneous Sources
Question No: 26 ( Marks: 1 ) - Please choose one
DTS allows us to connect through any data source or destination that is supported by ____________
► OLE DB
► OLAP
► OLTP
► Data Warehouse

Question No: 27 ( Marks: 1 ) - Please choose one
Data Transformation Services (DTS) provide a set of _____ that lets you extract, transform, and consolidate data from disparate sources into single or multiple destinations supported by DTS connectivity.

► Tools
► Documentations
► Guidelines
Question No: 28 ( Marks: 1 ) - Please choose one
Execution can be completed successfully or it may be stopped due to some error. In case of successful completion of execution all the transactions will be ___________
► Committed to the database
► Rolled back

Question No: 29 ( Marks: 1 ) - Please choose one
If some error occurs, execution will be terminated abnormally and all transactions will be rolled back. In this case, when we access the database, we will find it in the state it was in before the ____________.
► Execution of package
► Creation of package
► Connection of package
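The commit/rollback behaviour in Questions 28 and 29 can be tried out with any transactional database. The sketch below uses Python's built-in `sqlite3` module as a stand-in, not DTS itself; the `sales` table and the `run_package` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL NOT NULL)")
conn.execute("INSERT INTO sales VALUES (1, 100.0)")
conn.commit()

def run_package(conn, rows):
    """Load all rows in one transaction; roll back everything if any row fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        return True
    except sqlite3.Error:
        return False

# The second row reuses primary key 1, so the whole load fails and is rolled back.
ok = run_package(conn, [(2, 50.0), (1, 75.0)])
count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

After the failed run the table holds only the row that was there before the load started, which is the state the question is describing.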

Question No: 30 ( Marks: 1 ) - Please choose one
To judge effectiveness we perform data profiling twice.
► One before Extraction and the other after Extraction
► One before Transformation and the other after Transformation
► One before Loading and the other after Loading

Question No: 31 ( Marks: 2 )
What are the two extremes for technical architecture design? Which one is better?

Question No: 32 ( Marks: 2 )
What is the value validation process?

Value validation is the process of ensuring that each value that is sent to the data warehouse is accurate.

Question No: 33 ( Marks: 2 )
What is the difference between training data and test data?

Question No: 34 ( Marks: 2 )
Do you think it will create the problem of non-standardized attributes, if one source uses 0/1 and second source uses 1/0 to store male/female attribute respectively? Give a reason to support your answer.
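To make the situation in Question 34 concrete: the same code means opposite things in the two sources, so the values cannot be merged as they are. A minimal sketch of mapping both onto one warehouse standard follows; the source names and the "M"/"F" target codes are hypothetical.

```python
# Hypothetical per-source mappings onto a single warehouse standard ("M"/"F").
SOURCE_GENDER_MAPS = {
    "source_a": {0: "M", 1: "F"},  # first source stores male/female as 0/1
    "source_b": {1: "M", 0: "F"},  # second source stores male/female as 1/0
}

def standardize_gender(source, code):
    """Translate a source-specific gender code into the warehouse standard."""
    return SOURCE_GENDER_MAPS[source][code]
```

Without a per-source mapping like this, a raw 0 from one source and a raw 0 from the other would be treated as the same value even though they mean different things.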
Question No: 35 ( Marks: 3 )
Why is building a data warehouse a challenging activity? What are the three broad categories of data warehouse development methods?

1. Waterfall model
2. RAD model
3. Spiral Model
Question No: 36 ( Marks: 3 )
What are three fundamental reasons for warehousing Web data?

1. Web data is unstructured and dynamic, so keyword search is insufficient.
2. Web logs contain a wealth of information, as the web is a key touch point.
3. The web is shifting from a distribution platform to a general communication platform.

Question No: 37 ( Marks: 3 )
What types of operations are provided by MS DTS?

1. Providing connectivity to different databases
2. Building query graphically
3. Extracting data from disparate databases
4. Transforming data
5. Copying database objects
6. Providing support of different scripting languages (by default VB-script and Java –

Question No: 38 ( Marks: 3 )
What problems may be faced during Change Data Capture (CDC) while reading a log/journal tape?

Problems with reading a log/journal tape are many:
1. Contains a lot of extraneous data
2. Format is often arcane
3. Often contains addresses instead of data values and keys
4. Sequencing of data in the log tape often has deep and complex implications
5. Log tape varies widely from one DBMS to another.

Question No: 39 ( Marks: 5 )
What are the seven steps for extracting data using the SQL Server DTS wizard?

SQL Server Data Transformation Services (DTS) is a set of graphical tools and programmable objects that allow you to extract, transform, and consolidate data from disparate sources into single or multiple destinations. SQL Server Enterprise Manager provides easy access to the DTS tools.

Question No: 40 ( Marks: 5 )
Explain the Analytic Applications Development phase of the Analytic Applications track of Kimball’s model.

Ans:
The DWH development lifecycle (Kimball’s approach) has three parallel tracks emanating from requirements definition. These are:
1. Technology track
2. Data track
3. Analytic applications track

Analytic Applications Track:
Analytic applications also serve to encapsulate the analytic expertise of
the organization, providing a jump-start for the less analytically inclined.
It consists of two phases.

1. Analytic applications specification
2. Analytic applications development

Analytic applications specification:
The main features of Analytic applications specification are:
• Starter set of 10-15 applications.
• Prioritize and narrow to critical capabilities.
• A single template can be used to get 15 applications.
• Set standards: menus, output (O/P), look and feel.
• From the standards: template, layout, input (I/P) variables, calculations.
• Common understanding between business & IT users.

Following the business requirements definition, we need to review the findings and collected sample reports to identify a starter set of approximately 10 to 15 analytic applications. We want to narrow our initial focus to the most critical capabilities so that we can manage expectations and ensure on-time delivery. Business community input will be critical to this prioritization process. While 15 applications may not sound like much, a single template can yield many specific analyses simply by changing its input variables.
Before designing the initial applications, it's important to establish standards for the applications, such as
• common pull-down menus and
• consistent output look and feel.
Using the standards, we specify each application template, capturing sufficient information about the
• layout,
• input variables,
• calculations, and
• breaks
so that both the application developer and business representatives share a common understanding.
During the application specification activity, we also must give consideration to the organization of the applications. We need to identify structured navigational paths to access the applications, reflecting the way users think about their business. Leveraging the Web and customizable information portals are the dominant strategies for disseminating application access.

Analytic applications development:

The main features of analytic applications development are:
1. Standards: naming, coding, libraries etc.

2. Coding begins AFTER DB design complete, data access tools installed,
subset of historical data loaded.

3. Tools: Product specific high performance tricks, invest in tool-specific
education.

4. Benefits: Quality problems will be found with tool usage => staging.

5. Actual performance and time gauged.

When we move into the development phase for the analytic applications, we again need to focus on standards. Standards for
• naming conventions,
• calculations,
• libraries, and
• coding
should be established to minimize future rework. The application development activity can begin once the database design is complete, the data access tools and metadata are installed, and a subset of historical data has been loaded. The application template specifications should be revisited to account for the inevitable changes to the data model since the specifications were completed.
We should invest in appropriate tool-specific education or supplemental resources for the development team.
While the applications are being developed, several ancillary benefits result. Application developers, armed with a robust data access tool, will quickly find needling problems in the data haystack despite the quality assurance performed by the staging application. We need to allow time in the schedule to address any flaws identified by the analytic applications.
Developers can also realistically test query response times at this point, so this is the time to review performance-tuning strategies. The application development quality-assurance activities cannot be completed until the data is stabilized. We need to make sure that there is adequate time in the schedule beyond the final data staging cutoff to allow for an orderly wrap-up of the application development tasks.


Post by Asad Wed Feb 16, 2011 6:05 pm

very nice (thumbs up)

Post by waqas Thu Feb 17, 2011 10:20 am

Question No: 1 ( Marks: 1 ) - Please choose one
The need to synchronize data upon update is called
► Data Manipulation
► Data Replication
► Data Coherency
► Data Imitation

Question No: 2 ( Marks: 1 ) - Please choose one
Taken jointly, the extract programs or naturally evolving systems formed a spider web, also known as
► Distributed Systems Architecture
► Legacy Systems Architecture
► Online Systems Architecture
► Intranet Systems Architecture

Question No: 3 ( Marks: 1 ) - Please choose one
For good decision making, data should be integrated across the organization to cross the LoB (Line of Business). This is to give the total view of organization from:
► Owner’s Perspective
► Customer’s Perspective
► Decision Maker’s Perspective
► Employee's Perspective
Question No: 4 ( Marks: 1 ) - Please choose one
A node of a B-Tree is stored in a memory block, and traversing a B-Tree involves ______ page faults.
► O (n)
► O (n^2)
► O (n lg n)
► O (lg n)
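For intuition on Question 4: a lookup touches one node per level, so the number of blocks visited grows with the height of the tree and not with the number of keys. The sketch below estimates that height under the simplifying assumption that every node is full and has the same fanout; the function name is hypothetical.

```python
def btree_levels(num_keys, fanout):
    """Number of node levels needed so that fanout**levels >= num_keys (full nodes assumed)."""
    levels, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels

# With made-up numbers: a million keys and 100 entries per node need only a few levels.
levels = btree_levels(1_000_000, 100)
```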
Question No: 5 ( Marks: 1 ) - Please choose one
Which statement is true for De-Normalization?
► Redundant data is a performance liability at query time, but is a performance benefit at update time.
► Redundant data is a performance benefit at both query time and update time.
► Redundant data is a performance liability at both query time and update time.
► Redundant data is a performance benefit at query time, but is a performance liability at update time.


Post by kamran Sun Feb 20, 2011 8:04 pm


Post by kamran Sun Feb 20, 2011 8:06 pm

• Write the two extremes of technical architecture design. Which is better? (2 marks)
• What are the advantages of a bitmap index?
• What is the disadvantage of a bitmap index?
• Pros and cons of k-means clustering?
• How does the page dimension capture the static and dynamic nature of different web pages?
• Difference between estimation and projection?
• Do you think it will create the problem of non-standardized attributes, if one source uses 0/1 and second source uses 1/0 to store male/female attribute respectively? Give a reason to support your answer.
• Methods to develop a DWH?
• What are the seven steps for extracting data using the SQL Server DTS wizard?
• Explain the Analytic Applications Development phase of the Analytic Applications track of Kimball’s model.
• Issues of data cleansing and acquisition in the Agri-Data Warehouse.
• Describe reverse proxy.
• Seven steps to load data via SQL using the DTS wizard (it was something like this).
• Names of five de-normalization techniques.


Post by kamran Sun Feb 20, 2011 8:07 pm

Q1
Write a query to extract the total number of female students registered in BS Telecom. (5 marks)
Q2
Describe the lessons learned during the Agri-Data Warehouse case study. (5 marks)
Q3
What are the fundamental strengths and weaknesses of k-means clustering? (5 marks)
Q4
Data profiling is a process of gathering information about columns. What purposes must it fulfill? Describe briefly. (3 marks)
Q5
Define additive and non-additive facts.
Q6
What are three fundamental reasons for warehousing web data? (3 marks)
Q7
What are the two basic data warehousing implementation strategies and their suitability conditions? (3 marks)
Q8
List and explain the fundamental advantages of bitmap indexing. (3 marks)
Q9
What are the major operations of data mining? (2 marks)
Q10
What will be the effect if we program a package by using the DTS object model? (2 marks)
Q11
Write down the steps of handling skew in range partitioning.
Q12
What types of anomalies exist if a table is in 2NF but not in 3NF? (2 marks)
Q13
What are the three methods for creating a DTS package? (2 marks)
Q14
What are the two extremes for technical architecture design? Which one is better? (2 marks)