Category Archives: BIT (Bachelor of Information Technology)

Data Warehouse Manager

The Data Warehouse (DW) Manager provides leadership across all aspects of DW activities, including oversight of the design and development of the new warehouse, management of current and future reporting requirements, and oversight of the Extract, Transform and Load (ETL) processes.

The perceived strength of data warehousing within an organization will be the sum of the strengths of its Project Managers. Project Managers must deliver on commitments, and must deliver on time. They do this by drawing resources from within the data warehouse team and from consultancies as necessary, and by establishing partnerships with the other internal support organizations required to support a data warehouse iteration. A Project Manager delivers by:

• Maintaining a highly detailed plan and obsessively caring about the progress on it.
• Applying personal skill and judgment to everything on the project. This is a real value-add of the
Project Manager. It is the Project Manager’s job to exercise relevant discretion.
• Matching team members’ skills and aspirations as closely as possible to tasks on the plan.
• Tracking all relevant metrics for each iteration:
– Project Plan milestones
– Issues list
– Adherence to change control practices
– Adherence to source code control practices
– Documentation fit for users and support personnel
– Architectural components adherence to fit for purpose and standards
– Regression testing performed and tests updated based on changes
– Team members fit for tasks and career-enhanced

Components of an Embedded System

The components of an embedded system are: hardware, software, and a real-time operating system (RTOS).

i) Hardware

• Power Supply
• Processor
• Memory
• Timers
• Serial communication ports
• Input/Output circuits
• System application specific circuits

ii) Software:

The application software is required to perform the series of tasks.
An embedded system’s software is designed with three constraints in mind:
• Availability of system memory
• Availability of processor speed
• The need to limit power dissipation when running the system continuously in cycles of wait-for-event, run, stop, and wake up.

iii) Real Time Operating System (RTOS):

It supervises the application software and provides a mechanism to let the processor run each process (task) according to a schedule, switching from one process to another.
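The switching behavior described above can be illustrated with a toy sketch. This is not a real RTOS, just a minimal round-robin scheduler that gives each task one time slice in turn; the task names are invented for the example.

```python
# Illustrative sketch (not a real RTOS): a minimal round-robin
# scheduler that switches the processor between tasks, mimicking
# how an RTOS runs processes as per a schedule.

from collections import deque

def sensor_task():
    for i in range(2):
        yield f"sensor reading {i}"

def display_task():
    for i in range(2):
        yield f"display update {i}"

def round_robin(tasks):
    """Give each task one time slice, then switch to the next."""
    queue = deque(tasks)
    log = []
    while queue:
        task = queue.popleft()
        try:
            log.append(next(task))   # one "time slice" of work
            queue.append(task)       # task not finished: reschedule it
        except StopIteration:
            pass                     # task finished: drop it
    return log

print(round_robin([sensor_task(), display_task()]))
# ['sensor reading 0', 'display update 0', 'sensor reading 1', 'display update 1']
```

A real RTOS would preempt tasks on timer interrupts rather than rely on cooperative yields, but the interleaving it produces is the same idea.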

Local Area Network (LAN) Vs. Wide Area Network (WAN)

Networks are broadly divided into two types: LANs (Local Area Networks) and WANs (Wide Area Networks). The differences between a LAN and a WAN are summarized below.

Local Area Network (LAN):
• Connects hosts within a relatively small geographical area, such as the same room, building, or campus.
• Faster than a WAN: typical speeds of 10 Mbps to 10 Gbps.
• Cheaper, and under the control of a single owner.

Wide Area Network (WAN):
• Hosts may be widely dispersed, such as across cities, countries, or continents.
• Slower than a LAN: typical speeds of 64 Kbps to 8 Mbps.
• More expensive, and not under the control of a single owner.
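To make the speed gap concrete, here is a back-of-the-envelope calculation of how long a 100 MB transfer takes at a typical LAN rate (100 Mbps) versus a typical WAN rate (8 Mbps), using the figures quoted above. It ignores protocol overhead and assumes decimal megabytes.

```python
# Transfer time for a file at LAN vs WAN rates (ideal conditions,
# decimal units, no protocol overhead).

def transfer_seconds(size_mb, rate_mbps):
    bits = size_mb * 8 * 10**6          # megabytes -> bits
    return bits / (rate_mbps * 10**6)   # bits / (bits per second)

print(transfer_seconds(100, 100))  # LAN at 100 Mbps: 8.0 seconds
print(transfer_seconds(100, 8))    # WAN at 8 Mbps: 100.0 seconds
```

Real-world times are longer on both networks, but the order-of-magnitude gap between LAN and WAN rates is the point of the comparison.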

What do you mean by data warehouse? Explain the query manager.

Data Warehouse :

In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis. Integrating data from one or more disparate sources creates a central repository of data, the data warehouse. Data warehouses store current and historical data and are used to create trending reports for senior management, such as annual and quarterly comparisons.

The data stored in the warehouse is uploaded from the operational systems (such as marketing and sales). The data may pass through an operational data store for additional operations before it is used in the DW for reporting.

A data warehouse maintains a copy of information from the source transaction systems. This architectural complexity provides the opportunity to :

  • Congregate data from multiple sources into a single database so a single query engine can be used to present data.
  • Mitigate the problem of database isolation level lock contention in transaction processing systems caused by attempts to run large, long running, analysis queries in transaction processing databases.
  • Maintain data history, even if the source transaction systems do not.
  • Integrate data from multiple source systems, enabling a central view across the enterprise. This benefit is always valuable, but particularly so when the organization has grown by merger.
  • Improve data quality, by providing consistent codes and descriptions, flagging or even fixing bad data.
  • Present the organization’s information consistently.
  • Provide a single common data model for all data of interest regardless of the data’s source.
  • Restructure the data so that it makes sense to the business users.
  • Restructure the data so that it delivers excellent query performance, even for complex analytic queries, without impacting the operational systems.
  • Add value to operational business applications, notably customer relationship management (CRM) systems.
  • Make decision-support queries easier to write.

Query Manager : 

The query manager is the system component that performs all the operations necessary to support the query management process. It is typically constructed from a combination of user access tools, data warehouse monitoring tools, native database facilities, and shell scripts.

The architecture of the query manager performs the following operations:

– Direct queries to the appropriate table

– Schedule the execution of the user queries
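The two operations above can be sketched in a toy form: a routing function that directs a query to the appropriate table, and a small scheduler that executes queued user queries in priority order. The table names, routing map, and priority policy are all illustrative assumptions, not part of any real query-manager product.

```python
# Toy sketch of a query manager's two jobs: routing queries to the
# appropriate table and scheduling their execution. All names here
# are hypothetical.

import heapq

# Assumed routing map: query subject -> warehouse table.
ROUTES = {"sales": "fact_sales", "customer": "dim_customer"}

def direct_query(subject):
    """Direct a query to the appropriate warehouse table."""
    return ROUTES.get(subject, "staging_area")

class QueryScheduler:
    """Run queued user queries in priority order (lower = sooner)."""
    def __init__(self):
        self._queue = []
        self._seq = 0                    # tie-breaker keeps FIFO order
    def submit(self, priority, sql):
        heapq.heappush(self._queue, (priority, self._seq, sql))
        self._seq += 1
    def run_all(self):
        order = []
        while self._queue:
            _, _, sql = heapq.heappop(self._queue)
            order.append(sql)            # a real manager would execute it
        return order

sched = QueryScheduler()
sched.submit(2, "SELECT * FROM " + direct_query("customer"))
sched.submit(1, "SELECT * FROM " + direct_query("sales"))
print(sched.run_all())
# ['SELECT * FROM fact_sales', 'SELECT * FROM dim_customer']
```

A production query manager would, of course, hand the scheduled queries to the database engine and apply workload rules far richer than a single priority number.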

What is data mining? What are the functions of data mining? Write about association analysis with an example.

Data Mining :

Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information – information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.

Data mining is primarily used today by companies with a strong consumer focus – retail, financial, communication, and marketing organizations. It enables these companies to determine relationships among “internal” factors such as price, product positioning, or staff skills, and “external” factors such as economic indicators, competition, and customer demographics. And, it enables them to determine the impact on sales, customer satisfaction, and corporate profits. Finally, it enables them to “drill down” into summary information to view detail transactional data.

Functions of Data Mining :

1. Class Description
2. Association
3. Classification
4. Prediction
5. Clustering
6. Time-series analysis

Association Analysis :

The purpose of association analysis is to find patterns, particularly in business processes, and to formulate suitable rules of the sort “If a customer buys product A, that customer also buys products B and C”.

Example : If a customer buys mozzarella at the supermarket, that customer also buys tomatoes and basil.

Association analysis also helps you to identify cross-selling opportunities, for example. You can use the rules resulting from the analysis to place associated products together in a catalog, in the supermarket, or in the Web shop, or apply them when targeting a marketing campaign for product C at customers who have already purchased product A.
Association analysis determines these rules by using historic data to train the model. You can display and export the determined association rules.
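The training step can be illustrated with the two standard measures behind association rules: support (how often the items appear together) and confidence (how often the consequent appears given the antecedent). The toy baskets below are invented data, reusing the mozzarella example from above.

```python
# Minimal association-rule measures over toy transaction data:
# support and confidence for the rule "mozzarella -> tomatoes".

def support(transactions, items):
    """Fraction of transactions containing all of `items`."""
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """P(consequent | antecedent), estimated from the transactions."""
    return (support(transactions, antecedent | consequent)
            / support(transactions, antecedent))

baskets = [
    {"mozzarella", "tomatoes", "basil"},
    {"mozzarella", "tomatoes"},
    {"bread", "butter"},
    {"mozzarella", "bread"},
]

print(support(baskets, {"mozzarella", "tomatoes"}))       # 0.5
print(confidence(baskets, {"mozzarella"}, {"tomatoes"}))  # ~0.67
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds; the historic data plays the role of the training set mentioned above.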

Explain the importance of IP addressing and how can you classify it to represent IP address.

An IP address is an address used to uniquely identify a device on an IP network. The address is made up of 32 binary bits which can be divisible into a network portion and host portion with the help of a subnet mask. The 32 binary bits are broken into four octets (1 octet = 8 bits). Each octet is converted to decimal and separated by a period (dot). For this reason, an IP address is said to be expressed in dotted decimal format (for example, 172.16.81.100). The value in each octet ranges from 0 to 255 decimal, or 00000000 – 11111111 binary.
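The octet conversion described above is easy to demonstrate: split the dotted-decimal address into its four octets and show each as 8 binary bits.

```python
# Dotted-decimal to binary: each octet of the address becomes
# eight binary bits, as described in the text above.

def to_binary_octets(ip):
    return ".".join(format(int(octet), "08b") for octet in ip.split("."))

print(to_binary_octets("172.16.81.100"))
# 10101100.00010000.01010001.01100100
```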

Classification of IP Address

In a Class A address, the first octet is the network portion, so Class A has a major network address range of 1.0.0.0 – 127.255.255.255 (the 127.0.0.0/8 block is reserved for loopback). Octets 2, 3, and 4 (the next 24 bits) are for the network manager to divide into subnets and hosts as he/she sees fit. Class A addresses are used for networks that have more than 65,536 hosts (in fact, up to 16,777,214 hosts!).

In a Class B address, the first two octets are the network portion, so Class B has a major network address range of 128.0.0.0 – 191.255.255.255. Octets 3 and 4 (16 bits) are for local subnets and hosts. Class B addresses are used for networks that have between 256 and 65,534 hosts.

In a Class C address, the first three octets are the network portion, so Class C has a major network address range of 192.0.0.0 – 223.255.255.255. Octet 4 (8 bits) is for local subnets and hosts – perfect for networks with up to 254 hosts.

Network Masks:
A network mask tells you which portion of the address identifies the network and which portion identifies the node. Class A, B, and C networks have default masks, also known as natural masks, as shown here:
Class A: 255.0.0.0 (/8)
Class B: 255.255.0.0 (/16)
Class C: 255.255.255.0 (/24)
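The class ranges and their natural masks can be tied together in a few lines: classify an address by its first octet and return the default mask.

```python
# Classful addressing in brief: the first octet determines the
# class, and each class has a natural (default) mask.

def classify(ip):
    first = int(ip.split(".")[0])
    if first <= 127:
        return "A", "255.0.0.0"
    if first <= 191:
        return "B", "255.255.0.0"
    if first <= 223:
        return "C", "255.255.255.0"
    return "D/E", None   # multicast and experimental ranges

print(classify("172.16.81.100"))  # ('B', '255.255.0.0')
print(classify("10.0.0.1"))       # ('A', '255.0.0.0')
```

Modern networks use classless addressing (CIDR) rather than these fixed classes, but the classful scheme is still the standard way the concept is taught.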

What are the different types of disasters? Explain in brief how can you develop disaster recovery plan?

The different types of disasters can be classified into two categories. They are:

Natural disasters
A natural disaster is a major adverse event resulting from the earth’s
natural hazards. Examples of natural disasters are floods,
tsunamis, tornadoes, hurricanes/cyclones, volcanic
eruptions, earthquakes, heat waves, and landslides.

Man-made disasters
Man-made disasters are the consequence of technological or human
hazards. Examples include stampedes, urban fires, industrial
accidents, oil spills, nuclear explosions/nuclear radiation and acts
of war.

Develop disaster recovery plan

Step 1: Risk Analysis
The first step in drafting a disaster recovery plan is conducting a
thorough risk analysis of your computer systems. List all the possible
risks that threaten system uptime and evaluate how imminent they are in your particular IT shop.

Step 2: Establish the Budget
Once you’ve figured out your risks, ask ‘what can we do to suppress them, and how much will it cost?’

Step 3: Develop the Plan 
The feedback from the business units will begin to shape your DRP procedures. The recovery procedure should be written as a detailed plan or “script.” Establish a Recovery Team from among the IT staff and assign specific recovery duties to each member. Define how to deal with the loss of various aspects of the network, specifying who arranges for repairs or reconstruction and how the data recovery process occurs.

Step 4: Test
Once your DRP is set, test it frequently. Eventually you’ll need to
perform a component-level restoration of your largest databases to get a realistic assessment of your recovery procedure, but a periodic
walk-through of the procedure with the Recovery Team will ensure
that everyone knows their roles.

Difference between object-oriented programming (OOP) and procedure-oriented programming (POP)


Procedure-Oriented Programming (POP):
• Emphasis on doing things (procedures).
• Programs are divided into what are known as functions.
• Data moves openly around the system from function to function.
• Employs a top-down approach in program design.

Object-Oriented Programming (OOP):
• Emphasis on data rather than procedure.
• Programs are divided into what are known as objects.
• Data is hidden and cannot be accessed by external functions.
• Employs a bottom-up approach in program design.
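The contrast is easiest to see with the same small task written both ways. In the procedure-oriented version the data (a balance) moves openly between functions; in the object-oriented version the balance is hidden inside an object and reached only through its methods. The account example is invented for illustration.

```python
# POP style: data (the balance) is passed openly between functions.
def deposit(balance, amount):
    return balance + amount

def withdraw(balance, amount):
    return balance - amount

balance = withdraw(deposit(100, 50), 30)
print(balance)  # 120

# OOP style: the balance is hidden inside the object and cannot be
# touched directly by external functions.
class Account:
    def __init__(self, balance):
        self._balance = balance        # internal state, hidden
    def deposit(self, amount):
        self._balance += amount
    def withdraw(self, amount):
        self._balance -= amount
    def balance(self):
        return self._balance

acct = Account(100)
acct.deposit(50)
acct.withdraw(30)
print(acct.balance())  # 120
```

Both versions compute the same result; the difference is who owns the data and how the program is decomposed.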