Collection of data


Our world is becoming more and more information oriented. Every part of our life utilizes information in one form or another. So, it becomes essential for us to know how to extract meaningful information from such data. The extraction of meaningful information is studied in a branch of mathematics called Statistics.

Statistics involves the study of the collection, organization, analysis, interpretation, and presentation of data. In other words, it is the mathematical discipline concerned with collecting and summarizing data.

Types of data on the basis of the collection of data

Primary data: This is data collected by a researcher from first-hand sources, using methods such as surveys, interviews, or experiments. For example, a student might personally collect the following data for a research project:

• Height of 10 students in your class.

• Number of absentees on each day in your class for a month.

• Number of members in the families of your classmates.

• Height of 10 plants in or around your home.

Secondary data: This is data that has already been collected by someone else and is then updated, tailored, or modified for a specific purpose.

For example, in a school, the class-teachers of respective sections record attendance on a daily basis.

This data recorded by the class-teacher is an example of primary data. On a given day, the principal of the school asks for the attendance of all students in each section, to collate the total number of students present in the school on that day.

This data collected by the school principal is an example of secondary data.

Presenting Data for Clearer Reference

Statistical data without a definite presentation would be burdensome to use. Data presentation is one of the important aspects of Statistics. Presenting the data helps users to study and explain the statistics thoroughly. We are going to discuss the presentation of data and see how information is laid out methodically.

In this context, the topic Presentation of Data is set out for students to refer to, and it is studied with regard to the different types of data presentation.

Probability- An experimental Approach

Chapter 15 -  Probability

Probability

Probability measures how likely an event is to happen. It is a branch of mathematics that deals with the occurrence of random events, and its value is expressed as a number from zero to one. This basic theory also underlies probability distributions, which describe the possible outcomes of a random experiment. To find the probability of a single event, we first need to know the total number of possible outcomes.

Probability Definition in Math

Probability is a measure of the likelihood that an event will occur. Many events cannot be predicted with total certainty; we can only predict the chance that they occur, that is, how likely they are to happen. Probability ranges from 0 to 1, where 0 means the event is impossible and 1 means the event is certain. Probability for Class 10 is an important topic that explains these basic concepts. The probabilities of all the outcomes in a sample space add up to 1.

For example, when we toss a coin, we get either a head or a tail, so there are only two possible outcomes (H, T). If we toss two coins, there are four equally likely outcomes: (H, H), (H, T), (T, H), (T, T). If the order of the coins is ignored, these fall into three kinds of result: both coins show heads, both show tails, or one shows heads and the other tails. These three results are not equally likely, because one head and one tail can occur in two ways.
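Counting ordered outcomes, two coins give four equally likely results, which group into three kinds of result by the number of heads. The following is a minimal Python sketch of that enumeration; the names `faces`, `sample_space` and `heads_count` are chosen here for illustration only.

```python
from collections import Counter
from itertools import product

# Each coin can land heads (H) or tails (T).
faces = ["H", "T"]

# All ordered outcomes for two coins.
sample_space = list(product(faces, repeat=2))

# Group the ordered outcomes by how many heads they contain.
heads_count = Counter(outcome.count("H") for outcome in sample_space)

print(sample_space)  # the four ordered outcomes
print(heads_count)   # how many outcomes have 0, 1 or 2 heads
```

The count for "one head" is 2, which is why "one head and one tail" is twice as likely as "two heads".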

Formula for Probability

The probability of an event is defined as the ratio of the number of favourable outcomes to the total number of outcomes, provided all outcomes are equally likely.

Probability of an event, P(E) = Number of favourable outcomes / Total number of outcomes

Students sometimes confuse “favourable outcome” with “desirable outcome”. A favourable outcome is simply one in which the event under consideration occurs, whether or not it is wanted. This is the basic formula, but there are more formulas for different situations or events.

Solved Examples

1) There are 6 pillows on a bed: 3 are red, 2 are yellow and 1 is blue. What is the probability of picking a yellow pillow?

Ans: The probability is equal to the number of yellow pillows on the bed divided by the total number of pillows, i.e. 2/6 = 1/3.

2) There is a container full of coloured bottles: red, blue, green and orange. A bottle is picked out at random and then put back. Sumit did this 1000 times and got the following results:

  • No. of blue bottles picked out: 300
  • No. of red bottles: 200
  • No. of green bottles: 450
  • No. of orange bottles: 50

a) What is the probability that Sumit will pick a green bottle?

Ans: For every 1000 bottles picked out, 450 are green.

Therefore, P(green) = 450/1000 = 0.45

b) If there are 100 bottles in the container, how many of them are likely to be green?

Ans: The experiment suggests that about 450 out of every 1000 bottles are green.

Therefore, out of 100 bottles, about 45 are likely to be green.
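
The arithmetic in this example can be reproduced in a few lines of Python. This is only a sketch: the dictionary `counts` restates the results recorded above, and the variable names are assumptions made for illustration.

```python
from fractions import Fraction

# Counts recorded in the example above (1000 picks in total).
counts = {"blue": 300, "red": 200, "green": 450, "orange": 50}
total_picks = sum(counts.values())

# Experimental probability = times the event occurred / total trials.
p_green = Fraction(counts["green"], total_picks)

# Expected number of green bottles if the container holds 100 bottles.
expected_green = p_green * 100

print(float(p_green))   # 0.45
print(expected_green)   # 45
```
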
Probability Tree

A tree diagram helps to organize and visualize the different possible outcomes. It has two main parts: branches and ends. The probability of each branch is written on the branch, and each end shows a final outcome. Probabilities are multiplied along a path of branches and added across different paths, so the diagram shows when to multiply and when to add. For a single coin toss, the tree has two branches, heads and tails, each with probability 1/2.

Types of Probability

There are three major types of probabilities:

  • Theoretical Probability
  • Experimental Probability
  • Axiomatic Probability

Theoretical Probability

It is based on the possible chances of something happening. Theoretical probability relies on reasoning about the situation rather than on experiment. For example, if a coin is tossed, the theoretical probability of getting a head is ½.

Experimental Probability

It is based on the observations of an experiment. The experimental probability is calculated as the number of times the event occurs divided by the total number of trials. For example, if a coin is tossed 10 times and heads is recorded 6 times, then the experimental probability of heads is 6/10, or 3/5.

Axiomatic Probability

In axiomatic probability, a set of rules or axioms is laid down that applies to all types of probability. These axioms were set out by Kolmogorov and are known as Kolmogorov’s three axioms. With the axiomatic approach to probability, the chances of occurrence or non-occurrence of events can be quantified.

Conditional Probability is the likelihood of an event or outcome occurring based on the occurrence of a previous event or outcome.

Probability of an Event

Assume an event E can occur in r ways out of a total of n possible, equally likely ways. Then the probability of the event happening (its success) is expressed as:

P(E) = r/n

The probability that the event will not occur (its failure) is expressed as:

P(E’) = (n-r)/n = 1-(r/n)

E’ represents that the event will not occur.

Therefore, now we can say;

P(E) + P(E’) = 1

This means that the total of all the probabilities in any random test or experiment is equal to 1.
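
The relation P(E) + P(E’) = 1 can be checked with exact fractions. The sketch below is illustrative only: the helper names `probability` and `complement` are not from the text, and the die example (E = getting a 3, so r = 1 and n = 6) is assumed here for concreteness.

```python
from fractions import Fraction

def probability(r, n):
    """P(E) when E can occur in r of n equally likely ways."""
    return Fraction(r, n)

def complement(p):
    """P(E') = 1 - P(E)."""
    return 1 - p

# Assumed example: getting a 3 on a fair die (r = 1, n = 6).
p_e = probability(1, 6)
p_not_e = complement(p_e)

print(p_e, p_not_e, p_e + p_not_e)  # 1/6 5/6 1
```
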
What are Equally Likely Events?

When events have the same theoretical probability of happening, they are called equally likely events. The outcomes in a sample space are called equally likely if all of them have the same probability of occurring. For example, if you throw a die, the probability of getting 1 is 1/6. Similarly, the probability of getting each of the numbers 2, 3, 4, 5 and 6 is 1/6. Hence, when throwing a die, the events compared in each of the following examples are equally likely:

  • Getting 3 and getting 5 on throwing a die
  • Getting an even number and getting an odd number on a die
  • Getting 1, getting 2 or getting 3 on rolling a die

In each case, the events being compared have equal probabilities.
Complementary Events

Complementary events arise when there are only two possible outcomes: an event either occurs or it does not. A person coming or not coming to your house, or getting a job or not getting it, are examples of complementary events. The complement of an event is exactly its opposite, that is, the event not occurring. Some more examples are:

  • It will rain or not rain today
  • The student will pass the exam or not pass.
  • You win the lottery or you don’t.

Probability Theory

Probability theory has its roots in the 16th century, when J. Cardan, an Italian mathematician and physician, wrote the first work on the topic, The Book on Games of Chance. Since then, probability has attracted the attention of great mathematicians. Probability theory is the branch of mathematics that deals with the possibility of events happening. Although there are several distinct interpretations of probability, probability theory treats the concept precisely by expressing it through a set of axioms. These axioms formalize probability in terms of a probability space, which assigns a measure taking values between 0 and 1, known as the probability measure, to a set of possible outcomes called the sample space.
Probability Density Function

The Probability Density Function (PDF) describes how the probability of a continuous random variable is distributed over its range of values. The probability that the variable lies within a certain range is given by the area under the PDF over that range. The normal distribution, whose PDF is determined by its mean and standard deviation, is a common example; it is often used in science to represent real-valued variables whose distribution is not known.

Probability Terms and Definition

Some of the important probability terms are discussed here:

Applications of Probability

Probability has a wide variety of applications in real life. We see some common applications in everyday life when checking the results of events such as the following:

  • Choosing a card from the deck of cards
  • Flipping a coin
  • Throwing a die in the air
  • Pulling a red ball out of a bucket of red and white balls
  • Winning a lucky draw

Other Major Applications of Probability

  • It is used for risk assessment and modelling in various industries
  • Weather forecasting or prediction of weather changes
  • Probability of a team winning in a sport based on players and strength of team
  • In the share market, estimating the chances of a rise in share prices

Problems and Solutions on Probability

Question 1: Find the probability of ‘getting 3 on rolling a die’.

Solution:

Sample Space = S = {1, 2, 3, 4, 5, 6}

Total number of outcomes = n(S) = 6

Let A be the event of getting 3.

Number of favourable outcomes = n(A) = 1

i.e. A  = {3}

Probability, P(A) = n(A)/n(S) = 1/6

Hence, P(getting 3 on rolling a die) = 1/6

Question 2: Draw a random card from a pack of cards. What is the probability that the card drawn is a face card?

Solution:

A standard deck has 52 cards.

Total number of outcomes = n(S) = 52

Let E be the event of drawing a face card.

Number of favourable outcomes = n(E) = 4 x 3 = 12 (counting Jack, Queen and King only)

Probability, P = Number of Favourable Outcomes/Total Number of Outcomes

P(E) = n(E)/n(S)

= 12/52

= 3/13

P(the card drawn is a face card) = 3/13

Question 3: A vessel contains 4 blue balls, 5 red balls and 11 white balls. If three balls are drawn from the vessel at random, what is the probability that the first ball is red, the second ball is blue, and the third ball is white?

Solution:

Given that the vessel contains 4 + 5 + 11 = 20 balls and the balls are drawn one after another without replacement:

The probability that the first ball is red (the first event) is 5/20.

Since one ball has been drawn for the first event, the number of balls left for the second event is 20 – 1 = 19.

Hence, the probability that the second ball is blue (the second event) is 4/19.

With the first and second events having occurred, the number of balls left for the third event is 19 – 1 = 18.

The probability that the third ball is white (the third event) is 11/18.

Therefore, the required probability is 5/20 x 4/19 x 11/18 = 44/1368 = 11/342 ≈ 0.032.

Or we can express it as: P ≈ 3.2%.
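
The product of the three probabilities can be verified with exact fractions. This is a minimal sketch assuming, as the solution does, that the balls are not put back after each draw; the variable names are illustrative.

```python
from fractions import Fraction

blue, red, white = 4, 5, 11
total = blue + red + white  # 20 balls in the vessel

# One ball leaves the vessel after each draw, so the total shrinks by 1.
p_first_red = Fraction(red, total)
p_second_blue = Fraction(blue, total - 1)
p_third_white = Fraction(white, total - 2)

p_sequence = p_first_red * p_second_blue * p_third_white

print(p_sequence)                    # 11/342
print(round(float(p_sequence), 3))   # 0.032
```
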

Question 4: Two dice are rolled, find the probability that the sum is:

  1. equal to 1
  2. equal to 4
  3. less than 13

Solution:

To find the probability that the sum is equal to 1 we have to first determine the sample space S of two dice as shown below.

S = { (1,1),(1,2),(1,3),(1,4),(1,5),(1,6)

(2,1),(2,2),(2,3),(2,4),(2,5),(2,6)

(3,1),(3,2),(3,3),(3,4),(3,5),(3,6)

(4,1),(4,2),(4,3),(4,4),(4,5),(4,6)

(5,1),(5,2),(5,3),(5,4),(5,5),(5,6)

(6,1),(6,2),(6,3),(6,4),(6,5),(6,6) }

So, n(S) = 36

1) Let E be the event “sum equal to 1”. Since there are no outcomes for which the sum is equal to 1,

P(E) = n(E) / n(S) = 0 / 36 = 0

2) Let A be the event of getting the sum of numbers on dice equal to 4.

Three possible outcomes give a sum equal to 4. They are:

A = {(1,3),(2,2),(3,1)}

n(A) = 3

Hence, P(A) = n(A) / n(S) = 3 / 36 = 1 / 12

3) Let B be the event that the sum of the numbers on the dice is less than 13.

From the sample space, we can see that every outcome gives a sum less than 13, for example:

(1,1) or (1,6) or (2,6) or (6,6).

The largest possible sum occurs when both dice show 6, i.e. (6,6), which gives 12.

Thus, n(B) = 36

Hence,

P(B) = n(B) / n(S) = 36 / 36 = 1
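
All three parts of this question can be checked by enumerating the 36 outcomes. The following Python sketch is one way to do it; the helper `probability` and the variable names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for two dice: all ordered pairs (a, b) with a, b in 1..6.
sample_space = list(product(range(1, 7), repeat=2))

def probability(event):
    """Favourable outcomes / total outcomes for a predicate on (a, b)."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

p_sum_1 = probability(lambda o: sum(o) == 1)
p_sum_4 = probability(lambda o: sum(o) == 4)
p_sum_lt_13 = probability(lambda o: sum(o) < 13)

print(p_sum_1, p_sum_4, p_sum_lt_13)  # 0 1/12 1
```

The same helper can be reused for other events on two dice by passing a different condition.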

Probability Problems

  1. Two dice are thrown together. Find the probability that the product of the numbers on the top of the dice is:
    (i) 6 (ii) 12 (iii) 7
  2. A bag contains 10 red, 5 blue and 7 green balls. A ball is drawn at random. Find the probability of this ball being a
    (i) red ball (ii) green ball (iii) not a blue ball
  3. All the jacks, queens and kings are removed from a deck of 52 playing cards. The remaining cards are well shuffled and then one card is drawn at random. Giving the ace a value of 1 and every other card its face value, find the probability that the card has a value
    (i) 7 (ii) greater than 7 (iii) less than 7
  4. A die has its six faces marked 0, 1, 1, 1, 6, 6. Two such dice are thrown together and the total score is recorded.
    (i) How many different scores are possible?
    (ii) What is the probability of getting a total of 7?

Experimental Probability

You and your 3 friends are playing a board game. It’s your turn to roll the die, and to win the game you need a 5. Is it certain that you will roll a 5? No, it is a matter of chance. We face many situations in real life where we have to take a chance or a risk. Based on certain conditions, the chance of occurrence of an event can be estimated. In day-to-day life we are familiar with the words ‘chance’ and ‘probability’. In simple words, probability is the study of the chance of occurrence of a particular event. In this article, we are going to discuss one of the types of probability, called “Experimental Probability”, in detail.

What is Probability?

Probability is a branch of mathematics that deals with the likelihood of occurrence of a given event. The probability of an event always lies between 0 and 1, so it cannot be negative. Basic rules such as the addition, multiplication and complement rules are associated with probability.

Experimental Probability Vs Theoretical Probability

There are two approaches to study probability:

  • Experimental Probability
  • Theoretical Probability

What is Experimental Probability?

Experimental probability, also known as empirical probability, is based on actual experiments and careful recording of the events that happen. To estimate how likely an event is, a series of actual experiments is conducted. Experiments that do not have a fixed result are known as random experiments; their outcome is uncertain. Random experiments are repeated multiple times to determine the likelihood of their outcomes. An experiment is repeated a fixed number of times, and each repetition is known as a trial. Mathematically, the formula for experimental probability is:

Probability of an Event P(E) = Number of times an event occurs / Total number of trials.

What is Theoretical Probability?

Theoretical probability finds the probability of an event without conducting any experiments. Instead, we use what we know about the situation to work out the probability of the event occurring. Mathematically, theoretical probability is the number of favourable outcomes divided by the number of possible outcomes, assuming all outcomes are equally likely.

Probability of an Event P(E) = Number of favourable outcomes / Number of possible outcomes.

Experimental Probability Example

Example: You asked your 3 friends Shakshi, Shreya and Ravi to toss a fair coin 15 times each in a row and the outcome of this experiment is given as below:

Calculate the probability of occurrence of heads and tails.

Solution: The experimental probability for the occurrence of heads and tails in this experiment can be calculated as:

Experimental Probability of Occurrence of heads = Number of times head occurs/Number of times coin is tossed.

Experimental Probability of Occurrence of tails = Number of times tails occurs/Number of times coin is tossed.

We observe that as the number of tosses of the coin increases, the experimental probability of heads, and likewise of tails, approaches 0.5.
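
This tendency can be explored with a small simulation. The sketch below is not the friends' actual experiment: it uses Python's pseudo-random generator as a stand-in for a fair coin, and the function name and the toss counts are assumptions chosen for illustration.

```python
import random

def experimental_probability_of_heads(num_tosses, rng):
    """Toss a simulated fair coin num_tosses times; return heads / tosses."""
    heads = sum(1 for _ in range(num_tosses) if rng.random() < 0.5)
    return heads / num_tosses

# A fixed seed makes the run repeatable; any seed could be used.
rng = random.Random(0)

for n in (15, 150, 1500, 15000):
    print(n, experimental_probability_of_heads(n, rng))
```

With only 15 tosses the result can be far from 0.5, while with many thousands of tosses it typically lies much closer.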

Presentation of data

After collecting the data for a certain group, we have to learn to present it. The presentation should be meaningful and easily understood by everyone, and its main features should be visible at a glance. The important details should be highlighted properly. To represent a large amount of data, we use a frequency distribution table, which condenses the data into sub-groups. Let us see an example.

Example: Suppose an exam was conducted for a class of 50 students. The marks obtained by the students, out of 100, are:

12, 23, 45, 55, 10, 33, 65, 78, 89, 22,

44, 55, 77, 88, 35, 65, 63, 61, 84, 89,

34, 27, 90, 65, 67, 45, 78, 98, 66, 77,

31, 41, 61, 68, 86, 34, 54, 59, 78, 89,

50, 29, 58, 63, 72, 87, 34, 65, 48, 91

Find how many students got more than 40 marks.

Solution: Let us arrange the data with respect to the marks obtained by the students.

Marks Obtained    Number of Students
0-20              2
21-40             10
41-60             11
61-80             17
81-100            10

Hence, from the grouped frequency distribution table above, the number of students who scored more than 40 marks = 11 + 17 + 10 = 38.
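
The grouping can also be done programmatically. The following Python sketch rebuilds the frequency table from the 50 marks listed in the example; the names `classes`, `frequency` and `above_40` are illustrative choices, not part of the original solution.

```python
# The 50 marks from the example, row by row.
marks = [
    12, 23, 45, 55, 10, 33, 65, 78, 89, 22,
    44, 55, 77, 88, 35, 65, 63, 61, 84, 89,
    34, 27, 90, 65, 67, 45, 78, 98, 66, 77,
    31, 41, 61, 68, 86, 34, 54, 59, 78, 89,
    50, 29, 58, 63, 72, 87, 34, 65, 48, 91,
]

# Class intervals with inclusive lower and upper limits.
classes = [(0, 20), (21, 40), (41, 60), (61, 80), (81, 100)]

# Count how many marks fall in each class.
frequency = {
    f"{low}-{high}": sum(low <= m <= high for m in marks)
    for low, high in classes
}

# Students who scored more than 40 marks.
above_40 = sum(m > 40 for m in marks)

for label, count in frequency.items():
    print(f"{label:<8}{count}")
print("More than 40:", above_40)  # 38
```
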

Presentation of Data and Information

Statistics is all about data. Presenting data effectively and efficiently is an art. You may have uncovered many truths that are complex and need long explanations while writing. This is where the importance of the presentation of data comes in. You have to present your findings in such a way that the readers can go through them quickly and understand each and every point that you wanted to showcase. As time progressed and new and complex research started happening, people realized the importance of the presentation of data to make sense of the findings.

Define Data Presentation

Data presentation is defined as the process of using various graphical formats to visually represent the relationship between two or more data sets so that an informed decision can be made based on them.

Types of Data Presentation

Broadly speaking, there are three methods of data presentation:

  • Textual
  • Tabular
  • Diagrammatic

Textual Ways of Presenting Data

Of the different methods of data presentation, this is the simplest one. You just write your findings in a coherent manner and your job is done. The demerit of this method is that one has to read the whole text to get a clear picture, although an introduction, summary, and conclusion can help condense the information.

Tabular Ways of Data Presentation and Analysis

To avoid the complexities of the textual way of data presentation, people use tables and charts. In this method, data is presented in rows and columns, just like the scorecard of a cricket match showing how many runs each player made. Each row and column has an attribute (name, year, sex, age, and so on), and the data is written in the cells against these attributes.

Graphical representation of data

Grouped data can also be represented using graphs. There are three ways in which we can represent the data in graphical form:

  1. Bar Graph
  2. Histogram
  3. Frequency Polygons

Bar Graph

A bar graph gives a pictorial representation of data using vertical or horizontal rectangular bars whose lengths are proportional to the values they represent.

[Figure: vertical bar graph of the number of employees against monthly salary savings. The same data could also be represented as a horizontal bar graph.]

Histogram

A histogram can be defined as a set of rectangles with bases along the intervals between class boundaries and with areas proportional to the frequencies in the corresponding classes.

[Figure: general representation of a histogram.]

Frequency Polygon

A frequency polygon is used to compare sets of data or to show a cumulative frequency distribution. It utilises a line graph to express quantitative data.

 

Diagrammatic Presentation: Graphical Presentation of Data in Statistics

This kind of data presentation and analysis conveys a lot in a dramatically short amount of time.

Diagrammatic presentation is divided into further categories:

Geometric Diagram

When a diagrammatic presentation involves shapes like a bar or a circle, we call it a Geometric Diagram. Examples of geometric diagrams are the Bar Diagram and the Pie Chart.

Bar Diagram

Simple Bar Diagram

Simple Bar Diagram is composed of rectangular bars. All of these bars have the same width and are placed at an equal distance from each other. The bars are placed on the X-axis. The height or length of the bars is used as the means of measurement. So, on the Y-axis, you have the measurement relevant to the data. 

Suppose you want to present the runs scored by each batsman in a game in the form of a bar chart. Mark the runs on the Y-axis, in ascending order from the bottom. The lowest scorer will then be represented by the shortest bar and the highest scorer by the longest bar.

Multiple Bar Diagram

In many states of India, electricity bills have bar diagrams showing the consumption in the last 5 months. Alongside these bars, they also have bars that show the consumption in the same months of the previous year. This kind of bar diagram is called a Multiple Bar Diagram.

Component Bar Diagram

Sometimes, a bar is divided into two or more parts. For example, suppose a Bar Diagram shows the percentage of male voters who voted and who didn’t, and of female voters who voted and who didn’t. Instead of creating separate bars for those who did and those who did not, you can divide one bar into the two parts.

Pie Chart

A pie chart is a chart where you divide a pie (a circle) into different parts based on the data. Each data value is first converted into a percentage of the total, and that percentage is then multiplied by 3.6 degrees, since 100% corresponds to the full 360 degrees. The result is the angle of the sector for that value. For example, if you get 30 degrees as the result, you draw a sector with that angle at the centre of the pie chart.
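
The percentage-times-3.6 rule can be sketched in Python as follows. The survey data in `values` is entirely hypothetical and is included only to have numbers to work with; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Hypothetical data: number of students preferring each sport.
values = {"cricket": 50, "football": 80, "tennis": 30, "hockey": 40}
total = sum(values.values())

angles = {}
for label, value in values.items():
    percentage = Fraction(value * 100, total)   # share of the total in %
    angles[label] = percentage * Fraction(36, 10)  # 1% corresponds to 3.6 degrees

for label, angle in angles.items():
    print(f"{label}: {float(angle)} degrees")

# The sector angles of a complete pie chart add up to 360 degrees.
print(sum(angles.values()))  # 360
```
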

Frequency Diagram

Suppose you want to present data that shows how many students have 1 to 2 pens, how many have 3 to 5 pens, and how many have 6 to 10 pens (grouped frequency). You do that with the help of a Frequency Diagram. A Frequency Diagram can be of many kinds:

Histogram

Here the class intervals (the groups of pens from the above example) are marked on the X-axis and the number of students is marked on the Y-axis. The data is presented in the form of bars.

Frequency Polygon

When you join the midpoints of the upper sides of the rectangles in a histogram, you get a Frequency Polygon.

Frequency Curve

When you draw a freehand line that passes through the points of the Frequency Polygon, you get a Frequency Curve.

Ogive 

Suppose 2 students got 0-20 marks in Maths, 5 students got 20-30 marks and 4 students got 30-50 marks. How many students got less than 30 marks? 2+5=7. And how many students got more than 20 marks? 5+4=9. This type of ‘less than’ and ‘more than’ data is represented in the form of an ogive. The meeting point of the less-than and more-than curves gives the median.
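
The cumulative counts that an ogive plots can be computed from the class frequencies. The sketch below uses the three classes from the example; the list names are illustrative, and "more than" is taken to mean "at or above the lower class limit".

```python
from itertools import accumulate

# Class intervals and frequencies from the example above.
classes = [(0, 20), (20, 30), (30, 50)]
frequencies = [2, 5, 4]

# "Less than" cumulative frequencies, paired with upper class limits.
upper_limits = [high for _, high in classes]
less_than = list(zip(upper_limits, accumulate(frequencies)))

# "More than" cumulative frequencies, paired with lower class limits.
more_than = []
remaining = sum(frequencies)
for (low, _), freq in zip(classes, frequencies):
    more_than.append((low, remaining))
    remaining -= freq

print(less_than)  # [(20, 2), (30, 7), (50, 11)]
print(more_than)  # [(0, 11), (20, 9), (30, 4)]
```

Plotting each list of points and joining them gives the less-than and more-than ogives respectively.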

Arithmetic Line Graph

If you want to see the trend of COVID-19 infections against the number of recoveries from January 2020 to December 2020, you can do that with an Arithmetic Line Graph. The months are marked on the X-axis and the numbers of infections and recoveries on the Y-axis. With this diagram you can see whether recoveries exceed infections and whether the two are growing at the same rate.

Measures of Central Tendency

There are three main measures of central tendency:

  • Mean
  • Median
  • Mode

Mean: Mean is the average of the given set of data.

x̄=∑ x/n

Where n is the number of observations

Median: The median is the value that divides the given observations into two equal parts. First, the data set has to be arranged in order, either ascending or descending. There are then two cases:

  • If the number of observations is odd, then;

Median = [(n+1)/2]th observation or term

  • If the number of observations is even, then the median is the mean of the (n/2)th term and the (n/2+1)th term.

Mode: The mode is the most frequently occurring value in the data set.

Example: Find the mean, median and mode of the following data set.

2,3,6,7,4,5,3,8,3,9

Solution: Mean is the average of the given data;

x̄ = (2+3+6+7+4+5+3+8+3+9)/10 = 50/10 = 5

Now, to find the median, we need to arrange the data in ascending order.

2,3,3,3,4,5,6,7,8,9

Since the number of observations here is even, the median is the mean of the two middle terms.

Median = (4+5)/2 = 9/2 = 4.5

Mode = 3, since 3 occurs the maximum number of times.
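
The three results can be cross-checked with Python's standard `statistics` module. This is a minimal sketch using the data set from the example; the variable names are illustrative.

```python
import statistics

data = [2, 3, 6, 7, 4, 5, 3, 8, 3, 9]

mean = statistics.mean(data)      # sum of the values / number of values
median = statistics.median(data)  # mean of the two middle terms (even n)
mode = statistics.mode(data)      # most frequently occurring value

print(sorted(data))
print(mean, median, mode)
```
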

Practice Questions

Q.1: Give one example of a condition in which:

(i) the mean is a proper measure of central tendency.

(ii) the mean is not a proper measure of central tendency but the median is a proper measure of central tendency.

Q.2: Find the mean, median and mode of 14, 25, 14, 28, 18, 17, 18, 14, 23, 22, 14, 18.

Q.3: The relative humidity (in %) of a certain city for a month of 30 days was as follows:

98.1 98.6 99.2 90.3 86.5 95.3 92.9 96.3 94.2 95.1 89.2 92.3 97.1 93.5 92.7 95.1 97.2 93.3 95.2 97.3 96.2 92.1 84.9 90.2 95.7 98.3 97.3 96.1 92.1 89

(i) Write a grouped frequency distribution table with classes 90 – 95, 80 – 85, etc.

(ii) Which month or season do you think this data is about?

(iii) What is the range of this data?

(iv) Represent the data set using a bar graph and a histogram.