# Question: Foundations of Biostatistics and Epidemiology, Question 1: variable pain level measured with response options...

###### Question details

Foundations of Biostatistics and Epidemiology

**QUESTION 1**

- Variable **Pain Level**, measured with the response options none, mild, moderate and severe, can be suitably described as:

a) Mean and standard deviation and a histogram.

b) Percentage distribution in each category and a box plot.

c) Percentage distribution in each category and a bar chart or a pie chart.

d) Median and Mode and a histogram.

**QUESTION 2**

1. **Standard error** tells us how good our estimate is of the unknown population value for whatever we are measuring. For example, the **standard error of skewness** tells us how much we can expect the skewness statistic to vary if we took several samples from the population. Similarly, the **standard error of kurtosis** gives us a rough idea of how much we can expect kurtosis to vary across multiple samples. The same goes for the **standard error of a mean**, i.e. how much the means would vary if we took several samples from the same population. This question is giving away one mark because it is important to understand what 'standard error' means.

a) True

b) False
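As a rough illustration of the standard error of a mean described in Question 2, the sketch below applies the usual formula SE = s / sqrt(n) to a small dataset. The `scores` values are made up for illustration and do not come from the course material.

```python
import math
import statistics

# Hypothetical sample of 8 measurements (illustrative values only).
scores = [4, 8, 6, 5, 3, 7, 9, 6]

# Sample standard deviation (n - 1 in the denominator).
sample_sd = statistics.stdev(scores)

# Standard error of the mean: how much sample means are expected
# to vary across repeated samples of this size.
se_mean = sample_sd / math.sqrt(len(scores))

print(round(sample_sd, 4), round(se_mean, 4))
```

A larger sample with the same spread would give a smaller standard error, which is why bigger samples yield more precise estimates of the population mean.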

**QUESTION 3**

- All of the following statements about frequency distribution are correct except ONE. Please identify the incorrect statement.

a) In a small dataset, simply by eyeballing the frequency distribution table we can identify outliers or extreme values.

b) Frequency distribution can be displayed as a table of data values.

c) Frequency distribution can be displayed as a graph.

d) In a small dataset, simply by eyeballing the frequency distribution table we can identify data entry errors, which may appear as out-of-range values.

e) Frequency distribution can only be used for categorical data.

f) Frequency distribution is a list of how often each value or set of values occurs in the data.

**QUESTION 4**

- In a population of 17,318 at the beginning of a two-year follow-up period, there were 422 people suffering from type 2 diabetes, and 322 new cases were diagnosed during the follow-up. What is the cumulative incidence of type 2 diabetes in this population at the end of the follow-up? (No one was missing and no one died.) Please express your answer as a **PERCENTAGE** after correct rounding to **two decimals.** The percentage symbol is not needed. Use **brackets ( )** on your calculator to do your calculation.

=
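One way to set up the Question 4 calculation is sketched below. It assumes the usual textbook definition of cumulative incidence (new cases divided by the people at risk at the start, so the 422 existing cases are excluded from the denominator). The function name is hypothetical.

```python
def cumulative_incidence_pct(new_cases: int, population: int, existing_cases: int) -> float:
    """Cumulative incidence as a percentage of the at-risk population."""
    # People who already have the disease cannot become new cases,
    # so they are removed from the denominator (assumed definition).
    at_risk = population - existing_cases
    return new_cases / at_risk * 100

ci = round(cumulative_incidence_pct(322, 17_318, 422), 2)
print(ci)  # 322 / (17318 - 422) * 100 -> 1.91
```

The parentheses around `17318 - 422` are what the calculator hint about brackets refers to: without them the subtraction is applied after the division.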

**QUESTION 5**

- In the first week we talked about how evidence is usually generated and roughly outlined the steps that are carried out to conduct a quantitative study and answer the research question. Which steps should happen BEFORE collecting data? Multiple answers are required.

a) Carry out the data analyses with suitable statistical tests.

b) Write a specific research question.

c) Search through the existing literature to see if your research question has already been answered.

d) Answer the research question and share the findings.

e) Deal with the missing data and data entry errors.

f) Draw objective and valid conclusions.

g) Decide how the information on exposure/independent variable and the outcome/dependent variable and other determinants or influencing factors will be collected or measured.

h) Choose how the study will be carried out (specify the overall plan or design of the study).

**QUESTION 6**

- What type of variable is 'Temperature' measured in degrees Celsius (or Fahrenheit if you are in the USA)?

a) Ordinal

b) Ratio

c) Interval

d) Nominal

**QUESTION 7**

- Only one statement below is the correct interpretation of **'Sampling error'.** Please identify this statement.

a) It is an error when a non-probability sample, e.g. a purposive sample, has been used instead of a probability sample, e.g. a systematic random sample.

b) It is an error in the results when the study participants are not representative of the population.

c) It is the difference between the estimated value (based on a sample) and the actual true unknown value for the population. This difference (or error) is due to the fact that the estimate is based on a sample and information has not been collected from the entire population.

d) It is an error because our sample was too small.

e) It is an error when a non-probability sample, e.g. a snowball sample, has been used instead of a probability sample such as a simple random sample.

f) It is an error when a non-probability sample, e.g. a quota sample, has been used instead of a probability sample, e.g. a stratified random sample.

**QUESTION 8**

- Student identification number (or ID) is measured on which one of the following measurement scales?

a) Ratio

b) Nominal

c) Interval

d) Ordinal

**QUESTION 9**

- Which one of the following statements is INCORRECT about Box Plots:

a) In SPSS output for a box plot, outliers are marked with asterisks and extreme outliers are marked with circles.

b) In SPSS output for a box plot, outliers are marked with circles and extreme scores are marked with asterisks.

c) The box represents 50% of the data.

d) The lower and upper borders of the box are the 25th and 75th percentiles respectively, and the difference between the two is the interquartile range (IQR).

e) The line in the middle of the box is the median.
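The quantities a box plot is built from (25th percentile, median, 75th percentile, IQR) can be sketched with Python's standard library as follows. The `values` list is hypothetical, and the `"inclusive"` quantile method is only one of several conventions, so other software such as SPSS may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical small dataset (illustrative values only).
values = [2, 4, 4, 5, 7, 8, 9, 11, 12]

# Cut points that split the sorted data into four equal parts.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Interquartile range: height of the box, covering the middle 50% of the data.
iqr = q3 - q1

print(q1, median, q3, iqr)
```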

**QUESTION 10**

- Please identify the **two incorrect statements** from the following; **two answers are required.**

a) Epidemiology helps us to understand how evidence is produced and Biostatistics helps us to make sense of the information or data that is collected during the process of producing evidence. Together these help us to find out whether certain treatments and interventions work in the real world or not.

b) A characteristic or attribute of the subjects in the population which we may be interested in and has more than one value is called a statistic.

c) As per the Orientation video, lack of clarity on the four measurement scales is the most common reason for losing marks in the assessments.

d) If Ronald Drump is at the 80th percentile for extroversion (i.e. outgoing, social etc.), this means 80% of people are less outgoing and 20% more outgoing than him in his population.

e) The total group of units or subjects of interest is known as the 'population', whereas a 'sample' is a subset of this.

f) Average spread of scores around the mean is called interquartile range.

**QUESTION 11**

- All of the statements listed below are correct except ONE. Please identify the INCORRECT statement.

a) A kurtosis statistic of 0 (along with a skewness statistic value of 0) indicates a 'perfect' normal distribution.

b) The 75th percentile is the value at or below which 75% of the observations are found.

c) The probability value (p value) tells us how likely it is that our observed findings are purely due to random chance factors if there really was no relationship between an exposure and the observed outcome.

d) Terms such as Validity and Reliability mean consistency and accuracy, respectively.

e) The skewness statistic is a measure of how symmetric (or skewed) a distribution is, i.e. whether both sides of the curve are identical or if one tail happens to be longer than the other due to outliers. The kurtosis statistic tells us about the 'overall shape' of the curve/distribution, i.e. whether it is peaked or flat.

**QUESTION 12**

- Incidence and prevalence are the two most common measures we use to collect information about health, disease or other health-related states; one refers only to new cases, while the other is calculated from both old and new cases. In a population of 23,925 at the beginning of a two-year follow-up period, there were 384 people suffering from type 2 diabetes, and 283 new cases were diagnosed during the follow-up. What is the prevalence of type 2 diabetes in this population at the end of the follow-up? (No one was missing and no one died.) Please express your answer as a **PERCENTAGE** after correct rounding to **two decimals.** The percentage symbol is not needed. Use **brackets ( )** on your calculator to do your calculation.

=
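A possible setup for the Question 12 calculation is sketched below. It assumes prevalence at the end of follow-up counts both existing and new cases over the whole population, which stays at 23,925 because no one was lost or died. The function name is hypothetical.

```python
def end_of_period_prevalence_pct(existing_cases: int, new_cases: int, population: int) -> float:
    """Prevalence (%) at the end of follow-up: all cases / whole population."""
    # Old and new cases both count, and the denominator is everyone,
    # not only those at risk (assumed definition).
    total_cases = existing_cases + new_cases
    return total_cases / population * 100

prev = round(end_of_period_prevalence_pct(384, 283, 23_925), 2)
print(prev)  # (384 + 283) / 23925 * 100 -> 2.79
```

Compared with cumulative incidence, the numerator here includes the cases that existed at baseline and the denominator is not reduced by them.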

**QUESTION 13**

- Which one of the following graphs will not be suitable for continuous data?

a) Frequency Polygon

b) Histogram

c) Box plot

d) Bar chart

e) Stem and Leaf plot

**QUESTION 14**

- What are the two INCORRECT statements below? **(Please choose TWO answers.)**

a) Lack of validity is also known as bias.

b) Y axis of any histogram should start from zero.

c) Incidence rate is calculated based on number of at-risk people at the start of a follow up whereas cumulative incidence is calculated by using the collective at-risk time (person-time).

d) One of three pillars of Evidence Based Healthcare is 'Best Available evidence' and evidence can be assessed only if one has sufficient competency in statistical and epidemiological methods and tools.

e) We can describe variable 'Blood Pressure' (measured as mmHg) of a group of volunteers using mean and standard deviation as summary/descriptive statistics.

f) When the effects of an exposure on the outcome are mixed with the effects from other sources, this is known as random error.

g) When writing a 'research question' |

**QUESTION 15**

- In a cross-sectional study there were 581 cases of hypertension among 5,879 vegetarians and 297 cases among 3,881 non-vegetarians. Calculate the prevalence of hypertension among non-vegetarians. Please express your answer as a **PERCENTAGE** after correct rounding to **two decimals.** The percentage symbol is not needed. Use **brackets ( )** on your calculator to do your calculation.
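The Question 15 arithmetic can be sketched as below, assuming prevalence in a cross-sectional group is simply cases divided by the number of people in that group. The function name is hypothetical.

```python
def group_prevalence_pct(cases: int, group_size: int) -> float:
    """Prevalence (%) within one group of a cross-sectional study."""
    return cases / group_size * 100

# Only the non-vegetarian figures are needed for this question.
non_veg = round(group_prevalence_pct(297, 3_881), 2)
print(non_veg)  # 297 / 3881 * 100 -> 7.65
```

The vegetarian counts (581 of 5,879) are not used here; they would only matter if the two groups were being compared.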

**WITH SOLUTION PLEASE**

**NEED IT ASAP**

**THANKS.**