CHAPTER 5 & 6: DISCRETE AND CONTINUOUS PROBABILITY DISTRIBUTIONS

New Notation:

X: Random variable (Turkish: rassal / rastgele değişken)
– This is the variable whose value is determined by the outcome of the random experiment
– "X = x" denotes the set of elements of the sample space for which X takes the value x

Ex: Suppose we roll two dice at the same time. If we are interested in the sum of the dice, that sum is a random variable, denoted by X. X = 9 defines the event that the dice sum to 9, and it contains the outcomes {(3,6), (4,5), (5,4), (6,3)}
– In this example X can take the values 2 through 12, so X is a discrete random variable. Plotting the probability of each distinct value x of X gives a distribution like the left panel of the figure. There are also continuous random variables (such as height or weight), whose probability distribution looks like the right panel

[Figure: probability distribution of a discrete random variable (left) and of a continuous random variable (right)]

Random Variables

There are two types of random variables:
– Discrete Random Variable (Ch. 5)
– Continuous Random Variable (Ch. 6)

Ozan Eksi, TOBB-ETU

Discrete Random Variables

Probability Distribution of X: $P(X = x) = P(x)$
– $P(x)$ is commonly denoted by $f(x)$ as well
– $P(x) \ge 0$ and $\sum_x P(x) = 1$

Example: Toss a coin twice and let X = number of heads
– Find $P(X = x)$ for all values of x
– There are 4 possible outcomes: TT, TH, HT, HH

x Value    Probability
0          1/4 = .25
1          2/4 = .50
2          1/4 = .25

[Figure: Probability Distribution, bar chart of Probability against x]

Continuous Random Variables

The Probability Density Function of X is $f(x)$, and $P(a \le X \le b) = \int_a^b f(x)\,dx$
– $f(x) \ge 0$ and $\int_{-\infty}^{\infty} f(x)\,dx = 1$
– $P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$

The shaded area under the curve is the probability that X is between a and b (note that the probability of any individual value is zero)

[Figure: density curve f(x) against x, with the area between a and b shaded]

Cumulative Distribution of X

Discrete Random Variables
– $F(x_0) = P(X \le x_0) = \sum_{x \le x_0} P(x)$
– $F(-\infty) = 0$ and $F(\infty) = 1$
– If a < b, then $F(a) \le F(b)$ for any real numbers a and b

Continuous
Random Variables
– $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$ for $-\infty < x < \infty$
– $P(a \le X \le b) = F(b) - F(a)$ for any real constants a and b with a < b, and $f(x) = \frac{dF(x)}{dx}$

Some Special Distributions of Interest

Discrete Probability Distributions (Chapter 5)
– Discrete Uniform
– Bernoulli
– Binomial
– Hypergeometric
– Poisson

Continuous Probability Distributions (Chapter 6)
– Uniform
– (Standard) Normal
– Exponential
– Chi-Square
– t-Dist.
– F-Dist.

Before talking about these distributions, we first need to have a look at Mathematical Expectation

Mathematical Expectation (Matematiksel Beklenti)

Mathematical expectation is part of the weighted-average topic we covered earlier; the only difference is that the weight of each outcome is now the probability that it occurs

If a lottery with a 500TL prize has 100 tickets and we hold 1 of them, our mathematical expectation from that ticket is 500/100 = 5TL
– It is computed as $0(0.99) + 500(0.01) = 5$; that is, you win 0TL with 99% probability and 500TL with 1% probability

Note: A fair game is one in which the players' expected winnings are 0 (so if the ticket price is above 5TL, we lose in expectation by playing)

If we will sell 5000 units with 10% probability, 1000 units with 50% probability, and 300 units with 40% probability, the number of units we expect to sell is
– $(0.1)5000 + (0.5)1000 + (0.4)300 = 1120$

The expectation of a discrete random variable X with probability distribution
f(x) is
$E(X) = \mu_X = \sum_x x P(x)$

For a continuous random variable with probability density function f(x),
$E(X) = \mu_X = \int_{-\infty}^{\infty} x f(x)\,dx$

Remember that the probability density of a continuous random variable satisfies $\int_{-\infty}^{\infty} f(x)\,dx = 1$, so the expectation is a weighted average of all possible outcomes

If we are interested in the expected value of a function g(X) of a continuous random variable X, the formula is
$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx$

Example: If X is the outcome of a die roll, what is the expected value of $g(X) = 2X^2 + 1$? X is discrete here, so the integral becomes a sum
– $E[g(X)] = \sum_{x=1}^{6} (2x^2 + 1)\frac{1}{6} = (2 \cdot 1^2 + 1)\frac{1}{6} + \dots + (2 \cdot 6^2 + 1)\frac{1}{6} = \frac{94}{3}$

If a and b are constants, $E(aX + b) = aE(X) + b$

Moments

The mean of a distribution is denoted by $\mu$

In the case of a discrete random variable, the rth moment about the mean is (for r = 0, 1, 2, ...)
$\mu_r = E[(X - \mu)^r] = \sum_x (x - \mu)^r f(x)$

$\mu_2$ is called the variance of the distribution and is denoted by $\sigma^2$ or var(X), where $\sigma$ is the standard deviation
$\sigma^2 = E[(X - \mu)^2]$
which can further be written as
$E[(X - \mu)^2] = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2 = E(X^2) - E(X)^2$

If Y = a + bX, where a and b are constants, the variance of Y is
$\sigma_Y^2 = Var(a + bX) = b^2 \sigma_X^2$
so that the standard deviation of Y is $\sigma_Y = |b| \sigma_X$

Multivariate Distributions

For two random variables X and Y defined on the same probability space, the joint distribution of X and Y gives the probability of events defined in terms of both X and Y. With only two random variables this is called a bivariate distribution, but the concept generalizes to any number of random variables, giving a multivariate distribution

If X and Y are discrete random variables
– Joint Probability Distribution of X and Y: $f(x, y) = P(X = x \cap Y = y)$

Product Moments

The rth and sth product moment of the random variables X and Y about their means (for r, s = 0, 1, 2, ...)
is
$\mu_{r,s} = E[(X - \mu_X)^r (Y - \mu_Y)^s] = \sum_x \sum_y (x - \mu_X)^r (y - \mu_Y)^s f(x, y)$

$\mu_{1,1}$ is called the covariance of X and Y, and it is denoted by $\sigma_{X,Y}$ or Cov(X, Y)
$\sigma_{X,Y} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - E(X)\mu_Y - \mu_X E(Y) + \mu_X \mu_Y = E(XY) - \mu_X \mu_Y$

If X and Y are independent, then $E(XY) = E(X)E(Y)$ and $\sigma_{X,Y} = 0$

Moments of Linear Combination of Random Variables

If X and Y are random variables, then
$E(X + Y) = \mu_X + \mu_Y$
$E(X - Y) = \mu_X - \mu_Y$
$var(X + Y) = \sigma_X^2 + \sigma_Y^2 + 2Cov(X, Y)$
$var(X - Y) = \sigma_X^2 + \sigma_Y^2 - 2Cov(X, Y)$

In the more general case, if $X_1, X_2, \dots, X_n$ are random variables, $a_1, a_2, \dots, a_n$ are constants, and $Y = \sum_{i=1}^{n} a_i X_i$, then
$E(Y) = \sum_{i=1}^{n} a_i E(X_i)$
$var(Y) = \sum_{i=1}^{n} a_i^2 var(X_i) + 2 \sum\sum_{i<j} a_i a_j cov(X_i, X_j)$
– If $X_1, X_2, \dots, X_n$ are independent, the second term on the right-hand side (the covariance sum) drops out

Marginal and Conditional Distributions

Example: joint probabilities f(x, y), with x in rows and y in columns

          y = 0    y = 1    y = 2
x = 0     1/6      2/9      1/36
x = 1     1/3      1/6
x = 2     1/12

Note that: $\sum_x \sum_y f(x, y) = 1$

If X and Y are discrete random variables
– Marginal Dist. of X: $g(x) = \sum_y f(x, y)$, for example $g(0) = \frac{1}{6} + \frac{2}{9} + \frac{1}{36} = \frac{5}{12}$
– Conditional Distribution of X given Y: $f(x \mid y) = \frac{f(x, y)}{h(y)}$, where h(y) is the marginal distribution of Y. If A and B are the events X = x and Y = y, this is $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, for example $f(0 \mid 1) = \frac{2/9}{2/9 + 1/6} = \frac{4}{7}$
– The rest of the definitions are analogous

Conditional Expectations

Given Y = y, the conditional expectation of a function g(X) of a continuous random variable X is
$E[g(X) \mid y] = \int_{-\infty}^{\infty} g(x) f(x \mid y)\,dx$

The conditional mean is: $\mu_{X \mid y} = E(X \mid y)$
The conditional variance is: $\sigma_{X \mid y}^2 = E(X^2 \mid y) - \mu_{X \mid y}^2$

Portfolio Analysis (Example: Investment Returns)

Suppose the returns on two different investments of $1,000 under different economic conditions are as follows

P(x_i, y_i)   Economic condition    X (Passive fund)   Y (Active fund)
.2            Recession             -$25               -$200
.5            Stable economy        +$50               +$60
.3            Expanding economy     +$100              +$350

$E(X) = \mu_X = (-25)(.2) + (50)(.5) + (100)(.3) = 50$
$E(Y) = \mu_Y = (-200)(.2) + (60)(.5) + (350)(.3) = 95$
$\sigma_X = \sqrt{(-25 - 50)^2(.2) + (50 - 50)^2(.5) + (100 - 50)^2(.3)} = 43.30$
$\sigma_Y = \sqrt{(-200 - 95)^2(.2) + (60 - 95)^2(.5) + (350 - 95)^2(.3)} = 193.71$
$Cov(X, Y) = (-25 - 50)(-200 - 95)(.2) + (50 - 50)(60 - 95)(.5) + (100 - 50)(350 - 95)(.3) = 8250$

The positive covariance tells us that there is a positive relationship between the returns of these two investments; that is, they generally move in the same direction

If your portfolio (P) consists of 40% fund X and 60% fund Y:
$E(P) = .4(50) + .6(95) = 77$
$var(P) = \sigma_P^2 = var(0.4X + 0.6Y) = 0.4^2 \sigma_X^2 + 0.6^2 \sigma_Y^2 + 2(0.4)(0.6) Cov(X, Y)$
$\sigma_P = \sqrt{(.4)^2(43.30)^2 + (.6)^2(193.71)^2 + 2(.4)(.6)(8250)} = 133.30$

Note that the expected return and the standard deviation of portfolio P lie between those of the two individual investments X and Y

Although the active fund yields a higher return on average, it is also riskier: $\mu_Y = 95 > \mu_X = 50$, but $\sigma_Y = 193.71 > \sigma_X = 43.30$

What is the return of this portfolio in a stable economy?
$\mu_{P \mid stable} = E(P \mid stable) = .4(50) + .6(60) = 56$

Probability Distributions for Discrete Random Variables

Discrete Uniform Distribution

The outcome can take k different values with equal probability (like the roll of a die)
$f(x) = \frac{1}{k}$
$E(X) = \mu = \sum_{i=1}^{k} x_i \frac{1}{k}$

The Bernoulli Distribution

Success-or-failure experiments (tossing a coin; drawing a ball from a jar containing M black and N white balls; drawing a part from a box containing defective and non-defective parts)

If the probability of success is $\theta$ (which means that of failure is $1 - \theta$), then the random variable X has a Bernoulli distribution if and only if
$f(x; \theta) = \theta^x (1 - \theta)^{1 - x}$ for x = 0, 1
– A single such experiment is also called a Bernoulli trial; its two outcomes are mutually exclusive (one's gain is the other's loss)
– Sequences of the same experiment are called repeated trials

The mean is
$\mu = E(X) = \sum_x x P(x) = 0(1 - \theta) + 1(\theta) = \theta$

The variance is
$\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 P(x) = (0 - \theta)^2 (1 - \theta) + (1 - \theta)^2 (\theta) = \theta(1 - \theta)$

Ex: A race car driver's probability of winning a race is 0.7, and the probability of not winning is 0.3.
Write the probability function for this driver and find E(X) and Var(X)
– The random variable X is a Bernoulli variable that takes the value 1 when the driver wins the race and 0 when the driver does not. Its probability function is
$P(x) = \begin{cases} 0.7, & x = 1 \\ 0.3, & x = 0 \\ 0, & \text{otherwise} \end{cases}$
– Since the probability of winning here is $\theta = 0.7$:
$E(X) = \theta = 0.7$
$Var(X) = \theta(1 - \theta) = 0.7 \cdot 0.3 = 0.21$
– The long way:
$E(X) = \sum_x x P(x) = 0(0.3) + 1(0.7) = 0.7$
$Var(X) = E(X^2) - [E(X)]^2$
$E(X^2) = \sum_x x^2 P(x) = 0^2(0.3) + 1^2(0.7) = 0.7$
$\Rightarrow Var(X) = 0.7 - 0.7^2 = 0.21$

The Binomial Distribution

The formula for "x successes in n trials" (which gives the Binomial Distribution) is
$P(x; n, \theta) = \binom{n}{x} \theta^x (1 - \theta)^{n - x}$ for x = 0, 1, 2, ..., n
– Notice that this is built from Bernoulli trials in which the order of the successes does not matter; the combination $\binom{n}{x}$ counts the number of sequences with x successes in n independent trials
– Ex: If the probability of recovering from a tropical disease is 80% for one person, let us compute the probability that 7 of 10 people who catch the disease recover
$P(7; 10, 0.8) = \binom{10}{7} 0.8^7 (1 - 0.8)^{10 - 7} = 0.2013$
– There are tables that give the value of P for different values of n, x, and $\theta$

Notice that the repeated trials are independent of one another (unlike sampling without replacement)

When n = 1, it is the Bernoulli distribution

The mean and variance of the binomial distribution are
$\mu = n\theta$ and $\sigma^2 = n\theta(1 - \theta)$

If X has a binomial distribution with parameters n and $\theta$, and $Y = \frac{X}{n}$, then
$E(Y) = \theta$ and $\sigma_Y^2 = \frac{\theta(1 - \theta)}{n}$

Note (Optional): Chebyshev's Theorem, $P(|X - \mu| < k\sigma) \ge 1 - \frac{1}{k^2}$, applied to Y with $k\sigma_Y = c$, implies that for any positive constant c the probability is at least $1 - \frac{\theta(1 - \theta)}{nc^2}$ that the proportion of successes in n trials falls between $\theta - c$ and $\theta + c$
– Hence, when $n \to \infty$, the probability approaches 1 that the proportion of successes will differ from $\theta$ by less than any arbitrary constant c. This result is called a law of large numbers.
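The binomial formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function names and the parameter name `theta` (the success probability) are assumptions made for illustration. It evaluates the recovery example, recomputes the mean and variance directly from the probabilities, and evaluates the Chebyshev lower bound for increasing n, using an arbitrarily chosen c = 0.05.

```python
from math import comb


def binomial_pmf(x, n, theta):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def binomial_mean_var(n, theta):
    """Mean and variance computed directly from the probabilities."""
    probs = [binomial_pmf(x, n, theta) for x in range(n + 1)]
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, var


def chebyshev_lower_bound(n, theta, c):
    """Chebyshev lower bound on P(|X/n - theta| < c)."""
    return 1 - theta * (1 - theta) / (n * c**2)


# Recovery example: 7 recoveries among 10 patients, theta = 0.8
p7 = binomial_pmf(7, 10, 0.8)

# These should agree with n*theta and n*theta*(1 - theta)
mean, var = binomial_mean_var(10, 0.8)

# The bound moves toward 1 as n grows (c = 0.05 is an illustrative choice)
bounds = [chebyshev_lower_bound(n, 0.8, 0.05) for n in (100, 1000, 10000)]

print(round(p7, 4), round(mean, 4), round(var, 4), bounds)
```

Summing over all x values is only practical for small n; it is used here because it mirrors the definitions of the mean and variance rather than the closed-form results.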
Ex: What is the probability that an experiment with success probability 0.1 yields exactly one success when it is repeated 5 times?
– That is, x = 1, n = 5, and $\theta$ = 0.1
$P(1; 5, 0.1) = \frac{5!}{1!(5 - 1)!} 0.1^1 (1 - 0.1)^{5 - 1} = 0.32805$
– Now let us plot the binomial distribution over all possible values of x, separately for $\theta$ = 0.1 and $\theta$ = 0.5

[Figure: two bar charts of P(x) against x = 0, ..., 5; panel titles "n = 5 P = 0.1" and "n = 5 P = 0.5"]

– The means and standard deviations of these distributions can be calculated as follows
$\theta = 0.1 \Rightarrow \mu = n\theta = 5(0.1) = 0.5$ and $\sigma = \sqrt{n\theta(1 - \theta)} = \sqrt{5(0.1)(1 - 0.1)} = 0.67$
$\theta = 0.5 \Rightarrow \mu = n\theta = 5(0.5) = 2.5$ and $\sigma = \sqrt{n\theta(1 - \theta)} = \sqrt{5(0.5)(1 - 0.5)} = 1.12$

The Negative Binomial, Geometric and Poisson Distributions

If you are interested in the probability that the kth success occurs on the xth trial, you can calculate the probability of k - 1 successes in the first x - 1 trials and multiply it by the probability of a success on the next trial. The resulting probability distribution is the Negative Binomial
$P(x; k, \theta) = \binom{x - 1}{k - 1} \theta^k (1 - \theta)^{x - k}$ for x = k, k+1, k+2, ...

Ex: A die is rolled. What is the probability that a 4 comes up for the 2nd time on the 6th roll?
– With x = 6, k = 2 and $\theta$ = 1/6,
$P(\text{second 4 on the 6th roll}) = \binom{5}{1} \left(\frac{1}{6}\right)^2 \left(\frac{5}{6}\right)^4 \approx 0.067$

Geometric Distribution: It is the Negative Binomial distribution with k = 1
$g(x; \theta) = \theta (1 - \theta)^{x - 1}$ for x = 1, 2, 3, ...

Ex: A shooter hits the target with probability 3/4 on each shot. Let X be the number of consecutive shots needed to hit the target for the first time
– a. What is the probability that the first hit comes on the third shot?
$P(X = 3) = P(3) = \frac{3}{4}\left(\frac{1}{4}\right)^2 = \frac{3}{64}$
– b. What is the probability that the first hit comes on the fourth shot at the latest?
$P(X \le 4) = P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) = \frac{3}{4}\left[\left(\frac{1}{4}\right)^0 + \left(\frac{1}{4}\right)^1 + \left(\frac{1}{4}\right)^2 + \left(\frac{1}{4}\right)^3\right] = \frac{255}{256} \approx 0.9961$
– c. On average, how many shots must the shooter take until the first hit?
$E(X) = \frac{1}{\theta} = \frac{1}{3/4} = \frac{4}{3}$

When n is large and $\theta$ is small, it is hard to calculate Binomial probabilities.
Poisson distribution is used as an approximation to the Binomial distribution under these circumstances (n > 20, $\theta$ < 0.05). It uses $\lambda = n\theta$ (this gives the average (expected) number of events per unit)
$p(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}$ for x = 0, 1, 2, ...
where x is the number of successes per unit and e is the base of the natural logarithm (2.71828...)
– The mean and variance of the Poisson distribution are
$\mu = E(x) = \lambda$ and $\sigma^2 = E[(x - \mu)^2] = \lambda$

Example: Suppose that, on average, 1 in 1000 people dies each year because of smoking. Let us find some probabilities for a group of 2000 smokers under observation. Since n = 2000 and $\theta$ = 0.001, $\lambda = n\theta = 2$
– a. Nobody dies: $p(X = 0) = p(0; 2) = \frac{2^0 e^{-2}}{0!} = 0.135$
– b. 3 people die: $p(X = 3) = p(3; 2) = \frac{2^3 e^{-2}}{3!} = 0.18$
– c. More than 2 people die: $p(X > 2) = 1 - p(X \le 2) = 1 - \left[\frac{2^0 e^{-2}}{0!} + \frac{2^1 e^{-2}}{1!} + \frac{2^2 e^{-2}}{2!}\right] = 0.32$

The Hypergeometric Distribution

Concerned with finding the probability of "x" successes in a sample of "n" trials taken without replacement from a finite population of size N that contains "S" successes
– Outcomes of the trials are dependent
$P(x) = \frac{C_x^S \, C_{n - x}^{N - S}}{C_n^N}$

Ex: In a department where 4 of the 10 computers have illegal software installed, 3 computers are checked. What is the probability that 2 of these 3 computers have illegal software?
– That is, N = 10, S = 4, n = 3, x = 2
$P(x = 2) = \frac{C_2^4 \, C_1^6}{C_3^{10}} = \frac{(6)(6)}{120} = 0.3$
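The Poisson and hypergeometric examples above can be reproduced with a few lines of Python. This is an illustrative sketch only: the function names and the argument names `lam` and `theta` are assumptions, not notation from the slides. It also compares the Poisson value for "nobody dies" with the exact binomial probability, to show how close the approximation is when n is large and the success probability is small.

```python
from math import comb, exp, factorial


def poisson_pmf(x, lam):
    """Poisson probability of x events when the expected count is lam."""
    return lam**x * exp(-lam) / factorial(x)


def binomial_pmf(x, n, theta):
    """Exact binomial probability, used here only for comparison."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def hypergeom_pmf(x, N, S, n):
    """Probability of x successes in n draws without replacement
    from a population of N items that contains S successes."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)


# Smoking example: n = 2000 people, theta = 0.001, so lam = n * theta = 2
lam = 2000 * 0.001
p_none = poisson_pmf(0, lam)
p_three = poisson_pmf(3, lam)
p_more_than_two = 1 - sum(poisson_pmf(x, lam) for x in range(3))

# Exact binomial value for "nobody dies", to judge the approximation
exact_none = binomial_pmf(0, 2000, 0.001)

# Computer example: N = 10 computers, S = 4 with illegal software, n = 3 checked
p_two_illegal = hypergeom_pmf(2, N=10, S=4, n=3)

print(round(p_none, 3), round(p_three, 3), round(p_more_than_two, 3))
print(round(exact_none, 4), round(p_two_illegal, 3))
```

The hypergeometric function uses integer combinations throughout and divides only once, which keeps the result free of accumulated rounding error for small populations like this one.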