The purpose of this assignment is to find the MLE of the parameters of a Beta(m, n) distribution under fairly restrictive conditions: namely, we will restrict our attention only to integer values of m and n.
The assignment is to write a program called beta_mle.c that reads in data from a file, computes the MLE of m and n assuming that m and n are positive integers less than a user-specified bound K, and prints the values of the estimated m and n along with the corresponding log-likelihood.
Calling your program as
    $ ./beta_mle data.txt
should use the value K = 100. Calling it as
    $ ./beta_mle data.txt K
should use the provided value of K. All other invocations should produce an error message and terminate the program.
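One possible way to handle the two accepted invocations is sketched below. The helper name parse_args, the DEFAULT_K macro, and the choice to reject a non-numeric or non-positive K are illustrative assumptions, not part of the assignment.

```c
#include <errno.h>
#include <stdlib.h>

#define DEFAULT_K 100   /* default bound stated in the assignment */

/* Hypothetical helper. Returns 0 on success, storing the data file name
   in *path and the bound in *k. Returns -1 for any other invocation. */
int parse_args(int argc, char *argv[], const char **path, long *k)
{
    if (argc != 2 && argc != 3)
        return -1;                      /* wrong number of arguments */

    *path = argv[1];
    *k = DEFAULT_K;

    if (argc == 3) {
        char *end;
        long v;

        errno = 0;
        v = strtol(argv[2], &end, 10);
        /* reject empty input, trailing junk, overflow, and K < 1 */
        if (end == argv[2] || *end != '\0' || errno == ERANGE || v < 1)
            return -1;
        *k = v;
    }
    return 0;
}
```

In main, a nonzero return value would be followed by printing a usage message and returning a nonzero exit status.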
You will need to store the data values in an array after reading them in, but we don't know the number of values in advance. As we have not yet learnt about dynamic array allocation, do this by statically allocating a large array. Here is some pseudo-code to do this:
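    #include <stdio.h>

    #define MAXDATALENGTH 5000

    int main(int argc, char *argv[])
    {
        int s, dsize, i = 0;
        double x[MAXDATALENGTH];
        FILE *in;

        if (argc < 2 || (in = fopen(argv[1], "r")) == NULL) {
            printf("Cannot open data file\n");
            return 1;
        }
        s = fscanf(in, "%lf", &x[i]);   /* fscanf needs the address of x[i] */
        while (s == 1) {
            i++;
            if (i >= MAXDATALENGTH) {
                printf("Maximum allowed dataset size exceeded\n");
                return 1;
            }
            s = fscanf(in, "%lf", &x[i]);
        }
        fclose(in);
        dsize = i;  // size of dataset
        return 0;
    }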
Write a function called lgamma that computes the (natural) log of the gamma function for an integer input. How can you make this function efficient (i.e., not repeat the same calculations unnecessarily)? Learn about static variables and try to use them if you can.
Write a function called beta_loglik that computes the (natural) log of the likelihood function for integer m and n.
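For reference, the log-likelihood can be written out as follows, assuming the standard parameterization of the Beta(m, n) density on (0, 1) and independent observations x_1, ..., x_N (the symbol N for the dataset size is introduced here for illustration):

```latex
f(x \mid m, n) = \frac{\Gamma(m+n)}{\Gamma(m)\,\Gamma(n)}\, x^{m-1} (1-x)^{n-1}, \qquad 0 < x < 1

\ell(m, n) = N \left[ \log\Gamma(m+n) - \log\Gamma(m) - \log\Gamma(n) \right]
           + (m-1) \sum_{i=1}^{N} \log x_i
           + (n-1) \sum_{i=1}^{N} \log(1 - x_i)
```

The two sums depend only on the data, not on m or n, so they need to be computed only once. For positive integers, \Gamma(k) = (k-1)!, which is why an integer-only log-gamma function is sufficient here.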
Evaluate beta_loglik for integer m and n between 1 and K, and report the values that maximize the log-likelihood.
Submit your program by email by midnight of Monday February 6. Send me only the program file (C code), not any output.