Course information

  • Instructor: Deepayan Sarkar <deepayan@isid.ac.in>
  • Syllabus:
    • Basics in Programming: flow-charts, logic in programming
    • Common syntax
    • Handling input/output files
    • Sorting
    • Iterative algorithms
    • Simulations from statistical distributions
    • Programming for statistical data analyses: regression, estimation, parametric tests
  • Grading:
    • Written exam: 30%
    • Practical exam: 20%
    • Assignments: 20%
    • Project: 30%

References

Teaching material

Assignments

Assignments are meant to be done individually, not in groups. You are welcome to discuss ideas with others, but you may not share actual code, algorithms, etc.

Assignment 1

Due: November 1, 2019. Submit typed report as PDF file along with R and C++ code files by email.

  1. Write a recursive version of the insertion sort algorithm.
  2. Prove correctness of the algorithm.
  3. Compute it's average case runtime.
  4. Implement your algorithm in both R and Rcpp.
  5. Compare the empirical (observed) runtime with the corresponding iterative versions discussed in class.

Assignment 2

Due: January 12, 2020. Submit a typed narrative report as PDF file along with relevant R and C++ code files by email.

  1. Suppose you are given an algorithm rcoin() that produces either 0 or 1, with equal probability, every time it is invoked. Assume that the outcome of different calls are independent of each other.

    • Using rcoin() as a random number generator, write an algorithm r100() that produces a random integer between 0 and 99 (inclusive).
    • Prove correctness of your algorithm.
    • Compute the number of times your algorithm invokes rcoin(). If your algorithm is a randomized algorithm, compute the expected number of times it invokes rcoin().
  2. For an input array of size \(n\) which is already sorted, let \(T(n)\) be the number of comparisons made by the non-randomized quicksort algorithm using the Lomuto partitioning scheme. Compute \(T(n)\) and show that \(T(n) = \Theta(n^2)\). You may assume that all elements of the input array are distinct.
  3. The quicksort algorithm can be modified as follows to compute a specific order statistic (i.e., quantile) in linear time (in the average case): after the partition step, the recursion is called only on one subarray, namely, the one containing the desired quantile.

    • Modifying the quicksort algorithm as described above, write an algorithm order_statistic(x, w) to compute an order statistic, so that the call order_statistic(x, w) returns the w-th order statistic of x, i.e., sort(x)[w], but without actually sorting x.
    • Implement your algorithm in Rcpp so that the implementation can be called from R as order_statistic(x, w). Submit the C++ code as a file named order.cpp.
    • Design a simulation study in R to determine empirically through a hypothesis test whether the average run time is \(\Theta(n)\) as opposed to \(\Theta(n \log n)\). Use the simulation study to also empirically estimate the order of the variance of the average case runtime. Submit the R code used to perform the simulation and the subsequent calculations as a file named simulation.R.