Theoretical Statistics and Mathematics Unit Seminars
- Time: Seminars are held on Wednesdays from 3:30 PM to 4:30 PM at Seminar Room 2, New Building, unless otherwise noted.
- Tea, coffee, and cake/snacks will be served in the lobby half an
hour before the talk starts.
- Seminar series speakers usually give two talks during a
one-week visit. One talk will be scheduled at the regular time. The
time and venue for the other talk will be posted in advance.
Schedule for Spring 2012
Hastings-Levitov aggregation and the Brownian web
Abstract: In 1998 Hastings and Levitov proposed a family of models for planar random growth in which clusters are represented as compositions of conformal mappings. This family includes physically occurring processes such as diffusion-limited aggregation, dielectric breakdown and the Eden model for biological cell growth. I shall describe the limits that result from small particle size and rapid aggregation in a special case of this model. In particular I shall show how the Brownian web arises in a limit of the fine scale branching structure that is present within the cluster.
This is based on joint work with James Norris (Cambridge University).
Various Problems on Sum of Digits
Abstract: The distribution of various arithmetic functions on positive integers written in a given base $g > 1$ is a classical area of investigation and a huge body of literature on this topic has been published.
In this talk we look at various problems involving the sum of digits function, their use in proving a conjecture on the exponents in the prime power factorization of factorials, and the behavior of the sum of digits under taking multiples.
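One classical link between digit sums and factorials is Legendre's formula: the exponent of a prime $p$ in $n!$ equals $(n - s_p(n))/(p - 1)$, where $s_p(n)$ is the sum of the digits of $n$ in base $p$. The sketch below only illustrates that identity; it is not taken from the talk, the conjecture mentioned in the abstract may rely on different tools, and the function names are hypothetical.

```python
def digit_sum(n: int, g: int) -> int:
    """Sum of the digits of the non-negative integer n written in base g > 1."""
    total = 0
    while n > 0:
        n, digit = divmod(n, g)
        total += digit
    return total


def factorial_prime_exponent(n: int, p: int) -> int:
    """Exponent of the prime p in n!, via Legendre's digit-sum formula."""
    return (n - digit_sum(n, p)) // (p - 1)


def factorial_prime_exponent_direct(n: int, p: int) -> int:
    """Same exponent, computed as floor(n/p) + floor(n/p^2) + ..."""
    exponent, power = 0, p
    while power <= n:
        exponent += n // power
        power *= p
    return exponent
```

Both functions should agree for every prime `p`; for example, `factorial_prime_exponent(10, 2)` and `factorial_prime_exponent_direct(10, 2)` both give 8, matching $10! = 2^8 \cdot 3^4 \cdot 5^2 \cdot 7$.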
Seeking Hidden Risks with Multivariate Regular Variation
Abstract: Multivariate regular variation plays a role in assessing tail risk in diverse applications such as finance, telecommunications, insurance and environmental science. The classical theory, being based on an asymptotic model, sometimes leads to inaccurate and useless estimates of probabilities of joint tail regions. This problem can be partly ameliorated by using hidden regular variation. We offer a more flexible definition of hidden regular variation using a modified notion of convergence of measures that provides improved risk estimates for a larger class of tail risk regions. The new definition unifies ideas of asymptotic independence and asymptotic full dependence and avoids some deficiencies observed while using vague convergence. We also provide estimators for the limit measures.
(Joint work with A. Mitra and S. Resnick)
Space is the Place: Why Spatial Thinking Matters for Environmental Problems
Abstract: Spatial methods have become an increasingly used approach for analyzing data in many fields. In particular, it is now routine to collect data layers where there is some geographic referencing. This information should be used in order to enhance inference. From a statistical perspective, we think in terms of formal inference, utilizing probabilistic or stochastic modeling; we think beyond purely descriptive summaries. In this sense, we exceed the capabilities of Geographic Information Systems (GIS) software to investigate complex processes over space and time.
A particularly rich context for such investigation is environmental processes. Examples include analysis of weather/climate data, analysis of environmental exposure data, analysis of locations of disease occurrence, and analysis of distributions of species over a region. In this non-technical talk, I will describe the types of spatial (and, perhaps, spatio-temporal) data that we collect. I will discuss what we expect to see with regard to these types of data, i.e., what we mean by "spatial pattern." I will raise a variety of issues that arise in modeling such data - explanation of local behavior through spatially referenced explanatory variables, explanation of uncertainty through structured dependence. I will illustrate, with a variety of datasets involving the foregoing processes, hopefully to illuminate that statistical thinking does matter when we have inferential objectives such as explanation, interpolation, and
Analyzing Spatial Directional Data through the use of Gaussian Processes
Abstract: Circular data arise in oceanography (wave directions) and meteorology (wind directions), and, more generally, with periodic measurements recorded in degrees or angles on a circle. In this talk we introduce a fully model-based approach to handle circular data in the case of measurements taken at spatial locations, anticipating structured dependence between these measurements. We formulate a wrapped Gaussian spatial process model for this setting, induced from a customary inline Gaussian process. We look at the properties of this process, including the induced correlation structure.
We build a hierarchical model to handle this situation and show how to fit this model straightforwardly using Markov chain Monte Carlo methods. Our approach enables spatial interpolation and can accommodate measurement error. We illustrate with a set of angular wave direction data from the Adriatic coast of Italy, generated through a complex computer model.
Then, we consider the projected normal spatial process built from a bivariate Gaussian process model. Such models are more flexible than usual wrapped or von Mises models and easily handle regression. However, they are more challenging to fit. We illustrate with a butterfly dataset.
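As a rough illustration of the two constructions named in this abstract, the sketch below maps Gaussian values to angles in two ways: wrapping a real value onto the circle, and taking the direction of a bivariate vector. It works pointwise on independent draws, so it deliberately ignores the spatial dependence that is the subject of the talk; the function names and parameter choices are assumptions made for the example.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def wrap_to_circle(y: float) -> float:
    """Wrap a real value onto [0, 2*pi), up to floating-point rounding."""
    return y % TWO_PI


def project_to_circle(z1: float, z2: float) -> float:
    """Angle in [0, 2*pi) of the direction of the vector (z1, z2)."""
    return math.atan2(z2, z1) % TWO_PI


def sample_angles(n: int, mean: float, sd: float, seed: int = 0):
    """Draw n wrapped-normal and n projected-normal angles (independent draws)."""
    rng = random.Random(seed)
    wrapped = [wrap_to_circle(rng.gauss(mean, sd)) for _ in range(n)]
    projected = [
        project_to_circle(rng.gauss(math.cos(mean), sd), rng.gauss(math.sin(mean), sd))
        for _ in range(n)
    ]
    return wrapped, projected
```

In a spatial version, the independent `rng.gauss` draws would be replaced by a realization of a Gaussian process over the observation locations; that step is omitted here.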
Approximating shortest paths in graphs
Abstract: Computing all-pairs distances in a graph is a fundamental problem of computer science, but there has been little progress on the general problem for weighted directed graphs. In contrast, there has been a growing interest in the area of algorithms for approximate shortest paths, leading to many interesting variations of the original problem. In this talk, we trace some of the fundamental developments like spanners and distance oracles, their underlying constructions, as well as their applications to the approximate all-pairs shortest paths.
Active Redundancy Allocations in Series Systems
Abstract: To enhance the performance of a system, a common practice employed by reliability engineers is to use redundant components in the system. Two commonly used types of redundancy are the active (or parallel) redundancy and the standby redundancy. In active redundancy, available spares are connected in parallel with the components of the system and function simultaneously with them (which leads to consideration of the maximum of random variables). Two different allocations of spares can be compared through stochastic orders between the lifetimes of the corresponding systems. We will discuss the problem of optimally allocating active redundant components in the system and will provide a detailed literature survey. We will also highlight our contributions in this area.
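To make the "maximum of random variables" remark concrete, here is a minimal sketch for a series system with a single active spare. A series system fails when its first component fails, so its lifetime is the minimum of the component lifetimes; a component with an active spare lasts as long as the longer-lived of the two. The function name and the single-spare setting are assumptions for illustration only.

```python
def series_lifetime_active(component_lifetimes, spare_lifetime, position):
    """Lifetime of a series system with one active spare.

    The spare runs in parallel with the component at `position`, so that
    slot survives until max(component, spare); the series system then
    fails at the minimum over all slots.
    """
    lifetimes = list(component_lifetimes)
    lifetimes[position] = max(lifetimes[position], spare_lifetime)
    return min(lifetimes)
```

With component lifetimes `[2.0, 5.0]` and a spare lasting `4.0`, attaching the spare to the first component gives a system lifetime of `4.0`, while attaching it to the second leaves it at `2.0`. Comparing allocations through stochastic orders amounts to comparing the distributions of such lifetimes rather than single realizations.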
Standby Redundancy Allocations in Series and Parallel Systems
Abstract: To enhance the performance of a system, a common practice employed by reliability engineers is to use redundant components in the system. Two commonly used types of redundancy are the active (or parallel) redundancy and the standby redundancy. In standby redundancy, spares are attached to components of the system in such a manner that a spare starts functioning only after the failure of the component to which it is attached (which leads to consideration of the convolution of random variables). Two different allocations of spares can be compared through stochastic orders between the lifetimes of the corresponding systems. We will discuss the problem of optimally allocating standby redundant components in the system and will provide a detailed literature survey. We will also highlight our contributions in this area.
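The "convolution of random variables" remark can be sketched as follows: a standby spare takes over only when its component fails, so that slot's lifetime is the sum of the two lifetimes, and the distribution of a sum of independent variables is a convolution. The sketch assumes a single spare, perfect switching, and no degradation of the spare while idle; the function names are hypothetical.

```python
def _with_standby(component_lifetimes, spare_lifetime, position):
    """Copy of the lifetimes with the spare's lifetime added at `position`."""
    lifetimes = list(component_lifetimes)
    lifetimes[position] = lifetimes[position] + spare_lifetime
    return lifetimes


def series_lifetime_standby(component_lifetimes, spare_lifetime, position):
    """Series system: fails as soon as any slot fails (minimum)."""
    return min(_with_standby(component_lifetimes, spare_lifetime, position))


def parallel_lifetime_standby(component_lifetimes, spare_lifetime, position):
    """Parallel system: fails only when every slot has failed (maximum)."""
    return max(_with_standby(component_lifetimes, spare_lifetime, position))
```

With component lifetimes `[2.0, 5.0]` and a spare lasting `4.0`, attaching the spare to the first component gives a series lifetime of `5.0` and a parallel lifetime of `6.0`.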
Frailty and Cure Models in Survival Analysis
Abstract: When making probabilistic models for survival times, one should consider the fact that individuals are heterogeneous because they differ in their susceptibility to causes of death, response to treatment and influence of various risk factors. Some of this heterogeneity can be taken care of by modeling the failure rate by the classical Cox proportional hazard rate (PHRM) model. The observed covariate vector takes into account the heterogeneity present. The unexplained heterogeneity is modeled by introducing a random effect in the hazard rate, called the frailty.
In this presentation, we shall consider various frailty models and their effect on the resulting hazard rate. In addition, we shall introduce the use of cure frailty models, which yield a subgroup of zero susceptibility that "survives forever". This is a relevant model in medicine and demography.
Random Walks
Abstract: Consider a random walker on a lattice that at each time step jumps to one of the neighbors with equal probability. It goes back to the work of Pólya that in dimension 2 the walker almost surely returns to its starting point, whereas this is false in higher dimensions. Further, the scaling limit of the trajectory of the walk is Brownian motion.
None of these statements is known in general when the walk is inhomogeneous in space, with transition probabilities being themselves random. In the talk I will review the classical theory, explain the challenges in the random environment setup, and describe some recent progress. The role played by dimension will be emphasized throughout the talk. No prior knowledge of random walks will be assumed.
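The classical (homogeneous) walk described above is easy to simulate. The sketch below estimates how often a simple random walk on the integer lattice returns to its starting point within a fixed number of steps. A finite simulation cannot establish recurrence or transience; it only hints at how the return frequency changes with dimension. The function names, step budget, and seed are assumptions for the example.

```python
import random


def returns_to_origin(dimension: int, max_steps: int, rng: random.Random) -> bool:
    """Run one simple random walk; report whether it revisits the origin."""
    position = [0] * dimension
    for _ in range(max_steps):
        # Each of the 2 * dimension neighbors is chosen with equal probability.
        axis = rng.randrange(dimension)
        position[axis] += rng.choice((-1, 1))
        if not any(position):
            return True
    return False


def estimate_return_frequency(dimension: int, max_steps: int, trials: int, seed: int = 0) -> float:
    """Fraction of simulated walks that return to the origin within max_steps."""
    rng = random.Random(seed)
    hits = sum(returns_to_origin(dimension, max_steps, rng) for _ in range(trials))
    return hits / trials
```

Calling `estimate_return_frequency(d, 10_000, 1_000)` for `d = 1, 2, 3` gives a rough comparison across dimensions; in dimension 2 the frequency approaches 1 only very slowly as the step budget grows.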
Branching Random Walks and the Maxima of Gaussian Free Fields
Abstract: Bramson and the speaker considered the maximum of the discrete two dimensional Gaussian free field (GFF) in a box, and proved (2011) that its maximum, centered at its mean, is tight, settling a long standing conjecture. The proof exploits similarities with branching random walks, and combines an argument of Dekking and Host (1991), adapted by Bolthausen, Deuschel and the speaker to the GFF setup (2010) with elements from Bramson's thesis (1978) and comparison theorems for Gaussian fields. An essential part of the argument is the precise evaluation, up to an error of order 1, of the expected value of the maximum of the GFF in a box. Related Gaussian fields, such as the GFF on a two dimensional torus, are also discussed. Finally, I will discuss some recent progress concerning non-homogeneous branching random walks, and links with the cover time of graphs by random walks.
Large deviations for truncated heavy-tailed random variables: a boundary case
Abstract: Suppose that $X_1, X_2, \ldots$ are i.i.d. random variables with regularly varying tails, and $(M_n)$ is a deterministic sequence going to infinity. In this talk, we shall study the decay rate of $P(S_n > k M_n)$, where $k$ is a positive integer and $S_n := \sum_{j=1}^n X_j \mathbf{1}(|X_j| < M_n)$. It turns out that the case when $k$ is an integer is very different from that when it is not so, from the points of view of the decay rate and its proof. The reason for this difference, and the method of attack for the above mentioned problem, are the content of this talk.
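For readers who want to see the truncated sum concretely, the sketch below computes the sum of those $X_j$ whose absolute value is strictly below the truncation level $M_n$, using a symmetric Pareto variable as one example of a regularly varying tail. The choice of distribution, the tail index, and the function names are assumptions made for illustration; the abstract does not fix a particular distribution.

```python
import random


def truncated_sum(samples, truncation_level):
    """Sum of the samples whose absolute value is strictly below the level."""
    return sum(x for x in samples if abs(x) < truncation_level)


def symmetric_pareto(alpha: float, rng: random.Random) -> float:
    """Pareto(alpha) magnitude with a random sign (regularly varying tails)."""
    magnitude = rng.paretovariate(alpha)
    return magnitude if rng.random() < 0.5 else -magnitude


def sample_truncated_sum(n: int, truncation_level: float, alpha: float, seed: int = 0) -> float:
    """Draw n symmetric Pareto variables and return their truncated sum."""
    rng = random.Random(seed)
    return truncated_sum((symmetric_pareto(alpha, rng) for _ in range(n)), truncation_level)
```

Since every retained term has absolute value below the truncation level, the sum exceeds $k M_n$ only if more than $k$ terms contribute substantially. The probabilities in question are typically far too small to estimate by naive simulation of this kind.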
Forbidden configurations and Steiner designs
Abstract: Let F be a (0,1) matrix. A (0,1) matrix M is said to have F as a configuration if there is a submatrix of M which is a row and column permutation of F. We say that a matrix M is simple if it has no repeated columns. For a given $v \in \mathbb{N}$, we shall denote by forb(v,F) the maximum number of columns in a simple (0,1) matrix with v rows for which F does not occur as a configuration. We show that for certain natural choices of F, forb(v,F) $\leq \binom{v}{t}/(t+1)$. In particular, this gives an extremal characterization for Steiner t-designs as maximal (0,1) matrices in terms of certain forbidden configurations, and our bound is asymptotically tight. This also answers a question posed by Richard Anstee.
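The definitions of "simple" and "configuration" can be checked by brute force on small matrices. The sketch below is only meant to make the definitions concrete: it tries every ordered choice of rows and every choice of columns, so it is exponential and unsuitable for anything but tiny examples. Matrices are assumed to be non-empty lists of equal-length rows, and the function names are hypothetical.

```python
from itertools import combinations, permutations


def is_simple(M) -> bool:
    """True if the (0,1) matrix M has no repeated columns."""
    columns = [tuple(col) for col in zip(*M)]
    return len(set(columns)) == len(columns)


def has_configuration(M, F) -> bool:
    """True if some submatrix of M is a row and column permutation of F."""
    rows_f, cols_f = len(F), len(F[0])
    # Column order does not matter, so compare sorted lists of columns.
    target = sorted(tuple(col) for col in zip(*F))
    # An ordered choice of rows of M accounts for row permutations of F.
    for row_choice in permutations(range(len(M)), rows_f):
        for col_choice in combinations(range(len(M[0])), cols_f):
            candidate = sorted(tuple(M[r][c] for r in row_choice) for c in col_choice)
            if candidate == target:
                return True
    return False
```

For example, with `F = [[1, 0], [0, 1]]`, the matrix `[[0, 1], [1, 0]]` has F as a configuration (swap the two rows), while the all-ones 2 x 2 matrix does not.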
Storing small sets efficiently in the bit probe model with 2 adaptive probes
Abstract: The problem of storing small sets efficiently, first studied by Buhrman et al., is essentially the following: Given a universal set U of size m, we are interested in storing (as bits) small subsets S of U (subsets of size at most t for some fixed t) so that any query of the form "Is x in S?" can be answered accurately. We allow the querying protocol to make two probes, and our goal is to optimize the size of the array needed to store small subsets. We shall first look at some known results, and then consider the special case of storing 3-element subsets of U, in which case we have an asymptotically tight answer for the order of the array size.
Stochastic properties of Dynamical Systems: A Spectral Theoretic Approach
Abstract: The time evolution of many Markov chains and of many hyperbolic dynamical systems can fruitfully be described in terms of associated transfer operators acting on spaces of (generalized) functions. I will outline the formal similarities and the differences in interpretation between these two settings. Then I will explain briefly a spectral theorem and a perturbation theorem that have turned out to be extremely useful for proving probabilistic limit theorems in both settings.
Please contact Deepayan Sarkar at <deepayan.sarkar@gmail.com> if you are interested in giving a talk.
Related links: official seminar web-page, past seminars.
Updated: 05 March 2012