# Courses and Workshops

The Leibniz ScienceCampus offers its members courses and workshops for methodological or interdisciplinary training. The program is aimed primarily at doctoral students and postdocs at the beginning of their scientific careers. The courses are organized in consultation with the members.

# Course Program 2017

# Regression Modeling

**What is this workshop about?**

Regression models are the (!) statistical workhorse in empirical research, and this workshop aims to provide a foundation in this class of statistical approaches. Most projects in the ScienceCampus require more complex model classes than the basic regression models introduced in this workshop in order to adequately treat the underlying data-generating processes. These basics are nevertheless of the greatest importance: they transfer directly to the more complex model classes and are thus essential for any successful applied data analysis using regression techniques.

**When?** May 8 – 12, 2017, with an additional practice/question day on May 16, 2017. 9.00h – 13.00h each day.

**Where?** German Primate Center (DPZ), seminar room E0.14.

**What is the target group?** PhD students, post-docs.

**When can I register?** March 27 – April 23, 2017.

**How can I register?** Please send an email to: hsennhenn-reulen(at)dpz.eu

**What is the maximum number of participants?** 15 (minimum number: 5).

**What are the contents I can expect?**

• **Simple Linear Models:** Basic model build-up for the simple linear regression model; Estimation of the regression parameters; Confidence intervals for regression parameters; Confidence interval for the regression line; Prediction intervals; Goodness of fit; Similarities and differences between the simple linear regression model and the concept of correlation.

• **Multiple Linear Models:** Modeling of different covariate scales (Non-linear effects; Categorical covariates; Interaction effects); Variable selection; Combined and overall *F* tests; Diagnostics.

• **Generalized Linear Models:** Binomial distributed response; Poisson distributed response; Deviance; Offset.
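To give a flavor of the first topic, here is a minimal sketch of a simple linear regression fitted by ordinary least squares in plain Python. It is an illustration only and not taken from the workshop material: the function name `fit_simple_linear` and the data are made up, and the confidence intervals covered in the workshop are left out because they additionally need a quantile of the *t* distribution.

```python
import math


def fit_simple_linear(x, y):
    """Ordinary least squares fit of y = intercept + slope * x (illustrative helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual sum of squares and the usual variance estimate with n - 2 df.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)

    return {
        "intercept": intercept,
        "slope": slope,
        "se_slope": math.sqrt(sigma2 / sxx),   # standard error of the slope
        "r_squared": 1.0 - rss / syy,          # goodness of fit
        "correlation": sxy / math.sqrt(sxx * syy),  # Pearson correlation
    }


# Made-up example data, not from any ScienceCampus project.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = fit_simple_linear(x, y)
print(fit)
```

In the simple linear model, the squared Pearson correlation equals the coefficient of determination, which is one of the similarities between regression and correlation mentioned in the list above.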

# Bayesian Statistics

**What is this workshop about?**

As Andrew Gelman writes in his handy statistical lexicon (http://andrewgelman.com/2008/10/03/bayes_bayesians/):

*Every statistician uses Bayesian inference when it is appropriate (that is, when there is a clear probability model for the sampling of parameters). A Bayesian statistician is someone who will use Bayesian inference for all problems, even when it is inappropriate. I am a Bayesian statistician myself (for the usual reason that, even when inappropriate, Bayesian methods seem to work well).*

Bayesian statistics is a powerful school of statistical inference because it provides very natural solutions to problems that are very difficult to resolve in the classical framework of frequentist statistics (see the explanations in the description of the *Hierarchical Regression Models* workshop). Moreover, since the revolution brought about by Markov chain Monte Carlo (MCMC) techniques, post-estimation calculations have become much easier in practice, which makes Bayesian inference appealing even in many applied scenarios where the classical framework is equally appropriate.

**When?** September 4 – 8, 2017, with an additional practice/question day on September 12, 2017. 9.00h – 13.00h each day.

**Where?** German Primate Center (DPZ), seminar room E0.14.

**What is the target group?** PhD students, post-docs.

**When can I register?** July 24 – August 20, 2017.

**How can I register?** Please send an email to: hsennhenn-reulen(at)dpz.eu

**What is the maximum number of participants?** 15 (minimum number: 5).

**What are the contents I can expect?**

• **What is Statistical Inference?** Statistical inference and models; Four general tasks in statistical inference; Two general approaches: Frequentist (exclusively likelihood-based), and Bayesian inference.

• **Bayesian and Frequentist Perspectives on Statistical Inference**

• **Bayesian Inference** Posterior distribution; Bayesian point estimates; Credible regions; Bayesian tests; Choice of the prior distribution; Numerical methods for Bayesian inference.

• **Bayesian Inference using MCMC Sampling** Metropolis-Hastings algorithm and Gibbs sampler.

• **Bayesian Regression** Linear regression; Logit regression.

• **Bayesian Model Choice**
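As a rough illustration of the MCMC topic, the following sketch draws from the posterior of a binomial success probability with a random-walk Metropolis sampler, the special case of Metropolis-Hastings with a symmetric proposal. Everything here is an assumption made for the example: the flat Beta(1, 1) prior, the data (7 successes in 20 trials), the step size, and the function names. The example was chosen because the exact posterior, Beta(8, 14), is known and can be used as a check.

```python
import math
import random


def log_posterior(theta, successes, trials):
    """Unnormalized log posterior: binomial likelihood times a flat Beta(1, 1) prior."""
    if theta <= 0.0 or theta >= 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)


def metropolis(successes, trials, n_draws=20000, step=0.1, seed=1):
    """Random-walk Metropolis sampler with a Gaussian (symmetric) proposal."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, successes, trials)
    draws = []
    for _ in range(n_draws):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(theta)  # a rejected proposal repeats the current value
    return draws


draws = metropolis(successes=7, trials=20)
kept = sorted(draws[2000:])  # discard the first 2000 draws as burn-in
posterior_mean = sum(kept) / len(kept)
lower = kept[int(0.025 * len(kept))]   # equal-tailed 95% credible interval
upper = kept[int(0.975 * len(kept))]
analytic_mean = (1 + 7) / (2 + 20)     # mean of the exact Beta(8, 14) posterior

print(posterior_mean, analytic_mean, (lower, upper))
```

Once the draws are available, point estimates and credible intervals are simple summaries of the sample, which is what makes post-estimation calculations convenient in practice.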

# Hierarchical Regression Models

**What is this workshop about?**

Hierarchical regression models are particularly suitable for research designs in which data are organized on more than one observation level (which is the case in many research designs within the ScienceCampus): the primary observation units are usually individuals nested within higher-order units such as groups, or repeated measurements nested within individuals.

**When?** November 6 – 10, 2017, with an additional practice/question day on November 14, 2017. 9.00h – 13.00h each day.

**Where?** German Primate Center (DPZ), seminar room E0.14.

**What is the target group?** PhD students, post-docs.

**When can I register?** September 25 – October 22, 2017.

**How can I register?** Please send an email to: hsennhenn-reulen(at)dpz.eu

**What is the maximum number of participants?** 15 (minimum number: 5).

**What are the contents I can expect?** A hierarchical regression model includes not only model terms that explain variation in the (expectation of the) response by products of covariates $x_k$ and regression coefficients $\beta_k$ (these are meant to describe the data-generating mechanism across observation units and are often called fixed model terms), but also coefficients $\gamma_i$ that explain variation between the observation units $i \in \{1, \ldots, n\}$ (often called the random term of the model). Other names for hierarchical regression models are mixed effects regression models (fixed and random model terms) and multilevel regression models (referring to primary observation units being nested within higher-order units). During this workshop, we will emphasize the Bayesian approach to this class of regression models. This is why the workshop is called Hierarchical Regression Models instead of Mixed Effects Regression Models: because the assumption $\gamma_i \sim N(0, \sigma^2)$ can be incorporated naturally into a Bayesian framework, there is no longer any need to distinguish between fixed and random model terms.
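Written out, the model described above could look as follows. This is only one possible notation: the index $j$ for repeated observations within unit $i$ and the covariate notation $x_{ijk}$ are assumptions introduced here for illustration, while $\beta_k$, $\gamma_i$ and $\sigma^2$ are used as in the text.

```latex
\begin{aligned}
\mathrm{E}(y_{ij} \mid \gamma_i) &= \sum_{k} x_{ijk}\,\beta_k + \gamma_i,
  \qquad i \in \{1, \ldots, n\}, \\
\gamma_i &\sim N(0, \sigma^2).
\end{aligned}
```

In a Bayesian formulation, the second line plays the role of a prior on the $\gamma_i$, just as the $\beta_k$ receive priors of their own, which is why the distinction between fixed and random terms loses its importance.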

**But why should I learn how to use a Bayesian inference approach to hierarchical regression models?** In the classical statistical inference framework, mixed model regression parameters do not have convenient asymptotic distributions to test against (in contrast to the parameters of ordinary least squares and generalized linear models, whose estimators converge asymptotically to known distributions), which complicates the inferences that can be drawn from mixed models in this framework. The main source of this added complexity is the shrinkage that the usual assumption $\gamma_i \sim N(0, \sigma^2)$ applies to the random effects, which makes it difficult to determine the degrees of freedom associated with this model term. In an applied example, the variance parameter $\sigma^2$ may be estimated from the $n$ levels of a variable, and the design matrix used to estimate the parameters of this variable contains $n$ indicator variables for these $n$ levels. If we included this variable in the usual way (as a variable with fixed coefficients), we would associate one degree of freedom with each estimated value, and thus $n$ degrees of freedom with the $n$ indicators. But since a shrinkage factor is applied to these $n$ indicators (resulting in so-called *partial pooling*), we do not really need $n$ degrees of freedom. So what are the correct degrees of freedom to charge for estimating this random effects model term? Is it one (we estimate only one variance parameter), or $n$ (we explain variation in the expectation of the response with $n$ coefficients), or something in between (partial pooling)? The latter option must be correct but, unfortunately, there is no generally accepted theory that provides an exact value.

Moreover, even assuming we can find a good value for the degrees of freedom, we still cannot count on our test statistic (from likelihood ratio tests and the like) being $F$ or $\chi^2$ distributed once this shrinkage part has been added to the model. If we instead estimate the model in a Bayesian framework, the shrinkage is simply a natural consequence of the model assumptions, here expressed as prior formulations. This is a major benefit of the Bayesian approach, and Bayesian approaches to multilevel/hierarchical/mixed models can be expected to become the standard in the coming years, all the more so because recent improvements in software have made this approach much easier to apply (see the Stan-based R add-ons rstanarm and brms).
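The shrinkage behind partial pooling can be made concrete with a small sketch for the simplest case, a random-intercept model with one mean per group. The sketch rests on strong simplifying assumptions that are not part of the text above: the overall mean and both variance components are treated as known, the numbers are made up, and the function names are hypothetical. The sum of the shrinkage factors is shown as one commonly used heuristic for "effective" degrees of freedom, not as a generally accepted answer to the question raised above.

```python
def shrinkage_factors(group_sizes, var_between, var_within):
    """Shrinkage factor per group: var_between / (var_between + var_within / n_i)."""
    return [var_between / (var_between + var_within / n_i) for n_i in group_sizes]


def partially_pooled_means(group_means, group_sizes, overall_mean,
                           var_between, var_within):
    """Pull each observed group mean towards the overall mean by its shrinkage factor."""
    lambdas = shrinkage_factors(group_sizes, var_between, var_within)
    return [overall_mean + lam * (mean - overall_mean)
            for lam, mean in zip(lambdas, group_means)]


# Made-up example: three groups with very different numbers of observations.
group_means = [4.0, 6.0, 11.0]
group_sizes = [2, 10, 50]
overall_mean = 7.0    # assumed known
var_between = 1.0     # assumed known variance of the group effects (sigma^2)
var_within = 4.0      # assumed known residual variance

lambdas = shrinkage_factors(group_sizes, var_between, var_within)
pooled = partially_pooled_means(group_means, group_sizes, overall_mean,
                                var_between, var_within)

# Heuristic "effective degrees of freedom": the sum of the shrinkage factors.
# It always lies strictly between 0 and the number of groups.
effective_df = sum(lambdas)

print(lambdas, pooled, effective_df)
```

Groups with few observations are pulled strongly towards the overall mean, while groups with many observations stay close to their own mean, so the three group effects together cost fewer than three degrees of freedom under this heuristic.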

# Contact

Dr. Christian Schloegl, Coordinator, +49 551 3851-480, +49 551 3851-489

# Registration

Members of the ScienceCampus can register for courses until the respective registration deadline; places are allocated in the order in which registrations are received. If places remain available after the registration deadline, they are open to other students and postdocs from Göttingen, again allocated in order of registration. Interested persons who do not work in Göttingen should get in touch in advance.