# Automatic Differentiation

**Seminar**

Automatic Differentiation (AD) is a technique for evaluating the derivative of a computer program. One important application is gradient-descent-based optimization: the techniques used, e.g., to train neural networks can be applied to optimize parameters in general programs. AD can therefore be seen as a generalization of the backpropagation algorithm from neural networks to general code. With AD, the parameter training known from machine learning becomes available in general-purpose programming, allowing for more flexible structures than just neural nets.
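To make the idea concrete, here is a minimal sketch of reverse-mode AD in plain Python, followed by gradient descent on a parameter of an ordinary program with a loop. It is an illustration only and not taken from any of the seminar papers: the names `Var`, `lift`, `backward`, `program`, and `train` are hypothetical, and only scalar addition, subtraction, and multiplication are supported.

```python
class Var:
    """A scalar value that remembers how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (input Var, local partial derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __sub__(self, other):
        other = lift(other)
        return Var(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        other = lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__


def lift(x):
    """Wrap plain numbers as constants so they can appear in expressions."""
    return x if isinstance(x, Var) else Var(float(x))


def backward(output):
    """Propagate d(output)/d(node) from the output back to every input."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Reversed post-order: every node is handled before its inputs.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


def program(w, x):
    """An ordinary program (assumed example): iterate y = w * y + 1 twice."""
    y = lift(x)
    for _ in range(2):
        y = w * y + 1
    return y


def train(target, w0=1.0, lr=0.01, steps=100):
    """Gradient descent on w so that program(w, 1.0) approaches target."""
    w_value = w0
    for _ in range(steps):
        w = Var(w_value)
        diff = program(w, 1.0) - target
        loss = diff * diff
        backward(loss)
        w_value -= lr * w.grad
    return w_value
```

For `x = 1`, `program` computes w² + w + 1, so its derivative with respect to `w` is 2w + 1, which is what `backward` accumulates in `w.grad`. With the assumed starting point and learning rate, `train(7.0)` should move `w` toward 2. Because only the operations that actually execute are recorded, the loop needs no special treatment, which is the sense in which AD generalizes backpropagation to general code.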

In this seminar, we'll look at compiler techniques for performing AD on general code and ongoing research in programming language theory on AD.

## People

## Organization

Item | Details |
---|---|
Prerequisites | Background in compilers and programming language theory, and willingness to dig deeper. |
Language | English |
Participants | 0 / 8 (seats taken / maximum seats) |
Waiting list | 0 (please attend the Preparatory Meeting) |
Weekly Meeting | Thursdays, 16:00 |
First Meeting | 29 Apr 2021, 16:00 |

## Registration

Please register via our seminar system. Note that you must also register for the seminar in the LSF by the deadline (TBA) to receive a certificate for the seminar.

## Modus Operandi

A paper will be assigned to each participant. During the semester, we will meet weekly to discuss one of the assigned papers. Each discussion is moderated by the student to whom the paper was assigned. The moderator is responsible for giving a short summary of the paper and for structuring the subsequent discussion.

### Weekly Summaries

Every week, each student has to write a plain-text summary (max. 500 words) of the week's paper. The summary should include open questions and must be submitted to Matthis Kruse by 23:59 three days before the corresponding meeting.

The submitted files **must** follow the naming scheme:

`<two-digit-paper-number>_<matriculation-number>.txt`

The summaries of all participants will be made available and can be used by the moderator to structure the discussion in the following meeting.

Each participant is allowed to drop **two** summaries without giving a reason.
If you drop a summary, please send a short email to let us know.

### Final Talks

At the end of the semester, each participant will give a 30-minute presentation (25 min talk + 5 min questions) on their paper.

## Dates

### Sessions

Date | Moderator | Paper |
---|---|---|

### Final Talks

Date | Time | Speaker |
---|---|---|

## Papers

All papers are available from the university network (how to connect to the university network from home). A publicly available version is linked below whenever available.

- Abadi et al. A Simple Differentiable Programming Language
- Mazza et al. Automatic differentiation in PCF
- Brunel et al. Backpropagation in the Simply Typed Lambda-Calculus with Linear Negation
- Wang et al. Demystifying differentiable programming: shift/reset the penultimate backpropagator; and Sigal. Automatic Differentiation via Effects and Handlers
- Bernstein et al. Differentiating A Tensor Language
- Elliott. The Simple Essence of Automatic Differentiation
- Sherman et al. 𝜆ₛ: computable semantics for differentiable programming with higher-order functions and datatypes
- Moses et al. Instead of Rewriting Foreign Code for Machine Learning, Automatically Synthesize Fast Gradients; Innes. Don't Unroll the Adjoint: Differentiating SSA-Form Programs; and Pearlmutter and Siskind. Reverse-Mode AD in a Functional Framework: Lambda the Ultimate Backpropagator

## Background Material

- MIT introduction course to deep learning. The first video is particularly helpful for the background required in this seminar.
- Baydin et al. Automatic differentiation in machine learning: a survey.
- Gordon Plotkin. Differentiable Programming. Talk at POPL 18.
- Christopher Olah. Neural Networks, Types, and Functional Programming. Blog Entry.