This is a brief guide on how to set up a reinforcement learning (RL) environment that is compatible with the Gymnasium 1.0 interface. Gymnasium is the de facto interface standard for RL environments, and the library provides useful tools for working with them.
Gymnasium
Gymnasium is "An API standard for reinforcement learning with a diverse collection of reference environments" (https://gymnasium.farama.org/).
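As a minimal sketch of what such an environment looks like: the class below mirrors the Gymnasium 1.0 `reset`/`step` signatures without depending on the library. The class name and dynamics are made up for illustration; a real environment would subclass `gymnasium.Env` and declare `observation_space` and `action_space`.

```python
# Minimal sketch of the Gymnasium 1.0 environment interface.
# A real environment would subclass gymnasium.Env and define
# observation_space and action_space; this stand-alone class only
# mirrors the reset/step signatures for illustration.

class CountingEnv:
    """Hypothetical toy environment: reach `goal` by stepping +1/-1."""

    def __init__(self, goal=10):
        self.goal = goal
        self.state = 0

    def reset(self, *, seed=None, options=None):
        # Gymnasium 1.0: reset returns (observation, info).
        self.state = 0
        return self.state, {}

    def step(self, action):
        # Gymnasium 1.0: step returns
        # (observation, reward, terminated, truncated, info).
        self.state += 1 if action == 1 else -1
        terminated = self.state == self.goal
        reward = 1.0 if terminated else 0.0
        return self.state, reward, terminated, False, {}


env = CountingEnv()
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(1)
```

The five-tuple returned by `step` (with separate `terminated` and `truncated` flags) is the main difference from the old Gym API.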
LaTeX for Dissertations
Writing a dissertation is usually something you do only once, so along the way I gathered a lot of knowledge that I probably won't need again. This is a brief summary of how the LaTeX code of my dissertation is structured. I hope it can be an inspiration for someone else. As …
CrKR
A weighted regression method.
BFGS
Machine learning often involves optimization, that is, solving
$$\arg\min_{x} f(x),\ f: \mathbb{R}^n \rightarrow \mathbb{R},$$where $f(x)$ is an objective function, for example, negative log-likelihood.
Often you cannot find the optimum analytically. That is why we need numerical, iterative optimization methods. I will take a closer look at these methods, which are fundamental to machine learning, but we need some basic tools first.
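To make the idea of iterative optimization concrete, here is a minimal sketch (not from the post): plain gradient descent on a simple quadratic objective. The step size and iteration count are arbitrary choices for this toy problem; BFGS itself is more sophisticated.

```python
import numpy as np

def gradient_descent(grad, x0, learning_rate=0.1, n_iter=100):
    """Iteratively step against the gradient of the objective."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = x - learning_rate * grad(x)
    return x

# Toy objective f(x) = ||x - c||^2 with gradient 2 (x - c);
# its minimizer is x = c.
c = np.array([1.0, -2.0])
x_opt = gradient_descent(lambda x: 2.0 * (x - c), x0=np.zeros(2))
```

Quasi-Newton methods like BFGS replace the fixed step with a step scaled by an approximation of the inverse Hessian, which typically converges in far fewer iterations.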
Generating Python Bindings
I often write C++ code and I often use Python. I usually want to use my C++ code in Python. There are many tools that simplify the work of writing Python bindings (Cython, SWIG, Boost.Python, pybind11, CLIF, ...). I personally like Cython. Most C++ features can be translated directly to …
pytransform
My work often combines motion capture, reinforcement learning, and robotics. There are many different systems involved, like proprietary motion capturing software, our own machine learning library, robotic simulations (e.g. MARS, Gazebo, or Bullet), and robotic middleware (e.g. ROS or RoCK). All of them come with their own complex visualization tools, tools for handling transformations, and usually with their own conventions for transformations.
Maximum Likelihood
Maximum likelihood is one of the fundamental concepts in statistics and artificial intelligence algorithms. What does it mean and how is it used in practice?
Suppose you have some dataset $\mathcal{D}$ and a possible hypothesis $h$ of the latent function that might have generated the dataset. The probability distribution of the dataset given the hypothesis $p(\mathcal{D}|h)$ is called likelihood
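As a small illustration (not from the post): for a Gaussian hypothesis with fixed variance, the log-likelihood of i.i.d. data is maximized by the sample mean, so the maximum likelihood estimate of the mean can be read off directly.

```python
import numpy as np

def gaussian_log_likelihood(data, mu, sigma=1.0):
    """log p(D | h) for the hypothesis h = N(mu, sigma^2), i.i.d. data."""
    n = len(data)
    return (-0.5 * n * np.log(2.0 * np.pi * sigma ** 2)
            - np.sum((data - mu) ** 2) / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=1.0, size=1000)

# The maximum likelihood estimate of mu is the sample mean: no other
# value of mu yields a higher log-likelihood for this dataset.
mu_mle = np.mean(data)
```

In practice one maximizes the log-likelihood rather than the likelihood itself, since the product of many small probabilities underflows numerically.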
Learning Curves #2
Almost everyone working in the field of machine learning is pretty sure about what a learning curve is. It seems to be intuitive. The problem is that each field has its own typical definition of a learning curve and it is unusual to write it down explicitly. The only …
pybullet
pybullet is a simple Python interface to the physics engine Bullet. It is easy to install (via pip install pybullet
) and use, and yet it is a powerful tool. This article will give a brief glimpse of what you can do with it. A more detailed guide can be found in the pybullet quickstart guide
Slither
All the commercial services that allow you to record your sport activities, like Polar Flow, Runtastic, Endomondo, Runkeeper, Strava, Google Fit, or whatever, store your data on some server that you cannot control. You do not know what they use the data for and you cannot write your own tools …
Linear Support Vector Machine
Model
A linear Support Vector Machine implements the linear model $$y = \text{sign}\left(\boldsymbol{w}^T\boldsymbol{x} + b\right),$$ where $y \in \{-1, 1\}$ is a class label, $\boldsymbol{x} \in \mathbb{R}^D$ is an input vector, and $\boldsymbol{w} \in \mathbb{R}^D$, $b \in \mathbb{R}$ are the model parameters.
Objective Function
The original objective function for a Support Vector Machine is
$$\min_{\boldsymbol{w}} ||\boldsymbol{w}||^2 \text{ subject to } y_i \left( \boldsymbol{w}^T \boldsymbol{x}_i + b \right) \geq 1 \text{ for } i=1,\ldots,n.$$
To allow misclassification to some degree, we can formulate a relaxed version. The objective function of soft
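As a hedged sketch of the model and the relaxed objective (the weights and data below are made up): the decision function and the soft-margin objective, with hinge losses standing in for the slack variables, can be written directly in NumPy.

```python
import numpy as np

def predict(w, b, X):
    """Linear SVM decision function: y = sign(w^T x + b)."""
    return np.sign(X @ w + b)

def soft_margin_objective(w, b, X, y, C=1.0):
    """||w||^2 plus C times the sum of hinge losses (the slack terms)."""
    margins = y * (X @ w + b)
    slack = np.maximum(0.0, 1.0 - margins)
    return w @ w + C * np.sum(slack)

# Toy separable data: two points on opposite sides of the y-axis.
X = np.array([[2.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, -1.0])
w, b = np.array([1.0, 0.0]), 0.0

labels = predict(w, b, X)                  # both points classified correctly
value = soft_margin_objective(w, b, X, y)  # no slack, so just ||w||^2
```

The parameter C trades off margin width against tolerance for misclassified points: large C approaches the hard-margin objective above.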
Regression
Approximate an unknown function $f$ that maps from $\mathbb{R}^D$ to $\mathbb{R}^F$.
Model:
- We assume that there is some latent function $f: \mathbb{R}^D \rightarrow \mathbb{R}^F$.
- We observe samples $(\boldsymbol{x}_n, \boldsymbol{y}_n)$ with $f(\boldsymbol{x}_n) + \boldsymbol{\epsilon}_n = \boldsymbol{y}_n$
- with i.i.d. noise, for example $\boldsymbol{\epsilon}_n \sim \mathcal{N}(0, \sigma^2 \boldsymbol{I}_F)$.
Overview
Algorithms
- Linear Regression
- Polynomial (Ridge) Regression
- Kernel (Ridge) Regression
- Gaussian Process Regression
- Support Vector Regression
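The model above can be made concrete with the simplest algorithm on the list, linear regression (a minimal sketch with made-up data): we sample noisy observations of a latent linear function and recover its parameters by least squares.

```python
import numpy as np

rng = np.random.default_rng(42)

# Latent linear function f(x) = W x, observed with i.i.d. Gaussian noise.
W_true = np.array([[2.0, -1.0]])          # maps R^2 -> R^1
X = rng.normal(size=(200, 2))             # inputs x_n
noise = 0.1 * rng.normal(size=(200, 1))   # epsilon_n ~ N(0, sigma^2 I)
Y = X @ W_true.T + noise                  # y_n = f(x_n) + epsilon_n

# Linear regression: least-squares estimate of W.
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
W_hat = W_hat.T
```

Under the Gaussian noise assumption stated above, the least-squares solution is also the maximum likelihood estimate of the parameters.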
t-SNE in scikit learn
The algorithm t-SNE has recently been merged into the master branch of scikit learn. It is a nice tool to visualize and understand high-dimensional data. In this post I will explain the basic idea of the algorithm, show how the implementation from scikit learn can be used, and show some examples. The IPython notebook that is embedded here can be found here
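A minimal usage sketch (assuming scikit-learn is installed; the data here is random, so only the call pattern matters):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))  # 30 high-dimensional points

# Embed into 2D; perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=5, random_state=0)
X_embedded = tsne.fit_transform(X)
```

Perplexity roughly controls the effective number of neighbors each point considers and is the main knob to tune when the embedding looks degenerate.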
Learning Curves in scikit learn
scikit learn is a great machine learning library for Python. It offers a broad range of machine learning algorithms and tools. Learning curves are a new tool that was merged last week, and I want to use that feature to point out why scikit learn is such a great library …
Julia
Julia is a new programming language for the scientific community. Its main advantage over Matlab, R, etc. is its speed. Although it is still under heavy development, I would say it is worth taking a look at. There are already some interesting libraries available. Some of them come from the …