The code presented here is to help understand the ideas discussed. Therefore, I may have removed some details of the implementation. For complete code, check my GitHub repository.

Learning from samples. The DP methods discussed earlier leverage a distribution model to calculate the optimal value function and an optimal policy. In this post, we will do away with such models. In many applications, it is easier to obtain samples of agent-environment interactions than a precise model that captures the environment’s dynamics. …


Winter is here. Can Dynamic Programming save us?

This is the second post in the series on Reinforcement Learning. In the previous post, we looked at the simple k-Armed Bandit environment and learned the ideas of action-value methods and exploration. In this post, we will look at Frozen-Lake, an environment more complex than the previous one. We will use Markov Decision Processes to model this environment. We will then learn about value functions and policies, and ways to make them optimal. The code discussed in this post can be found here.

Frozen Lake ft. Slippery Ice and Dreadful Holes

Frozen-Lake is a more involved environment than k-Armed Bandit. It is a 4x4 grid where each cell…


I am pursuing my interest in building powerful AI systems that are provably safe and beneficial. In this journey, I will be penning down my learnings, ideas and experiments in this publication. These posts are glorified notes where I’m looking to organize my thoughts and narrow my focus onto the subproblems I would eventually work on. This is the first post, in which I summarize the work by Tom Everitt, Gary Lea, and Marcus Hutter (2018), “AGI Safety Literature Review”. …


Arnold Newman’s iconic 1956 portrait of Kurt Gödel, who in 1930 proved the incompleteness of Principia Mathematica

The Gödel Essays

A Glimpse of Incompleteness

In the 19th century, many paradoxical statements, formed either in natural language or in mathematics, caused concern and raised questions about the reliability of mathematical reasoning. Rigorous attempts were made to address such concerns.

One of the most notable works in this regard is Principia Mathematica, written by Alfred North Whitehead (1861–1947) and Bertrand Russell (1872–1970) in the early 1900s. One of its primary goals was “solving paradoxes that plagued logic and set theory”. This was a bold attempt to summarize the foundations of mathematics. The idea was to get rid of paradoxes by getting rid of self-references and in…


Estimation and Exploration

Welcome to the first post in what will be a series of posts on Reinforcement Learning. I am quite interested in the field of AI Safety, and I believe a good understanding of RL, among other things, is crucial for tackling the issues in AI Safety. I am using the incredible work “Reinforcement Learning: An Introduction” by Richard S. Sutton and Andrew G. Barto as my main source of learning. To really ground my understanding, I will write programs and run experiments, where necessary, to complement the natural programs that run only in our minds.

The simplest version of k-armed bandit

Learning in…


We don’t have a widely accepted definition of intelligence. Nevertheless, for the sake of discussion, let’s try to have an utterly inclusive and informal description of intelligence as the ability to accomplish complex goals. There are no limits to what these goals can be. This definition is inclusive in the sense that it doesn’t limit us to biological organisms. It also doesn’t require consciousness.

Well then, what are some necessary ingredients of systems that exhibit intelligence?

My answer is memory, computation, and learning. They form the key ingredients of intelligence. You may not fully agree and have your own reservations…


Alan Turing, 1950

Alan Turing, National Portrait Gallery London

The Imitation Game

Can machines think?

This question requires us to define the words “machine” and “think”. Instead of defining them, which only seems easy, let’s replace the question with one that is very similar. Before that, we introduce the imitation game.

The game is played by three people: a man, a woman and an interrogator. The interrogator is isolated from the other two and can ask each of them questions, with the goal of identifying which is the man and which is the woman. The man and the woman respond in a way that misleads the interrogator. …


Machine Learning is an incredibly exciting technology. Many systems powered by Machine Learning help most businesses and organizations make predictions and decisions. The impact of such models is undeniable and significant. Nevertheless, these models face many security issues. It turns out they can be fooled by cleverly crafted inputs. What’s worse, crafting such inputs is not hard at all. You can go ahead and hide your model and data. It’s still possible!

The Implications!

Oh, the implications are definitely significant. Imagine a self-driving car being fooled by a modified sign board. All one…


Let us evaluate the capabilities of Deep Learning

Prologue

Deep Learning has revolutionized machine capabilities by leveraging the advancements in computing power and rise in amounts of data available. From object recognition to recommendation engines and from translation systems to fraud detection, we have witnessed state-of-the-art performances using Deep Learning.

Consequently, Deep Learning powered applications have become a reality in no time. Apps on your smartphone are probably running Deep Learning algorithms locally while getting better with time and usage. Obviously, there is a lot of hype around Deep Learning, thanks to the various tech companies and media. In fact, most of the hype is justified given its impact. …


Understand the basics of TensorFlow

TensorFlow is an open source library for machine intelligence developed by Google. It is a computational library with a wide range of functionality. However, its main purpose is to implement Machine Learning algorithms.

TensorFlow is extremely popular relative to other Deep Learning libraries for many reasons. One of the most important is its ability to facilitate both research and production. This was not possible with the other libraries. Researchers and developers had to use one language for research and prototyping and some other language to deploy their model into production. …

Sai Sasank

Interested in AI Safety Research
