Kotlin purity and function memoization
Letâs learn about the benefits of âpurityâ and âpure functionsâ, and how it affects caching.
A pure function is a function that just operates over itâs input arguments to provide a result. It has no side effects, which means that the function itself is not provoking any external effects that you cannot control or you donât expect. Itâs not modifying any external state behind the scenes. So if you ran the function one trillion times for the same input arguments you would get the same result one trillion times.
A pretty common example of pure function usage uses to be the one about JDK Math class functions.
val fifteen = Math.abs(-15)
val arcTangent = Math.atan(90.0)
val arcSine = Math.asin(45.0)
val eight = Math.sqrt(64.0)
All those functions would always return the same results for those given arguments, with no exceptions.
In java itâs easy to create pure functions. You just mark those functions as static
so those cannot access any state since they arenât bound to any enclosing class instance. The compiler itself is going to forbid modification or access to any non static external variables from outside the function, but you shouldnât either modify global static ones.
Referential Transparency is a concept definitely present on pure functions. I learned this concept from a very interesting talk about Functional Error Handling by Raul Raja.
Referential Transparency means that the function does exactly what it promises when you call it, no more, no less. This concept is pretty beneficial for testability. If your code is most likely based on pure functions, that means itâs also more predictable. And predictable code is normally easier to test. Side effects break referential transparency.
Hermione found that purity doesnât fit pretty well inside her last released app. She might use some spells to fix it.
If you are an Android developer you know about the need to apply side effects at some point, otherwise you couldnât render things on screen or query to an API. Well, you can apply those effects to the edges imposed by your system. Talking about Android, we would be talking about lifecycle callbacks where Android IoC gives us the control. We will talk a bit about this again in the following paragraphs.
The rest of your architecture could be totally based on purity-based transformations over the data, if you wanted.
If you pass all your dependencies as parameters all the way, you will find that you are progressively getting rid of state across your architecture except for the edge imposed by it, that would be where the inversion of control is applied (lifecycle callbacks).
In Java, we normally have state all over the place just because OOP, and because we also decide to inject dependencies through constructors / fields and keep their references retained inside class instances. But thatâs our choice on that. It doesnât mean we have a requirement to have state, neither an instance. The only meaning of this situation is that the DI
framework we are using is kind of forcing us to have state.
If you start moving towards this mindset, you will find that you could remove a lot of classes and focus more on functions as first class citizens.
Also, having no state means no need to switch those pieces at testing time. You will be completely safe by using your real production code for all those chained functions inside your tests, since there arenât any problematic side effects creating the need for test doubles. And the testing environment should always replicate production code as much as possible, otherwise you wouldnât be writing the right tests, isnât it?
This enables you to test your app almost end to end in a black box scenario just by switching your View implementation (Android) and a couple more side effecting dependencies by test doubles. Purity predictability will make you capable of exercising a bunch of chained functions and easily know what to expect in return for your assertions.
Also, your production dependency trees would be much shorter. Your switchable dependencies would be just the required ones to apply side effects, but other than that your connection between the inner layers would just be a bunch of pure functions calling each other to apply transformations over the data. Something that does not requires to be passed as an injected dependency.
But we also have to look at the possible disadvantages and caveats of purity.
đ Performance consequences
By getting rid of the state we are also sacrificing something important. Letâs say that we have the need for a memory cache of items loaded from the API.
First of all, we see a problem on querying the API, since thatâs a side effect.
Well, a good approach to avoid breaking purity would be to wrap the side effect inside an IO
Monad to defer its execution. This means that we are going to compose a chain of transformations over the data plus the deferred data retrieval by the IO
Monad, and we will run the whole chain from the edge imposed by the system, right when we are ready for it. That will happen on our onCreate
or onResume
Activity / Fragment method, most likely.
So all the side effects are applied on Android lifecycle callbacks, inside the view implementation. So we are safe on this.
But what happens when we have to do some expensive calculations in some of the layers composed just with pure functions? Also, what would happen if we are server side and the mentioned function to do the calculation is called thousands of times?
Letâs have a look on the following code snippet. Itâs a class instance with some state. Itâs capable of calculating the factors of any possible integer:
class FactorCalculator {
val sumCache = hashMapOf<Int, Int>()
val factorCache = hashMapOf<Int, List<Int>>()
fun sumOfFactors(number: Int) = sumCache.getOrPut(number, {
factorsOf(number).sum()
})
fun isFactor(number: Int, potential: Int) = number % potential == 0
fun factorsOf(number: Int) = factorCache.getOrPut(number, {
(1 to number).toList().filter { isFactor(number, it) }
})
}
The sumOfFactors
function returns a sum of all the factors from any Int
. That value is calculated by the factorsOf
function, which takes a range from 1
to the current number
, converts it to a list and then filters to get just the values that are actually factors of the given number
.
Both functions rely on a HashMap
structure to cache already calculated values results for previous inputs, just to save some CPU cycles and memory. Using the map.getOrPut()
function from Kotlin we simplify the code a bit.
But we have state here, so we donât have purity. Our functions are applying side effects over the HashMap
structures. So, letâs try to convert this class to a bunch of pure functions that just operate over their input parameters:
fun sumOfFactors(number: Int) = factorsOf(number).sum()
fun factorsOf(number: Int) = (1 to number).toList().filter { isFactor(number, it) }
fun isFactor(number: Int, potential: Int) = number % potential == 0
We dropped the concept of class here, as itâs no longer needed. But during the process, we didnât have any other choice than to drop the state since we want those functions to be pure. So we lost our memory caches forever!
Affleck realized at the bar what he lost on that refactor pushed to production hours ago
Well, Affleck has good reasons to be sad, since his refactor was part of some very often used code from server side, and server response timings are going to be damaged so badly because of it.
Looking back at the previous snippet, it seems like we donât have any simple possibility to keep a memory structure storing those already calculated values without provoking side effects.
Well, thatâs not really true and here is were the need for a feature like function memoization comes in đ.
đ€ Function memoization
Itâs a feature commonly built into some programing languages like Groovy, which enables automatic caching of recurring function-return values.
The idea around the concept is to provide a memory cache at a function level that stores the already calculated function result values. So if we get this feature we wouldnât need a class instance and a field reference to store it for us anymore, it would be held as part of the function language meta-level.
Languages that do not implement this out of the box donât have hard times implementing it, although they can lose some syntactic sugar during the process.
Memoizing a function is a metafunction application. That means we are doing something to the function itself and how it works at a language meta level and not to the function results or itâs code implementation.
Here you have a pretty clear article by Brendon Anderson about how function memoization works in Groovy
. The code examples inside of it are pretty straightforward. I will paste one here to reduce noise so we can talk about it:
def myClosure = { Integer x ->
println "My Closure argument $x"
}.memoize()
myClosure 3
myClosure 4
myClosure 3
myClosure 4
You define a closure and then call the language built-in memoize()
method on top of it to get a memoized version for the same function. This is the resulting output after running the function those 4 times:
My Closure argument 3
My Closure argument 4
So, quoting the article: âYou can see that the code inside the closure was executed only once for each distinct input parameter.â
Thatâs because the memoized version of the function is returning already cached results and not recalculating them if itâs not needed. It just replaces the resulting value with the cached one if the function was called with the same parameters than a previous call so the function code is not really run.
This can be done because the function is pure, and we can tell that we will always get the same result for the same input parameter. Otherwise we couldnât feel safe about replacing itâs returning value with itâs cached version.
So, how could I achieve the same behavior in Kotlin
?
đ Kotlin implementation
Sad story here: Memoization isnât a built-in feature for Kotlin
. I will not paste Ben Affleck here again, but I can pretty much imagine your face.
Anyways, we can still implement it by ourselves soâŠ
Take a look at this simple Kotlin implementation of the feature:
class Memoize1<in T, out R>(val f: (T) -> R) : (T) -> R {
private val values = mutableMapOf<T, R>()
override fun invoke(x: T): R {
return values.getOrPut(x, { f(x) })
}
}
fun <T, R> ((T) -> R).memoize(): (T) -> R = Memoize1(this)
val memoizedSumFactors = { x: Int -> sumOfFactors(x) }.memoize()
First, letâs look at the class on top. I have used a class here so it becomes clearer to read, but I could also create an anonymous object extending the correct function type (T) -> R
.
On this basic example I am just covering the case of functions with a standalone parameter that also return a value. But the same strategy could be applied for any number of function arguments.
Taking a cautious look, you will notice that the Memoize1
class receives as function with the type (T) -> R
as a property. Then I overrided the invoke
method to be able to make itâs implementation rely on a MutableMap
. The map is used to hold the already calculated values as an in memory cache, so the function will return a previously calculated result if the input value is contained in the map. Otherwise, the function will be run and itâs result will be stored in the map right after that.
Overriding invoke is like overriding the main application of the function, which is itâs main constructor. In Kotlin, a call with the form a() is converted to a.invoke(). You can also invoke any function by using a.invoke() by yourself. This is equivalent to apply() method from Scala.
I also added an extension method for all the functions with the form (T) -> R
, for any T
and R
types. Thanks to that one, you can call memoize()
over any function that receives a value and returns a result. You will get itâs memoized version in return.
val memoizedSumFactors = { x: Int -> sumOfFactors(x) }.memoize()
memoizedSumFactors(3) // this would be the only one calculated
memoizedSumFactors(3)
memoizedSumFactors(3)
memoizedSumFactors(3)
memoizedSumFactors(3)
As you can see on the snippet, by using this code you would get your function body run just once, and for the resting calls for the same input value we would get the cached result instead.
So here we got what we where looking for! đ
After writing this article, found out that the really well known library funKTionale from Mario Arias already implements memoization for Kotlin. Itâs gonna become a really good example of how to enable memoization for functions up to 22 parameters.
As you can see, he uses objects extending the required function types, instead of named classes, to provide a simpler implementation.
You might think that the code is pretty boilerplate with all those extension functions and classes for the handler. This sort of âpredefâ files to define language extensions are pretty typical on functional programing. Truth is that thanks to this simple file that you could copy to any of your projects, you enable memoization for any possible function you want to use.
So it should be worth it, isnât it?
If you want to avoid the need to implement memoization by yourself, and now that you already know whatâs the concept about, you could add the dependency of funKTionale to your project. If you are moving your app towards a more functional architecture, this library will be handful for you, and itâs worth it. Thanks to Mario for putting his best efforts on it.
This should wrap things up for now. If you are interested on Kotlin
and possible functional approaches around the language, you could want to take a look to the following articles I wrote recently about tail recursion and Dependency Injection using the Reader
monad. You might agree or disagree, but those are interesting concepts pretty handful to understand:
Please, feel free to add me on Twitter @jorgecastillopr to discuss anything related (or not even related!) to this article. I usually post and retweet about Kotlin
and any other Android development and functional related posts.
If you reached this point you might consider supporting me, đhere you have a link where you could do it. Really appreciated! đ€ Getting support or not, I will for sure keep writing and providing content for free â