Discovering modifications to Immutable data-structures by way of Infer

This article was written by Lucashaben, Ezgi Çiçek, and Jules Villard.

Facebook Lite is a leaner version of the Facebook app that is optimized to run on lower or older phones and with limited internet access such as 2G networks.

There is an ongoing refactoring effort to translate the entire view hierarchy of the Facebook Lite for Android app to an immutable paradigm. In this post we want to highlight how Infer, a powerful and open source static analysis tool, is being used as an important part of this project.

On the way to immutable data structures

It’s no secret why this migration is being done; The benefits of immutability are numerous and well known, especially when working in multithreaded environments.

There are no race conditions with immutable data structures: There are no write processes and all read processes can run in parallel without any risk. In contrast, if we have a mutable list and two threads are accessing it in parallel, one wants to add a new item at the end and the other wants to read the last item. In this case we have a race condition: whichever is executed first, the thread reading the last element receives a different result.

Since there are no race conditions, we avoid complicated mutexes. How do we solve the above race condition example with no immutability? With a mutex, especially in this case a reader-writer lock. Ok, one case is solved, but in a complex application we can easily solve hundreds or thousands of them. With so many locks, deadlocks are possible, performance becomes a problem, and so on. Working with mutexes in large systems becomes very complicated.

This brings us to the next point: immutable paradigms are easier to justify. To say that locks are complicated to deal with is an understatement; the number of nests grows exponentially with execution. You can find a nice example in the question “How hard is hard?” Section of this post (spoiler: incredibly difficult).

With so much complexity, the best way to deal with it is not to worry about it. When there are no data races and no locks, we don’t care that the program is non-deterministic with practically endless execution possibilities – we can just write code like in a single-threaded environment and it will just work.

All of the points described above result in better software that can be objectively measured in fewer errors and crashes.

The problem

The problem is well known and much discussed in Java circles: arrays do not cooperate with an immutable paradigm. The standard way to make an object immutable in Java is to make it final, but that’s not enough for an array. We can change the elements of an array like this:

And the compiler accepts it. The reason for this is that it only guarantees that it will always point to the same array object. Therefore, this code is not compiled:

But of course that’s not the behavior we want: we want an immutable array of immutable elements.

Common solutions

There are several approaches to addressing this problem:

  • migrate from arrays to another data structure: native, third-party, or custom implemented
  • Change the way the array is accessed:
    • Make a copy every time a getter is called
    • Never disclose it directly, e.g. instead of a getArray method getElement and getSize.

These approaches (with the exception of the copying process, which has its own performance problem) have one major problem: as a refactoring, they cause a lot of glitches. For example, imagine a method that receives an array and does something with it. The code is redesigned and now with some calls the parameter that used to be an array becomes an ImmutableList, while other calls remain unchanged or the array is no longer accessible. It’s annoying but solvable. Now imagine a huge and very complex code base with countless cases like this! The inconvenience builds up and could become a major problem and time sinking human hours.

Is there a way to leave most of the code intact without a great deal of refactoring and only repair the problematic parts, i.e. modify immutable data structures?

Static analysis with Infer

Enter the static analysis that can help us find changes to immutable data structures by statically analyzing the code without executing it.

At Facebook we develop and use the open source platform for static infer analysis, which includes a powerful memory analysis called Pulse, which can detect invalid memory manipulations such as null pointer exceptions, use-after-free errors (in C / C ++) and much more. For this work, we worked with the Infer team to extend Pulse to also detect changes to the immutable data structures used in the Facebook Lite view hierarchy (ex.

Pulse is an inter-procedural analysis based on Separation Logic. The program analyzes method by method and derives a “summary” for each method, which describes the effect of the method in the form of pre- and post-conditions, each of which describes sets of concrete machine states (possible parameter values, heap structure, etc.). ). Then our immutability analysis post processes these function summaries to identify changes to immutable data structures. For each of these problems detected, a trace is reported through the code that explains how the change can be made.

In addition, a fundamental property of Pulse is that it performs what is known as under-approximate analysis, which means that every time a problem is reported on a code path, that path is feasible in an actual execution of the program (it applies Limitations, e.g. the analysis makes optimistic assumptions about the library code to which it has no access and ignores parallelism). This has important ramifications for immutability analysis: Virtually all reports of immutability violations are truly positive (i.e., actionable issues that need to be fixed), at the expense of some code paths that the analysis may overlook and lead to false negative results (e.g., finds the Analysis may not all errors).

Use immutability analysis in your project

The immutability analysis is part of Infer’s contamination analysis and is activated by a flag. To use it, comment out fields that should be immutable with @Immutable, then run infer with. the end

–impurity-only –impurity-report-immutable-modifications

For example, if your project uses Maven, you can do the following:

Infer will then report the changes to immutable collections as errors. Here is a complete simple example with two files:

Find out more about Infer on his website.

diploma

The novel solution presented here is minimally invasive, as it only requires a small redesign of the code: annotate immutable data structures, run Infer’s static analysis, fix the problems found (i.e. changes to immutable data structures), and that’s it. There’s no need to redesign the entire app to switch to different data structures, change thousands of rows, etc.

The static analysis is also attractive for the app in terms of performance. All analysis is carried out statically in advance so that there is no performance regression at runtime. Arrays are fast, compact, and have less overhead than immutable lists. Needless to say, not having to make copies every time a getter is called is far more powerful than having to copy potentially mutable data. And it just needs to import new annotation in a small package so that the APK size regression is minimal.

To learn more about Facebook Open Source, visit our open source site, subscribe to our YouTube channel, or follow us on Twitter and Facebook.

Interested in working with open source technologies on Facebook? Check out our open source related job postings on our careers page.

Comments are closed.