7.8 The cost of safety
7.8.1 Run-time cost
Table 7.9 shows how the run-time safety checks increase the performance overhead and code size for our 12 benchmarks. The performance overhead is also shown in Figure 7.7. The baseline here is the unsafe version of our VM, which is on average 67.0% slower than native C. Adding safety checks increases the average overhead to 104.6% of native C, corresponding to a 22.5% increase in run time compared to the unsafe VM.
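The two percentages above use different baselines (native C and the unsafe VM), so they combine multiplicatively rather than additively. The following Python sketch only illustrates that arithmetic; the function name is ours and is not part of the VM.

```python
def overhead_vs_native(unsafe_vs_native_pct, safe_vs_unsafe_pct):
    """Combine two relative overheads into one overhead relative to native C.

    unsafe_vs_native_pct: overhead of the unsafe VM over native C, in percent.
    safe_vs_unsafe_pct:   extra run time of the safe VM over the unsafe VM, in percent.
    """
    unsafe_factor = 1 + unsafe_vs_native_pct / 100
    safe_factor = 1 + safe_vs_unsafe_pct / 100
    # Run-time factors multiply; subtract 1 to get back to an overhead.
    return (unsafe_factor * safe_factor - 1) * 100


# Write safety only: 67.0% over native C, then 22.5% over the unsafe VM.
write_safe = overhead_vs_native(67.0, 22.5)
print(round(write_safe, 1))
```

With the averages quoted above, this reproduces the reported overhead of roughly 104.6% over native C.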
[Figure 7.8: Percentage of array/object load/store instructions and cost of read/write safety. Axes: percentage of executed array/object access instructions; overhead (% unsafe VM). Series: loads vs. read safety overhead; stores vs. write safety overhead.]

[Figure 7.9: Comparison of safety cost with heap bounds in memory or registers. Axis: overhead (% of native C run time), shown per benchmark (Bubble sort, Heap sort, Binary search, XXTEA, MD5, RC5, FFT, Outlier, LEC, CoreMark, MoteTrack, HeatCalib, HeatDetect, average). Series: bounds in memory; bounds in registers; difference.]

The cost of the run-time safety checks depends greatly on the benchmark we run. Most checks are done at translation time, including writes to local and static variables. The only check that adds significant run-time overhead is check R-4, which checks the target of an object field or array write is within the bounds of the heap.
Thus, the run-time overhead is determined by the number of object or array writes a benchmark does. The percentage of these is shown in the first part of Table 7.9. Since bubble sort has by far the highest percentage of array writes, at 18% of all executed bytecode instructions, it also incurs the highest overhead from adding safety, and slows down by 72.7%. Binary search, on the other hand, does no writes at all and is unaffected.
As usual, CoreMark, being a large benchmark with a mix of operations, is somewhere in the middle. The correlation between the percentage of array and object writes and the slowdown compared to the unsafe version is shown in Figure 7.8.
Safe reads
Up to this point, the VM only checks that the application cannot write to memory it is not supposed to write to. It may still read from any location.
The recently published Meltdown and Spectre vulnerabilities in desktop CPUs can be exploited by malicious code to read from anywhere in memory, exposing both the kernel's and other applications' private data, which may contain sensitive information such as authentication tokens and passwords. This sent OS vendors rushing to release patches, which early reports suggest may cause a performance penalty of up to 11% [82].
Whether this is also a problem on a sensor node depends on the scenario. If the VM or other tasks contain sensitive information, then this may need to be protected. However, in many sensor node applications the node may only be running a single application, and CapeVM does not contain any state that would be useful to an attacker. In these cases, write safety will be sufficient.
Adding read safety to our VM is trivial: instructions to load local and static variables are already protected since they use the same code to access a variable as the store instructions. For heap access, we simply add the same call to heapcheck to the GETARRAY and GETFIELD instructions just before the actual read.
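The idea can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not the VM's actual implementation: the class `ToyVM`, the exception `HeapBoundsError`, the flat byte-array memory, and the `read_safety` switch are all hypothetical. Only the name heapcheck and the placement of the check (before every heap write, and optionally before every heap read) follow the text.

```python
class HeapBoundsError(Exception):
    """Raised when a heap access falls outside the heap region."""


class ToyVM:
    """Hypothetical model: one flat memory with a heap in [heap_start, heap_end)."""

    def __init__(self, memory_size, heap_start, heap_end, read_safety=False):
        self.memory = bytearray(memory_size)
        self.heap_start = heap_start    # first address inside the heap
        self.heap_end = heap_end        # first address past the heap
        self.read_safety = read_safety  # also check reads when True

    def heapcheck(self, address):
        # The bounds test shared by checked writes and checked reads.
        if not (self.heap_start <= address < self.heap_end):
            raise HeapBoundsError(hex(address))

    def put_array(self, base, index, value):
        address = base + index
        self.heapcheck(address)         # write safety: always checked
        self.memory[address] = value

    def get_array(self, base, index):
        address = base + index
        if self.read_safety:
            self.heapcheck(address)     # read safety: same check before the read
        return self.memory[address]


vm = ToyVM(memory_size=256, heap_start=64, heap_end=128, read_safety=True)
vm.put_array(64, 3, 42)   # inside the heap: allowed
value = vm.get_array(64, 3)
```

In this model, enabling read safety costs one extra call to the same check per load, which is why the overhead scales with the number of array and object reads.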
Figure 7.7 shows that the cost of providing read safety is higher than that of write safety, because most applications read from an array or object much more frequently than they write to them. As a result, our VM with read and write safety turned on slows down by 64% on average, corresponding to a 174% slowdown over native C. In addition to the sort benchmarks, MoteTrack also suffers greatly from adding read safety, since it spends 21% of its instructions reading from objects and arrays, most of them reading the RSSI signatures.
RC5 is the fastest benchmark, since it not only does relatively few array reads and writes, but also spends a large amount of time on expensive variable bit shifts, which have identical performance in both C and AOT-compiled versions. The result is a slowdown of only 33% compared to native C for the fully sandboxed version.
Keeping heap bounds in registers
In Section 6.2.4 several alternatives for the heap bounds check were considered, one of which is to keep the bounds in dedicated registers to avoid having to fetch them from memory for each check. This section evaluates this choice.
Having the bounds in registers would reduce the cost of the check from 22 to 14 cycles, reducing the overhead of safety checks by 8/22 ≈ 36%. However, this uses 4 registers which cannot be used for stack caching.
To estimate how this would affect performance, the benchmarks were run using the unsafe VM, with the number of registers available to the stack cache reduced by 4. Since this does not affect the number of heap accesses, we then added the observed overhead for safety checks, reduced by 36%.
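The estimate described above can be written out as a short Python sketch. The function and parameter names are ours, and we assume that all three inputs are expressed in the same unit (overhead as a percentage of native C run time); only the cycle counts of 22 and 14 come from the text.

```python
CHECK_CYCLES_MEMORY = 22     # cycles per check, bounds fetched from memory
CHECK_CYCLES_REGISTERS = 14  # cycles per check, bounds kept in registers


def estimate_register_bounds_overhead(unsafe_reduced_cache,
                                      unsafe_full_cache,
                                      safe_full_cache):
    """Estimate the overhead of a safe VM with heap bounds in registers.

    unsafe_reduced_cache: measured overhead of the unsafe VM with 4 fewer
                          registers available to the stack cache.
    unsafe_full_cache:    measured overhead of the unsafe VM, full stack cache.
    safe_full_cache:      measured overhead of the safe VM, bounds in memory.
    """
    # Observed cost of the safety checks with bounds in memory.
    safety_cost = safe_full_cache - unsafe_full_cache
    # The number of checks is unchanged; each check is 14/22 as expensive,
    # i.e. the safety cost is reduced by 8/22, or about 36%.
    scale = CHECK_CYCLES_REGISTERS / CHECK_CYCLES_MEMORY
    return unsafe_reduced_cache + safety_cost * scale
```

Whether keeping the bounds in registers pays off for a benchmark then depends on which term dominates: the saving of 8/22 of the safety cost, or the increase from `unsafe_full_cache` to `unsafe_reduced_cache` caused by the smaller stack cache.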
Figure 7.9 shows the overhead for our chosen approach with the heap bounds in memory, compared to the expected overhead when the heap bounds are stored in registers. For some benchmarks, such as bubble sort and MoteTrack, the savings in heap bounds checks outweigh the reduced effectiveness of the stack cache. But the improvement in performance is relatively small, and for other benchmarks the reverse is true, showing minor slowdowns when heap bounds are kept in registers. On average the benchmarks are quite balanced, as is the larger CoreMark benchmark.
As future work, we may consider using some basic statistics, such as the percentage of array write instructions and the average stack depth, to choose one of the two options on a per-method basis. But as usual there is a trade-off, in this case with VM size and complexity, and this may not be worth the effort given the relatively small gains.