{-# OPTIONS -fglasgow-exts #-}
-- Haskell98 plus higher-ranked types only
-- Eliminating Array bound checking with non-dependent types
-- An attempt to give as literal as possible translation of
-- Hongwei Xi's code (Xi and Pfenning, PLDI98) into Haskell
-- (Haskell98 + higher-rank types).
-- The present translation has a lot of common with KMP-deptype.hs
-- (translation of another Hongwei Xi's code, from the same paper).
-- The file eliminating-array-bound-check.lhs gives a slightly
-- less literal (but perhaps more elegant) translation.
module Dep where
import Data.Array
{-
Our example is `bsearch', taken from the famous paper "Eliminating
Array Bound Checking Through Dependent Types" by Hongwei Xi and Frank
Pfenning (PLDI'98). Hongwei Xi's code was written in SML extended
with a restricted form of dependent types. Here is the original code
of the example (taken from Figure 3 of that paper, see also
http://www-2.cs.cmu.edu/~hwxi/DML/examples/)
datatype 'a answer = NONE | SOME of int * 'a
assert sub <| {n:nat, i:nat | i < n } 'a array(n) * int(i) -> 'a
assert length <| {n:nat} 'a array(n) -> int(n)
fun('a){size:nat}
bsearch cmp (key, arr) =
let
fun look(lo, hi) =
if hi >= lo then
let
val m = (hi + lo) div 2
val x = sub(arr, m)
in
case cmp(key, x) of
LESS => look(lo, m-1)
| EQUAL => (SOME(m, x))
| GREATER => look(m+1, hi)
end
else NONE
where look <|
{l:nat, h:int | 0 <= l <= size /\ 0 <= h+1 <= size } int(l) * int(h)
-> 'a answer
in
look (0, length arr - 1)
end
where bsearch <| ('a * 'a -> order) -> 'a * 'a array(size) -> 'a answer
The text after `<|' are dependent type annotations. They _must_ be
specified by the programmer -- even for internal functions such as `look'.
-}
-- Our bsearch function.
bsearch cmp (key, arr) = brand arr (\arrb -> bsearch' cmp (key, arrb))
bsearch' cmp (key,(arr,lo,hi)) = look lo hi
where
look lo hi =
index_cmp lo hi Nothing $
\lo' hi' ->
let m = bmiddle lo' hi'
x = arr !. m
in case cmp (key,x) of
LT -> look lo (bpred m)
EQ -> Just (unbi m, x)
GT -> look (bsucc m) hi
{-
This code is just as algorithmically efficient as the Dependent SML
code: one middle index computation, one element comparison, one index
comparison, one index increment or decrement per iteration. There are
no type annotations as none are needed. Operator (!.) is a statically
safe array indexing operator. The type system and the trust properties
of the kernel below guarantee that in the expression "arr !. m" the
index `m' is positively in range of the array `arr' bounds.
-}
-- Tests
barr1 = listArray (5,5 + (length s)-1) s where s = "abcdefgh"
btest1 = bsearch (uncurry compare) ('c',barr1)
btest2 = bsearch (uncurry compare) ('a',barr1)
btest3 = bsearch (uncurry compare) ('h',barr1)
btest4 = bsearch (uncurry compare) ('x',barr1)
btest5 = bsearch (uncurry compare) ('x',listArray (0,-10) [])
-- ----------------------------------------------------------------------
-- The trusted kernel
--
-- The code relies on a compact general-purpose trusted kernel below.
-- That code should be put into a separate module. In particular,
-- the data constructors BArray, BIndex, should not be exported and
-- should not be available outside of the kernel.
--
-- The phantom type 's' is a type proxy for the pair of numbers
-- (low,high) -- the boundaries of array indices. In Haskell, low can
-- be arbitrarily larger than high. We don't know what array indices
-- are until the run time. Yet we know (due to the way BArray, etc, types
-- are constructed) that the same value of s means the the same
-- interval of indices. The safety depends on the trusted way of
-- creating branded types: the constructors BIndex and BArray should be
-- used in the trusted kernel only, and should not be available anywhere
-- else. The uniqueness of 's' afforded by the explicit universal
-- quantification prevents mixing up of different brands.
-- Branded arrays and indices
-- These are `newtype's and so impose no run-time overhead.
newtype BArray s i a = BArray (Array i a)
-- the value of the type |BIndex s i| denotes an index value that
-- is assuredly { low <= i <= high } where low and high are the bounds
-- represented by the type parameter |s|
newtype Integral i => BIndex s i = BIndex i
unbi (BIndex i) = i
-- the value of the type |BIndexL s i| denotes an index value that
-- is assuredly { low <= i } where low is the lower bound
-- represented by the type parameter |s|
newtype Integral i => BIndexL s i = BIndexL i
-- the value of the type |BIndexH s i| denotes an index value that
-- is assuredly { i <= high } where high is the upper bound
-- represented by the type parameter |s|
newtype Integral i => BIndexH s i = BIndexH i
{-
We must re-iterate that safety depends on assurances of the code that
constructs BIndex values. Because of the high assurance, we must
formulate the safety properties as propositions, and prove
them. Fortunately, the code below is compact and straightforward, as
well as general purpose.
-}
{-
We start with an introduction rule for BArray, which determines
the run-time bounds of an array and passes the branded array and
the low and upper bounds to its continuation.
The function brand has a higher-rank type. It is the
quantification of 's' as well as the absence of the BArray constructor
elsewhere guarantee that the same brand entails the same bounds.
-}
brand:: (Ix i, Integral i) =>
Array i e
-> (forall s. (BArray s i e, BIndexL s i, BIndexH s i) -> w) -> w
brand (a::Array i e) k =
let (l,h) = bounds a
in k ((BArray a)::BArray () i e, BIndexL l, BIndexH h)
{-
Proposition: For the argument (a,l,h) of the continuation k,
l >= low bound of the array a, h <= upper bound of the array a.
Proof: from the semantics of the function `bounds', taken here as an axiom,
and the code itself.
-}
bmiddle:: (Integral i) => BIndex s i -> BIndex s i -> BIndex s i
bmiddle (BIndex i1) (BIndex i2) = BIndex ((i1 + i2) `div` 2)
{-
Proposition: l <= i1 <= h, l <= i2 <= h |- l <= (i1 + i2) `div` 2 <= h
Proof: plain arithmetics.
We should stress that the type of bmiddle assures that all indices
involved have the same brand -- that is, the same lower and upper
boundaries. A brand 's' is a (type-level) representation of index
bounds. At compile time, we don't know what they are, but the
unforgeability of 's' statically guarantees that the same
's' represents the same bounds.
-}
-- Index comparison:
-- Compare BIndexL with BIndexH. If the first is less or equal
-- than the second, invoke the onle continuation, passing the indices
-- converted to BIndex.
-- Safety proposition:
-- {i:int | low <= i}, {j:int | j <= high}, i <= j
-- ===> {i,j:int | low <= i,j <= high}. See the definition of BIndex then.
index_cmp :: Integral i =>
BIndexL s i -> BIndexH s i ->
w -> (BIndex s i -> BIndex s i -> w) -> w
index_cmp (BIndexL i) (BIndexH j) onother onle =
if i <= j then onle (BIndex i) (BIndex j) else onother
-- Index arithmetics
-- Safety proposition:
-- {i:int | low <= i <= high}
-- ===> {i:int | low <= i+1} . See the definition of BIndexL then.
-- The safety proposition justifies our use of the data constructor
-- BIndexL.
bsucc:: Integral i => BIndex s i -> BIndexL s i
bsucc (BIndex i) = (BIndexL (succ i))
-- Safety proposition:
-- {i:int | low <= i <= high}
-- ===> {i:int | i-1 <= high} . See the definition of BIndexH then.
-- The safety proposition justifies our use of the data constructor
-- BIndexH.
bpred:: Integral i => BIndex s i -> BIndexH s i
bpred (BIndex i) = (BIndexH (pred i))
-- Because a branded index is assuredly within the bounds of the array of
-- the same brand, we can write the following. Actually, we may _safely_
-- replace a ! i with `unsafeAt a i'
infixl 5 !.
(!.):: (Ix i) => BArray s i e -> (BIndex s i) -> e
(BArray a) !. (BIndex i) = a ! i