%
% (c) The University of Glasgow 2006
% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
%
Arity and eta expansion
\begin{code}
module CoreArity (
manifestArity, exprArity, exprBotStrictness_maybe,
exprEtaExpandArity, CheapFun, etaExpand
) where
#include "HsVersions.h"
import CoreSyn
import CoreFVs
import CoreUtils
import CoreSubst
import Demand
import Var
import VarEnv
import Id
import Type
import TyCon ( isRecursiveTyCon, isClassTyCon )
import Coercion
import BasicTypes
import Unique
import Outputable
import FastString
\end{code}
%************************************************************************
%* *
manifestArity and exprArity
%* *
%************************************************************************
exprArity is a cheap-and-cheerful version of exprEtaExpandArity.
It tells how many things the expression can be applied to before doing
any work.  It doesn't look inside cases, lets, etc.  The idea is that
exprEtaExpandArity will do the hard work, leaving something that's easy
for exprArity to grapple with.  In particular, Simplify uses exprArity to
compute the ArityInfo for the Id.

Originally I thought that it was enough just to look for top-level lambdas, but
it isn't.  I've seen this

	foo = PrelBase.timesInt

We want foo to get arity 2 even though the eta-expander will leave it
unchanged, in the expectation that it'll be inlined.  But occasionally it
isn't, because foo is blacklisted (used in a rule).

Similarly, see the ok_note check in exprEtaExpandArity.  So
	f = __inline_me (\x -> e)
won't be eta-expanded.

And in any case it seems more robust to have exprArity be a bit more intelligent.
But note that   (\x y z -> f x y z)
should have arity 3, regardless of f's arity.
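
For instance, here is a worked trace of exprArity on that last example
(illustrative only; it just follows the equations in the code below, and
assumes f is an unknown function, so idArity f = 0):

	exprArity (\x y z -> f x y z)
	  = 3 + exprArity (f x y z)   -- three leading value lambdas
	  = 3 + 0                     -- each application subtracts 1, but `max` 0 stops at zero
	  = 3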
\begin{code}
manifestArity :: CoreExpr -> Arity
-- ^ manifestArity sees how many leading value lambdas there are
manifestArity (Lam v e) | isId v        = 1 + manifestArity e
                        | otherwise     = manifestArity e
manifestArity (Note n e) | notSccNote n = manifestArity e
manifestArity (Cast e _)                = manifestArity e
manifestArity _                         = 0

exprArity :: CoreExpr -> Arity
-- ^ An approximate, fast, version of 'exprEtaExpandArity'
exprArity e = go e
  where
    go (Var v)                     = idArity v
    go (Lam x e) | isId x          = go e + 1
                 | otherwise       = go e
    go (Note n e) | notSccNote n   = go e
    go (Cast e co)                 = go e `min` length (typeArity (snd (coercionKind co)))
                                     -- Note [exprArity invariant]
    go (App e (Type _))            = go e
    go (App f a) | exprIsTrivial a = (go f - 1) `max` 0
                                     -- See Note [exprArity for applications]
    go _                           = 0
typeArity :: Type -> [OneShot]
-- How many value arrows are visible in the type?
-- We look through foralls, and (non-recursive, non-class) newtypes
-- See Note [exprArity invariant]
typeArity ty
  | Just (_, ty') <- splitForAllTy_maybe ty
  = typeArity ty'

  | Just (arg,res) <- splitFunTy_maybe ty
  = isStateHackType arg : typeArity res

  | Just (tc,tys) <- splitTyConApp_maybe ty
  , Just (ty', _) <- instNewTyCon_maybe tc tys
  , not (isRecursiveTyCon tc)
  , not (isClassTyCon tc)    -- See Note [Newtype classes and eta expansion]
  = typeArity ty'

  | otherwise
  = []
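
-- A couple of illustrative results (a sketch only, not part of the module;
-- the second flag assumes the state hack is enabled, so that
-- isStateHackType (State# RealWorld) is True):
--     typeArity (Int -> Bool -> Int)              = [False, False]
--     typeArity (Int -> State# RealWorld -> Int)  = [False, True]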
exprBotStrictness_maybe :: CoreExpr -> Maybe (Arity, StrictSig)
-- If the expression definitely diverges after being applied to 'ar' value
-- arguments, return (ar, sig), where 'sig' is a strictness signature with
-- 'ar' top demands and a bottom result
exprBotStrictness_maybe e
  = case getBotArity (arityType is_cheap e) of
        Nothing -> Nothing
        Just ar -> Just (ar, mkStrictSig (mkTopDmdType (replicate ar topDmd) BotRes))
  where
    is_cheap _ _ = False
\end{code}
Note [exprArity invariant]
~~~~~~~~~~~~~~~~~~~~~~~~~~
exprArity has the following invariant:

  * If typeArity (exprType e) = n,
    then manifestArity (etaExpand e n) = n

    That is, etaExpand can always expand as much as typeArity says.
    So the case analysis in etaExpand and in typeArity must match.

  * exprArity e <= typeArity (exprType e)

  * Hence if (exprArity e) = n, then manifestArity (etaExpand e n) = n

    That is, if exprArity says "the arity is n" then etaExpand really
    can get "n" manifest lambdas to the top.

Why is this important?  Because

  - In TidyPgm we use exprArity to fix the *final arity* of
    each top-level Id, and

  - In CorePrep we use etaExpand on each rhs, so that the visible lambdas
    actually match that arity, which in turn means
    that the StgRhs has the right number of lambdas.

An alternative would be to do the eta-expansion in TidyPgm, at least
for top-level bindings, in which case we would not need the trim_arity
in exprArity.  That is a less local change, so I'm going to leave it for today!
Note [Newtype classes and eta expansion]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We have to be careful when eta-expanding through newtypes.  In general
it's a good idea, but annoyingly it interacts badly with the class-op
rule mechanism.  Consider

   class C a where { op :: a -> a }
   instance C b => C [b] where
     op x = ...

These translate to

   co :: forall a. (a->a) ~ C a

   $copList :: C b -> [b] -> [b]
   $copList d x = ...

   $dfList :: C b -> C [b]
   $dfList d = $copList d |> co@[b]

Now suppose we have:

   dCInt :: C Int

   blah :: [Int] -> [Int]
   blah = op ($dfList dCInt)

Now we want the built-in op/$dfList rule to fire to give
   blah = $copList dCInt

But with eta-expansion 'blah' might (and in Trac #3772, which is
slightly more complicated, does) turn into

   blah = op (\eta. ($dfList dCInt |> sym co) eta)

and now it is *much* harder for the op/$dfList rule to fire, because
exprIsConApp_maybe won't hold of the argument to op.  I considered
trying to *make* it hold, but it's tricky and I gave up.

The test simplCore/should_compile/T3722 is an excellent example.
Note [exprArity for applications]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When we come to an application we check that the arg is trivial.
   eg  f (fac x) does not have arity 2,
                 even if f has arity 3!

* We require that it is trivial rather than merely cheap.  Suppose f has arity 2.
  Then    f (Just y)
  has arity 0, because if we gave it arity 1 and then inlined f we'd get
          let v = Just y in \w. <f-body>
  which has arity 0.  And we try to maintain the invariant that we don't
  have arity decreases.

* The `max 0` is important!  (\x y -> f x) has arity 2, even if f is
  unknown, hence arity 0
%************************************************************************
%* *
Computing the "arity" of an expression
%* *
%************************************************************************
Note [Definition of arity]
~~~~~~~~~~~~~~~~~~~~~~~~~~
The "arity" of an expression 'e' is n if
applying 'e' to *fewer* than n *value* arguments
converges rapidly
Or, to put it another way
there is no work lost in duplicating the partial
application (e x1 .. x(n1))
In the divegent case, no work is lost by duplicating because if the thing
is evaluated once, that's the end of the program.
Or, to put it another way, in any context C
C[ (\x1 .. xn. e x1 .. xn) ]
is as efficient as
C[ e ]
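
For example (an illustrative source-level pair, where <expensive> does real work):

   f1 = \x y -> x + y
        -- arity 2: duplicating a partial application (f1 e) costs nothing

   f2 = \x -> let v = <expensive> in \y -> v + y
        -- arity 1, not 2: duplicating (f2 e) would duplicate the <expensive> work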
It's all a bit more subtle than it looks:
Note [Arity of case expressions]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We treat the arity of
case x of p -> \s -> ...
as 1 (or more) because for I/O-ish things we really want to get that
\s to the top.  We are prepared to evaluate x each time round the loop
in order to get that.

This isn't really right in the presence of seq.  Consider
	f = \x -> case x of
		    True  -> \y -> x+y
		    False -> \y -> x-y
Can we eta-expand here?  At first the answer looks like "yes of course", but
consider
	(f bot) `seq` 1
This should diverge!  But if we eta-expand, it won't.  Again, we ignore this
"problem", because being scrupulous would lose an important transformation for
many programs.
1.  Note [One-shot lambdas]
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider one-shot lambdas
		let x = expensive in \y z -> E
We want this to have arity 1 if the \y-abstraction is a 1-shot lambda.
3. Note [Dealing with bottom]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider
f = \x -> error "foo"
Here, arity 1 is fine. But if it is
f = \x -> case x of
True -> error "foo"
False -> \y -> x+y
then we want to get arity 2. Technically, this isn't quite right, because
(f True) `seq` 1
should diverge, but it'll converge if we eta-expand f.  Nevertheless, we
do so; it improves some programs significantly, and increasing convergence
isn't a bad thing. Hence the ABot/ATop in ArityType.
4. Note [Newtype arity]
~~~~~~~~~~~~~~~~~~~~~~~~
Non-recursive newtypes are transparent, and should not get in the way.
We do (currently) eta-expand recursive newtypes too.  So if we have, say

	newtype T = MkT ([T] -> Int)

Suppose we have
	e = coerce T f
where f has arity 1.  Then: etaExpandArity e = 1;
that is, etaExpandArity looks through the coerce.

When we eta-expand e to arity 1: eta_expand 1 e T
we want to get: coerce T (\x::[T] -> (coerce ([T]->Int) e) x)

HOWEVER, note that if you use coerce bogusly you can get
	coerce Int negate
And since negate has arity 2, you might try to eta expand.  But you can't
decompose Int to a function type.  Hence the final case in eta_expand.
Note [The state-transformer hack]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Suppose we have
f = e
where e has arity n. Then, if we know from the context that f has
a usage type like
	t1 -> ... -> tn -1-> t(n+1) -1-> ... -1-> tm -> ...
then we can expand the arity to m.  This usage type says that
any application (x e1 .. en) will be applied to (m-n) more args, exactly once.
Consider f = \x. let y = <expensive>
in case x of
True -> foo
False -> \(s:RealWorld) -> e
where foo has arity 1. Then we want the state hack to
apply to foo too, so we can eta expand the case.
Then we expect that if f is applied to one arg, it'll be applied to two
(that's the hack).
See also Id.isOneShotBndr.
Note [State hack and bottoming functions]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It's a terrible idea to use the state hack on a bottoming function.
Here's what happens (Trac #2861):
f :: String -> IO T
f = \p. error "..."
Eta-expand, using the state hack:
f = \p. (\s. ((error "...") |> g1) s) |> g2
g1 :: IO T ~ (S -> (S,T))
g2 :: (S -> (S,T)) ~ IO T
Extrude the g2
f' = \p. \s. ((error "...") |> g1) s
f = f' |> (String -> g2)
Discard args for bottoming function
	f' = \p. \s. ((error "...") |> g1) |> g3
g3 :: (S -> (S,T)) ~ (S,T)
Extrude g1.g3
f'' = \p. \s. (error "...")
f' = f'' |> (String -> S -> g1.g3)
And now we can repeat the whole loop. Aargh! The bug is in applying the
state hack to a function which then swallows the argument.
This arose in another guise in Trac #3959. Here we had
catch# (throw exn >> return ())
Note that (throw :: forall a e. Exn e => e -> a) is called with [a = IO ()].
After inlining (>>) we get
catch# (\_. throw {IO ()} exn)
We must *not* eta-expand to
catch# (\_ _. throw {...} exn)
because 'catch#' expects to get a (# _,_ #) after applying its argument to
a State#, not another function!
In short, we use the state hack to allow us to push let inside a lambda,
but not to introduce a new lambda.
Note [ArityType]
~~~~~~~~~~~~~~~~
ArityType is the result of a compositional analysis on expressions,
from which we can decide the real arity of the expression (extracted
with function exprEtaExpandArity).
Here is what the fields mean. If an arbitrary expression 'f' has
ArityType 'at', then
* If at = ABot n, then (f x1..xn) definitely diverges. Partial
applications to fewer than n args may *or may not* diverge.
We allow ourselves to eta-expand bottoming functions, even
if doing so may lose some `seq` sharing,
let x = <expensive> in \y. error (g x y)
==> \y. let x = <expensive> in error (g x y)
* If at = ATop as, and n=length as,
  then expanding 'f' to (\x1..xn. f x1 .. xn) loses no sharing,
  assuming the calls of f respect the one-shot-ness of
  its definition.

  NB 'f' is an arbitrary expression, eg (f = g e1 e2).  This 'f'
  can have ArityType as ATop, with length as > 0, only if e1 e2 are
  themselves cheap.

* In both cases, f, (f x1), ... (f x1 ... x(n-1)) are definitely
really functions, or bottom, but *not* casts from a data type, in
at least one case branch. (If it's a function in one case branch but
an unsafe cast from a data type in another, the program is bogus.)
So eta expansion is dynamically ok; see Note [State hack and
bottoming functions], the part about catch#
Example:
      f = \x\y. let v = <expensive> in
                \s(one-shot) \t(one-shot). blah
      'f' has ArityType [ManyShot,ManyShot,OneShot,OneShot]
      The one-shot-ness means we can, in effect, push that
      'let' inside the \s\t.

Suppose f = \xy. x+y

Then  f             :: AT [False,False] ATop
      f v           :: AT [False]       ATop
      f <expensive> :: AT []            ATop
\begin{code}
data ArityType = ATop [OneShot] | ABot Arity
     -- ATop as: the expression can be applied to (length as) value args
     --          without losing sharing; each flag records whether the
     --          corresponding lambda is one-shot
     -- ABot n:  the expression definitely diverges after n value args
     -- See Note [ArityType]

type OneShot = Bool

vanillaArityType :: ArityType
vanillaArityType = ATop []      -- Totally uninformative

exprEtaExpandArity :: CheapFun -> CoreExpr -> Arity
-- The Arity returned is the number of value args the
-- expression can be applied to without doing much work
exprEtaExpandArity cheap_fun e
  = case (arityType cheap_fun e) of
      ATop (os:oss)
        | os || has_lam e -> 1 + length oss    -- Note [Eta expanding thunks]
        | otherwise       -> 0
      ATop []             -> 0
      ABot n              -> n
  where
    has_lam (Note _ e) = has_lam e
    has_lam (Lam b e)  = isId b || has_lam e
    has_lam _          = False

getBotArity :: ArityType -> Maybe Arity
-- Arity of a divergent function
getBotArity (ABot n) = Just n
getBotArity _        = Nothing
\end{code}
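
For example, here is an illustrative trace of exprEtaExpandArity on the
case-of-lambdas shape from Note [Arity of case expressions] (a sketch only:
it assumes neither \y is one-shot, and that the supplied CheapFun regards the
scrutinee 'x' as cheap):

    e = \x -> case x of { True -> \y -> x ; False -> \y -> y }

    arityType cheap e          = ATop [False,False]
    exprEtaExpandArity cheap e = 2     -- the outer lambda satisfies has_lam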
Note [Eta expanding thunks]
~~~~~~~~~~~~~~~~~~~~~~~~~~~
When we see
f = case y of p -> \x -> blah
should we eta-expand it?  Well, if 'x' is a one-shot state token
then 'yes' because 'f' will only be applied once.  But otherwise
we (conservatively) say no.  My main reason is to avoid expanding
PAPs
	f = g d  ==>  f = \x. g d x
because that might in turn make g inline (if it has an inline pragma),
which we might not want.  After all, INLINE pragmas say "inline only
when saturated" so we don't want to be too gung-ho about saturating!
\begin{code}
arityLam :: Id -> ArityType -> ArityType
arityLam id (ATop as) = ATop (isOneShotBndr id : as)
arityLam _  (ABot n)  = ABot (n+1)

floatIn :: Bool -> ArityType -> ArityType
-- We have something like (let x = E in b) or (case E of x -> b),
-- where b has the given ArityType; the Bool says whether E is cheap
floatIn _     (ABot n)  = ABot n
floatIn True  (ATop as) = ATop as
floatIn False (ATop as) = ATop (takeWhile id as)
   -- If E is not cheap, keep arity only for one-shots

arityApp :: ArityType -> Bool -> ArityType
-- Processing (fun arg) where the ArityType of fun is the first argument,
-- and the Bool says whether arg is cheap
arityApp (ABot 0)      _     = ABot 0
arityApp (ABot n)      _     = ABot (n-1)
arityApp (ATop [])     _     = ATop []
arityApp (ATop (_:as)) cheap = floatIn cheap (ATop as)

andArityType :: ArityType -> ArityType -> ArityType   -- Used for branches of a 'case'
andArityType (ABot n1) (ABot n2)  = ABot (n1 `min` n2)
andArityType (ATop as) (ABot _)   = ATop as
andArityType (ABot _)  (ATop bs)  = ATop bs
andArityType (ATop as) (ATop bs)  = ATop (as `combine` bs)
  where      -- See Note [Combining case branches]
    combine (a:as) (b:bs) = (a && b) : combine as bs
    combine []     bs     = take_one_shots bs
    combine as     []     = take_one_shots as

    take_one_shots [] = []
    take_one_shots (one_shot : as)
      | one_shot  = True : take_one_shots as
      | otherwise = []
\end{code}
Note [Combining case branches]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider
    go = \x. let z = go e0
                 go2 = \x. case x of
                             True  -> z
                             False -> \s(one-shot). e1
             in go2 x
We *really* want to eta-expand go and go2.
When combining the branches of the case we have
     ATop [] `andAT` ATop [True]
and we want to get ATop [True].  But if the inner
lambda wasn't one-shot we don't want to do this.
(We need a proper arity analysis to justify that.)
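
A few worked instances of andArityType (illustrative only, just following
the 'combine' and 'take_one_shots' equations above):

    ATop []      `andArityType` ATop [True]       = ATop [True]
    ATop []      `andArityType` ATop [False]      = ATop []
    ATop [False] `andArityType` ATop [True,True]  = ATop [False,True]
    ABot 2       `andArityType` ATop [False]      = ATop [False]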
\begin{code}
type CheapFun = CoreExpr -> Maybe Type -> Bool
     -- How to decide if an expression is cheap
     -- If the Maybe is Just, it is the type of the expression

arityType :: CheapFun -> CoreExpr -> ArityType

arityType _ (Var v)
  | Just strict_sig <- idStrictness_maybe v
  , (ds, res) <- splitStrictSig strict_sig
  , let arity = length ds
  = if isBotRes res then ABot arity
                    else ATop (take arity one_shots)
  | otherwise
  = ATop (take (idArity v) one_shots)
  where
    one_shots :: [Bool]    -- One-shot-ness derived from the type
    one_shots = typeArity (idType v)

        -- Lambdas; increase arity
arityType cheap_fn (Lam x e)
  | isId x    = arityLam x (arityType cheap_fn e)
  | otherwise = arityType cheap_fn e

        -- Applications; decrease arity
arityType cheap_fn (App fun (Type _))
  = arityType cheap_fn fun
arityType cheap_fn (App fun arg )
  = arityApp (arityType cheap_fn fun) (cheap_fn arg Nothing)

        -- Case/Let; keep arity if either the expression is cheap
        -- or the lambdas are one-shot
        -- See Note [Arity of case expressions]
        -- and Note [Combining case branches]
arityType cheap_fn (Case scrut bndr _ alts)
  = floatIn (cheap_fn scrut (Just (idType bndr)))
            (foldr1 andArityType [arityType cheap_fn rhs | (_,_,rhs) <- alts])

arityType cheap_fn (Let b e)
  = floatIn (cheap_bind b) (arityType cheap_fn e)
  where
    cheap_bind (NonRec b e) = is_cheap (b,e)
    cheap_bind (Rec prs)    = all is_cheap prs
    is_cheap (b,e)          = cheap_fn e (Just (idType b))

arityType cheap_fn (Note n e)
  | notSccNote n = arityType cheap_fn e

arityType cheap_fn (Cast e _) = arityType cheap_fn e

arityType _ _ = vanillaArityType
\end{code}
%************************************************************************
%* *
The main etaexpander
%* *
%************************************************************************
We go for:
f = \x1..xn -> N ==> f = \x1..xn y1..ym -> N y1..ym
(n >= 0)
where (in both cases)
* The xi can include type variables
* The yi are all value variables
* N is a NORMAL FORM (i.e. no redexes anywhere)
wanting a suitable number of extra args.
The biggest reason for doing this is for cases like
f = \x -> case x of
True -> \y -> e1
False -> \y -> e2
Here we want to get the lambdas together.  A good example is the nofib
program fibheaps, which gets 25% more allocation if you don't do this
eta-expansion.
We may have to sandwich some coerces between the lambdas
to make the types work. exprEtaExpandArity looks through coerces
when computing arity; and etaExpand adds the coerces as necessary when
actually computing the expansion.
Note [No crap in etaexpanded code]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The eta expander is careful not to introduce "crap". In particular,
given a CoreExpr satisfying the 'CpeRhs' invariant (in CorePrep), it
returns a CoreExpr satisfying the same invariant. See Note [Eta
expansion and the CorePrep invariants] in CorePrep.
This means the eta-expander has to do a bit of on-the-fly
simplification but it's not too hard.  The alternative, of relying on
a subsequent cleanup phase of the Simplifier to decrapify the result,
means you can't really use it in CorePrep, which is painful.
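
For instance (an illustrative sketch of what avoiding "crap" means): instead of
producing an application whose function part is a case expression,

    \eta. (case x of { A -> f; B -> g }) eta

the expander pushes the new application into the branches:

    etaExpand 1 (case x of { A -> f; B -> g })
      ==>  \eta -> case x of { A -> f eta; B -> g eta }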
Note [Eta expansion and SCCs]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Note that SCCs are not treated specially by etaExpand. If we have
etaExpand 2 (\x -> scc "foo" e)
= (\xy -> (scc "foo" e) y)
So the costs of evaluating 'e' (not 'e y') are attributed to "foo"
\begin{code}
etaExpand :: Arity      -- ^ Result should have this number of value lambdas at the top
          -> CoreExpr   -- ^ Expression to expand
          -> CoreExpr
etaExpand n orig_expr
  = go n orig_expr
  where
      -- Strip off existing lambdas and casts
      -- Note [Eta expansion and SCCs]
    go 0 expr = expr
    go n (Lam v body) | isTyCoVar v = Lam v (go n body)
                      | otherwise   = Lam v (go (n-1) body)
    go n (Cast expr co) = Cast (go n expr) co
    go n expr           = etaInfoAbs etas (etaInfoApp subst' expr etas)
      where
        in_scope          = mkInScopeSet (exprFreeVars expr)
        (in_scope', etas) = mkEtaWW n in_scope (exprType expr)
        subst'            = mkEmptySubst in_scope'
-- An EtaInfo tells the expander what to do at each step:
--   EtaVar v:  \v. []        and  [] v
--   EtaCo co:  [] |> sym co  and  [] |> co
-- (etaInfoAbs builds the first form, etaInfoApp the second)

data EtaInfo = EtaVar Var
             | EtaCo  Coercion

instance Outputable EtaInfo where
   ppr (EtaVar v)  = ptext (sLit "EtaVar") <+> ppr v
   ppr (EtaCo co)  = ptext (sLit "EtaCo")  <+> ppr co
pushCoercion :: Coercion -> [EtaInfo] -> [EtaInfo]
pushCoercion co1 (EtaCo co2 : eis)
  | isIdentityCoercion co = eis
  | otherwise             = EtaCo co : eis
  where
    co = co1 `mkTransCoercion` co2

pushCoercion co eis = EtaCo co : eis
etaInfoAbs :: [EtaInfo] -> CoreExpr -> CoreExpr
etaInfoAbs [] expr = expr
etaInfoAbs (EtaVar v : eis) expr = Lam v (etaInfoAbs eis expr)
etaInfoAbs (EtaCo co : eis) expr = Cast (etaInfoAbs eis expr) (mkSymCoercion co)
etaInfoApp :: Subst -> CoreExpr -> [EtaInfo] -> CoreExpr
-- (etaInfoApp s e eis) returns something equivalent to
-- (the substituted e) applied to eis

etaInfoApp subst (Lam v1 e) (EtaVar v2 : eis)
  = etaInfoApp subst' e eis
  where
    subst' | isTyCoVar v1 = CoreSubst.extendTvSubst subst v1 (mkTyVarTy v2)
           | otherwise    = CoreSubst.extendIdSubst subst v1 (Var v2)

etaInfoApp subst (Cast e co1) eis
  = etaInfoApp subst e (pushCoercion co' eis)
  where
    co' = CoreSubst.substTy subst co1

etaInfoApp subst (Case e b _ alts) eis
  = Case (subst_expr subst e) b1 (coreAltsType alts') alts'
  where
    (subst1, b1) = substBndr subst b
    alts' = map subst_alt alts
    subst_alt (con, bs, rhs) = (con, bs', etaInfoApp subst2 rhs eis)
      where
        (subst2,bs') = substBndrs subst1 bs

etaInfoApp subst (Let b e) eis
  = Let b' (etaInfoApp subst' e eis)
  where
    (subst', b') = subst_bind subst b

etaInfoApp subst (Note note e) eis
  = Note note (etaInfoApp subst e eis)

etaInfoApp subst e eis
  = go (subst_expr subst e) eis
  where
    go e []               = e
    go e (EtaVar v : eis) = go (App e (varToCoreExpr v)) eis
    go e (EtaCo co : eis) = go (Cast e co) eis
mkEtaWW :: Arity -> InScopeSet -> Type
        -> (InScopeSet, [EtaInfo])
-- (mkEtaWW n in_scope ty) gives the [EtaInfo] recipe for eta-expanding an
-- expression of type 'ty' to arity n: fresh EtaVar binders (not free in the
-- expression, thanks to the InScopeSet), plus EtaCo casts wherever a newtype
-- must be unwrapped to expose a function type.
-- The returned InScopeSet includes the fresh variables.

mkEtaWW orig_n in_scope orig_ty
  = go orig_n empty_subst orig_ty []
  where
    empty_subst = mkTvSubst in_scope emptyTvSubstEnv

    go n subst ty eis
       | n == 0
       = (getTvInScope subst, reverse eis)

       | Just (tv,ty') <- splitForAllTy_maybe ty
       , let (subst', tv') = substTyVarBndr subst tv
       = go n subst' ty' (EtaVar tv' : eis)

       | Just (arg_ty, res_ty) <- splitFunTy_maybe ty
       , let (subst', eta_id') = freshEtaId n subst arg_ty
       = go (n-1) subst' res_ty (EtaVar eta_id' : eis)

       | Just(ty',co) <- splitNewTypeRepCo_maybe ty
       =        -- See Note [Newtype arity]
         go n subst ty' (EtaCo (Type.substTy subst co) : eis)

       | otherwise      -- We have an expression of arity > 0,
                        -- but its type isn't decomposable any further
       = WARN( True, ppr orig_n <+> ppr orig_ty )
         (getTvInScope subst, reverse eis)
subst_expr :: Subst -> CoreExpr -> CoreExpr
subst_expr = substExprSC (text "CoreArity:substExpr")
subst_bind :: Subst -> CoreBind -> (Subst, CoreBind)
subst_bind = substBindSC
freshEtaId :: Int -> TvSubst -> Type -> (TvSubst, Id)
-- Make a fresh Id, with the specified (substituted) type.
-- It is "fresh" in the sense that it is not in the in-scope set of
-- the TvSubst, and it is added to that in-scope set in the result.
freshEtaId n subst ty
  = (subst', eta_id')
  where
    ty'     = Type.substTy subst ty
    eta_id' = uniqAway (getTvInScope subst) $
              mkSysLocal (fsLit "eta") (mkBuiltinUnique n) ty'
    subst'  = extendTvInScope subst eta_id'
\end{code}