CONTROL DEPENDENCIES
====================

A major difficulty with control dependencies is that current compilers
do not support them.  One purpose of this document is therefore to
help you prevent your compiler from breaking your code.  However,
control dependencies also pose other challenges, which leads to the
second purpose of this document, namely to help you to avoid breaking
your own code, even in the absence of help from your compiler.

One such challenge is that control dependencies order only later stores.
Therefore, a load-load control dependency will not preserve ordering
unless a read memory barrier is provided.  Consider the following code:

	q = READ_ONCE(a);
	if (q)
		p = READ_ONCE(b);

This is not guaranteed to provide any ordering because some types of CPUs
are permitted to predict the result of the load from "b".  This prediction
can cause other CPUs to see this load as having happened before the load
from "a".  This means that an explicit read barrier is required, for example
as follows:

	q = READ_ONCE(a);
	if (q) {
		smp_rmb();
		p = READ_ONCE(b);
	}

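For readers experimenting outside the kernel, the barrier version above
can be approximated in user space.  The following is only a sketch: it
assumes C11 <stdatomic.h>, it uses an acquire fence as a rough (and
somewhat stronger) stand-in for smp_rmb(), and the function name
read_b_if_a_set() is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared variables standing in for "a" and "b". */
static atomic_int a;
static atomic_int b;

/*
 * User-space approximation of the smp_rmb() example.  The acquire
 * fence keeps the load from "b" from being satisfied before the
 * load from "a".  Returns the value read from "b", or -1 if "a"
 * was zero and "b" was therefore never read.
 */
static int read_b_if_a_set(void)
{
	int q = atomic_load_explicit(&a, memory_order_relaxed);

	if (q) {
		/* Assumed stand-in for smp_rmb(). */
		atomic_thread_fence(memory_order_acquire);
		return atomic_load_explicit(&b, memory_order_relaxed);
	}
	return -1;
}
```

As in the kernel version, the ordering comes from the fence, not from
the "if" statement.
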
However, stores are not speculated.  This means that ordering is
(usually) guaranteed for load-store control dependencies, as in the
following example:

	q = READ_ONCE(a);
	if (q)
		WRITE_ONCE(b, 1);

Control dependencies can pair with each other and with other types
of ordering.  But please note that neither the READ_ONCE() nor the
WRITE_ONCE() is optional.  Without the READ_ONCE(), the compiler might
fuse the load from "a" with other loads.  Without the WRITE_ONCE(),
the compiler might fuse the store to "b" with other stores.  Worse yet,
the compiler might convert the store into a load and a check followed
by a store, and this compiler-generated load would not be ordered by
the control dependency.

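To try the load-store example outside the kernel, the two macros can be
approximated with volatile accesses.  This is a sketch under stated
assumptions: the macro definitions below are simplified, int-only
stand-ins rather than the kernel's definitions, and the function name
store_b_if_a_set() is hypothetical.

```c
#include <assert.h>

/*
 * Simplified user-space stand-ins for the kernel macros.
 * Assumption: "int" variables only.  The volatile access asks the
 * compiler to emit exactly one load or store, without fusing it
 * with neighboring accesses.
 */
#define READ_ONCE(x)		(*(const volatile int *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile int *)&(x) = (val))

static int a, b;	/* hypothetical shared variables */

/*
 * Load-store control dependency: the store to "b" is executed only
 * after the branch on the value loaded from "a" has been resolved.
 * Returns the value loaded from "a".
 */
static int store_b_if_a_set(void)
{
	int q = READ_ONCE(a);

	if (q)
		WRITE_ONCE(b, 1);
	return q;
}
```
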
Furthermore, if the compiler is able to prove that the value of variable
"a" is always non-zero, it would be well within its rights to optimize
the original example by eliminating the "if" statement as follows:

	q = a;
	b = 1;  /* BUG: Compiler and CPU can both reorder!!! */

So don't leave out either the READ_ONCE() or the WRITE_ONCE().
In particular, although READ_ONCE() does force the compiler to emit a
load, it does *not* force the compiler to actually use the loaded value.

It is tempting to try to use control dependencies to enforce ordering on
identical stores on both branches of the "if" statement as follows:

	q = READ_ONCE(a);
	if (q) {
		barrier();
		WRITE_ONCE(b, 1);
		do_something();
	} else {
		barrier();
		WRITE_ONCE(b, 1);
		do_something_else();
	}

Unfortunately, current compilers will transform this as follows at high
optimization levels:

	q = READ_ONCE(a);
	barrier();
	WRITE_ONCE(b, 1);  /* BUG: No ordering vs. load from a!!! */
	if (q) {
		/* WRITE_ONCE(b, 1); -- moved up, BUG!!! */
		do_something();
	} else {
		/* WRITE_ONCE(b, 1); -- moved up, BUG!!! */
		do_something_else();
	}

Now there is no conditional between the load from "a" and the store to
"b", which means that the CPU is within its rights to reorder them:  The
conditional is absolutely required, and must be present in the final
assembly code, after all of the compiler and link-time optimizations
have been applied.  Therefore, if you need ordering in this example,
you must use explicit memory ordering, for example, smp_store_release():

	q = READ_ONCE(a);
	if (q) {
		smp_store_release(&b, 1);
		do_something();
	} else {
		smp_store_release(&b, 1);
		do_something_else();
	}

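A user-space approximation of the smp_store_release() version might look
as follows.  This is a sketch only: it assumes that a C11 release store
is an acceptable stand-in for smp_store_release(), and the counters and
the bodies of do_something() and do_something_else() are hypothetical
placeholders.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int a, b;			/* hypothetical shared variables */
static int calls_then, calls_else;	/* hypothetical bookkeeping */

static void do_something(void) { calls_then++; }
static void do_something_else(void) { calls_else++; }

/*
 * Identical stores on both legs.  Even if the compiler hoists the
 * store above the branch, its release ordering still keeps the
 * earlier load from "a" ordered before the store to "b".
 */
static void store_b_ordered(void)
{
	int q = atomic_load_explicit(&a, memory_order_relaxed);

	if (q) {
		atomic_store_explicit(&b, 1, memory_order_release);
		do_something();
	} else {
		atomic_store_explicit(&b, 1, memory_order_release);
		do_something_else();
	}
}
```

The point of the design is that the ordering no longer depends on the
conditional surviving optimization.
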
Without explicit memory ordering, control-dependency-based ordering is
guaranteed only when the stores differ, for example:

	q = READ_ONCE(a);
	if (q) {
		WRITE_ONCE(b, 1);
		do_something();
	} else {
		WRITE_ONCE(b, 2);
		do_something_else();
	}

The initial READ_ONCE() is still required to prevent the compiler from
knowing too much about the value of "a".

But please note that you need to be careful what you do with the local
variable "q", otherwise the compiler might be able to guess the value
and again remove the conditional branch that is absolutely required to
preserve ordering.  For example:

	q = READ_ONCE(a);
	if (q % MAX) {
		WRITE_ONCE(b, 1);
		do_something();
	} else {
		WRITE_ONCE(b, 2);
		do_something_else();
	}

If MAX is compile-time defined to be 1, then the compiler knows that
(q % MAX) must be equal to zero, regardless of the value of "q".
The compiler is therefore within its rights to transform the above code
into the following:

	q = READ_ONCE(a);
	WRITE_ONCE(b, 2);
	do_something_else();

Given this transformation, the CPU is not required to respect the ordering
between the load from variable "a" and the store to variable "b".  It is
tempting to add a barrier(), but this does not help.  The conditional
is gone, and the barrier won't bring it back.  Therefore, if you need
to rely on control dependencies to produce this ordering, you should
make sure that MAX is greater than one, perhaps as follows:

	q = READ_ONCE(a);
	BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
	if (q % MAX) {
		WRITE_ONCE(b, 1);
		do_something();
	} else {
		WRITE_ONCE(b, 2);
		do_something_else();
	}

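The build-time check can be tried in user space as well.  The sketch
below makes several assumptions: BUILD_BUG_ON() is approximated with
C11 _Static_assert, READ_ONCE() and WRITE_ONCE() are simplified int-only
stand-ins, and the value of MAX and the function name
store_b_by_remainder() are hypothetical.

```c
#include <assert.h>

#define MAX 4	/* hypothetical value; the check below requires MAX > 1 */

/* User-space approximation of BUILD_BUG_ON(): break the build if cond holds. */
#define BUILD_BUG_ON(cond) \
	_Static_assert(!(cond), "BUILD_BUG_ON failed: " #cond)

/* Simplified int-only stand-ins for the kernel macros. */
#define READ_ONCE(x)		(*(const volatile int *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile int *)&(x) = (val))

static int a, b;	/* hypothetical shared variables */

static void store_b_by_remainder(void)
{
	int q = READ_ONCE(a);

	/* With MAX == 1 the branch below would be constant; refuse to build. */
	BUILD_BUG_ON(MAX <= 1);
	if (q % MAX)
		WRITE_ONCE(b, 1);
	else
		WRITE_ONCE(b, 2);
}
```

Changing MAX to 1 should make compilation fail rather than silently
producing code without the conditional.
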
Please note once again that each leg of the "if" statement absolutely
must store different values to "b".  As in previous examples, if the two
values were identical, the compiler could pull this store outside of the
"if" statement, destroying the control dependency's ordering properties.

You must also be careful to avoid relying too much on boolean short-circuit
evaluation.  Consider this example:

	q = READ_ONCE(a);
	if (q || 1 > 0)
		WRITE_ONCE(b, 1);

Because the first condition cannot fault and the second condition is
always true, the compiler can transform this example as follows, again
destroying the control dependency's ordering:

	q = READ_ONCE(a);
	WRITE_ONCE(b, 1);

This is yet another example showing the importance of preventing the
compiler from out-guessing your code.  Again, although READ_ONCE() really
does force the compiler to emit code for a given load, the compiler is
within its rights to discard the loaded value.

In addition, control dependencies apply only to the then-clause and
else-clause of the "if" statement in question.  In particular, they do
not necessarily order the code following the entire "if" statement:

	q = READ_ONCE(a);
	if (q) {
		WRITE_ONCE(b, 1);
	} else {
		WRITE_ONCE(b, 2);
	}
	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from "a". */

It is tempting to argue that there in fact is ordering because the
compiler cannot reorder volatile accesses and also cannot reorder
the writes to "b" with the condition.  Unfortunately for this line
of reasoning, the compiler might compile the two writes to "b" as
conditional-move instructions, as in this fanciful pseudo-assembly
language:

	ld r1,a
	cmp r1,$0
	cmov,ne r4,$1
	cmov,eq r4,$2
	st r4,b
	st $1,c

The control dependencies would then extend only to the pair of cmov
instructions and the store depending on them.  This means that a weakly
ordered CPU would have no dependency of any sort between the load from
"a" and the store to "c".  In short, control dependencies provide ordering
only to the stores in the then-clause and else-clause of the "if" statement
in question (including functions invoked by those two clauses), and not
to code following that "if" statement.


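If the store to "c" does need to be ordered after the load from "a",
explicit ordering is one option.  The following user-space sketch
assumes that a C11 release store is an acceptable stand-in for
smp_store_release(); the function name store_b_then_c() is hypothetical.
Note that C11 itself makes no control-dependency promise for the
relaxed stores to "b", so the sketch concerns only the store to "c".

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int a, b, c;	/* hypothetical shared variables */

static void store_b_then_c(void)
{
	int q = atomic_load_explicit(&a, memory_order_relaxed);

	if (q)
		atomic_store_explicit(&b, 1, memory_order_relaxed);
	else
		atomic_store_explicit(&b, 2, memory_order_relaxed);

	/*
	 * The release ordering, not the branch above, is what orders
	 * this store after the earlier load from "a".
	 */
	atomic_store_explicit(&c, 1, memory_order_release);
}
```
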
In summary:

  (*) Control dependencies can order prior loads against later stores.
      However, they do *not* guarantee any other sort of ordering:
      Not prior loads against later loads, nor prior stores against
      later anything.  If you need these other forms of ordering, use
      smp_load_acquire(), smp_store_release(), or, in the case of prior
      stores and later loads, smp_mb().

  (*) If both legs of the "if" statement contain identical stores to
      the same variable, then you must explicitly order those stores,
      either by preceding both of them with smp_mb() or by using
      smp_store_release().  Please note that it is *not* sufficient to use
      barrier() at beginning and end of each leg of the "if" statement
      because, as shown by the example above, optimizing compilers can
      destroy the control dependency while respecting the letter of the
      barrier() law.

  (*) Control dependencies require at least one run-time conditional
      between the prior load and the subsequent store, and this
      conditional must involve the prior load.  If the compiler is able
      to optimize the conditional away, it will have also optimized
      away the ordering.  Careful use of READ_ONCE() and WRITE_ONCE()
      can help to preserve the needed conditional.

  (*) Control dependencies require that the compiler avoid reordering the
      dependency into nonexistence.  Careful use of READ_ONCE() or
      atomic{,64}_read() can help to preserve your control dependency.

  (*) Control dependencies apply only to the then-clause and else-clause
      of the "if" statement containing the control dependency, including
      any functions that these two clauses call.  Control dependencies
      do *not* apply to code beyond the end of that "if" statement.

  (*) Control dependencies pair normally with other types of barriers.

  (*) Control dependencies do *not* provide multicopy atomicity.  If you
      need all the CPUs to agree on the ordering of a given store against
      all other accesses, use smp_mb().

  (*) Compilers do not understand control dependencies.  It is therefore
      your job to ensure that they do not break your code.