LIBSHUFF Cautions and Tips

The smallest p-value reported by LIBSHUFF is 0.001. This occurs because only 1000 shuffles are performed, so the estimate cannot resolve anything below 1/1000.
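The arithmetic behind this floor can be sketched as follows. This is a minimal illustration, not LIBSHUFF's own code: it assumes the p-value is estimated as the fraction of shuffles whose statistic is at least as large as the observed one, and that a count of zero is reported as 1/N rather than 0. The function name `empirical_p` is hypothetical.

```python
def empirical_p(observed, shuffled_stats):
    """Monte Carlo p-value with a floor of 1/N (assumed convention)."""
    n_shuffles = len(shuffled_stats)
    # Count shuffles at least as extreme as the observed statistic.
    n_extreme = sum(1 for s in shuffled_stats if s >= observed)
    # With N shuffles the estimate moves in steps of 1/N, so nothing
    # smaller than 1/N can be reported.
    return max(n_extreme, 1) / n_shuffles


# No shuffle reaches the observed value, yet the reported p is 1/1000.
p_floor = empirical_p(10.0, [0.0] * 1000)
```

Under these assumptions, a reported p of 0.001 should be read as "p is at most about 0.001", since more shuffles would be needed to resolve anything smaller.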
In simulations, the sensitivity of LIBSHUFF increases with the number of sequences in the library. For instance, when the library size n is 50, introducing 10-20 novel sequences into one library is frequently sufficient for LIBSHUFF to distinguish between the libraries (p=0.05). When n = 100, introducing 10-20 novel sequences, which is a smaller fraction of the library, is also frequently sufficient (p=0.05).
LIBSHUFF is designed to work with undersampled libraries. When sampling is very high, i.e., the coverage curves are close to 1.0 at low values of D, LIBSHUFF tends to call all libraries the same because the homologous and heterologous coverage curves are very similar (thanks to Michelle Allen for pointing this out).
LIBSHUFF will always call identical libraries different. This happens because the heterologous coverage curve will be a straight line at C=1.0 for all values of D (thanks to Bill Miller for pointing this out).
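The flat heterologous curve can be reproduced with a toy example. The sketch below is not LIBSHUFF's implementation. It assumes coverage at distance D is 1 minus the fraction of sequences with no partner within D (another member of the same library for the homologous curve, any member of the other library for the heterologous curve), and it uses integers with absolute difference as a stand-in for sequences and evolutionary distance. All function and variable names are hypothetical.

```python
def homologous_coverage(lib, d, dist):
    """1 - (sequences in lib with no OTHER member of lib within d) / n."""
    singletons = sum(
        1
        for i, a in enumerate(lib)
        if not any(dist(a, b) <= d for j, b in enumerate(lib) if j != i)
    )
    return 1 - singletons / len(lib)


def heterologous_coverage(lib_x, lib_y, d, dist):
    """1 - (sequences in lib_x with no member of lib_y within d) / n."""
    unmatched = sum(
        1 for a in lib_x if not any(dist(a, b) <= d for b in lib_y)
    )
    return 1 - unmatched / len(lib_x)


def toy_dist(a, b):
    # Stand-in for an evolutionary distance between two sequences.
    return abs(a - b)


library = [0, 2, 10, 25, 40]      # toy "sequences"
cutoffs = [0, 1, 5, 10, 20, 50]   # toy values of D

# Compare a library against an identical copy of itself.
hetero = [heterologous_coverage(library, library, d, toy_dist) for d in cutoffs]
homo = [homologous_coverage(library, d, toy_dist) for d in cutoffs]
```

Every sequence finds its own copy in the other library at distance 0, so `hetero` stays at 1.0 for every cutoff, while `homo` starts below 1.0 at small D. Under these assumptions, that gap between the two curves is what leads to identical libraries being scored as different.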