Instructions

Description

The variable function analysis performs a statistical test (Mann-Whitney U) on a given set of data.

Data Input

It is easiest to copy and paste data directly from Excel, as it will insert the tab characters for you.

The data must be in the form "VAR1 VAR2 VAR3 VALUE", where the gaps are TABS. Any number of VARs is allowed. Each VAR can be anything: single base pairs of a gene (e.g. "A C T G 0.4"), amino acids, allele variants, etc. The VALUE must be the last item on each line and must be greater than 0 and less than 1. It is a measure of the function of that sequence of VARs as a whole.
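The format rules above can be checked before submitting. The following is a minimal sketch, not the tool's own code; the function name parse_rows is hypothetical, and it assumes blank lines may be skipped and that any line breaking the rules should raise an error.

```python
def parse_rows(text):
    """Parse lines of the form VAR1<TAB>VAR2<TAB>...<TAB>VALUE.

    Returns a list of (variables, value) tuples.
    Hypothetical helper; assumes blank lines can be ignored.
    """
    rows = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(
                f"line {line_number}: expected at least one VAR and a VALUE"
            )
        *variables, raw_value = fields
        value = float(raw_value)  # raises ValueError if VALUE is not numeric
        if not 0 < value < 1:
            raise ValueError(
                f"line {line_number}: VALUE {value} is not between 0 and 1"
            )
        rows.append((variables, value))
    return rows
```

For example, parse_rows("A\tC\tT\tG\t0.4") yields one row with the variables A, C, T, G and the value 0.4.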

Note: the Mann-Whitney U test will use approximate p-values for any test where both in-category and out-of-category sample sizes are greater than 8. It will also use approximate p-values if any pair of sample values match exactly (ties). Whether each test was done with the exact or approximate p-value method is noted in the rightmost column.
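The rule in the note can be written as a small decision function. This is a sketch of the rule as stated, not the tool's implementation; the name p_value_method is hypothetical, and it assumes that a tie means any two equal values anywhere in the combined in-category and out-of-category samples.

```python
def p_value_method(in_category, out_of_category):
    """Return "approximate" or "exact" following the rule in the note.

    Assumption: ties are checked across the combined sample.
    """
    # Approximate when BOTH sample sizes are greater than 8.
    both_large = len(in_category) > 8 and len(out_of_category) > 8
    # Approximate when any pair of values matches exactly.
    combined = list(in_category) + list(out_of_category)
    has_ties = len(set(combined)) < len(combined)
    return "approximate" if both_large or has_ties else "exact"
```

With 9 in-category values and 3 out-of-category values and no ties, this returns "exact", because only one of the two samples is larger than 8.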

Example

Say you are trying to determine which HLA types contribute to higher CD4 counts. The variables you record per individual are:

  1. A1
  2. A2
  3. B1
  4. B2
  5. C1
  6. C2

For each individual's set of these variables (A1 to C2) you also have a function value, which measures the ratio of CD4 compared to a normal healthy cell. These individuals are infected with HIV.

A1          A2          B1          B2          C1          C2          CD4 ratio
A02:26      A03:01:01G  B07:02:01G  B40:01:01G  C03:04:01G  C07:02:01G  0.3
A01:01:01G  A02:01:01G  B08:01:01G  B15:01:01G  C03:04:01G  C07:01:01G  0.7
A01:01:01G  A02:01:01G  B08:01:01G  B57:01:01G  C06:02:01G  C07:01:01G  0.8
A02:01:01G  A03:01:01G  B14:02:01   B15:34      C03:04:01G  C08:02:01   0.3
A02:01:01G  A24:03:01G  B38:01:01   B51:01:01G  C12:03:01G  C14:02:01   0.45
A02:01:01G  A02:01:01G  B14:02:01   B40:01:01G  C03:04:01G  C08:02:01   0.3
A01:01:01G  A01:01:01G  B08:01:01G  B57:01:01G  C06:02:01G  C07:01:01G  0.75
A11:01:01G  A23:01:01G  B07:02:01G  B51:01:01G  C04:01:01G  C15:02:01G  0.2
A01:01:01G  A03:01:01G  B27:05:02G  B57:01:01G  C01:02:01G  C06:02:01G  0.8
A01:01:01G  A02:01:01G  B08:01:01G  B44:02:01G  C02:02:02G  C07:01:01G  0.7
A01:01:01G  A11:01:01G  B08:01:01G  B35:01:01G  C04:01:01G  C07:64      0.9
A02:01:01G  A24:02:01G  B15:01:01G  B15:07:01G  C01:02:01G  C03:03:01G  0.4
A01:01:01G  A25:01:01G  B08:01:01G  B39:01:01G  C07:01:01G  C12:03:01G  0.6

You can copy and paste the table (not including the header row) into the text area and click submit to get medians, counts, and p-values. The output lists each category and assigns it a p-value based on whether the presence or absence of that variable makes a significant difference to the function.
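The per-category comparison can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual code: the function names are hypothetical, a category is assumed to be any distinct VAR value appearing anywhere in a row, and only counts, medians, and the Mann-Whitney U statistic are computed. The p-value step is left out; a library routine such as SciPy's scipy.stats.mannwhitneyu could supply it.

```python
from statistics import median


def mann_whitney_u(in_values, out_values):
    """U statistic for the in-category sample.

    Counts (in, out) pairs where the in-category value is larger;
    tied pairs count as one half.
    """
    u = 0.0
    for x in in_values:
        for y in out_values:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def summarize_categories(rows):
    """Summarize each category from a list of (variables, value) rows.

    Assumption: a row is "in" a category if the category appears in
    any of its VAR columns.
    """
    categories = sorted({v for variables, _ in rows for v in variables})
    summary = {}
    for category in categories:
        in_values = [val for variables, val in rows if category in variables]
        out_values = [val for variables, val in rows if category not in variables]
        if not out_values:
            continue  # nothing to compare against
        summary[category] = {
            "n_in": len(in_values),
            "n_out": len(out_values),
            "median_in": median(in_values),
            "median_out": median(out_values),
            "u": mann_whitney_u(in_values, out_values),
        }
    return summary
```

A U value near n_in * n_out means the in-category values are mostly higher than the out-of-category values; a value near 0 means they are mostly lower.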