1. Prepare a file containing the PC5 scores for your sequences. See the example file for the expected format. ACC scores can be calculated in R as described below, or obtained with the "ACC-Mean transformation" program available on our web site.

2. This analysis uses the R statistical package and the pls library. Install them on your machine if you have not already done so.

3. Copy the input file prepared above to the directory where R is run. Make sure this directory contains the .RData file prepared for R.

4. Download the trained model file, PLS-ACC.Rdata, and copy it to the directory where the .RData file exists. If you do not have a .RData file in the directory where you want to work, simply start R by typing 'R'. Then load the model file (">" is the R prompt):

   > load("PLS-ACC.Rdata")

   This creates an object "Z" that holds the PLS model information.

5. Attach the required library, pls, as follows:

   > library(pls)

6. Read the input file into R. Depending on which input file you have, choose the appropriate command below:

   a. If you have a three-way PC score file (one line of scores per amino acid):

      > testpc5.dat = read.table("test.pc5dat", header=TRUE, blank.lines.skip=FALSE)

      If there is no header line, remove "header=TRUE".

   b. If you have an ACC file (one line per entry, without a header line):

      > test.acc = as.matrix(read.table("test.acc", row.names = 1))

      If the file does not contain entry IDs in the first column, remove "row.names = 1". The resulting test.acc matrix can be used directly for the PLS prediction shown in step 8.

7. To calculate the autocovariance (ACC) with a lag size of 30 using R:

   > testpc5.acc = by(testpc5.dat[,4:8], testpc5.dat$GI, acf, type="covariance", lag.max=30, plot=FALSE, na.action=na.pass)

   testpc5.dat[,4:8] specifies the column range of the 5 scores used for the ACC calculation. Change the numbers if necessary. testpc5.dat$GI specifies the column where the entry IDs are found. Change "GI" to the appropriate column name if necessary. If the file has no header line, refer to the column by number instead, as in testpc5.dat[1].

   To extract the autocovariance data and generate a matrix that can be used with PLS:

   > test.acc = t(sapply(testpc5.acc, function(x)x$acf))

8. Prediction by PLS is done as follows:

   > test.acc.predict = predict(Z, test.acc, comps=4, type="response")

9. To see the prediction results, simply type the object name:

   > test.acc.predict

   It should show the posterior probability for each entry. To save the results to a text file:

   > write.table(test.acc.predict, "test.acc.predict", quote=FALSE, col.names=FALSE, row.names=dimnames(test.acc.predict)[[1]])
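
As an illustration of the ACC calculation in step 7, the following is a minimal sketch on synthetic data, using only base R. The data frame demo.dat, the helper make_entry, and the column names Pos, AA, and PC1 to PC5 are assumptions made up for this example; only the "GI" column name and the 4:8 column range follow the commands above. It is meant to show the shape of the resulting matrix, not to reproduce real scores.

```r
# Synthetic stand-in for a PC score file: two entries with five score columns.
# All names except "GI" are hypothetical and chosen only for illustration.
set.seed(1)
make_entry <- function(id, n) {
  data.frame(GI = id, Pos = seq_len(n), AA = "A",
             matrix(rnorm(n * 5), ncol = 5,
                    dimnames = list(NULL, paste0("PC", 1:5))))
}
demo.dat <- rbind(make_entry("seq1", 40), make_entry("seq2", 50))

# Same call as in step 7: auto- and cross-covariances of columns 4 to 8,
# computed separately for each entry ID.
lag <- 30
demo.acf <- by(demo.dat[, 4:8], demo.dat$GI, acf, type = "covariance",
               lag.max = lag, plot = FALSE, na.action = na.pass)

# Flatten each entry's (lag + 1) x 5 x 5 array into one row per entry.
demo.acc <- t(sapply(demo.acf, function(x) x$acf))
dim(demo.acc)   # 2 rows, (30 + 1) * 5 * 5 = 775 columns
```

Note that acf() limits the lag to one less than the number of rows it receives, so an entry shorter than 31 residues would yield fewer values than the others, and sapply() would then no longer return a regular matrix. Both synthetic entries here are therefore longer than the lag size.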