SinhaLab · iamciera · Oct 14, 2014 · Oct 14, 2014 · Oct 14, 2014 · Jun 25, 2015
diff --git a/Histology/stainingGUS.md b/Histology/stainingGUS.md
@@ -0,0 +1,70 @@
+#GUS Staining Protocol
+*Written by Ciera Martinez*
+
+This protocol combines Julie Kang's protocol with Yasu Ichihashi's.  Although it works with many plant tissue types, this particular protocol was used on *Solanum lycopersicum* leaf primordia and apices.
+##Protocol
+
+1. Prepare Solutions (see below)
+2. Dissect leaf tissue into 90% acetone on ice, incubate for 15 minutes. 
+3. Wash the sample with NaH<sub>2</sub>PO<sub>4</sub> solution.
+4. Incubate in GUS staining solution for 13 hours / overnight in 37&ordm;C vacuum.
+5. Place sample in 70% ethanol for at least 5 min. *Optional stopping point, keep at 4&ordm;C*
+6. Place sample in 6:1 ethanol/acetic acid for at least 1 hr.
+7. Clearing
+
+</br>
+</br>
+</br>
+</br>
+</br>
+</br>
+
+##Preparation of Solutions
+
+###1. GUS Staining Buffer
+
+50 mL 0.1M NaPO<sub>4</sub>
+1 mL DMSO
+1 mL Triton X-100
+2 mL 0.5M EDTA
+47 mL H<sub>2</sub>O
+
+###2. Phosphate buffers:
+
+**Stock Solution**
+
+A. NaH<sub>2</sub>PO<sub>4</sub>&middot;H<sub>2</sub>O (monobasic, monohydrate), 0.2M = 27.59 g/L (13.79g per 500mL H<sub>2</sub>O)
+B. Na<sub>2</sub>HPO<sub>4</sub> (dibasic, anhydrous), 0.2M = 28.39 g/L (14.19g per 500mL H<sub>2</sub>O)
+
+**Working Buffer : 0.1M** 
+
+Combine
+39mL NaH<sub>2</sub>PO<sub>4</sub> stock solution
+61mL Na<sub>2</sub>HPO<sub>4</sub> stock solution 
+100 mL H<sub>2</sub>O
+
+###3. Sensitive Solutions
+
+*Dilute and add GUS right before using and cover solutions with aluminum foil*
+
+a. Potassium (K3) Ferricyanide = 0.0167g / 1mL H<sub>2</sub>O
+b. Potassium (K2) Ferrocyanide  = 0.02112g / 1mL H<sub>2</sub>O
+c. X-gluc = 0.005g/ 100&mu;L  DMSO
+
+###GUS Staining Solution 
+
+The amounts below will depend on the tissue and GUS construct under investigation.  There is room for optimization.
+
+10 mL GUS staining buffer
+300 &mu;L K3
+300 &mu;L K2
+192 &mu;L X-gluc
+
+
+
+
+
+
+
+
+
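As a rough check on the GUS staining solution recipe in `stainingGUS.md`, the final concentration of each sensitive component can be worked out from the stock concentrations and volumes listed there. The sketch below is only illustrative: it assumes the X-gluc volume is 192 µL, and the molar masses are assumptions (anhydrous potassium ferricyanide and potassium ferrocyanide trihydrate), since the protocol does not say which salts were weighed. No molarity is computed for X-gluc because its salt form is not stated.

```python
# Stock concentrations (mg/mL) and volumes (mL) copied from the protocol.
STOCKS_MG_PER_ML = {
    "ferricyanide": 16.7,   # 0.0167 g per 1 mL water
    "ferrocyanide": 21.12,  # 0.02112 g per 1 mL water
    "x_gluc": 50.0,         # 0.005 g per 100 uL DMSO
}
VOLUMES_ML = {
    "buffer": 10.0,
    "ferricyanide": 0.300,
    "ferrocyanide": 0.300,
    "x_gluc": 0.192,        # assumes the recipe means 192 uL
}
# Assumed molar masses (g/mol); the protocol does not name the exact salts.
MOLAR_MASS = {
    "ferricyanide": 329.24,  # K3[Fe(CN)6], anhydrous (assumption)
    "ferrocyanide": 422.39,  # K4[Fe(CN)6] trihydrate (assumption)
}


def final_mg_per_ml(name):
    """Concentration of one component after everything is combined."""
    total_volume = sum(VOLUMES_ML.values())
    return STOCKS_MG_PER_ML[name] * VOLUMES_ML[name] / total_volume


def final_millimolar(name):
    """mg/mL equals g/L; dividing by g/mol gives mol/L, times 1000 gives mM."""
    return final_mg_per_ml(name) / MOLAR_MASS[name] * 1000.0


for component in STOCKS_MG_PER_ML:
    line = f"{component}: {final_mg_per_ml(component):.3f} mg/mL"
    if component in MOLAR_MASS:
        line += f" ({final_millimolar(component):.2f} mM)"
    print(line)
```

Under these assumptions both cyanide salts end up at roughly 1.4 mM, which gives a starting point when optimizing for a different tissue or construct.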
diff --git a/Molecular/PCR.cleanup.md b/Molecular/PCR.cleanup.md
@@ -0,0 +1,10 @@
+#PCR Clean Up
+
+This is a fast protocol that cleans up PCR products with Ampure beads.
+
+1. Add 17.6 &mu;L Ampure beads and pipet to mix. 
+2. Incubate 5 minutes @ room temp.
+3. Place on the magnetic strip and let sit until clear (about 1 min).
+4. Wash once with 200 &mu;L 75% EtOH. 
+5. Remove EtOH and let dry for 5 minutes.
+6. Elute in 15 &mu;L DI water. 
diff --git a/README.md b/README.md
@@ -3,7 +3,7 @@ Scripts-and-Protocols
 
 Sinha lab community scripts and protocols. 
 
-Contibute
+Contribute
 ---------
 
 You do not need to be part of the Sinha lab to upload protocols. 

diff --git a/RNAseq/Instructions/BWA.md b/RNAseq/Instructions/BWA.md
@@ -27,27 +27,27 @@ Samtools: [http://samtools.sourceforge.net/samtools.shtml](http://samtools.sourc
 
     There should now be five new versions of the file (.amb, .ann, .bwt, .pac, sa). 
 
-2. Organize your files so that they are named how you would like them.  In my case, I am going to name each barcode.    For now I will just cancatfiles with the same library and rep all in one .fq file.  In order to name the libraries in order of their barcode, I wrote a script below:
-
-    #!/usr/bin/env python
-    #renameFiles.py
-    #Ciera Martinez 
-    #This script takes in a .csv file as a key on how to rename files in current directory.  
-
-    import os
-    import csv
-
-    #open file
-    with open('lane1Key.csv','rU') as csvfile:
-            reader = csv.reader(csvfile, delimiter = ',')
-            mydict = {rows[1]:rows[0] for rows in reader}
-
-    # renaming
-    for fileName in os.listdir( '.' ):
-        newName = mydict.get(fileName) if mydict.get(fileName) else "empty" #can not read 'typeNone' from the keys that do not have matching files.
-        list(newName)
-        #print newName
-        os.rename(fileName, newName)
+2. Organize your files so that they are named how you would like them.  In my case, I am going to name each file by its barcode.  For now I will just concatenate files with the same library and rep into one .fq file.  To name the libraries in order of their barcode, I wrote the script below:
+
+        #!/usr/bin/env python
+        #renameFiles.py
+        #Ciera Martinez 
+        #This script takes in a .csv file as a key on how to rename files in current directory.  
+
+        import os
+        import csv
+
+        #open file
+        with open('lane1Key.csv','rU') as csvfile:
+                reader = csv.reader(csvfile, delimiter = ',')
+                mydict = {rows[1]:rows[0] for rows in reader}
+
+        # renaming
+        for fileName in os.listdir( '.' ):
+            newName = mydict.get(fileName) #None when the file has no entry in the key
+            if not newName: #skip unmatched files instead of renaming them all to "empty"
+                continue
+            os.rename(fileName, newName)
 
 4.  Now I need to concatenate all the reads from each specific library/rep into one file ie combine the lanes. I used this shell command to accomplish this.  I first make a key folder that contained empty names with all possible libraries.  
 

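The renaming step in `BWA.md` can also be sketched as a small reusable function. This is an illustrative Python 3 version, not the script from the patch: the function names `load_key` and `rename_from_key` are hypothetical, and it assumes the same key layout as `lane1Key.csv` (new name in column 1, current file name in column 2). Files without an entry in the key are left alone, and an existing file is never overwritten.

```python
import csv
import os


def load_key(key_csv):
    """Map current file name (column 2) to new file name (column 1)."""
    mapping = {}
    with open(key_csv, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) >= 2 and row[0] and row[1]:
                mapping[row[1]] = row[0]
    return mapping


def rename_from_key(key_csv, directory="."):
    """Rename only files listed in the key; return the (old, new) pairs applied."""
    mapping = load_key(key_csv)
    renamed = []
    for file_name in sorted(os.listdir(directory)):
        new_name = mapping.get(file_name)
        if new_name is None:
            continue  # no entry in the key: leave the file untouched
        target = os.path.join(directory, new_name)
        if os.path.exists(target):
            continue  # never overwrite an existing file
        os.rename(os.path.join(directory, file_name), target)
        renamed.append((file_name, new_name))
    return renamed
```

Calling `rename_from_key("lane1Key.csv")` from the directory holding the reads returns the list of renames that were applied, which is handy to keep as a log.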
diff --git a/RNAseq/Instructions/iplant.FTP.SSH.md b/RNAseq/Instructions/iplant.FTP.SSH.md
@@ -1,6 +1,6 @@
 # Guide to Working in iPlant
 
-##Overview
+## Overview
 
 These are the steps you must take *before* you can begin to process your data.
 
@@ -9,13 +9,13 @@ These are the steps you must take *before* you can begin to process your data.
 3.  Mount iplant Volume
 4.  Mount IRODS Volume (for data storage)
 
-#Iplant Atmosphere
+# Iplant Atmosphere
 
 Sign up for iplant using an educational e-mail.  After logging in go to Atmosphere and start an instance, in this case I used Maloof08. 
 
 Can take up to 30 min. You will get an email verifying that your instance is up and running.  Then you can proceed.
 
-##SSH connection 
+## SSH connection 
 
 There are two ways in which you can interact with your iplant enviroment. 1. Command line on your own desktop. This is faster, especially if you are familiar with command line.  I also like using this option because I have more control over my terminal appearance and keyboard shortcuts 2. The other option and more frequently used option is  to have a virtual desktop running through VNC viewer.  Overall it is easier to use the VNC viewer, mostly because you can allow programs to run without worrying about disconnecting ssh, which can stop longer programs from running. 
 
@@ -37,7 +37,7 @@ They will ask for your iplant password.
 
 Now you are remotely connected to your iplant instance.  Use normal Unix commands to navigate your instance file directory system. 
 
-##To transfer files between your computer to your iplant instance
+## To transfer files between your computer to your iplant instance
 
 To transfer files you simply use [`scp`](http://linux.die.net/man/1/scp).  
 
@@ -53,9 +53,11 @@ or if you need to do it from server to local do the opposite.
 
     scp [email protected]:~/Desktop/ /Users/iamciera/Desktop/RNAseqAnalysis/sinhaLab/Barcode-tools-3.2.tgz 
 
-##How to Attach extra space to instance
+## How to Attach extra space to instance
 
-##Volumes 
+## Volumes 
+
+You can think of a volume as a hard drive you attach to your computing environment. Iplant will allow you to have a certain amount of space.  The great part about volumes is that you can move the volume, with your data, from one instance to another by attaching and detaching. 
 
 [How to attach a volume](https://pods.iplantcollaborative.org/wiki/display/atmman/Attaching+a+Volume+to+an+Instance)
 
@@ -83,7 +85,7 @@ To quit
 
     command + D
 
-##IRODs ("Unlimited GB")
+## IRODs ("Unlimited GB")
 
 IRODS is where you want to backup everything.  It is a good idea to back up the raw files right away. IRODS is another file directory in which you have access to.  You mount IRODS similarly to how you would mount an iplant volume, but you access it differently, through Icommands, which is basically regular unix commands with the an "i" in front.  
 
@@ -93,7 +95,7 @@ In order to use IRODS there are two steps.
 
 [Using Icommands](https://pods.iplantcollaborative.org/wiki/display/start/Using+icommands)
 
-###Uploading multiple files or a directory (with recursion)
+### Uploading multiple files or a directory (with recursion)
 
     iput -P -V -b -r -T -X <checkpoint-file> --lfrestart <checkpoint-lf-file>  localDirectory dataDirectory
 
@@ -107,7 +109,7 @@ If you want to get files from irods simply use iget in a similar way. For exampl
 
 
 
-##FTP download of Berkeley files 
+## FTP download of Berkeley files 
 
 [*MAC FTP tutorial*](http://www.maclife.com/article/howtos/how_use_ftp_through_command_line_mac_os_x)
 
@@ -150,11 +152,11 @@ To quit the FTP connection
 
 mget * ~/Desktop
 
-##Permission settings
+## Permission settings
 
     sudo chown iamciera /home/iamciera/lcm
 
-##Running and Interacting with Processes
+## Running and Interacting with Processes
 From Vince's Book
 
 To run a program in the background include ampersand in the background.
@@ -180,7 +182,7 @@ Place in Background. To do this, we need to suspend the process, and then use th
     $ bg
     [1]+ program1 input.txt > results.txt
 
-##Running Programs where disconnecting ssh could happen
+## Running Programs where disconnecting ssh could happen
 
     disown -h a %job #maintain ownership until you disconnect.  This only works when the job is running in the background.
 
@@ -202,7 +204,7 @@ First you have to make standard error and standard out put files, then you can s
 
 Cody suggested I use [GNU Screen](http://www.gnu.org/software/screen/).  But I haven't looked into that just yet.
 
-##Basic Unix and Tools
+## Basic Unix and Tools
 
     ls -l -h #list long human readible
 
@@ -228,7 +230,7 @@ Yes. When running a command add time to the end.
 I need to seriously figure out bin and usr folders. 
 [usr_bin](http://www.linfo.org/usr_bin.html)
 
-##Permissions
+## Permissions
 
 In order to change the permission of an entire directory use chown. In the example below we are allowing to change owner recursively through all sub directories to the owner iamciera of the directory Data.
 

diff --git a/RNAseq/Instructions/preprocessing.md b/RNAseq/Instructions/preprocessing.md
@@ -57,7 +57,7 @@ Remove adapter contamination sequences
 
 	$ python adapterEffectRemover.py 41 Nremoved.fq AdaptersRemoved.fq b #10 min
 
-	$ python ~/lcm/scripts/barcoded_data_toolbox/adapterEffectRemover.py 41 Nremoved.fq AdaptersRemoved.fq b #ciera 
+	$ python adapterEffectRemover.py 41 Nremoved.fq AdaptersRemoved.fq b #cieras specifics 
 
 FastQC on AdaptersRemoved.fq file.
 
@@ -87,16 +87,16 @@ Remove the portion of the reads containing the barcode so that the reads can be
 
 	$ for n in ./*.fq; do ~/lcm/scripts/bin/fastx_trimmer -f 5 -Q 33 -i $n -o ./BCRemoved/$n; done
 
-##2. Mike Covington 
+## 2. Mike Covington 
 
-###Mike's way
+### Mike's way
 
 You first need to switch the  columns of your BCfile.txt, because mike's program needs them a different way. Ie.
 
 ATAGG	Barcode1	
 GCTAT	Barcode2	
 
-Copy the BCfile so you don't aucutally fuck it up.
+Copy the BCfile so you don't actually mess up the original.
 
 	cp BCfile1.txt BCfiletest.txt
 

diff --git a/RNAseq/scripts/sam2counts.R b/RNAseq/scripts/sam2counts.R
@@ -0,0 +1,61 @@
+##R script to obtain counts per transcript when reads have been mapped to cDNAs
+
+##searches working directory for .sam files and summarizes them.
+
+#BEFORE running the lines below change the working directory to 
+#the directory with sam files.  If your files end in something
+#other than ".sam" change the command below
+
+#get a list of sam files in the working directory
+#the "$" denotes the end of line
+files <- list.files(pattern="\\.sam$")
+
+#look at files to make sure it is OK
+print(files)
+
+#create an empty object to hold our results
+results <- NULL
+
+#loop through each file...
+for (f in files) {
+  print(f) #print the current file
+
+  #read the file.  We only care about the third column.
+  #also discard the header info (rows starting with "@")
+  tmp <- scan(f,what=list(NULL,NULL,""),
+            comment.char="@",sep="\t",flush=T)[[3]]
+
+  #use table() to count the occurrences of each gene.
+  #convert to a data frame for easier merging later
+  tmp.table <- as.data.frame(table(tmp))
+  colnames(tmp.table) <- c("gene",f) #get column name specified
+  #not needed, in fact a mistake, I think. 
+  #tmp.table$gene <- rownames(tmp.table)
+
+  #add current results to previous results table, if appropriate
+  if (is.null(results)) { #first time through
+    results <- as.data.frame(tmp.table) #format
+    } else { #not first time through
+      results<-merge(results,tmp.table,all=T,
+                     by="gene") #combine results
+      #rownames(results) <- results$Row.names #reset rownames for next time through
+    } #else
+  } #for
+  rm(list=c("tmp","tmp.table")) #remove objects no longer needed
+
+#summarize mapped and unmapped reads:
+print("unmapped")
+unmapped <- results[results$gene=="*",-1]
+unmapped
+results.map <- results[results$gene!="*",]
+print("mapped")
+mapped <- apply(results.map[-1],2,sum,na.rm=T)
+mapped
+print("percent mapped")
+round(mapped/(mapped+unmapped)*100,1)
+
+
+write.table(results.map,file="sam2countsResults.tsv",sep="\t",row.names=F)
+
+
+
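The counting idea in `sam2counts.R` (tally column 3 of every alignment line, then separate `*` from mapped references) can be sketched in Python for a single file. This is an illustrative equivalent, not part of the patch; the function names `count_references` and `summarize` are hypothetical. Like the R script, it counts every alignment line, so secondary alignments are included if the aligner reports them.

```python
from collections import Counter


def count_references(sam_path):
    """Count alignment lines per reference name (SAM column 3)."""
    counts = Counter()
    with open(sam_path) as handle:
        for line in handle:
            if line.startswith("@"):
                continue  # header line
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 3:
                counts[fields[2]] += 1
    return counts


def summarize(counts):
    """Return (mapped, unmapped, percent mapped); '*' marks unmapped reads."""
    unmapped = counts.get("*", 0)
    mapped = sum(n for ref, n in counts.items() if ref != "*")
    total = mapped + unmapped
    percent = round(100.0 * mapped / total, 1) if total else 0.0
    return mapped, unmapped, percent
```

Looping `count_references` over every `.sam` file in a directory and joining the per-file counters on reference name gives the same kind of table the R script writes to `sam2countsResults.tsv`.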