In the Deep End: A Biology Blog: September 2015

I just got into grad school! I've finally settled in and am ready to write more blog posts.

Here is an hour-long project that I worked on. My friend has a huge data file on a dung beetle population, with several important measurements recorded daily. It looks something like this:

Figure 1: Fake Randomly Generated Data

So he comes up to me and says,

Him: "I want to average the 5 different variables"
Me: "Okay, like weekly? So like every 8 rows?"
Him: "Oh no, I have a data file of the dates that I want to serve as start-dates and end-dates.
So the length of time between each start-date and end-date will be different"

He then opens up another CSV file

Figure 2: Fake Dates

I want the data averaged from 1/1/1989 to 1/4/1989, 1/4/1989 to 1/5/1989, etc.

And then I threw a string of obscenities at him for making it harder. So basically, I ran a for-loop (don't judge me).

Steps (Code is below: NOTE THAT YOU MUST CHANGE CERTAIN THINGS FOR YOUR OWN SPECIFIC NEEDS)

1) I loaded the two files into R. One is the record file (Figure 2) and the other is the main data (Figure 1). My friend had some other variables in his record file, and I only wanted the dates, so I created a new variable, x, that holds just those.

2) I created an empty list so I can put newly averaged data into the list.

3) I ran a for-loop on the record data (x).

4) I used the "match" function to find the rows of the main data whose dates match the start-date and the end-date, and used those indices to subset the main data (from the start-date to the end-date)

5) I then averaged the columns of that subset

6) Because the averages come out in column form, I transposed them into a single row

7) I then column-bound the start-date, end-date, and the averages

8) I put this into the list

REPEAT

9) After the loop finished, I turned the list into a data.frame
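
Here is a minimal sketch of the steps above, run on made-up data rather than my friend's real files. The names maindata, record, Date, start, end, and var1 to var5 are all assumptions for illustration, so swap in your own. It also assumes every start-date and end-date appears exactly as written in the main data, and that both endpoints count toward the average.

```r
# Hypothetical stand-in for the main data (Figure 1): one row per day,
# five numeric variables. All names and values here are made up.
maindata <- data.frame(
  Date = paste0("1/", 1:10, "/1989"),
  var1 = 1:10,
  var2 = (1:10) * 2,
  var3 = (1:10) * 3,
  var4 = (1:10) * 4,
  var5 = (1:10) * 5,
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the record file (Figure 2).
record <- data.frame(
  start = c("1/1/1989", "1/4/1989", "1/5/1989"),
  end   = c("1/4/1989", "1/5/1989", "1/10/1989"),
  stringsAsFactors = FALSE
)

# Step 1: keep only the date columns.
x <- record[, c("start", "end")]

# Step 2: empty list to hold one averaged row per interval.
results <- vector("list", nrow(x))

# Step 3: loop over the intervals.
for (i in seq_len(nrow(x))) {
  # Step 4: row positions of the start-date and end-date in the main data.
  first <- match(x$start[i], maindata$Date)
  last  <- match(x$end[i], maindata$Date)

  # Skip intervals whose dates are not found in the main data.
  if (is.na(first) || is.na(last)) next

  # Subset the rows from start to end (inclusive), dropping the Date column.
  chunk <- maindata[first:last, -1]

  # Step 5: average each column.
  avgs <- colMeans(chunk)

  # Step 6: transpose the averages into a single row.
  avgs_row <- t(avgs)

  # Steps 7 and 8: bind the dates to the averages and store the row.
  dates <- data.frame(start = x$start[i], end = x$end[i],
                      stringsAsFactors = FALSE)
  results[[i]] <- cbind(dates, avgs_row)
}

# Step 9: stack the list into one data.frame.
averaged <- do.call(rbind, results)
print(averaged)
```

With this fake data, var1 for the first interval (1/1/1989 to 1/4/1989) is the mean of 1, 2, 3, and 4, which is 2.5. Because both endpoints are included, the row for 1/4/1989 feeds into the first and the second interval. If you don't want that, change first:last to suit.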



Friday, September 25, 2015

How to use information in one data frame to subset data in another (R Code)
