<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Tips on rdata.lu Blog | Data science with R</title>
    <link>/categories/tips/</link>
    <description>Recent content in Tips on rdata.lu Blog | Data science with R</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <copyright>Copyright (c) rdata.lu. All rights reserved. &lt;br&gt; Content reblogged by &lt;a href=&#39;https://www.r-bloggers.com/&#39; target=&#39;_blank&#39;&gt;R-bloggers&lt;/a&gt; &amp; &lt;a href=&#39;http://www.rweekly.org/&#39; target=&#39;_blank&#39;&gt;RWeekly&lt;/a&gt;</copyright>
    <lastBuildDate>Fri, 26 Jan 2018 00:00:00 +0000</lastBuildDate>
    
        <atom:link href="/categories/tips/index.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
      <title>Analysis of the Renert - Part 3: Visualizations</title>
      <link>/post/2018-01-26-analysis-of-the-renert-part-3/</link>
      <pubDate>Fri, 26 Jan 2018 00:00:00 +0000</pubDate>
      
      <guid>/post/2018-01-26-analysis-of-the-renert-part-3/</guid>
      <description>&lt;!--```{r, echo=FALSE}
knitr::include_graphics(&#34;/images/renert.jpg&#34;)
```--&gt;
&lt;p style=&#34;text-align:center&#34;&gt;
&lt;img src=&#34;/images/renert.jpg&#34; style=&#34;width:60vh; &#34;&gt;
&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This is part 3 of a 3 part blog post. This post uses the data that was scraped in part 1 and prepared in part 2.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Now that we have the data in a nice format, let’s make a frequency plot! First let’s load the data and the packages:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(&amp;quot;tidyverse&amp;quot;)
library(&amp;quot;ggthemes&amp;quot;) # To use different themes and colors
renert_tokenized = readRDS(&amp;quot;renert_tokenized.rds&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Using the &lt;code&gt;ggplot2&lt;/code&gt; package, I can produce a plot of the most frequent words.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;renert_tokenized %&amp;gt;%
  count(word, sort = TRUE) %&amp;gt;%
  filter(n &amp;gt; 50) %&amp;gt;%
  mutate(word = reorder(word, n)) %&amp;gt;%
  ggplot(aes(word, n)) +
  geom_col() +
  xlab(NULL) +
  coord_flip() +
  theme_minimal() &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2018-01-26-analysis-of-the-renert-part-3_files/figure-html/unnamed-chunk-3-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;So, the most frequent word is &lt;em&gt;kinnek&lt;/em&gt;, meaning &lt;em&gt;King&lt;/em&gt;! &lt;em&gt;kinnek&lt;/em&gt; is mentioned more times than &lt;em&gt;renert&lt;/em&gt;, the name of the hero. Next are &lt;em&gt;här&lt;/em&gt; and &lt;em&gt;wollef&lt;/em&gt; meaning &lt;em&gt;mister&lt;/em&gt; and &lt;em&gt;wolf&lt;/em&gt;. In fifth position we have &lt;em&gt;fuuss&lt;/em&gt;, for &lt;em&gt;fox&lt;/em&gt;. I’ll let you use Google Translate for the other words 😄.&lt;/p&gt;
&lt;p&gt;Now, I’m also doing sentiment analysis by using the AFINN list of words. This list of words have a score that gives its sentiment. You can download the original list from &lt;a href=&#34;http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Because such a list is not available in Luxembourguish, I have translated it using Google’s translate api. Here is the code to do that:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(&amp;quot;tidyverse&amp;quot;)
library(&amp;quot;translate&amp;quot;) # google translate api
library(&amp;quot;tidytext&amp;quot;) # to load the AFINN dictionary

api_key = &amp;quot;api_key_goes_here&amp;quot;

set.key(api_key)

afinn = get_sentiments(&amp;quot;afinn&amp;quot;)

# I wrap the `translate()` function around `purrr::possibly()` so that in case of an
# error, I get the translations that worked back.

possibly_translate = purrr::possibly(translate::translate, otherwise = &amp;quot;error&amp;quot;)

afinn_lux = afinn %&amp;gt;%
  mutate(lux = map(word, possibly_translate, source = &amp;quot;en&amp;quot;, target = &amp;quot;lb&amp;quot;)) %&amp;gt;%
  mutate(lux = unlist(lux))

write_csv(afinn_lux, &amp;quot;afinn_lux.csv&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For the above code to work, you need to have a Google cloud account, which you can create for free.&lt;/p&gt;
&lt;p&gt;I did not check the quality of the translations, and I’m sure it’s far from perfect. It’s also available on the Github repository &lt;a href=&#34;https://github.com/b-rodrigues/stopwords_lu&#34;&gt;here&lt;/a&gt;. Again, contributions more than welcome!&lt;/p&gt;
&lt;p&gt;Now, I need to merge the dictionary with the data from each song. First, let’s load the dictionary:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;afinn_lux = read.csv(&amp;quot;afinn_lux.csv&amp;quot;)

# I only keep the `lux` column (and rename it to word) and the `score column`
afinn_lux = afinn_lux %&amp;gt;%
  select(word = lux, score)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;How does this dictionary look like? Let’s see:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;head(afinn_lux)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##             word score
## 1       opzeginn    -2
## 2     verloossen    -2
## 3       opzeginn    -2
## 4 entfouert ginn    -2
## 5       entlooss    -2
## 6      entfouert    -2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s load the tokenized songs, and merge them with the dictionary:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;renert_songs_tokenized = readRDS(&amp;quot;renert_songs_tokenized.rds&amp;quot;)
  
renert_songs_sentiment = map(renert_songs_tokenized, ~full_join(., afinn_lux))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I can now merge the data in a single data frame and do some further cleaning:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;renert_songs_sentiment = renert_songs_sentiment %&amp;gt;%
  bind_rows() %&amp;gt;%
  filter(!is.na(score)) %&amp;gt;%
  filter(!is.na(gesank))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What does the final data look like? Here it is:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;head(renert_songs_sentiment)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 3
##   word  gesank  score
##   &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt;   &amp;lt;int&amp;gt;
## 1 rifft éischte    -2
## 2 léiw  éischte    -3
## 3 fest  éischte     2
## 4 fest  éischte     2
## 5 räich éischte     2
## 6 räich éischte     3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We see that there are words that are the same, but with different scores. That’s because the translation of the dictionary was most probably not very good. Oh well, let’s do a boxplot of the sentiment for each song:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;order =  c(&amp;quot;éischte&amp;quot;, &amp;quot;zwete&amp;quot;, &amp;quot;drëtte&amp;quot;, &amp;quot;véierte&amp;quot;, &amp;quot;fënnefte&amp;quot;, &amp;quot;sechste&amp;quot;, &amp;quot;siwente&amp;quot;,
                   &amp;quot;aachte&amp;quot;, &amp;quot;néngte&amp;quot;, &amp;quot;zéngte&amp;quot;, &amp;quot;elefte&amp;quot;, &amp;quot;zwielefte&amp;quot;, &amp;quot;dräizengte&amp;quot;, &amp;quot;véierzengte&amp;quot;)

renert_songs_sentiment %&amp;gt;%
  ggplot(aes(gesank, score)) + 
  scale_x_discrete(limits = order) + 
  geom_boxplot() + 
  theme_minimal() + 
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2018-01-26-analysis-of-the-renert-part-3_files/figure-html/unnamed-chunk-12-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;As we can see, there is no discernible pattern. This can mean two things; either the general sentiment inside each song is fairly neutral, or the the quality of the translation was too bad for the results to make any sense.&lt;/p&gt;
&lt;p&gt;That’s it for this series of posts! I hope you enjoyed reading it as much as I enjoyed writing it and analyzing the data!&lt;/p&gt;
&lt;p&gt;
Don’t hesitate to follow us on twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;citation&#34;&gt;@rdata_lu&lt;/span&gt;&lt;/a&gt; &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt; and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our youtube channel. &lt;br&gt; You can also contact us if you have any comments or suggestions. See you for the next post!
&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Analysis of the Renert - Part 2: Data Processing</title>
      <link>/post/2018-01-24-analysis-of-the-renert-part-2/</link>
      <pubDate>Wed, 24 Jan 2018 00:00:00 +0000</pubDate>
      
      <guid>/post/2018-01-24-analysis-of-the-renert-part-2/</guid>
      <description>&lt;!--```{r, echo=FALSE}
knitr::include_graphics(&#34;/images/renert.jpg&#34;)
```--&gt;
&lt;p style=&#34;text-align:center&#34;&gt;
&lt;img src=&#34;/images/renert.jpg&#34; style=&#34;width:60vh; &#34;&gt;
&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This is part 2 of a 3 part blog post. This post uses the data that we scraped in &lt;a href=&#34;http://www.blog.rdata.lu/post/2018-01-22-analysis-of-the-renert-part-1/&#34;&gt;part 1&lt;/a&gt; and prepares it for further analysis, which is quite technical. If you’re only interested in the results of the analysis, skip to &lt;a href=&#34;http://www.blog.rdata.lu/post/2018-01-26-analysis-of-the-renert-part-3/&#34;&gt;part 3&lt;/a&gt;!&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;First, let’s load the data that we prepared in &lt;a href=&#34;http://www.blog.rdata.lu/post/2018-01-22-analysis-of-the-renert-part-1/&#34;&gt;part 1&lt;/a&gt;. Let’s start with the full text:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(&amp;quot;tidyverse&amp;quot;)
library(&amp;quot;tidytext&amp;quot;)
renert = readRDS(&amp;quot;renert_full.rds&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I want to study the frequencies of words, so for this, I will use a function from the &lt;code&gt;tidytext&lt;/code&gt; package called &lt;code&gt;unnest_tokens()&lt;/code&gt; which breaks the text down into tokens. Each token is a word, which will then make it possible to compute the frequencies of words.&lt;/p&gt;
&lt;p&gt;So, let’s unnest the tokens:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;renert = renert %&amp;gt;%
  unnest_tokens(word, text)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We still need to do some cleaning before continuing. In Luxembourgish, &lt;em&gt;the&lt;/em&gt; is written &lt;em&gt;d’&lt;/em&gt; for feminine nouns. For example &lt;em&gt;d’Kaz&lt;/em&gt; for the &lt;em&gt;the cat&lt;/em&gt;. There’s also a bunch of &lt;em&gt;’t&lt;/em&gt;s in the text, which is &lt;em&gt;it&lt;/em&gt;. For example, the second line of the first song:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;’T stung Alles an der Bléi,&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Everything (it) was on flower,&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;We can remove these with a couple lines of code:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;renert_tokenized = renert %&amp;gt;%
  mutate(word = str_replace_all(word, &amp;quot;d&amp;#39;&amp;quot;, &amp;quot;&amp;quot;)) %&amp;gt;%
  mutate(word = str_replace_all(word, &amp;quot;&amp;#39;t&amp;quot;, &amp;quot;&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But that’s not all! We still need to remove so called stop words. Stop words are words that are very frequent, such as “and”, and these words usually do not add anything to the analysis. There are no set rules for defining a list of stop words, so I took inspiration for the stop words in English and German, and created my own, which you can get on &lt;a href=&#34;https://github.com/b-rodrigues/stopwords_lu&#34;&gt;Github&lt;/a&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;stopwords = read.csv(&amp;quot;stopwords_lu.csv&amp;quot;, header = TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For my Luxembourgish-speaking compatriots, I’d be glad to get help to make this list better! This list is far from perfect, certainly contains typos, or even words that have no reason to be there! Please help 😅.&lt;/p&gt;
&lt;p&gt;Using this list of stop words, I can remove words that don’t add anything to the analysis. Creating a list of stop words for the Luxembourgish language is very challenging, because there might be stop words that come from German, such as “awer”, from the German “aber”, meaning &lt;em&gt;but&lt;/em&gt;, but you could also use “mä”, from the French &lt;em&gt;mais&lt;/em&gt;, meaning also but. Plus, as a kid, we never really learned how to write Luxembourgish. Actually, most Luxembourguians don’t know how to write Luxembourgish 100% correctly. This is because for a very long time, Luxembourgish was used for oral communication, and French for formal written correspondence. This is changing, and more and more people are learning how to write correctly. I definitely have a lot to learn! Thus, I have certainly missed a lot of stop words in the list, but I am hopeful that others will contribute to the list and make it better! In the meantime, that’s what I’m going to use.&lt;/p&gt;
&lt;p&gt;Let’s take a look at some lines of the stop words data frame:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;head(stopwords, 20)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##          word
## 1           a
## 2           à
## 3         äis
## 4          är
## 5         ärt
## 6        äert
## 7        ären
## 8         all
## 9       allem
## 10      alles
## 11   alleguer
## 12        als
## 13       also
## 14         am
## 15         an
## 16 anerefalls
## 17        ass
## 18        aus
## 19       awer
## 20        bei&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can remove the stop words from our tokens using an &lt;code&gt;anti_join()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;renert_tokenized = renert_tokenized %&amp;gt;%
  anti_join(stopwords)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Joining, by = &amp;quot;word&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Column `word` joining character vector and factor, coercing into
## character vector&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s save this for later use:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;saveRDS(renert_tokenized, &amp;quot;renert_tokenized.rds&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I now have to do the same for the data that is stored by song. Because this is a list where each element is a data frame, I have to use &lt;code&gt;purrr::map()&lt;/code&gt; to map each of the functions I used before to each data frame:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;renert_songs = readRDS(&amp;quot;renert_songs_df.rds&amp;quot;)
renert_songs = map(renert_songs, ~unnest_tokens(., word, text))
renert_songs = map(renert_songs, ~anti_join(., stopwords))
renert_songs = map(renert_songs, ~mutate(., word = str_replace_all(word, &amp;quot;d&amp;#39;&amp;quot;, &amp;quot;&amp;quot;)))
renert_songs = map(renert_songs, ~mutate(., word = str_replace_all(word, &amp;quot;&amp;#39;t&amp;quot;, &amp;quot;&amp;quot;)))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s take a look at the object we have:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;head(renert_songs[[1]])&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 1
##   word     
##   &amp;lt;chr&amp;gt;    
## 1 éischte  
## 2 gesank   
## 3 edit     
## 4 päischten
## 5 stung    
## 6 bléi&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Looks pretty nice! But I can make it nicer by adding a column containing which song the data refers to. Indeed, the first line of each data frame contains the number of the song. I can extract this information and add it to each data set:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;renert_songs = map(renert_songs, ~mutate(., gesank = pull(.[1,1])))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s take a look again:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;head(renert_songs[[1]])&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 2
##   word      gesank 
##   &amp;lt;chr&amp;gt;     &amp;lt;chr&amp;gt;  
## 1 éischte   éischte
## 2 gesank    éischte
## 3 edit      éischte
## 4 päischten éischte
## 5 stung     éischte
## 6 bléi      éischte&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now I can save this object for later use:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;saveRDS(renert_songs, &amp;quot;renert_songs_tokenized.rds&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the final part of this series, I will use the tokenized data as well as the list of songs to create a couple of visualizations!&lt;/p&gt;
&lt;p&gt;
Don’t hesitate to follow us on twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;citation&#34;&gt;@rdata_lu&lt;/span&gt;&lt;/a&gt; &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt; and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our youtube channel. &lt;br&gt; You can also contact us if you have any comments or suggestions. See you for the next post!
&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Launching your shiny app in 2 clicks</title>
      <link>/post/2017-12-26-launching-your-shiny-app-in-2-clicks/</link>
      <pubDate>Tue, 26 Dec 2017 00:00:00 +0000</pubDate>
      
      <guid>/post/2017-12-26-launching-your-shiny-app-in-2-clicks/</guid>
      <description>&lt;!--```{r, echo=FALSE}
knitr::include_graphics(&#34;/images/what-if.jpg&#34;)
```--&gt;
&lt;p style=&#34;text-align:center&#34;&gt;
&lt;img src=&#34;/images/what-if.jpg&#34; style=&#34;width:60vh; &#34;&gt;
&lt;/p&gt;
&lt;p&gt;Hello everyone,&lt;/p&gt;
&lt;p&gt;Why we always want to take the red pill when we can take the blue one? That’s the question! &lt;/br&gt; In this post I will explain how to launch a shiny application from a shortcut. Just like that: &lt;iframe width=&#34;100%&#34; height=&#34;100%&#34; src=&#34;https://www.youtube.com/embed/jyGuPpbYhWk&#34; frameborder=&#34;0&#34; allowfullscreen style = &#34;max-width:100%; height:55vh;&#34;&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;p&gt;It’s fast and useful if you work with colleagues that don’t have a clue about R and just want to use your shiny app.&lt;/p&gt;
&lt;h4&gt;
If you are on macOS:
&lt;/h4&gt;
&lt;p&gt;Open a text editor and write the following lines :&lt;/p&gt;
&lt;pre&gt;&lt;code&gt; Rscript &amp;#39;&amp;amp;FullPath/&amp;amp;file.R&amp;#39; &amp;amp;
 open http://127.0.0.1:&amp;amp;PortNumber/;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;&amp;amp;&lt;/strong&gt; allows you to write more command lines even if the shell is still busy with the previous line. The second line opens your web browser with the chosen address. A concrete example gives this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Rscript &amp;#39;/Users/user/Desktop/Clef_USB/data_science/fret_dashboard.R&amp;#39; &amp;amp;
open http://127.0.0.1:7990/;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;On the second line, &lt;strong&gt;7990&lt;/strong&gt; refers to the chosen port in your shiny app. If you didn’t choose one, add the following to your shinyApp: &lt;strong&gt;options = list(port= 7990)&lt;/strong&gt;, just like that:&lt;br&gt; &lt;strong&gt;shinyApp(ui=ui,server=server, options = list(port=7990))&lt;/strong&gt;&lt;br&gt;&lt;/p&gt;
&lt;p&gt;If you use a &lt;strong&gt;.Rmd&lt;/strong&gt; document, you should write this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt; Rscript -e  &amp;quot;rmarkdown::run(&amp;#39;&amp;amp;FullPath/&amp;amp;file.Rmd&amp;#39;, shiny_args = list(port=&amp;amp;PortNumber))&amp;quot; &amp;amp;
 open http://127.0.0.1:&amp;amp;PortNumber/&amp;amp;file.Rmd;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Replace &amp;amp;FullPath, &amp;amp;file and &amp;amp;PortNumber by the file path, the file name and the chosen port. &lt;br&gt; &lt;br&gt; Then save it with the &lt;strong&gt;“.command”&lt;/strong&gt; extension and try to launch your shiny application from the shortcut. If you have a message saying that the file could not be executed because you do not have appropriate access privileges, open your terminal and write the following line:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;chmod u+x &amp;amp;FullPath/&amp;amp;file.command&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then it should work :) &lt;br&gt; We are almost done, now we have to add another shortcut icon to make it look more friendly. For that, go on google and download free &lt;strong&gt;.icns&lt;/strong&gt; icons. I have download some icons on &lt;a href=&#34;http://www.easyicon.net/&#34;&gt;easyicon&lt;/a&gt;. To change the icon you just need to go in “get info” and drag your new icon to the previous one, just like below:&lt;br&gt;&lt;/p&gt;
&lt;iframe width=&#34;100%&#34; height=&#34;100%&#34; src=&#34;https://www.youtube.com/embed/j_Vp0HL2b90&#34; frameborder=&#34;0&#34; allowfullscreen style=&#34;max-width:100%; height:55vh;&#34;&gt;
&lt;/iframe&gt;
&lt;p&gt;&lt;br&gt;&lt;br&gt;&lt;/p&gt;
&lt;h4&gt;
If you are on Windows:
&lt;/h4&gt;
&lt;p&gt;Open a text editor and write the following lines:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;start &amp;quot;&amp;quot; &amp;quot;&amp;amp;FullPath1\Rscript.exe&amp;quot; &amp;amp;FullPath2\&amp;amp;file.R  /k
start &amp;quot;Name&amp;quot; &amp;quot;http://127.0.0.1:&amp;amp;PortNumber/&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;/k&lt;/strong&gt; allows you to write something in the command prompt even if the Prompt is busy with the previous line. A concrete example gives that :&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;start &amp;quot;&amp;quot; &amp;quot;C:\Program Files\R\R-3.3.3\bin\Rscript.exe&amp;quot; C:\Users\CGP462\Documents\Shiny_app\iris.R  /k
start &amp;quot;iris&amp;quot; &amp;quot;http://127.0.0.1:7924/&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then you can save your file with a &lt;strong&gt;.bat&lt;/strong&gt; extension. &lt;br&gt; Be carefull, if you want to launch a &lt;strong&gt;.Rmd&lt;/strong&gt; you should use this code instead :&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;start &amp;quot;&amp;amp;FullPath1\R.exe&amp;quot; -e &amp;quot;Sys.setenv(RSTUDIO_PANDOC=&amp;#39;C:/Program Files/RStudio/bin/pandoc&amp;#39;); rmarkdown::run(&amp;#39;&amp;amp;FullPath2/&amp;amp;file.Rmd&amp;#39;, shiny_args = list(port =&amp;amp;PortNumber, launch.browser = FALSE))&amp;quot; /k
start &amp;quot;&amp;amp;file&amp;quot; &amp;quot;http://127.0.0.1:7924/&amp;amp;file.Rmd&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;if it doesn’t work, check your pandoc location.&lt;/p&gt;
&lt;p&gt;It’s almost done. Now we have to change the icon. First you have to download an icon with the &lt;code&gt;.ico&lt;/code&gt; extension. Then you have to act in 4 steps : &lt;/br&gt; 1. Right click on your file and go in &lt;strong&gt;Properties&lt;/strong&gt;. &lt;/br&gt; 2. Choose the &lt;strong&gt;Shortcut&lt;/strong&gt; tab. &lt;/br&gt; 3. Then go in &lt;strong&gt;Change Icon…&lt;/strong&gt;. &lt;/br&gt; 4. In Browse, select the icon you want to see instead. &lt;/br&gt; &lt;img src=&#34;/images/micon_change.png&#34; style = &#34;max-width:100%;&#34;&gt;&lt;/p&gt;
&lt;p&gt;And now it’s done! &lt;br&gt;&lt;/p&gt;
&lt;p&gt;
Don’t hesitate to follow us on twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;citation&#34;&gt;@rdata_lu&lt;/span&gt;&lt;/a&gt; &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt; and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our youtube channel. &lt;br&gt; You can also contact us if you have any comments or suggestions. See you for the next post!
&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Skip errors in R loops by not writing loops</title>
      <link>/post/2017-12-21-skip-errors-in-r-by-not-writing-loops/</link>
      <pubDate>Thu, 21 Dec 2017 00:00:00 +0000</pubDate>
      
      <guid>/post/2017-12-21-skip-errors-in-r-by-not-writing-loops/</guid>
      <description>&lt;p&gt;&lt;img src=&#34;/images/loops_error.jpg&#34; /&gt;&lt;!-- --&gt;&lt;/p&gt;
&lt;p&gt;You probably have encountered situations similar to this one:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;result = vector(&amp;quot;list&amp;quot;, length(some_numbers))

for(i in seq_along(some_numbers)){
  result[[i]] = some_function(some_numbers[[i]])
}

print(result)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;First I initialize &lt;code&gt;result&lt;/code&gt;, an empty list of size equal to the length of &lt;code&gt;some_numbers&lt;/code&gt; which will contains the results of applying &lt;code&gt;some_function()&lt;/code&gt; to each element of &lt;code&gt;some_numbers&lt;/code&gt;. Then, using a for loop, I apply the function. This is what I get back:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;NaNs producedError in sqrt(x) : non-numeric argument to mathematical function&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s take a look at &lt;code&gt;some_numbers&lt;/code&gt; and &lt;code&gt;some_function()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;print(some_numbers)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [[1]]
## [1] -1.9
## 
## [[2]]
## [1] 20
## 
## [[3]]
## [1] &amp;quot;-88&amp;quot;
## 
## [[4]]
## [1] -42&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;some_function&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## function(x){
##   if(x == 0) res = 0
##   if(x &amp;lt; 0) res = -sqrt(-x)
##   if(x &amp;gt; 0) res = sqrt(x)
##   return(res)
## }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So the function simply returns the square root of &lt;code&gt;x&lt;/code&gt; (or minus the square root of &lt;code&gt;-x&lt;/code&gt; if &lt;code&gt;x&lt;/code&gt; is negative), but the number in third position of the list &lt;code&gt;some_numbers&lt;/code&gt; is actually a character. This type of mistakes can commonly happen. The &lt;code&gt;result&lt;/code&gt; list looks like this:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;print(result)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;[[1]]
[1] -1.378405

[[2]]
[1] 4.472136

[[3]]
NULL

[[4]]
NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As you see, even though the fourth element could have been computed, the error made the whole loop stop. In such a simple example, you could correct this and then run your function. But what if the list you want to apply your function to is very long and the computation take a very, very long time? Perhaps you simply want to skip these errors and get back to them later. One way of doing that is using &lt;code&gt;tryCatch()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;result = vector(&amp;quot;list&amp;quot;, length(some_numbers))

for(i in seq_along(some_numbers)){
  result[[i]] = tryCatch(some_function(some_numbers[[i]]), 
                         error = function(e) paste(&amp;quot;something wrong here&amp;quot;))
}

print(result)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [[1]]
## [1] -1.378405
## 
## [[2]]
## [1] 4.472136
## 
## [[3]]
## [1] &amp;quot;something wrong here&amp;quot;
## 
## [[4]]
## [1] -6.480741&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This works, but it’s verbose and easy to mess up. My advice here is that if you want to skip errors in loops you don’t write loops! This is quite easy with the &lt;code&gt;purrr&lt;/code&gt; package:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(purrr)

result = map(some_numbers, some_function)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There’s several advantages here already; no need to initialize an empty structure to hold your result, and no need to think about indices, which can sometimes get confusing. This however does not work either; there’s still the problem that we have a character inside &lt;code&gt;some_numbers&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Error in sqrt(x) : non-numeric argument to mathematical function&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;However, &lt;code&gt;purrr&lt;/code&gt; contains some very amazing functions for error handling, &lt;code&gt;safely()&lt;/code&gt; and &lt;code&gt;possibly()&lt;/code&gt;. Let’s try &lt;code&gt;possibly()&lt;/code&gt; first:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;possibly_some_function = possibly(some_function, otherwise = &amp;quot;something wrong here&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;possibly()&lt;/code&gt; takes a function as argument as well as &lt;code&gt;otherwise&lt;/code&gt;; this is where you define a return value in case something is wrong. &lt;code&gt;possibly()&lt;/code&gt; then returns a new function that skips errors:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;result = map(some_numbers, possibly_some_function)

print(result)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [[1]]
## [1] -1.378405
## 
## [[2]]
## [1] 4.472136
## 
## [[3]]
## [1] &amp;quot;something wrong here&amp;quot;
## 
## [[4]]
## [1] -6.480741&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When you use &lt;code&gt;possibly()&lt;/code&gt; on a function, you’re politely telling R “would you kindly apply the function wherever possible, and if not, tell me where there was an issue”. What about &lt;code&gt;safely()&lt;/code&gt;?&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;safely_some_function = safely(some_function)


result = map(some_numbers, safely_some_function)

str(result)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## List of 4
##  $ :List of 2
##   ..$ result: num -1.38
##   ..$ error : NULL
##  $ :List of 2
##   ..$ result: num 4.47
##   ..$ error : NULL
##  $ :List of 2
##   ..$ result: NULL
##   ..$ error :List of 2
##   .. ..$ message: chr &amp;quot;invalid argument to unary operator&amp;quot;
##   .. ..$ call   : language -x
##   .. ..- attr(*, &amp;quot;class&amp;quot;)= chr [1:3] &amp;quot;simpleError&amp;quot; &amp;quot;error&amp;quot; &amp;quot;condition&amp;quot;
##  $ :List of 2
##   ..$ result: num -6.48
##   ..$ error : NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The major difference with &lt;code&gt;possibly()&lt;/code&gt; is that &lt;code&gt;safely()&lt;/code&gt; returns a more complex object: it returns a list of lists. There are as many lists as there are elements in &lt;code&gt;some_numbers&lt;/code&gt;. Let’s take a look at the first one:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;print(result[[1]])&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## $result
## [1] -1.378405
## 
## $error
## NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;result[[1]]&lt;/code&gt; is a list with a &lt;code&gt;result&lt;/code&gt; and an &lt;code&gt;error&lt;/code&gt;. If there was no error, we get a value in &lt;code&gt;result&lt;/code&gt; and &lt;code&gt;NULL&lt;/code&gt; in &lt;code&gt;error&lt;/code&gt;. If there was an &lt;code&gt;error&lt;/code&gt;, this is what we see:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;print(result[[3]])&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## $result
## NULL
## 
## $error
## &amp;lt;simpleError in -x: invalid argument to unary operator&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because lists of lists are not easy to handle, I like to use &lt;code&gt;possibly()&lt;/code&gt;, but if you use &lt;code&gt;safely()&lt;/code&gt; you might want to know about &lt;code&gt;transpose()&lt;/code&gt;, which is another function from &lt;code&gt;purrr&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;result2 = transpose(result)

str(result2)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## List of 2
##  $ result:List of 4
##   ..$ : num -1.38
##   ..$ : num 4.47
##   ..$ : NULL
##   ..$ : num -6.48
##  $ error :List of 4
##   ..$ : NULL
##   ..$ : NULL
##   ..$ :List of 2
##   .. ..$ message: chr &amp;quot;invalid argument to unary operator&amp;quot;
##   .. ..$ call   : language -x
##   .. ..- attr(*, &amp;quot;class&amp;quot;)= chr [1:3] &amp;quot;simpleError&amp;quot; &amp;quot;error&amp;quot; &amp;quot;condition&amp;quot;
##   ..$ : NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;result2&lt;/code&gt; is now a list of two lists: a &lt;code&gt;result&lt;/code&gt; list holding all the results, and an &lt;code&gt;error&lt;/code&gt; list holding all the error message. You can get results with:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;result2$result&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [[1]]
## [1] -1.378405
## 
## [[2]]
## [1] 4.472136
## 
## [[3]]
## NULL
## 
## [[4]]
## [1] -6.480741&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I hope you enjoyed this blog post, and that these functions will make your life easier!&lt;/p&gt;
&lt;p&gt;
Don’t hesitate to follow us on twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;citation&#34;&gt;@rdata_lu&lt;/span&gt;&lt;/a&gt; &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt; and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our youtube channel. &lt;br&gt; You can also contact us if you have any comments or suggestions. See you for the next post!
&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Functional peace of mind</title>
      <link>/post/2017-11-14-functional-peace-of-mind/</link>
      <pubDate>Tue, 14 Nov 2017 00:00:00 +0000</pubDate>
      
      <guid>/post/2017-11-14-functional-peace-of-mind/</guid>
      <description>&lt;p&gt;I think what I enjoy the most about functional programming is the peace of mind that comes with it. With functional programming, there’s a lot of stuff you don’t need to think about. You can write functions that are general enough so that they solve a variety of problems. For example, imagine for a second that R does not have the &lt;code&gt;sum()&lt;/code&gt; function anymore. If you want to compute the sum of, say, the first 100 integers, you could write a loop that would do that for you:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;numbers = 0

for (i in 1:100){
  numbers = numbers + i
}

print(numbers)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 5050&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The problem with this approach, is that you cannot reuse any of the code there, even if you put it inside a function. For instance, what if you want to merge 4 datasets together? You would need something like this:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
data(mtcars)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;mtcars1 = mtcars %&amp;gt;%
  mutate(id = &amp;quot;1&amp;quot;)

mtcars2 = mtcars %&amp;gt;%
  mutate(id = &amp;quot;2&amp;quot;)

mtcars3 = mtcars %&amp;gt;%
  mutate(id = &amp;quot;3&amp;quot;)

mtcars4 = mtcars %&amp;gt;%
  mutate(id = &amp;quot;4&amp;quot;)

datasets = list(mtcars1, mtcars2, mtcars3, mtcars4)

temp = datasets[[1]]

for(i in 1:3){
  temp = full_join(temp, datasets[[i+1]])
}&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)
## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)
## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;glimpse(temp)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Observations: 128
## Variables: 12
## $ mpg  &amp;lt;dbl&amp;gt; 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19....
## $ cyl  &amp;lt;dbl&amp;gt; 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, ...
## $ disp &amp;lt;dbl&amp;gt; 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 1...
## $ hp   &amp;lt;dbl&amp;gt; 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, ...
## $ drat &amp;lt;dbl&amp;gt; 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.9...
## $ wt   &amp;lt;dbl&amp;gt; 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3...
## $ qsec &amp;lt;dbl&amp;gt; 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 2...
## $ vs   &amp;lt;dbl&amp;gt; 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ am   &amp;lt;dbl&amp;gt; 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ gear &amp;lt;dbl&amp;gt; 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, ...
## $ carb &amp;lt;dbl&amp;gt; 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, ...
## $ id   &amp;lt;chr&amp;gt; &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Of course, the logic is very similar as before, but you need to think carefully about the structure holding your elements (which can be numbers, datasets, characters, etc…) as well as be careful about indexing correctly… and depending on the type of objects you are working on, you might need to tweak the code further.&lt;/p&gt;
&lt;p&gt;How would a functional programming approach make this easier? Of course, you could use &lt;code&gt;purrr::reduce()&lt;/code&gt; to solve these problems. However, since I assumed that &lt;code&gt;sum()&lt;/code&gt; does not exist, I will also assume that &lt;code&gt;purrr::reduce()&lt;/code&gt; does not exist either and write my own, clumsy implementation. Here’s the code:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;my_reduce = function(a_list, a_func, init = NULL, ...){

  if(is.null(init)){
    init = `[[`(a_list, 1)
    a_list = tail(a_list, -1)
  }

  car = `[[`(a_list, 1)
  cdr = tail(a_list, -1)
  init = a_func(init, car, ...)

  if(length(cdr) != 0){
    my_reduce(cdr, a_func, init, ...)
  }
  else {
    init
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This can look much more complicated than before, but the idea is quite simple; &lt;em&gt;if you know about recursive functions&lt;/em&gt; (recursive functions are functions that call themselves). I won’t explain how the function works, because it is not the main point of the article (but if you’re curious, I encourage you to play around with it). The point is that now, I can do the following:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;my_reduce(list(1,2,3,4,5), `+`)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 15&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;my_reduce(datasets, full_join) %&amp;gt;% glimpse&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)
## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)
## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Observations: 128
## Variables: 12
## $ mpg  &amp;lt;dbl&amp;gt; 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19....
## $ cyl  &amp;lt;dbl&amp;gt; 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, ...
## $ disp &amp;lt;dbl&amp;gt; 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 1...
## $ hp   &amp;lt;dbl&amp;gt; 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, ...
## $ drat &amp;lt;dbl&amp;gt; 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.9...
## $ wt   &amp;lt;dbl&amp;gt; 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3...
## $ qsec &amp;lt;dbl&amp;gt; 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 2...
## $ vs   &amp;lt;dbl&amp;gt; 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ am   &amp;lt;dbl&amp;gt; 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ gear &amp;lt;dbl&amp;gt; 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, ...
## $ carb &amp;lt;dbl&amp;gt; 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, ...
## $ id   &amp;lt;chr&amp;gt; &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And if I need to merge another dataset, I don’t need to change anything at all. Plus, because &lt;code&gt;my_reduce()&lt;/code&gt; is very general, I can even use it for situation I didn’t write it for in the first place:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;my_reduce(list(&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;, &amp;quot;c&amp;quot;, &amp;quot;d&amp;quot;, &amp;quot;e&amp;quot;), paste)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;a b c d e&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Of course, &lt;code&gt;paste()&lt;/code&gt; is vectorized, so you could just as well do &lt;code&gt;paste(1, 2, 3, 4, 5)&lt;/code&gt;, but again, I want to insist on the fact that writing or using such functions allows you to abstract over a lot of thing. There is nothing specific to any type of object in &lt;code&gt;my_reduce()&lt;/code&gt;, whereas the loops have to be tailored for the kind of object you’re working with. As long as the &lt;code&gt;a_func&lt;/code&gt; argument is a binary operator that combines the elements inside &lt;code&gt;a_list&lt;/code&gt;, it’s going to work. And I don’t need to think about indexing, about having temporary variables or thinking about the structure that will hold my results.&lt;/p&gt;
&lt;p&gt;
Don’t hesitate to follow us on twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;citation&#34;&gt;@rdata_lu&lt;/span&gt;&lt;/a&gt; &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt; and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our youtube channel. &lt;br&gt; You can also contact us if you have any comments or suggestions. See you for the next post!
&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>How tidyeval could make your life easier</title>
      <link>/post/2017-08-27-how-tidyeval-could-make-your-life-easier/</link>
      <pubDate>Sun, 27 Aug 2017 00:00:00 +0000</pubDate>
      
      <guid>/post/2017-08-27-how-tidyeval-could-make-your-life-easier/</guid>
      <description>&lt;p&gt;First thing’s first: maybe you shouldn’t care about &lt;code&gt;tidyeval&lt;/code&gt;. Maybe you don’t need it. If you exclusively work interactively, I don’t think that learning about &lt;code&gt;tidyeval&lt;/code&gt; is important. I can only speak for me, and explain to you why I personally find &lt;code&gt;tidyeval&lt;/code&gt; useful.&lt;/p&gt;
&lt;p&gt;I wanted to write this blog post after reading this &lt;a href=&#34;https://twitter.com/dataandme/status/901429535266267136&#34;&gt;twitter thread&lt;/a&gt; and specifically &lt;a href=&#34;https://twitter.com/Kwarizmi/status/901457435948236801&#34;&gt;this question&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://twitter.com/dataandme&#34;&gt;Mara Averick&lt;/a&gt; then wrote &lt;a href=&#34;http://maraaverick.rbind.io/2017/08/tidyeval-resource-roundup/&#34;&gt;this blogpost&lt;/a&gt; linking to 6 other blog posts that give some &lt;code&gt;tidyeval&lt;/code&gt; examples. Reading them, plus the &lt;a href=&#34;http://dplyr.tidyverse.org/articles/programming.html&#34;&gt;Programming with dplyr&lt;/a&gt; vignette should help you get started with &lt;code&gt;tidyeval&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;But maybe now you know how to use it, but not why and when you should use it… Basically, whenever you want to write a function that looks something like this:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;my_function(my_data, one_column_inside_data)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;is when you want to use the power of &lt;code&gt;tidyeval&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I work at &lt;a href=&#34;http://www.statistiques.public.lu/en/index.html&#34;&gt;STATEC&lt;/a&gt;, Luxembourg’s national institute of statistics. I work on all kinds of different projects, and when data gets updated (for example because a new round of data collection for some survey finished), I run my own scripts on the fresh data to make the data nice and tidy for analysis. Because surveys get updated, sometimes column names change a little bit, and this can cause some issues.&lt;/p&gt;
&lt;p&gt;Very recently, a dataset I work with got updated. Data collection was finished, so I just loaded my hombrewed package written for this project, changed the path from last year’s script to this year’s fresh data path, ran the code, and watched as the folders got populated with new &lt;code&gt;ggplot2&lt;/code&gt; graphs and LaTeX tables with descriptive statistics and regression results. This is then used to generate this year’s report. However, by looking at the graphs, I noticed something weird; some graphs were showing some very strange patterns. It turns out that one column got its name changed, and also one of its values got changed too.&lt;/p&gt;
&lt;p&gt;Last year, this column, let’s call it &lt;code&gt;spam&lt;/code&gt;, had values &lt;code&gt;1&lt;/code&gt; for &lt;code&gt;good&lt;/code&gt; and &lt;code&gt;0&lt;/code&gt; for &lt;code&gt;bad&lt;/code&gt;. This year the column is called &lt;code&gt;Spam&lt;/code&gt; and the values are &lt;code&gt;1&lt;/code&gt; and &lt;code&gt;2&lt;/code&gt;. When I found out that this was the source of the problem, I just had to change the arguments of my functions from&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;generate_spam_plot(dataset = data2016, column = spam, value = 1)
generate_spam_plot(dataset = data2016, column = spam, value = 0)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;to&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;generate_spam_plot(dataset = data2017, column = Spam, value = 1)
generate_spam_plot(dataset = data2017, column = Spam, value = 2)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;without needing to change anything else. This is why I use &lt;code&gt;tidyeval&lt;/code&gt;; without it, writing a function such as &lt;code&gt;genereta_spam_plot&lt;/code&gt; would not be easy. It would be possible, but not easy.&lt;/p&gt;
&lt;p&gt;If you want to know more about &lt;code&gt;tidyeval&lt;/code&gt; and working programmatically with R, I shamelessly invite you to read a book I’ve been working on: &lt;a href=&#34;http://www.brodrigues.co/fput/&#34; class=&#34;uri&#34;&gt;http://www.brodrigues.co/fput/&lt;/a&gt; It’s still a WIP, but maybe you’ll find it useful. I plan on finishing it by the end of the year, but there’s already some content to keep you busy!&lt;/p&gt;
&lt;p&gt;
Don’t hesitate to follow us on twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;citation&#34;&gt;@rdata_lu&lt;/span&gt;&lt;/a&gt; &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt; and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our youtube channel. &lt;br&gt; You can also contact us if you have any comments or suggestions. See you for the next post!
&lt;/p&gt;
</description>
    </item>
    
  </channel>
</rss>
