<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Visualization on rdata.lu Blog | Data science with R</title>
    <link>/categories/visualization/</link>
    <description>Recent content in Visualization on rdata.lu Blog | Data science with R</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <copyright>Copyright (c) rdata.lu. All rights reserved. &lt;br&gt; Content reblogged by &lt;a href=&#39;https://www.r-bloggers.com/&#39; target=&#39;_blank&#39;&gt;R-bloggers&lt;/a&gt; &amp; &lt;a href=&#39;http://www.rweekly.org/&#39; target=&#39;_blank&#39;&gt;RWeekly&lt;/a&gt;</copyright>
    <lastBuildDate>Wed, 03 Jan 2018 00:00:00 +0000</lastBuildDate>
    
        <atom:link href="/categories/visualization/index.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
      <title>Map unemployment using R with ggplot2</title>
      <link>/post/2018-01-03-mapping-unemployment-luxembourg/</link>
      <pubDate>Wed, 03 Jan 2018 00:00:00 +0000</pubDate>
      
      <guid>/post/2018-01-03-mapping-unemployment-luxembourg/</guid>
      <description>&lt;p&gt;In this blog post, I show various ways to create maps using R. You’ll need to install a lot of packages and download two data sets: the unemployment rate in Luxembourg and a shapefile.&lt;/p&gt;
&lt;p&gt;To get the unemployment rate in Luxembourg, you can take a look at our &lt;a href=&#34;http://www.blog.rdata.lu/post/2017-08-21-scraping-data-from-statec-s-public-tables/&#34;  target=&#34;_blank&#34;&gt;previous blog post&lt;/a&gt; or simply run the following lines:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(rvest)
library(dplyr)
library(purrr)
library(janitor)
library(tidyr)

page_unemp = read_html(&amp;quot;http://www.statistiques.public.lu/stat/TableViewer/tableViewHTML.aspx?ReportId=12950&amp;amp;IF_Language=eng&amp;amp;MainTheme=2&amp;amp;FldrName=3&amp;amp;RFPath=91&amp;quot;)

data_raw = page_unemp %&amp;gt;%
  html_nodes(&amp;quot;.b2020-datatable&amp;quot;) %&amp;gt;% .[[1]] %&amp;gt;% html_table(fill = TRUE)

colnames(data_raw) = data_raw[1, ]

colnames(data_raw)[1:2] = c(&amp;quot;division&amp;quot;, &amp;quot;variable&amp;quot;)

data_raw = data_raw[-c(1,2), ]

unemp_lux = data_raw %&amp;gt;%
  map_df(function(x)(gsub(&amp;quot;,&amp;quot;, &amp;quot;.&amp;quot;, x = x))) %&amp;gt;%
  mutate_at(vars(matches(&amp;quot;\\d{4}&amp;quot;)), as.numeric) %&amp;gt;%
  gather(key=year, value, -division, -variable) %&amp;gt;%
  spread(variable, value) %&amp;gt;%
  clean_names()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These lines scrape the data off the public tables of STATEC (the national institute of statistics) and put the raw data into a tidy data frame. Let’s take a look:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;head(unemp_lux)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##   division year active_population of_which_non_wage_earners
## 1 Beaufort 2001               688                        85
## 2 Beaufort 2002               742                        85
## 3 Beaufort 2003               773                        85
## 4 Beaufort 2004               828                        80
## 5 Beaufort 2005               866                        96
## 6 Beaufort 2006               893                        87
##   of_which_wage_earners total_employed_population unemployed
## 1                   568                       653         35
## 2                   631                       716         26
## 3                   648                       733         40
## 4                   706                       786         42
## 5                   719                       815         51
## 6                   746                       833         60
##   unemployment_rate_in_percent
## 1                         5.09
## 2                         3.50
## 3                         5.17
## 4                         5.07
## 5                         5.89
## 6                         6.72&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once you have the unemployment data, install the next packages you’ll need to follow the rest of the post:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(
  c(
    &amp;quot;viridis&amp;quot;,  # Optional, but better color scheme than the default
    &amp;quot;broom&amp;quot;,    # For tidy()
    &amp;quot;ggplot2&amp;quot;,  # To create a basic map
    &amp;quot;ggthemes&amp;quot; # To change the theme of the map
    )
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then install two further packages:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(&amp;#39;rgeos&amp;#39;, type=&amp;#39;source&amp;#39;) # Dependency of rgdal
install.packages(&amp;#39;rgdal&amp;#39;, type=&amp;#39;source&amp;#39;) # To read in the shapefile&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;rgdal&lt;/code&gt; might be tricky to install on macOS and Linux. If you’re using Ubuntu, you have to install &lt;code&gt;libgdal-dev&lt;/code&gt;, and on macOS you’ll need to install &lt;code&gt;gdal&lt;/code&gt; using Homebrew.&lt;/p&gt;
&lt;p&gt;There’s a final package to install, but you have to get it from GitHub (and thus need &lt;code&gt;devtools&lt;/code&gt;):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;devtools::install_github(&amp;quot;dgrtwo/gganimate&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To draw a map, you will need a so-called shapefile. These files contain the geometry of countries, regions, etc., so that it is possible to plot them. The shapefile for Luxembourg can be obtained from &lt;a href=&#34;https://data.public.lu/en/datasets/limites-administratives-du-grand-duche-de-luxembourg/&#34;&gt;Luxembourg’s Open data Portal&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Download the zip, and look for the file called &lt;code&gt;LIMADM_COMMUNES.shp&lt;/code&gt;, which contains the geometry of the Luxembourgish communes. Leave it inside the folder &lt;code&gt;Limadmin_SHP&lt;/code&gt;, as it contains other files needed by &lt;code&gt;rgdal::readOGR()&lt;/code&gt; to read in the shapefile.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(broom)
library(dplyr)
library(purrr)
library(ggplot2)
library(viridis)
library(rgdal)
library(ggthemes)
library(gganimate)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now we can read the data, and do some basic cleaning. I comment every step, but run the code line by line to really understand what’s going on!&lt;/p&gt;
&lt;p&gt;Read the shapefile:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;communes = readOGR(&amp;quot;Limadmin_SHP/LIMADM_COMMUNES.shp&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## OGR data source with driver: ESRI Shapefile
## Source: &amp;quot;Limadmin_SHP/LIMADM_COMMUNES.shp&amp;quot;, layer: &amp;quot;LIMADM_COMMUNES&amp;quot;
## with 105 features
## It has 4 fields&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in readOGR(&amp;quot;Limadmin_SHP/LIMADM_COMMUNES.shp&amp;quot;): Z-dimension
## discarded&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;“Convert” it to a data frame using &lt;code&gt;broom::tidy()&lt;/code&gt;. In the past, this was done with &lt;code&gt;ggplot2::fortify()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;communes_df = broom::tidy(communes, region = &amp;quot;COMMUNE&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Remove the &lt;em&gt;cantons&lt;/em&gt; from the data, as well as the unemployment rate for the whole country. Then only select the relevant columns and rename them in one go:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;unemp_lux = unemp_lux %&amp;gt;%
  filter(!grepl(&amp;quot;Canton&amp;quot;, division), division != &amp;quot;Grand Duchy of Luxembourg&amp;quot;) %&amp;gt;%
  select(commune = division, year, unemp_rate = unemployment_rate_in_percent)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The names of two communes are written differently in the shapefile than in the data. We can change that using &lt;code&gt;gsub()&lt;/code&gt;. Change “Haute-Sûre” to “Haute Sûre” in the unemployment data:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;unemp_lux$commune = gsub(&amp;quot;Haute-Sûre&amp;quot;, &amp;quot;Haute Sûre&amp;quot;, unemp_lux$commune)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Change “Redange” to “Redange-sur-Attert” in the data frame containing the geometry of the communes:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;communes_df$id = gsub(&amp;quot;Redange&amp;quot;, &amp;quot;Redange-sur-Attert&amp;quot;, communes_df$id)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Select relevant columns from the communes data frame, and rename them:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;communes_df = communes_df %&amp;gt;%
    select(long, lat, commune = id) &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, join the communes data frame (containing the geometry) with the unemployment data, by communes:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;final_data = left_join(communes_df, unemp_lux, by = &amp;quot;commune&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
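&lt;p&gt;Before plotting, it can be worth checking that every commune in the geometry found a match in the unemployment data, since a name mismatch silently produces &lt;code&gt;NA&lt;/code&gt; values after a left join. Here is a minimal sketch using &lt;code&gt;dplyr::anti_join()&lt;/code&gt; on two toy data frames. The names &lt;code&gt;geo&lt;/code&gt; and &lt;code&gt;unemp&lt;/code&gt; and their values are made up for illustration; they stand in for &lt;code&gt;communes_df&lt;/code&gt; and &lt;code&gt;unemp_lux&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# Toy stand-ins for communes_df and unemp_lux (made-up values)
geo = data.frame(commune = c(&amp;quot;Beaufort&amp;quot;, &amp;quot;Redange&amp;quot;, &amp;quot;Wiltz&amp;quot;),
                 stringsAsFactors = FALSE)
unemp = data.frame(commune = c(&amp;quot;Beaufort&amp;quot;, &amp;quot;Redange-sur-Attert&amp;quot;, &amp;quot;Wiltz&amp;quot;),
                   unemp_rate = c(5.1, 4.2, 6.3),
                   stringsAsFactors = FALSE)

# Rows of the geometry that have no counterpart in the unemployment data
unmatched = anti_join(geo, unemp, by = &amp;quot;commune&amp;quot;)
unmatched$commune&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this toy example only “Redange” is returned. An empty result would mean that all the names line up.&lt;/p&gt;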
&lt;p&gt;Let’s plot the unemployment rate for the latest available year:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;final_data2016 = final_data %&amp;gt;%
  filter(year == 2016)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;First, let’s plot a basic map using &lt;code&gt;ggplot2&lt;/code&gt;. Even if you’re not familiar with &lt;code&gt;ggplot2&lt;/code&gt; the code below should be very straightforward:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot_map = ggplot() +
  geom_polygon(data = final_data2016,
               aes(x = long, y = lat, group = commune, fill = unemp_rate)) +
    labs(title = &amp;quot;Unemployment rate in Luxembourg in 2016&amp;quot;,
         y = &amp;quot;&amp;quot;, x = &amp;quot;&amp;quot;, fill = &amp;quot;Unemployment rate&amp;quot;) +
    theme_tufte() +
    theme(axis.text.x = element_blank(),
          axis.ticks.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.y = element_blank()) +
    scale_fill_viridis()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, print the map:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;print(ggplot_map)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2018-01-03-mapping-unemployment-luxembourg_files/figure-html/unnamed-chunk-18-1.jpg&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;It is also possible to create a map per year using &lt;code&gt;facet_wrap()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;facet_map = ggplot() +
  geom_polygon(data = final_data,
               aes(x = long, y = lat,
                   group = commune, fill = unemp_rate)) +
    labs(title = &amp;quot;Unemployment rate in Luxembourg&amp;quot;, y = &amp;quot;&amp;quot;, x = &amp;quot;&amp;quot;, fill = &amp;quot;Unemployment rate&amp;quot;) +
    theme_tufte() +
    theme(axis.text.x = element_blank(),
          axis.ticks.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.y = element_blank()) +
    facet_wrap(~year) +
    scale_fill_viridis()

print(facet_map)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2018-01-03-mapping-unemployment-luxembourg_files/figure-html/unnamed-chunk-19-1.jpg&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We clearly see that unemployment has risen in Luxembourg over the past 15 years. This series of maps is great for printing, but since you’re reading this on a screen, why not try to animate these maps? This is possible with &lt;code&gt;gganimate()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(gganimate)

map_anim = ggplot() +
    geom_polygon(data = final_data,
                 aes(x = long, y = lat, group = commune, fill = unemp_rate, frame = year)) +
    labs(title = &amp;quot;Unemployment rate in Luxembourg&amp;quot;, y = &amp;quot;&amp;quot;, x = &amp;quot;&amp;quot;, fill = &amp;quot;Unemployment rate&amp;quot;) +
    theme_tufte() +
    theme(axis.text.x = element_blank(),
          axis.ticks.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.y = element_blank()) +
    scale_fill_viridis()


gganimate(map_anim, &amp;quot;map_lux.mp4&amp;quot;, interval = 2)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can create an &lt;code&gt;.mp4&lt;/code&gt; video as well as a &lt;code&gt;.gif&lt;/code&gt;. Just change the extension inside the &lt;code&gt;gganimate()&lt;/code&gt; function. &lt;/p&gt;
  &lt;p style=&#34;text-align:center;&#34;&gt;&lt;img style=&#34;width: 20rem;&#34; src=&#34;/images/map_lux.gif&#34; /&gt;&lt;/p&gt;
&lt;p&gt;That’s it for now. You can also check out our &lt;a href=&#34;http://blog.rdata.lu/visualization/unemployment/&#34; target=&#34;_blank&#34;&gt;interactive map of unemployment&lt;/a&gt; among our visualizations.
  &lt;!-- In the next post, I will show you how to create interactive maps using R and javascript! --&gt;
&lt;/p&gt;
&lt;p&gt;Don&#39;t hesitate to follow us on Twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;@rdata_lu&lt;/a&gt;
  &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt;
  and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our YouTube channel. &lt;br&gt;
  You can also contact us if you have any comments or suggestions. See you for the next post!
&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Visualizing box office revenue by genre</title>
      <link>/post/2017-12-04-visualizing-box-office-revenue-by-genre/</link>
      <pubDate>Mon, 04 Dec 2017 06:34:55 +0200</pubDate>
      
      <guid>/post/2017-12-04-visualizing-box-office-revenue-by-genre/</guid>
      <description>&lt;p&gt;After watching Justice League at the cinema, I was impressed by the quality of the special effects. I started wondering: how much does a movie like that cost? And most importantly, how big is the box-office revenue for this kind of blockbuster? I found an answer on &lt;a href=&#34;http://www.the-numbers.com/movie/budgets/all&#34;&gt;The Numbers&lt;/a&gt;. I then decided to build a database from the data available on this website, and retrieved the 500 biggest movie budgets. Initially, the database had just 5 variables per movie:&lt;br&gt; • the release date&lt;br&gt; • the name &lt;br&gt; • the production budget &lt;br&gt; • the domestic gross &lt;br&gt; • the worldwide gross &lt;br&gt; I then cross-referenced other sources to get more variables, scraping data from Wikipedia and IMDb. The final dataset has 30 variables, such as lists of actors, poster URLs, distributors, the IMDb rating and number of raters, etc.&lt;br&gt; You can find a complete description of the dataset on &lt;a href=&#34;https://github.com/krosamont/Cinema&#34;&gt;GitHub&lt;/a&gt;. All the data was scraped with the package &lt;code&gt;rvest&lt;/code&gt;.&lt;br&gt;&lt;br&gt;&lt;/p&gt;
&lt;p&gt;In this post, I describe the different steps leading to the treemap: &lt;br&gt;&lt;/p&gt;
&lt;div id=&#34;tmp1&#34; class=&#34;tmap&#34;&gt;

&lt;/div&gt;
&lt;div id=&#34;starting-point&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;STARTING POINT&lt;/h1&gt;
&lt;p&gt;First of all we read the data.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;db = read.csv(&amp;quot;https://cdn.rawgit.com/krosamont/Cinema/dd7eca65/moviedb500.csv&amp;quot;,
              stringsAsFactors = FALSE)
#You can execute the following line to get more information about the variable types.
#str(db) &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then we convert the money-related variables to numeric variables and the movie release dates to a date variable, using the &lt;code&gt;tidyverse&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(tidyverse)

db = db %&amp;gt;%
        mutate( Release.Date = as.Date(Release.Date, &amp;quot;%m/%d/%Y&amp;quot;), 
                Running.time = as.numeric(stringr::str_sub(Running.time,1,3)),
                Rate = as.numeric(Rate),
                Raters = as.numeric(gsub(&amp;quot;,&amp;quot;, &amp;quot;&amp;quot;, Raters)),
                Production.Budget = as.numeric(gsub(&amp;quot;[,$]&amp;quot;, &amp;quot;&amp;quot;,
                                                 Production.Budget)),
                Domestic.Gross = as.numeric(gsub(&amp;quot;[,$]&amp;quot;, &amp;quot;&amp;quot;,
                                                 Domestic.Gross)),
                Worldwide.Gross = as.numeric(gsub(&amp;quot;[,$]&amp;quot;, &amp;quot;&amp;quot;,
                                                 Worldwide.Gross)) ) %&amp;gt;%
        arrange(desc(Worldwide.Gross))&lt;/code&gt;&lt;/pre&gt;
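&lt;p&gt;To see what these conversions do, here is a small sketch on made-up values. The two strings below are illustrative and only assumed to be formatted like the raw columns; they are not taken from the dataset:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical raw values, formatted like the raw money and date columns
budget_raw = &amp;quot;$425,000,000&amp;quot;
date_raw = &amp;quot;12/18/2009&amp;quot;

# Strip dollar signs and thousands separators, then convert to numeric
budget = as.numeric(gsub(&amp;quot;[,$]&amp;quot;, &amp;quot;&amp;quot;, budget_raw))

# Parse a month/day/year string into a Date
release = as.Date(date_raw, &amp;quot;%m/%d/%Y&amp;quot;)

budget
release&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Inside a character class such as &lt;code&gt;[,$]&lt;/code&gt;, the dollar sign is a literal character, so there is no need to escape it.&lt;/p&gt;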
&lt;p&gt;The dataset looks better. As you saw at the top of this post, we want to design a treemap chart to visualize box-office revenue by genre. Let’s see how many movie genres are present in the data frame:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;UniqueGenres = unique(db$Genres)
length(UniqueGenres)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 224&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;head(UniqueGenres, 5)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Action Adventure Fantasy Sci-Fi&amp;quot;                  
## [2] &amp;quot;Action Adventure Sci-Fi&amp;quot;                          
## [3] &amp;quot;Action Crime Thriller&amp;quot;                            
## [4] &amp;quot;Adventure Drama Fantasy Mystery&amp;quot;                  
## [5] &amp;quot;Animation Adventure Comedy Family Fantasy Musical&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are 224 combinations of genres, which is far too many. We need to reduce them so that each movie has at most 2 genres: a main genre and a subgenre.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;main-genres&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;MAIN GENRES&lt;/h1&gt;
&lt;p&gt;Let’s start with a simple barplot to visualize which genres are most represented among the 224 combinations.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggthemes)

all_genres = separate_rows(db %&amp;gt;% 
                           group_by(Genres) %&amp;gt;% 
                           select(Genres) %&amp;gt;% 
                           filter(row_number() ==1),
                           Genres, sep=&amp;quot;[[:space:]]&amp;quot;)

name_order = names(sort(table(all_genres)))

ggplot(all_genres, aes(Genres)) +
                theme_minimal( ) + 
        geom_bar( stat = &amp;quot;count&amp;quot;, fill=&amp;quot;#007acc&amp;quot; ) +
        coord_flip() +
        scale_x_discrete(limits = name_order)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2017-12-04-visualizing-box-office-revenue-by-genre_files/figure-html/unnamed-chunk-4-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We see that Adventure and Action are the most important genres, followed by those between Comedy and Sci-Fi. The genres that come after Sci-Fi are present in fewer than 60 combinations of genres. Hence we will consider them as subgenres. We have 8 main genres: &lt;br&gt; • Adventure &lt;br&gt; • Action &lt;br&gt; • Comedy &lt;br&gt; • Drama &lt;br&gt; • Family &lt;br&gt; • &lt;del&gt;Fantasy&lt;/del&gt; &lt;br&gt; • Thriller &lt;br&gt; • &lt;del&gt;Sci-Fi&lt;/del&gt; &lt;br&gt; But we also know that Sci-Fi and Fantasy can be seen as subgenres of Adventure or Action. Therefore, we finally keep 6 genres. &lt;br&gt; We have to check that every movie can be given a main genre from the 6 genres that we have chosen. For that, we simply check that each combination contains at least one of the main genres:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;mainGenres= paste(c(&amp;quot;Adventure&amp;quot;, &amp;quot;Action&amp;quot;,  &amp;quot;Comedy&amp;quot;, 
                    &amp;quot;Drama&amp;quot;, &amp;quot;Family&amp;quot;, &amp;quot;Thriller&amp;quot;),
                  collapse=&amp;quot;|&amp;quot;)

# grepl returns TRUE for each genre combination that contains at least one main genre;
# summing these TRUE values and dividing by the number of rows gives the share covered
sum(grepl(mainGenres, db$Genres))/length(db$Genres)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Apparently, this is the case :)&lt;/p&gt;
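&lt;p&gt;A quick way to double-check this kind of coverage test is to try it on a toy vector. The sketch below uses made-up combinations (the vector &lt;code&gt;toy_genres&lt;/code&gt; is hypothetical, not from the dataset). It shows that the share of matches is the mean of the logical vector returned by &lt;code&gt;grepl()&lt;/code&gt;, whereas the length of that vector always equals the length of the input:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Made-up genre combinations; the second one contains no main genre
toy_genres = c(&amp;quot;Action Adventure&amp;quot;, &amp;quot;Horror Mystery&amp;quot;, &amp;quot;Comedy Drama&amp;quot;)
toy_main = &amp;quot;Adventure|Action|Comedy|Drama|Family|Thriller&amp;quot;

# One TRUE/FALSE per combination
hits = grepl(toy_main, toy_genres)

# Always 1, whatever the pattern: grepl() returns one value per element
length(hits) / length(toy_genres)

# Share of combinations containing at least one main genre
mean(hits)

# Combinations that are not covered
toy_genres[!hits]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here two combinations out of three are covered, and “Horror Mystery” is the one left out.&lt;/p&gt;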
&lt;div id=&#34;first-reduction&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;FIRST REDUCTION&lt;/h3&gt;
&lt;p&gt;We finally add a main genre to all movies.&lt;br&gt; &lt;strong&gt;Be careful: the main genre of each movie depends on the order in which you assign the main genres, so the final shape of the output depends on this step.&lt;/strong&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#1
db$Genresl1=ifelse(grepl(&amp;quot;Family&amp;quot;,db$Genres),
                   &amp;quot;Family&amp;quot;, db$Genres)

#2
db$Genresl1=ifelse(grepl(&amp;quot;Drama&amp;quot;, db$Genresl1),
                   &amp;quot;Drama&amp;quot;, db$Genresl1)

#3
db$Genresl1=ifelse( grepl(&amp;quot;Thriller&amp;quot;, db$Genresl1),
                    &amp;quot;Thriller&amp;quot;, db$Genresl1)

#4
db$Genresl1=ifelse(grepl(&amp;quot;Action&amp;quot;, db$Genresl1),
                   &amp;quot;Action&amp;quot;, db$Genresl1)

#5
db$Genresl1 =ifelse(grepl(&amp;quot;Adventure&amp;quot;, db$Genresl1),
                    &amp;quot;Adventure&amp;quot;, db$Genresl1)

#6
db$Genresl1=ifelse(grepl(&amp;quot;Comedy&amp;quot;, db$Genresl1),
                   &amp;quot;Comedy&amp;quot;, db$Genresl1)&lt;/code&gt;&lt;/pre&gt;
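&lt;p&gt;To make the effect of the order concrete, here is a minimal sketch that wraps the same &lt;code&gt;ifelse(grepl())&lt;/code&gt; chain in a loop. The helper &lt;code&gt;assign_main()&lt;/code&gt; and the two combinations are hypothetical and only serve as an illustration:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical helper: applies the ifelse(grepl()) step once per genre,
# in the order given by priority
assign_main = function(genres, priority) {
  out = genres
  for (g in priority) {
    out = ifelse(grepl(g, out), g, out)
  }
  out
}

# Made-up combinations
toy = c(&amp;quot;Action Comedy Family&amp;quot;, &amp;quot;Action Thriller&amp;quot;)

# Same order as above: Family first, Comedy last
res_family_first = assign_main(toy, c(&amp;quot;Family&amp;quot;, &amp;quot;Drama&amp;quot;, &amp;quot;Thriller&amp;quot;,
                                      &amp;quot;Action&amp;quot;, &amp;quot;Adventure&amp;quot;, &amp;quot;Comedy&amp;quot;))

# Another order: Action first
res_action_first = assign_main(toy, c(&amp;quot;Action&amp;quot;, &amp;quot;Family&amp;quot;, &amp;quot;Drama&amp;quot;,
                                      &amp;quot;Thriller&amp;quot;, &amp;quot;Adventure&amp;quot;, &amp;quot;Comedy&amp;quot;))

res_family_first
res_action_first&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With the first order the two movies end up in Family and Thriller; with the second order both end up in Action. Once a combination has been replaced by a single genre, the later steps no longer match it, so the genres tested first win.&lt;/p&gt;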
&lt;p&gt;Now that the main genres have been assigned, let’s focus on the subgenres.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;subgenres&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;SUBGENRES&lt;/h1&gt;
&lt;p&gt;We have seen that only 6 genres can be considered main genres. In this part, however, we treat all genres as potential subgenres. One of the difficulties is deciding which subgenre to select when there is more than one option. Association rules can help us with this task: they show which subgenres appear most often with each genre, and how strongly they depend on each other.&lt;/p&gt;
&lt;div id=&#34;association-rules&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;ASSOCIATION RULES&lt;/h2&gt;
&lt;p&gt;Let’s analyze the different genre combinations through an association rule analysis. We first need to read the data as transactions. For that we use the package &lt;code&gt;arules&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(arules)
#no duplicate combinations!
item_genres = read.transactions(&amp;quot;https://cdn.rawgit.com/krosamont/Cinema/dd7eca65/itemGenres.csv&amp;quot;,
                                format = &amp;quot;basket&amp;quot;, sep=&amp;quot;:&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this post, we will focus on 2 association rule indicators: &lt;strong&gt;the support&lt;/strong&gt; and &lt;strong&gt;the confidence&lt;/strong&gt;. &lt;br&gt; Support and confidence are displayed as in the result below when the function &lt;code&gt;arules::inspect()&lt;/code&gt; is used. &lt;br&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;##     lhs              rhs      support     confidence lift     count
## [1] {Documentary} =&amp;gt; {Drama}  0.004444444 1.0000000  2.777778  1   
## [2] {War}         =&amp;gt; {Drama}  0.057777778 0.9285714  2.579365 13   
## [3] {History}     =&amp;gt; {Drama}  0.080000000 0.9473684  2.631579 18   
## [4] {Animation}   =&amp;gt; {Family} 0.208888889 0.9591837  2.731852 47&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;• &lt;strong&gt;Support&lt;/strong&gt; indicates how frequently the genres in columns lhs and rhs appear together in the 224 combinations. The second row of the result above means that War and Drama appear together in 5.78% of combinations.&lt;/p&gt;
&lt;p&gt;• &lt;strong&gt;Confidence&lt;/strong&gt; is an indication of how often the rule has been found to be true. It can also be seen as a conditional probability. { X =&amp;gt; Y } means P(Y | X). This is the probability that the genre Y is also present when we already know that genre X is present. { War =&amp;gt; Drama } = 0.929 from the second line of the result above means that Drama is present in 92.9% of the combinations where War is present.&lt;br&gt; &lt;strong&gt;But be careful, this relation is not necessarily true in the opposite direction!&lt;/strong&gt;&lt;/p&gt;
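&lt;p&gt;To make these two definitions concrete, here is a minimal sketch that computes them by hand in base R, without &lt;code&gt;arules&lt;/code&gt;. The five combinations and the helper &lt;code&gt;has()&lt;/code&gt; are made up for illustration and are not taken from the dataset:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Made-up genre combinations, for illustration only
combos = list(c(&amp;quot;War&amp;quot;, &amp;quot;Drama&amp;quot;),
              c(&amp;quot;War&amp;quot;, &amp;quot;Drama&amp;quot;, &amp;quot;Action&amp;quot;),
              c(&amp;quot;Drama&amp;quot;, &amp;quot;Comedy&amp;quot;),
              c(&amp;quot;Drama&amp;quot;, &amp;quot;Romance&amp;quot;),
              c(&amp;quot;Action&amp;quot;, &amp;quot;Comedy&amp;quot;))

# Hypothetical helper: TRUE for each combination that contains genre g
has = function(g) sapply(combos, function(x) g %in% x)

# Combinations containing both War and Drama
both = has(&amp;quot;War&amp;quot;) &amp;amp; has(&amp;quot;Drama&amp;quot;)

# Support: share of all combinations that contain both genres
support = mean(both)

# Confidence of { War =&amp;gt; Drama }, i.e. P(Drama | War)
conf_war_drama = sum(both) / sum(has(&amp;quot;War&amp;quot;))

# Confidence of { Drama =&amp;gt; War }, i.e. P(War | Drama)
conf_drama_war = sum(both) / sum(has(&amp;quot;Drama&amp;quot;))

c(support, conf_war_drama, conf_drama_war)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this toy example the support is 0.4, the confidence of { War =&amp;gt; Drama } is 1, but the confidence of { Drama =&amp;gt; War } is only 0.5, which shows why the direction of a rule matters.&lt;/p&gt;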
&lt;p&gt;To see all association rules starting from a confidence level of 30% between 2 genres we write: &lt;br&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;rules = apriori(item_genres, 
                parameter=list(support=(1/nrow(item_genres)), 
                confidence=0.3, minlen=2, maxlen=2)  )
ins_rules = inspect(rules) 

ins_rules&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If we want to focus on the relationship between subgenres and main genres, we can filter the rhs columns.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;mainGenres = unlist(strsplit(mainGenres, &amp;quot;|&amp;quot;, fixed = TRUE))
ins_rules = ins_rules %&amp;gt;% 
        #removing the arrow =&amp;gt;
        .[,-2] %&amp;gt;%
        #removing the brackets for both columns, lhs and rhs
        mutate(lhs = trimws(gsub(&amp;quot;\\{|\\}&amp;quot;,&amp;quot;&amp;quot;,lhs)),
               rhs = trimws(gsub(&amp;quot;\\{|\\}&amp;quot;,&amp;quot;&amp;quot;,rhs))) %&amp;gt;%
        filter(rhs %in% mainGenres) %&amp;gt;%
        group_by(lhs) %&amp;gt;%
        filter(row_number() == 3) %&amp;gt;%
        arrange(lhs, desc(confidence))

ins_rules&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 17 x 6
## # Groups:   lhs [17]
##    lhs       rhs       support confidence  lift count
##    &amp;lt;chr&amp;gt;     &amp;lt;chr&amp;gt;       &amp;lt;dbl&amp;gt;      &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
##  1 Adventure Action     0.293       0.574 1.17  66.0 
##  2 Animation Adventure  0.169       0.776 1.52  38.0 
##  3 Biography Adventure  0.0133      0.333 0.652  3.00
##  4 Comedy    Adventure  0.187       0.512 1.00  42.0 
##  5 Crime     Comedy     0.0489      0.355 0.974 11.0 
##  6 Drama     Adventure  0.133       0.370 0.725 30.0 
##  7 Family    Adventure  0.249       0.709 1.39  56.0 
##  8 Fantasy   Action     0.129       0.387 0.791 29.0 
##  9 History   Adventure  0.0356      0.421 0.824  8.00
## 10 Musical   Family     0.0622      0.875 2.49  14.0 
## 11 Mystery   Adventure  0.0667      0.500 0.978 15.0 
## 12 Romance   Family     0.0533      0.300 0.854 12.0 
## 13 Sci-Fi    Family     0.0889      0.312 0.890 20.0 
## 14 Sport     Family     0.0222      0.500 1.42   5.00
## 15 Thriller  Adventure  0.138       0.431 0.842 31.0 
## 16 War       Adventure  0.0222      0.357 0.699  5.00
## 17 Western   Adventure  0.0222      0.625 1.22   5.00&lt;/code&gt;&lt;/pre&gt;
&lt;div id=&#34;barplot&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;BARPLOT&lt;/h3&gt;
&lt;p&gt;We create a new variable named &lt;code&gt;withoutMainGenres&lt;/code&gt;. This variable is the combination of genres without the main genre. If a movie has the combination “Drama War Action Biography” and its main genre is “Drama”, then the value of &lt;code&gt;withoutMainGenres&lt;/code&gt; will be “War Action Biography”. If that’s not clear enough, I suggest that you run the code and compare the variables &lt;code&gt;withoutMainGenres&lt;/code&gt; and &lt;code&gt;Genres&lt;/code&gt;. Once this new variable is created, we draw a barplot again to see the distribution of genres.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;db$withoutMainGenres = trimws(mapply(gsub, db$Genresl1, &amp;quot;&amp;quot;, db$Genres))

all_genres = separate_rows(db %&amp;gt;% 
                           group_by(withoutMainGenres) %&amp;gt;% 
                           select(withoutMainGenres) %&amp;gt;% 
                           filter(row_number() ==1),
                           withoutMainGenres, 
                           sep=&amp;quot;[[:space:]]&amp;quot;) %&amp;gt;% 
             rename( Genres=withoutMainGenres) %&amp;gt;%
             filter(nchar(Genres)&amp;gt;0)

name_order = names(sort(table(all_genres)))

ggplot(all_genres, aes(Genres)) +
                theme_minimal( ) + 
        geom_bar( stat = &amp;quot;count&amp;quot;, fill=&amp;quot;#007acc&amp;quot; ) +
        coord_flip() +
        scale_x_discrete(limits = name_order)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2017-12-04-visualizing-box-office-revenue-by-genre_files/figure-html/unnamed-chunk-15-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We see that there are still a lot of adventure movies. We use the results from the association rules and the barplot to build the subgenres.&lt;br&gt; We begin with the genre Animation because we want to group all of these movies in the same category. Then we add subgenres in ascending order, from the least important to the most important.&lt;br&gt; However, the musical, music and horror genres are assigned at the end of the script because the attribution of these genres to the movies in our dataset is questionable.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;db$Genresl2=ifelse(grepl(&amp;quot;Animation&amp;quot;,db$withoutMainGenres), 
                   &amp;quot;Animation&amp;quot;, db$withoutMainGenres)
db$Genresl2=ifelse(grepl(&amp;quot;Documentary&amp;quot;,db$Genresl2),
                   &amp;quot;Documentary&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Biography&amp;quot;, db$Genresl2), 
                   &amp;quot;Biography&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Western&amp;quot;,db$Genresl2), 
                   &amp;quot;Western&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Sport&amp;quot;,db$Genresl2), 
                   &amp;quot;Sport&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;War&amp;quot;,db$Genresl2), 
                   &amp;quot;War&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Mystery&amp;quot;,db$Genresl2), 
                   &amp;quot;Mystery&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Romance&amp;quot;,db$Genresl2), 
                   &amp;quot;Romance&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Crime&amp;quot;,db$Genresl2), 
                   &amp;quot;Crime&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Drama&amp;quot;,db$Genresl2), 
                   &amp;quot;Drama&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Fantasy&amp;quot;,db$Genresl2), 
                   &amp;quot;Fantasy&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Sci-Fi&amp;quot;,db$Genresl2), 
                   &amp;quot;Sci-Fi&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Comedy&amp;quot;,db$Genresl2), 
                   &amp;quot;Comedy&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Thriller&amp;quot;,db$Genresl2), 
                   &amp;quot;Thriller&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Adventure&amp;quot;,db$Genresl2), 
                   &amp;quot;Adventure&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Musical&amp;quot;,db$Genresl2), 
                   &amp;quot;Musical&amp;quot;, db$Genresl2)
# \\b marks a word boundary, so that this step does not also turn Musical into Music
db$Genresl2=ifelse(grepl(&amp;quot;\\bMusic\\b&amp;quot;,db$Genresl2), 
                   &amp;quot;Music&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(grepl(&amp;quot;Horror&amp;quot;,db$Genresl2), 
                   &amp;quot;Horror&amp;quot;, db$Genresl2)
db$Genresl2=ifelse(db$Genresl2==&amp;quot;&amp;quot;,
                   db$Genresl1, db$Genresl2)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now that we have our 2 levels of genres, we can build our treemap!&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;treemap-with-treemapify&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;TREEMAP WITH TREEMAPIFY&lt;/h1&gt;
&lt;p&gt;To design the treemap, we need to group movies by main genre and subgenre, then sum their worldwide gross revenue.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;summary.Genre = db %&amp;gt;%
        group_by(Genresl1, Genresl2) %&amp;gt;%
        summarise(Sum_Gross = sum(Worldwide.Gross))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, we design the treemap using &lt;code&gt;ggplot2&lt;/code&gt; and &lt;code&gt;treemapify&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(treemapify)

ggplot(summary.Genre, aes(area = Sum_Gross ,
                          fill = Genresl1, label = Genresl2,
                          subgroup =Genresl1)) +
        geom_treemap() +
        geom_treemap_subgroup_border() +
        geom_treemap_subgroup_text(place = &amp;quot;centre&amp;quot;, 
                                   grow = TRUE, 
                                   alpha = 0.5, 
                                   colour = &amp;quot;black&amp;quot;, 
                                   fontface = &amp;quot;italic&amp;quot;, 
                                   min.size = 0) +
        geom_treemap_text(colour = &amp;quot;white&amp;quot;, 
                          place = &amp;quot;topleft&amp;quot;, 
                          reflow = TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2017-12-04-visualizing-box-office-revenue-by-genre_files/figure-html/unnamed-chunk-18-1.png&#34; width=&#34;672&#34; /&gt; &lt;br&gt;&lt;/p&gt;
&lt;p&gt;This is a first result, but we can do better by adding some interactivity.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;treemap-with-highcharter&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;TREEMAP WITH HIGHCHARTER&lt;/h1&gt;
&lt;p&gt;Let’s add some interactivity using the package &lt;code&gt;highcharter&lt;/code&gt;. We use the GitHub version, which provides more functions.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;devtools::install_github(&amp;quot;jbkunst/highcharter&amp;quot;)

library(highcharter)
hctreemap2(data = db,
           group_vars = c(&amp;quot;Genresl1&amp;quot;, &amp;quot;Genresl2&amp;quot;),
           size_var = &amp;quot;Worldwide.Gross&amp;quot;,
           color_var = &amp;quot;Genresl2&amp;quot;,
           layoutAlgorithm = &amp;quot;squarified&amp;quot;,
           levelIsConstant = FALSE,
           levels = list(
                   list(level = 1, 
                        dataLabels = list(enabled = TRUE)),
                   list(level = 2, 
                        dataLabels = list(enabled = FALSE))
           )) %&amp;gt;% 
        hc_tooltip(pointFormat = &amp;quot;&amp;lt;b&amp;gt;{point.name}&amp;lt;/b&amp;gt;:&amp;lt;br&amp;gt;
                   Worldwide Gross: $ {point.value:,.0f}&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The following error message appears:&lt;br&gt; &lt;font color=&#34;red&#34;&gt;&lt;strong&gt;Error in hctreemap2(data = db, group_vars = c(“Genresl1”, “Genresl2”) :&lt;br&gt; Treemap data uses same label at multiple levels.&lt;/strong&gt; &lt;/font&gt;&lt;br&gt;&lt;/p&gt;
&lt;p&gt;We can’t design a two-level treemap with &lt;code&gt;highcharter&lt;/code&gt; as is, because some labels appear both as a main genre and as a subgenre. R is a great tool for data manipulation, but in this case javascript turned out to be the easier tool for the visualization. &lt;br&gt;&lt;/p&gt;
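&lt;p&gt;One possible way around this in R, sketched here on made-up data and not tested against &lt;code&gt;highcharter&lt;/code&gt;, would be to make the subgenre labels unique by prefixing them with their main genre. The data frame &lt;code&gt;db_toy&lt;/code&gt; and the column &lt;code&gt;Genresl2_unique&lt;/code&gt; are hypothetical names used only for illustration:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Toy data standing in for the post&amp;#39;s db (hypothetical values)
db_toy = data.frame(
        Genresl1 = c(&amp;quot;Action&amp;quot;, &amp;quot;Action&amp;quot;, &amp;quot;Drama&amp;quot;, &amp;quot;Comedy&amp;quot;),
        Genresl2 = c(&amp;quot;Drama&amp;quot;, &amp;quot;Action&amp;quot;, &amp;quot;Drama&amp;quot;, &amp;quot;Romance&amp;quot;),
        stringsAsFactors = FALSE
)

# Labels that are used at both levels, which is what the error message complains about
shared = intersect(db_toy$Genresl1, db_toy$Genresl2)
shared

# Prefix each subgenre with its main genre so that no label is reused across levels
db_toy$Genresl2_unique = paste(db_toy$Genresl1, db_toy$Genresl2, sep = &amp;quot; / &amp;quot;)

# No label is shared between the two levels any more
remaining = intersect(db_toy$Genresl1, db_toy$Genresl2_unique)
length(remaining)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The price of this approach is longer labels inside the tiles, which may or may not be acceptable for the final chart.&lt;/p&gt;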
&lt;p&gt;We can easily design a two-level responsive treemap with the javascript library &lt;a href=&#34;https://www.highcharts.com/&#34;&gt;Highcharts&lt;/a&gt;.&lt;/p&gt;
&lt;div id=&#34;tmp2&#34; class=&#34;tmap&#34;&gt;

&lt;/div&gt;
&lt;script
  src=&#34;https://code.jquery.com/jquery-3.2.1.min.js&#34;
  integrity=&#34;sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4=&#34;
  crossorigin=&#34;anonymous&#34;&gt;&lt;/script&gt;
&lt;script
  src=&#34;https://code.jquery.com/ui/1.12.1/jquery-ui.min.js&#34;
  integrity=&#34;sha256-VazP97ZCwtekAsvgPBSUwPFKdrwD3unUfSGVYrahUqU=&#34;
  crossorigin=&#34;anonymous&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;https://code.highcharts.com/highcharts.js&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;https://code.highcharts.com/modules/treemap.js&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;https://cdn.rawgit.com/krosamont/Cinema/dd7eca65/treemap/js/cinemaTreemap.js&#34;&gt;&lt;/script&gt;
&lt;p&gt;&lt;link rel=&#34;stylesheet&#34; href=&#34;https://cdn.rawgit.com/krosamont/Cinema/dd7eca65/treemap/css/styleSheet.css&#34;&gt;&lt;/p&gt;
&lt;p&gt;
Don’t hesitate to follow us on twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;citation&#34;&gt;@rdata_lu&lt;/span&gt;&lt;/a&gt; &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt; and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our youtube channel. &lt;br&gt; You can also contact us if you have any comments or suggestions. See you for the next post!
&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Easy peasy STATA-like marginal effects with R</title>
      <link>/post/2017-10-26-easy-peasy-stata-like-marginal-effect-with-r/</link>
      <pubDate>Thu, 26 Oct 2017 06:45:48 +0200</pubDate>
      
      <guid>/post/2017-10-26-easy-peasy-stata-like-marginal-effect-with-r/</guid>
      <description>&lt;p&gt;Model interpretation is essential in the social sciences. If one wants to know the effect of variable &lt;code&gt;x&lt;/code&gt; on the dependent variable &lt;code&gt;y&lt;/code&gt;, marginal effects are an easy way to get the answer. STATA includes a &lt;code&gt;margins&lt;/code&gt; command that has been ported to R by &lt;a href=&#34;http://thomasleeper.com/&#34;&gt;Thomas J. Leeper&lt;/a&gt; of the London School of Economics and Political Science. You can find the source code of the package &lt;a href=&#34;https://github.com/leeper/margins&#34;&gt;on github&lt;/a&gt;. In this short blog post, I demo some of the functionality of &lt;code&gt;margins&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;First, let’s load some packages:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
library(tibble)
library(broom)
library(margins)
library(Ecdat)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As an example, we are going to use the &lt;code&gt;Participation&lt;/code&gt; data from the &lt;code&gt;Ecdat&lt;/code&gt; package:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;data(Participation)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;?Participation&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;Labor Force Participation

Description

a cross-section

number of observations : 872

observation : individuals

country : Switzerland

Usage

data(Participation)
Format

A dataframe containing :

lfp
labour force participation ?

lnnlinc
the log of nonlabour income

age
age in years divided by 10

educ
years of formal education

nyc
the number of young children (younger than 7)

noc
number of older children

foreign
foreigner ?

Source

Gerfin, Michael (1996) “Parametric and semiparametric estimation of the binary response”, Journal of Applied Econometrics, 11(3), 321-340.

References

Davidson, R. and James G. MacKinnon (2004) Econometric Theory and Methods, New York, Oxford University Press, http://www.econ.queensu.ca/ETM/, chapter 11.

Journal of Applied Econometrics data archive : http://qed.econ.queensu.ca/jae/.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The variable of interest is &lt;code&gt;lfp&lt;/code&gt;: whether the individual participates in the labour force or not. To know which variables are relevant in the decision to participate in the labour force, one could estimate a logit model, using &lt;code&gt;glm()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;logit_participation = glm(lfp ~ ., data = Participation, family = &amp;quot;binomial&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now that we have run the regression, we can take a look at the results. I like to use &lt;code&gt;broom::tidy()&lt;/code&gt; to look at the results of regressions, as &lt;code&gt;tidy()&lt;/code&gt; returns a nice &lt;code&gt;data.frame&lt;/code&gt;, but you could use &lt;code&gt;summary()&lt;/code&gt; if you’re only interested in reading the output:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;tidy(logit_participation)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##          term    estimate  std.error  statistic      p.value
## 1 (Intercept) 10.37434616 2.16685216  4.7877499 1.686617e-06
## 2     lnnlinc -0.81504064 0.20550116 -3.9661122 7.305449e-05
## 3         age -0.51032975 0.09051783 -5.6378920 1.721444e-08
## 4        educ  0.03172803 0.02903580  1.0927211 2.745163e-01
## 5         nyc -1.33072362 0.18017027 -7.3859224 1.514000e-13
## 6         noc -0.02198573 0.07376636 -0.2980454 7.656685e-01
## 7  foreignyes  1.31040497 0.19975784  6.5599678 5.381941e-11&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;From the results above, one can only interpret the sign of the coefficients. To know how much a variable influences the labour force participation, one has to use &lt;code&gt;margins()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;effects_logit_participation = margins(logit_participation) &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in warn_for_weights(model): &amp;#39;weights&amp;#39; used in model estimation are
## currently ignored!&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;print(effects_logit_participation)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Average marginal effects&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## glm(formula = lfp ~ ., family = &amp;quot;binomial&amp;quot;, data = Participation)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##  lnnlinc     age     educ     nyc       noc foreignyes
##  -0.1699 -0.1064 0.006616 -0.2775 -0.004584     0.2834&lt;/code&gt;&lt;/pre&gt;
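&lt;p&gt;To see where these numbers come from: in a logit model, the marginal effect of a continuous regressor is its coefficient times p(1 - p), where p is the fitted probability, and the average marginal effect is the mean of that quantity over the sample. The sketch below recomputes this by hand. The names &lt;code&gt;fit&lt;/code&gt;, &lt;code&gt;dens&lt;/code&gt; and &lt;code&gt;ame_by_hand&lt;/code&gt; are mine, and the result should only be expected to be close to what &lt;code&gt;margins()&lt;/code&gt; prints for the continuous variables. The factor &lt;code&gt;foreign&lt;/code&gt; is left out, since a discrete change is the more natural measure for a dummy variable.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(Ecdat)
data(Participation)

# Refit the logit model from above so that this chunk runs on its own
fit = glm(lfp ~ ., data = Participation, family = &amp;quot;binomial&amp;quot;)

# Logistic density at each individual&amp;#39;s linear predictor,
# which equals p * (1 - p) for the fitted probability p
dens = dlogis(predict(fit, type = &amp;quot;link&amp;quot;))

# Average marginal effect of each continuous regressor:
# coefficient times the mean of p * (1 - p)
continuous = c(&amp;quot;lnnlinc&amp;quot;, &amp;quot;age&amp;quot;, &amp;quot;educ&amp;quot;, &amp;quot;nyc&amp;quot;, &amp;quot;noc&amp;quot;)
ame_by_hand = coef(fit)[continuous] * mean(dens)
round(ame_by_hand, 4)&lt;/code&gt;&lt;/pre&gt;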
&lt;p&gt;Using &lt;code&gt;summary()&lt;/code&gt; on the object returned by &lt;code&gt;margins()&lt;/code&gt; provides more details:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;summary(effects_logit_participation)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##      factor     AME     SE       z      p   lower   upper
##         age -0.1064 0.0176 -6.0494 0.0000 -0.1409 -0.0719
##        educ  0.0066 0.0060  1.0955 0.2733 -0.0052  0.0185
##  foreignyes  0.2834 0.0399  7.1102 0.0000  0.2053  0.3615
##     lnnlinc -0.1699 0.0415 -4.0994 0.0000 -0.2512 -0.0887
##         noc -0.0046 0.0154 -0.2981 0.7656 -0.0347  0.0256
##         nyc -0.2775 0.0333 -8.3433 0.0000 -0.3426 -0.2123&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And it is also possible to plot the effects with base graphics:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;plot(effects_logit_participation)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2017-10-26-easy-peasy-stata-like-marginal-effect-with-r_files/figure-html/unnamed-chunk-9-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;This uses base R graphics, which is convenient because it only takes a call to &lt;code&gt;plot()&lt;/code&gt;. But if you’ve been using &lt;code&gt;ggplot2&lt;/code&gt; and want this graph to have the same look as your other &lt;code&gt;ggplot2&lt;/code&gt; graphs, you first need to save the summary in a variable. Let’s overwrite the &lt;code&gt;effects_logit_participation&lt;/code&gt; variable with its summary:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;effects_logit_participation = summary(effects_logit_participation)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And now it is possible to use &lt;code&gt;ggplot2&lt;/code&gt; to create the same plot:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(data = effects_logit_participation) +
  geom_point(aes(factor, AME)) +
  geom_errorbar(aes(x = factor, ymin = lower, ymax = upper)) +
  geom_hline(yintercept = 0) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2017-10-26-easy-peasy-stata-like-marginal-effect-with-r_files/figure-html/unnamed-chunk-11-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;So a small increase of, say, 0.001 in the log of non-labour income (&lt;code&gt;lnnlinc&lt;/code&gt;) is associated, on average, with a decrease in the probability of labour force participation of about 0.001*17 = 0.017 percentage points.&lt;/p&gt;
&lt;p&gt;You can also extract the marginal effects of a single variable, with &lt;code&gt;dydx()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;head(dydx(Participation, logit_participation, &amp;quot;lnnlinc&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##   dydx_lnnlinc
## 1  -0.15667764
## 2  -0.20014487
## 3  -0.18495109
## 4  -0.05377262
## 5  -0.18710476
## 6  -0.19586986&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This makes it possible to extract the effects for a list of individuals that you create yourself:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;my_subjects = tribble(
    ~lfp,  ~lnnlinc, ~age, ~educ, ~nyc, ~noc, ~foreign,
    &amp;quot;yes&amp;quot;,   10.780,  7.0,     4,    1,    1,    &amp;quot;yes&amp;quot;,
     &amp;quot;no&amp;quot;,     1.30,  9.0,     1,    4,    1,    &amp;quot;yes&amp;quot;
)

dydx(my_subjects, logit_participation, &amp;quot;lnnlinc&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##   dydx_lnnlinc
## 1  -0.09228119
## 2  -0.17953451&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I used the &lt;code&gt;tribble()&lt;/code&gt; function from the &lt;code&gt;tibble&lt;/code&gt; package to create this test data set, row by row. Then, using &lt;code&gt;dydx()&lt;/code&gt;, I get the marginal effect of variable &lt;code&gt;lnnlinc&lt;/code&gt; for these two individuals. No doubt that this package will be a huge help convincing more social scientists to try out R and make a potential transition from STATA easier.&lt;/p&gt;
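&lt;p&gt;As a cross-check of the &lt;code&gt;dydx()&lt;/code&gt; output above, the effect for the first made-up individual can be recomputed by hand as the &lt;code&gt;lnnlinc&lt;/code&gt; coefficient times p(1 - p). This is only a sketch: the names &lt;code&gt;fit&lt;/code&gt;, &lt;code&gt;subject&lt;/code&gt; and &lt;code&gt;me_subject&lt;/code&gt; are mine, a plain &lt;code&gt;data.frame&lt;/code&gt; replaces the &lt;code&gt;tribble()&lt;/code&gt;, and the result should be close to, not necessarily identical to, the first value returned by &lt;code&gt;dydx()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(Ecdat)
data(Participation)

# Refit the logit model from above so that this chunk runs on its own
fit = glm(lfp ~ ., data = Participation, family = &amp;quot;binomial&amp;quot;)

# First individual from the tribble above (the response lfp is not needed for prediction)
subject = data.frame(
        lnnlinc = 10.78, age = 7, educ = 4, nyc = 1, noc = 1,
        foreign = factor(&amp;quot;yes&amp;quot;, levels = levels(Participation$foreign))
)

# Fitted probability of participating in the labour force
p = predict(fit, newdata = subject, type = &amp;quot;response&amp;quot;)

# Marginal effect of lnnlinc for this individual: coefficient times p * (1 - p)
me_subject = unname(coef(fit)[&amp;quot;lnnlinc&amp;quot;] * p * (1 - p))
me_subject&lt;/code&gt;&lt;/pre&gt;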
&lt;p&gt;
Don’t hesitate to follow us on twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;citation&#34;&gt;@rdata_lu&lt;/span&gt;&lt;/a&gt; &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt; and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our youtube channel. &lt;br&gt; You can also contact us if you have any comments or suggestions. See you for the next post!
&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Barplot with ggplot2/plotly</title>
      <link>/post/2017-10-16-barplot-ggplotly/</link>
      <pubDate>Mon, 16 Oct 2017 00:00:00 +0000</pubDate>
      
      <guid>/post/2017-10-16-barplot-ggplotly/</guid>
      <description>&lt;script src=&#34;/rmarkdown-libs/htmlwidgets/htmlwidgets.js&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;/rmarkdown-libs/plotly-binding/plotly.js&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;/rmarkdown-libs/typedarray/typedarray.min.js&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;/rmarkdown-libs/jquery/jquery.min.js&#34;&gt;&lt;/script&gt;
&lt;link href=&#34;/rmarkdown-libs/crosstalk/css/crosstalk.css&#34; rel=&#34;stylesheet&#34; /&gt;
&lt;script src=&#34;/rmarkdown-libs/crosstalk/js/crosstalk.min.js&#34;&gt;&lt;/script&gt;
&lt;link href=&#34;/rmarkdown-libs/plotlyjs/plotly-htmlwidgets.css&#34; rel=&#34;stylesheet&#34; /&gt;
&lt;script src=&#34;/rmarkdown-libs/plotlyjs/plotly-latest.min.js&#34;&gt;&lt;/script&gt;


&lt;p&gt;Hello everyone,&lt;/p&gt;
&lt;p&gt;I just finished my MOOC on Foundations of Strategic Business Analytics. It was interesting, and at the end of the course I had to present a graph that was supposed to be relevant for a business organization. Several datasets were available: &lt;a href=&#34;http://www.stat.columbia.edu/~gelman/arm/examples/speed.dating/&#34;&gt;speed dating&lt;/a&gt;, &lt;a href=&#34;https://www.eea.europa.eu/data-and-maps/data/co2-cars-emission-8&#34;&gt;CO2 emissions&lt;/a&gt;, &lt;a href=&#34;https://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset&#34;&gt;bike sharing&lt;/a&gt;, &lt;a href=&#34;https://www.lendingclub.com/info/download-data.action&#34;&gt;loans&lt;/a&gt;, &lt;a href=&#34;http://www.kdd.org/kdd-cup/view/kdd-cup-2009/Data&#34;&gt;telecom churn&lt;/a&gt;, &lt;a href=&#34;https://www.data.gouv.fr/en/datasets/prix-des-carburants-en-france/&#34;&gt;fuel prices&lt;/a&gt;, &lt;a href=&#34;http://www.ameli.fr/fileadmin/user_upload/documents/Medic_AM_mensuel_2016_-_2e_semestre_tous_regimes.zip&#34;&gt;medical expense refunds&lt;/a&gt; and more. I chose to work on the medical expense refunds. This dataset gives the refunded amount, the number of refunded drugs, the drug name and the drug category for each month from July to December 2016. There are 84 drug categories.&lt;/p&gt;
&lt;p&gt;As the French health insurance is a public institution, it may be more interesting to find a way to monitor the data than to find a way to refund fewer drugs. Showing all 84 categories would not be readable, so I decided to select just some of them.&lt;/p&gt;
&lt;p&gt;First, I wanted to analyse the five most refunded drug categories per month. But I quickly realized that a line chart would suit this better than a barplot, because the barplot was not very readable (see below).&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;/post/2017-10-16-barplot-ggplotly_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;1344&#34; /&gt;&lt;/p&gt;
&lt;p&gt;I was not happy with my first result, so I decided to make a new graph of the fifteen most refunded drug categories over the whole second semester (July to December) of 2016.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#We need to rebuild some of our previous tables because we now select 15 categories.
res_all2 = tous_presc %&amp;gt;%
        group_by(label) %&amp;gt;%
        summarise_each(funs(sum)) %&amp;gt;%
        filter(!is.na(label)) %&amp;gt;%
        arrange(desc(`Montant remboursé \n2016-07`)) %&amp;gt;%
        filter( row_number() %in% c(1:15) ) %&amp;gt;%
        as.data.frame()

top_med = res_all2$label

res_city2 = city_presc %&amp;gt;%
        group_by(label) %&amp;gt;%
        summarise_each(funs(sum)) %&amp;gt;%
        filter(!is.na(label) &amp;amp; label %in% top_med) %&amp;gt;%
        arrange(desc(`Montant remboursé \n2016-07`)) %&amp;gt;%
        as.data.frame() 
res_city2$`type of prescriber` = &amp;quot;private practitioner&amp;quot;

res_hop2 = hop_presc %&amp;gt;%
        group_by(label) %&amp;gt;%
        summarise_each(funs(sum)) %&amp;gt;%
        filter(!is.na(label) &amp;amp; label %in% top_med) %&amp;gt;%
        arrange(desc(`Montant remboursé \n2016-07`))  %&amp;gt;%
        as.data.frame() 
res_hop2$`type of prescriber` = &amp;quot;salaried practitioner&amp;quot;

df2 = rbind(res_city2, res_hop2)
df2$`type of prescriber` = toupper(df2$`type of prescriber`)
df2$`type of drugs` = df2$label
#translate into English
df2$`type of drugs` = gsub(&amp;quot;IMMUNOSUPPRESSEURS&amp;quot;,&amp;quot;IMMUNOSUPPRESSIVES&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;MEDICAMENTS DU DIABETE&amp;quot;,&amp;quot;DIABETES MEDICINES&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;ANTITHROMBOTIQUES&amp;quot;,&amp;quot;ANTITHROMBOTICS&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;ANTIVIRAUX A USAGE SYSTEMIQUE&amp;quot;,&amp;quot;ANTIVIRALS FOR SYSTEMIC USE&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;ANTINEOPLASIQUES&amp;quot;,&amp;quot;ANTINEOPLASTICS&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;AGENTS MODIFIANT LES LIPIDES&amp;quot;,&amp;quot;LIPID MODIFYING AGENT&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;ANTIBACTERIENS A USAGE SYSTEMIQUE&amp;quot;,&amp;quot;SYSTEMIC ANTIBACTERIAL&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;IMMUNOSTIMULANTS&amp;quot;,&amp;quot;IMMUNOSTIMULANTS&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;MEDICAMENTS AGISSANT SUR LE SYSTEME RENINE-ANGIOTENSINE&amp;quot;,&amp;quot;DRUGS AFFECT THE RENIN-ANGIOTENSIN SYSTEM&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;MEDICAMENTS OPHTALMOLOGIQUES&amp;quot;,&amp;quot;OPHTHALMIC DRUGS&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;MEDICAMENTS POUR LES SYNDROMES OBSTRUCTIFS DES VOIES AERIENNES&amp;quot;,&amp;quot;DRUGS AGAINST OBSTRUCTIVE PULMONARY DISEASE&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;MEDICAMENTS POUR LES TROUBLES DE L&amp;#39;ACIDITE&amp;quot;,&amp;quot;DRUGS AGAINST ACIDITY TROUBLE&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;PSYCHOLEPTIQUES&amp;quot;,&amp;quot;PSYCHOLEPTICS&amp;quot;, df2$`type of drugs`)
df2$`type of drugs` = gsub(&amp;quot;THERAPEUTIQUE ENDOCRINE&amp;quot;,&amp;quot;ENDOCRINE THERAPY&amp;quot;, df2$`type of drugs`)

colnames(df2) = c(&amp;quot;label&amp;quot;, &amp;quot;JULY&amp;quot;, &amp;quot;AUGUST&amp;quot;, &amp;quot;SEPTEMBER&amp;quot;, &amp;quot;OCTOBER&amp;quot;, &amp;quot;NOVEMBER&amp;quot;, &amp;quot;DECEMBER&amp;quot;, &amp;quot;PRESCRIBERS&amp;quot;, &amp;quot;DRUGS&amp;quot; )
dfdata2 = melt( df2[,-1], id.vars=c(&amp;quot;DRUGS&amp;quot;, &amp;quot;PRESCRIBERS&amp;quot;)) %&amp;gt;%
        rename(montant=value, date=variable) %&amp;gt;%
        arrange(date, DRUGS, PRESCRIBERS) %&amp;gt;% 
        group_by(DRUGS, PRESCRIBERS) %&amp;gt;%
        summarise(refund=sum(montant)) %&amp;gt;%
        as.data.frame() 

dfdata2$DRUGS = reorder(dfdata2$DRUGS, desc(dfdata2$refund))
#The total amount of refunded drugs
global_amout = sum(t(
        tous_presc %&amp;gt;%
                group_by(label) %&amp;gt;%
                filter(is.na(label)) %&amp;gt;%
                .[13,-7]))

#the percentage of the total refunded drugs that represents each category
dfdata2 = dfdata2 %&amp;gt;%
        group_by(DRUGS) %&amp;gt;%
        mutate( total = sum(refund),
                perct = paste(round(100*sum(refund)/global_amout,2),&amp;quot;%&amp;quot;, sep=&amp;quot;&amp;quot;),
                perct = ifelse(PRESCRIBERS==&amp;quot;SALARIED PRACTITIONER&amp;quot;, &amp;quot; &amp;quot;, perct )) %&amp;gt;%
        as.data.frame()



q = ggplot(dfdata2, aes(x=DRUGS, y=refund, group=PRESCRIBERS, fill=DRUGS, alpha=PRESCRIBERS))+
        geom_bar(stat=&amp;quot;identity&amp;quot;,position=&amp;quot;stack&amp;quot;,color=&amp;quot;black&amp;quot;)+ 
        ggtitle(&amp;quot;Top 15 of refunded drugs categories for the 2nd semester of 2016&amp;quot;)+
        scale_alpha_manual(values=c(0.2,0.75))+
        geom_text(aes(label=perct, y=total+2),alpha=1, color=&amp;quot;black&amp;quot;, position=position_dodge(width=0.2), vjust=-0.6, size=4) + 
        scale_y_continuous(labels = function(x) paste0(formatC(x/1000000, format=&amp;quot;d&amp;quot;, digits=0, big.mark = &amp;quot;,&amp;quot;), &amp;quot; €&amp;quot;))+
        labs(x=&amp;quot; &amp;quot;, y=&amp;quot;refunded amount (in million €)&amp;quot;) + 
        annotate(&amp;quot;text&amp;quot;, x=4.25, y=821000000, label= &amp;quot;(Percentage of total refunded amount)&amp;quot;, size=4.5) +
        annotate(&amp;quot;text&amp;quot;, x=11.3, y=890000000, label= &amp;quot;Total Amount of refunded drugs: 9,384,395,518 €&amp;quot;, size=6) + 
        theme_minimal(base_size = 15)+
        theme(  
                panel.grid.major.x = element_blank(),
                panel.grid.minor.x = element_blank(),
                legend.text = element_text(size = 10),
                plot.title = element_text(size=23,face=&amp;quot;bold&amp;quot;, hjust=0.5),
                axis.text.x = element_blank(),
                axis.ticks.x = element_blank(),
                axis.title.x = element_text(size=12, face=&amp;quot;bold&amp;quot;),
                axis.title.y = element_text(size=14,face=&amp;quot;bold&amp;quot;),
                strip.text.x = element_text(face=&amp;quot;italic&amp;quot;, size=11))


print(q)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2017-10-16-barplot-ggplotly_files/figure-html/unnamed-chunk-3-1.png&#34; width=&#34;1344&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Great, now we can add an interactive touch with the &lt;code&gt;plotly&lt;/code&gt; package.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#We load this library to add some interactivity to the previous graph
library(plotly)

#We compute some new variables to show in the tooltip
dfdata2 = dfdata2 %&amp;gt;%
        group_by(DRUGS) %&amp;gt;%
        mutate( total = sum(refund),
                perct = paste(round(100*refund/global_amout,2),&amp;quot;%&amp;quot;, sep=&amp;quot;&amp;quot;),
                perct_in_cat = paste(round(100*refund/sum(refund),2),&amp;quot;%&amp;quot;, sep=&amp;quot;&amp;quot;),
                perct_total_cat =  paste(round(100*sum(refund)/global_amout,2),&amp;quot;%&amp;quot;, sep=&amp;quot;&amp;quot;) ) %&amp;gt;%
        as.data.frame()

q = ggplot(dfdata2, aes(x=DRUGS, y=refund, group=PRESCRIBERS, fill=DRUGS,
                         alpha=PRESCRIBERS,
                         #here we customize the tooltip
                         text = paste(&amp;quot;&amp;lt;b&amp;gt;type of drugs:&amp;lt;/b&amp;gt; &amp;quot;, tolower(DRUGS),&amp;quot;&amp;lt;/br&amp;gt;&amp;quot;,
                                      &amp;quot;&amp;lt;/br&amp;gt;&amp;lt;b&amp;gt;type of prescribers:&amp;lt;/b&amp;gt; &amp;quot;, tolower(PRESCRIBERS),
                                      &amp;quot;&amp;lt;/br&amp;gt;&amp;lt;b&amp;gt;refunded amount:&amp;lt;/b&amp;gt; &amp;quot;, paste0(formatC(refund, format=&amp;quot;d&amp;quot;, digits=0, big.mark = &amp;quot;,&amp;quot;), &amp;quot; €&amp;quot;),
                                      &amp;quot;&amp;lt;/br&amp;gt;&amp;lt;b&amp;gt;total refunded amount:&amp;lt;/b&amp;gt; &amp;quot;, paste0(formatC(total, format=&amp;quot;d&amp;quot;, digits=0, big.mark = &amp;quot;,&amp;quot;), &amp;quot; €&amp;quot;),
                                      &amp;quot;&amp;lt;/br&amp;gt;&amp;lt;b&amp;gt;percentage of total refunded amount for the prescriber:&amp;lt;/b&amp;gt; &amp;quot;, perct,
                                      &amp;quot;&amp;lt;/br&amp;gt;&amp;lt;b&amp;gt;percentage of total refunded amount for the category:&amp;lt;/b&amp;gt; &amp;quot;, perct_total_cat,
                                      &amp;quot;&amp;lt;/br&amp;gt;&amp;lt;b&amp;gt;percentage of refunded amount in this category:&amp;lt;/b&amp;gt; &amp;quot;, perct_in_cat)
))+
        geom_bar(stat=&amp;quot;identity&amp;quot;,position=&amp;quot;stack&amp;quot;, colour=&amp;quot;black&amp;quot;, size=0.2)+ 
        scale_alpha_manual(values=c(0.2,0.75))+
        scale_y_continuous(labels = function(x) paste0(formatC(x/1000000, format=&amp;quot;d&amp;quot;, digits=0, big.mark = &amp;quot;,&amp;quot;), &amp;quot; €&amp;quot;))+
        labs(x=&amp;quot; &amp;quot;, y=&amp;quot;refunded amount (in million €)&amp;quot;) + 
        annotate(&amp;quot;text&amp;quot;, x= 8, y=930000000, label= &amp;quot;Top 15 of refunded drugs categories for the 2nd semester of 2016&amp;quot;, size=5, fontface=&amp;quot;bold&amp;quot;) + 
        annotate(&amp;quot;text&amp;quot;, x=8, y=890000000, label= &amp;quot;Total Amount of refunded drugs: 9,384,395,518 €&amp;quot;, size=4) + 
        theme_minimal(base_size = 15)+
        theme(  
                panel.grid.major.x = element_blank(),
                panel.grid.minor.x = element_blank(),
                legend.text = element_text(size = 10),
                #we remove the legend.
                legend.position = &amp;quot;none&amp;quot;,
                plot.title = element_text(size=12,face=&amp;quot;bold&amp;quot;, hjust=0.1),
                axis.text.x = element_blank(),
                axis.ticks.x = element_blank(),
                axis.title.x = element_text(size=12),
                axis.title.y = element_text(size=14),
                strip.text.x = element_text(face=&amp;quot;italic&amp;quot;, size=11))


ggplotly(q, tooltip = c(&amp;quot;text&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;div id=&#34;111fd3d56bc3&#34; style=&#34;width:100%;height:480px;&#34; class=&#34;plotly html-widget&#34;&gt;&lt;/div&gt;
&lt;script type=&#34;application/json&#34; data-for=&#34;111fd3d56bc3&#34;&gt;{&#34;x&#34;:{&#34;data&#34;:[{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:507172733.46,&#34;x&#34;:[1],&#34;y&#34;:[301150544.0365],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  immunosuppressives &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  301,150,544 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  808,323,277 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  3.21% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  8.61% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  37.26%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(248,118,109,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,IMMUNOSUPPRESSIVES)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,IMMUNOSUPPRESSIVES)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:85415486.642,&#34;x&#34;:[2],&#34;y&#34;:[569785475.295],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  diabetes medicines &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  569,785,475 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  655,200,961 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  6.07% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  6.98% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this 
category:&lt;\/b&gt;  86.96%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(229,135,0,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,DIABETES MEDICINES)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,DIABETES MEDICINES)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:448083837.821,&#34;x&#34;:[3],&#34;y&#34;:[138147767.184],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  antivirals for systemic use &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  138,147,767 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  586,231,605 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  1.47% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  6.25% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  23.57%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(201,152,0,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,ANTIVIRALS FOR SYSTEMIC USE)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,ANTIVIRALS FOR SYSTEMIC USE)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:114551640.954,&#34;x&#34;:[4],&#34;y&#34;:[421671876.858],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  antithrombotics &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type 
of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  421,671,876 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  536,223,517 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  4.49% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  5.71% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  78.64%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(163,165,0,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,ANTITHROMBOTICS)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,ANTITHROMBOTICS)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:406105072.4015,&#34;x&#34;:[5],&#34;y&#34;:[101128764.1995],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  antineoplastics &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  101,128,764 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  507,233,836 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  1.08% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  5.41% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  19.94%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(107,177,0,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,ANTINEOPLASTICS)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE 
PRACTITIONER,ANTINEOPLASTICS)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:87842827.247,&#34;x&#34;:[6],&#34;y&#34;:[418060300.1295],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  drugs against obstructive pulmonary disease &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  418,060,300 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  505,903,127 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  4.45% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  5.39% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  82.64%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,186,56,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,DRUGS AGAINST OBSTRUCTIVE PULMONARY DISEASE)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,DRUGS AGAINST OBSTRUCTIVE PULMONARY DISEASE)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:117019516.503,&#34;x&#34;:[7],&#34;y&#34;:[340881230.08],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  ophthalmic drugs &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  340,881,230 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  457,900,746 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  3.63% &lt;\/br&gt;&lt;b&gt;percentage 
of total refunded amount for the category:&lt;\/b&gt;  4.88% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  74.44%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,191,125,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,OPHTHALMIC DRUGS)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,OPHTHALMIC DRUGS)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:27557931.5335,&#34;x&#34;:[8],&#34;y&#34;:[384844525.495],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  lipid modifying agent &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  384,844,525 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  412,402,457 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  4.1% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  4.39% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  93.32%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,192,175,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,LIPID MODIFYING AGENT)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,LIPID MODIFYING 
AGENT)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:54821807.5275,&#34;x&#34;:[9],&#34;y&#34;:[353356216.3365],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  analgesiques &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  353,356,216 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  408,178,023 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  3.77% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  4.35% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  86.57%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,188,216,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,ANALGESIQUES)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,ANALGESIQUES)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:20801887.606,&#34;x&#34;:[10],&#34;y&#34;:[326505248.53],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  drugs affect the renin-angiotensin system &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  326,505,248 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  347,307,136 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  3.48% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  3.7% 
&lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  94.01%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,176,246,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,DRUGS AFFECT THE RENIN-ANGIOTENSIN SYSTEM)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,DRUGS AFFECT THE RENIN-ANGIOTENSIN SYSTEM)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:142980237.259,&#34;x&#34;:[11],&#34;y&#34;:[168576939.7785],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  endocrine therapy &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  168,576,939 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  311,557,177 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  1.8% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  3.32% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  54.11%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(97,156,255,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,ENDOCRINE THERAPY)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,ENDOCRINE 
THERAPY)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:139312170.0425,&#34;x&#34;:[12],&#34;y&#34;:[148198825.947],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  psycholeptics &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  148,198,825 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  287,510,995 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  1.58% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  3.06% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  51.55%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(185,131,255,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,PSYCHOLEPTICS)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,PSYCHOLEPTICS)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:54468175.8535,&#34;x&#34;:[13],&#34;y&#34;:[210355427.535],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  systemic antibacterial &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  210,355,427 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  264,823,603 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  2.24% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  2.82% 
&lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  79.43%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(231,107,243,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,SYSTEMIC ANTIBACTERIAL)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,SYSTEMIC ANTIBACTERIAL)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:165054852.0425,&#34;x&#34;:[14],&#34;y&#34;:[77049785.86],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  immunostimulants &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  77,049,785 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  242,104,637 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  0.82% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  2.58% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  31.82%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(253,97,209,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,IMMUNOSTIMULANTS)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,IMMUNOSTIMULANTS)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:24503942.332,&#34;x&#34;:[15],&#34;y&#34;:[200514932.377],&#34;text&#34;:&#34;&lt;b&gt;type of 
drugs:&lt;\/b&gt;  drugs against acidity trouble &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  private practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  200,514,932 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  225,018,874 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  2.14% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  2.4% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  89.11%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(255,103,164,0.2)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(PRIVATE PRACTITIONER,DRUGS AGAINST ACIDITY TROUBLE)&#34;,&#34;legendgroup&#34;:&#34;(PRIVATE PRACTITIONER,DRUGS AGAINST ACIDITY TROUBLE)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:0,&#34;x&#34;:[1],&#34;y&#34;:[507172733.46],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  immunosuppressives &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  507,172,733 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  808,323,277 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  5.4% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  8.61% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  
62.74%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(248,118,109,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,IMMUNOSUPPRESSIVES)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,IMMUNOSUPPRESSIVES)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:0,&#34;x&#34;:[2],&#34;y&#34;:[85415486.642],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  diabetes medicines &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  85,415,486 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  655,200,961 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  0.91% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  6.98% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  13.04%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(229,135,0,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,DIABETES MEDICINES)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,DIABETES MEDICINES)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:0,&#34;x&#34;:[3],&#34;y&#34;:[448083837.821],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  antivirals for systemic use &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner 
&lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  448,083,837 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  586,231,605 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  4.77% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  6.25% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  76.43%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(201,152,0,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,ANTIVIRALS FOR SYSTEMIC USE)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,ANTIVIRALS FOR SYSTEMIC USE)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:0,&#34;x&#34;:[4],&#34;y&#34;:[114551640.954],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  antithrombotics &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  114,551,640 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  536,223,517 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  1.22% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  5.71% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  21.36%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(163,165,0,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,ANTITHROMBOTICS)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED 
PRACTITIONER,ANTITHROMBOTICS)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:0,&#34;x&#34;:[5],&#34;y&#34;:[406105072.4015],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  antineoplastics &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  406,105,072 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  507,233,836 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  4.33% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  5.41% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  80.06%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(107,177,0,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,ANTINEOPLASTICS)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,ANTINEOPLASTICS)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:0,&#34;x&#34;:[6],&#34;y&#34;:[87842827.247],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  drugs against obstructive pulmonary disease &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  87,842,827 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  505,903,127 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  0.94% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  5.39% 
&lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  17.36%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,186,56,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,DRUGS AGAINST OBSTRUCTIVE PULMONARY DISEASE)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,DRUGS AGAINST OBSTRUCTIVE PULMONARY DISEASE)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.9,&#34;base&#34;:0,&#34;x&#34;:[7],&#34;y&#34;:[117019516.503],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  ophthalmic drugs &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  117,019,516 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  457,900,746 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  1.25% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  4.88% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  25.56%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,191,125,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,OPHTHALMIC DRUGS)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,OPHTHALMIC DRUGS)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:0,&#34;x&#34;:[8],&#34;y&#34;:[27557931.5335],&#34;text&#34;:&#34;&lt;b&gt;type of 
drugs:&lt;\/b&gt;  lipid modifying agent &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  27,557,931 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  412,402,457 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  0.29% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  4.39% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  6.68%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,192,175,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,LIPID MODIFYING AGENT)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,LIPID MODIFYING AGENT)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:0,&#34;x&#34;:[9],&#34;y&#34;:[54821807.5275],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  analgesiques &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  54,821,807 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  408,178,023 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  0.58% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  4.35% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  13.43%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,188,216,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED 
PRACTITIONER,ANALGESIQUES)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,ANALGESIQUES)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:0,&#34;x&#34;:[10],&#34;y&#34;:[20801887.606],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  drugs affect the renin-angiotensin system &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  20,801,887 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  347,307,136 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  0.22% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  3.7% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  5.99%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,176,246,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,DRUGS AFFECT THE RENIN-ANGIOTENSIN SYSTEM)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,DRUGS AFFECT THE RENIN-ANGIOTENSIN SYSTEM)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:0,&#34;x&#34;:[11],&#34;y&#34;:[142980237.259],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  endocrine therapy &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  142,980,237 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  311,557,177 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount 
for the prescriber:&lt;\/b&gt;  1.52% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  3.32% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  45.89%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(97,156,255,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,ENDOCRINE THERAPY)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,ENDOCRINE THERAPY)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:0,&#34;x&#34;:[12],&#34;y&#34;:[139312170.0425],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  psycholeptics &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  139,312,170 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  287,510,995 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  1.48% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  3.06% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  48.45%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(185,131,255,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,PSYCHOLEPTICS)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED 
PRACTITIONER,PSYCHOLEPTICS)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:0,&#34;x&#34;:[13],&#34;y&#34;:[54468175.8535],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  systemic antibacterial &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  54,468,175 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  264,823,603 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  0.58% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  2.82% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  20.57%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(231,107,243,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,SYSTEMIC ANTIBACTERIAL)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,SYSTEMIC ANTIBACTERIAL)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:0,&#34;x&#34;:[14],&#34;y&#34;:[165054852.0425],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  immunostimulants &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  165,054,852 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  242,104,637 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  1.76% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the 
category:&lt;\/b&gt;  2.58% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  68.18%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(253,97,209,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,IMMUNOSTIMULANTS)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,IMMUNOSTIMULANTS)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;orientation&#34;:&#34;v&#34;,&#34;width&#34;:0.899999999999999,&#34;base&#34;:0,&#34;x&#34;:[15],&#34;y&#34;:[24503942.332],&#34;text&#34;:&#34;&lt;b&gt;type of drugs:&lt;\/b&gt;  drugs against acidity trouble &lt;\/br&gt; &lt;\/br&gt;&lt;b&gt;type of prescribers:&lt;\/b&gt;  salaried practitioner &lt;\/br&gt;&lt;b&gt;refunded amount:&lt;\/b&gt;  24,503,942 € &lt;\/br&gt;&lt;b&gt;total refunded amount:&lt;\/b&gt;  225,018,874 € &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the prescriber:&lt;\/b&gt;  0.26% &lt;\/br&gt;&lt;b&gt;percentage of total refunded amount for the category:&lt;\/b&gt;  2.4% &lt;\/br&gt;&lt;b&gt;percentage of refunded amount in this category:&lt;\/b&gt;  10.89%&#34;,&#34;type&#34;:&#34;bar&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(255,103,164,0.75)&#34;,&#34;line&#34;:{&#34;width&#34;:0.755905511811024,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;}},&#34;name&#34;:&#34;(SALARIED PRACTITIONER,DRUGS AGAINST ACIDITY TROUBLE)&#34;,&#34;legendgroup&#34;:&#34;(SALARIED PRACTITIONER,DRUGS AGAINST ACIDITY TROUBLE)&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;x&#34;:[8],&#34;y&#34;:[930000000],&#34;text&#34;:&#34;Top 15 of refunded drugs categories for the 2nd semester of 
2016&#34;,&#34;hovertext&#34;:&#34;&#34;,&#34;textfont&#34;:{&#34;size&#34;:18.8976377952756,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;},&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;text&#34;,&#34;hoveron&#34;:&#34;points&#34;,&#34;showlegend&#34;:false,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;x&#34;:[8],&#34;y&#34;:[890000000],&#34;text&#34;:&#34;Total Amount of refunded drugs: 9,384,395,518 €&#34;,&#34;hovertext&#34;:&#34;&#34;,&#34;textfont&#34;:{&#34;size&#34;:15.1181102362205,&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;},&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;text&#34;,&#34;hoveron&#34;:&#34;points&#34;,&#34;showlegend&#34;:false,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null}],&#34;layout&#34;:{&#34;margin&#34;:{&#34;t&#34;:30.9439601494396,&#34;r&#34;:9.9626400996264,&#34;b&#34;:35.865504358655,&#34;l&#34;:73.3914487339145},&#34;font&#34;:{&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:19.9252801992528},&#34;xaxis&#34;:{&#34;domain&#34;:[0,1],&#34;type&#34;:&#34;linear&#34;,&#34;autorange&#34;:false,&#34;tickmode&#34;:&#34;array&#34;,&#34;range&#34;:[0.4,15.6],&#34;ticktext&#34;:[&#34;IMMUNOSUPPRESSIVES&#34;,&#34;DIABETES MEDICINES&#34;,&#34;ANTIVIRALS FOR SYSTEMIC USE&#34;,&#34;ANTITHROMBOTICS&#34;,&#34;ANTINEOPLASTICS&#34;,&#34;DRUGS AGAINST OBSTRUCTIVE PULMONARY DISEASE&#34;,&#34;OPHTHALMIC DRUGS&#34;,&#34;LIPID MODIFYING AGENT&#34;,&#34;ANALGESIQUES&#34;,&#34;DRUGS AFFECT THE RENIN-ANGIOTENSIN SYSTEM&#34;,&#34;ENDOCRINE THERAPY&#34;,&#34;PSYCHOLEPTICS&#34;,&#34;SYSTEMIC ANTIBACTERIAL&#34;,&#34;IMMUNOSTIMULANTS&#34;,&#34;DRUGS AGAINST ACIDITY 
TROUBLE&#34;],&#34;tickvals&#34;:[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],&#34;ticks&#34;:&#34;&#34;,&#34;tickcolor&#34;:null,&#34;ticklen&#34;:4.9813200498132,&#34;tickwidth&#34;:0,&#34;showticklabels&#34;:false,&#34;tickfont&#34;:{&#34;color&#34;:null,&#34;family&#34;:null,&#34;size&#34;:0},&#34;tickangle&#34;:-0,&#34;showline&#34;:false,&#34;linecolor&#34;:null,&#34;linewidth&#34;:0,&#34;showgrid&#34;:false,&#34;gridcolor&#34;:null,&#34;gridwidth&#34;:0,&#34;zeroline&#34;:false,&#34;anchor&#34;:&#34;y&#34;,&#34;title&#34;:&#34; &#34;,&#34;titlefont&#34;:{&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:15.9402241594022},&#34;hoverformat&#34;:&#34;.2f&#34;},&#34;yaxis&#34;:{&#34;domain&#34;:[0,1],&#34;type&#34;:&#34;linear&#34;,&#34;autorange&#34;:false,&#34;tickmode&#34;:&#34;array&#34;,&#34;range&#34;:[-46500000,976500000],&#34;ticktext&#34;:[&#34;0 €&#34;,&#34;250 €&#34;,&#34;500 €&#34;,&#34;750 €&#34;],&#34;tickvals&#34;:[0,250000000,500000000,750000000],&#34;ticks&#34;:&#34;&#34;,&#34;tickcolor&#34;:null,&#34;ticklen&#34;:4.9813200498132,&#34;tickwidth&#34;:0,&#34;showticklabels&#34;:true,&#34;tickfont&#34;:{&#34;color&#34;:&#34;rgba(77,77,77,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:15.9402241594022},&#34;tickangle&#34;:-0,&#34;showline&#34;:false,&#34;linecolor&#34;:null,&#34;linewidth&#34;:0,&#34;showgrid&#34;:true,&#34;gridcolor&#34;:&#34;rgba(235,235,235,1)&#34;,&#34;gridwidth&#34;:0.66417600664176,&#34;zeroline&#34;:false,&#34;anchor&#34;:&#34;x&#34;,&#34;title&#34;:&#34;refunded amount (in million 
€)&#34;,&#34;titlefont&#34;:{&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:18.5969281859693},&#34;hoverformat&#34;:&#34;.2f&#34;},&#34;shapes&#34;:[{&#34;type&#34;:&#34;rect&#34;,&#34;fillcolor&#34;:null,&#34;line&#34;:{&#34;color&#34;:null,&#34;width&#34;:0,&#34;linetype&#34;:[]},&#34;yref&#34;:&#34;paper&#34;,&#34;xref&#34;:&#34;paper&#34;,&#34;x0&#34;:0,&#34;x1&#34;:1,&#34;y0&#34;:0,&#34;y1&#34;:1}],&#34;showlegend&#34;:false,&#34;legend&#34;:{&#34;bgcolor&#34;:null,&#34;bordercolor&#34;:null,&#34;borderwidth&#34;:0,&#34;font&#34;:{&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:13.2835201328352}},&#34;barmode&#34;:&#34;stack&#34;,&#34;hovermode&#34;:&#34;closest&#34;},&#34;source&#34;:&#34;A&#34;,&#34;attrs&#34;:{&#34;111fd6695e1a6&#34;:{&#34;x&#34;:{},&#34;y&#34;:{},&#34;fill&#34;:{},&#34;alpha&#34;:{},&#34;text&#34;:{},&#34;type&#34;:&#34;ggplotly&#34;},&#34;111fd7a118de7&#34;:{&#34;x&#34;:{},&#34;y&#34;:{}},&#34;111fd167b774d&#34;:{&#34;x&#34;:{},&#34;y&#34;:{}}},&#34;cur_data&#34;:&#34;111fd6695e1a6&#34;,&#34;visdat&#34;:{&#34;111fd6695e1a6&#34;:[&#34;function (y) &#34;,&#34;x&#34;],&#34;111fd7a118de7&#34;:[&#34;function (y) &#34;,&#34;x&#34;],&#34;111fd167b774d&#34;:[&#34;function (y) &#34;,&#34;x&#34;]},&#34;config&#34;:{&#34;modeBarButtonsToAdd&#34;:[{&#34;name&#34;:&#34;Collaborate&#34;,&#34;icon&#34;:{&#34;width&#34;:1000,&#34;ascent&#34;:500,&#34;descent&#34;:-50,&#34;path&#34;:&#34;M487 375c7-10 9-23 5-36l-79-259c-3-12-11-23-22-31-11-8-22-12-35-12l-263 0c-15 0-29 5-43 15-13 10-23 23-28 37-5 13-5 25-1 37 0 0 0 3 1 7 1 5 1 8 1 11 0 2 0 4-1 6 0 3-1 5-1 6 1 2 2 4 3 6 1 2 2 4 4 6 2 3 4 5 5 7 5 7 9 16 13 26 4 10 7 19 9 26 0 2 0 5 0 9-1 4-1 6 0 8 0 2 2 5 4 8 3 3 5 5 5 7 4 6 8 15 12 26 4 11 7 19 7 26 1 1 0 4 0 9-1 4-1 7 0 8 1 2 3 5 6 8 4 4 6 6 6 7 4 5 8 13 13 24 4 11 7 20 7 28 1 1 0 4 0 7-1 3-1 6-1 7 0 2 1 4 3 6 1 1 3 4 5 6 2 3 3 5 5 6 1 2 3 5 4 9 2 3 3 7 5 10 1 3 2 6 4 10 2 4 4 7 6 9 2 
3 4 5 7 7 3 2 7 3 11 3 3 0 8 0 13-1l0-1c7 2 12 2 14 2l218 0c14 0 25-5 32-16 8-10 10-23 6-37l-79-259c-7-22-13-37-20-43-7-7-19-10-37-10l-248 0c-5 0-9-2-11-5-2-3-2-7 0-12 4-13 18-20 41-20l264 0c5 0 10 2 16 5 5 3 8 6 10 11l85 282c2 5 2 10 2 17 7-3 13-7 17-13z m-304 0c-1-3-1-5 0-7 1-1 3-2 6-2l174 0c2 0 4 1 7 2 2 2 4 4 5 7l6 18c0 3 0 5-1 7-1 1-3 2-6 2l-173 0c-3 0-5-1-8-2-2-2-4-4-4-7z m-24-73c-1-3-1-5 0-7 2-2 3-2 6-2l174 0c2 0 5 0 7 2 3 2 4 4 5 7l6 18c1 2 0 5-1 6-1 2-3 3-5 3l-174 0c-3 0-5-1-7-3-3-1-4-4-5-6z&#34;},&#34;click&#34;:&#34;function(gd) { \n        // is this being viewed in RStudio?\n        if (location.search == &#39;?viewer_pane=1&#39;) {\n          alert(&#39;To learn about plotly for collaboration, visit:\\n https://cpsievert.github.io/plotly_book/plot-ly-for-collaboration.html&#39;);\n        } else {\n          window.open(&#39;https://cpsievert.github.io/plotly_book/plot-ly-for-collaboration.html&#39;, &#39;_blank&#39;);\n        }\n      }&#34;}],&#34;cloud&#34;:false},&#34;highlight&#34;:{&#34;on&#34;:&#34;plotly_click&#34;,&#34;persistent&#34;:false,&#34;dynamic&#34;:false,&#34;selectize&#34;:false,&#34;opacityDim&#34;:0.2,&#34;selected&#34;:{&#34;opacity&#34;:1}},&#34;base_url&#34;:&#34;https://plot.ly&#34;},&#34;evals&#34;:[&#34;config.modeBarButtonsToAdd.0.click&#34;],&#34;jsHooks&#34;:{&#34;render&#34;:[{&#34;code&#34;:&#34;function(el, x) { var ctConfig = crosstalk.var(&#39;plotlyCrosstalkOpts&#39;).set({\&#34;on\&#34;:\&#34;plotly_click\&#34;,\&#34;persistent\&#34;:false,\&#34;dynamic\&#34;:false,\&#34;selectize\&#34;:false,\&#34;opacityDim\&#34;:0.2,\&#34;selected\&#34;:{\&#34;opacity\&#34;:1}}); }&#34;,&#34;data&#34;:null}]}}&lt;/script&gt; And now it’s done! I hope you enjoy this post.&lt;br&gt;&lt;/p&gt;
&lt;p&gt;
Don’t hesitate to follow us on Twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;citation&#34;&gt;@rdata_lu&lt;/span&gt;&lt;/a&gt; &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt; and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our YouTube channel. &lt;br&gt; You can also contact us if you have any comments or suggestions. See you in the next post!
&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Let&#39;s make ggplot2 purrr again</title>
      <link>/post/2017-10-09-make-ggplot2-purrr-again/</link>
      <pubDate>Mon, 09 Oct 2017 06:45:48 +0200</pubDate>
      
      <guid>/post/2017-10-09-make-ggplot2-purrr-again/</guid>
      <description>&lt;p&gt;&lt;em&gt;Update&lt;/em&gt;: I’ve included another way of saving a separate plot by group in this article, as pointed out by &lt;a href=&#34;https://twitter.com/monitus/status/849033025631297536&#34;&gt;&lt;code&gt;@monitus&lt;/code&gt;&lt;/a&gt;. Actually, this is the preferred solution; using &lt;code&gt;dplyr::do()&lt;/code&gt; is deprecated, according to Hadley Wickham &lt;a href=&#34;https://twitter.com/hadleywickham/status/719542847045636096&#34;&gt;himself&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I’ll be honest: the title is a bit misleading. I will not use &lt;code&gt;purrr&lt;/code&gt; that much in this blog post. Actually, I will use a single &lt;code&gt;purrr&lt;/code&gt; function, at the very end. I use &lt;code&gt;dplyr&lt;/code&gt; much more. However, &lt;em&gt;Make ggplot2 purrr&lt;/em&gt; sounds better than &lt;em&gt;Make ggplot2 dplyr&lt;/em&gt; or whatever the verb for &lt;code&gt;dplyr&lt;/code&gt; would be.&lt;/p&gt;
&lt;p&gt;Also, this blog post was inspired by a Stack Overflow question and in particular one of the &lt;a href=&#34;http://stackoverflow.com/a/29035145/1298051&#34;&gt;answers&lt;/a&gt;. So I don’t bring anything new to the table, but I found this Stack Overflow answer so useful and so underrated (only 16 upvotes as I’m writing this!) that I wanted to write something about it.&lt;/p&gt;
&lt;p&gt;Basically the idea of this blog post is to show how to create graphs using &lt;code&gt;ggplot2&lt;/code&gt;, but by grouping by a factor variable beforehand. To illustrate this idea, let’s use the data from the &lt;a href=&#34;http://www.rug.nl/ggdc/productivity/pwt/&#34;&gt;Penn World Tables 9.0&lt;/a&gt;. The easiest way to get this data is to install the package called &lt;code&gt;pwt9&lt;/code&gt; with:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(&amp;quot;pwt9&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and then load the data with:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;data(&amp;quot;pwt9.0&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, let’s load the needed packages. I am also using &lt;code&gt;ggthemes&lt;/code&gt;, which makes theming your ggplots very easy. I’ll be making &lt;a href=&#34;https://en.wikipedia.org/wiki/Edward_Tufte&#34;&gt;Tufte&lt;/a&gt;-style plots.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
library(ggthemes)
library(dplyr)
library(tidyr)
library(purrr)
library(pwt9)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;First let’s select a list of countries:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;country_list &amp;lt;- c(&amp;quot;France&amp;quot;, &amp;quot;Germany&amp;quot;, &amp;quot;United States of America&amp;quot;, &amp;quot;Luxembourg&amp;quot;, &amp;quot;Switzerland&amp;quot;, &amp;quot;Greece&amp;quot;)

small_pwt &amp;lt;- pwt9.0 %&amp;gt;%
  filter(country %in% country_list)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s also order the countries in the data frame as I have written them in &lt;code&gt;country_list&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;small_pwt &amp;lt;- small_pwt %&amp;gt;%
  mutate(country = factor(country, levels = country_list, ordered = TRUE))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You might be wondering why this is important. At the end of the article, we are going to save the plots to disk. If we do not re-order the countries inside the data frame as in &lt;code&gt;country_list&lt;/code&gt;, the names of the files will not correspond to the correct plots!&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Update&lt;/em&gt;: While this can still be interesting to know, especially if you want to order the bars of a barplot made with &lt;code&gt;ggplot2&lt;/code&gt;, I included a suggestion by &lt;a href=&#34;https://twitter.com/expersso/status/846986357792739328&#34;&gt;&lt;code&gt;@expersso&lt;/code&gt;&lt;/a&gt; that does not require your data to be ordered!&lt;/p&gt;
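&lt;p&gt;To illustrate the barplot case, here is a minimal sketch. The data frame &lt;code&gt;toy&lt;/code&gt; and its values are made up for the example and do not come from the Penn World Tables:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)

# Hypothetical toy data, only used to show the ordering of the bars
toy &amp;lt;- data.frame(
  country = c(&amp;quot;France&amp;quot;, &amp;quot;Germany&amp;quot;, &amp;quot;Luxembourg&amp;quot;),
  value = c(2, 3, 1)
)

# Without explicit levels, the bars are drawn in alphabetical order;
# with explicit levels, they follow the order given here
toy$country &amp;lt;- factor(toy$country,
                      levels = c(&amp;quot;Luxembourg&amp;quot;, &amp;quot;France&amp;quot;, &amp;quot;Germany&amp;quot;))

ggplot(toy) +
  geom_col(aes(x = country, y = value))&lt;/code&gt;&lt;/pre&gt;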
&lt;p&gt;Now when you want to plot the same variable by country, say &lt;code&gt;avh&lt;/code&gt; (&lt;em&gt;Average annual hours worked by persons engaged&lt;/em&gt;), the usual way to do this is with one of &lt;code&gt;facet_wrap()&lt;/code&gt; or &lt;code&gt;facet_grid()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(data = small_pwt) + theme_tufte() +
  geom_line(aes(y = avh, x = year)) +
  facet_wrap(~country)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2017-10-09-make-ggplot2-purrr-again_files/figure-html/unnamed-chunk-6-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(data = small_pwt) + theme_tufte() +
  geom_line(aes(y = avh, x = year)) +
  facet_grid(country~.)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;/post/2017-10-09-make-ggplot2-purrr-again_files/figure-html/unnamed-chunk-7-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;As you can see, for this particular example, &lt;code&gt;facet_grid()&lt;/code&gt; is not very useful, but do notice its argument, &lt;code&gt;country~.&lt;/code&gt;, which is different from &lt;code&gt;facet_wrap()&lt;/code&gt;’s argument. This way, I get the graphs stacked vertically, one row per country. If I had used &lt;code&gt;facet_grid(~country)&lt;/code&gt; the graphs would be side by side and completely unreadable.&lt;/p&gt;
&lt;p&gt;Now, let’s go to the meat of this post: what if you would like to have a separate graph for each country? You’d probably think of using &lt;code&gt;dplyr::group_by()&lt;/code&gt; to form the groups and then the graphs. This is the way to go, but you also have to use &lt;code&gt;dplyr::do()&lt;/code&gt;. This is because, as far as I understand, &lt;code&gt;ggplot2&lt;/code&gt; is not &lt;code&gt;dplyr&lt;/code&gt;-aware, and using an arbitrary function with groups is only possible with &lt;code&gt;dplyr::do()&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Update&lt;/em&gt;: As explained in the intro above, I also added the solution that uses &lt;code&gt;tidyr::nest()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Ancient, deprecated way of doing this
plots &amp;lt;- small_pwt %&amp;gt;%
  group_by(country) %&amp;gt;%
  do(plot = ggplot(data = .) + theme_tufte() +
       geom_line(aes(y = avh, x = year)) +
       ggtitle(unique(.$country)) +
       xlab(&amp;quot;Year&amp;quot;) +
       ylab(&amp;quot;Average annual hours worked by persons engaged&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And this is the approach that uses &lt;code&gt;tidyr::nest()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Preferred approach
plots &amp;lt;- small_pwt %&amp;gt;%
  group_by(country) %&amp;gt;%
  nest() %&amp;gt;%
  mutate(plot = map2(data, country, ~ggplot(data = .x) + theme_tufte() +
       geom_line(aes(y = avh, x = year)) +
       ggtitle(.y) +
       xlab(&amp;quot;Year&amp;quot;) +
       ylab(&amp;quot;Average annual hours worked by persons engaged&amp;quot;)))&lt;/code&gt;&lt;/pre&gt;
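&lt;p&gt;If you know &lt;code&gt;dplyr&lt;/code&gt; at least a little bit, the above lines should be easy for you to understand. But notice how we get the title of the graphs: with &lt;code&gt;ggtitle(unique(.$country))&lt;/code&gt; in the &lt;code&gt;do()&lt;/code&gt; version, and with &lt;code&gt;ggtitle(.y)&lt;/code&gt; in the &lt;code&gt;nest()&lt;/code&gt; version, where &lt;code&gt;.y&lt;/code&gt; is the country passed along by &lt;code&gt;map2()&lt;/code&gt;. Getting this title was actually the point of the Stack Overflow question.&lt;/p&gt;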
&lt;p&gt;&lt;em&gt;Update:&lt;/em&gt; The modern version uses &lt;code&gt;tidyr::nest()&lt;/code&gt;. Its documentation tells us:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;There are many possible ways one could choose to nest columns inside a data frame. &lt;code&gt;nest()&lt;/code&gt; creates a list of data frames containing all the nested variables: this seems to be the most useful form in practice.&lt;/em&gt; Let’s take a closer look at what it does exactly:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;small_pwt %&amp;gt;%
  group_by(country) %&amp;gt;%
  nest() %&amp;gt;%
  head()&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 2
##   country                  data              
##   &amp;lt;ord&amp;gt;                    &amp;lt;list&amp;gt;            
## 1 Switzerland              &amp;lt;tibble [65 × 46]&amp;gt;
## 2 Germany                  &amp;lt;tibble [65 × 46]&amp;gt;
## 3 France                   &amp;lt;tibble [65 × 46]&amp;gt;
## 4 Greece                   &amp;lt;tibble [65 × 46]&amp;gt;
## 5 Luxembourg               &amp;lt;tibble [65 × 46]&amp;gt;
## 6 United States of America &amp;lt;tibble [65 × 46]&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is why I love lists in R; we get a &lt;code&gt;tibble&lt;/code&gt; where each element of the column &lt;code&gt;data&lt;/code&gt; is itself a &lt;code&gt;tibble&lt;/code&gt;. We can now apply any function that we know works on lists.&lt;/p&gt;
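&lt;p&gt;For instance, here is a small sketch of what this makes possible. It assumes that &lt;code&gt;small_pwt&lt;/code&gt; from above is available; the names &lt;code&gt;nested&lt;/code&gt; and &lt;code&gt;mean_avh&lt;/code&gt; are only there for illustration:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(tidyr)
library(purrr)

# Assumes small_pwt was created as shown earlier in this post
nested &amp;lt;- small_pwt %&amp;gt;%
  group_by(country) %&amp;gt;%
  nest()

# Number of rows of each nested tibble, returned as an integer vector
map_int(nested$data, nrow)

# Mean of avh for each country, stored in a new column;
# na.rm = TRUE skips the years for which avh is missing
nested %&amp;gt;%
  mutate(mean_avh = map_dbl(data, ~mean(.x$avh, na.rm = TRUE)))&lt;/code&gt;&lt;/pre&gt;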
&lt;p&gt;What might be surprising though, is the object that is created by this code. Let’s take a look at &lt;code&gt;plots&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;print(plots)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 3
##   country                  data               plot    
##   &amp;lt;ord&amp;gt;                    &amp;lt;list&amp;gt;             &amp;lt;list&amp;gt;  
## 1 Switzerland              &amp;lt;tibble [65 × 46]&amp;gt; &amp;lt;S3: gg&amp;gt;
## 2 Germany                  &amp;lt;tibble [65 × 46]&amp;gt; &amp;lt;S3: gg&amp;gt;
## 3 France                   &amp;lt;tibble [65 × 46]&amp;gt; &amp;lt;S3: gg&amp;gt;
## 4 Greece                   &amp;lt;tibble [65 × 46]&amp;gt; &amp;lt;S3: gg&amp;gt;
## 5 Luxembourg               &amp;lt;tibble [65 × 46]&amp;gt; &amp;lt;S3: gg&amp;gt;
## 6 United States of America &amp;lt;tibble [65 × 46]&amp;gt; &amp;lt;S3: gg&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As &lt;code&gt;dplyr::do()&lt;/code&gt;’s documentation tells us, the return values get stored inside a list. And this is exactly what we get back; a list of plots! Lists are a very flexible and useful class, and you cannot spell &lt;em&gt;list&lt;/em&gt; without &lt;code&gt;purrr&lt;/code&gt; (at least not when you’re a ne&lt;code&gt;R&lt;/code&gt;d).&lt;/p&gt;
&lt;p&gt;Here are the final lines that use &lt;code&gt;purrr::map2()&lt;/code&gt; to save all these plots at once inside your working directory:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Update&lt;/em&gt;: I have changed the code below so that it no longer requires your data frame to be ordered according to the variable &lt;code&gt;country_list&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# file_names &amp;lt;- paste0(country_list, &amp;quot;.pdf&amp;quot;)

map2(paste0(plots$country, &amp;quot;.pdf&amp;quot;), plots$plot, ggsave)&lt;/code&gt;&lt;/pre&gt;
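&lt;p&gt;As I said before, with the original approach (the commented-out &lt;code&gt;file_names&lt;/code&gt; line, which builds the names from &lt;code&gt;country_list&lt;/code&gt;), the names of the files and the plots will not match if you do not re-order the countries inside the data frame. Try running all the code with that line and without re-ordering, you’ll see!&lt;/p&gt;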
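&lt;p&gt;If you would rather not have the list of return values printed in the console, one possible variant uses &lt;code&gt;purrr::walk2()&lt;/code&gt;, which calls the function only for its side effects. This is just a sketch, and it assumes the &lt;code&gt;plots&lt;/code&gt; object created above:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
library(purrr)

# Assumes plots was created with the nest() approach shown above.
# walk2() calls ggsave() once per file name and plot pair and
# returns its first argument invisibly, so nothing is printed
walk2(
  paste0(plots$country, &amp;quot;.pdf&amp;quot;),
  plots$plot,
  ~ggsave(filename = .x, plot = .y)
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Naming the &lt;code&gt;filename&lt;/code&gt; and &lt;code&gt;plot&lt;/code&gt; arguments explicitly also makes it harder to swap them by accident.&lt;/p&gt;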
&lt;p&gt;
Don’t hesitate to follow us on Twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;citation&#34;&gt;@rdata_lu&lt;/span&gt;&lt;/a&gt; &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt; and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our YouTube channel. &lt;br&gt; You can also contact us if you have any comments or suggestions. See you in the next post!
&lt;/p&gt;
</description>
    </item>
    
  </channel>
</rss>
